Prerequisites
Python (3.10) must be installed. Other versions may work, but with anything newer, pip-installing llama.cpp's requirements.txt into the virtual environment fails with version-mismatch errors, so be careful (learned that the hard way once).
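Before going further, you can check that a 3.10 interpreter is actually available on your PATH (the interpreter name below assumes your install exposes it as python3.10):
python3.10 --version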
Installing Homebrew
Run the following command.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
The output is as follows.
(base) Mac-Studio ~ % /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
==> Checking for `sudo` access (which may request your password)...
Password:
==> This script will install:
/opt/homebrew/bin/brew
/opt/homebrew/share/doc/homebrew
/opt/homebrew/share/man/man1/brew.1
/opt/homebrew/share/zsh/site-functions/_brew
/opt/homebrew/etc/bash_completion.d/brew
/opt/homebrew
/etc/paths.d/homebrew
==> The following new directories will be created:
/opt/homebrew/bin
/opt/homebrew/etc
/opt/homebrew/include
/opt/homebrew/lib
/opt/homebrew/sbin
/opt/homebrew/share
/opt/homebrew/var
/opt/homebrew/opt
/opt/homebrew/share/zsh
/opt/homebrew/share/zsh/site-functions
/opt/homebrew/var/homebrew
/opt/homebrew/var/homebrew/linked
/opt/homebrew/Cellar
/opt/homebrew/Caskroom
/opt/homebrew/Frameworks
Press RETURN/ENTER to continue or any other key to abort:
==> /usr/bin/sudo /usr/bin/install -d -o root -g wheel -m 0755 /opt/homebrew
==> /usr/bin/sudo /bin/mkdir -p /opt/homebrew/bin /opt/homebrew/etc /opt/homebrew/include /opt/homebrew/lib /opt/homebrew/sbin /opt/homebrew/share /opt/homebrew/var /opt/homebrew/opt /opt/homebrew/share/zsh /opt/homebrew/share/zsh/site-functions /opt/homebrew/var/homebrew /opt/homebrew/var/homebrew/linked /opt/homebrew/Cellar /opt/homebrew/Caskroom /opt/homebrew/Frameworks
==> /usr/bin/sudo /bin/chmod ug=rwx /opt/homebrew/bin /opt/homebrew/etc /opt/homebrew/include /opt/homebrew/lib /opt/homebrew/sbin /opt/homebrew/share /opt/homebrew/var /opt/homebrew/opt /opt/homebrew/share/zsh /opt/homebrew/share/zsh/site-functions /opt/homebrew/var/homebrew /opt/homebrew/var/homebrew/linked /opt/homebrew/Cellar /opt/homebrew/Caskroom /opt/homebrew/Frameworks
==> /usr/bin/sudo /bin/chmod go-w /opt/homebrew/share/zsh /opt/homebrew/share/zsh/site-functions
==> /usr/bin/sudo /usr/sbin/chown xxx /opt/homebrew/bin /opt/homebrew/etc /opt/homebrew/include /opt/homebrew/lib /opt/homebrew/sbin /opt/homebrew/share /opt/homebrew/var /opt/homebrew/opt /opt/homebrew/share/zsh /opt/homebrew/share/zsh/site-functions /opt/homebrew/var/homebrew /opt/homebrew/var/homebrew/linked /opt/homebrew/Cellar /opt/homebrew/Caskroom /opt/homebrew/Frameworks
==> /usr/bin/sudo /usr/bin/chgrp admin /opt/homebrew/bin /opt/homebrew/etc /opt/homebrew/include /opt/homebrew/lib /opt/homebrew/sbin /opt/homebrew/share /opt/homebrew/var /opt/homebrew/opt /opt/homebrew/share/zsh /opt/homebrew/share/zsh/site-functions /opt/homebrew/var/homebrew /opt/homebrew/var/homebrew/linked /opt/homebrew/Cellar /opt/homebrew/Caskroom /opt/homebrew/Frameworks
==> /usr/bin/sudo /usr/sbin/chown -R xxx:admin /opt/homebrew
==> Downloading and installing Homebrew...
remote: Enumerating objects: 310230, done.
remote: Counting objects: 100% (15147/15147), done.
remote: Compressing objects: 100% (565/565), done.
remote: Total 310230 (delta 14770), reused 14627 (delta 14582), pack-reused 295083 (from 3)
remote: Enumerating objects: 55, done.
remote: Counting objects: 100% (34/34), done.
remote: Total 55 (delta 33), reused 33 (delta 33), pack-reused 21 (from 1)
==> /usr/bin/sudo /bin/mkdir -p /etc/paths.d
==> /usr/bin/sudo tee /etc/paths.d/homebrew
/opt/homebrew/bin
==> /usr/bin/sudo /usr/sbin/chown root:wheel /etc/paths.d/homebrew
==> /usr/bin/sudo /bin/chmod a+r /etc/paths.d/homebrew
==> Updating Homebrew...
==> Downloading https://ghcr.io/v2/homebrew/portable-ruby/portable-ruby/blobs/sha256:20fa657858e44a4b39171d6e4111f8a9716eb62a78ebbd1491d94f90bb7b830a
############################################################ 100.0%
==> Pouring portable-ruby-3.4.5.arm64_big_sur.bottle.tar.gz
==> Installation successful!
==> Homebrew has enabled anonymous aggregate formulae and cask analytics.
Read the analytics documentation (and how to opt-out) here:
https://docs.brew.sh/Analytics
No analytics data has been sent yet (nor will any be during this install run).
==> Homebrew is run entirely by unpaid volunteers. Please consider donating:
https://github.com/Homebrew/brew#donations
==> Next steps:
- Run these commands in your terminal to add Homebrew to your PATH:
echo >> /Users/xxx/.zprofile
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> /Users/xxx/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"
- Run brew help to get started
- Further documentation:
https://docs.brew.sh
Setting up the Homebrew PATH
Run the following command (this sets the PATH for the current shell session only).
export PATH="$PATH:/opt/homebrew/bin/"
You can check that the PATH is working with the command below; if it returns a version number, you're good.
brew --version
(base) Mac-Studio ~ % brew --version
Homebrew 4.6.10
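To make the PATH persist across new shell sessions, you can instead run the commands the installer printed under "Next steps", which append the Homebrew environment setup to ~/.zprofile:
echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"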
Installing wget, git, and cmake
Run the following command.
brew install wget git cmake
The output is as follows.
(base) Mac-Studio ~ % brew install wget git cmake
==> Fetching downloads for: wget, git and cmake
==> Downloading https://ghcr.io/v2/homebrew/core/wget/manifests/1.2
############################################################ 100.0%
==> Downloading https://ghcr.io/v2/homebrew/core/git/manifests/2.51
############################################################ 100.0%
==> Downloading https://ghcr.io/v2/homebrew/core/cmake/manifests/4.
############################################################ 100.0%
==> Fetching dependencies for wget: libunistring, gettext, libidn2, ca-certificates and openssl@3
==> Downloading https://ghcr.io/v2/homebrew/core/libunistring/manif
############################################################ 100.0%
==> Fetching libunistring
==> Downloading https://ghcr.io/v2/homebrew/core/libunistring/blobs
############################################################ 100.0%
==> Downloading https://ghcr.io/v2/homebrew/core/gettext/manifests/
############################################################ 100.0%
==> Fetching gettext
==> Downloading https://ghcr.io/v2/homebrew/core/gettext/blobs/sha2
############################################################ 100.0%
==> Downloading https://ghcr.io/v2/homebrew/core/libidn2/manifests/
############################################################ 100.0%
==> Fetching libidn2
==> Downloading https://ghcr.io/v2/homebrew/core/libidn2/blobs/sha2
############################################################ 100.0%
==> Downloading https://ghcr.io/v2/homebrew/core/ca-certificates/ma
############################################################ 100.0%
==> Fetching ca-certificates
==> Downloading https://ghcr.io/v2/homebrew/core/ca-certificates/bl
############################################################ 100.0%
==> Downloading https://ghcr.io/v2/homebrew/core/openssl/3/manifest
############################################################ 100.0%
==> Fetching openssl@3
==> Downloading https://ghcr.io/v2/homebrew/core/openssl/3/blobs/sh
############################################################ 100.0%
==> Fetching wget
==> Downloading https://ghcr.io/v2/homebrew/core/wget/blobs/sha256:
############################################################ 100.0%
==> Fetching dependencies for git: pcre2
==> Downloading https://ghcr.io/v2/homebrew/core/pcre2/manifests/10
############################################################ 100.0%
==> Fetching pcre2
==> Downloading https://ghcr.io/v2/homebrew/core/pcre2/blobs/sha256
############################################################ 100.0%
==> Fetching git
==> Downloading https://ghcr.io/v2/homebrew/core/git/blobs/sha256:e
############################################################ 100.0%
==> Fetching cmake
==> Downloading https://ghcr.io/v2/homebrew/core/cmake/blobs/sha256
############################################################ 100.0%
==> Installing dependencies for wget: libunistring, gettext, libidn2, ca-certificates and openssl@3
==> Installing wget dependency: libunistring
==> Downloading https://ghcr.io/v2/homebrew/core/libunistring/manif
Already downloaded: /Users/xxx/Library/Caches/Homebrew/downloads/a570da63bc1839c7e217f203abd54d4d873ebd6b99f6e88994d0e79e2ebe987c--libunistring-1.3.bottle_manifest.json
==> Pouring libunistring--1.3.arm64_sequoia.bottle.tar.gz
🍺 /opt/homebrew/Cellar/libunistring/1.3: 59 files, 5.4MB
==> Installing wget dependency: gettext
==> Downloading https://ghcr.io/v2/homebrew/core/gettext/manifests/
Already downloaded: /Users/xxx/Library/Caches/Homebrew/downloads/d28158ffec04fae757cdbeb46750e8e2ed43b7b17ada49d72a5bda2cff4cd6ed--gettext-0.26.bottle_manifest.json
==> Pouring gettext--0.26.arm64_sequoia.bottle.tar.gz
🍺 /opt/homebrew/Cellar/gettext/0.26: 2,428 files, 28.2MB
==> Installing wget dependency: libidn2
==> Downloading https://ghcr.io/v2/homebrew/core/libidn2/manifests/
Already downloaded: /Users/xxx/Library/Caches/Homebrew/downloads/f30f50fbde4bff9a71de54d684e482d7da3432656d680b97441163c6e5665468--libidn2-2.3.8.bottle_manifest.json
==> Pouring libidn2--2.3.8.arm64_sequoia.bottle.tar.gz
🍺 /opt/homebrew/Cellar/libidn2/2.3.8: 80 files, 929.5KB
==> Installing wget dependency: ca-certificates
==> Downloading https://ghcr.io/v2/homebrew/core/ca-certificates/ma
Already downloaded: /Users/xxx/Library/Caches/Homebrew/downloads/446bcc9fbe916b3769ad3367c5fff981dfdf345e29ffc493f87e48e904d30608--ca-certificates-2025-08-12-2.bottle_manifest.json
==> Pouring ca-certificates--2025-08-12.all.bottle.2.tar.gz
==> Regenerating CA certificate bundle from keychain, this may take
🍺 /opt/homebrew/Cellar/ca-certificates/2025-08-12: 4 files, 232.9KB
==> Installing wget dependency: openssl@3
==> Downloading https://ghcr.io/v2/homebrew/core/openssl/3/manifest
Already downloaded: /Users/xxx/Library/Caches/Homebrew/downloads/e6659abe178bdf49b65451e77f6165a3e07274432f445342092e5ad2a927b23c--openssl@3-3.5.2.bottle_manifest.json
==> Pouring openssl@3--3.5.2.arm64_sequoia.bottle.tar.gz
🍺 /opt/homebrew/Cellar/openssl@3/3.5.2: 7,563 files, 35.4MB
==> Installing wget
==> Pouring wget--1.25.0.arm64_sequoia.bottle.tar.gz
🍺 /opt/homebrew/Cellar/wget/1.25.0: 92 files, 4.5MB
==> Running `brew cleanup wget`...
Disable this behaviour by setting `HOMEBREW_NO_INSTALL_CLEANUP=1`.
Hide these hints with `HOMEBREW_NO_ENV_HINTS=1` (see `man brew`).
==> Installing git dependency: pcre2
==> Downloading https://ghcr.io/v2/homebrew/core/pcre2/manifests/10
Already downloaded: /Users/xxx/Library/Caches/Homebrew/downloads/476078c344b0d9fa702da6e94d7fe5dd59bb1897d3fcc12d1b52b4bc0de68854--pcre2-10.46.bottle_manifest.json
==> Pouring pcre2--10.46.arm64_sequoia.bottle.tar.gz
🍺 /opt/homebrew/Cellar/pcre2/10.46: 242 files, 6.8MB
==> Pouring git--2.51.0.arm64_sequoia.bottle.tar.gz
==> Caveats
The Tcl/Tk GUIs (e.g. gitk, git-gui) are now in the `git-gui` formula.
Subversion interoperability (git-svn) is now in the `git-svn` formula.
==> Summary
🍺 /opt/homebrew/Cellar/git/2.51.0: 1,693 files, 55.9MB
==> Running `brew cleanup git`...
==> Pouring cmake--4.1.1.arm64_sequoia.bottle.tar.gz
==> Caveats
To install the CMake documentation, run:
brew install cmake-docs
==> Summary
🍺 /opt/homebrew/Cellar/cmake/4.1.1: 3,913 files, 58.6MB
==> Running `brew cleanup cmake`...
==> No outdated dependents to upgrade!
==> Caveats
zsh completions and functions have been installed to:
/opt/homebrew/share/zsh/site-functions
Emacs Lisp files have been installed to:
/opt/homebrew/share/emacs/site-lisp/cmake
==> git
The Tcl/Tk GUIs (e.g. gitk, git-gui) are now in the `git-gui` formula.
Subversion interoperability (git-svn) is now in the `git-svn` formula.
==> cmake
To install the CMake documentation, run:
brew install cmake-docs
Installing the Xcode Command Line Tools
Run the following command.
xcode-select --install
The output is as follows. If the Command Line Tools are already installed, the following message is displayed.
(base) Mac-Studio ~ % xcode-select --install
xcode-select: note: Command line tools are already installed. Use "Software Update" in System Settings or the softwareupdate command line interface to install updates
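If you want to confirm where the Command Line Tools are installed, the following prints the active developer directory:
xcode-select -p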
Building llama.cpp
Run the following command to clone the llama.cpp repository.
git clone https://github.com/ggerganov/llama.cpp
The output is as follows.
(base) Mac-Studio ~ % git clone https://github.com/ggerganov/llama.cpp
Cloning into 'llama.cpp'...
remote: Enumerating objects: 61275, done.
remote: Counting objects: 100% (206/206), done.
remote: Compressing objects: 100% (135/135), done.
remote: Total 61275 (delta 138), reused 71 (delta 71), pack-reused 61069 (from 3)
Receiving objects: 100% (61275/61275), 152.17 MiB | 5.26 MiB/s, done.
Resolving deltas: 100% (44500/44500), done.
Move into the llama.cpp directory and run make. As the output below shows, the Makefile build has been replaced by CMake, so make no longer builds anything by itself; see the CMake sketch after the error output.
cd llama.cpp
make
The output is as follows.
(base) Mac-Studio ~ % cd llama.cpp
(base) Mac-Studio llama.cpp % make
Makefile:6: *** Build system changed:
The Makefile build has been replaced by CMake.
For build instructions see:
https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md
. Stop.
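Since the Makefile now only points at CMake, building the llama.cpp binaries (not strictly required for the GGUF conversion itself, which only needs the Python environment set up below) would look roughly like this, following the build.md linked in the error message:
cmake -B build
cmake --build build --config Release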
Creating a Python virtual environment for GGUF conversion
Run the following command to create the virtual environment (the name is arbitrary, so feel free to change it). I called mine "convert2gguf". The reason is explained later, but the key point is to use Python 3.10.
python3.10 -m venv convert2gguf
Activate the virtual environment you just created.
source convert2gguf/bin/activate
If the prompt is prefixed with (the virtual environment name), activation succeeded. Run the following command to check the Python version.
python --version
If it shows 3.10.x as below, you're good.
(convert2gguf) (base) Mac-Studio llama.cpp % python --version
Python 3.10.8
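As an extra check, which python should point inside the convert2gguf directory, confirming that the virtual environment's interpreter is the one in use:
which python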
Run the following command to pip-install llama.cpp's requirements.txt.
pip install -r requirements.txt
The output is as follows. The newest Python version that can pip-install llama.cpp's requirements.txt is 3.10, so if you hit errors, double-check which Python your virtual environment uses.
(convert2gguf) (base) Mac-Studio llama.cpp % pip install -r requirements.txt
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cpu, https://download.pytorch.org/whl/nightly, https://download.pytorch.org/whl/cpu, https://download.pytorch.org/whl/nightly, https://download.pytorch.org/whl/cpu, https://download.pytorch.org/whl/nightly
Ignoring torch: markers 'platform_machine == "s390x"' don't match your environment
Ignoring torch: markers 'platform_machine == "s390x"' don't match your environment
Collecting numpy~=1.26.4
Downloading https://download.pytorch.org/whl/nightly/numpy-1.26.4-cp310-cp310-macosx_11_0_arm64.whl (14.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.0/14.0 MB 10.8 MB/s eta 0:00:00
Collecting sentencepiece~=0.2.0
Downloading https://download.pytorch.org/whl/nightly/sentencepiece-0.2.1-cp310-cp310-macosx_11_0_arm64.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 11.0 MB/s eta 0:00:00
Collecting transformers<5.0.0,>=4.45.1
Downloading transformers-4.56.1-py3-none-any.whl (11.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 4.5 MB/s eta 0:00:00
Collecting gguf>=0.1.0
Downloading gguf-0.17.1-py3-none-any.whl (96 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.2/96.2 kB 6.1 MB/s eta 0:00:00
Collecting protobuf<5.0.0,>=4.21.0
Downloading protobuf-4.25.8-cp37-abi3-macosx_10_9_universal2.whl (394 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 394.5/394.5 kB 7.1 MB/s eta 0:00:00
Collecting mistral-common>=1.8.3
Downloading mistral_common-1.8.4-py3-none-any.whl (6.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.5/6.5 MB 8.6 MB/s eta 0:00:00
Collecting torch~=2.4.0
Downloading https://download.pytorch.org/whl/cpu/torch-2.4.1-cp310-none-macosx_11_0_arm64.whl (62.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.1/62.1 MB 10.4 MB/s eta 0:00:00
Collecting aiohttp~=3.9.3
Downloading https://download.pytorch.org/whl/nightly/aiohttp-3.9.5-cp310-cp310-macosx_11_0_arm64.whl (389 kB)
━━━━━━━━━━━━━━━━━━━━━━━━ 389.9/389.9 kB 618.1 kB/s eta 0:00:00
Collecting pytest~=8.3.3
Downloading pytest-8.3.5-py3-none-any.whl (343 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 343.6/343.6 kB 3.6 MB/s eta 0:00:00
Collecting huggingface_hub~=0.23.2
Downloading huggingface_hub-0.23.5-py3-none-any.whl (402 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 402.8/402.8 kB 1.3 MB/s eta 0:00:00
Collecting matplotlib~=3.10.0
Downloading matplotlib-3.10.6-cp310-cp310-macosx_11_0_arm64.whl (8.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.1/8.1 MB 5.6 MB/s eta 0:00:00
Collecting openai~=1.55.3
Downloading openai-1.55.3-py3-none-any.whl (389 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 389.6/389.6 kB 6.5 MB/s eta 0:00:00
Collecting pandas~=2.2.3
Downloading https://download.pytorch.org/whl/nightly/pandas-2.2.3-cp310-cp310-macosx_11_0_arm64.whl (11.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.3/11.3 MB 10.0 MB/s eta 0:00:00
Collecting prometheus-client~=0.20.0
Downloading prometheus_client-0.20.0-py3-none-any.whl (54 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 54.5/54.5 kB 4.3 MB/s eta 0:00:00
Collecting requests~=2.32.3
Downloading requests-2.32.5-py3-none-any.whl (64 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.7/64.7 kB 5.5 MB/s eta 0:00:00
Collecting wget~=3.2
Downloading wget-3.2.zip (10 kB)
Preparing metadata (setup.py) ... done
Collecting typer~=0.15.1
Downloading typer-0.15.4-py3-none-any.whl (45 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.3/45.3 kB 4.8 MB/s eta 0:00:00
Collecting seaborn~=0.13.2
Downloading seaborn-0.13.2-py3-none-any.whl (294 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 294.9/294.9 kB 9.2 MB/s eta 0:00:00
Collecting pyyaml>=5.1
Downloading https://download.pytorch.org/whl/nightly/PyYAML-6.0.2-cp310-cp310-macosx_11_0_arm64.whl (171 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 171.8/171.8 kB 8.5 MB/s eta 0:00:00
Collecting packaging>=20.0
Downloading packaging-25.0-py3-none-any.whl (66 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.5/66.5 kB 5.8 MB/s eta 0:00:00
Collecting filelock
Downloading https://download.pytorch.org/whl/nightly/filelock-3.19.1-py3-none-any.whl (15 kB)
Collecting transformers<5.0.0,>=4.45.1
Downloading transformers-4.56.0-py3-none-any.whl (11.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 8.6 MB/s eta 0:00:00
Downloading transformers-4.55.4-py3-none-any.whl (11.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.3/11.3 MB 8.5 MB/s eta 0:00:00
Downloading transformers-4.55.3-py3-none-any.whl (11.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.3/11.3 MB 10.1 MB/s eta 0:00:00
Downloading transformers-4.55.2-py3-none-any.whl (11.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.3/11.3 MB 11.5 MB/s eta 0:00:00
Downloading transformers-4.55.1-py3-none-any.whl (11.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.3/11.3 MB 5.1 MB/s eta 0:00:00
Downloading transformers-4.55.0-py3-none-any.whl (11.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.3/11.3 MB 11.4 MB/s eta 0:00:00
Downloading transformers-4.54.1-py3-none-any.whl (11.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.2/11.2 MB 9.9 MB/s eta 0:00:00
Downloading transformers-4.54.0-py3-none-any.whl (11.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.2/11.2 MB 10.9 MB/s eta 0:00:00
Downloading transformers-4.53.3-py3-none-any.whl (10.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.8/10.8 MB 11.3 MB/s eta 0:00:00
Downloading transformers-4.53.2-py3-none-any.whl (10.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.8/10.8 MB 10.1 MB/s eta 0:00:00
Downloading transformers-4.53.1-py3-none-any.whl (10.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.8/10.8 MB 6.5 MB/s eta 0:00:00
Downloading transformers-4.53.0-py3-none-any.whl (10.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.8/10.8 MB 9.7 MB/s eta 0:00:00
Downloading transformers-4.52.4-py3-none-any.whl (10.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.5/10.5 MB 11.1 MB/s eta 0:00:00
Downloading transformers-4.52.3-py3-none-any.whl (10.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.5/10.5 MB 11.5 MB/s eta 0:00:00
Downloading transformers-4.52.2-py3-none-any.whl (10.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.5/10.5 MB 11.1 MB/s eta 0:00:00
Downloading transformers-4.52.1-py3-none-any.whl (10.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.5/10.5 MB 11.5 MB/s eta 0:00:00
Downloading transformers-4.51.3-py3-none-any.whl (10.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.4/10.4 MB 10.5 MB/s eta 0:00:00
Downloading transformers-4.51.2-py3-none-any.whl (10.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.4/10.4 MB 9.4 MB/s eta 0:00:00
Downloading transformers-4.51.1-py3-none-any.whl (10.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.4/10.4 MB 10.4 MB/s eta 0:00:00
Downloading transformers-4.51.0-py3-none-any.whl (10.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.4/10.4 MB 11.0 MB/s eta 0:00:00
Downloading transformers-4.50.3-py3-none-any.whl (10.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/10.2 MB 10.5 MB/s eta 0:00:00
Downloading transformers-4.50.2-py3-none-any.whl (10.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/10.2 MB 5.2 MB/s eta 0:00:00
Downloading transformers-4.50.1-py3-none-any.whl (10.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/10.2 MB 6.7 MB/s eta 0:00:00
Downloading transformers-4.50.0-py3-none-any.whl (10.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/10.2 MB 10.7 MB/s eta 0:00:00
Downloading transformers-4.49.0-py3-none-any.whl (10.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.0/10.0 MB 9.8 MB/s eta 0:00:00
Downloading transformers-4.48.3-py3-none-any.whl (9.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.7/9.7 MB 11.5 MB/s eta 0:00:00
Collecting safetensors>=0.4.1
Downloading https://download.pytorch.org/whl/nightly/safetensors-0.6.2-cp38-abi3-macosx_11_0_arm64.whl (432 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━ 432.2/432.2 kB 10.2 MB/s eta 0:00:00
Collecting transformers<5.0.0,>=4.45.1
Downloading transformers-4.48.2-py3-none-any.whl (9.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.7/9.7 MB 10.3 MB/s eta 0:00:00
Downloading transformers-4.48.1-py3-none-any.whl (9.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.7/9.7 MB 8.7 MB/s eta 0:00:00
Downloading transformers-4.48.0-py3-none-any.whl (9.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.7/9.7 MB 4.0 MB/s eta 0:00:00
Downloading transformers-4.47.1-py3-none-any.whl (10.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.1/10.1 MB 5.5 MB/s eta 0:00:00
Downloading transformers-4.47.0-py3-none-any.whl (10.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.1/10.1 MB 9.3 MB/s eta 0:00:00
Downloading transformers-4.46.3-py3-none-any.whl (10.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.0/10.0 MB 8.1 MB/s eta 0:00:00
Collecting tokenizers<0.21,>=0.20
Downloading tokenizers-0.20.3-cp310-cp310-macosx_11_0_arm64.whl (2.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.6/2.6 MB 8.6 MB/s eta 0:00:00
Collecting tqdm>=4.27
Downloading https://download.pytorch.org/whl/nightly/tqdm-4.67.1-py3-none-any.whl (78 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.5/78.5 kB 6.4 MB/s eta 0:00:00
Collecting regex!=2019.12.17
Downloading regex-2025.9.1-cp310-cp310-macosx_11_0_arm64.whl (286 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 286.6/286.6 kB 8.9 MB/s eta 0:00:00
Collecting jsonschema>=4.21.1
Downloading jsonschema-4.25.1-py3-none-any.whl (90 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 90.0/90.0 kB 7.9 MB/s eta 0:00:00
Collecting pydantic<3.0,>=2.7
Downloading pydantic-2.11.7-py3-none-any.whl (444 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━ 444.8/444.8 kB 10.3 MB/s eta 0:00:00
Collecting tiktoken>=0.7.0
Downloading https://download.pytorch.org/whl/nightly/tiktoken-0.11.0-cp310-cp310-macosx_11_0_arm64.whl (999 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 999.2/999.2 kB 1.3 MB/s eta 0:00:00
Collecting pydantic-extra-types[pycountry]>=2.10.5
Downloading pydantic_extra_types-2.10.5-py3-none-any.whl (38 kB)
Collecting pillow>=10.3.0
Downloading https://download.pytorch.org/whl/nightly/pillow-11.3.0-cp310-cp310-macosx_11_0_arm64.whl (4.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.7/4.7 MB 11.3 MB/s eta 0:00:00
Collecting typing-extensions>=4.11.0
Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.6/44.6 kB 4.7 MB/s eta 0:00:00
Collecting sympy
Downloading https://download.pytorch.org/whl/nightly/sympy-1.14.0-py3-none-any.whl (6.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 8.1 MB/s eta 0:00:00
Collecting fsspec
Downloading fsspec-2025.9.0-py3-none-any.whl (199 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 199.3/199.3 kB 8.9 MB/s eta 0:00:00
Collecting jinja2
Downloading https://download.pytorch.org/whl/nightly/jinja2-3.1.6-py3-none-any.whl (134 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━ 134.9/134.9 kB 10.5 MB/s eta 0:00:00
Collecting networkx
Downloading https://download.pytorch.org/whl/nightly/networkx-3.5-py3-none-any.whl (2.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 10.3 MB/s eta 0:00:00
Collecting multidict<7.0,>=4.5
Downloading https://download.pytorch.org/whl/nightly/multidict-6.6.4-cp310-cp310-macosx_11_0_arm64.whl (44 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.6/44.6 kB 5.7 MB/s eta 0:00:00
Collecting yarl<2.0,>=1.0
Downloading https://download.pytorch.org/whl/nightly/yarl-1.20.1-cp310-cp310-macosx_11_0_arm64.whl (89 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.3/89.3 kB 4.4 MB/s eta 0:00:00
Collecting attrs>=17.3.0
Downloading https://download.pytorch.org/whl/nightly/attrs-25.3.0-py3-none-any.whl (63 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.8/63.8 kB 5.3 MB/s eta 0:00:00
Collecting aiosignal>=1.1.2
Downloading https://download.pytorch.org/whl/nightly/aiosignal-1.4.0-py3-none-any.whl (7.5 kB)
Collecting frozenlist>=1.1.1
Downloading https://download.pytorch.org/whl/nightly/frozenlist-1.7.0-cp310-cp310-macosx_11_0_arm64.whl (46 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 46.8/46.8 kB 4.7 MB/s eta 0:00:00
Collecting async-timeout<5.0,>=4.0
Downloading https://download.pytorch.org/whl/nightly/async_timeout-4.0.3-py3-none-any.whl (5.7 kB)
Collecting pluggy<2,>=1.5
Downloading pluggy-1.6.0-py3-none-any.whl (20 kB)
Collecting iniconfig
Downloading iniconfig-2.1.0-py3-none-any.whl (6.0 kB)
Collecting exceptiongroup>=1.0.0rc8
Downloading exceptiongroup-1.3.0-py3-none-any.whl (16 kB)
Collecting tomli>=1
Downloading tomli-2.2.1-py3-none-any.whl (14 kB)
Collecting kiwisolver>=1.3.1
Downloading kiwisolver-1.4.9-cp310-cp310-macosx_11_0_arm64.whl (65 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 65.3/65.3 kB 5.5 MB/s eta 0:00:00
Collecting cycler>=0.10
Downloading https://download.pytorch.org/whl/nightly/cycler-0.12.1-py3-none-any.whl (8.3 kB)
Collecting pyparsing>=2.3.1
Downloading pyparsing-3.2.3-py3-none-any.whl (111 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 111.1/111.1 kB 8.5 MB/s eta 0:00:00
Collecting python-dateutil>=2.7
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━ 229.9/229.9 kB 10.6 MB/s eta 0:00:00
Collecting fonttools>=4.22.0
Downloading fonttools-4.59.2-cp310-cp310-macosx_10_9_universal2.whl (2.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.8/2.8 MB 11.3 MB/s eta 0:00:00
Collecting contourpy>=1.0.1
Downloading contourpy-1.3.2-cp310-cp310-macosx_11_0_arm64.whl (253 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━ 253.4/253.4 kB 10.1 MB/s eta 0:00:00
Collecting jiter<1,>=0.4.0
Downloading jiter-0.10.0-cp310-cp310-macosx_11_0_arm64.whl (322 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 322.8/322.8 kB 9.3 MB/s eta 0:00:00
Collecting httpx<1,>=0.23.0
Downloading httpx-0.28.1-py3-none-any.whl (73 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 73.5/73.5 kB 6.6 MB/s eta 0:00:00
Collecting distro<2,>=1.7.0
Downloading distro-1.9.0-py3-none-any.whl (20 kB)
Collecting anyio<5,>=3.5.0
Downloading anyio-4.10.0-py3-none-any.whl (107 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 107.2/107.2 kB 8.9 MB/s eta 0:00:00
Collecting sniffio
Downloading sniffio-1.3.1-py3-none-any.whl (10 kB)
Collecting pytz>=2020.1
Downloading https://download.pytorch.org/whl/nightly/pytz-2025.2-py2.py3-none-any.whl (509 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━ 509.2/509.2 kB 10.7 MB/s eta 0:00:00
Collecting tzdata>=2022.7
Downloading https://download.pytorch.org/whl/nightly/tzdata-2025.2-py2.py3-none-any.whl (347 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━ 347.8/347.8 kB 10.6 MB/s eta 0:00:00
Collecting charset_normalizer<4,>=2
Downloading https://download.pytorch.org/whl/nightly/charset_normalizer-3.4.3-cp310-cp310-macosx_10_9_universal2.whl (207 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 207.7/207.7 kB 9.3 MB/s eta 0:00:00
Collecting certifi>=2017.4.17
Downloading https://download.pytorch.org/whl/nightly/certifi-2025.8.3-py3-none-any.whl (161 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 161.2/161.2 kB 5.7 MB/s eta 0:00:00
Collecting idna<4,>=2.5
Downloading https://download.pytorch.org/whl/nightly/idna-3.10-py3-none-any.whl (70 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 70.4/70.4 kB 6.0 MB/s eta 0:00:00
Collecting urllib3<3,>=1.21.1
Downloading https://download.pytorch.org/whl/nightly/urllib3-2.5.0-py3-none-any.whl (129 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 129.8/129.8 kB 7.6 MB/s eta 0:00:00
Collecting shellingham>=1.3.0
Downloading shellingham-1.5.4-py2.py3-none-any.whl (9.8 kB)
Collecting rich>=10.11.0
Downloading rich-14.1.0-py3-none-any.whl (243 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 243.4/243.4 kB 8.7 MB/s eta 0:00:00
Collecting click<8.2,>=8.0.0
Downloading click-8.1.8-py3-none-any.whl (98 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 98.2/98.2 kB 7.5 MB/s eta 0:00:00
Collecting httpcore==1.*
Downloading httpcore-1.0.9-py3-none-any.whl (78 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.8/78.8 kB 8.1 MB/s eta 0:00:00
Collecting h11>=0.16
Downloading h11-0.16.0-py3-none-any.whl (37 kB)
Collecting referencing>=0.28.4
Downloading referencing-0.36.2-py3-none-any.whl (26 kB)
Collecting rpds-py>=0.7.1
Downloading rpds_py-0.27.1-cp310-cp310-macosx_11_0_arm64.whl (353 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 353.5/353.5 kB 9.9 MB/s eta 0:00:00
Collecting jsonschema-specifications>=2023.03.6
Downloading jsonschema_specifications-2025.9.1-py3-none-any.whl (18 kB)
Collecting pydantic-core==2.33.2
Downloading pydantic_core-2.33.2-cp310-cp310-macosx_11_0_arm64.whl (1.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 10.8 MB/s eta 0:00:00
Collecting annotated-types>=0.6.0
Downloading annotated_types-0.7.0-py3-none-any.whl (13 kB)
Collecting typing-inspection>=0.4.0
Downloading typing_inspection-0.4.1-py3-none-any.whl (14 kB)
Collecting pycountry>=23
Downloading pycountry-24.6.1-py3-none-any.whl (6.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 11.1 MB/s eta 0:00:00
Collecting six>=1.5
Downloading https://download.pytorch.org/whl/nightly/six-1.17.0-py2.py3-none-any.whl (11 kB)
Collecting markdown-it-py>=2.2.0
Downloading markdown_it_py-4.0.0-py3-none-any.whl (87 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 87.3/87.3 kB 9.2 MB/s eta 0:00:00
Collecting pygments<3.0.0,>=2.13.0
Downloading pygments-2.19.2-py3-none-any.whl (1.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 10.5 MB/s eta 0:00:00
Collecting propcache>=0.2.1
Downloading propcache-0.3.2-cp310-cp310-macosx_11_0_arm64.whl (43 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 43.0/43.0 kB 4.7 MB/s eta 0:00:00
Collecting MarkupSafe>=2.0
Downloading https://download.pytorch.org/whl/nightly/MarkupSafe-3.0.2-cp310-cp310-macosx_11_0_arm64.whl (12 kB)
Collecting networkx
Downloading https://download.pytorch.org/whl/nightly/networkx-3.4.2-py3-none-any.whl (1.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 5.9 MB/s eta 0:00:00
Collecting mpmath<1.4,>=1.1.0
Downloading https://download.pytorch.org/whl/nightly/mpmath-1.3.0-py3-none-any.whl (536 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 kB 9.5 MB/s eta 0:00:00
Collecting mdurl~=0.1
Downloading mdurl-0.1.2-py3-none-any.whl (10.0 kB)
Using legacy 'setup.py install' for wget, since package 'wheel' is not installed.
Installing collected packages: wget, pytz, mpmath, urllib3, tzdata, typing-extensions, tqdm, tomli, sympy, sniffio, six, shellingham, sentencepiece, safetensors, rpds-py, regex, pyyaml, pyparsing, pygments, pycountry, protobuf, propcache, prometheus-client, pluggy, pillow, packaging, numpy, networkx, mdurl, MarkupSafe, kiwisolver, jiter, iniconfig, idna, h11, fsspec, frozenlist, fonttools, filelock, distro, cycler, click, charset_normalizer, certifi, attrs, async-timeout, annotated-types, typing-inspection, requests, referencing, python-dateutil, pydantic-core, multidict, markdown-it-py, jinja2, httpcore, gguf, exceptiongroup, contourpy, aiosignal, yarl, torch, tiktoken, rich, pytest, pydantic, pandas, matplotlib, jsonschema-specifications, huggingface_hub, anyio, typer, tokenizers, seaborn, pydantic-extra-types, jsonschema, httpx, aiohttp, transformers, openai, mistral-common
Running setup.py install for wget ... done
Successfully installed MarkupSafe-3.0.2 aiohttp-3.9.5 aiosignal-1.4.0 annotated-types-0.7.0 anyio-4.10.0 async-timeout-4.0.3 attrs-25.3.0 certifi-2025.8.3 charset_normalizer-3.4.3 click-8.1.8 contourpy-1.3.2 cycler-0.12.1 distro-1.9.0 exceptiongroup-1.3.0 filelock-3.19.1 fonttools-4.59.2 frozenlist-1.7.0 fsspec-2025.9.0 gguf-0.17.1 h11-0.16.0 httpcore-1.0.9 httpx-0.28.1 huggingface_hub-0.23.5 idna-3.10 iniconfig-2.1.0 jinja2-3.1.6 jiter-0.10.0 jsonschema-4.25.1 jsonschema-specifications-2025.9.1 kiwisolver-1.4.9 markdown-it-py-4.0.0 matplotlib-3.10.6 mdurl-0.1.2 mistral-common-1.8.4 mpmath-1.3.0 multidict-6.6.4 networkx-3.4.2 numpy-1.26.4 openai-1.55.3 packaging-25.0 pandas-2.2.3 pillow-11.3.0 pluggy-1.6.0 prometheus-client-0.20.0 propcache-0.3.2 protobuf-4.25.8 pycountry-24.6.1 pydantic-2.11.7 pydantic-core-2.33.2 pydantic-extra-types-2.10.5 pygments-2.19.2 pyparsing-3.2.3 pytest-8.3.5 python-dateutil-2.9.0.post0 pytz-2025.2 pyyaml-6.0.2 referencing-0.36.2 regex-2025.9.1 requests-2.32.5 rich-14.1.0 rpds-py-0.27.1 safetensors-0.6.2 seaborn-0.13.2 sentencepiece-0.2.1 shellingham-1.5.4 six-1.17.0 sniffio-1.3.1 sympy-1.14.0 tiktoken-0.11.0 tokenizers-0.20.3 tomli-2.2.1 torch-2.4.1 tqdm-4.67.1 transformers-4.46.3 typer-0.15.4 typing-extensions-4.15.0 typing-inspection-0.4.1 tzdata-2025.2 urllib3-2.5.0 wget-3.2 yarl-1.20.1
WARNING: There was an error checking the latest version of pip.
The list of outdated packages (pip list --o, short for --outdated) is shown below; you don't need to run this.
(convert2gguf) (base) Mac-Studio llama.cpp % pip list --o
Package Version Latest Type
----------------- ------- ------- -----
aiohttp 3.9.5 3.12.15 wheel
async-timeout 4.0.3 5.0.1 wheel
click 8.1.8 8.2.1 wheel
huggingface-hub 0.23.5 0.34.4 wheel
numpy 1.26.4 2.2.6 wheel
openai 1.55.3 1.106.1 wheel
pandas 2.2.3 2.3.2 wheel
pip 22.2.2 25.2 wheel
prometheus_client 0.20.0 0.22.1 wheel
protobuf 4.25.8 6.32.0 wheel
pydantic_core 2.33.2 2.39.0 wheel
pytest 8.3.5 8.4.2 wheel
setuptools 63.2.0 80.9.0 wheel
tokenizers 0.20.3 0.22.0 wheel
torch 2.4.1 2.8.0 wheel
transformers 4.46.3 4.56.1 wheel
typer 0.15.4 0.17.4 wheel
WARNING: There was an error checking the latest version of pip.
Converting a Safetensors model to GGUF
Now for the main topic: converting to GGUF. Running the command below performs the conversion.
This time I use "RakutenAI-7B", published on Hugging Face, with the output type set to bf16.
The GGUF output type (and thus its size) is specified after --outtype. The options are detailed below. The further down the table you go, the more model quality degrades, but memory consumption drops and token generation tends to get faster. Be aware that nonsensical answers also become more likely.
Parameter | Quantization bits | Approx. file size when converting a 7B model | Notes |
---|---|---|---|
f32 | 32-bit | approx. 26 GB | Highest quality, but consumes a lot of RAM and token generation is slow. |
f16 | 16-bit | approx. 13 GB | Not quite f32, but still high quality; still consumes a lot of memory and other resources. |
bf16 | 16-bit (bfloat16) | approx. 13 GB | Same as f16, but multiply-accumulate is fast on devices that support bfloat16. |
q8_0 | 8-bit | approx. 7 GB | 8-bit (1-byte) quantization. A good balance of memory use and token generation speed, in my opinion. |
tq2_0 | true 2-bit ternary quantization (2.7 bit) | approx. 3 GB | Quantizes below 4 bits; memory consumption drops substantially. |
tq1_0 | 1.69 bit | approx. 2 GB | Smallest memory footprint, but long answers in particular tend to become incoherent. |
auto | – | unchanged | Used, for example, when regenerating an existing GGUF file, so the model size etc. do not change. |
python convert_hf_to_gguf.py "path to the folder containing the Safetensors model" --outfile "GGUF output folder/RakutenAI-7B-instruct_f16.gguf" --outtype bf16
The output is as follows.
(convert2gguf) (base) Mac-Studio llama.cpp % python convert_hf_to_gguf.py "path to the folder containing the Safetensors model" --outfile "GGUF output folder/RakutenAI-7B-instruct_f16.gguf" --outtype bf16
INFO:hf-to-gguf:Loading model: RakutenAI-7B-instruct
INFO:hf-to-gguf:Model architecture: MistralForCausalLM
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00003.safetensors'
INFO:hf-to-gguf:token_embd.weight, torch.bfloat16 --> F16, shape = {4096, 48000}
INFO:hf-to-gguf:blk.0.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.0.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.0.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.0.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.0.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.0.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.0.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.0.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.0.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.1.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.1.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.1.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.1.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.1.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.1.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.1.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.1.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.1.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.10.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.10.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.10.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.10.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.10.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.2.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.2.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.2.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.2.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.2.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.2.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.2.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.2.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.2.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.3.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.3.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.3.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.3.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.3.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.3.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.3.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.3.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.3.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.4.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.4.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.4.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.4.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.4.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.4.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.4.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.4.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.4.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.5.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.5.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.5.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.5.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.5.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.5.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.5.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.5.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.5.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.6.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.6.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.6.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.6.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.6.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.6.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.6.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.6.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.6.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.7.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.7.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.7.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.7.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.7.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.7.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.7.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.7.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.7.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.8.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.8.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.8.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.8.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.8.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.8.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.8.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.8.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.8.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.9.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.9.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.9.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.9.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.9.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.9.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.9.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.9.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.9.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00002-of-00003.safetensors'
INFO:hf-to-gguf:blk.10.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.10.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.10.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.10.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.11.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.11.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.11.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.11.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.11.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.11.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.11.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.11.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.11.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.12.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.12.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.12.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.12.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.12.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.12.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.12.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.12.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.12.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.13.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.13.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.13.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.13.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.13.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.13.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.13.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.13.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.13.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.14.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.14.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.14.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.14.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.14.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.14.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.14.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.14.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.14.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.15.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.15.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.15.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.15.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.15.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.15.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.15.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.15.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.15.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.16.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.16.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.16.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.16.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.16.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.16.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.16.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.16.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.16.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.17.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.17.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.17.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.17.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.17.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.17.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.17.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.17.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.17.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.18.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.18.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.18.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.18.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.18.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.18.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.18.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.18.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.18.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.19.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.19.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.19.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.19.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.19.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.19.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.19.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.19.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.19.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.20.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.20.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.20.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.20.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.20.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.20.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.20.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.20.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.20.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.21.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.21.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.21.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.21.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.21.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.21.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:gguf: loading model part 'model-00003-of-00003.safetensors'
INFO:hf-to-gguf:output.weight, torch.bfloat16 --> F16, shape = {4096, 48000}
INFO:hf-to-gguf:blk.21.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.21.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.21.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.22.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.22.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.22.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.22.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.22.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.22.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.22.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.22.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.22.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.23.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.23.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.23.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.23.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.23.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.23.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.23.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.23.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.23.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.24.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.24.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.24.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.24.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.24.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.24.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.24.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.24.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.24.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.25.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.25.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.25.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.25.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.25.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.25.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.25.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.25.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.25.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.26.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.26.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.26.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.26.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.26.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.26.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.26.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.26.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.26.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.27.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.27.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.27.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.27.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.27.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.27.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.27.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.27.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.27.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.28.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.28.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.28.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.28.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.28.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.28.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.28.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.28.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.28.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.29.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.29.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.29.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.29.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.29.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.29.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.29.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.29.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.29.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.30.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.30.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.30.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.30.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.30.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.30.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.30.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.30.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.30.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.31.attn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.31.ffn_down.weight, torch.bfloat16 --> F16, shape = {14336, 4096}
INFO:hf-to-gguf:blk.31.ffn_gate.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.31.ffn_up.weight, torch.bfloat16 --> F16, shape = {4096, 14336}
INFO:hf-to-gguf:blk.31.ffn_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:blk.31.attn_k.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:blk.31.attn_output.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.31.attn_q.weight, torch.bfloat16 --> F16, shape = {4096, 4096}
INFO:hf-to-gguf:blk.31.attn_v.weight, torch.bfloat16 --> F16, shape = {4096, 1024}
INFO:hf-to-gguf:output_norm.weight, torch.bfloat16 --> F32, shape = {4096}
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 32768
INFO:hf-to-gguf:gguf: embedding length = 4096
INFO:hf-to-gguf:gguf: feed forward length = 14336
INFO:hf-to-gguf:gguf: head count = 32
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 10000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
WARNING:gguf.vocab:Unknown separator token '<s>' in TemplateProcessing<pair>
INFO:gguf.vocab:Setting special token type bos to 1
INFO:gguf.vocab:Setting special token type eos to 2
INFO:gguf.vocab:Setting special token type unk to 0
INFO:gguf.vocab:Setting special token type pad to 2
INFO:gguf.vocab:Setting add_bos_token to True
INFO:gguf.vocab:Setting add_sep_token to False
INFO:gguf.vocab:Setting add_eos_token to False
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:/GGUF output folder/RakutenAI-7B-instruct_f16.gguf: n_tensors = 291, total_size = 14.7G
Writing: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 14.7G/14.7G [00:50<00:00, 293Mbyte/s]
INFO:hf-to-gguf:Model successfully exported to /GGUF output folder/RakutenAI-7B-instruct/RakutenAI-7B-instruct_f16.gguf
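As a follow-up, the same script accepts the other --outtype values from the table above; a sketch for producing a smaller q8_0 variant (paths are placeholders, as before) would be:
python convert_hf_to_gguf.py "path to the folder containing the Safetensors model" --outfile "GGUF output folder/RakutenAI-7B-instruct_q8_0.gguf" --outtype q8_0
If you also built llama.cpp with CMake as sketched earlier, you can sanity-check the generated GGUF with llama-cli (the binary path assumes the default CMake build directory):
./build/bin/llama-cli -m "GGUF output folder/RakutenAI-7B-instruct_f16.gguf" -p "Hello" -n 64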