Installing Ollama and Llama3-ELYZA-JP-8B - Singularity Experiment Notes

I installed Ollama in a WSL2 environment on Windows. I also took the quantized Llama3-ELYZA-JP-8B model that I had already downloaded with LM Studio and set it up so that it can be used from Ollama.

Installing Ollama (Windows WSL2 environment)

A Windows (Preview) version is available these days, but partly for practice I decided to install the Linux version in WSL2. I am noting the steps here.

The official site gives the following install command.

$ curl -fsSL https://ollama.com/install.sh | sh
curl: (6) Could not resolve host: ollama.com

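In my environment the curl command failed because the host name could not be resolved. Looking into it, I also found warnings that this command is risky, because it pipes a script downloaded from an external site straight into the shell for execution.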

So instead I downloaded install.sh as shown below, so that I could check it before running it.

$ mkdir ollama
$ cd ollama
$ wget https://ollama.com/install.sh
--2024-09-29 12:51:20--  https://ollama.com/install.sh
Resolving ollama.com (ollama.com)... 34.120.132.20
Connecting to ollama.com (ollama.com)|34.120.132.20|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘install.sh’

install.sh         [ <=>                     ]  13.01K  --.-KB/s    in 0s

2024-09-29 12:51:20 (195 MB/s) - ‘install.sh’ saved [13320]

At this point, review the contents of install.sh before running it. (In the end I ran it unchanged, though.)

$ chmod ugo+x install.sh
$ ./install.sh
>>> Installing ollama to /usr/local
[sudo] password for xxxxxx:
>>> Downloading Linux amd64 bundle
###################################### 100.0%############################### 100.0%
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.

The installation appears to have succeeded, so I check the ollama version.

$ ollama --version
ollama version is 0.3.12
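
The installer log says the Ollama API is listening on 127.0.0.1:11434, so the service can also be checked over HTTP rather than through the CLI. Below is a minimal sketch in Python, assuming the default address from the log and Ollama's `/api/tags` endpoint, which lists locally available models. The helper names (`list_models_url`, `parse_model_names`, `list_models`) are made up for this example and are not part of Ollama.

```python
import json
import urllib.error
import urllib.request

# Address printed by the installer; adjust if your service listens elsewhere.
OLLAMA_URL = "http://127.0.0.1:11434"


def list_models_url(base_url=OLLAMA_URL):
    """Build the URL of the endpoint that lists local models."""
    return base_url.rstrip("/") + "/api/tags"


def parse_model_names(body):
    """Extract model names from the JSON body returned by /api/tags."""
    data = json.loads(body)
    return [model["name"] for model in data.get("models", [])]


def list_models(base_url=OLLAMA_URL, timeout=3.0):
    """Ask the running Ollama service which models it has."""
    with urllib.request.urlopen(list_models_url(base_url), timeout=timeout) as resp:
        return parse_model_names(resp.read())


if __name__ == "__main__":
    try:
        print(list_models())
    except (urllib.error.URLError, OSError) as exc:
        # Typical when the ollama service is not running.
        print(f"Ollama API not reachable: {exc}")
```

Right after a fresh install the list is expected to be empty, since no model has been pulled yet.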

Run Llama3. On the first run, the model download starts.

$ ollama run llama3
pulling manifest
pulling 6a0746a1ec1a... 100% ▕████████████████████████████████████████████████████████▏ 4.7 GB
pulling 4fa551d4f938... 100% ▕████████████████████████████████████████████████████████▏  12 KB
pulling 8ab4849b038c... 100% ▕████████████████████████████████████████████████████████▏  254 B
pulling 577073ffcc6c... 100% ▕████████████████████████████████████████████████████████▏  110 B
pulling 3f8eb4da87fa... 100% ▕████████████████████████████████████████████████████████▏  485 B
verifying sha256 digest
writing manifest
success

The download succeeded, so I test it right away. It seems to be weak at Japanese, so I ask in English.

>>> Factorize 6x^2+x-12.
Let's factorize the quadratic expression 6x^2 + x - 12.

First, we can start by looking for two numbers whose product is -12 and whose sum is 1 (since the coefficient of x
is 1). These numbers are -4 and 3, since (-4) × 3 = -12 and (-4) + 3 = 1.

Now, let's write the expression as:

6x^2 + x - 12 = 6x^2 + 3x - 4x - 12

= 3x(2x + 1) - 4(2x + 1)

= (2x + 1)(3x - 4)

So, the factored form of the expression is:

6x^2 + x - 12 = (2x + 1)(3x - 4)

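I had it factorize a quadratic. It usually gets this right, but this time it was apparently having an off day and answered incorrectly: (2x + 1)(3x - 4) expands to 6x^2 - 5x - 4, whereas the correct factorization is (2x + 3)(3x - 4). The installation itself, at least, went fine.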
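
The model's answer can be checked mechanically by multiplying the factors back out. The sketch below represents each polynomial as a list of coefficients, highest degree first; `poly_mul` is a helper written for this example only.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (highest degree first)."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            result[i + j] += ca * cb
    return result


target = [6, 1, -12]                       # 6x^2 + x - 12

llama3_answer = poly_mul([2, 1], [3, -4])  # (2x + 1)(3x - 4)
correct = poly_mul([2, 3], [3, -4])        # (2x + 3)(3x - 4)

print(llama3_answer, llama3_answer == target)  # [6, -5, -4] False
print(correct, correct == target)              # [6, 1, -12] True
```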

Adding a quantized model (Llama3-ELYZA-JP-8B)

I followed the reference page below for this step.

Create a file named "Modelfile" and fill it in as described on that site. I changed only the path to the quantized model (the gguf file), as follows. (Replace username to match your own environment.)

FROM /mnt/c/Users/username/.cache/lm-studio/models/mmnga/Llama-3-ELYZA-JP-8B-gguf/Llama-3-ELYZA-JP-8B-Q4_K_M.gguf
TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|reserved_special_token"
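
To see what the TEMPLATE above does, it helps to look at the text the model receives for a single turn. Ollama evaluates the template with Go template syntax; the Python function below is only a hand-written approximation of that expansion for illustration, covering the part up to where the model starts generating. The function name `render_llama3_prompt` is hypothetical.

```python
def render_llama3_prompt(prompt, system=""):
    """Approximate the Modelfile TEMPLATE expansion for one user turn.

    Mirrors the template's structure: an optional system block, an optional
    user block, then the assistant header where generation begins.
    """
    parts = []
    if system:
        parts.append(
            "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        )
    if prompt:
        parts.append(
            "<|start_header_id|>user<|end_header_id|>\n\n" + prompt + "<|eot_id|>"
        )
    # The model's reply ({{ .Response }}) follows this header and ends with <|eot_id|>.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_llama3_prompt("Factorize 6x^2+x-12.", system="You are a helpful assistant."))
```

The `PARAMETER stop` lines tell Ollama to stop generating when one of these marker strings appears, so the reply ends at `<|eot_id|>` instead of running on into a new header.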

Run the following command to create an Ollama model from the Modelfile.

$ ollama create elyza:jp8b -f Modelfile

I confirmed that Llama3-ELYZA-JP-8B starts with the following command.

$ ollama run elyza:jp8b
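
Once the model is registered, it should also be reachable from programs through the local HTTP API. The following is a minimal sketch, assuming the default address 127.0.0.1:11434, the model name `elyza:jp8b` created above, and Ollama's `/api/generate` endpoint with streaming turned off. `build_generate_request` and `generate` are names introduced for this example.

```python
import json
import urllib.request


def build_generate_request(prompt, model="elyza:jp8b",
                           base_url="http://127.0.0.1:11434"):
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        base_url + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, timeout=300.0, **kwargs):
    """Send the prompt to the local Ollama service and return the reply text."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # With "stream": False the service returns a single JSON object.
        return json.loads(resp.read())["response"]
```

With the service running, calling `generate("日本の首都はどこですか？")` should return the model's answer as a string. The generous timeout is a guess to allow for slow CPU-only inference.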

Today's summary

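Ollama is a polished piece of software. I was able to install it in the Linux environment of Windows WSL2 with almost no trouble. Working at the command line may be off-putting for Windows users, but people who are used to Unix culture will probably find it all the easier to accept.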

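Since this much can be driven from the command line, it occurred to me that an LLM might be usable from Emacs as well, so I asked ELYZA:

Please write ELISP for using ollama from emacs.

It answered:

A library for using ollama from emacs is already available. For example...

After that I searched the web myself and found that an Emacs library called "Ellama" exists. With it you can apparently chat with an LLM inside Emacs, of course, and also have it translate selected text, assist with coding, summarize the text of web pages, and more.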

The following page was a helpful reference.

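These days most people use an editor like VSCode, and even in the IT industry I suspect few people use Emacs. So hardly any readers of this blog are likely to be interested in this Emacs + Ellama information. Still, as someone who prefers Emacs even now, I find the combination of the latest LLM world with Emacs, one of the original OSS environments that has been around for more than 30 years, genuinely exciting. I would like to write up what I have tried with Ellama, but it would get long, so I will save it for a separate article.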