From 3413f43b47d48d4a04b24b91f0a8bd574d57f6f2 Mon Sep 17 00:00:00 2001
From: writinwaters <93570324+writinwaters@users.noreply.github.com>
Date: Mon, 8 Jul 2024 19:30:29 +0800
Subject: [PATCH] Fixed a docusaurus display issue (#1431)

### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Documentation Update
---
 docs/guides/deploy_local_llm.md | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/docs/guides/deploy_local_llm.md b/docs/guides/deploy_local_llm.md
index 8b184e3af..1326125b3 100644
--- a/docs/guides/deploy_local_llm.md
+++ b/docs/guides/deploy_local_llm.md
@@ -236,32 +236,28 @@ You may launch the Ollama service as below:
 ollama serve
 ```
 
-> [!NOTE]
+
 > Please set the environment variable `OLLAMA_NUM_GPU` to `999` to make sure all layers of your model run on the Intel GPU; otherwise, some layers may run on the CPU.
 
-> [!TIP]
+
 > If your local LLM is running on Intel Arc™ A-Series Graphics with Linux OS (Kernel 6.2), it is recommended to additionally set the following environment variable for optimal performance before executing `ollama serve`:
 >
 > ```bash
 > export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 > ```
 
-> [!NOTE]
+
 > To allow the service to accept connections from all IP addresses, use `OLLAMA_HOST=0.0.0.0 ./ollama serve` instead of just `./ollama serve`.
 
 The console will display messages similar to the following:
 
-
-
-
+![](https://llm-assets.readthedocs.io/en/latest/_images/ollama_serve.png)
 
 ### 3. Pull and Run Ollama Model
 
 Keep the Ollama service on, open another terminal, and run `./ollama pull <model_name>` on Linux (`ollama.exe pull <model_name>` on Windows) to automatically pull a model, e.g. `qwen2:latest`:
 
-
-
-
+![](https://llm-assets.readthedocs.io/en/latest/_images/ollama_pull.png)
 
 #### Run Ollama Model
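
For reference, the launch steps described in the patched section can be collected into one snippet. This is a minimal sketch, assuming the ipex-llm build of Ollama (`./ollama`) is in the current directory on a Linux host with an Intel Arc A-Series GPU; adjust the path and variables to your setup:

```bash
# Make sure all model layers run on the Intel GPU rather than the CPU.
export OLLAMA_NUM_GPU=999

# Recommended on Intel Arc A-Series Graphics with Linux kernel 6.2 for better performance.
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

# Bind to all interfaces so other machines (e.g. the RAGFlow host) can reach the service.
OLLAMA_HOST=0.0.0.0 ./ollama serve
```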
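Once the service is up, pulling and smoke-testing a model from a second terminal might look like the sketch below. `qwen2:latest` is the example model from the patched text; the `curl` check is an extra assumption that Ollama is listening on its default port 11434 on the same machine:

```bash
# Pull the example model (use `ollama.exe pull` on Windows).
./ollama pull qwen2:latest

# Optional: confirm the service is reachable and the model is listed.
curl http://localhost:11434/api/tags
```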