---
sidebar_position: 5
slug: /deploy_local_llm
---
# Deploy a local LLM
RAGFlow supports deploying LLMs locally using Ollama or Xinference.
## Ollama
[Ollama](https://github.com/ollama/ollama) enables one-click deployment of local LLMs.
### Install
- [Ollama on Linux](https://github.com/ollama/ollama/blob/main/docs/linux.md)
- [Ollama Windows Preview](https://github.com/ollama/ollama/blob/main/docs/windows.md)
- [Docker](https://hub.docker.com/r/ollama/ollama)
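
If you take the Docker route, a typical CPU-only run looks like the following (see the Docker Hub page above for GPU options):
```bash
$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```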
### Launch Ollama
Decide which LLM you want to deploy ([here is a list of supported LLMs](https://ollama.com/library)), say, **mistral**:
```bash
$ ollama run mistral
```
Or, if Ollama is running in a Docker container named `ollama`:
```bash
$ docker exec -it ollama ollama run mistral
```
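To confirm that Ollama is serving the model, you can send a quick test request to its REST API (assuming the default port `11434`):
```bash
$ curl http://localhost:11434/api/generate -d '{"model": "mistral", "prompt": "Why is the sky blue?"}'
```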
### Use Ollama in RAGFlow
- Go to 'Settings > Model Providers > Models to be added > Ollama'.

> Base URL: Enter the base URL where the Ollama service is accessible, for example, `http://<your-ollama-endpoint-domain>:11434`.
- Use Ollama Models.
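
A quick way to verify that the base URL is reachable is to curl it from the machine running RAGFlow; Ollama answers plain HTTP requests on its root path:
```bash
$ curl http://<your-ollama-endpoint-domain>:11434
```
It should reply with `Ollama is running`. Note that if RAGFlow itself runs in Docker, `localhost` refers to the RAGFlow container, so use an address that is reachable from inside that container.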

## Xinference
Xorbits Inference ([Xinference](https://github.com/xorbitsai/inference)) empowers you to unleash the full potential of cutting-edge AI models.
### Install
- [pip install "xinference[all]"](https://inference.readthedocs.io/en/latest/getting_started/installation.html)
- [Docker](https://inference.readthedocs.io/en/latest/getting_started/using_docker_image.html)
To start a local instance of Xinference, run the following command:
```bash
$ xinference-local --host 0.0.0.0 --port 9997
```
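Once the server is up, you can check that its OpenAI-compatible endpoint is responding (the list will be empty until you launch a model):
```bash
$ curl http://localhost:9997/v1/models
```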
### Launch Xinference
Decide which LLM you want to deploy ([here is a list of supported LLMs](https://inference.readthedocs.io/en/latest/models/builtin/)), say, **mistral**.
Execute the following command to launch the model, replacing `${quantization}` with the quantization method supported by your chosen model and format (see the model's entry in the builtin model list linked above):
```bash
$ xinference launch -u mistral --model-name mistral-v0.1 --size-in-billions 7 --model-format pytorch --quantization ${quantization}
```
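Because Xinference exposes an OpenAI-compatible API, you can sanity-check the launched model with a chat completion request; the `model` field is the model UID set with `-u` above (here, `mistral`):
```bash
$ curl http://localhost:9997/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "mistral", "messages": [{"role": "user", "content": "Hello"}]}'
```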
### Use Xinference in RAGFlow
- Go to 'Settings > Model Providers > Models to be added > Xinference'.

> Base URL: Enter the base URL where the Xinference service is accessible, for example, `http://<your-xinference-endpoint-domain>:9997/v1`.
- Use Xinference Models.
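
As with Ollama, it helps to confirm that the base URL is reachable from the host running RAGFlow before adding the model:
```bash
$ curl http://<your-xinference-endpoint-domain>:9997/v1/models
```
The model name you enter in RAGFlow should match the UID used when launching the model in Xinference (here, `mistral`).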

