---
sidebar_position: 1
slug: /
---

# Quick start

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. When integrated with LLMs, it provides truthful question answering, backed by well-founded citations from data in a variety of complex formats.

This quick start guide describes a general process from:

- Starting up a local RAGFlow server, 
- Creating a knowledge base, 
- Intervening with file parsing, to 
- Establishing an AI chat based on your datasets. 

## Prerequisites

- CPU >= 4 cores
- RAM >= 16 GB
- Disk >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1

  > If you have not installed Docker on your local machine (Windows, Mac, or Linux), see [Install Docker Engine](https://docs.docker.com/engine/install/).
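
The minimums above can be checked before installation. The following is a minimal sketch for a Linux host, not an official RAGFlow tool: the helper names `version_ge`, `report`, and `report_version` are illustrative, and the disk check assumes the requirement refers to free space on the filesystem that holds the current directory (your Docker data directory may live elsewhere).

```bash
#!/usr/bin/env bash
# Preflight sketch: compares this machine with the minimums listed above.
# Helper names are illustrative and not part of RAGFlow.

# version_ge A B: succeed when dotted version A >= B; a leading "v" is ignored.
version_ge() {
    awk -v a="${1#v}" -v b="${2#v}" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0   # missing fields count as 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# report NAME ACTUAL REQUIRED: compare two integers and print the verdict.
report() {
    if [ "$2" -ge "$3" ]; then
        echo "ok: $1 = $2 (need >= $3)"
    else
        echo "FAIL: $1 = $2 (need >= $3)"
    fi
}

# report_version NAME ACTUAL REQUIRED: same as report, for dotted versions.
report_version() {
    if version_ge "$2" "$3"; then
        echo "ok: $1 = $2 (need >= $3)"
    else
        echo "FAIL: $1 = $2 (need >= $3)"
    fi
}

if command -v nproc >/dev/null 2>&1; then
    report "CPU cores" "$(nproc)" 4
fi

if [ -r /proc/meminfo ]; then
    # MemTotal is in kB; round to the nearest GB because the kernel reserves
    # some memory, so a 16 GB machine reports slightly less than 16.
    mem_gb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 / 1024 + 0.5 }' /proc/meminfo)
    report "RAM (GB)" "$mem_gb" 16
fi

# Free space (GB) on the filesystem that holds the current directory.
disk_gb=$(df -Pk . | awk 'NR == 2 { printf "%d", $4 / 1024 / 1024 }')
report "Free disk (GB)" "$disk_gb" 50

if command -v docker >/dev/null 2>&1; then
    docker_v=$(docker version --format '{{.Client.Version}}' 2>/dev/null)
    compose_v=$(docker compose version --short 2>/dev/null)
    report_version "Docker" "${docker_v:-0}" 24.0.0
    report_version "Docker Compose" "${compose_v:-0}" 2.26.1
else
    echo "FAIL: Docker is not installed"
fi
```

A `FAIL` line is a prompt to look closer, not a hard stop: the RAM figure in particular is rounded, so treat a near miss as a warning.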

## Start up the server

This section provides instructions on setting up the RAGFlow server on Linux. If you are on a different operating system, no worries. Most steps are alike. 

1. Ensure `vm.max_map_count` >= 262144:

   > To check the value of `vm.max_map_count`:
   >
   > ```bash
   > $ sysctl vm.max_map_count
   > ```
   >
   > If the value is less than 262144, reset `vm.max_map_count` to at least 262144:
   >
   > ```bash
   > # In this case, we set it to 262144:
   > $ sudo sysctl -w vm.max_map_count=262144
   > ```
   >
   > This change will be reset after a system reboot. To ensure your change remains permanent, add or update the `vm.max_map_count` value in **/etc/sysctl.conf** accordingly:
   >
   > ```bash
   > vm.max_map_count=262144
   > ```
   > See [this guide](./guides/max_map_count.md) for instructions on permanently setting `vm.max_map_count` on an operating system other than Linux. 

2. Clone the repo:

   ```bash
   $ git clone https://github.com/infiniflow/ragflow.git
   ```

3. Pull the pre-built Docker images and start up the server:

   > Running the following commands automatically downloads the *dev* version of the RAGFlow Docker image. To download and run a specific version, update `RAGFLOW_VERSION` in **docker/.env** to the intended version, for example `RAGFLOW_VERSION=v0.7.0`, before running the following commands.

   ```bash
   $ cd ragflow/docker
   $ chmod +x ./entrypoint.sh
   $ docker compose up -d
   ```

   > The core image is about 9 GB in size and may take a while to load.

4. Once the server is up and running, check its status:

   ```bash
   $ docker logs -f ragflow-server
   ```

   _The following output confirms a successful launch of the system:_

   ```bash
       ____                 ______ __
      / __ \ ____ _ ____ _ / ____// /____  _      __
     / /_/ // __ `// __ `// /_   / // __ \| | /| / /
    / _, _// /_/ // /_/ // __/  / // /_/ /| |/ |/ /
   /_/ |_| \__,_/ \__, //_/    /_/ \____/ |__/|__/
                 /____/
   
    * Running on all addresses (0.0.0.0)
    * Running on http://127.0.0.1:9380
    * Running on http://x.x.x.x:9380
    INFO:werkzeug:Press CTRL+C to quit
   ```

   > If you skip this confirmation step and directly log in to RAGFlow, your browser may prompt a `network anomaly` error because, at that moment, your RAGFlow may not be fully initialized.  

5. In your web browser, enter the IP address of your server and log in to RAGFlow.

   > With default settings, you only need to enter `http://IP_OF_YOUR_MACHINE` (**sans** port number), because the default HTTP serving port `80` can be omitted.
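
Some of the steps above lend themselves to scripting. The following is a minimal sketch, assuming a Linux host: the function names `max_map_count_ok`, `set_ragflow_version`, and `server_ready` are illustrative helpers and not part of RAGFlow, and the readiness check simply looks for the `Running on all addresses` line shown in step 4.

```bash
#!/usr/bin/env bash
# Helper sketch for steps 1, 3, and 4. Function names are illustrative.

# max_map_count_ok VALUE: succeed when VALUE meets the minimum from step 1.
max_map_count_ok() {
    [ "$1" -ge 262144 ]
}

# set_ragflow_version FILE VERSION: pin RAGFLOW_VERSION in an env file (step 3).
# Rewrites an existing RAGFLOW_VERSION= line, or appends one if none exists.
# A backup of the original is left next to it as FILE.bak.
set_ragflow_version() {
    if grep -q '^RAGFLOW_VERSION=' "$1"; then
        sed -i.bak "s|^RAGFLOW_VERSION=.*|RAGFLOW_VERSION=$2|" "$1"
    else
        printf 'RAGFLOW_VERSION=%s\n' "$2" >> "$1"
    fi
}

# server_ready: read log text on stdin and succeed once the startup line
# from step 4 is present.
server_ready() {
    grep -q 'Running on all addresses'
}

# Example usage from the cloned repository root (left commented out so that
# sourcing this file has no side effects):
#
#   max_map_count_ok "$(sysctl -n vm.max_map_count)" \
#       || sudo sysctl -w vm.max_map_count=262144
#   set_ragflow_version docker/.env v0.7.0
#   (cd docker && docker compose up -d)
#   until docker logs ragflow-server 2>&1 | server_ready; do sleep 5; done
#   echo "RAGFlow is ready; open http://IP_OF_YOUR_MACHINE in your browser."
```

Polling the log before opening the browser avoids the `network anomaly` error mentioned in step 4, which appears when you log in before initialization completes.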

## Configure LLMs

RAGFlow is a RAG engine, and it needs to work with an LLM to offer grounded, hallucination-free question-answering capabilities. For now, RAGFlow supports the following LLMs, and the list is expanding:

- OpenAI
- Tongyi-Qianwen
- ZHIPU-AI
- Moonshot
- DeepSeek-V2
- Baichuan
- VolcEngine

>  RAGFlow also supports deploying LLMs locally using Ollama or Xinference, but this part is not covered in this quick start guide. 

To add and configure an LLM: 

1. Click on your logo on the top right of the page **>** **Model Providers**:

   ![2 add llm](https://github.com/infiniflow/ragflow/assets/93570324/10635088-028b-4b3d-add9-5c5a6e626814)

   > Each RAGFlow account is able to use **text-embedding-v2**, an embedding model of Tongyi-Qianwen, for free. This is why you can see Tongyi-Qianwen in the **Added models** list. You may need to update your Tongyi-Qianwen API key at a later point.

2. Click on the desired LLM and update the API key accordingly (DeepSeek-V2 in this case):

   ![update api key](https://github.com/infiniflow/ragflow/assets/93570324/4e5e13ef-a98d-42e6-bcb1-0c6045fc1666)

   *Your added models appear as follows:* 

   ![added available models](https://github.com/infiniflow/ragflow/assets/93570324/d08b80e4-f921-480a-b41d-11832489c916)

3. Click **System Model Settings** to select the default models: 

   - Chat model, 
   - Embedding model, 
   - Image-to-text model. 

   ![system model settings](https://github.com/infiniflow/ragflow/assets/93570324/cdcc1da5-4494-44cd-ad5b-1222ed6acc3f)

   > Some models, such as the image-to-text model **qwen-vl-max**, are subsidiary to a specific LLM, and you may need to update your API key to access these models.

## Create your first knowledge base

You can upload files to a knowledge base in RAGFlow and parse them into datasets. A knowledge base is essentially a collection of datasets. Question answering in RAGFlow can be based on a particular knowledge base or multiple knowledge bases. File formats that RAGFlow supports include documents (PDF, DOC, DOCX, TXT, MD), tables (CSV, XLSX, XLS), pictures (JPEG, JPG, PNG, TIF, GIF), and slides (PPT, PPTX).
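
Before uploading a batch of files, it can help to check their extensions against the list above. The following is a minimal sketch: `kb_file_category` is an illustrative helper, not a RAGFlow command, and it only inspects the file extension, not the file contents.

```bash
#!/usr/bin/env bash
# kb_file_category PATH: print the category this guide lists for the file's
# extension, or fail if the extension is not in the list.
kb_file_category() {
    # Lower-case the extension so that "Report.PDF" matches "pdf".
    ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
    case "$ext" in
        pdf|doc|docx|txt|md)   echo document ;;
        csv|xlsx|xls)          echo table ;;
        jpeg|jpg|png|tif|gif)  echo picture ;;
        ppt|pptx)              echo slides ;;
        *)                     return 1 ;;
    esac
}

# Classify every path given on the command line.
for f in "$@"; do
    if category=$(kb_file_category "$f"); then
        echo "$f: $category"
    else
        echo "$f: unsupported extension" >&2
    fi
done
```

Saved under a hypothetical name such as `check_files.sh`, it could be run as `bash check_files.sh ./docs/*` to spot files that would need converting before upload.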

To create your first knowledge base:

1. Click the **Knowledge Base** tab in the top middle of the page **>** **Create knowledge base**.

2. Input the name of your knowledge base and click **OK** to confirm your changes.

   _You are taken to the **Configuration** page of your knowledge base._

   ![knowledge base configuration](https://github.com/infiniflow/ragflow/assets/93570324/384c671a-8b9c-468c-b1c9-1401128a9b65)

3. RAGFlow offers multiple chunk templates that cater to different document layouts and file formats. Select the embedding model and chunk method (template) for your knowledge base. 

   > IMPORTANT: Once you have selected an embedding model and used it to parse a file, you are no longer allowed to change it. This is because all files in a specific knowledge base must be parsed using the *same* embedding model, so that they are compared in the same embedding space.

   _You are taken to the **Dataset** page of your knowledge base._

4. Click **+ Add file** **>** **Local files** to start uploading a particular file to the knowledge base. 

5. In the uploaded file entry, click the play button to start file parsing:

   ![file parsing](https://github.com/infiniflow/ragflow/assets/93570324/19f273fa-0ab0-435e-bdf4-a47fb080a078)

   _When the file parsing completes, its parsing status changes to **SUCCESS**._

## Intervene with file parsing 

RAGFlow features visibility and explainability, allowing you to view the chunking results and intervene where necessary. To do so: 

1. Click the file whose parsing has completed to view the chunking results:

   _You are taken to the **Chunk** page:_

   ![chunks](https://github.com/infiniflow/ragflow/assets/93570324/0547fd0e-e71b-41f8-8e0e-31649c85fd3d)

2. Hover over each snapshot for a quick view of each chunk.

3. Double click the chunked texts to add keywords or make *manual* changes where necessary:

   ![update chunk](https://github.com/infiniflow/ragflow/assets/93570324/1d84b408-4e9f-46fd-9413-8c1059bf9c76)

4. In **Retrieval testing**, ask a quick question in **Test text** to double-check that your configurations work:

   _As you can tell from the following, RAGFlow responds with truthful citations._

   ![retrieval test](https://github.com/infiniflow/ragflow/assets/93570324/c03f06f6-f41f-4b20-a97e-ae405d3a950c)

## Set up an AI chat

Conversations in RAGFlow are based on a particular knowledge base or multiple knowledge bases. Once you have created your knowledge base and finished file parsing, you can go ahead and start an AI conversation. 

1. Click the **Chat** tab in the top middle of the page **>** **Create an assistant** to show the **Chat Configuration** dialogue *of your next dialogue*.

   > RAGFlow offers the flexibility of choosing a different chat model for each dialogue, while allowing you to set the default models in **System Model Settings**.

2. Update **Assistant Setting**: 

   - Name your assistant and specify your knowledge bases.
   - **Empty response**:
     - If you wish to *confine* RAGFlow's answers to your knowledge bases, leave a response here. Then when it doesn't retrieve an answer, it *uniformly* responds with what you set here. 
     - If you wish RAGFlow to *improvise* when it doesn't retrieve an answer from your knowledge bases, leave it blank, which may give rise to hallucinations. 

3. Update **Prompt Engine** or leave it as is for now.

4. Update **Model Setting**.

5. RAGFlow also offers conversation APIs. Hover over your dialogue **>** **Chat Bot API** to integrate RAGFlow's chat capabilities into your applications:

   ![chatbot api](https://github.com/infiniflow/ragflow/assets/93570324/fec23715-f9af-4ac2-81e5-942c5035c5e6)

6. Now, let's start the show:

   ![question1](https://github.com/infiniflow/ragflow/assets/93570324/bb72dd67-b35e-4b2a-87e9-4e4edbd6e677)

   ![question2](https://github.com/infiniflow/ragflow/assets/93570324/7cc585ae-88d0-4aa2-817d-0370b2ad7230)
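
The **Empty response** setting from step 2 amounts to a three-way decision. The following is a conceptual sketch of that decision as described in this guide, not RAGFlow's actual implementation: `choose_reply` and its placeholder outputs are illustrative only.

```bash
#!/usr/bin/env bash
# choose_reply RETRIEVED EMPTY_RESPONSE
#   RETRIEVED       content retrieved from the knowledge bases ("" if none)
#   EMPTY_RESPONSE  the value of the Empty response field ("" if left blank)
choose_reply() {
    if [ -n "$1" ]; then
        # Something was retrieved: the answer is grounded in it.
        echo "answer grounded in retrieved content"
    elif [ -n "$2" ]; then
        # Nothing retrieved, field set: reply uniformly with the configured text.
        echo "$2"
    else
        # Nothing retrieved, field blank: the LLM improvises.
        echo "answer improvised by the LLM (may hallucinate)"
    fi
}

# Nothing retrieved and an Empty response configured: prints the configured text.
choose_reply "" "Sorry, I could not find that in the knowledge base."
```

Setting the field trades coverage for reliability: the assistant declines to answer rather than risk a hallucinated reply.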