---
sidebar_position: 3
slug: /retrieval_component
---

# Retrieval component

A component that retrieves information from specified datasets.

## Scenarios

A **Retrieval** component is essential in most RAG scenarios, where information is extracted from designated knowledge bases before being sent to the LLM for content generation. A **Retrieval** component can operate either as a standalone workflow module or as a tool for an **Agent** component. In the latter role, the **Agent** component has autonomous control over when to invoke it for query and retrieval.

The following screenshot shows a reference design in which the **Retrieval** component serves as a tool for an **Agent** component. You can find it in the **Report Agent Using Knowledge Base** Agent template.

![retrieval_reference_design](https://raw.githubusercontent.com/infiniflow/ragflow-docs/main/images/retrieval_reference_design.jpg)

## Prerequisites

Ensure you [have properly configured your target knowledge base(s)](../../dataset/configure_knowledge_base.md).

## Quickstart

### 1. Click on a **Retrieval** component to show its configuration panel  

The corresponding configuration panel appears to the right of the canvas. Use this panel to define and fine-tune the **Retrieval** component's search behavior.

### 2. Input query variable(s)

The **Retrieval** component depends on query variables to specify its queries. 

:::caution IMPORTANT
- If you use the **Retrieval** component as a standalone workflow module, input query variables in the **Input Variables** text box.
- If it is used as a tool for an **Agent** component, input the query variables in the **Agent** component's **User prompt** field.
:::

By default, you can use `sys.query`, which is the user query and the default output of the **Begin** component. All global variables defined before the **Retrieval** component can also be used as query statements. Use the `(x)` button or type `/` to show all the available query variables.

### 3. Select knowledge base(s) to query

You can specify one or multiple knowledge bases to retrieve data from. If you select multiple, ensure they use the same embedding model.

### 4. Expand **Advanced Settings** to configure the retrieval method

By default, a combination of weighted keyword similarity and weighted vector cosine similarity is used for retrieval. If a rerank model is selected, a combination of weighted keyword similarity and weighted reranking score will be used instead.

If you are just getting started, you can skip this step and keep the default retrieval method.

:::caution WARNING
Using a rerank model will *significantly* increase the system's response time. If you must use a rerank model, ensure you use a SaaS reranker; if you prefer a locally deployed rerank model, ensure you start RAGFlow with **docker-compose-gpu.yml**.
:::

### 5. Enable cross-language search

If the language of your user query differs from the languages of the knowledge bases, you can select the target languages in the **Cross-language search** dropdown menu. The model then translates the query to ensure accurate matching of semantic meaning across languages.


### 6. Test retrieval results

Click the **Run** button at the top of the canvas to test the retrieval results.

### 7. Choose the next component

When necessary, click the **+** button on the **Retrieval** component to choose the next component in the workflow from the dropdown list.


## Configurations

### Query variables

*Mandatory*

Select the query source for retrieval. Defaults to `sys.query`, which is the default output of the **Begin** component.

The **Retrieval** component relies on query variables to specify its queries. All global variables defined before the **Retrieval** component can also be used as queries. Use the `(x)` button or type `/` to show all the available query variables.

### Knowledge bases 

Select the knowledge base(s) to retrieve data from.

- If no knowledge base is selected, conversations with the agent are not based on any knowledge base. In this case, leave the **Empty response** field blank to avoid an error.
- If you select multiple knowledge bases (datasets), ensure that they all use the same embedding model; otherwise, an error occurs.
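
The embedding-model constraint can be pictured with a short sketch. This is not RAGFlow's implementation; the function name `shared_embedding_model` and the knowledge base and model names are hypothetical and serve only to illustrate the check.

```python
from typing import Mapping, Optional


def shared_embedding_model(kb_models: Mapping[str, str]) -> Optional[str]:
    """Return the single embedding model shared by the selected knowledge bases.

    kb_models maps a knowledge base name to its embedding model name
    (both are hypothetical labels used for illustration).
    """
    models = set(kb_models.values())
    if len(models) > 1:
        # Vectors from different embedding models are not comparable.
        raise ValueError(
            "Selected knowledge bases use different embedding models: "
            + ", ".join(sorted(models))
        )
    # An empty selection yields None; otherwise the one shared model.
    return next(iter(models), None)


print(shared_embedding_model({"manuals": "model-a", "faq": "model-a"}))
```

Vectors produced by different embedding models live in different spaces, so a single query vector cannot be compared meaningfully against both.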

### Similarity threshold

RAGFlow employs a combination of weighted keyword similarity and weighted vector cosine similarity during retrieval. This parameter sets the threshold for similarities between the user query and chunks stored in the datasets. Any chunk with a similarity score below this threshold will be excluded from the results.

Defaults to 0.2.

### Keyword similarity weight

This parameter sets the weight of keyword similarity in the combined similarity score. The total of the two weights must equal 1.0. Its default value is 0.7, which means the weight of vector similarity in the combined search is 1 - 0.7 = 0.3.
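
As a minimal sketch of the weighting described above, the combined score can be read as a weighted sum. The function name `combined_similarity` and the assumption that both inputs are already normalized to the range 0 to 1 are illustrative and are not taken from RAGFlow's code.

```python
def combined_similarity(
    keyword_sim: float, vector_sim: float, keyword_weight: float = 0.7
) -> float:
    """Weighted sum of keyword and vector similarity (illustrative only)."""
    if not 0.0 <= keyword_weight <= 1.0:
        raise ValueError("keyword_weight must be between 0 and 1")
    # The two weights always total 1.0.
    vector_weight = 1.0 - keyword_weight
    return keyword_weight * keyword_sim + vector_weight * vector_sim


# With the default weight: 0.7 * 0.5 + 0.3 * 0.9
print(round(combined_similarity(0.5, 0.9), 2))  # 0.62
```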

### Top N

This parameter selects the "Top N" chunks from the retrieved ones and feeds them to the LLM.

Defaults to 8.
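
The interplay between **Similarity threshold** and **Top N** can be sketched as a filter followed by a cut-off. The function `select_chunks` and the `(chunk, score)` tuple layout are assumptions made for illustration; the actual pipeline may order these steps differently.

```python
from typing import List, Tuple


def select_chunks(
    scored_chunks: List[Tuple[str, float]],
    similarity_threshold: float = 0.2,
    top_n: int = 8,
) -> List[Tuple[str, float]]:
    """Drop chunks scoring below the threshold, then keep the N best."""
    kept = [(chunk, score) for chunk, score in scored_chunks
            if score >= similarity_threshold]
    # Highest score first, so the slice keeps the best matches.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_n]


scored = [("chunk-a", 0.1), ("chunk-b", 0.5), ("chunk-c", 0.3)]
print(select_chunks(scored, top_n=1))  # [('chunk-b', 0.5)]
```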


### Rerank model

*Optional*

If a rerank model is selected, a combination of weighted keyword similarity and weighted reranking score will be used for retrieval.

:::caution WARNING
Using a rerank model will *significantly* increase the system's response time.
:::
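
One way to picture the effect of selecting a rerank model: the reranking score takes the place of vector cosine similarity in the weighted sum. The sketch below assumes the same weight split applies in both cases, which this page does not confirm; `retrieval_score` is a hypothetical name.

```python
from typing import Optional


def retrieval_score(
    keyword_sim: float,
    vector_sim: float,
    rerank_score: Optional[float] = None,
    keyword_weight: float = 0.7,
) -> float:
    """Combine keyword similarity with either vector or rerank score."""
    # Assumption: the rerank score replaces vector similarity when present.
    semantic = vector_sim if rerank_score is None else rerank_score
    return keyword_weight * keyword_sim + (1.0 - keyword_weight) * semantic


print(round(retrieval_score(0.5, 0.9), 2))                    # 0.62
print(round(retrieval_score(0.5, 0.9, rerank_score=0.1), 2))  # 0.38
```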

### Empty response

- Set this as a response if no results are retrieved from the knowledge base(s) for your query, or 
- Leave this field blank to allow the chat model to improvise when nothing is found.

:::caution WARNING
If you do not specify a knowledge base, you must leave this field blank; otherwise, an error occurs.
:::
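
The two options above amount to a simple fallback rule, sketched here. The function `retrieval_output` and the choice to join chunks with newlines are illustrative assumptions, not RAGFlow's actual behavior.

```python
from typing import List


def retrieval_output(chunks: List[str], empty_response: str = "") -> str:
    """Return retrieved text, the fixed empty response, or nothing."""
    if chunks:
        return "\n".join(chunks)
    if empty_response:
        # Nothing was retrieved: reply with the configured message.
        return empty_response
    # Field left blank: pass no context and let the chat model improvise.
    return ""


print(retrieval_output([], empty_response="No relevant content found."))
```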

### Cross-language search

Select one or more languages for cross‑language search. If no language is selected, the system searches with the original query.
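
A rough sketch of this behavior, under stated assumptions: `expand_query` and `toy_translate` are hypothetical, the toy dictionary stands in for the model that performs translation, and keeping the original query alongside its translations is an assumption rather than documented behavior.

```python
from typing import Callable, List, Sequence


def expand_query(
    query: str,
    target_languages: Sequence[str],
    translate: Callable[[str, str], str],
) -> List[str]:
    """Return the queries to search with, one per selected language."""
    if not target_languages:
        # No language selected: search with the original query only.
        return [query]
    return [query] + [translate(query, lang) for lang in target_languages]


# Toy stand-in for a translation model (illustration only).
_TOY_DICTIONARY = {("hello", "fr"): "bonjour", ("hello", "de"): "hallo"}


def toy_translate(text: str, lang: str) -> str:
    return _TOY_DICTIONARY.get((text, lang), text)


print(expand_query("hello", ["fr", "de"], toy_translate))
```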

### Use knowledge graph

:::caution IMPORTANT
Before enabling this feature, ensure you have properly [constructed a knowledge graph from each target knowledge base](../../dataset/construct_knowledge_graph.md).
:::

Whether to use the knowledge graph(s) in the specified knowledge base(s) during retrieval for multi-hop question answering. When enabled, retrieval involves iterative searches across entity, relationship, and community report chunks, which greatly increases retrieval time.

### Output

The global variable name for the output of the **Retrieval** component, which can be referenced by other components in the workflow.


## Frequently asked questions

### How to reduce response time?

Go through the checklist below for best performance:

- Leave the **Rerank model** field empty.
- If you must use a rerank model, ensure you use a SaaS reranker; if you prefer a locally deployed rerank model, ensure you start RAGFlow with **docker-compose-gpu.yml**.
- Disable **Use knowledge graph**.