mirror of
https://github.com/infiniflow/ragflow.git
synced 2025-11-14 17:13:20 +00:00
102 lines
5.4 KiB
Markdown
102 lines
5.4 KiB
Markdown
---
|
|
sidebar_position: 8
|
|
slug: /construct_knowledge_graph
|
|
---
|
|
|
|
# Construct knowledge graph
|
|
|
|
Generate a knowledge graph for your dataset.
|
|
|
|
---
|
|
|
|
To enhance multi-hop question-answering, RAGFlow adds a knowledge graph construction step between data extraction and indexing, as illustrated below. This step creates additional chunks from existing ones generated by your specified chunking method.
|
|
|
|

|
|
|
|
From v0.16.0 onward, RAGFlow supports constructing a knowledge graph on a dataset, allowing you to construct a *unified* graph across multiple files within your dataset. When a newly uploaded file starts parsing, the generated graph will automatically update.
|
|
|
|
:::danger WARNING
|
|
Constructing a knowledge graph requires significant memory, computational resources, and tokens.
|
|
:::
|
|
|
|
## Scenarios
|
|
|
|
Knowledge graphs are especially useful for multi-hop question-answering involving *nested* logic. They outperform traditional extraction approaches when you are performing question answering on books or works with complex entities and relationships.
|
|
|
|
:::tip NOTE
|
|
RAPTOR (Recursive Abstractive Processing for Tree Organized Retrieval) can also be used for multi-hop question-answering tasks. See [Enable RAPTOR](./enable_raptor.md) for details. You may use either approach or both, but ensure you understand the memory, computational, and token costs involved.
|
|
:::
|
|
|
|
## Prerequisites
|
|
|
|
The system's default chat model is used to generate knowledge graph. Before proceeding, ensure that you have a chat model properly configured:
|
|
|
|

|
|
|
|
## Configurations
|
|
|
|
### Entity types (*Required*)
|
|
|
|
The types of the entities to extract from your dataset. The default types are: **organization**, **person**, **event**, and **category**. Add or remove types to suit your specific dataset.
|
|
|
|
### Method
|
|
|
|
The method to use to construct knowledge graph:
|
|
|
|
- **General**: Use prompts provided by [GraphRAG](https://github.com/microsoft/graphrag) to extract entities and relationships.
|
|
- **Light**: (Default) Use prompts provided by [LightRAG](https://github.com/HKUDS/LightRAG) to extract entities and relationships. This option consumes fewer tokens, less memory, and fewer computational resources.
|
|
|
|
### Entity resolution
|
|
|
|
Whether to enable entity resolution. You can think of this as an entity deduplication switch. When enabled, the LLM will combine similar entities - e.g., '2025' and 'the year of 2025', or 'IT' and 'Information Technology' - to construct a more effective graph.
|
|
|
|
- (Default) Disable entity resolution.
|
|
- Enable entity resolution. This option consumes more tokens.
|
|
|
|
### Community reports
|
|
|
|
In a knowledge graph, a community is a cluster of entities linked by relationships. You can have the LLM generate an abstract for each community, known as a community report. See [here](https://www.microsoft.com/en-us/research/blog/graphrag-improving-global-search-via-dynamic-community-selection/) for more information. This indicates whether to generate community reports:
|
|
|
|
- Generate community reports. This option consumes more tokens.
|
|
- (Default) Do not generate community reports.
|
|
|
|
## Quickstart
|
|
|
|
1. Navigate to the **Configuration** page of your dataset and update:
|
|
|
|
- Entity types: *Required* - Specifies the entity types in the knowledge graph to generate. You don't have to stick with the default, but you need to customize them for your documents.
|
|
- Method: *Optional*
|
|
- Entity resolution: *Optional*
|
|
- Community reports: *Optional*
|
|
*The default knowledge graph configurations for your dataset are now set.*
|
|
|
|
2. Navigate to the **Files** page of your dataset, click the **Generate** button on the top right corner of the page, then select **Knowledge graph** from the dropdown to initiate the knowledge graph generation process.
|
|
|
|
*You can click the pause button in the dropdown to halt the build process when necessary.*
|
|
|
|
3. Go back to the **Configuration** page:
|
|
|
|
*Once a knowledge graph is generated, the **Knowledge graph** field changes from `Not generated` to `Generated at a specific timestamp`. You can delete it by clicking the recycle bin button to the right of the field.*
|
|
|
|
4. To use the created knowledge graph, do either of the following:
|
|
|
|
- In the **Chat setting** panel of your chat app, switch on the **Use knowledge graph** toggle.
|
|
- If you are using an agent, click the **Retrieval** agent component to specify the dataset(s) and switch on the **Use knowledge graph** toggle.
|
|
|
|
## Frequently asked questions
|
|
|
|
### Does the knowledge graph automatically update when I remove a related file?
|
|
|
|
Nope. The knowledge graph does *not* update *until* you regenerate a knowledge graph for your dataset.
|
|
|
|
### How to remove a generated knowledge graph?
|
|
|
|
On the **Configuration** page of your dataset, find the **Knoweledge graph** field and click the recycle bin button to the right of the field.
|
|
|
|
### Where is the created knowledge graph stored?
|
|
|
|
All chunks of the created knowledge graph are stored in RAGFlow's document engine: either Elasticsearch or [Infinity](https://github.com/infiniflow/infinity).
|
|
|
|
### How to export a created knowledge graph?
|
|
|
|
Nope. Exporting a created knowledge graph is not supported. If you still consider this feature essential, please [raise an issue](https://github.com/infiniflow/ragflow/issues) explaining your use case and its importance. |