Update README

This commit is contained in:
yangdx 2025-07-27 17:26:49 +08:00
parent 2054bba7e6
commit 3f5ade47cd
3 changed files with 5 additions and 5 deletions

View File

@ -6,7 +6,7 @@ LightRAG employs a multi-layered concurrent control strategy when processing mul
**Control Parameter**: `max_parallel_insert`
This parameter controls the number of documents processed simultaneously. The purpose is to prevent excessive parallelism from overwhelming system resources, which could lead to extended processing times for individual files. Document-level concurrency is governed by the `max_parallel_insert` attribute within LightRAG, which defaults to 2 and is configurable via the `MAX_PARALLEL_INSERT` environment variable.
This parameter controls the number of documents processed simultaneously. The purpose is to prevent excessive parallelism from overwhelming system resources, which could lead to extended processing times for individual files. Document-level concurrency is governed by the `max_parallel_insert` attribute within LightRAG, which defaults to 2 and is configurable via the `MAX_PARALLEL_INSERT` environment variable. `max_parallel_insert` is recommended to be set between 2 and 10, typically `llm_model_max_async/3`. Setting this value too high can increase the likelihood of naming conflicts among entities and relationships across different documents during the merge phase, thereby reducing its overall efficiency.
### 2. Chunk-Level Concurrent Control

View File

@ -565,8 +565,8 @@ pip install lightrag-hku
LightRAG 中的文档处理流程有些复杂,分为两个主要阶段:提取阶段(实体和关系提取)和合并阶段(实体和关系合并)。有两个关键参数控制流程并发性:并行处理的最大文件数(`MAX_PARALLEL_INSERT`)和最大并发 LLM 请求数(`MAX_ASYNC`)。工作流程描述如下:
1. `MAX_PARALLEL_INSERT` 控制提取阶段并行处理的文件数量
2. `MAX_ASYNC` 限制系统中并发 LLM 请求的总数包括查询、提取和合并的请求。LLM 请求具有不同的优先级:查询操作优先级最高,其次是合并,然后是提取
1. `MAX_ASYNC` 限制系统中并发 LLM 请求的总数包括查询、提取和合并的请求。LLM 请求具有不同的优先级:查询操作优先级最高,其次是合并,然后是提取
2. `MAX_PARALLEL_INSERT` 控制提取阶段并行处理的文件数量。`MAX_PARALLEL_INSERT`建议设置为210之间通常设置为 `MAX_ASYNC/3`,设置太大会导致合并阶段不同文档之间实体和关系重名的机会增大,降低合并阶段的效率
3. 在单个文件中,来自不同文本块的实体和关系提取是并发处理的,并发度由 `MAX_ASYNC` 设置。只有在处理完 `MAX_ASYNC` 个文本块后,系统才会继续处理同一文件中的下一批文本块。
4. 当一个文件完成实体和关系提后,将进入实体和关系合并阶段。这一阶段也会并发处理多个实体和关系,其并发度同样是由 `MAX_ASYNC` 控制。
5. 合并阶段的 LLM 请求的优先级别高于提取阶段,目的是让进入合并阶段的文件尽快完成处理,并让处理结果尽快更新到向量数据库中。

View File

@ -501,8 +501,8 @@ EMBEDDING_BINDING_HOST=http://localhost:11434
The document processing pipeline in LightRAG is somewhat complex and is divided into two primary stages: the Extraction stage (entity and relationship extraction) and the Merging stage (entity and relationship merging). There are two key parameters that control pipeline concurrency: the maximum number of files processed in parallel (MAX_PARALLEL_INSERT) and the maximum number of concurrent LLM requests (MAX_ASYNC). The workflow is described as follows:
1. MAX_PARALLEL_INSERT controls the number of files processed in parallel during the extraction stage.
2. MAX_ASYNC limits the total number of concurrent LLM requests in the system, including those for querying, extraction, and merging. LLM requests have different priorities: query operations have the highest priority, followed by merging, and then extraction.
1. MAX_ASYNC limits the total number of concurrent LLM requests in the system, including those for querying, extraction, and merging. LLM requests have different priorities: query operations have the highest priority, followed by merging, and then extraction.
2. MAX_PARALLEL_INSERT controls the number of files processed in parallel during the extraction stage. For optimal performance, MAX_PARALLEL_INSERT is recommended to be set between 2 and 10, typically MAX_ASYNC/3. Setting this value too high can increase the likelihood of naming conflicts among entities and relationships across different documents during the merge phase, thereby reducing its overall efficiency.
3. Within a single file, entity and relationship extractions from different text blocks are processed concurrently, with the degree of concurrency set by MAX_ASYNC. Only after MAX_ASYNC text blocks are processed will the system proceed to the next batch within the same file.
4. When a file completes entity and relationship extraction, it enters the entity and relationship merging stage. This stage also processes multiple entities and relationships concurrently, with the concurrency level also controlled by `MAX_ASYNC`.
5. LLM requests for the merging stage are prioritized over the extraction stage to ensure that files in the merging phase are processed quickly and their results are promptly updated in the vector database.