167 Commits

Author SHA1 Message Date
Jin Hai
4eb7659499
Fix bug: broken import from rag.prompts.prompts (#10217)
### What problem does this PR solve?

Fix broken imports

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: jinhai <haijin.chn@gmail.com>
2025-09-23 10:19:25 +08:00
Yongteng Lei
45f52e85d7
Feat: refine dataflow and initialize dataflow app (#9952)
### What problem does this PR solve?

Refine dataflow and initialize dataflow app.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-09-05 18:50:46 +08:00
Stephen Hu
c461261f0b
Refactor: Improve the try logic for upload_to_minio (#9735)
### What problem does this PR solve?

Improve the try logic for upload_to_minio

### Type of change

- [x] Refactoring
2025-08-28 09:35:29 +08:00
Kevin Hu
8d8a5f73b6
Fix: meta data filter with AND logic operations. (#9687)
### What problem does this PR solve?

Close #9648

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-25 18:29:24 +08:00
Kevin Hu
ca720bd811
Fix: save team's canvas issue. (#9518)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-18 13:05:29 +08:00
Kevin Hu
2114e966d8
Feat: add citation option to agent and enlarge the timeouts. (#9484)
### What problem does this PR solve?

#9422

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-15 10:05:01 +08:00
Kevin Hu
153e430b00
Feat: add meta data filter. (#9405)
### What problem does this PR solve?

#8531 
#7417 
#6761 
#6573
#6477

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-08-12 14:12:56 +08:00
Jay Xu
ce3dd019c3
Fix broken data stream when writing image file (#9354)
### What problem does this PR solve?

fix "broken data stream when writing image file", just log warning and
ignore

Close #8379 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-11 17:07:49 +08:00
Kevin Hu
9ca86d801e
Refa: add provider info while adding model. (#9273)
### What problem does this PR solve?
#9248

### Type of change

- [x] Refactoring
2025-08-07 09:40:42 +08:00
gooodboyAo
a7eba61067
FIX: If chunk["content_with_weight"] contains one or more unpaired surrogate characters (such as incomplete emoji or other special characters), then calling .encode("utf-8") directly will raise a UnicodeEncodeError. (#9246)

### What problem does this PR solve?
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
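
For illustration, a minimal sketch of the failure mode and one common guard (an encode error handler); `safe_utf8` is a hypothetical helper, not the PR's actual code:

```python
def safe_utf8(text: str) -> bytes:
    # Lone surrogates (e.g. half of a split emoji) make a plain
    # text.encode("utf-8") raise UnicodeEncodeError; an error handler
    # such as "ignore" (or "replace") drops or substitutes them instead.
    return text.encode("utf-8", errors="ignore")

content = "ok \ud83d broken"        # contains an unpaired high surrogate
# content.encode("utf-8")           # -> UnicodeEncodeError: surrogates not allowed
print(safe_utf8(content))           # b'ok  broken'
```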
2025-08-06 10:36:50 +08:00
Kevin Hu
2124329e95
Fix: local variable issue. (#9255)
### What problem does this PR solve?

#9227

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-08-05 19:24:34 +08:00
Kevin Hu
935ce872d8
Refa: remove temperature since some LLMs fail to support it. (#8981)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-07-23 10:17:04 +08:00
Kevin Hu
c783d90ba3
Perf: set timeout for building chunks. (#8940)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2025-07-21 15:56:45 +08:00
Kevin Hu
ecdb1701df
Perf: test llm before RAPTOR. (#8897)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2025-07-17 16:48:50 +08:00
Kevin Hu
fbd115773b
Perf: set timeout of some steps in KG. (#8873)
### What problem does this PR solve?

### Type of change


- [x] Performance Improvement
2025-07-16 18:06:03 +08:00
Yongteng Lei
e9b14142a5
Fix: fixed invalid save() arguments for slide thumbnails (#8851)
### What problem does this PR solve?

Fixed invalid save() arguments for slide thumbnails.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-15 17:19:45 +08:00
Kevin Hu
aa4a725529
Perf: use redis to check if canceled. (#8853)
### What problem does this PR solve?

### Type of change

- [x] Performance Improvement
2025-07-15 17:19:27 +08:00
Kevin Hu
24c41d2a61
Perf: make do_cancel quicker. (#8846)
### What problem does this PR solve?

### Type of change

- [x] Performance Improvement
2025-07-15 14:35:00 +08:00
Kevin Hu
c642dbefca
Perf: Enhance timeout handling. (#8826)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2025-07-15 09:36:45 +08:00
Stephen Hu
d5f6335f99
Fix: The dataset created via API call failed to parse after uploading the file. (#8657)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8656

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-04 12:41:28 +08:00
Kevin Hu
e3edcc3064
Trivial fixes. (#8597)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-01 14:05:18 +08:00
Tuan Le
303c6dd1a8
Fix memory leaks in PIL image and BytesIO handling during chunk processing (#8522)
### What problem does this PR solve?
This PR addresses critical memory leaks in the task executor's image
processing pipeline. The current implementation fails to properly
dispose of PIL Image objects and BytesIO buffers during chunk
processing, leading to progressive memory accumulation that can cause
the task executor to consume excessive memory over time.

### Background context
- The `upload_to_minio` function processes images from document chunks
and converts them to JPEG format for storage.
- PIL Image objects hold significant memory resources that must be
explicitly closed to prevent memory leaks.
- BytesIO objects also consume memory and should be properly disposed of
after use.
- In high-throughput scenarios with many image-containing documents,
these memory leaks can lead to out-of-memory errors and degraded
performance.

### Specific issues fixed
- PIL Image objects were not being explicitly closed after processing.
- BytesIO buffers lacked proper cleanup in all code paths.
- Converted images (RGBA/P to RGB) were not disposing of the original
image object.
- Memory references to large image data were not being cleared promptly.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement


### Changes made
- Added explicit `d["image"].close()` calls after image processing
operations.
- Implemented proper cleanup of converted images when changing formats
from RGBA/P to RGB.
- Enhanced BytesIO cleanup with `try/finally` blocks to ensure disposal
in all code paths.
- Added explicit `del d["image"]` to clear memory references after
processing.

This fix ensures stable memory usage during long-running document
processing tasks and prevents potential out-of-memory conditions in
production environments.
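
For illustration, a hedged sketch of the cleanup pattern described above (explicit `close()`, `try/finally` around the buffer, disposing of the converted copy); `image_to_jpeg_bytes` is a hypothetical name, not the PR's actual function:

```python
from io import BytesIO
from PIL import Image

def image_to_jpeg_bytes(img: Image.Image) -> bytes:
    converted = None
    buf = BytesIO()
    try:
        if img.mode in ("RGBA", "P"):        # JPEG cannot store alpha/palette modes
            converted = img.convert("RGB")
            converted.save(buf, format="JPEG")
        else:
            img.save(buf, format="JPEG")
        return buf.getvalue()
    finally:
        buf.close()                          # dispose of the buffer on every code path
        if converted is not None:
            converted.close()                # drop the intermediate RGB copy
        img.close()                          # release the original image's resources

# caller side: after uploading, clear the reference so the blob can be collected,
# e.g. data = image_to_jpeg_bytes(d["image"]); upload(...); del d["image"]
```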
2025-06-27 10:23:21 +08:00
Stephen Hu
be712714af
Refactor: improve the cancel-check logic (#8524)
### What problem does this PR solve?

Improve the cancel-check logic.

### Type of change

- [x] Refactoring

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-27 10:22:53 +08:00
Stephen Hu
8d9d2cc0a9
Fix: some cases where a task returns without setting progress (#8469)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8466
I went through the code; the current logic is: when do_handle_task raises an
exception, handle_task sets the progress, but in some cases do_handle_task
simply returns without setting the right progress. In those cases the Redis
stream message is acked even though the task is still running.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
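
For illustration, a minimal sketch of this failure mode; `handle_task` and `do_handle_task` are names from the description above, while `set_progress` and the stub bodies are assumptions for the example:

```python
import trio

def set_progress(task_id, prog, msg=""):          # hypothetical helper for illustration
    print(f"task {task_id}: progress={prog} {msg}")

async def do_handle_task(task):
    if not task.get("chunks"):
        return                                    # early return: nobody sets progress here
    ...

async def handle_task(task):
    try:
        await do_handle_task(task)
    except Exception as e:
        set_progress(task["id"], prog=-1, msg=f"[Exception]: {e}")
    # An early return inside do_handle_task skips the except branch above,
    # so each bail-out point needs its own set_progress(...) call before
    # the Redis message is acked.

trio.run(handle_task, {"id": "t1", "chunks": []})
```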

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-25 09:58:55 +08:00
WuWeiFlow
bc1b837616
FIX: Saving an RGBA image directly as JPEG will cause an error. If the… (#8399)
Saving an RGBA image directly as JPEG will cause an error. If the image
is in RGBA mode, convert it to RGB mode before saving it in JPG format.

### What problem does this PR solve?

During document parsing in the knowledge base, we occasionally encounter
the error 'cannot write mode RGBA as JPEG.' This occurs because images
in RGBA mode cannot be directly saved as JPEG. They must be converted
first before saving.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-24 18:01:13 +08:00
Stephen Hu
794a4102c2
Fix: Document parsing via API has a lot of problems (#8407)
### What problem does this PR solve?
#8391
#8404

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-23 13:08:11 +08:00
Kevin Hu
8f3fe63d73
Fix: duplicated task (#8358)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-19 11:12:29 +08:00
Jin Hai
4a2ff633e0
Fix typo in code (#8327)
### What problem does this PR solve?

Fix typo in code

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-06-18 09:41:09 +08:00
cutiechi
8f9bcb1c74
Feat: make document parsing and embedding batch sizes configurable via environment variables (#8266)
### Description

This PR introduces two new environment variables, `DOC_BULK_SIZE` and
`EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for
document parsing and embedding vectorization in RAGFlow. By making these
parameters configurable, users can optimize performance and resource
usage according to their hardware capabilities and workload
requirements.

### What problem does this PR solve?

Previously, the batch sizes for document parsing and embedding were
hardcoded, limiting the ability to adjust throughput and memory
consumption. This PR enables users to set these values via environment
variables (in `.env`, Helm chart, or directly in the deployment
environment), improving flexibility and scalability for both small and
large deployments.

- `DOC_BULK_SIZE`: Controls how many document chunks are processed in a
single batch during document parsing (default: 4).
- `EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed
in a single batch during embedding vectorization (default: 16).

This change updates the codebase, documentation, and configuration files
to reflect the new options.
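
For illustration, a small sketch of reading these env-driven defaults; the variable names and defaults come from this PR, while the `batched` helper and the embedding loop are assumptions:

```python
import os

DOC_BULK_SIZE = int(os.environ.get("DOC_BULK_SIZE", "4"))                 # chunks per parsing batch
EMBEDDING_BATCH_SIZE = int(os.environ.get("EMBEDDING_BATCH_SIZE", "16"))  # chunks per embedding call

def batched(items, size):
    """Yield fixed-size slices so callers can loop in configurable batches."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# e.g. embed in batches of EMBEDDING_BATCH_SIZE instead of a hardcoded constant:
# for batch in batched(chunks, EMBEDDING_BATCH_SIZE):
#     vectors.extend(embedding_model.encode(batch))
```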

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):

### Additional context
- Updated `.env`, `helm/values.yaml`, and documentation to describe
the new variables.
- Modified relevant code paths to use the environment variables instead
of hardcoded values.
- Users can now tune these parameters to achieve better throughput or
reduce memory usage as needed.

Before (default value): screenshot at
https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a
After (10x): screenshot at
https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1
2025-06-16 13:40:47 +08:00
Yongteng Lei
8f9e7a6f6f
Refa: revert to original task message collection logic (#8251)
### What problem does this PR solve?

Get rid of 'RedisDB.get_unacked_iterator queue rag_flow_svr_queue_1
doesn't exist'

----

Edit: revert to original message collection logic.

### Type of change

- [x] Refactoring

---------

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-13 16:38:53 +08:00
Yongteng Lei
24ca4cc6b7
Refa: GraphRAG, and explain GraphRAG stalling behavior on large files (#8223)
### What problem does this PR solve?

This PR investigates the cause of #7957.

TL;DR: Incorrect similarity calculations lead to too many candidates.
Since candidate selection involves interaction with the LLM, this causes
significant delays in the program.

What this PR does:

1. **Fix similarity calculation**:
When processing a 64 pages government document, the corrected similarity
calculation reduces the number of candidates from over 100,000 to around
16,000. With a default batch size of 100 pairs per LLM call, this fix
reduces unnecessary LLM interactions from over 1,000 calls to around
160, a roughly 10x improvement.
2. **Add concurrency and timeout limits**: 
Up to 5 entity types are processed in "parallel", each with a 180-second
timeout. These limits may be configurable in future updates.
3. **Improve logging**:
The candidate resolution process now reports progress in real time.
4. **Mitigate potential concurrency risks**


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
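
For illustration, a hedged sketch of the limits described in point 2, using trio; `resolve_candidates` and the structure around it are assumptions, not the PR's actual code:

```python
import trio

limiter = trio.CapacityLimiter(5)                  # up to 5 entity types "in parallel"

async def resolve_candidates(entity_type, pairs):  # placeholder for the real LLM-driven resolution
    await trio.sleep(0.1)

async def resolve_entity_type(entity_type, pairs):
    async with limiter:
        with trio.move_on_after(180):              # give up on one entity type after 180 s
            await resolve_candidates(entity_type, pairs)

async def resolve_all(candidates_by_type):
    async with trio.open_nursery() as nursery:
        for etype, pairs in candidates_by_type.items():
            nursery.start_soon(resolve_entity_type, etype, pairs)

# trio.run(resolve_all, {"PERSON": [], "ORG": []})
```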
2025-06-12 19:09:50 +08:00
Wanderson Pinto dos Santos
0e03542db5
fix: single task executor getting all tasks from Redis queue (#7330)
### What problem does this PR solve?

Currently, as long as there are tasks in Redis, this loop keeps pulling them.
This leads to a single task executor holding many tasks in the pending state,
and we then have to wait for those pending tasks before they can be put back
in the queue.

In the first place, if we set `MAX_CONCURRENT_TASKS` to X, then only X
tasks should be picked from the queue; the others should be left in the
queue for other `task_executors`, or be picked up once one of the spots in
the current executor frees up. This PR ensures this behavior.

The additional changes are due to the Ruff linting in pre-commit, but I
believe they are expected in order to keep the coding style consistent.

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2025-06-06 14:32:35 +08:00
Stephen Hu
273f36cc54
Perf: reduce upload to minio limiter scope (#7878)
### What problem does this PR solve?
Reduce the upload_to_minio limiter scope.

### Type of change
- [x] Performance Improvement
2025-05-27 17:49:37 +08:00
Kevin Hu
28cb4df127
Fix: raptor overloading (#7889)
### What problem does this PR solve?

#7840

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-27 17:41:35 +08:00
Kevin Hu
959793e83c
Fix: task limiter issue. (#7873)
### What problem does this PR solve?

#7869

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-27 11:16:29 +08:00
Kevin Hu
be83074131
Fix: restore task limiter. (#7844)
### What problem does this PR solve?

Close #7828

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-26 10:59:01 +08:00
Stephen Hu
ce816edb5f
Fix: improve task cancel lag (#7765)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/7761

but it may be difficult to achieve zero delay (which would require passing
the cancel token to all parts).

Another option is to make the cancellation take effect immediately in the UI,
with the task itself stopping later.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-22 09:28:08 +08:00
Stephen Hu
e3e7c7ddaa
Feat: delete useless image blobs when the task executor meets edge cases (#7727)
### What problem does this PR solve?

delete useless image blobs when the task executor meets edge cases

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-21 10:22:30 +08:00
S0b3Rr
5d21cc3660
fix: Fix the problem where the concurrent execution limit in the task executor fails and causes OOM (issue #7580) (#7700)
### What problem does this PR solve?

## Cause of the bug:
During execution, improper use of trio's CapacityLimiter means the
configuration parameter MAX_CONCURRENT_TASKS has no effect, causing the
executor to take a large number of tasks off the Redis queue at once.

When many tasks exist at the same time, this behavior causes the task
executor to occupy too much memory and be killed by the OS. As a result,
all executing tasks are suspended.

## Fix:
Added a task_manager method to the entry point of /rag/svr/task_executor.py
to make the CapacityLimiter effective, and deleted the ineffective async with
statement.

## Fix result:
After testing, the task executor behaves as expected: it executes at most
$MAX_CONCURRENT_TASKS tasks concurrently.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
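
For illustration, a hedged sketch of the general pattern this fix describes (acquire a limiter slot before pulling work, so at most MAX_CONCURRENT_TASKS tasks leave the queue at once); `handle_task` here is a placeholder and this is not the PR's exact code:

```python
import itertools
import trio

MAX_CONCURRENT_TASKS = 5
task_limiter = trio.CapacityLimiter(MAX_CONCURRENT_TASKS)

async def handle_task():
    """Placeholder: collect one task from the Redis queue and process it."""
    await trio.sleep(1)

async def task_manager(borrower):
    try:
        await handle_task()
    finally:
        task_limiter.release_on_behalf_of(borrower)   # free the slot when the task ends

async def main():
    async with trio.open_nursery() as nursery:
        for borrower in itertools.count():
            # Block until a slot frees up, so at most MAX_CONCURRENT_TASKS
            # tasks are ever taken off the queue at the same time.
            await task_limiter.acquire_on_behalf_of(borrower)
            nursery.start_soon(task_manager, borrower)

# trio.run(main)
```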
2025-05-19 10:25:56 +08:00
alkscr
4ae8f87754
Fix: missing graph resolution and community extraction in graphrag tasks (#7586)
### What problem does this PR solve?

Whether to apply graph resolution and community extraction is stored in
`task["kb_parser_config"]`. However, the previous code got `graphrag_conf`
from `task["parser_config"]`, so `with_resolution` and `with_community` were
always false.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-13 09:21:03 +08:00
hfrt456
332e6ffbd4
Fix: local_es_tag (#7534)
Two cases where a local ES tag search returns results that are then filtered
out by score:
1. The document has an empty tag and the LLM is not visited.
2. The code may use empty examples in the prompt for the LLM tag search.

Co-authored-by: huangfuqunze <huangfuqunze.hfqz@alibaba-inc.com>
2025-05-09 10:17:24 +08:00
liuzhenghua
2f768b96e8
perf: optimize figure parser (#7392)
### What problem does this PR solve?

When parsing documents containing images, the current code uses a
single-threaded approach to call the VL model, resulting in extremely
slow parsing speed (e.g., parsing a Word document with dozens of images
takes over 20 minutes).

By switching to a multithreaded approach to call the VL model, the
parsing speed can be improved to an acceptable level.

### Type of change

- [x] Performance Improvement
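
For illustration, a small sketch of the multithreaded approach described above; `describe_images` and `vision_model.describe` are assumed names, not RAGFlow's actual API:

```python
from concurrent.futures import ThreadPoolExecutor

def describe_images(vision_model, images, max_workers=8):
    def describe_one(img):
        return vision_model.describe(img)      # one blocking VL-model call per image

    # Threads overlap the network-bound VL calls; map() keeps the results in
    # the same order as the input figures.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(describe_one, images))
```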

---------

Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
2025-05-06 14:39:45 +08:00
Stephen Hu
c88e4b3fc0
Fix: improve recover_pending_tasks timeout (#7408)
### What problem does this PR solve?

Fix the issue where the Redis lock would always time out (change the order
of the logic to release the lock first).

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-29 16:50:39 +08:00
benni82
216cd7474b
fix: task_executor bug fix (#7253)
### What problem does this PR solve?

The lock is not released correctly when the task_executor is in an abnormal state.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-04-24 11:44:34 +08:00
liuzhenghua
d4dbdfb61d
feat: Recover pending tasks when a pod restarts. (#7073)
### What problem does this PR solve?

If you deploy Ragflow using Kubernetes, the hostname will change during
a rolling update. This causes the consumer name of the task executor to
change, making it impossible to schedule tasks that were previously in a
pending state.
To address this, I introduced a recovery task that scans these pending
messages and re-publishes them, allowing the tasks to continue being
processed.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: liuzhenghua-jk <liuzhenghua-jk@360shuke.com>
2025-04-19 16:18:51 +08:00
aniaan
8b8a2f2949
fix(nursery): Fix Closure Trap Issues in Trio Concurrent Tasks (#7106)
## Problem Description
Multiple files in the RAGFlow project contain closure trap issues when
using lambda functions with `trio.open_nursery()`. This problem causes
concurrent tasks created in loops to reference the same variable,
resulting in all tasks processing the same data (the data from the last
iteration) rather than each task processing its corresponding data from
the loop.

## Issue Details
When using a `lambda` to create a closure function and passing it to
`nursery.start_soon()` within a loop, the lambda function captures a
reference to the loop variable rather than its value. For example:

```python
# Problematic code
async with trio.open_nursery() as nursery:
    for d in docs:
        nursery.start_soon(lambda: doc_keyword_extraction(chat_mdl, d, topn))
```

In this pattern, when concurrent tasks begin execution, `d` has already
become the value after the loop ends (typically the last element),
causing all tasks to use the same data.

## Fix Solution
Changed the way concurrent tasks are created with `nursery.start_soon()`
by leveraging Trio's API design to directly pass the function and its
arguments separately:

```python
# Fixed code
async with trio.open_nursery() as nursery:
    for d in docs:
        nursery.start_soon(doc_keyword_extraction, chat_mdl, d, topn)
```

This way, each task uses the parameter values at the time of the
function call, rather than references captured through closures.

## Fixed Files
Fixed closure traps in the following files:

1. `rag/svr/task_executor.py`: 3 fixes, involving document keyword
extraction, question generation, and tag processing
2. `rag/raptor.py`: 1 fix, involving document summarization
3. `graphrag/utils.py`: 2 fixes, involving graph node and edge
processing
4. `graphrag/entity_resolution.py`: 2 fixes, involving entity resolution
and graph node merging
5. `graphrag/general/mind_map_extractor.py`: 2 fixes, involving document
processing
6. `graphrag/general/extractor.py`: 3 fixes, involving content
processing and graph node/edge merging
7. `graphrag/general/community_reports_extractor.py`: 1 fix, involving
community report extraction

## Potential Impact
This fix resolves a serious concurrency issue that could have caused:
- Data processing errors (processing duplicate data)
- Performance degradation (all tasks working on the same data)
- Inconsistent results (some data not being processed)

After the fix, all concurrent tasks should correctly process their
respective data, improving system correctness and reliability.
2025-04-18 18:00:20 +08:00
Zhichang Yu
6bf26e2a81
Optimize graphrag again (#6513)
### What problem does this PR solve?

Removed set_entity and set_relation to avoid accessing doc engine during
graph computation.
Introduced GraphChange to avoid writing unchanged chunks.

### Type of change

- [x] Performance Improvement
2025-03-26 15:34:42 +08:00
Kevin Hu
d83911b632
Fix: huggingface rerank model issue. (#6385)
### What problem does this PR solve?

#6348

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 12:43:32 +08:00
Zhichang Yu
bb869aca33
Fix get_unacked_iterator (#6280)
### What problem does this PR solve?

Fix get_unacked_iterator. Close #6132 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 17:46:58 +08:00
zhou
9cad60fa6d
Fix: Add a basic example when the examples for content_tagging are empty (#6276)
### What problem does this PR solve?

When using an LLM for auto-tagging, if there are no examples, the tag format
generated by the LLM may be wrong, which causes Elasticsearch insert errors.
Adding a basic example avoids this problem.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 17:30:47 +08:00