### Description
This PR introduces two new environment variables, `DOC_BULK_SIZE` and
`EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for
document parsing and embedding vectorization in RAGFlow. By making these
parameters configurable, users can optimize performance and resource
usage according to their hardware capabilities and workload
requirements.
### What problem does this PR solve?
Previously, the batch sizes for document parsing and embedding were
hardcoded, limiting the ability to adjust throughput and memory
consumption. This PR enables users to set these values via environment
variables (in `.env`, Helm chart, or directly in the deployment
environment), improving flexibility and scalability for both small and
large deployments.
- `DOC_BULK_SIZE`: Controls how many document chunks are processed in a
single batch during document parsing (default: 4).
- `EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed
in a single batch during embedding vectorization (default: 16).
This change updates the codebase, documentation, and configuration files
to reflect the new options.
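For illustration, a minimal sketch of how such variables are typically consumed, using the defaults named in this PR (where exactly RAGFlow reads them may differ):

```python
import os

# Read the tunables from the environment, falling back to the documented defaults.
DOC_BULK_SIZE = int(os.environ.get("DOC_BULK_SIZE", 4))                 # chunks per parsing batch
EMBEDDING_BATCH_SIZE = int(os.environ.get("EMBEDDING_BATCH_SIZE", 16))  # chunks per embedding call
```

Setting, for example, `DOC_BULK_SIZE=8` and `EMBEDDING_BATCH_SIZE=32` in `.env` (or the Helm values) trades higher memory usage for better throughput.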
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
### Additional context
- Updated `.env`, `helm/values.yaml`, and documentation to describe
the new variables.
- Modified relevant code paths to use the environment variables instead
of hardcoded values.
- Users can now tune these parameters to achieve better throughput or
reduce memory usage as needed.
Before:
Default value:
<img width="643" alt="image"
src="https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a"
/>
After:
10x:
<img width="777" alt="image"
src="https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1"
/>
### What problem does this PR solve?
This PR fixes two issues in the OpenDAL storage connector:
1. The `health` method was missing, which prevented health checks on
the storage backend.
2. The initialization of the `opendal.Operator` object included a
redundant scheme parameter, causing unnecessary duplication and
potential confusion.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Background
- The absence of a `health` method made it difficult to verify the
availability and reliability of the storage service.
- Initializing `opendal.Operator` with both `self._scheme` and
unpacked `**self._kwargs` could lead to errors or unexpected behavior
if the scheme was already included in the kwargs.
### What is changed and how it works?
- Adds a `health` method that writes a test file to verify storage
availability.
- Removes the duplicate scheme parameter from the `opendal.Operator`
initialization to ensure clarity and prevent conflicts.
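Roughly, the two changes together might look like the sketch below; the class, method, and probe names are illustrative, not necessarily the exact code in the connector:

```python
import opendal


class OpenDALStorage:
    def __init__(self, scheme: str, **kwargs):
        self._scheme = scheme
        self._kwargs = kwargs
        # Pass the scheme once as the first argument; it must not also sit in kwargs.
        self._operator = opendal.Operator(self._scheme, **self._kwargs)

    def health(self):
        # Write and remove a small probe object to confirm the backend is reachable.
        probe_path, probe_data = "health_check_probe", b"ok"  # hypothetical probe names
        self._operator.write(probe_path, probe_data)
        self._operator.delete(probe_path)
        return True
```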
before:
<img width="762" alt="企业微信截图_46be646f-2e99-4e5e-be67-b1483426e77c"
src="https://github.com/user-attachments/assets/acecbb8c-4810-457f-8342-6355148551ba"
/>
<img width="767" alt="image"
src="https://github.com/user-attachments/assets/147cd5a2-dde3-466b-a9c1-d1d4f0819e5d"
/>
after:
<img width="1123" alt="企业微信截图_09d62997-8908-4985-b89f-7a78b5da55ac"
src="https://github.com/user-attachments/assets/97dc88c9-0f4e-4d77-88b3-cd818e8da046"
/>
### What problem does this PR solve?
Get rid of the error 'RedisDB.get_unacked_iterator queue rag_flow_svr_queue_1
doesn't exist'.
----
Edit: revert to original message collection logic.
### Type of change
- [x] Refactoring
---------
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
This PR resolves the inconsistency in the opendal configuration where
both `schema` and `scheme` were used as keys. The code and
configuration file now consistently use `scheme`, which helps prevent
configuration errors and runtime issues. This change improves code
clarity and maintainability.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Additional context
- Updated both `conf/service_conf.yaml` and
`rag/utils/opendal_conn.py` to use `scheme` instead of `schema`
- No breaking changes to other configuration fields
### What problem does this PR solve?
Fix the restriction that forces `similarity_threshold=0` and `page_size=30`
when `doc_ids` is not empty.
#8228
---------
Co-authored-by: shiqing.wusq <shiqing.wusq@dtzhejiang.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
This PR investigates the cause of #7957.
TL;DR: Incorrect similarity calculations lead to too many candidates.
Since candidate selection involves interaction with the LLM, this causes
significant delays in the program.
What this PR does:
1. **Fix similarity calculation**:
When processing a 64-page government document, the corrected similarity
calculation reduces the number of candidates from over 100,000 to around
16,000. With a default batch size of 100 pairs per LLM call, this fix
reduces unnecessary LLM interactions from over 1,000 calls to around
160, a roughly 10x improvement.
2. **Add concurrency and timeout limits**:
Up to 5 entity types are processed in "parallel", each with a 180-second
timeout (see the sketch after this list). These limits may be configurable in future updates.
3. **Improve logging**:
The candidate resolution process now reports progress in real time.
4. **Mitigate potential concurrency risks**
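A minimal sketch of the concurrency pattern from item 2, using the trio primitives the project already relies on (the worker callable and entity-type handling are placeholders, not the exact code):

```python
import trio

MAX_PARALLEL_TYPES = 5       # process up to 5 entity types at once
TYPE_TIMEOUT_SECONDS = 180   # per-entity-type timeout

async def resolve_all(entity_types, resolve_one):
    limiter = trio.CapacityLimiter(MAX_PARALLEL_TYPES)

    async def worker(entity_type):
        async with limiter:
            # Give up on this entity type after the timeout instead of
            # stalling the whole resolution pass.
            with trio.move_on_after(TYPE_TIMEOUT_SECONDS):
                await resolve_one(entity_type)

    async with trio.open_nursery() as nursery:
        for entity_type in entity_types:
            nursery.start_soon(worker, entity_type)
```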
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
### What problem does this PR solve?
#8074
OSS support via OpenDAL (including MySQL).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix auto-keyword and auto-question failures with the qwq model. #8189
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Improve robustness of Jina, Nvidia, and SILICONFLOW embedding models by:
1. Adding try-catch blocks for JSON decode errors
2. Logging error details including response content
3. Raising exceptions with meaningful error messages
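The pattern applied to each provider looks roughly like this, assuming a requests-style response object (a sketch, not the exact per-model code):

```python
import json
import logging

def parse_embedding_response(response):
    """Decode a provider response, surfacing a meaningful error when it is not valid JSON."""
    try:
        return response.json()
    except json.JSONDecodeError as e:
        # Log the raw body so truncated or HTML error responses can be diagnosed.
        logging.error("Embedding response is not valid JSON: %s; content: %s", e, response.text)
        raise Exception(f"Embedding provider returned invalid JSON: {response.text[:200]}") from e
```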
### Type of change
- [x] Refactoring
### What problem does this PR solve?
This PR aims to solve #8120, which requests a better error display for
duplicate column names.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Currently, as long as there are tasks in Redis, this loop keeps pulling
them. This leaves a single task executor with many tasks in the pending
state, and we then have to wait for those pending tasks before they can go
back into the queue.
If we set `MAX_CONCURRENT_TASKS` to X, then only X tasks should be picked
from the queue in the first place; the rest should be left in the queue for
other `task_executors`, or picked up once one of the current executor's
slots becomes free. This PR ensures this behavior.
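A minimal sketch of the intended behavior using a trio `CapacityLimiter` (the Redis fetch helper and task handler below are placeholders):

```python
import trio

MAX_CONCURRENT_TASKS = 5
task_limiter = trio.CapacityLimiter(MAX_CONCURRENT_TASKS)

async def run_task(task, token, handle_task):
    try:
        await handle_task(task)
    finally:
        task_limiter.release_on_behalf_of(token)

async def collect_loop(nursery, fetch_one_task, handle_task):
    while True:
        token = object()  # borrower token pairing this acquire with its release
        # Wait for a free slot *before* reading from Redis, so surplus tasks stay
        # queued for other task executors instead of piling up as pending here.
        await task_limiter.acquire_on_behalf_of(token)
        task = await fetch_one_task()
        if task is None:
            task_limiter.release_on_behalf_of(token)
            await trio.sleep(1)
            continue
        nursery.start_soon(run_task, task, token, handle_task)
```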
The additional changes were caused by the Ruff linting in pre-commit, but I
believe they are expected in order to keep the coding style consistent.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
## Summary
Fixed grammar errors and improved clarity in prompt templates throughout
`rag/prompts.py`.
## Changes Made
- **Fixed incomplete sentence**: `"If the user's latest question is
completely, don't do anything"` → `"If the user's latest question is
already complete, don't do anything"`
- **Improved phrasing**: `"of like [ID:i]"` → `"such as [ID:i]"`
- **Added missing articles**: `"give top 3"` → `"give the top 3"`
- **Fixed prepositions**: `"in language of"` → `"in the same language
as"`
- **Corrected spelling**: `"Jappanese"` → `"Japanese"`
- **Standardized formatting**: Consistent role descriptions and
punctuation
## Impact
These changes improve prompt readability and should make instructions
clearer for the underlying language models.
## Test Plan
- [x] Verified changes maintain original prompt functionality
- [x] No breaking changes to prompt structure or expected outputs
Co-authored-by: Adrian Altermatt <adrian.altermatt@fgcz.uzh.ch>
### What problem does this PR solve?
Update the synonym dictionary file with relevant time and date entries to
prevent synonyms from being mistakenly escaped.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Fix unnecessary truncation in the markdown parser so that markdown works
perfectly, like
[this](https://github.com/infiniflow/ragflow/issues/7824#issuecomment-2921312576)
in #7824, supporting multiple special delimiters.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add advanced delimiter detection for naive merge. #7824
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The `OCR` class would fail when `PARALLEL_DEVICES = None`, because it passes
0 to the `TextDetector` and `TextRecognizer` init methods.
It would also be simpler to set 0 as the default value for
`PARALLEL_DEVICES`.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Change the citation mark to `[ID:n]`; it's easier for LLMs to follow the
instruction :) #7904
### Type of change
- [x] Refactoring
### What problem does this PR solve?
This PR solves the problems mentioned in
[PR #7140](https://github.com/infiniflow/ragflow/pull/7140), which was also
submitted by me.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### Introduction
I fixed the problems that occur when using OpenSearch as the DOC_ENGINE:
pytest failures and incorrect API returns, mainly around deleting chunks,
listing chunks, updating chunks, and retrieving chunks.
The pytest command `cd sdk/python && uv sync --python 3.10 --group test
--frozen && source .venv/bin/activate && cd test/test_http_api &&
DOC_ENGINE=opensearch pytest test_chunk_management_within_dataset -s
--tb=short` now finally succeeds.
### Others
Because Elasticsearch and OpenSearch differ in some behaviors, some of the
OpenSearch pytest results are correct and reasonable even where they differ.
However, some pytest params (skipif params) were incompatible, so I changed
the relevant skipif params.
As a search engine programmer, I will still focus on the usage of vector
databases (especially OpenSearch) for the RAG stuff.
Thanks for your review
### What problem does this PR solve?
Delete the corresponding MinIO bucket when deleting a knowledge base.
[issue #4113](https://github.com/infiniflow/ragflow/issues/4113)
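A minimal sketch of such a cleanup with the MinIO Python SDK, assuming one bucket per knowledge base (the bucket-naming convention and function name here are placeholders):

```python
from minio import Minio

def delete_kb_bucket(client: Minio, kb_id: str):
    """Remove all objects of a knowledge base, then drop its (now empty) bucket."""
    bucket = kb_id  # assumes the bucket is named after the knowledge base id
    if not client.bucket_exists(bucket):
        return
    # MinIO only deletes empty buckets, so remove every object first.
    for obj in client.list_objects(bucket, recursive=True):
        client.remove_object(bucket, obj.object_name)
    client.remove_bucket(bucket)
```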
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This small PR resolves the regex library warnings that show up on Python 3.11:
```python
DeprecationWarning: 'count' is passed as positional argument
```
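The warning comes from supplying `count` positionally to `sub`-style calls; passing it as a keyword silences it. Illustrated below with the standard `re` module (the same keyword applies to the third-party `regex` package; the actual call sites in this PR will differ):

```python
import re

text = "a-b-c"

# Deprecated style: count passed positionally (triggers the warning).
re.sub("-", "_", text, 1)

# Preferred style: pass count as a keyword argument.
re.sub("-", "_", text, count=1)
```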
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
### What problem does this PR solve?
This addresses https://github.com/infiniflow/ragflow/issues/7761, but it may
be difficult to achieve zero delay (that would require passing the cancel
token to all parts). Another solution is a zero-delay effect in the UI only,
with the task actually stopping a bit later.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Delete useless image blobs when the task executor meets edge cases.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
## Cause of the bug:
During execution, improper use of trio's CapacityLimiter made the
configuration parameter MAX_CONCURRENT_TASKS ineffective, causing the
executor to take a large number of tasks out of the Redis queue at once.
When a large number of tasks exist at the same time, this causes the task
executor to occupy too much memory and be killed by the OS, suspending all
executing tasks.
## Fix:
Added the task_manager method at the entry of /rag/svr/task_executor.py to
make the CapacityLimiter effective, and deleted the invalid async with
statement.
## Fix result:
After testing, the task executor behaves as expected: it executes at most
$MAX_CONCURRENT_TASKS tasks concurrently.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Hello, our use case requires the LLM agent to invoke some tools, so I made a
simple implementation here.
This PR does two things:
1. A simple plugin mechanism based on `pluginlib`:
This mechanism lives in the `plugin` directory. It will only load
plugins from `plugin/embedded_plugins` for now.
A sample plugin, `bad_calculator.py`, is placed in
`plugin/embedded_plugins/llm_tools`; it accepts two numbers `a` and `b` and
deliberately returns the wrong result `a + b + 100` (a rough sketch follows
this list).
In the future, it could load plugins from external locations with little
code change.
Plugins are divided into different types. The only plugin type supported
in this PR is `llm_tools`, whose plugins must implement the `LLMToolPlugin`
class defined in `plugin/llm_tool_plugin.py`.
More plugin types can be added in the future.
2. A tool selector in the `Generate` component:
Added a tool selector to select one or more tools for the LLM:

And with the `bad_calculator` tool, it produces this result with the
`qwen-max` model:

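A rough sketch of what such an embedded tool plugin could look like; the metadata/invoke interface below is an assumption for illustration, and the real contract is defined by `LLMToolPlugin` in `plugin/llm_tool_plugin.py`:

```python
from plugin.llm_tool_plugin import LLMToolPlugin  # base class introduced by this PR


class BadCalculatorPlugin(LLMToolPlugin):
    """Sample llm_tools plugin: adds two numbers, deliberately off by 100."""

    name = "bad_calculator"

    @classmethod
    def get_metadata(cls) -> dict:
        # Tool description handed to the LLM so it knows when and how to call the tool.
        # This schema is hypothetical, not the project's final API.
        return {
            "name": cls.name,
            "description": "Add two numbers together (intentionally wrong, for demo purposes).",
            "parameters": {"a": "number", "b": "number"},
        }

    def invoke(self, a: float, b: float) -> str:
        return str(a + b + 100)
```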
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Information about whether to apply graph resolution and community extraction
is stored in `task["kb_parser_config"]`. However, the previous code got
`graphrag_conf` from `task["parser_config"]`, making `with_resolution` and
`with_community` always false.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Since `import markdown.markdown` was changed to `import markdown` in
`rag/app/naive.py`, the previous code for converting markdown tables would
call the markdown module instead of a callable function. This causes an
error.
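For context, a minimal illustration of the difference (`markdown.markdown()` is the documented conversion function of the `markdown` package):

```python
import markdown

table_md = "| a | b |\n| --- | --- |\n| 1 | 2 |"

# Correct: call the markdown.markdown() function on the imported package.
html = markdown.markdown(table_md, extensions=["tables"])

# Broken pattern fixed by this PR: with only `import markdown` in scope, code that
# still calls the bare name `markdown(...)` invokes the module object itself,
# which is not callable and raises a TypeError.
```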
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):