ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2025-06-26 22:19:57 +00:00

Author	SHA1	Message	Date
cutiechi	6aa0b0819d	Fix: unify opendal config key from `schema` to `scheme` (#8232 ) ### What problem does this PR solve? This PR resolves the inconsistency in the opendal configuration where both `schema` and `scheme` were used as keys. The code and configuration file now consistently use `scheme`, which helps prevent configuration errors and runtime issues. This change improves code clarity and maintainability. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) ### Additional context - Updated both `conf/service_conf.yaml` and `rag/utils/opendal_conn.py` to use `scheme` instead of `schema` - No breaking changes to other configuration fields	2025-06-13 14:56:51 +08:00
Wesley	3d0b440e9f	fix(search.py):remove hard page_size (#8242 ) ### What problem does this PR solve? Fix the restriction of forcing similarity_threshold=0 and page_size=30 when doc_ids is not empty #8228 --------- Co-authored-by: shiqing.wusq <shiqing.wusq@dtzhejiang.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-13 14:56:25 +08:00
Kenny	800e263f64	Fix: Update customer_service.json (#8238 ) ### What problem does this PR solve? Resolves the "Can't inference the where the component input is. Please identify whose output is this component's input" error that was reported when creating an Agent using the Customer service template. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-13 14:31:36 +08:00
Stephen Hu	ce65ea1fc1	Fix: Change allocate_container_blocking Calculate Time by async time (#8206 ) ### What problem does this PR solve? Change allocate_container_blocking Calculate Time by async time ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-13 14:05:11 +08:00
writinwaters	2341939376	Docs: Miscellaneous editorial updates (#8237 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-13 09:46:24 +08:00
balibabu	a9d9215547	Feat: Connect conditional operators to other operators #3221 (#8231 ) ### What problem does this PR solve? Feat: Connect conditional operators to other operators #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-13 09:30:34 +08:00
Liu An	99725444f1	Fix: desc parameter parsing (#8229 ) ### What problem does this PR solve? - Fix boolean parsing for 'desc' parameter in kb_app.py to properly handle string values ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 19:17:47 +08:00
Stephen Hu	1ab0f52832	Fix: The OpenAI-Compatible Agent API returns an incorrect message (#8177 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8175 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 19:17:15 +08:00
Yongteng Lei	24ca4cc6b7	Refa: GraphRAG and explaining GraphRAG stalling behavior on large files (#8223 ) ### What problem does this PR solve? This PR investigates the cause of #7957. TL;DR: Incorrect similarity calculations lead to too many candidates. Since candidate selection involves interaction with the LLM, this causes significant delays in the program. What this PR does: 1. Fix similarity calculation: When processing a 64-page government document, the corrected similarity calculation reduces the number of candidates from over 100,000 to around 16,000. With a default batch size of 100 pairs per LLM call, this fix reduces unnecessary LLM interactions from over 1,000 calls to around 160, a roughly 6x improvement. 2. Add concurrency and timeout limits: Up to 5 entity types are processed in "parallel", each with a 180-second timeout. These limits may be configurable in future updates. 3. Improve logging: The candidate resolution process now reports progress in real time. 4. Mitigate potential concurrency risks ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2025-06-12 19:09:50 +08:00
Kevin Hu	d36c8d18b1	Refa: make exception more clear. (#8224 ) ### What problem does this PR solve? #8156 ### Type of change - [x] Refactoring	2025-06-12 17:53:59 +08:00
Liu An	86a1411b07	Refa: Test configs (#8220 ) ### What problem does this PR solve? - Move common constants (HOST_ADDRESS, INVALID_API_TOKEN, etc.) to configs.py - Update test imports to use centralized configs - Clean up duplicate constant definitions across test files This improves maintainability by centralizing configuration. ### Type of change - [x] Refactoring test case	2025-06-12 17:42:00 +08:00
Liu An	54a465f9e8	Test: fix chunk deletion test assertions (#8222 ) ### What problem does this PR solve? - Fix test assertions in test_delete_chunks.py to expect empty results after deletion Action 7619 ### Type of change - [x] Bug Fix test cases	2025-06-12 17:41:46 +08:00
balibabu	bf7f7c7027	Feat: Display the connection lines between multiple conditions of the conditional operator #3221 (#8218 ) ### What problem does this PR solve? Feat: Display the connection lines between multiple conditions of the conditional operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-12 17:11:24 +08:00
Liu An	7fbbc9650d	Fix: Move pagerank field from create to update dataset API (#8217 ) ### What problem does this PR solve? - Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq - Add pagerank update logic in dataset update endpoint - Update API documentation to reflect changes - Modify related test cases and SDK references #8208 This change makes pagerank a mutable property that can only be set after dataset creation, and only when using elasticsearch as the doc engine. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 15:47:49 +08:00
Liu An	d0c5ff04a6	Fix: Add pagerank validation for non-elasticsearch doc engines (#8215 ) ### What problem does this PR solve? Validate that pagerank updates are only allowed when using elasticsearch as the document engine. Return an error if pagerank is set while using a different doc engine, preventing potential inconsistencies in document scoring. #8208 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 15:47:22 +08:00
Kevin Hu	d5236b71f4	Refa: ollama keep alive issue. (#8216 ) ### What problem does this PR solve? #8122 ### Type of change - [x] Refactoring	2025-06-12 15:09:40 +08:00
Stephen Hu	e7c85e569b	Fix: Improve TS Warning For http_api_reference.md (#8172 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8157 The current master code should work fine, but I have some warnings, so I added a declare to improve the warning ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 14:20:15 +08:00
balibabu	84b4e32c34	Feat: The value selected in the Select component only displays the icon #3221 (#8209 ) ### What problem does this PR solve? Feat: The value selected in the Select component only displays the icon #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-12 12:31:57 +08:00
Kevin Hu	56ee69e9d9	Refa: chat with tools. (#8210 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-06-12 12:31:10 +08:00
africa-worker	44287fb05f	Oss support opendal(including mysql) (#8204 ) ### What problem does this PR solve? #8074 Oss support opendal(including mysql) ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-12 11:37:42 +08:00
Liu An	cef587abc2	Fix: Add validation for dataset name in KB update API (#8194 ) ### What problem does this PR solve? Validate dataset name in knowledge base update endpoint to ensure: - Name is a non-empty string - Name length doesn't exceed DATASET_NAME_LIMIT - Whitespace is trimmed before processing Prevents invalid dataset names from being saved and provides clear error messages. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 11:37:25 +08:00
Yongteng Lei	1a5f991d86	Fix: auto-keyword and auto-question fail with qwq model (#8190 ) ### What problem does this PR solve? Fix auto-keyword and auto-question fail with qwq model. #8189 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 11:37:07 +08:00
balibabu	713b574c9d	Feat: Add SwitchForm component #3221 (#8200 ) ### What problem does this PR solve? Feat: Add SwitchForm component #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-12 09:50:25 +08:00
Liu An	60c1bf5a19	Fix: duplicate knowledgebase name validation logic (#8199 ) ### What problem does this PR solve? Change the condition from checking for >1 to >=1 when validating duplicate knowledgebase names to properly catch all duplicates. This ensures no two knowledgebases can have the same name for a tenant. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 09:46:57 +08:00
writinwaters	d331866a12	Docs: Miscellaneous (#8198 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-12 09:42:07 +08:00
Kevin Hu	69e1fc496d	Refa: chat models (#8187 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-06-11 17:20:12 +08:00
Liu An	e87ad8126c	Fix: Improve dataset name validation in KB app (#8188 ) ### What problem does this PR solve? - Trim whitespace before checking for empty dataset names - Change length check from >= to > DATASET_NAME_LIMIT for consistency ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-11 16:14:29 +08:00
Yongteng Lei	5e30426916	Feat: add Qwen3-Embedding text-embedding-v4 (#8184 ) ### What problem does this PR solve? Add Qwen3-Embedding text-embedding-v4. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-11 15:32:05 +08:00
Liu An	6aff3e052a	Test: Refactor test fixtures to use HttpApiAuth naming consistently (#8180 ) ### What problem does this PR solve? - Rename `api_key` fixture to `HttpApiAuth` across all test files - Update all dependent fixtures and test cases to use new naming - Maintain same functionality while improving naming clarity The rename better reflects the fixture's purpose as an HTTP API authentication helper rather than just an API key. ### Type of change - [x] Refactoring	2025-06-11 14:25:40 +08:00
Liu An	f29d9fa3f9	Test: fix test cases and improve document parsing validation (#8179 ) ### What problem does this PR solve? - Update chat assistant tests to use dataset.id directly in payloads - Enhance document parsing tests with better condition checking - Add explicit type hints and improve timeout handling Action_7556 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-11 14:25:30 +08:00
balibabu	31003cd5f6	Feat: Display the agent node running timeline #3221 (#8185 ) ### What problem does this PR solve? Feat: Display the agent node running timeline #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-11 14:24:43 +08:00
balibabu	f0a3d91171	Feat: Display agent operator call log #3221 (#8169 ) ### What problem does this PR solve? Feat: Display agent operator call log #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-11 09:22:07 +08:00
cwr31	e6d36f3a3a	Improve image rotation logic for text recognition (#8167 ) ### What problem does this PR solve? Enhanced the image rotation handling by evaluating the original orientation, clockwise 90°, and counter-clockwise 90° rotations. The image with the highest text recognition score is now selected, improving accuracy for text detection in images with aspect ratios >= 1.5. #8166 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: wenrui.cao <wenrui.cao@univers.com>	2025-06-11 09:20:30 +08:00
writinwaters	c8269206d7	Docs: UI updates (#8170 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-11 09:17:30 +08:00
Gifford Nowland	ab67292aa3	fix: silence deprecation in huggingface snapshot_download function (#8150 ) ### What problem does this PR solve? fixes the following deprecation emitted from `download_deps.py`: ``` UserWarning: `local_dir_use_symlinks` parameter is deprecated and will be ignored. The process to download files to a local folder has been updated and do not rely on symlinks anymore. You only need to pass a destination folder as`local_dir` ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 21:00:03 +08:00
writinwaters	4f92af3cd4	Docs: Updated Auto-question Auto-keyword (#8168 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-10 19:38:28 +08:00
Liu An	a43adafc6b	Refa: Add error handling for JSON decode in embedding models (#8162 ) ### What problem does this PR solve? Improve robustness of Jina, Nvidia, and SILICONFLOW embedding models by: 1. Adding try-catch blocks for JSON decode errors 2. Logging error details including response content 3. Raising exceptions with meaningful error messages ### Type of change - [x] Refactoring	2025-06-10 19:04:17 +08:00
balibabu	c5e4684b44	Feat: Let system variables appear in operator prompts #3221 (#8154 ) ### What problem does this PR solve? Feat: Let system variables appear in operator prompts #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-10 17:06:30 +08:00
Liu An	3a34def55f	Test: Migrate test workflow to use top-level test directory (#8145 ) ### What problem does this PR solve? - Replace manual venv activation with `uv run` for pytest commands - Add dynamic test level (p2/p3) based on GitHub event type - Simplify test commands by removing redundant directory changes ### Type of change - [x] Update Action	2025-06-10 13:55:26 +08:00
Stephen Hu	e6f68e1ccf	Fix: When List Kbs some times the total is wrong (#8151 ) ### What problem does this PR solve? In the kb.app list method, the total is calculated incorrectly when owner_ids is given (the total is now calculated from the paged result) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 11:34:30 +08:00
Jacky Wu	60ab7027c0	fix: allow to do role auth for S3 bucket use. (#8149 ) ### What problem does this PR solve? Close #8148 . ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-10 10:50:07 +08:00
balibabu	08f2223a6a	Feat: Constructing query parameter options for the Retrieval operator #3221 (#8152 ) ### What problem does this PR solve? Feat: Constructing query parameter options for the Retrieval operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-10 10:49:41 +08:00
yurhett	9c6c6c51e0	Fix: use jwks_uri from OIDC metadata for JWKS client (#8136 ) ### What problem does this PR solve? Issue: #8051 The current implementation assumes JWKS endpoints follow the standard `/.well-known/jwks.json` convention. This breaks authentication for OIDC providers that use non-standard JWKS paths, resulting in 404 errors during token validation. Root Cause Analysis - The OpenID Connect specification doesn't mandate a fixed path for JWKS endpoints - Some identity providers (like certain Keycloak configurations) use custom endpoints - Our previous approach constructed JWKS URLs by convention rather than discovery ### Solution Approach Instead of constructing JWKS URLs by appending to the issuer URI, we now: 1. Properly leverage the `jwks_uri` from the OIDC discovery metadata 2. Honor the identity provider's actual configured endpoint ```python # Before (fragile approach) jwks_url = f"{self.issuer}/.well-known/jwks.json" # After (standards-compliant) jwks_cli = jwt.PyJWKClient(self.jwks_uri) # Use discovered endpoint ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 10:16:58 +08:00
HaiyangP	baf32ee461	Display only the duplicate column names and corresponding original source. (#8138 ) ### What problem does this PR solve? This PR aims to solve #8120, which requests a better error display of duplicate column names. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 10:16:38 +08:00
balibabu	8fb6b5d945	Feat: Add agent operator node from agent form #3221 (#8144 ) ### What problem does this PR solve? Feat: Add agent operator node from agent form #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-09 19:19:48 +08:00
Liu An	5cc2eda362	Test: Refactor test fixtures and add SDK session management tests (#8141 ) ### What problem does this PR solve? - Consolidate HTTP API test fixtures using batch operations (batch_add_chunks, batch_create_chat_assistants) - Fix fixture initialization order in clear_session_with_chat_assistants - Add new SDK API test suite for session management (create/delete/list/update) ### Type of change - [x] Add test cases - [x] Refactoring	2025-06-09 18:13:26 +08:00
balibabu	9a69d5f367	Feat: Display chat content on the agent page #3221 (#8140 ) ### What problem does this PR solve? Feat: Display chat content on the agent page #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-09 18:13:06 +08:00
balibabu	d9b98cbb18	Feat: Convert the prompt field of the agent operator to an array #3221 (#8137 ) ### What problem does this PR solve? Feat: Convert the prompt field of the agent operator to an array #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-09 16:02:33 +08:00
Kevin Hu	24625e0695	Fix: presentation of PDF using vlm. (#8133 ) ### What problem does this PR solve? #8109 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-09 15:01:52 +08:00
Liu An	4649accd54	Test: Add SDK API tests for chat assistant management and improve con… (#8131 ) ### What problem does this PR solve? - Implement new SDK API test cases for chat assistant CRUD operations - Enhance HTTP API concurrent tests to use as_completed for better reliability ### Type of change - [x] Add test cases - [x] Refactoring	2025-06-09 13:30:12 +08:00


3288 Commits