ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2025-09-01 20:36:59 +00:00

Author	SHA1	Message	Date
Stephen Hu	ce65ea1fc1	Fix: Change allocate_container_blocking Calculate Time by async time (#8206 ) ### What problem does this PR solve? Change allocate_container_blocking Calculate Time by async time ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-13 14:05:11 +08:00
writinwaters	2341939376	Docs: Miscellaneous editorial updates (#8237 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-13 09:46:24 +08:00
balibabu	a9d9215547	Feat: Connect conditional operators to other operators #3221 (#8231 ) ### What problem does this PR solve? Feat: Connect conditional operators to other operators #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-13 09:30:34 +08:00
Liu An	99725444f1	Fix: desc parameter parsing (#8229 ) ### What problem does this PR solve? - Fix boolean parsing for 'desc' parameter in kb_app.py to properly handle string values ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 19:17:47 +08:00
Stephen Hu	1ab0f52832	Fix: The OpenAI-Compatible Agent API returns an incorrect message (#8177 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8175 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 19:17:15 +08:00
Yongteng Lei	24ca4cc6b7	Refa: GraphRAG and explaining GraphRAG stalling behavior on large files (#8223 ) ### What problem does this PR solve? This PR investigates the cause of #7957. TL;DR: Incorrect similarity calculations lead to too many candidates. Since candidate selection involves interaction with the LLM, this causes significant delays in the program. What this PR does: 1. Fix similarity calculation: When processing a 64-page government document, the corrected similarity calculation reduces the number of candidates from over 100,000 to around 16,000. With a default batch size of 100 pairs per LLM call, this fix reduces unnecessary LLM interactions from over 1,000 calls to around 160, a roughly 10x improvement. 2. Add concurrency and timeout limits: Up to 5 entity types are processed in "parallel", each with a 180-second timeout. These limits may be configurable in future updates. 3. Improve logging: The candidate resolution process now reports progress in real time. 4. Mitigate potential concurrency risks. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2025-06-12 19:09:50 +08:00
Kevin Hu	d36c8d18b1	Refa: make exception more clear. (#8224 ) ### What problem does this PR solve? #8156 ### Type of change - [x] Refactoring	2025-06-12 17:53:59 +08:00
Liu An	86a1411b07	Refa: Test configs (#8220 ) ### What problem does this PR solve? - Move common constants (HOST_ADDRESS, INVALID_API_TOKEN, etc.) to configs.py - Update test imports to use centralized configs - Clean up duplicate constant definitions across test files This improves maintainability by centralizing configuration. ### Type of change - [x] Refactoring test case	2025-06-12 17:42:00 +08:00
Liu An	54a465f9e8	Test: fix chunk deletion test assertions (#8222 ) ### What problem does this PR solve? - Fix test assertions in test_delete_chunks.py to expect empty results after deletion Action 7619 ### Type of change - [x] Bug Fix test cases	2025-06-12 17:41:46 +08:00
balibabu	bf7f7c7027	Feat: Display the connection lines between multiple conditions of the conditional operator #3221 (#8218 ) ### What problem does this PR solve? Feat: Display the connection lines between multiple conditions of the conditional operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-12 17:11:24 +08:00
Liu An	7fbbc9650d	Fix: Move pagerank field from create to update dataset API (#8217 ) ### What problem does this PR solve? - Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq - Add pagerank update logic in dataset update endpoint - Update API documentation to reflect changes - Modify related test cases and SDK references #8208 This change makes pagerank a mutable property that can only be set after dataset creation, and only when using elasticsearch as the doc engine. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 15:47:49 +08:00
Liu An	d0c5ff04a6	Fix: Add pagerank validation for non-elasticsearch doc engines (#8215 ) ### What problem does this PR solve? Validate that pagerank updates are only allowed when using elasticsearch as the document engine. Return an error if pagerank is set while using a different doc engine, preventing potential inconsistencies in document scoring. #8208 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 15:47:22 +08:00
Kevin Hu	d5236b71f4	Refa: ollama keep alive issue. (#8216 ) ### What problem does this PR solve? #8122 ### Type of change - [x] Refactoring	2025-06-12 15:09:40 +08:00
Stephen Hu	e7c85e569b	Fix: Improve TS Warning For http_api_reference.md (#8172 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8157 The current master code should work fine, but I have some warnings, so I added a declare to improve the warning ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 14:20:15 +08:00
balibabu	84b4e32c34	Feat: The value selected in the Select component only displays the icon #3221 (#8209 ) ### What problem does this PR solve? Feat: The value selected in the Select component only displays the icon #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-12 12:31:57 +08:00
Kevin Hu	56ee69e9d9	Refa: chat with tools. (#8210 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-06-12 12:31:10 +08:00
africa-worker	44287fb05f	Oss support opendal(including mysql) (#8204 ) ### What problem does this PR solve? #8074 Oss support opendal(including mysql) ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-12 11:37:42 +08:00
Liu An	cef587abc2	Fix: Add validation for dataset name in KB update API (#8194 ) ### What problem does this PR solve? Validate dataset name in knowledge base update endpoint to ensure: - Name is a non-empty string - Name length doesn't exceed DATASET_NAME_LIMIT - Whitespace is trimmed before processing Prevents invalid dataset names from being saved and provides clear error messages. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 11:37:25 +08:00
Yongteng Lei	1a5f991d86	Fix: auto-keyword and auto-question fail with qwq model (#8190 ) ### What problem does this PR solve? Fix auto-keyword and auto-question fail with qwq model. #8189 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 11:37:07 +08:00
balibabu	713b574c9d	Feat: Add SwitchForm component #3221 (#8200 ) ### What problem does this PR solve? Feat: Add SwitchForm component #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-12 09:50:25 +08:00
Liu An	60c1bf5a19	Fix: duplicate knowledgebase name validation logic (#8199 ) ### What problem does this PR solve? Change the condition from checking for >1 to >=1 when validating duplicate knowledgebase names to properly catch all duplicates. This ensures no two knowledgebases can have the same name for a tenant. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 09:46:57 +08:00
writinwaters	d331866a12	Docs: Miscellaneous (#8198 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-12 09:42:07 +08:00
Kevin Hu	69e1fc496d	Refa: chat models (#8187 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-06-11 17:20:12 +08:00
Liu An	e87ad8126c	Fix: Improve dataset name validation in KB app (#8188 ) ### What problem does this PR solve? - Trim whitespace before checking for empty dataset names - Change length check from >= to > DATASET_NAME_LIMIT for consistency ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-11 16:14:29 +08:00
Yongteng Lei	5e30426916	Feat: add Qwen3-Embedding text-embedding-v4 (#8184 ) ### What problem does this PR solve? Add Qwen3-Embedding text-embedding-v4. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-11 15:32:05 +08:00
Liu An	6aff3e052a	Test: Refactor test fixtures to use HttpApiAuth naming consistently (#8180 ) ### What problem does this PR solve? - Rename `api_key` fixture to `HttpApiAuth` across all test files - Update all dependent fixtures and test cases to use new naming - Maintain same functionality while improving naming clarity The rename better reflects the fixture's purpose as an HTTP API authentication helper rather than just an API key. ### Type of change - [x] Refactoring	2025-06-11 14:25:40 +08:00
Liu An	f29d9fa3f9	Test: fix test cases and improve document parsing validation (#8179 ) ### What problem does this PR solve? - Update chat assistant tests to use dataset.id directly in payloads - Enhance document parsing tests with better condition checking - Add explicit type hints and improve timeout handling Action_7556 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-11 14:25:30 +08:00
balibabu	31003cd5f6	Feat: Display the agent node running timeline #3221 (#8185 ) ### What problem does this PR solve? Feat: Display the agent node running timeline #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-11 14:24:43 +08:00
balibabu	f0a3d91171	Feat: Display agent operator call log #3221 (#8169 ) ### What problem does this PR solve? Feat: Display agent operator call log #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-11 09:22:07 +08:00
cwr31	e6d36f3a3a	Improve image rotation logic for text recognition (#8167 ) ### What problem does this PR solve? Enhanced the image rotation handling by evaluating the original orientation, clockwise 90°, and counter-clockwise 90° rotations. The image with the highest text recognition score is now selected, improving accuracy for text detection in images with aspect ratios >= 1.5. #8166 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: wenrui.cao <wenrui.cao@univers.com>	2025-06-11 09:20:30 +08:00
writinwaters	c8269206d7	Docs: UI updates (#8170 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-11 09:17:30 +08:00
Gifford Nowland	ab67292aa3	fix: silence deprecation in huggingface snapshot_download function (#8150 ) ### What problem does this PR solve? fixes the following deprecation emitted from `download_deps.py`: ``` UserWarning: `local_dir_use_symlinks` parameter is deprecated and will be ignored. The process to download files to a local folder has been updated and do not rely on symlinks anymore. You only need to pass a destination folder as`local_dir` ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 21:00:03 +08:00
writinwaters	4f92af3cd4	Docs: Updated Auto-question Auto-keyword (#8168 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-06-10 19:38:28 +08:00
Liu An	a43adafc6b	Refa: Add error handling for JSON decode in embedding models (#8162 ) ### What problem does this PR solve? Improve robustness of Jina, Nvidia, and SILICONFLOW embedding models by: 1. Adding try-catch blocks for JSON decode errors 2. Logging error details including response content 3. Raising exceptions with meaningful error messages ### Type of change - [x] Refactoring	2025-06-10 19:04:17 +08:00
balibabu	c5e4684b44	Feat: Let system variables appear in operator prompts #3221 (#8154 ) ### What problem does this PR solve? Feat: Let system variables appear in operator prompts #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-10 17:06:30 +08:00
Liu An	3a34def55f	Test: Migrate test workflow to use top-level test directory (#8145 ) ### What problem does this PR solve? - Replace manual venv activation with `uv run` for pytest commands - Add dynamic test level (p2/p3) based on GitHub event type - Simplify test commands by removing redundant directory changes ### Type of change - [x] Update Action	2025-06-10 13:55:26 +08:00
Stephen Hu	e6f68e1ccf	Fix: When List Kbs some times the total is wrong (#8151 ) ### What problem does this PR solve? In the kb.app list method, the total is calculated incorrectly when owner_ids is given (the total is now calculated from the paged result). ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 11:34:30 +08:00
Jacky Wu	60ab7027c0	fix: allow to do role auth for S3 bucket use. (#8149 ) ### What problem does this PR solve? Close #8148 . ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-10 10:50:07 +08:00
balibabu	08f2223a6a	Feat: Constructing query parameter options for the Retrieval operator #3221 (#8152 ) ### What problem does this PR solve? Feat: Constructing query parameter options for the Retrieval operator #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-10 10:49:41 +08:00
yurhett	9c6c6c51e0	Fix: use jwks_uri from OIDC metadata for JWKS client (#8136 ) ### What problem does this PR solve? Issue: #8051 The current implementation assumes JWKS endpoints follow the standard `/.well-known/jwks.json` convention. This breaks authentication for OIDC providers that use non-standard JWKS paths, resulting in 404 errors during token validation. Root Cause Analysis - The OpenID Connect specification doesn't mandate a fixed path for JWKS endpoints - Some identity providers (like certain Keycloak configurations) use custom endpoints - Our previous approach constructed JWKS URLs by convention rather than discovery ### Solution Approach Instead of constructing JWKS URLs by appending to the issuer URI, we now: 1. Properly leverage the `jwks_uri` from the OIDC discovery metadata 2. Honor the identity provider's actual configured endpoint ```python # Before (fragile approach) jwks_url = f"{self.issuer}/.well-known/jwks.json" # After (standards-compliant) jwks_cli = jwt.PyJWKClient(self.jwks_uri) # Use discovered endpoint ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 10:16:58 +08:00
HaiyangP	baf32ee461	Display only the duplicate column names and corresponding original source. (#8138 ) ### What problem does this PR solve? This PR aims to solve #8120, which requests a better error display for duplicate column names. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-10 10:16:38 +08:00
balibabu	8fb6b5d945	Feat: Add agent operator node from agent form #3221 (#8144 ) ### What problem does this PR solve? Feat: Add agent operator node from agent form #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-09 19:19:48 +08:00
Liu An	5cc2eda362	Test: Refactor test fixtures and add SDK session management tests (#8141 ) ### What problem does this PR solve? - Consolidate HTTP API test fixtures using batch operations (batch_add_chunks, batch_create_chat_assistants) - Fix fixture initialization order in clear_session_with_chat_assistants - Add new SDK API test suite for session management (create/delete/list/update) ### Type of change - [x] Add test cases - [x] Refactoring	2025-06-09 18:13:26 +08:00
balibabu	9a69d5f367	Feat: Display chat content on the agent page #3221 (#8140 ) ### What problem does this PR solve? Feat: Display chat content on the agent page #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-09 18:13:06 +08:00
balibabu	d9b98cbb18	Feat: Convert the prompt field of the agent operator to an array #3221 (#8137 ) ### What problem does this PR solve? Feat: Convert the prompt field of the agent operator to an array #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-09 16:02:33 +08:00
Kevin Hu	24625e0695	Fix: presentation of PDF using vlm. (#8133 ) ### What problem does this PR solve? #8109 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-09 15:01:52 +08:00
Liu An	4649accd54	Test: Add SDK API tests for chat assistant management and improve con… (#8131 ) ### What problem does this PR solve? - Implement new SDK API test cases for chat assistant CRUD operations - Enhance HTTP API concurrent tests to use as_completed for better reliability ### Type of change - [x] Add test cases - [x] Refactoring	2025-06-09 13:30:12 +08:00
Liu An	968ffc7ef3	Refa: dataset operations to simplify error handling (#8132 ) ### What problem does this PR solve? - Consolidate database operations within single try-except blocks in the methods ### Type of change - [x] Refactoring	2025-06-09 13:29:56 +08:00
Stephen Hu	2337bbf6ca	Perf: pass useless check for tidy graph (#8121 ) ### What problem does this PR solve? Support skipping the attribute check when the upstream has already verified it. ### Type of change - [X] Performance Improvement	2025-06-09 11:44:13 +08:00
Liu An	ad1f89fea0	Fix: chat module update LLM defaults (#8125 ) ### What problem does this PR solve? Previously when LLM.model_name was not configured: - System incorrectly defaulted to 'deepseek-chat' model - This caused permission errors for unauthorized tenants Now: - Use tenant's default chat_model configuration first ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-09 11:44:02 +08:00
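
The dataset-name checks described in commits cef587abc2 and e87ad8126c (trim whitespace first, reject empty names, reject names longer than `DATASET_NAME_LIMIT` using `>` rather than `>=`) can be sketched roughly as below. This is an illustration only, not the code from `kb_app.py`: the function name `validate_dataset_name`, the limit value of 128, the error messages, and the choice to measure length in characters are all assumptions.

```python
# Hypothetical value: the log does not state what DATASET_NAME_LIMIT is.
DATASET_NAME_LIMIT = 128


def validate_dataset_name(name):
    """Return the trimmed name, or raise ValueError describing the problem.

    Hypothetical helper; the name and messages are not taken from the project.
    """
    if not isinstance(name, str):
        raise ValueError("Dataset name must be a string.")
    trimmed = name.strip()  # trim whitespace before any other check
    if not trimmed:
        raise ValueError("Dataset name can't be empty.")
    # Strictly greater than: a name of exactly DATASET_NAME_LIMIT characters passes.
    if len(trimmed) > DATASET_NAME_LIMIT:
        raise ValueError(
            f"Dataset name is {len(trimmed)} characters long; "
            f"the limit is {DATASET_NAME_LIMIT}."
        )
    return trimmed
```

Trimming before the emptiness check means a name made only of spaces is rejected as empty instead of being stored as-is.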

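The discovery-based approach described in commit 9c6c6c51e0 (read `jwks_uri` from the OIDC metadata instead of appending `/.well-known/jwks.json` to the issuer) might look like the sketch below. The helper name `resolve_jwks_uri` and the example metadata values are made up for illustration; only the `jwks_uri` field and the `jwt.PyJWKClient(...)` call come from the commit message.

```python
def resolve_jwks_uri(metadata):
    """Pick the JWKS endpoint from OIDC discovery metadata (a parsed JSON dict).

    Hypothetical helper, not the project's implementation.
    """
    jwks_uri = metadata.get("jwks_uri")
    if not jwks_uri:
        # Fail loudly instead of guessing a conventional path.
        raise ValueError("OIDC metadata does not contain 'jwks_uri'.")
    return jwks_uri


# Made-up metadata, shaped like a response from
# <issuer>/.well-known/openid-configuration.
metadata = {
    "issuer": "https://idp.example.com/realms/demo",
    "jwks_uri": "https://idp.example.com/realms/demo/protocol/openid-connect/certs",
}
jwks_uri = resolve_jwks_uri(metadata)

# With PyJWT installed, the client would then be created as in the commit message:
#   jwks_cli = jwt.PyJWKClient(jwks_uri)
```

Note that the discovered endpoint in this example does not end in `/.well-known/jwks.json`, which is the case the convention-based URL could not handle.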
3285 Commits