ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2025-12-14 00:27:16 +00:00

Author	SHA1	Message	Date
Stephen Hu	fb77f9917b	Refactor: Use Input Length In DefaultRerank (#9516 ) ### What problem does this PR solve? 1. Use input length to prepare res 2. Adjust torch_empty_cache code location ### Type of change - [x] Refactoring - [x] Performance Improvement	2025-08-18 10:00:27 +08:00
RuyXu	762aa4b8c4	fix: preserve correct MIME & unify data URL handling for vision inputs (relates #9248 ) (#9474 ) fix: preserve correct MIME & unify data URL handling for vision inputs (relates #9248) - Updated image2base64() to return a full data URL (data:image/<fmt>;base64,...) with accurate MIME - Removed hardcoded image/jpeg in Base._image_prompt(); pass through data URLs and default raw base64 to image/png - Set AnthropicCV._image_prompt() raw base64 media_type default to image/png - Ensures MIME type matches actual image content, fixing “cannot process base64 image” errors on vLLM/OpenAI-compatible backends ### What problem does this PR solve? This PR fixes a compatibility issue where base64-encoded images sent to vision models (e.g., vLLM/OpenAI-compatible backends) were rejected due to mismatched MIME type or incorrect decoding. Previously, the backend: - Always converted raw base64 into data:image/jpeg;base64,... even if the actual content was PNG. - In some cases, base64 decoding was attempted on the full data URL string instead of the pure base64 part. This caused errors like: ``` cannot process base64 image failed to decode base64 string: illegal base64 data at input byte 0 ``` by strict validators such as vLLM. With this fix, the MIME type in the request now matches the actual image content, and data URLs are correctly handled or passed through, ensuring vision models can decode and process images reliably. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-14 17:00:56 +08:00
Stephen Hu	f2806a8332	Update cv_model.py (#9472 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9452 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-14 13:45:38 +08:00
Stephen Hu	da5cef0686	Refactor:Improve the float compare for LocalAIRerank (#9428 ) ### What problem does this PR solve? Improve the float compare for LocalAIRerank ### Type of change - [x] Refactoring	2025-08-13 10:26:42 +08:00
Yongteng Lei	a0c2da1219	Fix: Patch LiteLLM (#9416 ) ### What problem does this PR solve? Patch LiteLLM refactor. #9408 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-12 15:54:30 +08:00
Yongteng Lei	83771e500c	Refa: migrate chat models to LiteLLM (#9394 ) ### What problem does this PR solve? All models pass the mock response tests, which means that if a model can return the correct response, everything should work as expected. However, not all models have been fully tested in a real environment, the real API_KEY. I suggest actively monitoring the refactored models over the coming period to ensure they work correctly and fixing them step by step, or waiting to merge until most have been tested in practical environment. ### Type of change - [x] Refactoring	2025-08-12 10:59:20 +08:00
Stephen Hu	7713e14d6a	Update chat_model.py (#9318 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9317 Based on https://discuss.ai.google.dev/t/valueerror-invalid-operation-the-response-text-quick-accessor-requires-the-response-to-contain-a-valid-part-but-none-were-returned/42866 this error can be handled by a retry ### Type of change - [x] Refactoring	2025-08-08 14:13:07 +08:00
Kevin Hu	a2e1f5618d	Fix: bytes style image issue. (#9304 ) ### What problem does this PR solve? #9302 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-07 15:20:01 +08:00
so95	35539092d0	Add kwargs to model base class constructors (#9252 ) Updated constructors for base and derived classes in chat, embedding, rerank, sequence2txt, and tts models to accept kwargs. This change improves extensibility and allows passing additional parameters without breaking existing interfaces. - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: IT: Sop.Son <sop.son@feavn.local> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-08-07 09:45:37 +08:00
Kevin Hu	2124329e95	Fix: local variable issue. (#9255 ) ### What problem does this PR solve? #9227 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-05 19:24:34 +08:00
Stephen Hu	0a303d9ae1	Refactor:Improve the chat stream logic for NvidiaCV (#9242 ) ### What problem does this PR solve? Improve the chat stream logic for NvidiaCV ### Type of change - [x] Refactoring	2025-08-05 17:47:00 +08:00
Stephen Hu	1deb0a2d42	Fix:local variable 'response' referenced before assignment (#9230 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9227 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-08-05 11:00:06 +08:00
Yongteng Lei	30ccc4a66c	Fix: correct single base64 image handling in image prompt (#9220 ) ### What problem does this PR solve? Correct single base64 image handling in image prompt. ![img_v3_02or_ec4757c2-a9d4-4774-9a76-f7c6be633ebg](https://github.com/user-attachments/assets/872a86bf-e2a8-48d1-9b71-2a0c7a35ba9e) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-05 09:26:42 +08:00
Stephen Hu	e9cbf4611d	Fix:Error when parsing files using Gemini: ERROR: GENERIC_ERROR - Unknown field for GenerationConfig: max_tokens (#9195 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/9177 The likely cause is that Gemini internally uses a different parameter name: ` max_output_tokens (int): Optional. The maximum number of tokens to include in a response candidate. Note: The default value varies by model, see the ``Model.output_token_limit`` attribute of the ``Model`` returned from the ``getModel`` function. This field is a member of `oneof`_ ``_max_output_tokens``. ` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-08-04 10:06:09 +08:00
Stephen Hu	5ccdb95008	Refactor:Introduce Image Close For GeminiCV (#9147 ) ### What problem does this PR solve? Introduce Image Close For GeminiCV ### Type of change - [x] Refactoring - [x] Performance Improvement	2025-08-01 12:38:13 +08:00
JI4JUN	aeaeb169e4	Feat/support 302ai provider (#8742 ) ### What problem does this PR solve? Support 302.AI provider. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-31 14:48:30 +08:00
Stephen Hu	20b4d88098	Refactor: Improve the try catch logic for XinferenceEmbed (#9128 ) ### What problem does this PR solve? Improve the try catch logic for XinferenceEmbed ### Type of change - [x] Refactoring	2025-07-31 12:14:50 +08:00
Kevin Hu	d9fe279dde	Feat: Redesign and refactor agent module (#9113 ) ### What problem does this PR solve? #9082 #6365 <u> WARNING: it's not compatible with the older version of `Agent` module, which means that `Agent` from older versions can not work anymore.</u> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-30 19:41:09 +08:00
謝富祥	021e8b57ae	Fix: fix error 429 api rate limit when building knowledge graph for all chat model and Mistral embedding model (#9106 ) ### What problem does this PR solve? fix error 429 api rate limit when building knowledge graph for all chat model and Mistral embedding model. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-30 11:37:49 +08:00
Stephen Hu	ba563f8095	Update embedding_model.py (#9083 ) ### What problem does this PR solve? Reduce the logic scope for DefaultEmbedding ### Type of change - [x] Refactoring	2025-07-30 09:44:30 +08:00
Stephen Hu	86b4da0844	Refactor: Remove Useless split for BedrockEmbed (#9067 ) ### What problem does this PR solve? Remove Useless split for BedrockEmbed ### Type of change - [x] Refactoring	2025-07-28 10:16:38 +08:00
Stephen Hu	53b0b0e583	get keep alive from env (#9039 ) ### What problem does this PR solve? get keepalive from env ### Type of change - [x] Refactoring	2025-07-25 12:16:33 +08:00
Viktor Dmitriyev	b47dcc9108	Fix issue with `keep_alive=-1` for ollama chat model by allowing a user to set an additional configuration option (#9017 ) ### What problem does this PR solve? Fix an issue with `keep_alive=-1` for the ollama chat model by allowing a user to set an additional configuration option. It is a non-breaking change because it still uses the previous default value, `keep_alive=-1` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [X] Performance Improvement - [X] Other (please describe): - Additional configuration option has been added to control behavior of RAGFlow while working with ollama LLM	2025-07-24 11:20:14 +08:00
Yongteng Lei	a2f73af1a4	Fix: typo Bearer token (#8998 ) ### What problem does this PR solve? Typo Bearer token. #8960 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-23 18:10:51 +08:00
Yongteng Lei	7ebc1f0943	Feat: add model provider DeepInfra (#9003 ) ### What problem does this PR solve? Add model provider DeepInfra. This model list comes from our community. NOTE: most endpoints haven't been tested, but they should work as OpenAI does. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-23 18:10:35 +08:00
Stephen Hu	ec21d9a98f	Refactor:remove use less convert for FastEmbed (#8984 ) ### What problem does this PR solve? remove use less convert for FastEmbed ### Type of change - [x] Refactoring	2025-07-23 10:51:48 +08:00
Stephen Hu	95b9208b13	Fix:Improve float operation when rerank (#8963 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8915 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-22 10:04:00 +08:00
Stephen Hu	46caf6ae72	Refactor improve codes for ranker (#8936 ) ### What problem does this PR solve? Use the normalize method directly ### Type of change - [x] Refactoring	2025-07-21 10:22:20 +08:00
Stephen Hu	38b34116dd	Refa: Remove useless conver and fix a bug for DefaultRerank (#8887 ) ### What problem does this PR solve? 1. bug when re-try, we need to reset i. 2. remove useless convert ### Type of change - [x] Refactoring	2025-07-17 12:09:50 +08:00
Liu An	9e45fcfdb3	Fix: fix typo in OpenAI error logging message (#8865 ) ### What problem does this PR solve? Correct the logging message from "OpenAI cat_with_tools" to "OpenAI chat_with_tools" in the `_exceptions` method of the `Base` class to accurately reflect the method name and improve error traceability. ### Type of change - [x] Typo	2025-07-16 15:31:57 +08:00
Stephen Hu	5fa6f2f151	Update embedding_model.py (#8836 ) ### What problem does this PR solve? Remove useless covert for bge encode_queries ### Type of change - [x] Performance Improvement	2025-07-15 14:04:58 +08:00
Stephen Hu	5383e254c4	Perf:Remove Useless Convert When BGE Embedding (#8816 ) ### What problem does this PR solve? FlagModel internal support returns as numpy ### Type of change - [x] Performance Improvement	2025-07-14 14:02:48 +08:00
Stephen Hu	07208e519b	Fix: Wrong_Input_type_for_Gemin (#8783 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8763#issuecomment-3055317110 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-11 11:34:04 +08:00
Yongteng Lei	1895667573	Feat: add xAI provider (#8781 ) ### What problem does this PR solve? Add xAI provider (experimental feature, requires user feedback). ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-11 10:35:23 +08:00
Kevin Hu	8281ceb406	Refa: refine retry gap. (#8773 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring - [x] Performance Improvement	2025-07-10 14:28:57 +08:00
Stephen Hu	8d027813f5	Refactor: Improve How To Handle QWenEmbed (#8765 ) ### What problem does this PR solve? Based on https://github.com/infiniflow/ragflow/issues/8740 1. A better handle for 'NoneType' object is not subscriptable 2. Add some logs to get the internal message ### Type of change - [x] Refactoring	2025-07-10 10:30:18 +08:00
Stephen Hu	19419281c3	Fix: Change Ollama Embedding Keep Alive (#8734 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8733 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-09 12:17:26 +08:00
Stephen Hu	e60ec0a31b	Fix:disallowed special token while embedding (#8692 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8567 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-07 14:13:37 +08:00
6607changchun	9580e99650	fix: retry embedding with Qwen family models when limits temporarily reached. (#8690 ) fix: retry embedding with Qwen family models when limits temporarily reached. The APIs of the Qwen family models are rate limited. When the limit is reached, the "output" attribute of "resp" is None, which in turn causes a TypeError when trying to retrieve "embeddings". Since these limits are almost always temporary, I have added a simple retry mechanism to work around them. In addition, if retry_max is reached, the error is raised early instead of being hidden behind a "TypeError". ### What problem does this PR solve? Sometimes Qwen blocks calls due to rate limits, which stops the whole parsing procedure when creating a knowledge base. In this situation, resp["output"] is None, and resp["output"]["embeddings"] raises a TypeError. Since the limits are temporary, I apply a simple retry mechanism to solve it. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-07-07 12:15:52 +08:00
Yongteng Lei	f8a6987f1e	Refa: automatic LLMs registration (#8651 ) ### What problem does this PR solve? Support automatic LLMs registration. ### Type of change - [x] Refactoring	2025-07-03 19:05:31 +08:00
Kevin Hu	fffb7c0bba	Fix: anthropic llm issue. (#8633 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-02 18:37:34 +08:00
He Wang	898da23caa	make dirs with 'exist_ok=True' (#8629 ) ### What problem does this PR solve? The following error occurred during local testing, which should be fixed by configuring 'exist_ok=True'. ```log set_progress(7461edc2535c11f0a2aa0242c0a82009), progress: -1, progress_msg: 21:41:41 Page(1~100000001): [ERROR][Errno 17] File exists: '/ragflow/tmp' ``` ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-02 18:35:16 +08:00
Tuan Le	d343cb4deb	Add Google Cloud Vision API Integration (Image2Text) (#8608 ) ### What problem does this PR solve? This PR introduces Google Cloud Vision API integration to enhance image understanding capabilities in the application. It addresses the need for advanced image description and chat functionalities by implementing a new `GoogleCV` class to handle API interactions and updating relevant configurations. This enables users to leverage Google Cloud Vision for image-to-text tasks, improving the application's ability to process and interpret visual data. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-02 10:02:01 +08:00
Tuan Le	1c77b4ed9b	fix: Correctly format message parts in GoogleChat (#8596 ) ### What problem does this PR solve? This PR addresses an incompatibility issue with the Google Chat API by correcting the message content format in the `GoogleChat` class. Previously, the content was directly assigned to the "parts" field, which did not align with the API's expected format. This change ensures that messages are properly formatted with a "text" key within a dictionary, as required by the API. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-01 14:06:07 +08:00
Kevin Hu	d46c24045f	Feat: add GiteeAI as a llm provider. (#8572 ) ### What problem does this PR solve? #1853 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 11:22:11 +08:00
Kevin Hu	aafeffa292	Feat: add gitee as LLM provider. (#8545 ) ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 09:22:31 +08:00
Kevin Hu	e441c17c2c	Refa: limit embedding concurrency and fix `chat_with_tool` (#8543 ) ### What problem does this PR solve? #8538 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2025-06-27 19:28:41 +08:00
Kevin Hu	a10f05f4d7	Fix: chat with tools bug. (#8528 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-27 12:10:53 +08:00
Rainman	340354b79c	fix the error 'Unknown field for GenerationConfig: max_tokens' when u… (#8473 ) ### What problem does this PR solve? [https://github.com/infiniflow/ragflow/issues/8324](url) docker image version: v0.19.1 The `_clean_conf` function was not implemented in the `_chat` and `chat_streamly` methods of the `GeminiChat` class, causing the error "Unknown field for GenerationConfig: max_tokens" when the default LLM config includes the "max_tokens" parameter. Buggy Code(ragflow/rag/llm/chat_model.py)
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)

        from google.generativeai import GenerativeModel, client

        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client

    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf

    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types

        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")

        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types

        if system:
            self.model._system_instruction = content_types.to_content(system)
        # ❌ _clean_conf was not implemented
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p", "max_tokens"]:
                del gen_conf[k]
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans

            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\nERROR: " + str(e)

        yield 0
```
Implement the _clean_conf function
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)

        from google.generativeai import GenerativeModel, client

        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client

    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf

    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types

        # ✅ implement _clean_conf to remove the wrong parameters
        gen_conf = self._clean_conf(gen_conf)
        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")

        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types

        # ✅ implement _clean_conf to remove the wrong parameters
        gen_conf = self._clean_conf(gen_conf)
        if system:
            self.model._system_instruction = content_types.to_content(system)
        # ✅ Removed duplicate parameter filtering logic "for k in list(gen_conf.keys()):"
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans

            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\nERROR: " + str(e)

        yield 0
```
### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-06-25 16:23:35 +08:00
Rainman	49d67cbcb7	fix a bug when using huggingface embedding api (#8432 ) ### What problem does this PR solve? image_version: v0.19.1 This PR fixes a bug in the HuggingFace embedding API method that was causing AssertionError: assert len(vects) == len(docs) during the document embedding process. #### Problem The HuggingFaceEmbed.encode() method had an early return statement inside the for loop, causing it to return after processing only the first text input instead of processing all texts in the input list. Error Message
```python
AssertionError: assert len(vects) == len(docs)  # input chunks != embedded vectors from embedding api
File "/ragflow/rag/svr/task_executor.py", line 442, in embedding
```
Buggy code(/ragflow/rag/llm/embedding_model.py)
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"

    def encode(self, texts: list):
        embeddings = []
        for text in texts:
            response = requests.post(...)
            if response.status_code == 200:
                try:
                    embedding = response.json()
                    embeddings.append(embedding[0])
                    # ❌ Early return
                    return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])
                except Exception as _e:
                    log_exception(_e, response)
            else:
                raise Exception(...)
```
Fixed Code(I just rolled this function back to the v0.19.0 version)
```python
class HuggingFaceEmbed(Base):
    def __init__(self, key, model_name, base_url=None):
        if not model_name:
            raise ValueError("Model name cannot be None")
        self.key = key
        self.model_name = model_name.split("___")[0]
        self.base_url = base_url or "http://127.0.0.1:8080"

    def encode(self, texts: list):
        embeddings = []
        for text in texts:
            response = requests.post(...)
            if response.status_code == 200:
                embedding = response.json()
                embeddings.append(embedding[0])  # ✅ Only append, no return
            else:
                raise Exception(...)
        # ✅ Return after processing all
        return np.array(embeddings), sum([num_tokens_from_string(text) for text in texts])
```
### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-24 09:35:02 +08:00
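The early-return bug described in the last entry above (#8432) can be reproduced without an embedding server. The following is a minimal sketch, not RAGFlow's code: `fake_embed`, `encode_buggy`, and `encode_fixed` are hypothetical names, the HTTP call is replaced by a local stub, and whitespace word counting stands in for `num_tokens_from_string`.

```python
from typing import List, Tuple


def fake_embed(text: str) -> List[List[float]]:
    """Hypothetical stand-in for the HTTP embedding call.

    Returns a one-element batch, mirroring the `embedding[0]` access
    in the commit message.
    """
    return [[float(len(text)), 1.0]]


def count_tokens(texts: List[str]) -> int:
    # Assumption: whitespace word count replaces the real tokenizer.
    return sum(len(t.split()) for t in texts)


def encode_buggy(texts: List[str]) -> Tuple[List[List[float]], int]:
    embeddings = []
    for text in texts:
        embeddings.append(fake_embed(text)[0])
        # Bug: returning inside the loop stops after the first text.
        return embeddings, count_tokens(texts)
    return embeddings, 0


def encode_fixed(texts: List[str]) -> Tuple[List[List[float]], int]:
    embeddings = []
    for text in texts:
        embeddings.append(fake_embed(text)[0])
    # Return only after every text has been embedded.
    return embeddings, count_tokens(texts)


docs = ["alpha", "beta gamma", "delta"]
buggy_vects, _ = encode_buggy(docs)
fixed_vects, tokens = encode_fixed(docs)
print(len(buggy_vects), len(fixed_vects), tokens)  # prints: 1 3 4
```

With three input chunks the buggy variant yields one vector, which is the mismatch that `assert len(vects) == len(docs)` catches in the task executor.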

344 Commits