### What problem does this PR solve?
Google Cloud model does not work correctly with gemini-2.5 models
Close #10408
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Issue: [#6193](https://github.com/infiniflow/ragflow/issues/6193)
Change: support QwQ reasoning models with non-stream output.
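For context, a minimal sketch of handling a QwQ-style reasoning response in non-stream mode; `reasoning_content` follows the OpenAI-compatible convention these models expose, and the `<think>` wrapping is an illustrative assumption rather than the exact implementation:
```python
# Hedged sketch: merge a reasoning model's non-streamed "thinking" with its
# final answer. `reasoning_content` follows the OpenAI-compatible convention;
# the <think> wrapper is an assumption for illustration.
def extract_answer(response) -> str:
    msg = response.choices[0].message
    reasoning = getattr(msg, "reasoning_content", None)
    answer = msg.content or ""
    if reasoning:
        return f"<think>{reasoning}</think>{answer}"
    return answer
```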
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
**Adds a new feature that enables the LLM to extract a structured table
of contents (TOC) directly from plain text.**
_This implementation prioritizes efficiency over reasoning: the model runs in a strictly deterministic mode (thinking disabled) to minimize latency. As a result, answer quality may be somewhat lower, but extraction speed and consistency are guaranteed._
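A rough sketch of the idea; the prompt, option names, and chat signature are assumptions rather than the actual implementation:
```python
# Rough sketch only: ask the model for a structured TOC at temperature 0
# with thinking disabled. Prompt wording, the enable_thinking flag, and the
# chat_mdl.chat signature are assumptions, not the actual RAGFlow code.
import json

def extract_toc(chat_mdl, text: str) -> list[dict]:
    prompt = (
        "Extract the table of contents from the text below and return it as "
        'a JSON list of objects with "level" and "title" keys.\n\nTEXT:\n' + text
    )
    answer = chat_mdl.chat(
        system="",
        history=[{"role": "user", "content": prompt}],
        gen_conf={"temperature": 0.0, "enable_thinking": False},
    )
    try:
        return json.loads(answer)
    except json.JSONDecodeError:
        return []  # fall back to an empty TOC if the model strays from JSON
```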
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Related issues
#10078
### What problem does this PR solve?
Integrate DeerAPI provider.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
Co-authored-by: DeerAPI <tensor.null@gmail.com>
### What problem does this PR solve?
Add support for international Dashscope service. #10340
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix invalid COMPONENT_EXEC_TIMEOUT. #10273
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Issue: [Bug]: ERROR: list index out of range (#10188)
Change: fix a potential "list index out of range" error in chat response parsing by adding explicit checks for empty `choices`.
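A minimal sketch of the kind of guard described (names are assumptions, not the exact RAGFlow code):
```python
# Illustrative guard: verify `choices` is non-empty before indexing into it,
# avoiding the "list index out of range" error on empty responses.
def first_choice_content(response) -> str:
    if not response.choices:
        return ""
    choice = response.choices[0]
    if not choice.message or choice.message.content is None:
        return ""
    return choice.message.content
```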
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Revert back to chat.completions.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [x] Other (please describe):
Revert back to chat.completions.
### What problem does this PR solve?
Currently, Azure OpenAI returns per-minute quota-limit responses when the chat API is used. This change is needed to be able to process most documents using models deployed in Azure Foundry.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix: resolve hash collisions by switching to UUID and correct logic in always-true statements (see the sketch after this list), solves #10165
Feat: update GPT API integration, solves #10204
Feat: support qianwen-deepresearch, solves #10163
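An illustrative sketch of the identifier change in the first item (function name is hypothetical):
```python
# Illustrative only: replace a collision-prone content hash with a random
# UUID as the record identifier.
import uuid

def new_record_id() -> str:
    # uuid4 provides 122 bits of randomness, so collisions are practically
    # impossible, unlike a short or truncated content hash.
    return uuid.uuid4().hex
```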
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Migrate OpenAI-compatible chats to LiteLLM.
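For context, a minimal example of routing an OpenAI-compatible endpoint through LiteLLM (the endpoint URL, key, and model name are placeholders):
```python
# Minimal sketch of calling an OpenAI-compatible endpoint via LiteLLM.
# The "openai/" model prefix selects LiteLLM's OpenAI-compatible route;
# api_base and api_key point at the custom deployment.
import litellm

resp = litellm.completion(
    model="openai/my-model",
    api_base="http://localhost:8000/v1",
    api_key="sk-placeholder",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)
```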
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Related issues
#10078
### What problem does this PR solve?
Integrate CometAPI provider.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
Add LongCat-Flash-Chat from Meituan, deepseek v3.1 from SiliconFlow,
kimi-k2-09-05-preview and kimi-k2-turbo-preview from Moonshot.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
The total token count was incorrectly accumulated when using the OpenAI-API-compatible API.
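A hedged sketch of the corrected accumulation, assuming an OpenAI-compatible streamed response where `usage` reports request totals (usually only on the final chunk) and should therefore be assigned rather than summed:
```python
# Sketch of the accumulation issue: `usage` on a streamed chunk already holds
# the totals for the whole request, so it must not be added up per chunk.
def stream_total_tokens(stream) -> int:
    total_tokens = 0
    for chunk in stream:
        usage = getattr(chunk, "usage", None)
        if usage:                               # usually only the last chunk
            total_tokens = usage.total_tokens   # assign, don't accumulate
    return total_tokens
```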
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add exponential back-off for Chat LiteLLM. #9858.
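A rough sketch of the back-off pattern (retry count, delays, and exception handling are simplified assumptions, not the exact implementation):
```python
# Exponential back-off around a LiteLLM call: wait 1s, 2s, 4s, ... plus
# jitter between attempts, and re-raise after the final attempt.
import random
import time

import litellm

def chat_with_backoff(messages, model, retries=5, base_delay=1.0):
    for attempt in range(retries):
        try:
            return litellm.completion(model=model, messages=messages)
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.random())
```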
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This revision performed a comprehensive check on LightRAG to ensure the correctness of its implementation. It **did not involve** Entity Resolution or Community Reports Generation. Below is an example using default entity types and the General chunking method, which shows good results in both time and effectiveness. Moreover, response caching is enabled for resuming failed tasks.
[The-Necklace.pdf](https://github.com/user-attachments/files/22042432/The-Necklace.pdf)
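As a rough illustration of the response caching mentioned above (the key scheme and in-memory storage are assumptions, not the actual implementation):
```python
# Sketch: cache each LLM answer under a hash of the prompt so a re-run of a
# failed task skips calls that already succeeded.
import hashlib

_cache: dict[str, str] = {}

def cached_chat(chat_fn, prompt: str) -> str:
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = chat_fn(prompt)
    return _cache[key]
```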
After:

```bash
Begin at:
Fri, 29 Aug 2025 16:48:03 GMT
Duration:
222.31 s
Progress:
16:48:04 Task has been received.
16:48:06 Page(1~7): Start to parse.
16:48:06 Page(1~7): OCR started
16:48:08 Page(1~7): OCR finished (1.89s)
16:48:11 Page(1~7): Layout analysis (3.72s)
16:48:11 Page(1~7): Table analysis (0.00s)
16:48:11 Page(1~7): Text merged (0.00s)
16:48:11 Page(1~7): Finish parsing.
16:48:12 Page(1~7): Generate 7 chunks
16:48:12 Page(1~7): Embedding chunks (0.29s)
16:48:12 Page(1~7): Indexing done (0.04s). Task done (7.84s)
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: She had no dresses, no je...
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: Her husband, already half...
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: And this life lasted ten ...
16:48:17 Start processing for f421fb06849e11f0bdd32724b93a52b2: Then she asked, hesitatin...
16:49:30 Completed processing for f421fb06849e11f0bdd32724b93a52b2: She had no dresses, no je... after 1 gleanings, 21985 tokens.
16:49:30 Entities extraction of chunk 3 1/7 done, 12 nodes, 13 edges, 21985 tokens.
16:49:40 Completed processing for f421fb06849e11f0bdd32724b93a52b2: Finally, she replied, hes... after 1 gleanings, 22584 tokens.
16:49:40 Entities extraction of chunk 5 2/7 done, 19 nodes, 19 edges, 22584 tokens.
16:50:02 Completed processing for f421fb06849e11f0bdd32724b93a52b2: Then she asked, hesitatin... after 1 gleanings, 24610 tokens.
16:50:02 Entities extraction of chunk 0 3/7 done, 16 nodes, 28 edges, 24610 tokens.
16:50:03 Completed processing for f421fb06849e11f0bdd32724b93a52b2: And this life lasted ten ... after 1 gleanings, 24031 tokens.
16:50:04 Entities extraction of chunk 1 4/7 done, 24 nodes, 22 edges, 24031 tokens.
16:50:14 Completed processing for f421fb06849e11f0bdd32724b93a52b2: So they begged the jewell... after 1 gleanings, 24635 tokens.
16:50:14 Entities extraction of chunk 6 5/7 done, 27 nodes, 26 edges, 24635 tokens.
16:50:29 Completed processing for f421fb06849e11f0bdd32724b93a52b2: Her husband, already half... after 1 gleanings, 25758 tokens.
16:50:29 Entities extraction of chunk 2 6/7 done, 25 nodes, 35 edges, 25758 tokens.
16:51:35 Completed processing for f421fb06849e11f0bdd32724b93a52b2: The Necklace By Guy de Ma... after 1 gleanings, 27491 tokens.
16:51:35 Entities extraction of chunk 4 7/7 done, 39 nodes, 37 edges, 27491 tokens.
16:51:35 Entities and relationships extraction done, 147 nodes, 177 edges, 171094 tokens, 198.58s.
16:51:35 Entities merging done, 0.01s.
16:51:35 Relationships merging done, 0.01s.
16:51:35 ignored 7 relations due to missing entities.
16:51:35 generated subgraph for doc f421fb06849e11f0bdd32724b93a52b2 in 198.68 seconds.
16:51:35 run_graphrag f421fb06849e11f0bdd32724b93a52b2 graphrag_task_lock acquired
16:51:35 set_graph removed 0 nodes and 0 edges from index in 0.00s.
16:51:35 Get embedding of nodes: 9/147
16:51:35 Get embedding of nodes: 109/147
16:51:37 Get embedding of edges: 9/170
16:51:37 Get embedding of edges: 109/170
16:51:40 set_graph converted graph change to 319 chunks in 4.21s.
16:51:40 Insert chunks: 4/319
16:51:40 Insert chunks: 104/319
16:51:40 Insert chunks: 204/319
16:51:40 Insert chunks: 304/319
16:51:40 set_graph added/updated 147 nodes and 170 edges from index in 0.53s.
16:51:40 merging subgraph for doc f421fb06849e11f0bdd32724b93a52b2 into the global graph done in 4.79 seconds.
16:51:40 Knowledge Graph done (204.29s)
```
Before:

```bash
Begin at:
Fri, 29 Aug 2025 17:00:47 GMT
processDuration:
173.38 s
Progress:
17:00:49 Task has been received.
17:00:51 Page(1~7): Start to parse.
17:00:51 Page(1~7): OCR started
17:00:53 Page(1~7): OCR finished (1.82s)
17:00:57 Page(1~7): Layout analysis (3.64s)
17:00:57 Page(1~7): Table analysis (0.00s)
17:00:57 Page(1~7): Text merged (0.00s)
17:00:57 Page(1~7): Finish parsing.
17:00:57 Page(1~7): Generate 7 chunks
17:00:57 Page(1~7): Embedding chunks (0.31s)
17:00:57 Page(1~7): Indexing done (0.03s). Task done (7.88s)
17:00:57 created task graphrag
17:01:00 Task has been received.
17:02:17 Entities extraction of chunk 1 1/7 done, 9 nodes, 9 edges, 10654 tokens.
17:02:31 Entities extraction of chunk 2 2/7 done, 12 nodes, 13 edges, 11066 tokens.
17:02:33 Entities extraction of chunk 4 3/7 done, 9 nodes, 10 edges, 10433 tokens.
17:02:42 Entities extraction of chunk 5 4/7 done, 11 nodes, 14 edges, 11290 tokens.
17:02:52 Entities extraction of chunk 6 5/7 done, 13 nodes, 15 edges, 11039 tokens.
17:02:55 Entities extraction of chunk 3 6/7 done, 14 nodes, 13 edges, 11466 tokens.
17:03:32 Entities extraction of chunk 0 7/7 done, 19 nodes, 18 edges, 13107 tokens.
17:03:32 Entities and relationships extraction done, 71 nodes, 89 edges, 79055 tokens, 149.66s.
17:03:32 Entities merging done, 0.01s.
17:03:32 Relationships merging done, 0.01s.
17:03:32 ignored 1 relations due to missing entities.
17:03:32 generated subgraph for doc b1d9d3b6848711f0aacd7ddc0714c4d3 in 149.69 seconds.
17:03:32 run_graphrag b1d9d3b6848711f0aacd7ddc0714c4d3 graphrag_task_lock acquired
17:03:32 set_graph removed 0 nodes and 0 edges from index in 0.00s.
17:03:32 Get embedding of nodes: 9/71
17:03:33 Get embedding of edges: 9/88
17:03:34 set_graph converted graph change to 161 chunks in 2.27s.
17:03:34 Insert chunks: 4/161
17:03:34 Insert chunks: 104/161
17:03:34 set_graph added/updated 71 nodes and 88 edges from index in 0.28s.
17:03:34 merging subgraph for doc b1d9d3b6848711f0aacd7ddc0714c4d3 into the global graph done in 2.60 seconds.
17:03:34 Knowledge Graph done (153.18s)
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Fix Ollama chat can only access localhost instance. #9806.
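The gist, sketched with the `ollama` Python package (server URL and model name are placeholders): pass the configured host instead of relying on the default localhost instance.
```python
# Illustrative sketch: point the Ollama client at the configured host
# rather than the default http://localhost:11434.
from ollama import Client

client = Client(host="http://my-ollama-server:11434")
resp = client.chat(model="llama3", messages=[{"role": "user", "content": "hi"}])
print(resp["message"]["content"])
```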
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
All models pass the mock-response tests, which means that if a model can return the correct response, everything should work as expected. However, not all models have been fully tested in a real environment with a real API_KEY. I suggest either actively monitoring the refactored models over the coming period to ensure they work correctly, fixing them step by step, or waiting to merge until most have been tested in a practical environment.
### Type of change
- [x] Refactoring
Updated constructors for base and derived classes in the chat, embedding, rerank, sequence2txt, and tts models to accept **kwargs (see the sketch after this checklist). This change improves extensibility and allows passing additional parameters without breaking existing interfaces.
- [x] Bug Fix (non-breaking change which fixes an issue)
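A simplified illustration of the constructor change referenced above (class and attribute names are abbreviated placeholders, not the exact RAGFlow classes):
```python
# Sketch: base and derived constructors both take **kwargs so extra
# provider-specific options can be threaded through without breaking callers.
class Base:
    def __init__(self, key, model_name, base_url=None, **kwargs):
        self.key = key
        self.model_name = model_name
        self.base_url = base_url

class SomeProviderChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)
        # provider-specific options arrive via kwargs, e.g. a timeout
        self.timeout = kwargs.get("timeout", 60)
```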
---------
Co-authored-by: IT: Sop.Son <sop.son@feavn.local>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/9177
The likely cause is that Gemini internally uses a different parameter name:
```text
max_output_tokens (int):
    Optional. The maximum number of tokens to include in a
    response candidate.
    Note: The default value varies by model, see the
    ``Model.output_token_limit`` attribute of the ``Model``
    returned from the ``getModel`` function.
    This field is a member of `oneof`_ ``_max_output_tokens``.
```
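A minimal sketch of the fix this implies, assuming a small translation helper (the helper name is hypothetical): map the OpenAI-style `max_tokens` key onto Gemini's `max_output_tokens` before building the generation config.
```python
# Hypothetical helper: rename the OpenAI-style "max_tokens" setting to the
# "max_output_tokens" name Gemini expects, leaving other keys untouched.
def to_gemini_gen_conf(gen_conf: dict) -> dict:
    conf = dict(gen_conf)
    if "max_tokens" in conf:
        conf["max_output_tokens"] = conf.pop("max_tokens")
    return conf
```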
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
#9082 #6365
<u>**WARNING: it is not compatible with older versions of the `Agent` module, which means that `Agent` from older versions can no longer work.**</u>
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix 429 API rate-limit errors when building the knowledge graph, for all chat models and the Mistral embedding model.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix an issue with `keep_alive=-1` for the Ollama chat model by allowing the user to set an additional configuration option. It is a non-breaking change because it still uses the previous default value, `keep_alive=-1`.
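A minimal sketch of how such an option can reach the Ollama client; the plumbing here is an assumption, only `keep_alive` on `ollama.Client.chat` is the documented parameter:
```python
# Sketch: pass the configured keep_alive value through to the Ollama chat
# call. keep_alive=-1 keeps the model loaded indefinitely; a value such as
# "5m" unloads it after five minutes of inactivity.
from ollama import Client

client = Client(host="http://localhost:11434")
resp = client.chat(
    model="llama3",
    messages=[{"role": "user", "content": "hello"}],
    keep_alive=-1,
)
```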
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [X] Performance Improvement
- [X] Other (please describe):
- An additional configuration option has been added to control the behavior of RAGFlow when working with an Ollama LLM.
### What problem does this PR solve?
Add model provider DeepInfra. This model list comes from our community.
NOTE: most endpoints haven't been tested, but they should behave the same way the OpenAI ones do.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Correct the logging message from "OpenAI cat_with_tools" to "OpenAI
chat_with_tools" in the `_exceptions` method of the `Base` class to
accurately reflect the method name and improve error traceability.
### Type of change
- [x] Typo
### What problem does this PR solve?
Add xAI provider (experimental feature, requires user feedback).
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR addresses an incompatibility issue with the Google Chat API by
correcting the message content format in the `GoogleChat` class.
Previously, the content was directly assigned to the "parts" field,
which did not align with the API's expected format. This change ensures
that messages are properly formatted with a "text" key within a
dictionary, as required by the API.
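For illustration only (example values, not the actual code), the shape of the change:
```python
# Before: content assigned directly to "parts", which the Google API rejects.
message = {"role": "user", "parts": "What is RAG?"}

# After: the content is wrapped in a dictionary with a "text" key, as the
# API expects.
message = {"role": "user", "parts": [{"text": "What is RAG?"}]}
```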
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8324
docker image version: v0.19.1
The `_clean_conf` function was not applied in the `_chat` and `chat_streamly` methods of the `GeminiChat` class, causing the error "Unknown field for GenerationConfig: max_tokens" when the default LLM config includes the "max_tokens" parameter.
**Buggy code (ragflow/rag/llm/chat_model.py)**
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)

        from google.generativeai import GenerativeModel, client

        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client

    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf

    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types

        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")

        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types

        if system:
            self.model._system_instruction = content_types.to_content(system)
        # ❌ _clean_conf was not applied; "max_tokens" survives this inline filter
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p", "max_tokens"]:
                del gen_conf[k]
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans
            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\n**ERROR**: " + str(e)
            yield 0
```
**Fix: apply the `_clean_conf` function**
```python
class GeminiChat(Base):
    def __init__(self, key, model_name, base_url=None, **kwargs):
        super().__init__(key, model_name, base_url=base_url, **kwargs)

        from google.generativeai import GenerativeModel, client

        client.configure(api_key=key)
        _client = client.get_default_generative_client()
        self.model_name = "models/" + model_name
        self.model = GenerativeModel(model_name=self.model_name)
        self.model._client = _client

    def _clean_conf(self, gen_conf):
        for k in list(gen_conf.keys()):
            if k not in ["temperature", "top_p"]:
                del gen_conf[k]
        return gen_conf

    def _chat(self, history, gen_conf):
        from google.generativeai.types import content_types

        # ✅ apply _clean_conf to remove the unsupported parameters
        gen_conf = self._clean_conf(gen_conf)
        system = history[0]["content"] if history and history[0]["role"] == "system" else ""
        hist = []
        for item in history:
            if item["role"] == "system":
                continue
            hist.append(deepcopy(item))
            item = hist[-1]
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "role" in item and item["role"] == "system":
                item["role"] = "user"
            if "content" in item:
                item["parts"] = item.pop("content")

        if system:
            self.model._system_instruction = content_types.to_content(system)
        response = self.model.generate_content(hist, generation_config=gen_conf)
        ans = response.text
        return ans, response.usage_metadata.total_token_count

    def chat_streamly(self, system, history, gen_conf):
        from google.generativeai.types import content_types

        # ✅ apply _clean_conf to remove the unsupported parameters
        gen_conf = self._clean_conf(gen_conf)
        if system:
            self.model._system_instruction = content_types.to_content(system)
        # ✅ removed the duplicate parameter-filtering loop ("for k in list(gen_conf.keys()): ...")
        for item in history:
            if "role" in item and item["role"] == "assistant":
                item["role"] = "model"
            if "content" in item:
                item["parts"] = item.pop("content")
        ans = ""
        try:
            response = self.model.generate_content(history, generation_config=gen_conf, stream=True)
            for resp in response:
                ans = resp.text
                yield ans
            yield response._chunks[-1].usage_metadata.total_token_count
        except Exception as e:
            yield ans + "\n**ERROR**: " + str(e)
            yield 0
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
This is a cherry-pick from #7781 as requested.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
- Simplify AzureChat constructor by passing base_url directly
- Clean up spacing and formatting in chat_model.py
- Remove redundant parentheses and improve code consistency
- #8423
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)