ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2025-12-07 12:30:52 +00:00

Author	SHA1	Message	Date
Stephen Hu	0ecccd27eb	Refactor:improve the logic for rerank models to cal the total token count (#10882 ) ### What problem does this PR solve? improve the logic for rerank models to cal the total token count ### Type of change - [x] Refactoring	2025-10-31 09:46:16 +08:00
Zhichang Yu	73144e278b	Don't release full image (#10654 ) ### What problem does this PR solve? Introduced gpu profile in .env Added Dockerfile_tei fix datrie Removed LIGHTEN flag ### Type of change - [x] Documentation Update - [x] Refactoring	2025-10-23 23:02:27 +08:00
Stephen Hu	94dbd4aac9	Refactor: use the same implement for total token count from res (#10197 ) ### What problem does this PR solve? use the same implement for total token count from res ### Type of change - [x] Refactoring	2025-09-22 17:17:06 +08:00
buua436	6c24ad7966	fix: correct rerank_model condition logic (#10174 ) ### What problem does this PR solve? fix the rerank_model condition logic by correcting the np.isclose check. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-09-19 16:02:10 +08:00
Stephen Hu	ca320a8c30	Refactor: for total_token_count method use if to check first. (#9707 ) ### What problem does this PR solve? for total_token_count method use if to check first, to improve the performance when we need to handle exception cases ### Type of change - [x] Refactoring	2025-08-26 10:47:20 +08:00
Stephen Hu	a0d630365c	Refactor:Improve VoyageRerank not texts handling (#9539 ) ### What problem does this PR solve? Improve VoyageRerank not texts handling ### Type of change - [x] Refactoring	2025-08-19 10:31:04 +08:00
Stephen Hu	fb77f9917b	Refactor: Use Input Length In DefaultRerank (#9516 ) ### What problem does this PR solve? 1. Use input length to prepare res 2. Adjust torch_empty_cache code location ### Type of change - [x] Refactoring - [x] Performance Improvement	2025-08-18 10:00:27 +08:00
Stephen Hu	da5cef0686	Refactor:Improve the float compare for LocalAIRerank (#9428 ) ### What problem does this PR solve? Improve the float compare for LocalAIRerank ### Type of change - [x] Refactoring	2025-08-13 10:26:42 +08:00
so95	35539092d0	Add kwargs to model base class constructors (#9252 ) Updated constructors for base and derived classes in chat, embedding, rerank, sequence2txt, and tts models to accept kwargs. This change improves extensibility and allows passing additional parameters without breaking existing interfaces. - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: IT: Sop.Son <sop.son@feavn.local> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-08-07 09:45:37 +08:00
JI4JUN	aeaeb169e4	Feat/support 302ai provider (#8742 ) ### What problem does this PR solve? Support 302.AI provider. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-31 14:48:30 +08:00
Kevin Hu	d9fe279dde	Feat: Redesign and refactor agent module (#9113 ) ### What problem does this PR solve? #9082 #6365 <u> WARNING: it's not compatible with the older version of `Agent` module, which means that `Agent` from older versions can not work anymore.</u> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-30 19:41:09 +08:00
Stephen Hu	95b9208b13	Fix:Improve float operation when rerank (#8963 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8915 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-22 10:04:00 +08:00
Stephen Hu	46caf6ae72	Refactor improve codes for ranker (#8936 ) ### What problem does this PR solve? Use the normalize method directly ### Type of change - [x] Refactoring	2025-07-21 10:22:20 +08:00
Stephen Hu	38b34116dd	Refa: Remove useless conver and fix a bug for DefaultRerank (#8887 ) ### What problem does this PR solve? 1. bug when re-try, we need to reset i. 2. remove useless convert ### Type of change - [x] Refactoring	2025-07-17 12:09:50 +08:00
Yongteng Lei	f8a6987f1e	Refa: automatic LLMs registration (#8651 ) ### What problem does this PR solve? Support automatic LLMs registration. ### Type of change - [x] Refactoring	2025-07-03 19:05:31 +08:00
Kevin Hu	d46c24045f	Feat: add GiteeAI as a llm provider. (#8572 ) ### What problem does this PR solve? #1853 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 11:22:11 +08:00
Kevin Hu	aafeffa292	Feat: add gitee as LLM provider. (#8545 ) ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-30 09:22:31 +08:00
Kevin Hu	65d5268439	Feat: implement novitaAI embedding and reranking. (#8250 ) ### What problem does this PR solve? Close #8227 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-06-13 15:42:17 +08:00
Kevin Hu	d36c8d18b1	Refa: make exception more clear. (#8224 ) ### What problem does this PR solve? #8156 ### Type of change - [x] Refactoring	2025-06-12 17:53:59 +08:00
Kevin Hu	156290f8d0	Fix: url path join issue. (#8013 ) ### What problem does this PR solve? Close #7980 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-06-03 14:18:40 +08:00
Kevin Hu	60c3a253ad	Fix: api-key issue for xinference. (#6490 ) ### What problem does this PR solve? #2792 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-25 15:01:13 +08:00
zhou	a6aed0da46	Fix: rerank with YoudaoRerank issue. (#6396 ) ### What problem does this PR solve? Fix rerank with YoudaoRerank issue，"'YoudaoRerank' object has no attribute '_dynamic_batch_size'" ![17425412353825](https://github.com/user-attachments/assets/9ed304c7-317a-440e-acff-fe895fc20f07) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-24 10:09:16 +08:00
Kevin Hu	d83911b632	Fix: huggingface rerank model issue. (#6385 ) ### What problem does this PR solve? #6348 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 12:43:32 +08:00
Kevin Hu	5b04b7d972	Fix: rerank with vllm issue. (#6306 ) ### What problem does this PR solve? #6301 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-20 11:52:42 +08:00
Edouard Hur	b29539b442	Fix: CoHereRerank not respecting base_url when provided (#5784 ) ### What problem does this PR solve? vLLM provider with a reranking model does not work : as vLLM uses under the hood the [CoHereRerank provider](https://github.com/infiniflow/ragflow/blob/v0.17.0/rag/llm/__init__.py#L250) with a `base_url`, if this URL [is not passed to the Cohere client](https://github.com/infiniflow/ragflow/blob/v0.17.0/rag/llm/rerank_model.py#L379-L382) any attempt will endup on the Cohere SaaS (sending your private api key in the process) instead of your vLLM instance. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2025-03-10 11:22:06 +08:00
Kevin Hu	df9b7b2fe9	Fix: rerank issue. (#5696 ) ### What problem does this PR solve? #5673 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-06 15:05:19 +08:00
Kevin Hu	b8da2eeb69	Feat: support huggingface re-rank model. (#5684 ) ### What problem does this PR solve? #5658 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-06 10:44:04 +08:00
Kevin Hu	4e2afcd3b8	Fix FlagRerank max_length issue. (#5366 ) ### What problem does this PR solve? #5352 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-02-26 11:01:13 +08:00
liwenju0	569e40544d	Refactor rerank model with dynamic batch processing and memory manage… (#5273 ) …ment ### What problem does this PR solve? Issue：https://github.com/infiniflow/ragflow/issues/5262 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: wenju.li <wenju.li@deepctr.cn>	2025-02-24 11:32:08 +08:00
Kevin Hu	4776fa5e4e	Refactor for total_tokens. (#4652 ) ### What problem does this PR solve? #4567 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-01-26 13:54:26 +08:00
Kevin Hu	3805621564	Fix xinference rerank issue. (#4499 ) ### What problem does this PR solve? #4495 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-01-16 11:35:51 +08:00
Alex Chen	7944aacafa	Feat: add gpustack model provider (#4469 ) ### What problem does this PR solve? Add GPUStack as a new model provider. [GPUStack](https://github.com/gpustack/gpustack) is an open-source GPU cluster manager for running LLMs. Currently, locally deployed models in GPUStack cannot integrate well with RAGFlow. GPUStack provides both OpenAI compatible APIs (Models / Chat Completions / Embeddings / Speech2Text / TTS) and other APIs like Rerank. We would like to use GPUStack as a model provider in ragflow. [GPUStack Docs](https://docs.gpustack.ai/latest/quickstart/) Related issue: https://github.com/infiniflow/ragflow/issues/4064. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### Testing Instructions 1. Install GPUStack and deploy the `llama-3.2-1b-instruct` llm, `bge-m3` text embedding model, `bge-reranker-v2-m3` rerank model, `faster-whisper-medium` Speech-to-Text model, `cosyvoice-300m-sft` in GPUStack. 2. Add provider in ragflow settings. 3. Testing in ragflow.	2025-01-15 14:15:58 +08:00
Kevin Hu	cb45431412	Fix Voyage re-rank model. Limit file name length. (#4171 ) ### What problem does this PR solve? #4152 #4154 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-12-23 10:03:50 +08:00
Kevin Hu	593ffc4067	Fix HuggingFace model error. (#3870 ) ### What problem does this PR solve? #3865 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-12-05 13:28:42 +08:00
Kevin Hu	78601ee1bd	Fix open AI compatible rerank issue. (#3866 ) ### What problem does this PR solve? #3700 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-12-05 10:26:21 +08:00
Kevin Hu	3f3469130b	Fix preview issue in file manager. (#3846 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-12-04 11:53:23 +08:00
devMls	59a5813f1b	add jina new models in jina connector (#3770 ) ### What problem does this PR solve? add new models in jinna connector, to allow use models that support multilingual models ### Type of change - [X] Other (please describe): new connectors no breaking change	2024-12-02 10:06:39 +08:00
Kevin Hu	91f1814a87	Fix error response (#3719 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Jin Hai <haijin.chn@gmail.com>	2024-11-28 18:56:10 +08:00
liwenju0	875096384b	when qwen rerank model not return ok, raise exception to notice user (#3593 ) ### What problem does this PR solve? When calling the Qwen rerank model, if the model does not return correctly, an exception should be raised to notify the user, rather than simply returning a value of 0, as this would be confusing to the user. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2024-11-22 22:34:34 +08:00
shizzgar	4b3eeaa6ef	Added LocalAI support for rerank models (#3446 ) ### What problem does this PR solve? Hi there! LocalAI added support of rerank models https://localai.io/features/reranker/ I've implemented LocalAIRerank class (typically copied it from OpenAI_APIRerank class). Also, LocalAI model response with 500 error code if len of "documents" is less than 2 in similarity check. So I've added the second "document" on RERANK model connection check in `api/apps/llm_app.py`. ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-11-18 12:05:52 +08:00
Jin Hai	1e90a1bf36	Move settings initialization after module init phase (#3438 ) ### What problem does this PR solve? 1. Module init won't connect database any more. 2. Config in settings need to be used with settings.CONFIG_NAME ### Type of change - [x] Refactoring Signed-off-by: jinhai <haijin.chn@gmail.com>	2024-11-15 17:30:56 +08:00
Zhichang Yu	30f6421760	Use consistent log file names, introduced initLogger (#3403 ) ### What problem does this PR solve? Use consistent log file names, introduced initLogger ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [x] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2024-11-14 17:13:48 +08:00
roc king	fa54cd5f5c	exstract model dir from model‘s full name (#3368 ) ### What problem does this PR solve? When model’s group name contains 0-9，we can't find downloaded model，because we do not correctly exstract model dir's name from model‘s full name ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: 王志鹏 <zhipeng3.wang@midea.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-11-13 14:10:16 +08:00
Zhichang Yu	a2a5631da4	Rework logging (#3358 ) Unified all log files into one. ### What problem does this PR solve? Unified all log files into one. ### Type of change - [x] Refactoring	2024-11-12 17:35:13 +08:00
Kevin Hu	4097912d59	add inputs to display to every components (#3242 ) ### What problem does this PR solve? #3240 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2024-11-06 18:47:53 +08:00
Kevin Hu	89d5b2414e	fix SILICONFLOW rerank error (#2980 ) ### What problem does this PR solve? #2977 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-10-23 10:12:39 +08:00
chongchuanbing	ac26d09a59	Feature/feat1017 (#2872 ) ### What problem does this PR solve? 1. fix: mid map show error in knowledge graph, juse because ```@antv/g6```version changed 2. feat: concurrent threads configuration support in graph extractor 3. fix: used tokens update failed for tenant 4. feat: timeout configuration support for llm 5. fix: regex error in graph extractor 6. feat: qwen rerank(```gte-rerank```) support 7. fix: timeout deal in knowledge graph index process. Now chat by stream output, also, it is configuratable. 8. feat: ```qwen-long``` model configuration ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: chongchuanbing <chongchuanbing@gmail.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-10-21 12:11:08 +08:00
Ziyu Huang	e5f7733b31	Resolves #2905 openai compatible model provider add llama.cpp rerank support (#2906 ) ### What problem does this PR solve? Resolve #2905 due to the in-consistent of token size, I make it safe to limit 500 in code, since there is no config param to control my llama.cpp run set -ub to 1024: ${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl 99 -m $gguf_file --reranking "$@" ### Type of change - [x] New Feature (non-breaking change which adds functionality) Here is my test Ragflow use llama.cpp ``` lot update_slots: id 0 \| task 458 \| prompt done, n_past = 416, n_tokens = 416 slot release: id 0 \| task 458 \| stop processing: n_past = 416, truncated = 0 slot launch_slot_: id 0 \| task 459 \| processing task slot update_slots: id 0 \| task 459 \| tokenizing prompt, len = 2 slot update_slots: id 0 \| task 459 \| prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111 slot update_slots: id 0 \| task 459 \| kv cache rm [0, end) slot update_slots: id 0 \| task 459 \| prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000 slot update_slots: id 0 \| task 459 \| prompt done, n_past = 111, n_tokens = 111 slot release: id 0 \| task 459 \| stop processing: n_past = 111, truncated = 0 srv update_slots: all slots are idle request: POST /rerank 172.23.0.4 200 ```	2024-10-21 10:06:29 +08:00
0000sir	4991107822	Fix keys of Xinference deployed models, especially has the same model name with public hosted models. (#2832 ) ### What problem does this PR solve? Fix keys of Xinference deployed models, especially has the same model name with public hosted models. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: 0000sir <0000sir@gmail.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-10-16 10:21:08 +08:00
Kevin Hu	5e7c1fb23a	reduce rerank batch size (#2801 ) ### What problem does this PR solve? ### Type of change - [x] Performance Improvement	2024-10-11 11:29:19 +08:00

1 2

80 Commits