80 Commits

Author SHA1 Message Date
Stephen Hu
0ecccd27eb
Refactor: improve the logic for rerank models to calculate the total token count (#10882)
### What problem does this PR solve?

Improve the logic rerank models use to calculate the total token count.

### Type of change

- [x] Refactoring
2025-10-31 09:46:16 +08:00
Zhichang Yu
73144e278b
Don't release full image (#10654)
### What problem does this PR solve?

Introduced gpu profile in .env
Added Dockerfile_tei
Fixed datrie
Removed LIGHTEN flag

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-10-23 23:02:27 +08:00
Stephen Hu
94dbd4aac9
Refactor: use the same implementation for total token count from res (#10197)
### What problem does this PR solve?
Use the same implementation to read the total token count from res.

### Type of change

- [x] Refactoring
2025-09-22 17:17:06 +08:00
buua436
6c24ad7966
fix: correct rerank_model condition logic (#10174)
### What problem does this PR solve?

Fix the rerank_model condition logic by correcting the np.isclose check.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
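
To illustrate the kind of check this fix concerns, here is a minimal sketch of min-max score normalization guarded by a tolerance-based comparison. It is not the code from this PR: the helper name `normalize_scores`, the tolerance, and the all-zeros fallback are assumptions, and the standard library's `math.isclose` stands in for `np.isclose` so the example has no dependencies.

```python
import math


def normalize_scores(scores, tol=1e-3):
    """Min-max normalize rerank scores into [0, 1] (hypothetical helper)."""
    scores = [float(s) for s in scores]
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    # Comparing floats with == is fragile. A tolerance check treats a
    # near-zero range as "all scores equal" and avoids dividing by ~0.
    if math.isclose(lo, hi, abs_tol=tol):
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]
```

Getting the condition the wrong way round (normalizing only when the range is close to zero) would divide by a tiny number, which is why the direction of the check matters.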
2025-09-19 16:02:10 +08:00
Stephen Hu
ca320a8c30
Refactor: check with `if` first in the total_token_count method. (#9707)
### What problem does this PR solve?

In the total_token_count method, check with `if` first instead of relying on
exception handling, which improves performance in the exceptional cases.

### Type of change

- [x] Refactoring
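
A minimal sketch of the "check with `if` first" idea, assuming a response that may be a dict or an object and may lack usage data. The function body, the `usage` and `total_tokens` field names, and the default of 0 are assumptions for illustration, not the project's implementation.

```python
def total_token_count(resp, default=0):
    """Read the total token count from a response (hypothetical helper)."""
    # Cheap `if` checks come first, so a missing field never raises and
    # the common failure cases skip exception handling entirely.
    if resp is None:
        return default
    if isinstance(resp, dict):
        usage = resp.get("usage")
    else:
        usage = getattr(resp, "usage", None)
    if usage is None:
        return default
    if isinstance(usage, dict):
        return usage.get("total_tokens", default)
    return getattr(usage, "total_tokens", default)
```

For example, `total_token_count({"usage": {"total_tokens": 12}})` returns 12, while `total_token_count({})` falls back to the default.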
2025-08-26 10:47:20 +08:00
Stephen Hu
a0d630365c
Refactor: improve VoyageRerank handling of empty texts (#9539)
### What problem does this PR solve?

Improve how VoyageRerank handles empty texts.

### Type of change

- [x] Refactoring
2025-08-19 10:31:04 +08:00
Stephen Hu
fb77f9917b
Refactor: Use Input Length In DefaultRerank (#9516)
### What problem does this PR solve?

1. Use input length to prepare res
2. Adjust torch_empty_cache code location

### Type of change

- [x] Refactoring
- [x] Performance Improvement
2025-08-18 10:00:27 +08:00
Stephen Hu
da5cef0686
Refactor: improve the float comparison for LocalAIRerank (#9428)
### What problem does this PR solve?
Improve the float comparison for LocalAIRerank.

### Type of change

- [x] Refactoring
2025-08-13 10:26:42 +08:00
so95
35539092d0
Add **kwargs to model base class constructors (#9252)
Updated constructors for base and derived classes in chat, embedding,
rerank, sequence2txt, and tts models to accept **kwargs. This change
improves extensibility and allows passing additional parameters without
breaking existing interfaces.

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: IT: Sop.Son <sop.son@feavn.local>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-07 09:45:37 +08:00
JI4JUN
aeaeb169e4
Feat/support 302ai provider (#8742)
### What problem does this PR solve?

Support 302.AI provider.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-07-31 14:48:30 +08:00
Kevin Hu
d9fe279dde
Feat: Redesign and refactor agent module (#9113)
### What problem does this PR solve?

#9082 #6365

<u> **WARNING: this is not compatible with older versions of the `Agent`
module, which means that an `Agent` from an older version will no longer
work.**</u>

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-07-30 19:41:09 +08:00
Stephen Hu
95b9208b13
Fix: improve float operations when reranking (#8963)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8915

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-07-22 10:04:00 +08:00
Stephen Hu
46caf6ae72
Refactor: improve code for ranker (#8936)
### What problem does this PR solve?
Use the normalize method directly

### Type of change

- [x] Refactoring
2025-07-21 10:22:20 +08:00
Stephen Hu
38b34116dd
Refa: Remove useless convert and fix a bug for DefaultRerank (#8887)
### What problem does this PR solve?

1. Fix a bug on retry: `i` needs to be reset.
2. Remove a useless convert.

### Type of change

- [x] Refactoring
2025-07-17 12:09:50 +08:00
Yongteng Lei
f8a6987f1e
Refa: automatic LLMs registration (#8651)
### What problem does this PR solve?

Support automatic LLMs registration.

### Type of change

- [x] Refactoring
2025-07-03 19:05:31 +08:00
Kevin Hu
d46c24045f
Feat: add GiteeAI as an LLM provider. (#8572)
### What problem does this PR solve?

#1853

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-30 11:22:11 +08:00
Kevin Hu
aafeffa292
Feat: add gitee as LLM provider. (#8545)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-30 09:22:31 +08:00
Kevin Hu
65d5268439
Feat: implement novitaAI embedding and reranking. (#8250)
### What problem does this PR solve?

Close #8227

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-13 15:42:17 +08:00
Kevin Hu
d36c8d18b1
Refa: make exception more clear. (#8224)
### What problem does this PR solve?

#8156

### Type of change
- [x] Refactoring
2025-06-12 17:53:59 +08:00
Kevin Hu
156290f8d0
Fix: url path join issue. (#8013)
### What problem does this PR solve?

Close #7980

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-03 14:18:40 +08:00
Kevin Hu
60c3a253ad
Fix: api-key issue for xinference. (#6490)
### What problem does this PR solve?

#2792

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-25 15:01:13 +08:00
zhou
a6aed0da46
Fix: rerank with YoudaoRerank issue. (#6396)
### What problem does this PR solve?

Fix the YoudaoRerank reranking error: "'YoudaoRerank' object has no
attribute '_dynamic_batch_size'".


![17425412353825](https://github.com/user-attachments/assets/9ed304c7-317a-440e-acff-fe895fc20f07)


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-24 10:09:16 +08:00
Kevin Hu
d83911b632
Fix: huggingface rerank model issue. (#6385)
### What problem does this PR solve?

#6348

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 12:43:32 +08:00
Kevin Hu
5b04b7d972
Fix: rerank with vllm issue. (#6306)
### What problem does this PR solve?

#6301

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 11:52:42 +08:00
Edouard Hur
b29539b442
Fix: CoHereRerank not respecting base_url when provided (#5784)
### What problem does this PR solve?

The vLLM provider does not work with a reranking model. Under the hood,
vLLM uses the [CoHereRerank
provider](https://github.com/infiniflow/ragflow/blob/v0.17.0/rag/llm/__init__.py#L250)
with a `base_url`. If this URL [is not passed to the Cohere
client](https://github.com/infiniflow/ragflow/blob/v0.17.0/rag/llm/rerank_model.py#L379-L382),
every request ends up at the Cohere SaaS (sending your private API key
in the process) instead of your vLLM instance.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
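
The behavior described above can be sketched as follows. This is an illustration only, not the project's code or the Cohere SDK: `resolve_base_url` and the placeholder `PUBLIC_ENDPOINT` are hypothetical names. The point is that an explicitly provided `base_url` must win, and the public endpoint is used only as a fallback.

```python
# Placeholder for the hosted service's endpoint (assumed, for illustration).
PUBLIC_ENDPOINT = "https://public-saas.example"


def resolve_base_url(base_url=None):
    """Pick the endpoint a rerank client should talk to (hypothetical helper)."""
    # Honor a caller-supplied URL so requests (and the API key sent with
    # them) go to the self-hosted instance rather than the public service.
    if base_url and base_url.strip():
        return base_url.strip().rstrip("/")
    return PUBLIC_ENDPOINT
```

With this rule, `resolve_base_url("http://vllm:8000/v1/")` yields the self-hosted URL, and only a missing or blank value falls back to the public endpoint.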
2025-03-10 11:22:06 +08:00
Kevin Hu
df9b7b2fe9
Fix: rerank issue. (#5696)
### What problem does this PR solve?

#5673

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-06 15:05:19 +08:00
Kevin Hu
b8da2eeb69
Feat: support huggingface re-rank model. (#5684)
### What problem does this PR solve?

#5658

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-06 10:44:04 +08:00
Kevin Hu
4e2afcd3b8
Fix FlagRerank max_length issue. (#5366)
### What problem does this PR solve?

#5352

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-26 11:01:13 +08:00
liwenju0
569e40544d
Refactor rerank model with dynamic batch processing and memory management (#5273)

### What problem does this PR solve?
Issue: https://github.com/infiniflow/ragflow/issues/5262
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-02-24 11:32:08 +08:00
Kevin Hu
4776fa5e4e
Refactor for total_tokens. (#4652)
### What problem does this PR solve?

#4567
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-01-26 13:54:26 +08:00
Kevin Hu
3805621564
Fix xinference rerank issue. (#4499)
### What problem does this PR solve?
#4495
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-01-16 11:35:51 +08:00
Alex Chen
7944aacafa
Feat: add gpustack model provider (#4469)
### What problem does this PR solve?

Add GPUStack as a new model provider.
[GPUStack](https://github.com/gpustack/gpustack) is an open-source GPU
cluster manager for running LLMs. Currently, locally deployed models in
GPUStack cannot integrate well with RAGFlow. GPUStack provides both
OpenAI compatible APIs (Models / Chat Completions / Embeddings /
Speech2Text / TTS) and other APIs like Rerank. We would like to use
GPUStack as a model provider in ragflow.

[GPUStack Docs](https://docs.gpustack.ai/latest/quickstart/)

Related issue: https://github.com/infiniflow/ragflow/issues/4064.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)



### Testing Instructions
1. Install GPUStack and deploy the `llama-3.2-1b-instruct` llm, `bge-m3`
text embedding model, `bge-reranker-v2-m3` rerank model,
`faster-whisper-medium` Speech-to-Text model, `cosyvoice-300m-sft` in
GPUStack.
2. Add provider in ragflow settings.
3. Testing in ragflow.
2025-01-15 14:15:58 +08:00
Kevin Hu
cb45431412
Fix Voyage re-rank model. Limit file name length. (#4171)
### What problem does this PR solve?

#4152 
#4154

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-23 10:03:50 +08:00
Kevin Hu
593ffc4067
Fix HuggingFace model error. (#3870)
### What problem does this PR solve?

#3865

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-05 13:28:42 +08:00
Kevin Hu
78601ee1bd
Fix open AI compatible rerank issue. (#3866)
### What problem does this PR solve?
#3700
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-05 10:26:21 +08:00
Kevin Hu
3f3469130b
Fix preview issue in file manager. (#3846)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-04 11:53:23 +08:00
devMls
59a5813f1b
Add new Jina models to the Jina connector (#3770)
### What problem does this PR solve?

Add new models to the Jina connector to allow the use of models that
support multiple languages.

### Type of change

- [X] Other (please describe): new connectors no breaking change
2024-12-02 10:06:39 +08:00
Kevin Hu
91f1814a87
Fix error response (#3719)
### What problem does this PR solve?



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Jin Hai <haijin.chn@gmail.com>
2024-11-28 18:56:10 +08:00
liwenju0
875096384b
Raise an exception to notify the user when the Qwen rerank model does not return OK (#3593)
### What problem does this PR solve?

When calling the Qwen rerank model, if the model does not return
correctly, an exception should be raised to notify the user, rather than
simply returning a value of 0, as this would be confusing to the user.
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-11-22 22:34:34 +08:00
shizzgar
4b3eeaa6ef
Added LocalAI support for rerank models (#3446)
### What problem does this PR solve?

Hi there!
LocalAI has added support for rerank models:
https://localai.io/features/reranker/

I've implemented a LocalAIRerank class (largely copied from the
OpenAI_APIRerank class).
Also, the LocalAI model responds with a 500 error code if the length of
"documents" is less than 2 in the similarity check.
So I've added a second "document" to the RERANK model connection check in
`api/apps/llm_app.py`.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-11-18 12:05:52 +08:00
Jin Hai
1e90a1bf36
Move settings initialization after module init phase (#3438)
### What problem does this PR solve?

1. Module init no longer connects to the database.
2. Config values in settings must be accessed as settings.CONFIG_NAME.

### Type of change

- [x] Refactoring

Signed-off-by: jinhai <haijin.chn@gmail.com>
2024-11-15 17:30:56 +08:00
Zhichang Yu
30f6421760
Use consistent log file names, introduced initLogger (#3403)
### What problem does this PR solve?

Use consistent log file names, introduced initLogger

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2024-11-14 17:13:48 +08:00
roc king
fa54cd5f5c
Extract model dir from model's full name (#3368)
### What problem does this PR solve?

When a model's group name contains the digits 0-9, the downloaded model
cannot be found, because the model dir's name is not extracted correctly
from the model's full name.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: 王志鹏 <zhipeng3.wang@midea.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-11-13 14:10:16 +08:00
Zhichang Yu
a2a5631da4
Rework logging (#3358)
Unified all log files into one.

### What problem does this PR solve?

Unified all log files into one.

### Type of change

- [x] Refactoring
2024-11-12 17:35:13 +08:00
Kevin Hu
4097912d59
Add inputs to display to every component (#3242)
### What problem does this PR solve?

#3240

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-11-06 18:47:53 +08:00
Kevin Hu
89d5b2414e
fix SILICONFLOW rerank error (#2980)
### What problem does this PR solve?

#2977

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-10-23 10:12:39 +08:00
chongchuanbing
ac26d09a59
Feature/feat1017 (#2872)
### What problem does this PR solve?

1. fix: mind map display error in knowledge graph, caused by an ```@antv/g6```
version change
2. feat: concurrent threads configuration support in graph extractor
3. fix: used tokens update failed for tenant
4. feat: timeout configuration support for llm
5. fix: regex error in graph extractor
6. feat: qwen rerank(```gte-rerank```) support
7. fix: timeout handling in the knowledge graph indexing process. Chat now
uses stream output, and this is configurable.
8. feat: ```qwen-long``` model configuration

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: chongchuanbing <chongchuanbing@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-10-21 12:11:08 +08:00
Ziyu Huang
e5f7733b31
Resolves #2905: add llama.cpp rerank support to the OpenAI-compatible model provider (#2906)
### What problem does this PR solve?
Resolve #2905

Because token sizes are inconsistent, I play it safe and hard-code a limit
of 500, since there is no config parameter to control it.

My llama.cpp run sets -ub to 1024:

${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl
99 -m $gguf_file --reranking "$@"

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Here is my test of RAGFlow using llama.cpp:

```
slot update_slots: id  0 | task 458 | prompt done, n_past = 416, n_tokens = 416
slot      release: id  0 | task 458 | stop processing: n_past = 416, truncated = 0
slot launch_slot_: id  0 | task 459 | processing task
slot update_slots: id  0 | task 459 | tokenizing prompt, len = 2
slot update_slots: id  0 | task 459 | prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111
slot update_slots: id  0 | task 459 | kv cache rm [0, end)
slot update_slots: id  0 | task 459 | prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000
slot update_slots: id  0 | task 459 | prompt done, n_past = 111, n_tokens = 111
slot      release: id  0 | task 459 | stop processing: n_past = 111, truncated = 0
srv  update_slots: all slots are idle
request: POST /rerank 172.23.0.4 200

```
2024-10-21 10:06:29 +08:00
0000sir
4991107822
Fix keys of Xinference-deployed models, especially those that share a model name with publicly hosted models. (#2832)
### What problem does this PR solve?

Fix keys of Xinference-deployed models, especially those that share a model
name with publicly hosted models.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: 0000sir <0000sir@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-10-16 10:21:08 +08:00
Kevin Hu
5e7c1fb23a
reduce rerank batch size (#2801)
### What problem does this PR solve?

### Type of change


- [x] Performance Improvement
2024-10-11 11:29:19 +08:00