3217 Commits

Author SHA1 Message Date
Yongteng Lei
1a5f991d86
Fix: auto-keyword and auto-question fail with qwq model (#8190)
### What problem does this PR solve?

Fix auto-keyword and auto-question fail with qwq model. #8189 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 11:37:07 +08:00
balibabu
713b574c9d
Feat: Add SwitchForm component #3221 (#8200)
### What problem does this PR solve?

Feat: Add SwitchForm component #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-12 09:50:25 +08:00
Liu An
60c1bf5a19
Fix: duplicate knowledgebase name validation logic (#8199)
### What problem does this PR solve?

Change the condition from checking for >1 to >=1 when validating
duplicate knowledgebase names to properly catch all duplicates. This
ensures no two knowledgebases can have the same name for a tenant.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 09:46:57 +08:00
writinwaters
d331866a12
Docs: Miscellaneous (#8198)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-06-12 09:42:07 +08:00
Kevin Hu
69e1fc496d
Refa: chat models (#8187)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-06-11 17:20:12 +08:00
Liu An
e87ad8126c
Fix: Improve dataset name validation in KB app (#8188)
### What problem does this PR solve?

- Trim whitespace before checking for empty dataset names
- Change length check from >= to > DATASET_NAME_LIMIT for consistency

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-11 16:14:29 +08:00
Yongteng Lei
5e30426916
Feat: add Qwen3-Embedding text-embedding-v4 (#8184)
### What problem does this PR solve?

Add Qwen3-Embedding text-embedding-v4.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-11 15:32:05 +08:00
Liu An
6aff3e052a
Test: Refactor test fixtures to use HttpApiAuth naming consistently (#8180)
### What problem does this PR solve?

- Rename `api_key` fixture to `HttpApiAuth` across all test files
- Update all dependent fixtures and test cases to use new naming
- Maintain same functionality while improving naming clarity

The rename better reflects the fixture's purpose as an HTTP API
authentication helper rather than just an API key.

### Type of change

- [x] Refactoring
2025-06-11 14:25:40 +08:00
Liu An
f29d9fa3f9
Test: fix test cases and improve document parsing validation (#8179)
### What problem does this PR solve?

- Update chat assistant tests to use dataset.id directly in payloads
- Enhance document parsing tests with better condition checking
- Add explicit type hints and improve timeout handling

Action_7556

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-11 14:25:30 +08:00
balibabu
31003cd5f6
Feat: Display the agent node running timeline #3221 (#8185)
### What problem does this PR solve?

Feat: Display the agent node running timeline #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-11 14:24:43 +08:00
balibabu
f0a3d91171
Feat: Display agent operator call log #3221 (#8169)
### What problem does this PR solve?

Feat: Display agent operator call log #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-11 09:22:07 +08:00
cwr31
e6d36f3a3a
Improve image rotation logic for text recognition (#8167)
### What problem does this PR solve?

Enhanced the image rotation handling by evaluating the original
orientation, clockwise 90°, and counter-clockwise 90° rotations. The
image with the highest text recognition score is now selected, improving
accuracy for text detection in images with aspect ratios >= 1.5.

#8166

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: wenrui.cao <wenrui.cao@univers.com>
2025-06-11 09:20:30 +08:00
writinwaters
c8269206d7
Docs: UI updates (#8170)
### What problem does this PR solve?


### Type of change


- [x] Documentation Update
2025-06-11 09:17:30 +08:00
Gifford Nowland
ab67292aa3
fix: silence deprecation in huggingface snapshot_download function (#8150)
### What problem does this PR solve?

fixes the following deprecation emitted from `download_deps.py`: 

```
UserWarning: `local_dir_use_symlinks` parameter is deprecated and will be ignored. The process to download files to a local folder has been updated and do not rely on symlinks anymore. You only need to pass a destination folder as`local_dir`
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-10 21:00:03 +08:00
writinwaters
4f92af3cd4
Docs: Updated Auto-question Auto-keyword (#8168)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-06-10 19:38:28 +08:00
Liu An
a43adafc6b
Refa: Add error handling for JSON decode in embedding models (#8162)
### What problem does this PR solve?

Improve robustness of Jina, Nvidia, and SILICONFLOW embedding models by:
1. Adding try-catch blocks for JSON decode errors
2. Logging error details including response content
3. Raising exceptions with meaningful error messages

### Type of change

- [x] Refactoring
2025-06-10 19:04:17 +08:00
balibabu
c5e4684b44
Feat: Let system variables appear in operator prompts #3221 (#8154)
### What problem does this PR solve?
Feat: Let system variables appear in operator prompts #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-10 17:06:30 +08:00
Liu An
3a34def55f
Test: Migrate test workflow to use top-level test directory (#8145)
### What problem does this PR solve?

- Replace manual venv activation with `uv run` for pytest commands
- Add dynamic test level (p2/p3) based on GitHub event type
- Simplify test commands by removing redundant directory changes

### Type of change

- [x] Update Action
2025-06-10 13:55:26 +08:00
Stephen Hu
e6f68e1ccf
Fix: When List Kbs some times the total is wrong (#8151)
### What problem does this PR solve?
for kb.app list method when owner_ids the total calculate is wrong (now
will base on the paged result to calculate total)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-10 11:34:30 +08:00
Jacky Wu
60ab7027c0
fix: allow to do role auth for S3 bucket use. (#8149)
### What problem does this PR solve?

Close #8148 .

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-10 10:50:07 +08:00
balibabu
08f2223a6a
Feat: Constructing query parameter options for the Retrieval operator #3221 (#8152)
### What problem does this PR solve?

Feat: Constructing query parameter options for the Retrieval operator
#3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-10 10:49:41 +08:00
yurhett
9c6c6c51e0
Fix: use jwks_uri from OIDC metadata for JWKS client (#8136)
### What problem does this PR solve?
Issue: #8051

The current implementation assumes JWKS endpoints follow the standard
`/.well-known/jwks.json` convention. This breaks authentication for OIDC
providers that use non-standard JWKS paths, resulting in 404 errors
during token validation.

Root Cause Analysis
- The OpenID Connect specification doesn't mandate a fixed path for JWKS
endpoints
- Some identity providers (like certain Keycloak configurations) use
custom endpoints
- Our previous approach constructed JWKS URLs by convention rather than
discovery

### Solution Approach
Instead of constructing JWKS URLs by appending to the issuer URI, we
now:
1. Properly leverage the `jwks_uri` from the OIDC discovery metadata
2. Honor the identity provider's actual configured endpoint

```python
# Before (fragile approach)
jwks_url = f"{self.issuer}/.well-known/jwks.json"

# After (standards-compliant)
jwks_cli = jwt.PyJWKClient(self.jwks_uri)  # Use discovered endpoint
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-10 10:16:58 +08:00
HaiyangP
baf32ee461
Display only the duplicate column names and corresponding original source. (#8138)
### What problem does this PR solve?
This PR aims to slove #8120 which request a better error display of
duplicate column names.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-10 10:16:38 +08:00
balibabu
8fb6b5d945
Feat: Add agent operator node from agent form #3221 (#8144)
### What problem does this PR solve?

Feat: Add agent operator node from agent form #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-09 19:19:48 +08:00
Liu An
5cc2eda362
Test: Refactor test fixtures and add SDK session management tests (#8141)
### What problem does this PR solve?

- Consolidate HTTP API test fixtures using batch operations
(batch_add_chunks, batch_create_chat_assistants)
- Fix fixture initialization order in clear_session_with_chat_assistants
- Add new SDK API test suite for session management
(create/delete/list/update)

### Type of change

- [x] Add test cases
- [x] Refactoring
2025-06-09 18:13:26 +08:00
balibabu
9a69d5f367
Feat: Display chat content on the agent page #3221 (#8140)
### What problem does this PR solve?

Feat: Display chat content on the agent page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-09 18:13:06 +08:00
balibabu
d9b98cbb18
Feat: Convert the prompt field of the agent operator to an array #3221 (#8137)
### What problem does this PR solve?

Feat: Convert the prompt field of the agent operator to an array #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-09 16:02:33 +08:00
Kevin Hu
24625e0695
Fix: presentation of PDF using vlm. (#8133)
### What problem does this PR solve?

#8109

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-09 15:01:52 +08:00
Liu An
4649accd54
Test: Add SDK API tests for chat assistant management and improve con… (#8131)
### What problem does this PR solve?

- Implement new SDK API test cases for chat assistant CRUD operations
- Enhance HTTP API concurrent tests to use as_completed for better
reliability

### Type of change

- [x] Add test cases
- [x] Refactoring
2025-06-09 13:30:12 +08:00
Liu An
968ffc7ef3
Refa: dataset operations to simplify error handling (#8132)
### What problem does this PR solve?

- Consolidate database operations within single try-except blocks in the
methods

### Type of change

- [x] Refactoring
2025-06-09 13:29:56 +08:00
Stephen Hu
2337bbf6ca
Perf: pass useless check for tidy graph (#8121)
### What problem does this PR solve?
Support passing the attribute check when the upstream has already made
sure it.

### Type of change
- [X] Performance Improvement
2025-06-09 11:44:13 +08:00
Liu An
ad1f89fea0
Fix: chat module update LLM defaults (#8125)
### What problem does this PR solve?

Previously when LLM.model_name was not configured:
- System incorrectly defaulted to 'deepseek-chat' model
- This caused permission errors for unauthorized tenants

Now:
- Use tenant's default chat_model configuration first

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-09 11:44:02 +08:00
Liu An
2ff911b08c
Fix: Set default rerank_model to empty string in Chat class (#8130)
### What problem does this PR solve?

Previously when LLM.rerank_model was not configured:
- SDK would pass None as the value
- Database field with null=False constraint would reject it
- Caused storage failures for unset rerank_model cases

Now:
- SDK checks for None value before database operations
- Provides empty string as default when rerank_model is unset

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-09 11:43:42 +08:00
Zhichang Yu
1ed0b25910
Fix task_limiter in raptor.py (#8124)
### What problem does this PR solve?

Fix task_limiter in raptor.py

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-09 10:18:03 +08:00
Liu An
5825a24d26
Test: Refactor test concurrency handling and add SDK chunk management tests (#8112)
### What problem does this PR solve?

- Improve concurrent test cases by using as_completed for better
reliability
- Rename variables for clarity (chunk_num -> count)
- Add new SDK API test suite for chunk management operations
- Update HTTP API tests with consistent concurrency patterns

### Type of change

- [x] Add test cases
- [x] Refactoring
2025-06-06 19:43:14 +08:00
writinwaters
157cd8b1b0
Docs: Added auto-keyword auto-question guide (#8113)
### What problem does this PR solve?

### Type of change


- [x] Documentation Update
2025-06-06 19:27:41 +08:00
balibabu
06463135ef
Feat: Reference the output variable of the upstream operator #3221 (#8111)
### What problem does this PR solve?
Feat: Reference the output variable of the upstream operator #3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-06 19:27:29 +08:00
Kevin Hu
7ed9efcd4e
Fix: QWenCV issue. (#8106)
### What problem does this PR solve?

Close #8097

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-06 17:55:13 +08:00
balibabu
0bc1f45634
Feat: Enables the message operator form to reference the data defined by the begin operator #3221 (#8108)
### What problem does this PR solve?

Feat: Enables the message operator form to reference the data defined by
the begin operator #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-06 17:54:59 +08:00
balibabu
1885a4a4b8
Feat: Receive reply messages of different event types from the agent #3221 (#8100)
### What problem does this PR solve?
Feat: Receive reply messages of different event types from the agent
#3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-06 16:30:18 +08:00
Wanderson Pinto dos Santos
0e03542db5
fix: single task executor getting all tasks from Redis queue (#7330)
### What problem does this PR solve?

Currently, as long as there are tasks in Redis, this loop will keep
getting the tasks. This will lead to a single task executor with many
tasks in the pending state. Then we need to wait for the pending tasks
to get them back in the queue.

In first place, if we set the `MAX_CONCURRENT_TASKS` to X, then only X
tasks should be picked from the queue, and others should be left in the
queue for other `task_executors` or be picked after 1 of the spots in
the current executor gets free. This PR ensures this behavior.

The additional changes were due to the Ruff linting in pre-commit. But I
believe these are expected to keep the coding style.

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
2025-06-06 14:32:35 +08:00
Stephen Hu
2e44c3b743
Fix:Unimplemented function in ppt_parser (#8095)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/8088

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-06 10:05:58 +08:00
writinwaters
d1ff588d46
Docs: Updated server launching code (#8093)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change


- [x] Documentation Update
2025-06-06 09:48:18 +08:00
Liu An
cc1b2c8f09
Test: add sdk Document test cases (#8094)
### What problem does this PR solve?

Add sdk document test cases

### Type of change

- [x] Add test cases
2025-06-06 09:47:06 +08:00
Liu An
100ea574a7
Fix(python-sdk): Add name filtering support to Dataset.list_documents() (#8090)
### What problem does this PR solve?

Added name filtering capability for Dataset.list_documents()

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 19:04:35 +08:00
Liu An
92625e1ca9
Fix: document typo in test (#8091)
### What problem does this PR solve?

fix document typo in test

### Type of change

- [x] Typo
2025-06-05 19:03:46 +08:00
Liu An
f007c1c772
Fix: Resolve JSON download errors in Document.download() (#8084)
### What problem does this PR solve?

An exception is thrown only when the json file has only two keys, `code`
and `message`. In other cases, response.content is returned normally.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 18:03:51 +08:00
balibabu
841291dda0
Fix: Fixed an issue where using the new quote markers would cause dialogue output to have delete symbols #7623 (#8083)
### What problem does this PR solve?

Fix: Fixed an issue where using the new quote markers would cause
dialogue output to have delete symbols #7623
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 17:43:28 +08:00
balibabu
6488f22540
Feat: Convert the inputs parameter of the begin operator #3221 (#8081)
### What problem does this PR solve?

Feat: Convert the inputs parameter of the begin operator #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-06-05 16:29:48 +08:00
Stephen Hu
6953ae89c4
Fix:when stream=false,new message without sessionid does no (#8078)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/8070

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 15:14:15 +08:00