31 Commits

Author SHA1 Message Date
Yongteng Lei
4cdaa77545
Docs: refine MinerU part in FAQ (#11111)
### What problem does this PR solve?

Refine MinerU part in FAQ.

### Type of change

- [x] Documentation Update
2025-11-07 19:58:07 +08:00
Yongteng Lei
2677617f93
Feat: supports MinerU http-client/server method (#10961)
### What problem does this PR solve?

Add support for MinerU http-client/server method.

To use MinerU with vLLM server:

1. Set up a vLLM server running MinerU:
   ```bash
   mineru-vllm-server --port 30000
   ```

2. Configure the following environment variables:
- `MINERU_EXECUTABLE=/ragflow/uv_tools/.venv/bin/mineru` (or the path to
your MinerU executable)
   - `MINERU_BACKEND="vlm-http-client"`
   - `MINERU_SERVER_URL="http://your-vllm-server-ip:30000"`

3. Follow the standard MinerU setup steps as described above.

With this configuration, RAGFlow will connect to your vLLM server to
perform document parsing, which can significantly improve parsing
performance for complex documents while reducing the resource
requirements on your RAGFlow server.



![1](https://github.com/user-attachments/assets/46624a0c-0f3b-423e-ace8-81801e97a27d)

![2](https://github.com/user-attachments/assets/66ccc004-a598-47d4-93cb-fe176834f83b)


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update

---------

Co-authored-by: writinwaters <cai.keith@gmail.com>
2025-11-04 16:03:30 +08:00
writinwaters
850e119a81
Doc: Updated How to update MinerU's settings (#10818)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-10-27 19:38:42 +08:00
Yongteng Lei
5acc407240
Feat: MinerU supports VLM-Transfomers backend (#10809)
### What problem does this PR solve?

MinerU supports VLM-Transfomers backend.

Set `MINERU_BACKEND="pipeline"` to choose the backend. (Options:
pipeline | vlm-transformers, default is pipeline)

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-10-27 17:04:13 +08:00
Zhichang Yu
73144e278b
Don't release full image (#10654)
### What problem does this PR solve?

Introduced gpu profile in .env
Added Dockerfile_tei
fix datrie
Removed LIGHTEN flag

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-10-23 23:02:27 +08:00
writinwaters
de24e74b4c
Docs: How to use MinerU to parse pdf documents (#10763)
### What problem does this PR solve?



### Type of change

- [x] Documentation Update
2025-10-23 18:56:09 +08:00
Liu An
83e80e3d7f
Docs: Update version references to v0.21.1 in READMEs and docs (#10761)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.21.0 to v0.21.1
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-10-23 18:55:41 +08:00
Liu An
3ae126836a
Docs: Update version references to v0.21.0 in READMEs and docs (#10565)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.20.5 to v0.21.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-10-15 11:46:24 +08:00
writinwaters
c4b8e4845c
Docs: The full edition has only two built-in embedding models (#10540)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Documentation Update
2025-10-14 14:13:37 +08:00
writinwaters
4058715df7
Docs: Knowledge base renamed to dataset. (#10269)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-09-25 09:45:27 +08:00
writinwaters
3f1741c8c6
Docs: How to accelerate question answering (#10179)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-09-19 18:18:46 +08:00
Liu An
067b4fc012
Docs: Update version references to v0.20.5 in READMEs and docs (#10015)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.20.4 to v0.20.5
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-09-10 11:20:43 +08:00
Liu An
986b9cbb1a
Docs: Update version references to v0.20.4 in READMEs and docs (#9758)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.20.3 to v0.20.4
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-08-27 16:56:55 +08:00
Liu An
abb6359547
Docs: Update version references to v0.20.3 in READMEs and docs (#9581)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.20.2 to v0.20.3
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-08-20 10:45:44 +08:00
Liu An
0aa3c4cdae
Docs: Update version references to v0.20.2 in READMEs and docs (#9559)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.20.1 to v0.20.2
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-08-19 17:26:49 +08:00
Liu An
b9eeb8e64f
Docs: Update version references to v0.20.1 in READMEs and docs (#9335)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.20.0 to v0.20.1
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-08-08 18:17:25 +08:00
Liu An
95534f5cf2
Docs: Update version references to v0.20.0 in READMEs and docs (#9164)
### What problem does this PR solve?

- Update version tags in README files (including translations) from
v0.19.1 to v0.20.0
- Modify Docker image references and documentation to reflect new
version
- Update version badges and image descriptions
- Maintain consistency across all language variants of README files

### Type of change

- [x] Documentation Update
2025-08-01 20:41:44 +08:00
Liu An
7e87eb2e23
Docs: Update version references to v0.19.1 in READMEs and docs (#8366)
### What problem does this PR solve?

- Update Docker image version badges and references from v0.19.0 to
v0.19.1
- Modify version mentions in all localized README files (id, ja, ko,
pt_br, tzh, zh)
- Update version in docker/README.md and related documentation files
- Includes updates to Helm values and Python SDK dependencies

### Type of change

- [x] Documentation Update
2025-06-19 14:39:27 +08:00
cutiechi
8f9bcb1c74
Feat: make document parsing and embedding batch sizes configurable via environment variables (#8266)
### Description

This PR introduces two new environment variables, ‎`DOC_BULK_SIZE` and
‎`EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for
document parsing and embedding vectorization in RAGFlow. By making these
parameters configurable, users can optimize performance and resource
usage according to their hardware capabilities and workload
requirements.

### What problem does this PR solve?

Previously, the batch sizes for document parsing and embedding were
hardcoded, limiting the ability to adjust throughput and memory
consumption. This PR enables users to set these values via environment
variables (in ‎`.env`, Helm chart, or directly in the deployment
environment), improving flexibility and scalability for both small and
large deployments.

- ‎`DOC_BULK_SIZE`: Controls how many document chunks are processed in a
single batch during document parsing (default: 4).
- ‎`EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed
in a single batch during embedding vectorization (default: 16).

This change updates the codebase, documentation, and configuration files
to reflect the new options.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):

### Additional context
- Updated ‎`.env`, ‎`helm/values.yaml`, and documentation to describe
the new variables.
- Modified relevant code paths to use the environment variables instead
of hardcoded values.
- Users can now tune these parameters to achieve better throughput or
reduce memory usage as needed.

Before:
Default value:
<img width="643" alt="image"
src="https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a"
/>
After:
10x:
<img width="777" alt="image"
src="https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1"
/>
2025-06-16 13:40:47 +08:00
writinwaters
d331866a12
Docs: Miscellaneous (#8198)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-06-12 09:42:07 +08:00
writinwaters
157cd8b1b0
Docs: Added auto-keyword auto-question guide (#8113)
### What problem does this PR solve?

### Type of change


- [x] Documentation Update
2025-06-06 19:27:41 +08:00
liu an
590b9dabab Docs: update for v0.19.0 (#7823)
### What problem does this PR solve?

update for v0.19.0

### Type of change

- [x] Documentation Update
2025-05-23 18:25:47 +08:00
writinwaters
86c6fee320
Docs: Added an FAQ (#7694)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-05-19 09:58:10 +08:00
writinwaters
f56b651acb
Built-in reranker models have been removed from official deliveries. (#7439)
### What problem does this PR solve?

### Type of change


- [x] Documentation Update
2025-04-30 15:28:03 +08:00
writinwaters
dadd8d9f94
DOC: Miscellaneous UI and editorial updates (#7324)
### What problem does this PR solve?



### Type of change


- [x] Documentation Update
2025-04-27 11:44:08 +08:00
liu an
03672df691
Docs: update for v0.18.0 (#7223)
### What problem does this PR solve?

update for v0.18.0

### Type of change

- [x] Documentation Update
2025-04-23 12:02:50 +08:00
writinwaters
6051abb4a3
Miscellaneous UI updates (#6947)
### What problem does this PR solve?


### Type of change


- [x] Documentation Update
2025-04-10 20:09:46 +08:00
writinwaters
d0897312ac
Added a guide on setting chat variables (#6904)
### What problem does this PR solve?



### Type of change

- [x] Documentation Update
2025-04-09 19:32:25 +08:00
writinwaters
5a8c479ff3
Miscellaneous editorial updates (#6805)
### What problem does this PR solve?



### Type of change

- [x] Documentation Update
2025-04-07 09:33:55 +08:00
writinwaters
2793c8e4fe
Added a guide on setting page rank. (#6645)
### What problem does this PR solve?


### Type of change


- [x] Documentation Update

---------

Co-authored-by: balibabu <cike8899@users.noreply.github.com>
2025-03-31 11:44:18 +08:00
writinwaters
b7d7ad536a
AI search vs. chat (#6569)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-03-26 18:46:34 +08:00