8 Commits

Author SHA1 Message Date
cutiechi
8f9bcb1c74
Feat: make document parsing and embedding batch sizes configurable via environment variables (#8266)
### Description

This PR introduces two new environment variables, `DOC_BULK_SIZE` and
`EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for
document parsing and embedding vectorization in RAGFlow. By making these
parameters configurable, users can optimize performance and resource
usage according to their hardware capabilities and workload
requirements.

### What problem does this PR solve?

Previously, the batch sizes for document parsing and embedding were
hardcoded, limiting the ability to adjust throughput and memory
consumption. This PR enables users to set these values via environment
variables (in `.env`, Helm chart, or directly in the deployment
environment), improving flexibility and scalability for both small and
large deployments.

- `DOC_BULK_SIZE`: Controls how many document chunks are processed in a
single batch during document parsing (default: 4).
- `EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed
in a single batch during embedding vectorization (default: 16).
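
As a rough illustration of the two variables above, the following minimal sketch reads each one from the environment with the stated defaults (4 and 16) and uses the result to split work into batches. The helper names `read_batch_size` and `batched` are hypothetical and are not taken from the RAGFlow codebase; the fallback behavior for missing or invalid values is also an assumption, not the PR's confirmed logic.

```python
import os


def read_batch_size(name: str, default: int) -> int:
    """Hypothetical helper: read a positive integer from the environment.

    Falls back to `default` when the variable is unset, empty,
    non-numeric, or not positive (an assumed policy, not RAGFlow's).
    """
    raw = os.environ.get(name, "").strip()
    if not raw:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


def batched(items, size):
    """Hypothetical helper: yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Defaults mirror the values stated in the PR description.
DOC_BULK_SIZE = read_batch_size("DOC_BULK_SIZE", 4)
EMBEDDING_BATCH_SIZE = read_batch_size("EMBEDDING_BATCH_SIZE", 16)

if __name__ == "__main__":
    chunks = [f"chunk-{i}" for i in range(10)]
    for batch in batched(chunks, EMBEDDING_BATCH_SIZE):
        # A real pipeline would pass each batch to the embedding model here.
        print(len(batch), "chunks in this batch")
```

Larger values generally mean fewer, bigger batches (higher throughput, more memory per batch); smaller values trade throughput for a lower memory footprint.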

This change updates the codebase, documentation, and configuration files
to reflect the new options.

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):

### Additional context
- Updated `.env`, `helm/values.yaml`, and documentation to describe
the new variables.
- Modified relevant code paths to use the environment variables instead
of hardcoded values.
- Users can now tune these parameters to achieve better throughput or
reduce memory usage as needed.

Before:
Default value:
<img width="643" alt="image"
src="https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a"
/>
After:
10x:
<img width="777" alt="image"
src="https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1"
/>
2025-06-16 13:40:47 +08:00
writinwaters
dadd8d9f94
DOC: Miscellaneous UI and editorial updates (#7324)
### What problem does this PR solve?



### Type of change


- [x] Documentation Update
2025-04-27 11:44:08 +08:00
writinwaters
e9669e7fb1
Updated v0.18.0 release notes (#7221)
### What problem does this PR solve?


### Type of change


- [x] Documentation Update
2025-04-23 11:12:14 +08:00
writinwaters
d0897312ac
Added a guide on setting chat variables (#6904)
### What problem does this PR solve?



### Type of change

- [x] Documentation Update
2025-04-09 19:32:25 +08:00
writinwaters
5a8c479ff3
Miscellaneous editorial updates (#6805)
### What problem does this PR solve?



### Type of change

- [x] Documentation Update
2025-04-07 09:33:55 +08:00
writinwaters
d17970ebd0
0321 chunkmethods (#6520)
### What problem does this PR solve?

#6061 

### Type of change


- [x] Documentation Update
2025-03-26 09:03:18 +08:00
writinwaters
5983803c8b
Miscellaneous UI updates (#6094)
### What problem does this PR solve?

#6049 

### Type of change

- [x] Documentation Update
- [x] Other (please describe): UI updates
2025-03-17 14:17:34 +08:00
writinwaters
e61da33672
Moved agent components into the agent folder (#5496)
### What problem does this PR solve?



### Type of change


- [x] Documentation Update
2025-02-28 19:27:57 +08:00