### What problem does this PR solve?
- Update Docker image version badges and references from v0.19.0 to
v0.19.1
- Modify version mentions in all localized README files (id, ja, ko,
pt_br, tzh, zh)
- Update version in docker/README.md and related documentation files
- Includes updates to Helm values and Python SDK dependencies
### Type of change
- [x] Documentation Update
### Description
This PR introduces two new environment variables, `DOC_BULK_SIZE` and
`EMBEDDING_BATCH_SIZE`, to allow flexible tuning of batch sizes for
document parsing and embedding vectorization in RAGFlow. By making these
parameters configurable, users can optimize performance and resource
usage according to their hardware capabilities and workload
requirements.
### What problem does this PR solve?
Previously, the batch sizes for document parsing and embedding were
hardcoded, limiting the ability to adjust throughput and memory
consumption. This PR enables users to set these values via environment
variables (in `.env`, Helm chart, or directly in the deployment
environment), improving flexibility and scalability for both small and
large deployments.
- `DOC_BULK_SIZE`: Controls how many document chunks are processed in a
single batch during document parsing (default: 4).
- `EMBEDDING_BATCH_SIZE`: Controls how many text chunks are processed
in a single batch during embedding vectorization (default: 16).
This change updates the codebase, documentation, and configuration files
to reflect the new options.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [x] Performance Improvement
- [ ] Other (please describe):
### Additional context
- Updated `.env`, `helm/values.yaml`, and documentation to describe
the new variables.
- Modified relevant code paths to use the environment variables instead
of hardcoded values.
- Users can now tune these parameters to achieve better throughput or
reduce memory usage as needed.
Before:
Default value:
<img width="643" alt="image"
src="https://github.com/user-attachments/assets/086e1173-18f3-419d-a0f5-68394f63866a"
/>
After:
10x:
<img width="777" alt="image"
src="https://github.com/user-attachments/assets/5722bbc0-0bcb-4536-b928-077031e550f1"
/>
Modified the chart to retain persistent volumes by default when the
chart is uninstalled, following established best practices in the Helm
community (e.g., Bitnami charts)
### What problem does this PR solve?
Previously, deleting the helm chart would automatically remove all
persistent data, which poses a risk of accidental data loss.
### Rationale
This change aligns with industry standards to safeguard data by
requiring explicit action to remove persistence, rather than making
deletion the default behavior.
### Impact:
Users who intentionally want to remove persistent data will need to do
so manually or by setting appropriate flags during chart uninstallation.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fixes bug & regression introduced by [PR #7187 - refactor: Update Redis
configuration to use StatefulSet instead of deployment with
pvc](https://github.com/infiniflow/ragflow/pull/7187):
1. Fixes bug #7403 - `redis.persistence.enabled` missing from
`helm/values.yaml` causes helm error:
[ERROR] templates/: template: ragflow/templates/redis.yaml:55:24:
executing "ragflow/templates/redis.yaml" at
<.Values.redis.persistence.enabled>: nil pointer evaluating interface
{}.enabled
2. Fixes regression: reverts hardcoded redis.storage.capacity value back
to using variable `redis.storage.capacity` from `helm/values.yaml`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
This PR changes Redis to be a statefulset. In some situation when we
Redis pod gets rescheduled to another Node, it gets stuck in pending
state due to the PVC attached to another Kubernetes node.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [X] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix Helm Ingress template; Trying to access a global variable within a
loop
Fix#6191
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
fix redis pvc in helm deployment
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
related issue #5882
### What problem does this PR solve?
update helm infinity image version from v0.5.0
image to infiniflow/infinity:v0.6.0-dev3
to solve issue #5882
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Adds a new Kubernetes Service resource to the Helm chart which
specifically targets the RAGFlow API. This feature useful for cases
where you want to expose the RAGFlow HTTP API separately from the web
interface, for example if RAGFlow is running behind an authenticating
proxy it allows a route to bypass the proxy (e.g. by defining a separate
ingress resource which forwards to the separate API-only k8s service
added here) to provide RAGFlow API access. This is still secure since
API access is already authenticated by API keys inside RAGFlow itself.
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update version info to 0.15.1
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Documentation Update
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
The initial Helm chart implementation added in #3815 suffers from an
issue where the 5GB data volume for the SQL DB is filled up with
[binlog](https://dev.mysql.com/doc/refman/8.4/en/binary-log.html) files
after just a few days. Since the app uses a non-replicated SQL DB config
I think it makes sense to disable the binlog in the SQL DB container.
This is achieved by simply adding the required argument to the container
startup command.
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add's a Helm chart for deploying RAGFlow on Kubernetes.
Closes#864.
### Type of change
- [X] New Feature (non-breaking change which adds functionality)