969 Commits

Author SHA1 Message Date
yangdx
6014b9bf73 feat: add track_id support for document processing progress monitoring
- Add get_docs_by_track_id() method to all storage backends (MongoDB, PostgreSQL, Redis, JSON)
- Implement automatic track_id generation with upload_/insert_ prefixes
- Add /track_status/{track_id} API endpoint for frontend progress queries
- Create database indexes for efficient track_id lookups
- Enable real-time document processing status tracking across all storage types
2025-07-29 22:24:21 +08:00
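The automatic track_id generation with `upload_`/`insert_` prefixes described above could look roughly like this (a minimal sketch; the function name, timestamp format, and suffix length are assumptions, not the actual LightRAG implementation):

```python
import time
import uuid

def generate_track_id(prefix: str) -> str:
    """Build a track ID such as 'upload_20250729222421_ab12cd34'.

    A timestamp plus a short random suffix keeps IDs unique and
    roughly sortable by creation time, which suits progress lookups.
    """
    if prefix not in ("upload", "insert"):
        raise ValueError(f"unexpected track_id prefix: {prefix}")
    timestamp = time.strftime("%Y%m%d%H%M%S")
    suffix = uuid.uuid4().hex[:8]
    return f"{prefix}_{timestamp}_{suffix}"
```

The `/track_status/{track_id}` endpoint would then call the storage backend's `get_docs_by_track_id()` with such an ID, which is why a database index on the track_id column pays off.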
administrator
c26dfa33de Fix unterminated f-string in config.py 2025-07-29 11:21:23 +07:00
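An unterminated f-string is a syntax error, so the broken `config.py` would have failed at import time. A minimal reproduction of that class of error (the variable names are illustrative, not the actual code that was fixed):

```python
# Compiling a module containing an unterminated f-string raises
# SyntaxError before any code runs -- the whole module fails to import.
broken_source = 'MESSAGE = f"workspace: {workspace'  # closing brace/quote missing

try:
    compile(broken_source, "config.py", "exec")
    raise AssertionError("expected a SyntaxError")
except SyntaxError:
    pass  # this is exactly what the fix resolved
```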
yangdx
9923821d75 refactor: Remove deprecated max_token_size from embedding configuration
This parameter is no longer used. Its removal simplifies the API and clarifies that token length management is handled by upstream text chunking logic rather than the embedding wrapper.
2025-07-29 10:49:35 +08:00
yangdx
f4c2dc327d Fix linting 2025-07-29 09:57:41 +08:00
yangdx
75d1b1e9f8 Update Ollama context length configuration
- Rename OLLAMA_NUM_CTX to OLLAMA_LLM_NUM_CTX
- Increase default context window size
- Add requirement for minimum context size
- Update documentation examples
2025-07-29 09:53:37 +08:00
yangdx
645f81f7c8 Fix a critical bug where Ollama options were not being applied correctly
`dict.update()` modifies the dictionary in-place and returns `None`.
2025-07-29 09:52:25 +08:00
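The pitfall is easy to reproduce. A hypothetical before/after of the pattern involved (key names are illustrative, not the actual code):

```python
defaults = {"num_ctx": 8192, "temperature": 0.8}

# Buggy pattern: dict.update() mutates its receiver in place and
# returns None, so the merged options are silently lost.
options = dict(defaults).update({"temperature": 0.2})
assert options is None

# Correct pattern: merge into a new dict and keep the dict reference.
options = {**defaults, "temperature": 0.2}
assert options == {"num_ctx": 8192, "temperature": 0.2}
```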
Michele Comitini
bd94714b15 options need to be passed to the ollama client embed() method
Fix line length

Create binding_options.py

Remove test property

Add dynamic binding options to CLI and environment config

Automatically generate command-line arguments and environment variable
support for all LLM provider bindings using BindingOptions. Add sample
.env generation and extensible framework for new providers.

Add example option definitions and fix test arg check in OllamaOptions

Add options_dict method to BindingOptions for argument parsing

Add comprehensive Ollama binding configuration options

Apply ruff formatting to binding_options.py

Add Ollama separate options for embedding and LLM

Refactor Ollama binding options and fix class var handling

The changes improve how class variables are handled in binding options
and better organize the Ollama-specific options into LLM and embedding
subclasses.

Fix typo in arg test.

Rename cls parameter to klass to avoid keyword shadowing

Fix Ollama embedding binding name typo

Fix ollama embedder context param name

Split Ollama options into LLM and embedding configs with mixin base

Add Ollama option configuration to LLM and embeddings in lightrag_server

Update sample .env generation and environment handling

Conditionally add env vars and cmdline options only when ollama bindings
are used. Add example env file for Ollama binding options.
2025-07-28 12:05:40 +02:00
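The commit chain above describes generating CLI arguments and environment-variable support from declarative option classes. A minimal sketch of that idea, assuming a dataclass-based options class (the class, field names, and `klass` helper signature are illustrative, not the real `BindingOptions` API, though `klass` echoes the rename mentioned above):

```python
import argparse
from dataclasses import dataclass, fields

@dataclass
class OllamaLLMOptions:
    """Illustrative LLM binding options; every field becomes a CLI flag."""
    num_ctx: int = 32768
    temperature: float = 0.8

def add_binding_args(parser: argparse.ArgumentParser, klass, prefix: str) -> None:
    # One --<prefix>-<field> argument per dataclass field, typed and
    # defaulted from the dataclass definition itself.
    for f in fields(klass):
        flag = f"--{prefix}-{f.name}".replace("_", "-")
        parser.add_argument(flag, type=f.type, default=f.default)

parser = argparse.ArgumentParser()
add_binding_args(parser, OllamaLLMOptions, "ollama-llm")
```

Splitting the Ollama options into LLM and embedding subclasses then just means registering two prefixes against two dataclasses.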
yangdx
ee53e43568 Update webui assets 2025-07-28 02:52:32 +08:00
yangdx
769f77ef8f Update webui assets 2025-07-28 02:26:07 +08:00
yangdx
98ac6fb3f0 Bump api version to 0192 2025-07-28 01:42:51 +08:00
yangdx
f2ffff063b feat: refactor ollama server configuration management
- Add ollama_server_infos attribute to LightRAG class with default initialization
- Move default values to constants.py for centralized configuration
- Refactor OllamaServerInfos class with property accessors and CLI support
- Update OllamaAPI to get configuration through rag object instead of direct import
- Add command line arguments for simulated model name and tag
- Fix type imports to avoid circular dependencies
2025-07-28 01:38:35 +08:00
yangdx
598eecd06d Refactor: Rename llm_model_max_token_size to summary_max_tokens
This commit renames the parameter 'llm_model_max_token_size' to 'summary_max_tokens' for better clarity, as it specifically controls the token limit for entity relation summaries.
2025-07-28 00:49:08 +08:00
yangdx
d0d57a45b6 feat: add environment variables to /health endpoint and centralize defaults
- Add 9 environment variables to /health endpoint configuration section
- Centralize default constants in lightrag/constants.py for consistency
- Update config.py to use centralized defaults for better maintainability
2025-07-28 00:30:56 +08:00
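The centralization pattern described here typically pairs a constants module with an env-var lookup helper. A sketch under those assumptions (the constant values and helper name are illustrative, not the actual contents of `lightrag/constants.py`):

```python
import os

# Shared defaults live in one module (values here are illustrative)
DEFAULT_TOP_K = 60
DEFAULT_MAX_ASYNC = 4

def get_env_value(name: str, default, cast=str):
    """Read an environment variable, falling back to the shared default.

    Keeping the default in constants.py means config.py and the /health
    endpoint report the same value without duplicating literals.
    """
    raw = os.environ.get(name)
    return cast(raw) if raw is not None else default

top_k = get_env_value("TOP_K", DEFAULT_TOP_K, int)
```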
yangdx
d70c584d80 Bump api version to 0191 2025-07-27 21:24:53 +08:00
yangdx
3f5ade47cd Update README 2025-07-27 17:26:49 +08:00
yangdx
ebaff228aa feat: Add rerank score filtering with configurable threshold
- Add DEFAULT_MIN_RERANK_SCORE constant (default: 0.0)
- Add MIN_RERANK_SCORE environment variable support
- Filter chunks with rerank scores below threshold in process_chunks_unified
- Add info-level logging for filtering operations
- Handle empty results gracefully after filtering
- Maintain backward compatibility with non-reranked chunks
2025-07-27 16:37:44 +08:00
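The threshold filtering described in the bullets above can be sketched as follows (the function and chunk-dict key names are assumptions; only `DEFAULT_MIN_RERANK_SCORE` and the behavior come from the commit text):

```python
import logging

DEFAULT_MIN_RERANK_SCORE = 0.0  # overridable via the MIN_RERANK_SCORE env var

def filter_by_rerank_score(chunks, min_score=DEFAULT_MIN_RERANK_SCORE):
    """Drop chunks whose rerank score falls below the threshold.

    Chunks without a score (never reranked) pass through untouched,
    preserving backward compatibility with non-reranked pipelines.
    An empty result after filtering is returned as-is rather than
    treated as an error.
    """
    kept = [
        c for c in chunks
        if c.get("rerank_score") is None or c["rerank_score"] >= min_score
    ]
    if len(kept) < len(chunks):
        logging.info("Filtered %d low-score chunks", len(chunks) - len(kept))
    return kept
```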
yangdx
0dfbce0bb4 Update the README to clarify the explanation of concurrent processes. 2025-07-27 10:39:28 +08:00
yangdx
e7baf54ec2 Update webui assets 2025-07-26 08:43:12 +08:00
yangdx
b3c2987006 Reduce default MAX_TOKENS from 32000 to 10000 2025-07-26 08:13:49 +08:00
yangdx
6a99d7ac28 Update webui assets 2025-07-25 22:03:58 +08:00
yangdx
4ae44bb24b Bump core version to 1.4.5 and api version to 0190 2025-07-25 11:15:04 +08:00
yangdx
bf58c73c3f Update webui assets 2025-07-24 16:48:45 +08:00
yangdx
51231c7647 Update README 2025-07-24 15:48:49 +08:00
yangdx
5437509824 Bump api version to 0189 2025-07-24 14:07:48 +08:00
yangdx
d8e7b77099 Bump api version to 0188 2025-07-24 12:27:30 +08:00
yangdx
2767212ba0 Fix linting 2025-07-24 12:25:50 +08:00
yangdx
d979e9078f feat: Integrate Jina embeddings API support
- Implemented Jina embedding function
- Add new EMBEDDING_BINDING type of jina for LightRAG Server
- Add env var sample
2025-07-24 12:15:00 +08:00
yangdx
cb3bf3291c Fix: rename rerank parameter from top_k to top_n
The change aligns with the API parameter naming used by Jina and Cohere rerank services, ensuring consistency and clarity.
2025-07-20 00:26:27 +08:00
yangdx
8d8f9e411e Bump core version to 1.4.4 and api version to 0187 2025-07-19 13:28:39 +08:00
yangdx
488028b9e2 Remove separate requirements.txt and update Dockerfile to use pip install 2025-07-18 01:58:46 +08:00
yangdx
99527027de feat: change default query mode from hybrid to mix
- Update default mode for Ollama chat endpoint
- Update default mode for query endpoint of LightRAG
2025-07-17 19:21:15 +08:00
yangdx
e828539b24 Update README 2025-07-17 19:05:34 +08:00
yangdx
b321afefaa Bump core version to 1.4.3 and api version to 0186 2025-07-17 16:58:57 +08:00
yangdx
f3c0dab7ce Bump core version to 1.4.2 and api version to 0185 2025-07-17 12:26:10 +08:00
yangdx
910c6973f3 Limit file deletion to current directory only after document cleaning 2025-07-16 20:35:24 +08:00
yangdx
2bf0d397ed Update webui assets 2025-07-16 10:18:51 +08:00
yangdx
e4f62de727 Bump api version to 0184 2025-07-16 04:57:46 +08:00
yangdx
500e940f75 Remove max token summary display from splash screen 2025-07-16 04:55:32 +08:00
yangdx
0adb5f2595 Update webui assets 2025-07-16 01:39:48 +08:00
Daniel.y
b44c8d46a5
Merge pull request #1782 from HKUDS/rerank
Refactor the token control system
2025-07-16 00:23:25 +08:00
yangdx
5f7cb437e8 Centralize query parameters into LightRAG class
This commit refactors query parameter management by consolidating settings like `top_k`, token limits, and thresholds into the `LightRAG` class, and consistently sourcing parameters from a single location.
2025-07-15 23:56:49 +08:00
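Consolidating per-query settings into the owning class usually means one settings object plus a request-level override rule. A sketch under those assumptions (class name, field names, and defaults are all hypothetical, not the actual `LightRAG` attributes):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class QuerySettings:
    """Illustrative consolidation of query parameters into one place."""
    top_k: int = 60
    cosine_threshold: float = 0.2

def resolve_top_k(settings: QuerySettings, override: Optional[int] = None) -> int:
    # A per-request override wins; otherwise the single centralized
    # default applies, so every query path sources the same value.
    return override if override is not None else settings.top_k
```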
yangdx
089346f8df Bump api version to 0183 2025-07-15 19:52:50 +08:00
yangdx
93b25a65d5 Update webui assets 2025-07-15 18:10:00 +08:00
yangdx
661a41f9eb Update webui assets 2025-07-15 17:25:39 +08:00
yangdx
47341d3a71 Merge branch 'main' into rerank 2025-07-15 16:12:33 +08:00
yangdx
e8e1f6ab56 feat: centralize environment variable defaults in constants.py 2025-07-15 16:11:50 +08:00
Daniel.y
6d1260aafa
Merge pull request #1766 from HKUDS/fix-memgraph-max-nodes-issue
Fix Memgraph get_knowledge_graph issues
2025-07-15 16:07:04 +08:00
yangdx
ccc2a20071 feat: remove deprecated MAX_TOKEN_SUMMARY parameter to prevent LLM output truncation
- Remove MAX_TOKEN_SUMMARY parameter and related configurations
- Eliminate forced token-based truncation in entity/relationship descriptions
- Switch to fragment-count based summarization logic using FORCE_LLM_SUMMARY_ON_MERGE
- Update FORCE_LLM_SUMMARY_ON_MERGE default from 6 to 4 for better summarization
- Clean up documentation, environment examples, and API display code
- Preserve backward compatibility by graceful parameter removal

This change resolves issues where LLMs were forcibly truncating entity relationship
descriptions mid-sentence, leading to incomplete and potentially inaccurate knowledge
graph content. The new approach allows LLMs to generate complete descriptions while
still providing summarization when multiple fragments need to be merged.

Breaking Change: None - parameter removal is backward compatible
Fixes: Entity relationship description truncation issues
2025-07-15 12:26:33 +08:00
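The fragment-count trigger that replaces token-based truncation might look like this (`FORCE_LLM_SUMMARY_ON_MERGE` and its new default come from the commit text; the merge helper itself is a hypothetical sketch):

```python
FORCE_LLM_SUMMARY_ON_MERGE = 4  # new default (previously 6)

def merge_descriptions(fragments, llm_summarize):
    """Merge entity/relation description fragments.

    Instead of truncating output at a token limit (which cut
    descriptions off mid-sentence), the LLM summarizer is invoked only
    once enough fragments accumulate; below the threshold the complete
    fragments are simply joined, untruncated.
    """
    if len(fragments) >= FORCE_LLM_SUMMARY_ON_MERGE:
        return llm_summarize(fragments)
    return " | ".join(fragments)
```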
zrguo
7c882313bb remove chunk_rerank_top_k 2025-07-15 11:52:34 +08:00
yangdx
9afe578fe7 Update webui assets 2025-07-14 17:56:51 +08:00