28 Commits

Author SHA1 Message Date
yangdx
47485b130d refac(ui): Show rerank binding info on status card
- Remove separate ENABLE_RERANK flag in favor of rerank_binding="null"
- Change default rerank binding from "cohere" to "null" (disabled)
- Update UI to display both rerank binding and model information
2025-08-23 02:04:14 +08:00
yangdx
bf43e1b8c1 fix: Resolve default rerank config problem when env var missing
- Read config from selected_rerank_func when env var missing
- Make api_key optional for rerank function
- Add response format validation with proper error handling
- Update Cohere rerank default to official API endpoint
2025-08-23 01:07:59 +08:00
yangdx
16a1ef1178 Update summary_max_tokens default from 10k to 30k tokens 2025-08-21 23:16:07 +08:00
yangdx
4c556d8aae Set default TIMEOUT value to 150, and gunicorn timeout to TIMEOUT+30 2025-08-20 22:04:32 +08:00
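The relationship between the two timeouts can be sketched as below. This is only an illustration: the commit message gives the default of 150 and the 30-second margin, while the helper name `resolve_timeouts` and the way the environment is read are assumptions.

```python
import os

DEFAULT_TIMEOUT = 150        # default stated in the commit message
GUNICORN_EXTRA_SECONDS = 30  # margin stated in the commit message


def resolve_timeouts(env=None):
    """Return (request_timeout, gunicorn_timeout) in seconds.

    Hypothetical helper: reads TIMEOUT from the given mapping
    (os.environ by default) and derives the gunicorn worker timeout
    as TIMEOUT + 30.
    """
    env = os.environ if env is None else env
    timeout = int(env.get("TIMEOUT", DEFAULT_TIMEOUT))
    return timeout, timeout + GUNICORN_EXTRA_SECONDS
```

Keeping the worker timeout above the request timeout presumably lets a slow request fail through the application's own timeout before the worker is killed.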
yangdx
d5e8f1e860 Update default query parameters for better performance
- Increase chunk_top_k from 10 to 20
- Reduce max_entity_tokens to 6000
- Reduce max_relation_tokens to 8000
- Update web UI default values
- Fix max_total_tokens to 30000
2025-08-18 19:32:11 +08:00
yangdx
dcec511f72 feat: increase file path length limit to 32768 and add schema migration for Milvus DB
- Bump path limit to 32768 chars
- Add migration detection logic
- Implement dual-client migration
- Auto-migrate old collections
2025-08-18 04:37:12 +08:00
yangdx
5a40ff654e Change KG chunk selection default to VECTOR
- Set KG_CHUNK_PICK_METHOD default to VECTOR
- Update env.example with new config option
2025-08-13 23:10:42 +08:00
yangdx
f1dafa0d01 feat: KG related chunks selection by vector similarity
- Add env switch to toggle weighted polling vs vector-similarity strategy
- Implement similarity-based sorting with fallback to weighted
- Introduce batch vector read API for vector storage
- Implement vector store and retrieve function for Nanovector DB
- Preserve default behavior (weighted polling selection method)
2025-08-13 18:16:42 +08:00
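A minimal sketch of similarity-based selection with a fallback, assuming chunk vectors arrive as a plain dict from a batch read. The function names, the dict-based storage, and the fallback of keeping the caller's existing order are hypothetical. The project's weighted polling algorithm is not reproduced here.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_chunks(chunk_ids, query_vec, vectors, limit, method="VECTOR"):
    """Pick up to `limit` chunk ids.

    With method "VECTOR", and only when every candidate has a stored
    vector, rank candidates by similarity to the query. Otherwise fall
    back to the incoming order, which stands in for the weighted
    strategy in this sketch.
    """
    have_all = all(cid in vectors for cid in chunk_ids)
    if method == "VECTOR" and query_vec and have_all:
        ranked = sorted(
            chunk_ids,
            key=lambda cid: cosine(query_vec, vectors[cid]),
            reverse=True,
        )
        return ranked[:limit]
    return list(chunk_ids)[:limit]
```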
yangdx
9d5603d35e Set the default LLM temperature to 1.0 and centralize constant management 2025-07-31 17:15:10 +08:00
yangdx
c6bd9f0329 Disable conversation history by default
- Set default history_turns to 0
- Mark history_turns as deprecated
- Remove history_turns from example
- Update documentation comments
2025-07-31 12:28:42 +08:00
yangdx
f2ffff063b feat: refactor ollama server configuration management
- Add ollama_server_infos attribute to LightRAG class with default initialization
- Move default values to constants.py for centralized configuration
- Refactor OllamaServerInfos class with property accessors and CLI support
- Update OllamaAPI to get configuration through rag object instead of direct import
- Add command line arguments for simulated model name and tag
- Fix type imports to avoid circular dependencies
2025-07-28 01:38:35 +08:00
yangdx
598eecd06d Refactor: Rename llm_model_max_token_size to summary_max_tokens
This commit renames the parameter 'llm_model_max_token_size' to 'summary_max_tokens' for better clarity, as it specifically controls the token limit for entity relation summaries.
2025-07-28 00:49:08 +08:00
yangdx
d0d57a45b6 feat: add environment variables to /health endpoint and centralize defaults
- Add 9 environment variables to /health endpoint configuration section
- Centralize default constants in lightrag/constants.py for consistency
- Update config.py to use centralized defaults for better maintainability
2025-07-28 00:30:56 +08:00
yangdx
a9565d7379 feat: Skip rerank filtering when min_rerank_score is 0.0 2025-07-27 16:50:12 +08:00
yangdx
ebaff228aa feat: Add rerank score filtering with configurable threshold
- Add DEFAULT_MIN_RERANK_SCORE constant (default: 0.0)
- Add MIN_RERANK_SCORE environment variable support
- Filter chunks with rerank scores below threshold in process_chunks_unified
- Add info-level logging for filtering operations
- Handle empty results gracefully after filtering
- Maintain backward compatibility with non-reranked chunks
2025-07-27 16:37:44 +08:00
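The filtering described in this commit, together with the later change that skips filtering at a threshold of 0.0, might look roughly like the sketch below. The `rerank_score` key and the function name are assumptions. The actual logic lives in process_chunks_unified and may differ.

```python
DEFAULT_MIN_RERANK_SCORE = 0.0  # default named in the commit message


def filter_by_rerank_score(chunks, min_score=DEFAULT_MIN_RERANK_SCORE):
    """Drop chunks whose rerank score falls below min_score.

    Assumptions in this sketch: each chunk is a dict, the score is
    stored under the hypothetical key "rerank_score", and chunks
    without a score (not reranked) are kept for backward compatibility.
    """
    if min_score <= 0.0:
        # A threshold of 0.0 disables filtering entirely.
        return list(chunks)
    kept = []
    for chunk in chunks:
        score = chunk.get("rerank_score")
        if score is None or score >= min_score:
            kept.append(chunk)
    return kept
```

An empty list is a valid result here, so callers should be prepared for every chunk to be filtered out.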
yangdx
055629d30d Reduce default max total tokens to 30k 2025-07-27 10:33:06 +08:00
yangdx
c8c3545454 refactor: extract file path length limit to shared constant
- Add DEFAULT_MAX_FILE_PATH_LENGTH constant
- Replace hardcoded 4090 in Milvus impl
2025-07-26 10:45:03 +08:00
yangdx
2c940f0728 reduce RELATED_CHUNK_NUMBER from 10 to 5 2025-07-24 02:49:05 +08:00
yangdx
8103b200db Set DEFAULT_HISTORY_TURNS to 0 2025-07-16 02:20:27 +08:00
yangdx
6e084bfae1 Increase default related chunk number from 5 to 10 2025-07-16 00:22:34 +08:00
yangdx
5f7cb437e8 Centralize query parameters into LightRAG class
This commit refactors query parameter management by consolidating settings like `top_k`, token limits, and thresholds into the `LightRAG` class, and consistently sourcing parameters from a single location.
2025-07-15 23:56:49 +08:00
zrguo
1541034816 Add DEFAULT_RELATED_CHUNK_NUMBER 2025-07-15 21:35:12 +08:00
yangdx
47341d3a71 Merge branch 'main' into rerank 2025-07-15 16:12:33 +08:00
yangdx
e8e1f6ab56 feat: centralize environment variable defaults in constants.py 2025-07-15 16:11:50 +08:00
yangdx
ccc2a20071 feat: remove deprecated MAX_TOKEN_SUMMARY parameter to prevent LLM output truncation
- Remove MAX_TOKEN_SUMMARY parameter and related configurations
- Eliminate forced token-based truncation in entity/relationship descriptions
- Switch to fragment-count based summarization logic using FORCE_LLM_SUMMARY_ON_MERGE
- Update FORCE_LLM_SUMMARY_ON_MERGE default from 6 to 4 for better summarization
- Clean up documentation, environment examples, and API display code
- Preserve backward compatibility by graceful parameter removal

This change resolves issues where LLMs were forcibly truncating entity relationship
descriptions mid-sentence, leading to incomplete and potentially inaccurate knowledge
graph content. The new approach allows LLMs to generate complete descriptions while
still providing summarization when multiple fragments need to be merged.

Breaking Change: None - parameter removal is backward compatible
Fixes: Entity relationship description truncation issues
2025-07-15 12:26:33 +08:00
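One way to read the fragment-count approach is sketched below. The threshold of 4 comes from the commit message. The function name, the newline separator, the `summarize` callback, and the choice of `>=` for the comparison are all assumptions.

```python
FORCE_LLM_SUMMARY_ON_MERGE = 4  # new default stated in the commit message


def merge_descriptions(fragments, summarize):
    """Merge description fragments without token-based truncation.

    Below the fragment-count threshold the fragments are joined as-is.
    At or above it, the caller-supplied `summarize` callable (a
    stand-in for an LLM call) produces a single description.
    """
    if len(fragments) < FORCE_LLM_SUMMARY_ON_MERGE:
        return "\n".join(fragments)
    return summarize(fragments)
```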
zrguo
479865a271 Add max_gleaning to env 2025-07-01 17:13:33 +08:00
yangdx
da46b341dc feat: Optimize document deletion performance
- To enhance performance during document deletion, new batch-get methods, `get_nodes_by_chunk_ids` and `get_edges_by_chunk_ids`, have been added to the graph storage layer (`BaseGraphStorage` and its implementations). The [`adelete_by_doc_id`](lightrag/lightrag.py:1681) function now leverages these methods to avoid unnecessary iteration over the entire knowledge graph, significantly improving efficiency.
- Graph storage updated: Networkx, Neo4j, Postgres AGE
2025-06-25 12:37:57 +08:00
yangdx
c8ecfa2d68 feat: Centralize configuration and update defaults
This commit introduces `lightrag/constants.py` to centralize default values for various configurations across the API and core components.

Key changes:
- Added `constants.py` to centralize default values
- Improved the `get_env_value` function in `api/config.py` to correctly handle string "None" as a None value and to catch `TypeError` during value conversion.
- Updated the default `SUMMARY_LANGUAGE` to "English"
- Set default `WORKERS` to 2
2025-05-06 22:00:43 +08:00
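The `get_env_value` behavior described above could be sketched as follows. This is not the code from `api/config.py`: the signature and the exact-match comparison against "None" are assumptions made for illustration.

```python
import os


def get_env_value(name, default, value_type=str):
    """Read an environment variable and convert it to value_type.

    Sketch under assumptions: an unset variable yields the default,
    the literal string "None" yields None, and a failed conversion
    (ValueError or TypeError) falls back to the default.
    """
    raw = os.getenv(name)
    if raw is None:
        return default
    if raw == "None":
        return None
    try:
        return value_type(raw)
    except (ValueError, TypeError):
        return default
```

For example, `get_env_value("WORKERS", 2, int)` returns 2 when `WORKERS` is unset or cannot be parsed as an integer.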