LightRAG

mirror of https://github.com/HKUDS/LightRAG.git synced 2025-07-24 17:30:47 +00:00

Author	SHA1	Message	Date
yangdx	05bc5cfb64	Improve task execution with early failure detection - Add early failure detection for async tasks - Cancel pending tasks on first exception	2025-07-19 10:14:22 +08:00
yangdx	12d4f12e57	fix: sort edge_key components in _locked_process_edges for consistent locking - Ensures bidirectional relationships use same lock key - Maintains thread safety for knowledge graph edge operations	2025-07-19 07:36:50 +08:00
yangdx	be2d938c84	Fix file path handling in graph operations - Filter out empty file paths - Handle missing file_path fields	2025-07-17 18:33:14 +08:00
yangdx	7184c7b3ab	fix: change default edge weight from 0.0 to 1.0 in entity extraction and graph storage - Update extract_entities function in operate.py to use 1.0 as default weight - Fix Neo4j implementation to use 1.0 instead of 0.0 for missing edge weights - Fix Memgraph implementation to use 1.0 instead of 0.0 for missing edge weights - Ensures consistent non-zero default weights across all graph storage backends	2025-07-17 11:30:49 +08:00
yangdx	b1276a079f	Fix linting	2025-07-15 23:57:24 +08:00
yangdx	5f7cb437e8	Centralize query parameters into LightRAG class This commit refactors query parameter management by consolidating settings like `top_k`, token limits, and thresholds into the `LightRAG` class, and consistently sourcing parameters from a single location.	2025-07-15 23:56:49 +08:00
zrguo	3ead0489b8	Remove "rank", "weight", "keywords"	2025-07-15 21:47:33 +08:00
zrguo	1541034816	Add DEFAULT_RELATED_CHUNK_NUMBER	2025-07-15 21:35:12 +08:00
zrguo	42f1fd60f4	Update operate.py	2025-07-15 18:59:52 +08:00
zrguo	29e82723e6	Update operate.py	2025-07-15 18:57:57 +08:00
yangdx	1927cb2685	Fix linting	2025-07-15 17:24:57 +08:00
yangdx	47341d3a71	Merge branch 'main' into rerank	2025-07-15 16:12:33 +08:00
yangdx	e8e1f6ab56	feat: centralize environment variable defaults in constants.py	2025-07-15 16:11:50 +08:00
yangdx	ccc2a20071	feat: remove deprecated MAX_TOKEN_SUMMARY parameter to prevent LLM output truncation - Remove MAX_TOKEN_SUMMARY parameter and related configurations - Eliminate forced token-based truncation in entity/relationship descriptions - Switch to fragment-count based summarization logic using FORCE_LLM_SUMMARY_ON_MERGE - Update FORCE_LLM_SUMMARY_ON_MERGE default from 6 to 4 for better summarization - Clean up documentation, environment examples, and API display code - Preserve backward compatibility by graceful parameter removal This change resolves issues where LLMs were forcibly truncating entity relationship descriptions mid-sentence, leading to incomplete and potentially inaccurate knowledge graph content. The new approach allows LLMs to generate complete descriptions while still providing summarization when multiple fragments need to be merged. Breaking Change: None - parameter removal is backward compatible Fixes: Entity relationship description truncation issues	2025-07-15 12:26:33 +08:00
zrguo	7c882313bb	remove chunk_rerank_top_k	2025-07-15 11:52:34 +08:00
zrguo	86a0a4872e	Update operate.py	2025-07-15 10:56:48 +08:00
zrguo	7edf087baa	Update operate.py	2025-07-14 18:43:22 +08:00
zrguo	bbd91d3a18	Update operate.py	2025-07-14 16:37:25 +08:00
zrguo	4e425b1b59	Revert "update from main" This reverts commit 1d0376d6a926ef60d641af4406dacf5b8bbb430f.	2025-07-14 16:29:00 +08:00
zrguo	1d0376d6a9	update from main	2025-07-14 16:27:49 +08:00
zrguo	c9cbd2d3e0	Merge branch 'main' into rerank	2025-07-14 16:24:29 +08:00
zrguo	ef2115d437	Update token limit	2025-07-14 15:53:48 +08:00
yangdx	b03bb48e24	feat: Refine summary logic and add dedicated Ollama num_ctx config - Refactor the trigger condition for LLM-based summarization of entities and relations. Instead of relying on character length, the summary is now triggered when the number of merged description fragments exceeds a configured threshold. This provides a more robust and logical condition for consolidation. - Introduce the `OLLAMA_NUM_CTX` environment variable to explicitly configure the context window size (`num_ctx`) for Ollama models. This decouples the model's context length from the `MAX_TOKENS` parameter, which is now specifically used to limit input for summary generation, making the configuration clearer and more flexible. - Updated `README` files, `env.example`, and default values to reflect these changes.	2025-07-14 01:55:04 +08:00
yangdx	f185b3fb38	Optimize async task limits for graph processing - Increased concurrency for graph operations - Renamed variables for clarity - Updated status messages	2025-07-13 21:51:19 +08:00
yangdx	e4bf4d19a0	Optimize knowledge graph rebuild with parallel processing - Add parallel processing for KG rebuild - Implement keyed locks for data consistency	2025-07-12 13:22:56 +08:00
yangdx	a85d7054d4	fix: move node existence check inside lock to prevent race condition Move knowledge_graph_inst.has_node check inside get_storage_keyed_lock in _merge_edges_then_upsert to ensure atomic check-then-act operations and prevent duplicate node creation during concurrent updates.	2025-07-12 12:22:32 +08:00
yangdx	2ade3067f8	Refac: Generalize keyed lock with namespace support Refactored the `KeyedUnifiedLock` to be generic and support dynamic namespaces. This decouples the locking mechanism from a specific "GraphDB" implementation, allowing it to be reused across different components and workspaces safely. Key changes: - `KeyedUnifiedLock` now takes a `namespace` parameter on lock acquisition. - Renamed `_graph_db_lock_keyed` to a more generic `_storage_keyed_lock` - Replaced `get_graph_db_lock_keyed` with `get_storage_keyed_lock` to support namespaces	2025-07-12 12:10:12 +08:00
yangdx	22c36f2fd2	Optimize log messages	2025-07-12 02:41:31 +08:00
yangdx	c47747da9e	Merge branch 'main' into merge_lock_with_key	2025-07-11 16:37:10 +08:00
yangdx	ef4870fda5	Combined entity and edge processing tasks and optimize merging with semaphore	2025-07-11 16:34:54 +08:00
zrguo	b0479c078a	fix process_chunks_unified()	2025-07-09 15:55:38 +08:00
zrguo	e1541caea9	Update webui setting	2025-07-09 12:10:06 +08:00
yangdx	207f0a7f2a	Merge branch 'main' into merge_lock_with_key	2025-07-09 09:25:28 +08:00
yangdx	cb3bfc0e5b	Release semaphore before merge stage	2025-07-09 09:24:44 +08:00
yangdx	e9c3503f77	Update logger info	2025-07-09 04:36:52 +08:00
yangdx	5d4484882a	Merge branch 'main' into rerank	2025-07-09 03:59:04 +08:00
zrguo	c295d355a0	fix chunk_top_k limiting	2025-07-08 15:05:30 +08:00
SLKun	5f330ec11a	remove <think> tag for entities and keywords extraction	2025-07-08 14:59:15 +08:00
zrguo	04a57445da	update chunks truncation method	2025-07-08 13:31:05 +08:00
yangdx	56d43de58a	Merge branch 'main' into merge_lock_with_key	2025-07-08 12:46:31 +08:00
zrguo	f5c80d7cde	Simplify Configuration	2025-07-08 11:16:34 +08:00
zrguo	75dd4f3498	add rerank model	2025-07-07 22:44:59 +08:00
yangdx	fe13475234	Fix linting	2025-07-05 12:07:37 +08:00
yangdx	a2e59dd078	fix: prevent empty entity names after normalization in extraction Added validation checks in entity and relationship extraction functions to filter out entities that become empty strings after normalize_extracted_info processing. This prevents empty labels from appearing in get_all_labels() results and maintains knowledge graph data integrity.	2025-07-05 12:06:34 +08:00
yangdx	6c2ae40d7d	Refac: Enhance KG rebuild stability by incorporating `create_time` into the LLM cache	2025-07-03 17:08:29 +08:00
yangdx	6b6d14bc3a	fix: Deduplicate entities and relationships in a single chunk with multiple gleaning results during KG rebuild	2025-07-03 13:47:52 +08:00
yangdx	e56734cb8b	Refac: Optimize document deletion performance - Adding chunks_list to doc_status - Adding llm_cache_list to text_chunks - Implemented storage types: JsonKV and Redis	2025-07-03 04:18:25 +08:00
yangdx	271722405f	feat: Flatten LLM cache structure for improved recall efficiency Refactored the LLM cache to a flat Key-Value (KV) structure, replacing the previous nested format. The old structure used the 'mode' as a key and stored specific cache content as JSON nested under it. This change significantly enhances cache recall efficiency.	2025-07-02 16:11:53 +08:00
yangdx	e70f5a35e5	Refac: Add KG rebuild logging with pipeline status - Logs detailed progress, including warnings and failures, to the pipeline status. - Adds counters to report the total number of successfully rebuilt entities and relationships upon completion.	2025-06-29 21:27:12 +08:00
yangdx	8522bfc9dc	Optimized logger info	2025-06-28 19:27:36 +08:00
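
Commit 05bc5cfb64 above describes early failure detection for async tasks: pending tasks are cancelled on the first exception. The following is a minimal sketch of that pattern using only the standard `asyncio` module. The function name `run_with_early_failure` is hypothetical and is not taken from the LightRAG source; the actual implementation may differ.

```python
import asyncio


async def run_with_early_failure(coros):
    """Run coroutines concurrently; on the first exception, cancel the rest.

    Hypothetical helper for illustration only (not a LightRAG API).
    Returns results in input order when every coroutine succeeds.
    """
    tasks = [asyncio.ensure_future(c) for c in coros]
    if not tasks:
        return []

    # Wake up as soon as any task raises, or when all tasks have finished.
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_EXCEPTION
    )

    failed = next(
        (t for t in done if not t.cancelled() and t.exception() is not None),
        None,
    )
    if failed is not None:
        # Cancel whatever is still running and wait for the cancellations
        # to be delivered before surfacing the original error.
        for t in pending:
            t.cancel()
        await asyncio.gather(*pending, return_exceptions=True)
        raise failed.exception()

    return [t.result() for t in tasks]
```

Compared with a plain `asyncio.gather`, this stops outstanding work as soon as one task fails instead of letting the remaining tasks run to completion.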

396 Commits
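
Several commits in this listing (12d4f12e57, a85d7054d4, 2ade3067f8) concern keyed locking: edge key components are sorted so both directions of a relationship share one lock, existence checks run inside the lock, and locks are scoped by namespace. The sketch below illustrates those three ideas under stated assumptions. `KeyedLockSketch`, `edge_lock_key`, and `upsert_edge` are hypothetical names, and the in-memory `graph` dict stands in for a real graph storage backend; none of this is LightRAG's `KeyedUnifiedLock` or `get_storage_keyed_lock`. The default weight of 1.0 mirrors the default mentioned in commit 7184c7b3ab.

```python
import asyncio
from collections import defaultdict


class KeyedLockSketch:
    """Hypothetical per-(namespace, key) lock registry, for illustration."""

    def __init__(self):
        # Locks are created lazily, on first access for a given key.
        self._locks = defaultdict(asyncio.Lock)

    def get(self, namespace, key):
        return self._locks[(namespace, key)]


def edge_lock_key(src, tgt):
    # Sorting makes (A, B) and (B, A) map to the same lock key.
    return tuple(sorted((src, tgt)))


async def upsert_edge(locks, graph, namespace, src, tgt, weight=1.0):
    key = edge_lock_key(src, tgt)
    async with locks.get(namespace, key):
        # The existence check happens inside the lock, so the
        # check-then-act sequence cannot interleave with another
        # writer that holds the same key.
        for node in (src, tgt):
            if node not in graph["nodes"]:
                graph["nodes"].add(node)
        graph["edges"][key] = weight
```

Because the namespace is part of the lock key, two workspaces can update an edge with the same endpoints without contending for the same lock.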