LightRAG

mirror of https://github.com/HKUDS/LightRAG.git synced 2025-07-30 12:22:58 +00:00

Author	SHA1	Message	Date
zrguo	bbd91d3a18	Update operate.py	2025-07-14 16:37:25 +08:00
zrguo	4e425b1b59	Revert "update from main" This reverts commit 1d0376d6a926ef60d641af4406dacf5b8bbb430f.	2025-07-14 16:29:00 +08:00
zrguo	1d0376d6a9	update from main	2025-07-14 16:27:49 +08:00
zrguo	c9cbd2d3e0	Merge branch 'main' into rerank	2025-07-14 16:24:29 +08:00
zrguo	ef2115d437	Update token limit	2025-07-14 15:53:48 +08:00
yangdx	b03bb48e24	feat: Refine summary logic and add dedicated Ollama num_ctx config - Refactor the trigger condition for LLM-based summarization of entities and relations. Instead of relying on character length, the summary is now triggered when the number of merged description fragments exceeds a configured threshold. This provides a more robust and logical condition for consolidation. - Introduce the `OLLAMA_NUM_CTX` environment variable to explicitly configure the context window size (`num_ctx`) for Ollama models. This decouples the model's context length from the `MAX_TOKENS` parameter, which is now specifically used to limit input for summary generation, making the configuration clearer and more flexible. - Updated `README` files, `env.example`, and default values to reflect these changes.	2025-07-14 01:55:04 +08:00
yangdx	f185b3fb38	Optimize async task limits for graph processing - Increased concurrency for graph operations - Renamed variables for clarity - Updated status messages	2025-07-13 21:51:19 +08:00
yangdx	e4bf4d19a0	Optimize knowledge graph rebuild with parallel processing - Add parallel processing for KG rebuild - Implement keyed locks for data consistency	2025-07-12 13:22:56 +08:00
yangdx	a85d7054d4	fix: move node existence check inside lock to prevent race condition Move knowledge_graph_inst.has_node check inside get_storage_keyed_lock in _merge_edges_then_upsert to ensure atomic check-then-act operations and prevent duplicate node creation during concurrent updates.	2025-07-12 12:22:32 +08:00
yangdx	2ade3067f8	Refac: Generalize keyed lock with namespace support Refactored the `KeyedUnifiedLock` to be generic and support dynamic namespaces. This decouples the locking mechanism from a specific "GraphDB" implementation, allowing it to be reused across different components and workspaces safely. Key changes: - `KeyedUnifiedLock` now takes a `namespace` parameter on lock acquisition. - Renamed `_graph_db_lock_keyed` to a more generic `_storage_keyed_lock` - Replaced `get_graph_db_lock_keyed` with `get_storage_keyed_lock` to support namespaces	2025-07-12 12:10:12 +08:00
yangdx	22c36f2fd2	Optimize log messages	2025-07-12 02:41:31 +08:00
yangdx	c47747da9e	Merge branch 'main' into merge_lock_with_key	2025-07-11 16:37:10 +08:00
yangdx	ef4870fda5	Combined entity and edge processing tasks and optimize merging with semaphore	2025-07-11 16:34:54 +08:00
zrguo	b0479c078a	fix process_chunks_unified()	2025-07-09 15:55:38 +08:00
zrguo	e1541caea9	Update webui setting	2025-07-09 12:10:06 +08:00
yangdx	207f0a7f2a	Merge branch 'main' into merge_lock_with_key	2025-07-09 09:25:28 +08:00
yangdx	cb3bfc0e5b	Release semaphore before merge stage	2025-07-09 09:24:44 +08:00
yangdx	e9c3503f77	Update logger info	2025-07-09 04:36:52 +08:00
yangdx	5d4484882a	Merge branch 'main' into rerank	2025-07-09 03:59:04 +08:00
zrguo	c295d355a0	fix chunk_top_k limiting	2025-07-08 15:05:30 +08:00
SLKun	5f330ec11a	remove <think> tag for entities and keywords extraction	2025-07-08 14:59:15 +08:00
zrguo	04a57445da	update chunks truncation method	2025-07-08 13:31:05 +08:00
yangdx	56d43de58a	Merge branch 'main' into merge_lock_with_key	2025-07-08 12:46:31 +08:00
zrguo	f5c80d7cde	Simplify Configuration	2025-07-08 11:16:34 +08:00
zrguo	75dd4f3498	add rerank model	2025-07-07 22:44:59 +08:00
yangdx	fe13475234	Fix linting	2025-07-05 12:07:37 +08:00
yangdx	a2e59dd078	fix: prevent empty entity names after normalization in extraction Added validation checks in entity and relationship extraction functions to filter out entities that become empty strings after normalize_extracted_info processing. This prevents empty labels from appearing in get_all_labels() results and maintains knowledge graph data integrity.	2025-07-05 12:06:34 +08:00
yangdx	6c2ae40d7d	Refac: Enhance KG rebuild stability by incorporating `create_time` into the LLM cache	2025-07-03 17:08:29 +08:00
yangdx	6b6d14bc3a	fix: Deduplicate entities and relationships in a single chunk with multiple gleaning results during KG rebuild	2025-07-03 13:47:52 +08:00
yangdx	e56734cb8b	Refac: Optimize document deletion performance - Adding chunks_list to doc_status - Adding llm_cache_list to text_chunks - Implemented storage types: JsonKV and Redis	2025-07-03 04:18:25 +08:00
yangdx	271722405f	feat: Flatten LLM cache structure for improved recall efficiency Refactored the LLM cache to a flat Key-Value (KV) structure, replacing the previous nested format. The old structure used the 'mode' as a key and stored specific cache content as JSON nested under it. This change significantly enhances cache recall efficiency.	2025-07-02 16:11:53 +08:00
yangdx	e70f5a35e5	Refac: Add KG rebuild logging with pipeline status - Logs detailed progress, including warnings and failures, to the pipeline status. - Adds counters to report the total number of successfully rebuilt entities and relationships upon completion.	2025-06-29 21:27:12 +08:00
yangdx	8522bfc9dc	Optimized logger info	2025-06-28 19:27:36 +08:00
yangdx	0f51ec48f1	fix: streaming error when only_need_context=True returns empty results Prevents NoneType async iteration error by handling None responses in stream_generator and ensuring kg_query returns valid strings.	2025-06-28 09:18:06 +08:00
yangdx	495d6c8cce	Improve the pipeline status message for document deletion	2025-06-25 15:46:58 +08:00
yangdx	da46b341dc	feat: Optimize document deletion performance - To enhance performance during document deletion, new batch-get methods, `get_nodes_by_chunk_ids` and `get_edges_by_chunk_ids`, have been added to the graph storage layer (`BaseGraphStorage` and its implementations). The [`adelete_by_doc_id`](lightrag/lightrag.py:1681) function now leverages these methods to avoid unnecessary iteration over the entire knowledge graph, significantly improving efficiency. - Graph storage updated: Networkx, Neo4j, Postgres AGE	2025-06-25 12:37:57 +08:00
zrguo	3a9494ab60	Update operate.py	2025-06-09 19:47:29 +08:00
zrguo	9a71a10bc0	Update operate.py	2025-06-09 19:40:29 +08:00
zrguo	ead82a8dbd	update delete_by_doc_id	2025-06-09 18:52:34 +08:00
yangdx	36a736db0b	Fix node merge error	2025-05-30 12:30:24 +08:00
zrguo	40b10e8fcf	Update insert_custom_kg	2025-05-27 16:07:04 +08:00
Arjun Rao	2fbfdb5b17	Merge remote-tracking branch 'upstream/main'	2025-05-14 03:12:03 +10:00
yangdx	9ec9579a95	Fix linting	2025-05-11 11:24:52 +08:00
yangdx	68653f853a	fix: handle missing 'weight' attribute in edge data to prevent KeyError - Add validation in _find_most_related_edges_from_entities and _get_edge_data function during edge data construction - Add warning logs when 'weight' attribute is missing and set default value of 0.0	2025-05-11 11:16:32 +08:00
Arjun Rao	a1a71e7897	Merge branch 'using_keyed_lock_for_max_concurrency'	2025-05-09 12:57:31 +10:00
yangdx	d2d755db7b	Normalize keyword extraction result	2025-05-08 16:05:52 +08:00
yangdx	de40f1b5b3	Deduplicate merged relation keywords	2025-05-08 15:52:18 +08:00
yangdx	b92f9b9453	Optimizing query prompt	2025-05-08 12:53:28 +08:00
Arjun Rao	f8149790e4	Initial commit with keyed graph lock	2025-05-08 12:29:49 +10:00
yangdx	10dbbe4ebf	Fix linting	2025-05-08 04:29:43 +08:00

379 Commits