412 Commits

Author SHA1 Message Date
yangdx
e9c3503f77 Update logger info 2025-07-09 04:36:52 +08:00
yangdx
5d4484882a Merge branch 'main' into rerank 2025-07-09 03:59:04 +08:00
zrguo
c295d355a0 fix chunk_top_k limiting 2025-07-08 15:05:30 +08:00
SLKun
5f330ec11a remove <think> tag for entities and keywords extraction 2025-07-08 14:59:15 +08:00
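A minimal sketch of the kind of cleanup this commit describes, assuming the model wraps its reasoning in a literal `<think>...</think>` block ahead of the extraction output. The helper name `strip_think` is hypothetical and is not taken from the repository.

```python
import re

# Matches a <think>...</think> block, including newlines inside it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_think(text: str) -> str:
    """Remove reasoning blocks so only the extraction payload remains (hypothetical helper)."""
    return _THINK_BLOCK.sub("", text).strip()


raw = "<think>The text mentions two people.\nList them.</think>\nAlice, Bob"
cleaned_output = strip_think(raw)
```

The non-greedy `.*?` keeps the match from swallowing text between two separate reasoning blocks.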
zrguo
04a57445da update chunks truncation method 2025-07-08 13:31:05 +08:00
yangdx
56d43de58a Merge branch 'main' into merge_lock_with_key 2025-07-08 12:46:31 +08:00
zrguo
f5c80d7cde Simplify Configuration 2025-07-08 11:16:34 +08:00
zrguo
75dd4f3498 add rerank model 2025-07-07 22:44:59 +08:00
yangdx
fe13475234 Fix linting 2025-07-05 12:07:37 +08:00
yangdx
a2e59dd078 fix: prevent empty entity names after normalization in extraction
Added validation checks in entity and relationship extraction functions to filter out entities that become empty strings after normalize_extracted_info processing. This prevents empty labels from appearing in get_all_labels() results and maintains knowledge graph data integrity.
2025-07-05 12:06:34 +08:00
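The validation described above can be sketched roughly as follows. `normalize_name` is a hypothetical stand-in for `normalize_extracted_info` (whose real behavior is not shown in this log), and `filter_entities` is an illustrative name only.

```python
def normalize_name(raw: str) -> str:
    """Hypothetical stand-in for normalize_extracted_info: trims whitespace and surrounding quotes."""
    return raw.strip().strip("\"'").strip()


def filter_entities(raw_names: list[str]) -> list[str]:
    """Keep only names that are still non-empty after normalization."""
    kept = []
    for raw in raw_names:
        name = normalize_name(raw)
        if not name:
            # An empty label would pollute the graph, so drop it here.
            continue
        kept.append(name)
    return kept


cleaned = filter_entities(['"Alice"', "  ", '""', "Bob "])
```

Checking after normalization, rather than before, is the point: a value such as `'""'` is non-empty on input but empty once the quotes are stripped.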
yangdx
6c2ae40d7d Refac: Enhance KG rebuild stability by incorporating create_time into the LLM cache 2025-07-03 17:08:29 +08:00
yangdx
6b6d14bc3a fix: Deduplicate entities and relationships in a single chunk with multiple gleaning results during KG rebuild 2025-07-03 13:47:52 +08:00
yangdx
e56734cb8b Refac: Optimize document deletion performance
- Adding chunks_list to doc_status
- Adding llm_cache_list to text_chunks
- Implemented storage types: JsonKV and Redis
2025-07-03 04:18:25 +08:00
yangdx
271722405f feat: Flatten LLM cache structure for improved recall efficiency
Refactored the LLM cache to a flat Key-Value (KV) structure, replacing the previous nested format. The old structure used the 'mode' as a key and stored specific cache content as JSON nested under it. This change significantly enhances cache recall efficiency.
2025-07-02 16:11:53 +08:00
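To illustrate the difference between the two layouts, here is a small sketch that migrates a nested, mode-keyed cache into a flat key-value mapping. The `mode:cache_id` key format and the payload fields are assumptions made for the example; the commit does not state the actual key scheme.

```python
def flatten_cache(nested: dict[str, dict[str, dict]]) -> dict[str, dict]:
    """Turn {mode: {cache_id: payload}} into {"mode:cache_id": payload} (assumed key format)."""
    flat = {}
    for mode, entries in nested.items():
        for cache_id, payload in entries.items():
            flat[f"{mode}:{cache_id}"] = payload
    return flat


nested = {
    "local": {"abc": {"return": "answer 1"}},
    "global": {"def": {"return": "answer 2"}},
}
flat = flatten_cache(nested)
```

With the flat layout a single lookup such as `flat["local:abc"]` reaches one entry directly, without first loading everything stored under a mode.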
yangdx
e70f5a35e5 Refac: Add KG rebuild logging with pipeline status
- Logs detailed progress, including warnings and failures, to the pipeline status.
- Adds counters to report the total number of successfully rebuilt entities and relationships upon completion.
2025-06-29 21:27:12 +08:00
yangdx
8522bfc9dc Optimized logger info 2025-06-28 19:27:36 +08:00
yangdx
0f51ec48f1 fix: streaming error when only_need_context=True returns empty results
Prevents NoneType async iteration error by handling None responses
in stream_generator and ensuring kg_query returns valid strings.
2025-06-28 09:18:06 +08:00
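A sketch of the failure mode and one way to guard against it: iterating `async for` over `None` raises a `TypeError`, so the wrapper returns early when there is nothing to stream. The name `safe_stream` and its accepted input types are assumptions for illustration, not the project's actual `stream_generator`.

```python
import asyncio
from typing import AsyncIterator, Optional, Union


async def safe_stream(
    response: Optional[Union[str, AsyncIterator[str]]],
) -> AsyncIterator[str]:
    """Yield chunks from a response that may be None, a plain string, or an async iterator."""
    if response is None:
        return  # nothing to stream; avoids iterating over None
    if isinstance(response, str):
        yield response  # e.g. a context-only result returned as one string
        return
    async for chunk in response:
        yield chunk


async def collect(response) -> list[str]:
    return [chunk async for chunk in safe_stream(response)]


async def _demo_chunks() -> AsyncIterator[str]:
    yield "a"
    yield "b"


none_result = asyncio.run(collect(None))
str_result = asyncio.run(collect("context only"))
stream_result = asyncio.run(collect(_demo_chunks()))
```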
yangdx
495d6c8cce Improve the pipeline status message for document deletion 2025-06-25 15:46:58 +08:00
yangdx
da46b341dc feat: Optimize document deletion performance
- To enhance performance during document deletion, new batch-get methods, `get_nodes_by_chunk_ids` and `get_edges_by_chunk_ids`, have been added to the graph storage layer (`BaseGraphStorage` and its implementations). The [`adelete_by_doc_id`](lightrag/lightrag.py:1681) function now leverages these methods to avoid unnecessary iteration over the entire knowledge graph, significantly improving efficiency.
- Graph storage updated: Networkx, Neo4j, Postgres AGE
2025-06-25 12:37:57 +08:00
zrguo
3a9494ab60 Update operate.py 2025-06-09 19:47:29 +08:00
zrguo
9a71a10bc0 Update operate.py 2025-06-09 19:40:29 +08:00
zrguo
ead82a8dbd update delete_by_doc_id 2025-06-09 18:52:34 +08:00
yangdx
36a736db0b Fix node merge error 2025-05-30 12:30:24 +08:00
zrguo
40b10e8fcf Update insert_custom_kg 2025-05-27 16:07:04 +08:00
Arjun Rao
2fbfdb5b17 Merge remote-tracking branch 'upstream/main' 2025-05-14 03:12:03 +10:00
yangdx
9ec9579a95 Fix linting 2025-05-11 11:24:52 +08:00
yangdx
68653f853a fix: handle missing 'weight' attribute in edge data to prevent KeyError
- Add validation in _find_most_related_edges_from_entities and _get_edge_data function during edge data construction
- Add warning logs when 'weight' attribute is missing and set default value of 0.0
2025-05-11 11:16:32 +08:00
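The defaulting behavior described above might look like the following sketch. `edge_weight` and the logger name are hypothetical; only the default of 0.0 and the warning come from the commit message.

```python
import logging

logger = logging.getLogger("edge_demo")  # hypothetical logger name


def edge_weight(edge_props: dict, src: str, tgt: str) -> float:
    """Return the edge weight, falling back to 0.0 with a warning when it is missing."""
    if "weight" not in edge_props:
        logger.warning("Edge (%s, %s) missing 'weight'; using default 0.0", src, tgt)
        return 0.0
    return float(edge_props["weight"])


missing = edge_weight({"description": "works with"}, "Alice", "Bob")
present = edge_weight({"weight": 2}, "Alice", "Carol")
```

Direct indexing with `edge_props["weight"]` would raise `KeyError` on the first edge; the explicit check keeps processing going while leaving a trace in the logs.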
Arjun Rao
a1a71e7897 Merge branch 'using_keyed_lock_for_max_concurrency' 2025-05-09 12:57:31 +10:00
yangdx
d2d755db7b Normalize keyword extraction result 2025-05-08 16:05:52 +08:00
yangdx
de40f1b5b3 Deduplicate merged relation keywords 2025-05-08 15:52:18 +08:00
yangdx
b92f9b9453 Optimizing query prompt 2025-05-08 12:53:28 +08:00
Arjun Rao
f8149790e4 Initial commit with keyed graph lock 2025-05-08 12:29:49 +10:00
yangdx
10dbbe4ebf Fix linting 2025-05-08 04:29:43 +08:00
yangdx
ae1c9f8d10 Add user_prompt to the QueryParam 2025-05-08 03:38:47 +08:00
yangdx
08e532eaf3 Remove unused text_chunks_db param from naive_query 2025-05-08 03:26:14 +08:00
yangdx
3eb3b170ab Remove list_of_list_to_dict function 2025-05-07 18:01:23 +08:00
yangdx
156244e260 Refactor: Unify naive context to JSON format
- Merges 'mix' mode query handling into 'hybrid' mode, simplifying query logic by removing the dedicated `mix_kg_vector_query` function
- Standardizes vector search result by using JSON string format to build context
- Fixes a bug in `query_with_keywords` ensuring `hl_keywords` and `ll_keywords` are correctly passed to `kg_query_with_keywords`
2025-05-07 17:42:14 +08:00
yangdx
59771b60df Optimize relationship title to entity1 and entity2 2025-05-07 13:02:22 +08:00
yangdx
1e03888cef Change function name get_kg_context to _get_kg_context 2025-05-07 10:57:33 +08:00
yangdx
3146309fde Change function name from list_of_list_to_json to list_of_list_to_dict 2025-05-07 10:52:26 +08:00
yangdx
edb3d6ac11 Improve query context format for mix mode 2025-05-07 10:51:44 +08:00
yangdx
2485bfe53c Fix linting 2025-05-07 03:57:14 +08:00
yangdx
910a7a8936 Unified vector retrieval logic for mix and naive queries 2025-05-07 03:47:09 +08:00
yangdx
1794b57b43 Ignore chat history in vector search 2025-05-07 03:20:39 +08:00
yangdx
c984ebd462 Improve mix query context format 2025-05-07 03:11:59 +08:00
yangdx
098846b651 Improve naive query context format 2025-05-07 02:52:05 +08:00
yangdx
b1f874b489 Fix linting 2025-05-07 01:51:58 +08:00
yangdx
52d8815230 Eliminate redundant chunk data fetch for naive query mode 2025-05-07 01:46:23 +08:00
yangdx
027c67a73c Skip self-referential relationships in edge processing 2025-05-05 11:58:33 +08:00
yangdx
9ff3542ab2 Fix time handling bugs for graph data 2025-05-01 15:14:15 +08:00