470 Commits

Author SHA1 Message Date
zrguo
479865a271 Add max_gleaning to env 2025-07-01 17:13:33 +08:00
yangdx
e70f5a35e5 Refac: Add KG rebuild logging with pipeline status
- Logs detailed progress, including warnings and failures, to the pipeline status.
- Adds counters to report the total number of successfully rebuilt entities and relationships upon completion.
2025-06-29 21:27:12 +08:00
yangdx
3a8a99b73d feat(postgres): Implement text_chunks upsert for PGKVStorage 2025-06-28 14:37:35 +08:00
yangdx
8fb1c09b08 Refac: pipeline message 2025-06-26 01:00:54 +08:00
yangdx
bdcd55a871 Feat: Add delete upload file option to document deletion 2025-06-25 19:02:46 +08:00
yangdx
51bb0471cd Change the API for deleting documents to support deleting multiple documents at once. 2025-06-25 16:19:49 +08:00
yangdx
495d6c8cce Improve the pipeline status message for document deletion 2025-06-25 15:46:58 +08:00
yangdx
2aaa6d5f7d Fix linting 2025-06-25 14:59:45 +08:00
yangdx
8a365533d7 Add comprehensive error handling for document deletion 2025-06-25 14:58:41 +08:00
yangdx
da46b341dc feat: Optimize document deletion performance
- To enhance performance during document deletion, new batch-get methods, `get_nodes_by_chunk_ids` and `get_edges_by_chunk_ids`, have been added to the graph storage layer (`BaseGraphStorage` and its implementations). The [`adelete_by_doc_id`](lightrag/lightrag.py:1681) function now leverages these methods to avoid unnecessary iteration over the entire knowledge graph, significantly improving efficiency.
- Graph storage updated: Networkx, Neo4j, Postgres AGE
2025-06-25 12:37:57 +08:00
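The batch-get idea in this commit can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: the `InMemoryGraph` class, the `upsert_node` helper, the `"<SEP>"` separator inside `source_id`, and the inverted index are hypothetical and are not the actual `BaseGraphStorage` code. Only the method name `get_nodes_by_chunk_ids` comes from the commit message.

```python
from collections import defaultdict

SEP = "<SEP>"  # assumed separator between chunk ids stored on a node


class InMemoryGraph:
    """Hypothetical stand-in for a graph storage backend."""

    def __init__(self):
        self.nodes = {}
        # Inverted index: chunk id -> ids of nodes extracted from that chunk.
        self._chunk_to_nodes = defaultdict(set)

    def upsert_node(self, node_id, source_id):
        # Note: this sketch does not clean up stale index entries on re-upsert.
        self.nodes[node_id] = {"entity_id": node_id, "source_id": source_id}
        for chunk_id in source_id.split(SEP):
            self._chunk_to_nodes[chunk_id].add(node_id)

    def get_nodes_by_chunk_ids(self, chunk_ids):
        """Return every node that references at least one of the given chunks."""
        hit_ids = set()
        for chunk_id in chunk_ids:
            hit_ids |= self._chunk_to_nodes.get(chunk_id, set())
        # Sorted only to make the result order deterministic.
        return [self.nodes[node_id] for node_id in sorted(hit_ids)]
```

With an index of this kind, a deletion routine can fetch only the nodes touched by a document's chunks in one call rather than walking every label in the graph. A database-backed implementation would more likely express the same lookup as a single query.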
yangdx
2946bbdb71 Add TODO: There is a performance issue when iterating get_all_labels 2025-06-24 11:32:28 +08:00
yangdx
e6baffe10c Add return status to entity and relation delete operations 2025-06-23 21:39:45 +08:00
yangdx
bd487dd252 Unify document API return status string 2025-06-23 21:38:47 +08:00
yangdx
ce50135efb Improved docstring for document deletion method 2025-06-23 21:08:51 +08:00
yangdx
ebcabe29ca Remove duplicated graph db lock 2025-06-23 18:46:01 +08:00
yangdx
5099ac8213 Fix linting 2025-06-23 18:41:30 +08:00
yangdx
a215939c41 Refac: Avoid duplicate edge processing in adelete_by_doc_id 2025-06-23 18:39:36 +08:00
yangdx
a0be65d5d9 Refac: Return status and messages for delete by doc id operation 2025-06-23 17:59:27 +08:00
yangdx
9fae0eadff feat: Ensure thread safety for graph write operations
Add a lock to delete, adelete_by_entity, and adelete_by_relation methods to prevent race conditions and ensure data consistency during concurrent modifications to the knowledge graph.
2025-06-23 09:57:56 +08:00
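A minimal sketch of the locking pattern this commit describes, assuming an `asyncio.Lock` shared by all write paths. The `GraphStore` class and its data are hypothetical, and the real methods may use a different lock type. Only the method name `adelete_by_entity` is taken from the commit message.

```python
import asyncio


class GraphStore:
    """Hypothetical store whose write operations share one lock."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self.nodes = {"A", "B", "C"}
        self.edges = {("A", "B"), ("B", "C")}

    async def adelete_by_entity(self, name):
        # The whole check-then-modify sequence runs under the lock, so a
        # concurrent delete cannot interleave between the check and the write.
        async with self._lock:
            if name not in self.nodes:
                return "not_found"
            await asyncio.sleep(0)  # simulated I/O; other coroutines may run here
            self.nodes.discard(name)
            self.edges = {edge for edge in self.edges if name not in edge}
            return "success"


async def main():
    store = GraphStore()
    results = await asyncio.gather(
        store.adelete_by_entity("B"),
        store.adelete_by_entity("B"),
    )
    return store, results
```

Without the lock, both calls could pass the existence check before either one modifies the graph, and both would report success for the same entity.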
zrguo
4937de8809 Update 2025-06-22 15:12:09 +08:00
zrguo
3abdc42549 Merge branch 'main' into delete_doc 2025-06-16 17:02:21 +08:00
kwilt
09cbcc4572 fix typo: "extrat" -> extract 2025-06-09 08:28:14 -05:00
zrguo
ead82a8dbd update delete_by_doc_id 2025-06-09 18:52:34 +08:00
zrguo
40b10e8fcf Update insert_custom_kg 2025-05-27 16:07:04 +08:00
omri.alon
efccdf0838 Adding citation support in custom graph creation 2025-05-26 20:30:59 +03:00
yangdx
bb7b360269 Fix linting 2025-05-13 21:35:04 +08:00
yangdx
4d57370c94 Refactor: Move get_env_value from api.config to utils
Relocates the `get_env_value` utility function
from `lightrag.api.config` to `lightrag.utils` to decouple
LightRAG core from API Server
2025-05-10 08:58:18 +08:00
yangdx
3025094c62 Add comments for deprecated functions 2025-05-08 09:36:57 +08:00
yangdx
08e532eaf3 Remove unused text_chunks_db param from naive_query 2025-05-08 03:26:14 +08:00
yangdx
156244e260 Refactor: Unify naive context to JSON format
- Merges 'mix' mode query handling into 'hybrid' mode, simplifying query logic by removing the dedicated `mix_kg_vector_query` function
- Standardizes vector search result by using JSON string format to build context
- Fixes a bug in `query_with_keywords` ensuring `hl_keywords` and `ll_keywords` are correctly passed to `kg_query_with_keywords`
2025-05-07 17:42:14 +08:00
yangdx
365ef75447 Add deprecation comment to text_chunks storage 2025-05-07 02:03:57 +08:00
yangdx
dbfcf30801 Fix linting 2025-05-06 22:03:40 +08:00
yangdx
c8ecfa2d68 feat: Centralize configuration and update defaults
This commit introduces `lightrag/constants.py` to centralize default values for various configurations across the API and core components.

Key changes:
- Added `constants.py` to centralize default values
- Improved the `get_env_value` function in `api/config.py` to correctly handle string "None" as a None value and to catch `TypeError` during value conversion.
- Updated the default `SUMMARY_LANGUAGE` to "English"
- Set default `WORKERS` to 2
2025-05-06 22:00:43 +08:00
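The behaviour described for `get_env_value` can be sketched as follows. This is an illustrative reconstruction, not the function in `api/config.py`: the name `get_env_value_sketch`, its signature, and the constant names are assumptions. The two default values are the ones stated in the commit message.

```python
import os

# Defaults named in the commit message; the constant names are assumed.
DEFAULT_SUMMARY_LANGUAGE = "English"
DEFAULT_WORKERS = 2


def get_env_value_sketch(env_key, default, value_type=str):
    """Read an environment variable and convert it, falling back to default."""
    value = os.getenv(env_key)
    if value is None:
        return default
    if value == "None":
        # The literal string "None" is treated as an explicit None.
        return None
    try:
        return value_type(value)
    except (ValueError, TypeError):
        # Conversion failed, e.g. int("abc") or bytes("abc").
        return default
```

For example, `get_env_value_sketch("WORKERS", DEFAULT_WORKERS, int)` returns `2` when `WORKERS` is unset or not a valid integer.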
yangdx
9a41de51fb Optimize log message 2025-05-04 22:20:44 +08:00
yangdx
b9b86df786 Persistent LLM cache on error 2025-05-03 23:00:09 +08:00
yangdx
36f8787bc7 Fix linting 2025-05-01 10:04:31 +08:00
yangdx
a561be0cff Fix time zone problem of doc status 2025-05-01 02:16:19 +08:00
yangdx
0ecae90002 Enhance the function's robustness 2025-04-28 22:52:31 +08:00
yangdx
90a07b0420 Remove unused params 2025-04-28 21:14:19 +08:00
yangdx
ef69009c15 Increase the priority of queries related to LLM requests 2025-04-28 19:36:21 +08:00
yangdx
140b1b3cbb Add priority control for limited async decorator 2025-04-28 18:12:29 +08:00
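One way to combine a concurrency limit with priorities is a priority queue drained by a fixed pool of workers. The sketch below assumes that design. The decorator name `priority_limit` and the `_priority` keyword are hypothetical and are not claimed to match the decorator changed in this commit.

```python
import asyncio
import itertools


def priority_limit(max_size):
    """At most max_size calls run at once; waiting calls start lowest priority value first."""

    def decorator(func):
        queue = None
        workers = []
        counter = itertools.count()  # tie-breaker keeps equal priorities FIFO

        async def worker():
            while True:
                _prio, _seq, args, kwargs, fut = await queue.get()
                try:
                    fut.set_result(await func(*args, **kwargs))
                except Exception as exc:
                    fut.set_exception(exc)
                finally:
                    queue.task_done()

        async def wrapper(*args, _priority=10, **kwargs):
            nonlocal queue
            if queue is None:
                # Created lazily so the queue and workers belong to the running loop.
                queue = asyncio.PriorityQueue()
                workers.extend(asyncio.create_task(worker()) for _ in range(max_size))
            fut = asyncio.get_running_loop().create_future()
            await queue.put((_priority, next(counter), args, kwargs, fut))
            return await fut

        return wrapper

    return decorator


order = []  # records the order in which calls actually start


@priority_limit(1)
async def call(name):
    order.append(name)
    await asyncio.sleep(0)
    return name


async def demo():
    return await asyncio.gather(
        call("low", _priority=10),
        call("high", _priority=1),
        call("mid", _priority=5),
    )
```

With this design, a query-path call can jump ahead of queued background work by passing a smaller `_priority` value, while the total number of in-flight calls stays bounded.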
yangdx
3e385b5f81 Optimize logger info 2025-04-28 02:39:18 +08:00
yangdx
594e7b751a Fix linting 2025-04-28 02:15:25 +08:00
yangdx
18040aa95c Improve parallel handling logic between extraction and merge operation 2025-04-28 01:14:00 +08:00
yangdx
7f09972901 Optimize error log 2025-04-24 15:46:25 +08:00
yangdx
3aab5b41f2 Fix linting 2025-04-24 14:15:10 +08:00
yangdx
4f68f3e410 Using semaphore to control parallel doc processing instead of batching. 2025-04-24 13:45:44 +08:00
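The difference from batching is that a semaphore lets the next document start as soon as any slot frees up, rather than waiting for a whole batch to finish. A minimal sketch, assuming `asyncio.Semaphore`; the function names, the `max_parallel` parameter, and the simulated workload are hypothetical.

```python
import asyncio


async def process_all(doc_ids, max_parallel=2):
    """Process documents concurrently, never more than max_parallel at once."""
    sem = asyncio.Semaphore(max_parallel)
    active = 0
    peak = 0  # highest number of documents observed in flight

    async def process_one(doc_id):
        nonlocal active, peak
        async with sem:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stands in for extraction and merge work
            active -= 1
            return f"{doc_id}:done"

    results = await asyncio.gather(*(process_one(d) for d in doc_ids))
    return results, peak
```

Calling `asyncio.run(process_all(["a", "b", "c", "d", "e"], max_parallel=2))` processes all five documents while keeping at most two in flight at any moment.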
earayu
7597a5bdfb feat: support aget_docs_by_ids 2025-04-21 13:27:16 +08:00
yangdx
733e307a8d Merge branch 'stevezhangishero/main' 2025-04-20 15:18:36 +08:00
yangdx
cd01ec64d3 Add tokenizer to global_config 2025-04-20 14:51:11 +08:00