493 Commits

Author SHA1 Message Date
yangdx
3afdd1b67c Fix initial count error for multi-process lock with key 2025-07-11 20:39:08 +08:00
yangdx
c47747da9e Merge branch 'main' into merge_lock_with_key 2025-07-11 16:37:10 +08:00
yangdx
ef4870fda5 Combine entity and edge processing tasks and optimize merging with semaphore 2025-07-11 16:34:54 +08:00
yangdx
9aa2ed0837 Merge branch 'main' into rerank 2025-07-09 15:33:39 +08:00
yangdx
207f0a7f2a Merge branch 'main' into merge_lock_with_key 2025-07-09 09:25:28 +08:00
yangdx
cb3bfc0e5b Release semaphore before merge stage 2025-07-09 09:24:44 +08:00
Anton Vice
b192f8c9a3 Fix: Handle NoneType error when processing documents without a file path
The document processing pipeline would crash with a TypeError when a document was submitted as raw text via the API, as the file_path attribute would be None. This change adds a check to handle the None case gracefully, preventing the crash and allowing text-based documents to be indexed correctly.
2025-07-08 19:35:22 -03:00
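The guard described in this commit can be illustrated with a minimal sketch. It is not the code from the commit: the helper name `normalize_file_path` and the placeholder value `"unknown_source"` are hypothetical and chosen only for illustration.

```python
from typing import Optional


def normalize_file_path(file_path: Optional[str],
                        placeholder: str = "unknown_source") -> str:
    """Return a usable source label even when no file path was supplied.

    Hypothetical helper: documents submitted as raw text have no file
    path, so downstream string operations on None would raise TypeError.
    """
    if file_path is None or not file_path.strip():
        return placeholder
    return file_path


# Raw text submitted through the API carries no path.
print(normalize_file_path(None))          # unknown_source
print(normalize_file_path("notes.txt"))   # notes.txt
```

Normalizing once at the pipeline boundary keeps later stages free of repeated `None` checks.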
zrguo
71cb3adb4f Merge branch 'main' into rerank 2025-07-08 15:10:23 +08:00
yangdx
56d43de58a Merge branch 'main' into merge_lock_with_key 2025-07-08 12:46:31 +08:00
zrguo
f5c80d7cde Simplify Configuration 2025-07-08 11:16:34 +08:00
yangdx
9b7b2a9b0f Reduce default embedding batch size from 32 to 10 2025-07-08 11:00:09 +08:00
zrguo
75dd4f3498 add rerank model 2025-07-07 22:44:59 +08:00
yangdx
ef79088f60 Move max_graph_nodes to global config 2025-07-07 21:53:57 +08:00
yangdx
033098c1bc Feat: Add WORKSPACE support to all storage types 2025-07-07 00:57:21 +08:00
yangdx
1b2d295a4f Remove namespace_prefix 2025-07-06 00:16:47 +08:00
yangdx
98150e80b8 Improved empty/whitespace file handling
- Better detection of whitespace-only files
- Changed error to warning for empty chunks
2025-07-05 23:16:39 +08:00
xuewei
648a87653f Text chunk is blank 2025-07-05 14:28:42 +08:00
yangdx
a9e10ae810 Update logger messages 2025-07-03 14:08:19 +08:00
yangdx
e56734cb8b Refac: Optimize document deletion performance
- Adding chunks_list to doc_status
- Adding llm_cache_list to text_chunks
- Implemented storage types: JsonKV and Redis
2025-07-03 04:18:25 +08:00
zrguo
479865a271 Add max_gleaning to env 2025-07-01 17:13:33 +08:00
yangdx
e70f5a35e5 Refac: Add KG rebuild logging with pipeline status
- Logs detailed progress, including warnings and failures, to the pipeline status.
- Adds counters to report the total number of successfully rebuilt entities and relationships upon completion.
2025-06-29 21:27:12 +08:00
yangdx
3a8a99b73d feat(postgres): Implement text_chunks upsert for PGKVStorage 2025-06-28 14:37:35 +08:00
yangdx
8fb1c09b08 Refac: pipeline message 2025-06-26 01:00:54 +08:00
yangdx
bdcd55a871 Feat: Add delete upload file option to document deletion 2025-06-25 19:02:46 +08:00
yangdx
51bb0471cd Change the API for deleting documents to support deleting multiple documents at once. 2025-06-25 16:19:49 +08:00
yangdx
495d6c8cce Improve the pipeline status message for document deletion 2025-06-25 15:46:58 +08:00
yangdx
2aaa6d5f7d Fix linting 2025-06-25 14:59:45 +08:00
yangdx
8a365533d7 Add comprehensive error handling for document deletion 2025-06-25 14:58:41 +08:00
yangdx
da46b341dc feat: Optimize document deletion performance
- To enhance performance during document deletion, new batch-get methods, `get_nodes_by_chunk_ids` and `get_edges_by_chunk_ids`, have been added to the graph storage layer (`BaseGraphStorage` and its implementations). The [`adelete_by_doc_id`](lightrag/lightrag.py:1681) function now leverages these methods to avoid unnecessary iteration over the entire knowledge graph, significantly improving efficiency.
- Graph storage updated: Networkx, Neo4j, Postgres AGE
2025-06-25 12:37:57 +08:00
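The idea behind a batch lookup by chunk ID can be sketched on plain in-memory data. This is an illustration only, not the implementation in `BaseGraphStorage`: the function name `nodes_for_chunks`, the `source_id` field layout, and the `<SEP>` separator are assumptions made for the example.

```python
SEP = "<SEP>"  # assumed separator between chunk IDs in a node's source_id


def nodes_for_chunks(nodes: dict[str, dict], chunk_ids: list[str]) -> list[dict]:
    """Return every node that references at least one of the given chunks.

    A single filtering pass replaces the pattern of listing all labels
    and then fetching each node individually.
    """
    wanted = set(chunk_ids)
    matched = []
    for node_id, props in nodes.items():
        sources = set(props.get("source_id", "").split(SEP))
        if sources & wanted:  # any overlap with the requested chunks
            matched.append({"id": node_id, **props})
    return matched


graph_nodes = {
    "Alice": {"source_id": "chunk-1<SEP>chunk-2"},
    "Bob": {"source_id": "chunk-3"},
}
print([n["id"] for n in nodes_for_chunks(graph_nodes, ["chunk-2"])])  # ['Alice']
```

In a database-backed store the same filter would normally be pushed into a single query, so the caller issues one request per batch of chunk IDs rather than one per node.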
yangdx
2946bbdb71 Add TODO: There is a performance issue when iterating get_all_labels 2025-06-24 11:32:28 +08:00
yangdx
e6baffe10c Add return status to entity and relation delete operations 2025-06-23 21:39:45 +08:00
yangdx
bd487dd252 Unify document APIs' return status string 2025-06-23 21:38:47 +08:00
yangdx
ce50135efb Improved docstring for document deletion method 2025-06-23 21:08:51 +08:00
yangdx
ebcabe29ca Remove duplicated graph db lock 2025-06-23 18:46:01 +08:00
yangdx
5099ac8213 Fix linting 2025-06-23 18:41:30 +08:00
yangdx
a215939c41 Refac: Avoid duplicate edge processing in adelete_by_doc_id 2025-06-23 18:39:36 +08:00
yangdx
a0be65d5d9 Refac: Return status and messages for delete by doc id operation 2025-06-23 17:59:27 +08:00
yangdx
9fae0eadff feat: Ensure thread safety for graph write operations
Add a lock to delete, adelete_by_entity, and adelete_by_relation methods to prevent race conditions and ensure data consistency during concurrent modifications to the knowledge graph.
2025-06-23 09:57:56 +08:00
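How a single lock serializes graph writes can be shown with a small sketch. It assumes coroutine-level concurrency with `asyncio.Lock`; the class `TinyGraph` and its methods are hypothetical and are not the methods named in the commit.

```python
import asyncio


class TinyGraph:
    """Hypothetical in-memory graph whose writes are guarded by one lock."""

    def __init__(self) -> None:
        self._edges: set[tuple[str, str]] = set()
        self._lock = asyncio.Lock()

    async def add_edge(self, src: str, dst: str) -> None:
        async with self._lock:
            self._edges.add((src, dst))

    async def delete_entity(self, name: str) -> None:
        async with self._lock:
            snapshot = list(self._edges)
            # Simulated I/O: other coroutines get a chance to run here.
            await asyncio.sleep(0)
            # Safe only because no other writer can run while the lock is held.
            self._edges = {e for e in snapshot if name not in e}

    def edges(self) -> set[tuple[str, str]]:
        return set(self._edges)


async def demo() -> set[tuple[str, str]]:
    graph = TinyGraph()
    await graph.add_edge("a", "b")
    # A delete and an insert race; the lock forces them to run one at a time.
    await asyncio.gather(graph.delete_entity("a"), graph.add_edge("c", "d"))
    return graph.edges()


print(asyncio.run(demo()))  # {('c', 'd')}
```

Without the lock, the delete would rebuild the edge set from a stale snapshot and silently drop the edge added while it was suspended.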
zrguo
4937de8809 Update 2025-06-22 15:12:09 +08:00
zrguo
3abdc42549 Merge branch 'main' into delete_doc 2025-06-16 17:02:21 +08:00
kwilt
09cbcc4572 fix typo: "extrat" -> extract 2025-06-09 08:28:14 -05:00
zrguo
ead82a8dbd update delete_by_doc_id 2025-06-09 18:52:34 +08:00
zrguo
40b10e8fcf Update insert_custom_kg 2025-05-27 16:07:04 +08:00
omri.alon
efccdf0838 Adding citation support in custom graph creation 2025-05-26 20:30:59 +03:00
Arjun Rao
2fbfdb5b17 Merge remote-tracking branch 'upstream/main' 2025-05-14 03:12:03 +10:00
yangdx
bb7b360269 Fix linting 2025-05-13 21:35:04 +08:00
yangdx
4d57370c94 Refactor: Move get_env_value from api.config to utils
Relocates the `get_env_value` utility function
from `lightrag.api.config` to `lightrag.utils` to decouple
LightRAG core from API Server
2025-05-10 08:58:18 +08:00
Arjun Rao
6ad9d528b4 Updated semaphore release message 2025-05-08 14:22:11 +10:00
Arjun Rao
812ba41fd1 Merge branch 'using_keyed_lock_for_max_concurrency' 2025-05-08 14:10:44 +10:00
Arjun Rao
f8149790e4 Initial commit with keyed graph lock 2025-05-08 12:29:49 +10:00