70 Commits

Author SHA1 Message Date
Magic_yuan
650b8e38b7 feat(lightrag): Add document status tracking and checkpoint support
- Add DocStatus enum and DocProcessingStatus class for document processing state management
- Implement JsonDocStatusStorage for persistent status storage
- Add document-level deduplication in batch processing
- Add checkpoint support in ainsert method for resumable document processing
- Add status query methods for monitoring processing progress
- Update LightRAG initialization to support document status tracking
2024-12-28 00:11:25 +08:00
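A rough illustration of the status-tracking idea described in commit 650b8e38b7, not the code from that commit. It assumes documents are keyed by a content hash, that status is kept in a single JSON file, and that a re-run skips anything already marked as processed. `JsonStatusStore`, `insert_docs`, and the status values are hypothetical names chosen for this sketch.

```python
import json
import os
from enum import Enum
from hashlib import md5


class DocStatus(str, Enum):
    # Hypothetical status values; the real enum may differ.
    PENDING = "pending"
    PROCESSING = "processing"
    PROCESSED = "processed"
    FAILED = "failed"


class JsonStatusStore:
    """Hypothetical JSON-file store mapping doc id -> status record."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.data = json.load(f)

    def status_of(self, doc_id):
        record = self.data.get(doc_id)
        return DocStatus(record["status"]) if record else None

    def set(self, doc_id, status, summary):
        # Persist after every change so an interrupted run can resume.
        self.data[doc_id] = {"status": status.value, "summary": summary}
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)


def insert_docs(docs, store, process):
    """Process each unique document once; skip those already processed."""
    unique = {}
    for text in docs:
        text = text.strip()
        # Identical content hashes to the same id, which deduplicates the batch.
        unique["doc-" + md5(text.encode("utf-8")).hexdigest()] = text

    processed_now = []
    for doc_id, text in unique.items():
        if store.status_of(doc_id) == DocStatus.PROCESSED:
            continue  # checkpoint: finished in an earlier run
        store.set(doc_id, DocStatus.PROCESSING, text[:50])
        try:
            process(text)
        except Exception:
            store.set(doc_id, DocStatus.FAILED, text[:50])
            continue
        store.set(doc_id, DocStatus.PROCESSED, text[:50])
        processed_now.append(doc_id)
    return processed_now
```

Calling `insert_docs` a second time with the same store file retries only failed or new documents, which is the resumable behaviour the commit message describes.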
zrguo
457e683acd
Update lightrag.py 2024-12-26 22:14:04 +08:00
Alex Potapenko
6f71293c83 Add Gremlin graph storage 2024-12-19 17:47:42 +01:00
Weaxs
344d8f277b support TiDBGraphStorage 2024-12-18 10:57:33 +08:00
GG
2d048b5eb0 fix(llm): fix hashing_kv initialization
- hybrid mode depends on hashing_kv for more than just global_config, so simply reuse the initialization structure of llm_response_cache
2024-12-17 16:44:42 +08:00
Alex Potapenko
7564841450 Add Apache AGE graph storage 2024-12-13 20:41:38 +01:00
Weaxs
288985eab4 pre-commit fix tidb 2024-12-12 10:22:31 +08:00
Weaxs
8ef5a6b8cd support TiDB: add TiDBKVStorage, TiDBVectorDBStorage 2024-12-11 16:23:50 +08:00
zrguo
504a3c233b
Merge branch 'main' into pkaushal/vectordb-chroma 2024-12-11 14:21:36 +08:00
Pankaj Kaushal
ca788463cc feat: Add ChromaDB integration for vector storage
- Implemented `ChromaVectorDBStorage` class in `lightrag/kg/chroma_impl.py` to support ChromaDB as a vector storage backend.
- Updated `lightrag.py` to include `ChromaVectorDBStorage` in the storage class mapping.
- Added a test script `test_chromadb.py` to demonstrate the usage of ChromaDB with LightRAG, including configuration for embedding functions and ChromaDB connection settings.
- Fix lazy import function to support package context for dynamic class loading (288d4b8355).
2024-12-10 16:23:05 +01:00
david
288d4b8355 fix lazy import 2024-12-10 17:16:21 +08:00
zrguo
3e112c0d05
Merge pull request #432 from ChenZiHong-Gavin/main
fix(lightrag): use is_closed() instead of _closed
2024-12-09 18:08:43 +08:00
zrguo
4c89a1a620
Merge pull request #429 from davidleon/improvement/lazy_external_load
fix extra kwargs error: keyword_extraction.
2024-12-09 18:07:30 +08:00
chenzihong
9dd51f1f35 fix(lightrag): use is_closed() instead of _closed 2024-12-09 17:10:13 +08:00
david
9717ad87fc fix extra kwargs error: keyword_extraction.
add lazy_external_load to reduce external lib deps whenever it's not necessary for user.
2024-12-09 15:35:35 +08:00
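The lazy-loading idea mentioned in commit 9717ad87fc can be sketched as follows. This is only an assumption about the general technique (deferring the import of an optional backend until it is first used), not the repository's actual helper. `lazy_class` is a hypothetical name, and `collections.OrderedDict` stands in for an external storage class.

```python
import importlib


def lazy_class(module_name, class_name, package=None):
    """Return a factory that imports module_name only when first called.

    package is passed to importlib.import_module and is only needed
    for relative module names such as ".kg.some_impl".
    """

    def factory(*args, **kwargs):
        # The import happens here, so users who never call the factory
        # do not need the optional dependency installed.
        module = importlib.import_module(module_name, package=package)
        cls = getattr(module, class_name)
        return cls(*args, **kwargs)

    return factory


# Stand-in usage: nothing is imported until make_dict() is called.
make_dict = lazy_class("collections", "OrderedDict")
d = make_dict(a=1)
```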
Magic_yuan
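ccf44dc334 feat(cache): Add LLM similarity check and optimize the caching mechanism
- Add use_llm_check parameter to the embedding cache configuration
- Implement LLM similarity check logic as a secondary verification of cache hits
- Optimize the cache handling flow in naive mode
- Adjust the cache data structure, removing the unnecessary model field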
2024-12-08 17:35:52 +08:00
magicyuan876
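d48c6e4588 feat(lightrag): Add embedding cache for queries
- Add embedding_cache_config option to the LightRAG class
- Implement cache lookup and storage based on embedding similarity
- Add quantization and dequantization functions to compress embedding data
- Add an example demonstrating use of the embedding cache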
2024-12-06 08:17:20 +08:00
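For readers unfamiliar with the quantization step named in commit d48c6e4588, here is a minimal sketch of one common approach: linear 8-bit quantization with the minimum and maximum stored alongside the codes. The function names and the choice of 8 bits are assumptions made for illustration; the commit's own functions may work differently.

```python
def quantize(embedding):
    """Map floats to one byte each, keeping (lo, hi) for reconstruction."""
    lo, hi = min(embedding), max(embedding)
    # 255 steps between lo and hi; fall back to 1.0 for a constant vector.
    scale = (hi - lo) / 255 if hi > lo else 1.0
    codes = bytes(round((x - lo) / scale) for x in embedding)
    return codes, lo, hi


def dequantize(codes, lo, hi):
    """Approximate inverse of quantize; error is at most half a step."""
    scale = (hi - lo) / 255 if hi > lo else 1.0
    return [lo + c * scale for c in codes]


vec = [0.0, 0.5, 1.0, -1.0]
codes, lo, hi = quantize(vec)
restored = dequantize(codes, lo, hi)
```

Each value costs one byte instead of a 4- or 8-byte float, at the price of a rounding error bounded by half of `(hi - lo) / 255`.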
partoneplay
d8ba7c57f3 Add MongoDB as KV storage 2024-12-05 13:57:43 +08:00
zrguo
6d274019dd Merge pull request #393 from partoneplay/main
Add Milvus as vector storage
2024-12-05 12:05:30 +08:00
partoneplay
052322b213 Add Milvus as vector storage 2024-12-05 08:48:41 +08:00
LarFii
44d441a951 update insert custom kg 2024-12-04 19:44:04 +08:00
zrguo
6927b57520 Merge pull request #378 from doosenn/main
fix neo4jstorage bug
2024-12-04 11:11:19 +08:00
magicyuan876
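607d4f9555 Change the log file path
- Almost all of LightRAG imports the global logger object from utils, so with multiple RAG instances the logs cannot be fully written to each instance's working_dir, and when the application deletes a working_dir the deletion fails because the logger still holds a file handle
- This change simplifies the log file path: it no longer depends on the working_dir attribute, and the log file is independent of working_dir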
2024-12-04 08:44:13 +08:00
zuoluo
801619084f fix neo4jstorage bug 2024-12-03 16:04:58 +08:00
Tasha Upchurch
eae310cd68 fix for #209
function was returning a closed event loop.
2024-11-29 13:27:08 -07:00
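The failure mode named in commit eae310cd68 (a helper handing back a closed event loop) can be illustrated with the sketch below. It is written under the assumption that the helper wraps `asyncio.get_event_loop()`. `get_usable_loop` is a hypothetical name, and the code is not taken from the repository.

```python
import asyncio
import warnings


def get_usable_loop():
    """Return an open event loop, creating a fresh one if necessary."""
    try:
        with warnings.catch_warnings():
            # Some Python versions warn when no current loop is set.
            warnings.simplefilter("ignore", DeprecationWarning)
            loop = asyncio.get_event_loop()
        if loop.is_closed():
            # Use the public is_closed() check rather than a private attribute.
            raise RuntimeError("event loop is closed")
    except RuntimeError:
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    return loop


first = get_usable_loop()
first.close()
second = get_usable_loop()  # must not hand back the closed loop
result = second.run_until_complete(asyncio.sleep(0, result=7))
second.close()
```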
jin
9f3c0581ac Merge branch 'HKUDS:main' into main 2024-11-27 15:16:28 +08:00
Larfii
cb492ccb04 Add custom KG insertion 2024-11-25 18:06:19 +08:00
Larfii
8562ecdebc Add a progress bar 2024-11-25 15:04:38 +08:00
jin
1dbe803521 Merge branch 'main' of https://github.com/jin38324/LightRAG 2024-11-25 13:32:33 +08:00
jin
89c2de54a2 Optimization logic 2024-11-25 13:29:55 +08:00
LarFii
ce7f524174 Update 2024-11-19 16:52:26 +08:00
Richard
6bdf693b85 fix neo4j bug 2024-11-15 13:11:43 +08:00
Rick Battle
d4a27c901e Only update storage if there was something to insert
Before, the `finally` block would always call `_insert_done()`, which writes out the `vdb_*` and `kv_store_*` files ... even if there was nothing to insert (because all docs had already been inserted).  This was causing the speed of skippable inserts to become very slow as the graph grew.
2024-11-12 09:30:21 -07:00
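The change described in commit d4a27c901e amounts to returning before the `try`/`finally` when there is nothing new to insert, so the expensive write-out is skipped. A minimal sketch of that pattern follows. `TinyStore` and `flush` are hypothetical stand-ins for the real storage classes and `_insert_done()`.

```python
class TinyStore:
    """Hypothetical store that counts how often it writes to disk."""

    def __init__(self):
        self.docs = {}
        self.flush_count = 0

    def flush(self):
        # Stands in for writing out the vdb_* and kv_store_* files.
        self.flush_count += 1

    def insert(self, new_docs):
        fresh = {k: v for k, v in new_docs.items() if k not in self.docs}
        if not fresh:
            # Everything was inserted before: return without flushing.
            return 0
        try:
            self.docs.update(fresh)
        finally:
            # Only reached when there was something to insert.
            self.flush()
        return len(fresh)


store = TinyStore()
inserted_first = store.insert({"a": "alpha"})
inserted_again = store.insert({"a": "alpha"})  # skippable insert
```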
jin
41599897fb fix pre commit 2024-11-12 13:32:40 +08:00
jin
8bc5d4efff add Oracle support 2024-11-12 09:59:12 +08:00
LarFii
b49f73181c update 2024-11-11 17:54:22 +08:00
LarFii
4c0352ee2b Add delete method 2024-11-11 17:48:40 +08:00
jin
0b6b0064d6 Merge branch 'main' of https://github.com/jin38324/LightRAG 2024-11-11 15:21:37 +08:00
LarFii
d0c1844264 Linting 2024-11-11 10:45:22 +08:00
jin
d68fc3d248 support Oracle DB 2024-11-08 16:20:34 +08:00
jin
5e9d19d5a3 support Oracle Database storage 2024-11-08 16:12:58 +08:00
jin
594470ab56 Oracle Database support
Add oracle 23ai database as the KV/vector/graph storage
2024-11-08 14:58:41 +08:00
zrguo
9463d588fa Merge branch 'main' into main 2024-11-07 14:54:15 +08:00
Ken Wiltshire
3d5d083f42 fix event loop conflict 2024-11-06 11:18:14 -05:00
Ken Wiltshire
8420cd1c77 event loop issue 2024-11-06 10:49:14 -05:00
benx13
6f77f54c6d bug fix issue #95 2024-11-05 18:36:59 -08:00
Ken Wiltshire
8bd5d9b5b2 using neo4j async 2024-11-02 18:35:07 -04:00
wiltshirek
f40725feeb Merge branch 'main' into main 2024-11-01 16:50:45 -04:00
Ken Wiltshire
f375620992 cleaning code for pull 2024-11-01 16:11:19 -04:00
Ken Wiltshire
b41d990fd6 securing for production with env vars for creds 2024-11-01 11:01:50 -04:00