LightRAG

mirror of https://github.com/HKUDS/LightRAG.git synced 2025-07-03 07:04:04 +00:00

Author	SHA1	Message	Date
adikalra	acde4ed173	Add custom chunking function.	2025-01-09 17:20:24 +05:30
zrguo	b93203804c	Merge branch 'main' into main	2025-01-09 15:28:57 +08:00
zrguo	92ccfa2770	Merge pull request #555 from ParisNeo/main Restore backwards compatibility for LightRAG's ainsert method	2025-01-09 15:27:09 +08:00
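童石渊	dd213c95be	Add a character-only split parameter: when enabled, only character splitting is used; when disabled, any chunk that is still too large after character splitting is split further by token size. Update the test file.	2025-01-09 11:55:49 +08:00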
Saifeddine ALOUI	65c1450c66	fixed retro compatibility with ainsert by making split_by_character get a None default value	2025-01-08 20:50:22 +01:00
Gurjot Singh	9565a4663a	Fix trailing whitespace and formatting issues in lightrag.py	2025-01-09 00:39:22 +05:30
Gurjot Singh	a940251390	Implement custom chunking feature	2025-01-07 20:57:39 +05:30
童石渊	6b19401dc6	chunk split retry	2025-01-07 16:26:12 +08:00
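童石渊	536d6f2283	Add character splitting: if the split_by_character parameter is passed to the "insert" function, the text is split on split_by_character; any resulting chunk whose token count exceeds max_token_size is then split further by token_size (todo: consider how to handle chunks that are too short after character splitting)	2025-01-07 00:28:15 +08:00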
zrguo	990b684a85	Update lightrag.py	2025-01-06 15:27:31 +08:00
Samuel Chan	6ae27d8f06	Some enhancements: - Enable the llm_cache storage to support get_by_mode_and_id, to improve performance when using a real KV server - Provide an option for developers to cache the LLM response when extracting entities for a document. This solves the pain point that when the process fails, the already-processed chunks must be sent to the LLM again, wasting money and time. With the new option enabled (it is disabled by default), that result is cached, which can save significant time and money for beginners.	2025-01-06 12:50:05 +08:00
Samuel Chan	60e8a355f0	Merge branch 'HKUDS:main' into main	2025-01-03 21:18:17 +08:00
Samuel Chan	b17cb2aa95	With a draft for progres_impl	2025-01-01 22:43:59 +08:00
zrguo	d489d9dec0	fix linting errors	2024-12-31 17:32:04 +08:00
zrguo	cee5b2fbb0	add delete by doc id	2024-12-31 17:15:57 +08:00
Magic_yuan	aaaf617451	feat(lightrag): Implement mix search mode combining knowledge graph and vector retrieval - Add 'mix' mode to QueryParam for hybrid search functionality - Implement mix_kg_vector_query to combine knowledge graph and vector search results - Update LightRAG class to handle 'mix' mode queries - Enhance README with examples and explanations for the new mix search mode - Introduce new prompt structure for generating responses based on combined search results	2024-12-28 11:56:28 +08:00
Magic_yuan	650b8e38b7	feat(lightrag): Add document status tracking and checkpoint support - Add DocStatus enum and DocProcessingStatus class for document processing state management - Implement JsonDocStatusStorage for persistent status storage - Add document-level deduplication in batch processing - Add checkpoint support in ainsert method for resumable document processing - Add status query methods for monitoring processing progress - Update LightRAG initialization to support document status tracking	2024-12-28 00:11:25 +08:00
zrguo	457e683acd	Update lightrag.py	2024-12-26 22:14:04 +08:00
Alex Potapenko	6f71293c83	Add Gremlin graph storage	2024-12-19 17:47:42 +01:00
Weaxs	344d8f277b	support TiDBGraphStorage	2024-12-18 10:57:33 +08:00
GG	2d048b5eb0	fix(llm): fix hashing_kv initialization - hybrid mode depends on more of hashing_kv than just global_config, so simply reuse the initialization structure of llm_response_cache	2024-12-17 16:44:42 +08:00
Alex Potapenko	7564841450	Add Apache AGE graph storage	2024-12-13 20:41:38 +01:00
Weaxs	288985eab4	pre-commit fix tidb	2024-12-12 10:22:31 +08:00
Weaxs	8ef5a6b8cd	support TiDB: add TiDBKVStorage, TiDBVectorDBStorage	2024-12-11 16:23:50 +08:00
zrguo	504a3c233b	Merge branch 'main' into pkaushal/vectordb-chroma	2024-12-11 14:21:36 +08:00
Pankaj Kaushal	ca788463cc	feat: Add ChromaDB integration for vector storage - Implemented `ChromaVectorDBStorage` class in `lightrag/kg/chroma_impl.py` to support ChromaDB as a vector storage backend. - Updated `lightrag.py` to include `ChromaVectorDBStorage` in the storage class mapping. - Added a test script `test_chromadb.py` to demonstrate the usage of ChromaDB with LightRAG, including configuration for embedding functions and ChromaDB connection settings. - fix lazy import function to support package context for dynamic class loading. `288d4b8355`	2024-12-10 16:23:05 +01:00
david	288d4b8355	fix lazy import	2024-12-10 17:16:21 +08:00
zrguo	3e112c0d05	Merge pull request #432 from ChenZiHong-Gavin/main fix(lightrag): use is_closed() instead of _closed	2024-12-09 18:08:43 +08:00
zrguo	4c89a1a620	Merge pull request #429 from davidleon/improvement/lazy_external_load fix extra kwargs error: keyword_extraction.	2024-12-09 18:07:30 +08:00
chenzihong	9dd51f1f35	fix(lightrag): use is_closed() instead of _closed	2024-12-09 17:10:13 +08:00
david	9717ad87fc	fix extra kwargs error: keyword_extraction. add lazy_external_load to reduce external lib deps whenever it's not necessary for user.	2024-12-09 15:35:35 +08:00
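Magic_yuan	ccf44dc334	feat(cache): Add LLM similarity check and optimize the caching mechanism - Add the use_llm_check parameter to the embedding cache config - Implement LLM similarity check logic as a secondary verification of cache hits - Optimize the cache handling flow for naive mode - Adjust the cache data structure, removing the unnecessary model field	2024-12-08 17:35:52 +08:00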
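magicyuan876	d48c6e4588	feat(lightrag): Add embedding cache for queries - Add the embedding_cache_config option to the LightRAG class - Implement cache lookup and storage based on embedding similarity - Add quantization and dequantization functions for compressing embedding data - Add an example demonstrating embedding cache usage	2024-12-06 08:17:20 +08:00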
partoneplay	d8ba7c57f3	Add MongoDB as KV storage	2024-12-05 13:57:43 +08:00
zrguo	6d274019dd	Merge pull request #393 from partoneplay/main Add Milvus as vector storage	2024-12-05 12:05:30 +08:00
partoneplay	052322b213	Add Milvus as vector storage	2024-12-05 08:48:41 +08:00
LarFii	44d441a951	update insert custom kg	2024-12-04 19:44:04 +08:00
zrguo	6927b57520	Merge pull request #378 from doosenn/main fix neo4jstorage bug	2024-12-04 11:11:19 +08:00
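magicyuan876	607d4f9555	Change the log file path - Because almost all of LightRAG imports the global logger object from utils, with multiple rag instances the logs cannot be fully written to the corresponding working_dir, and deleting working_dir from within the application fails because of the logger's open file handle - This change simplifies the log file path so that it no longer depends on the working_dir attribute; the log file is independent of working_dir	2024-12-04 08:44:13 +08:00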
zuoluo	801619084f	fix neo4jstorage bug	2024-12-03 16:04:58 +08:00
Tasha Upchurch	eae310cd68	fix for #209 function was returning a closed event loop.	2024-11-29 13:27:08 -07:00
jin	9f3c0581ac	Merge branch 'HKUDS:main' into main	2024-11-27 15:16:28 +08:00
Larfii	cb492ccb04	Add custom KG insertion	2024-11-25 18:06:19 +08:00
Larfii	8562ecdebc	Add a progress bar	2024-11-25 15:04:38 +08:00
jin	1dbe803521	Merge branch 'main' of https://github.com/jin38324/LightRAG	2024-11-25 13:32:33 +08:00
jin	89c2de54a2	Optimization logic	2024-11-25 13:29:55 +08:00
LarFii	ce7f524174	Update	2024-11-19 16:52:26 +08:00
Richard	6bdf693b85	fix neo4j bug	2024-11-15 13:11:43 +08:00
Rick Battle	d4a27c901e	Only update storage if there was something to insert Before, the `finally` block would always call `_insert_done()`, which writes out the `vdb_` and `kv_store_` files ... even if there was nothing to insert (because all docs had already been inserted). This was causing the speed of skippable inserts to become very slow as the graph grew.	2024-11-12 09:30:21 -07:00
jin	41599897fb	fix pre commit	2024-11-12 13:32:40 +08:00

86 Commits