126 Commits

Author SHA1 Message Date
jin
17a2ec2bc4
Merge branch 'HKUDS:main' into main 2025-01-16 09:59:27 +08:00
jin
85331e3fa2 update Oracle support
add cache support, fix bug
2025-01-10 11:36:28 +08:00
adikalra
acde4ed173 Add custom chunking function. 2025-01-09 17:20:24 +05:30
zrguo
b93203804c
Merge branch 'main' into main 2025-01-09 15:28:57 +08:00
童石渊
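dd213c95be Add a character-only splitting parameter: when enabled, only character splitting is used; when disabled, any chunk that is still too large after character splitting is split further by token size. Update the test file. 2025-01-09 11:55:49 +08:00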
zrguo
6c78c96854 fix linting errors 2025-01-07 22:02:34 +08:00
zrguo
fe7f7086b1
Merge pull request #547 from n3A87/main
Fix: Optimized logic for automatically switching modes when keywords do not exist
2025-01-07 21:51:51 +08:00
童石渊
6b19401dc6 chunk split retry 2025-01-07 16:26:12 +08:00
童石渊
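536d6f2283 Add character splitting: if the split_by_character parameter is passed to the "insert" function, the text is split by split_by_character; any resulting chunk whose token count exceeds max_token_size is then split further by token_size (todo: consider how to handle chunks that are too short after character splitting) 2025-01-07 00:28:15 +08:00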
xYLiuuuuuu
79646fced8
Fix: Optimized logic for automatically switching modes when keywords do not exist 2025-01-06 16:54:53 +08:00
Samuel Chan
6ae27d8f06 Some enhancements:
- Enable the llm_cache storage to support get_by_mode_and_id, to improve performance when using a real KV server
- Provide an option for developers to cache the LLM response when extracting entities from a document. This addresses the pain point that when processing fails partway, the already-processed chunks must be sent to the LLM again, wasting money and time. With the new option enabled (it is disabled by default), those results are cached, which can save significant time and money for beginners.
2025-01-06 12:50:05 +08:00
Magic_yuan
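7b91dc7fd8 feat: Strengthen temporal support for knowledge graph relations
- Add timestamp support to relation and vector data, recording when the knowledge was acquired
- Optimize the hybrid query strategy to consider both semantic relevance and chronological order
- Enhance the prompt templates to guide the LLM to take time into account when handling conflicting information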
2024-12-29 15:37:34 +08:00
Magic_yuan
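4c950cf4ce feat: Strengthen temporal support for knowledge graph relations
- Add timestamp support to relation and vector data, recording when the knowledge was acquired
- Optimize the hybrid query strategy to consider both semantic relevance and chronological order
- Enhance the prompt templates to guide the LLM to take time into account when handling conflicting information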
2024-12-29 15:25:57 +08:00
Magic_yuan
aaaf617451 feat(lightrag): Implement mix search mode combining knowledge graph and vector retrieval
- Add 'mix' mode to QueryParam for hybrid search functionality
- Implement mix_kg_vector_query to combine knowledge graph and vector search results
- Update LightRAG class to handle 'mix' mode queries
- Enhance README with examples and explanations for the new mix search mode
- Introduce new prompt structure for generating responses based on combined search results
2024-12-28 11:56:28 +08:00
zrguo
b7552f35aa
Merge pull request #461 from tjyiiuan/main
fix: update operate.py
2024-12-13 15:10:53 +08:00
Jiyu Tian
aac26b086e fix: update operate.py
1. Avoid referencing variables before they are assigned
2. Fix the unpack error caused by returning None when no entity is found
2024-12-12 15:47:57 -05:00
chenzihong
e9107a67c3 fix: fix variable name(entitiy->entity) 2024-12-12 23:59:40 +08:00
Magic_yuan
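b89041b5b3 feat(operate): Add entity type configuration and optimize prompt generation
- Add an entity_types parameter to the global config for customizing entity types
- Use the configured entity types instead of the defaults when generating the entity-extraction and relation-extraction prompts
- Optimize the prompt generation logic, improving the configurability and flexibility of the code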
2024-12-11 13:53:05 +08:00
Magic_yuan
316c4df949 Update log description 2024-12-10 14:15:43 +08:00
Magic_yuan
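58c0f94346 fix(lightrag): Fix the handling of chunks that have entities but no relations
- When there are only entities and no relations, continue processing instead of returning immediately
- A graph with only entities and no relations returned an empty result for high-level relation queries; the query now falls back to a local query when there are no relations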
2024-12-10 14:13:11 +08:00
Larfii
2ba20910bb fix naive_query 2024-12-09 17:45:01 +08:00
zrguo
71af34196f
Merge branch 'main' into fix-entity-name-string 2024-12-09 17:30:40 +08:00
Magic_yuan
865e76a083 Fix bug
https://github.com/HKUDS/LightRAG/issues/306
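Main changes:
Added validation when storing text chunk data to ensure only valid data is stored
Added an empty-list check before processing text chunks
Filter out invalid data before truncating text chunks
Added more warning log messages
Query changes:
Added a validity check on chunks to filter out invalid chunks: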
2024-12-09 15:08:30 +08:00
Magic_yuan
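ccf44dc334 feat(cache): Add an LLM similarity check and optimize the caching mechanism
- Add a use_llm_check parameter to the embedding cache config
- Implement the LLM similarity check logic as a secondary verification of cache hits
- Optimize the cache handling flow for naive mode
- Adjust the cache data structure, removing the unnecessary model field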
2024-12-08 17:35:52 +08:00
Saujanya Verma
5a33ce1c1a Fix: Ensure entity_or_relation_name is a string in _handle_entity_relation_summary 2024-12-06 20:54:01 +05:30
magicyuan876
8924d2b8fc Merge remote-tracking branch 'origin/main'
# Conflicts:
#	lightrag/llm.py
#	lightrag/operate.py
2024-12-06 15:06:00 +08:00
yuanxiaobin
ad4b0d1ba9 Merge remote-tracking branch 'origin/main'
# Conflicts:
#	lightrag/llm.py
#	lightrag/operate.py
2024-12-06 15:06:00 +08:00
magicyuan876
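e619b09c8a Refactor the cache handling logic
- Extract the common cache handling logic into new functions handle_cache and save_to_cache
- Use the CacheData class to unify the cache data structure
- Optimize the handling flow for the embedding cache and the regular cache
- Add a mode parameter to support cache strategies for different query modes
- Refactor the get_best_cached_response function to improve cache lookup efficiency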
2024-12-06 14:29:16 +08:00
yuanxiaobin
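584258078f Refactor the cache handling logic
- Extract the common cache handling logic into new functions handle_cache and save_to_cache
- Use the CacheData class to unify the cache data structure
- Optimize the handling flow for the embedding cache and the regular cache
- Add a mode parameter to support cache strategies for different query modes
- Refactor the get_best_cached_response function to improve cache lookup efficiency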
2024-12-06 14:29:16 +08:00
partoneplay
e82d13e182 Add support for Ollama streaming output and integrate Open-WebUI as the chat UI demo 2024-12-06 10:13:16 +08:00
partoneplay
335179196a Add support for Ollama streaming output and integrate Open-WebUI as the chat UI demo 2024-12-06 10:13:16 +08:00
Larfii
645890aff6 Fix JSON parsing error 2024-12-05 20:40:35 +08:00
Larfii
0ca819dd05 Fix JSON parsing error 2024-12-05 20:40:35 +08:00
Larfii
1e4e2ea4f3 Fix JSON parsing error 2024-12-05 20:22:44 +08:00
Larfii
570790cd37 Fix JSON parsing error 2024-12-05 20:22:44 +08:00
Larfii
9af3676991 Fix JSON parsing error 2024-12-05 18:26:55 +08:00
Larfii
5e1f317264 Fix JSON parsing error 2024-12-05 18:26:55 +08:00
Larfii
254330813a fix JSON parsing error 2024-12-05 18:13:12 +08:00
Larfii
eda9d5abeb fix JSON parsing error 2024-12-05 18:13:12 +08:00
LarFii
44d441a951 update insert custom kg 2024-12-04 19:44:04 +08:00
LarFii
db9b9f69f8 update insert custom kg 2024-12-04 19:44:04 +08:00
LarFii
be72c825d2 fix entity extract 2024-12-04 16:01:19 +08:00
LarFii
7f8460c8ec fix entity extract 2024-12-04 16:01:19 +08:00
Yizhi Zhang
3a6645b78d fix bug of example prompt 2024-12-03 22:25:50 +08:00
Yizhi Zhang
380a1d1bc3 fix bug of example prompt 2024-12-03 22:25:50 +08:00
b10902118
085032640c fix with ruff (unused import) 2024-12-01 16:09:24 +08:00
b10902118
3f9d19401f fix with ruff (unused import) 2024-12-01 16:09:24 +08:00
b10902118
753c1e6714 support JSON output for ollama and openai 2024-11-29 21:41:37 +08:00
b10902118
b0dd600429 support JSON output for ollama and openai 2024-11-29 21:41:37 +08:00
Sebastian Schramm
b0483ae91d fix templating of language in prompts 2024-11-28 14:28:29 +01:00