269 Commits

Author SHA1 Message Date
yangdx
7fd3053e61 Update log message 2025-04-12 21:10:36 +08:00
yangdx
2ac66c3531 Remove Chinese quotes in entity name 2025-04-12 20:45:41 +08:00
yangdx
0eed5eb718 feat: implement entity/relation name and description normalization
- Remove spaces between Chinese characters
- Remove spaces between Chinese and English/numbers
- Preserve spaces within English text and numbers
- Replace Chinese parentheses with English parentheses
- Replace Chinese dash with English dash
2025-04-12 19:26:02 +08:00
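
The normalization rules listed in 0eed5eb718 can be illustrated with a minimal sketch. This is not the code from the commit: the function name `normalize_name`, the CJK range used to decide what counts as a Chinese character, and the choice of U+2014 as the "Chinese dash" are all assumptions made for illustration.

```python
import re

# Assumption: "Chinese characters" means the CJK Unified Ideographs block.
_CJK = r"\u4e00-\u9fff"


def normalize_name(text: str) -> str:
    """Hypothetical helper applying the rules from the commit message."""
    # Replace full-width (Chinese) parentheses with ASCII parentheses.
    text = text.replace("\uff08", "(").replace("\uff09", ")")
    # Replace the Chinese dash (assumed here to be U+2014) with an ASCII hyphen.
    text = text.replace("\u2014", "-")
    # Remove spaces between two Chinese characters.
    text = re.sub(rf"(?<=[{_CJK}])\s+(?=[{_CJK}])", "", text)
    # Remove spaces between Chinese and English letters or digits, in either order.
    text = re.sub(rf"(?<=[{_CJK}])\s+(?=[A-Za-z0-9])", "", text)
    text = re.sub(rf"(?<=[A-Za-z0-9])\s+(?=[{_CJK}])", "", text)
    # Spaces inside English text and numbers are left untouched.
    return text
```

Because only whitespace with a Chinese character on at least one side is removed, a phrase such as "iPhone 15" keeps its internal space.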
yangdx
96f439bb52 Optimize pipeline status message 2025-04-10 21:19:26 +08:00
yangdx
7d69449c67 Fix linting 2025-04-10 20:32:40 +08:00
yangdx
339bc99259 Only merge new entities/edges during gleaning
- Restrict gleaning to new entity names
- Only add edges with new keys
- Prevent similar descriptions of the same entity or edge
2025-04-10 20:31:52 +08:00
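
The gleaning restriction described in 339bc99259 might look roughly like the sketch below. The function name `merge_gleaning` and the data shapes (dicts keyed by entity name or by a source/target tuple) are assumptions for illustration, not taken from the repository.

```python
def merge_gleaning(nodes, edges, glean_nodes, glean_edges):
    """Hypothetical helper: keep only names and keys the first pass did not find.

    nodes / glean_nodes: dict mapping entity name -> list of records
    edges / glean_edges: dict mapping (source, target) -> list of records
    """
    for name, records in glean_nodes.items():
        # Skip entities already extracted, so their descriptions are not duplicated.
        if name not in nodes:
            nodes[name] = records
    for key, records in glean_edges.items():
        # Only add edges whose key is new.
        if key not in edges:
            edges[key] = records
    return nodes, edges
```

Dropping repeated names during gleaning is one way to avoid accumulating near-identical descriptions for the same entity or edge.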
Daniel.y
0528c06209
Merge pull request #1334 from danielaskdd/main
Refactor entity and edge merging and add env FORCE_LLM_SUMMARY_ON_MERGE
2025-04-10 18:46:04 +08:00
yangdx
3007dff153 Add env FORCE_LLM_SUMMARY_ON_MERGE 2025-04-10 17:29:07 +08:00
yangdx
35431644ad Refactor: Entity and edge merging in extract_entities
- Improves efficiency by merging identical entities and edges in a single operation
- Ensures proper handling of undirected graph edges
- Changes merge stage from chunk level to document level
2025-04-10 14:19:06 +08:00
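
A minimal sketch of the idea behind 35431644ad, under stated assumptions: per-chunk results are collected first, then grouped once for the whole document, with edge keys sorted so that (A, B) and (B, A) are treated as the same undirected edge. The name `group_edges` and the input shape are hypothetical.

```python
from collections import defaultdict


def group_edges(chunk_results):
    """Hypothetical helper: group edge records from all chunks of one document.

    chunk_results: iterable of dicts mapping (source, target) -> list of records
    """
    merged = defaultdict(list)
    for edges in chunk_results:
        for (src, tgt), records in edges.items():
            # Sort the endpoints so both directions share one key (undirected graph).
            key = tuple(sorted((src, tgt)))
            merged[key].extend(records)
    return dict(merged)
```

Grouping at document level means each distinct edge is merged once, rather than once per chunk in which it appears.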
zrguo
1c1afb4eaf
Merge pull request #1304 from FeHuynhVI/main
Fixes a bug where file_path was not present in the dictionary
2025-04-10 14:08:11 +10:00
yangdx
39b4da41a3 Optimize logging 2025-04-10 04:17:32 +08:00
yangdx
496f87a1e6 Fix linting 2025-04-10 03:58:04 +08:00
yangdx
8d858da4d0 Fix: LLM cache now works for node and edge merging 2025-04-10 03:57:36 +08:00
yangdx
5d286dd0fa Add node and edge merging log to pipeline_status 2025-04-10 01:07:06 +08:00
IcySugar
224f63cd5f
Merge branch 'HKUDS:main' into main 2025-04-09 13:40:19 +08:00
yangdx
740b4174d2 Fix linting 2025-04-09 12:59:58 +08:00
yangdx
692415b2e1 Fix mix_kg_vector_query function return value error when only_need_context is enabled 2025-04-09 12:59:32 +08:00
IcySugar000
8aa3cd799a Fix: handle null values and ensure exceptions are avoided 2025-04-09 11:32:05 +08:00
FeHuynhVI
881943b8ae
Update operate.py 2025-04-08 01:22:11 +07:00
yangdx
b2284c8b9d Fix linting 2025-04-06 17:45:32 +08:00
yangdx
b45c5f9304 Change get_by_id batch size from 25 to 5 to conserve DB connection resources 2025-04-06 17:42:13 +08:00
Alex Z
e69a128832
Merge branch 'main' into main 2025-04-05 15:27:59 -07:00
yangdx
247be483eb Merge branch 'main' into clear-doc 2025-04-04 05:45:06 +08:00
yangdx
399b2f14f6 Fix linting 2025-04-04 00:07:21 +08:00
yangdx
a809bc7945 Optimize parallel processing of chunk extraction 2025-04-04 00:06:42 +08:00
yangdx
6b240fa9b2 Serialize merge process to prevent race conditions 2025-04-03 21:33:46 +08:00
zrguo
75e8a10c21 Update get_keywords_from_query 2025-04-03 17:46:28 +08:00
Alex Z
d0d246bef8 Fix 'TOO MANY OPEN FILES' problem while using Redis vector DB:
Enhance RedisKVStorage: Implement connection pooling and error handling. Refactor async methods to use context managers for Redis operations, improving resource management and error logging. Batch processing added for key operations to optimize performance.
2025-04-02 21:06:49 -07:00
Mykola Chaban
ce1a59b1c0 Fix trailing whitespace in docstring 2025-04-02 21:52:06 +03:00
Mykola Chaban
8e66b2a974 Add check: if keywords are already provided in the query param, do not run the generation process 2025-04-02 21:15:40 +03:00
yangdx
5d517d72f5 Fix file_path error in PostgreSQL storage 2025-04-02 14:30:13 +08:00
jofoks
f349618e37 Fix: unknown filepath errors 2025-03-31 14:50:13 -07:00
yangdx
c94f30be2d Shorten log message to fit in pipelinestatus UI 2025-03-29 13:18:22 +08:00
yangdx
65574459f9 standardize .env loading behavior across modules 2025-03-29 03:48:38 +08:00
zrguo
87fbffde14 fix citation 2025-03-28 13:30:24 +08:00
omdivyatej
f049f2f5c4 linting errors 2025-03-25 15:20:09 +05:30
omdivyatej
3522da1b21 specify LLM for query 2025-03-23 21:33:49 +05:30
zrguo
486a9e8a52 fix index 2025-03-20 16:29:24 +08:00
yangdx
783e7867cf Replace print statement with logger.debug for file_path. 2025-03-18 20:39:38 +08:00
zrguo
dfd19b8d27 fix postgres support 2025-03-17 23:59:47 +08:00
zrguo
6115f60072 fix lint 2025-03-17 23:36:00 +08:00
zrguo
bf18a5406e add citation 2025-03-17 23:32:35 +08:00
zrguo
60dd13f17e fix continue prompt format error 2025-03-17 16:58:04 +08:00
zrguo
418aea3895 fix linting 2025-03-11 15:44:01 +08:00
zrguo
62b304600b clean lightrag.py 2025-03-11 15:43:04 +08:00
zrguo
91f96f2a8b
Merge pull request #1032 from ArindamRoy23/main
Filter by ID during Query for Postgres VDB
2025-03-11 15:26:59 +08:00
yangdx
9d1dc2c9c3 Fix linting 2025-03-11 12:23:51 +08:00
yangdx
061350b2bf Improve Entity Extraction Robustness for Truncated LLM Responses 2025-03-11 12:08:10 +08:00
Roy
92ae895713 Refactor requirements and code formatting
- Simplified requirements.txt by removing specific version constraints
- Added comment about extra library installation using pipmaster
- Improved code formatting in base.py, operate.py, and postgres_impl.py
- Cleaned up SQL templates and query method signatures with consistent formatting
2025-03-10 15:39:18 +00:00
yangdx
bbff3ed0ab Fix linting 2025-03-10 17:30:40 +08:00