251 Commits

Author SHA1 Message Date
FeHuynhVI
881943b8ae Update operate.py 2025-04-08 01:22:11 +07:00
yangdx
b2284c8b9d Fix linting 2025-04-06 17:45:32 +08:00
yangdx
b45c5f9304 Change get_by_id batch size from 25 to 5 to reserve db connection resources 2025-04-06 17:42:13 +08:00
Alex Z
e69a128832 Merge branch 'main' into main 2025-04-05 15:27:59 -07:00
yangdx
247be483eb Merge branch 'main' into clear-doc 2025-04-04 05:45:06 +08:00
yangdx
399b2f14f6 Fix linting 2025-04-04 00:07:21 +08:00
yangdx
a809bc7945 Optimize parallel processing on chunks extraction 2025-04-04 00:06:42 +08:00
yangdx
6b240fa9b2 Serialize merge process to prevent race conditions 2025-04-03 21:33:46 +08:00
zrguo
75e8a10c21 Update get_keywords_from_query 2025-04-03 17:46:28 +08:00
Alex Z
d0d246bef8 Fix 'TOO MANY OPEN FILE' problem while using redis vector DB:
Enhance RedisKVStorage: Implement connection pooling and error handling. Refactor async methods to use context managers for Redis operations, improving resource management and error logging. Batch processing added for key operations to optimize performance.
2025-04-02 21:06:49 -07:00
Mykola Chaban
ce1a59b1c0 Fix trailing whitespace in docstring 2025-04-02 21:52:06 +03:00
Mykola Chaban
8e66b2a974 Added additional verification: if keywords are already provided in the query param, do not run the generation process 2025-04-02 21:15:40 +03:00
yangdx
5d517d72f5 Fix file_path error in PostgreSQL storage 2025-04-02 14:30:13 +08:00
jofoks
f349618e37 Fix: unknown filepath errors 2025-03-31 14:50:13 -07:00
yangdx
c94f30be2d Shorten log message to fit in pipelinestatus UI 2025-03-29 13:18:22 +08:00
yangdx
65574459f9 standardize .env loading behavior across modules 2025-03-29 03:48:38 +08:00
zrguo
87fbffde14 fix citation 2025-03-28 13:30:24 +08:00
omdivyatej
f049f2f5c4 linting errors 2025-03-25 15:20:09 +05:30
omdivyatej
3522da1b21 specify LLM for query 2025-03-23 21:33:49 +05:30
zrguo
486a9e8a52 fix index 2025-03-20 16:29:24 +08:00
yangdx
783e7867cf Replace print statement with logger.debug for file_path. 2025-03-18 20:39:38 +08:00
zrguo
dfd19b8d27 fix postgres support 2025-03-17 23:59:47 +08:00
zrguo
6115f60072 fix lint 2025-03-17 23:36:00 +08:00
zrguo
bf18a5406e add citation 2025-03-17 23:32:35 +08:00
zrguo
60dd13f17e fix continue prompt format error 2025-03-17 16:58:04 +08:00
zrguo
418aea3895 fix linting 2025-03-11 15:44:01 +08:00
zrguo
62b304600b clean lightrag.py 2025-03-11 15:43:04 +08:00
zrguo
91f96f2a8b Merge pull request #1032 from ArindamRoy23/main
Filter by ID during Query for Postgres VDB
2025-03-11 15:26:59 +08:00
yangdx
9d1dc2c9c3 Fix linting 2025-03-11 12:23:51 +08:00
yangdx
061350b2bf Improve Entity Extraction Robustness for Truncated LLM Responses 2025-03-11 12:08:10 +08:00
Roy
92ae895713 Refactor requirements and code formatting
- Simplified requirements.txt by removing specific version constraints
- Added comment about extra library installation using pipmaster
- Improved code formatting in base.py, operate.py, and postgres_impl.py
- Cleaned up SQL templates and query method signatures with consistent formatting
2025-03-10 15:39:18 +00:00
yangdx
bbff3ed0ab Fix linting 2025-03-10 17:30:40 +08:00
Roy
7807379bee Remove unused ids parameter from _build_query_context function 2025-03-10 09:18:22 +00:00
yangdx
3cca18c59c Refactor pipeline status updates and entity extraction.
- Let all parallel jobs use one pipe_status object
- Improved thread safety with pipeline_status_lock
- Only pipeline jobs can add messages to pipe_status
- Marked insert_custom_chunks as deprecated
2025-03-10 16:48:59 +08:00
yangdx
adca27fae9 Merge branch 'main' into neo4j-add-min-degree 2025-03-10 02:13:49 +08:00
yangdx
c938989920 Fix llm cache save problem in json_kv storage 2025-03-09 23:33:03 +08:00
yangdx
bc42afe7b6 Unify llm_response_cache and hashing_kv, prevent creating an independent hashing_kv. 2025-03-09 22:15:26 +08:00
yangdx
c854aabde0 Add process ID to log messages for better multi-process debugging clarity
- Add PID to KV and Neo4j storage logs
- Add PID to query context logs
- Improve KV data count logging for llm cache
2025-03-09 15:25:10 +08:00
Roy
04fdc617bb main_merge 2025-03-08 20:34:29 +00:00
Roy
e31c0c8f6c Update vector query methods to support ID filtering in PostgreSQL
- Modified `mix_kg_vector_query` in operate.py to pass optional IDs to vector search
- Updated PostgreSQL SQL template to filter results using document IDs instead of chunk_id
- Improved query flexibility by allowing precise document selection during vector search
2025-03-08 20:25:20 +00:00
zrguo
548f9a8234 Update prompts 2025-03-09 01:21:39 +08:00
yangdx
6a969e8de4 Disable logging for graph database lock acquisition and release 2025-03-09 01:14:24 +08:00
yangdx
c5d0962872 Fix linting 2025-03-09 01:00:42 +08:00
yangdx
18c0770409 fix: duplicate nodes for same entity(label) problem in Neo4j
- Add entity_id field as key in Neo4j nodes
- Use entity_id for node retrieval and upsert
2025-03-09 00:24:55 +08:00
Roy
528fb11364 Refactor vector query methods to support optional ID filtering
- Updated BaseVectorStorage query method signature to accept optional IDs
- Modified operate.py to pass query parameter IDs to vector storage queries
- Updated PostgreSQL vector storage SQL templates to filter results by document IDs
- Removed unused parameters and simplified query logic across multiple files
2025-03-08 15:43:17 +00:00
yangdx
73452e63fa Add async lock for atomic graph database operations
• Introduced graph_db_lock mechanism
• Ensured atomic node/edge merge and insert operation
2025-03-08 22:48:12 +08:00
Roy
0ec61d6407 Update project dependencies and example test files
- Updated requirements.txt with latest package versions
- Added support for filtering query results by IDs in base and operate modules
- Modified PostgreSQL vector storage to include document and chunk ID fields
2025-03-07 18:45:28 +00:00
Lukas Selch
bad3781f51 Fixed entites_section_list comma error 2025-03-07 12:04:10 +01:00
zrguo
5e7ef39998 Update operate.py 2025-03-05 15:12:01 +08:00
yangdx
c0b22a8ae2 Merge branch 'main' into add-multi-worker-support 2025-03-02 02:54:57 +08:00