246 Commits

Author SHA1 Message Date
yangdx
247be483eb Merge branch 'main' into clear-doc 2025-04-04 05:45:06 +08:00
yangdx
399b2f14f6 Fix linting 2025-04-04 00:07:21 +08:00
yangdx
a809bc7945 Optimize parallel processing of chunk extraction 2025-04-04 00:06:42 +08:00
yangdx
6b240fa9b2 Serialize merge process to prevent race conditions 2025-04-03 21:33:46 +08:00
zrguo
75e8a10c21 Update get_keywords_from_query 2025-04-03 17:46:28 +08:00
Mykola Chaban
ce1a59b1c0 Fix trailing whitespace in docstring 2025-04-02 21:52:06 +03:00
Mykola Chaban
8e66b2a974 Added additional verification: if keywords are already provided in the query param, do not run the generation process 2025-04-02 21:15:40 +03:00
yangdx
5d517d72f5 Fix file_path error in PostgreSQL storage 2025-04-02 14:30:13 +08:00
jofoks
f349618e37 Fix: unknown filepath errors 2025-03-31 14:50:13 -07:00
yangdx
c94f30be2d Shorten log message to fit in pipelinestatus UI 2025-03-29 13:18:22 +08:00
yangdx
65574459f9 standardize .env loading behavior across modules 2025-03-29 03:48:38 +08:00
zrguo
87fbffde14 fix citation 2025-03-28 13:30:24 +08:00
omdivyatej
f049f2f5c4 linting errors 2025-03-25 15:20:09 +05:30
omdivyatej
3522da1b21 specify LLM for query 2025-03-23 21:33:49 +05:30
zrguo
486a9e8a52 fix index 2025-03-20 16:29:24 +08:00
yangdx
783e7867cf Replace print statement with logger.debug for file_path. 2025-03-18 20:39:38 +08:00
zrguo
dfd19b8d27 fix postgres support 2025-03-17 23:59:47 +08:00
zrguo
6115f60072 fix lint 2025-03-17 23:36:00 +08:00
zrguo
bf18a5406e add citation 2025-03-17 23:32:35 +08:00
zrguo
60dd13f17e fix continue prompt format error 2025-03-17 16:58:04 +08:00
zrguo
418aea3895 fix linting 2025-03-11 15:44:01 +08:00
zrguo
62b304600b clean lightrag.py 2025-03-11 15:43:04 +08:00
zrguo
91f96f2a8b
Merge pull request #1032 from ArindamRoy23/main
Filter by ID during Query for Postgres VDB
2025-03-11 15:26:59 +08:00
yangdx
9d1dc2c9c3 Fix linting 2025-03-11 12:23:51 +08:00
yangdx
061350b2bf Improve Entity Extraction Robustness for Truncated LLM Responses 2025-03-11 12:08:10 +08:00
Roy
92ae895713 Refactor requirements and code formatting
- Simplified requirements.txt by removing specific version constraints
- Added comment about extra library installation using pipmaster
- Improved code formatting in base.py, operate.py, and postgres_impl.py
- Cleaned up SQL templates and query method signatures with consistent formatting
2025-03-10 15:39:18 +00:00
yangdx
bbff3ed0ab Fix linting 2025-03-10 17:30:40 +08:00
Roy
7807379bee Remove unused ids parameter from _build_query_context function 2025-03-10 09:18:22 +00:00
yangdx
3cca18c59c Refactor pipeline status updates and entity extraction.
- Let all parallel jobs use one pipe_status object
- Improved thread safety with pipeline_status_lock
- Only pipeline jobs can add messages to pipe_status
- Marked insert_custom_chunks as deprecated
2025-03-10 16:48:59 +08:00
yangdx
adca27fae9 Merge branch 'main' into neo4j-add-min-degree 2025-03-10 02:13:49 +08:00
yangdx
c938989920 Fix llm cache save problem in json_kv storage 2025-03-09 23:33:03 +08:00
yangdx
bc42afe7b6 Unify llm_response_cache and hashing_kv, prevent creating an independent hashing_kv. 2025-03-09 22:15:26 +08:00
yangdx
c854aabde0 Add process ID to log messages for better multi-process debugging clarity
- Add PID to KV and Neo4j storage logs
- Add PID to query context logs
- Improve KV data count logging for llm cache
2025-03-09 15:25:10 +08:00
Roy
04fdc617bb main_merge 2025-03-08 20:34:29 +00:00
Roy
e31c0c8f6c Update vector query methods to support ID filtering in PostgreSQL
- Modified `mix_kg_vector_query` in operate.py to pass optional IDs to vector search
- Updated PostgreSQL SQL template to filter results using document IDs instead of chunk_id
- Improved query flexibility by allowing precise document selection during vector search
2025-03-08 20:25:20 +00:00
zrguo
548f9a8234 Update prompts 2025-03-09 01:21:39 +08:00
yangdx
6a969e8de4 Disable logging for graph database lock acquisition and release 2025-03-09 01:14:24 +08:00
yangdx
c5d0962872 Fix linting 2025-03-09 01:00:42 +08:00
yangdx
18c0770409 fix: duplicate nodes for same entity(label) problem in Neo4j
- Add entity_id field as key in Neo4j nodes
- Use entity_id for node retrieval and upsert
2025-03-09 00:24:55 +08:00
Roy
528fb11364 Refactor vector query methods to support optional ID filtering
- Updated BaseVectorStorage query method signature to accept optional IDs
- Modified operate.py to pass query parameter IDs to vector storage queries
- Updated PostgreSQL vector storage SQL templates to filter results by document IDs
- Removed unused parameters and simplified query logic across multiple files
2025-03-08 15:43:17 +00:00
yangdx
73452e63fa Add async lock for atomic graph database operations
• Introduced graph_db_lock mechanism
• Ensured atomic node/edge merge and insert operation
2025-03-08 22:48:12 +08:00
Roy
0ec61d6407 Update project dependencies and example test files
- Updated requirements.txt with latest package versions
- Added support for filtering query results by IDs in base and operate modules
- Modified PostgreSQL vector storage to include document and chunk ID fields
2025-03-07 18:45:28 +00:00
Lukas Selch
bad3781f51 Fixed entites_section_list comma error 2025-03-07 12:04:10 +01:00
zrguo
5e7ef39998 Update operate.py 2025-03-05 15:12:01 +08:00
yangdx
c0b22a8ae2 Merge branch 'main' into add-multi-worker-support 2025-03-02 02:54:57 +08:00
zrguo
4219454fab fix format 2025-03-01 17:45:06 +08:00
yangdx
3507e894d9 Merge branch 'main' into add-multi-worker-support 2025-03-01 15:55:37 +08:00
yangdx
d704512139 Refactor shared storage module to improve async handling and naming consistency
• Add async support for get_namespace_data
• Rename get_update_flags to get_update_flag
• Rename set_update_flag to set_all_update_flags
• Update docstrings for clarity
• Fix typos in log messages
2025-03-01 05:01:26 +08:00
yangdx
731d820bcc Remove redundant set_logger function and related calls 2025-02-28 21:46:45 +08:00
yangdx
c973498c34 Fix linting 2025-02-28 21:35:04 +08:00