39 Commits

Author SHA1 Message Date
yangdx
7017f114e1 Merge branch 'main' into select-datastore-in-api-server 2025-02-13 11:25:52 +08:00
yangdx
76164a1b17 Use namespace for graph_name before falling back to env or default value
- Update graph_name initialization
- Add namespace override support
- Maintain backward compatibility
- Prioritize namespace over env variable
2025-02-13 04:52:54 +08:00
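
The lookup order described in this commit (namespace first, then the environment variable, then a default) could be sketched roughly as follows. This is an illustration only, not the code from the commit: the function name `resolve_graph_name`, the environment variable `EXAMPLE_GRAPH_NAME`, and the default value are all hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_graph_name(
    namespace: Optional[str],
    env: Optional[Mapping[str, str]] = None,
    default: str = "default_graph",  # hypothetical default value
) -> str:
    """Pick a graph name: namespace first, then env variable, then default."""
    # Allow a mapping to be passed in so the function is easy to test;
    # fall back to the real process environment otherwise.
    env = os.environ if env is None else env
    if namespace:
        # A non-empty namespace overrides everything else.
        return namespace
    # EXAMPLE_GRAPH_NAME is a hypothetical variable name used for illustration.
    return env.get("EXAMPLE_GRAPH_NAME") or default
```

Because the environment variable is still consulted when no namespace is given, callers that relied on it before keep working, which matches the backward-compatibility bullet above.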
yangdx
ed73ea4076 Fix linting 2025-02-13 04:12:00 +08:00
yangdx
f01f57d0da refactor: make cosine similarity threshold a required config parameter
• Remove default threshold from env var
• Add validation for missing threshold
• Move default to lightrag.py config init
• Update all vector DB implementations
• Improve threshold validation consistency
2025-02-13 03:25:48 +08:00
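
A minimal sketch of what "required config parameter" with validation might look like, assuming the threshold arrives in a plain dictionary. The key name `cosine_threshold` and the helper `read_cosine_threshold` are hypothetical and not taken from the commit.

```python
from typing import Any, Mapping


def read_cosine_threshold(config: Mapping[str, Any]) -> float:
    """Return the threshold from config, failing loudly if it is missing."""
    # "cosine_threshold" is a hypothetical key used only for this sketch.
    value = config.get("cosine_threshold")
    if value is None:
        # No silent fallback here: the caller (for example a central
        # config initializer) is expected to supply the default.
        raise ValueError("cosine_threshold must be set in the storage config")
    return float(value)
```

Keeping the default in one central place and validating in each storage backend means every vector DB implementation fails the same way when the value is absent.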
yangdx
7a89916bab Add method to retrieve in-progress documents in DocStatusStorage
• Add get_processing_docs() abstract method
• Override get_processing_docs() in PG storage
• Method retrieves docs with PROCESSING status
• Keep consistent with existing status methods
2025-02-13 01:27:27 +08:00
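
The pattern described here, an abstract method on the base storage class with a concrete override, could look something like the sketch below. Only the names `DocStatusStorage`, `get_processing_docs`, and the `PROCESSING` status come from the commit message; the enum, the in-memory subclass, and the synchronous signature are assumptions made to keep the example self-contained.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Dict


class DocStatus(Enum):
    # Hypothetical status values; only PROCESSING is named in the commit.
    PENDING = "pending"
    PROCESSING = "processing"
    PROCESSED = "processed"


class DocStatusStorage(ABC):
    @abstractmethod
    def get_processing_docs(self) -> Dict[str, Dict[str, Any]]:
        """Return all documents whose status is PROCESSING."""


class InMemoryDocStatusStorage(DocStatusStorage):
    """Hypothetical stand-in for a real backend such as the PG storage."""

    def __init__(self, docs: Dict[str, Dict[str, Any]]) -> None:
        self._docs = dict(docs)

    def get_processing_docs(self) -> Dict[str, Dict[str, Any]]:
        # Filter by status, mirroring how other status getters would work.
        return {
            doc_id: doc
            for doc_id, doc in self._docs.items()
            if doc["status"] is DocStatus.PROCESSING
        }
```

A database-backed override would presumably replace the dictionary filter with a query on the status column.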
yangdx
7c7cac1cfd fix: remove unnecessary param binding, use direct workspace string interpolation 2025-02-13 00:39:40 +08:00
yangdx
3372af7c3d refactor: remove injected db field from PGDocStatusStorage; it must be injected after the object is created 2025-02-12 22:54:22 +08:00
yangdx
7b79427097 refactor: improve database initialization by centralizing db instance injection
- Move db configs to separate methods
- Remove db field defaults in storage classes
- Add _initialize_database_if_needed method
- Inject db instances during initialization
- Clean up storage implementation code
2025-02-12 22:25:34 +08:00
yangdx
fc0f522ed5 Merge branch 'main' into select-datastore-in-api-server 2025-02-12 09:49:18 +08:00
ArnoChen
9daab4340c add MongoDocStatusStorage
remove unnecessary logging

format
2025-02-12 04:13:48 +08:00
zrguo
18acb4a2b1 fix linting error 2025-02-11 22:16:35 +08:00
yangdx
8a56a5ea6c fix: Add content column to doc status and fix SQL parameter indexing
• Add content column to doc status table
• Fix SQL param index in get_by_status query
• Update insert SQL to include content field
2025-02-11 16:11:15 +08:00
Brenon
4723e9b535 fix(postgres): update document status with partial update instead of full upsert 2025-02-10 15:05:44 +03:00
Yannick Stephan
6480ddee5d cleaned code 2025-02-09 19:51:05 +01:00
Yannick Stephan
7d63898015 fixed bugs 2025-02-09 19:21:49 +01:00
Yannick Stephan
93717e6705 cleaned code 2025-02-09 15:36:01 +01:00
Yannick Stephan
82481ecf28 cleaned code 2025-02-09 14:55:52 +01:00
Yannick Stephan
4cce14e65e cleaned import 2025-02-09 11:24:08 +01:00
Yannick Stephan
31fe96d74a cleaned optional not used 2025-02-09 10:33:15 +01:00
Yannick Stephan
50c7f26262 cleanup code 2025-02-08 23:58:15 +01:00
Yannick Stephan
5a082a0052 cleaned code 2025-02-08 23:20:37 +01:00
Yannick Stephan
cff415d91f implemented method and cleaned the mess 2025-02-08 23:18:12 +01:00
ArnoChen
3f845e9e53 better handling of namespace 2025-02-08 16:05:59 +08:00
ArnoChen
f974bf39bb format
format
2025-02-08 13:53:00 +08:00
ArnoChen
88d691deb9 add namespace prefix to storage namespaces 2025-02-08 13:53:00 +08:00
chenjingyang
6e79bef321 Fix get_by_id: DB query result is an empty array 2025-02-04 17:09:34 +08:00
Samuel Chan
02ac96ff8e - Fix the bug from the main branch that uses doc['status']
- Improve the performance of Apache AGE.
- Revise the README.md for Apache AGE indexing.
2025-02-02 18:20:32 +08:00
yangdx
06647438b2 Refactor threshold handling to use environment variables and global config settings for oracle, postgres and tidb 2025-01-29 23:47:57 +08:00
zrguo
80451af839 fix linting errors 2025-01-27 23:21:34 +08:00
Saifeddine ALOUI
b6068046ff Update postgres_impl.py 2025-01-27 09:39:39 +01:00
Saifeddine ALOUI
c7c56863b1 Update postgres_impl.py 2025-01-27 09:36:53 +01:00
Samuel Chan
d91a330e9d Enrich README.md for postgres usage, make some changes to support python version<12 2025-01-15 12:02:55 +08:00
Samuel Chan
c016934021 Revise the AGE implementation on get_node_edges, to align with Neo4j behavior. 2025-01-12 21:38:39 +08:00
Samuel Chan
d03d6f5fc5 Revised the postgres implementation to use attributes(node_id) rather than nodes to identify an entity, which significantly reduces the table count. 2025-01-11 09:30:19 +08:00
Samuel Chan
6ae27d8f06 Some enhancements:
- Enable the llm_cache storage to support get_by_mode_and_id, to improve performance when using a real KV server
- Provide an option for developers to cache the LLM response when extracting entities from a document. This addresses the pain point that when processing fails partway, the already-processed chunks must be sent to the LLM again, wasting money and time. With the new option enabled (it is disabled by default), those results are cached, which can save significant time and money for beginners.
2025-01-06 12:50:05 +08:00
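
The opt-in caching idea in this commit can be illustrated with a small sketch: when enabled, each chunk's LLM response is stored under a hash of the chunk text, so a rerun after a failure does not call the LLM again for chunks that were already processed. Everything here (`extract_with_cache`, the dictionary cache, the `enabled` flag) is hypothetical and much simpler than a real KV-backed llm_cache.

```python
import hashlib
from typing import Callable, Dict


def extract_with_cache(
    chunk: str,
    llm: Callable[[str], str],
    cache: Dict[str, str],
    enabled: bool = False,  # off by default, as the commit describes
) -> str:
    """Call the LLM for a chunk, reusing a cached response when allowed."""
    if not enabled:
        return llm(chunk)
    # Key the cache on the chunk content so identical chunks share a result.
    key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = llm(chunk)
    return cache[key]
```

In a rerun after a crash, chunks whose keys are already present in the cache skip the LLM call entirely, which is where the time and cost savings would come from.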
Samuel Chan
6c1b669f0f Fix the lint issue 2025-01-04 18:49:32 +08:00
Samuel Chan
e053223ef0 Fix the lint issue 2025-01-04 18:34:35 +08:00
Samuel Chan
f6f62c32a8 Fix the bug of AGE processing 2025-01-03 21:10:06 +08:00
Samuel Chan
b17cb2aa95 With a draft for postgres_impl 2025-01-01 22:43:59 +08:00