47 Commits

Author SHA1 Message Date
ArindamRoy23
49dd5f936e Merge branch 'HKUDS:main' into main 2025-03-11 20:53:00 +05:30
Roy
8aa9d0e6ca Add optional ids filter to vector database query methods
- Updated query method signatures across multiple vector database implementations
- Added optional `ids` parameter to filter search results
- Consistent implementation across ChromaDB, Faiss, Milvus, MongoDB, NanoVectorDB, Oracle, Qdrant, and TiDB vector storage classes
2025-03-11 15:22:17 +00:00
zrguo
c26cb3a9ea fix merge bugs 2025-03-11 16:05:04 +08:00
zrguo
e822f35c89 Fix edit entity and relation bugs 2025-03-07 14:39:06 +08:00
yangdx
e3a40c2fdb Fix linting 2025-03-01 16:23:34 +08:00
yangdx
41eff2ca2f Fix data persistence issue in NanoVectorDBStorage 2025-03-01 13:35:00 +08:00
yangdx
d4f6dcfd54 Improve multi-process data synchronization and persistence in storage implementations
• Remove _get_client() or _get_graph() from index_done_callback
• Add return value for index_done_callback
2025-03-01 12:41:30 +08:00
yangdx
d3de57c1e4 Add multi-process support for vector database and graph storage with lock flags
• Implement storage lock mechanism
• Add update flag handling
• Add cross-process reload detection
2025-03-01 10:37:05 +08:00
yangdx
fd76e00c6a Refactor storage initialization to separate object creation from data loading
• Split __post_init__ and initialize()
• Move data loading to initialize()
• Add FastAPI lifespan integration
2025-03-01 03:48:19 +08:00
yangdx
cd7648791a Fix linting 2025-02-28 01:25:59 +08:00
yangdx
291e0c1b14 Revert vector and graph storage to use local data (single process) 2025-02-28 01:14:25 +08:00
yangdx
64f22966a3 Fix linting 2025-02-27 19:05:51 +08:00
yangdx
1699b10a25 Refactor direct client/graph access to reduce redundant get calls in vector/graph ops 2025-02-27 15:14:54 +08:00
yangdx
f007ebf006 Refactor initialization logic for vector, KV and graph storage implementations
• Add try_initialize_namespace check
• Move init code out of storage locks
• Reduce redundant init conditions
• Simplify initialization flow
• Make init thread-safer
2025-02-27 14:55:07 +08:00
yangdx
7436c06f6c Fix linting 2025-02-26 18:11:16 +08:00
yangdx
2c019dbc7b Refactor storage initialization to avoid redundant initial data loads across processes; show init logs on first load only 2025-02-26 12:28:49 +08:00
yangdx
2752a764ae Refactor storage implementations to support both single and multi-process modes
• Add shared storage management module
• Support process/thread lock based on mode
2025-02-26 05:38:38 +08:00
yangdx
a642bb3190 refactor: use shared manager from main process for storage implementations. 2025-02-25 12:08:49 +08:00
yangdx
087d5770b0 feat(storage): Add shared memory support for file-based storage implementations
This commit adds multiprocessing shared memory support to file-based storage implementations:
- JsonDocStatusStorage
- JsonKVStorage
- NanoVectorDBStorage
- NetworkXStorage

Each storage module now uses module-level global variables with multiprocessing.Manager() to ensure data consistency across multiple uvicorn workers. All processes will see updates immediately when data is modified through the ainsert function.
2025-02-25 11:10:13 +08:00
Yannick Stephan
48a1ad9b3b Merge pull request #883 from YanSte/fix-return-none
Optimised returns
2025-02-19 22:24:50 +01:00
Yannick Stephan
9277fe8c29 fixed return 2025-02-19 22:22:41 +01:00
Saifeddine ALOUI
45ee4dd08c fixed linting 2025-02-19 20:50:39 +01:00
Saifeddine ALOUI
d3c443529c Update nano_vector_db_impl.py 2025-02-19 19:49:41 +01:00
Yannick Stephan
2524e02428 remove tqdm and cleaned readme and ollama 2025-02-18 19:58:03 +01:00
Yannick Stephan
2b2c81a722 added some comments 2025-02-16 16:04:07 +01:00
Yannick Stephan
a1607bbcb9 Merge remote-tracking branch 'origin/main' into make-clear-what-implemented-or-not
# Conflicts:
#	lightrag/base.py
#	lightrag/kg/json_doc_status_impl.py
#	lightrag/kg/mongo_impl.py
#	lightrag/kg/postgres_impl.py
2025-02-16 15:29:16 +01:00
Yannick Stephan
0e7aff96bb back to not making breaks 2025-02-16 15:08:50 +01:00
Yannick Stephan
a0844bca28 cleaned import 2025-02-16 14:45:45 +01:00
Yannick Stephan
3fef8201c6 added final, required methods and cleaned import 2025-02-16 14:38:09 +01:00
zrguo
2a0c7c0322 Merge pull request #785 from danielaskdd/improve-CORS-handling
improve CORS and streaming response headers
2025-02-16 20:31:33 +08:00
Yannick Stephan
3eba41aab6 updated clean of what implemented on BaseVectorStorage 2025-02-16 13:24:42 +01:00
Yannick Stephan
805da7b95b cleaned code 2025-02-15 00:02:24 +01:00
yangdx
2c56141bfd Standardize variable names with other vector database implementations (without functional modifications) 2025-02-14 12:34:26 +08:00
yangdx
ed73ea4076 Fix linting 2025-02-13 04:12:00 +08:00
yangdx
f01f57d0da refactor: make cosine similarity threshold a required config parameter
• Remove default threshold from env var
• Add validation for missing threshold
• Move default to lightrag.py config init
• Update all vector DB implementations
• Improve threshold validation consistency
2025-02-13 03:25:48 +08:00
yangdx
3308ecfa69 Refactor logging for vector similarity search with configurable threshold 2025-02-13 02:14:32 +08:00
yangdx
635d4fd9e4 Add lock to protect file write operations in NanoVectorDBStorage
- Introduce asyncio.Lock for save operations
- Ensure thread-safe file writes
2025-02-01 10:36:25 +08:00
yangdx
6a326e2783 Revert "Refactor embedding functions and add async query limit"
This reverts commit 21481dba8f3b020797718de3d8a82aafa7f69590.
2025-02-01 10:36:25 +08:00
yangdx
389f4ee872 Shorten log message for cosine similarity threshold. 2025-01-31 15:33:41 +08:00
yangdx
21481dba8f Refactor embedding functions and add async query limit
- Separate insert/query embedding funcs
- Add query-specific async limit
- Update storage classes to use new funcs
- Protect vector DB save with lock
- Improve config handling for thresholds
2025-01-31 15:00:56 +08:00
yangdx
20d6355a4a Fix cosine threshold parameter setting error for chroma 2025-01-29 22:41:18 +08:00
yangdx
90c765c724 Fix linting 2025-01-29 22:14:18 +08:00
yangdx
c8b890547a Add logging for query parameters in NanoVectorDBStorage.query 2025-01-29 21:36:31 +08:00
yangdx
7aedc08caf Add RAG configuration options and enhance parameter configurability
- Add top-k and cosine-threshold params for API server
- Update .env and CLI param handling with new parameters
- Improve splash screen display
- Update bash and storage classes to read new parameters from .env file.
2025-01-29 21:34:34 +08:00
yangdx
d0052456d4 Fix cosine threshold parameter setting error 2025-01-29 21:09:11 +08:00
zrguo
80451af839 fix linting errors 2025-01-27 23:21:34 +08:00
Saifeddine ALOUI
56e9c9f4d5 Moved the storages to kg folder 2025-01-27 09:59:26 +01:00