248 Commits

Author SHA1 Message Date
yangdx
fc8ca1a706 Fix: add multi-process lock for initialize and drop methods for all storage 2025-08-12 04:25:09 +08:00
yangdx
ca00b9c8ee Fix: Resolve workspace isolation problem for PostgreSQL with multiple LightRAG instances 2025-08-12 01:27:05 +08:00
yangdx
16c9a81f4c feat: support config.ini for PostgreSQL vector index settings
- Add support for reading vector_index_type, hnsw_m, hnsw_ef, and ivfflat_lists from config.ini
- Maintain backward compatibility with environment variables
- Update config.ini.example with new PostgreSQL vector index options
- Follow existing configuration priority: env vars > config.ini > defaults
2025-08-08 02:55:49 +08:00
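The priority order named in 16c9a81f4c (env vars > config.ini > defaults) could be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `resolve_setting`, the `[postgres]` section name, and the environment variable name are assumptions made for the example.

```python
import configparser
import os


def resolve_setting(env_name, ini_text, section, option, default):
    """Return the first value found: environment, then INI text, then default."""
    # 1. An environment variable always wins.
    if env_name in os.environ:
        return os.environ[env_name]
    # 2. Otherwise look in the INI content; 3. fall back to the default.
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get(section, option, fallback=default)


# Hypothetical config.ini content, used only for this example.
INI_TEXT = """
[postgres]
vector_index_type = IVFFLAT
hnsw_m = 32
"""

if __name__ == "__main__":
    # Hypothetical environment variable name.
    print(resolve_setting("POSTGRES_VECTOR_INDEX_TYPE", INI_TEXT,
                          "postgres", "vector_index_type", "HNSW"))
```

Values come back as strings, so numeric options such as `hnsw_m` would still need converting by the caller.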
yangdx
f38e10559e Update PostgreSQL vector index configuration
- Remove FLAT index support
- Standardize on HNSW as default
- Add dimension validation
- Improve error logging
- Clean up index creation code
2025-08-08 02:21:06 +08:00
Matt23-star
727ca43d3c feat: add vector index creation functionality for PostgreSQL 2025-08-07 23:07:18 +08:00
yangdx
cc1f7118e7 Remove deprecated cache_by_modes functionality from all storage 2025-08-05 23:20:26 +08:00
yangdx
8294d6d1b7 Remove deprecated mode field from LLM cache schema
- Drop mode column from LLM cache table
- Update primary key to exclude mode
- Remove mode from all SQL queries
- Deprecate mode-related methods
- Update schema migration logic
2025-08-05 23:18:54 +08:00
yangdx
0463963520 fix: include all query parameters in LLM cache hash key generation
- Add missing query parameters (top_k, enable_rerank, max_tokens, etc.) to cache key generation in kg_query, naive_query, and extract_keywords_only functions
- Add queryparam field to CacheData structure and PostgreSQL storage for debugging
- Update PostgreSQL schema with automatic migration for queryparam JSONB column
- Prevent incorrect cache hits between queries with different parameters

Fixes issue where different query parameters incorrectly shared the same cached results.
2025-08-05 18:03:10 +08:00
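The idea behind 0463963520, that every parameter able to change the answer must feed into the cache key, might look something like the sketch below. The function name `cache_key`, the choice of MD5 over a JSON payload, and the parameter names in the usage lines are assumptions for illustration; the actual key format in the repository may differ.

```python
import hashlib
import json


def cache_key(query, params):
    """Hash the query text together with all query parameters."""
    # sort_keys makes the key independent of dict insertion order.
    payload = json.dumps(
        {"query": query, "params": params},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    a = cache_key("what is LightRAG?", {"top_k": 10, "enable_rerank": True})
    b = cache_key("what is LightRAG?", {"top_k": 40, "enable_rerank": True})
    print(a != b)  # different parameters should yield different keys
```

Leaving a parameter out of `params` reproduces the bug this commit describes: two queries that differ only in that parameter would share one cache entry.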
yangdx
7b3a9c09ca Fix: add missing column to LLM cache of PostgreSQL implementation 2025-08-04 11:12:59 +08:00
yangdx
5513155808 Fix namespace tablename translate error
- Reorder namespace table map for PostgreSQL
- Ensure specific namespaces come first
2025-08-04 00:21:20 +08:00
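Why entry order matters in 5513155808 can be illustrated with a small sketch. It assumes the lookup walks the map and takes the first key contained in the namespace, which the log does not confirm. `table_for` and `EXAMPLE_ENTITIES_TABLE` are hypothetical names; only `LIGHTRAG_FULL_ENTITIES` appears in this history.

```python
# Hypothetical mapping; specific keys are listed before general ones.
NAMESPACE_TABLE_MAP = {
    "full_entities": "LIGHTRAG_FULL_ENTITIES",  # specific entry first
    "entities": "EXAMPLE_ENTITIES_TABLE",       # general entry last
}


def table_for(namespace, table_map=NAMESPACE_TABLE_MAP):
    """Return the table of the first key that occurs in the namespace."""
    for key, table in table_map.items():  # dicts keep insertion order
        if key in namespace:
            return table
    return None


if __name__ == "__main__":
    print(table_for("full_entities"))
    # With the general key first, the same namespace resolves incorrectly.
    wrong_order = {
        "entities": "EXAMPLE_ENTITIES_TABLE",
        "full_entities": "LIGHTRAG_FULL_ENTITIES",
    }
    print(table_for("full_entities", wrong_order))
```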
yangdx
952d1feb07 feat: Add support for KV_STORE_FULL_ENTITIES and KV_STORE_FULL_RELATIONS namespaces in PGKVStorage
- Add LIGHTRAG_FULL_ENTITIES and LIGHTRAG_FULL_RELATIONS table schemas
- Implement complete CRUD operations for both namespaces
- Add automatic table creation and migration support
- Add SQL templates and namespace mappings
- Ensure workspace isolation and proper indexing
2025-08-03 22:54:56 +08:00
yangdx
d2dd137f83 feat: implement get_all_nodes and get_all_edges methods for graph storage backends
Add get_all_nodes() and get_all_edges() methods to Neo4JStorage, PGGraphStorage, MongoGraphStorage, and MemgraphStorage classes. These methods return all nodes and edges in the graph with consistent formatting matching NetworkXStorage for compatibility across different storage backends.
2025-08-03 11:02:37 +08:00
yangdx
2f0aa7ed12 Optimize graph query by simplifying MATCH pattern
- Simplify MATCH clause to ()-[r]-()
- Remove node type constraints
- Improve query performance
2025-08-02 12:54:22 +08:00
yangdx
9a8f58826d fix: Add safe handling for missing file_path and metadata in PostgreSQL doc status functions
- Add null-safe file_path handling with "no-file-path" fallback in get_docs_by_status and get_docs_by_track_id
- Enhance metadata validation to ensure dict type after JSON parsing
- Align PostgreSQL implementation with JSON implementation safety patterns
- Prevent KeyError exceptions when database records have missing fields
2025-07-31 18:07:53 +08:00
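The null-safe handling described in 9a8f58826d could be approximated as below. The helper name `normalize_doc_row` and the row shape are assumptions for the example; only the "no-file-path" fallback and the dict check on metadata come from the commit message.

```python
import json


def normalize_doc_row(row):
    """Return file_path and metadata with safe fallbacks for missing fields."""
    # A missing or NULL file_path falls back to a fixed placeholder.
    file_path = row.get("file_path") or "no-file-path"

    metadata = row.get("metadata")
    if isinstance(metadata, str):
        try:
            metadata = json.loads(metadata)
        except json.JSONDecodeError:
            metadata = {}
    # After parsing, anything that is not a dict is replaced by an empty dict.
    if not isinstance(metadata, dict):
        metadata = {}

    return {"file_path": file_path, "metadata": metadata}


if __name__ == "__main__":
    print(normalize_doc_row({"file_path": None, "metadata": "[1, 2]"}))
```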
yangdx
0eac1a883a Feat: add file path sorting for document manager
- Add file_path sorting support to all database backends (JSON, Redis, PostgreSQL, MongoDB)
- Implement smart column header switching between "ID" and "File Name" based on display mode
- Add automatic sort field switching when toggling between ID and file name display
- Create composite indexes for workspace+file_path in PostgreSQL and MongoDB for better query performance
- Update frontend to maintain sort state when switching display modes
- Add internationalization support for "fileName" in English and Chinese locales

This enhancement improves user experience by providing intuitive file-based sorting
while maintaining performance through optimized database indexes.
2025-07-30 18:46:55 +08:00
yangdx
74eecc46e5 feat(pagination): Implement document list pagination backends and frontend UI
- Add pagination support to BaseDocStatusStorage interface and all implementations (PostgreSQL, MongoDB, Redis, JSON)
- Implement RESTful API endpoints for paginated document queries and status counts
- Create reusable pagination UI components with internationalization support
- Optimize performance with database-level pagination and efficient in-memory processing
- Maintain backward compatibility while adding configurable page sizes (10-200 items)
2025-07-30 17:58:32 +08:00
yangdx
cfb7117dd6 Fix track_id missing for query in PostgreSQL 2025-07-30 03:44:20 +08:00
yangdx
93afa7d8a7 feat: add processing time tracking to document status with metadata field
- Add metadata field to DocProcessingStatus with start_time and end_time tracking
- Record processing timestamps using Unix time format (seconds precision)
- Update all storage backends (JSON, MongoDB, Redis, PostgreSQL) for new field support
- Maintain backward compatibility with default values for existing data
- Add error_msg field for better error tracking during document processing
2025-07-29 23:42:33 +08:00
yangdx
7206c07468 Remove deprecated content field from doc status
- Drop content column from LIGHTRAG_DOC_STATUS
- Clean up doc status handling code
- Maintain backward compatibility
2025-07-29 23:19:36 +08:00
yangdx
1e1adcb64a Add index on track_id column in doc status table of PostgreSQL 2025-07-29 23:03:09 +08:00
yangdx
6014b9bf73 feat: add track_id support for document processing progress monitoring
- Add get_docs_by_track_id() method to all storage backends (MongoDB, PostgreSQL, Redis, JSON)
- Implement automatic track_id generation with upload_/insert_ prefixes
- Add /track_status/{track_id} API endpoint for frontend progress queries
- Create database indexes for efficient track_id lookups
- Enable real-time document processing status tracking across all storage types
2025-07-29 22:24:21 +08:00
yangdx
24c36d876c Remove content field from DocProcessingStatus, update MongoDB and PostgreSQL implementation 2025-07-29 14:52:45 +08:00
yangdx
5574a30856 fix(postgres): handle ssl_mode="allow" in _create_ssl_context
Add "allow" to the list of recognized SSL modes in PostgreSQL connection helper. Previously, ssl_mode="allow" would fall through to "Unknown SSL mode" warning. Now it's properly handled alongside "require" and "prefer" modes.
2025-07-24 12:45:13 +08:00
yangdx
df8b4202f3 feat: Add SSL support for PostgreSQL database connections
- Add SSL configuration options (ssl_mode, ssl_cert, ssl_key, ssl_root_cert, ssl_crl)
- Support all PostgreSQL SSL modes (disable, allow, prefer, require, verify-ca, verify-full)
- Add SSL context creation with certificate validation
- Update initdb() method to handle SSL connection parameters
- Add SSL environment variables to env.example
- Maintain backward compatibility with existing non-SSL configurations
2025-07-21 02:03:06 +08:00
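One way to map the SSL modes listed in df8b4202f3 onto Python's standard `ssl` module is sketched here. This is an assumption-laden illustration and not the repository's `_create_ssl_context`: the function name, the return values, and the grouping of "allow", "prefer", and "require" into an unverified context are choices made for the example.

```python
import ssl


def create_ssl_context(ssl_mode, root_cert=None):
    """Build an SSLContext for a PostgreSQL-style ssl_mode, or None to disable."""
    if ssl_mode in (None, "disable"):
        return None
    if ssl_mode in ("allow", "prefer", "require"):
        # Encrypt the connection but do not verify the server certificate.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False  # must be disabled before CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    if ssl_mode in ("verify-ca", "verify-full"):
        # Verify the certificate chain; verify-full also checks the hostname.
        ctx = ssl.create_default_context(cafile=root_cert)
        ctx.check_hostname = ssl_mode == "verify-full"
        return ctx
    raise ValueError(f"Unknown SSL mode: {ssl_mode}")


if __name__ == "__main__":
    ctx = create_ssl_context("verify-full")
    print(ctx.check_hostname, ctx.verify_mode == ssl.CERT_REQUIRED)
```

The sketch raises on an unknown mode for simplicity; 5574a30856 describes the real helper as logging an "Unknown SSL mode" warning instead.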
yangdx
19a38d9310 Feat: add PostgreSQL extensions for vector and AGE
- Ensure the VECTOR extension is available when PostgreSQL is initialized
- Ensure the AGE extension is available when PGGraphStorage is initialized
2025-07-21 01:46:41 +08:00
yangdx
f033fd6f87 fix(postgres): improve AGE agtype parsing and simplify error logging
- Fix JSON parsing errors caused by :: characters in data content
- Implement precise agtype string parsing using rfind() to separate JSON content from type identifiers
- Add robust error handling for malformed JSON in graph data
2025-07-18 08:50:47 +08:00
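The `rfind()` approach mentioned in f033fd6f87 can be sketched as follows. AGE renders vertices and edges as JSON followed by a type suffix such as `::vertex`; splitting on the last `::` keeps any `::` inside the data intact. The function name `parse_agtype` and the exact set of accepted suffixes are assumptions for this example.

```python
import json

# Type suffixes accepted by this sketch (an assumption, not an exhaustive list).
AGTYPE_SUFFIXES = {"vertex", "edge", "path"}


def parse_agtype(raw):
    """Split an agtype string into (parsed JSON or None, type suffix or None)."""
    idx = raw.rfind("::")  # last occurrence, so '::' inside the JSON is ignored
    if idx != -1 and raw[idx + 2:] in AGTYPE_SUFFIXES:
        body, suffix = raw[:idx], raw[idx + 2:]
    else:
        body, suffix = raw, None
    try:
        return json.loads(body), suffix
    except json.JSONDecodeError:
        # Malformed JSON in graph data is reported as None instead of raising.
        return None, suffix


if __name__ == "__main__":
    raw = '{"id": 1, "properties": {"description": "a::b"}}::vertex'
    print(parse_agtype(raw))
```

A naive `raw.split("::")[0]` would cut this example at the `::` inside the description, which is the failure the commit describes.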
yangdx
57c8c19628 Add datetime format migration for doc status table 2025-07-16 22:21:51 +08:00
yangdx
c7b566f6d5 Fix cache migration MD5 error for PostgreSQL 2025-07-16 19:24:57 +08:00
yangdx
80f7e37168 Fix default workspace name for PostgreSQL AGE graph storage 2025-07-16 19:16:22 +08:00
yangdx
bab2803953 Optimize PostgreSQL database migrations for LLM cache
- Combine column migration into single operation
- Optimize LLM cache key migration query
- Improve migration error handling
- Add conflict detection for cache migration
2025-07-16 17:32:53 +08:00
yangdx
bd340fece6 Fix timestamp column migration comment typos
- Correct timezone-related comments
- Fix typo in debug log message
- Update migration success message
- Maintain same migration logic
2025-07-16 14:27:52 +08:00
yangdx
7e988158a9 Fix: Resolve timezone handling problem in PostgreSQL storage
- Changed timestamp columns to naive UTC
- Added datetime formatting utilities
- Updated SQL templates for timestamp extraction
- Simplified timestamp migration logic
2025-07-14 04:12:52 +08:00
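Storing timestamps as naive UTC, as 7e988158a9 describes, usually means normalizing timezone-aware values before writing them. A minimal sketch, where the helper name `to_naive_utc` is an assumption and naive inputs are assumed to be in UTC already:

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(dt):
    """Convert an aware datetime to UTC and drop tzinfo; pass naive values through."""
    if dt.tzinfo is None:
        return dt  # assumed to be UTC already
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


if __name__ == "__main__":
    local = datetime(2025, 7, 14, 4, 12, 52, tzinfo=timezone(timedelta(hours=8)))
    print(to_naive_utc(local))  # 2025-07-13 20:12:52
```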
yangdx
157fb4c871 Increase field lengths for entity and file paths for PostgreSQL
- Expand entity_name length to 512 chars
- Increase source/target ID lengths
- Convert file_path to TEXT type
- Add migration logic
2025-07-14 00:24:54 +08:00
yangdx
ef79088f60 Move max_graph_nodes to global config 2025-07-07 21:53:57 +08:00
yangdx
da8655002a Add composite indexes for workspace+id columns for PostgreSQL 2025-07-07 03:36:49 +08:00
yangdx
033098c1bc Feat: Add WORKSPACE support to all storage types 2025-07-07 00:57:21 +08:00
yangdx
531502677e fix: Use create_time when update_time is 0 in PGKVStorage queries 2025-07-03 23:38:53 +08:00
yangdx
6c2ae40d7d Refac: Enhance KG rebuild stability by incorporating create_time into the LLM cache 2025-07-03 17:08:29 +08:00
yangdx
70e154b0aa Fix linting 2025-07-03 12:26:05 +08:00
yangdx
ff1b1c61c7 Implemented storage types: PostgreSQL and MongoDB 2025-07-03 11:46:24 +08:00
yangdx
86c9a0cda2 Fix linting 2025-07-02 16:29:43 +08:00
yangdx
271722405f feat: Flatten LLM cache structure for improved recall efficiency
Refactored the LLM cache to a flat Key-Value (KV) structure, replacing the previous nested format. The old structure used the 'mode' as a key and stored specific cache content as JSON nested under it. This change significantly enhances cache recall efficiency.
2025-07-02 16:11:53 +08:00
yangdx
37bf341a69 Fix LLM cache handling for PGKVStorage to address document deletion scenarios.
- Add dynamic cache_type field
- Support mode parameter for LLM cache
- Maintain backward compatibility
2025-06-29 14:39:50 +08:00
yangdx
b7f8c20e61 fix(postgres): use correct table for vector queries
Change SQL templates from LIGHTRAG_DOC_CHUNKS to LIGHTRAG_VDB_CHUNKS
to fix "content_vector does not exist" error in vector operations.
2025-06-28 15:36:54 +08:00
yangdx
2c47367975 Fix linting 2025-06-28 14:37:55 +08:00
yangdx
95c7a7d038 feat(db): Add data migration from LIGHTRAG_DOC_CHUNKS to LIGHTRAG_VDB_CHUNKS 2025-06-28 14:37:47 +08:00
yangdx
3a8a99b73d feat(postgres): Implement text_chunks upsert for PGKVStorage 2025-06-28 14:37:35 +08:00
yangdx
72384f87c4 Remove deprecated code from Postgres_impl.py
- Stop filtering out 'base' node labels
- Match any edge type in query to improve performance
2025-06-25 12:53:07 +08:00
yangdx
109c2b48be Fix linting 2025-06-25 12:39:43 +08:00
yangdx
da46b341dc feat: Optimize document deletion performance
- To enhance performance during document deletion, new batch-get methods, `get_nodes_by_chunk_ids` and `get_edges_by_chunk_ids`, have been added to the graph storage layer (`BaseGraphStorage` and its implementations). The `adelete_by_doc_id` function (lightrag/lightrag.py:1681) now leverages these methods to avoid unnecessary iteration over the entire knowledge graph, significantly improving efficiency.
- Graph storage updated: Networkx, Neo4j, Postgres AGE
2025-06-25 12:37:57 +08:00