115 Commits

Author SHA1 Message Date
yangdx
cc1f7118e7 Remove deprecated cache_by_modes functionality from all storage 2025-08-05 23:20:26 +08:00
yangdx
d2dd137f83 feat: implement get_all_nodes and get_all_edges methods for graph storage backends
Add get_all_nodes() and get_all_edges() methods to Neo4JStorage, PGGraphStorage, MongoGraphStorage, and MemgraphStorage classes. These methods return all nodes and edges in the graph with consistent formatting matching NetworkXStorage for compatibility across different storage backends.
2025-08-03 11:02:37 +08:00
yangdx
41de51a4db fix: add missing await in MongoDB get_all_status_counts aggregation
Resolves 'coroutine' object has no attribute 'to_list' error in document pagination endpoint by adding missing await keyword before self._data.aggregate() call.
2025-07-31 02:27:16 +08:00
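
Note on the fix above: the error message suggests that aggregate() on the async collection is a coroutine function, so calling it without await yields a coroutine object rather than a cursor. The sketch below reproduces that failure mode without a database. FakeCollection and FakeCursor are hypothetical stand-ins written for this illustration, not the driver's classes or the project's code.

```python
import asyncio


class FakeCursor:
    """Hypothetical stand-in for an async aggregation cursor."""

    def __init__(self, rows):
        self._rows = rows

    async def to_list(self, length=None):
        return self._rows if length is None else self._rows[:length]


class FakeCollection:
    """Hypothetical collection whose aggregate() must be awaited."""

    async def aggregate(self, pipeline):
        return FakeCursor(
            [{"_id": "processed", "count": 3}, {"_id": "failed", "count": 1}]
        )


async def counts_without_await(coll):
    cursor = coll.aggregate([])  # bug: this is a coroutine object, not a cursor
    try:
        return await cursor.to_list(length=None)
    except AttributeError as exc:
        cursor.close()  # discard the never-awaited coroutine cleanly
        return str(exc)


async def counts_with_await(coll):
    cursor = await coll.aggregate([])  # fix: await first, then use the cursor
    rows = await cursor.to_list(length=None)
    return {row["_id"]: row["count"] for row in rows}


broken = asyncio.run(counts_without_await(FakeCollection()))
fixed = asyncio.run(counts_with_await(FakeCollection()))
```

Here broken holds the AttributeError text and fixed holds the per-status counts.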
yangdx
0eac1a883a Feat: add file path sorting for document manager
- Add file_path sorting support to all database backends (JSON, Redis, PostgreSQL, MongoDB)
- Implement smart column header switching between "ID" and "File Name" based on display mode
- Add automatic sort field switching when toggling between ID and file name display
- Create composite indexes for workspace+file_path in PostgreSQL and MongoDB for better query performance
- Update frontend to maintain sort state when switching display modes
- Add internationalization support for "fileName" in English and Chinese locales

This enhancement improves user experience by providing intuitive file-based sorting
while maintaining performance through optimized database indexes.
2025-07-30 18:46:55 +08:00
yangdx
74eecc46e5 feat(pagination): Implement document list pagination backends and frontend UI
- Add pagination support to BaseDocStatusStorage interface and all implementations (PostgreSQL, MongoDB, Redis, JSON)
- Implement RESTful API endpoints for paginated document queries and status counts
- Create reusable pagination UI components with internationalization support
- Optimize performance with database-level pagination and efficient in-memory processing
- Maintain backward compatibility while adding configurable page sizes (10-200 items)
2025-07-30 17:58:32 +08:00
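
Note on the pagination commit above: a minimal in-memory sketch of page slicing with the 10-200 page size range from the message. The function name, the field names, and the choice to clamp out-of-range sizes instead of rejecting them are assumptions made for this illustration. A database backend would push the same sort, offset, and limit into the query.

```python
MIN_PAGE_SIZE = 10   # lower bound named in the commit message
MAX_PAGE_SIZE = 200  # upper bound named in the commit message


def paginate(docs, page=1, page_size=50, sort_field="file_path", descending=False):
    """Return (items_on_page, total_count) for a list of dict records."""
    # Assumption: out-of-range values are clamped rather than rejected.
    page_size = max(MIN_PAGE_SIZE, min(MAX_PAGE_SIZE, page_size))
    page = max(1, page)
    # A stable sort order is required for pages to be consistent.
    ordered = sorted(docs, key=lambda d: d.get(sort_field) or "", reverse=descending)
    start = (page - 1) * page_size
    return ordered[start:start + page_size], len(ordered)


docs = [{"id": f"doc-{i:02d}", "file_path": f"file_{i:02d}.txt"} for i in range(25)]
items, total = paginate(docs, page=3, page_size=5)  # page_size is clamped to 10
```

With 25 records and an effective page size of 10, page 3 holds the last five records.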
yangdx
30f71c8acf Remove _id field and improve index handling in MongoDB
- Remove MongoDB _id field from documents
- Improve index existence check and creation
2025-07-30 04:17:26 +08:00
yangdx
75de799353 Remove deprecated content field from doc status storage
- Remove content field from JSON storage
- Remove content field from MongoDB storage
- Remove content field from Redis storage
2025-07-30 01:00:06 +08:00
yangdx
93afa7d8a7 feat: add processing time tracking to document status with metadata field
- Add metadata field to DocProcessingStatus with start_time and end_time tracking
- Record processing timestamps using Unix time format (seconds precision)
- Update all storage backends (JSON, MongoDB, Redis, PostgreSQL) for new field support
- Maintain backward compatibility with default values for existing data
- Add error_msg field for better error tracking during document processing
2025-07-29 23:42:33 +08:00
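
Note on the processing-time commit above: a sketch of how start_time and end_time could be recorded in a metadata field as Unix seconds, with defaults so that older records lacking the field still load. DocStatusSketch and the helper functions are hypothetical names for this illustration. The project's DocProcessingStatus may be shaped differently.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DocStatusSketch:
    """Hypothetical status record; only the fields discussed in the commit."""

    status: str = "pending"
    error_msg: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


def mark_started(doc, now=None):
    doc.status = "processing"
    # Seconds precision: truncate the timestamp to an integer.
    doc.metadata["start_time"] = int(time.time() if now is None else now)


def mark_finished(doc, error=None, now=None):
    doc.metadata["end_time"] = int(time.time() if now is None else now)
    doc.status = "failed" if error else "processed"
    doc.error_msg = error


def from_record(record):
    """Load a stored record, tolerating data written before the new fields."""
    return DocStatusSketch(
        status=record.get("status", "pending"),
        error_msg=record.get("error_msg"),
        metadata=dict(record.get("metadata") or {}),
    )


doc = DocStatusSketch()
mark_started(doc, now=1000.9)
mark_finished(doc, now=1012.2)
elapsed = doc.metadata["end_time"] - doc.metadata["start_time"]
```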
yangdx
6014b9bf73 feat: add track_id support for document processing progress monitoring
- Add get_docs_by_track_id() method to all storage backends (MongoDB, PostgreSQL, Redis, JSON)
- Implement automatic track_id generation with upload_/insert_ prefixes
- Add /track_status/{track_id} API endpoint for frontend progress queries
- Create database indexes for efficient track_id lookups
- Enable real-time document processing status tracking across all storage types
2025-07-29 22:24:21 +08:00
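
Note on the track_id commit above: a sketch of generating an identifier with an upload_ or insert_ prefix and filtering records by it. Only the two prefixes come from the commit message. The timestamp-plus-random-suffix layout and both function names are assumptions for this illustration.

```python
import time
import uuid

ALLOWED_PREFIXES = ("upload", "insert")  # prefixes named in the commit message


def generate_track_id(prefix="upload"):
    """Build an ID such as 'upload_20250729_142421_1a2b3c4d' (layout assumed)."""
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"unsupported prefix: {prefix!r}")
    timestamp = time.strftime("%Y%m%d_%H%M%S", time.gmtime())
    return f"{prefix}_{timestamp}_{uuid.uuid4().hex[:8]}"


def docs_by_track_id(docs, track_id):
    """In-memory equivalent of a track_id lookup over {doc_id: record}."""
    return {
        doc_id: record
        for doc_id, record in docs.items()
        if record.get("track_id") == track_id
    }


tid = generate_track_id("insert")
store = {
    "doc-1": {"track_id": tid, "status": "processing"},
    "doc-2": {"track_id": "upload_other", "status": "processed"},
    "doc-3": {"status": "pending"},  # record written before track_id existed
}
matches = docs_by_track_id(store, tid)
```

A database backend would answer the same lookup from an index on track_id instead of scanning every record.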
yangdx
92bbb7a1b3 Remove content fallback and standardize doc status handling
- Remove content_summary fallback logic
- Standardize doc status processing
- Handle missing file_path consistently
2025-07-29 16:13:51 +08:00
yangdx
24c36d876c Remove content field from DocProcessingStatus, update MongoDB and PostgreSQL implementation 2025-07-29 14:52:45 +08:00
yangdx
ef79088f60 Move max_graph_nodes to global config 2025-07-07 21:53:57 +08:00
yangdx
7a7a01b68b Fix linting 2025-07-07 04:44:06 +08:00
yangdx
9e823de74e Exit program on vector index creation failure for MongoDB 2025-07-07 04:43:46 +08:00
yangdx
907f2313cd Improve MongoDB vector index handling with workspace support
- Add workspace-specific index naming
- Store index name as instance variable
2025-07-07 03:19:41 +08:00
yangdx
033098c1bc Feat: Add WORKSPACE support to all storage types 2025-07-07 00:57:21 +08:00
yangdx
3355a0ce95 Fix create_time conflict in MongoKVStorage updates 2025-07-03 22:58:08 +08:00
yangdx
6c2ae40d7d Refac: Enhance KG rebuild stability by incorporating create_time into the LLM cache 2025-07-03 17:08:29 +08:00
yangdx
ff1b1c61c7 Implemented storage types: PostgreSQL and MongoDB 2025-07-03 11:46:24 +08:00
yangdx
271722405f feat: Flatten LLM cache structure for improved recall efficiency
Refactored the LLM cache to a flat Key-Value (KV) structure, replacing the previous nested format. The old structure used the 'mode' as a key and stored specific cache content as JSON nested under it. This change significantly enhances cache recall efficiency.
2025-07-02 16:11:53 +08:00
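
Note on the cache flattening commit above: a sketch of converting a nested, mode-keyed cache into a flat key-value layout. The "mode:hash" key format and the sample entries are assumptions for this illustration. The commit message does not state the actual key scheme.

```python
def flatten_cache(nested):
    """Turn {mode: {cache_hash: entry}} into {"mode:cache_hash": entry}."""
    flat = {}
    for mode, entries in nested.items():
        for cache_hash, entry in entries.items():
            flat[f"{mode}:{cache_hash}"] = entry  # key format is assumed
    return flat


nested = {
    "default": {"abc": {"return": "r1"}, "def": {"return": "r2"}},
    "local": {"abc": {"return": "r3"}},
}
flat = flatten_cache(nested)

# Nested layout: fetch the whole mode document, then index into it.
old_hit = nested["local"]["abc"]
# Flat layout: one direct key lookup per cache entry.
new_hit = flat["local:abc"]
```

In a key-value store, the flat layout means a lookup reads one small record instead of the entire per-mode document.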
yangdx
e2824b721e Fix LLM cache handling for MongoKVStorage to address document deletion scenarios.
- Support fetching all "default_" prefixed documents
- Maintain original behavior for other IDs
- Return dictionary of documents for "default"
- Keep backward compatibility
2025-06-29 15:03:57 +08:00
yangdx
28aedd8b3c Update comments 2025-06-29 00:30:39 +08:00
Ken Chen
4a953d6829 Fix incorrect upsert_edge method in MongoDBGraph, since graph edges should be treated as undirected 2025-06-28 21:03:54 +08:00
Ken Chen
5116d61eaa Fix incorrect has_edge method in MongoDBGraph, since graph edges should be treated as undirected 2025-06-28 20:48:30 +08:00
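
Note on the has_edge fix above: treating edges as undirected means a lookup must match either orientation of the endpoint pair. The sketch below builds a MongoDB-style filter with the $or operator and mirrors it with an in-memory check. The field names source_node_id and target_node_id are assumptions for this illustration.

```python
def undirected_edge_filter(source_id, target_id):
    """Query filter that matches an edge stored in either direction."""
    return {
        "$or": [
            {"source_node_id": source_id, "target_node_id": target_id},
            {"source_node_id": target_id, "target_node_id": source_id},
        ]
    }


def has_edge(edges, a, b):
    """In-memory equivalent: compare endpoint sets, ignoring direction."""
    return any({e["source_node_id"], e["target_node_id"]} == {a, b} for e in edges)


edges = [{"source_node_id": "A", "target_node_id": "B"}]
forward = has_edge(edges, "A", "B")
reverse = has_edge(edges, "B", "A")
missing = has_edge(edges, "A", "C")
query = undirected_edge_filter("B", "A")
```

An upsert can follow the same rule by matching with the two-way filter first, so that a reversed pair updates the existing edge rather than creating a duplicate.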
Ken Chen
73cc86662a Add two BFS subgraph search support for MongoDBGraph 2025-06-28 20:00:13 +08:00
Ken Chen
5739f52d29 Rewrite get_knowledge_graph with label * by degree 2025-06-28 17:10:39 +08:00
Ken Chen
d0f4eee404 Fix accidentally hardcoded edge collection name in searching upstream nodes 2025-06-28 16:25:44 +08:00
Ken Chen
6574dfb7ea Fix accidentally hardcoded max depth in searching upstream nodes 2025-06-28 11:40:39 +08:00
Ken Chen
b586bdc02f Fix accidentally hardcoded label in searching upstream nodes 2025-06-28 10:50:56 +08:00
Ken Chen
7c8f65d020 Add search for neighbor nodes that are sources of the selected node 2025-06-28 08:50:32 +08:00
Ken Chen
f40bc43d5e Fix missing nodes and edges when retrieving a knowledge subgraph for a particular node_id 2025-06-26 23:11:31 +08:00
yangdx
687ccd4923 fix: optimize MongoDB aggregation pipeline to prevent memory limit errors
- Move $limit operation early in pipeline for "*" queries to reduce memory usage
- Remove memory-intensive $sort operation for large dataset queries
- Add fallback mechanism for memory limit errors with simple query
- Implement additional safety checks to enforce max_nodes limit
- Improve error handling and logging for memory-related issues
2025-06-26 14:37:04 +08:00
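
Note on the aggregation fix above: a sketch of the three ideas in the message, namely placing $limit first for "*" queries, omitting $sort, and falling back to a simpler query on a memory error while still enforcing max_nodes. The function names are hypothetical, and MemoryError stands in for whatever exception the driver raises when the server hits its memory limit.

```python
def build_label_pipeline(label, max_nodes):
    """Build an aggregation pipeline as a plain list of stage dicts."""
    if label == "*":
        # Cap the number of documents before any other stage, and skip
        # $sort entirely so the server never buffers the full collection.
        return [{"$limit": max_nodes}, {"$project": {"_id": 1}}]
    return [{"$match": {"_id": label}}, {"$limit": max_nodes}]


def run_with_fallback(primary, fallback, max_nodes):
    """Run primary(); on a memory error run fallback(); always cap the result."""
    try:
        nodes = primary()
    except MemoryError:  # stand-in for the driver's memory-limit error
        nodes = fallback()
    # Extra safety check: never return more than max_nodes.
    return nodes[:max_nodes]


pipeline = build_label_pipeline("*", 5)


def failing_query():
    raise MemoryError("simulated memory limit")


result = run_with_fallback(failing_query, lambda: list(range(10)), 3)
```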
yangdx
d8b544ab6f Fix linting 2025-06-26 14:15:11 +08:00
yangdx
c51079335e Optimize node label retrieval with aggregation
- Enable allowDiskUse for large datasets
2025-06-26 14:14:52 +08:00
yangdx
d60db573dc Add allowDiskUse flag to MongoDB aggregations
- Enable disk use for large aggregations
- Fix cursor handling for list_search_indexes
- Improve query performance for big datasets
- Update vector search index check
- Set proper length for to_list results
2025-06-26 13:51:53 +08:00
yangdx
71565f4794 Add get_all method to MongoKVStorage 2025-06-26 13:51:15 +08:00
yangdx
d512db26e4 Fix MongoDB set handling in delete operations 2025-06-26 13:50:19 +08:00
Ken Chen
a3865caaea Implement get_nodes_by_chunk_ids and get_edges_by_chunk_ids 2025-06-25 22:17:17 +08:00
Ken Chen
a047d966ab MongoGraph: Separate edges from node collection 2025-06-21 21:05:04 +08:00
Ken Chen
cf441aa84c Add missing methods for MongoGraphStorage 2025-06-15 21:22:32 +08:00
yangdx
045993f7d2 Remove deprecated search_by_prefix 2025-05-03 11:17:49 +08:00
yangdx
08e8a7ead1 Fix linting 2025-05-03 00:46:28 +08:00
yangdx
c3df1908dc Fix created_at problem for MongoDB vector storage 2025-05-02 21:48:01 +08:00
yangdx
ca63386546 Increase embedding priority for query request 2025-04-28 20:10:39 +08:00
yangdx
83353ab9a6 Remove unused node embedding functionality from graph storage
- Deleted embed_nodes() method implementations
2025-04-11 18:34:48 +08:00
yangdx
95a8ee27ed Fix linting 2025-03-31 23:22:27 +08:00
yangdx
3d4f8f67c9 Add drop_cache_by_modes to all KV storage implementations 2025-03-31 23:10:21 +08:00
yangdx
5b7cd50005 Add delete support for MongoKVStorage 2025-03-31 02:14:16 +08:00
yangdx
078cee390c Add drop support for all MongoDB storage type implementations 2025-03-31 02:10:58 +08:00
zrguo
56fa051917 fix lint 2025-03-25 13:24:52 +08:00