95 Commits

Author SHA1 Message Date
yangdx
74eecc46e5 feat(pagination): Implement document list pagination backends and frontend UI
- Add pagination support to BaseDocStatusStorage interface and all implementations (PostgreSQL, MongoDB, Redis, JSON)
- Implement RESTful API endpoints for paginated document queries and status counts
- Create reusable pagination UI components with internationalization support
- Optimize performance with database-level pagination and efficient in-memory processing
- Maintain backward compatibility while adding configurable page sizes (10-200 items)
2025-07-30 17:58:32 +08:00
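The in-memory side of this pagination change can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the names `Page` and `paginate` and the default page size of 50 are assumptions, and only the 10-200 bounds come from the commit message.

```python
from dataclasses import dataclass
from typing import Generic, List, Sequence, TypeVar

T = TypeVar("T")

# Bounds taken from the commit message (configurable page sizes of 10-200 items).
MIN_PAGE_SIZE = 10
MAX_PAGE_SIZE = 200


@dataclass
class Page(Generic[T]):
    """Hypothetical container for one page of results."""
    items: List[T]
    page: int
    page_size: int
    total_count: int
    total_pages: int


def paginate(docs: Sequence[T], page: int = 1, page_size: int = 50) -> Page[T]:
    """Return one page of `docs`, clamping page_size and page to valid ranges."""
    page_size = max(MIN_PAGE_SIZE, min(MAX_PAGE_SIZE, page_size))
    total_count = len(docs)
    # Ceiling division; an empty collection still reports one (empty) page.
    total_pages = max(1, -(-total_count // page_size))
    page = max(1, min(page, total_pages))
    start = (page - 1) * page_size
    items = list(docs[start:start + page_size])
    return Page(items, page, page_size, total_count, total_pages)
```

A database backend would normally push the same arithmetic into the query (for example `LIMIT page_size OFFSET start`) instead of slicing a full list.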
yangdx
c24c2ff2f6 Remove deprecated temp file saving function
- Delete unused save_temp_file function
2025-07-30 14:23:08 +08:00
yangdx
29e829113b Fix status key serialization issue in get_track_status 2025-07-30 04:45:48 +08:00
yangdx
7207598fc4 Fix track_id bugs and add track_id to scanning response 2025-07-30 03:06:20 +08:00
yangdx
6f958d5aee feat: add metadata timestamps to document processing and update frontend compatibility
- Add metadata field to doc_status storage with Unix timestamps for processing start/end times
- Update frontend API types: error -> error_msg, add track_id and metadata support
- Add getTrackStatus API method for document tracking functionality
- Fix frontend DocumentManager to use error_msg field for proper error display
- Ensure full compatibility between backend metadata changes and frontend UI
2025-07-30 00:04:27 +08:00
yangdx
6014b9bf73 feat: add track_id support for document processing progress monitoring
- Add get_docs_by_track_id() method to all storage backends (MongoDB, PostgreSQL, Redis, JSON)
- Implement automatic track_id generation with upload_/insert_ prefixes
- Add /track_status/{track_id} API endpoint for frontend progress queries
- Create database indexes for efficient track_id lookups
- Enable real-time document processing status tracking across all storage types
2025-07-29 22:24:21 +08:00
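A rough sketch of prefixed track_id generation and lookup follows. The commit only states that IDs carry an `upload_` or `insert_` prefix; the timestamp-plus-random-suffix format, the function names, and the dict-based storage are assumptions made for illustration.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict

ALLOWED_PREFIXES = ("upload", "insert")  # prefixes named in the commit message


def generate_track_id(prefix: str = "upload") -> str:
    """Build a track_id such as 'upload_<timestamp>_<random>' (format assumed)."""
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"unsupported track_id prefix: {prefix!r}")
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    suffix = uuid.uuid4().hex[:8]  # random part keeps IDs unique within a second
    return f"{prefix}_{timestamp}_{suffix}"


def filter_docs_by_track_id(
    doc_status: Dict[str, Dict[str, Any]], track_id: str
) -> Dict[str, Dict[str, Any]]:
    """Hypothetical in-memory lookup: keep documents whose track_id matches."""
    return {
        doc_id: record
        for doc_id, record in doc_status.items()
        if record.get("track_id") == track_id
    }
```

In a database-backed store the same lookup would rely on the track_id index mentioned in the commit instead of scanning every record.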
yangdx
910c6973f3 Limit file deletion to current directory only after document cleaning 2025-07-16 20:35:24 +08:00
yangdx
033098c1bc Feat: Add WORKSPACE support to all storage types 2025-07-07 00:57:21 +08:00
yangdx
98150e80b8 Improved empty/whitespace file handling
- Better detection of whitespace-only files
- Changed error to warning for empty chunks
2025-07-05 23:16:39 +08:00
xuewei
49cb51b5dc PDF file parsing returns no content 2025-07-05 13:47:47 +08:00
yangdx
04d793abbd Update logger message 2025-07-03 22:15:32 +08:00
yangdx
67f51597c2 Bump api version to 0178 2025-07-03 21:37:47 +08:00
yangdx
05231233f1 Feat: Check pending request_pending after document deletion
- Add double-check for pipeline status to prevent race conditions
- Implement automatic processing of pending indexing requests after deletion
2025-07-03 21:36:35 +08:00
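One way to read the double-check described above is sketched here: the pipeline state is re-examined while holding a lock before any pending request is picked up. The status keys `busy` and `request_pending`, the function name, and the `run_indexing` callback are assumptions. The commit does not show the actual implementation.

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def process_pending_after_deletion(
    status: Dict[str, bool],
    lock: asyncio.Lock,
    run_indexing: Callable[[], Awaitable[None]],
) -> bool:
    """Run queued indexing after a deletion, if nothing else claimed the pipeline."""
    async with lock:
        # Re-check under the lock: another task may have started the pipeline
        # or consumed the pending request since the deletion finished.
        if status.get("busy") or not status.get("request_pending"):
            return False
        status.update({"busy": True, "request_pending": False})
    try:
        await run_indexing()
    finally:
        async with lock:
            status["busy"] = False
    return True
```

Claiming the pipeline and clearing the flag inside one locked section is what stops two tasks from both starting the pending job.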
yangdx
a506753548 Fix linting 2025-06-27 02:33:20 +08:00
yangdx
60777d535b fix: prevent Path Traversal vulnerability in upload endpoint
- Add sanitize_filename() function to validate and clean uploaded filenames
- Remove path separators, traversal sequences, and control characters
- Verify final paths stay within input directory using Path.resolve()
- Return HTTP 400 errors for unsafe filenames
- Prevents directory traversal attacks like ../../../etc/passwd
2025-06-27 02:33:05 +08:00
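The steps listed in this commit can be sketched as below. Only the function name `sanitize_filename` and the use of `Path.resolve()` come from the commit message; the signature, the `UnsafeFilenameError` exception, and the exact cleaning rules are assumptions. A web handler would translate the exception into the HTTP 400 response the commit mentions.

```python
import re
from pathlib import Path


class UnsafeFilenameError(ValueError):
    """Hypothetical error type; an API layer could map it to HTTP 400."""


def sanitize_filename(filename: str, input_dir: Path) -> str:
    """Clean an uploaded filename and verify it stays inside input_dir."""
    # Drop ASCII control characters (including NUL).
    cleaned = re.sub(r"[\x00-\x1f\x7f]", "", filename)
    # Remove path separators so the name cannot address another directory.
    cleaned = cleaned.replace("/", "").replace("\\", "")
    # Remove traversal sequences, then trim surrounding whitespace and dots.
    cleaned = cleaned.replace("..", "").strip().strip(".")
    if not cleaned:
        raise UnsafeFilenameError(f"unsafe filename: {filename!r}")

    # Final check: the resolved target must be located under the input directory.
    base = input_dir.resolve()
    target = (base / cleaned).resolve()
    if base not in target.parents:
        raise UnsafeFilenameError(f"path escapes input directory: {filename!r}")
    return cleaned
```

The resolve step is the real guard here. The string cleaning mainly keeps stored names tidy, while the containment check also catches cases the cleaning rules did not anticipate, such as symlinks.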
yangdx
8fb1c09b08 Refac: pipeline message 2025-06-26 01:00:54 +08:00
yangdx
bdcd55a871 Feat: Add delete upload file option to document deletion 2025-06-25 19:02:46 +08:00
yangdx
51bb0471cd Change the API for deleting documents to support deleting multiple documents at once. 2025-06-25 16:19:49 +08:00
yangdx
495d6c8cce Improve the pipeline status message for document deletion 2025-06-25 15:46:58 +08:00
yangdx
2aaa6d5f7d Fix linting 2025-06-25 14:59:45 +08:00
yangdx
49baeb7318 Change document deletion API to async 2025-06-25 14:59:10 +08:00
yangdx
922484915b Remove deprecated API endpoint. 2025-06-25 13:55:47 +08:00
yangdx
8b6dcfb6eb Please do not use the /delete_document API endpoint 2025-06-24 11:26:38 +08:00
yangdx
5ae945c1e5 Improved error handling for document deletion
Added HTTPException for not_found status
Added HTTPException for fail status
2025-06-24 01:12:25 +08:00
yangdx
c18065a912 Disable document deletion when LLM cache for extraction is off 2025-06-23 22:41:27 +08:00
yangdx
1973c80dca Feat: Add entity and relation deletion endpoints 2025-06-23 22:14:50 +08:00
yangdx
bd487dd252 Unify document APIs return status string 2025-06-23 21:38:47 +08:00
yangdx
5099ac8213 Fix linting 2025-06-23 18:41:30 +08:00
yangdx
dffe659388 Feat: Add document deletion by ID API endpoint
- New DELETE endpoint for document removal
- Implements doc_id-based deletion
- Handles pipeline status during operation
- Includes proper error handling
- Updates pipeline status messages
2025-06-23 18:10:40 +08:00
yangdx
a6046bf827 Fix linting 2025-05-22 10:06:09 +08:00
Benjamin L
1b6ddcaf5b change validator method names 2025-05-21 16:06:35 +02:00
Benjamin L
62b536ea6f Adding file_source(s) as optional attribute to text(s) requests 2025-05-21 15:10:27 +02:00
yangdx
36f8787bc7 Fix linting 2025-05-01 10:04:31 +08:00
yangdx
a561be0cff Fix time zone problem of doc status 2025-05-01 02:16:19 +08:00
yangdx
31bd274601 Add Unicode collation for Chinese file sorting of document scanning 2025-04-25 01:02:09 +08:00
yangdx
3aab5b41f2 Fix linting 2025-04-24 14:15:10 +08:00
yangdx
fc425f1397 Send all found files to pipeline at once 2025-04-24 14:00:43 +08:00
cuikunyu
135a40d696 Optimize: Use python-docx for better parsing. 2025-04-11 03:10:20 +00:00
yangdx
bd2c528dba Merge branch 'optimize-config-management' into clear-doc 2025-04-04 19:46:45 +08:00
yangdx
b0f0f1ff84 refactor: improve document clearing status management
- Use update() for atomic status updates
- Improve history messages clearing while preserving list object
2025-04-01 14:03:45 +08:00
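The two points in this refactor can be illustrated with plain Python containers. The status keys and message text below are assumptions. The sketch only shows why a single `update()` call and an in-place `clear()` matter when other code holds references to the same dict and list.

```python
from typing import Any, Dict


def start_clearing(status: Dict[str, Any]) -> None:
    """Mark the (hypothetical) shared pipeline status as busy with a clearing job."""
    # clear() empties the list in place, so anyone holding a reference to
    # history_messages keeps seeing the live list. Rebinding the key to a
    # new [] would leave those holders with a stale object.
    status["history_messages"].clear()
    # One update() call sets the related fields together, not through
    # several separate assignments that could be observed half-done.
    status.update(
        {
            "busy": True,
            "job_name": "Clearing Documents",
            "latest_message": "Starting document clearing",
        }
    )
    status["history_messages"].append("Starting document clearing")


pipeline_status = {
    "busy": False,
    "latest_message": "",
    "history_messages": ["old message"],
}
history_ref = pipeline_status["history_messages"]  # e.g. held by a status endpoint
start_clearing(pipeline_status)
```

After the call, `history_ref` and `pipeline_status["history_messages"]` are still the same list object, which is the property the commit describes preserving.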
yangdx
cd94e84267 Update clear cache endpoint path 2025-04-01 10:36:28 +08:00
yangdx
d54bda8d36 feat(api): Add Pydantic models for all endpoints in document_routes.py 2025-03-31 23:53:14 +08:00
yangdx
8845779ed7 Add clear cache API endpoint 2025-03-31 23:37:03 +08:00
yangdx
95a8ee27ed Fix linting 2025-03-31 23:22:27 +08:00
yangdx
04967b33cc feat(api): Add dedicated ClearDocumentsResponse class for document deletion endpoint 2025-03-31 19:13:27 +08:00
yangdx
bbc770d1ed feat(api): enhance document clearing error handling and status reporting
- Change pipeline busy status from "error" to "busy"
- Improve error handling documentation
2025-03-31 13:01:52 +08:00
Milin
4dbd5e3899 Merge branch 'main' into optimize-config-management
# Conflicts:
#	env.example
#	lightrag/api/utils_api.py
2025-03-31 11:29:29 +08:00
Milin
088fc19318 feat(config): Refactor configuration management
- Optimize JWT Auth module to load configuration via `global_args`.
- Decouple configuration-related code from `utils_api.py`, and add a new `config.py` file for unified configuration management.
- Adjust configuration import in `lightrag_server.py`, `auth.py`, and `document_routes.py` to be introduced through `global_args`.
2025-03-31 11:19:47 +08:00
yangdx
949a3904a9 feat(api): Enhance document clearing functionality
- Use storage drop methods to properly clean up all data
- Add file deletion from input directory
- Add pipeline status checking and locking mechanism
- Improve error handling with detailed logging and pipeline message tracking
2025-03-30 16:30:41 +08:00
yangdx
adb4ca9294 Fix linting 2025-03-28 16:49:35 +08:00