yangdx
6196bab00a
Update webui assets and bump api version to 0203
2025-08-17 10:39:16 +08:00
yangdx
1af0803c62
fix(ui): fix selection state management in paginated views
...
- Replace DeselectDocumentsDialog with smart selection button
- Auto-reset selection on page/filter changes
- Remove deletion restrictions and update i18n
2025-08-17 10:38:12 +08:00
yangdx
3e4214cef3
Standardize document deletion warning messages for consistency
2025-08-17 09:35:46 +08:00
yangdx
f76d926512
Merge branch 'main' into pg-optimization
2025-08-17 08:57:24 +08:00
yangdx
185b576101
Fix parameter reference and apply code formatting improvements
2025-08-17 04:02:43 +08:00
yangdx
3a7310873c
Merge branch 'bedrock-support'
2025-08-17 02:23:44 +08:00
yangdx
da7e4b79e5
Update documentation in README files
2025-08-17 02:23:14 +08:00
yangdx
1ed77a2e53
Remove openai-ollama binding from LightRAG level args
2025-08-17 02:13:50 +08:00
Daniel.y
459b0e4c44
Merge pull request #1965 from danielaskdd/rm-enqueued-file
...
Feat: Optimize error handling for document processing pipeline
2025-08-17 01:59:33 +08:00
yangdx
301acfc274
Update webui assets
2025-08-17 01:54:39 +08:00
yangdx
bd8ed905e8
Translate Chinese comments to English in ClearDocumentsDialog
2025-08-17 01:53:37 +08:00
yangdx
e566267a20
Implement smart polling recovery after document scan completion
...
• Add 15-second recovery timer
• Restore intelligent intervals
2025-08-17 01:51:11 +08:00
yangdx
e064534941
feat(ui): enhance ClearDocumentsDialog with loading spinner and timeout protection
...
- Add loading spinner animation during document clearing operation
- Implement 30-second timeout protection to prevent hanging operations
- Disable all interactive controls during clearing to prevent duplicate requests
- Add comprehensive error handling with automatic state reset
2025-08-17 01:33:39 +08:00
yangdx
45365ff6ef
Bump api version to 0202
2025-08-16 23:53:01 +08:00
yangdx
cceb46b320
fix: subdirectories are no longer processed during file scans
...
• Change rglob to glob for file scanning
• Simplify error logging messages
2025-08-16 23:46:33 +08:00
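As a rough illustration of the rglob-to-glob change in cceb46b320, the sketch below contrasts the two `pathlib` calls. The helper name `scan_top_level` and the extension tuple are hypothetical and are not taken from the project's code.

```python
import tempfile
from pathlib import Path


def scan_top_level(input_dir: Path, extensions: tuple) -> list:
    """Return files directly inside input_dir whose suffix is in extensions."""
    found = []
    for ext in extensions:
        # glob("*.txt") matches direct children only;
        # rglob("*.txt") would also descend into subdirectories.
        found.extend(p for p in input_dir.glob(f"*{ext}") if p.is_file())
    return sorted(found)


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("top level")
    (root / "sub").mkdir()
    (root / "sub" / "b.txt").write_text("nested")

    top_level = [p.name for p in scan_top_level(root, (".txt",))]
    recursive = sorted(p.name for p in root.rglob("*.txt"))
```

Here `top_level` holds only the file in the scanned directory, while `recursive` also picks up the nested one, which is the behavior the commit moves away from.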
yangdx
f5b0c3d38c
feat: Record file extraction error status in the document pipeline
...
- Add apipeline_enqueue_error_documents function to LightRAG class for recording file processing errors in doc_status storage
- Enhance pipeline_enqueue_file with detailed error handling for all file processing stages:
  * File access errors (permissions, not found)
  * UTF-8 encoding errors
  * Format-specific processing errors (PDF, DOCX, PPTX, XLSX)
  * Content validation errors
  * Unsupported file type errors
This implementation ensures all file extraction failures are properly tracked and recorded in the doc_status storage system, providing better visibility into document processing issues and enabling improved error monitoring and debugging capabilities.
2025-08-16 23:08:52 +08:00
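The staged error handling described in f5b0c3d38c could look roughly like the sketch below. It is only an illustration: `extract_text`, the `SUPPORTED` set, the record fields, and the plain list standing in for the doc_status storage are all assumptions, and the real `apipeline_enqueue_error_documents` signature is not shown in the commit message.

```python
import tempfile
from pathlib import Path
from typing import Optional

SUPPORTED = {".txt", ".md"}  # hypothetical subset of supported types


def extract_text(path: Path, errors: list) -> Optional[str]:
    """Return the file's text, or record an error entry and return None."""

    def record(kind: str, detail: str) -> None:
        # Stand-in for writing a failure record to doc_status storage.
        errors.append({"file_path": path.name, "error_type": kind, "error_msg": detail})

    if path.suffix.lower() not in SUPPORTED:
        return record("unsupported_type", f"Unsupported extension {path.suffix}")
    try:
        raw = path.read_bytes()
    except OSError as exc:  # covers missing files and permission problems
        return record("file_access", str(exc))
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return record("encoding", str(exc))
    if not text.strip():
        return record("content_validation", "File contains no text")
    return text


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "ok.txt").write_text("hello", encoding="utf-8")
    (root / "bad.txt").write_bytes(b"\xff\xfe\xfa")  # not valid UTF-8
    (root / "empty.md").write_text("", encoding="utf-8")

    errors = []
    names = ["ok.txt", "bad.txt", "empty.md", "img.png", "gone.txt"]
    texts = [extract_text(root / name, errors) for name in names]
    kinds = [entry["error_type"] for entry in errors]
```

Each failure produces one record instead of an unhandled exception, so a later step can list every file that did not make it into the pipeline.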
Matt23-star
a0593ec1c9
feat: enhance query performance by restructuring relationship, entity, and chunk retrieval in PostgreSQL
...
Fixed: duplicate items query
2025-08-16 22:49:54 +08:00
Matt23-star
6a7e3092ea
feat: optimize node and edge queries in PostgreSQL by querying tables directly
2025-08-16 22:37:48 +08:00
Matt23-star
a7da48e05c
feat: add batch size parameter to node and edge retrieval methods
2025-08-16 22:35:22 +08:00
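One common way to apply a batch size parameter such as the one added in a7da48e05c is to slice the ID list before querying. The helper below is a generic sketch; the name `iter_batches` and the default of 1000 are assumptions, not values from the commit.

```python
from typing import Iterator, Sequence


def iter_batches(ids: Sequence[str], batch_size: int = 1000) -> Iterator[list]:
    """Yield consecutive slices of ids, each at most batch_size long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        yield list(ids[start:start + batch_size])


batches = list(iter_batches(["a", "b", "c", "d", "e"], batch_size=2))
```

Each yielded batch would then feed one database query, which keeps individual statements bounded in size.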
yangdx
ca4c18baaa
Preserve failed documents during data consistency validation for manual review
2025-08-16 22:29:46 +08:00
yangdx
e1310c5262
Optimize document processing pipeline by removing duplicate step
2025-08-16 17:23:01 +08:00
yangdx
5591ef3ac8
Fix document filtering logic and improve logging for ignored docs
2025-08-16 17:22:08 +08:00
yangdx
5d00c4c7a8
feat: move processed files to __enqueued__ directory after processing, with filename conflict handling
2025-08-16 13:19:20 +08:00
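A minimal sketch of the move described in 5d00c4c7a8, assuming conflicts are resolved by appending a numeric suffix. The suffix scheme and the helper names `unique_target` and `move_to_enqueued` are illustrative guesses; only the `__enqueued__` directory name comes from the commit subject.

```python
import shutil
import tempfile
from pathlib import Path


def unique_target(directory: Path, filename: str) -> Path:
    """Return a path in directory that does not collide with an existing file."""
    candidate = directory / filename
    stem, suffix = Path(filename).stem, Path(filename).suffix
    counter = 1
    while candidate.exists():
        # Assumed scheme: report.txt -> report_001.txt -> report_002.txt ...
        candidate = directory / f"{stem}_{counter:03d}{suffix}"
        counter += 1
    return candidate


def move_to_enqueued(file_path: Path) -> Path:
    """Move file_path into a sibling __enqueued__ directory and return the new path."""
    enqueued_dir = file_path.parent / "__enqueued__"
    enqueued_dir.mkdir(exist_ok=True)
    target = unique_target(enqueued_dir, file_path.name)
    shutil.move(str(file_path), str(target))
    return target


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "doc.txt"
    source.write_text("first upload")
    first = move_to_enqueued(source).name
    source.write_text("second upload with the same name")
    second = move_to_enqueued(source).name
    source_still_exists = source.exists()
```

The second upload keeps its content and does not overwrite the first, at the cost of a renamed file in `__enqueued__`.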
SJ
f7ca9ae16a
Ruff formatted
2025-08-15 22:21:34 +00:00
yangdx
dc7a6e1c5b
Update README
2025-08-16 06:15:27 +08:00
SJ
3aa3332505
Merge pull request #1 from HKUDS/main
...
merge
2025-08-15 17:09:03 -05:00
Daniel.y
bdd1169cfb
Merge pull request #1959 from danielaskdd/pick-trunk-by-vector
...
Feat: add KG related chunks selection by vector similarity
2025-08-15 19:33:51 +08:00
yangdx
2a781dfb91
Update Neo4j database naming in env.example
2025-08-15 19:14:38 +08:00
yangdx
3a227e37b8
Add get_vectors_by_ids method to MongoVectorDBStorage
2025-08-15 16:53:14 +08:00
yangdx
7a7385a200
Add efficient vector retrieval by IDs to PGVectorStorage
2025-08-15 16:51:41 +08:00
yangdx
8f7031b882
Add get_vectors_by_ids method to QdrantVectorDBStorage
2025-08-15 16:46:52 +08:00
yangdx
a71499a180
Add get_vectors_by_ids method to MilvusVectorDBStorage
2025-08-15 16:36:50 +08:00
yangdx
1e2d5252d7
Add get_vectors_by_ids method and filter out vector data from query results
2025-08-15 16:32:26 +08:00
yangdx
6cab68bb47
Improve KG chunk selection documentation and configuration clarity
2025-08-15 10:09:44 +08:00
yangdx
3acb32f547
Add comments explaining chunk deduplication behavior in query context
2025-08-15 02:19:01 +08:00
yangdx
0b45d463df
Add .clinerules to .gitignore
2025-08-15 00:43:45 +08:00
yangdx
f733ac829c
Remove debug logging statements from query context building
2025-08-14 23:44:34 +08:00
yangdx
4a19d0de25
Add chunk tracking system to monitor chunk sources and frequencies
...
• Track chunk sources (E/R/C types)
• Log frequency and order metadata
• Preserve chunk_id through processing
• Add debug logging for chunk tracking
• Handle rerank and truncation operations
2025-08-14 22:58:26 +08:00
yangdx
a8b7890470
Rename chunk selection functions for better clarity
2025-08-14 16:01:13 +08:00
yangdx
a11e8d77eb
Improve missing-vector warning logic in vector similarity
...
- Check for any missing vectors
- Separate no-vector vs partial-vector warnings
- Ensure early return on empty vectors
2025-08-14 14:24:15 +08:00
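The three points in a11e8d77eb (detect any missing vectors, warn differently for none versus some, return early when nothing is available) can be sketched as follows. The function and logger names are hypothetical, and ranking by cosine similarity is an assumption based on the branch's stated goal of selecting chunks by vector similarity.

```python
import logging
import math

logger = logging.getLogger("chunk_selection_sketch")


def cosine(a: list, b: list) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def rank_chunks(query_vec: list, chunk_ids: list, vectors: dict) -> list:
    """Rank chunk IDs by similarity to query_vec, skipping IDs without vectors."""
    available = [cid for cid in chunk_ids if cid in vectors]
    missing = len(chunk_ids) - len(available)
    if not available:
        # No-vector case: nothing to rank, so warn and return early.
        logger.warning("No vectors found for any of %d chunks", len(chunk_ids))
        return []
    if missing:
        # Partial case: continue with what is available.
        logger.warning("Vectors missing for %d of %d chunks", missing, len(chunk_ids))
    return sorted(available, key=lambda cid: cosine(query_vec, vectors[cid]), reverse=True)


stored = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
ranked = rank_chunks([1.0, 0.0], ["b", "a", "c"], stored)
empty = rank_chunks([1.0, 0.0], ["x", "y"], stored)
```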
yangdx
5c7ae8721b
Merge branch 'main' into pick-trunk-by-vector
2025-08-14 13:11:14 +08:00
Daniel.y
79d5210988
Merge pull request #1954 from danielaskdd/pipeline-refactor
...
Feat: Reprocessing of failed documents without the original file being present
2025-08-14 13:09:23 +08:00
yangdx
3bba5fc506
Fix linting
2025-08-14 13:03:23 +08:00
yangdx
772f981e7e
fix: check and process queued docs even when upload directory is empty
2025-08-14 12:35:39 +08:00
yangdx
65a4437f78
Fix: Persist document data immediately after index update
2025-08-14 12:33:36 +08:00
yangdx
28fc075c59
Simplify inconsistency logging and cleanup messages
2025-08-14 11:49:58 +08:00
yangdx
17faeb2fb8
refactor: integrate document consistency validation into pipeline processing
...
This ensures data consistency validation is part of the main processing pipeline and provides better monitoring of inconsistent document cleanup operations.
2025-08-14 11:38:36 +08:00
yangdx
a3f7bc5b7e
Merge branch 'main' into pick-trunk-by-vector
2025-08-14 06:19:57 +08:00
yangdx
b5ae84fac6
fix: Add data consistency validation to document processing pipeline
...
- Add _validate_and_fix_document_consistency() method to detect and fix documents with missing content in full_docs storage
- Integrate consistency check into apipeline_process_enqueue_documents() to automatically mark inconsistent documents as FAILED before processing
- Prevent processing errors caused by documents having status records but missing actual content data
2025-08-14 06:18:34 +08:00
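The consistency check described in b5ae84fac6 might be approximated as below, using plain dictionaries in place of the doc_status and full_docs storages. The function name, the record fields, and the restriction to PENDING documents are assumptions made for illustration; the actual `_validate_and_fix_document_consistency()` may differ.

```python
def mark_inconsistent_documents(doc_status: dict, full_docs: dict) -> list:
    """Mark queued documents whose content is missing as FAILED; return their IDs."""
    fixed = []
    for doc_id, record in doc_status.items():
        if record["status"] != "PENDING":
            continue  # only documents waiting to be processed are checked here
        if doc_id not in full_docs:
            # A status record exists but the content is gone: fail it up front
            # so the processing step never tries to read missing data.
            record["status"] = "FAILED"
            record["error_msg"] = "Content missing from full_docs"
            fixed.append(doc_id)
    return fixed


doc_status = {
    "doc-1": {"status": "PENDING"},
    "doc-2": {"status": "PENDING"},
    "doc-3": {"status": "PROCESSED"},
}
full_docs = {"doc-1": {"content": "some text"}}
fixed_ids = mark_inconsistent_documents(doc_status, full_docs)
```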
yangdx
cb122c63e4
Merge branch 'main' into pick-trunk-by-vector
2025-08-14 05:34:15 +08:00