4179 Commits

Author SHA1 Message Date
yangdx
ccc2a20071 feat: remove deprecated MAX_TOKEN_SUMMARY parameter to prevent LLM output truncation
- Remove MAX_TOKEN_SUMMARY parameter and related configurations
- Eliminate forced token-based truncation in entity/relationship descriptions
- Switch to fragment-count based summarization logic using FORCE_LLM_SUMMARY_ON_MERGE
- Update FORCE_LLM_SUMMARY_ON_MERGE default from 6 to 4 for better summarization
- Clean up documentation, environment examples, and API display code
- Preserve backward compatibility by graceful parameter removal

This change resolves issues where LLMs were forcibly truncating entity relationship
descriptions mid-sentence, leading to incomplete and potentially inaccurate knowledge
graph content. The new approach allows LLMs to generate complete descriptions while
still providing summarization when multiple fragments need to be merged.

Breaking Change: None - parameter removal is backward compatible
Fixes: Entity relationship description truncation issues
2025-07-15 12:26:33 +08:00
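The fragment-count rule described in this commit can be sketched as below. This is a minimal illustration, not the project's code: `merge_descriptions`, `FRAGMENT_SEP`, and the strict `>` comparison are assumptions; only the default threshold of 4 comes from the log.

```python
from typing import Callable, List

# Hypothetical separator; the log does not say how fragments are joined.
FRAGMENT_SEP = "\n"
# Default taken from the commit message (changed from 6 to 4).
FORCE_LLM_SUMMARY_ON_MERGE = 4


def merge_descriptions(
    fragments: List[str],
    summarize: Callable[[str], str],
    threshold: int = FORCE_LLM_SUMMARY_ON_MERGE,
) -> str:
    """Join fragments; call the summarizer only when there are many of them."""
    joined = FRAGMENT_SEP.join(fragments)
    # Assumption: "exceeds" is read as a strict comparison.
    if len(fragments) > threshold:
        return summarize(joined)
    # Few fragments: keep the full text, with no token-based truncation.
    return joined
```

With this shape, a description is never cut mid-sentence; it is either kept whole or handed to the summarizer in full.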
Daniel.y
e943591e68
Merge pull request #1779 from danielaskdd/fix-postgres-timezone
Fix: Resolve timezone handling problem in PostgreSQL storage
2025-07-14 04:20:38 +08:00
yangdx
7e988158a9 Fix: Resolve timezone handling problem in PostgreSQL storage
- Changed timestamp columns to naive UTC
- Added datetime formatting utilities
- Updated SQL templates for timestamp extraction
- Simplified timestamp migration logic
2025-07-14 04:12:52 +08:00
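A "naive UTC" timestamp is a datetime converted to UTC with its timezone information dropped. The helper below is a small sketch of that conversion using only the standard library; the name `to_naive_utc` and the choice to treat naive inputs as already UTC are assumptions, not taken from the commit.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(dt: datetime) -> datetime:
    """Convert an aware datetime to UTC and drop tzinfo.

    Assumption: a naive input is treated as already being in UTC.
    """
    if dt.tzinfo is None:
        return dt
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


# The commit time above, expressed in UTC+8, becomes a naive UTC value.
local = datetime(2025, 7, 14, 4, 12, 52, tzinfo=timezone(timedelta(hours=8)))
print(to_naive_utc(local).isoformat())
```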
Daniel.y
375bfd57a4
Merge pull request #1778 from danielaskdd/add-ollama-num-ctx
feat: Refine summary logic and add dedicated Ollama num_ctx config
2025-07-14 02:13:08 +08:00
yangdx
b03bb48e24 feat: Refine summary logic and add dedicated Ollama num_ctx config
- Refactor the trigger condition for LLM-based summarization of entities and relations. Instead of relying on character length, the summary is now triggered when the number of merged description fragments exceeds a configured threshold. This provides a more robust and logical condition for consolidation.
- Introduce the `OLLAMA_NUM_CTX` environment variable to explicitly configure the context window size (`num_ctx`) for Ollama models. This decouples the model's context length from the `MAX_TOKENS` parameter, which is now specifically used to limit input for summary generation, making the configuration clearer and more flexible.
- Updated `README` files, `env.example`, and default values to reflect these changes.
2025-07-14 01:55:04 +08:00
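As a rough sketch of the `OLLAMA_NUM_CTX` setting, the helper below reads the variable and builds the `options` mapping that Ollama requests accept. The function name `ollama_options` and the fallback value of 32768 are assumptions for illustration; the log does not state the project's default.

```python
import os
from typing import Dict, Mapping, Optional


def ollama_options(
    env: Optional[Mapping[str, str]] = None,
    default_num_ctx: int = 32768,  # assumed fallback, not stated in the log
) -> Dict[str, int]:
    """Build the Ollama `options` mapping from OLLAMA_NUM_CTX."""
    env = os.environ if env is None else env
    raw = env.get("OLLAMA_NUM_CTX", "").strip()
    num_ctx = int(raw) if raw else default_num_ctx
    if num_ctx <= 0:
        raise ValueError("OLLAMA_NUM_CTX must be a positive integer")
    return {"num_ctx": num_ctx}
```

Keeping this separate from `MAX_TOKENS` means the context window can be resized without changing how much input the summarizer receives.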
yangdx
e8b3dfcf90 Bump api version to 0182 2025-07-14 00:29:48 +08:00
Daniel.y
47588fa9c9
Merge pull request #1777 from danielaskdd/fix-postgres-field-len
Increase field lengths for entity and file paths for PostgreSQL
2025-07-14 00:27:44 +08:00
yangdx
157fb4c871 Increase field lengths for entity and file paths for PostgreSQL
- Expand entity_name length to 512 chars
- Increase source/target ID lengths
- Convert file_path to TEXT type
- Add migration logic
2025-07-14 00:24:54 +08:00
Daniel.y
568a809957
Merge pull request #1776 from danielaskdd/fix-redisk-conn-pool
Increase max length limits for Milvus storage fields
2025-07-13 23:26:08 +08:00
yangdx
187a623125 Increase max length limits for Milvus storage fields
- Extended entity_name max_length to 512
- Increased entity_type max_length to 128
- Expanded file_path limits to 1024
- Raised src_id/tgt_id limits to 512
2025-07-13 23:13:45 +08:00
Daniel.y
77b8fb9b77
Merge pull request #1775 from danielaskdd/fix-redisk-conn-pool
Hotfix: Resolves connection pool bugs for Redis
2025-07-13 22:57:18 +08:00
yangdx
6730a89d7c Hotfix: Resolves connection pool bugs for Redis
- The previous implementation of the shared Redis connection pool had a critical issue where any Redis storage instance would disconnect the global shared pool upon closing. This caused `ConnectionError` exceptions for other instances still using the pool.
- This commit resolves the issue by introducing a reference counting mechanism in `RedisConnectionManager`.
2025-07-13 22:54:34 +08:00
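The reference counting idea can be illustrated without Redis. In the sketch below, `SharedPoolManager`, `factory`, and `closer` are hypothetical names; the actual `RedisConnectionManager` interface is not shown in the log.

```python
import threading
from typing import Any, Callable, Dict, List


class SharedPoolManager:
    """Reference-counted registry of shared pools, keyed by URL."""

    def __init__(self, factory: Callable[[str], Any], closer: Callable[[Any], None]):
        self._factory = factory
        self._closer = closer
        self._pools: Dict[str, List[Any]] = {}  # url -> [pool, ref_count]
        self._lock = threading.Lock()

    def acquire(self, url: str) -> Any:
        """Return the shared pool for url, creating it on first use."""
        with self._lock:
            if url not in self._pools:
                self._pools[url] = [self._factory(url), 0]
            entry = self._pools[url]
            entry[1] += 1
            return entry[0]

    def release(self, url: str) -> None:
        """Drop one reference; close the pool only when nobody uses it."""
        with self._lock:
            entry = self._pools.get(url)
            if entry is None:
                return
            entry[1] -= 1
            if entry[1] > 0:
                return
            del self._pools[url]
        self._closer(entry[0])
```

One instance closing no longer disconnects the others: the pool is closed only by the release that brings the count to zero.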
yangdx
f185b3fb38 Optimize async task limits for graph processing
- Increased concurrency for graph operations
- Renamed variables for clarity
- Updated status messages
2025-07-13 21:51:19 +08:00
yangdx
ab805b35c4 Update doc: explain concurrency 2025-07-13 21:50:30 +08:00
Daniel.y
7945d7de59
Merge pull request #1774 from danielaskdd/fix-keyed-lock-error
Hotfix: prevent premature lock cleanup in multiprocess mode
2025-07-13 14:13:32 +08:00
yangdx
85cd1178a1 fix: prevent premature lock cleanup in multiprocess mode
- Change cleanup condition from count == 1 to count == 0 to properly
remove reused locks from cleanup list
- Fix RuntimeError: Attempting to release lock for xxxx more times than it was acquired
2025-07-13 13:51:48 +08:00
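One way to read this fix: a key whose count has dropped to zero sits on a cleanup list, and acquiring it again must take it off that list before the count is raised. The sketch below assumes that reading; `KeyedRefCounts` and its fields are hypothetical and do not mirror the project's data structures.

```python
from typing import Dict, Set


class KeyedRefCounts:
    """Track per-key lock usage and which idle keys await cleanup."""

    def __init__(self) -> None:
        self.counts: Dict[str, int] = {}
        self.pending_cleanup: Set[str] = set()

    def acquire(self, key: str) -> None:
        count = self.counts.get(key, 0)
        if count == 0:
            # The key is new or being reused: it must not be cleaned up.
            self.pending_cleanup.discard(key)
        self.counts[key] = count + 1

    def release(self, key: str) -> None:
        count = self.counts.get(key, 0)
        if count == 0:
            raise RuntimeError(f"lock for {key} released more times than acquired")
        self.counts[key] = count - 1
        if count - 1 == 0:
            # Idle now: eligible for a later cleanup pass.
            self.pending_cleanup.add(key)
```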
yangdx
03b40937f7 Reduce embedding concurrency limit from 16 to 8 2025-07-13 03:13:52 +08:00
yangdx
a2eeae9661 Fixes incorrect cleanup count 2025-07-13 02:38:36 +08:00
yangdx
582e952020 Disable direct logging by default for shared storage module 2025-07-13 01:58:50 +08:00
yangdx
cbf544b3c1 Remove redundant log message 2025-07-13 01:51:30 +08:00
yangdx
e4aef36977 Update webui assets 2025-07-13 01:36:25 +08:00
yangdx
fc7b0a9273 Improve query settings input experience in WebUI 2025-07-13 01:35:21 +08:00
yangdx
465757aa6a Increase status card label width and dialog size 2025-07-13 01:14:12 +08:00
yangdx
efc359c411 Update webui assets 2025-07-13 00:57:41 +08:00
Daniel.y
0e52582bf2
Merge pull request #1773 from danielaskdd/cleanup-lock-in-healthcheck
Feat: Added reranker config and lock status to status card of WebUI
2025-07-13 00:52:57 +08:00
Daniel.y
9674ade611
Merge pull request #1772 from danielaskdd/rebuild-chunks-in-parallel
Optimize knowledge graph rebuild with parallel processing
2025-07-13 00:49:52 +08:00
yangdx
eb31ff0f90 Update i18n translation 2025-07-13 00:46:12 +08:00
yangdx
ab561196ff Feat: Added reranker config and lock status to status card of WebUI 2025-07-13 00:41:54 +08:00
yangdx
0e3aaa318f Feat: Add keyed lock cleanup and status monitoring 2025-07-13 00:09:00 +08:00
Daniel.y
9f3af332ec
Merge pull request #1771 from danielaskdd/namespace-keyed-lock
Refac: Generalize keyed lock with namespace support
2025-07-12 13:29:38 +08:00
yangdx
e4bf4d19a0 Optimize knowledge graph rebuild with parallel processing
- Add parallel processing for KG rebuild
- Implement keyed locks for data consistency
2025-07-12 13:22:56 +08:00
yangdx
a85d7054d4 fix: move node existence check inside lock to prevent race condition
Move knowledge_graph_inst.has_node check inside get_storage_keyed_lock
in _merge_edges_then_upsert to ensure atomic check-then-act operations
and prevent duplicate node creation during concurrent updates.
2025-07-12 12:22:32 +08:00
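The race this commit closes is the classic check-then-act problem: if the existence check runs outside the lock, two tasks can both see "missing" and both create the node. A minimal asyncio sketch follows; `InMemoryGraph` and `ensure_node` are illustrative stand-ins, not the project's storage classes.

```python
import asyncio
from typing import Dict


class InMemoryGraph:
    """Toy graph store that yields control like a real async backend."""

    def __init__(self) -> None:
        self.nodes: Dict[str, dict] = {}
        self.upserts = 0

    async def has_node(self, name: str) -> bool:
        await asyncio.sleep(0)
        return name in self.nodes

    async def upsert_node(self, name: str, data: dict) -> None:
        await asyncio.sleep(0)
        self.upserts += 1
        self.nodes[name] = data


async def ensure_node(graph: InMemoryGraph, locks: Dict[str, asyncio.Lock], name: str) -> None:
    lock = locks.setdefault(name, asyncio.Lock())
    async with lock:
        # Check and create under the same lock, so the pair is atomic.
        if not await graph.has_node(name):
            await graph.upsert_node(name, {"entity_id": name})


async def demo() -> int:
    graph = InMemoryGraph()
    locks: Dict[str, asyncio.Lock] = {}
    await asyncio.gather(*(ensure_node(graph, locks, "A") for _ in range(10)))
    return graph.upserts
```

Running `asyncio.run(demo())` performs a single upsert even though ten tasks race for the same node.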
yangdx
2ade3067f8 Refac: Generalize keyed lock with namespace support
Refactored the `KeyedUnifiedLock` to be generic and support dynamic namespaces. This decouples the locking mechanism from a specific "GraphDB" implementation, allowing it to be reused across different components and workspaces safely.

Key changes:
- `KeyedUnifiedLock` now takes a `namespace` parameter on lock acquisition.
- Renamed `_graph_db_lock_keyed` to a more generic `_storage_keyed_lock`
- Replaced `get_graph_db_lock_keyed` with `get_storage_keyed_lock` to support namespaces
2025-07-12 12:10:12 +08:00
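Namespacing a keyed lock amounts to making the namespace part of the lock's identity, so the same key in two namespaces never contends. The class below is a simplified single-process sketch; `NamespacedKeyedLock` is a hypothetical name and does not reproduce `KeyedUnifiedLock`'s interface.

```python
import asyncio
from typing import Dict, Tuple


class NamespacedKeyedLock:
    """Hand out one asyncio.Lock per (namespace, key) pair."""

    def __init__(self) -> None:
        self._locks: Dict[Tuple[str, str], asyncio.Lock] = {}

    def get(self, namespace: str, key: str) -> asyncio.Lock:
        ident = (namespace, key)
        if ident not in self._locks:
            self._locks[ident] = asyncio.Lock()
        return self._locks[ident]


async def demo() -> Tuple[bool, bool]:
    registry = NamespacedKeyedLock()
    same = registry.get("graph", "node-1") is registry.get("graph", "node-1")
    async with registry.get("graph", "node-1"):
        # The same key in another namespace is a different, unlocked lock.
        other_free = not registry.get("docs", "node-1").locked()
    return same, other_free
```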
yangdx
f2d875f8ab Update comments 2025-07-12 11:05:25 +08:00
yangdx
943ead8b1d Bump api version to 0181 2025-07-12 05:59:13 +08:00
Daniel.y
b0ca25e5f1
Merge pull request #1768 from schmidt-marvin/main
fix(build): pyproject.toml setup
2025-07-12 05:44:48 +08:00
Daniel.y
ad7d7d0854
Merge pull request #1770 from danielaskdd/merge_lock_with_key
Refac: Optimize keyed lock cleanup logic with time and size tracking
2025-07-12 05:24:36 +08:00
yangdx
5ee509e671 Fix linting 2025-07-12 05:17:44 +08:00
yangdx
964293f21b Optimize lock cleanup with time tracking and intervals
- Add cleanup time tracking variables
- Implement minimum cleanup intervals
- Track earliest cleanup times
- Handle time rollback cases
- Improve cleanup logging
2025-07-12 04:34:26 +08:00
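A minimum cleanup interval with a guard against the clock moving backwards might look like the sketch below. `CleanupScheduler` and the 30-second default are assumptions for illustration; the log does not give the actual interval or variable names.

```python
from typing import Optional


class CleanupScheduler:
    """Decide whether enough time has passed to run another cleanup."""

    def __init__(self, min_interval: float = 30.0) -> None:  # assumed default
        self.min_interval = min_interval
        self.last_cleanup: Optional[float] = None

    def should_run(self, now: float) -> bool:
        if self.last_cleanup is None:
            return True
        if now < self.last_cleanup:
            # Time rollback: the stored timestamp is unusable, so allow a run.
            return True
        return now - self.last_cleanup >= self.min_interval

    def mark_ran(self, now: float) -> None:
        self.last_cleanup = now
```

Callers would pass a timestamp such as `time.time()` and call `mark_ran` after each cleanup pass.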
yangdx
39965d7ded Move merging stage back under control of the max parallel insert semaphore 2025-07-12 03:32:08 +08:00
yangdx
7490a18481 Optimize lock cleanup parameters 2025-07-12 03:10:03 +08:00
yangdx
3d8e6924bc Show lock clean up message 2025-07-12 02:58:05 +08:00
yangdx
22c36f2fd2 Optimize log messages 2025-07-12 02:41:31 +08:00
yangdx
a64c767298 optimize: improve lock cleanup performance with threshold-based strategy
- Add CLEANUP_THRESHOLD constant (100) to control cleanup frequency
- Modify _release_shared_raw_mp_lock to only scan when cleanup list exceeds threshold
- Modify _release_async_lock to only scan when cleanup list exceeds threshold
2025-07-11 23:43:40 +08:00
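The threshold strategy avoids scanning the cleanup list on every release: the scan runs only once the list has grown past a fixed size. In the sketch below only `CLEANUP_THRESHOLD = 100` comes from the log; `scan_if_needed`, `max_idle`, and the dictionary layout are assumptions.

```python
from typing import Any, Dict

CLEANUP_THRESHOLD = 100  # value stated in the commit message


def scan_if_needed(
    pending: Dict[str, float],  # key -> time the lock became idle
    locks: Dict[str, Any],      # key -> lock object
    now: float,
    max_idle: float = 300.0,    # assumed expiry, not stated in the log
    threshold: int = CLEANUP_THRESHOLD,
) -> int:
    """Remove expired idle locks, but only when the pending list is large."""
    if len(pending) <= threshold:
        return 0  # cheap path: no scan at all
    expired = [key for key, idle_since in pending.items() if now - idle_since > max_idle]
    for key in expired:
        del pending[key]
        locks.pop(key, None)
    return len(expired)
```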
yangdx
ad99d9ba5a Improve code organization and comments 2025-07-11 22:13:02 +08:00
yangdx
c52c451cf7 Fix linting 2025-07-11 20:40:50 +08:00
yangdx
3afdd1b67c Fix initial count error for multi-process lock with key 2025-07-11 20:39:08 +08:00
Marvin Schmidt
42a1da0041 fix(build): pyproject.toml setup 2025-07-11 12:01:34 +02:00
yangdx
c47747da9e Merge branch 'main' into merge_lock_with_key 2025-07-11 16:37:10 +08:00
yangdx
ef4870fda5 Combined entity and edge processing tasks and optimize merging with semaphore 2025-07-11 16:34:54 +08:00