yangdx
331dcf0509
Remove query params from cache key generation for keyword extraction
2025-08-14 02:57:39 +08:00
yangdx
9a62101e9d
Add OpenAI frequency penalty sample env params
2025-08-14 02:57:23 +08:00
yangdx
3343833571
Remove query params from cache key generation for keyword extraction
2025-08-14 02:36:01 +08:00
yangdx
2a46667ac9
Add OpenAI frequency penalty sample env params
2025-08-14 01:50:27 +08:00
yangdx
bac09118d5
Simplify embedding func extraction
2025-08-14 01:09:18 +08:00
yangdx
ac3b5605a1
Refactor logging for relation chunk discovery with dedup info
2025-08-14 00:41:58 +08:00
yangdx
edac10906c
fix: Add total_relation_chunks statistics and improve logging in _find_related_text_unit_from_relations
2025-08-13 23:45:31 +08:00
yangdx
5a40ff654e
Change KG chunk selection default to VECTOR
...
- Set KG_CHUNK_PICK_METHOD default to VECTOR
- Update env.example with new config option
2025-08-13 23:10:42 +08:00
yangdx
947e826e61
Bump api version to 0200
2025-08-13 18:29:07 +08:00
yangdx
f1dafa0d01
feat: KG related chunks selection by vector similarity
...
- Add env switch to toggle weighted polling vs vector-similarity strategy
- Implement similarity-based sorting with fallback to weighted
- Introduce batch vector read API for vector storage
- Implement vector store and retrieve function for Nanovector DB
- Preserve default behavior (weighted polling selection method)
2025-08-13 18:16:42 +08:00
SJ
99643f01de
Enhancement: support AWS Bedrock as an LLM binding #1733
2025-08-13 02:08:13 -05:00
Daniel.y
5b0e26d9da
Merge pull request #1941 from HKUDS/add-final-namespace
...
Fix: Resolve workspace isolation issues across multiple storage implementations
2025-08-12 20:17:53 +08:00
Daniel.y
203e420b51
Merge pull request #1931 from danielaskdd/fix-first-stage-tasks-missing
...
Fix: Initialize first_stage_tasks and entity_relation_task to prevent empty-task cancel errors
2025-08-12 19:19:04 +08:00
yangdx
578bdaa410
Pin pymilvus version to 2.5.2 to avoid Protobuf version warning
2025-08-12 18:22:00 +08:00
yangdx
5d1bc8b49d
Relocate client creation to the initialize method to prevent race conditions in multi-process mode.
2025-08-12 18:20:56 +08:00
yangdx
74783d7781
Remove redundant debug logging for Qdrant operations
2025-08-12 17:29:05 +08:00
zrguo
f1c7233763
Avoid UTF-8 BOM
2025-08-12 17:06:54 +08:00
yangdx
41f8ef05b9
Restore thread safety to MongoDB client manager
...
- Protected client creation with lock
- Protected client release with lock
2025-08-12 16:42:53 +08:00
yangdx
0b2c3d06c7
- Remove redundant collection listing check
2025-08-12 15:24:06 +08:00
yangdx
fc8ca1a706
Fix: add multi-process lock for initialize and drop methods for all storage
2025-08-12 04:25:09 +08:00
yangdx
ca00b9c8ee
Fix: Resolve workspace isolation problem for PostgreSQL with multiple LightRAG instances
2025-08-12 01:27:05 +08:00
yangdx
d9c1f935f5
Fix: Resolve workspace isolation issues in in-memory database with multiple LightRAG instances
2025-08-12 01:26:09 +08:00
yangdx
095e0cbfa2
Refac: Add workspace information to all logger output for all storage types
2025-08-12 01:19:09 +08:00
yangdx
44204abef7
Fix linting
2025-08-10 10:59:32 +08:00
yangdx
eb2320e556
Fix: Initialize first_stage_tasks and entity_relation_task to prevent empty-task cancel errors
...
- Initialize first_stage_tasks = [] and entity_relation_task = None at coroutine start
- Ensure cancel block safely handles no-op when tasks lists are empty
2025-08-10 10:45:41 +08:00
Daniel.y
f1c6a4ed94
Merge pull request #1928 from danielaskdd/main
...
Fix: Update OpenAI embedding handling for both list and base64 embeddings
2025-08-09 08:44:21 +08:00
yangdx
ffb642a5ce
Fix linting
2025-08-09 08:41:41 +08:00
yangdx
ecd7777e61
Update OpenAI embedding handling for both list and base64 embeddings
...
- Fix OpenAI embedding array parsing
- Improve embedding data type safety
2025-08-09 08:40:33 +08:00
yangdx
cf064579ce
Remove deprecated keyword extraction query methods
...
- Delete query_with_keywords function
- Remove kg_query_with_keywords helper
- Drop separate keyword extraction methods
2025-08-08 14:59:39 +08:00
yangdx
f5ac6a9f4b
Add default Ollama embedding context length
...
- Set default context length to 8192
- Override the default context length for LLM in binding_options.py
2025-08-08 13:51:25 +08:00
yangdx
c2eefec707
Merge branch 'postgres-vector-index'
2025-08-08 03:01:34 +08:00
yangdx
16c9a81f4c
feat: support config.ini for PostgreSQL vector index settings
...
- Add support for reading vector_index_type, hnsw_m, hnsw_ef, and ivfflat_lists from config.ini
- Maintain backward compatibility with environment variables
- Update config.ini.example with new PostgreSQL vector index options
- Follow existing configuration priority: env vars > config.ini > defaults
2025-08-08 02:55:49 +08:00
yangdx
dec4148075
Merge branch 'main' into Matt23-star/main
2025-08-08 02:24:34 +08:00
yangdx
f38e10559e
Update PostgreSQL vector index configuration
...
- Remove FLAT index support
- Standardize on HNSW as default
- Add dimension validation
- Improve error logging
- Clean up index creation code
2025-08-08 02:21:06 +08:00
Daniel.y
2f289f6e25
Merge pull request #1924 from danielaskdd/neo4j-connection-lifetime
...
Refact: Enhanced Neo4j Connection Lifecycle Management
2025-08-08 01:16:42 +08:00
yangdx
f4ef254de2
fix(neo4j): enhance connection lifecycle management to prevent timeout errors
...
- Add max_connection_lifetime, liveness_check_timeout, keep_alive parameters
- Extend retry mechanisms for connection reset scenarios
- Update config examples with new Neo4j connection options
- Resolves ClientTimeoutException during data insertion operations
2025-08-08 01:07:45 +08:00
Daniel.y
c8a44f5657
Merge pull request #1923 from danielaskdd/fix-context-format
...
Fix: Unify document chunks context format in only_need_context query
2025-08-08 00:05:26 +08:00
yangdx
eded6d1187
Unify document chunks context format in only_need_context query
...
- Update Document Chunks label to include (DC) abbreviation
2025-08-08 00:02:53 +08:00
Matt23-star
727ca43d3c
feat: add vector index creation functionality for PostgreSQL
2025-08-07 23:07:18 +08:00
yangdx
7780776af6
Update env.example
2025-08-06 18:50:58 +08:00
Daniel.y
a6ef29cef6
Merge pull request #1915 from danielaskdd/optimize-llm-cache
...
Refact: Optimized LLM Cache Hash Key Generation by Including All Query Parameters
2025-08-06 01:04:02 +08:00
yangdx
2dab4e321d
Bump api version to 0199
2025-08-06 01:03:35 +08:00
yangdx
a04c11a598
Remove deprecated storage
2025-08-06 00:02:50 +08:00
yangdx
c22315ea6d
refactor: remove selective LLM cache clearing functionality
...
- Remove optional 'modes' parameter from aclear_cache() and clear_cache() methods
- Replace deprecated drop_cache_by_modes() with drop() method for complete cache clearing
- Update API endpoint to ignore mode-specific parameters and clear all cache
- Simplify frontend clearCache() function to send empty request body
This change ensures all LLM cache is cleared together.
2025-08-05 23:51:51 +08:00
yangdx
cc1f7118e7
Remove deprecated cache_by_modes functionality from all storage
2025-08-05 23:20:26 +08:00
yangdx
8294d6d1b7
Remove deprecated mode field from LLM cache schema
...
- Drop mode column from LLM cache table
- Update primary key to exclude mode
- Remove mode from all SQL queries
- Deprecate mode-related methods
- Update schema migration logic
2025-08-05 23:18:54 +08:00
yangdx
0b5c708660
Update storage implementation documentation
...
- Add detailed storage type descriptions
- Remove Chroma from vector storage options
- Include recommended PostgreSQL version
- Add Memgraph to graph storage options
- Update performance comparison notes
2025-08-05 18:03:51 +08:00
yangdx
0463963520
fix: include all query parameters in LLM cache hash key generation
...
- Add missing query parameters (top_k, enable_rerank, max_tokens, etc.) to cache key generation in kg_query, naive_query, and extract_keywords_only functions
- Add queryparam field to CacheData structure and PostgreSQL storage for debugging
- Update PostgreSQL schema with automatic migration for queryparam JSONB column
- Prevent incorrect cache hits between queries with different parameters
Fixes issue where different query parameters incorrectly shared the same cached results.
2025-08-05 18:03:10 +08:00
yangdx
cb75e6631e
Remove quantized embedding info from LLM cache
...
- Delete quantize_embedding function
- Delete dequantize_embedding function
- Remove embedding fields from CacheData
- Update save_to_cache to exclude embedding data
- Clean up unused quantization-related code
2025-08-05 17:58:34 +08:00
Daniel.y
c7d17f13c1
Merge pull request #1914 from danielaskdd/feat-tiktoken-cache
...
feat: add tiktoken cache directory support for offline deployment
2025-08-05 14:25:10 +08:00