yangdx
3f8a9abe7e
Refactor extraction result processing to reduce code duplication
...
• Extract shared processing logic
• Add delimiter pattern fixes
• Improve bracket standardization
2025-09-02 01:22:29 +08:00
yangdx
3cdc98f366
Improve extraction parsing with better bracket handling and delimiter fixes
...
• Standardize Chinese/English brackets
• Fix incomplete tuple delimiters
• Remove duplicate delimiter fix code
• Support mixed bracket formats
• Enhance record parsing robustness
2025-09-02 00:26:04 +08:00
yangdx
8bbf307aeb
Fix regex to match multiline content in extraction parsing
...
• Remove non-greedy quantifier
• Add DOTALL flag for multiline matching
• Apply to both parsing functions
• Enable cross-line content extraction
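The effect of the DOTALL flag can be sketched as follows. This is a minimal illustration, not the project's parsing code: the pattern and the sample record are assumptions chosen only to show why a greedy match needs DOTALL to cross line breaks.

```python
import re

# Hypothetical record whose last field spans two lines.
record = '("entity"<|>"Alice"<|>"Line one\nline two")'

# Without DOTALL, "." stops at the newline, so the closing ")" is never reached.
single_line = re.search(r"\((.*)\)", record)

# With DOTALL, "." also matches "\n", so the whole record body is captured.
multi_line = re.search(r"\((.*)\)", record, re.DOTALL)
body = multi_line.group(1)
```

Here `single_line` is `None`, while `body` holds the full content between the outer parentheses, including the line break.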
2025-09-01 10:35:06 +08:00
yangdx
7baeb186c6
Fix regex to use non-greedy matching for parentheses extraction
2025-09-01 10:10:45 +08:00
yangdx
692357fbf3
Add conflict resolution instruction to entity summarization prompt
...
- Add conflict handling step
- Handle entities with same name
- Separate then consolidate summaries
2025-09-01 08:51:19 +08:00
yangdx
e95622ca7b
fix(utils): enhance remove_think_tags to handle orphaned </think> closing tags
...
The function now properly handles cases where text contains </think> closing tags
without corresponding <think> opening tags, which can occur due to content
truncation or processing errors.
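One way such handling could look is sketched below. The function name `strip_think_tags_sketch` is hypothetical, and the rule of discarding everything before an orphaned closing tag is an assumption, not a description of the actual `remove_think_tags` code.

```python
import re

def strip_think_tags_sketch(text: str) -> str:
    """Hypothetical sketch; not the project's remove_think_tags."""
    # Remove complete <think>...</think> blocks, including multiline ones.
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    # Assumption: anything before an orphaned </think> is leftover reasoning
    # from a truncated block, so drop it together with the tag.
    text = re.sub(r"^.*?</think>", "", text, flags=re.DOTALL)
    return text.strip()

paired = strip_think_tags_sketch("<think>plan</think>answer")
orphaned = strip_think_tags_sketch("leftover reasoning</think>answer")
untouched = strip_think_tags_sketch("plain answer")
```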
2025-09-01 07:17:30 +08:00
Daniel.y
5e73896c40
Merge pull request #2035 from danielaskdd/fix-llm-output
...
Fix LLM output instability for <|> tuple delimiter
v1.4.8rc1
2025-09-01 01:25:24 +08:00
yangdx
30be70991d
Bump API version to 0211
2025-09-01 01:23:22 +08:00
yangdx
5fd7682f16
Fix LLM output instability for <|> tuple delimiter
...
- Replace <||> with <|>
- Replace < | > with <|>
- Apply fix in both functions
- Handle delimiter variations
- Improve parsing reliability
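The two replacements listed above can be sketched as a small normalization step. The helper name `normalize_tuple_delimiter` is hypothetical; only the two delimiter variants come from this commit message.

```python
TUPLE_DELIMITER = "<|>"

def normalize_tuple_delimiter(record: str) -> str:
    """Hypothetical helper: map known malformed variants to the expected delimiter."""
    # "<||>" and "< | >" are the two variants named in the commit message.
    for variant in ("<||>", "< | >"):
        record = record.replace(variant, TUPLE_DELIMITER)
    return record

fixed = normalize_tuple_delimiter('"entity"<||>"Alice"< | >"Person"')
fields = fixed.split(TUPLE_DELIMITER)
```

Normalizing before splitting means the field count stays predictable even when the model emits a slightly different delimiter.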
2025-09-01 01:22:27 +08:00
Daniel.y
cdc4570cfe
Merge pull request #2034 from danielaskdd/fix-entity-type-env
...
Fix ENTITY_TYPES Environment Variable Handling
2025-09-01 00:43:46 +08:00
yangdx
ec059d1b5d
Fix typo and clarify delimiter formatting in relationship extraction prompt
...
- Fix "feild" → "field" typo
- Clarify delimiter spacing rules
2025-09-01 00:42:59 +08:00
yangdx
c8c59c38b0
Fix entity types configuration to support JSON list parsing
...
- Add JSON parsing for list env vars
- Update entity types example format
- Add list type support to get_env_value
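A minimal sketch of JSON list parsing for an environment variable is shown below. The helper `parse_list_env` is hypothetical and is not the signature of `get_env_value`; falling back to the default on invalid input is likewise an assumption.

```python
import json
import os

def parse_list_env(name: str, default: list) -> list:
    """Hypothetical helper: read an env var holding a JSON list."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        # Assumption: fall back to the default rather than failing on bad input.
        return default
    return value if isinstance(value, list) else default

os.environ["ENTITY_TYPES"] = '["Person", "Location"]'
parsed = parse_list_env("ENTITY_TYPES", [])

os.environ["ENTITY_TYPES"] = "Person, Location"  # not valid JSON
fallback = parse_list_env("ENTITY_TYPES", ["Other"])
```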
2025-09-01 00:14:57 +08:00
yangdx
1a015a7015
Add queue_name parameter to priority_limit_async_func_call for better logging
...
• Add queue_name parameter to decorator
• Update all log messages with queue names
• Pass specific names for LLM and embedding
2025-08-31 23:47:22 +08:00
yangdx
57fe1403c3
Update default entity types in env.example configuration
2025-08-31 22:33:34 +08:00
yangdx
4e751e0653
refac: Enhance extraction with improved prompts and parser
...
- **Prompts**: Restructured prompts with clearer steps and quality guidelines. Simplified the relationship tuple by removing `relationship_strength`
- **Model**: Updated default entity types to be more comprehensive and consistently capitalized (e.g., `Location`, `Product`)
2025-08-31 22:24:11 +08:00
yangdx
75de40da41
Fix typo in relationship extraction log messages
2025-08-31 17:45:16 +08:00
yangdx
97c9600085
Improve extraction error handling and field validation
...
• Add field count validation warnings
• Fix relationship field count (5→6)
• Change error logs to warnings
2025-08-31 17:33:42 +08:00
yangdx
b747417961
feat: enhance sanitization and normalization of extracted text
...
- Improve removal of redundant quotes in entity and relation names, types and keywords
- Add HTML tag cleaning and Chinese symbol conversion
- Filter out short numeric content and malformed text
- Enhance entity type validation with character filtering
2025-08-31 13:17:20 +08:00
yangdx
d4bbc5dea9
refactor: Merge multi-step text sanitization into single function
2025-08-31 10:36:56 +08:00
Daniel.y
68f18eacf8
Merge pull request #2030 from danielaskdd/fix-leading-white-space
...
Fix: Preserve Leading Spaces in Graph Label Selection
2025-08-31 03:02:23 +08:00
yangdx
69890ff2e1
Bump core version to 1.4.8 and API version to 0210
2025-08-31 03:01:33 +08:00
yangdx
8bab240dbc
Update webui assets
2025-08-31 03:00:16 +08:00
yangdx
25b5d176cd
Fix label selection with leading/trailing whitespace
...
• Fix AsyncSelect value trimming issue
• Preserve whitespace in label display
• Use safe keys for command items
• Add GraphControl dependency fix
• Add debug logging for graph labels
2025-08-31 02:54:39 +08:00
Daniel.y
3c0ce9e38d
Merge pull request #2029 from danielaskdd/optimize-rag-object-creation
...
refac: Eliminate Conditional Imports and Simplify Initialization
2025-08-31 00:30:07 +08:00
yangdx
0cff6e6b13
Merge branch 'fix-ollama-embedding-openai-key'
2025-08-31 00:19:04 +08:00
yangdx
ae09b5c656
refactor: eliminate conditional imports and simplify LightRAG initialization
...
- Remove conditional import block, replace with lazy loading factory functions
- Add create_llm_model_func() and create_llm_model_kwargs() for clean configuration
- Update wrapper functions with lazy imports for better performance
- Unify LightRAG initialization, eliminating duplicate conditional branches
- Reduce code complexity by 33% while maintaining full backward compatibility
2025-08-31 00:18:29 +08:00
yangdx
332202c111
Fix lambda closure bug in embedding function configuration
...
• Replace lambda with proper async function
• Capture config values at creation time
• Avoid closure variable reference issues
• Add factory function for embeddings
• Remove test file for closure bug
2025-08-30 23:43:34 +08:00
avchauzov
414d47d12a
fix(server): Resolve lambda closure bug in embedding_func
...
Fixes #2023. Resolves an issue where the embedding function would incorrectly fall back to the OpenAI provider if the server's configuration arguments were mutated after initialization. The cause was a lambda that captured a reference to the mutable 'args' object instead of capturing the configuration values at creation time.
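The closure behaviour described here can be reproduced in a few lines. This is an illustrative sketch only: the `binding` attribute and both factory names are hypothetical and stand in for the server's real configuration object.

```python
from types import SimpleNamespace

def make_late_bound(args):
    # Buggy pattern: the lambda looks up args.binding each time it is called.
    return lambda: args.binding

def make_early_bound(args):
    # Fixed pattern: copy the value once, when the function is created.
    binding = args.binding

    def get_binding():
        return binding

    return get_binding

args = SimpleNamespace(binding="ollama")  # hypothetical config object
late = make_late_bound(args)
early = make_early_bound(args)

args.binding = "openai"  # configuration mutated after initialization

late_result = late()
early_result = early()
```

After the mutation, `late_result` follows the changed configuration while `early_result` keeps the value that was present at creation time.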
2025-08-30 14:43:33 +02:00
yangdx
d9aa021682
Update env.example
2025-08-30 11:02:53 +08:00
Daniel.y
0c41be6f8f
Merge pull request #2026 from pedrofs/pedro/fix-env.example
...
fix: adjust the EMBEDDING_BINDING_HOST for openai in the env.example
v1.4.7
2025-08-29 22:52:29 +08:00
Pedro Fernandes Steimbruch
8430e1a051
fix: adjust the EMBEDDING_BINDING_HOST for openai in the env.example
2025-08-29 09:48:42 -03:00
yangdx
43f32e8d97
Bump API version to 0209
2025-08-29 19:42:06 +08:00
Daniel.y
163ec26e10
Merge pull request #2025 from danielaskdd/remove-ids-filter
...
refac: Remove deprecated doc-id based filtering from vector storage queries
2025-08-29 19:39:42 +08:00
yangdx
f3989548b9
Fix MongoDB vector query embedding format compatibility
...
* Convert numpy arrays to lists
* Ensure MongoDB compatibility
2025-08-29 18:51:53 +08:00
yangdx
03d0fa3014
perf: add optional query_embedding parameter to avoid redundant embedding calls
2025-08-29 18:15:45 +08:00
yangdx
a923d378dd
Remove deprecated ID-based filtering from vector storage queries
...
- Remove ids param from QueryParam
- Simplify BaseVectorStorage.query signature
- Update all vector storage implementations
- Streamline PostgreSQL query templates
- Remove ID filtering from operate.py calls
2025-08-29 17:06:48 +08:00
Daniel.y
20b800d694
Merge pull request #2024 from danielaskdd/llm-error-handling
...
refac: Enhanced Timeout Handling for LLM Priority Queue
2025-08-29 15:26:30 +08:00
yangdx
d39afcb831
Add temperature guidance for Qwen3 models in env example
2025-08-29 15:13:52 +08:00
yangdx
d7e0701b63
Improve logging setup and add error prefixes for LLM functions
...
- Move logger init to top of file
- Add console handler by default
- Prefix LLM errors with "[LLM func]"
- Update timeout log messages
- Comment out pypinyin success log
2025-08-29 14:19:13 +08:00
yangdx
925e631a9a
refac: Add robust timeout handling for LLM requests
2025-08-29 13:50:35 +08:00
yangdx
ac2db35160
Update env.example
2025-08-29 10:18:12 +08:00
Daniel.y
e51fa2439d
Merge pull request #2021 from SandmeyerX/docs/config-fix-env-comment-typos
...
docs(config): fix typo in .env comments
2025-08-28 23:05:19 +08:00
Sandmeyer
1cd27dc048
docs(config): fix typo in .env comments
2025-08-28 20:23:51 +08:00
Daniel.y
57ba2cabcb
Merge pull request #2017 from danielaskdd/improve-text-sanitize
...
Fix UTF-8 Encoding Issues Causing Document Processing Failures
2025-08-28 00:21:44 +08:00
yangdx
99e28e815b
fix: prevent document processing failures from UTF-8 surrogate characters
...
- Change sanitize_text_for_encoding to fail-fast instead of returning error placeholders
- Add strict UTF-8 cleaning pipeline to entity/relationship extraction
- Skip problematic entities/relationships instead of corrupting data
Fixes document processing crashes when encountering surrogate characters (U+D800-U+DFFF)
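The fail-fast and skip behaviour can be sketched as follows. Both function names are hypothetical, and this is not the body of `sanitize_text_for_encoding`; it only relies on the fact that lone surrogates cannot be encoded as strict UTF-8 in Python.

```python
def check_utf8(text: str) -> str:
    """Hypothetical fail-fast check: raise instead of returning a placeholder."""
    try:
        text.encode("utf-8")  # lone surrogates (U+D800-U+DFFF) raise here
    except UnicodeEncodeError as exc:
        raise ValueError(f"text is not valid UTF-8: {exc.reason}") from exc
    return text

def keep_encodable(names: list[str]) -> list[str]:
    """Hypothetical caller: skip problematic items rather than storing bad data."""
    kept = []
    for name in names:
        try:
            kept.append(check_utf8(name))
        except ValueError:
            continue  # skip this entity or relationship
    return kept

kept_names = keep_encodable(["Alice", "bad\ud800name", "Bob"])
```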
2025-08-27 23:52:39 +08:00
yangdx
4dfbe5e2db
Rename workflow and remove latest tag from Docker build
...
• Rename docker-build-main to manual
• Remove latest tag from metadata
2025-08-27 15:14:23 +08:00
yangdx
6a2a592224
Fix linting
2025-08-27 12:51:50 +08:00
yangdx
8a0d06e557
Restore default entity types
2025-08-27 12:51:18 +08:00
yangdx
28e07c89f9
Fix linting
2025-08-27 12:35:51 +08:00
yangdx
2ccc39de9a
Fix language fallback in summarize error
2025-08-27 12:34:27 +08:00