孟超
a20d68d865
Revise the context format of chunks from CSV to JSON to enhance compatibility with LLMs
2025-04-19 15:18:33 +08:00
drahnreb
9c6b5aefcb
fix linting
2025-04-18 16:24:43 +02:00
drahnreb
e71f466910
fix: take global_config from storage class
2025-04-18 16:24:43 +02:00
drahnreb
0f949dd5d7
fix truncation with global_config tokenizer
2025-04-18 16:24:43 +02:00
drahnreb
20ba1eb9c2
add: option to replace the default tiktoken Tokenizer with a custom one
2025-04-18 16:24:43 +02:00
drahnreb
0aa994163e
fix: correct parentheses. system_prompt was never formatted.
2025-04-17 23:44:14 +02:00
yangdx
a3ca134e97
Fix special chars problem for Postgres
2025-04-17 22:58:36 +08:00
yangdx
a185e48b87
fix: cancel pending tasks when any chunk processing fails
Modify extract_entities function to terminate all pending text chunk processing tasks when any single chunk processing fails.
2025-04-17 03:57:38 +08:00
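The behavior described in commit a185e48b87 (cancel all pending chunk tasks once any single one fails) can be illustrated with a minimal asyncio sketch. This is not the code from the commit: `process_all` and `process_chunk` are hypothetical names chosen for illustration, and the real `extract_entities` function may be structured differently.

```python
import asyncio


async def process_all(chunks, process_chunk):
    """Run process_chunk on every chunk; cancel the rest if any one fails.

    Hypothetical helper for illustration, not the function from the commit.
    """
    if not chunks:
        return []
    tasks = [asyncio.create_task(process_chunk(c)) for c in chunks]
    # Return as soon as any task raises, or when all tasks have finished.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)
    for task in done:
        exc = task.exception()
        if exc is not None:
            # Cancel everything still running and wait for the cancellations to settle.
            for p in pending:
                p.cancel()
            if pending:
                await asyncio.wait(pending)
            raise exc
    # No failure: every task is done, so results come back in input order.
    return [t.result() for t in tasks]
```

A caller would invoke it as `asyncio.run(process_all(chunks, worker))`, where `worker` is any coroutine function that takes one chunk.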
yangdx
d4c4a40c53
Fix .env MAX_GRAPH_NODES not working problem
2025-04-17 01:28:22 +08:00
yangdx
0afe35a9fd
Convert parallel queries to serial execution
2025-04-16 17:55:49 +08:00
yangdx
2c7e8a5526
Optimize query performance by batch query
2025-04-16 17:53:13 +08:00
yangdx
051e632ab3
Fix cache persistence bugs
2025-04-16 01:24:59 +08:00
yangdx
1de74c9228
Fix linting
2025-04-15 12:34:04 +08:00
yangdx
e9b04e5bd2
Merge branch 'graph-storage-batch-query-frederikhendrix' into graph-storage-batch-query
2025-04-12 22:20:41 +08:00
yangdx
7fd3053e61
Update log message
2025-04-12 21:10:36 +08:00
yangdx
2ac66c3531
Remove Chinese quotes in entity name
2025-04-12 20:45:41 +08:00
yangdx
0eed5eb718
feat: implement entity/relation name and description normalization
- Remove spaces between Chinese characters
- Remove spaces between Chinese and English/numbers
- Preserve spaces within English text and numbers
- Replace Chinese parentheses with English parentheses
- Replace Chinese dash with English dash
2025-04-12 19:26:02 +08:00
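The normalization rules listed in commit 0eed5eb718 can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: `normalize_name` is a hypothetical name, the character range covers only the basic CJK block, and the exact dash characters handled by the real implementation are a guess.

```python
import re

# Basic CJK Unified Ideographs range (an assumption; the real code may cover more).
_CJK = r"\u4e00-\u9fa5"


def normalize_name(text: str) -> str:
    """Hypothetical helper illustrating the rules from the commit message."""
    # Replace full-width (Chinese) parentheses with ASCII parentheses.
    text = text.replace("\uff08", "(").replace("\uff09", ")")
    # Replace Chinese dash variants (U+2014, U+FF0D; assumed set) with ASCII hyphen.
    text = text.replace("\u2014", "-").replace("\uff0d", "-")
    # Remove whitespace between two Chinese characters.
    text = re.sub(rf"(?<=[{_CJK}])\s+(?=[{_CJK}])", "", text)
    # Remove whitespace between a Chinese character and an ASCII letter, digit, or parenthesis.
    text = re.sub(rf"(?<=[{_CJK}])\s+(?=[A-Za-z0-9(])", "", text)
    text = re.sub(rf"(?<=[A-Za-z0-9)])\s+(?=[{_CJK}])", "", text)
    # Spaces inside purely English or numeric text are left untouched.
    return text.strip()
```

Lookarounds are used so that the neighboring characters are checked without being consumed, which lets consecutive single-character gaps all be removed in one pass.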
yangdx
96f439bb52
Optimize pipeline status message
2025-04-10 21:19:26 +08:00
yangdx
7d69449c67
Fix linting
2025-04-10 20:32:40 +08:00
yangdx
339bc99259
Only merge new entities/edges during gleaning
- Restrict gleaning to new entity names
- Only add edges with new keys
- Prevent similar descriptions of the same entity or edge
2025-04-10 20:31:52 +08:00
Daniel.y
0528c06209
Merge pull request #1334 from danielaskdd/main
Refactor entity and edge merging and add env FORCE_LLM_SUMMARY_ON_MERGE
2025-04-10 18:46:04 +08:00
yangdx
3007dff153
Add env FORCE_LLM_SUMMARY_ON_MERGE
2025-04-10 17:29:07 +08:00
yangdx
35431644ad
Refactor: Entity and edge merging in extract_entities
- Improves efficiency by merging identical entities and edges in a single operation
- Ensures proper handling of undirected graph edges
- Changes merge stage from chunk level to document level
2025-04-10 14:19:06 +08:00
zrguo
1c1afb4eaf
Merge pull request #1304 from FeHuynhVI/main
Fixes a bug where file_path was not present in the dictionary
2025-04-10 14:08:11 +10:00
yangdx
39b4da41a3
Optimize logging
2025-04-10 04:17:32 +08:00
yangdx
496f87a1e6
Fix linting
2025-04-10 03:58:04 +08:00
yangdx
8d858da4d0
Fix: LLM cache now works for node and edge merging
2025-04-10 03:57:36 +08:00
yangdx
5d286dd0fa
Add node and edge merging log to pipeline_status
2025-04-10 01:07:06 +08:00
IcySugar
224f63cd5f
Merge branch 'HKUDS:main' into main
2025-04-09 13:40:19 +08:00
yangdx
740b4174d2
Fix linting
2025-04-09 12:59:58 +08:00
yangdx
692415b2e1
Fix mix_kg_vector_query function return value error when only_need_context is enabled
2025-04-09 12:59:32 +08:00
IcySugar000
8aa3cd799a
Fix: handle null values so that exceptions are avoided
2025-04-09 11:32:05 +08:00
FeHuynhVI
881943b8ae
Update operate.py
2025-04-08 01:22:11 +07:00
frederikhendrix
90bff7c1d6
Edges and node_edge also implemented. Everything is now ready to be run and tested.
2025-04-07 19:13:59 +02:00
frederikhendrix
182aee2e14
get_node added and all to base.py and to neo4j_impl.py file
2025-04-07 19:09:31 +02:00
yangdx
b2284c8b9d
Fix linting
2025-04-06 17:45:32 +08:00
yangdx
b45c5f9304
Change get_by_id batch size from 25 to 5 to conserve db connection resources
2025-04-06 17:42:13 +08:00
Alex Z
e69a128832
Merge branch 'main' into main
2025-04-05 15:27:59 -07:00
yangdx
247be483eb
Merge branch 'main' into clear-doc
2025-04-04 05:45:06 +08:00
yangdx
399b2f14f6
Fix linting
2025-04-04 00:07:21 +08:00
yangdx
a809bc7945
Optimize parallel processing of chunk extraction
2025-04-04 00:06:42 +08:00
yangdx
6b240fa9b2
Serialize merge process to prevent race conditions
2025-04-03 21:33:46 +08:00
zrguo
75e8a10c21
Update get_keywords_from_query
2025-04-03 17:46:28 +08:00
Alex Z
d0d246bef8
Fix 'TOO MANY OPEN FILES' problem when using Redis vector DB:
Enhance RedisKVStorage: Implement connection pooling and error handling. Refactor async methods to use context managers for Redis operations, improving resource management and error logging. Batch processing added for key operations to optimize performance.
2025-04-02 21:06:49 -07:00
Mykola Chaban
ce1a59b1c0
Fix trailing whitespace in docstring
2025-04-02 21:52:06 +03:00
Mykola Chaban
8e66b2a974
Add a check: if keywords are already provided in the query param, skip the keyword generation process
2025-04-02 21:15:40 +03:00
yangdx
5d517d72f5
Fix file_path error in PostgreSQL storage
2025-04-02 14:30:13 +08:00
jofoks
f349618e37
Fix: unknown filepath errors
2025-03-31 14:50:13 -07:00
yangdx
c94f30be2d
Shorten log message to fit in pipelinestatus UI
2025-03-29 13:18:22 +08:00
yangdx
65574459f9
standardize .env loading behavior across modules
2025-03-29 03:48:38 +08:00