Daniel.y
a09f6eb170
Merge pull request #1423 from tackhwa/main
...
Friendlier implementation of entity extraction and relationship weight extraction for low-capability LLMs
2025-04-22 19:11:04 +08:00
tackhwa
2e186ba488
remove regex
2025-04-22 15:22:37 +08:00
yangdx
1eef9b7205
Set max parallel chunks processing according to MAX_SYNC of LLM
2025-04-22 15:03:46 +08:00
yangdx
21c0bb7abf
Merge branch 'context_format_csv_to_json'
2025-04-22 12:25:50 +08:00
yangdx
6a727103d6
Simplified logger messages
2025-04-22 12:19:40 +08:00
yangdx
9f958db328
Improve logger messages
2025-04-22 12:10:39 +08:00
yangdx
ff65cba544
Add null check for edge data
2025-04-21 18:32:33 +08:00
tackhwa
f3c57b606e
Friendlier implementation of entity extraction and relationship weight extraction for low-capability LLMs
2025-04-21 16:52:13 +08:00
yangdx
1a7b225e90
Fix stream response error for naive query mode
2025-04-21 00:06:15 +08:00
mengchao
f2f3a2721d
Refactor context handling to convert data from CSV to JSON format for improved compatibility with LLM, replacing the list_of_list_to_csv function with list_of_list_to_json
2025-04-20 19:24:05 +08:00
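The commit above names two helpers, list_of_list_to_csv and list_of_list_to_json, but their code is not shown in this log. As a rough illustration of what a CSV-to-JSON context conversion could look like, here is a minimal sketch. It assumes the input is a list of rows whose first row holds the column names; the function name list_of_list_to_json_sketch and that input shape are assumptions for illustration, not the project's actual implementation.

```python
import json


def list_of_list_to_json_sketch(rows):
    """Hypothetical helper: turn [header, row1, row2, ...] into a JSON array of objects."""
    if not rows:
        return "[]"
    header, *data = rows
    # Pair each cell with its column name so the LLM sees explicit keys
    # instead of position-dependent CSV columns.
    records = [dict(zip(header, row)) for row in data]
    # ensure_ascii=False keeps non-ASCII text (e.g. Chinese) readable.
    return json.dumps(records, ensure_ascii=False)


if __name__ == "__main__":
    table = [["id", "entity"], [1, "Alice"], [2, "Bob"]]
    print(list_of_list_to_json_sketch(table))
```

Keyed records cost more tokens than CSV but remove ambiguity about which column a value belongs to, which is one plausible reading of "improved compatibility with LLM".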
yangdx
4ae5246a7e
Remove summary length check for entity relations
...
- Summary now determined by num_fragment
2025-04-20 12:36:32 +08:00
孟超
a20d68d865
Revise the context format of chunks from CSV to JSON to enhance compatibility with LLM
2025-04-19 15:18:33 +08:00
drahnreb
9c6b5aefcb
fix linting
2025-04-18 16:24:43 +02:00
drahnreb
e71f466910
fix: take global_config from storage class
2025-04-18 16:24:43 +02:00
drahnreb
0f949dd5d7
fix truncation with global_config tokenizer
2025-04-18 16:24:43 +02:00
drahnreb
20ba1eb9c2
add: option to replace the default tiktoken Tokenizer with a custom one
2025-04-18 16:24:43 +02:00
drahnreb
0aa994163e
fix: correct parentheses. system_prompt was never formatted.
2025-04-17 23:44:14 +02:00
yangdx
a3ca134e97
Fix special chars problem for Postgres
2025-04-17 22:58:36 +08:00
yangdx
a185e48b87
fix: cancel pending tasks when any chunk processing fails
...
Modify extract_entities function to terminate all pending text chunk processing tasks when any single chunk processing fails.
2025-04-17 03:57:38 +08:00
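The behavior described in the commit above (stop all pending chunk tasks as soon as one fails) can be sketched with asyncio. This is a minimal illustration only: process_all and process_chunk are hypothetical names, and the real extract_entities function may be structured differently.

```python
import asyncio


async def process_all(chunks, process_chunk):
    """Run process_chunk on every chunk; cancel the rest if any one fails."""
    tasks = [asyncio.create_task(process_chunk(c)) for c in chunks]
    if not tasks:
        return []
    # Wake up as soon as any task raises, or when all tasks have finished.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)
    errors = [t.exception() for t in done if t.exception() is not None]
    if errors:
        # Cancel every task that has not finished yet.
        for t in pending:
            t.cancel()
        if pending:
            # Let the cancellations settle before propagating the error.
            await asyncio.wait(pending)
        raise errors[0]
    # No failure: results come back in the original chunk order.
    return [t.result() for t in tasks]
```

Using asyncio.wait with FIRST_EXCEPTION, rather than asyncio.gather, gives direct access to the still-pending tasks so they can be cancelled explicitly.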
yangdx
d4c4a40c53
Fix .env MAX_GRAPH_NODES not working problem
2025-04-17 01:28:22 +08:00
yangdx
0afe35a9fd
Convert parallel queries to serial execution
2025-04-16 17:55:49 +08:00
yangdx
2c7e8a5526
Optimize query performance by batching queries
2025-04-16 17:53:13 +08:00
yangdx
051e632ab3
Fix cache persistence bugs
2025-04-16 01:24:59 +08:00
yangdx
1de74c9228
Fix linting
2025-04-15 12:34:04 +08:00
yangdx
e9b04e5bd2
Merge branch 'graph-storage-batch-query-frederikhendrix' into graph-storage-batch-query
2025-04-12 22:20:41 +08:00
yangdx
7fd3053e61
Update log message
2025-04-12 21:10:36 +08:00
yangdx
2ac66c3531
Remove Chinese quotes in entity name
2025-04-12 20:45:41 +08:00
yangdx
0eed5eb718
feat: implement entity/relation name and description normalization
...
- Remove spaces between Chinese characters
- Remove spaces between Chinese and English/numbers
- Preserve spaces within English text and numbers
- Replace Chinese parentheses with English parentheses
- Replace Chinese dash with English dash
2025-04-12 19:26:02 +08:00
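The normalization rules listed in the commit above can be sketched with a few regular expressions. This is an illustrative sketch, not the project's code: the function name normalize_name is hypothetical, "Chinese characters" is approximated by the CJK Unified Ideographs range U+4E00 to U+9FFF, and only the full-width parentheses and two common dash code points are handled.

```python
import re

# Assumption: "Chinese characters" means the basic CJK Unified Ideographs block.
_CJK = r"\u4e00-\u9fff"


def normalize_name(text: str) -> str:
    """Hypothetical normalizer following the rules listed in the commit message."""
    # Replace Chinese (full-width) parentheses with English parentheses.
    text = text.replace("\uff08", "(").replace("\uff09", ")")
    # Replace Chinese dashes (U+2014, U+FF0D) with an English hyphen.
    text = text.replace("\u2014", "-").replace("\uff0d", "-")
    # Remove spaces between two Chinese characters.
    text = re.sub(rf"(?<=[{_CJK}])\s+(?=[{_CJK}])", "", text)
    # Remove spaces between Chinese and English letters or digits (both directions).
    text = re.sub(rf"(?<=[{_CJK}])\s+(?=[A-Za-z0-9])", "", text)
    text = re.sub(rf"(?<=[A-Za-z0-9])\s+(?=[{_CJK}])", "", text)
    # Spaces inside English text and numbers are left untouched.
    return text
```

The lookbehind and lookahead assertions remove only the whitespace itself, so the characters on either side are never consumed and consecutive gaps are handled in a single pass.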
yangdx
96f439bb52
Optimize pipeline status message
2025-04-10 21:19:26 +08:00
yangdx
7d69449c67
Fix linting
2025-04-10 20:32:40 +08:00
yangdx
339bc99259
Only merge new entities/edges during gleaning
...
- Restrict gleaning to new entity names
- Only add edges with new keys
- Prevent similar descriptions of the same entity or edge
2025-04-10 20:31:52 +08:00
Daniel.y
0528c06209
Merge pull request #1334 from danielaskdd/main
...
Refactoring entity and edge merging and add env FORCE_LLM_SUMMARY_ON_MERGE
2025-04-10 18:46:04 +08:00
yangdx
3007dff153
Add env FORCE_LLM_SUMMARY_ON_MERGE
2025-04-10 17:29:07 +08:00
yangdx
35431644ad
Refactor: Entity and edge merging in extract_entities
...
- Improves efficiency by merging identical entities and edges in a single operation
- Ensures proper handling of undirected graph edges
- Changes merge stage from chunk level to document level
2025-04-10 14:19:06 +08:00
zrguo
1c1afb4eaf
Merge pull request #1304 from FeHuynhVI/main
...
Fixes a bug where file_path was not present in the dictionary
2025-04-10 14:08:11 +10:00
yangdx
39b4da41a3
Optimize logging
2025-04-10 04:17:32 +08:00
yangdx
496f87a1e6
Fix linting
2025-04-10 03:58:04 +08:00
yangdx
8d858da4d0
Fix: LLM cache now works for node and edge merging
2025-04-10 03:57:36 +08:00
yangdx
5d286dd0fa
Add node and edge merging log to pipeline_status
2025-04-10 01:07:06 +08:00
IcySugar
224f63cd5f
Merge branch 'HKUDS:main' into main
2025-04-09 13:40:19 +08:00
yangdx
740b4174d2
Fix linting
2025-04-09 12:59:58 +08:00
yangdx
692415b2e1
Fix mix_kg_vector_query function return value error when only_need_context is enabled
2025-04-09 12:59:32 +08:00
IcySugar000
8aa3cd799a
Fix: handle null values so that exceptions are avoided
2025-04-09 11:32:05 +08:00
FeHuynhVI
881943b8ae
Update operate.py
2025-04-08 01:22:11 +07:00
frederikhendrix
90bff7c1d6
Edges and node_edge also implemented. Everything is now ready to be run and tested.
2025-04-07 19:13:59 +02:00
frederikhendrix
182aee2e14
get_node added and all to base.py and to neo4j_impl.py file
2025-04-07 19:09:31 +02:00
yangdx
b2284c8b9d
Fix linting
2025-04-06 17:45:32 +08:00
yangdx
b45c5f9304
Change get_by_id batch size from 25 to 5 to conserve DB connection resources
2025-04-06 17:42:13 +08:00
Alex Z
e69a128832
Merge branch 'main' into main
2025-04-05 15:27:59 -07:00
yangdx
247be483eb
Merge branch 'main' into clear-doc
2025-04-04 05:45:06 +08:00