171 Commits

Author SHA1 Message Date
Like0x
ae4569284d
Update README.md add DeepWiki Badge (#532)
* Update README.md add DeepWiki Badge

Add a badge to this wiki in the repo's README file to auto refresh the wiki weekly with the latest code.

* Update README.md
2025-06-02 09:35:16 +08:00
thundax
164f518691
feat(kag): ollama vectorize model (#562)
* update DSL query string

* fix(tools): update ner construct params

* feat(kag): ollama vectorize model

* feat(kag): formatter

* feat(common): remove LLM default setting of max_tokens

* feat(common): remove LLM default setting of max_tokens

* fix(tools): wrap entity type with '`' in generate_label()
2025-05-31 16:53:01 +08:00
royzhao
881d6e5d0c
fix(solver): add memory graph tensor cache and id index (#555)
* add async return

* add tensor cache and id index

* fix
2025-05-23 23:53:38 +08:00
royzhao
f9e5973625
fix(solver): bugfix for multi hop over graph (#554)
* add async return

* bugfix for multi hop in knowledge graph

* fix commit
2025-05-23 16:01:08 +08:00
田常@蚂蚁
21b57336a0
feat(example):Update kag_config.yaml for riskmining (#545)
2025-05-20 16:18:44 +08:00
田常@蚂蚁
8324a723c2
fix(common):Update default max_tokens to 8192 (#534)
Update default max_tokens to 8192
2025-05-13 09:58:33 +08:00
Xinhong Zhang
54c4d9845a
fix(common): ollama and openai client for qwen3 20250508 (#531)
* Initial commit

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Create CITATION.cff

* Update README.md

* Remove sensitive information

* refine vectorizer code and config

* fix vectorizer package name

* fix old import and requirement version bug

* delete kagdemo

* remove batch_vectorizer.py under common

* fix struct chain bug

* fix path bug

* fix path bug

* fix comment

* add README_cn.md

* fix spaces

* handle empty node_path

* handle empty node_batch

* add english README.md

* fix url

* fix typo

* fix bug in solver

* use kwargs

* use kwargs to init config

* add llm input

* remove __main__

* fix bug in solver

* add HumanBodyPart.csv data

* fix bug in builder

* update requirement

* fix example llm config

* add example cfg

* fix readme

* update

* update requirement

* Update README.md

* import FlagEmbedding at top-level to avoid sklearn init failure

* fix llm client

* add __init__

* import sklearn before FlagEmbedding to avoid sklearn init failure

* implement vector_dimensions in base Vectorizer

* add VectorizerConfigChecker

* add llm config checker

* add llm init file

* fix

* fix

* add init for solver

* fix

* fix(kag): fix llm call warning info (#11)

* output llm call warning info

* add generator module

* [fix](builder): prompt config for builder (#14)

* fix zh prompt config bug

* add __init__ for extractabc

* [fix](builder): batchvectorizer kwargs (#16)

* fix zh prompt config bug

* add __init__ for extractabc

* fix batchvectorizer kwargs

* using ollama client

* fix builder init

* fix builder init (#18)

* fix builder init

* fix llm config checker main

* fix llm test

* (fix)[solver]: language (#19)

* fix builder init

* fix language

* fix req

* (fix)[common]: llm checker (#20)

* fix builder init

* fix llm checker

* filter edges with empty relation (#26)

* (fix)[common]: llm client (#22)

* fix builder init

* add sub llm client in __init__

* fix cmd kagbase

* fix spo kag_llm

* (fix)[solver]: default prompt for examples (#29)

* fix builder init

* fix default prompt for examples

* fix kagbasemodule for prompt config

* fix kagbasemodule for prompt config

* [feat]: Add test dataset for hotpotqa and musique (#30)

* add dataset

* move dir

* fix bug in spg_extractor and base_table_splitter (#44)

* Update README_cn.md (#55)

Update the description of KAG core features in the readme file

* Update README.md (#54)

* docs: add Japanese README file (#49)

I created Japanese translated README.

* doc: note on starring kag in README.md (#59)

* add default prop name (#62)

* Update README for en,cn (#63)

* update readme for cn,en

* update readme for cn,en

* Readme optimize (#65)

* update readme for cn,en

* update readme for cn,en

* update readme for cn,en

* minor format tweak

---------

Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>

* Update README_cn.md (#74)

* generate embedding in batch (#76)

* add pro commit

* fix(example): fix example open qa benchmark data id generator (#97)

* fix builder chunk index

* split length

* add force chunk (#98)

* Update logic_form_plan.py (#95)

* fix(builder): add spgtype not null check (#106)

* add spg_type not null check

* add spg_type not null check

* [feat] add template for issues (#108)

* feat(kag) update template for issues (#109)

* [feat] add template for issues

* update template for issues

* change type of variable self.schema from dict.keys to list[str] (#99)

* feat(kag) update template for issues (#114)

* [feat] add template for issues

* update template for issues

* update template for issues

* rename graphalgoclient to graphclient

* feat(ReadMe) Update README.md (#119)

* feat(kag)Update README.md (#120)

* feat(kag) update docs for github.io (#122)

* [feat] add template for issues

* update template for issues

* update template for issues

* update template for issues

* feat(kag) update docs for kag (#123)

* [feat] add template for issues

* update template for issues

* update template for issues

* update template for issues

* update template for issues

* fix(solver): modify the input parameters of the sum and verify solvers in solver (#139)

* modify the input parameters of the sum and verify solvers in solver

* remove extra spaces

* add testset for kag-demo (#144)

* feat(kag) rename testset file (#147)

* add testset for kag-demo

* rename testset

* Update README.md (#181)

* refactor(all): kag v0.6 (#174)

* add path find

* fix find path

* spg guided relation extraction

* fix dict parse with same key

* rename graphalgoclient to graphclient

* rename graphalgoclient to graphclient

* file reader supports http url

* add checkpointer class

* parser supports checkpoint

* add build

* remove incorrect logs

* remove logs

* update examples

* update chain checkpointer

* vectorizer batch size set to 32

* add a zodb-backed checkpointer

* add a zodb-backed checkpointer

* fix zodb based checkpointer

* add thread for zodb IO

* fix(common): resolve multithread conflict in zodb IO

* fix(common): load existing zodb checkpoints

* update examples

* update examples

* fix zodb writer

* add docstring

* fix jieba version mismatch

* commit kag_config-tc.yaml

1. rename type to register_name
2. give register_name a unique & specific name
3. rename reader to scanner
4. rename parser to reader
5. rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6. rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7. pre-define llm & vectorize_model and refer to them in the yaml file

Issues to be resolved:
1. examples of event extract & spg extract
2. statistics of the indexer, such as the number of nodes & edges extracted and the ratio of llm invocations.
3. Exceptions such as debt or a nonexistent account should be thrown in llm invoke.
4. conf of solver needs to be re-examined.

* commit kag_config-tc.yaml

1. rename type to register_name
2. give register_name a unique & specific name
3. rename reader to scanner
4. rename parser to reader
5. rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6. rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7. pre-define llm & vectorize_model and refer to them in the yaml file

Issues to be resolved:
1. examples of event extract & spg extract
2. statistics of the indexer, such as the number of nodes & edges extracted and the ratio of llm invocations.
3. Exceptions such as debt or a nonexistent account should be thrown in llm invoke.
4. conf of solver needs to be re-examined.

* 1. fix bug in base_table_splitter

* 1. fix bug in base_table_splitter

* 1. fix bug in default_chain

* add solver

* add kag

* update outline splitter

* add main test

* add op

* code refactor

* add tools

* fix outline splitter

* fix outline prompt

* graph api pass

* commit with page rank

* add search api and graph api

* add markdown report

* fix vectorizer num batch compute

* add retry for vectorize model call

* update markdown reader

* update markdown reader

* update pdf reader

* raise extractor failure

* add default expr

* add log

* merge jc reader features

* rm import

* add build

* fix zodb based checkpointer

* add thread for zodb IO

* fix(common): resolve multithread conflict in zodb IO

* fix(common): load existing zodb checkpoints

* update examples

* update examples

* fix zodb writer

* add docstring

* fix jieba version mismatch

* commit kag_config-tc.yaml

1. rename type to register_name
2. give register_name a unique & specific name
3. rename reader to scanner
4. rename parser to reader
5. rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6. rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7. pre-define llm & vectorize_model and refer to them in the yaml file

Issues to be resolved:
1. examples of event extract & spg extract
2. statistics of the indexer, such as the number of nodes & edges extracted and the ratio of llm invocations.
3. Exceptions such as debt or a nonexistent account should be thrown in llm invoke.
4. conf of solver needs to be re-examined.

* commit kag_config-tc.yaml

1. rename type to register_name
2. give register_name a unique & specific name
3. rename reader to scanner
4. rename parser to reader
5. rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6. rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7. pre-define llm & vectorize_model and refer to them in the yaml file

Issues to be resolved:
1. examples of event extract & spg extract
2. statistics of the indexer, such as the number of nodes & edges extracted and the ratio of llm invocations.
3. Exceptions such as debt or a nonexistent account should be thrown in llm invoke.
4. conf of solver needs to be re-examined.

* 1. fix bug in base_table_splitter

* 1. fix bug in base_table_splitter

* 1. fix bug in default_chain

* update outline splitter

* add main test

* add markdown report

* code refactor

* fix outline splitter

* fix outline prompt

* update markdown reader

* fix vectorizer num batch compute

* add retry for vectorize model call

* update markdown reader

* raise extractor failure

* rm parser

* run pipeline

* add config option of whether to perform llm config check, default to false

* fix

* recover pdf reader

* several components can be null for default chain

* support full qa run

* add if

* remove unused code

* use chunk as fallback

* excluded source relation to choose

* add generate

* default recall 10

* add local memory

* exclude similar edges

* add safeguards

* fix concurrency issue

* add debug logger

* support parameterized topk

* support chunk truncation and adjust the spo select prompt

* add query request safeguards

* add force_chunk config

* fix entity linker algorithm

* add sub query rewriting

* fix md reader dup in test

* fix

* merge knext to kag parallel

* fix package

* fix metric regression

* scanner update

* scanner update

* add doc and update example scripts

* fix

* add bridge to spg server

* add format

* fix bridge

* update conf for baike

* disable ckpt for spg server runner

* llm invoke error default raise exceptions

* chore(version): bump version to X.Y.Z

* update default response generation prompt

* add method getSummarizationMetrics

* fix(common): fix project conf empty error

* fix typo

* add reporting info

* modify main solver

* postprocessor support spg server

* update supported solver name

* fix language

* modify chunker interface, add openapi

* rename vectorizer to vectorize_model in spg server config

* generate_random_string start with gen

* add knext llm vector checker

* add knext llm vector checker

* add knext llm vector checker

* remove default values from solver

* update yaml and register_name for baike

* update yaml and register_name for baike

* remove config key check

* fix llmmodule

* fix knext project

* update yaml and register_name for examples

* update yaml and register_name for examples

* Revert "udpate yaml and register_name for examples"

This reverts commit 9705951d066b282ac49f0e1972559b646e7f906d.

* update register name

* fix

* fix

* support multiple register names

* update component

* update reader register names (#183)

* fix markdown reader

* fix llm client for retry

* feat(common): add processed chunk id checkpoint (#185)

* update reader register names

* add processed chunk id checkpoint

* feat(example): add example config (#186)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* add max_workers parameter for getSummarizationMetrics to make it faster

* add csqa data generation script generate_data.py

* commit generated csqa builder and solver data

* add csqa basic project files

* adjust split_length and num_threads_per_chain to match lightrag settings

* ignore ckpt dirs

* add csqa evaluation script eval.py

* save evaluation scripts summarization_metrics.py and factual_correctness.py

* save LightRAG output csqa_lightrag_answers.json

* ignore KAG output csqa_kag_answers.json

* add README.md for CSQA

* fix(solver): fix solver pipeline conf (#191)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* update links and file paths

* reformat csqa kag_config.yaml

* reformat csqa python files

* reformat getSummarizationMetrics and compare_summarization_answers

* fix(solver): fix solver config (#192)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* fix main solver conf

* add except

* fix typo in csqa README.md

* feat(conf): support reinitialize config for call from java side (#199)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* fix main solver conf

* support reinitialize config for java call

* revert default response generation prompt

* update project list

* add README.md for the hotpotqa, 2wiki and musique examples

* add spo retrieval

* turn off kag config dump by default

* turn off knext schema dump by default

* add .gitignore and fix kag_config.yaml

* add README.md for the medicine example

* add README.md for the supplychain example

* bugfix for risk mining

* use exact out

* refactor(solver): format solver code (#205)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* fix main solver conf

* support reinitialize config for java call

* black format

---------

Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>

* docs(examples): finish README.md for builtin kag examples (#207)

* add data introduction for example supplychain

* add schema modeling for example supplychain

* add kg construction for example supplychain

* add kg query for example supplychain

* finish README_cn.md for the supplychain example

* finish README.md for the supplychain example

* add README.md for the riskmining example

* add README.md for the medicine example

* update README.md of hotpotqa, 2wiki and musique

* add README_cn.md for hotpotqa, 2wiki and musique

* update README.md of csqa

* add README_cn.md for csqa

* update link targets to 0.6 version of the docs

* add README.md for baike

* reformat Python code in examples

* fix create project (#208)

* docs(examples): finish README.md for the examples directory (#210)

* add data introduction for example supplychain

* add schema modeling for example supplychain

* add kg construction for example supplychain

* add kg query for example supplychain

* finish README_cn.md for the supplychain example

* finish README.md for the supplychain example

* add README.md for the riskmining example

* add README.md for the medicine example

* update README.md of hotpotqa, 2wiki and musique

* add README_cn.md for hotpotqa, 2wiki and musique

* update README.md of csqa

* add README_cn.md for csqa

* update link targets to 0.6 version of the docs

* add README.md for baike

* reformat Python code in examples

* add README_cn.md for examples

* finish README.md for examples

* fix typo in README.md for examples

* move images for kag examples to _static/images (#214)

* move more images for kag examples to _static/images (#216)

* update default yaml and corpus (#217)

* fix(knext): fix knext project env (#211)

* fix create project

* fix create project

* fix create project

* fix create project

* fix examples README.md to match quick start doc (#218)

* fix(example): fix vectorize model config in example (#220)

* fix vectorize model config

* remove ak

* remove ak

* x

* change log level to debug (#221)

* fix knext env (#223)

* fix 'ModuleNotFoundError: No module named xxx' error in Windows (#222)

* reduce warn (#225)

* update(kag) update log level (#226)

* update default yaml and corpus

* update log level to debug

* fix(KAG): change level log (#227)

* change log level to debug

* fix(example): fix vectorize model config in example (#220)

* fix vectorize model config

* remove ak

* remove ak

* x

* fix knext env (#223)

* fix 'ModuleNotFoundError: No module named xxx' error in Windows (#222)

* reduce warn (#225)

* change log level to debug

---------

Co-authored-by: zhuzhongshu123 <152354526+zhuzhongshu123@users.noreply.github.com>
Co-authored-by: Xinhong Zhang <zhangxinhong.zxh@antgroup.com>

* fix knext client (#229)

* fix(knext): fix knext client (#230)

* fix knext client

* x

* fix ollama register name (#234)

* add timeout param for llm and embedding model (#236)

* Update README_cn.md (#238)

* Update README.md (#237)

* feat(examples): output qfs evaluation results as json and markdown (#240)

* fix vectorize_model configuration key typo

* fix permissions of data files

* fix examples README.md inconsistency

* output qfs evaluation results as json and markdown

* format summarization_metrics.py with black

* chore(examples): domain KG inject example (#249)

* add timeout param for llm and embedding model

* add example

* fix title

* update(kag) Update README (#258)

* update README #andy

* update README #andy

* update README #andy

* update README #andy

* update(kag) Update README  (#264)

* update README #andy

* update README #andy

* update README #andy

* update README #andy

* update README #andy

* update README #andy

* fix mix reader (#270)

* feat(builder): add Azure Open AI Compatibility (#269)

* feat(llm): add Azure OpenAI client and vectorization support

* chore: add .DS_Store to .gitignore

* refactor(llm):add description for api_version and default value

* refactor(vectorize_model): added description for api_version and default values for some params

* refactor(openai_model): enhance docstring for Azure AD token and deployment parameters

* fix(builder): fix markdown reader for id (#273)

* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* first fix

* fix(examples): fix qa file name (#251)

* support custom kag config file (#279)

* feat(bridge): spg server bridge supports config check and run solver  (#287)

* x

* x (#280)

* bridge add solver

* x

* feat(bridge): spg server bridge (#283)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* feat(kag): catch unexpected exceptions (#298)

* x (#280)

* feat(bridge): spg server bridge (#283)

* x

* bridge add solver

* x

* feat(bridge): Spg server bridge check (#285)

* x

* bridge add solver

* x

* add invoke

* feat(common): llm client catch exception (#294)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* feat(solver): catch chunk retriever exception (#297)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* feat(common):llm except (#299)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* print llm invoke error info

* with except

* feat(common): force raise except (#300)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* print llm invoke error info

* with except

* force raise except

* delete checkpoint of postprocess (#302)

* disable entity linking in postprocess by default (#304)

* add retry (#306)

* use json repair for llm client (#312)

* fix empty data generate (#319)

* Add Discord link and wechat qr code. (#338)

* Add qr code

* Update README.md

Add discord and how to join the wechat group.

* fix the error when the stream parameter is True (#336)

* Update baike kag_config.yaml (#339)

The reference to the class corresponding to the default_chunk_retriever is incorrect. fix it

* fix(builder): fix pdf reader for normalizing text in outline (#344)

* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* fix pdf reader

* fix pdf reader

* fix pdf reader

* fix(builder): bugfix official_name node has same prop object (#372)

* bugfix official_name node has same prop object

* reformat by black

* fix(solver): bugfix SPO Retrieval LLM response parse (#378)

* bugfix official_name node has same prop object

* reformat by black

* adapter spo retrieval llm response

* core_team  #andy (#389)

* fix(knext): project update addr (#408)

* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* use config default

* fix(knext): set token in request of write_graph (#409)

* fix(knext)set token in request of write_graph

* refine code

* feat(kag): update to v0.7 (#456)

* add think cost

* update csv scanner

* add final rerank

* add reasoner

* add iterative planner

* fix dpr search

* fix dpr search

* add reference data

* move odps import

* update requirement.txt

* update 2wiki

* add missing file

* fix markdown reader

* add iterative planning

* update version

* update runner

* update 2wiki example

* update bridge

* merge solver and solver_new

* add cur day

* writer delete

* update multi process

* add missing files

* fix report

* add chunk retrieved executor

* update try in stream runner result

* add path

* add math executor

* update hotpotqa example

* remove log

* fix python coder solver

* update hotpotqa example

* fix python coder solver

* update config

* fix bad

* add log

* remove unused code

* commit with task thought

* move kag model to common

* add default chat llm

* fix

* use static planner

* support chunk graph node

* add args

* support naive rag

* llm client support tool calls

* add default async

* add openai

* fix result

* fix markdown reader

* fix thinker

* update asyncio interface

* feat(solver): add mcp support (#444)

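* upload mcp client related code

* 1. implement the full mcp client call flow, from pipeline to planner and executor
2. allow multiple mcp_server entries in the json, invoked and selected by the LLM
3. get baidu_map_mcp working

* 1. schema

* bugfix: remove redundant code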

---------

Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>

* fix affairqa after solver refactor

* fix affairqa after solver refactor

* fix readme

* add params

* update version

* update mcp executor

* update mcp executor

* solver add mcp executor

* add missing file

* add mcp executor

* add executor

* x

* update

* fix requirement

* fix main llm config

* fix solver

* bugfix: fix invoke function call logic

* chg eva

* update example

* add kag layer

* add step task

* support dot refresh

* support dot refresh

* support dot refresh

* support dot refresh

* add retrieved num

* add retrieved num

* add pipelineconf

* update ppr

* update musique prompts

* update

* add to_dict for BuilderComponentData

* async build

* add deduce prompt

* add deduce prompt

* add deduce prompt

* fix reader

* add deduce prompt

* add page thinker report

* modify prompt

* add step status

* add self cognition

* add self cognition

* add memory graph storage

* add now time

* update memory config

* add now time

* chg graph loader

* add prqa dataset and code

* bugfix: fix prqa call logic

* optimize: improve code logic, normalize generated answers

* add retry py code

* update memory graph

* update memory graph

* fix

* fix ner

* add with_out_refer generator prompt

* fix

* close ckpt

* fix query

* fix query

* update version

* add llm checker

* add llm checker

* 1. upload evalutor.py and change the gold_answer.json format
2. improve code logic
3. update README.md

* update exp

* update exp

* rerank support

* add static rewrite query

* recall more chunks

* fix graph load

* add static rewrite query

* fix bugs

* add finish check

* add finish check

* add finish check

* add finish check

* 1. upload the results of evalutor.py
2. improve code logic and the readme file

* add lf retry

* add memory graph api

* fix reader api

* add ner

* add metrics

* fix bug

* remove ner

* add reraise for retry

* add edge prop to memory graph

* add memory graph

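* 1. correct the evaluation dataset results
2. improve the evaluator.py code
3. remove questions that have an answer in gold_answer but no result

* delete evaluation result files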

* fix knext host addr

* async eva

* add lf prompt

* add lf prompt

* add config

* add retry

* add unknown check

* add rc result

* add rc result

* add rc result

* add rc result

* adapt code logic to the kag pipeline format and pass tests

* bugfix: remove redundant code

* fix report prompt

* bugfix: trigger retry mechanism

* bugfix: incorrect Chinese punctuation

* fix rethinker prompt

* update version to 0.6.2b78

* update version

* 1. update evaluator.py to compute accuracy with the LLM, matching the latest call logic
2. update the prompt so that unanswered results are retested

* update affairqa for evaluate

* update affairqa for evaluate

* bugfix: correct dataset

* bugfix: correct dataset

* bugfix: correct dataset

* fix name conflict

* bugfix: remove incorrect questions

* bugfix: wrong file name caused evaluator failure

* update for affairqa eval

* bugfix: update code to keep evaluate logic consistent

* x

* update for affairqa readme

* remove temp eval scripts

* bugfix for math deduce

* merge 0.6.2_dev

* merge 0.6.2_dev

* fix

* update client addr

* updated version

* update for affairqa eval

* evaUtils supports Chinese

* fix affairqa eval:

* remove unused example

* update kag config

* fix default value

* update readme

* fix init

* update comments and add descriptions for some classes

* update example config

* Tc 0.7.0 (#459)

* commit affairQA code

* fix affairqa eval

---------

Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>

* fix all examples

* reformat

---------

Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>

* update readme (#460)

* feat(kag): update readme (#462)

* update readme

* update readme

* feat(kag): update version of KAG (#475)

* update readme

* update readme

* update version of KAG

* add async return (#481)

* add run component error log (#482)

* fix(bin): add pip index url (#483)

* add run component error log

* add index url option

* add gpu type

* fix(solver): add component name (#480)

* add ai search example

* bugfix reporter name

* update version

* fix ci

* support disabling vector generation (#486)

* fix

* fix

---------

Co-authored-by: Andy <andy.yj@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: leywar <leywar.liang@antgroup.com>
Co-authored-by: zhuzhongshu123 <152354526+zhuzhongshu123@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Co-authored-by: tpoisonooo <khj.application@aliyun.com>
Co-authored-by: thundax <84020791+thundax-lyp@users.noreply.github.com>
Co-authored-by: Chasing <94726836+zzzcccxx@users.noreply.github.com>
Co-authored-by: yangman <1515243746@qq.com>
Co-authored-by: joseosvaldo16 <joseosvaldo16@yahoo.com.mx>
Co-authored-by: luzizhuo <496521310@qq.com>
Co-authored-by: hy89 <31279043+hy89@users.noreply.github.com>
Co-authored-by: xueguanwen <xgw1989@sina.com>
Co-authored-by: bingchu <152955942+J4ckycjl@users.noreply.github.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
2025-05-12 13:47:53 +08:00
Xinhong Zhang
e5dd421bd7
fix(builder): md reader str 20250508 (#525)
* Initial commit

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Create CITATION.cff

* Update README.md

* Remove sensitive information

* refine vectorizer code and config

* fix vectorizer package name

* fix old import and requirement version bug

* delete kagdemo

* remove batch_vectorizer.py under common

* fix struct chain bug

* fix path bug

* fix path bug

* fix comment

* add README_cn.md

* fix spaces

* handle empty node_path

* handle empty node_batch

* add english README.md

* fix url

* fix typo

* fix bug in solver

* use kwargs

* use kwargs to init config

* add llm input

* remove __main__

* fix bug in solver

* add HumanBodyPart.csv data

* fix bug in builder

* update requirement

* fix example llm config

* add example cfg

* fix readme

* update

* update requirement

* Update README.md

* import FlagEmbedding at top-level to avoid sklearn init failure

* fix llm client

* add __init__

* import sklearn before FlagEmbedding to avoid sklearn init failure

* implement vector_dimensions in base Vectorizer

* add VectorizerConfigChecker

* add llm config checker

* add llm init file

* fix

* fix

* add init for solver

* fix

* fix(kag): fix llm call warning info (#11)

* output llm call warning info

* add generator module

* [fix](builder): prompt config for builder (#14)

* fix zh prompt config bug

* add __init__ for extractabc

* [fix](builder): batchvectorizer kwargs (#16)

* fix zh prompt config bug

* add __init__ for extractabc

* fix batchvectorizer kwargs

* using ollama client

* fix builder init

* fix builder init (#18)

* fix builder init

* fix llm config checker main

* fix llm test

* (fix)[solver]: language (#19)

* fix builder init

* fix language

* fix req

* (fix)[common]: llm checker (#20)

* fix builder init

* fix llm checker

* filter edges with empty relation (#26)

* (fix)[common]: llm client (#22)

* fix builder init

* add sub llm client in __init__

* fix cmd kagbase

* fix spo kag_llm

* (fix)[solver]: default prompt for examples (#29)

* fix builder init

* fix default prompt for examples

* fix kagbasemodule for prompt config

* fix kagbasemodule for prompt config

* [feat]: Add test dataset for hotpotqa and musique (#30)

* add dataset

* move dir

* fix bug in spg_extractor and base_table_splitter (#44)

* Update README_cn.md (#55)

Update the description of KAG core features in the readme file

* Update README.md (#54)

* docs: add Japanese README file (#49)

I created Japanese translated README.

* doc: note on starring kag in README.md (#59)

* add default prop name (#62)

* Update README for en,cn (#63)

* update readme for cn,en

* update readme for cn,en

* Readme optimize (#65)

* update readme for cn,en

* update readme for cn,en

* update readme for cn,en

* minor format tweak

---------

Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>

* Update README_cn.md (#74)

* generate embedding in batch (#76)

* add pro commit

* fix(example): fix example open qa benchmark data id generator (#97)

* fix builder chunk index

* split length

* add force chunk (#98)

* Update logic_form_plan.py (#95)

* fix(builder): add spgtype not null check (#106)

* add spg_type not null check

* add spg_type not null check

* [feat] add template for issues (#108)

* feat(kag) update template for issues (#109)

* [feat] add template for issues

* update template for issues

* change type of variable self.schema from dict.keys to list[str] (#99)

* feat(kag) update template for issues (#114)

* [feat] add template for issues

* update template for issues

* update template for issues

* rename graphalgoclient to graphclient

* feat(ReadMe) Update README.md (#119)

* feat(kag)Update README.md (#120)

* feat(kag) update docs for github.io (#122)

* [feat] add template for issues

* update template for issues

* update template for issues

* update template for issues

* feat(kag) update docs for kag (#123)

* [feat] add template for issues

* update template for issues

* update template for issues

* update template for issues

* update template for issues

* fix(solver): modify the input parameters of the sum and verify solvers in solver (#139)

* modify the input parameters of the sum and verify solvers in solver

* remove extra spaces

* add testset for kag-demo (#144)

* feat(kag) rename testset file (#147)

* add testset for kag-demo

* rename testset

* Update README.md (#181)

* refactor(all): kag v0.6 (#174)

* add path find

* fix find path

* spg guided relation extraction

* fix dict parse with same key

* rename graphalgoclient to graphclient

* rename graphalgoclient to graphclient

* file reader supports http url

* add checkpointer class

* parser supports checkpoint

* add build

* remove incorrect logs

* remove logs

* update examples

* update chain checkpointer

* vectorizer batch size set to 32

* add a zodb backended checkpointer

* add a zodb backended checkpointer

* fix zodb based checkpointer

* add thread for zodb IO

* fix(common): resolve multithread conflict in zodb IO

* fix(common): load existing zodb checkpoints

* update examples

* update examples

* fix zodb writer

* add docstring

* fix jieba version mismatch

* commit kag_config-tc.yaml

1、rename type to register_name
2、put a unique & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file

Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.

* commit kag_config-tc.yaml

1、rename type to register_name
2、put a unique & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file

Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.

* 1、fix bug in base_table_splitter

* 1、fix bug in base_table_splitter

* 1、fix bug in default_chain

* add solver

* add kag

* update outline splitter

* add main test

* add op

* code refactor

* add tools

* fix outline splitter

* fix outline prompt

* graph api pass

* commit with page rank

* add search api and graph api

* add markdown report

* fix vectorizer num batch compute

* add retry for vectorize model call

* update markdown reader

* update markdown reader

* update pdf reader

* raise extractor failure

* add default expr

* add log

* merge jc reader features

* rm import

* add build

* fix zodb based checkpointer

* add thread for zodb IO

* fix(common): resolve multithread conflict in zodb IO

* fix(common): load existing zodb checkpoints

* update examples

* update examples

* fix zodb writer

* add docstring

* fix jieba version mismatch

* commit kag_config-tc.yaml

1、rename type to register_name
2、put a unique & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file

Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.

* commit kag_config-tc.yaml

1、rename type to register_name
2、put a unique & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file

Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.

* 1、fix bug in base_table_splitter

* 1、fix bug in base_table_splitter

* 1、fix bug in default_chain

* update outline splitter

* add main test

* add markdown report

* code refactor

* fix outline splitter

* fix outline prompt

* update markdown reader

* fix vectorizer num batch compute

* add retry for vectorize model call

* update markdown reader

* raise extractor failure

* rm parser

* run pipeline

* add config option of whether to perform llm config check, default to false

* fix

* recover pdf reader

* several components can be null for default chain

* support running the full qa flow

* add if

* remove unused code

* use chunk as fallback

* excluded source relation to choose

* add generate

* default recall 10

* add local memory

* exclude similar edges

* add safeguards

* fix concurrency issue

* add debug logger

* support parameterized topk

* support chunk truncation and adjust the spo select prompt

* add query request safeguards

* add force_chunk config

* fix entity linker algorithm

* add sub query rewriting

* fix md reader dup in test

* fix

* merge knext to kag parallel

* fix package

* fix metric drop issue

* scanner update

* scanner update

* add doc and update example scripts

* fix

* add bridge to spg server

* add format

* fix bridge

* update conf for baike

* disable ckpt for spg server runner

* llm invoke error default raise exceptions

* chore(version): bump version to X.Y.Z

* update default response generation prompt

* add method getSummarizationMetrics

* fix(common): fix project conf empty error

* fix typo

* add reporting info

* modify main solver

* postprocessor support spg server

* modify solver support name

* fix language

* modify chunker interface, add openapi

* rename vectorizer to vectorize_model in spg server config

* generate_random_string start with gen

* add knext llm vector checker

* add knext llm vector checker

* add knext llm vector checker

* remove default values from solver

* update yaml and register_name for baike

* update yaml and register_name for baike

* remove config key check

* fix llmmodule

* fix knext project

* update yaml and register_name for examples

* update yaml and register_name for examples

* Revert "update yaml and register_name for examples"

This reverts commit 9705951d066b282ac49f0e1972559b646e7f906d.

* update register name

* fix

* fix

* support multiple register names

* update component

* update reader register names (#183)

* fix markdown reader

* fix llm client for retry

* feat(common): add processed chunk id checkpoint (#185)

* update reader register names

* add processed chunk id checkpoint

* feat(example): add example config (#186)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* add max_workers parameter for getSummarizationMetrics to make it faster

* add csqa data generation script generate_data.py

* commit generated csqa builder and solver data

* add csqa basic project files

* adjust split_length and num_threads_per_chain to match lightrag settings

* ignore ckpt dirs

* add csqa evaluation script eval.py

* save evaluation scripts summarization_metrics.py and factual_correctness.py

* save LightRAG output csqa_lightrag_answers.json

* ignore KAG output csqa_kag_answers.json

* add README.md for CSQA

* fix(solver): fix solver pipeline conf (#191)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* update links and file paths

* reformat csqa kag_config.yaml

* reformat csqa python files

* reformat getSummarizationMetrics and compare_summarization_answers

* fix(solver): fix solver config (#192)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* fix main solver conf

* add except

* fix typo in csqa README.md

* feat(conf): support reinitialize config for call from java side (#199)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* fix main solver conf

* support reinitialize config for java call

* revert default response generation prompt

* update project list

* add README.md for the hotpotqa, 2wiki and musique examples

* add spo retrieval

* turn off kag config dump by default

* turn off knext schema dump by default

* add .gitignore and fix kag_config.yaml

* add README.md for the medicine example

* add README.md for the supplychain example

* bugfix for risk mining

* use exact out

* refactor(solver): format solver code (#205)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* fix main solver conf

* support reinitialize config for java call

* black format

---------

Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>

* docs(examples): finish README.md for builtin kag examples (#207)

* add data introduction for example supplychain

* add schema modeling for example supplychain

* add kg construction for example supplychain

* add kg query for example supplychain

* finish README_cn.md for the supplychain example

* finish README.md for the supplychain example

* add README.md for the riskmining example

* add README.md for the medicine example

* update README.md of hotpotqa, 2wiki and musique

* add README_cn.md for hotpotqa, 2wiki and musique

* update README.md of csqa

* add README_cn.md for csqa

* update link targets to 0.6 version of the docs

* add README.md for baike

* reformat Python code in examples

* fix create project (#208)

* docs(examples): finish README.md for the examples directory (#210)

* add data introduction for example supplychain

* add schema modeling for example supplychain

* add kg construction for example supplychain

* add kg query for example supplychain

* finish README_cn.md for the supplychain example

* finish README.md for the supplychain example

* add README.md for the riskmining example

* add README.md for the medicine example

* update README.md of hotpotqa, 2wiki and musique

* add README_cn.md for hotpotqa, 2wiki and musique

* update README.md of csqa

* add README_cn.md for csqa

* update link targets to 0.6 version of the docs

* add README.md for baike

* reformat Python code in examples

* add README_cn.md for examples

* finish README.md for examples

* fix typo in README.md for examples

* move images for kag examples to _static/images (#214)

* move more images for kag examples to _static/images (#216)

* update default yaml and corpus (#217)

* fix(knext): fix knext project env (#211)

* fix create project

* fix create project

* fix create project

* fix create project

* fix examples README.md to match quick start doc (#218)

* fix(example): fix vectorize model config in example (#220)

* fix vectorize model config

* remove ak

* remove ak

* x

* change log level to debug (#221)

* fix knext env (#223)

* fix 'ModuleNotFoundError: No module named xxx' error in Windows (#222)

* reduce warn (#225)

* update(kag) update log level (#226)

* update default yaml and corpus

* update log level to debug

* fix(KAG): change level log (#227)

* change log level to debug

* fix(example): fix vectorize model config in example (#220)

* fix vectorize model config

* remove ak

* remove ak

* x

* fix knext env (#223)

* fix 'ModuleNotFoundError: No module named xxx' error in Windows (#222)

* reduce warn (#225)

* change log level to debug

---------

Co-authored-by: zhuzhongshu123 <152354526+zhuzhongshu123@users.noreply.github.com>
Co-authored-by: Xinhong Zhang <zhangxinhong.zxh@antgroup.com>

* fix knext client (#229)

* fix(knext): fix knext client (#230)

* fix knext client

* x

* fix ollama register name (#234)

* add timeout param for llm and embedding model (#236)

* Update README_cn.md (#238)

* Update README.md (#237)

* feat(examples): output qfs evaluation results as json and markdown (#240)

* fix vectorize_model configuration key typo

* fix permissions of data files

* fix examples README.md inconsistency

* output qfs evaluation results as json and markdown

* format summarization_metrics.py with black

* chore(examples): domain KG inject example (#249)

* add timeout param for llm and embedding model

* add example

* fix title

* update(kag) Update README (#258)

* update README #andy

* update README #andy

* update README #andy

* update README #andy

* update(kag) Update README  (#264)

* update README #andy

* update README #andy

* update README #andy

* update README #andy

* update README #andy

* update README #andy

* fix mix reader (#270)

* feat(builder): add Azure Open AI Compatibility (#269)

* feat(llm): add Azure OpenAI client and vectorization support

* chore: add .DS_Store to .gitignore

* refactor(llm):add description for api_version and default value

* refactor(vectorize_model): added description for api_version and default values for some params

* refactor(openai_model): enhance docstring for Azure AD token and deployment parameters

* fix(builder): fix markdown reader for id (#273)

* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* first fix

* fix(examples): fix qa file name (#251)

* support custom kag config file (#279)

* feat(bridge): spg server bridge supports config check and run solver  (#287)

* x

* x (#280)

* bridge add solver

* x

* feat(bridge): spg server bridge (#283)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* feat(kag): catch unexpected exceptions (#298)

* x (#280)

* feat(bridge): spg server bridge (#283)

* x

* bridge add solver

* x

* feat(bridge): Spg server bridge check (#285)

* x

* bridge add solver

* x

* add invoke

* feat(common): llm client catch exception (#294)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* feat(solver): catch chunk retriever exception (#297)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* feat(common):llm except (#299)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* print llm invoke error info

* with except

* feat(common): force raise except (#300)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* print llm invoke error info

* with except

* force raise except

* delete checkpoint of postprocess (#302)

* disable entity linking in postprocess by default (#304)

* add retry (#306)

* use json repair for llm client (#312)

* fix empty data generate (#319)

* Add Discord link and wechat qr code. (#338)

* Add qr code

* Update README.md

Add discord and how to join the wechat group.

* fix the error when the stream parameter is True (#336)

* Update baike kag_config.yaml (#339)

The reference to the class corresponding to the default_chunk_retriever is incorrect. fix it

* fix(builder): fix pdf reader for normalizing text in outline (#344)

* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* fix pdf reader

* fix pdf reader

* fix pdf reader

* fix(builder): bugfix official_name node has same prop object (#372)

* bugfix official_name node has same prop object

* reformat by black

* fix(solver): bugfix SPO Retrieval LLM response parse (#378)

* bugfix official_name node has same prop object

* reformat by black

* adapter spo retrieval llm response

* core_team  #andy (#389)

* fix(knext): project update addr (#408)

* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* use config default

* fix(knext): set token in request of write_graph (#409)

* fix(knext)set token in request of write_graph

* refine code

* feat(kag): update to v0.7 (#456)

* add think cost

* update csv scanner

* add final rerank

* add reasoner

* add iterative planner

* fix dpr search

* fix dpr search

* add reference data

* move odps import

* update requirement.txt

* update 2wiki

* add missing file

* fix markdown reader

* add iterative planning

* update version

* update runner

* update 2wiki example

* update bridge

* merge solver and solver_new

* add cur day

* writer delete

* update multi process

* add missing files

* fix report

* add chunk retrieved executor

* update try in stream runner result

* add path

* add math executor

* update hotpotqa example

* remove log

* fix python coder solver

* update hotpotqa example

* fix python coder solver

* update config

* fix bad

* add log

* remove unused code

* commit with task thought

* move kag model to common

* add default chat llm

* fix

* use static planner

* support chunk graph node

* add args

* support naive rag

* llm client support tool calls

* add default async

* add openai

* fix result

* fix markdown reader

* fix thinker

* update asyncio interface

* feat(solver): add mcp support (#444)

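* upload mcp client related code

* 1. complete an end-to-end mcp client call flow, from pipeline to planner and executor
2. allow multiple mcp_server entries in the json, with the LLM choosing which one to call
3. get baidu_map_mcp working

* 1. schema

* bugfix: remove redundant code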

---------

Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>

* fix affairqa after solver refactor

* fix affairqa after solver refactor

* fix readme

* add params

* update version

* update mcp executor

* update mcp executor

* solver add mcp executor

* add missing file

* add mcp executor

* add executor

* x

* update

* fix requirement

* fix main llm config

* fix solver

* bugfix: fix invoke function call logic

* chg eva

* update example

* add kag layer

* add step task

* support dot refresh

* support dot refresh

* support dot refresh

* support dot refresh

* add retrieved num

* add retrieved num

* add pipelineconf

* update ppr

* update musique prompts

* update

* add to_dict for BuilderComponentData

* async build

* add deduce prompt

* add deduce prompt

* add deduce prompt

* fix reader

* add deduce prompt

* add page thinker report

* modify prompt

* add step status

* add self cognition

* add self cognition

* add memory graph storage

* add now time

* update memory config

* add now time

* chg graph loader

* add prqa dataset and code

* bugfix: fix prqa call logic

* optimize: improve code logic and normalize generated answers

* add retry py code

* update memory graph

* update memory graph

* fix

* fix ner

* add with_out_refer generator prompt

* fix

* close ckpt

* fix query

* fix query

* update version

* add llm checker

* add llm checker

* 1. upload evalutor.py and modify the gold_answer.json format
2. optimize code logic
3. modify the README.md file

* update exp

* update exp

* rerank support

* add static rewrite query

* recall more chunks

* fix graph load

* add static rewrite query

* fix bugs

* add finish check

* add finish check

* add finish check

* add finish check

* 1. upload the results of evalutor.py
2. optimize code logic and the readme file

* add lf retry

* add memory graph api

* fix reader api

* add ner

* add metrics

* fix bug

* remove ner

* add reraise for retry

* add edge prop to memory graph

* add memory graph

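* 1. correct the evaluation dataset results
2. optimize the evaluator.py code
3. remove questions that have no result but do have an answer in gold_answer

* delete evaluation result files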

* fix knext host addr

* async eva

* add lf prompt

* add lf prompt

* add config

* add retry

* add unknown check

* add rc result

* add rc result

* add rc result

* add rc result

* modify code logic to follow the kag pipeline format and pass tests

* bugfix: remove redundant code

* fix report prompt

* bugfix: trigger the retry mechanism

* bugfix: Chinese punctuation error

* fix rethinker prompt

* update version to 0.6.2b78

* update version

* 1. modify evaluator.py to compute accuracy via the LLM, matching the latest call logic
2. modify the prompt so that unanswered results are retested

* update affairqa for evaluate

* update affairqa for evaluate

* bugfix: correct the dataset

* bugfix: correct the dataset

* bugfix: correct the dataset

* fix name conflict

* bugfix: remove erroneous questions

* bugfix: incorrect file name caused evaluator failure

* update for affairqa eval

* bugfix: modify code to keep the evaluate logic consistent

* x

* update for affairqa readme

* remove temp eval scripts

* bugfix for math deduce

* merge 0.6.2_dev

* merge 0.6.2_dev

* fix

* update client addr

* updated version

* update for affairqa eval

* evaUtils: support Chinese

* fix affairqa eval:

* remove unused example

* update kag config

* fix default value

* update readme

* fix init

* update comments and add descriptions for some classes

* update example config

* Tc 0.7.0 (#459)

* commit affairQA code

* fix affairqa eval

---------

Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>

* fix all examples

* reformat

---------

Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>

* update readme (#460)

* feat(kag): update readme (#462)

* update readme

* update readme

* feat(kag): update version of KAG (#475)

* update readme

* update readme

* update version of KAG

* add async return (#481)

* add run component error log (#482)

* fix(bin): add pip index url (#483)

* add run component error log

* add index url option

* add gpu type

* fix(solver): add component name (#480)

* add ai search example

* bugfix reporter name

* update version

* fix ci

* support disabling vector generation (#486)

* fix

---------

Co-authored-by: Andy <andy.yj@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: leywar <leywar.liang@antgroup.com>
Co-authored-by: zhuzhongshu123 <152354526+zhuzhongshu123@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Co-authored-by: tpoisonooo <khj.application@aliyun.com>
Co-authored-by: thundax <84020791+thundax-lyp@users.noreply.github.com>
Co-authored-by: Chasing <94726836+zzzcccxx@users.noreply.github.com>
Co-authored-by: yangman <1515243746@qq.com>
Co-authored-by: joseosvaldo16 <joseosvaldo16@yahoo.com.mx>
Co-authored-by: luzizhuo <496521310@qq.com>
Co-authored-by: hy89 <31279043+hy89@users.noreply.github.com>
Co-authored-by: xueguanwen <xgw1989@sina.com>
Co-authored-by: bingchu <152955942+J4ckycjl@users.noreply.github.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
2025-05-08 14:48:49 +08:00
royzhao
286da392da
fix(solver): add prompt init params (#521)
* add async return

* add std schema init

* add std schema component to default thought pipeline

* add std schema component to default thought pipeline

* add std schema component to default thought pipeline
2025-05-07 15:43:01 +08:00
royzhao
6a4ab468fe
fix(solver): vectorize_model fix generate_key (#518)
* add async return

* turn vectorize model to singleton

* turn vectorize model to singleton

* add schema type linker

* fix black format

* bugfix vectorize

---------

Co-authored-by: 钟书 <zhongshu.zzs@antgroup.com>
2025-05-06 21:02:18 +08:00
royzhao
404d79b304
fix(solver): add search_api param for schema std (#516)
* add async return

* turn vectorize model to singleton

* turn vectorize model to singleton

* add schema type linker

* fix black format

---------

Co-authored-by: 钟书 <zhongshu.zzs@antgroup.com>
2025-05-06 16:57:12 +08:00
Xinhong Zhang
910c2d9df3
fix(builder): fix doc reader (#487)
* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* fix doc reader

* fix doc reader

* fix doc reader
2025-04-28 16:28:24 +08:00
Xinhong Zhang
407c73cc9e
fix(builder): md reader compatibility (#506)
* fix md reader

* fix md reader for comment

* fix

* fix md reader
2025-04-28 16:28:11 +08:00
unrealise
7d9bbc74e2
fix(builder): fix std and llm_client (#497)
* fix bugs: #278, #495 (ollama not self-tested)

* resolve code style issues

---------

Co-authored-by: e <ling.liu@chinacreator.com>
2025-04-25 18:01:45 +08:00
zhuzhongshu123
af7c8fe0ac
fix mapping (#492) v0.7.1 2025-04-24 19:27:18 +08:00
thundax
832733062d
fix(tools): update dsl string (#484)
* update DSL query string

* fix(tools): update ner construct params
2025-04-24 14:10:45 +08:00
xionghuaidong
6745cf1744
support disabling vector generation (#486) 2025-04-24 13:36:08 +08:00
royzhao
a7fd51d138
fix(solver): add component name (#480)
* add ai search example

* bugfix reporter name

* update version

* fix ci
2025-04-24 13:26:28 +08:00
zhuzhongshu123
d842662e5c
fix(bin): add pip index url (#483)
* add run component error log

* add index url option

* add gpu type
2025-04-23 16:10:24 +08:00
zhuzhongshu123
90d64d77d4
add run component error log (#482) 2025-04-23 15:22:12 +08:00
royzhao
a63bcde8ed
add async return (#481) 2025-04-23 13:04:38 +08:00
田常@蚂蚁
7f87685e2e
feat(kag): update version of KAG (#475)
* update readme

* update readme

* update version of KAG
2025-04-22 18:48:30 +08:00
田常@蚂蚁
4fbdd1515b
feat(kag): update readme (#462)
* update readme

* update readme
2025-04-18 15:40:49 +08:00
田常@蚂蚁
7c7910ab67
update readme (#460) 2025-04-18 10:05:27 +08:00
zhuzhongshu123
13cea5f6fe
feat(kag): update to v0.7 (#456)
* add think cost

* update csv scanner

* add final rerank

* add reasoner

* add iterative planner

* fix dpr search

* fix dpr search

* add reference data

* move odps import

* update requirement.txt

* update 2wiki

* add missing file

* fix markdown reader

* add iterative planning

* update version

* update runner

* update 2wiki example

* update bridge

* merge solver and solver_new

* add cur day

* writer delete

* update multi process

* add missing files

* fix report

* add chunk retrieved executor

* update try in stream runner result

* add path

* add math executor

* update hotpotqa example

* remove log

* fix python coder solver

* update hotpotqa example

* fix python coder solver

* update config

* fix bad

* add log

* remove unused code

* commit with task thought

* move kag model to common

* add default chat llm

* fix

* use static planner

* support chunk graph node

* add args

* support naive rag

* llm client support tool calls

* add default async

* add openai

* fix result

* fix markdown reader

* fix thinker

* update asyncio interface

* feat(solver): add mcp support (#444)

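* upload mcp client related code

* 1. complete an end-to-end mcp client call flow, from pipeline to planner and executor
2. allow multiple mcp_server entries in the json, with the LLM choosing which one to call
3. get baidu_map_mcp working

* 1. schema

* bugfix: remove redundant code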

---------

Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>

* fix affairqa after solver refactor

* fix affairqa after solver refactor

* fix readme

* add params

* update version

* update mcp executor

* update mcp executor

* solver add mcp executor

* add missing file

* add mcp executor

* add executor

* x

* update

* fix requirement

* fix main llm config

* fix solver

* bugfix: fix invoke function call logic

* chg eva

* update example

* add kag layer

* add step task

* support dot refresh

* support dot refresh

* support dot refresh

* support dot refresh

* add retrieved num

* add retrieved num

* add pipelineconf

* update ppr

* update musique prompts

* update

* add to_dict for BuilderComponentData

* async build

* add deduce prompt

* add deduce prompt

* add deduce prompt

* fix reader

* add deduce prompt

* add page thinker report

* modify prompt

* add step status

* add self cognition

* add self cognition

* add memory graph storage

* add now time

* update memory config

* add now time

* chg graph loader

* add prqa dataset and code

* bugfix: fix prqa call logic

* optimize: improve code logic and normalize generated answers

* add retry py code

* update memory graph

* update memory graph

* fix

* fix ner

* add with_out_refer generator prompt

* fix

* close ckpt

* fix query

* fix query

* update version

* add llm checker

* add llm checker

* 1. upload evalutor.py and modify the gold_answer.json format
2. optimize code logic
3. modify the README.md file

* update exp

* update exp

* rerank support

* add static rewrite query

* recall more chunks

* fix graph load

* add static rewrite query

* fix bugs

* add finish check

* add finish check

* add finish check

* add finish check

* 1. upload the results of evalutor.py
2. optimize code logic and the readme file

* add lf retry

* add memory graph api

* fix reader api

* add ner

* add metrics

* fix bug

* remove ner

* add reraise for retry

* add edge prop to memory graph

* add memory graph

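* 1. correct the evaluation dataset results
2. optimize the evaluator.py code
3. remove questions that have no result but do have an answer in gold_answer

* delete evaluation result files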

* fix knext host addr

* async eva

* add lf prompt

* add lf prompt

* add config

* add retry

* add unknown check

* add rc result

* add rc result

* add rc result

* add rc result

* modify code logic to follow the kag pipeline format and pass tests

* bugfix: remove redundant code

* fix report prompt

* bugfix: trigger the retry mechanism

* bugfix: Chinese punctuation error

* fix rethinker prompt

* update version to 0.6.2b78

* update version

* 1. modify evaluator.py to compute accuracy via the LLM, matching the latest call logic
2. modify the prompt so that unanswered results are retested

* update affairqa for evaluate

* update affairqa for evaluate

* bugfix: correct the dataset

* bugfix: correct the dataset

* bugfix: correct the dataset

* fix name conflict

* bugfix: remove erroneous questions

* bugfix: incorrect file name caused evaluator failure

* update for affairqa eval

* bugfix: modify code to keep the evaluate logic consistent

* x

* update for affairqa readme

* remove temp eval scripts

* bugfix for math deduce

* merge 0.6.2_dev

* merge 0.6.2_dev

* fix

* update client addr

* updated version

* update for affairqa eval

* evaUtils: support Chinese

* fix affairqa eval:

* remove unused example

* update kag config

* fix default value

* update readme

* fix init

* update comments and add descriptions for some classes

* update example config

* Tc 0.7.0 (#459)

* commit affairQA code

* fix affairqa eval

---------

Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>

* fix all examples

* reformat

---------

Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
v0.7
2025-04-17 17:23:52 +08:00
bingchu
c6b107ce56
fix(knext): set token in request of write_graph (#409)
* fix(knext)set token in request of write_graph

* refine code
2025-03-12 10:13:03 +08:00
Xinhong Zhang
31b895a3fa
fix(knext): project update addr (#408)
* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* use config default
2025-03-11 17:29:15 +08:00
Andy
bf46bbbd4d
core_team #andy (#389) 2025-03-03 19:13:44 +08:00
royzhao
5d12979694
fix(solver): bugfix SPO Retrieval LLM response parse (#378)
* bugfix official_name node has same prop object

* reformat by black

* adapter spo retrieval llm response
2025-02-27 11:28:01 +08:00
royzhao
6a16df3565
fix(builder): bugfix official_name node has same prop object (#372)
* bugfix official_name node has same prop object

* reformat by black
2025-02-25 18:16:08 +08:00
Xinhong Zhang
8d51e66d6a
fix(builder): fix pdf reader for normalizing text in outline (#344)
* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* fix pdf reader

* fix pdf reader

* fix pdf reader
2025-02-17 14:02:39 +08:00
xueguanwen
daa536fb3f
Update baike kag_config.yaml (#339)
The reference to the class corresponding to the default_chunk_retriever is incorrect. fix it
2025-02-11 11:01:47 +08:00
hy89
b02553243d
fix the error when the stream parameter is True (#336) 2025-02-11 11:01:26 +08:00
luzizhuo
4a40479e6b
Add Discord link and wechat qr code. (#338)
* Add qr code

* Update README.md

Add discord and how to join the wechat group.
2025-02-08 21:15:56 +08:00
royzhao
bd0c3ec92e
fix empty data generate (#319) 2025-01-22 14:16:45 +08:00
zhuzhongshu123
3348dfeaa4
use json repair for llm client (#312) 2025-01-21 11:20:26 +08:00
royzhao
cdf0ea3933
add retry (#306) 2025-01-20 14:14:10 +08:00
zhuzhongshu123
1e57016373
disable entity linking in postprocess by default (#304) 2025-01-20 11:19:44 +08:00
zhuzhongshu123
4ad5bded26
delete checkpoint of postprocess (#302) 2025-01-18 12:05:31 +08:00
zhuzhongshu123
7666ca40dd
feat(kag): catch unexpected exceptions (#298)
* x (#280)

* feat(bridge): spg server bridge (#283)

* x

* bridge add solver

* x

* feat(bridge): Spg server bridge check (#285)

* x

* bridge add solver

* x

* add invoke

* feat(common): llm client catch exception (#294)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* feat(solver): catch chunk retriever exception (#297)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* feat(common):llm except (#299)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* print llm invoke error info

* with except

* feat(common): force raise except (#300)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* print llm invoke error info

* with except

* force raise except
2025-01-17 17:11:51 +08:00
zhuzhongshu123
deae277510
feat(bridge): spg server bridge supports config check and run solver (#287)
* x

* x (#280)

* bridge add solver

* x

* feat(bridge): spg server bridge (#283)

* x

* bridge add solver

* x

* add invoke

* llm client catch error
2025-01-17 13:52:00 +08:00
zhuzhongshu123
ca31351971
support custom kag config file (#279) 2025-01-15 18:10:55 +08:00
zhuzhongshu123
a40980a294
fix(examples): fix qa file name (#251) 2025-01-14 20:18:38 +08:00
Xinhong Zhang
248b22520f
fix(builder): fix markdown reader for id (#273)
* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* first fix
2025-01-14 14:36:41 +08:00
joseosvaldo16
6494fd20c0
feat(builder): add Azure Open AI Compatibility (#269)
* feat(llm): add Azure OpenAI client and vectorization support

* chore: add .DS_Store to .gitignore

* refactor(llm):add description for api_version and default value

* refactor(vectorize_model): added description for api_version and default values for some params

* refactor(openai_model): enhance docstring for Azure AD token and deployment parameters
2025-01-14 12:57:43 +08:00
zhuzhongshu123
671a9a016c
fix mix reader (#270) 2025-01-14 10:18:38 +08:00
Andy
c2056ef2f6
update(kag) Update README (#264)
* update README #andy

* update README #andy

* update README #andy

* update README #andy

* update README #andy

* update README #andy
2025-01-13 10:57:10 +08:00
Andy
724a026b15
update(kag) Update README (#258)
* update README #andy

* update README #andy

* update README #andy

* update README #andy
2025-01-10 17:38:23 +08:00
zhuzhongshu123
e1fccef44c
chore(examples): domain KG inject example (#249)
* add timeout param for llm and embedding model

* add example

* fix title
2025-01-09 17:14:51 +08:00
xionghuaidong
fb15dcec26
feat(examples): output qfs evaluation results as json and markdown (#240)
* fix vectorize_model configuration key typo

* fix permissions of data files

* fix examples README.md inconsistency

* output qfs evaluation results as json and markdown

* format summarization_metrics.py with black
2025-01-08 16:52:31 +08:00