# -*- coding: utf-8 -*-
# Copyright 2023 OpenSPG Authors
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
# in compliance with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software distributed under the License
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
# or implied.
import logging
import asyncio
from openai import OpenAI, AsyncOpenAI, AzureOpenAI, AsyncAzureOpenAI, NOT_GIVEN
|
2025-04-17 17:23:52 +08:00
|
|
|
|
refactor(all): kag v0.6 (#174)
* add path find
* fix find path
* spg guided relation extraction
* fix dict parse with same key
* rename graphalgoclient to graphclient
* rename graphalgoclient to graphclient
* file reader supports http url
* add checkpointer class
* parser supports checkpoint
* add build
* remove incorrect logs
* remove logs
* update examples
* update chain checkpointer
* vectorizer batch size set to 32
* add a zodb backended checkpointer
* add a zodb backended checkpointer
* fix zodb based checkpointer
* add thread for zodb IO
* fix(common): resolve mutlithread conflict in zodb IO
* fix(common): load existing zodb checkpoints
* update examples
* update examples
* fix zodb writer
* add docstring
* fix jieba version mismatch
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* 1、fix bug in base_table_splitter
* 1、fix bug in base_table_splitter
* 1、fix bug in default_chain
* 增加solver
* add kag
* update outline splitter
* add main test
* add op
* code refactor
* add tools
* fix outline splitter
* fix outline prompt
* graph api pass
* commit with page rank
* add search api and graph api
* add markdown report
* fix vectorizer num batch compute
* add retry for vectorize model call
* update markdown reader
* update markdown reader
* update pdf reader
* raise extractor failure
* add default expr
* add log
* merge jc reader features
* rm import
* add build
* fix zodb based checkpointer
* add thread for zodb IO
* fix(common): resolve mutlithread conflict in zodb IO
* fix(common): load existing zodb checkpoints
* update examples
* update examples
* fix zodb writer
* add docstring
* fix jieba version mismatch
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* 1、fix bug in base_table_splitter
* 1、fix bug in base_table_splitter
* 1、fix bug in default_chain
* update outline splitter
* add main test
* add markdown report
* code refactor
* fix outline splitter
* fix outline prompt
* update markdown reader
* fix vectorizer num batch compute
* add retry for vectorize model call
* update markdown reader
* raise extractor failure
* rm parser
* run pipeline
* add config option of whether to perform llm config check, default to false
* fix
* recover pdf reader
* several components can be null for default chain
* support full QA run
* add if
* remove unused code
* use chunk as fallback
* excluded source relation to choose
* add generate
* default recall 10
* add local memory
* exclude similar edges
* add safeguards
* fix concurrency issue
* add debug logger
* support parameterized topk
* support chunk truncation and adjust the spo select prompt
* add query request safeguards
* add force_chunk config
* fix entity linker algorithm
* add sub query rewriting
* fix md reader dup in test
* fix
* merge knext to kag parallel
* fix package
* fix metric drop issue
* scanner update
* scanner update
* add doc and update example scripts
* fix
* add bridge to spg server
* add format
* fix bridge
* update conf for baike
* disable ckpt for spg server runner
* llm invoke error default raise exceptions
* chore(version): bump version to X.Y.Z
* update default response generation prompt
* add method getSummarizationMetrics
* fix(common): fix project conf empty error
* fix typo
* add reporting information
* modify main solver
* postprocessor support spg server
* modify solver supported names
* fix language
* modify chunker interface, add openapi
* rename vectorizer to vectorize_model in spg server config
* generate_random_string start with gen
* add knext llm vector checker
* add knext llm vector checker
* add knext llm vector checker
* remove default values from solver
* update yaml and register_name for baike
* update yaml and register_name for baike
* remove config key check
* fix llmmodule
* fix knext project
* update yaml and register_name for examples
* update yaml and register_name for examples
* Revert "udpate yaml and register_name for examples"
This reverts commit b3fa5ca9ba749e501133ac67bd8746027ab839d9.
* update register name
* fix
* fix
* support multiple register names
* update component
* update reader register names (#183)
* fix markdown reader
* fix llm client for retry
* feat(common): add processed chunk id checkpoint (#185)
* update reader register names
* add processed chunk id checkpoint
* feat(example): add example config (#186)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* add max_workers parameter for getSummarizationMetrics to make it faster
* add csqa data generation script generate_data.py
* commit generated csqa builder and solver data
* add csqa basic project files
* adjust split_length and num_threads_per_chain to match lightrag settings
* ignore ckpt dirs
* add csqa evaluation script eval.py
* save evaluation scripts summarization_metrics.py and factual_correctness.py
* save LightRAG output csqa_lightrag_answers.json
* ignore KAG output csqa_kag_answers.json
* add README.md for CSQA
* fix(solver): fix solver pipeline conf (#191)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* update links and file paths
* reformat csqa kag_config.yaml
* reformat csqa python files
* reformat getSummarizationMetrics and compare_summarization_answers
* fix(solver): fix solver config (#192)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* add except
* fix typo in csqa README.md
* feat(conf): support reinitialize config for call from java side (#199)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* support reinitialize config for java call
* revert default response generation prompt
* update project list
* add README.md for the hotpotqa, 2wiki and musique examples
* add spo retrieval
* turn off kag config dump by default
* turn off knext schema dump by default
* add .gitignore and fix kag_config.yaml
* add README.md for the medicine example
* add README.md for the supplychain example
* bugfix for risk mining
* use exact out
* refactor(solver): format solver code (#205)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* support reinitialize config for java call
* black format
---------
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
2025-01-03 17:10:51 +08:00
from kag.interface import LLMClient
2025-04-17 17:23:52 +08:00
from typing import Callable, Optional
from kag.interface.solver.reporter_abc import ReporterABC
2024-10-24 11:46:15 +08:00
|
|
|
|
2025-01-03 17:10:51 +08:00
logging.getLogger("openai").setLevel(logging.ERROR)
logging.getLogger("httpx").setLevel(logging.ERROR)
2024-10-24 11:46:15 +08:00
logger = logging.getLogger(__name__)
2025-01-13 22:57:43 -06:00
AzureADTokenProvider = Callable[[], str]
2024-10-24 11:46:15 +08:00
|
|
|
|
2025-01-21 11:20:26 +08:00
|
|
|
|
2025-01-03 17:10:51 +08:00
@LLMClient.register("maas")
@LLMClient.register("openai")
2025-04-17 17:23:52 +08:00
@LLMClient.register("vllm")
2024-10-24 11:46:15 +08:00
class OpenAIClient(LLMClient):
    """
    A client class for interacting with the OpenAI API.

    Initializes the client with an API key, base URL, streaming option, temperature parameter, and default model.
    """
2025-01-03 17:10:51 +08:00
|
|
|
|
2024-10-24 11:46:15 +08:00
|
|
|
def __init__(
|
refactor(all): kag v0.6 (#174)
2025-01-03 17:10:51 +08:00
|
|
|
self,
|
|
|
|
base_url: str,
|
|
|
|
model: str,
|
2025-04-17 17:23:52 +08:00
|
|
|
api_key: str = "dummy",
|
refactor(all): kag v0.6 (#174)
2025-01-03 17:10:51 +08:00
|
|
|
stream: bool = False,
|
|
|
|
temperature: float = 0.7,
|
2025-01-08 15:58:38 +08:00
|
|
|
timeout: float = None,
|
2025-04-17 17:23:52 +08:00
|
|
|
max_rate: float = 1000,
|
|
|
|
time_period: float = 1,
|
fix(common): ollama and openai client for qwen3 20250508 (#531)
* Initial commit
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Create CITATION.cff
* Update README.md
* Remove sensitive information
* refine vectorizer code and config
* fix vectorizer package name
* fix old import and requirement version bug
* delete kagdemo
* remove batch_vectorizer.py under common
* fix struct chain bug
* fix path bug
* fix path bug
* fix comment
* add README_cn.md
* fix spaces
* handle empty node_path
* handle empty node_batch
* add english README.md
* fix url
* fix typo
* fix bug in solver
* use kwargs
* use kwargs to init config
* add llm input
* remove __main__
* fix bug in solver
* add HumanBodyPart.csv data
* fix bug in builder
* update requirement
* fix example llm config
* add example cfg
* fix readme
* update
* update requirement
* Update README.md
* import FlagEmbedding at top-level to avoid sklearn init failure
* fix llm client
* add __init__
* import sklearn before FlagEmbedding to avoid sklearn init failure
* implement vector_dimensions in base Vectorizer
* add VectorizerConfigChecker
* add llm config checker
* add llm init file
* fix
* fix
* add init for solver
* fix
* fix(kag): fix llm call warning info (#11)
* output llm call warning info
* add generator module
* [fix](builder): prompt config for builder (#14)
* fix zh prompt config bug
* add __init__ for extractabc
* [fix](builder): batchvectorizer kwargs (#16)
* fix zh prompt config bug
* add __init__ for extractabc
* fix batchvectorizer kwargs
* using ollama client
* fix buidler init
* fix buidler init (#18)
* fix buidler init
* fix llm config cheker main
* fix llm test
* (fix)[solver]: language (#19)
* fix builder init
* fix language
* fix req
* (fix)[common]: llm checker (#20)
* fix buidler init
* fix llm checker
* filter edges with empty relation (#26)
* (fix)[common]: llm client (#22)
* fix buidler init
* add sub llm client in __init__
* fix cmd kagbase
* fix spo kag_llm
* (fix)[solver]: default prompt for examples (#29)
* fix buidler init
* fix default prompt for examples
* fix kagbasemodule for prompt config
* fix kagbasemodule for prompt config
* [feat]: Add test dataset for hotpotqa and musique (#30)
* add dataset
* move dir
* fix bug in spg_extractor and base_table_splitter (#44)
* Update README_cn.md (#55)
Update the description of KAG core features in the readme file
* Update README.md (#54)
* docs: add Japanese README file (#49)
I created Japanese translated README.
* doc: note on staring kag in README.md (#59)
* add default prop name (#62)
* Update README for en,cn (#63)
* update readme for cn,en
* update readme for cn,en
* Readme optimize (#65)
* update readme for cn,en
* update readme for cn,en
* update readme for cn,en
* minor format tweak
---------
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
* Update README_cn.md (#74)
* generate embedding in batch (#76)
* add pro commit
* fix(example): fix example open qa benchmark data id generator (#97)
* fix builder chunk index
* split length
* add force chunk (#98)
* Update logic_form_plan.py (#95)
* fix(builder): add spgtype not null check (#106)
* add spg_type not null check
* add spg_type not null check
* [feat] add template for issues (#108)
* feat(kag) update template for issues (#109)
* [feat] add template for issues
* update template for issues
* change type of variable self.schema from dict.keys to list[str] (#99)
* feat(kag) update template for issues (#114)
* [feat] add template for issues
* update template for issues
* update template for issues
* rename graphalgoclient to graphclient
* feat(ReadMe) Update README.md (#119)
* feat(kag)Update README.md (#120)
* feat(kag) update docs for github.io (#122)
* [feat] add template for issues
* update template for issues
* update template for issues
* update template for issues
* feat(kag) update docs for kag (#123)
* [feat] add template for issues
* update template for issues
* update template for issues
* update template for issues
* update template for issues
* fix(solver): 修改了slover中sum和verify功能求解器的传入参数 (#139)
* 修改了slover中sum和verify功能求解器的传入参数
* 去除多余空格
* add testset for kag-demo (#144)
* feat(kag) rename testset file (#147)
* add testset for kag-demo
* rename testset
* Update README.md (#181)
* refactor(all): kag v0.6 (#174)
* add path find
* fix find path
* spg guided relation extraction
* fix dict parse with same key
* rename graphalgoclient to graphclient
* rename graphalgoclient to graphclient
* file reader supports http url
* add checkpointer class
* parser supports checkpoint
* add build
* remove incorrect logs
* remove logs
* update examples
* update chain checkpointer
* vectorizer batch size set to 32
* add a zodb backended checkpointer
* add a zodb backended checkpointer
* fix zodb based checkpointer
* add thread for zodb IO
* fix(common): resolve mutlithread conflict in zodb IO
* fix(common): load existing zodb checkpoints
* update examples
* update examples
* fix zodb writer
* add docstring
* fix jieba version mismatch
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* 1、fix bug in base_table_splitter
* 1、fix bug in base_table_splitter
* 1、fix bug in default_chain
* 增加solver
* add kag
* update outline splitter
* add main test
* add op
* code refactor
* add tools
* fix outline splitter
* fix outline prompt
* graph api pass
* commit with page rank
* add search api and graph api
* add markdown report
* fix vectorizer num batch compute
* add retry for vectorize model call
* update markdown reader
* update markdown reader
* update pdf reader
* raise extractor failure
* add default expr
* add log
* merge jc reader features
* rm import
* add build
* fix zodb based checkpointer
* add thread for zodb IO
* fix(common): resolve mutlithread conflict in zodb IO
* fix(common): load existing zodb checkpoints
* update examples
* update examples
* fix zodb writer
* add docstring
* fix jieba version mismatch
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* 1、fix bug in base_table_splitter
* 1、fix bug in base_table_splitter
* 1、fix bug in default_chain
* update outline splitter
* add main test
* add markdown report
* code refactor
* fix outline splitter
* fix outline prompt
* update markdown reader
* fix vectorizer num batch compute
* add retry for vectorize model call
* update markdown reader
* raise extractor failure
* rm parser
* run pipeline
* add config option of whether to perform llm config check, default to false
* fix
* recover pdf reader
* several components can be null for default chain
* 支持完整qa运行
* add if
* remove unused code
* 使用chunk兜底
* excluded source relation to choose
* add generate
* default recall 10
* add local memory
* 排除相似边
* 增加保护
* 修复并发问题
* add debug logger
* 支持topk参数化
* 支持chunk截断和调整spo select 的prompt
* 增加查询请求保护
* 增加force_chunk配置
* fix entity linker algorithm
* 增加sub query改写
* fix md reader dup in test
* fix
* merge knext to kag parallel
* fix package
* 修复指标下跌问题
* scanner update
* scanner update
* add doc and update example scripts
* fix
* add bridge to spg server
* add format
* fix bridge
* update conf for baike
* disable ckpt for spg server runner
* llm invoke error default raise exceptions
* chore(version): bump version to X.Y.Z
* update default response generation prompt
* add method getSummarizationMetrics
* fix(common): fix project conf empty error
* fix typo
* 增加上报信息
* 修改main solver
* postprocessor support spg server
* 修改solver支持名
* fix language
* 修改chunker接口,增加openapi
* rename vectorizer to vectorize_model in spg server config
* generate_random_string start with gen
* add knext llm vector checker
* add knext llm vector checker
* add knext llm vector checker
* solver移除默认值
* udpate yaml and register_name for baike
* udpate yaml and register_name for baike
* remove config key check
* 修复llmmodule
* fix knext project
* udpate yaml and register_name for examples
* udpate yaml and register_name for examples
* Revert "udpate yaml and register_name for examples"
This reverts commit 9705951d066b282ac49f0e1972559b646e7f906d.
* update register name
* fix
* fix
* support multiple resigter names
* update component
* update reader register names (#183)
* fix markdown reader
* fix llm client for retry
* feat(common): add processed chunk id checkpoint (#185)
* update reader register names
* add processed chunk id checkpoint
* feat(example): add example config (#186)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* add max_workers parameter for getSummarizationMetrics to make it faster
* add csqa data generation script generate_data.py
* commit generated csqa builder and solver data
* add csqa basic project files
* adjust split_length and num_threads_per_chain to match lightrag settings
* ignore ckpt dirs
* add csqa evaluation script eval.py
* save evaluation scripts summarization_metrics.py and factual_correctness.py
* save LightRAG output csqa_lightrag_answers.json
* ignore KAG output csqa_kag_answers.json
* add README.md for CSQA
* fix(solver): fix solver pipeline conf (#191)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* update links and file paths
* reformat csqa kag_config.yaml
* reformat csqa python files
* reformat getSummarizationMetrics and compare_summarization_answers
* fix(solver): fix solver config (#192)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* add except
* fix typo in csqa README.md
* feat(conf): support reinitialize config for call from java side (#199)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* support reinitialize config for java call
* revert default response generation prompt
* update project list
* add README.md for the hotpotqa, 2wiki and musique examples
* add spo retrieval
* turn off kag config dump by default
* turn off knext schema dump by default
* add .gitignore and fix kag_config.yaml
* add README.md for the medicine example
* add README.md for the supplychain example
* bugfix for risk mining
* use exact out
* refactor(solver): format solver code (#205)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* support reinitialize config for java call
* black format
---------
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
* docs(examples): finish README.md for builtin kag examples (#207)
* add data introduction for example supplychain
* add schema modeling for example supplychain
* add kg construction for example supplychain
* add kg query for example supplychain
* finish README_cn.md for the supplychain example
* finish README.md for the supplychain example
* add README.md for the riskmining example
* add README.md for the medicine example
* update README.md of hotpotqa, 2wiki and musique
* add README_cn.md for hotpotqa, 2wiki and musique
* update README.md of csqa
* add README_cn.md for csqa
* update link targets to 0.6 version of the docs
* add README.md for baike
* reformat Python code in examples
* fix create project (#208)
* docs(examples): finish README.md for the examples directory (#210)
* add data introduction for example supplychain
* add schema modeling for example supplychain
* add kg construction for example supplychain
* add kg query for example supplychain
* finish README_cn.md for the supplychain example
* finish README.md for the supplychain example
* add README.md for the riskmining example
* add README.md for the medicine example
* update README.md of hotpotqa, 2wiki and musique
* add README_cn.md for hotpotqa, 2wiki and musique
* update README.md of csqa
* add README_cn.md for csqa
* update link targets to 0.6 version of the docs
* add README.md for baike
* reformat Python code in examples
* add README_cn.md for examples
* finish README.md for examples
* fix typo in README.md for examples
* move images for kag examples to _static/images (#214)
* move more images for kag examples to _static/images (#216)
* update default yaml and corpus (#217)
* fix(knext): fix knext project env (#211)
* fix create project
* fix create project
* fix create project
* fix create project
* fix examples README.md to match quick start doc (#218)
* fix(example): fix vectorize model config in example (#220)
* fix vectorize model config
* remove ak
* remove ak
* x
* change log level to debug (#221)
* fix knext env (#223)
* fix 'ModuleNotFoundError: No module named xxx' error in Windows (#222)
* reduce warn (#225)
* update(kag) update log level (#226)
* update default yaml and corpus
* update log level to debug
* fix(KAG): change level log (#227)
* change log level to debug
* fix(example): fix vectorize model config in example (#220)
* fix vectorize model config
* remove ak
* remove ak
* x
* fix knext env (#223)
* fix 'ModuleNotFoundError: No module named xxx' error in Windows (#222)
* reduce warn (#225)
* change log level to debug
---------
Co-authored-by: zhuzhongshu123 <152354526+zhuzhongshu123@users.noreply.github.com>
Co-authored-by: Xinhong Zhang <zhangxinhong.zxh@antgroup.com>
* fix knext client (#229)
* fix(knext): fix knext client (#230)
* fix knext client
* x
* fix ollama register name (#234)
* add timeout param for llm and embedding model (#236)
* Update README_cn.md (#238)
* Update README.md (#237)
* feat(examples): output qfs evaluation results as json and markdown (#240)
* fix vectorize_model configuration key typo
* fix permissions of data files
* fix examples README.md inconsistency
* output qfs evaluation results as json and markdown
* format summarization_metrics.py with black
* chore(examples): domain KG inject example (#249)
* add timeout param for llm and embedding model
* add example
* fix title
* update(kag) Update README (#258)
* update README #andy
* update README #andy
* update README #andy
* update README #andy
* update(kag) Update README (#264)
* update README #andy
* update README #andy
* update README #andy
* update README #andy
* update README #andy
* update README #andy
* fix mix reader (#270)
* feat(builder): add Azure Open AI Compatibility (#269)
* feat(llm): add Azure OpenAI client and vectorization support
* chore: add .DS_Store to .gitignore
* refactor(llm):add description for api_version and default value
* refactor(vectorize_model): added description for ap_version and default values for some params
* refactor(openai_model): enhance docstring for Azure AD token and deployment parameters
* fix(builder): fix markdown reader for id (#273)
* fix builder init
* add pro commit
* rename graphalgoclient to graphclient
* first fix
* fix(examples): fix qa file name (#251)
* support custom kag config file (#279)
* feat(bridge): spg server bridge supports config check and run solver (#287)
* x
* x (#280)
* bridge add solver
* x
* feat(bridge): spg server bridge (#283)
* x
* bridge add solver
* x
* add invoke
* llm client catch error
* feat(kag): catch unexpected exceptions (#298)
* x (#280)
* feat(bridge): spg server bridge (#283)
* x
* bridge add solver
* x
* feat(bridge): Spg server bridge check (#285)
* x
* bridge add solver
* x
* add invoke
* feat(common): llm client catch exception (#294)
* x
* bridge add solver
* x
* add invoke
* llm client catch error
* feat(solver): catch chunk retriever exception (#297)
* x
* bridge add solver
* x
* add invoke
* llm client catch error
* catch exception
* feat(common):llm except (#299)
* x
* bridge add solver
* x
* add invoke
* llm client catch error
* catch exception
* print llm invoke error info
* with except
* feat(common): force raise except (#300)
* x
* bridge add solver
* x
* add invoke
* llm client catch error
* catch exception
* print llm invoke error info
* with except
* force raise except
* delete checkpoint of postprocess (#302)
* disable entity linking in postprocess by default (#304)
* add retry (#306)
* use json repair for llm client (#312)
* fix empty data generate (#319)
* Add Discord link and wechat qr code. (#338)
* Add qr code
* Update README.md
Add discord and how to join the wechat group.
* fix the error when the stream parameter is True (#336)
* Update baike kag_config.yaml (#339)
The reference to the class corresponding to the default_chunk_retriever is incorrect. fix it
* fix(builder): fix pdf reader for normalizing text in outline (#344)
* fix builder init
* add pro commit
* rename graphalgoclient to graphclient
* fix pdf reader
* fix pdf reader
* fix pdf reader
* fix(builder): bugfix official_name node has same prop object (#372)
* bugfix official_name node has same prop object
* reformat by black
* fix(solver): bugfix SPO Retrieval LLM response parse (#378)
* bugfix official_name node has same prop object
* reformat by black
* adapter spo retrieval llm response
* core_team #andy (#389)
* fix(knext): project update addr (#408)
* fix builder init
* add pro commit
* rename graphalgoclient to graphclient
* use config default
* fix(knext): set token in request of write_graph (#409)
* fix(knext)set token in request of write_graph
* refine code
* feat(kag): update to v0.7 (#456)
* add think cost
* update csv scanner
* add final rerank
* add reasoner
* add iterative planner
* fix dpr search
* fix dpr search
* add reference data
* move odps import
* update requirement.txt
* update 2wiki
* add missing file
* fix markdown reader
* add iterative planning
* update version
* update runner
* update 2wiki example
* update bridge
* merge solver and solver_new
* add cur day
* writer delete
* update multi process
* add missing files
* fix report
* add chunk retrieved executor
* update try in stream runner result
* add path
* add math executor
* update hotpotqa example
* remove log
* fix python coder solver
* update hotpotqa example
* fix python coder solver
* update config
* fix bad
* add log
* remove unused code
* commit with task thought
* move kag model to common
* add default chat llm
* fix
* use static planner
* support chunk graph node
* add args
* support naive rag
* llm client support tool calls
* add default async
* add openai
* fix result
* fix markdown reader
* fix thinker
* update asyncio interface
* feat(solver): add mcp support (#444)
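* upload mcp client related code
* 1、complete an end-to-end mcp client invocation flow, from pipeline to planner and executor
2、allow multiple mcp_server entries to be passed in the json, invoked and selected by the LLM
3、get baidu_map_mcp working end to end
* 1、schema
* bugfix: remove redundant code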
---------
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
* fix affairqa after solver refactor
* fix affairqa after solver refactor
* fix readme
* add params
* update version
* update mcp executor
* update mcp executor
* solver add mcp executor
* add missing file
* add mcp executor
* add executor
* x
* update
* fix requirement
* fix main llm config
* fix solver
* bugfix: fix invoke function call logic
* chg eva
* update example
* add kag layer
* add step task
* support dot refresh
* support dot refresh
* support dot refresh
* support dot refresh
* add retrieved num
* add retrieved num
* add pipelineconf
* update ppr
* update musique prompts
* update
* add to_dict for BuilderComponentData
* async build
* add deduce prompt
* add deduce prompt
* add deduce prompt
* fix reader
* add deduce prompt
* add page thinker report
* modify prompt
* add step status
* add self cognition
* add self cognition
* add memory graph storage
* add now time
* update memory config
* add now time
* chg graph loader
* add prqa dataset and code
* bugfix: fix prqa invocation logic
* optimize: improve code logic and normalize generated answers
* add retry py code
* update memory graph
* update memory graph
* fix
* fix ner
* add with_out_refer generator prompt
* fix
* close ckpt
* fix query
* fix query
* update version
* add llm checker
* add llm checker
* 1、upload evalutor.py and modify the gold_answer.json format
2、optimize code logic
3、update README.md
* update exp
* update exp
* rerank support
* add static rewrite query
* recall more chunks
* fix graph load
* add static rewrite query
* fix bugs
* add finish check
* add finish check
* add finish check
* add finish check
* 1、upload the results of evalutor.py
2、optimize code logic and the readme file
* add lf retry
* add memory graph api
* fix reader api
* add ner
* add metrics
* fix bug
* remove ner
* add reraise for retry
* add edge prop to memory graph
* add memory graph
* 1、fix evaluation dataset results
2、optimize evaluator.py code
3、remove questions that have an answer in gold_answer but no result
* delete evaluation result files
* fix knext host addr
* async eva
* add lf prompt
* add lf prompt
* add config
* add retry
* add unknown check
* add rc result
* add rc result
* add rc result
* add rc result
* adapt code logic to the kag pipeline format and pass tests
* bugfix: remove redundant code
* fix report prompt
* bugfix: trigger the retry mechanism
* bugfix: incorrect Chinese punctuation
* fix rethinker prompt
* update version to 0.6.2b78
* update version
* 1、update evaluator.py to compute accuracy with the LLM, matching the latest invocation logic
2、modify the prompt so that unanswered results are re-tested
* update affairqa for evaluate
* update affairqa for evaluate
* bugfix: fix dataset
* bugfix: fix dataset
* bugfix: fix dataset
* fix name conflict
* bugfix: remove erroneous questions
* bugfix: incorrect file name caused evaluator to fail
* update for affairqa eval
* bugfix: update code to keep evaluate logic consistent
* x
* update for affairqa readme
* remove temp eval scripts
* bugfix for math deduce
* merge 0.6.2_dev
* merge 0.6.2_dev
* fix
* update client addr
* updated version
* update for affairqa eval
* evaUtils supports Chinese
* fix affairqa eval:
* remove unused example
* update kag config
* fix default value
* update readme
* fix init
* update comments and add descriptions for some classes
* update example config
* Tc 0.7.0 (#459)
* commit affairQA code
* fix affairqa eval
---------
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
* fix all examples
* reformat
---------
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
* update readme (#460)
* feat(kag): update readme (#462)
* update readme
* update readme
* feat(kag): update version of KAG (#475)
* update readme
* update readme
* update version of KAG
* add async return (#481)
* add run component error log (#482)
* fix(bin): add pip index url (#483)
* add run component error log
* add index url option
* add gpu type
* fix(solver): add component name (#480)
* add ai search example
* bugfix reporter name
* update version
* fix ci
* support disabling vector generation (#486)
* fix
* fix
---------
Co-authored-by: Andy <andy.yj@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: leywar <leywar.liang@antgroup.com>
Co-authored-by: zhuzhongshu123 <152354526+zhuzhongshu123@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Co-authored-by: tpoisonooo <khj.application@aliyun.com>
Co-authored-by: thundax <84020791+thundax-lyp@users.noreply.github.com>
Co-authored-by: Chasing <94726836+zzzcccxx@users.noreply.github.com>
Co-authored-by: yangman <1515243746@qq.com>
Co-authored-by: joseosvaldo16 <joseosvaldo16@yahoo.com.mx>
Co-authored-by: luzizhuo <496521310@qq.com>
Co-authored-by: hy89 <31279043+hy89@users.noreply.github.com>
Co-authored-by: xueguanwen <xgw1989@sina.com>
Co-authored-by: bingchu <152955942+J4ckycjl@users.noreply.github.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
2025-05-12 13:47:53 +08:00
think: bool = False,
2025-04-17 17:23:52 +08:00
**kwargs,
2024-10-24 11:46:15 +08:00
):
refactor(all): kag v0.6 (#174)
* add path find
* fix find path
* spg guided relation extraction
* fix dict parse with same key
* rename graphalgoclient to graphclient
* rename graphalgoclient to graphclient
* file reader supports http url
* add checkpointer class
* parser supports checkpoint
* add build
* remove incorrect logs
* remove logs
* update examples
* update chain checkpointer
* vectorizer batch size set to 32
* add a zodb backended checkpointer
* add a zodb backended checkpointer
* fix zodb based checkpointer
* add thread for zodb IO
* fix(common): resolve multithread conflict in zodb IO
* fix(common): load existing zodb checkpoints
* update examples
* update examples
* fix zodb writer
* add docstring
* fix jieba version mismatch
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a unique & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a unique & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* 1、fix bug in base_table_splitter
* 1、fix bug in base_table_splitter
* 1、fix bug in default_chain
* add solver
* add kag
* update outline splitter
* add main test
* add op
* code refactor
* add tools
* fix outline splitter
* fix outline prompt
* graph api pass
* commit with page rank
* add search api and graph api
* add markdown report
* fix vectorizer num batch compute
* add retry for vectorize model call
* update markdown reader
* update markdown reader
* update pdf reader
* raise extractor failure
* add default expr
* add log
* merge jc reader features
* rm import
* add build
* fix zodb based checkpointer
* add thread for zodb IO
* fix(common): resolve multithread conflict in zodb IO
* fix(common): load existing zodb checkpoints
* update examples
* update examples
* fix zodb writer
* add docstring
* fix jieba version mismatch
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* 1、fix bug in base_table_splitter
* 1、fix bug in base_table_splitter
* 1、fix bug in default_chain
* update outline splitter
* add main test
* add markdown report
* code refactor
* fix outline splitter
* fix outline prompt
* update markdown reader
* fix vectorizer num batch compute
* add retry for vectorize model call
* update markdown reader
* raise extractor failure
* rm parser
* run pipeline
* add config option of whether to perform llm config check, default to false
* fix
* recover pdf reader
* several components can be null for default chain
* support full qa run
* add if
* remove unused code
* use chunk as fallback
* excluded source relation to choose
* add generate
* default recall 10
* add local memory
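* exclude similar edges
* add safeguards
* fix concurrency issue
* add debug logger
* support parameterized topk
* support chunk truncation and adjust the spo select prompt
* add query request protection
* add force_chunk config
* fix entity linker algorithm
* add sub query rewriting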
* fix md reader dup in test
* fix
* merge knext to kag parallel
* fix package
* fix metric regression
* scanner update
* scanner update
* add doc and update example scripts
* fix
* add bridge to spg server
* add format
* fix bridge
* update conf for baike
* disable ckpt for spg server runner
* llm invoke error default raise exceptions
* chore(version): bump version to X.Y.Z
* update default response generation prompt
* add method getSummarizationMetrics
* fix(common): fix project conf empty error
* fix typo
* add reporting info
* modify main solver
* postprocessor support spg server
* modify solver supported names
* fix language
* modify chunker interface and add openapi
* rename vectorizer to vectorize_model in spg server config
* generate_random_string start with gen
* add knext llm vector checker
* add knext llm vector checker
* add knext llm vector checker
* remove default values from solver
* update yaml and register_name for baike
* update yaml and register_name for baike
* remove config key check
* fix llmmodule
* fix knext project
* update yaml and register_name for examples
* update yaml and register_name for examples
* Revert "update yaml and register_name for examples"
This reverts commit b3fa5ca9ba749e501133ac67bd8746027ab839d9.
* update register name
* fix
* fix
* support multiple register names
* update component
* update reader register names (#183)
* fix markdown reader
* fix llm client for retry
* feat(common): add processed chunk id checkpoint (#185)
* update reader register names
* add processed chunk id checkpoint
* feat(example): add example config (#186)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* add max_workers parameter for getSummarizationMetrics to make it faster
* add csqa data generation script generate_data.py
* commit generated csqa builder and solver data
* add csqa basic project files
* adjust split_length and num_threads_per_chain to match lightrag settings
* ignore ckpt dirs
* add csqa evaluation script eval.py
* save evaluation scripts summarization_metrics.py and factual_correctness.py
* save LightRAG output csqa_lightrag_answers.json
* ignore KAG output csqa_kag_answers.json
* add README.md for CSQA
* fix(solver): fix solver pipeline conf (#191)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* update links and file paths
* reformat csqa kag_config.yaml
* reformat csqa python files
* reformat getSummarizationMetrics and compare_summarization_answers
* fix(solver): fix solver config (#192)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* add except
* fix typo in csqa README.md
* feat(conf): support reinitialize config for call from java side (#199)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* support reinitialize config for java call
* revert default response generation prompt
* update project list
* add README.md for the hotpotqa, 2wiki and musique examples
* add spo retrieval
* turn off kag config dump by default
* turn off knext schema dump by default
* add .gitignore and fix kag_config.yaml
* add README.md for the medicine example
* add README.md for the supplychain example
* bugfix for risk mining
* use exact out
* refactor(solver): format solver code (#205)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* support reinitialize config for java call
* black format
---------
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
2025-01-03 17:10:51 +08:00
"""
Initializes the OpenAIClient instance.

Args:
    api_key (str): The API key for accessing the OpenAI API.
    base_url (str): The base URL for the OpenAI API.
    model (str): The default model to use for requests.
    stream (bool, optional): Whether to stream the response. Defaults to False.
    temperature (float, optional): The temperature parameter for the model. Defaults to 0.7.
2025-01-08 15:58:38 +08:00
    timeout (float): The timeout duration for the service request. Defaults to None, which means no timeout.
2025-01-03 17:10:51 +08:00
|
|
|
"""
name = kwargs.pop("name", None)
if not name:
    # No explicit name given: derive one from the connection parameters.
    name = f"{api_key}{base_url}{model}"
super().__init__(name, max_rate, time_period, **kwargs)
self.api_key = api_key
self.base_url = base_url
self.model = model
self.stream = stream
self.temperature = temperature
self.timeout = timeout
* add rc result
* add rc result
* add rc result
* 依据kag pipeline格式修改代码逻辑并通过测试
* bugfix:删除冗余代码
* fix report prompt
* bugfix:触发重试机制
* bugfix:中文符号错误
* fix rethinker prompt
* update version to 0.6.2b78
* update version
* 1、修改evaluator.py,通过大模型计算准确率,符合最新调用逻辑
2、修改prompt,让没有回答的结果重复测试
* update affairqa for evaluate
* update affairqa for evaluate
* bugfix:修正数据集
* bugfix:修正数据集
* bugfix:修正数据集
* fix name conflict
* bugfix:删除错误问题
* bugfix:文件名命名错误导致evaluator失败
* update for affairqa eval
* bugfix:修改代码保持evaluate逻辑一致
* x
* update for affairqa readme
* remove temp eval scripts
* bugfix for math deduce
* merge 0.6.2_dev
* merge 0.6.2_dev
* fix
* update client addr
* updated version
* update for affairqa eval
* evaUtils 支持中文
* fix affairqa eval:
* remove unused example
* update kag config
* fix default value
* update readme
* fix init
* 注释信息修改,并添加部分class说明
* update example config
* Tc 0.7.0 (#459)
* 提交affairQA 代码
* fix affairqa eval
---------
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
* fix all examples
* reformat
---------
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
* 更新readme (#460)
* feat(kag): update readme (#462)
* 更新readme
* 更新readme
* feat(kag): update version of KAG (#475)
* 更新readme
* 更新readme
* update version of KAG
* add async return (#481)
* add run component error log (#482)
* fix(bin): add pip index url (#483)
* add run component error log
* add index url option
* add gpu type
* fix(solver): add component name (#480)
* add ai search example
* bugfix reporter name
* update version
* fix ci
* support disabling vector generation (#486)
* fix
* fix
---------
Co-authored-by: Andy <andy.yj@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: leywar <leywar.liang@antgroup.com>
Co-authored-by: zhuzhongshu123 <152354526+zhuzhongshu123@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Co-authored-by: tpoisonooo <khj.application@aliyun.com>
Co-authored-by: thundax <84020791+thundax-lyp@users.noreply.github.com>
Co-authored-by: Chasing <94726836+zzzcccxx@users.noreply.github.com>
Co-authored-by: yangman <1515243746@qq.com>
Co-authored-by: joseosvaldo16 <joseosvaldo16@yahoo.com.mx>
Co-authored-by: luzizhuo <496521310@qq.com>
Co-authored-by: hy89 <31279043+hy89@users.noreply.github.com>
Co-authored-by: xueguanwen <xgw1989@sina.com>
Co-authored-by: bingchu <152955942+J4ckycjl@users.noreply.github.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
2025-05-12 13:47:53 +08:00
        self.think = think
refactor(all): kag v0.6 (#174)
* add path find
* fix find path
* spg guided relation extraction
* fix dict parse with same key
* rename graphalgoclient to graphclient
* rename graphalgoclient to graphclient
* file reader supports http url
* add checkpointer class
* parser supports checkpoint
* add build
* remove incorrect logs
* remove logs
* update examples
* update chain checkpointer
* vectorizer batch size set to 32
* add a zodb backended checkpointer
* add a zodb backended checkpointer
* fix zodb based checkpointer
* add thread for zodb IO
* fix(common): resolve mutlithread conflict in zodb IO
* fix(common): load existing zodb checkpoints
* update examples
* update examples
* fix zodb writer
* add docstring
* fix jieba version mismatch
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* 1、fix bug in base_table_splitter
* 1、fix bug in base_table_splitter
* 1、fix bug in default_chain
* add solver
* add kag
* update outline splitter
* add main test
* add op
* code refactor
* add tools
* fix outline splitter
* fix outline prompt
* graph api pass
* commit with page rank
* add search api and graph api
* add markdown report
* fix vectorizer num batch compute
* add retry for vectorize model call
* update markdown reader
* update markdown reader
* update pdf reader
* raise extractor failure
* add default expr
* add log
* merge jc reader features
* rm import
* add build
* fix zodb based checkpointer
* add thread for zodb IO
* fix(common): resolve mutlithread conflict in zodb IO
* fix(common): load existing zodb checkpoints
* update examples
* update examples
* fix zodb writer
* add docstring
* fix jieba version mismatch
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* commit kag_config-tc.yaml
1、rename type to register_name
2、put a uniqe & specific name to register_name
3、rename reader to scanner
4、rename parser to reader
5、rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6、rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7、pre-define llm & vectorize_model and refer them in the yaml file
Issues to be resolved:
1、examples of event extract & spg extract
2、statistic of indexer, such as nums of nodes & edges extracted, ratio of llm invoke.
3、Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4、conf of solver need to be re-examined.
* 1、fix bug in base_table_splitter
* 1、fix bug in base_table_splitter
* 1、fix bug in default_chain
* update outline splitter
* add main test
* add markdown report
* code refactor
* fix outline splitter
* fix outline prompt
* update markdown reader
* fix vectorizer num batch compute
* add retry for vectorize model call
* update markdown reader
* raise extractor failure
* rm parser
* run pipeline
* add config option of whether to perform llm config check, default to false
* fix
* recover pdf reader
* several components can be null for default chain
* support running the full QA flow
* add if
* remove unused code
* use chunk as fallback
* excluded source relation to choose
* add generate
* default recall 10
* add local memory
* exclude similar edges
* add safeguards
* fix concurrency issue
* add debug logger
* support parameterized topk
* support chunk truncation and adjust the spo select prompt
* add query request protection
* add force_chunk config
* fix entity linker algorithm
* add sub query rewriting
* fix md reader dup in test
* fix
* merge knext to kag parallel
* fix package
* fix metric regression
* scanner update
* scanner update
* add doc and update example scripts
* fix
* add bridge to spg server
* add format
* fix bridge
* update conf for baike
* disable ckpt for spg server runner
* llm invoke error default raise exceptions
* chore(version): bump version to X.Y.Z
* update default response generation prompt
* add method getSummarizationMetrics
* fix(common): fix project conf empty error
* fix typo
* add reporting info
* modify main solver
* postprocessor support spg server
* modify solver supported names
* fix language
* modify chunker interface, add openapi
* rename vectorizer to vectorize_model in spg server config
* generate_random_string start with gen
* add knext llm vector checker
* add knext llm vector checker
* add knext llm vector checker
* remove default values from solver
* udpate yaml and register_name for baike
* udpate yaml and register_name for baike
* remove config key check
* fix llmmodule
* fix knext project
* udpate yaml and register_name for examples
* udpate yaml and register_name for examples
* Revert "udpate yaml and register_name for examples"
This reverts commit b3fa5ca9ba749e501133ac67bd8746027ab839d9.
* update register name
* fix
* fix
* support multiple resigter names
* update component
* update reader register names (#183)
* fix markdown reader
* fix llm client for retry
* feat(common): add processed chunk id checkpoint (#185)
* update reader register names
* add processed chunk id checkpoint
* feat(example): add example config (#186)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* add max_workers parameter for getSummarizationMetrics to make it faster
* add csqa data generation script generate_data.py
* commit generated csqa builder and solver data
* add csqa basic project files
* adjust split_length and num_threads_per_chain to match lightrag settings
* ignore ckpt dirs
* add csqa evaluation script eval.py
* save evaluation scripts summarization_metrics.py and factual_correctness.py
* save LightRAG output csqa_lightrag_answers.json
* ignore KAG output csqa_kag_answers.json
* add README.md for CSQA
* fix(solver): fix solver pipeline conf (#191)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* update links and file paths
* reformat csqa kag_config.yaml
* reformat csqa python files
* reformat getSummarizationMetrics and compare_summarization_answers
* fix(solver): fix solver config (#192)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* add except
* fix typo in csqa README.md
* feat(conf): support reinitialize config for call from java side (#199)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* support reinitialize config for java call
* revert default response generation prompt
* update project list
* add README.md for the hotpotqa, 2wiki and musique examples
* add spo retrieval
* turn off kag config dump by default
* turn off knext schema dump by default
* add .gitignore and fix kag_config.yaml
* add README.md for the medicine example
* add README.md for the supplychain example
* bugfix for risk mining
* use exact out
* refactor(solver): format solver code (#205)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* support reinitialize config for java call
* black format
---------
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
2025-01-03 17:10:51 +08:00
        self.client = OpenAI(api_key=self.api_key, base_url=self.base_url)
2025-04-17 17:23:52 +08:00
        self.aclient = AsyncOpenAI(api_key=self.api_key, base_url=self.base_url)
2025-01-03 17:10:51 +08:00
        self.check()
2025-04-17 17:23:52 +08:00
        logger.debug(
            f"Initialize OpenAIClient with rate limit {max_rate} every {time_period}s"
        )
2025-04-25 18:01:45 +08:00
        logger.info(f"OpenAIClient max_tokens={self.max_tokens}")
2024-10-24 11:46:15 +08:00
2025-04-17 17:23:52 +08:00
    def __call__(self, prompt: str = "", image_url: Optional[str] = None, **kwargs):
2024-10-24 11:46:15 +08:00
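        """
        Executes a model request when the object is called and returns the result.

        Parameters:
            prompt (str): The prompt provided to the model. Used to build the
                default messages when `messages` is not supplied.
            image_url (str, optional): URL of an image to send together with the
                prompt as a multimodal user message.
            **kwargs: Optional settings read by this method: `reporter`,
                `segment_name`, `tag_name`, `tools`, and `messages` (a prebuilt
                message list that replaces the default system/user messages).

        Returns:
            str: The response content generated by the model.
        """
        # Call the model with the given prompt and return the response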
2025-04-17 17:23:52 +08:00
        reporter: Optional[ReporterABC] = kwargs.get("reporter", None)
        segment_name = kwargs.get("segment_name", None)
        tag_name = kwargs.get("tag_name", None)
        tools = kwargs.get("tools", None)
        messages = kwargs.get("messages", None)
        if messages is None:
            if image_url:
                messages = [
                    {"role": "system", "content": "you are a helpful assistant"},
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": prompt},
                            {"type": "image_url", "image_url": {"url": image_url}},
                        ],
                    },
                ]
            else:
                messages = [
                    {"role": "system", "content": "you are a helpful assistant"},
                    {"role": "user", "content": prompt},
                ]
        response = self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            stream=self.stream,
            temperature=self.temperature,
            timeout=self.timeout,
            tools=tools,
2025-05-31 16:53:01 +08:00
            max_tokens=self.max_tokens if self.max_tokens > 0 else NOT_GIVEN,
4、conf of solver need to be re-examined.
* 1、fix bug in base_table_splitter
* 1、fix bug in base_table_splitter
* 1、fix bug in default_chain
* update outline splitter
* add main test
* add markdown report
* code refactor
* fix outline splitter
* fix outline prompt
* update markdown reader
* fix vectorizer num batch compute
* add retry for vectorize model call
* update markdown reader
* raise extractor failure
* rm parser
* run pipeline
* add config option of whether to perform llm config check, default to false
* fix
* recover pdf reader
* several components can be null for default chain
* 支持完整qa运行
* add if
* remove unused code
* 使用chunk兜底
* excluded source relation to choose
* add generate
* default recall 10
* add local memory
* 排除相似边
* 增加保护
* 修复并发问题
* add debug logger
* 支持topk参数化
* 支持chunk截断和调整spo select 的prompt
* 增加查询请求保护
* 增加force_chunk配置
* fix entity linker algorithm
* 增加sub query改写
* fix md reader dup in test
* fix
* merge knext to kag parallel
* fix package
* 修复指标下跌问题
* scanner update
* scanner update
* add doc and update example scripts
* fix
* add bridge to spg server
* add format
* fix bridge
* update conf for baike
* disable ckpt for spg server runner
* llm invoke error default raise exceptions
* chore(version): bump version to X.Y.Z
* update default response generation prompt
* add method getSummarizationMetrics
* fix(common): fix project conf empty error
* fix typo
* 增加上报信息
* 修改main solver
* postprocessor support spg server
* 修改solver支持名
* fix language
* 修改chunker接口,增加openapi
* rename vectorizer to vectorize_model in spg server config
* generate_random_string start with gen
* add knext llm vector checker
* add knext llm vector checker
* add knext llm vector checker
* solver移除默认值
* udpate yaml and register_name for baike
* udpate yaml and register_name for baike
* remove config key check
* 修复llmmodule
* fix knext project
* udpate yaml and register_name for examples
* udpate yaml and register_name for examples
* Revert "udpate yaml and register_name for examples"
This reverts commit 9705951d066b282ac49f0e1972559b646e7f906d.
* update register name
* fix
* fix
* support multiple resigter names
* update component
* update reader register names (#183)
* fix markdown reader
* fix llm client for retry
* feat(common): add processed chunk id checkpoint (#185)
* update reader register names
* add processed chunk id checkpoint
* feat(example): add example config (#186)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* add max_workers parameter for getSummarizationMetrics to make it faster
* add csqa data generation script generate_data.py
* commit generated csqa builder and solver data
* add csqa basic project files
* adjust split_length and num_threads_per_chain to match lightrag settings
* ignore ckpt dirs
* add csqa evaluation script eval.py
* save evaluation scripts summarization_metrics.py and factual_correctness.py
* save LightRAG output csqa_lightrag_answers.json
* ignore KAG output csqa_kag_answers.json
* add README.md for CSQA
* fix(solver): fix solver pipeline conf (#191)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* update links and file paths
* reformat csqa kag_config.yaml
* reformat csqa python files
* reformat getSummarizationMetrics and compare_summarization_answers
* fix(solver): fix solver config (#192)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* add except
* fix typo in csqa README.md
* feat(conf): support reinitialize config for call from java side (#199)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* support reinitialize config for java call
* revert default response generation prompt
* update project list
* add README.md for the hotpotqa, 2wiki and musique examples
* 增加spo检索
* turn off kag config dump by default
* turn off knext schema dump by default
* add .gitignore and fix kag_config.yaml
* add README.md for the medicine example
* add README.md for the supplychain example
* bugfix for risk mining
* use exact out
* refactor(solver): format solver code (#205)
* update reader register names
* add processed chunk id checkpoint
* add example config file
* update solver pipeline config
* fix project create
* fix main solver conf
* support reinitialize config for java call
* black format
---------
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
* docs(examples): finish README.md for builtin kag examples (#207)
* add data introduction for example supplychain
* add schema modeling for example supplychain
* add kg construction for example supplychain
* add kg query for example supplychain
* finish README_cn.md for the supplychain example
* finish README.md for the supplychain example
* add README.md for the riskmining example
* add README.md for the medicine example
* update README.md of hotpotqa, 2wiki and musique
* add README_cn.md for hotpotqa, 2wiki and musique
* update README.md of csqa
* add README_cn.md for csqa
* update link targets to 0.6 version of the docs
* add README.md for baike
* reformat Python code in examples
* fix create project (#208)
* docs(examples): finish README.md for the examples directory (#210)
* add data introduction for example supplychain
* add schema modeling for example supplychain
* add kg construction for example supplychain
* add kg query for example supplychain
* finish README_cn.md for the supplychain example
* finish README.md for the supplychain example
* add README.md for the riskmining example
* add README.md for the medicine example
* update README.md of hotpotqa, 2wiki and musique
* add README_cn.md for hotpotqa, 2wiki and musique
* update README.md of csqa
* add README_cn.md for csqa
* update link targets to 0.6 version of the docs
* add README.md for baike
* reformat Python code in examples
* add README_cn.md for examples
* finish README.md for examples
* fix typo in README.md for examples
* move images for kag examples to _static/images (#214)
* move more images for kag examples to _static/images (#216)
* udpate default yaml and corpus (#217)
* fix(knext): fix knext project env (#211)
* fix create project
* fix create project
* fix create project
* fix create project
* fix examples REAME.md to match quick start doc (#218)
* fix(example): fix vectorize model config in example (#220)
* fix vectorize model config
* remove ak
* remove ak
* x
* change log level to debug (#221)
* fix knext env (#223)
* fix 'ModuleNotFoundError: No module named xxx' error in Windows (#222)
* reduce warn (#225)
* update(kag) update log level (#226)
* udpate default yaml and corpus
* update log level to debug
* fix(KAG): change level log (#227)
* change log level to debug
* fix(example): fix vectorize model config in example (#220)
* fix vectorize model config
* remove ak
* remove ak
* x
* fix knext env (#223)
* fix 'ModuleNotFoundError: No module named xxx' error in Windows (#222)
* reduce warn (#225)
* change log level to debug
---------
Co-authored-by: zhuzhongshu123 <152354526+zhuzhongshu123@users.noreply.github.com>
Co-authored-by: Xinhong Zhang <zhangxinhong.zxh@antgroup.com>
* fix knext client (#229)
* fix(knext): fix knext client (#230)
* fix knext client
* x
* fix ollma regsiter name (#234)
* add timeout param for llm and embedding model (#236)
* Update README_cn.md (#238)
* Update README.md (#237)
* feat(examples): output qfs evaluation results as json and markdown (#240)
* fix vectorize_model configuration key typo
* fix permissions of data files
* fix examples README.md inconsistency
* output qfs evaluation results as json and markdown
* format summarization_metrics.py with black
* chore(examples): domain KG inject example (#249)
* add timeout param for llm and embedding model
* add example
* fix title
* update(kag) Update README (#258)
* update README #andy
* update README #andy
* update README #andy
* update README #andy
* update(kag) Update README (#264)
* update README #andy
* update README #andy
* update README #andy
* update README #andy
* update README #andy
* update README #andy
* fix mix reader (#270)
* feat(builder): add Azure Open AI Compatibility (#269)
* feat(llm): add Azure OpenAI client and vectorization support
* chore: add .DS_Store to .gitignore
* refactor(llm):add description for api_version and default value
* refactor(vectorize_model): added description for ap_version and default values for some params
* refactor(openai_model): enhance docstring for Azure AD token and deployment parameters
* fix(builder): fix markdown reader for id (#273)
* fix buidler init
* add pro commit
* rename graphalgoclient to graphclient
* first fix
* fix(examples): fix qa file name (#251)
* support custom kag config file (#279)
* feat(bridge): spg server bridge supports config check and run solver (#287)
* x
* x (#280)
* bridge add solver
* x
* feat(bridge): spg server bridge (#283)
* x
* bridge add solver
* x
* add invoke
* llm client catch error
* feat(kag): catch unexpected exceptions (#298)
* x (#280)
* feat(bridge): spg server bridge (#283)
* x
* bridge add solver
* x
* feat(bridge): Spg server bridge check (#285)
* x
* bridge add solver
* x
* add invoke
* feat(common): llm client catch exception (#294)
* x
* bridge add solver
* x
* add invoke
* llm client catch error
* feat(solver): catch chunk retriever exception (#297)
* x
* bridge add solver
* x
* add invoke
* llm client catch error
* catch exception
* feat(common):llm except (#299)
* x
* bridge add solver
* x
* add invoke
* llm client catch error
* catch exception
* print llm invoke error info
* with except
* feat(common): force raise except (#300)
* x
* bridge add solver
* x
* add invoke
* llm client catch error
* catch exception
* print llm invoke error info
* with except
* force raise except
* delete checkpoint of postprocess (#302)
* disable entity linking in postprocess by default (#304)
* add retry (#306)
* use json repair for llm client (#312)
* fix empty data generate (#319)
* Add Discord link and wechat qr code. (#338)
* Add qr code
* Update README.md
Add discord and how to join the wechat group.
* fix the error when the stream parameter is True (#336)
* Update baike kag_config.yaml (#339)
The reference to the class corresponding to the default_chunk_retriever is incorrect. fix it
* fix(builder): fix pdf reader for normalizing text in outline (#344)
* fix buidler init
* add pro commit
* rename graphalgoclient to graphclient
* fix pdf reader
* fix pdf reader
* fix pdf reader
* fix(builder): bugfix official_name node has same prop object (#372)
* bugfix official_name node has same prop object
* reformat by black
* fix(solver): bugfix SPO Retrieval LLM response parse (#378)
* bugfix official_name node has same prop object
* reformat by black
* adapter spo retrieval llm response
* core_team #andy (#389)
* fix(knext): project update addr (#408)
* fix buidler init
* add pro commit
* rename graphalgoclient to graphclient
* use config default
* fix(knext): set token in request of write_graph (#409)
* fix(knext)set token in request of write_graph
* refine code
* feat(kag): update to v0.7 (#456)
* add think cost
* update csv scanner
* add final rerank
* add reasoner
* add iterative planner
* fix dpr search
* fix dpr search
* add reference data
* move odps import
* update requirement.txt
* update 2wiki
* add missing file
* fix markdown reader
* add iterative planning
* update version
* update runner
* update 2wiki example
* update bridge
* merge solver and solver_new
* add cur day
* writer delete
* update multi process
* add missing files
* fix report
* add chunk retrieved executor
* update try in stream runner result
* add path
* add math executor
* update hotpotqa example
* remove log
* fix python coder solver
* update hotpotqa example
* fix python coder solver
* update config
* fix bad
* add log
* remove unused code
* commit with task thought
* move kag model to common
* add default chat llm
* fix
* use static planner
* support chunk graph node
* add args
* support naive rag
* llm client support tool calls
* add default async
* add openai
* fix result
* fix markdown reader
* fix thinker
* update asyncio interface
* feat(solver): add mcp support (#444)
* 上传mcp client相关代码
* 1、完成一套mcp client的调用,从pipeline到planner、executor
2、允许json中传入多个mcp_server,通过大模型进行调用并选择
3、调通baidu_map_mcp的使用
* 1、schema
* bugfix:删减冗余代码
---------
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
* fix affairqa after solver refactor
* fix affairqa after solver refactor
* fix readme
* add params
* update version
* update mcp executor
* update mcp executor
* solver add mcp executor
* add missing file
* add mpc executor
* add executor
* x
* update
* fix requirement
* fix main llm config
* fix solver
* bugfix:修复invoke函数调用逻辑
* chg eva
* update example
* add kag layer
* add step task
* support dot refresh
* support dot refresh
* support dot refresh
* support dot refresh
* add retrieved num
* add retrieved num
* add pipelineconf
* update ppr
* update musique prompts
* update
* add to_dict for BuilderComponentData
* async build
* add deduce prompt
* add deduce prompt
* add deduce prompt
* fix reader
* add deduce prompt
* add page thinker report
* modify prmpt
* add step status
* add self cognition
* add self cognition
* add memory graph storage
* add now time
* update memory config
* add now time
* chg graph loader
* 添加prqa数据集和代码
* bugfix:prqa调用逻辑修复
* optimize:优化代码逻辑,生成答案规范化
* add retry py code
* update memory graph
* update memory graph
* fix
* fix ner
* add with_out_refer generator prompt
* fix
* close ckpt
* fix query
* fix query
* update version
* add llm checker
* add llm checker
* 1、上传evalutor.py以及修改gold_answer.json格式
2、优化代码逻辑
3、修改README.md文件
* update exp
* update exp
* rerank support
* add static rewrite query
* recall more chunks
* fix graph load
* add static rewrite query
* fix bugs
* add finish check
* add finish check
* add finish check
* add finish check
* 1、上传evalutor.py的结果
2、优化代码逻辑,优化readme文件
* add lf retry
* add memory graph api
* fix reader api
* add ner
* add metrics
* fix bug
* remove ner
* add reraise fo retry
* add edge prop to memory graph
* add memory graph
* 1、评测数据集结果修正
2、优化evaluator.py代码
3、删除结果不存在而gold_answer中有答案的问题
* 删除评测结果文件
* fix knext host addr
* async eva
* add lf prompt
* add lf prompt
* add config
* add retry
* add unknown check
* add rc result
* add rc result
* add rc result
* add rc result
* 依据kag pipeline格式修改代码逻辑并通过测试
* bugfix:删除冗余代码
* fix report prompt
* bugfix:触发重试机制
* bugfix:中文符号错误
* fix rethinker prompt
* update version to 0.6.2b78
* update version
* 1、修改evaluator.py,通过大模型计算准确率,符合最新调用逻辑
2、修改prompt,让没有回答的结果重复测试
* update affairqa for evaluate
* update affairqa for evaluate
* bugfix:修正数据集
* bugfix:修正数据集
* bugfix:修正数据集
* fix name conflict
* bugfix:删除错误问题
* bugfix:文件名命名错误导致evaluator失败
* update for affairqa eval
* bugfix:修改代码保持evaluate逻辑一致
* x
* update for affairqa readme
* remove temp eval scripts
* bugfix for math deduce
* merge 0.6.2_dev
* merge 0.6.2_dev
* fix
* update client addr
* updated version
* update for affairqa eval
* evaUtils 支持中文
* fix affairqa eval:
* remove unused example
* update kag config
* fix default value
* update readme
* fix init
* 注释信息修改,并添加部分class说明
* update example config
* Tc 0.7.0 (#459)
* 提交affairQA 代码
* fix affairqa eval
---------
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
* fix all examples
* reformat
---------
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
* 更新readme (#460)
* feat(kag): update readme (#462)
* 更新readme
* 更新readme
* feat(kag): update version of KAG (#475)
* 更新readme
* 更新readme
* update version of KAG
* add async return (#481)
* add run component error log (#482)
* fix(bin): add pip index url (#483)
* add run component error log
* add index url option
* add gpu type
* fix(solver): add component name (#480)
* add ai search example
* bugfix reporter name
* update version
* fix ci
* support disabling vector generation (#486)
* fix
* fix
---------
Co-authored-by: Andy <andy.yj@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: leywar <leywar.liang@antgroup.com>
Co-authored-by: zhuzhongshu123 <152354526+zhuzhongshu123@users.noreply.github.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Co-authored-by: tpoisonooo <khj.application@aliyun.com>
Co-authored-by: thundax <84020791+thundax-lyp@users.noreply.github.com>
Co-authored-by: Chasing <94726836+zzzcccxx@users.noreply.github.com>
Co-authored-by: yangman <1515243746@qq.com>
Co-authored-by: joseosvaldo16 <joseosvaldo16@yahoo.com.mx>
Co-authored-by: luzizhuo <496521310@qq.com>
Co-authored-by: hy89 <31279043+hy89@users.noreply.github.com>
Co-authored-by: xueguanwen <xgw1989@sina.com>
Co-authored-by: bingchu <152955942+J4ckycjl@users.noreply.github.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
            extra_body={"chat_template_kwargs": {"enable_thinking": self.think}},
        )
        if not self.stream:
            # reasoning_content = getattr(
            #     response.choices[0].message, "reasoning_content", None
            # )
            # content = response.choices[0].message.content
            # if reasoning_content:
            #     rsp = f"{reasoning_content}\n{content}"
            # else:
            #     rsp = content
            rsp = response.choices[0].message.content
            tool_calls = response.choices[0].message.tool_calls
        else:
            rsp = ""
            tool_calls = None  # TODO: Handle tool calls in stream mode

            for chunk in response:
                if not chunk.choices:
                    continue
                delta_content = getattr(chunk.choices[0].delta, "content", None)
                if delta_content is not None:
                    rsp += delta_content
                    if reporter:
                        reporter.add_report_line(
                            segment_name,
                            tag_name,
                            rsp,
                            status="RUNNING",
                        )
        if reporter:
            reporter.add_report_line(
                segment_name,
                tag_name,
                rsp,
                status="FINISH",
            )
        if tools and tool_calls:
            return response.choices[0].message

        return rsp

    async def acall(self, prompt: str = "", image_url: str = None, **kwargs):
        """
        Executes a model request when the object is called and returns the result.

        Parameters:
            prompt (str): The prompt provided to the model.

        Returns:
            str: The response content generated by the model.
        """
        # Call the model with the given prompt and return the response
        reporter: Optional[ReporterABC] = kwargs.get("reporter", None)
        segment_name = kwargs.get("segment_name", None)
        tag_name = kwargs.get("tag_name", None)
        if reporter:
            reporter.add_report_line(
                segment_name,
                tag_name,
                "",
                status="INIT",
            )

        tools = kwargs.get("tools", None)
        messages = kwargs.get("messages", None)
        if messages is None:
            if image_url:
                messages = [
                    {"role": "system", "content": "you are a helpful assistant"},
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": prompt},
                            {"type": "image_url", "image_url": {"url": image_url}},
                        ],
                    },
                ]

            else:
                messages = [
                    {"role": "system", "content": "you are a helpful assistant"},
                    {"role": "user", "content": prompt},
                ]
        response = await self.aclient.chat.completions.create(
            model=self.model,
            messages=messages,
            stream=self.stream,
            temperature=self.temperature,
            timeout=self.timeout,
            tools=tools,
            max_tokens=self.max_tokens if self.max_tokens > 0 else NOT_GIVEN,
2025-05-12 13:47:53 +08:00
|
|
|
extra_body={"chat_template_kwargs": {"enable_thinking": self.think}},
|
2025-04-17 17:23:52 +08:00
|
|
|
)
|
2025-02-11 11:01:26 +08:00
|
|
|
if not self.stream:
|
2025-04-17 17:23:52 +08:00
|
|
|
# reasoning_content = getattr(
|
|
|
|
# response.choices[0].message, "reasoning_content", None
|
|
|
|
# )
|
|
|
|
# if reasoning_content:
|
|
|
|
# rsp = f"{reasoning_content}\n{content}"
|
|
|
|
# else:
|
2024-10-24 11:46:15 +08:00
|
|
|
rsp = response.choices[0].message.content
|
2025-04-17 17:23:52 +08:00
|
|
|
tool_calls = response.choices[0].message.tool_calls
|
2025-02-11 11:01:26 +08:00
|
|
|
else:
|
|
|
|
rsp = ""
|
2025-04-17 17:23:52 +08:00
|
|
|
tool_calls = None
|
|
|
|
async for chunk in response:
|
|
|
|
if not chunk.choices:
|
|
|
|
continue
|
|
|
|
delta_content = getattr(chunk.choices[0].delta, "content", None)
|
|
|
|
if delta_content is not None:
|
|
|
|
rsp += delta_content
|
|
|
|
if reporter:
|
|
|
|
reporter.add_report_line(
|
|
|
|
segment_name,
|
|
|
|
tag_name,
|
|
|
|
rsp,
|
|
|
|
status="RUNNING",
|
|
|
|
)
|
|
|
|
if reporter:
|
|
|
|
reporter.add_report_line(
|
|
|
|
segment_name,
|
|
|
|
tag_name,
|
|
|
|
rsp,
|
|
|
|
status="FINISH",
|
|
|
|
)
|
|
|
|
if tools and tool_calls:
|
|
|
|
return response.choices[0].message
|
2025-02-11 11:01:26 +08:00
|
|
|
return rsp
|
2024-10-24 11:46:15 +08:00
|
|
|
|
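# Illustrative sketch (not part of the client): both clients build the same
# default message list when no `messages` kwarg is passed. The function name
# build_messages and the example URL are assumptions made for this sketch; the
# payload shape mirrors the dictionaries constructed in the methods of this file.

```python
from typing import Any, Dict, List, Optional


def build_messages(prompt: str, image_url: Optional[str] = None) -> List[Dict[str, Any]]:
    """Return the default system + user messages for a text or text+image prompt."""
    system = {"role": "system", "content": "you are a helpful assistant"}
    if image_url:
        # Multimodal user turn: a text part followed by an image_url part.
        user = {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    else:
        # Plain text user turn.
        user = {"role": "user", "content": prompt}
    return [system, user]


text_only = build_messages("hi")
with_image = build_messages("describe this", "https://example.com/cat.png")
```

# Factoring the construction out like this would remove the four near-identical
# copies in __call__ and acall; whether that refactor suits the project is left
# to the maintainers.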
|
|
|
|
2025-01-13 22:57:43 -06:00
|
|
|
@LLMClient.register("azure_openai")
|
2025-01-21 11:20:26 +08:00
|
|
|
class AzureOpenAIClient(LLMClient):
|
2025-01-13 22:57:43 -06:00
|
|
|
def __init__(
|
|
|
|
self,
|
|
|
|
api_key: str,
|
|
|
|
base_url: str,
|
|
|
|
model: str,
|
|
|
|
stream: bool = False,
|
|
|
|
api_version: str = "2024-12-01-preview",
|
|
|
|
temperature: float = 0.7,
|
|
|
|
azure_deployment: str = None,
|
|
|
|
timeout: float = None,
|
|
|
|
azure_ad_token: str = None,
|
|
|
|
azure_ad_token_provider: AzureADTokenProvider = None,
|
2025-04-17 17:23:52 +08:00
|
|
|
max_rate: float = 1000,
|
|
|
|
time_period: float = 1,
|
|
|
|
**kwargs,
|
2025-01-13 22:57:43 -06:00
|
|
|
):
|
|
|
|
"""
|
|
|
|
Initializes the AzureOpenAIClient instance.
|
|
|
|
|
|
|
|
Args:
|
|
|
|
api_key (str): The API key for accessing the Azure OpenAI API.
|
|
|
|
api_version (str): The API version for the Azure OpenAI API (e.g. "2024-12-01-preview", "2024-10-01-preview", "2024-05-01-preview").
|
|
|
|
base_url (str): The base URL for the Azure OpenAI API.
|
|
|
|
azure_deployment (str): The deployment name for the Azure OpenAI model
|
|
|
|
model (str): The default model to use for requests.
|
|
|
|
stream (bool, optional): Whether to stream the response. Defaults to False.
|
|
|
|
temperature (float, optional): The temperature parameter for the model. Defaults to 0.7.
|
|
|
|
timeout (float, optional): The timeout duration for the service request. Defaults to None, which means no timeout.
|
|
|
|
azure_ad_token (str, optional): Your Azure Active Directory (Microsoft Entra ID) token; see https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id
|
|
|
|
azure_ad_token_provider: A function that returns an Azure Active Directory token; it is invoked on every request.
|
|
|
|
azure_deployment: A model deployment, if given sets the base client URL to include `/deployments/{azure_deployment}`.
|
|
|
|
Note: this means you won't be able to use non-deployment endpoints. Not supported with Assistants APIs.
|
|
|
|
"""
|
2025-04-17 17:23:52 +08:00
|
|
|
name = kwargs.pop("name", None)
|
|
|
|
if not name:
|
|
|
|
name = f"{api_key}{base_url}{model}"
|
|
|
|
super().__init__(name, max_rate, time_period, **kwargs)
|
2025-01-13 22:57:43 -06:00
|
|
|
|
|
|
|
self.api_key = api_key
|
|
|
|
self.base_url = base_url
|
|
|
|
self.azure_deployment = azure_deployment
|
|
|
|
self.model = model
|
|
|
|
self.stream = stream
|
|
|
|
self.temperature = temperature
|
|
|
|
self.timeout = timeout
|
|
|
|
self.api_version = api_version
|
|
|
|
self.azure_ad_token = azure_ad_token
|
|
|
|
self.azure_ad_token_provider = azure_ad_token_provider
|
2025-01-21 11:20:26 +08:00
|
|
|
self.client = AzureOpenAI(
|
|
|
|
api_key=self.api_key,
|
|
|
|
base_url=self.base_url,
|
|
|
|
azure_deployment=self.azure_deployment,
|
|
|
|
model=self.model,
|
|
|
|
api_version=self.api_version,
|
|
|
|
azure_ad_token=self.azure_ad_token,
|
|
|
|
azure_ad_token_provider=self.azure_ad_token_provider,
|
|
|
|
)
|
2025-04-17 17:23:52 +08:00
|
|
|
self.aclient = AsyncAzureOpenAI(
|
|
|
|
api_key=self.api_key,
|
|
|
|
base_url=self.base_url,
|
|
|
|
azure_deployment=self.azure_deployment,
|
|
|
|
model=self.model,
|
|
|
|
api_version=self.api_version,
|
|
|
|
azure_ad_token=self.azure_ad_token,
|
|
|
|
azure_ad_token_provider=self.azure_ad_token_provider,
|
|
|
|
)
|
|
|
|
|
2025-01-13 22:57:43 -06:00
|
|
|
self.check()
|
2025-04-17 17:23:52 +08:00
|
|
|
logger.debug(
|
|
|
|
f"Initialize AzureOpenAIClient with rate limit {max_rate} every {time_period}s"
|
|
|
|
)
|
2025-01-13 22:57:43 -06:00
|
|
|
|
2025-04-17 17:23:52 +08:00
|
|
|
def __call__(self, prompt: str = "", image_url: str = None, **kwargs):
|
2025-01-13 22:57:43 -06:00
|
|
|
"""
|
|
|
|
Executes a model request when the object is called and returns the result.
|
|
|
|
|
|
|
|
Parameters:
|
|
|
|
prompt (str): The prompt provided to the model.
|
|
|
|
|
|
|
|
Returns:
|
|
|
|
str: The response content generated by the model, or the raw message object when `tools` is given and the model returns tool calls.
|
|
|
|
"""
|
|
|
|
# Call the model with the given prompt and return the response
|
2025-04-17 17:23:52 +08:00
|
|
|
tools = kwargs.get("tools", None)
|
|
|
|
messages = kwargs.get("messages", None)
|
|
|
|
if messages is None:
|
|
|
|
if image_url:
|
|
|
|
messages = [
|
|
|
|
{"role": "system", "content": "you are a helpful assistant"},
|
|
|
|
{
|
|
|
|
"role": "user",
|
|
|
|
"content": [
|
|
|
|
{"type": "text", "text": prompt},
|
|
|
|
{"type": "image_url", "image_url": {"url": image_url}},
|
|
|
|
],
|
|
|
|
},
|
|
|
|
]
|
|
|
|
else:
|
|
|
|
messages = [
|
|
|
|
{"role": "system", "content": "you are a helpful assistant"},
|
|
|
|
{"role": "user", "content": prompt},
|
|
|
|
]
|
|
|
|
response = self.client.chat.completions.create(
|
|
|
|
model=self.model,
|
|
|
|
messages=messages,
|
|
|
|
stream=self.stream,
|
|
|
|
temperature=self.temperature,
|
|
|
|
timeout=self.timeout,
|
2025-04-25 18:01:45 +08:00
|
|
|
max_tokens=self.max_tokens,
|
2025-04-17 17:23:52 +08:00
|
|
|
)
|
|
|
|
rsp = response.choices[0].message.content
|
|
|
|
tool_calls = response.choices[0].message.tool_calls
|
|
|
|
if tools and tool_calls:
|
|
|
|
return response.choices[0].message
|
2025-01-13 22:57:43 -06:00
|
|
|
|
2025-04-17 17:23:52 +08:00
|
|
|
return rsp
|
|
|
|
|
|
|
|
async def acall(self, prompt: str = "", image_url: str = None, **kwargs):
|
|
|
|
"""
|
|
|
|
Asynchronously executes a model request and returns the result.
|
|
|
|
|
|
|
|
Parameters:
|
|
|
|
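prompt (str): The prompt provided to the model.
image_url (str, optional): URL of an image to send together with the prompt.
**kwargs: Optional `messages` (a prebuilt message list used instead of `prompt` and `image_url`) and `tools` (when set and the model returns tool calls, the full message is returned).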
|
|
|
|
|
|
|
|
Returns:
|
|
|
|
str: The response content generated by the model, or the full message object when `tools` is set and the model returns tool calls.
|
|
|
|
"""
|
|
|
|
# Call the model with the given prompt and return the response
|
|
|
|
tools = kwargs.get("tools", None)
|
|
|
|
messages = kwargs.get("messages", None)
|
|
|
|
if messages is None:
|
|
|
|
if image_url:
|
|
|
|
messages = [
|
|
|
|
{"role": "system", "content": "you are a helpful assistant"},
|
|
|
|
{
|
|
|
|
"role": "user",
|
|
|
|
"content": [
|
|
|
|
{"type": "text", "text": prompt},
|
|
|
|
{"type": "image_url", "image_url": {"url": image_url}},
|
|
|
|
],
|
|
|
|
},
|
|
|
|
]
|
|
|
|
|
|
|
|
else:
|
|
|
|
messages = [
|
|
|
|
{"role": "system", "content": "you are a helpful assistant"},
|
|
|
|
{"role": "user", "content": prompt},
|
|
|
|
]
|
|
|
|
response = await self.aclient.chat.completions.create(
|
|
|
|
model=self.model,
|
|
|
|
messages=messages,
|
|
|
|
stream=self.stream,
|
|
|
|
temperature=self.temperature,
|
|
|
|
timeout=self.timeout,
|
2025-04-25 18:01:45 +08:00
|
|
|
max_tokens=self.max_tokens,
|
2025-04-17 17:23:52 +08:00
|
|
|
)
|
|
|
|
rsp = response.choices[0].message.content
|
|
|
|
tool_calls = response.choices[0].message.tool_calls
|
|
|
|
|
|
|
|
if tools and tool_calls:
|
|
|
|
return response.choices[0].message
|
|
|
|
return rsp
|
2025-05-12 13:47:53 +08:00
|
|
|
|
|
|
|
|
|
|
|
if __name__ == "__main__":
|
|
|
|
client = OpenAIClient(
|
|
|
|
model="Qwen/Qwen3-0.6B", base_url="http://0.0.0.0:8000/v1", think=False
|
|
|
|
)
|
|
|
|
msg = asyncio.run(client.acall("Hello"))
|
|
|
|
print(msg)
|