mirror of
https://github.com/OpenSPG/KAG.git
synced 2025-06-27 03:20:08 +00:00

# -*- coding: utf-8 -*-
# Copyright 2023 OpenSPG Authors
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
# in compliance with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software distributed under the License
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
# or implied.
import os
from typing import List, Dict

import knext.common.cache
from knext.common.base.client import Client
from knext.common.rest import ApiClient, Configuration
from knext.schema import rest
from knext.schema.model.base import BaseSpgType, AlterOperationEnum, SpgTypeEnum
from knext.schema.model.relation import Relation

cache = knext.common.cache.SchemaCache()

CHUNK_TYPE = "Chunk"
TITLE_TYPE = "Title"
OTHER_TYPE = "Others"
TEXT_TYPE = "Text"
INTEGER_TYPE = "Integer"
FLOAT_TYPE = "Float"
BASIC_TYPES = [TEXT_TYPE, INTEGER_TYPE, FLOAT_TYPE]


class SchemaSession:
    def __init__(self, client, project_id):
        self._alter_spg_types: List[BaseSpgType] = []
        self._rest_client = client
        self._project_id = project_id

        # Types loaded from the server-side project schema.
        self._spg_types = {}
        # Types staged locally through create_type() and not yet committed.
        self.__spg_types = {}
        self._init_spg_types()

    def _init_spg_types(self):
        """Query project schema and init SPG types in session."""
        project_schema = self._rest_client.schema_query_project_schema_get(
            self._project_id
        )
        for spg_type in project_schema.spg_types:
            spg_type_name = spg_type.basic_info.name.name
            type_class = BaseSpgType.by_type_enum(spg_type.spg_type_enum)
            if spg_type.spg_type_enum == SpgTypeEnum.Concept:
                self._spg_types[spg_type_name] = type_class(
                    name=spg_type_name,
                    hypernym_predicate=spg_type.concept_layer_config.hypernym_predicate,
                    rest_model=spg_type,
                )
            else:
                self._spg_types[spg_type_name] = type_class(
                    name=spg_type_name, rest_model=spg_type
                )

    @property
    def spg_types(self) -> Dict[str, BaseSpgType]:
        return self._spg_types

    def get(self, spg_type_name) -> BaseSpgType:
        """Get SPG type by name from project schema."""
        # Prefer the server-side schema, then fall back to locally staged types.
        spg_type = self._spg_types.get(spg_type_name)
        if spg_type is None:
            spg_type = self.__spg_types.get(spg_type_name)
        if spg_type is None:
            raise ValueError(f"{spg_type_name} does not exist")
        return spg_type

    def create_type(self, spg_type: BaseSpgType):
        """Add an SPG type in session with `CREATE` operation."""
        spg_type.alter_operation = AlterOperationEnum.Create
        self.__spg_types[spg_type.name] = spg_type
        self._alter_spg_types.append(spg_type)
        return self

    def update_type(self, spg_type: BaseSpgType):
        """Add an SPG type in session with `UPDATE` operation."""
        spg_type.alter_operation = AlterOperationEnum.Update
        self._alter_spg_types.append(spg_type)
        return self

    def delete_type(self, spg_type: BaseSpgType):
        """Add an SPG type in session with `DELETE` operation."""
        spg_type.alter_operation = AlterOperationEnum.Delete
        self._alter_spg_types.append(spg_type)
        return self

    def commit(self):
        """Commit all altered schemas to server."""
        schema_draft = []
        for spg_type in self._alter_spg_types:
            # Fill in any object type enum that was left unresolved.
            for prop in spg_type.properties.values():
                if prop.object_spg_type is None:
                    object_spg_type = self.get(prop.object_type_name)
                    prop.object_spg_type = object_spg_type.spg_type_enum
                for sub_prop in prop.sub_properties.values():
                    if sub_prop.object_spg_type is None:
                        object_spg_type = self.get(sub_prop.object_type_name)
                        sub_prop.object_spg_type = object_spg_type.spg_type_enum
            for rel in spg_type.relations.values():
                if rel.is_dynamic is None:
                    rel.is_dynamic = False
                if rel.object_spg_type is None:
                    object_spg_type = self.get(rel.object_type_name)
                    rel.object_spg_type = object_spg_type.spg_type_enum
                for sub_prop in rel.sub_properties.values():
                    if sub_prop.object_spg_type is None:
                        object_spg_type = self.get(sub_prop.object_type_name)
                        sub_prop.object_spg_type = object_spg_type.spg_type_enum
            schema_draft.append(spg_type.to_rest())
        if len(schema_draft) == 0:
            return

        request = rest.SchemaAlterRequest(
            project_id=self._project_id, schema_draft=rest.SchemaDraft(schema_draft)
        )
        # Set KNEXT_DEBUG_DUMP_SCHEMA=1 to print the full request before posting it.
        key = "KNEXT_DEBUG_DUMP_SCHEMA"
        dump_flag = os.getenv(key)
        if dump_flag is not None and dump_flag.strip() == "1":
            print(request)
        else:
            print(f"Committing schema: set {key}=1 to dump the schema")
        self._rest_client.schema_alter_schema_post(schema_alter_request=request)


class SchemaClient(Client):
    """Client for querying and altering the schema of a project."""

    def __init__(self, host_addr: str = None, project_id: str = None):
        super().__init__(host_addr, project_id)
        self._session = None
        self._rest_client: rest.SchemaApi = rest.SchemaApi(
            api_client=ApiClient(configuration=Configuration(host=host_addr))
        )

    def query_spg_type(self, spg_type_name: str) -> BaseSpgType:
        """Query SPG type by name."""
        rest_model = self._rest_client.schema_query_spg_type_get(spg_type_name)
        type_class = BaseSpgType.by_type_enum(f"{rest_model.spg_type_enum}")

        if rest_model.spg_type_enum == SpgTypeEnum.Concept:
            return type_class(
                name=spg_type_name,
                hypernym_predicate=rest_model.concept_layer_config.hypernym_predicate,
                rest_model=rest_model,
            )
        else:
            return type_class(name=spg_type_name, rest_model=rest_model)

    def query_relation(
        self, subject_name: str, predicate_name: str, object_name: str
    ) -> Relation:
        """Query relation type by s_p_o name."""
        rest_model = self._rest_client.schema_query_relation_get(
            subject_name, predicate_name, object_name
        )
        return Relation(
            name=predicate_name, object_type_name=object_name, rest_model=rest_model
        )

    def create_session(self):
        """Create session for altering schema."""
        # Sessions are cached per project, so repeated calls reuse the same one.
        schema_session = cache.get(self._project_id)
        if not schema_session:
            schema_session = SchemaSession(self._rest_client, self._project_id)
            cache.put(self._project_id, schema_session)
        return schema_session

    def load(self):
        """Return Concept, Entity and Event types keyed by name without namespace prefix."""
        schema_session = self.create_session()
        schema = {
            k.split(".")[-1]: v
            for k, v in schema_session.spg_types.items()
            if v.spg_type_enum
            in [SpgTypeEnum.Concept, SpgTypeEnum.Entity, SpgTypeEnum.Event]
        }
        return schema

    def extract_types(self):
        """Return the loaded type names, excluding Chunk and the basic types."""
        schema = self.load()
        types = [t for t in schema.keys() if t not in [CHUNK_TYPE] + BASIC_TYPES]
        return types

    def extract_types_zh_mapping(self):
        """Map the loaded type names to their `name_zh`, excluding Chunk and the basic types."""
        schema = self.load()
        schema_mapping = {
            t: schema[t].name_zh
            for t in schema.keys()
            if t not in [CHUNK_TYPE] + BASIC_TYPES
        }
        return schema_mapping
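
The session in this file stages schema changes locally and resolves type names against both the server-side schema and the types staged with `create_type`, while `load()` and `extract_types()` strip the namespace prefix and drop `Chunk` and the basic types. The sketch below reproduces that pattern with plain dictionaries so it can run without a server. `StagedSession`, `short_type_names`, and the sample type names and kind strings are hypothetical stand-ins for illustration, not part of knext.

```python
from typing import Dict, List, Tuple

CHUNK_TYPE = "Chunk"
BASIC_TYPES = ["Text", "Integer", "Float"]


class StagedSession:
    """Hypothetical stand-in for SchemaSession: no REST calls, strings instead of SPG types."""

    def __init__(self, server_types: Dict[str, str]):
        self._server_types = dict(server_types)  # types already known to the "server"
        self._created: Dict[str, str] = {}       # types staged through create_type()
        self._pending: List[Tuple[str, str]] = []  # (operation, name) pairs awaiting commit

    def get(self, name: str) -> str:
        # Look in the server-side schema first, then in the locally staged types.
        if name in self._server_types:
            return self._server_types[name]
        if name in self._created:
            return self._created[name]
        raise ValueError(f"{name} does not exist")

    def create_type(self, name: str, kind: str) -> "StagedSession":
        self._created[name] = kind
        self._pending.append(("CREATE", name))
        return self  # returning self allows chained calls

    def commit(self) -> List[Tuple[str, str]]:
        # The real session posts a request to the server; this sketch only
        # returns a copy of the staged operations.
        return list(self._pending)


def short_type_names(qualified_names: List[str]) -> List[str]:
    """Drop the namespace prefix, then exclude Chunk and the basic types."""
    names = [n.split(".")[-1] for n in qualified_names]
    return [n for n in names if n not in [CHUNK_TYPE] + BASIC_TYPES]


session = StagedSession({"Demo.Person": "ENTITY_TYPE", "Text": "BASIC_TYPE"})
session.create_type("Demo.Company", "ENTITY_TYPE").create_type("Demo.Chunk", "ENTITY_TYPE")
staged_ops = session.commit()
visible = short_type_names(["Demo.Person", "Demo.Company", "Demo.Chunk", "Text"])
```

Because a staged type can be resolved before it is committed, a property on one new type can refer to another new type created in the same session, which is the reason `commit()` in the listing resolves unresolved object types through `get()`.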