
Creating a component
A fundamental concept in kotaemon is "component".
Anything that isn't data or a data structure is a "component". A component can be thought of as a step within a pipeline: it takes some input, processes it, and returns an output, just like a Python function! That output then becomes the input to the next component in the pipeline. In fact, a pipeline is itself a component, or more precisely a nested component: one that uses one or more other components in its processing step. Since there is no real difference between a pipeline and a component, kotaemon treats both simply as "component".
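The idea that a pipeline is just a nested component can be sketched in plain Python. This is only an illustration under stated assumptions: the Component, Strip, Upper, Exclaim, and Chain classes below are hypothetical stand-ins written for this example and are not part of kotaemon.

```python
from typing import Any


class Component:
    """Hypothetical stand-in: call it with an input, get an output."""

    def run(self, value: Any) -> Any:
        raise NotImplementedError

    def __call__(self, value: Any) -> Any:
        # Calling a component simply runs its processing logic.
        return self.run(value)


class Strip(Component):
    def run(self, value: str) -> str:
        return value.strip()


class Upper(Component):
    def run(self, value: str) -> str:
        return value.upper()


class Exclaim(Component):
    def run(self, value: str) -> str:
        return value + "!"


class Chain(Component):
    """A 'pipeline': a component whose run() delegates to other components."""

    def __init__(self, *steps: Component) -> None:
        self.steps = steps

    def run(self, value: Any) -> Any:
        # The output of each step becomes the input of the next one.
        for step in self.steps:
            value = step(value)
        return value


inner = Chain(Strip(), Upper())   # a pipeline built from two components
outer = Chain(inner, Exclaim())   # a pipeline that uses another pipeline as a step
result = outer("  hello ")
print(result)
```

Because Chain is itself a Component, inner can be passed to outer exactly like any single-step component, which is the sense in which a pipeline and a component are the same thing.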
To define a component, you will:
- Create a class that subclasses kotaemon.base.BaseComponent
- Declare init params with type annotation
- Declare nodes (nodes are just other components!) with type annotation
- Implement the processing logic in run.
The syntax of a component is as follows:
    from kotaemon.base import BaseComponent
    from kotaemon.llms import LCAzureChatOpenAI
    from kotaemon.parsers import RegexExtractor


    class FancyPipeline(BaseComponent):
        # init params, declared with type annotations
        param1: str = "This is param1"
        param2: int = 10
        param3: float

        # nodes: attributes whose type annotation is a BaseComponent subclass
        node1: BaseComponent  # this is a node because of the BaseComponent type annotation
        node2: LCAzureChatOpenAI  # this is also a node because LCAzureChatOpenAI subclasses BaseComponent
        node3: RegexExtractor  # this is also a node because RegexExtractor subclasses BaseComponent

        def run(self, some_text: str):
            # build the prompt from the params, then pass it through the nodes
            prompt = (self.param1 + some_text) * int(self.param2 + self.param3)
            llm_pred = self.node2(prompt).text
            matches = self.node3(llm_pred)
            return matches
This component can then be used as follows:
    llm = LCAzureChatOpenAI(endpoint="some-endpoint")
    extractor = RegexExtractor(pattern=["yes", "Yes"])

    component = FancyPipeline(
        param1="Hello",
        param3=1.5,
        node1=llm,
        node2=llm,
        node3=extractor,
    )
    component("goodbye")
This way, we can define each operation as a reusable component, and use them to compose larger reusable components!
Benefits of component
By defining a component as above, we formally encapsulate all the necessary information inside a single class. This introduces several benefits:
- Allow tools like promptui to inspect the inner workings of a component in order to automatically generate the corresponding UI.
- Allow visualizing a pipeline for debugging purposes.
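To give a feel for the first benefit, here is a minimal sketch of how a tool could read a component's declaration from its type annotations. It is an assumption-laden illustration, not kotaemon's or promptui's actual mechanism: BaseComponent here is an empty stand-in class, and Retriever, Summarizer, DemoPipeline, and describe are hypothetical names introduced only for this example.

```python
from typing import get_type_hints


class BaseComponent:
    """Hypothetical stand-in; not kotaemon's real BaseComponent."""


class Retriever(BaseComponent):
    pass


class Summarizer(BaseComponent):
    pass


class DemoPipeline(BaseComponent):
    # init params
    top_k: int = 3
    language: str = "en"
    # nodes (annotated with BaseComponent subclasses)
    retriever: Retriever
    summarizer: Summarizer


def describe(component_cls: type) -> dict:
    """Split annotated attributes into plain params and nodes."""
    params, nodes = {}, {}
    for name, hint in get_type_hints(component_cls).items():
        if isinstance(hint, type) and issubclass(hint, BaseComponent):
            # Annotated with a component type: treat it as a node.
            nodes[name] = hint.__name__
        else:
            # Anything else is a plain param; record its default, if any.
            params[name] = getattr(component_cls, name, None)
    return {"params": params, "nodes": nodes}


description = describe(DemoPipeline)
print(description)
```

Since everything a tool needs (param names, defaults, and which attributes are nodes) is declared on the class, a UI generator or a pipeline visualizer could in principle walk this description recursively through each node.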