description: This design doc walks through developing a connector for OpenMetadata.
Build a Connector
Ingestion is a simple Python framework for ingesting metadata from various sources.
Please look at our framework APIs.
Workflow
A workflow is a simple orchestration job that runs its components in order.
It consists of a Source, a Processor, and a Sink. It also supports Stage and BulkSink.
Workflow execution happens serially:
- The workflow runs the source component first. The source retrieves a record from an external source and emits it downstream.
- If a processor component is configured, the workflow sends the record to the processor next.
- Multiple processors can be attached to the workflow; the record passes through them in the order they are configured.
- Once the processors have finished, the workflow sends the modified record to the sink.
- These steps repeat for each record emitted by the source component.
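The per-record flow described above can be sketched in a few lines of plain Python. This is only an illustration of the ordering, not the framework's API: `source`, `add_owner`, `uppercase_table`, `run_workflow`, and the dictionary-based `Record` are all hypothetical names made up for this example.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical stand-ins for illustration only; these are NOT the
# ingestion framework's real classes or signatures.
Record = dict


def source() -> Iterator[Record]:
    """Emit one record at a time, as a source component would."""
    for name in ["orders", "customers"]:
        yield {"table": name}


def add_owner(record: Record) -> Record:
    """First processor: attach an (assumed) owner field."""
    return {**record, "owner": "data-team"}


def uppercase_table(record: Record) -> Record:
    """Second processor: normalize the table name."""
    return {**record, "table": record["table"].upper()}


def run_workflow(
    records: Iterable[Record],
    processors: List[Callable[[Record], Record]],
    sink: Callable[[Record], None],
) -> None:
    """Run source -> processors (in configured order) -> sink, per record."""
    for record in records:
        for process in processors:
            record = process(record)
        sink(record)


# The sink here simply collects records in a list.
published: List[Record] = []
run_workflow(source(), [add_owner, uppercase_table], published.append)
print(published)
```

Because the source is a generator, each record travels through every processor and reaches the sink before the next record is retrieved, which mirrors the serial execution described in the list.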
When records need to be aggregated, use a Stage to write them to a file or another store. The file written by the Stage is then passed to a BulkSink, which publishes the result to external services such as OpenMetadata or Elasticsearch.
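As a rough sketch of that stage and bulk sink hand-off, the example below stages records to a JSON Lines file and then aggregates the file in a single pass. The function names `stage_records` and `bulk_sink`, the file name, and the record shape are assumptions for illustration; a real bulk sink would publish the aggregate to a service rather than return it.

```python
import json
import os
import tempfile
from collections import Counter
from typing import Dict, Iterable


def stage_records(records: Iterable[dict], path: str) -> None:
    """Stage (hypothetical): write each record to a file as one JSON line."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def bulk_sink(path: str) -> Dict[str, int]:
    """Bulk sink (hypothetical): read the staged file and aggregate it.

    Returns the per-table counts that would be published in one request.
    """
    counts: Counter = Counter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            record = json.loads(line)
            counts[record["table"]] += 1
    return dict(counts)


# Assumed sample input: one record per observed query against a table.
queries = [{"table": "orders"}, {"table": "customers"}, {"table": "orders"}]

with tempfile.TemporaryDirectory() as tmp:
    staged_path = os.path.join(tmp, "staged.jsonl")
    stage_records(queries, staged_path)
    usage = bulk_sink(staged_path)

print(usage)
```

Staging to a file keeps the per-record path simple while still letting the bulk step see every record at once, which is what an aggregation such as a usage count needs.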
{% page-ref page="source.md" %}
{% page-ref page="processor.md" %}
{% page-ref page="sink.md" %}
{% page-ref page="stage.md" %}
{% page-ref page="bulksink.md" %}