| description |
|---|
| This design doc will walk through developing a connector for OpenMetadata |
Build a Connector
Ingestion is a simple Python framework for ingesting metadata from various sources.
Please look at our framework APIs.
Workflow
A workflow is a simple orchestration job that runs its components in order.
A workflow consists of Source, Processor and Sink. It also provides support for Stage and BulkSink.
Workflow execution happens in serial fashion.
- Workflow runs the source component first. The source retrieves a record from external sources and emits the record downstream.
- If the processor component is configured, the workflow sends the record to the processor next.
- There can be multiple processor components attached to the workflow. The workflow passes a record to each processor in the order they are configured.
- Once a processor is finished, it sends the modified record to the sink.
- The above steps are repeated for each record emitted from the source.
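The steps above can be sketched in a few lines of plain Python. This is a minimal illustration of the serial flow only, not the framework's actual API: `run_workflow`, `sample_source`, `add_owner`, and `uppercase_table` are hypothetical names invented for this example, and records are modeled as plain dictionaries.

```python
from typing import Callable, Iterable, Iterator, List

Record = dict  # assumption: a record is modeled as a plain dict


def sample_source() -> Iterator[Record]:
    """Hypothetical source: emits one record at a time."""
    for name in ("orders", "customers"):
        yield {"table": name}


def add_owner(record: Record) -> Record:
    """Hypothetical processor: returns a modified copy of the record."""
    return {**record, "owner": "data-team"}


def uppercase_table(record: Record) -> Record:
    """Hypothetical second processor, applied after add_owner."""
    return {**record, "table": record["table"].upper()}


def run_workflow(
    source: Iterable[Record],
    processors: List[Callable[[Record], Record]],
    sink: Callable[[Record], None],
) -> None:
    """Run source -> processors (in configured order) -> sink, one record at a time."""
    for record in source:
        for process in processors:
            record = process(record)
        sink(record)


# The sink here simply collects records in a list.
published: List[Record] = []
run_workflow(sample_source(), [add_owner, uppercase_table], published.append)
```

Each record passes through every processor before the next record is read from the source, which is what "serial" means here.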
When records need to be aggregated, use a Stage to write them to a file or another store. The file written by the Stage is then passed to a BulkSink, which publishes to external services such as OpenMetadata or Elasticsearch.
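As a rough sketch of the stage and bulk sink pattern, the example below stages records in a JSON Lines file and then aggregates them in a single pass before publishing. The function names (`stage_records`, `bulk_sink`), the file format, and the per-table count are all assumptions made for illustration and do not reflect the framework's real classes.

```python
import json
import os
import tempfile
from collections import Counter
from typing import Callable, Iterable, List


def stage_records(records: Iterable[dict], path: str) -> None:
    """Hypothetical stage: write each record as one JSON line."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def bulk_sink(path: str, publish: Callable[[dict], None]) -> None:
    """Hypothetical bulk sink: read the staged file, aggregate, publish once."""
    with open(path, encoding="utf-8") as handle:
        records = [json.loads(line) for line in handle if line.strip()]
    # Aggregation needs all records at once, which is why they were staged.
    usage = Counter(record["table"] for record in records)
    publish(dict(usage))


records = [{"table": "orders"}, {"table": "customers"}, {"table": "orders"}]
staged_path = os.path.join(tempfile.mkdtemp(), "staged.jsonl")
stage_records(records, staged_path)

# Stand-in for a call to an external service.
published: List[dict] = []
bulk_sink(staged_path, published.append)
```

Staging to a file separates record collection from publishing, so the aggregate is computed over the full set rather than record by record.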
{% page-ref page="source.md" %}
{% page-ref page="processor.md" %}
{% page-ref page="sink.md" %}
{% page-ref page="stage.md" %}
{% page-ref page="bulksink.md" %}