Adding a Metadata Ingestion Source
:::note
This guide assumes that you've already followed the metadata ingestion developing guide to set up your local environment.
:::
1. Set up the configuration model
We use pydantic for configuration, and all models must inherit from `ConfigModel`. The file source is a good example.
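As a rough illustration of what such a configuration model might look like, here is a minimal sketch. The `MySourceConfig` class and its fields are hypothetical, and `ConfigModel` is redefined locally as a plain pydantic model so the snippet runs on its own; a real source would import `ConfigModel` from the DataHub package instead.

```python
from typing import Optional

from pydantic import BaseModel


class ConfigModel(BaseModel):
    """Stand-in for DataHub's ConfigModel so this sketch runs on its own."""


class MySourceConfig(ConfigModel):
    # Hypothetical settings for an imaginary source.
    host_port: str = "localhost:5432"
    database: Optional[str] = None
    include_views: bool = True


# pydantic validates and coerces the raw dict that a recipe would supply.
config = MySourceConfig(**{"database": "sales", "include_views": "false"})
print(config.database, config.include_views)
```

Fields with defaults are optional in the recipe, and pydantic raises a validation error when a value cannot be coerced to the declared type.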
2. Set up the reporter
The reporter interface enables the source to report statistics, warnings, failures, and other information about the run. Some sources use the default `SourceReport` class, but others inherit and extend that class.
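Extending the report could look roughly like the sketch below. The `SourceReport` defined here is a simplified stand-in, not the real class, and `MySourceReport`, `tables_scanned`, `filtered`, and `report_dropped` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SourceReport:
    """Simplified stand-in for the default report class."""

    workunits_produced: int = 0
    warnings: Dict[str, List[str]] = field(default_factory=dict)

    def report_warning(self, key: str, reason: str) -> None:
        # Group warning messages under the entity they refer to.
        self.warnings.setdefault(key, []).append(reason)


@dataclass
class MySourceReport(SourceReport):
    """Adds source-specific counters on top of the default fields."""

    tables_scanned: int = 0
    filtered: List[str] = field(default_factory=list)

    def report_dropped(self, name: str) -> None:
        self.filtered.append(name)


report = MySourceReport()
report.tables_scanned += 1
report.report_dropped("tmp_scratch")
report.report_warning("orders", "missing column comments")
```

The subclass keeps everything the default report tracks, so code that only knows about the base class can still read it.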
3. Implement the source itself
The core of the source is the `get_workunits` method, which produces a stream of MCE (metadata change event) objects. The file source is a good, simple example.
The MetadataChangeEventClass is defined in the metadata models. There are also some convenience methods for commonly used operations.
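To make the "stream" idea concrete, here is a minimal sketch of a `get_workunits` method written as a generator. It does not use the real DataHub base classes: `WorkUnit` and `MyFileSource` are hypothetical stand-ins, and a plain dict takes the place of an actual `MetadataChangeEventClass` instance.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Iterator


@dataclass
class WorkUnit:
    """Hypothetical wrapper pairing an ID with one MCE-like payload."""

    id: str
    mce: Dict[str, Any]


class MyFileSource:
    """Illustrative source that reads one JSON object per line."""

    def __init__(self, lines: Iterable[str]) -> None:
        self.lines = lines
        self.workunits_produced = 0

    def get_workunits(self) -> Iterator[WorkUnit]:
        # A generator yields each unit as soon as it is built,
        # so the full input never has to sit in memory at once.
        for index, line in enumerate(self.lines):
            if not line.strip():
                continue  # skip blank lines
            self.workunits_produced += 1
            yield WorkUnit(id=f"file-{index}", mce=json.loads(line))


source = MyFileSource(['{"urn": "a"}', "", '{"urn": "b"}'])
units = list(source.get_workunits())
```

A real source would build proper MCE objects and typically update its report as units are produced, rather than keeping a counter on the source itself.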
4. Set up the dependencies
Declare the source's pip dependencies in the `plugins` variable of the setup script.
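A declaration of this kind might look like the following sketch. It assumes `plugins` maps each plugin name to a set of requirement strings and that the mapping is turned into pip extras; check the existing entries in the setup script for the exact shape. The `my-source` name and its requirements are placeholders.

```python
from typing import Dict, Set

# Hypothetical entry: "my-source" and its requirements are placeholders.
plugins: Dict[str, Set[str]] = {
    "my-source": {"requests", "sqlalchemy"},
}

# One common way to expose such a mapping to pip is through extras,
# so that installing the "my-source" extra pulls in only these packages.
extras_require = {name: sorted(deps) for name, deps in plugins.items()}
```

Keeping dependencies per plugin means users only install what their chosen sources need.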
5. Enable discoverability
Declare the source under the `entry_points` variable of the setup script. This enables the source to be listed when running `datahub check plugins`, and sets up the source's shortened alias for use in recipes.
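The sketch below shows the general shape of such an entry and how the alias relates to the class it points at. The group name, the `my-source` alias, and the `my_package.my_source:MySource` target are all assumptions for illustration; copy the group name that the existing sources in the setup script already use. `split_entry_point` is a hypothetical helper that only demonstrates the string format.

```python
from typing import Dict, List, Tuple

# Assumption: group name and target below are illustrative placeholders.
entry_points: Dict[str, List[str]] = {
    "datahub.ingestion.source.plugins": [
        # "<alias used in recipes> = <module path>:<source class>"
        "my-source = my_package.my_source:MySource",
    ],
}


def split_entry_point(spec: str) -> Tuple[str, str, str]:
    """Break an entry point string into alias, module, and class name."""
    alias, target = (part.strip() for part in spec.split("=", 1))
    module, class_name = target.split(":", 1)
    return alias, module, class_name


parts = split_entry_point(entry_points["datahub.ingestion.source.plugins"][0])
```

The text to the left of `=` is the alias a recipe refers to; the text to the right tells the plugin loader which class to import.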
6. Write tests
Tests go in the `tests` directory. We use the pytest framework.
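A test module for a source could start out like this sketch. The `get_workunits` function here is a hypothetical stand-in defined inline so the example is self-contained; a real test would import the source class under test and compare its output against expected metadata.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def get_workunits(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Hypothetical stand-in for a source's get_workunits method."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


def test_emits_one_workunit_per_record() -> None:
    units = list(get_workunits(['{"urn": "a"}', '{"urn": "b"}']))
    assert [unit["urn"] for unit in units] == ["a", "b"]


def test_skips_blank_lines() -> None:
    assert list(get_workunits(["", "   "])) == []
```

pytest collects any function whose name starts with `test_` in files named `test_*.py`, so no extra registration is needed.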
7. Write docs
Add the plugin to the table at the top of the README file, and add the source's documentation underneath the sources header.