This guide will help you set up the Ingestion framework and connectors.

Python version 3.8+

OpenMetadata Ingestion is a simple framework for building connectors and ingesting metadata from various systems through the OpenMetadata APIs. It can be used in an orchestration framework (e.g. Apache Airflow) to ingest metadata.

Prerequisites

  • Python >= 3.8.x
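
The prerequisite above can be checked before installing anything. The following is a minimal sketch, not part of the OpenMetadata tooling: `check_python_version` is a hypothetical helper name, and it assumes the interpreter is reachable as `python3` unless another path is passed in.

```shell
# Hypothetical helper: exits 0 only when the given interpreter is Python 3.8 or newer.
# The interpreter path defaults to python3 (an assumption about the local setup).
check_python_version() {
    "${1:-python3}" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 8) else 1)'
}

if check_python_version python3; then
    echo "Python version OK"
else
    echo "Python 3.8 or newer is required" >&2
fi
```

The comparison is done inside Python itself, so no parsing of the `python3 --version` string is needed.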

Install From PyPI

python3 -m pip install --upgrade pip wheel setuptools openmetadata-ingestion
python3 -m spacy download en_core_web_sm

Install Ingestion Connector Dependencies

See the Ingestion Connector documentation for the dependencies each connector requires.

Generate Redshift Data

metadata ingest -c ./pipelines/redshift.json

Generate Redshift Usage Data

metadata ingest -c ./pipelines/redshift_usage.json

Generate Sample Tables

metadata ingest -c ./pipelines/sample_tables.json

Generate Sample Users

metadata ingest -c ./pipelines/sample_users.json
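
The sample pipelines above can also be run in sequence from one small wrapper. This is only a sketch: `run_pipelines` is a hypothetical function name, and it relies solely on the `metadata ingest -c <config>` invocation shown in this guide. It stops at the first config file that is missing or fails to ingest.

```shell
# Hypothetical wrapper: run each pipeline config in order and stop at the first failure.
run_pipelines() {
    for config in "$@"; do
        # Fail early with a clear message if the config file is not there.
        if [ ! -f "$config" ]; then
            echo "missing config: $config" >&2
            return 1
        fi
        echo "ingesting $config"
        metadata ingest -c "$config" || return 1
    done
    return 0
}
```

For example, `run_pipelines ./pipelines/sample_tables.json ./pipelines/sample_users.json` would load the sample tables first and then the sample users.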

Ingest MySQL data to Metadata APIs

metadata ingest -c ./pipelines/mysql.json

Ingest BigQuery data to Metadata APIs

export GOOGLE_APPLICATION_CREDENTIALS="$PWD/pipelines/creds/bigquery-cred.json"
metadata ingest -c ./pipelines/bigquery.json
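
Before running the BigQuery pipeline, it can help to confirm that the exported credentials path actually points to a readable file. The sketch below assumes nothing beyond the `GOOGLE_APPLICATION_CREDENTIALS` variable set above; `check_gcp_credentials` is a hypothetical helper name, and it does not validate the contents of the key file.

```shell
# Hypothetical helper: returns 0 only when GOOGLE_APPLICATION_CREDENTIALS
# is set and points to a readable file. It does not inspect the file contents.
check_gcp_credentials() {
    if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
        echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
        return 1
    fi
    if [ ! -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
        echo "credentials file not readable: $GOOGLE_APPLICATION_CREDENTIALS" >&2
        return 1
    fi
    return 0
}
```

It could be chained in front of the ingest command, as in `check_gcp_credentials && metadata ingest -c ./pipelines/bigquery.json`.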

Index Metadata into Elasticsearch

Run the Elasticsearch Docker container

docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.2
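
Elasticsearch usually needs some time to start, and the `docker run` command above stays in the foreground, so a readiness check would typically run from a second terminal. The following is a rough sketch under stated assumptions: `wait_for_elasticsearch` is a hypothetical helper, it assumes `curl` is installed, and it assumes the node answers plain HTTP on `http://localhost:9200` (the port published above).

```shell
# Hypothetical helper: poll a URL once per second until it answers or attempts run out.
# Arguments: URL (default http://localhost:9200) and attempt count (default 30).
wait_for_elasticsearch() {
    url="${1:-http://localhost:9200}"
    attempts="${2:-30}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # --fail makes curl return non-zero on HTTP errors as well as connection errors.
        if curl --silent --fail --max-time 2 "$url" >/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "Elasticsearch did not respond at $url" >&2
    return 1
}
```

Once `wait_for_elasticsearch` returns successfully, the indexing pipeline can be started against the running node.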

Run the ingestion connector

metadata ingest -c ./pipelines/metadata_to_es.json