This guide will help you set up the Ingestion framework and connectors.
Setup Ingestion
Ingestion is a data ingestion library inspired by Apache Gobblin. It can be run from an orchestration framework (e.g. Apache Airflow) to build data for OpenMetadata.
{% hint style="info" %}
Prerequisites

- Python >= 3.8.x
{% endhint %}
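The prerequisite above can be checked before creating the virtual environment. The following is a minimal sketch, not part of the Ingestion tooling: `check_python_version` is a hypothetical helper name, and it assumes the interpreter is passed as the first argument (defaulting to `python3`).

```shell
# Hypothetical helper (not shipped with Ingestion): succeeds when the given
# interpreter reports Python 3.8 or newer, and fails otherwise.
check_python_version() {
    # $1 is the interpreter to test; it defaults to python3.
    "${1:-python3}" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 8) else 1)'
}
```

A possible use is `check_python_version || echo "Python 3.8 or newer is required" >&2` at the top of a setup script.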
Install on your Dev Machine
Install Dependencies
cd ingestion
python3 -m venv env
source env/bin/activate
./ingestion_dependency.sh
You only need to run the above command once.
Known Issues
Fix MySQL lib
sudo ln -s /usr/local/mysql/lib/libmysqlclient.21.dylib /usr/local/lib/libmysqlclient.21.dylib
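If the fix is scripted, it may help to guard the link so that it is only created when the source library exists and the target is not already present. This is a sketch under assumptions: `link_mysql_lib` is a hypothetical helper, and the paths in its comments are simply the ones from the command above.

```shell
# Hypothetical helper: link a library into a target directory only if needed.
link_mysql_lib() {
    src="$1"       # e.g. /usr/local/mysql/lib/libmysqlclient.21.dylib
    dest_dir="$2"  # e.g. /usr/local/lib
    dest="$dest_dir/$(basename "$src")"
    # Refuse to create a dangling link when the source library is missing.
    if [ ! -e "$src" ]; then
        echo "source library not found: $src" >&2
        return 1
    fi
    # Leave an existing file or link untouched.
    if [ -e "$dest" ] || [ -L "$dest" ]; then
        echo "already present: $dest"
        return 0
    fi
    ln -s "$src" "$dest"
}
```

Writing to `/usr/local/lib` usually needs elevated privileges, so a script that calls this helper would likely have to be run with `sudo`, as in the original command.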
Run Ingestion Connectors
Generate Redshift Data
source env/bin/activate
metadata ingest -c ./examples/workflows/redshift.json
Generate Redshift Usage Data
source env/bin/activate
metadata ingest -c ./examples/workflows/redshift_usage.json
Generate Sample Tables
source env/bin/activate
metadata ingest -c ./pipelines/sample_tables.json
Generate Sample Usage
source env/bin/activate
metadata ingest -c ./pipelines/sample_usage.json
Generate Sample Users
source env/bin/activate
metadata ingest -c ./pipelines/sample_users.json
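The three sample workflows above follow the same pattern, so they can be run in one pass. The sketch below assumes the virtual environment is already activated; `run_workflows` and the `INGEST_CMD` variable are hypothetical names introduced for illustration, with `INGEST_CMD` defaulting to the `metadata ingest -c` call used in this guide.

```shell
# Hypothetical helper: run each workflow config in order and stop at the
# first failure. INGEST_CMD is an illustrative override; when unset, the
# documented "metadata ingest -c" command is used.
run_workflows() {
    for config in "$@"; do
        echo "Running $config"
        # Left unquoted on purpose so the command splits into separate words.
        ${INGEST_CMD:-metadata ingest -c} "$config" || return 1
    done
}
```

For example: `run_workflows ./pipelines/sample_tables.json ./pipelines/sample_usage.json ./pipelines/sample_users.json`.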
Ingest MySQL data to Metadata APIs
source env/bin/activate
metadata ingest -c ./pipelines/mysql.json
Ingest BigQuery data to Metadata APIs
source env/bin/activate
metadata ingest -c ./examples/workflows/bigquery.json
Index Metadata into ElasticSearch
Run ElasticSearch docker
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.2
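The ElasticSearch container typically needs some time before it accepts requests, so a script that starts it and then runs the connector may want to wait first. The following sketch polls a URL with `curl`; `wait_for_http` is a hypothetical helper, and the retry count and delay defaults are arbitrary assumptions.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
wait_for_http() {
    url="$1"             # e.g. http://localhost:9200 (the port published above)
    attempts="${2:-30}"  # number of tries (assumed default)
    delay="${3:-2}"      # seconds between tries (assumed default)
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f makes curl fail on HTTP errors, -s silences progress output.
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out waiting for $url" >&2
    return 1
}
```

With the container started in the background, something like `wait_for_http http://localhost:9200 && metadata ingest -c ./pipelines/metadata_to_es.json` would run the connector only once the port responds.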
Run ingestion connector
source env/bin/activate
metadata ingest -c ./pipelines/metadata_to_es.json
Install using Docker
Run Ingestion docker
The OpenMetadata server should be up and running before you run the Ingestion Docker container.
docker build -t ingestion .
docker run --network="host" -t ingestion
Run ElasticSearch docker
Run the following command to start the ElasticSearch Docker container:
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.2
Test - Integration
Run the following commands to start the integration tests:
source env/bin/activate
cd tests/integration/
pytest -c /dev/null {folder-name}
# For example: pytest -c /dev/null mysql
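To see which values `{folder-name}` can take, the sub-folders of the integration test directory can be listed. This is a small sketch; `list_test_folders` is a hypothetical helper that prints every sub-folder of the given directory, including any that do not contain tests.

```shell
# Hypothetical helper: print the name of each sub-folder of a directory,
# one per line. Defaults to the current directory.
list_test_folders() {
    for dir in "${1:-.}"/*/; do
        # Skip the unexpanded pattern when the directory has no sub-folders.
        if [ -d "$dir" ]; then
            basename "$dir"
        fi
    done
}
```

Run from `tests/integration/`, `list_test_folders` prints one candidate folder name per line, each of which can be passed to `pytest -c /dev/null`.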