2021-08-01 14:27:44 -07:00
---
description: This guide will help you set up the Ingestion framework and connectors
---
# Setup Ingestion
Ingestion is a data ingestion library inspired by [Apache Gobblin](https://gobblin.apache.org/). It can be used in an orchestration framework \(e.g. Apache Airflow\) to ingest metadata into OpenMetadata.
{% hint style="info" %}
**Prerequisites**
* Python >= 3.8.x
{% endhint %}
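A short version check can confirm the prerequisite before you install anything. The following is a minimal sketch; the helper name `meets_minimum` is illustrative and is not part of the Ingestion library.

```python
import sys

MIN_VERSION = (3, 8)  # taken from the prerequisite above


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the (major, minor) version is at least `minimum`."""
    # Compare tuples, not strings, so that 3.10 correctly ranks above 3.8
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = sys.version.split()[0]
    if meets_minimum():
        print("Python version OK:", found)
    else:
        print("Python 3.8 or newer is required, found:", found)
```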
## Install on your Dev Machine
### Install Dependencies
```text
cd ingestion
python3 -m venv env
source env/bin/activate
./ingestion_dependency.sh
```
You only need to run the above commands once.
### Known Issues
#### Fix MySQL lib
```text
sudo ln -s /usr/local/mysql/lib/libmysqlclient.21.dylib /usr/local/lib/libmysqlclient.21.dylib
```
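The `.dylib` extension indicates that this fix applies to macOS. Before creating the link, it can help to check whether it is needed at all. The sketch below only inspects the two paths from the command above and changes nothing on disk; `link_needed` is a hypothetical helper, and the paths assume MySQL is installed under `/usr/local/mysql`.

```python
from pathlib import Path

# Paths copied from the command above; adjust them if MySQL lives elsewhere.
SOURCE = Path("/usr/local/mysql/lib/libmysqlclient.21.dylib")
TARGET = Path("/usr/local/lib/libmysqlclient.21.dylib")


def link_needed(source: Path, target: Path) -> bool:
    """Return True when `source` exists and nothing is present at `target`."""
    # is_symlink() also detects a dangling link, which exists() reports as missing
    target_present = target.exists() or target.is_symlink()
    return source.exists() and not target_present


if __name__ == "__main__":
    if link_needed(SOURCE, TARGET):
        print(f"Run: sudo ln -s {SOURCE} {TARGET}")
    else:
        print("No link needed: source library missing or target already present.")
```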
### Run Ingestion Connectors
#### Generate Redshift Data
```text
source env/bin/activate
metadata ingest -c ./examples/workflows/redshift.json
```
#### Generate Redshift Usage Data
```text
source env/bin/activate
metadata ingest -c ./examples/workflows/redshift_usage.json
```
#### Generate Sample Tables
```text
source env/bin/activate
metadata ingest -c ./pipelines/sample_tables.json
```
#### Generate Sample Usage
```text
source env/bin/activate
metadata ingest -c ./pipelines/sample_usage.json
```
#### Generate Sample Users
```text
source env/bin/activate
metadata ingest -c ./pipelines/sample_users.json
```
#### Ingest MySQL data to Metadata APIs
```text
source env/bin/activate
metadata ingest -c ./pipelines/mysql.json
```
#### Ingest BigQuery data to Metadata APIs
```text
source env/bin/activate
metadata ingest -c ./examples/workflows/bigquery.json
```
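All of the connector commands in this section follow the same `metadata ingest -c <config>` pattern, so several workflows can be chained from one script. The following is a rough sketch under two assumptions: the virtual environment is already activated so that `metadata` is on the `PATH`, and each workflow file is plain JSON. The function names `build_ingest_command` and `run_workflows` are illustrative, not part of the Ingestion library.

```python
import json
import subprocess


def build_ingest_command(config_path):
    """Build the `metadata ingest -c <config>` command as an argument list."""
    return ["metadata", "ingest", "-c", str(config_path)]


def run_workflows(config_paths, dry_run=True):
    """Check that each config parses as JSON, then run the commands in order.

    With dry_run=True nothing is executed; the commands are only returned.
    """
    commands = []
    for path in config_paths:
        with open(path, encoding="utf-8") as handle:
            json.load(handle)  # raises json.JSONDecodeError on a malformed file
        command = build_ingest_command(path)
        commands.append(command)
        if not dry_run:
            # check=True stops at the first workflow that exits non-zero
            subprocess.run(command, check=True)
    return commands
```

For example, `run_workflows(["./pipelines/sample_tables.json", "./pipelines/sample_users.json"], dry_run=False)` would run the two sample workflows one after the other and stop if the first one fails.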
#### Index Metadata into ElasticSearch
#### Run ElasticSearch docker
```text
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.2
```
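The container can take a while to start, so it may be worth checking that port 9200 accepts connections before running the connector. The sketch below uses only the standard library; `is_port_open` is a hypothetical helper, and `localhost` assumes Docker publishes the port on the same machine.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False


if __name__ == "__main__":
    # 9200 is the first port published by the docker command above
    print("ElasticSearch port reachable:", is_port_open("localhost", 9200))
```

An open port only shows that something is listening; it does not guarantee that the node has finished starting up.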
#### Run ingestion connector
```text
source env/bin/activate
metadata ingest -c ./pipelines/metadata_to_es.json
```
## Install using Docker
### Run Ingestion docker
The OpenMetadata server should be up and running before you run the ingestion Docker container.
```text
docker build -t ingestion .
docker run --network="host" -t ingestion
```
### Run ElasticSearch docker
Run the following command to start the ElasticSearch Docker container:
```text
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.2
```
## Test - Integration
Run the following commands to start the integration tests, replacing `{folder-name}` with the folder that contains the tests you want to run:
```text
source env/bin/activate
cd tests/integration/
pytest -c /dev/null {folder-name}
# Example: pytest -c /dev/null mysql
```
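To see which values `{folder-name}` can take, one option is to list the sub-folders of `tests/integration/` and print the matching command for each. This is a small sketch; `pytest_commands` is a hypothetical helper, and it assumes every non-hidden sub-folder holds one test suite.

```python
from pathlib import Path


def pytest_commands(integration_dir):
    """Return one `pytest -c /dev/null <folder>` command per sub-folder, sorted by name."""
    folders = sorted(
        entry.name
        for entry in Path(integration_dir).iterdir()
        # Skip plain files plus hidden and cache folders such as __pycache__
        if entry.is_dir() and not entry.name.startswith((".", "_"))
    )
    return [["pytest", "-c", "/dev/null", name] for name in folders]


if __name__ == "__main__":
    # Run from inside tests/integration/, as in the commands above
    for command in pytest_commands("."):
        print(" ".join(command))
```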
## Design