mirror of https://github.com/open-metadata/OpenMetadata.git synced 2026-01-07 13:07:22 +00:00

2021-09-29 16:56:32 +00:00


description: This guide will help you ingest sample data

Ingest Sample Data

Sample Data

We have created sample data so that you can take OpenMetadata for a spin without integrating real data sources. The sample data gives you a taste of what OpenMetadata can do with your own data.

{% hint style="info" %} Prerequisites

OpenMetadata is built using Java, DropWizard, Jetty, and MySQL.

Python 3.7 or above {% endhint %}
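The Python prerequisite can be checked before installing anything. The sketch below is a minimal illustration, not part of OpenMetadata: `python_version_ok` is a hypothetical helper name, and it assumes the version string has the usual `major.minor[.patch]` form.

```shell
# Hypothetical helper (not part of OpenMetadata): succeed when a
# "major.minor[.patch]" version string is 3.7 or newer.
python_version_ok() {
    major="${1%%.*}"     # text before the first dot
    rest="${1#*.}"       # text after the first dot
    minor="${rest%%.*}"  # text before the next dot
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 7 ]; }
}
```

To check the interpreter on your machine, one option is `python_version_ok "$(python3 -c 'import platform; print(platform.python_version())')" && echo "Python is new enough"`.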

Run OpenMetadata Server

Please refer to the Run OpenMetadata section to run the server manually or with Docker.

Install from PyPI

{% tabs %} {% tab title="Install Using PyPI" %}

Download the latest OpenMetadata release from https://github.com/open-metadata/OpenMetadata/releases, then extract it, create and activate a virtual environment, and install the ingestion package:

tar zxvf openmetadata-0.4.0.tar.gz
cd openmetadata-0.4.0/ingestion
python3 -m venv env
source env/bin/activate
python3 -m pip install 'openmetadata-ingestion[sample-data, elasticsearch]'

{% endtab %} {% endtabs %}

Ingest using Sample Pipelines

The sample pipelines ingest sample tables, usage, users, topics, and dashboards. Run them from the ingestion directory:

metadata ingest -c ./pipelines/sample_data.json
metadata ingest -c ./pipelines/sample_usage.json
metadata ingest -c ./pipelines/sample_users.json
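If you prefer to run the pipelines as one step, the commands above can be wrapped in a small function that stops at the first failure. This is only a sketch: `run_pipelines` and the `INGEST_CMD` override are hypothetical names introduced here for illustration (the override makes dry runs possible), and the function assumes the `metadata` CLI is on your `PATH`.

```shell
# Hypothetical wrapper (not part of OpenMetadata): run each pipeline
# config in order and stop at the first one that fails.
# INGEST_CMD is an assumed override for dry runs; it defaults to the
# documented `metadata` CLI.
run_pipelines() {
    for config in "$@"; do
        echo "Ingesting $config"
        ${INGEST_CMD:-metadata} ingest -c "$config" || return 1
    done
}
```

For example, `run_pipelines ./pipelines/sample_data.json ./pipelines/sample_usage.json ./pipelines/sample_users.json` runs the three pipelines in the documented order, and setting `INGEST_CMD=echo` prints the commands instead of executing them.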

Index Sample Data into Elasticsearch

Start the Elasticsearch Docker container:

{% hint style="warning" %} The command below starts an Elasticsearch Docker container that stores the indexed data in memory. If you stop the container, the indexed data is lost. After starting the container again, re-run the metadata_to_es workflow to re-index the data. {% endhint %}

docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.2
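Elasticsearch can take a little while to start accepting requests, so it may help to wait for it before indexing. The sketch below is one way to do that, under stated assumptions: `wait_for`, `es_ready`, and the `WAIT_DELAY` variable are hypothetical names, `curl` is assumed to be installed, and the URL assumes the port mapping from the `docker run` command above.

```shell
# Hypothetical helper: retry a command until it succeeds or the given
# number of attempts is used up. WAIT_DELAY (seconds between attempts)
# is an assumed knob that defaults to 2.
wait_for() {
    attempts="$1"
    shift
    i=0
    while [ "$i" -lt "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep "${WAIT_DELAY:-2}"
    done
    return 1
}

# Succeeds once Elasticsearch answers on the port published above.
es_ready() {
    curl -sf "http://localhost:9200" > /dev/null
}
```

Running `wait_for 30 es_ready && echo "Elasticsearch is up"` polls for roughly a minute with the default delay before giving up.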

Index the sample data in Elasticsearch:

cd openmetadata-0.4.0/ingestion
metadata ingest -c ./pipelines/metadata_to_es.json
