---
description: This guide will help you to ingest sample data
---

# Ingest Sample Data

## Sample Data

We have created sample data so that you can take OpenMetadata for a spin without integrating with real data sources. The goal of the sample data is to give you a taste of what OpenMetadata can do with your real data.

{% hint style="info" %}
**Prerequisites**

OpenMetadata is built using Java, Dropwizard, Jetty, and MySQL.

1. Python 3.7 or above
{% endhint %}

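Before installing, it can help to confirm that the local interpreter meets the Python requirement above. The following is a minimal sketch and not part of OpenMetadata: `check_python_version` is a hypothetical helper name, and it assumes the interpreter is available as `python3` unless another name is passed.

```bash
# Sketch: verify that the local interpreter meets the Python 3.7+ prerequisite.
# check_python_version is a hypothetical helper name, not an OpenMetadata command.
check_python_version() {
  # Assumption: the interpreter is called python3 unless a name is given.
  python_bin="${1:-python3}"
  if ! command -v "$python_bin" >/dev/null 2>&1; then
    echo "$python_bin not found on PATH" >&2
    return 1
  fi
  # sys.version_info compares as a tuple, so (3, 10) ranks above (3, 7).
  if "$python_bin" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 7) else 1)'; then
    echo "OK: $("$python_bin" --version 2>&1)"
  else
    echo "Python 3.7 or above is required, found: $("$python_bin" --version 2>&1)" >&2
    return 1
  fi
}
```

Calling `check_python_version` with no arguments checks `python3`; a non-zero return status means the prerequisite is not met.
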
### Install from PyPI or Source

{% tabs %}
{% tab title="Install Using PyPI" %}
```bash
# Install the ingestion package with the sample-tables and elasticsearch extras
python3 -m pip install 'openmetadata-ingestion[sample-tables, elasticsearch]'
python3 -m spacy download en_core_web_sm

# Clone the repository; the commands in this guide use ./pipelines under ingestion
git clone https://github.com/open-metadata/OpenMetadata.git
cd OpenMetadata/ingestion
```
{% endtab %}
{% endtabs %}

### Ingest Using Sample Pipelines

The sample pipelines cover sample data, tables, usage, users, topics, and dashboards.

```bash
# Run these from the OpenMetadata/ingestion directory so that ./pipelines resolves
metadata ingest -c ./pipelines/sample_tables.json
metadata ingest -c ./pipelines/sample_usage.json
metadata ingest -c ./pipelines/sample_users.json
metadata ingest -c ./pipelines/sample_topics.json
metadata ingest -c ./pipelines/sample_dashboards.json
metadata ingest -c ./pipelines/sample_data.json
```

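The commands above can also be run as a single step. The sketch below loops over the same six pipeline files in the same order and stops at the first failure. `run_sample_pipelines` is a hypothetical helper name, and the sketch assumes that the `metadata` CLI is on your `PATH` and that the current directory is `OpenMetadata/ingestion`.

```bash
# Sketch: run the sample pipelines in the order listed above, stopping on failure.
# run_sample_pipelines is a hypothetical helper name, not an OpenMetadata command.
# Assumption: the current directory is OpenMetadata/ingestion.
run_sample_pipelines() {
  for name in tables usage users topics dashboards data; do
    echo "Ingesting sample_${name}"
    if ! metadata ingest -c "./pipelines/sample_${name}.json"; then
      echo "Pipeline sample_${name} failed; stopping" >&2
      return 1
    fi
  done
}
```

Stopping at the first failure keeps a broken run from being hidden by later output; call `run_sample_pipelines` again once the failing pipeline is fixed.
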
### Index Sample Data into Elasticsearch

Start the Elasticsearch Docker container:

{% hint style="warning" %}
The command below starts an Elasticsearch Docker container that stores the indexed data in memory. If you stop the container, the indexed data is lost. After restarting the container, re-run the metadata\_to\_es workflow to index the data again.
{% endhint %}

```bash
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.2
```

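Elasticsearch can take some time to accept connections after the container starts. As a rough sketch, the helper below polls the cluster health endpoint until it responds. `wait_for_elasticsearch` is a hypothetical name, the default URL assumes the `9200` port mapping from the `docker run` command above, and the sketch assumes `curl` is installed.

```bash
# Sketch: poll Elasticsearch until it answers, or give up after N attempts.
# wait_for_elasticsearch is a hypothetical helper name.
# Assumption: Elasticsearch is published on localhost:9200 as in the command above.
wait_for_elasticsearch() {
  url="${1:-http://localhost:9200}"
  attempts="${2:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors; -s silences progress output.
    if curl -fs "${url}/_cluster/health" >/dev/null; then
      echo "Elasticsearch is reachable at ${url}"
      return 0
    fi
    sleep 2
    i=$((i + 1))
  done
  echo "Elasticsearch did not respond at ${url} after ${attempts} attempts" >&2
  return 1
}
```

Running `wait_for_elasticsearch` before the indexing step avoids starting the workflow against a container that is still booting.
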
Index sample data in Elasticsearch:

```bash
# Skip this cd if you are already in OpenMetadata/ingestion
cd OpenMetadata/ingestion
metadata ingest -c ./pipelines/metadata_to_es.json
```
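
To check that the workflow wrote something, you can list the indices that Elasticsearch currently holds. This is a minimal sketch using the Elasticsearch `_cat/indices` endpoint. `list_es_indices` is a hypothetical helper name, and the default URL assumes the local container started earlier. The index names depend on the workflow configuration and are not shown here.

```bash
# Sketch: list the indices currently held by Elasticsearch.
# list_es_indices is a hypothetical helper name.
# Assumption: Elasticsearch is published on localhost:9200 as in the docker run step.
list_es_indices() {
  url="${1:-http://localhost:9200}"
  # ?v adds a header row to the tabular _cat output.
  curl -fs "${url}/_cat/indices?v"
}
```

If `list_es_indices` prints only a header row, or nothing at all, the indexing workflow has likely not run against this container yet.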