
This guide will help you set up the Ingestion framework and connectors.
OpenMetadata Ingestion is a simple framework for building connectors and ingesting metadata from various systems through the OpenMetadata APIs. It can be used within an orchestration framework (e.g., Apache Airflow) to ingest metadata.
Prerequisites
- Python >= 3.8.x
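The commands below assume a recent Python 3; a quick sanity check, plus an optional virtual environment to isolate the ingestion dependencies (a sketch, not a requirement):

python3 --version            # should report 3.8 or newer
python3 -m venv env          # optional: keep ingestion dependencies isolated
source env/bin/activate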
Install From PyPI
python3 -m pip install --upgrade pip wheel setuptools openmetadata-ingestion
python3 -m spacy download en_core_web_sm
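Once installed, the package provides the metadata CLI used throughout this guide; a quick check that it landed on your PATH:

metadata --help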
Install Ingestion Connector Dependencies
See the Ingestion Connectors documentation for the dependencies each connector requires.
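Connector dependencies are typically installed as optional extras of the openmetadata-ingestion package; for example (the extra names here are assumptions; check the connector docs for the exact ones):

python3 -m pip install "openmetadata-ingestion[mysql]"
python3 -m pip install "openmetadata-ingestion[redshift]"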
Generate Redshift Data
metadata ingest -c ./pipelines/redshift.json
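Each run is driven by the JSON pipeline definition passed via -c. The exact schema varies by connector and release, so treat the following as a minimal sketch of the general shape (source, sink, metadata server), with illustrative hosts and credentials:

cat > ./pipelines/example.json <<'EOF'
{
  "source": {
    "type": "redshift",
    "config": {
      "host_port": "cluster.example.us-east-1.redshift.amazonaws.com:5439",
      "username": "awsuser",
      "password": "...",
      "database": "dev",
      "service_name": "aws_redshift"
    }
  },
  "sink": {
    "type": "metadata-rest",
    "config": {}
  },
  "metadata_server": {
    "type": "metadata-server",
    "config": {
      "api_endpoint": "http://localhost:8585/api",
      "auth_provider_type": "no-auth"
    }
  }
}
EOF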
Generate Redshift Usage Data
metadata ingest -c ./pipelines/redshift_usage.json
Generate Sample Tables
metadata ingest -c ./pipelines/sample_tables.json
Generate Sample Users
metadata ingest -c ./pipelines/sample_users.json
Ingest MySQL data to Metadata APIs
metadata ingest -c ./pipelines/mysql.json
Ingest BigQuery data to Metadata APIs
export GOOGLE_APPLICATION_CREDENTIALS="$PWD/pipelines/creds/bigquery-cred.json"
metadata ingest -c ./pipelines/bigquery.json
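The connector reads the service account key from that environment variable, so it is worth confirming the variable points at a readable file before the run:

test -r "$GOOGLE_APPLICATION_CREDENTIALS" && echo "credentials file found"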
Index Metadata into Elasticsearch
Run the Elasticsearch Docker container
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.2
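Before running the connector, you can confirm the node is up via Elasticsearch's cluster health endpoint on the REST port:

curl "http://localhost:9200/_cluster/health?pretty"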
Run ingestion connector
metadata ingest -c ./pipelines/metadata_to_es.json
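After the run completes, the indices created by the sink should show up in the cat API (the index names depend on the pipeline configuration):

curl "http://localhost:9200/_cat/indices?v"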
Generated sources
We use datamodel-codegen to generate the pydantic classes inside the generated module from the JSON Schemas that define the API and Entities. This tool bases the class name on the title of the JSON Schema (whereas the Java POJOs use the file name). This convention matters for us: standardizing how titles are written lets us build generic code capable of handling multiple Type Variables.
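As an illustration of the title-based naming, an invocation along the following lines regenerates the classes; the input and output paths here are assumptions, not necessarily the repository's actual layout:

datamodel-codegen \
  --input catalog-rest-service/src/main/resources/json/schema \
  --input-file-type jsonschema \
  --output ingestion/src/metadata/generated
# A schema whose "title" is "Table" produces `class Table(BaseModel)`,
# regardless of the schema's file name.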