diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md
index 8b47bc2677b..8197efc4c4e 100644
--- a/docs/SUMMARY.md
+++ b/docs/SUMMARY.md
@@ -10,7 +10,11 @@
   * [Athena](openmetadata/connectors/athena.md)
   * [BigQuery](openmetadata/connectors/bigquery.md)
   * [BigQuery Usage](openmetadata/connectors/bigquery-usage.md)
+  * [Data Model](connectors/data-model/README.md)
+    * [DBT](connectors/data-model/dbt.md)
+    * [MariaDB](connectors/data-model/mariadb.md)
   * [ElasticSearch](openmetadata/connectors/elastic-search.md)
+  * [Glue Catalog](connectors/glue-catalog.md)
   * [Hive](openmetadata/connectors/hive.md)
   * [Kafka](openmetadata/connectors/kafka.md)
   * [Looker](openmetadata/connectors/looker.md)
diff --git a/docs/connectors/data-model/README.md b/docs/connectors/data-model/README.md
new file mode 100644
index 00000000000..3a0cf9facfb
--- /dev/null
+++ b/docs/connectors/data-model/README.md
@@ -0,0 +1,10 @@
+# Data Model
+
+{% content-ref url="dbt.md" %}
+[dbt.md](dbt.md)
+{% endcontent-ref %}
+
+{% content-ref url="mariadb.md" %}
+[mariadb.md](mariadb.md)
+{% endcontent-ref %}
+
diff --git a/docs/connectors/data-model/dbt.md b/docs/connectors/data-model/dbt.md
new file mode 100644
index 00000000000..5c1d15fd482
--- /dev/null
+++ b/docs/connectors/data-model/dbt.md
@@ -0,0 +1,90 @@
+---
+description: This guide will help you install the DBT connector and run it manually
+---
+
+# DBT
+
+{% hint style="info" %}
+**Prerequisites**
+
+OpenMetadata is built using Java, DropWizard, Jetty, and MySQL.
+
+1. Python 3.7 or above
+{% endhint %}
+
+### Install from PyPI
+
+{% tabs %}
+{% tab title="Install Using PyPI" %}
+```bash
+pip install 'openmetadata-ingestion[dbt]'
+```
+{% endtab %}
+{% endtabs %}
+
+### Run Manually
+
+```bash
+metadata ingest -c ./examples/workflows/dbt.json
+```
+
+### Configuration
+
+{% code title="dbt.json" %}
+```javascript
+{
+  "source": {
+    "type": "dbt",
+    "config": {
+      "service_name": "bigquery",
+      "service_type": "BigQuery",
+      "catalog_file": "./examples/sample_data/dbt/catalog.json",
+      "manifest_file": "./examples/sample_data/dbt/manifest.json",
+      "run_results_file": "./examples/sample_data/dbt/run_results.json",
+      "database": "shopify"
+    }
+  }
+  ...
+```
+{% endcode %}
+
+1. **service\_name** - Service name for the data source the dbt models run against (a BigQuery service in this example). If you added the service through the OpenMetadata UI, make sure the service name matches.
+2. **catalog\_file** - Path to the dbt `catalog.json` file.
+3. **manifest\_file** - Path to the dbt `manifest.json` file.
+4. **run\_results\_file** - Path to the dbt `run_results.json` file.
+5. **database** - Name of the database the dbt models belong to.
+
+## Publish to OpenMetadata
+
+Below is the configuration to publish DBT data into the OpenMetadata service.
+
+Optionally, add the `pii` processor and the `metadata-rest` sink along with the `metadata-server` config.
+
+{% code title="dbt.json" %}
+```javascript
+{
+  "source": {
+    "type": "dbt",
+    "config": {
+      "service_name": "bigquery",
+      "service_type": "BigQuery",
+      "catalog_file": "./examples/sample_data/dbt/catalog.json",
+      "manifest_file": "./examples/sample_data/dbt/manifest.json",
+      "run_results_file": "./examples/sample_data/dbt/run_results.json",
+      "database": "shopify"
+    }
+  },
+  "sink": {
+    "type": "metadata-rest",
+    "config": {}
+  },
+  "metadata_server": {
+    "type": "metadata-server",
+    "config": {
+      "api_endpoint": "http://localhost:8585/api",
+      "auth_provider_type": "no-auth"
+    }
+  }
+}
+```
+{% endcode %}
diff --git a/docs/connectors/data-model/mariadb.md b/docs/connectors/data-model/mariadb.md
new file mode 100644
index 00000000000..8c6f992f906
--- /dev/null
+++ b/docs/connectors/data-model/mariadb.md
@@ -0,0 +1,94 @@
+---
+description: This guide will help you install the MariaDB connector and run it manually
+---
+
+# MariaDB
+
+{% hint style="info" %}
+**Prerequisites**
+
+OpenMetadata is built using Java, DropWizard, Jetty, and MySQL.
+
+1. Python 3.7 or above
+{% endhint %}
+
+### Install from PyPI
+
+{% tabs %}
+{% tab title="Install Using PyPI" %}
+```bash
+pip install 'openmetadata-ingestion[mysql]'
+```
+{% endtab %}
+{% endtabs %}
+
+### Run Manually
+
+```bash
+metadata ingest -c ./examples/workflows/mariadb.json
+```
+
+### Configuration
+
+{% code title="mariadb.json" %}
+```javascript
+{
+  "source": {
+    "type": "mariadb",
+    "config": {
+      "username": "openmetadata_user",
+      "password": "openmetadata_password",
+      "database": "openmetadata_db",
+      "service_name": "local_mysql",
+      "filter_pattern": {
+        "excludes": ["mysql.*", "information_schema.*", "performance_schema.*", "sys.*"]
+      }
+    }
+  },
+  ...
+```
+{% endcode %}
+
+1. **username** - The MariaDB username.
+2. **password** - Password for that username.
+3. **service\_name** - Service Name for this MariaDB cluster. If you added the MariaDB cluster through the OpenMetadata UI, make sure the service name matches.
+4. **filter\_pattern** - Contains `includes` and `excludes` options to choose, by pattern, which datasets you want to ingest into OpenMetadata.
+5. **data\_profiler\_enabled** - Enable data profiling (optional). It will provide you with the newly ingested data.
+6. **data\_profiler\_offset** - Specify the offset.
+7. **data\_profiler\_limit** - Specify the limit.
+
+## Publish to OpenMetadata
+
+Below is the configuration to publish MariaDB data into the OpenMetadata service.
+
+Optionally, add the `pii` processor and the `metadata-rest` sink along with the `metadata-server` config.
+
+{% code title="mariadb.json" %}
+```javascript
+{
+  "source": {
+    "type": "mariadb",
+    "config": {
+      "username": "openmetadata_user",
+      "password": "openmetadata_password",
+      "database": "openmetadata_db",
+      "service_name": "local_mysql",
+      "filter_pattern": {
+        "excludes": ["mysql.*", "information_schema.*", "performance_schema.*", "sys.*"]
+      }
+    }
+  },
+  "sink": {
+    "type": "metadata-rest",
+    "config": {}
+  },
+  "metadata_server": {
+    "type": "metadata-server",
+    "config": {
+      "api_endpoint": "http://localhost:8585/api",
+      "auth_provider_type": "no-auth"
+    }
+  }
+}
+```
+{% endcode %}
diff --git a/docs/connectors/glue-catalog.md b/docs/connectors/glue-catalog.md
new file mode 100644
index 00000000000..3997f3c1df8
--- /dev/null
+++ b/docs/connectors/glue-catalog.md
@@ -0,0 +1,93 @@
+---
+description: This guide will help you install the Glue connector and run it manually
+---
+
+# Glue Catalog
+
+{% hint style="info" %}
+**Prerequisites**
+
+OpenMetadata is built using Java, DropWizard, Jetty, and MySQL.
+
+1. Python 3.7 or above
+{% endhint %}
+
+### Install from PyPI
+
+{% tabs %}
+{% tab title="Install Using PyPI" %}
+```bash
+pip install 'openmetadata-ingestion[glue]'
+```
+{% endtab %}
+{% endtabs %}
+
+### Run Manually
+
+```bash
+metadata ingest -c ./examples/workflows/glue.json
+```
+
+### Configuration
+
+{% code title="glue.json" %}
+```javascript
+{
+  "source": {
+    "type": "glue",
+    "config": {
+      "aws_access_key_id": "aws_access_key_id",
+      "aws_secret_access_key": "aws_secret_access_key",
+      "db_service_name": "local_glue_db",
+      "pipeline_service_name": "local_glue_pipeline",
+      "region_name": "region_name",
+      "endpoint_url": "endpoint_url",
+      "service_name": "local_glue"
+    }
+  },
+...
+```
+{% endcode %}
+
+1. **aws\_access\_key\_id** - Access Key for AWS.
+2. **aws\_secret\_access\_key** - Secret Key for AWS.
+3. **db\_service\_name** - Service Name for this Glue Database cluster.
+4. **pipeline\_service\_name** - Service Name for this Glue Pipeline cluster.
+5. **region\_name** - AWS account region.
+6. **endpoint\_url** - Service Endpoints from [AWS](https://docs.aws.amazon.com/general/latest/gr/glue.html).
+
+## Publish to OpenMetadata
+
+Below is the configuration to publish Glue data into the OpenMetadata service.
+
+Optionally, add the `pii` processor and the `metadata-rest` sink along with the `metadata-server` config.
+
+{% code title="glue.json" %}
+```javascript
+{
+  "source": {
+    "type": "glue",
+    "config": {
+      "aws_access_key_id": "aws_access_key_id",
+      "aws_secret_access_key": "aws_secret_access_key",
+      "db_service_name": "local_glue_db",
+      "pipeline_service_name": "local_glue_pipeline",
+      "region_name": "region_name",
+      "endpoint_url": "endpoint_url",
+      "service_name": "local_glue"
+    }
+  },
+  "sink": {
+    "type": "metadata-rest",
+    "config": {}
+  },
+  "metadata_server": {
+    "type": "metadata-server",
+    "config": {
+      "api_endpoint": "http://localhost:8585/api",
+      "auth_provider_type": "no-auth"
+    }
+  }
+}
+```
+{% endcode %}
diff --git a/docs/install/metadata-ingestion/connectors/README.md b/docs/install/metadata-ingestion/connectors/README.md
index e4ee32c79c7..94a7ca715a1 100644
--- a/docs/install/metadata-ingestion/connectors/README.md
+++ b/docs/install/metadata-ingestion/connectors/README.md
@@ -7,7 +7,11 @@ OpenMetadata supports connectors to some popular services. We will continue as a
 * [Athena](../../../openmetadata/connectors/athena.md)
 * [BigQuery](../../../openmetadata/connectors/bigquery.md)
 * [BigQuery Usage](../../../openmetadata/connectors/bigquery-usage.md)
+* [Data Model](../../../connectors/data-model/)
+  * [DBT](../../../connectors/data-model/dbt.md)
+  * [MariaDB](../../../connectors/data-model/mariadb.md)
 * [ElasticSearch](../../../openmetadata/connectors/elastic-search.md)
+* Glue Catalog
 * [MSSQL](../../../openmetadata/connectors/mssql.md)
 * [MySQL](../../../openmetadata/connectors/mysql.md)
 * [Hive](../../../openmetadata/connectors/hive.md)
@@ -38,3 +42,4 @@ OpenMetadata supports connectors to some popular services. We will continue as a
 
 * Airflow
 * Prefect
+* Glue
diff --git a/docs/install/metadata-ingestion/ingest-sample-data.md b/docs/install/metadata-ingestion/ingest-sample-data.md
index 82c6dd8f437..4c9c07730b7 100644
--- a/docs/install/metadata-ingestion/ingest-sample-data.md
+++ b/docs/install/metadata-ingestion/ingest-sample-data.md
@@ -36,10 +36,9 @@ Sample Data, Tables, Usage, Users, Topics, and Dashboards.
 
 ```bash
 #Make sure the OpenMetadata Server is up and running
-cd openmetadata-0.5.0/ingestion
+cd openmetadata-0.6.0/ingestion
 metadata ingest -c ./pipelines/sample_data.json
 metadata ingest -c ./pipelines/sample_usage.json
-metadata ingest -c ./pipelines/sample_users.json
 ```
 
 ### Index Sample Data into ElasticSearch
@@ -58,6 +57,6 @@ Index sample data in ElasticSearch:
 
 ```bash
 #Make sure the OpenMetadata Server is up and running
-cd openmetadata-0.5.0/ingestion
+cd openmetadata-0.6.0/ingestion
 metadata ingest -c ./pipelines/metadata_to_es.json
 ```
diff --git a/docs/install/run-in-production.md b/docs/install/run-in-production.md
index 20bec90d07b..591700b52da 100644
--- a/docs/install/run-in-production.md
+++ b/docs/install/run-in-production.md
@@ -21,7 +21,7 @@ Please refer to the previous section [Run Openmetadata](run-openmetadata.md) for
 OpenMetadata release ships with `./bin/openmetadata` init.d style script.
 
 ```
-cd openmetdata-0.5.0
+cd openmetdata-0.6.0
 ./bin/openmetdata start
 ```
 
diff --git a/docs/install/run-openmetadata.md b/docs/install/run-openmetadata.md
index d5d2eea34a2..8bd7f8985c5 100644
--- a/docs/install/run-openmetadata.md
+++ b/docs/install/run-openmetadata.md
@@ -51,7 +51,8 @@ Preferences -> Resources -> Advanced
 Create a new directory for OpenMetadata and navigate into that directory.
 
 ```
-mkdir openmetadata-docker; cd openmetadata-docker
+mkdir openmetadata-docker
+cd openmetadata-docker
 ```
 
 ### 2. Create a Python virtual environment
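
Review note: the three new connector pages share one workflow layout in their "Publish to OpenMetadata" examples (`source`, `sink`, and `metadata_server`, each with a `type` and a `config` object). The sketch below is one possible pre-flight check for such a file before passing it to `metadata ingest -c`. It is not part of this change or of the `openmetadata-ingestion` package: `check_workflow` and `REQUIRED_SECTIONS` are hypothetical names, and the rule that every section needs both keys is an assumption drawn only from the examples in this diff.

```python
import json
from typing import List

# Assumption: the top-level sections used by the "Publish to OpenMetadata" examples.
REQUIRED_SECTIONS = ("source", "sink", "metadata_server")


def check_workflow(text: str) -> List[str]:
    """Return the problems found in a workflow JSON string (hypothetical helper)."""
    try:
        workflow = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(workflow, dict):
        return ["top level must be a JSON object"]

    problems = []
    for section in REQUIRED_SECTIONS:
        entry = workflow.get(section)
        if not isinstance(entry, dict):
            problems.append(f"missing section: {section}")
            continue
        # Each section in the examples carries a string "type" and an object "config".
        if not isinstance(entry.get("type"), str):
            problems.append(f"{section}: missing 'type'")
        if not isinstance(entry.get("config"), dict):
            problems.append(f"{section}: missing 'config'")
    return problems
```

Calling `check_workflow(open("./examples/workflows/dbt.json").read())` returns an empty list when all three sections are present and well-formed. The truncated "Configuration" snippets, which end in `...`, are reported as invalid JSON, so only the full examples pass.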