mirror of
https://github.com/open-metadata/OpenMetadata.git
synced 2026-01-05 03:56:35 +00:00
| description |
|---|
| This guide will help you install the Redshift connector and run it manually. |
Redshift
{% hint style="info" %} Prerequisites
OpenMetadata is built using Java, DropWizard, Jetty, and MySQL.
- Python 3.7 or above
- OpenMetadata Server up and running
{% endhint %}
Install from PyPI
pip install 'openmetadata-ingestion[redshift]'
Run Manually
metadata ingest -c ./examples/workflows/redshift.json
Configuration
{% code title="redshift.json" %}
{
  "source": {
    "type": "redshift",
    "config": {
      "host_port": "cluster.name.region.redshift.amazonaws.com:5439",
      "username": "username",
      "password": "strong_password",
      "database": "warehouse",
      "service_name": "aws_redshift",
      "data_profiler_enabled": "true",
      "data_profiler_offset": "0",
      "data_profiler_limit": "50000",
      "table_filter_pattern": {
        "excludes": ["demo.*","orders.*"]
      },
      "schema_filter_pattern": {
        "excludes": ["information_schema.*"]
      }
    },
    ...
{% endcode %}
- username - The Redshift username.
- password - The password for the Redshift user.
- service_name - The service name for this Redshift cluster. If you added the Redshift cluster through the OpenMetadata UI, make sure this value matches the service name used there.
- schema_filter_pattern - Contains includes and excludes options that select, by pattern, which schemas are ingested into OpenMetadata.
- table_filter_pattern - Contains includes and excludes options that select, by pattern, which tables are ingested into OpenMetadata.
- database - The name of the database to fetch data from.
- data_profiler_enabled - Enables data profiling (optional), which provides a profile of the newly ingested data.
- data_profiler_offset - The offset for the data profiler.
- data_profiler_limit - The limit for the data profiler.
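To make the includes/excludes semantics concrete, here is a minimal Python sketch of how regular-expression filter patterns like the ones above could be applied to a list of table names. It is an illustration only, not the connector's actual implementation: the helper `is_filtered_out` and the sample table names are hypothetical, and the sketch assumes each pattern is matched from the start of the name with `re.match`.

```python
import re
from typing import Iterable, List, Optional


def is_filtered_out(
    name: str,
    includes: Optional[Iterable[str]] = None,
    excludes: Optional[Iterable[str]] = None,
) -> bool:
    """Return True if `name` should be skipped (hypothetical helper)."""
    # Any matching exclude pattern drops the name.
    if excludes and any(re.match(p, name) for p in excludes):
        return True
    # If includes are given, the name must match at least one of them.
    if includes and not any(re.match(p, name) for p in includes):
        return True
    return False


# Hypothetical table names, filtered with the patterns from redshift.json.
tables = ["demo_users", "orders_2021", "customers", "payments"]
table_excludes = ["demo.*", "orders.*"]
kept: List[str] = [t for t in tables if not is_filtered_out(t, excludes=table_excludes)]
print(kept)  # prints ['customers', 'payments']
```

Under these assumptions, a pattern such as `demo.*` drops every name beginning with `demo`, so it is worth checking patterns against real schema and table names before running an ingestion.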
Publish to OpenMetadata
The configuration below publishes Redshift metadata to the OpenMetadata service.
Add the metadata-rest sink along with the metadata-server config:
{% code title="redshift.json" %}
{
  "source": {
    "type": "redshift",
    "config": {
      "host_port": "cluster.name.region.redshift.amazonaws.com:5439",
      "username": "username",
      "password": "strong_password",
      "database": "warehouse",
      "service_name": "aws_redshift",
      "data_profiler_enabled": "true",
      "data_profiler_offset": "0",
      "data_profiler_limit": "50000",
      "table_filter_pattern": {
        "excludes": ["demo.*","orders.*"]
      },
      "schema_filter_pattern": {
        "excludes": ["information_schema.*"]
      }
    }
  },
  "sink": {
    "type": "metadata-rest",
    "config": {}
  },
  "metadata_server": {
    "type": "metadata-server",
    "config": {
      "api_endpoint": "http://localhost:8585/api",
      "auth_provider_type": "no-auth"
    }
  }
}
{% endcode %}
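Before running `metadata ingest`, it can help to confirm that the file parses as JSON and contains the `source`, `sink`, and `metadata_server` sections used in this guide. The following is a minimal sketch using only the Python standard library. The function name `check_workflow_config` and the trimmed-down sample are illustrative assumptions, not part of the openmetadata-ingestion package.

```python
import json
from typing import List

# Top-level sections used by the workflow file in this guide.
REQUIRED_SECTIONS = ("source", "sink", "metadata_server")


def check_workflow_config(text: str) -> List[str]:
    """Parse JSON text and return the names of missing top-level sections."""
    # json.loads raises json.JSONDecodeError if the text is malformed,
    # for example when a closing brace is missing.
    config = json.loads(text)
    return [key for key in REQUIRED_SECTIONS if key not in config]


# A shortened, hypothetical stand-in for the contents of redshift.json.
sample = """
{
  "source": {"type": "redshift", "config": {"service_name": "aws_redshift"}},
  "sink": {"type": "metadata-rest", "config": {}},
  "metadata_server": {
    "type": "metadata-server",
    "config": {
      "api_endpoint": "http://localhost:8585/api",
      "auth_provider_type": "no-auth"
    }
  }
}
"""
missing = check_workflow_config(sample)
print(missing)  # prints []
```

To check a real file, read its contents first (for example with `open("redshift.json").read()`) and pass the text to the function. An empty list means all three sections are present.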