mirror of
				https://github.com/open-metadata/OpenMetadata.git
				synced 2025-10-26 16:22:09 +00:00 
			
		
		
		
			* updated documentation * addressing pyline findings * addressing pyline findings * doc update Co-authored-by: parthp2107 <parth@getcollate.io>
		
			
				
	
	
	
		
| description | 
|---|
| This guide will help you install the Redshift Usage connector and run it manually. | 
Redshift Usage
{% hint style="info" %} Prerequisites
OpenMetadata is built using Java, DropWizard, Jetty, and MySQL.
- Python 3.7 or above {% endhint %}
Install from PyPI or Source
{% tabs %} {% tab title="Install Using PyPI" %}
pip install 'openmetadata-ingestion[redshift-usage]'
{% endtab %} {% endtabs %}
Run Manually
metadata ingest -c ./examples/workflows/redshift_usage.json
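If you prefer to launch the same command from a script, a minimal Python sketch might look like the following. The helper names `build_ingest_command` and `run_ingest` are hypothetical and not part of OpenMetadata; the sketch only wraps the `metadata ingest -c` command shown above and assumes the `metadata` executable is on your `PATH`.

```python
import shutil
import subprocess
from pathlib import Path


def build_ingest_command(config_path):
    """Return the argument list for the CLI call shown above."""
    return ["metadata", "ingest", "-c", str(config_path)]


def run_ingest(config_path):
    """Run the ingestion CLI and return its exit code (hypothetical helper)."""
    if not Path(config_path).is_file():
        raise FileNotFoundError(f"workflow config not found: {config_path}")
    if shutil.which("metadata") is None:
        # Assumption: the CLI is installed by the pip command in this guide.
        raise RuntimeError("the 'metadata' command is not on PATH")
    return subprocess.run(build_ingest_command(config_path), check=False).returncode


if __name__ == "__main__":
    print(build_ingest_command("./examples/workflows/redshift_usage.json"))
```

Checking for the config file and the executable up front gives a clearer error than a failed subprocess call.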
Configuration
{% code title="redshift_usage.json" %}
{
  "source": {
    "type": "redshift-usage",
    "config": {
      "host_port": "cluster.name.region.redshift.amazonaws.com:5439",
      "username": "username",
      "password": "strong_password",
      "database": "warehouse",
      "where_clause": "and q.label != 'metrics' and q.label != 'health' and q.label != 'cmstats'",
      "service_name": "aws_redshift",
      "duration": 2
    }
  },
 ...
{% endcode %}
- username - the Redshift username. We recommend creating a user with read-only permissions to all the databases in your Redshift installation.
- password - the password for the username.
- service_name - the service name for this Redshift cluster. If you added the Redshift cluster through the OpenMetadata UI, make sure the service name here matches the one you entered there.
- filter_pattern - contains `includes` and `excludes` options that select which datasets you want to ingest into OpenMetadata.
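Before running the workflow, it can help to check that the source section is filled in. The sketch below is illustrative only: the list of expected keys is an assumption taken from the sample config above, not an official schema, and the helpers `missing_source_keys` and `split_host_port` are hypothetical names.

```python
import json

# Assumption: keys copied from the sample config in this guide, not a schema.
EXPECTED_SOURCE_KEYS = ("host_port", "username", "password", "database", "service_name")


def missing_source_keys(workflow):
    """List expected source config keys that are absent or empty."""
    config = workflow.get("source", {}).get("config", {})
    return [key for key in EXPECTED_SOURCE_KEYS if not config.get(key)]


def split_host_port(host_port):
    """Split a 'host:port' string into (host, port as int)."""
    host, _, port = host_port.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {host_port!r}")
    return host, int(port)


SAMPLE = json.loads("""
{
  "source": {
    "type": "redshift-usage",
    "config": {
      "host_port": "cluster.name.region.redshift.amazonaws.com:5439",
      "username": "username",
      "password": "strong_password",
      "database": "warehouse",
      "service_name": "aws_redshift",
      "duration": 2
    }
  }
}
""")

if __name__ == "__main__":
    print(missing_source_keys(SAMPLE))
    print(split_host_port(SAMPLE["source"]["config"]["host_port"]))
```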
Publish to OpenMetadata
Below is the configuration to publish Redshift Usage data into the OpenMetadata service.
Optionally, add the query-parser processor, the table-usage stage, and the metadata-usage bulk_sink, along with the metadata-server config.
{% code title="redshift_usage.json" %}
{
  "source": {
    "type": "redshift-usage",
    "config": {
      "host_port": "cluster.name.region.redshift.amazonaws.com:5439",
      "username": "username",
      "password": "strong_password",
      "database": "warehouse",
      "where_clause": "and q.label != 'metrics' and q.label != 'health' and q.label != 'cmstats'",
      "service_name": "aws_redshift",
      "duration": 2
    }
  },
  "processor": {
    "type": "query-parser",
    "config": {
      "filter": ""
    }
  },
  "stage": {
    "type": "table-usage",
    "config": {
      "filename": "/tmp/redshift_usage"
    }
  },
  "bulk_sink": {
    "type": "metadata-usage",
    "config": {
      "filename": "/tmp/redshift_usage"
    }
  },
  "metadata_server": {
    "type": "metadata-server",
    "config": {
      "api_endpoint": "http://localhost:8585/api",
      "auth_provider_type": "no-auth"
    }
  },
  "cron": {
    "minute": "*/5",
    "hour": null,
    "day": null,
    "month": null,
    "day_of_week": null
  }
}
{% endcode %}
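In the configuration above, the table-usage stage and the metadata-usage bulk_sink both use the same `filename`. A small sketch, assuming the bulk sink reads the file that the stage writes, can flag a mismatch before a run. The helper `usage_pipeline_problems` and the set of sections it treats as required are assumptions made for illustration, not OpenMetadata behavior.

```python
# Assumption: these sections are treated as required for a usage workflow;
# the processor section is left optional here.
REQUIRED_SECTIONS = ("source", "stage", "bulk_sink", "metadata_server")


def usage_pipeline_problems(workflow):
    """Return a list of human-readable problems found in a workflow dict."""
    problems = [
        f"missing section: {name}"
        for name in REQUIRED_SECTIONS
        if name not in workflow
    ]
    stage_file = workflow.get("stage", {}).get("config", {}).get("filename")
    sink_file = workflow.get("bulk_sink", {}).get("config", {}).get("filename")
    if stage_file != sink_file:
        problems.append(
            f"stage writes {stage_file!r} but bulk_sink reads {sink_file!r}"
        )
    return problems


if __name__ == "__main__":
    workflow = {
        "source": {"type": "redshift-usage", "config": {}},
        "stage": {"type": "table-usage", "config": {"filename": "/tmp/redshift_usage"}},
        "bulk_sink": {"type": "metadata-usage", "config": {"filename": "/tmp/redshift_usage"}},
        "metadata_server": {"type": "metadata-server", "config": {}},
    }
    print(usage_pipeline_problems(workflow))
```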