# Redshift

For context on getting started with ingestion, check out our [metadata ingestion guide](../README.md).

## Setup

To install this plugin, run `pip install 'acryl-datahub[redshift]'`.

### Prerequisites

This source needs to access system tables that require extra permissions.
To grant these permissions, alter your DataHub Redshift user as follows:

```sql
ALTER USER datahub_user WITH SYSLOG ACCESS UNRESTRICTED;
```

:::note

Giving a user unrestricted access to system tables gives the user visibility to data generated by other users. For example, STL_QUERY and STL_QUERYTEXT contain the full text of INSERT, UPDATE, and DELETE statements.

:::

## Capabilities

This plugin extracts the following:

- Metadata for databases, schemas, views and tables
- Column types associated with each table
  - Also supports PostGIS extensions
- Table, row, and column statistics via optional [SQL profiling](./sql_profiles.md)
- Table lineage

:::tip

You can also get fine-grained usage statistics for Redshift using the `redshift-usage` source described below.

:::

| Capability        | Status | Details                                  |
|-------------------|--------|------------------------------------------|
| Platform Instance | ✔️     | [link](../../docs/platform-instances.md) |
| Data Containers   | ✔️     |                                          |
| Data Domains      | ✔️     | [link](../../docs/domains.md)            |

## Quickstart recipe

Check out the following recipe to get started with ingestion! See [below](#config-details) for full configuration options.

For general pointers on writing and running a recipe, see our [main recipe guide](../README.md#recipes).

```yml
source:
  type: redshift
  config:
    # Coordinates
    host_port: example.something.us-west-2.redshift.amazonaws.com:5439
    database: DemoDatabase

    # Credentials
    username: user
    password: pass

    # Options
    options:
      # driver_option: some-option

    include_views: True # whether to include views, defaults to True
    include_tables: True # whether to include tables, defaults to True

sink:
  # sink configs
```
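The same recipe can also be expressed as a Python dictionary and run in-process rather than from a YAML file. The following is a minimal sketch, not the only way to do this: `build_redshift_recipe` and `run_recipe` are hypothetical helper names, and the `datahub-rest` sink pointing at `http://localhost:8080` is an assumption standing in for whatever sink you actually use. The source keys mirror the YAML recipe above.

```python
from typing import Any, Dict


def build_redshift_recipe(
    host_port: str, database: str, username: str, password: str
) -> Dict[str, Any]:
    """Return the quickstart recipe as a plain dict (same keys as the YAML)."""
    return {
        "source": {
            "type": "redshift",
            "config": {
                "host_port": host_port,
                "database": database,
                "username": username,
                "password": password,
                "include_views": True,
                "include_tables": True,
            },
        },
        # Assumption: a DataHub REST sink on localhost; replace with your own sink.
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},
        },
    }


def run_recipe(recipe: Dict[str, Any]) -> None:
    """Run the recipe in-process; needs `pip install 'acryl-datahub[redshift]'`."""
    # Imported lazily so the recipe dict can be built without DataHub installed.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()  # raise if the run reported failures
```

A typical call would be `run_recipe(build_redshift_recipe(host_port, database, username, password))`, with credentials read from the environment rather than hard-coded.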
### Extra options when running Redshift behind a proxy

This requires you to have already installed the Microsoft ODBC Driver for SQL Server.
See https://docs.microsoft.com/en-us/sql/connect/python/pyodbc/step-1-configure-development-environment-for-pyodbc-python-development?view=sql-server-ver15

```yml
source:
  type: redshift
  config:
    host_port: my-proxy-hostname:5439

    options:
      connect_args:
        sslmode: "prefer" # or "require" or "verify-ca"
        sslrootcert: ~ # needed to unpin the AWS Redshift certificate

sink:
  # sink configs
```
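The proxy settings sit three levels deep under the source config (`options`, then `connect_args`, then `sslmode`), which is easy to get wrong when a recipe is assembled in code. As a rough sketch, the hypothetical helper `set_nested` below writes a value at a dotted path such as `options.connect_args.sslmode` into a config dictionary. It is illustrative only and not part of DataHub.

```python
from typing import Any, Dict


def set_nested(config: Dict[str, Any], dotted_path: str, value: Any) -> None:
    """Set config[a][b][c] = value for dotted_path 'a.b.c', creating dicts as needed."""
    *parents, leaf = dotted_path.split(".")
    node = config
    for key in parents:
        child = node.get(key)
        # A YAML key with no value (e.g. a bare `options:`) loads as None,
        # so replace anything that is not a dict with a fresh one.
        if not isinstance(child, dict):
            child = {}
            node[key] = child
        node = child
    node[leaf] = value


# Mirror the proxy recipe's source config shown above.
source_config: Dict[str, Any] = {"host_port": "my-proxy-hostname:5439"}
set_nested(source_config, "options.connect_args.sslmode", "prefer")
set_nested(source_config, "options.connect_args.sslrootcert", None)  # YAML `~` is null
```

After these calls, `source_config` has the same nested shape as the `config` block of the proxy recipe.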
## Config details

Like all SQL-based sources, the Redshift integration supports:

- Stale Metadata Deletion: See [here](./stateful_ingestion.md) for more details on configuration.
- SQL Profiling: See [here](./sql_profiles.md) for more details on configuration.

Note that a `.` is used to denote nested fields in the YAML recipe.

| Field               | Required | Default  | Description                                             |
|---------------------|----------|----------|---------------------------------------------------------|
| `username`          |          |          | Redshift username.                                      |
| `password`          |          |          | Redshift password.                                      |
| `host_port`         | ✅       |          | Redshift host URL.                                      |
| `database`          |          |          | Redshift database.                                      |
| `database_alias`    |          |          | Alias to apply to database when ingesting.              |
| `env`               |          | `"PROD"` | Environment to use in namespace when constructing URNs. |
| `platform_instance` |          | None     | The Platform instance to use while constructing URNs.   |
| `options.