Tableau
For context on getting started with ingestion, check out our metadata ingestion guide.
Note that this connector is currently in BETA and has not been validated for production use.
Setup
To install this plugin, run pip install 'acryl-datahub[tableau]'.
See the documentation for Tableau's Metadata API at https://help.tableau.com/current/api/metadata_api/en-us/index.html
Capabilities
This plugin extracts metadata for Sheets, Dashboards, and Embedded and Published Data Sources from the Workbooks in the given projects on a Tableau Online site. It is in beta and has only been tested against a PostgreSQL database and the sample workbooks on Tableau Online.
Metadata is extracted through Tableau's GraphQL interface. The queries used are located in metadata-ingestion/src/datahub/ingestion/source/tableau_common.py
Dashboard
Dashboards from Tableau are ingested as Dashboards in DataHub.
- GraphQL query
{
  workbooksConnection(first: 15, offset: 0, filter: {projectNameWithin: ["default", "Project 2"]}) {
    nodes {
      id
      name
      luid
      projectName
      owner {
        username
      }
      description
      uri
      createdAt
      updatedAt
      dashboards {
        id
        name
        path
        createdAt
        updatedAt
        sheets {
          id
          name
        }
      }
    }
    pageInfo {
      hasNextPage
      endCursor
    }
    totalCount
  }
}
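The query above requests one page of workbooks (first, offset) and returns pageInfo.hasNextPage. As a minimal sketch of how a client might walk every page, assuming an offset-based loop and the standard GraphQL {"data": ...} response envelope: the names build_workbooks_query, iter_workbooks and run_query are hypothetical and are not taken from the connector's source.

```python
import json
from typing import Callable, Dict, Iterator, List


def build_workbooks_query(projects: List[str], first: int, offset: int) -> str:
    """Build a trimmed workbooksConnection query for a single page."""
    # A JSON list of strings is also a valid GraphQL list literal.
    project_filter = json.dumps(projects)
    return (
        "{ workbooksConnection("
        f"first: {first}, offset: {offset}, "
        f"filter: {{projectNameWithin: {project_filter}}}) "
        "{ nodes { id name } pageInfo { hasNextPage endCursor } totalCount } }"
    )


def iter_workbooks(
    run_query: Callable[[str], Dict],
    projects: List[str],
    page_size: int = 15,
) -> Iterator[Dict]:
    """Yield workbook nodes page by page until hasNextPage is false.

    run_query is a hypothetical callable that sends the query text to the
    Metadata API and returns the decoded JSON response.
    """
    offset = 0
    while True:
        result = run_query(build_workbooks_query(projects, page_size, offset))
        connection = result["data"]["workbooksConnection"]
        nodes = connection["nodes"]
        yield from nodes
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not nodes or not connection["pageInfo"]["hasNextPage"]:
            break
        offset += page_size
```

Passing the transport in as run_query keeps the sketch independent of any particular HTTP or Tableau client library.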
Sheet
Sheets from Tableau are ingested as Charts in DataHub.
- GraphQL query
{
  workbooksConnection(first: 10, offset: 0, filter: {projectNameWithin: ["default"]}) {
    .....
      sheets {
        id
        name
        path
        createdAt
        updatedAt
        tags {
          name
        }
        containedInDashboards {
          name
          path
        }
        upstreamDatasources {
          id
          name
        }
        datasourceFields {
          __typename
          id
          name
          description
          upstreamColumns {
            name
          }
          ... on ColumnField {
            dataCategory
            role
            dataType
            aggregation
          }
          ... on CalculatedField {
            role
            dataType
            aggregation
            formula
          }
          ... on GroupField {
            role
            dataType
          }
          ... on DatasourceField {
            remoteField {
              __typename
              id
              name
              description
              folderName
              ... on ColumnField {
                dataCategory
                role
                dataType
                aggregation
              }
              ... on CalculatedField {
                role
                dataType
                aggregation
                formula
              }
              ... on GroupField {
                role
                dataType
              }
            }
          }
        }
      }
    }
    .....
  }
}
Embedded Data Source
Embedded Data Sources from Tableau are ingested as Datasets in DataHub.
- GraphQL query
{
  workbooksConnection(first: 15, offset: 0, filter: {projectNameWithin: ["default"]}) {
    nodes {
      ....
      embeddedDatasources {
        __typename
        id
        name
        hasExtracts
        extractLastRefreshTime
        extractLastIncrementalUpdateTime
        extractLastUpdateTime
        upstreamDatabases {
          id
          name
          connectionType
          isEmbedded
        }
        upstreamTables {
          name
          schema
          columns {
            name
            remoteType
          }
        }
        fields {
          __typename
          id
          name
          description
          isHidden
          folderName
          ... on ColumnField {
            dataCategory
            role
            dataType
            defaultFormat
            aggregation
            columns {
              table {
                ... on CustomSQLTable {
                  id
                  name
                }
              }
            }
          }
          ... on CalculatedField {
            role
            dataType
            defaultFormat
            aggregation
            formula
          }
          ... on GroupField {
            role
            dataType
          }
        }
        upstreamDatasources {
          name
        }
        workbook {
          name
          projectName
        }
      }
    }
    ....
  }
}
Published Data Source
Published Data Sources from Tableau are ingested as Datasets in DataHub.
- GraphQL query
{
  publishedDatasourcesConnection(filter: {idWithin: ["00cce29f-b561-bb41-3557-8e19660bb5dd", "618c87db-5959-338b-bcc7-6f5f4cc0b6c6"]}) {
    nodes {
      __typename
      id
      name
      hasExtracts
      extractLastRefreshTime
      extractLastIncrementalUpdateTime
      extractLastUpdateTime
      downstreamSheets {
        id
        name
      }
      upstreamTables {
        name
        schema
        fullName
        connectionType
        description
        contact {
          name
        }
      }
      fields {
        __typename
        id
        name
        description
        isHidden
        folderName
        ... on ColumnField {
          dataCategory
          role
          dataType
          defaultFormat
          aggregation
          columns {
            table {
              ... on CustomSQLTable {
                id
                name
              }
            }
          }
        }
        ... on CalculatedField {
          role
          dataType
          defaultFormat
          aggregation
          formula
        }
        ... on GroupField {
          role
          dataType
        }
      }
      owner {
        username
      }
      description
      uri
      projectName
    }
    pageInfo {
      hasNextPage
      endCursor
    }
    totalCount
  }
}
Custom SQL Data Source
For Custom SQL data sources, the query can be viewed in the UI under the View Definition tab.
- GraphQL query
{
  customSQLTablesConnection(filter: {idWithin: ["22b0b4c3-6b85-713d-a161-5a87fdd78f40"]}) {
    nodes {
      id
      name
      query
      columns {
        id
        name
        remoteType
        description
        referencedByFields {
          datasource {
            id
            name
            upstreamDatabases {
              id
              name
            }
            upstreamTables {
              id
              name
              schema
              connectionType
              columns {
                id
              }
            }
            ... on PublishedDatasource {
              projectName
            }
            ... on EmbeddedDatasource {
              workbook {
                name
                projectName
              }
            }
          }
        }
      }
      tables {
        id
        name
        schema
        connectionType
      }
    }
  }
}
Quickstart recipe
Check out the following recipe to get started with ingestion! See below for full configuration options.
For general pointers on writing and running a recipe, see our main recipe guide.
source:
  type: tableau
  config:
    # Coordinates
    connect_uri: https://prod-ca-a.online.tableau.com
    site: acryl
    projects: ["default", "Project 2"]
    # Credentials
    username: username@acrylio.com
    password: pass
    token_name: Acryl
    token_value: token_generated_from_tableau
    # Options
    ingest_tags: True
    ingest_owner: True
    default_schema_map:
      mydatabase: public
      anotherdatabase: anotherschema
sink:
  # sink configs
Config details
| Field | Required | Default | Description |
|---|---|---|---|
| connect_uri | ✅ | | Tableau host URL. |
| site | ✅ | | Tableau Online site. |
| env | | "PROD" | Environment to use in namespace when constructing URNs. |
| username | | | Tableau user name. |
| password | | | Tableau password for authentication. |
| token_name | | | Tableau token name if authenticating using a personal token. |
| token_value | | | Tableau token value if authenticating using a personal token. |
| projects | | default | List of projects. |
| default_schema_map * | | | Default schema to use when schema is not found. |
| ingest_tags | | False | Ingest Tags from source. This will override Tags entered from the UI. |
| ingest_owner | | False | Ingest Owner from source. This will override Owner info entered from the UI. |
*Tableau may not provide a schema name when ingesting a Custom SQL data source. Use default_schema_map to provide a default schema name to use when constructing a table URN.
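To illustrate the footnote, here is a minimal sketch of how a default schema might be filled in before a table URN is built. It assumes the map is keyed by database name, as in the quickstart recipe; resolve_table_name and make_dataset_urn are hypothetical helpers, and the connector's real logic may differ.

```python
from typing import Dict, Optional


def resolve_table_name(
    database: str,
    schema: Optional[str],
    table: str,
    default_schema_map: Dict[str, str],
) -> str:
    """Return database.schema.table, falling back to the configured default schema."""
    if not schema:
        # Assumption: default_schema_map maps database name -> schema name.
        schema = default_schema_map.get(database)
    parts = [database, schema, table] if schema else [database, table]
    return ".".join(parts)


def make_dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Format a DataHub dataset URN string for the given platform and table name."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# Usage: Tableau returned no schema for a Custom SQL table in "mydatabase".
schema_map = {"mydatabase": "public", "anotherdatabase": "anotherschema"}
name = resolve_table_name("mydatabase", None, "orders", schema_map)
urn = make_dataset_urn("postgres", name)
```

The table name "orders" and the postgres platform are illustrative values only.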
Authentication
Currently, authentication is supported on Tableau Online using either a username and password or a personal token. For more information on Tableau authentication, refer to the How to Authenticate guide.
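As a rough sketch of how a recipe's credentials could be checked before connecting, the hypothetical select_auth helper below picks one of the two supported modes from the config fields listed above. Preferring username and password when both pairs are present is an assumption of this sketch, not documented behavior.

```python
from typing import Dict, Tuple


def select_auth(config: Dict[str, str]) -> Tuple[str, str, str]:
    """Return (mode, identity, secret) for the credentials found in config.

    Hypothetical helper: the precedence between the two modes is assumed.
    """
    if config.get("username") and config.get("password"):
        return ("password", config["username"], config["password"])
    if config.get("token_name") and config.get("token_value"):
        return ("personal_token", config["token_name"], config["token_value"])
    raise ValueError(
        "Provide either username and password, or token_name and token_value."
    )


# Usage with a token-only config.
mode, identity, _secret = select_auth(
    {"token_name": "Acryl", "token_value": "token_generated_from_tableau"}
)
```

Failing fast on incomplete credentials gives a clearer error than a rejected sign-in request later in the run.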
Compatibility
Tableau Server Version: 2021.4.0 (20214.22.0114.0959) 64-bit Linux
Tableau Pod: prod-ca-a
Questions
If you've got any questions on configuring this source, feel free to ping us on our Slack!