Update: Trino, Presto Logic and Doc (#9859)

* Update: Trino, Presto Logic and Doc

* Update permissions in Doc

* Update openmetadata-docs/content/connectors/database/trino/index.md

Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
Milan Bariya 2023-01-25 19:20:06 +05:30 committed by GitHub
parent 1cfdd6d7b0
commit 49d48e0546
10 changed files with 91 additions and 50 deletions


@ -5,10 +5,10 @@ source:
config: config:
type: Presto type: Presto
hostPort: localhost:8080 hostPort: localhost:8080
catalog: tpcds catalog: catalog_name
username: admin username: admin
password: password password: password
databaseSchema: tpcds databaseSchema: schema_name
sourceConfig: sourceConfig:
config: config:
generateSampleData: false generateSampleData: false


@ -6,8 +6,8 @@ source:
type: Trino type: Trino
hostPort: localhost:8080 hostPort: localhost:8080
username: user username: user
catalog: tpcds catalog: catalog_name
databaseSchema: tiny databaseSchema: schema_name
connectionOptions: {} connectionOptions: {}
connectionArguments: {} connectionArguments: {}
sourceConfig: sourceConfig:


@ -132,27 +132,28 @@ class PrestoSource(CommonDbSourceService):
else: else:
results = self.connection.execute("SHOW CATALOGS") results = self.connection.execute("SHOW CATALOGS")
for res in results: for res in results:
new_catalog = res[0] if res:
database_fqn = fqn.build( new_catalog = res[0]
self.metadata, database_fqn = fqn.build(
entity_type=Database, self.metadata,
service_name=self.context.database_service.name.__root__, entity_type=Database,
database_name=new_catalog, service_name=self.context.database_service.name.__root__,
) database_name=new_catalog,
if filter_by_database(
self.source_config.databaseFilterPattern,
database_fqn
if self.source_config.useFqnForFiltering
else new_catalog,
):
self.status.filter(database_fqn, "Database Filtered Out")
continue
try:
self.set_inspector(database_name=new_catalog)
yield new_catalog
except Exception as exc:
logger.debug(traceback.format_exc())
logger.warning(
f"Error trying to connect to database {new_catalog}: {exc}"
) )
if filter_by_database(
self.source_config.databaseFilterPattern,
database_fqn
if self.source_config.useFqnForFiltering
else new_catalog,
):
self.status.filter(database_fqn, "Database Filtered Out")
continue
try:
self.set_inspector(database_name=new_catalog)
yield new_catalog
except Exception as exc:
logger.debug(traceback.format_exc())
logger.warning(
f"Error trying to connect to database {new_catalog}: {exc}"
)
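
The guard added in this hunk can be illustrated in isolation. The following is a minimal sketch, not the connector's actual code: `iter_catalogs` and `is_filtered_out` are hypothetical names, and the rows are hard-coded stand-ins for what `SHOW CATALOGS` might return. It only shows why checking the row before reading `res[0]` avoids an error on empty rows.

```python
from typing import Callable, Iterable, Iterator, Optional, Sequence


def iter_catalogs(
    rows: Iterable[Optional[Sequence[str]]],
    is_filtered_out: Callable[[str], bool],
) -> Iterator[str]:
    """Yield catalog names from SHOW CATALOGS-style rows, skipping empty rows."""
    for row in rows:
        if not row:
            # Same idea as the new `if res:` guard: an empty or None row
            # has no first element, so indexing it would raise.
            continue
        catalog = row[0]
        if is_filtered_out(catalog):
            # Stand-in for the filter_by_database check in the diff.
            continue
        yield catalog


# Hard-coded example rows (assumption: real rows come from the connection).
rows = [("hive",), (), None, ("system",), ("tpcds",)]
catalogs = list(iter_catalogs(rows, is_filtered_out=lambda name: name == "system"))
print(catalogs)  # ['hive', 'tpcds']
```

Without the guard, the empty tuple would raise `IndexError` and the `None` row would raise `TypeError` at `row[0]`.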


@ -181,27 +181,28 @@ class TrinoSource(CommonDbSourceService):
else: else:
results = self.connection.execute("SHOW CATALOGS") results = self.connection.execute("SHOW CATALOGS")
for res in results: for res in results:
new_catalog = res[0] if res:
database_fqn = fqn.build( new_catalog = res[0]
self.metadata, database_fqn = fqn.build(
entity_type=Database, self.metadata,
service_name=self.context.database_service.name.__root__, entity_type=Database,
database_name=new_catalog, service_name=self.context.database_service.name.__root__,
) database_name=new_catalog,
if filter_by_database(
self.source_config.databaseFilterPattern,
database_fqn
if self.source_config.useFqnForFiltering
else new_catalog,
):
self.status.filter(database_fqn, "Database Filtered Out")
continue
try:
self.set_inspector(database_name=new_catalog)
yield new_catalog
except Exception as exc:
logger.debug(traceback.format_exc())
logger.warning(
f"Error trying to connect to database {new_catalog}: {exc}"
) )
if filter_by_database(
self.source_config.databaseFilterPattern,
database_fqn
if self.source_config.useFqnForFiltering
else new_catalog,
):
self.status.filter(database_fqn, "Database Filtered Out")
continue
try:
self.set_inspector(database_name=new_catalog)
yield new_catalog
except Exception as exc:
logger.debug(traceback.format_exc())
logger.warning(
f"Error trying to connect to database {new_catalog}: {exc}"
)


@ -142,6 +142,7 @@ workflowConfig:
- **password**: Password to connect to Presto. - **password**: Password to connect to Presto.
- **hostPort**: Enter the fully qualified hostname and port number for your Presto deployment in the Host and Port field. - **hostPort**: Enter the fully qualified hostname and port number for your Presto deployment in the Host and Port field.
- **catalog**: Presto offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions) - **catalog**: Presto offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions)
- **DatabaseSchema**: The databaseSchema of the data source. This parameter is optional; set it to restrict metadata reading to a single databaseSchema. When left blank, OpenMetadata Ingestion attempts to scan all databaseSchemas.
- **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Presto during the connection. These details must be added as Key-Value pairs. - **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Presto during the connection. These details must be added as Key-Value pairs.
- **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Presto during the connection. These details must be added as Key-Value pairs. - **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Presto during the connection. These details must be added as Key-Value pairs.
- In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"` - In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"`


@ -142,6 +142,7 @@ workflowConfig:
- **password**: Password to connect to Presto. - **password**: Password to connect to Presto.
- **hostPort**: Enter the fully qualified hostname and port number for your Presto deployment in the Host and Port field. - **hostPort**: Enter the fully qualified hostname and port number for your Presto deployment in the Host and Port field.
- **catalog**: Presto offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions) - **catalog**: Presto offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions)
- **DatabaseSchema**: The databaseSchema of the data source. This parameter is optional; set it to restrict metadata reading to a single databaseSchema. When left blank, OpenMetadata Ingestion attempts to scan all databaseSchemas.
- **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Presto during the connection. These details must be added as Key-Value pairs. - **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Presto during the connection. These details must be added as Key-Value pairs.
- **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Presto during the connection. These details must be added as Key-Value pairs. - **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Presto during the connection. These details must be added as Key-Value pairs.
- In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"` - In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"`


@ -136,6 +136,7 @@ the changes.
- **Password**: Password to connect to Presto. - **Password**: Password to connect to Presto.
- **Host and Port**: Enter the fully qualified hostname and port number for your Presto deployment in the Host and Port field. - **Host and Port**: Enter the fully qualified hostname and port number for your Presto deployment in the Host and Port field.
- **Catalog**: Presto offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions) - **Catalog**: Presto offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions)
- **DatabaseSchema**: The databaseSchema of the data source. This parameter is optional; set it to restrict metadata reading to a single databaseSchema. When left blank, OpenMetadata Ingestion attempts to scan all databaseSchemas.
- **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Presto during the connection. These details must be added as Key-Value pairs. - **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Presto during the connection. These details must be added as Key-Value pairs.
- **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Presto during the connection. These details must be added as Key-Value pairs. - **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Presto during the connection. These details must be added as Key-Value pairs.
- In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"` - In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"`


@ -30,6 +30,17 @@ To run the Trino ingestion, you will need to install:
pip3 install "openmetadata-ingestion[trino]" pip3 install "openmetadata-ingestion[trino]"
``` ```
<Note>
To ingest metadata from the Trino source, the user must have select privileges for the following tables.
- `information_schema.schemata`
- `information_schema.columns`
- `information_schema.tables`
- `information_schema.views`
- `system.metadata.table_comments`
</Note>
## Metadata Ingestion ## Metadata Ingestion
All connectors are defined as JSON Schemas. All connectors are defined as JSON Schemas.
@ -145,6 +156,7 @@ workflowConfig:
- **password**: Password to connect to Trino. - **password**: Password to connect to Trino.
- **hostPort**: Enter the fully qualified hostname and port number for your Trino deployment in the Host and Port field. - **hostPort**: Enter the fully qualified hostname and port number for your Trino deployment in the Host and Port field.
- **catalog**: Trino offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions) - **catalog**: Trino offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions)
- **DatabaseSchema**: The databaseSchema of the data source. This parameter is optional; set it to restrict metadata reading to a single databaseSchema. When left blank, OpenMetadata Ingestion attempts to scan all databaseSchemas.
- **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Trino during the connection. These details must be added as Key-Value pairs. - **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Trino during the connection. These details must be added as Key-Value pairs.
- **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Trino during the connection. These details must be added as Key-Value pairs. - **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Trino during the connection. These details must be added as Key-Value pairs.
- In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"` - In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"`
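
The select privileges listed in the note earlier in this file could be checked before running ingestion. The sketch below is one possible approach, not part of OpenMetadata: `check_select_access` is a hypothetical helper that takes any DB-API style cursor, and `FakeCursor` is a stand-in used only so the example runs without a Trino server. It also assumes the unqualified `information_schema` tables resolve against the session's catalog.

```python
from typing import Any, Dict, Sequence

# Tables named in the permissions note.
REQUIRED_TABLES = (
    "information_schema.schemata",
    "information_schema.columns",
    "information_schema.tables",
    "information_schema.views",
    "system.metadata.table_comments",
)


def check_select_access(
    cursor: Any, tables: Sequence[str] = REQUIRED_TABLES
) -> Dict[str, bool]:
    """Return {table: True/False} depending on whether a SELECT succeeds."""
    results = {}
    for table in tables:
        try:
            # Table names are fixed constants above, not user input.
            cursor.execute(f"SELECT * FROM {table} LIMIT 1")
            cursor.fetchall()
            results[table] = True
        except Exception:
            # Any failure (for example, access denied) is reported as False.
            results[table] = False
    return results


class FakeCursor:
    """Hypothetical stand-in for a real cursor; denies the listed tables."""

    def __init__(self, denied):
        self.denied = set(denied)

    def execute(self, sql):
        if any(table in sql for table in self.denied):
            raise PermissionError("Access Denied")

    def fetchall(self):
        return []


report = check_select_access(FakeCursor(denied=["system.metadata.table_comments"]))
for table, ok in report.items():
    print(f"{table}: {'ok' if ok else 'missing select privilege'}")
```

With a real connection, the cursor would come from whichever Trino client is in use; the helper only relies on `execute` and `fetchall`.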


@ -30,6 +30,17 @@ To run the Trino ingestion, you will need to install:
pip3 install "openmetadata-ingestion[trino]" pip3 install "openmetadata-ingestion[trino]"
``` ```
<Note>
To ingest metadata from the Trino source, the user must have select privileges for the following tables.
- `information_schema.schemata`
- `information_schema.columns`
- `information_schema.tables`
- `information_schema.views`
- `system.metadata.table_comments`
</Note>
## Metadata Ingestion ## Metadata Ingestion
All connectors are defined as JSON Schemas. All connectors are defined as JSON Schemas.
@ -145,6 +156,7 @@ workflowConfig:
- **password**: Password to connect to Trino. - **password**: Password to connect to Trino.
- **hostPort**: Enter the fully qualified hostname and port number for your Trino deployment in the Host and Port field. - **hostPort**: Enter the fully qualified hostname and port number for your Trino deployment in the Host and Port field.
- **catalog**: Trino offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions) - **catalog**: Trino offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions)
- **DatabaseSchema**: The databaseSchema of the data source. This parameter is optional; set it to restrict metadata reading to a single databaseSchema. When left blank, OpenMetadata Ingestion attempts to scan all databaseSchemas.
- **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Trino during the connection. These details must be added as Key-Value pairs. - **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Trino during the connection. These details must be added as Key-Value pairs.
- **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Trino during the connection. These details must be added as Key-Value pairs. - **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Trino during the connection. These details must be added as Key-Value pairs.
- In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"` - In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"`


@ -43,6 +43,17 @@ To deploy OpenMetadata, check the <a href="/deployment">Deployment</a> guides.
To run the Ingestion via the UI you'll need to use the OpenMetadata Ingestion Container, which comes shipped with To run the Ingestion via the UI you'll need to use the OpenMetadata Ingestion Container, which comes shipped with
custom Airflow plugins to handle the workflow deployment. custom Airflow plugins to handle the workflow deployment.
<Note>
To ingest metadata from the Trino source, the user must have select privileges for the following tables.
- `information_schema.schemata`
- `information_schema.columns`
- `information_schema.tables`
- `information_schema.views`
- `system.metadata.table_comments`
</Note>
## Metadata Ingestion ## Metadata Ingestion
### 1. Visit the Services Page ### 1. Visit the Services Page
@ -136,6 +147,7 @@ the changes.
- **Password**: Password to connect to Trino. - **Password**: Password to connect to Trino.
- **Host and Port**: Enter the fully qualified hostname and port number for your Trino deployment in the Host and Port field. - **Host and Port**: Enter the fully qualified hostname and port number for your Trino deployment in the Host and Port field.
- **Catalog**: Trino offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions) - **Catalog**: Trino offers a catalog feature where all the databases are stored. (Providing the Catalog is not mandatory from 0.12.2 or greater versions)
- **DatabaseSchema**: The databaseSchema of the data source. This parameter is optional; set it to restrict metadata reading to a single databaseSchema. When left blank, OpenMetadata Ingestion attempts to scan all databaseSchemas.
- **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Trino during the connection. These details must be added as Key-Value pairs. - **Connection Options (Optional)**: Enter the details for any additional connection options that can be sent to Trino during the connection. These details must be added as Key-Value pairs.
- **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Trino during the connection. These details must be added as Key-Value pairs. - **Connection Arguments (Optional)**: Enter the details for any additional connection arguments such as security or protocol configs that can be sent to Trino during the connection. These details must be added as Key-Value pairs.
- In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"` - In case you are using Single-Sign-On (SSO) for authentication, add the `authenticator` details in the Connection Arguments as a Key-Value pair as follows: `"authenticator" : "sso_login_url"`