Added Filter Params for Table and Schema (#1954)

* Added Filter Params for table and Schema

* Bigquery Doc changes

* Doc Changes for databases

* Filter Pattern Changes

* Table Filter Pattern Example Changes

* Filter Pattern Example Changes
Ayush Shah 2021-12-29 22:43:09 +05:30 committed by GitHub
parent 1e334af89c
commit 5d6f385a75
GPG Key ID: 4AEE18F83AFDEB23
18 changed files with 125 additions and 70 deletions

View File

@ -40,7 +40,10 @@ metadata ingest -c ./examples/workflows/mariadb.json
"password": "openmetadata_password",
"database": "openmetadata_db",
"service_name": "local_mysql",
"filter_pattern": {
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["mysql.*", "information_schema.*", "performance_schema.*", "sys.*"]
}
}
@ -52,10 +55,11 @@ metadata ingest -c ./examples/workflows/mariadb.json
1. **username** - pass the MariaDB username.
2. **password** - password for the username
3. **service\_name** - Service Name for this MariaDB cluster. If you added MariaDB cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
5. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
6. **data\_profiler\_offset** - Specify offset.
7. **data\_profiler\_limit** - Specify limit.
4. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
5. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
6. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
7. **data\_profiler\_offset** - Specify offset.
8. **data\_profiler\_limit** - Specify limit.
## Publish to OpenMetadata
@ -73,8 +77,11 @@ Add optionally `pii` processor and `metadata-rest` sink along with `metadata-ser
"password": "openmetadata_password",
"database": "openmetadata_db",
"service_name": "local_mysql",
"filter_pattern": {
"excludes": ["mysql.*", "information_schema.*", "performance_schema.*", "sys.*"]
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["information_schema.*"]
}
}
},
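The `table_filter_pattern` and `schema_filter_pattern` entries above hold lists of regular expressions. As a rough illustration of how such `includes`/`excludes` lists could be evaluated, here is a minimal Python sketch. The function name `is_allowed` is hypothetical. The matching rules are also assumptions, not the ingestion framework's confirmed behavior: patterns are matched from the start of the name with `re.match`, an exclude match always wins, and an empty `includes` list keeps everything.

```python
import re
from typing import List, Optional


def is_allowed(
    name: str,
    includes: Optional[List[str]] = None,
    excludes: Optional[List[str]] = None,
) -> bool:
    """Return True if `name` passes the filter (assumed semantics, see lead-in)."""
    # Assumption: any matching exclude pattern rejects the name outright.
    for pattern in excludes or []:
        if re.match(pattern, name):
            return False
    # Assumption: with no include patterns, everything not excluded is kept.
    if not includes:
        return True
    return any(re.match(pattern, name) for pattern in includes)


# Patterns copied from the MariaDB example config in this diff.
schema_filter = {
    "excludes": ["mysql.*", "information_schema.*", "performance_schema.*", "sys.*"]
}
table_filter = {"excludes": ["demo.*", "orders.*"]}

# Illustrative schema and table names (not taken from the commit).
schemas = ["mysql", "information_schema", "sales", "sys"]
tables = ["demo_users", "orders_2021", "customers"]

kept_schemas = [s for s in schemas if is_allowed(s, **schema_filter)]
kept_tables = [t for t in tables if is_allowed(t, **table_filter)]

print(kept_schemas)  # ['sales']
print(kept_tables)   # ['customers']
```

Under these assumptions, a pattern such as `sys.*` also rejects any name that merely starts with `sys` (for example `system_logs`), so an anchored pattern like `sys$` may be safer when only the exact schema should be skipped.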

View File

@ -44,7 +44,8 @@ pip install 'openmetadata-ingestion[athena]'
1. **username** - pass the Athena username. We recommend creating a user with read-only permissions to all the databases in your Athena installation
2. **password** - password for the username
3. **service\_name** - Service Name for this Athena cluster. If you added the Athena cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
4. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
5. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
## Publish to OpenMetadata

View File

@ -69,8 +69,9 @@ metadata ingest -c ./examples/workflows/bigquery_usage.json
1. **username** - pass the Bigquery username.
2. **password** - the password for the Bigquery username.
3. **service\_name** - Service Name for this Bigquery cluster. If you added the Bigquery cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata.
5. **database -** Database name from where data is to be fetched.
4. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
5. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
6. **database -** Database name from where data is to be fetched.
### Publish to OpenMetadata

View File

@ -63,7 +63,10 @@ metadata ingest -c ./examples/workflows/bigquery.json
"options": {
"credentials_path": "examples/creds/bigquery-cred.json"
},
"filter_pattern": {
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": [
"[\\w]*cloudaudit.*",
"[\\w]*logging_googleapis_com.*",
@ -78,11 +81,12 @@ metadata ingest -c ./examples/workflows/bigquery.json
1. **username** - pass the Bigquery username.
2. **password** - the password for the Bigquery username.
3. **service\_name** - Service Name for this Bigquery cluster. If you added the Bigquery cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata.
5. **database -** Database name from where data is to be fetched.
6. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
7. **data\_profiler\_offset** - Specify offset.
8. **data\_profiler\_limit** - Specify limit.
4. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
5. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
6. **database -** Database name from where data is to be fetched.
7. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
8. **data\_profiler\_offset** - Specify offset.
9. **data\_profiler\_limit** - Specify limit.
### Publish to OpenMetadata
@ -106,7 +110,10 @@ Add `metadata-rest` sink along with `metadata-server` config
"options": {
"credentials_path": "examples/creds/bigquery-cred.json"
},
"filter_pattern": {
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": [
"[\\w]*cloudaudit.*",
"[\\w]*logging_googleapis_com.*",

View File

@ -43,10 +43,11 @@ pip install 'openmetadata-ingestion[hive]'
{% endcode %}
1. **service\_name** - Service Name for this Hive cluster. If you added the Hive cluster through OpenMetadata UI, make sure the service name matches the same.
2. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
3. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
4. **data\_profiler\_offset** - Specify offset.
5. **data\_profiler\_limit** - Specify limit.
2. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
3. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
4. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
5. **data\_profiler\_offset** - Specify offset.
6. **data\_profiler\_limit** - Specify limit.
## Publish to OpenMetadata

View File

@ -42,8 +42,11 @@ metadata ingest -c ./examples/workflows/mssql.json
"query": "select top 50 * from {}.{}",
"username": "sa",
"password": "test!Password",
"filter_pattern": {
"excludes": ["catalog_test.*"]
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["information_schema.*"]
}
}
},
@ -55,8 +58,9 @@ metadata ingest -c ./examples/workflows/mssql.json
2. **password** - the password for the mssql username.
3. **service\_name** - Service Name for this mssql cluster. If you added the mssql cluster through OpenMetadata UI, make sure the service name matches the same.
4. **host\_port** - Hostname and Port number where the service is being initialized.
5. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
6. **database** - Database name from where data is to be fetched from.
5. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
6. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
7. **database** - Database name from where data is to be fetched from.
## Publish to OpenMetadata
@ -76,8 +80,11 @@ Add `metadata-rest` sink along with `metadata-server` config
"query": "select top 50 * from {}.{}",
"username": "sa",
"password": "test!Password",
"filter_pattern": {
"excludes": ["catalog_test.*"]
"table_filter_pattern": {
"excludes": ["catalog_test.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["information_schema.*"]
}
}
},

View File

@ -43,8 +43,11 @@ metadata ingest -c ./examples/workflows/mysql.json
"data_profiler_enabled": "true",
"data_profiler_offset": "0",
"data_profiler_limit": "50000",
"filter_pattern": {
"excludes": ["mysql.*", "information_schema.*", "performance_schema.*", "sys.*"]
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["information_schema.*"]
}
}
},
@ -55,10 +58,11 @@ metadata ingest -c ./examples/workflows/mysql.json
1. **username** - pass the MySQL username. We recommend creating a user with read-only permissions to all the databases in your MySQL installation
2. **password** - password for the username
3. **service\_name** - Service Name for this MySQL cluster. If you added MySQL cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
5. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
6. **data\_profiler\_offset** - Specify offset.
7. **data\_profiler\_limit** - Specify limit.
4. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
5. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
6. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
7. **data\_profiler\_offset** - Specify offset.
8. **data\_profiler\_limit** - Specify limit.
## Publish to OpenMetadata
@ -79,7 +83,10 @@ Add `metadata-rest` sink along with `metadata-server` config
"data_profiler_enabled": "true",
"data_profiler_offset": "0",
"data_profiler_limit": "50000",
"filter_pattern": {
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["mysql.*", "information_schema.*", "performance_schema.*", "sys.*"]
}
}

View File

@ -48,7 +48,8 @@ pip install 'openmetadata-ingestion[oracle]'
3. **host\_port** - Host Port where Oracle Instance is initiated
4. **service\_name** - Service Name for this Oracle cluster. If you added Oracle cluster through OpenMetadata UI, make sure the service name matches the same.
5. **oracle\_service\_name -** Oracle Service Name (TNS alias)
6. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
6. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
7. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
## Publish to OpenMetadata

View File

@ -53,11 +53,12 @@ metadata ingest -c ./examples/workflows/postgres.json
1. **username** - pass the Postgres username.
2. **password** - the password for the Postgres username.
3. **service\_name** - Service Name for this Postgres cluster. If you added the Postgres cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata.
5. **database -** Database name from where data is to be fetched.
6. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
7. **data\_profiler\_offset** - Specify offset.
8. **data\_profiler\_limit** - Specify limit.
4. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
5. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
6. **database -** Database name from where data is to be fetched.
7. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
8. **data\_profiler\_offset** - Specify offset.
9. **data\_profiler\_limit** - Specify limit.
## Publish to OpenMetadata

View File

@ -46,7 +46,8 @@ metadata ingest -c ./examples/workflows/presto.json
2. **password** - password for the username
3. **host\_port** - host and port of the Presto cluster
4. **service\_name** - Service Name for this Presto cluster. If you added the Presto cluster through OpenMetadata UI, make sure the service name matches the same.
5. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
5. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
6. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
## Publish to OpenMetadata

View File

@ -52,7 +52,8 @@ metadata ingest -c ./examples/workflows/redshift_usage.json
1. **username** - pass the Redshift username. We recommend creating a user with read-only permissions to all the databases in your Redshift installation
2. **password** - password for the username
3. **service\_name** - Service Name for this Redshift cluster. If you added the Redshift cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
4. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
5. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
## Publish to OpenMetadata

View File

@ -42,8 +42,11 @@ metadata ingest -c ./examples/workflows/redshift.json
"data_profiler_enabled": "true",
"data_profiler_offset": "0",
"data_profiler_limit": "50000",
"filter_pattern": {
"excludes": ["information_schema.*", "[\\w]*event_vw.*"]
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["information_schema.*"]
}
},
...
@ -53,7 +56,8 @@ metadata ingest -c ./examples/workflows/redshift.json
1. **username** - pass the Redshift username.
2. **password** - the password for the Redshift username.
3. **service\_name** - Service Name for this Redshift cluster. If you added the Redshift cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata.
4. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
5. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
5. **database -** Database name from where data is to be fetched.
6. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you the newly ingested data.
7. **data\_profiler\_offset** - Specify offset.
@ -79,8 +83,11 @@ Add `metadata-rest` sink along with `metadata-server` config
"data_profiler_enabled": "true",
"data_profiler_offset": "0",
"data_profiler_limit": "50000",
"filter_pattern": {
"excludes": ["information_schema.*", "[\\w]*event_vw.*"]
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["information_schema.*"]
}
},
"sink": {

View File

@ -53,7 +53,8 @@ metadata ingest -c ./examples/workflows/salesforce.json
3. **security\_token** - pass the security token.
4. **sobject\_name** - pass the salesforce object name.
5. **service\_name** - Service Name for this Salesforce cluster. If you added Salesforce cluster through OpenMetadata UI, make sure the service name matches the same.
6. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
6. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
7. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
## Publish to OpenMetadata

View File

@ -51,8 +51,9 @@ metadata ingest -c ./examples/workflows/snowflake_usage.json
1. **username** - pass the Snowflake username.
2. **password** - the password for the Snowflake username.
3. **service\_name** - Service Name for this Snowflake cluster. If you added the Snowflake cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata.
5. **database -** Database name from where data is to be fetched.
4. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
5. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
6. **database -** Database name from where data is to be fetched.
### Publish to OpenMetadata

View File

@ -45,10 +45,11 @@ metadata ingest -c ./examples/workflows/snowflake.json
"data_profiler_enabled": "true",
"data_profiler_offset": "0",
"data_profiler_limit": "50000",
"filter_pattern": {
"excludes": [
"tpcds_sf100tcl"
]
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["information_schema.*"]
}
}
},
@ -58,11 +59,12 @@ metadata ingest -c ./examples/workflows/snowflake.json
1. **username** - pass the Snowflake username.
2. **password** - the password for the Snowflake username.
3. **service\_name** - Service Name for this Snowflake cluster. If you added the Snowflake cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata.
5. **database -** Database name from where data is to be fetched.
6. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you with the newly ingested data.
7. **data\_profiler\_offset** - Specify offset.
8. **data\_profiler\_limit** - Specify limit.
4. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
5. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
6. **database -** Database name from where data is to be fetched.
7. **data\_profiler\_enabled** - Enable data-profiling (Optional). It will provide you with the newly ingested data.
8. **data\_profiler\_offset** - Specify offset.
9. **data\_profiler\_limit** - Specify limit.
### SSO Configuration
@ -136,10 +138,11 @@ Add `metadata-rest` sink along with `metadata-server` config
"data_profiler_enabled": "true",
"data_profiler_offset": "0",
"data_profiler_limit": "50000",
"filter_pattern": {
"excludes": [
"tpcds_sf100tcl"
]
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["information_schema.*"]
}
}
},

View File

@ -47,7 +47,8 @@ metadata ingest -c ./examples/workflows/trino.json
2. **password** - password for the username
3. **host\_port** - host and port of the Trino cluster
4. **service\_name** - Service Name for this Trino cluster. If you added the Trino cluster through OpenMetadata UI, make sure the service name matches the same.
5. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
5. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
6. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
## Publish to OpenMetadata

View File

@ -40,8 +40,11 @@ metadata ingest -c ./examples/workflows/vertica.json
"password": "openmetadata_password",
"database": "openmetadata_db",
"service_name": "local_vertica",
"filter_pattern": {
"excludes": []
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["information_schema.*"]
}
}
},
@ -52,7 +55,8 @@ metadata ingest -c ./examples/workflows/vertica.json
1. **username** - pass the Vertica username.
2. **password** - password for the username.
3. **service\_name** - Service Name for this Vertica cluster. If you added Vertica cluster through OpenMetadata UI, make sure the service name matches the same.
4. **filter\_pattern** - It contains includes, excludes options to choose which pattern of datasets you want to ingest into OpenMetadata
4. **table\_filter\_pattern** - It contains includes, excludes options to choose which pattern of tables you want to ingest into OpenMetadata.
5. **schema\_filter\_pattern** - It contains includes, excludes options to choose which pattern of schemas you want to ingest into OpenMetadata.
### Publish to OpenMetadata
@ -70,8 +74,11 @@ Add `metadata-rest` sink along with `metadata-server` config
"password": "openmetadata_password",
"database": "openmetadata_db",
"service_name": "local_vertica",
"filter_pattern": {
"excludes": []
"table_filter_pattern": {
"excludes": ["demo.*","orders.*"]
},
"schema_filter_pattern": {
"excludes": ["information_schema.*"]
}
}
},

View File

@ -6,7 +6,7 @@
"password": "openmetadata_password",
"database": "openmetadata_db",
"service_name": "local_mysql_test",
"filter_pattern": {
"schema_filter_pattern": {
"excludes": ["mysql.*", "information_schema.*", "performance_schema.*", "sys.*"]
}
}
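
The last hunk renames `filter_pattern` to `schema_filter_pattern` while keeping the same `excludes` list. For existing workflow files, the same rename could be scripted. The following Python sketch is only an illustration: the helper name `migrate_filter_pattern` is hypothetical, and it assumes that every legacy `filter_pattern` was meant as a schema-level filter, which holds for this test config but not necessarily for others.

```python
import copy
import json


def migrate_filter_pattern(source_config: dict) -> dict:
    """Return a copy of the config with `filter_pattern` renamed.

    Assumption: the legacy patterns are schema-level filters, so they
    move unchanged to `schema_filter_pattern`.
    """
    migrated = copy.deepcopy(source_config)
    # Only rename when the new key is absent, to avoid overwriting it.
    if "filter_pattern" in migrated and "schema_filter_pattern" not in migrated:
        migrated["schema_filter_pattern"] = migrated.pop("filter_pattern")
    return migrated


# Source config fields as they appear before this commit.
old_config = {
    "password": "openmetadata_password",
    "database": "openmetadata_db",
    "service_name": "local_mysql_test",
    "filter_pattern": {
        "excludes": ["mysql.*", "information_schema.*", "performance_schema.*", "sys.*"]
    },
}

new_config = migrate_filter_pattern(old_config)
print(json.dumps(new_config, indent=2))
```

Patterns that were really meant to match table names would need to be moved to `table_filter_pattern` by hand.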