From cb8f4c612fde65a4dbdf46eb2cc5ffce10fa1d98 Mon Sep 17 00:00:00 2001
From: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
Date: Mon, 27 May 2024 12:03:38 +0530
Subject: [PATCH] MINOR: improve query log lineage docs (#16413)

---
 .../workflows/lineage/lineage-workflow-query-logs.md |  9 ++++++++-
 .../workflows/usage/usage-workflow-query-logs.md     | 11 +++++++++--
 .../workflows/lineage/lineage-workflow-query-logs.md |  9 ++++++++-
 .../workflows/usage/usage-workflow-query-logs.md     | 12 ++++++++++--
 4 files changed, 35 insertions(+), 6 deletions(-)

diff --git a/openmetadata-docs/content/v1.3.x/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs.md b/openmetadata-docs/content/v1.3.x/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs.md
index 388ab7355c4..95c4a1108e5 100644
--- a/openmetadata-docs/content/v1.3.x/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs.md
+++ b/openmetadata-docs/content/v1.3.x/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs.md
@@ -33,13 +33,20 @@ A standard CSV should be comma separated, and each row represented as a single l
 {% /note %}
 
 - **query_text:** This field contains the literal query that has been executed in the database. It is quite possible
-  that your query has commas `,` inside. Then, wrap each query in quotes `""` to not have any clashes
+  that your query has commas `,` inside. Then, wrap each query in quotes to not have any clashes
   with the comma as a separator.
 - **database_name (optional):** Enter the database name on which the query was executed.
 - **schema_name (optional):** Enter the schema name to which the query is associated.
 
 Checkout a sample query log file [here](https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/examples/sample_data/glue/query_log.csv).
 
+```csv
+query_text,database_name,schema_name
+"select * from sales",default,information_schema
+"select * from marketing",default,information_schema
+"insert into marketing select * from sales",default,information_schema
+```
+
 ## Lineage Workflow
 
 In order to run a Lineage Workflow we need to make sure that Metadata Ingestion Workflow for corresponding service has already been executed. We will follow the steps to create a JSON configuration able to collect the query log file and execute the lineage workflow.
diff --git a/openmetadata-docs/content/v1.3.x/connectors/ingestion/workflows/usage/usage-workflow-query-logs.md b/openmetadata-docs/content/v1.3.x/connectors/ingestion/workflows/usage/usage-workflow-query-logs.md
index a0eafaf5d34..8d5802f019f 100644
--- a/openmetadata-docs/content/v1.3.x/connectors/ingestion/workflows/usage/usage-workflow-query-logs.md
+++ b/openmetadata-docs/content/v1.3.x/connectors/ingestion/workflows/usage/usage-workflow-query-logs.md
@@ -34,8 +34,9 @@ A standard CSV should be comma separated, and each row represented as a single l
 {% /note %}
 
 - **query_text:** This field contains the literal query that has been executed in the database. It is quite possible
-  that your query has commas `,` inside. Then, wrap each query in quotes `""` to not have any clashes
-  with the comma as a separator.- **user_name (optional):** Enter the database user name which has executed this query.
+  that your query has commas `,` inside. Then, wrap each query in quotes to not have any clashes
+  with the comma as a separator.
+- **user_name (optional):** Enter the database user name which has executed this query.
 - **start_time (optional):** Enter the query execution start time in YYYY-MM-DD HH:MM:SS format.
 - **end_time (optional):** Enter the query execution end time in YYYY-MM-DD HH:MM:SS format.
 - **aborted (optional):** This field accepts values as true or false and indicates whether the query was aborted during execution
@@ -44,6 +45,12 @@ A standard CSV should be comma separated, and each row represented as a single l
 
 Checkout a sample query log file [here](https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/examples/sample_data/glue/query_log.csv).
 
+```csv
+query_text,database_name,schema_name
+"create table sales_analysis as select id, name from sales",default,information_schema
+"insert into marketing select * from sales",default,information_schema
+```
+
 ## Usage Workflow
 
 In order to run a Usage Workflow we need to make sure that Metadata Ingestion Workflow for corresponding service has already been executed. We will follow the steps to create a JSON configuration able to collect the query log file and execute the usage workflow.
diff --git a/openmetadata-docs/content/v1.4.x/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs.md b/openmetadata-docs/content/v1.4.x/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs.md
index 388ab7355c4..95c4a1108e5 100644
--- a/openmetadata-docs/content/v1.4.x/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs.md
+++ b/openmetadata-docs/content/v1.4.x/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs.md
@@ -33,13 +33,20 @@ A standard CSV should be comma separated, and each row represented as a single l
 {% /note %}
 
 - **query_text:** This field contains the literal query that has been executed in the database. It is quite possible
-  that your query has commas `,` inside. Then, wrap each query in quotes `""` to not have any clashes
+  that your query has commas `,` inside. Then, wrap each query in quotes to not have any clashes
   with the comma as a separator.
 - **database_name (optional):** Enter the database name on which the query was executed.
 - **schema_name (optional):** Enter the schema name to which the query is associated.
 
 Checkout a sample query log file [here](https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/examples/sample_data/glue/query_log.csv).
 
+```csv
+query_text,database_name,schema_name
+"select * from sales",default,information_schema
+"select * from marketing",default,information_schema
+"insert into marketing select * from sales",default,information_schema
+```
+
 ## Lineage Workflow
 
 In order to run a Lineage Workflow we need to make sure that Metadata Ingestion Workflow for corresponding service has already been executed. We will follow the steps to create a JSON configuration able to collect the query log file and execute the lineage workflow.
diff --git a/openmetadata-docs/content/v1.4.x/connectors/ingestion/workflows/usage/usage-workflow-query-logs.md b/openmetadata-docs/content/v1.4.x/connectors/ingestion/workflows/usage/usage-workflow-query-logs.md
index a0eafaf5d34..9a5ffa3ec7e 100644
--- a/openmetadata-docs/content/v1.4.x/connectors/ingestion/workflows/usage/usage-workflow-query-logs.md
+++ b/openmetadata-docs/content/v1.4.x/connectors/ingestion/workflows/usage/usage-workflow-query-logs.md
@@ -34,8 +34,9 @@ A standard CSV should be comma separated, and each row represented as a single l
 {% /note %}
 
 - **query_text:** This field contains the literal query that has been executed in the database. It is quite possible
-  that your query has commas `,` inside. Then, wrap each query in quotes `""` to not have any clashes
-  with the comma as a separator.- **user_name (optional):** Enter the database user name which has executed this query.
+  that your query has commas `,` inside. Then, wrap each query in quotes to not have any clashes
+  with the comma as a separator.
+- **user_name (optional):** Enter the database user name which has executed this query.
 - **start_time (optional):** Enter the query execution start time in YYYY-MM-DD HH:MM:SS format.
 - **end_time (optional):** Enter the query execution end time in YYYY-MM-DD HH:MM:SS format.
 - **aborted (optional):** This field accepts values as true or false and indicates whether the query was aborted during execution
@@ -44,6 +45,13 @@ A standard CSV should be comma separated, and each row represented as a single l
 
 Checkout a sample query log file [here](https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/examples/sample_data/glue/query_log.csv).
 
+```csv
+query_text,database_name,schema_name
+"select * from sales",default,information_schema
+"select * from marketing",default,information_schema
+"insert into marketing select * from sales",default,information_schema
+```
+
 ## Usage Workflow
 
 In order to run a Usage Workflow we need to make sure that Metadata Ingestion Workflow for corresponding service has already been executed. We will follow the steps to create a JSON configuration able to collect the query log file and execute the usage workflow.
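For reference, a minimal Python sketch of producing a query log CSV in the column layout the patched docs describe; the file name and sample rows are illustrative only and not taken from the patch. The standard `csv` module quotes any `query_text` that contains a comma, which is exactly the clash with the comma separator the docs warn about:

```python
import csv

# Illustrative rows in the query_text,database_name,schema_name layout.
rows = [
    ("select * from sales", "default", "information_schema"),
    ("insert into marketing select id, name from sales", "default", "information_schema"),
]

with open("query_log.csv", "w", newline="") as f:
    writer = csv.writer(f, quoting=csv.QUOTE_MINIMAL)
    writer.writerow(["query_text", "database_name", "schema_name"])
    # QUOTE_MINIMAL wraps only fields containing the separator, so the
    # second query_text is written inside double quotes automatically.
    writer.writerows(rows)
```

The same approach extends to the usage-workflow columns (user_name, start_time, end_time, aborted) by adding them to the header row and to each data row.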