Query Usage
The Query Usage workflow uses the query-parser processor.
Run it after a Metadata Ingestion workflow has completed.
The serviceName must be the same as the one used in Metadata Ingestion, so that the ingestion bot can get the serviceConnection details from the server.
1. Define the YAML Config
This is a sample config for BigQuery Usage:
{% codePreview %}
{% codeInfoContainer %}
{% codeInfo srNumber=25 %}
Source Configuration - Source Config
You can find all the definitions and types for the sourceConfig here.
queryLogDuration: Configuration to tune how far we want to look back in query logs to process usage data.
{% /codeInfo %}
{% codeInfo srNumber=26 %}
stageFileLocation: Temporary file name to store the query logs before processing. Absolute file path required.
{% /codeInfo %}
{% codeInfo srNumber=27 %}
resultLimit: Configuration to set the limit for query logs.
{% /codeInfo %}
{% codeInfo srNumber=28 %}
queryLogFilePath: Configuration to set the file path for query logs.
{% /codeInfo %}
{% codeInfo srNumber=29 %}
Processor, Stage and Bulk Sink Configuration
The filename setting under stage and bulkSink specifies where the staging files will be located.
Note that this location is a directory that will be cleaned at the end of the ingestion.
{% /codeInfo %}
{% partial file="/v1.5/connectors/yaml/workflow-config-def.md" /%}
{% /codeInfoContainer %}
{% codeBlock fileName="filename.yaml" %}
source:
  type: {% $connector %}-usage
  serviceName: <service name>
  sourceConfig:
    config:
      type: DatabaseUsage
      # Number of days to look back
      queryLogDuration: 7
      # This is a directory that will be DELETED after the usage runs
      stageFileLocation: <path to store the stage file>
      # resultLimit: 1000
      # If instead of getting the query logs from the database we want to pass a file with the queries
      # queryLogFilePath: path-to-file
processor:
  type: query-parser
  config: {}
stage:
  type: table-usage
  config:
    filename: /tmp/athena_usage
bulkSink:
  type: metadata-usage
  config:
    filename: /tmp/athena_usage
{% partial file="/v1.5/connectors/yaml/workflow-config.md" /%}
{% /codeBlock %} {% /codePreview %}
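Before running the workflow, it can help to sanity-check the config once it is loaded into a dictionary. The sketch below is not part of the ingestion framework: `check_usage_config` is a hypothetical helper, and the rules it applies are assumptions read off the sample above (an absolute POSIX-style `stageFileLocation`, a positive `queryLogDuration`, and `stage` and `bulkSink` sharing the same `filename`).

```python
from pathlib import PurePosixPath


def check_usage_config(config: dict) -> list:
    """Return a list of problems found in a usage workflow config (hypothetical helper)."""
    problems = []
    source_cfg = config.get("source", {}).get("sourceConfig", {}).get("config", {})

    if source_cfg.get("type") != "DatabaseUsage":
        problems.append("sourceConfig.config.type should be DatabaseUsage")

    # queryLogDuration is the number of days to look back in the query logs.
    duration = source_cfg.get("queryLogDuration")
    if duration is not None:
        if isinstance(duration, bool) or not isinstance(duration, int) or duration < 1:
            problems.append("queryLogDuration should be a positive number of days")

    # The docs ask for an absolute path; POSIX-style paths are assumed here.
    stage_location = source_cfg.get("stageFileLocation")
    if stage_location is not None and not PurePosixPath(stage_location).is_absolute():
        problems.append("stageFileLocation should be an absolute path")

    if config.get("processor", {}).get("type") != "query-parser":
        problems.append("processor.type should be query-parser")

    # Assumption from the sample: stage writes the file that bulkSink later reads.
    stage_file = config.get("stage", {}).get("config", {}).get("filename")
    sink_file = config.get("bulkSink", {}).get("config", {}).get("filename")
    if stage_file != sink_file:
        problems.append("stage and bulkSink should use the same filename")

    return problems


# Illustrative values only; the placeholders mirror the sample config.
sample = {
    "source": {
        "type": "<connector>-usage",
        "serviceName": "<service name>",
        "sourceConfig": {
            "config": {
                "type": "DatabaseUsage",
                "queryLogDuration": 7,
                "stageFileLocation": "/tmp/query_log",  # hypothetical path
            }
        },
    },
    "processor": {"type": "query-parser", "config": {}},
    "stage": {"type": "table-usage", "config": {"filename": "/tmp/athena_usage"}},
    "bulkSink": {"type": "metadata-usage", "config": {"filename": "/tmp/athena_usage"}},
}

if __name__ == "__main__":
    for problem in check_usage_config(sample):
        print(problem)
```

An empty result means none of these assumed rules were violated; it does not guarantee that the server will accept the config.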
2. Run with the CLI
After saving the YAML config, we will run the command the same way we did for the metadata ingestion:
metadata usage -c <path-to-yaml>
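If the usage run needs to be triggered from a script (for example, by a scheduler), the same command can be wrapped in a small Python function. This is only a sketch: `build_usage_command` and `run_usage` are hypothetical names, and it assumes the `metadata` CLI is installed and available on the `PATH`.

```python
import subprocess


def build_usage_command(config_path: str) -> list:
    """Build the argument list for `metadata usage -c <path-to-yaml>`."""
    return ["metadata", "usage", "-c", config_path]


def run_usage(config_path: str) -> int:
    """Run the usage workflow and return the CLI exit code."""
    # check=False so the caller can decide how to handle a non-zero exit code.
    completed = subprocess.run(build_usage_command(config_path), check=False)
    return completed.returncode
```

Passing the arguments as a list avoids shell quoting issues when the config path contains spaces.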