# SQL Profiling

SQL Profiling collects table-level and column-level statistics.
The SQL-based profiler does not run on its own; instead, it can be enabled for other SQL-based sources.
Enabling profiling will slow down ingestion runs.

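As a rough illustration of enabling the profiler on a SQL-based source, the sketch below writes an ingestion recipe as a Python dict. The source type, connection values, and sink are placeholders, and the `profiling.enabled` key is an assumption about the source's configuration; check the individual source page for the options it actually accepts. The `profiling_enabled` helper is hypothetical and exists only for this example.

```python
# Hypothetical ingestion recipe expressed as a Python dict.
# All values are placeholders; the `profiling.enabled` key is an assumed
# option name and should be verified against the source's own page.
recipe = {
    "source": {
        "type": "postgres",  # placeholder: any SQL-based source
        "config": {
            "host_port": "localhost:5432",  # placeholder connection details
            "database": "example_db",
            "profiling": {"enabled": True},  # turn the profiler on for this source
        },
    },
    "sink": {
        "type": "datahub-rest",
        "config": {"server": "http://localhost:8080"},  # placeholder server
    },
}


def profiling_enabled(recipe: dict) -> bool:
    """Return True if the recipe's source config turns profiling on.

    Hypothetical helper for this example: a missing `profiling` block
    is treated as profiling being disabled.
    """
    source_config = recipe.get("source", {}).get("config", {})
    return bool(source_config.get("profiling", {}).get("enabled", False))


print(profiling_enabled(recipe))
```
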
:::caution

Running profiling against many tables or over many rows can run up significant costs.
While we've done our best to limit the cost of the queries the profiler runs, you
should be prudent about which tables profiling is enabled on and how often
profiling runs.

:::

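One way to think about limiting the set of profiled tables is as a filter over table names and sizes. The sketch below is not the profiler's implementation; `select_tables_to_profile`, its parameters, and the sample table names are all hypothetical. It assumes allow and deny regular expressions plus an optional row-count cap.

```python
import re
from typing import Iterable, List, Optional, Tuple


def select_tables_to_profile(
    tables: Iterable[Tuple[str, int]],
    allow: List[str],
    deny: List[str],
    max_rows: Optional[int] = None,
) -> List[str]:
    """Return names of tables worth profiling.

    A table is kept if its name fully matches at least one allow pattern,
    matches no deny pattern, and (when max_rows is set) has at most
    max_rows rows. Hypothetical helper, for illustration only.
    """
    selected = []
    for name, row_count in tables:
        if not any(re.fullmatch(pattern, name) for pattern in allow):
            continue  # not explicitly allowed
        if any(re.fullmatch(pattern, name) for pattern in deny):
            continue  # explicitly denied
        if max_rows is not None and row_count > max_rows:
            continue  # too large to profile cheaply
        selected.append(name)
    return selected


# Made-up tables as (name, row count) pairs.
tables = [
    ("sales.orders", 1_200_000),
    ("sales.customers", 50_000),
    ("staging.tmp_orders", 900),
]

chosen = select_tables_to_profile(
    tables,
    allow=[r"sales\..*"],
    deny=[r".*\.tmp_.*"],
    max_rows=100_000,
)
print(chosen)  # ['sales.customers']
```

Narrowing the allow list or lowering the row cap reduces the number of queries issued per run; running the profiler less often reduces cost in the same way.
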
## Capabilities

Extracts:

- Row and column counts for each table
- For each column, if applicable:
  - null counts and proportions
  - distinct counts and proportions
  - minimum, maximum, mean, median, standard deviation, some quantile values
  - histograms or frequencies of unique values

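To make the statistics listed above concrete, here is a minimal sketch that computes them in plain Python for a small in-memory table. It is not how the profiler works internally (the profiler issues SQL queries against the source); `profile_column` and `profile_table` are hypothetical names, and taking the distinct proportion over non-null values is an assumption made for this example.

```python
import statistics
from collections import Counter
from typing import Any, Dict, List, Optional


def profile_column(values: List[Optional[float]]) -> Dict[str, Any]:
    """Compute column-level statistics for a list of numeric values (None = null)."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    null_count = total - len(non_null)
    distinct_count = len(set(non_null))
    stats: Dict[str, Any] = {
        "null_count": null_count,
        "null_proportion": null_count / total if total else None,
        "distinct_count": distinct_count,
        # Assumption: proportion is taken over non-null values.
        "distinct_proportion": distinct_count / len(non_null) if non_null else None,
    }
    if non_null:
        stats["min"] = min(non_null)
        stats["max"] = max(non_null)
        stats["mean"] = statistics.mean(non_null)
        stats["median"] = statistics.median(non_null)
        stats["frequencies"] = dict(Counter(non_null))
    if len(non_null) >= 2:
        # Sample standard deviation and quartiles need at least two points.
        stats["stdev"] = statistics.stdev(non_null)
        stats["quartiles"] = statistics.quantiles(non_null, n=4)
    return stats


def profile_table(columns: Dict[str, List[Optional[float]]]) -> Dict[str, Any]:
    """Compute table-level counts plus a profile for each column."""
    row_count = max((len(v) for v in columns.values()), default=0)
    return {
        "row_count": row_count,
        "column_count": len(columns),
        "columns": {name: profile_column(vals) for name, vals in columns.items()},
    }


# Made-up sample data.
table = {
    "quantity": [3, 1, None, 3, 5, None, 8, 3],
    "discount": [0.0, 0.1, 0.1, None, 0.2, 0.0, 0.0, 0.5],
}
profile = profile_table(table)
print(profile["row_count"], profile["column_count"])
print(profile["columns"]["quantity"]["null_count"])
```
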
## Supported Sources

SQL profiling is available for SQL-based sources, but support varies by source. Check the individual source page to confirm that it supports profiling.