Mirror of https://github.com/datahub-project/datahub.git, synced 2025-11-04 04:39:10 +00:00

30 lines, 964 B, Markdown

		
		
			
		
	
	
# SQL Profiling

SQL Profiling collects table-level and column-level statistics.
The SQL-based profiler does not run on its own; instead, it can be enabled on other SQL-based sources.
Enabling profiling will slow down ingestion runs.

:::caution

Running profiling against many tables or over many rows can incur significant costs.
While we've done our best to limit the cost of the queries the profiler runs, you
should be prudent about the set of tables profiling is enabled on and the frequency
of the profiling runs.

:::
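
One common way to keep the profiled set small is to filter table names with allow and deny patterns before any profiling query runs. The following is a minimal sketch of that idea only. It is not the profiler's own code: the function `should_profile`, the table names, and the patterns are all hypothetical, so check the individual source page for the filtering options it actually provides.

```python
import re


def should_profile(table_name, allow, deny):
    """Return True if the table matches an allow pattern and no deny pattern."""
    # Deny patterns take precedence over allow patterns in this sketch.
    if any(re.match(pattern, table_name) for pattern in deny):
        return False
    return any(re.match(pattern, table_name) for pattern in allow)


# Hypothetical table names and patterns, for illustration only.
tables = ["sales.orders", "sales.orders_staging", "hr.salaries", "sales.customers"]
allow = [r"sales\..*"]
deny = [r".*_staging$"]

selected = [t for t in tables if should_profile(t, allow, deny)]
print(selected)  # ['sales.orders', 'sales.customers']
```

Giving deny patterns precedence is a design choice of this sketch: it lets a broad allow rule be narrowed without rewriting it.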

## Capabilities

Extracts:

- Row and column counts for each table
- For each column, if applicable:
  - null counts and proportions
  - distinct counts and proportions
  - minimum, maximum, mean, median, standard deviation, some quantile values
  - histograms or frequencies of unique values
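
To make the statistics listed above concrete, here is a minimal sketch that computes a few of them with plain SQL against an in-memory SQLite database. It only illustrates the kind of aggregate queries a SQL profiler issues and is not the profiler's actual implementation. The function `profile_table`, the `orders` table, and the choice to compute the distinct proportion over non-null values are all assumptions made for this example.

```python
import sqlite3


def profile_table(conn, table, columns):
    """Compute a few table-level and column-level statistics with plain SQL."""
    cur = conn.cursor()
    # Table-level statistic: total row count.
    row_count = cur.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    profile = {"row_count": row_count, "column_count": len(columns), "columns": {}}
    for col in columns:
        # COUNT(col) skips NULLs, so row_count - COUNT(col) is the null count.
        non_null, distinct, minimum, maximum, mean = cur.execute(
            f'SELECT COUNT("{col}"), COUNT(DISTINCT "{col}"), '
            f'MIN("{col}"), MAX("{col}"), AVG("{col}") FROM "{table}"'
        ).fetchone()
        null_count = row_count - non_null
        profile["columns"][col] = {
            "null_count": null_count,
            "null_proportion": null_count / row_count if row_count else None,
            "distinct_count": distinct,
            # Assumption: distinct proportion is taken over non-null values.
            "distinct_proportion": distinct / non_null if non_null else None,
            "min": minimum,
            "max": maximum,
            "mean": mean,
        }
    return profile


# Hypothetical sample data with one NULL and one repeated value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 20.0), (3, None), (4, 20.0)],
)
orders_profile = profile_table(conn, "orders", ["id", "amount"])
print(orders_profile)
```

Each column costs one full scan of the table in this sketch, which is part of why profiling many large tables slows ingestion and can become expensive.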

## Supported Sources

SQL profiling is designed to work with all SQL-based sources. Check the individual source page to verify whether a given source supports it.