- Deprecates UsageAggregationClass, /usageStats?action=batchIngest, UsageStatsWorkUnit
- Removes parsing of UsageAggregationClass in file source, all sinks, and WorkUnitRecordExtractor
- Adds usage extraction to the unity catalog source and a TableReference object to handle references to tables
Also makes the following refactors:
- Creates UsageAggregator class to usage_common, as I've seen this same logic multiple times.
- Allows customizable user_urn_builder in usage_common as not all unity users are emails. We create emails with a default email_domain config in other connectors like redshift and snowflake, which seems unnecessary now?
- Creates TableReference for unity catalog and adds it to the Table dataclass, for managing string references to tables. Replaces logic, especially in lineage extraction, with these references
- Creates gen_dataset_urn and gen_user_urn on unity source to reduce duplicate code
Breaks up proxy.py into implementation and types
Includes major refactor of bigquery usage ingestion, minor refactor of the source as a whole, and reporting cleanup.
Includes bigquery performance testing changes.