mirror of
https://github.com/datahub-project/datahub.git
synced 2025-07-31 13:27:58 +00:00

- Adds usage extraction to the Unity Catalog source, and a TableReference object to handle references to tables.

Also makes the following refactors:
- Adds a UsageAggregator class to usage_common, as I've seen this same logic multiple times.
- Allows a customizable user_urn_builder in usage_common, as not all Unity users are emails. We create emails with a default email_domain config in other connectors like Redshift and Snowflake, which seems unnecessary now?
- Creates TableReference for Unity Catalog and adds it to the Table dataclass, for managing string references to tables. Replaces existing logic, especially in lineage extraction, with these references.
- Creates gen_dataset_urn and gen_user_urn on the Unity source to reduce duplicate code.
- Breaks up proxy.py into implementation and types.
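
To illustrate the two main ideas above (a table reference object and a pluggable user URN builder), here is a minimal sketch. It is not the code from this change: the field names on TableReference, the helpers plain_user_urn, email_user_urn and count_queries, and the example.com default domain are all assumptions made for illustration. The only thing taken as given is the `urn:li:corpuser:<id>` shape of a DataHub user URN.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class TableReference:
    """Hypothetical stand-in: a hashable reference to a Unity Catalog table."""

    metastore: str
    catalog: str
    schema: str
    table: str

    @classmethod
    def from_qualified_name(cls, metastore: str, name: str) -> "TableReference":
        # Assumes the usual three-level "catalog.schema.table" form.
        parts = name.split(".")
        if len(parts) != 3:
            raise ValueError(f"expected catalog.schema.table, got {name!r}")
        return cls(metastore, *parts)

    def __str__(self) -> str:
        return ".".join((self.metastore, self.catalog, self.schema, self.table))


# Any callable mapping a raw user identifier to a user URN string.
UserUrnBuilder = Callable[[str], str]


def plain_user_urn(username: str) -> str:
    # Uses the identifier as-is, e.g. for service principals that are not emails.
    return f"urn:li:corpuser:{username}"


def email_user_urn(username: str, email_domain: str = "example.com") -> str:
    # Email-style variant: appends an assumed default domain when none is present.
    email = username if "@" in username else f"{username}@{email_domain}"
    return f"urn:li:corpuser:{email}"


def count_queries(
    events: Iterable[Tuple[TableReference, str]],
    user_urn_builder: UserUrnBuilder = plain_user_urn,
) -> Counter:
    """Counts queries per (table, user URN); a toy version of usage aggregation."""
    counts: Counter = Counter()
    for ref, username in events:
        counts[(ref, user_urn_builder(username))] += 1
    return counts
```

Because the reference is a frozen dataclass, it can be used directly as a dictionary key, which is what makes it convenient for both usage counts and lineage lookups. Passing the URN builder as a parameter lets a source whose users are not emails skip the email_domain step entirely.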