1858 Commits

Author SHA1 Message Date
Jonny Dixon
b44c0365cf
Merge branch 'master' into fivetran-std-edition-support 2025-09-29 14:50:23 +01:00
Jonny Dixon
8ddee246c2 updates 2025-09-29 14:45:14 +01:00
Jonny Dixon
d88df1a847 Implement incremental lineage and improve job/lineage extraction debugging
Incremental Lineage Implementation:
- Add IncrementalLineageConfigMixin to FivetranSourceConfig
- Implement auto_incremental_lineage processor in get_workunit_processors
- Add comprehensive test coverage for incremental lineage functionality

🔧 Job Extraction Improvements:
- Enhanced sync history extraction with better error messages
- Added alternative job extraction for connectors with recent activity
- Implement synthetic job creation from connector timestamps
- Better debugging for missing sync history and job extraction failures

🐛 Bug Fixes:
- Fix Pydantic v1/v2 compatibility issues with HiddenFromDocs
- Improve error handling and logging throughout job extraction pipeline
- Add detailed logging for schema and lineage extraction failures

📊 Enhanced Debugging:
- More informative error messages for sync history extraction failures
- Better logging for job timestamp parsing and validation
- Improved lineage extraction debugging with schema count information
2025-09-29 09:12:07 +01:00
Mayuri Nehate
e698f0bf1d
feat(sdk/search): add tags, glossary terms filter (#14873)
Co-authored-by: Mayuri N <mayuri.nehate@datahub.com>
2025-09-29 06:31:39 +00:00
Mayuri Nehate
8d13b03e85
feat(sdk/search): add owner filter (#14649)
Co-authored-by: Mayuri N <mayuri.nehate@datahub.com>
2025-09-29 04:35:29 +00:00
Anush Kumar
c18b125a05
feat(ingestion): Enhanced column lineage extraction for Looker/LookML (#14826) 2025-09-26 09:27:18 -07:00
Tamas Nemeth
7e9c525448
fix(ingestion): Fix for module level variable caching in sqllite check (#14861) 2025-09-26 14:24:51 +02:00
Michael Maltese
eb066dcf1e
fix(ingest/gcs): fix a number of issues and add integration tests (#14857) 2025-09-25 14:53:33 -04:00
Michael Maltese
55d714e0cd
fix(ingest/mssql): don't split_statements on keywords inside bracketed identifiers (#14863) 2025-09-25 12:29:38 -04:00
skrydal
b0c9662be7
feature(transformers): Introduce Set browsePathsV2 transformer (#14825) 2025-09-23 19:20:01 +00:00
Sergio Gómez Villamor
ec166abade
tests(snaplogic): fix tests (#14848) 2025-09-23 18:18:32 +00:00
Anush Kumar
0f69e96078
feat(sdk): Added support for Change Audit Stamps in Dashboard and Chart entities (#14815) 2025-09-23 07:07:52 -07:00
Michael Maltese
9be65bd971
feat(ingest/tableau): enable extract_lineage_from_unsupported_custom_sql_queries by default (#14717) 2025-09-23 09:39:56 -04:00
sabdul
6a97baeee4
feat(ingestion/snaplogic): Add snaplogic as a source for metadata ingestion (#14231) 2025-09-23 20:26:24 +09:00
Jonny Dixon
8f29ff821d Merge branch 'master' into fivetran-std-edition-support 2025-09-23 11:02:30 +01:00
Jonny Dixon
01ef4eeac4 improved throughput 2025-09-23 10:41:31 +01:00
Harshal Sheth
a17fc4e0a8
chore(python): drop pydantic v1 support (#14014)
Co-authored-by: Sergio Gómez Villamor <sgomezvillamor@gmail.com>
Co-authored-by: Piotr Skrydalewicz <piotr.skrydalewicz@acryl.io>
2025-09-23 07:40:29 +00:00
Jonny Dixon
02ff0283c0
Merge branch 'master' into fivetran-std-edition-support 2025-09-22 15:34:40 +01:00
Jonny Dixon
173e3f912a updates 2025-09-22 10:34:21 -04:00
Anush Kumar
7c1200c704
refactor(ingestion): looker source migration to use SDKv2 entities (#14693) 2025-09-18 13:26:50 -07:00
Jonny Dixon
5be17c6444
feat(ingestion/tableau): parameter to have entity owners as email address of owner (#14724) 2025-09-18 15:25:14 +00:00
Jonny Dixon
8bc7b9e6cc
Merge branch 'master' into fivetran-std-edition-support 2025-09-18 15:47:12 +01:00
Jonny Dixon
f6872f0cea Merge branch 'fivetran-std-edition-support' of https://github.com/datahub-project/datahub into fivetran-std-edition-support 2025-09-18 10:40:21 -04:00
Jonny Dixon
a8a6631b39 memory improvements 2025-09-18 10:40:18 -04:00
Anush Kumar
da885c6196
refactor(ingestion): lookml source migration to use SDKv2 entities (#14710) 2025-09-17 19:37:42 -07:00
Benjamin Maquet
5c07dc6e5a
feat(superset/preset): propagate chart & dashboard tags to DataHub (#14538) 2025-09-17 17:19:43 -04:00
Jonny Dixon
8072271af0
Merge branch 'master' into fivetran-std-edition-support 2025-09-17 18:11:35 +01:00
Jonny Dixon
a171710129 improvements for lineage 2025-09-17 12:36:39 -04:00
Abdullah
acffdce986
feat(dbt): add filtering for materialized nodes based on their physical location (#14689)
Co-authored-by: Abdullah Tariq <abdullah.tariq@adevinta.com>
Co-authored-by: skrydal <piotr.skrydalewicz@gmail.com>
2025-09-17 13:18:16 +00:00
Kevin Karch
002cc398d0
fix(ingest): change redash sql parse error to warnining (#14785) 2025-09-17 08:06:15 -04:00
Jonny Dixon
4a13f3a37f
Merge branch 'master' into fivetran-std-edition-support 2025-09-17 12:16:26 +01:00
Jonny Dixon
2feaa39256 improved naming 2025-09-17 07:14:21 -04:00
skrydal
667b7cb12c
fix(sdk_v2/lineage): Fix handling of null platform (#14784) 2025-09-17 09:11:03 +02:00
Sergio Gómez Villamor
d82ae8014e
feat(bigquery): add created and modified timestamps to dataset containers (#14716)
Co-authored-by: Claude <noreply@anthropic.com>
2025-09-15 18:24:58 +02:00
Jonny Dixon
e88ca21448
Merge branch 'master' into fivetran-std-edition-support 2025-09-15 09:49:28 +01:00
Sergio Gómez Villamor
492e28a938
feat(ingest/neo4j): migrate Neo4j source to DataHub Python SDK v2 (#14591)
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-09-15 08:42:54 +00:00
Jonny Dixon
50243cdeac fixed tests 2025-09-15 04:19:56 -04:00
Jonny Dixon
62557841bc test fixes 2025-09-13 11:39:03 +01:00
Jonny Dixon
0ba70347b8 adding test coverage 2025-09-12 23:28:55 +01:00
Jonny Dixon
e5591e8865 better formatting 2025-09-12 19:28:58 +01:00
Jonny Dixon
526f23be28 Merge origin/master into fivetran-std-edition-support
Resolved merge conflicts by:
- Keeping improved dynamic service mapping over hardcoded platform mappings
- Maintaining enhanced parallel processing and error handling
- Preserving comprehensive API validation and accessibility checks
- Using updated DataFlow/DataJob constructors with better parameter naming
- Keeping improved golden test expectations that reflect our enhancements
- Cleaned up unused imports (StructuredLogCategory, CorpUserUrn)

All 49 unit tests passing with comprehensive Fivetran connector improvements.
2025-09-12 17:42:27 +01:00
Jonny Dixon
6471c32457 feat(fivetran): Add parallel processing and optimize Standard API performance
- Add ThreadPoolExecutor for parallel column retrieval and connector processing
- Add max_workers configuration parameter for controlling parallelization
- Improve error handling for 400 Bad Request responses in column retrieval
- Fix per-table mode DPI generation to avoid excessive duplication
- Improve Fivetran edition auto-detection to prefer enterprise mode
- Fix GraphQL operation parameter naming issue

Performance improvements:
- Column retrieval now parallelized within batches
- Connector lineage extraction now parallelized across connectors
- Reduced unnecessary retries for 400 errors (permanent failures)
- Better error logging and handling

This significantly speeds up Standard API ingestion while maintaining reliability.
2025-09-12 15:02:37 +01:00
Brock Griffey
4244620e7a
feat(cassandra): Add optional SSL configuration (#14726) 2025-09-11 15:37:30 +00:00
Tamas Nemeth
01932d3f87
fix(ingest/pipeline): Fix for slow ingestion and incomplete ingestion report metrics (#14735) 2025-09-11 16:07:47 +02:00
Aseem Bansal
137ffb7d48
fix(ingest): only add to samples where platform match (#14722) 2025-09-11 13:26:39 +05:30
skrydal
5f23652fd3
fix(ingestion/iceberg): Improve iceberg source resiliency to server errors (#14731) 2025-09-11 00:57:03 +02:00
Tamas Nemeth
a82d4e0647
fix(ingest/athena): Fix Athena partition extraction and CONCAT function type issues (#14712) 2025-09-10 12:33:54 +02:00
Tamas Nemeth
4ea758da19
chore(ingest/sqlparser): Bump sqlglot to 27.12.0 (#14673) 2025-09-09 19:57:52 +02:00
Benjamin Maquet
9105241bfd
feat(superset/preset): add dataset and column description (#14426) 2025-09-08 16:35:43 +09:00
skrydal
cc8e87143e
fix(cli): Fix to the deletion command (#14667)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2025-09-05 11:37:55 +00:00