Jonny Dixon
b44c0365cf
Merge branch 'master' into fivetran-std-edition-support
2025-09-29 14:50:23 +01:00
Jonny Dixon
8ddee246c2
updates
2025-09-29 14:45:14 +01:00
Jonny Dixon
d88df1a847
Implement incremental lineage and improve job/lineage extraction debugging
...
✅ Incremental Lineage Implementation:
- Add IncrementalLineageConfigMixin to FivetranSourceConfig
- Implement auto_incremental_lineage processor in get_workunit_processors
- Add comprehensive test coverage for incremental lineage functionality
🔧 Job Extraction Improvements:
- Enhanced sync history extraction with better error messages
- Added alternative job extraction for connectors with recent activity
- Implement synthetic job creation from connector timestamps
- Better debugging for missing sync history and job extraction failures
🐛 Bug Fixes:
- Fix Pydantic v1/v2 compatibility issues with HiddenFromDocs
- Improve error handling and logging throughout job extraction pipeline
- Add detailed logging for schema and lineage extraction failures
📊 Enhanced Debugging:
- More informative error messages for sync history extraction failures
- Better logging for job timestamp parsing and validation
- Improved lineage extraction debugging with schema count information
2025-09-29 09:12:07 +01:00
Mayuri Nehate
e698f0bf1d
feat(sdk/search): add tags, glossary terms filter ( #14873 )
...
Co-authored-by: Mayuri N <mayuri.nehate@datahub.com>
2025-09-29 06:31:39 +00:00
Mayuri Nehate
8d13b03e85
feat(sdk/search): add owner filter ( #14649 )
...
Co-authored-by: Mayuri N <mayuri.nehate@datahub.com>
2025-09-29 04:35:29 +00:00
Anush Kumar
c18b125a05
feat(ingestion): Enhanced column lineage extraction for Looker/LookML ( #14826 )
2025-09-26 09:27:18 -07:00
Tamas Nemeth
7e9c525448
fix(ingestion): Fix for module level variable caching in sqlite check ( #14861 )
2025-09-26 14:24:51 +02:00
Michael Maltese
55d714e0cd
fix(ingest/mssql): don't split_statements on keywords inside bracketed identifiers ( #14863 )
2025-09-25 12:29:38 -04:00
skrydal
b0c9662be7
feature(transformers): Introduce Set browsePathsV2 transformer ( #14825 )
2025-09-23 19:20:01 +00:00
Anush Kumar
0f69e96078
feat(sdk): Added support for Change Audit Stamps in Dashboard and Chart entities ( #14815 )
2025-09-23 07:07:52 -07:00
Jonny Dixon
8f29ff821d
Merge branch 'master' into fivetran-std-edition-support
2025-09-23 11:02:30 +01:00
Jonny Dixon
01ef4eeac4
improved throughput
2025-09-23 10:41:31 +01:00
Harshal Sheth
a17fc4e0a8
chore(python): drop pydantic v1 support ( #14014 )
...
Co-authored-by: Sergio Gómez Villamor <sgomezvillamor@gmail.com>
Co-authored-by: Piotr Skrydalewicz <piotr.skrydalewicz@acryl.io>
2025-09-23 07:40:29 +00:00
Jonny Dixon
02ff0283c0
Merge branch 'master' into fivetran-std-edition-support
2025-09-22 15:34:40 +01:00
Anush Kumar
7c1200c704
refactor(ingestion): looker source migration to use SDKv2 entities ( #14693 )
2025-09-18 13:26:50 -07:00
Jonny Dixon
5be17c6444
feat(ingestion/tableau): parameter to have entity owners as email address of owner ( #14724 )
2025-09-18 15:25:14 +00:00
Jonny Dixon
8072271af0
Merge branch 'master' into fivetran-std-edition-support
2025-09-17 18:11:35 +01:00
Jonny Dixon
a171710129
improvements for lineage
2025-09-17 12:36:39 -04:00
Kevin Karch
002cc398d0
fix(ingest): change redash sql parse error to warning ( #14785 )
2025-09-17 08:06:15 -04:00
Jonny Dixon
4a13f3a37f
Merge branch 'master' into fivetran-std-edition-support
2025-09-17 12:16:26 +01:00
skrydal
667b7cb12c
fix(sdk_v2/lineage): Fix handling of null platform ( #14784 )
2025-09-17 09:11:03 +02:00
Sergio Gómez Villamor
d82ae8014e
feat(bigquery): add created and modified timestamps to dataset containers ( #14716 )
...
Co-authored-by: Claude <noreply@anthropic.com>
2025-09-15 18:24:58 +02:00
Jonny Dixon
e88ca21448
Merge branch 'master' into fivetran-std-edition-support
2025-09-15 09:49:28 +01:00
Sergio Gómez Villamor
492e28a938
feat(ingest/neo4j): migrate Neo4j source to DataHub Python SDK v2 ( #14591 )
...
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-09-15 08:42:54 +00:00
Jonny Dixon
62557841bc
test fixes
2025-09-13 11:39:03 +01:00
Jonny Dixon
0ba70347b8
adding test coverage
2025-09-12 23:28:55 +01:00
Jonny Dixon
e5591e8865
better formatting
2025-09-12 19:28:58 +01:00
Jonny Dixon
526f23be28
Merge origin/master into fivetran-std-edition-support
...
Resolved merge conflicts by:
- Keeping improved dynamic service mapping over hardcoded platform mappings
- Maintaining enhanced parallel processing and error handling
- Preserving comprehensive API validation and accessibility checks
- Using updated DataFlow/DataJob constructors with better parameter naming
- Keeping improved golden test expectations that reflect our enhancements
- Cleaned up unused imports (StructuredLogCategory, CorpUserUrn)
All 49 unit tests passing with comprehensive Fivetran connector improvements.
2025-09-12 17:42:27 +01:00
Jonny Dixon
6471c32457
feat(fivetran): Add parallel processing and optimize Standard API performance
...
- Add ThreadPoolExecutor for parallel column retrieval and connector processing
- Add max_workers configuration parameter for controlling parallelization
- Improve error handling for 400 Bad Request responses in column retrieval
- Fix per-table mode DPI generation to avoid excessive duplication
- Improve Fivetran edition auto-detection to prefer enterprise mode
- Fix GraphQL operation parameter naming issue
Performance improvements:
- Column retrieval now parallelized within batches
- Connector lineage extraction now parallelized across connectors
- Reduced unnecessary retries for 400 errors (permanent failures)
- Better error logging and handling
This significantly speeds up Standard API ingestion while maintaining reliability.
2025-09-12 15:02:37 +01:00
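Note on 6471c32457: the pattern this commit describes (parallel column retrieval bounded by max_workers, with no retries for 400 responses) can be sketched roughly as below. This is an illustrative sketch only, not the connector's actual code. The names PermanentError, TransientError, fetch_with_retry, and fetch_columns_parallel are hypothetical, and the fetch callable stands in for whatever API call the source really makes.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List


class PermanentError(Exception):
    """Hypothetical: stands in for an HTTP 400, which retrying cannot fix."""


class TransientError(Exception):
    """Hypothetical: stands in for a failure worth retrying (e.g. a timeout)."""


def fetch_with_retry(
    fetch: Callable[[str], List[str]], table: str, max_attempts: int = 3
) -> List[str]:
    """Call fetch(table), retrying transient failures only."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(table)
        except PermanentError:
            # A 400 is treated as permanent: give up at once, no retry.
            return []
        except TransientError:
            if attempt == max_attempts:
                return []
    return []


def fetch_columns_parallel(
    tables: List[str],
    fetch: Callable[[str], List[str]],
    max_workers: int = 4,
) -> Dict[str, List[str]]:
    """Fetch columns for every table in parallel, keyed by table name."""
    results: Dict[str, List[str]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its table so results can be keyed correctly.
        futures = {pool.submit(fetch_with_retry, fetch, t): t for t in tables}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```

In this sketch a table whose fetch fails permanently maps to an empty column list, so one bad table does not abort the rest of the batch.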
Tamas Nemeth
01932d3f87
fix(ingest/pipeline): Fix for slow ingestion and incomplete ingestion report metrics ( #14735 )
2025-09-11 16:07:47 +02:00
Aseem Bansal
137ffb7d48
fix(ingest): only add to samples where platform match ( #14722 )
2025-09-11 13:26:39 +05:30
skrydal
5f23652fd3
fix(ingestion/iceberg): Improve iceberg source resiliency to server errors ( #14731 )
2025-09-11 00:57:03 +02:00
Tamas Nemeth
a82d4e0647
fix(ingest/athena): Fix Athena partition extraction and CONCAT function type issues ( #14712 )
2025-09-10 12:33:54 +02:00
Tamas Nemeth
4ea758da19
chore(ingest/sqlparser): Bump sqlglot to 27.12.0 ( #14673 )
2025-09-09 19:57:52 +02:00
skrydal
cc8e87143e
fix(cli): Fix to the deletion command ( #14667 )
...
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2025-09-05 11:37:55 +00:00
Harshal Sheth
188e5af79e
feat(ingest): respect user email pattern in usage aggregator ( #14562 )
...
Co-authored-by: Claude <noreply@anthropic.com>
2025-09-03 23:34:42 +00:00
Harshal Sheth
ceb8dd2e11
feat(sdk): add container support for charts and dashboards ( #14641 )
2025-09-03 12:36:07 -07:00
Tamas Nemeth
9ec58e3876
fix(ingest/unity): Backport proxy fix for unity catalog sql library ( #14571 )
2025-09-03 17:37:45 +02:00
Michael Maltese
117a9e91d0
fix(ingest/databricks): fix upstream external path lineage when using system tables ( #14633 )
2025-09-02 22:54:52 +02:00
Hyejin Yoon
584f6ce3d0
feat(ingest/unity) : add mlmodel / mlmodel version support ( #14594 )
2025-09-02 15:22:17 +09:00
Anush Kumar
be8c684b35
refactor(ingestion): renamed redshift lineage_v2 to lineage and other v2 nomenclatures ( #14603 )
2025-08-29 11:37:26 -07:00
Anush Kumar
b64f2a1533
refactor(ingestion): Updated Redshift lineage_v1 refs and removed v1 implementation ( #14580 )
2025-08-29 10:09:10 -07:00
Sergio Gómez Villamor
ea0677b918
feat(snowflake): add China region support ( #14434 )
...
Co-authored-by: Claude <noreply@anthropic.com>
2025-08-29 10:01:52 +02:00
Sergio Gómez Villamor
67a441f312
fix(tool_meta_extractor): relax hex query detection to search entire query text ( #14582 )
2025-08-28 13:22:40 +02:00
Mayuri Nehate
fe8f108746
fix(sdk): make Filter type permissive of implicit and dict ( #14569 )
2025-08-28 15:22:00 +05:30
Michael Minichino
340b1bf930
feat(ingest/excel): Add Excel Source ( #13261 )
...
Co-authored-by: Sergio Gómez Villamor <sgomezvillamor@gmail.com>
2025-08-28 08:03:37 +00:00
Michael Minichino
0252818bd0
feat(ingest/powerbi): Add ODBC SQL query parsing with DSN-to-database/schema mapping ( #13752 )
...
Co-authored-by: Sergio Gómez Villamor <sgomezvillamor@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-08-27 15:25:23 +05:30
Michael Maltese
da127b92df
feat(s3): support wildcards in bucket name component of path_specs ( #14549 )
2025-08-26 16:29:52 -04:00
Jonny Dixon
0462415095
feat(ingestion/sql-queries): support incremental lineage ( #14548 )
2025-08-26 10:02:40 +01:00
Sergio Gómez Villamor
b3f20ee437
test(ingestion/json-schema): add test for JSON Schema $ref loop in definitions ( #14536 )
2025-08-25 07:21:50 +02:00