80 Commits

Author SHA1 Message Date
Harshal Sheth
44cfd21a65
chore(ingest): bump and pin mypy (#6584) 2022-12-02 19:53:28 +01:00
Harshal Sheth
817406eadb
refactor(ingest): simplify stateful ingestion config (#6454)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-11-18 00:09:24 -05:00
Harshal Sheth
8c322ede35
feat(ingest): allow specific profiler config fields to override profile_table_level_only (#6366) 2022-11-16 23:49:31 +01:00
Harshal Sheth
3e907ab0d1
feat(ingest): loosen sqlalchemy dep & support airflow 2.3+ (#6204)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-11-11 15:04:36 -05:00
wangsaisai
92192d2410
feat(ingest): enable container stateful ingestion (#6343) 2022-11-08 12:34:43 +01:00
Harshal Sheth
09616ee2b3
feat(ingest): include instance in container dataPlatform when provided (#6083) 2022-10-13 11:29:54 -07:00
Tamas Nemeth
3b1a0c568e
fix(ingest): bigquery-beta - Getting datasets with bigquery client (#6039)
* Getting datasets with the bigquery client instead of the information schema, because the information schema did not work everywhere
* Changing lists to LossyList in report
2022-09-24 00:48:15 +02:00
Ravindra Lanka
ee68f09624
feat(ingestion): Refactor standard state-handling tasks into a common handler that are common across all stateful ingestion sources. (#5766) 2022-09-14 09:30:42 -07:00
Harshal Sheth
dfeced8eee
fix(ingest): hide internal profiler.allow_deny_patterns from config (#5619) 2022-09-13 16:09:10 +05:30
Harshal Sheth
e556bcb306
feat(ingest): add entity type inference to mcpw (#5880) 2022-09-10 20:36:10 -07:00
Harshal Sheth
abddc01877
fix(ingest): fix doc generation import ordering issue with postgres (#5846)
Relying on the correct import directly, rather than going through
SQLAlchemy's import wrapper (in their dialect.py) allows us to bypass
this strange error in doc generation.
2022-09-07 21:44:33 +05:30
Mugdha Hardikar
e448bb8832
feat(ingest): mysql - support multiple database in single recipe (#5684) 2022-08-26 19:47:49 +02:00
liyuhui666
0481075705
fix(ingest): Fix ingest Clickhouse without password (#5511)
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-08-09 10:30:56 -07:00
Mugdha Hardikar
d6621b4481
fix(ingest): sql-common - db2, snowflake bug fixes to extract table descriptions (#5526)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-08-07 09:23:23 -07:00
Harshal Sheth
858b132444
fix(ingest): fix some typos and logging issues (#5564) 2022-08-03 21:01:45 -07:00
Harshal Sheth
690443ce14
fix(ingest): cleanup unused flake8 noqa statements (#5492)
In the future, we can discover these using `flake8-noqa`.

* add back c901
2022-07-27 22:02:32 +05:30
Tamas Nemeth
9ec4fbae86
fix(ingest): bigquery - Graceful bq partition id date parsing failure (#5386) 2022-07-13 13:21:45 +02:00
Shirshanka Das
860d475c5e
feat(ingest): improve domain ingestion usability (#5366) 2022-07-11 09:37:38 -07:00
Mugdha Hardikar
5216d72f91
feat(bigquery): support size, rowcount, lastmodified based table selection for profiling (#5329)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-07-06 15:29:26 +05:30
Tamas Nemeth
f08c3f784f
fix(ingest): bigquery - Fix for bigquery error when there was no bigquery catalog specified (#5303) 2022-07-01 17:47:07 +02:00
Aseem Bansal
32e598df2b
feat(ingest): working with multiple bigquery projects (#5240) 2022-06-27 14:21:54 +05:30
Aseem Bansal
f66a6b41ef
fix(ingest): do not dump password (#5235) 2022-06-24 17:05:39 +05:30
Tamas Nemeth
1086614129
fix(ingest): bigquery - Grouping date named tables at bigquery (#5230)
* Grouping date named tables at bigquery
* Fixing table name for sharded tables
2022-06-23 11:53:25 +02:00
mayurinehate
7b143b06fc
feat(ingest): snowflake profile tables only if they have been updated since N days (#5132) 2022-06-13 14:59:16 +05:30
Aseem Bansal
2b06cb19fd
fix(bigquery): handling of empty partitioned tables, improve report message (#5122) 2022-06-08 23:03:32 +05:30
Aseem Bansal
912ce11821
fix(bigquery): reduce number of calls for details of partitioning (#5014) 2022-05-27 13:09:08 +05:30
buggythepirate
92338c7912
feat(ingest): Added new ingestion source SAP HANA (#4376)
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-05-26 03:42:50 -07:00
Ebu (えぶ)
2911e1ed1b
feat(ingest): Add Source from Vertica (#4555)
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-05-26 03:26:28 -07:00
Aseem Bansal
bbfe500581
fix(ingest): remove new schema field usage (#4987) 2022-05-24 22:02:40 +05:30
Aseem Bansal
a9ff203363
feat(bigquery): add partition key tag (#4974) 2022-05-23 21:27:13 +05:30
Aseem Bansal
2260118232
feat(bigquery): reduce logging (#4961)
* doc: add entry for behaviour change
2022-05-20 09:42:55 -07:00
Aseem Bansal
2bb2c5243c
fix(bigquery): add dataset_id for bigquery (#4932) 2022-05-19 10:43:06 +05:30
Ravindra Lanka
5c64e9d541
fix(ingestion): Allow profiling of only those tables that are allowed by the table_pattern. (#4842) 2022-05-06 11:07:31 +02:00
Shirshanka Das
a9ad138172
feat(ingest): docs - overhaul source connector docs to make it code driven (#4798)
Co-authored-by: MugdhaHardikar-GSLab <mugdha.hardikar@gslab.com>
2022-05-02 00:18:15 -07:00
Aseem Bansal
155209f0e1
fix(ingestion): add missing workunit ids (#4657) 2022-04-13 10:19:37 +02:00
Kevin Hu
08c34bfe15
feat(ingest): capture MSSQL table+column descriptions (#4579)
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-04-12 17:49:56 -07:00
Marcin Szymański
e7c5eb357c
feat(ingest): add trino platform for great expectations (#4594) 2022-04-11 19:48:15 -07:00
Marcin Szymański
7c3ad3d293
feat(ingest): enable connection string for all sqlalchemy datasources (#4508)
* Update sql_common.py

* fix types

* update docs

* rename variable to sqlalchemy_uri

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-04-07 23:11:52 -04:00
Tamas Nemeth
4358d8fb01
feat(ingest): athena - set Athena location as upstream (#4503) 2022-03-29 07:06:48 -07:00
Shirshanka Das
a69eac8247
feat(ingest): dbt,looker,sql_common,kafka - moving sources to produce display names and subtypes more consistently (#4496) 2022-03-27 18:49:26 -05:00
Aseem Bansal
c5f1d2c9bd
feat(ingestion): snowflake, bigquery - enhancements to log and bugfix (#4442)
feat(ingestion): add logging for snowflake, bigquery
2022-03-21 09:50:36 -07:00
Tamas Nemeth
f557b2c1b3
fix(ingestion) containers: Adding platform instance to container keys (#4279) 2022-03-16 14:57:50 -07:00
mayurinehate
9025bfb8d0
fix(ingest): extract redshift platform correctly from sqlalchemy uri (#4421)
* fix(ingest): extract redshift platform from sqlalchemy uri
2022-03-16 19:36:23 +01:00
Aseem Bansal
4bcc2b3d12
feat(ingestion): improve logging, docs for bigquery, snowflake, redshift (#4344) 2022-03-14 08:50:29 -07:00
Aseem Bansal
beb51ebf59
fix(ingestion): add logging, make job more resilient to errors (#4331) 2022-03-07 14:32:44 -08:00
Tamas Nemeth
2a5cf3dd07
feat(ingest): bigquery - ability to disable partition profiling (#4228) 2022-03-01 22:29:48 -08:00
Kevin Hu
46701319dc
feat(ingest): switch telemetry endpoint to Mixpanel (#4238) 2022-02-24 12:35:48 -08:00
Alexander Chashnikov
c2065bd7fe
feat(ingest): clickhouse - add initial support (#4057)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-02-21 07:36:08 -08:00
Kevin Hu
6b5fba882e
fix(ingest): sql-sources: add mapping for postgres types (#4179) 2022-02-20 19:32:15 -08:00
Tamas Nemeth
a4dc4137b7
feat(ingest): sql-sources - prevent hard failure on table/view ingestion exceptions (#4185) 2022-02-20 14:32:59 -08:00