103 Commits

Author SHA1 Message Date
RChygir
43d48ddde4
feat(ingest/mssql): load jobs and stored procedures (#5363) 2023-08-24 14:48:03 +05:30
Andrew Sikowitz
526e626146
feat(ingest): Add DataHub source (#8561)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2023-08-15 17:49:20 -04:00
Aseem Bansal
dac89fb1fb
feat(ingest): allow lower freq profiling based on date of month/day of week (#8489) 2023-08-04 10:13:48 +05:30
Gabe Lyons
843f82b943
feat(presto-on-hive): allow v1 fieldpaths in the presto-on-hive source (#8474) 2023-08-01 14:05:50 -07:00
Aseem Bansal
f4c0ed3aab
ingest(mysql): add storage bytes information (#8294)
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
2023-07-20 17:31:06 -04:00
Harshal Sheth
60dd9ef187
fix(ingest): remove original_table_name logic in sql source (#8130) 2023-05-31 15:58:09 -07:00
Andrew Sikowitz
fdbc4de695
refactor(ingest): Call source_helpers via new WorkUnitProcessors in base Source (#8101) 2023-05-24 13:36:19 -07:00
Andrew Sikowitz
a43903bf6d
refactor(ingest): Auto report workunits (#8061) 2023-05-22 17:06:31 -07:00
Andrew Sikowitz
2e1c3981aa
refactor(ingest): Move source_helpers.py from datahub/utilities -> datahub/api (#8052) 2023-05-17 20:51:06 -07:00
Harshal Sheth
e99875cac6
chore(ingest): enable flake8 bugbear linting (#7763) 2023-04-10 14:14:42 -07:00
Harshal Sheth
94fa62d431
chore(ingest): formatting + cleanup MCPW usages (#7706) 2023-03-29 11:43:25 -07:00
Harshal Sheth
d54ff061a4
fix(ingest): remove get_platform_instance_id from stateful ingestion (#7572)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2023-03-20 17:35:10 -07:00
Shirshanka Das
17e85979dd
refactor(ingest): subtypes - standardize (#7437) 2023-02-28 13:11:07 -08:00
Andrew Sikowitz
2764c44977
fix(ingest): Do not require platform_instance for stateful ingestion (#7397) 2023-02-21 21:27:44 -05:00
Tamas Nemeth
0697fbcf81
fix(ingest/vertica): Fixing missing container properties (#7197) 2023-01-31 19:52:55 +01:00
Tamas Nemeth
0cdb5e4b4b
refactor(ingest/containers): Refactoring container creation to common place (#6877) 2023-01-21 00:14:31 +01:00
Harshal Sheth
13cc16fbc2
fix(cli/lite): fix datahub lite serve command (#7089) 2023-01-20 10:21:24 +01:00
Harshal Sheth
432feaa16d
feat(ingest): mark database_alias and env as deprecated (#6901) 2023-01-09 19:58:19 +05:30
Harshal Sheth
f651646d3d
chore(ingest): remove inferred args to MCPW, part 2 (#6905) 2023-01-04 23:29:56 -05:00
Harshal Sheth
6bc85502ba
feat(ingest): add include_table_location_lineage flag for SQL common (#6934) 2023-01-04 14:30:33 -05:00
Mayuri Nehate
2129496c98
feat(ingest/snowflake): handle failures gracefully and raise permission failures (#6748) 2022-12-28 08:20:37 -08:00
Harshal Sheth
1d0c7852a7
feat(ingest): add db/schema properties hook to SQL common (#6847) 2022-12-22 13:38:59 -08:00
Tamas Nemeth
e41b455e14
fix(ingest): bigquery - sharded table support improvements (#6789) 2022-12-19 18:57:37 +01:00
Harshal Sheth
44cfd21a65
chore(ingest): bump and pin mypy (#6584) 2022-12-02 19:53:28 +01:00
Harshal Sheth
817406eadb
refactor(ingest): simplify stateful ingestion config (#6454)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-11-18 00:09:24 -05:00
Harshal Sheth
8c322ede35
feat(ingest): allow specific profiler config fields to override profile_table_level_only (#6366) 2022-11-16 23:49:31 +01:00
Harshal Sheth
3e907ab0d1
feat(ingest): loosen sqlalchemy dep & support airflow 2.3+ (#6204)
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
2022-11-11 15:04:36 -05:00
wangsaisai
92192d2410
feat(ingest): enable container stateful ingestion (#6343) 2022-11-08 12:34:43 +01:00
Harshal Sheth
09616ee2b3
feat(ingest): include instance in container dataPlatform when provided (#6083) 2022-10-13 11:29:54 -07:00
Tamas Nemeth
3b1a0c568e
fix(ingest): bigquery-beta - Getting datasets with bigquery client (#6039)
* Getting datasets with bigquery client instead of information schema because it did not work everywhere
Changing lists to lossylist in report
2022-09-24 00:48:15 +02:00
Ravindra Lanka
ee68f09624
feat(ingestion): Refactor standard state-handling tasks that are common across all stateful ingestion sources into a common handler. (#5766) 2022-09-14 09:30:42 -07:00
Harshal Sheth
dfeced8eee
fix(ingest): hide internal profiler.allow_deny_patterns from config (#5619) 2022-09-13 16:09:10 +05:30
Harshal Sheth
e556bcb306
feat(ingest): add entity type inference to mcpw (#5880) 2022-09-10 20:36:10 -07:00
Harshal Sheth
abddc01877
fix(ingest): fix doc generation import ordering issue with postgres (#5846)
Relying on the correct import directly, rather than going through
SQLAlchemy's import wrapper (in their dialect.py) allows us to bypass
this strange error in doc generation.
2022-09-07 21:44:33 +05:30
Mugdha Hardikar
e448bb8832
feat(ingest): mysql - support multiple database in single recipe (#5684) 2022-08-26 19:47:49 +02:00
liyuhui666
0481075705
fix(ingest): Fix ingest Clickhouse without password (#5511)
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-08-09 10:30:56 -07:00
Mugdha Hardikar
d6621b4481
fix(ingest): sql-common - db2, snowflake bug fixes to extract table descriptions (#5526)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-08-07 09:23:23 -07:00
Harshal Sheth
858b132444
fix(ingest): fix some typos and logging issues (#5564) 2022-08-03 21:01:45 -07:00
Harshal Sheth
690443ce14
fix(ingest): cleanup unused flake8 noqa statements (#5492)
In the future, we can discover these using `flake8-noqa`.

* add back c901
2022-07-27 22:02:32 +05:30
Tamas Nemeth
9ec4fbae86
fix(ingest): bigquery - Graceful bq partition id date parsing failure (#5386) 2022-07-13 13:21:45 +02:00
Shirshanka Das
860d475c5e
feat(ingest): improve domain ingestion usability (#5366) 2022-07-11 09:37:38 -07:00
Mugdha Hardikar
5216d72f91
feat(bigquery): support size, rowcount, lastmodified based table selection for profiling (#5329)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-07-06 15:29:26 +05:30
Tamas Nemeth
f08c3f784f
fix(ingest): bigquery - Fix for bigquery error when there was no bigquery catalog specified (#5303) 2022-07-01 17:47:07 +02:00
Aseem Bansal
32e598df2b
feat(ingest): working with multiple bigquery projects (#5240) 2022-06-27 14:21:54 +05:30
Aseem Bansal
f66a6b41ef
fix(ingest): do not dump password (#5235) 2022-06-24 17:05:39 +05:30
Tamas Nemeth
1086614129
fix(ingest): bigquery - Grouping date named tables at bigquery (#5230)
* Grouping date named tables at bigquery
* Fixing table name for sharded tables
2022-06-23 11:53:25 +02:00
mayurinehate
7b143b06fc
feat(ingest): snowflake profile tables only if they have been updated in the last N days (#5132) 2022-06-13 14:59:16 +05:30
Aseem Bansal
2b06cb19fd
fix(bigquery): handling of empty partitioned tables, improve report message (#5122) 2022-06-08 23:03:32 +05:30
Aseem Bansal
912ce11821
fix(bigquery): reduce number of calls for details of partitioning (#5014) 2022-05-27 13:09:08 +05:30
buggythepirate
92338c7912
feat(ingest): Added new ingestion source SAP HANA (#4376)
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
2022-05-26 03:42:50 -07:00