588 Commits

Author SHA1 Message Date
Kevin Hu
40ecd96c7a
feat(ingest): replace and warn against relative imports (#3033) 2021-08-06 10:25:30 -07:00
John Joyce
352a0abf8d
Introducing TimeSeries Aspects + Dataset Profile (Stats) Aspect (#2983)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-07-30 17:41:03 -07:00
Gabe Lyons
aa253f5b3b
feat(deletes): add run commands (list, show, rollback) to datahub ingest (#2960) 2021-07-29 20:04:40 -07:00
Kevin Hu
bf2775f6cf
feat(ingest): type stubs for boto3 (#2975) 2021-07-28 20:35:35 -07:00
Harshal Sheth
7ab6355b1c
feat(ingest): stricter deserialization for MCE JSONs (#2976) 2021-07-28 14:50:21 -07:00
Harshal Sheth
328b098d01
test(ingestion): test multiple python versions in CI (#2952) 2021-07-26 13:25:58 -07:00
Kevin Hu
662017ef17
fix(ingest): patch lookml types and refactor ingestion sources layout (#2950) 2021-07-26 13:06:52 -07:00
Harshal Sheth
5f0b4464f5
fix(ingest): pin snowflake sqlalchemy connector (#2923) 2021-07-20 19:28:40 -07:00
aseembansal-gogo
6e1b2cf4f7
feat(ingest): Add option to change name of database for postgres (#2898) 2021-07-20 07:01:42 -07:00
Harshal Sheth
283d131546
test(ingest): update tox test configurations and test airflow 2.x by default (#2906) 2021-07-17 20:00:50 -07:00
Harshal Sheth
8e573fdb31
fix(ingest): fix druid misconfiguration bug (#2882) 2021-07-14 20:29:23 -07:00
Harshal Sheth
be39037b10
build(ingest): reduce dependencies for dev install (#2872) 2021-07-14 20:02:48 -07:00
Harshal Sheth
74e34dddfc
feat(ingest): prettify stack traces in CLI (#2845) 2021-07-08 13:29:34 -07:00
Harshal Sheth
2f921d15e8
fix(ingest): avoid setting timestamps unless source system provides it (#2843) 2021-07-08 12:11:06 -07:00
Harshal Sheth
6fe663bf6a
feat(ingest): basic support for complex hive types (#2804) 2021-06-30 22:57:13 -07:00
Kevin Hu
4da76726d3
feat(ingest): SageMaker feature store ingestion (#2758) 2021-06-29 19:43:31 -07:00
Remi
2aa95ec750
feat(ingest): Improve lookml sql derived tables detection, add cascading derived tables to lineage (#2770) 2021-06-29 19:41:34 -07:00
Harshal Sheth
c8fe8d4026
fix(ingest): quote table names in hive (#2801) 2021-06-29 17:51:01 -07:00
Harshal Sheth
19b2a42a00
feat: usage stats (part 2) (#2762)
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
2021-06-24 19:44:59 -07:00
Harshal Sheth
937f02c6bc
feat: usage stats (part 1) (#2750)
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
2021-06-24 17:11:00 -07:00
Harshal Sheth
5d93f249b4
feat(ingest): expose additional types to Python via codegen (#2712) 2021-06-17 10:04:28 -07:00
Harshal Sheth
7e9a04479b
test(ingest): simplify docker cleanup commands (#2699) 2021-06-16 16:59:28 -07:00
Harshal Sheth
1b539220d5
feat(ingest): support Oracle service names (#2676) 2021-06-11 17:27:34 -07:00
Harshal Sheth
1857f85242
fix(ingest): upgrade acryl-pyhive to use sasl3 instead of sasl (#2684) 2021-06-11 17:24:41 -07:00
Harshal Sheth
5eee818a61
fix(ingest): pin to new mypy version (#2670) 2021-06-11 09:44:54 -07:00
zack3241
91eb3cc57e
Add get_identifier to hive source in metadata ingestion (#2667) 2021-06-09 15:12:17 -07:00
Kevin Hu
ebdaa0e359
feat(ingest): Feast ingestion integration (#2605)
* Add feast testing setup

* Init Feast test script

* Add feast to dependencies

* Update feast descriptors

* Sort integrations

* Working feast pytest

* Clean up feast docker-compose file

* Expand Feast tests

* Setup feast classes

* Add continuous and bytes data to feature types

* Update field type mapping

* Add PDLs

* Add MLFeatureSetUrn.java

* Comment out feast setup

* Add snapshot file and update inits

* Init Feast golden files generation

* Clean up Feast ingest

* Feast testing comments

* Yield Feature snapshots

* Fix Feature URN naming

* Update feast MCE

* Update Feature URN prefix

* Add MLEntity

* Update golden files with entities

* Specify feast sources

* Add feast source configs

* Working feast docker ingestion

* List entities and features before adding tables

* Add featureset names

* Remove unused

* Rename feast image

* Update README

* Add env to feast URNs

* Fix URN naming

* Remove redundant URN names

* Fix enum backcompatibility

* Move feast testing to docker

* Move URN generators to mce_builder

* Add source for features

* Switch TypeClass -> enum_type

* Rename source -> sourceDataset

* Add local Feast ingest image builds

* Rename Entity -> MLPrimaryKey

* Restore features and keys for each featureset

* Do not json encode source configs

* Remove old source properties from feature sets

* Regenerate golden file

* Fix race condition with Feast tests

* Exclude unknown source

* Update feature datatype enum

* Update README and fix typos

* Fix Entity typo

* Fix path to local docker image

* Specify feast config and version

* Fix feast env variables

* PR fixes

* Refactor feast ingest constants

* Make feature sources optional for back-compatibility

* Remove unused GCP files

* adding docker publish workflow

* Simplify name+namespace in PrimaryKeys

* adding docker publish workflow

* debug

* final attempt

* final final attempt

* final final final commit

* Switch to published ingestion image

* Update name and namespace in java files

* Rename FeatureSet -> FeatureTable

* Regenerate codegen

* Fix initial generation errors

* Update snapshot jsons

* Regenerated schemas

* Fix URN formats

* Revise builds

* Clean up feast URN builders

* Fix naming typos

* Fix Feature Set -> Feature Table

* Fix comments

* PR fixes

* All you need is Urn

* Regenerate snapshots and update validation

* Add UNKNOWN data type

* URNs for source types

* Add note on docker requirement

* Fix typo

* Reorder aspect unions

* Refactor feast ingest functions

* Update snapshot jsons

* Rebuild

Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-06-09 15:07:04 -07:00
Kevin Hu
c0ace2ce59
fix(ingest): fix MyPy stubs (#2666) 2021-06-08 16:10:16 -07:00
Harshal Sheth
2123c8b6d7
fix(ingest): exclude mssql-odbc from "all" extra (#2660) 2021-06-07 14:00:35 -07:00
Harshal Sheth
31eae24300
fix(ingest): support mssql encryption via ODBC (#2657) 2021-06-04 18:19:11 -07:00
Harshal Sheth
a0ad590b3f
fix(ingest): improve redshift ingestion performance (#2635) 2021-06-03 11:14:34 -07:00
Harshal Sheth
acef397ece
fix(ingest): fail gracefully when lookml used on old python versions (#2614) 2021-05-26 17:16:17 -07:00
Kevin Hu
48d2b94203
fix(ingest): default values for env (#2598) 2021-05-24 14:09:55 -07:00
taufiqibrahim
db78373427
feat(ingest): kafka connect metadata ingestion (#2516) 2021-05-18 14:45:38 -07:00
Harshal Sheth
ebe7409897
fix(cli): prevent click from suppressing errors (#2560) 2021-05-17 11:50:38 -07:00
Harshal Sheth
3dfe3d375b
feat(ingest): add options for Airflow lineage backend (#2557) 2021-05-13 20:02:47 -07:00
Fredrik Sannholm
133577557c
feat(ingest): Looker view and dashboard ingestion (#2493) 2021-05-13 11:42:53 -07:00
Harshal Sheth
a671001824
refactor(ingest): move Airflow into datahub_provider module (#2521) 2021-05-12 15:01:11 -07:00
Albert Franzi
7fce505ffb
feat(ingest): define Redshift as a Postgres Source (#2540) 2021-05-12 10:00:34 -07:00
Harshal Sheth
cd588baccb
build(ingest): include package data in sdist (#2513) 2021-05-07 15:21:43 -07:00
Harshal Sheth
1facfbd5a3
feat(ingest): capture table properties if available (#2497) 2021-05-05 14:07:08 -07:00
Harshal Sheth
c32bf494d5
fix(ingest): support https connections with cookies in Hive ingestion (#2489)
Tested locally.
2021-05-04 13:10:52 -07:00
Harshal Sheth
6f1f0a4845
feat(ingest): support hive over http (#2486) 2021-05-03 22:11:50 -07:00
Harshal Sheth
d415234a8c
fix(ingest): fields with defaults should be optional (#2461) 2021-04-26 16:45:48 -07:00
Harshal Sheth
2da5e1fd10
feat(ingest): setup scaffolding for tox testing (#2451) 2021-04-26 16:44:36 -07:00
Harshal Sheth
034c33a050
fix(ingest): use entrypoints lib instead of pkg_resources (#2438) 2021-04-22 00:13:47 -07:00
Gabe Lyons
c7b49de67b
feat(ingest): adding superset ingestion source (#2425) 2021-04-22 00:11:54 -07:00
Harshal Sheth
ffe49f061a
fix(ingest): fix chart type enum serialization and add tests for rest emitter (#2429) 2021-04-21 11:34:24 -07:00
Harshal Sheth
79daec29b7
fix(ingest): ensure upstreams in airflow lineage emission are entities (#2427) 2021-04-20 20:44:38 -07:00
Harshal Sheth
9ac17c4ee0
fix(ingest): bump avro-gen3 (#2403)
Closes #2375.
2021-04-16 11:59:05 -07:00