154 Commits

Author SHA1 Message Date
Aseem Bansal
f099aeb550
fix(m1): tweak m1 preflight (#4771) 2022-04-29 10:29:47 +02:00
John Joyce
325c9b0f08
Adding new models to python codegen (#4726) 2022-04-22 10:12:58 -07:00
John Joyce
c69310522b
feat(metadata service): Introducing Platform Events (#4477) 2022-03-29 18:32:04 -07:00
Kevin Neville
d8e6f890a9
fix: Replace old repository link with new link (#4446) 2022-03-18 14:12:19 -07:00
Swaroop Jagadish
eaf7b02b2a
docs(model): auto-generated docs and hand-written docs for the metadata model (#4189) 2022-02-18 09:45:45 -08:00
Harshal Sheth
75d2ec2a39
ci(ingestion): fix airflow 1 deps for tox (#4083) 2022-02-17 00:33:28 -08:00
Swaroop Jagadish
d1a14abb53
fix(docs): fixing metadata model doc generation script and updating png (#4120) 2022-02-10 22:56:28 -08:00
Aseem Bansal
d3b7cece7a
fix(build): m1 - harden pre-flight script for M1 (#3958) 2022-01-27 23:01:18 -08:00
John Joyce
271784c9c1
feat(ui): UI-based ingestion (as featured in Dec Townhall) (#3975) 2022-01-27 10:33:12 -08:00
Swaroop Jagadish
7d986ec880
fix(ingest): populate system metadata for all metadata events (mcp, mcpw) (#3900) 2022-01-16 12:03:38 -08:00
Gabe Lyons
b7a732d96b
fix(build): correcting m1 preflight check (#3677) 2021-12-06 21:49:08 -08:00
Swaroop Jagadish
a16c432a1b
feat(metadata-model): adding metadata model doc generation and upload… (#3667) 2021-12-05 12:22:17 -08:00
Tamas Nemeth
f043b79f4c
feat(build): Preflight script for metadata ingestion on m1 (#3652) 2021-12-01 11:16:32 -08:00
Swaroop Jagadish
806b28e697
fix(ci): SKIP_RELEASE_UPLOAD flag was not being respected by python release script (#3588) 2021-11-16 15:11:25 -08:00
Swaroop Jagadish
9fdcbdabb0
feat(ci): adding support for env variables in python release script (#3587) 2021-11-16 12:15:52 -08:00
Harshal Sheth
6e5d1fe42f
fix(ingest): switch to avro from deprecated avro-python3 (#3412) 2021-10-18 15:35:27 -07:00
Swaroop Jagadish
9dd7303bad
feat(build): adding support for python codegen for all aspects, not just the snapshot ones (#3299) 2021-09-26 17:22:58 -07:00
rslanka
c418bc845c
feat(Analytics): Support for Timeseries Aggregated Statistics (#3207)
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
Co-authored-by: Dexter Lee <dexter@acryl.io>
2021-09-14 18:35:10 -07:00
John Joyce
352a0abf8d
Introducing TimeSeries Aspects + Dataset Profile (Stats) Aspect (#2983)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-07-30 17:41:03 -07:00
Harshal Sheth
6135eed757
test(ingestion): fix flaky package discovery test (#2949) 2021-07-25 21:15:22 -07:00
Harshal Sheth
f01bb8bcd0
build(ingestion): add version prompt to release script (#2866) 2021-07-13 15:02:14 -07:00
Harshal Sheth
d66381451a
feat(ingest): refactor mce comparison and add pytest update golden files option (#2812) 2021-06-30 16:53:20 -07:00
Kevin Hu
4da76726d3
feat(ingest): SageMaker feature store ingestion (#2758) 2021-06-29 19:43:31 -07:00
Harshal Sheth
937f02c6bc
feat: usage stats (part 1) (#2750)
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
2021-06-24 17:11:00 -07:00
Harshal Sheth
82468016ae
fix(docker): use head tag for datahub-ingestion (#2760) 2021-06-24 13:16:49 -07:00
Kevin Hu
22a2ed81e4
feat(ingest): ingest last-modified from dbt sources.json (#2729) 2021-06-23 13:56:20 -07:00
Kevin Hu
a89094da5b
feat(ingest): add support for Glue ETL jobs (#2687) 2021-06-22 11:33:22 -07:00
Kevin Hu
554e1637c5
fix(ingest): types for dbt (#2716) 2021-06-22 10:37:08 -07:00
Harshal Sheth
5d93f249b4
feat(ingest): expose additional types to Python via codegen (#2712) 2021-06-17 10:04:28 -07:00
Harshal Sheth
26dcece8ec
fix(ingest): use looker data platform (#2708) 2021-06-16 16:58:13 -07:00
Kevin Hu
24268c2021
feat(ingest): headers for codegen Python scripts (#2637) 2021-06-11 09:44:18 -07:00
Kevin Hu
ebdaa0e359
feat(ingest): Feast ingestion integration (#2605)
* Add feast testing setup

* Init Feast test script

* Add feast to dependencies

* Update feast descriptors

* Sort integrations

* Working feast pytest

* Clean up feast docker-compose file

* Expand Feast tests

* Setup feast classes

* Add continuous and bytes data to feature types

* Update field type mapping

* Add PDLs

* Add MLFeatureSetUrn.java

* Comment out feast setup

* Add snapshot file and update inits

* Init Feast golden files generation

* Clean up Feast ingest

* Feast testing comments

* Yield Feature snapshots

* Fix Feature URN naming

* Update feast MCE

* Update Feature URN prefix

* Add MLEntity

* Update golden files with entities

* Specify feast sources

* Add feast source configs

* Working feast docker ingestion

* List entities and features before adding tables

* Add featureset names

* Remove unused

* Rename feast image

* Update README

* Add env to feast URNs

* Fix URN naming

* Remove redundant URN names

* Fix enum backcompatibility

* Move feast testing to docker

* Move URN generators to mce_builder

* Add source for features

* Switch TypeClass -> enum_type

* Rename source -> sourceDataset

* Add local Feast ingest image builds

* Rename Entity -> MLPrimaryKey

* Restore features and keys for each featureset

* Do not json encode source configs

* Remove old source properties from feature sets

* Regenerate golden file

* Fix race condition with Feast tests

* Exclude unknown source

* Update feature datatype enum

* Update README and fix typos

* Fix Entity typo

* Fix path to local docker image

* Specify feast config and version

* Fix feast env variables

* PR fixes

* Refactor feast ingest constants

* Make feature sources optional for back-compatibility

* Remove unused GCP files

* adding docker publish workflow

* Simplify name+namespace in PrimaryKeys

* adding docker publish workflow

* debug

* final attempt

* final final attempt

* final final final commit

* Switch to published ingestion image

* Update name and namespace in java files

* Rename FeatureSet -> FeatureTable

* Regenerate codegen

* Fix initial generation errors

* Update snapshot jsons

* Regenerated schemas

* Fix URN formats

* Revise builds

* Clean up feast URN builders

* Fix naming typos

* Fix Feature Set -> Feature Table

* Fix comments

* PR fixes

* All you need is Urn

* Regenerate snapshots and update validation

* Add UNKNOWN data type

* URNs for source types

* Add note on docker requirement

* Fix typo

* Reorder aspect unions

* Refactor feast ingest functions

* Update snapshot jsons

* Rebuild

Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-06-09 15:07:04 -07:00
Harshal Sheth
a47400f18e
build(ingest): use gradle in commands + docs (#2531) 2021-05-11 19:03:20 -07:00
Harshal Sheth
cd588baccb
build(ingest): include package data in sdist (#2513) 2021-05-07 15:21:43 -07:00
Harshal Sheth
d0ca3191c9
build(ingest): add metadata-ingestion to gradle build (#2510) 2021-05-06 22:10:49 -07:00
Harshal Sheth
6f1f0a4845
feat(ingest): support hive over http (#2486) 2021-05-03 22:11:50 -07:00
Harshal Sheth
9f4de4b20a
fix(ingest): remove datahub.metadata import shortcut (#2449) 2021-04-30 21:10:12 -07:00
Gabe Lyons
851e00ba9f
feat(lineage): implement support for datasets, charts and dashboards downstream lineage fetching in a generic way (#2397)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Brian <brianwebtek@gmail.com>
Co-authored-by: John Joyce <john@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2021-04-23 00:18:39 -07:00
Harshal Sheth
2af4603e49
fix(ingest): enable mypy disallow_incomplete_defs and disallow_untyped_decorators (#2393) 2021-04-14 13:40:24 -07:00
Harshal Sheth
41cd52f9e2
feat(ingest): add Airflow lineage backend (#2368) 2021-04-12 17:40:15 -07:00
Harshal Sheth
bd78b84bd3
feat(ingest): start airflow integration + metadata builders (#2331) 2021-04-05 19:11:28 -07:00
Harshal Sheth
a921d0deae
feat(ingest): MongoDB ingestion source (#2289) 2021-03-23 20:15:44 -07:00
Harshal Sheth
b8462028c3
feat(ingest): various minor fixes (#2246) 2021-03-17 23:05:05 -07:00
Harshal Sheth
aa6bc15cd7
fix(ingest): various avro codegen fixes (#2232) 2021-03-15 15:27:30 -07:00
Harshal Sheth
6a8fca59f1
feat(ingest): use plugin system based on Python extras (#2224) 2021-03-11 13:41:05 -08:00
Harshal Sheth
11532a1cc3
build(docker): add large generated directories to dockerignore (#2151) 2021-03-02 10:48:55 -08:00
Harshal Sheth
d2745ee9cd
ci(ingest): run apt update (#2135) 2021-02-23 20:51:23 -08:00
Harshal Sheth
76e0594b8b
feat(ingest): add support for LDAP ingestion (#2122) 2021-02-18 20:05:39 -08:00
Harshal Sheth
38f75be8ad
gometa -> datahub 2021-02-15 18:29:27 -08:00
Harshal Sheth
d0bc3c55db
Setup CI 2021-02-15 18:29:27 -08:00