70 Commits

Author SHA1 Message Date
John Joyce
8c9c696cdd
feat(ingest): Adding an Okta Integration to extract Users, Groups, Group Membership (#3043) 2021-08-11 18:49:16 -07:00
rslanka
8844240328
feat: Adding support for nested schemas in ingestion and visualization (#3079) 2021-08-11 15:47:18 -07:00
Gabe Lyons
a06c4caf4b
chore: upgrading gma to 0.2.80 (#3070) 2021-08-10 13:29:11 -07:00
John Joyce
352a0abf8d
Introducing TimeSeries Aspects + Dataset Profile (Stats) Aspect (#2983)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Ravindra Lanka <rlanka@acryl.io>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-07-30 17:41:03 -07:00
Dexter Lee
15c0c4dfb3
fix(browse): Fix browse pagination and multi-browse path issue (#2984) 2021-07-30 10:48:32 -07:00
Gabe Lyons
aa253f5b3b
feat(deletes): add run commands (list, show, rollback) to datahub ingest (#2960) 2021-07-29 20:04:40 -07:00
Kevin Hu
43e61d9628
feat(models): remove versions from metrics and hyperparams (#2938) 2021-07-22 22:18:45 -07:00
Kevin Hu
6dbc59940a
feat(ingest): refactor mlModel grouping and add browsepaths (#2929) 2021-07-22 13:33:15 -07:00
Kevin Hu
736249f0c7
feat(ingest): extract SageMaker metrics, hyperparameters, and external URLs (#2910) 2021-07-21 21:30:07 -07:00
Kevin Hu
6abd5e191a
feat(ingest): lineage for SageMaker model endpoints and groups (#2894) 2021-07-19 11:30:43 -07:00
Lal Rishav
ae014bf2e1
fix(glossary): default browse path for glossary term (#2788) 2021-07-16 09:15:24 -07:00
Kevin Hu
44ed2f3684
feat(ingest): extract lineage between SageMaker jobs and models (#2868) 2021-07-15 18:56:13 -07:00
Kevin Hu
a2106ca9e8
feat(ingest): SageMaker jobs and models (#2830) 2021-07-08 16:16:16 -07:00
Harshal Sheth
2f921d15e8
fix(ingest): avoid setting timestamps unless source system provides it (#2843) 2021-07-08 12:11:06 -07:00
Dexter Lee
8f0f322279
feat(backup): Add restore indices and restore backup tasks (#2779) 2021-06-30 16:49:02 -07:00
Kevin Hu
4da76726d3
feat(ingest): SageMaker feature store ingestion (#2758) 2021-06-29 19:43:31 -07:00
Harshal Sheth
19b2a42a00
feat: usage stats (part 2) (#2762)
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
2021-06-24 19:44:59 -07:00
Harshal Sheth
937f02c6bc
feat: usage stats (part 1) (#2750)
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
2021-06-24 17:11:00 -07:00
Kevin Hu
a89094da5b
feat(ingest): add support for Glue ETL jobs (#2687) 2021-06-22 11:33:22 -07:00
Gabe Lyons
0750332714
fix(editable descriptions): adding indexing for editable descriptions (#2710) 2021-06-17 10:55:26 -07:00
Brian
a5f9b8dfe9
feat(entities): add markdown description update/viewer feature in dataset, datajob, dataflow, chart and dashboard, update ui/ux (#2707) 2021-06-16 15:48:27 -07:00
Gabe Lyons
523c3bf1d4
feat(aspects): support fetching of versioned aspects (#2677) 2021-06-16 10:03:21 -07:00
Kevin Hu
ebdaa0e359
feat(ingest): Feast ingestion integration (#2605)
* Add feast testing setup

* Init Feast test script

* Add feast to dependencies

* Update feast descriptors

* Sort integrations

* Working feast pytest

* Clean up feast docker-compose file

* Expand Feast tests

* Setup feast classes

* Add continuous and bytes data to feature types

* Update field type mapping

* Add PDLs

* Add MLFeatureSetUrn.java

* Comment out feast setup

* Add snapshot file and update inits

* Init Feast golden files generation

* Clean up Feast ingest

* Feast testing comments

* Yield Feature snapshots

* Fix Feature URN naming

* Update feast MCE

* Update Feature URN prefix

* Add MLEntity

* Update golden files with entities

* Specify feast sources

* Add feast source configs

* Working feast docker ingestion

* List entities and features before adding tables

* Add featureset names

* Remove unused

* Rename feast image

* Update README

* Add env to feast URNs

* Fix URN naming

* Remove redundant URN names

* Fix enum backcompatibility

* Move feast testing to docker

* Move URN generators to mce_builder

* Add source for features

* Switch TypeClass -> enum_type

* Rename source -> sourceDataset

* Add local Feast ingest image builds

* Rename Entity -> MLPrimaryKey

* Restore features and keys for each featureset

* Do not json encode source configs

* Remove old source properties from feature sets

* Regenerate golden file

* Fix race condition with Feast tests

* Exclude unknown source

* Update feature datatype enum

* Update README and fix typos

* Fix Entity typo

* Fix path to local docker image

* Specify feast config and version

* Fix feast env variables

* PR fixes

* Refactor feast ingest constants

* Make feature sources optional for back-compatibility

* Remove unused GCP files

* adding docker publish workflow

* Simplify name+namespace in PrimaryKeys

* adding docker publish workflow

* debug

* final attempt

* final final attempt

* final final final commit

* Switch to published ingestion image

* Update name and namespace in java files

* Rename FeatureSet -> FeatureTable

* Regenerate codegen

* Fix initial generation errors

* Update snapshot jsons

* Regenerated schemas

* Fix URN formats

* Revise builds

* Clean up feast URN builders

* Fix naming typos

* Fix Feature Set -> Feature Table

* Fix comments

* PR fixes

* All you need is Urn

* Regenerate snapshots and update validation

* Add UNKNOWN data type

* URNs for source types

* Add note on docker requirement

* Fix typo

* Reorder aspect unions

* Refactor feast ingest functions

* Update snapshot jsons

* Rebuild

Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-06-09 15:07:04 -07:00
Dexter Lee
9c030d8325
fix(NoCode): Update snapshot json to latest (#2655) 2021-06-07 11:52:52 -07:00
John Joyce
97e9660037
feat: No Code Metadata Modeling (#2629)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2021-06-03 13:24:33 -07:00
Brian
aa8ba1b8e9
feat(dataflow): update dataflow to have datajobs in new tab (#2579) 2021-05-18 21:25:42 -07:00
shubham garg
95782b1acf
feat(graphql): add graphql types for business glossary (#2485)
Co-authored-by: shubham.garg <shubham.garg@thoughtworks.com>
2021-05-11 17:29:00 -07:00
shakti-garg
8ed14a62e2
feat(business_glossary): add new entity business term and its relationship with dataset and its fields (#2228)
Co-authored-by: shubham.garg <shubham.garg@thoughtworks.com>
2021-05-10 13:20:23 -07:00
Gabe Lyons
851e00ba9f
feat(lineage): implement support for datasets, charts and dashboards downstream lineage fetching in a generic way (#2397)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Brian <brianwebtek@gmail.com>
Co-authored-by: John Joyce <john@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2021-04-23 00:18:39 -07:00
Gabe Lyons
c7b49de67b
feat(ingest): adding superset ingestion source (#2425) 2021-04-22 00:11:54 -07:00
Dexter Lee
259e6af494
feat(search-by-field): Add the ability to search for field names (#2286) 2021-03-23 15:18:32 -07:00
Gabe Lyons
039fe597f7
feat(tags): editing tags from react client on datasets, schemas, charts & dashboards (#2248) 2021-03-18 11:52:14 -07:00
Arun Vasudevan
7750c6120a
feat: MLmodel Graphql Query (#2166) 2021-03-13 08:34:48 -08:00
Fredrik Sannholm
da6b3d111d
feat(datajob): Backend implementation (#2197) 2021-03-13 08:00:44 -08:00
Gabe Lyons
11e0cd66d4
feat(tag): adding search for tags in gms layer (#2203) 2021-03-10 00:02:58 -08:00
Harshal Sheth
d220647094
feat: add date and time types to SQL model (#2201) 2021-03-09 23:07:20 -08:00
Gabe Lyons
adfe60e97a
feat(tags): adding support for read/write of tags in gms & read-only in react datahub-frontend. (#2164) 2021-03-07 11:26:47 -08:00
Dexter Lee
cda1ce4589
feat(dashboards): Add browse end point for charts and dashboards (#2143)
Co-authored-by: Dexter Lee <dexter@acryl.io>
2021-02-28 10:53:02 -08:00
John Joyce
4dcea8c1d3
feat(gms): Add optional data platform display name (#2148)
Co-authored-by: John Joyce <john@acryl.io>
2021-02-26 21:22:18 -08:00
John Joyce
4f8d8b31ac
feat: Introducing optional DataPlatform logo url (#2127)
Co-authored-by: John Joyce <john@acryl.io>
2021-02-19 19:51:49 -08:00
John Joyce
12ff330a54
feat(GraphQL API): GQL implementation of Charts + Dashboards (#2117)
Co-authored-by: John Joyce <john@acryl.io>
2021-02-17 23:36:17 -08:00
RyanHolstien
ea86ade29b
feat: ML Model Backend Implementation (#1896)
Co-authored-by: RyanHolstien <rholstien@expediagroup.com>
2021-02-17 13:28:13 -08:00
Nagarjuna Kanamarlapudi
f9d33f5519
(refactor): Convert dataPlatforms to GMA aspect models and associated resource to GMA resource. (#2057)
* (refactor): Convert dataPlatforms to GMA aspect and associated resource to GMA resource.

BREAKING CHANGE: /datasets/dataPlatforms API is now changed to become GMA resource.

* Change documentation style
2021-01-20 15:50:48 -08:00
Kerem Sahin
4d8320e4a0
feat(dashboard): Dashboards backend implementation (#1884) 2020-11-23 09:25:58 -08:00
Kerem Sahin
733893f5f9
feat(dashboard): Dashboard models update (#1932)
* feat(dashboard): Dashboard models update

* Keep chartId/dashboardId fields in the URN definitions and add fields for chartURL/dashboardURL into info aspects

* Rebase and address some comments
2020-11-12 11:17:22 -08:00
Jyoti Wadhwani
70ddb09d29
feat: enable SCSI for datasets (#1986)
* enable SCSI for datasets

* Update scsi-onboarding-guide.md
2020-11-11 13:04:20 -08:00
Nagarjuna Kanamarlapudi
7d574d1094
feat(field-level-lineage): Add models for field level lineage (#1936)
* feat(field-level-lineage): adding models for field level lineage

adding models for field level lineage. Introduce DatasetFieldUrn as a unique identifier for dataset field
2020-11-09 14:08:48 -08:00
John Plaisted
25b663cc18
refactor: move code to linkedin/datahub-gma. (#1955)
Move code to linkedin/datahub-gma.

"GMA" (Generalized Metadata Architecture) is the backend of DataHub, and has been moved to its own repository.

This deletes the code that was moved and uses jars that GMA publishes to bintray to load it.

Note that not all of GMA was moved, but most of it. We may still move more things to the other repository in the future.
2020-10-23 15:14:57 -07:00
Jyoti Wadhwani
4bfcb4b508
add aspects to VALUE model of datasets (#1940) 2020-10-22 21:29:28 -07:00
John Plaisted
8223cdcbdb
Fix build after merge:
- Add commonsLang to build file.
- Add emails field to CorpUserInfoDocument (either this should be synced or the index builder not synced in the future).
- Fix EbeanLocalDAOTest which used internal Urn API.
- Fix BaseSearchableEntityResource "backfill" override return types (and regenerate snapshots).
- EbeanLocalDAO's constructor changed; now requires URN class.
- Add restli resource module as dependency of :gms:api as it now contains a needed PDL model.
2020-09-11 09:15:56 -07:00