72 Commits

Author SHA1 Message Date
Dexter Lee
b4457afe30
feat(search): Add search for field level description and tags (#2491) 2021-05-05 14:04:02 -07:00
Dexter Lee
55712f5918
feat(search): Support search terms that are dataset platform names (#2442) 2021-04-26 16:16:54 -07:00
Gabe Lyons
851e00ba9f
feat(lineage): implement support for datasets, charts and dashboards downstream lineage fetching in a generic way (#2397)
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Brian <brianwebtek@gmail.com>
Co-authored-by: John Joyce <john@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2021-04-23 00:18:39 -07:00
Fredrik Sannholm
e88a671959
Fix(search): fix datajob and dataflow search mappings (#2418) 2021-04-21 12:04:20 -07:00
Harshal Sheth
2fee45d127
feat: add s3 data platform and logo (#2424) 2021-04-20 16:40:51 -07:00
Gabe Lyons
a9f4f3797a
fix(tags): check description existence on tags (#2399) 2021-04-14 13:20:51 -07:00
Dexter Lee
1fc532d831
feat(index): Add index naming convention for elasticsearch (#2386) 2021-04-13 07:56:31 -07:00
Dexter Lee
1c6f1d7a86
fix(elasticsearch): Fix inconsistencies between documents and elasticsearch mappings (#2356) 2021-04-07 14:59:43 -07:00
shakti-garg
3fb71acf71
feat(es-setup): add logic in elasticsearch setup to compare-and-update index if already exists (#2312)
* 2310 | add logic in es-setup script to compare-and-update index if already exists
2021-04-03 11:17:16 -07:00
Dexter Lee
259e6af494
feat(search-by-field): Add the ability to search for field names (#2286) 2021-03-23 15:18:32 -07:00
Pedro Silva
f0d093f4ab
feat(react): add druid logo (#2264) 2021-03-19 08:29:52 -07:00
John Plaisted
5e91014e00
feat(search) BREAKING Support ElasticSearch 7, drop ES5 (#2263)
Merges in changes from our ES7 branch, and drops support for ES5.

This is a breaking change due to the upgrade; we have an ES5 branch at the commit before this.
2021-03-18 19:16:44 -07:00
Gabe Lyons
7106753a4d
feat(tags): improving elastic search templates for tags (#2254) 2021-03-18 15:04:24 -07:00
Gabe Lyons
039fe597f7
feat(tags): editing tags from react client on datasets, schemas, charts & dashboards (#2248) 2021-03-18 11:52:14 -07:00
John Joyce
c819289b53
feat(react): Adding big query logo (#2245)
Co-authored-by: John Joyce <john@acryl.io>
2021-03-17 20:34:37 -07:00
Dexter Lee
c362cc4388
feat(tags): Enable search for datasets by tags (#2240) 2021-03-15 22:37:06 -07:00
Arun Vasudevan
7750c6120a
feat: MLmodel Graphql Query (#2166) 2021-03-13 08:34:48 -08:00
Fredrik Sannholm
da6b3d111d
feat(datajob): Backend implementation (#2197) 2021-03-13 08:00:44 -08:00
Gabe Lyons
11e0cd66d4
feat(tag): adding search for tags in gms layer (#2203) 2021-03-10 00:02:58 -08:00
Gabe Lyons
adfe60e97a
feat(tags): adding support for read/write of tags in gms & read-only in react datahub-frontend. (#2164) 2021-03-07 11:26:47 -08:00
John Joyce
e575add1fb
feat(DataPlatform Logos): Adding server driven logos (#2165)
Co-authored-by: John Joyce <john@acryl.io>
2021-03-04 23:16:13 -08:00
Harshal Sheth
e066991f54
fix(ingest): bigquery source and dataset naming fixes (#2161) 2021-03-03 19:49:46 -08:00
Rickard Cardell
3252e92a16
feat: neo4j Bolt TLS support (#2100) (#2145)
* Bumping version of neo4j-java-driver to include encryption support that came in 4.0.1.
2021-03-01 12:55:46 -08:00
Dexter Lee
cda1ce4589
feat(dashboards): Add browse end point for charts and dashboards (#2143)
Co-authored-by: Dexter Lee <dexter@acryl.io>
2021-02-28 10:53:02 -08:00
John Joyce
4dcea8c1d3
feat(gms): Add optional data platform display name (#2148)
Co-authored-by: John Joyce <john@acryl.io>
2021-02-26 21:22:18 -08:00
John Joyce
8e03441ffe
fix(gms): fix getAllDataPlatforms bug (#2126) 2021-02-19 14:30:55 -08:00
John Joyce
12ff330a54
feat(GraphQL API): GQL implementation of Charts + Dashboards (#2117)
Co-authored-by: John Joyce <john@acryl.io>
2021-02-17 23:36:17 -08:00
RyanHolstien
ea86ade29b
feat: ML Model Backend Implementation (#1896)
Co-authored-by: RyanHolstien <rholstien@expediagroup.com>
2021-02-17 13:28:13 -08:00
John Plaisted
d89fa4a27b
feat: update GMA to 0.2.35 (#2067)
Required editing the search tests for the new search config tests.
2021-01-28 15:55:51 -08:00
Nagarjuna Kanamarlapudi
f9d33f5519
(refactor): Convert dataPlatforms to GMA aspect models and associated resource to GMA resource. (#2057)
* (refactor): Convert dataPlatforms to GMA aspect and associated resource to GMA resource.

BREAKING CHANGE: /datasets/dataPlatforms API is now changed to become GMA resource.

* Change documentation style
2021-01-20 15:50:48 -08:00
Nagarjuna Kanamarlapudi
0fef73bb57
fix(search): Fix the unintentional rollback (#2028) of dataset index to search by field paths. (#2040)
Enables the auto complete of field paths on DataHub UI
2020-12-14 17:40:24 -08:00
John Plaisted
838f964114
feat: add elasticsearch sanity integration tests (#2028)
These tests verify that, given an index's settings and mappings, data can be written to the index and read from it with a query_all query. These are very simple sanity tests.

We can, and should, write more complex tests that are specific to each index in the future.
2020-12-02 20:49:34 -08:00
Kerem Sahin
4d8320e4a0
feat(dashboard): Dashboards backend implementation (#1884) 2020-11-23 09:25:58 -08:00
John Plaisted
60e43061d8
[Breaking] Update to GMA 0.2.0 and fix Urn definitions. (#1977)
Urn definitions needed to be updated since 0.2.0 changed the base Urn class. 

I also added some more urn coercers that were missing.
2020-11-11 16:06:29 -08:00
Jyoti Wadhwani
70ddb09d29
feat: enable SCSI for datasets (#1986)
* enable SCSI for datasets

* Update scsi-onboarding-guide.md
2020-11-11 13:04:20 -08:00
Mars Lan
ab03a4ee9e
refactor(gms): use BaseLocalDAO as the interface in factories & rest.li resources (#1979)
* refactor(gms): use BaseLocalDAO as the interface in factories & rest.li resources

Fixes #1974

* Revert yarn.lock change

Co-authored-by: Mars Lan <mars@trayminder.com>
2020-10-31 09:27:01 -07:00
Kerem Sahin
1ec9f66b66
Bump to datahub-gma 0.1.0 (#1931) 2020-10-26 16:18:21 -07:00
John Plaisted
25b663cc18
refactor: move code to linkedin/datahub-gma. (#1955)
Move code to linkedin/datahub-gma.

"GMA" (Generalized Metadata Architecture) is the backend of DataHub, and has been moved to its own repository.

This deletes the code that was moved and uses jars that GMA publishes to bintray to load it.

Note that not all of GMA was moved, but most of it. We may still move more things to the other repository in the future.
2020-10-23 15:14:57 -07:00
Jyoti Wadhwani
4bfcb4b508
add aspects to VALUE model of datasets (#1940) 2020-10-22 21:29:28 -07:00
Mars Lan
2bdb52b104
Update Datasets.java 2020-10-09 04:34:34 -07:00
John Plaisted
8223cdcbdb
Fix build after merge:
- Add commonsLang to build file.
- Add emails field to CorpUserInfoDocument (either this should be synced or the index builder not synced in the future).
- Fix EbeanLocalDAOTest which used internal Urn API.
- Fix BaseSearchableEntityResource "backfill" override return types (and regenerate snapshots).
- EbeanLocalDAO's constructor changed; now requires URN class.
- Add restli resource module as dependency of :gms:api as it now contains a needed PDL model.
2020-09-11 09:15:56 -07:00
Mars Lan
f0485a490e
feat(platform): add "postgres" as a supported data platform (#1859)
* feat(platform): add "postgres" as a supported data platform

* update tests
2020-09-08 10:21:08 -07:00
Kerem Sahin
57f81d488d
feat(data-platforms): adding rest resource for /dataPlatforms and mid-tier support (#1817)
* feat(data-platforms): Adding rest resource for /dataPlatforms and mid-tier support

* Removed data platforms which are Linkedin internal
2020-08-20 12:55:30 -07:00
John Plaisted
b673c8e160
fix: update defaults of aspectNames params (#1815)
fix: Update defaults of aspectNames params.

The last PR to sync internal code broke the external GMS, as the code now expected aspectNames to be null rather than empty by default. This prevented me from logging into DataHub, as the corp user request would fail (it assumed I asked for no aspects rather than all aspects).

TESTED: Built locally, launched with docker/dev.sh (so used latest frontend, but whatever). Verified I can now log into DataHub, browse and search for datasets, and view my profile.
2020-08-19 18:42:56 -07:00
John Plaisted
d9b86d1f05
Update metadata-models to head! (#1811)
metadata-models 80.0.0 -> 90.0.13:

   90.0.13: Roll forward: Fix the open source build by avoiding URN method that isn't part of the open source URN.
    90.0.2: Refactor listUrnsFromIndex method
    90.0.0: Start distinguishing between [] aspects vs null aspects input param
    89.0.4: Fix the open source build by avoiding URN method that isn't part of the open source URN.
    89.0.2: fix some test case name
    89.0.0: META-12686: Made the MXE_v5 topics become strictly ACL'ed to avoid the wildcard write ACL as "MetadataXEvent.+"
    88.0.6: change DAO to take Storage Config as input
    88.0.3: Add a comment on lack of avro generation for MXEv5 + add MXEv5 to the pegasus validation task.
   87.0.15: META-12651: Integrate the metadata-models-ext with metadata-models
   87.0.13: add StorageConfig to Local DAO
    87.0.3: Treat empty aspect vs optional aspect same until all clients are migrated
    87.0.2: Treat empty aspect vs optional aspect differently
    87.0.1: META-12533: Skip processing unregistered aspect specific MAE.
    83.0.6: action method to return list of urns from strong consistent index
    83.0.4: Change input param type for batch backfill
    83.0.3: Implement batch backfill
    83.0.1: Implement support for OR filter in browse query
   82.0.10: Throw UnsupportedOperationException for unsupported condition types in search filter
    82.0.6: Implement local secondary backfilling index as part of backfill method
    82.0.5: [strongly consistent index] implement getUrns method
    82.0.4: Add indexing urn fields to the local secondary index
    82.0.0: Render Delta fields in the MCE_v5.
    81.0.1: Add pegasus to avro conversion for FMCE
    80.0.4: add get all support for BaseSingleAspectEntitySimpleKeyResource
    80.0.2: Add a BaseSearchWriterDAO with an ESBulkWriterDAO implementation.
    80.0.1: META-12254: Produce aspect specific MAE with always emit option
    80.0.0: Convert getNodesInTraversedPath to getSubgraph to return complete view of the subgraph (nodes+edges)
2020-08-19 16:06:29 -07:00
Liangjun Jiang
5d078aa617
Implemented data process search feature (#1706)
* implement search feature

* add test for dataprocessIndexBuilder; refactor code based on feedback

* update based on PR feedback

* Update DataProcessDocument.pdl

fixed typo wording.

* add not null check for data process info
2020-06-29 10:20:22 -07:00
Kerem Sahin
2e2fb2b810
Add missing updates from recent internal push (#1700) 2020-06-12 12:55:50 -07:00
Jyoti Wadhwani
ad6f1653e1
metadata-models 62.0.3 -> 72.0.8 (#1693) 2020-06-11 10:21:51 -07:00
Liangjun Jiang
92c4a3689e
Data process entity (#1680)
* add job info as aspect of a dataset

* add job urn def., aspect and entity

* job entity with upstream and downstream lineage

* use job urn in upstream & downstream

* add Job entity rest APIs

* rest.li api, impl and factory for job entity

* code cleanup

* use pdl; onboard data process entity

* add es index json

* fix gradlew build ignored tasks

* add a comment about data process info field

* fix style warning issues

* update content based on PR

* checked in generated snapshot json

* updated based on PR feedback

* update data process data format

* updated based on code review feedback

* revert back gms & mce-job docker image

* delete temp files

* update based on PR feedback

* file name and a typo

* format with linkedin style

Co-authored-by: Liangjun <liajiang@expediagroup.com>
2020-06-09 15:42:08 -07:00
Mars Lan
f932437742
build: start enforcing checkstyle and fix all violations (#1670) 2020-05-11 08:41:02 -07:00