118 Commits

Author SHA1 Message Date
Thomas Larsson
7043138797
feat(gms): add elasticsearch SSL support (#2189)
Co-authored-by: thomas.larsson <thomas.larsson@klarna.com>
2021-03-08 10:38:04 -08:00
Harshal Sheth
20bf794ec4
docs: hosted documentation website (#2174) 2021-03-05 00:12:12 -08:00
Harshal Sheth
410b823be9
build(ingest): use multi-stage docker build for datahub-ingestion (#2159) 2021-03-02 14:51:59 -08:00
Harshal Sheth
dced25fef7
feat(ingest): switch quickstart to Python ingestion (#2158) 2021-03-02 11:48:26 -08:00
Harshal Sheth
9e73794022
ci(ingest): setup docker container for metadata ingestion (#2150) 2021-03-01 17:36:38 -08:00
Rickard Cardell
4db9914432
feat: neo4j https support (#2101) (#2144)
* feat: neo4j https support (#2101)

    Ability to specify http as well as https URI schemes in the 'NEO4J_HOST' variable.
2021-02-26 21:28:22 -08:00
Thomas Larsson
8fe9520ddc
feat(datahub-dao): enable services to access gms over https (#2133)
Co-authored-by: thomas.larsson <thomas.larsson@klarna.com>
2021-02-25 14:26:23 -08:00
Dexter Lee
e7c3fd867b
refactor(docker-dev): set up elasticsearch using local mapping on docker-compose.dev (#2137)
Co-authored-by: Dexter Lee <dexter@acryl.io>
2021-02-24 16:00:16 -08:00
John Joyce
1aede64465
bug(docker react): Fix react docker image build (#2118)
Co-authored-by: John Joyce <john@acryl.io>
2021-02-17 23:11:32 -08:00
RyanHolstien
ea86ade29b
feat: ML Model Backend Implementation (#1896)
Co-authored-by: RyanHolstien <rholstien@expediagroup.com>
2021-02-17 13:28:13 -08:00
John Joyce
715fb7d7f7
bug(docker): Removing datahub-gms-graphql-service from default docker-compose.yml file (#2111) 2021-02-16 21:47:38 -08:00
John Joyce
11f030b118 Adding Github Action to publish Docker image 2021-02-15 17:40:41 -08:00
John Joyce
ef04205a73 Serving React App via Docker 2021-02-15 17:40:41 -08:00
John Joyce
e3ac44cfd4 Fixes 2021-02-15 17:40:41 -08:00
John Joyce
8ef5e2a545 Deploy React via Docker 2021-02-15 17:40:41 -08:00
Arun Vasudevan
84e952e138
feat (graphql): Datahub GMS Graphql Api Application for Querying Dataset (#2071) 2021-02-01 11:51:15 -08:00
Satyaprakash Bommaraju
782e29ce53
Fix for Kafka-UI Connectivity Error with Kafka-Rest Proxy (#2053)
Fixes an error when accessing http://localhost:18000, where the Kafka-UI was unable to connect to the Rest Proxy.
2021-01-12 10:30:47 -08:00
Mars Lan
36b79a3ef3
build(docker): add script to clean up docker environment (#2013)
Co-authored-by: Mars Lan <mars@trayminder.com>
2020-12-17 13:52:31 -08:00
John Plaisted
838f964114
feat: add elasticsearch sanity integration tests (#2028)
These tests verify that, given an index's settings and mappings, data can be written to the index and read back from it with a query_all query. These are very simple sanity tests.

We can, and should, write more complex tests that are specific to each index in the future.
2020-12-02 20:49:34 -08:00
John Plaisted
5f9d967451
fix: ingestion docker image (#2027)
The environment was not set correctly, so it could not fire kafka events. It (mce-cli) always worked when running outside of docker.

I also added a dev ingestion docker image / script which may be much faster if you've already built locally.

Tested:
1. Cleaned docker volumes and started datahub. Verified it is empty.
2. Built with gradle.
3. Ran ./docker/ingestion/ingestion-dev.sh. Verified data shows in DataHub.
4. Ran step 1 again.
5. Ran ./docker/ingestion/ingestion.sh. Verified data shows in DataHub.
2020-12-02 17:40:12 -08:00
Nagarjuna Kanamarlapudi
a1e7e26e08
Fix dataset index creation issue (#2022) 2020-11-30 18:33:06 -08:00
Kerem Sahin
4d8320e4a0
feat(dashboard): Dashboards backend implementation (#1884) 2020-11-23 09:25:58 -08:00
Nagarjuna Kanamarlapudi
5d083143db
feat(dataset): Enable search of datasets by field names (#2001)
* feat(dataset): Enable search of datasets by field names
2020-11-20 12:01:07 -08:00
Jyoti Wadhwani
70ddb09d29
feat: enable SCSI for datasets (#1986)
* enable SCSI for datasets

* Update scsi-onboarding-guide.md
2020-11-11 13:04:20 -08:00
Kerem Sahin
b989f9d16a
Upgrade neo4j to 4.0 (#1960) 2020-10-26 05:31:00 -07:00
Mars Lan
93805e7f1a
build(docker): use community version of ES & Kibana in quickstart (#1929)
Fixes #1928
2020-10-07 21:21:08 -07:00
Fredrik Sannholm
125ae288f1
docker: Run as non-root user in docker (#1914) 2020-10-06 04:35:38 -07:00
Mars Lan
a13ca65e02
Update README.md 2020-09-30 06:20:14 -07:00
Grant Nicholas
9bcf273661
fix(docker): update mae and mce consumer images to include glibc compat layer. allows the consumer jobs to deal with snappy compressed kafka topics when running on alpine linux (#1899) 2020-09-28 15:30:56 -07:00
John Plaisted
821bce7d69
feat: Port mce-cli to Java. (#1871)
Port mce-cli to Java.

Also moved off the avro format event file to json instead. Much nicer to use :)
2020-09-25 14:05:29 -07:00
Fredrik Sannholm
d50b9c01b4
fix (docker): Fix install of Chrome in frontend Dockerimage (#1889)
* fix (docker): Fix install of Chrome in frontend Dockerimage

Retry installing Chrome after dependencies have been installed

* fix (docker): Install Chrome with apt-get

Install Chrome and dependencies at the same time, using apt-get
2020-09-22 12:02:37 -07:00
Kerem Sahin
ece9b82f7a
Update README.md 2020-08-19 21:39:46 -07:00
Kerem Sahin
21a5c9e607
Update README.md 2020-08-19 21:38:03 -07:00
na zhang
97424509d1
add description field for dataset index mapping (#1791) 2020-08-09 17:35:17 -07:00
Mars Lan
aa0a62e991
Update README.md 2020-08-08 04:58:55 -07:00
John Plaisted
b8e18b0b5d
refactor(docker): make docker files easier to use during development. (#1777)
* Make docker files easier to use during development.

During development it is quite nice to have docker work with locally built code. This allows you to launch all services very quickly, with your changes, and optionally with debugging support.

Changes made to docker files:
- Remove all redundant docker-compose files. We now have 1 giant file, and smaller files to use as overrides.
- Remove redundant README files that provided little information.
- Rename docker/<dir> to match the service name in the docker-compose file for clarity.
- Move environment variables to .env files. We only provide dev / the default environment for quickstart.
- Add debug options to docker files, using multistage builds to produce minimal images with the idea that built files will be mounted instead.
- Add a docker/dev.sh script + compose file to easily use the dev override images (separate tag; images never published; uses debug docker files; mounts binaries to image).
- Add docs/docker documentation for this.
2020-08-06 16:38:53 -07:00
Chris Lee
4143fb901e
<refactor>[ingestions]: align the default kafka topics with PR #1756 (#1758) 2020-07-29 20:26:01 -07:00
Mars Lan
00d89115b2
feat(gms): add postgres & mariadb support to GMS (#1742)
* feat(gms): add postgres & mariadb support to GMS

Also add corresponding docker-compose files

* Update README.md
2020-07-22 19:39:58 -07:00
Liangjun Jiang
5d078aa617
Implemented data process search feature (#1706)
* implement search feature

* add test for dataprocessIndexBuilder; refactor code based on feedback

* update based on PR feedback

* Update DataProcessDocument.pdl

fixed typo wording.

* add not null check for data process info
2020-06-29 10:20:22 -07:00
Kerem Sahin
2dc11a51f4
fix(py3): Bump ingestion Docker py dependency to 3.6 (#1716) 2020-06-29 08:22:50 -07:00
Kerem Sahin
9501e9bd70 docs: Graph onboarding demo 2020-06-26 01:10:44 -07:00
Mars Lan
34d6f4ed09
Update README.md 2020-06-25 19:24:03 -07:00
Liangjun Jiang
92c4a3689e
Data process entity (#1680)
* add job info as aspect of a dataset

* add job urn def., aspect and entity

* job entity with upstream and downstream lineage

* use job urn in upstream & downstream

* add Job entity rest APIs

* rest.li api, impl and factory for job entity

* code cleanup

* use pdl; onboard data process entity

* add es index json

* fix gradlew build ignored tasks

* add a comment about data process info field

* fix style warning issues

* update content based on PR

* checked in generated snapshot json

* updated based on PR feedback

* update data process data format

* updated based on code review feedback

* revert back gms & mce-job docker image

* delete temp files

* update based on PR feedback

* file name and a typo

* format with linkedin style

Co-authored-by: Liangjun <liajiang@expediagroup.com>
2020-06-09 15:42:08 -07:00
Mars Lan
4f221f9a12
build(docker): refactor docker build scripts (#1687)
* build(docker): refactor docker build scripts

- add "build" option to docker-compose files to simplify rebuilding of images
- create "start.sh" script so it's easier to override "command" in the quickstart's docker-compose file
- use dockerize to wait for requisite services to start up
- add a dedicated Dockerfile for kafka-setup

This fixes https://github.com/linkedin/datahub/issues/1549 & https://github.com/linkedin/datahub/issues/1550
2020-06-08 13:37:14 -07:00
Mars Lan
94ffb300a9
build(docker): refactor ingestion docker build script (#1690)
- add "build" option to docker-compose file to simplify rebuilding of images
- move command from docker-compose.yml to Dockerfile
- add ingestion.sh script to simplify quickstart instruction and to reduce confusion
2020-06-05 14:39:20 -07:00
Mars Lan
920a1774dc
docs: points to docker images hosted by linkedin org (#1683)
We now use GitHub Actions to build & publish docker images to docker hub under the linkedin org.
Also allow overriding image tags via the DATAHUB_VERSION environment variable.
2020-06-01 09:36:51 -07:00
Mars Lan
505c8c03ca
build: further clean up gms Dockerfile 2020-05-29 16:14:46 -07:00
Mars Lan
e812dd7a16
build: clean up gms Dockerfile 2020-05-29 16:06:15 -07:00
Mars Lan
509b2e1515
refactor: use named volume instead of bind mount in quickstart (#1669)
Volumes are preferred over bind mounts (https://docs.docker.com/storage/volumes/) for persistent container data.
This also eliminates the need for the ugly chmod hack for elasticsearch and hopefully fixes https://github.com/linkedin/datahub/issues/1650
2020-05-11 09:06:38 -07:00
e11it
31887dfbea
fix(quickstart): set utf8mb4 for mysql (#1657)
Co-authored-by: Ilya Makarov <makarov_ia@nlmk.com>
2020-05-04 20:04:25 -07:00