
title | hide_title |
---|---|
Deploying with Docker | true |
Docker Images
Prerequisites
You need to install Docker and docker-compose (on Linux; on Windows and Mac, Compose is included with Docker Desktop).
Make sure to allocate enough hardware resources to the Docker engine. Tested and confirmed configuration: 2 CPUs, 8GB RAM, 2GB swap area.
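As a quick sanity check for the prerequisites above, the following is a minimal sketch that reports which required tools are missing from PATH. The function names have_tool and check_prereqs are hypothetical helpers introduced here for illustration; they are not part of DataHub or Docker, and the sketch does not verify the CPU, RAM, or swap allocation.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the named command is on PATH, non-zero otherwise.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: print one line per missing tool.
# Returns 0 when every tool was found, 1 when at least one is missing.
check_prereqs() {
    missing=0
    for tool in "$@"; do
        if ! have_tool "$tool"; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

On Linux this might be called as check_prereqs docker docker-compose; no output and a zero exit status mean both tools were found.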
Quickstart
The easiest way to bring up and test DataHub is to use the DataHub Docker images, which are continuously deployed to Docker Hub with every commit to the repository.
You can download and run all of these images and their dependencies with our quick start guide.
DataHub Docker Images:
Do not use the latest or debug tags for any of the images, as those are not supported and are present only for legacy reasons. Please use head or version-specific tags such as v0.8.40. For production we recommend using version-specific tags, not head.
- linkedin/datahub-ingestion - This contains the Python CLI. If you are looking for a Docker image for every minor CLI release, you can find them under acryldata/datahub-ingestion.
- linkedin/datahub-gms
- linkedin/datahub-frontend-react
- linkedin/datahub-mae-consumer
- linkedin/datahub-mce-consumer
- acryldata/datahub-upgrade
- linkedin/datahub-kafka-setup
- linkedin/datahub-elasticsearch-setup
- acryldata/datahub-mysql-setup
- acryldata/datahub-postgres-setup
- acryldata/datahub-actions. Do not use acryldata/acryl-datahub-actions, as that is deprecated and no longer used.
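The tag guidance above can be sketched as a small check that could run before pulling or deploying an image. This is only an illustration: classify_tag and its output labels are hypothetical, and the v[0-9]* pattern is an assumption about what version-specific tags look like, based on the v0.8.40 example.

```shell
#!/bin/sh
# Hypothetical helper: classify a DataHub image tag according to the guidance above.
# Prints one of: unsupported, ok-not-for-production, ok, unknown.
classify_tag() {
    case "${1:-}" in
        latest|debug) echo "unsupported" ;;            # present only for legacy reasons
        head)         echo "ok-not-for-production" ;;  # supported, but not recommended for production
        v[0-9]*)      echo "ok" ;;                     # assumed shape of version-specific tags, e.g. v0.8.40
        *)            echo "unknown" ;;
    esac
}
```

For example, a deployment script could refuse to continue when classify_tag "$TAG" prints anything other than ok.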
Dependencies:
Ingesting demo data.
If you want to test ingesting some data once DataHub is up, use the ./docker/ingestion/ingestion.sh script or run the datahub docker ingest-sample-data command. See the quickstart guide for more details.
Using Docker Images During Development
See Using Docker Images During Development.
Building And Deploying Docker Images
We use GitHub Actions to build and continuously deploy our images. There should be no need to do this manually; a successful release on GitHub will automatically publish the images.
Building images
This is not our recommended development flow; most developers should follow the Using Docker Images During Development guide.
To build the full images (that we are going to publish), you need to run the following:
COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -p datahub build
This is because we rely on BuildKit for multistage builds. It does not hurt to also set DATAHUB_VERSION to something unique.
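One way to combine the build command with a unique DATAHUB_VERSION is sketched below. The datahub_build wrapper and its DRY_RUN switch are hypothetical conveniences added for illustration; only the environment variables and the docker-compose invocation come from the text above, and the sketch assumes DATAHUB_VERSION is passed as an environment variable.

```shell
#!/bin/sh
# Hypothetical wrapper: build the full images with BuildKit enabled and a
# caller-chosen DATAHUB_VERSION. With DRY_RUN=1 the command is printed, not run.
datahub_build() {
    version="${1:-}"
    if [ -z "$version" ]; then
        echo "usage: datahub_build <version>" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "DATAHUB_VERSION=$version COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose -p datahub build"
        return 0
    fi
    # Same command as above, with DATAHUB_VERSION added to the environment.
    DATAHUB_VERSION="$version" COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 \
        docker-compose -p datahub build
}
```

Running DRY_RUN=1 datahub_build my-test-build prints the command so it can be inspected before an actual build is started.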
Community Built Images
As the open source project grows, community members would like to contribute additions to the Docker images. Not all contributions to the images can be accepted: some changes are not useful to all community members, and they can increase build times, add dependencies, and introduce possible security vulnerabilities. In those cases, this section can be used to point to Dockerfiles hosted by the community that build on top of the images published by the DataHub core team, along with links to any container registries where the resulting images are maintained.