datahub/docker/datahub-ingestion
Chakru 51863325a5
build: use versioning in gradle consistent with ci (#13259)
Enable use of gradle for all image builds for publishing, eliminating the per-image build action in docker-unified.yml that duplicated what was in gradle but used slightly different mechanisms to determine what is the tag. Enabled gradle build to consume tags provided by the workflow and produce tags same as earlier.

Use bake matrix builds to build slim/full versions of datahub-ingestion, datahub-actions.

Publish images and scan relies on gradle to get the list of images, via depot.

Image publish and scans run once a day on schedule or on manual triggers only.
Pending work: Separate the publish and scans into a separate workflow that runs on a schedule and could also run other tests.
2025-04-24 23:00:03 +05:30
..

DataHub Metadata Ingestion Docker Image

datahub-ingestion docker

Refer to the metadata ingestion framework to understand the architecture and responsibilities of this service.

Slim vs Full Image Build

There are two versions of this image. One includes pyspark and Oracle dependencies and is larger due to the java dependencies.

Running the standard build results in the slim image without pyspark being generated by default. In order to build the full image with pyspark use the following project property -PdockerTarget=full.