Mirror of https://github.com/datahub-project/datahub.git (synced 2025-06-27 05:03:31 +00:00)

Enable Gradle for all image builds used for publishing, eliminating the per-image build action in docker-unified.yml, which duplicated the Gradle logic but used slightly different mechanisms to determine the tag. The Gradle build now consumes tags provided by the workflow and produces the same tags as before. Bake matrix builds are used to build the slim and full versions of datahub-ingestion and datahub-actions. Image publishing and scanning rely on Gradle to get the list of images, via depot. Image publishing and scans run once a day on a schedule, or on manual triggers only. Pending work: move publishing and scans into a separate workflow that runs on a schedule and could also run other tests.
DataHub Metadata Ingestion Docker Image
Refer to the metadata ingestion framework to understand the architecture and responsibilities of this service.
Slim vs Full Image Build
There are two versions of this image: slim and full. The full image includes pyspark and Oracle dependencies and is larger because of the Java dependencies.
By default, the standard build produces the slim image, which does not include pyspark. To build the full image with pyspark, pass the project property -PdockerTarget=full.
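As a rough sketch of how the property might be passed on the command line, the helper below prints the Gradle invocation for each target. Only -PdockerTarget=full comes from this README. The task path in DOCKER_TASK and the build_cmd helper are assumptions made for illustration, so check the repository's Gradle files for the actual task name.

```shell
# Hypothetical task path (an assumption, not confirmed by this README).
DOCKER_TASK=":docker:datahub-ingestion:docker"

# Print the Gradle command for a given image target: "slim" (default) or "full".
build_cmd() {
  target="${1:-slim}"
  case "$target" in
    # The slim image is the default, so no extra property is needed.
    slim) echo "./gradlew $DOCKER_TASK" ;;
    # The full image (with pyspark) needs the dockerTarget project property.
    full) echo "./gradlew $DOCKER_TASK -PdockerTarget=full" ;;
    *)    echo "unknown target: $target" >&2; return 1 ;;
  esac
}

build_cmd full
```

The helper only prints the command. To run a build, copy the printed line into a shell at the repository root.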