feat(ingestion): make REST emitter batch max payload size configurable (#14123)

Pedro Silva 2025-07-21 18:42:55 +01:00 committed by GitHub
parent 6f791b3d4e
commit 12db9aa879
2 changed files with 4 additions and 1 deletion

@@ -67,6 +67,7 @@ This file documents any backwards-incompatible changes in DataHub and assists pe
### Other Notable Changes
- The `acryl-datahub-actions` package now requires Pydantic V2, while it previously was compatible with both Pydantic V1 and V2.
+- #14123: Adds a new environment variable `DATAHUB_REST_EMITTER_BATCH_MAX_PAYLOAD_BYTES` to control batch size limits when using the RestEmitter in ingestions. Default is 15MB but configurable.
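
As a rough illustration of how this setting resolves, the sketch below mirrors the `int(os.getenv(...))` pattern from this commit. The helper `resolve_max_payload_bytes` and the constant names `ENV_VAR` and `DEFAULT_MAX_PAYLOAD_BYTES` are hypothetical and exist only for this example; they are not part of DataHub. Because the commit assigns the value at module level, the variable is presumably read once at import time, so it would need to be set before the ingestion process starts.

```python
import os

# Name of the environment variable introduced by this commit.
ENV_VAR = "DATAHUB_REST_EMITTER_BATCH_MAX_PAYLOAD_BYTES"
# Default used by the commit: 15 * 1024 * 1024 bytes.
DEFAULT_MAX_PAYLOAD_BYTES = 15 * 1024 * 1024


def resolve_max_payload_bytes(environ=None):
    """Hypothetical helper mirroring int(os.getenv(ENV_VAR, default))."""
    environ = os.environ if environ is None else environ
    # The environment always yields strings; int() converts them.
    # A value such as "5MB" raises ValueError, so use a plain byte count.
    return int(environ.get(ENV_VAR, DEFAULT_MAX_PAYLOAD_BYTES))


# Variable unset: falls back to the default.
print(resolve_max_payload_bytes({}))  # 15728640
# Variable set to 5 * 1024 * 1024 bytes.
print(resolve_max_payload_bytes({ENV_VAR: "5242880"}))  # 5242880
```

The value is interpreted as a raw number of bytes, so a limit of 5 MiB would be written as `5242880` rather than with a unit suffix.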
## 1.1.0

@@ -98,7 +98,9 @@ TRACE_BACKOFF_FACTOR = 2.0 # Double the wait time each attempt
# The limit is 16mb. We will use a max of 15mb to have some space
# for overhead like request headers.
# This applies to pretty much all calls to GMS.
-INGEST_MAX_PAYLOAD_BYTES = 15 * 1024 * 1024
+INGEST_MAX_PAYLOAD_BYTES = int(
+    os.getenv("DATAHUB_REST_EMITTER_BATCH_MAX_PAYLOAD_BYTES", 15 * 1024 * 1024)
+)
# This limit is somewhat arbitrary. All GMS endpoints will timeout
# and return a 500 if processing takes too long. To avoid sending