---
title: "Deployment Environment Variables"
---

# Environment Variables

The following is a summary of important environment variables that control how DataHub works.

## Feature Flags

| Variable                                         | Default | Unit/Type | Components                              | Description                                                                                                                 |
| ------------------------------------------------ | ------- | --------- | --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- |
| `UI_INGESTION_ENABLED`                           | `true`  | boolean   | [`GMS`, `MCE Consumer`]                 | Enable UI based ingestion.                                                                                                  |
| `DATAHUB_ANALYTICS_ENABLED`                      | `true`  | boolean   | [`Frontend`, `GMS`]                     | Collect DataHub usage to populate the analytics dashboard.                                                                  |
| `BOOTSTRAP_SYSTEM_UPDATE_WAIT_FOR_SYSTEM_UPDATE` | `true`  | boolean   | [`GMS`, `MCE Consumer`, `MAE Consumer`] | Wait for the `system-update` to complete before starting. This should typically only be disabled during development.        |
| `ER_MODEL_RELATIONSHIP_FEATURE_ENABLED`          | `false` | boolean   | [`Frontend`, `GMS`]                     | Enable ER Model Relation Feature that shows Relationships Tab within a Dataset UI.                                          |
| `STRICT_URN_VALIDATION_ENABLED`                  | `false` | boolean   | [`GMS`, `MCE Consumer`, `MAE Consumer`] | Enable stricter URN validation logic.                                                                                       |
| `SHOW_MANAGE_STRUCTURED_PROPERTIES`              | `true`  | boolean   | [`GMS`]                                 | Controls whether the Structured Properties page is visible and accessible via the UI.                                       |
| `SCHEMA_FIELD_CLL_ENABLED`                       | `true`  | boolean   | [`GMS`]                                 | Controls whether the Column-level lineage focus view is accessible via the lineage Graph.                                   |
| `SCHEMA_FIELD_LINEAGE_IGNORE_STATUS`             | `true`  | boolean   | [`GMS`]                                 | Controls whether lineage ignores the schema field status aspect, reading the parent's status aspect instead.                |
| `HIDE_DBT_SOURCE_IN_LINEAGE`                     | `false` | boolean   | [`GMS`]                                 | Hides dbt source entities from lineage graphs when used with specific dbt ingestion settings.                               |
| `SHOW_NAV_BAR_REDESIGN`                          | `false` | boolean   | [`GMS`]                                 | Enables the new navigation bar redesign.                                                                                    |
| `SHOW_MANAGE_TAGS`                               | `true`  | boolean   | [`GMS`]                                 | Enables the manage tags page.                                                                                               |

## Ingestion

| Variable                          | Default | Unit/Type | Components              | Description                                                                                                                                                                       |
| --------------------------------- | ------- | --------- | ----------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `ASYNC_INGEST_DEFAULT`            | `false` | boolean   | [`GMS`]                 | Asynchronously process ingestProposals by writing the ingestion MCP to Kafka. Typically enabled with standalone consumers.                                                        |
| `MCP_CONSUMER_ENABLED`            | `true`  | boolean   | [`GMS`, `MCE Consumer`] | When running in standalone mode, disabled on `GMS` and enabled on separate `MCE Consumer`.                                                                                        |
| `MCL_CONSUMER_ENABLED`            | `true`  | boolean   | [`GMS`, `MAE Consumer`] | When running in standalone mode, disabled on `GMS` and enabled on separate `MAE Consumer`.                                                                                        |
| `PE_CONSUMER_ENABLED`             | `true`  | boolean   | [`GMS`, `MAE Consumer`] | When running in standalone mode, disabled on `GMS` and enabled on separate `MAE Consumer`.                                                                                        |
| `ES_BULK_REQUESTS_LIMIT`          | 1000    | docs      | [`GMS`, `MAE Consumer`] | Number of bulk documents to index. `MAE Consumer` if standalone.                                                                                                                  |
| `ES_BULK_FLUSH_PERIOD`            | 1       | seconds   | [`GMS`, `MAE Consumer`] | How frequently indexed documents are made available for query.                                                                                                                    |
| `ALWAYS_EMIT_CHANGE_LOG`          | `false` | boolean   | [`GMS`]                 | Enables always emitting an MCL even when no changes are detected. Used for Time Based Lineage when no changes occur.                                                              |
| `GRAPH_SERVICE_DIFF_MODE_ENABLED` | `true`  | boolean   | [`GMS`]                 | Enables diff mode for graph writes. Uses a code path that computes a diff between the previous and next state to write relationships instead of wholesale deleting and re-adding edges. |
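
The consumer flags above are split across components when running in standalone mode. As a minimal sketch of that split, the hypothetical helper `consumer_env` below (not part of DataHub) returns the values each component would receive. It assumes the embedded case simply uses the documented defaults and that `ASYNC_INGEST_DEFAULT` is turned on alongside standalone consumers, which the table describes as typical rather than required.

```python
def consumer_env(standalone: bool) -> dict[str, dict[str, str]]:
    """Return consumer-related variables per component (illustrative only)."""
    if not standalone:
        # Embedded mode: GMS runs every consumer itself (the documented defaults).
        return {
            "GMS": {
                "MCP_CONSUMER_ENABLED": "true",
                "MCL_CONSUMER_ENABLED": "true",
                "PE_CONSUMER_ENABLED": "true",
            }
        }
    # Standalone mode: consumers are disabled on GMS and enabled on the
    # separate consumer components.
    return {
        "GMS": {
            "MCP_CONSUMER_ENABLED": "false",
            "MCL_CONSUMER_ENABLED": "false",
            "PE_CONSUMER_ENABLED": "false",
            # Assumption: async ingestion is enabled together with standalone consumers.
            "ASYNC_INGEST_DEFAULT": "true",
        },
        "MCE Consumer": {"MCP_CONSUMER_ENABLED": "true"},
        "MAE Consumer": {
            "MCL_CONSUMER_ENABLED": "true",
            "PE_CONSUMER_ENABLED": "true",
        },
    }


if __name__ == "__main__":
    for component, env in consumer_env(standalone=True).items():
        print(component, env)
```

The returned dictionaries can be copied into whichever deployment mechanism is in use; the helper itself only encodes the table's rules.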

## Caching

| Variable                                   | Default  | Unit/Type | Components | Description                                                                          |
| ------------------------------------------ | -------- | --------- | ---------- | ------------------------------------------------------------------------------------ |
| `SEARCH_SERVICE_ENABLE_CACHE`              | `false`  | boolean   | [`GMS`]    | Enable caching of search results.                                                    |
| `SEARCH_SERVICE_CACHE_IMPLEMENTATION`      | caffeine | string    | [`GMS`]    | Set to `hazelcast` if the number of GMS replicas > 1 to enable a distributed cache.  |
| `CACHE_TTL_SECONDS`                        | 600      | seconds   | [`GMS`]    | Default cache time to live.                                                          |
| `CACHE_MAX_SIZE`                           | 10000    | objects   | [`GMS`]    | Maximum number of items to cache.                                                    |
| `LINEAGE_SEARCH_CACHE_ENABLED`             | `true`   | boolean   | [`GMS`]    | Enables in-memory cache for searchAcrossLineage query.                               |
| `CACHE_ENTITY_COUNTS_TTL_SECONDS`          | 600      | seconds   | [`GMS`]    | Homepage entity count time to live.                                                  |
| `CACHE_SEARCH_LINEAGE_TTL_SECONDS`         | 86400    | seconds   | [`GMS`]    | Search lineage cache time to live.                                                   |
| `CACHE_SEARCH_LINEAGE_LIGHTNING_THRESHOLD` | 300      | objects   | [`GMS`]    | Lineage graphs exceeding this limit will use a local cache.                          |
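
The choice of cache implementation depends on the number of GMS replicas. The sketch below encodes that rule in a hypothetical helper, `search_cache_env`, which is an illustration and not part of DataHub. It assumes the search cache should be enabled whenever the helper is called.

```python
def search_cache_env(gms_replicas: int) -> dict[str, str]:
    """Pick search cache settings from the GMS replica count (illustrative only)."""
    if gms_replicas < 1:
        raise ValueError("gms_replicas must be at least 1")
    # A single replica can use the default local cache; more than one replica
    # needs the distributed implementation, as the table above states.
    implementation = "hazelcast" if gms_replicas > 1 else "caffeine"
    return {
        "SEARCH_SERVICE_ENABLE_CACHE": "true",
        "SEARCH_SERVICE_CACHE_IMPLEMENTATION": implementation,
    }


if __name__ == "__main__":
    print(search_cache_env(3))
```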

## Search

| Variable                                            | Default             | Unit/Type | Components                                                      | Description                                                                                                              |
| --------------------------------------------------- | ------------------- | --------- | --------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| `INDEX_PREFIX`                                      | ``                  | string    | [`GMS`, `MAE Consumer`, `Elasticsearch Setup`, `System Update`] | Prefix Elasticsearch indices with the given string.                                                                      |
| `ELASTICSEARCH_NUM_SHARDS_PER_INDEX`                | 1                   | integer   | [`System Update`]                                               | Default number of shards per Elasticsearch index.                                                                        |
| `ELASTICSEARCH_NUM_REPLICAS_PER_INDEX`              | 1                   | integer   | [`System Update`]                                               | Default number of replicas per Elasticsearch index.                                                                      |
| `ELASTICSEARCH_BUILD_INDICES_RETENTION_VALUE`       | 60                  | integer   | [`System Update`]                                               | Number of units for the retention of Elasticsearch clone/backup indices.                                                 |
| `ELASTICSEARCH_BUILD_INDICES_RETENTION_UNIT`        | DAYS                | string    | [`System Update`]                                               | Unit for the retention of Elasticsearch clone/backup indices.                                                            |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_EXCLUSIVE`         | `false`             | boolean   | [`GMS`]                                                         | Only return exact matches when using quotes.                                                                             |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_WITH_PREFIX`       | `true`              | boolean   | [`GMS`]                                                         | Include prefix match in exact match results.                                                                             |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_FACTOR`            | 10.0                | float     | [`GMS`]                                                         | Multiply by this number on true exact match.                                                                             |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_PREFIX_FACTOR`     | 1.6                 | float     | [`GMS`]                                                         | Multiply by this number when prefix match.                                                                               |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_CASE_FACTOR`       | 0.7                 | float     | [`GMS`]                                                         | Multiply by this number when case insensitive match.                                                                     |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_ENABLE_STRUCTURED` | `true`              | boolean   | [`GMS`]                                                         | When using structured query, also include exact matches.                                                                 |
| `ELASTICSEARCH_QUERY_PARTIAL_URN_FACTOR`            | 0.5                 | float     | [`GMS`]                                                         | Multiply by this number when partial token match on URN.                                                                 |
| `ELASTICSEARCH_QUERY_PARTIAL_FACTOR`                | 0.4                 | float     | [`GMS`]                                                         | Multiply by this number when partial token match on non-URN field.                                                       |
| `ELASTICSEARCH_QUERY_CUSTOM_CONFIG_ENABLED`         | `true`              | boolean   | [`GMS`]                                                         | Enable search query and ranking customization configuration.                                                             |
| `ELASTICSEARCH_QUERY_CUSTOM_CONFIG_FILE`            | `search_config.yml` | string    | [`GMS`]                                                         | The location of the search customization configuration.                                                                  |
| `ELASTICSEARCH_INDEX_BUILDER_MAPPINGS_REINDEX`      | `false`             | boolean   | [`System Update`]                                               | Enable reindexing on Elasticsearch schema changes.                                                                       |
| `ENABLE_STRUCTURED_PROPERTIES_SYSTEM_UPDATE`        | `false`             | boolean   | [`System Update`]                                               | Enable reindexing to remove hard deleted structured properties.                                                          |
| `ELASTICSEARCH_LIMIT_RESULTS_MAX`                   | 2000                | integer   | [`GMS`]                                                         | Maximum search results per page.                                                                                         |
| `ELASTICSEARCH_LIMIT_RESULTS_STRICT`                | `false`             | boolean   | [`GMS`]                                                         | If `false`, reduce the page size to the maximum rather than throw an exception if the request exceeds the maximum value. |
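
The interaction between `ELASTICSEARCH_LIMIT_RESULTS_MAX` and `ELASTICSEARCH_LIMIT_RESULTS_STRICT` can be pictured with a small sketch. The function name `effective_page_size` and the use of `ValueError` are assumptions made for illustration; this is not DataHub's implementation.

```python
def effective_page_size(requested: int, max_results: int = 2000, strict: bool = False) -> int:
    """Apply the page-size limit described in the table (illustrative only)."""
    if requested <= max_results:
        return requested
    if strict:
        # Strict mode: reject the request instead of adjusting it.
        raise ValueError(
            f"Requested page size {requested} exceeds the maximum of {max_results}"
        )
    # Non-strict mode: silently reduce the page size to the maximum.
    return max_results


if __name__ == "__main__":
    print(effective_page_size(5000))
```

With the defaults, a request for 5000 results is reduced to 2000, while the same request with `strict=True` raises an error.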

## Entities and Versions

| Variable                    | Default | Unit/Type | Components                              | Description                                                                                                        |
| --------------------------- | ------- | --------- | --------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| `ENTITY_VERSIONING_ENABLED` | `false` | boolean   | [`GMS`, `MCE Consumer`, `MAE Consumer`] | Enables entity versioning related resolvers, validators, side effects, etc. to support versioned entities.         |
| `ALTERNATE_MCP_VALIDATION`  | `false` | boolean   | [`GMS`]                                 | Enables an alternate MCP validation pathway for MCPs that should be validated only after applying a mutation hook. |

## Kafka

Many Kafka configuration environment variables for both the producer and consumers are defined in the official Spring documentation [here](https://docs.spring.io/spring-boot/docs/2.7.10/reference/html/application-properties.html#appendix.application-properties.integration).
These environment variables follow the standard Spring representation of properties as environment variables.
Simply replace the dot, `.`, with an underscore, `_`, and convert to uppercase.
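
As a minimal sketch of that conversion, the hypothetical helper `spring_property_to_env` below turns a Spring property name into its environment variable form. It also removes dashes, which is part of Spring Boot's environment variable binding rules but is not stated above.

```python
def spring_property_to_env(name: str) -> str:
    """Convert a Spring property name to its environment variable form."""
    # Replace dots with underscores, drop dashes, then uppercase.
    return name.replace(".", "_").replace("-", "").upper()


if __name__ == "__main__":
    print(spring_property_to_env("spring.kafka.producer.properties.max.request.size"))
```

For example, `spring.kafka.producer.properties.max.request.size` becomes `SPRING_KAFKA_PRODUCER_PROPERTIES_MAX_REQUEST_SIZE`, which is the form used in the table below.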

| Variable                                            | Default                                      | Unit/Type | Components                                          | Description                                                                                                                                                                                                                                                                                         |
| --------------------------------------------------- | -------------------------------------------- | --------- | --------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `KAFKA_LISTENER_CONCURRENCY`                        | 1                                            | integer   | [`GMS`, `MCE Consumer`, `MAE Consumer`]             | Number of Kafka consumer threads. Optimize throughput by matching to topic partitions.                                                                                                                                                                                                              |
| `SPRING_KAFKA_PRODUCER_PROPERTIES_MAX_REQUEST_SIZE` | 1048576                                      | bytes     | [`GMS`, `MCE Consumer`, `MAE Consumer`]             | Max produced message size. Note that the topic configuration is not controlled by this variable.                                                                                                                                                                                                    |
| `SCHEMA_REGISTRY_TYPE`                              | `INTERNAL`                                   | string    | [`GMS`, `MCE Consumer`, `MAE Consumer`]             | Schema registry implementation. One of `INTERNAL` or `KAFKA` or `AWS_GLUE`                                                                                                                                                                                                                          |
| `KAFKA_SCHEMAREGISTRY_URL`                          | `http://localhost:8080/schema-registry/api/` | string    | [`GMS`, `MCE Consumer`, `MAE Consumer`]             | Schema registry url. Used for `INTERNAL` and `KAFKA`. The default value is for the `GMS` component. The `MCE Consumer` and `MAE Consumer` should be the `GMS` hostname and port.                                                                                                                    |
| `AWS_GLUE_SCHEMA_REGISTRY_REGION`                   | `us-east-1`                                  | string    | [`GMS`, `MCE Consumer`, `MAE Consumer`]             | If using `AWS_GLUE` in the `SCHEMA_REGISTRY_TYPE` variable for the schema registry implementation.                                                                                                                                                                                                  |
| `AWS_GLUE_SCHEMA_REGISTRY_NAME`                     | ``                                           | string    | [`GMS`, `MCE Consumer`, `MAE Consumer`]             | If using `AWS_GLUE` in the `SCHEMA_REGISTRY_TYPE` variable for the schema registry.                                                                                                                                                                                                                 |
| `USE_CONFLUENT_SCHEMA_REGISTRY`                     | `true`                                       | boolean   | [`kafka-setup`]                                     | Enable Confluent schema registry configuration.                                                                                                                                                                                                                                                     |
| `KAFKA_PRODUCER_MAX_REQUEST_SIZE`                   | `5242880`                                    | integer   | [`Frontend`, `GMS`, `MCE Consumer`, `MAE Consumer`] | Max produced message size. Note that the topic configuration is not controlled by this variable.                                                                                                                                                                                                    |
| `KAFKA_CONSUMER_MAX_PARTITION_FETCH_BYTES`          | `5242880`                                    | integer   | [`GMS`, `MCE Consumer`, `MAE Consumer`]             | The maximum amount of data per-partition the server will return. Records are fetched in batches by the consumer. If the first record batch in the first non-empty partition of the fetch is larger than this limit, the batch will still be returned to ensure that the consumer can make progress. |
| `MAX_MESSAGE_BYTES`                                 | `5242880`                                    | integer   | [`kafka-setup`]                                     | Sets the max message size on the kakfa topics.                                                                                                                                                                                                                                                      |
| `KAFKA_PRODUCER_COMPRESSION_TYPE`                   | `snappy`                                     | string    | [`Frontend`, `GMS`, `MCE Consumer`, `MAE Consumer`] | The compression used by the producer.                                                                                                                                                                                                                                                               |
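
Because the producer limit and the topic limit are set by different variables on different components, they can drift apart. The sketch below is a hypothetical pre-deployment check, not part of DataHub. It assumes the values from all components have been collected into one mapping and that unset variables fall back to the documented default of `5242880`.

```python
from typing import Mapping

# Documented default for both variables, in bytes.
DEFAULT_MAX_BYTES = 5242880


def check_message_sizes(env: Mapping[str, str]) -> list[str]:
    """Return warnings when the producer limit exceeds the topic limit."""
    producer_max = int(env.get("KAFKA_PRODUCER_MAX_REQUEST_SIZE", DEFAULT_MAX_BYTES))
    topic_max = int(env.get("MAX_MESSAGE_BYTES", DEFAULT_MAX_BYTES))

    warnings = []
    if producer_max > topic_max:
        # The producer could send messages that the topic is configured to reject.
        warnings.append(
            f"KAFKA_PRODUCER_MAX_REQUEST_SIZE ({producer_max}) exceeds "
            f"MAX_MESSAGE_BYTES ({topic_max})"
        )
    return warnings


if __name__ == "__main__":
    print(check_message_sizes({"KAFKA_PRODUCER_MAX_REQUEST_SIZE": "10485760"}))
```

Raising the producer limit alone yields one warning; raising `MAX_MESSAGE_BYTES` to match clears it.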

## Backend

| Variable                                     | Default | Unit/Type | Components | Description                                                                                                   |
| -------------------------------------------- | ------- | --------- | ---------- | ------------------------------------------------------------------------------------------------------------- |
| `ENTITY_CLIENT_RESTLI_GET_BATCH_CONCURRENCY` | `2`     | integer   | [`GMS`]    | Number of concurrent rest.li calls when the number of urns in a getBatchV2 call exceeds the batch size of 50. |
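
To illustrate what this setting controls, the sketch below splits a list of URNs into batches of 50 and fetches the batches with a bounded number of concurrent calls. The names `get_batch` and `fetch_batch` are hypothetical stand-ins, and the code is not DataHub's entity client.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

BATCH_SIZE = 50  # batch size named in the table above


def get_batch(
    urns: list[str],
    fetch_batch: Callable[[list[str]], dict[str, str]],
    concurrency: int = 2,
) -> dict[str, str]:
    """Fetch URNs in batches, running at most `concurrency` calls at once."""
    batches = [urns[i : i + BATCH_SIZE] for i in range(0, len(urns), BATCH_SIZE)]
    merged: dict[str, str] = {}
    if len(batches) <= 1:
        # At or below the batch size: a single call, no concurrency needed.
        return fetch_batch(urns) if urns else merged
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for partial in pool.map(fetch_batch, batches):
            merged.update(partial)
    return merged


if __name__ == "__main__":
    fake_fetch = lambda batch: {urn: "ok" for urn in batch}
    urns = [f"urn:li:corpuser:user{i}" for i in range(120)]
    print(len(get_batch(urns, fake_fetch)))
```

With 120 URNs, the list is split into three batches, of which at most two are in flight at any time under the default concurrency of 2.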

## Frontend

| Variable                           | Default  | Unit/Type | Components   | Description                                                                                                                                                                                                                                        |
| ---------------------------------- | -------- | --------- | ------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `AUTH_VERBOSE_LOGGING`             | `false`  | boolean   | [`Frontend`] | Enable verbose authentication logging. Enabling this will leak sensitive information in the logs. Disable when finished debugging.                                                                                                                 |
| `AUTH_OIDC_GROUPS_CLAIM`           | `groups` | string    | [`Frontend`] | Claim to use as the user's group.                                                                                                                                                                                                                  |
| `AUTH_OIDC_EXTRACT_GROUPS_ENABLED` | `false`  | boolean   | [`Frontend`] | Auto-provision the group from the user's group claim.                                                                                                                                                                                              |
| `AUTH_SESSION_TTL_HOURS`           | `24`     | string    | [`Frontend`] | The number of hours a user session is valid. After this many hours the actor cookie will be expired by the browser and the user will be prompted to log in again.                                                                                  |
| `MAX_SESSION_TOKEN_AGE`            | `24h`    | string    | [`Frontend`] | The maximum age of the session token. [User session tokens are stateless and will become invalid after this time](https://www.playframework.com/documentation/2.8.x/SettingsSession#Session-Timeout-/-Expiration) requiring a user to log in again. |