---
title: "Deployment Environment Variables"
---

# Environment Variables

This page summarizes a few important environment variables that control how DataHub works.

---

# DataHub Java Components

This includes GMS, System Update, and the MAE/MCE Consumers.

## Authentication & Authorization

Reference Links:

- **Authentication Overview**: [Authentication Overview](../authentication/README.md)
- **Authentication Concepts**: [Authentication Concepts](../authentication/concepts.md)
- **Metadata Service Authentication**: [Introducing Metadata Service Authentication](../authentication/introducing-metadata-service-authentication.md)
- **OIDC Configuration**: [Configure OIDC Authentication](../authentication/guides/sso/configure-oidc-react.md)
- **Adding Users**: [Adding Users Guide](../authentication/guides/add-users.md)
- **Plugin Configuration**: [Plugin Documentation](../plugins.md)

### Authentication Configuration

| Environment Variable | Default | Description | Components |
| --------------------------------------------------- | ---------- | --------------------------------------------------------------------------- | --------------------------------------------------------------- |
| `METADATA_SERVICE_AUTH_ENABLED` | `true` | Enable if you want all requests to the Metadata Service to be authenticated | GMS, MAE Consumer, MCE Consumer, PE Consumer, Frontend |
| `DATAHUB_SYSTEM_CLIENT_SECRET` | | System client secret used by AuthServiceController | GMS, MAE Consumer, MCE Consumer, PE Consumer, Actions, Frontend |
| `METADATA_SERVICE_AUTHENTICATOR_EXCEPTIONS_ENABLED` | `false` | Normally failures are only warnings; enable this to throw them | GMS |
| `DATAHUB_TOKEN_SERVICE_SIGNING_KEY` | | Key used to validate incoming tokens and sign new tokens | GMS |
| `DATAHUB_TOKEN_SERVICE_SALT` | | Salt used for token validation and signing | GMS |
| `DATAHUB_TOKEN_SERVICE_SIGNING_ALGORITHM` | `HS256` | Signing algorithm for DataHub tokens | GMS |
| `SESSION_TOKEN_DURATION_MS` | `86400000` | The max duration of a UI session in milliseconds (defaults to 1 day) | GMS |
| `GUEST_AUTHENTICATION_USER` | `guest` | Guest user for unauthenticated access | GMS |
| `GUEST_AUTHENTICATION_ENABLED` | `false` | Enable guest authentication | GMS |

### Authorization Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------------- | ------- | ---------------------------------------------------------------- | ---------- |
| `AUTH_POLICIES_ENABLED` | `true` | Enable the default DataHub policies-based authorizer | GMS |
| `POLICY_CACHE_REFRESH_INTERVAL_SECONDS` | `120` | Cache refresh interval for policies in seconds | GMS |
| `POLICY_CACHE_FETCH_SIZE` | `1000` | Cache policy fetch size | GMS |
| `REST_API_AUTHORIZATION_ENABLED` | `true` | Enable authorization of reads, writes, and deletes on REST APIs | GMS |
| `VIEW_AUTHORIZATION_ENABLED` | `false` | Controls whether entity pages can limit access based on policies | GMS |
| `VIEW_AUTHORIZATION_RECOMMENDATIONS_PEER_GROUP_ENABLED` | `true` | Enable peer group recommendations for view authorization | GMS |

## Ingestion Configuration

Reference Links:

- **CLI Configuration**: [CLI Documentation](../cli.md)
- **DataHub Actions**: [Actions Documentation](../actions/README.md)

| Environment Variable | Default | Description | Components |
| ------------------------------------------- | ------- | ------------------------------------------------------------------------------------------------ | ----------------- |
| `UI_INGESTION_ENABLED` | `true` | Enable UI-based ingestion | GMS, MAE Consumer |
| `INGESTION_BATCH_REFRESH_COUNT` | `100` | Number of entities to refresh in a single batch when refreshing entities after ingestion | GMS |
| `INGESTION_SOURCE_REFRESH_INTERVAL_SECONDS` | `43200` | Interval at which the ingestion source scheduler will check for new or updated ingestion sources | GMS |

## Telemetry & Analytics
| Environment Variable | Default | Description | Components |
| ----------------------------- | ------- | ------------------------------------ | ---------- |
| `INGESTION_REPORTING_ENABLED` | `false` | Enable ingestion reporting | GMS |
| `ENABLE_THIRD_PARTY_LOGGING` | `false` | Whether Mixpanel tracking is enabled | GMS |

## DataHub Core Configuration

| Environment Variable | Default | Description | Components |
| -------------------------------------- | ----------- | ----------------------------------------------------------------- | ---------- |
| `DATAHUB_SERVER_TYPE` | `prod` | DataHub server type | GMS |
| `DATAHUB_GMS_ASYNC_REQUEST_TIMEOUT_MS` | `55000` | Async request timeout for GMS | GMS |
| `DATAHUB_GMS_HOST` | `localhost` | GMS host | Frontend |
| `DATAHUB_GMS_PORT` | `8080` | GMS port | Frontend |
| `DATAHUB_GMS_USE_SSL` | `false` | Use SSL for GMS connections | Frontend |
| `DATAHUB_GMS_URI` | `null` | URI instead of separate host/port/ssl parameters (takes priority) | Frontend |
| `DATAHUB_GMS_SSL_PROTOCOL` | `null` | SSL protocol for GMS | Frontend |

### Plugin Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------- | -------------------------------- | -------------------------------------------- | ---------- |
| `PLUGIN_SECURITY_MODE` | `RESTRICTED` | Plugin security mode (RESTRICTED or LENIENT) | GMS |
| `ENTITY_REGISTRY_PLUGIN_PATH` | `/etc/datahub/plugins/models` | Path for entity registry plugins | GMS |
| `ENTITY_REGISTRY_PLUGIN_LOAD_DELAY_SECONDS` | `60` | Rate at which plugin runnable executes | GMS |
| `RETENTION_PLUGIN_PATH` | `/etc/datahub/plugins/retention` | Path for retention plugins | GMS |
| `AUTH_PLUGIN_PATH` | `/etc/datahub/plugins/auth` | Path for auth plugins | GMS |

### Metrics Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------------- | --------------------------------- | ---------------------------------------------- | ----------------- |
| `DATAHUB_METRICS_HOOK_LATENCY_PERCENTILES` | `0.5,0.95,0.99,0.999` | Hook latency percentiles | GMS, MAE Consumer |
| `DATAHUB_METRICS_HOOK_LATENCY_SERVICE_LEVEL_OBJECTIVES` | `300,1800,3000,10800,21600,43200` | Hook latency SLOs in seconds | GMS, MAE Consumer |
| `DATAHUB_METRICS_HOOK_LATENCY_MAX_EXPECTED_VALUE` | `86000` | Maximum expected hook latency value in seconds | GMS, MAE Consumer |

## Entity Service Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------ | ------- | ----------------------------- | ----------------- |
| `ENTITY_SERVICE_IMPL` | `ebean` | Entity service implementation | GMS, MCE Consumer |
| `ENTITY_SERVICE_ENABLE_RETENTION` | `true` | Enable entity retention | GMS, MCE Consumer |
| `ENTITY_SERVICE_APPLY_RETENTION_BOOTSTRAP` | `false` | Apply retention on bootstrap | GMS, MCE Consumer |

## Graph Service Configuration

| Environment Variable | Default | Description | Components |
| ----------------------------------------- | --------------- | --------------------------------------------------------------------------- | ----------------- |
| `GRAPH_SERVICE_IMPL` | `elasticsearch` | Graph service implementation | GMS, MAE Consumer |
| `GRAPH_SERVICE_LIMIT_RESULTS_MAX` | `10000` | Maximum allowed result count for queries | GMS |
| `GRAPH_SERVICE_LIMIT_RESULTS_API_DEFAULT` | `5000` | Default API result limit | GMS |
| `GRAPH_SERVICE_LIMIT_RESULTS_STRICT` | `false` | Throw exception if strict is true, otherwise override with default and warn | GMS |

## Search Service Configuration

| Environment Variable | Default | Description | Components |
| ----------------------------------------------------- | ------------------- | --------------------------------------------------------------------------- | ---------- |
| `SEARCH_SERVICE_BATCH_SIZE` | `100` | Search service batch size | GMS |
| `SEARCH_SERVICE_ENABLE_CACHE` | `false` | Enable search service cache | GMS |
| `SEARCH_SERVICE_ENABLE_CACHE_EVICTION` | `false` | Enable search service cache eviction | GMS |
| `SEARCH_SERVICE_CACHE_IMPLEMENTATION` | `caffeine` | Search service cache implementation | GMS |
| `SEARCH_SERVICE_HAZELCAST_SERVICE_NAME` | `hazelcast-service` | Hazelcast service name for search cache | GMS |
| `SEARCH_SERVICE_FILTER_CONTAINER_EXPANSION_ENABLED` | `true` | Enable container expansion in search filters | GMS |
| `SEARCH_SERVICE_FILTER_CONTAINER_EXPANSION_PAGE_SIZE` | `100` | Page size for container expansion | GMS |
| `SEARCH_SERVICE_FILTER_CONTAINER_EXPANSION_LIMIT` | `100` | Limit for container expansion | GMS |
| `SEARCH_SERVICE_FILTER_DOMAIN_EXPANSION_ENABLED` | `true` | Enable domain expansion in search filters | GMS |
| `SEARCH_SERVICE_FILTER_DOMAIN_EXPANSION_PAGE_SIZE` | `100` | Page size for domain expansion | GMS |
| `SEARCH_SERVICE_FILTER_DOMAIN_EXPANSION_LIMIT` | `100` | Limit for domain expansion | GMS |
| `SEARCH_SERVICE_LIMIT_RESULTS_MAX` | `10000` | Maximum allowed result count for queries | GMS |
| `SEARCH_SERVICE_LIMIT_RESULTS_API_DEFAULT` | `5000` | Default API result limit | GMS |
| `SEARCH_SERVICE_LIMIT_RESULTS_STRICT` | `false` | Throw exception if strict is true, otherwise override with default and warn | GMS |

## Timeseries Aspect Service

| Environment Variable | Default | Description | Components |
| ----------------------------------------------------- | ------- | --------------------------------------------------------------------------- | ---------- |
| `TIMESERIES_ASPECT_SERVICE_QUERY_CONCURRENCY` | `10` | Parallel threads for timeseries queries | GMS |
| `TIMESERIES_ASPECT_SERVICE_QUERY_QUEUE_SIZE` | `500` | Queue size for timeseries queries | GMS |
| `TIMESERIES_ASPECT_SERVICE_QUERY_THREAD_KEEP_ALIVE` | `60` | Thread keep alive time for timeseries queries | GMS |
| `TIMESERIES_ASPECT_SERVICE_LIMIT_RESULTS_MAX` | `10000` | Maximum allowed result count for queries | GMS |
| `TIMESERIES_ASPECT_SERVICE_LIMIT_RESULTS_API_DEFAULT` | `5000` | Default API result limit | GMS |
| `TIMESERIES_ASPECT_SERVICE_LIMIT_RESULTS_STRICT` | `false` | Throw exception if strict is true, otherwise override with default and warn | GMS |

## System Metadata Service

| Environment Variable | Default | Description | Components |
| --------------------------------------------------- | ------- | --------------------------------------------------------------------------- | ---------- |
| `SYSTEM_METADATA_SERVICE_LIMIT_RESULTS_MAX` | `10000` | Maximum allowed result count for queries | GMS |
| `SYSTEM_METADATA_SERVICE_LIMIT_RESULTS_API_DEFAULT` | `5000` | Default API result limit | GMS |
| `SYSTEM_METADATA_SERVICE_LIMIT_RESULTS_STRICT` | `false` | Throw exception if strict is true, otherwise override with default and warn | GMS |

## Platform Analytics

| Environment Variable | Default | Description | Components |
| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ | --------------------------- |
| `DATAHUB_ANALYTICS_ENABLED` | `true` | Enable platform analytics | GMS, MAE Consumer, Frontend |
| `DATAHUB_ANALYTICS_TRACING_ENABLED` | `true` | Enable backend usage tracing | GMS |
| `ANALYTICS_DATAHUB_USAGE_EVENT_TYPES` | `CreateAccessTokenEvent,CreatePolicyEvent,UpdatePolicyEvent,CreateIngestionSourceEvent,UpdateIngestionSourceEvent,RevokeAccessTokenEvent,CreateUserEvent,UpdateUserEvent,DeletePolicyEvent` | Comma separated list of usage event types to listen to | GMS |
| `ANALYTICS_GENERIC_ASPECT_TYPES` | `` | Filter list for generic aspect events | GMS |
| `ANALYTICS_USER_FILTERS` | `` | Filter out specific users' events from being published | GMS |

## Visual Configuration

### Queries Tab

| Environment Variable | Default | Description | Components |
| ----------------------------------- | ------- | -------------------------------------- | ---------- |
| `REACT_APP_QUERIES_TAB_RESULT_SIZE` | `5` | Queries tab result size (experimental) | Frontend |

### Theme Configuration

| Environment Variable | Default | Description | Components |
| --------------------------- | ------- | ------------------------------------------------- | ---------- |
| `REACT_APP_CUSTOM_THEME_ID` | `` | Custom theme ID for rendering specific theme file | Frontend |

### Assets Configuration

| Environment Variable | Default | Description | Components |
| ----------------------- | ----------------------------------- | ------------------------------- | ---------- |
| `REACT_APP_LOGO_URL` | `/assets/platforms/datahublogo.png` | Logo URL for the application | Frontend |
| `REACT_APP_FAVICON_URL` | `/assets/icons/favicon.ico` | Favicon URL for the application | Frontend |
| `REACT_APP_TITLE` | `` | Application title | Frontend |

### UI Configuration

| Environment Variable | Default | Description | Components |
| --------------------------------------------- | ------- | ---------------------------------------------------------------------------------- | ---------- |
| `REACT_APP_HIDE_GLOSSARY` | `false` | Hide glossary in the UI | Frontend |
| `REACT_APP_SHOW_FULL_TITLE_IN_LINEAGE` | `false` | Show full title in lineage | Frontend |
| `DOMAIN_DEFAULT_TAB` | `` | Default tab for domains (set to DOCUMENTATION_TAB to show documentation tab first) | Frontend |
| `APPLICATION_SHOW_SIDEBAR_SECTION_WHEN_EMPTY` | `false` | Show sidebar section when empty (deprecated) | Frontend |
| `SEARCH_RESULT_NAME_HIGHLIGHT_ENABLED` | `true` | Enable visual highlighting on search result names/descriptions | Frontend |

## Storage Layer Configuration

### EBean Configuration (MySQL/PostgreSQL)

| Environment Variable | Default | Description | Components |
| --------------------------------- | ------------------------------------- | ----------------------------------------- | -------------------------------- |
| `EBEAN_DATASOURCE_USERNAME` | `datahub` | Database username | GMS, MCE Consumer, System Update |
| `EBEAN_DATASOURCE_PASSWORD` | `datahub` | Database password | GMS, MCE Consumer, System Update |
| `EBEAN_DATASOURCE_URL` | `jdbc:mysql://localhost:3306/datahub` | JDBC URL | GMS, MCE Consumer, System Update |
| `EBEAN_DATASOURCE_DRIVER` | `com.mysql.jdbc.Driver` | JDBC Driver | GMS, MCE Consumer, System Update |
| `EBEAN_MIN_CONNECTIONS` | `2` | Minimum database connections | GMS, MCE Consumer, System Update |
| `EBEAN_MAX_CONNECTIONS` | `50` | Maximum database connections | GMS, MCE Consumer, System Update |
| `EBEAN_MAX_INACTIVE_TIME_IN_SECS` | `120` | Maximum inactive time in seconds | GMS, MCE Consumer, System Update |
| `EBEAN_MAX_AGE_MINUTES` | `120` | Maximum age in minutes | GMS, MCE Consumer, System Update |
| `EBEAN_LEAK_TIME_MINUTES` | `15` | Leak time in minutes | GMS, MCE Consumer, System Update |
| `EBEAN_WAIT_TIMEOUT_MILLIS` | `1000` | Wait timeout in milliseconds | GMS, MCE Consumer, System Update |
| `EBEAN_AUTOCREATE` | `false` | Auto-create DDL | GMS, MCE Consumer, System Update |
| `EBEAN_POSTGRES_USE_AWS_IAM_AUTH` | `false` | Use AWS IAM authentication for PostgreSQL | GMS, MCE Consumer, System Update |
| `EBEAN_BATCH_GET_METHOD` | `IN` | Batch get method (IN or UNION) | GMS, MCE Consumer, System Update |

### Cassandra Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------- | ------------- | --------------------- | -------------------------------- |
| `CASSANDRA_DATASOURCE_USERNAME` | `cassandra` | Cassandra username | GMS, MCE Consumer, System Update |
| `CASSANDRA_DATASOURCE_PASSWORD` | `cassandra` | Cassandra password | GMS, MCE Consumer, System Update |
| `CASSANDRA_HOSTS` | `cassandra` | Cassandra hosts | GMS, MCE Consumer, System Update |
| `CASSANDRA_PORT` | `9042` | Cassandra port | GMS, MCE Consumer, System Update |
| `CASSANDRA_DATACENTER` | `datacenter1` | Cassandra datacenter | GMS, MCE Consumer, System Update |
| `CASSANDRA_KEYSPACE` | `datahub` | Cassandra keyspace | GMS, MCE Consumer, System Update |
| `CASSANDRA_USE_SSL` | `false` | Use SSL for Cassandra | GMS, MCE Consumer, System Update |

### Elasticsearch Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------ | --------------- | -------------------------------------------- | ---------------------------------------------- |
| `ELASTICSEARCH_HOST` | `localhost` | Elasticsearch host | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_PORT` | `9200` | Elasticsearch port | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_THREAD_COUNT` | `2` | Elasticsearch thread count | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_CONNECTION_REQUEST_TIMEOUT` | `5000` | Connection request timeout | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_USERNAME` | `null` | Elasticsearch username | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_PASSWORD` | `null` | Elasticsearch password | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_PATH_PREFIX` | `null` | Elasticsearch path prefix | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_USE_SSL` | `false` | Use SSL for Elasticsearch | GMS, MAE Consumer, MCE Consumer, System Update |
| `OPENSEARCH_USE_AWS_IAM_AUTH` | `false` | Use AWS IAM authentication for OpenSearch | GMS, MAE Consumer, MCE Consumer, System Update |
| `AWS_REGION` | `null` | AWS region | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_IMPLEMENTATION` | `elasticsearch` | Implementation (elasticsearch or opensearch) | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTIC_ID_HASH_ALGO` | `MD5` | ID hash algorithm | GMS, MAE Consumer, MCE Consumer, System Update |

#### SSL Context Configuration
| Environment Variable | Default | Description | Components |
| --------------------------------------- | ------- | -------------------------------- | ---------------------------------------------- |
| `ELASTICSEARCH_SSL_PROTOCOL` | `null` | SSL protocol | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_SSL_SECURE_RANDOM_IMPL` | `null` | SSL secure random implementation | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_SSL_TRUSTSTORE_FILE` | `null` | SSL truststore file | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_SSL_TRUSTSTORE_TYPE` | `null` | SSL truststore type | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_SSL_TRUSTSTORE_PASSWORD` | `null` | SSL truststore password | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_SSL_KEYSTORE_FILE` | `null` | SSL keystore file | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_SSL_KEYSTORE_TYPE` | `null` | SSL keystore type | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_SSL_KEYSTORE_PASSWORD` | `null` | SSL keystore password | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_SSL_KEY_PASSWORD` | `null` | SSL key password | GMS, MAE Consumer, MCE Consumer, System Update |

#### Bulk Operations Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------ | --------- | ----------------------------- | ----------------- |
| `ES_BULK_DELETE_BATCH_SIZE` | `5000` | Bulk delete batch size | GMS, MAE Consumer |
| `ES_BULK_DELETE_SLICES` | `auto` | Bulk delete slices | GMS, MAE Consumer |
| `ES_BULK_DELETE_POLL_INTERVAL` | `30` | Bulk delete poll interval | GMS, MAE Consumer |
| `ES_BULK_DELETE_POLL_UNIT` | `SECONDS` | Bulk delete poll unit | GMS, MAE Consumer |
| `ES_BULK_DELETE_TIMEOUT` | `30` | Bulk delete timeout | GMS, MAE Consumer |
| `ES_BULK_DELETE_TIMEOUT_UNIT` | `MINUTES` | Bulk delete timeout unit | GMS, MAE Consumer |
| `ES_BULK_DELETE_NUM_RETRIES` | `3` | Bulk delete number of retries | GMS, MAE Consumer |
| `ES_BULK_ASYNC` | `true` | Enable async bulk operations | GMS, MAE Consumer |
| `ES_BULK_REQUESTS_LIMIT` | `1000` | Bulk requests limit | GMS, MAE Consumer |
| `ES_BULK_FLUSH_PERIOD` | `1` | Bulk flush period | GMS, MAE Consumer |
| `ES_BULK_NUM_RETRIES` | `3` | Bulk number of retries | GMS, MAE Consumer |
| `ES_BULK_RETRY_INTERVAL` | `1` | Bulk retry interval | GMS, MAE Consumer |
| `ES_BULK_REFRESH_POLICY` | `NONE` | Bulk refresh policy | GMS, MAE Consumer |
| `ES_BULK_ENABLE_BATCH_DELETE` | `false` | Enable batch delete | GMS, MAE Consumer |

#### Index Configuration

| Environment Variable | Default | Description | Components |
| ---------------------------------------------------------- | ------- | --------------------------------------- | ---------------------------------------------- |
| `INDEX_PREFIX` | `` | Index prefix | GMS, MAE Consumer, MCE Consumer, System Update |
| `ELASTICSEARCH_INDEX_DOC_IDS_SCHEMA_FIELD_HASH_ID_ENABLED` | `false` | Enable hash ID for schema field doc IDs | GMS, MAE Consumer, MCE Consumer, System Update |

#### Build Indices Configuration

| Environment Variable | Default | Description | Components |
| ---------------------------------------------------------- | ------- | ----------------------------------------------------------- | ------------- |
| `ELASTICSEARCH_BUILD_INDICES_ALLOW_DOC_COUNT_MISMATCH` | `false` | Allow document count mismatch when clone indices is enabled | System Update |
| `ELASTICSEARCH_BUILD_INDICES_CLONE_INDICES` | `true` | Clone indices | System Update |
| `ELASTICSEARCH_BUILD_INDICES_RETENTION_UNIT` | `DAYS` | Retention unit for indices | System Update |
| `ELASTICSEARCH_BUILD_INDICES_RETENTION_VALUE` | `60` | Retention value for indices | System Update |
| `ELASTICSEARCH_BUILD_INDICES_REINDEX_OPTIMIZATION_ENABLED` | `true` | Enable reindex optimization | System Update |
| `ELASTICSEARCH_NUM_SHARDS_PER_INDEX` | `1` | Number of shards per index | System Update |
| `ELASTICSEARCH_NUM_REPLICAS_PER_INDEX` | `1` | Number of replicas per index | System Update |
| `ELASTICSEARCH_INDEX_BUILDER_NUM_RETRIES` | `3` | Index builder number of retries | System Update |
| `ELASTICSEARCH_INDEX_BUILDER_REFRESH_INTERVAL_SECONDS` | `3` | Index builder refresh interval | System Update |
| `SEARCH_DOCUMENT_MAX_ARRAY_LENGTH` | `1000` | Maximum array length in search documents | System Update |
| `SEARCH_DOCUMENT_MAX_OBJECT_KEYS` | `1000` | Maximum object keys in search documents | System Update |
| `SEARCH_DOCUMENT_MAX_VALUE_LENGTH` | `4096` | Maximum value length in search documents | System Update |
| `ELASTICSEARCH_MAIN_TOKENIZER` | `null` | Main tokenizer | System Update |
| `ELASTICSEARCH_INDEX_BUILDER_MAPPINGS_REINDEX` | `false` | Enable mappings reindex | System Update |
| `ELASTICSEARCH_INDEX_BUILDER_SETTINGS_REINDEX` | `false` | Enable settings reindex | System Update |
| `ELASTICSEARCH_INDEX_BUILDER_MAX_REINDEX_HOURS` | `0` | Maximum reindex hours (0 = no timeout) | System Update |
| `ELASTICSEARCH_INDEX_BUILDER_SETTINGS_OVERRIDES` | `null` | Index builder settings overrides | System Update |
| `ELASTICSEARCH_MIN_SEARCH_FILTER_LENGTH` | `3` | Minimum search filter length | System Update |
| `ELASTICSEARCH_INDEX_BUILDER_ENTITY_SETTINGS_OVERRIDES` | `null` | Entity settings overrides | System Update |

#### Search Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------------- | -------------------- | ---------------------------------------------- | ---------- |
| `ELASTICSEARCH_QUERY_MAX_TERM_BUCKET_SIZE` | `60` | Maximum term bucket size | GMS |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_EXCLUSIVE` | `false` | Only return exact matches when using quotes | GMS |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_WITH_PREFIX` | `true` | Include prefix match in exact match results | GMS |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_FACTOR` | `16.0` | Multiply by this number on true exact match | GMS |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_PREFIX_FACTOR` | `1.1` | Multiply by this number when prefix match | GMS |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_CASE_FACTOR` | `0.0` | Stacked boost multiplier when case mismatch | GMS |
| `ELASTICSEARCH_QUERY_EXACT_MATCH_ENABLE_STRUCTURED` | `true` | Enable exact match on structured search | GMS |
| `ELASTICSEARCH_QUERY_TWO_GRAM_FACTOR` | `1.2` | Boost multiplier when match on 2-gram tokens | GMS |
| `ELASTICSEARCH_QUERY_THREE_GRAM_FACTOR` | `1.5` | Boost multiplier when match on 3-gram tokens | GMS |
| `ELASTICSEARCH_QUERY_FOUR_GRAM_FACTOR` | `1.8` | Boost multiplier when match on 4-gram tokens | GMS |
| `ELASTICSEARCH_QUERY_PARTIAL_URN_FACTOR` | `0.5` | Multiplier on Urn token match | GMS |
| `ELASTICSEARCH_QUERY_PARTIAL_FACTOR` | `0.4` | Multiplier on possible non-Urn token match | GMS |
| `ELASTICSEARCH_QUERY_CUSTOM_CONFIG_ENABLED` | `true` | Enable search query and ranking customization | GMS |
| `ELASTICSEARCH_QUERY_CUSTOM_CONFIG_FILE` | `search_config.yaml` | Location of search customization configuration | GMS |
| `ELASTICSEARCH_QUERY_SEARCH_FIELD_CONFIG_DEFAULT` | `legacy` | Default field configuration for search | GMS |
| `ELASTICSEARCH_QUERY_AUTOCOMPLETE_FIELD_CONFIG_DEFAULT` | `legacy` | Default field configuration for autocomplete | GMS |

#### Graph Search Configuration

| Environment Variable | Default | Description | Components |
| ----------------------------------------------- | ------- | -------------------------------------------- | ---------- |
| `ELASTICSEARCH_SEARCH_GRAPH_TIMEOUT_SECONDS` | `50` | Graph DAO timeout seconds | GMS |
| `ELASTICSEARCH_SEARCH_GRAPH_BATCH_SIZE` | `1000` | Graph DAO batch size | GMS |
| `ELASTICSEARCH_SEARCH_GRAPH_MULTI_PATH_SEARCH` | `false` | Allow path retraversal for all paths | GMS |
| `ELASTICSEARCH_SEARCH_GRAPH_BOOST_VIA_NODES` | `true` | Boost graph edges with via nodes | GMS |
| `ELASTICSEARCH_SEARCH_GRAPH_STATUS_ENABLED` | `false` | Enable soft delete tracking of URNs on edges | GMS |
| `ELASTICSEARCH_SEARCH_GRAPH_LINEAGE_MAX_HOPS` | `20` | Maximum hops to traverse lineage graph | GMS |
| `ELASTICSEARCH_SEARCH_GRAPH_IMPACT_MAX_HOPS` | `1000` | Maximum hops to traverse for impact analysis | GMS |
| `ELASTICSEARCH_SEARCH_GRAPH_IMPACT_MAX_THREADS` | `32` | Maximum parallel lineage graph queries | GMS |
| `ELASTICSEARCH_SEARCH_GRAPH_QUERY_OPTIMIZATION` | `true` | Reduce query nesting if possible | GMS |

### Neo4j Configuration

| Environment Variable | Default | Description | Components |
| ----------------------------------------------------- | ------------------ | -------------------------------------- | -------------------------------- |
| `NEO4J_USERNAME` | `neo4j` | Neo4j username | GMS, MAE Consumer, System Update |
| `NEO4J_PASSWORD` | `datahub` | Neo4j password | GMS, MAE Consumer, System Update |
| `NEO4J_URI` | `bolt://localhost` | Neo4j URI | GMS, MAE Consumer, System Update |
| `NEO4J_DATABASE` | `graph.db` | Neo4j database | GMS, MAE Consumer, System Update |
| `NEO4J_MAX_CONNECTION_POOL_SIZE` | `100` | Maximum connection pool size | GMS, MAE Consumer, System Update |
| `NEO4J_MAX_CONNECTION_ACQUISITION_TIMEOUT_IN_SECONDS` | `60` | Maximum connection acquisition timeout | GMS, MAE Consumer, System Update |
| `NEO4j_MAX_CONNECTION_LIFETIME_IN_SECONDS` | `3600` | Maximum connection lifetime | GMS, MAE Consumer, System Update |
| `NEO4J_MAX_TRANSACTION_RETRY_TIME_IN_SECONDS` | `30` | Maximum transaction retry time | GMS, MAE Consumer, System Update |
| `NEO4J_CONNECTION_LIVENESS_CHECK_TIMEOUT_IN_SECONDS` | `-1` | Connection liveness check timeout | GMS, MAE Consumer, System Update |

## Kafka Configuration

Reference Links:

- **Kafka Configuration**: [Kafka Configuration Guide](../how/kafka-config.md)
- **Confluent Cloud**: [Confluent Cloud Integration](confluent-cloud.md)
- **DataHub Actions**: [Actions Documentation](../actions/README.md)

### Topic Configuration

| Environment Variable | Default | Description | Components |
| -------------------------- | ---------------------- | ------------------------------ | -------------------------------------------------- |
| `DATAHUB_USAGE_EVENT_NAME` | `DataHubUsageEvent_v1` | DataHub usage event topic name | GMS, MAE Consumer, MCE Consumer, Actions, Frontend |

### Bootstrap Servers

| Environment Variable | Default | Description | Components |
| ------------------------ | ----------------------- | ----------------------- | --------------------------------------------------------------- |
| `KAFKA_BOOTSTRAP_SERVER` | `http://localhost:9092` | Kafka bootstrap servers | GMS, MAE Consumer, MCE Consumer, PE Consumer, Actions, Frontend |

### Producer Configuration

| Environment Variable | Default | Description | Components |
| --------------------------------- | --------- | ------------------------------ | -------------------------------- |
| `KAFKA_PRODUCER_RETRY_COUNT` | `3` | Producer retry count | GMS, MCE Consumer, System Update |
| `KAFKA_PRODUCER_DELIVERY_TIMEOUT` | `30000` | Producer delivery timeout | GMS, MCE Consumer, System Update |
| `KAFKA_PRODUCER_REQUEST_TIMEOUT` | `3000` | Producer request timeout | GMS, MCE Consumer, System Update |
| `KAFKA_PRODUCER_BACKOFF_TIMEOUT` | `500` | Producer backoff timeout | GMS, MCE Consumer, System Update |
| `KAFKA_PRODUCER_COMPRESSION_TYPE` | `snappy` | Producer compression algorithm | GMS, MCE Consumer, System Update |
| `KAFKA_PRODUCER_MAX_REQUEST_SIZE` | `5242880` | Maximum bytes sent by producer | GMS, MCE Consumer, System Update |

### Consumer Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------- | --------------------------------- | ------------------------------------------ | --------------------------------------------------------- |
| `KAFKA_LISTENER_CONCURRENCY` | `1` | Number of Kafka consumer threads | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `KAFKA_CONSUMER_MAX_PARTITION_FETCH_BYTES` | `5242880` | Maximum data per partition | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `KAFKA_CONSUMER_STOP_ON_DESERIALIZATION_ERROR` | `true` | Stop on deserialization error | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `KAFKA_CONSUMER_HEALTH_CHECK_ENABLED` | `true` | Enable health check for consumers | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `KAFKA_CONSUMER_MCP_AUTO_OFFSET_RESET` | `earliest` | MCP consumer auto offset reset | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `KAFKA_CONSUMER_MCL_AUTO_OFFSET_RESET` | `earliest` | MCL consumer auto offset reset | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `KAFKA_CONSUMER_MCL_FINE_GRAINED_LOGGING_ENABLED` | `false` | Enable fine-grained logging for MCL | GMS, MAE Consumer |
| `KAFKA_CONSUMER_MCL_ASPECTS_TO_DROP` | `` | Aspects to drop for MCL | GMS, MAE Consumer |
| `KAFKA_CONSUMER_PE_AUTO_OFFSET_RESET` | `latest` | PE consumer auto offset reset | GMS, PE Consumer |
| `KAFKA_CONSUMER_PERCENTILES` | `0.5,0.95,0.99,0.999` | Consumer percentiles | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `KAFKA_CONSUMER_SERVICE_LEVEL_OBJECTIVES` | `300,1800,3000,10800,21600,43200` | Consumer SLOs in seconds | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `KAFKA_CONSUMER_MAX_EXPECTED_VALUE` | `86000` | Maximum expected consumer value in seconds | GMS, MAE Consumer, MCE Consumer, PE Consumer |

### Consumer Pool Configuration

| Environment Variable | Default | Description | Components |
| ---------------------------------- | ------- | -------------------------- | ---------- |
| `KAFKA_CONSUMER_POOL_INITIAL_SIZE` | `1` | Consumer pool initial size | GMS |
| `KAFKA_CONSUMER_POOL_MAX_SIZE` | `5` | Consumer pool maximum size | GMS |

### Schema Registry Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------ | ----------------------- | --------------------------------------------------- | ----------------------------------------------------- |
| `SCHEMA_REGISTRY_TYPE` | `KAFKA` | Schema registry type (INTERNAL, KAFKA, or AWS_GLUE) | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `KAFKA_SCHEMAREGISTRY_URL` | `http://localhost:8081` | Schema registry URL | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `SCHEMA_REGISTRY_URL` | `http://localhost:8081` | Schema registry URL (Actions) | Actions |
| `AWS_GLUE_SCHEMA_REGISTRY_REGION` | `us-east-1` | AWS Glue schema registry region | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `AWS_GLUE_SCHEMA_REGISTRY_NAME` | `null` | AWS Glue schema registry name | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `KAFKA_PROPERTIES_SECURITY_PROTOCOL` | `PLAINTEXT` | Kafka security protocol | GMS, MAE Consumer, MCE Consumer, PE Consumer, Actions |

## Spring Configuration

### Kafka Security

| Environment Variable | Default | Description | Components |
| -------------------------------- | ----------- | ----------------------- | -------------------------------------------- |
| `spring.kafka.security.protocol` | `PLAINTEXT` | Kafka security protocol | GMS, MAE Consumer, MCE Consumer, PE Consumer |

## Management & Monitoring

### JMX Configuration

| Environment Variable | Default | Description | Components |
| -------------------- | ------- | ----------- | -------------------------------------------- |
| `spring.jmx.enabled` | `true` | Enable JMX | GMS, MAE Consumer, MCE Consumer, PE Consumer |

### Endpoints Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------- | ------------------------------------- | --------------------- | ---------- |
| `management.endpoints.web.exposure.include` | `prometheus,info,healthcheck,metrics` | Exposed web endpoints | GMS |
| `management.endpoints.jmx.enabled` | `true` | Enable JMX endpoints | GMS |

### Metrics Configuration

| Environment Variable | Default | Description | Components |
| ---------------------------------------------- | ------- | -------------------------------- | -------------------------------------------- |
| `management.metrics.cache.enabled` | `false` | Enable cache metrics | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `management.metrics.export.jmx.enabled` | `true` | Enable JMX metrics export | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `management.metrics.export.prometheus.enabled` | `true` | Enable Prometheus metrics export | GMS, MAE Consumer, MCE Consumer, PE Consumer |

### Server Configuration

| Environment Variable | Default | Description | Components |
| ---------------------- | ------- | ------------- | ---------- |
| `server.server-header` | `false` | Server header | GMS |

## Feature Flags

Reference Links:

- **Access Management**: [Access Management Feature](../features/feature-guides/access-roles.md)
- **Structured Properties**: [Structured Properties Overview](../features/feature-guides/properties/overview.md)
- **Lineage Features**: [Data Lineage](../features/feature-guides/lineage.md), [UI Lineage Management](../features/feature-guides/ui-lineage.md)
- **Compliance Forms**: [Compliance Forms Overview](../features/feature-guides/compliance-forms/overview.md)
- **Dataset Usage**: [Dataset Usage & Query History](../features/dataset-usage-and-query-history.md)
- **MCP Server**: [DataHub MCP Server](../features/feature-guides/mcp.md)

| Environment Variable | Default | Description | Components |
| --------------------------------------- | ------- | ------------------------------------------------------------------ | ---------- |
| `SHOW_SIMPLIFIED_HOMEPAGE_BY_DEFAULT` | `false` | Show simplified homepage with just datasets, charts and dashboards | GMS |
| `LINEAGE_SEARCH_CACHE_ENABLED` | `true` | Enable in-memory cache for searchAcrossLineage query | GMS |
| `GRAPH_SERVICE_DIFF_MODE_ENABLED` | `true` | Enable diff mode for graph writes | GMS |
| `POINT_IN_TIME_CREATION_ENABLED` | `false` | Enable creation of point in time snapshots for scroll API | GMS |
| `ALWAYS_EMIT_CHANGE_LOG` | `false` | Always emit MCL even when no changes detected | GMS |
| `SEARCH_SERVICE_DIFF_MODE_ENABLED` | `true` | Enable diff mode for search document writes | GMS |
| `READ_ONLY_MODE_ENABLED` | `false` | Enable read only mode for instance | GMS |
| `SHOW_ACCESS_MANAGEMENT` | `false` | Show AccessManagement tab in UI | GMS |
| `SHOW_SEARCH_FILTERS_V2` | `true` | Show search filters V2 experience | GMS |
| `SHOW_BROWSE_V2` | `true` | Show browse v2 sidebar experience | GMS |
| `PLATFORM_BROWSE_V2` | `true` | Enable platform browse experience | GMS |
| `LINEAGE_GRAPH_V2` | `true` | Enable new lineage visualization | GMS |
| `PRE_PROCESS_HOOKS_UI_ENABLED` | `true` | Circumvent Kafka for UI changes | GMS |
| `PRE_PROCESS_HOOKS_REPROCESS_ENABLED` | `false` | Reprocess UI sourced events asynchronously | GMS |
| `SHOW_ACRYL_INFO` | `false` | Show CTAs around moving to DataHub Cloud | GMS |
| `ER_MODEL_RELATIONSHIP_FEATURE_ENABLED` | `false` | Enable Join Tables Feature | GMS |
| `NESTED_DOMAINS_ENABLED` | `true` | Enable nested Domains feature | GMS |
| `SCHEMA_FIELD_ENTITY_FETCH_ENABLED` | `true` | Enable fetching schema field entities | GMS |
| `BUSINESS_ATTRIBUTE_ENTITY_ENABLED` | `false` | Enable business attribute entity | GMS |
| `DATA_CONTRACTS_ENABLED` | `true` | Enable Data Contracts feature | GMS |
| `ALTERNATE_MCP_VALIDATION` | `false` | Enable alternate MCP validation flow | GMS |
| `THEME_V2_ENABLED` | `true` | Allow theme v2 to be turned on | GMS |
| `THEME_V2_DEFAULT` | `true` | Set default theme for users | GMS |
| `THEME_V2_TOGGLEABLE` | `true` | Allow theme v2 to be toggled (Acryl only) | GMS |
| `SCHEMA_FIELD_CLL_ENABLED` | `false` | Enable schema field-level lineage links | GMS |
| `SCHEMA_FIELD_LINEAGE_IGNORE_STATUS` | `true` | Ignore schema field status in lineage | GMS |
| `SHOW_SEPARATE_SIBLINGS` | `false` | Separate siblings with no combined view | GMS |
| 
`EDITABLE_DATASET_NAME_ENABLED` | `false` | Enable editing dataset name in UI | GMS |
| `SHOW_MANAGE_STRUCTURED_PROPERTIES` | `true` | Show manage structured properties button | GMS |
| `HIDE_DBT_SOURCE_IN_LINEAGE` | `false` | Hide dbt sources in lineage | GMS |
| `SHOW_NAV_BAR_REDESIGN` | `true` | Show newly designed nav bar | GMS |
| `SHOW_AUTO_COMPLETE_RESULTS` | `true` | Show auto complete results in search bar | GMS |
| `ENTITY_VERSIONING_ENABLED` | `false` | Enable entity versioning APIs | GMS |
| `SHOW_HAS_SIBLINGS_FILTER` | `false` | Show "has siblings" filter in search | GMS |
| `SHOW_SEARCH_BAR_AUTOCOMPLETE_REDESIGN` | `false` | Show redesigned search bar autocomplete | GMS |
| `SHOW_MANAGE_TAGS` | `true` | Allow users to manage tags in UI | GMS |
| `SHOW_INTRODUCE_PAGE` | `true` | Show introduce page in V2 UI | GMS |
| `SHOW_INGESTION_PAGE_REDESIGN` | `false` | Show re-designed Ingestion page | GMS |
| `SHOW_LINEAGE_EXPAND_MORE` | `true` | Show expand more button in lineage graph | GMS |
| `SHOW_HOME_PAGE_REDESIGN` | `false` | Show re-designed home page | GMS |
| `LINEAGE_GRAPH_V3` | `false` | Enable redesign of lineage v2 graph | GMS |
| `SHOW_PRODUCT_UPDATES` | `true` | Show in-product update popover | GMS |
| `LOGICAL_MODELS_ENABLED` | `false` | Enable logical models feature | GMS |
| `SHOW_HOMEPAGE_USER_ROLE` | `false` | Display homepage user role underneath name | GMS |
| `VIEWS_ENABLED` | `true` | Enable views feature | GMS |

## System Updates

Reference Links:

- **Updating DataHub**: [Updating DataHub Guide](../how/updating-datahub.md)

### Bootstrap Configuration

| Environment Variable | Default | Description | Components |
| -------------------------------- | ------------------------------ | --------------------------------------------- | ---------- |
| `BOOTSTRAP_POLICIES_FILE` | `classpath:boot/policies.json` | Bootstrap policies file | GMS |
| `BOOTSTRAP_SERVLETS_WAITTIMEOUT` | `60` | Total waiting time for servlets to initialize | GMS |

### System Update Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------- | --------------------- | ------------------------------------ | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_INITIAL_BACK_OFF_MILLIS` | `5000` | Initial back off for system updates | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_MAX_BACK_OFFS` | `50` | Maximum back offs for system updates | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_BACK_OFF_FACTOR` | `2` | Multiplicative factor for back off | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_WAIT_FOR_SYSTEM_UPDATE` | `true` | Wait for system update to complete | System Update |
| `SYSTEM_UPDATE_BOOTSTRAP_MCP_CONFIG` | `bootstrap_mcps.yaml` | Bootstrap MCP configuration | System Update |

### Data Job Node CLL Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------------ | ------- | --------------------------------------- | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_ENABLED` | `false` | Enable data job node CLL | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_BATCH_SIZE` | `1000` | Data job node CLL batch size | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_DELAY_MS` | `30000` | Data job node CLL delay in milliseconds | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_LIMIT` | `0` | Data job node CLL limit | System Update |

### Domain Description Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------------- | ------- | ---------------------------------------- | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_DOMAIN_DESCRIPTION_ENABLED` | `true` | Enable domain description updates | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_DOMAIN_DESCRIPTION_BATCH_SIZE` | `1000` | Domain description batch size | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_DOMAIN_DESCRIPTION_DELAY_MS` | `30000` | Domain description delay in milliseconds | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_DOMAIN_DESCRIPTION_CLL_LIMIT` | `0` | Domain description CLL limit | System Update |

### Dashboard Info Configuration

| Environment Variable | Default | Description | Components |
| --------------------------------------------------- | ------- | ------------------------------------ | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_DASHBOARD_INFO_ENABLED` | `true` | Enable dashboard info updates | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_DASHBOARD_INFO_BATCH_SIZE` | `1000` | Dashboard info batch size | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_DASHBOARD_INFO_DELAY_MS` | `30000` | Dashboard info delay in milliseconds | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_DASHBOARD_INFO_CLL_LIMIT` | `0` | Dashboard info CLL limit | System Update |

### Browse Paths V2 Configuration

| Environment Variable | Default | Description | Components |
| ---------------------------------------------------- | ------- | --------------------------------- | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_BROWSE_PATHS_V2_ENABLED` | `true` | Enable browse paths V2 updates | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_BROWSE_PATHS_V2_BATCH_SIZE` | `5000` | Browse paths V2 batch size | System Update |
| `REPROCESS_DEFAULT_BROWSE_PATHS_V2` | `false` | Reprocess default browse paths V2 | System Update |

### Ingestion Indices Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------------ | ------- | --------------------------------------- | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_INGESTION_INDICES_ENABLED` | `true` | Enable ingestion indices updates | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_INGESTION_INDICES_BATCH_SIZE` | `5000` | Ingestion indices batch size | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_INGESTION_INDICES_DELAY_MS` | `1000` | Ingestion indices delay in milliseconds | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_INGESTION_INDICES_CLL_LIMIT` | `0` | Ingestion indices CLL limit | System Update |

### Policy Fields Configuration

| Environment Variable | Default | Description | Components |
| -------------------------------------------------- | ------- | ------------------------------- | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_POLICY_FIELDS_ENABLED` | `true` | Enable policy fields updates | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_POLICY_FIELDS_BATCH_SIZE` | `5000` | Policy fields batch size | System Update |
| `REPROCESS_DEFAULT_POLICY_FIELDS` | `false` | Reprocess default policy fields | System Update |

### Ownership Types Configuration

| Environment Variable | Default | Description | Components |
| ---------------------------------------------------- | ------- | ------------------------------ | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_OWNERSHIP_TYPES_ENABLED` | `true` | Enable ownership types updates | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_OWNERSHIP_TYPES_BATCH_SIZE` | `1000` | Ownership types batch size | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_OWNERSHIP_TYPES_REPROCESS` | `false` | Reprocess ownership types | System Update |

### Schema Fields Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------------------- | ------- | --------------------------------------------- | ------------- |
| `SYSTEM_UPDATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA_ENABLED` | `false` | Enable schema fields from schema metadata | System Update |
| `SYSTEM_UPDATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA_BATCH_SIZE` | `500` | Schema fields from schema metadata batch size | System Update |
| `SYSTEM_UPDATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA_DELAY_MS` | `1000` | Schema fields from schema metadata delay | System Update |
| `SYSTEM_UPDATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA_LIMIT` | `0` | Schema fields from schema metadata limit | System Update |
| `SYSTEM_UPDATE_SCHEMA_FIELDS_DOC_IDS_ENABLED` | `false` | Enable schema fields doc IDs | System Update | |
`SYSTEM_UPDATE_SCHEMA_FIELDS_DOC_IDS_BATCH_SIZE` | `500` | Schema fields doc IDs batch size | System Update |
| `SYSTEM_UPDATE_SCHEMA_FIELDS_DOC_IDS_DELAY_MS` | `5000` | Schema fields doc IDs delay | System Update |
| `SYSTEM_UPDATE_SCHEMA_FIELDS_DOC_IDS_LIMIT` | `0` | Schema fields doc IDs limit | System Update |

### Process Instance Configuration

| Environment Variable | Default | Description | Components |
| ----------------------------------------------------------- | ------- | ------------------------------------------- | ------------- |
| `SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_ENABLED` | `true` | Enable process instance has run events | System Update |
| `SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_BATCH_SIZE` | `100` | Process instance has run events batch size | System Update |
| `SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_DELAY_MS` | `1000` | Process instance has run events delay | System Update |
| `SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_TOTAL_DAYS` | `90` | Process instance has run events total days | System Update |
| `SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_WINDOW_DAYS` | `1` | Process instance has run events window days | System Update |
| `SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_REPROCESS` | `false` | Reprocess process instance has run events | System Update |

### Edge Status Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------ | ------- | --------------------------------- | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_EDGE_STATUS_ENABLED` | `false` | Enable edge status updates | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_EDGE_STATUS_BATCH_SIZE` | `1000` | Edge status batch size | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_EDGE_STATUS_DELAY_MS` | `5000` | Edge status delay in milliseconds | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_EDGE_STATUS_LIMIT` | `0` | Edge status limit | System Update |

### Property Definitions Configuration

| Environment Variable | Default | Description | Components |
| --------------------------------------------------------- | ------- | ------------------------------------------ | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_PROPERTY_DEFINITIONS_ENABLED` | `true` | Enable property definitions updates | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_PROPERTY_DEFINITIONS_BATCH_SIZE` | `500` | Property definitions batch size | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_PROPERTY_DEFINITIONS_DELAY_MS` | `1000` | Property definitions delay in milliseconds | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_PROPERTY_DEFINITIONS_CLL_LIMIT` | `0` | Property definitions CLL limit | System Update |

### Remove Query Edges Configuration

| Environment Variable | Default | Description | Components |
| ---------------------------------------------------- | ------- | -------------------------- | ------------- |
| `BOOTSTRAP_SYSTEM_UPDATE_REMOVE_QUERY_EDGES_ENABLED` | `true` | Enable remove query edges | System Update |
| `BOOTSTRAP_SYSTEM_UPDATE_REMOVE_QUERY_EDGES_RETRIES` | `20` | Remove query edges retries | System Update |

## Additional Environment Variables

The following environment variables are used in the codebase but may not be explicitly defined in the application.yaml file:

### Ingestion and Processing

| Environment Variable | Default | Description | Components |
| ----------------------------------- | ------- | ---------------------------------------------------------- | ---------- |
| `ASYNC_INGEST_DEFAULT` | `false` | Asynchronously process ingestProposals by writing to Kafka | GMS |
| `STRICT_URN_VALIDATION_ENABLED` | `false` | Enable stricter URN validation logic | GMS |
| `DATAHUB_DATASET_URN_TO_LOWER` | `null` | Convert dataset URN names to lowercase | GMS |
| `BUSINESS_ATTRIBUTE_ENTITY_ENABLED` | `false` | Enable business attribute entity feature | GMS |

### REST and Servlet Configuration

| Environment Variable | Default | Description | Components |
| ------------------------ | ------- | ---------------------------------- | ----------------- |
| `RESTLI_SERVLET_THREADS` | `null` | Number of threads for REST servlet | GMS, MCE Consumer |
| `RESTLI_TIMEOUT_SECONDS` | `60` | REST timeout in seconds | GMS, MCE Consumer |

### System and Version Information

| Environment Variable | Default | Description | Components |
| ---------------------- | ------- | ------------------------- | ---------- |
| `DATAHUB_GMS_PROTOCOL` | `http` | GMS protocol (http/https) | GMS |

### Upgrade and Migration

| Environment Variable | Default | Description | Components |
| -------------------------------------------------- | ------- | -------------------------------------------------- | ------------- |
| `SKIP_REINDEX_EDGE_STATUS` | `false` | Skip reindexing edge status | System Update |
| `SKIP_REINDEX_DATA_JOB_INPUT_OUTPUT` | `false` | Skip reindexing data job input/output | System Update |
| `SKIP_GENERATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA` | `false` | Skip generating schema fields from schema metadata | System Update |
| `SKIP_MIGRATE_SCHEMA_FIELDS_DOC_ID` | `false` | Skip migrating schema fields doc IDs | System Update |
| `BACKFILL_BROWSE_PATHS_V2` | `false` | Enable backfilling browse paths V2 | System Update |
| `READER_POOL_SIZE` | `null` | Reader pool size for restore operations | System Update |
| `WRITER_POOL_SIZE` | `null` | Writer pool size for restore operations | System Update |

### OpenTelemetry Configuration

| Environment Variable | Default | Description | Components |
| ----------------------- | ------- | ------------------------------ | -------------------------------------------- |
| `OTEL_METRICS_EXPORTER` | `none` | OpenTelemetry metrics exporter | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `OTEL_TRACES_EXPORTER` | `none` | OpenTelemetry traces exporter | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `OTEL_LOGS_EXPORTER` | `none` | OpenTelemetry logs exporter | GMS, MAE Consumer, MCE Consumer, PE Consumer |
| `OTEL_PROPAGATORS` | `null` | OpenTelemetry propagators | GMS, MAE Consumer, MCE Consumer, PE Consumer |

### Secret Service Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------- | ---------------- | -------------------------------------- | ---------- |
| `SECRET_SERVICE_ENCRYPTION_KEY` | `ENCRYPTION_KEY` | Secret service encryption key | GMS |
| `SECRET_SERVICE_V1_ALGORITHM_ENABLED` | `true` | Enable v1 algorithm for secret service | GMS |

### Health Check Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------- | ------- | --------------------------- | ---------- |
| `HEALTH_CHECK_CACHE_DURATION_SECONDS` | `5` | Health check cache duration | GMS |

### Metadata Tests Configuration

| Environment Variable | Default | Description | Components |
| ------------------------ | ------- | --------------------- | ---------- |
| `METADATA_TESTS_ENABLED` | `false` | Enable metadata tests | GMS |

### Hooks Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------------ | ------------- | ---------------------------------------------------- | ----------------- |
| `ENABLE_SIBLING_HOOK` | `true` | Enable automatic sibling associations | GMS, MAE Consumer |
| `SIBLINGS_HOOK_CONSUMER_GROUP_SUFFIX` | `` | Siblings hook consumer group suffix | GMS, MAE Consumer |
| `ENABLE_UPDATE_INDICES_HOOK` | `true` | Enable update indices hook | GMS, MAE Consumer |
| `UPDATE_INDICES_CONSUMER_GROUP_SUFFIX` | `` | Update indices consumer group suffix | GMS, MAE Consumer |
| `ENABLE_INGESTION_SCHEDULER_HOOK` | `true` | Enable ingestion scheduling | GMS, MAE Consumer |
| `INGESTION_SCHEDULER_HOOK_CONSUMER_GROUP_SUFFIX` | `` | Ingestion scheduler hook consumer group suffix | GMS, MAE Consumer |
| `ENABLE_INCIDENTS_HOOK` | `true` | Enable incidents hook | GMS, MAE Consumer |
| `MAX_INCIDENT_HISTORY` | `100` | Maximum incident history | GMS, MAE Consumer |
| `INCIDENTS_HOOK_CONSUMER_GROUP_SUFFIX` | `` | Incidents hook consumer group suffix | GMS, MAE Consumer |
| `ENABLE_STRUCTURED_PROPERTIES_HOOK` | `true` | Enable structured properties mappings | GMS, MAE Consumer |
| `ENABLE_STRUCTURED_PROPERTIES_WRITE` | `true` | Enable writing structured property values | GMS, MAE Consumer |
| `ENABLE_STRUCTURED_PROPERTIES_SYSTEM_UPDATE` | `false` | Enable structured property mappings in system update | GMS, MAE Consumer |
| `ENABLE_ENTITY_CHANGE_EVENTS_HOOK` | `true` | Enable entity change events hook | GMS, MAE Consumer |
| `ECE_CONSUMER_GROUP_SUFFIX` | `` | Entity change events consumer group suffix | GMS, MAE Consumer |
| `ECE_ENTITY_EXCLUSIONS` | `schemaField` | Entities to exclude from ECE hook | GMS, MAE Consumer |
| `FORMS_HOOK_ENABLED` | `true` | Enable forms hook | GMS, MAE Consumer |
| `FORMS_HOOK_CONSUMER_GROUP_SUFFIX` | `` | Forms hook consumer group suffix | GMS, MAE Consumer |

### Search and API Configuration

| Environment Variable | Default | Description | Components |
| --------------------------- | --------------------------- | ------------------------------ | ---------- |
| `SEARCH_BAR_API_VARIANT` | `AUTOCOMPLETE_FOR_MULTIPLE` | Search bar API variant | Frontend |
| `FIRST_IN_PERSONAL_SIDEBAR` | `YOUR_ASSETS` | First item in personal sidebar | Frontend |

### Client Configuration

| Environment Variable | Default | Description | Components |
| ----------------------------------------------------- | ------- | --------------------------------------------------- | ------------------------------ |
| `ENTITY_CLIENT_RETRY_INTERVAL` | `2` | Entity client retry interval | GMS |
| `ENTITY_CLIENT_NUM_RETRIES` | `3` | Entity client number of retries | GMS |
| `ENTITY_CLIENT_JAVA_GET_BATCH_SIZE` | `375` | Entity client Java get batch size | GMS |
| `ENTITY_CLIENT_JAVA_INGEST_BATCH_SIZE` | `375` | Entity client Java ingest batch size | GMS |
| `ENTITY_CLIENT_RESTLI_GET_BATCH_SIZE` | `100` | Entity client RESTli get batch size | GMS, MAE Consumer, PE Consumer |
| `ENTITY_CLIENT_RESTLI_GET_BATCH_CONCURRENCY` | `2` | Entity client RESTli get batch concurrency | GMS, MAE Consumer, PE Consumer |
| `ENTITY_CLIENT_RESTLI_GET_BATCH_QUEUE_SIZE` | `500` | Entity client RESTli get batch queue size | GMS, MAE Consumer, PE Consumer |
| `ENTITY_CLIENT_RESTLI_GET_BATCH_THREAD_KEEP_ALIVE` | `60` | Entity client RESTli get batch thread keep alive | GMS, MAE Consumer, PE Consumer |
| `ENTITY_CLIENT_RESTLI_INGEST_BATCH_SIZE` | `50` | Entity client RESTli ingest batch size | GMS, MAE Consumer, PE Consumer |
| `ENTITY_CLIENT_RESTLI_INGEST_BATCH_CONCURRENCY` | `2` | Entity client RESTli ingest batch concurrency | GMS, MAE Consumer, PE Consumer |
| `ENTITY_CLIENT_RESTLI_INGEST_BATCH_QUEUE_SIZE` | `500` | Entity client RESTli ingest batch queue size | GMS, MAE Consumer, PE Consumer |
| `ENTITY_CLIENT_RESTLI_INGEST_BATCH_THREAD_KEEP_ALIVE` | `60` | Entity client RESTli ingest batch thread keep alive | GMS, MAE Consumer, PE Consumer |
| `USAGE_CLIENT_RETRY_INTERVAL` | `2` | Usage client retry interval | GMS, MAE Consumer, PE Consumer |
| `USAGE_CLIENT_NUM_RETRIES` | `0` | Usage client number of retries | GMS, MAE Consumer, PE Consumer |
| `USAGE_CLIENT_TIMEOUT_MS` | `3000` | Usage client timeout in milliseconds | GMS, MAE Consumer, PE Consumer |

### Cache Configuration

| Environment Variable | Default | Description | Components |
| --------------------------------------------------- | ----------- | -------------------------------------------------------- | ------------------------------ |
| `CACHE_TTL_SECONDS` | `600` | Default cache time to live | GMS |
| `CACHE_MAX_SIZE` | `10000` | Maximum number of items to cache | GMS |
| `CACHE_ENTITY_COUNTS_TTL_SECONDS` | `600` | Homepage entity count time to live | GMS |
| `CACHE_SEARCH_LINEAGE_TTL_SECONDS` | `86400` | Search lineage cache time to live | GMS |
| `CACHE_SEARCH_LINEAGE_LIGHTNING_THRESHOLD` | `300` | Lineage graphs exceeding this limit will use local cache | GMS |
| `CACHE_CLIENT_USAGE_CLIENT_ENABLED` | `true` | Enable usage client cache | GMS, MAE Consumer, PE Consumer |
| `CACHE_CLIENT_USAGE_CLIENT_STATS_ENABLED` | `true` | Enable usage client cache stats | GMS, MAE Consumer, PE Consumer |
| `CACHE_CLIENT_USAGE_CLIENT_STATS_INTERVAL_SECONDS` | `120` | Usage client cache stats interval | GMS, MAE Consumer, PE Consumer |
| `CACHE_CLIENT_USAGE_CLIENT_TTL_SECONDS` | `86400` | Usage client cache TTL | GMS, MAE Consumer, PE Consumer |
| `CACHE_CLIENT_USAGE_CLIENT_MAX_BYTES` | `52428800` | Usage client cache max bytes (50MB) | GMS, MAE Consumer, PE Consumer |
| `CACHE_CLIENT_ENTITY_CLIENT_ENABLED` | `true` | Enable entity client cache | GMS, MAE Consumer, PE Consumer |
| `CACHE_CLIENT_ENTITY_CLIENT_STATS_ENABLED` | `true` | Enable entity client cache stats | GMS, MAE Consumer, PE Consumer |
| `CACHE_CLIENT_ENTITY_CLIENT_STATS_INTERVAL_SECONDS` | `120` | Entity client cache stats interval | GMS, MAE Consumer, PE Consumer |
| `CACHE_CLIENT_ENTITY_CLIENT_TTL_SECONDS` | `0` | Entity client cache TTL (0 = no cache) | GMS, MAE Consumer, PE Consumer |
| `CACHE_CLIENT_ENTITY_CLIENT_MAX_BYTES` | `104857600` | Entity client cache max bytes (100MB) | GMS, MAE Consumer, PE Consumer |

### GraphQL Configuration

| Environment Variable | Default | Description | Components |
| ----------------------------------------------- | ---------------------------------------------------------- | ------------------------------------------------ | ---------- |
| `GRAPHQL_CONCURRENCY_SEPARATE_THREAD_POOL` | `false` | Enable separate thread pool for GraphQL | GMS |
| `GRAPHQL_CONCURRENCY_STACK_SIZE` | `256000` | GraphQL thread pool stack size | GMS |
| `GRAPHQL_CONCURRENCY_CORE_POOL_SIZE` | `-1` | GraphQL core pool size (default 5 \* cores) | GMS |
| `GRAPHQL_CONCURRENCY_MAX_POOL_SIZE` | `-1` | GraphQL max pool size (default 100 \* cores) | GMS |
| `GRAPHQL_CONCURRENCY_KEEP_ALIVE` | `60` | GraphQL thread keep alive time | GMS | |
`GRAPHQL_QUERY_COMPLEXITY_LIMIT` | `2000` | GraphQL query complexity limit | GMS |
| `GRAPHQL_QUERY_DEPTH_LIMIT` | `50` | GraphQL query depth limit | GMS |
| `GRAPHQL_QUERY_INTROSPECTION_ENABLED` | `true` | Enable GraphQL introspection | GMS |
| `GRAPHQL_METRICS_ENABLED` | `true` | Enable GraphQL metrics collection | GMS |
| `GRAPHQL_PERCENTILES` | `0.5,0.75,0.95,0.98,0.99,0.999` | GraphQL percentiles | GMS |
| `GRAPHQL_METRICS_FIELD_LEVEL_ENABLED` | `false` | Enable field-level GraphQL metrics | GMS |
| `GRAPHQL_METRICS_FIELD_LEVEL_OPERATIONS` | `getSearchResultsForMultiple,searchAcrossLineageStructure` | GraphQL field-level operations | GMS |
| `GRAPHQL_METRICS_FIELD_LEVEL_PATH_ENABLED` | `false` | Include field path in GraphQL metrics | GMS |
| `GRAPHQL_METRICS_FIELD_LEVEL_PATHS` | `` | GraphQL field-level paths | GMS |
| `GRAPHQL_METRICS_TRIVIAL_DATA_FETCHERS_ENABLED` | `false` | Include trivial data fetchers in GraphQL metrics | GMS |

### Chrome Extension Configuration

| Environment Variable | Default | Description | Components |
| ---------------------------------- | ------- | ------------------------------- | ---------- |
| `CHROME_EXTENSION_ENABLED` | `true` | Enable Chrome extension | Frontend |
| `CHROME_EXTENSION_LINEAGE_ENABLED` | `true` | Enable Chrome extension lineage | Frontend |

### Business Attribute Configuration

| Environment Variable | Default | Description | Components |
| --------------------------------------------------------- | ------- | ---------------------------------------------------------------- | ---------- |
| `BUSINESS_ATTRIBUTE_RELATED_ENTITIES_COUNT` | `20000` | Business attribute related entities count | GMS |
| `BUSINESS_ATTRIBUTE_RELATED_ENTITIES_BATCH_SIZE` | `1000` | Business attribute related entities batch size | GMS |
| `BUSINESS_ATTRIBUTE_PROPAGATION_CONCURRENCY_THREAD_COUNT` | `-1` | Business attribute propagation thread count (default 2 \* cores) | GMS |
| `BUSINESS_ATTRIBUTE_PROPAGATION_CONCURRENCY_KEEP_ALIVE` | `60` | Business attribute propagation keep alive time | GMS |

### Metadata Change Proposal Configuration

| Environment Variable | Default | Description | Components |
| --------------------------------------------- | ---------- | ---------------------------------------------- | ----------------- |
| `MCP_CONSUMER_BATCH_ENABLED` | `false` | Enable MCP consumer batch processing | GMS, MCE Consumer |
| `MCP_CONSUMER_BATCH_SIZE` | `15744000` | MCP consumer batch size | GMS, MCE Consumer |
| `MCP_VALIDATION_IGNORE_UNKNOWN` | `true` | Ignore unknown fields in MCP validation | GMS, MCE Consumer |
| `MCP_VALIDATION_PRIVILEGE_CONSTRAINTS` | `true` | Enable privilege constraints in MCP validation | GMS, MCE Consumer |
| `MCP_VALIDATION_EXTENSIONS_ENABLED` | `false` | Enable extensions in MCP validation | GMS, MCE Consumer |
| `MCP_SIDE_EFFECTS_SCHEMA_FIELD_ENABLED` | `false` | Enable schema field side effects | GMS, MCE Consumer |
| `MCP_SIDE_EFFECTS_DATA_PRODUCT_UNSET_ENABLED` | `true` | Enable data product unset side effects | GMS, MCE Consumer |
| `MCP_THROTTLE_UPDATE_INTERVAL_MS` | `60000` | MCP throttle update interval | GMS, MCE Consumer |
| `MCP_MCE_CONSUMER_THROTTLE_ENABLED` | `false` | Enable MCE consumer throttling | GMS, MCE Consumer |
| `MCP_API_REQUESTS_THROTTLE_ENABLED` | `false` | Enable API requests throttling | GMS, MCE Consumer |
| `MCP_VERSIONED_THROTTLE_ENABLED` | `false` | Enable versioned MCL topic throttling | GMS, MCE Consumer |
| `MCP_VERSIONED_THRESHOLD` | `4000` | Versioned throttle threshold | GMS, MCE Consumer |
| `MCP_VERSIONED_MAX_ATTEMPTS` | `1000` | Versioned max attempts | GMS, MCE Consumer |
| `MCP_VERSIONED_INITIAL_INTERVAL_MS` | `100` | Versioned initial interval | GMS, MCE Consumer |
| `MCP_VERSIONED_MULTIPLIER` | `10` | Versioned multiplier | GMS, MCE Consumer |
| `MCP_VERSIONED_MAX_INTERVAL_MS` | `30000` | Versioned max interval | GMS, MCE Consumer |
| `MCP_TIMESERIES_THROTTLE_ENABLED` | `false` | Enable timeseries MCL topic throttling | GMS, MCE Consumer |
| `MCP_TIMESERIES_THRESHOLD` | `4000` | Timeseries throttle threshold | GMS, MCE Consumer |
| `MCP_TIMESERIES_MAX_ATTEMPTS` | `1000` | Timeseries max attempts | GMS, MCE Consumer |
| `MCP_TIMESERIES_INITIAL_INTERVAL_MS` | `100` | Timeseries initial interval | GMS, MCE Consumer |
| `MCP_TIMESERIES_MULTIPLIER` | `10` | Timeseries multiplier | GMS, MCE Consumer |
| `MCP_TIMESERIES_MAX_INTERVAL_MS` | `30000` | Timeseries max interval | GMS, MCE Consumer |

### Events API Configuration

| Environment Variable | Default | Description | Components |
| -------------------- | ------- | ----------------- | ---------- |
| `EVENTS_API_ENABLED` | `true` | Enable events API | GMS |

### Iceberg Catalog Configuration

| Environment Variable | Default | Description | Components |
| ----------------------- | ------------------- | ----------------------------------------- | ---------- |
| `ENABLE_PUBLIC_READ` | `false` | Enable public read for Iceberg catalog | GMS |
| `PUBLICLY_READABLE_TAG` | `PUBLICLY_READABLE` | Publicly readable tag for Iceberg catalog | GMS |

## Component Configuration

| Environment Variable | Default | Description | Components |
| ---------------------- | ------- | ----------------------------------------------------------------------------------------- | ----------------- |
| `MCP_CONSUMER_ENABLED` | `true` | When running in standalone mode, disable on `GMS` and enable on a separate `MCE Consumer`. | GMS, MCE Consumer |
| `MCL_CONSUMER_ENABLED` | `true` | When running in standalone mode, disable on `GMS` and enable on a separate `MAE Consumer`. | GMS, MAE Consumer |
| `PE_CONSUMER_ENABLED` | `true` | When running in standalone mode, disable on `GMS` and enable on a separate `MAE Consumer`. | GMS, PE Consumer |

---

# DataHub Frontend

## Play Framework Configuration

### Secret Key Configuration

| Environment Variable | Default | Description | Components |
| -------------------- | ------- | ------------------------------------------------- | ---------- |
| `DATAHUB_SECRET` | `null` | Secret key used to secure cryptographic functions | Frontend |

### HTTP Parser Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------ | ------- | ------------------------------------------ | ---------- |
| `DATAHUB_PLAY_MEM_BUFFER_SIZE` | `10MB` | Maximum memory buffer size for HTTP parser | Frontend |

### Server Configuration

| Environment Variable | Default | Description | Components |
| -------------------------------------- | ------- | --------------------------------- | ---------- |
| `DATAHUB_AKKA_MAX_HEADER_COUNT` | `64` | Maximum number of headers allowed | Frontend |
| `DATAHUB_AKKA_MAX_HEADER_VALUE_LENGTH` | `32k` | Maximum header value length | Frontend |

### Session Configuration

| Environment Variable | Default | Description | Components |
| ----------------------- | ------- | ----------------------------------------------- | ---------- |
| `AUTH_COOKIE_SAME_SITE` | `LAX` | SameSite attribute for authentication cookies | Frontend |
| `AUTH_COOKIE_SECURE` | `false` | Whether authentication cookies should be secure | Frontend |

## Authentication Configuration

### OIDC Configuration

Reference Links:

- **OIDC Setup Guide**: [Configure OIDC Authentication](../authentication/guides/sso/configure-oidc-react.md)
- **OIDC Prerequisites**: [Initialize OIDC](../authentication/guides/sso/initialize-oidc.md)

#### Required OIDC Configuration

| Environment Variable | Default | Description | Components |
| ------------------------- | ------- | ---------------------------------------------------- | ---------- |
| `AUTH_OIDC_ENABLED` | `false` | Enable OIDC authentication | Frontend |
| `AUTH_OIDC_CLIENT_ID` | `null` | Unique client ID issued by the identity provider | Frontend |
| `AUTH_OIDC_CLIENT_SECRET` | `null` | Unique client secret issued by the identity provider | Frontend |
| `AUTH_OIDC_DISCOVERY_URI` | `null` | The IdP OIDC discovery URL | Frontend |
| `AUTH_OIDC_BASE_URL` | `null` | The base URL associated with your DataHub deployment | Frontend |

#### Optional OIDC Configuration

| Environment Variable | Default | Description | Components |
| ------------------------------------------- | --------------------- | ------------------------------------------------------------------------ | ---------- |
| `AUTH_OIDC_USER_NAME_CLAIM` | `preferred_username` | The attribute/claim used to derive the DataHub username | Frontend |
| `AUTH_OIDC_USER_NAME_CLAIM_REGEX` | `(.*)` | The regex used to parse the DataHub username from the user name claim | Frontend |
| `AUTH_OIDC_SCOPE` | `oidc email profile` | String representing the requested scope from the IdP | Frontend |
| `AUTH_OIDC_CLIENT_AUTHENTICATION_METHOD` | `client_secret_basic` | Authentication method to pass credentials to token endpoint | Frontend |
| `AUTH_OIDC_JIT_PROVISIONING_ENABLED` | `true` | Whether DataHub users should be provisioned on login if they don't exist | Frontend |
| `AUTH_OIDC_PRE_PROVISIONING_REQUIRED` | `false` | Whether the user should already exist in DataHub on login | Frontend |
| `AUTH_OIDC_EXTRACT_GROUPS_ENABLED` | `true` | Whether groups should be extracted from a claim in the OIDC profile | Frontend |
| `AUTH_OIDC_GROUPS_CLAIM` | `groups` | The OIDC claim to extract groups information from | Frontend |
| `AUTH_OIDC_RESPONSE_TYPE` | `null` | OIDC response type | Frontend |
| `AUTH_OIDC_RESPONSE_MODE` | `null` | OIDC response mode | Frontend |
| `AUTH_OIDC_USE_NONCE` | `null` | Whether to use nonce in OIDC flow | Frontend |
| `AUTH_OIDC_CUSTOM_PARAM_RESOURCE` | `null` | Custom resource parameter for OIDC | Frontend |
| `AUTH_OIDC_READ_TIMEOUT` | `null` | OIDC read timeout | Frontend | |
`AUTH_OIDC_CONNECT_TIMEOUT` | `null` | OIDC connect timeout | Frontend | | `AUTH_OIDC_EXTRACT_JWT_ACCESS_TOKEN_CLAIMS` | `false` | Whether to extract claims from JWT access token | Frontend | | `AUTH_OIDC_PREFERRED_JWS_ALGORITHM` | `null` | Which JWS algorithm to use | Frontend | | `AUTH_OIDC_ACR_VALUES` | `null` | OIDC ACR values | Frontend | | `AUTH_OIDC_GRANT_TYPE` | `null` | OIDC grant type | Frontend | ### Authentication Methods Configuration | Environment Variable | Default | Description | Components | | ------------------------------ | ------- | -------------------------------------------------------- | ---------- | | `AUTH_JAAS_ENABLED` | `true` | Enable JAAS authentication | Frontend | | `AUTH_NATIVE_ENABLED` | `true` | Enable native authentication | Frontend | | `GUEST_AUTHENTICATION_ENABLED` | `false` | Enable guest authentication | Frontend | | `GUEST_AUTHENTICATION_USER` | `guest` | The name of the guest user ID | Frontend | | `GUEST_AUTHENTICATION_PATH` | `null` | The path to bypass login page and get logged in as guest | Frontend | | `ENFORCE_VALID_EMAIL` | `true` | Enforce the usage of a valid email for user sign up | Frontend | ### Authentication Logging | Environment Variable | Default | Description | Components | | ---------------------- | ------- | ------------------------------------- | ---------- | | `AUTH_VERBOSE_LOGGING` | `false` | Enable verbose authentication logging | Frontend | ### Session Configuration | Environment Variable | Default | Description | Components | | ------------------------ | ------- | -------------------------------------- | ---------- | | `AUTH_SESSION_TTL_HOURS` | `24` | Login session expiration time in hours | Frontend | | `MAX_SESSION_TOKEN_AGE` | `24h` | Maximum age of session token | Frontend | ## Metadata Service Configuration ### Connection Configuration | Environment Variable | Default | Description | Components | | --------------------- | ----------- | -------------------------------------------------- | 
---------- | | `DATAHUB_GMS_HOST` | `localhost` | Metadata service host | Frontend | | `DATAHUB_GMS_PORT` | `8080` | Metadata service port | Frontend | | `DATAHUB_GMS_USE_SSL` | `false` | Whether to use SSL for metadata service connection | Frontend | ### Authentication Configuration | Environment Variable | Default | Description | Components | | ------------------------------- | ---------------------- | ----------------------------------------- | ---------- | | `METADATA_SERVICE_AUTH_ENABLED` | `false` | Enable metadata service authentication | Frontend | | `DATAHUB_SYSTEM_CLIENT_SECRET` | `JohnSnowKnowsNothing` | System client secret for metadata service | Frontend | ## Entity Client Configuration | Environment Variable | Default | Description | Components | | -------------------------------------------- | ------- | ------------------------------------------ | ---------- | | `ENTITY_CLIENT_RETRY_INTERVAL` | `2` | Entity client retry interval | Frontend | | `ENTITY_CLIENT_NUM_RETRIES` | `3` | Entity client number of retries | Frontend | | `ENTITY_CLIENT_RESTLI_GET_BATCH_SIZE` | `50` | Entity client RESTli get batch size | Frontend | | `ENTITY_CLIENT_RESTLI_GET_BATCH_CONCURRENCY` | `2` | Entity client RESTli get batch concurrency | Frontend | --- ## Notes - Environment variables follow the pattern of converting YAML property paths to uppercase with underscores - Default values are shown in the table above - For Kafka configuration, refer to the official Spring Kafka documentation for additional properties - Feature flags control experimental or optional functionality - System update configurations control various background maintenance tasks - Cache configurations help optimize performance for different use cases - GraphQL configurations control query complexity and performance monitoring - OpenTelemetry variables control observability and tracing behavior - Play Framework properties are converted to environment variables by: - Converting dots (`.`) to underscores 
(`_`) - Converting to uppercase
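
The two conversion steps listed for Play Framework properties can be sketched as a small helper. This is a minimal illustration, not part of DataHub: the function name `property_to_env_var` and the sample property paths are assumptions chosen for the example, and the helper applies only the dot-to-underscore and uppercase steps described in the notes.

```python
def property_to_env_var(property_path: str) -> str:
    """Apply the two documented steps: dots become underscores, then uppercase."""
    return property_path.replace(".", "_").upper()


# Hypothetical property paths, used only to illustrate the naming rule.
sample_paths = ["auth.oidc.enabled", "auth.jaas.enabled", "auth.verbose.logging"]

for path in sample_paths:
    print(f"{path} -> {property_to_env_var(path)}")
```

Some variables in the tables above may use custom names that do not map mechanically from a property path, so treat the helper's result as a starting guess and confirm the name against the relevant table.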