mirror of
https://github.com/datahub-project/datahub.git
synced 2025-08-17 13:45:54 +00:00

title: Deployment Environment Variables
Environment Variables
This page summarizes the most important environment variables that control how DataHub behaves.
DataHub Java Components
This section covers GMS, System Update, and the MAE/MCE Consumers.
Authentication & Authorization
Reference Links:
- Authentication Overview: Authentication Overview
- Authentication Concepts: Authentication Concepts
- Metadata Service Authentication: Introducing Metadata Service Authentication
- OIDC Configuration: Configure OIDC Authentication
- Adding Users: Adding Users Guide
- Plugin Configuration: Plugin Documentation
Authentication Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
METADATA_SERVICE_AUTH_ENABLED | true | Enable if you want all requests to the Metadata Service to be authenticated | GMS, MAE Consumer, MCE Consumer, PE Consumer, Frontend |
DATAHUB_SYSTEM_CLIENT_SECRET | | System client secret used by AuthServiceController | GMS, MAE Consumer, MCE Consumer, PE Consumer, Actions, Frontend |
METADATA_SERVICE_AUTHENTICATOR_EXCEPTIONS_ENABLED | false | Failures are normally only logged as warnings; enable this to throw them instead | GMS |
DATAHUB_TOKEN_SERVICE_SIGNING_KEY | | Key used to validate incoming tokens and sign new tokens | GMS |
DATAHUB_TOKEN_SERVICE_SALT | | Salt used for token validation and signing | GMS |
DATAHUB_TOKEN_SERVICE_SIGNING_ALGORITHM | HS256 | Signing algorithm for DataHub tokens | GMS |
SESSION_TOKEN_DURATION_MS | 86400000 | The max duration of a UI session in milliseconds (defaults to 1 day) | GMS |
GUEST_AUTHENTICATION_USER | guest | Guest user for unauthenticated access | GMS |
GUEST_AUTHENTICATION_ENABLED | false | Enable guest authentication | GMS |
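As a quick check on units, `SESSION_TOKEN_DURATION_MS` is given in milliseconds, so the default of 86400000 corresponds to one day. The sketch below reads the variable the way a deployment script might. The helper name `session_duration` is hypothetical and is not part of DataHub.

```python
import os
from datetime import timedelta

# Documented default from the table above (1 day in milliseconds).
DEFAULT_SESSION_TOKEN_DURATION_MS = 86_400_000

def session_duration(env=os.environ) -> timedelta:
    """Return the configured UI session length as a timedelta.

    Hypothetical helper for illustration; it only mirrors the documented
    variable name and default.
    """
    raw = env.get("SESSION_TOKEN_DURATION_MS", DEFAULT_SESSION_TOKEN_DURATION_MS)
    return timedelta(milliseconds=int(raw))

# With no override, the default works out to exactly one day.
print(session_duration({}))
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test without touching the real environment.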
Authorization Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
AUTH_POLICIES_ENABLED | true | Enable the default DataHub policies-based authorizer | GMS |
POLICY_CACHE_REFRESH_INTERVAL_SECONDS | 120 | Cache refresh interval for policies in seconds | GMS |
POLICY_CACHE_FETCH_SIZE | 1000 | Cache policy fetch size | GMS |
REST_API_AUTHORIZATION_ENABLED | true | Enable authorization of reads, writes, and deletes on REST APIs | GMS |
VIEW_AUTHORIZATION_ENABLED | false | Controls whether entity pages can limit access based on policies | GMS |
VIEW_AUTHORIZATION_RECOMMENDATIONS_PEER_GROUP_ENABLED | true | Enable peer group recommendations for view authorization | GMS |
Ingestion Configuration
Reference Links:
- CLI Configuration: CLI Documentation
- DataHub Actions: Actions Documentation
Environment Variable | Default | Description | Components |
---|---|---|---|
UI_INGESTION_ENABLED | true | Enable UI-based ingestion | GMS, MAE Consumer |
INGESTION_BATCH_REFRESH_COUNT | 100 | Number of entities to refresh in a single batch when refreshing entities after ingestion | GMS |
INGESTION_SOURCE_REFRESH_INTERVAL_SECONDS | 43200 | Interval at which the ingestion source scheduler will check for new or updated ingestion sources | GMS |
Telemetry & Analytics
Environment Variable | Default | Description | Components |
---|---|---|---|
INGESTION_REPORTING_ENABLED | false | Enable ingestion reporting | GMS |
ENABLE_THIRD_PARTY_LOGGING | false | Whether Mixpanel tracking is enabled | GMS |
DataHub Core Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
DATAHUB_SERVER_TYPE | prod | DataHub server type | GMS |
DATAHUB_GMS_ASYNC_REQUEST_TIMEOUT_MS | 55000 | Async request timeout for GMS | GMS |
DATAHUB_GMS_HOST | localhost | GMS host | Frontend |
DATAHUB_GMS_PORT | 8080 | GMS port | Frontend |
DATAHUB_GMS_USE_SSL | false | Use SSL for GMS connections | Frontend |
DATAHUB_GMS_URI | null | URI instead of separate host/port/ssl parameters (takes priority) | Frontend |
DATAHUB_GMS_SSL_PROTOCOL | null | SSL protocol for GMS | Frontend |
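The table notes that `DATAHUB_GMS_URI` takes priority over the separate host, port, and SSL settings. A minimal sketch of that precedence follows. The function `gms_base_url` is hypothetical, and mapping `DATAHUB_GMS_USE_SSL=true` to an `https` scheme and trimming a trailing slash are assumptions made for illustration, not DataHub's actual resolution code.

```python
def gms_base_url(env: dict) -> str:
    """Resolve the GMS base URL from environment-style settings.

    Assumption: DATAHUB_GMS_URI, when set, overrides host/port/ssl entirely.
    """
    uri = env.get("DATAHUB_GMS_URI")
    if uri:
        # Trailing-slash trimming is a convenience added by this sketch.
        return uri.rstrip("/")

    # Fall back to the documented defaults for the individual settings.
    host = env.get("DATAHUB_GMS_HOST", "localhost")
    port = env.get("DATAHUB_GMS_PORT", "8080")
    use_ssl = env.get("DATAHUB_GMS_USE_SSL", "false").strip().lower() == "true"
    scheme = "https" if use_ssl else "http"
    return f"{scheme}://{host}:{port}"

# The URI wins even when a host is also provided.
print(gms_base_url({"DATAHUB_GMS_URI": "https://gms.example.com/", "DATAHUB_GMS_HOST": "ignored"}))
```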
Plugin Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
PLUGIN_SECURITY_MODE | RESTRICTED | Plugin security mode (RESTRICTED or LENIENT) | GMS |
ENTITY_REGISTRY_PLUGIN_PATH | /etc/datahub/plugins/models | Path for entity registry plugins | GMS |
ENTITY_REGISTRY_PLUGIN_LOAD_DELAY_SECONDS | 60 | Rate at which plugin runnable executes | GMS |
RETENTION_PLUGIN_PATH | /etc/datahub/plugins/retention | Path for retention plugins | GMS |
AUTH_PLUGIN_PATH | /etc/datahub/plugins/auth | Path for auth plugins | GMS |
Metrics Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
DATAHUB_METRICS_HOOK_LATENCY_PERCENTILES | 0.5,0.95,0.99,0.999 | Hook latency percentiles | GMS, MAE Consumer |
DATAHUB_METRICS_HOOK_LATENCY_SERVICE_LEVEL_OBJECTIVES | 300,1800,3000,10800,21600,43200 | Hook latency SLOs in seconds | GMS, MAE Consumer |
DATAHUB_METRICS_HOOK_LATENCY_MAX_EXPECTED_VALUE | 86000 | Maximum expected hook latency value in seconds | GMS, MAE Consumer |
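The percentile and SLO settings are comma-separated lists of numbers. The sketch below parses the documented defaults and converts the SLO buckets to minutes, which can help when choosing custom values. `parse_number_list` is a hypothetical helper, not a DataHub function.

```python
def parse_number_list(raw: str) -> list:
    """Split a comma-separated value such as "0.5,0.95" into floats."""
    return [float(item) for item in raw.split(",") if item.strip()]

# Documented defaults from the table above.
percentiles = parse_number_list("0.5,0.95,0.99,0.999")
slos_seconds = parse_number_list("300,1800,3000,10800,21600,43200")

# Sanity check: every percentile should be a fraction between 0 and 1.
assert all(0.0 < p < 1.0 for p in percentiles)

# Express each SLO bucket in minutes for readability.
slos_minutes = [seconds / 60 for seconds in slos_seconds]
print(slos_minutes)
```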
Entity Service Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ENTITY_SERVICE_IMPL | ebean | Entity service implementation | GMS, MCE Consumer |
ENTITY_SERVICE_ENABLE_RETENTION | true | Enable entity retention | GMS, MCE Consumer |
ENTITY_SERVICE_APPLY_RETENTION_BOOTSTRAP | false | Apply retention on bootstrap | GMS, MCE Consumer |
Graph Service Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
GRAPH_SERVICE_IMPL | elasticsearch | Graph service implementation | GMS, MAE Consumer |
GRAPH_SERVICE_LIMIT_RESULTS_MAX | 10000 | Maximum allowed result count for queries | GMS |
GRAPH_SERVICE_LIMIT_RESULTS_API_DEFAULT | 5000 | Default API result limit | GMS |
GRAPH_SERVICE_LIMIT_RESULTS_STRICT | false | Throw exception if strict is true, otherwise override with default and warn | GMS |
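The three `LIMIT_RESULTS` settings (maximum, API default, strict) recur for the graph, search, timeseries, and system metadata services. One plausible reading of their descriptions is sketched below. The function `resolve_limit` is hypothetical, and the handling of a missing request value is an assumption; the real services may treat edge cases differently.

```python
import warnings

def resolve_limit(requested, max_results=10_000, api_default=5_000, strict=False):
    """Pick the effective result limit for a query.

    Assumed behaviour, inferred from the table descriptions:
    - no requested value: use the API default
    - requested within the maximum: use it as-is
    - requested above the maximum: raise if strict, else warn and use the default
    """
    if requested is None:
        return api_default
    if requested <= max_results:
        return requested
    if strict:
        raise ValueError(f"Requested {requested} results, maximum is {max_results}")
    warnings.warn(f"Requested {requested} exceeds {max_results}; using {api_default}")
    return api_default

print(resolve_limit(250))      # within the maximum, returned unchanged
print(resolve_limit(20_000))   # above the maximum, falls back to the default
```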
Search Service Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
SEARCH_SERVICE_BATCH_SIZE | 100 | Search service batch size | GMS |
SEARCH_SERVICE_ENABLE_CACHE | false | Enable search service cache | GMS |
SEARCH_SERVICE_ENABLE_CACHE_EVICTION | false | Enable search service cache eviction | GMS |
SEARCH_SERVICE_CACHE_IMPLEMENTATION | caffeine | Search service cache implementation | GMS |
SEARCH_SERVICE_HAZELCAST_SERVICE_NAME | hazelcast-service | Hazelcast service name for search cache | GMS |
SEARCH_SERVICE_FILTER_CONTAINER_EXPANSION_ENABLED | true | Enable container expansion in search filters | GMS |
SEARCH_SERVICE_FILTER_CONTAINER_EXPANSION_PAGE_SIZE | 100 | Page size for container expansion | GMS |
SEARCH_SERVICE_FILTER_CONTAINER_EXPANSION_LIMIT | 100 | Limit for container expansion | GMS |
SEARCH_SERVICE_FILTER_DOMAIN_EXPANSION_ENABLED | true | Enable domain expansion in search filters | GMS |
SEARCH_SERVICE_FILTER_DOMAIN_EXPANSION_PAGE_SIZE | 100 | Page size for domain expansion | GMS |
SEARCH_SERVICE_FILTER_DOMAIN_EXPANSION_LIMIT | 100 | Limit for domain expansion | GMS |
SEARCH_SERVICE_LIMIT_RESULTS_MAX | 10000 | Maximum allowed result count for queries | GMS |
SEARCH_SERVICE_LIMIT_RESULTS_API_DEFAULT | 5000 | Default API result limit | GMS |
SEARCH_SERVICE_LIMIT_RESULTS_STRICT | false | Throw exception if strict is true, otherwise override with default and warn | GMS |
Timeseries Aspect Service
Environment Variable | Default | Description | Components |
---|---|---|---|
TIMESERIES_ASPECT_SERVICE_QUERY_CONCURRENCY | 10 | Parallel threads for timeseries queries | GMS |
TIMESERIES_ASPECT_SERVICE_QUERY_QUEUE_SIZE | 500 | Queue size for timeseries queries | GMS |
TIMESERIES_ASPECT_SERVICE_QUERY_THREAD_KEEP_ALIVE | 60 | Thread keep alive time for timeseries queries | GMS |
TIMESERIES_ASPECT_SERVICE_LIMIT_RESULTS_MAX | 10000 | Maximum allowed result count for queries | GMS |
TIMESERIES_ASPECT_SERVICE_LIMIT_RESULTS_API_DEFAULT | 5000 | Default API result limit | GMS |
TIMESERIES_ASPECT_SERVICE_LIMIT_RESULTS_STRICT | false | Throw exception if strict is true, otherwise override with default and warn | GMS |
System Metadata Service
Environment Variable | Default | Description | Components |
---|---|---|---|
SYSTEM_METADATA_SERVICE_LIMIT_RESULTS_MAX | 10000 | Maximum allowed result count for queries | GMS |
SYSTEM_METADATA_SERVICE_LIMIT_RESULTS_API_DEFAULT | 5000 | Default API result limit | GMS |
SYSTEM_METADATA_SERVICE_LIMIT_RESULTS_STRICT | false | Throw exception if strict is true, otherwise override with default and warn | GMS |
Platform Analytics
Environment Variable | Default | Description | Components |
---|---|---|---|
DATAHUB_ANALYTICS_ENABLED | true | Enable platform analytics | GMS, MAE Consumer, Frontend |
DATAHUB_ANALYTICS_TRACING_ENABLED | true | Enable backend usage tracing | GMS |
ANALYTICS_DATAHUB_USAGE_EVENT_TYPES | CreateAccessTokenEvent,CreatePolicyEvent,UpdatePolicyEvent,CreateIngestionSourceEvent,UpdateIngestionSourceEvent,RevokeAccessTokenEvent,CreateUserEvent,UpdateUserEvent,DeletePolicyEvent | Comma separated list of usage event types to listen to | GMS |
ANALYTICS_GENERIC_ASPECT_TYPES | `` | Filter list for generic aspect events | GMS |
ANALYTICS_USER_FILTERS | `` | Filter out specific users' events from being published | GMS |
Visual Configuration
Queries Tab
Environment Variable | Default | Description | Components |
---|---|---|---|
REACT_APP_QUERIES_TAB_RESULT_SIZE | 5 | Queries tab result size (experimental) | Frontend |
Theme Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
REACT_APP_CUSTOM_THEME_ID | `` | Custom theme ID for rendering specific theme file | Frontend |
Assets Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
REACT_APP_LOGO_URL | /assets/platforms/datahublogo.png | Logo URL for the application | Frontend |
REACT_APP_FAVICON_URL | /assets/icons/favicon.ico | Favicon URL for the application | Frontend |
REACT_APP_TITLE | `` | Application title | Frontend |
UI Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
REACT_APP_HIDE_GLOSSARY | false | Hide glossary in the UI | Frontend |
REACT_APP_SHOW_FULL_TITLE_IN_LINEAGE | false | Show full title in lineage | Frontend |
DOMAIN_DEFAULT_TAB | `` | Default tab for domains (set to DOCUMENTATION_TAB to show documentation tab first) | Frontend |
APPLICATION_SHOW_SIDEBAR_SECTION_WHEN_EMPTY | false | Show sidebar section when empty (deprecated) | Frontend |
SEARCH_RESULT_NAME_HIGHLIGHT_ENABLED | true | Enable visual highlighting on search result names/descriptions | Frontend |
Storage Layer Configuration
EBean Configuration (MySQL/PostgreSQL)
Environment Variable | Default | Description | Components |
---|---|---|---|
EBEAN_DATASOURCE_USERNAME | datahub | Database username | GMS, MCE Consumer, System Update |
EBEAN_DATASOURCE_PASSWORD | datahub | Database password | GMS, MCE Consumer, System Update |
EBEAN_DATASOURCE_URL | jdbc:mysql://localhost:3306/datahub | JDBC URL | GMS, MCE Consumer, System Update |
EBEAN_DATASOURCE_DRIVER | com.mysql.jdbc.Driver | JDBC Driver | GMS, MCE Consumer, System Update |
EBEAN_MIN_CONNECTIONS | 2 | Minimum database connections | GMS, MCE Consumer, System Update |
EBEAN_MAX_CONNECTIONS | 50 | Maximum database connections | GMS, MCE Consumer, System Update |
EBEAN_MAX_INACTIVE_TIME_IN_SECS | 120 | Maximum inactive time in seconds | GMS, MCE Consumer, System Update |
EBEAN_MAX_AGE_MINUTES | 120 | Maximum age in minutes | GMS, MCE Consumer, System Update |
EBEAN_LEAK_TIME_MINUTES | 15 | Leak time in minutes | GMS, MCE Consumer, System Update |
EBEAN_WAIT_TIMEOUT_MILLIS | 1000 | Wait timeout in milliseconds | GMS, MCE Consumer, System Update |
EBEAN_AUTOCREATE | false | Auto-create DDL | GMS, MCE Consumer, System Update |
EBEAN_POSTGRES_USE_AWS_IAM_AUTH | false | Use AWS IAM authentication for PostgreSQL | GMS, MCE Consumer, System Update |
EBEAN_BATCH_GET_METHOD | IN | Batch get method (IN or UNION) | GMS, MCE Consumer, System Update |
Cassandra Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
CASSANDRA_DATASOURCE_USERNAME | cassandra | Cassandra username | GMS, MCE Consumer, System Update |
CASSANDRA_DATASOURCE_PASSWORD | cassandra | Cassandra password | GMS, MCE Consumer, System Update |
CASSANDRA_HOSTS | cassandra | Cassandra hosts | GMS, MCE Consumer, System Update |
CASSANDRA_PORT | 9042 | Cassandra port | GMS, MCE Consumer, System Update |
CASSANDRA_DATACENTER | datacenter1 | Cassandra datacenter | GMS, MCE Consumer, System Update |
CASSANDRA_KEYSPACE | datahub | Cassandra keyspace | GMS, MCE Consumer, System Update |
CASSANDRA_USE_SSL | false | Use SSL for Cassandra | GMS, MCE Consumer, System Update |
Elasticsearch Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ELASTICSEARCH_HOST | localhost | Elasticsearch host | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_PORT | 9200 | Elasticsearch port | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_THREAD_COUNT | 2 | Elasticsearch thread count | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_CONNECTION_REQUEST_TIMEOUT | 5000 | Connection request timeout | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_USERNAME | null | Elasticsearch username | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_PASSWORD | null | Elasticsearch password | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_PATH_PREFIX | null | Elasticsearch path prefix | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_USE_SSL | false | Use SSL for Elasticsearch | GMS, MAE Consumer, MCE Consumer, System Update |
OPENSEARCH_USE_AWS_IAM_AUTH | false | Use AWS IAM authentication for OpenSearch | GMS, MAE Consumer, MCE Consumer, System Update |
AWS_REGION | null | AWS region | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_IMPLEMENTATION | elasticsearch | Implementation (elasticsearch or opensearch) | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTIC_ID_HASH_ALGO | MD5 | ID hash algorithm | GMS, MAE Consumer, MCE Consumer, System Update |
SSL Context Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ELASTICSEARCH_SSL_PROTOCOL | null | SSL protocol | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_SSL_SECURE_RANDOM_IMPL | null | SSL secure random implementation | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_SSL_TRUSTSTORE_FILE | null | SSL truststore file | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_SSL_TRUSTSTORE_TYPE | null | SSL truststore type | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_SSL_TRUSTSTORE_PASSWORD | null | SSL truststore password | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_SSL_KEYSTORE_FILE | null | SSL keystore file | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_SSL_KEYSTORE_TYPE | null | SSL keystore type | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_SSL_KEYSTORE_PASSWORD | null | SSL keystore password | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_SSL_KEY_PASSWORD | null | SSL key password | GMS, MAE Consumer, MCE Consumer, System Update |
Bulk Operations Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ES_BULK_DELETE_BATCH_SIZE | 5000 | Bulk delete batch size | GMS, MAE Consumer |
ES_BULK_DELETE_SLICES | auto | Bulk delete slices | GMS, MAE Consumer |
ES_BULK_DELETE_POLL_INTERVAL | 30 | Bulk delete poll interval | GMS, MAE Consumer |
ES_BULK_DELETE_POLL_UNIT | SECONDS | Bulk delete poll unit | GMS, MAE Consumer |
ES_BULK_DELETE_TIMEOUT | 30 | Bulk delete timeout | GMS, MAE Consumer |
ES_BULK_DELETE_TIMEOUT_UNIT | MINUTES | Bulk delete timeout unit | GMS, MAE Consumer |
ES_BULK_DELETE_NUM_RETRIES | 3 | Bulk delete number of retries | GMS, MAE Consumer |
ES_BULK_ASYNC | true | Enable async bulk operations | GMS, MAE Consumer |
ES_BULK_REQUESTS_LIMIT | 1000 | Bulk requests limit | GMS, MAE Consumer |
ES_BULK_FLUSH_PERIOD | 1 | Bulk flush period | GMS, MAE Consumer |
ES_BULK_NUM_RETRIES | 3 | Bulk number of retries | GMS, MAE Consumer |
ES_BULK_RETRY_INTERVAL | 1 | Bulk retry interval | GMS, MAE Consumer |
ES_BULK_REFRESH_POLICY | NONE | Bulk refresh policy | GMS, MAE Consumer |
ES_BULK_ENABLE_BATCH_DELETE | false | Enable batch delete | GMS, MAE Consumer |
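The bulk delete poll interval and timeout are each split into a numeric value and a unit. The sketch below combines the documented defaults (30 SECONDS and 30 MINUTES) into durations and derives how many polls fit into one timeout window. It assumes the unit strings follow Java `TimeUnit` names, which the defaults suggest but the table does not state, and `to_timedelta` is a hypothetical helper.

```python
from datetime import timedelta

# Assumption: unit values use Java TimeUnit-style names.
_UNIT_TO_KWARG = {
    "MILLISECONDS": "milliseconds",
    "SECONDS": "seconds",
    "MINUTES": "minutes",
    "HOURS": "hours",
    "DAYS": "days",
}

def to_timedelta(value, unit: str) -> timedelta:
    """Combine a numeric setting and its unit setting into one duration."""
    return timedelta(**{_UNIT_TO_KWARG[unit.upper()]: int(value)})

# ES_BULK_DELETE_POLL_INTERVAL / ES_BULK_DELETE_POLL_UNIT defaults.
poll = to_timedelta(30, "SECONDS")
# ES_BULK_DELETE_TIMEOUT / ES_BULK_DELETE_TIMEOUT_UNIT defaults.
timeout = to_timedelta(30, "MINUTES")

# Number of whole poll intervals that fit before the timeout elapses.
max_polls = timeout // poll
print(max_polls)
```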
Index Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
INDEX_PREFIX | `` | Index prefix | GMS, MAE Consumer, MCE Consumer, System Update |
ELASTICSEARCH_INDEX_DOC_IDS_SCHEMA_FIELD_HASH_ID_ENABLED | false | Enable hash ID for schema field doc IDs | GMS, MAE Consumer, MCE Consumer, System Update |
Build Indices Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ELASTICSEARCH_BUILD_INDICES_ALLOW_DOC_COUNT_MISMATCH | false | Allow document count mismatch when clone indices is enabled | System Update |
ELASTICSEARCH_BUILD_INDICES_CLONE_INDICES | true | Clone indices | System Update |
ELASTICSEARCH_BUILD_INDICES_RETENTION_UNIT | DAYS | Retention unit for indices | System Update |
ELASTICSEARCH_BUILD_INDICES_RETENTION_VALUE | 60 | Retention value for indices | System Update |
ELASTICSEARCH_BUILD_INDICES_REINDEX_OPTIMIZATION_ENABLED | true | Enable reindex optimization | System Update |
ELASTICSEARCH_NUM_SHARDS_PER_INDEX | 1 | Number of shards per index | System Update |
ELASTICSEARCH_NUM_REPLICAS_PER_INDEX | 1 | Number of replicas per index | System Update |
ELASTICSEARCH_INDEX_BUILDER_NUM_RETRIES | 3 | Index builder number of retries | System Update |
ELASTICSEARCH_INDEX_BUILDER_REFRESH_INTERVAL_SECONDS | 3 | Index builder refresh interval | System Update |
SEARCH_DOCUMENT_MAX_ARRAY_LENGTH | 1000 | Maximum array length in search documents | System Update |
SEARCH_DOCUMENT_MAX_OBJECT_KEYS | 1000 | Maximum object keys in search documents | System Update |
SEARCH_DOCUMENT_MAX_VALUE_LENGTH | 4096 | Maximum value length in search documents | System Update |
ELASTICSEARCH_MAIN_TOKENIZER | null | Main tokenizer | System Update |
ELASTICSEARCH_INDEX_BUILDER_MAPPINGS_REINDEX | false | Enable mappings reindex | System Update |
ELASTICSEARCH_INDEX_BUILDER_SETTINGS_REINDEX | false | Enable settings reindex | System Update |
ELASTICSEARCH_INDEX_BUILDER_MAX_REINDEX_HOURS | 0 | Maximum reindex hours (0 = no timeout) | System Update |
ELASTICSEARCH_INDEX_BUILDER_SETTINGS_OVERRIDES | null | Index builder settings overrides | System Update |
ELASTICSEARCH_MIN_SEARCH_FILTER_LENGTH | 3 | Minimum search filter length | System Update |
ELASTICSEARCH_INDEX_BUILDER_ENTITY_SETTINGS_OVERRIDES | null | Entity settings overrides | System Update |
Search Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ELASTICSEARCH_QUERY_MAX_TERM_BUCKET_SIZE | 60 | Maximum term bucket size | GMS |
ELASTICSEARCH_QUERY_EXACT_MATCH_EXCLUSIVE | false | Only return exact matches when using quotes | GMS |
ELASTICSEARCH_QUERY_EXACT_MATCH_WITH_PREFIX | true | Include prefix match in exact match results | GMS |
ELASTICSEARCH_QUERY_EXACT_MATCH_FACTOR | 16.0 | Multiply by this number on true exact match | GMS |
ELASTICSEARCH_QUERY_EXACT_MATCH_PREFIX_FACTOR | 1.1 | Multiply by this number when prefix match | GMS |
ELASTICSEARCH_QUERY_EXACT_MATCH_CASE_FACTOR | 0.0 | Stacked boost multiplier when case mismatch | GMS |
ELASTICSEARCH_QUERY_EXACT_MATCH_ENABLE_STRUCTURED | true | Enable exact match on structured search | GMS |
ELASTICSEARCH_QUERY_TWO_GRAM_FACTOR | 1.2 | Boost multiplier when match on 2-gram tokens | GMS |
ELASTICSEARCH_QUERY_THREE_GRAM_FACTOR | 1.5 | Boost multiplier when match on 3-gram tokens | GMS |
ELASTICSEARCH_QUERY_FOUR_GRAM_FACTOR | 1.8 | Boost multiplier when match on 4-gram tokens | GMS |
ELASTICSEARCH_QUERY_PARTIAL_URN_FACTOR | 0.5 | Multiplier on Urn token match | GMS |
ELASTICSEARCH_QUERY_PARTIAL_FACTOR | 0.4 | Multiplier on possible non-Urn token match | GMS |
ELASTICSEARCH_QUERY_CUSTOM_CONFIG_ENABLED | true | Enable search query and ranking customization | GMS |
ELASTICSEARCH_QUERY_CUSTOM_CONFIG_FILE | search_config.yaml | Location of search customization configuration | GMS |
ELASTICSEARCH_QUERY_SEARCH_FIELD_CONFIG_DEFAULT | legacy | Default field configuration for search | GMS |
ELASTICSEARCH_QUERY_AUTOCOMPLETE_FIELD_CONFIG_DEFAULT | legacy | Default field configuration for autocomplete | GMS |
Graph Search Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ELASTICSEARCH_SEARCH_GRAPH_TIMEOUT_SECONDS | 50 | Graph DAO timeout seconds | GMS |
ELASTICSEARCH_SEARCH_GRAPH_BATCH_SIZE | 1000 | Graph DAO batch size | GMS |
ELASTICSEARCH_SEARCH_GRAPH_MULTI_PATH_SEARCH | false | Allow path retraversal for all paths | GMS |
ELASTICSEARCH_SEARCH_GRAPH_BOOST_VIA_NODES | true | Boost graph edges with via nodes | GMS |
ELASTICSEARCH_SEARCH_GRAPH_STATUS_ENABLED | false | Enable soft delete tracking of URNs on edges | GMS |
ELASTICSEARCH_SEARCH_GRAPH_LINEAGE_MAX_HOPS | 20 | Maximum hops to traverse lineage graph | GMS |
ELASTICSEARCH_SEARCH_GRAPH_IMPACT_MAX_HOPS | 1000 | Maximum hops to traverse for impact analysis | GMS |
ELASTICSEARCH_SEARCH_GRAPH_IMPACT_MAX_THREADS | 32 | Maximum parallel lineage graph queries | GMS |
ELASTICSEARCH_SEARCH_GRAPH_QUERY_OPTIMIZATION | true | Reduce query nesting if possible | GMS |
Neo4j Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
NEO4J_USERNAME | neo4j | Neo4j username | GMS, MAE Consumer, System Update |
NEO4J_PASSWORD | datahub | Neo4j password | GMS, MAE Consumer, System Update |
NEO4J_URI | bolt://localhost | Neo4j URI | GMS, MAE Consumer, System Update |
NEO4J_DATABASE | graph.db | Neo4j database | GMS, MAE Consumer, System Update |
NEO4J_MAX_CONNECTION_POOL_SIZE | 100 | Maximum connection pool size | GMS, MAE Consumer, System Update |
NEO4J_MAX_CONNECTION_ACQUISITION_TIMEOUT_IN_SECONDS | 60 | Maximum connection acquisition timeout | GMS, MAE Consumer, System Update |
NEO4j_MAX_CONNECTION_LIFETIME_IN_SECONDS | 3600 | Maximum connection lifetime | GMS, MAE Consumer, System Update |
NEO4J_MAX_TRANSACTION_RETRY_TIME_IN_SECONDS | 30 | Maximum transaction retry time | GMS, MAE Consumer, System Update |
NEO4J_CONNECTION_LIVENESS_CHECK_TIMEOUT_IN_SECONDS | -1 | Connection liveness check timeout | GMS, MAE Consumer, System Update |
Kafka Configuration
Reference Links:
- Kafka Configuration: Kafka Configuration Guide
- Confluent Cloud: Confluent Cloud Integration
- DataHub Actions: Actions Documentation
Topic Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
DATAHUB_USAGE_EVENT_NAME | DataHubUsageEvent_v1 | DataHub usage event topic name | GMS, MAE Consumer, MCE Consumer, Actions, Frontend |
Bootstrap Servers
Environment Variable | Default | Description | Components |
---|---|---|---|
KAFKA_BOOTSTRAP_SERVER | http://localhost:9092 | Kafka bootstrap servers | GMS, MAE Consumer, MCE Consumer, PE Consumer, Actions, Frontend |
Producer Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
KAFKA_PRODUCER_RETRY_COUNT | 3 | Producer retry count | GMS, MCE Consumer, System Update |
KAFKA_PRODUCER_DELIVERY_TIMEOUT | 30000 | Producer delivery timeout | GMS, MCE Consumer, System Update |
KAFKA_PRODUCER_REQUEST_TIMEOUT | 3000 | Producer request timeout | GMS, MCE Consumer, System Update |
KAFKA_PRODUCER_BACKOFF_TIMEOUT | 500 | Producer backoff timeout | GMS, MCE Consumer, System Update |
KAFKA_PRODUCER_COMPRESSION_TYPE | snappy | Producer compression algorithm | GMS, MCE Consumer, System Update |
KAFKA_PRODUCER_MAX_REQUEST_SIZE | 5242880 | Maximum bytes sent by producer | GMS, MCE Consumer, System Update |
Consumer Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
KAFKA_LISTENER_CONCURRENCY | 1 | Number of Kafka consumer threads | GMS, MAE Consumer, MCE Consumer, PE Consumer |
KAFKA_CONSUMER_MAX_PARTITION_FETCH_BYTES | 5242880 | Maximum data per partition | GMS, MAE Consumer, MCE Consumer, PE Consumer |
KAFKA_CONSUMER_STOP_ON_DESERIALIZATION_ERROR | true | Stop on deserialization error | GMS, MAE Consumer, MCE Consumer, PE Consumer |
KAFKA_CONSUMER_HEALTH_CHECK_ENABLED | true | Enable health check for consumers | GMS, MAE Consumer, MCE Consumer, PE Consumer |
KAFKA_CONSUMER_MCP_AUTO_OFFSET_RESET | earliest | MCP consumer auto offset reset | GMS, MAE Consumer, MCE Consumer, PE Consumer |
KAFKA_CONSUMER_MCL_AUTO_OFFSET_RESET | earliest | MCL consumer auto offset reset | GMS, MAE Consumer, MCE Consumer, PE Consumer |
KAFKA_CONSUMER_MCL_FINE_GRAINED_LOGGING_ENABLED | false | Enable fine-grained logging for MCL | GMS, MAE Consumer |
KAFKA_CONSUMER_MCL_ASPECTS_TO_DROP | `` | Aspects to drop for MCL | GMS, MAE Consumer |
KAFKA_CONSUMER_PE_AUTO_OFFSET_RESET | latest | PE consumer auto offset reset | GMS, PE Consumer |
KAFKA_CONSUMER_PERCENTILES | 0.5,0.95,0.99,0.999 | Consumer percentiles | GMS, MAE Consumer, MCE Consumer, PE Consumer |
KAFKA_CONSUMER_SERVICE_LEVEL_OBJECTIVES | 300,1800,3000,10800,21600,43200 | Consumer SLOs in seconds | GMS, MAE Consumer, MCE Consumer, PE Consumer |
KAFKA_CONSUMER_MAX_EXPECTED_VALUE | 86000 | Maximum expected consumer value in seconds | GMS, MAE Consumer, MCE Consumer, PE Consumer |
Consumer Pool Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
KAFKA_CONSUMER_POOL_INITIAL_SIZE | 1 | Consumer pool initial size | GMS |
KAFKA_CONSUMER_POOL_MAX_SIZE | 5 | Consumer pool maximum size | GMS |
Schema Registry Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
SCHEMA_REGISTRY_TYPE | KAFKA | Schema registry type (INTERNAL, KAFKA, or AWS_GLUE) | GMS, MAE Consumer, MCE Consumer, PE Consumer |
KAFKA_SCHEMAREGISTRY_URL | http://localhost:8081 | Schema registry URL | GMS, MAE Consumer, MCE Consumer, PE Consumer |
SCHEMA_REGISTRY_URL | http://localhost:8081 | Schema registry URL (Actions) | Actions |
AWS_GLUE_SCHEMA_REGISTRY_REGION | us-east-1 | AWS Glue schema registry region | GMS, MAE Consumer, MCE Consumer, PE Consumer |
AWS_GLUE_SCHEMA_REGISTRY_NAME | null | AWS Glue schema registry name | GMS, MAE Consumer, MCE Consumer, PE Consumer |
KAFKA_PROPERTIES_SECURITY_PROTOCOL | PLAINTEXT | Kafka security protocol | GMS, MAE Consumer, MCE Consumer, PE Consumer, Actions |
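`SCHEMA_REGISTRY_TYPE` accepts one of three values, and the other variables in this table are only meaningful for some of them. The sketch below validates the type and gathers the related settings for the Java components. Which variables apply to which type is an assumption drawn from the variable names, and `schema_registry_settings` is a hypothetical helper rather than DataHub code.

```python
ALLOWED_TYPES = ("INTERNAL", "KAFKA", "AWS_GLUE")

def schema_registry_settings(env: dict) -> dict:
    """Validate SCHEMA_REGISTRY_TYPE and collect the settings that go with it."""
    kind = env.get("SCHEMA_REGISTRY_TYPE", "KAFKA")
    if kind not in ALLOWED_TYPES:
        raise ValueError(f"SCHEMA_REGISTRY_TYPE must be one of {ALLOWED_TYPES}, got {kind!r}")

    if kind == "KAFKA":
        # Assumption: the URL is only needed for an external Kafka schema registry.
        return {"type": kind, "url": env.get("KAFKA_SCHEMAREGISTRY_URL", "http://localhost:8081")}
    if kind == "AWS_GLUE":
        # Assumption: region and name are only read for the AWS Glue registry.
        return {
            "type": kind,
            "region": env.get("AWS_GLUE_SCHEMA_REGISTRY_REGION", "us-east-1"),
            "name": env.get("AWS_GLUE_SCHEMA_REGISTRY_NAME"),
        }
    return {"type": kind}

print(schema_registry_settings({"SCHEMA_REGISTRY_TYPE": "AWS_GLUE", "AWS_GLUE_SCHEMA_REGISTRY_NAME": "datahub"}))
```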
Spring Configuration
Kafka Security
Environment Variable | Default | Description | Components |
---|---|---|---|
spring.kafka.security.protocol | PLAINTEXT | Kafka security protocol | GMS, MAE Consumer, MCE Consumer, PE Consumer |
Management & Monitoring
JMX Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
spring.jmx.enabled | true | Enable JMX | GMS, MAE Consumer, MCE Consumer, PE Consumer |
Endpoints Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
management.endpoints.web.exposure.include | prometheus,info,healthcheck,metrics | Exposed web endpoints | GMS |
management.endpoints.jmx.enabled | true | Enable JMX endpoints | GMS |
Metrics Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
management.metrics.cache.enabled | false | Enable cache metrics | GMS, MAE Consumer, MCE Consumer, PE Consumer |
management.metrics.export.jmx.enabled | true | Enable JMX metrics export | GMS, MAE Consumer, MCE Consumer, PE Consumer |
management.metrics.export.prometheus.enabled | true | Enable Prometheus metrics export | GMS, MAE Consumer, MCE Consumer, PE Consumer |
Server Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
server.server-header | false | Server header | GMS |
Feature Flags
Reference Links:
- Access Management: Access Management Feature
- Structured Properties: Structured Properties Overview
- Lineage Features: Data Lineage, UI Lineage Management
- Compliance Forms: Compliance Forms Overview
- Dataset Usage: Dataset Usage & Query History
- MCP Server: DataHub MCP Server
Environment Variable | Default | Description | Components |
---|---|---|---|
SHOW_SIMPLIFIED_HOMEPAGE_BY_DEFAULT | false | Show simplified homepage with just datasets, charts and dashboards | GMS |
LINEAGE_SEARCH_CACHE_ENABLED | true | Enable in-memory cache for searchAcrossLineage query | GMS |
GRAPH_SERVICE_DIFF_MODE_ENABLED | true | Enable diff mode for graph writes | GMS |
POINT_IN_TIME_CREATION_ENABLED | false | Enable creation of point in time snapshots for scroll API | GMS |
ALWAYS_EMIT_CHANGE_LOG | false | Always emit MCL even when no changes detected | GMS |
SEARCH_SERVICE_DIFF_MODE_ENABLED | true | Enable diff mode for search document writes | GMS |
READ_ONLY_MODE_ENABLED | false | Enable read only mode for instance | GMS |
SHOW_ACCESS_MANAGEMENT | false | Show AccessManagement tab in UI | GMS |
SHOW_SEARCH_FILTERS_V2 | true | Show search filters V2 experience | GMS |
SHOW_BROWSE_V2 | true | Show browse v2 sidebar experience | GMS |
PLATFORM_BROWSE_V2 | true | Enable platform browse experience | GMS |
LINEAGE_GRAPH_V2 | true | Enable new lineage visualization | GMS |
PRE_PROCESS_HOOKS_UI_ENABLED | true | Circumvent Kafka for UI changes | GMS |
PRE_PROCESS_HOOKS_REPROCESS_ENABLED | false | Reprocess UI sourced events asynchronously | GMS |
SHOW_ACRYL_INFO | false | Show CTAs around moving to DataHub Cloud | GMS |
ER_MODEL_RELATIONSHIP_FEATURE_ENABLED | false | Enable Join Tables Feature | GMS |
NESTED_DOMAINS_ENABLED | true | Enable nested Domains feature | GMS |
SCHEMA_FIELD_ENTITY_FETCH_ENABLED | true | Enable fetching schema field entities | GMS |
BUSINESS_ATTRIBUTE_ENTITY_ENABLED | false | Enable business attribute entity | GMS |
DATA_CONTRACTS_ENABLED | true | Enable Data Contracts feature | GMS |
ALTERNATE_MCP_VALIDATION | false | Enable alternate MCP validation flow | GMS |
THEME_V2_ENABLED | true | Allow theme v2 to be turned on | GMS |
THEME_V2_DEFAULT | true | Set default theme for users | GMS |
THEME_V2_TOGGLEABLE | true | Allow theme v2 to be toggled (Acryl only) | GMS |
SCHEMA_FIELD_CLL_ENABLED | false | Enable schema field-level lineage links | GMS |
SCHEMA_FIELD_LINEAGE_IGNORE_STATUS | true | Ignore schema field status in lineage | GMS |
SHOW_SEPARATE_SIBLINGS | false | Separate siblings with no combined view | GMS |
EDITABLE_DATASET_NAME_ENABLED | false | Enable editing dataset name in UI | GMS |
SHOW_MANAGE_STRUCTURED_PROPERTIES | true | Show manage structured properties button | GMS |
HIDE_DBT_SOURCE_IN_LINEAGE | false | Hide dbt sources in lineage | GMS |
SHOW_NAV_BAR_REDESIGN | true | Show newly designed nav bar | GMS |
SHOW_AUTO_COMPLETE_RESULTS | true | Show auto complete results in search bar | GMS |
ENTITY_VERSIONING_ENABLED | false | Enable entity versioning APIs | GMS |
SHOW_HAS_SIBLINGS_FILTER | false | Show "has siblings" filter in search | GMS |
SHOW_SEARCH_BAR_AUTOCOMPLETE_REDESIGN | false | Show redesigned search bar autocomplete | GMS |
SHOW_MANAGE_TAGS | true | Allow users to manage tags in UI | GMS |
SHOW_INTRODUCE_PAGE | true | Show introduce page in V2 UI | GMS |
SHOW_INGESTION_PAGE_REDESIGN | false | Show re-designed Ingestion page | GMS |
SHOW_LINEAGE_EXPAND_MORE | true | Show expand more button in lineage graph | GMS |
SHOW_HOME_PAGE_REDESIGN | false | Show re-designed home page | GMS |
LINEAGE_GRAPH_V3 | false | Enable redesign of lineage v2 graph | GMS |
SHOW_PRODUCT_UPDATES | true | Show in-product update popover | GMS |
LOGICAL_MODELS_ENABLED | false | Enable logical models feature | GMS |
SHOW_HOMEPAGE_USER_ROLE | false | Display homepage user role underneath name | GMS |
VIEWS_ENABLED | true | Enable views feature | GMS |
System Updates
Reference Links:
- Updating DataHub: Updating DataHub Guide
Bootstrap Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_POLICIES_FILE |
classpath:boot/policies.json |
Bootstrap policies file | GMS |
BOOTSTRAP_SERVLETS_WAITTIMEOUT |
60 |
Total waiting time for servlets to initialize | GMS |
System Update Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_INITIAL_BACK_OFF_MILLIS |
5000 |
Initial back off for system updates | System Update |
BOOTSTRAP_SYSTEM_UPDATE_MAX_BACK_OFFS |
50 |
Maximum back offs for system updates | System Update |
BOOTSTRAP_SYSTEM_UPDATE_BACK_OFF_FACTOR |
2 |
Multiplicative factor for back off | System Update |
BOOTSTRAP_SYSTEM_UPDATE_WAIT_FOR_SYSTEM_UPDATE |
true |
Wait for system update to complete | System Update |
SYSTEM_UPDATE_BOOTSTRAP_MCP_CONFIG |
bootstrap_mcps.yaml |
Bootstrap MCP configuration | System Update |
Data Job Node CLL Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_ENABLED |
false |
Enable data job node CLL | System Update |
BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_BATCH_SIZE |
1000 |
Data job node CLL batch size | System Update |
BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_DELAY_MS |
30000 |
Data job node CLL delay in milliseconds | System Update |
BOOTSTRAP_SYSTEM_UPDATE_DATA_JOB_NODE_CLL_LIMIT |
0 |
Data job node CLL limit | System Update |
Domain Description Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_DOMAIN_DESCRIPTION_ENABLED |
true |
Enable domain description updates | System Update |
BOOTSTRAP_SYSTEM_UPDATE_DOMAIN_DESCRIPTION_BATCH_SIZE |
1000 |
Domain description batch size | System Update |
BOOTSTRAP_SYSTEM_UPDATE_DOMAIN_DESCRIPTION_DELAY_MS |
30000 |
Domain description delay in milliseconds | System Update |
BOOTSTRAP_SYSTEM_UPDATE_DOMAIN_DESCRIPTION_CLL_LIMIT |
0 |
Domain description CLL limit | System Update |
Dashboard Info Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_DASHBOARD_INFO_ENABLED |
true |
Enable dashboard info updates | System Update |
BOOTSTRAP_SYSTEM_UPDATE_DASHBOARD_INFO_BATCH_SIZE |
1000 |
Dashboard info batch size | System Update |
BOOTSTRAP_SYSTEM_UPDATE_DASHBOARD_INFO_DELAY_MS |
30000 |
Dashboard info delay in milliseconds | System Update |
BOOTSTRAP_SYSTEM_UPDATE_DASHBOARD_INFO_CLL_LIMIT |
0 |
Dashboard info CLL limit | System Update |
Browse Paths V2 Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_BROWSE_PATHS_V2_ENABLED |
true |
Enable browse paths V2 updates | System Update |
BOOTSTRAP_SYSTEM_UPDATE_BROWSE_PATHS_V2_BATCH_SIZE |
5000 |
Browse paths V2 batch size | System Update |
REPROCESS_DEFAULT_BROWSE_PATHS_V2 |
false |
Reprocess default browse paths V2 | System Update |
Ingestion Indices Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_INGESTION_INDICES_ENABLED |
true |
Enable ingestion indices updates | System Update |
BOOTSTRAP_SYSTEM_UPDATE_INGESTION_INDICES_BATCH_SIZE |
5000 |
Ingestion indices batch size | System Update |
BOOTSTRAP_SYSTEM_UPDATE_INGESTION_INDICES_DELAY_MS |
1000 |
Ingestion indices delay in milliseconds | System Update |
BOOTSTRAP_SYSTEM_UPDATE_INGESTION_INDICES_CLL_LIMIT |
0 |
Ingestion indices CLL limit | System Update |
Policy Fields Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_POLICY_FIELDS_ENABLED |
true |
Enable policy fields updates | System Update |
BOOTSTRAP_SYSTEM_UPDATE_POLICY_FIELDS_BATCH_SIZE |
5000 |
Policy fields batch size | System Update |
REPROCESS_DEFAULT_POLICY_FIELDS |
false |
Reprocess default policy fields | System Update |
Ownership Types Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_OWNERSHIP_TYPES_ENABLED |
true |
Enable ownership types updates | System Update |
BOOTSTRAP_SYSTEM_UPDATE_OWNERSHIP_TYPES_BATCH_SIZE |
1000 |
Ownership types batch size | System Update |
BOOTSTRAP_SYSTEM_UPDATE_OWNERSHIP_TYPES_REPROCESS |
false |
Reprocess ownership types | System Update |
Schema Fields Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
SYSTEM_UPDATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA_ENABLED |
false |
Enable schema fields from schema metadata | System Update |
SYSTEM_UPDATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA_BATCH_SIZE |
500 |
Schema fields from schema metadata batch size | System Update |
SYSTEM_UPDATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA_DELAY_MS |
1000 |
Schema fields from schema metadata delay | System Update |
SYSTEM_UPDATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA_LIMIT |
0 |
Schema fields from schema metadata limit | System Update |
SYSTEM_UPDATE_SCHEMA_FIELDS_DOC_IDS_ENABLED |
false |
Enable schema fields doc IDs | System Update |
SYSTEM_UPDATE_SCHEMA_FIELDS_DOC_IDS_BATCH_SIZE |
500 |
Schema fields doc IDs batch size | System Update |
SYSTEM_UPDATE_SCHEMA_FIELDS_DOC_IDS_DELAY_MS |
5000 |
Schema fields doc IDs delay | System Update |
SYSTEM_UPDATE_SCHEMA_FIELDS_DOC_IDS_LIMIT |
0 |
Schema fields doc IDs limit | System Update |
Process Instance Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_ENABLED |
true |
Enable process instance has run events | System Update |
SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_BATCH_SIZE |
100 |
Process instance has run events batch size | System Update |
SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_DELAY_MS |
1000 |
Process instance has run events delay | System Update |
SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_TOTAL_DAYS |
90 |
Process instance has run events total days | System Update |
SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_WINDOW_DAYS |
1 |
Process instance has run events window days | System Update |
SYSTEM_UPDATE_PROCESS_INSTANCE_HAS_RUN_EVENTS_REPROCESS |
false |
Reprocess process instance has run events | System Update |
Edge Status Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_EDGE_STATUS_ENABLED |
false |
Enable edge status updates | System Update |
BOOTSTRAP_SYSTEM_UPDATE_EDGE_STATUS_BATCH_SIZE |
1000 |
Edge status batch size | System Update |
BOOTSTRAP_SYSTEM_UPDATE_EDGE_STATUS_DELAY_MS |
5000 |
Edge status delay in milliseconds | System Update |
BOOTSTRAP_SYSTEM_UPDATE_EDGE_STATUS_LIMIT |
0 |
Edge status limit | System Update |
Property Definitions Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_PROPERTY_DEFINITIONS_ENABLED |
true |
Enable property definitions updates | System Update |
BOOTSTRAP_SYSTEM_UPDATE_PROPERTY_DEFINITIONS_BATCH_SIZE |
500 |
Property definitions batch size | System Update |
BOOTSTRAP_SYSTEM_UPDATE_PROPERTY_DEFINITIONS_DELAY_MS |
1000 |
Property definitions delay in milliseconds | System Update |
BOOTSTRAP_SYSTEM_UPDATE_PROPERTY_DEFINITIONS_CLL_LIMIT |
0 |
Property definitions CLL limit | System Update |
Remove Query Edges Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BOOTSTRAP_SYSTEM_UPDATE_REMOVE_QUERY_EDGES_ENABLED |
true |
Enable remove query edges | System Update |
BOOTSTRAP_SYSTEM_UPDATE_REMOVE_QUERY_EDGES_RETRIES |
20 |
Remove query edges retries | System Update |
Additional Environment Variables
The following environment variables are used in the codebase but may not be explicitly defined in the application.yaml file:
Ingestion and Processing
Environment Variable | Default | Description | Components |
---|---|---|---|
ASYNC_INGEST_DEFAULT |
false |
Asynchronously process ingestProposals by writing to Kafka | GMS |
STRICT_URN_VALIDATION_ENABLED |
false |
Enable stricter URN validation logic | GMS |
DATAHUB_DATASET_URN_TO_LOWER |
null |
Convert dataset URN names to lowercase | GMS |
BUSINESS_ATTRIBUTE_ENTITY_ENABLED |
false |
Enable business attribute entity feature | GMS |
REST and Servlet Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
RESTLI_SERVLET_THREADS |
null |
Number of threads for REST servlet | GMS, MCE Consumer |
RESTLI_TIMEOUT_SECONDS |
60 |
REST timeout in seconds | GMS, MCE Consumer |
System and Version Information
Environment Variable | Default | Description | Components |
---|---|---|---|
DATAHUB_GMS_PROTOCOL |
http |
GMS protocol (http/https) | GMS |
Upgrade and Migration
Environment Variable | Default | Description | Components |
---|---|---|---|
SKIP_REINDEX_EDGE_STATUS |
false |
Skip reindexing edge status | System Update |
SKIP_REINDEX_DATA_JOB_INPUT_OUTPUT |
false |
Skip reindexing data job input/output | System Update |
SKIP_GENERATE_SCHEMA_FIELDS_FROM_SCHEMA_METADATA |
false |
Skip generating schema fields from schema metadata | System Update |
SKIP_MIGRATE_SCHEMA_FIELDS_DOC_ID |
false |
Skip migrating schema fields doc IDs | System Update |
BACKFILL_BROWSE_PATHS_V2 |
false |
Enable backfilling browse paths V2 | System Update |
READER_POOL_SIZE |
null |
Reader pool size for restore operations | System Update |
WRITER_POOL_SIZE |
null |
Writer pool size for restore operations | System Update |
OpenTelemetry Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
OTEL_METRICS_EXPORTER |
none |
OpenTelemetry metrics exporter | GMS, MAE Consumer, MCE Consumer, PE Consumer |
OTEL_TRACES_EXPORTER |
none |
OpenTelemetry traces exporter | GMS, MAE Consumer, MCE Consumer, PE Consumer |
OTEL_LOGS_EXPORTER |
none |
OpenTelemetry logs exporter | GMS, MAE Consumer, MCE Consumer, PE Consumer |
OTEL_PROPAGATORS |
null |
OpenTelemetry propagators | GMS, MAE Consumer, MCE Consumer, PE Consumer |
Secret Service Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
SECRET_SERVICE_ENCRYPTION_KEY |
ENCRYPTION_KEY |
Secret service encryption key | GMS |
SECRET_SERVICE_V1_ALGORITHM_ENABLED |
true |
Enable v1 algorithm for secret service | GMS |
Health Check Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
HEALTH_CHECK_CACHE_DURATION_SECONDS |
5 |
Health check cache duration | GMS |
Metadata Tests Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
METADATA_TESTS_ENABLED |
false |
Enable metadata tests | GMS |
Hooks Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ENABLE_SIBLING_HOOK |
true |
Enable automatic sibling associations | GMS, MAE Consumer |
SIBLINGS_HOOK_CONSUMER_GROUP_SUFFIX |
`` | Siblings hook consumer group suffix | GMS, MAE Consumer |
ENABLE_UPDATE_INDICES_HOOK |
true |
Enable update indices hook | GMS, MAE Consumer |
UPDATE_INDICES_CONSUMER_GROUP_SUFFIX |
`` | Update indices consumer group suffix | GMS, MAE Consumer |
ENABLE_INGESTION_SCHEDULER_HOOK |
true |
Enable ingestion scheduling | GMS, MAE Consumer |
INGESTION_SCHEDULER_HOOK_CONSUMER_GROUP_SUFFIX |
`` | Ingestion scheduler hook consumer group suffix | GMS, MAE Consumer |
ENABLE_INCIDENTS_HOOK |
true |
Enable incidents hook | GMS, MAE Consumer |
MAX_INCIDENT_HISTORY |
100 |
Maximum incident history | GMS, MAE Consumer |
INCIDENTS_HOOK_CONSUMER_GROUP_SUFFIX |
`` | Incidents hook consumer group suffix | GMS, MAE Consumer |
ENABLE_STRUCTURED_PROPERTIES_HOOK |
true |
Enable structured properties mappings | GMS, MAE Consumer |
ENABLE_STRUCTURED_PROPERTIES_WRITE |
true |
Enable writing structured property values | GMS, MAE Consumer |
ENABLE_STRUCTURED_PROPERTIES_SYSTEM_UPDATE |
false |
Enable structured property mappings in system update | GMS, MAE Consumer |
ENABLE_ENTITY_CHANGE_EVENTS_HOOK |
true |
Enable entity change events hook | GMS, MAE Consumer |
ECE_CONSUMER_GROUP_SUFFIX |
`` | Entity change events consumer group suffix | GMS, MAE Consumer |
ECE_ENTITY_EXCLUSIONS |
schemaField |
Entities to exclude from ECE hook | GMS, MAE Consumer |
FORMS_HOOK_ENABLED |
true |
Enable forms hook | GMS, MAE Consumer |
FORMS_HOOK_CONSUMER_GROUP_SUFFIX |
`` | Forms hook consumer group suffix | GMS, MAE Consumer |
Search and API Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
SEARCH_BAR_API_VARIANT |
AUTOCOMPLETE_FOR_MULTIPLE |
Search bar API variant | Frontend |
FIRST_IN_PERSONAL_SIDEBAR |
YOUR_ASSETS |
First item in personal sidebar | Frontend |
Client Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ENTITY_CLIENT_RETRY_INTERVAL |
2 |
Entity client retry interval | GMS |
ENTITY_CLIENT_NUM_RETRIES |
3 |
Entity client number of retries | GMS |
ENTITY_CLIENT_JAVA_GET_BATCH_SIZE |
375 |
Entity client Java get batch size | GMS |
ENTITY_CLIENT_JAVA_INGEST_BATCH_SIZE |
375 |
Entity client Java ingest batch size | GMS |
ENTITY_CLIENT_RESTLI_GET_BATCH_SIZE |
100 |
Entity client RESTli get batch size | GMS, MAE Consumer, PE Consumer |
ENTITY_CLIENT_RESTLI_GET_BATCH_CONCURRENCY |
2 |
Entity client RESTli get batch concurrency | GMS, MAE Consumer, PE Consumer |
ENTITY_CLIENT_RESTLI_GET_BATCH_QUEUE_SIZE |
500 |
Entity client RESTli get batch queue size | GMS, MAE Consumer, PE Consumer |
ENTITY_CLIENT_RESTLI_GET_BATCH_THREAD_KEEP_ALIVE |
60 |
Entity client RESTli get batch thread keep alive | GMS, MAE Consumer, PE Consumer |
ENTITY_CLIENT_RESTLI_INGEST_BATCH_SIZE |
50 |
Entity client RESTli ingest batch size | GMS, MAE Consumer, PE Consumer |
ENTITY_CLIENT_RESTLI_INGEST_BATCH_CONCURRENCY |
2 |
Entity client RESTli ingest batch concurrency | GMS, MAE Consumer, PE Consumer |
ENTITY_CLIENT_RESTLI_INGEST_BATCH_QUEUE_SIZE |
500 |
Entity client RESTli ingest batch queue size | GMS, MAE Consumer, PE Consumer |
ENTITY_CLIENT_RESTLI_INGEST_BATCH_THREAD_KEEP_ALIVE |
60 |
Entity client RESTli ingest batch thread keep alive | GMS, MAE Consumer, PE Consumer |
USAGE_CLIENT_RETRY_INTERVAL |
2 |
Usage client retry interval | GMS, MAE Consumer, PE Consumer |
USAGE_CLIENT_NUM_RETRIES |
0 |
Usage client number of retries | GMS, MAE Consumer, PE Consumer |
USAGE_CLIENT_TIMEOUT_MS |
3000 |
Usage client timeout in milliseconds | GMS, MAE Consumer, PE Consumer |
Cache Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
CACHE_TTL_SECONDS |
600 |
Default cache time to live | GMS |
CACHE_MAX_SIZE |
10000 |
Maximum number of items to cache | GMS |
CACHE_ENTITY_COUNTS_TTL_SECONDS |
600 |
Homepage entity count time to live | GMS |
CACHE_SEARCH_LINEAGE_TTL_SECONDS |
86400 |
Search lineage cache time to live | GMS |
CACHE_SEARCH_LINEAGE_LIGHTNING_THRESHOLD |
300 |
Lineage graphs exceeding this limit will use local cache | GMS |
CACHE_CLIENT_USAGE_CLIENT_ENABLED |
true |
Enable usage client cache | GMS, MAE Consumer, PE Consumer |
CACHE_CLIENT_USAGE_CLIENT_STATS_ENABLED |
true |
Enable usage client cache stats | GMS, MAE Consumer, PE Consumer |
CACHE_CLIENT_USAGE_CLIENT_STATS_INTERVAL_SECONDS |
120 |
Usage client cache stats interval | GMS, MAE Consumer, PE Consumer |
CACHE_CLIENT_USAGE_CLIENT_TTL_SECONDS |
86400 |
Usage client cache TTL | GMS, MAE Consumer, PE Consumer |
CACHE_CLIENT_USAGE_CLIENT_MAX_BYTES |
52428800 |
Usage client cache max bytes (50MB) | GMS, MAE Consumer, PE Consumer |
CACHE_CLIENT_ENTITY_CLIENT_ENABLED |
true |
Enable entity client cache | GMS, MAE Consumer, PE Consumer |
CACHE_CLIENT_ENTITY_CLIENT_STATS_ENABLED |
true |
Enable entity client cache stats | GMS, MAE Consumer, PE Consumer |
CACHE_CLIENT_ENTITY_CLIENT_STATS_INTERVAL_SECONDS |
120 |
Entity client cache stats interval | GMS, MAE Consumer, PE Consumer |
CACHE_CLIENT_ENTITY_CLIENT_TTL_SECONDS |
0 |
Entity client cache TTL (0 = no cache) | GMS, MAE Consumer, PE Consumer |
CACHE_CLIENT_ENTITY_CLIENT_MAX_BYTES |
104857600 |
Entity client cache max bytes (100MB) | GMS, MAE Consumer, PE Consumer |
GraphQL Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
GRAPHQL_CONCURRENCY_SEPARATE_THREAD_POOL |
false |
Enable separate thread pool for GraphQL | GMS |
GRAPHQL_CONCURRENCY_STACK_SIZE |
256000 |
GraphQL thread pool stack size | GMS |
GRAPHQL_CONCURRENCY_CORE_POOL_SIZE |
-1 |
GraphQL core pool size (default 5 * cores) | GMS |
GRAPHQL_CONCURRENCY_MAX_POOL_SIZE |
-1 |
GraphQL max pool size (default 100 * cores) | GMS |
GRAPHQL_CONCURRENCY_KEEP_ALIVE |
60 |
GraphQL thread keep alive time | GMS |
GRAPHQL_QUERY_COMPLEXITY_LIMIT |
2000 |
GraphQL query complexity limit | GMS |
GRAPHQL_QUERY_DEPTH_LIMIT |
50 |
GraphQL query depth limit | GMS |
GRAPHQL_QUERY_INTROSPECTION_ENABLED |
true |
Enable GraphQL introspection | GMS |
GRAPHQL_METRICS_ENABLED |
true |
Enable GraphQL metrics collection | GMS |
GRAPHQL_PERCENTILES |
0.5,0.75,0.95,0.98,0.99,0.999 |
GraphQL percentiles | GMS |
GRAPHQL_METRICS_FIELD_LEVEL_ENABLED |
false |
Enable field-level GraphQL metrics | GMS |
GRAPHQL_METRICS_FIELD_LEVEL_OPERATIONS |
getSearchResultsForMultiple,searchAcrossLineageStructure |
GraphQL field-level operations | GMS |
GRAPHQL_METRICS_FIELD_LEVEL_PATH_ENABLED |
false |
Include field path in GraphQL metrics | GMS |
GRAPHQL_METRICS_FIELD_LEVEL_PATHS |
`` | GraphQL field-level paths | GMS |
GRAPHQL_METRICS_TRIVIAL_DATA_FETCHERS_ENABLED |
false |
Include trivial data fetchers in GraphQL metrics | GMS |
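The `-1` defaults for `GRAPHQL_CONCURRENCY_CORE_POOL_SIZE` and `GRAPHQL_CONCURRENCY_MAX_POOL_SIZE` act as sentinels for a size derived from the CPU count (5 × cores and 100 × cores, per the descriptions above). A minimal Python sketch of that resolution follows. It assumes that any negative value selects the core-based default. The helper name `resolve_pool_size` is hypothetical and is not part of DataHub.

```python
import os
from typing import Optional


def resolve_pool_size(configured: int, per_core: int, cores: Optional[int] = None) -> int:
    """Return the configured size, or per_core * cores when the value is negative.

    Assumption: a negative value (such as the -1 default) selects the
    core-based default described in the table.
    """
    if cores is None:
        cores = os.cpu_count() or 1
    return configured if configured >= 0 else per_core * cores


# Hypothetical 8-core host, reading the raw values from the environment.
core_size = resolve_pool_size(
    int(os.environ.get("GRAPHQL_CONCURRENCY_CORE_POOL_SIZE", "-1")), per_core=5, cores=8
)
max_size = resolve_pool_size(
    int(os.environ.get("GRAPHQL_CONCURRENCY_MAX_POOL_SIZE", "-1")), per_core=100, cores=8
)
print(core_size, max_size)
```

Setting either variable to an explicit non-negative number bypasses the core-based calculation in this sketch.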
Chrome Extension Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
CHROME_EXTENSION_ENABLED |
true |
Enable Chrome extension | Frontend |
CHROME_EXTENSION_LINEAGE_ENABLED |
true |
Enable Chrome extension lineage | Frontend |
Business Attribute Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
BUSINESS_ATTRIBUTE_RELATED_ENTITIES_COUNT |
20000 |
Business attribute related entities count | GMS |
BUSINESS_ATTRIBUTE_RELATED_ENTITIES_BATCH_SIZE |
1000 |
Business attribute related entities batch size | GMS |
BUSINESS_ATTRIBUTE_PROPAGATION_CONCURRENCY_THREAD_COUNT |
-1 |
Business attribute propagation thread count (default 2 * cores) | GMS |
BUSINESS_ATTRIBUTE_PROPAGATION_CONCURRENCY_KEEP_ALIVE |
60 |
Business attribute propagation keep alive time | GMS |
Metadata Change Proposal Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
MCP_CONSUMER_BATCH_ENABLED |
false |
Enable MCP consumer batch processing | GMS, MCE Consumer |
MCP_CONSUMER_BATCH_SIZE |
15744000 |
MCP consumer batch size | GMS, MCE Consumer |
MCP_VALIDATION_IGNORE_UNKNOWN |
true |
Ignore unknown fields in MCP validation | GMS, MCE Consumer |
MCP_VALIDATION_PRIVILEGE_CONSTRAINTS |
true |
Enable privilege constraints in MCP validation | GMS, MCE Consumer |
MCP_VALIDATION_EXTENSIONS_ENABLED |
false |
Enable extensions in MCP validation | GMS, MCE Consumer |
MCP_SIDE_EFFECTS_SCHEMA_FIELD_ENABLED |
false |
Enable schema field side effects | GMS, MCE Consumer |
MCP_SIDE_EFFECTS_DATA_PRODUCT_UNSET_ENABLED |
true |
Enable data product unset side effects | GMS, MCE Consumer |
MCP_THROTTLE_UPDATE_INTERVAL_MS |
60000 |
MCP throttle update interval | GMS, MCE Consumer |
MCP_MCE_CONSUMER_THROTTLE_ENABLED |
false |
Enable MCE consumer throttling | GMS, MCE Consumer |
MCP_API_REQUESTS_THROTTLE_ENABLED |
false |
Enable API requests throttling | GMS, MCE Consumer |
MCP_VERSIONED_THROTTLE_ENABLED |
false |
Enable versioned MCL topic throttling | GMS, MCE Consumer |
MCP_VERSIONED_THRESHOLD |
4000 |
Versioned throttle threshold | GMS, MCE Consumer |
MCP_VERSIONED_MAX_ATTEMPTS |
1000 |
Versioned max attempts | GMS, MCE Consumer |
MCP_VERSIONED_INITIAL_INTERVAL_MS |
100 |
Versioned initial interval | GMS, MCE Consumer |
MCP_VERSIONED_MULTIPLIER |
10 |
Versioned multiplier | GMS, MCE Consumer |
MCP_VERSIONED_MAX_INTERVAL_MS |
30000 |
Versioned max interval | GMS, MCE Consumer |
MCP_TIMESERIES_THROTTLE_ENABLED |
false |
Enable timeseries MCL topic throttling | GMS, MCE Consumer |
MCP_TIMESERIES_THRESHOLD |
4000 |
Timeseries throttle threshold | GMS, MCE Consumer |
MCP_TIMESERIES_MAX_ATTEMPTS |
1000 |
Timeseries max attempts | GMS, MCE Consumer |
MCP_TIMESERIES_INITIAL_INTERVAL_MS |
100 |
Timeseries initial interval | GMS, MCE Consumer |
MCP_TIMESERIES_MULTIPLIER |
10 |
Timeseries multiplier | GMS, MCE Consumer |
MCP_TIMESERIES_MAX_INTERVAL_MS |
30000 |
Timeseries max interval | GMS, MCE Consumer |
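The versioned and timeseries throttle settings above (initial interval, multiplier, max interval, max attempts) read like the parameters of a capped exponential backoff. The Python sketch below shows how such a schedule would grow under that assumption. It is not taken from the DataHub implementation, and `backoff_intervals_ms` is a hypothetical helper.

```python
from itertools import islice
from typing import Iterator


def backoff_intervals_ms(
    initial_ms: int, multiplier: float, max_interval_ms: int, max_attempts: int
) -> Iterator[int]:
    """Yield one wait time per attempt, growing geometrically up to a cap."""
    interval = float(initial_ms)
    for _ in range(max_attempts):
        yield int(min(interval, max_interval_ms))
        # Assumed growth rule: multiply, then clamp to the configured maximum.
        interval = min(interval * multiplier, max_interval_ms)


# Defaults from the table: 100 ms initial, x10 multiplier, 30000 ms cap, 1000 attempts.
first_five = list(islice(backoff_intervals_ms(100, 10, 30000, 1000), 5))
print(first_five)  # [100, 1000, 10000, 30000, 30000]
```

With these defaults the wait reaches the 30-second cap by the fourth attempt, so under this reading most of the 1000 attempts would wait the maximum interval.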
Events API Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
EVENTS_API_ENABLED |
true |
Enable events API | GMS |
Iceberg Catalog Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ENABLE_PUBLIC_READ |
false |
Enable public read for Iceberg catalog | GMS |
PUBLICLY_READABLE_TAG |
PUBLICLY_READABLE |
Publicly readable tag for Iceberg catalog | GMS |
Component Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
MCP_CONSUMER_ENABLED |
true |
When running in standalone mode, disable on GMS and enable on the separate MCE Consumer. |
GMS, MCE Consumer |
MCL_CONSUMER_ENABLED |
true |
When running in standalone mode, disable on GMS and enable on the separate MAE Consumer. |
GMS, MAE Consumer |
PE_CONSUMER_ENABLED |
true |
When running in standalone mode, disable on GMS and enable on the separate PE Consumer. |
GMS, PE Consumer |
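The three flags in this table are meant to be flipped together when consumers run as standalone containers: off on GMS, on in the dedicated consumer. The Python sketch below builds the per-component values under that reading. The component labels (`gms`, `mce-consumer`, `mae-consumer`, `pe-consumer`) are hypothetical names for illustration and do not refer to actual service or chart names.

```python
from typing import Dict

# Consumer flag -> the standalone component assumed to take over that consumer.
CONSUMER_FLAGS = {
    "MCP_CONSUMER_ENABLED": "mce-consumer",
    "MCL_CONSUMER_ENABLED": "mae-consumer",
    "PE_CONSUMER_ENABLED": "pe-consumer",
}


def standalone_env() -> Dict[str, Dict[str, str]]:
    """Build per-component flag values for a standalone-consumer deployment."""
    env: Dict[str, Dict[str, str]] = {"gms": {}}
    for flag, owner in CONSUMER_FLAGS.items():
        env["gms"][flag] = "false"                # GMS stops running this consumer
        env.setdefault(owner, {})[flag] = "true"  # the dedicated container runs it
    return env


for component, flags in standalone_env().items():
    print(component, flags)
```

Leaving all three flags at their default of `true` on GMS corresponds to the embedded mode, where GMS runs the consumers itself.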
DataHub Frontend
Play Framework Configuration
Secret Key Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
DATAHUB_SECRET |
null |
Secret key used to secure cryptographic functions | Frontend |
HTTP Parser Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
DATAHUB_PLAY_MEM_BUFFER_SIZE |
10MB |
Maximum memory buffer size for HTTP parser | Frontend |
Server Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
DATAHUB_AKKA_MAX_HEADER_COUNT |
64 |
Maximum number of headers allowed | Frontend |
DATAHUB_AKKA_MAX_HEADER_VALUE_LENGTH |
32k |
Maximum header value length | Frontend |
Session Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
AUTH_COOKIE_SAME_SITE |
LAX |
SameSite attribute for authentication cookies | Frontend |
AUTH_COOKIE_SECURE |
false |
Whether authentication cookies should be secure | Frontend |
Authentication Configuration
OIDC Configuration
Reference Links:
- OIDC Setup Guide: Configure OIDC Authentication
- OIDC Prerequisites: Initialize OIDC
Required OIDC Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
AUTH_OIDC_ENABLED |
false |
Enable OIDC authentication | Frontend |
AUTH_OIDC_CLIENT_ID |
null |
Unique client ID issued by the identity provider | Frontend |
AUTH_OIDC_CLIENT_SECRET |
null |
Unique client secret issued by the identity provider | Frontend |
AUTH_OIDC_DISCOVERY_URI |
null |
The IdP OIDC discovery URL | Frontend |
AUTH_OIDC_BASE_URL |
null |
The base URL associated with your DataHub deployment | Frontend |
Optional OIDC Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
AUTH_OIDC_USER_NAME_CLAIM |
preferred_username |
The attribute/claim used to derive the DataHub username | Frontend |
AUTH_OIDC_USER_NAME_CLAIM_REGEX |
(.*) |
The regex used to parse the DataHub username from the user name claim | Frontend |
AUTH_OIDC_SCOPE |
oidc email profile |
String representing the requested scope from the IdP | Frontend |
AUTH_OIDC_CLIENT_AUTHENTICATION_METHOD |
client_secret_basic |
Authentication method to pass credentials to token endpoint | Frontend |
AUTH_OIDC_JIT_PROVISIONING_ENABLED |
true |
Whether DataHub users should be provisioned on login if they don't exist | Frontend |
AUTH_OIDC_PRE_PROVISIONING_REQUIRED |
false |
Whether the user should already exist in DataHub on login | Frontend |
AUTH_OIDC_EXTRACT_GROUPS_ENABLED |
true |
Whether groups should be extracted from a claim in the OIDC profile | Frontend |
AUTH_OIDC_GROUPS_CLAIM |
groups |
The OIDC claim to extract groups information from | Frontend |
AUTH_OIDC_RESPONSE_TYPE |
null |
OIDC response type | Frontend |
AUTH_OIDC_RESPONSE_MODE |
null |
OIDC response mode | Frontend |
AUTH_OIDC_USE_NONCE |
null |
Whether to use nonce in OIDC flow | Frontend |
AUTH_OIDC_CUSTOM_PARAM_RESOURCE |
null |
Custom resource parameter for OIDC | Frontend |
AUTH_OIDC_READ_TIMEOUT |
null |
OIDC read timeout | Frontend |
AUTH_OIDC_CONNECT_TIMEOUT |
null |
OIDC connect timeout | Frontend |
AUTH_OIDC_EXTRACT_JWT_ACCESS_TOKEN_CLAIMS |
false |
Whether to extract claims from JWT access token | Frontend |
AUTH_OIDC_PREFERRED_JWS_ALGORITHM |
null |
Which JWS algorithm to use | Frontend |
AUTH_OIDC_ACR_VALUES |
null |
OIDC ACR values | Frontend |
AUTH_OIDC_GRANT_TYPE |
null |
OIDC grant type | Frontend |
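`AUTH_OIDC_USER_NAME_CLAIM` and `AUTH_OIDC_USER_NAME_CLAIM_REGEX` work as a pair: the first names the claim to read, the second parses the username out of its value. The Python sketch below illustrates the idea, assuming the first capture group of the regex becomes the username. The function `extract_username` and the sample claims are hypothetical, and the frontend's actual matching rules may differ.

```python
import re
from typing import Mapping, Optional


def extract_username(
    claims: Mapping[str, str],
    claim: str = "preferred_username",
    pattern: str = "(.*)",
) -> Optional[str]:
    """Return the first capture group of `pattern` matched against the claim value."""
    value = claims.get(claim)
    if value is None:
        return None
    match = re.search(pattern, value)
    return match.group(1) if match else None


# Sample claims for illustration only.
claims = {"preferred_username": "jdoe", "email": "jdoe@example.com"}

print(extract_username(claims))                       # jdoe
print(extract_username(claims, "email", r"([^@]+)"))  # jdoe
```

The default `(.*)` keeps the whole claim value, while a pattern such as `([^@]+)` would drop the domain from an email-style claim.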
Authentication Methods Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
AUTH_JAAS_ENABLED |
true |
Enable JAAS authentication | Frontend |
AUTH_NATIVE_ENABLED |
true |
Enable native authentication | Frontend |
GUEST_AUTHENTICATION_ENABLED |
false |
Enable guest authentication | Frontend |
GUEST_AUTHENTICATION_USER |
guest |
The name of the guest user ID | Frontend |
GUEST_AUTHENTICATION_PATH |
null |
The path to bypass login page and get logged in as guest | Frontend |
ENFORCE_VALID_EMAIL |
true |
Enforce the usage of a valid email for user sign up | Frontend |
Authentication Logging
Environment Variable | Default | Description | Components |
---|---|---|---|
AUTH_VERBOSE_LOGGING |
false |
Enable verbose authentication logging | Frontend |
Session Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
AUTH_SESSION_TTL_HOURS |
24 |
Login session expiration time in hours | Frontend |
MAX_SESSION_TOKEN_AGE |
24h |
Maximum age of session token | Frontend |
Metadata Service Configuration
Connection Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
DATAHUB_GMS_HOST |
localhost |
Metadata service host | Frontend |
DATAHUB_GMS_PORT |
8080 |
Metadata service port | Frontend |
DATAHUB_GMS_USE_SSL |
false |
Whether to use SSL for metadata service connection | Frontend |
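Taken together, the three connection variables describe the base URL that the frontend uses to reach the metadata service. The Python sketch below shows one way they could combine, assuming `DATAHUB_GMS_USE_SSL` only switches the scheme between `http` and `https`. The helper `gms_base_url` is hypothetical.

```python
import os
from typing import Mapping


def gms_base_url(env: Mapping[str, str] = os.environ) -> str:
    """Assemble scheme://host:port from the connection variables and their defaults."""
    host = env.get("DATAHUB_GMS_HOST", "localhost")
    port = env.get("DATAHUB_GMS_PORT", "8080")
    use_ssl = env.get("DATAHUB_GMS_USE_SSL", "false").strip().lower() == "true"
    scheme = "https" if use_ssl else "http"
    return f"{scheme}://{host}:{port}"


# With nothing set, the defaults from the table apply.
print(gms_base_url({}))  # http://localhost:8080
```

Passing a plain dictionary instead of `os.environ` makes it easy to check a planned configuration before deploying it.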
Authentication Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
METADATA_SERVICE_AUTH_ENABLED |
false |
Enable metadata service authentication | Frontend |
DATAHUB_SYSTEM_CLIENT_SECRET |
JohnSnowKnowsNothing |
System client secret for metadata service | Frontend |
Entity Client Configuration
Environment Variable | Default | Description | Components |
---|---|---|---|
ENTITY_CLIENT_RETRY_INTERVAL |
2 |
Entity client retry interval | Frontend |
ENTITY_CLIENT_NUM_RETRIES |
3 |
Entity client number of retries | Frontend |
ENTITY_CLIENT_RESTLI_GET_BATCH_SIZE |
50 |
Entity client RESTli get batch size | Frontend |
ENTITY_CLIENT_RESTLI_GET_BATCH_CONCURRENCY |
2 |
Entity client RESTli get batch concurrency | Frontend |
Notes
- Environment variables follow the pattern of converting YAML property paths to uppercase with underscores
- Default values are shown in the tables above
- For Kafka configuration, refer to the official Spring Kafka documentation for additional properties
- Feature flags control experimental or optional functionality
- System update configurations control various background maintenance tasks
- Cache configurations help optimize performance for different use cases
- GraphQL configurations control query complexity and performance monitoring
- OpenTelemetry variables control observability and tracing behavior
- Play Framework properties are converted to environment variables by:
  - Converting dots (`.`) to underscores (`_`)
  - Converting to uppercase
- Converting dots (