Work with Acryl to receive deployment templates specific to your environment (Helm charts, CloudFormation, or Terraform) for deploying Remote Executors in this Pool.
:::
<Tabs>
<TabItem value="ecs" label="Amazon ECS">
### Deploy on Amazon ECS
1. **AWS Account Configuration**
To access the private Acryl ECR registry, you'll need to provide your AWS account ID to Acryl. You can securely share your account ID through:
- Your Acryl representative
- A secure secret-sharing service like [One Time Secret](https://onetimesecret.com/)
This step is required to grant your AWS account access to pull the Remote Executor container image.
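
If you are unsure which account ID to share, one way to look it up is with the AWS CLI. This is a minimal sketch that assumes the CLI is installed and configured with credentials for the account that will run the executor; the function name is only for illustration.

```bash
# Print the 12-digit account ID of the credentials the AWS CLI is currently using
print_aws_account_id() {
  aws sts get-caller-identity --query Account --output text
}
```

Run `print_aws_account_id` in a shell that has the function defined and share the printed value through one of the channels above.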
The Acryl team will provide a [CloudFormation template](https://raw.githubusercontent.com/acryldata/datahub-cloudformation/master/remote-executor/datahub-executor.ecs.template.yaml) that you can run to provision an ECS cluster with a single remote ingestion task. It also provisions an AWS role for the task, which grants the permissions necessary to read and delete from the private queue created for you and to read the secrets you've specified. At minimum, the template requires the following parameters:
- Optional: Acryl Remote Executor Version; defaults to latest
Optional parameters:
- Source Secrets: `SECRET_NAME=SECRET_ARN` (up to 10); separate multiple secrets with commas, e.g. `SECRET_NAME_1=SECRET_ARN_1,SECRET_NAME_2=SECRET_ARN_2`.
- Environment Variables: `ENV_VAR_NAME=ENV_VAR_VALUE` (up to 10); separate multiple variables with commas, e.g. `ENV_VAR_NAME_1=ENV_VAR_VALUE_1,ENV_VAR_NAME_2=ENV_VAR_VALUE_2`.
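
As a small illustration of the expected format, the following sketch joins `NAME=ARN` pairs into the comma-separated string that the Source Secrets parameter takes and checks the 10-entry limit. The secret names, account ID, and ARNs are made up for the example; the same pattern applies to the Environment Variables parameter.

```bash
# Hypothetical secret names and ARNs, for illustration only
set -- \
  "REDSHIFT_PASSWORD=arn:aws:secretsmanager:us-west-2:123456789012:secret:redshift-password" \
  "SNOWFLAKE_PASSWORD=arn:aws:secretsmanager:us-west-2:123456789012:secret:snowflake-password"

# The template accepts at most 10 entries per parameter
if [ "$#" -gt 10 ]; then
  echo "too many entries: $# (limit is 10)" >&2
  exit 1
fi

# Join the entries with commas: NAME_1=ARN_1,NAME_2=ARN_2
source_secrets=$(IFS=,; printf '%s' "$*")
echo "$source_secrets"
```

The printed string can be pasted into the corresponding template parameter as-is.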
:::note
Configuring Secrets enables you to manage ingestion sources from the DataHub UI without storing credentials inside DataHub. Once defined, secrets can be referenced by name inside of your DataHub Ingestion Source configurations using the usual convention: `${SECRET_NAME}`.
To update your Remote Executor deployment (e.g., to deploy a new container version or modify configuration), you'll need to update your existing CloudFormation Stack. This process involves re-deploying the CloudFormation template with your updated parameters while preserving your existing resources.
The update process will maintain your existing resources (e.g., secrets, IAM roles) while deploying the new configuration. Monitor the stack events to track the update progress.
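
If you manage the stack from the command line rather than the AWS console, the update can be sketched with the AWS CLI as follows. The `ImageTag` and `OtherParameter` keys are placeholders, not names taken from the template; check the template's `Parameters` section for the real keys. `--use-previous-template` assumes the template itself is unchanged; pass `--template-url` or `--template-body` instead if you have received a newer template.

```bash
# Update one parameter of an existing stack and wait for the result.
# The parameter keys below are hypothetical; substitute the template's real keys.
update_executor_stack() {
  stack_name=$1
  new_version=$2

  # Parameters left out of --parameters fall back to their template defaults,
  # so list every parameter you want to keep with UsePreviousValue=true.
  aws cloudformation update-stack \
    --stack-name "$stack_name" \
    --use-previous-template \
    --parameters \
      "ParameterKey=ImageTag,ParameterValue=$new_version" \
      "ParameterKey=OtherParameter,UsePreviousValue=true" \
    --capabilities CAPABILITY_NAMED_IAM || return 1

  # Block until the update finishes (non-zero exit if it fails or rolls back)
  aws cloudformation wait stack-update-complete --stack-name "$stack_name"
}
```

While the update runs, `aws cloudformation describe-stack-events --stack-name <stack-name>` lists the same stack events that the console shows.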
</TabItem>
<TabItem value="kubernetes" label="Kubernetes">
The [datahub-executor-worker](https://executor-helm.acryl.io/index.yaml) Helm chart provides a streamlined way to deploy Remote Executors on any Kubernetes cluster, including Amazon EKS and Google GKE.
1. **Registry Access Configuration**
To access the private Acryl container registry, you'll need to work with your Acryl representative to set up the necessary permissions:
- For AWS EKS: Provide the IAM principal that will pull from the ECR repository
- For Google Cloud: Provide the cluster's IAM service account
- For other platforms: Contact Acryl for specific requirements
2. **Configure Secrets**
Create the required secrets in your Kubernetes cluster:
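```bash
# Create DataHub PAT secret (required)
# Generate token from Settings > Access Tokens in DataHub UI
# <secret-name> and <key> are placeholders (assumed; use the names your chart values expect)
kubectl create secret generic <secret-name> --from-literal=<key>=<DATAHUB-ACCESS-TOKEN>
```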
- `global.datahub.executor.pool_id`: Your Executor Pool ID
- `global.datahub.gms.url`: Your DataHub Cloud URL (must include `/gms`)
- `image.tag`: Acryl Remote Executor version
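
Assuming the chart is served from the repository behind the index linked above, an install that sets these three values could look like the sketch below. The repository alias `acryl`, the release name, and the namespace are arbitrary choices for the example, and the three arguments stand for your own Pool ID, URL, and version.

```bash
# Install or upgrade the datahub-executor-worker chart with the three values above.
# Repo alias "acryl", release "datahub-executor", and the namespace are illustrative.
install_executor() {
  pool_id=$1      # your Executor Pool ID
  gms_url=$2      # your DataHub Cloud URL, including /gms
  image_tag=$3    # Acryl Remote Executor version

  helm repo add acryl https://executor-helm.acryl.io &&
  helm repo update &&
  helm upgrade --install datahub-executor acryl/datahub-executor-worker \
    --namespace datahub-executor --create-namespace \
    --set "global.datahub.executor.pool_id=$pool_id" \
    --set "global.datahub.gms.url=$gms_url" \
    --set "image.tag=$image_tag"
}
```

The secrets from the previous step need to exist in the same namespace as the release. Keeping the values in a file passed with `-f` is an equally valid choice and easier to review in version control.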
4. **Configure Secret Mounting (Optional)**
Starting from DataHub Cloud v0.3.8.2, you can manage secrets using Kubernetes Secret CRDs. This enables runtime secret updates without executor restarts.
- Default mount path: `/mnt/secrets` (override with `DATAHUB_EXECUTOR_FILE_SECRET_BASEDIR`)
- Default file size limit: 1MB (override with `DATAHUB_EXECUTOR_FILE_SECRET_MAXLEN`)
- Reference secrets in ingestion recipes using `${SECRET_NAME}` syntax
:::
Example ingestion recipe using mounted secrets:
```yaml
source:
type: redshift
config:
host_port: '<redshift-host:port>'
username: connector_test
password: '${REDSHIFT_PASSWORD}'
# ... other configuration ...
```
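
To make the lookup convention concrete, here is a local simulation of how a reference such as `${REDSHIFT_PASSWORD}` could map to a file under the mount path. It relies on the fact that Kubernetes projects each key of a mounted Secret as a file named after the key. The lookup function itself is an illustration, not the executor's actual implementation, and it assumes the 1MB limit means 1024 * 1024 bytes.

```bash
# Stand-in for the mounted volume; inside an executor pod the default is /mnt/secrets
basedir=$(mktemp -d)
maxlen=$((1024 * 1024))   # assumed interpretation of the 1MB default

# Simulate one mounted Secret key (hypothetical value)
printf '%s' 'example-password' > "$basedir/REDSHIFT_PASSWORD"

# Look up a secret by name, mirroring a ${SECRET_NAME} reference in a recipe
resolve_secret() {
  path="$basedir/$1"
  [ -f "$path" ] || { echo "secret $1 not found" >&2; return 1; }
  size=$(( $(wc -c < "$path") ))
  [ "$size" -le "$maxlen" ] || { echo "secret $1 exceeds $maxlen bytes" >&2; return 1; }
  cat "$path"
}

resolve_secret REDSHIFT_PASSWORD
```

Because the value is read from the file at lookup time, replacing the file content changes what the next lookup returns, which matches the idea of updating secrets without restarting the executor.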
For additional configuration options, refer to the [values.yaml](https://github.com/acryldata/datahub-executor-helm/blob/main/charts/datahub-executor-worker/values.yaml) file in the Helm chart repository.
</TabItem>
</Tabs>
Once you have successfully deployed the Executor in your environment, DataHub will automatically begin reporting Executor Status in the UI.
## Assigning Ingestion Sources to an Executor Pool
After you have created an Executor Pool and deployed the Executor in your environment, you can configure an Ingestion Source to run in that Pool.
1. Navigate to **Manage Data Sources** in DataHub Cloud
2. Edit an existing Source or click **Create new source**
3. In the **Finish Up** step, expand the **Advanced** section to select your desired **Executor Pool**
New Ingestion Sources will automatically use your designated Default Pool if you have assigned one. You can override this assignment when creating or editing an Ingestion Source at any time.
## Advanced: Performance Settings and Task Weight-Based Queuing
Executors use a weight-based queuing system to manage resource allocation efficiently:
- **Default Behavior**: With 4 ingestion threads (default), each task gets a weight of 0.25, allowing up to 4 parallel tasks
- **Resource-Intensive Tasks**: Tasks can be assigned a higher weight (up to 1.0) to limit parallelism
- **Queue Management**: If starting a new task would push the total weight of running tasks above 1.0, the task is queued until capacity becomes available
- **Priority Tasks**: Setting a weight of 1.0 ensures exclusive resource access - the task will run alone until completion
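
The admission rule described above can be sketched as a small simulation. This is not the executor's scheduler: it only models the "total weight must stay at or below 1.0" check, scales weights by 100 to stay in integer arithmetic, and does not model queue ordering or task completion. The task names are made up.

```bash
# Weights are scaled by 100 so integer arithmetic applies (0.25 -> 25, 1.0 -> 100)
capacity=100
running=0      # combined weight of tasks currently running
queued=0       # number of tasks waiting for capacity

submit() {
  name=$1
  weight=$2
  if [ $((running + weight)) -le "$capacity" ]; then
    running=$((running + weight))
    echo "run   $name (running weight $running/$capacity)"
  else
    queued=$((queued + 1))
    echo "queue $name (needs $weight, $((capacity - running)) free)"
  fi
}

# Four default-weight tasks fill the executor; a fifth has to wait
for task in a b c d e; do
  submit "ingest-$task" 25
done
```

Under the same rule, a task submitted with weight 100 is admitted only when nothing else is running, which is what gives it exclusive access.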
The following environment variables can be configured to manage memory-intensive ingestion tasks, prevent resource contention, and ensure stable execution of resource-demanding processes:
- `DATAHUB_EXECUTOR_INGESTION_MAX_WORKERS` (default: 4) - Maximum concurrent Ingestion tasks
- `DATAHUB_EXECUTOR_MONITORS_MAX_WORKERS` (default: 10) - Maximum concurrent Observe monitoring tasks
- `EXECUTOR_TASK_MEMORY_LIMIT` - Memory limit per task in kilobytes, configured per Ingestion Source under **Extra Environment Variables**. This setting helps prevent the executor's master process from being OOM-killed and protects against memory-leaking ingestion tasks.
- `EXECUTOR_TASK_WEIGHT` - Task weight for resource allocation, configured per Ingestion Source under **Extra Environment Variables**. By default, each task is assigned a weight of 1/MAX_THREADS (e.g., 0.25 with 4 threads). The total weight of concurrent tasks cannot exceed 1.0.
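
As a worked example of the arithmetic, the sketch below derives the default weight from the worker count and converts a memory cap into kilobytes. The 4 GiB cap is a hypothetical value chosen for illustration, the conversion assumes 1 KB = 1024 bytes, and the exact input format of the **Extra Environment Variables** field is not covered here.

```bash
max_workers=4   # DATAHUB_EXECUTOR_INGESTION_MAX_WORKERS default

# Default per-task weight: 1 / max_workers
default_weight=$(awk -v n="$max_workers" 'BEGIN { printf "%.2f", 1 / n }')
echo "default weight per task: $default_weight"

# Hypothetical settings for a resource-intensive source:
# full weight so it runs alone, and a 4 GiB cap expressed in kilobytes
memory_limit_kb=$((4 * 1024 * 1024))
echo "EXECUTOR_TASK_WEIGHT=1.0"
echo "EXECUTOR_TASK_MEMORY_LIMIT=$memory_limit_kb"
```

Raising the weight reduces how many tasks share the executor, while the memory limit bounds what any single task can consume; the two settings can be combined on the same Ingestion Source.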