---
title: Run the Sagemaker Connector Externally
slug: /connectors/ml-model/sagemaker/yaml
---
{% connectorDetailsHeader
name="Sagemaker"
stage="PROD"
platform="OpenMetadata"
availableFeatures=["ML Store"]
unavailableFeatures=["ML Features", "Hyperparameters"]
/ %}
This section provides guides and references for using the Sagemaker connector.
Configure and schedule the Sagemaker metadata workflow externally:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
{% partial file="/v1.7/connectors/external-ingestion-deployment.md" /%}
## Requirements
OpenMetadata retrieves information about the models in the AWS account and the tags associated with them.
The AWS user must have the following policy attached to ingest metadata from Sagemaker.
```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SageMakerPolicy",
            "Effect": "Allow",
            "Action": [
                "sagemaker:ListModels",
                "sagemaker:DescribeModel",
                "sagemaker:ListTags"
            ],
            "Resource": "*"
        }
    ]
}
```
For more information on Sagemaker permissions, visit the [AWS Sagemaker official documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/api-permissions-reference.html).
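As a rough pre-flight check, the policy document can be compared against the three actions listed above. The following is a minimal sketch, not part of OpenMetadata: the helper name `missing_actions` is hypothetical, and it assumes the policy is available as a JSON string. It only looks at explicit `Allow` statements and does not evaluate wildcards such as `sagemaker:*`, `Deny` statements, or conditions.

```python
import json

# Actions taken from the policy shown above.
REQUIRED_ACTIONS = {
    "sagemaker:ListModels",
    "sagemaker:DescribeModel",
    "sagemaker:ListTags",
}


def missing_actions(policy_json: str) -> set:
    """Return the required actions that no Allow statement lists explicitly.

    Hypothetical helper for illustration only: wildcards, Deny statements,
    and conditions are ignored.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    # IAM allows a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]

    allowed = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        # "Action" may be a single string or a list of strings.
        if isinstance(actions, str):
            actions = [actions]
        allowed.update(actions)
    return REQUIRED_ACTIONS - allowed
```

An empty set means every required action is listed explicitly; anything else names the actions that still need to be granted.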
### Python Requirements
{% partial file="/v1.7/connectors/python-requirements.md" /%}
To run the Sagemaker ingestion, you will need to install:
```bash
pip3 install "openmetadata-ingestion[sagemaker]"
```
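To confirm the installation from Python, the standard library's `importlib.metadata` can report the installed version. This is a small sketch; the helper name `installed_version` is hypothetical, and the check only shows that the base `openmetadata-ingestion` distribution is present, not that the dependencies pulled in by the `sagemaker` extra were installed.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "openmetadata-ingestion") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("openmetadata-ingestion is not installed in this environment")
    else:
        print(f"openmetadata-ingestion {version} is installed")
```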
## Metadata Ingestion
All connectors are defined as JSON Schemas.
[Here](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/entity/services/connections/mlmodel/sageMakerConnection.json)
you can find the structure to create a connection to Sagemaker.
In order to create and run a Metadata Ingestion workflow, we will follow
the steps to create a YAML configuration able to connect to the source,
process the Entities if needed, and reach the OpenMetadata server.
The workflow is modeled around the following
[JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/metadataIngestion/mlmodelServiceMetadataPipeline.json).
### 1. Define the YAML Config
This is a sample config for Sagemaker:
{% codePreview %}
{% codeInfoContainer %}
#### Source Configuration - Service Connection
{% partial file="/v1.7/connectors/yaml/common/aws-config-def.md" /%}
{% partial file="/v1.7/connectors/yaml/ml-model/source-config-def.md" /%}
{% partial file="/v1.7/connectors/yaml/ingestion-sink-def.md" /%}
{% partial file="/v1.7/connectors/yaml/workflow-config-def.md" /%}
{% /codeInfoContainer %}
{% codeBlock fileName="filename.yaml" %}
```yaml {% isCodeBlock=true %}
source:
  type: sagemaker
  serviceName: local_sagemaker
  serviceConnection:
    config:
      type: SageMaker
      awsConfig:
```
{% partial file="/v1.7/connectors/yaml/common/aws-config.md" /%}
{% partial file="/v1.7/connectors/yaml/ml-model/source-config.md" /%}
{% partial file="/v1.7/connectors/yaml/ingestion-sink.md" /%}
{% partial file="/v1.7/connectors/yaml/workflow-config.md" /%}
{% /codeBlock %}
{% /codePreview %}
{% partial file="/v1.7/connectors/yaml/ingestion-cli.md" /%}
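Before running the workflow, it can help to sanity-check the shape of the configuration once it has been loaded into a Python dictionary. The sketch below is illustrative only and is not how OpenMetadata validates configs (that is done against the JSON Schemas linked above). The helper name `config_problems` is hypothetical, the top-level keys `sink` and `workflowConfig` are assumed from the sections of the sample config, and the `awsRegion` value and empty `sink` and `workflowConfig` bodies are placeholders.

```python
from typing import Any, Dict, List

# Assumed top-level sections of the workflow YAML (source, sink, workflow config).
REQUIRED_TOP_LEVEL = ("source", "sink", "workflowConfig")


def config_problems(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a loaded Sagemaker workflow config."""
    problems = [
        f"missing top-level key: {key}"
        for key in REQUIRED_TOP_LEVEL
        if key not in config
    ]

    # `or {}` guards against keys that are present but empty (None after YAML load).
    source = config.get("source") or {}
    if source.get("type") != "sagemaker":
        problems.append("source.type should be 'sagemaker'")
    if not source.get("serviceName"):
        problems.append("source.serviceName is empty or missing")

    connection = (source.get("serviceConnection") or {}).get("config") or {}
    if connection.get("type") != "SageMaker":
        problems.append("serviceConnection.config.type should be 'SageMaker'")
    if not connection.get("awsConfig"):
        problems.append("serviceConnection.config.awsConfig is empty or missing")
    return problems


if __name__ == "__main__":
    sample = {
        "source": {
            "type": "sagemaker",
            "serviceName": "local_sagemaker",
            "serviceConnection": {
                "config": {
                    "type": "SageMaker",
                    "awsConfig": {"awsRegion": "us-east-1"},  # placeholder value
                }
            },
        },
        "sink": {},  # placeholder; fill in from the sink section of the sample
        "workflowConfig": {},  # placeholder; fill in from the workflow section
    }
    print(config_problems(sample))  # prints []
```

The check only looks at structure: it cannot tell whether the AWS credentials are valid or whether the OpenMetadata server is reachable.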