Connector: VertexAI UI, docs (#17649)

harshsoni2024 2024-09-04 11:45:33 +05:30 committed by harshsoni2024
parent 5aba41ef58
commit 6fa662bfda
9 changed files with 374 additions and 2 deletions

@@ -0,0 +1,97 @@
---
title: VertexAI
slug: /connectors/ml-model/vertexai
---
{% connectorDetailsHeader
name="VertexAI"
stage="BETA"
platform="Collate"
availableFeatures=["ML Store", "ML Features", "Hyper parameters"]
unavailableFeatures=[]
/ %}
In this section, we provide guides and references to use the VertexAI connector.
Configure and schedule VertexAI metadata workflows from the OpenMetadata UI:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
{% partial file="/v1.6/connectors/ingestion-modes-tiles.md" variables={yamlPath: "/connectors/ml-model/vertexai/yaml"} /%}
## Requirements
### VertexAI API Permissions
- Go to [Cloud VertexAI Library enable API](https://cloud.google.com/vertex-ai/docs/featurestore/setup)
- Select the `GCP Project ID`.
- Click on `Enable API`, which will enable the VertexAI API on the selected project.
### GCP Permissions
To execute the metadata extraction workflow successfully, the user or the service account should have enough access to fetch the required data. The following table describes the minimum required permissions:
{% multiTablesWrapper %}
| # | GCP Permission | Required For |
| :--- | :---------------------------- | :---------------------- |
| 1 | aiplatform.models.get | Metadata Ingestion |
| 2 | aiplatform.models.list | Metadata Ingestion |
{% /multiTablesWrapper %}
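As a quick sanity check, the table above can be compared against the permissions a role is known to hold. The following is a minimal sketch in plain Python; the function name `missing_permissions` and the sample role contents are illustrative assumptions, not part of the connector.

```python
# Minimum permissions taken from the table above.
REQUIRED_PERMISSIONS = frozenset({
    "aiplatform.models.get",
    "aiplatform.models.list",
})


def missing_permissions(granted):
    """Return the required permissions that are absent from `granted`, sorted."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


if __name__ == "__main__":
    # Hypothetical permission list for a custom role (example input only).
    role_permissions = ["aiplatform.models.list"]
    print(missing_permissions(role_permissions))
```

An empty result means the role covers everything the metadata workflow needs according to the table.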
## Metadata Ingestion
{% partial
file="/v1.6/connectors/metadata-ingestion-ui.md"
variables={
connector: "VertexAI",
selectServicePath: "/images/v1.6/connectors/vertexai/select-service.png",
addNewServicePath: "/images/v1.6/connectors/vertexai/add-new-service.png",
serviceConnectionPath: "/images/v1.6/connectors/vertexai/service-connection.png",
}
/%}
{% stepsContainer %}
{% extraContent parentTagName="stepsContainer" %}
#### Connection Options
**GCP Credentials**:
You can authenticate with your VertexAI instance in one of two ways: set `GCP Credentials Path` to the file path of the service account key, or choose `GCP Credentials Values` and pass the values from the service account key file directly.
See [this documentation](https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console) for how to create and download service account keys.
**GCP Credentials Values**: Pass the raw credential values directly. This requires the following information, all of which is found in the service account key file:
- **Credentials type**: The type of the account. For a service account, the value of this field is `service_account`. To fetch this key, look for the value associated with the `type` key in the service account key file.
- **Project ID**: A project ID is a unique string used to differentiate your project from all others in Google Cloud. To fetch this key, look for the value associated with the `project_id` key in the service account key file. You can also pass multiple project IDs to ingest metadata from different VertexAI projects into one service.
- **Private Key ID**: This is a unique identifier for the private key associated with the service account. To fetch this key, look for the value associated with the `private_key_id` key in the service account file.
- **Private Key**: This is the private key associated with the service account that is used to authenticate and authorize access to VertexAI. To fetch this key, look for the value associated with the `private_key` key in the service account file.
- **Client Email**: This is the email address associated with the service account. To fetch this key, look for the value associated with the `client_email` key in the service account key file.
- **Client ID**: This is a unique identifier for the service account. To fetch this key, look for the value associated with the `client_id` key in the service account key file.
- **Auth URI**: This is the URI for the authorization server. To fetch this key, look for the value associated with the `auth_uri` key in the service account key file. The default value for Auth URI is https://accounts.google.com/o/oauth2/auth.
- **Token URI**: The Google Cloud Token URI is a specific endpoint used to obtain an OAuth 2.0 access token from the Google Cloud IAM service. This token allows you to authenticate and access various Google Cloud resources and APIs that require authorization. To fetch this key, look for the value associated with the `token_uri` key in the service account credentials file. The default value for Token URI is https://oauth2.googleapis.com/token.
- **Authentication Provider X509 Certificate URL**: This is the URL of the certificate that verifies the authenticity of the authorization server. To fetch this key, look for the value associated with the `auth_provider_x509_cert_url` key in the service account key file. The default value for Auth Provider X509Cert URL is https://www.googleapis.com/oauth2/v1/certs.
- **Client X509Cert URL**: This is the URL of the certificate that verifies the authenticity of the service account. To fetch this key, look for the value associated with the `client_x509_cert_url` key in the service account key file.
**GCP Credentials Path**: Passing a local file path that contains the credentials.
**Location**:
Location refers to the geographical region where your resources, such as datasets, models, and endpoints, are physically hosted (e.g. `us-central1`, `europe-west4`).
{% /extraContent %}
{% partial file="/v1.6/connectors/test-connection.md" /%}
{% partial file="/v1.6/connectors/ml-model/configure-ingestion.md" /%}
{% partial file="/v1.6/connectors/ingestion-schedule-and-deploy.md" /%}
{% /stepsContainer %}
{% partial file="/v1.6/connectors/troubleshooting.md" /%}

@@ -0,0 +1,142 @@
---
title: Run the VertexAI Connector Externally
slug: /connectors/ml-model/vertexai/yaml
---
{% connectorDetailsHeader
name="VertexAI"
stage="BETA"
platform="Collate"
availableFeatures=["ML Store", "ML Features", "Hyper parameters"]
unavailableFeatures=[]
/ %}
In this section, we provide guides and references to use the VertexAI connector.
Configure and schedule VertexAI metadata workflows externally:
- [Requirements](#requirements)
- [Metadata Ingestion](#metadata-ingestion)
{% partial file="/v1.6/connectors/external-ingestion-deployment.md" /%}
## Requirements
### Python Requirements
{% partial file="/v1.6/connectors/python-requirements.md" /%}
To run the VertexAI ingestion, you will need to install:
```bash
pip3 install "openmetadata-ingestion[vertexai]"
```
### GCP Permissions
To execute the metadata extraction workflow successfully, the user or the service account should have enough access to fetch the required data. The following table describes the minimum required permissions:
{% multiTablesWrapper %}
| # | GCP Permission | Required For |
| :--- | :---------------------------- | :---------------------- |
| 1 | aiplatform.models.get | Metadata Ingestion |
| 2 | aiplatform.models.list | Metadata Ingestion |
{% /multiTablesWrapper %}
## Metadata Ingestion
### 1. Define the YAML Config
This is a sample config for VertexAI:
{% codePreview %}
{% codeInfoContainer %}
#### Source Configuration - Service Connection
{% codeInfo srNumber=1 %}
**credentials**:
You can authenticate with your VertexAI instance in one of two ways: set `GCP Credentials Path` to the file path of the service account key, or choose `GCP Credentials Values` and pass the values from the service account key file directly.
See [this documentation](https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console) for how to create and download service account keys.
**gcpConfig:**
**1.** Pass the raw credential values directly. This requires the following information, all of which is found in the service account key file:
- **type**: The type of the account. For a service account, the value of this field is `service_account`. To fetch this key, look for the value associated with the `type` key in the service account key file.
- **projectId**: A project ID is a unique string used to differentiate your project from all others in Google Cloud. To fetch this key, look for the value associated with the `project_id` key in the service account key file. You can also pass multiple project IDs to ingest metadata from different VertexAI projects into one service.
- **privateKeyId**: This is a unique identifier for the private key associated with the service account. To fetch this key, look for the value associated with the `private_key_id` key in the service account file.
- **privateKey**: This is the private key associated with the service account that is used to authenticate and authorize access to VertexAI. To fetch this key, look for the value associated with the `private_key` key in the service account file.
- **clientEmail**: This is the email address associated with the service account. To fetch this key, look for the value associated with the `client_email` key in the service account key file.
- **clientId**: This is a unique identifier for the service account. To fetch this key, look for the value associated with the `client_id` key in the service account key file.
- **authUri**: This is the URI for the authorization server. To fetch this key, look for the value associated with the `auth_uri` key in the service account key file. The default value for Auth URI is https://accounts.google.com/o/oauth2/auth.
- **tokenUri**: The Google Cloud Token URI is a specific endpoint used to obtain an OAuth 2.0 access token from the Google Cloud IAM service. This token allows you to authenticate and access various Google Cloud resources and APIs that require authorization. To fetch this key, look for the value associated with the `token_uri` key in the service account credentials file. The default value for Token URI is https://oauth2.googleapis.com/token.
- **authProviderX509CertUrl**: This is the URL of the certificate that verifies the authenticity of the authorization server. To fetch this key, look for the value associated with the `auth_provider_x509_cert_url` key in the service account key file. The default value for Auth Provider X509Cert URL is https://www.googleapis.com/oauth2/v1/certs.
- **clientX509CertUrl**: This is the URL of the certificate that verifies the authenticity of the service account. To fetch this key, look for the value associated with the `client_x509_cert_url` key in the service account key file.
**2.** Passing a local file path that contains the credentials:
- **gcpCredentialsPath**
**Location**:
Location refers to the geographical region where your resources, such as datasets, models, and endpoints, are physically hosted (e.g. `us-central1`, `europe-west4`).
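The field list above pairs each `gcpConfig` field with a key in the service account key file. A minimal sketch of that mapping in Python follows; the helper name `key_file_to_gcp_config` and the sample values are illustrative assumptions, and the connector itself does not ship this function.

```python
import json

# Key in the service account key file -> gcpConfig field, as listed above.
KEY_FILE_TO_GCP_CONFIG = {
    "type": "type",
    "project_id": "projectId",
    "private_key_id": "privateKeyId",
    "private_key": "privateKey",
    "client_email": "clientEmail",
    "client_id": "clientId",
    "auth_uri": "authUri",
    "token_uri": "tokenUri",
    "auth_provider_x509_cert_url": "authProviderX509CertUrl",
    "client_x509_cert_url": "clientX509CertUrl",
}


def key_file_to_gcp_config(key_file_json: str) -> dict:
    """Rename the keys of a service account key file to gcpConfig field names.

    Keys that are not part of the mapping are dropped; keys that are missing
    from the file are simply left out of the result.
    """
    key_data = json.loads(key_file_json)
    return {
        config_field: key_data[file_key]
        for file_key, config_field in KEY_FILE_TO_GCP_CONFIG.items()
        if file_key in key_data
    }


if __name__ == "__main__":
    # Placeholder key file content, not real credentials.
    sample = json.dumps({
        "type": "service_account",
        "project_id": "my-project",
        "client_email": "client@mail.com",
    })
    print(key_file_to_gcp_config(sample))
```

The resulting dictionary can then be copied into the `gcpConfig` block of the YAML config.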
{% /codeInfo %}
{% /codeInfoContainer %}
{% codeBlock fileName="filename.yaml" %}
```yaml {% isCodeBlock=true %}
source:
  type: vertexai
  serviceName: localvx
  serviceConnection:
    config:
      type: VertexAI
```
```yaml {% srNumber=1 %}
      credentials:
        gcpConfig:
          type: service_account
          projectId: project ID # ["project-id-1", "project-id-2"]
          privateKeyId: us-east-2
          privateKey: |
            -----BEGIN PRIVATE KEY-----
            Super secret key
            -----END PRIVATE KEY-----
          clientEmail: client@mail.com
          clientId: 1234
          # authUri: https://accounts.google.com/o/oauth2/auth (default)
          # tokenUri: https://oauth2.googleapis.com/token (default)
          # authProviderX509CertUrl: https://www.googleapis.com/oauth2/v1/certs (default)
          clientX509CertUrl: https://cert.url
      location: us-central1 # project location/region
```
```yaml {% srNumber=2 %}
      # connectionOptions:
      #   key: value
```
```yaml {% srNumber=3 %}
      # connectionArguments:
      #   key: value
```
{% partial file="/v1.6/connectors/yaml/database/source-config.md" /%}
{% partial file="/v1.6/connectors/yaml/ingestion-sink.md" /%}
{% partial file="/v1.6/connectors/yaml/workflow-config.md" /%}
{% /codeBlock %}
{% /codePreview %}
{% partial file="/v1.6/connectors/yaml/ingestion-cli.md" /%}

@@ -544,6 +544,10 @@ site_menu:
     url: /connectors/ml-model/sagemaker
   - category: Connectors / ML Model / Sagemaker / Run Externally
     url: /connectors/ml-model/sagemaker/yaml
+  - category: Connectors / ML Model / VertexAI
+    url: /connectors/ml-model/vertexai
+  - category: Connectors / ML Model / VertexAI / Run Externally
+    url: /connectors/ml-model/vertexai/yaml
   - category: Connectors / Storage
     url: /connectors/storage

Binary file not shown. (Size: 111 KiB)

Binary file not shown. (Size: 108 KiB)

Binary file not shown. (Size: 160 KiB)

@@ -22,7 +22,8 @@
     },
     "algorithm": {
       "description": "Algorithm used to train the ML Model",
-      "type": "string"
+      "type": "string",
+      "default": "mlmodel"
     },
     "mlFeatures": {
       "description": "Features used to train the ML Model.",
@@ -103,6 +104,6 @@
       "maxLength": 32
     }
   },
-  "required": ["name", "algorithm", "service"],
+  "required": ["name", "service"],
   "additionalProperties": false
 }
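
The effect of the schema change above can be sketched in Python: `algorithm` is no longer required and falls back to `"mlmodel"` when omitted. This is only an illustration under the assumption that the consumer applies the schema default (JSON Schema treats `default` as an annotation); the function name `with_defaults` is hypothetical and this is not the server's validation code.

```python
# Values taken from the updated schema hunk above.
REQUIRED_FIELDS = ("name", "service")  # "algorithm" was removed from "required"
ALGORITHM_DEFAULT = "mlmodel"


def with_defaults(request: dict) -> dict:
    """Check the required fields and fill in the default algorithm if absent."""
    missing = [field for field in REQUIRED_FIELDS if field not in request]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # An explicit "algorithm" in the request overrides the default.
    return {"algorithm": ALGORITHM_DEFAULT, **request}


if __name__ == "__main__":
    # Hypothetical request payload without an algorithm.
    print(with_defaults({"name": "my_model", "service": "localvx"}))
```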

@@ -0,0 +1,127 @@
# VertexAI
In this section, we provide guides and references to use the VertexAI connector.
## Requirements
We need to enable the VertexAI API and use an account with a specific set of minimum permissions:
### VertexAI API Permissions
Click on `Enable API`, which will enable the API on the selected project:
- [VertexAI API](https://cloud.google.com/vertex-ai/docs/featurestore/setup)
### GCP Permissions
To execute the metadata extraction workflow successfully, the user or the service account should be granted the following VertexAI permissions in the IAM section:
- `aiplatform.models.create`
- `aiplatform.models.get`
- `aiplatform.models.list`
- `aiplatform.models.update`
- `aiplatform.models.delete`
- `aiplatform.models.deploy`
- `aiplatform.models.undeploy`
See [this documentation](https://cloud.google.com/vertex-ai/docs/general/access-control) for how to create a custom role in GCP and assign the above permissions to the role and the service account.
You can find further information on the VertexAI connector in the [docs](https://docs.open-metadata.org/connectors/ml-model/vertexai).
## Connection Details
$$section
### GCP Credentials Configuration $(id="gcpConfig")
You can authenticate with your VertexAI instance in one of two ways: set `GCP Credentials Path` to the file path of the service account key, or choose `GCP Credentials Values` and pass the values from the service account key file directly.
See [this documentation](https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console) for how to create and download service account keys.
$$
$$section
### Credentials Type $(id="type")
Credentials Type is the type of the account. For a service account, the value of this field is `service_account`. To fetch this key, look for the value associated with the `type` key in the service account key file.
$$
$$section
### Project ID $(id="projectId")
A project ID is a unique string used to differentiate your project from all others in Google Cloud. To fetch this key, look for the value associated with the `project_id` key in the service account key file.
$$
$$section
### Private Key ID $(id="privateKeyId")
This is a unique identifier for the private key associated with the service account. To fetch this key, look for the value associated with the `private_key_id` key in the service account file.
$$
$$section
### Private Key $(id="privateKey")
This is the private key associated with the service account that is used to authenticate and authorize access to GCP. To fetch this key, look for the value associated with the `private_key` key in the service account file.
Make sure you are passing the key in the correct format. If your private key looks like this:
```
-----BEGIN ENCRYPTED PRIVATE KEY-----
MII..
MBQ...
CgU..
8Lt..
...
h+4=
-----END ENCRYPTED PRIVATE KEY-----
```
Replace each newline with a literal `\n`, so that the final private key you pass looks like this:
```
-----BEGIN ENCRYPTED PRIVATE KEY-----\nMII..\nMBQ...\nCgU..\n8Lt..\n...\nh+4=\n-----END ENCRYPTED PRIVATE KEY-----\n
```
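The newline replacement described above can be done by hand or scripted. A minimal sketch in Python, assuming the key is available as a multi-line string (the function name `escape_private_key` is illustrative and not part of OpenMetadata):

```python
def escape_private_key(pem: str) -> str:
    # Collapse a multi-line PEM key into a single line in which every line
    # break is written as the two characters backslash + "n".
    lines = [line.strip() for line in pem.strip().splitlines() if line.strip()]
    return "\\n".join(lines) + "\\n"


if __name__ == "__main__":
    # Placeholder key body, mirroring the truncated example above.
    key = """-----BEGIN ENCRYPTED PRIVATE KEY-----
MII..
h+4=
-----END ENCRYPTED PRIVATE KEY-----
"""
    print(escape_private_key(key))
```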
$$
$$section
### Client Email $(id="clientEmail")
This is the email address associated with the service account. To fetch this key, look for the value associated with the `client_email` key in the service account key file.
$$
$$section
### Client ID $(id="clientId")
This is a unique identifier for the service account. To fetch this key, look for the value associated with the `client_id` key in the service account key file.
$$
$$section
### Auth URI $(id="authUri")
This is the URI for the authorization server. To fetch this key, look for the value associated with the `auth_uri` key in the service account key file.
$$
$$section
### Token URI $(id="tokenUri")
The Google Cloud Token URI is a specific endpoint used to obtain an OAuth 2.0 access token from the Google Cloud IAM service. This token allows you to authenticate and access various Google Cloud resources and APIs that require authorization.
To fetch this key, look for the value associated with the `token_uri` key in the service account credentials file.
$$
$$section
### Auth Provider X509Cert URL $(id="authProviderX509CertUrl")
This is the URL of the certificate that verifies the authenticity of the authorization server. To fetch this key, look for the value associated with the `auth_provider_x509_cert_url` key in the service account key file.
$$
$$section
### Client X509Cert URL $(id="clientX509CertUrl")
This is the URL of the certificate that verifies the authenticity of the service account. To fetch this key, look for the value associated with the `client_x509_cert_url` key in the service account key file.
$$
$$section
### Location $(id="location")
Location refers to the geographical region where your resources, such as datasets, models, and endpoints, are physically hosted (e.g. `us-central1`, `europe-west4`).
$$

@@ -139,6 +139,7 @@ class ServiceUtilClassBase {
     DatabaseServiceType.Synapse,
     MetadataServiceType.Alation,
     APIServiceType.Webhook,
+    MlModelServiceType.VertexAI,
   ];
   DatabaseServiceTypeSmallCase = this.convertEnumToLowerCase<