# Pipeline

This schema defines the Pipeline entity. A pipeline moves data from a source to a destination through a series of processing steps. ETL is one type of pipeline, in which the steps extract, transform, and load the data.

**$id:**[**https://open-metadata.org/schema/entity/data/pipeline.json**](https://open-metadata.org/schema/entity/data/pipeline.json)

Type: `object`

This schema <u>does not</u> accept additional properties.

## Properties

- **id** `required`
  - Unique identifier that identifies a pipeline instance.
  - $ref: [../../type/basic.json#/definitions/uuid](../types/basic.md#uuid)
- **name** `required`
  - Name that identifies this pipeline instance uniquely.
  - Type: `string`
  - Length: between 1 and 128
- **displayName**
  - Display name that identifies this pipeline. It could be the title or label from the source service.
  - Type: `string`
- **fullyQualifiedName**
  - A unique name that identifies a pipeline in the format 'ServiceName.PipelineName'.
  - Type: `string`
- **description**
  - Description of this pipeline.
  - Type: `string`
- **version**
  - Metadata version of the entity.
  - $ref: [../../type/entityHistory.json#/definitions/entityVersion](../types/entityhistory.md#entityversion)
- **updatedAt**
  - Last update time corresponding to the new version of the entity, in Unix epoch time milliseconds.
  - $ref: [../../type/basic.json#/definitions/timestamp](../types/basic.md#timestamp)
- **updatedBy**
  - User who made the update.
  - Type: `string`
- **pipelineUrl**
  - URL to visit or manage the pipeline. It points to the respective pipeline service UI.
  - Type: `string`
  - String format must be a "uri"
- **concurrency**
  - Concurrency of the pipeline.
  - Type: `integer`
- **pipelineLocation**
  - Location of the pipeline code.
  - Type: `string`
- **startDate**
  - Start date of the workflow.
  - $ref: [../../type/basic.json#/definitions/dateTime](../types/basic.md#datetime)
- **tasks**
  - All the tasks that are part of the pipeline.
  - Type: `array`
    - **Items**
    - $ref: [#/definitions/task](#task)
- **pipelineStatus**
  - Series of pipeline executions and their statuses.
  - Type: `array`
    - **Items**
    - $ref: [#/definitions/pipelineStatus](#pipelinestatus)
- **followers**
  - Followers of this pipeline.
  - $ref: [../../type/entityReference.json#/definitions/entityReferenceList](../types/entityreference.md#entityreferencelist)
- **tags**
  - Tags for this pipeline.
  - Type: `array`
    - **Items**
    - $ref: [../../type/tagLabel.json](../types/taglabel.md)
- **href**
  - Link to the resource corresponding to this entity.
  - $ref: [../../type/basic.json#/definitions/href](../types/basic.md#href)
- **owner**
  - Owner of this pipeline.
  - $ref: [../../type/entityReference.json](../types/entityreference.md)
- **service** `required`
  - Link to the service in which this pipeline is hosted.
  - $ref: [../../type/entityReference.json](../types/entityreference.md)
- **serviceType**
  - Type of the service in which this pipeline is hosted.
  - $ref: [../services/pipelineService.json#/definitions/pipelineServiceType](../services/pipelineservice.md#pipelineservicetype)
- **changeDescription**
  - Change that led to this version of the entity.
  - $ref: [../../type/entityHistory.json#/definitions/changeDescription](../types/entityhistory.md#changedescription)
- **deleted**
  - When `true`, indicates the entity has been soft deleted.
  - Type: `boolean`
  - Default: _false_

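
As an illustration of the constraints listed above, the following minimal Python sketch checks a candidate Pipeline document against a few of them: the three required properties, the 1 to 128 character limit on `name`, and the rejection of additional properties. The helper name `check_pipeline`, the sample identifiers, and the shape of the `service` reference are hypothetical and only serve the example; in practice, validating against the published JSON Schema itself is likely the better route.

```python
# Property names are taken from the list above; everything else is illustrative.
REQUIRED = {"id", "name", "service"}
ALLOWED = REQUIRED | {
    "displayName", "fullyQualifiedName", "description", "version",
    "updatedAt", "updatedBy", "pipelineUrl", "concurrency",
    "pipelineLocation", "startDate", "tasks", "pipelineStatus",
    "followers", "tags", "href", "owner", "serviceType",
    "changeDescription", "deleted",
}


def check_pipeline(doc: dict) -> list:
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    for key in sorted(REQUIRED - set(doc)):
        problems.append(f"missing required property: {key}")
    for key in sorted(set(doc) - ALLOWED):
        problems.append(f"additional property not allowed: {key}")
    name = doc.get("name")
    if isinstance(name, str) and not 1 <= len(name) <= 128:
        problems.append("name must be between 1 and 128 characters")
    return problems


# Made-up sample values; the reference shape below is an assumption.
sample = {
    "id": "2f1c3b9e-0000-4000-8000-000000000001",
    "name": "daily_sales_etl",
    "service": {"id": "2f1c3b9e-0000-4000-8000-000000000002", "type": "pipelineService"},
}

print(check_pipeline(sample))
print(check_pipeline({**sample, "owner_team": "x"}))
```

The sketch only covers presence, naming, and length rules. It does not resolve `$ref` targets or check formats such as `uri`.
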
## Type definitions in this schema

### statusType

- Enum defining the possible status values.
- Type: `string`
- The value is restricted to the following:
  1. _"Successful"_
  2. _"Failed"_
  3. _"Pending"_

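
A client that consumes these values could mirror the enum so that anything outside the three allowed strings is rejected early. The class name `StatusType` and the helper `parse_status` below are hypothetical names chosen for this sketch; only the three string values come from the schema.

```python
from enum import Enum


class StatusType(str, Enum):
    """Mirror of the statusType enum; member names are illustrative."""
    SUCCESSFUL = "Successful"
    FAILED = "Failed"
    PENDING = "Pending"


def parse_status(raw: str) -> StatusType:
    """Convert a raw string to StatusType; raises ValueError for unknown values."""
    return StatusType(raw)


print(parse_status("Failed"))
try:
    parse_status("Running")  # not one of the allowed values
except ValueError as err:
    print(f"rejected: {err}")
```

The lookup is case-sensitive, matching the exact strings listed above.
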
### taskStatus

- This schema defines a time series of the status of a Pipeline or Task.
- Type: `object`
- This schema <u>does not</u> accept additional properties.
- **Properties**
  - **name**
    - Name of the Task.
    - Type: `string`
  - **executionStatus**
    - Status at a specific execution date.
    - $ref: [#/definitions/statusType](#statustype)

### task

- Type: `object`
- This schema <u>does not</u> accept additional properties.
- **Properties**
  - **name** `required`
    - Name that identifies this task instance uniquely.
    - Type: `string`
  - **displayName**
    - Display name that identifies this task. It could be the title or label from the pipeline service.
    - Type: `string`
  - **fullyQualifiedName**
    - A unique name that identifies a task in the format 'ServiceName.PipelineName.TaskName'.
    - Type: `string`
  - **description**
    - Description of this task.
    - Type: `string`
  - **taskUrl**
    - URL to visit or manage the task. It points to the respective pipeline service UI.
    - Type: `string`
    - String format must be a "uri"
  - **downstreamTasks**
    - All the tasks that are downstream of this task.
    - Type: `array`
      - **Items**
      - Type: `string`
  - **taskType**
    - Type of the task. Usually refers to the class it implements.
    - Type: `string`
  - **taskSQL**
    - SQL used in the task. Can be used to determine the lineage.
    - $ref: [../../type/basic.json#/definitions/sqlQuery](../types/basic.md#sqlquery)
  - **tags**
    - Tags for this task.
    - Type: `array`
      - **Items**
      - $ref: [../../type/tagLabel.json](../types/taglabel.md)

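
Because each task lists its `downstreamTasks` as strings, the `tasks` array implicitly describes a directed graph. The sketch below assumes that each entry in `downstreamTasks` matches the `name` of another task in the same pipeline (the schema text does not state this explicitly) and uses made-up task names. Under that assumption, a valid execution order can be derived with Python's standard `graphlib` module; `execution_order` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Made-up tasks; only the property names come from the schema.
tasks = [
    {"name": "extract", "downstreamTasks": ["transform"]},
    {"name": "transform", "downstreamTasks": ["load"]},
    {"name": "load", "downstreamTasks": []},
]


def execution_order(tasks):
    """Return task names ordered so that every task precedes its downstream tasks."""
    # TopologicalSorter expects node -> predecessors, so invert the downstream edges.
    predecessors = {t["name"]: set() for t in tasks}
    for t in tasks:
        for downstream in t.get("downstreamTasks", []):
            predecessors.setdefault(downstream, set()).add(t["name"])
    return list(TopologicalSorter(predecessors).static_order())


print(execution_order(tasks))  # ['extract', 'transform', 'load']
```

If the tasks reference each other in a cycle, `static_order()` raises `graphlib.CycleError`, which can be a useful sanity check on ingested metadata.
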
### pipelineStatus

- Series of pipeline executions, their status, and task status.
- Type: `object`
- This schema <u>does not</u> accept additional properties.
- **Properties**
  - **executionDate**
    - Date on which the job was executed.
    - $ref: [../../type/basic.json#/definitions/timestamp](../types/basic.md#timestamp)
  - **executionStatus**
    - Status at a specific execution date.
    - $ref: [#/definitions/statusType](#statustype)
  - **taskStatus**
    - Series of task executions and their statuses.
    - Type: `array`
      - **Items**
      - $ref: [#/definitions/taskStatus](#taskstatus)

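
To make the nesting concrete, here is a hedged sketch of one `pipelineStatus` entry with its `taskStatus` items. The timestamp and task names are invented sample values, and `executionDate` is shown in Unix epoch milliseconds on the assumption that it follows the same `timestamp` convention described for `updatedAt`. The helpers `failed_tasks` and `execution_time` are hypothetical names used only for this example.

```python
from datetime import datetime, timezone

# Invented sample entry; only the property names and status values come from the schema.
status_entry = {
    "executionDate": 1646611200000,  # assumed to be epoch milliseconds
    "executionStatus": "Failed",
    "taskStatus": [
        {"name": "extract", "executionStatus": "Successful"},
        {"name": "transform", "executionStatus": "Failed"},
        {"name": "load", "executionStatus": "Pending"},
    ],
}


def failed_tasks(entry):
    """Names of the tasks whose executionStatus is "Failed"."""
    return [
        t["name"]
        for t in entry.get("taskStatus", [])
        if t.get("executionStatus") == "Failed"
    ]


def execution_time(entry):
    """Convert executionDate (assumed milliseconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(entry["executionDate"] / 1000, tz=timezone.utc)


print(failed_tasks(status_entry))
print(execution_time(status_entry).isoformat())
```
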
_This document was updated on: Monday, March 7, 2022_