2022-03-22 11:44:28 -07:00

Table

This schema defines the Table entity. A Table organizes data in rows and columns and is defined by a Schema. OpenMetadata does not have a separate abstraction for Schema. Both Table and Schema are captured in this entity.

$id: https://open-metadata.org/schema/entity/data/table.json

Type: object

This schema does not accept additional properties.

Properties

TableConstraint

  • $ref: #/definitions/tableConstraint

Type definitions in this schema

tableType

  • This schema defines the type used for describing different types of tables.
  • Type: string
  • The value is restricted to the following:
    1. "Regular"
    2. "External"
    3. "View"
    4. "SecureView"
    5. "MaterializedView"
    6. "Iceberg"

dataType

  • This enum defines the type of data stored in a column.
  • Type: string
  • The value is restricted to the following:
    1. "NUMBER"
    2. "TINYINT"
    3. "SMALLINT"
    4. "INT"
    5. "BIGINT"
    6. "BYTEINT"
    7. "BYTES"
    8. "FLOAT"
    9. "DOUBLE"
    10. "DECIMAL"
    11. "NUMERIC"
    12. "TIMESTAMP"
    13. "TIME"
    14. "DATE"
    15. "DATETIME"
    16. "INTERVAL"
    17. "STRING"
    18. "MEDIUMTEXT"
    19. "TEXT"
    20. "CHAR"
    21. "VARCHAR"
    22. "BOOLEAN"
    23. "BINARY"
    24. "VARBINARY"
    25. "ARRAY"
    26. "BLOB"
    27. "LONGBLOB"
    28. "MEDIUMBLOB"
    29. "MAP"
    30. "STRUCT"
    31. "UNION"
    32. "SET"
    33. "GEOGRAPHY"
    34. "ENUM"
    35. "JSON"
    36. "UUID"

constraint

  • This enum defines the type for column constraint.
  • Type: string
  • The value is restricted to the following:
    1. "NULL"
    2. "NOT_NULL"
    3. "UNIQUE"
    4. "PRIMARY_KEY"

tableConstraint

  • This schema defines the type for a table constraint.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • constraintType
      • Type: string
      • The value is restricted to the following:
        1. "UNIQUE"
        2. "PRIMARY_KEY"
        3. "FOREIGN_KEY"
    • columns
      • List of column names corresponding to the constraint.
      • Type: array
        • Items
        • Type: string

columnName

  • Local name (not fully qualified name) of the column. ColumnName is "-" when the column is not named in a struct dataType. For example, BigQuery supports structs with unnamed fields.
  • Type: string
  • Length: between 1 and 128

tableName

  • Local name (not fully qualified name) of a table. Dots will be escaped automatically.
  • Type: string
  • Length: between 1 and 128
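
As an illustration of the tableConstraint, columnName, and tableName definitions above, the following is a minimal Python sketch of how a client might check a constraint before submitting it. The function names are hypothetical and not part of OpenMetadata. The sketch assumes that a missing constraintType or columns key is acceptable, because this document does not mark either property as required.

```python
from typing import Any, Dict, List

# Values listed for constraintType in the tableConstraint definition.
CONSTRAINT_TYPES = {"UNIQUE", "PRIMARY_KEY", "FOREIGN_KEY"}


def is_valid_local_name(name: Any) -> bool:
    """Check the documented length limit (1 to 128) for a local name."""
    return isinstance(name, str) and 1 <= len(name) <= 128


def check_table_constraint(constraint: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means no problem."""
    errors: List[str] = []
    # Assumption: neither property is required, so only check what is present.
    if "constraintType" in constraint:
        if constraint["constraintType"] not in CONSTRAINT_TYPES:
            errors.append("constraintType is not an allowed value")
    for name in constraint.get("columns", []):
        if not is_valid_local_name(name):
            errors.append(f"invalid column name: {name!r}")
    return errors


print(check_table_constraint({"constraintType": "PRIMARY_KEY", "columns": ["id"]}))
```

A real client would more likely validate against the JSON schema file itself; this sketch only mirrors the rules spelled out in this document.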

fullyQualifiedColumnName

  • Fully qualified name of the column, in the form serviceName.databaseName.tableName.columnName[.nestedColumnName]. When columnName is null for a struct field, field_# is used, where # is the field index. For the map dataType, the field name key is used for the key and the field name value is used for the value.
  • Type: string
  • Length: between 1 and 256
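
The naming rule for fullyQualifiedColumnName can be sketched in Python as follows. The function name and the convention of passing an integer for an unnamed struct field are assumptions made for this example, and whether the field index starts at 0 or 1 is not stated in this document.

```python
from typing import Sequence, Union

MAX_FQN_LENGTH = 256  # documented upper bound for fullyQualifiedColumnName


def fully_qualified_column_name(
    service: str,
    database: str,
    table: str,
    column_path: Sequence[Union[str, int]],
) -> str:
    """Join service, database, table, and the (possibly nested) column path.

    column_path holds the column name followed by any nested field names.
    An int entry stands for an unnamed struct field and becomes field_<index>.
    """
    parts = [service, database, table]
    for part in column_path:
        parts.append(f"field_{part}" if isinstance(part, int) else part)
    fqn = ".".join(parts)
    if not 1 <= len(fqn) <= MAX_FQN_LENGTH:
        raise ValueError("fully qualified name must be 1 to 256 characters")
    return fqn


# Unnamed struct field at index 0 inside the "customer" column.
print(fully_qualified_column_name("bigquery", "shop", "orders", ["customer", 0]))
# Value side of a map column named "attributes".
print(fully_qualified_column_name("hive", "shop", "orders", ["attributes", "value"]))
```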

column

  • This schema defines the type for a column in a table.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • name required
    • displayName
      • Display name that identifies this column.
      • Type: string
    • dataType required
    • arrayDataType
      • Data type of the array elements when dataType is array. For example, array<int> has dataType as array and arrayDataType as int.
      • $ref: #/definitions/dataType
    • dataLength
      • Length of char, varchar, binary, and varbinary dataTypes; otherwise null. For example, varchar(20) has dataType as varchar and dataLength as 20.
      • Type: integer
    • dataTypeDisplay
      • Display name used for dataType. This is useful for complex types, such as array<int>, map<int,string>, struct<>, and union types.
      • Type: string
    • description
      • Description of the column.
      • Type: string
    • fullyQualifiedName
    • tags
    • constraint
    • ordinalPosition
      • Ordinal position of the column.
      • Type: integer
    • jsonSchema
      • JSON schema, present only if dataType is JSON; otherwise null.
      • Type: string
    • children
      • Child columns if dataType or arrayDataType is map, struct, or union; otherwise null.
      • Type: array
    • columnTests

columnJoins

  • This schema defines the type to capture how frequently a column is joined with columns in other tables.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
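
To make the relationship between dataType, arrayDataType, dataLength, and dataTypeDisplay in the column definition more concrete, here is a small Python sketch that derives those fields from a type string. The helper name and the parsing rules are assumptions made for this example; it handles only the two forms used as examples in this document (array<int> and varchar(20)) and writes the type names in upper case to match the dataType enum values.

```python
import re
from typing import Any, Dict


def column_from_type_string(name: str, type_string: str) -> Dict[str, Any]:
    """Build a partial column object from a type string such as varchar(20)."""
    column: Dict[str, Any] = {"name": name, "dataTypeDisplay": type_string}
    array_match = re.fullmatch(r"array<(\w+)>", type_string, flags=re.IGNORECASE)
    length_match = re.fullmatch(r"(\w+)\((\d+)\)", type_string)
    if array_match:
        # array<int>: dataType is ARRAY, the element type goes to arrayDataType.
        column["dataType"] = "ARRAY"
        column["arrayDataType"] = array_match.group(1).upper()
    elif length_match:
        # varchar(20): the number in parentheses goes to dataLength.
        column["dataType"] = length_match.group(1).upper()
        column["dataLength"] = int(length_match.group(2))
    else:
        column["dataType"] = type_string.upper()
    return column


print(column_from_type_string("scores", "array<int>"))
print(column_from_type_string("code", "varchar(20)"))
```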

tableJoins

  • This schema defines the type to capture information about how columns in this table are joined with columns in other tables.
  • Type: object
  • This schema does not accept additional properties.
  • Properties

tableData

  • This schema defines the type to capture rows of sample data for a table.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • columns
      • List of local column names (not fully qualified column names) of the table.
      • Type: array
    • rows
      • Data for multiple rows of the table.
      • Type: array
        • Items
        • Data for a single row of the table, in the same order as the columns field.
        • Type: array
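
Because each row in tableData is positional, a consumer has to pair row values with the columns list to recover named fields. The Python sketch below shows one way to do that; the function name and the sample data are invented for illustration.

```python
from typing import Any, Dict, List


def rows_as_dicts(table_data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each row's values with the column names, position by position."""
    columns = table_data["columns"]
    records = []
    for row in table_data["rows"]:
        if len(row) != len(columns):
            raise ValueError("each row must have one value per column")
        records.append(dict(zip(columns, row)))
    return records


# Hypothetical sample data shaped like the tableData type.
sample = {"columns": ["id", "name"], "rows": [[1, "Ada"], [2, "Grace"]]}
print(rows_as_dicts(sample))
```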

columnProfile

  • This schema defines the type to capture the table's column profile.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • name
      • Column name.
      • Type: string
    • valuesCount
      • Total count of the values in this column.
      • Type: number
    • valuesPercentage
      • Percentage of values in this column with respect to the row count.
      • Type: number
    • validCount
      • Total count of valid values in this column.
      • Type: number
    • duplicateCount
      • Number of rows that contain duplicates in a column.
      • Type: number
    • nullCount
      • Number of null values in a column.
      • Type: number
    • nullProportion
      • Proportion of null values in a column.
      • Type: number
    • missingPercentage
      • Missing percentage is calculated by taking the percentage of validCount/valuesCount.
      • Type: number
    • missingCount
      • Missing count is calculated as valuesCount - validCount.
      • Type: number
    • uniqueCount
      • Number of unique values in the column.
      • Type: number
    • uniqueProportion
      • Proportion of unique values in a column.
      • Type: number
    • distinctCount
      • Number of distinct values in a column.
      • Type: number
    • min
      • Minimum value in a column.
      • Types: number, integer, string
    • max
      • Maximum value in a column.
      • Types: number, integer, string
    • minLength
      • Minimum string length in a column.
      • Type: number
    • maxLength
      • Maximum string length in a column.
      • Type: number
    • mean
      • Average value in a column.
      • Type: number
    • sum
      • Sum of the values in a column.
      • Type: number
    • stddev
      • Standard deviation of a column.
      • Type: number
    • variance
      • Variance of a column.
      • Type: number
    • histogram
      • Histogram of a column.
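
The following Python sketch shows how a few of the columnProfile fields could be computed for a numeric column. It is not the OpenMetadata profiler. It assumes that valuesCount includes null values and that variance and stddev are population statistics, neither of which this document specifies.

```python
import statistics
from typing import Any, Dict, List, Optional


def profile_numeric_column(name: str, values: List[Optional[float]]) -> Dict[str, Any]:
    """Compute a subset of columnProfile fields for a numeric column."""
    present = [v for v in values if v is not None]
    null_count = len(values) - len(present)
    profile: Dict[str, Any] = {
        "name": name,
        "valuesCount": len(values),  # assumption: nulls are counted here
        "nullCount": null_count,
        "nullProportion": null_count / len(values) if values else 0.0,
        "distinctCount": len(set(present)),
    }
    if present:
        profile.update(
            min=min(present),
            max=max(present),
            mean=statistics.mean(present),
            sum=sum(present),
            variance=statistics.pvariance(present),  # population variance
            stddev=statistics.pstdev(present),  # population standard deviation
        )
    return profile


print(profile_numeric_column("amount", [1.0, 2.0, 2.0, None, 5.0]))
```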

tableProfile

  • This schema defines the type to capture the table's data profile.
  • Type: object
  • This schema does not accept additional properties.
  • Properties

sqlQuery

  • This schema defines the type to capture the table's SQL queries.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • query
      • SQL query text that matches the table name.
      • Type: string
    • duration
      • How long the query took to run, in seconds.
      • Type: number
    • user
    • vote
      • Users can vote up to rank the popular queries.
      • Type: number
      • Default: 1
    • checksum
      • Checksum to avoid registering duplicate queries.
      • Type: string
    • queryDate
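
One way the checksum field of sqlQuery could be used to avoid registering duplicates is sketched below in Python. The hash algorithm (SHA-256), the whitespace normalization, and the in-memory registry are all assumptions made for this example; this document does not say how the checksum is computed.

```python
import hashlib
from typing import Any, Dict


def query_checksum(query: str) -> str:
    """Hash the query text after collapsing runs of whitespace (assumption)."""
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def register_query(registry: Dict[str, Dict[str, Any]], query: str) -> Dict[str, Any]:
    """Store the query once per checksum and return the stored entry."""
    checksum = query_checksum(query)
    if checksum not in registry:
        # vote starts at the documented default of 1.
        registry[checksum] = {"query": query, "checksum": checksum, "vote": 1}
    return registry[checksum]


registry: Dict[str, Dict[str, Any]] = {}
first = register_query(registry, "SELECT *  FROM orders")
second = register_query(registry, "SELECT * FROM orders")
print(first is second, len(registry))
```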

modelType

  • The value is restricted to the following:
    1. "DBT"

dataModel

  • This captures information about how the table is modeled. Currently, only DBT models are supported.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • modelType required
    • description
      • Description of the table from the model.
      • Type: string
    • path
      • Path to the SQL definition file.
      • Type: string
    • rawSql
    • sql required
    • upstream
      • Fully qualified names of the models/tables used in the SQL for creating this table.
      • Type: array
        • Items
        • Type: string
    • columns
      • Columns from the schema defined during modeling. In case of DBT, the metadata here comes from schema.yaml.
      • Type: array
    • generatedAt
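
The required properties and the "does not accept additional properties" rule for dataModel can be illustrated with a small Python check. This is a hand-written sketch with hypothetical names, not a substitute for validating against the JSON schema itself.

```python
from typing import Any, Dict, List

# Property names listed for dataModel in this document.
DATA_MODEL_PROPERTIES = {
    "modelType", "description", "path", "rawSql",
    "sql", "upstream", "columns", "generatedAt",
}
REQUIRED_PROPERTIES = {"modelType", "sql"}
MODEL_TYPES = {"DBT"}


def check_data_model(model: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means no problem."""
    errors = [
        f"missing required property: {key}"
        for key in sorted(REQUIRED_PROPERTIES - model.keys())
    ]
    errors += [
        f"unexpected property: {key}"
        for key in sorted(model.keys() - DATA_MODEL_PROPERTIES)
    ]
    if "modelType" in model and model["modelType"] not in MODEL_TYPES:
        errors.append("modelType is not an allowed value")
    return errors


print(check_data_model({"modelType": "DBT", "sql": "select 1"}))
```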

This document was updated on: Wednesday, March 9, 2022