Table

This schema defines the Table entity. A Table organizes data in rows and columns and is defined by a Schema. OpenMetadata does not have a separate abstraction for Schema. Both Table and Schema are captured in this entity.

$id: https://open-metadata.org/schema/entity/data/table.json

Type: object

This schema does not accept additional properties.

Properties

Type definitions in this schema

tableType

  • This schema defines the type used for describing different types of tables.
  • Type: string
  • The value is restricted to the following:
    1. "Regular"
    2. "External"
    3. "View"
    4. "SecureView"
    5. "MaterializedView"

dataType

  • This enum defines the type of data stored in a column.
  • Type: string
  • The value is restricted to the following:
    1. "NUMBER"
    2. "TINYINT"
    3. "SMALLINT"
    4. "INT"
    5. "BIGINT"
    6. "BYTEINT"
    7. "FLOAT"
    8. "DOUBLE"
    9. "DECIMAL"
    10. "NUMERIC"
    11. "TIMESTAMP"
    12. "TIME"
    13. "DATE"
    14. "DATETIME"
    15. "INTERVAL"
    16. "STRING"
    17. "MEDIUMTEXT"
    18. "TEXT"
    19. "CHAR"
    20. "VARCHAR"
    21. "BOOLEAN"
    22. "BINARY"
    23. "VARBINARY"
    24. "ARRAY"
    25. "BLOB"
    26. "LONGBLOB"
    27. "MEDIUMBLOB"
    28. "MAP"
    29. "STRUCT"
    30. "UNION"
    31. "SET"
    32. "GEOGRAPHY"
    33. "ENUM"
    34. "JSON"

constraint

  • This enum defines the type for column constraint.
  • Type: string
  • The value is restricted to the following:
    1. "NULL"
    2. "NOT_NULL"
    3. "UNIQUE"
    4. "PRIMARY_KEY"

tableConstraint

  • This schema defines the type for a table constraint.
  • Type: object
  • Properties
    • constraintType
      • Type: string
      • The value is restricted to the following:
        1. "UNIQUE"
        2. "PRIMARY_KEY"
        3. "FOREIGN_KEY"
    • columns
      • List of column names corresponding to the constraint.
      • Type: array
        • Items
        • Type: string
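
As an illustration of the tableConstraint shape described above, the following minimal sketch checks a candidate value against the listed enum and the array-of-strings rule. The function name `table_constraint_problems` and the sample column names are hypothetical, and both properties are treated as optional because the definition does not mark either as required.

```python
from typing import List

# Allowed values copied from the constraintType enum above.
CONSTRAINT_TYPES = ("UNIQUE", "PRIMARY_KEY", "FOREIGN_KEY")


def table_constraint_problems(constraint: dict) -> List[str]:
    """Return human-readable problems; an empty list means the value looks valid."""
    problems = []
    constraint_type = constraint.get("constraintType")
    if constraint_type is not None and constraint_type not in CONSTRAINT_TYPES:
        problems.append(f"unknown constraintType: {constraint_type!r}")
    columns = constraint.get("columns")
    if columns is not None:
        # columns must be an array whose items are all strings.
        if not isinstance(columns, list) or not all(isinstance(c, str) for c in columns):
            problems.append("columns must be an array of strings")
    return problems


# Hypothetical composite primary key over two columns.
primary_key = {"constraintType": "PRIMARY_KEY", "columns": ["order_id", "line_no"]}
print(table_constraint_problems(primary_key))  # prints []
```

For real validation, a JSON Schema validator run against the published schema is the more reliable choice; this sketch only mirrors the rules listed here.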

columnName

  • Local name (not fully qualified name) of the column. ColumnName is "-" when the column is not named in a struct dataType. For example, BigQuery supports structs with unnamed fields.
  • Type: string
  • The value must match this pattern: ^[^.]*$
  • Length: between 1 and 64

tableName

  • Local name (not fully qualified name) of a table.
  • Type: string
  • The value must match this pattern: ^[^.]*$
  • Length: between 1 and 64
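
The pattern and length rules shared by columnName and tableName can be checked with a few lines of code. This is a minimal sketch; the helper name `is_valid_local_name` is hypothetical and not part of OpenMetadata.

```python
import re

# Pattern and length bounds copied from the columnName and tableName definitions.
LOCAL_NAME_PATTERN = re.compile(r"^[^.]*$")
MIN_LENGTH, MAX_LENGTH = 1, 64


def is_valid_local_name(name: str) -> bool:
    """Return True if name contains no dots and is 1 to 64 characters long."""
    if not MIN_LENGTH <= len(name) <= MAX_LENGTH:
        return False
    return LOCAL_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_local_name("orders"))       # prints True
print(is_valid_local_name("shop.orders"))  # prints False
```

The pattern rejects dots because the dot is the separator in fully qualified names, so a local name containing one would be ambiguous.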

fullyQualifiedColumnName

  • Fully qualified name of the column, in the form serviceName.databaseName.tableName.columnName[.nestedColumnName]. When columnName is null for a field of a struct dataType, field_# is used, where # is the field index. For a map dataType, the field name key is used for the key and the field name value is used for the value.
  • Type: string
  • Length: between 1 and 256
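
The naming rule above can be sketched as a small builder. The function `fully_qualified_column_name` and the sample service, database, and table names are hypothetical. An integer in the path stands for an unnamed struct field at that index; whether the index starts at 0 or 1 is not stated in the schema, so the caller supplies it.

```python
from typing import Sequence, Union


def fully_qualified_column_name(
    service: str,
    database: str,
    table: str,
    column_path: Sequence[Union[str, int]],
) -> str:
    """Join the parts with dots; an int in column_path is an unnamed struct field."""
    parts = [service, database, table]
    for step in column_path:
        # Unnamed struct fields are rendered as field_<index>.
        parts.append(f"field_{step}" if isinstance(step, int) else step)
    name = ".".join(parts)
    if not 1 <= len(name) <= 256:
        raise ValueError(f"name length {len(name)} is outside 1..256")
    return name


# Unnamed first field of a struct column called "address".
print(fully_qualified_column_name("bigquery_prod", "shop", "orders", ["address", 0]))
# prints bigquery_prod.shop.orders.address.field_0

# Key side of a map column called "attributes".
print(fully_qualified_column_name("bigquery_prod", "shop", "orders", ["attributes", "key"]))
# prints bigquery_prod.shop.orders.attributes.key
```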

column

  • This schema defines the type for a column in a table.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • name required
    • dataType required
    • arrayDataType
      • Data type of the array elements when dataType is array. For example, array<int> has dataType as array and arrayDataType as int.
      • $ref: #/definitions/dataType
    • dataLength
      • Length for char, varchar, binary, and varbinary dataTypes; otherwise null. For example, varchar(20) has dataType as varchar and dataLength as 20.
      • Type: integer
    • dataTypeDisplay
      • Display name used for dataType. This is useful for complex types, such as array<int>, map<int,string>, struct<>, and union types.
      • Type: string
    • description
      • Description of the column.
      • Type: string
    • fullyQualifiedName
    • tags
    • constraint
    • ordinalPosition
      • Ordinal position of the column.
      • Type: integer
    • jsonSchema
      • JSON schema of the column, present only if dataType is JSON; otherwise null.
      • Type: string
    • children
      • Child columns if dataType or arrayDataType is map, struct, or union; otherwise null.
      • Type: array
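
To make the nesting concrete, here is a hedged sketch of a struct column with children and a helper that lists every dotted column path. The column names, the `dataTypeDisplay` strings, and the helper `column_paths` are hypothetical, and the sketch assumes each entry in children is itself a column object, which the listing above does not spell out.

```python
from typing import Iterator

# A struct column with two child columns; only name and dataType are required.
address = {
    "name": "address",
    "dataType": "STRUCT",
    "dataTypeDisplay": "struct<street:varchar(50),zip:int>",
    "children": [
        {"name": "street", "dataType": "VARCHAR", "dataLength": 50},
        {"name": "zip", "dataType": "INT"},
    ],
}

# An array column: dataType is ARRAY and arrayDataType names the element type.
labels = {
    "name": "labels",
    "dataType": "ARRAY",
    "arrayDataType": "STRING",
    "dataTypeDisplay": "array<string>",
}


def column_paths(column: dict, prefix: str = "") -> Iterator[str]:
    """Yield the dotted path of a column, then the paths of all nested children."""
    path = f"{prefix}.{column['name']}" if prefix else column["name"]
    yield path
    # children may be absent or null for non-complex types.
    for child in column.get("children") or []:
        yield from column_paths(child, path)


print(list(column_paths(address)))  # prints ['address', 'address.street', 'address.zip']
print(list(column_paths(labels)))   # prints ['labels']
```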

columnJoins

  • This schema defines the type to capture how frequently a column is joined with columns in other tables.
  • Type: object
  • This schema does not accept additional properties.
  • Properties

tableJoins

  • This schema defines the type to capture information about how columns in this table are joined with columns in other tables.
  • Type: object
  • This schema does not accept additional properties.
  • Properties

tableData

  • This schema defines the type to capture rows of sample data for a table.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • columns
      • List of local column names (not fully qualified column names) of the table.
      • Type: array
    • rows
      • Data for multiple rows of the table.
      • Type: array
        • Items
          • Data for a single row of the table, in the same order as the columns field.
          • Type: array
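
Because each row is positional, a consumer has to pair row values with the columns list to get named values. The sketch below shows one way to do that; the sample data and the helper `rows_as_dicts` are hypothetical.

```python
from typing import Any, Dict, List

# Sample tableData: column names first, then one array per row in the same order.
sample = {
    "columns": ["id", "name"],
    "rows": [
        [1, "Ada"],
        [2, "Grace"],
    ],
}


def rows_as_dicts(table_data: dict) -> List[Dict[str, Any]]:
    """Pair each row's values with the column names, rejecting rows of the wrong width."""
    columns = table_data["columns"]
    result = []
    for row in table_data["rows"]:
        if len(row) != len(columns):
            raise ValueError(f"row has {len(row)} values, expected {len(columns)}")
        result.append(dict(zip(columns, row)))
    return result


print(rows_as_dicts(sample))
# prints [{'id': 1, 'name': 'Ada'}, {'id': 2, 'name': 'Grace'}]
```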

columnProfile

  • This schema defines the type to capture the table's column profile.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • name
      • Column name.
      • Type: string
    • uniqueCount
      • Number of unique values in the column.
      • Type: number
    • uniqueProportion
      • Proportion of unique values in the column.
      • Type: number
    • nullCount
      • Number of null values in the column.
      • Type: number
    • nullProportion
      • Proportion of null values in the column.
      • Type: number
    • min
      • Minimum value in a column.
      • Type: string
    • max
      • Maximum value in a column.
      • Type: string
    • mean
      • Average value in a column.
      • Type: string
    • median
      • Median value in a column.
      • Type: string
    • stddev
      • Standard deviation of a column.
      • Type: number

tableProfile

  • This schema defines the type to capture the table's data profile.
  • Type: object
  • This schema does not accept additional properties.
  • Properties

sqlQuery

  • This schema defines the type to capture the table's SQL queries.
  • Type: object
  • Properties
    • query
      • SQL Query text that matches the table name.
      • Type: string
    • duration
      • How long the query took to run, in seconds.
      • Type: number
    • user
    • vote
      • Users can vote up to rank the popular queries.
      • Type: number
      • Default: 1
    • checksum
      • Checksum to avoid registering duplicate queries.
      • Type: string
    • queryDate
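
The checksum property exists to avoid registering duplicate queries. The following sketch shows how a client might use it. The schema does not specify a hash algorithm, so SHA-256 is an assumption here, and `query_checksum`, `register_query`, and the in-memory registry are hypothetical helpers rather than OpenMetadata APIs.

```python
import hashlib
from typing import Dict


def query_checksum(query: str) -> str:
    """Hash the query text; SHA-256 is an assumption, not mandated by the schema."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


def register_query(registry: Dict[str, dict], query: str, duration: float) -> bool:
    """Store a sqlQuery-shaped record unless one with the same checksum exists."""
    checksum = query_checksum(query)
    if checksum in registry:
        return False  # duplicate query text, nothing registered
    registry[checksum] = {
        "query": query,
        "duration": duration,  # seconds
        "vote": 1,             # schema default
        "checksum": checksum,
    }
    return True


registry: Dict[str, dict] = {}
print(register_query(registry, "SELECT * FROM orders", 0.42))  # prints True
print(register_query(registry, "SELECT * FROM orders", 0.40))  # prints False
```

Hashing the raw text means queries that differ only in whitespace or letter case count as distinct; normalizing the text before hashing is a design choice left open here.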

This document was updated on: Monday, November 15, 2021