Table

This schema defines the Table entity. A Table organizes data in rows and columns and is defined by a Schema. OpenMetadata does not have a separate abstraction for Schema. Both Table and Schema are captured in this entity.

**$id:** https://open-metadata.org/schema/entity/data/table.json

Type: object

Properties

Type definitions in this schema

tableType

  • This schema defines the type used for describing different types of tables.
  • Type: string
  • The value is restricted to the following:
    1. "Regular"
    2. "External"
    3. "View"
    4. "SecureView"
    5. "MaterializedView"

dataType

  • This enum defines the type of data stored in a column.
  • Type: string
  • The value is restricted to the following:
    1. "NUMBER"
    2. "TINYINT"
    3. "SMALLINT"
    4. "INT"
    5. "BIGINT"
    6. "BYTEINT"
    7. "FLOAT"
    8. "DOUBLE"
    9. "DECIMAL"
    10. "NUMERIC"
    11. "TIMESTAMP"
    12. "TIME"
    13. "DATE"
    14. "DATETIME"
    15. "INTERVAL"
    16. "STRING"
    17. "MEDIUMTEXT"
    18. "TEXT"
    19. "CHAR"
    20. "VARCHAR"
    21. "BOOLEAN"
    22. "BINARY"
    23. "VARBINARY"
    24. "ARRAY"
    25. "BLOB"
    26. "LONGBLOB"
    27. "MEDIUMBLOB"
    28. "MAP"
    29. "STRUCT"
    30. "UNION"
    31. "SET"
    32. "GEOGRAPHY"
    33. "ENUM"
    34. "JSON"

constraint

  • This enum defines the type for column constraint.
  • Type: string
  • The value is restricted to the following:
    1. "NULL"
    2. "NOT_NULL"
    3. "UNIQUE"
    4. "PRIMARY_KEY"

tableConstraint

  • This schema defines the type for a table constraint.
  • Type: object
  • Properties
    • constraintType
      • Type: string
      • The value is restricted to the following:
        1. "UNIQUE"
        2. "PRIMARY_KEY"
        3. "FOREIGN_KEY"
    • columns
      • List of column names corresponding to the constraint.
      • Type: array
        • Items
        • Type: string

columnName

  • Local name (not fully qualified name) of the column. The columnName is "-" when the column is not named in a struct dataType. For example, BigQuery supports structs with unnamed fields.
  • Type: string
  • The value must match this pattern: ^[^.]*$
  • Length: between 1 and 64

tableName

  • Local name (not fully qualified name) of a table.
  • Type: string
  • The value must match this pattern: ^[^.]*$
  • Length: between 1 and 64
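
The pattern and length rules shared by columnName and tableName can be sketched in Python as follows. The helper name `is_valid_local_name` is hypothetical. The sketch uses Python's `re` module, whose handling of this simple pattern is assumed to match that of a JSON Schema validator.

```python
import re

# Pattern and length bounds copied from the columnName and tableName definitions.
LOCAL_NAME_PATTERN = re.compile(r"^[^.]*$")
MIN_LENGTH, MAX_LENGTH = 1, 64


def is_valid_local_name(name: str) -> bool:
    """True if the name has 1 to 64 characters and contains no dot."""
    return (
        MIN_LENGTH <= len(name) <= MAX_LENGTH
        and LOCAL_NAME_PATTERN.fullmatch(name) is not None
    )
```

The pattern rejects dots because the dot is the separator in fully qualified names, so `is_valid_local_name("orders")` holds while `is_valid_local_name("sales.orders")` does not.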

fullyQualifiedColumnName

  • Fully qualified name of the column, in the form serviceName.databaseName.tableName.columnName[.nestedColumnName]. When columnName is null for a field of a struct dataType, field_# is used, where # is the field index. For a map dataType, the field name key is used for the key and the field name value is used for the value.
  • Type: string
  • Length: between 1 and 256

column

  • This schema defines the type for a column in a table.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • name required
    • dataType required
    • arrayDataType
      • Data type of the array elements when dataType is array. For example, array<int> has dataType as array and arrayDataType as int.
      • $ref: #/definitions/dataType
    • dataLength
      • Length for char, varchar, binary, and varbinary dataTypes; otherwise null. For example, varchar(20) has dataType as varchar and dataLength as 20.
      • Type: integer
    • dataTypeDisplay
      • Display name used for dataType. This is useful for complex types, such as array<int>, map<int,string>, struct<>, and union types.
      • Type: string
    • description
      • Description of the column.
      • Type: string
    • fullyQualifiedName
    • tags
    • constraint
    • ordinalPosition
      • Ordinal position of the column.
      • Type: integer
    • jsonSchema
      • JSON schema of the column, present only if the dataType is JSON; otherwise null.
      • Type: string
    • children
      • Child columns if dataType or arrayDataType is map, struct, or union; otherwise null.
      • Type: array
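
To make the column definition more concrete, the Python sketch below checks a nested column against the two rules stated above: name and dataType are required, and no additional properties are accepted. The function name `check_column` and the sample `address` column are hypothetical. The sketch also assumes that each entry in children is itself a column object, which the definition above does not state explicitly.

```python
from typing import Any, Dict, List

# Property names copied from the column definition above.
ALLOWED = {
    "name", "dataType", "arrayDataType", "dataLength", "dataTypeDisplay",
    "description", "fullyQualifiedName", "tags", "constraint",
    "ordinalPosition", "jsonSchema", "children",
}
REQUIRED = {"name", "dataType"}


def check_column(column: Dict[str, Any], path: str = "") -> List[str]:
    """Recursively report missing required and unexpected properties."""
    name = str(column.get("name", "?"))
    label = f"{path}.{name}" if path else name
    errors = [
        f"{label}: missing required property {key}"
        for key in sorted(REQUIRED - set(column))
    ]
    errors += [
        f"{label}: unexpected property {key}"
        for key in sorted(set(column) - ALLOWED)
    ]
    # Assumption: children are column objects with the same shape.
    for child in column.get("children") or []:
        errors += check_column(child, label)
    return errors


# Hypothetical struct column with two child columns.
address = {
    "name": "address",
    "dataType": "STRUCT",
    "dataTypeDisplay": "struct<street:varchar(40),zip:int>",
    "children": [
        {"name": "street", "dataType": "VARCHAR", "dataLength": 40},
        {"name": "zip", "dataType": "INT"},
    ],
}
```

Here `check_column(address)` returns an empty list. Adding a key that is not in the property list, such as `nullable`, produces an "unexpected property" message.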

columnJoins

  • This schema defines the type to capture how frequently a column is joined with columns in the other tables.
  • Type: object
  • This schema does not accept additional properties.
  • Properties

tableJoins

  • This schema defines the type to capture information about how columns in this table are joined with columns in the other tables.
  • Type: object
  • This schema does not accept additional properties.
  • Properties

tableData

  • This schema defines the type to capture rows of sample data for a table.
  • Type: object
  • This schema does not accept additional properties.
  • Properties
    • columns
      • List of local column names (not fully qualified column names) of the table.
      • Type: array
    • rows
      • Data for multiple rows of the table.
      • Type: array
        • Items
        • Data for a single row of the table, in the same order as the columns field.
        • Type: array
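
Because each row lists its values in the same order as columns, sample data can be turned into one record per row by pairing the two. The Python sketch below illustrates this. The function name `rows_as_records` and the sample values are hypothetical, and the length check is an extra safeguard that the definition above does not require.

```python
from typing import Any, Dict, List


def rows_as_records(table_data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pair each row's values with the column names, position by position."""
    columns = table_data["columns"]
    records = []
    for number, row in enumerate(table_data["rows"]):
        if len(row) != len(columns):
            raise ValueError(
                f"row {number} has {len(row)} values, expected {len(columns)}"
            )
        records.append(dict(zip(columns, row)))
    return records


# Hypothetical sample data for a two-column table.
sample = {"columns": ["id", "name"], "rows": [[1, "Alice"], [2, "Bob"]]}
```

With this input, `rows_as_records(sample)` gives `[{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]`.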

columnProfile

  • This schema defines the type to capture the table's column profile.
  • Type: object
  • Properties
    • name
      • Column Name.
      • Type: string
    • uniqueCount
      • Number of unique values in the column.
      • Type: number
    • uniqueProportion
      • Proportion of unique values in a column.
      • Type: number
    • nullCount
      • Number of null values in a column.
      • Type: number
    • nullProportion
      • Proportion of null values in a column.
      • Type: number
    • min
      • Minimum value in a column.
      • Type: string
    • max
      • Maximum value in a column.
      • Type: string
    • mean
      • Average value in a column.
      • Type: string
    • median
      • Median value in a column.
      • Type: string
    • stddev
      • Standard deviation of a column.
      • Type: number
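
The sketch below shows one way a columnProfile could be computed in Python for a numeric column. It is not the OpenMetadata profiler. The function name `profile_column` is hypothetical, and several choices are assumptions that the definition above leaves open: unique values are counted as distinct non-null values, both proportions are taken over all rows including nulls, and the standard deviation is the population form (`statistics.pstdev`). The min, max, mean, and median values are converted to strings to match the types listed above.

```python
import statistics
from typing import Any, Dict, List, Optional


def profile_column(name: str, values: List[Optional[float]]) -> Dict[str, Any]:
    """Build a columnProfile-shaped dictionary for a numeric column."""
    present = [value for value in values if value is not None]
    if not present:
        raise ValueError("at least one non-null value is required")
    total = len(values)
    unique_count = len(set(present))  # assumption: distinct non-null values
    null_count = total - len(present)
    return {
        "name": name,
        "uniqueCount": unique_count,
        "uniqueProportion": unique_count / total,  # assumption: over all rows
        "nullCount": null_count,
        "nullProportion": null_count / total,
        "min": str(min(present)),
        "max": str(max(present)),
        "mean": str(statistics.mean(present)),
        "median": str(statistics.median(present)),
        "stddev": statistics.pstdev(present),  # assumption: population form
    }
```

For the values `[1, 2, 2, None]`, this sketch reports a uniqueCount of 2, a nullCount of 1, and a nullProportion of 0.25.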

tableProfile

  • This schema defines the type to capture the table's data profile.
  • Type: object
  • This schema does not accept additional properties.
  • Properties

This document was updated on: Monday, October 18, 2021