2362 Commits

Author SHA1 Message Date
Mayur Singal
186eb252d7
Fix #11332: Fix databricks sample data ingestion for array datatype (#11420)
* Fix #11332: Fix databricks sample data ingestion for array datatype

* Fix checkstyle
2023-05-04 13:36:12 +05:30
Mayur Singal
49f3bae15e
Fix geometry type for postgres (#11394) 2023-05-04 13:02:50 +05:30
Teddy
0a7f114281
fix: added logic to handle tests with no in result (#11409) 2023-05-03 21:59:23 +02:00
NiharDoshi99
02d4a1d7d6
making downstream_task_ids field optional for airflow AirflowDagDetails (#11405)
* making downstream_task_ids field optional for airflow AirflowDagDetails

* update requirements file for airflow
2023-05-03 22:09:10 +05:30
Onkar Ravgan
7e9c02fe6f
Fixed clean_query method for \n (#11389)
* Fixed clean query method

* fixed regex and tests

* updated regex
2023-05-03 18:08:54 +05:30
Keith Sirmons
00289bd85f
Fixes #11189: Implement Impala and hive get_view_definition (#11237)
* updated metadata to work with the impala query engine.
Uses the describe function to grab column names, data types, and comments.

* added the ordinalPosition data point into the Column constructor.

* renamed variable to better describe its usage.

* updated profile errors.
Hive connections now comment columns by default.

* removed print statements

* Cleaned up code by pulling check into its own function

* Updated median function to return null when it is being used for first and third quartiles.

* removed print statements and ran make py_format

* updated to fix some pylint errors.
imported Dialects to remove string compare to "impala" engine

* moved huge comment into function docstring.
This comment shows us the sql to get quartiles in Impala

* added cast to decimal for column when running average in mean.py

* fixed lint error

* fixed ui ordering of precision and scale.
Precision should be ordered in front of scale since the precision is set first in decimal data types

* Added get_view_definition to hive and impala connectors.

---------

Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
2023-05-03 15:06:33 +05:30
Teddy
f8c667b504
Fix median for concatenable types (#11382)
* fix: median/fq/tq for concatenable types

* fix: ran linting
2023-05-02 10:45:26 +00:00
Ayush Shah
00ecca07e9
Add fix for test connection w/o db (#11354) 2023-05-02 16:00:57 +05:30
Nahuel
94eece76f8
Fix: Tableau DataModel optional dataType (#11379) 2023-05-02 09:05:44 +00:00
Mayur Singal
4110dc2472
Fix #11352: Fix athena usage models (#11378) 2023-05-02 08:24:01 +00:00
Teddy
4b5a0eab1a
fix: catch generic SQLAlchemy error for non supported regex_match (#11366) 2023-05-02 10:30:30 +05:30
Keith Sirmons
ad9b5a0cb5
Impalaconnection 0.2.1 + string datatypes enabled in profile (#11364)
* updated metadata to work with the impala query engine.
Uses the describe function to grab column names, data types, and comments.

* added the ordinalPosition data point into the Column constructor.

* renamed variable to better describe its usage.

* updated profile errors.
Hive connections now comment columns by default.

* removed print statements

* Cleaned up code by pulling check into its own function

* Updated median function to return null when it is being used for first and third quartiles.

* removed print statements and ran make py_format

* updated to fix some pylint errors.
imported Dialects to remove string compare to "impala" engine

* moved huge comment into function docstring.
This comment shows us the sql to get quartiles in Impala

* added cast to decimal for column when running average in mean.py

* fixed lint error

* fixed ui ordering of precision and scale.
Precision should be ordered in front of scale since the precision is set first in decimal data types

* Fixed overflow error when converting large numbers to bigint

Fixed error for CHAR datatype missing.

* Fixed NaN issues with Impala Profile

* py formatting

* Fixed warnings from SqlAlchemy
  The GenericFunction 'max' is already registered and is going to be overridden.
  The GenericFunction 'min' is already registered and is going to be overridden.

Updated Min/Max to handle strings by getting their length.

* Updated profiler to handle strings by using the string length as the parameter to compute the profile

* py_format updates

* fix: ran linting

* fix: Mysql hardcoded table alias

---------

Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
Co-authored-by: Teddy Crepineau <teddy.crepineau@gmail.com>
2023-04-30 10:03:56 +02:00
Pere Miquel Brull
fc5c0fa756
Fixes #11340 - Add missing headers (#11356)
* Add missing headers

* Add raise

* Format
2023-04-28 07:42:37 +02:00
Teddy
b715208d28
Fixes #11327 - Improve Profiler Logging (#11341)
* feat: improved profiler logging

* feat: ran python linting
2023-04-27 18:18:33 +02:00
Pere Miquel Brull
c53a3413fb
Fixes #11307 - Handle exceptions if LookML model is invalid (#11320)
* Fix dynamo docs

* Handle data model fetch exceptions

* Format

* Add example for Private Key format
2023-04-27 11:42:16 +02:00
Mayur Singal
fd5f63fb58
Fix MSSQL connection with pyodbc scheme (#11304)
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2023-04-27 07:25:10 +02:00
Teddy
0930bc307a
fix: change in entityLink to string in CreateTestCaseRequest (#11291) 2023-04-26 10:52:09 +00:00
Ayush Shah
dd509681be
Fixes tableau, add quicksight e2e (#11177) 2023-04-26 10:22:08 +05:30
Onkar Ravgan
8bcfd013a1
Added validation (#11249) 2023-04-25 06:58:59 +00:00
Mayur Singal
c920c9afa3
0.13 to 1.00 docs changes (#11236)
* 0.13 to 1.00 changes

* add superset changes

* dbt gcs yaml fix

---------

Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
2023-04-24 16:12:24 +02:00
NiharDoshi99
d1996d4260
added docs for sqlite (#11232) 2023-04-24 11:07:15 +00:00
Pere Miquel Brull
d3d523e96d
Ingestion md docs review (#11219)
* Update workflow docs

* Remove duplicate key

* Update Custom connector docs

* Update Domo connector docs

* Dashboard docs updates

* Some databases docs updates

* Finish db docs updates

* Remove Pulsar

* Messaging docs

* Metadata docs

* ML docs

* S3 docs

* Fix rendering

* Update title and description of the databaseSchema

* Pipeline Service docs

* remove pulsar from tests

* Format

* Fix test

* Remove pulsar

* Remove pulsar
2023-04-23 18:43:46 +02:00
Sriharsha Chintalapani
123758e21e
Fix #10964: update retentionSize based on retention.size in the topic config (#11217)
* Fix #10964: update retentionSize based on retention.size in the topic config
2023-04-23 08:36:58 +02:00
Sriharsha Chintalapani
9e259be44e
Fix #11214: Ingestion based elastic search index missing serviceType for MLModel and Container (#11215) 2023-04-23 07:30:37 +02:00
Mayur Singal
cb5ee34a1b
Fix Lineage Via Table Entity Error (#11209) 2023-04-22 18:31:30 +02:00
Teddy
6e129c1e65
Issue 10805 Added Hive e2e (#11197)
* tests: Added E2E test for Hive + fix minor bug

* tests: ran python linting
2023-04-21 15:45:12 +00:00
Onkar Ravgan
4c3b20b910
Req Markdown docs: dbt, sagemaker, mode, powerbi, db2, dynamo, kinesis, fivetran (#11173)
* Added markdown req docs

* Added v1 docs

* Update openmetadata-docs-v1/content/v1.0.0/connectors/database/db2/index.md

fixed typo in db2 grant

Co-authored-by: Teddy <teddy.crepineau@gmail.com>

* typo fix v1 docs

---------

Co-authored-by: Teddy <teddy.crepineau@gmail.com>
2023-04-21 16:44:41 +02:00
Ayush Shah
a50c31539b
Fix HexByteString Issue, revert datatype change (#11145)
* Fix HexByteString Issue, revert datatype change

* Add E2E MSSQL Bit type
2023-04-21 10:08:27 +02:00
Keith Sirmons
97b58c65f5
Impalaconnection (#11151)
* updated metadata to work with the impala query engine.
Uses the describe function to grab column names, data types, and comments.

* added the ordinalPosition data point into the Column constructor.

* renamed variable to better describe its usage.

* updated profile errors.
Hive connections now comment columns by default.

* removed print statements

* Cleaned up code by pulling check into its own function

* Updated median function to return null when it is being used for first and third quartiles.

* removed print statements and ran make py_format

* updated to fix some pylint errors.
imported Dialects to remove string compare to "impala" engine

* moved huge comment into function docstring.
This comment shows us the sql to get quartiles in Impala

* added cast to decimal for column when running average in mean.py

* fixed lint error

* fixed ui ordering of precision and scale.
Precision should be ordered in front of scale since the precision is set first in decimal data types

* first pass for impala connector

* updated default auth_mechanism to be one of the enum values.

* updated UI documentation to match fields for the impalaConnection.

refined impalaConnection to make use_ssl a boolean instead of relying on an extra connection option being manually added.

Removed reference to hive for type mapping

added impala to the pip setup

* py_format updates

* removed print statement

* Lints and fixes

* Updated database documentation to follow new style

* Flag as BETA

* Remove tests

---------

Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-04-21 09:57:13 +02:00
Nahuel
8184516f80
Fix: Align includeOwner option with the rest of options to include entities (#11160) 2023-04-20 16:01:54 +02:00
Pere Miquel Brull
91cd1491ee
Add Athena, Lineage and Usage docs & Fix Athena UI Lineage and Usage workflows (#11148)
* Athena docs

* Lineage and Usage docs

* Missing section close

* Fix Athena Model
2023-04-20 06:31:53 +02:00
Mayur Singal
dd754d586e
Metabase E2E Test & docs (#11126) 2023-04-20 00:50:23 +05:30
Onkar Ravgan
09fb69b68d
Fixed objectstore import (#11144) 2023-04-19 21:39:29 +05:30
Teddy
0f7d9699ad
Fix metrics filtering (#11149)
* fix: get column not filtering for metric types when profilerConfig with include columns is set

* fix: run python linting
2023-04-19 14:09:13 +00:00
Milan Bariya
66b25d2f30
Fix: Databricks usage issue (#11143) 2023-04-19 19:25:17 +05:30
Ayush Shah
ca861bc06e
Fixes #11137: Mssql Syntax Error + Arithmetic Error (#11138) 2023-04-19 15:08:12 +05:30
NiharDoshi99
1862ba2ba4
Changing behaviour for owners same as description for dashboards (#11118)
* changing behaviour for owners same as description

* fix typo
2023-04-19 12:31:56 +05:30
Pere Miquel Brull
a78a3b4734
Azure datalake metadata ingestion fixes (#11125)
* Add ADLS permissions

* Fix Azure DL ingestion

* Format

* enable decode for json

* fix gcs decode error

---------

Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2023-04-19 07:28:41 +02:00
Pere Miquel Brull
463f242d6b
Add S3 Storage docs & CI validation (#11120)
* Prep CI

* Update Athena

* Update docs for S3 Storage

* Add manifest information
2023-04-19 06:31:55 +02:00
Milan Bariya
7cbe48971d
fix: Redash lineage issue (#11098)
* fix: Redash lineage issue

* change based on comments

* change based on comments

* change based on comments
2023-04-18 21:22:17 +05:30
Teddy
97ff34967a
Fix histogram bin creation (#11105)
* fix: bin creation + pass full table name for mysql median computation

* fix: ran linting for python

---------

Co-authored-by: Nahuel <nahuel@getcollate.io>
2023-04-18 14:49:21 +02:00
Nahuel
22ce62e13b
Fix: Add Redash E2E test (#11091) 2023-04-18 12:52:38 +05:30
Mayur Singal
857ddeab1e
Ingestion: Metabase Unit Tests (#11080) 2023-04-18 09:08:17 +05:30
Mayur Singal
e7013b481a
Improve Redshift Query (#11082) 2023-04-18 00:51:06 +05:30
Teddy
b04f7225f8
fix: column retrieval for SNOWFLAKE (#11090) 2023-04-17 14:36:58 +00:00
NiharDoshi99
7e4b63997b
Changes for no columns and added owner for table (#11086)
* changes for no columns and added owner for table

* added pydantic model for owners
2023-04-17 18:42:30 +05:30
Onkar Ravgan
b82abd5047
Fixed tableau url (#11071)
* Fixed tableau url

* review comments and tests

* changes to remove host-port addition from the UI for dashboard and chart urls

---------

Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com>
2023-04-17 11:20:11 +02:00
Mayur Singal
199fe8753a
Fix Top Level Imports (#11075) 2023-04-14 17:18:38 +00:00
Pere Miquel Brull
ae984d1808
Handle impala auth mechanism (#11074) 2023-04-14 18:04:42 +02:00
Teddy
a7d98dddda
Fixes #9632 - Add Profiler Support for BQ Arrays of Structs (#11059) 2023-04-14 19:29:26 +05:30