2916 Commits

Author SHA1 Message Date
nicor88
5eae1e371c
fix ingestion of owner in dbt, via email (#17613) 2024-08-28 18:24:50 +05:30
Pere Miquel Brull
2180a6c7f1
FIX - profiler interface system metrics validation & e2e YAML includeDDL (#17562) 2024-08-23 09:00:18 +02:00
Imri Paran
b48c6a0485
feat(postgres): add money profile (#17558)
add support for profiling of money type
2024-08-22 14:53:34 -07:00
Pere Miquel Brull
519b3c32e3
MINOR - Speed up redshift test connection (#17553) 2024-08-22 11:12:09 -07:00
Onkar Ravgan
bbe92e2af3
MINOR: Fix none password issue for mysql and postgresql databases (#17548)
* fix none password issue

* added warning
2024-08-22 17:34:34 +05:30
kwgdaig
43a244fbf1
ISSUE-17045: Modified to create column lineage even when upstream columns and data source columns are one-to-many (#17112)
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-08-22 09:07:05 +02:00
Imri Paran
2dd613b2a7
tests: lineage (#17436)
add test for cell with 100k characters
2024-08-21 20:28:08 -07:00
Mayur Singal
dd17ee739a
MINOR: Fix output handler time ingestion (#17429)
* MINOR: Fix output handler time ingestion

* chore: fixes Lint error

---------

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2024-08-21 21:07:26 +05:30
Ayush Shah
7ada44a315
Minor: Remove trailing slash from FivetranClient base URL (#17528)
* fix: Remove trailing slash from FivetranClient base URL

* chore: use a method created for removing trailing slash
2024-08-21 13:02:16 +00:00
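Note on #17528: the log does not show the helper that was reused, so the following is only a minimal sketch of trailing-slash removal on a base URL. The name `strip_trailing_slash` and the example URL are hypothetical and are not taken from the commit.

```python
def strip_trailing_slash(url: str) -> str:
    """Return `url` without a single trailing slash (hypothetical helper)."""
    # Removing only one slash keeps the behaviour predictable for
    # inputs such as "https://host//", which may be intentional.
    return url[:-1] if url.endswith("/") else url


# Joining a path onto a base URL that ends in "/" would otherwise
# produce a double slash.
base_url = strip_trailing_slash("https://api.example.com/v1/")
print(f"{base_url}/groups")
```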
Ayush Shah
c08b83f41d
MINOR: Fixes AttributeError: 'DatalakeGcsClient' object has no attribute 'project' (#17526) 2024-08-21 17:51:35 +05:30
Imri Paran
5133c31d31
MINOR: kafka integration tests (#17457)
* tests: kafka integration

kafka integration tests with schema registry

* added ignore kafka for python 3.8

* fixed tests
2024-08-21 16:05:09 +05:30
IceS2
ac6b192fb3
MINOR: Fix pydantic v2 issues with domo (#17507)
* Fix pydantic v2 issues with domo

* Fix sourceURL for domo charts
2024-08-21 11:10:12 +02:00
Imri Paran
c055620ff4
tests: lineage (#17509)
added test cases for lineage with and without includeDDL
2024-08-21 07:47:30 +00:00
Do Manh Ha
a868596db7
Fixes #17461: Unquote and interpret escaped characters in BigQuery dataset description (#17462)
* fix(bigquery): unquote and convert any escaped characters to their actual representations

* test: bigquery description with multiple line

---------

Co-authored-by: Imri Paran <imri.paran@gmail.com>
2024-08-20 17:56:19 +02:00
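Note on #17462: one way to "unquote and interpret escaped characters" is to treat the raw description as a quoted string literal. The sketch below assumes the raw value arrives wrapped in quotes with backslash escapes; the function name `unquote_description` is hypothetical and the connector's actual implementation may differ.

```python
import ast


def unquote_description(raw: str) -> str:
    """Strip surrounding quotes and decode escapes such as \\n (sketch)."""
    is_quoted = len(raw) >= 2 and raw[0] == raw[-1] and raw[0] in "\"'"
    if not is_quoted:
        return raw
    try:
        # literal_eval only parses literals, so it is safe on untrusted text.
        value = ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Not a valid quoted literal: leave the description untouched.
        return raw
    return value if isinstance(value, str) else raw


raw_description = '"First line\\nSecond line"'
print(unquote_description(raw_description))
```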
Teddy
354f879866
fix: db_service not passed to metric instantiating (#17504) 2024-08-20 07:57:03 -07:00
Teddy
4ed2035a50
fix: remove root when accessing fqn of entity reference (#17491) 2024-08-20 15:17:16 +02:00
Ayush Shah
9880f06b2c
Fixes #17489: Allow non numeric numbers to be sent via Json, Replace NaN value… (#17490)
* fix: Allow non numeric numbers to be sent via Json, Replace NaN values with None in SQAProfilerInterface

Replace NaN values with None in the SQAProfilerInterface class to maintain database parity. NaN values will be cast to null in OpenMetadata. This change ensures that data handling processes account for this conversion.

* fix: histogram overflow error

* test: Add Unit Test for Null and Null Ratio Metric

* chore: Address comments

* chore: Address comments

* fix: checkstyle and message

* fix: failing tests as null count works as expected
2024-08-20 16:33:55 +05:30
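Note on #17490: the message describes replacing NaN values with None so that they are stored as null. A minimal sketch of that conversion on a plain dictionary of metric values follows; `replace_nan_with_none` and the sample keys are hypothetical, and the real change lives in `SQAProfilerInterface`.

```python
import math
from typing import Any, Dict


def replace_nan_with_none(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `row` with float NaN values replaced by None."""
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in row.items()
    }


# NaN is not valid JSON, while None serialises to null.
profile = {"mean": float("nan"), "nullCount": 3, "name": "price"}
print(replace_nan_with_none(profile))
```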
Imri Paran
a59eb2a3cd
fix: pin numpy version (#17487) 2024-08-20 10:19:05 +00:00
Imri Paran
5da7bb049c
MINOR: fix table profiler on empty tables in trino (#17471)
* fix(profiler): trino

coalesce row count to 0 if result is null. this value gets returned for empty tables

* fixed test_metadata.py
2024-08-20 08:42:10 +00:00
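Note on #17471: the fix coalesces a null row count to 0 for empty tables. The Trino query itself is not shown in the log, so this sketch uses SQLite from the standard library as a stand-in; `row_count_or_zero` and the table name are hypothetical.

```python
import sqlite3


def row_count_or_zero(conn: sqlite3.Connection, table: str) -> int:
    """Count rows via an aggregate that is NULL on empty tables (sketch)."""
    # SUM(1) yields NULL when there are no rows; COALESCE maps it to 0.
    # `table` is assumed to be a trusted identifier in this illustration.
    query = f'SELECT COALESCE(SUM(1), 0) FROM "{table}"'
    (count,) = conn.execute(query).fetchone()
    return count


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE empty_table (id INTEGER)")
print(row_count_or_zero(conn, "empty_table"))
```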
IceS2
48b43900b6
Install db2 dependency on amd64 architectures (#17495) 2024-08-20 09:24:38 +02:00
Imri Paran
2722eadc33
fix: gcs (#17486)
1. update docs using gcp credentials in path.
2. updated example `clientIt` in docs
3. fixed client to work with implicit project
4. fixed workflow to warn about missing buckets
2024-08-19 23:14:46 -07:00
Imri Paran
31c2ec8c57
MINOR: fix qlikcloud test connection (#17459)
* fix: qlikcloud test connection

* patch test_connection for qlik cloud unit tests
2024-08-19 23:14:09 -07:00
IceS2
ddd8c41864
Fix DB2 Schema Trailing Whitespaces (#17475) 2024-08-19 23:13:14 -07:00
Imri Paran
7508848376
fix(dq): data types for unique columns (#17431)
1. remove json and array from supported data types of unique column test.
2. migrations.
3. tests.
2024-08-19 14:28:42 +02:00
Mayur Singal
8acf6d3e94
MINOR: Make Include ddl disabled by default (#17450)
* MINOR: Make Include ddl disabled by default

* make schema def failure warning debug

* Add missing condition

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
Co-authored-by: Pablo Takara <pjt1991@gmail.com>
2024-08-19 14:07:18 +02:00
Pere Miquel Brull
b175e40e99
MINOR - Clean DEBUG logs (#17464)
Co-authored-by: Suman Maharana <sumanmaharana786@gmail.com>
2024-08-19 12:20:29 +02:00
Imri Paran
4c08f82e4e
Fixes 17413: Fix one sided tests for columnValueLengthsToBeBetween and columnValuesToBeBetween (#17423)
* mysql integration tests

* fix(data-quality): accept between with no bounds

add between filters only when the bounds are defined. if they are not (ie: resolve to 'inf' values), do not add any filters

* format

* consolidated ingestion_config

* format

* fixed handling of date and time columns

* fixed tests
2024-08-19 09:09:35 +02:00
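Note on #17423: the message says between filters are added only when a bound is defined, and skipped when it resolves to an 'inf' value. A minimal sketch of that rule, building plain predicate strings, is shown below; `between_clauses` is a hypothetical name and the real tests build SQLAlchemy filters rather than strings.

```python
import math
from typing import List


def between_clauses(
    column: str,
    min_bound: float = -math.inf,
    max_bound: float = math.inf,
) -> List[str]:
    """Return predicates only for the bounds that are actually defined."""
    clauses: List[str] = []
    if not math.isinf(min_bound):
        clauses.append(f"{column} >= {min_bound}")
    if not math.isinf(max_bound):
        clauses.append(f"{column} <= {max_bound}")
    return clauses


# A one-sided test: only the lower bound is set, so only one filter results.
print(between_clauses("age", min_bound=18))
```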
Onkar Ravgan
bbb3256c0d
Match correct file names for the dbt artifacts (#17445) 2024-08-18 11:40:37 +02:00
Suman Maharana
de3a82eeb6
Minor: Kill active/idle connections after test connections (#17411)
* Minor: Kill active/idle connections after test connections

* fixed idle conn for multi db

* added exception handling
2024-08-14 15:42:42 +02:00
Ayush Shah
8ad6c95fe4
Fixes #17367: PipelineStatus Timestamp None not allowed (#17422)
* fix(ingestion): Change Timestamp None to Current Time noting pending pipeline

* fix(ingestion): Address comments around PipelineStatus timestamp

* fix(ingestion): Improve timestamps handling for tasks and pipeline status
2024-08-13 15:39:29 +02:00
Suman Maharana
feab12422b
MINOR: Fix Datetime Conversion issue in usage/lineage (#17380)
* MINOR: Fix Datetime Conversion issue in usage/lineage

* Undo mssql specific fixes

* fixed datetime conversion in mssql

* fixed datetime conversion in oracle
2024-08-13 14:04:50 +02:00
IceS2
5e32c2aa78
Install DB2 odbc driver on x86_64 architectures and update docs (#17425) 2024-08-13 13:16:19 +02:00
Onkar Ravgan
1bc0ca7155
MINOR: Added support to process multiple dbt run_results.json for a single dbt project (#17412)
* Added dbt multiple run_results

* correct to suffix

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-08-13 13:19:56 +05:30
harshsoni2024
a67a3241e7
sql query correct wildcard syntax (#17405) 2024-08-13 11:39:05 +05:30
Suman Maharana
1a2fd24c10
Fixes: DBT Cloud lineage not showing (#17395) 2024-08-12 23:48:40 +05:30
Imri Paran
3069a63cb4
remove pandas import for null_ratio (#17401) 2024-08-12 17:20:11 +02:00
Antoine Balliet
34dc79b5fe
feat: DBT - allow to use email to match users as fallback with ES search (#17349)
* feat: allow to use email to match dbt model owner

* improve description

* revert add option to parse from email

* feat: implement email matching in case name was not found

* feat: use email to find user as fallback at connector level
2024-08-12 14:54:54 +02:00
harshsoni2024
c84b77859e
MINOR: databricks describe table query update (#17396)
* databricks describe table query update

* remove extra args

---------

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2024-08-12 17:35:04 +05:30
Mayur Singal
6a31c579f6
MINOR: Add Unity Catalog Lineage Dialect (#17398) 2024-08-12 17:01:37 +05:30
Ayush Shah
af14267e09
Fixes #17319: ArrayDataType issue resolved, Fix Queries + Add DB Name to the queries (#17379)
* fixes arrayDataType must be not null, adding db name to queries as it fails

* Fix Pydantic Issue

* Partial: Add Unity Catalog Topology Test

* Fix lint

* Fix Tests, Fix UnityCatalog Array Column issue

* Fix Tests

* Address comments, add logger to the exception
2024-08-12 09:59:03 +02:00
Ayush Shah
83e2b68a25
Fixes #17377: HIERARCHYID & GEOGRAPHY pyodbc error (#17378)
* fixes: HIERARCHYID & GEOGRAPHY pyodbc error

- Add support for "HIERARCHYID" column type in column_type_parser.py
- Register AzureSQLSampler for "AzureSQLConnection" in sampler_factory.py

* Revert SQA_UTILS
2024-08-11 17:29:43 +02:00
harshsoni2024
0548342239
Fix #16958: column parser data type fix (#17154) 2024-08-11 00:24:46 +05:30
Pere Miquel Brull
a098c20c7c
MINOR - LKML sample data (#17359) 2024-08-10 18:01:00 +02:00
harshsoni2024
264be13b66
databricks foreign table issue: skip table metadata for foreign tables (#17343)
* skip get_columns for foreign catalog tables

* get table type before executing column metadata

* remove duplicate query, pycheckstyle fix

* skip fk table instead of reaching till column metadata

* add debug log
2024-08-10 21:08:49 +05:30
Onkar Ravgan
f33bfe78d6
fixed dbt cloud conn (#17376) 2024-08-10 16:58:55 +05:30
IceS2
e5a7cff5a5
Updated oracle to use DBA_ tables (#17274)
Co-authored-by: Ayush Shah <ayush@getcollate.io>
2024-08-10 16:58:21 +05:30
Yung-Chun
453c20f53e
fix type hint (#17354) 2024-08-10 10:48:42 +05:30
harshsoni2024
1b04f1fb37
Fix #16573: get table owners for databricks & unitycatalog tables (#17282) 2024-08-10 10:45:22 +05:30
IceS2
e52c4af9ee
Fix Databricks TableQuery timestamps to str (#17362) 2024-08-09 17:32:32 +02:00
IceS2
322794ecc2
MINOR: Fix usage datetime format for mssql (#17341)
* Fix usage datetime format for mssql

* Add Integration Test to check that the Usage workflow runs without error

* Fix checkstyle
2024-08-08 16:31:31 +02:00