242 Commits

Author SHA1 Message Date
Pere Miquel Brull
11c07ee8ab
Fix #11516 - SAP Hana Connector (#11777)
* SAP Hana skeleton

* Add SAP Hana Connector

* Fix ingestion and docs

* Prep SAP Hana Profiler

* Linting

* Update index.md

* Revert: Update index.md

---------

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2023-05-31 16:00:31 +02:00
Chirag Madlani
7adc291364
fix(ui): circular deps for entityReference.json (#11760)
* fix(ui): circular deps for entityReference.json

* Fix circular Dependency python

* Cap Delta Spark version

---------

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2023-05-26 18:02:21 +05:30
Onkar Ravgan
3fbddc2a03
upgrade kafka dep (#11721) 2023-05-23 09:59:12 -07:00
Sriharsha Chintalapani
6509a3670a
Fix #11664: Refactor patch_mixin to use jsonpatch lib (#11696)
* Fix #11664: Refactor patch_mixin to use jsonpatch lib

* Migrate to jsonpatch

* Fix nested cols

* Format

* Update patch_description

* Table constraints

* tag

* owner

* column tag

* column desc

* Format

* Format

* Fix log

* Update dbt patch

* Update column fqn

* Fix test

* Fix tests

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-05-23 15:47:11 +02:00
Pere Miquel Brull
1370325762
Fix typing_extension version (#11719)
* fix extension

* fix extension
2023-05-23 10:09:31 +02:00
Onkar Ravgan
3d9d4416b7
Fixed incompatible column name for Postgres version 11.6 (#11536)
* postgres col name on version

* Added dependency

* Added parenthesis validation

* review comments and tests
2023-05-15 11:48:03 +05:30
Akash Jain
92d5bfa94e
fix: versions in main branch (#11478)
* fix: versions in main branch

* Prepare main branch for next release 1.1.0

* prepare main for latest release 1.0.1
2023-05-15 10:42:29 +05:30
Mayur Singal
ef7b02529d
Remove sqlalchemy-ibmi from db2 dependency (#11553)
* Remove sqlalchemy-ibmi from db2 dependency

* remove from json schema

* add migration

* update what's new
2023-05-11 15:03:26 +02:00
Nahuel
1ec6e5e285
Fix#11311: Add IBM dependency for i Series in DB2 connector (#11381) 2023-05-02 15:50:39 +02:00
Ayush Shah
8ebe6a80e6
Upgrade Pyarrow (#11383) 2023-05-02 16:00:32 +05:30
Keith Sirmons
97b58c65f5
Impalaconnection (#11151)
* updated metadata to work with the impala query engine.
Uses the describe function to grab column names, data types, and comments.

* added the ordinalPosition data point into the Column constructor.

* renamed variable to better describe its usage.

* updated profile errors.
Hive connections now comment columns by default.

* removed print statements

* Cleaned up code by pulling check into its own function

* Updated median function to return null when it is being used for first and third quartiles.

* removed print statements and ran make py_format

* updated to fix some pylint errors.
imported Dialects to remove string compare to "impala" engine

* moved huge comment into function docstring.
This comment shows us the sql to get quartiles in Impala

* added cast to decimal for column when running average in mean.py

* fixed lint error

* fixed ui ordering of precision and scale.
Precision should be ordered in front of scale since the precision is set first in decimal data types

* first pass for impala connector

* updated default auth_mechanism to be one of the enum values.

* updated UI documentation to match fields for the impalaConnection.

refined impalaConnection to make use_ssl a boolean instead of relying on an extra connection option being manually added.

Removed reference to hive for type mapping

added impala to the pip setup

* py_format updates

* removed print statement

* Lints and fixes

* Updated database documentation to follow new style

* Flag as BETA

* Remove tests

---------

Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-04-21 09:57:13 +02:00
Nahuel
c3bfd1310a
Fix: bump 'openmetadata-sqllineage' version to 1.0.4 (#11109) 2023-04-18 13:58:39 +02:00
Pere Miquel Brull
09b283818d
Rel to #10927 - Looker DataModel (#10945)
* Organise calls

* Prepare skeleton

* Add looker model handling

* Parse files as sql

* Handle labels

* Linting

* Format

* Fix version

* Also check the API for explore lineage
2023-04-11 08:44:00 +02:00
Mayur Singal
d7e0153000
Fix #10896: Fix snappy codec issue (#10919) 2023-04-05 12:12:47 +05:30
Nahuel
6c9ef22168
Update: openmetadata-sqllineage dependency (#10894) 2023-04-03 11:41:13 +00:00
NiharDoshi99
3406c8c868
removed en_web_md from setup (#10839)
* removed en_web_md from setup

* Use Constant

---------

Co-authored-by: ulixius9 <mayursingal9@gmail.com>
2023-03-30 09:43:41 +00:00
Onkar Ravgan
5d6e18dc28
Fix 10642: Mark delete entities and tags toggle (#10695)
* Added mark delete logic

* Final test and optimization

* After merge fixes

* Added include tags for dash pipelines dbt

* added docs and fixed test

* Fixed py tests

* Added UI changes for following newly added fields:
- markDeletedDashboards
- markDeletedMlModels
- markDeletedPipelines
- markDeletedTopics
- includeTags

* Fixed failing unit tests

* updated json files of localization for other languages

* Improved localization changes

* added localization changes for other languages

* Updated mark deleted desc

* updated the ingestion fields descriptions in the ingestion form for UI

* automated localization changes for other languages

* updated descriptions for includeTags field for dbtPipeline and databaseServiceMetadataPipeline json

* fixed issue where includeTags field was being sent in the dbtConfigSource

* Added flow to input taxonomy while adding BigQuery service.

---------

Co-authored-by: Aniket Katkar <aniketkatkar97@gmail.com>
2023-03-29 12:41:44 +05:30
VolkovGeoPhy
86febae17c
GX up to 0.16 (#10746) 2023-03-28 16:09:46 +02:00
Nahuel
ef759c7e88
Fixes#8038: Change how status is handled after running workflow (#10710)
* Change how status is handled after running workflow

* Reset changes in config files

* Add auxiliary Summary class

* Improve failures handling

* Pylint error

* Pylint error

* Show result in table

* Add test

* Fix setup.py

* Add comments
2023-03-24 17:59:06 +01:00
Nahuel
07d6028149
Fix: remove avro-python3 deprecated dependency (#10602) 2023-03-15 14:15:57 +00:00
Onkar Ravgan
93e554ae67
Fixed Redash Source Issues (#10570)
* Improved redash source

* Added docs

* Addressed review comments
2023-03-14 23:00:49 +05:30
Nahuel
ed884cf79a
Bug: Update sqllineage-openmetadata + add timeout for parsing queries (#10474)
* Update sqllineage-openmetadata version + add timeout

* Pyimpala fix colnames, comments and dialect sql compilation (#10470)

* Fix col names and comments for impala hive

* Fix cols, comments and impala sql compilation

* Handle hive types

* Format

* Added doc in avro array and tests (#10473)

* Fixed: Add job definition id field for dbt cloud in UI #10269 (#10472)

* fixed Add job definition id field for dbt cloud in UI #10269

* sync-localization file

* fixed failing unit test and add unit test for the changes

* Address PR comments

* Update tests

* Pylint clean

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2023-03-08 20:49:02 +01:00
NiharDoshi99
d41878ec90
fix: spacy model (#10467) 2023-03-08 15:19:35 +05:30
Onkar Ravgan
d4fafa3168
Downgraded conf-kafka lib (#10466) 2023-03-08 14:22:38 +05:30
Sriharsha Chintalapani
fe73948b55
Fix #10429: Kafka Sample data improvements and adding support for JSONSchema and Protobuf (#10430)
* Fix #10429: Kafka Sample data improvements and adding support for JSONSchema and Protobuf

* Added top level parsing and unit tests

* fix(ui): show schemaText and fields both

* fix no data placeholder for fields & schema text

* addressing comments

* fixed py checkstyle

---------

Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
2023-03-07 15:40:04 +01:00
NiharDoshi99
1ff76f5e65
pii tagging using spacy (#10256)
* WIP: pii tagging using spacy

* added test cases and changes as per comment

* fix python checkstyle

* fix python checkstyle

* added score, test_cases and docs update

* solved merge conflict

* fix python checkstyle

* remove pii tagging using regex

* fix python test

* lib changes and added some test case

* changed as per comment

* fix: python test

* fix: changes to get source_config

* fix: changes as per comment
2023-03-03 18:33:18 +05:30
Teddy
754074f1be
Fixes #7758 - Added Column value and Integer Range Partitioning (#10350)
* feat(profiler): renamed  module to

* feat(profiler): added dbt-artifacts-parser to test setup.py

* feat(profiler): refactor workflow and interface

* feat(profiler): linting

* feat(profiler): removed old profiler modules

* feat(profiler): added support for value and integer range partition

* feat(profiler): fixed linting

* feat(profiler): added partitioning support for datalake profiler

* feat(profiler): removed `ProfilerInterfaceArgs` class

* feat(profiler): address comments

* feat(profiler): Added `OTHER` as an `IntervalType` for UI type generation
2023-03-01 08:20:38 +01:00
Mayur Singal
cd4461397d
Add impyla as scheme for hive connector (#10270) 2023-02-22 16:54:56 +05:30
Teddy
83be5d933b
Fixes #9301 - Refactor TestSuite and Remove Pandas from Base Requirements (#10244)
* feat(testSuite): extracted out column test for SQA type

* refactor(testSuite): extracted SQA column and table tests into their own classes

* refactor(testSuite): Added pkgutil namespace package style for test suite classes

* refactor(testSuite): added dynamic importer function for test cases

* refactor(testSuite): black formatting

* refactor(testSuite): fixed linting issues

* refactor(testSuite): refactor metrics for dataframe

* refactor(testSuite): Added Mixins and base methods

* refactor(testSuite): extracted out get bound for floats

* refactor(testSuite): Added pandas column test cases

* refactor(testSuite): Deleted old column tests

* refactor(testSuite): Added table tests for datalake

* refactor(testSuite): Removed old tests definition

* refactor(testSuite): changed registry to dynamic class import

* refactor(testSuite): renamed dl_fn to df_fn

* refactor(testSuite): updated registry unit test

* refactor(testSuite): updated import path to sqa like column

* refactor(testSuite): cleaned up imports in old files

* refactor(testSuite): harmonized SQALikeColumn object to replicate SQA Column object

* refactor(testSuite): linting

* refactor(testSuite): linting

* refactor(testSuite): raise exception on DQ exception

* refactor(testSuite): linting

* refactor(testSuite): removed pandas from base requirements

* refactor(testSuite): Added __future__ for py3.7 type hint

* refactor(testSuite): added `df` to good-names

* refactor(testSuite): renamed Handler to Validator

* refactor(testSuite): Added test inheritance for column tests

* refactor(testSuite): cleaned up column type check

* refactor(testSuite): cleaned up typo

* refactor(testSuite): extracted main table test logic into parent class

* refactor(testSuite): linting

* refactor(testSuite): linting fixes

* refactor(testSuite): address doc string and linting issues
2023-02-22 09:42:34 +01:00
VolkovGeoPhy
7a59bc7676
>= grpc-tools 1.47.2 (Done) (#10218) 2023-02-20 18:07:27 +05:30
Nahuel
b9a3c06104
Bump main branch to version 1.0.0 (#10040)
* Bump to version 0.13.2

* Bump mvn projects to 1.0.0-SNAPSHOT

* Bump python projects to 1.0.0.dev0

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-02-02 12:56:14 +01:00
Pere Miquel Brull
f0f3f0be6a
Add looker unit tests (#9691)
* Add looker tests

* Empty-Commit

* Install GE for tests

* Fix usage details python name

* Add missing test requirement
2023-02-01 09:20:26 +00:00
Ayush Shah
747fcf569b
Add docs - quicksight, lineage... (#10023) 2023-01-31 15:17:40 +00:00
Onkar Ravgan
949989fb1c
Added dbt parser (#9982)
* Added dbt parser

* Added library dependency

* format and final fixes

* Addressed review comments

* Fixed typo
2023-01-29 20:47:39 +01:00
Pere Miquel Brull
f6d59f599e
Pin SQLAlchemy lower than 2 (#9952) 2023-01-27 15:26:30 +01:00
Nahuel
254ee9a186
Fix#9460: Avoid reuse inspector to get view definition (#9821)
* Avoid reuse inspector to get view definition

* Update openmetadata-sqllineage version
2023-01-20 13:54:41 +00:00
Nahuel
ddff6e2875
Fix: Replace sqllineage with openmetadata-sqllineage (#9800)
* Replace sqllineage with openmetadata-sqllineage

* Fix checkstyle and failing test

* Move logic to retrieve dialect of a service type into a class

* Improve py-check message when it fails

* Updated mapper

* Update code after merge
2023-01-19 14:56:29 +01:00
Pere Miquel Brull
294277708b
Fix #9558 - Add a greater range for boto3 dependency (#9778)
* add boto3 wiggle room

* add boto3 wiggle room
2023-01-18 08:20:40 +01:00
Sriharsha Chintalapani
2a314809c1
Keep elasticsearch version to be 7.13.1 (#9756)
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2023-01-17 19:12:49 -08:00
NiharDoshi99
029dbe892e
Fix: added test case for atlas (#9678)
* Fix: added test case for atlas

* Fix: resolved conflict

* Fix: changing back neo4j to old version

* Fix: changing back neo4j to old version

* Fix: changes as per comment

* Fix: changes as per comment

* Fix: python checkstyle
2023-01-13 16:07:29 +05:30
NiharDoshi99
1ec324e43e
Fix: neo4j version bump (#9680) 2023-01-11 18:28:25 +05:30
Pere Miquel Brull
bf753a4dee
Fix #7768 - Update and organize versions (#9664)
2023-01-11 07:05:12 +01:00
Pere Miquel Brull
84348d4748
Fix #8866 - bump datamodel-codegen (#9623)
* Fix #8866 - bump datamodel-codegen

* Update connection options and arguments structure

* Add builders test

* Format

* Allow Any values in componentConfig

Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2023-01-09 13:20:32 +01:00
Ayush Shah
1d930ad14b
Fix security vulnerability (#9580) 2023-01-05 12:36:00 +05:30
Pere Miquel Brull
7f21a7bced
Fix #8088 - Restructure source connections & clients (#9545) 2023-01-02 13:52:27 +01:00
Chirag Madlani
bf6fc5f93a
prepare(release) next release (#9479)
* prepare(release) next release

* airflow typo

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2022-12-27 20:15:46 +05:30
Nahuel
2c43ebba6f
Fix#9448: Add ES volumes (#9506)
* Add ES volumes

* Fix run_local_docker script

* Fix error run_local_docker script

* Update Es volumes in docker-compose files
2022-12-23 17:33:30 +01:00
Ayush Shah
2bf5eb9051
fix 7995: profileSample % and row number (#9104) 2022-12-20 14:55:11 +05:30
Nahuel
a2b34dd0f4
Fix: Update Ingestion docker images and fix python libraries dependencies (#9342)
* Update Ingestion docker images and fix python libraries dependencies

* Install also apache-airflow-providers-http
2022-12-16 14:46:25 +00:00
Nahuel
819001182f
Fix#9251: DB2 connection config and ingestion update (#9322)
* DB2 connection config and ingestion update

* Update ingestion/src/metadata/ingestion/source/database/common_db_source.py

Co-authored-by: Ayush Shah <ayush@getcollate.io>

* Update ingestion/src/metadata/ingestion/source/database/common_db_source.py

Co-authored-by: Ayush Shah <ayush@getcollate.io>

* Update bootstrap/sql/com.mysql.cj.jdbc.Driver/v007__create_db_connection_info.sql

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2022-12-16 07:43:18 +01:00