106 Commits

Author SHA1 Message Date
Keshav Mohta
b7a7023890
Fix #20665: BigQuery - Adding billing project (#21231) 2025-06-09 13:09:40 +05:30
Keshav Mohta
0796c6274b
Fixes: Databricks httpPath Required (#20611)
* fix: made databricks httpPath required and added a migration file for the same

* fix: added sql migration in postDataMigration file and fix databricks tests

* fix: added httpPath in test_source_connection.py and test_source_parsing.py files

* fix: added httpPath in test_databricks_lineage.py

* fix: table name in postgres migration
2025-04-07 13:33:55 +05:30
Mayur Singal
7760663b22
MINOR: Change ingestion licence header (#20549) 2025-04-03 10:39:47 +05:30
Mohit Tilala
06ab82170b
Fixes #19534: Snowflake stream ingestion support (#20278) 2025-04-01 13:02:37 +05:30
harshsoni2024
b1d481f2f1
issue-16744: salesforce column description with toggle api (#19527) 2025-01-27 16:54:35 +05:30
Keshav Mohta
7bea4f957f
Feature: Docker Host Retry (#19127) 2025-01-14 19:48:10 +05:30
agriev
dcebc41e3f
Adds percona server for postgresql support (#19322)
* percona server for postgresql support

The only meaningful difference is the version string in Percona Server for PostgreSQL, so this commit proposes a universal and safe way to detect the server version from the integer version string rather than by complicated parsing of an unformatted string.

* updated tests with get_server_version_num

commented outdated tests
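
For illustration only: a minimal Python sketch of the integer-based detection described above, assuming the value is read from PostgreSQL's `server_version_num` setting (for example via `SHOW server_version_num`). The function name `parse_server_version_num` is hypothetical and is not taken from this commit.

```python
from typing import Tuple


def parse_server_version_num(raw: str) -> Tuple[int, ...]:
    """Split PostgreSQL's integer version into its components.

    Hypothetical helper, not the code from this commit.
    PostgreSQL 10+ encodes the version as major * 10000 + minor.
    Older releases encode it as major * 10000 + minor * 100 + patch.
    """
    num = int(raw.strip())
    major = num // 10000
    if major >= 10:
        # Two-part scheme used since PostgreSQL 10, e.g. 160004 -> 16.4
        return (major, num % 10000)
    # Three-part scheme used before PostgreSQL 10, e.g. 90624 -> 9.6.24
    return (major, (num // 100) % 100, num % 100)


print(parse_server_version_num("160004"))
print(parse_server_version_num("90624"))
```

Because the integer carries no vendor suffix, the same parsing should apply to builds whose human-readable version string contains extra text.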

---------

Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
2025-01-13 17:51:40 -08:00
Akash Verma
b2898f7007
Cockroach enhancement (#19108) 2025-01-07 18:51:59 +05:30
Akash Verma
39dcb5baef
Feature : Cockroach db connector (#18961) 2025-01-02 13:07:55 +05:30
Akash Verma
69557e8716
fixes: #15742 Oracle stored package feature (#18852) 2024-12-16 19:35:20 +05:30
Keshav Mohta
cde3a7dd1e
Feature: Cassandra Connector (#18943) 2024-12-12 15:12:55 +05:30
Imri Paran
ee7d043035
[GEN-2109] feat(mongo): added ssl support (#18731)
* feat(mongo): added ssl support

Added SSL support for MongoDB using the SSL manager.

Attached a video demo.

- [Example repository for setting up mongodb with SSL](https://github.com/sushi30/mongodb-docker-ssl-example)
- [MongoDB TLS documentation](https://www.mongodb.com/docs/manual/tutorial/configure-ssl/)

* fixed test_doris.py
2024-11-22 08:54:13 -08:00
IceS2
dccba20101
Return s3 endpoint as str() instead of Url (#18521) 2024-11-05 17:39:50 +00:00
Katarzyna Kałek
47c75fe6a7
Enhanced Glue ingestion with external table features (#18511)
* added fileFormat, locationPath and external table lineage to Glue ingestion

* Improve Lineage Label

---------

Co-authored-by: Katarzyna Kałek <kkalek@olx.pl>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
2024-11-05 21:48:20 +05:30
Ethan
49fceb4674
Fixes #18104 : change parse_obj and assertEquals which was deprecated (#18105)
* change deprecationwarning

* fix format python

* fix replace module

* change : java function name
2024-10-07 09:02:41 +02:00
Pere Miquel Brull
bcb29b46da
MINOR - Implement SAP Hana Lineage (#17615)
* MINOR - SAP Hana Lineage

* skeleton

* parser

* lineage

* manage formulas

* add cvs

* add cvs

* better typing

* enum

* handle cvs

* saphana docs
2024-08-30 07:42:43 +02:00
Do Manh Ha
a868596db7
Fixes #17461: Unquote and interpret escaped characters in BigQuery dataset description (#17462)
* fix(bigquery): unquote and convert any escaped characters to their actual representations

* test: bigquery description with multiple lines

---------

Co-authored-by: Imri Paran <imri.paran@gmail.com>
2024-08-20 17:56:19 +02:00
Ayush Shah
af14267e09
Fixes #17319: ArrayDataType issue resolved, Fix Queries + Add DB Name to the queries (#17379)
* fixes arrayDataType must be not null, adding db name to queries as it fails

* Fix Pydantic Issue

* Partial: Add Unity Catalog Topology Test

* Fix lint

* Fix Tests, Fix UnityCatalog Array Column issue

* Fix Tests

* Address comments, add logger to the exception
2024-08-12 09:59:03 +02:00
Onkar Ravgan
fe7922c13c
MINOR: [SAP ERP Connector] Added column dtype displaynames and precision/scale values (#17240) 2024-08-01 12:49:34 +05:30
Sriharsha Chintalapani
fe107aa3cb
Issue #17012: Multi User/Team Ownership (#17013)
* Add multiple owners

* Multi Ownership

* Issue #17012: Multi User/Team Ownership

* Issue #17012: Multi User/Team Ownership

* Issue #17012: Multi User/Team Ownership - Fix Tests - Part 1

* Issue #17012: Multi User/Team Ownership - Fix Tests - Part 2

* Issue #17012: Multi User/Team Ownership - Fix Tests - Part 3

* Issue #17012: Multi User/Team Ownership - Fix Tests - Part 4

* Issue #17012: Multi User/Team Ownership - Fix Tests - Part 5

* Issue #17012: Multi User/Team Ownership - Fix Tests - Part 6

* Issue #17012: Multi User/Team Ownership - Fix Tests - Part 7

* Issue #17012: Multi User/Team Ownership - Fix Tests - Part 8

* Add Migrations for Owner Thread

* update ingestion for multi owner

* fix pytests

* fixed checkstyle

* Add Alert Name to Publishers (#17108)

* Add Alert Name to Publishers

* Fix Test

* Add Bound to Setuptools (#17105)

* Minor: fixed testSummaryGraph issue (#17115)

* feat: updated multi pipeline ui as per new mock (#17106)

* feat: updated multi pipeline ui as per new mock

* translation sync

* fixed failing unit test

* fixed playwright test

* fixed viewService click issue

* sorted pipeline based on test case length

* Added domo federated dataset support (#17061)

* fix usernames (#17122)

* Doc: Updated Doris & Redshift Docs (#17123)

Co-authored-by: Prajwal Pandit <prajwalpandit@Prajwals-MacBook-Air.local>

* Fix #12677: Added Synapse Connector - docs and side docs (#17041)

* Fix #17098: Fixed case sensitive partition column name in Bigquery (#17104)

* Fixed case sensitive partition col name bigquery

* update test

* #13876: change placement of comment and close button in task approval workflow (#17044)

* change placement of comment and close button in task approval workflow

* minor change

* playwright test for the close and comment function

* supported ref in activityFeedEditor

* fix playwright test

* added playwright test for data steward

* fix the test for the data steward user

* fix the close button not showing if task has no suggestions and icon fixes

* fix sonar issue

* change glossary and add suggestion button to dropdown button

* fix the glossary failure due to button change

* icon change for add tag and description

* fix glossary cypress failure due to button changes

* changes as per comments

* MINOR: docs links fix (#17125)

* alation link fix

* dbt yaml config source link fix

* bigquery doc fix

* Explore tree feedbacks (#17078)

* fix explore design

* update switcher icon

* show menu when search query exists

* fix selection of active service

* fix type error

* fix tests

* fix tests

* fix tests

* MINOR: Databricks view TableType fix (#17124)

* Minor: fixed AUT test (#17128)

* Fix #16692: Override Lineage Support for View & Dashboard Lineage (#17064)

* #17065: fix the tags not rendering in selector after selection in edit tags task (#17107)

* fix the tags not rendering in selector after selection in edit tags task

* added playwright test

* minor changes

* minor fix

* fix the tags not updating in edit and accept tag

* fix explore type changes for collate (#17131)

* MINOR: changed log level to debug (#17126)

* changed log level to debug

* fixed type

* changed type to optional

* Get feed and count data of soft deleted user (#17135)

* Doc: Adding OIDC Docs (#17139)

Co-authored-by: Prajwal Pandit <prajwalpandit@Prajwals-MacBook-Air.local>

* Doc: Updating Profiler Workflow Docs URL (#17140)

Co-authored-by: Prajwal Pandit <prajwalpandit@Prajwals-MacBook-Air.local>

* fix playwright and cypress (#17138)

* Minor: fixed edit modal issue for sql test case (#17132)

* Minor: fixed edit modal issue for sql test case

* fixed test

* Minor: Added whats new content for 1.4.6 release (#17148)

* MINOR [GEN-799]: add option to disable manual trigger using scheduleType (#17031)

* fix: raise for triggering system app

* added scheduleType ScheduledOrManual

* minor: remove "service" field from required properties in createAPIEndpoint schema (#17147)

* initial commit multi ownership

* update glossary and other entities

* update owners

* fix version pages

* fix tests

* Update entity_extension to move owner to array (#17200)

* fix tests

* fix api page errors

* fix owner label design

* locales

* fix owners in elastic search source

* fix types

* fix tests

* fix tests

* Updated CustomMetric owner to entityReferenceList. (#17211)

* Fix owners field in search mappings

* fix search aggregates

* fix inherited label

* Issue #17012: Multi User/Team Ownership - Fix Tests - Part 9

* Fix Queries

* Fix Mysql Queries

* Typo

* fix tests

* fix tests

* fix tests

* fix advanced search constants

* fix service ingestion tests

* fix tests

---------

Co-authored-by: mohitdeuex <mohit.y@deuexsolutions.com>
Co-authored-by: Onkar Ravgan <onkar.10r@gmail.com>
Co-authored-by: Mohit Yadav <105265192+mohityadav766@users.noreply.github.com>
Co-authored-by: Ayush Shah <ayush@getcollate.io>
Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
Co-authored-by: k.nakagaki <141020064+nakaken-churadata@users.noreply.github.com>
Co-authored-by: Prajwal214 <167504578+Prajwal214@users.noreply.github.com>
Co-authored-by: Prajwal Pandit <prajwalpandit@Prajwals-MacBook-Air.local>
Co-authored-by: Suman Maharana <sumanmaharana786@gmail.com>
Co-authored-by: Ashish Gupta <ashish@getcollate.io>
Co-authored-by: harshsoni2024 <64592571+harshsoni2024@users.noreply.github.com>
Co-authored-by: Karan Hotchandani <33024356+karanh37@users.noreply.github.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>
Co-authored-by: Imri Paran <imri.paran@gmail.com>
Co-authored-by: sonika-shah <58761340+sonika-shah@users.noreply.github.com>
Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
Co-authored-by: karanh37 <karanh37@gmail.com>
Co-authored-by: Siddhant <86899184+Siddhanttimeline@users.noreply.github.com>
2024-07-30 08:06:39 +02:00
Onkar Ravgan
b6745d7cf1
MINOR: Implemented SAPERP Connector feedback (#17137)
* Implemented saperp feedback

* Fixed pytest
2024-07-29 10:31:20 +05:30
Antoine Balliet
e67ba6b14c
feat: BigQuery ingestion allow specify projectId and fix primary key constraints retrieval (#16956) 2024-07-17 11:21:17 +05:30
harshsoni2024
52dc5b551e
MINOR: fix column name in databricks ingestion fail (#17019)
* add backticks in column_name

* add test for column name with .
2024-07-17 10:25:12 +05:30
Onkar Ravgan
80efc7075f
Fix #15163: Added SAP ERP Connector 2024-07-04 10:57:46 +05:30
harshsoni2024
3f5bc1948d
Fix #14676: Athena S3 Lineage (#16426)
* get table ddl for athena tables

* changes in method to get all table ddls

* external table/container lineage for athena

* column lineage for external table lineage

* unittest for athena

* pyformat changes

* add external table lineage unit test

* fix unittest with pydantic v2 changes

* fix unittest formatting

* fix code smell
2024-06-26 19:53:36 +05:30
Ayush Shah
c9a017d8db
#16720: Add Support for Salesforce SSL (#16719) 2024-06-20 12:10:41 +05:30
IceS2
f0049853ec
FIXES 14885: Initial deltalake implementation for s3 (#16665)
* Initial deltalake implementation for s3

* Fix styles

* Fix test_amundsen

* Fix UnitTests

* Fix Checkstyle

* Fix integration tests due to datalake client refactor

* Fix unit tests

* Fix tests

* Fix Integration DeltaLake Storage test

* Skip delta storage integration test for python 3.8

* DeltaLake JSONSchema changes migrations

* Update import name

* Add some comments based on sonarcloud suggestions

* Update DeltaLake documentation

* Resolve some comments
2024-06-20 12:08:21 +05:30
Pere Miquel Brull
cb72a22b59
Fix - e2e tests for pydantic V2 (#16551)
* Fix - e2e tests for pydantic V2

* add correct default

* add correct default

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* revert datetime aware

* fix apis

* format
2024-06-06 19:36:17 -07:00
Pere Miquel Brull
d8e2187980
#15243 - Pydantic V2 & Airflow 2.9 (#16480)
* pydantic v2

* pydanticv2

* fix parser

* fix annotated

* fix model dumping

* mysql ingestion

* clean root models

* clean root models

* bump airflow

* bump airflow

* bump airflow

* optionals

* optionals

* optionals

* jdk

* airflow migrate

* fab provider

* fab provider

* fab provider

* some more fixes

* fixing tests and imports

* model_dump and model_validate

* model_dump and model_validate

* model_dump and model_validate

* union

* pylint

* pylint

* integration tests

* fix CostAnalysisReportData

* integration tests

* tests

* missing defaults

* missing defaults
2024-06-05 21:18:37 +02:00
Suman Maharana
488078da8a
Add DDL query ingest (#15860) 2024-05-06 18:03:50 +05:30
Ayush Shah
d5b1465406
Fixes #14113 - Allow SSL file uploads (#15828) 2024-04-19 11:38:27 +05:30
Mayur Singal
6b90c245d4
MINOR: Add support for json schema parsing for datalake & s3 (#15615) 2024-03-26 10:03:21 +05:30
IceS2
e7c9d6aa7f
FIXES 15215: Implement initial Multithreading approach for the Metadata Ingestion on Databases (#15130)
* Implement Initial MultiThread suggestion

* Update all the ingestion sources to use the new ContextManager

* Fix missing wraps on decorator

* Fix Unittests

* Fix linters

* Fix linters

* Fix BigQuery UnitTests

* Add UnitTests to the newly created code

* Fix unittest

* change the threads from table to schemas

* Update README.md

* Small change suggested by Sonar

* Slight change to test a different way to multithread over tables

* Debug changes

* More multithread tests

* Remove unneeded wait time

* Testing

* refactor code based on removal of time.sleep

* Fix wrong paste

* Improve ExecutionTimeContextManager

* Fix missing .get() and unit tests

* Fix conflicting changes

* Update Multithread logic with the incremental extraction

* Fix linters

* Fix unittest

* Remove commented code

* Fix Unittests

* Fix checkstyle

* Change default to threads = 1
2024-03-25 18:20:40 +01:00
Ayush Shah
1bb7d893ac
Fix 15419: Improve fetching Oracle Queries for SP (#15621) 2024-03-20 15:58:06 +05:30
IceS2
51e3d7a466
FIXES 15215: First draft implementation on extracting metadata incrementally. Done for Snowflake, BigQuery and Redshift (#15201)
* Initial incremental implementation for snowflake

* Initial unit test refactor for snowflake

* Fix linter complaints

* Propagate change on abstract create method

* Add missing argument to create

* Polish Snowflake incremental extraction

* Fix linters and make enabled required

* Initial proposal for incremental bigquery extraction

* BigQuery incremental tests

* Remove debugging override

* Fix linters

* Remove unused query

* Initial Redshift Incremental Extraction

* Add Incremental Extraction documentation

* Move the default to False

* Improve code based on sonarcloud input

* Apply suggestions

* Fix wrong path

* Change timestamp to be time aware as per sonar

* Move documentation to 1.4

* Move documentation to 1.4

* Fix linters

---------

Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
2024-03-15 14:00:49 +01:00
Teddy
056e6368d0
Issue #14765 - Preparatory Work (#15312)
* refactor!: change partition metadata structure for table entities

* refactor!: updated json schema for TypeScript code gen

* chore: migration of partition for table entities

* style: python & java linting

* updated ui side change for table partitioned key

* minor fix

* addressing comments

* fixed ci error

---------

Co-authored-by: Shailesh Parmar <shailesh.parmar.webdev@gmail.com>
2024-02-28 07:11:00 +01:00
Imri Paran
aeb5fbe303
fixes #12591: add BigTable (#15122)
* feat(connector): add BigTable

* bigtable work

1. docstrings
2. tests
3. created a Row BaseModel
4. implemented a ClassConverter

* docs moved to separate PR

* format files

* minor cosmetic

- removed TODO
- changed headers' year to 2024 for new files
- fixed typos

* format

* formatting and comments

1. added missing docstrings.
2. abstracted the _find_instance method.
3. aliased the IDs used in the BigTable connection

* added comment regarding private key

* added comments regarding column families

* enclose get_schema_name_list in `try/except/else`

* format

* streamlined get_schema_name_list to include all logic in the try block
2024-02-13 08:28:01 +01:00
NiharDoshi99
2b56e34b19
#14930 bigquery support for pk, fk and column view description (#15042) 2024-02-07 16:49:27 +05:30
Teddy
9a4a9df836
Fix #14895 - Get Metadata from Parquet Schema (#14956)
* linting: fix python linting

* fix: get column types from parquet schema for parquet files

* style: python linting

* fix: remove displayType check in test as variation depending on OS
2024-02-01 09:02:52 +01:00
IceS2
373cafcda2
Fixes #5448: Implement initial Iceberg Connector using PyIceberg (#14825)
* Create the iceberg connection schema

* Link the IcebergConnection configuration with the forms on the UI

* Add the pyiceberg dependency on the ingestion package

* Create the get_connection and test_connection functions

* First iteration on the iceberg ingestion logic

* Add A more comprehensive implementation of the Iceberg Source

* Add UnitTests

* Update icebergConnection definition

* Update the iceberg source code based on new schema

* Updated icebergConnection schema for simplicity and to be able to configure Converters

* Updated setup dependencies to be more flexible

* Updated get_owner_ref logic

* Fix formatting

* Changed the icebergConnection json schema structure to enable the ClassConverters

* Add the IcebergCatalog and IcebergFileSystem ClassConverters

* Refactor the code to take into account the new jsonSchema structure

* Fix formatting

* Add Documentation for the Iceberg Connector

* Fix Menu order for Iceberg

* ui: add Iceberg service icon and constant

* Fix DynamoDb Catalog issue due to how PyIceberg instantiates it

* Changed uri title to URI

* Fix ClassConverter for Iceberg

* Fix GetSecretValue for password types

* Fix formatting

* Fix formatting

* Add Iceberg Connector Images for the docs

* Add pylint disable for Hacky super() call

* Add Iceberg.md for the UI docs

* Fix pylint complaint

* Fix pylint complaint

* Fix UnitTests

* fix type error and unit tests

* update pipeline type checks

* Fix Sonar Cloud complaints

---------

Co-authored-by: Sachin Chaurasiya <sachinchaurasiyachotey87@gmail.com>
2024-01-29 06:32:58 +01:00
NiharDoshi99
c1d62186df
MINOR - metadata tag extraction for Databricks (#14874)
* metadata tag extraction for databricks

* fix python test

* changes as per comment

* fix python test

* fix python checkstyle
2024-01-26 07:09:24 +01:00
Ayush Shah
1552aeb2de
Fix #13149: Multiple Project Id for Datalake GCS (#14846)
* Fix Multiple Project Id for datalake gcs

* Optimize logic

* Fix Tests

* Add Datalake GCS Tests

* Add multiple project id gcs test
2024-01-25 10:52:16 +01:00
Shiyang Xiao
9f5a70bd71
MINOR - update docs & added unit test for SAS Connector (#14743)
Co-authored-by: Shiyang Xiao <Shiyang.Xiao@sas.com>
2024-01-23 14:55:29 -08:00
NiharDoshi99
3f78e072e1
#13429 support for struct data type in hive (#14785) 2024-01-19 18:26:53 +05:30
Onkar Ravgan
f2219a10f3
Fixed oracle tests (#14738) 2024-01-16 17:39:10 +01:00
Onkar Ravgan
64a4e1afce
Fix 12180, 14158: Added LF tags to Athena (#14718)
* Added LF tags to athena

* fixed pytests

* Added docs
2024-01-16 14:24:31 +05:30
NiharDoshi99
54d34934c1
#14630 added oracle stored procedures (#14641) 2024-01-15 18:28:27 +05:30
Pere Miquel Brull
24643a397a
#14492 - Fix Snowflake SP parsing with empty signature (#14623) 2024-01-08 11:16:35 -08:00
Mayur Singal
a789fc86d6
Fix #13053: Remove Connection URI config MongoDB (#14584)
* Fix #13053: Remove Connection URI config MongoDB

* pyformat & test fixes
2024-01-05 10:51:12 -08:00
Pere Miquel Brull
a83a5ba3a3
MINOR - Skip delta tests for 3.11 (#14398)
* MINOR - Bump delta for 3.11

* Update flags

* MINOR - Bump delta for 3.11

* Update tests regex

* Update version

* Deprecations

* Format

* Version

* Try delta spark

* Skip delta tests for 3.11

* Update ingestion/tests/unit/topology/pipeline/test_airflow.py
2023-12-18 17:01:57 +01:00