55 Commits

Author SHA1 Message Date
Akash Jain
96c65e7ebd
fix: remove spaCy dependencies from setup.py (#1362)
* remove spaCy dependencies from setup.py

* Spacy, PII and Processor dependencies removed

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2021-11-24 12:23:17 -08:00
Ayush Shah
eb34d04285 Setup.py modified to support 3.10 2021-11-23 21:50:44 -08:00
Alberto Miorin
00fd7f4175 Fix #1291: Python Inspect Code errors 2021-11-21 15:51:51 +01:00
Sriharsha Chintalapani
e4fa0247f5
Fix #1280: Amundsen connector to import metadata (#1281)
* Fix #1280: Amundsen connector to import metadata
2021-11-20 14:08:27 -08:00
Ayush Shah
219246b78e
Glue pagination added (#1282) 2021-11-20 12:46:18 -08:00
Sriharsha Chintalapani
f3054658f5
Fix #968: Add DBT Connector (#1200) 2021-11-16 01:02:45 -08:00
Ayush Shah
5dc3bb9297
Docker Support from Python added (#1158)
* Script modified - supports running from different locations

* Docker support from Python CLI

* Docker plugin setup.py

* Paths for latest and local dockers updated

* Resolved Comments - Docker CLI optimized, timestamp added

* help attribute added to options

* Docker clean code refactored
2021-11-12 10:30:28 -08:00
Ayush Shah
9839191242
Issue 483 - Glue Implementation (#1124)
* Glue Tables and Pipeline workflows implemented

* Glue Config Added

* Relative imports changed to absolute

* Resolving Comments - Changed Imports, serviceTypes

* Type fixed in setup.py
2021-11-10 07:28:13 -08:00
Tom Vijlbrief
e6f6d9c2bb
Map SQL columns with unknown type to VARCHAR (#980)
Change pydantic version requirement
2021-10-30 21:07:39 -07:00
Ayush Shah
93921814af
Docker fix - latest release changes (#983)
* Docker fix - Architecture, MySQL

* Docker Airflow API Dagrun support

* Docker latest changes modified
2021-10-30 09:05:30 -07:00
Ayush Shah
759574a8de
Ingestion Optimization - Sample Users, Dockerfiles, Removal of Pandas (#935)
* Sample Profile Data for Sample Tables (#815)

* Sample Profile Data for Sample Tables

* Disabling Profile as Default

* Added Sample Profile Data to 3 additional sample tables

* Sample Tables fixed (#850)

* Pydantic fix, Docker update (#860)

* Setup.py Modified with openmetadata-airflow package, docker update

* Setup.py Modified

* Update setup.py

* Removed Pandas from Sample Data

* Sample Users added under sample data

* Sample User Standalone pipelines and modules removed

* Docker release package updated

* Dockerfile updated, removed redundant files

* Setup.py removed from ingestion src directory

* User Resource failing check resolved

* Modifying Usage Columns Datatype
2021-10-26 09:14:24 -07:00
Ayush Shah
0eb3e7b964
Changing Pydantic from 1.8.2 to 1.7.4 resolves the conflict (#788) 2021-10-16 07:58:55 -07:00
Ayush Shah
85b6b72848
Airflow docker (#762)
* Airflow Docker implementation - Ingestion

* Dockerfiles modified
2021-10-14 07:46:24 -07:00
Sriharsha Chintalapani
c28665bca7
Sample lineage (#735)
* Fix #727: Add sample lineage data and ingestion support
2021-10-11 20:12:40 -07:00
Ayush Shah
1650a4ef4a
Added support for struct in bigquery, pyhive and bigquery pypi packag… (#717)
* Added support for struct in bigquery, pyhive and bigquery pypi packages modified

* Versions added, Naming changed, Newlines removed
2021-10-09 07:15:41 -07:00
James
d455409cc9
issue-696: Added trino support for Openmetadata (#697)
* issue-696: Added trino support for Openmetadata

* issue-696: fixed linting issues

* issue-696: not mentioning Trino for now as it will be part of 0.5 release

Co-authored-by: jbuoncri <jbuoncri@cisco.com>
2021-10-07 11:15:34 -07:00
Sriharsha Chintalapani
e3cfb4dc65
For hive's complex data types parse raw type (#560)
* For hive's complex data types parse raw type

* Complex Data type logic modification

* Complex Data Type parsing implemented

* Raw Data type helper modification

* handling unnamed/anonymous struct

* Complex Nested structure implementation

* print statements removed and reverted to raw_data_type

* Complex Structure Array & MAP logic implemented

* Raw Data Type Logic revamped

* Redshift Integration

* MAP and UnionType support added

* Redshift Pypi package updated

* dataLength validationError fix

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2021-10-04 23:36:35 +05:30
Sriharsha Chintalapani
bfec0bfbed
Ingestion: Airflow integration to ingest metadata about pipelines and tasks (#609)
* [WIP] Airflow integration

* [WIP] Airflow integration

* [WIP] airflow integration

* [WIP] Airflow

* [WIP] Airflow

* Fix #608: Ingestion: Airflow integration to ingest metadata about pipelines and tasks

* Fix #608: Ingestion: Airflow integration to ingest metadata about pipelines and tasks

* Update DashboardServiceResource.java

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2021-09-29 11:32:09 -07:00
Sriharsha Chintalapani
eb2717b0e3
Fix #587: Ingestion: Add standalone report process to generate datasets, usage & profile and serve from standalone server (#588)
* Fix #587: Ingestion: Add standalone report process to generate datasets, usage & profile and serve from standalone server

* add localhost
2021-09-27 08:43:38 -07:00
Ayush Shah
627481f181
Status record Json encoding bug fixed and pandas not found fixed (#584) 2021-09-25 13:54:04 -07:00
Sriharsha Chintalapani
745ae0c253
Fix #577: Users API should support put op (#578)
* Fix #577: Users API should support put op
2021-09-24 17:55:26 -07:00
parthp2107
06810cdec1
Fix #432: Added Redash Connector (#444)
* added redash connector

* added redash connector

* Added Redash Connector

* minor changes

Co-authored-by: parthp2107 <parth.panchal@deuexsoultions.com>
Co-authored-by: parthp2107 <parth@getcollate.io>
2021-09-22 15:09:24 +05:30
Ayush Shah
7652baa00d
Setup.py Refactored, ES port fix (#521)
* Pylint build failure fixed

* Setup & dependency modified, Data profiler default to False, ES port fix

* Profiler requirements refactored

* Setup.py requirement fix

* openmetadata-ingestion version upgrade
2021-09-19 13:59:14 +05:30
Sriharsha Chintalapani
4c6c8fd446
Fix #515: Ingestion: Add ES configuration to allow port (#516) 2021-09-17 08:57:41 -07:00
Sriharsha Chintalapani
8c103bd2ad
Profiler (#496)
* profiler code

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2021-09-15 15:49:26 -07:00
Sriharsha Chintalapani
b7adb5dc6b
Fix #469: Add Vertica Connector (#470) 2021-09-12 21:59:31 -07:00
Ayush Shah
d2df40cf2b
Fix #355: Tableau Implemented (#468)
* Fix #355: Tableau Implemented

* Tableau pipeline location modification
2021-09-11 11:46:10 -07:00
Sriharsha Chintalapani
1c80dc246e
Fix #456: Make PII-Processor optional and independent install (#457) 2021-09-10 10:41:19 +05:30
Sriharsha Chintalapani
2369ddc858 [WIP] Fix #446: Add DataProfiler to ingestion and APIs 2021-09-08 23:55:48 -07:00
Suresh Srinivas
328658ebea [WIP] profiler 2021-09-07 22:03:57 -07:00
Ayush Shah
657962bc4f
MSSQL sample-data query fix (#375)
* MSSQL sample-data query fix

* Query Format as per Database implemented
2021-09-06 21:03:04 -07:00
Sriharsha Chintalapani
d0dbcc19b7
Fix #401: Merge sample data generation into single connector (#402)
* Fix #401: Merge sample data generation into single connector

* Path for datasets modified

Co-authored-by: Ayush Shah <ayush@getcollate.io>
2021-09-05 22:35:02 +05:30
Ayush Shah
c9ada4ca1a
Looker Dashboard Connector Added (#351)
* Looker Dashboard Connector Added

* Dashboard yield fixed

* Looker Connector Method modifications
2021-09-02 20:32:03 -07:00
Suresh Srinivas
6a28ae988f [WIP] Issue #285: Add support for Dashboard Entities; Superset connector 2021-08-24 13:47:41 -07:00
Suresh Srinivas
994b49d055 Fix #281: Ingestion: Add a sample topics connector 2021-08-23 14:59:39 -07:00
Suresh Srinivas
19151dcac7 Ingestion: Add Kafka Connector 2021-08-21 17:52:24 -07:00
Suresh Srinivas
4f6cc54465 Ingestion: Add Confluent Kafka topic and schema connector 2021-08-21 13:16:51 -07:00
Suresh Srinivas
dc7e05dd74 Ingestion: Add Confluent Kafka topic and schema connector 2021-08-21 13:16:40 -07:00
Ayush Shah
19f7904258 Pip install git path modified pylint - simplescheduler 2021-08-18 01:01:20 +05:30
Suresh Srinivas
2d70350742
Update python json schema classes; Add presto connector (#213)
* Update python json schema classes; Add presto connector

* Modification in Postgres, pylint and presto support

Co-authored-by: Suresh Srinivas <srini3005@gmail.com>
Co-authored-by: Ayush Shah <ayush02shah12@gmail.com>
2021-08-17 21:15:46 +05:30
Suresh Srinivas
d905bd04c3 Ingestion: add a simple scheduler from open-metadata 2021-08-15 18:01:54 -07:00
Ayush Shah
37e59518a5 Scheduler Connector 2021-08-15 01:32:47 +05:30
Ayush Shah
79d5d76418 setup.py dependency issue resolved 2021-08-14 00:58:21 +05:30
Ayush Shah
2b26274804 Removed registry files and modified workflow and setup.py 2021-08-14 00:54:16 +05:30
Ayush Shah
03ad583744 Registry dependency removed 2021-08-14 00:09:51 +05:30
Suresh Srinivas
a2403b4570 Ingestion: add sample usage connector 2021-08-13 00:33:48 -07:00
Ayush Shah
d9cd3e4856 Connector Dependency Cleanup and Docker Modification 2021-08-13 02:29:36 +05:30
Ayush Shah
a3f4398284 README and Setup.py Modification 2021-08-13 02:03:39 +05:30
Ayush Shah
72a355bd5d Ingestion Setup.py modification 2021-08-13 01:40:56 +05:30
Ayush Shah
754813d220 connection string abstract method 2021-08-11 23:49:00 +05:30