* feat: add query logger as an event listener in debug mode
* fix: added ingestion.src plugin to pylint
* minor: add partition sampled table
* test: added test for partitioned BQ table
* Remove log_query function from logger.py
* style: ran python linting
* Remove 'ORGANIZATION' PII Tag as it is no longer supported by our PII detectors.
* Update presidio version to fix wrong regex for Indian passport
* Increase sample size of Indian passport numbers
---------
Co-authored-by: Pere Menal <pere.menal@getcollate.io>
* Add PII Tag and Sensitivity Level enums.
* Add feature-extraction for PII classification tasks
* Add faker as test dependency
* Add unit tests for presidio tag extractor
* Add PIISensitivityTags enum and update sensitivity mapping logic
* Add Presidio utility functions for PII analysis
* Extend column name regexes for PII
* Add tests for PAN, NIF, SSN entities
* Fix version of faker to prevent flaky tests. Fix failing tests.
* Add Generated to State enum
* Integrate PIISensitive classifier to PIIProcessor
* Add PII Tag and Sensitivity Level enums.
* Add feature-extraction for PII classification tasks
* Add faker as test dependency
* Add unit tests for presidio tag extractor
* Add PIISensitivityTags enum and update sensitivity mapping logic
* Add Presidio utility functions for PII analysis
* Extend column name regexes for PII
* Add column name split
* Move pii algorithms to dedicated package
* Add tests for PAN, NIF, SSN entities
* Fix linting
* Add comment on why we need to set a specific language for Presidio recognizers, as per PR suggestion.
* Fix version of faker to prevent flaky tests. Fix failing tests.
* Fix wrong import
---------
Co-authored-by: Pere Menal <pere.menal@getcollate.io>
* feat: implemented load test logic
* style: ran python linting
* fix: added locust dependency in test
* fix: skip locust in 3.8 as not supported
* fix: update gcsfs version
* fix: revert gcsfs versioning
* fix: fix gcsfs version to 2023.10
* fix: dagster graphql and gx versions
* fix: dagster version to 1.8 for py3.8 compatibility
* fix: fix clickhouse to 0.2 as 0.3 requires SQA 2+
* fix: revert changes from main
* fix: revert changes compared to main
* fix: add support for GX 0.18.22 and GX 1.4.x
* style: ran python linting
* fix: skip test if GX version is not installed
* fix: made databricks httpPath required and added a migration file for the same
* fix: added sql migration in postDataMigration file and fix databricks tests
* fix: added httpPath in test_source_connection.py and test_source_parsing.py files
* fix: added httpPath in test_databricks_lineage.py
* fix: table name in postgres migration
* Fix #19667: OpenSearch Connector
* do not ingest any system level indexes
* fix pyformat
* Add AWS auth
* Use common schema and fix ssl config in client
* Add OpenSearch connector docs and update schema
* Remove api key auth type and complete docs checklist
* Remove unnecessary httpx dependency and pyformat
* Add compatible version of httpx for elasticsearch
* Fix pylint fails and py-tests validation error
---------
Co-authored-by: Mohit Tilala <tilalamohit123@gmail.com>
Co-authored-by: Mohit Tilala <63147650+mohittilala@users.noreply.github.com>