* MINOR: User search should only look in name & displayname
* py_format
* pyformat
---------
Co-authored-by: Suman Maharana <sumanmaharana786@gmail.com>
This fixes the S3 ingestion for Parquet files. Currently the pipeline breaks if it
encounters a file in the container that is not valid Parquet. This is a problem
because containers may hold non-Parquet files on purpose, for example the _SUCCESS
files created by Spark.
With this change the whole pipeline no longer fails when a single container fails.
Instead, the container is counted as a failure and the pipeline moves on to the
remaining containers. This is already an improvement, but ideally the ingestion
should try a couple more files under the given prefix before giving up. Additionally,
we could allow users to specify file patterns to be ignored.
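
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual ingestion code: `IngestionStatus`, `ingest_containers`, the `read_schema` callable, and the `ignore_patterns` parameter are all hypothetical names, and the ignore-pattern handling only illustrates the follow-up idea mentioned above.

```python
import fnmatch
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Sequence


@dataclass
class IngestionStatus:
    """Hypothetical status holder: tracks what was processed, failed, or skipped."""
    processed: List[str] = field(default_factory=list)
    failures: List[str] = field(default_factory=list)
    ignored: List[str] = field(default_factory=list)


def ingest_containers(
    sample_files: Iterable[str],
    read_schema: Callable[[str], object],
    ignore_patterns: Sequence[str] = (),
) -> IngestionStatus:
    """Process one sample file per container without aborting on a bad file.

    `read_schema` stands in for whatever reads the Parquet schema; it is
    assumed to raise an exception when the file is not valid Parquet.
    """
    status = IngestionStatus()
    for key in sample_files:
        file_name = key.rsplit("/", 1)[-1]
        # Assumed follow-up feature: skip files matching user-supplied patterns.
        if any(fnmatch.fnmatch(file_name, pattern) for pattern in ignore_patterns):
            status.ignored.append(key)
            continue
        try:
            read_schema(key)
        except Exception as exc:  # count the failure and move on
            status.failures.append(f"{key}: {exc}")
            continue
        status.processed.append(key)
    return status
```

With a reader that rejects `_SUCCESS` files, the run completes and reports one failure instead of raising; passing `ignore_patterns=["_SUCCESS"]` would skip such files without counting them as failures.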
Co-authored-by: Abdallah Serghine <abdallah.serghine@olx.pl>
Co-authored-by: Pere Miquel Brull <peremiquelbrull@gmail.com>
* fix(data-diff): added MD5 handling for bigquery
- added MD5 handling for bigquery
- use URL instead of Engine because it requires fewer steps and is less prone to failure
* added e2e test for data diff with sampling in bigquery
* Get last completed job run
* formatting
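
For context on the MD5 handling above: in BigQuery, `MD5` returns `BYTES` rather than a hex string, so a comparable checksum usually needs a `TO_HEX` wrapper. The helper below is only a sketch of that idea; `bigquery_md5_expr` is a hypothetical name and the actual data-diff implementation may build the expression differently.

```python
def bigquery_md5_expr(column: str) -> str:
    """Build a BigQuery SQL expression yielding a hex MD5 digest of a column.

    Hypothetical helper for illustration only. The column is cast to STRING
    first, and TO_HEX converts the BYTES returned by MD5 into hex text.
    """
    return f"TO_HEX(MD5(CAST(`{column}` AS STRING)))"
```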
---------
Co-authored-by: 😺Leo Luo <leo.luo@mavenclinic.com>
Co-authored-by: Suman Maharana <sumanmaharana786@gmail.com>
* fix: sqa table reference
* style: ran python linting
* fix: added raw dataset to query runner
* fix: get table and schema name from orm object
* fix: get table level config for table tests
* add dbt freshness check
* docs
* run linting
* add test case param definition
* fix test case param definition
* add config for dbt http, fix linting
* refactor (only create freshness test definition when user executed one)
* fix dbt files class
* fix dbt files class 2
* fix dbt objects class
* fix linting
* fix pylint
* fix linting once and for all
---------
Co-authored-by: Teddy <teddy.crepineau@gmail.com>
* ref(data-quality): modularized test case validator import
- removed test_suite_factory
- implemented TestCaseImporter
- removed SQAValidatorBuilder and PandasValidatorBuilder in favor of a SourceType enum
- removed the orm table creation from test suite source
* format
* IValidatorBuilder -> ValidatorBuilder
* use the table from the sampler in the test suite interface
* linting
* fixed the profiler with similar solution
* removed unused inheritance
* removed unneeded super().__init__()
* removed all instances of orm_table
* fixed tests
* add reportExplicitAny=false
* fixed tests