11 Commits

Author SHA1 Message Date
Harshal Sheth
e99875cac6
chore(ingest): enable flake8 bugbear linting (#7763) 2023-04-10 14:14:42 -07:00
Andrew Sikowitz
ce1ac7fa12
refactor(ingest): Use sqlite.Row row_factory for FileBackedCollections (#7739) 2023-04-04 11:53:56 -07:00
Andrew Sikowitz
c7d35ffd66
perf(ingest): Improve FileBackedDict iteration performance; minor refactoring (#7689)
- Adds dirty bit to cache, only writes data if dirty
- Refactors __iter__
- Adds sql_query_iterator
- Adds items_snapshot, more performant `items()` that allows for filtering
- Renames connection -> shared_connection
- Removes unnecessary flush during close if connection is not shared
- Adds Closeable mixin
2023-03-27 17:20:34 -04:00
Andrew Sikowitz
8dd7a85533
refactor(ingest): Use shared connection wrapper over connection cache (#7570) 2023-03-14 15:09:37 -07:00
Harshal Sheth
fbfe43b1cb
feat(ingest): fix edge cases + interface cleanup for file-system APIs (#7533) 2023-03-13 13:14:53 -07:00
Harshal Sheth
b82afa89f1
feat(ingest): enable joins across FileBackedDicts + add FileBackedList (#7506) 2023-03-09 15:22:03 -08:00
Andrew Sikowitz
8101f0d47a
feat(ingest): Introduce FileBackedDict for offloading data to disk (#7461)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Also includes minor refactoring to the bigquery connector
2023-03-01 19:09:51 -05:00
Tamas Nemeth
9015a43f25
fix(ingest): bigquery-beta - Adding python 3.8 fix for memory footprint util (#6228) 2022-10-18 17:59:31 -07:00
Tamas Nemeth
2f79b50c24
fix(ingest): presto-on-hive - not failing on Hive type parsing error (#6118)
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
2022-10-04 20:54:38 -07:00
Mayuri Nehate
b195b6c123
fix(ingest): encode reserved characters when creating dataset urn (#5977)
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
2022-09-20 16:59:02 -07:00
Shirshanka Das
9afda47085
feat(cli): add support for sampled reporting to keep logs manageable (#5800) 2022-09-01 14:47:28 -07:00