5 Commits

Author SHA1 Message Date
Roman Isecke
63861f537e
Add check for duplicate click options (#1775)
### Description
Given that many of the options associated with the `Click` based cli
ingest commands are added dynamically from a number of configs, a check
was incorporated to make sure there were no duplicate entries to prevent
new configs from overwriting already added options.

### Issues that were found and fixes:
* duplicate api-key option set on Notion command conflicts with api key
used for unstructured api. Added notion prefix.
* retry logic configs had duplicates in biomed. Removed since this is
not handled by the pipeline.
2023-10-20 14:00:19 +00:00
Roman Isecke
b265d8874b
refactoring linting (#1739)
### Description
Currently linting only takes place over the base unstructured directory
but we support python files throughout the repo. It makes sense for all
those files to also abide by the same linting rules so the entire repo
was set to be inspected when the linters are run. Along with that
autoflake was added as a linter which has a lot of added benefits such
as removing unused imports for you that would currently break flake and
require manual intervention.

The only real relevant changes in this PR are in the `Makefile`,
`setup.cfg`, and `requirements/test.in`. The rest is the result of
running the linters.
2023-10-17 12:45:12 +00:00
Matt Robinson
08e091c5a9
chore: Reorganize partition bricks under partition directory (#76)
* move partition_pdf to partition folder

* move partition.py

* refactor partioning bricks into partition diretory

* import to nlp for backward compatibility

* update docs

* update version and bump changelog

* fix typo in changelog

* update readme reference
2022-11-21 22:27:23 +00:00
Sebastian Laverde Alfonso
baa15d0098
feat: new partitioning brick that calls the document image analysis API (#68)
* docs: add new feature to the CHANGELOG.md, bump the version, update __version__.py

* feat: new partition to call the document image analysis API

* fix: remove duplicated dependency on partition.py

* fix: linting error due to line-lenght > 100

* test: add test to call partition_pdf brick

* chore: new short example-doc pdf for speed up in test X8

* fix: add missing return statement to _read to pass check

* feat: new partitioning brick to call doc parse API

* docs: version update fix in CHANGELOG

* refactor: no nested ifs

* docs: documentation for new brick partition_pdf

* refactor: made tidy

* docs: minor doc refactor

Co-authored-by: Sebastian Laverde <sebastian@unstructured.io>
2022-11-16 17:48:30 +01:00
Matt Robinson
5f40c78f25 Initial Release 2022-09-26 14:55:20 -07:00