### Description
Update all other connectors to use the new downstream architecture that
was recently introduced for the s3 connector.
Closes#1313 and #1311
The CustomError that we use to wrap custom ingest errors inherits from
BaseException rather than Exception (as we should, per specification
[here](https://docs.python.org/3/library/exceptions.html#BaseException)).
This resulted in exceptions not properly raising as expected. This PR
changes the inheritance which resolves the known issue.
Additionally, our base definition for get_file on IngestDoc was wrapped
with SourceConnectionError, however this must be explicitly decorating
each subclass definition in order to function. This PR does that.
## Testing
Some unit test coverage was added for the error wrapping class, however
this wasn't properly recreating the issue we are seeing when running
ingest tests.
To recreate that issue one can intentionally raise an exception in the
[partition_file](https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/ingest/interfaces.py#L214C9-L214C23)
definition and then run any ingest test. Prior to this change: the code
and logs suggest that everything ran without exception, but the
partitioned output was not generated (as a result the test will fail
without any clues as to what went wrong). With this update, the expected
custom partition error, error message, and stack trace will be visible.
---------
Co-authored-by: Ahmet Melek <39141206+ahmetmeleq@users.noreply.github.com>
* split dependencies by document type
* make pip-compile with new requirements
* add extra requirements to setup.py
* add in all docs; re pip-compile
* extra for all docs
* add pandas to xlsx
* dependency requires for tsv and csv
* handling for doc, docx and odt
* dependency check for pypandoc
* required dependencies for pandoc files
* xml and html
* markdown
* msg
* add in pdf
* add in pptx
* add in excel
* add lxml as base req
* extra all docs for local inference
* local inference installs all
* pin pillow version
* fixes for plain text tests
* fixes for doc
* update make commands
* changelog and version
* add xlrd
* update pip-compile
* pin numpy for python 3.8 support
* more constraints
* contraint on scipy
* update install docs
* constrain ipython
* add outlook to pip-compile
* more ipython constraints
* add extras to dockerfile
* pin office365 client
* few doc tweaks
* types as strings
* last pip-compile
* re pip-comple
* make tidy
* make tidy
* add support for page numbers in docx when present
* version and changelog
* add comment on page numbers
* add header and footer to doc elements list
* update integrations docs
* include_page_breaks kwarg for doc and docx
* merge element metadata for pagebreaks
* fix typo
* fix changelog typo
* change page number default to None
* add initial_page_number kwarg
* make page number tests in pdf more explicit
* revert test file
* update ingest tests
* update test fixture outputs
* updates to IRS forms fixtures
* ingest-test-fixtures-update
* Update ingest test fixtures (#759)
Co-authored-by: MthwRobinson <MthwRobinson@users.noreply.github.com>
---------
Co-authored-by: Unstructured-DevOps <111007769+Unstructured-DevOps@users.noreply.github.com>
Co-authored-by: MthwRobinson <MthwRobinson@users.noreply.github.com>