fix: local connector output filename when a single file is being processed (#879)

* fix string processing error for _output_filename

* Add docstring and type hint, update CHANGELOG, update version

* update test fixture

* simple code change commit to retrigger ci checks

* update test fixture - after brew install tesseract-lang

* Update ingest test fixtures (#882)

Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com>

* correct CHANGELOG

* correct CHANGELOG

---------

Co-authored-by: Unstructured-DevOps <111007769+Unstructured-DevOps@users.noreply.github.com>
Co-authored-by: ahmetmeleq <ahmetmeleq@users.noreply.github.com>
Ahmet Melek 2023-07-05 22:37:40 +01:00 committed by GitHub
parent 24dad24f87
commit 4b827f0793
5 changed files with 15 additions and 6 deletions


@@ -8,6 +8,7 @@ repos:
- id: check-json
- id: check-xml
- id: end-of-file-fixer
exclude: \.json$
include: \.py$
- id: trailing-whitespace
- id: mixed-line-ending


@@ -1,4 +1,4 @@
-## 0.8.0-dev0
+## 0.8.0-dev1
 ### Enhancements
@@ -8,6 +8,7 @@
 ### Fixes
 * Fix KeyError when `isd_to_elements` doesn't find a type
+* Fix _output_filename for local connector, allowing single files to be written correctly to the disk
 ### BREAKING CHANGES


@@ -1 +1 @@
-__version__ = "0.8.0-dev0"  # pragma: no cover
+__version__ = "0.8.0-dev1"  # pragma: no cover


@@ -51,11 +51,18 @@ class LocalIngestDoc(BaseIngestDoc):
         pass
 
     @property
-    def _output_filename(self):
-        return (
-            Path(self.standard_config.output_dir)
-            / f"{self.path.replace(f'{self.config.input_path}/', '')}.json"
+    def _output_filename(self) -> Path:
+        """Returns output filename for the doc
+        If input path argument is a file itself, it returns the filename of the doc.
+        If input path argument is a folder, it returns the relative path of the doc.
+        """
+        input_path = Path(self.config.input_path)
+        basename = (
+            f"{Path(self.path).name}.json"
+            if input_path.is_file()
+            else f"{Path(self.path).relative_to(input_path)}.json"
         )
+        return Path(self.standard_config.output_dir) / basename
 
 
 class LocalConnector(BaseConnector):
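
The behavioural difference in the hunk above can be reproduced outside the connector. The following is a minimal sketch, not the project's code: `old_output_filename` and `new_output_filename` are hypothetical free functions that mirror the removed and added property bodies, and the temporary directory layout is invented for illustration. It shows that when the input path is the file itself, the old `str.replace` finds no `<input_path>/` prefix to strip, so the full (here absolute) path survives and `pathlib` discards the output directory when joining.

```python
import tempfile
from pathlib import Path


def old_output_filename(path: str, input_path: str, output_dir: str) -> Path:
    """Mirror of the removed logic: strip '<input_path>/' with str.replace."""
    return Path(output_dir) / f"{path.replace(f'{input_path}/', '')}.json"


def new_output_filename(path: str, input_path: str, output_dir: str) -> Path:
    """Mirror of the added logic: branch on file vs. folder input."""
    root = Path(input_path)
    basename = (
        f"{Path(path).name}.json"
        if root.is_file()
        else f"{Path(path).relative_to(root)}.json"
    )
    return Path(output_dir) / basename


# Hypothetical layout: docs/report.txt and docs/sub/notes.txt, output in out/.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "docs"
    (docs / "sub").mkdir(parents=True)
    single = docs / "report.txt"
    nested = docs / "sub" / "notes.txt"
    single.write_text("hello")
    nested.write_text("world")
    out = Path(tmp) / "out"

    # Input path is the file itself: nothing is stripped, so the absolute
    # path replaces the output directory when the two are joined.
    old_single = old_output_filename(str(single), str(single), str(out))
    # The new logic keeps only the file name under the output directory.
    new_single = new_output_filename(str(single), str(single), str(out))
    # Input path is a folder: the relative sub-path is preserved.
    new_nested = new_output_filename(str(nested), str(docs), str(out))

    print(old_single)
    print(new_single)
    print(new_nested)
```

In this sketch the old result points next to the input file rather than into `out/`, while the new result lands in `out/` for both the single-file and folder cases.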