build(deps): add typing extensions dep (#1835)

Closes #1330.

Added `typing-extensions` as an explicit dependency (it was previously
an implicit dependency via `dataclasses-json`).

This dependency should be explicit, since we import from it directly in
`unstructured.documents.elements`. This has the added benefit that
`TypedDict` will be available for Python 3.7 users.
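
As a rough illustration of why this matters on Python 3.7, where `TypedDict` is not in the standard library (it was added to `typing` in 3.8), the sketch below falls back to `typing_extensions`. The `ExampleMetadata` class is hypothetical and is not taken from `unstructured.documents.elements`, which imports from `typing_extensions` directly.

```python
try:
    # Python 3.8+ ships TypedDict in the standard library.
    from typing import TypedDict
except ImportError:
    # Python 3.7: TypedDict comes from the typing-extensions package.
    from typing_extensions import TypedDict


class ExampleMetadata(TypedDict):
    """Hypothetical typed dictionary, used only for illustration."""

    filename: str
    page_number: int


# At runtime a TypedDict instance is a plain dict; the keys are checked
# by static type checkers, not by the interpreter.
meta: ExampleMetadata = {"filename": "example.pdf", "page_number": 1}
```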

Other changes:
* Ran `pip-compile`
* Fixed a bug in `version-sync.sh` that caused an error when syncing to
a dev version from a release version.

#### Testing:

To test the Python 3.7 functionality, in a Python 3.7 environment
install the base requirements and run
```python
from unstructured.documents.elements import Element

```
This also works on `main`, because `typing-extensions` is already
installed there as an implicit requirement. However, if you run
`pip uninstall typing-extensions` and then run the above code, the
import should fail. This update makes sure `typing-extensions` doesn't
get lost if the other dependencies move around.
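
One possible way to check the environment before and after the uninstall step is a small helper like the one below. This is a sketch only: `is_installed` is a hypothetical name and is not part of `unstructured`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The distribution is named `typing-extensions`, but the importable
    # module is `typing_extensions`.
    print("typing_extensions available:", is_installed("typing_extensions"))
```

If this reports that the module is unavailable, the import of `Element` above is expected to fail.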

To reproduce the `version-sync.sh` bug that was fixed, on `main`,
increment the most recent version in `CHANGELOG.md` while leaving the
version in `__version__.py` unchanged. Then add the following lines to
`version-sync.sh`, starting on line 114, to simulate the circumstances
that trigger the bug:

```
MAIN_IS_RELEASE=true
CURRENT_BRANCH="something-not-main"
```

Then run `make version-sync`.

The expected behavior is that the version in `__version__.py` is changed
to the new version to match `CHANGELOG.md`, but instead it exits with an
error.

The fix was to exit with this error only when the script is running in
`-c` ("check") mode. In sync mode the script now prints a warning and
proceeds to update the version.
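
The decision the script makes after the fix can be paraphrased in Python roughly as follows. The script itself is Bash; the function name, parameter names, and return labels here are illustrative assumptions that mirror the variables in the diff (`MAIN_IS_RELEASE`, `UPDATED_VERSION`, `MAIN_VERSION`, `CURRENT_BRANCH`, `CHECK`).

```python
def conflict_action(
    main_is_release: bool,
    updated_version: str,
    main_version: str,
    current_branch: str,
    check_mode: bool,
) -> str:
    """Return "ok", "warn", or "error" for a proposed version update."""
    # Only one commit should be associated with a particular non-dev version.
    conflict = (
        main_is_release
        and updated_version == main_version
        and current_branch != "main"
    )
    if not conflict:
        return "ok"
    # Check mode fails hard; sync mode only warns and goes on to update the file.
    return "error" if check_mode else "warn"
```

For example, `conflict_action(True, "1.0.0", "1.0.0", "feature", check_mode=False)` yields `"warn"`, whereas the same call with `check_mode=True` yields `"error"`.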
qued 2023-10-24 14:19:09 -05:00 committed by GitHub
parent 01a0e003d9
commit d79f633ada
5 changed files with 21 additions and 9 deletions


@@ -1,4 +1,4 @@
-## 0.10.26-dev4
+## 0.10.26-dev5
 ### Enhancements
@@ -10,6 +10,7 @@
 ### Fixes
+* **Adds `typing-extensions` as an explicit dependency** This package is an implicit dependency, but the module is being imported directly in `unstructured.documents.elements` so the dependency should be explicit in case changes in other dependencies lead to `typing-extensions` being dropped as a dependency.
 * ** Stop passing `extract_tables` to unstructured-inference ** since it is now supported in unstructured instead. Also noted the table
 output regressioin for PDF files.
 * **Fix a bug on Table partitioning** Previously the `skip_infer_table_types` variable used in partition was not being passed down to specific file partitioners. Now you can utilize the `skip_infer_table_types` list variable in partition to pass the filetype you want to exclude `text_as_html` metadata field for, or the `infer_table_structure` boolean variable on the file specific partitioning function.


@@ -14,3 +14,4 @@ langdetect
 numpy
 rapidfuzz
 backoff
+typing-extensions


@@ -63,7 +63,9 @@ tabulate==0.9.0
 tqdm==4.66.1
 # via nltk
 typing-extensions==4.8.0
-# via typing-inspect
+# via
+# -r requirements/base.in
+# typing-inspect
 typing-inspect==0.9.0
 # via dataclasses-json
 urllib3==1.26.18


@@ -1,6 +1,6 @@
 #!/usr/bin/env bash
-set -eu
+set -u
 function usage {
 echo "Usage: $(basename "$0") [-c] -f FILE_TO_CHANGE REPLACEMENT_FORMAT [-f FILE_TO_CHANGE REPLACEMENT_FORMAT ...]" 2>&1
@@ -123,12 +123,19 @@ for i in "${!FILES_TO_CHECK[@]}"; do
 # No match to semver regex in VERSIONFILE, so nothing to replace
 printf "Error: No semver version found in file %s.\n" "$FILE_TO_CHANGE"
 exit 1
-elif [[ "$MAIN_IS_RELEASE" == true && "$FILE_VERSION" == "$MAIN_VERSION" && "$CURRENT_BRANCH" != "main" ]];
+else
+if [[ "$MAIN_IS_RELEASE" == true && "$UPDATED_VERSION" == "$MAIN_VERSION" && "$CURRENT_BRANCH" != "main" ]];
 then
 # Only one commit should be associated with a particular non-dev version
+if [[ "$CHECK" == 1 ]];
+then
 printf "Error: there is already a commit associated with version %s.\n" "$MAIN_VERSION"
 exit 1
 else
+printf "Warning: there is already a commit associated with version %s.\n" "$MAIN_VERSION"
+fi
+fi
 # Replace semver in VERSIONFILE with semver obtained from SOURCE_FILE
 TMPFILE=$(mktemp /tmp/new_version.XXXXXX)
 # Check sed version, exit if version < 4.3
@@ -163,6 +170,7 @@ done
 # Exit with code determined by whether changes were needed in a check.
 if [ ${FAILED_CHECK} -ne 0 ]; then
+printf "\nVersions are out of sync! See above for diffs.\n"
 exit 1
 else
 exit 0


@@ -1 +1 @@
-__version__ = "0.10.26-dev4" # pragma: no cover
+__version__ = "0.10.26-dev5" # pragma: no cover