mirror of
https://github.com/Unstructured-IO/unstructured.git
synced 2025-06-27 02:30:08 +00:00

### Summary

Updates the `Dockerfile` to use the Chainguard `wolfi-base` image to reduce CVEs. Also adds a step to the docker publish job that scans the images for CVEs before publishing. The job fails if any high or critical vulnerabilities are found.

### Testing

Run `make docker-run-dev`, then start `python3.11` once you're inside the container. At that point, you can try:

```python
from unstructured.partition.auto import partition

elements = partition(filename="example-docs/DA-1p.pdf", skip_infer_table_types=["pdf"])
elements
```

Stop the container once you're done.
23 lines
420 B
Bash
Executable File
#!/bin/bash

files=(
  "libreoffice-7.6.5-r0.apk"
  "openjpeg-2.5.0-r0.apk"
  "poppler-23.09.0-r0.apk"
  "leptonica-1.83.0-r0.apk"
  "pandoc-3.1.8-r0.apk"
  "tesseract-5.3.2-r0.apk"
  "nltk_data.tgz"
)

directory="docker-packages"
mkdir -p "${directory}"

for file in "${files[@]}"; do
  echo "Downloading ${file}"
  wget "https://utic-public-cf.s3.amazonaws.com/$file" -P "$directory"
done

echo "Downloads complete."
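
The loop above downloads every file unconditionally and keeps going if a download fails. The following is a minimal sketch of a more defensive variant, not part of the repository: it stops at the first failed download and skips files that already exist locally. The base URL and directory name are taken from the script above; the function names `fetch_package` and `fetch_all` and the `DOWNLOAD_CMD` override are hypothetical, introduced only for illustration.

```bash
#!/bin/bash
# Sketch only: a defensive variant of the download loop above.
# fetch_package, fetch_all and DOWNLOAD_CMD are hypothetical names.
set -eu

BASE_URL="https://utic-public-cf.s3.amazonaws.com"
directory="${directory:-docker-packages}"

# Download one file unless a non-empty copy is already present.
fetch_package() {
  local file="$1"
  if [ -s "${directory}/${file}" ]; then
    echo "Skipping ${file} (already present)"
    return 0
  fi
  echo "Downloading ${file}"
  # DOWNLOAD_CMD defaults to wget; it can be swapped out, e.g. for a dry run.
  "${DOWNLOAD_CMD:-wget}" "${BASE_URL}/${file}" -P "${directory}"
}

# Download every file passed as an argument; with `set -e` the script
# exits as soon as one download returns a non-zero status.
fetch_all() {
  local file
  mkdir -p "${directory}"
  for file in "$@"; do
    fetch_package "${file}"
  done
  echo "Downloads complete."
}
```

Assuming the functions are defined in the current shell, a call such as `fetch_all "pandoc-3.1.8-r0.apk" "nltk_data.tgz"` would fetch only the files that are missing from `docker-packages`. Note that the `-s` test only checks that a file is non-empty, so a partially downloaded file would still be skipped; verifying checksums would be a stricter alternative.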