mirror of https://github.com/Unstructured-IO/unstructured.git, synced 2025-08-10 17:59:09 +00:00
unstructured/test_unstructured_ingest/metrics/metrics-json-manifest.txt
4 lines
99 B
chore: remove copy line from non-matrix connectors (#1976)
2023-11-04 13:58:56 -04:00
handbook-1p.docx.json
build: text extraction evaluation metrics workflow added (#1757)

**Executive Summary**
This PR adds the evaluation metrics to our current workflow. It verifies the flow in which, when code is pushed, the output is evaluated against our gold standard and the results are written to a `.tsv` file.

**Technical Details**
- Adds evaluation metrics to the test-ingest workflow.
- Uses `structured-output` from `test-ingest` and compares it to the gold standard uploaded to S3, which is downloaded locally for the comparison. The folder currently in use is `s3://utic-dev-tech-fixtures/small-cct`. This directory can be changed in the shell script.
- With this PR, only one file from one connector is used for comparison.

**Misc**
- Few files overlap between test-ingest and the gold standard. More files will be added.

**Outputs**
2 `.tsv` files are saved under `test_unstructured_ingest/metrics/`.

---------
Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
Co-authored-by: Klaijan <Klaijan@users.noreply.github.com>
2023-10-23 17:39:22 -04:00
example-10k.html.json
IRS-form-1987.pdf.json
science-exploration-1p.pptx.json
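
The manifest above is a plain list of file names, one per line. As a minimal sketch of how such a list might be consumed (the helper names `read_manifest` and `select_outputs` are hypothetical and are not taken from the repository, and the assumption that the manifest selects which structured-output JSON files get evaluated is inferred from the commit message, not confirmed by it):

```python
from pathlib import Path


def read_manifest(manifest_path):
    """Return the non-empty, stripped lines of a manifest file.

    Hypothetical helper: assumes one file name per line, as in
    metrics-json-manifest.txt.
    """
    text = Path(manifest_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def select_outputs(output_dir, manifest_names):
    """Return JSON files under output_dir whose name is listed in the manifest.

    Hypothetical helper: searches recursively, since structured output is
    assumed to be grouped in per-connector subdirectories.
    """
    wanted = set(manifest_names)
    return sorted(p for p in Path(output_dir).rglob("*.json") if p.name in wanted)
```

Matching on the bare file name keeps the manifest independent of the connector directory layout, at the cost of ambiguity if two connectors emit a file with the same name.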