8 Commits

Author SHA1 Message Date
cragwolfe
bd8a74d686
chore: shell scripts default indent of 2 instead of 4 (#2287)
Shell scripts tend to nest several levels deep and produce long lines,
so update the default indent to 2 spaces.
2023-12-19 07:48:21 +00:00
Roman Isecke
76efcf4dd7
chore: add shfmt (#2246)
### Description
Given all the shell files that now exist in the repo, it would be nice
to have linting/formatting around them (in addition to the existing
shellcheck, which does not format the shell code). This PR introduces
`shfmt` to both check for changes and apply formatting when the
associated make targets are called.
2023-12-12 01:04:15 +00:00
Klaijan
877a30aed3
fix: fix eval ci to skip the overwrite if none exists (#2159)
Currently, `check-diff-evaluation-metrics` only runs when there is a
file to perform evaluation on. Add a condition to skip the action when
there is none. This change also includes further refactoring and adds a
`visualize` option to both evaluation calculation functions.
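
A minimal sketch of the "skip when there is nothing to evaluate" condition described above. The function name `has_eval_files` and the directory argument are hypothetical and are not taken from the repository's scripts.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if the directory exists and
# contains at least one regular file (searched recursively).
has_eval_files() {
  local dir="$1"
  [ -d "$dir" ] || return 1
  # Keep only the first match; empty output means nothing to evaluate.
  [ -n "$(find "$dir" -type f | head -n 1)" ]
}
```

A caller could then guard the evaluation step with something like `has_eval_files "$OUTPUT_DIR" || { echo "Nothing to evaluate, skipping."; exit 0; }`, where `OUTPUT_DIR` is again an assumed variable name.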
2023-11-25 15:46:05 +00:00
Klaijan
2c2d5b65ca
refactor: measure_text_edit_distance function for aggregation (#2108)
- Refactor `metrics/evaluation.py` to accept `grouping` as a parameter.
- Switch to `DataFrame` for easier analysis and aggregation.
2023-11-22 13:30:16 -08:00
Klaijan
366c8af2ae
ci: make eval fail on diff (#2138)
Add a condition to `check-diff-evaluation-metrics.sh` so that it exits
with an error when the new evaluation metric outputs differ from the
old ones.
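
A rough sketch of a "fail on diff" check, assuming the old and new metric outputs live in two directories. The function name `check_metrics_diff` and its arguments are hypothetical; the actual `check-diff-evaluation-metrics.sh` may be structured differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper: returns 0 when both directories have identical
# contents, and 1 (after printing the diff to stderr) when they differ.
check_metrics_diff() {
  local old_dir="$1" new_dir="$2"
  # -r compares recursively; -q only reports whether files differ.
  if diff -rq "$old_dir" "$new_dir" >/dev/null 2>&1; then
    echo "No difference in evaluation metrics."
    return 0
  fi
  echo "Evaluation metrics changed:" >&2
  # Show the full diff for the CI log; ignore diff's own exit status here.
  diff -r "$old_dir" "$new_dir" >&2 || true
  return 1
}
```

In a CI script this might be called as `check_metrics_diff metrics metrics-tmp || exit 1`, so that any difference fails the job.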
2023-11-21 20:55:03 -08:00
Klaijan
433c3889dc
ci: reorganize eval output folders and add azure to matrix test (#2093)
**Summary**
The CI workflow for evaluation previously saved the metric outputs
directly to the `metrics/` folder. They are now organized into
subfolders, e.g. `metrics/text-extraction` and `metrics/element-type`,
to keep the folder tidy.

The Azure connector is also added to `full_python_matrix_tests` in this
PR.

---------

Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
Co-authored-by: Klaijan <Klaijan@users.noreply.github.com>
2023-11-21 20:04:30 +00:00
Klaijan
5ba3b9c2c6
chore: get eval metrics from ingest in (#2097)
Co-authored-by: ryannikolaidis <1208590+ryannikolaidis@users.noreply.github.com>
Co-authored-by: Klaijan <Klaijan@users.noreply.github.com>
2023-11-17 18:22:36 +00:00
Klaijan
777a428071
chore: for ingest-test metrics, also check subdirs (#2079)
- The copy script only went through one layer of subdirectories, so it
did not find the match between the manifest file and the structured
output. It now searches all subdirectories.
- `set -e` causes the script to exit on any non-zero exit status. Fix
all scripts that need to run the copy script to use `set +e` right
before the diff check, then switch back to `set -e` afterwards.
- Change the default evaluation metrics output from `metrics` to
`metrics-tmp` to account for the diff check
- Add a script that checks the differences between old eval metric
output (metrics) and new eval metrics output (metrics-tmp)
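
The `set +e` / `set -e` toggle described above could look roughly like the following. This is a sketch under assumptions: `run_without_errexit` and `LAST_STATUS` are hypothetical names, and the wrapper always turns `errexit` back on afterwards, so it only suits scripts that already run under `set -e`.

```shell
#!/usr/bin/env bash
set -e

LAST_STATUS=0

# Hypothetical wrapper: run a command that may fail without aborting
# the script, and record its exit status for the caller to inspect.
run_without_errexit() {
  set +e            # tolerate a non-zero exit from the wrapped command
  "$@"
  LAST_STATUS=$?
  set -e            # restore fail-fast behavior for the rest of the script
}
```

A script might call `run_without_errexit ./check-diff.sh` (the script path is a placeholder) and then branch on `$LAST_STATUS` instead of exiting immediately.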
2023-11-15 21:02:43 -08:00