Michał Martyniak 2f25d8f79e
Support for concurrent processing of documents during evaluation (#2973)
Currently, CCT eval takes a long time for any of the test_metrics CI
runs. Documents in an eval set are evaluated sequentially, and it
appears that at most 1 CPU core is utilized. This implies there could
be a large speedup from running eval across multiple docs concurrently
(probably with multiprocessing).

Things done in this PR:
- [x] `concurrent.futures.ProcessPoolExecutor` instead of a sequential
for-loop
- [x] refactoring/reorganization of redundant pieces of code without
changing the inner logic too much. Without it, documents would be
processed in 3 separate places. Take a look at the `BaseMetricsCalculator`
class and the classes that inherit from it.
- [x] string path manipulation has been reworked and now relies on
`pathlib.Path()`
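
The switch from a sequential for-loop to a process pool can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `score_document`, `evaluate_documents`, and the `executor_cls` parameter are hypothetical names, and the metric is a placeholder that does not read any files.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def score_document(doc_path: Path) -> tuple[str, int]:
    """Hypothetical stand-in for the per-document metric computation."""
    # Placeholder metric: a real evaluation would read and compare files.
    return doc_path.name, len(doc_path.stem)


def evaluate_documents(doc_paths, max_workers=None, executor_cls=ProcessPoolExecutor):
    """Score every document concurrently and return {file name: score}."""
    # Normalize inputs to pathlib.Path instead of manipulating raw strings.
    paths = [Path(p) for p in doc_paths]
    with executor_cls(max_workers=max_workers) as executor:
        # Executor.map yields results in the same order as the inputs.
        return dict(executor.map(score_document, paths))
```

The worker function is defined at module level so that it can be pickled and sent to worker processes. On platforms that start workers with `spawn`, the call to `evaluate_documents` should sit under an `if __name__ == "__main__":` guard. Making the executor class a parameter is an assumption of this sketch; it allows swapping in `ThreadPoolExecutor` for debugging.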
2024-05-09 21:25:47 +00:00