* Fix types in test_run.py
* Get test_run.py to pass fmt-check
* Add test_run to mypy checks
* Update test folder to pass ruff linting
* Fix merge
* Fix HF tests
* Fix hf test
* Try to fix tests
* Another attempt
* minor fix
* fix SentenceTransformersDiversityRanker
* skip integrations tests due to model unavailable on HF inference
---------
Co-authored-by: anakin87 <stefanofiorucci@gmail.com>
* fix eval metric docstrings, change type of individual scores
* change import order
* change exactmatch docstring to single ground truth answer
* change exactmatch comment to single ground truth answer
* reverted changing docs to single ground truth
* add warm up in SASEvaluator example
* fix FaithfulnessEvaluator docstring example
* extend FaithfulnessEvaluator docstring example
* Update FaithfulnessEvaluator init docstring
* Remove outdated default from LLMEvaluator docstring
* Add examples param to LLMEvaluator docstring example
* Add import and print to LLMEvaluator docstring example