* fix data augmentation path in finetuning notebook
* Add latest docstring and tutorial changes
* make distillation possible with models other than BERT
* use smaller dataset for distillation in finetuning tutorial
* Add latest docstring and tutorial changes
* make data augmentation in finetuning faster
* update language models' forward docstrings
* fix return type of language models
* remove debug output
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>