MichelBartels 0cca2b97cd
Distinguish intermediate-layer and prediction-layer distillation phases with different parameters (#2001)
* add parameters to allow different hyperparameters in stages 1 and 2 of TinyBERT distillation

* Add latest docstring and tutorial changes

* improve default parameters

* Add latest docstring and tutorial changes

* split up distillation method

* Add latest docstring and tutorial changes

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-14 20:40:38 +01:00