MichelBartels
5b6b0cef77
Add UnlabeledTextProcessor ( #2054 )
...
* add UnlabeledTextProcessor
* allow choosing processor when finetuning or distilling
* fix type hint
* Add latest docstring and tutorial changes
* improve segment id computation for UnlabeledTextProcessor
* add text and documentation
* change batch size parameter for intermediate layer distillation
* Add latest docstring and tutorial changes
* fix distillation dim mapping
* remove unnecessary changes
* removed confusing parameter
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-25 14:54:34 +01:00
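The UnlabeledTextProcessor commit above mentions improving segment id computation. As a minimal illustration of the general BERT-style convention that processor follows (tokens up to and including the first `[SEP]` get segment id 0, the rest get 1), here is a sketch; the function name and the use of a bare sep token id are illustrative, not Haystack's actual implementation:

```python
def segment_ids(token_ids, sep_id):
    # Assign segment id 0 up to and including the first [SEP] token,
    # and 1 afterwards -- the usual BERT convention for question/passage pairs.
    ids, segment = [], 0
    for token in token_ids:
        ids.append(segment)
        if token == sep_id:
            segment = 1
    return ids

# e.g. [CLS] q [SEP] p p [SEP] with sep_id=102
print(segment_ids([101, 7, 102, 9, 10, 102], sep_id=102))  # [0, 0, 0, 1, 1, 1]
```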
MichelBartels
0cca2b97cd
distinguish intermediate layer & prediction layer distillation phases with different parameters ( #2001 )
...
* add parameters to allow for different hyperparameters in stage 1 and 2 of tinybert distillation
* Add latest docstring and tutorial changes
* improve default parameters
* Add latest docstring and tutorial changes
* split up distillation method
* Add latest docstring and tutorial changes
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-01-14 20:40:38 +01:00
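The commit above splits TinyBERT distillation into two phases with separate hyperparameters. As a rough, dependency-free sketch of the two loss functions involved (not Haystack's code, which operates on PyTorch tensors): phase 1 matches intermediate hidden states with MSE, phase 2 matches prediction-layer outputs with a KL divergence over temperature-softened distributions.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature yields a softer distribution.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def intermediate_layer_loss(student_hidden, teacher_hidden):
    # Phase 1: mean squared error between student and teacher hidden states.
    return sum((s - t) ** 2 for s, t in zip(student_hidden, teacher_hidden)) / len(student_hidden)

def prediction_layer_loss(student_logits, teacher_logits, temperature=2.0):
    # Phase 2: KL divergence from the softened teacher distribution to the student's.
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(t, s))
```

Because the two phases optimize different objectives, it is natural for them to take different learning rates, batch sizes, and epoch counts, which is what the separate parameters added here allow.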
MichelBartels
f33c2b987a
Adding distillation loss functions from TinyBERT ( #1879 )
...
* initial tinybertdistill commit
* add tinybert distill loss
* remove teacher caching for tinybert
* add tinybert to distil_from method
* Add latest docstring and tutorial changes
* add dim mapping and fix type hints
* fix type hints
* fix dummy input
* fix dim mapping for tinybert loss and add comments/doc strings
* add test for tinybert loss
* Add latest docstring and tutorial changes
* add comment
* fix BERT forward parameters
* add doc string to AdaptiveModel forward method
* remove unnecessary data silo
* fix farm import
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-12-23 14:54:02 +01:00
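Several bullets above refer to "dim mapping": in TinyBERT-style distillation the student's hidden size is usually smaller than the teacher's, so student hidden states are projected into the teacher's dimension before the MSE loss is computed. A minimal sketch of that projection (in real code this is a learnable `torch.nn.Linear`; here a plain weight matrix stands in):

```python
def dim_map(hidden_state, weight):
    # Project a student hidden vector of size d_s into the teacher's
    # d_t-dimensional space via a d_t x d_s weight matrix, so that
    # intermediate-layer losses compare vectors of equal size.
    return [sum(w * h for w, h in zip(row, hidden_state)) for row in weight]

# A 2-dim student vector mapped into a 3-dim teacher space:
student = [1.0, 2.0]
weight = [[1.0, 0.0],
          [0.0, 1.0],
          [1.0, 1.0]]
print(dim_map(student, weight))  # [1.0, 2.0, 3.0]
```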
tstadel
fc8df2163d
Fix Windows CI OOM ( #1878 )
...
* set fixture scope to "function"
* run FARMReader without multiprocessing
* dispose of ray after tests
* run most expensive tasks first in test files
* run expensive tests first
* run garbage collector between tests
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-12-22 17:20:23 +01:00
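One of the OOM mitigations listed above is running the garbage collector between tests. A plausible way to do that in pytest, as a sketch (the fixture name is illustrative, not necessarily what this PR used), is an autouse teardown fixture in `conftest.py`:

```python
# conftest.py (sketch): force a garbage-collection pass after every test
# to curb memory growth on memory-constrained CI runners such as Windows.
import gc

import pytest

@pytest.fixture(autouse=True)
def collect_garbage():
    yield          # run the test
    gc.collect()   # then reclaim unreachable objects before the next one
```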
MichelBartels
84147edcca
Model Distillation ( #1758 )
...
* initial commit
* Add latest docstring and tutorial changes
* added comments and fixed bug
* fixed bugs, added benchmark and added documentation
* Add latest docstring and tutorial changes
* fix type: ignore comment
* fix logging in benchmark
* fixed distillation config
* Add latest docstring and tutorial changes
* added type annotations
* fixed distillation loss calculation
* added type annotations
* fixed distillation mse loss
* improved model distillation benchmark config loading
* added temperature for model distillation
* removed unnecessary imports, added comments, added named parameter calls
* Add latest docstring and tutorial changes
* added some more comments
* added distillation test
* fixed distillation test
* removed unnecessary import
* fix softmax dimension
* add grid search
* improved model distillation benchmark config
* fixed model distillation hyperparameter search
* added doc strings and type hints for model distillation
* Add latest docstring and tutorial changes
* fixed type hints
* fixed type hints
* fixed type hints
* wrote out params instead of kwargs in DistillationDataSilo initializer
* fixed type hints
* fixed typo
* fixed typo
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2021-11-26 18:49:30 +01:00
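The bullets above mention adding grid search over distillation hyperparameters. The basic idea, exhaustively evaluating every combination from a parameter grid and keeping the best, can be sketched in a few lines (function and parameter names here are illustrative, not the benchmark's actual interface):

```python
from itertools import product

def grid_search(param_grid, evaluate):
    # Try every combination of hyperparameter values and return the
    # combination with the highest score under the given evaluate() function.
    best_score, best_params = float("-inf"), None
    keys = sorted(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = evaluate(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Toy usage: prefer temperature 2 and the smaller learning rate.
grid = {"temperature": [1.0, 2.0], "learning_rate": [0.1, 0.01]}
best, score = grid_search(grid, lambda p: -abs(p["temperature"] - 2.0) - p["learning_rate"])
print(best)  # {'learning_rate': 0.01, 'temperature': 2.0}
```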