* add TinyBERT data augmentation
* don't reload GloVe in TinyBERT data augmentation
* fix unnecessary load_glove call
* fix type hints
* add comments and type hints
* add batch_size argument
* don't predict subwords as alternatives for words
* fix subword predictions
* limit sequence length
* actually limit sequence length
* improve performance by calculating nearest GloVe vector on GPU
* add model and tokenizer parameters
* fix type hints
* improve data augmentation performance
* explain limits of script
* correct comment
* add data augmentation test
* don't label every question in augmented dataset as impossible
* add sample GloVe
* improve handling of GloVe download
* fix typo from last commit