								# Hyperparameter Optimization for Huggingface Transformers
AutoTransformers is an AutoML class for fine-tuning pre-trained language models, built on the Hugging Face transformers library.
								An example of using AutoTransformers:
```python
from flaml.nlp.autotransformers import AutoTransformers

autohf = AutoTransformers()

preparedata_setting = {
    "dataset_subdataset_name": "glue:mrpc",
    "pretrained_model_size": "electra-base-discriminator:base",
    "data_root_path": "data/",
    "max_seq_length": 128,
}

autohf.prepare_data(**preparedata_setting)

autohf_settings = {
    "resources_per_trial": {"gpu": 1, "cpu": 1},
    "num_samples": -1,  # unlimited sample size
    "time_budget": 3600,
    "ckpt_per_epoch": 1,
    "fp16": False,
}

validation_metric, analysis = autohf.fit(**autohf_settings)
```
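The `num_samples` and `time_budget` settings jointly bound the search: trials stop once either the sample cap or the wall-clock budget is exhausted, with `-1` meaning no cap on samples. The stdlib-only sketch below illustrates that stopping rule with a mocked-out trial function; `run_trials` and `trial_fn` are illustrative names, not part of the FLAML API.

```python
import time

def run_trials(num_samples, time_budget, trial_fn):
    """Run trials until the sample cap or the wall-clock budget is hit.

    num_samples == -1 means no cap on the number of trials, mirroring
    the "num_samples": -1 setting above (an illustration, not FLAML code).
    """
    results = []
    deadline = time.monotonic() + time_budget
    while num_samples == -1 or len(results) < num_samples:
        if time.monotonic() >= deadline:
            break  # time budget exhausted
        results.append(trial_fn(len(results)))
    return results

# Mock trial: returns its index as a stand-in for a validation score.
scores = run_trials(num_samples=5, time_budget=10.0, trial_fn=lambda i: i)
```

With a real tuner, each trial would fine-tune the model under one sampled configuration and report a validation score instead of its index.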
The use cases currently supported:
1. A simplified workflow for fine-tuning on the GLUE benchmark with Hugging Face transformers;
2. Selecting a better search space for fine-tuning on the GLUE benchmark;
3. Using FLAML's search algorithms for more efficient fine-tuning of Hugging Face models.
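What a search algorithm does here can be sketched with plain random search over a fine-tuning hyperparameter space. FLAML's actual algorithms (e.g. CFO and BlendSearch) are more sample-efficient than this; the search space and the `mock_eval` objective below are illustrative assumptions, not FLAML internals.

```python
import random

# Illustrative fine-tuning search space (an assumption, not FLAML's exact space).
search_space = {
    "learning_rate": [1e-5, 3e-5, 5e-5, 1e-4],
    "per_device_batch_size": [16, 32],
    "num_train_epochs": [2, 3, 4],
}

def mock_eval(config):
    """Stand-in for one fine-tuning trial; returns a fake validation score."""
    penalty = abs(config["learning_rate"] - 3e-5) * 1000  # prefers lr near 3e-5
    return 0.85 - penalty + 0.01 * config["num_train_epochs"]

def random_search(space, num_samples=20, seed=0):
    """Sample configurations uniformly and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(num_samples):
        config = {k: rng.choice(v) for k, v in space.items()}
        score = mock_eval(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

best_config, best_score = random_search(search_space)
```

In AutoTransformers, each trial would launch an actual fine-tuning run on the prepared data, and a cost-aware search algorithm would decide which configuration to try next instead of sampling uniformly.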
Use cases that may be supported in the future:
1. HPO fine-tuning for text generation;
2. HPO fine-tuning for question answering.
								## Troubleshooting fine-tuning HPO for pre-trained language models
To reproduce the results in our ACL 2021 paper:
* [An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models](https://arxiv.org/abs/2106.09204). Xueqing Liu, Chi Wang. ACL-IJCNLP 2021.
```bibtex
@inproceedings{liu2021hpo,
    title={An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models},
    author={Xueqing Liu and Chi Wang},
    year={2021},
    booktitle={ACL-IJCNLP},
}
```
Please refer to the following Jupyter notebook: [Troubleshooting HPO for fine-tuning pre-trained language models](https://github.com/microsoft/FLAML/blob/main/notebook/research/acl2021.ipynb)