								# Hyperparameter Optimization for Huggingface Transformers
AutoTransformers is an AutoML class for fine-tuning pre-trained language models, built on the Hugging Face transformers library.
								An example of using AutoTransformers:
```python
from flaml.nlp.autotransformers import AutoTransformers

autohf = AutoTransformers()

preparedata_setting = {
    "dataset_subdataset_name": "glue:mrpc",
    "pretrained_model_size": "electra-base-discriminator:base",
    "data_root_path": "data/",
    "max_seq_length": 128,
}

autohf.prepare_data(**preparedata_setting)

autohf_settings = {
    "resources_per_trial": {"gpu": 1, "cpu": 1},
    "num_samples": -1,  # unlimited sample size
    "time_budget": 3600,
    "ckpt_per_epoch": 1,
    "fp16": False,
}

validation_metric, analysis = autohf.fit(**autohf_settings)
```
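The `num_samples` and `time_budget` settings jointly bound the search: trials stop once either the sample cap or the wall-clock budget is exhausted, with `-1` meaning no cap on samples. The stdlib-only sketch below illustrates that stopping rule with a mocked-out trial function; `run_trials` and `trial_fn` are illustrative names, not part of the FLAML API.

```python
import time

def run_trials(num_samples, time_budget, trial_fn):
    """Run trials until the sample cap or the wall-clock budget is hit.

    num_samples == -1 means no cap on the number of trials, mirroring
    the "num_samples": -1 setting above (an illustration, not FLAML code).
    """
    results = []
    deadline = time.monotonic() + time_budget
    while num_samples == -1 or len(results) < num_samples:
        if time.monotonic() >= deadline:
            break  # time budget exhausted
        results.append(trial_fn(len(results)))
    return results

# Mock trial: returns its index as a stand-in for a validation score.
scores = run_trials(num_samples=5, time_budget=10.0, trial_fn=lambda i: i)
```

With a real tuner, each trial would fine-tune the model under one sampled configuration and report a validation score instead of its index.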
The use cases currently supported:
1. A simplified workflow for fine-tuning on the GLUE benchmark with Hugging Face transformers;
2. Selecting a better search space for fine-tuning on the GLUE benchmark;
3. Using FLAML's search algorithms for more efficient fine-tuning of Hugging Face models.
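What a search algorithm does here can be sketched with plain random search over a fine-tuning hyperparameter space. FLAML's actual algorithms (e.g. CFO and BlendSearch) are more sample-efficient than this; the search space and the `mock_eval` objective below are illustrative assumptions, not FLAML internals.

```python
import random

# Illustrative fine-tuning search space (an assumption, not FLAML's exact space).
search_space = {
    "learning_rate": [1e-5, 3e-5, 5e-5, 1e-4],
    "per_device_batch_size": [16, 32],
    "num_train_epochs": [2, 3, 4],
}

def mock_eval(config):
    """Stand-in for one fine-tuning trial; returns a fake validation score."""
    penalty = abs(config["learning_rate"] - 3e-5) * 1000  # prefers lr near 3e-5
    return 0.85 - penalty + 0.01 * config["num_train_epochs"]

def random_search(space, num_samples=20, seed=0):
    """Sample configurations uniformly and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(num_samples):
        config = {k: rng.choice(v) for k, v in space.items()}
        score = mock_eval(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

best_config, best_score = random_search(search_space)
```

In AutoTransformers, each trial would launch an actual fine-tuning run on the prepared data, and a cost-aware search algorithm would decide which configuration to try next instead of sampling uniformly.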
Use cases that may be supported in the future:
1. HPO fine-tuning for text generation;
2. HPO fine-tuning for question answering.
								## Troubleshooting fine-tuning HPO for pre-trained language models
To reproduce the results in our ACL 2021 paper:
* [An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models](https://arxiv.org/abs/2106.09204). Xueqing Liu, Chi Wang. ACL-IJCNLP 2021.
```bibtex
@inproceedings{liu2021hpo,
    title={An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models},
    author={Xueqing Liu and Chi Wang},
    year={2021},
    booktitle={ACL-IJCNLP},
}
```
Please refer to the following Jupyter notebook: [Troubleshooting HPO for fine-tuning pre-trained language models](https://github.com/microsoft/FLAML/blob/main/notebook/research/acl2021.ipynb)