# Hyperparameter Optimization for Huggingface Transformers
AutoTransformers is an AutoML class for fine-tuning pre-trained language models, built on the transformers library.
An example of using AutoTransformers:

```python
from flaml.nlp.autotransformers import AutoTransformers

autohf = AutoTransformers()

# Load the MRPC task from the GLUE benchmark and tokenize it
# for the base-sized Electra discriminator.
preparedata_setting = {
    "dataset_subdataset_name": "glue:mrpc",
    "pretrained_model_size": "electra-base-discriminator:base",
    "data_root_path": "data/",
    "max_seq_length": 128,
}
autohf.prepare_data(**preparedata_setting)

# Run the hyperparameter search: one GPU and one CPU per trial,
# a one-hour time budget, one checkpoint per epoch, and
# full-precision training.
autohf_settings = {
    "resources_per_trial": {"gpu": 1, "cpu": 1},
    "num_samples": -1,  # unlimited number of trials (bounded by the time budget)
    "time_budget": 3600,  # in seconds
    "ckpt_per_epoch": 1,
    "fp16": False,
}
validation_metric, analysis = autohf.fit(**autohf_settings)
```
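After `fit` returns, the validation score of the best configuration found within the budget is in `validation_metric`, and the tuning run itself in `analysis`. Below is a minimal sketch of generating test-set predictions afterwards; it assumes `AutoTransformers` exposes a `predict()` method that loads the best checkpoint found during the search, so check the `flaml.nlp` API for the exact signature before relying on it.

```python
# A sketch, assuming autohf.predict() reuses the best checkpoint
# found by fit(); verify against the flaml.nlp API.
predictions, test_metric = autohf.predict()

print(validation_metric)  # best validation score found within the budget
```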
The currently supported use cases are:

- A simplified version of fine-tuning on the GLUE benchmark with HuggingFace;
- Selecting a better search space for fine-tuning on the GLUE benchmark;
- Using the search algorithms in flaml for more efficient fine-tuning of HuggingFace models (see the sketch after this list).
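A minimal sketch of selecting a search algorithm through the `fit` settings. The `algo_name` and `space_mode` keys here are assumptions about the `AutoTransformers.fit` interface and may differ across versions; consult its docstring for the actual parameter names and accepted values.

```python
# Hypothetical settings keys (algo_name / space_mode are assumptions;
# check the AutoTransformers.fit docstring for the real interface).
autohf_settings = {
    "resources_per_trial": {"gpu": 1, "cpu": 1},
    "num_samples": -1,
    "time_budget": 3600,
    "ckpt_per_epoch": 1,
    "fp16": False,
    "algo_name": "bs",    # BlendSearch, flaml's economical HPO algorithm
    "space_mode": "uni",  # a unified search space shared across models
}
validation_metric, analysis = autohf.fit(**autohf_settings)
```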
Use cases planned for the future:

- HPO fine-tuning for text generation;
- HPO fine-tuning for question answering.
## Troubleshooting fine-tuning HPO for pre-trained language models
To reproduce the results in our ACL 2021 paper:

- Xueqing Liu, Chi Wang. An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models. ACL-IJCNLP 2021.
```bibtex
@inproceedings{liu2021hpo,
    title={An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models},
    author={Xueqing Liu and Chi Wang},
    year={2021},
    booktitle={ACL-IJCNLP},
}
```
Please refer to the following Jupyter notebook: Troubleshooting HPO for fine-tuning pre-trained language models.