mirror of https://github.com/microsoft/autogen.git synced 2025-11-01 10:19:46 +00:00

Go to file

* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* prevent divide by zero

* test roberta

* BlendSearchTuner

* sync

* version number

* update gitignore

* delta time

* reindex columns when dropping int-indexed columns

* add seed

* add seed in Args

* merge

* init upload of ChaCha

* remove redundancy

* add back catboost

* improve AutoVW API

* set min_resource_lease in VWOnlineTrial

* docstr

* rename

* docstr

* add docstr

* improve API and documentation

* fix name

* docstr

* naming

* remove max_resource in scheduler

* add TODO in flow2

* remove redundancy in rearcher

* add input type

* adapt code from ray.tune

* move files

* naming

* documentation

* fix import error

* fix format issues

* remove cb in worse than test

* improve _generate_all_comb

* remove ray tune

* naming

* VowpalWabbitTrial

* import error

* import error

* merge test code

* scheduler import

* fix import

* remove

* import, minor bug and version

* Float or Categorical

* fix default

* add test_autovw.py

* add vowpalwabbit and openml

* lint

* reorg

* lint

* indent

* add autovw notebook

* update notebook

* update log msg and autovw notebook

* update autovw notebook

* update autovw notebook

* add available strings for model_select_policy

* string for metric

* Update vw format in flaml/onlineml/trial.py

Co-authored-by: olgavrou <olgavrou@gmail.com>

* make init_config optional

* add _setup_trial_runner and update notebook

* space

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qiw@microsoft.com>
Co-authored-by: olgavrou <olgavrou@gmail.com>

2021-06-02 22:08:24 -04:00

.github/workflows

code coverage (#79 )

2021-04-26 20:04:57 -04:00

docs

Update version.py (#70 )

2021-04-20 16:57:35 -07:00

flaml

Add ChaCha (#92 )

2021-06-02 22:08:24 -04:00

notebook

Add ChaCha (#92 )

2021-06-02 22:08:24 -04:00

test

Add ChaCha (#92 )

2021-06-02 22:08:24 -04:00

.coveragerc

code coverage (#79 )

2021-04-26 20:04:57 -04:00

.flake8

v0.1.0

2020-12-04 09:40:27 -08:00

.gitignore

pickle the AutoML object (#37 )

2021-03-16 22:13:35 -07:00

CODE_OF_CONDUCT.md

v0.1.0

2020-12-04 09:40:27 -08:00

LICENSE

add NOTICE file (#91 )

2021-05-24 14:35:08 -04:00

NOTICE.md

Add ChaCha (#92 )

2021-06-02 22:08:24 -04:00

README.md

Add ChaCha (#92 )

2021-06-02 22:08:24 -04:00

SECURITY.md

v0.1.0

2020-12-04 09:40:27 -08:00

setup.py

Add ChaCha (#92 )

2021-06-02 22:08:24 -04:00

README.md

FLAML - Fast and Lightweight AutoML

FLAML is a lightweight Python library that finds accurate machine learning models automatically, efficiently and economically. It frees users from selecting learners and hyperparameters for each learner. It is fast and economical. The simple and lightweight design makes it easy to extend, such as adding customized learners or metrics. FLAML is powered by a new, cost-effective hyperparameter optimization and learner selection method invented by Microsoft Research. FLAML leverages the structure of the search space to choose a search order optimized for both cost and error. For example, the system tends to propose cheap configurations at the beginning stage of the search, but quickly moves to configurations with high model complexity and large sample size when needed in the later stage of the search. For another example, it favors cheap learners in the beginning but penalizes them later if the error improvement is slow. The cost-bounded search and cost-based prioritization make a big difference in the search efficiency under budget constraints.

Installation

FLAML requires Python version >= 3.6. It can be installed from pip:

pip install flaml

To run the notebook example, install flaml with the [notebook] option:

pip install flaml[notebook]

Quickstart

With three lines of code, you can start using this economical and fast AutoML engine as a scikit-learn style estimator.

from flaml import AutoML
automl = AutoML()
automl.fit(X_train, y_train, task="classification")

You can restrict the learners and use FLAML as a fast hyperparameter tuning tool for XGBoost, LightGBM, Random Forest etc. or a customized learner.

automl.fit(X_train, y_train, task="classification", estimator_list=["lgbm"])

You can also run generic ray-tune style hyperparameter tuning for a custom function.

from flaml import tune
tune.run(train_with_config, config={…}, low_cost_partial_config={…}, time_budget_s=3600)

Advantages

For classification and regression tasks, find quality models with lower computational resources.
Users can choose their desired customizability: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space and metric), full customization (arbitrary training and evaluation code).
Allow human guidance in hyperparameter tuning to respect prior on certain subspaces but also able to explore other subspaces.

Examples

A basic classification example.

from flaml import AutoML
from sklearn.datasets import load_iris
# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": 'accuracy',
    "task": 'classification',
    "log_file_name": "test/iris.log",
}
X_train, y_train = load_iris(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
                        **automl_settings)
# Predict
print(automl.predict_proba(X_train))
# Export the best model
print(automl.model)

A basic regression example.

from flaml import AutoML
from sklearn.datasets import load_boston
# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": 'r2',
    "task": 'regression',
    "log_file_name": "test/boston.log",
}
X_train, y_train = load_boston(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
                        **automl_settings)
# Predict
print(automl.predict(X_train))
# Export the best model
print(automl.model)

More examples can be found in notebooks.

Documentation

Please find the API documentation here.

Please find demo and tutorials of FLAML here

Read more about the hyperparameter optimization methods in FLAML here. They can be used beyond the AutoML context. And they can be used in distributed HPO frameworks such as ray tune or nni.

For more technical details, please check our papers.

FLAML: A Fast and Lightweight AutoML Library. Chi Wang, Qingyun Wu, Markus Weimer, Erkang Zhu. MLSys, 2021.

@inproceedings{wang2021flaml,
    title={FLAML: A Fast and Lightweight AutoML Library},
    author={Chi Wang and Qingyun Wu and Markus Weimer and Erkang Zhu},
    year={2021},
    booktitle={MLSys},
}

Frugal Optimization for Cost-related Hyperparameters. Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.
Economical Hyperparameter Optimization With Blended Search Strategy. Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.
ChaCha for online AutoML. Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. To appear in ICML 2021.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

If you are new to GitHub here is a detailed help source on getting involved with development on GitHub.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Developing

Setup:

git clone https://github.com/microsoft/FLAML.git
pip install -e .[test,notebook]

Coverage

Any code you commit should generally not significantly impact coverage. To run all unit tests:

coverage run -m pytest test

If all the tests are passed, please also test run notebook/flaml_automl to make sure your commit does not break the notebook example.

Authors

Chi Wang
Qingyun Wu

Contributors (alphabetical order): Sebastien Bubeck, Surajit Chaudhuri, Nadiia Chepurko, Ofer Dekel, Alex Deng, Anshuman Dutt, Nicolo Fusi, Jianfeng Gao, Johannes Gehrke, Silu Huang, Dongwoo Kim, Christian Konig, John Langford, Amin Saied, Neil Tenenholtz, Markus Weimer, Haozhe Zhang, Erkang Zhu.

License

MIT License

Languages

Python 61.5%

C# 25.1%

TypeScript 12.6%

HTML 0.3%

JavaScript 0.2%

Other 0.2%