89 Commits

Author SHA1 Message Date
Chi Wang
71219df6c6
notebook example (#189)
* config in result

* value can be float

* pytorch notebook example

* docker, pre-commit

* max_failure (#192); early_stop

* extend starting_points (#196)

Co-authored-by: Chi Wang (MSR) <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qw2ky@virginia.edu>
2021-09-10 16:39:16 -07:00
Chi Wang
339eb80f44
variable name (#187) 2021-09-04 20:28:37 -07:00
Chi Wang
e46573a01d
warmstart blendsearch (#186)
* increase test coverage

* use define by run only when needed

* warmstart bs

* classification -> binary, multi

* warm start with evaluated rewards

* data transformer; resource attr for gs

* BlendSearchTuner bug fix and unittest

* bug fix

* docstr and import

* task type
2021-09-04 01:42:21 -07:00
Gian Pio Domiziani
63bba92fd0
Fix decide_split_type bug. (#184)
* Fix decide_split_type bug.
2021-09-02 08:50:22 -07:00
Chi Wang
6ab0730793
remove catboost training dir; ensemble api; blendsearch for hierarchical space; ranking task; forecast improvement (#178)
* remove catboost training dir

* close #48

* bs for hierarchical space. close #85

* retrain for hierarchical space

* clean ml (#180)

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>

* support ranking task

* examples

* cv shuffle

* forecast api and implementation cleaner

* period constraints

* delete groups after fit
2021-09-01 16:25:04 -07:00
Chi Wang
1bc8786dcb
remove big objects after fit (#176)
* remove big objects after fit

* xgboost>1.3.3 has a weird auc socre on:
kr-vs-kp, fold 5, 1h1c

* keep_search_state
2021-08-26 13:45:13 -07:00
Qingyun Wu
a229a6112a
Support parallel and add random search (#167)
* non hashable value out of signature

* parallel trials

* add random in _search_parallel

* fix bug in retraining

* check memory constraint before training

* retrain_full

* log custom metric

* retraining budget check

* sample size check before retrain

* remove 'time2eval' from result

* report 'total_search_time' in result

* rename total_search_time to wall_clock_time

* rename train_loss boolean to log_training_metric

* set default train_loss to None

* exclude oom result

* log retrained model

* no subsample

* doc str

* notebook

* predicted value is NaN for sarimax

* version

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
2021-08-23 16:36:51 -07:00
Kevin Chen
3d0a3d26a2
Forecast (#162)
* added 'forecast' task with estimators ['fbprophet', 'arima', 'sarimax']

* update setup.py

* add TimeSeriesSplit to 'regression' and 'classification' task

* add 'time' split_type for 'classification' and 'regression' task

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* feature importance

* variable name

* Update test/test_split.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update test/test_forecast.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* prophet installation fail in windows

* upload flaml_forecast.ipynb

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2021-08-23 13:26:46 -07:00
すずまる
6270353458
support ROC and AUC for multi-class classification (#170)
* support ROC and AUC for multi-class classification

* add a test case to cover ROC and AUC for multi-class classification
2021-08-22 15:16:10 -07:00
Qingyun Wu
10082b9262
v0.5.12 (#150)
* remove extra comma

* exclusive bound

* log file name

* add cost to space

* dataset_format

* add load_openml_dataset test

* docstr

* revise test format

* simplify restore

* order categories

* openml server exception in test

* process space

* add warning

* log format

* reduce n_cpu

* nested space

* hierarchical search space for CFO

* non hierarchical for bs

* unflatten hierarchical config

* connection error

* random sample

* config signature

* check ray version

* preprocess numpy array

* catboost preprocess

* time budget

* seed, verbose, hpo_method

* test cfocat

* shallow copy in flatten_dict
prevent lgbm model duplication

* match estimator name

* quantize and log

* test qloguniform and qrandint

* test qlograndint

* thread.running

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
2021-08-11 23:02:22 -07:00
Xueqing Liu
eeaf5b5963
space -> main (#148)
* subspace in flow2

* search space and trainable from AutoML

* experimental features: multivariate TPE, grouping, add_evaluated_points

* test experimental features

* readme

* define by run

* set time_budget_s for bs

Co-authored-by: liususan091219 <Xqq630517>

* version

* acl

* test define_by_run_func

* size

* constraints

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-08-02 16:10:26 -07:00
Qingyun Wu
e24265ee5d
automl fit with starting points (#141)
* add starting point in fit

* add estimator best config

* add test

* add doc string

* when there are multiple points_to_evaluate in CFO, use the best one to start local search; after that use low cost partial config as the start point; then, remove the points whose performance is worse than the converged, and start local search from the remaining ones ordered by their performance.

Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-07-31 13:39:31 -07:00
Chi Wang
15fd8adac4
max_leaves (#138)
* max_leaf_nodes in rf and extra_tree

* preprocess numpy str

* free up mem after training
2021-07-27 18:02:49 -07:00
Chi Wang
b3bb00966d
coverage (#135)
* coverage

* readme

* timeout
2021-07-20 17:00:44 -07:00
Chi Wang
072e9e4588
constraint (#132)
* constraint

* ensemble
2021-07-10 09:02:17 -07:00
Qingyun Wu
b04b00dc9d
V0.5.6 (#128)
* recover ConcurrencyLimiter

* cost attribute

* update notebooks

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qiw@microsoft.com>
2021-07-06 08:32:20 -07:00
Qingyun Wu
a291abfab9
Cha cha (#127)
* unordered categorical

* allow cost attribute to be None

* tensorboardX version

* quote

* cfo cat

* trunc

* Update version.py

* incumbent is normalized

* python 3.9

* remove ConcurrencyLimiter

* seed

* estimator

* update autovw notebook

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qiw@microsoft.com>
2021-07-05 18:17:26 -07:00
Chi Wang
183b867856
groups (#107)
* groups

* version

* developer's guide
2021-06-15 18:52:57 -07:00
Chi Wang
f7cf2ea45a
Multiclass (#99)
* utility functions

* stepsize lower bound
2021-06-04 10:31:33 -07:00
Chi Wang
61d1263dfd
log best model (#96)
* log best model
2021-06-02 13:11:41 -07:00
Chi Wang
b206363c9a
metric constraint (#90)
* penalty change

* metric modification

* catboost init
2021-05-22 08:51:38 -07:00
Chi Wang
0925e2b308
constraints (#88)
* pre-training constraints

* metric constraints after training
2021-05-18 15:57:42 -07:00
Gian Pio Domiziani
730fd14ef6
micro/macro f1 metrics added. (#80)
* micro/macro f1 metrics added.

* format lines.
2021-04-26 14:50:41 -04:00
Chi Wang
97a7c114ee
Issue58 (#59)
* iter per learner

* code cleanup
2021-04-08 09:29:55 -07:00
Chi Wang
b7a91e0385
V0.3.0 (#55)
* flaml v0.3

* low cost partial config
2021-04-06 11:37:52 -07:00
Chi Wang
ae5f8e5426
data validation (#45)
* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* prevent divide by zero

* test roberta

* BlendSearchTuner

* delta time

* reindex columns when dropping int-indexed columns

* test drop columns and small training data

* param set for ensemble builder

* fillna on copy

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
2021-03-19 09:50:47 -07:00
Iman Hosseini
0f99526b63
Update automl.py to add verbose argument to fit (#40)
* Update automl.py

* Pass verbose-1 to tune

passing verbose-1 to tune, ensures that for verbose=1, tune is silenced (no INFO prints) and for verbose=2 we see the INFO prints, and for verbose=3 we get DEBUG level at tune, as we want. This is due to: https://github.com/microsoft/FLAML/blob/main/flaml/tune/tune.py#L227
2021-03-17 10:54:17 -07:00
Chi Wang
4a8110c87b
pickle the AutoML object (#37)
* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* Add Gitter badge (#41)

* prevent divide by zero

* test roberta

* BlendSearchTuner

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: The Gitter Badger <badger@gitter.im>
2021-03-16 22:13:35 -07:00
Chi Wang
7bd231e497
v0.2.6 (#32)
* xgboost notebook

* finetuning notebook

* finetuning test

* experimental nni support

* support nested search space

* log file name

* record training_iteration

* eps

* reset times

* std set to default step size if 0
2021-02-28 12:43:43 -08:00
Chi Wang
6ff0ed434b
v0.2.5 (#30)
* test distillbert

* import check

* complete partial config

* None check

* init config is not suggested by bo

* badge

* notebook for lightgbm
2021-02-22 22:10:41 -08:00
Chi Wang (MSR)
da88aa77e3 None check 2021-02-13 10:58:49 -08:00
Chi Wang (MSR)
bd16eeee69 sample_weight; dependency; notebook 2021-02-13 10:43:11 -08:00
Chi Wang (MSR)
f956e957d4 sample_weight when training with full data 2021-02-05 23:12:52 -08:00
Chi Wang (MSR)
833d022cfc sample weight for subsampled data 2021-02-05 23:04:29 -08:00
Chi Wang
776aa55189
V0.2.2 (#19)
* v0.2.2

separate the HPO part into the module flaml.tune
enhanced implementation of FLOW^2, CFO and BlendSearch
support parallel tuning using ray tune
add support for sample_weight and generic fit arguments
enable mlflow logging

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: qingyun-wu <qw2ky@virginia.edu>
2021-02-05 21:41:14 -08:00
Chi Wang
cb5ce4e3a6
v0.1.3 Set default logging level to INFO (#14)
* set default logging level to INFO

* remove unnecessary import

* API future compatibility

* add test for customized learner

* test dependency

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
2020-12-15 08:10:43 -08:00
Eric Zhu
0fb3e04fc3
API docs #6 (#13) and update version to 0.1.2 2020-12-15 00:57:30 -08:00
Eric Zhu
4ce908f42e
Fix #11; add tests for training log and python logger (#12) 2020-12-14 23:10:03 -08:00
Chi Wang (MSR)
492990655d v0.1.0 2020-12-04 09:40:27 -08:00