56 Commits

Author SHA1 Message Date
Chi Wang
0b25e89f29
reproducibility for random sampling (#349)
* reproducibility for random sampling #236

* doc update
2021-12-22 12:12:25 -08:00
Chi Wang
434586e2e2
train at least one iter when not trained (#336)
* train at least one iter when not trained

* bump version to 0.9.1
2021-12-12 20:05:18 -08:00
Qingyun Wu
17b17d084f
tune api for schedulers (#322)
* revise api and tests

* rename prune_attr

* update finetune notebook

* add scheduler test and notebook

* update tune api for scheduler

* remove scheduler notebook

* Update flaml/tune/tune.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* docstr

* fix imports

* clear notebook output

* fix ray import

* Update flaml/tune/tune.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* improve docstr

* Update flaml/searcher/blendsearch.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* remove redundant import

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-04 21:52:20 -05:00
Chi Wang
c57954fbbd
include default value in rf search space (#317)
* include default value in rf search space

* init _mem_per_iter with -1

* bump version to 0.8.2

* docstr for search space's arguments
2021-12-03 09:15:21 -08:00
Chi Wang
1545d5a6d2
skip cv preparation if eval_method is holdout (#314)
* skip cv preparation if eval_method is holdout

* bump version to 0.8.1
2021-11-28 11:18:55 -08:00
Chi Wang
72caa2172d
model_history, ITER_HP, settings in AutoML(), checkpoint bug fix (#283)
if save_best_model_per_estimator is False and retrain_final is True, unfit the model after evaluation in HPO.
retrain if using ray.
update ITER_HP in config after a trial is finished.
change prophet logging level.
example and notebook update.
allow settings to be passed to the AutoML constructor. For #192 ("Are you planning to add multi-output-regression capability to FLAML") and #277 ("Is multi-tasking allowed?"), the AutoML settings can now be passed to the constructor instead of requiring a derived class.
remove model_history.
checkpoint bug fix.

* model_history meaning save_best_model_per_estimator

* ITER_HP

* example update

* prophet logging level

* comment update in forecast notebook

* print format improvement

* allow settings to be passed to AutoML constructor

* checkpoint bug fix

* time limit for autohf regression test

* skip slow test on macos

* cleanup before del
2021-11-18 09:39:45 -08:00
Chi Wang
59083fbdcb
example update (#281)
* example update

* bump version to 0.7.2

* notebook update
2021-11-12 22:29:33 -08:00
Chi Wang
5b68f556dc bump version to 0.7.1 2021-11-07 08:05:13 -08:00
Kevin Chen
519bfc2a18
Integrate multivariate time series forecasting (#254)
* Integrate multivariate time series forecasting, now supports
continuous and categorical variables

- update data.py to transform time series data
- update search space
- update documentations to reflect changes
- update test_forecast.py
- rename 'forecast' task to 'ts_forecast' task

* update automl.py and test_forecast.py

* update forecast notebook

* update README.md and setup.py

* update ml.py and test_forecast.py

- make "ds" and "y" constant variables

* replace constants with constant variables

* bump version to 0.7.0

* update setup.py
- support 'forecast' and 'ts_forecast'

* update automl.py and data.py
- support 'forecast' and 'ts_forecast' tasks
2021-10-30 09:48:57 -07:00
Chi Wang
46cfb76863 bump version to 0.6.9 2021-10-18 21:56:21 -07:00
Chi Wang
fe65fa143d
v0.6.8 (#247) 2021-10-12 15:08:40 -07:00
Chi Wang
ddc1a63a76
Package (#244)
* build and upload pypi package

* pandas in dependency
2021-10-10 22:57:22 -07:00
Chi Wang
a99e939404
update config if n_estimators is modified (#225)
* update config if n_estimators is modified

* prediction as int

* handle the case n_estimators <= 0

* if trained and no budget to train more, return the trained model

* split_type=group for classification & regression
2021-09-27 21:30:49 -07:00
Chi Wang
16a97bec76
set converge flag when no trial can be sampled (#217)
* set converge flag when no trial can be sampled

* require custom_metric to return dict for logging
close #218

* estimate time budget needed

* log info per iteration
2021-09-23 10:49:02 -07:00
Chi Wang
f4529dfe89
package name in setup (#198)
* package name

* learning to rank example: close #200

* try import prophet #201
2021-09-11 21:19:18 -07:00
Chi Wang
71219df6c6
notebook example (#189)
* config in result

* value can be float

* pytorch notebook example

* docker, pre-commit

* max_failure (#192); early_stop

* extend starting_points (#196)

Co-authored-by: Chi Wang (MSR) <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qw2ky@virginia.edu>
2021-09-10 16:39:16 -07:00
Chi Wang
339eb80f44
variable name (#187) 2021-09-04 20:28:37 -07:00
Chi Wang
1bc8786dcb
remove big objects after fit (#176)
* remove big objects after fit

* xgboost>1.3.3 has a weird auc score on:
kr-vs-kp, fold 5, 1h1c

* keep_search_state
2021-08-26 13:45:13 -07:00
Qingyun Wu
a229a6112a
Support parallel and add random search (#167)
* non hashable value out of signature

* parallel trials

* add random in _search_parallel

* fix bug in retraining

* check memory constraint before training

* retrain_full

* log custom metric

* retraining budget check

* sample size check before retrain

* remove 'time2eval' from result

* report 'total_search_time' in result

* rename total_search_time to wall_clock_time

* rename train_loss boolean to log_training_metric

* set default train_loss to None

* exclude oom result

* log retrained model

* no subsample

* doc str

* notebook

* predicted value is NaN for sarimax

* version

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
2021-08-23 16:36:51 -07:00
Kevin Chen
3d0a3d26a2
Forecast (#162)
* added 'forecast' task with estimators ['fbprophet', 'arima', 'sarimax']

* update setup.py

* add TimeSeriesSplit to 'regression' and 'classification' task

* add 'time' split_type for 'classification' and 'regression' task

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* feature importance

* variable name

* Update test/test_split.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update test/test_forecast.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* prophet installation fail in windows

* upload flaml_forecast.ipynb

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2021-08-23 13:26:46 -07:00
Qingyun Wu
10082b9262
v0.5.12 (#150)
* remove extra comma

* exclusive bound

* log file name

* add cost to space

* dataset_format

* add load_openml_dataset test

* docstr

* revise test format

* simplify restore

* order categories

* openml server exception in test

* process space

* add warning

* log format

* reduce n_cpu

* nested space

* hierarchical search space for CFO

* non hierarchical for bs

* unflatten hierarchical config

* connection error

* random sample

* config signature

* check ray version

* preprocess numpy array

* catboost preprocess

* time budget

* seed, verbose, hpo_method

* test cfocat

* shallow copy in flatten_dict
prevent lgbm model duplication

* match estimator name

* quantize and log

* test qloguniform and qrandint

* test qlograndint

* thread.running

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
2021-08-11 23:02:22 -07:00
Qingyun Wu
e24265ee5d
automl fit with starting points (#141)
* add starting point in fit

* add estimator best config

* add test

* add doc string

* when there are multiple points_to_evaluate in CFO, use the best one to start local search; after that, use the low-cost partial config as the start point; then remove the points whose performance is worse than the converged result, and start local search from the remaining ones in order of their performance.

Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-07-31 13:39:31 -07:00
Chi Wang
15fd8adac4
max_leaves (#138)
* max_leaf_nodes in rf and extra_tree

* preprocess numpy str

* free up mem after training
2021-07-27 18:02:49 -07:00
Qingyun Wu
58c0ec959d
Update readme for flaml.tune (#137)
* add time_budget_s for bs in readme

* version update

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-07-24 17:10:43 -07:00
Chi Wang
95aa719b01
version (#136) 2021-07-20 17:45:02 -07:00
Chi Wang
072e9e4588
constraint (#132)
* constraint

* ensemble
2021-07-10 09:02:17 -07:00
Qingyun Wu
b04b00dc9d
V0.5.6 (#128)
* recover ConcurrencyLimiter

* cost attribute

* update notebooks

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qiw@microsoft.com>
2021-07-06 08:32:20 -07:00
Chi Wang
2dbf38da0a
discount running thread (#121)
* discount running thread

* version

* limit dir

* report result

* catch

* remove handler
2021-06-25 14:24:46 -07:00
Chi Wang
3a2b6cdddc
Update version.py (#111) 2021-06-18 10:59:00 -07:00
Chi Wang
183b867856
groups (#107)
* groups

* version

* developer's guide
2021-06-15 18:52:57 -07:00
Qingyun Wu
e031c2eb7d
Test restore (#103)
* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* prevent divide by zero

* test roberta

* BlendSearchTuner

* sync

* version number

* update gitignore

* delta time

* reindex columns when dropping int-indexed columns

* add seed

* add seed in Args

* merge

* stabilize SearchThread speed

* add seed

* fix import

* use except

* add restore test for CFO

* remove test_restore

* remove inspect

* remove print

* change to SearchThread._eps

* add _eps lower bound

* _eps in SearchThread

* add test_restore

* 1<<32

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qiw@microsoft.com>
2021-06-07 19:49:45 -04:00
Chi Wang
682cb49654
move import position (#102)
* move import position
2021-06-05 11:36:26 -07:00
Qingyun Wu
0d3a0bfab6
Add ChaCha (#92)
* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* prevent divide by zero

* test roberta

* BlendSearchTuner

* sync

* version number

* update gitignore

* delta time

* reindex columns when dropping int-indexed columns

* add seed

* add seed in Args

* merge

* init upload of ChaCha

* remove redundancy

* add back catboost

* improve AutoVW API

* set min_resource_lease in VWOnlineTrial

* docstr

* rename

* docstr

* add docstr

* improve API and documentation

* fix name

* docstr

* naming

* remove max_resource in scheduler

* add TODO in flow2

* remove redundancy in searcher

* add input type

* adapt code from ray.tune

* move files

* naming

* documentation

* fix import error

* fix format issues

* remove cb in worse than test

* improve _generate_all_comb

* remove ray tune

* naming

* VowpalWabbitTrial

* import error

* import error

* merge test code

* scheduler import

* fix import

* remove

* import, minor bug and version

* Float or Categorical

* fix default

* add test_autovw.py

* add vowpalwabbit and openml

* lint

* reorg

* lint

* indent

* add autovw notebook

* update notebook

* update log msg and autovw notebook

* update autovw notebook

* update autovw notebook

* add available strings for model_select_policy

* string for metric

* Update vw format in flaml/onlineml/trial.py

Co-authored-by: olgavrou <olgavrou@gmail.com>

* make init_config optional

* add _setup_trial_runner and update notebook

* space

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qiw@microsoft.com>
Co-authored-by: olgavrou <olgavrou@gmail.com>
2021-06-02 22:08:24 -04:00
Chi Wang
61d1263dfd
log best model (#96)
* log best model
2021-06-02 13:11:41 -07:00
Gian Pio Domiziani
c4c15f533f
datetime feature engineering added. (#89)
* datetime feature engineering added.

* move the check for datetime columns to after the drop check. Check that the new columns do not already exist.

* check the drop condition before adding new_column. In transform, check directly whether the new columns are present in num_column.

* check if new_column is in X.columns.

* fixed lint issue. update version to 0.4.1.
2021-05-25 08:30:08 -07:00
Chi Wang
0925e2b308
constraints (#88)
* pre-training constraints

* metric constraints after training
2021-05-18 15:57:42 -07:00
Chi Wang
0b23c3a028
stepsize (#86)
* decrease step size in suggest

* initialization of the counters

* increase step size

* init phase

* check converge in suggest
2021-05-06 21:29:38 -07:00
Chi Wang
e5123f5595
V0.3.5 (#84)
* choose one direction
2021-04-30 17:19:41 -07:00
Gian Pio Domiziani
068fb9f5c2
X.copy() in the process method (#78)
* X.copy() in the transformer method.

* update version 0.3.4
2021-04-23 17:14:29 -07:00
Chi Wang
b6f57894ef
v0.3.3 (#74) 2021-04-21 11:48:12 -07:00
Chi Wang
d08bb15475
Update version.py (#70)
* Update version.py

* update logo
2021-04-20 16:57:35 -07:00
Qingyun Wu
06045703bf
Lgbm w customized obj (#64)
* add customized lgbm learner

* add comments

* fix format issue

* format

* OpenMLError

* add test

* add notebook

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-04-10 21:14:28 -04:00
Chi Wang
b7a91e0385
V0.3.0 (#55)
* flaml v0.3

* low cost partial config
2021-04-06 11:37:52 -07:00
Chi Wang
f28d093522
v0.2.10 (#51)
* increase search space

* None check
2021-03-28 17:54:25 -07:00
Chi Wang
4a8110c87b
pickle the AutoML object (#37)
* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* Add Gitter badge (#41)

* prevent divide by zero

* test roberta

* BlendSearchTuner

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: The Gitter Badger <badger@gitter.im>
2021-03-16 22:13:35 -07:00
liuzhe-lz
840e3fc104
Fix bug in NNI tuner (#34)
* fix bug in nni tuner

* Update version.py

Co-authored-by: liuzhe <zhe.liu@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-03-06 10:38:33 -08:00
Chi Wang
1560a6e52a
V0.2.7 (#35)
* bug fix

* admissible region

* use CFO's init point as backup

* step lower bound

* test electra
2021-03-05 23:39:14 -08:00
Chi Wang
7bd231e497
v0.2.6 (#32)
* xgboost notebook

* finetuning notebook

* finetuning test

* experimental nni support

* support nested search space

* log file name

* record training_iteration

* eps

* reset times

* std set to default step size if 0
2021-02-28 12:43:43 -08:00
Chi Wang
6ff0ed434b
v0.2.5 (#30)
* test distilbert

* import check

* complete partial config

* None check

* init config is not suggested by bo

* badge

* notebook for lightgbm
2021-02-22 22:10:41 -08:00
Chi Wang (MSR)
bd16eeee69 sample_weight; dependency; notebook 2021-02-13 10:43:11 -08:00