288 Commits

Author SHA1 Message Date
Z.sk
7b24662dca
Allow the evaluation_function to receive the incumbent best result as input in Tune (#339)
* update tune function

* pass incumbent result to the training function

* Update test/tune/test_record_incumbent.py

* Update flaml/searcher/search_thread.py

* Update flaml/searcher/blendsearch.py

* Update flaml/tune/tune.py

* add constant variable

Co-authored-by: 张少坤 <zhangshaokun@fuzhi.ai>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-15 21:12:47 -08:00
Qingyun Wu
17b17d084f
tune api for schedulers (#322)
* revise api and tests

* rename prune_attr

* update finetune notebook

* add scheduler test and notebook

* update tune api for scheduler

* remove scheduler notebook

* Update flaml/tune/tune.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* docstr

* fix imports

* clear notebook output

* fix ray import

* Update flaml/tune/tune.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* improve docstr

* Update flaml/searcher/blendsearch.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* remove redundant import

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-04 21:52:20 -05:00
Chi Wang
7d269435ae add save_best_config() 2021-12-04 16:29:52 -08:00
Chi Wang
18230ed22f
pred_time_limit clarification and logging (#319)
* pred_time_limit clarification

* log prediction time

* handle ChunkedEncodingError in test
2021-12-03 16:02:00 -08:00
Xueqing Liu
fb59bb9928
adding TODOs for NLP module, so students can implement other tasks more easily (#321)
* fixing ray pickle bug, skipping macosx bug, completing code for seqregression

* catching connectionerror

* adding TODOs for NLP module
2021-12-03 12:45:16 -05:00
Chi Wang
c57954fbbd
include default value in rf search space (#317)
* include default value in rf search space

* init _mem_per_iter with -1

* bump version to 0.8.2

* docstr for search space's arguments
2021-12-03 09:15:21 -08:00
Michal Chromcak
b0ef3b7995
Add conda forge minimal test (#309)
* add conda forge minimal test, create pytest markers
2021-11-25 13:58:20 -08:00
晓宇
5dc948da18
Update test_regression.py (#306)
* Update test_regression.py

There is another way to train a multi-output model.
RegressorChain is better suited to targets that are related to each other.
2021-11-25 08:18:22 -08:00
Xueqing Liu
fd136b02d1
bug fix for TransformerEstimator (#293)
* fix checkpoint naming + trial id for non-ray mode, fix the bug in running test mode, delete all the checkpoints in non-ray mode

* finished testing for checkpoint naming, delete checkpoint, ray, max iter = 1

* adding predict_proba, address PR 293's comments

close #293 #291
2021-11-23 11:26:39 -08:00
Chi Wang
85e21864ce
test -> val; docstr (#300)
* rename test -> val in custom metric function
* add an example in docstr
resolve #299
2021-11-22 22:17:29 -08:00
Chi Wang
d937b03e42
multioutput regression (#292)
* make AutoML inherit sklearn.base.BaseEstimator such that it can be wrapped in sklearn.multioutput.MultiOutputRegressor for multi-output regression.

* moved and simplified preprocessing code in AutoML.predict() to _preprocess()
2021-11-22 06:59:42 -08:00
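
The wrapping pattern described in #292 can be sketched as follows. This is a minimal illustration, not FLAML code: scikit-learn's LinearRegression is used as a stand-in for AutoML (an assumption made to keep the example small and fast), since MultiOutputRegressor accepts any estimator that follows the sklearn.base.BaseEstimator conventions. The toy data is made up for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import MultiOutputRegressor

# Toy data with two targets per sample (made up for illustration).
X = np.arange(10, dtype=float).reshape(-1, 1)
Y = np.column_stack([2.0 * X[:, 0], 1.0 - X[:, 0]])

# LinearRegression is a stand-in for any BaseEstimator-compatible model.
# MultiOutputRegressor clones it and fits one copy per target column.
model = MultiOutputRegressor(LinearRegression())
model.fit(X, Y)

# One prediction column per target.
pred = model.predict(np.array([[10.0], [11.0]]))
```

After fitting, model.estimators_ holds one fitted clone per target column, which is why the wrapped estimator has to support cloning through get_params() and set_params().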
Chi Wang
00da79a90b
empty search space (#295)
fix the error when an empty dictionary is passed to BlendSearch as the search space.
2021-11-20 20:05:28 -08:00
Qingyun Wu
49f9e9f86b
add warmstart test (#298)
* add warmstart test

* remove redundancy

* add more types of hps

* revise comments

* simplify name

* reduce redundancy
2021-11-20 20:23:54 -05:00
Chi Wang
72caa2172d
model_history, ITER_HP, settings in AutoML(), checkpoint bug fix (#283)
if save_best_model_per_estimator is False and retrain_final is True, unfit the model after evaluation in HPO.
retrain if using ray.
update ITER_HP in config after a trial is finished.
change prophet logging level.
example and notebook update.
allow settings to be passed to AutoML constructor. #192 #277 can pass the automl setting to the constructor instead of requiring a derived class.
remove model_history.
checkpoint bug fix.

* model_history meaning save_best_model_per_estimator

* ITER_HP

* example update

* prophet logging level

* comment update in forecast notebook

* print format improvement

* allow settings to be passed to AutoML constructor

* checkpoint bug fix

* time limit for autohf regression test

* skip slow test on macos

* cleanup before del
2021-11-18 09:39:45 -08:00
Qingyun Wu
e9551de3cc add best_loss_per_estimator 2021-11-17 22:43:20 -08:00
Xueqing Liu
42de3075e9
Make NLP tasks available from AutoML.fit() (#210)
Sequence classification and regression: "seq-classification" and "seq-regression"

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-11-16 11:06:20 -08:00
Chi Wang
92ebd1f7f9
when max_iter=1, skip search only if retrain_final (#280)
* when max_iter=1, skip search only if retrain_final

* remove nlp
redesign in #210

* minor change in readme example
2021-11-09 21:51:23 -08:00
Chi Wang
549a0dfb53
limit time and memory consumption (#264)
* limit time and memory

* separate tests

* lrl1 can't be limited by limit_resource

* free memory when possible

* passthrough=False when ensemble fails;
retrain when trained_estimator is None

* use callback for resource limit

* handle lower version of xgb with no callback

* free mem ratio

* reduce verbosity

* retrain_final when max_iter==1

* remove trained_estimator from result

* model_history

* wheel

* retrain time as best_config_train_time

* ci: libomp version for xgboost on macos

* limit_resource not working in windows

* test pickle load

* mute forecaster

* notebook update

* check hard

* preventive callback

* add use_ray
2021-11-03 19:08:23 -07:00
Kevin Chen
519bfc2a18
Integrate multivariate time series forecasting (#254)
* Integrate multivariate time series forecasting, now supports
continuous and categorical variables

- update data.py to transform time series data
- update search space
- update documentations to reflect changes
- update test_forecast.py
- rename 'forecast' task to 'ts_forecast' task

* update automl.py and test_forecast.py

* update forecast notebook

* update README.md and setup.py

* update ml.py and test_forecast.py

- make "ds" and "y" constant variables

* replace constants with constant variables

* bump version to 0.7.0

* update setup.py
- support 'forecast' and 'ts_forecast'

* update automl.py and data.py
- support 'forecast' and 'ts_forecast' tasks
2021-10-30 09:48:57 -07:00
Antoni Baum
e0155c2339
Fix exception in CFO's _create_condition if all candidate start points didn't return yet (#263)
* Fix exception if first trial returns None

* Add test
2021-10-29 11:44:16 -07:00
Chi Wang
7809ec15ac catch import error 2021-10-19 11:52:41 -07:00
Chi Wang
29fac8807b fix bug in subspace identification 2021-10-19 11:52:41 -07:00
Chi Wang
7d6e860102 n_estimators for catboost 2021-10-18 21:56:21 -07:00
Chi Wang
9e9356f436 time budget in state 2021-10-18 21:56:21 -07:00
Chi Wang
b2d8b097d7 check n_iter == 1 2021-10-18 21:56:21 -07:00
Chi Wang
46b29e05c7 .params 2021-10-18 21:56:21 -07:00
Chi Wang
b03a87e737 no search when max_iter < 2 2021-10-18 21:56:21 -07:00
Chi Wang
524f22bcc5
fix bug in hierarchical search space (#248); optional dependency on lgbm and xgb (#250)
* close #249

* admissible region

* best_config can be None

* optional dependency on lgbm and xgb
resolve #252
2021-10-15 21:36:42 -07:00
Chi Wang
fe65fa143d
v0.6.8 (#247) 2021-10-12 15:08:40 -07:00
Chi Wang
ddc1a63a76
Package (#244)
* build and upload pypi package

* pandas in dependency
2021-10-10 22:57:22 -07:00
Christoph Deil
948f688742
Consistent California (#245) 2021-10-09 07:52:07 -07:00
Chi Wang
f48ca2618f
warning -> info for low cost partial config (#231)
* warning -> info for low cost partial config
#195, #110

* when n_estimators < 0, use trained_estimator's

* log debug info

* test random seed

* remove "objective"; avoid ZeroDivisionError

* hp config to estimator params

* check type of searcher

* default n_jobs

* try import

* Update searchalgo_auto.py

* CLASSIFICATION

* auto_augment flag

* min_sample_size

* make catboost optional
2021-10-08 16:09:43 -07:00
Chi Wang
a99e939404
update config if n_estimators is modified (#225)
* update config if n_estimators is modified

* prediction as int

* handle the case n_estimators <= 0

* if trained and no budget to train more, return the trained model

* split_type=group for classification & regression
2021-09-27 21:30:49 -07:00
Qingyun Wu
b1115d5347
add consistency test (#216)
* add consistency test

* test_consistency and format

* add results attribute

* skip when ray is not installed

* Update flaml/tune/analysis.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-09-19 20:44:25 -04:00
Chi Wang
f3e50136e8
random search (#213)
* random search as a child class of CFO

* random search in sequential search of AutoML

* time to find best model as a property of AutoML
2021-09-19 11:19:23 -07:00
Chi Wang
0ba58e0ace
accommodate nni usage pattern (#209) 2021-09-14 23:16:28 -07:00
Chi Wang
a9d39b71da
consider num_samples in bs thread priority (#207)
* consider num_samples in bs thread priority

* continue search for bs
2021-09-14 18:36:10 -07:00
Chi Wang
f4529dfe89
package name in setup (#198)
* package name

* learning to rank example: close #200

* try import prophet #201
2021-09-11 21:19:18 -07:00
Chi Wang
71219df6c6
notebook example (#189)
* config in result

* value can be float

* pytorch notebook example

* docker, pre-commit

* max_failure (#192); early_stop

* extend starting_points (#196)

Co-authored-by: Chi Wang (MSR) <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qw2ky@virginia.edu>
2021-09-10 16:39:16 -07:00
Chi Wang
e46573a01d
warmstart blendsearch (#186)
* increase test coverage

* use define by run only when needed

* warmstart bs

* classification -> binary, multi

* warm start with evaluated rewards

* data transformer; resource attr for gs

* BlendSearchTuner bug fix and unittest

* bug fix

* docstr and import

* task type
2021-09-04 01:42:21 -07:00
Gian Pio Domiziani
63bba92fd0
Fix decide_split_type bug. (#184)
* Fix decide_split_type bug.
2021-09-02 08:50:22 -07:00
Chi Wang
6ab0730793
remove catboost training dir; ensemble api; blendsearch for hierarchical space; ranking task; forecast improvement (#178)
* remove catboost training dir

* close #48

* bs for hierarchical space. close #85

* retrain for hierarchical space

* clean ml (#180)

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>

* support ranking task

* examples

* cv shuffle

* forecast api and implementation cleaner

* period constraints

* delete groups after fit
2021-09-01 16:25:04 -07:00
Chi Wang
1bc8786dcb
remove big objects after fit (#176)
* remove big objects after fit

* xgboost>1.3.3 has a weird auc score on:
kr-vs-kp, fold 5, 1h1c

* keep_search_state
2021-08-26 13:45:13 -07:00
Qingyun Wu
a229a6112a
Support parallel and add random search (#167)
* non hashable value out of signature

* parallel trials

* add random in _search_parallel

* fix bug in retraining

* check memory constraint before training

* retrain_full

* log custom metric

* retraining budget check

* sample size check before retrain

* remove 'time2eval' from result

* report 'total_search_time' in result

* rename total_search_time to wall_clock_time

* rename train_loss boolean to log_training_metric

* set default train_loss to None

* exclude oom result

* log retrained model

* no subsample

* doc str

* notebook

* predicted value is NaN for sarimax

* version

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
2021-08-23 16:36:51 -07:00
Kevin Chen
3d0a3d26a2
Forecast (#162)
* added 'forecast' task with estimators ['fbprophet', 'arima', 'sarimax']

* update setup.py

* add TimeSeriesSplit to 'regression' and 'classification' task

* add 'time' split_type for 'classification' and 'regression' task

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* feature importance

* variable name

* Update test/test_split.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update test/test_forecast.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* prophet installation fail in windows

* upload flaml_forecast.ipynb

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2021-08-23 13:26:46 -07:00
すずまる
6270353458
support ROC and AUC for multi-class classification (#170)
* support ROC and AUC for multi-class classification

* add a test case to cover ROC and AUC for multi-class classification
2021-08-22 15:16:10 -07:00
Qingyun Wu
10082b9262
v0.5.12 (#150)
* remove extra comma

* exclusive bound

* log file name

* add cost to space

* dataset_format

* add load_openml_dataset test

* docstr

* revise test format

* simplify restore

* order categories

* openml server exception in test

* process space

* add warning

* log format

* reduce n_cpu

* nested space

* hierarchical search space for CFO

* non hierarchical for bs

* unflatten hierarchical config

* connection error

* random sample

* config signature

* check ray version

* preprocess numpy array

* catboost preprocess

* time budget

* seed, verbose, hpo_method

* test cfocat

* shallow copy in flatten_dict
prevent lgbm model duplication

* match estimator name

* quantize and log

* test qloguniform and qrandint

* test qlograndint

* thread.running

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
2021-08-11 23:02:22 -07:00
Xueqing Liu
eeaf5b5963
space -> main (#148)
* subspace in flow2

* search space and trainable from AutoML

* experimental features: multivariate TPE, grouping, add_evaluated_points

* test experimental features

* readme

* define by run

* set time_budget_s for bs

Co-authored-by: liususan091219 <Xqq630517>

* version

* acl

* test define_by_run_func

* size

* constraints

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-08-02 16:10:26 -07:00
Eduardo Büll
46752083a2
fix UnboundLocalError in tune.run (#142) (#145)
Fix UnboundLocalError exception in tune.run when training_function returns a value.

Resolves #142
2021-08-01 17:55:38 -07:00
Qingyun Wu
e24265ee5d
automl fit with starting points (#141)
* add starting point in fit

* add estimator best config

* add test

* add doc string

* when there are multiple points_to_evaluate in CFO, use the best one to start local search; after that use low cost partial config as the start point; then, remove the points whose performance is worse than the converged, and start local search from the remaining ones ordered by their performance.

Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-07-31 13:39:31 -07:00
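
The start-point ordering described in #141 can be sketched in plain Python. This is one possible reading of the commit message, not FLAML's implementation: the function names, the (config, loss) tuple layout, and the rule that a candidate survives when its loss is no worse than the converged loss are all assumptions made for illustration. Lower loss is treated as better.

```python
from typing import Dict, List, Tuple

Config = Dict[str, float]
Evaluated = Tuple[Config, float]  # (config, observed loss); lower is better


def initial_start_points(evaluated: List[Evaluated], low_cost_config: Config) -> List[Config]:
    """Hypothetical helper: first the best evaluated point, then the low-cost config."""
    best_config, _ = min(evaluated, key=lambda item: item[1])
    return [best_config, low_cost_config]


def remaining_start_points(evaluated: List[Evaluated], converged_loss: float) -> List[Config]:
    """Hypothetical helper: candidates left after a local search has converged.

    The best point is skipped because it was already used as a start point.
    Points worse than the converged loss are dropped, and the survivors are
    returned ordered from best to worst.
    """
    ranked = sorted(evaluated, key=lambda item: item[1])
    return [config for config, loss in ranked[1:] if loss <= converged_loss]


# Made-up evaluated points for illustration.
evaluated = [({"x": 3.0}, 4.5), ({"x": 1.2}, 0.54), ({"x": 1.5}, 0.75)]
first = initial_start_points(evaluated, low_cost_config={"x": 0.0})
later = remaining_start_points(evaluated, converged_loss=1.0)
```

With these inputs, the search would start from {"x": 1.2}, then from the low-cost config, and afterwards only {"x": 1.5} remains as a start point, because the point with loss 4.5 is worse than the converged loss of 1.0.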