75 Commits

Author SHA1 Message Date
Chi Wang
47e034d203
LightGBM notebook update (#690)
* version update in notebook

* comment about optuna install

* monotone constraints
2022-08-20 07:43:06 -07:00
Xueqing Liu
6c7d37374d
updating nlp notebook (#683) 2022-08-15 18:09:08 -07:00
Kevin Chen
2e8e3937ef
update time series forecast notebook (#682)
* update forecasting with exogeneous variables example

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update forecasting with exogeneous variables example on website

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* rerun automl_time_series_forecast with new predict function for tft

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* correct spelling error

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2022-08-13 18:58:45 -07:00
Kevin Chen
f718d18b5e
time series forecasting with panel datasets (#541)
* time series forecasting with panel datasets
- integrate Temporal Fusion Transformer as a learner based on pytorchforecasting

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py and test_forecast.py
- remove blank lines

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py to prevent errors

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py and data.py
- change forecast task name
- update documentation for fit() method

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py
- add performance test
- use 'fit_kwargs_by_estimator'

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* add time index function

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py performance test

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update data.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update data.py to prevent type error

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update setup.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update for pytorch forecasting tft on panel datasets

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py documentations

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* - rename estimator
- add 'gpu_per_trial' for tft estimator

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update test_forecast.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* include ts panel forecasting as an example

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update documentations

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl_time_series_forecast.ipynb

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update documentations

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* "weights_summary" argument deprecated and removed for pl.Trainer()

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py tft estimator prediction method

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update model.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update `fit_kwargs` documentation

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* update automl.py

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-08-12 08:39:22 -07:00
Xueqing Liu
5eb5d43d7f
Fix HPO evaluation bug (#645)
* fix eval automl metric bug on val_loss inconsistency

* updating starting point search space to continuous

* shortening notebok
2022-07-28 23:08:42 -04:00
Chi Wang
e14e909af9
Feature names and importances (#621)
* feature names and importances

* None check

* StackingClassifier has no feature_importances_

* StackingClassifier has no feature_names_in_
2022-07-10 12:25:59 -07:00
Zvi Baratz
127e482ac8
Replaced !pip calls with %pip magic command. (#604)
Closes #603.
2022-06-23 18:45:42 -07:00
Zvi Baratz
fefdb48ef3
Fix automl settings in scikit-learn pipeline integration example (#602)
* Added test directory and core file to gitignore.

Closes #601.

* Fixed pipeline fit parameters.

Closes #600.

* Reverted changes to gitignore.
2022-06-22 19:44:14 -07:00
Chi Wang
c45741a67b
support latest xgboost version (#599)
* support latest xgboost version

* Update test_classification.py

* Update 

Exists problems when installing xgb1.6.1 in py3.6

* cleanup

* xgboost version

* remove time_budget_s in test

* remove redundancy

* stop support of python 3.6

Co-authored-by: zsk <shaokunzhang529@gmail.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2022-06-21 18:59:07 -07:00
Chi Wang
0642b6e7bb
init value type match (#575)
* init value type match

* bump version to 1.0.6

* add a note about flaml version in notebook

* add note about mismatched ITER_HP

* catch SSLError when accessing OpenML data

* catch errors in autovw test

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2022-06-09 08:11:15 -07:00
Chi Wang
619107edf5 install openml for notebook example 2022-06-05 22:52:20 -07:00
Chi Wang
c9bac02ea4
add zeroshot notebook (#569)
* add zeroshot notebook
2022-06-04 12:00:58 -07:00
Xueqing Liu
ca35fa969f
refactoring TransformersEstimator to support default and custom_hp (#511)
* refactoring TransformersEstimator to support default and custom_hp

* handling starting_points not in search space

* addressing starting point more than max_iter

* fixing upper < lower bug
2022-04-28 14:06:29 -04:00
Chi Wang
e877de6414 use ffill in forecasting example 2022-04-01 09:23:23 -07:00
Qingyun Wu
2cdc08a75a update notebook and test 2022-03-30 19:11:10 -07:00
liususan091219
0bcf618fea fixing bug 2022-03-28 11:53:58 -07:00
Qingyun Wu
6c16e47e42
Bug fix and add documentation for metric_constraints (#498)
* metric constraint documentation

* update link

* update notebook

* fix a bug in adding 'time_total_s' to result

* use the default multiple factor from config file

* update notebook

* format

* improve test

* revise test budget for macos

* bug fix in adding time_total_s

* increase performance check budget

* revise test

* update notebook

* uncomment test

* remove redundancy

* clear output

* remove n_jobs

* remove constraint in notebook

* increase budget

* revise test

* add python version

* use getattr

* improve code robustness

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
2022-03-26 21:11:45 -04:00
Xueqing Liu
5f97532986
adding evaluation (#495)
* adding automl.score

* fixing the metric name in train_with_config

* adding pickle after score

* fixing a bug in automl.pickle
2022-03-25 17:00:08 -04:00
Xueqing Liu
af423463c3
fixing bug for ner (#463)
* fixing bug for ner

* removing global var

* adding class for trial counter

* adding notebook

* adding use_ray dict

* updating documentation for nlp
2022-03-20 22:03:02 -04:00
Kevin Chen
f9eda0cc40
update documentation for time series forecasting (#472)
* update automl.py
- documentation update

* update test_forecast.py

* update model.py

* update automl_time_series_forecast.ipynb

* update time series forecast website examples

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2022-03-08 11:21:18 -08:00
Chi Wang
6095be039f
update regression metrics in notebooks (#454) 2022-02-15 13:17:43 -08:00
Chi Wang
569908fbe6
fix issues in logging, bug in space.py, constraint sign, and improve code coverage (#388)
* console log handler

* version update

* doc

* skippable steps

* notebook update

* constraint sign

* doc for constraints

* bug fix: define-by-run and unflatten_hierarchical

* const

* handle nested space in indexof()

* test grid search

* test suggestion

* model test

* >1 ckpts

* always increase iter count

* log total # iterations

* security patch

* make iter_per_learner consistent
2022-01-14 13:39:09 -08:00
Chi Wang
8602def1c4
logging (#371)
* query logged runs

* mlflow log when using ray

* key check for newer version of ray #363

* catch importerror

* log and load AutoML model

* retrain if necessary when ensemble fails
2022-01-02 21:37:19 -08:00
Chi Wang
efd85b4c86
Deploy a new doc website (#338)
A new documentation website. And:

* add actions for doc

* update docstr

* installation instructions for doc dev

* unify README and Getting Started

* rename notebook

* doc about best_model_for_estimator #340

* docstr for keep_search_state #340

* DNN

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: Z.sk <shaokunzhang@psu.edu>
2021-12-16 17:11:33 -08:00
Chi Wang
b773e2898f Update flaml_pytorch_cifar10.ipynb
training_function -> train_cifar
2021-12-07 15:17:15 -08:00
Qingyun Wu
dd60dbc5eb
rename training_function (#327)
* rename training_function

* add docstr

* update docstr

* update docstr and comments

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
2021-12-06 17:03:43 -05:00
Qingyun Wu
17b17d084f
tune api for schedulers (#322)
* revise api and tests

* rename prune_attr

* update finetune notebook

* add scheduler test and notebook

* update tune api for scheduler

* remove scheduler notebook

* Update flaml/tune/tune.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* docstr

* fix imports

* clear notebook output

* fix ray import

* Update flaml/tune/tune.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* improve docstr

* Update flaml/searcher/blendsearch.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* remove redundant import

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-04 21:52:20 -05:00
Chi Wang
c57954fbbd
include default value in rf search space (#317)
* include default value in rf search space

* init _mem_per_iter with -1

* bump version to 0.8.2

* docstr for search space's arguments
2021-12-03 09:15:21 -08:00
Chi Wang
85e21864ce
test -> val; docstr (#300)
* rename test -> val in custom metric function
* add an example in docstr
resolve #299
2021-11-22 22:17:29 -08:00
Chi Wang
ea6d28d7bd
add max_depth to xgboost search space (#282)
* add max_depth to xgboost search space

* notebook update

* two learners for xgboost (max_depth or max_leaves)
2021-11-22 21:17:48 -08:00
Chi Wang
72caa2172d
model_history, ITER_HP, settings in AutoML(), checkpoint bug fix (#283)
if save_best_model_per_estimator is False and retrain_final is True, unfit the model after evaluation in HPO.
retrain if using ray.
update ITER_HP in config after a trial is finished.
change prophet logging level.
example and notebook update.
allow settings to be passed to AutoML constructor. Are you planning to add multi-output-regression capability to FLAML #192 Is multi-tasking allowed? #277 can pass the auotml setting to the constructor instead of requiring a derived class.
remove model_history.
checkpoint bug fix.

* model_history meaning save_best_model_per_estimator

* ITER_HP

* example update

* prophet logging level

* comment update in forecast notebook

* print format improvement

* allow settings to be passed to AutoML constructor

* checkpoint bug fix

* time limit for autohf regression test

* skip slow test on macos

* cleanup before del
2021-11-18 09:39:45 -08:00
Xueqing Liu
42de3075e9
Make NLP tasks available from AutoML.fit() (#210)
Sequence classification and regression: "seq-classification" and "seq-regression"

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-11-16 11:06:20 -08:00
Chi Wang
59083fbdcb
example update (#281)
* example update

* bump version to 0.7.2

* notebook update
2021-11-12 22:29:33 -08:00
Chi Wang
92ebd1f7f9
when max_iter=1, skip search only if retrain_final (#280)
* when max_iter=1, skip search only if retrain_final

* remove nlp
redesign in #210

* minor change in readme example
2021-11-09 21:51:23 -08:00
Chi Wang
549a0dfb53
limit time and memory consumption (#264)
* limit time and memory

* separate tests

* lrl1 can't be limited by limit_resource

* free memory when possible

* passthrough=False when ensemble fails;
retrain when trained_estimator is None

* use callback to for resource limit

* handle lower version of xgb with no callback

* free mem ratio

* reduce verbosity

* retrain_final when max_iter==1

* remove trained_estimator from result

* model_history

* wheel

* retrain time as best_config_train_time

* ci: libomp version for xgboost on macos

* limit_resource not working in windows

* test pickle load

* mute forecaster

* notebook update

* check hard

* preventive callback

* add use_ray
2021-11-03 19:08:23 -07:00
Kevin Chen
519bfc2a18
Integrate multivariate time series forecasting (#254)
* Integrate multivariate time series forecasting, now supports
continuous and categorical variables

- update data.py to transform time series data
- update search space
- update documentations to reflect changes
- update test_forecast.py
- rename 'forecast' task to 'ts_forecast' task

* update automl.py and test_forecast.py

* update forecast notebook

* update README.md and setup.py

* update ml.py and test_forecast.py

- make "ds" and "y" constant variables

* replace constants with constant variables

* bump version to 0.7.0

* update setup.py
- support 'forecast' and 'ts_forecast'

* update automl.py and data.py
- support 'forecast' and 'ts_forecast' tasks
2021-10-30 09:48:57 -07:00
Chi Wang
524f22bcc5
fix bug in hierarchical search space (#248); optional dependency on lgbm and xgb (#250)
* close #249

* admissible region

* best_config can be None

* optional dependency on lgbm and xgb
resolve #252
2021-10-15 21:36:42 -07:00
Chi Wang
f48ca2618f
warning -> info for low cost partial config (#231)
* warning -> info for low cost partial config
#195, #110

* when n_estimators < 0, use trained_estimator's

* log debug info

* test random seed

* remove "objective"; avoid ZeroDivisionError

* hp config to estimator params

* check type of searcher

* default n_jobs

* try import

* Update searchalgo_auto.py

* CLASSIFICATION

* auto_augment flag

* min_sample_size

* make catboost optional
2021-10-08 16:09:43 -07:00
Chi Wang
ea6c6ded2f
clean up forecast notebook (#202) 2021-09-13 21:16:42 -07:00
Chi Wang
71219df6c6
notebook example (#189)
* config in result

* value can be float

* pytorch notebook example

* docker, pre-commit

* max_failure (#192); early_stop

* extend starting_points (#196)

Co-authored-by: Chi Wang (MSR) <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qw2ky@virginia.edu>
2021-09-10 16:39:16 -07:00
Chi Wang
e46573a01d
warmstart blendsearch (#186)
* increase test coverage

* use define by run only when needed

* warmstart bs

* classification -> binary, multi

* warm start with evaluated rewards

* data transformer; resource attr for gs

* BlendSearchTuner bug fix and unittest

* bug fix

* docstr and import

* task type
2021-09-04 01:42:21 -07:00
Kevin Chen
ec34427ca8
Forecast v2 (#182)
* update flaml_forecast.ipynb

* visualize predictions for comparison

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2021-09-01 18:06:37 -07:00
Chi Wang
6ab0730793
remove catboost training dir; ensemble api; blendsearch for hierarchical space; ranking task; forecast improvement (#178)
* remove catboost training dir

* close #48

* bs for hierarchical space. close #85

* retrain for hierarchical space

* clean ml (#180)

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>

* support ranking task

* examples

* cv shuffle

* forecast api and implementation cleaner

* period constraints

* delete groups after fit
2021-09-01 16:25:04 -07:00
Qingyun Wu
a229a6112a
Support parallel and add random search (#167)
* non hashable value out of signature

* parallel trials

* add random in _search_parallel

* fix bug in retraining

* check memory constraint before training

* retrain_full

* log custom metric

* retraining budget check

* sample size check before retrain

* remove 'time2eval' from result

* report 'total_search_time' in result

* rename total_search_time to wall_clock_time

* rename train_loss boolean to log_training_metric

* set default train_loss to None

* exclude oom result

* log retrained model

* no subsample

* doc str

* notebook

* predicted value is NaN for sarimax

* version

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
2021-08-23 16:36:51 -07:00
Kevin Chen
3d0a3d26a2
Forecast (#162)
* added 'forecast' task with estimators ['fbprophet', 'arima', 'sarimax']

* update setup.py

* add TimeSeriesSplit to 'regression' and 'classification' task

* add 'time' split_type for 'classification' and 'regression' task

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>

* feature importance

* variable name

* Update test/test_split.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update test/test_forecast.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* prophet installation fail in windows

* upload flaml_forecast.ipynb

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2021-08-23 13:26:46 -07:00
すずまる
6270353458
support ROC and AUC for multi-class classification (#170)
* support ROC and AUC for multi-class classification

* add a test case to cover ROC and AUC for multi-class classification
2021-08-22 15:16:10 -07:00
Naga Budigam
2fb888e64e
Tutorial on using AutoML in sklearn pipeline (#157)
* tutorial on using AutoML in sklearn pipeline
2021-08-11 17:46:46 -07:00
Xueqing Liu
eeaf5b5963
space -> main (#148)
* subspace in flow2

* search space and trainable from AutoML

* experimental features: multivariate TPE, grouping, add_evaluated_points

* test experimental features

* readme

* define by run

* set time_budget_s for bs

Co-authored-by: liususan091219 <Xqq630517>

* version

* acl

* test define_by_run_func

* size

* constraints

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-08-02 16:10:26 -07:00
Qingyun Wu
58c0ec959d
Update readme for flaml.tune (#137)
* add time_budget_s for bs in readme

* version update

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-07-24 17:10:43 -07:00
Chi Wang
b3bb00966d
coverage (#135)
* coverage

* readme

* timeout
2021-07-20 17:00:44 -07:00