117 Commits

Author SHA1 Message Date
Chi Wang
df01031cfe
Zero-shot AutoML (#468)
* Prepare for release

Co-authored-by: Moe Kayali <t-moekayali@microsoft.com>

* bug fix

* improve doc and code quality

Co-authored-by: Qingyun Wu
2022-03-01 15:39:09 -08:00
Chi Wang
9e88f22167
fix a bug when using ray & update ray on aml (#455)
* fix a bug when using ray & update ray on aml
When using with_parameters(), the config argument must be the first argument in the trainable function.
* make training function runnable standalone
2022-02-11 20:14:10 -08:00
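
The argument-order rule in the commit above can be illustrated without Ray. The following is a minimal sketch, not Ray's implementation: `with_parameters_sketch` is a hypothetical stand-in that only mimics binding extra objects as keyword arguments, and `train_ok`, `train_bad` and `data` are illustrative names. The assumption is that the tuner passes `config` as the single positional argument.

```python
from functools import partial


def with_parameters_sketch(trainable, **bound):
    """Hypothetical stand-in: bind extra objects as keyword arguments.

    The wrapped function is later called with `config` as its only
    positional argument, so `config` must come first in the signature.
    """
    return partial(trainable, **bound)


def train_ok(config, data=None):
    # `config` is first; the bound object arrives as a keyword argument.
    return config["lr"] * len(data)


def train_bad(data, config=None):
    # `data` is first, so the positional `config` collides with the bound `data`.
    return config["lr"] * len(data)


wrapped_ok = with_parameters_sketch(train_ok, data=[1, 2, 3])
result = wrapped_ok({"lr": 0.5})

wrapped_bad = with_parameters_sketch(train_bad, data=[1, 2, 3])
try:
    wrapped_bad({"lr": 0.5})
    bad_failed = False
except TypeError:
    # The positional argument and the bound keyword both target `data`.
    bad_failed = True
```

With `config` first, the call succeeds. With the order swapped, the positional `config` and the bound keyword both target `data`, and the call raises `TypeError`.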
Chi Wang
b4d312412a
bump ray version to 1.10 (#450)
* bump ray version to 1.10

* init ray in test

* Update setup.py to include hotfixes

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-02-09 15:04:29 -08:00
Chi Wang
8a44dd4318
data in csv (#430)
* data in csv

* support ray ObjectRef #365

* use object store to store data when using ray

* make lgbm tuning example a test

* homepage title
2022-01-30 19:36:41 -08:00
Chi Wang
6960a833ec
Gpu support for xgboost (#442)
* xgboost gpu support

* test xgboost gpu

* test sparse data

* add xgboost test

* remove ray.init to avoid pytest error
2022-01-30 13:02:18 -08:00
Kevin Chen
c75f97b475
Change the upper bound for "lags" hyperparameter for sklearn forecast models (#437)
* update model.py
- change upper bound for "lags" hyperparameter

* update test_forecast.py
- add a test for a large dataset

* update sample.py
- pre-commit changes
2022-01-30 07:30:30 -08:00
Xueqing Liu
438ccaa0c9
adding catch for HTTP error (#432)
2022-01-29 22:53:32 -08:00
Kevin Chen
81f54026c9
Support time series forecasting for discrete target variable (#416)
* support 'ts_forecast_classification' task to forecast discrete values

* update test_forecast.py
- add test for forecasting discrete values

* update test_model.py

* pre-commit changes
2022-01-24 18:39:36 -08:00
Xueqing Liu
4814091d87
remove redundant imports (#426)
* remove redundant imports

* getting rid of hf dataset
2022-01-24 14:24:14 -08:00
Chi Wang
6a7caa6a3d
max_iter < 2 -> no search; sign in metric constraints; test and example for forecasting (#415)
* max_iter < 2 -> no search

* use_ray in test

* eval_method in ts example

* check sign of constraints

* test metric constraint sign
2022-01-23 01:24:15 -08:00
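
The "check sign of constraints" item above can be sketched as follows. This is an illustrative sketch only, assuming each metric constraint is a `(metric_name, sign, threshold)` tuple whose sign must be `"<="` or `">="`. The helper names `check_constraint_signs` and `constraints_satisfied` are hypothetical and are not taken from the FLAML code base.

```python
import operator

# Assumed set of allowed signs and the comparison each one denotes.
_ALLOWED_SIGNS = {"<=": operator.le, ">=": operator.ge}


def check_constraint_signs(constraints):
    """Raise ValueError if any constraint uses a sign other than <= or >=."""
    for name, sign, threshold in constraints:
        if sign not in _ALLOWED_SIGNS:
            raise ValueError(
                f"constraint on {name!r}: sign must be '<=' or '>=', got {sign!r}"
            )


def constraints_satisfied(metrics, constraints):
    """Return True if every constraint holds for the given metric values."""
    check_constraint_signs(constraints)
    return all(
        _ALLOWED_SIGNS[sign](metrics[name], threshold)
        for name, sign, threshold in constraints
    )


constraints = [("pred_time", "<=", 0.01), ("val_accuracy", ">=", 0.9)]
ok = constraints_satisfied({"pred_time": 0.005, "val_accuracy": 0.93}, constraints)
too_slow = constraints_satisfied({"pred_time": 0.02, "val_accuracy": 0.93}, constraints)
```

Validating the sign up front means a malformed constraint such as `("pred_time", "<", 0.01)` fails early with a clear error rather than being silently ignored.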
Chi Wang
38ad31ea25
remove FLAML sample size from config (#418)
2022-01-22 22:59:44 -08:00
Xueqing Liu
dda4ac90a1
moving intermediate_results logging from model.py to huggingface/trainer.py (#403)
* replacing val_loss with automl_metric
2022-01-14 17:26:10 -08:00
Chi Wang
569908fbe6
fix issues in logging, bug in space.py, constraint sign, and improve code coverage (#388)
* console log handler

* version update

* doc

* skippable steps

* notebook update

* constraint sign

* doc for constraints

* bug fix: define-by-run and unflatten_hierarchical

* const

* handle nested space in indexof()

* test grid search

* test suggestion

* model test

* >1 ckpts

* always increase iter count

* log total # iterations

* security patch

* make iter_per_learner consistent
2022-01-14 13:39:09 -08:00
Xueqing Liu
f41f1c2198
Logging multiple checkpoints (#394)
2022-01-12 19:50:39 -08:00
Xueqing Liu
bd66e40296
fixing load best model at the end (#389)
2022-01-11 10:47:53 -08:00
Kevin Chen
d4273669e6
Time series forecasting with sklearn regressors (#362)
* add sklearn regressors as learners for ts_forecast task

* add direct forecasting strategy
warnings and errors for duplicate rows and missing values

- add preprocess for sklearn time series forecast
 update automl.py
 update test/test_forecast.py

* update model.py and test_forecast.py for cv eval_method

* add "hcrystalball" dependency in setup.py

* update automl.py
- add _validate_ts_data function for abstraction
- include xgb_limitdepth as a learner

* update model.py
- update search space for sklearn ts regressors

* update automl.py and test_forecast.py for numpy array inputs

* add documentation to model.py

* add documentation for removing catboost regressor

* update automl.py
- _validate_ts_data() function

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2022-01-06 23:12:38 -08:00
Chi Wang
612668e8ed
serialize TransformerEstimator (#381)
* serialize TransformerEstimator

* check has_attr

* custom metric needs trainer

* skip test on mac
2022-01-06 10:28:19 -08:00
Xueqing Liu
207b6935d9
adding token classification (#376)
* adding ner
2022-01-03 13:44:10 -05:00
Chi Wang
8602def1c4
logging (#371)
* query logged runs

* mlflow log when using ray

* key check for newer version of ray #363

* catch importerror

* log and load AutoML model

* retrain if necessary when ensemble fails
2022-01-02 21:37:19 -08:00
oberonbot
9c00e4272a
Finish the Multiple Choice Classification (#367)
* adding multiple choice

* update test cases (hard coded)

* merged common code in predict_proba and predict in TransformersEstimator
2022-01-02 20:12:34 -05:00
Chi Wang
2f5d6169d3
example update (#359)
update some examples for consistency with the others.
2021-12-25 16:13:39 -08:00
Xueqing Liu
b2900f4b22
fixing custom metric (#357)
* fixing the error for custom metric
2021-12-24 16:23:09 -05:00
Rui Zhuang
c6c0c29769
Simplify lgbm example (#358)
* simplify lgbm examples

* provide link to lgbm example script.

* simplify lgbm example in the example script.

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-23 23:05:14 -08:00
Xueqing Liu
dcfd218108
Fixing the bug in custom metric (#356)
* fixing the bug for custom metric
2021-12-23 18:44:53 -05:00
Chi Wang
300f286667
azureml + ray (#344)
* examples and documentation about how to use azureml + ray
2021-12-23 13:37:07 -08:00
Chi Wang
0b25e89f29
reproducibility for random sampling (#349)
* reproducibility for random sampling #236

* doc update
2021-12-22 12:12:25 -08:00
Xueqing Liu
ee3162e232
Adding the NLP task summarization (#346)
* Add test_autohf_summarization.py

* adding seq2seq

* Update flaml/nlp/huggingface/trainer.py

* rouge metrics

Co-authored-by: XinZofStevens <xzhao4346@gmail.com>
Co-authored-by: JinzhuoWu <wujinzhuo0105@gmail.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-20 14:19:32 -08:00
Chi Wang
efd85b4c86
Deploy a new doc website (#338)
A new documentation website. And:

* add actions for doc

* update docstr

* installation instructions for doc dev

* unify README and Getting Started

* rename notebook

* doc about best_model_for_estimator #340

* docstr for keep_search_state #340

* DNN

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
Co-authored-by: Z.sk <shaokunzhang@psu.edu>
2021-12-16 17:11:33 -08:00
Chia-Chi Hsu
671ccbbe3f
support for customized splitters (#333)
* add support for customized splitters

* use the param split_type for feeding generators

* use single API for customized splitter and add test

* when task==TS_FORECAST, always set shuffle=False

* update docstr

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-16 16:13:04 -08:00
Z.sk
7b24662dca
Allow the evaluation_function to receive the incumbent best result as input in Tune (#339)
* update tune function

* pass incumbent result to the training function

* Update test/tune/test_record_incumbent.py

* Update flaml/searcher/search_thread.py

* Update flaml/searcher/blendsearch.py

* Update flaml/tune/tune.py

* add constant variable

Co-authored-by: 张少坤 <zhangshaokun@fuzhi.ai>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-15 21:12:47 -08:00
Qingyun Wu
17b17d084f
tune api for schedulers (#322)
* revise api and tests

* rename prune_attr

* update finetune notebook

* add scheduler test and notebook

* update tune api for scheduler

* remove scheduler notebook

* Update flaml/tune/tune.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* docstr

* fix imports

* clear notebook output

* fix ray import

* Update flaml/tune/tune.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* improve docstr

* Update flaml/searcher/blendsearch.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* remove redundant import

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-12-04 21:52:20 -05:00
Chi Wang
7d269435ae
add save_best_config()
2021-12-04 16:29:52 -08:00
Chi Wang
18230ed22f
pred_time_limit clarification and logging (#319)
* pred_time_limit clarification

* log prediction time

* handle ChunkedEncodingError in test
2021-12-03 16:02:00 -08:00
Xueqing Liu
fb59bb9928
adding TODOs for NLP module, so students can implement other tasks more easily (#321)
* fixing ray pickle bug, skipping macosx bug, completing code for seqregression

* catching connectionerror

* adding TODOs for NLP module
2021-12-03 12:45:16 -05:00
Chi Wang
c57954fbbd
include default value in rf search space (#317)
* include default value in rf search space

* init _mem_per_iter with -1

* bump version to 0.8.2

* docstr for search space's arguments
2021-12-03 09:15:21 -08:00
Michal Chromcak
b0ef3b7995
Add conda forge minimal test (#309)
* add conda forge minimal test, create pytest markers
2021-11-25 13:58:20 -08:00
晓宇
5dc948da18
Update test_regression.py (#306)
* Update test_regression.py

There is another way to train a multi-output model.
RegressorChain is better suited to targets that are correlated.
2021-11-25 08:18:22 -08:00
Xueqing Liu
fd136b02d1
bug fix for TransformerEstimator (#293)
* fix checkpoint naming + trial id for non-ray mode, fix the bug in running test mode, delete all the checkpoints in non-ray mode

* finished testing for checkpoint naming, delete checkpoint, ray, max iter = 1

* adding predict_proba, address PR 293's comments

close #293 #291
2021-11-23 11:26:39 -08:00
Chi Wang
85e21864ce
test -> val; docstr (#300)
* rename test -> val in custom metric function
* add an example in docstr
resolve #299
2021-11-22 22:17:29 -08:00
Chi Wang
d937b03e42
multioutput regression (#292)
* make AutoML inherit sklearn.base.BaseEstimator such that it can be wrapped in sklearn.multioutput.MultiOutputRegressor for multi-output regression.

* moved and simplified preprocessing code in AutoML.predict() to _preprocess()
2021-11-22 06:59:42 -08:00
Chi Wang
00da79a90b
empty search space (#295)
fix the error when an empty dictionary is passed to BlendSearch as the search space.
2021-11-20 20:05:28 -08:00
Qingyun Wu
49f9e9f86b
add warmstart test (#298)
* add warmstart test

* remove redundancy

* add more types of hps

* revise comments

* simplify name

* reduce redundancy
2021-11-20 20:23:54 -05:00
Chi Wang
72caa2172d
model_history, ITER_HP, settings in AutoML(), checkpoint bug fix (#283)
if save_best_model_per_estimator is False and retrain_final is True, unfit the model after evaluation in HPO.
retrain if using ray.
update ITER_HP in config after a trial is finished.
change prophet logging level.
example and notebook update.
allow settings to be passed to AutoML constructor. For #192 (Are you planning to add multi-output-regression capability to FLAML) and #277 (Is multi-tasking allowed?): the automl settings can now be passed to the constructor instead of requiring a derived class.
remove model_history.
checkpoint bug fix.

* model_history meaning save_best_model_per_estimator

* ITER_HP

* example update

* prophet logging level

* comment update in forecast notebook

* print format improvement

* allow settings to be passed to AutoML constructor

* checkpoint bug fix

* time limit for autohf regression test

* skip slow test on macos

* cleanup before del
2021-11-18 09:39:45 -08:00
Qingyun Wu
e9551de3cc
add best_loss_per_estimator
2021-11-17 22:43:20 -08:00
Xueqing Liu
42de3075e9
Make NLP tasks available from AutoML.fit() (#210)
Sequence classification and regression: "seq-classification" and "seq-regression"

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-11-16 11:06:20 -08:00
Chi Wang
92ebd1f7f9
when max_iter=1, skip search only if retrain_final (#280)
* when max_iter=1, skip search only if retrain_final

* remove nlp
redesign in #210

* minor change in readme example
2021-11-09 21:51:23 -08:00
Chi Wang
549a0dfb53
limit time and memory consumption (#264)
* limit time and memory

* separate tests

* lrl1 can't be limited by limit_resource

* free memory when possible

* passthrough=False when ensemble fails;
retrain when trained_estimator is None

* use callback for resource limit

* handle lower version of xgb with no callback

* free mem ratio

* reduce verbosity

* retrain_final when max_iter==1

* remove trained_estimator from result

* model_history

* wheel

* retrain time as best_config_train_time

* ci: libomp version for xgboost on macos

* limit_resource not working in windows

* test pickle load

* mute forecaster

* notebook update

* check hard

* preventive callback

* add use_ray
2021-11-03 19:08:23 -07:00
Kevin Chen
519bfc2a18
Integrate multivariate time series forecasting (#254)
* Integrate multivariate time series forecasting, now supports
continuous and categorical variables

- update data.py to transform time series data
- update search space
- update documentation to reflect changes
- update test_forecast.py
- rename 'forecast' task to 'ts_forecast' task

* update automl.py and test_forecast.py

* update forecast notebook

* update README.md and setup.py

* update ml.py and test_forecast.py

- make "ds" and "y" constant variables

* replace constants with constant variables

* bump version to 0.7.0

* update setup.py
- support 'forecast' and 'ts_forecast'

* update automl.py and data.py
- support 'forecast' and 'ts_forecast' tasks
2021-10-30 09:48:57 -07:00
Antoni Baum
e0155c2339
Fix exception in CFO's _create_condition if no candidate start point has returned yet (#263)
* Fix exception if first trial returns None

* Add test
2021-10-29 11:44:16 -07:00
Chi Wang
7809ec15ac
catch import error
2021-10-19 11:52:41 -07:00