126 Commits

Author SHA1 Message Date
skzhang1
7851a463aa clean up 2022-08-07 18:16:34 +00:00
skzhang1
e5a422c41e update 2022-08-07 18:11:04 +00:00
Xueqing Liu
5eb5d43d7f
Fix HPO evaluation bug (#645)
* fix eval automl metric bug on val_loss inconsistency

* updating starting point search space to continuous

* shortening notebok
2022-07-28 23:08:42 -04:00
Chi Wang
e14e909af9
Feature names and importances (#621)
* feature names and importances

* None check

* StackingClassifier has no feature_importances_

* StackingClassifier has no feature_names_in_
2022-07-10 12:25:59 -07:00
Qingyun Wu
b7846048dc
Allow FLAML_sample_size in starting_points (#619)
* FLAML_sample_size

* clean up

* starting_points as a list

* catch AssertionError

* per estimator sample size

* import

* per estimator min_sample_size

* Update flaml/automl.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update test/automl/test_warmstart.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* add warnings

* adding more tests

* fix a bug in validating starting points

* improve test

* revise test

* revise test

* documentation about custom_hp

* doc and efficiency

* update test

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-07-09 16:04:46 -04:00
Xueqing Liu
6108493e0b
fix ner bug; refactor post processing of TransformersEstimator prediction (#615)
* fix ner bug; refactor post processing

* fix too many values to unpack

* supporting id/token label for NER
2022-07-05 13:38:21 -04:00
Chi Wang
9bf13d66f1
use relative url in doc (#620)
* use relative url in doc

* update link
2022-07-01 13:28:16 -07:00
Chi Wang
74cca60606
Allow custom GroupKFold object as split_type (#616)
* Allow custom GroupKFold object

* handle unpickle error for prophet 1.1

* eval_method in test_object()
2022-06-29 21:04:25 -07:00
Chi Wang
c5272ad377 log msg about ensemble 2022-06-18 06:48:41 -07:00
Chi Wang
1b40b4b3a6
set_search_properties (#595)
* update the signature of set_search_properties
2022-06-16 16:30:50 -07:00
Chi Wang
5de3f54fd9
update time from start when using ray (#586) 2022-06-13 00:12:22 -04:00
Chi Wang
f8cc38bc16
enable ensemble when using ray (#583)
* enable ensemble when using ray

* sanitize config
2022-06-10 21:28:47 -07:00
Chi Wang
0642b6e7bb
init value type match (#575)
* init value type match

* bump version to 1.0.6

* add a note about flaml version in notebook

* add note about mismatched ITER_HP

* catch SSLError when accessing OpenML data

* catch errors in autovw test

Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2022-06-09 08:11:15 -07:00
Xueqing Liu
e0e317bfb1
fixing trainable and update function, completing NOTE (#566)
* fix checkpoint naming + trial id for non-ray mode, fix the bug in running test mode, delete all the checkpoints in non-ray mode

* finished testing for checkpoint naming, delete checkpoint, ray, max iter = 1
2022-06-03 15:19:22 -04:00
Chi Wang
49e8f7f028
use zeroshot when no budget is given; custom_hp (#563)
* use zeroshot when no budget is given; custom_hp

* update Getting-Started

* protobuf version

* X_val
2022-05-28 17:22:09 -07:00
Chi Wang
d402c63312
align indent and add missing quotation (#555)
* align indent and add missing quotation
2022-05-20 10:49:39 -07:00
harish445
992ea3416c
fix indentation in automl.py (#553)
* fix indentation in flaml.py
2022-05-19 16:29:45 -07:00
Qiaochu Song
2851134052
Quick-fix (#539)
* fix doc string; enable label transform in automl.score
2022-05-19 11:43:34 -04:00
Chi Wang
7126b69ce0
choose n_jobs for ensemble according to n_jobs per learner (#551)
* set n_jobs in ensemble dict

* catch the ensemble error

* choose n_jobs for stacker

* clarify
2022-05-18 21:01:51 -07:00
Xueqing Liu
2a8decdc50
fix the post-processing bug in NER (#534)
* fix conll bug

* update DataCollatorForAuto

* adding label_list comments
2022-05-10 17:22:57 -04:00
Xueqing Liu
c1e1299855
fixing use_ray in automl.py (#531)
* fixing use_ray
2022-05-02 08:05:23 -07:00
Xueqing Liu
ca35fa969f
refactoring TransformersEstimator to support default and custom_hp (#511)
* refactoring TransformersEstimator to support default and custom_hp

* handling starting_points not in search space

* addressing starting point more than max_iter

* fixing upper < lower bug
2022-04-28 14:06:29 -04:00
Qingyun Wu
6c16e47e42
Bug fix and add documentation for metric_constraints (#498)
* metric constraint documentation

* update link

* update notebook

* fix a bug in adding 'time_total_s' to result

* use the default multiple factor from config file

* update notebook

* format

* improve test

* revise test budget for macos

* bug fix in adding time_total_s

* increase performance check budget

* revise test

* update notebook

* uncomment test

* remove redundancy

* clear output

* remove n_jobs

* remove constraint in notebook

* increase budget

* revise test

* add python version

* use getattr

* improve code robustness

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
2022-03-26 21:11:45 -04:00
Chi Wang
7eb7b46ea9
version number and doc (#497)
* version number

* add missing tasks in documentation

* update node-forge version
2022-03-25 17:32:37 -07:00
Xueqing Liu
5f97532986
adding evaluation (#495)
* adding automl.score

* fixing the metric name in train_with_config

* adding pickle after score

* fixing a bug in automl.pickle
2022-03-25 17:00:08 -04:00
Xueqing Liu
af423463c3
fixing bug for ner (#463)
* fixing bug for ner

* removing global var

* adding class for trial counter

* adding notebook

* adding use_ray dict

* updating documentation for nlp
2022-03-20 22:03:02 -04:00
Qingyun Wu
f6ae1331f5
metric constraints in flaml.automl (#479)
* metric constraints

* revise docstr

* fix docstr

* improve docstr

* Update flaml/automl.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update flaml/automl.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* Update flaml/automl.py

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* docstr

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2022-03-12 00:39:35 -05:00
Kevin Chen
f9eda0cc40
update documentation for time series forecasting (#472)
* update automl.py
- documentation update

* update test_forecast.py

* update model.py

* update automl_time_series_forecast.ipynb

* update time series forecast website examples

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2022-03-08 11:21:18 -08:00
Chi Wang
df01031cfe
Zero-shot AutoML (#468)
* Prepare for release

Co-authored-by: Moe Kayali <t-moekayali@microsoft.com>

* bug fix

* improve doc and code quality

Co-authored-by: Qingyun Wu
2022-03-01 15:39:09 -08:00
Chi Wang
e3e737c71a
make AutoML.classes_ an array (#467)
* remove .tolist()

* docstr
2022-02-25 22:13:41 -08:00
Qingyun Wu
05f9065ade
Docstr update (#460)
* parallel tuning docstr

* update n_concurrent_trials docstr

* n_jobs default

* parallel tuning in tune docstr
2022-02-15 09:41:53 -08:00
Chi Wang
9e88f22167
fix a bug when using ray & update ray on aml (#455)
* fix a bug when using ray & update ray on aml
When using with_parameters(), the config argument must be the first argument in the trainable function.
* make training function runnable standalone
2022-02-11 20:14:10 -08:00
Chi Wang
8a44dd4318
data in csv (#430)
* data in csv

* support ray ObjectRef #365

* use object store to store data when using ray

* make lgbm tuning example a test

* homepage title
2022-01-30 19:36:41 -08:00
Chi Wang
6960a833ec
Gpu support for xgboost (#442)
* xgboost gpu support

* test xgboost gpu

* test sparse data

* add xgboost test

* remove ray.init to avoid pytest error
2022-01-30 13:02:18 -08:00
Kevin Chen
81f54026c9
Support time series forecasting for discrete target variable (#416)
* support 'ts_forecast_classification' task to forecast discrete values

* update test_forecast.py
- add test for forecasting discrete values

* update test_model.py

* pre-commit changes
2022-01-24 18:39:36 -08:00
Chi Wang
6a7caa6a3d
max_iter < 2 -> no search; sign in metric constraints; test and example for forecasting (#415)
* max_iter < 2 -> no search

* use_ray in test

* eval_method in ts example

* check sign of constraints

* test metric constraint sign
2022-01-23 01:24:15 -08:00
Xueqing Liu
47d2295fb7
Set use_ray to True for logging to databricks (#414)
* fixing use_ray bug
2022-01-18 18:37:35 -08:00
MichaelMarien
1c911da9f8
Sklearn api x (#405)
* changed signature of automl.predict and automl.predict_proba to X

* XGBoostEstimator

* changed signature of Prophet predict to X

* changed signature of ARIMA predict to X

* changed signature of TS_SKLearn_Regressor predict to X
2022-01-16 14:37:56 -08:00
Chi Wang
569908fbe6
fix issues in logging, bug in space.py, constraint sign, and improve code coverage (#388)
* console log handler

* version update

* doc

* skippable steps

* notebook update

* constraint sign

* doc for constraints

* bug fix: define-by-run and unflatten_hierarchical

* const

* handle nested space in indexof()

* test grid search

* test suggestion

* model test

* >1 ckpts

* always increase iter count

* log total # iterations

* security patch

* make iter_per_learner consistent
2022-01-14 13:39:09 -08:00
Xueqing Liu
c1b5cb5348
fixing default metric for regression + change verbosity for transformers (#397)
* fixing default metric for regression + change verbosity for transformers

* fixing per_device_train_batch_size

* Update flaml/automl.py for gpu_per_trial
2022-01-13 21:08:51 -08:00
Xueqing Liu
f41f1c2198
Logging multiple checkpoints (#394) 2022-01-12 19:50:39 -08:00
Kevin Chen
99667dad5f
Regression forecast debug (#391)
* update automl.py
- fix bug with removing "catboost"
2022-01-11 13:16:59 -08:00
Xueqing Liu
c54c1246c6
fixing auto metric bug (#387) 2022-01-07 16:25:58 -08:00
Kevin Chen
d4273669e6
Time series forecasting with sklearn regressors (#362)
* add sklearn regressors as learners for ts_forecast task

* add direct forecasting strategy
warnings and errors for duplicate rows and missing values

- add preprocess for sklearn time series forecast
 update automl.py
 update test/test_forecast.py

* update model.py and test_forecast.py for cv eval_method

* add "hcrystalball" dependency in setup.py

* update automl.py
- add _validate_ts_data function for abstraction
- include xgb_limitdepth as a learner

* update model.py
- update search space for sklearn ts regressors

* update automl.py and test_forecast.py for numpy array inputs

* add documentations to model.py

* add documentation for removing catboost regressor

* update automl.py
- _validate_ts_data() function

Signed-off-by: Kevin Chen <chenkevin.8787@gmail.com>
2022-01-06 23:12:38 -08:00
Chi Wang
612668e8ed
serialize TransformerEstimator (#381)
* serialize TransformerEstimator

* check has_attr

* custom metric needs trainer

* skip test on mac
2022-01-06 10:28:19 -08:00
Chi Wang
cd9740f022
Fix several issues for nlp tasks (#380)
* num cpu issue #378;
* temp fix for ray issue #379;
* transformers version.
2022-01-05 13:49:12 -08:00
Xueqing Liu
207b6935d9
adding token classification (#376)
* adding ner
2022-01-03 13:44:10 -05:00
Chi Wang
8602def1c4
logging (#371)
* query logged runs

* mlflow log when using ray

* key check for newer version of ray #363

* catch importerror

* log and load AutoML model

* retrain if necessary when ensemble fails
2022-01-02 21:37:19 -08:00
Chi Wang
2f5d6169d3
example update (#359)
update some examples for consistencies with others.
2021-12-25 16:13:39 -08:00
Chi Wang
baa0359324
doc update (#352)
* custom splitter
* NLP
* version number
2021-12-22 14:35:13 -08:00