35 Commits

Author SHA1 Message Date
Chi Wang
fe65fa143d
v0.6.8 (#247) 2021-10-12 15:08:40 -07:00
Chi Wang
ddc1a63a76
Package (#244)
* build and upload pypi package

* pandas in dependency
2021-10-10 22:57:22 -07:00
Christoph Deil
948f688742
Consistent California (#245) 2021-10-09 07:52:07 -07:00
Chi Wang
f48ca2618f
warning -> info for low cost partial config (#231)
* warning -> info for low cost partial config
#195, #110

* when n_estimators < 0, use trained_estimator's

* log debug info

* test random seed

* remove "objective"; avoid ZeroDivisionError

* hp config to estimator params

* check type of searcher

* default n_jobs

* try import

* Update searchalgo_auto.py

* CLASSIFICATION

* auto_augment flag

* min_sample_size

* make catboost optional
2021-10-08 16:09:43 -07:00
Chi Wang
71219df6c6
notebook example (#189)
* config in result

* value can be float

* pytorch notebook example

* docker, pre-commit

* max_failure (#192); early_stop

* extend starting_points (#196)

Co-authored-by: Chi Wang (MSR) <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qw2ky@virginia.edu>
2021-09-10 16:39:16 -07:00
Chi Wang
e46573a01d
warmstart blendsearch (#186)
* increase test coverage

* use define by run only when needed

* warmstart bs

* classification -> binary, multi

* warm start with evaluated rewards

* data transformer; resource attr for gs

* BlendSearchTuner bug fix and unittest

* bug fix

* docstr and import

* task type
2021-09-04 01:42:21 -07:00
Gian Pio Domiziani
63bba92fd0
Fix decide_split_type bug. (#184)
* Fix decide_split_type bug.
2021-09-02 08:50:22 -07:00
Chi Wang
6ab0730793
remove catboost training dir; ensemble api; blendsearch for hierarchical space; ranking task; forecast improvement (#178)
* remove catboost training dir

* close #48

* bs for hierarchical space. close #85

* retrain for hierarchical space

* clean ml (#180)

Co-authored-by: Qingyun Wu <qxw5138@psu.edu>

* support ranking task

* examples

* cv shuffle

* forecast api and implementation cleaner

* period constraints

* delete groups after fit
2021-09-01 16:25:04 -07:00
Chi Wang
1bc8786dcb
remove big objects after fit (#176)
* remove big objects after fit

* xgboost>1.3.3 has a weird auc score on:
kr-vs-kp, fold 5, 1h1c

* keep_search_state
2021-08-26 13:45:13 -07:00
Qingyun Wu
a229a6112a
Support parallel and add random search (#167)
* non hashable value out of signature

* parallel trials

* add random in _search_parallel

* fix bug in retraining

* check memory constraint before training

* retrain_full

* log custom metric

* retraining budget check

* sample size check before retrain

* remove 'time2eval' from result

* report 'total_search_time' in result

* rename total_search_time to wall_clock_time

* rename train_loss boolean to log_training_metric

* set default train_loss to None

* exclude oom result

* log retrained model

* no subsample

* doc str

* notebook

* predicted value is NaN for sarimax

* version

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
2021-08-23 16:36:51 -07:00
すずまる
6270353458
support ROC and AUC for multi-class classification (#170)
* support ROC and AUC for multi-class classification

* add a test case to cover ROC and AUC for multi-class classification
2021-08-22 15:16:10 -07:00
Qingyun Wu
10082b9262
v0.5.12 (#150)
* remove extra comma

* exclusive bound

* log file name

* add cost to space

* dataset_format

* add load_openml_dataset test

* docstr

* revise test format

* simplify restore

* order categories

* openml server exception in test

* process space

* add warning

* log format

* reduce n_cpu

* nested space

* hierarchical search space for CFO

* non hierarchical for bs

* unflatten hierarchical config

* connection error

* random sample

* config signature

* check ray version

* preprocess numpy array

* catboost preprocess

* time budget

* seed, verbose, hpo_method

* test cfocat

* shallow copy in flatten_dict
prevent lgbm model duplication

* match estimator name

* quantize and log

* test qloguniform and qrandint

* test qlograndint

* thread.running

Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
2021-08-11 23:02:22 -07:00
Qingyun Wu
e24265ee5d
automl fit with starting points (#141)
* add starting point in fit

* add estimator best config

* add test

* add doc string

* when there are multiple points_to_evaluate in CFO, use the best one to start local search; after that, use the low cost partial config as the start point; then remove the points whose performance is worse than the converged one, and start local search from the remaining ones, ordered by their performance.

Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-07-31 13:39:31 -07:00
Chi Wang
15fd8adac4
max_leaves (#138)
* max_leaf_nodes in rf and extra_tree

* preprocess numpy str

* free up mem after training
2021-07-27 18:02:49 -07:00
Chi Wang
b3bb00966d
coverage (#135)
* coverage

* readme

* timeout
2021-07-20 17:00:44 -07:00
Chi Wang
072e9e4588
constraint (#132)
* constraint

* ensemble
2021-07-10 09:02:17 -07:00
Chi Wang
e039861ab0
multiple logged metrics in cv (#114) 2021-06-18 21:19:59 -07:00
Chi Wang
c26720c299
api doc for chacha (#105)
* api doc for chacha

* update params

* link to paper

* update dataset id

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: Qingyun Wu <qiw@microsoft.com>
2021-06-11 10:25:45 -07:00
Chi Wang
f7cf2ea45a
Multiclass (#99)
* utility functions

* stepsize lower bound
2021-06-04 10:31:33 -07:00
Gian Pio Domiziani
c4c15f533f
datetime feature engineering added. (#89)
* datetime feature engineering added.

* check if datetime in columns moved after drop check. Check if the new columns do not already exist.

* check the drop condition before adding new_column. In transform, check directly if new columns are present in num_column.

* check if new_column is in X.columns.

* fixed lint issue. update version to 0.4.1.
2021-05-25 08:30:08 -07:00
Chi Wang
0925e2b308
constraints (#88)
* pre-training constraints

* metric constraints after training
2021-05-18 15:57:42 -07:00
Chi Wang
0b23c3a028
stepsize (#86)
* decrease step size in suggest

* initialization of the counters

* increase step size

* init phase

* check converge in suggest
2021-05-06 21:29:38 -07:00
Gian Pio Domiziani
730fd14ef6
micro/macro f1 metrics added. (#80)
* micro/macro f1 metrics added.

* format lines.
2021-04-26 14:50:41 -04:00
Gian Pio Domiziani
068fb9f5c2
X.copy() in the process method (#78)
* X.copy() in the transformer method.

* update version 0.3.4
2021-04-23 17:14:29 -07:00
Gian Pio Domiziani
ad42889a3b
datetime columns preprocess for validation data fixed. (#73)
* datetime columns preprocess for validation data fixed.

* code line formatted.
2021-04-21 10:22:54 -04:00
Qingyun Wu
06045703bf
Lgbm w customized obj (#64)
* add customized lgbm learner

* add comments

* fix format issue

* format

* OpenMLError

* add test

* add notebook

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
2021-04-10 21:14:28 -04:00
Chi Wang
97a7c114ee
Issue58 (#59)
* iter per learner

* code cleanup
2021-04-08 09:29:55 -07:00
Chi Wang
37d7518a4c
sample weight in xgboost (#54) 2021-03-31 22:11:56 -07:00
Chi Wang
ae5f8e5426
data validation (#45)
* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* prevent divide by zero

* test roberta

* BlendSearchTuner

* delta time

* reindex columns when dropping int-indexed columns

* test drop columns and small training data

* param set for ensemble builder

* fillna on copy

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
2021-03-19 09:50:47 -07:00
Chi Wang
4a8110c87b
pickle the AutoML object (#37)
* pickle the AutoML object

* get best model per estimator

* test deberta

* stateless API

* Add Gitter badge (#41)

* prevent divide by zero

* test roberta

* BlendSearchTuner

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: The Gitter Badger <badger@gitter.im>
2021-03-16 22:13:35 -07:00
Chi Wang
6ff0ed434b
v0.2.5 (#30)
* test distilbert

* import check

* complete partial config

* None check

* init config is not suggested by bo

* badge

* notebook for lightgbm
2021-02-22 22:10:41 -08:00
Chi Wang
776aa55189
V0.2.2 (#19)
* v0.2.2

separate the HPO part into the module flaml.tune
enhanced implementation of FLOW^2, CFO and BlendSearch
support parallel tuning using ray tune
add support for sample_weight and generic fit arguments
enable mlflow logging

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: qingyun-wu <qw2ky@virginia.edu>
2021-02-05 21:41:14 -08:00
Chi Wang
cb5ce4e3a6
v0.1.3 Set default logging level to INFO (#14)
* set default logging level to INFO

* remove unnecessary import

* API future compatibility

* add test for customized learner

* test dependency

Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
2020-12-15 08:10:43 -08:00
Eric Zhu
4ce908f42e
Fix #11; add tests for training log and python logger (#12) 2020-12-14 23:10:03 -08:00
Chi Wang (MSR)
492990655d
v0.1.0 2020-12-04 09:40:27 -08:00