* non-hashable value out of signature
* parallel trials
* add random in _search_parallel
* fix bug in retraining
* check memory constraint before training
* retrain_full
* log custom metric
* retraining budget check
* sample size check before retrain
* remove 'time2eval' from result
* report 'total_search_time' in result
* rename total_search_time to wall_clock_time
* rename train_loss boolean to log_training_metric
* set default train_loss to None
* exclude oom result
* log retrained model
* no subsample
* docstring
* notebook
* predicted value is NaN for sarimax
* version
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qxw5138@psu.edu>
* remove extra comma
* exclusive bound
* log file name
* add cost to space
* dataset_format
* add load_openml_dataset test
* docstring
* revise test format
* simplify restore
* order categories
* openml server exception in test
* process space
* add warning
* log format
* reduce n_cpu
* nested space
* hierarchical search space for CFO
* non hierarchical for bs
* unflatten hierarchical config
* connection error
* random sample
* config signature
* check ray version
* preprocess numpy array
* catboost preprocess
* time budget
* seed, verbose, hpo_method
* test cfocat
* shallow copy in flatten_dict
prevent lgbm model duplication
* match estimator name
* quantize and log
* test qloguniform and qrandint
* test qlograndint
* thread.running
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
* add starting point in fit
* add estimator best config
* add test
* add doc string
* when there are multiple points_to_evaluate in CFO, use the best one to start local search; after that, use the low-cost partial config as the start point; then remove the points whose performance is worse than the converged point, and start local search from the remaining ones, ordered by their performance.
Co-authored-by: Qingyun Wu <qingyunwu@Qingyuns-MacBook-Pro-2.local>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
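
The start-point ordering described above for CFO with multiple points_to_evaluate can be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: the function names, the `(config, loss)` pair representation, and the single `converged_loss` value are all assumptions, and lower loss is assumed to mean better performance.

```python
from typing import Dict, List, Tuple

Config = Dict[str, float]
Evaluated = Tuple[Config, float]  # (config, observed loss); lower is better (assumption)


def first_start_point(evaluated: List[Evaluated]) -> Evaluated:
    """Pick the evaluated point with the lowest loss to start local search."""
    return min(evaluated, key=lambda pair: pair[1])


def remaining_start_points(
    evaluated: List[Evaluated], first: Evaluated, converged_loss: float
) -> List[Evaluated]:
    """Drop points worse than the converged loss; order the rest best-first."""
    survivors = [
        pair for pair in evaluated
        if pair is not first and pair[1] <= converged_loss
    ]
    return sorted(survivors, key=lambda pair: pair[1])


def start_point_schedule(
    evaluated: List[Evaluated], low_cost_config: Config, converged_loss: float
) -> List[Config]:
    """Hypothetical helper: the order in which local search would be started.

    1. the best of the evaluated points,
    2. the low-cost partial config,
    3. the surviving evaluated points, ordered by performance.
    """
    first = first_start_point(evaluated)
    rest = remaining_start_points(evaluated, first, converged_loss)
    return [first[0], low_cost_config] + [config for config, _ in rest]


evaluated_points = [
    ({"x": 1.0}, 0.5),
    ({"x": 2.0}, 0.2),
    ({"x": 3.0}, 0.9),
    ({"x": 4.0}, 0.3),
]
schedule = start_point_schedule(
    evaluated_points, low_cost_config={"x": 0.0}, converged_loss=0.4
)
```

In an actual search the converged loss would only be known after a local search finishes, so the filtering step would run at that moment rather than up front as it does in this sketch.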
* api doc for chacha
* update params
* link to paper
* update dataset id
Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: Qingyun Wu <qiw@microsoft.com>
* datetime feature engineering added.
* move the datetime-in-columns check after the drop check. Check that the new columns do not already exist.
* check the drop condition before adding new_column. In transform, check directly whether the new columns are present in num_column.
* check if new_column is in X.columns.
* fixed lint issue. update version to 0.4.1.
* add customized lgbm learner
* add comments
* fix format issue
* format
* OpenMLError
* add test
* add notebook
Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
* pickle the AutoML object
* get best model per estimator
* test deberta
* stateless API
* prevent divide by zero
* test roberta
* BlendSearchTuner
* delta time
* reindex columns when dropping int-indexed columns
* test drop columns and small training data
* param set for ensemble builder
* fillna on copy
Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
* pickle the AutoML object
* get best model per estimator
* test deberta
* stateless API
* Add Gitter badge (#41)
* prevent divide by zero
* test roberta
* BlendSearchTuner
Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: The Gitter Badger <badger@gitter.im>
* v0.2.2
separate the HPO part into the module flaml.tune
enhanced implementation of FLOW^2, CFO and BlendSearch
support parallel tuning using ray tune
add support for sample_weight and generic fit arguments
enable mlflow logging
Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>
Co-authored-by: qingyun-wu <qw2ky@virginia.edu>
* set default logging level to INFO
* remove unnecessary import
* API future compatibility
* add test for customized learner
* test dependency
Co-authored-by: Chi Wang (MSR) <chiw@microsoft.com>