{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Copyright (c) 2020-2021 Microsoft Corporation. All rights reserved. \n",
"\n",
"Licensed under the MIT License.\n",
"\n",
"# Tune LightGBM with FLAML Library\n",
"\n",
"\n",
"## 1. Introduction\n",
"\n",
"FLAML is a Python library (https://github.com/microsoft/FLAML) designed to automatically produce accurate machine learning models \n",
"at low computational cost. Its simple, lightweight design makes it easy \n",
"to use and to extend, for example by adding new learners. FLAML can \n",
"- serve as an economical AutoML engine,\n",
"- be used as a fast hyperparameter tuning tool, or \n",
"- be embedded in self-tuning software that requires low latency and low resource usage in repetitive\n",
" tuning tasks.\n",
"\n",
"In this notebook, we demonstrate how to use the FLAML library to tune the hyperparameters of LightGBM on a regression example.\n",
"\n",
"FLAML requires `Python>=3.6`. To run this notebook example, please install flaml with the `notebook` option:\n",
"```bash\n",
"pip install flaml[notebook]\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install flaml[notebook];"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 2. Regression Example\n",
"### Load data and preprocess\n",
"\n",
"Download the [houses dataset](https://www.openml.org/d/537) from OpenML. The task is to predict the median house price in a region from its demographic composition and the state of its housing market."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"load dataset from ./openml_ds537.pkl\n",
"Dataset name: houses\n",
"X_train.shape: (15480, 8), y_train.shape: (15480,);\n",
"X_test.shape: (5160, 8), y_test.shape: (5160,)\n"
]
}
],
"source": [
"from flaml.data import load_openml_dataset\n",
"X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=537, data_dir='./')"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Run FLAML\n",
"In the FLAML AutoML run configuration, users can specify the task type, time budget, error metric, learner list, whether to subsample, the resampling strategy, and so on. Each of these arguments has a default value, which is used when the user does not provide one."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [],
"source": [
"''' import AutoML class from flaml package '''\n",
"from flaml import AutoML\n",
"automl = AutoML()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"settings = {\n",
" \"time_budget\": 150, # total running time in seconds\n",
" \"metric\": 'r2', # primary metrics for regression can be chosen from: ['mae','mse','r2']\n",
" \"estimator_list\": ['lgbm'], # list of ML learners; we tune lightgbm in this example\n",
" \"task\": 'regression', # task type \n",
" \"log_file_name\": 'houses_experiment.log', # flaml log file\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 07-24 13:49:03] {912} INFO - Evaluation method: cv\n",
"[flaml.automl: 07-24 13:49:03] {616} INFO - Using RepeatedKFold\n",
"[flaml.automl: 07-24 13:49:03] {933} INFO - Minimizing error metric: 1-r2\n",
"[flaml.automl: 07-24 13:49:03] {952} INFO - List of ML learners in AutoML Run: ['lgbm']\n",
"[flaml.automl: 07-24 13:49:03] {1018} INFO - iteration 0, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:03] {1173} INFO - at 0.5s,\tbest lgbm's error=0.7385,\tbest lgbm's error=0.7385\n",
"[flaml.automl: 07-24 13:49:03] {1018} INFO - iteration 1, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:03] {1173} INFO - at 0.7s,\tbest lgbm's error=0.7385,\tbest lgbm's error=0.7385\n",
"[flaml.automl: 07-24 13:49:03] {1018} INFO - iteration 2, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:03] {1173} INFO - at 0.9s,\tbest lgbm's error=0.5520,\tbest lgbm's error=0.5520\n",
"[flaml.automl: 07-24 13:49:03] {1018} INFO - iteration 3, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:04] {1173} INFO - at 1.1s,\tbest lgbm's error=0.3886,\tbest lgbm's error=0.3886\n",
"[flaml.automl: 07-24 13:49:04] {1018} INFO - iteration 4, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:04] {1173} INFO - at 1.2s,\tbest lgbm's error=0.3886,\tbest lgbm's error=0.3886\n",
"[flaml.automl: 07-24 13:49:04] {1018} INFO - iteration 5, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:04] {1173} INFO - at 1.4s,\tbest lgbm's error=0.3886,\tbest lgbm's error=0.3886\n",
"[flaml.automl: 07-24 13:49:04] {1018} INFO - iteration 6, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:04] {1173} INFO - at 1.6s,\tbest lgbm's error=0.3023,\tbest lgbm's error=0.3023\n",
"[flaml.automl: 07-24 13:49:04] {1018} INFO - iteration 7, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:04] {1173} INFO - at 1.9s,\tbest lgbm's error=0.2611,\tbest lgbm's error=0.2611\n",
"[flaml.automl: 07-24 13:49:04] {1018} INFO - iteration 8, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:05] {1173} INFO - at 2.1s,\tbest lgbm's error=0.2611,\tbest lgbm's error=0.2611\n",
"[flaml.automl: 07-24 13:49:05] {1018} INFO - iteration 9, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:05] {1173} INFO - at 2.4s,\tbest lgbm's error=0.2363,\tbest lgbm's error=0.2363\n",
"[flaml.automl: 07-24 13:49:05] {1018} INFO - iteration 10, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:05] {1173} INFO - at 2.6s,\tbest lgbm's error=0.2363,\tbest lgbm's error=0.2363\n",
"[flaml.automl: 07-24 13:49:05] {1018} INFO - iteration 11, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:05] {1173} INFO - at 2.8s,\tbest lgbm's error=0.2363,\tbest lgbm's error=0.2363\n",
"[flaml.automl: 07-24 13:49:05] {1018} INFO - iteration 12, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:06] {1173} INFO - at 3.2s,\tbest lgbm's error=0.1953,\tbest lgbm's error=0.1953\n",
"[flaml.automl: 07-24 13:49:06] {1018} INFO - iteration 13, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:06] {1173} INFO - at 3.5s,\tbest lgbm's error=0.1953,\tbest lgbm's error=0.1953\n",
"[flaml.automl: 07-24 13:49:06] {1018} INFO - iteration 14, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:06] {1173} INFO - at 3.9s,\tbest lgbm's error=0.1953,\tbest lgbm's error=0.1953\n",
"[flaml.automl: 07-24 13:49:06] {1018} INFO - iteration 15, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:07] {1173} INFO - at 4.1s,\tbest lgbm's error=0.1953,\tbest lgbm's error=0.1953\n",
"[flaml.automl: 07-24 13:49:07] {1018} INFO - iteration 16, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:07] {1173} INFO - at 4.5s,\tbest lgbm's error=0.1953,\tbest lgbm's error=0.1953\n",
"[flaml.automl: 07-24 13:49:07] {1018} INFO - iteration 17, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:07] {1173} INFO - at 4.8s,\tbest lgbm's error=0.1953,\tbest lgbm's error=0.1953\n",
"[flaml.automl: 07-24 13:49:07] {1018} INFO - iteration 18, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:08] {1173} INFO - at 5.5s,\tbest lgbm's error=0.1795,\tbest lgbm's error=0.1795\n",
"[flaml.automl: 07-24 13:49:08] {1018} INFO - iteration 19, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:08] {1173} INFO - at 5.7s,\tbest lgbm's error=0.1795,\tbest lgbm's error=0.1795\n",
"[flaml.automl: 07-24 13:49:08] {1018} INFO - iteration 20, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:11] {1173} INFO - at 8.2s,\tbest lgbm's error=0.1795,\tbest lgbm's error=0.1795\n",
"[flaml.automl: 07-24 13:49:11] {1018} INFO - iteration 21, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:11] {1173} INFO - at 8.5s,\tbest lgbm's error=0.1795,\tbest lgbm's error=0.1795\n",
"[flaml.automl: 07-24 13:49:11] {1018} INFO - iteration 22, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:12] {1173} INFO - at 9.6s,\tbest lgbm's error=0.1768,\tbest lgbm's error=0.1768\n",
"[flaml.automl: 07-24 13:49:12] {1018} INFO - iteration 23, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:13] {1173} INFO - at 10.1s,\tbest lgbm's error=0.1768,\tbest lgbm's error=0.1768\n",
"[flaml.automl: 07-24 13:49:13] {1018} INFO - iteration 24, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:14] {1173} INFO - at 11.1s,\tbest lgbm's error=0.1768,\tbest lgbm's error=0.1768\n",
"[flaml.automl: 07-24 13:49:14] {1018} INFO - iteration 25, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:14] {1173} INFO - at 11.4s,\tbest lgbm's error=0.1768,\tbest lgbm's error=0.1768\n",
"[flaml.automl: 07-24 13:49:14] {1018} INFO - iteration 26, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:18] {1173} INFO - at 15.6s,\tbest lgbm's error=0.1652,\tbest lgbm's error=0.1652\n",
"[flaml.automl: 07-24 13:49:18] {1018} INFO - iteration 27, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:21] {1173} INFO - at 18.5s,\tbest lgbm's error=0.1652,\tbest lgbm's error=0.1652\n",
"[flaml.automl: 07-24 13:49:21] {1018} INFO - iteration 28, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:28] {1173} INFO - at 25.5s,\tbest lgbm's error=0.1642,\tbest lgbm's error=0.1642\n",
"[flaml.automl: 07-24 13:49:28] {1018} INFO - iteration 29, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:29] {1173} INFO - at 26.5s,\tbest lgbm's error=0.1642,\tbest lgbm's error=0.1642\n",
"[flaml.automl: 07-24 13:49:29] {1018} INFO - iteration 30, current learner lgbm\n",
"[flaml.automl: 07-24 13:49:59] {1173} INFO - at 56.8s,\tbest lgbm's error=0.1642,\tbest lgbm's error=0.1642\n",
"[flaml.automl: 07-24 13:49:59] {1018} INFO - iteration 31, current learner lgbm\n",
"[flaml.automl: 07-24 13:50:05] {1173} INFO - at 62.1s,\tbest lgbm's error=0.1622,\tbest lgbm's error=0.1622\n",
"[flaml.automl: 07-24 13:50:05] {1018} INFO - iteration 32, current learner lgbm\n",
"[flaml.automl: 07-24 13:50:13] {1173} INFO - at 70.4s,\tbest lgbm's error=0.1622,\tbest lgbm's error=0.1622\n",
"[flaml.automl: 07-24 13:50:13] {1018} INFO - iteration 33, current learner lgbm\n",
"[flaml.automl: 07-24 13:50:18] {1173} INFO - at 75.6s,\tbest lgbm's error=0.1622,\tbest lgbm's error=0.1622\n",
"[flaml.automl: 07-24 13:50:18] {1018} INFO - iteration 34, current learner lgbm\n",
"[flaml.automl: 07-24 13:50:22] {1173} INFO - at 79.3s,\tbest lgbm's error=0.1622,\tbest lgbm's error=0.1622\n",
"[flaml.automl: 07-24 13:50:22] {1018} INFO - iteration 35, current learner lgbm\n",
"[flaml.automl: 07-24 13:50:49] {1173} INFO - at 106.3s,\tbest lgbm's error=0.1622,\tbest lgbm's error=0.1622\n",
"[flaml.automl: 07-24 13:50:49] {1018} INFO - iteration 36, current learner lgbm\n",
"[flaml.automl: 07-24 13:50:49] {1173} INFO - at 107.0s,\tbest lgbm's error=0.1622,\tbest lgbm's error=0.1622\n",
"[flaml.automl: 07-24 13:50:49] {1018} INFO - iteration 37, current learner lgbm\n",
"[flaml.automl: 07-24 13:50:54] {1173} INFO - at 112.0s,\tbest lgbm's error=0.1611,\tbest lgbm's error=0.1611\n",
"[flaml.automl: 07-24 13:50:54] {1018} INFO - iteration 38, current learner lgbm\n",
"[flaml.automl: 07-24 13:50:59] {1173} INFO - at 116.0s,\tbest lgbm's error=0.1611,\tbest lgbm's error=0.1611\n",
"[flaml.automl: 07-24 13:50:59] {1018} INFO - iteration 39, current learner lgbm\n",
"[flaml.automl: 07-24 13:51:06] {1173} INFO - at 123.3s,\tbest lgbm's error=0.1611,\tbest lgbm's error=0.1611\n",
"[flaml.automl: 07-24 13:51:06] {1018} INFO - iteration 40, current learner lgbm\n",
"[flaml.automl: 07-24 13:51:08] {1173} INFO - at 126.0s,\tbest lgbm's error=0.1611,\tbest lgbm's error=0.1611\n",
"[flaml.automl: 07-24 13:51:08] {1018} INFO - iteration 41, current learner lgbm\n",
"[flaml.automl: 07-24 13:51:11] {1173} INFO - at 128.1s,\tbest lgbm's error=0.1611,\tbest lgbm's error=0.1611\n",
"[flaml.automl: 07-24 13:51:11] {1018} INFO - iteration 42, current learner lgbm\n",
"[flaml.automl: 07-24 13:51:18] {1173} INFO - at 135.4s,\tbest lgbm's error=0.1611,\tbest lgbm's error=0.1611\n",
"[flaml.automl: 07-24 13:51:18] {1018} INFO - iteration 43, current learner lgbm\n",
"[flaml.automl: 07-24 13:51:20] {1173} INFO - at 137.7s,\tbest lgbm's error=0.1611,\tbest lgbm's error=0.1611\n",
"[flaml.automl: 07-24 13:51:20] {1018} INFO - iteration 44, current learner lgbm\n",
"[flaml.automl: 07-24 13:51:27] {1173} INFO - at 144.3s,\tbest lgbm's error=0.1611,\tbest lgbm's error=0.1611\n",
"[flaml.automl: 07-24 13:51:27] {1018} INFO - iteration 45, current learner lgbm\n",
"[flaml.automl: 07-24 13:51:32] {1173} INFO - at 149.4s,\tbest lgbm's error=0.1611,\tbest lgbm's error=0.1611\n",
"[flaml.automl: 07-24 13:51:32] {1219} INFO - selected model: LGBMRegressor(colsample_bytree=0.788228718184241,\n",
" learning_rate=0.08917691724022275, max_bin=256,\n",
" min_child_samples=64, n_estimators=157, num_leaves=4886,\n",
" objective='regression', reg_alpha=0.042293060180467086,\n",
" reg_lambda=95.16149755350158, subsample=0.8278302514488655)\n",
"[flaml.automl: 07-24 13:51:32] {969} INFO - fit succeeded\n"
]
}
],
"source": [
"'''The main flaml automl API'''\n",
"automl.fit(X_train=X_train, y_train=y_train, **settings)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Best model and metric"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best hyperparameter config: {'n_estimators': 157, 'num_leaves': 4886, 'min_child_samples': 64, 'learning_rate': 0.08917691724022275, 'subsample': 0.8278302514488655, 'log_max_bin': 9, 'colsample_bytree': 0.788228718184241, 'reg_alpha': 0.042293060180467086, 'reg_lambda': 95.16149755350158}\n",
"Best r2 on validation data: 0.8389\n",
"Training duration of best run: 4.971 s\n"
]
}
],
"source": [
"''' retrieve best config'''\n",
"print('Best hyperparameter config:', automl.best_config)\n",
"print('Best r2 on validation data: {0:.4g}'.format(1-automl.best_loss))\n",
"print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))"
]
},
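{
"cell_type": "markdown",
"metadata": {},
"source": [
"FLAML minimizes an error, which is why the log reports `Minimizing error metric: 1-r2` and why the r2 score is recovered as `1-automl.best_loss`. The snippet below is a minimal plain-Python sketch of that relationship, not FLAML's own implementation; the helper names `r2_score` and `r2_loss` and the toy data are assumptions made up for illustration.\n",
"\n",
"```python\n",
"# Illustrative helpers (hypothetical names, not part of FLAML's API).\n",
"def r2_score(y_true, y_pred):\n",
"    # Coefficient of determination: 1 - SS_res / SS_tot.\n",
"    mean_y = sum(y_true) / len(y_true)\n",
"    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))\n",
"    ss_tot = sum((t - mean_y) ** 2 for t in y_true)\n",
"    return 1 - ss_res / ss_tot\n",
"\n",
"\n",
"def r2_loss(y_true, y_pred):\n",
"    # The quantity a minimizer would see: lower is better.\n",
"    return 1 - r2_score(y_true, y_pred)\n",
"\n",
"\n",
"# Toy data, made up for this example.\n",
"y_true = [1.0, 2.0, 3.0, 4.0]\n",
"y_pred = [1.0, 2.0, 3.0, 5.0]\n",
"loss = r2_loss(y_true, y_pred)\n",
"print('loss:', round(loss, 4), 'r2:', round(1 - loss, 4))\n",
"```\n",
"\n",
"A perfect prediction gives a loss of 0, and a model that is no better than predicting the mean gives a loss of 1."
]
},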
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"LGBMRegressor(colsample_bytree=0.788228718184241,\n",
" learning_rate=0.08917691724022275, max_bin=256,\n",
" min_child_samples=64, n_estimators=157, num_leaves=4886,\n",
" objective='regression', reg_alpha=0.042293060180467086,\n",
" reg_lambda=95.16149755350158, subsample=0.8278302514488655)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"automl.model.estimator"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"''' pickle and save the automl object '''\n",
"import pickle\n",
"with open('automl.pkl', 'wb') as f:\n",
" pickle.dump(automl, f, pickle.HIGHEST_PROTOCOL)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Predicted labels [143563.86395674 254576.18760499 144158.79619969 ... 196049.43993507\n",
" 252324.54317706 279607.98371458]\n",
"True labels 14740 136900.0\n",
"10101 241300.0\n",
"20566 200700.0\n",
"2670 72500.0\n",
"15709 460000.0\n",
" ... \n",
"13132 121200.0\n",
"8228 137500.0\n",
"3948 160900.0\n",
"8522 227300.0\n",
"16798 265600.0\n",
"Name: median_house_value, Length: 5160, dtype: float64\n"
]
}
],
"source": [
"''' compute predictions of testing dataset ''' \n",
"y_pred = automl.predict(X_test)\n",
"print('Predicted labels', y_pred)\n",
"print('True labels', y_test)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"r2 = 0.8467624164909245\n",
"mse = 2025572000.0048184\n",
"mae = 29845.819846911687\n"
]
}
],
"source": [
"''' compute different metric values on testing dataset'''\n",
"from flaml.ml import sklearn_metric_loss_score\n",
"print('r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))\n",
"print('mse', '=', sklearn_metric_loss_score('mse', y_pred, y_test))\n",
"print('mae', '=', sklearn_metric_loss_score('mae', y_pred, y_test))"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20, 'learning_rate': 0.1, 'subsample': 1.0, 'log_max_bin': 8, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 1.0}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20, 'learning_rate': 0.1, 'subsample': 1.0, 'log_max_bin': 8, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 1.0}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 12, 'learning_rate': 0.25912534572860507, 'subsample': 0.9266743941610592, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0013933617380144255, 'reg_lambda': 0.18096917948292954}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 12, 'learning_rate': 0.25912534572860507, 'subsample': 0.9266743941610592, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0013933617380144255, 'reg_lambda': 0.18096917948292954}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 24, 'learning_rate': 1.0, 'subsample': 0.8513627344387318, 'log_max_bin': 10, 'colsample_bytree': 0.946138073111236, 'reg_alpha': 0.0018311776973217071, 'reg_lambda': 0.27901659190538414}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 24, 'learning_rate': 1.0, 'subsample': 0.8513627344387318, 'log_max_bin': 10, 'colsample_bytree': 0.946138073111236, 'reg_alpha': 0.0018311776973217071, 'reg_lambda': 0.27901659190538414}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 11, 'num_leaves': 4, 'min_child_samples': 36, 'learning_rate': 1.0, 'subsample': 0.8894434216129232, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0013605736901132325, 'reg_lambda': 0.1222158118565165}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 11, 'num_leaves': 4, 'min_child_samples': 36, 'learning_rate': 1.0, 'subsample': 0.8894434216129232, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0013605736901132325, 'reg_lambda': 0.1222158118565165}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 20, 'num_leaves': 4, 'min_child_samples': 46, 'learning_rate': 1.0, 'subsample': 0.9814787163243813, 'log_max_bin': 9, 'colsample_bytree': 0.8499027725496043, 'reg_alpha': 0.0022085340760961856, 'reg_lambda': 0.546062702473889}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 20, 'num_leaves': 4, 'min_child_samples': 46, 'learning_rate': 1.0, 'subsample': 0.9814787163243813, 'log_max_bin': 9, 'colsample_bytree': 0.8499027725496043, 'reg_alpha': 0.0022085340760961856, 'reg_lambda': 0.546062702473889}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 20, 'num_leaves': 11, 'min_child_samples': 52, 'learning_rate': 1.0, 'subsample': 1.0, 'log_max_bin': 9, 'colsample_bytree': 0.7967145599266738, 'reg_alpha': 0.05680749758595097, 'reg_lambda': 2.756357095973371}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 20, 'num_leaves': 11, 'min_child_samples': 52, 'learning_rate': 1.0, 'subsample': 1.0, 'log_max_bin': 9, 'colsample_bytree': 0.7967145599266738, 'reg_alpha': 0.05680749758595097, 'reg_lambda': 2.756357095973371}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 37, 'num_leaves': 15, 'min_child_samples': 93, 'learning_rate': 0.6413547778096401, 'subsample': 1.0, 'log_max_bin': 9, 'colsample_bytree': 0.6980216487058154, 'reg_alpha': 0.020158745350617662, 'reg_lambda': 0.954042157679914}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 37, 'num_leaves': 15, 'min_child_samples': 93, 'learning_rate': 0.6413547778096401, 'subsample': 1.0, 'log_max_bin': 9, 'colsample_bytree': 0.6980216487058154, 'reg_alpha': 0.020158745350617662, 'reg_lambda': 0.954042157679914}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 75, 'num_leaves': 32, 'min_child_samples': 83, 'learning_rate': 0.19997653978110663, 'subsample': 0.8895588746662894, 'log_max_bin': 7, 'colsample_bytree': 0.663557757490723, 'reg_alpha': 0.03147131714846291, 'reg_lambda': 0.38644069375879475}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 75, 'num_leaves': 32, 'min_child_samples': 83, 'learning_rate': 0.19997653978110663, 'subsample': 0.8895588746662894, 'log_max_bin': 7, 'colsample_bytree': 0.663557757490723, 'reg_alpha': 0.03147131714846291, 'reg_lambda': 0.38644069375879475}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 81, 'num_leaves': 66, 'min_child_samples': 93, 'learning_rate': 0.07560024606664352, 'subsample': 0.8756054034199897, 'log_max_bin': 7, 'colsample_bytree': 0.7142272555842307, 'reg_alpha': 0.00219854653612346, 'reg_lambda': 2.9360090402842274}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 81, 'num_leaves': 66, 'min_child_samples': 93, 'learning_rate': 0.07560024606664352, 'subsample': 0.8756054034199897, 'log_max_bin': 7, 'colsample_bytree': 0.7142272555842307, 'reg_alpha': 0.00219854653612346, 'reg_lambda': 2.9360090402842274}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 283, 'num_leaves': 171, 'min_child_samples': 128, 'learning_rate': 0.056885026855831654, 'subsample': 0.9152991332236934, 'log_max_bin': 8, 'colsample_bytree': 0.7103230835995594, 'reg_alpha': 0.012993197803320033, 'reg_lambda': 7.0529810054461715}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 283, 'num_leaves': 171, 'min_child_samples': 128, 'learning_rate': 0.056885026855831654, 'subsample': 0.9152991332236934, 'log_max_bin': 8, 'colsample_bytree': 0.7103230835995594, 'reg_alpha': 0.012993197803320033, 'reg_lambda': 7.0529810054461715}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 363, 'num_leaves': 269, 'min_child_samples': 128, 'learning_rate': 0.13082160708847235, 'subsample': 0.820105567300051, 'log_max_bin': 10, 'colsample_bytree': 0.6819303877749074, 'reg_alpha': 0.03805198795768637, 'reg_lambda': 18.14103139151093}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 363, 'num_leaves': 269, 'min_child_samples': 128, 'learning_rate': 0.13082160708847235, 'subsample': 0.820105567300051, 'log_max_bin': 10, 'colsample_bytree': 0.6819303877749074, 'reg_alpha': 0.03805198795768637, 'reg_lambda': 18.14103139151093}}\n"
]
}
],
"source": [
"from flaml.data import get_output_from_log\n",
"time_history, best_valid_loss_history, valid_loss_history, config_history, train_loss_history = \\\n",
" get_output_from_log(filename=settings['log_file_name'], time_budget=60)\n",
"\n",
"for config in config_history:\n",
" print(config)"
]
},
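{
"cell_type": "markdown",
"metadata": {},
"source": [
"The histories returned by `get_output_from_log` are losses (`1-r2`), which is why the plotting code converts them back with `1 - np.array(...)`. As a rough sketch, assuming `best_valid_loss_history` is the running minimum of the per-trial validation losses (an assumption about the log contents that this notebook does not verify), a best-so-far curve could be rebuilt as follows. The loss values are made up for illustration.\n",
"\n",
"```python\n",
"from itertools import accumulate\n",
"\n",
"# Hypothetical per-trial validation losses (1 - r2), in trial order.\n",
"valid_loss = [0.74, 0.80, 0.55, 0.60, 0.39]\n",
"\n",
"# Running minimum: the best loss seen up to and including each trial.\n",
"best_so_far = list(accumulate(valid_loss, min))\n",
"\n",
"# Convert the losses back to r2 scores, as the plotting code does.\n",
"best_r2 = [round(1 - loss, 2) for loss in best_so_far]\n",
"\n",
"print(best_so_far)\n",
"print(best_r2)\n",
"```\n",
"\n",
"Because the running minimum never increases, the corresponding r2 curve never decreases, which suits a step plot."
]
},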
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"plt.title('Learning Curve')\n",
"plt.xlabel('Wall Clock Time (s)')\n",
"plt.ylabel('Validation r2')\n",
"plt.scatter(time_history, 1 - np.array(valid_loss_history))\n",
"plt.step(time_history, 1 - np.array(best_valid_loss_history), where='post')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Comparison with alternatives\n",
"\n",
"### FLAML's accuracy"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"flaml r2 = 0.8467624164909245\n"
]
}
],
"source": [
"print('flaml r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Default LightGBM"
]
},
{
"cell_type": "code",
2021-05-08 02:50:50 +00:00
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"from lightgbm import LGBMRegressor\n",
"lgbm = LGBMRegressor()"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LGBMRegressor()"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lgbm.fit(X_train, y_train)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"default lgbm r2 = 0.8296179648694404\n"
]
}
],
"source": [
"from flaml.ml import sklearn_metric_loss_score\n",
"\n",
"y_pred = lgbm.predict(X_test)\n",
"print('default lgbm r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Optuna LightGBM Tuner"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"# !pip install optuna==2.8.0;"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"import optuna.integration.lightgbm as lgb\n",
"\n",
"# hold out 10% of the training data as a validation set for tuning\n",
"train_x, val_x, train_y, val_y = train_test_split(X_train, y_train, test_size=0.1)\n",
"dtrain = lgb.Dataset(train_x, label=train_y)\n",
"dval = lgb.Dataset(val_x, label=val_y)\n",
"params = {\n",
"    \"objective\": \"regression\",\n",
"    \"metric\": \"regression\",\n",
"    \"verbosity\": -1,\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"tags": [
"outputPrepend"
]
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[32m[I 2021-07-24 13:51:37,223]\u001b[0m A new study created in memory with name: no-name-a9ad03a9-1a95-4cb8-903f-402d3a214640\u001b[0m\n",
"feature_fraction, val_score: 2251528152.166571: 14%|#4 | 1/7 [00:01<00:08, 1.42s/it]\u001b[32m[I 2021-07-24 13:51:38,667]\u001b[0m Trial 0 finished with value: 2251528152.166571 and parameters: {'feature_fraction': 0.5}. Best is trial 0 with value: 2251528152.166571.\u001b[0m\n",
"feature_fraction, val_score: 2251528152.166571: 29%|##8 | 2/7 [00:03<00:07, 1.47s/it]\u001b[32m[I 2021-07-24 13:51:40,257]\u001b[0m Trial 1 finished with value: 2253578981.8374605 and parameters: {'feature_fraction': 0.6}. Best is trial 0 with value: 2251528152.166571.\u001b[0m\n",
"feature_fraction, val_score: 2251528152.166571: 43%|####2 | 3/7 [00:04<00:06, 1.54s/it]\u001b[32m[I 2021-07-24 13:51:41,944]\u001b[0m Trial 2 finished with value: 2293115294.9762316 and parameters: {'feature_fraction': 1.0}. Best is trial 0 with value: 2251528152.166571.\u001b[0m\n",
"feature_fraction, val_score: 2221143011.565778: 57%|#####7 | 4/7 [00:06<00:04, 1.52s/it]\u001b[32m[I 2021-07-24 13:51:43,420]\u001b[0m Trial 3 finished with value: 2221143011.5657783 and parameters: {'feature_fraction': 0.7}. Best is trial 3 with value: 2221143011.5657783.\u001b[0m\n",
"feature_fraction, val_score: 2221143011.565778: 71%|#######1 | 5/7 [00:07<00:03, 1.51s/it]\u001b[32m[I 2021-07-24 13:51:44,892]\u001b[0m Trial 4 finished with value: 2221143011.5657783 and parameters: {'feature_fraction': 0.8}. Best is trial 3 with value: 2221143011.5657783.\u001b[0m\n",
"feature_fraction, val_score: 2221143011.565778: 86%|########5 | 6/7 [00:09<00:01, 1.51s/it]\u001b[32m[I 2021-07-24 13:51:46,405]\u001b[0m Trial 5 finished with value: 2491490333.842695 and parameters: {'feature_fraction': 0.4}. Best is trial 3 with value: 2221143011.5657783.\u001b[0m\n",
"feature_fraction, val_score: 2193757572.841483: 100%|##########| 7/7 [00:10<00:00, 1.52s/it]\u001b[32m[I 2021-07-24 13:51:47,947]\u001b[0m Trial 6 finished with value: 2193757572.841483 and parameters: {'feature_fraction': 0.8999999999999999}. Best is trial 6 with value: 2193757572.841483.\u001b[0m\n",
"feature_fraction, val_score: 2193757572.841483: 100%|##########| 7/7 [00:10<00:00, 1.53s/it]\n",
"num_leaves, val_score: 2193757572.841483: 5%|5 | 1/20 [00:06<01:55, 6.10s/it]\u001b[32m[I 2021-07-24 13:51:54,053]\u001b[0m Trial 7 finished with value: 2248042974.885056 and parameters: {'num_leaves': 163}. Best is trial 7 with value: 2248042974.885056.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 10%|# | 2/20 [00:09<01:36, 5.36s/it]\u001b[32m[I 2021-07-24 13:51:57,685]\u001b[0m Trial 8 finished with value: 2202201580.7993436 and parameters: {'num_leaves': 88}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 15%|#5 | 3/20 [00:18<01:46, 6.24s/it]\u001b[32m[I 2021-07-24 13:52:05,988]\u001b[0m Trial 9 finished with value: 2245590498.6014037 and parameters: {'num_leaves': 191}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 20%|## | 4/20 [00:25<01:44, 6.53s/it]\u001b[32m[I 2021-07-24 13:52:13,177]\u001b[0m Trial 10 finished with value: 2313837552.1304107 and parameters: {'num_leaves': 179}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 25%|##5 | 5/20 [00:27<01:19, 5.28s/it]\u001b[32m[I 2021-07-24 13:52:15,543]\u001b[0m Trial 11 finished with value: 2292271962.1367116 and parameters: {'num_leaves': 52}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 30%|### | 6/20 [00:32<01:12, 5.16s/it]\u001b[32m[I 2021-07-24 13:52:20,433]\u001b[0m Trial 12 finished with value: 2262598949.621539 and parameters: {'num_leaves': 143}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 35%|###5 | 7/20 [00:34<00:55, 4.25s/it]\u001b[32m[I 2021-07-24 13:52:22,553]\u001b[0m Trial 13 finished with value: 2290214250.6976314 and parameters: {'num_leaves': 50}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 40%|#### | 8/20 [00:40<00:56, 4.70s/it]\u001b[32m[I 2021-07-24 13:52:28,292]\u001b[0m Trial 14 finished with value: 2274572970.4214416 and parameters: {'num_leaves': 165}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 45%|####5 | 9/20 [00:49<01:06, 6.05s/it]\u001b[32m[I 2021-07-24 13:52:37,485]\u001b[0m Trial 15 finished with value: 2293618526.656807 and parameters: {'num_leaves': 244}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 50%|##### | 10/20 [00:55<01:00, 6.09s/it]\u001b[32m[I 2021-07-24 13:52:43,668]\u001b[0m Trial 16 finished with value: 2248672042.5925345 and parameters: {'num_leaves': 164}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 55%|#####5 | 11/20 [00:59<00:47, 5.33s/it]\u001b[32m[I 2021-07-24 13:52:47,239]\u001b[0m Trial 17 finished with value: 2264385179.765125 and parameters: {'num_leaves': 91}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 60%|###### | 12/20 [01:08<00:51, 6.48s/it]\u001b[32m[I 2021-07-24 13:52:56,413]\u001b[0m Trial 18 finished with value: 2252406272.4344435 and parameters: {'num_leaves': 251}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 65%|######5 | 13/20 [01:12<00:39, 5.71s/it]\u001b[32m[I 2021-07-24 13:53:00,320]\u001b[0m Trial 19 finished with value: 2214542360.152998 and parameters: {'num_leaves': 101}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 70%|####### | 14/20 [01:12<00:25, 4.19s/it]\u001b[32m[I 2021-07-24 13:53:00,950]\u001b[0m Trial 20 finished with value: 2748428041.4812107 and parameters: {'num_leaves': 3}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 75%|#######5 | 15/20 [01:16<00:20, 4.11s/it]\u001b[32m[I 2021-07-24 13:53:04,880]\u001b[0m Trial 21 finished with value: 2228598419.330431 and parameters: {'num_leaves': 100}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 80%|######## | 16/20 [01:20<00:15, 3.97s/it]\u001b[32m[I 2021-07-24 13:53:08,539]\u001b[0m Trial 22 finished with value: 2251484592.265115 and parameters: {'num_leaves': 95}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 85%|########5 | 17/20 [01:22<00:10, 3.47s/it]\u001b[32m[I 2021-07-24 13:53:10,837]\u001b[0m Trial 23 finished with value: 2247121386.2896996 and parameters: {'num_leaves': 49}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 90%|######### | 18/20 [01:23<00:05, 2.71s/it]\u001b[32m[I 2021-07-24 13:53:11,772]\u001b[0m Trial 24 finished with value: 2232858800.451656 and parameters: {'num_leaves': 10}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 95%|#########5| 19/20 [01:27<00:03, 3.14s/it]\u001b[32m[I 2021-07-24 13:53:15,912]\u001b[0m Trial 25 finished with value: 2236616896.4291906 and parameters: {'num_leaves': 111}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 100%|##########| 20/20 [01:32<00:00, 3.62s/it]\u001b[32m[I 2021-07-24 13:53:20,649]\u001b[0m Trial 26 finished with value: 2272025220.904855 and parameters: {'num_leaves': 128}. Best is trial 8 with value: 2202201580.7993436.\u001b[0m\n",
"num_leaves, val_score: 2193757572.841483: 100%|##########| 20/20 [01:32<00:00, 4.64s/it]\n",
"bagging, val_score: 2193757572.841483: 10%|# | 1/10 [00:02<00:19, 2.16s/it]\u001b[32m[I 2021-07-24 13:53:22,811]\u001b[0m Trial 27 finished with value: 2215507853.2691116 and parameters: {'bagging_fraction': 0.833765937354807, 'bagging_freq': 5}. Best is trial 27 with value: 2215507853.2691116.\u001b[0m\n",
"bagging, val_score: 2193757572.841483: 20%|## | 2/10 [00:04<00:17, 2.19s/it]\u001b[32m[I 2021-07-24 13:53:25,089]\u001b[0m Trial 28 finished with value: 2289649905.7234273 and parameters: {'bagging_fraction': 0.7166436113253525, 'bagging_freq': 3}. Best is trial 27 with value: 2215507853.2691116.\u001b[0m\n",
"bagging, val_score: 2193757572.841483: 30%|### | 3/10 [00:06<00:15, 2.28s/it]\u001b[32m[I 2021-07-24 13:53:27,565]\u001b[0m Trial 29 finished with value: 2435071807.8717895 and parameters: {'bagging_fraction': 0.4311907585027494, 'bagging_freq': 3}. Best is trial 27 with value: 2215507853.2691116.\u001b[0m\n",
"bagging, val_score: 2193757572.841483: 40%|#### | 4/10 [00:09<00:13, 2.31s/it]\u001b[32m[I 2021-07-24 13:53:29,967]\u001b[0m Trial 30 finished with value: 2384649358.377396 and parameters: {'bagging_fraction': 0.48304880587976823, 'bagging_freq': 5}. Best is trial 27 with value: 2215507853.2691116.\u001b[0m\n",
"bagging, val_score: 2193757572.841483: 50%|##### | 5/10 [00:11<00:11, 2.27s/it]\u001b[32m[I 2021-07-24 13:53:32,141]\u001b[0m Trial 31 finished with value: 2210265907.248649 and parameters: {'bagging_fraction': 0.7877791059553392, 'bagging_freq': 6}. Best is trial 31 with value: 2210265907.248649.\u001b[0m\n",
"bagging, val_score: 2193757572.841483: 60%|###### | 6/10 [00:13<00:09, 2.28s/it]\u001b[32m[I 2021-07-24 13:53:34,434]\u001b[0m Trial 32 finished with value: 2272518522.3229847 and parameters: {'bagging_fraction': 0.7142758200867739, 'bagging_freq': 4}. Best is trial 31 with value: 2210265907.248649.\u001b[0m\n",
"bagging, val_score: 2193757572.841483: 70%|####### | 7/10 [00:15<00:06, 2.26s/it]\u001b[32m[I 2021-07-24 13:53:36,650]\u001b[0m Trial 33 finished with value: 2265778780.249233 and parameters: {'bagging_fraction': 0.6387893291706077, 'bagging_freq': 1}. Best is trial 31 with value: 2210265907.248649.\u001b[0m\n",
"bagging, val_score: 2193757572.841483: 80%|######## | 8/10 [00:18<00:04, 2.26s/it]\u001b[32m[I 2021-07-24 13:53:38,903]\u001b[0m Trial 34 finished with value: 2233770570.763797 and parameters: {'bagging_fraction': 0.7784748097499257, 'bagging_freq': 3}. Best is trial 31 with value: 2210265907.248649.\u001b[0m\n",
"bagging, val_score: 2193757572.841483: 90%|######### | 9/10 [00:20<00:02, 2.31s/it]\u001b[32m[I 2021-07-24 13:53:41,337]\u001b[0m Trial 35 finished with value: 2385812002.7077584 and parameters: {'bagging_fraction': 0.43475426988023463, 'bagging_freq': 4}. Best is trial 31 with value: 2210265907.248649.\u001b[0m\n",
"bagging, val_score: 2193757572.841483: 100%|##########| 10/10 [00:23<00:00, 2.34s/it]\u001b[32m[I 2021-07-24 13:53:43,759]\u001b[0m Trial 36 finished with value: 2279764756.511298 and parameters: {'bagging_fraction': 0.509552686945636, 'bagging_freq': 5}. Best is trial 31 with value: 2210265907.248649.\u001b[0m\n",
"bagging, val_score: 2193757572.841483: 100%|##########| 10/10 [00:23<00:00, 2.31s/it]\n",
"feature_fraction_stage2, val_score: 2193757572.841483: 17%|#6 | 1/6 [00:01<00:08, 1.68s/it]\u001b[32m[I 2021-07-24 13:53:45,444]\u001b[0m Trial 37 finished with value: 2193757572.841483 and parameters: {'feature_fraction': 0.9159999999999999}. Best is trial 37 with value: 2193757572.841483.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2193757572.841483: 33%|###3 | 2/6 [00:03<00:06, 1.68s/it]\u001b[32m[I 2021-07-24 13:53:47,130]\u001b[0m Trial 38 finished with value: 2193757572.841483 and parameters: {'feature_fraction': 0.852}. Best is trial 37 with value: 2193757572.841483.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2193757572.841483: 50%|##### | 3/6 [00:05<00:05, 1.68s/it]\u001b[32m[I 2021-07-24 13:53:48,802]\u001b[0m Trial 39 finished with value: 2193757572.841483 and parameters: {'feature_fraction': 0.82}. Best is trial 37 with value: 2193757572.841483.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2193757572.841483: 67%|######6 | 4/6 [00:06<00:03, 1.70s/it]\u001b[32m[I 2021-07-24 13:53:50,554]\u001b[0m Trial 40 finished with value: 2293115294.9762316 and parameters: {'feature_fraction': 0.9799999999999999}. Best is trial 37 with value: 2193757572.841483.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2193757572.841483: 83%|########3 | 5/6 [00:08<00:01, 1.68s/it]\u001b[32m[I 2021-07-24 13:53:52,199]\u001b[0m Trial 41 finished with value: 2193757572.841483 and parameters: {'feature_fraction': 0.8839999999999999}. Best is trial 37 with value: 2193757572.841483.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2193757572.841483: 100%|##########| 6/6 [00:10<00:00, 1.72s/it]\u001b[32m[I 2021-07-24 13:53:54,016]\u001b[0m Trial 42 finished with value: 2293115294.9762316 and parameters: {'feature_fraction': 0.948}. Best is trial 37 with value: 2193757572.841483.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2193757572.841483: 100%|##########| 6/6 [00:10<00:00, 1.71s/it]\n",
"regularization_factors, val_score: 2193757505.927801: 5%|5 | 1/20 [00:01<00:33, 1.78s/it]\u001b[32m[I 2021-07-24 13:53:55,803]\u001b[0m Trial 43 finished with value: 2193757505.927801 and parameters: {'lambda_l1': 0.035530313749205525, 'lambda_l2': 1.7142996437624364e-05}. Best is trial 43 with value: 2193757505.927801.\u001b[0m\n",
"regularization_factors, val_score: 2193757505.927801: 10%|# | 2/20 [00:03<00:31, 1.77s/it]\u001b[32m[I 2021-07-24 13:53:57,544]\u001b[0m Trial 44 finished with value: 2223375534.1890984 and parameters: {'lambda_l1': 1.7561787606994837e-06, 'lambda_l2': 0.028131894820153703}. Best is trial 43 with value: 2193757505.927801.\u001b[0m\n",
"regularization_factors, val_score: 2193757505.927801: 15%|#5 | 3/20 [00:05<00:30, 1.80s/it]\u001b[32m[I 2021-07-24 13:53:59,412]\u001b[0m Trial 45 finished with value: 2247496509.5917783 and parameters: {'lambda_l1': 2.7262641755324635e-07, 'lambda_l2': 0.07409628972810856}. Best is trial 43 with value: 2193757505.927801.\u001b[0m\n",
"regularization_factors, val_score: 2193757505.927801: 20%|## | 4/20 [00:07<00:31, 1.95s/it]\u001b[32m[I 2021-07-24 13:54:01,725]\u001b[0m Trial 46 finished with value: 2226827669.753629 and parameters: {'lambda_l1': 8.386241389210335, 'lambda_l2': 2.5276971608749994}. Best is trial 43 with value: 2193757505.927801.\u001b[0m\n",
"regularization_factors, val_score: 2193757505.927801: 25%|##5 | 5/20 [00:09<00:28, 1.89s/it]\u001b[32m[I 2021-07-24 13:54:03,481]\u001b[0m Trial 47 finished with value: 2216429817.580567 and parameters: {'lambda_l1': 1.1041957573939113e-05, 'lambda_l2': 0.017450153863046828}. Best is trial 43 with value: 2193757505.927801.\u001b[0m\n",
"regularization_factors, val_score: 2193757505.927801: 30%|### | 6/20 [00:11<00:26, 1.92s/it]\u001b[32m[I 2021-07-24 13:54:05,458]\u001b[0m Trial 48 finished with value: 2203413700.676657 and parameters: {'lambda_l1': 2.2785459443579694, 'lambda_l2': 0.00206545335591311}. Best is trial 43 with value: 2193757505.927801.\u001b[0m\n",
"regularization_factors, val_score: 2193757505.927801: 35%|###5 | 7/20 [00:13<00:25, 1.93s/it]\u001b[32m[I 2021-07-24 13:54:07,430]\u001b[0m Trial 49 finished with value: 2193757572.7271867 and parameters: {'lambda_l1': 0.00039136395694252244, 'lambda_l2': 3.427541754951745e-08}. Best is trial 43 with value: 2193757505.927801.\u001b[0m\n",
"regularization_factors, val_score: 2193757505.927801: 40%|#### | 8/20 [00:15<00:23, 1.95s/it]\u001b[32m[I 2021-07-24 13:54:09,422]\u001b[0m Trial 50 finished with value: 2240547395.6341996 and parameters: {'lambda_l1': 0.00018872808872200846, 'lambda_l2': 0.5702426219534228}. Best is trial 43 with value: 2193757505.927801.\u001b[0m\n",
"regularization_factors, val_score: 2193757473.925522: 45%|####5 | 9/20 [00:17<00:20, 1.88s/it]\u001b[32m[I 2021-07-24 13:54:11,149]\u001b[0m Trial 51 finished with value: 2193757473.9255223 and parameters: {'lambda_l1': 5.115495660230786e-06, 'lambda_l2': 2.891511576151028e-05}. Best is trial 51 with value: 2193757473.9255223.\u001b[0m\n",
"regularization_factors, val_score: 2193757473.925522: 50%|##### | 10/20 [00:18<00:18, 1.87s/it]\u001b[32m[I 2021-07-24 13:54:12,975]\u001b[0m Trial 52 finished with value: 2209441211.88343 and parameters: {'lambda_l1': 6.709340304448373e-08, 'lambda_l2': 0.0033851983067145165}. Best is trial 51 with value: 2193757473.9255223.\u001b[0m\n",
"regularization_factors, val_score: 2193757473.925522: 55%|#####5 | 11/20 [00:21<00:17, 1.94s/it]\u001b[32m[I 2021-07-24 13:54:15,091]\u001b[0m Trial 53 finished with value: 2193757554.8642426 and parameters: {'lambda_l1': 0.00398313349668578, 'lambda_l2': 4.918044983674336e-06}. Best is trial 51 with value: 2193757473.9255223.\u001b[0m\n",
"regularization_factors, val_score: 2193757473.925522: 60%|###### | 12/20 [00:22<00:15, 1.89s/it]\u001b[32m[I 2021-07-24 13:54:16,857]\u001b[0m Trial 54 finished with value: 2193757527.2890315 and parameters: {'lambda_l1': 0.03174575050092051, 'lambda_l2': 1.121877835012437e-05}. Best is trial 51 with value: 2193757473.9255223.\u001b[0m\n",
"regularization_factors, val_score: 2193757461.792665: 65%|######5 | 13/20 [00:24<00:13, 1.88s/it]\u001b[32m[I 2021-07-24 13:54:18,728]\u001b[0m Trial 55 finished with value: 2193757461.7926645 and parameters: {'lambda_l1': 0.19245364133000292, 'lambda_l2': 2.098195966946619e-05}. Best is trial 55 with value: 2193757461.7926645.\u001b[0m\n",
"regularization_factors, val_score: 2193757461.792665: 70%|####### | 14/20 [00:26<00:11, 1.87s/it]\u001b[32m[I 2021-07-24 13:54:20,550]\u001b[0m Trial 56 finished with value: 2193757572.4410305 and parameters: {'lambda_l1': 1.0621410162724814e-08, 'lambda_l2': 1.1658751463777809e-07}. Best is trial 55 with value: 2193757461.7926645.\u001b[0m\n",
"regularization_factors, val_score: 2193757022.409628: 75%|#######5 | 15/20 [00:28<00:09, 1.84s/it]\u001b[32m[I 2021-07-24 13:54:22,314]\u001b[0m Trial 57 finished with value: 2193757022.4096284 and parameters: {'lambda_l1': 0.4708623853744175, 'lambda_l2': 0.00013251231491224775}. Best is trial 57 with value: 2193757022.4096284.\u001b[0m\n",
"regularization_factors, val_score: 2193757022.409628: 80%|######## | 16/20 [00:30<00:07, 1.81s/it]\u001b[32m[I 2021-07-24 13:54:24,053]\u001b[0m Trial 58 finished with value: 2193757471.257112 and parameters: {'lambda_l1': 0.48939023058146797, 'lambda_l2': 5.55788736346222e-07}. Best is trial 57 with value: 2193757022.4096284.\u001b[0m\n",
"regularization_factors, val_score: 2193756705.682960: 85%|########5 | 17/20 [00:31<00:05, 1.81s/it]\u001b[32m[I 2021-07-24 13:54:25,872]\u001b[0m Trial 59 finished with value: 2193756705.6829596 and parameters: {'lambda_l1': 0.2944813276784501, 'lambda_l2': 0.00023554829225716752}. Best is trial 59 with value: 2193756705.6829596.\u001b[0m\n",
"regularization_factors, val_score: 2193756705.682960: 90%|######### | 18/20 [00:33<00:03, 1.80s/it]\u001b[32m[I 2021-07-24 13:54:27,663]\u001b[0m Trial 60 finished with value: 2213959934.0753803 and parameters: {'lambda_l1': 5.791380633444603, 'lambda_l2': 0.00017616756434467052}. Best is trial 59 with value: 2193756705.6829596.\u001b[0m\n",
"regularization_factors, val_score: 2193756300.347055: 95%|#########5| 19/20 [00:35<00:01, 1.81s/it]\u001b[32m[I 2021-07-24 13:54:29,475]\u001b[0m Trial 61 finished with value: 2193756300.347055 and parameters: {'lambda_l1': 0.0038798669293550647, 'lambda_l2': 0.000370708788404359}. Best is trial 61 with value: 2193756300.347055.\u001b[0m\n",
"regularization_factors, val_score: 2193754225.665947: 100%|##########| 20/20 [00:37<00:00, 1.81s/it]\u001b[32m[I 2021-07-24 13:54:31,283]\u001b[0m Trial 62 finished with value: 2193754225.665947 and parameters: {'lambda_l1': 0.001623299893398886, 'lambda_l2': 0.0009761417290668266}. Best is trial 62 with value: 2193754225.665947.\u001b[0m\n",
"regularization_factors, val_score: 2193754225.665947: 100%|##########| 20/20 [00:37<00:00, 1.86s/it]\n",
"min_data_in_leaf, val_score: 2193754225.665947: 20%|## | 1/5 [00:01<00:07, 1.90s/it]\u001b[32m[I 2021-07-24 13:54:33,186]\u001b[0m Trial 63 finished with value: 2225388728.9240403 and parameters: {'min_child_samples': 10}. Best is trial 63 with value: 2225388728.9240403.\u001b[0m\n",
"min_data_in_leaf, val_score: 2193754225.665947: 40%|#### | 2/5 [00:03<00:05, 1.88s/it]\u001b[32m[I 2021-07-24 13:54:35,035]\u001b[0m Trial 64 finished with value: 2219135238.044885 and parameters: {'min_child_samples': 5}. Best is trial 64 with value: 2219135238.044885.\u001b[0m\n",
"min_data_in_leaf, val_score: 2193754225.665947: 60%|###### | 3/5 [00:06<00:04, 2.04s/it]\u001b[32m[I 2021-07-24 13:54:37,428]\u001b[0m Trial 65 finished with value: 2275374497.207612 and parameters: {'min_child_samples': 100}. Best is trial 64 with value: 2219135238.044885.\u001b[0m\n",
"min_data_in_leaf, val_score: 2193754225.665947: 80%|######## | 4/5 [00:08<00:01, 1.99s/it]\u001b[32m[I 2021-07-24 13:54:39,318]\u001b[0m Trial 66 finished with value: 2229247396.3587947 and parameters: {'min_child_samples': 25}. Best is trial 64 with value: 2219135238.044885.\u001b[0m\n",
"min_data_in_leaf, val_score: 2193754225.665947: 100%|##########| 5/5 [00:10<00:00, 2.00s/it]\u001b[32m[I 2021-07-24 13:54:41,353]\u001b[0m Trial 67 finished with value: 2274227159.90541 and parameters: {'min_child_samples': 50}. Best is trial 64 with value: 2219135238.044885.\u001b[0m\n",
"min_data_in_leaf, val_score: 2193754225.665947: 100%|##########| 5/5 [00:10<00:00, 2.01s/it]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 2min 55s, sys: 8.33 s, total: 3min 3s\n",
"Wall time: 3min 4s\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"%%time\n",
"# lgb is optuna.integration.lightgbm, so train() runs Optuna's stepwise LightGBM tuner\n",
"model = lgb.train(params, dtrain, valid_sets=[dtrain, dval], verbose_eval=10000)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Optuna LightGBM Tuner r2 = 0.8454106958774709\n"
]
}
],
"source": [
"from flaml.ml import sklearn_metric_loss_score\n",
"\n",
"y_pred = model.predict(X_test)\n",
"print('Optuna LightGBM Tuner r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Add a customized LightGBM learner in FLAML\n",
"LightGBM lets you pass a custom objective function to the model constructor. To use such an objective with FLAML, add a customized LightGBM learner that sets it. The following example shows how to define a customized LightGBM learner with a custom objective function."
]
},
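{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before the full example, it may help to see the contract a custom objective is expected to follow. The sketch below is illustrative and not part of the original notebook: it assumes LightGBM's scikit-learn wrapper, where a callable objective receives `(y_true, y_pred)` and returns the per-sample gradient and Hessian of the loss with respect to the prediction. The function name `l2_objective` and the synthetic data are hypothetical.\n",
"\n",
"```python\n",
"import numpy as np\n",
"from lightgbm import LGBMRegressor\n",
"\n",
"\n",
"def l2_objective(y_true, y_pred):\n",
"    # gradient and Hessian of 0.5 * (y_pred - y_true) ** 2 with respect to y_pred\n",
"    grad = y_pred - y_true\n",
"    hess = np.ones_like(y_pred)\n",
"    return grad, hess\n",
"\n",
"\n",
"# synthetic regression data, for illustration only\n",
"rng = np.random.default_rng(0)\n",
"X = rng.normal(size=(500, 3))\n",
"y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=500)\n",
"\n",
"# pass the callable as the objective in the constructor\n",
"model = LGBMRegressor(objective=l2_objective, n_estimators=50)\n",
"model.fit(X, y)\n",
"print(model.predict(X[:5]))\n",
"```\n",
"\n",
"Plain squared error is used here only because its derivatives are easy to verify by hand; the customized learner below follows the same pattern with a different loss."
]
},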
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a customized LightGBM learner with a custom objective function"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from flaml.model import LGBMEstimator\n",
"\n",
"\n",
"def my_loss_obj(y_true, y_pred):\n",
"    '''Customized objective: gradient and Hessian of a weighted mix of three losses.'''\n",
"    c = 0.5\n",
"    residual = y_pred - y_true\n",
"    # Fair-loss-style grad and hess with scale c\n",
"    grad = c * residual / (np.abs(residual) + c)\n",
"    hess = c ** 2 / (np.abs(residual) + c) ** 2\n",
"\n",
"    # rmse grad and hess\n",
"    grad_rmse = residual\n",
"    hess_rmse = 1.0\n",
"\n",
"    # mae grad (sign of the residual, with 0 mapped to -1) and a constant hess of 1.0\n",
"    grad_mae = np.array(residual)\n",
"    grad_mae[grad_mae > 0] = 1.\n",
"    grad_mae[grad_mae <= 0] = -1.\n",
"    hess_mae = 1.0\n",
"\n",
"    # weights of the three components\n",
"    coef = [0.4, 0.3, 0.3]\n",
"    grad_total = coef[0] * grad + coef[1] * grad_rmse + coef[2] * grad_mae\n",
"    hess_total = coef[0] * hess + coef[1] * hess_rmse + coef[2] * hess_mae\n",
"    return grad_total, hess_total\n",
"\n",
"\n",
"class MyLGBM(LGBMEstimator):\n",
"    '''LGBMEstimator with my_loss_obj as the objective function\n",
"    '''\n",
"\n",
"    def __init__(self, **params):\n",
"        super().__init__(objective=my_loss_obj, **params)"
]
},
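{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first component of `my_loss_obj` (the `grad` and `hess` computed from `c`) has the same form as the gradient and Hessian of the Fair loss, `c**2 * (|r|/c - log(1 + |r|/c))`, where `r` is the residual. As a hedged sanity check that is not part of the original notebook, the sketch below compares those expressions with central finite differences at a few residuals away from zero. The helper names `fair_loss`, `fair_grad` and `fair_hess` are hypothetical.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"c = 0.5\n",
"\n",
"\n",
"def fair_loss(residual):\n",
"    # Fair loss: c^2 * (|r|/c - log(1 + |r|/c))\n",
"    a = np.abs(residual) / c\n",
"    return c ** 2 * (a - np.log1p(a))\n",
"\n",
"\n",
"def fair_grad(residual):\n",
"    # same expression as grad in my_loss_obj\n",
"    return c * residual / (np.abs(residual) + c)\n",
"\n",
"\n",
"def fair_hess(residual):\n",
"    # same expression as hess in my_loss_obj\n",
"    return c ** 2 / (np.abs(residual) + c) ** 2\n",
"\n",
"\n",
"# residuals away from zero, where the Hessian is smooth\n",
"r = np.array([-3.0, -1.0, -0.2, 0.3, 1.5, 4.0])\n",
"eps = 1e-5\n",
"num_grad = (fair_loss(r + eps) - fair_loss(r - eps)) / (2 * eps)\n",
"num_hess = (fair_grad(r + eps) - fair_grad(r - eps)) / (2 * eps)\n",
"print(np.allclose(num_grad, fair_grad(r), atol=1e-6))\n",
"print(np.allclose(num_hess, fair_hess(r), atol=1e-6))\n",
"```\n",
"\n",
"A check like this is a cheap way to catch sign or scaling mistakes in a hand-written objective before spending a tuning budget on it."
]
},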
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Add the customized learner in FLAML"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 07-24 13:54:42] {912} INFO - Evaluation method: cv\n",
"[flaml.automl: 07-24 13:54:42] {616} INFO - Using RepeatedKFold\n",
"[flaml.automl: 07-24 13:54:42] {933} INFO - Minimizing error metric: 1-r2\n",
"[flaml.automl: 07-24 13:54:42] {952} INFO - List of ML learners in AutoML Run: ['my_lgbm']\n",
"[flaml.automl: 07-24 13:54:42] {1018} INFO - iteration 0, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:42] {1173} INFO - at 0.3s,\tbest my_lgbm's error=2.9888,\tbest my_lgbm's error=2.9888\n",
"[flaml.automl: 07-24 13:54:42] {1018} INFO - iteration 1, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:42] {1173} INFO - at 0.4s,\tbest my_lgbm's error=2.9888,\tbest my_lgbm's error=2.9888\n",
"[flaml.automl: 07-24 13:54:42] {1018} INFO - iteration 2, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:42] {1173} INFO - at 0.6s,\tbest my_lgbm's error=1.7536,\tbest my_lgbm's error=1.7536\n",
"[flaml.automl: 07-24 13:54:42] {1018} INFO - iteration 3, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:42] {1173} INFO - at 0.8s,\tbest my_lgbm's error=0.4529,\tbest my_lgbm's error=0.4529\n",
"[flaml.automl: 07-24 13:54:42] {1018} INFO - iteration 4, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:43] {1173} INFO - at 1.0s,\tbest my_lgbm's error=0.4529,\tbest my_lgbm's error=0.4529\n",
"[flaml.automl: 07-24 13:54:43] {1018} INFO - iteration 5, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:43] {1173} INFO - at 1.1s,\tbest my_lgbm's error=0.4529,\tbest my_lgbm's error=0.4529\n",
"[flaml.automl: 07-24 13:54:43] {1018} INFO - iteration 6, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:43] {1173} INFO - at 1.5s,\tbest my_lgbm's error=0.3159,\tbest my_lgbm's error=0.3159\n",
"[flaml.automl: 07-24 13:54:43] {1018} INFO - iteration 7, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:43] {1173} INFO - at 1.9s,\tbest my_lgbm's error=0.2717,\tbest my_lgbm's error=0.2717\n",
"[flaml.automl: 07-24 13:54:43] {1018} INFO - iteration 8, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:44] {1173} INFO - at 2.1s,\tbest my_lgbm's error=0.2717,\tbest my_lgbm's error=0.2717\n",
"[flaml.automl: 07-24 13:54:44] {1018} INFO - iteration 9, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:44] {1173} INFO - at 2.5s,\tbest my_lgbm's error=0.2073,\tbest my_lgbm's error=0.2073\n",
"[flaml.automl: 07-24 13:54:44] {1018} INFO - iteration 10, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:44] {1173} INFO - at 2.7s,\tbest my_lgbm's error=0.2073,\tbest my_lgbm's error=0.2073\n",
"[flaml.automl: 07-24 13:54:44] {1018} INFO - iteration 11, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:45] {1173} INFO - at 2.9s,\tbest my_lgbm's error=0.2073,\tbest my_lgbm's error=0.2073\n",
"[flaml.automl: 07-24 13:54:45] {1018} INFO - iteration 12, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:45] {1173} INFO - at 3.4s,\tbest my_lgbm's error=0.1883,\tbest my_lgbm's error=0.1883\n",
"[flaml.automl: 07-24 13:54:45] {1018} INFO - iteration 13, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:45] {1173} INFO - at 3.7s,\tbest my_lgbm's error=0.1883,\tbest my_lgbm's error=0.1883\n",
"[flaml.automl: 07-24 13:54:45] {1018} INFO - iteration 14, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:46] {1173} INFO - at 4.3s,\tbest my_lgbm's error=0.1883,\tbest my_lgbm's error=0.1883\n",
"[flaml.automl: 07-24 13:54:46] {1018} INFO - iteration 15, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:46] {1173} INFO - at 4.6s,\tbest my_lgbm's error=0.1883,\tbest my_lgbm's error=0.1883\n",
"[flaml.automl: 07-24 13:54:46] {1018} INFO - iteration 16, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:47] {1173} INFO - at 5.2s,\tbest my_lgbm's error=0.1878,\tbest my_lgbm's error=0.1878\n",
"[flaml.automl: 07-24 13:54:47] {1018} INFO - iteration 17, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:47] {1173} INFO - at 5.5s,\tbest my_lgbm's error=0.1878,\tbest my_lgbm's error=0.1878\n",
"[flaml.automl: 07-24 13:54:47] {1018} INFO - iteration 18, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:48] {1173} INFO - at 6.7s,\tbest my_lgbm's error=0.1878,\tbest my_lgbm's error=0.1878\n",
"[flaml.automl: 07-24 13:54:48] {1018} INFO - iteration 19, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:49] {1173} INFO - at 7.0s,\tbest my_lgbm's error=0.1878,\tbest my_lgbm's error=0.1878\n",
"[flaml.automl: 07-24 13:54:49] {1018} INFO - iteration 20, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:51] {1173} INFO - at 9.7s,\tbest my_lgbm's error=0.1878,\tbest my_lgbm's error=0.1878\n",
"[flaml.automl: 07-24 13:54:51] {1018} INFO - iteration 21, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:52] {1173} INFO - at 10.1s,\tbest my_lgbm's error=0.1878,\tbest my_lgbm's error=0.1878\n",
"[flaml.automl: 07-24 13:54:52] {1018} INFO - iteration 22, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:53] {1173} INFO - at 11.0s,\tbest my_lgbm's error=0.1751,\tbest my_lgbm's error=0.1751\n",
"[flaml.automl: 07-24 13:54:53] {1018} INFO - iteration 23, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:53] {1173} INFO - at 11.4s,\tbest my_lgbm's error=0.1751,\tbest my_lgbm's error=0.1751\n",
"[flaml.automl: 07-24 13:54:53] {1018} INFO - iteration 24, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:54] {1173} INFO - at 12.5s,\tbest my_lgbm's error=0.1751,\tbest my_lgbm's error=0.1751\n",
"[flaml.automl: 07-24 13:54:54] {1018} INFO - iteration 25, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:54:54] {1173} INFO - at 12.8s,\tbest my_lgbm's error=0.1751,\tbest my_lgbm's error=0.1751\n",
"[flaml.automl: 07-24 13:54:54] {1018} INFO - iteration 26, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:55:00] {1173} INFO - at 18.5s,\tbest my_lgbm's error=0.1660,\tbest my_lgbm's error=0.1660\n",
"[flaml.automl: 07-24 13:55:00] {1018} INFO - iteration 27, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:55:03] {1173} INFO - at 21.5s,\tbest my_lgbm's error=0.1660,\tbest my_lgbm's error=0.1660\n",
"[flaml.automl: 07-24 13:55:03] {1018} INFO - iteration 28, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:55:10] {1173} INFO - at 28.2s,\tbest my_lgbm's error=0.1660,\tbest my_lgbm's error=0.1660\n",
"[flaml.automl: 07-24 13:55:10] {1018} INFO - iteration 29, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:55:11] {1173} INFO - at 29.1s,\tbest my_lgbm's error=0.1660,\tbest my_lgbm's error=0.1660\n",
"[flaml.automl: 07-24 13:55:11] {1018} INFO - iteration 30, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:55:37] {1173} INFO - at 55.3s,\tbest my_lgbm's error=0.1634,\tbest my_lgbm's error=0.1634\n",
"[flaml.automl: 07-24 13:55:37] {1018} INFO - iteration 31, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:55:50] {1173} INFO - at 68.6s,\tbest my_lgbm's error=0.1624,\tbest my_lgbm's error=0.1624\n",
"[flaml.automl: 07-24 13:55:50] {1018} INFO - iteration 32, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:56:13] {1173} INFO - at 90.9s,\tbest my_lgbm's error=0.1624,\tbest my_lgbm's error=0.1624\n",
"[flaml.automl: 07-24 13:56:13] {1018} INFO - iteration 33, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:56:31] {1173} INFO - at 109.9s,\tbest my_lgbm's error=0.1624,\tbest my_lgbm's error=0.1624\n",
"[flaml.automl: 07-24 13:56:31] {1018} INFO - iteration 34, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:56:41] {1173} INFO - at 119.3s,\tbest my_lgbm's error=0.1624,\tbest my_lgbm's error=0.1624\n",
"[flaml.automl: 07-24 13:56:41] {1018} INFO - iteration 35, current learner my_lgbm\n",
"[flaml.automl: 07-24 13:57:18] {1173} INFO - at 156.1s,\tbest my_lgbm's error=0.1624,\tbest my_lgbm's error=0.1624\n",
"[flaml.automl: 07-24 13:57:18] {1219} INFO - selected model: LGBMRegressor(colsample_bytree=0.7929174747127123,\n",
"              learning_rate=0.10575205975801834, max_bin=128,\n",
"              min_child_samples=128, n_estimators=754, num_leaves=710,\n",
"              objective=<function my_loss_obj at 0x7f51ec3a2310>,\n",
"              reg_alpha=0.0009765625, reg_lambda=10.762106709995438)\n",
"[flaml.automl: 07-24 13:57:18] {969} INFO - fit succeeded\n"
]
}
],
"source": [
"automl = AutoML()\n",
"automl.add_learner(learner_name='my_lgbm', learner_class=MyLGBM)\n",
"settings = {\n",
" \"time_budget\": 150, # total running time in seconds\n",
" \"metric\": 'r2', # primary metrics for regression can be chosen from: ['mae','mse','r2']\n",
" \"estimator_list\": ['my_lgbm',], # list of ML learners; we tune lightgbm in this example\n",
" \"task\": 'regression', # task type \n",
" \"log_file_name\": 'houses_experiment_my_lgbm.log', # flaml log file\n",
"}\n",
"automl.fit(X_train=X_train, y_train=y_train, **settings)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best hyperparameter config: {'n_estimators': 754, 'num_leaves': 710, 'min_child_samples': 128, 'learning_rate': 0.10575205975801834, 'subsample': 1.0, 'log_max_bin': 8, 'colsample_bytree': 0.7929174747127123, 'reg_alpha': 0.0009765625, 'reg_lambda': 10.762106709995438}\n",
"Best r2 on validation data: 0.8376\n",
"Training duration of best run: 13.28 s\n",
"Predicted labels [135768.25690639 246689.39399877 136637.13857269 ... 175212.47378055\n",
" 243756.5990978 271017.12074672]\n",
"True labels 14740 136900.0\n",
"10101 241300.0\n",
"20566 200700.0\n",
"2670 72500.0\n",
"15709 460000.0\n",
" ... \n",
"13132 121200.0\n",
"8228 137500.0\n",
"3948 160900.0\n",
"8522 227300.0\n",
"16798 265600.0\n",
"Name: median_house_value, Length: 5160, dtype: float64\n",
"r2 = 0.8459538207127344\n",
"mse = 2036260428.588182\n",
"mae = 30277.65301151835\n"
]
}
],
"source": [
"print('Best hyperparameter config:', automl.best_config)\n",
"print('Best r2 on validation data: {0:.4g}'.format(1-automl.best_loss))\n",
"print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))\n",
"\n",
"y_pred = automl.predict(X_test)\n",
"print('Predicted labels', y_pred)\n",
"print('True labels', y_test)\n",
"\n",
"from flaml.ml import sklearn_metric_loss_score\n",
"print('r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))\n",
"print('mse', '=', sklearn_metric_loss_score('mse', y_pred, y_test))\n",
"print('mae', '=', sklearn_metric_loss_score('mae', y_pred, y_test))"
]
}
],
"metadata": {
"interpreter": {
"hash": "0cfea3304185a9579d09e0953576b57c8581e46e6ebc6dfeb681bc5a511f7544"
},
"kernelspec": {
"display_name": "Python 3.8.0 64-bit ('blend': conda)",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}