autogen/notebook/automl_lightgbm.ipynb
Li Jiang da2cd7ca89
Add supporting using Spark as the backend of parallel training (#846)
* Added spark support for parallel training.

* Added tests and fixed a bug

* Added more tests and updated docs

* Updated setup.py and docs

* Added customize_learner and tests

* Update spark tests and setup.py

* Update docs and verbose

* Update logging, fix issue in cloud notebook

* Update github workflow for spark tests

* Update github workflow

* Remove hack of handling _choice_

* Allow for failures

* Fix tests, update docs

* Update setup.py

* Update Dockerfile for Spark

* Update tests, remove some warnings

* Add test for notebooks, update utils

* Add performance test for Spark

* Fix lru_cache maxsize

* Fix test failures on some platforms

* Fix coverage report failure

* resolve PR comments

* resolve PR comments 2nd round

* resolve PR comments 3rd round

* fix lint and rename test class

* resolve PR comments 4th round

* refactor customize_learner to broadcast_code
2022-12-23 08:18:49 -08:00

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Copyright (c) Microsoft Corporation. All rights reserved. \n",
"\n",
"Licensed under the MIT License.\n",
"\n",
"# Tune LightGBM with the FLAML Library\n",
"\n",
"\n",
"## 1. Introduction\n",
"\n",
"FLAML is a Python library (https://github.com/microsoft/FLAML) designed to automatically produce accurate machine learning models \n",
"with low computational cost. It is fast and economical. The simple and lightweight design makes it easy \n",
"to use and extend, for example by adding new learners. FLAML can \n",
"- serve as an economical AutoML engine,\n",
"- be used as a fast hyperparameter tuning tool, or \n",
"- be embedded in self-tuning software that requires low latency and low resource use in repetitive\n",
" tuning tasks.\n",
"\n",
"In this notebook, we demonstrate how to use the FLAML library to tune the hyperparameters of LightGBM on a regression example.\n",
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `notebook` option:\n",
"```bash\n",
"pip install flaml[notebook]\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%pip install flaml[notebook]==1.0.10"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 2. Regression Example\n",
"### Load data and preprocess\n",
"\n",
"Download the [houses dataset](https://www.openml.org/d/537) from OpenML. The task is to predict the median house price in a region from the region's demographic composition and the state of its housing market."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/root/.local/lib/python3.9/site-packages/xgboost/compat.py:31: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.\n",
" from pandas import MultiIndex, Int64Index\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"download dataset from openml\n",
"Dataset name: houses\n",
"X_train.shape: (15480, 8), y_train.shape: (15480,);\n",
"X_test.shape: (5160, 8), y_test.shape: (5160,)\n"
]
}
],
"source": [
"from flaml.data import load_openml_dataset\n",
"X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=537, data_dir='./')"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Run FLAML\n",
"In the FLAML AutoML run configuration, users can specify the task type, time budget, error metric, learner list, whether to subsample, the resampling strategy, and so on. All of these arguments have default values, which are used when users do not provide them."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [],
"source": [
"# Import the AutoML class from the flaml package\n",
"from flaml import AutoML\n",
"automl = AutoML()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"settings = {\n",
" \"time_budget\": 240, # total running time in seconds\n",
" \"metric\": 'r2', # primary metrics for regression can be chosen from: ['mae','mse','r2','rmse','mape']\n",
" \"estimator_list\": ['lgbm'], # list of ML learners; we tune lightgbm in this example\n",
" \"task\": 'regression', # task type \n",
" \"log_file_name\": 'houses_experiment.log', # flaml log file\n",
" \"seed\": 7654321, # random seed\n",
"}"
]
},
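{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"The defaults mentioned above can be overridden by adding more keys to the settings. The following is a minimal sketch that was not part of the recorded run. It assumes that `eval_method`, `n_splits`, `early_stop`, and `verbose` are accepted by `AutoML.fit` in the installed FLAML version, and the override values are arbitrary illustrations rather than recommendations.\n",
"\n",
"```python\n",
"# Settings used in this notebook (copied from the cell above).\n",
"base_settings = {\n",
"    'time_budget': 240,\n",
"    'metric': 'r2',\n",
"    'estimator_list': ['lgbm'],\n",
"    'task': 'regression',\n",
"    'log_file_name': 'houses_experiment.log',\n",
"    'seed': 7654321,\n",
"}\n",
"\n",
"# Hypothetical overrides; the values are illustrative, not tuned.\n",
"overrides = {\n",
"    'eval_method': 'cv',  # request cross-validation explicitly\n",
"    'n_splits': 5,        # number of folds used for cross-validation\n",
"    'early_stop': True,   # allow the search to stop before the budget is spent\n",
"    'verbose': 1,         # lower values print fewer log messages\n",
"}\n",
"\n",
"# Later keys win, so the overrides extend the base settings without mutating them.\n",
"custom_settings = {**base_settings, **overrides}\n",
"print(custom_settings)\n",
"```\n",
"\n",
"Passing `**custom_settings` to `automl.fit` in place of `**settings` would apply the overrides."
]
},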
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 07-01 15:22:15] {2427} INFO - task = regression\n",
"[flaml.automl: 07-01 15:22:15] {2429} INFO - Data split method: uniform\n",
"[flaml.automl: 07-01 15:22:15] {2432} INFO - Evaluation method: cv\n",
"[flaml.automl: 07-01 15:22:15] {2501} INFO - Minimizing error metric: 1-r2\n",
"[flaml.automl: 07-01 15:22:15] {2641} INFO - List of ML learners in AutoML Run: ['lgbm']\n",
"[flaml.automl: 07-01 15:22:15] {2933} INFO - iteration 0, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:16] {3061} INFO - Estimated sufficient time budget=1981s. Estimated necessary time budget=2s.\n",
"[flaml.automl: 07-01 15:22:16] {3108} INFO - at 0.3s,\testimator lgbm's best error=0.7383,\tbest estimator lgbm's best error=0.7383\n",
"[flaml.automl: 07-01 15:22:16] {2933} INFO - iteration 1, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:16] {3108} INFO - at 0.5s,\testimator lgbm's best error=0.7383,\tbest estimator lgbm's best error=0.7383\n",
"[flaml.automl: 07-01 15:22:16] {2933} INFO - iteration 2, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:16] {3108} INFO - at 0.7s,\testimator lgbm's best error=0.3250,\tbest estimator lgbm's best error=0.3250\n",
"[flaml.automl: 07-01 15:22:16] {2933} INFO - iteration 3, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:16] {3108} INFO - at 1.1s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n",
"[flaml.automl: 07-01 15:22:16] {2933} INFO - iteration 4, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:17] {3108} INFO - at 1.3s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n",
"[flaml.automl: 07-01 15:22:17] {2933} INFO - iteration 5, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:19] {3108} INFO - at 3.6s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n",
"[flaml.automl: 07-01 15:22:19] {2933} INFO - iteration 6, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:19] {3108} INFO - at 3.8s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n",
"[flaml.automl: 07-01 15:22:19] {2933} INFO - iteration 7, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:19] {3108} INFO - at 4.2s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n",
"[flaml.automl: 07-01 15:22:19] {2933} INFO - iteration 8, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:20] {3108} INFO - at 4.7s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n",
"[flaml.automl: 07-01 15:22:20] {2933} INFO - iteration 9, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:20] {3108} INFO - at 4.9s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n",
"[flaml.automl: 07-01 15:22:20] {2933} INFO - iteration 10, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:22] {3108} INFO - at 6.6s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n",
"[flaml.automl: 07-01 15:22:22] {2933} INFO - iteration 11, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:22] {3108} INFO - at 7.2s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n",
"[flaml.automl: 07-01 15:22:22] {2933} INFO - iteration 12, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:28] {3108} INFO - at 12.9s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n",
"[flaml.automl: 07-01 15:22:28] {2933} INFO - iteration 13, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:29] {3108} INFO - at 13.6s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n",
"[flaml.automl: 07-01 15:22:29] {2933} INFO - iteration 14, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:34] {3108} INFO - at 18.4s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n",
"[flaml.automl: 07-01 15:22:34] {2933} INFO - iteration 15, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:39] {3108} INFO - at 23.9s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n",
"[flaml.automl: 07-01 15:22:39] {2933} INFO - iteration 16, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:40] {3108} INFO - at 24.5s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n",
"[flaml.automl: 07-01 15:22:40] {2933} INFO - iteration 17, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:53] {3108} INFO - at 37.9s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n",
"[flaml.automl: 07-01 15:22:53] {2933} INFO - iteration 18, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:53] {3108} INFO - at 38.2s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n",
"[flaml.automl: 07-01 15:22:53] {2933} INFO - iteration 19, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:54] {3108} INFO - at 39.2s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n",
"[flaml.automl: 07-01 15:22:54] {2933} INFO - iteration 20, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:56] {3108} INFO - at 41.0s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n",
"[flaml.automl: 07-01 15:22:56] {2933} INFO - iteration 21, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:58] {3108} INFO - at 42.5s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n",
"[flaml.automl: 07-01 15:22:58] {2933} INFO - iteration 22, current learner lgbm\n",
"[flaml.automl: 07-01 15:22:59] {3108} INFO - at 44.2s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n",
"[flaml.automl: 07-01 15:22:59] {2933} INFO - iteration 23, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:03] {3108} INFO - at 47.8s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n",
"[flaml.automl: 07-01 15:23:03] {2933} INFO - iteration 24, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:04] {3108} INFO - at 48.6s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n",
"[flaml.automl: 07-01 15:23:04] {2933} INFO - iteration 25, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:05] {3108} INFO - at 49.5s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n",
"[flaml.automl: 07-01 15:23:05] {2933} INFO - iteration 26, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:07] {3108} INFO - at 51.4s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:07] {2933} INFO - iteration 27, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:09] {3108} INFO - at 53.8s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:09] {2933} INFO - iteration 28, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:11] {3108} INFO - at 55.4s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:11] {2933} INFO - iteration 29, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:12] {3108} INFO - at 56.6s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:12] {2933} INFO - iteration 30, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:15] {3108} INFO - at 59.8s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:15] {2933} INFO - iteration 31, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:20] {3108} INFO - at 64.5s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:20] {2933} INFO - iteration 32, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:20] {3108} INFO - at 65.1s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:20] {2933} INFO - iteration 33, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:31] {3108} INFO - at 76.0s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:31] {2933} INFO - iteration 34, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:32] {3108} INFO - at 76.5s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:32] {2933} INFO - iteration 35, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:35] {3108} INFO - at 79.3s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:35] {2933} INFO - iteration 36, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:35] {3108} INFO - at 80.2s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:35] {2933} INFO - iteration 37, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:37] {3108} INFO - at 81.5s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:37] {2933} INFO - iteration 38, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:39] {3108} INFO - at 83.8s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:39] {2933} INFO - iteration 39, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:40] {3108} INFO - at 84.8s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:40] {2933} INFO - iteration 40, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:43] {3108} INFO - at 88.1s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:43] {2933} INFO - iteration 41, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:45] {3108} INFO - at 89.4s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n",
"[flaml.automl: 07-01 15:23:45] {2933} INFO - iteration 42, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:47] {3108} INFO - at 91.7s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n",
"[flaml.automl: 07-01 15:23:47] {2933} INFO - iteration 43, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:48] {3108} INFO - at 92.4s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n",
"[flaml.automl: 07-01 15:23:48] {2933} INFO - iteration 44, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:54] {3108} INFO - at 98.5s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n",
"[flaml.automl: 07-01 15:23:54] {2933} INFO - iteration 45, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:55] {3108} INFO - at 100.2s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n",
"[flaml.automl: 07-01 15:23:55] {2933} INFO - iteration 46, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:58] {3108} INFO - at 102.6s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n",
"[flaml.automl: 07-01 15:23:58] {2933} INFO - iteration 47, current learner lgbm\n",
"[flaml.automl: 07-01 15:23:59] {3108} INFO - at 103.4s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n",
"[flaml.automl: 07-01 15:23:59] {2933} INFO - iteration 48, current learner lgbm\n",
"[flaml.automl: 07-01 15:24:03] {3108} INFO - at 108.0s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n",
"[flaml.automl: 07-01 15:24:03] {2933} INFO - iteration 49, current learner lgbm\n",
"[flaml.automl: 07-01 15:24:04] {3108} INFO - at 108.8s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n",
"[flaml.automl: 07-01 15:24:04] {2933} INFO - iteration 50, current learner lgbm\n",
"[flaml.automl: 07-01 15:24:12] {3108} INFO - at 116.3s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n",
"[flaml.automl: 07-01 15:24:12] {2933} INFO - iteration 51, current learner lgbm\n",
"[flaml.automl: 07-01 15:25:01] {3108} INFO - at 166.2s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n",
"[flaml.automl: 07-01 15:25:01] {2933} INFO - iteration 52, current learner lgbm\n",
"[flaml.automl: 07-01 15:25:02] {3108} INFO - at 167.2s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n",
"[flaml.automl: 07-01 15:25:02] {2933} INFO - iteration 53, current learner lgbm\n",
"[flaml.automl: 07-01 15:25:04] {3108} INFO - at 168.7s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n",
"[flaml.automl: 07-01 15:25:04] {2933} INFO - iteration 54, current learner lgbm\n",
"[flaml.automl: 07-01 15:25:38] {3108} INFO - at 203.0s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n",
"[flaml.automl: 07-01 15:25:38] {2933} INFO - iteration 55, current learner lgbm\n",
"[flaml.automl: 07-01 15:25:47] {3108} INFO - at 211.9s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n",
"[flaml.automl: 07-01 15:25:47] {2933} INFO - iteration 56, current learner lgbm\n",
"[flaml.automl: 07-01 15:25:51] {3108} INFO - at 216.2s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n",
"[flaml.automl: 07-01 15:25:51] {2933} INFO - iteration 57, current learner lgbm\n",
"[flaml.automl: 07-01 15:25:53] {3108} INFO - at 217.8s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n",
"[flaml.automl: 07-01 15:25:53] {2933} INFO - iteration 58, current learner lgbm\n",
"[flaml.automl: 07-01 15:26:19] {3108} INFO - at 243.9s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n",
"[flaml.automl: 07-01 15:26:21] {3372} INFO - retrain lgbm for 1.7s\n",
"[flaml.automl: 07-01 15:26:21] {3379} INFO - retrained model: LGBMRegressor(colsample_bytree=0.6884091116362046,\n",
" learning_rate=0.0825101833775657, max_bin=1023,\n",
" min_child_samples=15, n_estimators=436, num_leaves=46,\n",
" reg_alpha=0.0010949400705571237, reg_lambda=0.004934208563558304,\n",
" verbose=-1)\n",
"[flaml.automl: 07-01 15:26:21] {2672} INFO - fit succeeded\n",
"[flaml.automl: 07-01 15:26:21] {2673} INFO - Time taken to find the best model: 116.267258644104\n"
]
}
],
"source": [
"# The main FLAML AutoML API\n",
"automl.fit(X_train=X_train, y_train=y_train, **settings)"
]
},
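{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"The log line `Minimizing error metric: 1-r2` means that each reported `best error` is one minus the validation r2. The following is a minimal sketch of that relation on made-up toy values. The function name `r2_manual` is hypothetical, and NumPy is assumed to be installed.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def r2_manual(y_true, y_pred):\n",
"    # Coefficient of determination: 1 - SS_res / SS_tot\n",
"    y_true = np.asarray(y_true, dtype=float)\n",
"    y_pred = np.asarray(y_pred, dtype=float)\n",
"    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares\n",
"    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares\n",
"    return 1.0 - ss_res / ss_tot\n",
"\n",
"\n",
"# Toy values, made up for illustration only.\n",
"y_true = [3.0, 5.0, 7.0, 9.0]\n",
"y_pred = [2.5, 5.5, 7.0, 9.5]\n",
"\n",
"r2 = r2_manual(y_true, y_pred)\n",
"error = 1.0 - r2  # the quantity the log reports as the best error\n",
"print(f'r2 = {r2:.4f}, error = {error:.4f}')  # prints: r2 = 0.9625, error = 0.0375\n",
"```\n",
"\n",
"By the same relation, the final best error of 0.1558 in the log above corresponds to a validation r2 of about 0.8442."
]
},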
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Best model and metric"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best hyperparameter config: {'n_estimators': 436, 'num_leaves': 46, 'min_child_samples': 15, 'learning_rate': 0.0825101833775657, 'log_max_bin': 10, 'colsample_bytree': 0.6884091116362046, 'reg_alpha': 0.0010949400705571237, 'reg_lambda': 0.004934208563558304}\n",
"Best r2 on validation data: 0.8442\n",
"Training duration of best run: 1.668 s\n"
]
}
],
"source": [
"# Retrieve the best config, its validation score, and its training time\n",
"print('Best hyperparameter config:', automl.best_config)\n",
"print('Best r2 on validation data: {0:.4g}'.format(1-automl.best_loss))\n",
"print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))"
]
},
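{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Note that the best config reports `log_max_bin: 10`, while the retrained `LGBMRegressor` in the fit log shows `max_bin=1023`. The two values are consistent with `max_bin = 2**log_max_bin - 1`. The sketch below only checks that arithmetic. The helper name `to_max_bin` is hypothetical and is not a FLAML function, and the mapping is inferred from the two printed values rather than taken from FLAML's source.\n",
"\n",
"```python\n",
"def to_max_bin(log_max_bin: int) -> int:\n",
"    # Hypothetical helper: map a log-scale bin setting to a bin count.\n",
"    return (1 << log_max_bin) - 1\n",
"\n",
"\n",
"print(to_max_bin(10))  # prints: 1023\n",
"```\n",
"\n",
"Searching over the exponent rather than the bin count itself keeps the search space small while still covering a wide range of bin counts."
]
},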
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-container-id-1 {color: black;background-color: white;}#sk-container-id-1 pre{padding: 0;}#sk-container-id-1 div.sk-toggleable {background-color: white;}#sk-container-id-1 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-1 label.sk-toggleable__label-arrow:before {content: \"▸\";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-1 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-1 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-1 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: \"▾\";}#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-1 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-1 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-1 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-1 div.sk-label:hover 
label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-1 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-1 div.sk-item {position: relative;z-index: 1;}#sk-container-id-1 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-1 div.sk-item::before, #sk-container-id-1 div.sk-parallel-item::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-1 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-1 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-1 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-1 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-1 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-1 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-1 div.sk-label-container {text-align: center;}#sk-container-id-1 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. 
See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-1 div.sk-text-repr-fallback {display: none;}</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LGBMRegressor(colsample_bytree=0.6884091116362046,\n",
" learning_rate=0.0825101833775657, max_bin=1023,\n",
" min_child_samples=15, n_estimators=436, num_leaves=46,\n",
" reg_alpha=0.0010949400705571237, reg_lambda=0.004934208563558304,\n",
" verbose=-1)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" checked><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label sk-toggleable__label-arrow\">LGBMRegressor</label><div class=\"sk-toggleable__content\"><pre>LGBMRegressor(colsample_bytree=0.6884091116362046,\n",
" learning_rate=0.0825101833775657, max_bin=1023,\n",
" min_child_samples=15, n_estimators=436, num_leaves=46,\n",
" reg_alpha=0.0010949400705571237, reg_lambda=0.004934208563558304,\n",
" verbose=-1)</pre></div></div></div></div></div>"
],
"text/plain": [
"LGBMRegressor(colsample_bytree=0.6884091116362046,\n",
" learning_rate=0.0825101833775657, max_bin=1023,\n",
" min_child_samples=15, n_estimators=436, num_leaves=46,\n",
" reg_alpha=0.0010949400705571237, reg_lambda=0.004934208563558304,\n",
" verbose=-1)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"automl.model.estimator"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<BarContainer object of 8 artists>"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAc0AAAD4CAYAAACOhb23AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAAAee0lEQVR4nO3deZRdVZn38e+PIiRAsAIksqojUoJpEEgokgIFQxonUOyXQaKhoSFgL9MMouKiJS0uDdi2QLSlURDC20gQRDsMwgKZGgjkRUKoIpWqBAggiS0RQYYUYCRA5Xn/OLvkcq3h3Bpyh/p91rqrzt1nn72ffU/Bk73PqXMVEZiZmVn/tih3AGZmZtXCSdPMzCwnJ00zM7OcnDTNzMxyctI0MzPLactyB2DDa/z48dHY2FjuMMzMqkpra+sLETGhuNxJs8Y1NjbS0tJS7jDMzKqKpN/2VO7lWTMzs5ycNM3MzHJy0jQzM8vJSdPMzCwnJ00zM7OcnDTNzMxyctI0MzPLyUnTzMwsJz/coMZ1rOukce6t5Q7DzGzYrD3v05utL880zczMcnLSNDMzy8lJ08zMLCcnTTMzs5ycNM3MzHJy0jQzM8vJSbOApNeGoc3DJc1N20dK2nMAbSyW1DzUsZmZWWmcNIdZRNwcEeelt0cCJSdNMzOrDE6aPVBmvqSVkjokzUrlB6dZ33WSHpd0jSSlfYelslZJF0m6JZWfKOlHkg4EDgfmS2qTtFvhDFLSeElr0/bWkn4u6TFJNwJbF8R2iKQHJT0iaZGksZv30zEzG7n8RKCefQZoAvYBxgMPS7o/7dsX2Av4PfAA8GFJLcBlwIyIWCPp2uIGI+LXkm4GbomI6wBSvu3JKcCGiPiApCnAI6n+eOAbwMcj4k+SzgK+CpxbeLCkOcAcgLp3TRjYJ2BmZn/FM82eTQeujYiuiHgOuA/YL+1bFhHPRMQmoA1oBPYAno6INanOXyXNEs0ArgaIiHagPZV/iGx59wFJbcBsYJfigyNiQUQ0R0Rz3Tb1gwzFzMy6eaZZuo0F210M7jN8i7f/4TImR30Bd0XEPwyiTzMzGyDPNHu2BJglqU7SBLKZ37I+6q8GdpXUmN7P6qXeq8B2Be/XAtPS9syC8vuBYwEk7Q1MSeVLyZaD35/2bSvpb/MMyMzMBs9Js2c3ki2JrgDuAb4WEX/orXJE/Bk4FbhdUitZcuzsoerPgX+RtFzSbsD3gFMkLSe7dtrtx8BYSY+RXa9sTf38ETgRuFZSO/Ag2dKwmZltBoqIcsdQEySNjYjX0t20FwNPRsQPyh3X6IZJ0TD7wnKHYWY2bIbjq8EktUbEX/19vGeaQ+cL6eacVUA92d20ZmZWQ3wj0BBJs8qyzyzNzGz4eKZpZmaWk5OmmZlZTk6aZmZmOfmaZo2bPLGelmG4s8zMbCTyTNPMzCwnJ00zM7OcnDTNzMxyctI0MzPLyTcC1biOdZ00zr213GGYmW1Ww/FoPfBM08zMLDcnTTMzs5ycNM3MzHJy0jQzM8vJSdPMzCwnJ00zM7OcnDRLIOm1fvaPk3Rqwfu/kXRd2m6SdNgA+pwn6czSozUzs6HmpDm0xgF/SZoR8fuImJneNgElJ00zM6scTpoDIGmspLslPSKpQ9IRadd5wG6S2iTNl9QoaaWkrYBzgVlp36ziGWSq15i2z5b0hKT/B+xeUGc3SbdLapW0RNIem2/UZmbmJwINzOvAURHxiqTxwFJJNwNzgb0jogmgOwlGxBuSvgk0R8QX0755PTUsaRpwDNnMdEvgEaA17V4AnBwRT0r6IHAJ8NEe2pgDzAGoe9eEIRiumZmBk+ZACfh3STOATcBEYKchavsg4MaI2ACQkjGSxgIHAoskddcd3VMDEbGALMEyumFSDFFcZmYjnpPmwBwHTACmRcSbktYCY0ps4y3euTze3/FbAOu7Z7FmZrb5+ZrmwNQDz6eE+RFgl1T+KrBdL8cU71sLTA
WQNBV4Xyq/HzhS0taStgP+D0BEvAKskfTZdIwk7TN0QzIzs/44aQ7MNUCzpA7gBOBxgIh4EXgg3dQzv+iYe4E9u28EAq4HdpC0Cvgi8ERq4xHgF8AK4Dbg4YI2jgP+SdIKYBVwBGZmttkowpe8atnohknRMPvCcodhZrZZDfarwSS1RkRzcblnmmZmZjk5aZqZmeXkpGlmZpaTk6aZmVlO/jvNGjd5Yj0tg7wgbmZmGc80zczMcnLSNDMzy8lJ08zMLCcnTTMzs5x8I1CN61jXSePcW8sdhlnFGOyTYmxk80zTzMwsJydNMzOznJw0zczMcnLSNDMzy8lJ08zMLCcnTTMzs5xGRNKU1ChpZRn6fa3E+vMkndlDeVniNzOzdxoRSdPMzGwojKSkWSfpckmrJN0paWtJTZKWSmqXdKOk7QEkLZbUnLbHS1qbtveStExSWzpmUir/x4LyyyTVdXcq6TuSVqR+dkpljZLuSW3cLem9xcFKmpaOWwGcVlDeYwxmZjb8RlLSnARcHBF7AeuBo4GrgLMiYgrQAXyrnzZOBv4zIpqAZuAZSR8AZgEfTuVdwHGp/rbA0ojYB7gf+EIq/yGwMPV7DXBRD339BDg9HdtnDMUHSpojqUVSS9eGzn6GZGZmeY2kpLkmItrSdiuwGzAuIu5LZQuBGf208SDwdUlnAbtExJ+BjwHTgIcltaX3u6b6bwC3FPTZmLYPAH6Wtn8KTC/sRNK4FNv9BXX6iuEdImJBRDRHRHPdNvX9DMnMzPIaSUlzY8F2FzCuj7pv8fZnM6a7MCJ+BhwO/Bn4laSPAiKbNTal1+4RMS8d8mZEREGfg37Wby8xmJnZZjCSkmaxTuBlSQel98cD3bPOtWSzR4CZ3QdI2hV4OiIuAm4CpgB3AzMlvTvV2UHSLv30/WvgmLR9HLCkcGdErAfWS5peUKevGMzMbDMYyUkTYDYwX1I70AScm8q/B5wiaTkwvqD+54CVaRl2b+CqiHgU+AZwZ2rnLqChn35PB05K9Y8HvtxDnZOAi1Nf6iuGXCM1M7NB09urh1aLRjdMiobZF5Y7DLOK4a8GszwktUZEc3H5SJ9pmpmZ5eakaWZmlpOTppmZWU5OmmZmZjkN+u8GrbJNnlhPi298MDMbEp5pmpmZ5eSkaWZmlpOTppmZWU5OmmZmZjn5RqAa17Guk8a5t5Y7DDMbBn660ebnmaaZmVlOTppmZmY5OWmamZnl5KRpZmaWk5OmmZlZTk6aZmZmOTlpDgNJjZJW5qhzbMH7ZkkXDX90ZmY2UE6a5dMI/CVpRkRLRHypfOGYmVl/RmTSTLO8xyVdI+kxSddJ2kbSxyQtl9Qh6QpJo1P9tZIuSOXLJL0/lV8paWZBu6/10tcSSY+k14Fp13nAQZLaJJ0h6WBJt6RjdpD0S0ntkpZKmpLK56W4Fkt6WpKTrJnZZjQik2ayO3BJRHwAeAX4KnAlMCsiJpM9LemUgvqdqfxHwIUl9PM88ImImArMArqXYOcCSyKiKSJ+UHTMOcDyiJgCfB24qmDfHsChwP7AtySNKu5Q0hxJLZJaujZ0lhCqmZn1ZSQnzd9FxANp+2rgY8CaiHgilS0EZhTUv7bg5wEl9DMKuFxSB7AI2DPHMdOBnwJExD3AjpLelfbdGhEbI+IFsoS8U/HBEbEgIpojorlum/oSQjUzs76M5GfPRtH79cCOOet3b79F+oeHpC2ArXo47gzgOWCfVPf1AcRaaGPBdhcj+xyamW1WI3mm+V5J3TPGY4EWoLH7eiVwPHBfQf1ZBT8fTNtrgWlp+3CyWWWxeuDZiNiU2qxL5a8C2/US2xLgOABJBwMvRMQreQZlZmbDZyTPUlYDp0m6AngU+BKwFFgkaUvgYeDSgvrbS2onm+n9Qyq7HLhJ0grgduBPPfRzCXC9pBOK6rQDXenYK4HlBcfMA65I/W0AZg9uqGZmNhQUUbxKWfskNQK3RMTeOeuvBZrTdcSqMrphUjTMvr
DcYZjZMPBXgw0fSa0R0VxcPpKXZ83MzEoyIpdnI2ItkGuWmeo3DlswZmZWNTzTNDMzy8lJ08zMLCcnTTMzs5xG5DXNkWTyxHpafIedmdmQ8EzTzMwsJydNMzOznJw0zczMcnLSNDMzy8k3AtW4jnWdNM69tdxhmFUtP6rOCnmmaWZmlpOTppmZWU5OmmZmZjk5aZqZmeXkpGlmZpaTk6aZmVlOFZc0JY2TdGo/dRolHZujrUZJK/vYf6KkHw0kzqE43szMqkvFJU1gHNBn0gQagX6TZrlI8t+/mpnVoEpMmucBu0lqkzQ/vVZK6pA0q6DOQanOGWlGuUTSI+l1YAn97SxpsaQnJX2ru1DSP0palvq4TFJdKj9J0hOSlgEfLqh/paRLJT0EXCCpSdJSSe2SbpS0farXW/liST+Q1CLpMUn7SbohxfVvqc62km6VtCJ9JrMwM7PNphKT5lzgNxHRBCwFmoB9gI8D8yU1pDpLIqIpIn4APA98IiKmArOAi0rob3/gaGAK8FlJzZI+kNr5cIqjCzgu9X0OWbKcDuxZ1NZ7gAMj4qvAVcBZETEF6AC6E3Jv5QBvREQzcClwE3AasDdwoqQdgU8Cv4+IfSJib+D2ngYkaU5Kvi1dGzpL+CjMzKwvlb6MOB24NiK6gOck3QfsB7xSVG8U8CNJTWQJ7m9L6OOuiHgRQNINqc+3gGnAw5IAtiZLzB8EFkfEH1P9XxT1tSgiuiTVA+Mi4r5UvhBY1Ft5wfE3p58dwKqIeDb18zSwcyr/vqTzgVsiYklPA4qIBcACgNENk6KEz8LMzPpQ6UkzrzOA58hmpFsAr5dwbHFSCUDAwoj418Idko7sp60/ldBvTzamn5sKtrvfbxkRT0iaChwG/JukuyPi3EH2aWZmOVXi8uyrwHZpewkwS1KdpAnADGBZUR2AeuDZiNgEHA/UldDfJyTtIGlr4EjgAeBuYKakdwOk/bsADwF/J2lHSaOAz/bUYER0Ai9LOigVHQ/c11t53kAl/Q2wISKuBuYDU0sYp5mZDVLFzTQj4kVJD6Q/FbkNaAdWkM0AvxYRf5D0ItAlaQVwJXAJcL2kE8iu85Uy41sGXE92PfLqiGgBkPQN4E5JWwBvAqdFxFJJ84AHgfVAWx/tzgYulbQN8DRwUj/leUwmu667KcV0SgnHmpnZICnCl7xq2eiGSdEw+8Jyh2FWtfzVYCOTpNZ0Y+Y7VOLyrJmZWUWquOXZ4SDpUOD8ouI1EXFUOeIxM7PqNCKSZkTcAdxR7jjMzKy6eXnWzMwspxEx0xzJJk+sp8U3MpiZDQnPNM3MzHJy0jQzM8vJSdPMzCwnJ00zM7OcfCNQjetY10nj3FvLHYbZiOcnC9UGzzTNzMxyctI0MzPLyUnTzMwsJydNMzOznJw0zczMcnLSNDMzy8lJ08zMLKeaTpqSxkk6tZ86jZKOzdFWo6SVQxedmZlVm5pOmsA4oM+kCTQC/SbNUkjyQyPMzGpQrSfN84DdJLVJmp9eKyV1SJpVUOegVOeMNKNcIumR9DowT0eSTpR0s6R7gLsl7SDpl5LaJS2VNCXV6618nqSFqe/fSvqMpAtSrLdLGpXqnSfp0XT893qJZY6kFkktXRs6B/sZmplZUuszornA3hHRJOlo4GRgH2A88LCk+1OdMyPi7wEkbQN8IiJelzQJuBZoztnfVGBKRLwk6YfA8og4UtJHgauAJuCcXsoBdgM+AuwJPAgcHRFfk3Qj8GlJS4CjgD0iIiSN6ymIiFgALAAY3TApcsZuZmb9qPWZZqHpwLUR0RURzwH3Afv1UG8UcLmkDmARWQLL666IeKmgv58CRMQ9wI6S3tVHOcBtEfEm0AHUAben8g6yZeRO4HXgvyR9BthQQmxmZjZIIylp5nUG8BzZjLQZ2KqEY/80yL43AkTEJuDNiOieJW4CtoyIt4D9geuAv+ftpGpmZptBrSfNV4Ht0vYSYJakOkkTgBnAsqI6APXAsylxHU
824xuIJcBxAJIOBl6IiFf6KO+XpLFAfUT8iiy57zPA2MzMbABq+ppmRLwo6YH0pyK3Ae3ACiCAr0XEHyS9CHRJWgFcCVwCXC/pBLKZ3EBnj/OAKyS1ky2jzu6nPI/tgJskjQEEfHWAsZmZ2QDo7RVAq0WjGyZFw+wLyx2G2Yjn79OsLpJaI+KvbgKt9eVZMzOzIVPTy7PDQdKhwPlFxWsi4qhyxGNmZpuPk2aJIuIO4I5yx2FmZpufk2aNmzyxnhZfSzEzGxK+pmlmZpaTk6aZmVlOTppmZmY5OWmamZnl5BuBalzHuk4a595a7jDMcvNDAKySeaZpZmaWk5OmmZlZTk6aZmZmOTlpmpmZ5eSkaWZmlpOTppmZWU5OmmZmZjn1mzQlNUpaOVwBSPr1cLU9WIVjl9Qs6aJyx2RmZuVT9ocbRMSB5Y4hj4hoAVrKHYeZmZVP3uXZOkmXS1ol6U5JW0tqkrRUUrukGyVtDyBpsaTmtD1e0tq0vZekZZLa0jGTUvlr6efB6djrJD0u6RpJSvsOS2Wtki6SdEtvgUqaJ2mhpCWSfivpM5IukNQh6XZJo1K9aZLuS23eIamhoHyFpBXAaQXtHtzdr6T9JT0oabmkX0vaPZWfKOmG1M+Tki7o60OV9GNJLelzPaegvMfxStpW0hXpc1wu6Yhe2p2T2m3p2tDZVwhmZlaCvElzEnBxROwFrAeOBq4CzoqIKUAH8K1+2jgZ+M+IaAKagWd6qLMv8BVgT2BX4MOSxgCXAZ+KiGnAhBzx7gZ8FDgcuBq4NyImA38GPp0S5w+BmanNK4DvpGN/ApweEfv00f7jwEERsS/wTeDfC/Y1AbOAycAsSTv30c7ZEdEMTAH+TtKUfsZ7NnBPROwPfASYL2nb4kYjYkFENEdEc9029X10b2Zmpci7PLsmItrSditZUhoXEfelsoXAon7aeBA4W9J7gBsi4ske6iyLiGcAJLUBjcBrwNMRsSbVuRaY009ft0XEm5I6gDrg9lTekdrcHdgbuCtNZuuAZyWNS+O6P9X/KfCpHtqvBxam2XIAowr23R0RnWkMjwK7AL/rJc7PSZpDdh4ayP6xsEUf4z0EOFzSmen9GOC9wGN9fhpmZjYk8ibNjQXbXcC4Puq+xdsz2DHdhRHxM0kPAZ8GfiXpnyPinn76Geg1142pz02S3oyISOWbUpsCVkXEAYUHpaSZx7fJZq9HSWoEFhf3nfQ6BknvA84E9ouIlyVdScHn1QsBR0fE6pxxmpnZEBron5x0Ai9LOii9Px7onnWuBaal7ZndB0jalWwGdRFwE9mSZB6rgV1TcoJs6XOwVgMTJB2QYhslaa+IWA+slzQ91Tuul+PrgXVp+8QBxvAu4E9Ap6SdeHtG29d47wBOL7jWu+8A+zYzswEYzN9pzia7ptZOdh3v3FT+PeAUScuB8QX1PwesTMuue5NdE+1XRPwZOBW4XVIr8CpZ0h6wiHiDLKGfn274aQO67+I9Cbg4xalemrgA+G4a44BmwxGxAlhOdn30Z8ADqbyv8X6bbCm4XdKq9N7MzDYTvb1yWbkkjY2I19IM62LgyYj4QbnjGi5DOd7RDZOiYfaFQxqf2XDy92laJZDUmm7UfIdqeSLQF9LMbxXZ0uhl5Q1n2I208ZqZVYWyP9wgjzTLesdMS9JJwJeLqj4QEadRYdINUKOLio+PiI6e6vc0XjMzK7+qSJo9iYifkP1NZcWLiA+WOwYzMxu8almeNTMzK7uqnWlaPpMn1tPiGyvMzIaEZ5pmZmY5OWmamZnl5KRpZmaWk5OmmZlZTr4RqMZ1rOukce6t5Q7DzCqIn7o0cJ5pmpmZ5eSkaWZmlpOTppmZWU5OmmZmZjk5aZqZmeXkpGlmZpaTk6aZmVlONZs0JS2W1Jy2fyVp3BC2fbKkE4aqPTMzqw4j4uEGEXHYELd36VC2Z2Zm1aGiZpqSGiU9LulKSU9IukbSxyU9IOlJSftL2lbSFZKWSVou6Y
h07NaSfi7pMUk3AlsXtLtW0vi0/UtJrZJWSZpTUOc1Sd+RtELSUkk79RHnPElnpu3Fks5P8Twh6aBUXifpe5JWSmqXdHoq/1iKuyONY3RBjN+V1CapRdJUSXdI+o2kkwv6/hdJD6c2z+klvjmpjZauDZ2DOCNmZlaoopJm8n7g+8Ae6XUsMB04E/g6cDZwT0TsD3wEmC9pW+AUYENEfAD4FjCtl/Y/HxHTgGbgS5J2TOXbAksjYh/gfuALJcS8ZYrnK6lvgDlAI9AUEVOAaySNAa4EZkXEZLKZ/ikF7fxvRDQBS1K9mcCHgHMAJB0CTAL2B5qAaZJmFAcTEQsiojkimuu2qS9hGGZm1pdKTJprIqIjIjYBq4C7IyKADrIkdAgwV1IbsBgYA7wXmAFcDRAR7UB7L+1/SdIKYCmwM1kSAngDuCVtt6a+8rqhh+M+DlwWEW+lmF4Cdk/jeyLVWZji7nZz+tkBPBQRr0bEH4GN6ZrsIem1HHiE7B8VkzAzs82iEq9pbizY3lTwfhNZvF3A0RGxuvAgSf02LOlgsmR2QERskLSYLOkCvJmSM6mPUj6b7hhLPa63dgrH3f1+S0DAdyPiskH0YWZmA1SJM83+3AGcrpQlJe2byu8nW8pF0t7AlB6OrQdeTglzD7Klz+FyF/DPkrZMMe0ArAYaJb0/1TkeuK+ENu8APi9pbGpzoqR3D2HMZmbWh2pMmt8GRgHtklal9wA/BsZKegw4l2yptNjtwJapznlkS7TD5f8C/5viXAEcGxGvAycBiyR1kM0gc9+JGxF3Aj8DHkzHXwdsN+SRm5lZj/T2iqTVotENk6Jh9oXlDsPMKoi/T7N/klojorm4vBpnmmZmZmVRiTcCVQxJZwOfLSpeFBHfKUc8ZmZWXk6afUjJ0QnSzMwAJ82aN3liPS2+fmFmNiR8TdPMzCwnJ00zM7OcnDTNzMxyctI0MzPLyUnTzMwsJydNMzOznJw0zczMcnLSNDMzy8lJ08zMLCd/y0mNk/Qq2fd4VrvxwAvlDmKQamEM4HFUkloYA1TmOHaJiAnFhX6MXu1b3dPX21QbSS3VPo5aGAN4HJWkFsYA1TUOL8+amZnl5KRpZmaWk5Nm7VtQ7gCGSC2MoxbGAB5HJamFMUAVjcM3ApmZmeXkmaaZmVlOTppmZmY5OWnWKEmflLRa0lOS5pY7nv5IWiupQ1KbpJZUtoOkuyQ9mX5un8ol6aI0tnZJU8sY9xWSnpe0sqCs5LglzU71n5Q0uwLGME/SunQ+2iQdVrDvX9MYVks6tKC8rL9zknaWdK+kRyWtkvTlVF4156OPMVTV+ZA0RtIySSvSOM5J5e+T9FCK6ReStkrlo9P7p9L+xv7GVzYR4VeNvYA64DfArsBWwApgz3LH1U/Ma4HxRWUXAHPT9lzg/LR9GHAbIOBDwENljHsGMBVYOdC4gR2Ap9PP7dP29mUewzzgzB7q7pl+n0YD70u/Z3WV8DsHNABT0/Z2wBMp3qo5H32MoarOR/pMx6btUcBD6TP+b+CYVH4pcEraPhW4NG0fA/yir/Ftzt+r4pdnmrVpf+CpiHg6It4Afg4cUeaYBuIIYGHaXggcWVB+VWSWAuMkNZQhPiLifuClouJS4z4UuCsiXoqIl4G7gE8Oe/BJL2PozRHAzyNiY0SsAZ4i+30r++9cRDwbEY+k7VeBx4CJVNH56GMMvanI85E+09fS21HpFcBHgetSefG56D5H1wEfkyR6H1/ZOGnWponA7wreP0Pf/+FVggDulNQqaU4q2ykink3bfwB2StuVPr5S467U8XwxLVte0b2kSZWMIS3v7Us2w6nK81E0Bqiy8yGpTlIb8DzZPzx+A6yPiLd6iOkv8ab9ncCOVMA4ijlpWqWYHhFTgU8Bp0maUbgzsrWaqvv7qGqNG/gxsBvQBDwLfL+s0ZRA0ljgeuArEfFK4b5qOR89jKHqzkdEdEVEE/AestnhHuWNaGg4adamdcDOBe/fk8
oqVkSsSz+fB24k+4/sue5l1/Tz+VS90sdXatwVN56IeC79T28TcDlvL4lV9BgkjSJLNtdExA2puKrOR09jqNbzARAR64F7gQPIlsC7n3leGNNf4k3764EXqaBxdHPSrE0PA5PSnWpbkV1Yv7nMMfVK0raStuveBg4BVpLF3H3n4mzgprR9M3BCuvvxQ0BnwfJbJSg17juAQyRtn5bdDkllZVN0jfgosvMB2RiOSXc7vg+YBCyjAn7n0jWw/wIei4j/KNhVNeejtzFU2/mQNEHSuLS9NfAJsuuz9wIzU7Xic9F9jmYC96RVgd7GVz7lvAvJr+F7kd0Z+ATZdYSzyx1PP7HuSnaH3ApgVXe8ZNc07gaeBP4H2CGVC7g4ja0DaC5j7NeSLZe9SXa95Z8GEjfwebKbHJ4CTqqAMfw0xdhO9j+uhoL6Z6cxrAY+VSm/c8B0sqXXdqAtvQ6rpvPRxxiq6nwAU4DlKd6VwDdT+a5kSe8pYBEwOpWPSe+fSvt37W985Xr5MXpmZmY5eXnWzMwsJydNMzOznJw0zczMcnLSNDMzy8lJ08zMLCcnTTMzs5ycNM3MzHL6/xT29zgweRDLAAAAAElFTkSuQmCC",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"plt.barh(automl.feature_names_in_, automl.feature_importances_)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"# Pickle and save the automl object.\n",
"import pickle\n",
"with open('automl.pkl', 'wb') as f:\n",
"    pickle.dump(automl, f, pickle.HIGHEST_PROTOCOL)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Predicted labels [162131.66541776 261207.15681479 157976.50985102 ... 205999.47588989\n",
" 223985.57564169 277733.77442341]\n",
"True labels 14740 136900.0\n",
"10101 241300.0\n",
"20566 200700.0\n",
"2670 72500.0\n",
"15709 460000.0\n",
" ... \n",
"13132 121200.0\n",
"8228 137500.0\n",
"3948 160900.0\n",
"8522 227300.0\n",
"16798 265600.0\n",
"Name: median_house_value, Length: 5160, dtype: float64\n"
]
}
],
"source": [
"# Compute predictions on the test dataset.\n",
"y_pred = automl.predict(X_test)\n",
"print('Predicted labels', y_pred)\n",
"print('True labels', y_test)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"r2 = 0.8522136092023422\n",
"mse = 1953515373.4904487\n",
"mae = 29086.15911420206\n"
]
}
],
"source": [
"# Compute several metrics on the test dataset.\n",
"# sklearn_metric_loss_score returns a loss (lower is better), so r2 is reported as 1 - loss.\n",
"from flaml.ml import sklearn_metric_loss_score\n",
"print('r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))\n",
"print('mse', '=', sklearn_metric_loss_score('mse', y_pred, y_test))\n",
"print('mae', '=', sklearn_metric_loss_score('mae', y_pred, y_test))"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20, 'learning_rate': 0.09999999999999995, 'log_max_bin': 8, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 1.0}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20, 'learning_rate': 0.09999999999999995, 'log_max_bin': 8, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 1.0}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 22, 'num_leaves': 4, 'min_child_samples': 18, 'learning_rate': 0.2293009676418639, 'log_max_bin': 9, 'colsample_bytree': 0.9086551727646448, 'reg_alpha': 0.0015561782752413472, 'reg_lambda': 0.33127416269768944}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 22, 'num_leaves': 4, 'min_child_samples': 18, 'learning_rate': 0.2293009676418639, 'log_max_bin': 9, 'colsample_bytree': 0.9086551727646448, 'reg_alpha': 0.0015561782752413472, 'reg_lambda': 0.33127416269768944}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 28, 'num_leaves': 20, 'min_child_samples': 17, 'learning_rate': 0.32352862101602586, 'log_max_bin': 10, 'colsample_bytree': 0.8801327898366843, 'reg_alpha': 0.004475520554844502, 'reg_lambda': 0.033081571878574946}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 28, 'num_leaves': 20, 'min_child_samples': 17, 'learning_rate': 0.32352862101602586, 'log_max_bin': 10, 'colsample_bytree': 0.8801327898366843, 'reg_alpha': 0.004475520554844502, 'reg_lambda': 0.033081571878574946}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 44, 'num_leaves': 81, 'min_child_samples': 29, 'learning_rate': 0.26477481203117526, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.028486834222229064}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 44, 'num_leaves': 81, 'min_child_samples': 29, 'learning_rate': 0.26477481203117526, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.028486834222229064}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 44, 'num_leaves': 70, 'min_child_samples': 19, 'learning_rate': 0.182061387379683, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.001534805484993033}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 44, 'num_leaves': 70, 'min_child_samples': 19, 'learning_rate': 0.182061387379683, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.001534805484993033}}\n",
"{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 34, 'num_leaves': 178, 'min_child_samples': 14, 'learning_rate': 0.16444778912464286, 'log_max_bin': 9, 'colsample_bytree': 0.8963761466973907, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.027857858022692302}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 34, 'num_leaves': 178, 'min_child_samples': 14, 'learning_rate': 0.16444778912464286, 'log_max_bin': 9, 'colsample_bytree': 0.8963761466973907, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.027857858022692302}}\n"
]
}
],
"source": [
"# Load the tuning history from the log file and print each recorded configuration.\n",
"from flaml.data import get_output_from_log\n",
"time_history, best_valid_loss_history, valid_loss_history, config_history, metric_history = \\\n",
"    get_output_from_log(filename=settings['log_file_name'], time_budget=60)\n",
"\n",
"for config in config_history:\n",
"    print(config)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEWCAYAAABrDZDcAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAAAb5ElEQVR4nO3de5wddZ3m8c9DEyDKJWJaB0JC4hCjwQvRCOIVGDXghURFBpidVRyNzojjiBMFR5HBZQeHGVx8bdQFlgFd7ggxajQygqiAkGCAEDBMRIQ0KEEIIEZCkmf/qGo4NKdPOqHrnO5Tz/v16lef+tXvVH0r6e7nVP3qIttERER9bdPpAiIiorMSBBERNZcgiIiouQRBRETNJQgiImouQRARUXMJgogWJL1R0spO1xFRpQRBjFiS7pL0lk7WYPuntqdVtXxJsyT9RNKjktZIulrSoVWtL6KZBEHUmqSeDq77MOAS4BvAHsALgROAd23FsiQpv8+xVfKDE6OOpG0kHSfpV5J+L+liSbs2zL9E0m8lPVx+2t67Yd45kr4maZGkx4ADyz2Pf5R0S/meiyTtUPY/QNLqhvcP2rec/2lJ90m6V9KHJFnSXk22QcBpwBdtn2X7YdubbF9t+8NlnxMl/b+G90wul7dtOf1jSSdLugb4IzBP0tIB6/mkpIXl6+0l/ZukuyX9TtLXJY19lv8d0QUSBDEafRyYA7wZ2B14CJjfMP/7wFTgBcAvgPMGvP8o4GRgJ+BnZdvhwMHAFOAVwAdarL9pX0kHA8cCbwH2Ag5osYxpwETg0hZ9huKvgbkU2/J1YJqkqQ3zjwLOL1+fArwY2KesbwLFHkjUXIIgRqOPAv9ke7Xtx4ETgcP6PynbPtv2ow3zXilpl4b3f9v2NeUn8D+VbV+xfa/tB4HvUPyxHMxgfQ8H/sP2Ctt/LNc9mOeX3+8b2iYP6pxyfRtsPwx8GzgSoAyElwALyz2QucAnbT9o+1HgfwJHPMv1RxdIEMRotCdwuaS1ktYCtwMbgRdK6pF0SnnY6BHgrvI94xvef0+TZf624fUfgR1brH+wvrsPWHaz9fT7ffl9txZ9hmLgOs6nDAKKvYEFZSj1As8Bbmz4d/tB2R41lyCI0ege4BDb4xq+drDdR/HHbzbF4ZldgMnle9Tw/qpuuXsfxaBvv4kt+q6k2I73tujzGMUf735/1qTPwG25AuiVtA9FIPQfFnoAWAfs3fBvtovtVoEXNZEgiJFujKQdGr62pTgWfrKkPQEk9UqaXfbfCXic4hP3cygOf7TLxcDRkl4q6TnA5wfr6OL+78cCn5d0tKSdy0HwN0g6o+x2E/AmSZPKQ1vHb64A209QnIl0KrArRTBgexNwJvBlSS8AkDRB0qyt3djoHgmCGOkWUXyS7f86ETgdWAj8UNKjwM+B/cr+3wB+A/QBt5Xz2sL294GvAFcBqxrW/fgg/S8F/hL4IHAv8Dvgf1Ac58f2FcBFwC3AjcB3h1jK+RR7RJfY3tDQ/pn+usrDZv9JMWgdNac8mCaiGpJeCtwKbD/gD3LEiJI9gohhJOnd5fn6zwO+BHwnIRAjXYIgYnh9BLgf+BXFmUx/29lyIjYvh4YiImouewQRETW3bacL2FLjx4/35MmTO11GRMSocuONNz5gu+kFhKMuCCZPnszSpUs33zEiIp4k6TeDzcuhoYiImksQRETUXIIgIqLmEgQRETWXIIiIqLlRd9ZQRETdLFjWx6mLV3Lv2nXsPm4s82ZNY86MCcO2/ARBRMQItmBZH8dftpx1T2wEoG/tOo6/bDnAsIVBDg1FRIxgpy5e+WQI9Fv3xEZOXbxy2NaRIIiIGMHuXbtui9q3RoIgImIE233c2C1q3xoZI+igqgeAIrpNHX9n5s2a9rQxAoCxY3qYN2v4Hi6XIOiQdgwARXSTuv7O9G/bpy+9hfUbNzGhggAcdc8jmDlzprvhpnOvP+VK+poc49uuZxtmTB
rX/oIiRrhld69l/cZNz2ivy+/Mbfc9wvTdduaij+y/Ve+XdKPtmc3mZYygQwYb6Gn2gx4Rg/9u1OV3ZvpuOzN7n2r2fGp7aKjTxxp3Hze26R7BhHFjtzrxI7rZYHvR+Z159mq5R9B/rLFv7TrMU8caFyzra1sN82ZNY+yYnqe1DfcAUEQ3ye9MdWq5RzDYBRqfvvQWLrjh7rbVsfu4HbhzzWMYKhkAiugm/b8bdTtrqB1qGQQj5fj8+B23Z/yO2zN7nwkctd+ktq47YjSaM2NC/vBXoNIgkHQwcDrQA5xl+5QB8ycB5wLjyj7H2V5UZU2Q4/MREY0qGyOQ1APMBw4BpgNHSpo+oNvngIttzwCOAL5aVT2NcqwxIuIpVQ4W7wussn2n7fXAhcDsAX0M7Fy+3gW4t8J6njRnxgT+5T0vZ7ueYvMnjBvLv7zn5dnljIhaqvLQ0ATgnobp1cB+A/qcCPxQ0seB5wJvqbCep5kzY8KTA8M5HBQRddbp00ePBM6xvQfwduCbkp5Rk6S5kpZKWrpmzZq2FxkR0c2qDII+YGLD9B5lW6O/AS4GsH0dsAMwfuCCbJ9he6btmb29vRWVGxFRT1UGwRJgqqQpkrajGAxeOKDP3cBfAEh6KUUQ5CN/REQbVRYEtjcAxwCLgdspzg5aIekkSYeW3T4FfFjSzcAFwAc82u6CFxExylV6HUF5TcCiAW0nNLy+DXh9lTVERERrnR4sjoiIDksQRETUXIIgIqLmEgQRETWXIIiIqLkEQUREzSUIIiJqLkEQEVFzCYKIiJpLEERE1FyCICKi5hIEERE1lyCIiKi5BEFERM0lCCIiai5BEBFRcwmCiIiaSxBERNRcgiAiouYSBBERNZcgiIiouQRBRETNJQgiImouQRARUXMJgoiImksQRETUXKVBIOlgSSslrZJ0XJP5X5Z0U/l1h6S1VdYTERHPtG1VC5bUA8wH3gqsBpZIWmj7tv4+tj/Z0P/jwIyq6omIiOaq3CPYF1hl+07b64ELgdkt+h8JXFBhPRER0USVQTABuKdhenXZ9gyS9gSmAFcOMn+upKWSlq5Zs2bYC42IqLORMlh8BHCp7Y3NZto+w/ZM2zN7e3vbXFpERHerMgj6gIkN03uUbc0cQQ4LRUR0RJVBsASYKmmKpO0o/tgvHNhJ0kuA5wHXVVhLREQMorIgsL0BOAZYDNwOXGx7haSTJB3a0PUI4ELbrqqWiIgYXGWnjwLYXgQsGtB2woDpE6usodGCZX2cungl965dx+7jxrLDmG0Yv+P27Vp9RMSIVGkQjCQLlvVx/GXLWfdEMR7dt3Yd26jDRUVEjAAj5ayhyp26eOWTIdBvk+GeB9d1qKKIiJGhNkFw79rmf/DXb9zU5koiIkaW2gTB7uPGNm2fMEh7RERd1CYI5s2axtgxPU9rGzumh3mzpnWoooiIkaE2g8VzZhR3t/j0pbewfuMmJowby7xZ055sj4ioq9oEARRhcMENdwNw0Uf273A1EREjQ20ODUVERHMJgoiImksQRETUXIIgIqLmEgQRETWXIIiIqLkEQUREzSUIIiJqLkEQEVFzLYNA0s6S/rxJ+yuqKykiItpp0CCQdDjwS+BbklZIek3D7HOqLiwiItqj1R7BZ4FX294HOBr4pqR3l/PybK+IiC7R6qZzPbbvA7B9g6QDge9KmgjkQfMREV2i1R7Bo43jA2UoHADMBvauuK6IiGiTVnsEf8uAQ0C2H5V0MHB4pVVFRETbDLpHYPtm4NeSrhrQ/oTt8yqvLCIi2qLl6aO2NwKbJO3SpnoiIqLNhvKEsj8AyyVdATzW32j77yurKiIi2mYoQXBZ+bXFyvGE04Ee4CzbpzTpczhwIsWZSDfbPmpr1hUREVtns0Fg+9ytWbCkHmA+8FZgNbBE0kLbtzX0mQocD7ze9kOSXrA164qIiK1X5b2G9gVW2b7T9nrgQopTTxt9GJhv+yEA2/dXWE9ERDRRZRBMAO5pmF
5dtjV6MfBiSddI+nl5KOkZJM2VtFTS0jVr1lRUbkREPXX67qPbAlMpLlQ7EjhT0riBnWyfYXum7Zm9vb3trTAiosttdoxA0ouBecCejf1tH7SZt/YBExum9yjbGq0Grrf9BMU1C3dQBMOSzZceERHDYShnDV0CfB04E9i4BcteAkyVNIUiAI4ABp4RtIBiT+A/JI2nOFR05xasIyIinqWhBMEG21/b0gXb3iDpGGAxxemjZ9teIekkYKntheW8t0m6jSJk5tn+/ZauKyIitt5QguA7kv4OuBx4vL/R9oObe6PtRcCiAW0nNLw2cGz5FRERHTCUIHh/+X1eQ5uBFw1/ORER0W5DuaBsSjsKiYiIzhjKWUNjKG5J/aay6cfA/ynP9ImIiFFuKIeGvgaMAb5aTv912fahqoqKiIj2GUoQvMb2Kxumr5R0c1UFRUREew3lyuKNjY+slPQitux6goiIGMGGskcwD7hK0p0Uj67cEzi60qoiIqJthnLW0I/K20VPK5tW2n681XsiImL0GDQIJB1k+0pJ7xkway9J2N6qh9VERMTI0mqP4M3AlcC7mswzW/nUsoiIGFkGDQLbXyhfnmT7143zyhvJRUREFxjKWUPfatJ26XAXEhERndFqjOAlwN7ALgPGCXYGdqi6sIiIaI9WYwTTgHcC43j6OMGjFM8ajoiILtBqjODbwLcl7W/7ujbWFBERbTSUC8qWSfoYxWGiJw8J2f5gZVVFRETbDGWw+JvAnwGzgKspnj38aJVFRURE+wwlCPay/XngMdvnAu8A9qu2rIiIaJehBEH/cwfWSnoZsAvwgupKioiIdhrKGMEZkp4HfB5YCOwInND6LRERMVoM5aZzZ5UvrybPKY6I6DqtLig7ttUbbZ82/OVERES7tdoj2Kn8Pg14DcVhISguLruhyqIiIqJ9Wl1Q9s8Akn4CvMr2o+X0icD32lJdRERUbihnDb0QWN8wvb5si4iILjCUIPgGcIOkE8u9geuBc4aycEkHS1opaZWk45rM/4CkNZJuKr8+tCXFR0TEszeUs4ZOlvR94I1l09G2l23ufZJ6gPnAW4HVwBJJC23fNqDrRbaP2cK6IyJimLQ6a2hn249I2hW4q/zqn7er7Qc3s+x9gVW27yzfcyEwGxgYBBER0UGt9gjOp7gN9Y0Uj6bsp3J6c9cUTADuaZheTfNbU7xX0puAO4BP2r5nYAdJc4G5AJMmTdrMaiMiYksMOkZg+53l9ym2X9TwNcX2cF1Y9h1gsu1XAFcA5w5Syxm2Z9qe2dvbO0yrjogIaH1o6FWt3mj7F5tZdh8wsWF6j7KtcRm/b5g8C/jXzSwzIiKGWatDQ//eYp6Bgzaz7CXA1PJB933AEcBRjR0k7Wb7vnLyUOD2zSwzIiKGWasLyg58Ngu2vUHSMcBioAc42/YKSScBS20vBP5e0qHABuBB4APPZp0REbHlhnL3UcrbT0/n6U8o+8bm3md7EbBoQNsJDa+PB44farERETH8NhsEkr4AHEARBIuAQ4CfUVxoFhERo9xQriw+DPgL4Le2jwZeSfFwmoiI6AJDCYJ1tjcBGyTtDNzP088GioiIUWwoYwRLJY0DzqS4uOwPwHVVFhUREe3T6jqC+cD5tv+ubPq6pB8AO9u+pS3VRURE5VrtEdwB/Juk3YCLgQuGcrO5iIgYXVrdYuJ02/sDbwZ+D5wt6ZeSviDpxW2rMCIiKrXZwWLbv7H9JdszgCOBOeQK4IiIrrHZIJC0raR3SToP+D6wEnhP5ZVFRERbtBosfivFHsDbKR5WfyEw1/ZjbaotIiLaoNVg8fEUzyT4lO2H2lRPRES0Waubzm3u7qIREdEFhnJlcUREdLEEQUREzSUIIiJqLkEQEVFzCYKIiJpLEERE1FyCICKi5hIEERE1lyCIiKi5BEFERM0lCCIiai5BEBFRcwmCiIiaqzQIJB0saaWkVZKOa9HvvZIsaWaV9URExDNVFgSSeoD5wCHAdO
BISdOb9NsJ+ARwfVW1RETE4KrcI9gXWGX7TtvrKZ5wNrtJvy8CXwL+VGEtERExiCqDYAJwT8P06rLtSZJeBUy0/b0K64iIiBY6NlgsaRvgNOBTQ+g7V9JSSUvXrFlTfXERETVSZRD0ARMbpvco2/rtBLwM+LGku4DXAgubDRjbPsP2TNsze3t7Kyw5IqJ+qgyCJcBUSVMkbQccASzsn2n7YdvjbU+2PRn4OXCo7aUV1hQREQNUFgS2NwDHAIuB24GLba+QdJKkQ6tab0REbJltq1y47UXAogFtJwzS94Aqa4mIiOZyZXFERM0lCCIiai5BEBFRcwmCiIiaSxBERNRcgiAiouYSBBERNZcgiIiouQRBRETNJQgiImouQRARUXMJgoiImksQRETUXIIgIqLmEgQRETWXIIiIqLkEQUREzSUIIiJqLkEQEVFzCYKIiJpLEERE1FyCICKi5hIEERE1lyCIiKi5BEFERM0lCCIiaq7SIJB0sKSVklZJOq7J/I9KWi7pJkk/kzS9ynoiIuKZKgsCST3AfOAQYDpwZJM/9OfbfrntfYB/BU6rqp6IiGiuyj2CfYFVtu+0vR64EJjd2MH2Iw2TzwVcYT0REdHEthUuewJwT8P0amC/gZ0kfQw4FtgOOKjZgiTNBeYCTJo0adgLjYios44PFtueb/vPgc8Anxukzxm2Z9qe2dvb294CIyK6XJVB0AdMbJjeo2wbzIXAnArriYiIJqoMgiXAVElTJG0HHAEsbOwgaWrD5DuA/6qwnoiIaKKyMQLbGyQdAywGeoCzba+QdBKw1PZC4BhJbwGeAB4C3l9VPRER0VyVg8XYXgQsGtB2QsPrT1S5/oiI2LyODxZHRERnJQgiImouQRARUXMJgoiImqt0sHikWLCsj1MXr+TetesY07MNE3cd2+mSIiJGjK7fI1iwrI/jL1tO39p1GFi/cRO/fuAxFixrdW1bRER9dH0QnLp4Jeue2Pi0tk0u2iMiogZBcO/adVvUHhFRN10fBLuPaz4eMFh7RETddH0QzJs1jbFjep7WNnZMD/NmTetQRRERI0vXnzU0Z8YEgCfPGtp93FjmzZr2ZHtERN11fRBAEQb5wx8R0VzXHxqKiIjWEgQRETWXIIiIqLkEQUREzSUIIiJqTrY7XcMWkbQG+M0Wvm088EAF5YxUddreOm0rZHu7WdXbuqft3mYzRl0QbA1JS23P7HQd7VKn7a3TtkK2t5t1cltzaCgiouYSBBERNVeXIDij0wW0WZ22t07bCtnebtaxba3FGEFERAyuLnsEERExiARBRETNdX0QSDpY0kpJqyQd1+l6hpuksyXdL+nWhrZdJV0h6b/K78/rZI3DRdJESVdJuk3SCkmfKNu7dXt3kHSDpJvL7f3nsn2KpOvLn+mLJG3X6VqHi6QeScskfbec7uZtvUvSckk3SVpatnXkZ7mrg0BSDzAfOASYDhwpaXpnqxp25wAHD2g7DviR7anAj8rpbrAB+JTt6cBrgY+V/5/dur2PAwfZfiWwD3CwpNcCXwK+bHsv4CHgbzpX4rD7BHB7w3Q3byvAgbb3abh+oCM/y10dBMC+wCrbd9peD1wIzO5wTcPK9k+ABwc0zwbOLV+fC8xpZ01VsX2f7V+Urx+l+IMxge7dXtv+Qzk5pvwycBBwadneNdsraQ/gHcBZ5bTo0m1toSM/y90eBBOAexqmV5dt3e6Ftu8rX/8WeGEni6mCpMnADOB6unh7y0MlNwH3A1cAvwLW2t5Qdummn+n/BXwa2FROP5/u3VYoQv2Hkm6UNLds68jPci2eUFZnti2pq84RlrQj8C3gH2w/UnxwLHTb9treCOwjaRxwOfCSzlZUDUnvBO63faOkAzpcTru8wXafpBcAV0j6ZePMdv4sd/seQR8wsWF6j7Kt2/1O0m4A5ff7O1zPsJE0hiIEzrN9Wdnctdvbz/Za4Cpgf2CcpP4Pcd3yM/164FBJd1Ecwj0IOJ3u3FYAbP
eV3++nCPl96dDPcrcHwRJgannmwXbAEcDCDtfUDguB95ev3w98u4O1DJvymPH/BW63fVrDrG7d3t5yTwBJY4G3UoyLXAUcVnbriu21fbztPWxPpvg9vdL2X9GF2wog6bmSdup/DbwNuJUO/Sx3/ZXFkt5OceyxBzjb9smdrWh4SboAOIDiFra/A74ALAAuBiZR3LL7cNsDB5RHHUlvAH4KLOep48ifpRgn6MbtfQXFgGEPxYe2i22fJOlFFJ+adwWWAf/N9uOdq3R4lYeG/tH2O7t1W8vturyc3BY43/bJkp5PB36Wuz4IIiKitW4/NBQREZuRIIiIqLkEQUREzSUIIiJqLkEQEVFzCYIYUSR9WdI/NEwvlnRWw/S/Szq2xfvPkXRY+frHkp7xMHBJYySdUt7h8ReSrpN0SDnvLknjt6LuJ9c7yPz55V0mb5O0rnx9k6TDJC3qv15gOEnarf8unoPM307STxou2IqaShDESHMN8DoASdtQXB+xd8P81wHXPst1fBHYDXiZ7VdR3Nhrp2e5zJZsf8z2PsDbgV+Vd5zcx/altt9eXjk83I4FzmxR03qKO1z+ZQXrjlEkQRAjzbUUt1GAIgBuBR6V9DxJ2wMvBX4h6QRJSyTdKukMNd5wqAVJzwE+DHy8/8Ik27+zfXGTvseWy791wF7Kf5d0i4rnBHyzyfu+WO4h9AyxprskjZc0WdIvy/feIek8SW+RdE2597Jv2f+5Kp5DcYOKe/cPdkfd9wI/KN+zd9n/prL2qWWfBcBfDaXO6F7ZJYwRxfa9kjZImkTx6f86ijtO7g88DCy3vV7S/7Z9EkD5x/idwHeGsIq9gLttP9Kqk6RXA0cD+wECrpd0NbAe+BzwOtsPSNp1wPtOpdi7ONpbd7XmXsD7gA9S3CLlKOANwKEUV1HPAf6J4hYMHywPKd0g6T9tP9ZQxxTgoYarcD8KnG77vPJ2K/0hdSvwmq2oM7pI9ghiJLqWIgT6g+C6hulryj4Hqnhy1XKKG5Tt3WxBz8IbgMttP1Y+E+Ay4I3lui6x/QDAgMv/Pw/sYvujWxkCAL+2vdz2JmAFxUNKTHFbjclln7cBx6m4PfWPgR0obknQaDdgTcP0dcBnJX0G2NP2urL+jcD6/vveRD0lCGIk6h8neDnFJ9afU+wRvA64VtIOwFeBw2y/nOI4+A5DXPYqYJKknYe96uIT/KsH7iVsocb76GxqmN7EU3vwAt7bMM4wyXbjU70A1tHwb2L7fIq9inXAIkkHNfTdHvjTs6g5RrkEQYxE11Ic6nnQ9sbyU/c4ijC4lqf+wD2g4tkEg56tM5DtP1LcwfT08hBJ/10+3zeg60+BOZKeU94d8t1l25XA+8qbgzHgj/4PgFOA71X8CXsx8PH+cRFJM5r0uYOn9iD6b3J2p+2vUNzR8hVl+/OBB2w/UWG9McIlCGIkWk5xttDPB7Q9bPuB8gybMyn2FhZTfBLfEp+jOGxym6Rbge8CTxszKB+JeQ5wA8XdTc+yvcz2CuBk4GpJNwOnDXjfJWVtC1XcOroKX6R4bOUtklaU009Tjhf8StJeZdPhwK3l4aSXAd8o2w8EvldRnTFK5O6jEV1K0ruBV9v+XIs+lwHH2b6jfZXFSJOzhiK6lO3L+w9hNVMeGluQEIjsEURE1FzGCCIiai5BEBFRcwmCiIiaSxBERNRcgiAioub+P3xx7QjxT3ySAAAAAElFTkSuQmCC",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"import numpy as np\n",
"\n",
"# The logged validation loss is 1 - r2, so convert it back to r2 for plotting.\n",
"plt.title('Learning Curve')\n",
"plt.xlabel('Wall Clock Time (s)')\n",
"plt.ylabel('Validation r2')\n",
"plt.scatter(time_history, 1 - np.array(valid_loss_history))\n",
"plt.step(time_history, 1 - np.array(best_valid_loss_history), where='post')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Comparison with alternatives\n",
"\n",
"### FLAML's accuracy"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"flaml (4min) r2 = 0.8522136092023422\n"
]
}
],
"source": [
"print('flaml (4min) r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Default LightGBM"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"from lightgbm import LGBMRegressor\n",
"lgbm = LGBMRegressor()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-container-id-2 {color: black;background-color: white;}#sk-container-id-2 pre{padding: 0;}#sk-container-id-2 div.sk-toggleable {background-color: white;}#sk-container-id-2 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-2 label.sk-toggleable__label-arrow:before {content: \"▸\";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-2 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-2 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-2 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-2 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-2 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-2 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: \"▾\";}#sk-container-id-2 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-2 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-2 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-2 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-2 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-2 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-2 div.sk-label:hover 
label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-2 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-2 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-2 div.sk-item {position: relative;z-index: 1;}#sk-container-id-2 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-2 div.sk-item::before, #sk-container-id-2 div.sk-parallel-item::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-2 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-2 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-2 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-2 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-2 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-2 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-2 div.sk-label-container {text-align: center;}#sk-container-id-2 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. 
See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-2 div.sk-text-repr-fallback {display: none;}</style><div id=\"sk-container-id-2\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LGBMRegressor()</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-2\" type=\"checkbox\" checked><label for=\"sk-estimator-id-2\" class=\"sk-toggleable__label sk-toggleable__label-arrow\">LGBMRegressor</label><div class=\"sk-toggleable__content\"><pre>LGBMRegressor()</pre></div></div></div></div></div>"
],
"text/plain": [
"LGBMRegressor()"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lgbm.fit(X_train, y_train)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"default lgbm r2 = 0.8296179648694404\n"
]
}
],
"source": [
"from flaml.ml import sklearn_metric_loss_score\n",
"\n",
"# Note: y_pred is overwritten here with the default LightGBM predictions.\n",
"y_pred = lgbm.predict(X_test)\n",
"print('default lgbm r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Optuna LightGBM Tuner"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"# uncomment the following line if optuna is not installed\n",
"# %pip install optuna==2.8.0"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"train_x, val_x, train_y, val_y = train_test_split(X_train, y_train, test_size=0.1)\n",
"import optuna.integration.lightgbm as lgb\n",
"dtrain = lgb.Dataset(train_x, label=train_y)\n",
"dval = lgb.Dataset(val_x, label=val_y)\n",
"params = {\n",
"    \"objective\": \"regression\",\n",
"    \"metric\": \"regression\",\n",
"    \"verbosity\": -1,\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"tags": [
"outputPrepend"
]
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[32m[I 2022-07-01 15:26:25,531]\u001b[0m A new study created in memory with name: no-name-0bd516fd-ed41-4e00-874e-ff99ff30eb94\u001b[0m\n",
"feature_fraction, val_score: inf: 0%| | 0/7 [00:00<?, ?it/s]/usr/local/lib/python3.9/site-packages/lightgbm/engine.py:239: UserWarning: 'verbose_eval' argument is deprecated and will be removed in a future release of LightGBM. Pass 'log_evaluation()' callback via 'callbacks' argument instead.\n",
" _log_warning(\"'verbose_eval' argument is deprecated and will be removed in a future release of LightGBM. \"\n",
"feature_fraction, val_score: 2232348512.135204: 14%|#4 | 1/7 [00:01<00:11, 1.99s/it]\u001b[32m[I 2022-07-01 15:26:27,531]\u001b[0m Trial 0 finished with value: 2232348512.135204 and parameters: {'feature_fraction': 0.8}. Best is trial 0 with value: 2232348512.135204.\u001b[0m\n",
"feature_fraction, val_score: 2219902031.218357: 29%|##8 | 2/7 [00:03<00:09, 1.90s/it]\u001b[32m[I 2022-07-01 15:26:29,373]\u001b[0m Trial 1 finished with value: 2219902031.2183566 and parameters: {'feature_fraction': 0.8999999999999999}. Best is trial 1 with value: 2219902031.2183566.\u001b[0m\n",
"feature_fraction, val_score: 2219902031.218357: 43%|####2 | 3/7 [00:05<00:07, 1.82s/it]\u001b[32m[I 2022-07-01 15:26:31,092]\u001b[0m Trial 2 finished with value: 2232348512.135204 and parameters: {'feature_fraction': 0.7}. Best is trial 1 with value: 2219902031.2183566.\u001b[0m\n",
"feature_fraction, val_score: 2219902031.218357: 57%|#####7 | 4/7 [00:07<00:05, 1.84s/it]\u001b[32m[I 2022-07-01 15:26:32,964]\u001b[0m Trial 3 finished with value: 2296500828.163134 and parameters: {'feature_fraction': 1.0}. Best is trial 1 with value: 2219902031.2183566.\u001b[0m\n",
"feature_fraction, val_score: 2219902031.218357: 71%|#######1 | 5/7 [00:09<00:03, 1.76s/it]\u001b[32m[I 2022-07-01 15:26:34,581]\u001b[0m Trial 4 finished with value: 2310469779.1515803 and parameters: {'feature_fraction': 0.4}. Best is trial 1 with value: 2219902031.2183566.\u001b[0m\n",
"feature_fraction, val_score: 2219902031.218357: 86%|########5 | 6/7 [00:10<00:01, 1.72s/it]\u001b[32m[I 2022-07-01 15:26:36,239]\u001b[0m Trial 5 finished with value: 2278468688.4447093 and parameters: {'feature_fraction': 0.6}. Best is trial 1 with value: 2219902031.2183566.\u001b[0m\n",
"feature_fraction, val_score: 2219902031.218357: 100%|##########| 7/7 [00:12<00:00, 1.73s/it]\u001b[32m[I 2022-07-01 15:26:37,970]\u001b[0m Trial 6 finished with value: 2245941232.289396 and parameters: {'feature_fraction': 0.5}. Best is trial 1 with value: 2219902031.2183566.\u001b[0m\n",
"feature_fraction, val_score: 2219902031.218357: 100%|##########| 7/7 [00:12<00:00, 1.78s/it]\n",
"num_leaves, val_score: 2219902031.218357: 5%|5 | 1/20 [00:03<01:04, 3.40s/it]\u001b[32m[I 2022-07-01 15:26:41,376]\u001b[0m Trial 7 finished with value: 2249765532.297114 and parameters: {'num_leaves': 76}. Best is trial 7 with value: 2249765532.297114.\u001b[0m\n",
"num_leaves, val_score: 2219902031.218357: 10%|# | 2/20 [00:13<02:15, 7.52s/it]\u001b[32m[I 2022-07-01 15:26:51,786]\u001b[0m Trial 8 finished with value: 2255051289.511019 and parameters: {'num_leaves': 248}. Best is trial 7 with value: 2249765532.297114.\u001b[0m\n",
"num_leaves, val_score: 2219902031.218357: 15%|#5 | 3/20 [00:24<02:29, 8.81s/it]\u001b[32m[I 2022-07-01 15:27:02,129]\u001b[0m Trial 9 finished with value: 2255051289.511019 and parameters: {'num_leaves': 248}. Best is trial 7 with value: 2249765532.297114.\u001b[0m\n",
"num_leaves, val_score: 2219902031.218357: 20%|## | 4/20 [00:27<01:43, 6.50s/it]\u001b[32m[I 2022-07-01 15:27:05,085]\u001b[0m Trial 10 finished with value: 2230498327.6313143 and parameters: {'num_leaves': 64}. Best is trial 10 with value: 2230498327.6313143.\u001b[0m\n",
"num_leaves, val_score: 2219902031.218357: 25%|##5 | 5/20 [00:28<01:12, 4.83s/it]\u001b[32m[I 2022-07-01 15:27:06,966]\u001b[0m Trial 11 finished with value: 2219902031.2183566 and parameters: {'num_leaves': 31}. Best is trial 11 with value: 2219902031.2183566.\u001b[0m\n",
"num_leaves, val_score: 2219902031.218357: 30%|### | 6/20 [00:37<01:23, 5.98s/it]\u001b[32m[I 2022-07-01 15:27:15,159]\u001b[0m Trial 12 finished with value: 2239709106.0440993 and parameters: {'num_leaves': 196}. Best is trial 11 with value: 2219902031.2183566.\u001b[0m\n",
"num_leaves, val_score: 2219902031.218357: 35%|###5 | 7/20 [00:44<01:21, 6.29s/it]\u001b[32m[I 2022-07-01 15:27:22,107]\u001b[0m Trial 13 finished with value: 2258349161.4246354 and parameters: {'num_leaves': 162}. Best is trial 11 with value: 2219902031.2183566.\u001b[0m\n",
"num_leaves, val_score: 2219902031.218357: 40%|#### | 8/20 [00:50<01:17, 6.46s/it]\u001b[32m[I 2022-07-01 15:27:28,935]\u001b[0m Trial 14 finished with value: 2238535970.718681 and parameters: {'num_leaves': 170}. Best is trial 11 with value: 2219902031.2183566.\u001b[0m\n",
"num_leaves, val_score: 2218643598.323591: 45%|####5 | 9/20 [01:00<01:22, 7.50s/it]\u001b[32m[I 2022-07-01 15:27:38,719]\u001b[0m Trial 15 finished with value: 2218643598.323591 and parameters: {'num_leaves': 233}. Best is trial 15 with value: 2218643598.323591.\u001b[0m\n",
"num_leaves, val_score: 2218643598.323591: 50%|##### | 10/20 [01:09<01:19, 8.00s/it]\u001b[32m[I 2022-07-01 15:27:47,820]\u001b[0m Trial 16 finished with value: 2251217311.350468 and parameters: {'num_leaves': 216}. Best is trial 15 with value: 2218643598.323591.\u001b[0m\n",
"num_leaves, val_score: 2218643598.323591: 55%|#####5 | 11/20 [01:14<01:02, 6.90s/it]\u001b[32m[I 2022-07-01 15:27:52,224]\u001b[0m Trial 17 finished with value: 2257362003.048632 and parameters: {'num_leaves': 97}. Best is trial 15 with value: 2218643598.323591.\u001b[0m\n",
"num_leaves, val_score: 2201353666.137075: 60%|###### | 12/20 [01:15<00:42, 5.33s/it]\u001b[32m[I 2022-07-01 15:27:53,959]\u001b[0m Trial 18 finished with value: 2201353666.137075 and parameters: {'num_leaves': 27}. Best is trial 18 with value: 2201353666.137075.\u001b[0m\n",
"num_leaves, val_score: 2201353666.137075: 65%|######5 | 13/20 [01:20<00:36, 5.19s/it]\u001b[32m[I 2022-07-01 15:27:58,830]\u001b[0m Trial 19 finished with value: 2208967225.5510316 and parameters: {'num_leaves': 120}. Best is trial 18 with value: 2201353666.137075.\u001b[0m\n",
"num_leaves, val_score: 2201353666.137075: 70%|####### | 14/20 [01:21<00:23, 3.88s/it]\u001b[32m[I 2022-07-01 15:27:59,681]\u001b[0m Trial 20 finished with value: 2423352668.1802897 and parameters: {'num_leaves': 6}. Best is trial 18 with value: 2201353666.137075.\u001b[0m\n",
"num_leaves, val_score: 2201353666.137075: 75%|#######5 | 15/20 [01:26<00:21, 4.28s/it]\u001b[32m[I 2022-07-01 15:28:04,892]\u001b[0m Trial 21 finished with value: 2232470240.5257387 and parameters: {'num_leaves': 123}. Best is trial 18 with value: 2201353666.137075.\u001b[0m\n",
"num_leaves, val_score: 2201353666.137075: 80%|######## | 16/20 [01:28<00:14, 3.57s/it]\u001b[32m[I 2022-07-01 15:28:06,827]\u001b[0m Trial 22 finished with value: 2220349578.978886 and parameters: {'num_leaves': 33}. Best is trial 18 with value: 2201353666.137075.\u001b[0m\n",
"num_leaves, val_score: 2201353666.137075: 85%|########5 | 17/20 [01:34<00:12, 4.12s/it]\u001b[32m[I 2022-07-01 15:28:12,204]\u001b[0m Trial 23 finished with value: 2238019145.854743 and parameters: {'num_leaves': 126}. Best is trial 18 with value: 2201353666.137075.\u001b[0m\n",
"num_leaves, val_score: 2201353666.137075: 90%|######### | 18/20 [01:35<00:06, 3.29s/it]\u001b[32m[I 2022-07-01 15:28:13,573]\u001b[0m Trial 24 finished with value: 2241529396.314549 and parameters: {'num_leaves': 16}. Best is trial 18 with value: 2201353666.137075.\u001b[0m\n",
"num_leaves, val_score: 2201353666.137075: 95%|#########5| 19/20 [01:38<00:03, 3.32s/it]\u001b[32m[I 2022-07-01 15:28:16,946]\u001b[0m Trial 25 finished with value: 2223786741.955245 and parameters: {'num_leaves': 71}. Best is trial 18 with value: 2201353666.137075.\u001b[0m\n",
"num_leaves, val_score: 2199305116.477652: 100%|##########| 20/20 [01:45<00:00, 4.28s/it]\u001b[32m[I 2022-07-01 15:28:23,466]\u001b[0m Trial 26 finished with value: 2199305116.4776516 and parameters: {'num_leaves': 154}. Best is trial 26 with value: 2199305116.4776516.\u001b[0m\n",
"num_leaves, val_score: 2199305116.477652: 100%|##########| 20/20 [01:45<00:00, 5.27s/it]\n",
"bagging, val_score: 2199305116.477652: 10%|# | 1/10 [00:06<01:01, 6.83s/it]\u001b[32m[I 2022-07-01 15:28:30,310]\u001b[0m Trial 27 finished with value: 2306928064.5453434 and parameters: {'bagging_fraction': 0.7585165645006501, 'bagging_freq': 1}. Best is trial 27 with value: 2306928064.5453434.\u001b[0m\n",
"bagging, val_score: 2199305116.477652: 20%|## | 2/10 [00:15<01:03, 7.88s/it]\u001b[32m[I 2022-07-01 15:28:38,917]\u001b[0m Trial 28 finished with value: 2322722013.504575 and parameters: {'bagging_fraction': 0.6242273851784387, 'bagging_freq': 6}. Best is trial 27 with value: 2306928064.5453434.\u001b[0m\n",
"bagging, val_score: 2199305116.477652: 30%|### | 3/10 [00:23<00:56, 8.05s/it]\u001b[32m[I 2022-07-01 15:28:47,173]\u001b[0m Trial 29 finished with value: 2367680138.6985345 and parameters: {'bagging_fraction': 0.7565396640524931, 'bagging_freq': 6}. Best is trial 27 with value: 2306928064.5453434.\u001b[0m\n",
"bagging, val_score: 2199305116.477652: 40%|#### | 4/10 [00:32<00:49, 8.19s/it]\u001b[32m[I 2022-07-01 15:28:55,577]\u001b[0m Trial 30 finished with value: 2344148688.917165 and parameters: {'bagging_fraction': 0.6821211318183384, 'bagging_freq': 2}. Best is trial 27 with value: 2306928064.5453434.\u001b[0m\n",
"bagging, val_score: 2199305116.477652: 50%|##### | 5/10 [00:40<00:42, 8.40s/it]\u001b[32m[I 2022-07-01 15:29:04,363]\u001b[0m Trial 31 finished with value: 2416425410.5129323 and parameters: {'bagging_fraction': 0.568310870120601, 'bagging_freq': 6}. Best is trial 27 with value: 2306928064.5453434.\u001b[0m\n",
"bagging, val_score: 2199305116.477652: 60%|###### | 6/10 [00:48<00:32, 8.22s/it]\u001b[32m[I 2022-07-01 15:29:12,230]\u001b[0m Trial 32 finished with value: 2251131150.014874 and parameters: {'bagging_fraction': 0.9764916567565476, 'bagging_freq': 4}. Best is trial 32 with value: 2251131150.014874.\u001b[0m\n",
"bagging, val_score: 2199305116.477652: 70%|####### | 7/10 [00:56<00:24, 8.18s/it]\u001b[32m[I 2022-07-01 15:29:20,311]\u001b[0m Trial 33 finished with value: 2294452797.5768037 and parameters: {'bagging_fraction': 0.9888528981063934, 'bagging_freq': 3}. Best is trial 32 with value: 2251131150.014874.\u001b[0m\n",
"bagging, val_score: 2199305116.477652: 80%|######## | 8/10 [01:04<00:16, 8.10s/it]\u001b[32m[I 2022-07-01 15:29:28,234]\u001b[0m Trial 34 finished with value: 2309129638.348013 and parameters: {'bagging_fraction': 0.8370669830657019, 'bagging_freq': 6}. Best is trial 32 with value: 2251131150.014874.\u001b[0m\n",
"bagging, val_score: 2199305116.477652: 90%|######### | 9/10 [01:13<00:08, 8.33s/it]\u001b[32m[I 2022-07-01 15:29:37,083]\u001b[0m Trial 35 finished with value: 2448730787.1520085 and parameters: {'bagging_fraction': 0.4658108513480458, 'bagging_freq': 2}. Best is trial 32 with value: 2251131150.014874.\u001b[0m\n",
"bagging, val_score: 2199305116.477652: 100%|##########| 10/10 [01:22<00:00, 8.51s/it]\u001b[32m[I 2022-07-01 15:29:45,999]\u001b[0m Trial 36 finished with value: 2419849532.9108562 and parameters: {'bagging_fraction': 0.5555911526705426, 'bagging_freq': 5}. Best is trial 32 with value: 2251131150.014874.\u001b[0m\n",
"bagging, val_score: 2199305116.477652: 100%|##########| 10/10 [01:22<00:00, 8.25s/it]\n",
"feature_fraction_stage2, val_score: 2199305116.477652: 17%|#6 | 1/6 [00:06<00:31, 6.22s/it]\u001b[32m[I 2022-07-01 15:29:52,235]\u001b[0m Trial 37 finished with value: 2199305116.4776516 and parameters: {'feature_fraction': 0.9159999999999999}. Best is trial 37 with value: 2199305116.4776516.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2199305116.477652: 33%|###3 | 2/6 [00:12<00:25, 6.39s/it]\u001b[32m[I 2022-07-01 15:29:58,747]\u001b[0m Trial 38 finished with value: 2199305116.4776516 and parameters: {'feature_fraction': 0.852}. Best is trial 37 with value: 2199305116.4776516.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2199305116.477652: 50%|##### | 3/6 [00:19<00:19, 6.35s/it]\u001b[32m[I 2022-07-01 15:30:05,054]\u001b[0m Trial 39 finished with value: 2199305116.4776516 and parameters: {'feature_fraction': 0.82}. Best is trial 37 with value: 2199305116.4776516.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2199305116.477652: 67%|######6 | 4/6 [00:25<00:12, 6.45s/it]\u001b[32m[I 2022-07-01 15:30:11,657]\u001b[0m Trial 40 finished with value: 2199305116.4776516 and parameters: {'feature_fraction': 0.8839999999999999}. Best is trial 37 with value: 2199305116.4776516.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2199305116.477652: 83%|########3 | 5/6 [00:32<00:06, 6.59s/it]\u001b[32m[I 2022-07-01 15:30:18,484]\u001b[0m Trial 41 finished with value: 2339309140.8117876 and parameters: {'feature_fraction': 0.948}. Best is trial 37 with value: 2199305116.4776516.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2199305116.477652: 100%|##########| 6/6 [00:39<00:00, 6.60s/it]\u001b[32m[I 2022-07-01 15:30:25,101]\u001b[0m Trial 42 finished with value: 2339309140.8117876 and parameters: {'feature_fraction': 0.9799999999999999}. Best is trial 37 with value: 2199305116.4776516.\u001b[0m\n",
"feature_fraction_stage2, val_score: 2199305116.477652: 100%|##########| 6/6 [00:39<00:00, 6.52s/it]\n",
"regularization_factors, val_score: 2199305078.631748: 5%|5 | 1/20 [00:06<02:08, 6.78s/it]\u001b[32m[I 2022-07-01 15:30:31,883]\u001b[0m Trial 43 finished with value: 2199305078.6317477 and parameters: {'lambda_l1': 3.456991981744869e-07, 'lambda_l2': 1.3909176882215133e-05}. Best is trial 43 with value: 2199305078.6317477.\u001b[0m\n",
"regularization_factors, val_score: 2199305078.631748: 10%|# | 2/20 [00:13<02:04, 6.92s/it]\u001b[32m[I 2022-07-01 15:30:38,911]\u001b[0m Trial 44 finished with value: 2227215910.417912 and parameters: {'lambda_l1': 5.932065146744108e-08, 'lambda_l2': 0.012346257652390797}. Best is trial 43 with value: 2199305078.6317477.\u001b[0m\n",
"regularization_factors, val_score: 2199305078.631748: 15%|#5 | 3/20 [00:20<01:56, 6.87s/it]\u001b[32m[I 2022-07-01 15:30:45,727]\u001b[0m Trial 45 finished with value: 2208093711.3934827 and parameters: {'lambda_l1': 6.222982571088105e-07, 'lambda_l2': 0.005657569746743592}. Best is trial 43 with value: 2199305078.6317477.\u001b[0m\n",
"regularization_factors, val_score: 2199305078.631748: 20%|## | 4/20 [00:27<01:49, 6.82s/it]\u001b[32m[I 2022-07-01 15:30:52,458]\u001b[0m Trial 46 finished with value: 2238300649.509333 and parameters: {'lambda_l1': 0.028876756140130917, 'lambda_l2': 0.03474442715862468}. Best is trial 43 with value: 2199305078.6317477.\u001b[0m\n",
"regularization_factors, val_score: 2199305078.631748: 25%|##5 | 5/20 [00:34<01:44, 6.94s/it]\u001b[32m[I 2022-07-01 15:30:59,622]\u001b[0m Trial 47 finished with value: 2230579851.4348216 and parameters: {'lambda_l1': 0.6408435792442605, 'lambda_l2': 7.923471799415313}. Best is trial 43 with value: 2199305078.6317477.\u001b[0m\n",
"regularization_factors, val_score: 2199305078.631748: 30%|### | 6/20 [00:41<01:36, 6.91s/it]\u001b[32m[I 2022-07-01 15:31:06,467]\u001b[0m Trial 48 finished with value: 2229105124.7696505 and parameters: {'lambda_l1': 0.010722071300503344, 'lambda_l2': 1.2031073824055891}. Best is trial 43 with value: 2199305078.6317477.\u001b[0m\n",
"regularization_factors, val_score: 2199305078.631748: 35%|###5 | 7/20 [00:48<01:29, 6.91s/it]\u001b[32m[I 2022-07-01 15:31:13,376]\u001b[0m Trial 49 finished with value: 2237355622.9231133 and parameters: {'lambda_l1': 0.15961712656996224, 'lambda_l2': 1.0178650762499495}. Best is trial 43 with value: 2199305078.6317477.\u001b[0m\n",
"regularization_factors, val_score: 2199305078.631748: 40%|#### | 8/20 [00:55<01:22, 6.90s/it]\u001b[32m[I 2022-07-01 15:31:20,249]\u001b[0m Trial 50 finished with value: 2199305108.0201645 and parameters: {'lambda_l1': 0.0005450461819794682, 'lambda_l2': 2.928079900278101e-06}. Best is trial 43 with value: 2199305078.6317477.\u001b[0m\n",
"regularization_factors, val_score: 2199305078.631748: 45%|####5 | 9/20 [01:01<01:15, 6.83s/it]\u001b[32m[I 2022-07-01 15:31:26,931]\u001b[0m Trial 51 finished with value: 2199305107.588573 and parameters: {'lambda_l1': 0.019082135224974688, 'lambda_l2': 2.215319953261056e-08}. Best is trial 43 with value: 2199305078.6317477.\u001b[0m\n",
"regularization_factors, val_score: 2199305078.631748: 50%|##### | 10/20 [01:08<01:07, 6.80s/it]\u001b[32m[I 2022-07-01 15:31:33,658]\u001b[0m Trial 52 finished with value: 2245666778.532941 and parameters: {'lambda_l1': 5.414308234389909, 'lambda_l2': 1.2520783411460403e-06}. Best is trial 43 with value: 2199305078.6317477.\u001b[0m\n",
"regularization_factors, val_score: 2199237747.117429: 55%|#####5 | 11/20 [01:15<01:01, 6.79s/it]\u001b[32m[I 2022-07-01 15:31:40,415]\u001b[0m Trial 53 finished with value: 2199237747.1174293 and parameters: {'lambda_l1': 8.100486192199182e-06, 'lambda_l2': 4.641583659119779e-05}. Best is trial 53 with value: 2199237747.1174293.\u001b[0m\n",
"regularization_factors, val_score: 2199237747.117429: 60%|###### | 12/20 [01:22<00:54, 6.77s/it]\u001b[32m[I 2022-07-01 15:31:47,153]\u001b[0m Trial 54 finished with value: 2199305057.642786 and parameters: {'lambda_l1': 3.6631417833294185e-06, 'lambda_l2': 2.1348145216053757e-05}. Best is trial 53 with value: 2199237747.1174293.\u001b[0m\n",
"regularization_factors, val_score: 2197186937.915421: 65%|######5 | 13/20 [01:28<00:47, 6.81s/it]\u001b[32m[I 2022-07-01 15:31:54,065]\u001b[0m Trial 55 finished with value: 2197186937.9154205 and parameters: {'lambda_l1': 8.10639397388401e-06, 'lambda_l2': 0.00011673870071542667}. Best is trial 55 with value: 2197186937.9154205.\u001b[0m\n",
"regularization_factors, val_score: 2197186937.915421: 70%|####### | 14/20 [01:35<00:40, 6.74s/it]\u001b[32m[I 2022-07-01 15:32:00,643]\u001b[0m Trial 56 finished with value: 2225251999.875691 and parameters: {'lambda_l1': 1.7659550523446347e-05, 'lambda_l2': 0.0005366592911597499}. Best is trial 55 with value: 2197186937.9154205.\u001b[0m\n",
"regularization_factors, val_score: 2197186937.915421: 75%|#######5 | 15/20 [01:42<00:33, 6.78s/it]\u001b[32m[I 2022-07-01 15:32:07,509]\u001b[0m Trial 57 finished with value: 2199305115.965746 and parameters: {'lambda_l1': 0.00011988368737569814, 'lambda_l2': 2.547255235003035e-07}. Best is trial 55 with value: 2197186937.9154205.\u001b[0m\n",
"regularization_factors, val_score: 2197186937.915421: 80%|######## | 16/20 [01:49<00:27, 6.82s/it]\u001b[32m[I 2022-07-01 15:32:14,428]\u001b[0m Trial 58 finished with value: 2199272523.4245095 and parameters: {'lambda_l1': 1.584428775539112e-08, 'lambda_l2': 0.00019822993735694197}. Best is trial 55 with value: 2197186937.9154205.\u001b[0m\n",
"regularization_factors, val_score: 2197186937.915421: 85%|########5 | 17/20 [01:56<00:20, 6.80s/it]\u001b[32m[I 2022-07-01 15:32:21,194]\u001b[0m Trial 59 finished with value: 2208977643.7865806 and parameters: {'lambda_l1': 0.00022730715336215045, 'lambda_l2': 0.000349248360832954}. Best is trial 55 with value: 2197186937.9154205.\u001b[0m\n",
"regularization_factors, val_score: 2197186937.915421: 90%|######### | 18/20 [02:02<00:13, 6.77s/it]\u001b[32m[I 2022-07-01 15:32:27,887]\u001b[0m Trial 60 finished with value: 2199305116.323484 and parameters: {'lambda_l1': 7.693013959177693e-06, 'lambda_l2': 4.173548491660109e-08}. Best is trial 55 with value: 2197186937.9154205.\u001b[0m\n",
"regularization_factors, val_score: 2197186937.915421: 95%|#########5| 19/20 [02:09<00:06, 6.78s/it]\u001b[32m[I 2022-07-01 15:32:34,692]\u001b[0m Trial 61 finished with value: 2199305035.1400375 and parameters: {'lambda_l1': 0.0012748791329935912, 'lambda_l2': 2.971311275786321e-05}. Best is trial 55 with value: 2197186937.9154205.\u001b[0m\n",
"regularization_factors, val_score: 2196893501.218637: 100%|##########| 20/20 [02:16<00:00, 6.76s/it]\u001b[32m[I 2022-07-01 15:32:41,415]\u001b[0m Trial 62 finished with value: 2196893501.2186375 and parameters: {'lambda_l1': 2.4953626772316295e-05, 'lambda_l2': 0.00208420065729694}. Best is trial 62 with value: 2196893501.2186375.\u001b[0m\n",
"regularization_factors, val_score: 2196893501.218637: 100%|##########| 20/20 [02:16<00:00, 6.82s/it]\n",
"min_data_in_leaf, val_score: 2196893501.218637: 20%|## | 1/5 [00:05<00:21, 5.42s/it]\u001b[32m[I 2022-07-01 15:32:46,844]\u001b[0m Trial 63 finished with value: 2224986582.6364565 and parameters: {'min_child_samples': 5}. Best is trial 63 with value: 2224986582.6364565.\u001b[0m\n",
"min_data_in_leaf, val_score: 2196893501.218637: 40%|#### | 2/5 [00:14<00:22, 7.58s/it]\u001b[32m[I 2022-07-01 15:32:55,933]\u001b[0m Trial 64 finished with value: 2327642703.7542973 and parameters: {'min_child_samples': 50}. Best is trial 63 with value: 2224986582.6364565.\u001b[0m\n",
"min_data_in_leaf, val_score: 2196893501.218637: 60%|###### | 3/5 [00:21<00:14, 7.27s/it]\u001b[32m[I 2022-07-01 15:33:02,826]\u001b[0m Trial 65 finished with value: 2339874099.256145 and parameters: {'min_child_samples': 100}. Best is trial 63 with value: 2224986582.6364565.\u001b[0m\n",
"min_data_in_leaf, val_score: 2196893501.218637: 80%|######## | 4/5 [00:27<00:06, 6.68s/it]\u001b[32m[I 2022-07-01 15:33:08,597]\u001b[0m Trial 66 finished with value: 2238985187.2367454 and parameters: {'min_child_samples': 10}. Best is trial 63 with value: 2224986582.6364565.\u001b[0m\n",
"min_data_in_leaf, val_score: 2196893501.218637: 100%|##########| 5/5 [00:34<00:00, 6.96s/it]\u001b[32m[I 2022-07-01 15:33:16,067]\u001b[0m Trial 67 finished with value: 2256871995.7934694 and parameters: {'min_child_samples': 25}. Best is trial 63 with value: 2224986582.6364565.\u001b[0m\n",
"min_data_in_leaf, val_score: 2196893501.218637: 100%|##########| 5/5 [00:34<00:00, 6.93s/it]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 6min 30s, sys: 17 s, total: 6min 47s\n",
"Wall time: 6min 50s\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"%%time\n",
"model = lgb.train(params, dtrain, valid_sets=[dtrain, dval], verbose_eval=10000) \n"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Optuna LightGBM Tuner r2 = 0.8429583826070053\n"
]
}
   ],
   "source": [
    "from flaml.ml import sklearn_metric_loss_score\n",
    "\n",
    "y_pred = model.predict(X_test)\n",
    "print('Optuna LightGBM Tuner r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))"
]
},
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Add a customized LightGBM learner in FLAML\n",
    "LightGBM allows one to specify a custom objective function in the model constructor. To use such an objective in FLAML, add a customized LightGBM learner. The following example shows how to define a customized LightGBM learner with a custom objective function."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a customized LightGBM learner with a custom objective function"
]
},
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "''' define your customized objective function '''\n",
    "def my_loss_obj(y_true, y_pred):\n",
    "    # returns the gradient and hessian of the loss with respect to y_pred\n",
    "    c = 0.5\n",
    "    residual = y_pred - y_true\n",
    "    # Fair loss grad and hess\n",
    "    grad = c * residual / (np.abs(residual) + c)\n",
    "    hess = c ** 2 / (np.abs(residual) + c) ** 2\n",
    "    # rmse grad and hess\n",
    "    grad_rmse = residual\n",
    "    hess_rmse = 1.0\n",
    "\n",
    "    # mae grad and hess (the true second derivative is 0; a constant 1.0 is used instead)\n",
    "    grad_mae = np.array(residual)\n",
    "    grad_mae[grad_mae > 0] = 1.\n",
    "    grad_mae[grad_mae <= 0] = -1.\n",
    "    hess_mae = 1.0\n",
    "\n",
    "    # weighted combination of the three objectives\n",
    "    coef = [0.4, 0.3, 0.3]\n",
    "    return coef[0] * grad + coef[1] * grad_rmse + coef[2] * grad_mae, \\\n",
    "        coef[0] * hess + coef[1] * hess_rmse + coef[2] * hess_mae\n",
    "\n",
    "\n",
    "from flaml.model import LGBMEstimator\n",
    "\n",
    "''' create a customized LightGBM learner class with your objective function '''\n",
    "class MyLGBM(LGBMEstimator):\n",
    "    '''LGBMEstimator with my_loss_obj as the objective function\n",
    "    '''\n",
    "\n",
    "    def __init__(self, **config):\n",
    "        super().__init__(objective=my_loss_obj, **config)"
]
},
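  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `grad` and `hess` expressions in `my_loss_obj` have the form of the first and second derivatives of the Fair loss, `L(r) = c**2 * (|r|/c - log(1 + |r|/c))`, taken with respect to the residual `r = y_pred - y_true`. The cell below is a minimal sketch that checks this numerically with central finite differences. The helper names `fair_loss`, `fair_grad` and `fair_hess` are illustrative assumptions and are not part of FLAML or LightGBM."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Illustrative helpers (hypothetical names, not part of FLAML or LightGBM).\n",
    "def fair_loss(residual, c=0.5):\n",
    "    a = np.abs(residual)\n",
    "    return c ** 2 * (a / c - np.log1p(a / c))\n",
    "\n",
    "def fair_grad(residual, c=0.5):\n",
    "    # same expression as `grad` in my_loss_obj\n",
    "    return c * residual / (np.abs(residual) + c)\n",
    "\n",
    "def fair_hess(residual, c=0.5):\n",
    "    # same expression as `hess` in my_loss_obj\n",
    "    return c ** 2 / (np.abs(residual) + c) ** 2\n",
    "\n",
    "# sample residuals on both sides of zero\n",
    "residual = np.array([-3.0, -0.7, 0.2, 1.5, 10.0])\n",
    "eps = 1e-5\n",
    "\n",
    "# central finite differences of the loss and of the gradient\n",
    "num_grad = (fair_loss(residual + eps) - fair_loss(residual - eps)) / (2 * eps)\n",
    "num_hess = (fair_grad(residual + eps) - fair_grad(residual - eps)) / (2 * eps)\n",
    "\n",
    "# the analytic and numerical values should agree up to a small numerical error\n",
    "assert np.allclose(num_grad, fair_grad(residual), atol=1e-6)\n",
    "assert np.allclose(num_hess, fair_hess(residual), atol=1e-6)\n",
    "print('max abs grad difference:', np.max(np.abs(num_grad - fair_grad(residual))))\n",
    "print('max abs hess difference:', np.max(np.abs(num_hess - fair_hess(residual))))"
   ]
  },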
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Add the customized learner in FLAML"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 07-01 15:33:17] {2427} INFO - task = regression\n",
"[flaml.automl: 07-01 15:33:17] {2429} INFO - Data split method: uniform\n",
"[flaml.automl: 07-01 15:33:17] {2432} INFO - Evaluation method: cv\n",
"[flaml.automl: 07-01 15:33:17] {2501} INFO - Minimizing error metric: 1-r2\n",
"[flaml.automl: 07-01 15:33:17] {2641} INFO - List of ML learners in AutoML Run: ['my_lgbm']\n",
"[flaml.automl: 07-01 15:33:17] {2933} INFO - iteration 0, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:17] {3061} INFO - Estimated sufficient time budget=1586s. Estimated necessary time budget=2s.\n",
"[flaml.automl: 07-01 15:33:17] {3108} INFO - at 0.2s,\testimator my_lgbm's best error=2.9883,\tbest estimator my_lgbm's best error=2.9883\n",
"[flaml.automl: 07-01 15:33:17] {2933} INFO - iteration 1, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:18] {3108} INFO - at 0.4s,\testimator my_lgbm's best error=2.9883,\tbest estimator my_lgbm's best error=2.9883\n",
"[flaml.automl: 07-01 15:33:18] {2933} INFO - iteration 2, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:18] {3108} INFO - at 0.6s,\testimator my_lgbm's best error=1.7086,\tbest estimator my_lgbm's best error=1.7086\n",
"[flaml.automl: 07-01 15:33:18] {2933} INFO - iteration 3, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:18] {3108} INFO - at 0.8s,\testimator my_lgbm's best error=0.3474,\tbest estimator my_lgbm's best error=0.3474\n",
"[flaml.automl: 07-01 15:33:18] {2933} INFO - iteration 4, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:18] {3108} INFO - at 1.0s,\testimator my_lgbm's best error=0.3474,\tbest estimator my_lgbm's best error=0.3474\n",
"[flaml.automl: 07-01 15:33:18] {2933} INFO - iteration 5, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:18] {3108} INFO - at 1.2s,\testimator my_lgbm's best error=0.3015,\tbest estimator my_lgbm's best error=0.3015\n",
"[flaml.automl: 07-01 15:33:18] {2933} INFO - iteration 6, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:19] {3108} INFO - at 1.4s,\testimator my_lgbm's best error=0.3015,\tbest estimator my_lgbm's best error=0.3015\n",
"[flaml.automl: 07-01 15:33:19] {2933} INFO - iteration 7, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:19] {3108} INFO - at 1.6s,\testimator my_lgbm's best error=0.3015,\tbest estimator my_lgbm's best error=0.3015\n",
"[flaml.automl: 07-01 15:33:19] {2933} INFO - iteration 8, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:19] {3108} INFO - at 1.9s,\testimator my_lgbm's best error=0.2721,\tbest estimator my_lgbm's best error=0.2721\n",
"[flaml.automl: 07-01 15:33:19] {2933} INFO - iteration 9, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:19] {3108} INFO - at 2.2s,\testimator my_lgbm's best error=0.2721,\tbest estimator my_lgbm's best error=0.2721\n",
"[flaml.automl: 07-01 15:33:19] {2933} INFO - iteration 10, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:21] {3108} INFO - at 3.5s,\testimator my_lgbm's best error=0.1833,\tbest estimator my_lgbm's best error=0.1833\n",
"[flaml.automl: 07-01 15:33:21] {2933} INFO - iteration 11, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:23] {3108} INFO - at 5.2s,\testimator my_lgbm's best error=0.1833,\tbest estimator my_lgbm's best error=0.1833\n",
"[flaml.automl: 07-01 15:33:23] {2933} INFO - iteration 12, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:24] {3108} INFO - at 6.3s,\testimator my_lgbm's best error=0.1833,\tbest estimator my_lgbm's best error=0.1833\n",
"[flaml.automl: 07-01 15:33:24] {2933} INFO - iteration 13, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:25] {3108} INFO - at 7.8s,\testimator my_lgbm's best error=0.1833,\tbest estimator my_lgbm's best error=0.1833\n",
"[flaml.automl: 07-01 15:33:25] {2933} INFO - iteration 14, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:27] {3108} INFO - at 9.2s,\testimator my_lgbm's best error=0.1833,\tbest estimator my_lgbm's best error=0.1833\n",
"[flaml.automl: 07-01 15:33:27] {2933} INFO - iteration 15, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:28] {3108} INFO - at 11.0s,\testimator my_lgbm's best error=0.1762,\tbest estimator my_lgbm's best error=0.1762\n",
"[flaml.automl: 07-01 15:33:28] {2933} INFO - iteration 16, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:30] {3108} INFO - at 12.3s,\testimator my_lgbm's best error=0.1762,\tbest estimator my_lgbm's best error=0.1762\n",
"[flaml.automl: 07-01 15:33:30] {2933} INFO - iteration 17, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:36] {3108} INFO - at 19.0s,\testimator my_lgbm's best error=0.1760,\tbest estimator my_lgbm's best error=0.1760\n",
"[flaml.automl: 07-01 15:33:36] {2933} INFO - iteration 18, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:38] {3108} INFO - at 20.8s,\testimator my_lgbm's best error=0.1760,\tbest estimator my_lgbm's best error=0.1760\n",
"[flaml.automl: 07-01 15:33:38] {2933} INFO - iteration 19, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:40] {3108} INFO - at 23.0s,\testimator my_lgbm's best error=0.1760,\tbest estimator my_lgbm's best error=0.1760\n",
"[flaml.automl: 07-01 15:33:40] {2933} INFO - iteration 20, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:33:54] {3108} INFO - at 36.6s,\testimator my_lgbm's best error=0.1760,\tbest estimator my_lgbm's best error=0.1760\n",
"[flaml.automl: 07-01 15:33:54] {2933} INFO - iteration 21, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:34:00] {3108} INFO - at 43.2s,\testimator my_lgbm's best error=0.1760,\tbest estimator my_lgbm's best error=0.1760\n",
"[flaml.automl: 07-01 15:34:00] {2933} INFO - iteration 22, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:34:04] {3108} INFO - at 47.1s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n",
"[flaml.automl: 07-01 15:34:04] {2933} INFO - iteration 23, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:34:08] {3108} INFO - at 50.6s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n",
"[flaml.automl: 07-01 15:34:08] {2933} INFO - iteration 24, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:34:15] {3108} INFO - at 57.5s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n",
"[flaml.automl: 07-01 15:34:15] {2933} INFO - iteration 25, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:34:33] {3108} INFO - at 76.2s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n",
"[flaml.automl: 07-01 15:34:33] {2933} INFO - iteration 26, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:34:35] {3108} INFO - at 77.6s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n",
"[flaml.automl: 07-01 15:34:35] {2933} INFO - iteration 27, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:34:45] {3108} INFO - at 87.9s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n",
"[flaml.automl: 07-01 15:34:45] {2933} INFO - iteration 28, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:34:47] {3108} INFO - at 89.7s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n",
"[flaml.automl: 07-01 15:34:47] {2933} INFO - iteration 29, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:34:48] {3108} INFO - at 90.6s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n",
"[flaml.automl: 07-01 15:34:48] {2933} INFO - iteration 30, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:35:16] {3108} INFO - at 118.7s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n",
"[flaml.automl: 07-01 15:35:16] {2933} INFO - iteration 31, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:35:19] {3108} INFO - at 121.6s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n",
"[flaml.automl: 07-01 15:35:19] {2933} INFO - iteration 32, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:35:26] {3108} INFO - at 128.9s,\testimator my_lgbm's best error=0.1632,\tbest estimator my_lgbm's best error=0.1632\n",
"[flaml.automl: 07-01 15:35:26] {2933} INFO - iteration 33, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:35:33] {3108} INFO - at 135.2s,\testimator my_lgbm's best error=0.1632,\tbest estimator my_lgbm's best error=0.1632\n",
"[flaml.automl: 07-01 15:35:33] {2933} INFO - iteration 34, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:35:37] {3108} INFO - at 139.6s,\testimator my_lgbm's best error=0.1632,\tbest estimator my_lgbm's best error=0.1632\n",
"[flaml.automl: 07-01 15:35:37] {2933} INFO - iteration 35, current learner my_lgbm\n",
"[flaml.automl: 07-01 15:35:49] {3108} INFO - at 151.6s,\testimator my_lgbm's best error=0.1632,\tbest estimator my_lgbm's best error=0.1632\n",
"[flaml.automl: 07-01 15:35:50] {3372} INFO - retrain my_lgbm for 1.5s\n",
"[flaml.automl: 07-01 15:35:50] {3379} INFO - retrained model: LGBMRegressor(colsample_bytree=0.8422311526890249,\n",
" learning_rate=0.4130805075333333, max_bin=1023,\n",
" min_child_samples=10, n_estimators=95, num_leaves=221,\n",
" objective=<function my_loss_obj at 0x7fcd8ac7e940>,\n",
" reg_alpha=0.007704104902643932, reg_lambda=0.0031517673595496476,\n",
" verbose=-1)\n",
"[flaml.automl: 07-01 15:35:50] {2672} INFO - fit succeeded\n",
"[flaml.automl: 07-01 15:35:50] {2673} INFO - Time taken to find the best model: 128.89934134483337\n",
"[flaml.automl: 07-01 15:35:50] {2684} WARNING - Time taken to find the best model is 86% of the provided time budget and not all estimators' hyperparameter search converged. Consider increasing the time budget.\n"
]
}
   ],
   "source": [
    "automl = AutoML()\n",
    "automl.add_learner(learner_name='my_lgbm', learner_class=MyLGBM)\n",
    "settings = {\n",
    "    \"time_budget\": 150,  # total running time in seconds\n",
    "    \"metric\": 'r2',  # primary metrics for regression can be chosen from: ['mae','mse','r2']\n",
    "    \"estimator_list\": ['my_lgbm',],  # list of ML learners; we tune the customized LightGBM learner in this example\n",
    "    \"task\": 'regression',  # task type\n",
    "    \"log_file_name\": 'houses_experiment_my_lgbm.log',  # flaml log file\n",
    "}\n",
"automl.fit(X_train=X_train, y_train=y_train, **settings)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"tags": []
},
"outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Best hyperparameter config: {'n_estimators': 95, 'num_leaves': 221, 'min_child_samples': 10, 'learning_rate': 0.4130805075333333, 'log_max_bin': 10, 'colsample_bytree': 0.8422311526890249, 'reg_alpha': 0.007704104902643932, 'reg_lambda': 0.0031517673595496476}\n",
"Best r2 on validation data: 0.8368\n",
"Training duration of best run: 1.508 s\n",
"Predicted labels [161485.59767093 248585.87889042 157837.93378106 ... 184356.07034452\n",
" 223247.80995858 259281.61167122]\n",
"True labels 14740 136900.0\n",
"10101 241300.0\n",
"20566 200700.0\n",
"2670 72500.0\n",
"15709 460000.0\n",
" ... \n",
"13132 121200.0\n",
"8228 137500.0\n",
"3948 160900.0\n",
"8522 227300.0\n",
"16798 265600.0\n",
"Name: median_house_value, Length: 5160, dtype: float64\n",
"r2 = 0.842983315140684\n",
"mse = 2075526075.9236298\n",
"mae = 30102.91056064235\n"
]
}
   ],
   "source": [
    "print('Best hyperparameter config:', automl.best_config)\n",
"print('Best r2 on validation data: {0:.4g}'.format(1-automl.best_loss))\n",
"print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))\n",
"\n",
"y_pred = automl.predict(X_test)\n",
"print('Predicted labels', y_pred)\n",
"print('True labels', y_test)\n",
"\n",
"from flaml.ml import sklearn_metric_loss_score\n",
"print('r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))\n",
"print('mse', '=', sklearn_metric_loss_score('mse', y_pred, y_test))\n",
"print('mae', '=', sklearn_metric_loss_score('mae', y_pred, y_test))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.13 ('syml-py38')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
},
"vscode": {
"interpreter": {
"hash": "e3d9487e2ef008ade0db1bc293d3206d35cb2b6081faff9f66b40b257b7398f7"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}