{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Copyright (c) Microsoft Corporation. All rights reserved. \n", "\n", "Licensed under the MIT License.\n", "\n", "# Tune LightGBM with FLAML Library\n", "\n", "\n", "## 1. Introduction\n", "\n", "FLAML is a Python library (https://github.com/microsoft/FLAML) designed to automatically produce accurate machine learning models \n", "with low computational cost. It is fast and economical. The simple and lightweight design makes it easy \n", "to use and extend, for example by adding new learners. FLAML can \n", "- serve as an economical AutoML engine,\n", "- be used as a fast hyperparameter tuning tool, or \n", "- be embedded in self-tuning software that requires low latency & resources in repetitive\n", " tuning tasks.\n", "\n", "In this notebook, we demonstrate how to use the FLAML library to tune the hyperparameters of LightGBM with a regression example.\n", "\n", "FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `notebook` option:\n", "```bash\n", "pip install flaml[notebook]\n", "```" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%pip install flaml[notebook]==1.0.10" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## 2. Regression Example\n", "### Load data and preprocess\n", "\n", "Download the [houses dataset](https://www.openml.org/d/537) from OpenML. The task is to predict the median house price in a region based on the demographic composition and the state of the housing market in that region." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "subslide" }, "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/root/.local/lib/python3.9/site-packages/xgboost/compat.py:31: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. 
Use pandas.Index with the appropriate dtype instead.\n", " from pandas import MultiIndex, Int64Index\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "download dataset from openml\n", "Dataset name: houses\n", "X_train.shape: (15480, 8), y_train.shape: (15480,);\n", "X_test.shape: (5160, 8), y_test.shape: (5160,)\n" ] } ], "source": [ "from flaml.data import load_openml_dataset\n", "X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=537, data_dir='./')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Run FLAML\n", "In the FLAML automl run configuration, users can specify the task type, time budget, error metric, learner list, whether to subsample, resampling strategy type, and so on. All these arguments have default values which will be used if users do not provide them. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "slideshow": { "slide_type": "slide" }, "tags": [] }, "outputs": [], "source": [ "''' import AutoML class from flaml package '''\n", "from flaml import AutoML\n", "automl = AutoML()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "settings = {\n", " \"time_budget\": 240, # total running time in seconds\n", " \"metric\": 'r2', # primary metrics for regression can be chosen from: ['mae','mse','r2','rmse','mape']\n", " \"estimator_list\": ['lgbm'], # list of ML learners; we tune lightgbm in this example\n", " \"task\": 'regression', # task type \n", " \"log_file_name\": 'houses_experiment.log', # flaml log file\n", " \"seed\": 7654321, # random seed\n", "}" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "slide" }, "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "[flaml.automl: 07-01 15:22:15] {2427} INFO - task = regression\n", "[flaml.automl: 07-01 15:22:15] {2429} INFO - Data split 
method: uniform\n", "[flaml.automl: 07-01 15:22:15] {2432} INFO - Evaluation method: cv\n", "[flaml.automl: 07-01 15:22:15] {2501} INFO - Minimizing error metric: 1-r2\n", "[flaml.automl: 07-01 15:22:15] {2641} INFO - List of ML learners in AutoML Run: ['lgbm']\n", "[flaml.automl: 07-01 15:22:15] {2933} INFO - iteration 0, current learner lgbm\n", "[flaml.automl: 07-01 15:22:16] {3061} INFO - Estimated sufficient time budget=1981s. Estimated necessary time budget=2s.\n", "[flaml.automl: 07-01 15:22:16] {3108} INFO - at 0.3s,\testimator lgbm's best error=0.7383,\tbest estimator lgbm's best error=0.7383\n", "[flaml.automl: 07-01 15:22:16] {2933} INFO - iteration 1, current learner lgbm\n", "[flaml.automl: 07-01 15:22:16] {3108} INFO - at 0.5s,\testimator lgbm's best error=0.7383,\tbest estimator lgbm's best error=0.7383\n", "[flaml.automl: 07-01 15:22:16] {2933} INFO - iteration 2, current learner lgbm\n", "[flaml.automl: 07-01 15:22:16] {3108} INFO - at 0.7s,\testimator lgbm's best error=0.3250,\tbest estimator lgbm's best error=0.3250\n", "[flaml.automl: 07-01 15:22:16] {2933} INFO - iteration 3, current learner lgbm\n", "[flaml.automl: 07-01 15:22:16] {3108} INFO - at 1.1s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n", "[flaml.automl: 07-01 15:22:16] {2933} INFO - iteration 4, current learner lgbm\n", "[flaml.automl: 07-01 15:22:17] {3108} INFO - at 1.3s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n", "[flaml.automl: 07-01 15:22:17] {2933} INFO - iteration 5, current learner lgbm\n", "[flaml.automl: 07-01 15:22:19] {3108} INFO - at 3.6s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n", "[flaml.automl: 07-01 15:22:19] {2933} INFO - iteration 6, current learner lgbm\n", "[flaml.automl: 07-01 15:22:19] {3108} INFO - at 3.8s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n", "[flaml.automl: 07-01 15:22:19] {2933} INFO - iteration 7, 
current learner lgbm\n", "[flaml.automl: 07-01 15:22:19] {3108} INFO - at 4.2s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n", "[flaml.automl: 07-01 15:22:19] {2933} INFO - iteration 8, current learner lgbm\n", "[flaml.automl: 07-01 15:22:20] {3108} INFO - at 4.7s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n", "[flaml.automl: 07-01 15:22:20] {2933} INFO - iteration 9, current learner lgbm\n", "[flaml.automl: 07-01 15:22:20] {3108} INFO - at 4.9s,\testimator lgbm's best error=0.1868,\tbest estimator lgbm's best error=0.1868\n", "[flaml.automl: 07-01 15:22:20] {2933} INFO - iteration 10, current learner lgbm\n", "[flaml.automl: 07-01 15:22:22] {3108} INFO - at 6.6s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n", "[flaml.automl: 07-01 15:22:22] {2933} INFO - iteration 11, current learner lgbm\n", "[flaml.automl: 07-01 15:22:22] {3108} INFO - at 7.2s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n", "[flaml.automl: 07-01 15:22:22] {2933} INFO - iteration 12, current learner lgbm\n", "[flaml.automl: 07-01 15:22:28] {3108} INFO - at 12.9s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n", "[flaml.automl: 07-01 15:22:28] {2933} INFO - iteration 13, current learner lgbm\n", "[flaml.automl: 07-01 15:22:29] {3108} INFO - at 13.6s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n", "[flaml.automl: 07-01 15:22:29] {2933} INFO - iteration 14, current learner lgbm\n", "[flaml.automl: 07-01 15:22:34] {3108} INFO - at 18.4s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n", "[flaml.automl: 07-01 15:22:34] {2933} INFO - iteration 15, current learner lgbm\n", "[flaml.automl: 07-01 15:22:39] {3108} INFO - at 23.9s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n", "[flaml.automl: 07-01 15:22:39] {2933} INFO - iteration 16, 
current learner lgbm\n", "[flaml.automl: 07-01 15:22:40] {3108} INFO - at 24.5s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n", "[flaml.automl: 07-01 15:22:40] {2933} INFO - iteration 17, current learner lgbm\n", "[flaml.automl: 07-01 15:22:53] {3108} INFO - at 37.9s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n", "[flaml.automl: 07-01 15:22:53] {2933} INFO - iteration 18, current learner lgbm\n", "[flaml.automl: 07-01 15:22:53] {3108} INFO - at 38.2s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n", "[flaml.automl: 07-01 15:22:53] {2933} INFO - iteration 19, current learner lgbm\n", "[flaml.automl: 07-01 15:22:54] {3108} INFO - at 39.2s,\testimator lgbm's best error=0.1744,\tbest estimator lgbm's best error=0.1744\n", "[flaml.automl: 07-01 15:22:54] {2933} INFO - iteration 20, current learner lgbm\n", "[flaml.automl: 07-01 15:22:56] {3108} INFO - at 41.0s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n", "[flaml.automl: 07-01 15:22:56] {2933} INFO - iteration 21, current learner lgbm\n", "[flaml.automl: 07-01 15:22:58] {3108} INFO - at 42.5s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n", "[flaml.automl: 07-01 15:22:58] {2933} INFO - iteration 22, current learner lgbm\n", "[flaml.automl: 07-01 15:22:59] {3108} INFO - at 44.2s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n", "[flaml.automl: 07-01 15:22:59] {2933} INFO - iteration 23, current learner lgbm\n", "[flaml.automl: 07-01 15:23:03] {3108} INFO - at 47.8s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n", "[flaml.automl: 07-01 15:23:03] {2933} INFO - iteration 24, current learner lgbm\n", "[flaml.automl: 07-01 15:23:04] {3108} INFO - at 48.6s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n", "[flaml.automl: 07-01 15:23:04] {2933} INFO - 
iteration 25, current learner lgbm\n", "[flaml.automl: 07-01 15:23:05] {3108} INFO - at 49.5s,\testimator lgbm's best error=0.1738,\tbest estimator lgbm's best error=0.1738\n", "[flaml.automl: 07-01 15:23:05] {2933} INFO - iteration 26, current learner lgbm\n", "[flaml.automl: 07-01 15:23:07] {3108} INFO - at 51.4s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:07] {2933} INFO - iteration 27, current learner lgbm\n", "[flaml.automl: 07-01 15:23:09] {3108} INFO - at 53.8s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:09] {2933} INFO - iteration 28, current learner lgbm\n", "[flaml.automl: 07-01 15:23:11] {3108} INFO - at 55.4s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:11] {2933} INFO - iteration 29, current learner lgbm\n", "[flaml.automl: 07-01 15:23:12] {3108} INFO - at 56.6s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:12] {2933} INFO - iteration 30, current learner lgbm\n", "[flaml.automl: 07-01 15:23:15] {3108} INFO - at 59.8s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:15] {2933} INFO - iteration 31, current learner lgbm\n", "[flaml.automl: 07-01 15:23:20] {3108} INFO - at 64.5s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:20] {2933} INFO - iteration 32, current learner lgbm\n", "[flaml.automl: 07-01 15:23:20] {3108} INFO - at 65.1s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:20] {2933} INFO - iteration 33, current learner lgbm\n", "[flaml.automl: 07-01 15:23:31] {3108} INFO - at 76.0s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:31] {2933} 
INFO - iteration 34, current learner lgbm\n", "[flaml.automl: 07-01 15:23:32] {3108} INFO - at 76.5s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:32] {2933} INFO - iteration 35, current learner lgbm\n", "[flaml.automl: 07-01 15:23:35] {3108} INFO - at 79.3s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:35] {2933} INFO - iteration 36, current learner lgbm\n", "[flaml.automl: 07-01 15:23:35] {3108} INFO - at 80.2s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:35] {2933} INFO - iteration 37, current learner lgbm\n", "[flaml.automl: 07-01 15:23:37] {3108} INFO - at 81.5s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:37] {2933} INFO - iteration 38, current learner lgbm\n", "[flaml.automl: 07-01 15:23:39] {3108} INFO - at 83.8s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:39] {2933} INFO - iteration 39, current learner lgbm\n", "[flaml.automl: 07-01 15:23:40] {3108} INFO - at 84.8s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:40] {2933} INFO - iteration 40, current learner lgbm\n", "[flaml.automl: 07-01 15:23:43] {3108} INFO - at 88.1s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:43] {2933} INFO - iteration 41, current learner lgbm\n", "[flaml.automl: 07-01 15:23:45] {3108} INFO - at 89.4s,\testimator lgbm's best error=0.1611,\tbest estimator lgbm's best error=0.1611\n", "[flaml.automl: 07-01 15:23:45] {2933} INFO - iteration 42, current learner lgbm\n", "[flaml.automl: 07-01 15:23:47] {3108} INFO - at 91.7s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n", "[flaml.automl: 07-01 15:23:47] 
{2933} INFO - iteration 43, current learner lgbm\n", "[flaml.automl: 07-01 15:23:48] {3108} INFO - at 92.4s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n", "[flaml.automl: 07-01 15:23:48] {2933} INFO - iteration 44, current learner lgbm\n", "[flaml.automl: 07-01 15:23:54] {3108} INFO - at 98.5s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n", "[flaml.automl: 07-01 15:23:54] {2933} INFO - iteration 45, current learner lgbm\n", "[flaml.automl: 07-01 15:23:55] {3108} INFO - at 100.2s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n", "[flaml.automl: 07-01 15:23:55] {2933} INFO - iteration 46, current learner lgbm\n", "[flaml.automl: 07-01 15:23:58] {3108} INFO - at 102.6s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n", "[flaml.automl: 07-01 15:23:58] {2933} INFO - iteration 47, current learner lgbm\n", "[flaml.automl: 07-01 15:23:59] {3108} INFO - at 103.4s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n", "[flaml.automl: 07-01 15:23:59] {2933} INFO - iteration 48, current learner lgbm\n", "[flaml.automl: 07-01 15:24:03] {3108} INFO - at 108.0s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n", "[flaml.automl: 07-01 15:24:03] {2933} INFO - iteration 49, current learner lgbm\n", "[flaml.automl: 07-01 15:24:04] {3108} INFO - at 108.8s,\testimator lgbm's best error=0.1608,\tbest estimator lgbm's best error=0.1608\n", "[flaml.automl: 07-01 15:24:04] {2933} INFO - iteration 50, current learner lgbm\n", "[flaml.automl: 07-01 15:24:12] {3108} INFO - at 116.3s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n", "[flaml.automl: 07-01 15:24:12] {2933} INFO - iteration 51, current learner lgbm\n", "[flaml.automl: 07-01 15:25:01] {3108} INFO - at 166.2s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n", "[flaml.automl: 
07-01 15:25:01] {2933} INFO - iteration 52, current learner lgbm\n", "[flaml.automl: 07-01 15:25:02] {3108} INFO - at 167.2s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n", "[flaml.automl: 07-01 15:25:02] {2933} INFO - iteration 53, current learner lgbm\n", "[flaml.automl: 07-01 15:25:04] {3108} INFO - at 168.7s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n", "[flaml.automl: 07-01 15:25:04] {2933} INFO - iteration 54, current learner lgbm\n", "[flaml.automl: 07-01 15:25:38] {3108} INFO - at 203.0s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n", "[flaml.automl: 07-01 15:25:38] {2933} INFO - iteration 55, current learner lgbm\n", "[flaml.automl: 07-01 15:25:47] {3108} INFO - at 211.9s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n", "[flaml.automl: 07-01 15:25:47] {2933} INFO - iteration 56, current learner lgbm\n", "[flaml.automl: 07-01 15:25:51] {3108} INFO - at 216.2s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n", "[flaml.automl: 07-01 15:25:51] {2933} INFO - iteration 57, current learner lgbm\n", "[flaml.automl: 07-01 15:25:53] {3108} INFO - at 217.8s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n", "[flaml.automl: 07-01 15:25:53] {2933} INFO - iteration 58, current learner lgbm\n", "[flaml.automl: 07-01 15:26:19] {3108} INFO - at 243.9s,\testimator lgbm's best error=0.1558,\tbest estimator lgbm's best error=0.1558\n", "[flaml.automl: 07-01 15:26:21] {3372} INFO - retrain lgbm for 1.7s\n", "[flaml.automl: 07-01 15:26:21] {3379} INFO - retrained model: LGBMRegressor(colsample_bytree=0.6884091116362046,\n", " learning_rate=0.0825101833775657, max_bin=1023,\n", " min_child_samples=15, n_estimators=436, num_leaves=46,\n", " reg_alpha=0.0010949400705571237, reg_lambda=0.004934208563558304,\n", " verbose=-1)\n", "[flaml.automl: 07-01 15:26:21] {2672} INFO - fit 
succeeded\n", "[flaml.automl: 07-01 15:26:21] {2673} INFO - Time taken to find the best model: 116.267258644104\n" ] } ], "source": [ "'''The main flaml automl API'''\n", "automl.fit(X_train=X_train, y_train=y_train, **settings)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Best model and metric" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "slide" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best hyperparameter config: {'n_estimators': 436, 'num_leaves': 46, 'min_child_samples': 15, 'learning_rate': 0.0825101833775657, 'log_max_bin': 10, 'colsample_bytree': 0.6884091116362046, 'reg_alpha': 0.0010949400705571237, 'reg_lambda': 0.004934208563558304}\n", "Best r2 on validation data: 0.8442\n", "Training duration of best run: 1.668 s\n" ] } ], "source": [ "''' retrieve best config'''\n", "print('Best hyperparameter config:', automl.best_config)\n", "print('Best r2 on validation data: {0:.4g}'.format(1-automl.best_loss))\n", "print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
LGBMRegressor(colsample_bytree=0.6884091116362046,\n",
       "              learning_rate=0.0825101833775657, max_bin=1023,\n",
       "              min_child_samples=15, n_estimators=436, num_leaves=46,\n",
       "              reg_alpha=0.0010949400705571237, reg_lambda=0.004934208563558304,\n",
       "              verbose=-1)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
" ], "text/plain": [ "LGBMRegressor(colsample_bytree=0.6884091116362046,\n", " learning_rate=0.0825101833775657, max_bin=1023,\n", " min_child_samples=15, n_estimators=436, num_leaves=46,\n", " reg_alpha=0.0010949400705571237, reg_lambda=0.004934208563558304,\n", " verbose=-1)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "automl.model.estimator" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAc0AAAD4CAYAAACOhb23AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAAAee0lEQVR4nO3deZRdVZn38e+PIiRAsAIksqojUoJpEEgokgIFQxonUOyXQaKhoSFgL9MMouKiJS0uDdi2QLSlURDC20gQRDsMwgKZGgjkRUKoIpWqBAggiS0RQYYUYCRA5Xn/OLvkcq3h3Bpyh/p91rqrzt1nn72ffU/Bk73PqXMVEZiZmVn/tih3AGZmZtXCSdPMzCwnJ00zM7OcnDTNzMxyctI0MzPLactyB2DDa/z48dHY2FjuMMzMqkpra+sLETGhuNxJs8Y1NjbS0tJS7jDMzKqKpN/2VO7lWTMzs5ycNM3MzHJy0jQzM8vJSdPMzCwnJ00zM7OcnDTNzMxyctI0MzPLyUnTzMwsJz/coMZ1rOukce6t5Q7DzGzYrD3v05utL880zczMcnLSNDMzy8lJ08zMLCcnTTMzs5ycNM3MzHJy0jQzM8vJSbOApNeGoc3DJc1N20dK2nMAbSyW1DzUsZmZWWmcNIdZRNwcEeelt0cCJSdNMzOrDE6aPVBmvqSVkjokzUrlB6dZ33WSHpd0jSSlfYelslZJF0m6JZWfKOlHkg4EDgfmS2qTtFvhDFLSeElr0/bWkn4u6TFJNwJbF8R2iKQHJT0iaZGksZv30zEzG7n8RKCefQZoAvYBxgMPS7o/7dsX2Av4PfAA8GFJLcBlwIyIWCPp2uIGI+LXkm4GbomI6wBSvu3JKcCGiPiApCnAI6n+eOAbwMcj4k+SzgK+CpxbeLCkOcAcgLp3TRjYJ2BmZn/FM82eTQeujYiuiHgOuA/YL+1bFhHPRMQmoA1oBPYAno6INanOXyXNEs0ArgaIiHagPZV/iGx59wFJbcBsYJfigyNiQUQ0R0Rz3Tb1gwzFzMy6eaZZuo0F210M7jN8i7f/4TImR30Bd0XEPwyiTzMzGyDPNHu2BJglqU7SBLKZ37I+6q8GdpXUmN7P6qXeq8B2Be/XAtPS9syC8vuBYwEk7Q1MSeVLyZaD35/2bSvpb/MMyMzMBs9Js2c3ki2JrgDuAb4WEX/orXJE/Bk4FbhdUitZcuzsoerPgX+RtFzSbsD3gFMkLSe7dtrtx8BYSY+RXa9sTf38ETgRuFZSO/Ag2dKwmZltBoqIcsdQEySNjYjX0t20FwNPRsQPyh3X6IZJ0TD7wnKHYWY2bIbjq8EktUbEX/19vGeaQ+cL6eacVUA92d20ZmZWQ3wj0BBJs8qyzyzNzGz4eKZpZm
aWk5OmmZlZTk6aZmZmOfmaZo2bPLGelmG4s8zMbCTyTNPMzCwnJ00zM7OcnDTNzMxyctI0MzPLyTcC1biOdZ00zr213GGYmW1Ww/FoPfBM08zMLDcnTTMzs5ycNM3MzHJy0jQzM8vJSdPMzCwnJ00zM7OcnDRLIOm1fvaPk3Rqwfu/kXRd2m6SdNgA+pwn6czSozUzs6HmpDm0xgF/SZoR8fuImJneNgElJ00zM6scTpoDIGmspLslPSKpQ9IRadd5wG6S2iTNl9QoaaWkrYBzgVlp36ziGWSq15i2z5b0hKT/B+xeUGc3SbdLapW0RNIem2/UZmbmJwINzOvAURHxiqTxwFJJNwNzgb0jogmgOwlGxBuSvgk0R8QX0755PTUsaRpwDNnMdEvgEaA17V4AnBwRT0r6IHAJ8NEe2pgDzAGoe9eEIRiumZmBk+ZACfh3STOATcBEYKchavsg4MaI2ACQkjGSxgIHAoskddcd3VMDEbGALMEyumFSDFFcZmYjnpPmwBwHTACmRcSbktYCY0ps4y3euTze3/FbAOu7Z7FmZrb5+ZrmwNQDz6eE+RFgl1T+KrBdL8cU71sLTAWQNBV4Xyq/HzhS0taStgP+D0BEvAKskfTZdIwk7TN0QzIzs/44aQ7MNUCzpA7gBOBxgIh4EXgg3dQzv+iYe4E9u28EAq4HdpC0Cvgi8ERq4xHgF8AK4Dbg4YI2jgP+SdIKYBVwBGZmttkowpe8atnohknRMPvCcodhZrZZDfarwSS1RkRzcblnmmZmZjk5aZqZmeXkpGlmZpaTk6aZmVlO/jvNGjd5Yj0tg7wgbmZmGc80zczMcnLSNDMzy8lJ08zMLCcnTTMzs5x8I1CN61jXSePcW8sdhlnFGOyTYmxk80zTzMwsJydNMzOznJw0zczMcnLSNDMzy8lJ08zMLCcnTTMzs5xGRNKU1ChpZRn6fa3E+vMkndlDeVniNzOzdxoRSdPMzGwojKSkWSfpckmrJN0paWtJTZKWSmqXdKOk7QEkLZbUnLbHS1qbtveStExSWzpmUir/x4LyyyTVdXcq6TuSVqR+dkpljZLuSW3cLem9xcFKmpaOWwGcVlDeYwxmZjb8RlLSnARcHBF7AeuBo4GrgLMiYgrQAXyrnzZOBv4zIpqAZuAZSR8AZgEfTuVdwHGp/rbA0ojYB7gf+EIq/yGwMPV7DXBRD339BDg9HdtnDMUHSpojqUVSS9eGzn6GZGZmeY2kpLkmItrSdiuwGzAuIu5LZQuBGf208SDwdUlnAbtExJ+BjwHTgIcltaX3u6b6bwC3FPTZmLYPAH6Wtn8KTC/sRNK4FNv9BXX6iuEdImJBRDRHRHPdNvX9DMnMzPIaSUlzY8F2FzCuj7pv8fZnM6a7MCJ+BhwO/Bn4laSPAiKbNTal1+4RMS8d8mZEREGfg37Wby8xmJnZZjCSkmaxTuBlSQel98cD3bPOtWSzR4CZ3QdI2hV4OiIuAm4CpgB3AzMlvTvV2UHSLv30/WvgmLR9HLCkcGdErAfWS5peUKevGMzMbDMYyUkTYDYwX1I70AScm8q/B5wiaTkwvqD+54CVaRl2b+CqiHgU+AZwZ2rnLqChn35PB05K9Y8HvtxDnZOAi1Nf6iuGXCM1M7NB09urh1aLRjdMiobZF5Y7DLOK4a8GszwktUZEc3H5SJ9pmpmZ5eakaWZmlpOTppmZWU5OmmZmZjkN+u8GrbJNnlhPi298MDMbEp5pmpmZ5eSkaWZmlpOTppmZWU5OmmZmZjn5RqAa17Guk8a5t5Y7DDMbBn660ebnmaaZmVlOTppmZmY5OWmamZnl5KRpZmaWk5OmmZlZTk6aZmZmOTlpDgNJjZJW5qhzbMH7ZkkXDX90ZmY2UE6a5dMI/CVpRkRLRHypfOGYmVl/RmTSTLO8xyVdI+kxSddJ2kbSxyQtl9Qh6QpJo1P9tZIuSOXLJL0/lV8paWZBu6/10tcSSY+k14Fp13nAQZLaJJ
0h6WBJt6RjdpD0S0ntkpZKmpLK56W4Fkt6WpKTrJnZZjQik2ayO3BJRHwAeAX4KnAlMCsiJpM9LemUgvqdqfxHwIUl9PM88ImImArMArqXYOcCSyKiKSJ+UHTMOcDyiJgCfB24qmDfHsChwP7AtySNKu5Q0hxJLZJaujZ0lhCqmZn1ZSQnzd9FxANp+2rgY8CaiHgilS0EZhTUv7bg5wEl9DMKuFxSB7AI2DPHMdOBnwJExD3AjpLelfbdGhEbI+IFsoS8U/HBEbEgIpojorlum/oSQjUzs76M5GfPRtH79cCOOet3b79F+oeHpC2ArXo47gzgOWCfVPf1AcRaaGPBdhcj+xyamW1WI3mm+V5J3TPGY4EWoLH7eiVwPHBfQf1ZBT8fTNtrgWlp+3CyWWWxeuDZiNiU2qxL5a8C2/US2xLgOABJBwMvRMQreQZlZmbDZyTPUlYDp0m6AngU+BKwFFgkaUvgYeDSgvrbS2onm+n9Qyq7HLhJ0grgduBPPfRzCXC9pBOK6rQDXenYK4HlBcfMA65I/W0AZg9uqGZmNhQUUbxKWfskNQK3RMTeOeuvBZrTdcSqMrphUjTMvrDcYZjZMPBXgw0fSa0R0VxcPpKXZ83MzEoyIpdnI2ItkGuWmeo3DlswZmZWNTzTNDMzy8lJ08zMLCcnTTMzs5xG5DXNkWTyxHpafIedmdmQ8EzTzMwsJydNMzOznJw0zczMcnLSNDMzy8k3AtW4jnWdNM69tdxhmFUtP6rOCnmmaWZmlpOTppmZWU5OmmZmZjk5aZqZmeXkpGlmZpaTk6aZmVlOFZc0JY2TdGo/dRolHZujrUZJK/vYf6KkHw0kzqE43szMqkvFJU1gHNBn0gQagX6TZrlI8t+/mpnVoEpMmucBu0lqkzQ/vVZK6pA0q6DOQanOGWlGuUTSI+l1YAn97SxpsaQnJX2ru1DSP0palvq4TFJdKj9J0hOSlgEfLqh/paRLJT0EXCCpSdJSSe2SbpS0farXW/liST+Q1CLpMUn7SbohxfVvqc62km6VtCJ9JrMwM7PNphKT5lzgNxHRBCwFmoB9gI8D8yU1pDpLIqIpIn4APA98IiKmArOAi0rob3/gaGAK8FlJzZI+kNr5cIqjCzgu9X0OWbKcDuxZ1NZ7gAMj4qvAVcBZETEF6AC6E3Jv5QBvREQzcClwE3AasDdwoqQdgU8Cv4+IfSJib+D2ngYkaU5Kvi1dGzpL+CjMzKwvlb6MOB24NiK6gOck3QfsB7xSVG8U8CNJTWQJ7m9L6OOuiHgRQNINqc+3gGnAw5IAtiZLzB8EFkfEH1P9XxT1tSgiuiTVA+Mi4r5UvhBY1Ft5wfE3p58dwKqIeDb18zSwcyr/vqTzgVsiYklPA4qIBcACgNENk6KEz8LMzPpQ6UkzrzOA58hmpFsAr5dwbHFSCUDAwoj418Idko7sp60/ldBvTzamn5sKtrvfbxkRT0iaChwG/JukuyPi3EH2aWZmOVXi8uyrwHZpewkwS1KdpAnADGBZUR2AeuDZiNgEHA/UldDfJyTtIGlr4EjgAeBuYKakdwOk/bsADwF/J2lHSaOAz/bUYER0Ai9LOigVHQ/c11t53kAl/Q2wISKuBuYDU0sYp5mZDVLFzTQj4kVJD6Q/FbkNaAdWkM0AvxYRf5D0ItAlaQVwJXAJcL2kE8iu85Uy41sGXE92PfLqiGgBkPQN4E5JWwBvAqdFxFJJ84AHgfVAWx/tzgYulbQN8DRwUj/leUwmu667KcV0SgnHmpnZICnCl7xq2eiGSdEw+8Jyh2FWtfzVYCOTpNZ0Y+Y7VOLyrJmZWUWquOXZ4SDpUOD8ouI1EXFUOeIxM7PqNCKSZkTcAdxR7jjMzKy6eXnWzMwspxEx0xzJJk+sp8U3MpiZDQnPNM3MzHJy0jQzM8vJSdPMzCwnJ00zM7OcfCNQjetY10nj3FvLHYbZiOcnC9UGzzTNzMxyctI0MzPLyUnTzMwsJydNMz
OznJw0zczMcnLSNDMzy8lJ08zMLKeaTpqSxkk6tZ86jZKOzdFWo6SVQxedmZlVm5pOmsA4oM+kCTQC/SbNUkjyQyPMzGpQrSfN84DdJLVJmp9eKyV1SJpVUOegVOeMNKNcIumR9DowT0eSTpR0s6R7gLsl7SDpl5LaJS2VNCXV6618nqSFqe/fSvqMpAtSrLdLGpXqnSfp0XT893qJZY6kFkktXRs6B/sZmplZUuszornA3hHRJOlo4GRgH2A88LCk+1OdMyPi7wEkbQN8IiJelzQJuBZoztnfVGBKRLwk6YfA8og4UtJHgauAJuCcXsoBdgM+AuwJPAgcHRFfk3Qj8GlJS4CjgD0iIiSN6ymIiFgALAAY3TApcsZuZmb9qPWZZqHpwLUR0RURzwH3Afv1UG8UcLmkDmARWQLL666IeKmgv58CRMQ9wI6S3tVHOcBtEfEm0AHUAben8g6yZeRO4HXgvyR9BthQQmxmZjZIIylp5nUG8BzZjLQZ2KqEY/80yL43AkTEJuDNiOieJW4CtoyIt4D9geuAv+ftpGpmZptBrSfNV4Ht0vYSYJakOkkTgBnAsqI6APXAsylxHU824xuIJcBxAJIOBl6IiFf6KO+XpLFAfUT8iiy57zPA2MzMbABq+ppmRLwo6YH0pyK3Ae3ACiCAr0XEHyS9CHRJWgFcCVwCXC/pBLKZ3EBnj/OAKyS1ky2jzu6nPI/tgJskjQEEfHWAsZmZ2QDo7RVAq0WjGyZFw+wLyx2G2Yjn79OsLpJaI+KvbgKt9eVZMzOzIVPTy7PDQdKhwPlFxWsi4qhyxGNmZpuPk2aJIuIO4I5yx2FmZpufk2aNmzyxnhZfSzEzGxK+pmlmZpaTk6aZmVlOTppmZmY5OWmamZnl5BuBalzHuk4a595a7jDMcvNDAKySeaZpZmaWk5OmmZlZTk6aZmZmOTlpmpmZ5eSkaWZmlpOTppmZWU5OmmZmZjn1mzQlNUpaOVwBSPr1cLU9WIVjl9Qs6aJyx2RmZuVT9ocbRMSB5Y4hj4hoAVrKHYeZmZVP3uXZOkmXS1ol6U5JW0tqkrRUUrukGyVtDyBpsaTmtD1e0tq0vZekZZLa0jGTUvlr6efB6djrJD0u6RpJSvsOS2Wtki6SdEtvgUqaJ2mhpCWSfivpM5IukNQh6XZJo1K9aZLuS23eIamhoHyFpBXAaQXtHtzdr6T9JT0oabmkX0vaPZWfKOmG1M+Tki7o60OV9GNJLelzPaegvMfxStpW0hXpc1wu6Yhe2p2T2m3p2tDZVwhmZlaCvElzEnBxROwFrAeOBq4CzoqIKUAH8K1+2jgZ+M+IaAKagWd6qLMv8BVgT2BX4MOSxgCXAZ+KiGnAhBzx7gZ8FDgcuBq4NyImA38GPp0S5w+BmanNK4DvpGN/ApweEfv00f7jwEERsS/wTeDfC/Y1AbOAycAsSTv30c7ZEdEMTAH+TtKUfsZ7NnBPROwPfASYL2nb4kYjYkFENEdEc9029X10b2Zmpci7PLsmItrSditZUhoXEfelsoXAon7aeBA4W9J7gBsi4ske6iyLiGcAJLUBjcBrwNMRsSbVuRaY009ft0XEm5I6gDrg9lTekdrcHdgbuCtNZuuAZyWNS+O6P9X/KfCpHtqvBxam2XIAowr23R0RnWkMjwK7AL/rJc7PSZpDdh4ayP6xsEUf4z0EOFzSmen9GOC9wGN9fhpmZjYk8ibNjQXbXcC4Puq+xdsz2DHdhRHxM0kPAZ8GfiXpnyPinn76Geg1142pz02S3oyISOWbUpsCVkXEAYUHpaSZx7fJZq9HSWoEFhf3nfQ6BknvA84E9ouIlyVdScHn1QsBR0fE6pxxmpnZEBron5x0Ai9LOii9Px7onnWuBaal7ZndB0jalWwGdRFwE9mSZB6rgV1TcoJs6XOwVgMTJB2QYhslaa+IWA+slzQ91Tuul+PrgXVp+8QBxvAu4E9Ap6SdeHtG29d47wBOL7jWu+8A+z
YzswEYzN9pzia7ptZOdh3v3FT+PeAUScuB8QX1PwesTMuue5NdE+1XRPwZOBW4XVIr8CpZ0h6wiHiDLKGfn274aQO67+I9Cbg4xalemrgA+G4a44BmwxGxAlhOdn30Z8ADqbyv8X6bbCm4XdKq9N7MzDYTvb1yWbkkjY2I19IM62LgyYj4QbnjGi5DOd7RDZOiYfaFQxqf2XDy92laJZDUmm7UfIdqeSLQF9LMbxXZ0uhl5Q1n2I208ZqZVYWyP9wgjzTLesdMS9JJwJeLqj4QEadRYdINUKOLio+PiI6e6vc0XjMzK7+qSJo9iYifkP1NZcWLiA+WOwYzMxu8almeNTMzK7uqnWlaPpMn1tPiGyvMzIaEZ5pmZmY5OWmamZnl5KRpZmaWk5OmmZlZTr4RqMZ1rOukce6t5Q7DzCqIn7o0cJ5pmpmZ5eSkaWZmlpOTppmZWU5OmmZmZjk5aZqZmeXkpGlmZpaTk6aZmVlONZs0JS2W1Jy2fyVp3BC2fbKkE4aqPTMzqw4j4uEGEXHYELd36VC2Z2Zm1aGiZpqSGiU9LulKSU9IukbSxyU9IOlJSftL2lbSFZKWSVou6Yh07NaSfi7pMUk3AlsXtLtW0vi0/UtJrZJWSZpTUOc1Sd+RtELSUkk79RHnPElnpu3Fks5P8Twh6aBUXifpe5JWSmqXdHoq/1iKuyONY3RBjN+V1CapRdJUSXdI+o2kkwv6/hdJD6c2z+klvjmpjZauDZ2DOCNmZlaoopJm8n7g+8Ae6XUsMB04E/g6cDZwT0TsD3wEmC9pW+AUYENEfAD4FjCtl/Y/HxHTgGbgS5J2TOXbAksjYh/gfuALJcS8ZYrnK6lvgDlAI9AUEVOAaySNAa4EZkXEZLKZ/ikF7fxvRDQBS1K9mcCHgHMAJB0CTAL2B5qAaZJmFAcTEQsiojkimuu2qS9hGGZm1pdKTJprIqIjIjYBq4C7IyKADrIkdAgwV1IbsBgYA7wXmAFcDRAR7UB7L+1/SdIKYCmwM1kSAngDuCVtt6a+8rqhh+M+DlwWEW+lmF4Cdk/jeyLVWZji7nZz+tkBPBQRr0bEH4GN6ZrsIem1HHiE7B8VkzAzs82iEq9pbizY3lTwfhNZvF3A0RGxuvAgSf02LOlgsmR2QERskLSYLOkCvJmSM6mPUj6b7hhLPa63dgrH3f1+S0DAdyPiskH0YWZmA1SJM83+3AGcrpQlJe2byu8nW8pF0t7AlB6OrQdeTglzD7Klz+FyF/DPkrZMMe0ArAYaJb0/1TkeuK+ENu8APi9pbGpzoqR3D2HMZmbWh2pMmt8GRgHtklal9wA/BsZKegw4l2yptNjtwJapznlkS7TD5f8C/5viXAEcGxGvAycBiyR1kM0gc9+JGxF3Aj8DHkzHXwdsN+SRm5lZj/T2iqTVotENk6Jh9oXlDsPMKoi/T7N/klojorm4vBpnmmZmZmVRiTcCVQxJZwOfLSpeFBHfKUc8ZmZWXk6afUjJ0QnSzMwAJ82aN3liPS2+fmFmNiR8TdPMzCwnJ00zM7OcnDTNzMxyctI0MzPLyUnTzMwsJydNMzOznJw0zczMcnLSNDMzy8lJ08zMLCd/y0mNk/Qq2fd4VrvxwAvlDmKQamEM4HFUkloYA1TmOHaJiAnFhX6MXu1b3dPX21QbSS3VPo5aGAN4HJWkFsYA1TUOL8+amZnl5KRpZmaWk5Nm7VtQ7gCGSC2MoxbGAB5HJamFMUAVjcM3ApmZmeXkmaaZmVlOTppmZmY5OWnWKEmflLRa0lOS5pY7nv5IWiupQ1KbpJZUtoOkuyQ9mX5un8ol6aI0tnZJU8sY9xWSnpe0sqCs5LglzU71n5Q0uwLGME/SunQ+2iQdVrDvX9MYVks6tKC8rL9zknaWdK+kRyWtkvTlVF4156OPMVTV+ZA0RtIySSvSOM5J5e+T9FCK6ReStkrlo9P7p9L+xv7GVzYR4VeNvYA64DfArsBWwApgz3LH1U/Ma4
HxRWUXAHPT9lzg/LR9GHAbIOBDwENljHsGMBVYOdC4gR2Ap9PP7dP29mUewzzgzB7q7pl+n0YD70u/Z3WV8DsHNABT0/Z2wBMp3qo5H32MoarOR/pMx6btUcBD6TP+b+CYVH4pcEraPhW4NG0fA/yir/Ftzt+r4pdnmrVpf+CpiHg6It4Afg4cUeaYBuIIYGHaXggcWVB+VWSWAuMkNZQhPiLifuClouJS4z4UuCsiXoqIl4G7gE8Oe/BJL2PozRHAzyNiY0SsAZ4i+30r++9cRDwbEY+k7VeBx4CJVNH56GMMvanI85E+09fS21HpFcBHgetSefG56D5H1wEfkyR6H1/ZOGnWponA7wreP0Pf/+FVggDulNQqaU4q2ykink3bfwB2StuVPr5S467U8XwxLVte0b2kSZWMIS3v7Us2w6nK81E0Bqiy8yGpTlIb8DzZPzx+A6yPiLd6iOkv8ab9ncCOVMA4ijlpWqWYHhFTgU8Bp0maUbgzsrWaqvv7qGqNG/gxsBvQBDwLfL+s0ZRA0ljgeuArEfFK4b5qOR89jKHqzkdEdEVEE/AestnhHuWNaGg4adamdcDOBe/fk8oqVkSsSz+fB24k+4/sue5l1/Tz+VS90sdXatwVN56IeC79T28TcDlvL4lV9BgkjSJLNtdExA2puKrOR09jqNbzARAR64F7gQPIlsC7n3leGNNf4k3764EXqaBxdHPSrE0PA5PSnWpbkV1Yv7nMMfVK0raStuveBg4BVpLF3H3n4mzgprR9M3BCuvvxQ0BnwfJbJSg17juAQyRtn5bdDkllZVN0jfgosvMB2RiOSXc7vg+YBCyjAn7n0jWw/wIei4j/KNhVNeejtzFU2/mQNEHSuLS9NfAJsuuz9wIzU7Xic9F9jmYC96RVgd7GVz7lvAvJr+F7kd0Z+ATZdYSzyx1PP7HuSnaH3ApgVXe8ZNc07gaeBP4H2CGVC7g4ja0DaC5j7NeSLZe9SXa95Z8GEjfwebKbHJ4CTqqAMfw0xdhO9j+uhoL6Z6cxrAY+VSm/c8B0sqXXdqAtvQ6rpvPRxxiq6nwAU4DlKd6VwDdT+a5kSe8pYBEwOpWPSe+fSvt37W985Xr5MXpmZmY5eXnWzMwsJydNMzOznJw0zczMcnLSNDMzy8lJ08zMLCcnTTMzs5ycNM3MzHL6/xT29zgweRDLAAAAAElFTkSuQmCC", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "plt.barh(automl.feature_names_in_, automl.feature_importances_)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "''' pickle and save the automl object '''\n", "import pickle\n", "with open('automl.pkl', 'wb') as f:\n", " pickle.dump(automl, f, pickle.HIGHEST_PROTOCOL)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "slideshow": { "slide_type": "slide" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Predicted labels [162131.66541776 261207.15681479 157976.50985102 ... 205999.47588989\n", " 223985.57564169 277733.77442341]\n", "True labels 14740 136900.0\n", "10101 241300.0\n", "20566 200700.0\n", "2670 72500.0\n", "15709 460000.0\n", " ... \n", "13132 121200.0\n", "8228 137500.0\n", "3948 160900.0\n", "8522 227300.0\n", "16798 265600.0\n", "Name: median_house_value, Length: 5160, dtype: float64\n" ] } ], "source": [ "''' compute predictions of testing dataset ''' \n", "y_pred = automl.predict(X_test)\n", "print('Predicted labels', y_pred)\n", "print('True labels', y_test)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "slideshow": { "slide_type": "slide" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "r2 = 0.8522136092023422\n", "mse = 1953515373.4904487\n", "mae = 29086.15911420206\n" ] } ], "source": [ "''' compute different metric values on testing dataset'''\n", "from flaml.ml import sklearn_metric_loss_score\n", "print('r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))\n", "print('mse', '=', sklearn_metric_loss_score('mse', y_pred, y_test))\n", "print('mae', '=', sklearn_metric_loss_score('mae', y_pred, y_test))" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "slideshow": { "slide_type": "subslide" }, 
"tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20, 'learning_rate': 0.09999999999999995, 'log_max_bin': 8, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 1.0}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 4, 'num_leaves': 4, 'min_child_samples': 20, 'learning_rate': 0.09999999999999995, 'log_max_bin': 8, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 1.0}}\n", "{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 22, 'num_leaves': 4, 'min_child_samples': 18, 'learning_rate': 0.2293009676418639, 'log_max_bin': 9, 'colsample_bytree': 0.9086551727646448, 'reg_alpha': 0.0015561782752413472, 'reg_lambda': 0.33127416269768944}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 22, 'num_leaves': 4, 'min_child_samples': 18, 'learning_rate': 0.2293009676418639, 'log_max_bin': 9, 'colsample_bytree': 0.9086551727646448, 'reg_alpha': 0.0015561782752413472, 'reg_lambda': 0.33127416269768944}}\n", "{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 28, 'num_leaves': 20, 'min_child_samples': 17, 'learning_rate': 0.32352862101602586, 'log_max_bin': 10, 'colsample_bytree': 0.8801327898366843, 'reg_alpha': 0.004475520554844502, 'reg_lambda': 0.033081571878574946}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 28, 'num_leaves': 20, 'min_child_samples': 17, 'learning_rate': 0.32352862101602586, 'log_max_bin': 10, 'colsample_bytree': 0.8801327898366843, 'reg_alpha': 0.004475520554844502, 'reg_lambda': 0.033081571878574946}}\n", "{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 44, 'num_leaves': 81, 'min_child_samples': 29, 'learning_rate': 0.26477481203117526, 'log_max_bin': 10, 
'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.028486834222229064}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 44, 'num_leaves': 81, 'min_child_samples': 29, 'learning_rate': 0.26477481203117526, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.028486834222229064}}\n", "{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 44, 'num_leaves': 70, 'min_child_samples': 19, 'learning_rate': 0.182061387379683, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.001534805484993033}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 44, 'num_leaves': 70, 'min_child_samples': 19, 'learning_rate': 0.182061387379683, 'log_max_bin': 10, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.001534805484993033}}\n", "{'Current Learner': 'lgbm', 'Current Sample': 15480, 'Current Hyper-parameters': {'n_estimators': 34, 'num_leaves': 178, 'min_child_samples': 14, 'learning_rate': 0.16444778912464286, 'log_max_bin': 9, 'colsample_bytree': 0.8963761466973907, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.027857858022692302}, 'Best Learner': 'lgbm', 'Best Hyper-parameters': {'n_estimators': 34, 'num_leaves': 178, 'min_child_samples': 14, 'learning_rate': 0.16444778912464286, 'log_max_bin': 9, 'colsample_bytree': 0.8963761466973907, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.027857858022692302}}\n" ] } ], "source": [ "from flaml.data import get_output_from_log\n", "time_history, best_valid_loss_history, valid_loss_history, config_history, metric_history = \\\n", " get_output_from_log(filename=settings['log_file_name'], time_budget=60)\n", "\n", "for config in config_history:\n", " print(config)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYIAAAEWCAYAAABrDZDcAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAAAb5ElEQVR4nO3de5wddZ3m8c9DEyDKJWJaB0JC4hCjwQvRCOIVGDXghURFBpidVRyNzojjiBMFR5HBZQeHGVx8bdQFlgFd7ggxajQygqiAkGCAEDBMRIQ0KEEIIEZCkmf/qGo4NKdPOqHrnO5Tz/v16lef+tXvVH0r6e7nVP3qIttERER9bdPpAiIiorMSBBERNZcgiIiouQRBRETNJQgiImouQRARUXMJgogWJL1R0spO1xFRpQRBjFiS7pL0lk7WYPuntqdVtXxJsyT9RNKjktZIulrSoVWtL6KZBEHUmqSeDq77MOAS4BvAHsALgROAd23FsiQpv8+xVfKDE6OOpG0kHSfpV5J+L+liSbs2zL9E0m8lPVx+2t67Yd45kr4maZGkx4ADyz2Pf5R0S/meiyTtUPY/QNLqhvcP2rec/2lJ90m6V9KHJFnSXk22QcBpwBdtn2X7YdubbF9t+8NlnxMl/b+G90wul7dtOf1jSSdLugb4IzBP0tIB6/mkpIXl6+0l/ZukuyX9TtLXJY19lv8d0QUSBDEafRyYA7wZ2B14CJjfMP/7wFTgBcAvgPMGvP8o4GRgJ+BnZdvhwMHAFOAVwAdarL9pX0kHA8cCbwH2Ag5osYxpwETg0hZ9huKvgbkU2/J1YJqkqQ3zjwLOL1+fArwY2KesbwLFHkjUXIIgRqOPAv9ke7Xtx4ETgcP6PynbPtv2ow3zXilpl4b3f9v2NeUn8D+VbV+xfa/tB4HvUPyxHMxgfQ8H/sP2Ctt/LNc9mOeX3+8b2iYP6pxyfRtsPwx8GzgSoAyElwALyz2QucAnbT9o+1HgfwJHPMv1RxdIEMRotCdwuaS1ktYCtwMbgRdK6pF0SnnY6BHgrvI94xvef0+TZf624fUfgR1brH+wvrsPWHaz9fT7ffl9txZ9hmLgOs6nDAKKvYEFZSj1As8Bbmz4d/tB2R41lyCI0ege4BDb4xq+drDdR/HHbzbF4ZldgMnle9Tw/qpuuXsfxaBvv4kt+q6k2I73tujzGMUf735/1qTPwG25AuiVtA9FIPQfFnoAWAfs3fBvtovtVoEXNZEgiJFujKQdGr62pTgWfrKkPQEk9UqaXfbfCXic4hP3cygOf7TLxcDRkl4q6TnA5wfr6OL+78cCn5d0tKSdy0HwN0g6o+x2E/AmSZPKQ1vHb64A209QnIl0KrArRTBgexNwJvBlSS8AkDRB0qyt3djoHgmCGOkWUXyS7f86ETgdWAj8UNKjwM+B/cr+3wB+A/QBt5Xz2sL294GvAFcBqxrW/fgg/S8F/hL4IHAv8Dvgf1Ac58f2FcBFwC3AjcB3h1jK+RR7RJfY3tDQ/pn+usrDZv9JMWgdNac8mCaiGpJeCtwKbD/gD3LEiJI9gohhJOnd5fn6zwO+BHwnIRAjXYIgYnh9BLgf+BXFmUx/29lyIjYvh4YiImouewQRETW3bacL2FLjx4/35MmTO11GRMSocuONNz5gu+kFhKMuCCZPnszSpUs33zEiIp4k6TeDzcuhoYiImksQRETUXIIgIqLmEgQRETWXIIiIqLlRd9ZQRETdLFjWx6mLV3Lv2nXsPm4s82ZNY86MCcO2/ARBRMQItmBZH8dftpx1T2wEoG/tOo6/bDnAsIVBDg1FRIxgpy5e+WQI9Fv3xEZOXbxy2NaRIIiIGMHuXbtui9q3RoIgImIE233c2C1q3xoZI+igqgeAIrpNHX9n5s2a9rQxAoCxY3qYN2v4Hi6XIOiQdgwARXSTuv7O9G/bpy+9hfUbNzGhggAcdc8jmDlzprvhpnOvP+VK+poc49uuZxtmTBrX/oIiRrhld69
l/cZNz2ivy+/Mbfc9wvTdduaij+y/Ve+XdKPtmc3mZYygQwYb6Gn2gx4Rg/9u1OV3ZvpuOzN7n2r2fGp7aKjTxxp3Hze26R7BhHFjtzrxI7rZYHvR+Z159mq5R9B/rLFv7TrMU8caFyzra1sN82ZNY+yYnqe1DfcAUEQ3ye9MdWq5RzDYBRqfvvQWLrjh7rbVsfu4HbhzzWMYKhkAiugm/b8bdTtrqB1qGQQj5fj8+B23Z/yO2zN7nwkctd+ktq47YjSaM2NC/vBXoNIgkHQwcDrQA5xl+5QB8ycB5wLjyj7H2V5UZU2Q4/MREY0qGyOQ1APMBw4BpgNHSpo+oNvngIttzwCOAL5aVT2NcqwxIuIpVQ4W7wussn2n7fXAhcDsAX0M7Fy+3gW4t8J6njRnxgT+5T0vZ7ueYvMnjBvLv7zn5dnljIhaqvLQ0ATgnobp1cB+A/qcCPxQ0seB5wJvqbCep5kzY8KTA8M5HBQRddbp00ePBM6xvQfwduCbkp5Rk6S5kpZKWrpmzZq2FxkR0c2qDII+YGLD9B5lW6O/AS4GsH0dsAMwfuCCbJ9he6btmb29vRWVGxFRT1UGwRJgqqQpkrajGAxeOKDP3cBfAEh6KUUQ5CN/REQbVRYEtjcAxwCLgdspzg5aIekkSYeW3T4FfFjSzcAFwAc82u6CFxExylV6HUF5TcCiAW0nNLy+DXh9lTVERERrnR4sjoiIDksQRETUXIIgIqLmEgQRETWXIIiIqLkEQUREzSUIIiJqLkEQEVFzCYKIiJpLEERE1FyCICKi5hIEERE1lyCIiKi5BEFERM0lCCIiai5BEBFRcwmCiIiaSxBERNRcgiAiouYSBBERNZcgiIiouQRBRETNJQgiImouQRARUXMJgoiImksQRETUXKVBIOlgSSslrZJ0XJP5X5Z0U/l1h6S1VdYTERHPtG1VC5bUA8wH3gqsBpZIWmj7tv4+tj/Z0P/jwIyq6omIiOaq3CPYF1hl+07b64ELgdkt+h8JXFBhPRER0USVQTABuKdhenXZ9gyS9gSmAFcOMn+upKWSlq5Zs2bYC42IqLORMlh8BHCp7Y3NZto+w/ZM2zN7e3vbXFpERHerMgj6gIkN03uUbc0cQQ4LRUR0RJVBsASYKmmKpO0o/tgvHNhJ0kuA5wHXVVhLREQMorIgsL0BOAZYDNwOXGx7haSTJB3a0PUI4ELbrqqWiIgYXGWnjwLYXgQsGtB2woDpE6usodGCZX2cungl965dx+7jxrLDmG0Yv+P27Vp9RMSIVGkQjCQLlvVx/GXLWfdEMR7dt3Yd26jDRUVEjAAj5ayhyp26eOWTIdBvk+GeB9d1qKKIiJGhNkFw79rmf/DXb9zU5koiIkaW2gTB7uPGNm2fMEh7RERd1CYI5s2axtgxPU9rGzumh3mzpnWoooiIkaE2g8VzZhR3t/j0pbewfuMmJowby7xZ055sj4ioq9oEARRhcMENdwNw0Uf273A1EREjQ20ODUVERHMJgoiImksQRETUXIIgIqLmEgQRETWXIIiIqLkEQUREzSUIIiJqLkEQEVFzLYNA0s6S/rxJ+yuqKykiItpp0CCQdDjwS+BbklZIek3D7HOqLiwiItqj1R7BZ4FX294HOBr4pqR3l/PybK+IiC7R6qZzPbbvA7B9g6QDge9KmgjkQfMREV2i1R7Bo43jA2UoHADMBvauuK6IiGiTVnsEf8uAQ0C2H5V0MHB4pVVFRETbDLpHYPtm4NeSrhrQ/oTt8yqvLCIi2qLl6aO2NwKbJO3SpnoiIqLNhvKEsj8AyyVdATzW32j77yurKiIi2mYoQXBZ+bXFyvGE04Ee4CzbpzTpczhwIsWZSDfbPmpr1hUREVtns0Fg+9ytWbCkHmA+8FZgNbBE0kLbtzX0mQocD7ze9kOSXrA164qIiK1X5b2G9gVW2b7T9nrgQopTTxt9GJhv+yEA2/dXWE9ERDRRZRBMAO5pmF5dtjV6MfBiSdd
I+nl5KOkZJM2VtFTS0jVr1lRUbkREPXX67qPbAlMpLlQ7EjhT0riBnWyfYXum7Zm9vb3trTAiosttdoxA0ouBecCejf1tH7SZt/YBExum9yjbGq0Grrf9BMU1C3dQBMOSzZceERHDYShnDV0CfB04E9i4BcteAkyVNIUiAI4ABp4RtIBiT+A/JI2nOFR05xasIyIinqWhBMEG21/b0gXb3iDpGGAxxemjZ9teIekkYKntheW8t0m6jSJk5tn+/ZauKyIitt5QguA7kv4OuBx4vL/R9oObe6PtRcCiAW0nNLw2cGz5FRERHTCUIHh/+X1eQ5uBFw1/ORER0W5DuaBsSjsKiYiIzhjKWUNjKG5J/aay6cfA/ynP9ImIiFFuKIeGvgaMAb5aTv912fahqoqKiIj2GUoQvMb2Kxumr5R0c1UFRUREew3lyuKNjY+slPQitux6goiIGMGGskcwD7hK0p0Uj67cEzi60qoiIqJthnLW0I/K20VPK5tW2n681XsiImL0GDQIJB1k+0pJ7xkway9J2N6qh9VERMTI0mqP4M3AlcC7mswzW/nUsoiIGFkGDQLbXyhfnmT7143zyhvJRUREFxjKWUPfatJ26XAXEhERndFqjOAlwN7ALgPGCXYGdqi6sIiIaI9WYwTTgHcC43j6OMGjFM8ajoiILtBqjODbwLcl7W/7ujbWFBERbTSUC8qWSfoYxWGiJw8J2f5gZVVFRETbDGWw+JvAnwGzgKspnj38aJVFRURE+wwlCPay/XngMdvnAu8A9qu2rIiIaJehBEH/cwfWSnoZsAvwgupKioiIdhrKGMEZkp4HfB5YCOwInND6LRERMVoM5aZzZ5UvrybPKY6I6DqtLig7ttUbbZ82/OVERES7tdoj2Kn8Pg14DcVhISguLruhyqIiIqJ9Wl1Q9s8Akn4CvMr2o+X0icD32lJdRERUbihnDb0QWN8wvb5si4iILjCUIPgGcIOkE8u9geuBc4aycEkHS1opaZWk45rM/4CkNZJuKr8+tCXFR0TEszeUs4ZOlvR94I1l09G2l23ufZJ6gPnAW4HVwBJJC23fNqDrRbaP2cK6IyJimLQ6a2hn249I2hW4q/zqn7er7Qc3s+x9gVW27yzfcyEwGxgYBBER0UGt9gjOp7gN9Y0Uj6bsp3J6c9cUTADuaZheTfNbU7xX0puAO4BP2r5nYAdJc4G5AJMmTdrMaiMiYksMOkZg+53l9ym2X9TwNcX2cF1Y9h1gsu1XAFcA5w5Syxm2Z9qe2dvbO0yrjogIaH1o6FWt3mj7F5tZdh8wsWF6j7KtcRm/b5g8C/jXzSwzIiKGWatDQ//eYp6Bgzaz7CXA1PJB933AEcBRjR0k7Wb7vnLyUOD2zSwzIiKGWasLyg58Ngu2vUHSMcBioAc42/YKSScBS20vBP5e0qHABuBB4APPZp0REbHlhnL3UcrbT0/n6U8o+8bm3md7EbBoQNsJDa+PB44farERETH8NhsEkr4AHEARBIuAQ4CfUVxoFhERo9xQriw+DPgL4Le2jwZeSfFwmoiI6AJDCYJ1tjcBGyTtDNzP088GioiIUWwoYwRLJY0DzqS4uOwPwHVVFhUREe3T6jqC+cD5tv+ubPq6pB8AO9u+pS3VRURE5VrtEdwB/Juk3YCLgQuGcrO5iIgYXVrdYuJ02/sDbwZ+D5wt6ZeSviDpxW2rMCIiKrXZwWLbv7H9JdszgCOBOeQK4IiIrrHZIJC0raR3SToP+D6wEnhP5ZVFRERbtBosfivFHsDbKR5WfyEw1/ZjbaotIiLaoNVg8fEUzyT4lO2H2lRPRES0Waubzm3u7qIREdEFhnJlcUREdLEEQUREzSUIIiJqLkEQEVFzCYKIiJpLEERE1FyCICKi5hIEERE1lyCIiKi5BEFERM0lCCIiai5BEBFRcwmCiIiaqzQIJB0saaWkVZKOa9HvvZIsaWaV9URExDNVFgSSeoD5wCHAdOBISdOb9NsJ+AR
wfVW1RETE4KrcI9gXWGX7TtvrKZ5wNrtJvy8CXwL+VGEtERExiCqDYAJwT8P06rLtSZJeBUy0/b0K64iIiBY6NlgsaRvgNOBTQ+g7V9JSSUvXrFlTfXERETVSZRD0ARMbpvco2/rtBLwM+LGku4DXAgubDRjbPsP2TNsze3t7Kyw5IqJ+qgyCJcBUSVMkbQccASzsn2n7YdvjbU+2PRn4OXCo7aUV1hQREQNUFgS2NwDHAIuB24GLba+QdJKkQ6tab0REbJltq1y47UXAogFtJwzS94Aqa4mIiOZyZXFERM0lCCIiai5BEBFRcwmCiIiaSxBERNRcgiAiouYSBBERNZcgiIiouQRBRETNJQgiImouQRARUXMJgoiImksQRETUXIIgIqLmEgQRETWXIIiIqLkEQUREzSUIIiJqLkEQEVFzCYKIiJpLEERE1FyCICKi5hIEERE1lyCIiKi5BEFERM0lCCIiaq7SIJB0sKSVklZJOq7J/I9KWi7pJkk/kzS9ynoiIuKZKgsCST3AfOAQYDpwZJM/9OfbfrntfYB/BU6rqp6IiGiuyj2CfYFVtu+0vR64EJjd2MH2Iw2TzwVcYT0REdHEthUuewJwT8P0amC/gZ0kfQw4FtgOOKjZgiTNBeYCTJo0adgLjYios44PFtueb/vPgc8Anxukzxm2Z9qe2dvb294CIyK6XJVB0AdMbJjeo2wbzIXAnArriYiIJqoMgiXAVElTJG0HHAEsbOwgaWrD5DuA/6qwnoiIaKKyMQLbGyQdAywGeoCzba+QdBKw1PZC4BhJbwGeAB4C3l9VPRER0VyVg8XYXgQsGtB2QsPrT1S5/oiI2LyODxZHRERnJQgiImouQRARUXMJgoiImqt0sHikWLCsj1MXr+TetesY07MNE3cd2+mSIiJGjK7fI1iwrI/jL1tO39p1GFi/cRO/fuAxFixrdW1bRER9dH0QnLp4Jeue2Pi0tk0u2iMiogZBcO/adVvUHhFRN10fBLuPaz4eMFh7RETddH0QzJs1jbFjep7WNnZMD/NmTetQRRERI0vXnzU0Z8YEgCfPGtp93FjmzZr2ZHtERN11fRBAEQb5wx8R0VzXHxqKiIjWEgQRETWXIIiIqLkEQUREzSUIIiJqTrY7XcMWkbQG+M0Wvm088EAF5YxUddreOm0rZHu7WdXbuqft3mYzRl0QbA1JS23P7HQd7VKn7a3TtkK2t5t1cltzaCgiouYSBBERNVeXIDij0wW0WZ22t07bCtnebtaxba3FGEFERAyuLnsEERExiARBRETNdX0QSDpY0kpJqyQd1+l6hpuksyXdL+nWhrZdJV0h6b/K78/rZI3DRdJESVdJuk3SCkmfKNu7dXt3kHSDpJvL7f3nsn2KpOvLn+mLJG3X6VqHi6QeScskfbec7uZtvUvSckk3SVpatnXkZ7mrg0BSDzAfOASYDhwpaXpnqxp25wAHD2g7DviR7anAj8rpbrAB+JTt6cBrgY+V/5/dur2PAwfZfiWwD3CwpNcCXwK+bHsv4CHgbzpX4rD7BHB7w3Q3byvAgbb3abh+oCM/y10dBMC+wCrbd9peD1wIzO5wTcPK9k+ABwc0zwbOLV+fC8xpZ01VsX2f7V+Urx+l+IMxge7dXtv+Qzk5pvwycBBwadneNdsraQ/gHcBZ5bTo0m1toSM/y90eBBOAexqmV5dt3e6Ftu8rX/8WeGEni6mCpMnADOB6unh7y0MlNwH3A1cAvwLW2t5Qdummn+n/BXwa2FROP5/u3VYoQv2Hkm6UNLds68jPci2eUFZnti2pq84RlrQj8C3gH2w/UnxwLHTb9treCOwjaRxwOfCSzlZUDUnvBO63faOkAzpcTru8wXafpBcAV0j6ZePMdv4sd/seQR8wsWF6j7Kt2/1O0m4A5ff7O1zPsJE0hiIEzrN9Wdnctdvbz/Za4Cpgf2CcpP4Pcd3yM/164FBJd1Ecwj0IOJ3u3FYAbPeV3++nCPl96dD
PcrcHwRJgannmwXbAEcDCDtfUDguB95ev3w98u4O1DJvymPH/BW63fVrDrG7d3t5yTwBJY4G3UoyLXAUcVnbriu21fbztPWxPpvg9vdL2X9GF2wog6bmSdup/DbwNuJUO/Sx3/ZXFkt5OceyxBzjb9smdrWh4SboAOIDiFra/A74ALAAuBiZR3LL7cNsDB5RHHUlvAH4KLOep48ifpRgn6MbtfQXFgGEPxYe2i22fJOlFFJ+adwWWAf/N9uOdq3R4lYeG/tH2O7t1W8vturyc3BY43/bJkp5PB36Wuz4IIiKitW4/NBQREZuRIIiIqLkEQUREzSUIIiJqLkEQEVFzCYIYUSR9WdI/NEwvlnRWw/S/Szq2xfvPkXRY+frHkp7xMHBJYySdUt7h8ReSrpN0SDnvLknjt6LuJ9c7yPz55V0mb5O0rnx9k6TDJC3qv15gOEnarf8unoPM307STxou2IqaShDESHMN8DoASdtQXB+xd8P81wHXPst1fBHYDXiZ7VdR3Nhrp2e5zJZsf8z2PsDbgV+Vd5zcx/altt9eXjk83I4FzmxR03qKO1z+ZQXrjlEkQRAjzbUUt1GAIgBuBR6V9DxJ2wMvBX4h6QRJSyTdKukMNd5wqAVJzwE+DHy8/8Ik27+zfXGTvseWy791wF7Kf5d0i4rnBHyzyfu+WO4h9AyxprskjZc0WdIvy/feIek8SW+RdE2597Jv2f+5Kp5DcYOKe/cPdkfd9wI/KN+zd9n/prL2qWWfBcBfDaXO6F7ZJYwRxfa9kjZImkTx6f86ijtO7g88DCy3vV7S/7Z9EkD5x/idwHeGsIq9gLttP9Kqk6RXA0cD+wECrpd0NbAe+BzwOtsPSNp1wPtOpdi7ONpbd7XmXsD7gA9S3CLlKOANwKEUV1HPAf6J4hYMHywPKd0g6T9tP9ZQxxTgoYarcD8KnG77vPJ2K/0hdSvwmq2oM7pI9ghiJLqWIgT6g+C6hulryj4Hqnhy1XKKG5Tt3WxBz8IbgMttP1Y+E+Ay4I3lui6x/QDAgMv/Pw/sYvujWxkCAL+2vdz2JmAFxUNKTHFbjclln7cBx6m4PfWPgR0obknQaDdgTcP0dcBnJX0G2NP2urL+jcD6/vveRD0lCGIk6h8neDnFJ9afU+wRvA64VtIOwFeBw2y/nOI4+A5DXPYqYJKknYe96uIT/KsH7iVsocb76GxqmN7EU3vwAt7bMM4wyXbjU70A1tHwb2L7fIq9inXAIkkHNfTdHvjTs6g5RrkEQYxE11Ic6nnQ9sbyU/c4ijC4lqf+wD2g4tkEg56tM5DtP1LcwfT08hBJ/10+3zeg60+BOZKeU94d8t1l25XA+8qbgzHgj/4PgFOA71X8CXsx8PH+cRFJM5r0uYOn9iD6b3J2p+2vUNzR8hVl+/OBB2w/UWG9McIlCGIkWk5xttDPB7Q9bPuB8gybMyn2FhZTfBLfEp+jOGxym6Rbge8CTxszKB+JeQ5wA8XdTc+yvcz2CuBk4GpJNwOnDXjfJWVtC1XcOroKX6R4bOUtklaU009Tjhf8StJeZdPhwK3l4aSXAd8o2w8EvldRnTFK5O6jEV1K0ruBV9v+XIs+lwHH2b6jfZXFSJOzhiK6lO3L+w9hNVMeGluQEIjsEURE1FzGCCIiai5BEBFRcwmCiIiaSxBERNRcgiAioub+P3xx7QjxT3ySAAAAAElFTkSuQmCC", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import numpy as np\n", "\n", "plt.title('Learning Curve')\n", "plt.xlabel('Wall Clock Time (s)')\n", "plt.ylabel('Validation r2')\n", "plt.scatter(time_history, 1 - np.array(valid_loss_history))\n", "plt.step(time_history, 1 - np.array(best_valid_loss_history), where='post')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Comparison with alternatives\n", "\n", "### FLAML's accuracy" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "flaml (4min) r2 = 0.8522136092023422\n" ] } ], "source": [ "print('flaml (4min) r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Default LightGBM" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "from lightgbm import LGBMRegressor\n", "lgbm = LGBMRegressor()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
LGBMRegressor()
" ], "text/plain": [ "LGBMRegressor()" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lgbm.fit(X_train, y_train)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "default lgbm r2 = 0.8296179648694404\n" ] } ], "source": [ "y_pred = lgbm.predict(X_test)\n", "from flaml.ml import sklearn_metric_loss_score\n", "print('default lgbm r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optuna LightGBM Tuner" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "# uncomment the following line if optuna is not installed\n", "# %pip install optuna==2.8.0" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "train_x, val_x, train_y, val_y = train_test_split(X_train, y_train, test_size=0.1)\n", "import optuna.integration.lightgbm as lgb\n", "dtrain = lgb.Dataset(train_x, label=train_y)\n", "dval = lgb.Dataset(val_x, label=val_y)\n", "params = {\n", " \"objective\": \"regression\",\n", " \"metric\": \"regression\",\n", " \"verbosity\": -1,\n", "}" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "tags": [ "outputPrepend" ] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\u001b[32m[I 2022-07-01 15:26:25,531]\u001b[0m A new study created in memory with name: no-name-0bd516fd-ed41-4e00-874e-ff99ff30eb94\u001b[0m\n", "feature_fraction, val_score: inf: 0%| | 0/7 [00:00 0] = 1.\n", " grad_mae[grad_mae <= 0] = -1.\n", " hess_mae = 1.0\n", "\n", " coef = [0.4, 0.3, 0.3]\n", " return coef[0] * grad + coef[1] * grad_rmse + coef[2] * grad_mae, \\\n", " coef[0] * hess + coef[1] * hess_rmse + coef[2] * hess_mae\n", "\n", "\n", "from flaml.model import LGBMEstimator\n", "\n", "''' create a 
customized LightGBM learner class with your objective function '''\n", "class MyLGBM(LGBMEstimator):\n", " '''LGBMEstimator with my_loss_obj as the objective function\n", " '''\n", "\n", " def __init__(self, **config):\n", " super().__init__(objective=my_loss_obj, **config)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add the customized learner in FLAML" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "[flaml.automl: 07-01 15:33:17] {2427} INFO - task = regression\n", "[flaml.automl: 07-01 15:33:17] {2429} INFO - Data split method: uniform\n", "[flaml.automl: 07-01 15:33:17] {2432} INFO - Evaluation method: cv\n", "[flaml.automl: 07-01 15:33:17] {2501} INFO - Minimizing error metric: 1-r2\n", "[flaml.automl: 07-01 15:33:17] {2641} INFO - List of ML learners in AutoML Run: ['my_lgbm']\n", "[flaml.automl: 07-01 15:33:17] {2933} INFO - iteration 0, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:17] {3061} INFO - Estimated sufficient time budget=1586s. 
Estimated necessary time budget=2s.\n", "[flaml.automl: 07-01 15:33:17] {3108} INFO - at 0.2s,\testimator my_lgbm's best error=2.9883,\tbest estimator my_lgbm's best error=2.9883\n", "[flaml.automl: 07-01 15:33:17] {2933} INFO - iteration 1, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:18] {3108} INFO - at 0.4s,\testimator my_lgbm's best error=2.9883,\tbest estimator my_lgbm's best error=2.9883\n", "[flaml.automl: 07-01 15:33:18] {2933} INFO - iteration 2, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:18] {3108} INFO - at 0.6s,\testimator my_lgbm's best error=1.7086,\tbest estimator my_lgbm's best error=1.7086\n", "[flaml.automl: 07-01 15:33:18] {2933} INFO - iteration 3, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:18] {3108} INFO - at 0.8s,\testimator my_lgbm's best error=0.3474,\tbest estimator my_lgbm's best error=0.3474\n", "[flaml.automl: 07-01 15:33:18] {2933} INFO - iteration 4, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:18] {3108} INFO - at 1.0s,\testimator my_lgbm's best error=0.3474,\tbest estimator my_lgbm's best error=0.3474\n", "[flaml.automl: 07-01 15:33:18] {2933} INFO - iteration 5, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:18] {3108} INFO - at 1.2s,\testimator my_lgbm's best error=0.3015,\tbest estimator my_lgbm's best error=0.3015\n", "[flaml.automl: 07-01 15:33:18] {2933} INFO - iteration 6, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:19] {3108} INFO - at 1.4s,\testimator my_lgbm's best error=0.3015,\tbest estimator my_lgbm's best error=0.3015\n", "[flaml.automl: 07-01 15:33:19] {2933} INFO - iteration 7, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:19] {3108} INFO - at 1.6s,\testimator my_lgbm's best error=0.3015,\tbest estimator my_lgbm's best error=0.3015\n", "[flaml.automl: 07-01 15:33:19] {2933} INFO - iteration 8, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:19] {3108} INFO - at 1.9s,\testimator my_lgbm's best error=0.2721,\tbest estimator 
my_lgbm's best error=0.2721\n", "[flaml.automl: 07-01 15:33:19] {2933} INFO - iteration 9, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:19] {3108} INFO - at 2.2s,\testimator my_lgbm's best error=0.2721,\tbest estimator my_lgbm's best error=0.2721\n", "[flaml.automl: 07-01 15:33:19] {2933} INFO - iteration 10, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:21] {3108} INFO - at 3.5s,\testimator my_lgbm's best error=0.1833,\tbest estimator my_lgbm's best error=0.1833\n", "[flaml.automl: 07-01 15:33:21] {2933} INFO - iteration 11, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:23] {3108} INFO - at 5.2s,\testimator my_lgbm's best error=0.1833,\tbest estimator my_lgbm's best error=0.1833\n", "[flaml.automl: 07-01 15:33:23] {2933} INFO - iteration 12, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:24] {3108} INFO - at 6.3s,\testimator my_lgbm's best error=0.1833,\tbest estimator my_lgbm's best error=0.1833\n", "[flaml.automl: 07-01 15:33:24] {2933} INFO - iteration 13, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:25] {3108} INFO - at 7.8s,\testimator my_lgbm's best error=0.1833,\tbest estimator my_lgbm's best error=0.1833\n", "[flaml.automl: 07-01 15:33:25] {2933} INFO - iteration 14, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:27] {3108} INFO - at 9.2s,\testimator my_lgbm's best error=0.1833,\tbest estimator my_lgbm's best error=0.1833\n", "[flaml.automl: 07-01 15:33:27] {2933} INFO - iteration 15, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:28] {3108} INFO - at 11.0s,\testimator my_lgbm's best error=0.1762,\tbest estimator my_lgbm's best error=0.1762\n", "[flaml.automl: 07-01 15:33:28] {2933} INFO - iteration 16, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:30] {3108} INFO - at 12.3s,\testimator my_lgbm's best error=0.1762,\tbest estimator my_lgbm's best error=0.1762\n", "[flaml.automl: 07-01 15:33:30] {2933} INFO - iteration 17, current learner my_lgbm\n", "[flaml.automl: 07-01 
15:33:36] {3108} INFO - at 19.0s,\testimator my_lgbm's best error=0.1760,\tbest estimator my_lgbm's best error=0.1760\n", "[flaml.automl: 07-01 15:33:36] {2933} INFO - iteration 18, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:38] {3108} INFO - at 20.8s,\testimator my_lgbm's best error=0.1760,\tbest estimator my_lgbm's best error=0.1760\n", "[flaml.automl: 07-01 15:33:38] {2933} INFO - iteration 19, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:40] {3108} INFO - at 23.0s,\testimator my_lgbm's best error=0.1760,\tbest estimator my_lgbm's best error=0.1760\n", "[flaml.automl: 07-01 15:33:40] {2933} INFO - iteration 20, current learner my_lgbm\n", "[flaml.automl: 07-01 15:33:54] {3108} INFO - at 36.6s,\testimator my_lgbm's best error=0.1760,\tbest estimator my_lgbm's best error=0.1760\n", "[flaml.automl: 07-01 15:33:54] {2933} INFO - iteration 21, current learner my_lgbm\n", "[flaml.automl: 07-01 15:34:00] {3108} INFO - at 43.2s,\testimator my_lgbm's best error=0.1760,\tbest estimator my_lgbm's best error=0.1760\n", "[flaml.automl: 07-01 15:34:00] {2933} INFO - iteration 22, current learner my_lgbm\n", "[flaml.automl: 07-01 15:34:04] {3108} INFO - at 47.1s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n", "[flaml.automl: 07-01 15:34:04] {2933} INFO - iteration 23, current learner my_lgbm\n", "[flaml.automl: 07-01 15:34:08] {3108} INFO - at 50.6s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n", "[flaml.automl: 07-01 15:34:08] {2933} INFO - iteration 24, current learner my_lgbm\n", "[flaml.automl: 07-01 15:34:15] {3108} INFO - at 57.5s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n", "[flaml.automl: 07-01 15:34:15] {2933} INFO - iteration 25, current learner my_lgbm\n", "[flaml.automl: 07-01 15:34:33] {3108} INFO - at 76.2s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n", "[flaml.automl: 07-01 
15:34:33] {2933} INFO - iteration 26, current learner my_lgbm\n", "[flaml.automl: 07-01 15:34:35] {3108} INFO - at 77.6s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n", "[flaml.automl: 07-01 15:34:35] {2933} INFO - iteration 27, current learner my_lgbm\n", "[flaml.automl: 07-01 15:34:45] {3108} INFO - at 87.9s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n", "[flaml.automl: 07-01 15:34:45] {2933} INFO - iteration 28, current learner my_lgbm\n", "[flaml.automl: 07-01 15:34:47] {3108} INFO - at 89.7s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n", "[flaml.automl: 07-01 15:34:47] {2933} INFO - iteration 29, current learner my_lgbm\n", "[flaml.automl: 07-01 15:34:48] {3108} INFO - at 90.6s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n", "[flaml.automl: 07-01 15:34:48] {2933} INFO - iteration 30, current learner my_lgbm\n", "[flaml.automl: 07-01 15:35:16] {3108} INFO - at 118.7s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n", "[flaml.automl: 07-01 15:35:16] {2933} INFO - iteration 31, current learner my_lgbm\n", "[flaml.automl: 07-01 15:35:19] {3108} INFO - at 121.6s,\testimator my_lgbm's best error=0.1706,\tbest estimator my_lgbm's best error=0.1706\n", "[flaml.automl: 07-01 15:35:19] {2933} INFO - iteration 32, current learner my_lgbm\n", "[flaml.automl: 07-01 15:35:26] {3108} INFO - at 128.9s,\testimator my_lgbm's best error=0.1632,\tbest estimator my_lgbm's best error=0.1632\n", "[flaml.automl: 07-01 15:35:26] {2933} INFO - iteration 33, current learner my_lgbm\n", "[flaml.automl: 07-01 15:35:33] {3108} INFO - at 135.2s,\testimator my_lgbm's best error=0.1632,\tbest estimator my_lgbm's best error=0.1632\n", "[flaml.automl: 07-01 15:35:33] {2933} INFO - iteration 34, current learner my_lgbm\n", "[flaml.automl: 07-01 15:35:37] {3108} INFO - at 139.6s,\testimator 
my_lgbm's best error=0.1632,\tbest estimator my_lgbm's best error=0.1632\n", "[flaml.automl: 07-01 15:35:37] {2933} INFO - iteration 35, current learner my_lgbm\n", "[flaml.automl: 07-01 15:35:49] {3108} INFO - at 151.6s,\testimator my_lgbm's best error=0.1632,\tbest estimator my_lgbm's best error=0.1632\n", "[flaml.automl: 07-01 15:35:50] {3372} INFO - retrain my_lgbm for 1.5s\n", "[flaml.automl: 07-01 15:35:50] {3379} INFO - retrained model: LGBMRegressor(colsample_bytree=0.8422311526890249,\n", " learning_rate=0.4130805075333333, max_bin=1023,\n", " min_child_samples=10, n_estimators=95, num_leaves=221,\n", " objective=<function my_loss_obj at 0x...>,\n", " reg_alpha=0.007704104902643932, reg_lambda=0.0031517673595496476,\n", " verbose=-1)\n", "[flaml.automl: 07-01 15:35:50] {2672} INFO - fit succeeded\n", "[flaml.automl: 07-01 15:35:50] {2673} INFO - Time taken to find the best model: 128.89934134483337\n", "[flaml.automl: 07-01 15:35:50] {2684} WARNING - Time taken to find the best model is 86% of the provided time budget and not all estimators' hyperparameter search converged. 
Consider increasing the time budget.\n" ] } ], "source": [ "automl = AutoML()\n", "automl.add_learner(learner_name='my_lgbm', learner_class=MyLGBM)\n", "settings = {\n", " \"time_budget\": 150, # total running time in seconds\n", " \"metric\": 'r2', # primary metrics for regression can be chosen from: ['mae','mse','r2']\n", " \"estimator_list\": ['my_lgbm',], # list of ML learners; we tune lightgbm in this example\n", " \"task\": 'regression', # task type \n", " \"log_file_name\": 'houses_experiment_my_lgbm.log', # flaml log file\n", "}\n", "automl.fit(X_train=X_train, y_train=y_train, **settings)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best hyperparameter config: {'n_estimators': 95, 'num_leaves': 221, 'min_child_samples': 10, 'learning_rate': 0.4130805075333333, 'log_max_bin': 10, 'colsample_bytree': 0.8422311526890249, 'reg_alpha': 0.007704104902643932, 'reg_lambda': 0.0031517673595496476}\n", "Best r2 on validation data: 0.8368\n", "Training duration of best run: 1.508 s\n", "Predicted labels [161485.59767093 248585.87889042 157837.93378106 ... 184356.07034452\n", " 223247.80995858 259281.61167122]\n", "True labels 14740 136900.0\n", "10101 241300.0\n", "20566 200700.0\n", "2670 72500.0\n", "15709 460000.0\n", " ... 
\n", "13132 121200.0\n", "8228 137500.0\n", "3948 160900.0\n", "8522 227300.0\n", "16798 265600.0\n", "Name: median_house_value, Length: 5160, dtype: float64\n", "r2 = 0.842983315140684\n", "mse = 2075526075.9236298\n", "mae = 30102.91056064235\n" ] } ], "source": [ "print('Best hyperparameter config:', automl.best_config)\n", "print('Best r2 on validation data: {0:.4g}'.format(1-automl.best_loss))\n", "print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))\n", "\n", "y_pred = automl.predict(X_test)\n", "print('Predicted labels', y_pred)\n", "print('True labels', y_test)\n", "\n", "from flaml.ml import sklearn_metric_loss_score\n", "print('r2', '=', 1 - sklearn_metric_loss_score('r2', y_pred, y_test))\n", "print('mse', '=', sklearn_metric_loss_score('mse', y_pred, y_test))\n", "print('mae', '=', sklearn_metric_loss_score('mae', y_pred, y_test))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.8.13 ('syml-py38')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" }, "vscode": { "interpreter": { "hash": "e3d9487e2ef008ade0db1bc293d3206d35cb2b6081faff9f66b40b257b7398f7" } } }, "nbformat": 4, "nbformat_minor": 2 }
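Throughout the notebook, score-type metrics such as r2 are printed as `1 - sklearn_metric_loss_score('r2', ...)`, because FLAML works internally with losses to be minimized. A minimal pure-NumPy sketch of that convention follows; the helper names here are illustrative stand-ins, not FLAML's actual API:

```python
import numpy as np

def r2_score_np(y_true, y_pred):
    # Coefficient of determination, computed by hand.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def r2_loss(y_true, y_pred):
    # Loss form of a score metric, as FLAML reports it: loss = 1 - score.
    return 1.0 - r2_score_np(y_true, y_pred)

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
loss = r2_loss(y_true, y_pred)
print(1 - loss)  # recovers the r2 score, ~0.98 here
```

Printing `1 - loss` is therefore exactly the r2 score, which is why higher reported values are better even though FLAML minimized the underlying quantity.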