autogen/notebook/automl_in_sklearn_pipeline.ipynb

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) 2021. All rights reserved.\n",
"\n",
"Contributed by: @bnriiitb\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Using AutoML in Sklearn Pipeline\n",
"\n",
"This tutorial will help you understand how FLAML's AutoML can be used as a transformer in the Sklearn pipeline."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## 1.Introduction\n",
"\n",
"### 1.1 FLAML - Fast and Lightweight AutoML\n",
"\n",
"FLAML is a Python library (https://github.com/microsoft/FLAML) designed to automatically produce accurate machine learning models with low computational cost. It is fast and cheap. The simple and lightweight design makes it easy to use and extend, such as adding new learners. \n",
"\n",
"FLAML can \n",
"- serve as an economical AutoML engine,\n",
"- be used as a fast hyperparameter tuning tool, or \n",
"- be embedded in self-tuning software that requires low latency & resource in repetitive\n",
" tuning tasks.\n",
"\n",
"In this notebook, we use one real data example (binary classification) to showcase how to use FLAML library.\n",
"\n",
"FLAML requires `Python>=3.6`. To run this notebook example, please install flaml with the `notebook` option:\n",
"```bash\n",
"pip install flaml[notebook]\n",
"```"
]
},
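{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick orientation, here is a minimal sketch of using `flaml.AutoML` on its own, before embedding it in a pipeline (it assumes `X_train`/`y_train` are already loaded; the 60-second budget is only illustrative):\n",
"```python\n",
"from flaml import AutoML\n",
"\n",
"automl = AutoML()\n",
"# Search for a good model and hyperparameters within a fixed time budget\n",
"automl.fit(X_train, y_train, task='classification', time_budget=60)\n",
"y_pred = automl.predict(X_train)\n",
"```"
]
},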
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1.2 Why are pipelines a silver bullet?\n",
"\n",
"In a typical machine learning workflow we have to apply all the transformations at least twice. \n",
"1. During Training\n",
"2. During Inference\n",
"\n",
"Scikit-learn pipelines provide an easy to use inteface to automate ML workflows by allowing several transformers to be chained together. \n",
"\n",
"The key benefits of using pipelines:\n",
"* Make ML workflows highly readable, enabling fast development and easy review\n",
"* Help to build sequential and parallel processes\n",
"* Allow hyperparameter tuning across the estimators\n",
"* Easier to share and collaborate with multiple users (bug fixes, enhancements etc)\n",
"* Enforce the implementation and order of steps"
]
},
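{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is a small sketch of a plain Scikit-learn pipeline illustrating this reuse: the same imputation and scaling are applied automatically at both training and inference time. The `LogisticRegression` estimator here is only a placeholder for illustration.\n",
"```python\n",
"from sklearn.pipeline import Pipeline\n",
"from sklearn.impute import SimpleImputer\n",
"from sklearn.preprocessing import StandardScaler\n",
"from sklearn.linear_model import LogisticRegression\n",
"\n",
"pipe = Pipeline([\n",
"    ('imputer', SimpleImputer()),\n",
"    ('scaler', StandardScaler()),\n",
"    ('clf', LogisticRegression())\n",
"])\n",
"\n",
"pipe.fit(X_train, y_train)     # transforms are fitted and applied here\n",
"y_pred = pipe.predict(X_test)  # and re-applied here, with no extra code\n",
"```"
]
},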
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### As FLAML's AutoML module can be used a transformer in the Sklearn's pipeline we can get all the benefits of pipeline and thereby write extremley clean, and resuable code."
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [],
"source": [
"!pip install flaml[notebook];"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Classification Example\n",
"### Load data and preprocess\n",
"\n",
"Download [Airlines dataset](https://www.openml.org/d/1169) from OpenML. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure."
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"load dataset from ./openml_ds1169.pkl\n",
"Dataset name: airlines\n",
"X_train.shape: (404537, 7), y_train.shape: (404537,);\n",
"X_test.shape: (134846, 7), y_test.shape: (134846,)\n"
]
}
],
"source": [
"from flaml.data import load_openml_dataset\n",
"X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=1169, data_dir='./',random_state=1234)"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 12., 2648., 4., 15., 4., 450., 67.], dtype=float32)"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X_train[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Create a Pipeline"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-49480d8e-339b-4d4d-875e-ae6c477072da {color: black;background-color: white;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da pre{padding: 0;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-toggleable {background-color: white;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.2em 0.3em;box-sizing: border-box;text-align: center;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-estimator {font-family: monospace;background-color: #f0f8ff;margin: 0.25em 0.25em;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-estimator:hover {background-color: #d4ebff;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-label:hover label.sk-toggleable__label {background-color: #d4ebff;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 2em;bottom: 0;left: 50%;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-item {z-index: 1;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-parallel-item {display: flex;flex-direction: column;position: relative;background-color: white;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-parallel-item:only-child::after {width: 0;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0.2em;box-sizing: border-box;padding-bottom: 0.1em;background-color: white;position: relative;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-label label {font-family: monospace;font-weight: bold;background-color: white;display: inline-block;line-height: 1.2em;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-label-container {position: relative;z-index: 2;text-align: center;}#sk-49480d8e-339b-4d4d-875e-ae6c477072da div.sk-container {display: inline-block;position: relative;}</style><div 
id=\"sk-49480d8e-339b-4d4d-875e-ae6c477072da\" class\"sk-top-container\"><div class=\"sk-container\"><div class=\"sk-item sk-dashed-wrapped\"><div class=\"sk-label-container\"><div class=\"sk-label sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"267f8e89-5175-46ca-99d6-b361bb057bba\" type=\"checkbox\" ><label class=\"sk-toggleable__label\" for=\"267f8e89-5175-46ca-99d6-b361bb057bba\">Pipeline</label><div class=\"sk-toggleable__content\"><pre>Pipeline(steps=[('imputuer', SimpleImputer()),\n",
" ('standardizer', StandardScaler()),\n",
" ('automl', <flaml.automl.AutoML object at 0x7fb8c888a2b0>)])</pre></div></div></div><div class=\"sk-serial\"><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"8d157b50-f7f1-401d-a315-9c8445b32cf1\" type=\"checkbox\" ><label class=\"sk-toggleable__label\" for=\"8d157b50-f7f1-401d-a315-9c8445b32cf1\">SimpleImputer</label><div class=\"sk-toggleable__content\"><pre>SimpleImputer()</pre></div></div></div><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"e8122e5d-9cf5-4551-b274-df987e847f0f\" type=\"checkbox\" ><label class=\"sk-toggleable__label\" for=\"e8122e5d-9cf5-4551-b274-df987e847f0f\">StandardScaler</label><div class=\"sk-toggleable__content\"><pre>StandardScaler()</pre></div></div></div><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"07dd5e6f-4f85-4d7a-ba5c-f2a745da470a\" type=\"checkbox\" ><label class=\"sk-toggleable__label\" for=\"07dd5e6f-4f85-4d7a-ba5c-f2a745da470a\">AutoML</label><div class=\"sk-toggleable__content\"><pre><flaml.automl.AutoML object at 0x7fb8c888a2b0></pre></div></div></div></div></div></div></div>"
],
"text/plain": [
"Pipeline(steps=[('imputuer', SimpleImputer()),\n",
" ('standardizer', StandardScaler()),\n",
" ('automl', <flaml.automl.AutoML object at 0x7fb8c888a2b0>)])"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import sklearn\n",
"from sklearn import set_config\n",
"from sklearn.pipeline import Pipeline\n",
"from sklearn.impute import SimpleImputer\n",
"from sklearn.preprocessing import StandardScaler\n",
"from flaml import AutoML\n",
"\n",
"set_config(display='diagram')\n",
"\n",
"imputer = SimpleImputer()\n",
"standardizer = StandardScaler()\n",
"automl = AutoML()\n",
"\n",
"automl_pipeline = Pipeline([\n",
" (\"imputuer\",imputer),\n",
" (\"standardizer\", standardizer),\n",
" (\"automl\", automl)\n",
"])\n",
"automl_pipeline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Run FLAML\n",
"In the FLAML automl run configuration, users can specify the task type, time budget, error metric, learner list, whether to subsample, resampling strategy type, and so on. All these arguments have default values which will be used if users do not provide them. For example, the default ML learners of FLAML are `['lgbm', 'xgboost', 'catboost', 'rf', 'extra_tree', 'lrl1']`. "
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [],
"source": [
"settings = {\n",
" \"time_budget\": 60, # total running time in seconds\n",
" \"metric\": 'accuracy', # primary metrics can be chosen from: ['accuracy','roc_auc','f1','log_loss','mae','mse','r2']\n",
" \"task\": 'classification', # task type \n",
" \"estimator_list\":['xgboost','catboost','lgbm'],\n",
" \"log_file_name\": 'airlines_experiment.log', # flaml log file\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:30] {884} INFO - Evaluation method: holdout\n",
"[flaml.automl: 08-09 19:49:30] {591} INFO - Using StratifiedKFold\n",
"[flaml.automl: 08-09 19:49:30] {905} INFO - Minimizing error metric: 1-accuracy\n",
"[flaml.automl: 08-09 19:49:30] {924} INFO - List of ML learners in AutoML Run: ['xgboost', 'catboost', 'lgbm']\n",
"[flaml.automl: 08-09 19:49:30] {986} INFO - iteration 0 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:30] {1134} INFO - at 0.4s,\tbest xgboost's error=0.3755,\tbest xgboost's error=0.3755\n",
"[flaml.automl: 08-09 19:49:30] {986} INFO - iteration 1 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:30] {1134} INFO - at 0.4s,\tbest lgbm's error=0.3704,\tbest lgbm's error=0.3704\n",
"[flaml.automl: 08-09 19:49:30] {986} INFO - iteration 2 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:30] {1134} INFO - at 0.5s,\tbest xgboost's error=0.3755,\tbest lgbm's error=0.3704\n",
"[flaml.automl: 08-09 19:49:30] {986} INFO - iteration 3 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:30] {1134} INFO - at 0.5s,\tbest lgbm's error=0.3704,\tbest lgbm's error=0.3704\n",
"[flaml.automl: 08-09 19:49:30] {986} INFO - iteration 4 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 0.6s,\tbest xgboost's error=0.3643,\tbest xgboost's error=0.3643\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 5 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=4 will be ignored. Current value: num_leaves=31\n",
"[LightGBM] [Warning] min_data_in_leaf is set=32, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=32\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=4 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 0.6s,\tbest xgboost's error=0.3643,\tbest xgboost's error=0.3643\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 6 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 0.7s,\tbest xgboost's error=0.3624,\tbest xgboost's error=0.3624\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 7 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 0.8s,\tbest xgboost's error=0.3605,\tbest xgboost's error=0.3605\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 8 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 0.8s,\tbest xgboost's error=0.3605,\tbest xgboost's error=0.3605\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 9 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 0.9s,\tbest lgbm's error=0.3704,\tbest xgboost's error=0.3605\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 10 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=25, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=25\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=4 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 1.1s,\tbest xgboost's error=0.3605,\tbest xgboost's error=0.3605\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 11 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 1.1s,\tbest lgbm's error=0.3704,\tbest xgboost's error=0.3605\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 12 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 1.2s,\tbest xgboost's error=0.3605,\tbest xgboost's error=0.3605\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 13 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=7, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=7\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=8 will be ignored. Current value: num_leaves=31\n",
"[LightGBM] [Warning] min_data_in_leaf is set=39, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=39\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=4 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 1.4s,\tbest lgbm's error=0.3658,\tbest xgboost's error=0.3605\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 14 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:31] {1134} INFO - at 1.4s,\tbest xgboost's error=0.3605,\tbest xgboost's error=0.3605\n",
"[flaml.automl: 08-09 19:49:31] {986} INFO - iteration 15 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:32] {1134} INFO - at 1.6s,\tbest lgbm's error=0.3588,\tbest lgbm's error=0.3588\n",
"[flaml.automl: 08-09 19:49:32] {986} INFO - iteration 16 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:32] {1134} INFO - at 1.6s,\tbest xgboost's error=0.3605,\tbest lgbm's error=0.3588\n",
"[flaml.automl: 08-09 19:49:32] {986} INFO - iteration 17 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=35, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=35\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=17 will be ignored. Current value: num_leaves=31\n",
"[LightGBM] [Warning] min_data_in_leaf is set=17, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=17\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=4 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:32] {1134} INFO - at 1.7s,\tbest lgbm's error=0.3588,\tbest lgbm's error=0.3588\n",
"[flaml.automl: 08-09 19:49:32] {986} INFO - iteration 18 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:32] {1134} INFO - at 1.8s,\tbest lgbm's error=0.3588,\tbest lgbm's error=0.3588\n",
"[flaml.automl: 08-09 19:49:32] {986} INFO - iteration 19 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=59, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=59\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=51 will be ignored. Current value: num_leaves=31\n",
"[LightGBM] [Warning] min_data_in_leaf is set=74, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=74\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=5 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:32] {1134} INFO - at 2.0s,\tbest lgbm's error=0.3588,\tbest lgbm's error=0.3588\n",
"[flaml.automl: 08-09 19:49:32] {986} INFO - iteration 20 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:32] {1134} INFO - at 2.1s,\tbest xgboost's error=0.3531,\tbest xgboost's error=0.3531\n",
"[flaml.automl: 08-09 19:49:32] {986} INFO - iteration 21 current learner catboost\n",
"[flaml.automl: 08-09 19:49:32] {1134} INFO - at 2.3s,\tbest catboost's error=0.3595,\tbest xgboost's error=0.3531\n",
"[flaml.automl: 08-09 19:49:32] {986} INFO - iteration 22 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:33] {1134} INFO - at 2.6s,\tbest xgboost's error=0.3531,\tbest xgboost's error=0.3531\n",
"[flaml.automl: 08-09 19:49:33] {986} INFO - iteration 23 current learner catboost\n",
"[flaml.automl: 08-09 19:49:33] {1134} INFO - at 2.8s,\tbest catboost's error=0.3595,\tbest xgboost's error=0.3531\n",
"[flaml.automl: 08-09 19:49:33] {986} INFO - iteration 24 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:33] {1134} INFO - at 2.9s,\tbest lgbm's error=0.3588,\tbest xgboost's error=0.3531\n",
"[flaml.automl: 08-09 19:49:33] {986} INFO - iteration 25 current learner catboost\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=20, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=20\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=13 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:33] {1134} INFO - at 3.1s,\tbest catboost's error=0.3587,\tbest xgboost's error=0.3531\n",
"[flaml.automl: 08-09 19:49:33] {986} INFO - iteration 26 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:33] {1134} INFO - at 3.2s,\tbest lgbm's error=0.3588,\tbest xgboost's error=0.3531\n",
"[flaml.automl: 08-09 19:49:33] {986} INFO - iteration 27 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=36, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=36\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=5 will be ignored. Current value: num_leaves=31\n",
"[LightGBM] [Warning] min_data_in_leaf is set=35, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=35\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=17 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:33] {1134} INFO - at 3.4s,\tbest lgbm's error=0.3517,\tbest lgbm's error=0.3517\n",
"[flaml.automl: 08-09 19:49:33] {986} INFO - iteration 28 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:34] {1134} INFO - at 3.6s,\tbest lgbm's error=0.3517,\tbest lgbm's error=0.3517\n",
"[flaml.automl: 08-09 19:49:34] {986} INFO - iteration 29 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=31, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=31\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=8 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:34] {1134} INFO - at 3.8s,\tbest xgboost's error=0.3527,\tbest lgbm's error=0.3517\n",
"[flaml.automl: 08-09 19:49:34] {986} INFO - iteration 30 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:34] {1134} INFO - at 3.9s,\tbest xgboost's error=0.3527,\tbest lgbm's error=0.3517\n",
"[flaml.automl: 08-09 19:49:34] {986} INFO - iteration 31 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:35] {1134} INFO - at 4.9s,\tbest xgboost's error=0.3517,\tbest xgboost's error=0.3517\n",
"[flaml.automl: 08-09 19:49:35] {986} INFO - iteration 32 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:35] {1134} INFO - at 4.9s,\tbest lgbm's error=0.3517,\tbest xgboost's error=0.3517\n",
"[flaml.automl: 08-09 19:49:35] {986} INFO - iteration 33 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=26, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=26\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=111 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:35] {1134} INFO - at 5.2s,\tbest xgboost's error=0.3517,\tbest xgboost's error=0.3517\n",
"[flaml.automl: 08-09 19:49:35] {986} INFO - iteration 34 current learner catboost\n",
"[flaml.automl: 08-09 19:49:35] {1134} INFO - at 5.4s,\tbest catboost's error=0.3587,\tbest xgboost's error=0.3517\n",
"[flaml.automl: 08-09 19:49:35] {986} INFO - iteration 35 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:36] {1134} INFO - at 5.6s,\tbest lgbm's error=0.3514,\tbest lgbm's error=0.3514\n",
"[flaml.automl: 08-09 19:49:36] {986} INFO - iteration 36 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=31, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=31\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=4 will be ignored. Current value: num_leaves=31\n",
"[LightGBM] [Warning] min_data_in_leaf is set=35, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=35\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=7 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:36] {1134} INFO - at 5.8s,\tbest lgbm's error=0.3501,\tbest lgbm's error=0.3501\n",
"[flaml.automl: 08-09 19:49:36] {986} INFO - iteration 37 current learner lgbm\n",
"[flaml.automl: 08-09 19:49:36] {1134} INFO - at 6.0s,\tbest lgbm's error=0.3501,\tbest lgbm's error=0.3501\n",
"[flaml.automl: 08-09 19:49:36] {986} INFO - iteration 38 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=41, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=41\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=11 will be ignored. Current value: num_leaves=31\n",
"[LightGBM] [Warning] min_data_in_leaf is set=51, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=51\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=44 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:37] {1134} INFO - at 6.7s,\tbest lgbm's error=0.3492,\tbest lgbm's error=0.3492\n",
"[flaml.automl: 08-09 19:49:37] {986} INFO - iteration 39 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=74, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=74\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=6 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:37] {1134} INFO - at 7.3s,\tbest lgbm's error=0.3492,\tbest lgbm's error=0.3492\n",
"[flaml.automl: 08-09 19:49:37] {986} INFO - iteration 40 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=35, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=35\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=52 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:39] {1134} INFO - at 9.5s,\tbest lgbm's error=0.3492,\tbest lgbm's error=0.3492\n",
"[flaml.automl: 08-09 19:49:39] {986} INFO - iteration 41 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:49:42] {1134} INFO - at 12.4s,\tbest xgboost's error=0.3517,\tbest lgbm's error=0.3492\n",
"[flaml.automl: 08-09 19:49:42] {986} INFO - iteration 42 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=51, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=51\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=44 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:44] {1134} INFO - at 14.3s,\tbest lgbm's error=0.3424,\tbest lgbm's error=0.3424\n",
"[flaml.automl: 08-09 19:49:44] {986} INFO - iteration 43 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=26, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=26\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=170 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:45] {1134} INFO - at 15.5s,\tbest lgbm's error=0.3424,\tbest lgbm's error=0.3424\n",
"[flaml.automl: 08-09 19:49:45] {986} INFO - iteration 44 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=53, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=53\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=12 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:48] {1134} INFO - at 18.2s,\tbest lgbm's error=0.3424,\tbest lgbm's error=0.3424\n",
"[flaml.automl: 08-09 19:49:48] {986} INFO - iteration 45 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=100, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=100\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=18 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:49] {1134} INFO - at 19.1s,\tbest lgbm's error=0.3407,\tbest lgbm's error=0.3407\n",
"[flaml.automl: 08-09 19:49:49] {986} INFO - iteration 46 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=128, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=128\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=70 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:51] {1134} INFO - at 20.8s,\tbest lgbm's error=0.3407,\tbest lgbm's error=0.3407\n",
"[flaml.automl: 08-09 19:49:51] {986} INFO - iteration 47 current learner catboost\n",
"[flaml.automl: 08-09 19:49:51] {1134} INFO - at 21.0s,\tbest catboost's error=0.3587,\tbest lgbm's error=0.3407\n",
"[flaml.automl: 08-09 19:49:51] {986} INFO - iteration 48 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=128, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=128\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=16 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:52] {1134} INFO - at 22.2s,\tbest lgbm's error=0.3376,\tbest lgbm's error=0.3376\n",
"[flaml.automl: 08-09 19:49:52] {986} INFO - iteration 49 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=56, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=56\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=52 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:53] {1134} INFO - at 23.0s,\tbest lgbm's error=0.3376,\tbest lgbm's error=0.3376\n",
"[flaml.automl: 08-09 19:49:53] {986} INFO - iteration 50 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=98, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=98\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=4 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:49:56] {1134} INFO - at 26.5s,\tbest lgbm's error=0.3351,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:49:56] {986} INFO - iteration 51 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=97, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=97\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=39 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:50:00] {1134} INFO - at 29.9s,\tbest lgbm's error=0.3351,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:50:00] {986} INFO - iteration 52 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=128, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=128\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=4 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:50:05] {1134} INFO - at 35.0s,\tbest lgbm's error=0.3351,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:50:05] {986} INFO - iteration 53 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:50:05] {1134} INFO - at 35.3s,\tbest xgboost's error=0.3517,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:50:05] {986} INFO - iteration 54 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=43, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=43\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=11 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:50:07] {1134} INFO - at 36.9s,\tbest lgbm's error=0.3351,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:50:07] {986} INFO - iteration 55 current learner catboost\n",
"[flaml.automl: 08-09 19:50:07] {1134} INFO - at 37.4s,\tbest catboost's error=0.3515,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:50:07] {986} INFO - iteration 56 current learner catboost\n",
"[flaml.automl: 08-09 19:50:08] {1134} INFO - at 37.6s,\tbest catboost's error=0.3515,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:50:08] {986} INFO - iteration 57 current learner catboost\n",
"[flaml.automl: 08-09 19:50:08] {1134} INFO - at 37.9s,\tbest catboost's error=0.3515,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:50:08] {986} INFO - iteration 58 current learner catboost\n",
"[flaml.automl: 08-09 19:50:08] {1134} INFO - at 38.1s,\tbest catboost's error=0.3515,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:50:08] {986} INFO - iteration 59 current learner catboost\n",
"[flaml.automl: 08-09 19:50:08] {1134} INFO - at 38.3s,\tbest catboost's error=0.3515,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:50:08] {986} INFO - iteration 60 current learner catboost\n",
"[flaml.automl: 08-09 19:50:09] {1134} INFO - at 38.6s,\tbest catboost's error=0.3515,\tbest lgbm's error=0.3351\n",
"[flaml.automl: 08-09 19:50:09] {986} INFO - iteration 61 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=55, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=55\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=16 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:50:12] {1134} INFO - at 42.5s,\tbest lgbm's error=0.3328,\tbest lgbm's error=0.3328\n",
"[flaml.automl: 08-09 19:50:12] {986} INFO - iteration 62 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=117, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=117\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=4 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:50:14] {1134} INFO - at 44.4s,\tbest lgbm's error=0.3328,\tbest lgbm's error=0.3328\n",
"[flaml.automl: 08-09 19:50:14] {986} INFO - iteration 63 current learner catboost\n",
"[flaml.automl: 08-09 19:50:15] {1134} INFO - at 44.7s,\tbest catboost's error=0.3515,\tbest lgbm's error=0.3328\n",
"[flaml.automl: 08-09 19:50:15] {986} INFO - iteration 64 current learner catboost\n",
"[flaml.automl: 08-09 19:50:18] {1134} INFO - at 47.9s,\tbest catboost's error=0.3435,\tbest lgbm's error=0.3328\n",
"[flaml.automl: 08-09 19:50:18] {986} INFO - iteration 65 current learner lgbm\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=128, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=128\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=12 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:50:23] {1134} INFO - at 52.8s,\tbest lgbm's error=0.3328,\tbest lgbm's error=0.3328\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[LightGBM] [Warning] min_data_in_leaf is set=55, min_child_samples=20 will be ignored. Current value: min_data_in_leaf=55\n",
"[LightGBM] [Warning] num_leaves is set=31, max_leaves=16 will be ignored. Current value: num_leaves=31\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"[flaml.automl: 08-09 19:50:26] {1156} INFO - retrain lgbm for 3.3s\n",
"[flaml.automl: 08-09 19:50:26] {986} INFO - iteration 66 current learner catboost\n",
"[flaml.automl: 08-09 19:50:27] {1134} INFO - at 57.4s,\tbest catboost's error=0.3435,\tbest lgbm's error=0.3328\n",
"[flaml.automl: 08-09 19:50:29] {1156} INFO - retrain catboost for 1.3s\n",
"[flaml.automl: 08-09 19:50:29] {986} INFO - iteration 67 current learner xgboost\n",
"/Users/budigam.nagaraju/opt/anaconda3/lib/python3.8/site-packages/xgboost/sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1].\n",
" warnings.warn(label_encoder_deprecation_msg, UserWarning)\n",
"[flaml.automl: 08-09 19:50:29] {1134} INFO - at 58.9s,\tbest xgboost's error=0.3517,\tbest lgbm's error=0.3328\n",
"[flaml.automl: 08-09 19:50:30] {1156} INFO - retrain xgboost for 0.9s\n",
"[flaml.automl: 08-09 19:50:30] {1181} INFO - selected model: LGBMClassifier(colsample_bytree=0.7560357004495271,\n",
" learning_rate=0.28478479182882205, max_bin=31, max_leaves=16,\n",
" min_data_in_leaf=55, n_estimators=746, objective='binary',\n",
" reg_alpha=0.0009765625, reg_lambda=0.032652090008547976,\n",
" subsample=0.8847635935300631)\n",
"[flaml.automl: 08-09 19:50:30] {939} INFO - fit succeeded\n"
]
},
{
"data": {
"text/html": [
"<style>#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 {color: black;background-color: white;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 pre{padding: 0;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-toggleable {background-color: white;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.2em 0.3em;box-sizing: border-box;text-align: center;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;margin: 0.25em 0.25em;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-estimator:hover {background-color: #d4ebff;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-label:hover label.sk-toggleable__label {background-color: #d4ebff;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 2em;bottom: 0;left: 50%;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-item {z-index: 1;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-parallel-item {display: flex;flex-direction: column;position: relative;background-color: white;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-parallel-item:only-child::after {width: 0;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0.2em;box-sizing: border-box;padding-bottom: 0.1em;background-color: white;position: relative;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-label label {font-family: monospace;font-weight: bold;background-color: white;display: inline-block;line-height: 1.2em;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-label-container {position: relative;z-index: 2;text-align: center;}#sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7 div.sk-container {display: inline-block;position: relative;}</style><div 
id=\"sk-b98f73a3-adb4-454a-b919-4bcf05cd59b7\" class\"sk-top-container\"><div class=\"sk-container\"><div class=\"sk-item sk-dashed-wrapped\"><div class=\"sk-label-container\"><div class=\"sk-label sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"a42ab7b5-f338-4516-bbcc-97a966b51960\" type=\"checkbox\" ><label class=\"sk-toggleable__label\" for=\"a42ab7b5-f338-4516-bbcc-97a966b51960\">Pipeline</label><div class=\"sk-toggleable__content\"><pre>Pipeline(steps=[('imputuer', SimpleImputer()),\n",
" ('standardizer', StandardScaler()),\n",
" ('automl', <flaml.automl.AutoML object at 0x7fb8c888a2b0>)])</pre></div></div></div><div class=\"sk-serial\"><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"76a93ef9-a54d-4eeb-a5c4-c293ce60c1b1\" type=\"checkbox\" ><label class=\"sk-toggleable__label\" for=\"76a93ef9-a54d-4eeb-a5c4-c293ce60c1b1\">SimpleImputer</label><div class=\"sk-toggleable__content\"><pre>SimpleImputer()</pre></div></div></div><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"714d8e9b-935a-44f8-a268-54f4841b57be\" type=\"checkbox\" ><label class=\"sk-toggleable__label\" for=\"714d8e9b-935a-44f8-a268-54f4841b57be\">StandardScaler</label><div class=\"sk-toggleable__content\"><pre>StandardScaler()</pre></div></div></div><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"9952a721-e7a0-4798-aee7-a14fea98ce3b\" type=\"checkbox\" ><label class=\"sk-toggleable__label\" for=\"9952a721-e7a0-4798-aee7-a14fea98ce3b\">AutoML</label><div class=\"sk-toggleable__content\"><pre><flaml.automl.AutoML object at 0x7fb8c888a2b0></pre></div></div></div></div></div></div></div>"
],
"text/plain": [
"Pipeline(steps=[('imputuer', SimpleImputer()),\n",
" ('standardizer', StandardScaler()),\n",
" ('automl', <flaml.automl.AutoML object at 0x7fb8c888a2b0>)])"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"automl_pipeline.fit(X_train, y_train, \n",
" automl__time_budget=settings['time_budget'],\n",
" automl__metric=settings['metric'],\n",
" automl__estimator_list=settings['estimator_list'],\n",
" automl__log_training_metric=True)"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best ML leaner: lgbm\n",
"Best hyperparmeter config: {'n_estimators': 746.0, 'max_leaves': 16.0, 'min_data_in_leaf': 55.0, 'learning_rate': 0.28478479182882205, 'subsample': 0.8847635935300631, 'log_max_bin': 5.0, 'colsample_bytree': 0.7560357004495271, 'reg_alpha': 0.0009765625, 'reg_lambda': 0.032652090008547976, 'FLAML_sample_size': 364083}\n",
"Best accuracy on validation data: 0.6672\n",
"Training duration of best run: 3.921 s\n"
]
}
],
"source": [
"# Get the automl object from the pipeline\n",
"automl = automl_pipeline.steps[2][1]\n",
"\n",
"# Get the best config and best learner\n",
"print('Best ML leaner:', automl.best_estimator)\n",
"print('Best hyperparmeter config:', automl.best_config)\n",
"print('Best accuracy on validation data: {0:.4g}'.format(1-automl.best_loss))\n",
"print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))"
]
},
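{
"cell_type": "markdown",
"metadata": {},
"source": [
"Equivalently, the fitted AutoML step can be retrieved by name instead of by position, which is less brittle if pipeline steps are later added or reordered:\n",
"```python\n",
"automl = automl_pipeline.named_steps['automl']\n",
"```"
]
},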
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 {color: black;background-color: white;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 pre{padding: 0;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-toggleable {background-color: white;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.2em 0.3em;box-sizing: border-box;text-align: center;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;margin: 0.25em 0.25em;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-estimator:hover {background-color: #d4ebff;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-label:hover label.sk-toggleable__label {background-color: #d4ebff;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 2em;bottom: 0;left: 50%;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-item {z-index: 1;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-parallel-item {display: flex;flex-direction: column;position: relative;background-color: white;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-parallel-item:only-child::after {width: 0;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0.2em;box-sizing: border-box;padding-bottom: 0.1em;background-color: white;position: relative;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-label label {font-family: monospace;font-weight: bold;background-color: white;display: inline-block;line-height: 1.2em;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-label-container {position: relative;z-index: 2;text-align: center;}#sk-5364845a-380b-46b0-aea6-e1a4599e4f67 div.sk-container {display: inline-block;position: relative;}</style><div 
id=\"sk-5364845a-380b-46b0-aea6-e1a4599e4f67\" class\"sk-top-container\"><div class=\"sk-container\"><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"4580aaa9-c97f-416f-b859-6c98ab0bc8b7\" type=\"checkbox\" checked><label class=\"sk-toggleable__label\" for=\"4580aaa9-c97f-416f-b859-6c98ab0bc8b7\">LGBMClassifier</label><div class=\"sk-toggleable__content\"><pre>LGBMClassifier(colsample_bytree=0.7560357004495271,\n",
" learning_rate=0.28478479182882205, max_bin=31, max_leaves=16,\n",
" min_data_in_leaf=55, n_estimators=746, objective='binary',\n",
" reg_alpha=0.0009765625, reg_lambda=0.032652090008547976,\n",
" subsample=0.8847635935300631)</pre></div></div></div></div></div>"
],
"text/plain": [
"LGBMClassifier(colsample_bytree=0.7560357004495271,\n",
" learning_rate=0.28478479182882205, max_bin=31, max_leaves=16,\n",
" min_data_in_leaf=55, n_estimators=746, objective='binary',\n",
" reg_alpha=0.0009765625, reg_lambda=0.032652090008547976,\n",
" subsample=0.8847635935300631)"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"automl.model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Persist the model binary file"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"# Persist the automl object as pickle file\n",
"import pickle\n",
"with open('automl.pkl', 'wb') as f:\n",
" pickle.dump(automl, f, pickle.HIGHEST_PROTOCOL)"
]
},
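{
"cell_type": "markdown",
"metadata": {},
"source": [
"A short sketch of loading the pickled object back for later use. Note that only the AutoML step is pickled above, so new data would need the same imputation and scaling applied first; pickling `automl_pipeline` itself would capture the full preprocessing-plus-model workflow.\n",
"```python\n",
"import pickle\n",
"\n",
"with open('automl.pkl', 'rb') as f:\n",
"    restored_automl = pickle.load(f)\n",
"\n",
"# The restored object exposes the same predict/predict_proba interface\n",
"y_pred = restored_automl.predict(X_test)\n",
"```"
]
},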
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Predicted labels [0 1 1 ... 0 1 0]\n",
"True labels [0 0 0 ... 1 0 1]\n",
"Predicted probas [0.36424183 0.59111937 0.64600957 0.27020691 0.23272711]\n"
]
}
],
"source": [
"# Performance inference on the testing dataset\n",
"y_pred = automl_pipeline.predict(X_test)\n",
"print('Predicted labels', y_pred)\n",
"print('True labels', y_test)\n",
"y_pred_proba = automl_pipeline.predict_proba(X_test)[:,1]\n",
"print('Predicted probas ',y_pred_proba[:5])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}