{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# AutoML with FLAML Library\n",
"\n",
"\n",
"| | | | |\n",
"|-----|--------|--------|--------|\n",
"| \n",
"\n",
"\n",
"\n",
"### Goal\n",
"In this notebook, we demonstrate how to use AutoML with FLAML to find the best model for our dataset.\n",
"\n",
"\n",
"## 1. Introduction\n",
"\n",
"FLAML is a Python library (https://github.com/microsoft/FLAML) designed to automatically produce accurate machine learning models \n",
"with low computational cost. It is fast and economical. The simple and lightweight design makes it easy to use and extend, such as adding new learners. FLAML can \n",
"- serve as an economical AutoML engine,\n",
"- be used as a fast hyperparameter tuning tool, or \n",
"- be embedded in self-tuning software that requires low latency & resource in repetitive\n",
" tuning tasks.\n",
"\n",
"In this notebook, we use one real data example (binary classification) to showcase how to use FLAML library.\n",
"\n",
"FLAML requires `Python>=3.7`. To run this notebook example, please install the following packages."
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"jupyter": {
"outputs_hidden": true
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2023-04-09T03:11:05.782522Z",
"execution_start_time": "2023-04-09T03:11:05.7822033Z",
"livy_statement_state": "available",
"parent_msg_id": "18b2ee64-09c4-4ceb-8975-e4ed43d7c41a",
"queued_time": "2023-04-09T03:10:33.571519Z",
"session_id": "7",
"session_start_time": null,
"spark_jobs": null,
"spark_pool": null,
"state": "finished",
"statement_id": -1
},
"text/plain": [
"StatementMeta(, 7, -1, Finished, Available)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting flaml[synapse]==1.1.3\n",
" Using cached FLAML-1.1.3-py3-none-any.whl (224 kB)\n",
"Collecting xgboost==1.6.1\n",
" Using cached xgboost-1.6.1-py3-none-manylinux2014_x86_64.whl (192.9 MB)\n",
"Collecting pandas==1.5.1\n",
" Using cached pandas-1.5.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)\n",
"Collecting numpy==1.23.4\n",
" Using cached numpy-1.23.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)\n",
"Collecting openml\n",
" Using cached openml-0.13.1-py3-none-any.whl\n",
"Collecting scipy>=1.4.1\n",
" Using cached scipy-1.10.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (34.5 MB)\n",
"Collecting scikit-learn>=0.24\n",
" Using cached scikit_learn-1.2.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.8 MB)\n",
"Collecting lightgbm>=2.3.1\n",
" Using cached lightgbm-3.3.5-py3-none-manylinux1_x86_64.whl (2.0 MB)\n",
"Collecting pyspark>=3.0.0\n",
" Using cached pyspark-3.3.2-py2.py3-none-any.whl\n",
"Collecting optuna==2.8.0\n",
" Using cached optuna-2.8.0-py3-none-any.whl (301 kB)\n",
"Collecting joblibspark>=0.5.0\n",
" Using cached joblibspark-0.5.1-py3-none-any.whl (15 kB)\n",
"Collecting python-dateutil>=2.8.1\n",
" Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)\n",
"Collecting pytz>=2020.1\n",
" Using cached pytz-2023.3-py2.py3-none-any.whl (502 kB)\n",
"Collecting cliff\n",
" Using cached cliff-4.2.0-py3-none-any.whl (81 kB)\n",
"Collecting packaging>=20.0\n",
" Using cached packaging-23.0-py3-none-any.whl (42 kB)\n",
"Collecting cmaes>=0.8.2\n",
" Using cached cmaes-0.9.1-py3-none-any.whl (21 kB)\n",
"Collecting sqlalchemy>=1.1.0\n",
" Using cached SQLAlchemy-2.0.9-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.8 MB)\n",
"Collecting tqdm\n",
" Using cached tqdm-4.65.0-py3-none-any.whl (77 kB)\n",
"Collecting alembic\n",
" Using cached alembic-1.10.3-py3-none-any.whl (212 kB)\n",
"Collecting colorlog\n",
" Using cached colorlog-6.7.0-py2.py3-none-any.whl (11 kB)\n",
"Collecting xmltodict\n",
" Using cached xmltodict-0.13.0-py2.py3-none-any.whl (10.0 kB)\n",
"Collecting requests\n",
" Using cached requests-2.28.2-py3-none-any.whl (62 kB)\n",
"Collecting minio\n",
" Using cached minio-7.1.14-py3-none-any.whl (77 kB)\n",
"Collecting liac-arff>=2.4.0\n",
" Using cached liac_arff-2.5.0-py3-none-any.whl\n",
"Collecting pyarrow\n",
" Using cached pyarrow-11.0.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (35.0 MB)\n",
"Collecting joblib>=0.14\n",
" Using cached joblib-1.2.0-py3-none-any.whl (297 kB)\n",
"Collecting wheel\n",
" Using cached wheel-0.40.0-py3-none-any.whl (64 kB)\n",
"Collecting py4j==0.10.9.5\n",
" Using cached py4j-0.10.9.5-py2.py3-none-any.whl (199 kB)\n",
"Collecting six>=1.5\n",
" Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)\n",
"Collecting threadpoolctl>=2.0.0\n",
" Using cached threadpoolctl-3.1.0-py3-none-any.whl (14 kB)\n",
"Collecting urllib3\n",
" Using cached urllib3-1.26.15-py2.py3-none-any.whl (140 kB)\n",
"Collecting certifi\n",
" Using cached certifi-2022.12.7-py3-none-any.whl (155 kB)\n",
"Collecting idna<4,>=2.5\n",
" Using cached idna-3.4-py3-none-any.whl (61 kB)\n",
"Collecting charset-normalizer<4,>=2\n",
" Using cached charset_normalizer-3.1.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (195 kB)\n",
"Collecting typing-extensions>=4.2.0\n",
" Using cached typing_extensions-4.5.0-py3-none-any.whl (27 kB)\n",
"Collecting greenlet!=0.4.17\n",
" Using cached greenlet-2.0.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (618 kB)\n",
"Collecting importlib-metadata\n",
" Using cached importlib_metadata-6.2.0-py3-none-any.whl (21 kB)\n",
"Collecting importlib-resources\n",
" Using cached importlib_resources-5.12.0-py3-none-any.whl (36 kB)\n",
"Collecting Mako\n",
" Using cached Mako-1.2.4-py3-none-any.whl (78 kB)\n",
"Collecting autopage>=0.4.0\n",
" Using cached autopage-0.5.1-py3-none-any.whl (29 kB)\n",
"Collecting cmd2>=1.0.0\n",
" Using cached cmd2-2.4.3-py3-none-any.whl (147 kB)\n",
"Collecting stevedore>=2.0.1\n",
" Using cached stevedore-5.0.0-py3-none-any.whl (49 kB)\n",
"Collecting PrettyTable>=0.7.2\n",
" Using cached prettytable-3.6.0-py3-none-any.whl (27 kB)\n",
"Collecting PyYAML>=3.12\n",
" Using cached PyYAML-6.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (701 kB)\n",
"Collecting attrs>=16.3.0\n",
" Using cached attrs-22.2.0-py3-none-any.whl (60 kB)\n",
"Collecting pyperclip>=1.6\n",
" Using cached pyperclip-1.8.2-py3-none-any.whl\n",
"Collecting wcwidth>=0.1.7\n",
" Using cached wcwidth-0.2.6-py2.py3-none-any.whl (29 kB)\n",
"Collecting zipp>=0.5\n",
" Using cached zipp-3.15.0-py3-none-any.whl (6.8 kB)\n",
"Collecting pbr!=2.1.0,>=2.0.0\n",
" Using cached pbr-5.11.1-py2.py3-none-any.whl (112 kB)\n",
"Collecting MarkupSafe>=0.9.2\n",
" Using cached MarkupSafe-2.1.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)\n",
"Installing collected packages: wcwidth, pytz, pyperclip, py4j, zipp, xmltodict, wheel, urllib3, typing-extensions, tqdm, threadpoolctl, six, PyYAML, pyspark, PrettyTable, pbr, packaging, numpy, MarkupSafe, liac-arff, joblib, idna, greenlet, colorlog, charset-normalizer, certifi, autopage, attrs, stevedore, sqlalchemy, scipy, requests, python-dateutil, pyarrow, minio, Mako, joblibspark, importlib-resources, importlib-metadata, cmd2, cmaes, xgboost, scikit-learn, pandas, cliff, alembic, optuna, openml, lightgbm, flaml\n",
" Attempting uninstall: wcwidth\n",
" Found existing installation: wcwidth 0.2.6\n",
" Uninstalling wcwidth-0.2.6:\n",
" Successfully uninstalled wcwidth-0.2.6\n",
" Attempting uninstall: pytz\n",
" Found existing installation: pytz 2023.3\n",
" Uninstalling pytz-2023.3:\n",
" Successfully uninstalled pytz-2023.3\n",
" Attempting uninstall: pyperclip\n",
" Found existing installation: pyperclip 1.8.2\n",
" Uninstalling pyperclip-1.8.2:\n",
" Successfully uninstalled pyperclip-1.8.2\n",
" Attempting uninstall: py4j\n",
" Found existing installation: py4j 0.10.9.5\n",
" Uninstalling py4j-0.10.9.5:\n",
" Successfully uninstalled py4j-0.10.9.5\n",
" Attempting uninstall: zipp\n",
" Found existing installation: zipp 3.15.0\n",
" Uninstalling zipp-3.15.0:\n",
" Successfully uninstalled zipp-3.15.0\n",
" Attempting uninstall: xmltodict\n",
" Found existing installation: xmltodict 0.13.0\n",
" Uninstalling xmltodict-0.13.0:\n",
" Successfully uninstalled xmltodict-0.13.0\n",
" Attempting uninstall: wheel\n",
" Found existing installation: wheel 0.40.0\n",
" Uninstalling wheel-0.40.0:\n",
" Successfully uninstalled wheel-0.40.0\n",
" Attempting uninstall: urllib3\n",
" Found existing installation: urllib3 1.26.15\n",
" Uninstalling urllib3-1.26.15:\n",
" Successfully uninstalled urllib3-1.26.15\n",
" Attempting uninstall: typing-extensions\n",
" Found existing installation: typing_extensions 4.5.0\n",
" Uninstalling typing_extensions-4.5.0:\n",
" Successfully uninstalled typing_extensions-4.5.0\n",
" Attempting uninstall: tqdm\n",
" Found existing installation: tqdm 4.65.0\n",
" Uninstalling tqdm-4.65.0:\n",
" Successfully uninstalled tqdm-4.65.0\n",
" Attempting uninstall: threadpoolctl\n",
" Found existing installation: threadpoolctl 3.1.0\n",
" Uninstalling threadpoolctl-3.1.0:\n",
" Successfully uninstalled threadpoolctl-3.1.0\n",
" Attempting uninstall: six\n",
" Found existing installation: six 1.16.0\n",
" Uninstalling six-1.16.0:\n",
" Successfully uninstalled six-1.16.0\n",
" Attempting uninstall: PyYAML\n",
" Found existing installation: PyYAML 6.0\n",
" Uninstalling PyYAML-6.0:\n",
" Successfully uninstalled PyYAML-6.0\n",
" Attempting uninstall: pyspark\n",
" Found existing installation: pyspark 3.3.2\n",
" Uninstalling pyspark-3.3.2:\n",
" Successfully uninstalled pyspark-3.3.2\n",
" Attempting uninstall: PrettyTable\n",
" Found existing installation: prettytable 3.6.0\n",
" Uninstalling prettytable-3.6.0:\n",
" Successfully uninstalled prettytable-3.6.0\n",
" Attempting uninstall: pbr\n",
" Found existing installation: pbr 5.11.1\n",
" Uninstalling pbr-5.11.1:\n",
" Successfully uninstalled pbr-5.11.1\n",
" Attempting uninstall: packaging\n",
" Found existing installation: packaging 23.0\n",
" Uninstalling packaging-23.0:\n",
" Successfully uninstalled packaging-23.0\n",
" Attempting uninstall: numpy\n",
" Found existing installation: numpy 1.23.4\n",
" Uninstalling numpy-1.23.4:\n",
" Successfully uninstalled numpy-1.23.4\n",
" Attempting uninstall: MarkupSafe\n",
" Found existing installation: MarkupSafe 2.1.2\n",
" Uninstalling MarkupSafe-2.1.2:\n",
" Successfully uninstalled MarkupSafe-2.1.2\n",
" Attempting uninstall: liac-arff\n",
" Found existing installation: liac-arff 2.5.0\n",
" Uninstalling liac-arff-2.5.0:\n",
" Successfully uninstalled liac-arff-2.5.0\n",
" Attempting uninstall: joblib\n",
" Found existing installation: joblib 1.2.0\n",
" Uninstalling joblib-1.2.0:\n",
" Successfully uninstalled joblib-1.2.0\n",
" Attempting uninstall: idna\n",
" Found existing installation: idna 3.4\n",
" Uninstalling idna-3.4:\n",
" Successfully uninstalled idna-3.4\n",
" Attempting uninstall: greenlet\n",
" Found existing installation: greenlet 2.0.2\n",
" Uninstalling greenlet-2.0.2:\n",
" Successfully uninstalled greenlet-2.0.2\n",
" Attempting uninstall: colorlog\n",
" Found existing installation: colorlog 6.7.0\n",
" Uninstalling colorlog-6.7.0:\n",
" Successfully uninstalled colorlog-6.7.0\n",
" Attempting uninstall: charset-normalizer\n",
" Found existing installation: charset-normalizer 3.1.0\n",
" Uninstalling charset-normalizer-3.1.0:\n",
" Successfully uninstalled charset-normalizer-3.1.0\n",
" Attempting uninstall: certifi\n",
" Found existing installation: certifi 2022.12.7\n",
" Uninstalling certifi-2022.12.7:\n",
" Successfully uninstalled certifi-2022.12.7\n",
" Attempting uninstall: autopage\n",
" Found existing installation: autopage 0.5.1\n",
" Uninstalling autopage-0.5.1:\n",
" Successfully uninstalled autopage-0.5.1\n",
" Attempting uninstall: attrs\n",
" Found existing installation: attrs 22.2.0\n",
" Uninstalling attrs-22.2.0:\n",
" Successfully uninstalled attrs-22.2.0\n",
" Attempting uninstall: stevedore\n",
" Found existing installation: stevedore 5.0.0\n",
" Uninstalling stevedore-5.0.0:\n",
" Successfully uninstalled stevedore-5.0.0\n",
" Attempting uninstall: sqlalchemy\n",
" Found existing installation: SQLAlchemy 2.0.9\n",
" Uninstalling SQLAlchemy-2.0.9:\n",
" Successfully uninstalled SQLAlchemy-2.0.9\n",
" Attempting uninstall: scipy\n",
" Found existing installation: scipy 1.10.1\n",
" Uninstalling scipy-1.10.1:\n",
" Successfully uninstalled scipy-1.10.1\n",
" Attempting uninstall: requests\n",
" Found existing installation: requests 2.28.2\n",
" Uninstalling requests-2.28.2:\n",
" Successfully uninstalled requests-2.28.2\n",
" Attempting uninstall: python-dateutil\n",
" Found existing installation: python-dateutil 2.8.2\n",
" Uninstalling python-dateutil-2.8.2:\n",
" Successfully uninstalled python-dateutil-2.8.2\n",
" Attempting uninstall: pyarrow\n",
" Found existing installation: pyarrow 11.0.0\n",
" Uninstalling pyarrow-11.0.0:\n",
" Successfully uninstalled pyarrow-11.0.0\n",
" Attempting uninstall: minio\n",
" Found existing installation: minio 7.1.14\n",
" Uninstalling minio-7.1.14:\n",
" Successfully uninstalled minio-7.1.14\n",
" Attempting uninstall: Mako\n",
" Found existing installation: Mako 1.2.4\n",
" Uninstalling Mako-1.2.4:\n",
" Successfully uninstalled Mako-1.2.4\n",
" Attempting uninstall: joblibspark\n",
" Found existing installation: joblibspark 0.5.1\n",
" Uninstalling joblibspark-0.5.1:\n",
" Successfully uninstalled joblibspark-0.5.1\n",
" Attempting uninstall: importlib-resources\n",
" Found existing installation: importlib-resources 5.12.0\n",
" Uninstalling importlib-resources-5.12.0:\n",
" Successfully uninstalled importlib-resources-5.12.0\n",
" Attempting uninstall: importlib-metadata\n",
" Found existing installation: importlib-metadata 6.2.0\n",
" Uninstalling importlib-metadata-6.2.0:\n",
" Successfully uninstalled importlib-metadata-6.2.0\n",
" Attempting uninstall: cmd2\n",
" Found existing installation: cmd2 2.4.3\n",
" Uninstalling cmd2-2.4.3:\n",
" Successfully uninstalled cmd2-2.4.3\n",
" Attempting uninstall: cmaes\n",
" Found existing installation: cmaes 0.9.1\n",
" Uninstalling cmaes-0.9.1:\n",
" Successfully uninstalled cmaes-0.9.1\n",
" Attempting uninstall: xgboost\n",
" Found existing installation: xgboost 1.6.1\n",
" Uninstalling xgboost-1.6.1:\n",
" Successfully uninstalled xgboost-1.6.1\n",
" Attempting uninstall: scikit-learn\n",
" Found existing installation: scikit-learn 1.2.2\n",
" Uninstalling scikit-learn-1.2.2:\n",
" Successfully uninstalled scikit-learn-1.2.2\n",
" Attempting uninstall: pandas\n",
" Found existing installation: pandas 1.5.1\n",
" Uninstalling pandas-1.5.1:\n",
" Successfully uninstalled pandas-1.5.1\n",
" Attempting uninstall: cliff\n",
" Found existing installation: cliff 4.2.0\n",
" Uninstalling cliff-4.2.0:\n",
" Successfully uninstalled cliff-4.2.0\n",
" Attempting uninstall: alembic\n",
" Found existing installation: alembic 1.10.3\n",
" Uninstalling alembic-1.10.3:\n",
" Successfully uninstalled alembic-1.10.3\n",
" Attempting uninstall: optuna\n",
" Found existing installation: optuna 2.8.0\n",
" Uninstalling optuna-2.8.0:\n",
" Successfully uninstalled optuna-2.8.0\n",
" Attempting uninstall: openml\n",
" Found existing installation: openml 0.13.1\n",
" Uninstalling openml-0.13.1:\n",
" Successfully uninstalled openml-0.13.1\n",
" Attempting uninstall: lightgbm\n",
" Found existing installation: lightgbm 3.3.5\n",
" Uninstalling lightgbm-3.3.5:\n",
" Successfully uninstalled lightgbm-3.3.5\n",
" Attempting uninstall: flaml\n",
" Found existing installation: FLAML 1.1.3\n",
" Uninstalling FLAML-1.1.3:\n",
" Successfully uninstalled FLAML-1.1.3\n",
"\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
"virtualenv 20.14.0 requires platformdirs<3,>=2, but you have platformdirs 3.2.0 which is incompatible.\n",
"tensorflow 2.4.1 requires six~=1.15.0, but you have six 1.16.0 which is incompatible.\n",
"tensorflow 2.4.1 requires typing-extensions~=3.7.4, but you have typing-extensions 4.5.0 which is incompatible.\n",
"pmdarima 1.8.2 requires numpy~=1.19.0, but you have numpy 1.23.4 which is incompatible.\n",
"koalas 1.8.0 requires numpy<1.20.0,>=1.14, but you have numpy 1.23.4 which is incompatible.\n",
"gevent 21.1.2 requires greenlet<2.0,>=0.4.17; platform_python_implementation == \"CPython\", but you have greenlet 2.0.2 which is incompatible.\n",
"azureml-dataset-runtime 1.34.0 requires pyarrow<4.0.0,>=0.17.0, but you have pyarrow 11.0.0 which is incompatible.\n",
"azureml-core 1.34.0 requires urllib3<=1.26.6,>=1.23, but you have urllib3 1.26.15 which is incompatible.\u001b[0m\u001b[31m\n",
"\u001b[0mSuccessfully installed Mako-1.2.4 MarkupSafe-2.1.2 PrettyTable-3.6.0 PyYAML-6.0 alembic-1.10.3 attrs-22.2.0 autopage-0.5.1 certifi-2022.12.7 charset-normalizer-3.1.0 cliff-4.2.0 cmaes-0.9.1 cmd2-2.4.3 colorlog-6.7.0 flaml-1.1.3 greenlet-2.0.2 idna-3.4 importlib-metadata-6.2.0 importlib-resources-5.12.0 joblib-1.2.0 joblibspark-0.5.1 liac-arff-2.5.0 lightgbm-3.3.5 minio-7.1.14 numpy-1.23.4 openml-0.13.1 optuna-2.8.0 packaging-23.0 pandas-1.5.1 pbr-5.11.1 py4j-0.10.9.5 pyarrow-11.0.0 pyperclip-1.8.2 pyspark-3.3.2 python-dateutil-2.8.2 pytz-2023.3 requests-2.28.2 scikit-learn-1.2.2 scipy-1.10.1 six-1.16.0 sqlalchemy-2.0.9 stevedore-5.0.0 threadpoolctl-3.1.0 tqdm-4.65.0 typing-extensions-4.5.0 urllib3-1.26.15 wcwidth-0.2.6 wheel-0.40.0 xgboost-1.6.1 xmltodict-0.13.0 zipp-3.15.0\n",
"\u001b[33mWARNING: You are using pip version 22.0.4; however, version 23.0.1 is available.\n",
"You should consider upgrading via the '/nfs4/pyenv-bfada21f-d1ed-44b9-a41d-4ff480d237e7/bin/python -m pip install --upgrade pip' command.\u001b[0m\u001b[33m\n",
"\u001b[0mNote: you may need to restart the kernel to use updated packages.\n"
]
},
{
"data": {},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Warning: PySpark kernel has been restarted to use updated packages.\n",
"\n"
]
}
],
"source": [
"%pip install flaml[synapse]==1.1.3 xgboost==1.6.1 pandas==1.5.1 numpy==1.23.4 openml --force-reinstall"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 2. Classification Example\n",
"### Load data and preprocess\n",
"\n",
"Download [Airlines dataset](https://www.openml.org/d/1169) from OpenML. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure."
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"jupyter": {
"outputs_hidden": true
},
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2023-04-09T03:11:11.6973622Z",
"execution_start_time": "2023-04-09T03:11:09.4074274Z",
"livy_statement_state": "available",
"parent_msg_id": "25ba0152-0936-464b-83eb-afa5f2f517fb",
"queued_time": "2023-04-09T03:10:33.8002088Z",
"session_id": "7",
"session_start_time": null,
"spark_jobs": null,
"spark_pool": "automl",
"state": "finished",
"statement_id": 67
},
"text/plain": [
"StatementMeta(automl, 7, 67, Finished, Available)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/dask/dataframe/backends.py:187: FutureWarning: pandas.Int64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.\n",
" _numeric_index_types = (pd.Int64Index, pd.Float64Index, pd.UInt64Index)\n",
"/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/dask/dataframe/backends.py:187: FutureWarning: pandas.Float64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.\n",
" _numeric_index_types = (pd.Int64Index, pd.Float64Index, pd.UInt64Index)\n",
"/home/trusted-service-user/cluster-env/env/lib/python3.8/site-packages/dask/dataframe/backends.py:187: FutureWarning: pandas.UInt64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.\n",
" _numeric_index_types = (pd.Int64Index, pd.Float64Index, pd.UInt64Index)\n"
]
}
],
"source": [
"from flaml.data import load_openml_dataset\n",
"X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=1169, data_dir='./')"
]
},
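{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal AutoML run on this data can be sketched as follows (a sketch only, not the exact configuration used later in this notebook; `time_budget` is in seconds and the values here are illustrative):\n",
"\n",
"```python\n",
"from flaml import AutoML\n",
"\n",
"automl = AutoML()\n",
"automl.fit(X_train=X_train, y_train=y_train,\n",
"           task=\"classification\", metric=\"accuracy\", time_budget=60)\n",
"print(automl.best_estimator)\n",
"```"
]
},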
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2023-04-09T03:11:12.2518637Z",
"execution_start_time": "2023-04-09T03:11:11.9466307Z",
"livy_statement_state": "available",
"parent_msg_id": "c6f3064c-401e-447b-bd1d-65cd00f48fe1",
"queued_time": "2023-04-09T03:10:33.901764Z",
"session_id": "7",
"session_start_time": null,
"spark_jobs": null,
"spark_pool": "automl",
"state": "finished",
"statement_id": 68
},
"text/plain": [
"StatementMeta(automl, 7, 68, Finished, Available)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"
\n", " | Airline | \n", "Flight | \n", "AirportFrom | \n", "AirportTo | \n", "DayOfWeek | \n", "Time | \n", "Length | \n", "
---|---|---|---|---|---|---|---|
249392 | \n", "EV | \n", "5309.0 | \n", "MDT | \n", "ATL | \n", "3 | \n", "794.0 | \n", "131.0 | \n", "
166918 | \n", "CO | \n", "1079.0 | \n", "IAH | \n", "SAT | \n", "5 | \n", "900.0 | \n", "60.0 | \n", "
89110 | \n", "US | \n", "1636.0 | \n", "CLE | \n", "CLT | \n", "1 | \n", "530.0 | \n", "103.0 | \n", "
70258 | \n", "WN | \n", "928.0 | \n", "CMH | \n", "LAS | \n", "7 | \n", "480.0 | \n", "280.0 | \n", "
492985 | \n", "WN | \n", "729.0 | \n", "GEG | \n", "LAS | \n", "3 | \n", "630.0 | \n", "140.0 | \n", "
LGBMClassifier(colsample_bytree=0.763983850698587,\n", " learning_rate=0.087493667994037, max_bin=127,\n", " min_child_samples=128, n_estimators=302, num_leaves=466,\n", " reg_alpha=0.09968008477303378, reg_lambda=23.227419343318914,\n", " verbose=-1)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
LGBMClassifier(colsample_bytree=0.763983850698587,\n", " learning_rate=0.087493667994037, max_bin=127,\n", " min_child_samples=128, n_estimators=302, num_leaves=466,\n", " reg_alpha=0.09968008477303378, reg_lambda=23.227419343318914,\n", " verbose=-1)
LGBMClassifier()In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
LGBMClassifier()
XGBClassifier(base_score=0.5, booster='gbtree', callbacks=None,\n", " colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1,\n", " early_stopping_rounds=None, enable_categorical=False,\n", " eval_metric=None, gamma=0, gpu_id=-1, grow_policy='depthwise',\n", " importance_type=None, interaction_constraints='',\n", " learning_rate=0.300000012, max_bin=256, max_cat_to_onehot=4,\n", " max_delta_step=0, max_depth=6, max_leaves=0, min_child_weight=1,\n", " missing=nan, monotone_constraints='()', n_estimators=100,\n", " n_jobs=0, num_parallel_tree=1, predictor='auto', random_state=0,\n", " reg_alpha=0, reg_lambda=1, ...)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
XGBClassifier(base_score=0.5, booster='gbtree', callbacks=None,\n", " colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1,\n", " early_stopping_rounds=None, enable_categorical=False,\n", " eval_metric=None, gamma=0, gpu_id=-1, grow_policy='depthwise',\n", " importance_type=None, interaction_constraints='',\n", " learning_rate=0.300000012, max_bin=256, max_cat_to_onehot=4,\n", " max_delta_step=0, max_depth=6, max_leaves=0, min_child_weight=1,\n", " missing=nan, monotone_constraints='()', n_estimators=100,\n", " n_jobs=0, num_parallel_tree=1, predictor='auto', random_state=0,\n", " reg_alpha=0, reg_lambda=1, ...)