{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Copyright (c) Microsoft Corporation. All rights reserved. \n", "\n", "Licensed under the MIT License.\n", "\n", "# FineTuning NLP Models with FLAML Library\n", "\n", "\n", "## 1. Introduction\n", "\n", "FLAML is a Python library (https://github.com/microsoft/FLAML) designed to automatically produce accurate machine learning models \n", "with low computational cost. It is fast and economical. The simple and lightweight design makes it easy to use and extend, such as adding new learners. FLAML can \n", "- serve as an economical AutoML engine,\n", "- be used as a fast hyperparameter tuning tool, or \n", "- be embedded in self-tuning software that requires low latency & resource in repetitive\n", " tuning tasks.\n", "\n", "In this notebook, we demonstrate how to use the FLAML library to fine tune an NLP language model with hyperparameter search. We have tested this notebook on a server with 4 NVidia V100 GPU (32GB) and 400GB CPU Ram.\n", "\n", "FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the `nlp,ray,notebook` and `blendsearch` option:\n", "```bash\n", "pip install flaml[nlp,ray,notebook,blendsearch];\n", "```" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[33mWARNING: Ignoring invalid distribution -andas (/home/xliu127/.local/lib/python3.8/site-packages)\u001b[0m\u001b[33m\n", "\u001b[0m\u001b[33mWARNING: Ignoring invalid distribution -andas (/home/xliu127/.local/lib/python3.8/site-packages)\u001b[0m\u001b[33m\n", "\u001b[0mRequirement already satisfied: flaml[blendsearch,nlp,notebook,ray] in /data/xliu127/projects/hyperopt/FLAML (1.0.7)\n", "Requirement already satisfied: NumPy>=1.17.0rc1 in /home/xliu127/.local/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (1.21.2)\n", "Requirement already satisfied: lightgbm>=2.3.1 in /home/xliu127/.local/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (3.2.1)\n", "Requirement already satisfied: xgboost<=1.3.3,>=0.90 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (1.3.3)\n", "Requirement already satisfied: scipy>=1.4.1 in /home/xliu127/.local/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (1.6.2)\n", "Requirement already satisfied: pandas>=1.1.4 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (1.4.3)\n", "Requirement already satisfied: scikit-learn>=0.24 in /home/xliu127/.local/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (0.24.1)\n", "Requirement already satisfied: openml==0.10.2 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (0.10.2)\n", "Requirement already satisfied: jupyter in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (1.0.0)\n", "Requirement already satisfied: matplotlib in /home/xliu127/.local/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (3.4.3)\n", "Requirement already satisfied: rgf-python in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (3.12.0)\n", "Requirement already satisfied: catboost>=0.26 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (1.0.6)\n", "Requirement already satisfied: ray[tune]~=1.10 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (1.13.0)\n", "Requirement already satisfied: protobuf<4 in /home/xliu127/.local/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (3.15.8)\n", "Requirement already satisfied: transformers>=4.14 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (4.18.0)\n", "Requirement already satisfied: datasets in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (2.4.0)\n", "Requirement already satisfied: torch in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (1.12.0)\n", "Requirement already satisfied: seqeval in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (1.2.2)\n", "Requirement already satisfied: nltk in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (3.7)\n", "Requirement already satisfied: rouge_score in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (0.1.2)\n", "Requirement already satisfied: optuna==2.8.0 in /home/xliu127/.local/lib/python3.8/site-packages (from flaml[blendsearch,nlp,notebook,ray]) (2.8.0)\n", "Requirement already satisfied: liac-arff>=2.4.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from openml==0.10.2->flaml[blendsearch,nlp,notebook,ray]) (2.5.0)\n", "Requirement already satisfied: xmltodict in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from openml==0.10.2->flaml[blendsearch,nlp,notebook,ray]) (0.13.0)\n", "Requirement already satisfied: python-dateutil in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from openml==0.10.2->flaml[blendsearch,nlp,notebook,ray]) (2.8.2)\n", "Requirement already satisfied: requests in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from openml==0.10.2->flaml[blendsearch,nlp,notebook,ray]) (2.28.1)\n", "Requirement already satisfied: packaging>=20.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (21.3)\n", "Requirement already satisfied: sqlalchemy>=1.1.0 in /home/xliu127/.local/lib/python3.8/site-packages (from optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (1.4.11)\n", "Requirement already satisfied: tqdm in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (4.64.0)\n", "Requirement already satisfied: colorlog in /home/xliu127/.local/lib/python3.8/site-packages (from optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (5.0.1)\n", "Requirement already satisfied: cmaes>=0.8.2 in /home/xliu127/.local/lib/python3.8/site-packages (from optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (0.8.2)\n", "Requirement already satisfied: alembic in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (1.8.1)\n", "Requirement already satisfied: cliff in /home/xliu127/.local/lib/python3.8/site-packages (from optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (3.7.0)\n", "Requirement already satisfied: plotly in /home/xliu127/.local/lib/python3.8/site-packages (from catboost>=0.26->flaml[blendsearch,nlp,notebook,ray]) (4.14.3)\n", "Requirement already satisfied: six in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from catboost>=0.26->flaml[blendsearch,nlp,notebook,ray]) (1.16.0)\n", "Requirement already satisfied: graphviz in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from catboost>=0.26->flaml[blendsearch,nlp,notebook,ray]) (0.20.1)\n", "Requirement already satisfied: wheel in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from lightgbm>=2.3.1->flaml[blendsearch,nlp,notebook,ray]) (0.37.1)\n", "Requirement already satisfied: pytz>=2020.1 in /home/xliu127/.local/lib/python3.8/site-packages (from pandas>=1.1.4->flaml[blendsearch,nlp,notebook,ray]) (2021.1)\n", "Requirement already satisfied: attrs in /home/xliu127/.local/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (20.3.0)\n", "Requirement already satisfied: frozenlist in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (1.3.0)\n", "Requirement already satisfied: virtualenv in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (20.16.2)\n", "Requirement already satisfied: click<=8.0.4,>=7.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (8.0.4)\n", "Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /home/xliu127/.local/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (1.0.2)\n", "Requirement already satisfied: filelock in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (3.7.1)\n", "Requirement already satisfied: grpcio<=1.43.0,>=1.28.1 in /home/xliu127/.local/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (1.40.0)\n", "Requirement already satisfied: pyyaml in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (6.0)\n", "Requirement already satisfied: aiosignal in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (1.2.0)\n", "Requirement already satisfied: jsonschema in /home/xliu127/.local/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (3.2.0)\n", "Requirement already satisfied: tensorboardX>=1.9 in /home/xliu127/.local/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (2.2)\n", "Requirement already satisfied: tabulate in /home/xliu127/.local/lib/python3.8/site-packages (from ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (0.8.9)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: threadpoolctl>=2.0.0 in /home/xliu127/.local/lib/python3.8/site-packages (from scikit-learn>=0.24->flaml[blendsearch,nlp,notebook,ray]) (2.1.0)\n", "Requirement already satisfied: joblib>=0.11 in /home/xliu127/.local/lib/python3.8/site-packages (from scikit-learn>=0.24->flaml[blendsearch,nlp,notebook,ray]) (1.0.1)\n", "Requirement already satisfied: sacremoses in /home/xliu127/.local/lib/python3.8/site-packages (from transformers>=4.14->flaml[blendsearch,nlp,notebook,ray]) (0.0.45)\n", "Requirement already satisfied: regex!=2019.12.17 in /home/xliu127/.local/lib/python3.8/site-packages (from transformers>=4.14->flaml[blendsearch,nlp,notebook,ray]) (2021.8.28)\n", "Requirement already satisfied: tokenizers!=0.11.3,<0.13,>=0.11.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from transformers>=4.14->flaml[blendsearch,nlp,notebook,ray]) (0.12.1)\n", "Requirement already satisfied: huggingface-hub<1.0,>=0.1.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from transformers>=4.14->flaml[blendsearch,nlp,notebook,ray]) (0.8.1)\n", "Requirement already satisfied: xxhash in /home/xliu127/.local/lib/python3.8/site-packages (from datasets->flaml[blendsearch,nlp,notebook,ray]) (2.0.2)\n", "Requirement already satisfied: fsspec[http]>=2021.11.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from datasets->flaml[blendsearch,nlp,notebook,ray]) (2022.7.1)\n", "Requirement already satisfied: multiprocess in /home/xliu127/.local/lib/python3.8/site-packages (from datasets->flaml[blendsearch,nlp,notebook,ray]) (0.70.12.2)\n", "Requirement already satisfied: dill<0.3.6 in /home/xliu127/.local/lib/python3.8/site-packages (from datasets->flaml[blendsearch,nlp,notebook,ray]) (0.3.4)\n", "Requirement already satisfied: aiohttp in /home/xliu127/.local/lib/python3.8/site-packages (from datasets->flaml[blendsearch,nlp,notebook,ray]) (3.7.4.post0)\n", "Requirement already satisfied: responses<0.19 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from datasets->flaml[blendsearch,nlp,notebook,ray]) (0.18.0)\n", "Requirement already satisfied: pyarrow>=6.0.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from datasets->flaml[blendsearch,nlp,notebook,ray]) (8.0.0)\n", "Requirement already satisfied: ipywidgets in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from jupyter->flaml[blendsearch,nlp,notebook,ray]) (7.7.1)\n", "Requirement already satisfied: notebook in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from jupyter->flaml[blendsearch,nlp,notebook,ray]) (6.4.12)\n", "Requirement already satisfied: qtconsole in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from jupyter->flaml[blendsearch,nlp,notebook,ray]) (5.3.1)\n", "Requirement already satisfied: nbconvert in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from jupyter->flaml[blendsearch,nlp,notebook,ray]) (6.5.1)\n", "Requirement already satisfied: jupyter-console in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from jupyter->flaml[blendsearch,nlp,notebook,ray]) (6.4.4)\n", "Requirement already satisfied: ipykernel in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from jupyter->flaml[blendsearch,nlp,notebook,ray]) (6.15.1)\n", "Requirement already satisfied: pyparsing>=2.2.1 in /home/xliu127/.local/lib/python3.8/site-packages (from matplotlib->flaml[blendsearch,nlp,notebook,ray]) (2.4.7)\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /home/xliu127/.local/lib/python3.8/site-packages (from matplotlib->flaml[blendsearch,nlp,notebook,ray]) (1.3.1)\n", "Requirement already satisfied: cycler>=0.10 in /home/xliu127/.local/lib/python3.8/site-packages (from matplotlib->flaml[blendsearch,nlp,notebook,ray]) (0.10.0)\n", "Requirement already satisfied: pillow>=6.2.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from matplotlib->flaml[blendsearch,nlp,notebook,ray]) (9.2.0)\n", "Requirement already satisfied: absl-py in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from rouge_score->flaml[blendsearch,nlp,notebook,ray]) (1.2.0)\n", "Requirement already satisfied: typing-extensions in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from torch->flaml[blendsearch,nlp,notebook,ray]) (4.3.0)\n", "Requirement already satisfied: charset-normalizer<3,>=2 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from requests->openml==0.10.2->flaml[blendsearch,nlp,notebook,ray]) (2.1.0)\n", "Requirement already satisfied: certifi>=2017.4.17 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from requests->openml==0.10.2->flaml[blendsearch,nlp,notebook,ray]) (2022.6.15)\n", "Requirement already satisfied: idna<4,>=2.5 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from requests->openml==0.10.2->flaml[blendsearch,nlp,notebook,ray]) (3.3)\n", "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from requests->openml==0.10.2->flaml[blendsearch,nlp,notebook,ray]) (1.26.11)\n", "Requirement already satisfied: greenlet!=0.4.17 in /home/xliu127/.local/lib/python3.8/site-packages (from sqlalchemy>=1.1.0->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (1.0.0)\n", "Requirement already satisfied: multidict<7.0,>=4.5 in /home/xliu127/.local/lib/python3.8/site-packages (from aiohttp->datasets->flaml[blendsearch,nlp,notebook,ray]) (5.1.0)\n", "Requirement already satisfied: yarl<2.0,>=1.0 in /home/xliu127/.local/lib/python3.8/site-packages (from aiohttp->datasets->flaml[blendsearch,nlp,notebook,ray]) (1.6.3)\n", "Requirement already satisfied: chardet<5.0,>=2.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from aiohttp->datasets->flaml[blendsearch,nlp,notebook,ray]) (4.0.0)\n", "Requirement already satisfied: async-timeout<4.0,>=3.0 in /home/xliu127/.local/lib/python3.8/site-packages (from aiohttp->datasets->flaml[blendsearch,nlp,notebook,ray]) (3.0.1)\n", "Requirement already satisfied: importlib-metadata in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from alembic->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (4.12.0)\n", "Requirement already satisfied: importlib-resources in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from alembic->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (5.9.0)\n", "Requirement already satisfied: Mako in /home/xliu127/.local/lib/python3.8/site-packages (from alembic->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (1.1.4)\n", "Requirement already satisfied: cmd2>=1.0.0 in /home/xliu127/.local/lib/python3.8/site-packages (from cliff->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (1.5.0)\n", "Requirement already satisfied: PrettyTable>=0.7.2 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from cliff->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (3.3.0)\n", "Requirement already satisfied: stevedore>=2.0.1 in /home/xliu127/.local/lib/python3.8/site-packages (from cliff->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (3.3.0)\n", "Requirement already satisfied: pbr!=2.1.0,>=2.0.0 in /home/xliu127/.local/lib/python3.8/site-packages (from cliff->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (5.5.1)\n", "Requirement already satisfied: debugpy>=1.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (1.6.2)\n", "Requirement already satisfied: traitlets>=5.1.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (5.3.0)\n", "Requirement already satisfied: matplotlib-inline>=0.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.1.3)\n", "Requirement already satisfied: psutil in /home/xliu127/.local/lib/python3.8/site-packages (from ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (5.8.0)\n", "Requirement already satisfied: ipython>=7.23.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (8.4.0)\n", "Requirement already satisfied: jupyter-client>=6.1.12 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (7.3.4)\n", "Requirement already satisfied: tornado>=6.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (6.2)\n", "Requirement already satisfied: nest-asyncio in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (1.5.5)\n", "Requirement already satisfied: pyzmq>=17 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (23.2.0)\n", "Requirement already satisfied: jupyterlab-widgets>=1.0.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipywidgets->jupyter->flaml[blendsearch,nlp,notebook,ray]) (1.1.1)\n", "Requirement already satisfied: ipython-genutils~=0.2.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipywidgets->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.2.0)\n", "Requirement already satisfied: widgetsnbextension~=3.6.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipywidgets->jupyter->flaml[blendsearch,nlp,notebook,ray]) (3.6.1)\n", "Requirement already satisfied: pyrsistent>=0.14.0 in /home/xliu127/.local/lib/python3.8/site-packages (from jsonschema->ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (0.17.3)\n", "Requirement already satisfied: setuptools in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from jsonschema->ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (61.2.0)\n", "Requirement already satisfied: pygments in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from jupyter-console->jupyter->flaml[blendsearch,nlp,notebook,ray]) (2.12.0)\n", "Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from jupyter-console->jupyter->flaml[blendsearch,nlp,notebook,ray]) (3.0.30)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: defusedxml in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.7.1)\n", "Requirement already satisfied: pandocfilters>=1.4.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (1.5.0)\n", "Requirement already satisfied: beautifulsoup4 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (4.11.1)\n", "Requirement already satisfied: jupyter-core>=4.7 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (4.11.1)\n", "Requirement already satisfied: lxml in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (4.9.1)\n", "Requirement already satisfied: jupyterlab-pygments in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.2.2)\n", "Requirement already satisfied: bleach in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (5.0.1)\n", "Requirement already satisfied: tinycss2 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (1.1.1)\n", "Requirement already satisfied: mistune<2,>=0.8.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.8.4)\n", "Requirement already satisfied: MarkupSafe>=2.0 in /home/xliu127/.local/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (2.0.1)\n", "Requirement already satisfied: jinja2>=3.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (3.1.2)\n", "Requirement already satisfied: nbclient>=0.5.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.6.6)\n", "Requirement already satisfied: nbformat>=5.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (5.4.0)\n", "Requirement already satisfied: entrypoints>=0.2.2 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.4)\n", "Requirement already satisfied: prometheus-client in /home/xliu127/.local/lib/python3.8/site-packages (from notebook->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.10.1)\n", "Requirement already satisfied: argon2-cffi in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from notebook->jupyter->flaml[blendsearch,nlp,notebook,ray]) (21.3.0)\n", "Requirement already satisfied: Send2Trash>=1.8.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from notebook->jupyter->flaml[blendsearch,nlp,notebook,ray]) (1.8.0)\n", "Requirement already satisfied: terminado>=0.8.3 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from notebook->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.15.0)\n", "Requirement already satisfied: retrying>=1.3.3 in /home/xliu127/.local/lib/python3.8/site-packages (from plotly->catboost>=0.26->flaml[blendsearch,nlp,notebook,ray]) (1.3.3)\n", "Requirement already satisfied: qtpy>=2.0.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from qtconsole->jupyter->flaml[blendsearch,nlp,notebook,ray]) (2.1.0)\n", "Requirement already satisfied: platformdirs<3,>=2 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from virtualenv->ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (2.5.2)\n", "Requirement already satisfied: distlib<1,>=0.3.1 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from virtualenv->ray[tune]~=1.10->flaml[blendsearch,nlp,notebook,ray]) (0.3.5)\n", "Requirement already satisfied: wcwidth>=0.1.7 in /home/xliu127/.local/lib/python3.8/site-packages (from cmd2>=1.0.0->cliff->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (0.2.5)\n", "Requirement already satisfied: colorama>=0.3.7 in /home/xliu127/.local/lib/python3.8/site-packages (from cmd2>=1.0.0->cliff->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (0.4.4)\n", "Requirement already satisfied: pyperclip>=1.6 in /home/xliu127/.local/lib/python3.8/site-packages (from cmd2>=1.0.0->cliff->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (1.8.2)\n", "Requirement already satisfied: stack-data in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipython>=7.23.1->ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.3.0)\n", "Requirement already satisfied: pexpect>4.3 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipython>=7.23.1->ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (4.8.0)\n", "Requirement already satisfied: backcall in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipython>=7.23.1->ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.2.0)\n", "Requirement already satisfied: pickleshare in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipython>=7.23.1->ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.7.5)\n", "Requirement already satisfied: decorator in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipython>=7.23.1->ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (5.1.1)\n", "Requirement already satisfied: jedi>=0.16 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from ipython>=7.23.1->ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.18.1)\n", "Requirement already satisfied: fastjsonschema in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from nbformat>=5.1->nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (2.16.1)\n", "Requirement already satisfied: ptyprocess in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from terminado>=0.8.3->notebook->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.7.0)\n", "Requirement already satisfied: argon2-cffi-bindings in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from argon2-cffi->notebook->jupyter->flaml[blendsearch,nlp,notebook,ray]) (21.2.0)\n", "Requirement already satisfied: soupsieve>1.2 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from beautifulsoup4->nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (2.3.2.post1)\n", "Requirement already satisfied: webencodings in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from bleach->nbconvert->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.5.1)\n", "Requirement already satisfied: zipp>=0.5 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from importlib-metadata->alembic->optuna==2.8.0->flaml[blendsearch,nlp,notebook,ray]) (3.8.1)\n", "Requirement already satisfied: parso<0.9.0,>=0.8.0 in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from jedi>=0.16->ipython>=7.23.1->ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.8.3)\n", "Requirement already satisfied: cffi>=1.0.1 in /home/xliu127/.local/lib/python3.8/site-packages (from argon2-cffi-bindings->argon2-cffi->notebook->jupyter->flaml[blendsearch,nlp,notebook,ray]) (1.14.6)\n", "Requirement already satisfied: executing in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from stack-data->ipython>=7.23.1->ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.9.1)\n", "Requirement already satisfied: asttokens in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from stack-data->ipython>=7.23.1->ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (2.0.7)\n", "Requirement already satisfied: pure-eval in /data/installation/anaconda3/envs/tmp/lib/python3.8/site-packages (from stack-data->ipython>=7.23.1->ipykernel->jupyter->flaml[blendsearch,nlp,notebook,ray]) (0.2.2)\n", "Requirement already satisfied: pycparser in /home/xliu127/.local/lib/python3.8/site-packages (from cffi>=1.0.1->argon2-cffi-bindings->argon2-cffi->notebook->jupyter->flaml[blendsearch,nlp,notebook,ray]) (2.20)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[33mWARNING: Ignoring invalid distribution -andas (/home/xliu127/.local/lib/python3.8/site-packages)\u001b[0m\u001b[33m\n", "\u001b[0m\u001b[33mWARNING: Ignoring invalid distribution -andas (/home/xliu127/.local/lib/python3.8/site-packages)\u001b[0m\u001b[33m\n", "\u001b[0m\u001b[33mWARNING: Ignoring invalid distribution -andas (/home/xliu127/.local/lib/python3.8/site-packages)\u001b[0m\u001b[33m\n", "\u001b[0m\u001b[33mWARNING: Ignoring invalid distribution -andas (/home/xliu127/.local/lib/python3.8/site-packages)\u001b[0m\u001b[33m\n", "\u001b[0mNote: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%pip install flaml[nlp,ray,notebook,blendsearch]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's run some examples. \n", "\n", "Note: throughout this notebook, you may see a few ModuleNotFoundErrors. As long as the cell successfully executes, you can ignore that error." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Sentiment Classification Example\n", "### Load data and preprocess\n", "\n", "The Stanford Sentiment treebank (SST-2) dataset is a dataset for sentiment classification. First, let's load this dataset into pandas dataframes:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Reusing dataset glue (/home/xliu127/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)\n", "Reusing dataset glue (/home/xliu127/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)\n", "Reusing dataset glue (/home/xliu127/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)\n" ] } ], "source": [ "from datasets import load_dataset\n", "\n", "train_dataset = load_dataset(\"glue\", \"sst2\", split=\"train\").to_pandas()\n", "dev_dataset = load_dataset(\"glue\", \"sst2\", split=\"validation\").to_pandas()\n", "test_dataset = load_dataset(\"glue\", \"sst2\", split=\"test\").to_pandas()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Take a look at the first 5 examples of this dataset:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | sentence | \n", "label | \n", "idx | \n", "
---|---|---|---|
0 | \n", "hide new secretions from the parental units | \n", "0 | \n", "0 | \n", "
1 | \n", "contains no wit , only labored gags | \n", "0 | \n", "1 | \n", "
2 | \n", "that loves its characters and communicates som... | \n", "1 | \n", "2 | \n", "
3 | \n", "remains utterly satisfied to remain the same t... | \n", "0 | \n", "3 | \n", "
4 | \n", "on the worst revenge-of-the-nerds clichés the ... | \n", "0 | \n", "4 | \n", "