{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Copyright (c) Microsoft Corporation. All rights reserved. \n", "\n", "Licensed under the MIT License.\n", "\n", "# Use FLAML to Optimize Code Generation Performance\n", "\n", "In this notebook, we optimize OpenAI models for code generation. We use [the HumanEval benchmark](https://huggingface.co/datasets/openai_humaneval) released by OpenAI for synthesizing programs from docstrings. \n", "\n", "## Requirements\n", "\n", "FLAML requires `Python>=3.7`. To run this notebook example, please install flaml with the [autogen] option:\n", "```bash\n", "pip install flaml[autogen]==1.2.2\n", "```" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2023-02-24T23:25:36.910966Z", "iopub.status.busy": "2023-02-24T23:25:36.910473Z", "iopub.status.idle": "2023-02-24T23:25:36.914554Z", "shell.execute_reply": "2023-02-24T23:25:36.914030Z" } }, "outputs": [], "source": [ "# %pip install flaml[autogen]==1.2.2 datasets" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Set your OpenAI key:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2023-02-24T23:25:36.917301Z", "iopub.status.busy": "2023-02-24T23:25:36.917011Z", "iopub.status.idle": "2023-02-24T23:25:36.923156Z", "shell.execute_reply": "2023-02-24T23:25:36.922619Z" } }, "outputs": [], "source": [ "import os\n", "\n", "if \"OPENAI_API_KEY\" not in os.environ:\n", " os.environ[\"OPENAI_API_KEY\"] = \"\"" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "If you use Azure OpenAI, uncomment the following:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2023-02-24T23:25:36.925804Z", "iopub.status.busy": "2023-02-24T23:25:36.925423Z", "iopub.status.idle": "2023-02-24T23:25:36.928191Z", 
"shell.execute_reply": "2023-02-24T23:25:36.927673Z" } }, "outputs": [], "source": [ "# import openai\n", "# openai.api_type = \"azure\"\n", "# openai.api_base = \"https://.openai.azure.com/\"\n", "# openai.api_version = \"2023-03-15-preview\" # change if necessary" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Load dataset\n", "\n", "First, we load the HumanEval dataset, which contains 164 examples. In each example, \"prompt\" is the prompt string for eliciting code generation (renamed to \"definition\"), \"test\" is the Python unit-test code for the example, and \"entry_point\" is the name of the function to be tested." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2023-02-24T23:25:36.931255Z", "iopub.status.busy": "2023-02-24T23:25:36.930838Z", "iopub.status.idle": "2023-02-24T23:25:39.148799Z", "shell.execute_reply": "2023-02-24T23:25:39.148113Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Found cached dataset openai_humaneval (/home/vscode/.cache/huggingface/datasets/openai_humaneval/openai_humaneval/1.0.0/2955cebd73602e828fa8c0a424c594e5fab4ec863b316ca98f3d8fdb6a626e75)\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "1fdc8853bf2a4aecaa2cd024ad99b5a2", "version_major": 2, "version_minor": 0 }, "text/plain": [ " 0%| | 0/1 [00:00