"<a href=\"https://colab.research.google.com/github/microsoft/FLAML/blob/main/notebook/autogen_agentchat_MathChat.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Auto Generated Agent Chat: Using MathChat to Solve Math Problems\n",
"\n",
"MathChat is a conversational framework for math problem solving. In this notebook, we demonstrate how to use MathChat to solve math problems. MathChat uses the `AssistantAgent` and `MathUserProxyAgent`, in a way similar to how `AssistantAgent` and `UserProxyAgent` are used in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/FLAML/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `MathUserProxyAgent` implements a different auto-reply mechanism corresponding to the MathChat prompts. The original implementation and experiments of MathChat are in this [branch](https://github.com/kevin666aa/FLAML/tree/gpt_math_solver/flaml/autogen/math), and you can find more details in our paper [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337).\n",
"\n",
"## Requirements\n",
"\n",
"FLAML requires `Python>=3.8`. To run this notebook example, please install flaml with the [mathchat] option.\n",
"```bash\n",
"pip install flaml[mathchat]\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# %pip install flaml[mathchat]~=2.0.0rc4"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set your API Endpoint\n",
"\n",
"The [`config_list_from_json`](https://microsoft.github.io/FLAML/docs/reference/autogen/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a JSON file.\n",
"It first looks for the environment variable \"OAI_CONFIG_LIST\", which needs to be a valid JSON string. If that variable is not found, it then looks for a JSON file named \"OAI_CONFIG_LIST\". It filters the configs by models (you can filter by other keys as well). Only the gpt-4 and gpt-3.5-turbo models are kept in the list based on the filter condition.\n",
"\n",
"The config list looks like the following:\n",
"```python\n",
"config_list = [\n",
" {\n",
" 'model': 'gpt-4',\n",
" 'api_key': '<your OpenAI API key here>',\n",
" },\n",
" {\n",
" 'model': 'gpt-4',\n",
" 'api_key': '<your Azure OpenAI API key here>',\n",
" 'api_base': '<your Azure OpenAI API base here>',\n",
" 'api_type': 'azure',\n",
" 'api_version': '2023-06-01-preview',\n",
" },\n",
" {\n",
" 'model': 'gpt-3.5-turbo',\n",
" 'api_key': '<your Azure OpenAI API key here>',\n",
" 'api_base': '<your Azure OpenAI API base here>',\n",
" 'api_type': 'azure',\n",
" 'api_version': '2023-06-01-preview',\n",
" },\n",
"]\n",
"```\n",
"\n",
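"As a rough illustration of the lookup order described above, here is a minimal standard-library sketch. The helper name `load_config_list` and its parameters are hypothetical; this is not FLAML's implementation of `config_list_from_json`, only an approximation of the behavior described in this section.\n",
"\n",
"```python\n",
"import json\n",
"import os\n",
"\n",
"\n",
"def load_config_list(name='OAI_CONFIG_LIST', models=('gpt-4', 'gpt-3.5-turbo')):\n",
"    # Hypothetical helper (an assumption for illustration, not part of FLAML).\n",
"    # 1. Prefer the environment variable, which must hold a valid JSON string.\n",
"    raw = os.environ.get(name)\n",
"    if raw is None:\n",
"        # 2. Otherwise fall back to a JSON file with the same name.\n",
"        with open(name) as f:\n",
"            raw = f.read()\n",
"    configs = json.loads(raw)\n",
"    # 3. Keep only the configs whose model passes the filter.\n",
"    return [c for c in configs if c.get('model') in models]\n",
"```\n",
"\n",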
"If you open this notebook in Colab, you can upload your files by clicking the file icon on the left panel and then choosing the \"upload file\" icon.\n",
"\n",
"You can set the value of config_list in other ways you prefer, e.g., loading from a YAML file."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Construct agents for MathChat\n",
"\n",
"We start by initializing the `AssistantAgent` and `MathUserProxyAgent`. The system message needs to be set to \"You are a helpful assistant.\" for MathChat. The detailed instructions are given in the user message. Later we will use `MathUserProxyAgent.generate_init_message` to combine the instructions and a math problem into an initial message to be sent to the LLM assistant."
"\\end{align*} This inequality is satisfied if and only if $(x+14)$ and $(x+3)$ are either both positive or both negative. Both factors are positive for $x>-3$ and both factors are negative for $x<-14$. When $-14<x<-3$, one factor is positive and the other negative, so their product is negative. Therefore, the range of $x$ that satisfies the inequality is $ \\boxed{(-\\infty, -14)\\cup(-3,\\infty)} $."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mathproxyagent (to assistant):\n",
"\n",
"Let's use Python to solve a math problem.\n",
"\n",
"Query requirements:\n",
"You should always use the 'print' function for the output and use fractions/radical forms instead of decimals.\n",
"You can use packages like sympy to help you.\n",
"You must follow the formats below to write your code:\n",
"```python\n",
"# your code\n",
"```\n",
"\n",
"First state the key idea to solve the problem. You may choose from three ways to solve the problem:\n",
"Case 1: If the problem can be solved with Python code directly, please write a program to solve it. You can enumerate all possible arrangements if needed.\n",
"Case 2: If the problem is mostly reasoning, you can solve it by yourself directly.\n",
"Case 3: If the problem cannot be handled in the above two ways, please follow this process:\n",
"1. Solve the problem step by step (do not over-divide the steps).\n",
"2. Take out any queries that can be asked through Python (for example, any calculations or equations that can be calculated).\n",
"3. Wait for me to give the results.\n",
"4. Continue if you think the result is correct. If the result is invalid or unexpected, please correct your query or reasoning.\n",
"\n",
"After all the queries are run and you get the answer, put the answer in \\boxed{}.\n",
"\n",
"Problem:\n",
"Find all $x$ that satisfy the inequality $(2x+10)(x+3)<(3x+9)(x+8)$. Express your answer in interval notation.\n",
"This problem can be solved by first simplifying the inequality, finding the critical points, and then testing points from each interval defined by the critical points to find where the inequality holds true. We can use Python with the sympy package for these calculations. Here is how:\n",
"\n",
"Case 1: Solving with Python directly\n",
"\n",
"We will solve this problem in the following steps:\n",
"1. First, we simplify the inequality by expanding both sides and bringing all terms to one side.\n",
"2. Second, we find the critical points by solving the simplified equation.\n",
"3. Third, we test the sign of the simplified function with a number in each interval defined by the critical points.\n",
"4. Finally, we collect all the intervals where the inequality is satisfied.\n",
"Great! So the solution to the inequality $(2x+10)(x+3)<(3x+9)(x+8)$ is given by the union of the two intervals where the inequality holds true. In interval notation, we can express the solution as:\n",
"# Given a math problem, we use the mathproxyagent to generate a prompt to be sent to the assistant as the initial message.\n",
"# The assistant receives the message and generates a response. The response is sent back to the mathproxyagent for processing.\n",
"# The conversation continues until the termination condition is met. In MathChat, the termination condition is the detection of \"\\boxed{}\" in the response.\n",
"math_problem = \"Find all $x$ that satisfy the inequality $(2x+10)(x+3)<(3x+9)(x+8)$. Express your answer in interval notation.\"\n",
"Problem: For what negative value of $k$ is there exactly one solution to the system of equations \\begin{align*}\n",
"y &= 2x^2 + kx + 6 \\\\\n",
"y &= -x + 4?\n",
"\\end{align*}\n",
"\n",
"Correct Solution: Setting the two expressions for $y$ equal to each other, it follows that $2x^2 + kx + 6 = -x + 4$. Re-arranging, $2x^2 + (k+1)x + 2 = 0$. For there to be exactly one solution for $x$, then the discriminant of the given quadratic must be equal to zero. Thus, $(k+1)^2 - 4 \\cdot 2 \\cdot 2 = (k+1)^2 - 16 = 0$, so $k+1 = \\pm 4$. Taking the negative value, $k = \\boxed{-5}$."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mathproxyagent (to assistant):\n",
"\n",
"Let's use Python to solve a math problem.\n",
"\n",
"Query requirements:\n",
"You should always use the 'print' function for the output and use fractions/radical forms instead of decimals.\n",
"You can use packages like sympy to help you.\n",
"You must follow the formats below to write your code:\n",
"```python\n",
"# your code\n",
"```\n",
"\n",
"First state the key idea to solve the problem. You may choose from three ways to solve the problem:\n",
"Case 1: If the problem can be solved with Python code directly, please write a program to solve it. You can enumerate all possible arrangements if needed.\n",
"Case 2: If the problem is mostly reasoning, you can solve it by yourself directly.\n",
"Case 3: If the problem cannot be handled in the above two ways, please follow this process:\n",
"1. Solve the problem step by step (do not over-divide the steps).\n",
"2. Take out any queries that can be asked through Python (for example, any calculations or equations that can be calculated).\n",
"3. Wait for me to give the results.\n",
"4. Continue if you think the result is correct. If the result is invalid or unexpected, please correct your query or reasoning.\n",
"\n",
"After all the queries are run and you get the answer, put the answer in \\boxed{}.\n",
"\n",
"Problem:\n",
"For what negative value of $k$ is there exactly one solution to the system of equations \\begin{align*}\n",
"The key idea to solve this problem is to set the two equations equal to each other and solve for x. Then we substitute x back into one of the equations to get the quadratic equation in terms of y. We set the discriminant of this quadratic equation equals to zero, because when discriminant (D = b² - 4ac) equals to zero in a quadratic equation, the equation has exactly one root. Solve the equation for k.\n",
"\n",
"```python\n",
"from sympy import symbols, Eq, solve\n",
"\n",
"x, k = symbols('x k')\n",
"\n",
"# The two Y's are equal, so we set the two equations equal to each other\n",
"eq1 = Eq(2*x**2 + k*x + 6, -x + 4)\n",
"solve_for_x = solve(eq1, x)\n",
"\n",
"# Insert x solutions back to the original equation\n",
"Apologies for the oversight in the code. It seems I made a mistake in checking for the negative value of k. Let's modify the code to fix the issue.\n",
"\n",
"```python\n",
"from sympy import symbols, Eq, solve\n",
"\n",
"x, k = symbols('x k')\n",
"\n",
"# The two Y's are equal, so we set the two equations equal to each other\n",
"eq1 = Eq(2*x**2 + k*x + 6, -x + 4)\n",
"solve_for_x = solve(eq1, x)\n",
"\n",
"# Insert x solutions back to the original equation\n",
"eq2 = Eq(2*x + k, -x + 4)\n",
"result = []\n",
"\n",
"for solution in solve_for_x:\n",
" temp = eq2.subs(x, solution)\n",
" sol = solve(temp, k)\n",
" if sol != []:\n",
" result.extend(sol)\n",
"\n",
"# The solution should be negative\n",
"negative_k = [s for s in result if s<0]\n",
"\n",
"print(negative_k)\n",
"```\n",
"This code should now properly identify the negative value of k for which there is exactly one solution to the system of equations.\n",
"math_problem = \"For what negative value of $k$ is there exactly one solution to the system of equations \\\\begin{align*}\\ny &= 2x^2 + kx + 6 \\\\\\\\\\ny &= -x + 4?\\n\\\\end{align*}\"\n",
"Problem: Find all positive integer values of $c$ such that the equation $x^2-7x+c=0$ only has roots that are real and rational. Express them in decreasing order, separated by commas.\n",
"\n",
"Correct Solution: For the roots to be real and rational, the discriminant must be a perfect square. Therefore, $(-7)^2-4 \\cdot 1 \\cdot c = 49-4c$ must be a perfect square. The only positive perfect squares less than 49 are $1$, $4$, $9$, $16$, $25$, and $36$. The perfect squares that give an integer value of $c$ are $1$, $9$, and $25$. Thus, we have the equations $49-4c=1$, $49-4c=9$, and $49-4c=25$. Solving, we get that the positive integer values of $c$ are $\\boxed{12, 10, 6}$."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mathproxyagent (to assistant):\n",
"\n",
"Let's use Python to solve a math problem.\n",
"\n",
"Query requirements:\n",
"You should always use the 'print' function for the output and use fractions/radical forms instead of decimals.\n",
"You can use packages like sympy to help you.\n",
"You must follow the formats below to write your code:\n",
"```python\n",
"# your code\n",
"```\n",
"\n",
"First state the key idea to solve the problem. You may choose from three ways to solve the problem:\n",
"Case 1: If the problem can be solved with Python code directly, please write a program to solve it. You can enumerate all possible arrangements if needed.\n",
"Case 2: If the problem is mostly reasoning, you can solve it by yourself directly.\n",
"Case 3: If the problem cannot be handled in the above two ways, please follow this process:\n",
"1. Solve the problem step by step (do not over-divide the steps).\n",
"2. Take out any queries that can be asked through Python (for example, any calculations or equations that can be calculated).\n",
"3. Wait for me to give the results.\n",
"4. Continue if you think the result is correct. If the result is invalid or unexpected, please correct your query or reasoning.\n",
"\n",
"After all the queries are run and you get the answer, put the answer in \\boxed{}.\n",
"\n",
"Problem:\n",
"Find all positive integer values of $c$ such that the equation $x^2-7x+c=0$ only has roots that are real and rational. Express them in decreasing order, separated by commas.\n",
"To ensure the roots of the quadratic equation $x^2 - 7x + c = 0$ are real and rational, we will use two key ideas:\n",
"\n",
"1. The discriminant of the quadratic equation, $\\Delta = b^2 - 4ac$, must be non-negative so that the roots are real. \n",
"2. The discriminant must be a perfect square so that the roots are rational.\n",
"\n",
"We can find the values of $c$ using Python. First, we'll calculate the discriminant and check the conditions for each value of c. We only need to consider positive integers for $c$ until we reach a value where the discriminant becomes negative since after that point, there will be no real roots.\n",
"\n",
"```python\n",
"from sympy import *\n",
"\n",
"c_values = []\n",
"c = 1\n",
"\n",
"while True:\n",
" discriminant = Rational(49 - 4 * c)\n",
" if discriminant < 0:\n",
" break\n",
" if int(discriminant.sqrt())**2 == discriminant:\n",
"math_problem = \"Find all positive integer values of $c$ such that the equation $x^2-7x+c=0$ only has roots that are real and rational. Express them in decreasing order, separated by commas.\"\n",
"MathChat allows different prompts that instruct the assistant to solve the problem.\n",
"\n",
"Check out `MathUserProxyAgent.generate_init_message(problem, prompt_type='default', customized_prompt=None)`:\n",
"- You may choose from `['default', 'python', 'two_tools']` for parameter `prompt_type`. We include two more prompts in the paper: \n",
" 1. `'python'` is a simplified prompt from the default prompt that uses Python only. \n",
" 2. `'two_tools'` further allows the selection of Python or Wolfram Alpha based on this simplified `python` prompt. Note that this option requires a Wolfram Alpha API key, which must be saved in `wolfram.txt`.\n",
"\n",
"- You can also input your customized prompt if needed: `mathproxyagent.generate_init_message(problem, customized_prompt=\"Your customized prompt\")`. Since this mathproxyagent detects '\\boxed{}' as the termination signal, you need to have a similar termination sentence in the prompt: \"If you get the answer, put the answer in \\\\boxed{}.\". If a customized prompt is provided, the `prompt_type` will be ignored.\n",
"\n",
"\n",
"### Example 4 (Use the \"python\" prompt):\n",
"\n",
"Problem: If $725x + 727y = 1500$ and $729x+ 731y = 1508$, what is the value of $x - y$ ?\n",
"\n",
"Correct Solution: Subtracting the two equations gives: \n",
"\\begin{align*}\n",
"(729x+731y)-(725x+727y) &= 1508-1500\\\\\n",
"\\Rightarrow\\qquad 4x+4y &= 8\\\\\n",
"\\Rightarrow\\qquad x+y &= 2.\n",
"\\end{align*}\n",
"\n",
"Multiplying this equation by 725 and subtracting this equation from $725x+727y=1500$ gives \\begin{align*}\n",
"Let's first solve the two equations as a system of linear equations. We can do this by either substitution or elimination method. Let's use the elimination method.\n",
"\n",
"We will multiply both equations by necessary multiples such that the coefficients of either x or y will cancel out when we subtract/add the two equations.\n",
"\n",
"We'll multiply the first equation by 729 and the second equation by -725. This will allow us to cancel out the x terms when we add the two equations.\n",
"\n",
"```python\n",
"from sympy import symbols, Eq, solve\n",
"\n",
"x, y = symbols('x y')\n",
"equation1 = Eq(725 * x + 727 * y, 1500)\n",
"equation2 = Eq(729 * x + 731 * y, 1508)\n",
"\n",
"# Multiply equations with necessary multiples\n",
"Great, we have found the solution for y: $y = 25$. Now, let's substitute the y value back into one of the original equations to solve for x. We can use the first equation for this.\n",
"Problem: Find all numbers $a$ for which the graph of $y=x^2+a$ and the graph of $y=ax$ intersect. Express your answer in interval notation.\n",
"\n",
"\n",
"Correct Solution: If these two graphs intersect then the points of intersection occur when \\[x^2+a=ax,\\] or \\[x^2-ax+a=0.\\] This quadratic has solutions exactly when the discriminant is nonnegative: \\[(-a)^2-4\\cdot1\\cdot a\\geq0.\\] This simplifies to \\[a(a-4)\\geq0.\\] This quadratic (in $a$) is nonnegative when $a$ and $a-4$ are either both $\\ge 0$ or both $\\le 0$. This is true for $a$ in $$(-\\infty,0]\\cup[4,\\infty).$$ Therefore the line and quadratic intersect exactly when $a$ is in $\\boxed{(-\\infty,0]\\cup[4,\\infty)}$.\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"mathproxyagent (to assistant):\n",
"\n",
"Let's use two tools (Python and Wolfram alpha) to solve a math problem.\n",
"\n",
"Query requirements:\n",
"You must follow the formats below to write your query:\n",
"For Wolfram Alpha:\n",
"```wolfram\n",
"# one wolfram query\n",
"```\n",
"For Python:\n",
"```python\n",
"# your code\n",
"```\n",
"When using Python, you should always use the 'print' function for the output and use fractions/radical forms instead of decimals. You can use packages like sympy to help you.\n",
"When using wolfram, give one query in each code block.\n",
"\n",
"Please follow this process:\n",
"1. Solve the problem step by step (do not over-divide the steps).\n",
"2. Take out any queries that can be asked through Python or Wolfram Alpha, select the most suitable tool to be used (for example, any calculations or equations that can be calculated).\n",
"3. Wait for me to give the results.\n",
"4. Continue if you think the result is correct. If the result is invalid or unexpected, please correct your query or reasoning.\n",
"\n",
"After all the queries are run and you get the answer, put the final answer in \\boxed{}.\n",
"\n",
"Problem: Find all numbers $a$ for which the graph of $y=x^2+a$ and the graph of $y=ax$ intersect. Express your answer in interval notation.\n",
"The inequality $a \\le 0$ represents the interval $(-\\infty, 0]$. \n",
"\n",
"The inequality $a - 4 \\ge 0$ can be rewritten as $a \\ge 4$, which represents the interval $[4, \\infty)$. \n",
"\n",
"Since we are looking for the values of $a$ where the graphs intersect, we need to consider both intervals. Therefore, the final answer would be the union of the two intervals:\n",
"# We set the prompt_type to \"two_tools\", which allows the assistant to select Wolfram Alpha when necessary.\n",
"math_problem = \"Find all numbers $a$ for which the graph of $y=x^2+a$ and the graph of $y=ax$ intersect. Express your answer in interval notation.\"\n",