Julia Kiseleva 19c7da2dd1
Adding first version of AgentEval -- a framework for assessing task utility for LLM-powered applications (#681)
* add agenteval-notebook for math problems and the blog post about it

* update gitignore

* updates to notebook

* adding folder for the logs

* adding math problems logs

* adding folder for alfworld logs

* added limitiation and future work to blog post

* minor edits blog post

* adding changes

* reorg

* modify the main notebook

* modification of the main notebook

* remove wrong notebook

* uploading new notebook

* update agenteval notebook

* change the sample

* Update agenteval_cq_math.ipynb

* adding final changes to notebook

* updated framework picture

* Update index.mdx

* Update index.md

* Add files via upload

* updates to notebool

* revise the blog

* revise the blog

* update the agent img

* revise the blog

* revise the blog

* Excluded model logs from the main branch, you can find them in agenteval branch

* Fixed pre-commit formatting.

* Update website/blog/2023-11-11-AgentEval/index.mdx

Co-authored-by: Chi Wang <wang.chi@microsoft.com>

* update gitignore

* update index.mdx

* update authors.yml by adding Negar and Julia

* remove md file

* remove md file

* update gitignore

* update authors file

* pre-commit checks

* pre-commit checks on authors.yml

* pre-commit checks on authors.yml

* update index.mdx

* update authors.yml by adding Negar and Julia

* updated the blog-post version 1

* updated the blog-post: TL;DR is ready

* updated the blog-post: first part of introduction is ready

* updated figures: typos on fig 1, changed terminology on the fig 2

* upadated the Framework part

* fixed redering issues

* upload zip file instead of single samples

* update prealgebra.zip

* update

* upload

* update z

* update naming

* update zip

* update the agenteval notebook

* update the notebook - removing unmercenary logs

* updated fig 1 and references to it

* updated fig 1

* incorporated PR comments

* merged agenteval branch

* final changes to the blog

* updated taxonomy

* update notebook

* minor changes to the blog

* Fixed formatting

* Update the link in agenteval_cq_math.ipynb

* update the blog and link in notebook

* Update index.mdx

* change folder name

* Changes to be committed:
	modified:    OAI_CONFIG_LIST_sample.txt

* add sample OAI file

* fix the url link to colab and typos

* fix the url link to colab and typos

* add authors

* update profile pic

* "update authors"

* fixing the problem in test_groupchat.py

* update the title lower case

* reverting changes in setup.py

* rerun pre-commit

---------

Co-authored-by: Negar Arabzadeh <ngr.arabzadeh@gmail.com>
Co-authored-by: Julia Kiseleva <jukisele@microsoft.com>
Co-authored-by: afourney <adamfo@microsoft.com>
Co-authored-by: Chi Wang <wang.chi@microsoft.com>
Co-authored-by: Qingyun Wu <qingyun.wu@psu.edu>
2023-11-21 04:07:33 +00:00

25 lines
2.6 KiB
JSON

{
"problem": "Find $24^{-1} \\pmod{11^2}$. That is, find the residue $b$ for which $24b \\equiv 1\\pmod{11^2}$.\n\nExpress your answer as an integer from $0$ to $11^2-1$, inclusive.",
"level": "Level 5",
"type": "Number Theory",
"solution": "Since $5 \\times 24 = 120 = 121 - 1$, it follows that $-5 \\times 24 \\equiv 1 \\pmod{121}$. Adding 121 to $-5$ to make it positive, we find $(-5 + 121) \\times 24 \\equiv 116 \\times 24 \\equiv 1 \\pmod{121}$, so it follows that the modular inverse of $24$ is $\\boxed{116}$ when taken modulo $121$.",
"correct_ans": "116",
"problem_id": "5",
"response_with_ans": "To find the modular inverse of 24 modulo 11^2, we can use the Extended Euclidean Algorithm. Here is a Python function to compute the modular inverse using this algorithm:\n\n```python\ndef mod_inverse(a, m):\n g, x, _ = extended_gcd(a, m)\n if g != 1:\n raise Exception(f\"{a} and {m} are not coprime.\")\n return x % m\n\ndef extended_gcd(a, b):\n if a == 0:\n return b, 0, 1\n else:\n g, x, y = extended_gcd(b % a, a)\n return g, y - (b // a) * x, x\n```\n\nLet's use the above function to find the modular inverse of 24 modulo 11^2:\n\n```python\na = 24\nm = 11**2\nmod_inverse(a, m)\n```\n\nI will execute the above code to find the modular inverse of 24 modulo 11^2.",
"round": 0,
"messages": [
{
"content": "Find $24^{-1} \\pmod{11^2}$. That is, find the residue $b$ for which $24b \\equiv 1\\pmod{11^2}$.\n\nExpress your answer as an integer from $0$ to $11^2-1$, inclusive.",
"role": "user"
},
{
"content": "To find the modular inverse of 24 modulo 11^2, we can use the Extended Euclidean Algorithm. Here is a Python function to compute the modular inverse using this algorithm:\n\n```python\ndef mod_inverse(a, m):\n g, x, _ = extended_gcd(a, m)\n if g != 1:\n raise Exception(f\"{a} and {m} are not coprime.\")\n return x % m\n\ndef extended_gcd(a, b):\n if a == 0:\n return b, 0, 1\n else:\n g, x, y = extended_gcd(b % a, a)\n return g, y - (b // a) * x, x\n```\n\nLet's use the above function to find the modular inverse of 24 modulo 11^2:\n\n```python\na = 24\nm = 11**2\nmod_inverse(a, m)\n```\n\nI will execute the above code to find the modular inverse of 24 modulo 11^2.",
"role": "assistant"
}
],
"time": 13.481226921081543,
"check_result": "let me verify the modular inverse of 24 modulo 11^2.\n\nafter executing the provided python code:\n\n```python\na = 24\nm = 11**2\nmod_inverse(a, m)\n```\n\nthe answer is found to be: 116.\n\nthe answer is correct.",
"is_correct": true,
"trial": -1
}