| `user_id`| string | Yes | The unique identifier assigned to each user. `user_id` must be less than 32 characters and cannot be empty. The following character sets are supported: <br/>- 26 lowercase English letters (a-z)<br/>- 26 uppercase English letters (A-Z)<br/>- 10 digits (0-9)<br/>- "_", "-", "." |
| `id` | string | Yes | The unique identifier assigned to a conversation session. `id` must be less than 32 characters and cannot be empty. The following character sets are supported: <br/>- 26 lowercase English letters (a-z)<br/>- 26 uppercase English letters (A-Z)<br/>- 10 digits (0-9)<br/>- "_", "-", "." |
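A minimal sketch of how a caller might validate an ID against these constraints before creating a conversation session. Only the character-set and length rules come from the table above; the base URL, endpoint path, and authorization header are assumptions for illustration and should be replaced with the values used by your deployment.

```python
import re
import requests  # assumed HTTP client; any client works

# Allowed characters per the table above: a-z, A-Z, 0-9, "_", "-", "."
# {1,31} enforces "cannot be empty" and "less than 32 characters".
ID_PATTERN = re.compile(r"[A-Za-z0-9_.\-]{1,31}")

def new_conversation(base_url: str, api_key: str, user_id: str) -> dict:
    """Create a conversation session for `user_id` (endpoint path assumed)."""
    if not ID_PATTERN.fullmatch(user_id):
        raise ValueError("user_id must be 1-31 characters from a-z, A-Z, 0-9, '_', '-', '.'")
    resp = requests.get(
        f"{base_url}/api/new_conversation",              # assumed path; see 'GET' /new_conversation
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        params={"user_id": user_id},
    )
    resp.raise_for_status()
    return resp.json()
```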
- `message`: All messages in the specified conversation session.
- `role`: `"user"` or `"assistant"`.
- `content`: The text content of the user or assistant message. Citations appear in the form `##0$$`; the number in the middle (0 in this case) is the index of the chunk in `data.reference.chunks` that the citation refers to (see the parsing sketch after this list).
- `user_id`: This is set by the caller.
- `reference`: Each reference corresponds to one of the assistant's answers in `data.message`.
- `img_id`: The image ID of the chunk. This optional field applies only to PDFs, PPTX files, and images. Call ['GET' /document/get/\<id\>](#get-document-content-or-image) to retrieve the image.
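The citation markers are straightforward to resolve programmatically. Below is a minimal sketch, assuming the response has already been parsed into the fields listed above; the function name and regular expression are illustrative, not part of the API.

```python
import re

CITATION = re.compile(r"##(\d+)\$\$")  # e.g. "##0$$" -> chunk index 0

def resolve_citations(content: str, chunks: list) -> list:
    """Return (marker, cited chunk) pairs for one assistant message.

    `content` is a message's `content` field; `chunks` is `data.reference.chunks`.
    """
    cited = []
    for match in CITATION.finditer(content):
        index = int(match.group(1))        # the number inside ##...$$
        if index < len(chunks):
            cited.append((match.group(0), chunks[index]))
    return cited
```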
"content_ltks": "tabl 1:openagi task-solv perform under differ set for three closed-sourc llm . boldfac denot the highest score under each learn schema . metric gpt-3.5-turbo claude-2 gpt-4 zero few zero few zero few clip score 0.0 0.0 0.0 0.2543 0.0 0.3055 bert score 0.1914 0.3820 0.2111 0.5038 0.2076 0.6307 vit score 0.2437 0.7497 0.4082 0.5416 0.5058 0.6480 overal 0.1450 0.3772 0.2064 0.4332 0.2378 0.5281",
"content_with_weight": "<table><caption>Table 1: OpenAGI task-solving performances under different settings for three closed-source LLMs. Boldface denotes the highest score under each learning schema.</caption>\n<tr><throwspan=2>Metrics</th><th>GPT-3.5-turbo</th><th></th><th>Claude-2</th><th>GPT-4</th></tr>\n<tr><th>Zero</th><th>Few</th><th>Zero Few</th><th>Zero Few</th></tr>\n<tr><td>CLIP Score</td><td>0.0</td><td>0.0</td><td>0.0 0.2543</td><td>0.0 0.3055</td></tr>\n<tr><td>BERT Score</td><td>0.1914</td><td>0.3820</td><td>0.2111 0.5038</td><td>0.2076 0.6307</td></tr>\n<tr><td>ViT Score</td><td>0.2437</td><td>0.7497</td><td>0.4082 0.5416</td><td>0.5058 0.6480</td></tr>\n<tr><td>Overall</td><td>0.1450</td><td>0.3772</td><td>0.2064 0.4332</td><td>0.2378 0.5281</td></tr>\n</table>",
"content_ltks": "5.5 experiment analysi the main experiment result are tabul in tab . 1 and 2 , showcas the result for closed-sourc and open-sourc llm , respect . the overal perform is calcul a the averag of cllp 8 bert and vit score . ",
"content_with_weight": "5.5 Experimental Analysis\nThe main experimental results are tabulated in Tab. 1 and 2, showcasing the results for closed-source and open-source LLMs, respectively. The overall performance is calculated as the average of CLlP\n8\nBERT and ViT scores.",
| `conversation_id`| string | Yes | The ID of the conversation session. Call ['GET' /new_conversation](#create-conversation) to retrieve the ID.|
| `messages` | json | Yes | The latest question, in JSON form, such as `[{"role": "user", "content": "How are you doing!"}]`.|
| `quote` | bool | No | Whether to include citations to the retrieved content in the answer. Default: `true`. |
| `stream` | bool | No | Whether to return the answer as a stream. Default: `true`. |
| `doc_ids` | string | No | Comma-separated document IDs, such as `c790da40ea8911ee928e0242ac180005,23dsf34ree928e0242ac180005`. The retrieved content will be confined to these documents. |
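For illustration, here is a minimal sketch of a request built from these parameters. The HTTP method, endpoint path, and authorization header are assumptions for this sketch; only the body fields come from the table above.

```python
import requests  # assumed HTTP client; any client works

def ask(base_url, api_key, conversation_id, question, doc_ids="", quote=True, stream=True):
    """Send the latest question in an existing conversation session (endpoint path assumed)."""
    payload = {
        "conversation_id": conversation_id,
        "messages": [{"role": "user", "content": question}],  # the latest question
        "quote": quote,
        "stream": stream,
    }
    if doc_ids:
        payload["doc_ids"] = doc_ids  # comma-separated document IDs to restrict retrieval
    resp = requests.post(
        f"{base_url}/api/completion",                    # assumed endpoint path
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        json=payload,
        stream=stream,
    )
    resp.raise_for_status()
    if stream:
        # With stream=true the server returns the answer incrementally.
        return (line for line in resp.iter_lines() if line)
    return resp.json()
```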
- `img_id`: The image ID of the chunk. This optional field applies only to PDFs, PPTX files, and images. Call ['GET' /document/get/\<id\>](#get-document-content-or-image) to retrieve the image.
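As an example of retrieving such an image, the sketch below calls the referenced endpoint with a chunk's `img_id`. The base URL and authorization header are assumptions; the path follows the 'GET' /document/get/\<id\> reference above.

```python
import requests  # assumed HTTP client; any client works

def fetch_chunk_image(base_url: str, api_key: str, img_id: str) -> bytes:
    """Download the image behind a chunk's `img_id` via 'GET' /document/get/<id>."""
    resp = requests.get(
        f"{base_url}/document/get/{img_id}",             # path per the reference above
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )
    resp.raise_for_status()
    return resp.content  # raw image bytes
```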
"answer": "The ViT Score for GPT-4 in the zero-shot scenario is 0.5058, and in the few-shot scenario, it is 0.6480. ##0$$",
"reference": {
"chunks": [
{
"chunk_id": "d0bc7892c3ec4aeac071544fd56730a8",
"content_ltks": "tabl 1:openagi task-solv perform under differ set for three closed-sourc llm . boldfac denot the highest score under each learn schema . metric gpt-3.5-turbo claude-2 gpt-4 zero few zero few zero few clip score 0.0 0.0 0.0 0.2543 0.0 0.3055 bert score 0.1914 0.3820 0.2111 0.5038 0.2076 0.6307 vit score 0.2437 0.7497 0.4082 0.5416 0.5058 0.6480 overal 0.1450 0.3772 0.2064 0.4332 0.2378 0.5281",
"content_with_weight": "<table><caption>Table 1: OpenAGI task-solving performances under different settings for three closed-source LLMs. Boldface denotes the highest score under each learning schema.</caption>\n<tr><throwspan=2>Metrics</th><th>GPT-3.5-turbo</th><th></th><th>Claude-2</th><th>GPT-4</th></tr>\n<tr><th>Zero</th><th>Few</th><th>Zero Few</th><th>Zero Few</th></tr>\n<tr><td>CLIP Score</td><td>0.0</td><td>0.0</td><td>0.0 0.2543</td><td>0.0 0.3055</td></tr>\n<tr><td>BERT Score</td><td>0.1914</td><td>0.3820</td><td>0.2111 0.5038</td><td>0.2076 0.6307</td></tr>\n<tr><td>ViT Score</td><td>0.2437</td><td>0.7497</td><td>0.4082 0.5416</td><td>0.5058 0.6480</td></tr>\n<tr><td>Overall</td><td>0.1450</td><td>0.3772</td><td>0.2064 0.4332</td><td>0.2378 0.5281</td></tr>\n</table>",
"content_ltks": "5.5 experiment analysi the main experiment result are tabul in tab . 1 and 2 , showcas the result for closed-sourc and open-sourc llm , respect . the overal perform is calcul a the averag of cllp 8 bert and vit score . here , onli the task descript of the benchmark task are fed into llm(addit inform , such a the input prompt and llm\u2019output , is provid in fig . a.4 and a.5 in supplementari). broadli speak , closed-sourc llm demonstr superior perform on openagi task , with gpt-4 lead the pack under both zero-and few-shot scenario . in the open-sourc categori , llama-2-13b take the lead , consist post top result across variou learn schema--the perform possibl influenc by it larger model size . notabl , open-sourc llm significantli benefit from the tune method , particularli fine-tun and\u2019rltf . these method mark notic enhanc for flan-t5-larg , vicuna-7b , and llama-2-13b when compar with zero-shot and few-shot learn schema . in fact , each of these open-sourc model hit it pinnacl under the rltf approach . conclus , with rltf tune , the perform of llama-2-13b approach that of gpt-3.5 , illustr it potenti .",
"content_with_weight": "5.5 Experimental Analysis\nThe main experimental results are tabulated in Tab. 1 and 2, showcasing the results for closed-source and open-source LLMs, respectively. The overall performance is calculated as the average of CLlP\n8\nBERT and ViT scores. Here, only the task descriptions of the benchmark tasks are fed into LLMs (additional information, such as the input prompt and LLMs\u2019 outputs, is provided in Fig. A.4 and A.5 in supplementary). Broadly speaking, closed-source LLMs demonstrate superior performance on OpenAGI tasks, with GPT-4 leading the pack under both zero- and few-shot scenarios. In the open-source category, LLaMA-2-13B takes the lead, consistently posting top results across various learning schema--the performance possibly influenced by its larger model size. Notably, open-source LLMs significantly benefit from the tuning methods, particularly Fine-tuning and\u2019 RLTF. These methods mark noticeable enhancements for Flan-T5-Large, Vicuna-7B, and LLaMA-2-13B when compared with zero-shot and few-shot learning schema. In fact, each of these open-source models hits its pinnacle under the RLTF approach. Conclusively, with RLTF tuning, the performance of LLaMA-2-13B approaches that of GPT-3.5, illustrating its potential.",
"content": "4.3 ProcessingOverheadof RL-CacheACKNOWLEDGMENTSThis section evaluates how effectively our RL-Cache implemen-tation leverages modern multi-core CPUs and GPUs to keep the per-request neural-net processing overhead low. Figure 14 depictsThis researchwas supported inpart by the Regional Government of Madrid (grant P2018/TCS-4499, EdgeData-CM)andU.S. National Science Foundation (grants CNS-1763617 andCNS-1717179).REFERENCES",