text -> dataset

rasbt 2024-05-08 08:14:03 -05:00
parent 6cc9cf9f4e
commit a31d571625

@@ -519,7 +519,7 @@
 " - 1. truncate all messages to the length of the shortest message in the dataset or batch\n",
 " - 2. pad all messages to the length of the longest message in the dataset or batch\n",
 "\n",
-"- We choose option 2 and pad all messages to the longest message in the text\n",
+"- We choose option 2 and pad all messages to the longest message in the dataset\n",
 "- For that, we use `<|endoftext|>` as a padding token, as discussed in chapter 2"
 ]
 },
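
The padding approach this cell describes (option 2, with `<|endoftext|>` as the padding token) could be sketched roughly as follows. This is a minimal illustration, not the notebook's own code: the helper name `pad_to_longest` and the sample token IDs are hypothetical, and 50256 is the `<|endoftext|>` ID in the GPT-2 tokenizer, which is assumed here to be the tokenizer in use.

```python
# Assumption: GPT-2 tokenizer, where <|endoftext|> has token ID 50256.
PAD_TOKEN_ID = 50256


def pad_to_longest(encoded_messages, pad_token_id=PAD_TOKEN_ID):
    """Pad every token-ID list to the length of the longest one.

    Hypothetical helper for illustration; the input lists are not modified.
    """
    # Length of the longest message in the dataset (0 if the dataset is empty)
    max_length = max((len(ids) for ids in encoded_messages), default=0)

    # Append the padding token until each message reaches max_length
    return [
        ids + [pad_token_id] * (max_length - len(ids))
        for ids in encoded_messages
    ]


# Made-up token IDs standing in for three encoded messages
example = [[1, 2, 3], [4], [5, 6]]
padded = pad_to_longest(example)
```

After padding, every message has the same length, so the messages can be stacked into a single tensor for batching.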