mirror of https://github.com/rasbt/LLMs-from-scratch.git
synced 2025-11-15 17:44:48 +00:00
renumber exercises
parent c76941e061
commit 040ce578be
@@ -193,12 +193,31 @@
 "There is a 4.3% probability that the word \"pizza\" is sampled if the temperature is set to 5."
 ]
 },
+{
+"cell_type": "markdown",
+"id": "b510ffb0-adca-4d64-8a12-38c4646fd736",
+"metadata": {},
+"source": [
+"# Exercise 5.2: Different temperature and top-k settings"
+]
+},
+{
+"cell_type": "markdown",
+"id": "884990db-d1a6-4c4e-8e36-2c1e4c1e67c7",
+"metadata": {},
+"source": [
+"- Both temperature and top-k settings have to be adjusted based on the individual LLM (a kind of trial and error process until it generates desirable outputs)\n",
+"- The desirable outcomes are also application-specific, though\n",
+" - Lower top-k and temperatures result in less random outcomes, which is desired when creating educational content, technical writing or question answering, data analyses, code generation, and so forth\n",
+" - Higher top-k and temperatures result in more diverse and random outputs, which is more desirable for brainstorming tasks, creative writing, and so forth"
+]
+},
 {
 "cell_type": "markdown",
 "id": "3f35425d-529d-4179-a1c4-63cb8b25b156",
 "metadata": {},
 "source": [
-"# Exercise 5.2: Deterministic behavior in the decoding functions"
+"# Exercise 5.3: Deterministic behavior in the decoding functions"
 ]
 },
 {
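Note on the new Exercise 5.2 cell above: the claim that lower temperature and top-k values give less random output can be checked with a small sketch. The code below is not the notebook's implementation; it is a standard-library-only illustration, and the function name `sample_next_token` and the logit values are hypothetical.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Sample one token index from raw logits (illustrative sketch only)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    indices = list(range(len(logits)))
    if top_k is not None:
        # Top-k filtering: keep only the k highest-scoring candidates.
        indices = sorted(indices, key=lambda i: logits[i], reverse=True)[:top_k]

    # Temperature scaling: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [logits[i] / temperature for i in indices]

    # Numerically stable softmax over the remaining candidates.
    max_scaled = max(scaled)
    weights = [math.exp(s - max_scaled) for s in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    choice = rng.choices(indices, weights=probs, k=1)[0]
    return choice, dict(zip(indices, probs))


# Hypothetical logits for a five-token vocabulary.
logits = [4.51, 0.89, -1.90, 6.75, 1.63]

# Low temperature: almost all probability mass lands on the top token.
_, sharp = sample_next_token(logits, temperature=0.1, top_k=3)

# High temperature: the mass spreads across the three surviving tokens.
_, flat = sample_next_token(logits, temperature=5.0, top_k=3)
```

Comparing `sharp` and `flat` shows the trade-off described in the cell: the same logits yield a near-deterministic choice at temperature 0.1 and a much more even spread at temperature 5, while `top_k` bounds how many candidates can be sampled at all.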
@@ -357,7 +376,7 @@
 "id": "6d0480e5-fb4e-41f8-a161-7ac980d71d47",
 "metadata": {},
 "source": [
-"# Exercise 5.3: Continued pretraining"
+"# Exercise 5.4: Continued pretraining"
 ]
 },
 {
@@ -507,7 +526,7 @@
 "id": "3384e788-f5a1-407c-8dd1-87959b75026d",
 "metadata": {},
 "source": [
-"# Exercise 5.4: Training and validation set losses of the pretrained model"
+"# Exercise 5.5: Training and validation set losses of the pretrained model"
 ]
 },
 {
@@ -774,7 +793,7 @@
 "id": "3a76a1e0-9635-480a-9391-3bda7aea402d",
 "metadata": {},
 "source": [
-"# Exercise 5.5: Trying larger models"
+"# Exercise 5.6: Trying larger models"
 ]
 },
 {