mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2025-08-16 20:51:51 +00:00)
Note about warm-up steps
parent 81eed9afe2
commit f03f545a17
@@ -203,7 +203,7 @@
 "id": "5bf3a8da-abc4-4b80-a5d8-f1cc1c7cc5f3",
 "metadata": {},
 "source": [
-"- Typically, the number of warmup steps is between 0.1% to 10% of the total number of steps\n",
+"- Typically, the number of warmup steps is between 0.1% to 20% of the total number of steps\n",
 "- We can compute the increment as the difference between the `peak_lr` and `initial_lr` divided by the number of warmup steps"
 ]
 },
@@ -227,6 +227,14 @@
 "print(warmup_steps)"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "4b6bbdc8-0104-459e-a7ed-b08be8578709",
+"metadata": {},
+"source": [
+"- Note that the print book accidentally includes a leftover code line, `warmup_steps = 20`, which is not used and can be safely ignored"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 6,
@@ -809,7 +817,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.11.4"
+"version": "3.10.6"
 }
 },
 "nbformat": 4,