mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2025-11-14 17:13:39 +00:00)
add clarifying note about GELU
commit 1ffb7500e4
parent 1e61943bf2
@@ -667,7 +667,7 @@
    "metadata": {},
    "source": [
     "- As we can see, ReLU is a piecewise linear function that outputs the input directly if it is positive; otherwise, it outputs zero\n",
-    "- GELU is a smooth, non-linear function that approximates ReLU but with a non-zero gradient for negative values\n",
+    "- GELU is a smooth, non-linear function that approximates ReLU but with a non-zero gradient for negative values (except at approximately -0.75)\n",
     "\n",
     "- Next, let's implement the small neural network module, `FeedForward`, that we will be using in the LLM's transformer block later:"
    ]
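The commit's one substantive change is the parenthetical note that GELU's gradient is non-zero for negative inputs except near -0.75. A minimal sketch (not part of the commit; it assumes PyTorch's torch.nn.functional.gelu, which the notebook uses throughout) can verify this numerically: the derivative of the exact GELU crosses zero at the function's local minimum, around x = -0.7518.

import torch

# Evaluate the exact (erf-based) GELU on a grid of negative inputs
x = torch.linspace(-2.0, 0.0, 2001, requires_grad=True)
y = torch.nn.functional.gelu(x)

# Backpropagate to obtain d(GELU)/dx at every grid point
y.sum().backward()

# The gradient is closest to zero at GELU's local minimum
i = x.grad.abs().argmin()
print(f"gradient ~ 0 at x = {x[i].item():.4f}")  # prints a value near -0.75

This is why the clarification matters: unlike ReLU, whose gradient is exactly zero for all negative inputs, GELU's gradient vanishes at only this single point, so negative activations still receive gradient signal almost everywhere.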