Mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2025-11-04 11:50:14 +00:00)
	6 -> 4
commit 774974de97
parent d8de9377de
@@ -1440,6 +1440,15 @@
     "print(\"Outputs dimensions:\", outputs.shape) # shape: (batch_size, num_tokens, num_classes)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "75430a01-ef9c-426a-aca0-664689c4f461",
+   "metadata": {},
+   "source": [
+    "- As discussed in previous chapters, for each input token, there's one output vector\n",
+    "- Since we fed the model a text sample with 4 input tokens, the output consists of 4 2-dimensional output vectors above"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "7df9144f-6817-4be4-8d4b-5d4dadfe4a9b",
@@ -1453,11 +1462,9 @@
    "id": "e3bb8616-c791-4f5c-bac0-5302f663e46a",
    "metadata": {},
    "source": [
-    "- As discussed in previous chapters, for each input token, there's one output vector\n",
-    "- Since we fed the model a text sample with 6 input tokens, the output consists of 6 2-dimensional output vectors above\n",
     "- In chapter 3, we discussed the attention mechanism, which connects each input token to each other input token\n",
     "- In chapter 3, we then also introduced the causal attention mask that is used in GPT-like models; this causal mask lets a current token only attend to the current and previous token positions\n",
-    "- Based on this causal attention mechanism, the 6th (last) token above contains the most information among all tokens because it's the only token that includes information about all other tokens\n",
+    "- Based on this causal attention mechanism, the 4th (last) token contains the most information among all tokens because it's the only token that includes information about all other tokens\n",
     "- Hence, we are particularly interested in this last token, which we will finetune for the spam classification task"
    ]
   },
@@ -2265,7 +2272,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.6"
+   "version": "3.10.12"
   }
  },
  "nbformat": 4,
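For context on what the markdown cells touched by this commit describe: the classifier's forward pass returns one output vector per input token, with shape (batch_size, num_tokens, num_classes), and because of the causal attention mask only the last token has attended to every other token, so the notebook finetunes that last token's output vector for spam classification. Below is a minimal runnable sketch of that selection step; the tiny stand-in model and the token IDs are illustrative assumptions, not the notebook's actual GPT-2 setup.

import torch
import torch.nn as nn

# Toy stand-in model (assumption): the notebook uses a GPT-2 backbone whose
# output head is replaced by a 2-class layer; an embedding plus a linear
# layer is enough here to reproduce the output shape.
vocab_size, emb_dim, num_classes = 50257, 768, 2
model = nn.Sequential(
    nn.Embedding(vocab_size, emb_dim),
    nn.Linear(emb_dim, num_classes),
)

inputs = torch.tensor([[5211, 345, 423, 640]])  # 4 illustrative token IDs, batch_size=1

with torch.no_grad():
    outputs = model(inputs)
print("Outputs dimensions:", outputs.shape)  # (batch_size, num_tokens, num_classes) = (1, 4, 2)

# In the real GPT model, the causal mask means only the 4th (last) token has
# attended to all other tokens, so its output vector is the one finetuned for
# spam classification:
last_token_logits = outputs[:, -1, :]
print("Last token logits shape:", last_token_logits.shape)  # (1, 2)

Selecting outputs[:, -1, :] rather than, say, averaging over all token positions is exactly the design choice the updated cell motivates: under causal masking, the last position is the only one with a full view of the sequence.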