Mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2025-10-31 18:00:08 +00:00)

# Adding Bells and Whistles to the Training Loop

The main chapter used a relatively simple training function to keep the code readable and to fit Chapter 5 within the page limits. Optionally, we can add a linear learning rate warmup, a cosine decay schedule, and gradient clipping to improve training stability and convergence.

You can find the code for this more sophisticated training function in [Appendix D: Adding Bells and Whistles to the Training Loop](../../appendix-D/01_main-chapter-code/appendix-D.ipynb).
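
To give a rough idea of what these three additions do, here is a minimal, dependency-free sketch. It is not the Appendix D code: the function names (`lr_at_step`, `clip_by_global_norm`) and all parameter names and default values are assumptions chosen for illustration, and the gradients are plain Python floats rather than tensors.

```python
import math


def lr_at_step(step, total_steps, warmup_steps, peak_lr,
               initial_lr=0.0, min_lr=0.0):
    """Learning rate for a 0-indexed step (hypothetical helper).

    Linear warmup from initial_lr to peak_lr over warmup_steps,
    followed by cosine decay from peak_lr down to min_lr.
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly toward the peak learning rate.
        return initial_lr + (peak_lr - initial_lr) * step / warmup_steps

    # Decay phase: fraction of the post-warmup steps completed, in [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Half a cosine wave takes the rate smoothly from peak_lr to min_lr.
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale gradients so their combined L2 norm is at most max_norm.

    Hypothetical helper operating on a flat list of floats.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already within the limit: return an unchanged copy.
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]


if __name__ == "__main__":
    # Print the schedule at a few steps (values here are illustrative only).
    for step in (0, 5, 10, 55, 100):
        print(step, lr_at_step(step, total_steps=100, warmup_steps=10, peak_lr=1e-3))
    print(clip_by_global_norm([3.0, 4.0], max_norm=1.0))
```

In a PyTorch training loop, the computed rate would typically be written into each entry of `optimizer.param_groups` before `optimizer.step()`, and the clipping step is usually handled by `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)` after `loss.backward()`.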
