Mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2025-10-22 05:20:42 +00:00)

7374d617b4

With total_training_iters = 20 and warmup_iters = 20 (20 = len(train_loader) 4 × n_epochs 5), the denominator total_training_iters - warmup_iters is zero, and a ZeroDivisionError occurs:
```shell
Traceback (most recent call last):
  File "LLMs-from-scratch/ch05/05_bonus_hparam_tuning/hparam_search.py", line 191, in <module>
    train_loss, val_loss = train_model(
                           ^^^^^^^^^^^^
  File "/mnt/raid1/docker/ai/LLMs-from-scratch/ch05/05_bonus_hparam_tuning/hparam_search.py", line 90, in train_model
    progress = (global_step - warmup_iters) / (total_training_iters - warmup_iters)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ZeroDivisionError: division by zero
```
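
One possible way to guard against this case is sketched below. This is an illustration under stated assumptions, not the fix applied in the repository: the function name `lr_at_step` and the parameters `initial_lr`, `peak_lr`, and `min_lr` are hypothetical, while `global_step`, `warmup_iters`, and `total_training_iters` mirror the names in the traceback. The sketch assumes a linear warmup followed by cosine decay and simply holds the peak learning rate when the warmup covers the entire run.

```python
import math


def lr_at_step(global_step, warmup_iters, total_training_iters,
               initial_lr, peak_lr, min_lr):
    """Linear warmup then cosine decay, guarded against a zero-length decay phase.

    Hypothetical helper for illustration; not taken from hparam_search.py.
    """
    if warmup_iters > 0 and global_step < warmup_iters:
        # Linear warmup from initial_lr toward peak_lr
        return initial_lr + (peak_lr - initial_lr) * global_step / warmup_iters

    decay_iters = total_training_iters - warmup_iters
    if decay_iters <= 0:
        # Warmup spans the whole run, so there is no decay phase:
        # hold the peak rate instead of dividing by zero
        return peak_lr

    # Fraction of the decay phase completed, clamped to 1.0
    progress = min((global_step - warmup_iters) / decay_iters, 1.0)
    return min_lr + (peak_lr - min_lr) * 0.5 * (1 + math.cos(math.pi * progress))


# The failing configuration from the traceback: both values are 20
print(lr_at_step(20, 20, 20, initial_lr=1e-5, peak_lr=1e-3, min_lr=1e-6))
```

An alternative design would be to reject or skip any configuration where `warmup_iters >= total_training_iters` before training starts; which behavior is preferable depends on how the search results are meant to be compared.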
		
	
Optimizing Hyperparameters for Pretraining
The hparam_search.py script, based on the extended training function in Appendix D: Adding Bells and Whistles to the Training Loop, is designed to find optimal hyperparameters via grid search.
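
As a rough illustration of what a grid search does, the sketch below enumerates every combination in a small dictionary of candidate values and keeps the one with the lowest score. The grid contents, `toy_val_loss`, and `grid_search` are all hypothetical stand-ins chosen for this example; they are not the contents of the script, and a real search would train and evaluate a model in place of the toy objective.

```python
import itertools

# Hypothetical, deliberately tiny grid; keys and values are illustrative only
example_grid = {
    "peak_lr": [1e-4, 5e-4, 1e-3],
    "weight_decay": [0.0, 0.1],
}


def toy_val_loss(peak_lr, weight_decay):
    # Stand-in for a full training run that returns a validation loss
    return abs(peak_lr - 5e-4) * 1000 + weight_decay


def grid_search(grid, objective):
    """Evaluate every combination in `grid` and return the best one."""
    keys = list(grid)
    best_config, best_score = None, float("inf")
    for values in itertools.product(*(grid[k] for k in keys)):
        config = dict(zip(keys, values))
        score = objective(**config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = grid_search(example_grid, toy_val_loss)
print(best_config, best_score)
```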
Note
This script will take a long time to run. You may want to reduce the number of hyperparameter configurations explored in the HPARAM_GRID dictionary at the top.
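
To see why trimming the grid helps, note that the number of configurations is the product of the number of candidate values per hyperparameter. The sketch below counts configurations for two hypothetical grids; the keys and values are assumptions made up for this example and do not reproduce the script's actual HPARAM_GRID.

```python
import math


def count_configs(grid):
    """Number of combinations a full grid search over `grid` would evaluate."""
    return math.prod(len(values) for values in grid.values())


# Hypothetical grids for illustration only
full_grid = {
    "peak_lr": [1e-4, 5e-4, 1e-3],
    "warmup_iters": [10, 20],
    "weight_decay": [0.0, 0.01, 0.1, 0.2],
}
reduced_grid = {
    "peak_lr": [1e-4, 1e-3],
    "warmup_iters": [10],
    "weight_decay": [0.0, 0.1],
}

print(count_configs(full_grid), count_configs(reduced_grid))
```

Because the count is multiplicative, removing a single value from each list shrinks the search far more than removing several values from one list.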