| 
							
							
								 TITC | d16527ddf2 | total_training_iters may equal warmup_iters (#301) With total_training_iters=20 and warmup_iters=20 (20 = len(train_loader) 4 × n_epochs 5), a ZeroDivisionError occurred.
```shell
Traceback (most recent call last):
  File "LLMs-from-scratch/ch05/05_bonus_hparam_tuning/hparam_search.py", line 191, in <module>
    train_loss, val_loss = train_model(
                           ^^^^^^^^^^^^
  File "/mnt/raid1/docker/ai/LLMs-from-scratch/ch05/05_bonus_hparam_tuning/hparam_search.py", line 90, in train_model
    progress = (global_step - warmup_iters) / (total_training_iters - warmup_iters)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ZeroDivisionError: division by zero
``` | 2024-08-06 07:10:05 -05:00 |  | 
			
				
					| 
							
							
								 Daniel Kleine | dcbdc1d2e5 | fixes for code (#206) * updated .gitignore
* removed unused GELU import
* fixed model_configs, fixed all tensors on same device
* removed unused tiktoken
* update
* update hparam search
* remove redundant tokenizer argument
---------
Co-authored-by: rasbt <mail@sebastianraschka.com> | 2024-06-11 20:59:48 -05:00 |  | 
			
				
					| 
							
							
								 rasbt | 6f0a5c320b | fix learning rate scheduler | 2024-06-03 07:06:42 -05:00 |  | 
			
				
					| 
							
							
								 rasbt | b40c260859 | update how to retrieve learning rate | 2024-05-23 17:19:01 -05:00 |  | 
			
				
					| 
							
							
								 Sebastian Raschka | c70ddff558 | Return nan if val loader is empty (#124) | 2024-04-20 08:02:30 -05:00 |  | 
			
				
					| 
							
							
								 Sebastian Raschka | dd51d4ad83 | Make datasets and loaders compatible with multiprocessing (#118) | 2024-04-13 13:57:56 -05:00 |  | 
			
				
					| 
							
							
								 Sebastian Raschka | 2de60d1bfb | Rename variable to context_length to make it easier on readers (#106) * rename to context length
* fix spacing | 2024-04-04 07:27:41 -05:00 |  | 
			
				
					| 
							
							
								 rasbt | 88b2dd780a | make batch loss calculation more efficient | 2024-03-27 07:11:56 -05:00 |  | 
			
				
					| 
							
							
								 rasbt | 3cb5a52a1b | simplify calc_loss_loader | 2024-03-26 20:34:50 -05:00 |  | 
			
				
					| 
							
							
								 Sebastian Raschka | cf39abac04 | Add and link bonus material (#84) | 2024-03-23 07:27:43 -05:00 |  |
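
The ZeroDivisionError reported in #301 comes from the `progress` expression in the traceback: when `total_training_iters` equals `warmup_iters`, the denominator is zero. The sketch below shows one way such a schedule could be guarded. It is an illustration only, not the change made in commit d16527ddf2: the function name `lr_at_step` and the parameters `initial_lr`, `peak_lr`, and `min_lr` are hypothetical, and the linear-warmup plus cosine-decay shape is assumed rather than taken from `hparam_search.py`.

```python
import math


def lr_at_step(global_step, warmup_iters, total_training_iters,
               initial_lr, peak_lr, min_lr):
    """Hypothetical helper: linear warmup followed by cosine decay."""
    if global_step < warmup_iters:
        # Linear warmup from initial_lr toward peak_lr.
        # warmup_iters > global_step >= 0 here, so it cannot be zero.
        return initial_lr + global_step * (peak_lr - initial_lr) / warmup_iters

    decay_iters = total_training_iters - warmup_iters
    if decay_iters <= 0:
        # Warmup covers the whole run (the case from the traceback):
        # there is no decay phase, so stay at the peak learning rate.
        return peak_lr

    # Same expression as in the traceback, now with a nonzero denominator.
    progress = (global_step - warmup_iters) / decay_iters
    return min_lr + (peak_lr - min_lr) * 0.5 * (1 + math.cos(math.pi * progress))


# The configuration from the report: both values are 20.
lr = lr_at_step(global_step=20, warmup_iters=20, total_training_iters=20,
                initial_lr=1e-5, peak_lr=1e-3, min_lr=1e-6)
```

Holding the rate at `peak_lr` when there is no decay phase is a design choice made for this sketch; validating the hyperparameter combination before training starts would be another reasonable option.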