Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							8d79fb13b0 
							
						 
					 
					
						
						
							
							Update README.md  
						
						
						
						
					 
					
						2024-08-10 07:54:51 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
						
						
							
						
						
							c91999b9f4 
							
						 
					 
					
						
						
							
							fixed bash command ( #305 )  
						
						
						
						
						Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com> 
						
						
					 
					
						2024-08-09 21:29:04 -05:00 
						 
				 
			
				
					
						
							
							
								TITC 
							
						 
					 
					
						
						
						
						
							
						
						
							3067ed83dc 
							
						 
					 
					
						
						
							
							remove all non-English texts and notice ( #304 )  
						
						
						
						
						* remove all non-English texts and notice
1. About 18 GB of text remains after filtering with `is_english`.
2. Remove the notice using gutenberg's `strip_headers`.
3. After re-running get_data.py, all data appears to be under the `gutenberg/data/.mirror` folder.
* some improvements
* update readme
---------
Co-authored-by: rasbt <mail@sebastianraschka.com> 
						
						
					 
					
						2024-08-09 17:09:14 -05:00 
						 
				 
			
				
					
						
							
							
								TITC 
							
						 
					 
					
						
						
						
						
							
						
						
							7374d617b4 
							
						 
					 
					
						
						
							
							total training iters may equal to warmup_iters ( #301 )  
						
						
						
						
						With `len(train_loader)` = 4 and `n_epochs` = 5, `total_training_iters` is 4 × 5 = 20. When `warmup_iters` is also 20, the denominator `total_training_iters - warmup_iters` is zero and a ZeroDivisionError occurs:
```shell
Traceback (most recent call last):
  File "LLMs-from-scratch/ch05/05_bonus_hparam_tuning/hparam_search.py", line 191, in <module>
    train_loss, val_loss = train_model(
                           ^^^^^^^^^^^^
  File "/mnt/raid1/docker/ai/LLMs-from-scratch/ch05/05_bonus_hparam_tuning/hparam_search.py", line 90, in train_model
    progress = (global_step - warmup_iters) / (total_training_iters - warmup_iters)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ZeroDivisionError: division by zero
``` 
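One way to guard against this edge case is sketched below. This is a minimal illustration, not the change that was actually merged: the function name `lr_at_step` and the parameters `initial_lr`, `peak_lr`, and `min_lr` are hypothetical, and a linear warmup followed by cosine decay is assumed. Only `global_step`, `warmup_iters`, and `total_training_iters` come from the traceback above.

```python
import math


def lr_at_step(global_step, warmup_iters, total_training_iters,
               initial_lr=1e-5, peak_lr=1e-3, min_lr=1e-5):
    """Hypothetical helper: linear warmup, then cosine decay, with a zero guard."""
    if global_step < warmup_iters:
        # Linear warmup from initial_lr toward peak_lr.
        # warmup_iters > 0 is guaranteed here because global_step >= 0.
        return initial_lr + (peak_lr - initial_lr) * global_step / warmup_iters

    decay_iters = total_training_iters - warmup_iters
    if decay_iters <= 0:
        # Warmup covers the whole run, so there is no decay phase to
        # interpolate over; hold the peak rate instead of dividing by zero.
        return peak_lr

    # Same expression as in the traceback, now with a nonzero denominator.
    progress = (global_step - warmup_iters) / decay_iters
    return min_lr + (peak_lr - min_lr) * 0.5 * (1 + math.cos(math.pi * progress))


# The failing configuration from the report: 4 batches * 5 epochs = 20 steps,
# all of them warmup.
print(lr_at_step(global_step=20, warmup_iters=20, total_training_iters=20))
```

Holding the peak rate when no decay steps remain is only one possible choice; clamping `warmup_iters` to be smaller than `total_training_iters` before training starts would avoid the branch altogether.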
						
						
					 
					
						2024-08-06 07:10:05 -05:00 
						 
				 
			
				
					
						
							
							
								SSebo 
							
						 
					 
					
						
						
						
						
							
						
						
							22681878a8 
							
						 
					 
					
						
						
							
							Update ch05.ipynb ( #297 )  
						
						
						
						
						typo 
						
						
					 
					
						2024-08-05 07:12:27 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							6dd8666d9c 
							
						 
					 
					
						
						
							
							Test code in pytorch 2.4 ( #285 )  
						
						
						
						
						* test code in pytorch 2.4
* update 
						
						
					 
					
						2024-07-24 21:53:41 -05:00 
						 
				 
			
				
					
						
							
							
								TITC 
							
						 
					 
					
						
						
						
						
							
						
						
							bce3a708f9 
							
						 
					 
					
						
						
							
							47,678-->48,725 ( #281 )  
						
						
						
						
					 
					
						2024-07-22 21:24:57 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							d0f3b034d8 
							
						 
					 
					
						
						
							
							Add download help message ( #274 )  
						
						
						
						
					 
					
						2024-07-19 08:29:29 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							5e24a042c1 
							
						 
					 
					
						
						
							
							add links to summary sections  
						
						
						
						
					 
					
						2024-06-29 07:33:26 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							0f43890a15 
							
						 
					 
					
						
						
							
							refresh cross entropy figure  
						
						
						
						
					 
					
						2024-06-29 07:22:23 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
						
						
							
						
						
							fb4e37ae15 
							
						 
					 
					
						
						
							
							fixed minor issues ( #252 )  
						
						
						
						
						* fixed typo
* fixed var name in md text 
						
						
					 
					
						2024-06-29 06:38:25 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
						
						
							
						
						
							7a54d383e7 
							
						 
					 
					
						
						
							
							minor fixes ( #246 )  
						
						
						
						
						* removed duplicated white spaces
* Update ch07/01_main-chapter-code/ch07.ipynb
* Update ch07/05_dataset-generation/llama3-ollama.ipynb
* removed duplicated white spaces
* fixed title again
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com> 
						
						
					 
					
						2024-06-25 17:30:30 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							def84a039c 
							
						 
					 
					
						
						
							
							Show epochs as integers on x-axis ( #241 )  
						
						
						
						
						* Show epochs as integers on x-axis
* Update ch07/01_main-chapter-code/previous_chapters.py
* remove extra s
* modify exercise plots
* update chapter 7 plot
* resave ch07 for better file diff 
						
						
					 
					
						2024-06-23 07:41:25 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							0026e6206b 
							
						 
					 
					
						
						
							
							update generate to match output in main chapter  
						
						
						
						
					 
					
						2024-06-22 12:01:51 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
						
						
							
						
						
							7e0c5c0975 
							
						 
					 
					
						
						
							
							minor fixes ( #235 )  
						
						
						
						
						* removed unnecessary imports
* removed unnecessary semicolons
* format markdown
* format markdown
* fixed markdown 
						
						
					 
					
						2024-06-21 08:40:54 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							e1046746e8 
							
						 
					 
					
						
						
							
							remove redundant line  
						
						
						
						
					 
					
						2024-06-20 10:12:28 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							cb194fa8fa 
							
						 
					 
					
						
						
							
							fix device loading  
						
						
						
						
					 
					
						2024-06-20 08:07:00 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							c1f9361428 
							
						 
					 
					
						
						
							
							add main and optional sections  
						
						
						
						
					 
					
						2024-06-19 17:48:25 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							eb1da36e98 
							
						 
					 
					
						
						
							
							note about dropout  
						
						
						
						
					 
					
						2024-06-19 17:37:48 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
						
						
							
						
						
							73be1c592f 
							
						 
					 
					
						
						
							
							fixed num_workers ( #229 )  
						
						
						
						
						* fixed num_workers
* ch06 & ch07: added num_workers to create_dataloader_v1 
						
						
					 
					
						2024-06-19 17:36:46 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							fcf8bcab0d 
							
						 
					 
					
						
						
							
							Remove duplicated cell ( #212 )  
						
						
						
						
						* add a suggestion since code snippet has been repeated.
* remove duplicated cell
---------
Co-authored-by: Shuyib <benmainye@gmail.com> 
						
						
					 
					
						2024-06-15 12:48:34 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							a796b9d657 
							
						 
					 
					
						
						
							
							explain truncation in ch05  
						
						
						
						
					 
					
						2024-06-12 19:50:11 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							8d3e58ff81 
							
						 
					 
					
						
						
							
							check gpt files ( #208 )  
						
						
						
						
					 
					
						2024-06-12 07:19:10 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
						
						
							
						
						
							e5c3c5ce99 
							
						 
					 
					
						
						
							
							minor bug fixes ( #207 )  
						
						
						
						
						* fixed path arg for create_dataset_csvs()
* updated assign_check() to remove user warning 
						
						
					 
					
						2024-06-12 06:27:56 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							b2ff989174 
							
						 
					 
					
						
						
							
							distinguish better between main chapter code and bonus materials  
						
						
						
						
					 
					
						2024-06-11 21:07:42 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
						
						
							
						
						
							79210eb393 
							
						 
					 
					
						
						
							
							fixes for code ( #206 )  
						
						
						
						
						* updated .gitignore
* removed unused GELU import
* fixed model_configs, fixed all tensors on same device
* removed unused tiktoken
* update
* update hparam search
* remove redundant tokenizer argument
---------
Co-authored-by: rasbt <mail@sebastianraschka.com> 
						
						
					 
					
						2024-06-11 20:59:48 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							f0e4c99bc3 
							
						 
					 
					
						
						
							
							fix typo in comment  
						
						
						
						
					 
					
						2024-06-09 06:14:02 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							40ba3a4068 
							
						 
					 
					
						
						
							
							Remove leftover instances of self.tokenizer ( #201 )  
						
						
						
						
						* Remove leftover instances of self.tokenizer
* add endoftext token 
						
						
					 
					
						2024-06-08 14:57:34 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							5a1e0eecce 
							
						 
					 
					
						
						
							
							fix learning rate scheduler  
						
						
						
						
					 
					
						2024-06-03 07:06:42 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							f7e528fca6 
							
						 
					 
					
						
						
							
							update loss  
						
						
						
						
					 
					
						2024-05-31 07:30:57 -05:00 
						 
				 
			
				
					
						
							
							
								Kumar Utsav 
							
						 
					 
					
						
						
						
						
							
						
						
							b48d436bfc 
							
						 
					 
					
						
						
							
							Update ch05.ipynb  
						
						
						
						
						Fixed incorrect token ids 
						
						
					 
					
						2024-05-29 20:34:23 +05:30 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							a0b5603423 
							
						 
					 
					
						
						
							
							Make header more clear  
						
						
						
						
					 
					
						2024-05-25 10:44:12 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							fe8bb9291e 
							
						 
					 
					
						
						
							
							update formatting  
						
						
						
						
					 
					
						2024-05-24 07:20:37 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							aa084656e0 
							
						 
					 
					
						
						
							
							update how to retrieve learning rate  
						
						
						
						
					 
					
						2024-05-23 17:19:01 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
						
						
							
						
						
							69da9ed447 
							
						 
					 
					
						
						
							
							removed unnecessary .gitignore  
						
						
						
						
					 
					
						2024-05-21 19:25:16 +00:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							bc5cbbf1bd 
							
						 
					 
					
						
						
							
							change defaults to 0 temp  
						
						
						
						
					 
					
						2024-05-19 09:04:49 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							59f5ed8d68 
							
						 
					 
					
						
						
							
							use default value for temperature  
						
						
						
						
					 
					
						2024-05-19 08:48:10 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							9d84935b69 
							
						 
					 
					
						
						
							
							add eos_id option for ch07  
						
						
						
						
					 
					
						2024-05-18 12:35:40 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
						
						
							
						
						
							e6012b944e 
							
						 
					 
					
						
						
							
							fixed empty space  
						
						
						
						
					 
					
						2024-05-17 10:44:18 +02:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							b350daaa93 
							
						 
					 
					
						
						
							
							add readme  
						
						
						
						
					 
					
						2024-05-13 08:50:55 -05:00 
						 
				 
			
				
					
						
							
							
								speed 
							
						 
					 
					
						
						
						
						
							
						
						
							7b34833ee1 
							
						 
					 
					
						
						
							
							fix 1024 characters to 1024 tokens ( #152 )  
						
						
						
						
					 
					
						2024-05-11 13:17:07 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							bb59cbc525 
							
						 
					 
					
						
						
							
							link formatting  
						
						
						
						
					 
					
						2024-04-30 06:26:23 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							a5b353667d 
							
						 
					 
					
						
						
							
							Rename drop_resid to drop_shortcut ( #136 )  
						
						
						
						
					 
					
						2024-04-28 14:31:27 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							4abaa168ac 
							
						 
					 
					
						
						
							
							fix merge conflict  
						
						
						
						
					 
					
						2024-04-22 07:05:40 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							df4fc602d8 
							
						 
					 
					
						
						
							
							update numbering  
						
						
						
						
					 
					
						2024-04-22 07:00:20 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							2dd7bf9cda 
							
						 
					 
					
						
						
							
							file header  
						
						
						
						
					 
					
						2024-04-22 06:53:38 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							79d40c25bf 
							
						 
					 
					
						
						
							
							remove requests dependency ( #125 )  
						
						
						
						
					 
					
						2024-04-21 14:15:05 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							4557d5830e 
							
						 
					 
					
						
						
							
							Return nan if val loader is empty ( #124 )  
						
						
						
						
					 
					
						2024-04-20 08:02:30 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							ef2de4718e 
							
						 
					 
					
						
						
							
							use torch no grad for loss ( #119 )  
						
						
						
						
					 
					
						2024-04-14 08:13:07 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							bae4b0fb08 
							
						 
					 
					
						
						
							
							Make datasets and loaders compatible with multiprocessing ( #118 )  
						
						
						
						
					 
					
						2024-04-13 13:57:56 -05:00