Sebastian Raschka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c21bfe4a23 
							
						 
					 
					
						
						
							
							Add PyPI package ( #576 )  
						
						
						
						
						* Add PyPI package
* fixes
* fixes 
						
						
					 
					
						2025-03-23 19:28:49 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							a08d7aaa84 
							
						 
					 
					
						
						
							
							Uv workflow improvements ( #531 )  
						
						
						
						
						* Uv workflow improvements
* Uv workflow improvements
* linter improvements
* pytproject.toml fixes
* pytproject.toml fixes
* pytproject.toml fixes
* pytproject.toml fixes
* pytproject.toml fixes
* pytproject.toml fixes
* windows fixes
* windows fixes
* windows fixes
* windows fixes
* windows fixes
* windows fixes
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix
* win32 fix 
						
						
					 
					
						2025-02-16 13:16:51 -06:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							68e2efe1c9 
							
						 
					 
					
						
						
							
							Mention small discrepancy due to Dropout non-reproducibility in PyTorch ( #519 )  
						
						
						
						
						* Mention small discrepancy due to Dropout non-reproducibility in PyTorch
* bump pytorch version 
						
						
					 
					
						2025-02-06 14:59:52 -06:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							126adb7663 
							
						 
					 
					
						
						
							
							Include mathematical breakdown for exercise solution 4.1 ( #483 )  
						
						
						
						
					 
					
						2025-01-14 19:23:00 -06:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							dc1b1a05b0 
							
						 
					 
					
						
						
							
							note about random numbers  
						
						
						
						
					 
					
						2024-09-22 12:02:03 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							222f7b16f8 
							
						 
					 
					
						
						
							
							update gpt-2 paper url  
						
						
						
						
					 
					
						2024-09-20 07:00:06 -07:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							8ad50a3315 
							
						 
					 
					
						
						
							
							update gpt-2 paper link  
						
						
						
						
					 
					
						2024-09-09 06:31:28 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							1e48c13e89 
							
						 
					 
					
						
						
							
							update gpt-2 paper link  
						
						
						
						
					 
					
						2024-09-08 15:49:44 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							08040f024c 
							
						 
					 
					
						
						
							
							Test code in pytorch 2.4 ( #285 )  
						
						
						
						
						* test code in pytorch 2.4
* update 
						
						
					 
					
						2024-07-24 21:53:41 -05:00 
						 
				 
			
				
					
						
							
							
								Thanh Tran 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							070a69fc8b 
							
						 
					 
					
						
						
							
							fix typos & inconsistent texts ( #269 )  
						
						
						
						
						Co-authored-by: TRAN <you@example.com> 
						
						
					 
					
						2024-07-17 07:34:51 -05:00 
						 
				 
			
				
					
						
							
							
								Jeroen Van Goey 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							48bd72c890 
							
						 
					 
					
						
						
							
							fix typos, add codespell pre-commit hook ( #264 )  
						
						
						
						
						* fix typos, add codespell pre-commit hook
* Update .pre-commit-config.yaml
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com> 
						
						
					 
					
						2024-07-16 07:07:04 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6ffd628bb6 
							
						 
					 
					
						
						
							
							add missing "be" to figure  
						
						
						
						
					 
					
						2024-07-15 08:06:05 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							921e91a05f 
							
						 
					 
					
						
						
							
							use correct chapter reference  
						
						
						
						
					 
					
						2024-07-02 17:29:57 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							31806828d0 
							
						 
					 
					
						
						
							
							add links to summary sections  
						
						
						
						
					 
					
						2024-06-29 07:33:26 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							796f0e2a30 
							
						 
					 
					
						
						
							
							add clarifying note about GELU  
						
						
						
						
					 
					
						2024-06-29 07:14:36 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							ab23ca5b1b 
							
						 
					 
					
						
						
							
							force refresh figure  
						
						
						
						
					 
					
						2024-06-29 07:01:37 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6a8acf5135 
							
						 
					 
					
						
						
							
							remove redundant plus sign  
						
						
						
						
					 
					
						2024-06-29 06:59:36 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							81c843bdc0 
							
						 
					 
					
						
						
							
							minor fixes ( #246 )  
						
						
						
						
						* removed duplicated white spaces
* Update ch07/01_main-chapter-code/ch07.ipynb
* Update ch07/05_dataset-generation/llama3-ollama.ipynb
* removed duplicated white spaces
* fixed title again
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com> 
						
						
					 
					
						2024-06-25 17:30:30 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							283397aaf2 
							
						 
					 
					
						
						
							
							add main and optional sections  
						
						
						
						
					 
					
						2024-06-19 17:48:25 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							bbb2a0c3d5 
							
						 
					 
					
						
						
							
							fixed num_workers ( #229 )  
						
						
						
						
						* fixed num_workers
* ch06 & ch07: added num_workers to create_dataloader_v1 
						
						
					 
					
						2024-06-19 17:36:46 -05:00 
						 
				 
			
				
					
						
							
							
								Daniel Kleine 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							dcbdc1d2e5 
							
						 
					 
					
						
						
							
							fixes for code ( #206 )  
						
						
						
						
						* updated .gitignore
* removed unused GELU import
* fixed model_configs, fixed all tensors on same device
* removed unused tiktoken
* update
* update hparam search
* remove redundant tokenizer argument
---------
Co-authored-by: rasbt <mail@sebastianraschka.com> 
						
						
					 
					
						2024-06-11 20:59:48 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							39c4a887eb 
							
						 
					 
					
						
						
							
							add allowed_special={"<|endoftext|>"}  
						
						
						
						
					 
					
						2024-06-09 06:04:02 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							72a073bbbf 
							
						 
					 
					
						
						
							
							Remove leftover instances of self.tokenizer ( #201 )  
						
						
						
						
						* Remove leftover instances of self.tokenizer
* add endoftext token 
						
						
					 
					
						2024-06-08 14:57:34 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							98d453b666 
							
						 
					 
					
						
						
							
							update formatting  
						
						
						
						
					 
					
						2024-05-24 07:20:37 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							e5e6aaf9f1 
							
						 
					 
					
						
						
							
							flops analysis  
						
						
						
						
					 
					
						2024-05-23 20:35:41 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							c735c21e87 
							
						 
					 
					
						
						
							
							fix swiglu acronym  
						
						
						
						
					 
					
						2024-05-01 20:26:17 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							97ed38116a 
							
						 
					 
					
						
						
							
							Rename drop_resid to drop_shortcut ( #136 )  
						
						
						
						
					 
					
						2024-04-28 14:31:27 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							d202cabdee 
							
						 
					 
					
						
						
							
							update figures  
						
						
						
						
					 
					
						2024-04-20 11:42:03 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							dd51d4ad83 
							
						 
					 
					
						
						
							
							Make datesets and loaders compatible with multiprocessing ( #118 )  
						
						
						
						
					 
					
						2024-04-13 13:57:56 -05:00 
						 
				 
			
				
					
						
							
							
								James Holcombe 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							05718c6b94 
							
						 
					 
					
						
						
							
							Use instance tokenizer ( #116 )  
						
						
						
						
						* Use instance tokenizer
* consistency updates
---------
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com> 
						
						
					 
					
						2024-04-10 21:16:19 -04:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							6de0417321 
							
						 
					 
					
						
						
							
							cleanup  
						
						
						
						
					 
					
						2024-04-04 07:58:41 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							2de60d1bfb 
							
						 
					 
					
						
						
							
							Rename variable to context_length to make it easier on readers ( #106 )  
						
						
						
						
						* rename to context length
* fix spacing 
						
						
					 
					
						2024-04-04 07:27:41 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
							
							
						
						
						
							
						
						
							3829ccdb34 
							
						 
					 
					
						
						
							
							Remove reundant dropout in MLP module ( #105 )  
						
						
						
						
					 
					
						2024-04-03 20:19:08 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							a2cd8436cb 
							
						 
					 
					
						
						
							
							Ch05 supplementary code ( #81 )  
						
						
						
						
					 
					
						2024-03-19 09:26:26 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							ca96abac8a 
							
						 
					 
					
						
						
							
							Set up basic test gh worklows ( #79 )  
						
						
						
						
						* Set up basic test gh worklows
* update file paths
* env check
* add env check
* Update requirements.txt
* simplify
* upd 
						
						
					 
					
						2024-03-18 11:58:37 -05:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							9d6da22ebb 
							
						 
					 
					
						
						
							
							Update pep8 ( #78 )  
						
						
						
						
						* simplify requirements file
* style
* apply linter 
						
						
					 
					
						2024-03-18 08:16:17 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							4fc6de7afa 
							
						 
					 
					
						
						
							
							add notes  
						
						
						
						
					 
					
						2024-03-17 09:29:06 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							d60da19fd0 
							
						 
					 
					
						
						
							
							add more notes and embed figures externally to save space  
						
						
						
						
					 
					
						2024-03-17 09:08:38 -05:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							861c296312 
							
						 
					 
					
						
						
							
							add imports and version on top  
						
						
						
						
					 
					
						2024-03-16 09:50:00 -05:00 
						 
				 
			
				
					
						
							
							
								joel-foo 
							
						 
					 
					
						
						
						
						
							
						
						
							dbb5e65a29 
							
						 
					 
					
						
						
							
							Remove duplicate cells  
						
						
						
						
					 
					
						2024-03-10 21:40:57 +08:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							da33ce8054 
							
						 
					 
					
						
						
							
							remove redundant unsqueeze in mask  
						
						
						
						
					 
					
						2024-03-09 17:42:31 -06:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							87fcfd9245 
							
						 
					 
					
						
						
							
							mha variants  
						
						
						
						
					 
					
						2024-03-06 08:30:32 -06:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							e0df4df433 
							
						 
					 
					
						
						
							
							add dropout for embedding layers  
						
						
						
						
					 
					
						2024-03-04 07:05:06 -06:00 
						 
				 
			
				
					
						
							
							
								Sebastian Raschka 
							
						 
					 
					
						
						
						
						
							
						
						
							c9dccb0c40 
							
						 
					 
					
						
						
							
							Merge pull request  #33  from rayedbw/patch-1  
						
						
						
						
						Update ch04.ipynb 
						
						
					 
					
						2024-02-29 20:00:09 -06:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							267e33cfaf 
							
						 
					 
					
						
						
							
							remove redundant import  
						
						
						
						
					 
					
						2024-02-29 19:59:05 -06:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							b827bf4eea 
							
						 
					 
					
						
						
							
							remove redundant double-unsequeeze  
						
						
						
						
					 
					
						2024-02-29 08:31:07 -06:00 
						 
				 
			
				
					
						
							
							
								Rayed Bin Wahed 
							
						 
					 
					
						
						
						
						
							
						
						
							2fb035435e 
							
						 
					 
					
						
						
							
							Update ch04.ipynb  
						
						
						
						
						Add missing import 
						
						
					 
					
						2024-02-27 23:05:36 +08:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							f6266c3756 
							
						 
					 
					
						
						
							
							improve code comments  
						
						
						
						
					 
					
						2024-02-27 06:40:35 -06:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							3f186ab072 
							
						 
					 
					
						
						
							
							use .shape instead of .size() for consistency  
						
						
						
						
					 
					
						2024-02-25 08:47:25 -06:00 
						 
				 
			
				
					
						
							
							
								rasbt 
							
						 
					 
					
						
						
						
						
							
						
						
							cdcd73ba7f 
							
						 
					 
					
						
						
							
							drop_last=True  
						
						
						
						
					 
					
						2024-02-25 07:23:38 -06:00