Chapter 5: Pretraining on Unlabeled Data
Main Chapter Code
- 01_main-chapter-code contains the main chapter code
 
Bonus Materials
- 02_alternative_weight_loading contains code to load the GPT model weights from alternative sources in case the model weights become unavailable from OpenAI
- 03_bonus_pretraining_on_gutenberg contains code to pretrain the LLM longer on the whole corpus of books from Project Gutenberg
- 04_learning_rate_schedulers contains code implementing a more sophisticated training function with learning rate schedulers and gradient clipping (a minimal sketch follows this list)
- 05_bonus_hparam_tuning contains an optional hyperparameter tuning script
- 06_user_interface implements an interactive user interface for interacting with the pretrained LLM
- 07_gpt_to_llama contains a step-by-step guide for converting a GPT architecture implementation to Llama 3.2 and loading the pretrained weights from Meta AI
- 08_memory_efficient_weight_loading contains a bonus notebook showing how to load model weights more efficiently via PyTorch's load_state_dict method (a sketch of the pattern also follows this list)
- 09_extending-tokenizers contains a from-scratch implementation of the GPT-2 BPE tokenizer
- 10_llm-training-speed shows PyTorch performance tips to improve the LLM training speed
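
As a taste of what the 04_learning_rate_schedulers code covers, the sketch below combines linear warmup, cosine decay, and gradient clipping in a single training loop. It is a minimal illustration, not the bonus folder's actual training function; the stand-in model, the dummy loss, and the hyperparameter values (peak_lr, warmup_steps, and so on) are placeholder assumptions.

```python
import math
import torch

# Placeholder hyperparameters (assumptions for illustration only)
peak_lr, min_lr = 5e-4, 1e-5
warmup_steps, max_steps = 20, 200

def get_lr(step):
    # Linear warmup to peak_lr, then cosine decay down to min_lr
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

model = torch.nn.Linear(16, 16)  # stand-in for the GPT model
optimizer = torch.optim.AdamW(model.parameters(), lr=peak_lr)

for step in range(max_steps):
    # Apply the scheduled learning rate before each update
    for param_group in optimizer.param_groups:
        param_group["lr"] = get_lr(step)
    optimizer.zero_grad()
    loss = model(torch.randn(8, 16)).pow(2).mean()  # dummy loss
    loss.backward()
    # Clip the global gradient norm to 1.0 to avoid unstable updates
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
```

PyTorch also ships ready-made schedulers (for example, torch.optim.lr_scheduler.CosineAnnealingLR) that can replace the hand-rolled get_lr above; the manual version simply makes the warmup and decay arithmetic explicit.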
 
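Similarly, the core pattern behind 08_memory_efficient_weight_loading can be summarized in a few lines: instantiate the model on the "meta" device so no parameter memory is allocated, then memory-map the checkpoint and assign its tensors in place. The sketch below uses a toy stand-in architecture and a hypothetical model.pth checkpoint rather than the notebook's actual GPT model; note that mmap=True and assign=True require PyTorch 2.1 or newer.

```python
import torch

def make_model():
    # Toy stand-in architecture; the bonus notebook uses the book's GPT model
    return torch.nn.Sequential(
        torch.nn.Linear(1024, 4096),
        torch.nn.GELU(),
        torch.nn.Linear(4096, 1024),
    )

# Create a hypothetical checkpoint to load from
torch.save(make_model().state_dict(), "model.pth")

# "meta" device: parameters exist as shape/dtype metadata only, no memory allocated
with torch.device("meta"):
    model = make_model()

# mmap=True maps the checkpoint file into memory and reads tensors lazily from
# disk; assign=True reuses those tensors directly instead of copying them into
# freshly allocated parameter storage
state_dict = torch.load("model.pth", mmap=True, weights_only=True)
model.load_state_dict(state_dict, assign=True)
```

Compared to the naive load (build the model normally, then copy a fully loaded state dict into it), this pattern avoids holding two complete copies of the weights in RAM at once.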
