LLMs-from-scratch

mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2025-09-18 12:44:44 +00:00

Author	SHA1	Message	Date
Sebastian Raschka	8a448a4410	Llama 3 (#384) * Implement Llama 3.2 * Add Llama 3.2 files * exclude IMDB link because stanford website seems down	2024-10-05 07:52:15 -05:00
Sebastian Raschka	8553644440	Llama 3.2 requirements file	2024-10-05 07:32:43 -05:00
Sebastian Raschka	b44096acef	Implement Llama 3.2 (#383)	2024-10-05 07:30:47 -05:00
Sebastian Raschka	a5405c255d	Cos-sin fix in Llama 2 bonus notebook (#381)	2024-10-03 20:45:40 -05:00
Sebastian Raschka	b993c2b25b	Improve rope settings for llama3 (#380)	2024-10-03 08:29:54 -05:00
rasbt	278a50a348	add section numbers	2024-09-30 08:42:22 -05:00
rasbt	bfa4215774	llama note	2024-09-26 07:41:11 -05:00
Sebastian Raschka	b56d0b2942	Add llama2 unit tests (#372) * add llama2 unit tests * update * updates * updates * update file path * update requirements file * rmsnorm test * update	2024-09-25 19:40:36 -05:00
rasbt	a6d8e93da3	improve formatting	2024-09-24 18:49:17 -05:00
Daniel Kleine	ff31b345b0	ch05/07 gpt_to_llama text improvements (#369) * fixed typo * fixed RMSnorm formula * fixed SwiGLU formula * temperature=0 for untrained model for reproducibility * added extra info hf token	2024-09-24 18:45:49 -05:00
rasbt	d144bd5b7a	add json import	2024-09-23 09:12:35 -05:00
rasbt	6bc3de165c	move access token to config.json	2024-09-23 08:56:16 -05:00
rasbt	58df945ed4	add llama3 comparison	2024-09-23 08:17:10 -05:00
Sebastian Raschka	0467c8289b	GPT to Llama (#368) * GPT to Llama * fix urls	2024-09-23 07:34:06 -05:00
Sebastian Raschka	76e9a9ec02	Add user interface to ch06 and ch07 (#366) * Add user interface to ch06 and ch07 * pep8 * fix url	2024-09-21 20:33:00 -05:00
rasbt	6f6dfb6796	remove unused function from user interface	2024-09-21 14:17:35 -05:00
Daniel Kleine	eefe4bf12b	Chainlit bonus material fixes (#361) * fix cmd * moved idx to device * improved code with clone().detach() * fixed path * fix: added extra line for pep8 * updated .gitginore * Update ch05/06_user_interface/app_orig.py * Update ch05/06_user_interface/app_own.py * Apply suggestions from code review --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-09-18 08:08:50 -07:00
Sebastian Raschka	ea9b4e83a4	Add chatpgpt-like user interface (#360) * Add chatpgpt-like user interface * fixes	2024-09-17 08:26:44 -05:00
Sebastian Raschka	c2117ca073	topk comment	2024-08-20 20:44:15 -05:00
rasbt	9f0bda7af5	add note about duplicated cell	2024-08-19 21:04:18 -05:00
Sebastian Raschka	01cb137bfd	Note about MPS devices (#329)	2024-08-19 20:58:45 -05:00
Sebastian Raschka	8ef5022511	Note about ch05 mps support (#324)	2024-08-19 07:40:24 -05:00
rasbt	b2858a91c5	remove redundant indentation	2024-08-16 07:54:02 -05:00
rasbt	5f0c55ddee	fix code cell ordering	2024-08-12 19:04:05 -05:00
Sebastian Raschka	7feb8cad86	Update README.md	2024-08-10 07:54:51 -05:00
Daniel Kleine	13dbc548f8	fixed bash command (#305) Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-09 21:29:04 -05:00
TITC	09a3a73f2d	remove all non-English texts and notice (#304) * remove all non-English texts and notice 1. almost 18GB txt left after `is_english` filtered. 2. remove notice use gutenberg's strip_headers 3. after re-run get_data.py, seems all data are under `gutenberg/data/.mirror` folder. * some improvements * update readme --------- Co-authored-by: rasbt <mail@sebastianraschka.com>	2024-08-09 17:09:14 -05:00
TITC	d16527ddf2	total training iters may equal to warmup_iters (#301 ) total_training_iters=20, warmup_iters=20= len(train_loader) 4 multiply n_epochs 5, then ZeroDivisionError occurred. ```shell Traceback (most recent call last): File "LLMs-from-scratch/ch05/05_bonus_hparam_tuning/hparam_search.py", line 191, in <module> train_loss, val_loss = train_model( ^^^^^^^^^^^^ File "/mnt/raid1/docker/ai/LLMs-from-scratch/ch05/05_bonus_hparam_tuning/hparam_search.py", line 90, in train_model progress = (global_step - warmup_iters) / (total_training_iters - warmup_iters) ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ZeroDivisionError: division by zero ```	2024-08-06 07:10:05 -05:00
SSebo	7643c6c0c4	Update ch05.ipynb (#297) typo	2024-08-05 07:12:27 -05:00
Sebastian Raschka	08040f024c	Test code in pytorch 2.4 (#285) * test code in pytorch 2.4 * update	2024-07-24 21:53:41 -05:00
TITC	6cbe6520a2	47,678-->48,725 (#281)	2024-07-22 21:24:57 -05:00
Sebastian Raschka	8d02cb1cee	Add download help message (#274)	2024-07-19 08:29:29 -05:00
rasbt	31806828d0	add links to summary sections	2024-06-29 07:33:26 -05:00
rasbt	219f45f808	refresh cross entropy figure	2024-06-29 07:22:23 -05:00
Daniel Kleine	1e69c8e0b5	fixed minor issues (#252) * fixed typo * fixed var name in md text	2024-06-29 06:38:25 -05:00
Daniel Kleine	81c843bdc0	minor fixes (#246) * removed duplicated white spaces * Update ch07/01_main-chapter-code/ch07.ipynb * Update ch07/05_dataset-generation/llama3-ollama.ipynb * removed duplicated white spaces * fixed title again --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-06-25 17:30:30 -05:00
Sebastian Raschka	cf0df54d7d	Show epochs as integers on x-axis (#241) * Show epochs as integers on x-axis * Update ch07/01_main-chapter-code/previous_chapters.py * remove extra s * modify exercise plots * update chapter 7 plot * resave ch07 for better file diff	2024-06-23 07:41:25 -05:00
rasbt	143a529c33	update generate to match output in main chapter	2024-06-22 12:01:51 -05:00
Daniel Kleine	ad9dd994dc	minor fixes (#235) * removed unnecessary imports * removed unnecessary semicolons * format markdown * format markdown * fixed markdown	2024-06-21 08:40:54 -05:00
rasbt	6c84900af7	remove redundant line	2024-06-20 10:12:28 -05:00
rasbt	977f9c6eac	fix device loading	2024-06-20 08:07:00 -05:00
rasbt	283397aaf2	add main and optional sections	2024-06-19 17:48:25 -05:00
rasbt	85827e0a0b	note about dropout	2024-06-19 17:37:48 -05:00
Daniel Kleine	bbb2a0c3d5	fixed num_workers (#229) * fixed num_workers * ch06 & ch07: added num_workers to create_dataloader_v1	2024-06-19 17:36:46 -05:00
Sebastian Raschka	7bf70baf10	Remove duplicated cell (#212) * add a suggestion since code snippet has been repeated. * remove duplicated cell --------- Co-authored-by: Shuyib <benmainye@gmail.com>	2024-06-15 12:48:34 -05:00
rasbt	c6466990bb	explain truncation in ch05	2024-06-12 19:50:11 -05:00
Sebastian Raschka	bcccda728b	check gpt files (#208)	2024-06-12 07:19:10 -05:00
Daniel Kleine	ef40f2f9ad	minor bug fixes (#207) * fixed path arg for create_dataset_csvs() * updated assign_check() to remove user warning	2024-06-12 06:27:56 -05:00
rasbt	e24fd98cdf	distinguish better between main chapter code and bonus materials	2024-06-11 21:07:42 -05:00
Daniel Kleine	dcbdc1d2e5	fixes for code (#206) * updated .gitignore * removed unused GELU import * fixed model_configs, fixed all tensors on same device * removed unused tiktoken * update * update hparam search * remove redundant tokenizer argument --------- Co-authored-by: rasbt <mail@sebastianraschka.com>	2024-06-11 20:59:48 -05:00

125 Commits