LLMs-from-scratch

mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2025-11-25 22:47:11 +00:00

Author	SHA1	Message	Date
Sebastian Raschka	ed257789a4	grammar fix	2024-08-21 19:16:21 -05:00
Sebastian Raschka	c2117ca073	topk comment	2024-08-20 20:44:15 -05:00
rasbt	9f0bda7af5	add note about duplicated cell	2024-08-19 21:04:18 -05:00
Sebastian Raschka	01cb137bfd	Note about MPS devices (#329)	2024-08-19 20:58:45 -05:00
Sebastian Raschka	c443035d56	Note about MPS in ch06 and ch07 (#325)	2024-08-19 08:11:33 -05:00
Sebastian Raschka	8ef5022511	Note about ch05 mps support (#324)	2024-08-19 07:40:24 -05:00
rasbt	e7cb2ebd8d	update lora experiments	2024-08-16 08:57:46 -05:00
rasbt	b2858a91c5	remove redundant indentation	2024-08-16 07:54:02 -05:00
TITC	d2befdf00e	typo fix (#323)	2024-08-16 06:57:30 -05:00
TITC	f6526f7766	typo fix (#321) * typo fix experiment 5 is the same size as experiment 1, both are 124 Million. and experiment 6 is 355M ~3x 124M. * typo fix all layer FT vs all layer FT, "slightly worse by 1.3% " indicates it's exp 5 vs exp 9 * exp 5 vs 9 * Update ch06/02_bonus_additional-experiments/README.md --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-15 08:45:34 -05:00
Sebastian Raschka	4ea526908e	Update README.md	2024-08-15 08:22:47 -05:00
Daniel Kleine	c65928f7dc	added std error bars (#320) * added std error bars * fixed changes * Update on A100 --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-13 20:57:41 -05:00
Sebastian Raschka	9713e70a20	Consistency update in README.md	2024-08-13 07:33:10 -05:00
TITC	38390b2a8d	track tokens seen in chapter5, track examples seen in chapter6 (#319)	2024-08-13 07:09:05 -05:00
rasbt	5f0c55ddee	fix code cell ordering	2024-08-12 19:04:05 -05:00
Jeroen Van Goey	76e6910a1a	Small typo fix (#313) * typo fix * Update ch03/02_bonus_efficient-multihead-attention/mha-implementations.ipynb * Update ch03/02_bonus_efficient-multihead-attention/mha-implementations.ipynb * Update ch03/02_bonus_efficient-multihead-attention/mha-implementations.ipynb --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-12 07:54:12 -05:00
Eric Thomson	da5236ee72	Adds .vscode folder to .gitignore (#314) * added .vscode folder to .gitignore * Update .gitignore --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-12 07:49:11 -05:00
Sebastian Raschka	3f6652d87e	update attention benchmarks (#307)	2024-08-10 09:44:11 -05:00
Sebastian Raschka	7feb8cad86	Update README.md	2024-08-10 07:54:51 -05:00
Daniel Kleine	13dbc548f8	fixed bash command (#305) Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-09 21:29:04 -05:00
TITC	09a3a73f2d	remove all non-English texts and notice (#304) * remove all non-English texts and notice 1. almost 18GB txt left after `is_english` filtered. 2. remove notice use gutenberg's strip_headers 3. after re-run get_data.py, seems all data are under `gutenberg/data/.mirror` folder. * some improvements * update readme --------- Co-authored-by: rasbt <mail@sebastianraschka.com>	2024-08-09 17:09:14 -05:00
Sebastian Raschka	f1c3d451fe	Update README.md	2024-08-08 07:50:45 -05:00
Sebastian Raschka	81e9cea3d3	Update README.md	2024-08-08 07:47:31 -05:00
Sebastian Raschka	c5eaae11b1	Revert accidental edit	2024-08-06 19:54:34 -05:00
rasbt	06151a809e	note about logistic sigmoid	2024-08-06 19:48:30 -05:00
rasbt	26df0c474c	note about logistic sigmoid	2024-08-06 19:48:06 -05:00
rasbt	e810f9f004	extend equation description	2024-08-06 19:46:50 -05:00
rasbt	c8090f30ef	add more explanations	2024-08-06 19:45:11 -05:00
Sebastian Raschka	98d24a1607	Update README.md	2024-08-06 08:02:01 -05:00
TITC	d16527ddf2	total training iters may equal to warmup_iters (#301) total_training_iters=20, warmup_iters=20= len(train_loader) 4 multiply n_epochs 5, then ZeroDivisionError occurred. ```shell Traceback (most recent call last): File "LLMs-from-scratch/ch05/05_bonus_hparam_tuning/hparam_search.py", line 191, in <module> train_loss, val_loss = train_model( ^^^^^^^^^^^^ File "/mnt/raid1/docker/ai/LLMs-from-scratch/ch05/05_bonus_hparam_tuning/hparam_search.py", line 90, in train_model progress = (global_step - warmup_iters) / (total_training_iters - warmup_iters) ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ZeroDivisionError: division by zero ```	2024-08-06 07:10:05 -05:00
Sebastian Raschka	70e5714556	improve gradient accumulation (#300)	2024-08-05 18:27:20 -05:00
rasbt	36fbc7aa74	small figure update	2024-08-05 17:57:16 -05:00
Sebastian Raschka	50332cf75b	Update README.md	2024-08-05 17:47:06 -05:00
Daniel Kleine	8318d1f002	minor DPO fixes (#298) * fixed issues, updated .gitignore * added closing paren * fixed CEL spelling * fixed more minor issues * Update ch07/01_main-chapter-code/ch07.ipynb * Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb * Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb * Update ch07/04_preference-tuning-with-dpo/dpo-from-scratch.ipynb --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-05 08:40:46 -05:00
rasbt	36b9d5e0eb	update model path	2024-08-05 07:36:08 -05:00
SSebo	7643c6c0c4	Update ch05.ipynb (#297) typo	2024-08-05 07:12:27 -05:00
Sebastian Raschka	16e83434b5	Update README.md	2024-08-04 16:06:38 -05:00
rasbt	60aada801b	improve latex rendering in dpo notebook	2024-08-04 09:19:59 -05:00
Sebastian Raschka	e130ca293c	Update matplotlib tests on Windows (#295)	2024-08-04 09:18:19 -05:00
Sebastian Raschka	52435804eb	Direct Preference Optimization from scratch (#294)	2024-08-04 08:57:36 -05:00
Sebastian Raschka	ff7a6db212	Update README.md	2024-08-01 18:17:42 -05:00
rasbt	b5fc1a6061	restructure into local and cloud setup	2024-07-31 06:59:04 -05:00
Sebastian Raschka	1b100179c0	Add video tutorial	2024-07-30 06:57:46 -05:00
Sebastian Raschka	f5a003744e	Update README.md	2024-07-30 06:55:41 -05:00
rasbt	0dad0a3c04	add state_dict example	2024-07-28 14:15:32 -05:00
rasbt	a7869ad2bf	Fix 8-billion-parameter spelling	2024-07-28 10:48:56 -05:00
Daniel Kleine	9a3b04f92f	fixed typos and formatting (#291)	2024-07-28 10:04:33 -05:00
Sebastian Raschka	9bf5d67d61	Update README.md	2024-07-28 09:28:11 -05:00
Sebastian Raschka	4f7f5bd443	Update README.md	2024-07-28 08:21:38 -05:00
Sebastian Raschka	f4fc0ededd	buffer tutorial	2024-07-27 17:06:16 -05:00

712 Commits