LLMs-from-scratch

mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2025-08-29 02:50:15 +00:00

Author	SHA1	Message	Date
casinca	374cb94af0	minor readability improvements (#668 )	2025-06-19 18:56:49 -05:00
Sebastian Raschka	55e2a0978a	DeBERTa-v3 baseline (#630 ) * Llama3 from scratch improvements * deberta-baseline * restore	2025-04-19 21:16:17 -05:00
Daniel Kleine	6ec8fb3dfe	fixed `<|endoftext|>` token (#620 )	2025-04-16 12:15:59 -05:00
Sebastian Raschka	d5eaa36416	Ch06 and Ch07 videos (#613 ) * Ch06 and Ch07 videos * exclude google scholar from link checking	2025-04-12 14:51:02 -05:00
Sebastian Raschka	b662ec9ada	Improve ModernBERT comments (#606 ) * Improve modernbert comments * bash code formatting	2025-04-06 18:29:22 -05:00
Sebastian Raschka	ab17357474	Correct BERT experiments (#600 )	2025-04-05 10:05:15 -05:00
Sebastian Raschka	14f976e024	Add ModernBERT (#598 )	2025-04-05 09:13:30 -05:00
Sebastian Raschka	d75f74bd0c	Fix data download if UCI is temporarily down (#592 )	2025-03-31 16:25:53 -05:00
Sebastian Raschka	7114ccd10d	Add PyPI package (#576 ) * Add PyPI package * fixes * fixes	2025-03-23 19:28:49 -05:00
Sebastian Raschka	4fb0ea9d1f	Specify UTF-8 encoding in the json load command explicitely (#557 )	2025-03-05 11:46:21 -06:00
Sebastian Raschka	4fad4695f6	Fix timeout issue related to spam data backup url (#544 ) * Add backup url for Spam Dataset * import urllib * fix url * fix timeout issue	2025-02-20 09:26:23 -06:00
Sebastian Raschka	eb6787397c	Add backup url for Spam Dataset (#543 ) * Add backup url for Spam Dataset * import urllib * fix url	2025-02-20 08:08:28 -06:00
Sebastian Raschka	5016499d1d	Uv workflow improvements (#531 ) * Uv workflow improvements * Uv workflow improvements * linter improvements * pytproject.toml fixes * pytproject.toml fixes * pytproject.toml fixes * pytproject.toml fixes * pytproject.toml fixes * pytproject.toml fixes * windows fixes * windows fixes * windows fixes * windows fixes * windows fixes * windows fixes * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix * win32 fix	2025-02-16 13:16:51 -06:00
Sebastian Raschka	9dce43ec31	Upgrade to NumPy 2.0 (#520 ) * Upgrade to NumPy 2.0 * bump pytorch * bump pytorch * bump pytorch * bump pytorch * bump pytorch * update * update packages	2025-02-09 06:21:58 -06:00
Sebastian Raschka	7e2092dd01	More pythonic way to find the longest sequence (#512 ) * More pythonic way to find the longest sequence * pep8 fix	2025-02-01 10:22:47 -06:00
Sebastian Raschka	7659af7cdd	Add backup URL for gpt2 weights (#469 ) * Add backup URL for gpt2 weights * newline	2025-01-05 11:28:09 -06:00
Daniel Kleine	dcef9b7d6f	Fixed command for row 16 additional experiment (#439 ) * fixed command for row 16 experiment * Update README.md --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-11-17 06:50:00 +09:00
Sebastian Raschka	841b64f2b9	Add flexible padding bonus experiment (#438 ) * Add flexible padding bonus experiment * fix links	2024-11-15 08:51:01 +09:00
Sebastian Raschka	541b237eff	Add utility to prevent double execution of certain cells (#437 )	2024-11-14 19:56:49 +09:00
Sebastian Raschka	3c3dae0967	Add mean pooling experiment to classifier bonus experiments (#406 ) * Add mean pooling experiment to classifier bonus experiments * formatting * add average embeddings option * pep8	2024-10-20 11:04:18 -05:00
rasbt	59a5c83726	remove redundant code line	2024-10-13 15:58:11 -05:00
Sebastian Raschka	6a9bedc2ec	Update bonus section formatting (#400 )	2024-10-12 10:26:08 -05:00
Sebastian Raschka	68505fab64	Fix truncation issue in classify_review function (#373 )	2024-09-25 19:54:36 -05:00
Daniel Kleine	9bc5a7dd65	removed unnecessary imports (#367 )	2024-09-22 11:59:37 -05:00
Sebastian Raschka	7a9a17608d	Add user interface to ch06 and ch07 (#366 ) * Add user interface to ch06 and ch07 * pep8 * fix url	2024-09-21 20:33:00 -05:00
Sebastian Raschka	081676d8dd	Add missing bullet point	2024-09-21 12:59:12 -05:00
Mingyuan Xu	21e6971b11	Run generate example in ch06 optionally on GPU (#352 ) * model.to("cuda") model.to("cuda") * update device placement --------- Co-authored-by: rasbt <mail@sebastianraschka.com>	2024-09-13 08:01:52 -05:00
Daniel Kleine	95926535f8	ch06/03 fixes (#336 ) * fixed bash commands * fixed help docstrings * added missing logreg bash cmd * Update train_bert_hf.py * Update train_bert_hf_spam.py * Update README.md --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-27 08:23:25 +02:00
rasbt	8eb6fc0ad0	sklearn baseline and roberta-large update	2024-08-26 10:31:54 +02:00
TITC	5acab58d41	add RoBERTa and params frozen (#335 ) * add roberta experiment result * add roberta & params frozen * Update README.md * modify lr * modify lr --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-26 10:27:09 +02:00
Sebastian Raschka	296a91afb8	add BERT experiment results (#333 ) * add BERT experiment results * cleanup * formatting	2024-08-23 08:40:40 -05:00
Sebastian Raschka	7cbfe418bf	grammar fix	2024-08-21 19:16:21 -05:00
Sebastian Raschka	a82169290e	Note about MPS in ch06 and ch07 (#325 )	2024-08-19 08:11:33 -05:00
rasbt	1a962f3983	update lora experiments	2024-08-16 08:57:46 -05:00
TITC	70a7cddf9c	typo fix (#323 )	2024-08-16 06:57:30 -05:00
TITC	e1072bbcd6	typo fix (#321 ) * typo fix experiment 5 is the same size as experiment 1, both are 124 Million. and experiment 6 is 355M ~3x 124M. * typo fix all layer FT vs all layer FT, "slightly worse by 1.3% " indicates it's exp 5 vs exp 9 * exp 5 vs 9 * Update ch06/02_bonus_additional-experiments/README.md --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-15 08:45:34 -05:00
Sebastian Raschka	e4cfcc5b66	Update README.md	2024-08-15 08:22:47 -05:00
TITC	0b998dff97	track tokens seen in chapter5, track examples seen in chapter6 (#319 )	2024-08-13 07:09:05 -05:00
Sebastian Raschka	895512ebee	Revert accidental edit	2024-08-06 19:54:34 -05:00
rasbt	7a5771932b	note about logistic sigmoid	2024-08-06 19:48:30 -05:00
Sebastian Raschka	192bdc3501	improve gradient accumulation (#300 )	2024-08-05 18:27:20 -05:00
Sebastian Raschka	6dd8666d9c	Test code in pytorch 2.4 (#285 ) * test code in pytorch 2.4 * update	2024-07-24 21:53:41 -05:00
Sebastian Raschka	d0f3b034d8	Add download help message (#274 )	2024-07-19 08:29:29 -05:00
Jeroen Van Goey	70cfced899	fix typos, add codespell pre-commit hook (#264 ) * fix typos, add codespell pre-commit hook * Update .pre-commit-config.yaml --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-07-16 07:07:04 -05:00
Sebastian Raschka	2ce1d16de0	show how to use the finetuned model	2024-07-09 06:43:26 -07:00
Daniel Kleine	87f47a281a	fixed spelling typos (#258 )	2024-07-03 07:47:33 -05:00
Sebastian Raschka	2d8eacb0fa	Fix links in summary sections (#254 )	2024-06-29 07:51:31 -05:00
rasbt	5e24a042c1	add links to summary sections	2024-06-29 07:33:26 -05:00
Daniel Kleine	fb4e37ae15	fixed minor issues (#252 ) * fixed typo * fixed var name in md text	2024-06-29 06:38:25 -05:00
Daniel Kleine	e387742b77	minor markdown fixes (#236 )	2024-06-21 13:55:34 -05:00

150 Commits