LLMs-from-scratch

mirror of https://github.com/rasbt/LLMs-from-scratch.git synced 2025-10-29 08:50:34 +00:00

Author	SHA1	Message	Date
Sebastian Raschka	4fad4695f6	Fix timeout issue related to spam data backup url (#544) * Add backup url for Spam Dataset * import urllib * fix url * fix timeout issue	2025-02-20 09:26:23 -06:00
Sebastian Raschka	eb6787397c	Add backup url for Spam Dataset (#543) * Add backup url for Spam Dataset * import urllib * fix url	2025-02-20 08:08:28 -06:00
Sebastian Raschka	5016499d1d	Uv workflow improvements (#531) * Uv workflow improvements * linter improvements * pytproject.toml fixes * windows fixes * win32 fix	2025-02-16 13:16:51 -06:00
Sebastian Raschka	9dce43ec31	Upgrade to NumPy 2.0 (#520) * Upgrade to NumPy 2.0 * bump pytorch * bump pytorch * bump pytorch * bump pytorch * bump pytorch * update * update packages	2025-02-09 06:21:58 -06:00
Sebastian Raschka	7e2092dd01	More pythonic way to find the longest sequence (#512) * More pythonic way to find the longest sequence * pep8 fix	2025-02-01 10:22:47 -06:00
Sebastian Raschka	7659af7cdd	Add backup URL for gpt2 weights (#469) * Add backup URL for gpt2 weights * newline	2025-01-05 11:28:09 -06:00
Daniel Kleine	dcef9b7d6f	Fixed command for row 16 additional experiment (#439) * fixed command for row 16 experiment * Update README.md --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-11-17 06:50:00 +09:00
Sebastian Raschka	841b64f2b9	Add flexible padding bonus experiment (#438) * Add flexible padding bonus experiment * fix links	2024-11-15 08:51:01 +09:00
Sebastian Raschka	541b237eff	Add utility to prevent double execution of certain cells (#437)	2024-11-14 19:56:49 +09:00
Sebastian Raschka	3c3dae0967	Add mean pooling experiment to classifier bonus experiments (#406) * Add mean pooling experiment to classifier bonus experiments * formatting * add average embeddings option * pep8	2024-10-20 11:04:18 -05:00
rasbt	59a5c83726	remove redundant code line	2024-10-13 15:58:11 -05:00
Sebastian Raschka	6a9bedc2ec	Update bonus section formatting (#400)	2024-10-12 10:26:08 -05:00
Sebastian Raschka	68505fab64	Fix truncation issue in classify_review function (#373)	2024-09-25 19:54:36 -05:00
Daniel Kleine	9bc5a7dd65	removed unnecessary imports (#367)	2024-09-22 11:59:37 -05:00
Sebastian Raschka	7a9a17608d	Add user interface to ch06 and ch07 (#366) * Add user interface to ch06 and ch07 * pep8 * fix url	2024-09-21 20:33:00 -05:00
Sebastian Raschka	081676d8dd	Add missing bullet point	2024-09-21 12:59:12 -05:00
Mingyuan Xu	21e6971b11	Run generate example in ch06 optionally on GPU (#352) * model.to("cuda") model.to("cuda") * update device placement --------- Co-authored-by: rasbt <mail@sebastianraschka.com>	2024-09-13 08:01:52 -05:00
Daniel Kleine	95926535f8	ch06/03 fixes (#336) * fixed bash commands * fixed help docstrings * added missing logreg bash cmd * Update train_bert_hf.py * Update train_bert_hf_spam.py * Update README.md --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-27 08:23:25 +02:00
rasbt	8eb6fc0ad0	sklearn baseline and roberta-large update	2024-08-26 10:31:54 +02:00
TITC	5acab58d41	add RoBERTa and params frozen (#335) * add roberta experiment result * add roberta & params frozen * Update README.md * modify lr * modify lr --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-26 10:27:09 +02:00
Sebastian Raschka	296a91afb8	add BERT experiment results (#333) * add BERT experiment results * cleanup * formatting	2024-08-23 08:40:40 -05:00
Sebastian Raschka	7cbfe418bf	grammar fix	2024-08-21 19:16:21 -05:00
Sebastian Raschka	a82169290e	Note about MPS in ch06 and ch07 (#325)	2024-08-19 08:11:33 -05:00
rasbt	1a962f3983	update lora experiments	2024-08-16 08:57:46 -05:00
TITC	70a7cddf9c	typo fix (#323)	2024-08-16 06:57:30 -05:00
TITC	e1072bbcd6	typo fix (#321) * typo fix experiment 5 is the same size as experiment 1, both are 124 Million. and experiment 6 is 355M ~3x 124M. * typo fix all layer FT vs all layer FT, "slightly worse by 1.3% " indicates it's exp 5 vs exp 9 * exp 5 vs 9 * Update ch06/02_bonus_additional-experiments/README.md --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-08-15 08:45:34 -05:00
Sebastian Raschka	e4cfcc5b66	Update README.md	2024-08-15 08:22:47 -05:00
TITC	0b998dff97	track tokens seen in chapter5, track examples seen in chapter6 (#319)	2024-08-13 07:09:05 -05:00
Sebastian Raschka	895512ebee	Revert accidental edit	2024-08-06 19:54:34 -05:00
rasbt	7a5771932b	note about logistic sigmoid	2024-08-06 19:48:30 -05:00
Sebastian Raschka	192bdc3501	improve gradient accumulation (#300)	2024-08-05 18:27:20 -05:00
Sebastian Raschka	6dd8666d9c	Test code in pytorch 2.4 (#285) * test code in pytorch 2.4 * update	2024-07-24 21:53:41 -05:00
Sebastian Raschka	d0f3b034d8	Add download help message (#274)	2024-07-19 08:29:29 -05:00
Jeroen Van Goey	70cfced899	fix typos, add codespell pre-commit hook (#264) * fix typos, add codespell pre-commit hook * Update .pre-commit-config.yaml --------- Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>	2024-07-16 07:07:04 -05:00
Sebastian Raschka	2ce1d16de0	show how to use the finetuned model	2024-07-09 06:43:26 -07:00
Daniel Kleine	87f47a281a	fixed spelling typos (#258)	2024-07-03 07:47:33 -05:00
Sebastian Raschka	2d8eacb0fa	Fix links in summary sections (#254)	2024-06-29 07:51:31 -05:00
rasbt	5e24a042c1	add links to summary sections	2024-06-29 07:33:26 -05:00
Daniel Kleine	fb4e37ae15	fixed minor issues (#252) * fixed typo * fixed var name in md text	2024-06-29 06:38:25 -05:00
Daniel Kleine	e387742b77	minor markdown fixes (#236)	2024-06-21 13:55:34 -05:00
Sebastian Raschka	87deec0f5f	Add standalone finetuning and evaluation scripts for chapter 7 (#234) * add finetuning and eval scripts * update link * update links * fix link	2024-06-21 05:23:24 -05:00
rasbt	c1f9361428	add main and optional sections	2024-06-19 17:48:25 -05:00
Daniel Kleine	73be1c592f	fixed num_workers (#229) * fixed num_workers * ch06 & ch07: added num_workers to create_dataloader_v1	2024-06-19 17:36:46 -05:00
Jinge Wang	8e2c8d0987	Fixed some typos in ch06.ipynb (#219)	2024-06-18 05:54:01 -05:00
rasbt	c8c0fd4fb5	fix spelling	2024-06-18 05:50:40 -05:00
rasbt	88ad21490c	replace figure	2024-06-18 05:46:36 -05:00
Daniel Kleine	e5c3c5ce99	minor bug fixes (#207) * fixed path arg for create_dataset_csvs() * updated assign_check() to remove user warning	2024-06-12 06:27:56 -05:00
rasbt	b2ff989174	distinguish better between main chapter code and bonus materials	2024-06-11 21:07:42 -05:00
Daniel Kleine	79210eb393	fixes for code (#206) * updated .gitignore * removed unused GELU import * fixed model_configs, fixed all tensors on same device * removed unused tiktoken * update * update hparam search * remove redundant tokenizer argument --------- Co-authored-by: rasbt <mail@sebastianraschka.com>	2024-06-11 20:59:48 -05:00
rasbt	b9ed5811c3	fix gradient comment	2024-06-09 20:23:18 -05:00

140 Commits