From d311bae25a37100a07104fbc1a0f81d7450bcd0c Mon Sep 17 00:00:00 2001
From: rasbt
Date: Wed, 24 Apr 2024 07:27:04 -0500
Subject: [PATCH] add usage

---
 ch06/02_bonus_additional-experiments/README.md | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/ch06/02_bonus_additional-experiments/README.md b/ch06/02_bonus_additional-experiments/README.md
index fcaf441..62e85ee 100644
--- a/ch06/02_bonus_additional-experiments/README.md
+++ b/ch06/02_bonus_additional-experiments/README.md
@@ -7,6 +7,8 @@ For example,
 - comparing rows 1 and 3 answers the question: "What is the performance difference when we train only the last layer instead of the last block?";
 - and so forth.
 
+&nbsp;
+
 | | Model | Weights | Trainable token | Trainable layers | Context length | CPU/GPU | Training time | Training acc | Validation acc | Test acc |
 |---|--------------------|------------|-----------------|------------------|-------------------------|---------|---------------|--------------|----------------|----------|
 | 1 | gpt2-small (124M) | pretrained | last | last_block | longest train ex. (120) | V100 | 0.39 min | 96.63% | 97.99% | 94.33% |
@@ -16,4 +18,17 @@ For example,
 | 5 | gpt2-medium (355M) | pretrained | last | last_block | longest train ex. (120) | V100 | 0.91 min | 87.50% | 51.01% | 56.67% |
 | 6 | gpt2-large (774M) | pretrained | last | last_block | longest train ex. (120) | V100 | 1.91 min | 99.52% | 98.66% | 96.67% |
 | 7 | gpt2-small (124M) | random | last | all | longest train ex. (120) | V100 | 0.93 min | 100% | 97.32% | 93.00% |
-| 8 | gpt2-small (124M) | pretrained | last | last_block | context length (1024) | V100 | 3.24 min | 83.08% | 87.92% | 78.33% |
\ No newline at end of file
+| 8 | gpt2-small (124M) | pretrained | last | last_block | context length (1024) | V100 | 3.24 min | 83.08% | 87.92% | 78.33% |
+
+&nbsp;
+
+### Usage:
+
+- Row 1: `python additional-experiments.py`
+- Row 2: `python additional-experiments.py --trainable_token first`
+- Row 3: `python additional-experiments.py --trainable_layers last_layer`
+- Row 4: `python additional-experiments.py --trainable_layers all`
+- Row 5: `python additional-experiments.py --model_size "gpt2-medium (355M)"`
+- Row 6: `python additional-experiments.py --model_size "gpt2-large (774M)"`
+- Row 7: `python additional-experiments.py --weights random --trainable_layers all`
+- Row 8: `python additional-experiments.py --context_length "model_context_length"`
\ No newline at end of file