add usage

2025-12-20 03:32:10 +00:00 · 2024-04-24 07:27:04 -05:00 · 2024-04-24 07:27:04 -05:00 · d311bae25a
commit d311bae25a
parent fb54b064c9
1 changed file with 16 additions and 1 deletion
--- a/ch06/02_bonus_additional-experiments/README.md
+++ b/ch06/02_bonus_additional-experiments/README.md
@@ -7,6 +7,8 @@ For example,
 - comparing rows 1 and 3 answers the question: "What is the performance difference when we train only the last layer instead of the last block?";
 - and so forth.

+&nbsp;
+
 |   | Model              | Weights    | Trainable token | Trainable layers | Context length          | CPU/GPU | Training time | Training acc | Validation acc | Test acc |
 |---|--------------------|------------|-----------------|------------------|-------------------------|---------|---------------|--------------|----------------|----------|
 | 1 | gpt2-small (124M)  | pretrained | last            | last_block       | longest train ex. (120) | V100    | 0.39 min      | 96.63%       | 97.99%         | 94.33%   |
@@ -17,3 +19,16 @@ For example,
 | 6 | gpt2-large (774M)  | pretrained | last            | last_block       | longest train ex. (120) | V100    | 1.91 min      | 99.52%       | 98.66%         | 96.67%   |
 | 7 | gpt2-small (124M)  | random     | last            | all              | longest train ex. (120) | V100    | 0.93 min      | 100%         | 97.32%         | 93.00%   |
 | 8 | gpt2-small (124M)  | pretrained | last            | last_block       | context length (1024)   | V100    | 3.24 min      | 83.08%       | 87.92%         | 78.33%   |
+
+&nbsp;
+
+### Usage:
+
+- Row 1: `python additional-experiments.py`
+- Row 2: `python additional-experiments.py --trainable_token first` 
+- Row 3: `python additional-experiments.py --trainable_layers last_layer`
+- Row 4: `python additional-experiments.py --trainable_layers all`
+- Row 5: `python additional-experiments.py --model_size "gpt2-medium (355M)"`
+- Row 6: `python additional-experiments.py --model_size "gpt2-large (774M)"`
+- Row 7: `python additional-experiments.py --weights random --trainable_layers all`
+- Row 8: `python additional-experiments.py --context_length "model_context_length"`
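
The flags in the usage list above could be parsed roughly as follows. This is a minimal sketch, not the actual contents of `additional-experiments.py`: the `build_parser` helper is hypothetical, the defaults are inferred from row 1 of the table, the choices are limited to the values that appear in the table and commands, and the `longest_training_example` default is an assumption. It also shows why a `--model_size` value needs quoting on the command line: the value contains a space and parentheses, which the shell would otherwise split or reject.

```python
import argparse
import shlex


def build_parser():
    # Hypothetical parser; flag names come from the usage list,
    # defaults are assumed to match row 1 of the results table.
    parser = argparse.ArgumentParser(description="Classification finetuning experiments")
    parser.add_argument(
        "--model_size",
        default="gpt2-small (124M)",
        choices=["gpt2-small (124M)", "gpt2-medium (355M)", "gpt2-large (774M)"],
    )
    parser.add_argument("--weights", default="pretrained", choices=["pretrained", "random"])
    parser.add_argument(
        "--trainable_layers",
        default="last_block",
        choices=["last_layer", "last_block", "all"],
    )
    parser.add_argument("--trainable_token", default="last", choices=["first", "last"])
    # Assumed default name; only "model_context_length" appears in the usage list.
    parser.add_argument("--context_length", default="longest_training_example")
    return parser


# shlex.split mimics shell tokenization: the quoted value stays one argument.
args = build_parser().parse_args(shlex.split('--model_size "gpt2-medium (355M)"'))
print(args.model_size, args.trainable_layers)
```

Any flag that is not passed keeps its default, which is why row 1 needs no arguments and every other row only names the settings that differ from it.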