mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2025-10-25 23:11:23 +00:00)

	add usage
parent fb54b064c9
commit d311bae25a

@@ -7,6 +7,8 @@ For example,
- comparing rows 1 and 3 answers the question: "What is the performance difference when we train only the last layer instead of the last block?";
- and so forth.

|   | Model              | Weights    | Trainable token | Trainable layers | Context length          | CPU/GPU | Training time | Training acc | Validation acc | Test acc |
|---|--------------------|------------|-----------------|------------------|-------------------------|---------|---------------|--------------|----------------|----------|
| 1 | gpt2-small (124M)  | pretrained | last            | last_block       | longest train ex. (120) | V100    | 0.39 min      | 96.63%       | 97.99%         | 94.33%   |
@@ -16,4 +18,17 @@ For example,
| 5 | gpt2-medium (355M) | pretrained | last            | last_block       | longest train ex. (120) | V100    | 0.91 min      | 87.50%       | 51.01%         | 56.67%   |
| 6 | gpt2-large (774M)  | pretrained | last            | last_block       | longest train ex. (120) | V100    | 1.91 min      | 99.52%       | 98.66%         | 96.67%   |
| 7 | gpt2-small (124M)  | random     | last            | all              | longest train ex. (120) | V100    | 0.93 min      | 100%         | 97.32%         | 93.00%   |
| 8 | gpt2-small (124M)  | pretrained | last            | last_block       | context length (1024)   | V100    | 3.24 min      | 83.08%       | 87.92%         | 78.33%   |
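As a rough illustration of the row-to-row comparisons described above, the sketch below contrasts rows 1 and 8, which differ only in context length. The numbers are copied from the table; the `rows` dictionary and the `compare` helper are hypothetical names introduced here, not part of the repository.

```python
# Test accuracy (%) and training time (min) copied from rows 1 and 8 of the table.
rows = {
    1: {"context": "longest train ex. (120)", "train_min": 0.39, "test_acc": 94.33},
    8: {"context": "context length (1024)", "train_min": 3.24, "test_acc": 78.33},
}


def compare(a, b):
    """Return row b relative to row a as
    (test-accuracy change in percentage points, training-time ratio)."""
    acc_delta = round(rows[b]["test_acc"] - rows[a]["test_acc"], 2)
    time_ratio = round(rows[b]["train_min"] / rows[a]["train_min"], 2)
    return acc_delta, time_ratio


acc_delta, time_ratio = compare(1, 8)
print(f"Row 8 vs. row 1: {acc_delta:+.2f} points test accuracy, "
      f"{time_ratio:.2f}x training time")
```

In this single run, padding to the full context length coincides with lower test accuracy and longer training time; the table alone does not say how much of that gap would persist across seeds.
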
### Usage:

- Row 1: `python additional-experiments.py`
- Row 2: `python additional-experiments.py --trainable_token first`
- Row 3: `python additional-experiments.py --trainable_layers last_layer`
- Row 4: `python additional-experiments.py --trainable_layers all`
- Row 5: `python additional-experiments.py --model_size "gpt2-medium (355M)"`
- Row 6: `python additional-experiments.py --model_size "gpt2-large (774M)"`
- Row 7: `python additional-experiments.py --weights random --trainable_layers all`
- Row 8: `python additional-experiments.py --context_length "model_context_length"`
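
Model names such as `gpt2-medium (355M)` contain a space and parentheses, which a POSIX shell would interpret unless the value is quoted. The following is a minimal sketch of one way to generate the commands above with correct quoting. The flags and values are taken from the usage list; `ROW_ARGS` and `build_command` are hypothetical names introduced for illustration, not part of `additional-experiments.py`.

```python
import shlex

# Flags and values copied from the usage list; an empty list means
# the script is run with its defaults (row 1).
ROW_ARGS = {
    1: [],
    2: ["--trainable_token", "first"],
    3: ["--trainable_layers", "last_layer"],
    4: ["--trainable_layers", "all"],
    5: ["--model_size", "gpt2-medium (355M)"],
    6: ["--model_size", "gpt2-large (774M)"],
    7: ["--weights", "random", "--trainable_layers", "all"],
    8: ["--context_length", "model_context_length"],
}


def build_command(row):
    """Return the argument list for one table row (hypothetical helper)."""
    return ["python", "additional-experiments.py", *ROW_ARGS[row]]


for row in ROW_ARGS:
    # shlex.join quotes any argument containing spaces or parentheses,
    # so each printed line can be pasted into a POSIX shell.
    print(f"Row {row}: {shlex.join(build_command(row))}")
```

The list returned by `build_command` could also be passed directly to `subprocess.run`, which bypasses the shell and makes quoting unnecessary.
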