@@ -308,19 +308,19 @@ The code defines two C++ structs, `ConfigGemma7B` and `ConfigGemma2B`, which are
**ConfigGemma7B**:
- * `seq_len `: Stores the length of the sequence to be processed. It's set to 7168.
- * `vocab_size `: Stores the size of the vocabulary, which is 256128.
- * `n_layers `: Number of layers in the deep learning model. It's set to 28.
- * `dim_model `: Dimension of the model's internal representation. It's set to 3072.
- * `dim_ffw_hidden `: Dimension of the feedforward and recurrent layers' hidden representations. It's set to 16 * 3072 / 2.
+ * `kSeqLen`: Length of the sequence to be processed. It's set to 7168.
+ * `kVocabSize`: Size of the vocabulary, which is 256128.
+ * `kLayers`: Number of layers in the deep learning model. It's set to 28.
+ * `kModelDim`: Dimension of the model's internal representation. It's set to 3072.
+ * `kFFHiddenDim`: Dimension of the feedforward layers' hidden representation. It's set to 16 * 3072 / 2, i.e. 24576.
**ConfigGemma2B**:
- * `seq_len `: Stores the length of the sequence to be processed. It's also set to 7168.
- * `vocab_size `: Size of the vocabulary, which is 256128.
- * `n_layers `: Number of layers in the deep learning model. It's set to 18.
- * `dim_model `: Dimension of the model's internal representation. It's set to 2048.
- * `dim_ffw_hidden `: Dimension of the feedforward and recurrent layers' hidden representations. It's set to 16 * 2048 / 2.
+ * `kSeqLen`: Length of the sequence to be processed. It's also set to 7168.
+ * `kVocabSize`: Size of the vocabulary, which is 256128.
+ * `kLayers`: Number of layers in the deep learning model. It's set to 18.
+ * `kModelDim`: Dimension of the model's internal representation. It's set to 2048.
+ * `kFFHiddenDim`: Dimension of the feedforward layers' hidden representation. It's set to 16 * 2048 / 2, i.e. 16384.
These structs configure a deep learning model with the parameters of either the Gemma7B or the Gemma2B architecture.
```
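
For orientation, the two structs described in the output above might look roughly like the sketch below. Only the member names and values are taken from that description. The `int` type, the use of `static constexpr`, and the `DescribeConfig` helper are assumptions made for illustration and may not match the real header.

```cpp
#include <string>

// Hypothetical reconstruction: names and values follow the description
// above; types and `static constexpr` are assumptions.
struct ConfigGemma7B {
  static constexpr int kSeqLen = 7168;
  static constexpr int kVocabSize = 256128;
  static constexpr int kLayers = 28;
  static constexpr int kModelDim = 3072;
  static constexpr int kFFHiddenDim = 16 * 3072 / 2;  // 24576
};

struct ConfigGemma2B {
  static constexpr int kSeqLen = 7168;
  static constexpr int kVocabSize = 256128;
  static constexpr int kLayers = 18;
  static constexpr int kModelDim = 2048;
  static constexpr int kFFHiddenDim = 16 * 2048 / 2;  // 16384
};

// Compile-time checks of the arithmetic quoted in the description.
static_assert(ConfigGemma7B::kFFHiddenDim == 24576, "7B feedforward dim");
static_assert(ConfigGemma2B::kFFHiddenDim == 16384, "2B feedforward dim");

// Hypothetical helper (not part of any library): formats a config as text.
// Code templated on the config type can read the constants the same way.
template <class Config>
std::string DescribeConfig(const std::string& name) {
  return name + ": layers=" + std::to_string(Config::kLayers) +
         " model_dim=" + std::to_string(Config::kModelDim) +
         " ff_hidden_dim=" + std::to_string(Config::kFFHiddenDim);
}
```

Calling `DescribeConfig<ConfigGemma2B>("ConfigGemma2B")` yields a one-line summary of the 2B settings. Because every value is a compile-time constant, code templated on the config type can size its buffers without runtime lookups.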