RMSNorm

class keras_mml.layers.normalizations.RMSNorm

Implements Root Mean Square Normalization (RMSNorm).

The implementation follows the paper "Root Mean Square Layer Normalization" (Zhang and Sennrich, 2019).
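For orientation, the paper normalizes each feature vector by its root mean square. The following is a minimal NumPy sketch of that idea, not the library's code: the function name `rms_norm`, the choice of the last axis, and the small `eps` term are all assumptions made for illustration.

```python
import numpy as np

def rms_norm(x, eps=1e-8):
    """Illustrative RMS normalization over the last axis (not keras_mml code)."""
    x = np.asarray(x, dtype=np.float64)
    # Root mean square of each feature vector along the last axis.
    rms = np.sqrt(np.mean(np.square(x), axis=-1, keepdims=True))
    # eps is an assumed safeguard against division by zero.
    return x / (rms + eps)

x = np.array([[3.0, 4.0], [1.0, -1.0]])
y = rms_norm(x)
```

After this step, each row of `y` has a root mean square of approximately one; the learnable gain and optional bias described below would be applied on top of it.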

has_learnable_weights

Whether the layer has learnable per-element affine parameters.

use_bias

Whether the layer uses a bias vector.

gain_initializer

Initializer for the gain weights.

bias_initializer

Initializer for the bias vector.

gain_regularizer

Regularizer for the gain weights.

bias_regularizer

Regularizer for the bias vector.

gain_constraint

Constraint for the gain weights.

bias_constraint

Constraint for the bias vector.

scale

Scaling factor. Available only after the layer is built.

__init__(has_learnable_weights=True, use_bias=False, gain_initializer='ones', bias_initializer='zeros', gain_regularizer=None, bias_regularizer=None, gain_constraint=None, bias_constraint=None, **kwargs)

Initializes a new RMSNorm instance.

Parameters:
  • has_learnable_weights (bool, default: True) – When True, the layer has learnable per-element affine parameters: gains (weights), initialized to ones by default, and biases, initialized to zeros by default.

  • use_bias (bool, default: False) – Whether the layer uses a bias vector. Ignored if has_learnable_weights is False.

  • gain_initializer (str, default: 'ones') – Initializer for the gain weights.

  • bias_initializer (str, default: 'zeros') – Initializer for the bias vector.

  • gain_regularizer (Optional[str], default: None) – Regularizer for the gain weights.

  • bias_regularizer (Optional[str], default: None) – Regularizer for the bias vector.

  • gain_constraint (Optional[str], default: None) – Constraint for the gain weights.

  • bias_constraint (Optional[str], default: None) – Constraint for the bias vector.

  • **kwargs – Keyword arguments for keras.Layer.
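The interaction between `has_learnable_weights` and `use_bias` described above can be sketched as follows. This is a hypothetical stand-in written in plain NumPy, not the layer's actual `build` logic: the helper name `make_affine_params` and the assumption that parameters have one entry per feature on the last axis are illustrative only.

```python
import numpy as np

def make_affine_params(dim, has_learnable_weights=True, use_bias=False):
    """Hypothetical helper showing which parameters the flags would imply."""
    params = {}
    if has_learnable_weights:
        # Gains default to ones (the documented 'ones' initializer).
        params["gain"] = np.ones(dim)
        if use_bias:
            # Biases default to zeros (the documented 'zeros' initializer).
            params["bias"] = np.zeros(dim)
    # use_bias is ignored when has_learnable_weights is False.
    return params

default = make_affine_params(4)
with_bias = make_affine_params(4, use_bias=True)
no_weights = make_affine_params(4, has_learnable_weights=False, use_bias=True)
```

Under these assumptions, the defaults yield a gain only, `use_bias=True` adds a bias, and disabling `has_learnable_weights` yields no parameters at all regardless of `use_bias`.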

build(input_shape)

Creates the layer weights.

Parameters:

input_shape (Tuple[int, ...]) – Shape of the input.

call(inputs)

Applies the layer to the inputs.

Parameters:

inputs (Float[ndarray, 'batch_size *dims']) – Inputs to the layer.

Returns:

Float[ndarray, 'batch_size *dims'] – Transformed inputs.
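The shape annotation `'batch_size *dims'` indicates a leading batch axis followed by any number of further axes, with the output shape matching the input shape. A rough NumPy illustration of that contract is given below; `rms_norm_call` is a hypothetical function, and normalizing over the last axis with an `eps` term is an assumption rather than a statement about the library's internals.

```python
import numpy as np

def rms_norm_call(inputs, gain=None, bias=None, eps=1e-8):
    """Illustrative forward pass: normalize, then apply optional gain and bias."""
    rms = np.sqrt(np.mean(np.square(inputs), axis=-1, keepdims=True))
    outputs = inputs / (rms + eps)
    if gain is not None:
        outputs = outputs * gain  # broadcasts over the last axis
    if bias is not None:
        outputs = outputs + bias
    return outputs

rng = np.random.default_rng(0)
inputs = rng.normal(size=(2, 5, 8))  # batch_size=2, dims=(5, 8)
outputs = rms_norm_call(inputs, gain=np.ones(8), bias=np.zeros(8))
```

With unit gains and zero biases, the affine step leaves the normalized values unchanged, so `outputs` keeps the input shape and each trailing feature vector has a root mean square close to one.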