TokenEmbedding

class keras_mml.layers.core.TokenEmbedding

Turns non-negative integers (indices) into vectors of fixed size.

For example, the input [[1, 2], [3, 4], [5, 6]], which can be read as 3 sentences of 2 words each, could be embedded as [[[0.1, 0.2, 0.3], [0.3, 0.4, 0.5]], [[1.1, 1.2, 1.3], [1.3, 1.4, 1.5]], [[2.1, 2.2, 2.3], [2.3, 2.4, 2.5]]]. The result has shape (3, 2, 3): 3 sentences, 2 words each, and an embedding dimension of 3.
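The lookup described above can be sketched in plain NumPy. This is only an illustration of the indexing behaviour, not the layer's actual implementation; the table values are random and the names embedding_table and token_ids are hypothetical.

```python
import numpy as np

# Hypothetical sizes chosen for illustration only.
vocab_size = 7       # indices 0..6, i.e. one more than the largest index (6)
embedding_dim = 3

# Stand-in for the layer's trainable embedding table: one row per index.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(vocab_size, embedding_dim))

# 3 sentences with 2 words each.
token_ids = np.array([[1, 2], [3, 4], [5, 6]])

# Integer-array indexing replaces each index with its row of the table.
embedded = embedding_table[token_ids]

print(embedded.shape)  # (3, 2, 3)
```

Each index is swapped for one row of the table, so the output gains a trailing axis of size embedding_dim.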

This layer can optionally include position information in the embeddings when with_positions is set to True.
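The documentation does not say how position information is combined with the token embeddings. One common scheme, assumed here purely for illustration, adds a learned position vector (one per position, up to max_len) to each token vector. The names token_table and position_table are hypothetical.

```python
import numpy as np

# Hypothetical sizes chosen for illustration only.
max_len, vocab_size, embedding_dim = 4, 7, 3

rng = np.random.default_rng(0)
token_table = rng.normal(size=(vocab_size, embedding_dim))   # one row per index
position_table = rng.normal(size=(max_len, embedding_dim))   # one row per position

token_ids = np.array([[1, 2], [3, 4], [5, 6]])
sequence_len = token_ids.shape[1]

# Look up token vectors: shape (batch_size, sequence_len, embedding_dim).
token_vectors = token_table[token_ids]

# Take the rows for positions 0..sequence_len-1: shape (sequence_len, embedding_dim).
position_vectors = position_table[:sequence_len]

# ASSUMPTION: additive combination; broadcasting repeats the position
# vectors across the batch axis.
embedded = token_vectors + position_vectors

print(embedded.shape)  # (3, 2, 3)
```

Under this assumption the output shape is the same with or without positions; only the values change.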

max_len: Maximum length of a sentence.

vocab_size: Size of the vocabulary. Typically this is one more than the maximum integer index.

embedding_dim: Embedding dimension.

with_positions: Whether to include position information in the embeddings.
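As a rough sketch of how these sizes might be derived from data, the snippet below computes vocab_size as one more than the largest index and, as an assumption, takes max_len to be the length of the longest sentence. The variable sentences is hypothetical.

```python
# Hypothetical tokenized corpus: each inner list is one sentence of indices.
sentences = [[1, 2], [3, 4], [5, 6]]

# Indices start at 0, so the table needs one more row than the largest index.
max_index = max(max(sentence) for sentence in sentences)
vocab_size = max_index + 1

# ASSUMPTION: max_len is taken as the longest sentence in the data.
max_len = max(len(sentence) for sentence in sentences)

print(vocab_size, max_len)  # 7 2
```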

__init__(max_len, vocab_size, embedding_dim, with_positions=False, **kwargs)

Initializes a new instance of the layer.

Parameters:

max_len (int) – Maximum length of a sentence.
vocab_size (int) – Size of the vocabulary. Typically this is one more than the maximum integer index.
embedding_dim (int) – Embedding dimension.
with_positions (bool, default: False) – Whether to include position information in the embeddings.
**kwargs – Keyword arguments for keras.Layer.

Raises:

ValueError – If the maximum sentence length is not a positive integer.
ValueError – If the vocabulary size is not a positive integer.
ValueError – If the embedding dimension is not a positive integer.
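The checks listed above could look roughly like the following. This is a hypothetical helper written for illustration, not the library's own validation code, and the wording of the error message is invented.

```python
def check_positive_int(name, value):
    """Return value unchanged, or raise ValueError if it is not a positive integer."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return value


# Valid arguments pass through unchanged.
max_len = check_positive_int("max_len", 16)
vocab_size = check_positive_int("vocab_size", 1000)
embedding_dim = check_positive_int("embedding_dim", 64)

# An invalid argument raises ValueError.
try:
    check_positive_int("embedding_dim", 0)
except ValueError as error:
    print(error)
```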

build(input_shape)

Builds the layer.

Parameters: input_shape (Tuple[int, int]) – Shape of the input.

call(inputs)

Applies the layer to the inputs.

Parameters: inputs (Float[ndarray, 'batch_size sequence_len']) – Inputs into the layer.
Returns: Float[ndarray, 'batch_size sequence_len embedding_dim'] – Transformed inputs.