TokenEmbedding

class keras_mml.layers.core.TokenEmbedding

Turns non-negative integers (indices) into vectors of fixed size.

For example, the input [[1, 2], [3, 4], [5, 6]], which can be read as 3 sentences of 2 words each, could be embedded as [[[0.1, 0.2, 0.3], [0.3, 0.4, 0.5]], [[1.1, 1.2, 1.3], [1.3, 1.4, 1.5]], [[2.1, 2.2, 2.3], [2.3, 2.4, 2.5]]]. The result has shape (3, 2, 3): 3 sentences, 2 words per sentence, and an embedding dimension of 3.
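The lookup described above can be sketched in plain NumPy. This is only an illustration of the idea, not the layer's implementation: the names `embedding_table` and `token_ids`, the sizes, and the random weights are all assumptions made for the example.

```python
import numpy as np

# Hypothetical sizes chosen for illustration only.
vocab_size = 7      # one more than the largest index (6) used below
embedding_dim = 3

# A random table standing in for the layer's learned weights (assumption).
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(vocab_size, embedding_dim))

# 3 sentences with 2 token indices each.
token_ids = np.array([[1, 2], [3, 4], [5, 6]])

# Integer-array indexing selects one table row per token index.
embedded = embedding_table[token_ids]

print(embedded.shape)  # (3, 2, 3)
```

Each index selects one row of the table, so the output gains a trailing axis of length `embedding_dim`.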

This layer can optionally include position information in the embeddings when the with_positions attribute is enabled.
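One common way to add position information is to sum a second, position-indexed table onto the token embeddings. The sketch below assumes that scheme purely for illustration. This page does not state how TokenEmbedding encodes positions, so `position_table` and the additive combination are assumptions, not the library's documented behavior.

```python
import numpy as np

# Hypothetical sizes chosen for illustration only.
max_len, vocab_size, embedding_dim = 4, 7, 3

# Random tables standing in for learned weights (assumption).
rng = np.random.default_rng(0)
token_table = rng.normal(size=(vocab_size, embedding_dim))
position_table = rng.normal(size=(max_len, embedding_dim))

token_ids = np.array([[1, 2], [3, 4], [5, 6]])
sequence_len = token_ids.shape[1]  # must not exceed max_len

# Look up one vector per token and one vector per position.
token_vectors = token_table[token_ids]                       # (3, 2, 3)
position_vectors = position_table[np.arange(sequence_len)]   # (2, 3)

# Assumed combination: add, broadcasting positions across the batch axis.
combined = token_vectors + position_vectors

print(combined.shape)  # (3, 2, 3)
```

Under this assumption the output shape is unchanged, and the same token produces a different vector depending on where it appears in the sentence.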

max_len

Maximum length of a sentence.

vocab_size

Size of the vocabulary. Typically this is one more than the maximum integer index.

embedding_dim

Embedding dimension.

with_positions

Whether to include position information in the embeddings.

__init__(max_len, vocab_size, embedding_dim, with_positions=False, **kwargs)

Initializes a new instance of the layer.

Parameters:
  • max_len (int) – Maximum length of a sentence.

  • vocab_size (int) – Size of the vocabulary. Typically this is one more than the maximum integer index.

  • embedding_dim (int) – Embedding dimension.

  • with_positions (bool, default: False) – Whether to include position information in the embeddings.

  • **kwargs – Keyword arguments for keras.Layer.

Raises:
  • ValueError – If the maximum sentence length is not a positive integer.

  • ValueError – If the vocabulary size is not a positive integer.

  • ValueError – If the embedding dimension is not a positive integer.
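The three error conditions above share one rule: the argument must be a positive integer. A minimal sketch of such a check follows. `require_positive_int` is a hypothetical helper written for this example and is not part of keras_mml, and the library's actual checks and error messages may differ.

```python
def require_positive_int(name: str, value: int) -> int:
    """Hypothetical helper: return value if it is a positive integer."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return value


# Valid arguments pass through unchanged.
max_len = require_positive_int("max_len", 16)
vocab_size = require_positive_int("vocab_size", 1000)
embedding_dim = require_positive_int("embedding_dim", 64)

# An invalid argument raises ValueError.
try:
    require_positive_int("vocab_size", 0)
except ValueError as error:
    print(error)
```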

build(input_shape)

Builds the layer.

Parameters:

input_shape (Tuple[int, int]) – Shape of the input.

call(inputs)

Applies the layer to the inputs.

Parameters:

inputs (Float[ndarray, 'batch_size sequence_len']) – Inputs into the layer.

Returns:

Float[ndarray, 'batch_size sequence_len embedding_dim'] – Transformed inputs.