Transformer

class neuralhydrology.modelzoo.transformer.Transformer(cfg: Config)

Bases: BaseModel

Transformer model class, which relies on PyTorch’s TransformerEncoder class.

This class implements the encoder of a transformer network, which can be used for regression. The number of inputs must be divisible by the number of transformer heads (transformer_nheads); if it is not, an embedding network is needed to guarantee this. To achieve this, use statics/dynamics_embedding, so that the static and dynamic inputs are passed through embedding networks before being concatenated. The embedding networks map the static and dynamic features to size statics/dynamics_embedding['hiddens'][-1], so the total embedding size is the sum of these two values. The model is configured in the config file using the following options:

transformer_positional_encoding_type: whether to “sum” or “concatenate” the positional encoding with the other model inputs.
transformer_positional_dropout: dropout fraction applied to the positional encoding.
seq_length: integer number of timesteps in the input sequence.
transformer_nheads: number of self-attention heads.
transformer_dim_feedforward: dimension of the feed-forward networks between self-attention heads.
transformer_dropout: dropout fraction in the feed-forward networks between self-attention heads.
transformer_nlayers: number of stacked self-attention + feed-forward layers.
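
The divisibility requirement described above can be checked before a run is started. The following is a minimal sketch and not part of neuralhydrology: the helper names total_embedding_size and check_embedding_size are hypothetical, the configuration is represented as a plain dictionary rather than a Config object, and all values are illustrative. It assumes, as stated above, that the total embedding size is the sum of the last 'hiddens' entries of the two embedding configurations.

```python
def total_embedding_size(statics_embedding: dict, dynamics_embedding: dict) -> int:
    """Sum the output sizes of the static and dynamic embedding networks."""
    return statics_embedding["hiddens"][-1] + dynamics_embedding["hiddens"][-1]


def check_embedding_size(cfg: dict) -> int:
    """Return the total embedding size, or raise if it is not divisible by the head count.

    Hypothetical helper for illustration only; `cfg` is a plain dict here.
    """
    size = total_embedding_size(cfg["statics_embedding"], cfg["dynamics_embedding"])
    nheads = cfg["transformer_nheads"]
    if size % nheads != 0:
        raise ValueError(
            f"Total embedding size {size} is not divisible by transformer_nheads={nheads}."
        )
    return size


# Illustrative values only; the option names are the ones listed above.
example_cfg = {
    "statics_embedding": {"hiddens": [16, 8]},
    "dynamics_embedding": {"hiddens": [32, 24]},
    "transformer_positional_encoding_type": "sum",
    "transformer_positional_dropout": 0.0,
    "seq_length": 365,
    "transformer_nheads": 4,
    "transformer_dim_feedforward": 64,
    "transformer_dropout": 0.1,
    "transformer_nlayers": 2,
}

embedding_size = check_embedding_size(example_cfg)  # 8 + 24 = 32, divisible by 4
```

With these values the static and dynamic embeddings contribute 8 and 24 features, so four heads each operate on 8 features. Changing transformer_nheads to 5 would make the check raise a ValueError.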

Parameters: cfg (Config) – The run configuration.

forward(data: dict[str, Tensor | dict[str, Tensor]]) → Dict[str, Tensor]

Perform a forward pass on a transformer model without a decoder.

Parameters:

data (dict[str, torch.Tensor | dict[str, torch.Tensor]]) – Dictionary containing input features as key-value pairs.

Returns:

Model outputs and intermediate states as a dictionary.

y_hat: model predictions of shape [batch size, sequence length, number of target variables].

Return type:

Dict[str, torch.Tensor]
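
To illustrate the shape of the returned y_hat, the following is a simplified sketch of an encoder-only regression model built on PyTorch's TransformerEncoder. It is not the neuralhydrology implementation: the class name EncoderOnlyRegressor and its constructor arguments are hypothetical, a single linear layer stands in for the embedding networks, and positional encoding and attention masking are omitted for brevity.

```python
import torch
from torch import nn


class EncoderOnlyRegressor(nn.Module):
    """Hypothetical encoder-only transformer for sequence regression (illustration only)."""

    def __init__(self, n_inputs: int, d_model: int, nheads: int,
                 dim_feedforward: int, nlayers: int, n_targets: int,
                 dropout: float = 0.0):
        super().__init__()
        # Stand-in for the embedding networks: maps inputs to a size divisible by nheads.
        self.embedding_net = nn.Linear(n_inputs, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=nheads,
            dim_feedforward=dim_feedforward,
            dropout=dropout,
            batch_first=True,  # inputs are [batch size, sequence length, features]
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=nlayers)
        # Regression head applied independently at every timestep.
        self.head = nn.Linear(d_model, n_targets)

    def forward(self, x: torch.Tensor) -> dict:
        hidden = self.encoder(self.embedding_net(x))
        return {"y_hat": self.head(hidden)}


model = EncoderOnlyRegressor(n_inputs=5, d_model=32, nheads=4,
                             dim_feedforward=64, nlayers=2, n_targets=1)
model.eval()
with torch.no_grad():
    out = model(torch.randn(2, 10, 5))  # batch of 2, sequence length 10, 5 input features

y_hat_shape = tuple(out["y_hat"].shape)  # [batch size, sequence length, number of targets]
```

The output keeps the batch and sequence dimensions of the input and replaces the feature dimension with the number of target variables, which matches the y_hat shape described above.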

module_parts = ['embedding_net', 'encoder', 'head']