MC-LSTM

class neuralhydrology.modelzoo.mclstm.MCLSTM(cfg: Config)

Bases: BaseModel

Mass-Conserving LSTM model from Hoedt et al. [1].

This model implements the exact MC-LSTM configuration that was used by Hoedt et al. [1] in the hydrology experiments (for a more general and configurable MC-LSTM model class, check the official MC-LSTM GitHub repository).

The MC-LSTM is an LSTM-inspired timeseries model whose architecture guarantees conservation of the mass of a specified mass_input. The model consists of three parts:

  • an input/junction gate that distributes the mass input at a specific timestep across the memory cells

  • a redistribution matrix that allows for internal reorganization of the stored mass

  • an output gate that determines the fraction of the stored mass that is subtracted from the memory cells and defines the output of the MC-LSTM

Starting from the general MC-LSTM architecture as presented by Hoedt et al. [1], the most notable adaptation for the hydrology application is the use of a so-called “trash cell”. The trash cell is one particular cell of the cell state vector that is excluded from the model prediction, which is defined as the sum of the outgoing mass of all memory cells except the trash cell. For more details and different variants that were tested for the application to hydrology, see Appendix B.4.2 in Hoedt et al. [1].
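The mechanics described above can be sketched numerically. The following is a minimal NumPy sketch of a single MC-LSTM step (not the library's implementation); the learned, input-dependent gates are replaced by random stand-ins. Mass conservation follows because the input/junction gate sums to one, the redistribution matrix is column-stochastic, and the output gate only releases fractions of the stored mass; the prediction excludes the trash cell at index 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # hidden size (number of memory cells, trash cell at index 0)

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# random stand-ins for the learned, input-dependent gates
i = softmax(rng.normal(size=n))                # input/junction gate: sums to 1
R = softmax(rng.normal(size=(n, n)), axis=0)   # redistribution: columns sum to 1
o = 1.0 / (1.0 + np.exp(-rng.normal(size=n)))  # output gate: fractions in (0, 1)

c = np.zeros(n)   # cell state (stored mass), initially empty
x_m = 1.0         # mass input at this timestep

# one step: redistribute stored mass, add the distributed input,
# then release a fraction of each cell's mass
m_tot = R @ c + i * x_m
m_out = o * m_tot          # outgoing mass per cell
c = (1.0 - o) * m_tot      # remaining stored mass

# mass conservation: stored mass + released mass equals the mass that entered
assert np.isclose(c.sum() + m_out.sum(), x_m)

# the model prediction excludes the trash cell (index 0)
y_hat = m_out[1:].sum()
```

Conservation holds by construction here, not by training: the gates can only split and move mass, never create or destroy it.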

The config argument head is ignored for this model and the model prediction is always computed as the sum over the outgoing mass (excluding the trash cell output).

The config argument initial_forget_bias is here used to close (negative values) or to open (positive values) the output gate at the beginning of the training. Having the output gate closed means that the MC-LSTM has to actively learn when to remove mass from the system, which can be seen as an analogy to an open forget gate in the standard LSTM.
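As a quick illustration (plain Python, not library code): the output gate is a sigmoid of its pre-activation, so at initialization a negative initial_forget_bias starts the gate near zero (almost no mass leaves the cells), while a positive bias starts it near one.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# hypothetical bias magnitudes for illustration
closed = sigmoid(-3.0)  # gate mostly closed: only a small fraction of mass is released
opened = sigmoid(3.0)   # gate mostly open: most of the stored mass is released
```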

To use this model class, you have to specify the name of the mass input using the mass_input config argument. Additionally, the mass input and target variable should not be normalized. Use the config argument custom_normalization and set the centering and scaling key for both to None (see config arguments for more details on custom_normalization).
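The setup described above might look as follows in a run configuration. This is a hedged sketch: precipitation and discharge are placeholder variable names, and the exact key names should be verified against the config-arguments documentation.

```yaml
# hypothetical excerpt of a run config;
# 'precipitation' and 'discharge' are placeholder variable names
model: mclstm
mass_inputs:
  - precipitation
target_variables:
  - discharge
custom_normalization:
  precipitation:
    centering: None
    scaling: None
  discharge:
    centering: None
    scaling: None
```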

Currently, only a single mass input per time step is supported, as well as a single target.

Parameters:

cfg (Config) – The run configuration.

Raises:

ValueError – If no mass_input, or more than one, is specified in the config. An error is also raised if the hidden size is smaller than 2, the number of target_variables is greater than 1, or dynamics_embedding is specified in the config, which is (currently) not supported for this model class.

References

[1] Hoedt, P.-J., Kratzert, F., Klotz, D., Halmich, C., Holzleitner, M., Nearing, G. S., Hochreiter, S., and Klambauer, G.: “MC-LSTM: Mass-Conserving LSTM”, Proceedings of the 38th International Conference on Machine Learning (ICML), 2021.

forward(data: Dict[str, Tensor]) → Dict[str, Tensor]

Perform a forward pass on the MC-LSTM model.

Parameters:

data (Dict[str, torch.Tensor]) – Dictionary, containing input features as key-value pairs.

Returns:

Model outputs and intermediate states as a dictionary.
  • y_hat: model predictions of shape [batch size, sequence length, 1].

  • m_out: mass output of the MC-LSTM (including trash cell as index 0) of shape [batch size, sequence length, hidden size].

  • c: cell state of the MC-LSTM of shape [batch size, sequence length, hidden size].

Return type:

Dict[str, torch.Tensor]

module_parts = ['embedding_net', 'mclstm']