Configuration Arguments ======================= This page provides a list of possible configuration arguments. Check out the file `examples/config.yml.example `__ for an example of how a config file could look like. General experiment configurations --------------------------------- - ``experiment_name``: Defines the name of your experiment that will be used as a folder name (+ date-time string), as well as the name in TensorBoard. Curly brackets insert other configuration arguments into the experiment name. For example, the argument ``experiment_name: batch-size-is-{batch_size}`` will yield ``batch-size-is-42`` if the ``batch_size`` argument is set to *42*. Furthermore, the special expression ``{random_name}`` (in the experiment name) provides a randomly named string (e.g., ``yellow-frog``) that can make experimental runs easier to recognize. - ``run_dir``: Full or relative path to where the run directory is stored (if empty runs are stored in ${current\_working\_dir}/runs/) - ``train_basin_file``: Full or relative path to a text file containing the training basins (use data set basin id, one id per line). - ``validation_basin_file``: Full or relative path to a text file containing the validation basins (use data set basin id, one id per line). - ``test_basin_file``: Full or relative path to a text file containing the training basins (use data set basin id, one id per line). - ``train_start_date``: Start date of the training period (first day of discharge) in the format ``DD/MM/YYYY``. Can also be a list of dates to specify multiple training periods. If a list is specified, ``train_end_date`` must also be a list of equal length. Corresponding pairs of start and end date denote the different periods. - ``train_end_date``: End date of the training period (last day of discharge) in the format ``DD/MM/YYYY``. Can also be a list of dates. If a list is specified, also ``train_start_date`` must be a list with an equal length. - ``validation_start_date``: Start date of the validation period (first day of discharge) in the format ``DD/MM/YYYY``. Can also be a list of dates (similar to train period specifications). - ``validation_end_date``: End date of the validation period (last day of discharge) in the format ``DD/MM/YYYY``. Can also be a list of dates (similar to train period specifications). - ``test_start_date``: Start date of the test period (first day of discharge) in the format ``DD/MM/YYYY``. Can also be a list of dates (similar to train period specifications). - ``test_end_date``: End date of the validation period (last day of discharge) in the format ``DD/MM/YYYY``. Can also be a list of dates (similar to train period specifications). - ``per_basin_train_periods_file``: Alternatively to specifying a global train period for all basins (using ``train_start_date`` and ``train_end_date``) it is also possible to use individual periods for each basin. For that to work you have to create a dictionary with one key per basin id. For each basin id, the dictionary contains two keys ``start_dates`` and ``end_dates``, which contain a list of pandas TimeStamps that indicate the start and end dates of the train periods. ``start_dates`` and ``end_dates`` have to be a list, even in case of a single period per basin (see example below). Then use the pickle library, to store this dictionary to disk and use the path to this pickle file as the value for this config argument. .. code-block:: import pandas as pd dates = { 'basin_a': { 'start_dates': [pd.to_datetime('01/01/1980')], 'end_dates': [pd.to_datetime('31/12/1999')] }, 'basin_b': { 'start_dates': [pd.to_datetime('01/01/1980'), pd.to_datetime('01/01/2000')], 'end_dates': [pd.to_datetime('31/12/1990'), pd.to_datetime('01/01/2005')] } } - ``per_basin_validation_periods_file``: Same as ``per_basin_train_periods_file`` but indicating individual periods that are used as validation periods. - ``per_basin_test_periods_file``: Same as ``per_basin_train_periods_file`` but indicating individual periods that are used as test periods. - ``seed``: Fixed random seed. If empty, a random seed is generated for this run. - ``device``: Which device to use in format of ``cuda:0``, ``cuda:1``, etc, for GPUs, ``mps`` for MacOS with Metal support, or ``cpu`` Validation settings ------------------- - ``validate_every``: Integer that specifies in which interval a validation is performed. If empty, no validation is done during training. - ``validate_n_random_basins``: Integer that specifies how many random basins to use per validation. Values larger *n_basins* are clipped to *n_basins*. - ``metrics``: List of metrics to calculate during validation/testing. See :py:mod:`neuralhydrology.evaluation.metrics` for a list of available metrics. Can also be a dictionary of lists, where each key-value pair corresponds to one target variable and the metrics that should be computed for this target. Like this, it is possible to compute different metrics per target during validation and/or evaluation after the training. - ``save_validation_results``: True/False, if True, stores the validation results to disk as a pickle file. Otherwise they are only used for TensorBoard. This is different than ``save_all_validation_output`` in that only the predictive outputs are saved, and not all of the model output features. - ``save_all_validation_output``: True/False, if True, stores all model outputs from the validation runs to disk as a pickle file. Defaults to False. This differs from ``save_validation_results``, in that here all model output is saved. This can result in files that are quite large. Predictions in this file will not be scaled. This is the raw output from the model. General model configuration --------------------------- - ``model``: Defines the model class, i.e. the core of the model, that will be used. Names have to match the values in `this function `__, e.g., [``cudalstm``, ``ealstm``, ``mtslstm``] - ``head``: The prediction head that is used on top of the output of the model class. Currently supported are ``regression``, ``gmm``, ``cmal``, and ``umal``. Make sure to pass the necessary options depending on your choice of the head (see below). - ``hidden_size``: Hidden size of the model class. In the case of an LSTM, this reflects the number of LSTM states. - ``initial_forget_bias``: Initial value of the forget gate bias. - ``output_dropout``: Dropout applied to the output of the LSTM Regression head ~~~~~~~~~~~~~~~ Can be ignored if ``head != 'regression'`` - ``output_activation``: Which activation to use on the output neuron(s) of the linear layer. Currently supported are ``linear``, ``relu``, ``softplus``. If empty, ``linear`` is used. - ``mc_dropout``: True/False. Wheter Monte-Carlo dropout is used to sample during inference. GMM head ~~~~~~~~ Can be ignored if ``head != 'gmm'`` - ``n_distributions``: The number of distributions used for the GMM head. - ``n_samples``: Number of samples generated (per time-step) from GMM. - ``negative_sample_handling``: How to account for negative samples. Possible values are ``none`` for doing nothing, ``clip`` for clipping the values at zero, and ``truncate`` for resampling values that were drawn below zero. If the last option is chosen, the additional argument ``negative_sample_max_retries`` controls how often the values are resampled. - ``negative_sample_max_retries``: The number of repeated samples for the ``truncate`` option of the ``negative_sample_max_retries`` argument. - ``mc_dropout``: True/False. Whether Monte-Carlo dropout is used to sample during inference. CMAL head ~~~~~~~~~ Can be ignored if ``head != 'cmal'`` - ``n_distributions``: The number of distributions used for the CMAL head. - ``n_samples``: Number of samples generated (per time-step) from CMAL. - ``negative_sample_handling``: Approach for handling negative sampling. Possible values are ``none`` for doing nothing, ``clip`` for clipping the values at zero, and ``truncate`` for resampling values that were drawn below zero. If the last option is chosen, the additional argument ``negative_sample_max_retries`` controls how often the values are resampled. - ``negative_sample_max_retries``: The number of repeated samples for the ``truncate`` option of the ``negative_sample_max_retries`` argument. - ``mc_dropout``: True/False. Whether Monte-Carlo dropout is used to sample during inference. UMAL head ~~~~~~~~~ Can be ignored if ``head != 'umal'`` - ``n_taus``: The number of taus sampled to approximate the uncountable distributions. - ``umal_extend_batch``: True/False. Whether the batches should be extended ``n_taus`` times, to account for a specific approximation density already during the training. - ``tau_down`` The lower sampling bound of asymmetry parameter (should be above 0, below 1 and smaller than ``tau_up``). - ``tau_up`` The upper sampling bound of asymmetry parameter (should be above 0, below 1 and larger than ``tau_down``). - ``n_samples``: Number of samples generated (per time-step) from UMAL. - ``negative_sample_handling``: Approach for handling negative sampling. Possible values are ``none`` for doing nothing, ``clip`` for clipping the values at zero, and ``truncate`` for resampling values that were drawn below zero. If the last option is chosen, the additional argument ``negative_sample_max_retries`` controls how often the values are resampled. - ``negative_sample_max_retries``: The number of repeated samples for the ``truncate`` option of the ``negative_sample_max_retries`` argument. - ``mc_dropout``: True/False. Whether Monte-Carlo dropout is used to sample during inference. Multi-timescale training settings ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ These are used if ``model == mtslstm``. - ``transfer_mtslstm_states``: Specifies if and how hidden and cell states are transferred from lower to higher frequencies. This configuration should be a dictionary with keys ``h`` (hidden state) and ``c`` (cell state). Possible values are ``[None, linear, identity]``. If ``transfer_mtslstm_states`` is not provided or empty, the default is linear transfer. - ``shared_mtslstm``: If False, will use a distinct LSTM with individual weights for each timescale. If True, will use a single LSTM for all timescales and use one-hot-encoding to identify the current input timescale. In both cases, ``transfer_mtslstm_states`` can be used to configure hidden and cell state transfer. Mamba settings ~~~~~~~~~~~~~~ These are used if ``model == mamba``. - ``mamba_d_conv``: Local convolution width - ``mamba_d_state``: State Space Model state expansion factor - ``mamba_expand``: Block expansion factor Transformer settings ~~~~~~~~~~~~~~~~~~~~ These are used if ``model == transformer``. - ``transformer_nlayers``: Number of multi-head self-attention layers in the transformer encoder. - ``transformer_positional_encoding_type``: Choices are ``[sum, concatenate]``. Used to change the way that the positional encoding is used in transformer embedding layer. `sum` means that the positional encoding is added to the values of the inputs for that layer, while `concatenate` means that the embedding is concatenated as additional input features. - ``transformer_dim_feedforward``: Dimension of dense layers used between self-attention layers in transformer encoder. - ``transformer_positional_dropout``: Dropout applied only to the positional encoding before using in transformer encoder. - ``transformer_dropout``: Dropout used in transformer encoder layers. - ``transformer_nhead``: Number of parallel transformer heads. XLSTM settings ~~~~~~~~~~~~~~ These are used if ``model == x_lstm``. - ``xlstm_num_blocks``: number of stacked xLSTM blocks - ``xlstm_slstm_at``: indices of blocks of scalar-memory - ``xlstm_heads``: number of heads - ``xlstm_kernel_size``: convolutional kernel size - ``xlstm_proj_factor``: projection factor ODE-LSTM settings ~~~~~~~~~~~~~~~~~ These are used if ``model == odelstm``. - ``ode_method``: Method to use to solve the ODE. One of ``[euler, rk4, heun]``. - ``ode_num_unfolds``: Number of iterations to break each ODE solving step into. - ``ode_random_freq_lower_bound``: Lowest frequency that will be used to randomly aggregate the first slice of the input sequence. See the documentation of the ODELSTM class for more details on the frequency randomization. MC-LSTM settings ~~~~~~~~~~~~~~~~ These are used if ``model == mclstm``. - ``mass_inputs``: List of features that are used as mass input in the MC-LSTM model, i.e. whose quantity is conserved over time. Currently, the MC-LSTM configuration implemented here only supports a single mass input. Make sure to exclude this feature from the default normalization (see :ref:`MC-LSTM ` description). Hybrid-Model settings ~~~~~~~~~~~~~~~~~~~~~ These are used if ``model == hybrid_model``. - ``conceptual_model``: Name of the hydrological conceptual model that is used together with a data-driven method to create the hybrid model e.g., [``SHM``]. - ``dynamic_conceptual_inputs``: List of features that are used as input in the conceptual part of the hybrid model. - ``warmup_period``: Number of time steps (e.g. days) before the information produced by the data-driven part is used in the conceptual model. Handoff Forecast Model settings ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - ``forecast_hidden_size``: Integer hidden size for the forecast (decoder) LSTM. This will default to ``hidden_size``. - ``hindcast_hidden_size``: Integer hidden size for the hindcast (encoder) LSTM. This will default to ``hidden_size``. - ``state_handoff_network``: Embedding network that defines the handoff of the cell state and hidden state from the hindcast LSTM to the forecast LSTM. - ``forecast_overlap``: An integer number of timesteps where forecast data overlaps with hindcast data. This does not add to the ``forecast_sequence_length``, and must be no larger than the ``forecast_sequence_length``. Multihead Forecast Model settings ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - ``forecast_network``: Fully coupled network with one or multiple layers (this is an Embedding Network type, see documentation herein) that defines the forecast (decoder) portion of the multi-head (non-rollout) forecast model. Stacked Forecast Model settings ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - ``bidirectional_stacked_forecast_lstm``: Whether or not the hindcast LSTM in a stacked forecast model should be bidirectional. - ``forecast_hidden_size``: Integer hidden size for the forecast (decoder) LSTM. This will default to ``hidden_size``. - ``hindcast_hidden_size``: Integer hidden size for the hindcast (encoder) LSTM. This will default to ``hidden_size``. Embedding network settings -------------------------- These settings define fully connected networks that are used in various places, such as the embedding network for static or dynamic features in the single-frequency models or as an optional extended input gate network in the EA-LSTM model. For multi-timescale models, these settings can be ignored. - ``statics_embedding``: None (default) or a dict that defines the embedding network for static inputs. The dictionary can have the following keys: - ``type`` (default 'fc'): Type of the embedding net. Currently, only 'fc' for fully-connected net is supported. - ``hiddens``: List of integers that define the number of neurons per layer in the fully connected network. The last number is the number of output neurons. Must have at least length one. - ``activation`` (default 'tanh'): activation function of the network. Supported values are: 'tanh', 'sigmoid', 'linear', and 'relu'. The activation function is not applied to the output neurons, which always have a linear activation function. An activation function for the output neurons has to be applied in the main model class. - ``dropout`` (default 0.0): Dropout rate applied to the embedding network. Note that for EA-LSTM, there will always be an additional linear layer that maps to the EA-LSTM's hidden size. This means that the the embedding layer output size does not have to be equal to ``hidden_size``. - ``dynamics_embedding``: None (default) or a dict that defines the embedding network for dynamic inputs. See ``statics_embedding`` for a description of the dictionary structure. Training settings ----------------- - ``optimizer``: Specify which optimizer to use. Currently supported are Adam and AdamW. New optimizers can be added :py:func:`here `. - ``loss``: Which loss to use. Currently supported are ``MSE``, ``NSE``, ``RMSE``, ``GMMLoss``, ``CMALLoss``, and ``UMALLoss``. New losses can be added :py:mod:`here `. - ``allow_subsequent_nan_losses``: Define a number of training steps for which a loss value of ``NaN`` is ignored and no error is raised but instead the training loop proceeds to the next iteration step. - ``target_loss_weights``: A list of float values specifying the per-target loss weight, when training on multiple targets at once. Can be combined with any loss. By default, the weight of each target is ``1/n`` with ``n`` being the number of target variables. The order of the weights corresponds to the order of the ``target_variables``. - ``regularization``: List of strings or 2-tuples with regularization terms and corresponding weights. If no weights are specified, they default to 1. Currently, two reqularizations are supported: (1) ``tie_frequencies``, which couples the predictions of all frequencies via an MSE term, and (2) ``forecast_overlap``, which couples overlapping sequences between hindcast and forecast models. New regularizations can be added :py:mod:`here `. - ``learning_rate``: Learning rate. Can be either a single number (for a constant learning rate) or a dictionary. If it is a dictionary, the keys must be integer that reflect the epochs at which the learning rate is changed to the corresponding value. The key ``0`` defines the initial learning rate. - ``batch_size``: Mini-batch size used for training. - ``epochs``: Number of training epochs. - ``max_updates_per_epoch``: Maximum number of weight updates per training epoch. Leave unspecified to go through all data in every epoch. - ``use_frequencies``: Defines the time step frequencies to use (daily, hourly, ...). Use `pandas frequency strings `__ to define frequencies. Note: The strings need to include values, e.g., '1D' instead of 'D'. If used, ``predict_last_n`` and ``seq_length`` must be dictionaries. - ``no_loss_frequencies``: Subset of frequencies from ``use_frequencies`` that are "evaluation-only", i.e., the model will get input and produce output in the frequencies listed here, but they will not be considered in the calculation of loss and regularization terms. - ``seq_length``: Length of the input sequence. If ``use_frequencies`` is used, this needs to be a dictionary mapping each frequency to a sequence length, else an int. - ``forecast_seq_length``: Length of the forecast sequence. This is the number of timesteps from the total ``seq_length`` that are part of the forecast rather than the hindcast. Note that this does not add to the total ``seq_length``, and thus, the forecast sequence length must be less than the total sequence length. - ``forecast_overlap``: An integer number of timesteps where forecast data overlaps with hindcast data. This does not add to the ``forecast_sequence_length``, and must be no larger than the ``forecast_sequence_length``. This is used for ``ForecastOverlapMSERegularization`` in the ``handoff_forecast_model`` and for the ``stacked_lstm`` model. - ``predict_last_n``: Defines which time steps are used to calculate the loss, counted backwards. Can't be larger than ``seq_length``. Sequence-to-one would be ``predict_last_n: 1`` and sequence-to-sequence (with e.g. a sequence length of 365) ``predict_last_n: 365``. If ``use_frequencies`` is used, this needs to be a dictionary mapping each frequency to a predict\_last\_n-value, else an int. - ``target_noise_std``: Defines the standard deviation of gaussian noise which is added to the labels during training. Set to zero or leave empty to *not* add noise. - ``clip_gradient_norm``: If a value, clips norm of gradients to that specific value during training. Leave empty for not clipping. - ``num_workers``: Number of (parallel) threads used in the data loader. - ``save_weights_every``: Interval, in which the weights of the model are stored to disk. ``1`` means to store the weights after each epoch, which is the default if not otherwise specified. Finetune settings ----------------- Ignored if ``mode != finetune`` - ``finetune_modules``: List of model parts that will be trained during fine-tuning. All parts *not* listed here will not be updated. Check the documentation of each model to see a list of available module parts. Logger settings --------------- - ``log_interval``: Interval at which the training loss is logged, by default 10. - ``log_tensorboard``: True/False. If True, writes logging results into TensorBoard file. The default, if not specified, is True. - ``log_n_figures``: If a (integer) value greater than 0, saves the predictions as plots of that n specific (random) basins during validations. - ``save_git_diff``: If set to True and NeuralHydrology is a git repository with uncommitted changes, the git diff will be stored in the run directory. When using this option, make sure that your run and data directories are either not located inside the git repository, or that they are part of the ``.gitignore`` file. Otherwise, the git diff may become very large and use up a lot of disk space. To make sure everything is configured correctly, you can simply check that the output of ``git diff HEAD`` only contains your code changes. Data settings ------------- - ``dataset``: Defines which data set will be used. Currently supported are ``camels_us`` (`CAMELS (US) data set by Newman et al. `__), ``camels_gb`` (`CAMELS-GB by Coxon et al. `__), ``camels_cl`` (`CAMELS-CL by Alvarez-Garreton et al. `__), ``camels_br`` (`CAMELS-BR by Chagas et al. `__), ``camels_aus`` (`CAMELS-AUS by Fowler et al. `__), ``lamah_{a,b,c}`` (`LamaH-CE by Klingler et al. `__), ``hourly_camels_us`` (hourly forcing and streamflow data for 516 CAMELS (US) basins, published by `Gauch et al. `__), and ``generic`` (can be used with any dataset that is stored in a specific format, see :py:class:`documentation ` for further informations). - ``data_dir``: Full or relative path to the root directory of the data set. - ``train_data_file``: If not empty, uses the pickled file at this path as the training data. Can be used to not create the same data set multiple times, which saves disk space and time. If empty, creates new data set and optionally stores the data in the run directory (if ``save_train_data`` is True). - ``cache_validation_data``: True/False. If True, caches validation data in memory for the time of training, which does speed up the overall training time. By default True, since even larger datasets are usually just a few GB in memory, which most modern machines can handle. - ``dynamic_inputs``: List of variables to use as time series inputs. Names must match the exact names as defined in the data set. Note: In case of multiple input forcing products, you have to append the forcing product behind each variable. E.g., 'prcp(mm/day)' of the daymet product is 'prcp(mm/day)_daymet'. When training on multiple frequencies (cf. ``use_frequencies``), it is possible to define dynamic inputs for each frequency individually. To do so, ``dynamic_inputs`` must be a dict mapping each frequency to a list of variables. E.g., to use precipitation from daymet for daily and from nldas-hourly for hourly predictions: :: dynamic_inputs: 1D: - prcp(mm/day)_daymet 1h: - total_precipitation_nldas_hourly When ``nan_handling_method`` is set, this can also be specified as a list of lists. Then, each inner list defines a feature group. Refer to ``nan_handling_method`` for a description of how this can be used to handle missing input data. - ``forecast_inputs``: These are dynamic features (exactly like ``dyncamic_inputs``) that are used as inputs to the forecasting portion of a forecast model. This allows different features to be used for the forecast and hindcast portions of a model. If ``forecast_inputs`` is present, then all features in this list must also appear in the ``dynamic_inputs`` list, which will contain both forecast and hindcast features. Note that this does not currently support a forecast rollout, meaning that because forecast inputs behave the same way as dynamic inputs, the forecast input for two timesteps ahead of time t will be the same as the forecast input for one day ahead of time t+1. Note also that forecasting (and forecast inputs) is not supported for multi-timescale models. - ``hindcast_inputs``: These are the same as ``forecast_inputs`` except that they are for the hindcast portion of a forecast model. As with ``forecast_inputs`` these dynamic inputs must be included in the ``dynamic_inputs`` list. - ``target_variables``: List of the target variable(s). Names must match the exact names as defined in the data set. - ``clip_targets_to_zero``: Optional list of target variables to clip to zero during the computation of metrics (e.g. useful to compute zero-clipped metric during the validation between training epochs. Will not affect the data that is saved to disk after evaluation. That is, always the `raw` model outputs are saved in the result files. Therefore, you eventually need to manually clip the targets to zero if you load the model outputs from file and want to reproduce the metric values. - ``duplicate_features``: Can be used to duplicate time series features (e.g., for different normalizations). Can be either a str, list or dictionary (mapping from strings to ints). If string, duplicates the corresponding feature once. If list, duplicates all features in that list once. Use a dictionary to specify the exact number of duplicates you like. To each duplicated feature, we append ``_copyN``, where `N` is counter starting at 1. - ``lagged_features``: Can be used to add a lagged copy of another feature to the list of available input/output features. Has to be a dictionary mapping from strings to int or a list of ints, where the string specifies the feature name and the int(s) the number of lagged time steps. Those values can be positive or negative (see `pandas shift `__ for details). If a list of integers is provided, only unique values are considered. We append ``_shiftN`` to each lagged feature, where `N` is the shift count. ``autoregressive_inputs``: Currently, only one autoregressive input is allowed, and only one output feature is allowed in an autoregressive model. This is a list of target feature(s) to be used as model inputs. These will be lagged by some number of timesteps > 0, and therefore must appear in the list of ``lagged_features``. Autoregressive inputs are appended to the end of the dynamic features list when building the dataset(s). Missing data is supported in autoregressive inputs. During runtime, autoregressive models append binary flags as inputs to indicate missing data. Autoregressive inputs only work with models that support autoregression and will throw an error if they are included in a config file for a model that does not support autoregression. Leave empty if none should be used. - ``nan_handling_method``: One of [``masked_mean``, ``input_replacing``, ``attention``]. These strategies can be used to train and evaluate models when some of the input data is missing some of the time. To use this, specify ``dynamic_inputs`` as a list of lists. Each list then defines a feature group. - The ``masked_mean`` strategy will embed each of these feature groups with its own embedding network (defined via ``dynamics_embedding``) and then average the embeddings of all groups that are not missing at a given timestep. - The ``attention`` strategy is a more general version of masked_mean. It uses an attention mechanism where the query is the concatenation of static attributes (or their embedding) and a positional encoding, the keys and values are the embeddings of each feature group. - The ``input_replacing`` strategy replaces all NaNs with zeros, concatenates all features from all feature groups, and adds a binary flag for each feature group to indicate missing values. - ``nan_sequence_probability``: If set, the dataset will yield samples where each feature group is missing with the specified probability during training. Any combination of feature groups can be dropped, but the dataset takes care not to drop all sequences of all groups at the same time. - ``nan_step_probability``: Same as ``nan_sequence_probability`` except it defines the probability of dropping individual time steps of a feature group, rather than the whole sequence. Note that it is possible for all three time steps to be dropped at the same time. - ``nan_handling_pos_encoding_size``: Defines the size of a sine/cosine positional encoding to be added to each feature group (for ``masked_mean``), the concatenated vector of feature groups (for ``input_replacing``), or to the inputs of the query embedding (for ``attention``). - ``random_holdout_from_dynamic_features``: Dictionary to define timeseries features to remove random sections of data from. This allows for conducting certain types of missing data analyses. Keys of this dictionary must match exact names of dynamic inputs as defined in the data set. Values are a dict with keys "missing_fraction" and "mean_missing_length", and values that are float and float, respectively, representing ("missing_fraction") the long-term fraction of data to be randomly removed from a given feature, and (2) the expected value of the length of continuous subsequences removed from the timeseries. These two distribution parameters do not consider whether there are any NaN's in the original timeseries. Only works for timeseries features (inputs and targets). Leave empty if none should be used. - ``custom_normalization``: Has to be a dictionary, mapping from time series feature names to ``centering`` and/or ``scaling``. Using this argument allows to overwrite the default zero mean, unit variance normalization per feature. Supported options for ``centering`` are 'None' or 'none', 'mean', 'median' and min. None/none sets the centering parameter to 0.0, mean to the feature mean, median to the feature median, and min to the feature minimum, respectively. Supported options for `scaling` are 'None' or 'none', 'std', 'minmax'. None/none sets the scaling parameter to 1.0, std to the feature standard deviation and minmax to the feature max minus the feature min. The combination of centering: min and scaling: minmax results in min/max feature scaling to the range [0,1]. - ``additional_feature_files``: Path to a pickle file (or list of paths for multiple files), containing a dictionary with each key corresponding to one basin id and the value is a date-time indexed pandas DataFrame. Allows the option to add any arbitrary data that is not included in the standard data sets. **Convention**: If a column is used as static input, the value to use for specific sample should be in same row (datetime) as the target discharge value. - ``evolving_attributes``: Columns of the DataFrame loaded with the ``additional_feature_files`` that should be used as "static" features. These values will be used as static inputs, but they can evolve over time. Convention: The value to use for a specific input sequence should be in the same row (datetime) as the last time step of that sequence. Names must match the column names in the DataFrame. Leave empty to not use any additional static feature. - ``use_basin_id_encoding``: True/False. If True, creates a basin-one-hot encoding as a(n) (additional) static feature vector for each sample. ``timestep_counter``: True/False. If True, creates a sequence of counting integers over the forecast sequence length as a dynamic input. This input is used to signal forecast lead time for an unrolling forecast. A similar dynamic input of constant zeros is added to the hindcast inputs. If a forecast model is not used then setting ``timestep_counter`` to True will return an error. - ``static_attributes``: Which static attributes to use (e.g., from the static camels attributes for the CAMELS dataset). Leave empty if none should be used. For hydroatlas attributes, use ``hydroatlas_attributes`` instead. Names must match the exact names as defined in the data set. - ``hydroatlas_attributes``: Which HydroATLAS attributes to use. Leave empty if none should be used. Names must match the exact names as defined in the data set. CAMELS US specific ~~~~~~~~~~~~~~~~~~ Can be ignored if ``dataset not in ['camels_us', 'hourly_camels_us']`` - ``forcings``: Can be either a string or a list of strings that correspond to forcing products in the camels data set. Also supports ``maurer_extended``, ``nldas_extended``, and (for ``hourly_camels_us``) ``nldas_hourly``.