metrics

neuralhydrology.evaluation.metrics.alpha_nse(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray) → float

Calculate the alpha NSE decomposition [1]

The alpha NSE decomposition is the ratio of the standard deviation of the simulations to the standard deviation of the observations.

\[\alpha = \frac{\sigma_s}{\sigma_o},\]

where \(\sigma_s\) is the standard deviation of the simulations (here, sim) and \(\sigma_o\) is the standard deviation of the observations (here, obs).
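As a rough illustration of the formula, here is a minimal plain-Python sketch. It works on lists rather than DataArrays, and `alpha_nse_sketch` is a hypothetical helper name, not part of the library.

```python
import statistics


def alpha_nse_sketch(obs, sim):
    """Ratio of simulated to observed standard deviation (illustrative only)."""
    # Population standard deviation is used here; for series of equal length
    # the ratio is identical with the sample standard deviation.
    return statistics.pstdev(sim) / statistics.pstdev(obs)


obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 4.0, 6.0, 8.0]  # twice the observations, so twice the spread
alpha = alpha_nse_sketch(obs, sim)
```

A value above 1 means the simulations vary more than the observations; a value below 1 means they vary less.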

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

Returns:

Alpha NSE decomposition.

Return type:

float

References

neuralhydrology.evaluation.metrics.beta_kge(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray) → float

Calculate the beta KGE term [2]

The beta term of the Kling-Gupta Efficiency is defined as the ratio of the mean of the simulations to the mean of the observations.

\[\beta_{\text{KGE}} = \frac{\mu_s}{\mu_o},\]

where \(\mu_s\) is the mean of the simulations (here, sim) and \(\mu_o\) is the mean of the observations (here, obs).
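The ratio of means can be sketched in plain Python as follows. The helper name `beta_kge_sketch` is hypothetical, and the example uses lists instead of DataArrays.

```python
import statistics


def beta_kge_sketch(obs, sim):
    """Ratio of the simulated mean to the observed mean (illustrative only)."""
    return statistics.fmean(sim) / statistics.fmean(obs)


obs = [1.0, 2.0, 3.0, 4.0]  # mean 2.5
sim = [2.0, 2.0, 4.0, 4.0]  # mean 3.0
beta_kge = beta_kge_sketch(obs, sim)
```

Here the simulations overestimate the mean flow by 20 percent, so the term is about 1.2; an unbiased simulation gives 1.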

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

Returns:

Beta KGE term.

Return type:

float

References

neuralhydrology.evaluation.metrics.beta_nse(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray) → float

Calculate the beta NSE decomposition [3]

The beta NSE decomposition is the difference of the mean simulation and mean observation divided by the standard deviation of the observations.

\[\beta = \frac{\mu_s - \mu_o}{\sigma_o},\]

where \(\mu_s\) is the mean of the simulations (here, sim), \(\mu_o\) is the mean of the observations (here, obs) and \(\sigma_o\) the standard deviation of the observations.
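A minimal plain-Python sketch of this formula is shown below. It assumes the population standard deviation for \(\sigma_o\), and `beta_nse_sketch` is a hypothetical helper name.

```python
import statistics


def beta_nse_sketch(obs, sim):
    """Mean bias scaled by the observed standard deviation (illustrative only)."""
    bias = statistics.fmean(sim) - statistics.fmean(obs)
    # Assumption: population standard deviation of the observations.
    return bias / statistics.pstdev(obs)


obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]  # observations shifted up by one unit
beta = beta_nse_sketch(obs, sim)
```

Unlike the beta KGE term, this quantity is 0 for an unbiased simulation and is expressed in units of the observed standard deviation.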

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

Returns:

Beta NSE decomposition.

Return type:

float

References

neuralhydrology.evaluation.metrics.calculate_all_metrics(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray, resolution: str = '1D', datetime_coord: str = None) → Dict[str, float]

Calculate all metrics with default values.

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

  • resolution (str, optional) – Temporal resolution of the time series in pandas format, e.g. ‘1D’ for daily and ‘1H’ for hourly.

  • datetime_coord (str, optional) – Name of the datetime coordinate in the passed DataArray. If not specified, the function tries to infer it automatically.

Returns:

Dictionary with keys corresponding to metric name and values corresponding to metric values.

Return type:

Dict[str, float]

Raises:

AllNaNError – If all observations or all simulations are NaN.

neuralhydrology.evaluation.metrics.calculate_metrics(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray, metrics: List[str], resolution: str = '1D', datetime_coord: str = None) → Dict[str, float]

Calculate specific metrics with default values.

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

  • metrics (List[str]) – List of metric names.

  • resolution (str, optional) – Temporal resolution of the time series in pandas format, e.g. ‘1D’ for daily and ‘1H’ for hourly.

  • datetime_coord (str, optional) – Name of the datetime coordinate in the passed DataArray. If not specified, the function tries to infer it automatically.

Returns:

Dictionary with keys corresponding to metric name and values corresponding to metric values.

Return type:

Dict[str, float]

Raises:

AllNaNError – If all observations or all simulations are NaN.

neuralhydrology.evaluation.metrics.fdc_fhv(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray, h: float = 0.02) → float

Calculate the peak flow bias of the flow duration curve [4]

\[\%\text{BiasFHV} = \frac{\sum_{h=1}^{H}(Q_{s,h} - Q_{o,h})}{\sum_{h=1}^{H}Q_{o,h}} \times 100,\]

where \(Q_s\) are the simulations (here, sim), \(Q_o\) the observations (here, obs) and H is the upper fraction of flows of the FDC (here, h).
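The following plain-Python sketch illustrates the formula under stated assumptions: the flow duration curves are built by sorting observations and simulations independently, and the number of peak flows is `h` times the series length, rounded. `fdc_fhv_sketch` is a hypothetical helper name.

```python
def fdc_fhv_sketch(obs, sim, h=0.02):
    """Percent bias of the highest flows of the flow duration curve (illustrative only)."""
    # Flow duration curves: flows sorted from highest to lowest,
    # independently for observations and simulations.
    obs_fdc = sorted(obs, reverse=True)
    sim_fdc = sorted(sim, reverse=True)

    # Assumption: the number of peak flows is the rounded fraction of the length.
    n_peak = round(h * len(obs_fdc))
    if n_peak < 1:
        raise ValueError("h is too small for the length of the series")

    obs_peak = obs_fdc[:n_peak]
    sim_peak = sim_fdc[:n_peak]
    return (sum(sim_peak) - sum(obs_peak)) / sum(obs_peak) * 100


obs = [float(q) for q in range(1, 11)]                       # 1, 2, ..., 10
sim = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0, 11.0]
bias = fdc_fhv_sketch(obs, sim, h=0.2)                       # top two flows of each curve
```

With the top two flows summing to 19 (observed) and 21 (simulated), the bias is a little over 10 percent.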

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

  • h (float, optional) – Fraction of upper flows to consider as peak flows, in the range ]0,1[, by default 0.02.

Returns:

Peak flow bias.

Return type:

float

References

neuralhydrology.evaluation.metrics.fdc_flv(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray, l: float = 0.3) → float

Calculate the low flow bias of the flow duration curve [5]

\[\%\text{BiasFLV} = -1 \times \frac{\sum_{l=1}^{L}[\log(Q_{s,l}) - \log(Q_{s,L})] - \sum_{l=1}^{L}[\log(Q_{o,l}) - \log(Q_{o,L})]}{\sum_{l=1}^{L}[\log(Q_{o,l}) - \log(Q_{o,L})]} \times 100,\]

where \(Q_s\) are the simulations (here, sim), \(Q_o\) the observations (here, obs) and L is the lower fraction of flows of the FDC (here, l).
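A minimal plain-Python sketch of this formula follows. It assumes strictly positive flows (so the logarithm is defined) and reads \(Q_{s,L}\) and \(Q_{o,L}\) as the smallest flow of each low-flow segment. `fdc_flv_sketch` is a hypothetical helper name.

```python
import math


def fdc_flv_sketch(obs, sim, l=0.3):
    """Percent bias of the low-flow segment of the flow duration curve (illustrative only)."""
    # Assumption: number of low flows is the rounded fraction of the length.
    n_low = round(l * len(obs))

    # Lowest flows of each curve on a log scale (flows must be > 0).
    obs_low = [math.log(q) for q in sorted(obs)[:n_low]]
    sim_low = [math.log(q) for q in sorted(sim)[:n_low]]

    # Sum of log flows relative to the smallest log flow of each segment.
    q_ol = sum(v - min(obs_low) for v in obs_low)
    q_sl = sum(v - min(sim_low) for v in sim_low)
    return -1 * (q_sl - q_ol) / q_ol * 100


obs = [1.0, 2.0, 4.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
sim = [1.0, 4.0, 16.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
flv = fdc_flv_sketch(obs, sim, l=0.3)  # lowest three flows of each curve
```

In this example the simulated low-flow segment spans twice the log range of the observed one, which gives a bias of about -100 percent.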

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

  • l (float, optional) – Fraction of lower flows to consider as low flows, in the range ]0,1[, by default 0.3.

Returns:

Low flow bias.

Return type:

float

References

neuralhydrology.evaluation.metrics.fdc_fms(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray, lower: float = 0.2, upper: float = 0.7) → float

Calculate the slope of the middle section of the flow duration curve [6]

\[\%\text{BiasFMS} = \frac{\left | \log(Q_{s,\text{lower}}) - \log(Q_{s,\text{upper}}) \right | - \left | \log(Q_{o,\text{lower}}) - \log(Q_{o,\text{upper}}) \right |}{\left | \log(Q_{o,\text{lower}}) - \log(Q_{o,\text{upper}}) \right |} \times 100,\]

where \(Q_{s,\text{lower/upper}}\) corresponds to the FDC of the simulations (here, sim) at the lower and upper bound of the middle section and \(Q_{o,\text{lower/upper}}\) similarly for the observations (here, obs).

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

  • lower (float, optional) – Lower bound of the middle section in range ]0,1[, by default 0.2

  • upper (float, optional) – Upper bound of the middle section in range ]0,1[, by default 0.7

Returns:

Slope of the middle section of the flow duration curve.

Return type:

float

References

neuralhydrology.evaluation.metrics.get_available_metrics() → List[str]

Get list of available metrics.

Returns:

List of implemented metric names.

Return type:

List[str]

neuralhydrology.evaluation.metrics.kge(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray, weights: List[float] = [1.0, 1.0, 1.0]) → float

Calculate the Kling-Gupta Efficiency [7]

\[\text{KGE} = 1 - \sqrt{[ s_r (r - 1)]^2 + [s_\alpha ( \alpha - 1)]^2 + [s_\beta(\beta_{\text{KGE}} - 1)]^2},\]

where \(r\) is the correlation coefficient, \(\alpha\) the \(\alpha\)-NSE decomposition, \(\beta_{\text{KGE}}\) the ratio of the means and \(s_r, s_\alpha, s_\beta\) the corresponding weights (here, the three float values in the weights parameter).
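To show how the three terms combine, here is a minimal plain-Python sketch. It computes the Pearson correlation by hand, uses population standard deviations, and works on lists; `kge_sketch` is a hypothetical helper name.

```python
import math
import statistics


def kge_sketch(obs, sim, weights=(1.0, 1.0, 1.0)):
    """Weighted Kling-Gupta Efficiency from r, alpha and beta (illustrative only)."""
    mu_o, mu_s = statistics.fmean(obs), statistics.fmean(sim)
    sd_o, sd_s = statistics.pstdev(obs), statistics.pstdev(sim)

    # Pearson correlation coefficient from the population covariance.
    cov = statistics.fmean([(o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)])
    r = cov / (sd_o * sd_s)

    alpha = sd_s / sd_o  # ratio of standard deviations
    beta = mu_s / mu_o   # ratio of means

    s_r, s_alpha, s_beta = weights
    distance = (s_r * (r - 1)) ** 2 + (s_alpha * (alpha - 1)) ** 2 + (s_beta * (beta - 1)) ** 2
    return 1 - math.sqrt(distance)


obs = [1.0, 2.0, 3.0, 4.0]
kge_perfect = kge_sketch(obs, obs)                     # all three terms equal 1
kge_scaled = kge_sketch(obs, [2.0 * o for o in obs])   # r = 1, alpha = 2, beta = 2
```

A perfect simulation scores 1. Doubling every flow keeps the correlation at 1 but moves both alpha and beta to 2, so the score drops to 1 - sqrt(2).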

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

  • weights (List[float]) – Weighting factors of the 3 KGE parts, by default each part has a weight of 1.

Returns:

Kling-Gupta Efficiency

Return type:

float

References

neuralhydrology.evaluation.metrics.mean_absolute_percentage_peak_error(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray) → float

Calculate the mean absolute percentage error (MAPE) for peaks

\[\text{MAPE}_\text{peak} = \frac{1}{P}\sum_{p=1}^{P} \left |\frac{Q_{s,p} - Q_{o,p}}{Q_{o,p}} \right | \times 100,\]

where \(Q_{s,p}\) are the simulated peaks (here, sim), \(Q_{o,p}\) the observed peaks (here, obs) and P is the number of peaks.

Uses scipy.signal.find_peaks to find peaks in the observed time series. The indices of the observed peaks are used to subset both the observed and the simulated flows. Finally, the MAPE metric is calculated as the mean absolute percentage error between the observed peak flows and the corresponding simulated flows.
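The procedure can be sketched in plain Python as follows. The `local_peaks` helper is a deliberately simplified stand-in for the SciPy peak finder (it only checks the two neighbours and applies no further filtering), and both helper names are hypothetical.

```python
def local_peaks(series):
    """Indices of values strictly greater than both neighbours (simplified peak finder)."""
    return [i for i in range(1, len(series) - 1) if series[i - 1] < series[i] > series[i + 1]]


def mape_peak_sketch(obs, sim):
    """Mean absolute percentage error at the observed peaks (illustrative only)."""
    # Peaks are located in the observations only; the same indices
    # are then used to read off the simulated flows.
    peak_idx = local_peaks(obs)
    errors = [abs((sim[i] - obs[i]) / obs[i]) for i in peak_idx]
    return sum(errors) / len(errors) * 100


obs = [1.0, 5.0, 1.0, 1.0, 8.0, 1.0]   # peaks at indices 1 and 4
sim = [1.0, 4.0, 1.0, 1.0, 10.0, 1.0]
mape = mape_peak_sketch(obs, sim)
```

The two peaks are off by 20 and 25 percent respectively, so the metric is about 22.5. Note that the simulated value is taken at the time step of the observed peak, even if the simulated peak occurs slightly earlier or later.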

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

Returns:

Mean absolute percentage error (MAPE) for peaks.

Return type:

float

neuralhydrology.evaluation.metrics.mean_peak_timing(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray, window: int = None, resolution: str = '1D', datetime_coord: str = None) → float

Mean difference in peak flow timing.

Uses scipy.signal.find_peaks to find peaks in the observed time series. Starting with all observed peaks, those with a prominence of less than the standard deviation of the observed time series are discarded. Next, the lowest peaks are successively discarded until all remaining peaks are at least 100 steps apart. Finally, the corresponding peaks in the simulated time series are searched for in a window of size window on either side of the observed peaks, and the absolute time differences between observed and simulated peaks are calculated. The final metric is the mean absolute time difference across all peaks. For more details, see the Appendix of [8].
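The window search in the last step can be sketched in plain Python as below. This sketch skips the prominence and distance filtering, uses a simplified neighbour-based peak finder, takes the simulated peak to be the maximum inside the window, and reports the error in time steps. All helper names are hypothetical.

```python
def local_peaks(series):
    """Indices of values strictly greater than both neighbours (simplified peak finder)."""
    return [i for i in range(1, len(series) - 1) if series[i - 1] < series[i] > series[i + 1]]


def mean_peak_timing_sketch(obs, sim, window=3):
    """Mean absolute offset, in time steps, between observed and simulated peaks."""
    errors = []
    for i in local_peaks(obs):
        # Search 2 * window + 1 steps centred at the observed peak,
        # clipped to the bounds of the series.
        start = max(i - window, 0)
        stop = min(i + window + 1, len(sim))
        segment = sim[start:stop]
        j = start + segment.index(max(segment))  # position of the simulated peak
        errors.append(abs(j - i))
    return sum(errors) / len(errors)


obs = [0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 1.0, 6.0, 1.0, 0.0]  # peaks at indices 2 and 7
sim = [0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 1.0, 6.0, 1.0, 0.0]  # first peak one step late
timing_error = mean_peak_timing_sketch(obs, sim, window=2)
```

The first simulated peak arrives one step late and the second is on time, so the mean timing error is half a time step.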

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

  • window (int, optional) – Size of window to consider on each side of the observed peak for finding the simulated peak. That is, the total window length to find the peak in the simulations is \(2 * \text{window} + 1\) centered at the observed peak. The default depends on the temporal resolution, e.g. for a resolution of ‘1D’, a window of 3 is used and for a resolution of ‘1H’ the window size is 12.

  • resolution (str, optional) – Temporal resolution of the time series in pandas format, e.g. ‘1D’ for daily and ‘1H’ for hourly.

  • datetime_coord (str, optional) – Name of the datetime coordinate. If not specified, the function tries to infer it automatically.

Returns:

Mean peak time difference.

Return type:

float

References

neuralhydrology.evaluation.metrics.missed_peaks(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray, window: int = None, resolution: str = '1D', percentile: float = 80, datetime_coord: str = None) → float

Fraction of missed peaks.

Uses scipy.signal.find_peaks to find peaks in the observed and simulated time series above a certain percentile. Counts the number of peaks in obs that do not exist in sim within the specified window.
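The counting logic can be sketched in plain Python as follows. For brevity, the sketch replaces the percentile threshold with a fixed flow threshold (`min_height`, a hypothetical parameter) and uses a simplified neighbour-based peak finder; all helper names are hypothetical.

```python
def local_peaks(series):
    """Indices of values strictly greater than both neighbours (simplified peak finder)."""
    return [i for i in range(1, len(series) - 1) if series[i - 1] < series[i] > series[i + 1]]


def missed_peaks_sketch(obs, sim, window=1, min_height=0.0):
    """Fraction of observed peaks without a simulated peak inside the window."""
    # Assumption: a fixed threshold stands in for the flow percentile.
    obs_peaks = [i for i in local_peaks(obs) if obs[i] > min_height]
    sim_peaks = [i for i in local_peaks(sim) if sim[i] > min_height]

    missed = sum(
        1 for i in obs_peaks
        if not any(abs(i - j) <= window for j in sim_peaks)
    )
    return missed / len(obs_peaks)


obs = [0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 1.0, 6.0, 1.0, 0.0]  # peaks at indices 2 and 7
sim = [0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # single peak at index 3
fraction_missed = missed_peaks_sketch(obs, sim, window=1, min_height=2.0)
```

The simulated peak at index 3 falls within one step of the observed peak at index 2, but nothing matches the observed peak at index 7, so half of the peaks are missed.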

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

  • window (int, optional) – Size of window to consider on each side of the observed peak for finding the simulated peak. That is, the total window length to find the peak in the simulations is \(2 * \text{window} + 1\) centered at the observed peak. The default depends on the temporal resolution, e.g. for a resolution of ‘1D’, a window of 1 is used and for a resolution of ‘1H’ the window size is 12. Note that, for ‘1D’, this default differs from the one used in the peak-timing metric.

  • resolution (str, optional) – Temporal resolution of the time series in pandas format, e.g. ‘1D’ for daily and ‘1H’ for hourly.

  • percentile (float, optional) – Only consider peaks above this flow percentile, in the range (0, 100).

  • datetime_coord (str, optional) – Name of the datetime coordinate. If not specified, the function tries to infer it automatically.

Returns:

Fraction of missed peaks.

Return type:

float

neuralhydrology.evaluation.metrics.mse(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray) → float

Calculate mean squared error.

\[\text{MSE} = \frac{1}{T}\sum_{t=1}^T (\widehat{y}_t - y_t)^2,\]

where \(\widehat{y}\) are the simulations (here, sim) and \(y\) are observations (here, obs).
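A minimal plain-Python sketch of the formula, using lists instead of DataArrays; `mse_sketch` is a hypothetical helper name.

```python
def mse_sketch(obs, sim):
    """Mean of the squared differences between simulations and observations."""
    return sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs)


obs = [1.0, 2.0, 3.0]
sim = [1.0, 3.0, 5.0]  # errors of 0, 1 and 2
mse_value = mse_sketch(obs, sim)
```

Because the errors are squared, the largest error dominates: it contributes 4 of the total 5 in the numerator here.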

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

Returns:

Mean squared error.

Return type:

float

neuralhydrology.evaluation.metrics.nse(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray) → float

Calculate Nash-Sutcliffe Efficiency [9]

Nash-Sutcliffe Efficiency is the R-square between observed and simulated discharge.

\[\text{NSE} = 1 - \frac{\sum_{t=1}^{T}(Q_m^t - Q_o^t)^2}{\sum_{t=1}^T(Q_o^t - \overline{Q}_o)^2},\]

where \(Q_m\) are the simulations (here, sim) and \(Q_o\) are observations (here, obs).
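The formula can be sketched in plain Python as below, using lists instead of DataArrays; `nse_sketch` is a hypothetical helper name.

```python
def nse_sketch(obs, sim):
    """One minus squared error relative to the variance of the observations."""
    mean_obs = sum(obs) / len(obs)
    numerator = sum((s - o) ** 2 for o, s in zip(obs, sim))
    denominator = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - numerator / denominator


obs = [1.0, 2.0, 3.0, 4.0]
nse_good = nse_sketch(obs, [1.0, 2.0, 3.0, 5.0])   # one value off by 1
nse_mean = nse_sketch(obs, [2.5, 2.5, 2.5, 2.5])   # always predict the observed mean
```

A perfect simulation scores 1, and always predicting the observed mean scores 0, so negative values indicate a simulation that performs worse than that constant baseline.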

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

Returns:

Nash-Sutcliffe Efficiency

Return type:

float

References

neuralhydrology.evaluation.metrics.pearsonr(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray) → float

Calculate the Pearson correlation coefficient (using scipy.stats.pearsonr).

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

Returns:

Pearson correlation coefficient

Return type:

float

neuralhydrology.evaluation.metrics.rmse(obs: xarray.core.dataarray.DataArray, sim: xarray.core.dataarray.DataArray) → float

Calculate root mean squared error.

\[\text{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^T (\widehat{y}_t - y_t)^2},\]

where \(\widehat{y}\) are the simulations (here, sim) and \(y\) are observations (here, obs).

Parameters:
  • obs (DataArray) – Observed time series.

  • sim (DataArray) – Simulated time series.

Returns:

Root mean squared error.

Return type:

float