by Marie-Amélie Boucher, a HEPEX 2015 Guest Columnist
According to Krzysztofowicz (2001), “Probabilistic forecasts are scientifically more honest [than deterministic forecasts], enable risk-based warnings of floods, enable rational decision making, and offer additional economic benefits.” More than 10 years later, I think most people (and especially members of the HEPEX community!) agree with this statement.
- But what about different types of probabilistic/ensemble forecasts?
- How do they compare to one another?
In the literature, deterministic forecasts are very frequently used as a benchmark for ensemble forecasts, and the usual finding is that ensemble forecasts outperform deterministic ones. It turns out that many end-users have been aware of the superiority of ensemble forecasts over deterministic forecasts for some time already. I don’t know how it is in your part of the world, but not a single operational agency I work with relies solely on deterministic forecasts.
At the same time, none of them use what I would (presumptuously?) call “real” hydrological ensemble forecasts based on meteorological ensemble forecasts. They rather use something in between: mostly different variants of the well-known analogues method.
On the choice of the most appropriate benchmark
Recently, Pappenberger et al. (2015) performed an exhaustive comparison between different types of benchmarks. They state that the choice of the most appropriate benchmark should depend on “the model structure used in the forecasting system, the season, catchment characteristics, river regime and flow conditions.” I would add that if you are using an operational forecasting system as a starting point for your research, one of your benchmarks should be that operational system. And, if I had the guts to do that, maybe I could cite myself as a bad example…
Hum… Well, okay, please just don’t tell anyone. In Boucher et al. (2011), I show that ensemble forecasts outperform deterministic forecasts for the Gatineau River catchment managed by Hydro-Québec. But the thing is: Hydro-Québec has been using a sophisticated analogue forecasting method to produce their operational streamflow forecasts since the 1970s! Not deterministic forecasts! I was comparing my forecasts to the wrong baseline, but I did not realize it at the time.
In my defence, I can only mention that none of my co-authors, including those working at Hydro-Québec, suggested otherwise. Maybe everybody just had the same reflex? “Compare your ensemble forecasts to deterministic forecasts”.
Fortunately, one can learn from the past, and I decided to always use the operational forecasting system (if there is one) as a benchmark for my ongoing and future projects, even if it means that the results might sometimes be disappointing (some analogue methods are really hard to beat!).
A small example for the Montmorency watershed
I want to show you a small, preliminary example from an ongoing project on the Montmorency River (the tiny red watershed in Figure 1, which covers 1107 km²).
In this case, there is no disappointment: the good guys (ensemble forecasts!) still win (just be aware that these are not final results: the data assimilation scheme, for one thing, has not yet been implemented).
Here’s the example:
Currently, 1- to 5-day ahead streamflow forecasts are made available to the public and decision makers by the Centre d’Expertise Hydrique du Québec. Since 2013, these forecasts have taken the form of a mean forecast accompanied by a 50% confidence interval. The mean forecast (or 50% scenario) is in fact the deterministic forecast based on Environment Canada’s GEM atmospheric model. Precipitation and temperature forecasts are passed on to the distributed physics-based hydrological model HYDROTEL. The time step is 3 h, with a forecasting horizon of five days (120 h). The confidence interval is established from the record of past forecasts and observations, and it also depends on the season and on the forecasting horizon (see this post for more details). The analogues are obtained from a database that includes the errors between deterministic forecasts and observations from 10 watersheds, including Montmorency. So it is not basin-specific, as recommended by Pappenberger et al. (2015).
As an alternative, we propose using Environment Canada’s ensemble forecasts (also precipitation and temperature) instead of the deterministic forecasts. Each ensemble member is passed on to HYDROTEL, thereby explicitly considering the meteorological uncertainty based on current synoptic conditions and avoiding the use of analogues. The meteorological ensemble forecasts are also produced by the GEM model, but with a different spatial resolution, different variants of the model physics, and different initial conditions.
Figure 2 shows hydrographs for both analogue and ensemble forecasts, for two different forecasting horizons (24 h and 72 h ahead). Analogue forecasts consist of the deterministic forecast dressed with a distribution whose parameters depend on past forecast errors for the corresponding month. The time step is 3 h. The figure shows a short period of time during fall 2011, just to illustrate the case. The blue dots represent the ensemble forecasts and the red line is the observation.
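To make the idea of "dressing" more concrete, here is a minimal sketch in Python. It is not the operational procedure: it assumes the error distribution is simply the empirical distribution of archived errors (observation minus forecast) for the same month and horizon, whereas the text above only says that the distribution parameters depend on past errors. The function name `dress_forecast`, the error archive and all the numbers are hypothetical.

```python
import numpy as np

def dress_forecast(det_forecast, past_errors, quantiles=(0.25, 0.5, 0.75)):
    """Dress a deterministic forecast with quantiles of past errors.

    det_forecast: deterministic streamflow forecast (m3/s).
    past_errors: archived (observation - forecast) values for the same
        month and forecasting horizon (hypothetical archive).
    Returns the dressed forecast at the requested quantiles.
    """
    errors = np.asarray(past_errors, dtype=float)
    # Empirical quantiles of the past errors (an assumption of this sketch).
    offsets = np.quantile(errors, quantiles)
    # Streamflow cannot be negative, so clip the dressed values at zero.
    return np.maximum(det_forecast + offsets, 0.0)

# Made-up October errors for a given horizon, and a 30 m3/s forecast.
past_errors_october = [-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0]
lower, median, upper = dress_forecast(30.0, past_errors_october)
print(lower, median, upper)
```

The interval between `lower` and `upper` plays the role of a 50% confidence interval around the deterministic forecast. Note that its width depends only on the archive, not on the current synoptic situation.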
One can see that analogues are not that bad, especially for the 24 h horizon. The mean CRPS (Figure 3, below) also confirms that, for the short horizon of 24 h, analogues and ensemble forecasts are quite similar. I also included the mean absolute error (MAE) in Figure 3, so you can see that analogues always outperform deterministic forecasts, as the ensemble forecasts do. Analogues therefore constitute a more challenging baseline than deterministic forecasts.
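For readers less familiar with the CRPS, the sketch below shows one common way to estimate it for an ensemble, using the identity CRPS = E|X − y| − ½ E|X − X′|, where X and X′ are independent draws from the ensemble and y is the observation. This is a generic estimator, not necessarily the one used to produce Figure 3, and the function name `crps_ensemble` and the numbers are illustrative.

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of one ensemble forecast against one observation.

    Uses CRPS = E|X - y| - 0.5 * E|X - X'|, with expectations taken
    over the ensemble members (each member weighted equally).
    """
    x = np.asarray(members, dtype=float)
    # Mean absolute difference between members and the observation.
    accuracy = np.mean(np.abs(x - obs))
    # Mean absolute difference over all pairs of members (the spread).
    spread = np.mean(np.abs(x[:, None] - x[None, :]))
    return accuracy - 0.5 * spread

# A three-member ensemble and a single-valued forecast, same observation.
crps_ens = crps_ensemble([10.0, 12.0, 14.0], obs=12.0)
crps_det = crps_ensemble([10.0], obs=12.0)
print(crps_ens, crps_det)
```

With a single member the spread term vanishes and the CRPS reduces to the absolute error. This is why the MAE of deterministic forecasts can be compared directly with the mean CRPS of probabilistic forecasts.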
On the use of more severe benchmarks such as analogues
I see much effort being put into improving meteorological ensemble forecasts, and still not that many end-users adopting them operationally. Many operational agencies instead construct ensembles from the deterministic meteorological forecasts and past climatology. But all the effort put into improving meteorological ensemble forecasts should be beneficial to hydrology, shouldn’t it?
In my opinion, using deterministic forecasts as the only baseline when assessing the performance of an ensemble forecasting system is akin to using the Nash-Sutcliffe criterion to assess the performance of a deterministic simulation or forecast. This criterion is based on a particular naïve forecast (the average streamflow) that is easily beaten compared to other types of naïve forecasts (e.g., Schaefli and Gupta, 2007). For instance, the previous streamflow observation, used as a naïve forecast for a persistence-based criterion, is a much more challenging baseline.
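The difference between the two naïve forecasts can be illustrated with a small sketch. Both criteria have the form 1 − SSE(forecast) / SSE(reference); only the reference changes (the mean observed flow for Nash-Sutcliffe, the previous observation for persistence). The function names and the series below are made up for illustration.

```python
import numpy as np

def skill_vs_reference(obs, sim, ref):
    """1 - SSE(sim) / SSE(ref): positive when sim beats the reference."""
    obs, sim, ref = (np.asarray(a, dtype=float) for a in (obs, sim, ref))
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - ref) ** 2)

def nash_sutcliffe(obs, sim):
    """Reference forecast: the mean of the observations."""
    obs = np.asarray(obs, dtype=float)
    return skill_vs_reference(obs, sim, np.full_like(obs, obs.mean()))

def persistence_index(obs, sim, lag=1):
    """Reference forecast: the observation `lag` time steps earlier."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    # The first `lag` values have no earlier observation, so drop them.
    return skill_vs_reference(obs[lag:], sim[lag:], obs[:-lag])

# A steadily rising hydrograph and a simulation biased by +1 m3/s.
obs = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
sim = [11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
nse = nash_sutcliffe(obs, sim)
pi = persistence_index(obs, sim)
print(round(nse, 3), round(pi, 3))
```

On this toy series the simulation gets a Nash-Sutcliffe value of about 0.66, yet its persistence index is 0: it does no better than repeating the last observation. The same forecast looks decent or useless depending on the baseline.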
Similarly, because they provide an estimate of the probability distribution, analogues are more difficult to beat than deterministic forecasts. In addition, there exists a wide variety of analogue methods, from very simple (like the ones in this blog) to very sophisticated (see, for example, Marty et al., 2012).
I know, for instance, that, in Canada, apart from Quebec and British Columbia, ensemble systems are only just emerging. In such a situation, a comparison between deterministic and ensemble forecasts is still relevant. But otherwise, I would like to see more studies comparing hydrological ensemble forecasts obtained from meteorological ensembles to severe benchmarks such as analogues. More such studies might convince end-users who are already using some form of uncertainty estimation to switch to incorporating meteorological ensemble forecasts in their forecasting process.
And you? What do you think? Is it really important for HEPEX to promote the use of meteorological ensemble forecasts in hydrology, or will just any valid method of generating streamflow ensembles do?
- Boucher M.-A., Anctil F., Perreault L., and Tremblay D. (2011): A comparison between ensemble and deterministic hydrological forecasts in an operational context, Advances in Geosciences, 29, 85-94, doi:10.5194/adgeo-29-85-2011.
- Krzysztofowicz R. (2001): The case for probabilistic forecasting in hydrology, Journal of Hydrology, 249(1-4), 2-9.
- Marty R., Zin I., Obled C., Bontron G. and Djerboua A. (2012): Toward Real-Time Daily PQPF by an Analog Sorting Approach: Application to Flash-Flood Catchments, Journal of Applied Meteorology and Climatology, 51, 505-520.
- Pappenberger F., Ramos M.H., Cloke H.L., Wetterhall F., Alfieri L., Bogner K., Mueller A. and Salamon P. (2015): How do I know if my forecasts are better? Using benchmarks in hydrological ensemble prediction, Journal of Hydrology, 522, 697-713.
- Schaefli B. and Gupta H. V. (2007): Do Nash values have value?, Hydrological Processes, 21(15), 2075-2080.