I have recently contributed to a paper where we investigate how statistical post-processing and data assimilation (also called real-time model updating in the engineering community) can be intrinsically related in the hydrological forecasting framework. The paper, co-written with François Bourgin (main author), Guillaume Thirel, and Vazken Andréassian, can be found here. We were basically guided by the following questions:
- How does data assimilation impact hydrological ensemble forecasts?
- How does post-processing impact hydrological ensemble forecasts?
- How does data assimilation interact with post-processing to improve the quality and skill of hydrological ensemble forecasts over the forecast lead times?
Our study was based on 202 unregulated catchments in France, data at hourly time steps over 1997–2006, bias corrected short-range meteorological ensemble predictions from the PEARP system by Météo-France (11 members, 60-h forecast range, spatially disaggregated to an 8 km × 8 km grid), and the GRP hydrological model (a continuous, parsimonious lumped storage-type model designed for flood forecasting and developed at Irstea in France).
Our main conclusions indicated that:
- Data assimilation has a strong impact on improving the quality of the ensemble mean, and a much lesser effect on the variability of the ensemble members (i.e., their spread). Post-processing has a strong impact on forecast reliability.
- The benefits of the combined use of data assimilation and post-processing are clearly shown: both contribute to achieving reliable and sharp forecasts, with impacts acting differently according to the target lead time.
- The stronger impact on forecast reliability comes from the use of post-processing. Adding data assimilation to the system helps in improving sharpness and reliability at all lead times, with higher gains in performance at shorter lead times.
But what were we considering as “post-processing” and as “data assimilation”?
One interesting comment of an anonymous reviewer of the paper questioned our definition of what pertains to post-processing and what pertains to data assimilation (by the way, interesting posts on these topics were previously published in the Hepex blog and can be found following the links).
The reviewer asked us to provide a clear explanation of the difference between data assimilation and post-processing.
In our study, a hydrological uncertainty processor (HUP) was used as a post-processing technique applied to estimate the conditional errors of the hydrological model (i.e., the model is run with observed weather data). Basically, it can be summarized by the following characteristics:
- Data-based and non-parametric method to assess model simulation uncertainties.
- Empirical quantiles of relative errors estimated (stratified by different flow groups).
- HUP trained during the period used for calibrating the parameters of the hydrological model.
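To make the three characteristics above concrete, here is a minimal sketch of such a data-based, non-parametric post-processor: empirical quantiles of relative errors, stratified by flow groups. This is not the actual HUP code from the paper; the function names, the number of flow groups, and the tercile-based grouping are illustrative assumptions.

```python
import numpy as np

def train_hup(sim, obs, n_groups=3, probs=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Estimate empirical quantiles of relative errors (obs / sim),
    stratified into flow groups defined by quantiles of the simulated
    discharge. Illustrative sketch, not the paper's implementation."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    rel_err = obs / sim  # multiplicative relative error
    # Flow-group boundaries from quantiles of the simulated flow
    edges = np.quantile(sim, np.linspace(0.0, 1.0, n_groups + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # cover out-of-range flows
    group = np.searchsorted(edges, sim, side="right") - 1
    # One row of error quantiles per flow group
    quantile_table = np.array(
        [np.quantile(rel_err[group == g], probs) for g in range(n_groups)]
    )
    return edges, quantile_table

def apply_hup(sim_new, edges, quantile_table):
    """Turn a deterministic simulation into predictive quantiles by
    multiplying each value by the error quantiles of its flow group."""
    sim_new = np.asarray(sim_new, dtype=float)
    group = np.searchsorted(edges, sim_new, side="right") - 1
    return sim_new[:, None] * quantile_table[group]
```

Training would use the same period as the calibration of the hydrological model parameters, as noted above, with the model run on observed weather data.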
As for data assimilation, we considered two procedures in the flood forecasting chain:
- the last available observed discharge is used to directly update the routing store state,
- the last relative error is used to correct the model output with a multiplicative coefficient.
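The second procedure, which is the one at the centre of the "gray-zone" discussion below, can be sketched in a few lines. The function name and arguments here are illustrative, not taken from the GRP code:

```python
def corrected_forecast(raw_forecast, last_obs, last_sim):
    """Multiplicative output correction: scale the raw model forecast
    by the last observed-to-simulated discharge ratio. Illustrative
    sketch of the second DA procedure described above."""
    coeff = last_obs / last_sim  # last relative error
    return [q * coeff for q in raw_forecast]
```

For example, if the model under-predicted the last observation by 20%, every value of the issued forecast is scaled up by the same factor. Note that this touches only the model output time series, not the internal states, which is exactly why its status as "data assimilation" can be debated.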
The question of the reviewer more specifically addressed the last point. The reviewer wrote the following:
“…[the] DA scheme includes both model-state updates (“routing store state”), along with multiplicative model error corrections (MEC) applied to the model output discharge time-series. The latter (MEC) could fall into a gray-zone between DA and PP: if DA were defined to operate only on states of the hydrological model, then MEC would not qualify as DA; similarly, if PP [post-processing] could “operate” on time-series of model output (but was agnostic to hydrologic model states), then the MEC could qualify as PP; however, if PP were only to “operate” on static distributions of model errors, but potentially conditional on model output (as the authors’ PP is constructed), but not conditional on model output error values, then the MEC would not fall under the “PP umbrella”. […]”
“Gray-zones” in hydro-meteorological ensemble forecasting
I found it a very interesting comment; one that forced us to clarify our procedures, as I indicated above and you can read in the paper. Interestingly, some time later, at the 2015 EGU General Assembly in Vienna (where we presented a poster on the results of this paper during the ‘Ensemble hydro-meteorological forecasting’ session), a researcher from a meteorological institute came to discuss the same issue: for him, DA implies techniques that change the states of a model (it thus does not include output error corrections).
I didn’t dare to ask this researcher if he was our reviewer, but it made me think again that the “gray-zone” seems to be a real one. Overall, post-processing and data assimilation represent techniques that may be used in a forecasting system to improve the quality of the forecasts (i.e., to provide more accurate and reliable forecasts) and to, ultimately, enhance the usefulness of the forecasts in decision-making. But what we have also learned is that it is very useful to clearly define these operations in the context in which we use them in our pre-operational and operational systems.
Besides, I suspect that other “gray-zones” exist in hydro-meteorological ensemble forecasting. If you have one in mind, share it using the comment box. We will be glad to hear more about it!