One-step ahead forecasting of geophysical processes within a purely statistical framework

Papacharalampous, Georgia; Tyralis, Hristos; Koutsoyiannis, Demetris

doi:10.1186/s40562-018-0111-1

Research Letter
Open access
Published: 07 April 2018

Geoscience Letters volume 5, Article number: 12 (2018)

Abstract

The simplest way to forecast geophysical processes, an engineering problem widely recognized as challenging, is the so-called “univariate time series forecasting”, which can be implemented using stochastic or machine learning regression models within a purely statistical framework. Regression models are in general fast to implement, in contrast to the computationally intensive Global Circulation Models, which constitute the most frequently used alternative for precipitation and temperature forecasting. For their simplicity and easy applicability, the former have been proposed as benchmarks for the latter by forecasting scientists. Herein, we assess the one-step ahead forecasting performance of 20 univariate time series forecasting methods when applied to a large number of geophysical and simulated time series of 91 values. We use two real-world annual datasets, one composed of 112 time series of precipitation and another composed of 185 time series of temperature, as well as their respective standardized datasets, to conduct several real-world experiments. We further conduct large-scale experiments using 12 simulated datasets. These datasets contain 24,000 time series in total, which are simulated using stochastic models from the families of AutoRegressive Moving Average and AutoRegressive Fractionally Integrated Moving Average. We use the first 50, 60, 70, 80 and 90 data points for model-fitting and model-validation, and make predictions corresponding to the 51st, 61st, 71st, 81st and 91st data points, respectively. The total number of forecasts produced herein is 2,177,520, among which 47,520 are obtained using the real-world datasets. The assessment is based on eight error metrics and accuracy statistics. The simulation experiments reveal the most and least accurate methods for long-term forecasting applications, also suggesting that the simple methods may be competitive in specific cases. Regarding the results of the real-world experiments using the original (standardized) time series, the minimum and maximum medians of the absolute errors are found to be 68 mm (0.55) and 189 mm (1.42), respectively, for precipitation, and 0.23 °C (0.33) and 1.10 °C (1.46), respectively, for temperature. Since relevant information is absent from the literature, the numerical results obtained using the standardized real-world datasets could be used as rough benchmarks for the one-step ahead predictability of annual precipitation and temperature.
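The testing scheme described in the abstract (fit on the first 50, 60, 70, 80 and 90 values, then predict the next one) can be illustrated with a minimal Python sketch. It is not the authors' code: the two forecasting functions are generic simple benchmarks chosen for illustration, the function and variable names are hypothetical, the toy series is random noise rather than real data, and standardizing with the mean and sample standard deviation of the full series is an assumption.

```python
import random
import statistics

# Fitting lengths taken from the abstract; each yields one forecast
# of the next data point (the 51st, 61st, 71st, 81st and 91st).
FIT_LENGTHS = (50, 60, 70, 80, 90)


def naive_forecast(history):
    """Simple benchmark: the next value equals the last observed value."""
    return history[-1]


def mean_forecast(history):
    """Simple benchmark: the next value equals the mean of the history."""
    return statistics.fmean(history)


def standardize(series):
    """Assumed standardization: subtract the mean, divide by the sample sd."""
    mu = statistics.fmean(series)
    sigma = statistics.stdev(series)
    return [(x - mu) / sigma for x in series]


def one_step_abs_errors(series, method, fit_lengths=FIT_LENGTHS):
    """Return one absolute one-step ahead error per fitting length."""
    errors = []
    for n in fit_lengths:
        history = series[:n]   # first n data points, used for fitting
        target = series[n]     # the (n + 1)-th data point, held out
        errors.append(abs(method(history) - target))
    return errors


# Toy series of 91 values (white noise, purely illustrative).
rng = random.Random(0)
toy_series = [rng.gauss(0.0, 1.0) for _ in range(91)]

for name, method in (("naive", naive_forecast), ("mean", mean_forecast)):
    errs = one_step_abs_errors(standardize(toy_series), method)
    print(name, "median absolute error:", round(statistics.median(errs), 3))
```

Pooling such absolute errors over all series in a dataset and taking their median per method would give a summary of the kind quoted in the abstract, under the assumptions stated above.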

Background

Forecasting geophysical variables at various time scales and horizons is useful in technological applications (e.g. Giunta et al. 2015), but it is also a difficult task. Precipitation and temperature forecasting is mostly based on deterministic models such as the Global Circulation Models (GCMs), which simulate the Earth’s atmosphere using numerical equations, and it therefore deviates from traditional time series forecasting, i.e. univariate time series forecasting. This particular deviation has been questioned by forecasting scientists (Green and Armstrong 2007; Green et al. 2009; Fildes and Kourentzes 2011; see also the comments in Keenlyside 2011; McSharry 2011). Traditional time series forecasting can be performed using several classes of regression models, as reviewed in De Gooijer and Hyndman (2006), of which the two major classes are stochastic and machine learning models. Regression models are in general fast to implement, in contrast to their computationally intensive alternative in precipitation and temperature forecasting, i.e. the GCMs. For their simplicity and easy applicability, the former have been proposed as benchmarks for the latter by Green et al. (2009).
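To make the idea of univariate forecasting as regression concrete, the sketch below regresses each value of a series on its predecessor by ordinary least squares and uses the fitted line to predict the next value. This first-order autoregressive form is only an illustrative assumption, not one of the specific methods assessed in this study, and the function names are hypothetical.

```python
import random


def fit_ar1(series):
    """Least-squares fit of x[t] = a + b * x[t-1]; returns (a, b)."""
    x = series[:-1]   # predictors: lagged values
    y = series[1:]    # responses: values one step later
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def forecast_ar1(series):
    """One-step ahead forecast from the fitted lag-1 regression."""
    a, b = fit_ar1(series)
    return a + b * series[-1]


# Toy example: simulate 91 values with lag-1 dependence (coefficient 0.7
# is arbitrary), fit on the first 90 and forecast the 91st.
rng = random.Random(1)
sim = [0.0]
for _ in range(90):
    sim.append(0.7 * sim[-1] + rng.gauss(0.0, 1.0))

prediction = forecast_ar1(sim[:90])
print("forecast:", round(prediction, 3), "observed:", round(sim[90], 3))
```

The only input is the series itself; no exogenous predictors enter the fit, which is what distinguishes this setting from approaches driven by physical models.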

Recognizing the necessity of introducing traditional forecasting methods in temperature and precipitation forecasting, Armstrong and Fildes (2006) recommended a relevant issue in one of the journals specialized in forecasting. Since then, and despite the fact that considerable parts of books in hydrology are devoted to such methods (Sivakumar 2017, pp 63–145; Remesan and Mathew 2015, pp 71–110), there has not been a systematic approach to the subject. However, studies adopting statistical forecasting approaches in geoscience are sporadically published in a variety of journals. Within a statistical framework, Tyralis and Koutsoyiannis (2014, 2017) use Bayesian techniques for probabilistic climate forecasts under the established assumption of long-range dependence of the observed time series. In the latter study, information from GCMs is used to improve the performance of the time series forecasting methods. Moreover, Table 1 presents some examples of studies that use univariate time series forecasting approaches, without exogenous predictor variables, to forecast precipitation or temperature variables, and streamflow or river discharge variables. The former can be considered as climatic or meteorological variables, depending on the time scale of interest, while the latter can be considered as the results of precipitation (and other) variables and are more frequently modelled by describing this dependence using either deterministic or statistical methods. Such statistical approaches to modelling hydrological variables can be found in Chen et al. (2015), Gholami et al. (2015) and Taormina and Chau (2015).

Table 1 Examples of univariate time series forecasting in geoscience
