A study of univariate forecasting methods for crude oil price

Mei-Ling Cheng (Department of Shipping and Transportation Management, National Taiwan Ocean University, Keelung, Taiwan)
Ching-Wu Chu (Department of Shipping and Transportation Management, National Taiwan Ocean University, Keelung, Taiwan)
Hsiu-Li Hsu (Department of Air and Sea Logistics and Marketing, Taipei University of Marine Technology, Taipei, Taiwan)

Maritime Business Review

ISSN: 2397-3757

Article publication date: 20 December 2021

Issue publication date: 7 March 2023


Abstract

Purpose

This paper aims to compare different univariate forecasting methods in order to identify a more accurate short-term forecasting model for the crude oil price, as a reference for managers.

Design/methodology/approach

Six different univariate methods, namely the classical decomposition model, the trigonometric regression model, the regression model with seasonal dummy variables, the grey forecast, the hybrid grey model and the seasonal autoregressive integrated moving average (SARIMA), have been used.

Findings

The authors found that the grey forecast is a reliable forecasting method for crude oil prices.

Originality/value

The contribution of this research study is using a small size of data and comparing the forecasting results of the six univariate methods. Three commonly used evaluation criteria, mean absolute error (MAE), root mean squared error (RMSE) and mean absolute percent error (MAPE), were adopted to evaluate the model performance. The outcome of this work can help predict the crude oil price.

Citation

Cheng, M.-L., Chu, C.-W. and Hsu, H.-L. (2023), "A study of univariate forecasting methods for crude oil price", Maritime Business Review, Vol. 8 No. 1, pp. 32-47. https://doi.org/10.1108/MABR-09-2021-0076

Publisher: Emerald Publishing Limited

Copyright © 2021, Pacific Star Group Education Foundation


1. Introduction

Crude oil accounts for nearly one-third of global energy consumption, making it the most crucial energy resource on Earth. Petroleum products are also refined from crude oil. Crude oil is the world's leading fuel, and its prices have a significant influence on the global environment and the economy.

The Organization of Petroleum Exporting Countries (OPEC) controls over 40% of global oil supplies. OPEC influences oil prices by increasing or decreasing production levels to meet the global demand for crude oil. When demand exceeds supply, prices go up, and when supply exceeds demand, prices go down. In addition, politics, interest rates and events such as natural disasters and the COVID-19 pandemic can also impact crude oil prices.

Crude oil prices affect many levels, including global economic and trade development, national security, economy, central bank policy and corporate profitability. If we can make accurate predictions, the uncertainty about the future can be reduced. From a government perspective, making accurate forecasts on crude oil prices and implementing appropriate measures can reduce domestic economic shocks and increase the well-being of people.

The fuel cost accounts for somewhere between 22 and 38% of total expenses in the airline industry and is the first or second highest cost item. Therefore, it is essential to operate fuel-efficient aircraft and execute other strategies that can help mitigate jet fuel price volatility. Before the pandemic, a fuel hedging strategy was used by airlines to reduce their exposure to fuel prices. In an environment of demand uncertainty and travel restrictions, including quarantine requirements imposed by governments, a rise in crude oil prices would decrease airline profitability (Midas Aviation).

The above situation in the airline industry is also applicable to the container shipping industry. The bunker cost is the first or second highest cost item. A decline in crude oil prices ultimately flows through to the price of marine fuel. The IMO 2020 rule requires ships without exhaust-gas scrubbers to switch from 3.5% heavy fuel oil to cleaner 0.5% sulfur fuel called very low sulfur fuel oil (VLSFO). The average VLSFO price at the world's top four bunker hubs was $494 per ton, which was much higher than one year ago. An increase in crude oil price would decrease the container shipping industry's profitability.

There is no doubt that crude oil price forecasts are beneficial to governments as well as industries. Thus, forecasting crude oil prices has been an important subject of research in both academia and industry. The purpose of this paper is to compare different univariate forecasting methods in order to identify a more accurate short-term forecasting model for the crude oil price, as a reference for authorities or managers.

Our paper is organized as follows: Section 2 provides the literature review. Section 3 presents the methodology. The comparison of the results from all methods is discussed in Section 4. Finally, Section 5 provides some concluding remarks.

2. Literature review

Bashiri Behmiri and Pires Manso (2013) provided a comprehensive review of crude oil price forecasting techniques. They categorized the existing techniques into two main groups: quantitative and qualitative methods. Quantitative methods include econometric methods and nonstandard methods. Econometric models are further grouped into time series models, financial models and structural models, while the nonstandard methods most frequently applied in oil price forecasting are artificial neural networks and support vector machines. Qualitative methods, in contrast, estimate the impact of infrequent events, such as wars and natural events, on oil prices; these approaches, including the Delphi method, belief networks, fuzzy logic, expert systems and web text mining, have recently gained popularity in the oil price forecasting literature.

Econometric models are the most frequently used methods in the existing oil price forecasting literature, and most of these studies rely on time series models, which deduce future price behavior from historical data. Time series models include moving average, exponential smoothing, decomposition methods, autoregressive integrated moving average (ARIMA), generalized autoregressive conditional heteroskedasticity (GARCH) and vector autoregressive (VAR) models.

ARIMA models are widely used in forecasting crude oil prices. Ahmad (2011) applied the ARIMA approach to the time series analysis of monthly average prices of Oman crude oil. The author recommended a seasonal ARIMA for short-term forecasting. Xiang and Zhuang (2013) collected the monthly price of Brent crude oil from November 2012 to April 2013 and constructed an ARIMA model that provided a good prediction result. Zhao and Wang (2014) proposed an ARIMA model to predict the world crude oil price based on data from 1970 to 2006. They found that the model was able to describe and predict the average annual price of world crude oil for the short-term forecast. Mensah (2015) examined the monthly Brent oil price for the last two decades using ARIMA techniques and compared different ARIMA models. The result showed that the ARIMA(1,1,1) model was the best forecasting model amidst the volatilities in the oil price. Faruk (2018) used the ARIMA to construct the best model for the average monthly price of crude oil in Nigeria and recommended that the government change the oil price benchmark and replace the MA model with the ARIMA model to ensure greater financial stability and efficient fiscal management in Nigeria. Selvi et al. (2018) constructed an ARIMA model to forecast the average annual crude oil price for the period 2017–2021 based on the data of 1946–2016.

The ARIMA model is a linear model, which captures the linear characteristics of a time series. However, some authors argued that a linear model may not capture risky and volatile data well, and therefore explored other models. Sadorsky (2006) compared different forecasting models to forecast petroleum prices. West Texas Intermediate (WTI) daily futures prices of crude oil, heating oil and unleaded gasoline were used. The results showed that the GARCH-type models outperformed the other techniques. Yaziz et al. (2011) used Box–Jenkins and GARCH models to construct a forecast model of the daily price of West Texas crude oil. The study collected a large amount of historical data and found that GARCH(1,1) captured the volatility of the data. Nwafor et al. (2018) compared the prediction ability of eight models based on the OPEC Reference Basket, and the results showed that the GARCH(1,1) model provided more accurate daily price forecasts. Gupta et al. (2020) collected a total of 1,500 records of the closing price of crude oil (5 years and 11 months). They proposed a forecasting model based on the back-propagation learning algorithm. The main advantage of this kind of neural network is that it can model the nonlinear and complex relationship between input and output.

The multivariate time series model can be used to explain the causal relationship between variables. The advantage of using this type of model is that it can explain the causes and consequences in logical analysis and help decision makers keep a close eye on oil price trends. Whether it is more accurate and stable, however, remains inconclusive. Ye et al. (2005) proposed a forecast model for the monthly spot price of West Texas crude oil. They incorporated variables such as oil inventories, oil production, imports and demand into the model. Since the total oil inventory level measures the balance between oil production and demand, the model could reflect the changing market conditions of crude oil prices.

Beckers et al. (2015) incorporated relevant variables into the vector auto-regression model. These variables include consumer price index (CPI) inflation, dollar exchange rates, US three-month and ten-year Treasury yields, spreads between long-term and short-term interest rates, oil supply and demand from major producers or regions, global industrial production, the Organisation for Economic Co-operation and Development (OECD) inventory demand and other variables. Although the vector auto-regression model has good forecast performance, it is still difficult to predict in the long-term sample with structural change.

Lyu et al. (2017) collected oil price data for 2003–2015 and used a combination of factor analysis and data correlation to select the factors affecting West Texas crude oil prices: world crude oil demand, world crude oil supply, US crude oil inventories and the US dollar index. The authors incorporated these variables into the established inverted neural network and an ARIMA model to predict West Texas crude oil prices. The results indicated that the inverted neural network model had a better performance.

Crude oil prices are highly uncertain: sometimes they remain stable and sometimes they are volatile. Building a conventional model requires collecting a tremendous amount of data, which is very time-consuming, so simple models that need only a small data set are becoming popular. In recent years, scholars have tried the grey forecast in crude oil price forecasting, which brings a new application to this field. Lin (2009) used the grey forecast model to forecast the average monthly price of WTI crude oil. Xu (2010) applied the grey model to forecast China's crude oil consumption. Huang et al. (2016) utilized the grey prediction model to predict global crude oil consumption. Duan et al. (2018) forecast crude oil consumption in China using a grey prediction model. Norouzi et al. (2020) adopted the grey forecasting model to predict the OPEC crude oil prices. The grey forecast model gave better results than models relying on long-term data.

From the literature, we found that the time series model is the most widely adopted method for crude oil price forecasting. It is also evident that most studies have focused on long-term forecasting and used large data sets. There is limited research that uses a small data set and compares the performance of different methods on short-term forecasting. This paper attempts to bridge this gap by identifying a practical yet highly accurate model. In this research, six different univariate methods, namely the grey forecast, the hybrid grey model, the multiplicative decomposition model, the trigonometric regression model, the regression model with seasonal dummy variables and the seasonal autoregressive integrated moving average (SARIMA), have been used.

3. Methodology

3.1 Grey forecasting model

Grey theory, developed originally by Deng (1989), is a generic theory that deals with systems characterized by poor information or insufficient information. The primary merit of a grey model is that it needs fewer data to generate forecasts.

GM (1, 1) is a notation for a first-order and single-variable grey model. The constructing process of GM (1, 1) can be described as follows.

Denote the original data sequence as

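$$x^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(n)\right), \tag{1}$$
where $x^{(0)}(k)$ stands for the original observation in period k. The following sequence $x^{(1)}$ is defined as
$$x^{(1)} = \left(\sum_{k=1}^{1} x^{(0)}(k), \sum_{k=1}^{2} x^{(0)}(k), \ldots, \sum_{k=1}^{n} x^{(0)}(k)\right) = \left(x^{(1)}(1), x^{(1)}(2), x^{(1)}(3), \ldots, x^{(1)}(n)\right). \tag{2}$$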

Equation (2) was generated based on the accumulated generating operation (AGO) of Equation (1).

To test whether the sequence was acceptable for constructing the model, we calculated the class ratio $\sigma^{(1)}(k)$ as follows:

$$\sigma^{(1)}(k) = \frac{x^{(1)}(k-1)}{x^{(1)}(k)}, \quad k \ge 2. \tag{3}$$

If $\sigma^{(1)}(k) \in (0, 1)$ for every k, then $x^{(1)}(k)$ was suitable for constructing the model.

After completing the class ratio test, we constructed the GM(1, 1) model by establishing a first-order differential equation for x(1)(k) as follows:

$$\frac{dx^{(1)}}{dk} + a x^{(1)} = b, \tag{4}$$
where a and b denote the coefficients to be determined.

Next, we applied the ordinary least squares method to Equation (4) to estimate the coefficients a and b. After obtaining the estimated coefficients, $\hat{a}$ and $\hat{b}$, we substituted them into the following equation:

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{\hat{b}}{\hat{a}}\right) e^{-\hat{a}k} + \frac{\hat{b}}{\hat{a}} \quad \text{and} \quad \hat{x}^{(1)}(1) = x^{(0)}(1). \tag{5}$$

The forecast value of the time series could then be calculated using the inverse accumulated generating operation (IAGO), which converts the accumulated forecasts back to the scale of $x^{(0)}(k)$ as follows:

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k). \tag{6}$$
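
The GM(1,1) steps in Equations (1) to (6) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names (`ago`, `class_ratios`, `gm11_fit`, `gm11_forecast`) are hypothetical, and the least squares step is assumed to regress the original values on the mean value generating sequence of the accumulated series, which is the usual way Equation (4) is discretized.

```python
import math

def ago(x0):
    """Accumulated generating operation, Equation (2): running sums of x0."""
    out, total = [], 0.0
    for v in x0:
        total += v
        out.append(total)
    return out

def class_ratios(x1):
    """Class ratios of Equation (3): x1(k-1) / x1(k) for k >= 2."""
    return [x1[k - 1] / x1[k] for k in range(1, len(x1))]

def gm11_fit(x0):
    """Estimate (a, b) by ordinary least squares.

    Assumption: Equation (4) is discretized as x0(k) + a * z(k) = b, where
    z(k) = 0.5 * (x1(k) + x1(k-1)) is the mean value generating sequence.
    """
    x1 = ago(x0)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    m = len(z)
    z_mean = sum(z) / m
    y_mean = sum(y) / m
    slope = (sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
             / sum((zi - z_mean) ** 2 for zi in z))
    a = -slope                      # x0(k) = -a * z(k) + b
    b = y_mean - slope * z_mean     # intercept of the same regression
    return a, b

def gm11_forecast(x0, steps):
    """Forecast the next `steps` values using Equations (5) and (6)."""
    a, b = gm11_fit(x0)
    c = x0[0] - b / a

    def x1_hat(k):                  # Equation (5): value of x1_hat(k + 1)
        return c * math.exp(-a * k) + b / a

    n = len(x0)
    # Equation (6): difference consecutive accumulated forecasts
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]
```

For a series growing by roughly 5% per period, such as `[100, 105, 110.25, 115.76, 121.55]`, the estimated `a` is negative and the first forecast is close to the next value of the geometric pattern.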

3.2 Hybrid grey forecasting model

Peng and Chu (2009) found that the classical decomposition model was the best model for forecasting container throughput with seasonal variations. We modified the same forecasting model proposed by Peng and Chu (2009) by combining the grey model with the multiplicative decomposition model. Since the size of the initial sequence in the grey forecast could affect the forecast performance, we searched for the initial sequence's size with the lowest prediction errors. The constructing steps are as follows.

  1. Step 1: The first step is to remove the seasonal component from the original data; this step is equal to the calculation of the multiplicative decomposition model discussed in Section 3.3. The predicted results were used as the input in Steps 2 and 3.

  2. Step 2: We constructed a grey forecast model, as discussed in Section 3.1, based on the time-series data obtained in Step 1.

    • By setting K, the initial sequence's size to four, we obtained the predicted values and prediction errors.

    • We changed the size of K to 5 and continued the same calculation processes until K = 8.

    • We identified a value of K that provided the lowest prediction error.

  3. Step 3: We conducted a grey forecast month by month based on the data for the same month of different years.

  4. Step 4: Comparing the results from both Step 2 and 3, we chose K's size with the lowest prediction error.
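
The search over the initial sequence size K described in the steps above can be sketched as follows. The sketch makes several assumptions that the text does not state: the initial sequence is taken to be the most recent K observations, the prediction error is measured by the mean absolute error, and `forecaster` is a hypothetical callable (for example, a GM(1,1) routine) that maps a window and a horizon to a list of forecasts.

```python
def select_window(history, actual, forecaster, sizes=range(4, 9)):
    """Return (best_k, errors) for initial sequence sizes K = 4, ..., 8.

    history    : past observations (list of floats)
    actual     : hold-out observations used to score each K
    forecaster : hypothetical callable (window, horizon) -> list of forecasts
    """
    errors = {}
    for k in sizes:
        window = history[-k:]                      # assumed: last K observations
        preds = forecaster(window, len(actual))
        # assumed error measure: mean absolute error over the hold-out period
        errors[k] = sum(abs(a - p) for a, p in zip(actual, preds)) / len(actual)
    best_k = min(errors, key=errors.get)           # K with the lowest error
    return best_k, errors
```

Passing the forecasting routine in as an argument keeps the search independent of the model, so the same loop can be reused for the plain grey forecast in Step 2 and the month-by-month forecast in Step 3.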

3.3 Multiplicative decomposition model

In the decomposition model, a time series can be decomposed into four separate factors: trend, seasonal, cyclical and irregular factors. The decomposition model is based on intuition rather than theory. Two types of decomposition models, multiplicative and additive, are commonly used in practice. In this study, we adopted the multiplicative model and expressed the time series as follows:

$$Y_t = TR_t \times SN_t \times CL_t \times IR_t, \tag{7}$$
where $Y_t$ is the observed value of the time series in time period t,
  • $TR_t$ is the trend factor in time period t,

  • $SN_t$ is the seasonal factor in time period t,

  • $CL_t$ is the cyclical factor in time period t and

  • $IR_t$ is the irregular factor in time period t.

First, we calculated a 12-period moving average and denoted it as $MA_t$ for period t. Next, the centered moving average at time t was calculated as $CMA_t = \tfrac{1}{2}(MA_t + MA_{t+1})$. Since the trend and cyclical components were included in the centered moving average series, i.e. $CMA_t = TR_t \times CL_t$, in Equation (7), we calculated the product of the seasonal and irregular components of the time series as follows: $SI_t = SN_t \times IR_t = Y_t / CMA_t$.

To remove the irregular factor from $SI_t$, we estimated the average of the observations in month t for four successive years to obtain the seasonal factor for month t, expressed as $SN_t = \tfrac{1}{4}(SI_t + SI_{t+12} + SI_{t+24} + SI_{t+36})$.

Dividing $Y_t$ by the seasonal index $SN_t$, we generated a deseasonalized series ($Y_t / SN_t$) for estimating the trend factor $TR_t$. Next, the deseasonalized time series data were used to estimate $TR_t$ by a linear trend model as follows:

$$TR_t = \alpha + \beta t + \varepsilon_t. \tag{8}$$

To obtain the point estimates for α and β, we applied the least squares method to Equation (8) and obtained the following estimate for the trend factor:

$$\widehat{TR}_t = a + bt. \tag{9}$$
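
A minimal Python sketch of these decomposition steps is given below. It is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the seasonal factor is the plain average of whatever $SI_t$ ratios are available for each month (no normalization is applied), and the forecast is taken to be the fitted trend multiplied by the seasonal factor.

```python
def centered_moving_average(y, period=12):
    """Return {index: CMA} for observations with a full window (0-based index)."""
    ma = [sum(y[j:j + period]) / period for j in range(len(y) - period + 1)]
    # CMA_t = 0.5 * (MA_t + MA_{t+1}), aligned with the middle of the window
    return {j + period // 2: 0.5 * (ma[j] + ma[j + 1]) for j in range(len(ma) - 1)}

def seasonal_indices(y, period=12):
    """Average the ratios SI_t = Y_t / CMA_t month by month."""
    cma = centered_moving_average(y, period)
    ratios = {i: y[i] / c for i, c in cma.items()}
    indices = []
    for m in range(period):
        vals = [r for i, r in ratios.items() if i % period == m]
        indices.append(sum(vals) / len(vals))
    return indices

def linear_trend(values):
    """Least squares fit of values on t = 1, ..., n; returns (a, b) of Equation (9)."""
    n = len(values)
    t = range(1, n + 1)
    t_mean = (n + 1) / 2
    v_mean = sum(values) / n
    b = (sum((ti - t_mean) * (vi - v_mean) for ti, vi in zip(t, values))
         / sum((ti - t_mean) ** 2 for ti in t))
    a = v_mean - b * t_mean
    return a, b

def decomposition_forecast(y, horizon, period=12):
    """Forecast TR_t * SN_t for t = n + 1, ..., n + horizon."""
    sn = seasonal_indices(y, period)
    deseason = [v / sn[i % period] for i, v in enumerate(y)]
    a, b = linear_trend(deseason)
    n = len(y)
    return [(a + b * (n + h)) * sn[(n + h - 1) % period]
            for h in range(1, horizon + 1)]
```

Applied to the first 13 observations of the series, `centered_moving_average` gives a value of about 48.094 for period 7, which agrees with the centered moving average reported for t = 7 in Table 1.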

3.4 Trigonometric regression model

The trigonometric model can forecast time series that exhibit seasonal variations. We adopted a general specification that allows the modeling of a more complicated increasing seasonal pattern, suggested by Bowerman and O'Connell (1993), as follows:

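$$Y_t = \beta_0 + \beta_1 t + \beta_2 \sin\frac{2\pi t}{L} + \beta_3 t \sin\frac{2\pi t}{L} + \beta_4 \cos\frac{2\pi t}{L} + \beta_5 t \cos\frac{2\pi t}{L} + \beta_6 \sin\frac{4\pi t}{L} + \beta_7 t \sin\frac{4\pi t}{L} + \beta_8 \cos\frac{4\pi t}{L} + \beta_9 t \cos\frac{4\pi t}{L} + \varepsilon_t, \tag{10}$$
where L denotes the number of periods within a year in the data. In our case, we considered monthly data, so L is 12.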
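
To make the structure of the trigonometric model concrete, the sketch below builds the ten regressors for a given period t: a constant, a linear trend, and sine and cosine terms at the annual and semi-annual frequencies, each with and without a trend interaction. The function names `trig_row` and `trig_predict` are hypothetical, and the coefficient vector is assumed to be supplied by a separate least squares fit.

```python
import math

def trig_row(t, L=12):
    """Regressors of the trigonometric model for period t (L periods per year)."""
    w1 = 2 * math.pi * t / L   # annual frequency
    w2 = 4 * math.pi * t / L   # semi-annual frequency
    return [
        1.0, float(t),
        math.sin(w1), t * math.sin(w1),
        math.cos(w1), t * math.cos(w1),
        math.sin(w2), t * math.sin(w2),
        math.cos(w2), t * math.cos(w2),
    ]

def trig_predict(beta, t, L=12):
    """Fitted value for period t given coefficients beta_0, ..., beta_9."""
    return sum(b * x for b, x in zip(beta, trig_row(t, L)))
```

The terms multiplied by t allow the amplitude of the seasonal swing to grow or shrink with the trend, which is what distinguishes this specification from a model with constant seasonal amplitude.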

3.5 Regression model with seasonal dummy variables

By using seasonal dummy variables, the regression model provides a convenient technique for analyzing seasonal variation. When analyzing a time series with seasonal variation, we often use a model of the following form:

$$Y_t = TR_t + SN_t + \varepsilon_t,$$
where $TR_t$ is the trend in time period t, $SN_t$ is the seasonal factor in time period t and $\varepsilon_t$ denotes the error term in time period t. Assuming that seasonal variations can be represented by dummy variables, one for each month, we have the seasonal factor as follows:
$$SN_t = \sum_{i=1}^{11} \beta_{si} X_{si,t}, \tag{11}$$
where
$$X_{si,t} = \begin{cases} 1 & \text{if period } t \text{ is month } i,\ i = 1, 2, 3, \ldots, 11, \\ 0 & \text{otherwise.} \end{cases}$$

Substituting Equation (11) into the above model of the time series, we obtained the regression model as follows:

$$Y_t = \beta_0 + \beta_1 t + \beta_{s1} X_{s1,t} + \beta_{s2} X_{s2,t} + \cdots + \beta_{s11} X_{s11,t} + \varepsilon_t. \tag{12}$$

Applying the least squares method to Equation (12), we expressed the forecasted value of the time series as

$$\hat{Y}_t = b_0 + b_1 t + b_{s1} X_{s1,t} + b_{s2} X_{s2,t} + \cdots + b_{s11} X_{s11,t}. \tag{13}$$
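
The dummy-variable regression can be sketched in plain Python as follows. This is an illustrative sketch only: `dummy_row` and `ols` are hypothetical names, period t = 1 is assumed to fall in month 1, month 12 is assumed to be the baseline month without a dummy, and the least squares estimates are obtained by solving the normal equations directly rather than with a statistical package.

```python
def dummy_row(t, period=12):
    """Regressors of the dummy model: constant, trend and 11 monthly dummies."""
    month = (t - 1) % period + 1            # assumed: t = 1 is month 1
    return [1.0, float(t)] + [1.0 if month == i else 0.0
                              for i in range(1, period)]

def ols(X, y):
    """Least squares via the normal equations X'X b = X'y (Gauss-Jordan)."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        # Partial pivoting: bring the largest remaining entry to the diagonal
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [u - f * v for u, v in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]
```

With 48 monthly observations, `ols([dummy_row(t) for t in range(1, 49)], y)` returns 13 coefficients in the order $b_0, b_1, b_{s1}, \ldots, b_{s11}$; each dummy coefficient measures the average difference between that month and the baseline month after removing the trend.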

3.6 ARIMA

The seasonal model used in this research is the most general form of a univariate class of models originally suggested by Box and Jenkins (1976). It has been extensively studied and used in different fields. An important requirement of the model building process is that the time series data be stationary, which implies that the probabilistic structure of the series does not change over time. The SARIMA is an extension of ARIMA that supports time series with a seasonal component by incorporating seasonal factors into the ARIMA model. The SARIMA model is usually denoted as SARIMA(p, d, q)(P, D, Q)s, where p is the autoregressive order, d is the number of differencing operations, q is the moving average order and P, D and Q are the corresponding seasonal orders. The SARIMA has the following form:

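$$\varphi_p(B)\,\Phi_P(B^s)\,\nabla^d\,\nabla_s^D Z_t = \theta_q(B)\,\Theta_Q(B^s)\,\varepsilon_t, \tag{14}$$
where
  • $\varphi_p(B) = 1 - \varphi_1 B - \varphi_2 B^2 - \cdots - \varphi_p B^p$,

  • $\Phi_P(B^s)$ is the seasonal operator of order P,

  • B is the backshift operator with $B^m Z_t = Z_{t-m}$,

  • s is the season length,

  • $\nabla^d = (1 - B)^d$ is the non-seasonal differencing operator,

  • $\nabla_s^D = (1 - B^s)^D$ is the seasonal differencing operator,

  • $Z_t$ is the stationary data at time t,

  • $\theta_q(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$ and

  • $\Theta_Q(B^s)$ is the seasonal operator of order Q, and $\varepsilon_t$ is white noise with zero mean and constant variance.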
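
The two differencing operators in Equation (14) are simple to apply in code. The sketch below is a plain Python illustration with hypothetical function names; it only performs the differencing and does not estimate the autoregressive or moving average parameters.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag): subtract the value observed `lag` periods earlier."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def sarima_difference(series, d=0, D=0, s=12):
    """Apply the seasonal operator (1 - B^s)^D, then the regular operator (1 - B)^d."""
    out = list(series)
    for _ in range(D):
        out = difference(out, s)   # seasonal differencing at lag s
    for _ in range(d):
        out = difference(out, 1)   # regular differencing at lag 1
    return out
```

Each regular difference shortens the series by one observation and each seasonal difference by s observations, so a second difference of 48 monthly values leaves 46 values for model identification.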

After transforming the original data into a stationary time series, a model building process included identification, parameter estimation and diagnostic checking. A tentative autoregressive moving average process was developed at the identification stage based on the theoretical model's estimated shape. This process of comparison allowed the definition of p and q.

Having specified an initial model, we estimated the unknown parameters in Equation (14) by inspecting the behavior of the auto-covariance function and partial autocorrelation function.

At the final stage, diagnostic checking, we examined the residuals to determine the model's adequacy. If the model was adequate, it was used for forecasting.

4. Data and results

In this section, we first describe the data used for the study. Results from all six forecasting models are reported next. Then, we evaluate the results by the three criteria and compare the forecasting accuracy of the models.

4.1 Time series data on crude oil price

The monthly data on the average crude oil price is collected from January 2015 to December 2019. We split the data into two sets: an in-sample data set for estimation and an out-of-sample data set for prediction. The in-sample data cover the period from January 2015 to December 2018, while the out-of-sample data are from January to December 2019.

4.2 Results for the models

4.2.1 Grey forecasting

With the grey model, the forecast values are susceptible to the size of the initial sequence of the time series chosen. Therefore, we conducted the grey forecast five times by varying the initial sequence and found that the lowest prediction errors were obtained when the size of the initial sequence was equal to 4. The following steps demonstrate the process of generating predicted values for the time series by the grey model:

  1. Class ratio test:

Application of GM(1,1) model requires that the time series data pass the class ratio test first. As mentioned in Section 3.1, the value of σ(k) must fall within the interval between 0 and 1 for the sequence x(0) to fit the grey model. The results of the class ratio test suggest that the grey model is appropriate for the time series.

  2. Accumulated generating operation (AGO):

Based on equation (2), we perform AGO to obtain the sequence.

  3. Mean value generating sequence:

We calculated the mean value generating sequence based on the sequence obtained from the accumulated generating operation.

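  4. Time series prediction model:

Using the least squares method, we obtain estimates for coefficients a and b as follows:

$$\hat{a} = 0.1868912999, \qquad \hat{b} = 89.6763678173$$

These estimates are used to get

$$\hat{x}^{(0)}(k+1) = \left(1 - e^{\hat{a}}\right)\left[x^{(0)}(1) - \frac{\hat{b}}{\hat{a}}\right] e^{-\hat{a}k} \tag{15}$$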

Based on equation (15), the predicted values of the time series from January 2019 to December 2019 are calculated and presented in Table 3.
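
For readers who wish to check the algebra, Equation (15) follows directly from Equations (5) and (6). The short derivation below is a sketch in which the constant C is shorthand introduced here for convenience; it holds for k ≥ 1.

```latex
\begin{aligned}
C &= x^{(0)}(1) - \frac{\hat{b}}{\hat{a}}, \\
\hat{x}^{(0)}(k+1)
  &= \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k) \\
  &= \left(C e^{-\hat{a}k} + \frac{\hat{b}}{\hat{a}}\right)
   - \left(C e^{-\hat{a}(k-1)} + \frac{\hat{b}}{\hat{a}}\right) \\
  &= C e^{-\hat{a}k}\left(1 - e^{\hat{a}}\right).
\end{aligned}
```

The constant term $\hat{b}/\hat{a}$ cancels in the difference, so each forecast is the previous one multiplied by $e^{-\hat{a}}$.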

4.2.2 Hybrid grey forecast

Following the same calculation procedures described in Section 3.2 for the hybrid grey model, we can obtain the predicted crude oil prices from January 2019 to December 2019. The detailed results are also given in Table 3.

4.2.3 Multiplicative decomposition model

Based on the multiplicative decomposition model discussed in Section 3.3, we summarize the calculation results for crude oil prices in Table 1.

4.2.4 Trigonometric regression model

We estimate equation (10) using statistical analysis system (SAS) statistical software and obtain the following results:

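$$
\begin{aligned}
\hat{y}_t ={}& \underset{(16.8)}{40.66382} + \underset{(5.31)}{0.45501}\,t + \underset{(1.06)}{3.57426}\sin\!\left(\frac{2\pi t}{12}\right) - \underset{(0.85)}{0.10327}\,t\sin\!\left(\frac{2\pi t}{12}\right) \\
&- \underset{(1.2)}{4.11685}\cos\!\left(\frac{2\pi t}{12}\right) + \underset{(0.43)}{0.05154}\,t\cos\!\left(\frac{2\pi t}{12}\right) - \underset{(0.65)}{2.18172}\sin\!\left(\frac{4\pi t}{12}\right) \\
&+ \underset{(0.45)}{0.05425}\,t\sin\!\left(\frac{4\pi t}{12}\right) + \underset{(1.17)}{3.97381}\cos\!\left(\frac{4\pi t}{12}\right) - \underset{(1.31)}{0.15647}\,t\cos\!\left(\frac{4\pi t}{12}\right)
\end{aligned}
\tag{16}
$$
where the number in parentheses beneath each coefficient indicates the t-value for the estimated coefficient. In addition, $R^2$ = 0.4775 and adjusted $R^2$ = 0.3538. For the analysis of variance, the F-value for the overall significance of the model is 3.86 with a p-value of 0.0015, suggesting that the model is acceptable empirically. Equation (16) is then used to forecast the crude oil price from January 2019 to December 2019. Detailed forecast results are summarized in Table 3.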

4.2.5 Seasonal dummy regression

As with the trigonometric regression model, we estimate equation (12) using the SAS statistical package. However, before running the regression, we conducted a logarithmic transformation of the dependent variable. The results of the estimation are reported as follows:

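$$
\begin{aligned}
\hat{y}_t ={}& b_0 + b_1 t + b_{s1} X_{s1,t} + b_{s2} X_{s2,t} + \cdots + b_{s11} X_{s11,t} \\
={}& \underset{(6.76)}{35.03896} + \underset{(5.1)}{0.46437}\,t + \underset{(0.81)}{4.91305}\,X_{s1,t} + \underset{(0.8)}{4.86118}\,X_{s2,t} + \underset{(0.78)}{4.72431}\,X_{s3,t} \\
&+ \underset{(1.32)}{7.696744}\,X_{s4,t} + \underset{(1.73)}{10.39808}\,X_{s5,t} + \underset{(1.51)}{9.07871}\,X_{s6,t} + \underset{(1.12)}{6.71684}\,X_{s7,t} \\
&+ \underset{(0.63)}{3.75997}\,X_{s8,t} + \underset{(0.84)}{5.0456}\,X_{s9,t} + \underset{(1.09)}{6.54374}\,X_{s10,t} + \underset{(0.31)}{1.85687}\,X_{s11,t}
\end{aligned}
\tag{17}
$$
where, again, the number in parentheses beneath each coefficient is the t-value for the coefficient estimate. The F-value for the overall significance of the model is 2.52 with a p-value of 0.0166, while $R^2$ = 0.4635 and adjusted $R^2$ = 0.2795. Based on the results reported in equation (17), we calculate the predicted crude oil prices from January 2019 to December 2019. The detailed results are listed in Table 3.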

4.2.6 ARIMA model

By examining the historical data, we found that a trend existed in the data. Hence, the original series needs first differencing or another form of differencing to produce a stationary series. After trying several orders of differencing, we found that second differencing produces a stationary series. Figure 1 shows that the original data become stationary after second differencing.

Once we have transformed the original data into stationary time series, we use the autocorrelation function (ACF) and partial autocorrelation function (PACF) to identify tentative Box–Jenkins models. Three possible models are summarized in Table 2. A good way to check the adequacy of an overall Box–Jenkins model is to analyze the residuals obtained from the model. The Ljung–Box statistic from the SAS output is used.

Table 2 provides a summary of the Ljung–Box statistic values and p-values for the possible models. Using these values to check model adequacy, we found that Models 2 and 3 are not adequate since not all of their p-values are >0.05. Only Model 1 is adequate, since its Ljung–Box statistic values (chi-square values) are small and all of its p-values are >0.05.
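
For reference, the Ljung–Box statistic used in this adequacy check can be computed from the residual autocorrelations as $Q = n(n+2)\sum_{k=1}^{h} r_k^2/(n-k)$. The sketch below is a plain Python illustration with hypothetical function names; it is not the SAS routine used in the study and it returns only the statistic, not the p-value.

```python
def autocorrelation(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom

def ljung_box(residuals, h):
    """Ljung-Box Q statistic over lags 1, ..., h."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, h + 1)
    )
```

A small Q relative to the chi-square critical value (with h minus the number of estimated ARMA parameters as degrees of freedom) gives a large p-value, which is read as no evidence of remaining autocorrelation in the residuals.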

The best model identified for crude oil price is ARIMA(2,2,0) and estimated as follows:

$$Z_t = \underset{(6.73)}{0.99775}\,Z_{t-1} - \underset{(3.84)}{0.56635}\,Z_{t-2} + \varepsilon_t \tag{18}$$
where the value in parentheses beneath each coefficient refers to the t-value for the coefficient estimate. Equation (18) is used to forecast crude oil price for the ARIMA specification.

4.3 Comparison of forecasting methods

The predicted values of crude oil price for the out-of-sample period from January to December 2019 are computed for each of the six forecasting methods. We summarize the results in Table 3, along with actual crude oil prices for comparison.

Yokum and Armstrong (1995) conducted two studies on experts' opinions on the criteria used in selecting forecasting techniques. They found that accuracy was considered the most critical criterion by most researchers. Since there is no universally accepted measure of accuracy that can be applied to every forecasting situation, several criteria are typically used to comprehensively assess forecasting models. The performance of the models often differs depending on the accuracy measure being used (Makridakis et al., 1982). In this study, we consider three commonly chosen accuracy criteria to assess the six forecasting models. They are the root mean squared error (RMSE), the mean absolute error (MAE) and the mean absolute percent error (MAPE), defined as follows:

$$MAE = \frac{\sum_{i=1}^{n} \left| Y_i - \hat{Y}_i \right|}{n} \tag{19}$$
$$MAPE = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{Y_i - \hat{Y}_i}{Y_i} \right| \tag{20}$$
$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2}{n}} \tag{21}$$
where $Y_i$ and $\hat{Y}_i$ are the actual and the predicted values of the time series in period i, respectively. All three measures are non-negative, and the smaller the value of a measure, the better the performance of the forecasting method.
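
The three criteria in Equations (19) to (21) translate directly into code. The following is a minimal Python sketch with hypothetical function names; it assumes the actual and predicted values are lists of equal length and that no actual value is zero (otherwise MAPE is undefined).

```python
import math

def mae(actual, predicted):
    """Mean absolute error, Equation (19)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percent error, Equation (20), in percent."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, Equation (21)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

For example, with actual values `[100, 200]` and predictions `[110, 180]`, the MAE is 15, the MAPE is 10% and the RMSE is about 15.81; the RMSE exceeds the MAE because squaring gives more weight to the larger error.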

The comparative results of forecasting accuracy of the six methods are presented in Table 4. Table 4 shows that the grey forecast is clearly the best forecasting model since it has the lowest values of all three performance measures. The ARIMA appears to be the second best model for forecast accuracy on all three measures. There seems to be no major difference between the results from the hybrid grey forecast, the multiplicative decomposition and the trigonometric model. On the other hand, seasonal dummy regression is found to be the worst method for predicting crude oil prices.

In forecasting research, it is well established that no single forecasting model is the best for all situations under all circumstances (Makridakis et al., 1982). Nevertheless, from all performance measurements reported above, we found that, in general, the grey forecast outperformed the other forecast methods in this study and is a reliable method for forecasting crude oil prices.

5. Conclusions

In this paper, six methods, including the grey forecast, the hybrid grey forecast, the classical decomposition, the trigonometric model, the seasonal dummy variables and the ARIMA, have been applied to forecast West Texas crude oil prices based on the monthly data. We compare the predicting accuracy of the models by calculating MAE, MAPE and RMSE for each of the models.

Grey forecast is the best model for crude oil price forecasting. Comparing the results of this study with Lin's (2009) paper shows that the grey forecast requires only a small sample, involves simple calculation and gives accurate predictions. In Lin's study, the WTI crude oil price prediction for February 2009 had an accuracy of 95.62%. The results are similar to those of this study, which had accuracies of 98.83, 97.92, 96.45 and 95.50% in August, September, October and November 2019, respectively.

ARIMA is the second best model for crude oil price forecasting. Although the crude oil price series is seasonal, the seasonality is not pronounced. The single-variable ARIMA model is based on mathematical and statistical theory and can convert the historical oil price data into a stationary pattern, which makes it a reliable forecasting model for capturing the characteristics of a time series. The contribution of this study lies in using a small data set and comparing the performance of different methods on short-term forecasting. This paper bridged this gap in the application and found a practical yet highly accurate model.

There is a limitation in our research. The collapse of oil prices occurred when COVID-19 had halted much of global economic activity. The negative oil prices of 2020 may not be well explained by any of the forecasting methods because they were affected by the epidemic situation, other human factors and economic measures taken to prevent a severe economic downturn.

In the future, if the research focuses on explaining the causal relationship between variables, a multivariate approach can be used since many factors affect the crude oil price. Furthermore, it may be worthwhile to combine and explore other forecasting methods, such as neural networks, artificial intelligence or advanced data mining techniques, to predict crude oil prices.

Figures

Figure 1. The SAS output of the ACF and PACF for the second differences of the original data

Predicted crude oil price using multiplicative decomposition model

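| $t$ | $Y_t$ | 12 MA | CMA $= TR_t \times CL_t$ | $SN_t \times IR_t$ | $SN_t$ | $Y_t / SN_t$ (deseasonalized) | $TR_t = a + bt$ | $\hat{Y}_t = TR_t \times SN_t$ |
|---|---|---|---|---|---|---|---|---|
| 1 | 47.4 | | | | 0.982 | 48.27 | 41.4572 | 40.7068 |
| 2 | 50.83 | | | | 0.959 | 52.99 | 41.9014 | 40.1909 |
| 3 | 47.85 | | | | 0.985 | 48.57 | 42.3455 | 41.7182 |
| 4 | 54.44 | | | | 1.029 | 52.92 | 42.7896 | 44.0147 |
| 5 | 59.26 | | | | 1.066 | 55.58 | 43.2337 | 46.0992 |
| 6 | 59.8 | | | | 1.032 | 57.96 | 43.6779 | 45.0624 |
| 7 | 51.16 | 48.759 | 48.094 | 1.064 | 1.010 | 50.65 | 44.1220 | 44.5683 |
| 8 | 42.86 | 47.429 | 46.576 | 0.920 | 0.951 | 45.08 | 44.5661 | 42.3712 |
| 9 | 45.48 | 45.723 | 45.303 | 1.004 | 0.977 | 46.54 | 45.0102 | 43.9851 |
| 10 | 46.2 | 44.883 | 44.321 | 1.042 | 1.022 | 45.22 | 45.4544 | 46.4419 |
| 11 | 42.64 | 43.759 | 43.242 | 0.986 | 0.992 | 42.98 | 45.8985 | 45.5326 |
| 12 | 37.19 | 42.725 | 42.265 | 0.880 | 0.995 | 37.37 | 46.3426 | 46.1242 |
| 13 | 31.44 | 41.804 | 41.543 | 0.757 | 0.982 | 32.02 | 46.7867 | 45.9399 |
| 14 | 30.35 | 41.282 | 41.360 | 0.734 | 0.959 | 31.64 | 47.2309 | 45.3028 |
| 15 | 37.77 | 41.439 | 41.426 | 0.912 | 0.985 | 38.34 | 47.6750 | 46.9688 |
| 16 | 40.96 | 41.413 | 41.567 | 0.985 | 1.029 | 39.82 | 48.1191 | 49.4968 |
| 17 | 46.85 | 41.721 | 41.845 | 1.120 | 1.066 | 43.94 | 48.5633 | 51.7819 |
| 18 | 48.75 | 41.969 | 42.583 | 1.145 | 1.032 | 47.25 | 49.0074 | 50.5608 |
| 19 | 44.89 | 43.198 | 44.078 | 1.018 | 1.010 | 44.44 | 49.4515 | 49.9517 |
| 20 | 44.75 | 44.958 | 45.918 | 0.975 | 0.951 | 47.07 | 49.8956 | 47.4382 |
| 21 | 45.17 | 46.878 | 47.370 | 0.954 | 0.977 | 46.22 | 50.3398 | 49.1933 |
| 22 | 49.89 | 47.863 | 48.288 | 1.033 | 1.022 | 48.83 | 50.7839 | 51.8872 |
| 23 | 45.62 | 48.713 | 48.782 | 0.935 | 0.992 | 45.99 | 51.2280 | 50.8196 |
| 24 | 51.93 | 48.851 | 48.702 | 1.066 | 0.995 | 52.18 | 51.6721 | 51.4286 |
| 25 | 52.56 | 48.553 | 48.627 | 1.081 | 0.982 | 53.53 | 52.1163 | 51.1729 |
| 26 | 53.40 | 48.701 | 48.838 | 1.093 | 0.959 | 55.67 | 52.5604 | 50.4148 |
| 27 | 49.58 | 48.974 | 49.163 | 1.008 | 0.985 | 50.33 | 53.0045 | 52.2194 |
| 28 | 51.17 | 49.353 | 49.422 | 1.035 | 1.029 | 49.75 | 53.4486 | 54.9789 |
| 29 | 48.50 | 49.492 | 49.954 | 0.971 | 1.066 | 45.49 | 53.8928 | 57.4646 |
| 30 | 45.17 | 50.416 | 50.655 | 0.892 | 1.032 | 43.78 | 54.3369 | 56.0593 |
| 31 | 46.67 | 50.894 | 51.358 | 0.909 | 1.010 | 46.20 | 54.7810 | 55.3351 |
| 32 | 48.03 | 51.823 | 52.188 | 0.920 | 0.951 | 50.52 | 55.2251 | 52.5052 |
| 33 | 49.71 | 52.553 | 53.107 | 0.936 | 0.977 | 50.87 | 55.6693 | 54.4014 |
| 34 | 51.56 | 53.660 | 54.291 | 0.950 | 1.022 | 50.46 | 56.1134 | 57.3325 |
| 35 | 56.71 | 54.923 | 55.813 | 1.016 | 0.992 | 57.17 | 56.5575 | 56.1066 |
| 36 | 57.67 | 56.703 | 57.626 | 1.001 | 0.995 | 57.94 | 57.0017 | 56.7330 |
| 37 | 63.70 | 58.549 | 59.552 | 1.070 | 0.982 | 64.87 | 57.4458 | 56.4060 |
| 38 | 62.17 | 60.555 | 61.381 | 1.013 | 0.959 | 64.82 | 57.8899 | 55.5267 |
| 39 | 62.86 | 62.207 | 63.058 | 0.997 | 0.985 | 63.81 | 58.3340 | 57.4700 |
| 40 | 66.32 | 63.908 | 64.705 | 1.025 | 1.029 | 64.47 | 58.7782 | 60.4610 |
| 41 | 69.86 | 65.503 | 65.493 | 1.067 | 1.066 | 65.52 | 59.2223 | 63.1474 |
| 42 | 67.33 | 65.483 | 65.126 | 1.034 | 1.032 | 65.26 | 59.6664 | 61.5577 |
| 43 | 70.74 | 64.768 | | | 1.010 | 70.03 | 60.1105 | 60.7186 |
| 44 | 67.85 | | | | 0.951 | 71.36 | 60.5547 | 57.5723 |
| 45 | 70.13 | | | | 0.977 | 71.76 | 60.9988 | 59.61 |
| 46 | 70.69 | | | | 1.022 | 69.19 | 61.4429 | 62.78 |
| 47 | 56.48 | | | | 0.992 | 56.93 | 61.8870 | 61.39 |
| 48 | 49.09 | | | | 0.995 | 49.32 | 62.3312 | 62.04 |
| 49 | 51.43 | | | | 0.982 | | 62.7753 | 61.64 |
| 50 | 54.91 | | | | 0.959 | | 63.2194 | 60.64 |
| 51 | 58.10 | | | | 0.985 | | 63.6635 | 62.72 |
| 52 | 63.81 | | | | 1.029 | | 64.1077 | 65.94 |
| 53 | 60.78 | | | | 1.066 | | 64.5518 | 68.83 |
| 54 | 54.61 | | | | 1.032 | | 64.9959 | 67.06 |
| 55 | 57.33 | | | | 1.010 | | 65.4401 | 66.10 |
| 56 | 54.76 | | | | 0.951 | | 65.8842 | 62.64 |
| 57 | 56.90 | | | | 0.977 | | 66.3283 | 64.82 |
| 58 | 53.91 | | | | 1.022 | | 66.7724 | 68.22 |
| 59 | 56.92 | | | | 0.992 | | 67.2166 | 66.68 |
| 60 | 59.77 | | | | 0.995 | | 67.6607 | 67.34 |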
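
The last two columns of the table follow from the relation TRt = a + bt and the prediction Yt = TRt × SNt. The sketch below reproduces the out-of-sample predictions for t = 49 to 60 under two stated assumptions: the seasonal indexes are read from the SNt column, and the trend coefficients a and b are not given in this section, so they are recovered here from two tabulated trend values (TR at t = 1 and t = 60) and are therefore approximations. The names `SEASONAL`, `trend` and `predict` are illustrative.

```python
# Seasonal indexes SN_t for positions 1..12 of the year, read from the table.
SEASONAL = [0.982, 0.959, 0.985, 1.029, 1.066, 1.032,
            1.010, 0.951, 0.977, 1.022, 0.992, 0.995]

# Assumption: the linear trend is recovered from two tabulated trend values,
# TR_1 = 41.4572 and TR_60 = 67.6607, rather than from the original regression.
B = (67.6607 - 41.4572) / 59   # slope per month
A = 41.4572 - B                # intercept

def trend(t):
    """Trend component TR_t = a + b * t."""
    return A + B * t

def predict(t):
    """Multiplicative decomposition prediction: trend times seasonal index."""
    return trend(t) * SEASONAL[(t - 1) % 12]

# Predictions for the twelve hold-out months (t = 49 .. 60).
forecasts = {t: round(predict(t), 2) for t in range(49, 61)}
```

Small differences from the tabulated predictions are expected, since the seasonal indexes in the table are rounded to three decimals.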

A summary of the Q* values and prob-values for the models

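Model 1: ARIMA(2,2,0), $Z_t = \psi_1 Z_{t-1} + \psi_2 Z_{t-2} + \varepsilon_t$, where $Z_t$ is the second differences of the original time series

Model 2: ARIMA(2,2,0), $Z_t = \psi_2 Z_{t-2} + \varepsilon_t$, where $Z_t$ is the second differences of the original time series

Model 3: ARIMA(2,1,0), $Z_t = \psi_1 Z_{t-1} + \psi_2 Z_{t-2} + \varepsilon_t$, where $Z_t$ is the quadratic roots of the original time series

| Model | To lag | Chi-square | p-value |
|---|---|---|---|
| Model 1 | 6 | 2.74 | 0.6016 |
| Model 1 | 12 | 7.14 | 0.7125 |
| Model 1 | 18 | 9.45 | 0.8936 |
| Model 1 | 24 | 13.10 | 0.9303 |
| Model 2 | 6 | 14.86 | 0.0110 |
| Model 2 | 12 | 21.91 | 0.0251 |
| Model 2 | 18 | 26.77 | 0.0615 |
| Model 2 | 24 | 34.41 | 0.0595 |
| Model 3 | 6 | 13.19 | 0.0001 |
| Model 3 | 12 | 13.94 | 0.2364 |
| Model 3 | 18 | 14.57 | 0.6266 |
| Model 3 | 24 | 14.69 | 0.9054 |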
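
For readers who wish to see how a statistic of this kind is obtained, the sketch below computes a Ljung-Box statistic from a residual series. It assumes that Q* denotes the Ljung-Box statistic, as is common in Box-Jenkins diagnostics; the table itself comes from SAS output and is not reproduced by this code. The function names are illustrative, and the p-value step is left out because it needs a chi-square tail function (for example `scipy.stats.chi2.sf`), which is outside the standard library.

```python
def autocorrelations(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a non-constant series."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]

def ljung_box(residuals, max_lag):
    """Ljung-Box statistic: n (n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(residuals)
    r = autocorrelations(residuals, max_lag)
    return n * (n + 2) * sum(rk ** 2 / (n - k) for k, rk in enumerate(r, start=1))

# Hypothetical residuals that alternate in sign, i.e. strongly autocorrelated.
residuals = [1.0, -1.0] * 10
q_lag1 = ljung_box(residuals, 1)
```

The statistic is compared with a chi-square distribution whose degrees of freedom equal the number of lags minus the number of estimated parameters; a large p-value suggests that the residuals behave like white noise and the model is adequate.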

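Actual and predicted crude oil prices (unit: USD per barrel)

| Month | Actual price | Grey forecast | Hybrid grey forecast | Multiplicative decomposition | Trigonometric model | Seasonal dummy regression | ARIMA (2,2,0) |
|---|---|---|---|---|---|---|---|
| 1 | 51.43 | 39.86 | 56.53 | 61.64 | 59.40 | 62.70 | 52.37 |
| 2 | 54.91 | 47.32 | 64.30 | 60.64 | 63.65 | 63.11 | 51.77 |
| 3 | 58.10 | 57.91 | 66.41 | 62.72 | 66.18 | 63.44 | 51.15 |
| 4 | 63.81 | 61.80 | 64.30 | 65.94 | 65.01 | 67.15 | 54.39 |
| 5 | 60.78 | 68.44 | 69.30 | 68.83 | 62.26 | 70.04 | 55.81 |
| 6 | 54.61 | 63.55 | 69.51 | 67.06 | 62.09 | 69.19 | 53.46 |
| 7 | 57.33 | 51.15 | 60.47 | 66.10 | 66.23 | 67.29 | 56.51 |
| 8 | 54.76 | 54.11 | 63.90 | 62.64 | 71.80 | 64.80 | 55.90 |
| 9 | 56.90 | 55.71 | 66.87 | 64.82 | 73.85 | 66.55 | 58.29 |
| 10 | 53.91 | 55.89 | 65.99 | 68.22 | 70.29 | 68.51 | 59.74 |
| 11 | 56.92 | 54.35 | 70.95 | 66.68 | 64.32 | 64.29 | 54.30 |
| 12 | 59.77 | 55.93 | 64.10 | 67.34 | 61.52 | 62.90 | 50.82 |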
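
The three evaluation criteria used in this study (MAE, RMSE and MAPE) can be computed from any pair of actual and predicted series. The sketch below applies the standard definitions to the actual prices and the grey forecast column of the table above. The function names are illustrative, and the resulting figures are a plain recomputation over these twelve months; they are not claimed to match the values reported in the paper's performance table, which may rest on a different evaluation window.

```python
import math

# Actual prices and grey forecasts for months 1..12, copied from the table.
ACTUAL = [51.43, 54.91, 58.10, 63.81, 60.78, 54.61,
          57.33, 54.76, 56.90, 53.91, 56.92, 59.77]
GREY = [39.86, 47.32, 57.91, 61.80, 68.44, 63.55,
        51.15, 54.11, 55.71, 55.89, 54.35, 55.93]

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percent error, in percent; actual values must be non-zero."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

grey_scores = {
    "MAE": mae(ACTUAL, GREY),
    "RMSE": rmse(ACTUAL, GREY),
    "MAPE": mape(ACTUAL, GREY),
}
```

The same three functions can be applied to any other forecast column to compare methods on an equal footing.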

Performance of various methods of forecasting crude oil prices

References

Ahmad, M.I. (2011), “Modeling and forecasting Oman crude oil prices using Box-Jenkins techniques”, Society of Interdisciplinary Business Research (SIBR) 2011 Conference on Interdisciplinary Business Research.

Bashiri Behmiri, N. and Pires Manso, J.R. (2013), “Crude oil price forecasting techniques: a comprehensive review of literature”, available at: https://ssrn.com/abstract=2275428.

Beckers, B. and Beidas-Strom, S. (2015), Forecasting the Nominal Brent Oil Price with VARs: One Model Fits All?, IMF Working Papers, International Monetary Fund, available at: https://www.imf.org/en/Publications/WP/Issues/2016/12/31/Forecasting-the-Nominal-Brent-Oil-Price-with-VARs-One-Model-Fits-All-43423.

Bowerman, B.L. and O'Connell, R.T. (1993), Forecasting and Time Series: An Applied Approach, 3rd ed., Duxbury Press, CA.

Box, G.E.P. and Jenkins, G.M. (1976), Time Series Analysis, Forecasting and Control, 2nd ed., Holden Day, San Francisco.

Deng, J.L. (1989), “Introduction to grey system theory”, Journal of Grey System, Vol. 1, pp. 1-24.

Duan, H., Lei, G.R. and Shao, K. (2018), “Forecasting crude oil consumption in China using a grey prediction model with an optimal fractional-order accumulating operator”, Complexity, Vol. 2018, pp. 1-12.

Faruk, B.U. (2018), An Application of Box-Jenkins Approach in Forecasting Nigeria Crude Oil Prices, PhD Thesis, Department of Economics, Umaru Musa Yaradua University, Katsina.

Gupta, N. and Nigam, S. (2020), “Crude oil price prediction using Artificial Neural Network”, Procedia Computer Science, Vol. 170, pp. 642-647.

Huang, S.P., Zhang, L., Wang, M.H., Wang, R. and University, C.J. (2016), “Analysis of global oil consumption based on grey prediction model”, Science and Technology Innovation Herald, Vol. 34, pp. 118-121.

Lin, A. (2009), “Prediction of international crude oil futures price based on GM(1,1)”, 2009 IEEE International Conference on Grey Systems and Intelligent Services (GSIS 2009).

Lyu, H. and Chang, Y. (2017), “Research on international crude oil price forecasting model”, International Journal of New Developments in Engineering and Society, Vol. 1 No. 3, pp. 78-81.

Makridakis, S., Anderson, A., Carbone, R., Fildes, R., Hibon, M., Lewandowski, R., Newton, J., Parzen, E. and Winkler, R. (1982), “The accuracy of extrapolation (time series) methods: results of a forecasting competition”, Journal of Forecasting, Vol. 1 No. 2, pp. 111-153.

Mensah, E.K. (2015), Box-Jenkins Modelling and Forecasting of Brent Crude Oil Price, MPRA Paper No. 67748, Munich Personal RePEc Archive, Munich.

Norouzi, N. and Fani, M. (2020), “Black gold falls, black plague arise - an OPEC crude oil price forecast using a gray prediction model”, Upstream Oil and Gas Technology, Vol. 5, 100015.

Nwafor, C.N. and Oyedele, A.A. (2018), “Forecasting OPEC oil price: a comparison of parametric stochastic models”, European Journal of Business and Management, Vol. 10 No. 10, pp. 49-60.

Peng, W.Y. and Chu, C.W. (2009), “A comparison of univariate methods for forecasting container throughput volumes”, Mathematical and Computer Modelling, Vol. 50, pp. 1045-1057.

Sadorsky, P. (2006), “Modeling and forecasting petroleum futures volatility”, Energy Economics, Vol. 28, pp. 467-488.

Selvi, J.J., Shree, R.K. and Krishnan, J. (2018), “Forecasting crude oil price using ARIMA models”, International Journal of Advance Research in Science and Engineering, Vol. 7 No. 5, pp. 334-343.

Xiang, Y. and Zhuang, X.H. (2013), “Application of ARIMA model in short-term prediction of international crude oil price”, Advanced Materials Research, Vols 798-799, pp. 979-982.

Xu, R.S. (2010), “Research on China's oil consumption demand forecast based on Grey System Theory”, Statistics and Decision, Vol. 320, pp. 98-101.

Yaziz, S.R., Ahmad, M.H., Nian, L.C. and Muhammad, N. (2011), “A comparative study on Box-Jenkins and GARCH models in forecasting crude oil prices”, Journal of Applied Sciences, Vol. 11 No. 7, pp. 1129-1135.

Ye, M.T., Zyren, J. and Shore, J. (2005), “A monthly crude oil spot price forecasting model using relative inventories”, International Journal of Forecasting, Vol. 21, pp. 491-501.

Yokum, J.T. and Armstrong, J.S. (1995), “Beyond accuracy: comparison of criteria used to select forecasting methods”, International Journal of Forecasting, Vol. 11 No. 4, pp. 591-597.

Zhao, C. and Wang, B. (2014), “Forecasting crude oil price with an autoregressive integrated moving average (ARIMA) model”, in Fuzzy Information and Engineering and Operations Research and Management, pp. 275-286.

Further reading

Aamir, M. and Shabri, A.B. (2015), “Modeling and forecasting monthly crude prices of Pakistan: a comparative study of ARIMA, GARCH and ARIMA−GARCH Models”, Science International, Vol. 27 No. 3, pp. 2365-2371.

He, X.J. (2018), “Crude oil prices forecasting: time series vs. SVR models”, Journal of International Technology and Information Management, Vol. 27 No. 2, Article 2.

Safari, A. and Mohammadi, M. (2017), “Assessing linear and nonlinear models to forecast OPEC oil prices”, Revista QUID, (Special Issue), Vol. 1 No. 3, pp. 13-20.

Tularam, G.A. and Saeed, T. (2016), “Oil-price forecasting based on various univariate time-series models”, American Journal of Operations Research, Vol. 6, pp. 226-235.

Yu, F., Liu, Y. and Zhang, C. (2021), “Forecasting the price of fuel oil: a STL−(ELM+ARIMA) combination approach”, Journal of Physics: Conference Series, Vol. 1903, 012048.

Acknowledgements

The authors would like to thank the anonymous referees for their helpful comments.

Corresponding author

Ching-Wu Chu can be contacted at: cwchu@ntou.edu.tw
