1 Introduction

The accurate estimation of the implied volatility surface of stock options is paramount for option pricing, risk management, and financial research. Existing methods, often tested on highly liquid index options, demonstrate limitations when applied to single stock options. To address this, our study utilizes an extensive dataset of half a billion daily price observations for options on 499 US individual stocks and the S&P 500. Our analysis identifies the one-dimensional kernel smoother as the most reliable and accurate method for constructing daily volatility surfaces.

These findings bear significant implications for a spectrum of stakeholders, including traders, risk managers, investors, and academics. By adopting the one-dimensional kernel smoother, they can estimate volatility surfaces with higher precision, leading to improved decision-making and analysis. This research accentuates the importance of a robust method that delivers accurate predictions across diverse options, regardless of their maturity, moneyness, and liquidity variations.

The research contrasts the one-dimensional kernel smoother with the three-dimensional kernel smoother by OptionMetrics (2022) and the spline method by Figlewski (2008). The one-dimensional kernel smoother consistently outshines the other methods, delivering the lowest leave-one-out cross-validation root-mean-squared errors (RMSE) across different volume buckets, maturity, and moneyness categories.

Moreover, our research uncovers the disruptive effects of noise in the volatility surface construction, particularly from the three-dimensional kernel smoother. Such distortions profoundly impact the Bakshi et al. (2003) moments and skewness spanning regressions, especially for short-term options and during extreme volatility events. These distortions potentially mislead derivative pricing and trading decisions, emphasizing the urgency for a more reliable volatility surface construction method.Footnote 1

This study significantly enhances the existing literature in several ways. Firstly, we extend the investigation of volatility surface construction methods to include single stock options of varied liquidity, moving beyond the typically studied liquid index options.Footnote 2 This extension gains relevance in the current era of big data and machine learning, where single stock options offer forward-looking insights into firm-specific information.

Secondly, we substantially broaden the scope and depth of the data used in this research field.Footnote 3 Our work incorporates almost half a billion daily option prices from 500 distinct underlying assets, marking a significant advancement in both dataset volume and investigative depth.

Moreover, we challenge the prevailing perception of the three-dimensional kernel smoother as the ’gold standard,’ advocating for the more precise and robust refined one-dimensional kernel smoother. This methodology allows for constructing a dense volatility surface for both single stock and index options, enabling more accurate and reliable inferences in applied empirical research studies.

Lastly, our research uncovers significant issues with the spline method, particularly for short-term out-of-the-money single stock options used in studies like Driessen et al. (2009) and Bollerslev and Todorov (2011), shedding light on areas that require improvement in existing literature.

The rest of the paper is structured as follows. Section 2 provides a detailed exposition of the methodology employed by each state-of-the-art method. In Sect. 3, we elaborate on our evaluation methodology for comparing the predictive accuracy of each method. Section 4 presents a brief overview of the data. We summarize our key findings in Sect. 5. Finally, Sect. 6 provides concluding remarks.

2 Constructing single stock volatility surfaces

In this section, we present the methodology for constructing single stock volatility surfaces by comparing three methods that have been established to be accurate in prior research: (i) the spline method proposed by Figlewski (2008), (ii) the three-dimensional kernel smoother of OptionMetrics (2022), and (iii) the one-dimensional kernel smoother of Ulrich and Walther (2020). We exclude the parametric Gram-Charlier expansion method, as it has been identified as the least accurate method for constructing volatility surfaces of European-style index options by Ulrich and Walther (2020).

While the literature has highlighted the superior performance of the one-dimensional kernel smoother for most moneyness and maturity combinations in the context of European-style index options, our study focuses on American single-stock options that are influenced by firm-specific information, creating additional variance in option prices. Our analysis is unique in several ways. First, we use a much larger dataset consisting of nearly half a billion end-of-day option prices, compared to the 9 million S&P 500 option prices used by Ulrich and Walther (2020). Second, OptionMetrics (2022) has updated its methodology in response to Ulrich and Walther (2020), prompting us to investigate whether the one-dimensional kernel smoother still outperforms OptionMetrics’ approach. Lastly, previous studies have primarily focused on S&P 500 index options, while our research specifically targets single-stock American options.

It is important to note that all of the methods examined are agnostic to both the time and the specific firm under consideration. As such, we omit a firm and time index in the following formulas, which are generalizable to different times and single firm combinations.

2.1 One-dimensional kernel smoother

In this section, a description of the one-dimensional kernel smoother employed in our approach is provided.

We consider the input volatility surface for a firm and day, denoted by \(\{\sigma _{i,\tau }\}_{i \in [1, N_{\tau }], \tau \in {\mathcal {T}}}\). Here, \(N_{\tau }\) denotes the total number of observed option-implied moneyness levels for a \(\tau\) maturity option and \({\mathcal {T}}\) stands for the set of observed maturities. We denote with \(m_{i,\tau } \in \{m_{i,\tau }\}_{i \in [1, N_{\tau }], \tau \in {\mathcal {T}}}\) an ascendingly ordered set of observed option-implied moneyness levels for a specific observed option maturity \(\tau\).Footnote 4 Moneyness \(m_{i,\tau }\) is defined as

$$\begin{aligned} m_{i,\tau }:= \frac{K_{i,\tau }}{F_\tau } \end{aligned}$$
(1)

where K and F stand for the strike and the single stock’s forward price, respectively. We denote the observed option-implied volatility for \(m_{i,\tau }\) as \(\sigma _{i,\tau }\).

We let \(K^{min}_{\tau }\) and \(K^{max}_{\tau }\) denote the smallest and largest observed strike of a \(\tau\)-maturity option, i.e.

$$\begin{aligned} K^{min}_{\tau }:= \min _{i \in [1, N_{\tau }]} K_{i,\tau }, \quad \text {and} \quad K^{max}_{\tau }:= \max _{i \in [1, N_{\tau }]} K_{i,\tau }. \end{aligned}$$
(2)

Our goal is to generate a volatility surface for normalized moneyness levels ranging from \([-10,4]\). Specifically, we express the normalized moneyness \({\bar{m}}_{i,\tau }\) as a function of both the moneyness \(m_{i,\tau }\) and the at-the-money implied volatility \(\sigma ^{ATM}_{\tau }\) for the observed option panel at a given maturity \(\tau\). Mathematically, this reads:

$$\begin{aligned} {\bar{m}}_{i,\tau }(m_{i,\tau }, \sigma ^{ATM}_\tau ):= \frac{\ln m_{i,\tau }}{\sqrt{\tau } \sigma ^{ATM}_\tau }. \end{aligned}$$
(3)

Initially, we employ Ulrich and Walther (2020)’s one-dimensional kernel smoother to compute \({\sigma _{i,\tau }}_{i \in [1,N\tau ]}\) with a target moneyness step size of 0.01. Next, we utilize linear regressions to estimate the missing implied volatility-normalized moneyness pairs for our target interval of \({{\bar{m}}}_{i,\tau } \in [-10,4]\) based on the three left and rightmost implied volatility-moneyness pairs. Lastly, we smooth the resulting volatility surface using the same one-dimensional kernel smoother. The one-dimensional kernel smoother takes the following form:

$$\begin{aligned} {\sigma }^{1d}_\tau ({m_j}) = \sum _{i=1}^{{N}_\tau }{\frac{k_{\tau }({m_j} -{m}_{i, \tau })}{\sum _{i=1}^{{N}_\tau }{k_{\tau }({m_j} - {m}_{i, \tau })}} \times {\sigma }_{i, \tau }} \end{aligned}$$
(4)

where \({\sigma }^{1d}_\tau ({m_j})\) is the smoothed \(\tau\)-period implied volatility for a target moneyness \({m_j}\), \({m}_{i,\tau }\) and \({\sigma }_{i,\tau }\) are observed (and linearly extrapolated) moneyness and option-implied volatility pairs, \({N}_\tau\) represents the total number of input implied volatilities and \(k_{\tau }(x)\) is an unnormalized Gaussian kernel function, defined as

$$\begin{aligned} k_{\tau }(x) = e^{-\left( \frac{x^2}{2*h_{\tau }}\right) }, x \in {\mathbb {R}}, \end{aligned}$$
(5)

where \(h_{\tau }\) is a maturity-specific bandwidth parameter, defined as

$$\begin{aligned} h_{\tau } = \left( 0.75 \frac{1}{(N_{\tau } - 1) S} \times (K^{max}_{\tau } - K^{min}_{\tau })\right) ^2, \end{aligned}$$
(6)

where S is the price of the underlying stock.

2.2 Three-dimensional kernel smoother

In estimating single firm volatility surfaces, OptionMetrics (2022) employs a three-dimensional kernel smoother which includes a maturity dimension \(\tau\), Put/Call flag \(I_j \in \{0,1\}\), and option delta \(\Delta _j\), with the moneyness dimension being replaced by \(\Delta _j\). OptionMetrics (2022) uses options with vegas of \(\nu \ge 0.5\) as input to the three-dimensional kernel smoother, producing an implied volatility surface with option deltas in the range of [0.1, 0.9]. This range of deltas has been extended compared to OptionMetrics (2016) to address the finding in Ulrich and Walther (2020) that a delta range of [0.2, 0.8] is insufficient for accurately capturing left tail risk in index options.

The three-dimensional kernel smoother takes the form

$$\begin{aligned} {\sigma }^{3d}(\Delta _j, \tau _j, I_j)= & {} \sum _{i=1}^{N} \frac{\nu _i k(\Delta _i - \Delta _j, \log (\tau _i)-\log (\tau _j), I_i-I_j)}{\sum _{i=1}^{N}\nu _i k(\Delta _i - \Delta _j, \log (\tau _i)-\log (\tau _j), I_i-I_j)} \nonumber \\{} & {} \times \sigma (\Delta _i, \tau _i, I_i) \end{aligned}$$
(7)

where \({\sigma }^{3d}\) is the implied volatility smoothed using the three-dimensional kernel smoother, \(\nu\) and \(\Delta\) represent the option’s vega and delta, respectively, and \(\sigma\) is the observed implied volatility with a specific delta, maturity, and call or put flag. The three-dimensional Gaussian kernel takes the form

$$\begin{aligned} k(x,y,z) = \frac{1}{\sqrt{2\pi }}e^{-\left( \frac{x^2}{2h_1} + \frac{y^2}{2h_2} + \frac{z^2}{2h_3}\right) } \end{aligned}$$
(8)

with the respective bandwidth parameters \(h_1:= 0.05, h_2:= 0.005, h_3:= 0.001\).

2.3 Spline

The final method we examine is the smoothing spline developed by Figlewski (2008). This semi-parametric approach smoothes the observed implied volatility-moneyness pairs using an nth-order polynomial with one knot point located at a moneyness of one.

To enhance flexibility, we adopt an adaptive approach by setting the order of the smoothing spline \(n_{\tau }\) to vary with \(N_{\tau }\). Specifically, we set the order to increase gradually as the number of observations increases, in order to mitigate overfitting. Concretely, we set the order as follows:

$$\begin{aligned} n_{\tau }={\left\{ \begin{array}{ll} 2, &{} {N_{\tau }< 8}\\ 3, &{} {N_{\tau } < 15}\\ 4, &{} {N_{\tau } \ge 15}. \end{array}\right. } \end{aligned}$$
(9)

The data pre-processing method employed is based on Figlewski (2008). Specifically, the spline is fitted to out-of-the-money call and put options, while smoothing is applied to at-the-money volatilities in order to mitigate noise. The smoothing is performed using the following method:

$$\begin{aligned} \sigma ^*_{\tau }(K_{i, \tau }):= w(K_{i,\tau }) \times \sigma _{\tau ,Put}(K_{i, \tau }) + (1-w(K_{i,\tau })) \times \sigma _{\tau ,call}(K_{i, \tau }) \end{aligned}$$
(10)

with

$$\begin{aligned} w(K_{i,\tau }):= \frac{K^{max}_{\tau }-K_{i,\tau }}{K^{max}_{\tau }-K^{min}_{\tau }}, \quad \forall K_{i,\tau } \in {[0.98F_{\tau }, 1.02F_{\tau }]} \end{aligned}$$
(11)

where \(\sigma ^*_{\tau }(K_{i, \tau })\) stands for the smoothed implied volatility of observed at-the-money options that is later used in the spline smoothing. We address the potential impact of the minimum amount of input data on the model’s performance by reporting results separately for days with a minimum number of inputs of 4 and 5, corresponding to \(N_\tau \ge 4\) and \(N_\tau \ge 5\), respectively.

The parameters of the spline are estimated by minimizing the least squares of the non in-the-money subset of observed implied volatilities \(\{\sigma _{i, \tau }\}_{i \in [1,N{\tau }]}\). As with the one-dimensional kernel, the spline is fitted for each maturity \(\tau\) separately. The resulting smoothed volatilities of the spline are denoted as \(\sigma ^{\text {spline}}_{i,\tau }\).

3 Validation

We employ the leave-one-out cross-validation technique to validate the accuracy of the three state-of-the-art implied volatility surfaces. This technique is a standard tool in machine learning for assessing the generalization performance of predictive models on unseen data (Geisser, 1993; Kohavi, 1995). Our cross-validation procedure involves sequentially leaving out one observation from a day’s option price observations and using the remaining data to predict its value, conditional on one of the three methodologies. The iterative nature of this procedure enables us to evaluate the forecast accuracy of each method when predicting the left-out price, based on the information contained in the other prices. We perform two prediction tasks: one that excludes the leftmost and rightmost points and another that includes all observations. This enables us to evaluate the sensitivity of each method to extrapolation-induced overfitting.

Consistent with the literature, we examine implied volatilities rather than prices to evaluate the predictive performance of the three state-of-the-art methods. For each method \(\text {M} \in \{\text {1d}, \text {3d},\text {spline}_{N\tau \ge 4}\}\) and observed option panel, we calculate the root-mean-squared error (RMSE) and the mean absolute error (MAE),

$$\begin{aligned} \text {RMSE}^{\text {M}}:= \sqrt{\frac{1}{N_\tau \times N_{{\mathcal {T}}}\times N_T \times 500} \sum _{j \in [1,500]} \sum _t \sum _{\tau \in {\mathcal {T}}} \sum _{i=1}^{N^j_\tau } \left( \sigma _{j,t,i,\tau } - {\sigma ^{\text {M}}_{j,t,i,\tau }} \right) ^2} \end{aligned}$$
(12)

and

$$\begin{aligned} \text {MAE}^{\text {M}}:= \frac{1}{N_\tau \times N_{{\mathcal {T}}}\times N_T \times 500} \sum _{j \in [1,500]} \sum _t \sum _{\tau \in {\mathcal {T}}} \sum _{i=1}^{N^j_\tau } |\sigma _{j,t,i,\tau } - {\sigma ^{\text {M}}_{j,t,i,\tau }} |, \end{aligned}$$
(13)

where in order to highlight that the error statistics are calculated across all time periods and all firms, we explicitly incorporate the time index t and a firm index \(j \in [1,500]\). Here, \(N_T\) represents the total number of time points. As we have 499 firms and the S&P 500, we consider \(j \in [1,500]\). To report the RMSE and MAE for different moneyness and maturity combinations, we adjust the formulas by considering the relevant subsets.

4 Data

In this section, we describe the data used in our analysis and explain how we calculate single firm dividend yields and the risk-free rate.

4.1 Option prices

In this section, we provide an overview of the option price data used in our analysis, consisting of 373,825,605 daily option prices for 499 S&P 500 constituents and 14,267,068 prices for options on the S&P 500 between January 2004 and July 2019, sourced from CBOE. For our analysis, we use daily quote data and consider the mid price. Table 1 summarizes the input data broken down into moneyness and maturity buckets. The data is categorized into short-, medium-, and long-term based on the remaining time to maturity: 1 to 29, 30 to 365 days, and longer, respectively. Moneyness is classified as at-the-money (ATM), left-tail, and right-tail if the moneyness is between 0.95 and 1.05, below 0.95, and above 1.05, respectively. On average, there were 31.96 strikes and 6.23 maturities traded per day and single firm, compared to 185.61 traded strikes and 19.53 maturities for a typical day in the index option market.

Table 1 Summary statistics of data for cross-validating volatility surface construction methods

The moneyness breakdown in Table 1 shows that the majority of the input data concentrates in the left tail, with approximately 58% (8,340,544 observations) for index options and 46% (172,918,617 observations) for single firm options. The disparity in the relative proportion of call options between the index and single firm option markets is significant, with right tail options representing only 17% (2,393,238 observations) of quoted S&P 500 options, but constituting 35% (132,102,454 observations) of quoted single firm options. At-the-money options constitute 25% (3,533,286 observations) of index options and 18% (68,804,534 observations) of single firm options.

The maturity breakdown in Table 1 shows that medium-term options make up the majority of quoted prices, comprising approximately 60% for both index and single firm options. However, there is a notable difference in the proportion of short-term options, with only 10% classified as short-term for index options, while for single stock options, this figure rises to 27%.

Our option panel data analysis reveals that for index options, the majority of quoted prices are medium to long-term at-the-money (ATM) and out-of-the-money (OTM) put options, while for single stock options, the largest share of quoted prices are short to medium-term OTM put and call options. We apply standard filters to remove duplicates and option prices that do not fulfill basic no-arbitrage restrictions and drop options for which implied volatility did not converge.

4.2 Risk-free rate, dividend yields and implied volatility

In this section, we provide a description of how we calculate the risk-free rate, dividend yields, and implied volatilities. We follow the methodology suggested by van Binsbergen et al. (2021) to extract the risk-free rate term structure from index options. This approach utilizes the put-call parity relationship for index options with the same maturity and has become the standard in option pricing research. Similar approaches have also been employed by Ulrich et al. (2023), Golez and Jackwerth (2022), and OptionMetrics (2022) to compute the risk-free rate.

Calculating the dividend yield is a challenging task, and no one-size-fits-all solution exists. We choose to work with one at-the-money call and put price pair for each traded maturity to derive the dividend yield (Hull, 2018). Any measurement error in the dividend yield is equally reflected in all the methods and does not introduce any bias in our comparative analysis.

Lastly, we utilize the approach proposed by Barone-Adesi and Whaley (1987) to convert option prices with early exercise features into implied volatilities. Barone-Adesi and Whaley (1987) extends the Black-Scholes model for an early exercise premium and is known for its favorable trade-off of computation speed and accuracy. Similar to the dividend yield calculation, any measurement error in the inversion process is evenly spread across all methods.Footnote 5

5 Findings

In this section, we present the findings of our analysis. We begin by summarizing the cross-validation error statistics for different specifications, including one where we do not evaluate extrapolation to the left and right most observations. To better understand the cross-validation errors, we record the maximum, standard deviation, 95% quantiles, and the entire error distribution using boxplots. We disaggregate the results for different moneyness, maturity, and volume buckets.

To study the relationship between accuracy and trading volume, we introduce volume buckets. We collect all single stock option volumes for each day and calculate the 0.25 (\(q_{0.25}\)) and 0.75 (\(q_{0.75}\)) volume quantiles. For each day t and single firm i, we classify its daily volume into a low, medium, or high volume bucket, depending on whether the actual volume is (i) below \(q_{0.25}\), (ii) between \(q_{0.25}\) and \(q_{0.75}\), or (iii) larger than \(q_{0.75}\).

We then present the results for S&P 500 options and compare them to Ulrich and Walther (2020). This is followed by an analysis of single stock options.

5.1 S&P 500

In this section, we present our analysis of three methods used to construct implied volatility surfaces for S&P 500 index options. Our focus is on the short-, medium-, and long-term options, and we report our results using leave-one-out cross-validation root-mean-squared errors (RMSE) and mean absolute errors (MAE) in panel (a) of Table 2.

Our findings indicate that the one-dimensional kernel smoother is the only method that produces acceptable, small prediction errors across all moneyness and maturity buckets for S&P 500 index options. Specifically, its short-term RMSE for left, center, and right tail options are 0.0090, 0.0029, and 0.0075, respectively. In contrast, the three-dimensional kernel smoother of OptionMetrics (2022) and the spline exhibit strong evidence of overfitting and noise in their smoothed volatility surfaces. Specifically, their respective RMSEs for the same maturity and moneyness buckets are 0.5566, 0.1181, 0.2690, and 3.7805, 0.0489, and 0.4600.

Moreover, we find that the three-dimensional kernel smoother and the spline exhibit a concerning level of noise in their smoothed implied volatilities for short-term options. For instance, for the class of short-term out-of-the-money put options, both methods produce noise in implied volatility forecasts that is 61.84 and 420 times larger than prediction noise arising from the one-dimensional kernel smoother.

Our results hold true for medium and long-term options as well, as indicated by the RMSE of predicted versus realized implied volatility. Specifically, the three-dimensional kernel smoother of OptionMetrics (2022) and the semi-parametric spline exhibit RMSEs that are approximately ten times larger than those of the one-dimensional kernel smoother. Our findings are consistent with previous research by Ulrich and Walther (2020) on the RMSEs of index options in terms of the magnitude of errors and the relative ranking of RMSEs.

Panel (b) of Table 2 confirms that our performance measures are not affected by extrapolation into the left and right tails of the observed implied volatility distribution. The results from both the RMSE and MAE in panels (a) and (b) of Table 2 confirm the superiority of the one-dimensional kernel smoother, which exhibits the smallest prediction errors for S&P 500 index options.

Boxplots for the RMSE of all methods and moneyness/maturity buckets are displayed in Fig. 1, with the results indicating that at-the-money options have the lowest RMSE across all methods. However, the semi-parametric spline exhibits severe difficulties in accurately predicting implied volatility for short-term left tail options.

Fig. 1
figure 1

Boxplots for cross-validation squared prediction errors for S&P 500 options. This figure displays Boxplots of the cross-validation squared errors with inter- and extrapolation for S&P 500 options. The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days. The moneyness buckets, color-coded in the plot, are represented by \(m=\frac{K}{F}\), and divides the volatility smile into left, center, and right sections, with the dividing lines at 0.95 and 1.05. The volume buckets 0.25, 0.5, 0.75 stand for low, medium and high volume trading days. The Spline plot reports the results for spline with \(N_{\tau }\ge 4\) data points per fitted line. The data covers the period from Jan 2004 to Jul 2019

In conclusion, we strongly advise against using the semi-parametric spline and the three-dimensional kernel smoother of OptionMetrics (2022) for constructing implied volatility surfaces of index options. The one-dimensional kernel smoother emerges as the superior choice, producing negligible noise that falls comfortably within typical bid-ask price bounds.

In the following analysis, we turn our attention to assessing the efficacy of the examined methods in predicting implied volatility for stock options.

Table 2 Cross-validation prediction errors for implied volatilities of S&P 500 options

5.2 Single firms

In this section, we focus on evaluating the performance of each method in predicting implied volatility for single stock options. To our knowledge, there is a gap in the literature in terms of examining the quality of implied volatility surfaces for individual stock options. Therefore, we break down the RMSE results for single stock options into three-dimensional volume/maturity/moneyness buckets to further assess the performance of each method for high and low volume option contracts.

Panel (a) of Table 3 shows the average RMSEs and MAEs for single stock options. The results indicate that even the best-performing method incurs twice as high RMSEs when applied to single stock options on low volume days. For short maturity options in high volume days, the average RMSE for left, center, and right-tail options is 0.0329, 0.0152, and 0.0350, respectively. These respective RMSEs increase to 0.0612, 0.0380, and 0.0879 when looking at low volume single stock options while keeping all else equal. Furthermore, the RMSE of even the highest volume bucket is roughly four times larger than the overall RMSE of index options, indicating that illiquidity is a significant concern for single stock options and introduces significant noise into their implied volatility surfaces.

Table 3 Cross-validation prediction errors for implied volatilities of single stock options

We then compare how different methods perform in coping with the noise induced by illiquidity in single stock implied volatilities. The semi-parametric spline exhibits catastrophic RMSEs for short-dated options, with RMSEs for high volume left-tail, center, and right-tail options being 0.1787, 15.4456, and 36.8436, respectively. These RMSEs are five, one thousand, and one thousand times higher than those of the best-performing one-dimensional kernel method. The OptionMetrics (2022) three-dimensional kernel smoother performs better than the semi-parametric spline, but its RMSE for high volume short-dated options (left, center, right) is still 13, 20, and 12 times higher than those of the one-dimensional kernel smoother. This result highlights that building option-implied volatility surfaces using the three-dimensional kernel smoother cannot be recommended.

Figure 2 indicates that the squared prediction error diminishes as the volume of traded options increases. In particular, the one-dimensional kernel smoother exhibits an exponential decrease in error as volume increases, suggesting that the benefits of additional information from option trading volumes are large. In contrast, the semi-parametric spline performs poorly for low volume short-maturity left-tail single stock options, indicating that the spline is less effective in incorporating information on low volume days into the volatility surface. The three-dimensional kernel smoother aligns with the trend of the one-dimensional smoother, but its forecast errors are significantly larger.

Fig. 2
figure 2

Boxplots for cross-validation squared prediction errors for single stock options. This figure displays Boxplots of the average cross-validation squared errors with inter- and extrapolation across single stock options. The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days. The moneyness buckets, color-coded in the plot, are represented by \(m=\frac{K}{F}\), and divides the volatility smile into left, center, and right sections, with the dividing lines at 0.95 and 1.05. The volume buckets 0.25, 0.5, 0.75 stand for low, medium and high volume trading days. The Spline plot reports the results for spline with \(N_{\tau }\ge 4\) data points per fitted line. The data covers the period from Jan 2004 to Jul 2019

In summary, we find that the one-dimensional kernel smoother outperforms the other two methods examined in this study, across all different volume/maturity/moneyness buckets, in terms of accuracy in computing implied volatility surfaces for both index and single stock options. Our recommendation is to use the one-dimensional kernel smoother in future research that requires low RMSE volatility surfaces. Our findings support the notion that simplicity often pays off in terms of prediction accuracy and generalizability to unseen data.

5.3 Industry clusters

In a final step, we group single stocks into industries based on the Global Industry Classification Standard (GICS), which includes eleven sectors: Energy, Materials, Industrials, Consumer Discretionary, Consumer Staples, Health Care, Financials, Information Technology, Communication Services, Utilities, and Real Estate. To explore the industry-specific effects on our findings, we examine the Boxplots for the one-dimensional (Fig. 3), three-dimensional (Fig. 4), and spline methodologies (Fig. 5) across the 11 industries. The results in Fig. 3 reveal that the one-dimensional kernel smoother’s performance remains unaltered by industry, exemplifying its robustness across various sectors. Conversely, Fig. 4 highlights that the three-dimensional kernel smoother persistently generates larger errors in comparison to the one-dimensional smoother throughout all 11 industries, further supporting our conclusion that the one-dimensional kernel smoother is the superior method for computing volatility surfaces.

Fig. 3
figure 3figure 3

Boxplots for cross-validation squared prediction errors for single stock options, one-dimensional kernel smoother, by industry sector. This figure displays Boxplots of the average cross-validation squared errors with inter- and extrapolation across single stock options by GICS industry sector for the one-dimensional kernel smoother method. The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days. The moneyness buckets, color-coded in the plot, are represented by \(m=\frac{K}{F}\), and divides the volatility smile into left, center, and right sections, with the dividing lines at 0.95 and 1.05. The volume buckets 0.25, 0.5, 0.75 stand for low, medium and high volume trading days. The data covers the period from Jan 2004 to Jul 2019

Fig. 4
figure 4figure 4

Boxplots for cross-validation squared prediction errors for single stock options, three-dimensional kernel regression, by industry sector. This figure displays Boxplots of the average cross-validation squared errors with inter- and extrapolation across single stock options by GICS industry sector for the three-dimensional kernel smoother method. The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days. The moneyness buckets, color-coded in the plot, are represented by \(m=\frac{K}{F}\), and divides the volatility smile into left, center, and right sections, with the dividing lines at 0.95 and 1.05. The volume buckets 0.25, 0.5, 0.75 stand for low, medium and high volume trading days. The data covers the period from Jan 2004 to Jul 2019

We also observe in Fig. 5 that the spline method produces varying results for different industries. Some industries exhibit small errors similar to the one-dimensional kernel smoother, while others have significantly larger errors. These industry-specific differences suggest that the semi-parametric spline method may not be reliable for single stock options analysis, particularly in certain industries. As such, we advise caution when using the spline method and recommend that researchers benchmark their results against those obtained using the one-dimensional kernel smoother.

Fig. 5
figure 5figure 5

Boxplots for cross-validation squared prediction errors for single stock options, \(\text {Spline}_{N_\tau \ge 4}\), by industry sector. This figure displays Boxplots of the average cross-validation squared errors with inter- and extrapolation across single stock options by GICS industry sector for the spline method. The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days. The moneyness buckets, color-coded in the plot, are represented by \(m=\frac{K}{F}\), and divides the volatility smile into left, center, and right sections, with the dividing lines at 0.95 and 1.05. The volume buckets 0.25, 0.5, 0.75 stand for low, medium and high volume trading days. The plot reports the results for spline with \(N_{\tau }\ge 4\) data points per fitted line. The data covers the period from Jan 2004 to Jul 2019

5.4 Lessons from the tails

To complete the analysis, we have a closer look at the (i) standard deviation of the prediction error, (ii) the 95% quantile of the absolute prediction error and (iii) the maximum absolute prediction error. These values are displayed in Table 4. As previously, we aggregate prediction errors across all 499 firms and display results for different volume/maturity/moneyness buckets.

Table 4 Tails of the cross-validation prediction error for single stock implied volatilities

The standard deviation of the prediction error, which can be viewed as the volatility of the distribution of prediction errors, is lower for the one-dimensional kernel smoother than for the three-dimensional kernel smoother or spline. In the high volume and short maturity options bucket, the prediction error volatility for the left-tail region is 0.0229, 0.3515, and 0.0999 for the one-dimensional kernel smoother, three-dimensional kernel smoother, and spline, respectively. The one-dimensional kernel smoother performs well regardless of which part of the implied volatility surface is predicted, while the three-dimensional kernel smoother and spline have difficulties predicting the tails. In summary, the realized standard deviation of single stock implied volatility prediction errors is excessive for all methods except the one-dimensional kernel smoother.

In terms of the 95% quantile of the absolute prediction error, the one-dimensional kernel smoother outperforms the other two methods, with respective values of 0.0370, 0.6297, and 0.2359 for high volume short maturity left-tail implied volatilities. This implies that the worst 5% of mispricing for a short maturity out-of-the-money put option is around 4% (implied volatility units) for the one-dimensional kernel smoother, while it is 63% for the three-dimensional kernel smoother of OptionMetrics (2022) and 24% for the spline. The three-dimensional kernel smoother performs significantly worse on days when trading reveals more information, while both the one-dimensional kernel smoother and spline tend to exhibit lower 95% error quantiles as volume increases.

The average maximum absolute prediction error across all single stock options shows that the one-dimensional kernel smoother outperforms the other two methods by several magnitudes. The maximum absolute implied volatility prediction error for the one-dimensional kernel smoother ranges from 0.7004 to 2.7663, while the respective values for the three-dimensional kernel smoother are 3.1467 to 52.1571 and 0.2746 to 109.5617 for the spline.

In summary, the one-dimensional kernel smoother is the most reliable method for constructing single stock implied volatility surfaces. The three-dimensional kernel smoother of OptionMetrics (2022) and the semi-parametric spline of Figlewski (2008) are both prone to prediction errors with large second and higher moments, suggesting that they may be too complex or unrealistic. We therefore recommend academics and practitioners to use the one-dimensional kernel smoother for constructing end-of-day volatility surfaces.

5.5 Further implications for single stock option-implied information

The extraction of meaningful economic insights from implied volatility surfaces can be compromised by the presence of uninformative noise. This is particularly relevant when evaluating the efficacy of option-implied risk measures. In this section, we investigate Bakshi et al. (2003)’s implied moments within the cross-section of implied volatility. Additionally, we extend our analysis to replicate the skewness spanning regressions presented in Bakshi et al. (2003) Table 6, utilizing an expanded dataset that includes a larger number of firms and a longer time span compared to Bakshi et al. (2003). Through these exercises, we aim to demonstrate the potential effects that different methods for constructing volatility surfaces could have on the extraction of information implied by options.

5.5.1 Impact of method selection on Bakshi et al. (2003) moments

Our analysis, as visualized in Fig. 6, underscores the importance of the methodology adopted for constructing the volatility surface. Both the three-dimensional kernel smoother and the spline method inject substantial uninformative noise, resulting in significant distortions in the Bakshi et al. (2003) moments. These distortions are particularly acute for short-term options and in instances of extreme market situations.

Fig. 6
figure 6

Single stock (Bakshi et al., 2003) moments for different volatility surfaces. This figure displays Boxplots of model-free option-implied (Bakshi et al., 2003) moments for the one-dimensional kernel smoother method, the three-dimensional kernel smoother method and spline with \(N_{\tau }\ge 3\) data points per fitted line. The volume buckets stand for low, medium and high volume trading days. All options are of short maturity with less than 30 days to expiration. The data covers the period from Jan 2004 to Jul 2019

For short-term options across varying liquidity days, the distortions become even more pronounced for options-implied skewness and kurtosis, with the three-dimensional kernel smoother drastically underpredicting kurtosis and depicting skewness close to zero. This infers that the implied risk-neutral density for this method resembles a Gaussian distribution, a misleading interpretation when compared to the non-Gaussian return densities implied by the more accurate one-dimensional kernel smoother and, to some extent, the spline method.

This discrepancy raises concerns about the efficacy of the three-dimensional kernel smoother in accurately reflecting the properties of the implied volatility surface. This misrepresentation could lead to significant misconceptions about risk, thus affecting derivative pricing and trading decisions. Our results strongly suggest that, in the context of the Bakshi et al. (2003) moments, the spline method and the one-dimensional kernel smoother provide more reliable insights compared to the three-dimensional kernel smoother, further emphasizing the need for accurate methodology selection in volatility surface construction.

5.5.2 Noise in skewness spanning

Our skewness spanning regressions (depicted in Fig. 7) corroborate the findings of Bakshi et al. (2003) Table 6: the skewness of the index is more pronounced than that of single stocks. Nonetheless, our results also highlight that the choice of method to construct the volatility surface dramatically affects these observations.

Fig. 7
figure 7

Statistics of skewness spanning regressions for one-dimensional kernel smoother. This figure displays Histogram plots for statistics of skewness spanning regressions, given by \(SKEW_n(t) = \beta _0 + \beta _1 \times SKEW_m(t) + \epsilon _n(t)\), where \(SKEW_n\) and \(SKEW_m\) denote the risk-neutral skewness of a single stock n and the market, for the one-dimensional kernel smoother method. All options are of short maturity with less than 30 days to expiration. The data covers the period from Jan 2004 to Jul 2019

Employing the one-dimensional kernel smoother, we are able to accurately detect significant mapping coefficients, aligning with Bakshi et al. (2003)’s original results. However, when the three-dimensional kernel smoother is applied, the added noise drastically diminishes the spanning \(R^2\) and spanning t-statistics, effectively rendering them close to zero across firms (see Fig. 8). Such a result conceals the inherent patterns within the data, thus impairing the accurate extraction of option-implied information.

More worryingly, this discrepancy underlines the risk of potential misinterpretations in derivative pricing and trading decisions, which could be led astray by the distortions introduced by the three-dimensional kernel smoother. In this context, our findings argue powerfully for the adoption of more accurate methodologies, like the one-dimensional kernel smoother, for robust volatility surface construction and reliable skewness spanning regressions.

Fig. 8
figure 8

Statistics of skewness spanning regressions by methodology. This figure displays Histogram plots for statistics of skewness spanning regressions, given by \(SKEW_n(t) = \beta _0 + \beta _1 \times SKEW_m(t) + \epsilon _n(t)\), where \(SKEW_n\) and \(SKEW_m\) denote the risk-neutral skewness of a single stock n and the market, for the one-dimensional kernel smoother method, the three-dimensional kernel smoother method and spline with \(N_{\tau }\ge 3\) data points per fitted line. All options are of short maturity with less than 30 days to expiration. The data covers the period from Jan 2004 to Jul 2019

6 Conclusion

Our research underscores the robust superiority of the one-dimensional kernel smoother in constructing accurate volatility surfaces, outperforming both the widely utilized three-dimensional kernel smoother of OptionMetrics (2022) and the semi-parametric spline of Figlewski (2008) across all tested error metrics. The difference becomes particularly stark when considering the worst 5% absolute implied volatility prediction error for short-maturity out-of-the-money put options: less than 4% for the one-dimensional kernel smoother, compared to 24% and 63% for the spline and three-dimensional kernel smoother, respectively.

This study also reveals the profound impact of method choice on the extraction of option-implied information. The use of the three-dimensional kernel smoother introduces significant distortions in Bakshi et al. (2003) moments, particularly for short-term options and during extreme volatility events. These distortions can potentially mislead derivative pricing and trading decisions, underscoring the urgency of adopting a more accurate and reliable method for constructing daily implied volatility surfaces across all moneyness and maturity ranges.

Further, by extending our investigation to single stock options and incorporating a considerably larger dataset than earlier studies, our research provides a comprehensive evaluation of methodologies for constructing accurate volatility surfaces in the era of abundant data. This contribution deepens our understanding of option-implied information, its extraction process, and the far-reaching implications for broader financial markets, offering valuable insights for practitioners and academics alike.