1 Introduction

Various types of wave energy converters (WECs) for extracting energy from the oceans have been proposed. Point absorbers are one type of WEC [1].

A power-take-off system (PTO) used in point absorbers is selected from a hydraulic system, a rotary generator, and a linear generator, etc [2]. Point absorbers with a linear generator can directly convert the heave motion into electric power. The direct-drive system of the generator simplifies the mechanical structure of the WEC compared to a WEC with other PTO systems [3].

The generator performance of the point absorber WEC in ocean waves, including many wave frequencies, can be poor because the frequency responses of the point absorbers usually tend to be narrow-banded [3]. In previous studies, several control schemes for improving the performance of the point absorber WECs have been proposed [4,5,6,7,8,9]. As one of the major conventional control schemes for the generator in WECs, the reactive control maximizes the power absorbed from waves [5, 8].

Most previous studies on reactive control focused only on the power absorbed from waves. In addition, they tended to neglect loss in the generator [6, 10, 11]. As for one simplified reactive control scheme, the loss in the generator is considered under the assumption that conversion efficiency is constant or depends on the direction of heave motion [12,13,14].

The conversion efficiency of the generator varies continuously depending on the forces generated by the reactive control. As one of the losses of the generator, the copper loss cannot be neglected [15]. As a theoretical solution accounting for copper loss in the case of a point absorber with a linear generator, the approximate complex-conjugate control considering the copper loss (ACL) was proposed [16]. The ACL maximizes the total electric output power of the generator, namely, a balance between extracted power and copper loss.

Numerical simulations and tank tests confirmed that the ACL performs well in the case of regular waves [16, 17]. Although the ACL is the optimal control scheme for maximizing electric output energy in regular waves, it is unclear whether the ACL can maximize electric output energy in the case of irregular waves [16]. Additionally, the ACL does not consider constraints, such as physical limitations on the motion of the movable floater of the WEC. To consider the physical limitation, a previous study proposed a method in which new control parameters are calculated if the floater motion is expected to reach the limitation [18]. The method requires predictions of the wave excitation forces and floater motion.

The model predictive control (MPC), which is an expected model-based control scheme that considers the copper loss and the constraint on floater motion, has been developed [19, 20]. The MPC optimizes the future behavior of the WEC motion while computing a sequence of future control actions. The MPC can maximize the electric output energy while considering the floater motion constraint. The performance of the MPC depends on forecasting wave excitation forces and modeling the dynamics of a WEC [8].

Several techniques for forecasting the wave excitation forces have been studied [21,22,23,24,25,26]. Improving the accuracy and forecasting horizon and revealing the general characteristics of the prediction error, such as the statistical distribution, are currently being studied. However, it is difficult to accurately forecast wave excitation forces.

Modeling errors are derived from un-modeled dynamics, such as the mechanical friction and hydrodynamic non-linearity [27]. They can induce poor control performance [28]. A dynamic model was identified before the MPC was applied to the WEC in a tank test to avoid problems, such as reduction of the output energy and control errors due to modeling errors [20]. On the other hand, data collection for modeling WEC motion is difficult in actual ocean waves.

Various methods for modeling the WEC motion were developed in previous studies and their advantages and drawbacks were summarized [29, 30]. The computational fluid dynamics method can calculate the hydrodynamic forces of WEC considering the viscous and non-linear effects. However, it requires vast computational time. The linear wave potential theory can calculate the hydrodynamic forces of WECs with short computational times but cannot consider viscous and non-linear effects. The partially non-linear model is developed to resolve the drawback of the linear wave potential theory, considering some non-linear effects by modifying the linear model. An appropriate model should be chosen depending on the intended use and accuracy of modeling.

In the model-based control, the accuracy of the model and the easiness of the optimization calculation are required to avoid complicated optimization calculation. The partially non-linear model may be suitable when floaters of point absorbers are spheres. However, the linear model based on linear wave potential flow theory is sufficient for point absorbers whose water line areas are uniform such as cylinders [30]. The most-used models in point absorbers are the linear model based on linear wave potential flow theory because of the ease of control calculation and sufficient accuracy [5, 9, 16, 31].

A model-free control scheme is a solution to avoid the aforementioned problems (i.e., forecasting wave excitation forces and modeling errors). The model-free control schemes are categorized by learning methods, such as reinforcement learning. In the model-free control schemes, the model-free reactive controls using reinforcement learning provided good results presented in different studies. However, these previous research have a few issues [12, 32,33,34,35]. The model-free reactive controls based on reinforcement learning require a large number of training data and times [34, 35]. The control scheme for suppressing displacement is applied instead of the control scheme for maximizing power generation when WEC motion violates the displacement constraint [35]. Other reactive controls using reinforcement learning for considering displacement constraints have been proposed [32, 33]. However, these reactive controls have not been evaluated under wave conditions that may reach the displacement constraint, and their effectiveness has not been verified.

Another model-free control scheme is called the model-free reactive control using the Bayesian optimization [36]. The Bayesian optimization requires less learning data compared to reinforcement learning. However, the proposed control did not consider the copper loss and the displacement constraint on the movable floater. Additionally, in the case of irregular waves, it is questionable that the control parameters optimized by the Bayesian optimization are equivalent to the optimized control parameters.

Model-based and model-free reactive controls that can consider the copper loss and displacement constraints without forecasting the wave excitation forces can be an effective control strategy for wave energy converters. However, the aforementioned model-based and model-free controls each have advantages and disadvantages. Thus, it is desirable to switch the model-based and model-free controls based on various situations.

The final goal of the study is to propose a control scheme that combines—in a compatible manner—the model-based and model-free reactive controls of a point absorber. In this study, two equivalent model-based and model-free reactive controls that can consider the copper loss and displacement constraints without forecasting the wave excitation forces are proposed.

The proposed model-based reactive control can maximize the electric output energy by considering the copper loss and a movable-floater-displacement constraint using a power spectrum instead of forecasting wave excitation forces. Similarly, the model-free reactive control aims to maximize the electric output energy by considering the movable-floater-displacement constraint without predicting the wave excitation forces using the Bayesian optimization. The proposed model-free reactive control through the Bayesian optimization can require only a few learning data compared to the previous model-free reactive control [35].

The proposed model-based reactive control and model-free reactive control are compatible control schemes if the optimization result of the proposed model-free reactive control is equivalent to the result of the model-based control. The development of the two equivalent control schemes will lead to the proposal of a combined model-based and model-free reactive control.

This study is organized into four parts. First, the dynamic equation and power electronics of WECs are introduced. Second, a model-based reactive control using a power spectrum is described. The time-averaged electric power, including the copper loss, and the maximum amplitude of the movable-floater displacement can be estimated from a power-spectral-density function of the displacement. Control parameters used in the model-based reactive control can be optimized by combining the time-averaged electric power and maximum amplitude estimations. The proposed control does not require forecasting of the wave exciting forces and WEC motion. Third, the model-free reactive control based on the Bayesian optimization is described. The model-free control searches for the optimal control parameters using prior data. It can maximize the electric output energy, including the copper loss, while considering the constraint without establishing a dynamic model of the WEC or forecasting wave excitation forces. Fourth, the performances of the two proposed reactive control are compared with that of the ACL by numerically simulating the WEC motion in the time domain. The results of the simulations show that the proposed model-free and model-based reactive controls are compatible.

2 WEC control system

2.1 Dynamic equation and power electronics

Fig. 1
figure 1

Schematic view of direct-drive point absorber

Figure 1 shows the elements of the subjected point absorber. The oscillating system is composed of a movable floater and a PTO system (a linear generator in this study).

Given the linear wave theory and the assumptions that the movable floater of the point absorber can move only in the heave motion, the dynamic equation in time domain can be described by the equation of freedom as follows [4]:

$$\begin{aligned}&(m+m_\infty )\ddot{x}+\int ^t_0{b(t-\tau )}\dot{x}(\tau )\mathrm{d}\tau +cx=f_\mathrm{pto}+f_\mathrm{ext}, \end{aligned}$$
(1)
$$\begin{aligned}&b(\tau )=\frac{2}{\pi }\int ^\infty _0 {B(\omega )\cos {\omega \tau }\mathrm{d}\omega }, \end{aligned}$$
(2)

where \(b(\tau )\) is a function considering the memory effect of a hydrodynamic force acting the movable floater.

The control force based on the reactive control is expressed as

$$\begin{aligned} f_\mathrm{pto}(t)=-K_1\dot{x}(t)-K_2x(t). \end{aligned}$$
(3)

The first and second terms on the right side of Eq. (3) correspond to damping and restoring forces, respectively. The current x(t) and \(\dot{x}(t)\) can be observed by a position encoder installed in the generator.

Instantaneous power transferred from the movable floater to the PTO system is given by

$$\begin{aligned} p_\mathrm{m}(t)=-f_\mathrm{pto}(t)\dot{x}(t). \end{aligned}$$
(4)

As shown in Fig. 1, the target WEC system consists of a linear generator and two power converters, one on the generator side and the other on the grid side. The linear generator is a direct-drive system, indicating that it is rigidly coupled to the movable floater and can move in unison with the movable floater. This study aims to maximize the electric power obtained in the generator-side power converter. To achieve this, the internal loss in the generator must be subtracted from Eq. 4.

Generating the control force induces electric power loss in the generator. The electric power loss can be mainly divided into copper loss and iron loss. The iron loss, which is proportional to frequency, is negligible compared to the copper loss because the WEC mainly operates with a wave period of more than 6.0 s. In the previous study, the iron loss was also neglected [15]. Although the iron loss is negligible, the copper loss increases as the control force increases. The optimal control of the WEC necessitates the consideration of the output power, including the copper loss [16].

The generator-side power converter shown in Fig. 1 operates to reduce the stator loss and adjusts the target control force to the value set by the control scheme. The d-axis component current is normally set to zero to reduce the loss in the stator. Thus, the control force can be operated by only the q-axis component current [16]. Under the assumption that the ideal current vector control is achieved in the generator-side power converter, the copper loss in the generator can be calculated as follows [16]:

$$\begin{aligned} p_\mathrm{c}(t)= \frac{R_\mathrm{s}}{K_\mathrm{t}^2}f_\mathrm{pto}^2(t)=\delta f_\mathrm{pto}^2(t). \end{aligned}$$
(5)

Finally, the electric output power of the generator (including the copper loss) can be expressed as

$$\begin{aligned} p_\mathrm{e}(t)= p_\mathrm{m}(t)-p_\mathrm{c}(t)=-f_\mathrm{pto}(t)\dot{x}(t)-\delta f_\mathrm{pto}^2(t). \end{aligned}$$
(6)

The optimal control problem is to maximize the average power output of a WEC with a linear generator in a certain period defined as

$$\begin{aligned} \begin{aligned} \overline{p_\mathrm{e}}&=\frac{1}{T_\mathrm{h}}\int ^{t_0+T_\mathrm{h}}_{t_0}\left( -f_\mathrm{pto}(t)\dot{x}(t)-\delta f_\mathrm{pto}^2(t)\right) \mathrm{d}t\\&=\frac{1}{T_\mathrm{h}}\int ^{t_0+T_\mathrm{h}}_{t_0}\bigg [K_1\dot{x}^2+K_2\dot{x}x-\delta \bigg (K_1^2\dot{x}^2+2K_1 K_2\dot{x}x\\&\quad +K_2^2 x^2 \bigg )\bigg ]\mathrm{d}t. \end{aligned} \end{aligned}$$
(7)

Additionally, the floater moves under a displacement limit on the heave. The optimization problem is expressed as

$$\begin{aligned} \begin{aligned}&{\text {maximize}}&\overline{p_\mathrm{e}} \\&\text {subject to}&\max (|x(t)|) < x_{\mathrm{lim}}. \end{aligned} \end{aligned}$$
(8)

It is necessary to determine the optimal \(K_1\) and \(K_2\) to maximize the time-averaged electric output power.

2.2 Model-based reactive control using the power spectrum

Under the assumption that the heave velocity and displacement of the movable floater are ergodic, the limits of integration in Eq. (7) can be extended to infinity. Consequently, the averaged electric power can be approximated as

$$\begin{aligned}&\overline{p_\mathrm{e}} \approx \lim _{T_\mathrm{h}\rightarrow \infty }\frac{1}{T_\mathrm{h}}\int ^{T_\mathrm{h}/2}_{-T_\mathrm{h}/2}\bigg [K_1\dot{x}^2+K_2\dot{x}x-\delta \bigg (K_1^2\dot{x}^2+2K_1 K_2\dot{x}x\nonumber \\&\qquad +K_2^2 x^2 \bigg )\bigg ]\mathrm{d}t. \end{aligned}$$
(9)

Autocorrelation functions of the displacement and the velocity and the cross-correlation function between the displacement and the velocity are defined as

$$\begin{aligned} R_{xx}(\tau )= & {} \lim _{T_\mathrm{h}\rightarrow \infty } \frac{1}{T_\mathrm{h}}\int ^{T_\mathrm{h}/2}_{-T_\mathrm{h}/2}x(t)x(t+\tau )\mathrm{d}t, \end{aligned}$$
(10)
$$\begin{aligned} R_{vv}(\tau )= & {} \lim _{T_\mathrm{h}\rightarrow \infty } \frac{1}{T_\mathrm{h}}\int ^{T_\mathrm{h}/2}_{-T_\mathrm{h}/2}\dot{x}(t)\dot{x}(t+\tau )\mathrm{d}t, \end{aligned}$$
(11)
$$\begin{aligned} R_{xv}(\tau )= & {} \lim _{T_\mathrm{h}\rightarrow \infty } \frac{1}{T_\mathrm{h}}\int ^{T_\mathrm{h}/2}_{-T_\mathrm{h}/2}x(t)\dot{x}(t+\tau )\mathrm{d}t. \end{aligned}$$
(12)

By substituting Eqs. 10)–(12 into Eq. 9, we can write the average electric power as the following expression defined as an objective function.

$$\begin{aligned} J&:= K_1R_{vv}(0)-\delta \left[ K_1^2R_{vv}(0)+K_2^2R_{xx}(0)\right] \nonumber \\&\quad +\left( K_2-2\delta K_1K_2 \right) R_{xv}(0). \end{aligned}$$
(13)

Equation 12 equals the function obtained by differentiating Eq. 10 with respect to \(\tau \). \(R_{xv}(\tau )\) is an odd function because \(R_{xx}(\tau )\) is an even function. Consequently,

$$\begin{aligned} R_{xv}(0)=0. \end{aligned}$$
(14)

The n-th-order moment of \(\varPhi _{xx}(\omega )\) and the power-spectral-density (PSD) function composed of the displacement of the movable floater are defined according to the Wiener–Khinchin theorem as

$$\begin{aligned} m_n= & {} \int _0^\infty \omega ^n\varPhi _{xx}(\omega )\mathrm{d}\omega , \end{aligned}$$
(15)
$$\begin{aligned} m_0= & {} \int _0^\infty \omega ^0\varPhi _{xx}(\omega )\mathrm{d}\omega =R_{xx}(0), \end{aligned}$$
(16)
$$\begin{aligned} m_2= & {} \int _0^\infty \omega ^2\varPhi _{xx}(\omega )\mathrm{d}\omega =R_{vv}(0). \end{aligned}$$
(17)

Therefore, the value of the objective function can be calculated from the following equation:

$$\begin{aligned} J = K_1m_2-\delta \left( K_1^2m_2+K_2^2m_0 \right) . \end{aligned}$$
(18)

If the copper loss is neglected (i.e., \(\delta =0\)), Eq. 18 is equivalent to that previously reported [13, 37].

Note that \(\varPhi _{xx}(\omega )\) is yielded from Eqs. 1, 2 in the frequency domain. The following formula is obtained by applying the Fourier transformation to Eqs. 1, 2 [4].

$$\begin{aligned} \left[ -\omega ^2(m+A(\omega ))+j \omega B(\omega )+c \right] X(\omega )=F_\mathrm{ext}(\omega )+F_\mathrm{pto}(\omega ), \end{aligned}$$
(19)

where the capital symbols in Eq. 19 represent the results of the Fourier transformation.

The wave excitation force and control force in the frequency domain can be expressed as

$$\begin{aligned} F_\mathrm{ext}(\omega )= {} E(\omega )\zeta _a (\omega ), \end{aligned}$$
(20)
$$\begin{aligned} F_\mathrm{pto}(\omega )= & {} -j\omega K_1X(\omega )-K_2X(\omega ). \end{aligned}$$
(21)

Substituting Eqs. 20 and 21 into Eq. 19 gives

$$\begin{aligned} \left[ -\omega ^2(m+A(\omega ))+j \omega (B(\omega )+K_1)+(c+K_2) \right] X(\omega )=E(\omega )\zeta _a (\omega ). \end{aligned}$$
(22)

The frequency response function of the displacement can be defined below when \(\zeta _a(\omega )\) is regarded as the input of the control system.

$$\begin{aligned} H(\omega ,K_1,K_2)=\frac{E(\omega )}{\left[ -\omega ^2(m+A(\omega ))+j \omega (B(\omega )+K_1)+(c+K_2)\right] }. \end{aligned}$$
(23)

The frequency response function depends on \(A(\omega )\), \(B(\omega )\), \(K_1\) and \(K_2\).

When the PSD function of the incident waves \(\varPhi _{ \zeta \zeta }(\omega )\) is given by \(\varPhi _{xx}(\omega )\) can be obtained from the Eq. 23 as

$$\begin{aligned} \varPhi _{xx}(\omega ) = \left| H(\omega ,K_1,K_2)\right| ^2 \varPhi _{ \zeta \zeta }(\omega ). \end{aligned}$$
(24)

Assuming that wave heights are characterized by the Rayleigh distribution, the probability distribution of the heave amplitude also becomes the Rayleigh distribution since the heave motion of the movable floater is a linear system. When \(m_0\) is given, the expected value of maximum heave amplitude in N-cycle waves, \(\left\langle |x_\mathrm{max}|\right\rangle _N\), can be estimated as follows [38]:

$$\begin{aligned} \left\langle |x_\mathrm{max}|\right\rangle _N :=\left( \sqrt{\ln {N}}+\frac{0.2886}{\sqrt{\ln {N}}} \right) \sqrt{2m_0}. \end{aligned}$$
(25)

The displacement constraint can be satisfied by making Eq. 25 less than or equal to the heave limitation. Finally, Eq. 8 can be written as the optimal control problem with the displacement constraint as

$$\begin{aligned} \begin{aligned}&\underset{K_1,K_2}{\text {maximize}}&K_1m_2-\delta \left( K_1^2m_2+K_2^2m_0 \right) \\&\text {subject to}&\left\langle |x_{\mathrm{max}}|\right\rangle _N < x_{\mathrm{lim}}. \end{aligned} \end{aligned}$$
(26)

The optimal problem can be numerically solved using a gradient method. In this study, a generalized reduced gradient [39] was used. N denotes the number of times the displacement constraint should be considered, and it was set to 2000. This setting value represents the expected maximal heave amplitude while the WEC is encountering 2000 waves. Regarding the consideration of variation in ocean states, since sea states last a minimum of 30 min and a maximum of 6–8 h [12], the wave spectral shape can be estimated based on measured wave elevations and the optimal K1 and K2 can be determined using model-based methods. In this study, the model-based reactive control using the power spectrum is named as the “PS optimization”.

2.3 Model-free reactive control using Bayesian optimization

Although the PS optimization requires frequency-response and PSD functions, these functions may be uncertain and difficult to estimate. If the dynamic model is unknown or includes modeling errors and uncertainties, then an optimization method for K1 and K2 is needed instead of the PS optimization.

The alternative optimization method adopted in this study is a machine-learning-based method, namely, Bayesian optimization. The Bayesian optimization makes it possible to maximize an unknown objective function using a Gaussian process model. In Bayesian optimization, a function that approximates the output is estimated in the form of a posterior probability distribution based on a Gaussian process when multiple input–output data are obtained. The posterior probability distribution is used to determine the next search input. The simulations and experiments are conducted based on the determined input. The results are added to the previous data, and a new posterior probability distribution is estimated. By repeating this process, the input that gives the optimal solution is estimated.

Several input data, \(\varvec{U}= \left( \varvec{u_1,u_2, \ldots ,u_i}\right) \), and output \(\varvec{Y}=\left( y_1,y_2, \ldots ,y_i \right) ^T\), are observed. Assuming that the input \(\varvec{U}\) and the output \(\varvec{Y}\) for \(\varvec{U}\) follow a Gaussian process, \(\varvec{U}\) and \(\varvec{Y}\) have the following relationship [40].

$$\begin{aligned} \varvec{Y} \sim \mathcal {N}(\varvec{m},{\varvec{K}}), \end{aligned}$$
(27)

where \(\varvec{m}\) is the mean vector and \(\varvec{K}\) is the kernel matrix. The components of the vector and matrix are calculated as

$$\begin{aligned} m_i= {} \mathsf {E}(\varvec{u}_i) \end{aligned}$$
(28)
$$\begin{aligned} {K_{ik}}= & {} k \left( \varvec{u}_i,\varvec{u}_{k}\right) +\varepsilon ^2\delta (i,k), \end{aligned}$$
(29)

where \(\mathsf {E}(\varvec{u}_i)\) is the mean function, \(k \left( \varvec{u}_i, \varvec{u}_{k}\right) \) is the kernel function, and \(\delta (i,k)\) is one if i and k are equal; otherwise, it is zero.

The joint distribution of observed data has been formulated. When the observation data (\(\varvec{U}\),\(\varvec{Y}\)) are given, unknown \(y^*\) is predicted for a new given input \(\varvec{u}^*\) from the posterior based on the Gaussian process. Given the new data, it still follows a Gaussian process. Therefore, \(\varvec{Y}\) with the new data, \(\varvec{Y}^*= \left( y_1,y_2, \ldots ,y_i, y^* \right) \), is similar to Eq. 27.

$$\begin{aligned} \varvec{Y}^* \sim \mathcal {N}\left( \varvec{m}^*,{\varvec{K}^*}\right) . \end{aligned}$$
(30)

The probability distribution can be expressed by dividing the matrix as follows:

$$\begin{aligned}&\begin{pmatrix} \varvec{Y} \\ y^* \\ \end{pmatrix} \sim \mathcal {N}\left( \begin{pmatrix} \varvec{m}\\ \mathsf {E}(\varvec{u}^*) \\ \end{pmatrix}, \begin{pmatrix} \varvec{K} &{} \varvec{k}_* \\ \varvec{k}_*^T &{} k(\varvec{u}^*, \varvec{u}^*) \\ \end{pmatrix} \right) , \end{aligned}$$
(31)
$$\begin{aligned}&\varvec{k}_*=\left[ k(\varvec{u}^*,\varvec{u}_1),k(\varvec{u}^*,\varvec{u}_2),\ldots ,k(\varvec{u}^*,\varvec{u}_i) \right] ^T. \end{aligned}$$
(32)

Equation 31 shows the joint distribution of \(y^*\) and \(\varvec{Y}\), and the conditional probability of \(y^*\) given \(\varvec{Y}\) using the formula of multivariate gauss distribution can be calculated as follows [40]:

$$\begin{aligned}&y^* \sim \mathcal {N}(\mu (\varvec{u}^*),\sigma ^2(\varvec{u}^*)), \end{aligned}$$
(33)
$$\begin{aligned}&\mu (\varvec{u}^*)= \mathsf {E}(\varvec{u}^*)+\varvec{k}_*^\mathrm{T}\varvec{K}^{-1}(\varvec{Y}-\varvec{m}), \end{aligned}$$
(34)
$$\begin{aligned}&\sigma ^2(\varvec{u}^*)= k(\varvec{u}^*, \varvec{u}^*)-\varvec{k}_*^\mathrm{T}\varvec{K}^{-1}\varvec{k}_*. \end{aligned}$$
(35)

In the Gaussian process model, the predictive distribution of the unknown values can be determined given the observed data and the kernel function. The kernel function selected for the Gaussian process model (i.e., a Matern 5/2 kernel [41]) is defined as

$$\begin{aligned} k(\varvec{u}_i, \varvec{u}_{k})= & {} \left( 1+\frac{\sqrt{5}r}{\theta _1}+\frac{5r^2}{3\theta _2^2} \right) \exp { \left( \frac{-\sqrt{5}r}{\theta _3} \right) }\nonumber \\ r= & {} \left|\varvec{u}_i-\varvec{u}_{k}\right|, \end{aligned}$$
(36)

where \(\theta _1\), \(\theta _2\), and \(\theta _3\) are the hyperparameters in the Gaussian process model.

A function is defined to guide the search point where the objective function is expected to give the optimal value based on expectation and variance estimated by the Gaussian process model. This function is called the acquisition function. The expected improvement, which is the most-used acquisition function [41] and was used in this study, is defined as

$$\begin{aligned}&\begin{aligned} a_{EI}(\varvec{u}^*)&=\left[ \mu (\varvec{u}^*)-\max ({\varvec{Y}}) \right] \varPhi \left( \frac{\mu (\varvec{u}^*)-\max ({\varvec{Y}})}{\sigma (\varvec{u}^*)}\right) \\&\quad +\sigma (\varvec{u}^*)\phi \left( \frac{\mu (\varvec{u}^*)-\max ({\varvec{Y}})}{\sigma (\varvec{u}^*)}\right) , \end{aligned} \end{aligned}$$
(37)
$$\begin{aligned}&\quad \varvec{u}_{i+1} = \underset{\varvec{u}^*}{\arg \max } \left[ a_{EI}(\varvec{u}^*)\right] . \end{aligned}$$
(38)

\(\mu (\varvec{u}^*)\) and \(\sigma (\varvec{u}^*)\) in Eq. 37 are calculated using Eqs. 34 and 35. The acquisition function evaluates the output, including the uncertainty of the unsearched region, and searches for the next point to search. The input that maximizes the acquisition function of Eq. 38 is the next search point.

The evaluation index of averaged power considering the displacement constraint is defined as

$$\begin{aligned} p_\mathrm{w}= & {} \frac{\rho g^2 H_\mathrm{s}^2T_\mathrm{e}}{64\pi }, \end{aligned}$$
(39)
$$\begin{aligned} c_\mathrm{w}= & {} \frac{\overline{p_\mathrm{e}}}{p_\mathrm{w}}, \end{aligned}$$
(40)
$$\begin{aligned} y= {} c_\mathrm{w} - \alpha \left[ \max (0,|x_\mathrm{max}|-x_\mathrm{lim}) \right] ^{\beta }, \end{aligned}$$
(41)

where Eqs. 39 and 40 denote the wave power and the capture width in the case of irregular waves [42], respectively. \(c_w\) and \(|x_\mathrm{max}|\) in Eq. 41 are observed during each training trial. The second term in Eq. 41 shows a penalty function that is needed to consider the constraint condition [43]. If the constraint condition is not satisfied, then the penalty function adds the cost to the evaluation index. \(\alpha \) and \(\beta \) are the turning parameters that control the cost of the penalty function. In this study, \(\alpha \) and \(\beta \) were set to 1.2 and 2.0, respectively. The values are not necessarily optimal. Note that the results of the Bayesian optimization depend on \(\alpha \) and \(\beta \).

First, \(K_1\) and \(K_2\) in Eq. (3) are randomly given, and the evaluation index corresponding to these parameters is observed to determine the Gaussian process model. The next search point is given by Eq. 37. The above operation was repeated 50 times. \(K_1\) and \(K_2\) that return the maximal y during 50 iterations are regarded as optimized values. The model-free reactive control using the Bayesian optimization is named as the “Bayesian optimization”.

3 Numerical simulation

3.1 WEC motion simulation model

The proposed and conventional control schemes in the case of irregular waves are compared in this section. It was assumed that the WEC has a cylinder floater. Table 1 lists the specifications of the assumed WEC. It should be noted that floating bodies, such as spherical floats, are not suitable for the proposed control schemes because the restoring force becomes non-linear due to the change in the cross-sectional area. The wave period with a high probability of occurrence in the seas around Japan is from 6.0 to 7.0 s. Considering the change in the natural period because of the effect of the generator, the size of the WEC was designed such that its natural period would be approximately 6.0 s. A one-twentieth-scale model was numerically simulated to validate measurements obtained by tank tests.

In the numerical simulation, the displacement constraint on the cylinder floater was set to 0.28 m, which is the target displacement. The floater can exceed the displacement constraint in the simulation. The control is available when the optimal control enables the floater to satisfy the displacement constraint.

Fig. 2
figure 2

Added mass of movable floater

Fig. 3
figure 3

Damping coefficients of movable floater

Fig. 4
figure 4

Wave excitation force coefficients of movable floater

The hydrodynamic coefficients in Eq. 23 were calculated using a commercial software package, WAMIT [44]. Figures 2, 3, and 4 show \(A(\omega )\), \(B(\omega )\), and \(E(\omega )\) of the movable floater.

The numerical simulation of the movable floater motion in the time domain was based on Eqs. 1, 2. The wave excitation force in the case of irregular waves, \(f_\mathrm{ext}(t)\), is given by

$$\begin{aligned} f_\mathrm{ext}(t)=\sum _{i=1}^{N_w}\sqrt{\varPhi _{\zeta \zeta }(\omega _i)\varDelta \omega _i}E(\omega _i)\cos (\omega _i t+\epsilon _i), \end{aligned}$$
(42)

where \(N_w\) was set to a thousand in the simulation. The JONSWAP spectrum was used for the PSD of irregular waves as follows [45]:

$$\begin{aligned} \varPhi _{ \zeta \zeta }(\omega )= & {} 2\pi \alpha _{j1}H_s^2\omega _p^{4}\omega ^{-5}\exp {\left[ -1.25 \left( \frac{\omega }{\omega _p}\right) ^{-4}\right] }\gamma ^{\alpha _{j2}}, \end{aligned}$$
(43)
$$\begin{aligned} \alpha _{j1}= & {} \frac{0.0624(1.094-0.01915\ln {\gamma })}{0.23+0.0336\gamma -0.185(1.9+\gamma )^{-1}}, \end{aligned}$$
(44)
$$\begin{aligned} \alpha _{j2}= & {} \exp {\left[ -\frac{1}{2\alpha _{j3}^2}\left( \frac{\omega }{\omega _p}-1\right) ^2\right] }, \end{aligned}$$
(45)
$$\begin{aligned} \alpha _{j3}= & {} {\left\{ \begin{array}{ll}0.07 &{} (\omega \le \omega _p)\\ 0.09 &{} (\omega > \omega _p) \end{array}\right. }. \end{aligned}$$
(46)

\(\gamma \) in Eq. 43 controls the sharpness of the spectral peak. It was set to 1.0 in the simulations.

A sufficient number of numerical simulations have been carried out in random seas for different peak periods between 1.0 and 3.0 s to simulate the movable floater motion under various conditions. The significant wave heights were 0.1, 0.3, and 0.5 m. The movable floater motion was repeatedly simulated for different phase-shift arrangements of irregular waves.

In addition to the PS optimization and the Bayesian optimization, the ACL was applied to compare the proposed and conventional control schemes. The ACL is suboptimal in irregular waves because it is the optimal control that maximizes the average power in the case of sinusoidal waves, but is compared as a reference value. The control force in the ACL is expressed in Eq. 3. \(R_\mathrm{m}(\omega )\), \(K_1\), and \(K_2\) in the ACL are given by

$$\begin{aligned} R_\mathrm{m}(\omega )= {} \omega \left( m+A(\omega )\right) -\frac{c}{\omega }, \end{aligned}$$
(47)
$$\begin{aligned} K_1= & {} \frac{B(\omega )+2\delta [B^2(\omega )+R_\mathrm{m}^2(\omega )]}{4\delta ^2 \left[ B^2(\omega )+R_\mathrm{m}^2(\omega )\right] +4\delta B(\omega )+1}, \end{aligned}$$
(48)
$$\begin{aligned} K_2= & {} \frac{\omega R_\mathrm{m}(\omega )}{4\delta ^2 \left[ B^2(\omega )+R_\mathrm{m}^2(\omega )\right] +4\delta B(\omega )+1}, \end{aligned}$$
(49)

where \(R_\mathrm{m}(\omega )\) is the mechanical reactance of the WEC.

Since \(K_1\) and \(K_2\) in the ACL require representative \(\omega \) in irregular waves, \(\omega \) was substituted into \(\omega \) derived from the wave peak period in irregular waves \(T_\mathrm{p}\).

Table 1 Principal particulars of wave energy converter

3.2 Results and discussion

Fig. 5
figure 5

Comparison of maximum amplitude with significant wave height of 0.1 m

Fig. 6
figure 6

Comparison of capture width with significant wave height of 0.1 m

Fig. 7
figure 7

Comparison of optimal control parameter of velocity with significant wave height of 0.1 m

Fig. 8
figure 8

Comparison of optimal control parameter of displacement with significant wave height of 0.1 m

Fig. 9
figure 9

Comparison of maximum amplitude with significant wave height of 0.3 m

Fig. 10
figure 10

Comparison of capture width with significant wave height of 0.3 m

Fig. 11
figure 11

Comparison of control parameter of velocity with significant wave height of 0.3 m

Fig. 12
figure 12

Comparison of control parameter of displacement with significant wave height of 0.3 m

Fig. 13
figure 13

Comparison of maximum amplitude with significant wave height of 0.5 m

Fig. 14
figure 14

Comparison of capture width with significant wave height of 0.5 m

Fig. 15
figure 15

Comparison of control parameter of velocity with significant wave height of 0.5 m

Fig. 16
figure 16

Comparison of control parameter of displacement with significant wave height of 0.5 m

Figures 5, 6, 7 and 8 show the maximum amplitude, capture width, and control parameters simulated with a significant wave height of 0.1 m. \(x_\mathrm{max}\) for each control scheme was smaller than the displacement constraint, and all control schemes can be considered acceptable under these sea conditions, as shown in Fig. 5. For all periods, \(c_\mathrm{w}\) for the PS optimization is larger than that for the ACL. In the range of \(T_\mathrm{p}\) = 1.2\(-\)2.6 s, \(c_w\) for the Bayesian optimization were comparable to that of the PS optimization. For \(T_p\) = 1.0 s, \(c_\mathrm{w}\) for the Bayesian optimization is smaller than that of the PS optimization, as shown in Fig. 6. \(K_1\) and \(K_2\) for the Bayesian optimization were comparable to those values of the PS optimization in the range of \(T_\mathrm{p}\) = 1.2\(-\)2.6 s, as shown in Figs. 7 and 8. Although \(K_1\) and \(K_2\) for the Bayesian optimization are close to those for the PS optimization in \(T_\mathrm{p}\)=1.0 s, \(c_\mathrm{w}\) for the Bayesian optimization is smaller than that for the PS optimization. This result indicates that the range of solutions for which \(K_1\) and \(K_2\) are optimal is narrow because of the high sensitivity of the \(K_1\) and \(K_2\) in \(T_p\)=1.0 s. Although more than a thousand trials are needed for learning in the previous model-free reactive control [35], K1 and K2 were optimized during the 50 trials in the proposed model-free reactive control.

Figures 9, 10, 11 and 12 show the maximum amplitude, capture width, and control parameters simulated with a significant wave height of 0.3 m. Although the maximum amplitude for the ACL exceeded the displacement constraint in the case of most wave periods, the maximum amplitude for the Bayesian optimization and the PS optimization in a few wave periods slightly exceed the constraint. In the PS optimization, K1 and K2 are optimized such that the statistical maximum displacement predicted by Eq. 25 satisfies the displacement constraint. In time-domain simulations, this constraint may not be satisfied when localized waves larger than the expected wave height are generated because of the superposition of wave components in irregular waves. This depends on the evaluation time and the initial phase of the wave components in irregular waves. In the Bayesian optimization, the constraint also may not be satisfied depending on the evaluation time and the initial phase of the wave components in irregular waves. The constraint can be strictly satisfied by introducing a safety factor in the maximum displacement estimation. However, it should be noted that the safety factor will result in a reduction in power generation performance.

The Bayesian optimization and the PS optimization can be considered appropriate for the significant wave height of 0.3 m, as shown in Fig. 9. \(c_\mathrm{w}\) with the PS and Bayesian optimizations decreased due to the displacement constraint (Figs. 6 and 10). In the unrestrained case, the power generation increases because the displacement of motion and velocity of the movable floater simultaneously increases as the wave height and wave energy increase. Under the displacement constraint, the power generated does not increase even as the wave height increases. As a result, the capture width decreases.

In contrast, \(c_\mathrm{w}\) for the ACL is the largest of the three control schemes, and it does not vary based on the significant wave height, as shown in Figs. 6 and 10. This is because the ACL is based on the principle of superposition and does not consider the displacement constraint \(K_1\) and \(K_2\) for the PS optimization and the Bayesian optimization vary with increasing significant wave height, as shown in Figs. 7 and 11. \(K_1\) and \(K_2\) for the Bayesian optimization are comparable to those of the PS optimization.

Figures 13, 14, 15 and 16 show the simulated results of the maximum amplitude, capture width, and control parameters with a significant wave height of 0.5 m. \(x_\mathrm{max}\) for the PS optimization is mostly lower than the constraint. In contrast, \(x_\mathrm{max}\) with the ACL and Bayesian optimization exceeds the constraint for several periods. \(c_\mathrm{w}\) for the Bayesian optimization is larger than that for the PS optimization because \(x_\mathrm{max}\) exceeds the constraint through the Bayesian optimization. \(K_1\) for the Bayesian optimization differs from that for the PS optimization. On the other hand, in the case of several periods, \(K_2\) with the Bayesian optimization is equivalent to that for the PS optimization. In the significant wave height of 0.5 m, the results with Bayesian optimization differ from those values for the PS optimization due to the difference of \(K_1\). Determining the appropriate \(\alpha \) and \(\beta \) in Eq. 41 can reduce the difference between the Bayesian optimization and the PS optimization.

The numerical simulation showed that both the PS optimization and the Bayesian optimization are superior to the ACL in irregular waves. Thus, the PS optimization is the available reactive control in various significant wave heights. Depending on the wave conditions, the Bayesian optimization may well be equivalent to PS optimization. As for the two proposed control schemes, the displacement constraint may not be satisfied because the maximum amplitude is estimated using a statistical method. Therefore, the two proposed control schemes require a safety factor in the maximum displacement estimation or a hard stop function for the heave motion of a WEC.

A representative hard stop function is a mechanical motion suppressor, such as a damper or spring attached at the end of the movable range of the WEC. As a negative effect, energy dissipation occurs because of the force generated by the mechanical damper, but the loss is considered to be small. Moreover, the motion is non-linear when it reaches the end of the range of motion. However, the influence on the calculation of K1 and K2 is considered to be small because the motion is linear in most of the range.

4 Concluding remarks

This study proposed the model-based and model-free reactive controls to maximize the electric power while considering the copper loss in a linear generator and a displacement constraint without forecasting wave excitation forces. We carried out numerical simulations in irregular waves to validate the performance of the model-based and model-free reactive controls.

The model-based reactive control using a power spectrum was formulated first. Time-averaged electric power was formulated by using the zero-order and second-order of the moment on the power-spectral-density function for the displacement of the movable floater. The expected values for maximum amplitudes of displacement can be calculated by the power-spectral-density function. The optimal control problem with the displacement constraint was transformed into the formula using the power-spectral-density function. The optimal control parameters can be obtained by solving the obtained formula numerically.

Subsequently, the model-free reactive control using the Bayesian optimization was developed. The Bayesian optimization makes it possible to optimize the control parameters when the dynamic model of WEC motion is unknown. The evaluation index of the Bayesian optimization was expressed as an optimization problem with the displacement constraint using a penalty function to consider the constraint condition. The optimal control parameters can be determined by maximizing this evaluation index through the Bayesian optimization.

Finally, the model-based and model-free reactive controls were compared with a conventional control scheme (i.e., ACL) by time-domain numerical simulations. Regarding the electric output power of the generator of the WEC, the highest averaged power was obtained by the model-based reactive control. The model-free reactive control achieved comparable performance to that of the model-based reactive control in the case of small and medium wave heights. Regarding the displacement constraint, the model-based reactive control was acceptable. The model-free reactive control was also acceptable in the case of small and medium wave heights. However, in the high wave height, the model-free reactive control did not satisfy the displacement constraint in several wave periods. Similarly, ACL did not satisfy the displacement constraint in the case of medium and large wave heights.

The two proposed controls are superior to the ACL in irregular waves. The number of training trials that the proposed model-free reactive control requires is fewer than that of the previous model-free reactive control using reinforcement learning. Additionally, the model-free reactive control achieves compatible performance with the model-based reactive control in the case of small and medium wave height. The model-based reactive control should be used when an accurate dynamic model of the WEC is given. On the contrary, when a dynamic model of the WEC is inaccurate, it is considered appropriate to switch from the model-based reactive control to the model-free reactive control.

In future research, we will apply model-based and model-free reactive controls under wave conditions with modeling errors and compare their performance. Furthermore, the model-free control will be trained while using the model-based control, and a method to smoothly switch between model-free and model-based controls will be investigated. A control scheme that combines model-based and model-free reactive controls is realized by these studies.