A Formally Motivated Retrieval Framework Applied to the High-resolution Transmission Spectrum of HD 189733 b

Doriann Blain; Alejandro Sánchez-López; Paul Mollière

doi:10.3847/1538-3881/ad2c8b

1. Introduction

The characterization of exoplanet atmospheres provides unique information about their atmospheric composition, dynamics, and aerosol presence, among other properties. Moreover, it may be possible to infer from the chemical composition of a planet how it formed and whether it experienced orbital migration (e.g., Öberg et al. 2011; Madhusudhan et al. 2014a; Mordasini et al. 2016; Mollière et al. 2022).

With a mass and radius comparable to that of Jupiter and an equilibrium temperature of 1209 ± 11 K (see Table 1), HD 189733 b falls into the "hot Jupiter" category (see Figure 1). These properties, coupled with its bright host star, give this planet one of the highest transmission spectroscopy metrics (Kempton et al. 2018), at ≈770. Consequently, HD 189733 b is one of the most studied exoplanets. Focusing solely on spectral features, previous studies over the past 10 yr have demonstrated the presence of H₂O in the atmosphere of HD 189733 b (Birkby et al. 2013; Pont et al. 2013; Danielski et al. 2014; Line et al. 2014; McCullough et al. 2014; Sing et al. 2016; Cabot et al. 2018; Tsiaras et al. 2018; Alonso-Floriano et al. 2019; Sánchez-López et al. 2019; Zhang et al. 2020; Boucher et al. 2021; Changeat et al. 2022; Klein et al. 2023; Finnerty et al. 2024). Detections of CH₄ (Line et al. 2014), CO (Line et al. 2014; Cabot et al. 2018; Brogi & Line 2019; Finnerty et al. 2024), ¹³CO (Finnerty et al. 2024), CO₂ (Line et al. 2014; Changeat et al. 2022), HCN (Cabot et al. 2018), and Na (Pont et al. 2013; Sing et al. 2016; Langeveld et al. 2022) were also reported. In addition, upper limits for FeH (Kesseli et al. 2020), K (Oshagh et al. 2020), and NH₃ (Finnerty et al. 2024) have been established. A spectral slope was also observed at optical wavelengths and attributed to hazes (Pont et al. 2013; Lee et al. 2014; Sing et al. 2016; Pino et al. 2018; Sánchez-López et al. 2020), but it could also be interpreted as unocculted starspots (McCullough et al. 2014; Oshagh et al. 2020). Hot Jupiters, HD 189733 b included, are likely tidally locked (Marcy et al. 1997). Three-dimensional atmospheric models suggest that under these conditions a superrotating equatorial jet forms. In the case of HD 189733 b, winds up to ≈5 km s⁻¹ are expected within this jet (Showman et al. 2012; Dobbs-Dixon & Agol 2013; Lines et al. 2018; Flowers et al. 2019).

**Figure 1.** Mass–radius distribution of the 5243 confirmed exoplanets to date (2023 February 28; Nasa Exoplanet Archive 2023). Not all the confirmed planets have a measured radius and/or mass; thus, the total count in the histograms is less than 5243. The scatter plot shows only the 512 planets for which the mass and radius are known within 15%.
Download figure:
Standard image High-resolution image

Table 1. Parameters of HD 189733 b and Its Star

Parameter	Value		References
Host Star
Spectral type	K2 V		1
M_* (kg)	1.630 ± 0.060 ×10³⁰	(0.828 M_⊙)	2
R_* (Gm)	0.543 ± 0.007	(0.780 R_⊙)	2
T_*,eff (K)	5052 ± 16		3
g_* (m·s⁻²)	${380}_{-1}^{+1}$	(4.49 [cm s⁻²])	3
[Fe/H]	–0.030 ± 0.080		4
t_* (Gyr)	${6.80}_{-4.40}^{+5.20}$		4
V_sys (km s⁻²)	${2.204}_{-0.011}^{+0.010}$		5
R.A./decl. (J2000, epoch 2015.5)	3001821223 227097759	(20:00:43.71 +22:42:35.19)	6
Planet
a_p (Gm)	4.67643 ± 0.00036	(0.03126 au)	2
e	${0.027}_{-0.018}^{+0.021}$		2
i_p (deg)	${85.71}_{-0.00}^{+0.00}$		3
M_p (kg)	2.14 ± 0.15 ×10²⁷	(1.13 M_J)	3
R_p (km)	80 800 ± 700	(1.13 R_J)	3
g_p (m·s⁻²)	21.9 ± 1.6	(3.34 [cm s⁻²])	Derived
T_eq (K)	1209 ± 11		5
T_p,int (K)	≈100		7 (model)
P (days)	2.2185748039122 ± 0.0000001765372		6
T₁₄ (s)	6463.08504 ± 7.89926		6
T₀ (BJD_TDB, days)	2458004.424877 ± 0.000145		Derived

References. (1) Paredes et al. 2021; (2) Rosenthal et al. 2021; (3) Stassun et al. 2017; (4) Bonomo et al. 2017; (5) Addison et al. 2019; (6) ExoFOP website 2022; (7) Rogers & Seager 2010. Some parameters were acquired from the Nasa Exoplanet Archive 2023.

Download table as: ASCII Typeset image

Transmission spectroscopy is a commonly used technique to study the atmosphere of transiting exoplanets. It has been successfully performed with both ground- and space-based instruments at low and high resolving power ( ${ \mathcal R }$ ), that is, at ${ \mathcal R }\approx {10}^{2}$ and ${ \mathcal R }\approx {10}^{5}$ , respectively (see, e.g., the abovementioned works). Spectra obtained from space-based instruments have the advantage of being exempt from telluric contamination, although currently their spectral resolution is limited ( ${ \mathcal R }$ ⪅ 3000; Hubble Space Telescope User Documentation 2022; James Webb Space Telescope User Documentation 2022). From this kind of low-resolution data, it is possible to identify broadband absorption features (e.g., molecular bands), but the core and wings of narrow spectral lines are not resolved. This can prevent the detection of minor species and can also lead to ambiguities when the bands of two different species overlap (e.g., Brogi et al. 2017; Blain et al. 2021; Bézard et al. 2022). Moreover, when hazes or clouds are present in the planet atmosphere, the molecular bands can be dampened or even completely masked, hindering detection (e.g., Barstow et al. 2016; Sing et al. 2016; Barstow 2020). The situation is reversed for spectra obtained from the ground. With the higher resolving powers available, individual lines can be identified unambiguously. In addition, since the core of the lines probes higher altitudes, detection is not as impacted as at lower resolutions when clouds and hazes are present. Another advantage in resolving individual lines is to have information on atmospheric kinematics through the Doppler shifting of the line positions and the shape (in particular, the width) of the spectral lines. However,⁴ telluric contamination plays a dominant role in the collected spectra and needs to be accurately removed. This operation is done after the usual raw data reduction step (flats, darks, wavelength calibration), in an operation that we will call "preparing pipeline"⁵ (also called "preprocessing" or "detrending" in other works).

A well-established technique to analyze high-resolution ground-based data is the cross-correlation function (CCF) technique (e.g., Snellen et al. 2010; Brogi et al. 2012, 2018; Birkby et al. 2013, 2017; Cabot et al. 2018; Hawker et al. 2018; Hoeijmakers et al. 2018; Alonso-Floriano et al. 2019; Sánchez-López et al. 2019; Kesseli et al. 2020; Merritt et al. 2020; Sánchez-López et al. 2020; Stangret et al. 2020; Kesseli & Snellen 2021; Landman et al. 2021; Merritt et al. 2021; Nugroho et al. 2021; Cont et al. 2022a, 2022b; Hoeijmakers et al. 2022; Kesseli et al. 2022; Prinoth et al. 2022; Sánchez-López et al. 2022). Cross-correlation has been successfully employed to detect atmospheric species, as well as to identify in some cases spectral line Doppler shifts—translated in an offset of rest velocity—which have been attributed to high-altitude winds (e.g., for HD 189733 b, Louden & Wheatley 2015; Wyttenbach et al. 2015; Brogi et al. 2016; Alonso-Floriano et al. 2019; Flowers et al. 2019). However, this technique gives no or inaccurate information about atmospheric species' abundances or other parameters such as the temperature profile. Indeed, the CCF is sensitive at first order to the lines' position and shape but not to their absolute depth. On the other hand, data obtained from space are often studied via a χ² analysis (included or not in a Bayesian framework), directly comparing models with data and giving access to such information. Such direct comparison with a model cannot be done as easily with ground-based data of exoplanets because the aforementioned preparing pipeline deforms the spectrum (see also Section 4.1), biasing the analysis.

This issue with the preparing pipeline in high-resolution ground-based data analysis appears mostly solved. In recent years, new techniques to perform Bayesian analysis on ground-based data have emerged, with successful retrievals of atmospheric species abundances, temperature, and other properties (Brogi & Line 2019; Gandhi et al. 2019; Gibson et al. 2020, 2022; Pelletier et al. 2021). The methods used have two main steps: first, prepare the data with a preparing pipeline to remove instrumental effects, as well as telluric and stellar contamination, and then modify the terms in the log-likelihood function used in the Bayesian analysis to take into account the data preparing step. That is, the data are replaced with the prepared data, and the model and the uncertainties are modified accordingly. This allows for direct data–model comparisons and hence, in principle, accurate atmospheric parameter estimations.

However, for both the preparing techniques and the log-likelihood modification, there is currently no agreement on the method to use. The preparing steps are often not described exhaustively, sometimes making reproducibility and comparisons difficult. This need for a consistent, reproducible, and robust preparation technique has already been raised (Cabot et al. 2018; Cheverall et al. 2023). Similarly, how the likelihood function terms are modified can vary significantly between authors—especially regarding the modification of the forward models. A formal justification for why a forward model should be prepared in a given way is also lacking from the literature, although this is a crucial step. Hence, there is also a need for a consistent method to modify the log-likelihood function terms.

In this work, we use previously analyzed high-resolution ground-based observations of an HD 189733 b transit to introduce a formally motivated high-resolution Bayesian retrieval framework in the near-infrared (NIR). We propose a formal demonstration that this framework is unbiased for a subset of preparing pipelines satisfying given requirements. We also introduce the notion of a "bias pipeline metric" (BPM) of the preparing pipeline, which can be used to compare preparing pipelines among each other. Finally, we present the results of our reanalysis of the data and put new constraints on the planet kinematics, temperature profile, cloud and haze properties, and molecular abundances. Doing so, we demonstrate the viability of our framework on real data.

2. Observations and Data Reduction

2.1. Observations

We analyzed a transit event of HD 189733 b observed with CARMENES (Quirrenbach et al. 2016, 2018) on 2017 September 7, which is publicly available from the Calar Alto Observatory (CAO) archive.⁶ This data set has been previously used to successfully detect water vapor and to infer the hazy nature of the atmosphere of this hot Jupiter (Alonso-Floriano et al. 2019; Sánchez-López et al. 2019, 2020). In addition, it was used by Cheverall et al. (2023) as a case study for evaluating different preparing methods for CCF analysis. We used the data from the NIR channel's fiber A (focused on the target star HD 189733), which covers the spectral range 0.96–1.71 μm in 28 echelle orders at a spectral resolution of ${ \mathcal R }\approx {\rm{80,400}}$ . The instrument's fiber B was kept on sky during the observations so as to monitor the occurrence of strong sky emission lines, which we found not to impact our analyses in any significant way. Each spectral order is, in turn, built from two 2040 × 2040 pixel infrared detectors.

The observations consist of 45 exposures of 198 s encompassing the planet orbital phases from −0.035 to 0.036 (i.e., the primary transit of HD 189733 b). In Figure 2 we show the evolution of the relative humidity (RH), air mass, and mean signal-to-noise ratio (S/N) per spectrum during the observations (see additional details in Alonso-Floriano et al. 2019). Even with the low air masses, the precipitable water vapor (PWV) content of the atmosphere ranged from 11.7 to 15.9 mm. This is relatively high compared to the mean value of ≈7 mm at the Calar Alto Observatory and translated into a high number of spectral pixels rejected owing to a strong telluric contamination (see details on the telluric correction and masking in Section 4.1). Nevertheless, the mean S/N stayed high over the course of the observations, being always above 100 during transit.

**Figure 2.** Evolution of the RH (top panel), air mass (middle panel), and mean S/N (bottom panel) during the observations. The transit event of HD 189733 b occurs between the vertical dashed lines.
Download figure:
Standard image High-resolution image

2.2. Data Reduction

The archive provides the option to download the data already reduced using the standard pipeline caracal v2.00 (Zechmeister et al. 2014; Caballero et al. 2016), which provides 1D spectra extracted from the raw observed frames. caracal performs a bias correction to remove the unwanted contribution of each order's specific properties, a dark correction to subtract the contribution from thermal dark currents, and a flat-field subtraction, which removes effects such as artifacts in the images produced by pixel-wise sensitivity differences. In addition, caracal performs a wavelength calibration, providing the final processed spectra with wavelengths in vacuum and in the rest frame of Earth. We took the air mass, Julian date, and barycentric velocity (V_bary) corresponding to each exposure from each file's header. The provided uncertainty matrix for each spectral point at each frame is built by caracal by estimating the readout noise and photon noise.

The time stamps given by the caracal pipeline are in MJD_UTC.⁷ We corrected them from the light-travel time to the barycenter⁸ to obtain time stamps in BJD_TDB, which we used in our analysis. In our case, the difference between the UTC and TDB time stamps is UTC − TDB = −368 s on average. Note that this is equivalent to a Doppler velocity shift of ≈+2 km s⁻¹ of the planetary spectral lines at midtransit.

3. Models

3.1. Transmission Spectrum Model

To model the transmission spectrum of HD 189733 b, we follow the steps below.

Step 1: We model the planet atmosphere with 100 equally log-spaced pressure levels between 10⁷ and 10⁻⁵ Pa. We set the temperature constant with pressure. We use a mass fraction (also called the "mass mixing ratio," or MMR) constant with pressure for every species included in our model. We ensure that the sum of the MMRs ${{ \mathcal X }}_{i}$ of all species i at each level is equal to 1 by doing the following:

1.
If ${\sum }_{i}{{ \mathcal X }}_{i}\gt 1$ : the MMR of each species is divided by ${\sum }_{i}{{ \mathcal X }}_{i}$ .
2.
If ${\sum }_{i}{{ \mathcal X }}_{i}\lt 1$ : H₂ and He are added to the atmosphere so that ${{ \mathcal X }}_{{{\rm{H}}}_{2}}+{{ \mathcal X }}_{\mathrm{He}}+{\sum }_{i}{{ \mathcal X }}_{i}=1$ , with ${{ \mathcal X }}_{\mathrm{He}}/{{ \mathcal X }}_{{{\rm{H}}}_{2}}=12/37$ , which is roughly the ratio found in Jupiter's atmosphere (von Zahn et al. 1998).

The average molar mass (also called the "mean molar weight," or MMW) of each atmospheric levels is obtained via $\mathrm{MMW}={({\sum }_{i}{{\rm{X}}}_{i}/{{ \mathcal M }}_{i})}^{-1}$ , where ${{ \mathcal M }}_{i}$ is the molar mass of species i, obtained from the molmass ⁹ library. We calculate the planet orbital velocity v_o, assuming that the planet's orbit is circular and that its mass is negligible compared to that of its host:

$\begin{eqnarray}&&{v}_{o}=\sqrt{\displaystyle \frac{{{GM}}_{* }}{{a}_{p}}},\end{eqnarray} \tag{ 1 }$

where G is the gravitational constant, M_* is the mass of the star, and a_p is the length of the semimajor axis of the planet orbit.

Step 2: We obtain the planet radial velocity with respect to the observer in the reference frame of the planet's system barycenter V_p(t) following

$\begin{eqnarray}&&{V}_{p}(t)={v}_{o}\sin ({i}_{p})\sin (2\pi \phi (t)),\end{eqnarray} \tag{ 2 }$

where i_p is the planet orbital inclination, with i_p = 90° corresponding to the planet−observer axis being contained into the plane of the planet orbit, and ϕ(t) = (t − T₀)/P are the planet orbital phases with respect to time t, where T₀ is the midtransit time and P is the planet orbital period. The planet's radial velocity semiamplitude ${v}_{o}\sin ({i}_{p})$ is often denoted K_p and will be retrieved in our setup. A positive radial velocity denotes that the planet is moving away from the observer, while a negative one denotes that the planet is moving toward the observer. Accordingly, ϕ = 0 when the planet is in front of the star relative to the observer (mid−primary transit), while ϕ = 0.5 when the planet is behind the star (mid−secondary transit).¹⁰ This value will be used for the Doppler shifting of the model spectrum in step 5.

Step 3: Using the petitRADTRANS¹¹ (pRT) package (Mollière et al. 2019), we compute the model transit radius of the planet m_θ,0 from a set of parameters θ. This transit radius is defined for a set of wavelengths in vacuum and in the rest frame of the model, λ₀, at ${ \mathcal R }={10}^{6}$ . We include the line-by-line opacities of each species i included in step 1, as well as the collision-induced absorptions of H₂–H₂ and H₂–He and the Rayleigh scattering effect of H₂ and He. We use the opacities included with petitRADTRANS, listed in Table 2. To reduce memory usage and improve calculation speed, we downsampled those opacities such as to take 1 out of 4 points along wavelength. Given CARMENES's resolving power, we do not expect this to significantly impact our results. We simulate clouds as an opaque layer topping at a given pressure. Hazes are simulated using an additional opacity following a power law and defined by its value at 0.35 μm (κ₀) and a power-law parameter (γ).

Table 2. References of the Line Lists Used for Our High-resolution Spectra

Species	Reference
CH₄	Hargreaves et al. (2020)
CO	Rothman et al. (2010)
H₂O	Rothman et al. (2010)
H₂S	Rothman et al. (2013)
HCN	Harris et al. (2006), Barber et al. (2014)
NH₃	Yurchenko et al. (2011)

Download table as: ASCII Typeset image

Step 4: We scale the transit radii to the star radius R_* following

$\begin{eqnarray}&&{m}_{\theta ,\mathrm{scale}}({\lambda }_{0})=1-{\left(\displaystyle \frac{{m}_{\theta ,0}({\lambda }_{0})}{{R}_{* }}\right)}^{2}.\end{eqnarray} \tag{ 3 }$

Step 5: In order to take into account the Doppler effect caused by the relative velocity between the planet and the observer, we shift each wavelength λ₀, using

$\begin{eqnarray}&&{\lambda }_{\mathrm{shift}}(t)={\lambda }_{0}\sqrt{\displaystyle \frac{1+\tfrac{{V}_{\mathrm{obs}}(t)}{c}}{1-\tfrac{{V}_{\mathrm{obs}}(t)}{c}}},\end{eqnarray} \tag{ 4 }$

where λ_shift(t) contains the shifted wavelengths in the rest frame of the planet at each exposure; V_obs(t) = V_p(t) + V_sys + V_bary(t) + V_rest, where V_sys is the radial velocity of the star system with respect to the solar system (also called systemic velocity); V_bary(t) is the radial component of the velocity between the observer and the solar system barycenter projected along the vector pointing from the observer to the star; V_rest is a corrective scalar we retrieve so as to take into account additional sources of dynamics in the planet atmosphere; and c is the speed of light in vacuum. Thereby, we obtain from m_θ,scale(λ₀) a spectral matrix M _θ,shift(t, λ_shift), that is, one spectrum in the rest frame of the planet for each exposure.¹²

Step 6: During ingress and egress, the intersection area between the planet's disk and its star's disk changes from 0 at the very beginning (T₁) and end (T₄) of the transit to the planet's disk area during the full transit. Since the transit depth is proportional to this area, the planet signal during ingress and egress evolves accordingly. For a CCF analysis, this can be neglected since it is the position of the spectral features and their shape that matter the most for signal detection at first order. For a retrieval analysis, however, this is more problematic, as the absolute amplitude of the spectral features has an impact on the retrieved values. To take this effect into account, we calculate the wavelength-dependent mutual sky-projected distance between the apparent star center and the apparent planetary center (δ) following Csizmadia (2020):

$\begin{eqnarray}&&\begin{array}{l}\delta (t,{\lambda }_{\mathrm{shift}})=\sqrt{{b}^{2}+\left({\left(1+{r}_{\theta }(t,{\lambda }_{\mathrm{shift}})\right)}^{2}-{b}^{2}\right)\displaystyle \frac{2\phi (t)}{{\phi }_{14}}},\end{array}\end{eqnarray} \tag{ 5 }$

where $b={a}_{p}{R}_{* }\cos ({i}_{p})$ is the impact parameter assuming a circular orbit, ${r}_{\theta }(t,{\lambda }_{\mathrm{shift}})=\sqrt{1-{{\boldsymbol{M}}}_{\theta ,\mathrm{shift}}(t,{\lambda }_{\mathrm{shift}})}$ is simply the normalized transit radii, and ϕ₁₄ = T₁₄/P is the phase arc described by the planet over the course of the total transit, with T₁₄ the planet's total transit time and P the planet's orbital period. Then, we assume that at each wavelength the planet is an opaque, perfectly dark sphere and neglect the limb-darkening effect of the star. We calculate the corrected scaled transit radii M _θ,transit(t, λ_shift) following Mandel & Agol (2002):

$\begin{eqnarray}&&{{\boldsymbol{M}}}_{\theta ,\mathrm{transit}}(t,{\lambda }_{\mathrm{shift}})=1,\quad \mathrm{if}\ \delta \gt 1+{r}_{\theta },\end{eqnarray} \tag{ 6a }$

$\begin{eqnarray}&&\begin{array}{l}{{\boldsymbol{M}}}_{\theta ,\mathrm{transit}}(t,{\lambda }_{\mathrm{shift}})=1-\left({r}_{\theta }^{2}{c}_{0}+{c}_{1}-\sqrt{{\rm{\Delta }}}\right)/\pi ,\quad \mathrm{if}\ \left|1-{r}_{\theta }\right|\lt \delta \leqslant 1+{r}_{\theta },\end{array}\end{eqnarray} \tag{ 6b }$

$\begin{eqnarray}&&{{\boldsymbol{M}}}_{\theta ,\mathrm{transit}}(t,{\lambda }_{\mathrm{shift}})={{\boldsymbol{M}}}_{\theta ,\mathrm{shift}},\qquad \mathrm{if}\delta \leqslant 1-{r}_{\theta },\end{eqnarray} \tag{ 6c }$

$\begin{eqnarray}&&{{\boldsymbol{M}}}_{\theta ,\mathrm{transit}}(t,{\lambda }_{\mathrm{shift}})=0,\quad \mathrm{if}\ \delta \leqslant {r}_{\theta }-1,\end{eqnarray} \tag{ 6d }$

where the dependencies on λ_shift and t are implied, and where ${c}_{0}(t,{\lambda }_{\mathrm{shift}})=\arccos (({\delta }^{2}-{{\boldsymbol{M}}}_{\theta ,\mathrm{shift}})/2{r}_{\theta }\delta )$ , ${c}_{1}(t,{\lambda }_{\mathrm{shift}})=\arccos (({\delta }^{2}\,+{{\boldsymbol{M}}}_{\theta ,\mathrm{shift}})/2\delta )$ , and ${\rm{\Delta }}{(t,{\lambda }_{\mathrm{shift}})=(4{\delta }^{2}-({{\boldsymbol{M}}}_{\theta ,\mathrm{shift}}+{\delta }^{2})}^{2})/4$ . Because of its effect on the lines' amplitude, this step is crucial to retrievals, in contrast with CCF analysis.

Note that we assume that the planet's atmosphere is spatially (i.e., along latitudes and longitudes) uniform. As previously mentioned, models (e.g., Flowers et al. 2019) show that we should expect HD 189733 b's atmosphere to be, in contrast, spatially asymmetric, in particular considering wind velocity (see Section 1). A consequence is that the overall Doppler shift effect on the atmospheric spectral lines during ingress and egress ( $\left|1-{r}_{\theta }\right|\lt \delta \leqslant 1+{r}_{\theta },$ ) should be different from this effect during the full transit (δ ≤ 1 − r_θ). On the other hand, the spectral lines' amplitudes are dampened as the planet eclipses partially its star's disk. This Doppler shift effect discrepancy is thus the strongest when the spectral lines' amplitudes are the weakest. Moreover, in our case a majority of the analyzed data are taken during the full transit (see Section 4.4). Hence, we do not expect this effect to strongly impact our results.

Step 7: In order to simulate the effect of the instrument line-spread function (LSF), we convolve M _θ,transit(t, λ_shift) at each exposure by a Gaussian filter of standard deviation σ_conv given by

$\begin{eqnarray}&&{\sigma }_{\mathrm{conv}}=\displaystyle \frac{\lambda }{{{ \mathcal R }}_{{\rm{C}}}}\displaystyle \frac{1}{\langle {\rm{\Delta }}\lambda \rangle }\displaystyle \frac{1}{2\sqrt{2\mathrm{ln}(2)}},\end{eqnarray} \tag{ 7 }$

where 〈Δλ〉 is the average step between the wavelengths λ, ${{ \mathcal R }}_{{\rm{C}}}$ is the resolving power of the CARMENES NIR channel (expected to be 80,400), and $\mathrm{ln}$ denotes the natural logarithm. The rightmost term is used to convert the FWHM of the LSF into the Gaussian filter's standard deviation. A Gaussian-kernel filter¹³ ${ \mathcal G }$ is applied on the data. This gives us the shifted, convolved spectrum M _θ,conv(t, λ_shift) via

$\begin{eqnarray}&&{{\boldsymbol{M}}}_{\theta ,\mathrm{conv}}(t,{\lambda }_{\mathrm{shift}})={ \mathcal G }({{\boldsymbol{M}}}_{\theta ,\mathrm{shift}}(t,{\lambda }_{\mathrm{shift}}),{\sigma }_{\mathrm{conv}}).\end{eqnarray} \tag{ 8 }$

Step 8: M _θ,conv(t, λ_shift) is rebinned to the CARMENES wavelengths λ to obtain the final spectral model M _θ(t, λ). This is done using the rebin_spectrum function of petitRADTRANS. In short, rebin_spectrum constructs bin boundaries from the observational wavelength array that always lie in the middle between two neighboring wavelength points. At the edges of the spectrum the bin is assumed to be symmetric about the wavelength value of the observations. The model is then binned by calculating its integral over the bin width and dividing the integral by that width afterward. For the integral a linear wavelength dependence between the model points is assumed, so a trapezoidal rule is used for integration. Rebinning is preferred over interpolation because the former preserves the spectrum integral (so it is closer to the instrument's physics, where detector pixels collect the light dispersed by the spectrograph over a small area), at a negligible computational cost compared to that of our complete forward model calculation. We qualify this model as 1D because the parameters depend only on the pressure. In Figure 3 we show an illustration of these steps.

**Figure 3.** Illustration of the model construction process for a region of order 46 of the studied transit. Steps 1 and 2 do not involve spectra and are not represented. The rows displaying steps 6 and 7 have a broken y-axis in order to better show the spectra despite the transit effect. In the penultimate row, the spectra have been normalized over wavelengths for illustrative purposes. The bottom panel represents M _θ(t, λ) without modification.
Download figure:
Standard image High-resolution image

3.2. Simulated Data

The simulated data first follow the exact same steps as those listed in Section 3.1. We replace θ, the retrieved parameters, with Θ, a "true" set of parameters. We do not include the effect of stellar lines in this model. In order to simulate the effect of Earth's atmosphere and of the instrument, we follow these additional steps:

Step 6 bis: This step is inserted between step 6 and step 7 of Section 3.1. We obtain the wavelength-dependent transmittance of Earth's atmosphere T _⊕,0 at wavelengths λ_⊕, at a resolving power of 10⁶, and for an air mass μ = 1, using the sky model calculator SKYCALC.¹⁴ The air mass at the time of each exposure μ(t) is obtained from the CARACAL pipeline. We obtain the wavelength- and time-dependent transmittance of Earth's atmosphere, T _⊕(t, λ_⊕), via

$\begin{eqnarray}&&{{\boldsymbol{T}}}_{\oplus }(t,{\lambda }_{\oplus })=\exp \left(\mu (t)\mathrm{ln}({{\boldsymbol{T}}}_{\oplus ,0}({\lambda }_{\oplus }))\right).\end{eqnarray} \tag{ 9 }$

In contrast with the steps described in Section 3.1, here we use an intermediate wavelength grid λ_int at the same resolving power as λ_shift(t), and such that the grid is included in the intersection of λ_shift(t) along t. This intermediate grid is used to ensure that the transmittances of Earth's atmosphere vary smoothly with time after the final rebinning performed in step 8. For each exposure, both M _θ,shift(t, λ_shift) and T _⊕(t, λ_⊕) are rebinned to the corresponding λ_int using the same function as in step 8. The rebinned T _⊕(t, λ_int) is included in the rebinned spectrum M _θ,int(t, λ_int) via

$\begin{eqnarray}&&{{\boldsymbol{M}}}_{{\rm{\Theta }},\mathrm{int},{\boldsymbol{T}}}(t,{\lambda }_{\mathrm{int}})={{\boldsymbol{M}}}_{{\rm{\Theta }},\mathrm{int}}(t,{\lambda }_{\mathrm{int}})\circ {{\boldsymbol{T}}}_{\oplus }(t,{\lambda }_{\mathrm{int}}),\end{eqnarray} \tag{ 10 }$

with "◦" denoting the element-wise product, also called the "Hadamard product" (Million 2007). Then, steps 7 and 8 of Section 3.1 are applied to M _{Θ,int,
T}(t, λ_int) without any modification, to obtain M _{Θ,
T}(t, λ).

Step 9: The variations of observed flux level, pseudocontinuum, and blaze function, which we regroup under the term "instrumental deformations," are represented with a time- and wavelength-dependent matrix X (t, λ). To model X , we used the fit given by the first step of our preparing pipeline (see Section 4.1) on our data. We obtain the noiseless simulated data via

$\begin{eqnarray}&&{{\boldsymbol{F}}}_{\mathrm{sim},0}(t,\lambda )={{\boldsymbol{M}}}_{{\rm{\Theta }},{\boldsymbol{T}}}(t,\lambda )\circ {\boldsymbol{X}}(t,\lambda ).\end{eqnarray} \tag{ 11 }$

Step 10: The noisy simulated data are finally obtained from

$\begin{eqnarray}&&\begin{array}{l}{{\boldsymbol{F}}}_{\mathrm{sim}}(t,\lambda )={{\boldsymbol{F}}}_{\mathrm{sim},0}(t,\lambda )+{\boldsymbol{N}}(t,\lambda ),\end{array}\end{eqnarray} \tag{ 12 }$

where N (t, λ) is the wavelength- and time-dependent Gaussian noise of the data.¹⁵ The scale of the noise U _N(t, λ), i.e., the data uncertainty, is given by the CARACAL pipeline. In Figure 4 we show an illustration of these steps.

**Figure 4.** Illustration of the simulated data construction process for a region of order 46 of the studied transit. Steps 3−5 are identical to those in Figure 3. The slight differences between the telluric line shapes with orbital phase are mainly due to the change in air mass. For illustrative purposes, the noise in step 9 has been increased by a factor 10 compared to the amount expected for CARMENES.
Download figure:
Standard image High-resolution image

3.3. Self-consistent Model

In order to check the physical consistency of our retrieved results, we use a self-consistent model generated by the Exo-REM software.¹⁶ Exo-REM calculates 1D radiative-equilibrium models for H₂-dominated (atmospheric metallicity Z ⪅ 1000 times solar) cold to hot (equilibrium temperature ⪅2000 K) planetary atmospheres. The software is fully described by Baudino et al. (2015, 2017), Charnay et al. (2018), and Blain et al. (2021). To summarize, fluxes in Exo-REM are calculated using the two-stream approximation, assuming hemispheric closure. The radiative−convective equilibrium is solved assuming that the net flux (radiative plus convective) is conservative. The conservation of flux over the pressure grid is solved iteratively using a constrained linear inversion method.

Our HD 189733 b self-consistent model was parameterized following the same procedure described in Blain et al. (2021), except that we did not include the radiative effect of clouds in this simulation. The parameters used are displayed in Table 1. Since HD 189733 b has a mass and radius similar to those of Jupiter, and since its star's metallicity is similar to that of the Sun, we chose to use an atmospheric metallicity of 3 times the solar metallicity, which is approximately the value measured in the upper atmosphere of Jupiter (Atreya et al. 2020). The MMRs and temperature profile obtained are displayed in Figures 5 and 6, respectively. The simulated temperature profile is shown crossing the condensation curve of MnS at 3.7 × 10⁴ Pa, where the contribution to the transmission spectrum is relatively low (⪅2.5% of the total contribution). The corresponding low-resolution ( ${ \mathcal R }\approx 500$ ) transmission spectrum and the contribution of some absorber species over the wavelength range of CARMENES are represented in Figure 7.

**Figure 5.** MMRs obtained from our self-consistent model, for 3 times the solar metallicity. Only the absorbing species MMRs are represented, except TiO and VO, which had MMRs below 10⁻¹². The horizontal dotted and dashed black lines correspond, respectively, to the pressure range concentrating 95% and 68% of the transmission contribution of our petitRADTRANS base model (see Section 4.7) over the CARMENES wavelength range. The horizontal solid black line corresponds to the pressure of the maximum contribution.
Download figure:
Standard image High-resolution image

**Figure 6.** Temperature profile and condensation curves obtained from our self-consistent model, for 3 times the solar metallicity. The horizontal dotted and dashed black lines correspond, respectively, to the pressure range concentrating 95% and 68% of the transmission contribution of our petitRADTRANS base model (see Section 4.7) over the CARMENES wavelength range. The horizontal solid black line corresponds to the pressure of the maximum contribution.
Download figure:
Standard image High-resolution image

**Figure 7.** HD 189733 b low-resolution ( ${ \mathcal R }\approx 500$ ) transit depth and species contributions simulated with Exo-REM. The gray areas represent the wavelength range of the CARMENES orders. On top are indicated the index number of the even-numbered orders. The dotted line labeled "CIA+Ray" represents the combined contributions of the H₂–H₂, H₂–He, and H₂O–H₂O collision-induced absorptions and the effect of Rayleigh scattering. The contributions of CO₂, PH₃, TiO, and VO were negligible in this spectral region and thus are not represented. On the background, in black, are represented the time-averaged data in arbitrary units.
Download figure:
Standard image High-resolution image

**Figure 7.** HD 189733 b low-resolution ( ${ \mathcal R }\approx 500$ ) transit depth and species contributions simulated with Exo-REM. The gray areas represent the wavelength range of the CARMENES orders. On top are indicated the index number of the even-numbered orders. The dotted line labeled "CIA+Ray" represents the combined contributions of the H₂–H₂, H₂–He, and H₂O–H₂O collision-induced absorptions and the effect of Rayleigh scattering. The contributions of CO₂, PH₃, TiO, and VO were negligible in this spectral region and thus are not represented. On the background, in black, are represented the time-averaged data in arbitrary units.
Download figure:
Standard image High-resolution image

This last figure shows which species we can expect to detect according to this model. In addition to H₂O, significant contributions of CH₄, CO, H₂S, and NH₃ are expected assuming chemical disequilibrium (see Blain et al. 2021) with a self-consistent eddy diffusion coefficient (see Charnay et al. 2018). We expect the contributions of FeH, K, and Na to be too low and too localized to be detectable. The other absorbers included in this model, namely CO₂, HCN, PH₃, TiO, and VO, have negligible contributions over the CARMENES NIR spectral region. However, HCN has a band around 1.5 μm and was detected in the atmosphere of this planet (Cabot et al. 2018), with a peak CCF detection corresponding to a volume mixing ratio (VMR) of 10⁻⁶, which is 40 times what our Exo-REM model predicts. There is therefore a possibility that the Exo-REM chemical model does not describe this species' chemistry well, due to unknown or underestimated processes. In conclusion, we decided to include CH₄, CO, H₂O, H₂S, HCN, and NH₃ opacities in our retrieval model.

3.4. 3D Transmission Spectrum Model

We generated 3D transmission spectra using pRT-Orange, which will be described in an upcoming paper (P. Mollière et al. 2024, in preparation.). In short, pRT-Orange divides the planet atmosphere into (orange fruit) segments that are connected at the planet's pole-to-pole axis. The number, longitudinal location, and width of the segments can be specified freely. Within each segment the atmosphere is described by a 1D vertical structure that can be obtained using the full set of options available in classic (1D) pRT model. pRT-Orange's current setup thus assumes that the atmospheric properties do not change as a function of latitude. Changing the segment properties and location as a function of latitude is currently under investigation. The setup of pRT-Orange thus allows us to straightforwardly capture gradients in atmospheric properties that vary longitudinally, such as in the day-to-night or morning−evening differences.

The 3D transmission spectrum is obtained by slicing the planet atmosphere into disks parallel to the equatorial plane. In each disk, the 2D transmission problem is solved. This is very similar to pRT's 1D treatment, with the added complication that atmospheric properties change as a function of longitude. The slices' locations are chosen by transforming the height above the equatorial plane into a coordinate in which constant distances mean constant area fractions of the planet's terminator annulus and then discretizing the coordinate using Gaussian quadrature. Integrating the annulus area, weighted by the atmospheric transmission obtained from the 2D disks, then results in the effective planet area as a function of wavelength. We benchmarked our setup with gCMCRT (Lee et al. 2022), finding excellent agreement.

For our 3D spectra of HD 189733 b we post-processsed the general circulation model (GCM) results reported in Drummond et al. (2018).¹⁷ In particular, we use their cloud-free model at equilibrium chemistry, at solar composition (metallicity Z = 1). For simplicity, we separated the planet into 36 segments spaced equidistantly in longitude and assumed that their temperature−pressure structures are well represented by the equatorial structure from Drummond et al. (2018), at the same longitude. We used their full (also latitudinally varying) velocity structure and the planetary rotation rate to derive line-of-sight velocities at all discretized locations of the planet and Doppler-shifted the opacities accordingly. We used the segments' pressure−temperature profiles and equilibrium chemistry to determine the atmospheric composition in each segment. To coincide better with the self-consistent Exo-REM models, we set a metallicity of Z = 3—instead of the Z = 1 used for the GCM model—for the equilibrium chemistry calculations. We assume that this increase in metallicity would not significantly affect the GCM results or pressure−temperature profiles.

4. Methods

4.1. Preparing Pipeline

In order to retrieve valuable information from the data, it is necessary to have a proper parameterization of each atmospheric property in our forward model. While all properties in Section 3.1 can easily be parameterized, this is not the case for T _⊕(t, λ), X (t, λ), and the stellar lines. They would require a lot of parameters to retrieve, and Earth atmospheric modeling is computationally expensive. We will represent these "nonretrievable"¹⁸ parameters with a so-called "deformation matrix" D (t, λ) ≡ X (t, λ)◦ T (t, λ). Here T (t, λ) represents the telluric transmittance and stellar line component of the noiseless data, which would be obtained via T = M _{Θ,
T}/ M _Θ (see Sections 3.1 and 3.2). By applying a preparing pipeline P_R, it is possible, in principle, to remove most of D (t, λ) from the data. Since the CARMENES data are coming from different orders, we apply the preparing pipeline on each of these orders separately. For a set of spectra F (t, λ), the pipeline P_R( F ) steps are described below.

Step 1: We clean the effect of X (t, λ) by dividing each exposure of F (t, λ) by a corresponding second-order polynomial fit of F (t, λ) over wavelength.¹⁹ If we call the fit $\overline{{\boldsymbol{X}}}(t,\lambda )$ , we obtain the X -corrected ("normalized") spectrum ${{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda )$ with

$\begin{eqnarray}&&{{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda )={\boldsymbol{F}}(t,\lambda )\oslash \overline{{\boldsymbol{X}}}(t,\lambda ),\end{eqnarray} \tag{ 13 }$

where "⊘" denotes the element-wise division, also called "Hadamard division" (Wetzstein et al. 2012). Following the propagation errors, the uncertainties of ${{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda )$ are ${{\boldsymbol{U}}}_{{\boldsymbol{N}},\overline{{\boldsymbol{X}}}}(t,\lambda )={{\boldsymbol{U}}}_{{\boldsymbol{N}}}(t,\lambda )\oslash \left|\overline{{\boldsymbol{X}}}(t,\lambda )\right|\circ \sqrt{{n}_{\lambda }(t)}$ . The variance correction factor of the fit n_λ(t) is given by ${n}_{\lambda }(t)=({N}_{\lambda }^{\prime} (t)-d)/{N}_{\lambda }$ , where ${N}_{\lambda }^{\prime} (t)$ is the number of nonmasked points along wavelength at time t, N_λ is the number of wavelength points, and d = 3 is the number of degrees of freedom of a second-order polynomial fit. This correction is required to accurately estimate the effect of the fit on the uncertainties. This effect is generally regarded as negligible for a large number of fitted points, but it must be highlighted that here N is the number of points used for each individual fit, not the total number of points in the data. In this first step the fit is not performed over our ≈10⁶ total data points, but one time for each exposure, over the wavelengths. This corresponds to a maximum of 2040 data points fitted each time, and less when some of the points are masked. This number is still large in most orders, but in the second step (see below) the fit will be performed one time for each wavelength, over the exposures. This corresponds to a maximum of 26 (see Section 4.4) points fitted each time. Hence, not accounting for n_λ(t) in our case can lead to a significant overestimation of the uncertainties and, after a retrieval, to a model that misleadingly appears to overfit the data.

Step 2: We fit the effect of T (t, λ) with a second-order polynomial fit²⁰ of $\mathrm{ln}({{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda ))$ over air mass: $\mathrm{ln}\left({{\boldsymbol{T}}}_{\oplus }(t,\lambda )\right)$ is indeed linearly dependent on the air mass (see Equation (9)). Note that this correction also works for stellar lines, which are independent of air mass: the fitting polynomial will simply have its air-mass-dependent term close to 0. The second order is used to fit for eventual other slow temporal variations, like PWV variations when the weather is stable. Next, we divide ${{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda )$ by the exponential of the fit. If we call the fit $\overline{{\boldsymbol{T}}}(t,\lambda )$ , we obtain the prepared spectrum ${{\boldsymbol{F}}}_{\overline{{\boldsymbol{T}}}}(t,\lambda )$ with

$\begin{eqnarray}&&{{\boldsymbol{F}}}_{\overline{{\boldsymbol{T}}}}(t,\lambda )={{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda )\oslash \exp \left(\overline{{\boldsymbol{T}}}(t,\lambda )\right).\end{eqnarray} \tag{ 14 }$

In addition, we mask ${{\boldsymbol{F}}}_{\overline{{\boldsymbol{T}}}}(t,\lambda )$ , where $\exp \left(\overline{{\boldsymbol{T}}}(t,\lambda )\right)\lt 0.8$ . This is done to remove from the analysis data that would be too affected by the telluric lines. The value of the threshold was chosen as a good balance between masking tellurics and not masking too many useful data, although we did not perform extensive testing of the effect of this parameter.

From this pipeline, we also obtain what we will call a preparing matrix ${{\boldsymbol{R}}}_{{\boldsymbol{F}}}(t,\lambda )\equiv 1\oslash \left(\overline{{\boldsymbol{X}}}(t,\lambda )\circ \exp \left({\overline{{\boldsymbol{T}}}}_{\oplus }(t,\lambda )\right)\right)$ . Similarly to the previous step, we obtain the variance correction factor of the fit ${n}_{t}(\lambda )=({N}_{t}^{\prime} (\lambda )-d)/{N}_{t}$ , where ${N}_{t}^{\prime} (\lambda )$ is the number of nonmasked points along time at wavelength λ and N_t is the number of time points. All of the preparing pipeline steps can be represented as a function P_R:

$\begin{eqnarray}&&{P}_{{\boldsymbol{R}}}({\boldsymbol{F}})={\boldsymbol{F}}(t,\lambda )\circ {{\boldsymbol{R}}}_{{\boldsymbol{F}}}(t,\lambda )\equiv {{\boldsymbol{F}}}_{\overline{{\boldsymbol{T}}}}(t,\lambda ).\end{eqnarray} \tag{ 15 }$

Note that the preparing matrix R will change depending on the input of P_R (e.g., ${P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{\theta })={{\boldsymbol{M}}}_{\theta }\circ {{\boldsymbol{R}}}_{{{\boldsymbol{M}}}_{\theta }}(t,\lambda )$ , with ${{\boldsymbol{R}}}_{{{\boldsymbol{M}}}_{\theta }}\ne {{\boldsymbol{R}}}_{{\boldsymbol{F}}}$ since the fits of M _θ from step 1 and step 2 are different from the fits of F ), and hence it needs to be recalculated for each set of parameters during the retrieval. The uncertainties on P_R( F ), following the propagation of errors, are

$\begin{eqnarray}&&{{\boldsymbol{U}}}_{{\boldsymbol{R}}}(t,\lambda )={{\boldsymbol{U}}}_{{\boldsymbol{N}}}(t,\lambda )\circ \left|{{\boldsymbol{R}}}_{{\boldsymbol{F}}}(t,\lambda )\right|\circ \sqrt{n(t,\lambda )},\end{eqnarray} \tag{ 16 }$

where n(t, λ) = n_λ(t)◦n_t(λ).

In Figures 8 and 9 we show illustrations of these steps for one and all of the exposures of one order, respectively. For convenience we will refer to this pipeline as "Polyfit." Note that the latter is similar to the preparing pipeline developed by, e.g., Brogi & Line (2019). We also implemented the SysRem (Tamuz et al. 2005) preparing pipeline, which is detailed in Appendix B. For this pipeline or principal component analysis (PCA) based pipelines, it has been shown that the equivalent of Equation (15), applied to the forward model, can be optimized to be much more computationally efficient (Gibson et al. 2022; Klein et al. 2023). We have not investigated whether an equivalent optimization could be implemented for "Polyfit."

**Figure 8.** Illustration of the preparing effect on noiseless simulated data corresponding to order 46. Only the spectrum corresponding to orbital phase −0.0073 is shown. The spectrum used is the same as in step 9 of Figure 4. The spectra are given in arbitrary units.
Download figure:
Standard image High-resolution image

**Figure 9.** Illustration of the preparing pipeline effect on the order 46 of the studied data. First row: prepared model. The model is the same as in step 8 of Figure 3. While deformed, the spectral lines are clearly visible. The transit effect is also still visible. Second row: prepared noiseless simulated data. The simulated data used are the same as in step 9 of Figure 4. Third row: same as the second row, but with added noise. The lines are no longer visible to the naked eye. Bottom row: prepared CARMENES data. The white vertical stripes are masked telluric or stellar lines.
Download figure:
Standard image High-resolution image

4.2. Cross-correlation Setup

In order to test our methods, we used the cross-correlation technique to investigate the presence of the previously detected H₂O signal in this data set. To that end, we used our base model of the planet's transit radius, computed as described in Section 3.1, as a template to perform the cross-correlation analysis. This model has already been put through the same preparing pipeline as the data, following the procedures discussed in Section 4.1. This is done in order to take into account the distortion of a potential real signal produced by the analysis methods performed to remove the telluric, stellar, and instrumental contributions (see, e.g., Brogi & Line 2019). We stress that no artificial telluric or stellar lines were added to the model in this step.

Consecutively, we cross-correlated the prepared data matrix P_R( F ) ≡ F _R with our prepared model P_R( M _θ) ≡ M _{θ,
R} over a wide range of exoplanet radial velocities, from −150 to 150 km s⁻¹ in steps of 1.3 km s⁻¹ (i.e., the mean velocity step between pixels in the CARMENES NIR channel). This analysis was conducted independently for each NIR order of the instrument to inspect the individual cross-correlation matrices in a search for strong residuals. We used a conservative order selection following Alonso-Floriano et al. (2019) and performed the cross-correlation following

$\begin{eqnarray}&&\begin{array}{l}{CC}(t,v)=\displaystyle \frac{{\sum }_{i}\left({{\boldsymbol{F}}}_{{\boldsymbol{R}},i}-\langle {{\boldsymbol{F}}}_{{\boldsymbol{R}}}\rangle \right)\left({{\boldsymbol{M}}}_{\theta ,{\boldsymbol{R}},i}(v)-\langle {{\boldsymbol{M}}}_{\theta ,{\boldsymbol{R}}}\rangle \right)}{\sqrt{{\sum }_{i}{\left({{\boldsymbol{F}}}_{{\boldsymbol{R}},i}-\langle {{\boldsymbol{F}}}_{{\boldsymbol{R}}}\rangle \right)}^{2}{\sum }_{i}{\left({{\boldsymbol{M}}}_{\theta ,{\boldsymbol{R}},i}(v)-\langle {{\boldsymbol{M}}}_{\theta ,{\boldsymbol{R}}}\rangle \right)}^{2}}},\end{array}\end{eqnarray} \tag{ 17 }$

where i denotes the wavelength element, v is a specific lag (i.e., Doppler shift applied to the model), and we have dropped the dependencies with time and wavelength for clarity. 〈 F _R〉 and 〈 M _{θ,
R}〉 represent the mean value of the corresponding vectors. With this, we obtain a cross-correlation matrix in Earth's rest frame, as shown in Figure 10, where we expect a potential planet signal to follow the expected exoplanet velocities with respect to Earth (V_obs(t)), which roughly ranged from −22 to 45 km s⁻¹ during this event.

**Figure 10.** In-transit cross-correlation matrix of the prepared data with our base model in Earth's rest frame and for the order selection of Alonso-Floriano et al. (2019). The horizontal axis shows the different Doppler shifts applied to the model, and the vertical axis shows the exposures at different planet orbital phases. The white line represents the expected exoplanet velocities with respect to Earth during the observations, assuming the maximum significance K_P of the CCF analyses (166 km s⁻¹) and the observed −5.2 km s⁻¹ blueshift. The red line shows the expected planet velocities with respect to Earth considering the expected planet K_p (∼153 km s⁻¹) and no additional sources of dynamics in the atmosphere.
Download figure:
Standard image High-resolution image

The final CCF is then obtained by coadding in time the cross-correlation matrix. If a signal from HD 189733 b's atmosphere is indeed present in the data, we would expect the coadded CCF to present a significant peak if we shift the matrix to the rest frame of the exoplanet. In order to add robustness to our analysis, and as is usual in similar works, in this step the planet's K_p is assumed to be unknown, hence allowing us to explore residuals or noise structures to aid our assessment of a potential signal's significance (see Section 5.1). For the latter, we calculated the S/N of the CCFs at each K_p, dividing the peak CCF value by the CCF's standard deviation, excluding a ±15 km s⁻¹ region around the maximum.

4.3. High-resolution Retrieval Setup

4.3.1. Taking into Account the Preparing Pipeline

Due to the use of the preparing pipeline on the data, great care must be taken to correctly set up the retrieval in order to avoid biases. Let us have a model M _θ(t, λ) with retrieved parameters θ and data F (t, λ) with uncertainties U _N(t, λ).

For the following demonstration we make no assumption regarding the specific operations performed by the preparing pipeline (i.e., it does not need to follow the steps described in Section 4.1), but we assume that it can still be described as in Equation (15), i.e., as the data Hadamard-multiplied by any matrix R . We also assume that the uncertainties of the prepared data U _R(t, λ) have been accurately estimated. If we have a model perfectly describing the planet spectra, if Θ is the true set of parameters (as in Section 3.2), and if D and N are the true deformation matrix and the true noise matrix, respectively, then the observed data can be written as

$\begin{eqnarray}&&{\boldsymbol{F}}\equiv {{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {\boldsymbol{D}}+{\boldsymbol{N}},\end{eqnarray} \tag{ 18 }$

and we can write from Equation (15)

$\begin{eqnarray}&&{P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {\boldsymbol{D}}+{\boldsymbol{N}})=({{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {\boldsymbol{D}}+{\boldsymbol{N}})\circ {{\boldsymbol{R}}}_{{\boldsymbol{F}}},\end{eqnarray} \tag{ 19 }$

where the time and wavelength dependencies of all terms are implied. Not knowing Θ or N but knowing D , we would need to compare the prepared data P_R( M _Θ◦ D + N ) with

$\begin{eqnarray}&&{{\boldsymbol{M}}}_{\mathrm{ret},\theta }={{\boldsymbol{M}}}_{\theta }\circ {\boldsymbol{D}}\circ {{\boldsymbol{R}}}_{{\boldsymbol{F}}}.\end{eqnarray} \tag{ 20 }$

As a proof, we can calculate the reduced χ², defined as

$\begin{eqnarray}&&{\chi }_{\nu }^{2}(f,m,\sigma )=\displaystyle \frac{1}{N}\displaystyle \sum _{i}{\left(\displaystyle \frac{{f}_{i}-{m}_{i}}{{\sigma }_{i}}\right)}^{2},\end{eqnarray} \tag{ 21 }$

where f, m, and σ are vectors representing data, a model, and the uncertainties on f, respectively; N is the number of elements in the vectors; and i denotes the vector element. If Equation (18) is true and if we use θ = Θ, we obtain ${\chi }_{\nu }^{2}({P}_{{\boldsymbol{R}}}({\boldsymbol{F}}),{{\boldsymbol{M}}}_{\mathrm{ret},{\rm{\Theta }}},{{\boldsymbol{U}}}_{{\boldsymbol{R}}})\approx 1$ , which is the expected value for a model well describing the data.

However, for real observations D (t, λ) is unknown. We can rewrite Equation (19) to obtain

$\begin{eqnarray}&&{\boldsymbol{D}}=\displaystyle \frac{{P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {\boldsymbol{D}}+{\boldsymbol{N}})-{\boldsymbol{N}}\circ {{\boldsymbol{R}}}_{{\boldsymbol{F}}}}{{{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {{\boldsymbol{R}}}_{{\boldsymbol{F}}}},\end{eqnarray} \tag{ 22 }$

where the fraction bar here and from now on represents a Hadamard division. Injecting this equation into Equation (20), we obtain

$\begin{eqnarray}&&\begin{array}{l}{{\boldsymbol{M}}}_{\mathrm{ret},\theta }=\displaystyle \frac{{{\boldsymbol{M}}}_{\theta }}{{{\boldsymbol{M}}}_{{\rm{\Theta }}}}\circ \left({P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {\boldsymbol{D}}+{\boldsymbol{N}})-{\boldsymbol{N}}\circ {{\boldsymbol{R}}}_{{\boldsymbol{F}}}\right).\end{array}\end{eqnarray} \tag{ 23 }$

As stated in Section 4.1, the goal of P_R is to remove D from the data as much as possible. Hence, an ideal preparing pipeline would be such that R _F = 1 ⊘ D . We now assume that the preparing matrix R _F of a nonideal pipeline on data F can be written as

$\begin{eqnarray}&&{{\boldsymbol{R}}}_{{\boldsymbol{F}}}={\boldsymbol{A}}({{\boldsymbol{M}}}_{{\rm{\Theta }}})\oslash {\boldsymbol{D}}+{\boldsymbol{B}}({{\boldsymbol{M}}}_{{\rm{\Theta }}},{\boldsymbol{D}},{\boldsymbol{N}}),\end{eqnarray} \tag{ 24 }$

where A and B are time- and wavelength-dependent matrices. The elements in parentheses are here used only to highlight the matrix dependencies. The matrix A ²¹ can depend only on M _Θ. The matrix B can depend on M _Θ, D , and N . These two matrices represent together the "imperfections" of a nonideal preparing pipeline, that is, how it "catches" some of the signal and how it imperfectly removes D . A preparing pipeline effective at removing D would be such that ∣ B ∣ ≪ ∣ A ⊘ D ∣. Indeed, if this condition is not respected, then the effect of D on the prepared data would still be significant, and the correction would be significantly biased by N . If we assume that this condition is respected, we can write from Equation (19)

$\begin{eqnarray}\begin{array}{rcl}{P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {\boldsymbol{D}}+{\boldsymbol{N}}) & = & \left({{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {\boldsymbol{D}}+{\boldsymbol{N}}\right)\circ \left({\boldsymbol{A}}\oslash {\boldsymbol{D}}+{\boldsymbol{B}}({\boldsymbol{D}},{\boldsymbol{N}})\right)\\ & \approx & {{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {\boldsymbol{A}}+{\boldsymbol{N}}\circ {{\boldsymbol{R}}}_{{\boldsymbol{F}}},\end{array}\end{eqnarray} \tag{ 25 }$

where the dependencies on M _Θ of A and B are implied. It follows that if there is no deformation matrix or noise, i.e., if we replace D by a matrix of ones and N by a matrix of zeros, using Equation (25) and the fact that A depends only on M _Θ,

$\begin{eqnarray}\begin{array}{rcl}{P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{{\rm{\Theta }}}) & = & {{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ \left({\boldsymbol{A}}+{\boldsymbol{B}}({\bf{1}},{\bf{0}})\right)\\ & \approx & {{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {\boldsymbol{A}}\\ & \approx & {P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{{\rm{\Theta }}}\circ {\boldsymbol{D}}+{\boldsymbol{N}})-{\boldsymbol{N}}\circ {{\boldsymbol{R}}}_{{\boldsymbol{F}}},\end{array}\end{eqnarray} \tag{ 26 }$

where the dependencies on M _Θ of A and B are implied. Injecting Equation (26) into Equation (23), we obtain

$\begin{eqnarray}&&{{\boldsymbol{M}}}_{\mathrm{ret},\theta }(t,\lambda )\approx \displaystyle \frac{{{\boldsymbol{M}}}_{\theta }}{{{\boldsymbol{M}}}_{{\rm{\Theta }}}}\circ {P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{{\rm{\Theta }}}).\end{eqnarray} \tag{ 27 }$

In a real case, we obviously do not know the true set of parameters Θ, but we can make the assumption that the retrieval best fit will correspond to a good approximation of Θ—which is true if the model accurately represents the physics of the data. Thus, we can write

$\begin{eqnarray}\begin{array}{rcl}{{\boldsymbol{M}}}_{\mathrm{ret},\theta }(t,\lambda ) & \approx & {P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{\theta })\\ & \approx & {{\boldsymbol{M}}}_{\theta }\circ {{\boldsymbol{R}}}_{{{\boldsymbol{M}}}_{\theta }}.\end{array}\end{eqnarray} \tag{ 28 }$

To summarize, if a preparing pipeline can be written as in Equation (15) and has small residuals coming from D and N (which is arguably a desirable property of preparing pipelines), or in other words, if it respects, using the notation of Equation (19), the condition

$\begin{eqnarray}&&{\bf{1}}-\displaystyle \frac{{P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{{\rm{\Theta }}})+{\boldsymbol{N}}\circ {{\boldsymbol{R}}}_{{\boldsymbol{F}}}}{{P}_{{\boldsymbol{R}}}({\boldsymbol{F}})}\approx {\bf{0}},\end{eqnarray} \tag{ 29 }$

then the optimal retrieval model is given by Equation (28), that is, applying the preparing pipeline directly on the forward models. This condition can easily be tested on synthetic data, for which Θ is known, using any realistic D and N . We will call the values given by the left-hand term of this equation the BPM.²² The BPM can be seen as analogous to ${\chi }_{\nu }^{2}$ for preparing pipelines, especially if we compute its absolute mean. A ∣〈BPM〉∣ closer to 0 is more desirable, so the BPM can be a tool to compare preparing pipelines following Equation (15) with each other.

4.3.2. Log-likelihood Function and Retrieval Algorithm

In order to retrieve the parameters θ (such as the temperature profile, species abundances, etc.) from the data, we use the multimodal nested sampling algorithm MultiNest (Feroz et al. 2009) and its Python wrapper PyMultiNest (Buchner et al. 2014). This algorithm explores the parameter space with coordinates θ defined by priors and uses a log-likelihood function $\mathrm{ln}({ \mathcal L })$ to compare the model to the data and estimate which set of parameters (the posteriors) best fits the data according to Bayesian statistics. For high-resolution retrievals, Brogi & Line (2019), Gandhi et al. (2019), and Gibson et al. (2020, 2022) make the CCF appear in the classical log-likelihood function. However, the versions of this function with and without the CCF term are strictly equivalent from a mathematical standpoint (see, e.g., Gibson et al. 2020, 2022), and none have a significant advantage over the others from a computational cost standpoint. The CCF is thus not necessary for high-resolution retrievals. This is well-known but often not sufficiently highlighted by the authors who make the choice to use the CCF version of the log-likelihood function. Assuming Gaussian noise, we follow Gibson et al. (2020) and use one of their intermediate steps while dropping constant terms. Gibson et al. (2020) also use two scaling parameters, α and β, for the model and the uncertainties, respectively. We assume that our model correctly parameterizes the atmosphere's scale height, so we set α to 1. To fit for β, we need to assume that our model is necessarily correct and that the uncertainties may be uniformly incorrect over orders, exposures, and wavelengths. Given the relative simplicity of our model, we cannot reasonably assume that it is necessarily correct. Moreover, we observed that the CARMENES uncertainties may be slightly overestimated, and not uniformly so (see Section 4.6). Hence, we also chose to fix β to 1 (we show results retrieving β in Appendix G). From the demonstration in Section 4.3.1, we use

$\begin{eqnarray}&&\mathrm{ln}({ \mathcal L })=-\displaystyle \frac{1}{2}\sum {\left(\displaystyle \frac{{P}_{{\boldsymbol{R}}}({\boldsymbol{F}})-{P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{\theta })}{{{\boldsymbol{U}}}_{{\boldsymbol{R}}}}\right)}^{2},\end{eqnarray} \tag{ 30 }$

where the sum is over every nonmasked element of the matrices (orders, exposures, wavelengths).²³ Note that P_R( F ) and U _R need to be calculated only once, while P_R( M _θ) is calculated at every log-likelihood evaluation, with every step of P_R performed on M _θ. From Section 4.3.1, if the model M _θ is accurate, the parameters are not degenerate, the uncertainties U _R are correctly evaluated, and Equation (29) is respected, then using Equation (30) should in principle guarantee an unbiased retrieval.

Our PyMultiNest retrievals use 100 live points, an evidence tolerance of 0.5—which is the recommended value—and a sampling efficiency of 0.8, which is recommended for parameter estimation.²⁴ We used the nonconstant efficiency mode of MultiNest, which is significantly slower than the constant efficiency mode but more accurate in constraint and evidence estimation (see, e.g., Chubb & Min 2022).

4.4. Exposure Selection

The planet total transit duration is given in Table 1. Our observations span nearly 12,000 s, meaning that a bit less than half of our exposures contain no transmission spectra from the planet. We can thus optimize the speed of our retrievals significantly by selecting only the relevant exposures. If we knew T₀ and T₁₄ perfectly well, we could just select the exposures corresponding to the planet's total transit. While we know these two parameters to a precision of ≈10 s, the CARMENES time stamps are only given "for guidance."²⁵ Hence, we chose to retrieve the offset of T₀—which we will abbreviate into T₀ for convenience from now—using a prior spanning ±300 s in our retrievals to provide us with a higher precision. We selected the exposures such that all possible total transits for all possible T₀ values are encompassed in the selection. This corresponds to 26 exposures, from the CARMENES times 2 458 004.385018 to 2 458 004.463298 BJD_TDB (days).

4.5. Order Selection

Not all of the collected CARMENES data contain valuable information. Using some orders might lead to biased results owing to the combination of the low mean Earth's transmittance in them, imperfections in our modeling of physical phenomena and in our preparing pipeline, and unaccounted noise sources. The latter includes unpredictable changes in the instrumental response, changes in the telluric lines due to random changes in the observing conditions, correlated noise, etc. This is true for a CCF analysis, but even more so for a retrieval analysis: these imperfections can significantly bias the retrieved values. In order to perform our retrievals in good conditions, we have to perform an order selection with the goal of maximizing the extraction of useful information and to minimize biases.

A conservative approach would be to simply use all the available data. However, doing so for our selected exposures gives a maximum significance CCF peak (following Section 4.2) at K_p = 97 km s⁻¹ and V_rest = − 2.6 km s⁻¹. For comparison we expect K_p ≈ 153 km s⁻¹ and V_rest ≈ − 4 km s⁻¹, as reported by, for example, Sánchez-López et al. (2019). In contrast, we retrieved sensible values with our Polyfit retrieval (see Appendix C), suggesting that retrievals with our framework are more stable than CCF analysis to this kind of perturbation.

We also tried several variations of approaches based on using the CCF S/N to detect a synthetic signal injected into the data, order by order. The orders selected by these approaches often corresponded to the one with the most important telluric contamination, which is the opposite of the desired behavior. We also developed an algorithmic order selection (see Appendix D), but it seems prone to biases and we decided not to use it. We note that more detailed works on this issue (Cheverall et al. 2023; Debras et al. 2023) reached a similar conclusion.

While we could have performed our analysis with retrievals on all orders, we ultimately chose to use the Alonso-Floriano et al. (2019) and Sánchez-López et al. (2019) selection to be able to compare our results both with our own CCF analysis and with these previous works. Following their approach, we removed orders 18–21 and 36–41, due to the very strong telluric water absorptions present at these wavelengths. This results in a CCF peak at K_p = 166 km s⁻¹ and V_rest = −5.2 km s⁻¹ (see Section 5.1), which is consistent with their results, given the error bars. We thus considered this selection as satisfactory and used it from here.

4.6. Uncertainties Check

From the setup described in Section 4.3 and Equation (30), there are three ways our retrieval could fail at retrieving the planet atmospheric parameters:

1.
Using an improper model, either because of a parameterization leading to biases, incorrect physics, or inaccurate line lists.
2.
Having residuals in the prepared data owing to an imperfect preparing pipeline.
3.
Using an inaccurate estimation of the uncertainties.

The line lists and physical models we are using are state of the art, although they may be improvable, which is beyond the scope of this paper. We discussed our methodology to limit and test parameterizations leading to biases in Sections 4.4 and 4.7. Because our pipeline is perfectible, we expect residuals to be left in the prepared data. We discussed our methodology to limit residuals in Section 4.5. We are thus left to check the accuracy of our data uncertainties.

To estimate whether the uncertainties we are using are accurate, we calculate the order- and time-dependent standard deviations along wavelength of P_R( F ) and compare them with the order- and time-dependent average of U _R along wavelength. In the case of Gaussian noise, these two values should roughly be equal; any deviation may be a hint of an under- or overestimation of the uncertainties. We found that, on average and considering only the selected exposures and our order selection, U _R was overestimated by a factor k_σ ≈ 1.15²⁶ compared to the standard deviations of P_R( F ). We do not observe such discrepancy when doing the same comparison with simulated noisy observations: for the retrieval described in Appendix E, we obtained k_σ = 0.996. The origin of this apparent overestimation is probably linked to the CARACAL pipeline. We did not investigate this further, chose to be conservative, and used the CARMENES uncertainties ( U _N) without modification in our retrievals. As a consequence, the constraints we will establish may be less precise than what they could optimally be. However, when evaluating the ${\chi }_{\nu }^{2}$ of the data against our retrieved models, we will correct the uncertainties by this factor. This is nevertheless an imperfect way of doing so, as k_σ may vary with order and exposure.

4.7. High-resolution Retrieval Setup Validation

4.7.1. BPM

Our nominal petitRADTRANS model is built following the steps described in Section 3.1 and the parameters listed in Tables 1 and 3. We used the aforementioned exposure selection and the order selection described in Section 4.5. Our simulated noisy data are built in the same way, following in addition the steps described in Section 3.2 and using U _N(t, λ) to build the noise matrix. We first test whether our preparing pipeline respects the condition described by Equation (29). Using our noisy simulated data, we obtain an absolute mean BPM of 2.75 × 10⁻⁵ over our 2,319,330 selected data points for that specific noise realization. The maximum of the BPM absolute value is 3.42 × 10⁻². The distribution of our BPM values is represented in Figure 11. For comparison, when applying no pipeline, i.e., when we replace R _F in Equation (29) by 1, we obtain an absolute mean BPM of 122 (with an absolute median of 3.51). When applying the pipeline only to the data (i.e., P_R( M _θ) replaced by M _θ in Equation (29)), we obtain an absolute mean BPM of 1.60 × 10⁻².

**Figure 11.** Probability density distribution of the decimal logarithm of the absolute value of the BPM of different setups with our order selection. Black: BPM without pipeline ( R _F = 1). Orange: BPM when applying the pipeline only to the data. Blue: standard BPM (Equation (29)). The vertical lines represent the 0.16, 0.50, and 0.84 quantiles of the distributions.
Download figure:
Standard image High-resolution image

Table 3. Mass Mixing Ratios Used for Our Simulated Data and Equivalent Volume Mixing Ratios

Species	MMR	log₁₀(VMR)
CH₄	3.4 × 10⁻⁵	−5.3
CO	1.8 × 10⁻²	−2.8
H₂O	5.4 × 10⁻³	−3.2
H₂S	1.0 × 10⁻³	−4.2
HCN	2.7 × 10⁻⁷	−7.6
NH₃	7.9 × 10⁻⁶	−6.0

Download table as: ASCII Typeset image

While these results are encouraging—the mean BPM of our setup is several orders of magnitude lower than the BPM without pipeline, and is arguably close to 0—it is difficult to draw a conclusion on the validity of "Polyfit" solely from the BPM. Indeed, this indicator is useful for comparisons, but we lack values from other trusted preparing pipelines following Equation (15), which is not the case for, e.g., SysRem. We develop this more in Section B.2.

4.7.2. Simulated Retrievals

Setup : To confirm the BPM results, we perform a retrieval with our framework on noiseless simulated data. That is, we take N (t, λ) in Equation (12) as a matrix of zeros, but we keep the uncertainties as U _N(t, λ). The reason for setting the noise to 0 is that we are interested in estimating the retrieval setup bias, not the bias introduced by a random realization of the noise. Typically, we would need to run many retrievals with different N (t, λ) realizations so as to isolate and neglect the effect of noise. However, using our noiseless simulated data provides us directly with the desired results. This was also highlighted by Feng et al. (2018, see their Section 5.2), who showed that the combination of a large number of noisy simulated data retrievals converges to the posteriors obtained from the retrieval of noiseless simulated data. In order to obtain a rough estimation of biases introduced by 3D effects, we also retrieve the pRT-Orange 3D model described in Section 3.4 with the same setup as for our simulated 1D model. The setup and results are compiled in Table 4, and resulting posterior probability distributions are shown in Figure 12.

**Figure 12.** Posterior probability distributions of our simulations. Blue: pRT 1D simulation using Polyfit. Orange: pRT-Orange 3D simulation using Polyfit. The solid red lines correspond to the model input values for the 1D model. The dashed vertical blue lines represent the 0.16, 0.50, and 0.84 quantiles (i.e., median and 1σ error bar) of the distributions for the 1D simulation. On top of each column is indicated the median and the 1σ error bar of the retrieved parameters for the 1D simulation. The contours in the 2D histograms correspond to the 1σ, 2σ, and 3σ contours.
Download figure:
Standard image High-resolution image

Table 4. Setup and Results of Our Noiseless 1D Simulated Data Retrieval with Polyfit

Parameter	Prior	Posterior	Input
T (K)	${ \mathcal U }(100,4000)$	${1188}_{-96}^{+106}$	1209
[CH₄]	${ \mathcal U }(-12,0)$	$-{7.15}_{-2.96}^{+2.81}$	−4.47
[CO]	${ \mathcal U }(-12,0)$	$-{5.56}_{-3.82}^{+3.48}$	−1.74
[H₂O]	${ \mathcal U }(-12,0)$	$-{1.41}_{-0.87}^{+0.71}$	−2.27
[H₂S]	${ \mathcal U }(-12,0)$	$-{5.62}_{-3.79}^{+2.77}$	−3.00
[HCN]	${ \mathcal U }(-12,0)$	$-{8.07}_{-2.63}^{+2.77}$	-6.57
[NH₃]	${ \mathcal U }(-12,0)$	$-{7.75}_{-2.73}^{+2.80}$	−5.10
${\mathrm{log}}_{10}({P}_{c})$ (Pa)	${ \mathcal U }(-5,7)$	${5.35}_{-1.43}^{+0.99}$	5.00
${\mathrm{log}}_{10}({\kappa }_{0})$	${ \mathcal U }(-6,2)$	$-{3.41}_{-1.72}^{+2.29}$	−3.00
γ	${ \mathcal U }(-12,1)$	$-{7.16}_{-3.09}^{+4.43}$	−4.00
${\mathrm{log}}_{10}(g)$ (cm s⁻²)	${ \mathcal U }(2.5,4.0)$	${3.36}_{-0.06}^{+0.06}$	3.35
K_p (km s⁻¹)	${ \mathcal U }(70,250)$	${152.48}_{-4.70}^{+5.12}$	152.53
V_rest (km s⁻¹)	${ \mathcal U }(-20,20)$	$-{0.04}_{-0.68}^{+0.72}$	0.00
${{ \mathcal R }}_{{\rm{C}}}$	${ \mathcal U }({10}^{3},{10}^{5})$	$75\,{100}_{-9\,500}^{+10\,700}$	80 400
T₀ (s)	${ \mathcal U }(-300,300)$	$-{3}_{-121}^{+122}$	0

Note. ${ \mathcal U }({x}_{1},{x}_{2})$ denotes a uniform prior: a prior with a constant positive probability density between its boundaries x₁ and x₂, and a probability density = 0 outside of its boundaries. The values for T₀ are given relative to its value in Table 1.

Download table as: ASCII Typeset image

1D Results : For the retrievals on the 1D simulated data, the probability distributions for most of the retrieved parameters peak at values very close to the true model values. It can be noted that the H₂O abundance posterior's median is roughly 1σ away from the true value. This is mainly caused by the effect of other retrieved parameters combined with our preparing pipeline's imperfection: the simulated data are deformed by D , while the forward models are not. Hence, a prepared forward model with the true parameters will not be exactly equal to the prepared simulated data. In Appendix F, we retrieve the same simulated data with less free parameters. In that case, the H₂O abundance posterior peak is not significantly shifted.

3D Results: For the retrievals on the 3D simulated data, we can make several remarks:

1.
The resolving power is biased toward lower values, while the surface gravity is biased toward higher ones. This is caused by the planet rotation and winds, which deform the line shapes. This effect is displayed in Figure 13. We note that in this configuration we did not retrieve a lower temperature or a higher H₂O MMR as was highlighted by MacDonald et al. (2020). In their case it was caused by an asymmetry in temperature and composition at the terminators, with most of the effect coming from the difference in composition. Our 3D model partially reproduces this asymmetry (see Figure 14), but the difference in H₂O MMR may be too small for the effect to be significant.
2.
The posteriors are overall wider, probably because the 1D model is not able to fit as accurately the spectral features deformed by the 3D effect.
3.
The simulated winds and planet rotation are not translated in any significant shift in V_rest. This is due to the combined effect of rotation and winds canceling out in this model. It thus cannot reproduce the rest velocity blueshifts previously observed.
4.
Otherwise, the other posteriors look similar to the one obtained with the 1D simulated model.

Posterior Analysis: Some comments can be made on the shape of some posteriors, which are not all Gaussian-like. For the CH₄, CO, H₂S, HCN, and NH₃ MMRs, this is due to a combination of low sensitivity of the spectral shape to these parameters (the selected wavelengths cover no or weak lines of these species) and the sensitivity of the model to the atmospheric mean molar mass. Beyond a given abundance, some species lines may start to imprint distinguishable features on the spectra, while the effect was inconclusive at lower abundances. Even if a species' lines have a negligible effect on the spectrum, increasing this species' MMR will in turn increase the mean molar mass, which has an impact on the spectrum. Hence, we obtain uniform posterior probability distributions with a sharp cutoff near the true species' MMR value. This is very similar for the cloud deck pressure: a cloud deck pressure below ≈10⁴ Pa will have a negligible effect on the spectral shape but stops being negligible above that value. The explanation is again the same for κ₀ and γ, but these two parameters are highly correlated: a high haze opacity is possible only with a steep spectral slope in order to avoid a strong effect on the spectral shape.

**Figure 13.** Comparison between our different pRT models. Black: 1D model described in Section 3.1. Purple: 3D pRT-Orange model without rotation and winds. The 3D spectrum is different from the 1D spectrum because the former has 3D geometry with varying temperature profiles and uses equilibrium chemistry. Orange: 3D pRT-Orange model including rotation and winds. While we observe some shifts in the lines when only including rotation, the GCM's strongly superrotating equatorial jet separates these into two small lines left and right from the central line position, where the blueshifted one is slightly stronger, as it corresponds to the hotter morning terminator. A dominating central peak remains in the middle, corresponding to the higher latitudes, which are above the equatorial jet and therefore dominated by the planet's rotation.
Download figure:
Standard image High-resolution image

**Figure 14.** Temperatures of the 3D model used (Drummond et al. 2018) at a pressure of 10⁴ Pa, corresponding roughly to the pressure of the maximum average contribution for this CARMENES data set. The dotted lines corresponds to the leading (or morning) terminator and to the trailing (or evening) terminator. In the pRT-Orange model used, the longitude-dependent temperatures are set, at all latitudes, to the corresponding equatorial (latitude = 0°) temperatures of this model.
Download figure:
Standard image High-resolution image

Note that we did not retrieve the planet radius, which is often a free parameter in low-resolution transmission spectra analysis. In transmission spectroscopy, the planet radius has two main effects: offset the transit radius and change the scale height (by affecting the rate of change of g in the atmosphere), e.g., change the amplitude of the lines. However, our preparing pipeline "normalizes" the spectrum, so the offsetting effect of the planet radius is negated. The amplitude effect remains, but large variations of radius are necessary to affect this property significantly.

We note that Debras et al. (2023) also performed a posterior analysis using simulated HD 189733 b SPIRou transmission spectra, which cover a similar spectral range at a similar resolving power. They, however, used different techniques from ours to build the simulated data and prepare them, as well as a different log-likelihood function. Their results notably show posterior peaks more than 1σ away from the truth in some cases. This includes the H₂O abundance posterior, which seems biased toward lower values. Due to the choice by Debras et al. (2023) to inject their synthetic model into real data, it is unclear whether these biases arise from the technique used or are noise induced.

Validation Summary: To summarize, our setup is in principle essentially unbiased assuming that our model accurately describes the data. If this assumption holds, we should be able to constrain ${{ \mathcal R }}_{{\rm{C}}}$ , K_p, V_r, g, and T, as well as the H₂O MMR. We do not expect to be able to have a significant detection of the other tested molecules, as well as of the cloud and haze parameters. However, our 1D model probably misrepresents some 3D effects. Using our pRT-Orange simulated 3D retrieval results, we can expect a bias of ≈−2σ for ${{ \mathcal R }}_{C}$ . Due to the relative simplicity of pRT-Orange, however, we expect to have missed or underestimated some 3D-induced biases (e.g., the effect of clouds and their patchiness, or the impact of the latitude-dependent atmospheric structure).

5. Results

In this section we present the results of our CCF analysis and retrieval framework for the CARMENES data described in Section 2.1.

5.1. Cross-correlation

The resulting S/N map as a function of K_P and V_rest (the latter obtained as V_obs for each K_P value) is shown in Figure 15. A slice through the K_P of the maximum significance 1D CCF peak is also shown. The map reveals a region of high cross-correlation values, peaking at an S/N of ≈12.4 for a K_P of 167 ± 34 km s⁻¹ and a V_rest = −5.2 ± 2.6 km s⁻¹. The uncertainties were determined by a reduction of 1 in the S/N of the obtained 1D CCF. This result is consistent with the analyses of this data set presented in Alonso-Floriano et al. (2019) and Sánchez-López et al. (2019), considering the significantly different telluric corrections and exposure selections. That is, the expected K_p of HD 189733 b, computed from the literature (152.1 ± 2.9 km s⁻¹) is within the uncertainty intervals of our highest-significance signal, and the CCF peak position is consistent with the previously reported blueshift, suggesting winds flowing from the planet's dayside to its nightside. We note that the S/N map shows a trace of high cross-correlation that spans to low K_p values. This is caused by two regions of high values of cross-correlation appearing at orbital phases between approximately −0.018 and −0.005 (see Figure 10). Although these regions align well with the expected exoplanet trail in the first half of the transit, we cannot discard the possibility of these values being affected by telluric residuals, as the exoplanet velocities with respect to Earth at these times are low. As a safety test, we run our CCF analysis excluding all spectra at orbital phases below −0.01 and find that the exoplanet signal persists, at an S/N of ∼8, a K_p of ∼148 km s⁻¹, and a V_rest of about −6.5 km s⁻¹, with fewer telluric residuals in the K_p − V_rest map. The lower significance we obtained in this case was due to avoiding the region where the exoplanet signal with respect to Earth is the lowest, so this measurement is in principle less affected by overlapping telluric H₂O lines. However, we also note that the highest-S/N exposures and lower target air masses also occurred at these phases of the early transit, so excluding these spectra from the analysis also potentially removed some exoplanet contributions.

5.2. Retrievals

We analyze the data using the procedures and base model described in Sections 3.1, 4.1, and 4.3. We start with retrieval setups similar to that of Boucher et al. (2021), which we label "P-01," retrieving only K_p, V_rest, T, P_c, and the H₂O MMR. The corresponding corner plot is displayed in Figure 16. With Polyfit we are able to constrain all the parameters and to put a lower limit on the pressure of an opaque cloud layer.

**Figure 16.** Posterior probability distributions for P-01, using 100 live points. The dashed vertical blue lines represent the 0.16, 0.50, and 0.84 quantiles (i.e., median and 1σ error bar) of the distributions obtained with Polyfit. Indicated on top of each column are the median and the 1σ error bar of the retrieved parameters using Polyfit. The contours in the 2D histograms correspond to 1σ, 2σ, and 3σ.
Download figure:
Standard image High-resolution image

We know from our retrieval on simulated data (Section 4.7) that we should be able to put constraints on more parameters than those retrieved with our "01" setup. We thus performed additional retrievals including these parameters, which we label "1×." In order to test different hypotheses and to estimate the significance of retrieving some parameters, we compare the difference in log-evidence ( ${\rm{\Delta }}\mathrm{ln}({ \mathcal Z })$ ) with respect to a model with a cloud layer and where all species MMRs are set to 10⁻¹², which we label "00." The results are listed in Table 5, and the corresponding posteriors are displayed in Figure 17. A summary of our results can be found in Table 6.²⁷

**Figure 17.** Posterior probability distributions for all of the setups in Table 5. The dashed vertical blue lines represent the 0.16, 0.50, and 0.84 quantiles (i.e., median and 1σ error bar) of the distributions obtained. Indicated below each panel are the median and the 1σ error bar of the retrieved parameters using the setup. The units are the same as in Figure 12. The black rectangle highlights our fiducial setup.
Download figure:
Standard image High-resolution image

Table 5. Log-evidence, Log-likelihood, and Reduced χ² of Our Retrievals

Retrieved Parameter Setups	${\rm{\Delta }}\mathrm{ln}({ \mathcal Z })$ ^a	Best-fit $\mathrm{ln}({ \mathcal L })$	Best-fit ${\chi }_{\nu }^{2}$ ^b	k_σ
(P-00): T, P_c, K_p, V_rest, ${{ \mathcal R }}_{C}$ , T₀	0.00	−842,400.10	1.010	1.146
(P-01): T, H₂O, P_c, K_p, V_rest	202.31	−842,180.77	1.009	1.146
(P-10): T, all species^c, P_c, κ₀, γ, g, K_p, V_rest, ${{ \mathcal R }}_{C}$ , T₀	1184.00^d	−841,173.72	1.008	1.146
(P-11): T, all species^c, P_c, κ₀, γ, K_p, V_rest, ${{ \mathcal R }}_{C}$ , T₀	199.48	−842,174.70	1.009	1.146
(P-12): T, all species^c, K_p, V_rest, ${{ \mathcal R }}_{C}$ , T₀	203.01	−842,175.02	1.009	1.146
(P-13): T, CO, H₂O, H₂S, K_p, V_rest, ${{ \mathcal R }}_{C}$ , T₀	205.69	−842,175.17	1.009	1.146
(P-14): T, H₂O, P_c, K_p, V_rest, ${{ \mathcal R }}_{C}$ , T₀	204.70	−842,174.76	1.009	1.146
(P-15): T, H₂O, P_c, K_p, V_rest, ${{ \mathcal R }}_{C}$	203.65	−842,178.61	1.009	1.146
(P-16): T, H₂O, P_c, K_p, V_rest, T₀	203.56	−842,177.16	1.009	1.146
(P-17): T, H₂O, κ₀, γ, K_p, V_rest, ${{ \mathcal R }}_{C}$ , T₀	205.06	−842,175.13	1.009	1.146
(P-18): T, H₂O, K_p, V_rest, ${{ \mathcal R }}_{C}$ , T₀	206.37	−842,174.94	1.009	1.146

Notes. If a parameter does not appear in a setup: if it is a species, an MMR of 10⁻¹² is used; for the haze parameters, the values κ₀ = 10⁻⁶ and γ = − 12 are used; for P_c, we use 10⁷ Pa; otherwise, the value in Table 1 is used. The uncertainty on $\mathrm{ln}({ \mathcal Z })$ reported by MultiNest is less than 0.01 in all cases.

^aCompared to a setup without any species (MMR = 10⁻¹²).^bCalculated from $-2{k}_{\sigma }^{2}\mathrm{ln}({ \mathcal L })/N$ $-2{k}_{\sigma }^{2}\mathrm{ln}({ \mathcal L })/N$ , where N is the number of nonmasked data points.^cCH₄, CO, H₂O, H₂S, NH₃.^dEdge of prior convergence.

Download table as: ASCII Typeset image

Table 6. Summary of Our Retrieved Values

Parameter	Posterior	Setup
${{ \mathcal R }}_{{\rm{C}}}$ (10³)	${57.6}_{-8.6}^{+10.4}$		P-18
K_p (km s⁻¹)	${157.1}_{-11.6}^{+15.5}$		P-18
V_rest (km s⁻¹)	$-{5.51}_{-0.53}^{+0.66}$		P-18
${\mathrm{log}}_{10}(g)$ (cm s⁻²)	⋯		⋯
T (K)	${661}_{-11}^{+6}$		P-18
[CH₄]	⪅−5.79	⪅−6.62 VMR	P-12
[CO]	⪅−2.64	⪅−3.71 VMR	P-12
[H₂O]	$-{1.50}_{-0.16}^{+0.12}$	$-{2.38}_{-0.16}^{+0.12}$ VMR	P-18
[H₂S]	⪅−4.50	⪅−5.65 VMR	P-12
[HCN]	⪅−5.64	⪅−6.69 VMR	P-12
[NH₃]	⪅−6.35	⪅−7.20 VMR	P-12
${\mathrm{log}}_{10}({P}_{c})$ (Pa)	⪆4.7		P-14
${\mathrm{log}}_{10}(\kappa )$	⋯		⋯
γ	⋯		⋯

Note.The indicated error bars correspond to ±1σ. When a limit is indicated, it corresponds to the 95% upper or lower limit. Ellipsis points indicate that we were not able to put constraints on the parameter. The VMR values are obtained from an atmosphere composed of the median H₂O MMR value and the 95% MMR upper limits of the other species and obeying the same rules as in step 1 of Section 3.1.

Download table as: ASCII Typeset image

The P-10 setup had g converging toward the lower edge of its prior. This means that the width of the g posterior converged to 0, which could affect the posterior of other parameters. The log-evidences and results reported for this setup should not be considered as accurate.

In general, the log-likelihood does not change significantly with the number of retrieved parameters, but the log-evidence roughly decreases, meaning that these removed parameters—which we set if they are not retrieved such that they do not significantly impact the spectrum (see note in Table 5)—are not necessary to describe the data. The exception are the resolving power and midtransit time, which increase slightly the log-evidence, although not significantly so ( ${\rm{\Delta }}\mathrm{ln}({ \mathcal Z })\approx 1$ , i.e., ≈2σ). The P-18 setup is the most favored, one with a ${\rm{\Delta }}\mathrm{ln}({ \mathcal Z })$ of 206 (≈20σ), compared to the featureless model P-00. For comparison, we expected a ${\rm{\Delta }}\mathrm{ln}({ \mathcal Z })$ of 77 (≈13σ) for retrievals on simulated data in the same conditions, using a H₂O MMR of 3.2 × 10⁻², no hazes, and no clouds. Because it is the most favored setup, we will consider results from P-18 as our fiducial ones. However, since this favoritism is not significant ( ${\rm{\Delta }}\mathrm{ln}({ \mathcal Z })\approx 7$ , i.e., ≈4σ with P-11), we do not reject the other setups, and we will discuss their results in complement of P-18.

Most of our results with Polyfit are compatible with K_p ≈ 155 km s⁻¹, V_rest ≈ − 5 km s⁻¹, a H₂O MMR of ≈0.03, and a temperature of ≈650 K. We find upper limits for the abundance of the other species included in our model and for the top pressure of an opaque cloud layer. We also find no evidence for hazes. In all cases, the ${\chi }_{\nu }^{2}$ is slightly above 1, with a value of 1.009 excluding P-10. This value does not seem consistent with a strong overfitting or underfitting of the data. However, as discussed in Section 4.6, the ${\chi }_{\nu }^{2}$ is hard to accurately estimate in our case, so its value should be considered with caution. In Figure 18, we display a qualitative comparison of a selection of our models, with their resolution lowered, with the Hubble Space Telescope (HST) WFC3 data from Kilpatrick et al. (2020).

**Figure 18.** HD 189733 b low-resolution ( ${ \mathcal R }\approx 500$ ) transit depth for a selection of models. Black: Exo-REM model with a 10 times solar metallicity. Dotted blue: best-fit model from the P-01 setup. Solid blue: best-fit model from the P-18 setup. Red: HST WFC3 data from Kilpatrick et al. (2020). The red vertical bars represent the 1σ uncertainties on the HST data, and the red horizontal bars represent the width of the data wavelength bins. The dark-blue and light-blue areas represent the envelope of the 1σ and 3σ models for P-18, respectively. The gray areas represent the wavelength range of the orders discarded from our selection. The planet radii of the Exo-REM model (at 10⁵ Pa) and the P-models (at 10³ Pa) have been increased by 1200 and 3600 km, respectively, compared to the value in Table 1.
Download figure:
Standard image High-resolution image

6. Discussion

6.1. Retrieved Kinematics Parameters

6.1.1. Orbital Velocity Semiamplitude

All of our Polyfit results, except those from P-10, are compatible at 1σ with the values of K_p found by Alonso-Floriano et al. (2019) and Sánchez-López et al. (2019) ( ${K}_{p}={160}_{-33}^{+45}$ km s⁻¹ and ${K}_{p}={147}_{-28}^{+33}$ km s⁻¹, respectively), who analyzed the exact same data set. Our retrieved value of K_p with the P-18 setup ( ${157.1}_{-11.6}^{+15.5}$ km s⁻¹) is also fully consistent with our computed value from the literature (K_p = 152.1 ± 2.9 km s⁻¹, from Table 1), as well as the values retrieved by, e.g., Brogi et al. (2016), Damiano et al. (2019), Boucher et al. (2021), Klein et al. (2023), and Finnerty et al. (2024) ( ${179}_{-21}^{+22}$ km s⁻¹, ${167}_{-21}^{+32}$ km s⁻¹, 151 ± 10 km s⁻¹, ${151.8}_{-12.6}^{+15.8}$ km s⁻¹, and 160 ± 8 km s⁻¹, respectively, with their isothermal temperature profile model).

Our retrieved K_p for HD 189733 b is fully consistent with our CCF results, considering the uncertainties of both measurements. However, we note that the retrieved value is more constrained, with less than half the error bars, which seems to indicate that our framework is able to better study the semiamplitude of the orbital velocity than the regular CCF analyses.

6.1.2. Rest Velocity Shift

Like many previous studies (see previous paragraph for example), we report a significantly negative V_rest ( $-{5.51}_{-0.53}^{+0.66}\,\mathrm{km}\,{{\rm{s}}}^{-1}$ ), that is, an overall blueshift of HD 189733 b spectral absorption in our selected exposures. We obtain a similar value with our CCF analysis, but with four times larger error bars compared to our retrieval analysis. This value is consistent with the results from Alonso-Floriano et al. (2019), Damiano et al. (2019), Sánchez-López et al. (2019), Boucher et al. (2021), and Klein et al. (2023) (−3.9 ± 1.3 km s⁻¹ for the first two, $-{4.0}_{-1.8}^{+2.0}$ km s⁻¹ for the third, $-{4.62}_{-0.39}^{+0.41}$ km s⁻¹ for the fourth, and $-{4.73}_{-0.56}^{+0.61}$ km s⁻¹ for the last). This is, however, significantly more than the values reported by Louden & Wheatley (2015), Brogi et al. (2016), and Flowers et al. (2019) ( $-{1.9}_{-0.6}^{+0.7}$ km s⁻¹ averaged over their full transit, $-{1.7}_{-1.2}^{+1.1}$ km s⁻¹, and $-{1.4}_{-0.9}^{+0.8}$ km s⁻¹, respectively) and significantly less than the value retrieved by Wyttenbach et al. (2015) (8 ± 2 km s⁻¹). We summarize these results in Figure 19. Note that the results obtained by Alonso-Floriano et al. (2019) and Sánchez-López et al. (2019), using the same data as ours, were obtained without applying our time stamp correction (see Section 2.1).

**Figure 19.** Summary of the estimated rest velocity of HD 189733 b in the literature. The marker shape indicates the source of the data. In parentheses, next to the data source, is displayed the spectral band coverage of the data. The color indicates which absorber was used to study the data. The label "CO + H₂O" indicates that these two species' opacities were present in the model used.
Download figure:
Standard image High-resolution image

The cause of this rest velocity shift has been attributed to a combination of the winds in a superrotating equatorial jet, the rotation of the planet, and, most importantly, day-to-night winds. The overall effect on HD 189733 b has been modeled by Flowers et al. (2019). They showed that the temperature difference between the dayside and the nightside of the planet creates winds redistributing heat from the former to the latter. In transmission spectra, this means that winds are overall oriented toward the observer, thus blueshifting the spectral lines. In addition, the equatorial jet moves the hottest point of the planet to the east of the substellar point. This creates an asymmetry between the terminators, with the eastern, or trailing, one being hotter and thus more inflated compared to the leading one. During the transit, the trailing terminator, with the jet oriented toward the observer, thus occupies a larger area on the stellar disk than the leading terminator; hence, the properties of the former tend to dominate the overall spectrum. However, according to the Flowers et al. (2019) models, these effects seem insufficient to explain our retrieved value. Moreover, the retrieval on our 3D model did not result in any significant shift in velocity. The blueshift we are observing thus seems difficult to explain with these models. We note that Louden & Wheatley (2015) retrieved a trailing terminator velocity of $-{5.3}_{-1.4}^{+1.0}\,\mathrm{km}\,{{\rm{s}}}^{-1}$ . We can interpret our retrieved value as the trailing terminator dominating our observations more than expected, as also suggested by Boucher et al. (2021), but further modeling work is required to verify this hypothesis.

As noted above, it seems that there are discrepancies in the measurements of V_rest in the literature. This has been previously reported, comparing the value found by Wyttenbach et al. (2015) with those from other works. Two different interpretations were proposed: Louden & Wheatley (2015) showed that the high blueshift of Wyttenbach et al. (2015) could have originated from the Rossiter−McLaughlin effect, leading to the detection of a spurious signal, while Brogi et al. (2016) suggested that the difference could arise from the probing of a different atmospheric regime. We can expand on this discussion by highlighting the differences and similarities between those works. From Figure 19, it seems that there is no clear correlation between the estimated V_rest and the studied species or spectral band. Strong wind speed variation with altitude or strong chemical composition variation with longitude (e.g., H₂O depletion on one terminator) does not seem to be expected from 3D models (e.g., Flowers et al. 2019). The explanation that the different shift observed comes from probing species experiencing different kinematic regimes seems thus unlikely. If these differences are real and do not arise from from data reduction or data analysis artifacts, an alternative explanation could be meteorological variations, but a much more detailed study would be necessary to confirm this.

A point to highlight is that V_rest is correlated with T₀. The consequence is that if the latter is not well evaluated, it may lead to an inaccurate estimation of V_rest, which might explain the discrepancy between some of the studies. Note also that V_rest and T₀ are not degenerate: while their shifting effect on the lines is identical, T₀ has an additional indirect effect on the lines' amplitude within partial transit exposures, because it sets the time of ingress and egress.

6.1.3. Resolving Power

While not strictly speaking a kinematic parameter, the resolving power is a useful proxy to estimate line broadening due to atmospheric dynamics during the transit. As mentioned in Section 4.7, due to the combined Doppler effect of the planet rotation and winds, the spectral lines of the transit spectrum appear broadened. This can be roughly modeled with a lower resolving power compared to what is expected from the instrument. With our 3D model retrieval we expected to retrieve ${{ \mathcal R }}_{C}={\rm{52,000}}\pm {\rm{17,000}}$ . This is fully consistent with what we retrieve with our P-18 setup, ${{ \mathcal R }}_{C}={{\rm{58,000}}}_{-9\,000}^{+10\,000}$ , and roughly half the CARMENES resolving power. Brogi et al. (2016) observed instead that their retrieval favored larger resolving power values than those of the instrument they used, but it was attributed to the properties of the CCF analysis they performed, which our retrieval does not have.

6.1.4. Midtransit Time

The midtransit time has, to our knowledge, never been retrieved in studies similar to ours, most likely because this parameter is usually very well-known from transit light-curve analyses. However, its correlation and nondegeneracy with V_rest (see Section 6.1.2), which can be used to estimate wind speeds in GCM simulations, makes it an important parameter to retrieve. Moreover, K_p is usually also a well-known parameter, but it is ubiquitously retrieved. Therefore, there appears to be no strong reason to not also explore the effects from retrieving T₀. It can be used both as a sanity check for the data time stamps, for the modeled ingress/egress weighting function, and to more accurately measure V_rest.

With our P-18 setup, we retrieve ${T}_{0}=-{47}_{-21}^{+13}$ s, which is fully consistent with the expected value, taking error bars into account, and represents significantly less than the average duration of an exposure. Therefore, we interpret this result as an additional confirmation that we are retrieving the planet's signal and that our data time stamps are accurate.

6.2. Retrieved Atmospheric Parameters

6.2.1. Temperature

The temperature that we retrieve— ${661}_{-11}^{+6}$ K with our P-18 setup—is significantly lower than the equilibrium temperature of the planet, the temperature that we could expect according to our 1D self-consistent model, and the temperatures at the terminator in our 3D model. As mentioned in Section 4.7, an overall colder retrieved temperature can be explained by asymmetric terminators (MacDonald et al. 2020). Moreover, our result is fully consistent with the results from other studies of transmission spectra, for example, Boucher et al. (2021) and Klein et al. (2023) ( $T={698}_{-172}^{+80}$ K and ${698}_{-155}^{+147}$ K, respectively, in their isothermal analysis) or Tsiaras et al. (2018) (T = 621.49 ± 139.05 K). This value is also consistent with the best-fit value found by Brogi et al. (2016) (T ≈ 500 K) and with the value found by McCullough et al. (2014) (T ≈ 700 K), although no error bars were provided in these works. Our value is also roughly 2σ away from the value found by Madhusudhan et al. (2014b) ( $T={787}_{-47}^{+511}$ K).

However, the temperature that we retrieve with our simulated 3D data is $T={1225}_{-253}^{+409}$ , which is not consistent with our findings in the real data. It can be expected that our 3D model does not perfectly represent HD 189733 b, thus explaining an underestimated bias in temperature. Our retrieved temperature is also not consistent with that obtained by Zhang et al. (2020), who retrieved $T={1089}_{-120}^{+110}{\rm{K}}$ . The explanation could be their use of equilibrium chemistry and the prevalence of more species in their spectral ranges (0.3–30 μm): the specific combination of abundances that are required to fit their data might be forbidden by low temperatures. In contrast, our data are dominated by mainly H₂O features, and the temperature and H₂O MMR can vary independently. For these reasons, it is likely that the value of T retrieved by Zhang et al. (2020) is more accurate than ours.

6.2.2. H₂ O abundance

Our retrieved H₂O ${\mathrm{log}}_{10}$ (MMR)—that is, $-{1.50}_{-0.16}^{+0.12}$ with our P-18 setup, corresponding to a log₁₀ VMR of $-{2.39}_{-0.16}^{+0.12}$ —is compatible with an atmospheric metallicity of ≈10 times solar and a solar (0.55) carbon-to-oxygen (C/O) ratio (see Table 7), which is similar to that of Saturn (Atreya et al. 2020). It is also compatible with an atmospheric metallicity of ≈3 times solar and a C/O ratio of 0.1.

Table 7. log₁₀ VMRs at 10⁴ Pa of a HD 189733 b Exo-REM Model for Different Metallicities and Carbon-to-oxygen Ratios

	Z = 1	Z = 3	Z = 10
Species	C/O = 0.55	C/O = 0.1	C/O = 0.55
CH₄	−5.1	−6.7	−5.7
CO	−3.3	−3.6	−2.3
H₂O	−3.6	−2.7	−2.6
H₂S	−4.6	−4.1	−3.6
HCN	−7.6	−8.5	−7.6
NH₃	−6.1	−6.1	−5.8

Note. The VMRs with a C/O ratio of 0.1 are obtained by fixing the oxygen elemental abundance (O/H = Z) and lowering the carbon elemental abundance compared to its solar abundance (C/H = 0.18 Z). Note that the same C/O ratio can be obtained by fixing the the carbon elemental abundance and increasing the oxygen elemental abundance, but it produces different VMRs (typically higher abundances of CO and lower abundances of H₂O).

Download table as: ASCII Typeset image

Comparing with other works using retrievals on high-resolution data, our retrieved H₂O VMR is consistent with Klein et al. (2023) ( $-{2.95}_{-0.53}^{+0.75}$ ) and Finnerty et al. (2024) (−2.9 ± 0.4). In contrast, it is not consistent with Boucher et al. (2021), who obtained a H₂O ${\mathrm{log}}_{10}$ (VMR) of −4.4 ± 0.4. However, their model preparation technique is based on model injection in a reconstruction of the observed data, which differs significantly from our Equation (28). It is possible that this technique introduces a bias in their estimation of the H₂O abundance, but a proper benchmarking is necessary to confirm this.

For other works using high-resolution data but not providing error bars, our retrieved H₂O VMR is also ⪆5σ from the best-fit values found by Danielski et al. (2014) (H₂O ${\mathrm{log}}_{10}($ VMR) ≈ −4 to −3.3) and McCullough et al. (2014) (H₂O ${\mathrm{log}}_{10}($ VMR) ≈ −3.3), although Danielski et al. (2014) only tested models with T ⪆ 1000 K. The H₂O ${\mathrm{log}}_{10}($ VMR) estimated by Alonso-Floriano et al. (2019) and Sánchez-López et al. (2019) is −5, while it is ≈−3 for Brogi et al. (2016) and Flowers et al. (2019),²⁸ but these values come from a limited parameter space exploration and from a CCF analysis, both likely obstacles for an accurate abundance estimation.

Comparing with low-resolution data analysis, our retrieved H₂O abundance is fully consistent with the value found by Tsiaras et al. (2018) (H₂O ${\mathrm{log}}_{10}($ VMR) = −2.51 ± 0.9) and the value found by Zhang et al. (2020) at 10³ Pa (10 mbar, H₂O ${\mathrm{log}}_{10}($ VMR) = −2.5 ± 0.3). In contrast, our result is not consistent with the values found by Madhusudhan et al. (2014b), Pinhas et al. (2018), and Welbanks et al. (2019) (H₂O ${\mathrm{log}}_{10}($ VMR) of $-{5.13}_{-0.18}^{+1.68}$ ,²⁹ $-{5.04}_{-0.30}^{+0.46}$ , and $-{4.66}_{-0.33}^{+0.35}$ , respectively). For Pinhas et al. (2018), Zhang et al. (2020) suggested that the high number of retrieved parameters they used may have led them to obtain an overfitting solution. A similar explanation may apply to Welbanks et al. (2019), as they use the same retrieval setup as Pinhas et al. (2018), and to Madhusudhan et al. (2014b), as they retrieved 13 parameters on ≈30 data points. In addition, we note that all of the above works use the same HST WFC3 data reduced by McCullough et al. (2014). In contrast, the previously mentioned low-resolution studies that show an agreement with our value use either an independent reduction method (Tsiaras et al. 2018) or both a different data set and an independent reduction method (Zhang et al. 2020).

Finally, we also note that studies analyzing the emission spectrum of HD 189733 b, that is, the dayside of the planet, generally find H₂O ${\mathrm{log}}_{10}($ VMR) ranging from ≈−3 to <−5 (e.g., Birkby et al. 2013; Barstow et al. 2014; Line et al. 2014; Brogi et al. 2018; Cabot et al. 2018; Brogi & Line 2019; Finnerty et al. 2024), but making comparison with these studies is not necessarily straightforward as they probe a different region of the planet with different sensitivities to some parameters.

We also note that there is an anticorrelation between the temperature and the H₂O MMR. In Section 6.2.1 we estimated that our retrieved temperature is likely underestimated. Consequently, our retrieved H₂O MMR should be regarded as an upper estimation. However, it is difficult to quantify this bias. In addition to this anticorrelation, as discussed above, 3D models predict an underestimation of the retrieved H₂O abundance in case of difference in chemical composition between the two terminators (MacDonald et al. 2020). The 3D-induced bias and the parameter correlation can add to or compensate each other; hence, estimating the overall effect would require a more detailed study.

6.2.3. Other Species Abundances

With our P-12 setup we can derive 95% upper limits for the abundance of other species, summarized in Table 6.

These constraints are compatible with a slightly subsolar-metallicity atmosphere simulated by our Exo-REM model (see Table 7). For CH₄, CO, and HCN, those are also compatible with an atmospheric C/O ratio of 0.1, which would be consistent with the H₂O VMR we retrieve. We note that a subsolar C/O ratio (0.3 ± 0.1) and supersolar O/H ratio ( ${6}_{-4}^{+9}$ times solar) were derived by Finnerty et al. (2024). The relatively low upper limits for H₂S and NH₃ could also be interpreted as a sulfur and nitrogen depletion, or as a hint for unknown chemical processes. These constraints should be taken cautiously, though, as our retrieval also favors an atmosphere consisting only of H₂, He, and H₂O at 2.4σ (from the ${\rm{\Delta }}\mathrm{ln}({ \mathcal Z })$ between our P-12 and P-14 setups), which is unlikely for a Jupiter-like planet, but confirms that adding those species does not significantly improve the data explanation. The low temperature we retrieve and the limits of our models may also play a role in these results. Data with better S/N and covering stronger molecular bands would be necessary to be more confident.

Regarding our CO estimation, it should be highlighted that HD 189733 presents CO lines in its spectrum. This is known to perturbate CO detection in CCF analysis if the Rossiter–McLaughlin effect is not accurately corrected (e.g., Brogi et al. 2016; Chiavassa 2019). It can be noted that, in the CARMENES spectral range, the stellar CO lines are much weaker than in the K band probed by these studies. However, CO stellar lines may remain in our prepared data, potentially affecting our result. In conclusion, we strongly advise caution regarding our CO abundance estimation, even if it seems compatible with other studies.

We can also compare some of these constraints with other studies:

1.
Our 99% upper limit for the CH₄ ${\mathrm{log}}_{10}($ MMR) (<−5.9) is significantly lower than the one reported by Finnerty et al. (2024) (<−4.6), and our 95% CH₄ ${\mathrm{log}}_{10}($ VMR) upper limit is not consistent with the value retrieved by Line et al. (2014) ( $-{4.70}_{-0.30}^{+0.08}$ ). However, we note that all previous estimations of the CH₄ abundances (Madhusudhan & Seager 2009; Swain et al. 2009; Lee et al. 2012, 2014) found only upper limits for CH₄, ranging from ⪅−2 to ⪅−7. In addition, all of these estimations were obtained using emission spectra.
2.
Our upper limit for the CO ${\mathrm{log}}_{10}($ VMR) is on the lower end of the abundance found in emission spectroscopy by Madhusudhan & Seager (2009), Lee et al. (2012), Line et al. (2014), Brogi & Line (2019), and Finnerty et al. (2024) (⪅−1.7, −1.9 ± 0.5, $-{1.7}_{-2.8}^{+0.2}$ , ⪆−3, and −3.3 ± 0.5, respectively).
3.
The abundance of H₂S is unconstrained in Finnerty et al. (2024). That is, it is at least <−3 ${\mathrm{log}}_{10}($ MMR), their upper prior boundary. Our upper limit is significantly lower.
4.
Our upper limit for the HCN ${\mathrm{log}}_{10}($ VMR) is consistent with the estimation found by Cabot et al. (2018) (≈−6), although this value was estimated using a CCF analysis. We note that Finnerty et al. (2024) reported for this species that their posteriors show a "weak preference" for a HCN ${\mathrm{log}}_{10}($ VMR) of ≈4.8, which would be inconsistent with our results if confirmed, and ≈3 orders of magnitude larger than what is expected by our Exo-REM model.
5.
Our 99% upper limit for the NH₃ ${\mathrm{log}}_{10}($ VMR) (<−6.6) is significantly lower than the one reported by Finnerty et al. (2024) (<−4.5).

6.2.4. Clouds and Hazes

We find no evidence for clouds at the pressures probed by our data. With our P-15 setup, we constrain an opaque cloud layer to be located at ⪆10⁵ Pa (at 1σ). This is consistent with the constraints established by McCullough et al. (2014), Pinhas et al. (2018) (⪆10⁴ Pa), Boucher et al. (2021) (⪆2.0 × 10⁴ Pa), and Finnerty et al. (2024) ( ${5.479}_{-1.06}^{+1.02}$ log₁₀(Pa)), as well as with our 1D self-consistent temperature profile and estimated condensation curves (Figure 6). We also find no evidence of hazes having an effect at our observed wavelengths. We note that, based on low-resolution data, this planet has been inferred to be strongly hazy (e.g., Sing et al. 2016; Zhang et al. 2020). This is not surprising, as Alonso-Floriano et al. (2019) (based on the work by Pino et al. 2018) already established that haze effects (or equivalent starspot effects) affecting water vapor bands in the CARMENES data NIR wavelength range should be negligible. Our results are thus consistent with a clear atmosphere in the NIR and not inconsistent with a strong haze-like spectral feature at bluer wavelengths.

6.3. Retrieved Error Bars

We note that our retrieved error bars for T, T₀, and the H₂O log₁₀(VMR) are respectively 10, 8, and 5 times tighter than expected from our retrieval on simulated data, consistently with the high ${\rm{\Delta }}\mathrm{log}({ \mathcal Z })$ we obtained (see Section 5.2). This might be caused by the simplicity of our transit light-curve model, or by 3D effects, for example. Another cause could be hidden residuals in our prepared data, although none of the median values we retrieve for other parameters are of particular concern if we compare them to other published values—sometimes obtained from different techniques or kinds of data—as discussed in this section. We thus recommend caution when considering our error bars. Analysis of other data sets and the enhancement of our models may be necessary to better understand this behavior.

7. Summary and Conclusion

We have introduced a robust retrieval framework for high-resolution, telluric contaminated data described by Equation (30) and demonstrated analytically that it produced unbiased parameter estimations, with any preparing pipeline, provided that

1.
the pipeline effect can be written as in Equation (15);
2.
the model accurately describes the data;
3.
the data uncertainties are accurate;
4.
there is no degeneracy between the retrieved parameters; and
5.
the BPM (Equation (29)) is ≈0.

We confirmed our analytical demonstration with a simulated retrieval. We introduced the aforementioned BPM that can be used to compare the risk of bias of different preparing pipelines.

We presented the result of a retrieval using a 1D model on 3D simulated data at high resolution, enabled by the pRT-Orange model (P. Mollière et al. 2024, in preparation). The results show that several biases can be expected, according to this model:

1.
the retrieved resolving power is lower than the truth owing to line broadening by the planet rotation and its winds;
2.
the surface gravity is biased toward higher values, for the same reasons;
3.
the posteriors are overall slightly wider than expected, probably because the 1D model is struggling to fit 3D-induced spectral features;
4.
no significant line shift is retrieved (in contradiction with, e.g., Flowers et al. 2019); and
5.
no significant temperature bias is retrieved (in contradiction with the observations).

The contradictions reported above may be partially caused by the pRT-Orange model assumption that the atmospheric properties are constant with latitude, although the latitude dependence of the GCM velocity field is fully taken into account. This will be investigated further in a future work.

We reanalyzed CARMENES data of a HD 189733 b transit, covering wavelengths from 0.96 to 1.71 μm at a resolving power of ≈80,400—studied by Alonso-Floriano et al. (2019) and Sánchez-López et al. (2019)—using our "Polyfit" preparing pipeline, with both a classical CCF analysis and our retrieval framework. Our CCF analysis detects a signal consistent with a transit of HD 189733 b at ≈12.4σ (based on modeling the observation, we predict a detection significance of ≈10σ). With our retrieval, a simple model with H₂O spectral features is favored at ≈20σ using Polyfit compared to a featureless model.

We obtain significantly tighter constraints on T and the H₂O abundance compared to what we could have expected from our simulated retrievals (i.e., respectively a ≈10 K standard deviation instead of an uncertainty of ± 100 K, and an uncertainty of ±0.15 instead of ±0.75). Finding the cause of this discrepancy may require improved atmospheric forward models.

Our results on the kinematics parameters are consistent with the previous analysis, both with a new CCF analysis and with the retrievals. Our retrieval analysis is able to reduce the error bars on K_p and V_rest by a factor 2 for the former and a factor 4 for the latter compared to our CCF analysis. We observe the blueshift of the spectral absorption reported by several other studies, including Alonso-Floriano et al. (2019) and Sánchez-López et al. (2019). Our fiducial value for V_rest is $-{5.51}_{-0.53}^{+0.66}$ km s⁻¹, which is similar to what was observed by, for example, Boucher et al. (2021) but is significantly more than the prediction of models including planetary rotation and winds, such as our pRT-Orange model and the one described by Flowers et al. (2019). Additional modeling effort (e.g., by including the formalism of Klein et al. 2023, Appendix A) and observations may be required to understand this discrepancy.

Our retrieval analysis gave us an estimation of some atmospheric properties of HD 189733 b:

1.
The temperature we retrieve (≈650 K) is significantly lower than expected by 3D models (T ≈ 1200 K) but consistent with some of the other studies. This is likely a bias, possibly caused by terminator asymmetry (MacDonald et al. 2020). A wider spectral range and the use of equilibrium chemistry seem to decrease this effect (Zhang et al. 2020).
2.
The H₂O log₁₀(VMR) we retrieve (≈−2.4) is consistent with a 10 times solar metallicity atmosphere or a 3 times solar metallicity atmosphere with a significantly subsolar C/O ratio (≈0.1). However, the anticorrelation between the temperature and the H₂O abundance and the expected 3D-induced bias (MacDonald et al. 2020) limit the accuracy of this result.
3.
We derived 95% upper limits for the abundances of CH₄, CO, and HCN, consistent with a slightly subsolar atmospheric metallicity, or with a 3 times solar metallicity atmosphere with a C/O ratio ≈0.1. However, we regard these constraints with caution, as the low temperature we retrieve and the limits of our models may impact these results. Our CO upper limit may also be impacted by the Rossiter–McLaughlin effect and should be taken with even further caution.
4.
We derive upper limits for the abundances of H₂S and NH₃, consistent with a subsolar atmospheric metallicity. We also consider these results with caution for the reason pointed out above.
5.
Our results are consistent with no clouds at the pressures probed by our data. Other studies obtained the same result.
6.
We do not detect hazes at the wavelengths of our data, as expected by Alonso-Floriano et al. (2019). We note that a haze-like spectral feature has been observed in the optical and near-UV (e.g., Sing et al. 2016; Zhang et al. 2020), which is not inconsistent with our NIR results.

Comparing our results with other works, we noted that, for V_rest, studies are split between two inconsistent values: either ≈−1.5 ± 1.0 km s⁻¹, or a value in agreement with our findings of ≈−4.5 ± 1.0 km s⁻¹. Similarly, the H₂O log₁₀(VMR) abundance is retrieved to be ≈−2.5 ± 0.5, close to what we retrieve, or ≈−5.0 ± 0.5, depending on the study. Another similar discrepancy can be found with CH₄, although this molecule has overall been less studied than H₂O. In all cases, the cause for these discrepancies has, to our knowledge, yet to be identified. For the molecular abundances, these discrepancies might be explained by overfitting retrievals, the use of a biased retrieval framework, or the data reduction method used. In all cases, we suggest that including the complex 3D-induced line shapes in models, or a more formal-based approach for the choice of the log-likelihood equation and preparing pipelines used, could help solve this issue. In addition, we highlight the correlation between the midtransit time and the rest velocity and argue in favor of also retrieving the former—even if it is well-known—in order to ensure more accurate estimations of V_rest.

In this work we focused on the presentation of our retrieval framework and its comparison with a CCF analysis. The model we chose for the analysis of these HD 189733 b transit spectra is a relatively simple one, but we consider it enough to demonstrate the capabilities of our framework. Using more complex models taking into account terminator asymmetry and the evolution of the transmission lines' shape with time for a more accurate analysis of transit spectra are obvious next steps. This added complexity might actually be necessary to accurately retrieve atmospheric properties in high-resolution data where species' lines are abundant and overlapping. Indeed, while HD 189733 b is a well-studied object, it seems that there are important discrepancies between different works regarding most notably the observed blueshift of the planet spectral absorption and the H₂O abundance. These two parameters are crucial in our understanding of hot Jupiters' atmospheric dynamics and formation process. Finding the origin of these discrepancies and reconciling these results will be important for obtaining a more accurate view of this planet.

Acknowledgments

This research has made use of the Spanish Virtual Observatory (http://svo.cab.inta-csic.es) supported by the MINECO/FEDER through grant AyA2017-84089.7. This research has made use of the NASA Exoplanet Archive, which is operated by the California Institute of Technology, under contract with the National Aeronautics and Space Administration under the Exoplanet Exploration Program. This research has made use of the Exoplanet Follow-up Observation Program (ExoFOP; DOI:10.26134/ExoFOP5) website, which is operated by the California Institute of Technology, under contract with the National Aeronautics and Space Administration under the Exoplanet Exploration Program. The authors thank M. Zechmeister and L. Nortmann for their insights on the CARACAL reduction pipeline and the CARMENES time stamps. D.B. thanks L. Pino for informing him about the correct term to use for the Hadamard product. D.B. also thanks C. Bousardo for her additional proofreading of the article. A.S.-L. acknowledges funding from the European Research Council under the European Union's Horizon 2020 research and innovation program under grant agreement No. 694513 and acknowledges financial support from the Severo Ochoa grant CEX2021-001131-S funded by MCIN/AEI/ 10.13039/501100011033. P.M. thanks J. Wardenier for running comparison calculations with gCMCRT to benchmark pRT-Orange. P.M. also thanks F. Debras for extracting GCM atmospheric structures from public tables for use in pRT-Orange.

Appendix A: UTC to TDB Time Stamp Conversion Code

We display here the Python code snippet that we used to convert the CARACAL time stamps from UTC to TDB. The variable times_tdb is the one we used as our TDB time stamps. This code is also implemented in the petitRADTRANS package. This was inspired by the SERVAL³⁰ code (Zechmeister et al. 2018).

import astropy.units as u

from astropy.coordinates import (EarthLocation,

SkyCoord)

from astropy.time import Time

site_name="CAHA" # Calar Alto astropy site name

ra=300.1821223 * u.deg # (degree) for HD 189733

dec=22.7097759 * u.deg # (degree) for HD 189733

times_utc=... # placeholder to load MJD_UTC times

observer_location=EarthLocation.of_site(site_name)

target_coordinates=SkyCoord(

ra=ra,

dec=dec

)

times_utc=Time(times_utc, format="jd," scale="utc")

times_tdb=(

times_utc.tdb

+ times_utc.light_travel_time(

target_coordinates,

location=observer_location

)

Download table as: ASCII Typeset image

Appendix B: SysRem Pipeline Uffect

B.1. SysRem Preparing Pipeline Implementation

We have implemented SysRem in our framework for comparison with Polyfit (Section 4.1). SysRem (Tamuz et al. 2005) is a preparing pipeline developed to remove systematics of light curves obtained from photometric observations, but it is also widely used to prepare high-resolution data (e.g., Alonso-Floriano et al. 2019; Pelletier et al. 2021; Gibson et al. 2022). The algorithm iterates to find the set of vectors "a(t)" and "c(λ)," such that a(t)◦c(λ) best fits the data. It can be applied repeatedly ("passes") to find several "hidden" linear trends in the data. Our implementation follows the steps described below.

Step 1: This step is almost identical to step 1 of Polyfit (Section 4.1). We obtain the same "normalized" spectrum ${{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda )$ and correct the uncertainties in the same way to obtain ${{\boldsymbol{U}}}_{{\boldsymbol{N}},\overline{{\boldsymbol{X}}}}(t,\lambda )$ . In addition, we follow Alonso-Floriano et al. (2019) and mask ${{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda )$ where its values are below 0.8, corresponding to the core of the strongest telluric lines. We note that for this "normalization" step various techniques are used in other works, which involve, for example, mean division, polynomial fitting, and Gaussian filtering, combined in a variety of ways. The efficiency of these different techniques to remove X has to our knowledge never been formally discussed, but such discussion is beyond the scope of this work.

Step 2: We subtract the average over wavelengths³¹ of ${{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda )$ following

$\begin{eqnarray}&&{{\boldsymbol{F}}}_{a}(t,\lambda )={{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda )-\langle {{\boldsymbol{F}}}_{\overline{{\boldsymbol{X}}}}(t,\lambda )\rangle (t).\end{eqnarray} \tag{ B1 }$

At this step, F _a(t, λ) is on average ≈0. This is crucial for the core of the SysRem algorithm to work properly. Because we only subtract a value from the data, this step has no impact on the uncertainties.

Step 3: We apply the SysRem algorithm as described in Tamuz et al. (2005). The algorithm requires us to start with an estimation of either the parameter "a(t)," which can be seen as similar to the air mass, or the parameter "c(λ)," which can be seen as similar to the extinction coefficients. However, as long as the algorithm is able to converge, the starting parameters and their initial values have no importance. We chose to start with c(λ) = 1. Following Alonso-Floriano et al. (2019), we run one SysRem pass with 15 iterations. We obtain an estimation of the systematics a(t)◦c(λ), which we subtract³² from the data to obtain the prepared spectrum after one pass ⁽¹⁾ F _S(t, λ) with

$\begin{eqnarray}&&{}^{(1)}{{\boldsymbol{F}}}_{S}(t,\lambda )={{\boldsymbol{F}}}_{a}(t,\lambda )-a(t)\circ c(\lambda ),\end{eqnarray} \tag{ B2 }$

where the superscript (1) represents the number SysRem passes. This last step can be repeated any number of times n to remove more linear trends. In that case, F _a in the above equation is replaced by ⁽ⁿ⁻¹⁾ F _S, that is, the spectrum obtained from the previous pass. Because we again only subtract a value from the data, this step has no impact on the uncertainties. Note that because of the subtractions used, this preparing pipeline cannot be represented with Equation (15) and has no preparing matrix as we defined it. We will use the notation P_S to represent all of the SysRem pipeline steps, with P_S( F ) = ⁽ⁿ⁾ F _S(t, λ). Note that as with P_R, the result of P_S will change depending on the input. The effect of the SysRem preparing pipeline is displayed in Figure 20.

**Figure 20.** Illustration of the effect of our implementation of the SysRem preparing pipeline on the order 46 of the studied data, for a selection of number of passes. The left column represents prepared data, the right column prepared models. Rows 1–5: prepared noiseless simulated data and model with 1, 2, 3, 5, and 10 SysRem passes, respectively. The simulated data used are the same as in step 9 of Figure 4. Sixth row: same as the first row, but with added noise. The lines are no longer visible to the naked eye. Bottom row: prepared CARMENES data. The white vertical stripes are masked telluric lines.
Download figure:
Standard image High-resolution image

B.2. "Noiseless BPM"

As previously stated, and as can be seen in Equation (B2), SysRem does not follow Equation (15), so the BPM that we derived cannot be calculated for this preparing pipeline. A workaround to this issue is to remove the noise from Equation (29), so that R _F disappears from the equation and a value for SysRem can be obtained. Note that this is not a strictly correct way to evaluate the BPM: the impact of the noise is not measured, and we stress that the equation has been established assuming that the pipeline respects Equation (15). Doing so and adding 1 to the results of our SysRem implementation, we obtain a "noiseless BPM" of 5 × 10⁻¹⁰ for "Polyfit," while we find 9 × 10⁻⁶ and 2 × 10⁻⁹ for SysRem with 1 and 15 passes, respectively. We are giving these "noiseless BPM" values as indications, but their meaningfulness is questionable.

B.3. Retrieval Results with SysRem

When using SysRem, we use the exact same setup as for Polyfit, only replacing P_R with P_S and U _R with ${{\boldsymbol{U}}}_{{\boldsymbol{N}},\overline{{\boldsymbol{X}}}}$ in Equation (30). The setups are denoted "S-xx-y," where "xx" corresponds to the same setup as in Table 5 and "y" corresponds to the number of SysRem passes used. For example, S-01-1 has the same retrieved parameters and priors as P-01, and one SysRem pass was applied.

In Figure 21, we show the posteriors of a selection of SysRem retrievals. With one SysRem pass, the results obtained with SysRem are quite different from those obtained with Polyfit. Most notably, for the S-01-1 and S-18-1 setups, the H₂O MMR is 2 orders of magnitude lower compared with the latter and compatible at ≈2σ. The midtransit time posterior is also clearly truncated, and the constraints obtained are much softer. With two SysRem passes, we obtain essentially flat posteriors. We also observe flat posteriors with 3, 5, and 10 passes (not shown here). This behavior might seems surprising, as it is common in other works to use more than one SysRem pass (e.g., Alonso-Floriano et al. 2019 used 10 SysRem passes). However, in our framework we apply the preparing pipeline on both the data and the models, which is not necessarily what is done in other works. In Figure 20, it can be seen that the models and the noiseless simulated data are not deformed in a similar way, which might lead to the biased results we observe with our framework. In addition, it seems that the simulated noiseless data are quickly removed by SysRem and that the planet's atmospheric lines in the simulated noiseless data are well preserved with at most two passes (although we did not quantify this). This "line-preserving" number of passes may depend on the spectral window, the kinematic properties of the observed target, and the instrument used. This behavior may also be different in a noisy case.

This highlights that not all preparing pipelines are suitable with all frameworks. We stress that the results presented in this appendix are only valid with our framework and that in other published work using PCA-based methods the forward model is prepared for retrieval in a different manner. The apparently successful use of SysRem and related PCA-based approaches in previous retrieval studies (e.g., Pelletier et al. 2021; Gibson et al. 2022) suggests that formally justified frameworks using SysRem-like preparing pipelines may exist, but investigating them is beyond the scope of this work.

Appendix C: Retrieval with All Orders

We display the result of a retrieval with all orders in Figure 22. The values retrieved with the Polyfit setup are consistent with those retrieved with the same setup but with our order selection.

Appendix D: Risking Biased Results: Automatic Order Selection Algorithms

The results presented in this appendix are derived from models following Section 3.1, except that step 6 is not performed. The time array used here is also in BJD_UTC instead of the BJD_TDB used in the body of the article. As stated in the conclusion of this appendix, we recommend against using the technique described below, and we discarded it ourselves. Nevertheless, we chose to report the following results, as they show that using a similar technique can dangerously bias the data analysis.

The order selection algorithm follows the steps below.

Step 1: Based on the Exo-REM model generated in Section 3.3 and using the fact that most of the transmission signal comes from a pressure of ≈10⁴ Pa, we fix the absorbers' MMR at the values listed in Table 3. We do not include any cloud or haze effect: indeed, we do not expect the MnS cloud to have a significant contribution to the spectrum. We use the parameters listed in Table 1, the CARACAL uncertainties ( U _N), and the 19 exposures centered around T₀.

Step 2: We cross-correlate this model with the prepared data using the setup described in Section 4.2. No coaddition of CCF is performed yet. We choose a limited set of orders, which we estimate are the least impacted by telluric lines and instrumental effects, have strong H₂O absorption or minor contributions from other species, and cover most of the CARMENES wavelength range. For these data, we started with orders 3, 7, 9, 13, 25, 28, 29, 46, and 54.

Step 3: We calculate the coadded CCF of the selected orders' data and compute the corresponding S/N as described in Section 4.2. In addition, we determine

1.
the S/N at the coadded CCF maximum;
2.
the number of elements of our coadded CCF matrix that is greater than or equal to 0.68 times the function maximum (we will call this quantity ${ \mathcal A }$ ); and
3.
the position of this maximum in the K_p–V_rest space.

Step 4: We add one of the remaining orders to the set of included orders and perform step 3 with this new set. We then test for the following:

1.
the S/N at the coadded CCF maximum calculated with the new set is higher than the one calculated with the previous set;
2.
the new ${ \mathcal A }$ is lower than the previous one;
3.
the position of the new coadded CCF maximum is between 145 and 160 km s⁻¹ along the K_p axis; and
4.
the position of the new coadded CCF maximum is between −10 and 10 km s⁻¹ along the V_rest axis.

The order is kept within the set if all these conditions are met. Otherwise, the order is removed from the set. This step is repeated with the new set, until each remaining order has been tested.

Step 5: This step is similar to step 4, but instead we remove an order from the set of orders obtained at the end of step 4. The same tests are performed. The order is removed from the set if all of the conditions are met. Otherwise, the order is kept within the set. This step is repeated until each order in the set has been tested.

We repeat the whole procedure, starting with the obtained order selection from the last iteration, until the set does not change. At the end, we obtain the following stable order selection: 1, 3, 7, 8, 9, 14, 25, 28, 29, 30, 46, and 54. We will call this set our "cleaned-up" order selection.

The procedure assumes the following:

1.
The model used for the cross-correlation is a good representation of the data.
2.
There is enough planet signal to prevent a false-positive CCF peak.
3.
The planet signal was emitted within the procedure K_p and V_rest test ranges and does not vary strongly from order to order.
4.
An order containing valuable information will necessarily increase the S/N of the CCF maximum and narrow the CCF peak in K_p–V_rest space. This narrowing can be estimated with ${ \mathcal A }$ .
5.
An order not containing valuable information will necessarily decrease the S/N of the CCF maximum or expand the CCF peak in K_p–V_rest space.

If some of these assumptions are not true, there is a risk to reject orders containing valuable information and a risk to select worthless orders biasing our results toward our favored atmospheric model. This risk has already been noted by, e.g., Boucher et al. (2021) and Cheverall et al. (2023).

We used the orders selected by this procedure for several retrievals. We show a result example in Figure 23. When using the orders' selection mentioned above, we obtain the posteriors in row "A." When instead we start with orders 11, 18, 19, 25, 26, 32, 33, and 34 and remove the constraints on the coadded CCF maximum location in K_p–V_rest space, we obtain the following order selection: 1, 9, 10, 11, 13, 18, 19, 24, 26, 27, 29, 31, 32, 33, and 34. In the latter case, the coadded CCF maximum is located at unrealistic values: K_p = −142 km s⁻¹ and V_rest = 133 km s⁻¹. We also adapted our K_p and V_rest priors to ${ \mathcal U }(-192,-122)$ km s⁻¹ and ${ \mathcal U }(100,160)\,\mathrm{km}\,{{\rm{s}}}^{-1}$ , respectively. This gives the posteriors in row "B." Despite the difference in K_p and V_rest values, the retrieved temperature and H₂O MMR are similar in rows "A" and "B" and close to the values of the model used to calculate the CCF (Tables 1 and 3). This most probably indicates that the algorithm is biased toward the model used to calculate the CCF. The bias may propagate into the retrieval, despite retrievals being much more flexible than a CCF analysis. Due to the strong risk of bias, we estimate that it is ill-advised to use this algorithm.

Appendix E: Retrieval on Noisy Simulated Data

We can evaluate the performance of our framework against noisy simulated data (see Section 4.7.2) using the Polyfit pipeline. The result is shown in Figure 24. With Polyfit, we obtained a best-fit ${\chi }_{\nu }^{2}=0.999$ (not corrected by k_σ). Note that this is true only for this specific noise realization and no general conclusion can be driven from this result, in contrast with our noiseless simulated retrieval results (Section 4.7.2).

**Figure 24.** Posterior probability distributions of our noisy simulations. The noise matrix generation is detailed in Section 3.2; the seed used was 12345. Blue: pRT 1D simulation using Polyfit. The solid red lines corresponds to the model input values for the 1D model. The dashed vertical blue lines represent the 0.16, 0.50, and 0.84 quantiles (i.e., median and 1σ error bar) of the distributions for the 1D simulation. Indicated on top of each column are the median and the 1σ error bar of the retrieved parameters for the 1D simulation. The contours in the 2D histograms corresponds to the 1σ, 2σ, and 3σ contours.
Download figure:
Standard image High-resolution image

Appendix F: Retrieval on Noiseless Simulated Data with Fewer Free Parameters

In Section 4.7.2, we attribute the slight bias of the H₂O abundance retrieved from our noiseless simulation mainly to the retrieval of other parameters. This can be verified by retrieving the H₂O abundance on the same simulated data, but retrieving with parameters that cannot compete to reproduce the H₂O lines. We display the result of such retrieval in Figure 25. The peak of the H₂O abundance posterior is arguably on the true value. The temperature posterior peaks at temperatures slightly lower than the truth but well within the error bar. This is again caused by our pipeline's imperfections.

**Figure 25.** Posterior probability distributions for of our noiseless simulation, with few retrieved parameters. The solid red lines correspond to the model input values for the 1D model. The dashed vertical lines represent the 0.16, 0.50, and 0.84 quantiles (i.e., median and 1σ error bar) of the distributions obtained. Indicated below each panel are the median and the 1σ error bar of the retrieved parameters using the setup, as well as the true value in parentheses. The units are the same as in Figure 12.
Download figure:
Standard image High-resolution image

Appendix G: Marginalizing Over the Uncertainties

Our framework is also able to retrieve an uncertainty scaling parameter (β) or to use the uncertainty "nulling" procedure described in Brogi & Line (2019) and Gibson et al. (2020). As in Gibson et al. (2020), the uncertainties are scaled in the likelihood function following

$\begin{eqnarray}&&\mathrm{ln}({ \mathcal L })=-\displaystyle \frac{1}{2}\sum {\left(\displaystyle \frac{{P}_{{\boldsymbol{R}}}({\boldsymbol{F}})-{P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{\theta })}{\beta {{\boldsymbol{U}}}_{{\boldsymbol{R}}}}\right)}^{2},\end{eqnarray} \tag{ G1 }$

with the scalar β being retrieved. For the "nulling" procedure, we followed Gibson et al. (2020), giving the following log-likelihood function:

$\begin{eqnarray}&&\mathrm{ln}({ \mathcal L })=-\displaystyle \frac{N}{2}\mathrm{log}\left(\displaystyle \frac{1}{N}\sum {\left(\displaystyle \frac{{P}_{{\boldsymbol{R}}}({\boldsymbol{F}})-{P}_{{\boldsymbol{R}}}({{\boldsymbol{M}}}_{\theta })}{{{\boldsymbol{U}}}_{{\boldsymbol{R}}}}\right)}^{2}\right),\end{eqnarray} \tag{ G2 }$

where N is the number of nonmasked data points. For the reason exposed in Section 4.3.2, we did not use this feature to obtain our fiducial results.

Nevertheless, we did run retrievals with these features enabled, which we label P-19 (which is similar to P-14, but with β as an additional retrieved parameter) and P-29 (which is similar to P-14, but using the nulling procedure). For β, we used a uniform prior ${ \mathcal U }(0.5,100)$ . We display the results in Figure 26. In both cases, the results are almost identical to P-14, only with slightly tighter posteriors. The scaling parameter β corresponds well to 1/k_σ. For the P-19 setup, since we essentially "force" ${\chi }_{\nu }^{2}=1$ , we unsurprisingly obtain a larger log-evidence than with the other setups, with $\mathrm{log}({ \mathcal Z })=-807,500$ . With P-29, the log-likelihood equation is changed and we obtain this time a positive log-evidence, with $\mathrm{log}({ \mathcal Z })=307,300$ .

**Figure 26.** Posterior probability distributions for a selection of Polyfit setups. The dashed vertical lines represent the 0.16, 0.50, and 0.84 quantiles (i.e., median and 1σ error bar) of the distributions obtained. Indicated below each panel are the median and the 1σ error bar of the retrieved parameters using the setup. The units are the same as in Figure 12.
Download figure:
Standard image High-resolution image

A Formally Motivated Retrieval Framework Applied to the High-resolution Transmission Spectrum of HD 189733 b

Article metrics

Share this article

Author e-mails

Author affiliations

ORCID iDs

Dates

Abstract

1. Introduction

2. Observations and Data Reduction

2.1. Observations

2.2. Data Reduction

3. Models

3.1. Transmission Spectrum Model

3.2. Simulated Data

3.3. Self-consistent Model

3.4. 3D Transmission Spectrum Model

4. Methods

4.1. Preparing Pipeline

4.2. Cross-correlation Setup

4.3. High-resolution Retrieval Setup

4.3.1. Taking into Account the Preparing Pipeline

4.3.2. Log-likelihood Function and Retrieval Algorithm

4.4. Exposure Selection

4.5. Order Selection

4.6. Uncertainties Check

4.7. High-resolution Retrieval Setup Validation

4.7.1. BPM

4.7.2. Simulated Retrievals

5. Results

5.1. Cross-correlation

5.2. Retrievals

6. Discussion

6.1. Retrieved Kinematics Parameters

6.1.1. Orbital Velocity Semiamplitude

6.1.2. Rest Velocity Shift

6.1.3. Resolving Power

6.1.4. Midtransit Time

6.2. Retrieved Atmospheric Parameters

6.2.1. Temperature

6.2.2. H2 O abundance

6.2.3. Other Species Abundances

6.2.4. Clouds and Hazes

6.3. Retrieved Error Bars

7. Summary and Conclusion

Acknowledgments

Appendix A: UTC to TDB Time Stamp Conversion Code

Appendix B: SysRem Pipeline Uffect

B.1. SysRem Preparing Pipeline Implementation

B.2. "Noiseless BPM"

B.3. Retrieval Results with SysRem

Appendix C: Retrieval with All Orders

Appendix D: Risking Biased Results: Automatic Order Selection Algorithms

Appendix E: Retrieval on Noisy Simulated Data

Appendix F: Retrieval on Noiseless Simulated Data with Fewer Free Parameters

Appendix G: Marginalizing Over the Uncertainties

Footnotes

6.2.2. H₂ O abundance