Detection of CO absorption in the atmosphere of the hot Jupiter HD 189733b

With time-series spectroscopic observations taken with the Near Infrared Spectrometer (NIRSPEC) at Keck II, we investigated the atmosphere of the close orbiting transiting extrasolar giant planet, HD 189733b. In particular, we intended to measure the dense absorption line forest around 2.3 micron, which is produced by carbon monoxide (CO). CO is expected to be present in the planetary atmosphere, although no detection of this molecule has been claimed yet. To identify the best suited data analysis method, we created artificial spectra of planetary atmospheres and analyzed them by three approaches found in the literature, the deconvolution method, data modeling via chi^2-minimization, and cross-correlation. As a result, we found that cross-correlation and chi^2-data modeling show systematically a higher sensitivity than the deconvolution method. We analyzed the NIRSPEC data with cross-correlation and detect CO absorption in the day-side spectrum of HD 189733b at the known planetary radial velocity semi-amplitude with 3.4 sigma confidence.


INTRODUCTION
The study of exoplanets is one of the most vibrant and exciting fields in modern astronomy. In the past 18 years, more than 860 exoplanets have been discovered 1 . This has led to an increasing interest in the physical characterization of these new objects. As result of those characterization efforts, several chemicals have thus far been detected in the atmospheres of a few planets, as shown below.
Most of the planets discovered so far are located too close to their host stars to appear separated. Consequently, the light coming from the planet and star are observed simultaneously. When attempting to measure the light only from the planet, one has to find a strategy to remove the dominating star light. A prosperous strategy to learn more about the atmospheres of remote planets is through investigation of those planets that transit their parent stars. The atmospheric properties of transiting exoplanets can be measured in two ways: (1) during the passage of the planet in front of the star (transit), when part of the stellar light crosses the planetary atmosphere and the signature of chemicals in that atmosphere gets imprinted in the light we measure ⋆ E-mail: rodler@ieec.uab.es 1 website: http://www.exoplanet.eu (2013-02-27) from the star, and (2) during the passage of the planet behind the star (eclipse), when the star temporarily blocks the planet's emission and we can determine its temperature, albedo, chemical composition, etc., from the difference spectrum. Via transits, NaI (Charbonneau et al. 2002), HI, OI and CII (Vidal-Madjar et al. 2003, water vapor (Tinetti et al. 2007) have been detected from space in the atmospheres of the hot Jupiters HD 209358b and HD 189733b. From the ground, Redfield et al. (2008) measured NaI in the atmosphere of HD 189733b, while Snellen et al. (2008Snellen et al. ( , 2010 confirmed the Charbonneau et al. (2002) detection of NaI in HD 209458b and also detected CO via the analysis of the transmission spectrum using high-resolution spectroscopy around 2.3 µm. The latter result allowed these authors to directly measure the RV of a transiting hot Jupiter for the very first time. Bean et al. (2010) investigated the atmosphere of the Super Earth GJ 1214b via transmission spectroscopy and found no evidence of atmospheric features, indicating a hydrogen atmosphere with high clouds, or a water dominated atmosphere . Most recently, Sing et al. (2011) and Colón et al. (2012) announced the first detections of potassium in two planets, XO-2b and HD 80606b. Furthermore, eclipses have already provided temperature measurements for over twenty planets, both from space and more recently also from the ground (e.g. Sing & López-Morales 2009;López-Morales et al. 2010).
In this work, we focus on a further strategy to measure atmospheric features in planets that is based on intermediate and high spectral resolution (i.e. R = λ/∆λ > 20, 000; λ denotes the wavelength) spectroscopy with very large telescopes. Key to this method is to observe a large number of spectral features coming from the planet and to observe them at different orbital phases so that the traveling faint planetary signal can be disentangled from the dominating stellar one. Considering this, the main advantage of this method is that it is not restricted to transiting planets (c.f. Brogi et al. 2012;Rodler et al. 2012).
In the past, this method was used in the optical with the goal to detect starlight reflected from hot Jupiters (i.e. massive planets that are a few stellar radii away from their host stars) and to measure their exact masses. Although all those campaigns resulted in a non-detection of reflected starlight, stringent upper limits to the planet-to-star flux ratio and to the geometric albedo of these planets could be established. To date, the tightest 3σ-upper limits on the geometric albedos of the hot Jupiters τ Boo b, HD 75289b and υ And b are 0.39 (Leigh et al. 2003a), 0.46 (Rodler et al. 2008), and0.42 (Collier Cameron et al. 2002), respectively. These results consequently provided important constraints on models of the planetary atmospheres such as those by Marley et al. (1999) and Sudarsky et al. (2000Sudarsky et al. ( , 2003. As a result, models that predicted a high reflectivity for the planetary atmosphere could be ruled out for the studied planets. Towards near-infrared (NIR; 1 − 2.5 µm) wavelengths, the planet-to-star flux ratios drastically increase due to the strong thermal emission of hot Jupiters. Wiedemann et al. (2001); Deming et al. (2005); Barnes et al. (2007aBarnes et al. ( ,b, 2008Barnes et al. ( , 2010 and Cubillos et al. (2011) observed hot Jupiters by means of high-resolution spectroscopy at near-infrared wavelengths, but were not able to detect any molecules in their atmospheres.
Very recently, our group (Rodler et al. 2012) as well as Brogi et al (2012) reported the first successful detection of CO via high-resolution spectroscopic observations around 2.3 µm of the non-transiting hot Jupiter τ Boo b. Both groups were able to measure the orbital motion of this planet, which allowed them to determine the previously unknown orbital inclination of the planet and to finally solve for the exact planetary mass. This paper is dedicated to the investigation of different data analysis approaches which were used by different research groups, with the goal to measure the planetary signal via high-resolution spectroscopy (Sections 2 and 3). In the second part of the paper, we present our studies of the search for carbon-monoxide in the atmosphere of the hot Jupiter HD 189733b (Section 4) and a brief summary of our results and conclusive remarks (Section 5).

Overview
The idea of this approach is to observe spectral features in the planetary atmosphere via intermediate-and highresolution spectroscopy. At visual wavelengths, the main contribution to the flux coming from a planet is the starlight reflected from the companion (Seager et al. 1998). This means that the planet reflects the stellar spectrum, which is shifted in wavelength with respect to the star due to the orbital motion of the planet, and which is furthermore scaled down in intensity by a factor of several times 10 4 due to the albedo of the planet, the illuminated fraction of the planetary disk visible, and the size of planet. Towards infrared wavelengths, the planet-to-star flux ratio drastically increases to the order of 10 −3 for hot Jupiters. At NIR wavelengths, the planetary flux is almost entirely produced by thermal emission from the planet. To measure and identify atmospheric features in the planetary atmospheres, theoretical models are required (e.g. Sharp & Burrows 2007).
Key to this method is to observe a large number of spectral features in the planetary atmosphere to significantly overcome the planet-to-star flux ratio. Furthermore, it is important to take a time-series of spectra and therefore to observe the planetary features at different orbital phases of the planet, allowing to distinguish between the rather fixed stellar spectrum and the planetary one, which is periodically traveling in wavelength.
Data analysis involves the subtraction of the dominating stellar spectrum as well as of the telluric spectrum of the Earth's atmosphere at NIR wavelengths (see Charbonneau et al. 1999;Collier Cameron et al. 2002;Deming et al. 2005;Barnes et al. 2007a, andRodler et al. 2008, for detailed descriptions). We note in passing that Langford et al. (2011) reports a different approach by searching for the planet and stellar spectra in Fourier space.
By adopting one of the methods described in Sections 2.2 -2.4, the spectral features of the planet in each residual spectrum are then co-added to form a mean line profile of this spectrum. The following methods were used by different research groups: the deconvolution method, which was used by Collier Cameron et al. (2000Cameron et al. ( , 2002; Leigh et al. (2003); Barnes et al. (2007aBarnes et al. ( ,b, 2008Barnes et al. ( , 2010, and two straight-forward data modeling approaches adopting χ 2 -statistics (Charbonneau et al. 1999;Rodler et al. 2008Rodler et al. , 2010) as well as cross-correlation (Snellen et al. 2010;Brogi et al. 2012;Rodler et al. 2012). For each residual spectrum, all three methods return -in the ideal case -a vector containing the mean line profile of the atmospheric features of the planet, which is Doppler shifted due to the instantaneous RV of the planet at the time of the observations (cf. Fig. 2). The next step involves the alignment of all mean line profiles in the time series with the alignment being a function of the RV semi-amplitude of the planet Kp. To this end, we transfer each of the planetary line profiles from the velocity grid relative to the star (v) to a grid based on Kp: where φ is the orbital phase of the planet (φ = 0 occurs at mid-transit for transiting planets), which is a priori known from the RV solution of the star. The total of the aligned mean line profiles of the time series are then added up to form a single, overall mean line profile of the spectral features of the planet. We finally search for the global minimum or the maximum peak, respectively, for χ 2 -statistics or cross-correlation / deconvolution method.
Once such a candidate signal at a specific Kp has been found, the confidence level of the measurement is determined as described in Section 2.5.
One of the most important aspects of this technique is that it allows the determination of the orbital inclination i of a non-transiting planet via Kp and and Kp,max = 2πG m⋆ P orb where G is the gravitational constant, m⋆ the stellar mass, and P orb the orbital period of the planet. Kp,max is the maximum possible RV semi-amplitude that the planet can have, and which occurs for an orbital inclination i = 90 • . The knowledge of the orbital inclination i allows us to finally solve for the true planetary mass mp = mp,min/ sin i, where mp,min denotes the minimum mass derived from the RV solution of the planet. We note that Eq. 3 is only valid for m⋆ ≫ mp.

Deconvolution method
The basic idea of this method is to deconvolve each observed star-free (and telluric-line-free) spectrum with a high-resolution reference spectrum that describes the atmospheric features in the planet, thereby summing up all these features into one mean line profile (e.g. Donati et al. 1997; Barnes et al. 1998).
In general, the observed planetary spectrum s(x) at a certain position x on the detector can be approximated as a convolution of the template spectrum f (x ′ ) with the apparent mean planetary line profile p. The template spectrum f (x ′ ) can be an artificial reference spectrum of very high resolution (R > 100, 000) or have the form of a list containing the positions and depths of the planetary absorption lines.
The observed planetary spectrum is given by where index i is the pixel number of the observed object spectrum, and k is the index of the numerical grid used for the intrinsic spectrum (Endl et al. 2000). n denotes a cut-off parameter of p with p i−k = 0 for |i − k| > n, which allows us to shorten the infinite vector containing the mean line profile. Following the algorithm by Endl et al. (2000), the grids of the reference spectrum f and the mean profile p can be oversampled with respect to the grid on which the observed object spectrum is recorded. The oversampled version of Eq. 5 follows as where q = k/i is the oversampling factor, and j and k ′ the indices of the oversampled grids of the reference spectrum f and the output vector p. We combine the terms and rearrange Eq. 6 as follows: where j is the index of the oversampled grid of the reference spectrum f , and k ′ is the index of the oversampled grid of the output vector p. Equation 7 is nothing but a matrix equation of the type Let m be the dimension of the vector − → s , which represents the observed object spectrum, and nq be the dimension of the oversampled vector − → p containing the mean profile. Then the dimensions of the matrix F are m × nq. However, Eq. 8 is incomplete since each data point si of the observed data − → s exhibits its error ∆si. The complete version of that equation follows as Now we are ready for the calculation of the output vector − → p containing the planetary signal by using a deconvolution algorithm; with this step, the signal-to-noise ratio (S/N) of the planetary signal can be boosted by a factor √ l in the ideal case, where the l spectral features have the same weight.
This problem constitutes an inversion problem which is ill-conditioned (due to − → ∆s), but over-determined (the size of the object spectrum is much larger than the size of the mean line profile). There are several algorithms to mathematically solve for this problem. We tested different deconvolution approaches and found that least-squares deconvolution preserves best the planetary signal. Thus, we modify Eq. 10 to form a least squares problem: where − → ∆s is the error vector of − → s , E = Diag[∆s −2 1 , ..., ∆s −2 m ] and m is the dimension of the vector of the observed object spectrum − → s . We find the least squares solution for the output vector containing the mean line profile of the planet − → p by solving the matrix equation obtained by multiplying both sides of Equation 8 with F T E: The matrix resulting from the multiplication F T E F is square, symmetric and positive definite. Therefore, the inverse matrix can be calculated by Cholesky decomposition (Press et al. 1992).

Data modeling and χ 2 -statistics
For each of the residual spectra (i.e. stellar-free and telluricfree spectra), we create a planetary model M being a theoretical spectrum for the planetary atmosphere f (e.g. Sharp & Burrows 2007), which is degraded in spectral resolution to the resolution of the observations. This version of the planetary model M is then Doppler-shifted as a function of Kp sin 2πφ (remember that Kp is the RV semiamplitude of the planet, which is unknown for most of the non-transiting planets; φ is the orbital phase of the planet which is a priori known from the RV solution of the star) with respect to the star and interpolated on the pixel grid of the observations. The spectrum is furthermore scaled in intensity as a function of a planet-to-star flux ratio (ǫ λ ) at wavelength λ as well as to the phase-function (i.e. the illumination geometry of the observed planetary disk at different orbital phases -see e.g. Rodler et al. 2008). For pixel k, where c denotes the speed of light.
Varying the free parameters Kp and ǫ, we finally search for the best-fit model M to all the residual spectra s by way of χ 2 -minimization in appropriate search ranges, where the reduced χ 2 is N is the total number of data points, and m is the number of fitted parameters. Note that ∆s denotes the errors of s.

Data modeling and cross-correlation
For each residual spectrum s, we Doppler-shift the atmospheric model spectrum f of the planet by velocity v (see Eq. 1) in a given search range and interpolate it onto the pixel grid of s. We then calculate the normalized correlation degree C(v) for each residual spectrum following the formalism by Cubillos et al. (2011): where index k denotes the pixel number,f ands are the mean values of the atmospheric model spectrum and residual spectrum, respectively. The parameter w denotes the weight for the specific residual spectrum, while W is the sum of the weights of all residual spectra which are used in the crosscorrelation.

Determination of the confidence level
Once a candidate signal of the planetary spectrum has been located, its confidence level needs to be determined. Note that this candidate feature is a peak in the case where the deconvolution method or the cross-correlation analysis has been employed. If the data was analyzed adopting χ 2statistics, the candidate signal is reverted and produces a dip (Fig. 2). One way to determine the confidence level of the candidate signal is to employ a bootstrap randomization method (e.g. Kürster et al. 1997): random values of the orbital phases are assigned to the observed spectra, thereby creating N different data sets (N = 100, 000 in our analyses).
Any signal present in the original data is then scrambled in these artificial data sets. For all these randomized data sets, we re-run the data analysis in given parameter search ranges and locate the candidate feature with its maximum value or its minimum χ 2 ν,min value, depending on the employed method.
The confidence level of the candidate features is estimated to be ≈ 1 − FAP = 1 − m/N , where FAP is the false alarm probability, m is the number of the best fit models having a χ 2 ν value smaller or equal than the χ 2 ν,min value found in the original, unscrambled data sets. We note that for the other two data analysis methods, we count the number of the best fit models yielding an equal or higher peak than the peak value of the candidate signal.

Determination of the error range of the result
In the case of a significant detection of the planetary atmosphere ( 3σ, which corresponds to 0.9973 confidence), we determine the 1σ-error of the free parameters Kp and ǫ via bootstrap resampling (Barrow et al. 1982). To this end, we create randomly resampled data sets of the size of the original data set. In the resampling process, each set of spectra is randomly drawn from the general pool of spectra. For each spectrum a copy is kept for the artificial data set, but the original spectrum is returned to the pool with the effect that it can be drawn again. This way some of the original spectra will not appear in an artificial data set while others may appear more than once. Then, the parameters Kp and ǫ of the best fit model are stored. We create a total of 10,000 artificial data sets to get a more precise estimate of the distribution of Kp (and ǫ).
To determine the 1σ-errors of the model parameters Kp and ǫ for the original data set we examine the distributions that were obtained for these parameters for the artificial data sets. We create 68.3% confidence intervals around the original Kp and ǫ values by taking the ranges adjacent to either side of the original values that each contain 34.15% of the artificial values.

Data
We created artificial data sets with the goal of testing the three data analysis methods and to find out which is most sensitive for the task. We adopted a PHOENIX model spectrum (Hauschildt et al. 1997;Helling et al. 2008;Witte et al. 2011) for a brown dwarf having a temperature of T = 1800 K and solar metallicity around 2.3 µm (Fig. 1, lower spectrum). In this wavelength regime, the brown dwarf spectrum exhibits a dense forest of absorption lines almost entirely produced by the molecule carbon monoxide (CO).
Each data set consisted of 79 spectra, which were Doppler-shifted as follows. We assigned an orbital phase value to each of the spectra, starting with φ = 0.305 for spectrum 1 and subsequently increasing the value of φ by 0.005 for the following spectra. In total, the orbital phases ranged from φ = 0.305 to 0.695 and therefore comprised the regions where a planet would appear at its brightest. Since hot Jupiters typically have RV semi-amplitudes of the order . Reference spectra adopted for the data analysis. The lower spectrum depicts a theoretical PHOENIX model for a brown dwarf having a temperature of T = 1800 K and solar metallicity. We used this spectrum to create the data sets as well as for data analysis. In the upper spectrum, the same spectrum is shown, but we removed 20% of the lines to study also the effect of missing lines in the data analysis (cf. Case 2 in the text). We note that this normalized spectrum has been shifted up for clarity.
of Kp = 100 km s −1 , we chose Kp = 100 km s −1 and calculated the value of the instantaneous RV shift v of the model spectrum by Eq. 1. We simulated the data for a cross-dispersed infrared spectrograph of high resolution. Thus, we degraded the spectral resolution of the spectrum to R = 60, 000 by a convolution of the model spectrum with a suitable Gaussian and interpolated the resulting spectrum onto a pixel grid ranging from 2.3 to 2.35 µm. This pixel grid consisted of a total of 4800 px, and one pixel corresponded to a velocity range of 1.5 km s −1 . We then scaled the depths of absorption lines in the spectra due a chosen star-to-planet flux ratio ǫ between 10 −3 to 5 × 10 −5 . In the final step, we added Poisson noise to the data in such a way that the S/N level was at 500 per spectral pixel, which is a typical value for high-resolution spectra in the infrared.
We analyzed these artificial data sets by the three data analysis methods and investigated also the three following cases: In Case 1, we analyzed the data by adopting the PHOENIX model spectrum which we had used for creating the data sets. For the analysis with the data modeling approaches, we further degraded the spectral resolution of the reference spectrum to 60,000.
In Case 2, we explored the influence of missing lines in the reference spectrum, which was then adopted for the data analysis. The reference spectrum was a version of the brown dwarf spectrum, in which we had randomly removed 20% of the lines (Fig. 1, upper spectrum).
In the last case (Case 3), we studied how tiny continuum variations affected the result of the data analysis. To this end, we added a continuous sine wave to the modeled spectra having a period of 1000 pixel and a semi-amplitude of 10 −3 . Data analysis was carried out by adopting the PHOENIX model spectrum containing all absorption lines. . Output of the different data analysis methods for a single spectrum: deconvolution method (uppermost panel), data modeling with χ 2 -statistics (middle) and cross-correlation (bottom). The data to be analyzed had been created in the way described in Section 3.1 by adopting a scaling value of ǫ = 10 −3 and an RV semi-amplitude of Kp = 100 km s −1 .

Results
For all three cases, we find that all three methods are almost equally sensitive, with cross-correlation and χ 2 -data modeling showing systematically a higher sensitivity than the deconvolution method (Figures 3 -5). For Case 1 (CO spectra), we are able to retrieve the planetary signal at the correct value of Kp at the 3σconfidence level down to planet-to-star flux ratios of ǫ ≈ 1/8000 for all three methods.
For Case 2 (fewer lines), we made use of the same data sets which had been created for Case 1, and analyzed them with the model spectrum for which we had randomly had removed 20% of the absorption lines. Fig. 4 illustrates that by employing cross-correlation, the data modeling approach using χ 2 -statistics, and the deconvolution method, the planetary signal with a planet-to-star flux ratio of ǫ ≈ 1/5500 is retrieved at the 3σ-confidence level.
In Case 3 (variable continuum), we adopted the same data sets which we had created for Case 1, but multiplied the spectra with a sine wave to simulate variable continuum. Data analyses were carried out by adopting the correct spectrum of the brown dwarf. As shown in Fig. 5, we are able to  . Comparison between the data analysis methods for Case 1. We created different data sets for different planet-to-star flux ratios ǫ (for clarity, we show the inverse of ǫ) and analyzed them with the three data analysis methods. The solid, dashed and dotted graphs depict the false alarm probability (FAP) of the results obtained with the deconvolution method, the data modeling method using χ 2 -statistics, and with cross-correlation, respectively. The four dotted horizontal lines mark the confidence levels in σ-units. As we compare these results, we find that all methods are almost equally sensitive, with cross-correlation and χ 2 -data modeling showing systematically a higher sensitivity than the deconvolution method.
retrieve the planetary signal at the correct position and at the 3σ-confidence level down to planet-to-star flux ratios of ǫ ≈ 1/7500 with all three data analysis methods.
As we compare the individual methods and cases, we realize that the deconvolution method is systematically the least sensitive one of the three data analysis techniques. We attribute the ≈ 3−7% lower sensitivity of the deconvolution approach to the additional processing layer, where a mean line profile is calculated at once according to a mathematical constraint (in our case: least-squares minimization). As a second point, for the deconvolution we adopt a reference spectrum consisting of delta functions (at the reference line positions), which is sampled onto the same pixel grid as the observed spectrum. Due to that discrete sampling of the reference spectrum, the position of each reference line is at a full pixel, i.e. the center of the line is likely to be shifted by a fraction of a pixel. This, and the common case of line blending, where two lines might be treated as one, might produce distorted line profiles which then affect the overall mean line profile.

REANALYSIS OF THE
HD 189733B-NIRSPEC DATA SET

The planet HD 189733b
The parameters of HD 189733b and its parent star are provided in Table 1 Figure 4. Same as in Fig. 3, but for Case 2. We investigated how an incomplete spectral information in the reference spectrum affects the data analysis. The data set was created with the correct brown dwarf model spectrum, while for the data analysis we adopted a version of that model spectrum, in which we had deleted 20% of the lines (cf. Fig. 1). In comparison to the other cases (Figures 3 and 5), we see that the sensitivity of all three methods is strongly affected by an incomplete reference spectrum. 4σ Figure 5. Same as in Fig. 3, but for Case 3. We investigated how a variable continuum affects the data analysis. To this end, we added a sine wave to the data having a period of 1000 pixel and a semi-amplitude of 10 −3 . As result, we again find that the deconvolution method is significantly less sensitive than the other two data analysis methods.
2007, 2012) by studying the phase function of the planet in the infrared, indicating a temperature around 1300 K, as well as the discovery of high-altitude haze (Pont et al. 2008;Sing et al. 2011). Several atoms and molecules in the planet atmosphere were found: water (Grillmair et al. 2008), sodium seen in absorption at visual wavelengths (Redfield et al. 2008), as well as methane and carbon dioxide (e.g. Swain et al. 2009;Désert et al. 2009;Waldmann et al. 2012). Désert et al. (2009) found strong absorption around 4.5 µm, probably due to CO. Lecavelier Des Etangs et al.
(2010) measured strong evaporation of the planetary atmosphere due to the high irradiation from the host star. In addition to these discoveries, the spectrum of the dayside emission has been measured at NIR-and mid-infrared wave- lengths Grillmair et al. 2008;and Waldmann et al. 2012).

NIRSPEC data and their reanalysis
We reanalyzed the data set published in Barnes et al. (2010), which had been taken with the goal to detect H2O and CO in the atmosphere of HD 189733b. Their data analysis, however, resulted in a detection of low significance (98.8%) of these elements in the planetary atmosphere. Data were obtained with NIRSPEC (McLean et al. 1998) at the Keck II Telescope, Hawaii, USA, on UT 2008 June 15 and June 22, when the dayside of the planet was almost entirely visible. A total of 373 spectra were recorded using a 1024 × 1024 InSb Aladdin-3 array. The spectra were taken with a slit width of 0.432 arcsec, giving a spectral resolution of R ≈ 25, 000. Using the method outlined in Barnes et al. (2007a) and Barnes et al. (2008), we reduced the data and attempted to extract the planetary signature from time-series spectra by removing the dominant spectral contributions: namely the stellar spectrum and the telluric lines. Contrary to the data analysis of Barnes et al. (2010), we restricted the data analysis to the last two spectral orders comprising the wavelength region of λ = 2.275 to 2.31 µm and λ = 2.347 to 2.383 µm, respectively (Fig. 6), where we expected the dense CO absorption forest of the companion spectrum. We note that the latter spectral order was not used in the data analysis by Barnes et al. (2010). In the wavelength regime of those two spectral orders, Waldmann et al. (2012) reported a planetto-star flux ratio of ǫ = 2.2 × 10 −3 ≈ 1/450 from secondary eclipse measurements of HD 189733b.
The S/N of the residual spectra (i.e. after the removal of the stellar spectrum and telluric lines) was on average 370 and 450 per spectral pixel in the first and second night, respectively. We rejected those frames taken when the planet was behind the star and not visible. In the end, we worked . Upper spectrum: the CO model with spectral resolution of 25,000. For clarity, the normalized spectrum is shifted up by 0.005 and scaled for a planet-to-star flux ratio of ǫ = 10 −2 , which is a factor 4.5 larger than the actually measured value by Waldmann et al. (2012). Lower spectrum: the two spectral orders used in the data analysis. The residual spectra are shown after the subtraction of the stellar-and telluric absorption lines as well as after the bad pixel removal.
on 153 and 116 residual spectra, almost entirely covering the orbital phases φ = 0.302 to 0.421 and φ = 0.515 to 0.580, respectively for the first and second night, when the day-side of the planet was largely visible. We further identified bad pixels and outliers and discarded them from the data analysis. We adopted cross-correlation (Section 2.4) to search for the CO spectrum of the planet in the residual spectra. As reference spectrum, we adopted a PHOENIX spectrum of a brown dwarf with a temperature of T=1500 K and with a spectral resolution of R = 25, 000 (Fig. 6). In the cross-correlation, we weighed the spectra according to their average SNR level per spectral pixel, and further accounted for both the systemic radial velocity of the star HD 189733 (v = −2.4 km s −1 ) and the barycentric velocity of the Earth. We then co-aligned and co-added the individual cross-correlation functions in the planet's rest-frame, thereby taking into account the orbital phase information of the planet at the barycentric Julian date of the observations.

Results and discussion
As shown in Figure 7, we find the strongest candidate feature at Kp = 154 km s −1 , which is located within the 1σerror range of the known RV semi-amplitude of HD 189733b being, Kp = 152.6 ± 2 km s −1 .
Since the RV semi-amplitude of the planet Kp had already been estimated by Barnes et al. (2007b), we can restrict the bootstrap randomization run to the search range Kp = 152.6 ± 3σ km s −1 (i.e. 146.6 to 158.6 km s −1 ). As result of this bootstrap randomization run, we found that the candidate feature is at a confidence level of 99.54% and therefore represents a 2.9σ-detection of CO in the planetary atmosphere of HD 189733b.
We furthermore carried out the data analysis on the two independent nights and found this candidate feature present in both nights (Fig. 7, lower panel). As a plausibility test, we varied the systemic RV of HD 189733 and determined the confidence levels at the measured RV semi-amplitude of the planet, Kp = 154 km s −1 . The highest peak occurs at the genuine system velocity of HD 189733, which is vsys=-2.4 km s −1 (Fig. 8).
In addition to that, we adopted a different strategy to determine the confidence level of the candidate feature. To this end, we calculated the cross-correlation values for all combinations of the two parameters Kp and the systemic velocity vsys of HD 189733b, respectively, in the range of 5 Kp 200 km s −1 and −100 vsys 100 km s −1 . The average value of the cross-correlation values for all those combinations is 0.00055, its scatter (rms) is 0.00368, while the peak value of the candidate feature shows 0.0129. We subtracted the average value from the peak value and divided the result by the scatter of the cross-correlation functions. This revealed that the candidate feature is indeed significant at the 99.92% (3.36σ)-confidence level. When plotting the individual cross-correlation functions of the residual spectra with the CO model spectra, we do not see the planetary signature (Fig. 9). Given the relatively large flux ratio between the dayside of HD 189733b and its host star, we should have been able to measure the trace of the planetary RV signal. Albeit the presence of systematic noise that hamper the retrieval of the weak planetary signature, that may suggest low abundance of CO in the planetary atmosphere of HD 189733b.
To demonstrate the power of this approach, we injected an artificial planetary signal into the residual spectra and analyzed these data sets with cross-correlation. We modeled the artificial planetary signal adopting CO spectra with solar abundance with a spectral resolution of R = 25, 000. The spectra were then scaled to an intensity ratio of 1/450 and shifted in RV according to the orbital motion of HD 189733b (i.e. for an RV semi-amplitude of Kp = 153 km s −1 ). The results of the data analysis are shown in Figure 10. The injected planetary signal could be recovered at the correct value of Kp, and bootstrap randomization simulations for a search range of Kp from 5 to 200 km s −1 revealed that the signal is significant at a confidence level of > 99.99. By adopting bootstrap resampling simulations, we furthermore determined the 1σ-error ranges of the recovered planetary signal, being 2.5 km s −1 . Figs. 9 and 10 also show some strong cross-correlation artifacts which are of the order of the injected planetary signal. We attribute these artifacts mainly to systematic errors coming from the removal of the stellar signal and the telluric spectrum.
To estimate the flux ratio between the strong planetary lines F lines and the stellar continuum F⋆, we again injected an artificial planetary signal into a scrambled set of the residual spectra and analyzed these data sets with crosscorrelation. The injected planetary spectrum was scaled to a chosen intensity ratio F lines /F⋆ and Doppler-shifted as described above. We recovered the injected the planetary signal and determined its confidence level by the aforementioned straight-forward strategy. We found that for a scaling factor of F lines /F⋆ = 1.8 × 10 −4 , we attain 3.4σ confidence. This value again indicates a low abundance of CO in the planetary atmosphere of HD 189733b, given the large flux ratio between the continua of the planet day-side spectrum and the stellar one, being ǫ = 2.2 × 10 −3 around 2.3 µm.

SUMMARY AND CONCLUSIONS
We carried out studies to find out what data analysis approach is best suited for the search for atoms and molecules in hot Jupiters via high-resolution spectroscopy. We first created artificial data sets consisting of spectra of planetary atmospheres, scaled them in intensity according to a chosen value of F lines /F⋆, and analyzed them by different data analyses approaches. As result, we found that the highest sensitivities to recover the weak planetary features are attained with cross-correlation and χ 2 -data modeling, while the deconvolution method was less sensitive (≈ 3 − 7%) than the two aforementioned methods.
In light of these studies, we attempted to measure the dense CO absorption line forest around 2.3 micron in the day-side spectrum of the transiting hot Jupiter HD 189733b. By employing cross-correlation, we reanalyzed a time-series of spectra taken with the Near Infrared Spectrometer (NIRSPEC) at Keck II during two nights and detect a candidate planet signal at an RV semi-amplitude Kp = 154 km s −1 , which is located within the 1σ-error range of the known RV semi-amplitude of HD 189733b (Kp = 152.6 ± 2 km s −1 ). While bootstrap randomization runs resulted in 2.9σ-confidence for this candidate feature, a more straight-forward test revealed that we detect the planetary signal with a S/N of 3.4.
As a plausibility test, we independently carried out the data analysis for each of the two nights and found this candidate feature clearly present in both nights. In addition, we varied the systemic RV of HD 189733 and found the highest peak at the genuine system velocity of HD 189733, which is vsys=-2.4 km s −1 .
In the past, Barnes et al. (2010) analyzed the same data set in the wavelength region 2.21 -2.36 µm and searched for water and CO absorption in the planetary atmosphere. These authors adopted a different data analysis strategy (different bad pixel correction, deconvolution method, different spectral orders used) for their purposes and found in their data analysis a candidate feature of the planetary signal close to the RV semi-amplitude of the planet at the 98.8% confidence level.
We are consequently confident to claim a detection of CO absorption in the planetary atmosphere of HD 189733b. This work demonstrates the power of intermediateresolution spectroscopy at infrared wavelengths to investigate the atmospheres of remote planets. The measured planetary CO signal is weak and may suggest a low abundance of CO in the planetary atmosphere of HD 189733b. In addition, we observe CO in absorption, which indicates that the atmosphere of HD 189733b lacks a strong thermal inversion layer.

ACKNOWLEDGMENTS
We thank those of the Hawaiian ancestry on whose sacred mountain we are privileged to be guests. JB gratefully acknowledges funding through a University of Hertfordshire Research Fellowship. During this research, FR and JB have received travel support from RoPACS, a Marie Curie Initial Training Network funded by the European Commission's Seventh Framework Programme. We furthermore would like to thank the anonymous referee for very helpful and constructive comments.