Research Highlight #3

Stellar Characterization of Keck HIRES Spectra with The Cannon

(Rice & Brewer 2020, ApJ 898, 119)

GitHub repository | Trained models

A comprehensive understanding of exoplanet properties is closely intertwined with our understanding of the formation environments of these planets. Precisely determined stellar properties are critical to appropriately interpret exoplanet observations, enabling us to discern the correlations between planetary properties and their host environments.

In this project, we applied the supervised learning code The Cannon to develop a model that extracts 18 stellar labels from continuum-normalized Keck/HIRES spectra: Teff, logg, vsini, and 15 abundances (C, N, O, Na, Mg, Al, Si, Ca, Ti, V, Cr, Mn, Fe, Ni, and Y). We also applied this technique to spectra obtained with the older Keck/HIRES detector, before the instrument's 2004 upgrade. By interpolating spectra from the older and newer detectors onto the same wavelength range, we were able to reliably recover all 18 labels from the pre-2004 spectra using a model trained on post-2004 spectra.



Figure 1: Distribution of properties for the 1202 pre-labeled SPOCS stars used for our model training and testing.


We trained and tested our model using 1202 pre-labeled stars from the Spectral Properties of Cool Stars (SPOCS) dataset described in Brewer et al. 2016. The distribution of stellar properties is shown in the figure above. The 18 stellar labels of interest were obtained for each of these stars using the Spectroscopy Made Easy (SME) program. Because stellar modeling with SME is computationally expensive, we developed a new model to rapidly return labels for large sets of stellar spectra.
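As a rough illustration of what training such a model involves, the sketch below fits a per-pixel quadratic function of the labels by weighted least squares. It assumes the quadratic-in-labels form commonly used with The Cannon and omits the per-pixel intrinsic scatter term and any regularization, so it is not the project's actual training code. The function names and the synthetic array sizes are hypothetical.

```python
import numpy as np


def quadratic_design_matrix(labels):
    """Return columns [1, l_i, l_i * l_j for i <= j] for each star."""
    n_stars, n_labels = labels.shape
    i, j = np.triu_indices(n_labels)  # all index pairs with i <= j
    return np.hstack([np.ones((n_stars, 1)), labels, labels[:, i] * labels[:, j]])


def fit_pixel_coefficients(labels, flux, ivar):
    """Weighted least-squares fit of the quadratic model at every pixel.

    labels: (n_stars, n_labels); flux and ivar: (n_stars, n_pixels).
    Returns theta with shape (n_pixels, n_coefficients).
    """
    design = quadratic_design_matrix(labels)
    theta = np.empty((flux.shape[1], design.shape[1]))
    for p in range(flux.shape[1]):
        # Scaling rows by sqrt(ivar) turns weighted LS into ordinary LS.
        w = np.sqrt(ivar[:, p])
        theta[p] = np.linalg.lstsq(design * w[:, None], flux[:, p] * w, rcond=None)[0]
    return theta


# Synthetic check with illustrative sizes: 200 stars, 3 labels, 5 pixels.
rng = np.random.default_rng(0)
labels = rng.normal(size=(200, 3))
true_theta = rng.normal(size=(5, 10))
flux = quadratic_design_matrix(labels) @ true_theta.T
theta = fit_pixel_coefficients(labels, flux, np.ones_like(flux))
```

With 18 labels, this design matrix has 1 + 18 + 171 = 190 columns per pixel, which is why a training set of roughly a thousand labeled stars is enough to constrain the fit.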

All input spectra were reduced using the California Planet Search (CPS) data reduction pipeline and continuum-normalized using the methods described in Valenti & Fischer 2005. Beyond this initial normalization, we applied a data-driven renormalization, as recommended in Ness et al. 2016, to further improve the continuum removal. Sample continuum fits implementing two different functional forms are shown below for a single echelle order. Our final model implements the polynomial fit.



Figure 2: Sample continuum renormalization, where "continuum pixels" are selected and fit in a data-driven manner. The new fit is divided out to renormalize the spectrum.
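The renormalization step described above can be sketched as follows: fit a low-order polynomial through pixels flagged as continuum, then divide it out. This is a minimal sketch, assuming the continuum-pixel mask has already been chosen by some data-driven criterion. The function name, the polynomial degree, and the synthetic spectrum are illustrative assumptions, not values from the paper.

```python
import numpy as np
from numpy.polynomial import Polynomial


def renormalize_order(wavelength, flux, continuum_mask, degree=3):
    """Fit a polynomial to the continuum pixels of one order and divide it out."""
    fit = Polynomial.fit(wavelength[continuum_mask], flux[continuum_mask], deg=degree)
    continuum = fit(wavelength)  # evaluate the fit at every pixel
    return flux / continuum, continuum


# Synthetic order: a tilted continuum with one Gaussian absorption line.
wavelength = np.linspace(5000.0, 5100.0, 500)
true_continuum = 1.0 + 1e-3 * (wavelength - 5050.0)
line = 0.5 * np.exp(-0.5 * ((wavelength - 5040.0) / 0.5) ** 2)
flux = true_continuum * (1.0 - line)

# Hypothetical mask: treat everything more than 5 Angstrom from the line as continuum.
continuum_mask = np.abs(wavelength - 5040.0) > 5.0
renormalized, continuum = renormalize_order(wavelength, flux, continuum_mask)
```

After the division, the continuum pixels sit at unity while the absorption line keeps its depth relative to the continuum.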


We also masked out pixels corresponding to telluric lines, which Earth's atmosphere imprints on all ground-based spectra. The telluric mask is displayed below for all echelle orders placed side-by-side, as well as for the single echelle order that we used for initial testing. Across all echelle orders, we masked out roughly 37% of pixels.



Figure 3: Visualization of the telluric mask used in our model, with all 16 echelle orders displayed side-by-side and distinguished by color. A zoom-in of our primary testing echelle order is shown at the bottom.
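One common way to apply a mask like this is to set the inverse variance of the affected pixels to zero so that they carry no weight in any fit. The sketch below takes that approach as an assumption; the source does not say how the mask enters the model. The wavelength windows and the function name are hypothetical.

```python
import numpy as np


def apply_telluric_mask(wavelength, ivar, telluric_windows):
    """Zero the inverse variance inside each (low, high) wavelength window."""
    masked = np.zeros(wavelength.shape, dtype=bool)
    for low, high in telluric_windows:
        masked |= (wavelength >= low) & (wavelength <= high)
    # Pixels with zero inverse variance contribute nothing to a weighted fit.
    return np.where(masked, 0.0, ivar), masked


# Illustrative grid and windows only; these are not real telluric line positions.
wavelength = np.arange(100.0)
ivar = np.ones_like(wavelength)
masked_ivar, masked = apply_telluric_mask(wavelength, ivar, [(10.0, 19.0), (50.0, 59.0)])
masked_fraction = masked.mean()
```

The `masked_fraction` value is the quantity that corresponds to the fraction of masked pixels quoted in the text.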


We first verified our model's performance by testing its ability to return labels for a test set of spectra withheld from our training dataset. Our results are shown below. Overall, we found that our trained model reliably returned all 18 stellar labels, and the scatter for each label is given in the figure.



Figure 4: Performance of our model in recovering the 18 known stellar labels of 240 SPOCS test-set stars from their post-2004 Keck/HIRES spectra.
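A hold-out test of this kind can be sketched as below. The split sizes (240 test stars out of 1202) come from the text, but the random selection and the use of the residual standard deviation as the "scatter" are assumptions made for illustration. Both function names are hypothetical.

```python
import numpy as np


def split_indices(n_stars, n_test, seed=0):
    """Randomly partition star indices into training and test sets."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_stars)
    return shuffled[n_test:], shuffled[:n_test]


def label_scatter(predicted, known):
    """Per-label bias (mean residual) and scatter (residual standard deviation)."""
    residuals = predicted - known
    return residuals.mean(axis=0), residuals.std(axis=0)


train_idx, test_idx = split_indices(1202, 240)

# Stand-in labels: predictions equal to the known values plus random noise.
rng = np.random.default_rng(1)
known = rng.normal(size=(240, 18))
predicted = known + 0.05 * rng.normal(size=known.shape)
bias, scatter = label_scatter(predicted, known)
```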


We also verified that the features picked out by the model correspond to known physical phenomena by examining a few of the coefficients returned by The Cannon for each pixel in the vicinity of the Mg Ib triplet. Pixels whose coefficients deviate further from the baseline are weighted more heavily when determining the value of the corresponding label for a given spectrum. As expected, the centers of the Mg lines correspond to dips in the θMg coefficients, while the wings of the lines more directly affect the inferred surface gravity. The stellar rotational velocity relies most heavily on intermediate-depth lines that are neither saturated nor washed out by the continuum.



Figure 5: Relevant coefficients in the vicinity of the Mg Ib triplet. Pixels with coefficients that deviate further from their baselines are more heavily weighted when evaluating the corresponding parameter value.


Lastly, we interpolated our post-2004 spectra onto the same wavelength scale as our archival pre-2004 spectra. We re-trained our model on these interpolated spectra from the newer, upgraded spectrograph and tested its ability to recover the correct labels from spectra obtained with the older, pre-2004 instrument. The model again reliably recovered all 18 labels, as shown below. This indicates that The Cannon can be used to extract labels from spectra taken with a different spectrograph than the one used for the training set! We applied this model to extract labels for 477 stars with archival spectra, with results provided in the paper.


Figure 6: Performance of our model, trained on interpolated spectra taken after the 2004 Keck/HIRES detector upgrade, in recovering the 18 known stellar labels of 337 SPOCS stars from their pre-2004 Keck/HIRES spectra.
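The resampling step behind this test can be sketched with simple linear interpolation, placing a post-2004 spectrum onto a pre-2004 wavelength grid. Linear interpolation is an assumption here, since the source does not state which scheme was used. The grids, the flux values, and the function name are hypothetical.

```python
import numpy as np


def resample_to_grid(wavelength, flux, target_wavelength):
    """Linearly interpolate flux onto target_wavelength.

    Target pixels outside the input wavelength range are set to NaN so that
    only the overlapping region is used downstream.
    """
    return np.interp(target_wavelength, wavelength, flux, left=np.nan, right=np.nan)


# Illustrative grids: the "new" spectrum covers 5000-5100, the "old" grid 4990-5090.
new_wavelength = np.linspace(5000.0, 5100.0, 401)
new_flux = 1.0 + 1e-4 * (new_wavelength - 5000.0)
old_wavelength = np.linspace(4990.0, 5090.0, 301)

resampled = resample_to_grid(new_wavelength, new_flux, old_wavelength)
overlap = np.isfinite(resampled)  # pixels covered by both grids
```

Restricting training and testing to the `overlap` pixels keeps both sets of spectra on an identical pixel grid, which is what the per-pixel model requires.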