Extremely Metal-poor Representatives Explored by the Subaru Survey (EMPRESS). I. A Successful Machine-learning Selection of Metal-poor Galaxies and the Discovery of a Galaxy with ${M}_{\star }$ < 10$^{6}$ ${M}_{\odot }$ and 0.016 ${Z}_{\odot }$


Published 2020 August 3 © 2020. The American Astronomical Society. All rights reserved.
Citation Takashi Kojima et al 2020 ApJ 898 142 DOI 10.3847/1538-4357/aba047


0004-637X/898/2/142

Abstract

We have initiated a new survey for local extremely metal-poor galaxies (EMPGs) with Subaru/Hyper Suprime-Cam (HSC) large-area (∼500 deg$^{2}$) optical images reaching a 5σ limit of ∼26 mag, about 100 times deeper than the Sloan Digital Sky Survey (SDSS). To select $Z/{Z}_{\odot }$ < 0.1 EMPGs from ∼40 million sources detected in the Subaru images, we first develop a machine-learning (ML) classifier based on a deep neural network algorithm with a training data set consisting of optical photometry of galaxy, star, and QSO models. We test our ML classifier with SDSS objects having spectroscopic metallicity measurements and confirm that our ML classifier accomplishes 86% completeness and 46% purity EMPG classifications with photometric data. Applying our ML classifier to the photometric data of the Subaru sources, as well as faint SDSS objects with no spectroscopic data, we obtain 27 and 86 EMPG candidates from the Subaru and SDSS photometric data, respectively. We conduct optical follow-up spectroscopy for 10 of our EMPG candidates with Magellan/LDSS-3+MagE, Keck/DEIMOS, and Subaru/FOCAS and find that the 10 EMPG candidates are star-forming galaxies at z = 0.007–0.03 with large Hβ equivalent widths of 104–265 Å, stellar masses of log(${M}_{\star }$/${M}_{\odot }$) = 5.0–7.1, and high specific star formation rates of ∼300 Gyr$^{-1}$, which are similar to those of early galaxies at z ≳ 6 reported recently. We spectroscopically confirm that 3 out of 10 candidates are truly EMPGs with $Z/{Z}_{\odot }$ < 0.1, one of which is HSC J1631+4426, the most metal-poor galaxy ever reported, with $Z/{Z}_{\odot }$ = 0.016.


1. Introduction

The early universe is dominated by a large number of young, low-mass, metal-poor galaxies. Theoretically, the first galaxies are formed at z ∼ 10–20 from gas already metal enriched by Population III (i.e., metal-free) stars (e.g., Bromm & Yoshida 2011). According to hydrodynamical simulations (e.g., Wise et al. 2012), the first galaxies are created in dark matter (DM) mini halos with ∼10$^{8}$ ${M}_{\odot }$ and have low stellar masses (${M}_{\star }$ ∼ 10$^{4}$–10$^{6}$ ${M}_{\odot }$), low metallicities (Z ∼ 0.1%–1% ${Z}_{\odot }$), and high specific star formation rates (sSFR ∼ 100 Gyr$^{-1}$) at z ∼ 10. The typical stellar mass (${M}_{\star }$ ∼ 10$^{4}$–10$^{6}$ ${M}_{\odot }$) is remarkably small, comparable to those of star clusters. Such cluster-like galaxies are undergoing an early stage of galaxy formation, which is characterized by intensive star formation. One of the ultimate goals of modern astronomy is to understand the early-stage galaxy formation by probing the cluster-like galaxies. The cluster-like galaxies are also the key galaxy population that is the building block of the galaxy formation hierarchy.

Recent observations (e.g., Stark et al. 2014) have reported that low-mass, young galaxies of log(${M}_{\star }$/${M}_{\odot }$) ∼ 6–9 at z ∼ 2 show strong emission lines with very high equivalent widths (EWs), ∼1000 Å, for [O iii]+Hβ lines. Such very high EWs suggest the intensive star formation predicted by the hydrodynamical simulation of the first galaxies (e.g., Wise et al. 2012). Stellar synthesis and photoionization models (Inoue 2011) also demonstrate that the rest-frame EW of the Hα line, ${\mathrm{EW}}_{0}$(Hα), can reach ∼1000–3000 Å for stellar ages of ≲100 Myr. The association between the first galaxies and local, low-mass galaxies was partly investigated with cosmological hydrodynamic zoom-in simulations by Jeon et al. (2017). Jeon et al. (2017) suggested that the Local Group (LG) ultrafaint dwarf galaxies (UDGs) began as young, low-mass, star-forming galaxies (SFGs) in the past and have been quenched during the epoch of cosmic reionization. Thus, the LG UDGs themselves are not analogs of the high-z galaxies because they are already quenched and old. Contrary to the LG UDGs, low-mass, young SFGs discovered in the local universe are undergoing an intensive star-forming phase and can be regarded as analogs of high-z galaxies.

In the last decade, metal-poor galaxies with a large [O iii] λ5007 rest-frame EW, ${\mathrm{EW}}_{0}$([O iii] λ5007), have been discovered by the broadband-excess technique in the data of the Sloan Digital Sky Survey (SDSS; York et al. 2000). For example, Cardamone et al. (2009) reported metal-poor, actively star-forming galaxies at z ∼ 0.3 in the SDSS data that were named "green pea galaxies" (GPs) after their compact size and intrinsically green color caused by the large ${\mathrm{EW}}_{0}$([O iii] λ5007) up to ∼1500 Å. Yang et al. (2017b) also discovered metal-poor, highly star-forming galaxies at z ∼ 0.04 in the SDSS data, selected by their g-band excess, with very large ${\mathrm{EW}}_{0}$([O iii] λ5007) ∼ 500–2500 Å. The galaxies found by Yang et al. (2017b) have been nicknamed "blueberry galaxies" (BBs).

The typical metallicities of these GPs/BBs are 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 8.0 ± 0.3, which falls into a moderate metallicity range compared to extremely metal-poor galaxies (EMPGs) such as J0811+4730 (Izotov et al. 2018b), SBS 0335−052 (e.g., Izotov et al. 2009), AGC 198691 (Hirschauer et al. 2016), J1234+3901 (Izotov et al. 2019b), Little Cub (Hsyu et al. 2017), DDO 68 (Pustilnik et al. 2005; Annibali et al. 2019), IZw 18 (e.g., Izotov & Thuan 1998; Thuan & Izotov 2005), and Leo P (Skillman et al. 2013) in the range of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 7.0–7.2. Stellar synthesis and photoionization models (Inoue 2011) suggest that the ${\mathrm{EW}}_{0}$([O iii] λ5007) takes maximum values around 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 8.0. Thus, galaxies selected with a single broadband excess, such as GPs/BBs, may be somewhat biased toward a large ${\mathrm{EW}}_{0}$([O iii] λ5007), i.e., a moderate metallicity of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 8.0.

As shown in Figure 1, the models of Inoue (2011) also exhibit a peak of [O iii] λ5007/Hα flux ratio at around 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 8.0. The [O iii] λ5007/Hα ratio monotonically decreases with decreasing metallicity in the range of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 8.0 simply because the oxygen element becomes deficient. Indeed, as shown in Figure 1, representative metal-poor galaxies (e.g., Thuan & Izotov 2005; Izotov et al. 2009, 2018b; Skillman et al. 2013; Hirschauer et al. 2016; 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 7.0–7.2) have a ratio of [O iii] λ5007/Hα = 0.4–1.0. The [O iii] λ5007 line is no longer the strongest emission line in an optical spectrum of an EMPG, as demonstrated in Figure 2. Thus, the optical spectra of the EMPGs are characterized by multiple strong emission lines, such as hydrogen Balmer lines and [O iii] λλ4959, 5007. The strong emission lines of EMPGs cause g- and r-band excesses at z ≲ 0.03.


Figure 1. The [O iii]/Hα ratio as a function of gas-phase metallicity. The crosses (Andrews & Martini 2013) and dots (Nagao et al. 2006) represent the average of local SFGs. We also show the typical values of SDSS GPs (green square; Cardamone et al. 2009; Amorín et al. 2010) and BBs (cyan triangle; Yang et al. 2017b). The solid line is a theoretical prediction (Inoue 2011). The diamonds are representative metal-poor galaxies J0811+4730 (Izotov et al. 2018b), SBS 0335−052 (e.g., Izotov et al. 2009), AGC 198691 (Hirschauer et al. 2016), J1234+3901 (Izotov et al. 2019b), Little Cub (Hsyu et al. 2017), DDO 68 (Pustilnik et al. 2005; Annibali et al. 2019), IZw 18 (e.g., Izotov & Thuan 1998; Thuan & Izotov 2005), and Leo P (Skillman et al. 2013) in the range of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 7.0–7.2. The GPs and BBs show moderate metallicities around 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 8.0, which correspond to a high [O iii]/Hα ratio of ∼2, while the representative EMPGs have a relatively low [O iii]/Hα ratio of ∼0.3–1.0.


Figure 2. Top: spectrum example of an EMPG with a very low metallicity of 12 + log(O/H) = 7.25 (Kniazev et al. 2003). Bottom: same as the top panel but for a GP with a moderate metallicity of 12 + log(O/H) = 8.01 (Jaskot & Oey 2013). We show the GP spectrum in the rest frame for an easy comparison with the EMPG spectrum. The color curves are throughput curves of HSC grizy-band filters for reference. In the optical spectrum of this typical EMPG (top panel), Hα is the strongest line.


Recently, Hsyu et al. (2018) and Senchyna & Stark (2019) started metal-poor galaxy surveys with the SDSS data, where they selected objects that show g- and r-band excesses. Unfortunately, the EMPGs have similar colors to those of other types of objects (i.e., blue stars, transient objects) on a classical color–color diagram of r − i versus g − r (Hsyu et al. 2018; Senchyna & Stark 2019). In this paper, we aim to use machine-learning (ML) techniques to find an EMPG selection criterion that is not limited by the linear relationships in color–color space, which is the most common selection criterion used in previous works.

In this study, we target EMPGs with strong emission lines in the local universe (z ≲ 0.03), which may be a local analog of a high-z SFG. Because such galaxies are intrinsically faint and rare in the local universe, wide-field, deep imaging data are necessary. However, the SDSS data are not deep enough to discover EMPGs with log(${M}_{\star }$/${M}_{\odot }$) < 6, which are possible candidates of the most metal-deficient galaxies inferred from the mass–metallicity relation (e.g., Wuyts et al. 2012; Andrews & Martini 2013). Deeper, wide-field imaging data surveys have since been conducted to discover faint EMPGs that are undetected by SDSS-like surveys. In 2014 March, the Subaru telescope started a large-area (∼1400 deg$^{2}$) deep survey with the Hyper Suprime-Cam (HSC; Miyazaki et al. 2012, 2018; Furusawa et al. 2018; Kawanomoto et al. 2018; Komiyama et al. 2018) called the HSC Subaru Strategic Program (HSC SSP; Aihara et al. 2018).

Based on the HSC-SSP data, we have initiated a new survey for local EMPGs that has been named the Extremely Metal-Poor Representatives Explored by the Subaru Survey (EMPRESS). The final goal of our research is to discover faint EMPGs by exploiting the Subaru HSC-SSP data, whose i-band limiting magnitude (${i}_{\mathrm{lim}}$ ∼ 26 mag) is ∼5 mag deeper than that of the SDSS data (${i}_{\mathrm{lim}}$ ∼ 21 mag). We also use the SDSS data to complement our sample with brighter EMPGs in this study. This paper is the first from our EMPRESS program, which explores EMPGs based on the S17A and S18A data releases of HSC-SSP. This first paper will be followed by other papers in which we investigate details of elemental abundances, physical states of the interstellar medium (ISM), size and morphology, and the stellar population of our EMPGs. We plan to continue updating the EMPG sample with the future HSC-SSP data release and upcoming follow-up spectroscopy.

The outline of this paper is as follows. In Section 2, we explain the Subaru HSC-SSP data, as well as how we construct a source catalog from the HSC-SSP data. We also make a source catalog from SDSS photometry data to complement our EMPG sample. Section 3 explains our new selection technique that we develop with ML and shows the results of a test of our ML selection. Section 4 explains the selection of EMPG candidates from the source catalogs. We describe our optical spectroscopy carried out for our EMPG candidates in Section 5. Section 6 explains the reduction and calibration processes of our spectroscopy data. In Section 7, we estimate emission-line fluxes and galaxy properties such as stellar mass, star formation rate (SFR), and metallicity. We show the results of our spectroscopy and compare our EMPG sample with other low-z galaxy samples in the literature in Section 8. Then we summarize our results in Section 9.

Throughout this paper, magnitudes are on the AB system (Oke & Gunn 1983). We adopt the following cosmological parameters: (h, ${{\rm{\Omega }}}_{m}$, ${{\rm{\Omega }}}_{{\rm{\Lambda }}}$) = (0.7, 0.3, 0.7). The definition of the solar metallicity is given by 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 8.69 (Asplund et al. 2009). We also define an EMPG as a galaxy with 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69 (i.e., $Z/{Z}_{\odot }$ < 0.1) in this paper, which is almost the same as in previous metal-poor galaxy studies (e.g., Kunth & Östlin 2000; Izotov et al. 2012; Guseva et al. 2017). In this paper, we try to select EMPGs with a large ${\mathrm{EW}}_{0}$(Hα) (e.g., ≳800 Å) because our motivation is to discover local counterparts of high-z, low-mass galaxies whose sSFR is expected to be high (≳10 Gyr$^{-1}$; e.g., Ono et al. 2010; Elmegreen & Elmegreen 2017; Stark et al. 2017; Harikane et al. 2018).
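The conversion between the oxygen abundance scale and the solar-scaled metallicity used above can be written out explicitly. The following is a minimal sketch, assuming that Z scales linearly with O/H; the function names are illustrative and do not come from the paper.

```python
import math

SOLAR_OH = 8.69       # 12 + log(O/H) of the Sun adopted in the text (Asplund et al. 2009)
EMPG_OH_LIMIT = 7.69  # EMPG threshold adopted in the text (Z/Zsun < 0.1)


def oh_to_solar_fraction(oh12):
    """Convert 12 + log(O/H) to Z/Zsun, assuming Z is proportional to O/H."""
    return 10.0 ** (oh12 - SOLAR_OH)


def solar_fraction_to_oh(z_frac):
    """Convert Z/Zsun back to 12 + log(O/H) under the same assumption."""
    return SOLAR_OH + math.log10(z_frac)


def is_empg(oh12):
    """Apply the EMPG definition of this paper: 12 + log(O/H) < 7.69."""
    return oh12 < EMPG_OH_LIMIT


if __name__ == "__main__":
    print(oh_to_solar_fraction(7.69))   # close to 0.1
    print(solar_fraction_to_oh(0.016))  # close to 6.89
```

Under this convention, the value $Z/{Z}_{\odot }$ = 0.016 quoted for HSC J1631+4426 corresponds to 12 + log(O/H) ≈ 6.89.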

2. Data

We explain the HSC-SSP imaging data used in this study in Section 2.1. We construct source catalogs from the HSC-SSP and SDSS data in Sections 2.2 and 2.3, respectively. These source catalogs and the following selection processes are summarized in Figure 3.


Figure 3. Picture of our selection flow. We select our EMPG candidates from the HSC (Section 2.2) and SDSS (Section 2.3) source catalogs, which consist of photometric data. To test our ML classifier, we use the SDSS test catalog (Section 3.3.1), which is composed of photometry+spectroscopy data. Our ML classifier (Section 3.2) is trained by SED models of galaxies, stars, and QSOs (Section 3.2.3). We do not use the existing observational data in the training because we target very faint EMPGs that no previous survey could discover. Some of the details are omitted in this flow for simplicity. See the details in each section.


2.1. HSC-SSP Imaging Data

We use the HSC-SSP internal data of the S17A and S18A data releases, which were taken from 2014 March to 2017 May and from 2014 March to 2018 January, respectively. The internal S17A+S18A data are explained in the second data release (DR2) paper of HSC-SSP (Aihara et al. 2019). The HSC-SSP survey data are taken in three layers of wide, deep, and ultradeep with five broadband filters of grizy. In the HSC-SSP S17A and S18A data releases, the images were reduced with the HSC pipeline, hscPipe v5.4 and v6.7 (Bosch et al. 2018), respectively, with codes of the Large Synoptic Survey Telescope (LSST) software pipeline (Axelrod et al. 2010; Jurić et al. 2015; Ivezić et al. 2019). The pipeline conducts the bias subtraction, flat-fielding, image stacking, astrometry and zero-point magnitude calibration, source detection, and magnitude measurement. The hscPipe v6.7 (S18A) uses global background subtraction, a lower detection threshold, a new artifact rejection algorithm, the different coadd weighting between old and new i-/r-band filters, and the updated way of point-spread function (PSF) estimation (detailed in Aihara et al. 2019). These pipeline differences slightly change the detection and magnitude measurements, which may affect our classification results. Indeed, as we will explain later in Section 4.1, we find that some of the EMPG candidates are selected only in either S17A or S18A data, which is caused by the different hscPipe versions between S17A and S18A. To maximize the EMPG sample size, we use both S17A and S18A data in this paper. The details of the observations, data reduction, and detection and the photometric catalog are described in Bosch et al. (2018) and Aihara et al. (2019). We use cmodel magnitudes corrected for Milky Way dust extinction to estimate the total magnitudes of a source. The cmodel magnitudes are deblended by fitting profiles to multiple sources on an image. 
Thus, even if a certain source is overlapped with other sources, the cmodel magnitude represents a total magnitude of the source, which is almost free from its overlapping sources. See the detailed algorithm of the cmodel photometry in Bosch et al. (2018).

2.2. HSC Source Catalog

We explain how we construct an HSC source catalog, from which we select EMPG candidates. We select sources from the HSC-SSP wide-field data. We use isolated or cleanly deblended sources that fall within griz-band images. We also require that none of the pixels in their footprints are interpolated, none of the central 3 × 3 pixels are saturated or suffer from cosmic rays, and there are no bad pixels in their footprints. Then we exclude sources whose cmodel magnitude or centroid position measurements are flagged as problematic by hscPipe. We exclude sources close to a bright star (Coupon et al. 2018; Aihara et al. 2019). We require the detection in the griz-band images. Here we select objects whose photometric measurements are brighter than 5σ limiting magnitudes, g < 26.5, r < 26.0, i < 25.8, and z < 25.2 mag, which are estimated by Ono et al. (2018) with 1.5″ diameter circular apertures. Note again that we use cmodel photometry to select EMPG candidates. In this study, we do not use y-band photometry because the y-band limiting magnitude is shallower (y = 24.5 mag; Ono et al. 2018) than the other four bands, and the y-band imaging has not yet been completed in the part of the survey area that we use in this study. We also require that the photometric measurement errors are less than 0.1 mag in the griz bands. Here the photometric measurement errors are given by hscPipe. Finally, we obtain 17,912,612 and 40,407,765 sources in total from the HSC-SSP S17A and S18A data, respectively. The effective areas are 205.82 and 508.84 deg$^{2}$ in the HSC-SSP S17A and S18A data, respectively. Note again that there is overlap between the S17A and S18A data (see also Sections 2.1 and 4.1). Table 1 summarizes the selection criteria that we apply to make the HSC source catalog.

Table 1.  Selection Criteria in Our Source Catalog Construction

| Parameter | Value | Band | Comment |
| (1) | (2) | (3) | (4) |
|---|---|---|---|
| isprimary | True | | Object is a primary one with no deblended children |
| detect_ispatchinner | True | | Object falls on the inner region of a coadd patch |
| detect_istractinner | True | | Object falls on the inner region of a coadd tract |
| pixelflags_edge | False | griz | Object is located within images |
| pixelflags_interpolatedcenter | False | griz | None of the central 3 × 3 pixels of an object are interpolated |
| pixelflags_saturatedcenter | False | griz | None of the central 3 × 3 pixels of an object are saturated |
| pixelflags_crcenter | False | griz | None of the central 3 × 3 pixels of an object are masked as cosmic rays |
| pixelflags_bad | False | griz | None of the pixels in the footprint of an object are labeled as bad |
| cmodel_flag | False | griz | cmodel flux measurement has no problem |
| merge_peak | True | griz | Detected in griz bands |
| mask_bright_objectcenter$^{a}$ | False | griz | No bright stars near an object |
| cmodel_mag | <26.5 | g | g-band cmodel magnitudes are smaller than the 5σ limiting magnitude |
| | <26.0 | r | r-band cmodel magnitudes are smaller than the 5σ limiting magnitude |
| | <25.8 | i | i-band cmodel magnitudes are smaller than the 5σ limiting magnitude |
| | <25.2 | z | z-band cmodel magnitudes are smaller than the 5σ limiting magnitude |
| cmodel_magsigma | <0.1 | griz | Errors of the griz-band cmodel magnitudes are less than 0.1 mag |

Note. $^{a}$ Only used in the S18A catalog because we find many contaminants in the S18A data caused by nearby bright stars, while we do not in S17A.
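The magnitude and error cuts of Table 1 can be expressed as a simple filter. The sketch below covers only the photometric cuts (not the pixel or deblending flags), and the dictionary keys such as `g_mag` and `g_magerr` are hypothetical names chosen for illustration, not actual hscPipe column names.

```python
# 5-sigma limiting magnitudes (cmodel, AB) and the error cut quoted in the text.
MAG_LIMITS = {"g": 26.5, "r": 26.0, "i": 25.8, "z": 25.2}
MAX_MAG_ERR = 0.1


def passes_photometry_cuts(source):
    """Return True if a source satisfies the griz magnitude and error cuts.

    `source` is a dict with hypothetical keys '<band>_mag' and '<band>_magerr'.
    A missing measurement in any of the four bands fails the selection.
    """
    for band, limit in MAG_LIMITS.items():
        mag = source.get(f"{band}_mag")
        err = source.get(f"{band}_magerr")
        if mag is None or err is None:
            return False
        if not (mag < limit and err < MAX_MAG_ERR):
            return False
    return True


if __name__ == "__main__":
    faint_but_ok = {
        "g_mag": 24.9, "g_magerr": 0.03,
        "r_mag": 24.8, "r_magerr": 0.04,
        "i_mag": 25.5, "i_magerr": 0.08,
        "z_mag": 25.0, "z_magerr": 0.09,
    }
    print(passes_photometry_cuts(faint_but_ok))
```

Applying the flag-based criteria of Table 1 would add one Boolean test per flag in the same loop structure.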


2.3. SDSS Source Catalog

We construct an SDSS source catalog from the 13th release (DR13; Albareti et al. 2017) of the SDSS photometry data. Although the SDSS data are ∼5 mag shallower (${i}_{\mathrm{lim}}$ ∼ 21 mag) than the HSC-SSP data (${i}_{\mathrm{lim}}$ ∼ 26 mag), we expect to select apparently bright but intrinsically faint EMPGs from the SDSS data in the local volume (i.e., z ≲ 0.001). Such apparently bright but intrinsically faint EMPGs can be discovered because a total unique area of the SDSS data (14,555 deg$^{2}$) is ∼30 times larger than the S18A HSC-SSP data (509 deg$^{2}$). It should also be noted that 99.0% of the total SDSS sources have not yet been observed in spectroscopy. Here we select objects whose photometric measurements are brighter than the SDSS limiting magnitudes, u < 22.0, g < 22.2, r < 22.2, i < 21.3, and z < 21.3 mag. We only obtain objects whose magnitude measurement errors are <0.1 mag in the ugriz bands. Here we use Modelmag for the SDSS data. Among the flags in the PhotoObjALL catalog, we require that a clean flag be "1" (i.e., True) to remove objects with photometry measurement issues. The clean flag eliminates duplication, the deblending/interpolation problem, suspicious detection, and detection at the edge of an image. We also remove objects with a True cosmic-ray flag and/or a True blended flag, which often mimics a broadband excess in photometry. We reject relatively large objects with a 90% Petrosian radius greater than 10″ to best eliminate contamination of H ii regions in nearby spiral galaxies. Finally, we derive 31,658,307 sources in total from the SDSS DR13 photometry data. The total unique area of the SDSS DR13 data is 14,555 deg$^{2}$.
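The trade-off between depth and area described above can be checked with a short calculation. This is a sketch of the arithmetic only, using the limiting magnitudes and survey areas quoted in the text and the standard relation between a magnitude difference and a flux ratio.

```python
def flux_ratio_from_mag_difference(delta_mag):
    """Flux ratio corresponding to a magnitude difference (Pogson relation)."""
    return 10.0 ** (0.4 * delta_mag)


# i-band limiting magnitudes quoted in the text (approximate values).
HSC_I_LIM = 26.0
SDSS_I_LIM = 21.0

# Survey areas quoted in the text, in square degrees.
SDSS_AREA = 14555.0
HSC_S18A_AREA = 509.0

depth_gain = flux_ratio_from_mag_difference(HSC_I_LIM - SDSS_I_LIM)
area_ratio = SDSS_AREA / HSC_S18A_AREA

if __name__ == "__main__":
    print(f"HSC reaches fluxes ~{depth_gain:.0f} times fainter than SDSS")
    print(f"SDSS covers ~{area_ratio:.1f} times the S18A HSC-SSP area")
```

A 5 mag difference corresponds to a factor of 100 in flux, which matches the "about 100 times deeper" statement in the abstract, while the area ratio is slightly below 30.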

3. Construction of Classifier

In this section, we construct a classifier based on ML, which will be applied to the HSC-SSP and SDSS source catalogs to select EMPGs. We target galaxies that have a metallicity of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.69–7.69 (i.e., 1%–10% solar metallicity) with a rest-frame Hα EW of ${\mathrm{EW}}_{0}$(Hα) > 800 Å. The basic idea of our selection technique is to build an object classifier that separates EMPG candidates from other types of objects, such as non-EMPG galaxies, stars, and QSOs. We construct the object classifier with a deep neural network (DNN; LeCun et al. 2015). In Section 3.1, we discuss the typical colors of EMPGs to show how we determine the ranges of metallicity, EW, and redshift of the EMPGs that we target in this study. Section 3.2 explains how we construct our ML classifier that distinguishes EMPGs from non-EMPG galaxies, stars, and QSOs. In Section 3.3, we test our ML classifier with the SDSS photometry+spectroscopy data (i.e., data of SDSS objects that are detected in photometry and observed in spectroscopy) to check whether our ML classifier successfully selects EMPGs. We refer to a catalog made from the SDSS photometry+spectroscopy data as an SDSS test catalog in this paper.
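The performance of such a test is summarized in the abstract by a completeness and a purity. The sketch below gives the usual definitions of the two quantities; the counts in the example are hypothetical and were chosen only because they reproduce the quoted 86% and 46%, so they should not be read as the actual numbers of the SDSS test.

```python
def completeness(true_pos, false_neg):
    """Fraction of real EMPGs that the classifier recovers."""
    return true_pos / (true_pos + false_neg)


def purity(true_pos, false_pos):
    """Fraction of selected candidates that are real EMPGs."""
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    tp, fn, fp = 6, 1, 7
    print(f"completeness = {100 * completeness(tp, fn):.0f}%")
    print(f"purity       = {100 * purity(tp, fp):.0f}%")
```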

3.1. EMPG Colors

We examine the typical colors of EMPGs in the literature. This paper only focuses on EMPGs at z ≲ 0.03, where the [O iii]+Hβ and Hα lines fall on the g and r bands, respectively.

We compile SDSS metal-poor galaxies at z < 0.03 with 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69 from the literature (Kunth & Östlin 2000; Kniazev et al. 2003; Guseva et al. 2007; Izotov & Thuan 2007; Izotov et al. 2009, 2012; Pustilnik et al. 2010; Pilyugin et al. 2012; Sánchez Almeida et al. 2016; Guseva et al. 2017). Figure 4 shows these SDSS metal-poor galaxies on the r − i versus g − r diagram, whose ${\mathrm{EW}}_{0}$(Hα) values are in the ranges of 0–300, 300–800, 800–1200, and >1200 Å. In Figure 4, metal-poor galaxies with a higher ${\mathrm{EW}}_{0}$(Hα) have a smaller r − i value with g − r ∼ 0 due to the g- and r-band excesses caused by strong nebular emission lines (top panel of Figure 2). This trend is also supported by the stellar synthesis and photoionization models, as shown with solid lines in Figure 4. These g- and r-band excesses are typical for EMPGs with strong emission lines, which basically enable us to separate EMPGs from other types of objects (e.g., galaxies, stars, and QSOs) only with photometric data. In addition, as described in Section 1, EMPGs with strong emission lines are expected to be local analogs of high-z SFGs because high-z SFGs have a high sSFR, which corresponds to high emission-line EWs, and a low metallicity.
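The size of a broadband excess can be estimated to first order from the line EW and the filter width. The following is a rough sketch, assuming a flat continuum, a top-hat filter, and a single emission line fully inside the band; the r-band width of 1400 Å is an approximate, assumed value, not a number taken from the paper.

```python
import math


def band_excess_mag(ew_rest, z, filter_width):
    """Magnitude change caused by one emission line in a top-hat filter.

    ew_rest      : rest-frame equivalent width of the line in Angstrom
    z            : redshift (the observed EW is larger by a factor of 1 + z)
    filter_width : effective filter width in Angstrom (assumed value)
    A negative result means the source appears brighter in that band.
    """
    ew_obs = ew_rest * (1.0 + z)
    return -2.5 * math.log10(1.0 + ew_obs / filter_width)


if __name__ == "__main__":
    R_BAND_WIDTH = 1400.0  # Angstrom, approximate and assumed
    for ew in (300.0, 800.0, 1200.0, 2000.0):
        dm = band_excess_mag(ew, z=0.02, filter_width=R_BAND_WIDTH)
        print(f"EW0(Halpha) = {ew:6.0f} A -> r-band excess = {dm:+.2f} mag")
```

In this approximation, an Hα line with ${\mathrm{EW}}_{0}$ ∼ 1000 Å brightens the r band by roughly half a magnitude at z ∼ 0.02, which pushes r − i toward negative values in the sense seen in Figure 4.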


Figure 4. Color–color diagram of g − r vs. r − i for previously reported metal-poor galaxies with 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69 at z < 0.03. The red stars, red circles, black crosses, and black dots represent SDSS metal-poor galaxies with ${\mathrm{EW}}_{0}$(Hα) > 1200, ${\mathrm{EW}}_{0}$(Hα) = 800–1200, ${\mathrm{EW}}_{0}$(Hα) = 300–800, and ${\mathrm{EW}}_{0}$(Hα) = 0–300 Å, respectively. The diamonds show the EMPGs at z < 0.03 with 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 7.0–7.2, AGC 198691 (Hirschauer et al. 2016), Little Cub (Hsyu et al. 2017), DDO 68 (Pustilnik et al. 2005; Annibali et al. 2019), IZw 18 (e.g., Izotov & Thuan 1998; Thuan & Izotov 2005), and Leo P (Skillman et al. 2013), which have SDSS photometry data. The four blue solid lines present the beagle model calculations with ${\mathrm{EW}}_{0}$(Hα) ∼ 2500, 1500, 1000, and 500 Å (from dark blue to light blue) under the assumption of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 7.50. On the blue solid lines, redshifts are indicated with dots (z = 0.01, 0.02, and 0.03 from upper left to lower right). The models are calculated in the same manner as the EMPG models in Section 3.2.3. The SDSS metal-poor galaxies with a larger ${\mathrm{EW}}_{0}$(Hα) show smaller r − i values due to the strong Hα-line contribution in an r-band magnitude, which is consistent with the beagle model calculations.


As described in Section 1, there are many contaminants in EMPG candidates selected with the classical color–color selection. Figure 5 shows the SDSS EMPGs with ${\mathrm{EW}}_{0}$(Hα) > 800 Å on the r − i versus g − r diagram, as well as the SDSS source catalog created in Section 2.3. Figure 5 demonstrates that the positions of the EMPGs are overlapped by many sources on the r − i versus g − r diagram. With the visual inspection, we find that most of the overlapping sources are contaminants such as stars and artifacts. Thus, we suggest that the classical color–color diagram is not effective for selecting EMPGs.


Figure 5. Same as Figure 4 but only with the SDSS EMPGs with ${\mathrm{EW}}_{0}$(Hα) > 800 Å. The black mesh and contours represent a 2D histogram of the SDSS source catalog (discussed later in Section 4.2). The contours indicate the number of sources (N = 1, 3, 10, 30, ..., 10,000) in each bin with a size of Δm = 0.04 mag. On this color–color diagram, the EMPGs largely overlap with many SDSS sources, most of which are contaminants such as stars.
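A 2D histogram of this kind can be built by binning each source in g − r and r − i. The sketch below uses the 0.04 mag bin size quoted in the caption; the input format (tuples of g, r, and i magnitudes) and the function names are assumptions made for illustration.

```python
import math
from collections import Counter

BIN_SIZE = 0.04  # mag, bin size quoted in the Figure 5 caption


def color_bin(g, r, i, bin_size=BIN_SIZE):
    """Return the integer (g - r, r - i) bin indices of one source."""
    return (math.floor((g - r) / bin_size), math.floor((r - i) / bin_size))


def color_histogram(sources):
    """Count sources per color-color bin; `sources` yields (g, r, i) tuples."""
    return Counter(color_bin(g, r, i) for g, r, i in sources)


if __name__ == "__main__":
    # Made-up magnitudes: two EMPG-like sources and one red star-like source.
    sample = [(20.00, 19.98, 20.91), (21.00, 20.99, 21.93), (18.00, 17.50, 17.29)]
    for (gr_idx, ri_idx), n in sorted(color_histogram(sample).items()):
        print(f"g-r bin {gr_idx:4d}, r-i bin {ri_idx:4d}: N = {n}")
```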


To demonstrate the effectiveness of a selection with the g- and r-band excesses, we compare known EMPGs and EMPG models with the GP/BB selections on the r − i versus g − r diagram. Figure 6 demonstrates that there is little overlap among the EMPGs, GPs, and BBs on the r − i versus g − r diagram. The EMPGs show colors of $r-i\sim -1$ and g − r ∼ 0, which suggest the g- and r-band excesses. The solid and dotted lines are the selection criteria of the GPs (Cardamone et al. 2009) and BBs (Yang et al. 2017b), respectively. We also show galaxy models of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 8.00 (GPs/BBs), as well as 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.69 and 6.94 (EMPGs) in Figure 6 (see model details in Section 3.2.3). The 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.69 and 6.94 models are of lower metallicity than the lowest-metallicity galaxy currently known, 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.98 (J0811+4730; Izotov et al. 2018b). However, we include these models with the expectation that such low-metallicity systems can be found with the deeper HSC-SSP data. The EMPG models with 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.69 and 6.94 show little overlap with the GP and BB selections, which means that the GP/BB selection criteria are not well suited to selecting EMPGs with 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ≲ 7.0. The g- and r-band excesses thus appear to satisfy a necessary condition for selecting EMPGs with 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ≲ 7.0. The next step is to remove as many contaminants as possible, such as those shown in Figure 5.


Figure 6. Model calculation of g − r and r − i colors for galaxies with ${\mathrm{EW}}_{0}$(Hα) ∼ 2000 Å that have 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.69 (dark blue line; EMPGs), 6.94 (light blue line; EMPGs), and 8.00 (gray line; GPs). Dots and crosses are placed in every step of Δz = 0.01 and 0.1, respectively. The green triangles and cyan squares are GPs at z ∼ 0.3 (Cardamone et al. 2009) and BBs at z ∼ 0.04 (Yang et al. 2017b), respectively. The black solid and dotted lines are boundaries used to select GPs at z ∼ 0.3 (Cardamone et al. 2009) and BBs at z ∼ 0.04 (Yang et al. 2017b), respectively. The EMPGs derived from the literature are shown with red stars, red circles, and black diamonds (see Figure 4).


3.2. ML Classifier

We construct an object classifier with a DNN that efficiently separates EMPGs from other types of objects by reducing as many contaminants as possible. The DNN, which is one of the most utilized artificial neural networks, is used to solve a classification problem. The DNN is composed of multiple layers of neural networks and optimized with training data to best separate multiple types of objects. We train the classifier with spectral energy distribution (SED) models because only a limited number of metal-poor galaxies have been identified fainter than 21 mag (Izotov et al. 2018b, 2019b), which means that the existing metal-poor galaxy sample is too small to make an ML training sample. In the following sections, we explain the merits of the DNN (Section 3.2.1), how the ML classifier is constructed (Section 3.2.2), and how the training samples are generated from models (Section 3.2.3).

3.2.1. Merits

There are four merits of the use of the DNN, as shown below.

  • (i)  
    We can select objects in a multidimensional photometry space (e.g., in a grizy five-dimensional space), while classical selections use a combination of simple linear projections onto a two-dimensional (2D) color–color diagram. As discussed in Section 3.1, EMPGs are overlapped by many sources on a color–color diagram of g − r versus r − i (Figure 5), for instance. In principle, if we use criteria in a multidimensional space, we can eliminate such overlapping sources more efficiently.
  • (ii)  
    We can use a nonlinear boundary that separates object types. The DNN can determine a nonlinear boundary thanks to a nonlinear function, called an activation function, that is used in the DNN structure (see Section 3.2.2). Although classical selections try to separate object types with a straight line on a color–color diagram, such a simple, straight line does not always separate different types of objects well. The use of a nonlinear boundary usually reduces the contamination and increases the completeness.
  • (iii)  
    A boundary is optimized by the DNN algorithm, whereas the classical boundaries are not optimized mathematically. The DNN enables the objective determination of the boundaries. Figure 7 is a schematic illustration of merits (i)–(iii).
  • (iv)  
    The DNN selection is very fast. Indeed, in principle, we are able to select EMPG candidates by fitting with SED models of galaxies, stars, and QSOs in a wide range of parameters. However, such SED fitting takes much longer than the DNN. Our DNN classifier requires only several minutes to train itself and classify sources once we produce SED models of galaxies, stars, and QSOs.


Figure 7. Schematic illustrations of two methods of object classification. Left: object classification on the color–color diagram with a linear boundary. Two different types of objects are presented with squares and circles. Right: object classification in the multidimensional space (e.g., five-dimensional space of grizy-band magnitudes) with a nonlinear boundary. The DNN classification corresponds to the right panel, while the left panel demonstrates a classical color–color selection. The size of the circles and squares represents a distance on the line of sight.

Although there are many other ML algorithms, such as a support vector machine, we begin with the DNN because of the four merits described above. We focus on the use of the DNN in this paper because our purpose is to construct an EMPG sample, not to find the most efficient selection method. In Section 8.1, we spectroscopically confirm that the success rate of our ML classifier is ≳50%, which is high enough to construct an EMPG sample. Thus, the comparison between the DNN and other ML techniques is outside the scope of this paper.

3.2.2. Structure

We construct an object classifier that distinguishes four object types of EMPGs, non-EMPG galaxies, stars, and QSOs. For every source input, the classifier calculates the probabilities of the four types and chooses only one type whose probability is the highest of the four. In our calculation, we use an open-source software library for ML, Tensorflow.24 Its detailed structure and training process are explained below.

The object classifier is constructed with the DNN that consists of three hidden layers and one fully connected layer. Figure 8 is a schematic illustration of the structure of our classifier. The three hidden layers and one fully connected layer have 8/16/32 and 64 nodes, respectively. As Figure 8 shows, these nodes are connected with branches, which represent a linear-combination calculation. Each node in the hidden layers is followed by an activation function called a rectified linear unit (ReLU; Nair & Hinton 2010). The activation function, ReLU, is a nonlinear function, which is essential to construct a deep-layer structure. In the fully connected layer, 10% of the nodes are dropped at random to avoid overfitting.

Figure 8. Schematic illustration of the structure of our ML classifier based on the DNN. The nodes (open circles) and branches (solid lines) represent a linear combination. The green circles are ReLU activation functions.

The inputs of our classifier are four (five) photometric magnitudes of the HSC griz bands (SDSS ugriz bands). We do not use the HSC y-band photometry, which is shallower than the other bands, to reach as faint magnitudes as possible. After calculations, the classifier outputs four probabilities of the EMPG, non-EMPG galaxy, star, and QSO and chooses only one type whose probability is the highest of the four. Here we obtain probabilities with the softmax function. The softmax function is a mathematical function that normalizes a vector with an exponential function.
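As an illustration of the forward pass described above, the following plain-Python sketch chains linear combinations, ReLU activations, and a final softmax. It is not the TensorFlow implementation used in this work: the weights are random rather than trained, the helper names (`init_layers`, `dense`, `relu`, `softmax`, `classify`) and the example magnitudes are hypothetical, the placement of the 64-node layer after the 8/16/32 hidden layers and its ReLU activation are assumptions, and dropout (a training-time device) is omitted.

```python
import math
import random

CLASSES = ["EMPG", "non-EMPG galaxy", "star", "QSO"]
# Assumed ordering: 4 griz inputs -> hidden 8/16/32 -> fully connected 64 -> 4 outputs
LAYER_SIZES = [4, 8, 16, 32, 64, 4]


def init_layers(sizes, seed=0):
    """Random (untrained) weights and zero biases, for illustration only."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def dense(x, weights, biases):
    """Linear combination of the inputs for every node of a layer."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(x):
    """Rectified linear unit applied element by element."""
    return [max(0.0, v) for v in x]


def softmax(x):
    """Normalize a vector with exponentials so that the outputs sum to 1."""
    peak = max(x)  # subtracting the maximum keeps exp() from overflowing
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def classify(magnitudes, layers):
    """Return the most probable class and the four class probabilities."""
    x = list(magnitudes)
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    probs = softmax(dense(x, weights, biases))
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs


layers = init_layers(LAYER_SIZES)
label, probs = classify([21.3, 21.8, 22.5, 22.4], layers)  # hypothetical griz magnitudes
```

With untrained weights the returned label is meaningless; the sketch only shows how four probabilities that sum to unity are produced and how the single most probable type is chosen.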

The structure of the neural network is optimized so that the sum of the output errors is minimized. We optimize our classifier with a training sample, in which object types are known beforehand. The optimization process is usually called "training." We use the cross-entropy loss and the Adam optimizer (Kingma & Ba 2014), which are built into the Tensorflow software, to calculate and minimize the errors in the training, respectively. To train our classifier, we prepare a training sample with the SED models that will be detailed in Section 3.2.3. Then the training sample is divided into two independent data sets. Here 80% of the training sample is used as training data and the other 20% as check data. We use the training data to train the neural network, while the check data are prepared to check whether the classifier successfully identifies and separates EMPGs from other object types. In every training step, 100 models are randomly chosen from the training sample and used to train the neural network. One training step is defined as training with these 100 models. This training step is repeated 10,000 times.
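The bookkeeping of this training procedure (80%/20% split, batches of 100 models, 10,000 steps) and the cross-entropy error can be sketched as follows. This is a minimal illustration in plain Python, not the TensorFlow code used here; the function names are hypothetical, the integers stand in for the 120,000 training models, and drawing each batch without replacement is an assumption.

```python
import math
import random


def split_sample(sample, train_fraction=0.8, seed=0):
    """Shuffle the sample and split it into training data and check data."""
    rng = random.Random(seed)
    shuffled = list(sample)
    rng.shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


def cross_entropy(probs, true_index):
    """Cross-entropy error of one classification, given the true class index."""
    return -math.log(max(probs[true_index], 1e-12))  # floor avoids log(0)


def minibatches(train_data, batch_size=100, n_steps=10_000, seed=1):
    """Yield one randomly drawn batch of models per training step."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        yield rng.sample(train_data, batch_size)


sample = list(range(120_000))  # placeholders for the 120,000 training models
train_data, check_data = split_sample(sample)
first_batch = next(minibatches(train_data))
uniform_error = cross_entropy([0.25, 0.25, 0.25, 0.25], 0)
```

A classifier that assigns equal probability to the four types has a cross-entropy of ln 4 per source; training drives this value toward zero.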

Successful cases are defined as true positives (i.e., a real EMPG classified as an EMPG) and true negatives (i.e., a real galaxy/star/QSO classified as a galaxy/star/QSO), as summarized in Figure 9. Here we only focus on whether an object is an EMPG or not. In other words, we ignore mistakes in the classification between galaxies, stars, and QSOs. Our EMPG classification is not affected by ignoring these mistakes. We define the success rate as the number of successful cases divided by the total number of classifications. In the calculation, we repeat the training step 10,000 times, until the success rate constantly exceeds 99.5%.
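The success rate defined above reduces the four-type problem to a binary one. A minimal sketch, with a hypothetical function name and made-up labels:

```python
def success_rate(true_types, predicted_types):
    """Fraction of sources whose EMPG / not-EMPG status is predicted correctly.

    Confusion among galaxies, stars, and QSOs is deliberately not counted
    as a failure.
    """
    if len(true_types) != len(predicted_types):
        raise ValueError("inputs must have the same length")
    successes = sum(
        (truth == "EMPG") == (guess == "EMPG")
        for truth, guess in zip(true_types, predicted_types)
    )
    return successes / len(true_types)


truth = ["EMPG", "galaxy", "star", "QSO", "EMPG"]
guess = ["EMPG", "star", "star", "galaxy", "galaxy"]
rate = success_rate(truth, guess)  # only the last source counts as a failure
```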

Figure 9. Matrix that explains the successful cases defined in this paper. The columns and rows correspond to answers in reality and estimations made by the classifier, respectively. We mark successful cases with circles. Note that we ignore mistakes in the classification between galaxies, stars, and QSOs because we only aim to select EMPGs.

3.2.3. Training Sample

We prepare the training sample that is used to train the ML classifier explained in Section 3.2.2. The training sample consists of photometric magnitudes calculated from models of EMPGs, non-EMPG galaxies, stars, and QSOs. The photometric magnitudes are calculated from the SED models by convolving the SEDs with the throughput curves of the HSC (Kawanomoto et al. 2018) or SDSS (Fukugita et al. 1996) broadband filters. Below, we detail the models of the EMPGs, non-EMPG galaxies, stars, and QSOs.
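As a rough illustration of how a broadband magnitude follows from an SED and a throughput curve, the sketch below evaluates the standard photon-counting AB magnitude with a trapezoidal integration. It is an assumption that this matches the internal calculation of the codes used here; the function names, the top-hat filter, and the flat spectrum are hypothetical test inputs.

```python
import math


def trapezoid(y, x):
    """Trapezoidal integral of y sampled at increasing x."""
    return sum(0.5 * (y[k] + y[k + 1]) * (x[k + 1] - x[k]) for k in range(len(x) - 1))


def ab_magnitude(wavelength, f_nu, throughput):
    """AB magnitude of an SED seen through a filter (photon-counting weighting).

    wavelength : increasing wavelengths (any unit)
    f_nu       : flux density in erg s^-1 cm^-2 Hz^-1 at those wavelengths
    throughput : dimensionless filter response at those wavelengths
    """
    weighted = [f * t / w for f, t, w in zip(f_nu, throughput, wavelength)]
    norm = [t / w for t, w in zip(throughput, wavelength)]
    mean_f_nu = trapezoid(weighted, wavelength) / trapezoid(norm, wavelength)
    return -2.5 * math.log10(mean_f_nu) - 48.60


# Hypothetical top-hat filter between 5500 and 7000 Angstrom
wavelength = [5500.0 + 10.0 * k for k in range(151)]
throughput = [1.0] * len(wavelength)
flat = [3.631e-20] * len(wavelength)   # 3631 Jy, the AB zero point
mag_flat = ab_magnitude(wavelength, flat, throughput)
mag_faint = ab_magnitude(wavelength, [f / 100.0 for f in flat], throughput)
```

A source with a flat spectrum of 3631 Jy comes out at approximately 0 mag, and a source 100 times fainter is 5 mag fainter, as expected for the AB system.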

(1) EMPG model. We generate EMPG SEDs with the SED interpretation code beagle (Chevallard & Charlot 2016). The beagle code calculates both the stellar continuum and the nebular emission (line + continuum) in a self-consistent manner using the stellar population synthesis code (Bruzual & Charlot 2003) and the photoionization code cloudy (Ferland et al. 2013). The beagle code uses the cloudy photoionization models produced by Gutkin et al. (2016), where the photoionization calculations are stopped when the electron density falls below 1% of the hydrogen density or when the temperature falls below 100 K. In the cloudy photoionization models, we assume the solar carbon-to-oxygen abundance ratio (C/O) and the metallicity-dependent nitrogen-to-oxygen abundance ratio (N/O) given in Gutkin et al. (2016). The assumption of the C/O and N/O ratios does not affect our model photometry because carbon and nitrogen lines are very faint. In the beagle calculation, we vary the five parameters of stellar mass, maximum stellar age (hereafter just "age"), gas-phase metallicity, ionization parameter (log U), and redshift, as follows:

  • (i)  
    log(${M}_{\star }$/${M}_{\odot }$) = (4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0);
  • (ii)  
    log(age/yr) = (6.00, 6.25, 6.50, 6.60, 6.70, 6.80, 6.90, 7.00, 7.25, 7.50, 7.75);
  • (iii)  
    12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = (6.69, 7.19, 7.69);
  • (iv)  
    $\mathrm{log}\,U$ = (−2.7, −2.5, −2.3); and
  • (v)  
    redshift = (0.01, 0.02).
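The five lists above define a regular grid. A short sketch (variable names are ours) enumerates every parameter combination and counts them:

```python
from itertools import product

log_mass = [4.0 + 0.5 * k for k in range(11)]  # log(M*/Msun) = 4.0 ... 9.0
log_age = [6.00, 6.25, 6.50, 6.60, 6.70, 6.80, 6.90, 7.00, 7.25, 7.50, 7.75]
oxygen_abundance = [6.69, 7.19, 7.69]          # 12 + log(O/H)
log_u = [-2.7, -2.5, -2.3]
redshift = [0.01, 0.02]

# One tuple per model SED: (mass, age, metallicity, ionization parameter, redshift)
grid = list(product(log_mass, log_age, oxygen_abundance, log_u, redshift))
n_models = len(grid)  # 11 x 11 x 3 x 3 x 2
```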

The stellar mass, age, and gas-phase metallicity cover typical values of known EMPGs (e.g., Thuan & Izotov 2005; Izotov et al. 2009, 2018b; Skillman et al. 2013; Hirschauer et al. 2016; 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 7.0–7.2) and/or theoretical predictions (e.g., Wise et al. 2012). A stellar metallicity is matched to a gas-phase metallicity here. The ionization parameter is defined by a ratio of hydrogen ionizing photon flux, ${S}_{{{\rm{H}}}^{0}}$, and hydrogen gas density, ${n}_{{\rm{H}}}$, normalized by the speed of light, c,

$U\equiv \dfrac{{S}_{{{\rm{H}}}^{0}}}{c\,{n}_{{\rm{H}}}}.$    (1)

We choose ionization parameters of $\mathrm{log}\,U$ = (−2.7, −2.5, −2.3), which are typical values for metal-poor galaxies, as demonstrated in Figure 10. A constant star formation history is assumed in the model. Here we also assume no dust attenuation because we target very metal-poor galaxies, where the dust production is insufficient. Indeed, representative metal-poor galaxies (e.g., Thuan & Izotov 2005; Izotov et al. 2009; Skillman et al. 2013; Hirschauer et al. 2016; Izotov et al. 2018b; 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 7.0–7.2) show a negligibly small dust attenuation with a color excess of E(B − V) ∼ 0. The Chabrier (2003) stellar initial mass function (IMF) is applied in the beagle code (Chevallard & Charlot 2016).

In total, we generate 2178 (=11 × 11 × 3 × 3 × 2) SEDs with the parameters described above. For each SED, the beagle code also calculates the photometric magnitudes with the response curves of the HSC and SDSS filters, as well as emission-line EWs. From the 2178 model SEDs, we only select 1397 models that satisfy i < 26 mag and ${\mathrm{EW}}_{0}$(Hα) > 1000 Å. The limit of 26 mag roughly corresponds to the i-band limiting magnitude of the HSC imaging data. The criterion ${\mathrm{EW}}_{0}$(Hα) > 1000 Å corresponds to an age ≲10–100 Myr in this metallicity range, 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69, as shown in Figure 11. Although we aim to select EMPGs with ${\mathrm{EW}}_{0}$(Hα) > 800 Å in this paper, here we limit the EMPG models to ${\mathrm{EW}}_{0}$(Hα) > 1000 Å. This is because, if we limited the EMPG models to ${\mathrm{EW}}_{0}$(Hα) > 800 Å, we would tend to obtain galaxies with ${\mathrm{EW}}_{0}$(Hα) < 800 Å due to the 0.1 mag photometry error (explained below), as well as increased contamination. Then, to take the magnitude errors of 0.1 mag (see Sections 2.2 and 2.3) into consideration in the models, we generate random numbers under the assumption of a normal distribution with σ = 0.1 and add them to the photometric magnitudes. Here we generate 30 sets of random numbers for each model.25 Thus, we obtain a total of 41,910 (=1397 × 30) models including magnitude errors. We do not use models that satisfy 0.02 < z ≤ 0.03 because we find that the contamination rate increases in that case.
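The error-injection step can be sketched as follows. The function name, the fixed random seed, and the placeholder magnitudes are hypothetical; only σ = 0.1 mag, the 30 realizations, and the 1397 selected models come from the text.

```python
import random


def add_magnitude_errors(models, n_realizations=30, sigma=0.1, seed=0):
    """Return n_realizations noisy copies of each model's magnitudes.

    Each magnitude receives an independent Gaussian offset with standard
    deviation sigma (in mag).
    """
    rng = random.Random(seed)
    noisy = []
    for magnitudes in models:
        for _ in range(n_realizations):
            noisy.append([m + rng.gauss(0.0, sigma) for m in magnitudes])
    return noisy


# Placeholder griz magnitudes standing in for the 1397 selected EMPG models
models = [[22.0, 22.3, 23.0, 22.9] for _ in range(1397)]
noisy_models = add_magnitude_errors(models)
```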

Figure 10. Ionization parameters of typical local galaxies as a function of metallicity. Red stars represent averages of the local metal-poor galaxies (Nagao et al. 2006, sample A+B). Black dots are obtained from the SDSS composite spectra of Andrews & Martini (2013). Metallicities are based on the electron temperature measurements. Ionization parameters are calculated assuming the photoionization model of Kewley & Dopita (2002). This figure suggests that galaxies with 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 7.0–7.5 have $\mathrm{log}\,U\sim -2.5$.

Figure 11. The ${\mathrm{EW}}_{0}$(Hα) values as a function of age. The colors of the solid lines correspond to metallicities 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.69, 6.94, 7.19, 7.50, 7.69, and 8.00 from dark blue to dark red. These relations are provided by the beagle models under the assumption of constant star formation.

(2) Galaxy model (non-EMPG). We introduce two types of non-EMPG galaxies: normal SFGs and GPs. First, we generate SEDs of normal SFGs with the beagle code, similar to the EMPG models. In the calculation, we change the five parameters of stellar mass, age, metallicity, redshift, and V-band dust attenuation optical depth (τV) as shown, assuming a bursty star formation history:

  • (i)  
    log(${M}_{\star }$/${M}_{\odot }$) = 8.0, 9.0, 10.0, 11.0;
  • (ii)  
    log(age/yr) = 8.5, 9.0, 9.5, 10.0;
  • (iii)  
    12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 8.19, 8.69, 8.89;
  • (iv)  
    redshift = 0.02, 0.04, 0.06, 0.08, 0.10; and
  • (v)  
    log(τV) = 0.0, 2.0.

The stellar mass, age, metallicity, and V-band dust attenuation optical depth are selected from typical values of local SFGs. We fix an ionization parameter to $\mathrm{log}\,U$ = −3.0, which is a value representative of local galaxies as demonstrated in Figure 10. In total, we generate 480 SEDs with the parameters described above. The photometric magnitudes are calculated in the same manner as the EMPG models. From the 480 models, we only select models that satisfy i < 26 mag. After the i-band magnitude selection, 471 models remain. We introduce magnitude errors of 0.1 mag, similar to the EMPG models, generating 100 sets of random numbers for the 471 models. Then we have 47,100 normal-SFG models in total, including magnitude errors.

Second, we also create GP SEDs with the beagle code with the following four parameters:

  • (i)  
    log(${M}_{\star }$/${M}_{\odot }$) = 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0;
  • (ii)  
    log(age/yr) = 6.0, 6.1, 6.2, ..., 8.0;
  • (iii)  
    $\mathrm{log}\,U$ = (−3.0, −2.5, −2.0); and
  • (iv)  
    redshift = 0.08, 0.09, 0.10, ..., 0.40.

Metallicity is fixed at 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 8.00, which is the typical value of GPs (see Figure 1). We also assume the dust-free condition in the GP models. We obtain 14,553 (=7 × 21 × 3 × 33) models. From the 14,553 models, we use 3234 models that satisfy i < 26 mag. We also introduce 0.1 mag errors in magnitude as described above, generating 10 sets of random numbers for the 3234 models. Then we have 32,340 GP models in total.
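Counting the grid values listed above reproduces the quoted total of 14,553 GP models; the redshift list from 0.08 to 0.40 in steps of 0.01 contains 33 values. A small sketch with our own variable names:

```python
log_mass = [7.0 + 0.5 * k for k in range(7)]               # 7.0 ... 10.0
log_age = [round(6.0 + 0.1 * k, 1) for k in range(21)]     # 6.0 ... 8.0
log_u = [-3.0, -2.5, -2.0]
redshift = [round(0.08 + 0.01 * k, 2) for k in range(33)]  # 0.08 ... 0.40

n_gp_models = len(log_mass) * len(log_age) * len(log_u) * len(redshift)
n_gp_with_errors = 3234 * 10  # models with i < 26 mag, 10 noise realizations each
```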

We combine the 47,100 normal-SFG models and 32,340 GP models into 79,440 models of non-EMPG galaxies. We do not include galaxies over z = 0.1 in the non-EMPG galaxy models because galaxies at z > 0.1 have quite different colors from those of low-z EMPGs. The inclusion of high-z galaxy models does not change our result.

(3) Stellar model.  We use the stellar SED models of Castelli & Kurucz (2004), where 53 types of stars are modeled from O- to M-type. For each stellar type, SEDs are calculated in a metallicity range of $\mathrm{log}(Z/{Z}_{\odot })$ = (−2.5, −2.0, −1.5, −1.0, −0.5, ±0.0, +0.2, +0.5). Thus, we obtain 424 (=53 × 8) model SEDs in total. These stellar model SEDs are obtained from the STScI website.26 Assuming the HSC and SDSS filters, we calculate u − i, g − i, r − i, and z − i colors from the 424 model SEDs. Then we determine i-band magnitudes, selecting 10 values in the range of i = 15–26 mag at regular intervals. Multiplying the 424 sets of u − i, g − i, r − i, and z − i colors and the 10 i-band magnitudes, we generate 4240 (=424 × 10) sets of stellar models with photometric magnitudes. In addition, we also introduce magnitude errors (0.1 mag) similarly to the EMPG models, obtaining 42,400 (=4240 × 10) stellar models in total.
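Because a stellar SED fixes only the colors, each model is paired with a set of i-band magnitudes to obtain apparent magnitudes. The sketch below shows this step; the function names and the example colors are hypothetical, and including both end points (15 and 26 mag) in the regular grid is an assumption.

```python
def i_band_grid(i_min=15.0, i_max=26.0, n_values=10):
    """Regularly spaced i-band magnitudes, end points included (assumed)."""
    step = (i_max - i_min) / (n_values - 1)
    return [i_min + k * step for k in range(n_values)]


def magnitudes_from_colors(colors, i_mag):
    """Convert colors relative to the i band into ugriz magnitudes."""
    mags = {band: colors[band + "-i"] + i_mag for band in "ugrz"}
    mags["i"] = i_mag
    return mags


example_colors = {"u-i": 2.1, "g-i": 0.9, "r-i": 0.3, "z-i": -0.1}  # made-up star
star_models = [magnitudes_from_colors(example_colors, i) for i in i_band_grid()]
n_stellar_models = 424 * len(i_band_grid())  # before adding magnitude errors
```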

(4) QSO model. We use a composite spectrum of QSOs at 1 < z < 2.1 observed by the X-SHOOTER spectrograph installed on the Very Large Telescope (VLT; Selsing et al. 2016). This composite spectrum covers a wide wavelength range of 1000–11000 Å in the rest frame. From this composite spectrum, we generate mock spectra by varying three parameters, the power-law index (α) of an intrinsic near-ultraviolet (NUV) slope ${f}_{\lambda }\propto {\lambda }^{\alpha }$, the V-band dust attenuation optical depth, and the redshift:

  • (i)  
    α= −2.0, −1.5, −1.0, −0.5;
  • (ii)  
    τV = 0.0, 0.5, 1.0; and
  • (iii)  
    redshift = 0.1, 0.2, 0.3, ..., 3.0.

The intrinsic NUV slope and V-band dust attenuation optical depth of typical QSOs are well covered by the parameters above (e.g., Telfer et al. 2002; Selsing et al. 2016). Then we get 360 (=4 × 3 × 30) QSO model SEDs in total. Similar to the stellar models, we calculate u − i, g − i, r − i, and z − i colors from the 360 model SEDs. Here we take 10 values of the i-band magnitude in the range of i = 15–26 mag at regular intervals, obtaining 3600 models. In addition, we also introduce magnitude errors (0.1 mag) similar to the EMPG models, obtaining 36,000 (=3600 × 10) QSO models in total.

In this section, we have generated 41,910, 79,440, 42,400, and 36,000 models for the EMPGs, non-EMPG galaxies, stars, and QSOs, respectively. Selecting 30,000 models from each of the EMPGs, non-EMPG galaxies, stars, and QSOs, we obtain a training sample composed of 120,000 (=30,000 × 4) models in total. Figure 12 shows models of EMPGs, non-EMPG galaxies, stars, and QSOs on the projected color–color diagrams of g − r versus r − i and r − i versus i − z. The EMPGs are overlapped with non-EMPG galaxies and stars on these projected color–color diagrams, which potentially causes contamination in the EMPG selection.
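Drawing the same number of models from every class yields a balanced training sample. A minimal sketch, in which the function name is hypothetical, integers stand in for the models, and random selection without replacement is an assumption (the text states only that 30,000 models are selected per class):

```python
import random


def build_training_sample(models_by_class, n_per_class=30_000, seed=0):
    """Draw n_per_class models from each class and attach the class label."""
    rng = random.Random(seed)
    sample = []
    for label, models in models_by_class.items():
        for model in rng.sample(models, n_per_class):
            sample.append((model, label))
    rng.shuffle(sample)  # mix the classes before training
    return sample


model_counts = {"EMPG": 41_910, "non-EMPG galaxy": 79_440, "star": 42_400, "QSO": 36_000}
models_by_class = {label: list(range(n)) for label, n in model_counts.items()}
training_sample = build_training_sample(models_by_class)
```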

Figure 12. Top: model templates of EMPGs (green), non-EMPG galaxies (blue), stars (yellow), and QSOs (red) on the color–color diagrams of g − r vs. r − i. The contours show the number of models (N = 1, 3, 10, 30, 100, and 300) in each bin with a size of Δm = 0.025 mag. Bottom: same as the top panel but for r − i vs. i − z. The gray lines present Equations (2)–(4), described in Section 4.1.

Figure 13. The HSC gri-band images of the four HSC-EMPG candidates, HSC J1429−0110, HSC J2314+0154, HSC J1142−0038, and HSC J1631+4426.

3.3. Test with SDSS Data

Before we apply our ML classifier to the HSC-SSP and SDSS source catalogs, we test whether our classifier successfully distinguishes EMPGs from other types of objects (non-EMPG galaxies, stars, or QSOs). We carry out the test with SDSS data whose sources are detected in photometry and observed in spectroscopy. Such a data set is a good test sample because we can easily check object types (galaxy, star, or QSO) and metallicities in their spectra. We can also see if a source satisfies the EMPG condition of 12 + log(O/H) < 7.69. We do not expect to discover unconfirmed EMPGs in the SDSS test catalog because SDSS sources with spectroscopic confirmation have been intensively investigated by many authors (e.g., Sánchez Almeida et al. 2016; Guseva et al. 2017). This step is important toward discovering unconfirmed EMPGs in the HSC data, whose limiting magnitude is ≳5 mag deeper than SDSS, potentially pushing to lower metallicities than what has been discovered so far through SDSS. Here we explain how we create an SDSS test catalog in Section 3.3.1, and the test results are described in Section 3.3.2.

Figure 14. The SDSS gri-band images of the six SDSS-EMPG candidates, SDSS J0002+1715, SDSS J1642+2233, SDSS J2115−1734, SDSS J2253+1116, SDSS J2310−0211, and SDSS J2327−0200.

3.3.1. SDSS Test Catalog

We construct an SDSS test catalog from the SDSS DR13 data. The SDSS DR13 data are based on SDSS-I through SDSS-IV data, which contain the extragalactic spectroscopy data from the SDSS-I (York et al. 2000), the SDSS-II (Abazajian et al. 2009), the Baryon Oscillation Spectroscopic Survey (BOSS; Dawson et al. 2013), and the extended Baryon Oscillation Spectroscopic Survey (eBOSS; Dawson et al. 2016). In the SDSS data, typical wavelength ranges are 3800–9200 or 3650–10400 Å, and a typical spectral resolution is R = 1500–2500. We only select objects cross-matched with the photometry catalog of PhotoObjAll and the spectroscopy catalogs of SpecObjAll, galSpecExtra, and galSpecLine. Then we construct the SDSS test catalog in the same way as the SDSS source catalog described in Section 2.3. The SDSS test catalog is composed of 935,042 sources (579,961 galaxies, 327,421 stars, and 27,660 QSOs) in total. The spectroscopic effective area of the SDSS data is 9376 deg$^{2}$.

3.3.2. Tests

Applying our ML classifier (Section 3.2) to the SDSS test catalog (Section 3.3.1), we obtained 13 EMPG candidates from the SDSS test catalog. We have checked their object classes (galaxy, star, and QSO) that are given in the SpecObjAll catalog based on spectroscopy. Based on the images, spectra, and object classes, we identify all 13 candidates as galaxies. We find that these 13 galaxies are well studied with spectroscopic data in the literature.

For the 13 galaxies, we obtain redshift, ${\mathrm{EW}}_{0}$(Hα), and ${\mathrm{EW}}_{0}$(Hβ) values from the SpecObjAll and galSpecLine catalogs. Metallicities of the 13 galaxies are derived from the literature (Kunth & Östlin 2000; Kniazev et al. 2003; Guseva et al. 2007; Izotov & Thuan 2007; Izotov et al. 2009, 2012; Pustilnik et al. 2010; Pilyugin et al. 2012; Sánchez Almeida et al. 2016; Guseva et al. 2017). These metallicities are calculated based on electron temperature measurements. We find that six out of the 13 galaxies satisfy the EMPG condition, 12 +$\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69. Although the other seven galaxies do not fulfill the EMPG definition, they still have low metallicities in the range of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 7.8–8.5. As we expected, all 13 galaxies show large ${\mathrm{EW}}_{0}$(Hα) values (750–1700 Å). We summarize their object classes, redshifts, ${\mathrm{EW}}_{0}$(Hα), ${\mathrm{EW}}_{0}$(Hβ), and metallicities in Table 2.

Table 2.  Parameters of EMPG Candidates Selected in the SDSS Test

No.   ID                    Class    Redshift   EW0(Hα) (Å)   EW0(Hβ) (Å)   12 + log(O/H)   Citation
(1)   (2)                   (3)      (4)        (5)           (6)           (7)             (8)
1     J012534.2+075924.5    EMPG     0.010      1351          242           7.58            P12
2     J080758.0+341439.3    EMPG     0.022      1277          252           7.69            SA16
3     J082555.5+353231.9    EMPG     0.003      1441          238           7.45            K03
4     J104457.8+035313.1    EMPG     0.013      1462          276           7.44            K03
5     J141851.1+210239.7    EMPG     0.009      1153          215           7.50            SA16
6     J223831.1+140029.8    EMPG     0.021      953           183           7.43            P12
7     J001428.8−004443.9    Galaxy   0.014      995           184           8.05            P12
8     J025346.7−072344.0    Galaxy   0.005      787           138           7.97            K04
9     J115804.9+275227.2    Galaxy   0.011      750           119           8.34            B08
10    J125306.0−031258.8    Galaxy   0.023      1291^a        236           8.08            K04
11    J131447.4+345259.7    Galaxy   0.003      1700          293           8.14            B08
12    J132347.5−013252.0    Galaxy   0.022      1458          248           7.77            K04
13    J143905.5+364821.9    Galaxy   0.002      766^a         140           7.94            P12

Notes. (1) Number. (2) ID. (3) Object class. Galaxies with 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69 are classified as an EMPG. (4) Redshift. (5) and (6) Rest-frame EWs of Hα and Hβ emission lines. These values are obtained from the SDSS DR13 catalog. (7) Gas-phase metallicity obtained with the electron temperature measurement. (8) Citation from which the metallicity values are derived. P12: Pilyugin et al. (2012); K03: Kniazev et al. (2003); K04: Kniazev et al. (2004); B08: Brinchmann et al. (2008); SA16: Sánchez Almeida et al. (2016).

^a No reliable EW0(Hα) measurements are given due to pixel issues on the spectrum. Instead, we estimate EW0(Hα) values from EW0(Hβ) measurements and the empirical relation of EW0(Hα) = 5.47 × EW0(Hβ), which is obtained from metal-poor galaxies in the literature (Figure 19). Note that we highlight values of EWs and metallicities that satisfy the EMPG conditions in bold.
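The two footnoted Hα values in Table 2 can be recovered from the tabulated Hβ values with the empirical relation quoted above. A short check (the function name is ours):

```python
def ew_halpha_from_hbeta(ew_hbeta, ratio=5.47):
    """Estimate the rest-frame EW of H-alpha (in Angstrom) from that of H-beta."""
    return ratio * ew_hbeta


# EW0(H-beta) values of objects No. 10 and No. 13 in Table 2
estimate_no10 = round(ew_halpha_from_hbeta(236))
estimate_no13 = round(ew_halpha_from_hbeta(140))
```

The rounded estimates match the tabulated values of 1291 Å and 766 Å.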

The success rate, or purity, of our EMPG selection is 46% (6/13) for the SDSS test catalog. It is worth noting that the other 54% of galaxies (7/13) also show a low metallicity as described above. In the local metal-poor galaxy sample obtained from the literature in Section 3.1, we find seven EMPGs, which are also included in the SDSS test catalog and have high EWs, ${\mathrm{EW}}_{0}$(Hα) > 800 Å. Very high EWs are necessary to be selected by the gr-band excess technique. In other words, we have successfully selected the six EMPGs above out of the seven known high-EW EMPGs in the SDSS test catalog, which suggests that our selection reaches 86% (6/7) completeness. Thus, we conclude that our selection method has successfully selected EMPGs and EMPG-like galaxies from the SDSS test catalog.
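The purity and completeness quoted above follow directly from the candidate counts. A minimal sketch with hypothetical function names:

```python
def purity(n_true_selected, n_selected):
    """Fraction of selected candidates that are real EMPGs."""
    return n_true_selected / n_selected


def completeness(n_true_selected, n_true_total):
    """Fraction of known EMPGs that are recovered by the selection."""
    return n_true_selected / n_true_total


purity_percent = round(100 * purity(6, 13))             # 6 EMPGs among 13 candidates
completeness_percent = round(100 * completeness(6, 7))  # 6 of 7 known high-EW EMPGs
```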

4. Selection

In Section 3.3, we confirmed that our object classifier works well with the SDSS test catalog. Thus, we expect that our object classifier can also select EMPGs in the HSC-SSP data. In Sections 4.1 and 4.2, we choose EMPG candidates from the HSC-SSP and SDSS source catalogs (Sections 2.2 and 2.3) with our ML classifier (Section 3.2). Hereafter, these candidates chosen from the HSC-SSP and SDSS source catalogs are called "HSC-EMPG candidates" and "SDSS-EMPG candidates," respectively.

4.1. EMPG Candidates from the HSC Data

In Section 2.2, we created the HSC-SSP source catalog, which consists of 17,912,612 and 40,407,765 sources in the S17A and S18A source catalogs, respectively. As noted in Section 2.2, the sources selected from S17A and S18A data at this point are partly duplicated, but the duplication will be removed in the last step of the selection. In this section, we select EMPG candidates from the HSC-SSP source catalog in four steps, described below.

In the first step, we coarsely remove sources based on blending, extendedness, and color before we apply our ML classifier. We remove sources whose photometry is strongly affected by back/foreground objects as follows. Fluxes of a source and a back/foreground object are measured at the central position of the source, and when a flux of the back/foreground object exceeds 50% of the source flux, the source is removed. We only select extended sources whose extendedness_value flags are 1 in all of the griz bands. The hscPipe labels a point source and an extended source as extendedness_value = 0 and 1, respectively. The hscPipe defines a point source as a source whose PSF magnitude (${m}_{\mathrm{psf}}$) and cmodel magnitude (${m}_{\mathrm{cmodel}}$) match within ${m}_{\mathrm{psf}}-{m}_{\mathrm{cmodel}}\lt 0.0164$ (Bosch et al. 2018). To save calculation time, we remove part of the sources before we apply the classifier. To roughly remove sources whose colors are apparently different from EMPGs, we apply

Equation (2)

Equation (3)

To remove possible contamination from normal galaxies, we also apply

Equation (4)

In other words, we choose sources that satisfy Equations (2)–(4) here. We show Equations (2)–(4) in Figure 12. After these selection criteria, 680 and 2494 sources remain from the S17A and S18A data, respectively. The source removal in the first step effectively reduces the calculation time in the ML classifier in the second step below.
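The blending and extendedness parts of the first step can be sketched as below. The record layout (dictionary keys), the function names, and the use of a single flux pair per source are hypothetical; the color cuts of Equations (2)–(4) are not reproduced in this sketch.

```python
def is_point_source(m_psf, m_cmodel, threshold=0.0164):
    """Point-source criterion quoted above: m_psf - m_cmodel < threshold."""
    return (m_psf - m_cmodel) < threshold


def passes_blending_and_extendedness(source, bands="griz", max_blend_fraction=0.5):
    """Keep unblended sources that are extended in all of the griz bands.

    source is a hypothetical record with keys:
      'flux'          : source flux at its central position
      'neighbor_flux' : back/foreground flux at the same position
      'extendedness'  : dict mapping band -> extendedness_value (0 or 1)
    """
    if source["neighbor_flux"] > max_blend_fraction * source["flux"]:
        return False  # photometry strongly affected by a back/foreground object
    return all(source["extendedness"][band] == 1 for band in bands)


clean = {"flux": 10.0, "neighbor_flux": 2.0,
         "extendedness": {"g": 1, "r": 1, "i": 1, "z": 1}}
blended = dict(clean, neighbor_flux=6.0)
compact = dict(clean, extendedness={"g": 1, "r": 0, "i": 1, "z": 1})
kept = [passes_blending_and_extendedness(s) for s in (clean, blended, compact)]
```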

In the second step, we apply the ML classifier constructed in Section 3.2 to the sources selected above. The ML classifier selects 32 and 57 sources out of the 680 (S17A) and 2494 (S18A), respectively.

In the third step, we remove transient objects by checking the g- and r-band multi-epoch images. We measure fluxes in each epoch and calculate an average and a standard deviation of these flux values. If the standard deviation becomes larger than 25% of the average value, we regard the source as a transient object and eliminate it from the sample. Removing 10 and 15 sources, we obtain 22 and 42 sources after the third step for the S17A and S18A data, respectively.
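The variability criterion of the third step can be written compactly. In this sketch the function name and the example fluxes are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption:

```python
from statistics import mean, stdev


def is_transient(epoch_fluxes, max_fraction=0.25):
    """Flag a source whose epoch-to-epoch scatter exceeds 25% of its mean flux."""
    return stdev(epoch_fluxes) > max_fraction * mean(epoch_fluxes)


steady = is_transient([10.0, 10.5, 9.8, 10.2])   # small scatter around 10
flaring = is_transient([10.0, 10.0, 30.0, 10.0])  # one bright epoch
```

In practice the test would be applied to the g- and r-band multi-epoch fluxes of every source kept by the classifier.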

In the last step, we inspect a gri-composite image. Here we remove apparent H ii regions inside a large SFG, sources affected by a surrounding bright star, and apparently red sources. The apparently red sources are mistakenly selected due to an issue in the cmodel photometry. Indeed, they show red colors (r − i > 0.0) in the 1.0'' aperture photometry, while they become blue in the cmodel photometry. In the inspection of multi-epoch and gri-composite images, we removed 10 and 21 sources from the S17A and S18A data, respectively.

Eventually, we thus obtain 12 and 21 HSC-EMPG candidates from the S17A and S18A catalogs, respectively. We find that six of the HSC-EMPG candidates are duplicated between the S17A and S18A catalogs. Thus, the number of independent HSC-EMPG candidates is 27 (=12 + 21 − 6). The magnitude range of the 27 HSC-EMPG candidates is i = 19.3–24.3 mag.
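The candidate counts quoted in the four steps can be cross-checked with simple bookkeeping. The dictionary layout and the placeholder candidate names are ours; the numbers are those given in the text.

```python
steps = {
    "S17A": {"after_classifier": 32, "transients": 10, "removed_by_inspection": 10},
    "S18A": {"after_classifier": 57, "transients": 15, "removed_by_inspection": 21},
}

final = {
    release: n["after_classifier"] - n["transients"] - n["removed_by_inspection"]
    for release, n in steps.items()
}

# Removing duplicates is a set union; placeholder names with six in common
s17a = {f"cand{k:02d}" for k in range(12)}
s18a = {f"cand{k:02d}" for k in range(6, 27)}
n_independent = len(s17a | s18a)
```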

Out of the 27 candidates, we find six candidates that are selected in S17A but not selected again in S18A. Four out of the six candidates are slightly redder in S18A than in S17A and thus not selected in S18A. The other two are removed in S18A due to flags related to a cosmic ray or nearby bright star. Such differences probably arise due to the different pipeline versions between S17A and S18A. We check the images and photometry of these six candidates individually. Then we confirm that these six candidates seem to have no problem as an EMPG candidate.

4.2. EMPG Candidates from the SDSS Data

In Section 2.3, we constructed the SDSS source catalog consisting of 31,658,307 sources. In this section, we select EMPG candidates from the SDSS source catalog similarly to the HSC source catalog in Section 4.1.

First, we remove sources that have colors apparently different from EMPGs with Equations (2)–(4). Then we apply our ML classifier to the SDSS source catalog, and our classifier selects 107 sources. Checking gri-composite images, we eliminate apparent H ii regions in a spiral galaxy, sources affected by a surrounding bright star, and apparently red sources. We also remove sources if the corresponding composite image shows an apparent problem that may be caused by an incorrect zero-point magnitude. In the visual inspection above, 21 sources have been removed. These steps leave us with 86 SDSS-EMPG candidates from the SDSS source catalog, whose i-band magnitudes are in the range i = 14.8–20.9 mag.

Cross-matching the SDSS-EMPG candidates with the SDSS spectra data, we find that 17 out of the 86 candidates already have an SDSS spectrum. These 17 spectra show strong nebular emission lines from galaxies at z = 0.002–0.026, 15 of which have already been reported with a metallicity measurement in the range of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 7.44–8.22 (Kniazev et al. 2003, 2004; Izotov et al. 2007, 2012; Engelbracht et al. 2008; Shirazi & Brinchmann 2012; Izotov & Thuan 2016; Sánchez Almeida et al. 2016). Seven out of the 15 galaxies satisfy the EMPG condition, 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69. All six of the EMPGs chosen in our classifier test (Section 3.3) are selected again here. Another object out of the 86 candidates is HSC J1429−0110, which is also selected as an HSC-EMPG candidate in Section 4.1.

5. Spectroscopy

We have carried out spectroscopy for the 10 EMPG candidates with four spectrographs: the Low Dispersion Survey Spectrograph 3 (LDSS-3) and the Magellan Echellette Spectrograph (MagE; Marshall et al. 2008) on the Magellan telescope, the Deep Imaging Multi-Object Spectrograph (DEIMOS; Faber et al. 2003) on the Keck II telescope, and the Faint Object Camera And Spectrograph (FOCAS; Kashikawa et al. 2002) on the Subaru telescope. We show HSC and SDSS gri-band images of the 10 EMPG candidates in Figures 13 and 14. In this section, we explain the spectroscopy for the 10 EMPG candidates.

5.1. Magellan/LDSS-3 Spectroscopy

We conducted spectroscopy for an HSC-EMPG candidate (HSC J1429−0110) on 2018 June 12 with LDSS-3 at the Magellan telescope (PI: M. Rauch). We used the VPH-ALL grism with the 0.75'' × 4' long slit, which was placed at the offset position 2' away from the center of the long-slit mask so that the spectroscopy could cover the bluer side. The exposure time was 3600 s. The spectroscopy covered λ ∼ 3700–9500 Å with a spectral resolution of R ≡ λ/Δλ ∼ 860. The A0-type standard star CD-32 9972 (R.A. = 14:11:46.37, decl. = −33:03:14.3 in J2000) was also observed. The sky was clear during the observation with seeing sizes of 0.6''–0.9''.

5.2. Magellan/MagE Spectroscopy

We carried out spectroscopy for two HSC-EMPG candidates (HSC J2314+0154 and HSC J1142−0038) and six SDSS-EMPG candidates (SDSS J0002+1715, SDSS J1642+2233, SDSS J2115−1734, SDSS J2253+1116, SDSS J2310−0211, and SDSS J2327−0200) on 2018 June 13 with MagE on the Magellan telescope (PI: M. Rauch). We used the echellette grating with the 0.85'' × 10'' or 1.2'' × 10'' long slits. The exposure time was 1800 or 3600 s, depending on the luminosities of the candidates. The MagE spectroscopy covered λ ∼ 3100–10000 Å with a spectral resolution of R ≡ λ/Δλ ∼ 4000. The A0-type standard star CD-32 9972 (R.A. = 14:11:46.37, decl. = −33:03:14.3 in J2000) and the DOp-type standard star Feige 110 (R.A. = 23:19:58.39, decl. = −05:09:55.8 in J2000) were also observed. The sky was clear during the observation with seeing sizes of 0.8''–1.5''.

5.3. Keck/DEIMOS Spectroscopy

We conducted spectroscopy for an HSC-EMPG candidate (HSC J1631+4426) as a filler target on 2018 August 10 with DEIMOS on the Keck II telescope (PI: Y. Ono). We used the multi-object mode with the 0.8'' slit width. The exposure time was 2400 s. We used the 600ZD grating and the BAL12 filter with a blaze wavelength at 5500 Å. The DEIMOS spectroscopy covered λ ∼ 3800–8000 Å with a spectral resolution of R ≡ λ/Δλ ∼ 1500. The A0-type standard star G191B2B (R.A. = 05:05:30.6, decl. = +52:49:54 in J2000) was also observed. The sky was clear during the observation with a seeing size of 0.5''.

5.4. Subaru/FOCAS Spectroscopy

We carried out deep spectroscopy for an HSC-EMPG candidate (HSC J1631+4426) on 2019 May 13 with FOCAS installed on the Subaru telescope (PI: T. Kojima). The candidate was observed again with FOCAS in a longer exposure time of 10,800 s (=3 hr). We used the long-slit mode with a 2.0'' slit width. We used the 300R grism and L550 filter with a blaze wavelength at 7500 Å in second order. The FOCAS spectroscopy covered λ ∼ 3400–5250 Å with a spectral resolution of R ≡ λ/Δλ = 400 with a 2.0'' slit width. The O-type subdwarf BD+28 4211 (R.A. = 21:51:11.07, decl. = +28:51:51.8 in J2000) was also observed as a standard star. The sky condition was clear during the observation with a seeing size of 0.6''.

The LDSS-3, MagE, DEIMOS, and FOCAS observations are summarized in Table 3.

Table 3.  Summary of LDSS-3, MagE, DEIMOS, and FOCAS Observations

ID R.A. Decl. Slit Width P.A. Exposure Seeing
  (hh:mm:ss) (dd:mm:ss) (arcsec) (deg) (s) (arcsec)
(1) (2) (3) (4) (5) (6) (7)
LDSS-3 Observation
HSC J1429−0110 14:29:48.61 −01:10:09.67 0.75 +33.8 3600 0.8
MagE Observation
HSC J2314+0154 23:14:37.55 +01:54:14.27 0.85 +84.8 3600 0.9
HSC J1142−0038 11:42:25.19 −00:38:55.64 0.85 +103.3 3600 0.8
SDSS J0002+1715 00:02:09.94 +17:15:58.65 1.2 +110.0 1800 1.5
SDSS J1642+2233 16:42:38.45 +22:33:09.09 0.85 +36.0 1800 1.0
SDSS J2115−1734 21:15:58.33 −17:34:45.09 0.85 +144.0 1800 1.1
SDSS J2253+1116 22:53:42.41 +11:16:30.62 1.2 +76.0 1800 1.2
SDSS J2310−0211 23:10:48.84 −02:11:05.74 1.2 +18.8 1800 1.0
SDSS J2327−0200 23:27:43.69 −02:00:55.89 1.2 +49.6 1800 1.0
DEIMOS Observation
HSC J1631+4426 16:31:14.24 +44:26:04.43 0.80 +45.9 2400 0.5
FOCAS Observation
HSC J1631+4426 16:31:14.24 +44:26:04.43 2.0 +45.9 10,800 0.6

Note. (1) ID. (2) R.A. in J2000. (3) Decl. in J2000. (4) Slit width. (5) Position angle. (6) Exposure time. (7) Seeing.

6. Reduction and Calibration of Spectroscopic Data

We explain how we reduced and calibrated the spectroscopic data of Magellan/LDSS-3, Magellan/MagE, Keck/DEIMOS, and Subaru/FOCAS in Sections 6.1–6.4, respectively.

6.1. LDSS-3 Data

We used the iraf package to reduce and calibrate the data taken with LDSS-3 (Section 5.1). The reduction and calibration processes include bias subtraction, flat-fielding, one-dimensional (1D) spectrum extraction, sky subtraction, wavelength calibration, flux calibration, and atmospheric-absorption correction. A 1D spectrum was extracted from an aperture centered on the blue compact component of our EMPG candidates. A standard star, CD-32 9972, was used in the flux calibration. The wavelengths were calibrated with the HeNeAr lamp. Atmospheric absorption was corrected with the extinction curve at the Cerro Tololo Inter-American Observatory (CTIO). We used the CTIO extinction curve because the Magellan telescopes are located at Las Campanas Observatory, which neighbors the site of the CTIO in Chile at a similar altitude. We also calculate the readout and photon noise of sky+object emission on each CCD pixel, which is propagated to the 1D spectrum.

In our spectroscopy, a slit was not necessarily placed perpendicular to the horizon (i.e., at a parallactic angle) but was instead oriented to include extended substructure in our EMPG candidates. Thus, part of our spectra may have been affected by atmospheric refraction. Because targets are acquired with an R-band camera in the LDSS-3 observation, red light falls on the center of the slit, while blue light might drop out of the slit. Thus, the atmospheric refraction can cause a wavelength-dependent slit loss (SL). To estimate the wavelength-dependent SL (SL(λ)) carefully, we made a model of the atmospheric refraction. We assumed the atmospheric refraction measured at La Silla in Chile (Filippenko 1982), where the atmospheric condition was similar to Las Campanas in terms of the altitude and humidity. The model took into consideration a parallactic angle, a slit position angle, an airmass, and a seeing size at the time of exposures. An object size was broadened with a Gaussian convolution. We assumed a wavelength dependence for the seeing size ∝ ${\lambda }^{-0.2}$, where the seeing size was measured in the R band. We integrated a model surface brightness B(λ) on the slit to estimate an observed flux density ${F}_{\lambda }^{\mathrm{obs}}$ as a function of wavelength. Then we estimated the SL(λ) by comparing the observed flux density ${F}_{\lambda }^{\mathrm{obs}}$ and total flux density ${F}_{\lambda }^{\mathrm{tot}}$ predicted in the model:

Equation (5)

Then we corrected the spectrum with SL(λ) and obtained the SL-corrected spectrum. The obtained SL values for HSC J1429−0110 were SL(4000 Å) = 1.74 and SL(7000 Å) = 1.61, for example, giving an SL ratio of SL(4000 Å)/SL(7000 Å) = 1.08. The SL ratio suggests that emission-line ratios were corrected up to ∼10% between 4000 and 7000 Å. We estimated multiple color excesses E(B − V) from multiple pairs of Balmer lines and confirmed that these E(B − V) values were consistent between them within the error bars.
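
The geometry behind a wavelength-dependent slit loss can be illustrated with a minimal sketch. It is not the model described above: it assumes a point source blurred to a Gaussian seeing profile, an infinitely long slit, and a reference wavelength of 6500 Å for the R-band seeing, and it takes the refraction offset perpendicular to the slit as an input instead of computing it from Filippenko (1982). The function names and the numbers in the example are hypothetical.

```python
import math

R_BAND_WAVELENGTH = 6500.0  # Angstrom; assumed reference wavelength for the R-band seeing


def seeing_fwhm(wavelength, fwhm_r):
    """Seeing FWHM (arcsec) at `wavelength` (Angstrom), scaled as lambda^-0.2."""
    return fwhm_r * (wavelength / R_BAND_WAVELENGTH) ** -0.2


def slit_throughput(wavelength, fwhm_r, slit_width, offset=0.0):
    """Fraction of a Gaussian point source that passes an infinitely long slit.

    `offset` is the displacement (arcsec) of the image center from the slit
    center, measured perpendicular to the slit (e.g., caused by refraction).
    """
    sigma = seeing_fwhm(wavelength, fwhm_r) / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = slit_width / 2.0
    upper = (half - offset) / (math.sqrt(2.0) * sigma)
    lower = (-half - offset) / (math.sqrt(2.0) * sigma)
    return 0.5 * (math.erf(upper) - math.erf(lower))


# Illustrative numbers only: 0.75'' slit, 0.8'' seeing in the R band,
# and a hypothetical 0.2'' offset of the blue image from the slit center.
blue = slit_throughput(4000.0, 0.8, 0.75, offset=0.2)
red = slit_throughput(7000.0, 0.8, 0.75, offset=0.0)
print(f"throughput at 4000 A: {blue:.2f}, at 7000 A: {red:.2f}")
```

In this toy setup the blue light is lost more strongly than the red light for two reasons: the seeing disk is larger at short wavelengths, and the blue image is displaced from the slit center.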

6.2. MagE Data

To reduce the raw data taken with MagE, we used the MagE pipeline from the Carnegie Observatories Software Repository.27 The MagE pipeline has been developed on the basis of the Carpy package (Kelson et al. 2000; Kelson 2003). The bias subtraction, flat-fielding, scattered-light subtraction, 2D spectrum extraction, sky subtraction, wavelength calibration, cosmic-ray removal, and 1D-spectrum extraction were conducted with the MagE pipeline. The details of these pipeline processes are described on the website of the Carnegie Observatories Software Repository mentioned above. In the sky subtraction, we used a sky-line reference mask (i.e., a mask targeting a blank sky region with no object). The 1D spectra were extracted by summing pixels along the slit-length direction on a 2D spectrum. The readout noise and photon noise of sky+object emission are calculated on each CCD pixel and propagated to the 1D spectrum.

We conducted the flux calibration with the standard star, Feige 110, using iraf routines. Wavelengths were calibrated with emission lines of the ThAr lamp. Spectra of each order were calibrated separately and combined with the weight of electron counts to generate a single 1D spectrum. Atmospheric absorption was corrected in the same way as in Section 6.1.

In the MagE spectroscopy, we also placed a slit along a substructure of our EMPGs, regardless of a parallactic angle. We also corrected the wavelength-dependent SL carefully in the same manner as the LDSS-3 spectroscopy described in Section 6.1.

6.3. DEIMOS Data

We used the iraf package to reduce and calibrate the data taken with DEIMOS (Section 5.3). The reduction and calibration processes were the same as the LDSS-3 data explained in Section 6.1. A standard star, G191B2B, was used in the flux calibration. Wavelengths were calibrated with the NeArKrXe lamp. Atmospheric absorption was corrected under the assumption of the extinction curve at Maunakea Observatory. It should be noted that part of the flat and arc frames has been affected by stray light.28 In our observation, a spectrum was largely affected in the wavelength range of λ = 4400–4900 Å. Thus, we only used a spectrum within the wavelength range of λ > 4900 Å, which was free from the stray light. We ignore the effect of the atmospheric refraction here because we only use the red side (λ > 4900 Å) of the DEIMOS data, which is insensitive to the atmospheric refraction. We also confirm that the effect of the atmospheric refraction is negligible with the models described in Section 6.1. We calculate noise in the same way as in Section 6.1. In the DEIMOS data, we only used line flux ratios normalized to Hβ flux. Emission-line fluxes were scaled with the Hβ flux by matching the Hβ flux obtained with DEIMOS to the one obtained with FOCAS (see Section 6.4). Note again that we have conducted spectroscopy for HSC J1631+4426 with both DEIMOS and FOCAS.

6.4. FOCAS Data

We used the iraf package to reduce and calibrate the data taken with FOCAS (Section 5.4). The reduction and calibration processes were the same as the LDSS-3 data explained in Section 6.1. A standard star, BD+28 4211, was used in the flux calibration. Wavelengths were calibrated with the ThAr lamp. Atmospheric absorption was corrected in the same way as in Section 6.3. Our FOCAS spectroscopy covered λ ∼ 3800–5250 Å, which was complementary to the DEIMOS spectroscopy described in Section 6.3, whose spectrum was reliable only in the range of λ > 4900 Å. We ignore the atmospheric refraction here because FOCAS is equipped with the atmospheric dispersion corrector. We also calculate noise in the same manner as in Section 6.1. Because a Hβ line overlapped in the FOCAS and DEIMOS spectroscopy, we used the Hβ line flux to scale the emission-line fluxes obtained in the DEIMOS observation (see Section 6.3).

We show spectra of the four HSC-EMPG candidates and six SDSS-EMPG candidates obtained with the LDSS-3, MagE, DEIMOS, and FOCAS spectrographs in Figures 15 and 16. In the spectra of Figures 15 and 16, we find redshifted emission lines, confirming that these 10 candidates are real galaxies. In Figure 17, we also show a FOCAS spectrum of HSC J1631+4426 around an [O iii] λ4363 emission line. The [O iii] λ4363 emission line is detected significantly.

Figure 15. Spectra of our four HSC EMPGs. The positions of the sky emission lines are indicated by gray vertical lines at the bottom of each panel. We mask the parts of the strong sky lines that are not subtracted very well. We also show noise spectra at the bottom of each panel with red lines. Here we exhibit two spectra of the same target, HSC J1631+4426, for which we conduct spectroscopy with both Keck/DEIMOS (red side; λ ≳ 5000 Å) and Subaru/FOCAS (blue side; λ ≲ 5000 Å).

Figure 16. Same as Figure 15 but for our six SDSS EMPGs. We indicate the parts of the emission lines that may be underestimated because of the saturation with asterisks. The saturation depends on the strength of an emission line and its position in each spectral order of the echellette spectroscopy because an edge (a center) of each order has a low (high) sensitivity.

Figure 17. Spectrum of HSC J1631+4426 around Hγ and [O iii] λ4363 emission lines. The cyan curve is the best-fit Gaussian function of the Hγ and [O iii] λ4363 emission lines. The red line shows noise levels at each wavelength.

7. Analysis

In this section, we explain the emission-line measurement (Section 7.1) and the estimation of galaxy properties (Section 7.2). Here we estimate the stellar masses, SFRs, stellar ages, emission-line EWs, electron temperatures, and metallicities of the 10 EMPG candidates confirmed in our spectroscopy.

7.1. Emission-line Measurements

We measure central wavelengths and emission-line fluxes with a best-fit Gaussian profile using the iraf routine splot. In Sections 6.1–6.4, we calculated the readout and photon noise of the sky+object emission on each CCD pixel and propagated them to 1D spectra. We estimate flux errors from the 1D noise spectra with the same FWHM of emission lines. As described in Sections 6.1 and 6.2, we correct the fluxes of the LDSS-3/MagE spectra assuming the wavelength-dependent SL with the model of the atmospheric refraction. We also include the uncertainties of the SL correction in the flux errors. We measure observed EWs of emission lines with the same iraf routine, splot, and convert them into rest-frame EWs (EW0). Redshifts are estimated by comparing the observed central wavelengths and the rest-frame wavelengths in the air of strong emission lines. Generally speaking, when the slit spectroscopy is conducted for a spatially resolved object, one obtains a spectrum only inside a slit, which may not represent an average spectrum of its whole system. However, because our metal-poor galaxies have a size comparable to or slightly larger than the seeing size, our emission-line estimation represents an average of the whole system. The sizes of our metal-poor galaxies will be discussed in Paper III (Y. Isobe et al. 2020, in preparation).
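
The redshift and rest-frame EW conversions amount to two one-line formulas, sketched below. The rest-frame air wavelengths are standard values, the helper names are hypothetical, and the observed wavelengths in the example are synthetic numbers generated for z = 0.03125 instead of measurements from our spectra.

```python
# Rest-frame air wavelengths (Angstrom) of three strong lines (standard values).
REST_AIR = {"Hbeta": 4861.33, "[OIII]5007": 5006.84, "Halpha": 6562.80}


def redshift_from_lines(observed):
    """Mean redshift from a dict {line name: observed central wavelength}."""
    redshifts = [observed[name] / REST_AIR[name] - 1.0 for name in observed]
    return sum(redshifts) / len(redshifts)


def rest_frame_ew(ew_observed, z):
    """Convert an observed equivalent width into the rest frame: EW0 = EW / (1 + z)."""
    return ew_observed / (1.0 + z)


# Synthetic observed wavelengths for a source at z = 0.03125.
observed = {name: rest * 1.03125 for name, rest in REST_AIR.items()}
z = redshift_from_lines(observed)
ew0 = rest_frame_ew(206.25, z)
print(f"z = {z:.5f}, EW0 = {ew0:.1f} A")
```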

Color excesses, E(B − V), are estimated with the Balmer decrement of Hα, Hβ, Hγ, Hδ, ..., and H13 lines under the assumptions of the dust extinction curve given by Cardelli et al. (1989) and the case B recombination. We do not use Balmer emission lines affected by a systematic error, such as cosmic rays and other emission lines blending with the Balmer line. In the case B recombination, we carefully assume electron temperatures (Te) so that the assumed electron temperatures become consistent with electron temperature measurements of ${{\rm{O}}}^{2+}$, Te(O iii), which will be obtained in Section 7.2. We estimate the best E(B − V) values and their errors with the ${\chi }^{2}$ method (Press et al. 2007). The E(B − V) estimation process is detailed as follows.

  (1) We predict Balmer emission-line ratios based on the case B recombination models with an initial Te guess selected from Te = 10,000, 15,000, 20,000, or 25,000 K. We use the case B recombination models calculated with PyNeb (Luridiana et al. 2015, v1.0.14), a modern Python tool to compute emission-line emissivities based on the n-level atom/ion models. Here we fix an electron density of ne = 100 cm$^{-3}$ for our 10 galaxies, which is roughly consistent with the ne measurements obtained in Section 7.2 (see also Table 5). Note that the Balmer emission-line ratios are insensitive to the electron density variance.
  (2) We calculate ${\chi }^{2}$ from the Balmer emission-line ratios of our measurements and the case B models in a wide range of E(B − V). We find the best E(B − V) value, which gives the least ${\chi }^{2}$. We also obtain 16th and 84th percentiles of E(B − V) based on the ${\chi }^{2}$ calculation and regard the 16th and 84th percentiles as E(B − V) errors.
  (3) We apply the dust correction to the emission-line fluxes with the obtained E(B − V) value and the Cardelli et al. (1989) dust extinction curve.
  (4) We estimate electron temperatures Te(O iii), which will be described in Section 7.2. We then compare the Te(O iii) estimates with the initial Te guess to check the consistency between them.
  (5) Changing the initial Te guess, we repeat steps (1)–(4) until we get a Te(O iii) value roughly consistent with the initial Te guess given in step (1).
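
Steps (1) and (2) can be illustrated with a minimal grid-search sketch. It is not the code used for this paper: the intrinsic ratios are approximate case B values for Te ∼ 10,000 K hard-coded instead of computed with PyNeb, the extinction-curve values are approximate Cardelli et al. (1989) numbers for R_V = 3.1, only three Balmer ratios are used, and the percentiles are read from the cumulative distribution of exp(−χ²/2), which is one possible reading of step (2). All names are hypothetical, and the input ratios are synthetic.

```python
import math

# Approximate k(lambda) = A(lambda)/E(B-V) for the Cardelli et al. (1989) curve
# with R_V = 3.1, and approximate case B ratios to Hbeta at Te ~ 10,000 K.
# Both tables are assumed inputs for this sketch.
K_LAMBDA = {"Halpha": 2.53, "Hbeta": 3.61, "Hgamma": 4.17, "Hdelta": 4.44}
CASE_B = {"Halpha": 2.86, "Hgamma": 0.468, "Hdelta": 0.259}


def reddened_ratio(line, ebv):
    """Predicted observed line/Hbeta ratio for a given E(B-V)."""
    return CASE_B[line] * 10.0 ** (-0.4 * ebv * (K_LAMBDA[line] - K_LAMBDA["Hbeta"]))


def fit_ebv(observed, errors, grid_max=1.0, n_grid=1001):
    """Grid search over E(B-V); return (best, 16th percentile, 84th percentile)."""
    grid = [grid_max * i / (n_grid - 1) for i in range(n_grid)]
    chi2 = [
        sum(((observed[name] - reddened_ratio(name, ebv)) / errors[name]) ** 2
            for name in observed)
        for ebv in grid
    ]
    chi2_min = min(chi2)
    best = grid[chi2.index(chi2_min)]
    # Turn chi^2 into relative likelihoods and accumulate them along the grid.
    weights = [math.exp(-0.5 * (c - chi2_min)) for c in chi2]
    total = sum(weights)
    cumulative, running = [], 0.0
    for weight in weights:
        running += weight / total
        cumulative.append(running)
    p16 = next(e for e, c in zip(grid, cumulative) if c >= 0.16)
    p84 = next(e for e, c in zip(grid, cumulative) if c >= 0.84)
    return best, p16, p84


# Synthetic check: ratios reddened by E(B-V) = 0.10 with 2% errors.
obs = {name: reddened_ratio(name, 0.10) for name in CASE_B}
err = {name: 0.02 * value for name, value in obs.items()}
best, p16, p84 = fit_ebv(obs, err)
print(f"E(B-V) = {best:.3f} (16th: {p16:.3f}, 84th: {p84:.3f})")
```

In the actual procedure the intrinsic ratios depend on the assumed Te, which is why steps (4) and (5) iterate on the initial Te guess.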

We eventually assume Te = 10,000 (SDSS J0002+1715 and SDSS J1642+2233), 15,000 (HSC J1429−0110, HSC J2314+0154, SDSS J2253+1116, SDSS J2310−0211, and SDSS J2327−0200), 20,000 (HSC J1142−0038 and SDSS J2115−1734), and 25,000 K (HSC J1631+4426), which are roughly consistent with the Te(O iii) measurements. In the flux estimation, we ignore the contribution of stellar atmospheric absorption around the Balmer lines because our galaxies have very large EWs compared to the expected absorption EW. We summarize redshifts and dust-corrected fluxes in Tables 4, 6, and 7.

Table 4.  R.A., Decl., Redshifts, and Photometric Magnitudes of Our Targets

No. ID R.A. Decl. Redshift u g r i z y
    (hh:mm:ss) (dd:mm:ss)   (mag) (mag) (mag) (mag) (mag) (mag)
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11)
1 HSC J1429−0110 14:29:48.61 −01:10:09.67 0.02980 ... 18.14 18.65 19.38 19.47 18.92
2 HSC J2314+0154 23:14:37.55 +01:54:14.27 0.03265 ... 21.94 21.95 22.76 22.57 22.38
3 HSC J1142−0038 11:42:25.19 −00:38:55.64 0.02035 ... 21.39 21.62 22.42 22.28 22.01
4 HSC J1631+4426 16:31:14.24 +44:26:04.43 0.03125 ... 21.84 21.88 22.52 22.75 22.39
5 SDSS J0002+1715 00:02:09.94 +17:15:58.65 0.02083 18.48 17.61 18.05 18.61 18.57 ...
6 SDSS J1642+2233 16:42:38.45 +22:33:09.09 0.01725 18.50 17.99 18.38 19.01 19.14 ...
7 SDSS J2115−1734 21:15:58.33 −17:34:45.09 0.02296 19.59 18.49 19.00 19.67 19.57 ...
8 SDSS J2253+1116 22:53:42.41 +11:16:30.62 0.00730 17.91 16.62 17.07 18.08 18.12 ...
9 SDSS J2310−0211 23:10:48.84 −02:11:05.74 0.01245 18.12 17.19 17.46 17.97 18.02 ...
10 SDSS J2327−0200 23:27:43.69 −02:00:55.89 0.01812 19.02 18.16 18.47 19.26 19.25 ...

Note. (1) Number. (2) ID. (3) R.A. in J2000. (4) Decl. in J2000. (5) Redshift. Typical uncertainties are Δz ∼ ${10}^{-6}$. (6)–(11) Magnitudes of ugrizy broadband photometry. The photometry of our HSC EMPGs is given with HSC cmodel magnitudes (grizy), while we use SDSS model magnitudes (ugriz) in the photometry of our SDSS EMPGs.

We have confirmed a consistency between the observed emission-line fluxes and those estimated in the process of photometric SED fitting. The photometric SED fitting will be detailed in Section 7.2.

7.2. Galaxy Properties

We estimate the stellar masses, maximum stellar ages, SFRs, electron densities (ne), electron temperatures (Te), and gas-phase metallicities (O/H) of our EMPG candidates. The SFR, ne, Te, and O/H estimates and their errors are obtained with the Monte Carlo simulation. For each of the 10 galaxies, we generate 1000 sets of emission-line fluxes based on the emission-line flux measurements and errors obtained in Section 7.1. Here we assume that the flux measurement errors approximately follow the Gaussian profile. The 16th and 84th percentiles in the SFR, ne, Te, and O/H distributions are regarded as lower and upper errors of SFR, ne, Te, and O/H.
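
The Monte Carlo error propagation described above can be sketched in a few lines. This is an illustration only: the helper name is hypothetical, the derived quantity is simply the [O iii] λ4363/λ5007 flux ratio and not a full Te or O/H calculation, and the input fluxes are the Table 6 values for HSC J1631+4426.

```python
import random


def monte_carlo_percentiles(fluxes, errors, func, n_draws=1000, seed=0):
    """Return the (16th, 50th, 84th) percentiles of func over Gaussian flux draws."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_draws):
        # Perturb every line flux independently with its Gaussian error.
        draw = {name: rng.gauss(fluxes[name], errors[name]) for name in fluxes}
        values.append(func(draw))
    values.sort()

    def percentile(q):
        # Nearest-rank percentile on the sorted draws.
        return values[min(n_draws - 1, int(q * n_draws))]

    return percentile(0.16), percentile(0.50), percentile(0.84)


def auroral_ratio(flux):
    """[O III] 4363/5007 flux ratio, the temperature-sensitive quantity."""
    return flux["OIII4363"] / flux["OIII5007"]


# Fluxes normalized to Hbeta = 100 (Table 6, HSC J1631+4426).
fluxes = {"OIII4363": 8.18, "OIII5007": 170.92}
errors = {"OIII4363": 0.48, "OIII5007": 0.38}
low, median, high = monte_carlo_percentiles(fluxes, errors, auroral_ratio)
print(f"ratio = {median:.4f} (-{median - low:.4f} / +{high - median:.4f})")
```

Replacing `auroral_ratio` with a function that returns SFR, ne, Te, or O/H gives the corresponding lower and upper errors in the same way.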

We estimate the stellar masses and ages of our EMPG candidates with the SED interpretation code beagle (Chevallard & Charlot 2016). A constant star formation history is assumed in the model. We run the beagle code with the four free parameters of stellar mass, maximum stellar age, ionization parameter, and metallicity, while we fix a redshift determined in our spectroscopy. In the beagle models, both stellar and nebular emission are calculated. Note that, if the nebular emission is not included in the SED models, the stellar continuum (i.e., stellar mass) can be overestimated. We confirm that emission-line fluxes estimated by the beagle code are consistent with the observed fluxes in our spectroscopy, suggesting that our stellar-mass estimates are not overestimated for the reason above. We assume dust-free conditions to reduce calculation time, but we note that when dust extinction is added as a free parameter in a rough parameter estimation, its best-fit value is approximately zero. Finally, we obtain estimates of stellar mass and maximum stellar age in the range of log(${M}_{\star }$/${M}_{\odot }$) = 4.95–7.06 and ${t}_{\mathrm{age},\max }$ = 3.4–51 Myr.

The SFRs are estimated with the dust-corrected Hα fluxes under the assumption of the star formation history of Kennicutt (1998). Here we assume that the Hα emission line is dominantly contributed by the photoionization caused by ionizing photon radiation from massive stars. If the Hα line is saturated, we use an Hβ line instead. The estimated SFRs of our EMPGs are in the range log(SFR/${M}_{\odot }$ yr$^{-1}$) = (−1.28)–0.43.
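
As a rough illustration of this conversion, the sketch below uses the widely quoted Kennicutt (1998) coefficient, SFR = 7.9 × 10⁻⁴² L(Hα) in units of M⊙ yr⁻¹ per erg s⁻¹, which assumes a Salpeter initial mass function. Whether the same coefficient and IMF are adopted here is an assumption, as is the case B ratio Hα/Hβ = 2.86 (appropriate for Te ∼ 10,000 K) used for the Hβ fallback. The function names and the example flux and distance are hypothetical.

```python
import math

MPC_IN_CM = 3.0857e24          # centimeters per megaparsec
KENNICUTT98_HALPHA = 7.9e-42   # Msun/yr per erg/s (Kennicutt 1998, Salpeter IMF)
CASE_B_HALPHA_HBETA = 2.86     # assumed intrinsic ratio for Te ~ 10,000 K


def line_luminosity(flux, distance_mpc):
    """Luminosity (erg/s) from a dust-corrected flux (erg/s/cm^2) and a distance (Mpc)."""
    distance_cm = distance_mpc * MPC_IN_CM
    return 4.0 * math.pi * distance_cm ** 2 * flux


def sfr_from_halpha(flux_halpha, distance_mpc):
    """SFR (Msun/yr) from the dust-corrected Halpha flux."""
    return KENNICUTT98_HALPHA * line_luminosity(flux_halpha, distance_mpc)


def sfr_from_hbeta(flux_hbeta, distance_mpc):
    """Fallback for a saturated Halpha line: scale Hbeta by the case B ratio."""
    return sfr_from_halpha(CASE_B_HALPHA_HBETA * flux_hbeta, distance_mpc)


# Hypothetical source: F(Halpha) = 1e-14 erg/s/cm^2 at a luminosity distance of 100 Mpc.
sfr = sfr_from_halpha(1.0e-14, 100.0)
print(f"SFR = {sfr:.4f} Msun/yr, log(SFR) = {math.log10(sfr):.2f}")
```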

We estimate electron densities and temperatures with the PyNeb method getTemDen. We use [O ii] λλ3727/3729 flux ratios to estimate the electron densities of ${{\rm{O}}}^{+}$ ions, ne(O ii), obtaining ne(O ii) = 29–128 cm$^{-3}$ for our seven galaxies. We use line ratios of [O iii] λλ4363/5007 and [O ii] λλ(3727+3729)/(7320+7330) to estimate the electron temperatures coupled with ${{\rm{O}}}^{2+}$ ions, Te(O iii), and ${{\rm{O}}}^{+}$ ions, Te(O ii), respectively. In the electron temperature estimation, we assume ne = 100 cm$^{-3}$, which is consistent with our ne measurements. To estimate the Te(O iii) of HSC J1631+4426, we use another PyNeb method, getEmissivity, instead of getTemDen because HSC J1631+4426 shows a high electron temperature. Both the getTemDen and getEmissivity methods provide the same relation between Te(O iii) and [O iii] λλ4363/5007. The getTemDen method can be used only below ∼25,120 K because the [O iii] collision strengths used in getTemDen (Storey et al. 2014) have been calculated in the range of ∼100–25,000 K, while the getEmissivity method is effective up to 200,000 K (Aggarwal & Keenan 1999; Luridiana et al. 2015). If an [O iii] λ5007 line is saturated, we estimate an [O iii] λ5007 flux with [O iii] λ5007 = 2.98 × [O iii] λ4959, which is strictly determined by the Einstein A coefficient. If only one of the [O ii] λ7320 and [O ii] λ7330 lines is detected, we estimate a total flux of [O ii] λλ(7320+7330) with a relation of [O ii] λ7330 = 0.56 × [O ii] λ7320. Using PyNeb, we have confirmed that the [O ii] relation above holds with very little dependence on Te and ne. If neither [O ii] λ7320 nor [O ii] λ7330 is detected, we estimate Te(O ii) from an empirical relation of Te(O ii) = 0.7 × Te(O iii)+3000 (Campbell et al. 1986; Garnett 1992). The estimates of electron densities and temperatures are summarized in Table 5.
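
The fallback rules in this paragraph are simple enough to write down directly. The sketch below encodes only the three substitutions quoted in the text (the 2.98 and 0.56 line ratios and the empirical Te(O ii) relation); it does not call PyNeb, and the function names are hypothetical. A value of `None` stands for a saturated or undetected line.

```python
OIII_5007_TO_4959 = 2.98  # [O III] 5007/4959 ratio quoted in the text
OII_7330_TO_7320 = 0.56   # [O II] 7330/7320 ratio quoted in the text


def oiii_5007(flux_5007, flux_4959):
    """Return the measured 5007 flux, or 2.98 x 4959 if 5007 is saturated (None)."""
    if flux_5007 is not None:
        return flux_5007
    return OIII_5007_TO_4959 * flux_4959


def oii_auroral_total(flux_7320=None, flux_7330=None):
    """Total [O II] 7320+7330 flux, filling in a missing component if needed."""
    if flux_7320 is not None and flux_7330 is not None:
        return flux_7320 + flux_7330
    if flux_7320 is not None:
        return flux_7320 * (1.0 + OII_7330_TO_7320)
    if flux_7330 is not None:
        return flux_7330 * (1.0 + 1.0 / OII_7330_TO_7320)
    return None  # neither line detected: Te(O II) must come from the empirical relation


def te_oii(te_oiii, te_oii_measured=None):
    """Te(O II) in K: the measured value, or 0.7 x Te(O III) + 3000 K otherwise."""
    if te_oii_measured is not None:
        return te_oii_measured
    return 0.7 * te_oiii + 3000.0


# HSC J1142-0038: Te(O III) = 1.492e4 K (Table 5), no [O II] 7320,7330 detection.
print(f"Te(O II) = {te_oii(14920.0):.0f} K")
```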

Table 5.  Electron Temperature and Density

No. ID Te(O iii) Te(O ii) ne(O ii)
    (10$^{4}$ K) (10$^{4}$ K) (cm$^{-3}$)
(1) (2) (3) (4) (5)
1 HSC J1429−0110 ${1.318}_{-0.013}^{+0.014}$ ${0.909}_{-0.013}^{+0.015}$
2 HSC J2314+0154
3 HSC J1142−0038 ${1.492}_{-0.038}^{+0.044}$ a ${109}_{-17}^{+20}$
4 HSC J1631+4426 ${2.557}_{-0.110}^{+0.119}$ a
5 SDSS J0002+1715 ${1.179}_{-0.005}^{+0.006}$ a
6 SDSS J1642+2233 1.210 ± 0.007 ${0.879}_{-0.008}^{+0.007}$ ${77}_{-6}^{+9}$
7 SDSS J2115−1734 ${1.807}_{-0.007}^{+0.008}$ ${1.210}_{-0.016}^{+0.023}$ 29 ± 10
8 SDSS J2253+1116 ${1.480}_{-0.003}^{+0.001}$ 1.333 ± 0.008 ${70}_{-3}^{+5}$
9 SDSS J2310−0211 1.632 ± 0.003 1.151 ± 0.007 ${109}_{-5}^{+3}$
10 SDSS J2327−0200 ${1.566}_{-0.002}^{+0.003}$ ${1.260}_{-0.012}^{+0.010}$ ${128}_{-6}^{+4}$

Notes. (1) Number. (2) ID. (3) and (4) Electron temperatures of ${{\rm{O}}}^{2+}$ and ${{\rm{O}}}^{+}$. (5) Electron density of ${{\rm{O}}}^{+}$.

aThe Te(O ii) values are obtained with the empirical relation of Campbell et al. (1986) and Garnett (1992) because we cannot estimate Te(O ii) directly due to the nondetection of [O ii] λλ7320,7330 emission lines.

We also estimate Te-based metallicities with [O iii] λ5007/Hβ and [O ii] λλ3727,3729/Hβ flux ratios and electron temperatures of Te(O iii) and Te(O ii) using the PyNeb package getIonAbundance. Here we again assume ne = 100 cm$^{-3}$, which is consistent with our ne measurements. We obtain Te-based metallicities ranging from 12 + log(O/H) = 6.90 to 8.45. Because an [O iii] λ4363 emission line is not detected in the spectrum of HSC J2314+0154, we do not estimate a Te-based metallicity of HSC J2314+0154. For comparison, we also estimate the metallicities of HSC J2314+0154 and HSC J1631+4426 with the empirical relation obtained from metal-poor galaxies by Skillman (1989). This empirical relation is calibrated with emission-line indices of R23 (≡([O ii] λ3727+[O iii] λλ4959,5007)/Hβ). Izotov et al. (2019a) confirmed that the Skillman (1989) calibration well reproduces the metallicities of the famous EMPGs J0811+4730 (Izotov et al. 2018b), SBS 0335−052 (e.g., Izotov et al. 2009), Little Cub (Hsyu et al. 2017), and DDO 68 (Pustilnik et al. 2005; Annibali et al. 2019). This empirical relation is applicable in the low-metallicity range of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ≲ 7.3, which corresponds to log(R23) ≲ 0.5. Because our galaxies, except for HSC J2314+0154 and HSC J1631+4426, do not satisfy log(R23) ≲ 0.5, we do not estimate the metallicities of the other eight galaxies with the empirical relation. The estimates of stellar masses, ages, SFRs, and gas-phase metallicities are summarized in Table 7.
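
The log(R23) ≲ 0.5 applicability check can be reproduced from the dust-corrected fluxes in Table 6. The sketch below is only an arithmetic illustration with a hypothetical helper name. It uses the [O ii]tot, [O iii] λ4959, and [O iii] λ5007 fluxes (normalized to Hβ = 100) of HSC J1631+4426, which passes the check, and of HSC J1429−0110, which does not.

```python
import math


def log_r23(oii_total, oiii_4959, oiii_5007, hbeta=100.0):
    """log10 of R23 = ([O II] 3727 doublet + [O III] 4959 + [O III] 5007) / Hbeta."""
    return math.log10((oii_total + oiii_4959 + oiii_5007) / hbeta)


# Dust-corrected fluxes from Table 6 (Hbeta = 100).
hsc_j1631 = log_r23(50.12, 55.76, 170.92)
hsc_j1429 = log_r23(166.62, 210.62, 626.89)

for name, value in [("HSC J1631+4426", hsc_j1631), ("HSC J1429-0110", hsc_j1429)]:
    verdict = "applicable" if value <= 0.5 else "not applicable"
    print(f"{name}: log(R23) = {value:.2f} -> empirical relation {verdict}")
```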

One may be interested in the metallicity of HSC J1631+4426, which gives the lowest metallicity in our sample. In Figure 17, we show a FOCAS spectrum of HSC J1631+4426 around an [O iii] λ4363 emission line. Errors of electron temperature and metallicity are dominantly contributed by the [O iii] λ4363 flux errors because the [O iii] λ4363 collisional excitation line is sensitive to the electron temperature, and the electron temperature is strongly associated with metallicity through the metal cooling. In addition, the [O iii] λ4363 line is the weakest (by a factor of ∼1/10–1/30) among the [O iii] λ4363, [O iii] λ4959, [O iii] λ5007, Hβ, and Hα lines. Note that an ${{\rm{O}}}^{+}$ ion abundance error does not greatly contribute to the metallicity estimation because the ${{\rm{O}}}^{2+}$ ions are abundant compared to the ${{\rm{O}}}^{+}$ ions in the metal-poor galaxies (e.g., Nakajima & Ouchi 2014; Kojima et al. 2017). For reference, a 5% difference in [O iii] λ4363 flux corresponds to an ∼1000 K difference in electron temperature (see Figure 4 of Kojima et al. 2017). When the electron temperature increases (decreases) by 1000 K, the metallicity decreases (increases) by 0.03 dex at 12 + log (O/H) ∼ 6.90. As shown in Figure 17, we significantly detect the [O iii] λ4363 emission line with a signal-to-noise ratio (S/N) of 17.0. We also show the best Gaussian fitting profile with the cyan curve, which seems to be successfully fitted. To double-check the reliability of the [O iii] λ4363 flux estimation, we also estimate the [O iii] λ4363 flux by integrating a continuum-subtracted spectrum around the [O iii] λ4363 emission line and compare it with the [O iii] λ4363 flux obtained by the Gaussian fitting (Section 7.1 and Table 6). We find that a ratio of fluxes obtained by the Gaussian fitting and the flux integration is 1.014. The 1.4% uncertainty between the Gaussian fitting and the flux integration is smaller than the [O iii] λ4363 flux error (1/17.0 × 100 = 5.9%) shown in Table 6. We have confirmed above that the [O iii] λ4363 flux has been reliably calculated in the two independent methods. Recalling that errors of electron temperature and metallicity are dominated by the [O iii] λ4363 flux errors, we conclude that the difference between the flux estimation methods only slightly affects the results of the electron temperature and metallicity estimation.

Table 6.  Flux Measurements

No. ID [O ii] λ3727 [O ii] λ3729 [O ii]tot H13 H12 H11
(1) (2) (3) (4) (5) (6) (7) (8)
1 HSC J1429−0110 166.62 ± 1.99
2 HSC J2314+0154 <14.21 <13.50 <19.60
3 HSC J1142−0038 68.06 ± 0.81 91.82 ± 0.84 159.88 ± 1.16 3.26 ± 0.68 3.63 ± 0.67 6.65 ± 0.79
4 HSC J1631+4426 50.12 ± 2.66
5 SDSS J0002+1715 74.28 ± 0.50 103.08 ± 0.51 177.36 ± 0.71
6 SDSS J1642+2233 103.35 ± 0.72 149.41 ± 0.76 252.75 ± 1.05
7 SDSS J2115−1734 34.05 ± 0.27 46.85 ± 0.30 80.91 ± 0.41 2.80 ± 0.20 3.42 ± 0.32 4.23 ± 0.20
8 SDSS J2253+1116 39.44 ± 0.11 55.06 ± 0.12 94.50 ± 0.16 2.36 ± 0.06 3.40 ± 0.05 4.08 ± 0.05
9 SDSS J2310−0211 43.70 ± 0.10 59.12 ± 0.12 102.82 ± 0.16 2.38 ± 0.06 2.75 ± 0.06 3.78 ± 0.06
10 SDSS J2327−0200 45.07 ± 0.11 59.96 ± 0.12 105.03 ± 0.16 2.48 ± 0.06 3.36 ± 0.06 3.99 ± 0.08
No. H10 H9 H7 Hδ Hγ [O iii] λ4363 Hβ
(1) (9) (10) (11) (12) (13) (14) (15)
1 5.45 ± 0.68 32.48 ± 0.48a 21.64 ± 0.33 45.55 ± 0.28 9.06 ± 0.23 100.00 ± 0.24
2 26.15 ± 3.11 46.56 ± 1.67 100.00 ± 1.00
3 4.39 ± 0.66 6.13 ± 0.63 17.16 ± 0.54 27.36 ± 0.51 48.20 ± 0.43 5.96 ± 0.39 100.00 ± 0.52
4 4.68 ± 1.17 20.03 ± 0.87a 27.53 ± 0.65 46.88 ± 0.50 8.18 ± 0.48 100.00 ± 0.37
5 5.99 ± 0.30 8.42 ± 0.25 16.27 ± 0.19 26.04 ± 0.17 46.92 ± 0.14 6.38 ± 0.09 100.00 ± 0.16
6 6.74 ± 0.33 15.27 ± 0.24 26.44 ± 0.20 44.90 ± 0.16 6.59 ± 0.10 100.00 ± 0.17
7 4.40 ± 0.19 5.88 ± 0.18 15.54 ± 0.16 26.12 ± 0.16 46.99 ± 0.16 13.94 ± 0.11 100.00 ± 0.19
8 5.59 ± 0.05 7.59 ± 0.05 16.12 ± 0.06 25.67 ± 0.05 46.52 ± 0.06 14.13 ± 0.04 100.00 ± 0.07
9 5.72 ± 0.07 7.49 ± 0.06 16.38 ± 0.06 26.66 ± 0.06 48.57 ± 0.07 14.85 ± 0.04 100.00 ± 0.09
10 5.29 ± 0.06 7.19 ± 0.06 17.45 ± 0.07 25.93 ± 0.07 47.39 ± 0.08 12.81 ± 0.05 100.00 ± 0.11
No. [O iii] λ4959 [O iii] λ5007 Hα [N ii] λ6584 [O ii] λ7320 [O ii] λ7330
(1) (16) (17) (18) (19) (20) (21)  
1 210.62 ± 0.29 626.89 ± 0.46 246.66 ± 0.20 5.08 ± 0.18 1.45 ± 0.07 0.92 ± 0.07  
2 69.57 ± 0.72 207.48 ± 0.88 278.45 ± 0.66 2.10 ± 0.29  
3 102.76 ± 0.47 308.14 ± 0.65 272.09 ± 0.57 8.64 ± 0.26  
4 55.76 ± 0.34 170.92 ± 0.38 229.46 ± 1.00 <0.48  
5 196.65 ± 0.19 593.18 ± 0.32 280.48 ± 0.17 5.68 ± 0.04  
6 183.24 ± 0.21 571.59 ± 0.34 276.34 ± 0.20 5.28 ± 0.04 1.84 ± 0.04 1.47 ± 0.04  
7 165.47 ± 0.22 278.90 ± 0.22 3.11 ± 0.04 1.09 ± 0.04 0.83 ± 0.04  
8 250.60 ± 0.11 3.29 ± 0.01 1.55 ± 0.01 1.03 ± 0.01  
9 214.32 ± 0.12 2.76 ± 0.02 1.26 ± 0.02 0.98 ± 0.02  
10 200.95 ± 0.14 3.21 ± 0.02 1.69 ± 0.02  

Notes. (1) Number. (2) ID. (3)–(21) Dust-corrected emission-line fluxes normalized to an Hβ line flux in units of erg s$^{-1}$ cm$^{-2}$. Upper limits are given with a 1σ level. Lines suffering from saturation or affected by sky emission lines are also shown as no data. Here [O ii]tot represents a sum of the [O ii] λ3727 and [O ii] λ3729 fluxes. If the spectral resolution is not high enough to resolve the [O ii] λ3727 and [O ii] λ3729 lines, we only show the [O ii]tot fluxes.

aA sum of [Ne iii] λ3867 and H7 fluxes because they are blended due to the low spectral resolution.

8. Results and Discussions

In Section 8.1, we describe the results of the object class identification for our HSC-EMPG and SDSS-EMPG candidates and show the distribution of ${\mathrm{EW}}_{0}$(Hβ) and metallicity to characterize our sample. We also investigate the cosmic number density of our metal-poor galaxies (Section 8.2) and their environment (Section 8.3). We show the stellar mass and SFR (${M}_{\star }$–SFR) and the stellar-mass and metallicity (${M}_{\star }$–Z) relations of our EMPG candidates in Sections 8.4 and 8.5. In Section 8.6, we discuss the possibility of active galactic nucleus (AGN)/shock contribution on the diagram of [N ii]/Hα and [O iii]/Hβ emission-line ratios, the so-called Baldwin–Phillips–Terlevich diagram (BPT diagram; Baldwin et al. 1981). The velocity dispersions of our sample are presented and discussed in Section 8.7.

8.1. Object Class Identification

As described in Section 5, we conducted spectroscopy for 4 out of 27 HSC-EMPG candidates and 6 out of 86 SDSS-EMPG candidates. We find that all of the 10 observed EMPG candidates are confirmed as real galaxies with strong emission lines. We show spectra of the four HSC-EMPG candidates and six SDSS-EMPG candidates that exhibit strong emission lines in Figures 15 and 16. Two spectra are shown for HSC J1631+4426 because we have conducted spectroscopy both with Keck/DEIMOS and Subaru/FOCAS for this object.

Figure 18 shows the distribution of the metallicity and ${\mathrm{EW}}_{0}$(Hβ) of our EMPG candidates (red stars). We find that our sample covers a wide range of metallicities, 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.9–8.5 (i.e., 0.02–0.6 ${Z}_{\odot }$) and that 3 out of our 10 candidates satisfy the EMPG criterion of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69, while the other 7 candidates do not satisfy the criterion. Remarkably, HSC J1631+4426 has a metallicity of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.90 ± 0.03 (i.e., 0.016 ${Z}_{\odot }$), which is one of the lowest metallicities reported ever. We also find that two out of the three EMPGs, HSC J2314+0154 and HSC J1631+4426, are selected from the HSC data and have i-band magnitudes of 22.8 and 22.5 mag. We argue that these two faint EMPGs are selected thanks to the deep HSC data, which suggests that the deep HSC data are advantageous for selecting very faint EMPGs. It should be also noted that the other seven galaxies, which fall outside the EMPG definition, still show a low metallicity (Z/${Z}_{\odot }$ ∼ 0.1–0.3). We expect to find more EMPGs from the HSC-EMPG catalog in our future spectroscopy. This is because the pilot spectroscopy has targeted relatively bright HSC sources (∼22 mag) and our future spectroscopy will target fainter HSC sources down to ∼24.3 mag, which are expected to have lower metallicity.
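
The conversion between 12 + log(O/H) and the solar fraction quoted above is a single power of ten, sketched below. The solar abundance of 12 + log(O/H) = 8.69 is an assumption of this sketch; it is the value implied by the correspondence between Z/Z⊙ < 0.1 and 12 + log(O/H) < 7.69. The function names are hypothetical.

```python
SOLAR_12_LOG_OH = 8.69  # assumed solar value, implied by 0.1 Zsun <-> 7.69
EMPG_THRESHOLD = 7.69   # EMPG criterion in 12 + log(O/H)


def solar_fraction(twelve_log_oh):
    """Metallicity in solar units from 12 + log(O/H)."""
    return 10.0 ** (twelve_log_oh - SOLAR_12_LOG_OH)


def is_empg(twelve_log_oh):
    """True if the galaxy satisfies the EMPG criterion 12 + log(O/H) < 7.69."""
    return twelve_log_oh < EMPG_THRESHOLD


for value in (6.90, 7.69, 8.45):
    print(f"12+log(O/H) = {value:.2f} -> Z/Zsun = {solar_fraction(value):.3f}, "
          f"EMPG: {is_empg(value)}")
```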

Figure 18.  ${\mathrm{EW}}_{0}$(Hβ) as a function of metallicity of our metal-poor galaxies from HSC-EMPG and SDSS-EMPG source catalogs (red stars). The solid line indicates the EMPG criterion given in this paper, 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69. Galaxies that satisfy the EMPG condition in our metal-poor galaxy sample are marked with a large circle. The open star indicates a galaxy whose metallicity is obtained with the empirical relation of Izotov et al. (2019a), not with the direct method because of the nondetection of [O iii] λ4363 and [O ii] λλ7320,7330 emission lines (Section 7.2). We also present GPs (Yang et al. 2017a; green triangle), BBs (Yang et al. 2017b; cyan square), and metal-poor galaxies (Sánchez Almeida et al. 2016; open circle) from the literature for comparison. With diamonds, we show representative metal-poor galaxies (or clumps of them), J0811+4730 (Izotov et al. 2018b), SBS 0335−052 (e.g., Izotov et al. 2009), AGC 198691 (Hirschauer et al. 2016), J1234+3901 (Izotov et al. 2019b), Little Cub (Hsyu et al. 2017), DDO 68 (Pustilnik et al. 2005; Annibali et al. 2019), IZw 18 (e.g., Izotov & Thuan 1998; Thuan & Izotov 2005), and Leo P (Skillman et al. 2013) of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 7.0–7.2.

In Figure 18, we also show GPs (Yang et al. 2017a; green triangle), BBs (Yang et al. 2017b; cyan square), and local metal-poor galaxies (Sánchez Almeida et al. 2016, SA16 hereafter; open circle) for comparison. We also compare them with the representative metal-poor galaxies in the range of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 7.0–7.2, J0811+4730 (Izotov et al. 2018b), SBS 0335−052 (e.g., Izotov et al. 2009), AGC 198691 (Hirschauer et al. 2016), J1234+3901 (Izotov et al. 2019b), Little Cub (Hsyu et al. 2017), DDO 68 (Pustilnik et al. 2005; Annibali et al. 2019), IZw 18 (e.g., Izotov & Thuan 1998; Thuan & Izotov 2005), and Leo P (Skillman et al. 2013), shown with diamonds. Although ${\mathrm{EW}}_{0}$(Hα) has been used to select high-EW EMPGs in the models (Section 3.2.3), we compare ${\mathrm{EW}}_{0}$(Hβ) here because some of the Hα emission lines are saturated in our observations. The EW condition used in the model, ${\mathrm{EW}}_{0}$(Hα) > 1000 Å, corresponds to ${\mathrm{EW}}_{0}$(Hβ) > 200 Å, given the tight correlation between ${\mathrm{EW}}_{0}$(Hα) and ${\mathrm{EW}}_{0}$(Hβ) demonstrated in Figure 19. We find that our metal-poor galaxy sample covers a high ${\mathrm{EW}}_{0}$(Hβ) range of ∼100–300 Å. Most of the BBs and the representative metal-poor galaxies also show high EWs of ∼100–300 Å. These high ${\mathrm{EW}}_{0}$(Hβ) values (∼100–300 Å) are in contrast to the metal-poor galaxy sample of SA16, in which most galaxies show ${\mathrm{EW}}_{0}$(Hβ) ≲ 100 Å. As suggested in Figure 11, galaxies that consist of younger stellar populations have higher EWs of Balmer emission lines. Thus, the high ${\mathrm{EW}}_{0}$(Hβ) values may suggest that our metal-poor galaxies, BBs, and the representative metal-poor galaxies possess younger stellar populations than the metal-poor galaxies of SA16.
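The correspondence between the two EW thresholds can be traced with a short worked estimate. It is only a sketch, using the line-flux ratio, the continuum ratio, and the fitted slope quoted in the caption of Figure 19, together with the definition of an EW as line flux divided by continuum flux density:

$$\frac{{\mathrm{EW}}_{0}({\rm{H}}\alpha )}{{\mathrm{EW}}_{0}({\rm{H}}\beta )}=\frac{F({\rm{H}}\alpha )/F({\rm{H}}\beta )}{{f}_{\lambda ,0}(6563\,\mathring{\rm A} )/{f}_{\lambda ,0}(4861\,\mathring{\rm A} )}\approx \frac{2.7\mbox{--}3.0}{0.5}\approx 5.4\mbox{--}6.0$$

This range is consistent with the fitted slope of 5.47, for which ${\mathrm{EW}}_{0}$(Hα) > 1000 Å maps to ${\mathrm{EW}}_{0}$(Hβ) > 1000/5.47 ≈ 183 Å, i.e., roughly the 200 Å threshold quoted above.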

Figure 19. Relation between the rest-frame EWs of Hα and Hβ for metal-poor galaxies in the literature (Kunth & Östlin 2000; Kniazev et al. 2003; Guseva et al. 2007; Izotov & Thuan 2007; Izotov et al. 2009, 2012; Pustilnik et al. 2010; Pilyugin et al. 2012; Sánchez Almeida et al. 2016; Guseva et al. 2017). The best least-squares fit is shown with a solid line, which is ${\mathrm{EW}}_{0}$(Hα) = 5.47 × ${\mathrm{EW}}_{0}$(Hβ). Because metal-poor galaxies are less dusty, a flux ratio of F(Hα)/F(Hβ) becomes almost constant (∼2.7–3.0, determined by the case B recombination) in most cases. In addition, a ratio of the continuum level at Hα and Hβ, fλ,0(6563 Å)/fλ,0(4861 Å) always becomes ∼0.5 because the continuum slope differs little among metal-poor galaxies at wavelengths of λ > 4000 Å. Thus, the tight relation between ${\mathrm{EW}}_{0}$(Hα) and ${\mathrm{EW}}_{0}$(Hβ) is only applicable to metal-poor galaxies.

Figure 20. Illustration of the large-scale structure slice around SDSS J2327−0200 (R.A. = 23:27:43.69, decl. = −02:00:55.89; red circle) projected onto an R.A.–redshift plane. Gray dots represent galaxies selected from the SDSS DR13 spectroscopic catalog. Here we only show galaxies falling within ±5.0° of SDSS J2327−0200 in declination.

8.2. Number Density

We roughly estimate the cosmic number densities of the metal-poor galaxies that we have selected from the HSC and SDSS photometry catalogs (i.e., HSC- and SDSS-EMPG candidates). Here, we assume that all of the HSC- and SDSS-EMPG candidates are real metal-poor galaxies because all 10 of the HSC- and SDSS-EMPG candidates that we observed spectroscopically have been confirmed as real metal-poor galaxies (Section 8.1). Note that we estimate here the number densities not of EMPGs but of our sample galaxies, to see whether or not our sample galaxies are rare. The HSC and SDSS broadband filters select EMPGs at z < 0.035 and z < 0.030, respectively, which correspond to 149 and 128 Mpc in cosmological physical distance. The redshift-range difference (z < 0.035 and z < 0.030) is caused by the different response curves of the HSC and SDSS broadband filters. Because we have selected 27 (86) EMPG candidates from the HSC (SDSS) data, whose effective observation area is 509 (14,555) deg$^{2}$, within z < 0.035 (z < 0.030), we obtain a number density of $1.5\times {10}^{-4}$ ($2.8\times {10}^{-5}$) Mpc$^{-3}$ from the HSC (SDSS) data. As suggested by previous surveys (Cardamone et al. 2009; Yang et al. 2017b), we confirm again that metal-poor galaxies with strong emission lines are rare in the local universe. We also find that the number density of metal-poor galaxies is 10 times higher in the HSC data than in the SDSS data. This difference is explained by the fact that fainter galaxies are more abundant and that our HSC metal-poor galaxies (median: i ∼ 22.5 mag) are ∼30 times fainter than our SDSS metal-poor galaxies (median: i ∼ 18.8 mag). The number density estimation may depend on the selection criteria and on the completeness and purity of our EMPG candidate samples.
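The arithmetic behind these number densities can be sketched as follows. This is a rough illustration rather than the authors' procedure: it assumes the survey volume is a simple Euclidean cone, V = (Ω/3) d³, with the areas and distances quoted above, and all function names are hypothetical.

```python
import math


def cone_volume_mpc3(area_deg2: float, distance_mpc: float) -> float:
    """Volume of a survey cone, (solid angle / 3) * d^3 (Euclidean approximation)."""
    solid_angle_sr = area_deg2 * (math.pi / 180.0) ** 2  # deg^2 -> steradian
    return solid_angle_sr / 3.0 * distance_mpc ** 3


def number_density_mpc3(n_sources: int, area_deg2: float, distance_mpc: float) -> float:
    """Number of sources per cubic megaparsec within the survey cone."""
    return n_sources / cone_volume_mpc3(area_deg2, distance_mpc)


def flux_ratio_from_mag_difference(delta_mag: float) -> float:
    """Flux ratio corresponding to a magnitude difference."""
    return 10.0 ** (0.4 * delta_mag)


if __name__ == "__main__":
    # Candidate counts, areas, and limiting distances quoted in the text.
    n_hsc = number_density_mpc3(27, 509.0, 149.0)
    n_sdss = number_density_mpc3(86, 14555.0, 128.0)
    print(f"HSC : {n_hsc:.2e} Mpc^-3")
    print(f"SDSS: {n_sdss:.2e} Mpc^-3")
    # Median i-band magnitudes of the HSC and SDSS samples.
    print(f"flux ratio: {flux_ratio_from_mag_difference(22.5 - 18.8):.0f}")
```

Under this approximation the densities come out at roughly 1.6 × 10⁻⁴ and 2.8 × 10⁻⁵ Mpc⁻³, close to the quoted values, and the 3.7 mag difference in median magnitude corresponds to a flux ratio of about 30.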

8.3. Environment

To characterize the environment of our metal-poor galaxies, we compare the nearest neighborhood distances (Dnear) of our 10 spectroscopically confirmed galaxies and local, typical SFGs. One thousand local, typical SFGs are randomly chosen from the SDSS DR13 spectroscopic catalog in the range of z = 0.03–0.05. We calculate distances from our 10 spectroscopically confirmed galaxies and from the local, typical SFGs to surrounding galaxies selected from the SDSS DR13 spectroscopic catalog (Figure 20), and then identify the nearest neighbor of each galaxy. The Dnear values of our metal-poor galaxies range from 0.49 to 17.69 Mpc and are summarized in Table 7. Figure 21 compares the Dnear distributions of our metal-poor galaxies and the local, typical SFGs. The average Dnear value of our metal-poor galaxies is 3.83 Mpc, which is about 2.5 times larger than that of local, typical SFGs (1.52 Mpc). We also find that 9 out of our 10 metal-poor galaxies have Dnear values larger than the average of local, typical SFGs (i.e., Dnear > 1.52 Mpc). Statistically, a Kolmogorov–Smirnov test rejects the null hypothesis (i.e., the distributions of the two samples are the same) with a p-value of $1.9\times {10}^{-3}$, suggesting that these distributions are significantly different. Thus, we conclude that our metal-poor galaxies exist in a relatively isolated environment compared to the local, typical SFGs. According to Yang et al. (2017b), their BB galaxy sample also shows significantly larger distances to the nearest neighbors. Filho et al. (2015) also report that most metal-poor galaxies are found in low-density environments. These observational results suggest that the metal-poor galaxies started intensive star formation in an isolated environment.
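The comparison described above can be sketched in a few lines. This is an illustrative, pure-Python version of the two-sample Kolmogorov–Smirnov statistic, not the authors' implementation. The Dnear values are the central values listed in column (13) of Table 7; the comparison sample of typical SFGs is not tabulated in the paper and would have to be supplied by the reader. Function names are hypothetical.

```python
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical cumulative distribution function of a sorted sample at x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)


# Nearest-neighbor distances (Mpc) of the 10 galaxies, Table 7, column (13).
D_NEAR_MPC = [1.56, 2.25, 2.28, 1.54, 3.05, 0.49, 17.69, 14.33, 2.15, 4.00]

# Average D_near of local, typical SFGs quoted in the text (Mpc).
TYPICAL_SFG_MEAN_MPC = 1.52

# Number of galaxies more isolated than the typical-SFG average.
n_above = sum(d > TYPICAL_SFG_MEAN_MPC for d in D_NEAR_MPC)

if __name__ == "__main__":
    print(f"{n_above} of {len(D_NEAR_MPC)} galaxies have D_near > 1.52 Mpc")
```

Given a comparison sample, `ks_statistic(D_NEAR_MPC, comparison)` returns the KS statistic; a p-value could then be obtained with a standard routine such as `scipy.stats.ks_2samp`.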

Figure 21. Normalized histogram of the nearest neighborhood distances of our 10 metal-poor galaxies (top panel) and local, typical SFGs obtained from SDSS (bottom panel). The number of galaxies in each bin (N) is normalized by the total number of galaxies (Ntot). The dashed lines indicate average values of the nearest neighborhood distances of our metal-poor galaxies (3.83 Mpc) and typical SFGs (1.52 Mpc). The bin between 10 and 11 represents the number of galaxies whose nearest neighborhood distance is beyond 10 Mpc.

Table 7.  Parameters of Our Metal-poor Galaxies

| # | ID | EMPG? | F(Hβ) (erg s$^{-1}$ cm$^{-2}$) | ${\mathrm{EW}}_{0}$(Hβ) (Å) | 12 + log(O/H), Direct | 12 + log(O/H), Empirical | log(${M}_{\star }$) (${M}_{\odot }$) | log(SFR) (${M}_{\odot }$ yr$^{-1}$) | E(B − V) (mag) | Age (Myr) | σ (km s$^{-1}$) | Dnear (Mpc) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) | (13) |
| 1 | HSC J1429−0110 | no | 67.1 ± 2.2 | ${172.6}_{-0.6}^{+0.7}$ | 8.27 ± 0.02 | ⋯ | ${6.55}_{-0.09}^{+0.13}$ | 0.426 ± 0.013 | 0.35 ± 0.02 | 3.4 | ⋯ | 1.56 |
| 2 | HSC J2314+0154 | yes | 2.61 ± 0.10 | ${213.3}_{-17.6}^{+23.4}$ | ⋯ | ${7.225}_{-0.024}^{+0.027}$ | 5.17 ± 0.01 | −0.851 ± 0.013 | 0.28 ± 0.03 | 4.1 | ${21.6}_{-7.7}^{+0.8}$ | 2.25 |
| 3 | HSC J1142−0038 | no | 4.27 ± 0.15 | ${111.9}_{-1.3}^{+1.4}$ | 7.72 ± 0.03 | ⋯ | ${4.95}_{-0.01}^{+0.04}$ | −1.066 ± 0.013 | ${0.00}_{-0.00}^{+0.02}$ | 3.7 | ${21.9}_{-7.6}^{+0.8}$ | 2.28 |
| 4 | HSC J1631+4426 | yes | 1.31 ± 0.04 | ${123.5}_{-2.8}^{+3.5}$ | 6.90 ± 0.03 | ${7.175}_{-0.005}^{+0.006}$ | ${5.89}_{-0.09}^{+0.10}$ | −1.276 ± 0.013 | 0.19 ± 0.03 | 50 | ⋯ | 1.54 |
| 5 | SDSS J0002+1715 | no | 45.7 ± 1.4 | 103.9 ± 0.2 | 8.22 ± 0.01 | ⋯ | 7.06 ± 0.03 | −0.002 ± 0.013 | ${0.00}_{-0.00}^{+0.01}$ | 31 | ${27.7}_{-5.9}^{+0.9}$ | 3.05 |
| 6 | SDSS J1642+2233 | no | 46.3 ± 1.5 | ${153.7}_{-0.4}^{+0.5}$ | 8.45 ± 0.01 | ⋯ | ${6.06}_{-0.13}^{+0.03}$ | −0.169 ± 0.013 | 0.02 ± 0.02 | 25 | ${29.8}_{-5.6}^{+1.0}$ | 0.49 |
| 7 | SDSS J2115−1734 | yes | 69.9 ± 2.1 | ${214.0}_{-0.8}^{+0.9}$ | 7.68 ± 0.01 | ⋯ | 6.56 ± 0.02 | 0.266 ± 0.013 | 0.17 ± 0.04 | 21 | ${28.6}_{-5.8}^{+0.9}$ | 17.69 |
| 8 | SDSS J2253+1116 | no | 139 ± 4.28 | 264.7 ± 0.3 | 7.973 ± 0.002 | ⋯ | 5.78 ± 0.01 | −0.541 ± 0.013 | ${0.00}_{-0.00}^{+0.01}$ | 4.1 | ${18.9}_{-9.4}^{+0.7}$ | 14.33 |
| 9 | SDSS J2310−0211 | no | 99.3 ± 3.1 | 127.6 ± 0.2 | ${7.890}_{-0.004}^{+0.003}$ | ⋯ | 6.99 ± 0.03 | −0.155 ± 0.013 | ${0.01}_{-0.01}^{+0.02}$ | 51 | ${12.6}_{-12.6}^{+0.5}$ | 2.15 |
| 10 | SDSS J2327−0200 | no | 40.7 ± 1.2 | 111.0 ± 0.2 | ${7.866}_{-0.005}^{+0.004}$ | ⋯ | ${6.51}_{-0.03}^{+0.02}$ | −0.180 ± 0.013 | ${0.00}_{-0.00}^{+0.02}$ | 22 | ${12.0}_{-12.0}^{+0.5}$ | 4.00 |

Note. (1) Number. (2) ID. (3) Whether or not an object satisfies the EMPG definition, 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69. If yes (no), we write yes (no) in the column. (4) Hβ emission-line flux normalized in units of 10−15 erg s−1 cm−2, which is corrected for the SL and the dust extinction. These errors include 3% systematic uncertainties caused by the absolute flux calibration (e.g., Oke 1990). (5) Rest-frame EW of a Hβ emission line. (6)–(7) Gas-phase metallicity obtained with the direct method and the empirical relation of Izotov et al. (2019a). (8) Stellar mass. (9) SFR. (10) Color excess. (11) Maximum stellar age. (12) Velocity dispersion obtained from a Hβ emission line. An instrumental velocity dispersion is already removed. Note that the emission-line broadening from a galaxy rotation is not eliminated. (13) Distance to the nearest neighborhood selected in the SDSS DR13 spectroscopic catalog.

The formation mechanism of metal-poor strong-line galaxies in the local universe is an open question. One possible explanation is that the cosmic UV background had prevented star formation in metal-poor intergalactic gas until recently in low-density regions, but star formation was suddenly triggered by the collapse or collision of metal-poor gas. Sánchez Almeida et al. (2013, 2015) investigated tadpole galaxies, which are one of the typical metal-poor galaxy populations, and found that the blue head of tadpole galaxies has significantly lower metallicity than the rest of the galaxy body by factors of 3–10. The Northern Extended Millimeter Array millimeter-wave interferometer has revealed that a tadpole galaxy possesses molecular gas at its head (Elmegreen et al. 2018). Filho et al. (2013) demonstrate that metal-poor galaxies are surrounded by asymmetric H i gas, which can be shaped by the accretion of metal-poor gas. However, Filho et al. (2013) and SA16 reported various morphologies of metal-poor galaxies, which suggest that the star formation mechanism is different among metal-poor galaxies (i.e., multiple mechanisms exist). The formation mechanism of low-mass, metal-poor galaxies in the field environment is still under debate. Statistical studies are necessary with larger samples.

8.4. M⋆–SFR Relation

Figure 22 shows SFRs and stellar masses of our metal-poor galaxies, BBs, GPs, metal-poor galaxies of SA16, and the representative metal-poor galaxies from the literature. Our metal-poor galaxies, BBs, GPs, and the representative metal-poor galaxies have higher SFRs than typical z ∼ 0 galaxies (i.e., the z ∼ 0 star formation main sequence) for a given stellar mass. In other words, they have higher sSFRs than those given by the z ∼ 0 main sequence. In particular, our metal-poor galaxies have low stellar-mass values in the range of log(${M}_{\star }$/${M}_{\odot }$) < 6.0, which are lower than those of the BBs, GPs, and metal-poor galaxies of SA16.

Figure 22. Stellar mass and SFR of our metal-poor galaxies with GPs, BBs, and local metal-poor galaxies. Symbols are the same as in Figure 18. We also show the stellar-mass and SFR distribution of typical z ∼ 0 SFGs (i.e., z ∼ 0 main sequence; black mesh), which we derive from the value-added catalog of SDSS DR7 (Kauffmann et al. 2003; Brinchmann et al. 2004; Salim et al. 2007). The solid lines represent the main sequences at z ∼ 2 and z ∼ 4–5 (Shim et al. 2011; Shivaei et al. 2016). The SFRs of Shivaei et al. (2016) and Shim et al. (2011) are estimated based on the Hα flux. We convert stellar masses and SFRs derived from the literature into those of the Chabrier (2003) IMF, applying conversion factors obtained by Madau & Dickinson (2014). The gray solid lines and accompanying numbers indicate log(sSFR/Gyr−1) = (−2.0, −1.0, ..., 4.0). The stellar masses and SFRs of the representative metal-poor galaxies are derived from the literature (Annibali et al. 2013; Rhode et al. 2013; Hunt et al. 2015; Hirschauer et al. 2016; Sacchi et al. 2016; Hsyu et al. 2017; Izotov et al. 2018b, 2019b)

The stellar masses of our metal-poor galaxies fall on the typical stellar-mass range of globular clusters, i.e., log(${M}_{\star }$/${M}_{\odot }$) ∼ 4–6. Thus, one may guess that these metal-poor galaxies might be globular clusters that have been formed very recently. However, further investigation is necessary to understand the association between metal-poor galaxies and globular clusters, which will be discussed in Paper III (Y. Isobe et al. 2020, in preparation).

The solid lines in Figure 22 show the main sequences of typical galaxies at z ∼ 2 (Shivaei et al. 2016) and z ∼ 4–5 (Shim et al. 2011). As suggested by the solid lines, the main sequence evolves toward higher SFR for a given stellar mass with increasing redshift. Our metal-poor galaxies have higher SFRs for a given ${M}_{\star }$ than the z ∼ 0 main sequence, falling onto the extrapolation of the z ∼ 4–5 main sequence. Our metal-poor galaxies have sSFR values as high as those of low-${M}_{\star }$ galaxies at z ≳ 3 and local LyC leakers (e.g., log(sSFR/Gyr$^{-1}$) ∼ 1–3; Ono et al. 2010; Shim et al. 2011; Vanzella et al. 2017; Izotov et al. 2018b). Table 8 summarizes the sSFR values of our metal-poor galaxies and other galaxy populations from the literature for reference. Based on the high sSFRs, we suggest that our metal-poor galaxies are undergoing intensive star formation comparable to that of the low-${M}_{\star }$ SFGs at z ≳ 3.
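For reference, the sSFR of an individual galaxy follows directly from the stellar mass and SFR columns of Table 7. The sketch below is illustrative only: it uses central values without uncertainties, the function name is hypothetical, and it does not attempt to reproduce the sample average listed in Table 8.

```python
import math


def log_ssfr_per_gyr(log_mstar_msun: float, log_sfr_msun_per_yr: float) -> float:
    """log10(sSFR / Gyr^-1) = log SFR - log M* + 9, because 1 Gyr = 1e9 yr."""
    return log_sfr_msun_per_yr - log_mstar_msun + 9.0


if __name__ == "__main__":
    # (log M*, log SFR) central values from columns (8) and (9) of Table 7.
    examples = {
        "HSC J1631+4426": (5.89, -1.276),
        "SDSS J2327-0200": (6.51, -0.180),
    }
    for name, (log_m, log_sfr) in examples.items():
        print(f"{name}: log(sSFR/Gyr^-1) = {log_ssfr_per_gyr(log_m, log_sfr):.2f}")
    # The ~300 Gyr^-1 quoted for the sample corresponds to this logarithm.
    print(f"log10(300) = {math.log10(300.0):.2f}")
```

Both example galaxies land in the log(sSFR/Gyr$^{-1}$) ∼ 1–3 range quoted above for high-redshift, low-mass galaxies.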

Table 8.  Typical Values of sSFR

| Population | log(sSFR) (Gyr$^{-1}$) | Redshift | Reference |
|---|---|---|---|
| (1) | (2) | (3) | (4) |
| Our EMPGs | 2.47 | 0.007–0.03 | This work |
| EMPGs (SDSS DR7) | 0.34 | ≲0.1 | SA16 |
| BBs | 1.39 | ∼0.05 | Y17b |
| GPs | 1.38 | ∼0.3 | Y17a |
| LyC leaker (${f}_{\mathrm{esc}}^{\mathrm{LyC}}$ = 0.46) | 1.29 | 0.37 | I18 |
| Main Sequence (z ∼ 0) | −0.20 | ∼0 | SDSS DR7 |
| Main Sequence (z ∼ 2) | ∼0.0–0.5 | ∼2 | S16 |
| Main Sequence (z ∼ 4–5) | ∼1.0–1.5 | ∼4–5 | S11 |
| Low-${M}_{\star }$ SFG ($z\sim 3$) | 1.10/1.80 | 3.12 | V16 |
| Little Blue Dots | ≳2.0 | 2–5 | E17 |
| LAEs (z = 5.7) | 3.05 | 5.7 | O10 |
| LAEs (z = 6.6) | 3.05 | 6.6 | O10 |

Note. (1) Galaxy population. (2) Average of sSFR in units of log(Gyr−1). We calculate a linear average of each sample here. (3) Typical redshift. (4) References of sSFR—SDSS DR7: (Kauffmann et al. 2003; Brinchmann et al. 2004; Salim et al. 2007), SA16: Sánchez Almeida et al. (2016), Y17b: Yang et al. (2017b), Y17a: Yang et al. (2017a), I18: Izotov et al. (2018a), S16: Shivaei et al. (2016), S11: Shim et al. (2011), V16: Vanzella et al. (2017), E17: Elmegreen & Elmegreen (2017), O10: Ono et al. (2010).

Our SFR estimates are obtained under the simple assumption of Kennicutt (1998) because we only have optical observational results for now. That assumption can break down in very young (≲10 Myr), metal-poor, low-${M}_{\star }$ galaxies because the conversion factor is sensitive to the IMF, the star formation history, the metallicity, and the escape fraction and dust absorption of ionizing photons (Kennicutt 1998). Other SFR uncertainties may arise from additional ionizing-photon sources, such as a low-luminosity AGN, shock-heated gas, galactic outflows, and X-ray binaries, which are not included in the stellar synthesis models used in the calibration of Kennicutt (1998). Further multiwavelength observations are required to understand the SFR, star formation history, and star formation mechanism of very young, metal-poor, low-${M}_{\star }$ galaxies.

8.5. MZ Relation

Figure 23 exhibits a mass–metallicity (MZ) relation of our metal-poor galaxies. Our metal-poor galaxies are located around the low-mass end of log(${M}_{\star }$/${M}_{\odot }$) = 5–7 among the metal-poor galaxy samples of BBs, GPs, the SA16 metal-poor galaxies, and the representative metal-poor galaxies in Figure 23. The metallicities of our metal-poor galaxies extend over a relatively wide range, 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 6.9–8.5. The gray shaded regions in Figure 23 represent the 68th and 95th percentile distributions of local SFGs of Zahid et al. (2012, Z12 hereafter), who have reported that the metallicity scatter of galaxies becomes larger with decreasing metallicity for a given mass. Although the extrapolation is applied below log(${M}_{\star }$/${M}_{\odot }$) = 8.4 here, five of our metal-poor galaxies fall in the 68th percentile distribution of the local MZ relation.

Figure 23. Mass-metallicity relation of our metal-poor galaxies. Symbols are the same as in Figure 18. The solid and dashed lines indicate averaged local SFGs given by Andrews & Martini (2013) and Z12 from SDSS data, respectively. The dark gray and light gray shaded regions represent the 68th and 95th percentile distributions of SFGs of Z12, although the extrapolation is applied below log(${M}_{\star }$/${M}_{\odot }$) = 8.4. We also show relatively metal-enriched dwarfs of P08 (crosses) and Z12 (pluses) from SDSS, as well as DEEP2 galaxies of Z12 (dots) in the stellar-mass range of log(${M}_{\star }$/${M}_{\odot }$) < 8.0. The typical metallicity error of our metal-poor galaxies is Δ(O/H) ∼ 0.01 dex.

Interestingly, we find that our other five metal-poor galaxies are located above the 68th percentile distribution given by Z12, i.e., higher metallicities for a given stellar mass. We refer to the five metal-poor galaxies located above the 68th percentile distribution as "above-MZ galaxies" hereafter. Our above-MZ galaxies have moderate metallicities of 12 +$\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 8.0 in spite of their very low M (i.e., log(${M}_{\star }$/${M}_{\odot }$) = 5–7). A possible explanation for these above-MZ galaxies has been given by Z12 and Peeples et al. (2008, P08 hereafter). In Figure 23, we also show the low-z galaxy samples of Z12 and P08 in the stellar-mass range of log(${M}_{\star }$/${M}_{\odot }$) < 8.0. In a sample from the DEEP2 survey, Z12 have found galaxies with a metallicity higher than the local MZ relation (Figure 23) and a higher SFR for a given stellar mass, which is similar to our above-MZ galaxies. Z12 also found counterpart galaxies of their DEEP2 galaxies (i.e., above both the MZ and M–SFR relations) in the SDSS data (Figure 23). Z12 argued that their DEEP2 and SDSS galaxies may be transitional objects, which was suggested by P08, from gas-rich dwarf irregulars to gas-poor dwarf spheroidals and ellipticals. P08 also investigated local galaxies whose metallicities are higher than the local MZ relation, with SDSS data. Unlike our above-MZ galaxies and the Z12 galaxies, the P08 sample shows redder colors and lower SFRs consistent with the local M–SFR relation. P08 claimed that the P08 galaxies may be in a later stage of the transition from gas-rich dwarf irregulars to gas-poor dwarf spheroidals and ellipticals, and that the gas deficit leads to the low SFRs and high metallicities. It should be noted that the Z12 and P08 galaxies are located in a relatively isolated environment, similarly to our above-MZ galaxies. 
If our above-MZ galaxies are explained by an early stage of the transition, our above-MZ galaxies may be losing (or have lost) gas despite their very recent star formation suggested by their high ${\mathrm{EW}}_{0}$(Hβ) (Section 8.1). The gas loss can be caused by the galactic outflow triggered by supernovae (SNe), for example, in young galaxies such as our above-MZ galaxies. However, to characterize these above-MZ galaxies, more observations, such as far-infrared and radio observations which trace emission from molecular gas, H i gas, and the SNe, are necessary.

Figure 24 demonstrates the low-${M}_{\star }$, low-metallicity ends of the MZ relation. Here we compare our metal-poor galaxies with the representative metal-poor galaxies. In comparison with the representative metal-poor galaxies, we find that HSC J1631+4426 (12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.90, i.e., Z/${Z}_{\odot }$ = 0.016) has the lowest metallicity ever reported. The metallicity of HSC J1631+4426 is lower than those of J0811+4730 (Izotov et al. 2018b), AGC 198691 (Hirschauer et al. 2016), SBS 0335−052 (e.g., Izotov et al. 2009), and J1234+3901 (Izotov et al. 2019b). We emphasize that the discovery of the very faint EMPG HSC J1631+4426 (i = 22.5 mag) was enabled by the deep, wide-field HSC-SSP data and could not have been achieved by the previous SDSS surveys. This paper presents just the first spectroscopic result of 4 out of the 27 HSC-EMPG candidates. We expect to discover more EMPGs from our HSC-EMPG candidates in the ongoing spectroscopy.

Figure 24. Same as the top panel of Figure 23, but zoomed in around the low-${M}_{\star }$, low-metallicity ends.

8.6. BPT Diagram

Figure 25 is an emission-line diagnostic diagram of [N II]/Hα and [O III]/Hβ (i.e., BPT diagram) with our metal-poor galaxies. Our metal-poor galaxies fall in the SFG region defined by the maximum photoionization models under stellar radiation (Kewley et al. 2001). From the optical emission-line ratios, we do not find any evidence that our metal-poor galaxies are affected by an AGN or shock heating. However, Kewley et al. (2013) suggest that metal-poor gas heated by AGN radiation or shocks also shows emission-line ratios falling in the SFG region defined by Kewley et al. (2001). We thus do not exclude the possibility of the existence of a metal-poor AGN or shock heating of metal-poor gas. We will discuss the ionization state of the ISM and the ionizing-photon sources in Paper II.
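The SFG-region test on the BPT diagram can be sketched as below. The coefficients are those of the widely used Kewley et al. (2001) maximum-starburst parameterization, log([O III]/Hβ) = 0.61/(log([N II]/Hα) − 0.47) + 1.19. They are not quoted in this paper, so treat them as an assumption of this sketch. The function names and the example line ratios are hypothetical.

```python
def kewley01_max_starburst(log_nii_ha: float) -> float:
    """Maximum-starburst curve: log([O III]/Hb) as a function of log([N II]/Ha)."""
    return 0.61 / (log_nii_ha - 0.47) + 1.19


def is_in_sfg_region(log_nii_ha: float, log_oiii_hb: float) -> bool:
    """True if a point lies below the maximum-starburst curve (SFG region)."""
    # The curve has an asymptote at log([N II]/Ha) = 0.47; points at or
    # beyond it are on the AGN side of the diagram.
    if log_nii_ha >= 0.47:
        return False
    return log_oiii_hb < kewley01_max_starburst(log_nii_ha)


if __name__ == "__main__":
    # Hypothetical line ratios, for illustration only.
    print(is_in_sfg_region(-1.5, 0.7))  # weak [N II], strong [O III]
    print(is_in_sfg_region(0.0, 1.0))   # strong [N II] and [O III]
```

As the text notes, falling below this curve does not by itself rule out a metal-poor AGN or shock-heated gas.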

Figure 25. Our metal-poor galaxies on the BPT diagram (red stars). The black mesh represents z ∼ 0 SFGs and AGNs derived from the emission-line catalog of SDSS DR7 (Tremonti et al. 2004). The solid curve indicates the maximum photoionization models that can be achieved under the assumption of stellar radiation (Kewley et al. 2001). The region below the solid curve is defined as the SFG region, while the upper-right side is defined as the AGN region.

8.7. Velocity Dispersion

We estimate the velocity dispersions of 8 metal-poor galaxies observed with MagE out of our 10 metal-poor galaxies. We do not estimate the velocity dispersions of the other two galaxies observed with LDSS-3, DEIMOS, and FOCAS due to their spectral resolutions not being high enough to resolve emission lines of our very low-mass sample. We measure emission-line widths of our metal-poor galaxies using a Gaussian fit to a Hβ emission line, obtaining σobs = 36.5–45.3 km s−1. We obtain the intrinsic velocity dispersions σ of our metal-poor galaxies with

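$$\sigma =\sqrt{{\sigma }_{\mathrm{obs}}^{2}-{\sigma }_{\mathrm{inst}}^{2}-{\sigma }_{\mathrm{th}}^{2}-{\sigma }_{\mathrm{fs}}^{2}},\qquad (6)$$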

where σinst, σth, and σfs are the instrumental, thermal, and fine-structure line broadening, under the assumption that the broadening can be approximated by a Gaussian profile (e.g., Chávez et al. 2014). We measure the instrumental line broadening with arc-lamp frames and find σinst = 26.4 and 33.3 km s$^{-1}$ with the slit widths of 0.85″ and 1.20″, respectively. We calculate σth with

$${\sigma }_{\mathrm{th}}=\sqrt{\frac{k{T}_{{\rm{e}}}}{m}},\qquad (7)$$

where k and m represent the Boltzmann constant and the hydrogen mass, respectively. We use σth = 9.1 km s−1, which is obtained from Equation (7) under the assumption of Te = 10,000 K. We adopt the fine-structure line broadening of Hβ, σfs = 2.4 km s−1 (García-Díaz et al. 2008). Then, we calculate σ with Equation (6), obtaining σ values in the range of 12.0–29.8 km s−1. The obtained σ values are summarized in Table 7. For a careful comparison, we have also checked σ values under the assumption of Te = 15,000 K in Equation (7), which is more consistent with the Te(O iii) measurements as shown in Table 5. However, in this case, two of the σ estimates become zero. Thus, we include the σ differences between the two assumptions (i.e., Te = 10,000 and 15,000 K) in σ errors. We do not remove the effect of the emission-line broadening caused by the dynamical galaxy rotation because the spectral resolution of MagE is still not enough to separate the rotation and the dispersion. However, previous studies with high spectral-resolution spectroscopy (e.g., Melnick et al. 1988; Chávez et al. 2014) find that most galaxies with σ ≲ 60 km s−1 are dispersion supported rather than rotation supported, which suggests that our metal-poor galaxies (σ = 12.0–29.8 km s−1) may be also dispersion supported. Even if this assumption is not true, our σ estimates at least provide upper limits on the velocity dispersions.
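The broadening correction described above can be sketched numerically. This is an illustration, not the authors' code: it assumes that Equation (6) subtracts the instrumental, thermal, and fine-structure terms from the observed width in quadrature and that Equation (7) is σth = (kTe/m)^{1/2}, which reproduces the quoted σth = 9.1 km s⁻¹ at Te = 10,000 K. Clamping negative variances to zero is also an assumption of this sketch, and function names are hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant (J/K)
M_H = 1.6735575e-27   # mass of a hydrogen atom (kg)


def thermal_broadening_kms(t_e_kelvin: float) -> float:
    """Thermal line broadening sqrt(k * Te / m) in km/s."""
    return math.sqrt(K_B * t_e_kelvin / M_H) / 1.0e3


def intrinsic_dispersion_kms(sigma_obs: float, sigma_inst: float,
                             sigma_th: float, sigma_fs: float = 2.4) -> float:
    """Remove instrumental, thermal, and fine-structure broadening in quadrature."""
    variance = sigma_obs**2 - sigma_inst**2 - sigma_th**2 - sigma_fs**2
    # A negative variance means the line is unresolved; report zero.
    return math.sqrt(max(variance, 0.0))


if __name__ == "__main__":
    sigma_th = thermal_broadening_kms(10_000.0)
    print(f"sigma_th(10,000 K) = {sigma_th:.1f} km/s")
    print(f"sigma_th(15,000 K) = {thermal_broadening_kms(15_000.0):.1f} km/s")
    # Largest observed width quoted in the text with the wider-slit resolution.
    sigma = intrinsic_dispersion_kms(45.3, 33.3, sigma_th)
    print(f"sigma = {sigma:.1f} km/s")
```

With the extreme observed widths quoted in the text (36.5 and 45.3 km s⁻¹) and σinst = 33.3 km s⁻¹, this gives roughly 12 and 29 km s⁻¹, close to the reported range of 12.0–29.8 km s⁻¹.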

The top panel of Figure 26 demonstrates velocity dispersions of our metal-poor galaxies as a function of V-band absolute magnitude in comparison with the stellar velocity dispersions of massive galaxies (Prugniel & Simien 1996), dwarf galaxies (a compiled catalog of Lin & Ishak 2016), and globular clusters (Harris 1996). We find that our metal-poor galaxies fall on a velocity-dispersion sequence made of massive galaxies, dwarf galaxies, and globular clusters in the top panel of Figure 26. The compiled dwarf galaxy catalog of Lin & Ishak (2016) is derived from the literature of dwarf galaxies in the LG (≲3 Mpc) reported by McConnachie (2012), Kirby et al. (2015a, 2015b), Simon et al. (2015), and Martin et al. (2016). On the other hand, our metal-poor galaxies are low-M galaxies outside the LG. The velocity dispersions of our metal-poor galaxies trace the gas kinematics while those of the massive galaxies, dwarf galaxies, and globular clusters shown here are estimated mainly from the motion of individual stars. Indeed, as shown in the bottom panel of Figure 26, the velocity dispersions of gas (green circles) and stars (black circles) are different by a factor of 1.0–1.3 in the range of log(${M}_{\star }$/${M}_{\odot }$) ∼ 8.0–11.5 (Barat et al. 2019). We also show local blue compact galaxies reported by Chávez et al. (2014) in the bottom panel of Figure 26. Chávez et al. (2014) performed spectroscopy with high spectral resolutions (R ∼ 10,000–20,000) for the local blue compact galaxies of log(${M}_{\star }$/${M}_{\odot }$) ∼ 6.5–9.0. They estimate σ with the Hβ emission line and created a sample of dispersion-supported galaxies. In the bottom panel of Figure 26, we find that our metal-poor galaxies in the range of log(${M}_{\star }$/${M}_{\odot }$) ∼ 6.5–7.0 overlap with the dispersion-supported galaxies of Chávez et al. (2014). Thus, our metal-poor galaxies may also be dispersion supported.

Figure 26. Top: velocity dispersion as a function of optical magnitudes. Stars are the same as in Figure 18. Squares and triangles represent stellar velocity dispersions of bright galaxies and local faint galaxies from the literature, which are compiled by Prugniel & Simien (1996) and Lin & Ishak (2016), respectively. We also show the stellar velocity dispersions of globular clusters (Harris 1996) with blue crosses. To estimate the continuum level in the V band of our metal-poor galaxies, we use an i-band magnitude instead of g- or r-band magnitudes because the g and r bands are strongly affected by strong emission lines. Here we assume a flat continuum from V to i bands in units of erg s−1 cm−2 Hz−1. We confirm that this assumption is correct within ∼0.2 mag by looking at a continuum in the MagE spectra of our metal-poor galaxies. Bottom: same as the top panel, but as a function of stellar mass. Black and green circles represent velocity dispersions obtained with stellar and nebular lines, respectively, from the Sydney-AAO Multi-object Integral field spectrograph (SAMI) galaxy survey (Barat et al. 2019). The gray bars show blue compact galaxies at z ∼ 0.02–0.2 reported by Chávez et al. (2014). The upper and lower limits of the gray bars indicate dynamical masses (Mdyn) and stellar masses of ionizing star clusters (Mcl), respectively. The gray bars represent possible total stellar-mass ranges because the relation Mcl < M* < Mdyn holds by definition.

9. Summary

We search for EMPGs at z ≲ 0.03 to construct a local sample whose galaxy properties are similar to those of high-z galaxies in the early star formation phase (i.e., low M, high sSFR, low metallicity, and young stellar ages). We select EMPGs from the wide-field, deep imaging data of the SSP with HSC in combination with the wide-field, shallow data of SDSS. This work is the first metal-poor galaxy survey that exploits the wide (∼500 deg2), deep (ilim ∼ 26 mag) imaging data of HSC SSP, with which we expect to discover faint EMPGs that SDSS could not detect due to magnitude limitations. To remove contamination more efficiently than a simple color–color selection from our sample, we develop a new selection technique based on ML. We construct an ML classifier that distinguishes EMPGs from other types of objects, which is well trained by model templates of galaxies, stars, and QSOs. By testing our ML classifier with the SDSS photometry+spectroscopy data, we confirm that our ML classifier reaches 86% completeness and 46% purity. Then our ML classifier is applied to HSC and SDSS photometry, obtaining 27 and 86 EMPG candidates, respectively. These EMPG candidates have a wide range of i-band magnitudes, i = 14.8–24.3 mag, thanks to the combination of SDSS and HSC data. We have conducted optical spectroscopy with Magellan/LDSS-3, Magellan/MagE, Keck/DEIMOS, and Subaru/FOCAS for 10 out of the 27 + 86 EMPG candidates. Our main results are summarized below.

  • 1.  
    We confirm that the 10 EMPG candidates are real SFGs at z = 0.007–0.03 with strong emission lines, whose rest-frame Hβ EWs (${\mathrm{EW}}_{0}$) reach 104–265 Å, and a metallicity range of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.90–8.45. Three out of the 10 EMPG candidates satisfy the EMPG criterion of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ < 7.69. The other seven galaxies still show low metallicities (∼0.1–0.6 Z). We thus conclude that our new selection based on ML successfully selects real EMPGs or metal-poor, strong-line SFGs.
  • 2.  
    The number density of our HSC metal-poor galaxies is 1.5 × 10−4 Mpc−3, which is  10 times higher than that of our SDSS metal-poor galaxies (2.8 × 10−5 Mpc−3). This difference is explained by the fact that our HSC metal-poor galaxies (median: i ∼ 22.5 mag) are ∼30 times fainter than our SDSS metal-poor galaxies (median: i ∼ 18.8 mag).
  • 3.  
    To characterize the environment of our metal-poor galaxies, we compare the nearest neighborhood distances (Dnear) of our metal-poor galaxies with those of local, typical SFGs. The Dnear of our metal-poor galaxies range from 0.49 to 17.69 Mpc with an average of 3.83 Mpc, which is ∼2.5 times larger than that of local, typical SFGs (average 1.52 Mpc). With a Kolmogorov–Smirnov test (p = 1.9 × 10−3), we significantly confirm that our metal-poor galaxies are located in the relatively isolated environment compared to the local, typical SFGs.
  • 4.  
    We find that our metal-poor galaxy sample encompasses low metallicities, 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.90–8.45, low stellar masses, log(${M}_{\star }$/${M}_{\odot }$) = 5.0–7.1, and high sSFR (∼300 Gyr−1), suggesting the possibility that they are analogs of high-z, low-mass SFGs.
  • 5.  
    We find that 5 out of our 10 metal-poor galaxies with spectroscopic confirmation have moderate metallicities of 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ ∼ 8.0 in spite of their very low ${M}_{\star }$ (i.e., log(${M}_{\star }$/${M}_{\odot }$) = 5–7), placing them above an extrapolation of the local mass–metallicity relation. One possible explanation is that these five galaxies are in an early stage of the transition from gas-rich dwarf irregulars to gas-poor dwarf spheroidals and ellipticals, as suggested by Peeples et al. (2008) and Zahid et al. (2012).
  • 6.  
    We confirm that HSC J1631+4426 has the lowest metallicity ever reported, 12 + $\mathrm{log}({\rm{O}}/{\rm{H}})$ = 6.90 ± 0.03 (i.e., $Z/{Z}_{\odot }$ = 0.016).
  • 7.  
    Our metal-poor galaxies fall on the SFG region of the BPT diagram, and the optical emission-line ratios show no evidence that our metal-poor galaxies are affected by an AGN or shock heating. However, we do not exclude the possibility of a metal-poor AGN or shock because little is known about low-metallicity AGNs or shocks to date.
  • 8.  
    We roughly measure velocity dispersions of our metal-poor galaxies with the Hβ emission line, which may trace the ionized gas kinematics. Thanks to the high spectral resolution of MagE (R ∼ 4000), we find that our metal-poor galaxies have small velocity dispersions of σ = 12.0–29.8 km s$^{-1}$. The velocity dispersions of our metal-poor galaxies are consistent with the relation between velocity dispersion and V-band magnitude defined by a sequence of low-z bright galaxies, dwarf galaxies in the LG, and globular clusters.
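
Two of the numbers quoted in the summary above follow from simple conversions, which can be sketched as below. This is a minimal illustration, not the pipeline used in this work. The solar abundance 12 + log(O/H) = 8.69 is an assumption inferred from the EMPG criterion (7.69 corresponding to 0.1 solar), and the function names `metallicity_solar` and `flux_ratio` are hypothetical.

```python
# Assumed solar oxygen abundance, 12 + log(O/H) = 8.69. This value is inferred
# from the EMPG threshold of 7.69 (0.1 solar); it is not stated in the summary.
SOLAR_OH = 8.69


def metallicity_solar(oh12: float, solar: float = SOLAR_OH) -> float:
    """Convert 12 + log(O/H) to a metallicity in solar units (Z / Z_sun)."""
    return 10.0 ** (oh12 - solar)


def flux_ratio(mag_faint: float, mag_bright: float) -> float:
    """Flux ratio (bright / faint) implied by a magnitude difference."""
    return 10.0 ** (0.4 * (mag_faint - mag_bright))


if __name__ == "__main__":
    # HSC J1631+4426: 12 + log(O/H) = 6.90
    print(f"Z/Zsun = {metallicity_solar(6.90):.3f}")
    # Median i-band magnitudes of the HSC (22.5) and SDSS (18.8) samples
    print(f"flux ratio = {flux_ratio(22.5, 18.8):.0f}")
```

Under the stated assumption, 12 + log(O/H) = 6.90 corresponds to roughly 0.016 solar, and the 3.7 mag difference between the median i-band magnitudes corresponds to a flux ratio of about 30, consistent with the values quoted in items 2 and 6.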

We are grateful to Lennox Cowie, Akio Inoue, Taddy Kodama, Matthew Malkan, and Daniel Stark for their important and useful comments. We are grateful to John David Silverman and Anne Verhamme for their helpful comments on our survey name. We would like to express our special thanks to Daniel Kelson for his great efforts in helping us reduce and calibrate our MagE data.

We also thank the staff of the Las Campanas observatories, the Subaru telescope, and the Keck observatories for helping us with our observations. The observations were carried out within the framework of the Subaru-Keck time exchange program, where the travel expenses were supported by the Subaru Telescope, which is operated by the National Astronomical Observatory of Japan. The authors wish to recognize and acknowledge the very significant cultural role and reverence that the summit of Maunakea has always had within the indigenous Hawaiian community. We are most fortunate to have the opportunity to conduct observations from this mountain.

The Hyper Suprime-Cam (HSC) collaboration includes the astronomical communities of Japan and Taiwan, and Princeton University. The HSC instrumentation and software were developed by the National Astronomical Observatory of Japan (NAOJ), the Kavli Institute for the Physics and Mathematics of the Universe (Kavli IPMU), the University of Tokyo, the High Energy Accelerator Research Organization (KEK), the Academia Sinica Institute for Astronomy and Astrophysics in Taiwan (ASIAA), and Princeton University. Funding was contributed by the FIRST program from the Japanese Cabinet Office, the Ministry of Education, Culture, Sports, Science and Technology (MEXT), the Japan Society for the Promotion of Science (JSPS), Japan Science and Technology Agency (JST), the Toray Science Foundation, NAOJ, Kavli IPMU, KEK, ASIAA, and Princeton University.

This paper makes use of software developed for the Large Synoptic Survey Telescope. We thank the LSST Project for making their code available as free software at http://dm.lsst.org.

This paper is based on data collected at the Subaru Telescope and retrieved from the HSC data archive system, which is operated by Subaru Telescope and Astronomy Data Center (ADC) at NAOJ. Data analysis was in part carried out with the cooperation of Center for Computational Astrophysics (CfCA), NAOJ.

The Pan-STARRS1 Surveys (PS1) and the PS1 public science archive have been made possible through contributions by the Institute for Astronomy, the University of Hawaii, the Pan-STARRS Project Office, the Max Planck Society and its participating institutes, the Max Planck Institute for Astronomy, Heidelberg, and the Max Planck Institute for Extraterrestrial Physics, Garching, The Johns Hopkins University, Durham University, the University of Edinburgh, the Queens University Belfast, the Harvard-Smithsonian Center for Astrophysics, the Las Cumbres Observatory Global Telescope Network Incorporated, the National Central University of Taiwan, the Space Telescope Science Institute, the National Aeronautics and Space Administration under grant No. NNX08AR22G issued through the Planetary Science Division of the NASA Science Mission Directorate, the National Science Foundation grant No. AST-1238877, the University of Maryland, Eotvos Lorand University (ELTE), the Los Alamos National Laboratory, and the Gordon and Betty Moore Foundation.

This work is supported by the World Premier International Research Center Initiative (WPI Initiative), MEXT, Japan, as well as KAKENHI Grant-in-Aid for Scientific Research (A) (15H02064, 17H01110, and 17H01114) through the Japan Society for the Promotion of Science (JSPS). T.K., K.Y., Y.S., and M. Onodera are supported by JSPS KAKENHI grant Nos. 18J12840, 18K13578, 18J12727, and 17K14257. S.F. acknowledges support from the European Research Council (ERC) Consolidator Grant funding scheme (project ConTExt, grant No. 648179). The Cosmic Dawn Center is funded by the Danish National Research Foundation under grant No. 140.

Footnotes

  • Partly based on data obtained with the Subaru Telescope. The Subaru Telescope is operated by the National Astronomical Observatory of Japan.

  • † 

    The data presented herein were partly obtained at the W. M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California, and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W. M. Keck Foundation.

  • ‡ 

    This paper includes data gathered with the 6.5 m Magellan Telescopes located at Las Campanas Observatory, Chile.

  • 20 
  • 21 

    Magnitudes reaching 95% completeness, which are listed in https://www.sdss.org/dr13/scope/.

  • 22 
  • 23 

    We define a non-EMPG galaxy to have a metallicity 12 + log(O/H) larger than 7.69.

  • 24 
  • 25 

    We remove random numbers beyond σ = 0.1 from the models as we eliminate sources with σ > 0.1 in the source catalogs (Sections 2.2 and 2.3). We continue to generate random numbers until the total number becomes 30.

  • 26 
  • 27 
  • 28 

    As of 2018 September, the cause of the stray light had not yet been identified, according to a support astronomer at W. M. Keck Observatory (2020, private communication). It is reported that the stray-light pattern appears on the blue side of CCD chips when flat and arc frames are taken with a grating tilted toward blue central wavelengths.
