Generating Galaxy Clusters Mass Density Maps from Mock Multiview Images via Deep Learning

Galaxy clusters are composed of dark matter, gas and stars. Their dark matter component, which amounts to around 80% of the total mass, cannot be directly observed but can be traced by the distribution of diffuse gas and galaxy members. In this work, we aim to infer the cluster’s projected total mass distribution from mock observational data, i.e. stars, Sunyaev-Zel'dovich (SZ), and X-ray, by training deep learning models. To this end, we have created a multiview image dataset from The Three Hundred simulation that is optimal for training Machine Learning models. We further study deep learning architectures based on the U-Net to account for single-input and multi-input models. We show that the predicted mass distribution agrees well with the true one.


Introduction
Clusters of galaxies are laboratories for studying both cosmology and astrophysics, since they are the largest gravitationally bound objects in the Universe [1]. In particular, cosmological parameters can be estimated by studying the number of galaxy clusters and their evolution in mass and redshift (e.g. [2]), and therefore accurately estimating their total mass is of paramount importance. The main components of a cluster of galaxies are Dark Matter (DM), which amounts to around 80% of its mass, stars, mainly confined in galaxies, and hot gas in the Intra-Cluster Medium (ICM).
The ICM is observed in X-rays through the Bremsstrahlung emission of its electrons by, e.g., eROSITA [3] or by XMM-Newton within the CHEX-MATE project [4]. In addition, the ICM is also targeted at microwave frequencies through the Sunyaev-Zel'dovich (SZ) effect, which is due to inverse Compton scattering of photons from the cosmic microwave background. Different experiments, such as the Planck satellite [5] and the South Pole Telescope (SPT; [6]), are providing insight into the physical properties of the intracluster gas.
On the other hand, large galaxy surveys from optical telescopes, such as the Sloan Digital Sky Survey (SDSS; [7]), are providing important information on the cluster galaxy members. However, the DM component, which amounts to around 80 per cent of the mass, cannot be directly observed. Therefore, the total mass of a galaxy cluster is usually inferred from ICM mass proxies or from the dynamics of its galaxy members. Only with gravitational lensing observations can one estimate its mass density, by applying theoretical models. These lensing observations are very scarce: only around one hundred galaxy clusters have been observed [8], compared with the few thousand that are currently available from ICM surveys.
Nevertheless, cosmological simulations show that all these mass proxies are biased due to the theoretical assumptions made, such as the hypothesis that the intracluster gas is in Hydrostatic Equilibrium (HE); see for example [9,10]. To address this problem, novel Machine Learning (ML) models have recently been used to estimate the masses of galaxy clusters [11][12][13][14][15][16][17][18][19][20]. These models are free from theoretical assumptions and yield "bias-free" results. All of them are based on Deep Learning (for a review see, e.g., [21]), which makes use of convolutional layers in the neural network architecture to extract information from multi-dimensional data, such as images from astronomical instruments, and uses that information to estimate the mass. We have to point out that all these works use supervised training, which relies on big datasets with known total masses of the galaxy clusters in order to train the Deep Learning models. Therefore, one has to resort to cosmological simulations, where a rich variety of galaxy clusters with different dynamical states [22] is available and the underlying true total mass distribution is completely known. Nevertheless, training on simulations does not necessarily imply that the trained model can be directly used for estimating the masses of galaxy clusters in real observations from either ICM tracers or cluster galaxy members. Only recently was this step, directly estimating galaxy cluster masses from observations, first established by de Andres et al. [23].
However, all these previous studies focused on the estimation of the mass inside a certain overdensity radius, referred to as $R_{200}$ or $R_{500}$. In Ferragamo et al. [24], we took another step forward: predicting the total density profile based on the estimated masses at different radii. With more information estimated for the cluster mass distribution, we can acquire more knowledge of the clusters' internal properties, and thus of their formation and evolution. For example, the concentration of a cluster is directly inferred by the ML model from the SZ cluster images [24].
In this work, we aim to move a step further by studying Deep Learning architectures that could infer the 2D total mass density maps from X-ray, SZ and stellar density (galaxy members) observations. As a first approach, we limit our application to idealised cases with simple mock maps (see the next section for more details about the maps). Therefore, we ignore the impact of angular resolution, point source contamination, noise and other instrumental effects that characterise real observations.

Mock maps
We use the results of the hydrodynamical simulations from The Three Hundred Project [25], which correspond to "zoom-in" spherical regions centred on the 324 most massive halos of the gravity-only MultiDark simulation MDPL2 [26], whose cosmology is given by the parameters inferred by the Planck mission [27]. The 324 regions are spheres of radius $15\,h^{-1}$ Mpc that were re-simulated with full baryon physics, with dark matter and gas particle masses of $M_{\rm particle} \sim 10^{8}\,h^{-1}\,M_\odot$. In addition, the simulations have been run with two numerical codes that feature different modelling of star formation, supernovae and black hole feedback: Gadget-X [25] and Gizmo-Simba [28]. Note that the 324 regions not only include the central halos, but also plenty of additional groups and filamentary structures.
All halos and sub-halos within these simulations are identified by the Amiga Halo Finder (AHF) algorithm [29]. For this work, distinct halos are selected at redshift $z \sim 0$ with a mass $M_{200} \geq 10^{13.5}\,h^{-1}\,M_\odot$. Here $M_{200}$ is the mass enclosed within a sphere whose mean density is 200 times the critical density of the Universe at the corresponding redshift. Note that the redshift evolution of the scaling laws up to $z \simeq 1$ is negligible, and thus one should not expect the mass proxies to vary with redshift [30]. The mass distribution of the selected sample is shown in Figure 1. We should remark that we have selected our sample of simulated clusters to follow an almost uniform mass distribution, so that high-mass clusters are not underrepresented in the training dataset. Taking this into consideration, our sample is composed of 5,041 different clusters. We have further considered 29 line-of-sight (l.o.s.) projections, and thus our dataset is composed of 146,189 maps per input view (tSZ, X-ray, star) and per output view (mass density).
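As a quick consistency check on the numbers above, the following sketch (not the authors' code) recomputes the dataset size and converts the lower mass cut into a radius. It assumes the standard spherical-overdensity relation $M_{200} = \frac{4}{3}\pi R_{200}^{3} \times 200\,\rho_{\rm crit}$ and the $z = 0$ critical density $\rho_{\rm crit} \approx 2.775 \times 10^{11}\,h^{2}\,M_\odot\,{\rm Mpc}^{-3}$; the function and constant names are illustrative only.

```python
import math

# Critical density of the Universe at z = 0, in h^2 Msun / Mpc^3
# (standard value, assumed here; not quoted in the text).
RHO_CRIT_0 = 2.775e11

def r200_from_m200(m200, rho_crit=RHO_CRIT_0, delta=200.0):
    """Radius (h^-1 Mpc) of the sphere whose mean density is delta * rho_crit,
    for an enclosed mass m200 given in h^-1 Msun."""
    return (3.0 * m200 / (4.0 * math.pi * delta * rho_crit)) ** (1.0 / 3.0)

# Dataset size quoted in the text: clusters x line-of-sight projections.
n_clusters, n_los = 5041, 29
n_maps = n_clusters * n_los

# Radius corresponding to the lower mass cut of the sample.
r200_min = r200_from_m200(10 ** 13.5)

print(n_maps)     # 146189
print(r200_min)   # in h^-1 Mpc
```

Under these assumptions the least massive selected halos have $R_{200}$ of roughly $0.51\,h^{-1}$ Mpc, and the product of clusters and projections reproduces the quoted 146,189 maps per view.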
Regarding our input mock maps, the tSZ maps are created using the publicly available PYMSZ package, the X-ray maps are the bolometric surface brightness computed using the AtomDB database, and the star maps correspond to the projected stellar mass density along the observer's l.o.s. The output maps are the projected total mass density (DM, gas, stars and black holes). Examples of our input and output ground-truth images can be found in Figure 2. Furthermore, all maps are reduced to a grid in which 80 pixels span $2 \times R_{200}$, with a Gaussian smoothing of FWHM $\simeq 0.01 \times R_{200}$.
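The gridding described above can be sketched as follows. This is a minimal illustration with NumPy, not the pipeline used to build the dataset: `project_mass_map` and `fwhm_to_sigma` are hypothetical helper names, particles are binned with a plain 2D histogram, and the line of sight is assumed to lie along the z axis so that only x and y enter.

```python
import numpy as np

N_PIX = 80  # pixels across 2 x R200, as stated in the text

def project_mass_map(pos_xy, mass, r200, n_pix=N_PIX):
    """Surface density map (mass per unit area) on an n_pix x n_pix grid
    covering [-R200, R200] in both projected coordinates."""
    edges = np.linspace(-r200, r200, n_pix + 1)
    hist, _, _ = np.histogram2d(
        pos_xy[:, 0], pos_xy[:, 1], bins=(edges, edges), weights=mass
    )
    pixel_area = (2.0 * r200 / n_pix) ** 2
    return hist / pixel_area

def fwhm_to_sigma(fwhm):
    """Convert a Gaussian FWHM to its standard deviation."""
    return fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))

# Toy usage with random particles (made-up data, arbitrary units).
rng = np.random.default_rng(0)
r200 = 1.0
pos_xy = rng.uniform(-r200, r200, size=(10_000, 2))
mass = np.ones(len(pos_xy))
sigma_map = project_mass_map(pos_xy, mass, r200)

pixel_size = 2.0 * r200 / N_PIX                      # 0.025 R200 per pixel
sigma_pix = fwhm_to_sigma(0.01 * r200) / pixel_size  # smoothing scale in pixels
```

With these numbers one pixel spans $0.025\,R_{200}$, so a FWHM of $0.01\,R_{200}$ corresponds to a Gaussian standard deviation of about 0.17 pixels if it is measured on the final grid. Whether the smoothing is applied before or after gridding is not stated in the text, so the sketch stops short of applying it.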

Model and training
The model that we have used is based on the U-Net architecture [31], which was developed for the purpose of analysing biomedical images. We have used the particular architecture of [32], which was also designed for biomedical imaging, but we adapted it to the dimensionality of our input dataset. Moreover, that architecture is also modified to account for multiview inference, i.e., it can perform the inference of the mass maps from SZ, X-ray and star data simultaneously. It learns the features that most correlate with mass from each view.
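To make the data flow of such a model concrete, the following toy forward pass traces how one or several $80 \times 80$ views could pass through an encoder, a bottleneck and a decoder with skip connections. It is only a sketch under strong assumptions and not the architecture of [32]: the views are combined by stacking them as input channels (the text does not specify the fusion mechanism), learned $3 \times 3$ convolutions are replaced by $1 \times 1$ channel mixing with random, untrained weights, and all function names are hypothetical.

```python
import numpy as np

def mix(x, w):
    """1x1 'convolution' followed by ReLU: mixes channels pixel by pixel."""
    return np.maximum(np.einsum("oc,chw->ohw", w, x), 0.0)

def down(x):
    """2x2 average pooling: halves the height and width."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def up(x):
    """Nearest-neighbour upsampling: doubles the height and width."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def random_weights(rng, n_out, n_in):
    """Untrained placeholder weights (a real model would learn these)."""
    return rng.normal(size=(n_out, n_in)) / np.sqrt(n_in)

def toy_unet(views, rng, width=8, depth=2):
    """Forward pass of a toy U-Net on a list of (H, W) input views."""
    x = np.stack(views)                      # (n_views, H, W): views as channels
    skips = []
    for level in range(depth):               # encoder
        x = mix(x, random_weights(rng, width * 2**level, x.shape[0]))
        skips.append(x)                      # kept for the skip connection
        x = down(x)
    x = mix(x, random_weights(rng, width * 2**depth, x.shape[0]))  # bottleneck
    for level in reversed(range(depth)):     # decoder with skip connections
        x = np.concatenate([up(x), skips[level]], axis=0)
        x = mix(x, random_weights(rng, width * 2**level, x.shape[0]))
    w_out = random_weights(rng, 1, x.shape[0])
    return np.einsum("oc,chw->ohw", w_out, x)[0]   # single-channel output map

rng = np.random.default_rng(0)
sz, xray, star = (rng.random((80, 80)) for _ in range(3))
mass_multiview = toy_unet([sz, xray, star], rng)   # multi-input model
mass_sz_only = toy_unet([sz], rng)                 # single-input model
```

In this sketch the single-input and multi-input variants differ only in the number of input channels, and the output keeps the $80 \times 80$ shape of the inputs because each pooling step is undone by an upsampling step.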

Results
We show examples in Figure 3 of our predicted maps from four different U-Nets, trained with SZ only, with X-ray only, with stars only, and with all views together in a multiview approach. Note that the ground-truth map is the last map in Figure 2. As a general result, the mass density maps predicted from SZ and from X-ray are smoother than those predicted from the stars and from the multiview model. This can be explained by Figure 2: the SZ and X-ray maps do not contain information on the mass distribution at small scales, and therefore the models trained only on those maps do not predict those small structures. Conversely, star maps contain the galaxy members, which better trace the positions of the total mass density peaks. To further study this, we have computed the residuals, defined as

$$\text{residuals} = \text{predicted density map} - \text{ground-truth density map} . \tag{1}$$

An example of the residuals can be seen in Figure 4. In this figure, the blue colour highlights that there are more under-predicted substructures in the case of SZ and X-ray input maps. In our example, for stars, there is a lack of signal in the central region, and the residuals do not show the substructures that are present in the SZ and X-ray residual maps. The advantage of the multichannel U-Net model is that its predictions are more accurate, because the model can locate the substructures from the star maps while using the information from the SZ and X-ray maps to improve the inference of the mass density maps.
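The residual definition used in this section is a pixel-wise difference, which can be sketched as below. The arrays are made-up toy values rather than results from the paper, and `residual_map` is a hypothetical helper name.

```python
import numpy as np

def residual_map(predicted, truth):
    """Residuals as defined in the text: predicted minus ground truth."""
    return np.asarray(predicted, dtype=float) - np.asarray(truth, dtype=float)

# Toy 2x2 "maps" (illustrative values only).
truth = np.array([[1.0, 2.0], [3.0, 4.0]])
predicted = np.array([[1.5, 2.0], [2.0, 4.0]])

res = residual_map(predicted, truth)
under_predicted = res < 0   # pixels where the model predicts too little mass
over_predicted = res > 0    # pixels where the model predicts too much mass
```

With this sign convention, negative residuals mark under-predicted mass, such as a substructure that the model fails to recover.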

Figure 1. Number of galaxy clusters as a function of mass for our selected dataset at $z \sim 0$ from The Three Hundred.

Figure 2. Examples of our input mock maps (SZ, X-ray and star) together with the output mass density. The size of all maps is $2 \times R_{200}$.

Figure 3. Example of our predicted maps varying the inputs of our U-Net model: SZ, X-ray, star and multiview.

Figure 4. Residuals (see Equation 1) of our predicted maps varying the inputs of our U-Net model: SZ, X-ray, star and multiview.