Figure 1 | A neural network is trained to discover the underlying degrees of freedom in halo density profiles within a low-dimensional latent representation, when presented with the full 3D density structure of a halo at the present-day time (z=0). We physically interpret the discovered representation in terms of the halo’s evolution history by measuring the mutual information (MI) between the latent parameters and the assembly history of the halos.
Figure 2 | The MI between the latent parameters and the halo mass as a function of time (top row), and that between the latent parameters and the rate of change in mass as a function of time (bottom row). The inner shape latent and the NFW concentration carry memory of the early-time mass assembly history, as well as the later-time mass accretion rate. The outer shape latent carries information about the halos' most recent mass accretion rate over the past dynamical time (indicated by the arrow).
Can machine learning make new discoveries in astrophysics? An ‘explainable’ neural network is employed to gain insight into the origin of dark matter halo density profiles. The network discovers that the shape of the profile in the halo outskirts is described by a single parameter related to the most recent accretion of mass. It does so without being given any prior knowledge of the halo’s evolution history during training.
Artificial intelligence (AI) has rapidly emerged as a powerful tool in astrophysics and cosmology. Typical uses of machine learning in cosmology include emulating the output of computationally expensive cosmological simulations, or accelerating the estimation of cosmological parameters from data. These approaches effectively treat machine learning models as “black boxes”: humans cannot understand the inner workings of these complex deep learning algorithms, which often involve millions of parameters. However, scientists can only trust AI tools in scientific applications if they understand how the machine learning models reach their predictions.
MPA research fellow Luisa Lucie-Smith has focused on developing explainable machine learning frameworks for cosmological structure formation. In these frameworks, the machine learning results can be interpreted and explained in terms of the physics they represent. She and her international colleagues designed a neural network, denoted an interpretable variational encoder (IVE), that generates a low-dimensional, compressed ‘latent’ representation of the input data. This latent representation captures all the relevant information about the final output of interest, and can be physically interpreted using the information-theoretic metric of mutual information (MI).
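To make the role of MI concrete, here is a minimal sketch of a histogram-based MI estimate between one latent parameter and one physical quantity. It is not the estimator used by the team: the function name `mutual_information`, the `bins` parameter, and the toy Gaussian data are all assumptions chosen for illustration.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in (histogram) estimate of the MI between two 1D samples, in nats.

    Illustrative only: this simple estimator is biased for small samples
    and is an assumption of this sketch, not the method of the study.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()                 # joint probability table
    p_x = p_xy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    mask = p_xy > 0                            # skip empty cells (0 * log 0 = 0)
    ratio = p_xy[mask] / (p_x @ p_y)[mask]
    return float(np.sum(p_xy[mask] * np.log(ratio)))

# Toy data (hypothetical): a "latent" and two candidate physical quantities.
rng = np.random.default_rng(0)
latent = rng.normal(size=20000)
related = latent + 0.3 * rng.normal(size=20000)   # depends on the latent
unrelated = rng.normal(size=20000)                # independent of the latent

mi_related = mutual_information(latent, related)
mi_unrelated = mutual_information(latent, unrelated)
print(f"MI(latent, related)   = {mi_related:.3f} nats")
print(f"MI(latent, unrelated) = {mi_unrelated:.3f} nats")
```

A quantity that depends on the latent yields a clearly larger MI than an independent one, which is the sense in which MI indicates what a latent parameter "knows" about a physical property. Unlike a linear correlation coefficient, MI is also sensitive to non-linear dependence.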
The team first applied the IVE method to discover the building blocks of the density profiles of dark matter halos. Halo density profiles are not only key ingredients of the galaxy-halo connection in cosmological analyses and of direct and indirect dark matter searches; they are also powerful observational testbeds of fundamental physics. This is because their shape, from the inner core to the outskirts, is sensitive to the nature of dark matter and modifications to gravity. However, on the theoretical side, models of halo density profiles still rely solely on empirically found fitting functions. Observationally, it has recently become possible to measure weak lensing and 3D density profiles through a combination of multi-wavelength data; our ability to make use of these measurements requires a more complete understanding of the physical effects that control the shape of the density profiles and their origin.
Given the 3D density structure of a dark matter halo at the present day (z=0), the IVE discovered that a three-dimensional latent space is necessary and sufficient to describe the density profiles of halos out to their outskirts, beyond the radial range of validity of traditional fitting functions such as the Navarro-Frenk-White (NFW) profile (Fig. 1). The latent space is disentangled, meaning that each latent parameter captures an independent factor of variation in the halo density profile. Two of the latent parameters, a normalization and an inner shape parameter, resemble the two parameters of the NFW profile; the third describes the shape of the profile in the halo outskirts. The team then exploited the latent space beyond its original training task, to connect the evolutionary history of dark matter halos with their density profiles. Without any prior knowledge of the halos' evolution being provided during training, the network recovered the known relation between early formation time and the shape of the inner profile. It additionally discovered that the outer profile, which can be described by a single degree of freedom, is sensitive to the halo's most recent mass accretion rate (Fig. 2).
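For reference, the two-parameter NFW fitting function mentioned above can be written down in a few lines. The sketch below uses the standard form of the profile, with a characteristic density and a scale radius; the function names `nfw_density` and `nfw_log_slope` and the unit choices are assumptions of this example.

```python
import numpy as np

def nfw_density(r, rho_s, r_s):
    """Standard NFW profile: rho(r) = rho_s / [(r/r_s) * (1 + r/r_s)^2].

    rho_s (characteristic density) sets the normalization and
    r_s (scale radius) sets the shape; units are left to the caller.
    """
    x = np.asarray(r, dtype=float) / r_s
    return rho_s / (x * (1.0 + x) ** 2)

def nfw_log_slope(r, r_s):
    """Analytic logarithmic slope d ln(rho) / d ln(r) = -(1 + 3x) / (1 + x)."""
    x = np.asarray(r, dtype=float) / r_s
    return -(1.0 + 3.0 * x) / (1.0 + x)

# The slope runs from -1 in the inner halo to -3 at large radii,
# passing through -2 at the scale radius r = r_s.
radii = np.array([0.01, 1.0, 100.0])   # in units of r_s
print(nfw_log_slope(radii, r_s=1.0))
```

Because the outer slope is fixed at -3, the NFW form has no freedom to follow halo-to-halo variation in the outskirts, which is the regime the third latent parameter describes.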
The results of this study represent progress towards new machine-assisted scientific discoveries, going beyond the artificial rediscovery of known physical laws presented so far in the literature. To this end, the IVE approach compresses the information within a dataset into a minimal set of ingredients that disentangle the independent factors of variation in the output (interpretability) and can be explained in terms of the physics they represent through MI (explainability).
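One common way to encourage this kind of compression and disentanglement in variational models is to add a weighted Kullback-Leibler penalty on the latent distribution to the prediction loss. The form below is a generic sketch under that assumption; the text here does not state the exact IVE objective, and the symbols $\mathcal{L}_{\rm pred}$, $\beta$, $\mu_i$ and $\sigma_i$ are introduced only for illustration.

```latex
% Assumed generic objective (beta-VAE style), not confirmed as the IVE loss:
\mathcal{L} = \mathcal{L}_{\rm pred} + \beta \, D_{\rm KL}\!\left[ q(\mathbf{z} \mid \mathbf{x}) \,\|\, p(\mathbf{z}) \right]

% For a diagonal Gaussian q with means \mu_i and variances \sigma_i^2,
% and a standard normal prior p(\mathbf{z}), the KL term is
D_{\rm KL} = \frac{1}{2} \sum_{i=1}^{d} \left( \mu_i^2 + \sigma_i^2 - \ln \sigma_i^2 - 1 \right)
```

In such a setup, a latent dimension that carries no useful information collapses onto the prior and contributes nothing to the sum. This gives one way to read off how many latent parameters the data require, here $d = 3$ according to the results above.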
Author:
Luisa Lucie-Smith
Postdoc
2215
luisals@mpa-garching.mpg.de
Original publication
1. Lucie-Smith L.; Peiris H.V.; Pontzen A.
Explaining dark matter halo density profiles with neural networks
Physical Review Letters 132, 031001, January 2024.
2. Lucie-Smith L.; Peiris H.V.; Pontzen A.; Nord B.; Thiyagalingam J.; Piras D.
Discovering the building blocks of dark matter halo density profiles with neural networks
Physical Review D, Volume 105, Issue 10, May 2022.