bayesian modeling – Page 2

2021-08-102021-08-10

KUG 0859+406 – unravelling the differences between 2018 and 2021 model runs

I did have my old data and model runs of course, in fact they were spread over several directories on two machines. I’m going to refer to it by this catalog designation, KUG standing for the “Kiso survey of Ultraviolet-excess Galaxies.” It’s also a low power radio source with catalog entries in both FIRST and NVSS, and of course it’s in MaNGA with plateifu ID 8440-6104 (mangaid 01-216976).

In my 2018 model runs, which were interesting enough to write 3 posts about, I found this galaxy had undergone an extraordinarily large burst of star formation that began ~1 Gyr ago with locally as much as 60% of the present day stellar mass born in the burst and something like 40% of the mass over the footprint of the IFU. In this years model runs the peak burst fraction was a considerably more modest ~25% and globally barely amounted to a slight enhancement of star formation. The starburst was also much more localized than in the earlier runs:

burst_strengths_compared — Fractional stellar mass in stars between 0.1 and 1.75 Gyr old in 2018 and 2021 model runs

So what happened? First, here is a comparison of modeled star formation histories for the innermost fiber, which got the largest injection of mass in the starburst.

central_sfh_compared — Model star formation histories for central fiber of MaNGA plateifu 8440-6104, 2018 and 2021 model runs

The obvious remark is the double peaked starburst noted back in 2018 (and discussed at some length) has been replaced with a single narrow peak with a slow ramp up and fast decay. The peak SFR is a little larger than before but the total mass in the burst is lower.

I’ve made several changes in model formulation since 2018, of which the most important in the current context is adopting the more flexible “modified Calzetti” attenuation relation that adds an additional slope parameter to the prescription. In the current year model runs a steeper than Calzetti relation is favored throughout the IFU footprint, particularly in the central region where the starburst was strongest:

map_delta — Map of modified Cal;zetti slope parameter δ — MaNGA plateifu 8440-6104

A smaller optical depth of attenuation is also favored throughout:

tauv_compared — Modeled optical depth of attenuation – 2021 runs vs. 2018 MaNGA plateifu 8440-6104

This has a couple predictable consequences. Steeper attenuation will favor an intrinsically bluer, hence younger population while a lower optical depth requires less light, and hence mass in the stellar population. I can test this directly by returning to a model with Calzetti attenuation, and here is the result for the central fiber (this model run is labeled 2021 (c) in the legend below):

mgh_compared — Mass growth histories – 2021 run 2021 run with Calzetti attenuation 2018 run Central fiber of MaNGA plateifu 8440-6104

So, an eyeball analysis suggests about 3/4 of the difference between the 2018 and 2021 runs is due to the modification to the attenuation relation. The other changes I’ve made to the models are to change the stellar contribution parameters from a non-negative vector to a simplex, and at the same time changing the way I rescale the data. In early runs the SSP model fluxes were scaled to make the maximum stellar contribution ≈ 1, while the current models scale both the galaxy and SSP fluxes to ≈ 1 in the neighborhood of V, making the individual stellar contributions approximately the fraction of light contributed. An additional scale factor parameter in the model is used to adjust the overall fit. Assuming I did this right this should have no effect on a deterministic maximum likelihood solution, but with MCMC who knows?

Although the fit to the data looks about the same between the model with and without the attenuation modification the summed log-likelihood is consistently about 1% higher for the modified Calzetti model with no overlap at all in the distribution of likelihood. This suggests the case for a steeper than Calzetti attenuation is a fairly robust result.

ppfits — “Posterior predictive” fits to galaxy flux data – modified Calzetti attenuation vs. Calzetti – central fiber of MaNGA plateifu 8440-6104

The galaxy flux data also changed a little bit. The early runs were on the DR14 release (version 2_1_2 of the MaNGA DRP) while the recent ones used the DR15 release (ver 2_4_3). Most of the calibration differences resemble random noise, but there is some curvature that systematically affects both the red and blue ends of the spectrum and could cause some change in the temperature distribution of the models:

measured_flux_compared — Difference in measured flux from DR14 to DR15 – central fiber of MaNGA plateifu 8440-6104

While the detailed star formation histories changed, quantities that aren’t too model dependent didn’t very much. One example is shown below. Also, the kinematic maps agree with the earlier ones in detail.

halpha_luminosity_compared — Hα luminosity density – 2021 runs vs. 2018 MaNGA plateifu 8440-6104

velocity_field_8440_6104 — Velocity field of MaNGA plateifu 8440-6104 from 2021 model runs. Map interpolated from RSS file spectra.

One input that hasn’t changed are the emiles SSP model spectra, although there have been some procedural changes in how I handle the modeling. Early on I often used a much smaller subset of SSP models with just 27 time bins and 3 metallicities for preliminary modeling, including my first models on the same binning of these data. I also routinely ran 250 warmup iterations with just 250 more per chain. My current standard practice is always to use the largest emiles subset with 216 SSP models in 54 time bins and 4 metallicities, and I generally run 750 post-warmup iterations per chain but still with only 250 warmup iterations. This is generally enough and if adaptation fails it is usually fairly obvious. The small sample size of the earlier runs mostly effects the precision of inferences rather than means.

To conclude for now, my speculation about whether it might be possible to say something about the timing of critical events in a merger from the model star formation history was too optimistic. On a positive note though both sets of model runs retrodict that coalescence occurred at a lookback time around 500Myr ago, which is consistent with the fact that tidal tails and other merger signatures are clearly visible even in SDSS imaging. Both sets of model runs also have that odd uptick in star formation at 30Myr in the central fiber. And while the difference in burst mass contributions is a little disconcerting the current runs are more consistent with the likely gas content of ordinary spiral galaxies.

This example illustrates another well known “degeneracy” among attenuation (and adopted attenuation relation), mass, and stellar age. Whether I’ve broken the degeneracy by adopting the more flexible attenuation prescription described some time ago remains to be validated.

2021-07-142021-07-14

Do the details of the line of sight velocity distribution matter for star formation history modeling? Probably not much.

I decided to try a set of models for one galaxy – NGC 4889 (with MaNGA plateifu 8479-12701), which had the highest overall velocity dispersion of the Coma sample I’ve been discussing in the last several posts. It also has some evidence for multiple kinematic components which isn’t too much of a surprise since it’s one of the central cD galaxies in Coma. The SSP model spectra fed to the SFH models were preconvolved with the element wise means of the LOSVD convolution kernels from the velocity distribution modeling exercise. Again, this is an expedient to avoid what could otherwise be prohibitively computationally expensive. The models I ran were the same as described back in this post — these ignore emission but do model dust attenuation with the usual modified Calzetti attenuation relation.

To get quickly to the results, here are model star formation histories compared to the previous runs that used the full model in its current form. Usually I like these plots of results from all spectra in an IFU, but in this one all 381 spectra met my S/N criterion, so the plot is pretty crowded. You really need to see it live on a 4K monitor to see the details.

NGC 4889 (MaNGA plateifu 8479-12701) Model star formation histories for all spectra, runs with non-parametric LOSVD vs. single gaussian stellar velocity dispersions

Well it’s pretty hard to see but differences in model SFH’s are mostly in the youngest age bins, which are very poorly constrained anyway in these presumably passively evolving galaxies. Here’s a closer look at a single model run that had the largest difference in estimated stellar mass density (more on this right below) of about 0.19 dex:

mgh_comparison_8479-12701 — NGC 4889 (MaNGA plateifu 8479-12701) Model mass growth histories for a single spectrum – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion

So, the difference in star formation histories was slower mass build up between about 12-5 Gyr look back times in the second run, which was responsible for the lower current day stellar mass density. How this resulted from the choice of LOSVD is not at all obvious.

Let’s look at a few summary results. First, the model stellar mass surface densities:

sigma_mstar_comparison_8479-12701 — NGC 4889 (MaNGA plateifu 8479-12701) Model Σ_M* – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion

These fall on an almost exactly one to one relation with a few hands full of outliers. Oddly these are mostly in the higher signal to noise area of the IFU (i.e. near the center).

Results for star formation rate density and specific star formation rate are even more consistent between runs, with essentially no differences larger than the nominal 1 σ error bars.

sigma_sfr_comparison_8479-12701 — NGC 4889 (MaNGA plateifu 8479-12701) Model Σ_sfr – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion

ssfr_comparison_8479-12701 — NGC 4889 (MaNGA plateifu 8479-12701) Model SSFR – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion

One problem I encountered was that I had to re-run some models either for technical reasons or because of obvious convergence failures. I suspect there could have been some convergence issues in both sets of runs and am slightly worried that could be the source of the few differences in summary measures seen. Oddly, there were almost no suspicious convergence diagnostics in either set of runs (once the latter were run to satisfactory conclusion), and Stan is quite aggressive about reporting possible convergence issues.

Anyway, modeling kinematics remains an interesting topic to me, but it seems somewhat decoupled from modeling star formation histories. Right now I’m waiting for the final SDSS data release to decide what projects I want to tackle.

I’m going to end with a couple of asides. First, I recognize that all of these error bars are overoptimistic, maybe by a lot. The main reason, I think, is that I treat the flux values as independent which they clearly are not¹, which effectively results in overestimating the sample size. One possible partial solution is to allow the flux uncertainties to vary from their nominal values by, for example, a factor > 1. This would involve adding as few as one parameter to the models, which is something I’ve actually tried in the past. I may relook at that.

One interesting feature of the previous two graphs is the rather obvious systematic trend with radius of both SFR density and specific star formation rate, as shown more directly below taken from the first set of model runs:

sfr_ssfr_d_8479-12701 — NGC 4889 (MaNGA plateifu 8479-12701) Σ_SFR and SSFR vs. distance from IFU center

Are these real trends? I don’t know, but I don’t see an obvious reason why they might be spurious features of the models. In normal star forming galaxies I encounter trends with radius in both directions and sometimes no trend at all.

As a final and related aside there was a paper by Sedgwick et al. that showed up on arxiv not long ago that presented estimates of star formation rates of early type galaxies from observations of core collapse supernovae carefully matched to host galaxies with high confidence morphological classifications. To oversimplify their conclusions they found that typically massive ellipticals might have specific star formation rates ∼ 10^-11 / yr, which is somewhat higher than usually supposed. As I mentioned in my last post my models will always have some contribution from young stars and I typically get central estimates of SSFR > ∼ 10^-11.5 even in galaxies with no hint of emission (as is the case with this Coma sample). This particular galaxy has a total stellar mass within the IFU of ∼ 10^11.5 M_⊙ , so it could be forming stars at a rate of ∼ 1 M_⊙ / yr.

Well, I think I have one more post to write before the SDSS DR17 release.

2021-05-27

A little more on nonparametric line of sight velocity distribution modeling – is regularization needed?

A while back I completed LOSVD models for 3348 spectra in the 33 Coma Cluster galaxies I chose for an initial sample. I want to look briefly at a couple summary statistics from the models. For each sample in the posterior I calculate first and second moments of the velocity distribution, and from those I calculate means and standard deviations of velocity offsets and velocity dispersions. Recall from the earlier posts that wavelengths are in the spectrum rest frame with peculiar velocities estimated by fitting a set of eigenspectra to the data. The first plot below is the mean velocity offset with errorbars ± 1 standard deviation versus the signal to noise in each spectrum. The horizontal line is the overall median of 5.25 km/sec. This is less than 1/10th the width of a wavelength bin and, I think, consistent with the absolute wavelength calibration accuracy claimed by the MaNGA team. From an eyeball analysis there doesn’t appear to be any trend in the mean with S/N, but the dispersion in estimates increases as the data quality gets worse. This seems unlikely to be a real problem. Two outliers cut off from the graph had mean velocity offsets of ≈ 200 km/sec, which is only 3 wavelength bins.

voff_v_snr — Mean velocity offset and standard deviation of mean vs. spectrum S/N for 33 passively evolving Coma Cluster galaxies. Solid horizontal line is median offset.

Possibly more concerning is that the mean velocity dispersion also shows a systematic trend with S/N:

vdisp_v_snr — Mean velocity dispersion and standard deviation of mean vs. spectrum S/N for 33 passively evolving Coma Cluster galaxies.

Of course there is a wide range of velocity dispersions since there’s a range of galaxy masses in this sample, but each galaxy shows a similar trend. This matters because, of course, S/N correlates strongly with distance from the nucleus which is usually the IFU center and this in turn leads to potentially spurious correlations with distance.

MaNGA spectrum S/N vs. distance from IFU center in kpc. — 33 passively evolving Coma Cluster galaxies

For example here are results for the cD galaxy NGC 4889 with MaNGA plateifu 8479-12701. This has the highest overall velocity dispersion in the sample by a fair margin as seen in the plots above. Plotting the estimated velocity dispersion against distance from the IFU center shows an apparent systematic increase¹:

NGC 4889 (MaNGA plateifu 8479-12701) – mean velocity dispersion vs. radius per nonparametric LOSVD fit.

Now this could be a real trend: my recollection is the outskirts of cD galaxies have essentially the velocity dispersion of the clusters they live in, which is taken as strong evidence that they grow by cannibalism. But the IFU only covers the inner part of this galaxy and in any case the confounding effect of the correlation between S/N and modeled velocity dispersion prevents any conclusion. Just for comparison here is the same relation using the single gaussian estimates from the preliminary maximum likelihood fits:

NGC 4889 (MaNGA plateifu 8479-12701) – velocity dispersion vs. radius per maximum likelihood fit.

So, if we’re interested in kinematics or dynamics some regularization with an informative prior seems advisable. What’s not so clear is whether this matters in SFH modeling, which as I’ve said many times is what most interests me. There’s one way to find out…

2021-01-182021-01-18

Arxiv notes: “BAYES-LOSVD: a bayesian framework for … line-of sight velocity distributions”

This paper by J. Falcón-Barroso and M. Martig that showed up on arxiv back in November interested me for a couple reasons. First, this was one of the first research papers I’ve seen to make use of the Stan language or indeed any implementation of Hamiltonian Monte Carlo. What was even more interesting was they were doing exactly something I experimented with a few years ago, namely estimating the elements of convolution kernels for non-parametric description of galaxy stellar kinematics. And in fact their Stan code posted on github has comment lines linking to a thread on discourse.mc-stan.org I initiated where I asked about doing discrete convolution by direct multiplication and summation.

The idea I was pursuing at the time was to use sets of eigenspectra from a principal components decomposition of SSP model spectra as the bases for fitting convolved model spectra, with the expectation being that considerable dimensionality reduction could be achieved by using just the leading PC’s, which would in turn speed up model fitting. In fact this was certainly the case, although at the time I perhaps overestimated the number of components needed ¹ and I found the execution time to be disappointingly large. Another thing I had in mind was that I wanted to constrain emission line kinematics as well, and I couldn’t quite see how to do that in the same modeling framework. So, I soon turned away from this approach and instead tried to constrain both kinematics and star formation histories using the full SSP library and what I dubbed a partially parametric approach to kinematic modeling, using a convolution kernel for the stellar component and Gauss-Hermite polynomials for emission lines. This works, more or less, but it’s prohibitively computationally intensive, taking an order of magnitude or more longer to run than the models with preconvolved spectral templates. Consequently I’ve run this model fewer than a handful of times and only on a small number of spectra with complicated kinematics.

Falcón-Barroso and Martig started with exactly the same idea that I was pursuing of radically reducing the dimensionality of input spectral templates through the use of principal components. They added a few refinements to their models. They include additive Legendre polynomials in their spectral fits, and they also “regularize” the convolution kernels with informative priors of various kinds. Adding polynomials is fairly common in the spectrum fitting industry, although I haven’t tried it myself. There is only a short segment in the red that is sometimes poorly fit with the EMILES library, and that would seem to be difficult to correct with low order polynomials. What does affect the continuum over the entire wavelength range and could potentially cause a severe template matching issue is dust reddening, and this would seem to be more suitable for correction with multiplicative polynomials or simply with the modified Calzetti relation that I’ve been using for some time.

I’m also a little ambivalent about regularizing the convolution kernel. I’m mostly interested in the kinematics for the sake of matching the effective SSP model spectral resolution to that of the spectra being modeled for star formation histores, and not so much in the kinematics as such.

I decided to take another look at my old code, which somewhat to my surprise I couldn’t find stored locally. Fortunately I had posted the complete code on that discourse.mc-stan thread, so I just copied it from there and made a few small modifications. I want to keep things simple for now, so I did not add any polynomial or attenuation corrections and I am not trying to model emission lines. I also didn’t incorporate priors to regularize the elements of the convolution kernel. Without an explicit prior there will be an implicit one of a maximally diffuse dirichlet. I happen to have a sample of MaNGA spectra that are well suited for a simple kinematic analysis — 33 Coma cluster galaxies from a sample of Smith, Lucey, and Carter (2012) that were selected to be passively evolving based on weak Hα emission, which virtually guarantees overall weak emission. Passively evolving early type galaxies also tend to be nearly dust free, making it reasonable not to model it.

I’ve already run SFH models for all of these galaxies, so for each spectrum in a galaxy I adjust the wavelengths to rest frame using the system redshift and redshift offsets that were calculated in the previous round. I then regrid the SSP library (as usual the 216 component EMILES subset) templates to the galaxy rest frame and extract the region where there is coverage of both the galaxy spectrum and the templates. I then standardize the spectra (that is subtract the means and divide by the standard deviations) and calculate the principal components using R’s svd() function. I again rescale the eigenvectors to unit variance and select the number I want to use in the models. I initially did this by choosing a threshold for the cumulative sum of the eigenvalues as a fraction of the sum of all. I think instead I should use the square of the eigenvalues since that will make the choice based on the fraction of variance in the selected eigenvectors. With the former criterion a threshold of 0.95 will result in selecting 8 eigenspectra, 0.975 results in 13 (perhaps ± 1). In fact as few as 2 or 3 might be adequate as noted by Falcón-Barroso and Martig. The picture below of three eigenspectra from one model run illustrates why. The first two look remarkably like real galaxy spectra, with the first looking like that of an old passively evolving population, while the second has very prominent Balmer absorption lines and a bluer continuum characteristic of a younger population. After that the eigenspectra become increasingly un-spectrum like, although familiar features are still recognizable but often with sign flips.

three_eigenspectra — First three eigenspectra from a principal components decomposition of an SSP model spectrum library

The number of elements in the convolution kernel is also chosen at run time. With logarithmic wavelength binning each wavelength bin has constant velocity width — for SDSS and MaNGA spectra of 69.1 km/sec. If a typical galaxy has velocity dispersion around 150 km/sec. and we want the kernel to be ≈ ±3 σ wide then we need 13 elements, which I chose as the default. Keep in mind that I’ve already estimated redshift offsets from the system value for each spectrum, so a priori I expected the mean velocity to be near 0. If peculiar velocities have to be estimated the convolution kernel would need to be much larger. Also in giant ellipticals and other fairly rare cases the velocity dispersion could be much larger as we’ll see below.

Lets look at a few examples. I’ve only run models so far for three galaxies with a total of 434 spectra. One of these (SDSS J125643.51+270205.1) appears disky and is a rapid rotator. The most interesting kinematically is NGC 4889, a cD galaxy at the center of the cluster. The third is SDSS J125831.59+274024.5, a fairly typical elliptical and slow rotator. These plots contain 3 panels. The left shows the marginal posterior distribution of the convolution kernels, displayed as “violin” plots. The second and third are histograms of the mean and standard deviation of the velocities. For now I’m just going to show results for the central spectrum in each galaxy. These are fairly representative although the distributions in each velocity bin get more diffuse farther out as the signal/noise of the spectra decrease. I set the kernel size to the default for the two lower mass galaxies. The cD has much higher velocity dispersion of as much as ~450 km/sec within the area of the IFU, so I selected a kernel size of 39 (± 1320 km/sec) for those model runs. The kernel appears to be fairly symmetrical for the rotating galaxy (top panes), while the two ellipticals show some evidence of multiple kinematic components. Whether these are real or statistical noise remains to be seen.

centrallosvd_8931_1902 — Distribution of kernel, velocity offset, and velocity dispersion. Central spectrum of MaNGA plateifu 8931-1902

centrallosvd_8933_6101 — Distribution of kernel, velocity offset, and velocity dispersion. Central spectrum of MaNGA plateifu 8933-6101

centrallosvd_8479_12701 — Distribution of kernel, velocity offset, and velocity dispersion. Central MaNGA spectrum of NGC 4889 (plateifu 8479-12701)

There are a couple surprises: the velocity dispersions summarized as the standard deviations of the weighted velocities are at best weakly correlated and generally larger than the values estimated in the preliminary fits that I perform as part of the SFH modeling process. More concerning perhaps is the mean velocity offset averages around -50 km/sec. This is fairly consistent across all three galaxies examined so far. Although this is less than one wavelength bin it is larger than the estimated uncertainties and therefore needs explanation.

hist_velocity_offset — Distribution of mean velocity offsets from estimated redshifts for 434 spectra in 3 MaNGA galaxies.

Some tasks ahead: Figure out this systematic. Look at regularization of the kernel. Run models for the remaining 30 galaxies in the sample. Look at the effect on SFH models.