Do the details of the line of sight velocity distribution matter for star formation history modeling? Probably not much.

I decided to try a set of models for one galaxy – NGC 4889 (with MaNGA plateifu 8479-12701), which had the highest overall velocity dispersion of the Coma sample I’ve been discussing in the last several posts. It also has some evidence for multiple kinematic components which isn’t too much of a surprise since it’s one of the central cD galaxies in Coma. The SSP model spectra fed to the SFH models were preconvolved with the element wise means of the LOSVD convolution kernels from the velocity distribution modeling exercise. Again, this is an expedient to avoid what could otherwise be prohibitively computationally expensive. The models I ran were the same as described back in this post — these ignore emission but do model dust attenuation with the usual modified Calzetti attenuation relation.

To get quickly to the results, here are model star formation histories compared to the previous runs that used the full model in its current form. Usually I like these plots of results from all spectra in an IFU, but in this one all 381 spectra met my S/N criterion, so the plot is pretty crowded. You really need to see it live on a 4K monitor to see the details.

NGC 4889 (MaNGA plateifu 8479-12701) Model star formation histories for all spectra, runs with non-parametric LOSVD vs. single gaussian stellar velocity dispersions

Well it’s pretty hard to see but differences in model SFH’s are mostly in the youngest age bins, which are very poorly constrained anyway in these presumably passively evolving galaxies. Here’s a closer look at a single model run that had the largest difference in estimated stellar mass density (more on this right below) of about 0.19 dex:

mgh_comparison_8479-12701
NGC 4889 (MaNGA plateifu 8479-12701) Model mass growth histories for a single spectrum – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion

So, the difference in star formation histories was slower mass build up between about 12-5 Gyr look back times in the second run, which was responsible for the lower current day stellar mass density. How this resulted from the choice of LOSVD is not at all obvious.

Let’s look at a few summary results. First, the model stellar mass surface densities:

sigma_mstar_comparison_8479-12701
NGC 4889 (MaNGA plateifu 8479-12701) Model ΣM* – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion

These fall on an almost exactly one to one relation with a few hands full of outliers. Oddly these are mostly in the higher signal to noise area of the IFU (i.e. near the center).

Results for star formation rate density and specific star formation rate are even more consistent between runs, with essentially no differences larger than the nominal 1 σ error bars.

sigma_sfr_comparison_8479-12701
NGC 4889 (MaNGA plateifu 8479-12701) Model Σsfr – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion
ssfr_comparison_8479-12701
NGC 4889 (MaNGA plateifu 8479-12701) Model SSFR – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion

One problem I encountered was that I had to re-run some models either for technical reasons or because of obvious convergence failures. I suspect there could have been some convergence issues in both sets of runs and am slightly worried that could be the source of the few differences in summary measures seen. Oddly, there were almost no suspicious convergence diagnostics in either set of runs (once the latter were run to satisfactory conclusion), and Stan is quite aggressive about reporting possible convergence issues.

Anyway, modeling kinematics remains an interesting topic to me, but it seems somewhat decoupled from modeling star formation histories. Right now I’m waiting for the final SDSS data release to decide what projects I want to tackle.

I’m going to end with a couple of asides. First, I recognize that all of these error bars are overoptimistic, maybe by a lot. The main reason, I think, is that I treat the flux values as independent which they clearly are not1 this is pretty standard practice however, which effectively results in overestimating the sample size. One possible partial solution is to allow the flux uncertainties to vary from their nominal values by, for example, a factor > 1. This would involve adding as few as one parameter to the models, which is something I’ve actually tried in the past. I may relook at that.

One interesting feature of the previous two graphs is the rather obvious systematic trend with radius of both SFR density and specific star formation rate, as shown more directly below taken from the first set of model runs:

sfr_ssfr_d_8479-12701
NGC 4889 (MaNGA plateifu 8479-12701) ΣSFR and SSFR vs. distance from IFU center

Are these real trends? I don’t know, but I don’t see an obvious reason why they might be spurious features of the models. In normal star forming galaxies I encounter trends with radius in both directions and sometimes no trend at all.

As a final and related aside there was a paper by Sedgwick et al. that showed up on arxiv not long ago that presented estimates of star formation rates of early type galaxies from observations of core collapse supernovae carefully matched to host galaxies with high confidence morphological classifications. To oversimplify their conclusions they found that typically massive ellipticals might have specific star formation rates ∼ 10-11 / yr, which is somewhat higher than usually supposed. As I mentioned in my last post my models will always have some contribution from young stars and I typically get central estimates of SSFR > ∼ 10-11.5 even in galaxies with no hint of emission (as is the case with this Coma sample). This particular galaxy has a total stellar mass within the IFU of ∼ 1011.5 M , so it could be forming stars at a rate of ∼ 1 M / yr.

Well, I think I have one more post to write before the SDSS DR17 release.

A little more on nonparametric line of sight velocity distribution modeling – is regularization needed?

A while back I completed LOSVD models for 3348 spectra in the 33 Coma Cluster galaxies I chose for an initial sample. I want to look briefly at a couple summary statistics from the models. For each sample in the posterior I calculate first and second moments of the velocity distribution, and from those I calculate means and standard deviations of velocity offsets and velocity dispersions. Recall from the earlier posts that wavelengths are in the spectrum rest frame with peculiar velocities estimated by fitting a set of eigenspectra to the data. The first plot below is the mean velocity offset with errorbars ± 1 standard deviation versus the signal to noise in each spectrum. The horizontal line is the overall median of 5.25 km/sec. This is less than 1/10th the width of a wavelength bin and, I think, consistent with the absolute wavelength calibration accuracy claimed by the MaNGA team. From an eyeball analysis there doesn’t appear to be any trend in the mean with S/N, but the dispersion in estimates increases as the data quality gets worse. This seems unlikely to be a real problem. Two outliers cut off from the graph had mean velocity offsets of ≈ 200 km/sec, which is only 3 wavelength bins.

voff_v_snr
Mean velocity offset and standard deviation of mean vs. spectrum S/N for 33 passively evolving Coma Cluster galaxies. Solid horizontal line is median offset.

Possibly more concerning is that the mean velocity dispersion also shows a systematic trend with S/N:

vdisp_v_snr
Mean velocity dispersion and standard deviation of mean vs. spectrum S/N for 33 passively evolving Coma Cluster galaxies.

Of course there is a wide range of velocity dispersions since there’s a range of galaxy masses in this sample, but each galaxy shows a similar trend. This matters because, of course, S/N correlates strongly with distance from the nucleus which is usually the IFU center and this in turn leads to potentially spurious correlations with distance.

MaNGA spectrum S/N vs. distance from IFU center in kpc. — 33 passively evolving Coma Cluster galaxies

For example here are results for the cD galaxy NGC 4889 with MaNGA plateifu 8479-12701. This has the highest overall velocity dispersion in the sample by a fair margin as seen in the plots above. Plotting the estimated velocity dispersion against distance from the IFU center shows an apparent systematic increase1the trendline is just a “loess” smooth fit to mislead the eye and shouldn’t be interpreted as any kind of analysis:

NGC 4889 (MaNGA plateifu 8479-12701) – mean velocity dispersion vs. radius per nonparametric LOSVD fit.

Now this could be a real trend: my recollection is the outskirts of cD galaxies have essentially the velocity dispersion of the clusters they live in, which is taken as strong evidence that they grow by cannibalism. But the IFU only covers the inner part of this galaxy and in any case the confounding effect of the correlation between S/N and modeled velocity dispersion prevents any conclusion. Just for comparison here is the same relation using the single gaussian estimates from the preliminary maximum likelihood fits:

NGC 4889 (MaNGA plateifu 8479-12701) – velocity dispersion vs. radius per maximum likelihood fit.

So, if we’re interested in kinematics or dynamics some regularization with an informative prior seems advisable. What’s not so clear is whether this matters in SFH modeling, which as I’ve said many times is what most interests me. There’s one way to find out…

Arxiv notes: “BAYES-LOSVD: a bayesian framework for … line-of sight velocity distributions”

This paper by J. Falcón-Barroso and M. Martig that showed up on arxiv back in November interested me for a couple reasons. First, this was one of the first research papers I’ve seen to make use of the Stan language or indeed any implementation of Hamiltonian Monte Carlo. What was even more interesting was they were doing exactly something I experimented with a few years ago, namely estimating the elements of convolution kernels for non-parametric description of galaxy stellar kinematics. And in fact their Stan code posted on github has comment lines linking to a thread on discourse.mc-stan.org I initiated where I asked about doing discrete convolution by direct multiplication and summation.

The idea I was pursuing at the time was to use sets of eigenspectra from a principal components decomposition of SSP model spectra as the bases for fitting convolved model spectra, with the expectation being that considerable dimensionality reduction could be achieved by using just the leading PC’s, which would in turn speed up model fitting. In fact this was certainly the case, although at the time I perhaps overestimated the number of components needed 1I mentioned in that thread that by one criterion — now forgotten — I would need 42, which is about an 80% reduction from the 216 spectra in the EMILES subset I use but still more than I expected. and I found the execution time to be disappointingly large. Another thing I had in mind was that I wanted to constrain emission line kinematics as well, and I couldn’t quite see how to do that in the same modeling framework. So, I soon turned away from this approach and instead tried to constrain both kinematics and star formation histories using the full SSP library and what I dubbed a partially parametric approach to kinematic modeling, using a convolution kernel for the stellar component and Gauss-Hermite polynomials for emission lines. This works, more or less, but it’s prohibitively computationally intensive, taking an order of magnitude or more longer to run than the models with preconvolved spectral templates. Consequently I’ve run this model fewer than a handful of times and only on a small number of spectra with complicated kinematics.

Falcón-Barroso and Martig started with exactly the same idea that I was pursuing of radically reducing the dimensionality of input spectral templates through the use of principal components. They added a few refinements to their models. They include additive Legendre polynomials in their spectral fits, and they also “regularize” the convolution kernels with informative priors of various kinds. Adding polynomials is fairly common in the spectrum fitting industry, although I haven’t tried it myself. There is only a short segment in the red that is sometimes poorly fit with the EMILES library, and that would seem to be difficult to correct with low order polynomials. What does affect the continuum over the entire wavelength range and could potentially cause a severe template matching issue is dust reddening, and this would seem to be more suitable for correction with multiplicative polynomials or simply with the modified Calzetti relation that I’ve been using for some time.

I’m also a little ambivalent about regularizing the convolution kernel. I’m mostly interested in the kinematics for the sake of matching the effective SSP model spectral resolution to that of the spectra being modeled for star formation histores, and not so much in the kinematics as such.

I decided to take another look at my old code, which somewhat to my surprise I couldn’t find stored locally. Fortunately I had posted the complete code on that discourse.mc-stan thread, so I just copied it from there and made a few small modifications. I want to keep things simple for now, so I did not add any polynomial or attenuation corrections and I am not trying to model emission lines. I also didn’t incorporate priors to regularize the elements of the convolution kernel. Without an explicit prior there will be an implicit one of a maximally diffuse dirichlet. I happen to have a sample of MaNGA spectra that are well suited for a simple kinematic analysis — 33 Coma cluster galaxies from a sample of Smith, Lucey, and Carter (2012) that were selected to be passively evolving based on weak Hα emission, which virtually guarantees overall weak emission. Passively evolving early type galaxies also tend to be nearly dust free, making it reasonable not to model it.

I’ve already run SFH models for all of these galaxies, so for each spectrum in a galaxy I adjust the wavelengths to rest frame using the system redshift and redshift offsets that were calculated in the previous round. I then regrid the SSP library (as usual the 216 component EMILES subset) templates to the galaxy rest frame and extract the region where there is coverage of both the galaxy spectrum and the templates. I then standardize the spectra (that is subtract the means and divide by the standard deviations) and calculate the principal components using R’s svd() function. I again rescale the eigenvectors to unit variance and select the number I want to use in the models. I initially did this by choosing a threshold for the cumulative sum of the eigenvalues as a fraction of the sum of all. I think instead I should use the square of the eigenvalues since that will make the choice based on the fraction of variance in the selected eigenvectors. With the former criterion a threshold of 0.95 will result in selecting 8 eigenspectra, 0.975 results in 13 (perhaps ± 1). In fact as few as 2 or 3 might be adequate as noted by Falcón-Barroso and Martig. The picture below of three eigenspectra from one model run illustrates why. The first two look remarkably like real galaxy spectra, with the first looking like that of an old passively evolving population, while the second has very prominent Balmer absorption lines and a bluer continuum characteristic of a younger population. After that the eigenspectra become increasingly un-spectrum like, although familiar features are still recognizable but often with sign flips.

three_eigenspectra
First three eigenspectra from a principal components decomposition of an SSP model spectrum library

The number of elements in the convolution kernel is also chosen at run time. With logarithmic wavelength binning each wavelength bin has constant velocity width — for SDSS and MaNGA spectra of 69.1 km/sec. If a typical galaxy has velocity dispersion around 150 km/sec. and we want the kernel to be ≈ ±3 σ wide then we need 13 elements, which I chose as the default. Keep in mind that I’ve already estimated redshift offsets from the system value for each spectrum, so a priori I expected the mean velocity to be near 0. If peculiar velocities have to be estimated the convolution kernel would need to be much larger. Also in giant ellipticals and other fairly rare cases the velocity dispersion could be much larger as we’ll see below.

Lets look at a few examples. I’ve only run models so far for three galaxies with a total of 434 spectra. One of these (SDSS J125643.51+270205.1) appears disky and is a rapid rotator. The most interesting kinematically is NGC 4889, a cD galaxy at the center of the cluster. The third is SDSS J125831.59+274024.5, a fairly typical elliptical and slow rotator. These plots contain 3 panels. The left shows the marginal posterior distribution of the convolution kernels, displayed as “violin” plots. The second and third are histograms of the mean and standard deviation of the velocities. For now I’m just going to show results for the central spectrum in each galaxy. These are fairly representative although the distributions in each velocity bin get more diffuse farther out as the signal/noise of the spectra decrease. I set the kernel size to the default for the two lower mass galaxies. The cD has much higher velocity dispersion of as much as ~450 km/sec within the area of the IFU, so I selected a kernel size of 39 (± 1320 km/sec) for those model runs. The kernel appears to be fairly symmetrical for the rotating galaxy (top panes), while the two ellipticals show some evidence of multiple kinematic components. Whether these are real or statistical noise remains to be seen.

centrallosvd_8931_1902
Distribution of kernel, velocity offset, and velocity dispersion. Central spectrum of MaNGA plateifu 8931-1902
centrallosvd_8933_6101
Distribution of kernel, velocity offset, and velocity dispersion. Central spectrum of MaNGA plateifu 8933-6101
centrallosvd_8479_12701
Distribution of kernel, velocity offset, and velocity dispersion. Central MaNGA spectrum of NGC 4889 (plateifu 8479-12701)

There are a couple surprises: the velocity dispersions summarized as the standard deviations of the weighted velocities are at best weakly correlated and generally larger than the values estimated in the preliminary fits that I perform as part of the SFH modeling process. More concerning perhaps is the mean velocity offset averages around -50 km/sec. This is fairly consistent across all three galaxies examined so far. Although this is less than one wavelength bin it is larger than the estimated uncertainties and therefore needs explanation.

hist_velocity_offset
Distribution of mean velocity offsets from estimated redshifts for 434 spectra in 3 MaNGA galaxies.

Some tasks ahead: Figure out this systematic. Look at regularization of the kernel. Run models for the remaining 30 galaxies in the sample. Look at the effect on SFH models.