KUG 0859+406 – unravelling the differences between 2018 and 2021 model runs

I did have my old data and model runs of course, in fact they were spread over several directories on two machines. I’m going to refer to it by this catalog designation, KUG standing for the “Kiso survey of Ultraviolet-excess Galaxies.” It’s also a low power radio source with catalog entries in both FIRST and NVSS, and of course it’s in MaNGA with plateifu ID 8440-6104 (mangaid 01-216976).

In my 2018 model runs, which were interesting enough to write 3 posts about, I found this galaxy had undergone an extraordinarily large burst of star formation that began ~1 Gyr ago with locally as much as 60% of the present day stellar mass born in the burst and something like 40% of the mass over the footprint of the IFU. In this years model runs the peak burst fraction was a considerably more modest ~25% and globally barely amounted to a slight enhancement of star formation. The starburst was also much more localized than in the earlier runs:

burst_strengths_compared
Fractional stellar mass in stars between 0.1 and 1.75 Gyr old in 2018 and 2021 model runs

So what happened? First, here is a comparison of modeled star formation histories for the innermost fiber, which got the largest injection of mass in the starburst.

central_sfh_compared
Model star formation histories for central fiber of MaNGA plateifu 8440-6104, 2018 and 2021 model runs

The obvious remark is the double peaked starburst noted back in 2018 (and discussed at some length) has been replaced with a single narrow peak with a slow ramp up and fast decay. The peak SFR is a little larger than before but the total mass in the burst is lower.

I’ve made several changes in model formulation since 2018, of which the most important in the current context is adopting the more flexible “modified Calzetti” attenuation relation that adds an additional slope parameter to the prescription. In the current year model runs a steeper than Calzetti relation is favored throughout the IFU footprint, particularly in the central region where the starburst was strongest:

map_delta
Map of modified Cal;zetti slope parameter δ — MaNGA plateifu 8440-6104

A smaller optical depth of attenuation is also favored throughout:

tauv_compared
Modeled optical depth of attenuation – 2021 runs vs. 2018 MaNGA plateifu 8440-6104

This has a couple predictable consequences. Steeper attenuation will favor an intrinsically bluer, hence younger population while a lower optical depth requires less light, and hence mass in the stellar population. I can test this directly by returning to a model with Calzetti attenuation, and here is the result for the central fiber (this model run is labeled 2021 (c) in the legend below):

mgh_compared
Mass growth histories – 2021 run 2021 run with Calzetti attenuation 2018 run Central fiber of MaNGA plateifu 8440-6104

So, an eyeball analysis suggests about 3/4 of the difference between the 2018 and 2021 runs is due to the modification to the attenuation relation. The other changes I’ve made to the models are to change the stellar contribution parameters from a non-negative vector to a simplex, and at the same time changing the way I rescale the data. In early runs the SSP model fluxes were scaled to make the maximum stellar contribution ≈ 1, while the current models scale both the galaxy and SSP fluxes to ≈ 1 in the neighborhood of V, making the individual stellar contributions approximately the fraction of light contributed. An additional scale factor parameter in the model is used to adjust the overall fit. Assuming I did this right this should have no effect on a deterministic maximum likelihood solution, but with MCMC who knows?

Although the fit to the data looks about the same between the model with and without the attenuation modification the summed log-likelihood is consistently about 1% higher for the modified Calzetti model with no overlap at all in the distribution of likelihood. This suggests the case for a steeper than Calzetti attenuation is a fairly robust result.

ppfits
“Posterior predictive” fits to galaxy flux data – modified Calzetti attenuation vs. Calzetti – central fiber of MaNGA plateifu 8440-6104

The galaxy flux data also changed a little bit. The early runs were on the DR14 release (version 2_1_2 of the MaNGA DRP) while the recent ones used the DR15 release (ver 2_4_3). Most of the calibration differences resemble random noise, but there is some curvature that systematically affects both the red and blue ends of the spectrum and could cause some change in the temperature distribution of the models:

measured_flux_compared
Difference in measured flux from DR14 to DR15 – central fiber of MaNGA plateifu 8440-6104

While the detailed star formation histories changed, quantities that aren’t too model dependent didn’t very much. One example is shown below. Also, the kinematic maps agree with the earlier ones in detail.

halpha_luminosity_compared
Hα luminosity density – 2021 runs vs. 2018 MaNGA plateifu 8440-6104
velocity_field_8440_6104
Velocity field of MaNGA plateifu 8440-6104 from 2021 model runs. Map interpolated from RSS file spectra.

One input that hasn’t changed are the emiles SSP model spectra, although there have been some procedural changes in how I handle the modeling. Early on I often used a much smaller subset of SSP models with just 27 time bins and 3 metallicities for preliminary modeling, including my first models on the same binning of these data. I also routinely ran 250 warmup iterations with just 250 more per chain. My current standard practice is always to use the largest emiles subset with 216 SSP models in 54 time bins and 4 metallicities, and I generally run 750 post-warmup iterations per chain but still with only 250 warmup iterations. This is generally enough and if adaptation fails it is usually fairly obvious. The small sample size of the earlier runs mostly effects the precision of inferences rather than means.

To conclude for now, my speculation about whether it might be possible to say something about the timing of critical events in a merger from the model star formation history was too optimistic. On a positive note though both sets of model runs retrodict that coalescence occurred at a lookback time around 500Myr ago, which is consistent with the fact that tidal tails and other merger signatures are clearly visible even in SDSS imaging. Both sets of model runs also have that odd uptick in star formation at 30Myr in the central fiber. And while the difference in burst mass contributions is a little disconcerting the current runs are more consistent with the likely gas content of ordinary spiral galaxies.

This example illustrates another well known “degeneracy” among attenuation (and adopted attenuation relation), mass, and stellar age. Whether I’ve broken the degeneracy by adopting the more flexible attenuation prescription described some time ago remains to be validated.

Arxiv notes: Wu (2021), “Searching for local counterparts of high-redshift post-starburst galaxies…”

This paper (arxiv ID 2103.16070) is pretty old by now, having been posted on arxiv back in early April. The basic premise of the work is mildly interesting: the author searched MaNGA for galaxies that would meet conventional criteria for post-starburst (aka K+A etc.) spectra if observed at a redshift high enough that the entire galaxy would be covered by a single fiber like the original SDSS spectroscopes. Somewhat surprisingly, he found just 9 that met his selection criteria in the DR15 sample of ~4500 galaxies.

I have to say the paper itself is forgettable, but a manageably sized sample of MaNGA data that’s complete by some criterion is worth a look, and I have a long-standing interest in post-starburst galaxies in particular. So, I ran my current SFH modeling code on all 9 — by the way this was completed some time ago. It’s just taken me a while to get around to generating some graphics and sitting down to write.

The author only measured a few observable quantities: Hδ equivalent width and the 4000Å break index Dn(4000), along with Hα emission equivalent width and (normalized) fluxes. I long ago validated my own absorption line measurements of SDSS single fiber spectra against the MPA-JHU measurement pipeline, which was the gold standard for several years (but last run on DR8). My measurements and uncertainty estimates are in excellent agreement with theirs, so I have a fair amount of confidence in them. Emission line fluxes also agree with published measurements with considerably more scatter. My emission line equivalent widths on the other hand are completely unchecked. So, one of my tasks was to compare my equivalent width measurements with Wu’s. I did not attempt to exactly reproduce his work – I binned spatially using my usual Voronoi partitioning approach whereas Wu binned in elliptical annuli. With that difference in mind the next two plots should be compared to his Figures 4 and 5.

The first two graphs show the radial trends (relative to the effective radii per the NASA/SLOAN catalog) in the Lick HδA and Dn(4000) indexes. These both show very similar trends to Wu’s measurements although with more scatter. This is expected because fewer spectra go into each point in general — from the text it appears Wu binned several separate measurements for each displayed point. Also, I made no attempt to deproject distances. One feature of the Hδ versus radius plot that’s a little different is the trend generally flattens out beyond ∼1 effective radius, while Wu shows a roughly linear trend out to 1.5 Re. This might just be a visible effect of me displaying the trends out to larger radii.

d4000_hdelta_re
Radial trends of Lick HδA and Dn4000 for 9 MaNGA “post-starburst” galaxies from Wu (2021) – arxiv 2103.16070

The Hα emission line measurements are similarly in broad agreement. Like Wu, I find that there are two distinct trends in emission: either moderately strong centrally with a rapid decline or weak throughout with a relatively flat trend. One galaxy (with MaNGA plateifu 9876-12701) has no detectable emission. I haven’t looked in detail at emission line ratios to compare to Wu’s Figure 7, but there’s general agreement that some residual star formation is present in some of the sample and weak AGN or ionization by hot evolved stars in others.

Radial trends of Lick Hα equivalent width and luminosity density for 9 MaNGA “post-starburst” galaxies from Wu (2021) – arxiv 2103.16070

A fairly common failing of this literature (IMO) is the use of proxies for recent star formation but not attempting actually to model star formation histories. There are plenty of publicly available tools for that available now, so there’s really no reason not to perform such modeling exercises. Wu did do some toy evolutionary modeling and posted a graph of trajectories through the Hα emission – Hδ absorption plane, which can scarcely unambiguously constrain star formation histories. Of course much of my hobby time is spent generating fine grained model star formation histories, so let’s take a look at a few selected results.

First, here are maps of the modeled fraction of the current stellar mass in stars of ages between 0.1 and 1 Gyr, very roughly the age range that produces a post-starbursty spectrum. Six of the galaxies have more or less strongly centrally concentrated intermediate age populations, which is generally what’s expected especially in the major merger pathway to a post-starburst interval. I’ll discuss this a little further below.

burst_fraction_maps
Maps of fractional stellar mass in intermediate age populations for 9 MaNGA “post-starburst” galaxies from Wu (2021) – arxiv 2103.16070

In more detail here are summed mass growth histories for the sample, that is all modeled star formation histories for a given observation are summed to produce a single global estimate. I’ve shown total masses here. Because of the pointing strategy MaNGA uses the fiber positions overlap to produce a 100% filling factor, so simply summing overestimates masses by about 0.2dex according to a calculation I performed some time ago. The present day masses in the plot below actually agree pretty well with the values listed in Table 1 of the paper, with an average difference of ~0.1 dex (this is probably because at least some of the light falls outside the IFU footprint in most of these galaxies, offsetting some of the overcounted mass).

total_mgh
Integrated mass growth histories for 9 MaNGA “post-starburst” galaxies from Wu (2021) – arxiv 2103.16070

Somewhat surprisingly several of these galaxies show little evidence of an actual burst of star formation in the recent past, at least at the global level. Some of these could simply have had star formation truncated recently, which can produce a poststarburst spectral signature for a time. Overall intermediate age stars contribute ~ 6-20% of the present day stellar mass, with the two largest contributions in the low mass galaxies in the bottom row of the plot.

There are some other oddities in this small sample. At least 3 galaxies are dwarf ellipticals or perhaps dwarf irregulars (in the case of plateifu 9876-12701), and two others have stellar masses under ~5 x 109 M. Two of the low mass galaxies are in or near the Coma cluster, which suggests environmental effects as the probable cause of quenching. Another possible issue with the low mass galaxies is the infamous “age-metallicity degeneracy,” which refers to the fact that old, low metallicity populations “look like” younger, more metal rich ones by many measures. The Balmer lines in particular fade more slowly with age in lower metallicity populations, and the 4000Å break also becomes metallicity sensitive (smaller at low metallicities) at older ages.

There is only one clear merger remnant in the sample (with plateifu 8440-6104, which I will get to in a moment). One other galaxy (plateifu 8458-6102) is located in a compact group that appears (in Legacy survey imaging) to be embedded in a cloud of extragalactic light. Finally, two galaxies in this sample have been cataloged as K+A based on SDSS spectra — 8080-3702 and 9494-3701, while two others in the catalog of Melnick and dePropris (2013) are not.

SDSS thumbnails of the sample

The one clear merger remnant in the sample is an old friend of mine, and in fact I wrote three lengthy posts about this one back in 2018. In perusing those posts I noticed that the current set of model runs have a slightly weaker and more recent burst than the earlier runs. Also a double peak in the earlier runs has gone away in these, which means my early speculation that it might be possible to time crucial events in a merger from the detailed SFH model was too optimistic. On the other hand the model burst strength in the earlier runs was uncomfortably large, indicating an exceptionally gas rich merger and efficient processing of gas into stars. The current runs have a more reasonable ~10% of mass in the burst. So, I will look into those earlier runs and try to figure out what changed. Fortunately I’m a data hoarder and R is self-archiving to some extent.

kug0839+406
KUG 0839+406, one of 9 “post-starburst” galaxies in Wu (2021)

The idea of looking at the integrated properties of IFU data to pick a post-starburst sample seems reasonable, but this sample appears to me to be both incomplete and possibly with some false positives. When DR17 is finally released I plan to try to develop my own criteria. As I’ve already shown using SDSS spectra alone to select a sample is doomed to produce lots of false positives.

I should finally mention one other paper pursuing a similar idea by Greene et al. (2021) showed up on arxiv recently. The authors lost me when they used the phrase “carefully curated” in their introduction, which was otherwise pretty well written up to that point. Maybe I’ll take another look anyway.

Do the details of the line of sight velocity distribution matter for star formation history modeling? Probably not much.

I decided to try a set of models for one galaxy – NGC 4889 (with MaNGA plateifu 8479-12701), which had the highest overall velocity dispersion of the Coma sample I’ve been discussing in the last several posts. It also has some evidence for multiple kinematic components which isn’t too much of a surprise since it’s one of the central cD galaxies in Coma. The SSP model spectra fed to the SFH models were preconvolved with the element wise means of the LOSVD convolution kernels from the velocity distribution modeling exercise. Again, this is an expedient to avoid what could otherwise be prohibitively computationally expensive. The models I ran were the same as described back in this post — these ignore emission but do model dust attenuation with the usual modified Calzetti attenuation relation.

To get quickly to the results, here are model star formation histories compared to the previous runs that used the full model in its current form. Usually I like these plots of results from all spectra in an IFU, but in this one all 381 spectra met my S/N criterion, so the plot is pretty crowded. You really need to see it live on a 4K monitor to see the details.

NGC 4889 (MaNGA plateifu 8479-12701) Model star formation histories for all spectra, runs with non-parametric LOSVD vs. single gaussian stellar velocity dispersions

Well it’s pretty hard to see but differences in model SFH’s are mostly in the youngest age bins, which are very poorly constrained anyway in these presumably passively evolving galaxies. Here’s a closer look at a single model run that had the largest difference in estimated stellar mass density (more on this right below) of about 0.19 dex:

mgh_comparison_8479-12701
NGC 4889 (MaNGA plateifu 8479-12701) Model mass growth histories for a single spectrum – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion

So, the difference in star formation histories was slower mass build up between about 12-5 Gyr look back times in the second run, which was responsible for the lower current day stellar mass density. How this resulted from the choice of LOSVD is not at all obvious.

Let’s look at a few summary results. First, the model stellar mass surface densities:

sigma_mstar_comparison_8479-12701
NGC 4889 (MaNGA plateifu 8479-12701) Model ΣM* – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion

These fall on an almost exactly one to one relation with a few hands full of outliers. Oddly these are mostly in the higher signal to noise area of the IFU (i.e. near the center).

Results for star formation rate density and specific star formation rate are even more consistent between runs, with essentially no differences larger than the nominal 1 σ error bars.

sigma_sfr_comparison_8479-12701
NGC 4889 (MaNGA plateifu 8479-12701) Model Σsfr – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion
ssfr_comparison_8479-12701
NGC 4889 (MaNGA plateifu 8479-12701) Model SSFR – runs with non-parametric LOSVD vs. single Gaussian stellar velocity dispersion

One problem I encountered was that I had to re-run some models either for technical reasons or because of obvious convergence failures. I suspect there could have been some convergence issues in both sets of runs and am slightly worried that could be the source of the few differences in summary measures seen. Oddly, there were almost no suspicious convergence diagnostics in either set of runs (once the latter were run to satisfactory conclusion), and Stan is quite aggressive about reporting possible convergence issues.

Anyway, modeling kinematics remains an interesting topic to me, but it seems somewhat decoupled from modeling star formation histories. Right now I’m waiting for the final SDSS data release to decide what projects I want to tackle.

I’m going to end with a couple of asides. First, I recognize that all of these error bars are overoptimistic, maybe by a lot. The main reason, I think, is that I treat the flux values as independent which they clearly are not1 this is pretty standard practice however, which effectively results in overestimating the sample size. One possible partial solution is to allow the flux uncertainties to vary from their nominal values by, for example, a factor > 1. This would involve adding as few as one parameter to the models, which is something I’ve actually tried in the past. I may relook at that.

One interesting feature of the previous two graphs is the rather obvious systematic trend with radius of both SFR density and specific star formation rate, as shown more directly below taken from the first set of model runs:

sfr_ssfr_d_8479-12701
NGC 4889 (MaNGA plateifu 8479-12701) ΣSFR and SSFR vs. distance from IFU center

Are these real trends? I don’t know, but I don’t see an obvious reason why they might be spurious features of the models. In normal star forming galaxies I encounter trends with radius in both directions and sometimes no trend at all.

As a final and related aside there was a paper by Sedgwick et al. that showed up on arxiv not long ago that presented estimates of star formation rates of early type galaxies from observations of core collapse supernovae carefully matched to host galaxies with high confidence morphological classifications. To oversimplify their conclusions they found that typically massive ellipticals might have specific star formation rates ∼ 10-11 / yr, which is somewhat higher than usually supposed. As I mentioned in my last post my models will always have some contribution from young stars and I typically get central estimates of SSFR > ∼ 10-11.5 even in galaxies with no hint of emission (as is the case with this Coma sample). This particular galaxy has a total stellar mass within the IFU of ∼ 1011.5 M , so it could be forming stars at a rate of ∼ 1 M / yr.

Well, I think I have one more post to write before the SDSS DR17 release.

A little more on nonparametric line of sight velocity distribution modeling – is regularization needed?

A while back I completed LOSVD models for 3348 spectra in the 33 Coma Cluster galaxies I chose for an initial sample. I want to look briefly at a couple summary statistics from the models. For each sample in the posterior I calculate first and second moments of the velocity distribution, and from those I calculate means and standard deviations of velocity offsets and velocity dispersions. Recall from the earlier posts that wavelengths are in the spectrum rest frame with peculiar velocities estimated by fitting a set of eigenspectra to the data. The first plot below is the mean velocity offset with errorbars ± 1 standard deviation versus the signal to noise in each spectrum. The horizontal line is the overall median of 5.25 km/sec. This is less than 1/10th the width of a wavelength bin and, I think, consistent with the absolute wavelength calibration accuracy claimed by the MaNGA team. From an eyeball analysis there doesn’t appear to be any trend in the mean with S/N, but the dispersion in estimates increases as the data quality gets worse. This seems unlikely to be a real problem. Two outliers cut off from the graph had mean velocity offsets of ≈ 200 km/sec, which is only 3 wavelength bins.

voff_v_snr
Mean velocity offset and standard deviation of mean vs. spectrum S/N for 33 passively evolving Coma Cluster galaxies. Solid horizontal line is median offset.

Possibly more concerning is that the mean velocity dispersion also shows a systematic trend with S/N:

vdisp_v_snr
Mean velocity dispersion and standard deviation of mean vs. spectrum S/N for 33 passively evolving Coma Cluster galaxies.

Of course there is a wide range of velocity dispersions since there’s a range of galaxy masses in this sample, but each galaxy shows a similar trend. This matters because, of course, S/N correlates strongly with distance from the nucleus which is usually the IFU center and this in turn leads to potentially spurious correlations with distance.

MaNGA spectrum S/N vs. distance from IFU center in kpc. — 33 passively evolving Coma Cluster galaxies

For example here are results for the cD galaxy NGC 4889 with MaNGA plateifu 8479-12701. This has the highest overall velocity dispersion in the sample by a fair margin as seen in the plots above. Plotting the estimated velocity dispersion against distance from the IFU center shows an apparent systematic increase1the trendline is just a “loess” smooth fit to mislead the eye and shouldn’t be interpreted as any kind of analysis:

NGC 4889 (MaNGA plateifu 8479-12701) – mean velocity dispersion vs. radius per nonparametric LOSVD fit.

Now this could be a real trend: my recollection is the outskirts of cD galaxies have essentially the velocity dispersion of the clusters they live in, which is taken as strong evidence that they grow by cannibalism. But the IFU only covers the inner part of this galaxy and in any case the confounding effect of the correlation between S/N and modeled velocity dispersion prevents any conclusion. Just for comparison here is the same relation using the single gaussian estimates from the preliminary maximum likelihood fits:

NGC 4889 (MaNGA plateifu 8479-12701) – velocity dispersion vs. radius per maximum likelihood fit.

So, if we’re interested in kinematics or dynamics some regularization with an informative prior seems advisable. What’s not so clear is whether this matters in SFH modeling, which as I’ve said many times is what most interests me. There’s one way to find out…

Confronting SFH models with observables – some results for normal disk galaxies

I’ve posted versions of some of these graphs before for both individual galaxies and a few larger samples, but I think they’ve all been unusual ones. I recently managed to complete model runs on 40 of the spirals from the normal barred and non-barred sample I discussed back in this post. The 20 barred and 20 non-barred galaxies in the sample aren’t really enough to address the results in the paper by Fraser-McKelvie that was the starting point for my investigation and more importantly the initial sample was chosen entirely at my whim. Unfortunately I don’t have the computer resources to analyze more than a small fraction of MaNGA galaxies. The sampling part of the modeling process takes about 15 minutes per spectrum on my 16 core PC (which is a huge improvement) and there are typically ~120 binned spectra per galaxy, so it takes ~30 hours per galaxy with one PC running at full capacity. I should probably take up cryptocurrency mining instead.

This sample comprises 5086 model runs with 2967 spectra of non-barred and 2119 of barred spirals. For some of the plots I’ll add results for 3348 spectra of 33 passively evolving Coma cluster galaxies.

Anyway, first: the modeled star formation rate density versus the rate predicted from the Hα luminosity density, which is easily the most widely used star formation rate calibrator at optical wavelengths. The first plot below shows all spectra with estimates for both values. Red dots are (non-barred) spirals, blue are barred. Both sets of quantities have uncertainties calculated, but I’ve left off error bars for clarity. Units on both axes are log10(M/yr/kpc2). I adopted the relation log(SFR) = log(L) – 41.26 from a review by Calzetti (2012), which is the straight line in these graphs. That calibration is traceable back to Kennicutt (1983), which as far as I know has never been revisited except for small adjustments to account for changing fashions in assumed stellar initial mass functions. In the left panel of the plot below Hα is uncorrected for attenuation. In the right it’s corrected using the modeled stellar attenuation, which as I noted some time ago will systematically underestimate the attenuation in H II regions. Not too surprisingly almost all points lie above the calibration line — the SFH models include a treatment of attenuation that might be too simple but still does make a correction for starlight lost to dust. The more important observation though is there’s a pretty tight relationship between modeled SFR density and estimated Hα luminosity density that holds over a nearly 3 order of magnitude range in both. The scatter around a simple regression line in the graphs below is about 0.2 dex. It’s not really evident on visual inspection but the points do shift slightly to the right in the right hand plot and there’s also a very slight reduction in scatter. These galaxies are actually not especially dusty, with an average model optical depth of around 0.25 (which corresponds to E(B-V) ≈ 0.07).

sfr_ha_40spirals
SFR density vs. prediction from Hα luminosity for 40 normal spirals. (L) Hα luminosity uncorrected for attenuation. (R) Hα corrected using estimated attenuation of stellar component.

To take a more refined look at this I limited the sample to regions with star forming emission line ratios using the standard BPT diagnostic based on [O III]/Hβ vs. [N II]/Hα. I require at least a 3σ detection in each line to make a classification, so besides limiting the analysis to regions that are in fact (I hope) forming stars it allows correcting Hα attenuation for the observed Balmer decrement since Hβ is by construction at least nominally detected. Now we get the results shown in the plot below. Units and symbols are as before. Hα luminosity is corrected using the Balmer decrement assuming an intrinsic ratio of 2.86 and the same attenuation curve shape as returned by the model. The SFR-Hα calibration line is the thick red one. The blue lines with grey ribbons are from “robust” simple regressions using the function lmrob in the R package robustbase1Correcting for attenuation produced a few significant outliers that bias an ordinary least squares fit and although it’s not specifically intended for measurements with errors this function seems to do a little better than either ordinary or weighted least squares.

Model estimates of star formation rate density vs. SFR predicted from Hα luminosity density.

So the model SFR density straddles the calibration line, but with a distinct tilt — regions with relatively low Hα luminosity have higher than expected star formation. To quantify this here is the output from the function lmrob:

Call:
lmrob(formula = sigma_sfr_m ~ sigma_sfr_ha, data = df.sfr)
 \--> method = "MM"
Residuals:
      Min        1Q    Median        3Q       Max 
-3.862996 -0.142375  0.004122  0.137030  1.305471 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -0.174336   0.019224  -9.069   <2e-16 ***
sigma_sfr_ha  0.785954   0.009948  79.008   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Robust residual standard error: 0.2097 
Multiple R-squared:  0.7402,	Adjusted R-squared:  0.7401 
Convergence in 10 IRWLS iterations

Robustness weights: 
 6 observations c(781,802,933,941,2121,2330) are outliers with |weight| = 0 ( < 3.8e-05); 
 223 weights are ~= 1. The remaining 2424 ones are summarized as
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0107  0.8692  0.9525  0.9020  0.9854  0.9990 

I also ran my Bayesian measurement error model on this data set and got the following estimates for the intercept, slope, and residual standard deviation:


         mean      se_mean          sd       2.5%        25%        50%        75%      97.5%    n_eff      Rhat
b0 -0.1942387 1.943297e-04 0.018346806 -0.2312241 -0.2063781 -0.1943811 -0.1819499 -0.1589849 8913.379 0.9997482
b1  0.7767853 9.828814e-05 0.009436693  0.7579702  0.7706115  0.7768086  0.7830051  0.7949343 9218.014 0.9995628
s   0.2044701 3.837428e-05 0.003319280  0.1981119  0.2021872  0.2043949  0.2067169  0.2110549 7481.821 0.9997152

Almost the same! So, how to interpret that slight “tilt”? The obvious comment is that the model results probe a very different time scale — by construction 100 Myr — than Hα (5-10 Myr). As a really toy model consider an isolated, instantaneous burst of star formation. As the population ages its star formation rate will be calculated to be constant from its birth up until 100 Myr when it drops to 0, while its emission line luminosity declines steadily. So its trajectory in the plot above will be horizontally from right to left until it disappears. In fact in spiral galaxies in the local universe star formation is generally localized, usually along the leading edges of arms in grand design spirals. Slightly older populations will be more dispersed.

This can be seen pretty clearly in the SFR maps for two galaxies from this sample below. In both cases regions with high star formation rate track the spiral arms closely, but are more diffuse than regions with high Hα luminosity.

Second topic: the spectral region around the 4000Å “break” has long been known to be sensitive to stellar age. Its use as a quantitative specific star formation rate indicator apparently dates to Brinchmann et al. (2004)2They don’t cite any antecedents and I can’t find any either.. More recently Bluck et al. (2020) used a similar technique at the sub-galactic level on MaNGA galaxies. Both studies use D4000 as a secondary star formation rate indicator, preferring Hα luminosity as the primary SFR calibrator with D4000 reserved for galaxies (or regions) with non-starforming emission line ratios or lacking emission. Oddly, I have been unable to find an actual calibration formula in a slightly better than cursory search of the literature — both of the cited papers present schematic graphs with overlaid curves giving the adopted relationships and approximate uncertainties. The Brinchmann version from the published paper is copied and pasted below.

In the two graphs below I’ve added data from the passively evolving Coma cluster sample comprising 3348 binned spectra in 33 galaxies. There are two versions of the same graphs. Individual points are displayed in the first, as before with error bars suppressed to aid (slightly) clarity. The second displays the density of points at arbitrarily spaced contour intervals. The straight line is the “robust” regression line calculated for the spiral sample only, which for the sake of completeness is

\( \log10(sSFR) = -7.11 (\pm 0.02) – 2.11 (\pm 0.015) D_n(4000)\)
d4000_ssfr_40spirals_asscatter
Model sSFR vs. measured value of D4000. 40 barred and non-barred spirals + 33 passively evolving Coma cluster galaxies.
Model sSFR vs. measured value of D4000. 40 barred and non-barred spirals + 33 passively evolving Coma cluster galaxies.
Model sSFR vs. measured value of D4000 (2D density version). 40 barred and non-barred spirals + 33 passively evolving Coma cluster galaxies.

Call:
lmrob(formula = ssfr_m ~ d4000_n, data = df.ssfr)
 \--> method = "MM"
Residuals:
       Min         1Q     Median         3Q        Max 
-0.9802409 -0.0916555 -0.0005187  0.0962981  7.1748499 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -7.10757    0.02009  -353.8   <2e-16 ***
d4000_n     -2.10894    0.01418  -148.7   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Robust residual standard error: 0.1384 
Multiple R-squared:  0.9043,	Adjusted R-squared:  0.9043 
Convergence in 13 IRWLS iterations

Robustness weights: 
 39 observations c(45,958,1003,1165,1200,1230,1249,1279,1280,1281,1282,1283,1294,1298,1299,1992,2040,2047,2713,2722,2723,2729,2735,2736,2974,3212,3226,3250,3667,3668,3671,3677,3685,3687,3688,3691,4056,4058,4083)
	 are outliers with |weight| <= 1.1e-05 ( < 2.1e-05); 
 418 weights are ~= 1. The remaining 4310 ones are summarized as
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
0.0001994 0.8684000 0.9514000 0.8911000 0.9850000 0.9990000 
The relation between D4000 and sSFR as estimated by Brinchmann et al. 2004

All three groups follow the same relation but with some obvious differences in distribution. The non-barred spiral sample extends to higher star formation rates (either density or sSFR) than barred spirals, which in turn extend into the passively evolving range. The Coma cluster sample has a long tail of high D4000 values (or high specific star formation rates at given D4000) — this is likely because D4000 becomes sensitive to metallicity in older populations and this sample contains some of the most massive (and highest metallicity) galaxies in the local universe. Also, as I’ve noted before these models “want” to produce a smoothly varying mass growth history, which means that even the reddest and deadest elliptical will have some contribution from young populations. This seems to put a floor on modeled specific SFR of ∼10-11.5 yr-1.

Just to touch briefly on the paper by Fraser-McKelvie et al. barred spirals in this sample do have lower overall star formation than non-barred, with large areas in the green valley or even passively evolving. This sample is too incomplete to say much more. For the sake of having a visualization here is the spatially resolved ΣSFR vs. ΣM* relation. The dashed line is Bluck’s estimate of the star forming “main sequence,” which looks displaced downward compared to my estimates.

mstar_sfr_40spirals+33etg
Model SFR density vs. stellar mass density. 40 barred and non-barred spirals + 33 passively evolving Coma cluster galaxies.

Finally, here are a couple of grand design spirals, one barred and one (maybe) not to illustrate how model results track morphological features. In the barred galaxy note that the arms are clearly visible in the SFR maps but they aren’t visible at all in the stellar mass map, which does show the presence of the very prominent bar.

NGC 6001 – thumbnail with MaNGA IFU footprint
NGC 6001 (MaNGA plateifu 9041-12701) (L) Model SFR surface density (M) Hα luminosity density (R) sSFR
NGC 5888- thumbnail with MaNGA IFU footprint
NGC 5888 (MaNGA plateifu 9871-12702) (L) Model SFR surface density (M) Hα luminosity density (R) sSFR
9871-12702_stellar_mass_density
NGC 5888 (MaNGA plateifu 9871-12702) – Log model stellar mass density (Msun/kpc2

I’m not sure how much more I’m going to do with normal spirals. As I’ve said repeatedly the full sample is much too large for my computing resources.

Next time (probably) I’m going to return to a very small sample of post-starburst galaxies, which I may also return to when the final SDSS public data is released.

NGC 4949 – first attempt at modeling star formation histories with non-parametric losvd’s

I only have time to analyze one MaNGA galaxy for now and since it was the first to get the correctly coded LOSVD estimates I chose the Coma Cluster S0 galaxy NGC 4949 that I discussed in the last post for SFH modeling. As I mentioned a few posts ago trying to model the LOSVD and star formation history simultaneously is far too computationally intensive for my resources, so for now I just convolve the SSP model spectral templates with the elementwise means of the convolution kernels estimated as described previously and feed those to the SFH modeling code. My intuition is that the SFH estimates should be at least slightly more variable if the kinematics are treated as variable as well, but for now there’s no real alternative to just picking a fiducial value. Of course a limited test of that hypothesis could be made by pulling multiple draws from the posterior of the LOSVD distribution. I will certainly try that in the future.

My first idea was to leave both dust and emission lines out of the SFH models. The first choice was to save CPU cycles and also based on the expectation that these passively evolving galaxies should have very little dust. The results were rather interesting: here are ribbon plots of model SFH’s for all 86 binned spectra. The pink ribbons are from the original model runs, which assumed single component gaussian velocity distributions, modified Calzetti attenuation, and included emission lines (which contribute negligible flux throughout). The blue ribbons are with non-parametric LOSVD and no attenuation model:

NGC 4949 – modeled star formation histories with single component gaussian and nonparametric estimates of line of sight velocity distributions with no dust model

The differences in star formation histories turn out to be almost entirely due to neglecting dust in the second set of model runs. In this galaxy there are some areas with apparently non-negligible amounts of dust:

NGC 4949 – map of estimated optical depth of attenuation

Ignoring attenuation forces the model to use redder, hence older (and probably more metal rich although I haven’t yet checked) stellar populations and this has, in some cases, profound effects on the model SFH.

So, I tried a second set of model runs adding back in modified Calzetti attenuation but still leaving out emission:

ngc4949_sfh0_sfhnpvddust
NGC 4949 – modeled star formation histories with single component gaussian and nonparametric estimates of line of sight velocity distributions

And now the models are nearly identical, with minor differences mostly in the youngest age bin. All other quantities that I track are similarly nearly identical in both model runs.

Here is the modeled mass growth history for a single spectrum of a region just south of the nucleus that had the largest difference in model SFH in the second set of runs, and the largest optical depth of attenuation in the first and 3rd. This is an extreme case of the shift towards older populations required by the neglect of reddening. Well over 90% of the present day stellar mass was, according to the model, formed in the first half Gyr after the big bang with almost immediate and complete quenching thereafter. While not an impossible star formation history it’s not one I often see result from these models, which strongly favor more gradual mass buildup.

ngc4949_mgh3models_spec14
NGC 4949 – modeled mass growth histories for models with and without an attenuation model

So, the conclusion for now is that having an attenuation model matters a lot, the detailed stellar kinematics not so much. Of course this is a relatively simple example because it’s a rapid rotator and the model convolution kernels are fairly symmetrical throughout the galaxy. The massive ellipticals and especially cD galaxies with more complex kinematics might provide some surprises.

Update on Bayesian line of sight velocity modeling

Well that was simple enough. I made a simple indexing error in the R data preprocessing code that resulted in a one pixel offset between the template and galaxy spectra, which effectively resulted in shifting the elements of the convolution kernel by one bin. I had wanted to look at a rotating galaxy to perform some diagnostic tests, but once I figured out my error this turned out to be a pretty good validation exercise. So I decided to make a new post. The galaxy I’m looking at is NGC 4949, another member of the sample of passively evolving Coma cluster galaxies of Smith et al. It appears to me to be an S0 and is a rapid rotator:

NGC 4949 – SDSS image
NGC 4949 – radial velocity map

These projected velocities are computed as part of my normal workflow. I may in a future post explain in more detail how they’re derived, but basically they are calculated by finding the best redshift offset from the system redshift (taken from the NSA catalog which is usually the SDSS spectroscopic redshift) to match the features of a linear combination of empirically derived eigenspectra to the given galaxy spectrum.

First exercise: find the line of sight velocity distribution after adjusting to the rest frame in each spectrum. This was the originally intended use of these models. This galaxy has fairly low velocity dispersion of ~100 km/sec. so I used a convolution kernel size of just 11 elements with 6 eigenspectra in each fit. Here is a summary of the LOSVD distribution for the central spectrum. This is much better. The kernel estimates are symmetrical and peak on average at the central element. The mean velocity offset is ≈ 9.5 km/sec, which is much closer to 0 than in the previous runs. I will look briefly at velocity dispersions at the end of the post: this one is actually quite close to the one I estimate with a single component gaussian fit (116 km/sec vs 110).

Estimated LOSVD of central spectrum of NGC 4949

Next, here are the posterior mean velocity offsets for all 86 spectra in the Voronoi binned data, plotted against the peculiar velocity calculated as outlined above. The overall average of the mean velocity offsets is 4.6 km/sec. The reason for the apparent tilt in the relationship still needs investigation.

Mean velocity offset vs. peculiar velocity. All NGC 4949 spectra.

Exercise 2: calculate the LOSVD with wavelengths adjusted to the overall system redshift as taken from the NSA catalog, that is no adjustment is made for peculiar redshifts due to rotation. For this exercise I increased the kernel size to 17 elements. This is actually a little more than needed since the projected rotation velocities range over ≈ ± 100 km/sec. First, here is the radial velocity map:

Radial velocity map from Bayesian LOSVD model with no peculiar redshifts assigned.

Here’s a scatterplot of the velocity offsets against peculiar velocities from my normal workflow. Again there’s a slight tilt away from a slope of 1 evident. The residual standard error around the simple regression line is 6.4 km/sec and the intercept is 4 km/sec, which are consistent with the results from the first set of LOSVD models.

Velocity offsets from Bayesian LOSVD models vs. peculiar velocities

Exercise 3: calculate redshift offsets using a set of (for this exercise, 6) eigenspectra from the SSP templates. Here is a scatterplot of the results plotted against the redshift offsets from my usual empirically derived eigenspectra. Why the odd little jumps? I’m not completely sure, but my current code does an initial grid search to try to isolate the global maximum likelihood, which is then found with a general purpose nonlinear minimizer. The default grid size is 10-4, about the size of the gaps. Perhaps it’s time to revisit my search strategy.

Redshift offsets from a set of SSP derived eigenspectra vs. the same routine using my usual set of empirically derived eigenspectra.

Final topic for now: I mentioned in the last post that posterior velocity dispersions (measured by the standard deviation of the LOSVD) were only weakly correlated with the stellar velocity dispersions that I calculate as part of my standard workflow. With the correction to my code the correlation while still weak has greatly improved, but the dispersions are generally higher:

Velocity dispersion form Bayesian LOSVD models vs. stellar velocity dispersion from maximum likelihood fits.

A similar trend is seen when I plot the velocity dispersions from the LOSVD models with correction only for the system redshift and a wider convolution kernel (exercise 2 above) with the fully corrected model runs (exercise 1):

These results hint that the diffuse prior on the convolution kernel is responsible for the different results. As part of the maximum likelihood fitting process I estimate the standard deviation of the stellar velocity distribution assuming it to be a single component gaussian. While the distribution of kernel values in the first graph look pretty symmetrical the tails are on average heavier than a gaussian. This can be seen too in the LOSVD models with the larger convolution kernel of exercise 2. The tails have non-negligible values all the way out to the ends:

Now, what I’m really interested in are model star formation histories. I’ve been using pre-convolved SSP model templates from the beginning along with phony emission line spectra with gaussian profiles with some apparent success. My plan right now is to continue that program with these non-parametric LOSVD’s. The convolutions could be carried out with posterior means of the kernel values or by drawing samples. Repeated runs could be used to estimate how much variation is affected by uncertainty in the kernel.

How to handle emission lines is another problem. For now stepping back to a simpler model (no emission, no dust) would be reasonable for this Coma sample.

Arxiv notes: “BAYES-LOSVD: a bayesian framework for … line-of sight velocity distributions”

This paper by J. Falcón-Barroso and M. Martig that showed up on arxiv back in November interested me for a couple reasons. First, this was one of the first research papers I’ve seen to make use of the Stan language or indeed any implementation of Hamiltonian Monte Carlo. What was even more interesting was they were doing exactly something I experimented with a few years ago, namely estimating the elements of convolution kernels for non-parametric description of galaxy stellar kinematics. And in fact their Stan code posted on github has comment lines linking to a thread on discourse.mc-stan.org I initiated where I asked about doing discrete convolution by direct multiplication and summation.

The idea I was pursuing at the time was to use sets of eigenspectra from a principal components decomposition of SSP model spectra as the bases for fitting convolved model spectra, with the expectation being that considerable dimensionality reduction could be achieved by using just the leading PC’s, which would in turn speed up model fitting. In fact this was certainly the case, although at the time I perhaps overestimated the number of components needed 1I mentioned in that thread that by one criterion — now forgotten — I would need 42, which is about an 80% reduction from the 216 spectra in the EMILES subset I use but still more than I expected. and I found the execution time to be disappointingly large. Another thing I had in mind was that I wanted to constrain emission line kinematics as well, and I couldn’t quite see how to do that in the same modeling framework. So, I soon turned away from this approach and instead tried to constrain both kinematics and star formation histories using the full SSP library and what I dubbed a partially parametric approach to kinematic modeling, using a convolution kernel for the stellar component and Gauss-Hermite polynomials for emission lines. This works, more or less, but it’s prohibitively computationally intensive, taking an order of magnitude or more longer to run than the models with preconvolved spectral templates. Consequently I’ve run this model fewer than a handful of times and only on a small number of spectra with complicated kinematics.

Falcón-Barroso and Martig started with exactly the same idea that I was pursuing of radically reducing the dimensionality of input spectral templates through the use of principal components. They added a few refinements to their models. They include additive Legendre polynomials in their spectral fits, and they also “regularize” the convolution kernels with informative priors of various kinds. Adding polynomials is fairly common in the spectrum fitting industry, although I haven’t tried it myself. There is only a short segment in the red that is sometimes poorly fit with the EMILES library, and that would seem to be difficult to correct with low order polynomials. What does affect the continuum over the entire wavelength range and could potentially cause a severe template matching issue is dust reddening, and this would seem to be more suitable for correction with multiplicative polynomials or simply with the modified Calzetti relation that I’ve been using for some time.

I’m also a little ambivalent about regularizing the convolution kernel. I’m mostly interested in the kinematics for the sake of matching the effective SSP model spectral resolution to that of the spectra being modeled for star formation histores, and not so much in the kinematics as such.

I decided to take another look at my old code, which somewhat to my surprise I couldn’t find stored locally. Fortunately I had posted the complete code on that discourse.mc-stan thread, so I just copied it from there and made a few small modifications. I want to keep things simple for now, so I did not add any polynomial or attenuation corrections and I am not trying to model emission lines. I also didn’t incorporate priors to regularize the elements of the convolution kernel. Without an explicit prior there will be an implicit one of a maximally diffuse dirichlet. I happen to have a sample of MaNGA spectra that are well suited for a simple kinematic analysis — 33 Coma cluster galaxies from a sample of Smith, Lucey, and Carter (2012) that were selected to be passively evolving based on weak Hα emission, which virtually guarantees overall weak emission. Passively evolving early type galaxies also tend to be nearly dust free, making it reasonable not to model it.

I’ve already run SFH models for all of these galaxies, so for each spectrum in a galaxy I adjust the wavelengths to rest frame using the system redshift and redshift offsets that were calculated in the previous round. I then regrid the SSP library (as usual the 216 component EMILES subset) templates to the galaxy rest frame and extract the region where there is coverage of both the galaxy spectrum and the templates. I then standardize the spectra (that is subtract the means and divide by the standard deviations) and calculate the principal components using R’s svd() function. I again rescale the eigenvectors to unit variance and select the number I want to use in the models. I initially did this by choosing a threshold for the cumulative sum of the eigenvalues as a fraction of the sum of all. I think instead I should use the square of the eigenvalues since that will make the choice based on the fraction of variance in the selected eigenvectors. With the former criterion a threshold of 0.95 will result in selecting 8 eigenspectra, 0.975 results in 13 (perhaps ± 1). In fact as few as 2 or 3 might be adequate as noted by Falcón-Barroso and Martig. The picture below of three eigenspectra from one model run illustrates why. The first two look remarkably like real galaxy spectra, with the first looking like that of an old passively evolving population, while the second has very prominent Balmer absorption lines and a bluer continuum characteristic of a younger population. After that the eigenspectra become increasingly un-spectrum like, although familiar features are still recognizable but often with sign flips.

three_eigenspectra
First three eigenspectra from a principal components decomposition of an SSP model spectrum library

The number of elements in the convolution kernel is also chosen at run time. With logarithmic wavelength binning each wavelength bin has constant velocity width — for SDSS and MaNGA spectra of 69.1 km/sec. If a typical galaxy has velocity dispersion around 150 km/sec. and we want the kernel to be ≈ ±3 σ wide then we need 13 elements, which I chose as the default. Keep in mind that I’ve already estimated redshift offsets from the system value for each spectrum, so a priori I expected the mean velocity to be near 0. If peculiar velocities have to be estimated the convolution kernel would need to be much larger. Also in giant ellipticals and other fairly rare cases the velocity dispersion could be much larger as we’ll see below.

Lets look at a few examples. I’ve only run models so far for three galaxies with a total of 434 spectra. One of these (SDSS J125643.51+270205.1) appears disky and is a rapid rotator. The most interesting kinematically is NGC 4889, a cD galaxy at the center of the cluster. The third is SDSS J125831.59+274024.5, a fairly typical elliptical and slow rotator. These plots contain 3 panels. The left shows the marginal posterior distribution of the convolution kernels, displayed as “violin” plots. The second and third are histograms of the mean and standard deviation of the velocities. For now I’m just going to show results for the central spectrum in each galaxy. These are fairly representative although the distributions in each velocity bin get more diffuse farther out as the signal/noise of the spectra decrease. I set the kernel size to the default for the two lower mass galaxies. The cD has much higher velocity dispersion of as much as ~450 km/sec within the area of the IFU, so I selected a kernel size of 39 (± 1320 km/sec) for those model runs. The kernel appears to be fairly symmetrical for the rotating galaxy (top panes), while the two ellipticals show some evidence of multiple kinematic components. Whether these are real or statistical noise remains to be seen.

centrallosvd_8931_1902
Distribution of kernel, velocity offset, and velocity dispersion. Central spectrum of MaNGA plateifu 8931-1902
centrallosvd_8933_6101
Distribution of kernel, velocity offset, and velocity dispersion. Central spectrum of MaNGA plateifu 8933-6101
centrallosvd_8479_12701
Distribution of kernel, velocity offset, and velocity dispersion. Central MaNGA spectrum of NGC 4889 (plateifu 8479-12701)

There are a couple surprises: the velocity dispersions summarized as the standard deviations of the weighted velocities are at best weakly correlated and generally larger than the values estimated in the preliminary fits that I perform as part of the SFH modeling process. More concerning perhaps is the mean velocity offset averages around -50 km/sec. This is fairly consistent across all three galaxies examined so far. Although this is less than one wavelength bin it is larger than the estimated uncertainties and therefore needs explanation.

hist_velocity_offset
Distribution of mean velocity offsets from estimated redshifts for 434 spectra in 3 MaNGA galaxies.

Some tasks ahead: Figure out this systematic. Look at regularization of the kernel. Run models for the remaining 30 galaxies in the sample. Look at the effect on SFH models.

Using Galaxy Zoo classifications to select MaNGA samples

A while back I came across a paper by Fraser-McKelvie et al. (2020, arxiv id 2009.07859) that used Galaxy Zoo classifications to select a sample of barred spiral galaxies with MaNGA observations. This was a followup to a paper by Peterken et al. (2020, arxiv id 2005.03012) that also used Galaxy Zoo classifications to select a parent sample of spiral galaxies (barred and otherwise). There’s nothing new about using GZ classifications for sample selection of course, although these papers are somewhat notable for going farther down the decision tree than usual. What was new to me though when I decided to get my own samples is the SDSS CasJobs database now has a table named mangaGalaxyZoo containing GZ classifications for (I guess) all MaNGA galaxies. The classifications come from the Galaxy Zoo 2 database supplemented with some followup campaigns to fill in the gaps in GZ2. Besides greater completeness than the zoo2* database tables that can also be queried in CasJobs this table contains the newer vote fraction debiasing procedure described in Hart et al. (2016). It’s also much faster to query because it’s indexed on mangaid. When I put together the sample of MaNGA disk galaxies that I’ve posted about several times I took a somewhat indirect approach of looking for SDSS spectroscopic objects close to IFU centers and joining those with classifications in the zoo2MainSpecz table. The query I wrote took about 3 1/2 hours to execute, whereas the ones shown below required no more than a second.

Pasted below are the complete SQL queries, and below the fold are lists of the positions and plateifu IDs of the samples suitable for copying and pasting into the SDSS image list tool. These queries run in the DR16 “context” produced 287 and 272 hits respectively, with 285 unique galaxies in the barred sample and 263 uniques in the non-barred. These numbers are a little different than in the two papers referenced at the top. Fraser-McKelvie ended up with 245 galaxies in their barred sample — most of the difference appears to be due to me selecting from both the primary and secondary MaNGA samples, while they only used the “Primary+” sample (which presumably include the primary and “color enhanced” subsamples). I also did not make any exclusions based on the drp3qual value although I did record it. The total sample size of 548 galaxies is considerably smaller than the parent sample from Peterken, which was either 795 or 798 depending on which paper you consult. The main reason for that is probably that Peterken’s parent sample includes all bar classifications while I excluded galaxies with debiased f_bar levels > 0.2 in my bar-less sample. My barred fraction of around 52% is closer to guesstimates in the literature.

Both samples contain at least a few false positives, as is usual, but there are only one or two gross misclassifications. One that was especially obvious in the barred sample was this early type galaxy, which clearly has neither a bar or spiral structure and at least qualitatively has a brightness profile more characteristic of an elliptical. Oddly, the zoo2MainSpecZ entry for this object has a completely different set of classifications — the debiased vote fraction for “smooth” was 84%, so most volunteers agreed with me. This suggests maybe a misidentification in the mangaGalaxyZoo data.

CGCG 238-030. Not a barred spiral.

Besides this really obvious case I found a few with apparent inner rings or lenses, and a few galaxies in both samples appear to me to be lenticulars with no clear spiral structure. The first of the two below again has a completely different set of classifications in zoo2MainSpecZ than in the MaNGA table.

Again, not a barred spiral.
Lenticular?

Although I didn’t venture to count them a fair number of galaxies in the non-barred sample do appear to have short and varyingly obvious bars. Of course the query didn’t exclude objects with some bar votes — presumably higher purity could be achieved by lowering the threshold for exclusion. And again, there are a few lenticulars in the spiral sample. As my sadly departed friend Jean Tate often commented the galaxy zoo decision tree doesn’t lend itself very well to identifying lenticulars.

IC 2227. Maybe a short bar?
UGC 10381. Classified as S0/a in RC3

Unfortunately I have nothing useful to say about Fraser-Mckelvie’s main research topic, which was to decide if, and perhaps why, barred spirals have lower star formation rates than otherwise similar non-barred ones. 500+ galaxies are far more than I can analyze with my computing resources. Perhaps a really high purity sample would be manageable. I may post an individual example or two anyway. The MaNGA view of grand design spirals in particular can be quite striking.

select into gzbars
  m.mangaid,
  m.plateifu,
  m.plate,
  m.objra,
  m.objdec,
  m.ifura,
  m.ifudec,
  m.mngtarg1,
  m.drp3qual,
  m.nsa_z,
  m.nsa_zdist,
  m.nsa_elpetro_mass,
  m.nsa_elpetro_phi,
  m.nsa_elpetro_ba,
  m.nsa_elpetro_th50_r,
  m.nsa_sersic_n,
  gz.survey,
  gz.t01_smooth_or_features_count as count_features,
  gz.t01_smooth_or_features_a02_features_or_disk_debiased as f_disk,
  gz.t03_bar_count as count_bar,
  gz.t03_bar_a06_bar_debiased as f_bar,
  gz.t04_spiral_count as count_spiral,
  gz.t04_spiral_a08_spiral_debiased as f_spiral,
  gz.t06_odd_count as count_odd,
  gz.t06_odd_a15_no_debiased as f_notodd
from mangaDrpAll m
join mangaGalaxyZoo gz on gz.mangaid = m.mangaid
where
  m.mngtarg2=0 and
  gz.t04_spiral_count >= 20 and
  gz.t03_bar_count >= 20 and
  gz.t01_smooth_or_features_a02_features_or_disk_debiased > 0.43 and
  gz.t03_bar_a06_bar_debiased >= 0.5 and
  gz.t04_spiral_a08_spiral_debiased > 0.8 and
  gz.t06_odd_a15_no_debiased > 0.5 and
  m.nsa_elpetro_ba >= 0.5 and
  m.mngtarg1 >= 1024 and
  m.mngtarg1 < 8192
order by m.plateifu

select into gzspirals
  m.mangaid,
  m.plateifu,
  m.plate,
  m.objra,
  m.objdec,
  m.ifura,
  m.ifudec,
  m.mngtarg1,
  m.drp3qual,
  m.nsa_z,
  m.nsa_zdist,
  m.nsa_elpetro_mass,
  m.nsa_elpetro_phi,
  m.nsa_elpetro_ba,
  m.nsa_elpetro_th50_r,
  m.nsa_sersic_n,
  gz.survey,
  gz.t01_smooth_or_features_count as count_features,
  gz.t01_smooth_or_features_a02_features_or_disk_debiased as f_disk,
  gz.t03_bar_count as count_bar,
  gz.t03_bar_a06_bar_debiased as f_bar,
  gz.t04_spiral_count as count_spiral,
  gz.t04_spiral_a08_spiral_debiased as f_spiral,
  gz.t06_odd_count as count_odd,
  gz.t06_odd_a15_no_debiased as f_notodd
from mangaDrpAll m
join mangaGalaxyZoo gz on gz.mangaid = m.mangaid
where
  m.mngtarg2=0 and
  gz.t04_spiral_count >= 20 and
  gz.t03_bar_count >= 20 and
  gz.t01_smooth_or_features_a02_features_or_disk_debiased > 0.43 and
  gz.t03_bar_a06_bar_debiased <= 0.2 and
  gz.t04_spiral_a08_spiral_debiased > 0.8 and
  gz.t06_odd_a15_no_debiased > 0.5 and
  m.nsa_elpetro_ba >= 0.5 and
  m.mngtarg1 >= 1024 and
  m.mngtarg1 < 8192
order by m.plateifu

Continue reading “Using Galaxy Zoo classifications to select MaNGA samples”

Star formation history modeling and visualization — short note

One of my favorite visualization tools for displaying results of star formation history modeling is a ribbon plot of star formation rate versus look back time. This simply plots the marginal posterior mean or median SFR along with 2.5 and 97.5th percentiles (by default) as the lower and upper boundaries of the ribbon. I’ve been using the simplest possible definition of the star formation rate, namely the total mass born in each age bin divided by the interval between the nominal age of each SSP model and the next younger. One striking feature of these plots that’s especially evident in grids of SFH models for an entire galaxy like the one shown in my last post is that there are invariably jumps in the SFR at certain specific ages as marked below. This particular example is from a sample of passively evolving Coma cluster galaxies, but the same features are seen regardless of the likely “true” star formation history.

sfrh
Star formation rate history estimate – emiles with BaSTI isochrones

These jumps are artifacts of course: the BaSTI isochrones used for the EMILES SSP models that I currently use are tabulated at age intervals that are constant over certain age ranges, with jumps at 4 ages1100 Myr, 500 Myr, 1 Gyr, and 4 Gyr. The jumps in model SFR occur exactly at the breaks in the age intervals. This turns out to be due to an otherwise welcome feature of the SFH models that they “want” to produce SSP contributions that vary smoothly with age as shown below for the same model run. So for example the stellar mass born in the 90-100 Myr age bin per the model is about 90% of that in the 100-150 Myr bin while the time interval increases by a factor of 5, so the model SFR declines by a factor 4.5 or around 0.6 dex.

mass_history
Modeled mass born in each age bin – central spectrum of plateifu 9876-12702

Can I do anything about this? Should I? Changing how I calculate the star formation rate might work — this is after all a derivative and I’m currently using the most stupidly simple numerical approximation possible. It also might help to adjust the effective ages of each SSP model. I should also look at the priors on the SSP model coefficients, although as I noted some time ago it’s hard to affect the model star formation histories much with adjustments to priors.

These jumps are something of a peculiarity of the BaSTI isochrones. I had previously used a subset of MILES SSP models from Padova isochrones, which are tabulated at equal logarithmic age intervals. A comparison model run lacks large jumps except for an early time burst. Since the youngest age bin in the Padova isochrones is around 60 Myr I had added two younger SSP models from an update of the BC03 library, and these show abrupt jumps in model SFR. This is also the case with the youngest age bin in my currently used EMILES library.

sfr_compare
Comparison of star formation rate history estimates: Red – EMILES SSP libraries with BaSTI isochrones Blue – Miuscat SSP library with Padova isochrones

A final comment about these visualizations is that often the mode of the posterior distribution of an SSP model contribution is near 0, and it might make sense to display one sided confidence intervals since what we’re really constraining is an upper limit. I may work on this in the future.