Tried to write; total fail. Doing stuff for my two jobs. Not complaining! Just not researching, not today.
At Gaia DR2 prep meeting, I discussed comoving stars and related matters with Oh and Price-Whelan. We discussed moving from our work (in DR1) that made use of marginalized likelihoods for catalog generation to a parameter-estimation method. What would that look like? As my loyal reader knows, I prefer parameter-estimation methods, for both pragmatic and philosophical reasons. But once you go to parameter-estimation methods, there are lots of parameters you could in principle estimate. For example: You can look at the space-time event at which the two stars made their closest approach in the past, and how far apart they were at that point. If the separation is small, then coeval? That might be much more interesting than co-moving, in the long run.
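For intuition, here is a minimal sketch (toy numbers, straight-line motion in a shared Cartesian frame, no Galactic potential) of finding the time and separation of closest approach for two stars:

```python
import numpy as np

def closest_approach(x1, v1, x2, v2):
    """Time and separation of closest approach for two stars moving
    on straight lines: x_i(t) = x_i + v_i * t (t < 0 is the past)."""
    dx = np.asarray(x2, dtype=float) - np.asarray(x1, dtype=float)  # relative position now
    dv = np.asarray(v2, dtype=float) - np.asarray(v1, dtype=float)  # relative velocity
    # minimize |dx + dv * t|^2  ->  t* = -(dx . dv) / (dv . dv)
    t_star = -np.dot(dx, dv) / np.dot(dv, dv)
    sep = np.linalg.norm(dx + dv * t_star)
    return t_star, sep

# toy pair that was coincident 10 time units in the past
t, d = closest_approach([0., 0., 0.], [1., 0., 0.],
                        [10., 0., 0.], [2., 0., 0.])
```

A real version would propagate the full Gaia astrometric uncertainties into a posterior over (t, separation), and integrate orbits in a potential rather than assuming straight lines.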
At Stars group meeting, Allyson Sheffield (CUNY) and Jeff Carlin (LSST) showed us results on abundances of M-type giant stars in the Sagittarius tidal streams. They can clearly see that the progenitor of the stream had element-abundance gradients in it prior to tidal stripping. They also show that the stream matches onto the abundance trend of the Sagittarius dwarf body. But the coolest thing they showed is that there are two different types of alpha elements, which they called explosive and hydrostatic, and the two types have different trends. I need to check this in APOGEE! Sheffield also mentioned some (possibly weak) evidence that the bifurcation in the stream is not from multiple wraps of the stream but rather because the object that was tidally shredded was a binary (galaxy plus satellite) pair! I hope that's true, because it's super cool.
Because of work Bedell did (on a Sunday!) in support of the Milky Way Mapper meeting, I got renewed excitement about our element-abundance-space dimensionality and diversity work: She was able to show that we can see aspects of the low dimensionality of the space in the spectra themselves, mirroring work done by Price-Jones (Toronto) in APOGEE, but with more specificity about the abundance origins of the dimensionality. That got me writing text in a document. As my loyal reader knows, I am a strong believer in writing text during (not after) the data-analysis phases. I'm also interested in looking at information-theoretic or prediction or measurement approaches to dimensionality.
The day started with a realization by Price-Whelan (Princeton) and me that, in our project The Joker, because of how we do our sampling, we have everything we need at the end of the sampling to compute precisely the fully marginalized likelihood of the input model. That's useful, because we are not just making posteriors, we are also making decisions (about, say, what to put in a table or what to follow up). Of course (and as my loyal reader knows), I don't think it is ever a good idea to compute the FML!
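The idea can be illustrated on a toy problem: with samples drawn from the prior (as in The Joker's rejection step), the fully marginalized likelihood is just the prior-sample average of the likelihood. A hedged one-dimensional sketch, with made-up Gaussian numbers chosen so the answer is known analytically:

```python
import numpy as np

rng = np.random.default_rng(42)

# toy problem: one datum y ~ Normal(theta, sigma), prior theta ~ Normal(0, 1)
y, sigma = 0.5, 0.3

def likelihood(theta):
    return np.exp(-0.5 * ((y - theta) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Monte Carlo estimate of the fully marginalized likelihood (FML):
# Z = integral of L(theta) p(theta) dtheta  ~=  mean of L over prior draws
thetas = rng.normal(0.0, 1.0, size=200_000)
Z_mc = likelihood(thetas).mean()

# analytic answer for this Gaussian toy: y ~ Normal(0, sqrt(sigma^2 + 1))
s = np.sqrt(sigma ** 2 + 1.0)
Z_true = np.exp(-0.5 * (y / s) ** 2) / (s * np.sqrt(2 * np.pi))
```

This is a cartoon, not The Joker's actual pipeline; the point is only that if you already hold likelihood evaluations at prior samples, the FML estimate comes nearly for free.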
At lunch, Paul Steinhardt (Princeton) gave a great blackboard talk about the idea that the Universe might have started in a bounce from a previously collapsing universe. His main point (from my perspective; he also has particle-physics objectives) is that the work that inflation does with a quantum mechanism might be possible to achieve with a classical mechanism, if you could design the bounce right. I like that, of course, because I am skeptical that the primordial fluctuations are fundamentally quantum in nature. I have many things to say here, but I'll just say a few random thoughts: One is that the strongest argument for inflation is the causality argument, and causality can be addressed with other space-time histories, like a bounce. That is, the causality problem (and its relatives) is fundamentally about the geometry of the space and the horizon as a function of time, and there are multiple possible universe-histories that would address it. So that's a good idea. Another random thought is that there is no way to make the bounce happen (people think) without violating the null-energy condition. That's bad, but so are various things about inflation! A third thought is that the pre-universe (the collapsing one) probably has to be filled with something very special, like a few scalar fields. That's odd, but so is the inflaton! And those fields could be classical. I walked into this talk full of skepticism, and ended up thinking it's a pretty good program to be pursuing.
Today was the (unfortunately Sunday) start to the first full meeting of the Milky Way Mapper team, where MWM is a sub-part of the proposed project SDSS-V, of which I will be a part. It was very exciting! The challenge is to map a large fraction of the Milky Way in red-giant stars (particularly cool, luminous giants), but also get a full census of binary stars in different states of evolution, and follow up exoplanets and other scientific goals. Rix was in town, and pointed out that the survey needs a description that can be stated in two sentences. Right now it is a mix of projects, and doesn't have a description shorter than two dense slides! But it's really exciting and will support an enormous range of science.
There were many highlights of the meeting for me, most of them about technical issues like selection function, adaptive survey design, and making sensitive statistical tests of exoplanet systems. There was also a lot of good talk about how to do non-trivial inferences about binary-star populations with very few radial-velocity measurements per star. That is where Price-Whelan and I shine! Another subject that I was excited about is how one can design a survey that is simultaneously simple to operate but also adaptive as it goes: Can we algorithmically modify what we observe and when, based on past results, increasing efficiency (on, say, binary stars or exoplanets), while nonetheless producing a survey that is possible to model and understand for population statistics? Another subject was validation of stellar parameter estimates: How do we know that we are getting good answers? As my loyal reader can anticipate, I was arguing that such tests ought to be made in the space of the data. Can they be?
Adrian Price-Whelan (Princeton) and Chervin Laporte (Victoria) convened a meeting at Flatiron today to discuss the outer disk. It turned into a very pleasurable free-for-all in part because Kathryn Johnston (Columbia) came down and Sergey Koposov (CMU) was in town for it! We argued about what are the best tracers for fast or early Gaia DR2 results on the warp and other outer-disk structure, which looks non-trivial and interesting. One thing I proposed, which I would like to think about more, is taking the disk-warping simulations of Laporte and using them to inspire or generate a set of basis functions for disk modes in which expected warps and wiggles are compactly described. Then we could fit the Gaia data with these modes and have a regularized but non-parametric model of the crazy.
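To make the mode-fitting idea concrete, here is a toy sketch with plain Fourier modes standing in for simulation-inspired basis functions, and ridge regularization keeping unneeded amplitudes small (all numbers invented):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy "warp": vertical displacement z(phi) sampled at tracer azimuths
phi = rng.uniform(0, 2 * np.pi, 300)
z_true = 0.3 * np.sin(phi) + 0.1 * np.cos(2 * phi)  # m=1 warp plus m=2 wiggle
z_obs = z_true + 0.05 * rng.normal(size=phi.size)

# design matrix of low-m Fourier modes; in the real project these would be
# basis functions generated from the Laporte disk-warping simulations
ms = np.arange(1, 6)
A = np.concatenate([np.ones((phi.size, 1)),
                    np.sin(np.outer(phi, ms)),
                    np.cos(np.outer(phi, ms))], axis=1)

# ridge-regularized least squares: amplitudes stay small unless the data demand them
lam = 1.0
coef = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ z_obs)
```

The regularization is what makes the model "regularized but non-parametric": you can throw in many modes and let the data pick out the few with support.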
Late in the day, Ana Bonaca (Harvard) and I walked through our full paper and results on the information in streams with Johnston and Price-Whelan. They gave us lots of good feedback on how to present our results and what to emphasize.
The highlight of my low-research day was a great seminar by Eddie Schlafly (LBL) about Milky Way dust. He showed that he can build three-dimensional models (and maybe four-dimensional, because radial velocities are available) from PanSTARRS and APOGEE data (modeling stellar spectra and photometry) and he showed that he can even map the extinction curve in three dimensions! That reveals new structures. It is very exciting that in the near future we might be able to really build a dynamical model of the Milky Way with dust as a kinematic tracer. Also interesting to think about the connection to CMB missions. He showed a ridiculous Planck polarization map that I hadn't seen before: It looks like a painting!
We got way too many applications for the #GaiaSprint. This is a great problem to have, although it is giving me an ulcer: Almost every applicant is obviously appropriate for the Sprint and should be there! So the SOC discussed ways we could expand the Sprint but maintain its culture of intimacy and fun.
In Gaia DR2 prep workshop, we discussed our preparations for joining the Kepler data (and especially the whole Kepler Input Catalog, the KIC) with the data from Gaia DR2. We are hoping to have this done within minutes of the data release, making use of the high-end ESA data systems. This activity resulted in the submission of a trouble ticket to the Gaia helpdesk.
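For concreteness, a brute-force sketch of the kind of positional crossmatch involved (made-up coordinates; a real KIC-to-Gaia match would also use proper motions and magnitudes, and a k-d tree or the ESA archive's own crossmatch services rather than an all-pairs distance matrix):

```python
import numpy as np

def crossmatch(ra1, dec1, ra2, dec2, radius_arcsec=2.0):
    """Nearest-neighbor sky crossmatch (brute force): for each source in
    catalog 1, find the closest source in catalog 2 within the radius."""
    d2r = np.pi / 180.0

    def xyz(ra, dec):
        # unit vectors on the sphere, so separations are exact great-circle angles
        ra, dec = np.asarray(ra) * d2r, np.asarray(dec) * d2r
        return np.stack([np.cos(dec) * np.cos(ra),
                         np.cos(dec) * np.sin(ra),
                         np.sin(dec)], axis=-1)

    u1, u2 = xyz(ra1, dec1), xyz(ra2, dec2)
    cosd = np.clip(u1 @ u2.T, -1.0, 1.0)
    idx = np.argmax(cosd, axis=1)  # nearest neighbor in catalog 2
    sep = np.degrees(np.arccos(cosd[np.arange(len(u1)), idx])) * 3600.0
    ok = sep <= radius_arcsec
    return idx, sep, ok

# toy example: one close pair, one unmatched source
idx, sep, ok = crossmatch([290.0, 295.0], [44.5, 40.0],
                          [290.0001, 123.0], [44.5001, -20.0])
```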
At stars group meeting, way too much happened to report. But Ben Pope (NYU) showed that his work on using L1 to regularize the optimization of photometric apertures works extremely well in some cases, but is very brittle, for reasons we don't yet understand. Simon J Murphy (Sydney) started to talk about what he and Foreman-Mackey (Flatiron) have achieved in his week-long visit, but he got side-tracked (by me) onto how awesome delta Scuti stars are, and why. And Ana Bonaca (Harvard) gave an overview of what we are doing with stellar streams.
Today was a low-research day! Research was pretty much limited to a (great) call with Rix (MPIA) and Eilers (MPIA). We discussed several important successes of Eilers's work on latent-variable models. One is that she finds that she can improve the performance of The Cannon operating on stellar spectra if she reduces the dimensionality of the stellar spectra before she starts! That's crazy; how can you throw away information and do better? I think the answer must have something to do with model wrongness: The model is wrong (as all models are), and it is probably less wrong in the projected space than it was in the original pixel basis. This all relates to data representation issues that I have worried about (but done nothing about) before.
Another important success is that Eilers can run the Gaussian-Process latent-variable model on the dimensionality-reduced space much, much faster than on the original data space, and not only does it do better than it did before, it does better than The Cannon. That's great, but it isn't just performance we are looking for: The GPLVM has better model structure, such that we can infer labels without having training data that have nuisance-parameter labels. That is, we can make a predictive model for the interesting subspace of the label space. This is tremendously important going into Gaia DR2, because we want to train a spectroscopic parallax method using only geometric inputs: No stellar models, ever!
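As a toy illustration of why dimensionality reduction can be harmless or even helpful here: when spectra are generated by a few latent parameters, the top principal components capture essentially all the structure, and everything discarded is (mostly) noise. A sketch with fabricated data:

```python
import numpy as np

rng = np.random.default_rng(1)

# toy "spectra": 500 stars x 1000 pixels, generated by 3 latent factors plus noise
n_stars, n_pix, n_latent = 500, 1000, 3
latents = rng.normal(size=(n_stars, n_latent))
basis = rng.normal(size=(n_latent, n_pix))
spectra = latents @ basis + 0.1 * rng.normal(size=(n_stars, n_pix))

# PCA via SVD of the mean-subtracted data matrix
X = spectra - spectra.mean(axis=0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)
reduced = X @ Vt[:n_latent].T  # each star compressed to 3 numbers

# fraction of the total variance captured by the top 3 components
var_frac = (S[:n_latent] ** 2).sum() / (S ** 2).sum()
```

In this fabricated setup the 1000-pixel spectra compress to 3 numbers with almost no loss, which is the regime in which a downstream model (Cannon or GPLVM) can only get faster and possibly less wrong.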
Ana Bonaca (Harvard) arrived in town for a week of hacking on our stream-information project. She spent today getting more streams into the analysis. The point of the project is not to model each stream in detail, but rather to examine, using Fisher Information, the information that each stream (or any combination of streams) brings to the measurement of gravitational-potential parameters. We also worked on paper scope and our original goal (way long ago) of constraining the mass and orbit of the LMC.
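The Fisher-information machinery is simple to sketch: for Gaussian errors, F = B^T C^-1 B, with B the Jacobian of the predicted observables with respect to the potential parameters, and F^-1 bounding the parameter covariance. A toy version with a made-up two-parameter model standing in for a real stream model:

```python
import numpy as np

def fisher_information(model, theta, sigma, eps=1e-6):
    """Fisher matrix F = B^T C^-1 B for independent Gaussian errors, with the
    Jacobian B computed by central finite differences of the predictions."""
    theta = np.asarray(theta, dtype=float)
    f0 = model(theta)
    B = np.zeros((f0.size, theta.size))
    for j in range(theta.size):
        dt = np.zeros_like(theta)
        dt[j] = eps
        B[:, j] = (model(theta + dt) - model(theta - dt)) / (2 * eps)
    Cinv = np.diag(1.0 / np.asarray(sigma) ** 2)
    return B.T @ Cinv @ B

# toy "stream track": observables depend on two potential-like parameters
t = np.linspace(0, 1, 50)
model = lambda th: th[0] * t + th[1] * t ** 2
F = fisher_information(model, [1.0, 0.5], sigma=0.1 * np.ones(t.size))
```

Because F is additive over independent data, the information from any combination of streams is just the sum of the per-stream Fisher matrices, which is what makes this framing cheap to evaluate.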
Today's parallel-working session at NYU was a dream. Richard Galvez (NYU) is working with Rob Fergus (NYU) to train a generative adversarial network on images of galaxies. One issue with GANs is that the generator can produce fake data that occupy only a subspace of the full data space and still do well adversarially. So Galvez is using a clustering (k-means) in the data space, and looking at the populations of the clusters in the true data and in the generated data, to see that coverage is good. This is innovative, and important if we are going to use these GANs for science.
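A minimal sketch of that coverage test (toy two-mode data, a hand-rolled k-means, and a "generated" sample that misses one mode):

```python
import numpy as np

rng = np.random.default_rng(3)

def kmeans(X, k, iters=50, seed=0):
    """Tiny Lloyd's-algorithm k-means; fine for a toy, use a library for real work."""
    r = np.random.default_rng(seed)
    centers = X[r.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == i].mean(axis=0) if np.any(labels == i)
                            else centers[i] for i in range(k)])
    return centers

def occupancy(X, centers):
    """Fraction of points falling in each cluster."""
    labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
    return np.bincount(labels, minlength=len(centers)) / len(X)

# "real" data: two well-separated modes; "generated" data covers only one
real = np.concatenate([rng.normal(-5, 1, (500, 2)), rng.normal(5, 1, (500, 2))])
fake = rng.normal(5, 1, (1000, 2))

centers = kmeans(real, k=2)
occ_real = occupancy(real, centers)
occ_fake = occupancy(fake, centers)
mode_gap = np.abs(occ_real - occ_fake).max()  # a large gap flags missing coverage
```

This is my cartoon of the test, not Galvez's actual pipeline; for galaxy images the clustering would live in some feature space rather than raw pixels.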
Kate Storey-Fisher (NYU) is making something like adversarial (there's that word again) mock catalogs for large-scale structure projects: She is going to make the selection function in each patch of the survey a nonlinear function of the housekeeping data (point-spread function, stellar density, transparency, season, and so on) we have for that patch. Then we can see what LSS statistics are robust to the crazy. These mocks will be adversarial in the sense that they will represent a universe that is out to trick us, while GANs are adversarial in the sense that they use an internal competitive game for training.
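A toy sketch of the mock-making idea, with invented housekeeping variables and an arbitrary nonlinear selection function (the real project would use the survey's actual per-patch metadata):

```python
import numpy as np

rng = np.random.default_rng(4)

# housekeeping data per sky patch: PSF width, stellar density, transparency
n_patch = 1000
psf = rng.uniform(0.8, 1.6, n_patch)
stars = rng.uniform(0.0, 1.0, n_patch)
transp = rng.uniform(0.6, 1.0, n_patch)

def selection(psf, stars, transp):
    """An 'adversarial' nonlinear selection probability built from the
    housekeeping data; the functional form here is arbitrary."""
    return 1.0 / (1.0 + np.exp(4 * (psf - 1.2))) * transp ** 2 * np.exp(-stars)

p = selection(psf, stars, transp)

# thin a uniform mock catalog: observed counts per patch follow the selection
true_counts = rng.poisson(100, n_patch)
obs_counts = rng.binomial(true_counts, p)
```

Running the LSS estimators on many such mocks, each with a different out-to-trick-us selection, is what tells you which statistics are robust.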
And as I was explaining why I am disappointed with the choices that LSST has made for broad-band filters, Alex Malz (NYU) and I came up with an inexpensive and executable proposal that would satisfy me and improve LSST. It involves inexpensive and easy-to-make stochastically ramped filters. I don't think there is an iceball's chance in hell that the Collaboration would even for a moment consider this plan, but the proposal is a good one. I guess this is adversarial in a third sense!
I got up at 0530 and looked at the participants and schedule for the SPHEREx workshop. I realized that I had prepared precisely the wrong talk yesterday! So I threw away my slides and made completely new slides. It was rushed. I forgot things. But it was still an improvement. I switched from saying things about scientific goals to saying things about technical improvements or extensions that could make the project more capable in respects that would serve the needs of (among other things) stellar science.
I then headed in to the workshop; I could only make it to the second day. I learned so much today. I can't do it justice. Here are some random facts: A lot could be learned about exoplanets if we could get bolometric fluxes for the stars.
I knew this already, I guess, but the prospects for SPHEREx here are excellent, if the project can deliver absolutely calibrated flux densities. There is a mass–metallicity relationship inside the Solar System! The Solar System contains Trojan satellites/asteroids around Neptune, not just Jupiter! There is no model for the zodiacal light in the Solar System that matches the observations to the level of precision that an infrared survey would need to remove or avoid it. The zodiacal light is consistent with being made up of ground up asteroids and evaporated comets! ALMA has observed many debris disks around nearby stars; some of these are angularly huge. The poster child is Fomalhaut, which has a thin, elliptical ring. It's a crazy thing. I learned these things from a combination of Dan Stevens (OSU), Jennifer Burt (MIT), Carey Lisse (JHU), and Meredith MacGregor (Harvard), but that's just a tiny sampling.
At the end of the day there was discussion of calibration, led by Doug Finkbeiner (CfA) and me. I very much enjoy the technical challenges for SPHEREx and the enthusiasm of the team taking them on.
It was a very low-research day! But on the train to Boston, I prepared slides for a short talk at a meeting at Harvard about the SPHEREx mission concept. I wrote about how this cosmology mission (line intensity mapping and large-scale structure) might revolutionize our knowledge of stars in the Milky Way.
Simon J Murphy (Sydney) is in town for two weeks of hacking with Dan Foreman-Mackey (Flatiron). On arrival last week, the two of them implemented something I have been wanting to do for a long time, which is use asteroseismic phase shifts to find binary companions (yes, people have done this for a while now) but without binning the data or ever explicitly measuring per-bin time delays. This week (having solved that) they are looking at radial-velocity predictions from those discoveries, and testing them with HIRES spectra. They teamed up with Megan Bedell (Flatiron) to use her wobble system to make these measurements. All I did was cheer-lead.
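The no-binning idea can be sketched directly: write the light-travel delay into the phase of the oscillation and fit its amplitude against the unbinned time series. A toy version (not their actual implementation) with one pulsation mode and a circular orbit:

```python
import numpy as np

rng = np.random.default_rng(5)

# pulsator with frequency nu whose phase is delayed by binary motion:
# flux(t) = sin(2 pi nu (t - tau(t))),  tau(t) = a * sin(2 pi t / P_orb)
nu, a, P_orb = 20.0, 0.002, 100.0
t = np.sort(rng.uniform(0, 300, 2000))
tau = a * np.sin(2 * np.pi * t / P_orb)
flux = np.sin(2 * np.pi * nu * (t - tau)) + 0.05 * rng.normal(size=t.size)

def chi2(a_trial):
    """Misfit of a trial delay amplitude, evaluated on the unbinned data."""
    tau_trial = a_trial * np.sin(2 * np.pi * t / P_orb)
    model = np.sin(2 * np.pi * nu * (t - tau_trial))
    return ((flux - model) ** 2).sum()

# fit the delay amplitude directly: no bins, no per-bin time delays
grid = np.linspace(0, 0.004, 81)
a_best = grid[np.argmin([chi2(g) for g in grid])]
```

A real analysis would fit many modes at once, sample the orbital period and phase too, and work in a likelihood framework rather than on a chi-squared grid.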
In the afternoon, Alex Malz (NYU) and I discussed what we might do in an upcoming LSST transient classification challenge. I am interested in the following question: Say you have two sparsely and irregularly sampled light curves of two transient events that are intrinsically similar but maybe at different redshifts, and you want to see that they are similar. How do you construct a relevant, useful, and tractable similarity or distance metric? I have lots of ideas; if we can solve this, we might have something to contribute.
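One candidate metric can be sketched with a Gaussian process: condition a GP on one light curve and score the other by its predictive log-likelihood, which handles sparse and irregular sampling naturally. A toy version with invented data (ignoring redshift stretching, which a real metric would have to marginalize over):

```python
import numpy as np

rng = np.random.default_rng(6)

def gp_loglike(t_ref, y_ref, t_new, y_new, ell=5.0, amp=1.0, noise=0.1):
    """Predictive log-likelihood of (t_new, y_new) under a squared-exponential
    GP conditioned on (t_ref, y_ref); higher means more similar."""
    k = lambda a, b: amp * np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)
    Krr = k(t_ref, t_ref) + noise ** 2 * np.eye(t_ref.size)
    Knr = k(t_new, t_ref)
    Knn = k(t_new, t_new) + noise ** 2 * np.eye(t_new.size)
    mu = Knr @ np.linalg.solve(Krr, y_ref)
    cov = Knn - Knr @ np.linalg.solve(Krr, Knr.T)
    r = y_new - mu
    sign, logdet = np.linalg.slogdet(cov)
    return -0.5 * (r @ np.linalg.solve(cov, r) + logdet + r.size * np.log(2 * np.pi))

# two sparse samplings of the same toy transient, plus an unrelated flat event
shape = lambda t: np.exp(-0.5 * ((t - 20) / 6) ** 2)
t1, t2, t3 = (np.sort(rng.uniform(0, 50, 12)) for _ in range(3))
y1 = shape(t1) + 0.05 * rng.normal(size=12)
y2 = shape(t2) + 0.05 * rng.normal(size=12)
y3 = 0.05 * rng.normal(size=12)

same = gp_loglike(t1, y1, t2, y2)  # should score high
diff = gp_loglike(t1, y1, t3, y3)  # should score low
```

The kernel and its hyperparameters here are placeholders; the hard part of the real problem is making the metric invariant (or marginal) over redshift, amplitude, and time of peak.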