2017-10-03

systematizing surprise; taking logs

I had a substantial conversation with Kate Storey-Fisher (NYU) about possible anomaly-search projects in cosmology. The idea is to systematize the search for anomalies, and thereby get some control over the many-hypotheses issues. There are also possible spin-offs around generating high-quality statistics (data compressions) for various purposes. We talked about the structure of the problem, and also about the kinds of limited domains in which we could start. There is also a literature search we need to do.

I also made a Jupyter notebook for Megan Bedell (Flatiron), demonstrating that there is a bias when you naively take the log of your data and average the logs, instead of averaging the data. This bias is there even when you aren't averaging; in principle you ought to correct any model you make of the log of data for this effect, or at least correct for it when you transform from linear space to log space or back again. Oh wait: This is only relevant if you are not also transforming the noise model appropriately! Obviously you should transform everything self-consistently! In this case we have nearly-Gaussian noise in the linear space (because physics) and we want to treat the noise in the log space as also Gaussian (because computational tractability). Fortunately we are working with very high signal-to-noise data, so these biases are small.
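
Here is a minimal sketch of that kind of demonstration; it is not the notebook itself, and the function name `log_mean_bias`, the true value `mu`, the noise level `sigma`, and the sample size are all assumptions chosen for illustration. It draws Gaussian-noised data in the linear space, compares the mean of the logs to the log of the mean, and checks the difference against the second-order Taylor expectation of -(1/2) (sigma / mu)^2.

```python
import numpy as np


def log_mean_bias(mu=1.0, sigma=0.1, n=1_000_000, seed=42):
    """Compare mean(log(x)) to log(mean(x)) for Gaussian noise in linear space.

    All parameter values are illustrative assumptions, not taken from real data.
    """
    rng = np.random.default_rng(seed)
    # Data with Gaussian noise in the linear space (signal-to-noise = mu / sigma).
    x = mu + sigma * rng.standard_normal(n)
    mean_of_logs = np.mean(np.log(x))
    log_of_mean = np.log(np.mean(x))
    empirical = mean_of_logs - log_of_mean
    # Second-order Taylor expansion of log(mu + e) about e = 0:
    # E[log(x)] - log(mu) is approximately -0.5 * (sigma / mu)**2.
    predicted = -0.5 * (sigma / mu) ** 2
    return empirical, predicted


empirical, predicted = log_mean_bias()
print(f"empirical bias: {empirical:.6f}")
print(f"predicted bias: {predicted:.6f}")
```

The predicted bias scales as the inverse square of the signal-to-noise, which is why it becomes small for very high signal-to-noise data.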
