They blinded it for science

By Lori Ann White

At first glance, the Blind Analysis workshop hosted by KIPAC last month was only one of many gatherings during a very busy March (which actually started in the last few days of February). On the SLAC campus, the Institute hosted the Cosmology with CMB-S4 workshop, then immediately began preparing to assist with the New Horizons in Inflationary Cosmology conference two days later. Hard on the heels of these gatherings came the LZ collaboration meeting. A social event for prospective graduate students and a planning workshop on how to get the most science out of the next generation of X-ray telescopes finished off the month.

This list of activities doesn’t even include all the usual seminars, colloquia, and talks that happen every week. In total, more than 450 guests roamed KIPAC’s halls in the last couple of months, either at SLAC National Accelerator Laboratory or Stanford University. Theorists discussed the possible origins of the universe, experimentalists talked about how to build the most exacting instruments to get the best data, and data analysts talked code optimization and structure (and sometimes, they talked about everything at once).

And then there were the Blind Analysis workshop attendees, who—in effect—gathered to discuss themselves.

Blind Analysis workshop attendees at break. (Credit: KIPAC.)
Nobel laureate Saul Perlmutter (center) talks experimenter bias with KIPAC’s Josh Meyers and Aaron Roodman (from left) as well as other workshop attendees. (Credit: KIPAC.)

Scientific introspection is necessary because of a known unknown in the world of experimentation called experimenter bias: the name given to all the ways in which simply being human can affect how scientists take, analyze, and interpret data. Experimenter bias is a known, in that it is well documented to exist, but it's also unknown, because the psychology of the human scientist can't be easily quantified. Researchers can't add error bars indicating the strength of a preconceived notion or the weight of an unconscious desire. Yet subtle and not-so-subtle influences definitely exist, such as the result of a previous experiment or the prediction of a popular theory, and these influences can nudge a researcher into accepting a result prematurely or discounting results that don't conform to expectation.

Historical view of measurements of c, the speed of light in a vacuum. (Image courtesy A. Roodman.)
Different values for the speed of light through a century of measurement. Note the values determined during the 1930s and '40s (red box, lower right corner). They are significantly different from values determined during the previous two decades, but are close to each other, and are thought to have influenced each other. (Figure from Joshua R. Klein and Aaron Roodman, "Blind Analysis in Nuclear and Particle Physics," Annual Review of Nuclear and Particle Science 55 (2005): 141–163.)

This is where “blind analysis” comes in. The term is a general descriptor for any method scientists build into their work to remove all-too-human foibles and frailties from data analysis. In a blind analysis, scientists cannot be tempted, even subconsciously, to influence the results of an experiment, because they hide something about the experiment from themselves. There are multiple approaches. They can lock some of the data away in a “black box” while they use the rest to determine the best way to analyze it. They can “salt” the data, adding extra data that looks real but actually isn’t, then pull the salt back out when they do the final analysis. Or they can shift the numbers themselves, adding an unknown offset or multiplying by an unknown factor (or both) to obscure what will be the final result, then removing the blinding at the very end.
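The last technique above, hiding a measured value behind secret numbers, can be sketched in a few lines. This is an illustrative toy, not any real experiment's pipeline; the names and the choice of offset and scale ranges are assumptions made for the example.

```python
# A minimal sketch of blinding by secret offset and scale: analysts see only
# the transformed values, and the transform is inverted only after the
# analysis procedure is frozen. All names here are illustrative.
import numpy as np

rng = np.random.default_rng(seed=20170301)  # in practice the seed is kept secret

# Secret blinding factors, generated once and stored away from the analysts.
secret_offset = rng.uniform(-0.5, 0.5)
secret_scale = rng.uniform(0.8, 1.2)

def blind(values):
    """Obscure measured values with the secret scale and offset."""
    return secret_scale * np.asarray(values) + secret_offset

def unblind(blinded_values):
    """Invert the blinding transform; run only once the analysis is frozen."""
    return (np.asarray(blinded_values) - secret_offset) / secret_scale

# Hypothetical repeated estimates of some cosmological parameter.
measurements = np.array([0.31, 0.29, 0.33])
blinded = blind(measurements)

# Analysts tune their cuts and fits on `blinded`; the true values stay hidden
# until the final step, when unblinding recovers them exactly.
recovered = unblind(blinded)
assert np.allclose(recovered, measurements)
```

The point of the design is that every analysis choice is made while looking at numbers that carry no information about whether the result matches expectations.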

Real vs. fake gravitational wave signals. (Credit: The LOSC Team and the LIGO Scientific Collaboration.)
The image on the left is a fake gravitational wave signal, added to LIGO data in 2010. The image on the right is an actual gravitational wave signal from 2015. (Credit: The LOSC Team and the LIGO Scientific Collaboration.)

These techniques have been used in particle physics experiments for decades, but are still rare in the realms of astrophysics and cosmology, chiefly because we’ve only recently entered the era of precision astrophysics and cosmology. In other words, these two disciplines have only recently begun to achieve observational results that are precise enough to make experimenter bias worrisome.

Yet it didn’t take long for the scientific stakes to get high enough to make reducing experimenter bias a crucial part of several current and upcoming experiments.

As KIPAC professor Aaron Roodman explained in his introductory talk, the stakes are rising because of the type of data researchers need, the amount of data they need, and the mounting expense of getting it. After a few hundred years of pointing telescopes at individual targets, they need to broaden their scope (no pun intended) to get the big picture of the universe as a whole, so they can measure the parameters that govern how it behaves: the Hubble constant (which tells how quickly galaxies are receding from each other), how much matter exists as a fraction of the universe's total matter-energy content, or how strong dark energy really is.

To do this, astrophysicists and cosmologists need surveys, and surveys need different—and very expensive—instruments, both space- and Earth-based, that can collect light from all across the electromagnetic spectrum. Many of these instruments, such as the Large Synoptic Survey Telescope (currently under construction in Chile) or the Dark Energy Spectroscopic Instrument (planned for installation on the Mayall Telescope on Kitt Peak in Arizona in 2018), are or will be one-of-a-kind, and the data they collect will ultimately be shared publicly.

The sharing of a single data set has big implications for making sure a result is reproducible—the cross-checks and verifications so necessary to science. Workshop attendees were keenly aware that when several different scientific groups are all looking at the same data, one of the big differences in their results could be the scientists themselves.

“LSST in particular does present one unique consideration with respect to experimenter’s bias,” says Roodman. “The LSST data set is designed to be as complete an image of the whole Southern sky as possible, and given the great expense and effort involved, it may be the ultimate imaging survey.

“It’s incumbent on the LSST community to make the most of these unique data and produce lasting scientific results.”

Elisabeth Krause, KIPAC postdoc and chair of the workshop organizing committee, agreed. “I wanted to hold this workshop because blinding will be an important tool for making sure that we get the best cosmology possible out of LSST. It will be such a powerful instrument that experimenter bias could be comparable to statistical errors,” which, she says, “means we could get dark energy completely wrong.”


According to KIPAC professor Pat Burchat, one strategy of the workshop was to embrace difference in science and scientists. The organizing committee (of which she was a member) invited colleagues from a variety of backgrounds and projects: from graduate students to Nobel Prize winners; from gravitational wave hunters to dark matter sleuths to particle physicists to psychologists and social scientists. Speakers gave examples of how they had incorporated blind analysis techniques into their data analyses.

Values blinded vs unblinded, demonstrating how well blinding can obscure data values. (Credit: Franz Elsner.)
How extracted cosmological parameters Ωm (matter density), H0 (the Hubble constant), and σ8 (fluctuation amplitude of mass distribution at 8h−1 Mpc) can change in one particular version of unblinding in a DES analysis. (Credit: Franz Elsner.)

In between talks, the attendees split into smaller groups for roundtable discussions that got into the nitty-gritty of blind analysis: when to use it, what technique would work best for a particular experiment, how to convince colleagues that a project should use it, and possible drawbacks. A document capturing conclusions and recommendations from the workshop is forthcoming.

While practical considerations provided the primary motivation for the workshop (given that experimenter bias exists, what can be done about it?), attendees also had fun pondering some of the more philosophical issues. When Roodman instructed everyone present to think of a current paper and then ask themselves, “How do you even know your results are correct?” KIPAC professor Dan Akerib called it “a brain explosion moment.”

Roodman did ask that question fairly early Monday morning, but Akerib’s brain seemed to explode for bigger reasons than lack of caffeine. “We’re in the cave,” he said, referring to Plato’s Allegory of the Cave, in which our view of reality is reduced to seeing the shadows of real objects on the blank wall of a cavern. Blind analysis is just one way to keep our own shadow puppets from obscuring the view.


More Information

All talks are archived and publicly available at the workshop website.