Notes from a New York Exposomics Workshop, Held as the City Slid into COVID

I am at the age where conferences no longer stick in my head the way they used to, so after spending Thursday and Friday in New York at an exposomics workshop hosted by our institute, I figured I should write things down while they are still fresh.

The meeting happened under a very particular kind of pressure

Before getting to the science, the setting matters. Not the background of exposomics, but the background of New York over those two days, when COVID was beginning to break out in earnest.

At that point, the city’s medical capacity was effectively reserved for severe cases. Mild cases were being told to isolate. You could still see plenty of people on the street without masks, and those wearing them were the exception. In supermarkets, at least on the Upper East Side and in East Harlem, prices for everyday goods had not changed much. The broader impression was that neither the government nor the public was taking the situation very seriously. Chinese communities, by contrast (Chinatown, Brooklyn’s Eighth Avenue, Flushing), were already visibly quieter.

My own instinct was simple: assume you are already infected, then rely on the advantage of not being in an older age group and try not to dwell on the consequences. Government help did not seem like something to count on. Test kits were scarce, medical resources were scarce, and under those conditions it was impossible to get reliable numbers.

At the time, getting tested required fitting one of three categories: symptomatic with recent travel history, symptomatic with known contact with a confirmed case, or severe illness. That meant community transmission would largely go unmeasured unless someone deteriorated badly enough to qualify. Otherwise, isolation was the most you could expect; testing was not.

And this was New York, a city far denser than rural America. But even in sparsely populated areas, a church gathering over the weekend can collapse the distance that low population density might otherwise buy you. What I found striking was how often governments avoided talking frankly about religion as a routine high-risk setting for transmission. Whether it was Buddhist gatherings in Hong Kong, holy sites in Iran, or cult-like church clusters in South Korea, the pattern was not hard to see. Remote sermons would have been perfectly plausible. It is difficult to imagine that Buddha, Muhammad, or Jesus would prefer their followers to court infection in the name of religious practice.

Under those conditions, holding an academic meeting clearly introduced extra risk. Personally, I did not care all that much, but that does not mean everyone else should have been expected to feel the same. Modern society is deeply built around planning and commitments, and that planning usually has no room for external shocks. Cheap nonrefundable flights and prepaid arrangements eat away at people’s ability to respond rationally to new risks. Once someone has already paid, they resent the loss and keep going, even when the risk has changed drastically. The result is greater exposure for everyone.

The cruise ship example was impossible to ignore: even after the outbreak had already become obvious in early February, the Grand Princess still sailed as planned, and the result was community spread all across California.

Some people still insisted the fatality rate might not be very high, that this was basically a bad flu and did not warrant overreaction. But if you actually looked at the data from China outside Hubei, the fatality figures were not static; they were gradually rising. Both the disease course and transmissibility made it a mistake to treat this pneumonia like influenza. Roughly 1% of flu cases become severe, whereas for this disease the share was closer to 20%, and the fatality rate was at least an order of magnitude higher than that of seasonal flu.

Of course, anyone determined to dismiss inconvenient numbers can always do so. Plenty of people only trust the data that fit what they already believe, myself included. Under the right value system, anyone can decide every opposing dataset is fake. But opinions do not substitute for facts. Replacing facts with opinions and then spreading them is either stupid or malicious.

Exposomics is still young, and it did not get here easily

Now to the workshop itself.

Exposomics has had a rough path from the beginning. The concept was proposed in 2005, but for about five years there was essentially no follow-up literature. Only around 2010 did sustained attention begin to form around the contribution of non-genetic factors to disease.

Even now, genomics—the far more glamorous field—still teaches the public to think of genes as the decisive force behind disease and behavior. In historical terms, genomics only had a ten- or twenty-year head start over exposomics, and it also passed through a phase of indifference before its later explosion.

The real driver behind both fields is not branding but analytical technology. A concept becomes scientifically testable only when there are ways to measure it qualitatively and quantitatively.

As I see it, exposomics has been pushed forward by three major technical fronts:

high-resolution mass spectrometry
remote sensing and satellite data
wearable devices

Mass spectrometry makes it possible to identify and quantify chemical exposures and, in some cases, trace them back to source compounds. Satellite-based remote sensing has freed large-scale air pollution assessment from the limits of fixed monitoring stations. Wearables, meanwhile, have made real-time monitoring plausible in ways that were not practical before.

All of this depends on data science, which provides the shared framework for processing and analyzing these very different streams of information. Once those individual breakthroughs exist, older disciplines (preventive medicine, epidemiology, environmental monitoring, bioinformatics, psychology, and others) can quickly be pulled into the exposomics framework and used to examine a disease or behavior from multiple angles.

One welcome sign: people finally talked openly about reproducibility

Among the newer themes this year, one point stood out immediately. Linda Birnbaum, the former director of NIEHS, and Marie Lynn Miranda of Rice University both emphasized the importance of data sharing and transparency. They argued that reproducible research needs proper infrastructure and strong platforms to support it.

It was honestly the first time I had heard this issue put so plainly in a conference talk. That is a good sign. The era of everyone running their own isolated publish-or-perish game without common standards has needed a corrective for a long time.

Exposomics and precision medicine still barely speak to each other

Another issue came from Robert Wright, our department chair, who raised the relationship between exposomics and precision medicine.

The difference in emphasis is clear enough. Exposomics is primarily explanatory: it tries to understand why disease happens, especially through environmental and behavioral factors. Precision medicine, as it is usually practiced, is focused on drug development and treatment, and overwhelmingly organized around genetics rather than environmental exposure.

Exposomics is still a small field and has not yet developed enough institutional weight to shape precision medicine. Precision medicine, for its part, has mostly ignored exposomics altogether.

But if environmental exposure is a major driver of disease, then precision medicine should not be limited to targeted pharmaceuticals. It should also include interventions such as lifestyle change and environmental modification.

A case mentioned at the meeting makes the point well. In one family, the older child, who was in school, showed frequent anger problems, while the younger child, around four years old, did not. It later turned out that lead in printed materials was a key factor. The younger child was unaffected largely because they could not yet read and therefore had much less exposure. No amount of genetic testing would have solved that case on its own, though genetic susceptibility may still matter in many situations.

That is why the long-term path of exposomics has to move from asking why toward asking what to do. Once the field reaches that stage, precision medicine becomes a natural outlet rather than a separate enterprise.

From snapshots to trajectories: the push toward “4D data”

Manish Arora, who directs our lab, proposed the idea of environmental biodynamics. In truth, many of the talks already pointed in that direction even before the phrase was used.

The basic idea is that exposomics needs temporal tracking. If one wants the inflated version, it means the field needs 4D data. In plain language, environmental biodynamics means following disease as a dynamic process: tracking the level and role of a pollutant across different stages of disease, rather than simply classifying people into case and control groups and comparing them back and forth.

He also used the phrase deep data to distinguish this kind of longitudinal information from big data. Academic buzzwords are a universal language, apparently, but new terminology can still help a field organize itself.

Stripped of the packaging, though, this is really about integrating analytical epidemiology in a deeper way. The harder part is that disciplinary jargon walls remain very real. One group uses SAS, another uses R, and before you even get to the science you can spend half a day wrestling with data interfaces and file formats.

Remote sensing has advanced more than I had realized

There were roughly four talks this year centered on remote sensing, and I came away realizing that progress in that area is already substantial.

Its role in atmospheric pollution monitoring is no longer the whole story. Integration with social network data, geoscience data, and related streams has matured considerably. Many of the concepts were not unfamiliar to me; similar projects had shown up at R conferences years ago. But there has always been a gap when industry or internet companies try to work on these problems without enough scientific grounding. There is a great deal of domain knowledge that programmers simply do not know unless they have spent time with the actual scientific questions.

That is why I still lean toward learning programming through problems rather than learning programming first and then hunting for problems afterward. Real-world problems are not solved by building an abstract visualization and calling it done. They require thinking from multiple directions and understanding the structure of the problem itself. Starting from reality is usually more useful than polishing theory in the abstract.

One presentation referred to the three C problems of the twenty-first century: Climate, Chemical, and City. That struck me as exactly right. All three emerged in the twentieth century but did not yet dominate in the way they do now. In the twenty-first century, they are impossible to dodge: climate change, chemical exposure, and urbanization. Each one is already shaping daily life, and each one demands actual solutions.

Non-targeted analysis is still bottlenecked by identification

The talks on non-targeted analysis were less novel overall, but there was broad agreement on one key bottleneck: assigning peaks to actual substances remains difficult. It also came through more clearly than before that the number of mass spectrometry peaks is not the same thing as the number of substances, since a single compound typically shows up as several peaks.
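
To make the peaks-versus-substances point concrete, here is a minimal sketch of my own (not something shown at the workshop) of how one brominated compound turns into several peaks through bromine isotopes alone. The abundances and the mass spacing are rounded, approximate values, the function name is hypothetical, and the base m/z is simply the approximate neutral monoisotopic mass of a tetrabrominated diphenyl ether; 13C peaks, adducts, and fragments are ignored, and they would only add more peaks.

```python
from math import comb

# Approximate natural abundances of the two stable bromine isotopes.
P_BR79 = 0.5069
P_BR81 = 0.4931
# Approximate mass difference between 81Br and 79Br, in Da.
DELTA_BR = 1.99795


def bromine_isotope_pattern(n_br, base_mz):
    """Return (m/z, relative intensity) pairs for a compound with n_br bromines.

    base_mz is the m/z of the all-79Br isotopologue. Only bromine is
    considered, so this is a lower bound on the number of peaks.
    """
    peaks = []
    for k in range(n_br + 1):
        # Binomial probability of having exactly k atoms of 81Br.
        intensity = comb(n_br, k) * P_BR79 ** (n_br - k) * P_BR81 ** k
        peaks.append((base_mz + k * DELTA_BR, intensity))
    top = max(intensity for _, intensity in peaks)
    return [(mz, intensity / top) for mz, intensity in peaks]


# One tetrabrominated compound, five isotopologue peaks.
pattern = bromine_isotope_pattern(4, 481.7152)
for mz, rel in pattern:
    print(f"{mz:.4f}  {rel:.2f}")
```

A peak-picking step that does not group these isotopologues would report five features for what is a single substance.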

One especially good example came from a Yale presentation that combined non-targeted analysis, targeted screening, and wearable devices. That was at least a compelling scientific story.

People often say non-targeted analysis contains targeted analysis within it, but the validation work behind that claim is still limited. For example, how often do people actually try to recover targeted pollutants from non-targeted data without reference standards? Not often.

My own experience is that instrument software can usually do some of that work, but the sensitivity loss in full-scan acquisition is substantial. There is a real need for an open-source workflow here.
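
As a rough idea of what the simplest piece of such a workflow could look like, the sketch below screens a list of observed masses against theoretical PBDE masses within a ppm tolerance. It is written in Python purely for illustration, all function names are hypothetical, and the atomic masses are rounded. It also assumes the peak list has already been converted to neutral monoisotopic masses, which real full-scan data would not be: ionization mode, adducts, isotope patterns, and retention time would all need handling in practice.

```python
# Rounded monoisotopic atomic masses in Da (approximate values).
MASS = {"C": 12.0, "H": 1.0078250, "O": 15.9949146, "Br": 78.9183371}


def pbde_monoisotopic_mass(n_br):
    """Neutral monoisotopic mass of a PBDE homologue, C12 H(10-n) Br(n) O."""
    if not 1 <= n_br <= 10:
        raise ValueError("a PBDE carries between 1 and 10 bromines")
    return (12 * MASS["C"] + (10 - n_br) * MASS["H"]
            + n_br * MASS["Br"] + MASS["O"])


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def screen_peaks(peaks, targets, tol_ppm=5.0):
    """Return (target name, observed mass, ppm error) for every match.

    peaks is a list of observed masses; targets maps a name to a
    theoretical mass. The 5 ppm default is an arbitrary illustrative choice.
    """
    hits = []
    for name, theoretical in targets.items():
        for observed in peaks:
            err = ppm_error(observed, theoretical)
            if abs(err) <= tol_ppm:
                hits.append((name, observed, err))
    return hits


targets = {
    "tetraBDE": pbde_monoisotopic_mass(4),
    "pentaBDE": pbde_monoisotopic_mass(5),
}
observed_peaks = [300.0000, 481.7160, 559.6250]  # made-up example values
hits = screen_peaks(observed_peaks, targets)
for name, mass, err in hits:
    print(f"{name}: {mass:.4f} ({err:+.1f} ppm)")
```

A mass match alone only narrows things down to a homologue group; telling individual congeners apart still needs chromatography and, ideally, standards.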

The targeted screening side, if I get the time, is something I could probably implement in R myself. I may be one of the relatively few people who have analyzed 183 of the 209 possible PBDE congeners.
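
For anyone wondering where the figure of 209 comes from, it can be checked by brute force. The sketch below enumerates every bromine substitution pattern on the ten ring positions of diphenyl ether and merges patterns that are the same molecule. The assumptions are mine: each ring can be flipped (2 and 6 are equivalent, as are 3 and 5) and the two rings are interchangeable. The function names are hypothetical.

```python
from itertools import product

# A pattern is a 10-tuple of 0/1 flags (1 = bromine): indices 0-4 are
# positions 2,3,4,5,6 on ring A, indices 5-9 are 2',3',4',5',6' on ring B.


def flip_a(p):
    """Mirror ring A: swaps 2 with 6 and 3 with 5."""
    return p[:5][::-1] + p[5:]


def flip_b(p):
    """Mirror ring B: swaps 2' with 6' and 3' with 5'."""
    return p[:5] + p[5:][::-1]


def swap_rings(p):
    """Exchange the roles of ring A and ring B."""
    return p[5:] + p[:5]


def canonical(p):
    """Smallest of the 8 symmetry-equivalent images of a pattern."""
    images = []
    for do_swap, do_a, do_b in product((False, True), repeat=3):
        q = p
        if do_swap:
            q = swap_rings(q)
        if do_a:
            q = flip_a(q)
        if do_b:
            q = flip_b(q)
        images.append(q)
    return min(images)


def count_congeners():
    """Count distinct congeners, grouped by number of bromines."""
    seen = set()
    by_n = {}
    for p in product((0, 1), repeat=10):
        c = canonical(p)
        if c not in seen:
            seen.add(c)
            by_n[sum(c)] = by_n.get(sum(c), 0) + 1
    by_n.pop(0)  # drop unsubstituted diphenyl ether
    return by_n


by_n = count_congeners()
total = sum(by_n.values())
print(by_n)
print(total)  # 209
```

The per-homologue counts run from 3 mono-brominated congeners up to 46 penta-brominated ones and back down to the single deca-brominated congener.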

A small workshop, but not an empty one

This was still a small workshop, and exposomics remains an early-stage field. But when a meeting begins to surface new lines of thought and also shows where consensus is forming, that is usually a sign that the next stage may be better than the last.