Overview

These are additional analyses of data examining increases in SARS-CoV-2 RNA concentrations in primary sewage sludge in relation to COVID-19 hospitalizations and cases. The original preprint, led by Jordan Peccia et al., is available at https://www.medrxiv.org/content/10.1101/2020.05.19.20105999v1. The original preprint included a correlation of smoothed versions of the viral RNA data and the epidemiological indicators. In the analyses presented here, we use an errors-in-variables time series model to estimate the underlying viral dynamics in sewage and their relationship with hospitalizations. This analysis is carried out in a Bayesian framework, which allows us to quantify uncertainty in the estimated associations.

These additional analyses were performed by Dan Weinberger (Epidemiology of Microbial Diseases, Yale School of Public Health), with input from Josh Warren (Biostatistics, Yale School of Public Health) and the rest of the original study team.

Explanation of the data and analyses

We have up to 4 measurements of viral RNA in sewage sludge daily (2 targets, tested in duplicate). We also have the number of COVID-19 hospitalizations per day in the catchment area.

We have five ways of measuring cases detected by laboratory test:

1a,b) Number of positive tests based on the date when the sample was collected, with or without adjustment for the total number of tests performed

2a,b) Number of positive tests based on the date when the result was reported to DPH, with or without adjustment for the total number of tests performed

3) Number of cases reported on the DPH website

The number of cases reported on the DPH website is used for the main analyses here, because this is the relevant comparison when the goal is to identify a leading indicator for understanding the epidemic. The relationship among these different measures changed over the course of the epidemic as testing delays shrank.

In the analyses, we assume that the observed sewage testing data (W) are drawn from an underlying, unobserved trajectory of viral concentration in the sewage (X), which follows an AR(1) process. We then evaluate the association between X and the number of hospitalizations, cases, or positive tests in Poisson regression models. This count model includes an AR(1) random effect for time. The models are fit using JAGS (see Models/mod2.indiv.lags.R). We test lags of the X variable of 0-7 days in a distributed lags framework. We also present results from each lag tested individually for comparison.
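The model described above can be written out roughly as follows. This is a sketch under stated assumptions, not the specification in Models/mod2.indiv.lags.R: the Gaussian measurement error, the log link, and every symbol other than W and X (K_t, mu, rho, the sigma terms, alpha, beta_l, phi_t, and Y_t for the daily count) are introduced here for illustration only.

```latex
\begin{aligned}
W_{t,k} &\sim \mathrm{Normal}(X_t,\ \sigma_W^2), \qquad k = 1,\dots,K_t,\quad K_t \le 4 \\
X_t &= \mu + \rho\,(X_{t-1} - \mu) + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{Normal}(0,\ \sigma_X^2) \\
Y_t &\sim \mathrm{Poisson}(\lambda_t) \\
\log \lambda_t &= \alpha + \sum_{l=0}^{7} \beta_l\, X_{t-l} + \phi_t \\
\phi_t &= \rho_\phi\,\phi_{t-1} + \eta_t, \qquad \eta_t \sim \mathrm{Normal}(0,\ \sigma_\phi^2)
\end{aligned}
```

Under this reading, the first two lines are the errors-in-variables part (replicate measurements W around a latent AR(1) series X), and the last three are the count model with distributed lags of X and an AR(1) random effect for time.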

Plots of raw data

The vertical lines are spaced 7 days apart. The increase in sewage viral RNA is apparent earlier than the increase in the other indicators. This is particularly apparent on the log scale.

Comparing the new cases of COVID-19 based on sample receipt date or date of report, we can see a long lag early in the epidemic between sample collection date and date of report to DPH. This gap seems to narrow as the epidemic progresses, perhaps due to improvements in testing.

Comparison of new cases of COVID-19 based on date of report to DPH (black) and date when the data were reported on the DPH website (red).

Association between sewage concentration and epidemiological indicators (Distributed lags)

We can model the epidemiological time series as a distributed lag of the sewage data.

We evaluate three epidemiological time series: COVID-19 hospitalizations, number of cases of COVID-19 (based on date of sample receipt), and number of cases of COVID-19 (based on date of report).
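To make the distributed lag idea concrete, the sketch below builds the lagged predictor rows that such a model would use. It is a minimal illustration in Python, not the JAGS code used for these analyses; the function names `lag_matrix` and `linear_predictor`, the toy series, and the coefficient values are all hypothetical.

```python
def lag_matrix(x, max_lag=7):
    """Return one row per day t >= max_lag: [x[t], x[t-1], ..., x[t-max_lag]]."""
    if len(x) <= max_lag:
        raise ValueError("series is shorter than the maximum lag")
    return [
        [x[t - lag] for lag in range(max_lag + 1)]
        for t in range(max_lag, len(x))
    ]


def linear_predictor(row, alpha, betas):
    """Intercept plus the sum of lag-specific coefficients times lagged values."""
    if len(row) != len(betas):
        raise ValueError("need one coefficient per lag")
    return alpha + sum(b * v for b, v in zip(betas, row))


# Toy latent series (hypothetical values, not study data).
x = [float(i) for i in range(10)]
rows = lag_matrix(x, max_lag=7)

# Hypothetical coefficients: only lag 0 contributes in this toy case.
betas = [1.0] + [0.0] * 7
eta = linear_predictor(rows[0], alpha=0.5, betas=betas)
print(len(rows), rows[0], eta)
```

The first `max_lag` days are dropped because they do not have a full set of lagged values; a model that estimates the latent series jointly could treat those early values differently.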

The plots of the distributed lag model show 90% credible intervals.

Distributed lags for the association between viral RNA in sewage and COVID hospitalizations.

This shows that lags of 1-4 days are associated with hospitalization. +/-90% Credible Intervals. From the plots, the sewage sludge RNA concentration increase preceded the increase in hospital admissions.

Cumulative relationship between viral RNA in sewage and hospital admissions. This shows that lags of sewage data of 1-4 days together best correlate with the hospitalization time series (after 4 days, cumulative beta does not increase). +/-90% Credible Intervals.
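The cumulative beta can be illustrated with a short sketch: for each posterior draw, sum the lag coefficients up to a given lag, then summarize those sums across draws. This is a hedged Python illustration, not the code behind the plots; `quantile`, `cumulative_summary`, and the simulated draws are hypothetical stand-ins for real posterior output.

```python
import random


def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def cumulative_summary(draws, level=0.90):
    """draws: list of posterior draws, each a list of per-lag coefficients.

    Returns one (median, lower, upper) tuple per lag for the running sum
    of coefficients from lag 0 up to and including that lag.
    """
    n_lags = len(draws[0])
    tail = (1 - level) / 2
    out = []
    for lag in range(n_lags):
        # Sum within each draw first, then summarize across draws.
        sums = sorted(sum(d[: lag + 1]) for d in draws)
        out.append(
            (quantile(sums, 0.5), quantile(sums, tail), quantile(sums, 1 - tail))
        )
    return out


# Simulated draws for 8 lags (hypothetical, not study results).
random.seed(1)
draws = [[random.gauss(0.1, 0.05) for _ in range(8)] for _ in range(2000)]
summary = cumulative_summary(draws, level=0.90)
for lag, (med, low, high) in enumerate(summary):
    print(lag, round(med, 3), round(low, 3), round(high, 3))
```

Summing within each draw before taking quantiles keeps any correlation between the lag coefficients in the interval, which would be lost if the per-lag intervals were simply added together.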

Distributed lags for the association between viral RNA in sewage and new reported COVID-19 cases, based on reporting date on the CT DPH website

This shows that longer lags of the sewage data (6-8 days) best correlate with the time series of reported cases, with considerable uncertainty in the estimates. +/-90% Credible Intervals. Note that the lag between reported cases and the sewage data shrinks over time (see time series plots) as testing improves.

Cumulative relationship between viral RNA in sewage and new reported cases. This shows that lags of sewage data of 6-8 days together best correlate with the reported case time series. +/-90% Credible Intervals.

Distributed lags for the association between viral RNA in sewage and new COVID-19 cases, based on date of report to DPH

This shows that lags of sewage data of 0-4 days best correlate with the time series of reported cases, with considerable uncertainty in the estimates. +/-90% Credible Intervals. Note that the lag between reported cases and the sewage data shrinks over time (see time series plots) as testing improves.

Cumulative relationship between viral RNA in sewage and new cases. This shows that lags of sewage data of 0-4 days together best correlate with the time series of reported cases, with considerable uncertainty in the estimates. +/-90% Credible Intervals.

We can also repeat this analysis without adjusting for testing volume.

This shows largely the same pattern as when adjusting for testing volume. Lags of 1-4 days best correlate with the time series of reported cases, with considerable uncertainty in the estimates. +/-90% Credible Intervals.

Cumulative relationship between viral RNA in sewage and new cases (not adjusting for testing volume). This shows that lags of sewage data of 1-4 days together best correlate with the time series of reported cases, with considerable uncertainty in the estimates. +/-90% Credible Intervals.

Distributed lags for the association between viral RNA in sewage and number or percent positive, based on date when the sample was collected.

This shows that lags of 0-2 days are associated with the case time series, based on date of test (on the cumulative plot, beta does not increase after 2 days). +/-90% Credible Intervals. From the plots, the sewage sludge RNA concentration increase preceded the increase in number of cases, based on sample collection date.

Cumulative relationship between viral RNA in sewage and percent positive based on sample collection date. +/-90% Credible Intervals.

The same analysis with no offset term:

Cumulative relationship between viral RNA in sewage and number of cases based on sample collection date. +/-90% Credible Intervals.
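The difference between the two versions (with and without an offset) can be sketched as follows. In a Poisson model with a log link, adding the log of the number of tests performed as an offset means the linear predictor describes the proportion positive rather than the raw count. That the models here use log(total tests) as the offset is an assumption based on the description above; `expected_count` and the numbers are hypothetical.

```python
import math


def expected_count(eta, n_tests=None):
    """Expected number of positive tests under a log link.

    eta: linear predictor (intercept + lag terms + random effect).
    n_tests: total tests performed that day, or None for no offset.
    """
    offset = math.log(n_tests) if n_tests is not None else 0.0
    return math.exp(eta + offset)


# With an offset, exp(eta) is read as the proportion positive.
with_offset = expected_count(math.log(0.10), n_tests=200)

# Without an offset, exp(eta) is read as the expected count itself.
without_offset = expected_count(math.log(20.0))

print(with_offset, without_offset)
```

Both calls give an expected count of about 20, but the coefficients feeding `eta` are interpreted differently: changes in the proportion positive in the first case, and changes in the number of positives in the second.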

Association between sewage concentration and epidemiological indicators (Individual lags)

All of the plots of individual lags show 98.75% credible intervals, which represent an alpha of 0.1, adjusted for multiple testing.
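The 98.75% figure is consistent with a Bonferroni-style adjustment of an alpha of 0.1 across eight comparisons. The count of eight is inferred here from the number of lags tested and is an assumption, as are the function names in this small sketch.

```python
def adjusted_level(alpha, n_tests):
    """Credible level after dividing alpha equally across n_tests comparisons."""
    return 1 - alpha / n_tests


def interval_quantiles(level):
    """Lower and upper posterior quantiles for an equal-tailed interval."""
    tail = (1 - level) / 2
    return tail, 1 - tail


level = adjusted_level(0.1, 8)      # about 0.9875
lower_q, upper_q = interval_quantiles(level)
print(level, lower_q, upper_q)
```

With these inputs, the equal-tailed interval runs from roughly the 0.625th to the 99.375th percentile of the posterior draws.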

Association of sewage viral concentration at different lags with hospitalizations

We test lags of 1-7 days and a lead of 1 day, one at a time.

Rate ratio (+/- 98.75% Credible intervals) showing the relative increase in hospitalizations associated with a 1-log increase of sewage RNA concentration at various lags. This shows the strongest association with a lag of 1 day, with the effect trailing off with longer or shorter lags. From the plots, the sewage sludge RNA concentration increase preceded the increase in hospital admissions.
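For readers less familiar with rate ratios, the sketch below shows how a coefficient from a log-link Poisson model maps to a rate ratio. It assumes the sewage RNA concentration enters the model on a log scale, so that a one-unit change in the predictor is a 1-log increase; the coefficient and interval bounds are hypothetical numbers, not estimates from this analysis.

```python
import math


def rate_ratio(beta, lower, upper):
    """Exponentiate a log-scale coefficient and its interval bounds."""
    return math.exp(beta), math.exp(lower), math.exp(upper)


def expected_multiplier(beta, delta_x):
    """Multiplicative change in the expected count for a change delta_x in X."""
    return math.exp(beta * delta_x)


# Hypothetical coefficient and interval bounds on the log scale.
rr, rr_low, rr_high = rate_ratio(beta=math.log(2.0), lower=0.2, upper=1.2)

# Under this hypothetical coefficient, a 1-log increase doubles the expected
# count and a 2-log increase quadruples it.
one_log = expected_multiplier(math.log(2.0), 1.0)
two_log = expected_multiplier(math.log(2.0), 2.0)
print(rr, rr_low, rr_high, one_log, two_log)
```

A rate ratio of 1 corresponds to no association, so an interval that excludes 1 indicates an association at the stated credible level.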

Association of sewage viral concentration at different lags with new cases (defined by date of test)

Rate ratio (+/- 98.75% Credible intervals) showing the relative increase in positive tests (based on date of test) associated with a 1-log increase of sewage RNA concentration at various lags. This shows the strongest association with a lag of 4 days, with uncertainty in the estimates and in the length of the lag. From the plots, the sewage sludge RNA concentration increase slightly preceded the increase in cases (based on sample collection date).

Association of sewage viral concentration at different lags with new cases (defined by date of report)

Rate ratio (+/- 98.75% Credible intervals) showing the relative increase in reported cases associated with a 1-log increase of sewage RNA concentration at various lags. Sewage sludge RNA increases first, and reported cases follow 1 or 6 days later, with considerable uncertainty in the estimates and in the length of the lag. From the plots, the sewage sludge RNA concentration increase preceded the increase in reported cases.