Main

Wanzhou District is a city of 1.74 million people in the municipality of Chongqing, China, bordering the western side of Hubei Province, of which Wuhan is the capital city. As the gateway to southwestern China from Hubei Province, Wanzhou was quickly affected by the COVID-19 outbreak in China, whose epicenter was Wuhan. Approximately 20,000 people returned to Wanzhou from Hubei Province during the Spring Festival holiday in late January 2020. With the lockdown of Wuhan and surrounding areas on 23 January 2020, Wanzhou also became an enclosed environment for epidemiological investigations. These circumstances have provided an opportunity to better understand transmission dynamics and risk factors for the spread of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)1, the causative agent of COVID-19. Around 47 other Chinese cities in 22 provinces have a similar population size to Wanzhou (1.5 to 2.5 million)2 and implemented the same containment measures to tackle the COVID-19 outbreak. In addition, Wanzhou is comparable in population size to medium- to large-sized cities in Europe and the United States. These characteristics make Wanzhou a suitable example for better understanding the transmission dynamics and risk factors of SARS-CoV-2 infection, as well as the effectiveness of control measures, for both researchers and policy makers in countries and cities that are heavily hit by the COVID-19 outbreak.

Asymptomatic and presymptomatic transmission of SARS-CoV-2 poses serious challenges to intervention strategies and has been reported in multiple studies3,4,5,6,7. In 77 infector–infectee COVID-19 transmission pairs, He and colleagues8 found that 44% of the secondary cases were infected during the presymptomatic stage of their source cases’ infection. However, lack of data on close contacts who were not infected hinders better understanding of the magnitude of asymptomatic and presymptomatic transmission. With detailed epidemiological information on asymptomatic, presymptomatic and symptomatic cases and their close contacts in Wanzhou from the beginning to the end of its COVID-19 outbreak, this study aimed to present the full course of the outbreak in this city and to assess the effectiveness of control measures through an analysis of SARS-CoV-2 transmission in each generation by asymptomatic, presymptomatic and symptomatic cases and an examination of the contact patterns that may have facilitated transmission in the study population during the outbreak.

Results

Characteristics of SARS-CoV-2 transmission in Wanzhou

Between 21 January and 10 April 2020, 183 confirmed cases and 1,983 close contacts who tested negative for SARS-CoV-2 were identified in Wanzhou (Table 1). Among the confirmed cases, 123 (67.2%; including 5 not registered in Wanzhou) were symptomatic and 60 (32.8%) were asymptomatic. Symptomatic cases were individuals with positive reverse transcriptase PCR (RT–PCR) results who showed symptoms before presenting to a hospital during the 2-week quarantine period as a close contact, during the hospital stay or within 4 weeks after being discharged from the hospital. Asymptomatic cases were individuals with positive RT–PCR results who reported no symptoms before being diagnosed and did not show symptoms throughout the quarantine and treatment period and in 4 weeks after being discharged from the hospital. Negative close contacts were those who had contact with cases but were not infected with SARS-CoV-2. Infected close contacts were those who became infected, either symptomatic or asymptomatic, after contact with the source case. Six unconfirmed cases from Wuhan lacked RT–PCR test results and were excluded from the analysis, but their close contacts were included.

Table 1 COVID-19 spread in Wanzhou, Chongqing

Of the 183 confirmed cases, 20 symptomatic and 7 asymptomatic individuals were unable to trace their previous or next generations of transmission and thus could not be clustered. These cases with the cluster unknown had 270 close contacts, none of whom were infected (13.6% of 1,983 negative close contacts; Table 1). Another 165 (8.3% of 1,983) negative close contacts lacked information on their source cases because these source cases resided outside Wanzhou and were passengers who only shared the same transport on one occasion with these close contacts. Nevertheless, these close contacts were included in the analysis of risk factors for SARS-CoV-2 infection.

The remaining 156 cases formed 28 clusters with two to five generations of transmission originating from 45 cases who traveled to Wanzhou from Hubei Province. Figure 1 shows one cluster with five generations of transmission as an example; all other clusters were defined in this way. Although most of the contact and transmission occurred before 29 January 2020 when there were no strict control measures on close contacts, the majority of asymptomatic cases were identified from mass testing of all identified close contacts after the National Health Commission of China protocol was adopted (8 February 2020). The time of symptom onset varied widely among symptomatic cases compared to the time they had contact with their source cases (Fig. 2a). The number of transmissions decreased quickly after the lockdown of Wuhan and neighboring areas and the commencement of home-based quarantine measures on 23 January 2020 (Fig. 2b). The number of active cases (that is, cases who still carried the virus and thus posed transmission risks to others) peaked between 22 January and 11 February 2020 for symptomatic cases and was more spread out for asymptomatic cases (Fig. 2c).

Fig. 1: A COVID-19 cluster with five generations of transmission in Wanzhou, Chongqing.

Generations of SARS-CoV-2 transmission were classified based on the date of first contact with the source case. G, generation of transmission.

Fig. 2: SARS-CoV-2 transmission over five generations in Wanzhou, Chongqing.

a, Transmission map of five generations (G1–G5) of symptomatic cases (n = 103; solid lines) and asymptomatic cases (n = 53; dashed lines) from the date of first contact with the source case to the date of a negative test result by RT–PCR. Unconfirmed cases (n = 6; dotted lines) who traveled to Wanzhou from Hubei Province were also included as source cases. Each horizontal line represents one case. The date of symptom onset for symptomatic cases is marked by a solid triangle. Each vertical line represents a transmission to the next generation. The reproductive number (R) is shown for each generation. NA, not applicable. b, Quantification of SARS-CoV-2 transmission for five generations of symptomatic and asymptomatic cases in each 7-d period between 15 January and 14 March 2020. The date of transmission for G1 cases was counted as the date when they arrived at Wanzhou. c, The number of active COVID-19 cases in each 7-d period between 15 January and 14 March 2020. A case is considered active from the date of contact until becoming negative by RT–PCR and is counted as an active case throughout this period. The purple, blue/gray, orange, blue and maroon solid lines or boxes represent G1 to G5 symptomatic cases, respectively. The purple, blue/gray, orange, blue and maroon dashed lines or patterned boxes represent G1 to G5 asymptomatic cases, respectively.

Among the 692 close contacts of the G1 cases, 74 (10.7%) became infected (G2) and 54 (7.8%) of these infected close contacts were symptomatic, while 20 (2.9%) were asymptomatic (Table 1). Among the close contacts in later generations, 4.4%, 6.3% and 3.1% became infected in G3, G4 and G5, respectively. Compared to G2 infection, the percentage of close contacts who were infected and symptomatic decreased to 2.0% (13 of 660) in G3 and 4.2% (6 of 142) in G4 infection, but the percentage of close contacts who were infected and asymptomatic changed slightly to 2.4% (16 of 660) in G3 and 2.1% (3 of 142) in G4 infection. Only 5 of 163 (3.1%) close contacts became infected in G5 infection, and they were all asymptomatic. The proportion of asymptomatic cases among the total cases in each generation appeared to increase from 23.1% in G1 (9 of 39) to 27.0% in G2 (20 of 74), 55.2% in G3 (16 of 29) and 57.1% in G4 and G5 combined (8 of 14; P value for chi-square test for trend = 0.004).
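
The per-generation infection percentages quoted above follow directly from the counts in the text. A minimal sketch of that arithmetic is given here; the dictionary layout and variable names are illustrative only, and the G3 and G4 numerators are the sums of the symptomatic and asymptomatic counts reported above (13 + 16 and 6 + 3).

```python
# (infected close contacts, all close contacts) for each generation of infection,
# taken from the counts quoted in the paragraph above.
contacts_by_generation = {
    "G2": (74, 692),       # contacts of G1 cases
    "G3": (13 + 16, 660),  # symptomatic + asymptomatic
    "G4": (6 + 3, 142),
    "G5": (5, 163),
}

# Percentage of close contacts who became infected, rounded to one decimal place.
infected_pct = {
    generation: round(100 * infected / total, 1)
    for generation, (infected, total) in contacts_by_generation.items()
}

for generation, pct in infected_pct.items():
    print(f"{generation}: {pct}% of close contacts infected")
```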

The median incubation period estimated using Weibull distribution by Bayesian approach (that is, posterior median of incubation period) was 17.4 d (Table 2; 95% confidence interval (CI): 13.4–21.5) for G1 symptomatic cases, and it was shorter for G2 (12.5 d; 95% CI: 10.7–14.6) and G3 (10.4 d, 95% CI: 7.0–16.7) symptomatic cases. We did not estimate the posterior median of incubation period for G4 symptomatic cases due to non-convergence as there were only six G4 symptomatic cases. Across generations of transmission, asymptomatic cases had a transmission risk period of over 21 d (that is, the time interval between close contact with the source case to diagnosis). The time window for asymptomatic cases to potentially transmit the virus to others was markedly longer than symptomatic cases, which is mostly due to the generally longer time before diagnosis of asymptomatic cases compared to symptomatic cases.

Table 2 Transmission dynamics of SARS-CoV-2 in Wanzhou, Chongqing

A total of 83 infected close contacts with verified information by Wanzhou Center for Disease Control and Prevention (CDC) on the source case (symptomatic or asymptomatic) and contact information were identified (Table 2). This was because, of the 183 cases included in this study, 39 were G1 cases who came to Wanzhou from Hubei Province with no information on their source cases, 27 were infected in Wanzhou but were unable to identify their source cases, 11 were G2 cases who were infected by the 6 unconfirmed G1 cases (excluded from this study), and the remaining 23 cases had unclear contact information. Among the 1,393 close contacts who had contact with symptomatic cases, 922 individuals had contact before the date of symptom onset reported by their source cases and 47 (5.1%) were infected (Table 2). Among the remaining 471 close contacts who had contact after the source cases showed symptoms, 20 (4.2%) were infected. The proportion of infected close contacts did not differ statistically by whether the contact occurred before (presymptomatic transmission) or after (post-symptomatic transmission) the source cases’ onset of symptoms (chi-square test, P = 0.48). Notably, the possible time window of exposure for presymptomatic transmission in our calculation was between when the source cases had contact with their infectors in the previous generation and the date of the source cases’ onset of symptoms; for post-symptomatic transmission, the time window of exposure was after the source cases showed symptoms and before the source cases were diagnosed and transferred to the hospitals. Of the 171 close contacts of asymptomatic cases before their diagnosis, 9 (5.3%) were infected and showed symptoms, while 7 (4.1%) individuals were infected but asymptomatic (Table 2). Due to the centralized quarantine, only two individuals had close contact with asymptomatic cases after the source cases had tested positive by RT–PCR and neither of these two close contacts was infected. 
In total, 63 of the 83 infected close contacts (75.9%) had contact with their symptomatic source cases before the symptom onset or before their asymptomatic source cases were diagnosed; and 16 of the 83 infected close contacts (19.3%) were infected by asymptomatic source cases, while the remainder (80.7%) were infected by symptomatic source cases.
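
The comparison of presymptomatic and post-symptomatic contact can be checked with a 2 × 2 chi-square test on the counts above (47 of 922 versus 20 of 471). The sketch below assumes an uncorrected Pearson chi-square statistic, which the text does not specify; the function name is hypothetical.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns the statistic and the p-value for 1 degree of freedom, using
    P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: contact before / after the source case's symptom onset.
# Columns: infected / not infected.
chi2, p_value = pearson_chi2_2x2(47, 922 - 47, 20, 471 - 20)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2f}")
```

Under this assumption the P value comes out close to the reported 0.48; a continuity-corrected test would give a somewhat larger value.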

Reproductive number and dispersion parameter

The estimated G1-to-G2 reproductive number was 1.64 (95% CI: 1.16–2.40; Table 2). After strict control measures were implemented, the reproductive number decreased dramatically to 0.39 (95% CI: 0.24–0.58) for G2-to-G3 transmission and to 0.31 (95% CI: 0.12–0.58) for G3-to-G4 transmission. Stratifying by case type, the G1-to-G2 reproductive numbers were 1.63 (95% CI: 1.03–2.59) and 2.44 (95% CI: 2.12–6.75) for symptomatic and asymptomatic source cases, respectively. The offspring distribution of symptomatic and asymptomatic cases both had large individual variation (Fig. 3). The number of offspring was not necessarily higher with a larger number of close contacts per case (Extended Data Fig. 1), implying that other factors might have driven the possibility of infection. The individual variation was smaller for G1 symptomatic cases than for G2–G5 symptomatic cases combined. This was also observed for G1 and G2–G5 asymptomatic cases. In addition, the offspring distribution was less overdispersed for symptomatic cases than for asymptomatic cases, and the vast majority of asymptomatic cases had zero offspring (no further transmission).
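
As an illustration of how a reproductive number and a bootstrap interval can be obtained from offspring counts, the sketch below uses the G1 symptomatic counts listed in the Fig. 3a legend and a percentile bootstrap with 5,000 resamples. The percentile method, the random seed and the function name are assumptions; the interval it produces need not match the one reported above.

```python
import random
import statistics

# Offspring counts for the 30 G1 symptomatic cases, as listed in the Fig. 3a legend.
offspring_g1_symptomatic = [0] * 13 + [1] * 5 + [2] * 5 + [4] * 3 + [5] * 3 + [7] * 1

def bootstrap_mean_ci(values, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    low = means[int((alpha / 2) * n_resamples)]
    high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high

# Reproductive number = mean number of secondary cases per source case.
r_hat = statistics.fmean(offspring_g1_symptomatic)
ci_low, ci_high = bootstrap_mean_ci(offspring_g1_symptomatic)
print(f"R = {r_hat:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f})")
```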

Fig. 3: Percentage of cases with a different number of offspring in G1 and G2–G5 symptomatic or asymptomatic cases.

a, Among the 30 G1 symptomatic cases, 13 (43.3%) had no offspring, 5 (16.7%) had one offspring, 5 (16.7%) had two offspring, 3 (10.0%) had four offspring, 3 (10.0%) had five offspring, and 1 (3.3%) had seven offspring. b, Among the 9 G1 asymptomatic cases, 7 (77.8%) had no offspring, 1 (11.1%) had six offspring, and another 1 (11.1%) had eight offspring. c, Among the 73 G2–G5 combined symptomatic cases, 54 (74.0%) had zero offspring, 11 (15.1%) had one offspring, 3 (4.1%) had two offspring, 2 (2.7%) had three offspring, 2 (2.7%) had four offspring, and 1 (1.4%) had six offspring. d, Among the 44 G2–G5 combined asymptomatic cases, 42 (95.4%) had no offspring, 1 (2.3%) had one offspring, and another 1 (2.3%) had five offspring.

The dispersion parameter k when symptomatic and asymptomatic cases were combined was 0.484 (95% CI: 0.226–1.038) for G1, decreasing to 0.284 (95% CI: 0.110–0.735) for G2, 0.107 (95% CI: 0.024–0.482) for G3 and 0.048 (95% CI: 0.004–0.602) for G4 cases, but the CIs of all the estimated dispersion parameters overlapped (Table 2). The overall dispersion parameter k for G1–G4 cases combined was 0.205 (95% CI: 0.126–0.334).

Predicted COVID-19 outbreak in Wanzhou in the absence of control measures

We simulated the number of COVID-19 cases in Wanzhou by the date when the close contact occurred, assuming an absence of control measures and using the simulation modeling approach of epidemic spreading on complex networks9,10,11,12. The predicted number of COVID-19 cases by the date of contact generated by the simulation model fitted well with the actual number of cases up to 16 January 2020 (Extended Data Fig. 2a). After this date, the increase of actual cases slowed down, with few new infections after 23 January 2020, following the lockdown of Wuhan city and its neighboring area and the implementation of further control measures. We predicted that without the lockdown and control measures, including social distancing, wearing face masks, thorough contact tracing, strict quarantine of close contacts and mass testing, the number of COVID-19 cases would surge over time, reaching a peak of 39,059 daily new infections on 30 January (Extended Data Fig. 2b) and leading to almost 560,000 infected individuals in Wanzhou.

Risk factors for SARS-CoV-2 infection

Risk factors for SARS-CoV-2 infection were evaluated in a total of 1,398 cases and close contacts with clear dates of contact and contact patterns (Table 3); the remaining 768 cases and close contacts were excluded as they did not have information on contact patterns. Sex and age distribution were similar between those who were excluded and included (Supplementary Table 1) in this assessment. Although differences in the age distribution between symptomatic cases, asymptomatic cases and negative close contacts were observed (Supplementary Table 2), risk for SARS-CoV-2 infection did not differ by sex or age (Table 3). The risk was higher for exposure to the source case within the first 5 d after the source case had contact with their infector in the previous generation (adjusted odds ratio (OR): 2.88, 95% CI: 1.22–6.78) than after day 5. Duration of contact of ≥8 h increased the risk by more than sixfold (adjusted OR: 6.08, 95% CI: 2.88–12.83), compared with a contact duration of <8 h. Frequent contact nearly tripled the risk for infection (adjusted OR: 2.89, 95% CI: 1.39–6.02), compared with occasional or infrequent contact. When presymptomatic and asymptomatic cases were analyzed separately, their contact patterns were similar (Supplementary Table 3) and were consistent with the risk factors observed above in all the cases combined.

Table 3 Risk factors for infection with SARS-CoV-2 in Wanzhou, Chongqing (n = 1,398)

Discussion

The potential for presymptomatic transmission was identified in analyses of clusters of cases in China13 and Singapore7. Data from the United States showed that unrecognized asymptomatic and presymptomatic infections most likely contributed to the SARS-CoV-2 transmission in nursing facilities6,14. The present study confirmed asymptomatic and presymptomatic transmission in the general population of Wanzhou, China. Asymptomatic and presymptomatic transmission together accounted for 75.9% of infections in the next generation, due to the abundance of close contacts before symptom onset or being diagnosed. Among the infections in the next generations, 80.7% were from symptomatic source cases and 19.3% were from asymptomatic source cases.

Several studies have reported that viral load decreased monotonically after symptom onset8,15,16. One study showed that symptomatic individuals may no longer be infectious 8 d after symptom onset, as live virus could not be cultured thereafter17. However, that study did not measure viral load during the incubation period. The elevated risk for SARS-CoV-2 infection found for early contact with the source cases suggests the need for further investigation of viral load in presymptomatic cases.

Almost all G2 cases in this analysis had their first contact with the G1 cases before 25 January 2020, when nationwide control measures had not been strictly implemented. As the Chinese public gradually became more aware of and more educated about COVID-19, they voluntarily undertook preventive action such as social distancing, wearing face masks and home quarantining with minimum interactions with family members. Thus, our G1-to-G2 reproductive number was lower than model-based basic reproductive numbers that do not consider quarantine for close contacts18,19. As social distancing and other interventions for the COVID-19 pandemic are gradually implemented worldwide, our G1-to-G2 reproductive number may be a more realistic reflection of the situation that most countries have experienced in the past few months, although other countries are likely to be less stringent at ensuring quarantine is maintained compared to China.

After the initiation of the first-level response to major public health emergencies on 24 January 2020 in Chongqing, the Chongqing Municipality Government strongly recommended that people stay at home as much as possible, and people were requested to wear face masks when going out. Consequently, the number of contacts in Wanzhou decreased quickly and substantially (Extended Data Fig. 3). Dining together was the major mode of contact, accounting for 56% of the contacts before 25 January 2020; this share fell to 39% between 25 January and 13 February 2020 due to social distancing (Extended Data Fig. 3). No contact via dining together occurred after 13 February 2020. The substantial decrease in the number of contacts over time as a result of the above-described control measures may have contributed to the rapid reduction in reproductive numbers in G2–G4.

We found that the individual infectiousness of SARS-CoV-2 is highly skewed, and the value of dispersion parameter k is in line with previous studies20,21,22. The individual variation of infectiousness overall was smaller for G1 cases than for cases in later generations of transmission, and it was also smaller for symptomatic cases than for asymptomatic cases. The majority of cases that had at least four offspring were G1 cases. This observation supports a recent finding of superspreading events at the early stage of the COVID-19 outbreak in China22.

A recent simulation study showed that with an R0 of 2.5, a delay between symptom onset and isolation of 3.8 d, 30% presymptomatic transmission and 10% asymptomatic cases, tracing even 100% of contacts would be unlikely to halt the spread of COVID-19 within 12–16 weeks23. Our simulation suggests that the lockdown of the epicenter and the control measures implemented in Wanzhou potentially prevented substantially more extensive spread of SARS-CoV-2 infections. The prevented transmissions by the active cases between 25 January and 7 February 2020 could likely be attributed largely to social distancing and face mask wearing when outside homes. Quarantine and contact tracing during this period were less likely to play an important role since most cases were identified on and after 8 February 2020. After 8 February 2020, centralized quarantine and contact tracing played more important roles in preventing transmissions through identification and isolation of cases, particularly presymptomatic and asymptomatic cases.

This study has limitations. First, since contact tracing was conducted via interview, recall bias could lead to inadequate contact tracing, particularly for asymptomatic cases who had a longer transmission risk period. It could also be possible that some asymptomatic cases might have had atypical or mild symptoms that were unreported, leading to an underestimation of symptomatic cases. Similarly, some of the home quarantined close contacts without symptoms identified before 8 February 2020 could actually be asymptomatic cases, but when they had RT–PCR tests they might have already cleared the virus from their body, leading to an underestimation of asymptomatic cases in earlier generations of transmission. Second, the majority of asymptomatic cases were identified after 8 February 2020 when all close contacts were required to be tested by RT–PCR. The contact tracing for asymptomatic cases who were labeled as ‘home-quarantined close contacts with no symptoms’ before 8 February 2020 could be insufficient as these asymptomatic cases may not be able to recall all the close contacts due to the long time window between having contact with their source cases and being diagnosed, leading to a smaller ratio of cases to close contacts for asymptomatic cases than for symptomatic cases. Moreover, close contacts were identified according to the definitions used by the National Health Commission of China, by which a relatively large symptomatic ratio of cases to close contacts (1:18) was identified in our population. The lack of consistent criteria may limit the interpretation and generalization of our results into other contexts.

In conclusion, the spread of COVID-19 was effectively controlled in Wanzhou by social distancing, including face mask wearing, thorough contact tracing, mass testing, identification and early diagnosis of presymptomatic and asymptomatic cases and strict quarantine of close contacts. Targeting the main risk factors—the timing, frequency and duration of contact—will be key interventions for mitigating COVID-19 and for better handling of possible resurgence in the future.

Methods

Study oversight

This study was approved by the Institutional Review Board of Chongqing Medical University. Written informed consent was waived by the Ethics Committee of Chongqing Medical University, as the study retrospectively analyzed data extracted from reports from the Wanzhou District CDC.

Data sources

Following the lockdown of Wuhan city, the municipality of Chongqing initiated the first-level response to major public health emergencies on 24 January 2020. Local CDCs were required to complete epidemiological investigation of confirmed COVID-19 cases by real-time RT–PCR within 24 h, including tracking down close contacts through interviews of identified cases, family members, doctors and other related people, in accordance with the Protocol for COVID-19 Prevention and Control issued by the National Health Commission of China24. Identified close contacts were transferred to the designated hospitals in Wanzhou immediately for further tests if they reported having or having had symptoms when they were tracked down; otherwise, they were asked to quarantine at home, preferably in one room with minimal interactions with family members and were not allowed to leave their homes. On 6 February 2020, this protocol was amended to require that all close contacts be quarantined in centralized locations (in a hotel if they did not report symptoms and in a hospital if they reported symptoms) rather than at home and tested by RT–PCR twice during the centralized quarantine. Wanzhou CDC adopted this amended protocol on 8 February 2020. All close contacts who were identified before 8 February 2020 and were home quarantined took the RT–PCR test on 8 February 2020 or immediately after. All COVID-19 cases confirmed by RT–PCR, regardless of symptom status, were treated in the designated hospitals. During the hospital stay, cases received treatments in accordance with the national COVID-19 Diagnosis and Treatment Guidelines (seventh edition) issued by the National Health Commission of China, including bed rest, supportive treatment, oxygen supplementation and antiviral therapy (for example, interferon-α, lopinavir/ritonavir, ribavirin and umifenovir). For severe and critical cases, mechanical ventilation or extracorporeal membrane oxygenation was used. 
Glucocorticoid therapy was applied to individuals with progressive deterioration of oxygenation, rapid progress in chest computed tomography and overreaction of the immune response. Cases were discharged from the hospital only if they had (1) two consecutive negative RT–PCR tests with a test interval of >24 h, (2) normal body temperature for >3 d and (3) substantial improvement in respiratory symptoms and in lesions on the chest radiograph. After being discharged from the hospital, cases were required to quarantine at home (before 8 February) or at centralized locations (on and after 8 February) for 2 weeks and were monitored during the quarantine period. Cases were also required to go to the designated hospitals for follow-up checks for symptoms and for RT–PCR testing at 2 and 4 weeks after being discharged from the hospital. As a result, we have data on the complete time course of COVID-19 for each case and their close contacts. The timeline of management of COVID-19 cases and close contacts in Wanzhou is provided in Extended Data Fig. 4.

Data were thus extracted from (1) epidemiological investigation reports of all positive cases tested by RT–PCR from 21 January through 10 April 2020 and (2) contact-tracing records of close contacts identified by Wanzhou CDC. Data from cases and close contacts were merged and entered into a structured database on a firewall-protected server at Chongqing Medical University. Data were cross-checked and confirmed with Wanzhou CDC. This study included all reported COVID-19 cases in Wanzhou between 21 January and 10 April 2020. The flowchart of samples is detailed in Extended Data Fig. 5.

Case definitions

Case definitions were based on the Protocol for COVID-19 Prevention and Control of the National Health Commission of China24. Close contacts consisted of those who had contact with a symptomatic case either before or after the source case showed symptoms or with asymptomatic cases either before or after the source case tested positive by RT–PCR. More specifically, close contacts included (1) those who lived together (in the same home), studied together (in the same classroom) or worked together (in close proximity or in the same room) with the source case; (2) health care staff who provided treatment or care to the case, or family members, relatives and others who took care of or visited the case, or those who had similar close contact, such as patients who shared the same hospital ward with the case; (3) those who used the same transport as the case, including care providers on the transport, accompaniers and other passengers or crew members who might have had close contact with the case; or (4) others who met the criteria for a close contact after the investigation and evaluation.

Symptomatic cases were individuals who tested positive by RT–PCR and showed symptoms before presenting to a hospital during the 2-week quarantine period as a close contact, during the hospital stay or within 4 weeks after being discharged from the hospital. An RT–PCR cycle threshold value (Ct value) of less than 37 was defined as positive, using a commercial RT–PCR kit (DAAN Gene, 20203400063). Asymptomatic cases were individuals who tested positive by RT–PCR and reported no symptoms before being diagnosed and did not show symptoms throughout the quarantine and treatment period and in 4 weeks after being discharged from the hospital. Of note, cases without symptoms upon diagnosis were labeled as ‘asymptomatic’ temporarily; this label was revised to ‘symptomatic’ later if the cases developed symptoms. This study included all the cases reported in our previous study25 but had more asymptomatic cases based on the official definition by the National Health Commission of China. Negative close contacts were those who tested negative by RT–PCR twice within the quarantine period. Infected close contacts were those who became infected with SARS-CoV-2, either symptomatic or asymptomatic, after contact with a source case.

Generations of transmission in clusters

Transmission clusters were classified by identifying the source case. Generations of transmission within each cluster were based on the date of first contact with the source case. Figure 1 shows an example of one cluster with five generations of transmission (G1–G5); all clusters were defined in this way.

Contact patterns and characteristics of transmission

As required by the National Health Commission of China24, close contacts reported the date when the contact with the source case occurred; for multiple contacts, the first and last dates of contact were given. Close contacts also provided information on the duration of the longest contact, frequency of contact, place of contact, mode of contact and their relationship to the source case. The posterior median of incubation period was calculated for symptomatic cases (more details are provided in ‘Statistical analyses’). For asymptomatic cases, we calculated their transmission risk period—the days between the dates when they had close contact with their source case and when they were diagnosed—to show how long asymptomatic cases could pose transmission risks to others. Timing of contact was defined as the number of days between the date when the source case had contact with their infector in the previous generation and when an individual had close contact with the source case (Extended Data Fig. 6). In the calculation of the timing of contact, the contact date was defined as the date when the contact occurred for those with one-time contact only and as the first date of contact for those with multiple contacts.

Statistical analyses

The proportion of infected close contacts was calculated by dividing the number of infected close contacts by the total number of close contacts, with separate estimates for before and after symptom onset (symptomatic source cases) or diagnosis (asymptomatic source cases). We calculated the posterior median of incubation period for symptomatic cases assuming that the incubation period distribution followed a Weibull distribution, gamma distribution or log-normal distribution, and compared the goodness of fit using the leave-one-out information criterion (LOO IC)26. Lower LOO IC indicates better goodness of fit, and a difference of >2 in the LOO IC between two models shows statistical significance. Since Weibull distribution provided the best fit to the data (Supplementary Table 4), we used it in our calculation of the posterior median of incubation period by generation of transmission. We also adopted the approach used by Backer and colleagues26 to estimate the posterior median of incubation period for G1 cases who traveled to Wanzhou from Wuhan or its neighboring area, by assigning 31 December 2019 as the assumed first date of contact with their infector and the last date of contact as the date when they came to Wanzhou. In addition, the mean incubation period, calculated using the first date of contact (Supplementary Table 5), was similar to the posterior median of incubation period with Weibull distribution (Supplementary Table 4), suggesting that the infection times for the cases were close to the beginning of the exposure window. Consequently, we used the first date of contact in the calculation of the timing of contact.
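
For reference, the Weibull quantities behind the posterior median can be written out as follows. This is a sketch under stated assumptions rather than the fitted model itself: the symbols κ (shape), λ (scale), e_i (exposure dates) and s_i (symptom onset date) are introduced here for illustration, and the likelihood term assumes that infection is equally likely at any time within the exposure window.

```latex
% Weibull incubation-period distribution with shape \kappa and scale \lambda
F(t) = 1 - \exp\!\left[-\left(t/\lambda\right)^{\kappa}\right], \qquad
t_{\mathrm{median}} = \lambda\,(\ln 2)^{1/\kappa}

% Likelihood contribution of case i with exposure window
% [e_i^{\mathrm{first}}, e_i^{\mathrm{last}}] and symptom onset s_i
L_i(\kappa, \lambda) \propto
  F\!\left(s_i - e_i^{\mathrm{first}}\right) - F\!\left(s_i - e_i^{\mathrm{last}}\right)
```

The posterior median of the incubation period is then the median of t_median over the posterior draws of κ and λ.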

The reproductive number was calculated directly based on its definition—the number of people on average that an infected person can transmit the virus to—by using transmission over each generation, and its 95% CI was derived by bootstrapping with 5,000 resamples. However, the reproductive number is a population-average measure that does not reflect the individual variation in infectiousness27. To better describe the transmission of SARS-CoV-2 where certain source cases could pass the virus to unusually large numbers of people (superspreading events), we adopted the modeling approach proposed by Lloyd-Smith and colleagues27, which assumes that the individual reproductive number (that is, the expected number of secondary cases caused by one source case) follows a gamma distribution with a mean equal to the basic reproductive number (R0) and a dispersion parameter k, which yields a negative binomial distribution of the secondary cases (offspring distribution). The smaller the dispersion parameter k is, the greater the heterogeneity of the offspring distribution. In the setting with an R0 > 1, higher heterogeneity favors disease extinction. We calculated the dispersion parameter k and the 95% CI by each generation of transmission using the maximum-likelihood approach. Risk factors for SARS-CoV-2 infection were analyzed using multilevel logistic regression in which cases and close contacts were specified at level 1 and the 28 clusters of transmission were at level 2 to account for the between-cluster heterogeneity, as multilevel logistic regression performed better than simple logistic regression (two-tailed likelihood ratio test, P < 0.01).
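The calculations described in this paragraph can be sketched in Python as follows, under stated assumptions. The reproductive number is taken as the mean number of secondary cases per source case, a percentile bootstrap is assumed for the 95% CI (the type of bootstrap interval is not specified above), and k is estimated by maximizing the negative binomial likelihood with the mean fixed at the sample mean, using a simple golden-section search. All function names and the example counts are hypothetical.

```python
import math
import random

def mean_offspring(counts):
    """Reproductive number as the mean number of secondary cases."""
    return sum(counts) / len(counts)

def bootstrap_ci(counts, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean (assumed interval type)."""
    rng = random.Random(seed)
    n = len(counts)
    means = sorted(sum(rng.choices(counts, k=n)) / n
                   for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

def negbin_loglik(counts, r0, k):
    """Log-likelihood of a negative binomial with mean r0, dispersion k."""
    ll = 0.0
    for x in counts:
        ll += (math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
               + k * math.log(k / (k + r0))
               + x * math.log(r0 / (k + r0)))
    return ll

def fit_dispersion(counts, k_min=1e-3, k_max=1e3, iters=100):
    """Maximum-likelihood k via golden-section search on log(k).

    The mean is fixed at the sample mean. If the data are not
    overdispersed, the estimate runs to the upper bound k_max.
    """
    r0 = mean_offspring(counts)
    if r0 <= 0:
        raise ValueError("at least one secondary case is required")
    a, b = math.log(k_min), math.log(k_max)
    inv_phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if negbin_loglik(counts, r0, math.exp(c)) > \
                negbin_loglik(counts, r0, math.exp(d)):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return math.exp((a + b) / 2)

# Hypothetical secondary-case counts, not data from the study.
example_counts = [0] * 20 + [1] * 5 + [2] * 2 + [10]
print(mean_offspring(example_counts))
print(bootstrap_ci(example_counts))
print(fit_dispersion(example_counts))
```

In this hypothetical example a single source case accounts for more than half of all secondary cases, so the fitted k falls well below 1, which corresponds to the highly heterogeneous offspring distribution described above.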

In addition, we simulated the number of COVID-19 cases in Wanzhou by the date when the close contact occurred, assuming an absence of control measures and using the approach of epidemic spreading on complex networks9,10. The numerical simulations were performed as follows: (1) an uncorrelated configuration model11 with a size of 1.74 million (the population size of Wanzhou) was generated with an average degree of m = 17.7 (the average number of individuals that each case in G1 had close contact with) and a degree distribution of P(m). Nodes represent individuals, and edges stand for the social relationships among individuals; (2) for each individual i, a degree m_i was generated according to P(m). For a homogeneous network (for example, a random regular network), P(m) was set to 1 when m = 17.7. In reality, individuals differ in the number of people they connect with, that is, the number of contacts is heterogeneous. To account for this, we set P(m) ~ m^{-γ_d}, where γ_d represents the degree exponent (the larger the γ_d value, the more homogeneous the network)12. We specified a γ_d value of 3 in our study because, through extensive numerical simulations, we found that the phenomena presented were not qualitatively affected by other values of γ_d; (3) m_i stubs were assigned to node i; (4) two stubs were randomly selected to generate an edge between two individuals. We did not allow self-loops or multiple edges in the network; (5) step 4 was repeated until no stubs remained in the system; (6) 39 cases who traveled to Wanzhou from Hubei Province were randomly selected as seeds according to their time of arrival in Wanzhou; and (7) at each time step, the state of individuals was updated based on the evolution mechanism of the susceptible–infected–recovered model. We set the evolution time step as t = 1, the epidemic transmission probability λ = 0.1069 (that is, the proportion of close contacts of G1 cases who were infected) and the basic reproductive ratio R0 = λm/γ = 1.64, where γ represents the recovery probability. Data analyses were performed with SAS v9.4 (SAS Institute), R v4.0.2 (R Core Team) and Visual C++ v6.0.
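Steps 2 to 7 can be sketched in Python on a small network as follows. This is an illustrative sketch under stated assumptions, not the Visual C++ implementation used in the study. The network size, the degree bounds m_min and m_max, and the recovery probability are hypothetical values; all seeds are introduced at the first time step rather than by arrival time; and stubs that cannot be paired without creating a self-loop or multiple edge are discarded after repeated failed attempts.

```python
import random

def power_law_degrees(n, gamma_d, m_min, m_max, rng):
    """Step 2: draw a degree for each node from P(m) ~ m^(-gamma_d)."""
    support = list(range(m_min, m_max + 1))
    weights = [m ** (-gamma_d) for m in support]
    degrees = rng.choices(support, weights=weights, k=n)
    if sum(degrees) % 2:  # an odd stub total cannot be fully paired
        degrees[0] += 1
    return degrees

def configuration_model(degrees, rng, max_failures=1000):
    """Steps 3-5: pair random stubs, rejecting self-loops and multi-edges."""
    stubs = [i for i, m in enumerate(degrees) for _ in range(m)]
    neighbors = [set() for _ in degrees]
    failures = 0
    while len(stubs) >= 2 and failures < max_failures:
        i, j = rng.randrange(len(stubs)), rng.randrange(len(stubs))
        u, v = stubs[i], stubs[j]
        if i == j or u == v or v in neighbors[u]:
            failures += 1  # assumption: give up after many failed draws
            continue
        neighbors[u].add(v)
        neighbors[v].add(u)
        for idx in sorted((i, j), reverse=True):  # remove both used stubs
            stubs[idx] = stubs[-1]
            stubs.pop()
        failures = 0
    return neighbors

def simulate_sir(neighbors, seeds, lam, gamma, rng, max_steps=1000):
    """Step 7: synchronous SIR updates; returns cumulative case counts."""
    S, I, R = 0, 1, 2
    state = [S] * len(neighbors)
    for s in seeds:
        state[s] = I
    infected = set(seeds)
    cumulative = [len(infected)]
    for _ in range(max_steps):
        if not infected:
            break
        newly_infected, recovered = set(), set()
        for u in infected:
            for v in neighbors[u]:
                if state[v] == S and rng.random() < lam:
                    newly_infected.add(v)
            if rng.random() < gamma:
                recovered.add(u)
        for v in newly_infected:
            state[v] = I
        for u in recovered:
            state[u] = R
        infected = (infected | newly_infected) - recovered
        cumulative.append(cumulative[-1] + len(newly_infected))
    return cumulative

# Hypothetical small-scale run; only lam and the 39 seeds follow the text.
rng = random.Random(42)
degrees = power_law_degrees(n=2000, gamma_d=3, m_min=5, m_max=100, rng=rng)
network = configuration_model(degrees, rng)
seeds = rng.sample(range(len(network)), 39)
cases = simulate_sir(network, seeds, lam=0.1069, gamma=0.5, rng=rng)
print(cases[-1])
```

Storing each node's neighbors in a set makes the multiple-edge check a constant-time lookup, which matters when the same procedure is scaled to a network of 1.74 million nodes.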

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.