Using these breakpoints, the longest putative non-recombining segment (nt 1,885–21,753) is 9.9 kb long, and we call this region NRR2. Eight other BFRs <500 nt were identified, and the regions were named BFR A–J in order of length. In Extended Data Fig.
62,63), the GTR+Γ model and 100 bootstrap replicates was inferred for each BFR >500 nt. Sequences were aligned by MAFFT (ref. 58) v.7.310, with a final alignment length of 30,927, and used in the analyses below. When the genomic data included both coding and non-coding regions we used a single GTR+Γ substitution model; for concatenated coding genes we partitioned the alignment by codon position and specified an independent GTR+Γ model for each partition, with a separate gamma model to accommodate inter-site rate variation. 2, bottom) show that SARS-CoV-2 is unlikely to have acquired the variable loop from an ancestor of Pangolin-2019 because these two sequences are approximately 10–15% divergent throughout the entire S protein (excluding the N-terminal domain). Region B showed no phylogenetic incongruence (PI) signals within the region, except one including sequence SC2018 (Sichuan), and thus this sequence was also removed from the set. In addition, sequences NC_014470 (Bulgaria 2008), CoVZXC21, CoVZC45 and DQ412042 (Hubei-Yichang) needed to be removed to maintain a clean non-recombinant signal in A. Because the SARS-CoV-2 S protein has been implicated in past recombination events or possibly convergent evolution (ref. 12), we specifically investigated several subregions of the S protein: the N-terminal domain of S1, the C-terminal domain of S1, the variable-loop region of the C-terminal domain, and S2. In light of these time-dependent evolutionary rate dynamics, a slower rate is appropriate for calibration of the sarbecovirus evolutionary history.
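The codon-position partitioning described above can be pictured with a short sketch. It only shows how the columns of an in-frame coding alignment might be divided into three partitions, one per codon position; fitting a separate GTR+Γ model to each partition is done inside the phylogenetic software and is not shown. The function name, the dictionary layout and the toy sequences are hypothetical and are not taken from the authors' pipeline.

```python
def split_by_codon_position(alignment):
    """Split each in-frame coding sequence into three strings, one per codon position.

    `alignment` maps sequence names to aligned, in-frame coding sequences
    (an assumed layout chosen for illustration).
    """
    partitions = {1: {}, 2: {}, 3: {}}
    for name, seq in alignment.items():
        if len(seq) % 3 != 0:
            raise ValueError(f"{name}: length {len(seq)} is not a multiple of 3")
        for pos in (1, 2, 3):
            # Take every third column, starting at the requested codon position.
            partitions[pos][name] = seq[pos - 1::3]
    return partitions


# Hypothetical toy alignment: three codons per sequence.
toy = {"seqA": "ATGGCCTAA", "seqB": "ATGGCTTAA"}
parts = split_by_codon_position(toy)
for pos in (1, 2, 3):
    print(pos, parts[pos])
```

Each of the three resulting column sets would then receive its own substitution model and its own rate-variation parameters in the downstream analysis.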
Regions A–C had similar phylogenetic relationships among the southern China bat viruses (Yunnan, Guangxi and Guizhou provinces), the Hong Kong viruses, northern Chinese viruses (Jilin, Shanxi, Hebei and Henan provinces, including Shaanxi), pangolin viruses and the SARS-CoV-2 lineage. One study suggests that over a century ago, one lineage of coronavirus circulating in bats gave rise to SARS-CoV-2, RaTG13 and a pangolin coronavirus known as Pangolin-2019, according to Live Science. The shaded region corresponds to the S protein. The pangolin coronaviruses show lower similarity to SARS-CoV-2 than bat coronavirus RaTG13 across the whole genome, but higher similarity in the spike receptor binding domain, although the similarity at either scale remains too low to implicate ... The coronavirus genome that these researchers had assembled, from pangolin lung-tissue samples, contained some gene regions that were ninety-nine per cent similar to equivalent parts of the SARS ... Posterior means with 95% HPDs are shown in Supplementary Information Table 2. Influenza viruses reassort (ref. 17) but they do not undergo homologous recombination within RNA segments (refs. 18,19), meaning that origins questions for influenza outbreaks can always be reduced to origins questions for each of influenza's eight RNA segments. Scientists defined the Pango lineage of this variant to be B.1.1.523 and it was originally recognized as a variant under monitoring on July 14, 2021. We showed that severe acute respiratory syndrome coronavirus 2 is probably a novel recombinant virus. Pangolin relies on a novel algorithm called pangoLEARN.
With horseshoe bats currently the most plausible origin of SARS-CoV-2, it is important to consider that sarbecoviruses circulate in a variety of horseshoe bat species with widely overlapping species ranges (ref. 57). The extent of sarbecovirus recombination history can be illustrated by five phylogenetic trees inferred from BFRs or concatenated adjacent BFRs (Fig. ...). By mid-January 2020, the virus was spreading widely within Hubei province and by early March SARS-CoV-2 was declared a pandemic (ref. 8). This produced non-recombining alignment NRA3, which included 63 of the 68 genomes. To evaluate the performance of the procedure, we confirmed that the recombination masking resulted in (1) a markedly different outcome of the PHI test (ref. 64), (2) removal of well-supported (bootstrap value >95%) incompatible splits in Neighbor-Net (ref. 65) and (3) a near-complete reduction of mosaic signal as identified by 3SEQ.
Nevertheless, the viral population is largely spatially structured according to provinces in the south and southeast on one lineage, and provinces in the centre, east and northeast on another (Fig. ...). SARS-CoV-2 genetic lineages in the United States are routinely monitored through epidemiological investigations, virus genetic sequence-based surveillance, and laboratory studies. N. China corresponds to Jilin, Shanxi, Hebei and Henan provinces, and the N. China clade also includes one sequence sampled in Hubei Province in 2004. Time-measured phylogenetic reconstruction was performed using a Bayesian approach implemented in BEAST (ref. 42) v.1.10.4. The assumption of long-term purifying selection would imply that coronaviruses are in endemic equilibrium with their natural host species, horseshoe bats, to which they are presumably well adapted. Two other bat viruses (CoVZXC21 and CoVZC45) from Zhejiang Province fall on this lineage as recombinants of the RaTG13/SARS-CoV-2 lineage and the clade of Hong Kong bat viruses sampled between 2005 and 2007 (Fig. ...). Aside from RaTG13, Pangolin-CoV is the most closely related CoV to SARS-CoV-2. The command line tool is open source software available under the GNU General Public License v3.0. This underscores the need for a global network of real-time human disease surveillance systems, such as that which identified the unusual cluster of pneumonia in Wuhan in December 2019, with the capacity to rapidly deploy genomic tools and functional studies for pathogen identification and characterization.
Over relatively shallow timescales, such differences can primarily be explained by varying selective pressure, with mildly deleterious variants being eliminated more strongly by purifying selection over longer timescales (refs. 44–46). If the latter still identified non-negligible recombination signal, we removed additional genomes that were identified as major contributors to the remaining signal.
The construction of NRR1 is the most conservative as it is least likely to contain any remaining recombination signals. Consistent with this, we estimate a concomitantly decreasing non-synonymous-to-synonymous substitution rate ratio over longer evolutionary timescales: 1.41 (1.20, 1.68), 0.35 (0.30, 0.41) and 0.133 (0.129, 0.136) for SARS, MERS-CoV and HCoV-OC43, respectively. Open reading frames are shown above the breakpoint plot, with the variable-loop region indicated in the S protein. In this study, we report the case of a child with severe combined immunodeficiency presenting a prolonged severe acute respiratory syndrome coronavirus 2 infection. The idea is that pangolins carrying the virus, SARS-CoV-2, came into contact with humans. We thank A. Chan and A. Irving for helpful comments on the manuscript. Our third approach involved identifying breakpoints and masking minor recombinant regions (with gaps, which are treated as unobserved characters in probabilistic phylogenetic approaches).
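The masking idea in the third approach can be sketched in a few lines. This is a minimal illustration, assuming that a recombinant region is given as a 1-based inclusive interval on an aligned sequence; the function name, the toy sequence and the interval are hypothetical, and the breakpoint detection itself is not reproduced here.

```python
def mask_regions(sequence, regions, gap="-"):
    """Replace 1-based inclusive intervals of an aligned sequence with gap characters."""
    chars = list(sequence)
    for start, end in regions:
        if start < 1 or end > len(chars) or start > end:
            raise ValueError(f"invalid interval: {start}-{end}")
        for i in range(start - 1, end):
            chars[i] = gap
    return "".join(chars)


# Hypothetical toy alignment row with one hypothetical minor recombinant region.
toy = "ACGTACGTAC"
masked = mask_regions(toy, [(3, 5)])
print(masked)  # AC---CGTAC
```

Because the masked columns stay in place as gaps, the alignment length is unchanged and the remaining columns keep their coordinates.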
Gray inset shows majority rule consensus trees with mean posterior branch lengths for the two regions, with posterior probabilities on the key nodes showing the relationships among SARS-CoV-2, RaTG13, and Pangolin 2019. Software package for assigning SARS-CoV-2 genome sequences to global lineages. 53), this is inferred to have occurred before the divergence of RaTG13 and SARS-CoV-2 and thus should not influence our inferences. Region C showed no PI signals within it.
Its origin and direct ancestral viruses have not been ... Boxplots show interquartile ranges, white lines are medians and box whiskers show the full range of the posterior distribution. COVID-19 lineage names can be confusing to navigate; there are many aliases and if you want to catch them all to examine further in data analyses it helps to ... TMRCA estimates for SARS-CoV-2 and SARS-CoV from their respective most closely related bat lineages are reasonably consistent for the different data sets and different rate priors in our analyses.
These datasets were subjected to the same recombination masking approach as NRA3 and were characterized by a strong temporal signal (Fig. ...). Individual sequences such as RpShaanxi2011, Guangxi GX2013 and two sequences from Zhejiang Province (CoVZXC21/CoVZC45), as previously shown (refs. 22,25), have strong phylogenetic recombination signals because they fall on different evolutionary lineages (with bootstrap support >80%) depending on what region of the genome is being examined. Posterior rate distributions for MERS-CoV (far left) and HCoV-OC43 (far right) using BEAST on n = 27 sequences spread over 4 years (MERS-CoV) and n = 27 sequences spread over 49 years (HCoV-OC43). The divergence time estimates for SARS-CoV-2 and SARS-CoV from their respective most closely related bat lineages are reasonably consistent among the three approaches we use to eliminate the effects of recombination in the alignment. Unfortunately, a response that would achieve containment was not possible.
Concatenated region A′BC is NRR1. These are in general agreement with estimates using NRR2 and NRA3, which result in divergence times of 1982 (1948–2009) and 1948 (1879–1999), respectively, for SARS-CoV-2, and estimates of 1952 (1906–1989) and 1970 (1932–1996), respectively, for the divergence time of SARS-CoV from its closest known bat relative. Pink, green and orange bars show BFRs, with region A (nt 13,291–19,628) showing two trimmed segments yielding region A′ (nt 13,291–14,932, 15,405–17,162, 18,009–19,628). Phylogenies of subregions of NRR1 depict an appreciable degree of spatial structuring of the bat sarbecovirus population across different regions (Fig. ...). Because there is no single accepted method of inferring breakpoints and identifying clean subregions with high certainty, we implemented several approaches to identifying three classic statistical signals of recombination: mosaicism, phylogenetic incongruence and excessive homoplasy (ref. 51). Developed by the Centre for Genomic Pathogen Surveillance.
This is not surprising for diverse viral populations with relatively deep evolutionary histories. Regions B and C span nt 3,625–9,150 and 9,261–11,795, respectively. Sorting these breakpoint-free regions (BFRs) by length results in two segments >5 kb: an ORF1a subregion spanning nucleotides (nt) 3,625–9,150 and the first half of ORF1b spanning nt 13,291–19,628 (sequence numbering given in Source Data, https://github.com/plemey/SARSCoV2origins).
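As a rough illustration of the sorting step, the sketch below computes the lengths of the three regions whose coordinates are quoted in the text (regions A, B and C) and keeps those longer than 5 kb. The dictionary, the function names and the inclusive-coordinate convention are assumptions made for illustration; they are not taken from the authors' code.

```python
# Coordinates (nt, assumed inclusive) as quoted in the text for regions A, B and C.
REGIONS = {
    "A": (13291, 19628),
    "B": (3625, 9150),
    "C": (9261, 11795),
}


def region_length(start, end):
    """Length of an inclusive nucleotide interval."""
    return end - start + 1


def sort_by_length(regions):
    """Return (name, length) pairs, longest first."""
    lengths = [(name, region_length(s, e)) for name, (s, e) in regions.items()]
    return sorted(lengths, key=lambda item: item[1], reverse=True)


ranked = sort_by_length(REGIONS)
longer_than_5kb = [name for name, length in ranked if length > 5000]
print(ranked)
print(longer_than_5kb)
```

Under these assumptions only regions A and B exceed 5 kb, which matches the two long segments named in the text.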
To avoid artefacts due to recombination, we focused on NRR1 and NRR2 and the recombination-masked alignment NRA3 to infer time-measured evolutionary histories. By 2009, however, rapid genomic analysis had become a routine component of outbreak response. Except for specifying that sequences are linear, all settings were kept to their defaults.
This new approach classifies the newly sequenced genome against all the diverse lineages present instead of against a select set of representative sequences. Concurrent evidence also proposed pangolins as a potential intermediate species for SARS-CoV-2 emergence and suggested them as a potential reservoir species (refs. 11–13).
The new paper finds that the genetic sequences of several strains of coronavirus found in pangolins were between 88.5 percent and 92.4 percent similar to those of the novel coronavirus. We compiled a dataset including 27 human coronavirus OC43 virus genomes and ten related animal virus genomes (six bovine, three white-tailed deer and one canine virus).
Possible Bat Origin of Severe Acute Respiratory Syndrome Coronavirus 2