Receptor Recognition Mechanisms of Coronaviruses: a Decade of Structural Studies
DOI: 10.1128/JVI.02615-14
ABSTRACT
Receptor
recognition by viruses is the first and essential step of viral
infections of host cells. It is an important determinant of viral host
range and cross-species infection and a primary target for antiviral
intervention. Coronaviruses recognize a variety of host receptors,
infect many hosts, and are health threats to humans and animals. The
receptor-binding S1 subunit of coronavirus spike proteins contains two
distinctive domains, the N-terminal domain (S1-NTD) and the C-terminal
domain (S1-CTD), both of which can function as receptor-binding domains
(RBDs). S1-NTDs and S1-CTDs from three major coronavirus genera
recognize at least four protein receptors and three sugar receptors and
demonstrate a complex receptor recognition pattern. For example, highly
similar coronavirus S1-CTDs within the same genus can recognize
different receptors, whereas very different coronavirus S1-CTDs from
different genera can recognize the same receptor. Moreover, coronavirus
S1-NTDs can recognize either protein or sugar receptors. Structural
studies in the past decade have elucidated many of the puzzles
associated with coronavirus-receptor interactions. This article reviews
the latest knowledge on the receptor recognition mechanisms of
coronaviruses and discusses how coronaviruses have evolved their complex
receptor recognition pattern. It also summarizes important principles
that govern receptor recognition by viruses in general.
INTRODUCTION
Coronaviruses
(CoV) are a group of common, ancient, and diverse viruses. They infect
many mammalian and avian species and cause respiratory,
gastrointestinal, and central nervous system diseases (1, 2).
Coronavirus virions contain an envelope, a helical capsid, and a
single-stranded and positive-sense RNA genome. The length of their
genomes, which are the largest among all RNA viruses, typically ranges
between 27 and 32 kb. They were named “coronaviruses” because of the
protruding spike proteins on their envelope that give the virions a
crown-like shape (“corona” in Latin means crown). Coronaviruses belong
to the Coronaviridae family in the order Nidovirales. They can be classified into at least three major genera, α, β, and γ (formerly groups 1, 2, and 3, respectively) (3).
Prototypic α-genus coronaviruses include human coronavirus NL63
(HCoV-NL63), porcine transmissible gastroenteritis coronavirus (TGEV),
and porcine respiratory coronavirus (PRCV). Prototypic β-genus
coronaviruses include severe acute respiratory syndrome coronavirus
(SARS-CoV), Middle East respiratory syndrome coronavirus (MERS-CoV),
mouse hepatitis coronavirus (MHV), and bovine coronavirus (BCoV).
Prototypic γ-genus coronaviruses include avian infectious bronchitis
virus (IBV). These three major coronavirus genera and their prototypic
coronaviruses are the focus of this review article (Fig. 1).
Coronaviruses pose health threats to humans and animals.
Two β-coronaviruses, SARS-CoV and MERS-CoV, are highly pathogenic in
humans. SARS-CoV caused the SARS epidemic in 2002 to 2003, with over
8,000 infections and a fatality rate of ∼10% (4–7).
MERS-CoV emerged from the Middle East in 2012. As of 16 October 2014,
MERS-CoV had caused 877 infections with a fatality rate of ∼36% (http://www.who.int/csr/don/16-october-2014-mers/en/) (8, 9).
HCoV-NL63 from the α-genus is a prevalent human respiratory pathogen
that is often associated with common colds in healthy adults and acute
respiratory diseases in young children (10, 11).
Among the animal coronaviruses, TGEV from the α-genus and MHV from the
β-genus cause close to 100% fatality in young pigs and young mice,
respectively (12–15); BCoV from the β-genus and IBV from the γ-genus also cause significant health damage in cattle and chickens, respectively (16–19). Therefore, research on coronaviruses has strong health and economic implications.
Receptor recognition by viruses is the first and essential step of viral infections of host cells (20).
An envelope-anchored spike protein mediates coronavirus entry into host
cells by first binding to a receptor on the host cell surface and then
fusing viral and host membranes (21, 22). A member of the class I viral membrane fusion proteins (23–26),
the coronavirus spike consists of three segments—an ectodomain, a
single-pass transmembrane anchor, and a short intracellular tail (27, 28).
The ectodomain can be divided into a receptor-binding S1 subunit and a
membrane-fusion S2 subunit. The amino acid sequences of S1 diverge
across different genera but are relatively conserved within each genus (29).
S1 contains two independent domains, an N-terminal domain (S1-NTD) and a
C-terminal domain (S1-CTD, also called S1 C-domain) (Fig. 1) (29).
Either or both of these S1 domains can function as a receptor-binding
domain (RBD). The binding interaction between coronavirus RBD and its
receptor is one of the most important determinants of the coronavirus
host range and cross-species infection (2, 30).
In addition, coronavirus RBDs contain major neutralization epitopes,
induce most of the host immune responses, and may serve as subunit
vaccines against coronavirus infections (31–36).
Knowledge about the receptor recognition mechanisms of coronaviruses is
critical for understanding coronavirus pathogenesis and epidemics and
for human intervention in coronavirus infections.
Coronaviruses recognize a variety of host receptors (Fig. 1). Although HCoV-NL63 and SARS-CoV belong to the α-genus and β-genus, respectively, their S1-CTDs recognize the same receptor, angiotensin-converting enzyme 2 (ACE2) (37–43). Although HCoV-NL63, TGEV, and PRCV all belong to the α-genus, their S1-CTDs recognize different receptors—TGEV and PRCV S1-CTDs both recognize aminopeptidase N (APN) (44, 45). Similarly, although SARS-CoV and MERS-CoV both belong to the β-genus, their S1-CTDs recognize different receptors—MERS-CoV S1-CTD recognizes dipeptidyl peptidase 4 (DPP4) (46–48). Although MHV and BCoV both belong to the β-genus, their S1-NTDs recognize carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM1) and sugar, respectively (49–53). In addition, the S1-NTDs of α-genus TGEV and γ-genus IBV also recognize sugar (52, 54–58). Overall, coronaviruses have evolved a complex receptor recognition pattern: (i) coronaviruses use one or both S1 domains as RBDs; (ii) highly similar coronavirus S1-CTDs within the same genus can recognize different protein receptors, whereas very different coronavirus S1-CTDs from different genera can recognize the same protein receptor; and (iii) coronavirus S1-NTDs can recognize either protein or sugar receptors. Understanding the receptor recognition mechanisms of coronaviruses can provide critical insight into the origin, evolution, and receptor selection of coronaviruses.
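As an illustrative aid only, the receptor usage listed above can be tabulated in a short Python sketch. The table entries are taken from the text; the record type `Rbd`, the genus labels "alpha", "beta", and "gamma", and the helper function names are hypothetical choices made for this example, and all sugar receptors are lumped under the single label "sugar".

```python
from collections import defaultdict
from typing import NamedTuple


class Rbd(NamedTuple):
    """One receptor-binding domain and the receptor it recognizes (hypothetical record type)."""
    virus: str
    genus: str      # "alpha", "beta", or "gamma"
    domain: str     # "S1-NTD" or "S1-CTD"
    receptor: str   # receptor name as given in the text
    kind: str       # "protein" or "sugar"


# Receptor usage as stated in the text above.
RBDS = [
    Rbd("HCoV-NL63", "alpha", "S1-CTD", "ACE2", "protein"),
    Rbd("TGEV", "alpha", "S1-CTD", "APN", "protein"),
    Rbd("PRCV", "alpha", "S1-CTD", "APN", "protein"),
    Rbd("SARS-CoV", "beta", "S1-CTD", "ACE2", "protein"),
    Rbd("MERS-CoV", "beta", "S1-CTD", "DPP4", "protein"),
    Rbd("MHV", "beta", "S1-NTD", "CEACAM1", "protein"),
    Rbd("BCoV", "beta", "S1-NTD", "sugar", "sugar"),
    Rbd("TGEV", "alpha", "S1-NTD", "sugar", "sugar"),
    Rbd("IBV", "gamma", "S1-NTD", "sugar", "sugar"),
]


def receptors_by_genus(domain):
    """Map each genus to the set of receptors recognized by the given S1 domain."""
    out = defaultdict(set)
    for r in RBDS:
        if r.domain == domain:
            out[r.genus].add(r.receptor)
    return dict(out)


def genera_by_receptor(domain):
    """Map each receptor to the set of genera whose given S1 domain recognizes it."""
    out = defaultdict(set)
    for r in RBDS:
        if r.domain == domain:
            out[r.receptor].add(r.genus)
    return dict(out)


def receptor_kinds(domain):
    """Return the kinds of receptor (protein, sugar) recognized by the given S1 domain."""
    return {r.kind for r in RBDS if r.domain == domain}


def dual_rbd_viruses():
    """List viruses in the table that use both S1 domains as RBDs."""
    domains = defaultdict(set)
    for r in RBDS:
        domains[r.virus].add(r.domain)
    return sorted(v for v, d in domains.items() if len(d) == 2)


if __name__ == "__main__":
    print("S1-CTD receptors per genus:", receptors_by_genus("S1-CTD"))
    print("Genera per S1-CTD receptor:", genera_by_receptor("S1-CTD"))
    print("S1-NTD receptor kinds:", receptor_kinds("S1-NTD"))
    print("Viruses using both S1 domains:", dual_rbd_viruses())
```

Grouping the same records by genus and by receptor makes points (i) to (iii) of the pattern directly visible: one genus maps to several S1-CTD receptors, one S1-CTD receptor maps to more than one genus, and S1-NTDs recognize both protein and sugar receptors.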
In addition to their viral receptor functions, the receptors for
coronaviruses have their own physiological functions.
ACE2 is a zinc-dependent carboxypeptidase that cleaves one residue from the C terminus of angiotensin peptides and functions in blood pressure regulation (59–62). ACE2 also protects against severe acute lung failure, and SARS-CoV-induced downregulation of ACE2 promotes lung injury (63, 64). APN is a zinc-dependent aminopeptidase that cleaves one residue from the N terminus of many physiological peptides and plays multifunctional roles such as in pain regulation, blood pressure regulation, and tumor cell angiogenesis (65, 66). DPP4 is a serine exoprotease that cleaves two residues from the N terminus of many physiological peptides and functions in immune regulation, signal transduction, and apoptosis (67–70). CEACAM1 is a cell adhesion molecule and functions in cell-cell adhesion (71–73). Sugars decorate many proteins and fats on cell surfaces and function in many biological processes such as immunity and cell-cell communication (57, 74, 75). How these cell-surface molecules are selected by viruses as their entry receptors has been a major puzzle in virology.
Analyses of crystal structures of coronavirus S1 domains and their complexes with their respective receptors have elucidated many puzzles associated with coronavirus-receptor interactions. Since the SARS epidemic, the crystal structures of five coronavirus S1 domains complexed with their respective receptors have been determined. These are the β-genus SARS-CoV S1-CTD complexed with human ACE2 (76), β-genus MERS-CoV S1-CTD complexed with human DPP4 (77, 78), α-genus HCoV-NL63 S1-CTD complexed with human ACE2 (79), α-genus PRCV S1-CTD complexed with porcine APN (80), and β-genus MHV S1-NTD complexed with murine CEACAM1 (81). In addition, the crystal structure of β-genus BCoV S1-NTD by itself has been determined, with its sugar-binding site identified through mutagenesis (53). These six representative structures not only reveal how coronaviruses recognize their receptors in atomic detail but also shed light on how coronaviruses do so using complicated evolutionary strategies. In addition to these six representative structures, several variant forms have also been determined, including S1-CTDs of different SARS-CoV strains complexed with ACE2 from animals and the S1-CTD of a MERS-CoV-related bat coronavirus, HKU4, complexed with human DPP4 (82–84). This article reviews these structural studies and their implications for the receptor recognition and evolution of coronaviruses.
S1-CTDs OF β-GENUS CORONAVIRUSES
β-genus
SARS-CoV S1-CTD complexed with human ACE2 was the first crystal
structure determined for a coronavirus S1 domain and S1 domain/receptor
complex (Fig. 2A) (76, 85).
SARS-CoV S1-CTD contains two subdomains: a core structure and an
extended loop. The core structure consists of a five-stranded
antiparallel β-sheet and several short connecting α-helices. The
extended loop lies on one edge of the core structure and forms a gently
concave surface with two ridges on both sides and a two-stranded
antiparallel β-sheet sitting in the middle (Fig. 3A and B).
Because this extended loop makes all the contacts with ACE2, it has
been termed receptor-binding motif (RBM). On the other hand, the
peptidase domain of ACE2 has a claw-like structure with two lobes. The
enzymatic active site of ACE2 is buried in a cavity surrounded by the
two lobes. SARS-CoV S1-CTD binds to the outer surface of the N-terminal
lobe, away from the peptidase active site. Consequently, SARS-CoV
binding has no effect on the enzymatic activity of ACE2 and vice versa.
The SARS-CoV-binding region on the ACE2 surface has been termed
virus-binding motif (VBM). The RBM and VBM complement each other in
shape and chemical details. The structure of SARS-CoV S1-CTD/ACE2
complex provided the first view of coronavirus S1 and S1/receptor
complex and laid the foundation for future structural and evolutionary
comparisons with other coronavirus S1 and S1/receptor complexes.
FIG 2
Crystal
structures of coronavirus S1-CTDs complexed with their respective
receptor. (A to D) These structures include β-genus SARS-CoV S1-CTD
complexed with ACE2 (Protein Data Bank identifier [PDB ID]: 2AJF) (76) (A), β-genus MERS-CoV S1-CTD complexed with DPP4 (PDB ID: 4KR0) (77) (B), α-genus HCoV-NL63 S1-CTD complexed with ACE2 (PDB ID: 3KBH) (79) (C), and α-genus PRCV S1-CTD complexed with APN (PDB ID: 4F5C) (80)
(D). In the structures of the complexes, the receptors are in green,
and the cores and RBMs of S1-CTDs are in cyan and red, respectively. (E
and F) The structural topologies of the four coronavirus S1-CTDs are
shown as schematic illustrations, where β-strands are depicted as arrows
and α-helices as cylinders. In the tertiary structures and structural
topologies of the S1-CTDs, the secondary structures of all of the
S1-CTDs are colored and numbered in the same way as for HCoV-NL63
S1-CTD.
FIG 3
Virus-binding
hot spots on ACE2 that are critical for the binding of SARS-CoV and
HCoV-NL63. (A) Enlarged view of the SARS-CoV–ACE2 interface. VBMs on
ACE2 and RBM on SARS-CoV S1-CTD are in blue and red, respectively. (B)
Footprint of SARS-CoV on the surface of ACE2. The view is derived from
the one in panel A by rotating ACE2 by 90° along a horizontal axis in
such a way that the edge facing the viewer moves up. VBM1 residues are
in orange, VBM2 residues in magenta, VBM3 residues in red, and VBM1b
residues in green. (C) A virus-binding hot spot on ACE2 centering on
Lys31 is critical for the binding of SARS-CoV S1-CTD. Mutation of
residue 479 on SARS-CoV S1-CTD is critical for the transmission of
SARS-CoV from palm civets to humans. (D) A second virus-binding hot spot
on ACE2 centering on Lys353 is also critical for the binding of
SARS-CoV S1-CTD. Mutation of residue 487 on SARS-CoV S1-CTD is critical
for the transmission of SARS-CoV from human to human. (E) Enlarged view
of the HCoV-NL63–ACE2 interface. (F) Footprint of HCoV-NL63 on the
surface of ACE2. (G) The same virus-binding hot spot on ACE2 centering
on Lys353 is also critical for the binding of HCoV-NL63 S1-CTD.
Comparative studies of the
interactions between the S1-CTD from different SARS-CoV strains and ACE2
from different host species have elucidated the molecular and
structural mechanisms by which SARS-CoV transmitted from animals to
humans and caused the SARS epidemic (30, 83, 84, 86–89).
Two virus-binding hot spots have been identified in the VBM of ACE2,
one centering on ACE2 residue Lys31 and the other centering on ACE2
residue Lys353 (Fig. 3C and D).
Both of these virus-binding hot spots consist of a salt bridge that is
buried in a hydrophobic environment. Structure-guided functional studies
revealed that both virus-binding hot spots provide significant energy
to the virus-receptor binding interactions (90).
Indeed, all of the naturally selected viral mutations in SARS-CoV RBM
surround the two hot spots, with significant impact on the structures of
the hot spots, the ACE2 binding affinity, and the host immune responses
(84, 91).
One of these viral mutations, K479N, facilitated transmission of
SARS-CoV from palm civets to humans. Another viral mutation, S487T,
facilitated transmission of SARS-CoV from human to human.
These two
mutations contributed significantly to the SARS epidemic in 2002 to
2003. The S1-CTD of a SARS-CoV-related Rs3367 bat coronavirus contains
two asparagines at these two positions (corresponding to positions 479
and 487 in human SARS-CoV strains) (92).
The first asparagine is favorable for human ACE2 binding, and the
second one is less favorable. Thus, Rs3367 recognizes human ACE2 but
probably less well than the human SARS-CoV strains do. For more details
about how the structural analysis of SARS-CoV RBD/ACE2 interactions has
provided insight into the SARS epidemic, please refer to another recent
review article on this topic (30).
These structural studies of SARS-CoV S1-CTD/ACE2 interactions
demonstrate that it is critical to understand viral evolution,
cross-species transmission, and epidemics within a detailed structural
framework.
The crystal structures of β-genus MERS-CoV
S1-CTD by itself and in complex with human DPP4 provided another view of
coronavirus S1 and S1/receptor complex (Fig. 2B) (77, 78, 93).
Like SARS-CoV S1-CTD, MERS-CoV S1-CTD also contains a core structure
and an RBM. The core structures of MERS-CoV and SARS-CoV S1-CTDs are
highly similar to each other, but their RBMs are markedly different,
leading to different receptor specificities. The RBM of MERS-CoV S1-CTD
mainly consists of a four-stranded β-sheet, in contrast to the
loop-dominated RBM in SARS-CoV S1-CTD. Like the VBM for SARS-CoV on
ACE2, the VBM for MERS-CoV is also located on the outer surface of DPP4,
away from the peptidase active site. Whereas the conserved core
structures of SARS-CoV and MERS-CoV S1-CTDs suggest a common
evolutionary origin, the different RBMs of the two S1-CTDs indicate a
divergent evolutionary pathway that has led to their recognition of
different host receptors. The S1-CTDs of MERS-CoV and a highly related
bat coronavirus HKU4 recognize DPP4 in very similar ways, suggesting a
close evolutionary relationship between the two viruses (82, 94).
In addition to enhancing the understanding of coronavirus evolution,
the structure of MERS-CoV S1-CTD/DPP4 complex has important implications
for understanding the host range and cross-species transmission of
MERS-CoV (82, 94–97).
S1-CTDs OF α-GENUS CORONAVIRUSES
α-Genus HCoV-NL63 S1-CTD complexed with human ACE2 was the first crystal structure determined for an α-coronavirus S1 domain (Fig. 2C) (79).
This structure, along with the structure of β-genus SARS-CoV S1-CTD
complexed with ACE2, provided the first view of how two different
viruses recognize their common host receptor. The finding was
intriguing. At first glance, HCoV-NL63 and SARS-CoV S1-CTDs are very
different. The core structure of HCoV-NL63 S1-CTD is a β-sandwich
consisting of two β-sheet layers stacked together through hydrophobic
interactions, which is in contrast to the single β-sheet layer in the
core structure of SARS-CoV S1-CTD. Their RBMs are also different. The
RBMs of HCoV-NL63 S1-CTD are three short and discontinuous loops,
whereas the RBM of SARS-CoV S1-CTD is a single long and continuous
subdomain. Indeed, the protein-folding Dali server failed to detect any
structural similarity between HCoV-NL63 and SARS-CoV S1-CTDs (98).
However, structural topology analysis revealed that the secondary
structural elements in HCoV-NL63 S1-CTD are connected in the same way as
those in SARS-CoV S1-CTD, although two β-strands in the former (strands
β-1 and β-4) become α-helices in the latter (helices α-1 and α-4) and
another β-strand (strand β-1) in the former is missing altogether in the
latter (Fig. 2E and F) (29).
These results suggest that HCoV-NL63 and SARS-CoV S1-CTDs share an
evolutionary origin and that the structural differences between the two
S1-CTDs result from extensive divergent evolution.
Despite their different tertiary structures, HCoV-NL63 and SARS-CoV S1-CTDs bind to a common region on ACE2 (79, 90). The VBMs for the two viruses on ACE2 overlap, and a number of ACE2 residues interact with both S1-CTDs (Fig. 3E and F).
Surprisingly, one of the two virus-binding hot spots on ACE2 for
SARS-CoV binding, which centers on ACE2 residue Lys353, plays a
similarly critical role in the binding of HCoV-NL63 (Fig. 3G).
Disturbance of the hot spot structure via mutagenesis decreased or
abolished the binding of both viruses. Hence, Lys353 and the nearby
residues on ACE2 form a common virus-binding hot spot that is critical
for the attachment of two different coronaviruses. On the other hand,
among the three RBMs in HCoV-NL63 S1-CTD, only RBM1 and RBM2, but not
RBM3, are involved in binding the common virus-binding hot spot on ACE2,
despite the fact that RBM3 is topologically equivalent to the RBM in
SARS-CoV S1-CTD (Fig. 2A, C, E, and F).
The different molecular mechanisms used by the two S1-CTDs to recognize
ACE2 suggest a convergent evolutionary relationship between the two
S1-CTDs (i.e., the two S1-CTDs evolved independently to recognize the
same virus-binding hot spot on ACE2), although a divergent evolutionary
relationship cannot be completely ruled out (i.e., the two S1-CTDs both
evolved from a common ancestral protein that bound ACE2). Therefore,
after HCoV-NL63 and SARS-CoV S1-CTDs underwent divergent evolution to
attain different structures, they might have further converged to
recognize the same region on the same receptor. The common virus-binding
hot spot on ACE2 might be the driving force for this later convergent
evolution.
The crystal structure of α-genus PRCV S1-CTD
complexed with porcine APN illustrated how another similar α-coronavirus
S1-CTD recognizes a different host receptor (Fig. 2D) (80).
Similarly to the structural relationship between SARS-CoV and MERS-CoV
S1-CTDs, PRCV and HCoV-NL63 S1-CTDs also have highly similar core
structures. However, their three RBMs are divergent, leading to
different receptor specificities. Similarly to the VBMs on ACE2 and
DPP4, the VBMs for PRCV on APN are also located on the outer surface of
APN, away from the peptidase active site. Overall, these results suggest
that PRCV and HCoV-NL63 S1-CTDs share an evolutionary origin but have
diverged in their RBM loops to recognize different host receptors.
We propose the following evolutionary scenario for coronavirus S1-CTDs (Fig. 4).
All coronavirus S1-CTDs likely shared one evolutionary origin, as
evidenced by their related structural topologies across different genera
(Fig. 2E and F).
Through divergent evolution, coronavirus S1-CTDs attained β-sandwich
core structures in the α-genus and β-sheet core structures in the
β-genus. Although the structures of γ-coronavirus S1-CTDs are not known,
their core structures may also have a topology related to those of α-
and β-coronavirus S1-CTDs. Furthermore, α-coronavirus S1-CTDs diverged
in the three RBM loops to acquire different receptor specificities—ACE2
specificity for HCoV-NL63 and APN specificity for PRCV. β-Coronavirus
S1-CTDs also diverged in the RBM subdomain to acquire different receptor
specificities—ACE2 specificity for SARS-CoV and DPP4 specificity for
MERS-CoV. The S1-CTDs of α-genus HCoV-NL63 and β-genus SARS-CoV first
diverged into different tertiary structures but later converged to
recognize the same receptor ACE2. In sum, coronavirus S1-CTDs have
undergone convoluted structural evolutions, leading to their complex
receptor recognition pattern.
S1-NTDs OF β-GENUS CORONAVIRUSES
β-Genus
MHV S1-NTD complexed with mouse CEACAM1 was the first structure
available for a coronavirus S1-NTD and S1-NTD/receptor complex (Fig. 5A) (81).
Surprisingly, MHV S1-NTD contains a core structure that has the same
structural fold as human galectins (galactose-binding lectins) (Fig. 5C) (99).
The core structure of MHV S1-NTD is a thirteen-stranded β-sandwich
consisting of two β-sheet layers of six and seven strands, respectively.
The structural topologies of MHV S1-NTD and human galectins are
identical, except that MHV S1-NTD contains two additional β-strands in
one of the β-sheet layers (Fig. 5D and E).
Compared with human galectins, MHV S1-NTD contains additional
structural motifs on top of the core that form a ceiling-like structure.
The outer surface of this ceiling-like structure functions as RBM by
binding to the VBM on the N-terminal Ig-like domain of CEACAM1. Despite
its galectin fold, MHV S1-NTD does not bind sugars, as revealed by
sugar-binding assays. Moreover, neither the RBM on MHV S1-NTD nor the
VBM on CEACAM1 contains any sugar at the binding interface. Instead, MHV
S1-NTD binds to CEACAM1 through exclusive protein-protein interactions.
A hydrophobic patch in the VBM of CEACAM1 functions as a virus-binding
hot spot; mutations in this region significantly decreased the binding
of MHV S1-NTD (81, 100–102).
Taken together, these results suggest that MHV S1-NTD and host
galectins share the same evolutionary origin; they also indicate that
although MHV S1-NTD binds only a CEACAM1 protein receptor, other
coronavirus S1-NTDs may bind sugar receptors and function as viral
lectins.
Analysis of the crystal structure of β-genus BCoV S1-NTD
provided the first view of a functional lectin domain in a coronavirus
spike (Fig. 5B) (53).
The overall structure of BCoV S1-NTD is highly similar to that of MHV
S1-NTD, also containing a galectin-like core and a ceiling-like
structure on top of the core. In contrast to MHV S1-NTD, which binds
CEACAM1 but not sugars, BCoV S1-NTD binds a sugar receptor but not
CEACAM1. Glycan screen arrays identified Neu5,9Ac2
(5-N-acetyl-9-O-acetylneuraminic acid) as the sugar receptor for BCoV
S1-NTD. Although the structure of a sugar-bound BCoV S1-NTD is not
available, structure-guided mutagenesis has revealed that the
sugar-binding site is located in a pocket surrounded by the core and the
ceiling-like structure on top of the core. The sugar-binding sites in
BCoV S1-NTD and human galectins overlap, although human galectins
recognize a different sugar receptor, galactose. Structural comparison
between MHV and BCoV S1-NTDs revealed that subtle structural changes
between the two S1-NTDs, mainly involving different conformations of RBM
loops, explain why BCoV S1-NTD does not bind CEACAM1 and why MHV S1-NTD
does not bind sugars. These results suggest that MHV and BCoV S1-NTDs
are both evolutionarily related to human galectins but that they have
diverged from human galectins with specificities for a novel protein
receptor and a different sugar receptor, respectively.
We propose the following evolutionary scenario for coronavirus S1-NTDs (Fig. 6).
Ancestral coronaviruses stole a host galectin gene and inserted it into
the 5′ end of their spike gene, which became coronavirus S1-NTD. Since
then, coronavirus S1-NTDs have undergone divergent evolution in three
genera. β-Genus BCoV S1-NTD has kept the lectin activity but evolved
specificity for a different sugar receptor, Neu5,9Ac2. Although the
crystal structures of α- and γ-coronavirus S1-NTDs are not available,
they may also have the galectin fold for the following reasons. First,
the conserved structural topology of S1-CTDs across different
coronavirus genera strongly suggests a similarly conserved structural
topology of S1-NTDs across different coronavirus genera. Second, the
S1-NTDs of both α-genus TGEV and γ-genus IBV function as lectins,
although the former recognizes both N-glycolylneuraminic acid (Neu5Gc)
and N-acetylneuraminic acid (Neu5Ac) and the latter recognizes Neu5Gc.
Hence, sugar-binding S1-NTDs across different coronavirus genera may
share the same galectin fold but have diverged to recognize different
sugar receptors. On the other hand, β-genus MHV S1-NTD has evolved
specificity for a novel protein receptor, CEACAM1. Subsequently, MHV
S1-NTD lost its lectin activity because proteins in general have
advantages over sugars as viral receptors by providing higher affinity
and specificity for viral attachment.
FIG 6
Proposed
origin and evolution of coronavirus S1-NTDs. Orange arrows indicate the
locations of CEACAM1 or sugar that binds coronavirus NTDs. Question
marks indicate the postulated structures of hypothetical evolutionary
intermediates.
Are
coronaviruses the only viruses that stole a host lectin and integrated
it into their spike? A survey of viral lectins with known tertiary
structures revealed that galectin-like domains are present in a variety
of viral spikes, including influenza virus hemagglutinin, whose
galectin-like fold was previously unknown (24, 103).
Moreover, these viral lectins display diverse sugar-binding modes, but
they share a feature—their sugar-binding sites are all located in
cavities and are not easily accessible to host antibodies and immune
cells. As a comparison, the sugar-binding sites in host galectins are
open and easily accessible (Fig. 5C).
It was thus hypothesized that these viral lectins all originated from
host galectins but have evolved to use hidden sugar-binding sites to
evade host immune surveillance (104).
The above analysis may explain why coronavirus S1-NTDs have evolved the
ceiling-like structure on top of the core, which is used to protect the
sugar-binding site in coronavirus S1-NTDs from the host immune system.
Subsequently, MHV S1-NTD took advantage of the ceiling-like structure
and evolved CEACAM1-binding RBM on the outer surface of this
ceiling-like structure. In this sense, the evolution of CEACAM1-binding
RBM in MHV S1-NTD might be an indirect outcome of the efforts of
coronaviruses to battle the host immune attacks.
RECEPTOR BINDING BY CORONAVIRUSES
So far, we have reviewed the
receptor recognition and evolution of coronavirus S1-NTDs and S1-CTDs
separately. How do S1-NTDs and S1-CTDs work together in the receptor
recognition and evolution of coronavirus spikes? Electron microscopic
studies of the SARS-CoV spike revealed that it is a clove-shaped trimer,
with three individual S1 heads and a trimeric S2 stalk (Fig. 7) (27, 28).
ACE2 binds to the tip of the SARS-CoV spike trimer, where S1-CTD is
located. Because the membrane-distal tips of the trimeric spike are the
most exposed and protruding region on the whole spike, S1-CTD is
directly exposed to the host immune system, evolves at an increased pace
to evade the host immune surveillance, and becomes hypervariable in
primary, secondary, and tertiary structures. The RBM of S1-CTD is
located on the very tip of the trimeric spikes and evolves at the
fastest pace. On the other hand, S1-NTD is likely located underneath
S1-CTD, is less exposed to the host immune system, and evolves at a
slower pace than S1-CTD. Therefore, between the two S1 domains, the more
conserved S1-NTDs may function as the more reliable RBDs that recognize
sugar receptors, allowing coronaviruses to search for additional and
high-affinity protein receptors using their fast-evolving S1-CTDs. Such
dual-RBD structures in coronavirus spikes may give coronaviruses an
evolutionary advantage in finding new receptors and expanding their host
ranges.
FIG 7
Summary
of the receptor recognition mechanisms of coronaviruses in a
three-dimensional view. The overall structure of trimeric SARS-CoV spike
complexed with ACE2 is shown; it includes both the schematic topology
of the spike and the negative-stain electron microscopic images of the
spike ectodomain (upper right). TM, transmembrane anchor. IC,
intracellular tail. The structures and functions of coronavirus S1
domains are listed. The question marks indicate possible tertiary
structures of coronavirus S1 domains.
Why
were specific host cell surface molecules selected as coronavirus
receptors? Among the known coronavirus receptors, sugars are probably
the primordial and fallback receptors for coronaviruses. Sugars are
abundant on host cell surfaces and are easy targets for viruses to grab.
To use sugars as their receptors, a variety of viruses might have
stolen a host galectin and used it as a viral lectin. On the other hand,
using protein receptors may enhance the affinity and specificity of
viral attachment, increase the efficiency of viral entry, and help
viruses expand their host ranges and alter their tropisms (105).
Host cell surface proteins have some common features as viral
receptors. First, they frequently undergo endocytosis, which facilitates
viral entry. Second, they contain VBM on their surfaces for
high-affinity virus binding. In the VBMs of both ACE2 and CEACAM1,
virus-binding hot spots have been identified and contribute significant
energy to virus/receptor binding interactions (79, 81, 90).
Therefore, host cell surface molecules are not randomly selected by
viruses as their receptors. In fact, there are structural and
evolutionary reasons behind these selections by viruses.
CONCLUDING REMARKS
The structural studies of coronavirus-receptor interactions described above
have established the following virology principles. First, drastic
structural changes in viral RBDs can still lead to recognition of a
virus-binding hot spot on the same receptor protein. Supporting this
principle is the finding that SARS-CoV and HCoV-NL63 recognize a common
virus-binding hot spot on ACE2 using structurally divergent S1-CTDs.
Second, subtle structural changes in viral RBDs can lead to a complete
receptor switch. For example, HCoV-NL63 and PRCV recognize two different
protein receptors using structurally conserved S1-CTDs with divergent
RBMs, and so do SARS-CoV and MERS-CoV. Moreover, MHV and BCoV S1-NTDs
recognize a protein receptor and a sugar receptor, respectively, through
subtle conformational changes in receptor-binding loops. Third, it is a
successful viral strategy to steal a host protein and evolve it into
viral RBDs with novel protein receptor specificities or altered sugar
receptor specificities. For example, MHV and BCoV S1-NTDs have the same
structural fold as human galectins, but they recognize a novel protein
receptor and a different sugar receptor, respectively. Fourth, a few
residue changes at the receptor binding interface can lead to efficient
cross-species infection and human-to-human transmission of a virus. For
example, SARS-CoV needed only one or two mutations in its RBD to
transmit from palm civets to humans. These virology principles may be
extended from the Coronaviridae family to other virus families.
What are the remaining important questions regarding receptor recognition
mechanisms of coronaviruses? First, what are the crystal structures of
α-coronavirus S1-NTDs, γ-coronavirus S1-NTDs, and γ-coronavirus S1-CTDs?
We have hypothesized that α-coronavirus and γ-coronavirus S1-NTDs have a
galectin fold and that γ-coronavirus S1-CTDs have either a β-sandwich
fold or a β-sheet fold. These hypotheses need to be tested using
experimentally determined crystal structures of these S1 domains.
Second, what are the detailed sugar-binding mechanisms for coronavirus
S1-NTDs? The crystal structures of coronavirus S1-NTDs complexed with
sugar receptors will reveal how sugar receptor specificities are
achieved in these viral lectins across different coronavirus genera.
Third, why do coronaviruses rely on peptidases as their receptors? Three
of the four known protein receptors for coronaviruses are peptidases:
ACE2, APN, and DPP4. They are all recognized by S1-CTDs of different
coronaviruses. It is highly unlikely that the use of peptidases as
coronavirus receptors is simply a coincidence. On the other hand, these
receptors' peptidase activities have no effects on coronavirus entry,
indicating that their common physiological function in degrading
peptides was not the reason why they were selected as coronavirus
receptors. To fully understand why peptidases were selected as receptors
for coronaviruses, it will be important in the future to comprehensively
examine the physiological functions of these peptidase receptors. Last,
what was the evolutionary origin of coronavirus S1-CTDs? So far,
coronavirus S1-CTDs appear to have a novel fold not related to any other
proteins in the protein structure database. However, our previous
structural studies of coronavirus spikes repeatedly showed that tertiary
structures of viral proteins can deceive the currently available
tertiary structural analysis software (98). Instead, our structural
topology analysis is a powerful tool to identify structural homology
among viral proteins (29, 103).
This approach may help identify the evolutionary origin of coronavirus
S1-CTDs. To sum up, structural studies in the past decade have
elucidated many puzzles surrounding receptor recognition, evolution, and
cross-species transmission of coronaviruses. Future structural studies
will continue to solve the remaining puzzles as well as new puzzles that
may emerge regarding the receptor recognition mechanisms of
coronaviruses.
ACKNOWLEDGMENT
This work was supported by NIH grant R01AI089728.
Copyright © 2015, American Society for Microbiology. All Rights Reserved.

