Reverse engineering the SARS-CoV-2 RBD
Some scientists argue it would be impossible for humans to have designed something that can bind human ACE2 so well. That underestimates human ingenuity and overstates nature's potential to harm us.

In 2015 Vineet Menachery and Ralph Baric made a synthetic virus using the spike gene of RsSHC014, which WIV claimed to have sampled from a bat in a Yunnan cave in 2012. My previous work suggests WIV’s claim is fraudulent and RsSHC014 is not a natural virus. Unfortunately, Menachery et al didn’t consider this possibility. But they did demonstrate that it was capable of infecting human airway cells and causing pathogenesis in mice.

This was an interesting finding because RsSHC014’s RBM is quite different from that of SARS, with only ~50% of amino acids the same. There are 30 amino acid differences, including 7 of the 14 residues that make contact with the ACE2 receptor. This shows that the RBM can accumulate many mutations and still bind human ACE2 efficiently.
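For readers who want to reproduce this kind of comparison, here is a minimal sketch of how percent identity and the list of differing positions can be computed from two pre-aligned sequences. The function names and the toy sequences are made up for illustration; they are not the real RBM sequences, which would need to be taken from GenBank and aligned first.

```python
from typing import List


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns where both sequences have the same residue.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def differing_positions(a: str, b: str) -> List[int]:
    """0-based alignment columns where the two sequences differ (gaps skipped)."""
    return [
        i
        for i, (x, y) in enumerate(zip(a, b))
        if x != "-" and y != "-" and x != y
    ]


# Toy, made-up sequences (NOT real viral sequences).
toy_a = "ACDEFGHIKL"
toy_b = "ACDQFGHLKV"
print(percent_identity(toy_a, toy_b))     # 70.0
print(differing_positions(toy_a, toy_b))  # [3, 7, 9]
```

The same two functions can be applied to any pair of rows taken from a multiple sequence alignment.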
If my claim is correct that RsSHC014 is artificial how might its RBM have been designed?
In 2008 a team led by Christian Drosten discovered a bat virus, BM48-31, which had (at the time) the closest RBD to SARS. This is surprising because BM48-31 was sampled from bats in Bulgaria, 8000km from Guangdong where the SARS outbreak started.

Interestingly, RsSHC014 shares several residues with BM48-31 that weren’t in SARS or any other known bat CoVs. How were they conserved between different bat species and viral lineages, 8000km apart and perhaps thousands of years evolved from a common ancestor?
I postulate RsSHC014 has a chimeric RBM: residues from BM48-31 were selectively introduced to create an RBM somewhat divergent from SARS, but still able to bind hACE2.
So, what about SARS-CoV-2?
A chimeric design for SARS-CoV-2’s RBM
Chimeric RBMs were of interest to scientists researching broad spectrum coronavirus vaccines, like this team featuring AMMS officers Wanbo Tai and the mysteriously deceased Yusen Zhou. They took mutations from various MERS strains, including camel variants, and combined them into more divergent RBDs. Antibodies against wild-type MERS worked well, but these new RBDs have only a handful of mutations from that strain.
In contrast, Menachery et al found that antibodies to SARS don’t neutralize RsSHC014 effectively; there are too many differences between them. One solution is to use a chimeric RBM that combines short, surface-exposed amino acid sequences (epitopes) from both. It could also include other divergent SARS-related viruses deemed spillover risks. A vaccine based on this chimeric RBM could hopefully guard against the emergence of even unknown SARS-like viruses, as Baric describes:
Approximately 50% of amino acids are conserved between the RBM of SARS-1 and SARS-CoV-2. Could SARS-CoV-2 RBM also be chimeric? What viruses, other than SARS, might be used to construct it?
RsSHC014 was already known to bind hACE2 so it would be an obvious candidate. As it happens, RsSHC014 shares several residues with SARS-CoV-2 that aren’t in SARS-1.

Another virus that was known and is divergent from these is PDF-2386. This was discovered by the PREDICT consortium in Uganda and is similar to other African viruses (e.g. BtKY72). PDF-2386 and RsSHC014 were the subject of proposed experiments by Ralph Baric and Fang Li in 2017-18, which were stalled due to the Gain of Function moratorium in place at the time. It isn’t clear if they ever went ahead.


Had this experiment gone ahead as described, it wouldn’t have resulted in SARS-CoV-2. Only one of the 12 mutations specified is present in SARS-CoV-2, and that is also found in other human SARS strains.
Nonetheless, PDF-2386 has other RBM residues shared with SARS-CoV-2 that aren’t in SARS or RsSHC014.

SARS-CoV-2 may also have inherited some residues from Chinese lineage SL-CoVs, of which HKU3 is representative.
A larger alignment using only SARS-1 related sequences shows that all but 5 residues have a match somewhere. Conversely, viruses from all regions and lineages have epitopes represented in the SARS-CoV-2 RBM.
This is not a sequence nature would have evolved by mutating residues one at a time - but it is one that humans would have devised!
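The "match somewhere" count used above can be sketched as follows: for each column of the alignment, check whether at least one reference sequence carries the same residue as the query. This is only an illustration of the counting method; the function name, the reference labels and the toy sequences are hypothetical and do not represent the real alignment.

```python
from typing import Dict, List


def unmatched_positions(query: str, references: Dict[str, str]) -> List[int]:
    """0-based alignment columns where no reference has the query's residue.

    Gap columns in the query ('-') are ignored.
    """
    for name, seq in references.items():
        if len(seq) != len(query):
            raise ValueError(f"{name} is not aligned to the query")
    return [
        i
        for i, residue in enumerate(query)
        if residue != "-"
        and all(seq[i] != residue for seq in references.values())
    ]


# Toy, made-up sequences (NOT real viral sequences).
toy_query = "MKTAYIAK"
toy_refs = {
    "ref_one": "MKSAYLAR",
    "ref_two": "MRTAFIAR",
}
print(unmatched_positions(toy_query, toy_refs))  # [7]
```

Note that the result depends on which reference sequences are included in the alignment, so the reference set should be stated alongside any count produced this way.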

However, not all the sequences in this alignment were published before the SARS-CoV-2 outbreak. The region in the cyan box shows particularly close homology between SARS-CoV-2 and bat viruses from the UK (e.g. RhGB02). But no UK viruses were sampled or published before the outbreak. At least, none were known to have been discovered. This suggests viruses might have been sampled but not disclosed. Based on other documentary evidence this is almost certainly the case in South-East Asia and China, but perhaps also further afield.
An alien peptide?
One region (in the red box above) stands out as having especially low homology with most sequences (with RsSHC014 being an exception).
Compared to SARS-1, SARS-CoV-2 has 9 consecutive amino acid differences. Nucleotide conservation is also poor. If this evolved naturally it must have been by a process other than regular substitution. The changes have an important effect on structure and function.
The SARS-1 sequence has a remarkable 5 Proline residues in a region of 12. Proline adds structural rigidity that can reduce binding affinity. In SARS-CoV-2, only one Proline is conserved, while two Glycines are gained (Glycine is small and flexible). This creates a much more flexible structure in this loop, allowing closer contacts and stronger bonds to form with the ACE2 receptor.
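Both observations in this section (a run of consecutive differences, and a shift in Proline/Glycine content) are easy to check on any pair of aligned sequences. Below is a minimal sketch; the function names are my own and the toy sequences are made up, so they are not the real SARS-1 or SARS-CoV-2 loop sequences.

```python
from collections import Counter
from typing import Dict, Tuple


def longest_mismatch_run(a: str, b: str) -> Tuple[int, int]:
    """Return (start, length) of the longest run of consecutive differing columns.

    Sequences must be pre-aligned; a gap against a residue counts as a difference.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def residue_counts(segment: str, residues: str = "PG") -> Dict[str, int]:
    """Count selected residues (default: Proline and Glycine) in a segment."""
    counts = Counter(segment)
    return {r: counts.get(r, 0) for r in residues}


# Toy, made-up sequences (NOT real viral sequences).
toy_a = "LKPPTPPAAY"
toy_b = "LKGSNGQAAY"
start, length = longest_mismatch_run(toy_a, toy_b)
print(start, length)                                  # 2 5
print(residue_counts(toy_a[start:start + length]))    # {'P': 4, 'G': 0}
print(residue_counts(toy_b[start:start + length]))    # {'P': 0, 'G': 2}
```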
Curiously, when I blasted this peptide I got an unlikely match to a peptide from a bat adenovirus - WIV17/WIV18.
Ordinarily this might seem spurious: adenoviruses are DNA viruses unrelated to coronaviruses, and any peptide of this size can be found in many organisms in GenBank. But WIV17/WIV18 were (purportedly) discovered in 2012 in Mojiang County, Yunnan - the same time and place as RaTG13. It’s quite a remarkable coincidence given the far smaller pool of sequenced organisms in this one cave.
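The intuition that a short peptide match is unremarkable in a huge database but more notable in a small pool can be put roughly into numbers. This is a back-of-envelope sketch only: it assumes all 20 amino acids are equally likely and independent at each position, and treats the database as one long string. Both are strong simplifications, and BLAST's own E-values are calculated differently. The function name and the example inputs are hypothetical.

```python
def expected_chance_matches(
    peptide_length: int,
    database_residues: int,
    alphabet_size: int = 20,
) -> float:
    """Expected number of exact chance occurrences of one specific peptide.

    Assumes uniform, independent residues (a simplification).
    """
    if peptide_length <= 0:
        raise ValueError("peptide_length must be positive")
    # Number of start positions where the peptide could fit.
    windows = max(database_residues - peptide_length + 1, 0)
    return windows / alphabet_size ** peptide_length


# Hypothetical inputs, chosen only to show how the estimate scales.
print(expected_chance_matches(2, 401))  # 1.0
print(expected_chance_matches(2, 41))   # 0.1
```

Under this model the expected count scales linearly with the size of the pool searched and falls by a factor of 20 for each extra residue in the peptide.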

But - stranger still - an earlier WIV PhD student’s thesis stated WIV17/18 were collected in a cave near Kunming, not Mojiang, which is 300km and 5 hours drive away. So they seem to have changed the sampling location so that it corresponds to RaTG13.

WIV may also have intended RaTG13 itself to have originated in the Kunming cave, rather than the Mojiang mine. In the PLoS paper “A rich gene pool…” a single virus, said to have been sampled from R. affinis in the Kunming cave, was never sequenced and doesn’t appear in phylogenetic trees. They were perhaps keeping their options open.
If intended for vaccine research, why plan a cover-up?
Although a chimeric RBM may have some legitimate use in vaccine research, WIV seem to be engaged in something quite different - planning the cover-up of an outbreak that was yet to happen. And SARS-CoV-2 is certainly not a vaccine, not even a failed attempt at one: there is no evidence of an attempt to attenuate it, as would be expected in a LAV. Quite the contrary, it seems engineered for human transmissibility and pathogenicity.
A figure at the center of coronavirus research and intersecting networks of Chinese and US researchers is Fang Li. He is a close collaborator of Baric, but also of WIV, where he was a visiting fellow in 2016-17. He studied the remarkably similar MERS FCS with AMMS’ Yusen Zhou and his widow Lanying Du. Over the years his lab has hosted several postdocs from the State Key Lab of Virology (which connects Wuhan University and WIV scientists) and it is these students who have done most of the hands-on lab work on coronavirus structure and entry mechanisms. In 2018 he collaborated with WIV’s Shi Zhengli and Jie Cui on a paper about the origin of SARS and MERS. The resulting paper, published in December 2018, suggests an origin scenario for SARS similar to that initially proposed for SARS-CoV-2 (substituting pangolins for civets).
Jie Cui and his doctoral student Yu Ping were responsible for sequencing RaTG13 (although curiously their names aren’t listed as authors on the eventual paper). In July 2018, they uploaded partial sequence data to GenBank for RaTG13 and many other viruses, but they omitted the all-important spike gene. In Yu Ping’s thesis there is an alignment of the RBDs of these new viruses, but RaTG13 was again omitted.
Interestingly, the RBD is the region of RaTG13 which is notably different to SARS-CoV-2, while the Guangdong pangolin coronavirus has a close RBD, but is divergent elsewhere in the genome. An early hypothesis was that SARS-CoV-2 was formed by recombination of the two.
FOI documents acquired by US Right to Know show Fang Li’s proposed work would have determined the crystal structure of bat coronaviruses in complex with ACE2 of various species. This work might be useful in informing the design of a chimeric RBD. It’s unclear if this work was ever done; no results have been published.

Fang Li shares common interests with Baric in receptor binding and protease cleavage. Unlike Baric, he is also interested in the spike NTD as a potential sugar binding domain (“sugar” includes sialic acids, heparan sulfate, other glycans). This interest is perhaps vindicated by the recent pre-print from NIH scientists suggesting that heparan sulfate - not ACE2 - is the primary receptor for SARS-CoV-2 (the NTD will be the subject of a future article).
I don’t mean to imply that Fang Li had any direct or willful input into the design of SARS-CoV-2 - but someone with such expertise and connections should not have been overlooked by investigations such as the Covid Select Subcommittee.