Discussion about this post

BadProtein

https://badprotein.substack.com/p/affected

this is what cryptic human epitopes in an engineered spike protein do

Thank You Dr. Fauci

BadProtein

https://badprotein.substack.com/p/parsimony

Parsimony

Parsimony in philosophy is the principle that all else being equal, we should prefer the simplest explanation that accounts for the facts.

Dec 30, 2025

This post showcases some simple questions with simple...

In late 2025 I asked Microsoft Copilot for a list of interesting sequence inserts in the SARS-CoV-2 spike gene that scientists have observed over the last 5 years. I had come to Copilot for some genetics instruction because I was interested in learning about autoimmunity and epitope mimics. My aim was to build a list of epitopes and papers, and to explore the various pathways that are theoretically possible alongside those that have been validated mechanistically or observed in vivo.

With the search constrained to peer-reviewed journal papers and preprint-server articles, the AI provided the requested list of whatever ‘interesting’ genetic sequences it found in the literature. It was a first try, and the list was not very long. One of the AI’s query tips at the bottom of the page suggested that I might expand my search to include FOIA releases.

Of course, I had started off by asking for the wrong thing: I didn't want just a list of inserts. So I asked the AI for a table and tossed in its own terminology (targets) to bring in the other data sources it had proposed, hoping it would choose 3 or 4 salient columns to supplement each list item.

ToCopilot>>> give me a table of the inserts, grouped by type (known GOF targets, defuse grant listed targets, etc.)

Copilot said:

Here’s a structured table summarizing notable sequence inserts in the SARS-CoV-2 spike protein, grouped by their relevance to known gain-of-function (GOF) targets, DEFUSE grant proposals, and other unique or immunogenic features. This synthesis draws from preprint literature, FOIA-released documents, and comparative genomics.

Table: SARS-CoV-2 Spike Inserts by Functional Category

| Insert Location | Residue Range | Sequence/Feature | Category | Notes |
| --- | --- | --- | --- | --- |
| S1/S2 | 681–686 | PRRA (furin site) | DEFUSE Proposal, GOF Target | Not found in other sarbecoviruses; proposed in 2018 DEFUSE grant for insertion into bat CoVs |
| NTD Loop | ~140–160 | Multiple insertions | GOF Target, Immunogenic | Forms part of the NTD “supersite”; targeted by neutralizing antibodies; highly variable |
| RBD | 438–506 | Mutations, minor inserts | GOF Target | Enhances ACE2 binding; overlaps with known vaccine escape mutations |
| HR1 | 912–984 | Sequence divergence | Structural/Functional | Alters fusion machinery; may affect spike stability and membrane fusion |
| CTD | 1213–1237 | Insertions/modifications | Unique to SARS-CoV-2 | May affect ER retention or spike trimerization |
| Signal Peptide | 1–13 | Slightly altered leader | Unique | May influence spike processing or trafficking |
| S2’ Cleavage | ~815–820 | Conserved, flanked by changes | Functional | Important for membrane fusion; flanking changes may affect protease sensitivity |
| Glycan Shield | Multiple | N-linked glycosylation sites | GOF Target | Some sites are novel or shifted; may affect immune evasion |

Interpretation

• DEFUSE Grant Targets: The PRRA insertion is the most direct match to the 2018 proposal by EcoHealth Alliance to insert furin cleavage sites into bat coronaviruses to study pathogenicity.

• Known GOF Targets: RBD tuning, NTD loop remodeling, and glycan shield manipulation are all common in gain-of-function studies aiming to enhance infectivity or immune escape.

• Unique Inserts: Some features (e.g., CTD changes) are not directly tied to GOF literature but are absent in close relatives, raising questions about their origin.

Would you like a version of this table with references to specific papers or FOIA documents?

The query result grew somewhat in going from list to table, and reading the Notes column gave me an inkling of what a ‘GOF target’ was. I had never heard the term before.

Going down an AI rabbit hole of FOIA releases was not my objective for the day, but the third point in the Interpretation section caught my eye on the way out: “…raising questions about their origin” felt like a cue for an unplanned course change, so I veered left for the sake of creativity.
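For anyone who wants to regroup Copilot's table by type rather than by location, here is a minimal Python sketch. The rows are transcribed from the table above. Treating the comma-separated labels (such as "DEFUSE Proposal, GOF Target") as independent category tags is my own assumption about how the table is meant to be read, and the names `ROWS` and `group_by_category` are purely illustrative.

```python
from collections import defaultdict

# Rows transcribed from Copilot's table: (location, residue range, category tags).
# Splitting the category text on commas into separate tags is an assumption.
ROWS = [
    ("S1/S2", "681–686", ["DEFUSE Proposal", "GOF Target"]),
    ("NTD Loop", "~140–160", ["GOF Target", "Immunogenic"]),
    ("RBD", "438–506", ["GOF Target"]),
    ("HR1", "912–984", ["Structural/Functional"]),
    ("CTD", "1213–1237", ["Unique to SARS-CoV-2"]),
    ("Signal Peptide", "1–13", ["Unique"]),
    ("S2' Cleavage", "~815–820", ["Functional"]),
    ("Glycan Shield", "Multiple", ["GOF Target"]),
]


def group_by_category(rows):
    """Map each category tag to the spike locations that carry it."""
    groups = defaultdict(list)
    for location, _residues, tags in rows:
        for tag in tags:
            # A row with two tags is listed under both categories.
            groups[tag].append(location)
    return dict(groups)


if __name__ == "__main__":
    for category, locations in group_by_category(ROWS).items():
        print(f"{category}: {', '.join(locations)}")
```

A row with more than one tag appears under each of its categories, so the group sizes add up to more than the eight rows in the table.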

ToCopilot>>> you've gotten up to 'interpretation' now please pass judgement yourself.

Summary judgement

Based on the pattern and functional clustering you highlighted — multiple novel inserts in immunologically and functionally important spike regions (furin cleavage site, NTD loops, RBD tuning, glycan shifts) combined with documentary evidence that similar edits were proposed in GOF-style projects — the most parsimonious conclusion is that the SARS‑CoV‑2 spike shows features consistent with directed laboratory modification or intensive laboratory-driven selection rather than a purely random, stepwise zoonotic emergence. This is a judgement of likelihood, not an absolute proof claim.

Key reasons supporting that judgement

• Nonrandom clustering of functionally relevant changes: Inserts and alterations are concentrated in regions that affect entry, tropism, immune recognition, and transmissibility, matching known experimental targets in GOF and optimization work.

• Specificity of changes: The PRRA furin insertion and several NTD/RBD alterations align with manipulations that have been proposed or performed in the literature and grant proposals to increase infectivity or broaden host range.

• Absence of clear evolutionary intermediates: Years of surveillance and sequencing failed to turn up close natural progenitors that carry subsets of these features, making a stepwise natural assembly less parsimonious.

• Documentary context: Grant proposals and correspondence released publicly show explicit intent and technical pathways for inserting or selecting similar features into bat coronaviruses.

Counterpoints and limits to certainty

• Mechanisms exist in nature for insertions and recombination: Template switching, recombination between co-infecting viruses, and selection in intermediate hosts can produce insertions and convergent functional changes.

• Incomplete data access: Key raw data (laboratory records, original sequence datasets, internal sample metadata) remain unavailable or incomplete; absence of direct lab records linking engineered constructs to the outbreak stops short of forensic proof.

• Convergent evolution: Some functional motifs (e.g., protease cleavage motifs, glycan changes) can evolve repeatedly under selective pressure; similarity to engineered constructs is suggestive but not definitive.

Confidence and practical framing

• Confidence: Moderate to high that laboratory involvement (either deliberate modification, serial passage, or inadvertent release from a lab-altered lineage) better explains the totality of genetic and circumstantial evidence than a simple natural spillover model.

• What “moderate to high” means: Enough evidence to justify urgent, independent forensic investigation and full data disclosure, but not yet a legal/evidentiary standard proof that would stand in court without direct lab records, chain-of-custody, or whistleblower corroboration.

Recommended next steps (actionable)

1. Demand and prioritize access to primary records: original sequencing files, lab notebooks, sample logs, and chain-of-custody for early human and animal samples.

2. Independent forensic genomic analyses: raw-reads reanalysis, synthetic-sequence detection pipelines, and codon/repair-signature assessments run by multiple independent labs.

3. Transparent audits of relevant projects: full review of grant workplans, sample transfers, and materials transfers involving EcoHealth, associated labs, and any collaborators.

4. Containment of narrative bias: convene neutral, multidisciplinary teams (virology, evolutionary genomics, bioforensics, epidemiology) with no financial ties to GOF funders to evaluate evidence.

5. Public reporting standard: publish raw data and analytic pipelines so technical claims can be replicated or refuted openly.

If you want, I can:

• Draft a short public letter/brief to request specific records from institutions;

• Build a checklist of laboratory records and sequence data elements forensic analysts should request; or

• Produce a focused comparison between the spike inserts and known laboratory constructs and grant-proposed edits.

Which would be most useful next?
