By David Orchard-Webb
The global scientific community's attention has turned to COVID-19, a pandemic that has swept across the world, causing 276,863 deaths (as of May 8, 2020) since emerging in China in late December 2019. The causative agent of COVID-19, the coronavirus SARS-CoV-2, has no vaccine, and no biologics or small molecules have yet been approved to treat it. Protein engineering has a major opportunity to make an impact on this disease, which currently has no cure. To do so, the industry will need to move engineered proteins rapidly from preclinical to clinical studies, in many cases implementing next-generation protein engineering techniques, to obtain regulatory approval. Virus containment measures have ground the world’s economy to a halt. Protein engineering is one of the best strategies to turn the tide against the virus and bring the global economy back from the brink of disaster.
The challenges of protein engineering for therapeutic applications in preclinical development depend on the application. Protein engineering generally faces the following universal challenges:
Separation and Purification
Traditionally, the molecular weight and charge properties of the recombinant protein are known and can be used to separate it from impurities by size-exclusion (gel-filtration) chromatography and ion-exchange chromatography, amongst other chromatographic methods. Protein tagging, purification and then cleavage of the tag can be particularly useful; however, carefully designed tagless purification is also possible.
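To illustrate how molecular weight and charge can be estimated from a sequence before choosing a chromatographic method, here is a minimal Python sketch. It uses standard average residue masses and a deliberately crude charge estimate that counts Lys/Arg as +1 and Asp/Glu as -1 at neutral pH, ignoring histidine, the termini and any modifications. The function names and the example sequence are hypothetical and are not taken from any particular tool.

```python
# Average residue masses in daltons for amino acids within a peptide chain.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added for the free N- and C-termini


def average_mass(sequence):
    """Return the average molecular mass (Da) of an unmodified peptide."""
    seq = sequence.upper()
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def crude_net_charge(sequence):
    """Rough net charge near pH 7: (Lys + Arg) minus (Asp + Glu).

    Assumption: histidine, termini and modifications are ignored.
    """
    seq = sequence.upper()
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return positive - negative


if __name__ == "__main__":
    example = "MKTAYIAKQRDE"  # hypothetical sequence, for illustration only
    print(f"mass   = {average_mass(example):.1f} Da")
    print(f"charge = {crude_net_charge(example):+d}")
```

A positive estimate would point towards cation exchange and a negative one towards anion exchange, although a real purification scheme would rely on a proper isoelectric point calculation and experimental screening.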
Improvements in upstream yield mean that next-generation, chromatography-free purification techniques are becoming viable, such as aggregating tags including elastin-like polypeptide (ELP), the repeat-in-toxin (RTX) domain, and ELK16. Aggregation can be triggered by temperature, salt or calcium.
Ideally, purification tags are self-cleaving, as proteases don’t always make a “clean” cut and need an additional purification step for removal. Self-cleaving tags can be generated by integrating an intein (internal protein) splicing domain. A commonly used intein is derived from the vacuolar ATPase subunit (VMA) of Saccharomyces cerevisiae, which is triggered to excise by a thiol reagent. A drawback of intein systems is that although they function well in prokaryotic cells, they can cleave prematurely in eukaryotic cells; however, more reliable inteins have been engineered for Chinese hamster ovary (CHO) cells.
Inteins can also be used in the reverse direction, to increase the solubility of an otherwise insoluble protein. Surface resident Ebola virus glycoprotein (EbolaGP) is responsible for membrane binding and virus entry, and is a target for vaccine development. Production of EbolaGP is hampered by its insoluble aggregation. A C-terminal intein-based tag greatly enhances the solubility of EbolaGP and allows one-step chromatographic purification of the untagged EbolaGP through thiol-catalyzed self-cleavage. The purified untagged EbolaGP was highly immunogenic in a mouse model. Similar strategies could of course be applied to SARS-CoV-2 vaccine development using virus derived soluble proteins.
Solubility, bioavailability & immunogenicity
Engineering proteins to dimerize can be helpful for purification, as their increased size makes it simpler to separate them from smaller molecular weight impurities such as albumin. Dimerization can also improve solubility, bioavailability and half-life, and reduce aggregation. For example, larger proteins experience less renal clearance, which increases half-life. The generation of fusion proteins with an Fc domain from antibodies can also increase half-life and bioavailability.
Fc domain fusion proteins are composed of an immunoglobulin Fc that is genetically fused to the therapeutic protein. They first came to the fore in the late 1980s as a means to block HIV entry into cells (CD4-Fc). Therapeutic proteins in the blood are constantly internalized by endothelial cells and degraded in the lysosome; however, the Fc domain can interact with the salvage neonatal Fc receptor (FcRn) inside the endothelial cell, which returns the Fc fusion protein to the blood. In this way the Fc domain endows the fusion protein with a half-life similar to, but generally shorter than, that of the long-lived IgG and serum albumin, which use the same mechanism. Through engineering of the Fc it may be possible to further improve the half-life.
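To make the practical effect of a longer half-life concrete, the following Python sketch assumes simple first-order elimination, which is a simplification of real pharmacokinetics. The half-life values used in the example are hypothetical placeholders chosen only for illustration, not measured values for any product.

```python
def fraction_remaining(hours_elapsed, half_life_hours):
    """Fraction of a dose left in circulation, assuming first-order elimination."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return 0.5 ** (hours_elapsed / half_life_hours)


if __name__ == "__main__":
    # Hypothetical half-lives, for illustration only.
    unfused_half_life = 2.0       # hours, assumed for an unfused protein
    fc_fusion_half_life = 100.0   # hours, assumed for an Fc fusion
    for label, half_life in [("unfused", unfused_half_life),
                             ("Fc fusion", fc_fusion_half_life)]:
        left = fraction_remaining(24.0, half_life)
        print(f"{label}: {left:.4%} of the dose remains after 24 h")
```

Under these assumed values, almost none of the unfused protein remains after a day while most of the Fc fusion is still circulating, which is why half-life extension can reduce dosing frequency.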
Fc fusions that form multivalent protein complexes can also be envisioned. Such complexes could possibly clear viral particles more efficiently, although this has yet to be demonstrated.
SARS-CoV-2 can provoke a cytokine storm in some patients, so further stimulation of the immune system in severe COVID-19 is not ideal. On the other hand, immune stimulation at the early stages of the disease, before progression to acute inflammation, could rapidly clear the infection before it becomes severe, provided it does not itself trigger a cytokine storm. Such scenarios would need to be tested preclinically. Indeed, consensus has yet to be reached regarding the most appropriate preclinical models of COVID-19.
The choice of Fc domain determines whether the fusion stimulates or evades the immune system (IgG1 vs IgG2/4 Fc). The Fc domain is also subject to glycosylation, which is required for effector function; Fc immune properties can therefore be modulated by the cell line used for production.
Glyco Engineering
Protein glycosylation can be necessary for therapeutic function. It also influences the yield, pharmacokinetics, and immunogenicity of recombinant therapeutic proteins. Correct human-type glycosylation of recombinant proteins poses a challenge for industry, as the four main cell types used for production are not human: Escherichia coli, Saccharomyces cerevisiae, Pichia pastoris, and CHO cells. Various plant and insect cells are also used successfully. There are at least six human cell lines used for recombinant protein production, including HEK293 and PER.C6, but more than half of all recombinant proteins are produced in CHO.
Correct glycosylation in non-human cell lines requires glyco-engineering strategies. Non-human cell lines often produce glycans that are not present in humans and may be attached to the recombinant proteins, rendering them immunogenic or reducing their efficacy. One such example is aberrant fucosylation, which has been successfully mitigated in CHO cells. Glyco-engineered CHO fucosylation knockout cell lines can produce human IgG1 Fc domain with increased affinity for natural killer cells.
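When planning glyco-engineering it can help to know where N-linked glycans might be attached. The Python sketch below scans a sequence for the N-linked glycosylation consensus sequon, Asn-X-Ser/Thr where X is any residue except proline. It only flags candidate sites, since whether a sequon is actually occupied depends on the host cell and protein context. The function name and test sequences are hypothetical.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead consumes only the Asn, so overlapping sequons are all reported.
SEQUON_PATTERN = re.compile(r"N(?=[^P][ST])")


def find_n_glyc_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [match.start() + 1 for match in SEQUON_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    for seq in ["MNGTK", "ANNTSA", "NPSA"]:  # hypothetical test sequences
        print(seq, find_n_glyc_sequons(seq))
```

A protein with no candidate sequons, or whose sequons are dispensable for function, is a more natural fit for hosts that lack human-type glycosylation.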
Photosynthetic Cell Lines for Bioproduction
Not all human proteins are heavily glycosylated. One example is interferon alpha (IFN-α), an innate antiviral protein under clinical investigation for the treatment of COVID-19, and this simplifies its production requirements.
Low-cost biopharmaceutical proteins can be generated by recombinant DNA technologies in microalgae and plants by a direct photosynthetic production process, which can potentially dramatically reduce the energy input required compared to other cell types.
Eukaryotic fusion proteins can be produced in photosynthetic cyanobacteria at up to ∼20% of the total protein, provided they have a high expression leader sequence. This leader sequence could contain a self-cleaving intein.
Human interferon production in the cyanobacterium Synechocystis sp. PCC 6803 has been demonstrated; however, the leader sequence was not removed, and the activity of the interferon was lower than wild-type due to the presence of this leader. The proof of principle has been demonstrated, but refinement is needed. Bioactive human interferon-γ (IFN-γ) has also been produced in tobacco plants.
ACE2 - Angiotensin-Converting Enzyme 2
Safety monitoring is moving beyond traditional approaches to use sophisticated AI algorithms that identify safety signals arising from rare adverse events. Furthermore, these signals could be captured from a variety of sources like Websites and search engines. Other sources can include electronic medical records, and consumer-generated media, which can be identified in real-time to identify early signals regarding safety issues of pharmaceutical products. A prompt and timely response on the part of the pharmaceutical manufacturer to physician and patient concerns could prevent regulatory and public-relations backlashes.
CD26 / DPP4 - Dipeptidyl Peptidase-4
CD26, a type II transmembrane glycoprotein and serine protease, is expressed in many tissues, including lung and immune cells. In silico data suggest that it may act as a secondary cellular receptor for SARS-CoV-2, but the small amount of in vitro data produced thus far in 293T cells does not support this. In silico molecular docking suggests a possible tight interaction between the SARS-CoV-2 S1 spike domain loops and CD26.
Cellular context can be important due to glycosylation-based heterogeneity among other membrane factors, as is the nature of the putative binding site for CD26 on the viral spike protein. There is not enough experimental data to draw any firm conclusions with regard to SARS-CoV-2 binding of CD26; however, co-purification with the related MERS-CoV S1 domain demonstrates that CD26 can bind to at least some pathogenic coronaviruses. A patented soluble fusion protein consisting of a modified CD26 consensus binding sequence for the MERS-CoV S1 spike glycoprotein and an antibody Fc domain has been developed. The Fc domain would be expected to improve the half-life of CD26 in the bloodstream. This receptor decoy fusion protein, called DPP4-Fc, may prevent MERS-CoV from infecting human lung cells in vitro. Development was never completed due to lack of funding.
DPP4-Fc is produced in glyco-engineered tobacco plants. It has been tested preclinically, but it is not known whether the construct would need further modification to act as a receptor decoy for the SARS-CoV-2 spike protein. Further preclinical development and testing is needed.
Monoclonal antibodies against CD26 may also be effective inhibitors of viral binding and/or CD26 protein function.
Diagnostic Bispecific Antibodies Targeting Coronavirus Spike Protein
For diagnostic purposes, bispecific monoclonal antibodies that bind the 2003 SARS-CoV-1 spike protein have been developed. One arm binds the spike protein while the other binds horseradish peroxidase, allowing single-step detection in an ELISA with a detection limit of 0.019 g/ml.
A similar bispecific antibody for sensitive detection of 2019 SARS-CoV-2 could be developed. Currently most ELISA assays attempt to detect antibodies against SARS-CoV-2 but not the virus itself. Detecting virus by ELISA could confirm PCR and potentially detect virus earlier than assays that detect immunoglobulins.
A bispecific antibody approach for ELISA has two obvious advantages compared to traditional sandwich ELISAs: 1) the total absence of background and 2) higher sensitivity.
Such bispecific antibodies could potentially also be used for one-step immunohistochemistry to determine the tissue distribution of 2019 SARS-CoV-2.
Targeting COVID-19 Inflammation and Diabetes
Severe COVID-19 is characterized by an inflammatory profile that is not unlike cytokine release syndrome. Corticosteroids, IL-6 inhibitors and other modulators of inflammation are under clinical investigation for severe COVID-19. Those with preexisting inflammatory conditions such as metabolic syndrome may be vulnerable to COVID-19; however, further research is needed to understand the risks.
Diabetes may be a significant comorbidity of COVID-19. Data from Italy indicate that more than two-thirds of those who die from COVID-19 have diabetes. Notably, SARS-CoV-1 (2003), which is reported to use ACE2 as a cellular receptor like SARS-CoV-2 (2019), can damage pancreatic islets and cause acute diabetes. Ambient hyperglycemia was reported to be an independent predictor of mortality and morbidity in SARS patients. Although its functions are not fully understood, CD26 plays a major role in glucose and insulin metabolism. It inactivates glucagon-like peptide-1 (GLP-1) and other incretin peptides. Preventing CD26-mediated degradation of gut hormones such as GLP-1 potentiates islet hormone secretion and enhances metabolism, reducing hyperglycemia in patients with type 2 diabetes. CD26 is also expressed in the lungs, where it modulates the function of various proinflammatory cytokines, growth factors and vasoactive peptides in the deep respiratory tract.
It is believed that the use of a CD26 inhibitor in diabetics hospitalized for COVID-19 may reduce inflammatory lung disease. Among the drugs that selectively block CD26, the one with the greatest affinity is the small molecule sitagliptin, which is currently in a randomized, controlled, open-label phase 3 intervention study of patients hospitalized for COVID-19 and affected by type 2 diabetes mellitus.
GLP-1 Mimetics
Downstream of CD26, GLP-1 may be of therapeutic benefit for diabetics with COVID-19 and could therefore be a target for drug repurposing of FDA-approved GLP-1 biologics. For example, GLP-1 agonists have been shown to protect against encephalomyocarditis virus (EMCV)-induced diabetes in a mouse model. Table 1 lists two FDA-approved GLP-1 mimetic proteins; peptides are excluded.
Novavax is developing a SARS-CoV-2 recombinant spike protein nanoparticle vaccine, NVX-CoV2373, produced in Sf9 insect cells. High titres of spike protein-specific antibodies with human ACE2 receptor binding domain blocking activity, along with antibodies that neutralize wild-type SARS-CoV-2, were observed after a single immunization. In addition, the already high microneutralization titres seen after one dose increased eightfold with a second dose. High-titre microneutralizing antibodies are a positive signal of possible efficacy in people. A phase I clinical trial has been initiated, with preliminary human data expected in July 2020.
Good progress has been made in developing protein biologics for treating COVID-19. Upwards of 300 papers are published per day on the subject of COVID-19, some of which provide valuable clues as to the optimal treatment strategy. Research surrounding the coronavirus spike protein and its interaction with its putative receptors in different cell types is likely to uncover further information that can be used in biologic treatment strategies.