By Dr Catarina Carrao
Successful antibody discovery has traditionally depended on identifying the right target and understanding the mechanisms of action associated with the development of a specific disease. However, this conventional roadmap for antibody development is being challenged by how long it takes to move from target research to the initiation of a clinical program. As the search for validated targets becomes more difficult, unconventional strategies such as phenotypic screening are gaining importance.
In a “function-first” approach to antibody discovery, also known as phenotypic screening, the first step is to target a specific cell type over others; for example, a malignant cell over its normal equivalent1. By using primary cells from patients, the phenotypic discovery of antibodies allows the development of personalized drugs.
Antibodies against targets are then screened in clinically and mechanistically relevant, high-throughput, functional in vitro assays1. Finally, antibody targets are deconvoluted, and the most biologically and therapeutically promising antibodies are tested in state-of-the-art in vivo experimental models2. This strategy allows the identification of antibodies that mediate their mechanism of action not only through the biology of the target receptor, but also through effector mechanisms (e.g., Fc-receptor binding) and the further engagement of immune effector cells.
Recently, Anne Ljungars and colleagues3 from the Department of Immunotechnology at Lund University (Sweden) developed the function-FIRST antibody discovery platform (F.I.R.S.T™) for the phenotypic discovery of therapeutic antibody-target combinations. By using Chronic Lymphocytic Leukaemia (CLL) primary cells throughout the discovery process (panning, screening, and functional studies), they discovered antibodies with improved cytotoxicity compared to the standard of care (Rituximab, a CD20-specific monoclonal antibody)3. The isolated antibodies were found to target six different receptors on CLL cells (e.g., CD32, CD200, and HLA-DR), which appeared to be the most potent targets for antibody-based cytotoxicity treatment3. One of the antibodies discovered in this study is currently being tested in phase I/II clinical trials on CD32b+ B-cell lymphomas and leukemias, showing that phenotypically discovered targets are translatable to the clinic (NCT02933320, NCT03571568).
The arrival of high-throughput sequencing has brought a deeper understanding of antibody repertoires and their dynamics4. An increasing number of computational tools aid immune repertoire characterization, alongside techniques such as genome editing5 and mammalian surface display6. This not only enables the screening of full-length antibodies for antigen binding, but also allows for the integration of immunoglobulin genes to predict biophysical properties optimal for downstream therapeutic development7.
Next-generation sequencing (NGS) has brought a massive increase in capacity, at relatively low cost and with short analysis times, for thousands of antibody targets. One of the main challenges in using NGS for antibody selection relates to the size of the encoded genes. The smallest antibody fragments (variable domains) are 300–400 bp in length, while the commonly used scFv and Fab antibody fragment formats are around 700–800 bp and >1,500 bp, respectively8. Many NGS platforms can sequence up to 300–400 bp with reasonable throughput, such as the Illumina MiSeq and HiSeq9, or the Ion Torrent PGM10. Paired-end reads can also generate longer sequences, which is particularly useful for scFv formats, enabling the sequencing of multiple Complementarity-Determining Regions (CDRs) of the VH and VL domains8. Some studies have focused on the center of the antigen binding site, a major determinant of antigen binding, and have therefore sequenced only the relatively short VH CDR3 repertoire11.
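The read-length constraint described above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming that two paired-end reads must overlap by some minimum number of bases to be merged into one contiguous sequence; the function names and the 20-base default overlap are illustrative assumptions, not values taken from the cited studies.

```python
def paired_end_span(read_len: int, min_overlap: int = 20) -> int:
    """Longest amplicon (bp) that two reads of `read_len` can cover
    while still overlapping by `min_overlap` bases for merging.

    `min_overlap` is an assumed, illustrative value.
    """
    if not 0 <= min_overlap <= read_len:
        raise ValueError("min_overlap must be between 0 and read_len")
    return 2 * read_len - min_overlap


def can_merge(amplicon_len: int, read_len: int, min_overlap: int = 20) -> bool:
    """True if a paired-end run can reconstruct the full amplicon."""
    return amplicon_len <= paired_end_span(read_len, min_overlap)


# A 400 bp variable domain fits comfortably within 2 x 300 bp reads.
vh_ok = can_merge(400, 300)
# An 800 bp scFv does not merge into one contig with 2 x 300 bp reads,
# although each read still covers one end of the construct.
scfv_ok = can_merge(800, 300)
```

Under these assumptions, a 2 x 300 bp run covers amplicons of up to 580 bp as a single merged sequence; longer formats can still be read from both ends, which is how CDRs from both the VH and VL domains can be captured without full-length coverage.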
Another challenge in the use of NGS is sequencing error. DNA amplification inevitably introduces context-dependent polymerase errors (10⁻⁵ to 10⁻⁶ per base), so errors will certainly be present in large NGS datasets that include billions of bases8. In addition, NGS technologies themselves can introduce errors, such as cluster misamplification and base misincorporation4. To help identify PCR and sequencing errors, unique molecular identifiers (UMIs; stretches of 8–10 degenerate DNA bases) can be added to primers during the first two cycles of PCR amplification, so that reads sharing the same UMI have a high probability of being derived from the same original template12. Such reads can then be grouped and used for error correction.
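The grouping step can be sketched in a few lines. This is a simplified illustration, not the method of any cited tool: it assumes the UMI occupies the first bases of each read, builds a per-position majority-vote consensus, and sidesteps indels by using only reads of the most common length. The function name and the `min_reads` cutoff are hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, umi_len=8, min_reads=2):
    """Group reads by UMI and return one consensus sequence per UMI.

    Assumes (for illustration) that the first `umi_len` bases of each
    read are the UMI. UMIs seen fewer than `min_reads` times are
    dropped, since a single read cannot be error-corrected.
    """
    groups = defaultdict(list)
    for read in reads:
        groups[read[:umi_len]].append(read[umi_len:])

    consensus = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_reads:
            continue
        # Keep only reads of the most common length (no indel handling).
        length = Counter(len(s) for s in seqs).most_common(1)[0][0]
        same_len = [s for s in seqs if len(s) == length]
        # Majority vote at each position across the grouped reads.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*same_len)
        )
    return consensus


reads = [
    "AAAACCCC" + "ACGTACGT",
    "AAAACCCC" + "ACGTACGT",
    "AAAACCCC" + "ACGTTCGT",  # one substitution, outvoted by the others
    "GGGGTTTT" + "TTTTAAAA",  # singleton UMI, dropped
]
corrected = umi_consensus(reads)
```

Here the single substitution in the third read is outvoted by the two matching reads, and the singleton UMI is discarded rather than trusted.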
Once the quality of the NGS read data has been confirmed, the sheer size of NGS datasets calls for the use of bioinformatics tools. The choice of tool will depend on the number of NGS reads being analyzed. Programs such as IMPre, IgBLAST, IMGT/HighV-QUEST, and ImmunediveRsity are used to identify VH and VL germlines as well as VH and VL CDRs8. Post-processing of the output of these tools is then needed to understand the clone structure within the dataset, and to pair VH and VL domain sequences. The clone structures can be determined by applying sequence clustering tools, such as CD-HIT13 or UCLUST14, to CDR3s alone.
This can be done at either the amino acid or nucleotide sequence level (or even on the full-length sequence), in order to group closely related sequences into a “clonal” group8. Finally, custom scripts can be used to examine and review the diversity and other characteristics of the library. Fortunately, it is now possible to outsource the sequencing and/or analysis of antibody libraries to commercial suppliers, with companies offering a range of options from multiplexing and sequencing to complete analysis from purified DNA or phages (e.g., CD Genomics, MCLAB).
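To make the idea of clonal grouping concrete, here is a minimal sketch of greedy, identity-based clustering of CDR3 amino acid sequences. It is a deliberately simplified stand-in for dedicated clustering tools, not a reimplementation of them: identity is computed position by position on equal-length sequences only (no alignment), and the 0.9 threshold and function names are assumptions chosen for illustration.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions; 0.0 if lengths differ (no alignment)."""
    if len(a) != len(b):
        return 0.0
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(cdr3s, threshold=0.9):
    """Greedy clustering: each sequence joins the first cluster whose
    representative it matches at >= `threshold` identity, otherwise it
    starts a new cluster. Longer sequences are processed first.
    """
    clusters = []  # list of (representative, members)
    for seq in sorted(set(cdr3s), key=lambda s: (-len(s), s)):
        for representative, members in clusters:
            if identity(representative, seq) >= threshold:
                members.append(seq)
                break
        else:
            clusters.append((seq, [seq]))
    return [members for _, members in clusters]


cdr3s = ["ARDYYGSSYFDY", "ARDYYGSSYFDV", "ARGGLRWFDP", "ARDYYGSSYFDY"]
clones = greedy_cluster(cdr3s)
```

In this toy input, the two 12-residue sequences differ at a single position (11/12 identity) and fall into one clonal group, while the unrelated 10-residue sequence forms its own group.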
Once the library has been established, the development of computational methods to aid antibody engineering relies on the successful exploration and exploitation of these new data sources. One field that currently benefits from a steady stream of new NGS data is that of B-cell receptors (BCRs), which can be used as a proxy for antibodies15. Recently, Leonard D. Goldstein and co-workers16 from Genentech in San Francisco (USA) described the use of high-throughput single-cell B-cell receptor sequencing (scBCR-seq) to obtain accurately paired full-length variable regions in a massively parallel fashion. Besides identifying additional candidate lineages not recoverable as hybridomas, they found that 99% (n = 92/93) of the clones were antigen-reactive, establishing scBCR-seq as a powerful tool for antibody discovery16.
The isolation of thousands of antibodies that might contain specificity for all therapeutically relevant cell surface receptors makes biopanning a crucial step in identifying specific antibodies. New strategies are being developed to facilitate the biopanning procedure. For example, panning in solution using biotinylated antigens seems to offer advantages over using immobilized ones17. Chia Chiu Lim and colleagues18 from the Institute for Research in Molecular Medicine in Malaysia recently presented a new process, coined ‘YinYang’ panning, which saturates non-binders with blocking agents to promote the enrichment of positive binders with antigen preparations that have not been purified, allowing the use of crude antigens for further selection. Using whole cells for panning also appears to be a good way to present membrane-embedded target antigens, in their natural environment and conformation, to antibody-bearing phages19.
Because the panning process, with its consecutive selection rounds, tends to identify mainly the most abundant antibodies regardless of functionality, there is also a need to facilitate the discovery of low-abundance antibodies and, when appropriate, additional specificities. With this in mind, deep mining strategies have recently been used to uncover rare antibodies that bind to new targets and epitopes20.
For example, a protein depletion step can be used as a deep mining strategy to reduce the fraction of antibodies binding to previously discovered hits and to remove targets of low therapeutic interest21. NGS can also be used as a deep mining tool for the analysis of low-frequency clones20, and provides an effective way to retrieve antigen-specific scFv clones that are difficult to identify by the conventional colony screening method22.
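One way to picture NGS-based deep mining is to compare clone frequencies before and after a selection round and flag clones that are enriched yet still rare. The sketch below is purely illustrative and is not the procedure used in the cited studies; the function name, the frequency and fold-change thresholds, and the pseudocount are all assumptions.

```python
from collections import Counter


def enriched_rare_clones(before, after, max_freq=0.01, min_fold=5.0,
                         pseudocount=1):
    """Return {clone: fold_enrichment} for clones that remain rare after
    selection (frequency <= max_freq) but are enriched by at least
    `min_fold` relative to the previous round.

    `before` and `after` are lists of clone identifiers, one per read.
    A pseudocount avoids division by zero for clones unseen in `before`.
    """
    if not after:
        return {}
    counts_before, counts_after = Counter(before), Counter(after)
    hits = {}
    for clone, count in counts_after.items():
        freq_after = count / len(after)
        freq_before = (counts_before[clone] + pseudocount) / (
            len(before) + pseudocount
        )
        fold = freq_after / freq_before
        if freq_after <= max_freq and fold >= min_fold:
            hits[clone] = fold
    return hits


# Toy data: clone A dominates, B is rare and not enriched, C is rare but enriched.
round_1 = ["A"] * 990 + ["B"] * 10
round_2 = ["A"] * 9870 + ["B"] * 50 + ["C"] * 80
rare_hits = enriched_rare_clones(round_1, round_2)
```

In this toy example only clone C is reported: it accounts for under 1% of reads after selection, so colony picking would be unlikely to find it, yet its frequency has risen several-fold between rounds.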
The high-throughput screening of antibodies needs to be done with clinically predictive in vitro and in vivo assays. Despite the range of in vitro measures available to characterize the physicochemical properties of bio-therapeutics, “in vitro-in vivo” correlations (IVIVC) have yet to be established23. Implementing tools during the early phases of drug discovery to identify antibodies at risk of poor pharmacokinetics, for example, should therefore reduce the time needed by improving the lead antibody selection process. Simple in silico tools can be used to evaluate the presence or absence of certain molecular features, such as the net charge and hydrophobicity of critical parts of antibodies, which can then be followed by more complex 3D modeling assessments such as hot spot analysis, isoelectric point, or charge distribution23. In addition, in vitro high-throughput tools addressing non-specific binding (e.g., Baculovirus-ELISA, heparin binding, or extracellular matrix binding assays), binding to FcRn (human FcRn liquid chromatography), and polyspecificity (e.g., protein chips) can be used in a pharmacokinetic de-risking strategy24.
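As an illustration of the "simple in silico tools" mentioned above, the following sketch computes two crude sequence-level descriptors for an antibody region such as a CDR. Both are assumptions made for illustration: net charge is approximated by counting Lys/Arg as +1 and Asp/Glu as -1 (ignoring histidine, termini, and pKa effects), and hydrophobicity is the mean Kyte-Doolittle hydropathy value. The article does not specify which scale or charge model the cited tools use.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(seq: str) -> int:
    """Crude net charge near neutral pH: +1 for K/R, -1 for D/E."""
    seq = seq.upper()
    return sum(seq.count(aa) for aa in "KR") - sum(seq.count(aa) for aa in "DE")


def mean_hydropathy(seq: str) -> float:
    """Average Kyte-Doolittle hydropathy; higher means more hydrophobic."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


cdr3 = "ARDYYGSSYFDY"  # hypothetical CDR3 sequence
charge = net_charge(cdr3)
hydropathy = mean_hydropathy(cdr3)
```

Descriptors like these are cheap enough to compute for every clone in a library, which is why they are suited to early triage before the more expensive 3D modeling steps.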
Good solubility is also crucial to avoid in vivo antibody aggregation, which can lead to loss of activity and further degradation of the bio-therapeutic. Aggregation-prone regions (APRs) around the CDRs of therapeutic antibodies can be predicted by algorithms15. Recently, Puneet Rawat and co-workers25 from the Protein Bioinformatics Lab at the Indian Institute of Technology collected experimental data on aggregation kinetics and used machine learning to identify mutations that enhance or mitigate the protein aggregation rate. This new algorithm enables in silico optimization of antibodies with an eye on improved biophysical features25.
The identification of hydrophobic regions is also an important step in aggregation prediction, one that ideally requires a crystal structure of the antibody or a reliable homology model15. Tushar Jain and colleagues26 from Adimab in Palo Alto (USA) developed a surface-accessible area predictor that can be applied to an antibody sequence to create a propensity score that could be correlated with aggregation risk26. The low computational cost and high accuracy of this approach allow real-time assessment of the hydrophobic character of an antibody. This supports not only antibody ranking during the discovery process, but also a rational engineering approach to reduce hydrophobic liabilities26.
One of the difficulties associated with phenotypic drug discovery, in comparison with target-based discovery, is that it requires downstream identification of the targets engaged by the newly identified antibodies, a step called target deconvolution. This step is essential for understanding the compound's mechanism of action, and for using the identified hits as tools to further dissect a given biological process27. Since this is a difficult and labour-intensive step, a cell binding heat map analysis can be used to group antibodies with the same specificity together, simplifying the target deconvolution process3. This reduces the number of antibodies that must be investigated with low-throughput, high-cost techniques, such as immunoprecipitation-mass spectrometry (IP-MS).
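The grouping idea behind a cell binding heat map can be sketched as follows. This is a hypothetical simplification, not the analysis used in the cited study: each antibody's binding signal across a panel of cell types is reduced to a bound/not-bound pattern, and antibodies sharing a pattern are grouped so that only one representative per group needs to go forward to low-throughput deconvolution. The antibody names, signal values, and threshold are invented for illustration.

```python
from collections import defaultdict


def group_by_binding_pattern(binding, threshold=0.5):
    """Group antibodies whose thresholded binding profiles are identical.

    `binding` maps antibody name -> list of normalised binding signals,
    one per cell type (same order for every antibody).
    """
    groups = defaultdict(list)
    for antibody, signals in binding.items():
        pattern = tuple(signal >= threshold for signal in signals)
        groups[pattern].append(antibody)
    return [sorted(members) for members in groups.values()]


# Hypothetical normalised binding to three cell types.
binding = {
    "ab1": [0.9, 0.1, 0.8],
    "ab2": [0.7, 0.2, 0.9],
    "ab3": [0.1, 0.9, 0.1],
}
specificity_groups = group_by_binding_pattern(binding)
# One representative per group would go forward to, e.g., IP-MS.
representatives = [group[0] for group in specificity_groups]
```

In practice a correlation-based or hierarchical clustering of the continuous signals would likely be more robust than a hard threshold; the binary version is used here only to keep the example short.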
Evaluating the antibody fragment-induced activation of major histocompatibility complex class II (MHCII) molecules is also an important step in bio-therapeutics discovery15.
Computational methods can be used to identify potential MHCI- and MHCII-binding T-cell epitopes, as well as conformational B-cell epitopes28. The prediction of T-cell epitopes by machine learning is tackled mostly with neural network-based methods that evaluate the binding affinity of a given short peptide towards MHCI or MHCII29. Recently, Masato Ogishi and Hiroshi Yotsuyanagi30 from the Institute of Medical Sciences Research Hospital at the University of Tokyo (Japan) proposed a supervised machine learning framework that generates probabilistic estimates of immunogenicity, termed “immunogenicity scores”, based on computed sequence-based simulation. This method mimics the thermodynamic interactions between MHC complexes and T-cell receptor clone-types, which enables an efficient prioritization of epitope candidates highly prone to T-cell recognition in multiple individuals30.
Accelerated development of antibody tool reagents is also an essential requirement for the successful progress of therapeutic antibodies in today’s fast-paced and competitive drug development marketplace. Aaron Winters and colleagues31 from Amgen in California (USA) have recently reported a nanofluidic optoelectronic single B lymphocyte antibody screening technique (NanOBlast) applied to the generation of anti-idiotypic reagent antibodies. This platform uses structured light to manipulate living cells within microfluidic chip nanopens, and the resulting reagents can quantify free therapeutic IgG molecules directly from human patient serum, which facilitates important decision-making in biotherapeutics development31.
New approaches to developing antibody-based therapeutics are also focusing on different molecular formats, such as the H-chain-only antibody, or nanobody, which occurs naturally in camelids and sharks15. The first therapeutic approval of such a molecule, Caplacizumab in 2018, has raised interest in this type of molecule. Caplacizumab is a humanized anti-von Willebrand factor, bivalent variable-domain-only immunoglobulin fragment. It inhibits the interaction between von Willebrand factor multimers and platelets, for the treatment of thrombotic thrombocytopenic purpura (TTP), an immune-mediated deficiency disease32. The results of the phase III clinical trial have just come out, showing promising results when Caplacizumab is added to the standard treatment of affected patients.
In 2017, the annual number of antibody-based therapeutics approved in either the European Union or the United States reached double digits for the first time31. The specific binding characteristics of antibodies, together with their amenability to protein engineering, make these molecules powerful research tools and the perfect vehicle for a more straightforward road to the discovery of new medical treatments.
Dr Catarina Carrao gained a PhD in Biochemistry from Northeastern Ohio Medical and Pharmacy University and an M.Phil in Biochemistry at the University of Beira Interior. She has worked as a researcher at Max F. Perutz Laboratories, Yale Cardiovascular Center at Yale University School of Medicine, and the Center for Cardiovascular Research (CCR) at Charité Medical University.