
Reassessing the RNA-Protein Nexus for Clinical Applications

Written by Jay H. Lee, CEO & Co-founder | Mar 12, 2025 6:07:57 PM

The central dogma – DNA to RNA to protein – underpins all of biology. And while RNA keeps unveiling XKCD-worthy surprises, it’s proteins that drive cellular function and are therefore the predominant targets for most drugs. Yet in drug discovery and development, RNA-based biomarkers remain very common. Why the disparity?

The genomic era, fueled by advances in sequencing and oligonucleotide technologies, has put DNA and RNA in the spotlight. While these tools excel at reconstructing and predicting cellular processes, they fall short of directly observing and quantifying proteins with the precision needed to connect science to medicine.

In drug development and patient treatment, especially for antigen-targeted therapies like Antibody-Drug Conjugates (ADCs), leaning on RNA alone is like buying a new home from its blueprints and mockups without ever stepping inside. A gene might produce mRNA, but that’s no guarantee its protein is made or functions as expected, especially over time. Factors like translation efficiency, RNA degradation, protein turnover, and post-translational modifications keep the correlation between mRNA and protein levels moderate, typically ranging from 0.3 to 0.6 [1].

Evidence

There’s quite a lot of evidence to support this. Studies show significant variation depending on cell type or context – for example, Jovanovic and colleagues reported a correlation of 0.77-0.82 in activated mouse dendritic cells [2], while Schwanhäusser and colleagues found a lower correlation of about 0.6 in fibroblasts [3]. These differences matter for drug developers: RNA might suggest a target, but without protein data, the picture’s incomplete.

Table 1: Key studies examining the RNA-protein relationship

| Study | Correlation (R) | Tissue/Cell Type | Methods Used | Conclusion |
| --- | --- | --- | --- | --- |
| Jovanovic et al., 2015 [2] | ~0.77-0.82 | Mouse dendritic cells | Ribosome profiling, RNA-seq, pulsed-SILAC proteomics | mRNA abundance explains 59-68% of steady-state protein variance, with translational efficiency contributing significantly to protein synthesis regulation. |
| Schwanhäusser et al., 2011 [3] | ~0.6 | NIH 3T3 fibroblasts | RNA-seq, SILAC-based proteomics | mRNA accounts for ~40% of protein variation, with post-transcriptional regulation key. |
| Shalek et al., 2013 [4] | Not directly reported | Mouse dendritic cells | Single-cell RNA-seq, flow cytometry | Transcriptional noise and burst kinetics add variability to RNA-protein links. |
| Schwartz et al., 2014 [5] | Not directly reported | Yeast | Ψ-seq for pseudouridylation mapping | RNA pseudouridylation impacts stability, indirectly affecting protein levels. |
| Schulz et al., 2018 [6] | Variable (e.g., 0.68 for HER2 population, 0.16–0.45 for CK19) | Breast cancer tissue | Imaging mass cytometry | Gene-specific patterns emerge, with HER2 showing stronger correlation than CK19. |
| Ingolia et al., 2014 [7] | Not directly reported | Mouse ESCs, HEK293 cells (validation in mouse tissues) | Ribosome profiling | Pervasive translation outside annotated genes shapes RNA-protein dynamics. |
| Magnusson et al., 2022 [8] | ~0.21 (raw), ~0.86 (modeled) | Human T cells | RNA-seq, mass spectrometry, time-series modeling | Raw correlation is low (0.21); splice-variant and time-delayed models boost it to 0.86, highlighting temporal effects. |
| Taniguchi et al., 2010 [9] | 0.77 (FISH) or 0.54 (RNA-seq); single-cell correlations are ~0 | E. coli | Single-molecule imaging, transcriptomics | Ensemble measurements show moderate correlation; single-cell measurements show none, due to stochastic expression and protein/mRNA turnover differences. |
| Edfors et al., 2016 [10] | 0.39-0.79 (direct), 0.93 (with RTP) | Human cell lines (e.g., U2OS, A431, SH-SY5Y) and tissues | RNA-seq, targeted proteomics | Moderate direct correlation (0.39-0.79) improves to 0.93 with gene-specific RTP factors, reflecting translational and stability differences. |
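
To make the "temporal effects" in the Magnusson et al. row concrete, here is a minimal Python sketch of how a time delay alone can hide a real RNA-protein relationship. It is not the model used in that study. The data are synthetic, and the names `lagged_correlation` and `DELAY`, along with the 3-step delay, are assumptions made purely for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def lagged_correlation(mrna, protein, lag):
    """Correlate mRNA at time t with protein at time t + lag."""
    if lag == 0:
        return pearson(mrna, protein)
    return pearson(mrna[:-lag], protein[lag:])

# Synthetic toy data (NOT from any study): the protein trace is simply
# the mRNA trace shifted DELAY time points later.
DELAY = 3
signal = [math.sin(2 * math.pi * t / 12) for t in range(-DELAY, 24)]
mrna = signal[DELAY:]      # mRNA at t = 0..23
protein = signal[:-DELAY]  # protein at t equals mRNA at t - DELAY

for lag in range(6):
    print(f"lag {lag}: r = {lagged_correlation(mrna, protein, lag):.2f}")
```

In this toy setup the same-time-point correlation is essentially zero even though protein is fully determined by earlier mRNA, and the correlation recovers once the series are aligned at the true delay. Real data add noise, degradation, and gene-specific kinetics, so the effect is rarely this clean.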

A correlation of 0.6 might sound solid, but variance explained scales with the square of the correlation: mRNA then accounts for only about 36-40% of the variation in protein levels, leaving roughly 60% unexplained [3]. For ADCs, where success hinges on proteins like HER2 or EGFR being present on tumor cells, that’s a gamble. RNA-seq can flag a lead, but if the protein doesn’t materialize, the therapy flops. Even when the protein is present, heterogeneity within the tumor and across the patient cohort dilutes the observable response. That means larger cohorts and longer trials, and it raises the risk of even costlier trial failures when patient stratification and treatment biomarkers are lacking [11]. Proteomics isn’t optional; it’s the reality check. For solid tumors, understanding the spatial relationship among the protein targets, drugs, and immune modifiers can be critical [6].
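
The arithmetic behind that gap is worth a quick look. The sketch below squares a few of the correlations quoted in this article's tables to get the fraction of protein variance that mRNA accounts for. It assumes a Pearson correlation and a simple linear relationship, so treat it as a rough guide: several of the cited studies report Spearman correlations, where the squared value is only an approximation. The function name `variance_explained` is mine, not from any of the papers.

```python
def variance_explained(r: float) -> float:
    """Fraction of variance accounted for under a simple linear fit (r squared)."""
    return r * r

# Correlations quoted in this article's tables (labels for illustration only).
examples = {
    "Schwanhäusser 2011, fibroblasts": 0.60,
    "Nusinow 2020, CCLE (mean)": 0.48,
    "Mertins 2016, breast cancer (median)": 0.39,
    "Mertins 2016, ERBB2": 0.84,
}

for label, r in examples.items():
    explained = variance_explained(r)
    print(f"{label}: r = {r:.2f}, explained = {explained:.0%}, "
          f"unexplained = {1 - explained:.0%}")
```

For r = 0.6 this gives 36% explained and 64% unexplained, which is why a "decent" correlation still leaves most of the protein picture open.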

Again, this relationship isn’t just academic – it directly impacts drug development, especially for drug classes like radioligand-based therapies and ADCs. These drugs marry a targeting molecule, such as an antibody, to a toxic payload, and they rely on the target protein being expressed as expected to confer specificity. Researchers have mined datasets like TCGA and GTEx to see how RNA and protein align for ADC targets. The pattern holds: moderate correlations with some standouts, shaped by cancer type, immune context, and molecular quirks.

Table 2: Examining the RNA-protein relationship in the context of drug development

| Study | Corr. (R) | Dataset/Tissue | Methods | Conclusion |
| --- | --- | --- | --- | --- |
| Zhang et al., 2014 [12] | 0.47 (avg, steady-state), 0.23 (avg, gene variation) | TCGA colorectal cancer | RNA-seq, mass spectrometry | RNA-protein correlations are modest; mRNA abundance did not reliably predict differences in protein abundance between tumors. |
| Mertins et al., 2016 [13] | 0.39 (median, global), e.g., 0.84 (ERBB2) | TCGA (breast cancer) | RNA-seq, mass spectrometry | RNA-protein correlations vary by subtype; HER2-enriched cancers show tighter links. |
| Gholami et al., 2013 [14] | ~0.5-0.76 | Human cancer cell lines | RNA-seq, quantitative proteomics | Moderate to high correlations between transcriptome and proteome, with post-transcriptional regulation influencing drug-response proteins potentially relevant to cancer therapeutics. |
| Coscia et al., 2016 [15] | Not directly reported | Human ovarian cancer tissues and cell-line xenografts | Mass spectrometry, transcriptomics | Protein-level validation is critical for identifying immunotherapy targets, including potential ADC-relevant antigens, in ovarian cancer. |
| Nusinow et al., 2020 [16] | 0.48 (mean) | Human cell lines (CCLE) | RNA-seq, mass spectrometry | RNA explains ~23% of protein variance on average; cell cycle and post-transcriptional regulation, including protein stability, contribute to differences between RNA and protein expression. |
| Jiang et al., 2019 [17] | ~0.36-0.66 | TCGA (hepatocellular carcinoma) | RNA-seq, proteomics | Immune infiltration and tumor heterogeneity modulate RNA-protein correlations. |
| Clark et al., 2019 [18] | 0.43-0.44 (median) | CPTAC (ccRCC), TCGA | RNA-seq, mass spectrometry | RNA-protein correlations in ccRCC are similar to other cancers; N-linked glycosylation is upregulated in aggressive subtypes but not linked to detection issues. |

HER2 is a standout here, syncing mRNA and protein tightly (R between 0.68 and 0.84 in the breast cancer data above [6, 13]), making it a rare case where RNA might suffice. Most targets, though, linger in that 0.4-0.6 range, with factors like glycosylation or immune infiltration adding twists.
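
One detail in Table 2 is easy to miss: the Zhang et al. row lists two correlations, 0.47 and 0.23. A common reading is that the first compares genes within a sample (abundant transcripts tend to yield abundant proteins), while the second follows one gene across samples, which is the comparison that matters when picking patients. The Python sketch below shows how the two can diverge. Every gene name and value in it is made up, so it illustrates the distinction and does not reanalyze any dataset.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical toy data: one list per gene, one value per tumor sample.
mrna = {
    "GENE_A": [10.0, 11.0, 9.0, 10.0],
    "GENE_B": [5.0, 4.0, 6.0, 5.0],
    "GENE_C": [1.0, 2.0, 1.0, 2.0],
}
protein = {
    "GENE_A": [9.0, 9.0, 10.0, 8.0],
    "GENE_B": [6.0, 5.0, 6.0, 7.0],
    "GENE_C": [2.0, 3.0, 1.0, 2.0],
}
genes = list(mrna)
n_samples = len(mrna[genes[0]])

# Within each sample: correlate across genes.
within_sample = mean(
    pearson([mrna[g][s] for g in genes], [protein[g][s] for g in genes])
    for s in range(n_samples)
)

# For each gene: correlate across samples.
per_gene = mean(pearson(mrna[g], protein[g]) for g in genes)

print(f"mean within-sample correlation (across genes): {within_sample:.2f}")
print(f"mean per-gene correlation (across samples):    {per_gene:.2f}")
```

In this toy data the within-sample number is high simply because the genes sit at very different abundance levels, while the per-gene number is low. For patient selection on a single target, the per-gene, across-patient correlation is usually the relevant one.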

A Closing Thought

RNA research has flipped biology’s script. RNA-seq has unlocked research into gene regulation, cell states, and disease mechanisms with speed and breadth proteomics can’t match. It’s a discovery engine – dynamic and scalable. But when it’s time to act clinically, especially with protein-targeting therapies like ADCs, RNA alone won’t cut it.

A blueprint’s crucial for dreaming up your new home, but you’d never move in without inspecting the real thing. RNA hints at what might be; proteins prove what is. In drug development and clinical medicine, that gap is everything. For picking patients, predicting outcomes, or driving precision medicine, protein data’s the clincher.

References

  1. Maier, T. et al. (2009). Correlation of mRNA and protein in complex biological samples. FEBS Lett. 583, 3966–3973. pmid:19850042 doi:10.1016/j.febslet.2009.10.036.
  2. Jovanovic, M. et al. (2015). Dynamic profiling of the protein life cycle in response to pathogens. Science 347, 1259038. pmid:25745177 doi:10.1126/science.1259038.
  3. Schwanhäusser, B. et al. (2011). Global quantification of mammalian gene expression control. Nature 473, 337–342. pmid:21593866 doi:10.1038/nature10098.
  4. Shalek, A.K. et al. (2013). Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498, 236–240. pmid:23685454 doi:10.1038/nature12172.
  5. Schwartz, S. et al. (2014). Transcriptome-wide Mapping Reveals Widespread Dynamic-Regulated Pseudouridylation of ncRNA and mRNA. Cell 159, 148–162. pmid:25219674 doi:10.1016/j.cell.2014.08.028.
  6. Schulz, D. et al. (2018). Simultaneous Multiplexed Imaging of mRNA and Proteins with Subcellular Resolution in Breast Cancer Tissue Samples by Mass Cytometry. Cell Syst. 6, 25-36.e5. pmid:29289569 doi:10.1016/j.cels.2017.12.001.
  7. Ingolia, N.T. et al. (2014). Ribosome Profiling Reveals Pervasive Translation Outside of Annotated Protein-Coding Genes. Cell Rep. 8, 1365–1379. pmid:25159147 doi:10.1016/j.celrep.2014.07.045.
  8. Magnusson, R. et al. (2022). RNA-sequencing and mass-spectrometry proteomic time-series analysis of T-cell differentiation identified multiple splice variants models that predicted validated protein biomarkers in inflammatory diseases. Front. Mol. Biosci. 9, 916128. pmid:36106020 doi:10.3389/fmolb.2022.916128.
  9. Taniguchi, Y. et al. (2010). Quantifying E. coli Proteome and Transcriptome with Single-Molecule Sensitivity in Single Cells. Science 329, 533–538. pmid:20671182 doi:10.1126/science.1188308.
  10. Edfors, F. et al. (2016). Gene‐specific correlation of RNA and protein levels in human cells and tissues. Mol. Syst. Biol. 12, 883. pmid:27951527 doi:10.15252/msb.20167144.
  11. Vogel, C., and Marcotte, E.M. (2012). Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat. Rev. Genet. 13, 227–232. pmid:22411467 doi:10.1038/nrg3185.
  12. Zhang, B. et al. (2014). Proteogenomic characterization of human colon and rectal cancer. Nature 513, 382–387. pmid:25043054 doi:10.1038/nature13438.
  13. Mertins, P. et al. (2016). Proteogenomics connects somatic mutations to signalling in breast cancer. Nature 534, 55–62. pmid:27251275 doi:10.1038/nature18003.
  14. Gholami, A.M. et al. (2013). Global Proteome Analysis of the NCI-60 Cell Line Panel. Cell Rep. 4, 609–620. pmid:23933261 doi:10.1016/j.celrep.2013.07.018.
  15. Coscia, F. et al. (2016). Integrative proteomic profiling of ovarian cancer cell lines reveals precursor cell associated proteins and functional status. Nat. Commun. 7, 12645. pmid:27561551 doi:10.1038/ncomms12645.
  16. Nusinow, D.P. et al. (2020). Quantitative Proteomics of the Cancer Cell Line Encyclopedia. Cell 180, 387-402.e16. pmid:31978347 doi:10.1016/j.cell.2019.12.023.
  17. Jiang, Y. et al. (2019). Proteomics identifies new therapeutic targets of early-stage hepatocellular carcinoma. Nature 567, 257–261. pmid:30814741 doi:10.1038/s41586-019-0987-8.
  18. Clark, D.J. et al. (2019). Integrated Proteogenomic Characterization of Clear Cell Renal Cell Carcinoma. Cell 179, 964-983.e31. pmid:31675502 doi:10.1016/j.cell.2019.10.007.