TeM-DTBA: time-efficient drug target binding affinity prediction using multiple modalities with Lasso feature selection

Perspective, published in Journal of Computer-Aided Molecular Design

Abstract

Drug discovery, especially virtual screening and drug repositioning, can be accelerated through a deeper understanding and better prediction of drug-target interactions (DTIs). Advances in deep learning, together with the time and financial costs of conventional wet-lab experiments, have made computational methods for DTI prediction more popular. However, most of these methods treat DTI prediction as a binary classification task, ignoring the quantitative binding affinity that determines a drug's efficacy against its target protein. Moreover, a model's memory requirements and execution time are often neglected in favor of accuracy. To address these challenges, we introduce a novel method, Time-efficient Multimodal Drug Target Binding Affinity (TeM-DTBA), which predicts the binding affinity between drugs and targets by fusing different modalities derived from compound structures and target sequences. We employ the Lasso feature selection method, which lowers the dimensionality of the feature vectors and speeds up training of the proposed model by more than 50%. Results on two benchmark datasets demonstrate that our method outperforms state-of-the-art methods. The mean squared errors of 18.8% and 23.19% achieved on the KIBA and Davis datasets, respectively, suggest that our method is more accurate in predicting drug-target binding affinity.
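
To illustrate how Lasso feature selection can lower the dimensionality of a feature matrix, the following is a minimal sketch in pure Python. It is not the authors' implementation: the synthetic data, the penalty strength `alpha`, the helper names, and the coordinate-descent solver are all assumptions made for illustration. In practice one would more likely use a library such as scikit-learn (`Lasso` together with `SelectFromModel`). The sketch minimizes (1/2n)·||y − Xw||² + alpha·||w||₁ without an intercept and keeps only the features whose coefficients remain non-zero.

```python
import random


def soft_threshold(rho, alpha):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Fit Lasso weights by cyclic coordinate descent (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, since w starts at zero
    columns = [[row[j] for row in X] for j in range(p)]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            z = sum(v * v for v in col) / n
            # Correlation of feature j with the residual, with the
            # feature's own current contribution added back in.
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
            w[j] = new_w
    return w


# Hypothetical data: 200 samples, 8 features, only features 0 and 3 matter.
random.seed(0)
n_samples, n_features = 200, 8
X = [[random.gauss(0.0, 1.0) for _ in range(n_features)]
     for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[3] + random.gauss(0.0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.1)  # alpha is an assumed value

# Keep only the columns whose Lasso coefficient was not shrunk to zero.
selected = [j for j, c in enumerate(weights) if abs(c) > 1e-8]
X_reduced = [[row[j] for j in selected] for row in X]

print("selected feature indices:", selected)
print("reduced width:", len(X_reduced[0]), "of", n_features)
```

A downstream model trained on `X_reduced` sees fewer input dimensions than one trained on `X`, which is the mechanism by which feature selection can shorten training time. The size of any speed-up depends on the model and data.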

Data availability

The datasets used in the current study are available in the repository https://github.com/hkmztrk/DeepDTA/tree/master/data.

Notes

  1. https://pypi.org/project/padelpy/

  2. https://github.com/agemagician/ProtTrans

Author information

Contributions

TL: methodology, software, validation, formal analysis, investigation, data curation, writing—original draft. TA: validation, formal analysis, investigation, writing—review and editing. CS: writing—review and editing, formal analysis, conceptualization, validation, visualization.

Corresponding author

Correspondence to Tanya Liyaqat.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Ethical approval

The data used in this study adhere to standard ethical rules and regulations.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liyaqat, T., Ahmad, T. & Saxena, C. TeM-DTBA: time-efficient drug target binding affinity prediction using multiple modalities with Lasso feature selection. J Comput Aided Mol Des 37, 573–584 (2023). https://doi.org/10.1007/s10822-023-00533-1

  • DOI: https://doi.org/10.1007/s10822-023-00533-1
