All data sets are publicly available:

  • Download all ID lists as a ZIP
  • Download all PDB files as a ZIP
  • Download all sequence files as a ZIP
  • Download all binding site assignments as a ZIP
  • Readme

Datasets


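Numbers in square brackets refer to the reference list below.

RNA binding proteins

| Year | Program | Dataset | Protein# | Cutoff (Å)(a) | Resolution (Å) | SeqId (%)(b) | StrId(c) | Update#(d) |
|---|---|---|---|---|---|---|---|---|
| 2006 | BindN [1] | BindN_R107 | 107 | 3.5 | 3.5 | 25 | NA | 95 |
| 2008 | DR_bind1 [2] | DR_bind1_R69 | 69 | 3.5 | 3 | NA | CATH | 69 |
| 2014 | DR_bind1 [3] | DR_bind1_R81 | 81 | 3.5 | 3 | NA | CATH | 79 |
| 2006 | KYG [4] | KYG_R86 | 86 | 3.5 | NA | 50 | NA | 85 |
| 2007 | PPRInt [5] | PPRInt_R86 | 86 | 3.5 | 3 | 70 | NA | 83 |
| 2010 | PRNA [6] | PRNA_R205 | 205 | 3.5 | 3 | 25 | NA | 189 |
| 2011 | SRCPred [7] | SRCPred_R160 | 160 | 3.5 | NA | 25 | NA | 124 |
| 2011 | PRBR [8] | PRBR_R180 | 180 | 3.5 | 3.5 | 25 | NA | 142 |
| 2006 | RNABindR [9] | RNABindR_R106 | 106 | 3.5 | NA | NA | NA | 100 |
| 2007 | RNABindR [10] | RNABindR_R109 | 109 | 3.5 | 3.5 | 30 | NA | 100 |
| 2007 | RNABindR [10] | RNABindR_R144 | 144 | 3.5 | NA | NA | NA | 137 |
| 2007 | RNABindR [10] | RNABindR_R147 | 147 | 3.5 | 3.5 | 30 | NA | 138 |
| 2007 | RNABindR [10] | RNABindR_R198 | 198 | 3.5 | 3.5 | 30 | NA | 187 |
| 2014 | RNABindRPlus [11] | RNABindR_R44 | 44 | 3.5 | NA | 40 | NA | 44 |
| 2014 | RNABindRPlus [11] | RNABindR_R111 | 111 | 3.5 | 3.5 | 30 | NA | 101 |
| 2012 | meta2 [12] | meta2_R44 | 44 | 3.5 | NA | 40 | NA | 44 |
| 2014 | aaRNA [13] | aaRNA_R67 | 67 | 3.5 | NA | 30 | NA | 67 |
| 2014 | aaRNA [13] | aaRNA_R141 | 141 | 3.5 | 3 | 25 | NA | 136 |
| 2014 | aaRNA [13] | aaRNA_R205 | 205 | 3.5 | 3 | 25 | NA | 200 |
| 2015 | RBscore | RBscore_R130 | 130 | 3.5 | 3.5 | 25 | TMscore<0.7 | 130 |
| 2015 | RBscore | RBscore_R116 | 116 | 3.5 | 3.5 | 25 | TMscore<0.7 | 116 |
| 2011 | [14] | Sungwook_R267 | 267 | 3.5 | 3 | 60 | NA | 178 |
| 2011 | [14] | Sungwook_R727 | 727 | 3.5 | 3 | NA | NA | 574 |
| 2011 | [14] | Sungwook_R3149 | 3149 | 3.5 | 3 | NA | NA | 2632 |
| 2015 | After2014 | New_R15 | 15 | | | 25 | TMscore<0.7 | 15 |

DNA binding proteins

| Year | Program | Dataset | Protein# | Cutoff (Å)(a) | Resolution (Å) | SeqId (%)(b) | StrId(c) | Update#(d) |
|---|---|---|---|---|---|---|---|---|
| 2006 | BindN [1] | BindN_D62 | 62 | 3.5 | NA | 25 | NA | 66 |
| 2010 | [15] | ProteDNA_D253 | 253 | 3.5 | 3.5 | 20 | NA | 253 |
| 2008 | Pro-dna [16] | Pro-dna_D99 | 99PDB | 3.5 | 3 | 20 | NA | 188 |
| 1999 | [17] | Hidetoshi_D52 | 52 | 3.5 | 3.2 | NA | NA | 49 |
| 2008 | [18] | Shandar_D140 | 140 | 3.5 | 2.5 | 25 | NA | 138 |
| 2003 | [19] | Susan_D56 | 56 | 3.5 | 3 | NA | CATH | 54 |
| 2008 | DBD-Hunter [20] | DBD-Hunter_D179 | 179 | 3.5 | 3 | 35 | NA | 177 |
| 2001 | [21] | Luscombe_D129 | 129 | 3.5 | 3 | NA | NA | 182 |
| 2000 | DBindR [22] | DBindR_D374 | 374 | 3.5 | 3.5 | 25 | NA | 329 |
| 2007 | DISPLAR [23] | DISPLAR_D428 | 428 | 3.5 | NA | 50 | NA | 390 |
| 2010 | DNABINDPROT [24] | DNABINDPROT_D54 | 54 | 3.5 | NA | NA | NA | 50 |
| 2013 | PreDNA [25] | PreDNA_D224 | 224 | 3.5 | 3 | 25 | NA | 216 |
| 2015 | RBscore | RBscore_D381 | 381 | 3.5 | 3.5 | 25 | NA | 381 |
| 2011 | metaDBSite [26] | metaDBSite_D232 | 232 | 3.5 | 3 | 30 | NA | 225 |
| 2011 | metaDBSite [26] | metaDBSite_D316 | 316 | 3.5 | 3 | 30 | NA | 308 |
| 2009 | SDCPred [27] | SDCPred_D159 | 159 | 3.5 | 2.5 | 25 | NA | 158 |
| 2015 | After2014 | New_D31 | 31 | | | 25 | TMscore<0.7 | 31 |

Sum

| Year | Program | Dataset | Protein# | Cutoff (Å)(a) | Resolution (Å) | SeqId (%)(b) | StrId(c) | Update#(d) |
|---|---|---|---|---|---|---|---|---|
| 2015 | RBscore | RBscore_P627 | 627 | 3.5 | 3.5 | NA | NA | 627 |
| 2000 | ALL | All_P5114 | 5114 | 3.5 | NA | NA | NA | 5114 |

Excluded cases

(a) Distance cutoff used to define binding sites
(b) Sequence identity
(c) Structural identity
(d) Unreasonable cases removed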
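
As an illustration of footnote (a), the sketch below shows one common way a distance cutoff can be turned into binding site labels: a residue counts as binding if any of its atoms lies within the cutoff of any nucleic acid atom. The page states only that a distance cutoff is used, so the atom-to-atom criterion, the inclusive comparison, the function name `binding_residues`, and the input layout are all assumptions made for this example, not the procedure used to build the datasets.

```python
import math

def binding_residues(protein_atoms, nucleic_atoms, cutoff=3.5):
    """Return sorted residue IDs with at least one atom within `cutoff` Å
    of any nucleic acid atom.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples (hypothetical layout)
    nucleic_atoms: iterable of (x, y, z) tuples
    """
    nucleic_atoms = list(nucleic_atoms)
    binding = set()
    for residue_id, coord in protein_atoms:
        if residue_id in binding:
            continue  # residue already labeled; skip its remaining atoms
        # Assumption: inclusive comparison (distance <= cutoff).
        if any(math.dist(coord, na) <= cutoff for na in nucleic_atoms):
            binding.add(residue_id)
    return sorted(binding)

# Toy coordinates, invented for illustration only.
protein = [
    (10, (0.0, 0.0, 0.0)),
    (10, (1.5, 0.0, 0.0)),
    (11, (9.0, 0.0, 0.0)),
    (12, (0.0, 6.0, 0.0)),
]
nucleic = [(4.0, 0.0, 0.0), (0.0, 9.0, 0.0)]
print(binding_residues(protein, nucleic))  # [10, 12]
```

With real structures, the coordinates would come from the downloaded PDB files, and the brute-force loop could be replaced by a spatial index for large complexes.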


References:

  • 1. Wang L, Brown SJ. BindN: a web-based tool for efficient prediction of DNA and RNA binding sites in amino acid sequences. Nucleic Acids Res. 2006;34(Web Server issue):W243-8. doi:10.1093/nar/gkl298.
  • 2. Chen YC, Lim C. Predicting RNA-binding sites from the protein structure based on electrostatics, evolution and geometry. Nucleic Acids Res. 2008;36(5):e29. doi:10.1093/nar/gkn008.
  • 3. Chen YC, Sargsyan K, Wright JD, Huang Y-S, Lim C. Identifying RNA-binding residues based on evolutionary conserved structural and energetic features. Nucleic Acids Res. 2014;42(3):e15. doi:10.1093/nar/gkt1299.
  • 4. Kim OTP, Yura K, Go N. Amino acid residue doublet propensity in the protein-RNA interface and its application to RNA interface prediction. Nucleic Acids Res. 2006;34(22):6450-60. doi:10.1093/nar/gkl819.
  • 5. Kumar M, Gromiha MM, Raghava GPS. Prediction of RNA binding sites in a protein using SVM and PSSM profile. Proteins 2007:189-194. doi:10.1002/prot.21677.
  • 6. Liu Z-P, Wu L-Y, Wang Y, Zhang X-S, Chen L. Prediction of protein-RNA binding sites by a random forest method with combined features. Bioinformatics 2010;26(13):1616-22. doi:10.1093/bioinformatics/btq253.
  • 7. Fernandez M, Kumagai Y, Standley DM, Sarai A, Mizuguchi K, Ahmad S. Prediction of dinucleotide-specific RNA-binding sites in proteins. BMC Bioinformatics 2011;12(Suppl 13):S5. doi:10.1186/1471-2105-12-S13-S5.
  • 8. Ma X, Guo J, Wu J, et al. Prediction of RNA-binding residues in proteins from primary sequence using an enriched random forest model with a novel hybrid feature. Proteins 2011;79(4):1230-9. doi:10.1002/prot.22958.
  • 9. Terribilini M, Lee J, Yan C, Jernigan RL, Honavar V, Dobbs D. Prediction of RNA binding sites in proteins from amino acid sequence. RNA 2006:1450-1462. doi:10.1261/rna.2197306.
  • 10. Terribilini M, Sander JD, Lee J-H, et al. RNABindR: a server for analyzing and predicting RNA-binding sites in proteins. Nucleic Acids Res. 2007;35(Web Server issue):W578-84. doi:10.1093/nar/gkm294.
  • 11. Walia RR, Xue LC, Wilkins K, El-Manzalawy Y, Dobbs D, Honavar V. RNABindRPlus: A Predictor that Combines Machine Learning and Sequence Homology-Based Methods to Improve the Reliability of Predicted RNA-Binding Residues in Proteins. PLoS One 2014;9(5):e97725. doi:10.1371/journal.pone.0097725.
  • 12. Puton T, Kozlowski L, Tuszynska I, Rother K, Bujnicki JM. Computational methods for prediction of protein-RNA interactions. J. Struct. Biol. 2012;179(3):261-8. doi:10.1016/j.jsb.2011.10.001.
  • 13. Li S, Yamashita K, Amada KM, Standley DM. Quantifying sequence and structural features of protein-RNA interactions. Nucleic Acids Res. 2014;42(15):10086-98. doi:10.1093/nar/gku681.
  • 14. Choi S, Han K. Prediction of RNA-binding amino acids from protein and RNA sequences. BMC Bioinformatics 2011;12(Suppl 13):S7. doi:10.1186/1471-2105-12-S13-S7.
  • 15. Huang Y-F, Chiu L-Y, Huang C-C, Huang C-K. Predicting RNA-binding residues from evolutionary information and sequence conservation. BMC Genomics 2010;11(Suppl 4):S2. doi:10.1186/1471-2164-11-S4-S2.
  • 16. Bhardwaj N, Lu H. Residue-level prediction of DNA-binding sites and its application on DNA-binding protein predictions. FEBS Lett. 2008;581(5):1058-1066.
  • 17. Kono H, Sarai A. Structure-Based Prediction of DNA Target Sites. 1999;131(November 1998):114-131.
  • 18. Ahmad S, Keskin O, Sarai A, Nussinov R. Protein-DNA interactions: structural, thermodynamic and clustering patterns of conserved residues in DNA-binding proteins. Nucleic Acids Res. 2008;36(18):5922-32. doi:10.1093/nar/gkn573.
  • 19. Jones S. Using electrostatic potentials to predict DNA-binding sites on DNA-binding proteins. Nucleic Acids Res. 2003;31(24):7189-7198. doi:10.1093/nar/gkg922.
  • 20. Gao M, Skolnick J. DBD-Hunter: a knowledge-based method for the prediction of DNA-protein interactions. Nucleic Acids Res. 2008;36(12):3978-92. doi:10.1093/nar/gkn332.
  • 21. Luscombe NM, Laskowski RA, Thornton JM. Amino acid-base interactions: a three-dimensional analysis of protein-DNA interactions at an atomic level. Nucleic Acids Res. 2001;29(13):2860-74. Available at: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=55782&tool=pmcentrez&rendertype=abstract.
  • 22. Wu J, Liu H, Duan X, et al. Prediction of DNA-binding residues in proteins from amino acid sequences using a random forest model with a hybrid feature. Bioinformatics 2009;25(1):30-5. doi:10.1093/bioinformatics/btn583.
  • 23. Tjong H, Zhou H-X. DISPLAR: an accurate method for predicting DNA-binding sites on protein surfaces. Nucleic Acids Res. 2007;35(5):1465-77. doi:10.1093/nar/gkm008.
  • 24. Ozbek P, Soner S, Erman B, Haliloglu T. DNABINDPROT: fluctuation-based predictor of DNA-binding residues within a network of interacting residues. Nucleic Acids Res. 2010;38(Web Server issue):W417-23. doi:10.1093/nar/gkq396.
  • 25. Li T, Li Q-Z, Liu S, Fan G-L, Zuo Y-C, Peng Y. PreDNA: accurate prediction of DNA-binding sites in proteins by integrating sequence and geometric structure information. Bioinformatics 2013;29(6):678-85. doi:10.1093/bioinformatics/btt029.
  • 26. Si J, Zhang Z, Lin B, Schroeder M, Huang B. MetaDBSite: a meta approach to improve protein DNA-binding sites prediction. BMC Syst. Biol. 2011;5(Suppl 1):S7. doi:10.1186/1752-0509-5-S1-S7.
  • 27. Andrabi M, Mizuguchi K, Sarai A, Ahmad S. Prediction of mono- and di-nucleotide-specific DNA-binding sites in proteins using neural networks. BMC Struct. Biol. 2009;9:30. doi:10.1186/1472-6807-9-30.