The Protein Design Archive (PDA): insights from 40 years of protein design

While natural proteins are incredibly diverse biomacromolecules, they represent only a subset of the sequence and structural space that is chemically possible [1]. The field of protein design has the potential to illuminate the unexplored regions of this space, to deepen our understanding of sequence–structure–function relationships, and to enable the development of novel proteins that could be used to address societal challenges.

Protein design has evolved dramatically over the past 40 years, from rational design to contemporary data-driven approaches. The field has now reached milestones such as the design of small-molecule [2] and protein [3] binders and of entirely novel folds [4,5,6], and it has been recognized with a Nobel Prize. However, current challenges and opportunities within the field can be difficult to identify, because no easily searchable resource provides an overview of the proteins that have been designed to date. Such a resource would improve our ability to learn from previous efforts and guide the development of future design methods.

Data availability

The PDA is available without registration at https://pragmaticproteindesign.bio.ed.ac.uk/pda/. Source data are provided with this paper.

Code availability

Code supporting PDA data collection and processing is available on GitHub: https://github.com/wells-wood-research/chronowska-stam-wood-2024-protein-design-archive. This includes files detailing manual curation of the PDA dataset, such as CSV files listing the proteins that were manually included and excluded, and how information was manually corrected. Code used to build the PDA website is available on GitHub: https://github.com/wells-wood-research/protein-design-archive.

References

  1. Baker, D. Protein Sci. 28, 678–683 (2019).

  2. Lu, L. et al. Science 384, 106–112 (2024).

  3. Gainza, P. et al. Nature 617, 176–184 (2023).

  4. Kuhlman, B. et al. Science 302, 1364–1368 (2003).

  5. Thomson, A. R. et al. Science 346, 485–488 (2014).

  6. Anishchenko, I. et al. Nature 600, 547–552 (2021).

  7. Burley, S. K. et al. Nucleic Acids Res. 49, D437–D451 (2021).

  8. Leaver-Fay, A. et al. Methods Enzymol. 487, 545–574 (2011).

  9. Chaudhury, S., Lyskov, S. & Gray, J. J. Bioinformatics 26, 689–691 (2010).

  10. Stam, M. J. & Wood, C. W. Protein Eng. Des. Sel. 34, gzab029 (2021).

  11. Steinegger, M. & Söding, J. Nat. Biotechnol. 35, 1026–1028 (2017).

  12. van Kempen, M. et al. Nat. Biotechnol. 42, 243–246 (2024).

  13. Huang, P.-S., Boyken, S. E. & Baker, D. Nature 537, 320–327 (2016).

  14. Woolfson, D. N. J. Mol. Biol. 433, 167160 (2021).

  15. Kortemme, T. Cell 187, 526–544 (2024).

Acknowledgements

We thank the members of the Wells Wood Research Group for testing and feedback on the PDA website. M.C. is supported by a PhD studentship from the UK Research and Innovation-funded EastBio Doctoral Training Partnership programme. M.J.S., C.W.W. and D.N.W. are supported by a UKRI Biotechnology and Biological Sciences Research Council Strategic Longer and Larger award (BB/X003027/1).

Author information

Authors and Affiliations

  1. University of Edinburgh, School of Biological Sciences, Institute of Quantitative Biology, Biochemistry and Biotechnology, Edinburgh, UK

    Marta Chronowska, Michael J. Stam & Christopher W. Wood

  2. University of Bristol, Schools of Chemistry and of Biochemistry, Bristol BioDesign Institute, Max Planck-Bristol Centre, Bristol, UK

    Derek N. Woolfson

  3. Department of Agriculture, University of Napoli Federico II, Portici, Italy

    Luigi F. Di Costanzo

Contributions

M.C. and C.W.W. created the website, with support from M.J.S. M.C. and L.F.D.C. collected the data. M.C. and M.J.S. processed and analyzed the data. C.W.W. and D.N.W. conceived the project and secured the funding. M.C., with support from M.J.S. and C.W.W., prepared the manuscript; D.N.W. and L.F.D.C. edited the manuscript.

Corresponding authors

Correspondence to Luigi F. Di Costanzo or Christopher W. Wood.

Ethics declarations

Competing interests

The authors declare no competing interests.

Supplementary information

Supplementary Information

Supplementary Figs. 1–4, Tables 1–4, Discussion, Methods, Software Versions, Front-end Requirements, Back-end Requirements, References

Source data

Source Data Fig. 1

The Protein Design Archive dataset as of 1 January 2025

Source Data Fig. 2

Structure-based metrics generated with DE-STRESS for proteins found in the Protein Design Archive database as of 1 August 2024 and the Research Collaboratory for Structural Bioinformatics Protein Data Bank as of 1 June 2024

About this article

Cite this article

Chronowska, M., Stam, M.J., Woolfson, D.N. et al. The Protein Design Archive (PDA): insights from 40 years of protein design. Nat Biotechnol (2025). https://doi.org/10.1038/s41587-025-02607-x
