Protein folding models have achieved groundbreaking results since the introduction of AlphaFold2, typically built via a
combination of integrating domain expertise into their architectural designs and training pipelines. Nonetheless, given the
success of generative models across different but related problems, it is natural to question whether these architectural
designs are necessary to build performant models. In this paper, we introduce SimpleFold, the first flow-matching based
protein folding model that solely uses general-purpose transformer layers. Instead of relying on expensive modules
like triangle attention or pair representation biases, or carefully crafted training objectives, SimpleFold employs standard
transformer blocks with adaptive layers and is trained via a generative flow-matching objective. We scale SimpleFold to
3B parameters and train it on more than 8.6M distilled protein structures together with experimental PDB data. To the
best of our knowledge, SimpleFold is the largest-scale folding model ever developed. On standard folding benchmarks,
SimpleFold-3B achieves competitive performance compared to state-of-the-art baselines. Due to its generative
training objective, SimpleFold also demonstrates strong performance in ensemble prediction. SimpleFold challenges the
reliance on complex domain-specific architectural designs in folding, highlighting an alternative yet important avenue of
progress in protein structure prediction.
- ** Work done while at Apple
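
To make the phrase "generative flow-matching objective" concrete, the following is a minimal sketch of a generic flow-matching regression loss on atom coordinates. It assumes the common straight-line path between Gaussian noise and data; the abstract does not specify SimpleFold's exact path, time sampling, or loss weighting, so this should not be read as the paper's implementation. The names `interpolate`, `target_velocity`, `flow_matching_loss`, and the `model(x_t, t)` callable are hypothetical and introduced only for illustration.

```python
import random


def interpolate(x0, x1, t):
    """Point at time t on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [[(1.0 - t) * a + t * b for a, b in zip(p0, p1)]
            for p0, p1 in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight path, constant in t: x1 - x0 per coordinate."""
    return [[b - a for a, b in zip(p0, p1)] for p0, p1 in zip(x0, x1)]


def flow_matching_loss(model, x1, rng):
    """Single-sample estimate of the flow-matching loss for one structure.

    x1    : list of [x, y, z] atom coordinates (the data sample).
    model : hypothetical callable (x_t, t) -> predicted velocity per atom;
            in a real system this would be the transformer (assumption).
    rng   : random.Random instance used to draw noise and the time step.
    """
    # Draw a Gaussian noise sample with the same shape as the data.
    x0 = [[rng.gauss(0.0, 1.0) for _ in atom] for atom in x1]
    # Draw a time uniformly in [0, 1) (an assumed sampling scheme).
    t = rng.random()
    x_t = interpolate(x0, x1, t)
    v_pred = model(x_t, t)
    v_true = target_velocity(x0, x1)
    # Mean squared error between predicted and target velocities.
    sq_err = [(p - q) ** 2
              for vp, vt in zip(v_pred, v_true)
              for p, q in zip(vp, vt)]
    return sum(sq_err) / len(sq_err)
```

At inference time, a model trained this way would be used by integrating the predicted velocity field from a noise sample toward a structure; drawing different noise samples yields different structures, which is consistent with the ensemble prediction capability the abstract mentions.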