roccomoretti
Member

To make RF3 as easy to use as possible, it would be nice to load FASTA files directly, so that you can predict straight from a FASTA file of your system. This PR puts together a possible mechanism for loading systems from FASTA files, essentially as syntactic sugar around the JSON input format (though less powerful).

The file format is inspired by Boltz's FASTA input format, but slightly more flexible. Most fields are optional, and it should be robust to "extra" information in the label line. (You should be able to input most arbitrary polymeric FASTA files as-is and have them work, albeit without MSA ... which is also easy enough to add.)
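To make the idea concrete, here is a minimal sketch of how a tolerant FASTA reader could map records onto JSON-style entries. It is not the code in this PR: the function names, the `|`-separated `chain|type|msa` label layout (loosely modeled on Boltz's FASTA headers), and the output keys (`chain_id`, `type`, `seq`, `msa_path`) are all assumptions made for illustration, and RF3's real JSON schema may differ.

```python
import json

# Hypothetical set of polymer types recognized in the label line (an assumption).
KNOWN_TYPES = {"protein", "dna", "rna"}


def parse_fasta(text):
    """Yield (label, sequence) pairs from FASTA-formatted text."""
    label, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if label is not None:
                yield label, "".join(chunks)
            label, chunks = line[1:], []
        elif label is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def record_to_component(label, sequence, index):
    """Turn one FASTA record into a dict with hypothetical JSON keys.

    Every label field is optional; unrecognized fields are ignored.
    """
    fields = [f.strip() for f in label.split("|")]
    tokens = fields[0].split()
    chain_id = tokens[0] if tokens else f"chain{index}"
    # Default to protein when the second field is missing or unrecognized.
    recognized = len(fields) > 1 and fields[1].lower() in KNOWN_TYPES
    kind = fields[1].lower() if recognized else "protein"
    # Only trust a third field as an MSA path when the type was recognized.
    msa_path = fields[2] if recognized and len(fields) > 2 and fields[2] else None
    return {
        "chain_id": chain_id,
        "type": kind,
        "seq": sequence.upper(),
        "msa_path": msa_path,
    }


def fasta_to_components(text):
    """Convert FASTA text to a list of JSON-serializable component dicts."""
    return [
        record_to_component(label, seq, i)
        for i, (label, seq) in enumerate(parse_fasta(text))
    ]


example = """>A|protein|msa/a.a3m
MKTAYIAK
QRQISFVK
>B|rna
GGGAAACCC
>plain header with extra words
ACDEFGHIK
"""

components = fasta_to_components(example)
print(json.dumps(components, indent=2))
```

In this sketch a plain protein FASTA with a free-form label still parses: the first whitespace-delimited token becomes the chain identifier, the type defaults to protein, and no MSA is attached.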

While currently limited to FASTA input, it's written with an eye toward supporting additional sequence file formats as the need arises.

This is intended as a "draft" PR, for comment & feedback.

Add the ability for RF3 to load in from FASTA files.
The file format is inspired by Boltz's FASTA input format, but slightly more flexible.
(You should be able to input a protein FASTA as-is and have it work, albeit without MSA.)

It's written with an eye to be flexible for additional sequence file input formats, as desire dictates.

The FASTA input is basically just syntactic sugar around the JSON input format, with a reduced feature set.