[Biojava-dev] GSoC - File parsing coding exercise

David Felty davfelty at gmail.com
Tue Mar 27 20:01:58 UTC 2012


Hello, BioJava!

My name is David Felty, and I am a Computer Science student at Cornell
University. Biology has always been one of my interests, and I've actually
considered getting a degree in Bioinformatics; I feel like computer science
has so much to offer the scientific fields. It is for this reason that I'm
applying to BioJava for GSoC.

I want to apply for the project entitled "New File Parsers for BioJava,"
but I have a question about task 2 of the coding exercise at
biojava.org/wiki/Coding_exercise. What are "ambiguous characters"? My
guess, based on en.wikipedia.org/wiki/FASTA_format#Sequence_representation,
is that 'N', 'X', and '-' are ambiguous for nucleic acids, and 'X' and '-'
are ambiguous for amino acids. Is this correct?
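
To make the guess concrete, here is a minimal sketch of the check it
implies. The class name AmbiguityCheck and the method countAmbiguous are
hypothetical and are not BioJava API; the two character sets are only the
guess stated above, not a confirmed definition of "ambiguous".

```java
import java.util.Set;

// Hypothetical helper for illustration only; not part of BioJava.
final class AmbiguityCheck {
    // Assumed sets, taken directly from the guess in this message.
    static final Set<Character> NUCLEOTIDE_AMBIGUOUS = Set.of('N', 'X', '-');
    static final Set<Character> PROTEIN_AMBIGUOUS = Set.of('X', '-');

    // Counts the characters of the sequence that fall in the assumed
    // ambiguous set for the given sequence type (case-insensitive).
    static int countAmbiguous(String sequence, boolean nucleotide) {
        Set<Character> ambiguous =
                nucleotide ? NUCLEOTIDE_AMBIGUOUS : PROTEIN_AMBIGUOUS;
        int count = 0;
        for (char c : sequence.toCharArray()) {
            if (ambiguous.contains(Character.toUpperCase(c))) {
                count++;
            }
        }
        return count;
    }
}
```

Under these assumptions, 'N' counts as ambiguous in a nucleotide sequence
but not in a protein sequence, where it is the code for asparagine. If the
intended definition covers more codes, only the two sets would need to
change.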

Thanks,

David


