[BioRuby] [GSoC][NeXML and RDF API] Sequences( doubts )
Anurag Priyam
anurag08priyam at gmail.com
Sat Jul 3 07:13:43 EDT 2010
This is going to be a long mail.
NeXML's characters tag serves as a storage block for sequences. Sequences
can be described in NeXML in two ways: raw (with the seq tag) and granular
(with the cell tags). NeXML offers six kinds of sequences:
1. Protein( AA )
2. DNA
3. RNA
4. Restriction
5. Standard
6. Continuous
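
To illustrate the raw and granular forms, here is a rough sketch using
REXML from the Ruby standard distribution. The XML fragment is hand-written
and simplified, not schema-valid NeXML: as far as I understand, real cells
refer to state ids declared elsewhere in the characters block, whereas here
the symbol is placed directly in the state attribute. The helper
row_to_string is a made-up name, not part of any existing parser.

```ruby
require 'rexml/document'

# Simplified, hand-written fragment for illustration (assumption: not
# schema-valid NeXML; symbols are used directly as cell states).
xml = <<~XML
  <matrix>
    <row id="r1"><seq>ACGT</seq></row>
    <row id="r2">
      <cell char="c1" state="A"/>
      <cell char="c2" state="C"/>
      <cell char="c3" state="G"/>
      <cell char="c4" state="T"/>
    </row>
  </matrix>
XML

# Hypothetical helper: returns a row's sequence as a plain string,
# whichever of the two forms the row uses.
def row_to_string(row)
  seq = row.elements['seq']
  return seq.text.strip if seq                                  # raw form
  row.get_elements('cell').map { |c| c.attributes['state'] }.join  # granular form
end

doc = REXML::Document.new(xml)
doc.get_elements('//row').each do |row|
  puts "#{row.attributes['id']}: #{row_to_string(row)}"
end
```

Both rows yield the same string here, which is the value that would then be
wrapped in a sequence object instead of being returned as is.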
As of now, the NeXML parser just returns the sequence as a string. It should
return a Bio::Sequence. BioRuby already has classes to work with AA and NA
sequences. I was thinking of adding classes to represent Restriction,
Standard and Continuous sequences. Should I add support for these as core
BioRuby classes or just as part of the NeXML lib? I will have to adapt the
Bio::Sequence class to recognize the new sequences.
Why does the Bio::Sequence#guess method use a threshold of roughly 90% to
distinguish between AA and NA sequences? Why not use a regexp instead?
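
To make the question concrete, here is a minimal sketch in plain Ruby that
contrasts the two approaches. It is not BioRuby's actual implementation: the
names guess_by_regexp and guess_by_threshold are hypothetical, and the 0.9
cutoff simply mirrors the 90% figure mentioned above.

```ruby
# Hypothetical helpers for illustration only; not part of BioRuby.

# Strict check: every character must be a nucleotide symbol (a, c, g, t, u, n).
def guess_by_regexp(str)
  str =~ /\A[acgtun]+\z/i ? :na : :aa
end

# Tolerant check: treat the string as NA when the fraction of
# nucleotide symbols exceeds the threshold.
def guess_by_threshold(str, threshold = 0.9)
  return :aa if str.empty?
  na_count = str.count("acgtunACGTUN")
  na_count.to_f / str.length > threshold ? :na : :aa
end

clean = "atgcatgcatgcatgcatgc"
noisy = "atgcatgcatgcatgcatgr"  # ends in one ambiguity code (r)

p guess_by_regexp(clean)     # :na
p guess_by_regexp(noisy)     # :aa
p guess_by_threshold(noisy)  # :na
```

In this sketch a single ambiguity symbol makes the strict regexp classify a
nucleotide string as AA, while the threshold version tolerates it. That may
be one reason for a percentage-based check, but I am not sure it is the
actual one.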
--
Anurag Priyam,
2nd Year Undergraduate,
Department of Mechanical Engineering,
IIT Kharagpur.
+91-9775550642