[Biopython-dev] Getting raw unparsed records with SeqIO?
Brad Chapman
chapmanb at 50mail.com
Wed Feb 3 07:55:52 EST 2010
Hi Peter;
> Another solution to this task (extracting the raw GenBank
> records from a large file) would seem to be to extend the
> Bio.SeqIO.index functionality. The patch I'm about to
> attach to Bug 3000 adds a new "get_raw" method to the
> dictionary-like object we return. Unlike the __getitem__
> and get methods, which return a SeqRecord, this just gives
> the raw string.
[...]
> >>> from Bio import SeqIO
> >>> data = SeqIO.index("cor6_6.gb", "gb")
> >>> data.keys()
> ['L31939.1', 'AJ237582.1', 'X62281.1', 'AF297471.1', 'X55053.1', 'M81224.1']
> >>> print data.get_raw("X62281.1")
> LOCUS ATKIN2 880 bp DNA PLN 23-JUL-1992
> DEFINITION A.thaliana kin2 gene.
> ACCESSION X62281
> ...
> //
>
> What are people's thoughts on this?
Not much to add, but a +1 from me. This sounds like a solid solution
and makes sense for the use case I can think of, which is picking
out records of interest from a large file and writing them out to a
smaller file.
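
A minimal sketch of that use case, assuming the indexed object exposes
get_raw(key) as described in the quoted patch. The helper name
write_raw_records, the output file name, and the handling of str versus
bytes are illustrative assumptions, not part of the patch:

```python
def write_raw_records(index, wanted_ids, out_path):
    """Copy the raw text of selected records into a smaller file.

    `index` is any object with a get_raw(key) method, for example the
    dictionary-like object returned by Bio.SeqIO.index() (assumption:
    the get_raw patch discussed above is available).
    """
    count = 0
    with open(out_path, "wb") as handle:
        for record_id in wanted_ids:
            raw = index.get_raw(record_id)
            # The return type is treated as an assumption here:
            # accept either str or bytes and write bytes to disk.
            if isinstance(raw, str):
                raw = raw.encode("ascii")
            handle.write(raw)
            count += 1
    return count


# Hypothetical usage with the file from the quoted example:
#   from Bio import SeqIO
#   data = SeqIO.index("cor6_6.gb", "gb")
#   write_raw_records(data, ["X62281.1", "M81224.1"], "subset.gb")
```

Because the records are copied without being parsed into SeqRecord
objects and re-formatted, the output should match the original text of
each record exactly.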
Brad