[Biopython-dev] Updates to the tutorial for parsing GenBank files
Peter
biopython-dev at maubp.freeserve.co.uk
Wed Dec 14 13:33:15 EST 2005
Marc Colosimo wrote:
> The patch looks good to me, but I could have missed something there. I
> forgot about the Discussion List. I really should join that list.
Motion seconded - any developer want to accept this?
> Also, I probably will be filing a bug on the Bio.Fasta documentation.
> There are two basic doc changes that should be made:
>
> Under the doc for Fasta:
> RecordParser Parses FASTA sequence data into a Record object <- change
> to a Fasta.Record object which is not the same as a Seq.Record
Sounds sensible
> Cookbooks:
>
> Then maybe in the Cookbook, give an example on using
> Fasta.SequenceParser with title2ids. Without title2ids, you don't get
> name or id. You only get description, which is the title. Fasta.Record
> only has title, which maybe should be renamed to description (with
> title deprecated) to match the default behavior of SequenceParser.
I don't usually bother with the title2ids function either.
I agree it is odd that the attribute is .title or .description depending
on the parser used (Fasta.RecordParser or Fasta.SequenceParser).
> It seems odd that the Fasta stuff is buried within Chapter 2 (2.4.3
> Making it easier - plus it is missing "import string").
Yes, but I think it would be better to avoid using the string module
completely, and use the split method of the string object instead:
from Bio import Fasta

def parseTitle2Ids(title):
    # Split the title on "|" and keep the first three fields
    return title.split("|")[:3]

parser = Fasta.SequenceParser(title2ids = parseTitle2Ids)
file = open("ls_orchid.fasta")
iterator = Fasta.Iterator(file, parser)
...
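
To see what the split-based function returns without needing Biopython or the FASTA file, here is a minimal standalone sketch. The name parse_title_to_ids and the sample title are made up for illustration (the title just follows the usual NCBI "gi|number|db|accession|name" layout); only the split call itself comes from the snippet above.

```python
def parse_title_to_ids(title):
    """Split a FASTA title on "|" and return the first three fields.

    Hypothetical helper for illustration; it mirrors the
    title.split("|")[:3] call suggested above.
    """
    return title.split("|")[:3]


# Illustrative title line, assumed to follow the NCBI pipe-delimited style.
example_title = "gi|2765658|emb|Z78533.1|CIZ78533 C.irapeanum 5.8S rRNA gene"

fields = parse_title_to_ids(example_title)
print(fields)  # ['gi', '2765658', 'emb']
```

Note that for a title like this the three fields are the "gi" tag, the GI number and the database tag, so it is worth checking whether those are really the values you want, or whether other positions of the split list would suit you better.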
Peter