[Biopython] GI number
Peter Cock
p.j.a.cock at googlemail.com
Mon Jan 25 18:41:46 EST 2010
On Mon, Jan 25, 2010 at 9:38 PM, x y <rafal.b.pawlak at gmail.com> wrote:
> hello,
> how can I extract the GI number in this program?
>
> from Bio import SeqIO
> handle = open("xyz.fasta")
> for seq_record in SeqIO.parse(handle, "fasta"):
>     print seq_record.description
> handle.close()
>
> ex.
> Osa_SPT6 gi|222632083|gb|EEE64215.1| hypothetical protein Os05g41510.1_ORYZA
> [Oryza sativa Japonica Group]
>
> rafal pawlak
I would just use the Python string split method on this string, assuming
all your records use the same layout, e.g. something like this:
gi = seq_record.description.split()[1].split("|")[1]
There are related examples in the tutorial, search for "get_accession"
which are a bit more robust because they check the string follows
the expected format. You could alternatively use a regular expression.
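As a rough sketch of both ideas (a split that checks the layout, and a
regular expression), something like the following might work. The names
get_gi_by_split, get_gi_by_regex and GI_PATTERN are made up for this
example and are not part of Biopython, and the code assumes the
description contains an NCBI-style "gi|<digits>|" identifier as in the
example record above.

```python
import re

# Assumed pattern: the literal "gi", a pipe, one or more digits, a pipe.
GI_PATTERN = re.compile(r"\bgi\|(\d+)\|")


def get_gi_by_split(description):
    """Return the GI number from the second word of the description.

    Assumes a layout like "name gi|12345|gb|XXX| free text" and raises
    ValueError if the second word does not start with "gi|".
    """
    words = description.split()
    if len(words) < 2:
        raise ValueError("No second word in: %r" % description)
    parts = words[1].split("|")
    if len(parts) < 2 or parts[0] != "gi":
        raise ValueError("No gi| identifier in: %r" % description)
    return parts[1]


def get_gi_by_regex(description):
    """Return the first GI number found anywhere in the description,
    or None if there is no "gi|<digits>|" identifier."""
    match = GI_PATTERN.search(description)
    if match is None:
        return None
    return match.group(1)


example = ("Osa_SPT6 gi|222632083|gb|EEE64215.1| hypothetical protein "
           "Os05g41510.1_ORYZA [Oryza sativa Japonica Group]")
gi_from_split = get_gi_by_split(example)
gi_from_regex = get_gi_by_regex(example)
```

Inside the SeqIO.parse loop you would pass seq_record.description to
either function. The regular expression version does not care where in
the description the identifier appears, while the split version fails
loudly if a record does not follow the expected layout.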
Peter