[Biopython] Fwd: help with parsing EMBL

Peter biopython at maubp.freeserve.co.uk
Mon Apr 26 15:54:54 UTC 2010


Hi all,

I'm forwarding this email from Nick Leake about parsing EMBL files,
but without his 1.3MB attachment. I'll reply to his questions in a
follow-up email...

Peter

---------- Forwarded message ----------
From: Nick Leake
To: <biopython at lists.open-bio.org>
Date: Mon, 26 Apr 2010 09:35:45 -0400
Subject: help with parsing

Hello,



I'm having trouble parsing an EMBL file (attached) with multiple
sequences.  I want to be able to access the DNA sequences so that I can
manipulate them and remove them from a chromosomal region.  I originally
thought that I could follow the FASTA example shown in the
Biopython tutorial, but that failed to work.  Next, I tried to
convert the file to FASTQ or FASTA so I could just follow the examples;
again, that failed.  So I looked around and found some EMBL parsing code:



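from Bio import SeqIO

# Open in text mode; SeqIO.parse returns an iterator over the records.
p = SeqIO.parse(open(r"transposon_sequence_set.embl.v.9.41", "r"), "embl")
next(p)           # reads the first record and discards it
record = next(p)  # reads the second record

print(record)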



This kind of works, but it fails to read all entries.  Also, there is no
'record' argument for output.  In addition, I don't know what code I
need to 'grab' the DNA information for manipulation and to remove these
sequences from a given DNA segment.  Can I get a little guidance on
what I need to do, or where I can look, to help solve my problem?

Any help would be greatly appreciated.  I'm still very much a Python
novice and get frustrated by not knowing how to ask my questions
appropriately.
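
For anyone reading along, here is a rough sketch of the two steps asked
about above: visiting every entry in a multi-record EMBL file, and
cutting those sequences out of a longer DNA string.  Each call to next()
hands back exactly one record, so the snippet above only ever prints the
second entry; a for loop walks through all of them.  The function names
read_embl_sequences and remove_elements are made up for illustration and
are not part of Biopython, and "removal" is assumed here to mean deleting
exact, same-strand matches, which may not be what the real analysis
needs.  Passing a file name (rather than an open handle) to SeqIO.parse
assumes a reasonably recent Biopython.

```python
def read_embl_sequences(path):
    """Return {record id: sequence string} for every entry in an EMBL file."""
    # Imported inside the function so that remove_elements below can be
    # used on plain strings even where Biopython is not installed.
    from Bio import SeqIO

    sequences = {}
    # Looping over the parser visits every record, so none are skipped.
    for record in SeqIO.parse(path, "embl"):
        sequences[record.id] = str(record.seq).upper()
    return sequences


def remove_elements(region, elements):
    """Delete every exact occurrence of each element from a DNA string.

    Assumption: only exact matches on the given strand are removed;
    reverse complements and diverged copies are left untouched.
    """
    region = region.upper()
    # Longest first, so a short element cannot break up a longer match.
    for element in sorted(elements, key=len, reverse=True):
        if element:  # skip empty sequences
            region = region.replace(element.upper(), "")
    return region
```

With the file from the message, read_embl_sequences would be called on
"transposon_sequence_set.embl.v.9.41", and the values of the returned
dictionary could then be passed to remove_elements together with the
chromosomal region as a plain string.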





