From rcsqtc at iiqab.csic.es Mon Aug 1 11:32:18 2005
From: rcsqtc at iiqab.csic.es (Ramon Crehuet)
Date: Mon Aug 1 11:23:17 2005
Subject: [BioPython] Superimposing CA atoms of a chain
Message-ID: <42EE4082.9030702@iiqab.csic.es>
I'd like to superimpose two chains (all atoms from all residues) but
calculating the RMS only from CA atoms. That is, I'd like to calculate
the transformation matrix for the CA atoms and apply it to all atoms. (A
common operation, I guess...)
Can I do that with the PDB.superimpose module? Otherwise, if I need
to use the SVDSuperimpose, can I manipulate atom instances, or does it only
work with numeric arrays?
Thanks,
Ramon
From thamelry at binf.ku.dk Mon Aug 1 11:26:52 2005
From: thamelry at binf.ku.dk (Thomas Hamelryck)
Date: Mon Aug 1 12:11:36 2005
Subject: [BioPython] Superimposing CA atoms of a chain
In-Reply-To: <42EE4082.9030702@iiqab.csic.es>
References: <42EE4082.9030702@iiqab.csic.es>
Message-ID: <200508011726.52569.thamelry@binf.ku.dk>
On Monday 01 August 2005 17:32, Ramon Crehuet wrote:
> I'd like to superimpose two chains (all atoms from all residues) but
> calculating the RMS only from CA atoms. That is, I'd like to calculate
> the transformation matrix for the CA atoms and apply it to all atoms.
> (A common operation, I guess...)
> Can I do that with the PDB.superimpose module?
Yes.
Use the PDB.superimpose module to calculate the rotation/translation
for the CA atoms only and then apply these to the atoms you want using the
transform(rotation, translation) method of the atom object.
-Thomas
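The approach described in this reply can be sketched as follows. This is a minimal sketch, assuming a current Biopython release, where the class is exposed as `Bio.PDB.Superimposer`. The helper names `ca_atoms` and `superimpose_on_ca` are made up for illustration, and both chains are assumed to contain the same number of CA atoms in matching order.

```python
def ca_atoms(chain):
    """Collect the atom named CA from every residue that has one."""
    return [residue["CA"] for residue in chain if "CA" in residue]


def superimpose_on_ca(fixed_chain, moving_chain):
    """Fit moving_chain onto fixed_chain using CA atoms only; return the CA RMSD."""
    # Imported here so ca_atoms() stays usable without Biopython installed.
    from Bio.PDB import Superimposer

    sup = Superimposer()
    # set_atoms() computes the rotation/translation from two equal-length atom lists.
    sup.set_atoms(ca_atoms(fixed_chain), ca_atoms(moving_chain))
    # sup.rotran holds the (rotation, translation) pair that Atom.transform() accepts.
    rotation, translation = sup.rotran
    # Apply the CA-derived transformation to every atom of the moving chain.
    for atom in moving_chain.get_atoms():
        atom.transform(rotation, translation)
    return sup.rms
```

With two parsed structures, a call such as `superimpose_on_ca(structure_a[0]["A"], structure_b[0]["A"])` would move chain A of the second structure in place; the model index and chain ID are placeholders.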
From amorgan at mitre.org Tue Aug 2 16:07:50 2005
From: amorgan at mitre.org (Alexander A. Morgan)
Date: Tue Aug 2 15:59:24 2005
Subject: [BioPython] Changes in NCBI BLAST output format !!??
In-Reply-To: <1121786507.42dd1a8b9dee5@imp3-q.free.fr>
References: <1121786507.42dd1a8b9dee5@imp3-q.free.fr>
Message-ID: <42EFD296.3020708@mitre.org>
Hello:
I've just run into the same problem, and I haven't seen a suggested
fix go by, so I apologize if this is redundant information, but it seems
that the files I've been getting from NCBI have a [HTML tag] removed from the
header between the "RID: " line and the "[HTML tag]Query" line, and it is just
a blank line now. If you edit Bio.Blast.NCBIWWW to not look for the
"[HTML tag]", it seems to work okay.
class _Scanner:
    ....
    def _scan_header(self, uhandle, consumer):
        ....
        change:
            attempt_read_and_call(uhandle, consumer.noevent, start='[HTML tag]')
        to:
            attempt_read_and_call(uhandle, consumer.noevent)
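To see why dropping the `start` argument helps, here is a simplified stand-in for the read-or-push-back pattern. It is not Biopython's implementation: `PushbackLines`, `attempt_read` and the `"MARKER"` prefix are hypothetical names used only to illustrate the idea that a failed prefix check leaves the line unread for the next scanner step.

```python
class PushbackLines:
    """Line reader that can hand one line back (hypothetical helper)."""

    def __init__(self, lines):
        self._lines = list(lines)
        self._saved = []

    def readline(self):
        # Return a pushed-back line first, then the next unread line.
        if self._saved:
            return self._saved.pop()
        return self._lines.pop(0) if self._lines else ""

    def saveline(self, line):
        self._saved.append(line)


def attempt_read(handle, callback, start=None):
    """Read one line; consume it only if it passes the optional prefix check."""
    line = handle.readline()
    if start is not None and not line.startswith(start):
        handle.saveline(line)  # leave the line for the next scanner step
        return False
    callback(line)
    return True
```

With a prefix check, a blank line is pushed back and the following step sees it instead of the Query line. Without the check, the blank line is consumed and the scanner moves on.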
aurelie.bornot@free.fr wrote:
>Thank you very much Jessica !!!
>
>Unfortunately, I need a lot of thing in the BLAST reports.....
>It will be difficult to do the same thing as you did....
>
>I will try to do something in the code of parser of Python.
>But it will be difficult for me..
>so if you or someone has advices !!!
>
>Thanks a lot again for your answer Jessica !
>Aurélie
>
>
>--------------
>Aurelie BORNOT
>MNHN
>Paris
>
>
>_______________________________________________
>BioPython mailing list - BioPython@biopython.org
>http://biopython.org/mailman/listinfo/biopython
>
>
From mdehoon at c2b2.columbia.edu Wed Aug 3 14:37:33 2005
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Wed Aug 3 14:32:59 2005
Subject: [BioPython] Changes in NCBI BLAST output format !!??
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC283@cgcmail.cgc.cpmc.columbia.edu>
Do you happen to know if this change can break anything else in the Blast
parser? From running Biopython's tests for Blast, it seems that this change
is OK. On the other hand, I don't use Blast much myself, so I don't trust my
own judgement in this matter.
If making this change does not cause any new bugs, I'd be happy to include it
in CVS.
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
From aurelie.bornot at free.fr Wed Aug 3 14:54:36 2005
From: aurelie.bornot at free.fr (=?iso-8859-1?Q?Aur=E9lie_Bornot?=)
Date: Wed Aug 3 14:44:42 2005
Subject: [BioPython] Changes in NCBI BLAST output format !!??
References: <1121786507.42dd1a8b9dee5@imp3-q.free.fr>
<42EFD296.3020708@mitre.org>
Message-ID: <001b01c5985c$cc046360$0b413851@YSENGARD>
Thank you very much Alexander !!
I didn't dare to change the code in the Bio.Blast.NCBIWWW on my own because
I didn't have time to make tests...
So I simply added the [HTML tag] to the Blast file automatically... not very
nice, I know!
I will try your method instead !
Thanks !
Aurélie
--------------
Aurelie BORNOT
MNHN
Paris
From dtomso at athenixcorp.com Thu Aug 4 16:40:17 2005
From: dtomso at athenixcorp.com (Daniel Tomso)
Date: Thu Aug 4 16:32:27 2005
Subject: [BioPython] Blast and multiple processors
Message-ID:
Hello, all.
I'm working on improving my BLAST throughput, and I have some questions
about how the program handles multiple processors and multiple
processes.
Specifically, I've been experimenting with using BioPython's
NCBIStandalone to handle 3 or 4 simultaneous blast requests, since my
system has 4 processors. I spin out the requests via
NCBIStandalone.blastall(blah, blah), then grab the blast_out and
blast_err file handles in a list. Afterwards, I use blast_out.read() to
collect the reports from each of the 4 processes.
Is this wise and/or efficient? My execution times do drop off when I
do, say, 4 jobs at a time instead of 1 at a time, so it is helping. Do
the processor flags for blastall accomplish this more efficiently?
Sorry if this is not specific enough, but any insight would be
welcome!!!!
Dan T.
Daniel J. Tomso
Senior Scientist, Bioinformatics
Athenix Corporation
2202 Ellis Road
Suite B
Durham, NC 27703
919.281.0920
dtomso@athenixcorp.com
www.athenixcorp.com
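One way to run several external searches side by side, as described in this message, is to give each command its own worker thread so that every process has its output drained while it runs. This is a minimal sketch using only the Python standard library; `run_concurrently` is a made-up name, the demo commands are stand-ins for real BLAST command lines, and the sketch says nothing about whether blastall's own processor option would be faster.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(command):
    """Run one command and capture its stdout/stderr as text."""
    return subprocess.run(command, capture_output=True, text=True)


def run_concurrently(commands, max_workers=4):
    """Run the commands in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Stand-in jobs: each one just prints a number.
    demo = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]
    for result in run_concurrently(demo):
        print(result.returncode, result.stdout.strip())
```

Reading each report through its own worker avoids the situation where one process stalls on a full output pipe while the script is still waiting on another.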
From mdehoon at c2b2.columbia.edu Sun Aug 7 20:08:24 2005
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Sun Aug 7 20:01:04 2005
Subject: [BioPython] Changes in NCBI BLAST output format !!??
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC297@cgcmail.cgc.cpmc.columbia.edu>
I've updated Biopython in CVS with this fix. See Bio/Blast/NCBIWWW.py
revision 1.41. Please let me know if you find any problems. Thanks for
finding this solution.
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
-----Original Message-----
From: Alexander A. Morgan [mailto:amorgan@mitre.org]
Sent: Wed 8/3/2005 2:48 PM
To: Michiel De Hoon
Subject: Re: [BioPython] Changes in NCBI BLAST output format !!??
Michiel:
I couldn't find anything wrong with it. The Blast record objects seem
to be correct and have the right alignments. However, I don't know of a
thorough way to test it. In general the parser is pretty fragile though
and will break for even the most minor changes in NCBI format, but it
would be very challenging to try to make it more robust.
Thanks,
-Alex
From gvwilson at cs.utoronto.ca Tue Aug 9 09:13:10 2005
From: gvwilson at cs.utoronto.ca (Greg Wilson)
Date: Tue Aug 9 09:03:24 2005
Subject: [BioPython] re: software skills course
Message-ID:
Hi,
I'm working with support from the Python Software Foundation to develop
an open source course on basic software development skills for people
with backgrounds in science and engineering. I have a beta version of
the course notes ready for review, and would like to pull in people
in sci&eng to look it over and give me feedback. If you know anyone
who fits this bill (particularly people who might be interested in
following along with a trial run of the course this fall), I'd be
grateful for pointers.
Thanks,
Greg Wilson
From xuying at sibs.ac.cn Wed Aug 10 01:55:09 2005
From: xuying at sibs.ac.cn (xuying)
Date: Wed Aug 10 01:45:26 2005
Subject: [BioPython] where to find an updated cookbook?
Message-ID: <20050810055436.9B14C10DECC@smtp.sibsnet.org>
Can anyone tell me where to find an updated biopython tutorial?
Examples in the online tutorial are full of errors. Thanks!
xuying
xuying@sibs.ac.cn
2005-08-10
From loraine at loraine.net Wed Aug 10 16:11:49 2005
From: loraine at loraine.net (Ann Loraine)
Date: Wed Aug 10 16:01:51 2005
Subject: [BioPython] re: software skills course
In-Reply-To:
References:
Message-ID: <6f16141077ca7fbb9bd08da6746e2b5d@loraine.net>
Hello,
I would appreciate the chance to see the notes. It would be helpful to
the postdocs and students I supervise who would like to learn python.
Yours,
Ann Loraine
From mdehoon at c2b2.columbia.edu Thu Aug 11 13:59:35 2005
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu Aug 11 13:52:02 2005
Subject: [BioPython] where to find an updated cookbook?
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC2B0@cgcmail.cgc.cpmc.columbia.edu>
> Can anyone tell me where to find an updated biopython tutorial?
> Examples in the online tutorial are full of errors. Thanks!
Can you make a list of the errors that you found? Then it'll be easier for us
to fix those errors. If you have a solution to the errors that you found,
even better!
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
From j.pansanel at pansanel.net Fri Aug 19 03:41:10 2005
From: j.pansanel at pansanel.net (Jerome PANSANEL)
Date: Fri Aug 19 03:36:43 2005
Subject: [BioPython] Vector NTI file import
Message-ID: <200508190941.11168.j.pansanel@pansanel.net>
Hello,
I would like to write support for Vector NTI files (a format derived from
GenBank) for Biopython. Is anyone already working on this?
Would anyone be interested in helping with debugging?
Thanks,
Jerome Pansanel
From frederic.sohm at iaf.cnrs-gif.fr Fri Aug 19 05:39:09 2005
From: frederic.sohm at iaf.cnrs-gif.fr (Frederic Sohm)
Date: Fri Aug 19 05:29:14 2005
Subject: [BioPython] Vector NTI file import
In-Reply-To: <200508190941.11168.j.pansanel@pansanel.net>
References: <200508190941.11168.j.pansanel@pansanel.net>
Message-ID: <200508191139.09551.frederic.sohm@iaf.cnrs-gif.fr>
Hi,
I am definitely. Interested, I mean. How do you plan to work on the NTI file
format? I have had a look at it and it seems particularly complex. What kind
of support do you have in mind?
Cheers
Fred
--
Frédéric Sohm
Equipe INRA U1126 "Morphogenèse du système nerveux des Chordés"
UPR 2197 DEPSN, CNRS
Institut de Neurosciences A. Fessard
1 Avenue de la Terrasse
91 198 GIF-SUR-YVETTE
FRANCE
Phone: +33 (0) 1 69 82 34 12
Fax:+33 (0) 1 69 82 34 47
From sbassi at genesdigitales.com Fri Aug 19 08:31:55 2005
From: sbassi at genesdigitales.com (Sebastian Bassi)
Date: Fri Aug 19 08:24:26 2005
Subject: [BioPython] Vector NTI file import
In-Reply-To: <200508190941.11168.j.pansanel@pansanel.net>
References: <200508190941.11168.j.pansanel@pansanel.net>
Message-ID: <4305D13B.3060308@genesdigitales.com>
Jerome PANSANEL wrote:
> I would like to write some support for Vector NTI file (derived from GenBank
> format) for biopython. Is already someone working on it ?
> Is someone interested for debugging ?
I do have a valid license for VNTI (not mine, but from the place I
work), I could provide files.
First obvious question: Is it documented?
I could check manual if needed :)
PS: Just changed email address, sorry if it is duplicate.
From j.pansanel at pansanel.net Fri Aug 19 11:43:18 2005
From: j.pansanel at pansanel.net (Jerome PANSANEL)
Date: Fri Aug 19 11:38:41 2005
Subject: [BioPython] Vector NTI file import
In-Reply-To: <4305D13B.3060308@genesdigitales.com>
References: <200508190941.11168.j.pansanel@pansanel.net>
<4305D13B.3060308@genesdigitales.com>
Message-ID: <200508191743.19070.j.pansanel@pansanel.net>
On Friday 19 August 2005 14:31, Sebastian Bassi wrote:
> Jerome PANSANEL wrote:
> > I would like to write some support for Vector NTI file (derived from
> > GenBank format) for biopython. Is already someone working on it ?
> > Is someone interested for debugging ?
>
> I do have a valid license for VNTI (not mine, but from the place I
> work), I could provide files.
That would be great!
> First obvious question: Is it documented?
> I could check manual if needed :)
I have not found any documentation. I only know that it is very similar to
the GenBank file format.
The main differences are:
the header, which can only contain LOCUS and SOURCE
a large number of COMMENT entries
Jerome Pansanel
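A quick way to check the differences listed above against a real file is to count the top-level keywords that appear before the feature table. This is a minimal sketch, assuming only the general GenBank flat-file convention that top-level keywords are upper-case words starting in the first column; `top_level_keywords` is a made-up name, and nothing here is based on Vector NTI documentation.

```python
import re
from collections import Counter

# GenBank-style flat files put top-level keywords in upper case at column 1;
# sub-keywords and continuation lines are indented and are skipped here.
KEYWORD = re.compile(r"^([A-Z]+)\b")


def top_level_keywords(text):
    """Count top-level keywords that appear before the FEATURES table."""
    counts = Counter()
    for line in text.splitlines():
        match = KEYWORD.match(line)
        if not match:
            continue
        if match.group(1) == "FEATURES":
            break  # the header ends where the feature table starts
        counts[match.group(1)] += 1
    return counts
```

Running this over a Vector NTI export and over a regular GenBank record would show at a glance which header keywords are missing and how many COMMENT entries were added.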
From j.pansanel at pansanel.net Fri Aug 19 11:44:48 2005
From: j.pansanel at pansanel.net (Jerome PANSANEL)
Date: Fri Aug 19 11:40:14 2005
Subject: [BioPython] Vector NTI file import
In-Reply-To: <200508191139.09551.frederic.sohm@iaf.cnrs-gif.fr>
References: <200508190941.11168.j.pansanel@pansanel.net>
<200508191139.09551.frederic.sohm@iaf.cnrs-gif.fr>
Message-ID: <200508191744.49362.j.pansanel@pansanel.net>
Hi
On Friday 19 August 2005 11:39, you wrote:
> Hi,
>
> I am definitely. Interested I mean. How do you plane to work on the NTI
> file format? I have had a look to it and it seems particularly complex.
> What kind of support do you have in mind?
I am thinking of importing data only. Vector NTI seems to import GenBank files
easily, so there is no need to export to this format, is there?
Jerome
From sameet at nccs.res.in Mon Aug 22 00:14:25 2005
From: sameet at nccs.res.in (Sameet)
Date: Mon Aug 22 00:05:14 2005
Subject: [BioPython] Doubt about the MEME parser
In-Reply-To: <200508191540.j7JFe5Tx027099@portal.open-bio.org>
Message-ID: <0a292e625ac651dd294d719f0fd8bbf0430951ca@nccs.res.in>
Hi,
I found the module to deal with MEME output files in the latest CVS.
However, I couldn't get it working. Any pointers will be helpful.
Regards
Sameet
-----Original Message-----
From: biopython-bounces@portal.open-bio.org
[mailto:biopython-bounces@portal.open-bio.org] On Behalf Of
biopython-request@portal.open-bio.org
Sent: Friday, August 19, 2005 9:11 PM
To: biopython@biopython.org
Subject: BioPython Digest, Vol 32, Issue 2
Send BioPython mailing list submissions to
biopython@biopython.org
To subscribe or unsubscribe via the World Wide Web, visit
http://biopython.org/mailman/listinfo/biopython
or, via email, send a message with subject or body 'help' to
biopython-request@biopython.org
You can reach the person managing the list at
biopython-owner@biopython.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of BioPython digest..."
Today's Topics:
1. Re: Changes in NCBI BLAST output format !!?? (Aur?lie Bornot)
2. Blast and multiple processors (Daniel Tomso)
3. RE: Changes in NCBI BLAST output format !!?? (Michiel De Hoon)
4. re: software skills course (Greg Wilson)
5. where to find an updated cookbook? (xuying)
6. Re: re: software skills course (Ann Loraine)
7. RE: where to find an updated cookbook? (Michiel De Hoon)
8. Vector NTI file import (Jerome PANSANEL)
9. Re: Vector NTI file import (Frederic Sohm)
10. Re: Vector NTI file import (Sebastian Bassi)
11. Re: Vector NTI file import (Jerome PANSANEL)
----------------------------------------------------------------------
Message: 1
Date: Wed, 3 Aug 2005 20:54:36 +0200
From: Aur?lie Bornot
Subject: Re: [BioPython] Changes in NCBI BLAST output format !!??
To:
Message-ID: <001b01c5985c$cc046360$0b413851@YSENGARD>
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
reply-type=response
Thank you very much Alexander !!
I didn't dare to change the code in the Bio.Blast.NCBIWWW on my own because
I didn't have time to make tests...
So I simply automatiquely added the in the Blast file... not very nice..
I know !
I will try your method instead !
Thanks !
Aurilie
--------------
Aurelie BORNOT
MNHN
Paris
----- Original Message -----
From: "Alexander A. Morgan"
To:
Cc:
Sent: Tuesday, August 02, 2005 10:07 PM
Subject: Re: [BioPython] Changes in NCBI BLAST output format !!??
> Hello:
> I've just run into the same problem, and I haven't seen a suggested fix
> go by, so I apologize if this is redundant information, but it seems that
> the files I've been getting from NCBI have a removed from the header
> between the "RID: " line and the "
Query" line, and it is just a blank
> line now. If you edit Bio.Blast.NCBIWWW to not look for the "
", it
> seems to work okay.
>
> class _Scanner:
> ....
> def _scan_header(self, uhandle, consumer):
> ....
>
> change:
> attempt_read_and_call(uhandle, consumer.noevent, start='
')
> to: attempt_read_and_call(uhandle, consumer.noevent)
>
>
>
>
>
> aurelie.bornot@free.fr wrote:
>
>>Thank you very much Jessica !!!
>>
>>Unfortunately, I need a lot of thing in the BLAST reports.....
>>It will be difficult to do the same thing as you did....
>>
>>I will try to do something in the code of parser of Python.
>>But it will be difficult for me..
>>so if you or someone has advices !!!
>>
>>Thanks a lot again for your answer Jessica !
>>Aurilie
>>
>>
>>--------------
>>Aurelie BORNOT
>>MNHN
>>Paris
>>
>>
>>_______________________________________________
>>BioPython mailing list - BioPython@biopython.org
>>http://biopython.org/mailman/listinfo/biopython
>>
>
>
>
>
------------------------------
Message: 2
Date: Thu, 4 Aug 2005 16:40:17 -0400
From: "Daniel Tomso"
Subject: [BioPython] Blast and multiple processors
To:
Message-ID:
Content-Type: text/plain; charset="us-ascii"
Hello, all.
I'm working on improving my BLAST throughput, and I have some questions
about how the program handles multiple processors and multiple
processes.
Specifically, I've been experimenting with using BioPython's
NCBIStandalone to handle 3 or 4 simultaneous blast requests, since my
system has 4 processors. I spin out the requests via
NCBIStandalone.blastall(blah, blah), then grab the blast_out and
blast_err file handles in a list. Afterwards, I use blast_out.read() to
collect the reports from each of the 4 processes.
Is this wise and/or efficient? My execution times do drop off when I
do, say, 4 jobs at a time instead of 1 at a time, so it is helping. Do
the processor flags for blastall accomplish this more efficiently?
Sorry if this is not specific enough, but any insight would be
welcome!!!!
Dan T.
Daniel J. Tomso
Senior Scientist, Bioinformatics
Athenix Corporation
2202 Ellis Road
Suite B
Durham, NC 27703
919.281.0920
dtomso@athenixcorp.com
www.athenixcorp.com
Disclaimer: This message (including any attachments) may contain
confidential or privileged information and is intended only for the use
of the addressee named above. If you are not the intended recipient of
this message, you are hereby notified that you must not use, copy,
disclose or take any action based on this message or information herein.
If you have received this message in error, please advise the sender
immediately and erase all copies of this message and any related
attachments. Thank you.
------------------------------
Message: 3
Date: Sun, 7 Aug 2005 20:08:24 -0400
From: "Michiel De Hoon"
Subject: RE: [BioPython] Changes in NCBI BLAST output format !!??
To: "Alexander A. Morgan"
Cc: biopython@biopython.org
Message-ID:
<6CA15ADD82E5724F88CB53D50E61C9AE7AC297@cgcmail.cgc.cpmc.columbia.edu>
Content-Type: text/plain; charset="iso-8859-1"
I've updated Biopython in CVS with this fix. See Bio/Blast/NCBIWWW.py
revision 1.41. Please let me know if you find any problems. Thanks for
finding this solution.
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
-----Original Message-----
From: Alexander A. Morgan [mailto:amorgan@mitre.org]
Sent: Wed 8/3/2005 2:48 PM
To: Michiel De Hoon
Subject: Re: [BioPython] Changes in NCBI BLAST output format !!??
Michiel:
I couldn't find anything wrong with it. The Blast record objects seem
to be correct and have the right alignments. However, I don't know of a
thorough way to test it. In general the parser is pretty fragile though
and will break for even the most minor changes in NCBI format, but it
would be very challenging to try to make it more robust.
Thanks,
-Alex
Michiel De Hoon wrote:
>Do you happen to know if this change can break anything else in the Blast
>parser? From running Biopython's tests for Blast, it seems that this change
>is OK. On the other hand, I don't use Blast much myself, so I don't trust
my
>own judgement in this matter.
>If making this change does not cause any new bugs, I'd be happy to include
it
>in CVS.
>
>--Michiel.
>
>
>Michiel de Hoon
>Center for Computational Biology and Bioinformatics
>Columbia University
>1150 St Nicholas Avenue
>New York, NY 10032
>
>
>
>-----Original Message-----
>From: biopython-bounces@portal.open-bio.org on behalf of Alexander A.
Morgan
>Sent: Tue 8/2/2005 4:07 PM
>To: aurelie.bornot@free.fr
>Cc: biopython@biopython.org
>Subject: Re: [BioPython] Changes in NCBI BLAST output format !!??
>
>Hello:
> I've just run into the same problem, and I haven't seen a suggested
>fix go by, so I apologize if this is redundant information, but it seems
>that the files I've been getting from NCBI have a removed from the
>header between the "RID: " line and the "
Query" line, and it is just
>a blank line now. If you edit Bio.Blast.NCBIWWW to not look for the
>"
", it seems to work okay.
>
>class _Scanner:
>....
> def _scan_header(self, uhandle, consumer):
>....
>
> change:
> attempt_read_and_call(uhandle, consumer.noevent, start='
')
> to:
> attempt_read_and_call(uhandle, consumer.noevent)
>
>
>
>
>
>aurelie.bornot@free.fr wrote:
>
>
>
>>Thank you very much Jessica !!!
>>
>>Unfortunately, I need a lot of thing in the BLAST reports.....
>>It will be difficult to do the same thing as you did....
>>
>>I will try to do something in the code of parser of Python.
>>But it will be difficult for me..
>>so if you or someone has advices !!!
>>
>>Thanks a lot again for your answer Jessica !
>>Aurilie
>>
>>
>>--------------
>>Aurelie BORNOT
>>MNHN
>>Paris
>>
>>
>>_______________________________________________
>>BioPython mailing list - BioPython@biopython.org
>>http://biopython.org/mailman/listinfo/biopython
>>
>>
>>
>>
>
>
>_______________________________________________
>BioPython mailing list - BioPython@biopython.org
>http://biopython.org/mailman/listinfo/biopython
>
>
>
------------------------------
Message: 4
Date: Tue, 09 Aug 2005 09:13:10 -0400
From: Greg Wilson
Subject: [BioPython] re: software skills course
To: biopython@biopython.org
Message-ID:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Hi,
I'm working with support from the Python Software Foundation to develop
an open source course on basic software development skills for people
with backgrounds in science and engineering. I have a beta version of
the course notes ready for review, and would like to pull in people
in sci&eng to look it over and give me feedback. If you know anyone
who fits this bill (particularly people who might be interested in
following along with a trial run of the course this fall), I'd be
grateful for pointers.
Thanks,
Greg Wilson
------------------------------
Message: 5
Date: Wed, 10 Aug 2005 13:55:09 +0800
From: "xuying"
Subject: [BioPython] where to find an updated cookbook?
To: "biopython"
Message-ID: <20050810055436.9B14C10DECC@smtp.sibsnet.org>
Content-Type: text/plain; charset="gb2312"
Can anyone tell me where to find an updated biopython tutorial?
Examples in the online tutorial are full of errors. Thanks!
!!!!!!!!!!!!!!!!xuying
!!!!!!!!!!!!!!!!xuying@sibs.ac.cn
!!!!!!!!!!!!!!!!!!!!2005-08-10
------------------------------
Message: 6
Date: Wed, 10 Aug 2005 13:11:49 -0700
From: Ann Loraine
Subject: Re: [BioPython] re: software skills course
To: Greg Wilson
Cc: biopython@biopython.org
Message-ID: <6f16141077ca7fbb9bd08da6746e2b5d@loraine.net>
Content-Type: text/plain; charset=US-ASCII; format=flowed
Hello,
I would appreciate the chance to see the notes. It would be helpful to
the postdocs and students I supervise who would like to learn python.
Yours,
Ann Loraine
On Aug 9, 2005, at 6:13 AM, Greg Wilson wrote:
> Hi,
>
> I'm working with support from the Python Software Foundation to develop
> an open source course on basic software development skills for people
> with backgrounds in science and engineering. I have a beta version of
> the course notes ready for review, and would like to pull in people
> in sci&eng to look it over and give me feedback. If you know anyone
> who fits this bill (particularly people who might be interested in
> following along with a trial run of the course this fall), I'd be
> grateful for pointers.
>
> Thanks,
> Greg Wilson
>
> _______________________________________________
> BioPython mailing list - BioPython@biopython.org
> http://biopython.org/mailman/listinfo/biopython
>
------------------------------
Message: 7
Date: Thu, 11 Aug 2005 13:59:35 -0400
From: "Michiel De Hoon"
Subject: RE: [BioPython] where to find an updated cookbook?
To: "xuying" , "biopython"
Message-ID:
<6CA15ADD82E5724F88CB53D50E61C9AE7AC2B0@cgcmail.cgc.cpmc.columbia.edu>
Content-Type: text/plain; charset="iso-8859-1"
> Can anyone tell me where to find an updated biopython tutorial?
> Examples in the online tutorial are full of errors. Thanks!
Can you make a list of the errors that you found? Then it'll be easier for us
to fix those errors. If you have a solution to the errors that you found,
even better!
--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
------------------------------
Message: 8
Date: Fri, 19 Aug 2005 09:41:10 +0200
From: Jerome PANSANEL
Subject: [BioPython] Vector NTI file import
To: biopython@biopython.org
Message-ID: <200508190941.11168.j.pansanel@pansanel.net>
Content-Type: text/plain; charset="iso-8859-1"
Hello,
I would like to write some support for Vector NTI files (a format derived from
GenBank) for biopython. Is someone already working on it?
Is anyone interested in helping with debugging?
Thanks,
Jerome Pansanel
------------------------------
Message: 9
Date: Fri, 19 Aug 2005 11:39:09 +0200
From: Frederic Sohm
Subject: Re: [BioPython] Vector NTI file import
To: biopython@biopython.org, Jerome PANSANEL
Message-ID: <200508191139.09551.frederic.sohm@iaf.cnrs-gif.fr>
Content-Type: text/plain; charset="iso-8859-1"
Hi,
I definitely am. Interested, I mean. How do you plan to work on the NTI file
format? I have had a look at it and it seems particularly complex. What kind
of support do you have in mind?
Cheers
Fred
On Friday 19 August 2005 09:41, Jerome PANSANEL wrote:
> Hello,
>
> I would like to write some support for Vector NTI file (derived from
> GenBank format) for biopython. Is already someone working on it ?
> Is someone interested for debugging ?
>
> Thanks,
>
> Jerome Pansanel
--
Frédéric Sohm
Equipe INRA U1126 "Morphogenèse du système nerveux des Chordés"
UPR 2197 DEPSN, CNRS
Institut de Neurosciences A. Fessard
1 Avenue de la Terrasse
91 198 GIF-SUR-YVETTE
FRANCE
Phone: +33 (0) 1 69 82 34 12
Fax:+33 (0) 1 69 82 34 47
------------------------------
Message: 10
Date: Fri, 19 Aug 2005 09:31:55 -0300
From: Sebastian Bassi
Subject: Re: [BioPython] Vector NTI file import
To: biopython@biopython.org
Message-ID: <4305D13B.3060308@genesdigitales.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Jerome PANSANEL wrote:
> I would like to write some support for Vector NTI file (derived from
> GenBank format) for biopython. Is already someone working on it ?
> Is someone interested for debugging ?
I do have a valid license for VNTI (not mine, but from the place I
work), so I could provide files.
First obvious question: Is it documented?
I could check the manual if needed :)
PS: I just changed my email address, sorry if this is a duplicate.
------------------------------
Message: 11
Date: Fri, 19 Aug 2005 17:43:18 +0200
From: Jerome PANSANEL
Subject: Re: [BioPython] Vector NTI file import
To: biopython@biopython.org
Message-ID: <200508191743.19070.j.pansanel@pansanel.net>
Content-Type: text/plain; charset="iso-8859-1"
On Friday 19 August 2005 14:31, Sebastian Bassi wrote:
> Jerome PANSANEL wrote:
> > I would like to write some support for Vector NTI file (derived from
> > GenBank format) for biopython. Is already someone working on it ?
> > Is someone interested for debugging ?
>
> I do have a valid license for VNTI (not mine, but from the place I
> work), I could provide files.
That would be great!
> First obvious question: Is it documented?
> I could check manual if needed :)
I have not found any documentation. I only know that it is very similar to
the GenBank file format.
The main differences are:
- the header, which can only contain LOCUS and SOURCE
- a lot of COMMENT lines
Jerome Pansanel
>
>
> PS: Just changed email address, sorry if it is duplicate.
------------------------------
_______________________________________________
BioPython mailing list - BioPython@biopython.org
http://biopython.org/mailman/listinfo/biopython
End of BioPython Digest, Vol 32, Issue 2
****************************************
From frederic.sohm at iaf.cnrs-gif.fr Mon Aug 22 04:13:13 2005
From: frederic.sohm at iaf.cnrs-gif.fr (Frederic Sohm)
Date: Mon Aug 22 04:04:28 2005
Subject: [BioPython] Vector NTI file import
Message-ID: <200508221013.13555.frederic.sohm@iaf.cnrs-gif.fr>
On Friday 19 August 2005 17:44, you wrote:
> Hi
>
> On Friday 19 August 2005 11:39, you wrote:
> > Hi,
> >
> > I am definitely. Interested I mean. How do you plane to work on the NTI
> > file format? I have had a look to it and it seems particularly complex.
> > What kind of support do you have in mind?
>
> I am thinking about importing data. Vector NTI seems to import GenBank
> files easily, so it is not necessary to export this type of file, is it?
>
Yes, that is quite enough. But what I meant is: do you plan to import the
GenBank part of the Vector NTI format (the GenBank fields after Features) or
the Vector NTI part of it, which records everything displayed by Vector NTI in
the graphical map? It makes quite a difference to the amount of code to write.
Anyway, I can do some testing of your code.
Good luck.
Fred
> Jerome
>
> > Cheers
> >
> > Fred
> >
> > On Friday 19 August 2005 09:41, Jerome PANSANEL wrote:
> > > Hello,
> > >
> > > I would like to write some support for Vector NTI file (derived from
> > > GenBank format) for biopython. Is already someone working on it ?
> > > Is someone interested for debugging ?
> > >
> > > Thanks,
> > >
> > > Jerome Pansanel
--
Frédéric Sohm
Equipe INRA U1126 "Morphogenèse du système nerveux des Chordés"
UPR 2197 DEPSN, CNRS
Institut de Neurosciences A. Fessard
1 Avenue de la Terrasse
91 198 GIF-SUR-YVETTE
FRANCE
Phone: +33 (0) 1 69 82 34 12
Fax:+33 (0) 1 69 82 34 47
From mdehoon at c2b2.columbia.edu Mon Aug 22 10:26:01 2005
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Mon Aug 22 10:17:07 2005
Subject: [BioPython] FW: NETTAB 2005 - Deadlines approaching: early
registration and call (fwd)
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE7AC2CB@cgcmail.cgc.cpmc.columbia.edu>
FYI
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032
-----Original Message-----
From: mailman-bounces@portal.open-bio.org on behalf of Paolo Romano
Sent: Mon 8/22/2005 9:52 AM
To: biopython-owner@biopython.org
Subject: NETTAB 2005 - Deadlines approaching: early registration and call
(fwd)
Dear list owner,
I would be glad if you could forward the following message to
your mailing list.
Thank you in advance. Best regards. Paolo Romano
This message has been forwarded by paolo.romano@istge.it
--------
Dear all,
this is a reminder of the upcoming deadlines for the NETTAB 2005 Workshop on
"Workflows management: new abilities for the biological information
overflow", which will be held on October 5-7, 2005, in Naples, Italy.
----------------------------------------------------------------------
The Scientific Programme is now available on-line at
http://www.nettab.org/2005/progr.html .
The Opening Lecture will be given by Francis Ouellette.
Francis Ouellette is the Director of the UBiC, the Bioinformatics
Centre of the University of British Columbia, Canada. He is Associate
Professor of the Michael Smith Laboratories and of the Department of
Medical Genetics of UBC. He is also a core faculty member in the new
UBC graduate Training Program in Bioinformatics for Health Research,
associate director of Bioinformatics at Genome British Columbia and
director of the Canadian Genetic Diseases Network (CGDN) bioinformatics
core facility where he helps coordinate the Canadian Bioinformatics
Workshops.
Francis has an exceptional curriculum vitae, and his recent research,
training and coordination activities make him one of the best-known
and most appreciated bioinformaticians.
The title of his opening talk will be
"Workflow management in bioinformatics: the possibilities and the
challenges".
----------------------------------------------------------------------
The Call for posters and position papers closes next Friday,
August 26, 2005.
You are all warmly invited to present your recent activity related to the
workshops' topics by submitting a poster abstract (1-2 A4 pages, font size
12 pt, MS Word format) by email to posters2005@nettab.org .
Topics are the following:
Technologies and technological platforms of interest, with emphasis on:
- Web Services (SOAP, WSDL, WSFL, UDDI, ....)
- Web Services Choreography and Orchestration
- Semantic Web (RDF, LSID, OWL, ...)
- comparison of available technologies, limitations, pros and cons
- knowledge representation
- biological data and knowledge modeling tools
- Ontologies, Databases and Applications of Semantics in Bioinformatics
Workflow management systems in bioinformatics
- implementations of web services
- implementations of registries
- reuse and versioning of web services and workflows
- workflow management systems
- web interfaces for accessing and executing workflows
- interactive systems to support work flows
Applications of workflow management systems in bioinformatics
- Methodologies for life sciences analysis, such as:
- gene expression,
- genome annotation,
- mass spec peptide fragment identification,
- Encoding of the above in workflows
- Case studies
- Scenarios and use cases
Check all details at: http://www.nettab.org/2005/call.html .
-----------------------------------------------------------------------
The early registration deadline is also next Friday August 26, 2005.
The registration form is available on-line at:
http://www.nettab.org/2005/rform.html .
The payment of the fee can either be done on-line, through the Online
Payment Form of the Bioinformatics Italian Society (BITS), or by direct
money transfer.
Participation fees are as follows:
Until August 26, 2005:
- Students: 70,00 Euro
- Academic: 130,00 Euro (reduced fee: 117,00 Euro)
- Non-academic: 270,00 Euro (reduced fee: 243,00 Euro)
After August 26, 2005:
- Students: 70,00 Euro
- Academic: 180,00 Euro (reduced fee: 162,00 Euro)
- Non-academic: 370,00 Euro (reduced fee: 333,00 Euro)
The 10% reduction on fees is applied for members of:
- ISCB (International Society for Computational Biology),
http://www.iscb.org/
- BITS (Bioinformatics Italian Society),
http://www.bioinformatics.it/
- Hormone Responsive Breast Cancer (HRBC) Genomics Network,
http://www.hrbc-genomics.net/
- Oncology over Internet (O2I) project,
http://www.o2i.it/
- Interdisciplinary Laboratory for Technologies in Bioinformatics (LITBIO)
-----------------------------------------------------------------------
I'm looking forward to meeting many of you in Naples quite soon.
Ciao. Paolo
--
Paolo Romano (paolo.romano@istge.it)
Bioinformatics and Structural Proteomics
National Cancer Research Institute (IST)
Largo Rosanna Benzi, 10, I-16132, Genova, Italy
Tel: +39-010-5737-288 Fax: +39-010-5737-295
Web: http://www.nettab.org/promano/
From meames at itsa.ucsf.edu Tue Aug 23 15:42:32 2005
From: meames at itsa.ucsf.edu (meames@itsa.ucsf.edu)
Date: Tue Aug 23 15:31:59 2005
Subject: [BioPython] Formating files for Clustalw
Message-ID: <200508231942.j7NJgWmG029335@itsa.ucsf.edu>
Hi all
I'm working my way through the cookbook and I've run into a snag in
section 3.5.1 - Clustalw
I've created a simple two-entry FASTA file for aligning but the parser
appears to reject it. There are no question marks or other punctuation in
the titles (such as I've read about on this board) that would seem to give it
trouble, so I'm at a bit of a loss. Can anyone help? (I'm running
clustalw 1.81)
Here is the error message:
Traceback (most recent call last):
  File "./practice.py", line 21, in ?
    alignment = Clustalw.do_alignment(cline)
  File "/usr/lib/python2.3/site-packages/Bio/Clustalw/__init__.py", line 116, in do_alignment
    return parse_file(out_file, alphabet)
  File "/usr/lib/python2.3/site-packages/Bio/Clustalw/__init__.py", line 55, in parse_file
    parser.parseFile(to_parse)
  File "/usr/lib/python2.3/site-packages/Martel/Parser.py", line 328, in parseFile
    self.parseString(fileobj.read())
  File "/usr/lib/python2.3/site-packages/Martel/Parser.py", line 356, in parseString
    self._err_handler.fatalError(result)
  File "/usr/lib/python2.3/site-packages/_xmlplus/sax/handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 0
Here is the code:
cline = MultipleAlignCL('file_to_align')
cline.set_output('test.aln')
alignment = Clustalw.do_alignment(cline)
Here is the file_to_align:
>sptrembl|Q00647|Q00647 Myosin I heavy chain. [Emericella nidulans]
MGHSRRPAGGEKKSRFGRSKAAADVGDGRQAGGKPQVRKAVFESTKKKEIGVSDLTLLSK
ISNEAINDNLKLRFQHDEIYTYIGHVLVSVNPFRDLGIYTDSVLNSYRGKNRLEVPPHVF
AVAESAYYNMKSYKDNQCVIISGESGAGKTEAAKRIMQYIASVSGGSDSSIQQTKDMVLA
TNPLLESFGNAKTLRNNNSSRFGKYLELEFNAQGEPVGANITNYLLEKSRVVGQITNERN
FHIFYQFAKGAPQKYRDSFGVQQPQSYLYTSRSKCFDVPGVDDVAEFQDTLNAMSVIGMS
EAEQDNVFRMLAAILWMGNIQFAEDDSGNAAITDQSVVDFVAYLLEVDAGQVNQALTIRM
METSRGGRRGSVYEVPLNTTQALAVRDALAKAIYFNLFDWIVGRVNQSLTAKGAVANSIG
ILDIYGFEIFEKNSFEQLCINYVNEKLQQIFIQLTLKAEQDEYEREQITWTPIKYFDNKV
VCSLIEDKRPPGVFAALNDACATAHADSGAADNTFVGRLNFLGQNPNFENRQGQFIIKHY
AGDVSYAVQGMTDKNKDQLLKDLLNLVQSSSNHFVHTLFPEQVNQDDKRRPPTASDKIKA
SANDLVAMLMKAQPSYIRTIKPNDNKAPKEFNESNVLHQIKYLGLQENVRIRRAGFAYRQ
TFDKFVERFYLLSPKTSYAGDYTWTGDVETGARQILKDTRIPAEEYQMGITKVFIKTPET
LFALEAMRDRYWHNMAIRIQRAWRNYLRYRTECAIRIQRFWPRMNGGLELLKLRDQGHTI
LGGRKERRRMSILGSRRFLGDYVGISNKGGPGEMIRSGAAISTSDDVLFSCRGEVLVSKF
GRSSKPSPRIFVLTNRHVYIVSQNFVNNQLVISSERTIPIGAIKTVSASSYRDDWFSLVV
GGQEPDPLCNCVFKTEFFTHLHNALRGQLNLKIGPEIEYNKKPGKLATVKVVKDGSQVDS
YKSGTIHTGPGEPPNSVSKPTPRGKQVAARPVTKGKLLRLAVQAVARPNWLPDLYQSVGL
YHSPRLKQPRRNRHQRPDPFLNQWQPLQHPIHVLHLLPPQGHHPRLLPRPPAAAGPKKAK
ALYDFSSDNNGMLSISAGQIVEIVSKEGNGWWLCMNLETSAQGWTPEAYLEEQVAPTPKP
APPPPPPVAPRASPAPVNGSAAVAAAKAKAAPPPPAKRPNMAGRKTAPAPPPAPRDSAVS
MNSQGDSSGASGRGTPSSVSNACLAGGLAEALRRRQSAMQGKQDDDDDW
>gi|17507983|ref|NP_492393.1| F29D10.4 [Caenorhabditis elegans]
MAFHWQSKVNVQHVGVDDMVLLPKLTEQSIVENLKKRLQANSIFTYIGPVLISVNPFKQM
PYFTEKEMLLYQGAAQYENAPHIYALADNMYRNMLIDNESQCVIISGESGAGKTVNAKFI
MNYISRISGGGQKVQHIKDVILQSNPLLEAFGNSATVRNWNSSRFGKYVEIVFSRGGEPI
GGKLSNFLLEKSRVVHQNEGDRNFHVFYQLCAGADKNLRSTFGIGELQYYNYLNMSGVFK
ADDTDDGKEFESTLHAMKVVGVNDQDQLEVLRIVATVLHIGNITFTEENNFAAVSGKDYL
EYPAFLLGLTSADIEAKLTGRKMESKWGTQKEEIDMKLNVEQASYTRDAWVKAIYARLFD
YLVKKVNDAMNITSQSTSDNFSVGILDIYGFEIFNNNGFEQFCINFVNEKLQQIFIELTL
KAEQEEYVREGIKWTEIDYFDNKIVCDLIETKRPPGIMSLLDDTCAQNHGQREGVDRQLL
TTLSKSFAGHPHFGPGSDSFVIKHYAGDVTYNVDGFCDRNRDVLYPDLILLMQKSSRPFI
QALFPENVAASAGKRPTTFSTKIRTQANTLVESLMKCSPHYVRCIKPNETKRPNDWEESR
VKHQVEYLGLRENIRVRRAGFAYRRAFDKFAQRYAIVSPQTWPCFQGDQQRACEIICDSV
HMEKNQYQMGKTKIFVKNPESLFLLEETRERKFDGYARVIQKAWRQFSARKQHIKQKEQA
ADLMYGKKERRRYSLNRNFVGDYIGLEHHPTLQSLVGKRQRVLFACTANKYDRKFRVTKL
DLLLTVNHLTLIGKEKVKNGPEKGKIVEVIKRQFDLPQIKSIGLSPYQDDFVILYLGNDD
YSSLLETPFKTEFCTALSKAYKERTNGTLHLDFRSSHVVSYKKMKFDFSDGKRTVQFGND
GTSSAEKTLKPNGKVLNVSIGTGLPNTTRPSTERPQGGYTPRRDQLRTSTRRTKQNNQSY
GQNGQSQAMRAPVPAHGMNNNYNQTPAPVSTNHQYSQEPARIPVMGNVINQLNNMNLSGN
GNSPAGRGPPPARGPKPPPPAKPKLNPVVIAVYPYEAQDVDELSFEAGAEIELMNKDASG
WWQGKVNNRVGLFPGNYVKE
I've also attempted to run the simple command line:
clustalw ./file_to_align -OUTFILE=test.aln
without success, resulting in the error message:
Error: unknown option -./file_to_align
Thanks
Matt "I'm new at this" Eames
From j.pansanel at pansanel.net Wed Aug 24 03:42:02 2005
From: j.pansanel at pansanel.net (Jerome PANSANEL)
Date: Wed Aug 24 03:37:59 2005
Subject: [BioPython] Formating files for Clustalw
In-Reply-To: <200508231942.j7NJgWmG029335@itsa.ucsf.edu>
References: <200508231942.j7NJgWmG029335@itsa.ucsf.edu>
Message-ID: <200508240942.03064.j.pansanel@pansanel.net>
Le Mardi 23 Ao?t 2005 21:42, meames@itsa.ucsf.edu a ?crit?:
> Hi all
>
...
Hi,
1. clustalw -infile=file_to_align -outfile=test.aln works very well for me
(clustalw 1.83)
2. After running your code, what does your test.aln file look like?
Does it look like a clustalw file?
Is your 'test.aln' file like this:
CLUSTAL W (1.83) multiple sequence alignment

sptrembl|Q00647|Q00647          MGHSRRPAGGEKKSRFGRSKAAADVGDGRQAGGKPQVRKAVFESTKKKEI
gi|17507983|ref|NP_492393.1|    MAFHWQSK------------------------------------VNVQHV
                                *..  :.                                     .: :.:

sptrembl|Q00647|Q00647          GVSDLTLLSKISNEAINDNLKLRFQHDEIYTYIGHVLVSVNPFRDLGIYT
gi|17507983|ref|NP_492393.1|    GVDDMVLLPKLTEQSIVENLKKRLQANSIFTYIGPVLISVNPFKQMPYFT
                                **.*:.**.*:::::* :*** *:* :.*:**** **:*****:::  :*
...
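For anyone scripting this, the two points above can be sketched in Python. This is only a minimal sketch and not part of Bio.Clustalw: the helper names clustalw_command, looks_like_clustal and run_clustalw are made up for illustration, and it assumes a clustalw executable on the PATH that accepts the -infile=/-outfile= options shown in point 1.

```python
import subprocess


def clustalw_command(infile, outfile, executable="clustalw"):
    """Build the argument list; the input file is passed as -infile=..."""
    return [executable, "-infile=" + infile, "-outfile=" + outfile]


def looks_like_clustal(text):
    """Return True if the first non-blank line starts with 'CLUSTAL'."""
    for line in text.splitlines():
        if line.strip():
            return line.startswith("CLUSTAL")
    return False


def run_clustalw(infile, outfile):
    """Run clustalw, then report whether the output looks like an alignment."""
    # check=True raises CalledProcessError if clustalw exits with an error.
    subprocess.run(clustalw_command(infile, outfile), check=True)
    with open(outfile) as handle:
        return looks_like_clustal(handle.read())
```

If run_clustalw('file_to_align', 'test.aln') returns False, the output file is probably not an alignment at all, which would fit a parser error reported at character 0.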
Jerome Pansanel
--
From pwilkinson_m at xbioinformatics.org Wed Aug 24 16:31:30 2005
From: pwilkinson_m at xbioinformatics.org (Peter Wilkinson)
Date: Wed Aug 24 16:20:05 2005
Subject: [BioPython] Vector NTI file import
In-Reply-To: <200508191744.49362.j.pansanel@pansanel.net>
References: <200508190941.11168.j.pansanel@pansanel.net>
<200508191139.09551.frederic.sohm@iaf.cnrs-gif.fr>
<200508191744.49362.j.pansanel@pansanel.net>
Message-ID: <6.2.1.2.0.20050824162113.034978e8@mail.xbioinformatics.org>
Hi there,
I can comment on this, since I used to work for the company that created Vector NTI.
The Vector NTI format is an adulteration of the Genbank format.
The format is simple: GenBank + additional data in COMMENT tags.
Vector NTI stores additional data associated with your sequence in
flat files in the back end. In order to keep to the 'GenBank' format when it
exports a file, it takes the additional data stored in Vector NTI and stores it
as name/value pairs inside a GenBank COMMENT tag, along with some serialization
information that is kept in NTI (a proprietary NTI serialization format ...).
You will immediately see how this works if you export a GenBank file from
Vector NTI. So if you had done annotations of some kind, or primers or
whatever, that is where you would find them.
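A minimal sketch of pulling the COMMENT text out of a GenBank-style flat file in plain Python. It relies only on the usual GenBank layout (keyword at the start of the line, continuation lines indented). How Vector NTI lays out its name/value pairs inside the comments is not described here, so the sketch returns the raw text and makes no attempt to interpret it. extract_comments is a hypothetical helper name, not a Biopython function.

```python
def extract_comments(lines):
    """Collect the text of each COMMENT block in a GenBank-style record."""
    comments = []
    current = None  # parts of the COMMENT block being read, if any

    def finish(parts):
        # Join the collected parts, skipping empty ones.
        comments.append(" ".join(part for part in parts if part))

    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("COMMENT"):
            if current is not None:
                finish(current)
            current = [line[len("COMMENT"):].strip()]
        elif current is not None and line.startswith(" "):
            # Indented line: continuation of the current COMMENT block.
            current.append(line.strip())
        elif current is not None:
            # Any other keyword (or a blank line) ends the block.
            finish(current)
            current = None
    if current is not None:
        finish(current)
    return comments
```

Running this over a file exported from Vector NTI should show quickly which comments carry the extra NTI data and which are ordinary free-text comments.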
This was done in order to make sure an exported Vector NTI GenBank file
was compatible with other software that recognises the GenBank format,
while retaining information that was added within NTI.
For me, if you mess with the output formatting, it's no longer officially
GenBank format but Vector NTI's format ... but that is a philosophical
debate.
Feel free to contact me and send me a sample output. I have not worked with
NTI for a while ... but it will all come back to me.
Peter
At 11:44 AM 19/08/2005, Jerome PANSANEL wrote:
>Hi
>
>On Friday 19 August 2005 11:39, you wrote:
> > Hi,
> >
> > I am definitely. Interested I mean. How do you plane to work on the NTI
> > file format? I have had a look to it and it seems particularly complex.
> > What kind of support do you have in mind?
>
>I think about importing data. vector NTI seems to easely import genbank file,
>so it is not necessary to export this type of file, it isn't ?
>
>Jerome
>
> > Cheers
> >
> > Fred
> >
> > On Friday 19 August 2005 09:41, Jerome PANSANEL wrote:
> > > Hello,
> > >
> > > I would like to write some support for Vector NTI file (derived from
> > > GenBank format) for biopython. Is already someone working on it ?
> > > Is someone interested for debugging ?
> > >
> > > Thanks,
> > >
> > > Jerome Pansanel
From pwilkinson_m at xbioinformatics.org Wed Aug 24 16:43:15 2005
From: pwilkinson_m at xbioinformatics.org (Peter Wilkinson)
Date: Wed Aug 24 16:31:43 2005
Subject: [BioPython] Fasta parser, minor (bug/feature?)
In-Reply-To: <200508240942.03064.j.pansanel@pansanel.net>
References: <200508231942.j7NJgWmG029335@itsa.ucsf.edu>
<200508240942.03064.j.pansanel@pansanel.net>
Message-ID: <6.2.1.2.0.20050824163137.03497658@mail.xbioinformatics.org>
It seems that the fasta parser retains the OS-specific line endings when it
stores the title and sequence in the Record object, so when I read a file
while working in Windows (eeeeek) I have to write it out with something like
this, and then display it using a true text editor like Context:
file_out.writelines(str(cur_record).replace('\r',''))
... because all the line endings are '\r\n', which the text editor displays as
two returns, so the text written to file appears double-spaced instead of
single-spaced:
>gi|272209|gb|M61959.1| EST00007 Fetal brain, Stratagene (cat#936206) ...
CTTCCCTTTTGTTCCCCTCAGTGTCCCTTTTAATTGCTTCCCTCCATTTTCCTTAGCAGC
ATCCTAGTTGATGGTCTGGGTTATCAGAGGAGCAAAAACATTTAAGTGTCAAATAATGCT
CATTGTCTCCCTGGGATTTCTAAACAGAAAAAATGAAGAAAGAGGCAGAGAAGAGCTTCA
Should the parser accept both plain and OS-specific line endings, or just '\n'?
I realise that the Record __str__() method uses os.linesep, but when working
with fasta files in a true text editor in Windows ... only the \n is
needed. Also, I generally work in a mixed environment, where the \r\n should
be avoided.
I am unsure why os.linesep is used here. My vote is to just have a plain
'\n' applied to each end of line.
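A likely explanation for the double spacing is that a string which already contains '\r\n' gets written to a file opened in text mode on Windows, where each '\n' is translated to '\r\n' again, giving '\r\r\n'. As a workaround independent of the parser, line endings can be normalised before writing. This is a minimal sketch assuming a current Python 3 interpreter (the newline argument to open() is not available in the built-in open() of Python 2.x); normalize_newlines and write_unix_text are hypothetical helper names.

```python
def normalize_newlines(text):
    """Convert Windows (CR LF) and bare CR line endings to a single LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def write_unix_text(path, text):
    """Write text with LF line endings on every platform."""
    # newline="\n" stops Python from translating "\n" to os.linesep on write.
    with open(path, "w", newline="\n") as handle:
        handle.write(normalize_newlines(text))
```

Calling write_unix_text('out.fasta', str(cur_record)) would then replace the replace('\r','') call shown earlier in this message.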
Peter
From sbassi at genesdigitales.com Thu Aug 25 14:59:12 2005
From: sbassi at genesdigitales.com (Sebastian Bassi)
Date: Thu Aug 25 14:48:47 2005
Subject: [BioPython] Fasta parser, minor (bug/feature?)
In-Reply-To: <6.2.1.2.0.20050824163137.03497658@mail.xbioinformatics.org>
References: <200508231942.j7NJgWmG029335@itsa.ucsf.edu> <200508240942.03064.j.pansanel@pansanel.net>
<6.2.1.2.0.20050824163137.03497658@mail.xbioinformatics.org>
Message-ID: <430E1500.1010804@genesdigitales.com>
Peter Wilkinson wrote:
> It seems that the fasta parser retains the os specific line endings when
> it stores the title and sequence in the Record object, so I have to
> write out something like this when I read a file from working in windows
> (eeeeek), then display using a true text editor like Context:
....
I've just run into this problem too! But it seems that this is something
that changed, and I will explain why I think so:
I wrote a script in June 2004 and it worked as expected (without showing
double spacing). Then I changed PCs and installed Py2.4 and the latest biopython
(1.4). Today I ran the same program and it printed the fasta format with
double spacing (see attached example).
So I think this is a bug caused by either Python 2.4 or BioPy1.4 :)
>QH_CA_Contig1507for
ATTACGGTCGGGGAGTGGATCCGATATCGATATGATGGTAGGGATCCCTAACTCGCGATCTTCAATACGT
TGCTGCAAGTCGTGACAATTCATTTGATTGGGTATGGAGAAACATCATGAGTTATCCGGATGTCAAATTT
CCTTACATAGCAGTTGGTAACGAGGTCAACCCATCCGATGGCACATTGGCTCCATTGGTTCATCCGGCTT
TGACCAACATCCAAGAAGCTGTCTCGTTTTATGGCCTCAAGGATCAAATCAAAGTTTCAACTTCGATCGA
CACATCTATGATTGGAGTTAGTTATCCTCCGTCACAAGGTGCATTCAGCGATGATGCCCGTGCGTACATA
GACCCGATCATCGGGTTCCTAGTTGCCATCAATGCACCATTGTTGGTTAATGTCTATCCATATTTCAGTT
ACACAGGAAATCCGACACAGATATCACTAGCCTATGCAACATTTACTTCTCCTGGAACCGTAGTACAAGA
TGGAGCAAATGGATACCAAAACCTTTTTGACGCGATAGTAGATGCGATGTACTCAGCGTTAGAGAGGGCC
From chris.lasher at gmail.com Thu Aug 25 18:03:24 2005
From: chris.lasher at gmail.com (Chris Lasher)
Date: Thu Aug 25 17:54:23 2005
Subject: [BioPython] Why would this GenBank file choke the GB parser?
Message-ID: <128a885f050825150342c609d3@mail.gmail.com>
Hello,
I have a GenBank file, accession AY499671.gb, and 21 like it that I
would like to process through BioPython (I am using BioPython 1.40b
with Windows), but I am encountering trouble. It seems that the
GenBank parser is choking on something in the files themselves, but I
could really use help determining what this would be, and in
determining how to fix it. The error seems to be raised by the Martel
Parser, but exactly what is causing it to raise the error is beyond my
knowledge and experience.
I obtained the files from GenBank via the NCBI Entrez website pages,
i.e., http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=41080113
. From a page like this one, I selected "File" in the dialog box
labeled "Send to", and saved the file. I also tried obtaining the
files via BioEdit and saving those, but the parser still had
difficulty with those, as well.
I am attaching my script "gbtoseq.py" that I'm trying to process my
GB files with. I have had success with this script from sequences
obtained from GenBank in the manner described above and can recreate
this success, and I am including one of those successful sequences,
AFU75647. I am also attaching the error output when it chokes on these
22 most recent sequences I've obtained.
I sincerely appreciate any help anyone has to offer.
Thanks very much in advance,
Chris Lasher
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AY499671.gb
Type: pubmed/text
Size: 2663 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython/attachments/20050825/78c4dd26/AY499671-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AFU75647.gb
Type: pubmed/text
Size: 3096 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython/attachments/20050825/78c4dd26/AFU75647-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gbtoseq.py
Type: text/x-python
Size: 1134 bytes
Desc: not available
Url : http://portal.open-bio.org/pipermail/biopython/attachments/20050825/78c4dd26/gbtoseq-0001.py
-------------- next part --------------
C:\Documents and Settings\chris\My Documents\scripts\pythonscripts\gbtoseq>gbtoseq.py
Now on AFU75647.gb
Writing to AFU75647.seq
Now on AY499671.gb
Traceback (most recent call last):
  File "C:\Documents and Settings\chris\My Documents\scripts\pythonscripts\gbtoseq\gbtoseq.py", line 30, in ?
    parserecord = gbiterator.next()
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 129, in next
    return self._parser.parse(File.StringHandle(data))
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 219, in parse
    self._scanner.feed(handle, self._consumer)
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 1259, in feed
    self._parser.parseFile(handle)
  File "C:\Python24\Lib\site-packages\Martel\Parser.py", line 328, in parseFile
    self.parseString(fileobj.read())
  File "C:\Python24\Lib\site-packages\Martel\Parser.py", line 356, in parseString
    self._err_handler.fatalError(result)
  File "C:\Python24\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 64
From biopython at maubp.freeserve.co.uk Fri Aug 26 05:01:09 2005
From: biopython at maubp.freeserve.co.uk (Peter)
Date: Fri Aug 26 04:51:56 2005
Subject: [BioPython] Why would this GenBank file choke the GB parser?
In-Reply-To: <128a885f050825150342c609d3@mail.gmail.com>
References: <128a885f050825150342c609d3@mail.gmail.com>
Message-ID: <430EDA55.303@maubp.freeserve.co.uk>
Chris Lasher wrote:
> Hello,
>
> I have a GenBank file, accession AY499671.gb, and 21 like it that I
> would like to process through BioPython (I am using BioPython 1.40b
> with Windows), but I am encountering trouble....
Hi Chris,
Looking at your GenBank files by eye, I didn't spot anything "wrong"
except I note there is a blank final line which has caused trouble in
the past:
http://www.biopython.org/pipermail/biopython/2005-April/002607.html
Could you edit the GenBank file by hand to confirm this is the problem?
I don't know if a fix for this was ever made... it should just be a
small tweak to the GenBank file format definition for the Martel parser.
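Instead of editing each file by hand, the trailing blank line can be removed in a small preprocessing step before the text is handed to the parser. This is only a sketch of that workaround, not a fix to the Martel format definition; strip_trailing_blank_lines is a hypothetical helper name.

```python
def strip_trailing_blank_lines(text):
    """Drop blank lines at the end of the text, keeping one final newline."""
    lines = text.splitlines()
    # Remove empty or whitespace-only lines from the end of the record.
    while lines and not lines[-1].strip():
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""
```

Note that splitlines() also turns '\r\n' line endings into plain '\n' as a side effect.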
-----------------------------------------------------------------------
Alternatively, I'm using a different GenBank parser (in order to cope
with much larger GenBank files) and this works fine with your example
(attached to previous email).
You could try the patch on this bug to see if it solves your problem:
http://bugzilla.open-bio.org/show_bug.cgi?id=1747
If you have trouble with the patch file, I can send you a modified
version of the Bio/GenBank/__init__.py file which you can use to replace
the existing one if that is easier.
Note that this version might not work with the GenBank.Dictionary as I
have never tried that...
Peter
From jtk at cmp.uea.ac.uk Fri Aug 26 05:59:46 2005
From: jtk at cmp.uea.ac.uk (Jan T. Kim)
Date: Fri Aug 26 05:51:22 2005
Subject: [BioPython] GenBank Format & Parsing (was: Why would this GenBank
file choke the GB parser?)
In-Reply-To: <128a885f050825150342c609d3@mail.gmail.com>
References: <128a885f050825150342c609d3@mail.gmail.com>
Message-ID: <20050826095946.GG4175@jtkpc.cmp.uea.ac.uk>
On Thu, Aug 25, 2005 at 06:03:24PM -0400, Chris Lasher wrote:
> Hello,
>
> I have a GenBank file, accession AY499671.gb, and 21 like it that I
> would like to process through BioPython (I am using BioPython 1.40b
> with Windows), but I am encountering trouble. It seems that the
> GenBank parser is choking on something in the files themselves, but I
> could really use help determining what this would be, and in
> determining how to fix it. The error seems to be raised by the Martel
> Parser, but exactly what is causing it to raise the error is beyond my
> lack of knowledge and inexperience.
I ran into similar problems a while ago; the parser is rather picky
about certain things.
In your case, AY499671 gives "ENV" as the division in the LOCUS line
(first line of the file), and it turns out that BioPython doesn't know
about this division. Specifically, this is in Bio/expressions/genbank.py:
valid_divisions = ["PRI", "ROD", "MAM", "VRT", "INV", "PLN", "BCT", "RNA",
"VRL", "PHG", "SYN", "UNA", "EST", "PAT", "STS", "GSS",
"HTG", "HTC", "CON"]
Chances are very good that by adding "ENV" to that list, you'll fix your
problem. I've tried changing ENV to BCT in the GenBank file and that
fixed it.
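To find out quickly which files carry a division the parser does not know, the first line of each file can be checked before parsing. The sketch below is standalone Python and does not touch Bio/expressions/genbank.py. It assumes the division code is the whitespace-separated token just before the date at the end of the LOCUS line. locus_division is a hypothetical helper name, and the sequence length and date in the example line are placeholders, not the real values for AY499671.

```python
# The list quoted above, with "ENV" appended as the proposed fix.
VALID_DIVISIONS = ["PRI", "ROD", "MAM", "VRT", "INV", "PLN", "BCT", "RNA",
                   "VRL", "PHG", "SYN", "UNA", "EST", "PAT", "STS", "GSS",
                   "HTG", "HTC", "CON", "ENV"]


def locus_division(locus_line):
    """Return the division code from a GenBank LOCUS line."""
    tokens = locus_line.split()
    if len(tokens) < 3 or tokens[0] != "LOCUS":
        raise ValueError("not a LOCUS line: %r" % locus_line)
    # Assumption: the last token is the date, the one before it the division.
    return tokens[-2]


# Illustrative line only; length and date are made up.
example = "LOCUS       AY499671     1500 bp    DNA     linear   ENV 01-JAN-2004"
division = locus_division(example)
print(division, division in VALID_DIVISIONS)
```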
While we're at this: Yeast chromosome GenBank files which I downloaded
recently have
ACCESSION NC_001133 REGION: 1..230208
which the GenBank parser doesn't like either. I've patched my
Bio/expressions/genbank.py to accept this, but I haven't been able to
find any documentation of this -- I just checked the GenBank release
notes (ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt) again. Can anyone comment
on this?
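For illustration, an ACCESSION line with the optional REGION suffix can be matched with a regular expression like the one below. This is a standalone sketch, not the patch to Bio/expressions/genbank.py mentioned above. The pattern is inferred from the single example line in this message, since no documentation for the suffix was found, and it handles only one accession number per line. parse_accession is a hypothetical helper name.

```python
import re

# One accession, optionally followed by "REGION: start..end".
ACCESSION_RE = re.compile(
    r"^ACCESSION\s+(?P<accession>\S+)"
    r"(?:\s+REGION:\s*(?P<start>\d+)\.\.(?P<end>\d+))?\s*$"
)


def parse_accession(line):
    """Return (accession, region), where region is (start, end) or None."""
    match = ACCESSION_RE.match(line)
    if match is None:
        raise ValueError("unrecognised ACCESSION line: %r" % line)
    region = None
    if match.group("start") is not None:
        region = (int(match.group("start")), int(match.group("end")))
    return match.group("accession"), region
```

A line listing several accession numbers raises ValueError here, so the pattern would need extending before use on arbitrary GenBank files.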
Personally, I can't help but wonder whether it would not be possible for
the GenBank format to converge to stability after so many years...
Best regards, Jan
--
+- Jan T. Kim -------------------------------------------------------+
| *NEW* email: jtk@cmp.uea.ac.uk |
| *NEW* WWW: http://www.cmp.uea.ac.uk/people/jtk |
*-----=< hierarchical systems are for files, not for humans >=-----*
From chris.lasher at gmail.com Fri Aug 26 10:17:09 2005
From: chris.lasher at gmail.com (Chris Lasher)
Date: Fri Aug 26 10:07:06 2005
Subject: [BioPython] Re: GenBank Format & Parsing (was: Why would this
GenBank file choke the GB parser?)
In-Reply-To: <20050826095946.GG4175@jtkpc.cmp.uea.ac.uk>
References: <128a885f050825150342c609d3@mail.gmail.com>
<20050826095946.GG4175@jtkpc.cmp.uea.ac.uk>
Message-ID: <128a885f050826071722ba6ea5@mail.gmail.com>
That was tremendously helpful! Thank you very much, Dr. Kim! Should
this change be added to the CVS of Bio/expressions/genbank.py, and if
so, is that something I should do, or something one of the active
developers should do?
Thanks again, very much,
Chris Lasher
On 8/26/05, Jan T. Kim wrote:
> I've run into similar problems a while ago, the parser is rather picky
> about certain things.
>
> In your case, AY499671 gives "ENV" as the division in the LOCUS line
> (first line of the file), and it turns out that BioPython doesn't know
> about this division. Specifically, this is in Bio/expressions/genbank.py:
>
> valid_divisions = ["PRI", "ROD", "MAM", "VRT", "INV", "PLN", "BCT", "RNA",
> "VRL", "PHG", "SYN", "UNA", "EST", "PAT", "STS", "GSS",
> "HTG", "HTC", "CON"]
>
> Chances are very good that by adding "ENV" to that list, you'll fix your
> problem. I've tried changing ENV to BCT in the GenBank file and that
> fixed it.
>
> While we're at this: Yeast chromosome GenBank files which I downloaded
> recently have
>
> ACCESSION NC_001133 REGION: 1..230208
>
> which the GenBank parser doesn't like either. I've patched my
> Bio/expressions/genbank.py to accept this, but I haven't been able to
> find any documentation of this -- I just checked the GenBank release
> notes (ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt) again. Can anyone comment
> on this?
>
> Personally, I can't help but wonder whether it would not be possible for
> the GenBank format to converge to stability after so many years...
>
> Best regards, Jan
> --
> +- Jan T. Kim -------------------------------------------------------+
> | *NEW* email: jtk@cmp.uea.ac.uk |
> | *NEW* WWW: http://www.cmp.uea.ac.uk/people/jtk |
> *-----=< hierarchical systems are for files, not for humans >=-----*
>
From kael.fischer at gmail.com Tue Aug 30 18:10:04 2005
From: kael.fischer at gmail.com (Kael Fischer)
Date: Tue Aug 30 17:59:18 2005
Subject: [BioPython] BioPython/BioSQL Status? How to move forward?
Message-ID:
Hi all:
I am involved in a metagenomic project that needs a powerful, fast and
relational database. After some study of the schema I have decided I
would like to try to use BioSQL. Numerous issues with
BioPython/BioSQL have come up.
Of course I have been doing a lot of Googlin' and CVS browsing to see
what people have found before me, with limited success. Although
BioSQL has some documentation problems, from what I can tell there are
a lot of BioPython-specific problems too. Those, at least, are problems
we can start to work on from the BioPython side.
I am wondering if there is interest here in getting into the nuts and bolts of this.
In particular:
1) How much interest, in general, is there in BioSQL and BioPython
playing well together?
2) Where should I send my patches?
Rgds,
Kael
--
Kael Fischer, Ph.D
DeRisi Lab - Univ. Of California San Francisco
415-514-4320