[Biopython] Biopython Digest, Vol 85, Issue 13
João Rodrigues
anaryin at gmail.com
Tue Jan 12 19:01:34 UTC 2010
Hello Peter,
Well, updating the wiki is cumbersome, especially if done manually. Why not
update the wiki automatically with that link you just gave?
Regards,
João [...] Rodrigues
@ http://stanford.edu/~joaor/
On Tue, Jan 12, 2010 at 9:00 AM, <biopython-request at lists.open-bio.org> wrote:
> Send Biopython mailing list submissions to
> biopython at lists.open-bio.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://lists.open-bio.org/mailman/listinfo/biopython
> or, via email, send a message with subject or body 'help' to
> biopython-request at lists.open-bio.org
>
> You can reach the person managing the list at
> biopython-owner at lists.open-bio.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Biopython digest..."
>
> Today's Topics:
>
> 1. Re: is there an updated tutorial on how to use the Wrappers
> for the new NCBI BLAST+ tools? (Peter)
> 2. Re: Could Bio.SeqIO write EMBL file? (Anne Pajon)
> 3. Re: is there an updated tutorial on how to use the Wrappers
> for the new NCBI BLAST+ tools? (Kevin Lam)
> 4. Re: Could Bio.SeqIO write EMBL file? (Peter)
> 5. Re: Could Bio.SeqIO write EMBL file? (Peter)
> 6. Publication list (Peter Cock)
>
>
> ---------- Forwarded message ----------
> From: Peter <biopython at maubp.freeserve.co.uk>
> To: Kevin <aboulia at gmail.com>
> Date: Mon, 11 Jan 2010 17:08:55 +0000
> Subject: Re: [Biopython] is there an updated tutorial on how to use the
> Wrappers for the new NCBI BLAST+ tools?
> On Mon, Jan 11, 2010 at 4:46 PM, Kevin <aboulia at gmail.com> wrote:
> > Hi Peter,
> > I was thinking of porting the legacy blast script to python as you are
> > right about the helper script being inflexible.
>
> A python version of legacy_blast.pl isn't any more useful than the
> Perl version is it? Maybe I have misunderstood you.
>
> What would be nice is a way to help people update their old
> Biopython scripts which called legacy BLAST, so that they can
> be used on BLAST+ instead. I would expect in most cases this
> means scripts using the legacy BLAST "helper" functions in
> Bio.Blast.NCBIStandalone. One way to do this would be to
> add new BLAST+ versions of the "helper" functions (taking
> the same argument names as before), but that is just a stop
> gap (a temporary measure). We really want people using these
> old helper functions to switch to using the wrappers in
> Bio.Blast.Applications and subprocess instead.
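
A rough sketch of the "stop gap" idea described above: a helper that accepts
legacy-style argument names and builds a BLAST+ command for subprocess. The
function names and the argument names (program, database, infile, expectation,
align_view) are assumptions made for illustration, not the Biopython API, and
this uses plain subprocess rather than the Bio.Blast.Applications wrappers.
The mapping covers only a few legacy "-m" values.

```python
import subprocess

# Legacy blastall "-m" alignment views and their BLAST+ "-outfmt" equivalents
# (0 = pairwise, 7 -> 5 = XML, 8 -> 6 = tabular, 9 -> 7 = tabular with comments).
ALIGN_VIEW_TO_OUTFMT = {0: 0, 7: 5, 8: 6, 9: 7}

PROGRAMS = ("blastn", "blastp", "blastx", "tblastn", "tblastx")


def blastall_style_command(program, database, infile,
                           expectation=10.0, align_view=7):
    """Build a BLAST+ argument list from legacy-style names (hypothetical helper)."""
    if program not in PROGRAMS:
        raise ValueError("unsupported program: %r" % program)
    # In BLAST+ each program is its own executable, replacing "blastall -p".
    return [
        program,
        "-db", database,
        "-query", infile,
        "-evalue", str(expectation),
        "-outfmt", str(ALIGN_VIEW_TO_OUTFMT[align_view]),
    ]


def run_blast(cmd):
    """Run the command and return its standard output (needs BLAST+ on the PATH)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Building the argument list separately from running it means the mapping can be
checked without BLAST+ installed; only run_blast() needs the binaries.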
>
> > The documentation bit was actually about my first email about any
> > updated doc on how to use blast+ with biopython
>
> I see. What do you think the current (Biopython 1.53) version
> of the tutorial needs in the BLAST chapter?
>
> http://biopython.org/DIST/docs/tutorial/Tutorial.html
> http://biopython.org/DIST/docs/tutorial/Tutorial.pdf
>
> Thanks,
>
> Peter
>
>
>
> ---------- Forwarded message ----------
> From: Anne Pajon <ap12 at sanger.ac.uk>
> To: Peter <biopython at maubp.freeserve.co.uk>
> Date: Mon, 11 Jan 2010 17:32:43 +0000
> Subject: Re: [Biopython] Could Bio.SeqIO write EMBL file?
> Hi Peter,
>
> Just tested now.
>
> It worked fine. Thanks a lot.
>
> Here is the diff between the EMBL output from Bio.SeqIO and the genbank
> output from Bio.SeqIO converted with the EMBOSS tool to an EMBL file:
>
> guest137:RAST ap12$ diff tmp.embl
> updated_files/Alistipes_shahii_WAL8301_uRAST.embl
> 1c1
> < ID unknown; SV 1; ; DNA; ; ; 3763317 BP.
> ---
> > ID unknown; SV 1; linear; unassigned DNA; STD; UNC; 3763317 BP.
> 5c5
> < DE
> ---
> > KW .
> 8c8
> < OC .
> ---
> > XX
> 10a11
> > FH
> 1949,1950c1950
> < FT /product="Peptidyl-prolyl cis-trans isomerase (EC
> < FT 5.2.1.8)"
> ---
> > FT /product="Peptidyl-prolyl cis-trans isomerase (EC
> 5.2.1.8)"
> 3346,3347c3346
> < FT kinase/response regulator, hybrid ('one component
> < FT system')"
> ---
> > FT kinase/response regulator, hybrid ('one component
> system')"
> 3380,3381c3379
> < FT /product="Iron-sulfur cluster assembly ATPase
> protein
> < FT SufC"
> ---
> > FT /product="Iron-sulfur cluster assembly ATPase
> protein SufC"
> 4811,4812c4809
> < FT /product="Gamma-glutamyl phosphate reductase (EC
> < FT 1.2.1.41)"
> ---
> > FT /product="Gamma-glutamyl phosphate reductase (EC
> 1.2.1.41)"
> 5472,5473c5469
> < FT /product="lipoprotein releasing system ATP-binding
> < FT protein"
> ---
> > FT /product="lipoprotein releasing system ATP-binding
> protein"
> 5881,5882c5877
> < FT /product="NAD-dependent protein deacetylase of SIR2
> < FT family"
> ---
> > FT /product="NAD-dependent protein deacetylase of SIR2
> family"
> 6032,6033c6027
> < FT /product="Exodeoxyribonuclease V alpha chain (EC
> < FT 3.1.11.5)"
> ---
> > FT /product="Exodeoxyribonuclease V alpha chain (EC
> 3.1.11.5)"
> 6495,6496c6489
> < FT /product="Pyrophosphate-energized proton pump (EC
> < FT 3.6.1.1)"
> ---
> > FT /product="Pyrophosphate-energized proton pump (EC
> 3.6.1.1)"
> 6946,6947c6939
> < FT /product="Exodeoxyribonuclease V alpha chain (EC
> < FT 3.1.11.5)"
> ---
> > FT /product="Exodeoxyribonuclease V alpha chain (EC
> 3.1.11.5)"
> 7128,7129c7120
> < FT /product="N-acyl-L-amino acid amidohydrolase (EC
> < FT 3.5.1.14)"
> ---
> > FT /product="N-acyl-L-amino acid amidohydrolase (EC
> 3.5.1.14)"
> 8035,8036c8026
> < FT /product="D-3-phosphoglycerate dehydrogenase (EC
> < FT 1.1.1.95)"
> ---
> > FT /product="D-3-phosphoglycerate dehydrogenase (EC
> 1.1.1.95)"
> 8601,8602c8591
> < FT /product="Acetolactate synthase small subunit (EC
> < FT 2.2.1.6)"
> ---
> > FT /product="Acetolactate synthase small subunit (EC
> 2.2.1.6)"
> 8608,8609c8597
> < FT /product="Acetolactate synthase large subunit (EC
> < FT 2.2.1.6)"
> ---
> > FT /product="Acetolactate synthase large subunit (EC
> 2.2.1.6)"
> 9152,9153c9140
> < FT /product="Exodeoxyribonuclease V alpha chain (EC
> < FT 3.1.11.5)"
> ---
> > FT /product="Exodeoxyribonuclease V alpha chain (EC
> 3.1.11.5)"
> 10659,10660c10646
> < FT kinase/response regulator, hybrid ('one-component
> < FT system')"
> ---
> > FT kinase/response regulator, hybrid ('one-component
> system')"
> 12056,12057c12042
> < FT /product="N-acetylmuramoyl-L-alanine amidase (EC
> < FT 3.5.1.28)"
> ---
> > FT /product="N-acetylmuramoyl-L-alanine amidase (EC
> 3.5.1.28)"
> 12957,12958c12942
> < FT /product="Phosphatidate cytidylyltransferase (EC
> < FT 2.7.7.41)"
> ---
> > FT /product="Phosphatidate cytidylyltransferase (EC
> 2.7.7.41)"
> 13550,13551c13534
> < FT /product="Glutamine synthetase type III, GlnN (EC
> < FT 6.3.1.2)"
> ---
> > FT /product="Glutamine synthetase type III, GlnN (EC
> 6.3.1.2)"
> 14344c14327,14328
> < SQ
> ---
> > XX
> > SQ Sequence 3763317 BP; 772804 A; 1042979 C; 1057681 G; 776208 T;
> 113645 other;
>
> The main differences are on line breaks.
>
> Regards,
> Anne.
>
>
> On 11 Jan 2010, at 16:22, Peter wrote:
>
> Hi Anne,
>>
>> I've just checked in feature support to the new EMBL output in Bio.SeqIO
>> (our main branch on git). If you could give that a test it would be very
>> much appreciated. If you are on the dev mailing list, we can discuss
>> issues there - otherwise we might as well continue on this thread.
>>
>> Thanks,
>>
>> Peter
>>
>
> --
> Dr Anne Pajon - Pathogen Genomics, Team 81
> Sanger Institute, Wellcome Trust Genome Campus, Hinxton
> Cambridge CB10 1SA, United Kingdom
> +44 (0)1223 494 798 (office) | +44 (0)7958 511 353 (mobile)
>
>
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited,
> a charity registered in England with number 1021457 and a company registered
> in England with number 2742969, whose registered office is 215 Euston Road,
> London, NW1 2BE.
>
>
> ---------- Forwarded message ----------
> From: Kevin Lam <aboulia at gmail.com>
> To: Peter <biopython at maubp.freeserve.co.uk>
> Date: Tue, 12 Jan 2010 13:04:07 +0800
> Subject: Re: [Biopython] is there an updated tutorial on how to use the
> Wrappers for the new NCBI BLAST+ tools?
> On Tue, Jan 12, 2010 at 1:08 AM, Peter <biopython at maubp.freeserve.co.uk>
> wrote:
>
> > On Mon, Jan 11, 2010 at 4:46 PM, Kevin <aboulia at gmail.com> wrote:
> > > Hi Peter,
> > > I was thinking of porting the legacy blast script to python as you are
> > > right about the helper script being inflexible.
> >
> > A python version of legacy_blast.pl isn't any more useful than the
> > Perl version is it? Maybe I have misunderstood you.
> >
> > What would be nice is a way to help people update their old
> > Biopython scripts which called legacy BLAST, so that they can
> > be used on BLAST+ instead. I would expect in most cases this
> > means scripts using the legacy BLAST "helper" functions in
> > Bio.Blast.NCBIStandalone. One way to do this would be to
> > add new BLAST+ versions of the "helper" functions (taking
> > the same argument names as before), but that is just a stop
> > gap (a temporary measure). We really want people using these
> > old helper functions to switch to using the wrappers in
> > Bio.Blast.Applications and subprocess instead.
> >
>
> Yes, this is what I meant by porting/integrating: integrating the legacy
> BLAST Perl script into Bio.Blast.NCBIStandalone.
>
> I didn't realise that Bio.Blast.Applications existed.
>
> > > The documentation bit was actually about my first email about any
> > > updated doc on how to use blast+ with biopython
> >
> > I see. What do you think the current (Biopython 1.53) version
> > of the tutorial needs in the BLAST chapter?
> >
> > http://biopython.org/DIST/docs/tutorial/Tutorial.html
> > http://biopython.org/DIST/docs/tutorial/Tutorial.pdf
> >
> http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc80
> was exactly what I was looking for! Maybe I was looking at the wrong page.
> Thanks for pointing it out!
>
>
>
> > Thanks,
> >
> > Peter
> >
>
> Cheers
> Kevin
>
>
>
> ---------- Forwarded message ----------
> From: Peter <biopython at maubp.freeserve.co.uk>
> To: Anne Pajon <ap12 at sanger.ac.uk>
> Date: Tue, 12 Jan 2010 10:27:47 +0000
> Subject: Re: [Biopython] Could Bio.SeqIO write EMBL file?
> On Mon, Jan 11, 2010 at 5:32 PM, Anne Pajon <ap12 at sanger.ac.uk> wrote:
> > Hi Peter,
> >
> > Just tested now.
> >
> > It worked fine. Thanks a lot.
>
> Great.
>
> > Here is the diff between the EMBL output from Bio.SeqIO and the genbank
> > output from Bio.SeqIO converted with the EMBOSS tool to an EMBL file:
> >
> > ...
> >
> > The main differences are on line breaks.
>
> I hadn't yet done a comparison against EMBOSS (what version do you
> have), but yes, it looks like I am wrapping the feature tables using a
> shorter line length - we should check that, and it would be easy to
> adjust in Bio/SeqIO/InsdcIO.py
>
> Regarding the SQ line, that was on my "TODO" list. Including the
> sequence length and base counts shouldn't be hard at all. If you want
> to work on that it should just be a few lines in Bio/SeqIO/InsdcIO.py
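
For illustration, a minimal sketch of how such an SQ line could be built,
modelled on the EMBOSS line in Anne's diff. The function name is made up, the
three spaces after "SQ" are an assumption about the column layout, and this is
not the code in Bio/SeqIO/InsdcIO.py.

```python
def embl_sq_line(sequence):
    """Return an EMBL-style SQ header line with sequence length and base counts."""
    seq = sequence.upper()
    # Count the four unambiguous bases; everything else is reported as "other".
    counts = {base: seq.count(base) for base in "ACGT"}
    other = len(seq) - sum(counts.values())
    return "SQ   Sequence %i BP; %i A; %i C; %i G; %i T; %i other;" % (
        len(seq), counts["A"], counts["C"], counts["G"], counts["T"], other)
```

As a sanity check on the format, the counts in the EMBOSS output quoted earlier
(772804 + 1042979 + 1057681 + 776208 + 113645) add up to the 3763317 BP total.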
>
> Right now however, further testing of features would be my first
> priority. See also:
> http://lists.open-bio.org/pipermail/open-bio-l/2010-January/000604.html
>
> There are other things still to do (e.g. missing fields on the ID line,
> dates, and references).
>
> Peter
>
>
>
> ---------- Forwarded message ----------
> From: Peter <biopython at maubp.freeserve.co.uk>
> To: Anne Pajon <ap12 at sanger.ac.uk>
> Date: Tue, 12 Jan 2010 12:33:35 +0000
> Subject: Re: [Biopython] Could Bio.SeqIO write EMBL file?
> On Tue, Jan 12, 2010 at 10:27 AM, Peter <biopython at maubp.freeserve.co.uk>
> wrote:
> > On Mon, Jan 11, 2010 at 5:32 PM, Anne Pajon <ap12 at sanger.ac.uk> wrote:
> >> Here is the diff between the EMBL output from Bio.SeqIO and the genbank
> >> output from Bio.SeqIO converted with the EMBOSS tool to an EMBL file:
> >>
> >> ...
> >>
> >> The main differences are on line breaks.
> >
> > I hadn't yet done a comparison against EMBOSS (what version do you
> > have), but yes, it looks like I am wrapping the feature tables using a
> > shorter line length - we should check that, and it would be easy to
> > adjust in Bio/SeqIO/InsdcIO.py
>
> The spec is pretty clear that the feature lines should be up to 80
> characters. The premature wrapping was because I had been
> testing length < 80 instead of <= 80, which is now fixed in git.
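
A small sketch of the off-by-one described above, assuming a simple greedy
word wrap and a 21-character "FT" prefix so that text starts at column 22.
The function and its "inclusive" switch are hypothetical, written only to show
the difference between the two comparisons; the real writer may split text
differently.

```python
FT_PREFIX = "FT" + " " * 19  # assumed layout: qualifier text starts at column 22


def wrap_words(text, prefix=FT_PREFIX, max_len=80, inclusive=True):
    """Greedily wrap words onto prefixed lines of at most max_len characters.

    inclusive=False reproduces the "length < max_len" test, which wraps one
    character too early.
    """
    lines = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        length = len(prefix) + len(candidate)
        fits = length <= max_len if inclusive else length < max_len
        if fits or not current:
            # Keep growing the line (an over-long single word is kept whole).
            current = candidate
        else:
            lines.append(prefix + current)
            current = word
    if current:
        lines.append(prefix + current)
    return lines
```

With text that fills a line to exactly 80 characters, the inclusive test keeps
it on one line, while the strict test pushes the last word onto a second line.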
>
> Peter
>
>
>
> ---------- Forwarded message ----------
> From: Peter Cock <p.j.a.cock at googlemail.com>
> To: Biopython Mailing List <biopython at lists.open-bio.org>
> Date: Tue, 12 Jan 2010 14:27:30 +0000
> Subject: [Biopython] Publication list
> Dear all,
>
> We have a fairly extensive manually compiled list of over 150 publications
> citing, referencing or using Biopython on the wiki, covering the first 10
> years of Biopython:
> http://biopython.org/wiki/Publications
>
> *If your own Biopython related publications are missing from this list,
> please add them. If they are listed in PubMed this is pretty easy.*
>
> Keeping this up to date has been a tedious task, although now that we have
> an up to date reference, which hopefully will get cited, this is a little
> easier:
> http://news.open-bio.org/news/2009/03/biopython-paper-published/
>
> There is an example in the Biopython Tutorial of using Bio.Entrez and
> PubMed Central (PMC) to find papers citing a reference, or you can just use
> this URL:
>
> http://www.ncbi.nlm.nih.gov/sites/entrez?db=pubmed&cmd=link&linkname=pubmed_pubmed_citedin&uid=19304878
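
If the wiki were to link to such searches automatically, the URL above could be
generated per PubMed ID. A minimal sketch using only the standard library; the
function name is made up, and the query parameters are copied from the URL
above rather than taken from any NCBI documentation.

```python
from urllib.parse import urlencode

ENTREZ_BASE = "http://www.ncbi.nlm.nih.gov/sites/entrez"


def cited_in_url(pmid):
    """Build the PubMed "cited in" link for one PubMed ID (hypothetical helper)."""
    # Parameter names and order mirror the URL quoted in the message above.
    params = [
        ("db", "pubmed"),
        ("cmd", "link"),
        ("linkname", "pubmed_pubmed_citedin"),
        ("uid", str(pmid)),
    ]
    return ENTREZ_BASE + "?" + urlencode(params)
```

Calling cited_in_url(19304878) rebuilds the link given above for the Biopython
paper.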
>
> Likewise, using Google Scholar also finds plenty of citations (although I
> don't know if this URL will work long term):
>
> http://scholar.google.com/scholar?cites=1800471218280477755&hl=en&as_sdt=2000
>
> Perhaps just a few links like these will suffice for tracking future
> publications? Or do people think we should continue to update the wiki in
> the same style?
>
> Regards,
>
> Peter
>
>