[Biopython] Biopython Digest, Vol 151, Issue 6

Fri Jul 10 01:40:15 UTC 2015

Nice book. Congratulations, Tiago.

At 2015-07-09 21:03:35, WU <ribozyme at ioz.ac.cn> wrote:
> Send Biopython mailing list submissions to
> 	biopython at mailman.open-bio.org
> 
> To subscribe or unsubscribe via the World Wide Web, visit
> 	http://mailman.open-bio.org/mailman/listinfo/biopython
> or, via email, send a message with subject or body 'help' to
> 	biopython-request at mailman.open-bio.org
> 
> You can reach the person managing the list at
> 	biopython-owner at mailman.open-bio.org
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Biopython digest..."
> 
> 
> Today's Topics:
> 
>    1. Re: Biopython Digest, Vol 151, Issue 4 (Björn Johansson)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Thu, 9 Jul 2015 13:45:12 +0100
> From: Björn Johansson <bjorn_johansson at bio.uminho.pt>
> To: biopython at mailman.open-bio.org
> Subject: Re: [Biopython] Biopython Digest, Vol 151, Issue 4
> Message-ID:
> 	<CAG_4V=YtBGVNRFePzCpJ8O3_zUfHo9tYEvNWZPiqo7WKougJiA at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
> 
> Hello,
> Very nice book!
> If you run the welcome link through nbviewer, the links work:
> 
> http://nbviewer.ipython.org/github/tiagoantao/bioinf-python/blob/master/notebooks/Welcome.ipynb
> 
> 
> /bjorn
> 
> On Thu, Jul 9, 2015 at 4:29 AM, <biopython-request at mailman.open-bio.org>
> wrote:
> 
> > Today's Topics:
> >
> >    1. Re: Bioinformatics with Python cookbook (Iddo Friedberg)
> >    2. Re: Bioinformatics with Python cookbook (Wibowo Arindrarto)
> >    3. Entrez EFetch Options (Zach Gayk)
> >    4. Re: Bioinformatics with Python cookbook (Chris Mitchell)
> >    5. Re: Entrez EFetch Options (Joshua Klein)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Wed, 8 Jul 2015 13:02:37 -0500
> > From: Iddo Friedberg <idoerg at gmail.com>
> > To: Tiago Antao <tra at popgen.net>
> > Cc: Biopython Mailing List <biopython at biopython.org>
> > Subject: Re: [Biopython] Bioinformatics with Python cookbook
> > Message-ID:
> >         <CABm4-MQxTm6x_fS-_91R40y=ME0=pJQgnCGo15RqvwKfYn-DHA at mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > Wow, nice going, Tiago. Will buy!
> >
> > On Tue, Jul 7, 2015 at 6:26 AM, Tiago Antao <tra at popgen.net> wrote:
> >
> > > Dear all,
> > >
> > >
> > > I would like to announce "Bioinformatics with Python Cookbook" which I
> > > authored. As you might imagine Biopython is discussed heavily in the
> > > book.
> > >
> > > This book is slightly different from the standard books on
> > > Bioinformatics and Python. It is not about teaching Bioinformatics
> > > algorithms, but about solving practical day-to-day problems with
> > > Python, for example:
> > >
> > > Next-Generation Sequencing: FASTQ, BAM and VCF processing, along with
> > > filtering of datasets.
> > >
> > > Genomics: processing reference genomes of both high-quality references
> > > of model species and low-quality non-model species. Also discussed are
> > > genome annotations and gene ontologies.
> > >
> > > Population Genetics: doing PCA, Admixture/Structure, computing FSTs, ...
> > >
> > > Genome simulation: mostly forward-time simulations, but also a bit of
> > > coalescent
> > >
> > > Phylogenetics: tree reconstruction and tree drawing
> > >
> > > Proteins: PDB processing and visualization.
> > >
> > > Other topics like processing map data, GBIF, interfacing with
> > > Cytoscape, accessing lots of online databases, ...
> > >
> > > There is a bit on interacting with R/Bioconductor via Python.
> > >
> > > Finally, we discuss high performance in Python: faster algorithms,
> > > clusters, Numba and Cython, as well as related technologies like Docker.
> > >
> > > The book discusses the usual Python Libraries in the field: Biopython,
> > > PyVCF, Pysam, simuPOP, DendroPy, Pymol and also scientific libraries
> > > like NumPy, SciPy, matplotlib and scikit-learn.
> > >
> > > The code is fully available for free on GitHub:
> > >
> > > https://github.com/tiagoantao/bioinf-python/blob/master/notebooks/Welcome.ipynb
> > >
> > > I am keen on maintaining the book code, so if you find any issues
> > > please do contact me.
> > >
> > > The book is available in the usual places (Amazon, etc.) in paperback
> > > and e-book format. The web page of the book is
> > >
> > > https://www.packtpub.com/application-development/bioinformatics-python-cookbook
> > >
> > > Regards,
> > > Tiago
> > > _______________________________________________
> > > Biopython mailing list  -  Biopython at mailman.open-bio.org
> > > http://mailman.open-bio.org/mailman/listinfo/biopython
> > >
> >
> >
> >
> > --
> > Iddo Friedberg
> > http://iddo-friedberg.net/contact.html
> > ++++++++++[>+++>++++++>++++++++>++++++++++>+++++++++++<<<<<-]>>>>++++.>
> > ++++++..----.<<<<++++++++++++++++++++++++++++.-----------..>>>+.-----.
> > .>-.<<<<--.>>>++.>+++.<+++.----.-.<++++++++++++++++++.>+.>.<++.<<<+.>>
> > >>----.<--.>++++++.<<<<------------------------------------.
> >
> > ------------------------------
> >
> > Message: 2
> > Date: Thu, 9 Jul 2015 01:35:44 +0200
> > From: Wibowo Arindrarto <w.arindrarto at gmail.com>
> > To: Iddo Friedberg <idoerg at gmail.com>
> > Cc: Biopython Mailing List <biopython at biopython.org>
> > Subject: Re: [Biopython] Bioinformatics with Python cookbook
> > Message-ID:
> >         <CADEGkF512cv2G07iwgAETt3m0Oc-c8nHUivaKgNvyONN+n7tjQ at mail.gmail.com>
> > Content-Type: text/plain; charset=UTF-8
> >
> > Hi Tiago,
> >
> > Congratulations on publishing the book! +1 as well for using notebooks
> > and putting them on GitHub (and using many of the up-and-coming
> > libraries ~ I see e.g. numba and spark there :) ).
> >
> > One small note, apparently the direct links to the other notebooks in
> > the Welcome.ipynb notebook are not working properly in the
> > GitHub-rendered page. Looks like this is something from GitHub, though
> > (i.e. the way they render notebook links in notebooks). Anyway, just
> > in case you haven't noticed :).
> >
> > Cheers,
> >
> >
> > ------------------------------
> >
> > Message: 3
> > Date: Wed, 8 Jul 2015 21:12:46 -0400
> > From: "Zach Gayk" <zgayk at nmu.edu>
> > To: biopython at mailman.open-bio.org
> > Subject: [Biopython] Entrez EFetch Options
> > Message-ID:
> >         <1a6578a08666733c24564622fe2f8dc2.squirrel at webmail.nmu.edu>
> > Content-Type: text/plain;charset=iso-8859-1
> >
> > Hello,
> >
> > I would like to use the following code from the biopython tutorial to
> > retrieve gi numbers for a number of sequences that matched to scaffolds on
> > a genome assembly:
> >
> > import os
> > os.chdir('/Users/zachgayk/Desktop/GAVIABioinformatics/')
> > from Bio import Entrez # this is the most likely script modified
> > from Bio import SeqIO
> > Entrez.email = "zgayk at nmu.edu"
> > handle = Entrez.efetch(db="nucleotide", rettype="gb", retmode="text",
> >                        id="gi|50254217|gb|`, gi|50254217|gb|AY567890.1|, "
> >                           "gi|559028|gb|L33375.1|GVSMTDGI, "
> >                           "gi|559028|gb|L33375.1|GVSMTDGI")
> > for seq_record in SeqIO.parse(handle, "gb"):
> >     # [:100] keeps the first 100 characters; "..." marks the truncation
> >     print(seq_record.description[:100] + "...")
> > handle.close()
> >
> > The problem, however, is that there are a large number of gi numbers I
> > wish to retrieve, and so there are simply too many to manually enter into
> > the id ="" field. What I would like to do is specify a file containing all
> > of the needed gi numbers in a list and then have the code parse all of
> > them. I haven't been able to figure out how to do this yet, and if anyone
> > has any ideas they would be very much appreciated.
> >
> > Thank you,
> > Zach Gayk
> >
> >
> >
> >
> >
> >
> > ------------------------------
> >
> > Message: 4
> > Date: Wed, 8 Jul 2015 21:25:27 -0400
> > From: Chris Mitchell <chris.mit7 at gmail.com>
> > To: Wibowo Arindrarto <w.arindrarto at gmail.com>
> > Cc: Iddo Friedberg <idoerg at gmail.com>, biopython at biopython.org
> > Subject: Re: [Biopython] Bioinformatics with Python cookbook
> > Message-ID:
> >         <CAK_U6OCMSuxexTfj3S2zvoY-UkTMERX8sp9vaPpMvwvVU_w0pw at mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > I noticed that as well, Bow, but it's only for pages that are opened in a
> > new tab.
> >
> > Really good stuff all in all.
> >
> > Sent from my phone
> >
> > ------------------------------
> >
> > Message: 5
> > Date: Wed, 8 Jul 2015 23:29:01 -0400
> > From: Joshua Klein <mobiusklein at gmail.com>
> > To: Zach Gayk <zgayk at nmu.edu>
> > Cc: biopython at mailman.open-bio.org
> > Subject: Re: [Biopython] Entrez EFetch Options
> > Message-ID:
> >         <CAFZWoGbwDtfzFe02p3qtAyVEjBdSo-oLTzOaOqRkeTuCqKJqkA at mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > If you store your list of identifiers in a file given by the variable
> > 'file_path', with one identifier per line, you can use:
> >
> > ids_to_fetch = ",".join(open(file_path))
> >
> > This code will open the file, and use the default iteration behavior for
> > file objects to yield a line at a time to the join method of the string
> > ",". This will create a long string of comma-separated identifiers to use
> > in your efetch call.
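
A slightly more defensive variant of the one-liner above might look like the sketch below. Iterating over a file yields each line with its trailing newline, so this version strips whitespace, skips blank lines, and splits the identifiers into batches. The helper names `read_ids`, `batched` and `fetch_descriptions`, the batch size of 200, and the temporary file standing in for a real identifier list are assumptions for illustration, not part of the original message. The `Entrez.efetch` call reuses the arguments from the question and is only defined here, not executed, since it needs Biopython and network access.

```python
import os
import tempfile


def read_ids(file_path):
    """Return the identifiers in file_path, one per line, skipping blanks."""
    with open(file_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def batched(ids, size=200):
    """Yield successive lists of at most `size` identifiers (size is a guess)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_descriptions(file_path, email, size=200):
    """Hypothetical helper: fetch GenBank records one batch at a time."""
    from Bio import Entrez, SeqIO  # imported lazily; requires Biopython

    Entrez.email = email
    for batch in batched(read_ids(file_path), size):
        handle = Entrez.efetch(db="nucleotide", rettype="gb",
                               retmode="text", id=",".join(batch))
        for seq_record in SeqIO.parse(handle, "gb"):
            yield seq_record.description
        handle.close()


# Offline demonstration: a temporary file stands in for the real ID list.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("50254217\n\n559028\n")
demo_ids = read_ids(tmp.name)
os.remove(tmp.name)
print(",".join(demo_ids))  # prints 50254217,559028
```

Batching keeps each request to a moderate size, which may matter when the list holds many identifiers; the right batch size depends on the service and is left as a parameter.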
> >
> >
> > ------------------------------
> >
> > _______________________________________________
> > Biopython mailing list  -  Biopython at mailman.open-bio.org
> > http://mailman.open-bio.org/mailman/listinfo/biopython
> >
> > End of Biopython Digest, Vol 151, Issue 4
> > *****************************************
> >
> 
> 
> 
> -- 
> ______O_________oO________oO______o_______oO__
> Björn Johansson
> Assistant Professor
> Department of Biology
> University of Minho
> Campus de Gualtar
> 4710-057 Braga
> PORTUGAL
> www.bio.uminho.pt
> Google profile <https://profiles.google.com/bjornjobb>
> Google Scholar Profile
> <http://scholar.google.com/citations?user=7AiEuJ4AAAAJ>
> my group <https://sites.google.com/site/metabolicengineeringgroup/>
> Office (direct) +351-253 601517 | (PT) mob.  +351-967 147 704 | (SWE) mob.
>  +46 739 792 968
> Dept of Biology (secr) +351-253 60 4310  | fax +351-253 678980
> 
> ------------------------------
> 
> _______________________________________________
> Biopython mailing list  -  Biopython at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/biopython
> 
> End of Biopython Digest, Vol 151, Issue 6
> *****************************************