[Biopython] Biopython Digest, Vol 151, Issue 4

Björn Johansson bjorn_johansson at bio.uminho.pt
Thu Jul 9 12:45:12 UTC 2015


Hello,
Very nice book!
If you run the welcome link through nbviewer, the links work:

http://nbviewer.ipython.org/github/tiagoantao/bioinf-python/blob/master/notebooks/Welcome.ipynb
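
Comparing this address with the GitHub link in the announcement quoted below, the two differ only in their prefix. A minimal sketch of that rewrite follows; the function name and constants are made up for illustration, and it assumes the nbviewer address keeps the same /github/ layout as the link above.

```python
# Prefixes taken from the two links in this thread.
GITHUB_PREFIX = "https://github.com/"
NBVIEWER_PREFIX = "http://nbviewer.ipython.org/github/"


def to_nbviewer(github_url):
    """Rewrite a GitHub notebook URL into the matching nbviewer URL."""
    if not github_url.startswith(GITHUB_PREFIX):
        raise ValueError("not a github.com URL: %r" % github_url)
    # Keep everything after the GitHub prefix (user/repo/blob/branch/path).
    return NBVIEWER_PREFIX + github_url[len(GITHUB_PREFIX):]
```

Calling to_nbviewer() on the Welcome.ipynb GitHub link from the announcement gives the address above.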


/bjorn

On Thu, Jul 9, 2015 at 4:29 AM, <biopython-request at mailman.open-bio.org>
wrote:

> Send Biopython mailing list submissions to
>         biopython at mailman.open-bio.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://mailman.open-bio.org/mailman/listinfo/biopython
> or, via email, send a message with subject or body 'help' to
>         biopython-request at mailman.open-bio.org
>
> You can reach the person managing the list at
>         biopython-owner at mailman.open-bio.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Biopython digest..."
>
>
> Today's Topics:
>
>    1. Re: Bioinformatics with Python cookbook (Iddo Friedberg)
>    2. Re: Bioinformatics with Python cookbook (Wibowo Arindrarto)
>    3. Entrez EFetch Options (Zach Gayk)
>    4. Re: Bioinformatics with Python cookbook (Chris Mitchell)
>    5. Re: Entrez EFetch Options (Joshua Klein)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 8 Jul 2015 13:02:37 -0500
> From: Iddo Friedberg <idoerg at gmail.com>
> To: Tiago Antao <tra at popgen.net>
> Cc: Biopython Mailing List <biopython at biopython.org>
> Subject: Re: [Biopython] Bioinformatics with Python cookbook
> Message-ID:
>         <CABm4-MQxTm6x_fS-_91R40y=ME0=
> pJQgnCGo15RqvwKfYn-DHA at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Wow, nice going, Tiago. Will buy!
>
> On Tue, Jul 7, 2015 at 6:26 AM, Tiago Antao <tra at popgen.net> wrote:
>
> > Dear all,
> >
> >
> > I would like to announce "Bioinformatics with Python Cookbook" which I
> > authored. As you might imagine Biopython is discussed heavily in the
> > book.
> >
> > This book is slightly different from the standard books on
> > Bioinformatics and Python. It is not about teaching Bioinformatics
> > algorithms, but about solving practical day-to-day problems with
> > Python, for example:
> >
> > Next-Generation Sequencing: FASTQ, BAM and VCF processing. Along with
> > filtering of datasets.
> >
> > Genomics: processing reference genomes of both high-quality references
> > of model species and low-quality non-model species. Also discussed are
> > genome annotations and gene ontologies.
> >
> > Population Genetics: doing PCA, Admixture/Structure, computing FSTs, ...
> >
> > Genome simulation: mostly forward-time simulations, but also a bit of
> > coalescent
> >
> > Phylogenetics: tree reconstruction and tree drawing
> >
> > Proteins: PDB processing and visualization.
> >
> > Other topics like processing map data, GBIF, interfacing with
> > Cytoscape, accessing lots of online databases, ...
> >
> > There is a bit on interacting with R/Bioconductor via Python.
> >
> > Finally, we discuss high performance in Python: faster algorithms,
> > clusters, Numba and Cython, as well as related technologies like Docker.
> >
> > The book discusses the usual Python Libraries in the field: Biopython,
> > PyVCF, Pysam, simuPOP, DendroPy, Pymol and also scientific libraries
> > like NumPy, SciPy, matplotlib and scikit-learn.
> >
> > The code is fully available for free on GitHub:
> >
> >
> https://github.com/tiagoantao/bioinf-python/blob/master/notebooks/Welcome.ipynb
> >
> > I am keen on maintaining the book code, so if you find any issues
> > please do contact me.
> >
> > The book is available in the usual places (Amazon, etc.) in paperback
> > and e-book format. The web page of the book is
> >
> >
> https://www.packtpub.com/application-development/bioinformatics-python-cookbook
> >
> > Regards,
> > Tiago
> > _______________________________________________
> > Biopython mailing list  -  Biopython at mailman.open-bio.org
> > http://mailman.open-bio.org/mailman/listinfo/biopython
> >
>
>
>
> --
> Iddo Friedberg
> http://iddo-friedberg.net/contact.html
> ++++++++++[>+++>++++++>++++++++>++++++++++>+++++++++++<<<<<-]>>>>++++.>
> ++++++..----.<<<<++++++++++++++++++++++++++++.-----------..>>>+.-----.
> .>-.<<<<--.>>>++.>+++.<+++.----.-.<++++++++++++++++++.>+.>.<++.<<<+.>>
> >>----.<--.>++++++.<<<<------------------------------------.
>
> ------------------------------
>
> Message: 2
> Date: Thu, 9 Jul 2015 01:35:44 +0200
> From: Wibowo Arindrarto <w.arindrarto at gmail.com>
> To: Iddo Friedberg <idoerg at gmail.com>
> Cc: Biopython Mailing List <biopython at biopython.org>
> Subject: Re: [Biopython] Bioinformatics with Python cookbook
> Message-ID:
>         <
> CADEGkF512cv2G07iwgAETt3m0Oc-c8nHUivaKgNvyONN+n7tjQ at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Hi Tiago,
>
> Congratulations on publishing the book! +1 as well for using notebooks
> and putting them on GitHub (and using many of the up-and-coming
> libraries ~ I see e.g. numba and spark there :) ).
>
> One small note: apparently the direct links to the other notebooks in
> the Welcome.ipynb notebook are not working properly in the
> GitHub-rendered page. Looks like this is something from GitHub, though
> (i.e. the way they render notebook links in notebooks). Anyway, just
> in case you haven't noticed :).
>
> Cheers,
>
>
>
> ------------------------------
>
> Message: 3
> Date: Wed, 8 Jul 2015 21:12:46 -0400
> From: "Zach Gayk" <zgayk at nmu.edu>
> To: biopython at mailman.open-bio.org
> Subject: [Biopython] Entrez EFetch Options
> Message-ID:
>         <1a6578a08666733c24564622fe2f8dc2.squirrel at webmail.nmu.edu>
> Content-Type: text/plain;charset=iso-8859-1
>
> Hello,
>
> I would like to use the following code from the biopython tutorial to
> retrieve gi numbers for a number of sequences that matched to scaffolds on
> a genome assembly:
>
> import os
> os.chdir('/Users/zachgayk/Desktop/GAVIABioinformatics/')
> from Bio import Entrez # this is the most likely script modified
> from Bio import SeqIO
> Entrez.email = "zgayk at nmu.edu"
> handle = Entrez.efetch(db="nucleotide", rettype="gb", retmode="text",
>                        id="gi|50254217|gb|`, gi|50254217|gb|AY567890.1|, "
>                           "gi|559028|gb|L33375.1|GVSMTDGI, "
>                           "gi|559028|gb|L33375.1|GVSMTDGI")
> for seq_record in SeqIO.parse(handle, "gb"):
>     # [:100] limits the output to 100 characters; "..." marks the truncation
>     print(seq_record.description[:100] + "...")
> handle.close()
>
> The problem, however, is that there are a large number of gi numbers I
> wish to retrieve, and so there are simply too many to manually enter into
> the id ="" field. What I would like to do is specify a file containing all
> of the needed gi numbers in a list and then have the code parse all of
> them. I haven't been able to figure out how to do this yet, and if anyone
> has any ideas they would be very much appreciated.
>
> Thank you,
> Zach Gayk
>
>
>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Wed, 8 Jul 2015 21:25:27 -0400
> From: Chris Mitchell <chris.mit7 at gmail.com>
> To: Wibowo Arindrarto <w.arindrarto at gmail.com>
> Cc: Iddo Friedberg <idoerg at gmail.com>, biopython at biopython.org
> Subject: Re: [Biopython] Bioinformatics with Python cookbook
> Message-ID:
>         <
> CAK_U6OCMSuxexTfj3S2zvoY-UkTMERX8sp9vaPpMvwvVU_w0pw at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> I noticed that as well, Bow, but it's only for pages that are opened in a
> new tab.
>
> Really good stuff all in all.
>
> Sent from my phone
>
> ------------------------------
>
> Message: 5
> Date: Wed, 8 Jul 2015 23:29:01 -0400
> From: Joshua Klein <mobiusklein at gmail.com>
> To: Zach Gayk <zgayk at nmu.edu>
> Cc: biopython at mailman.open-bio.org
> Subject: Re: [Biopython] Entrez EFetch Options
> Message-ID:
>         <
> CAFZWoGbwDtfzFe02p3qtAyVEjBdSo-oLTzOaOqRkeTuCqKJqkA at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> If you store your list of identifiers in a file whose path is held in the
> variable 'file_path', with one identifier per line, you can use:
>
> ids_to_fetch = ",".join(line.strip() for line in open(file_path) if line.strip())
>
> This code opens the file and uses the default iteration behavior of file
> objects to yield one line at a time. Each line is stripped of its trailing
> newline (blank lines are skipped) and passed to the join method of the
> string ",". The result is one long string of comma-separated identifiers to
> use in your efetch call.
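
As a fuller illustration of the suggestion above, here is a minimal sketch that reads identifiers from a file and fetches them in batches. The helper names and the batch size of 200 are assumptions made for this example and do not come from the thread. Lines read from a file keep their trailing newline, so the sketch strips them before joining. The fetch step needs Biopython and network access.

```python
def read_ids(lines):
    """Return the non-empty, whitespace-stripped identifiers from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_descriptions(ids, email, batch_size=200):
    """Fetch GenBank records batch by batch and yield each record's description.

    batch_size=200 is an arbitrary choice for this sketch, not a documented limit.
    """
    # Imported here so the two helpers above can be used without Biopython.
    from Bio import Entrez, SeqIO

    Entrez.email = email
    for batch in batches(ids, batch_size):
        handle = Entrez.efetch(db="nucleotide", rettype="gb", retmode="text",
                               id=",".join(batch))
        try:
            for record in SeqIO.parse(handle, "gb"):
                yield record.description
        finally:
            handle.close()
```

With a hypothetical file gi_numbers.txt holding one identifier per line, one would open it, pass the file object to read_ids(), and loop over fetch_descriptions(ids, email) to print each description.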
>
>
> ------------------------------
>
> _______________________________________________
> Biopython mailing list  -  Biopython at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/biopython
>
> End of Biopython Digest, Vol 151, Issue 4
> *****************************************
>



-- 
______O_________oO________oO______o_______oO__
Björn Johansson
Assistant Professor
Department of Biology
University of Minho
Campus de Gualtar
4710-057 Braga
PORTUGAL
www.bio.uminho.pt
Google profile <https://profiles.google.com/bjornjobb>
Google Scholar Profile
<http://scholar.google.com/citations?user=7AiEuJ4AAAAJ>
my group <https://sites.google.com/site/metabolicengineeringgroup/>
Office (direct) +351-253 601517 | (PT) mob.  +351-967 147 704 | (SWE) mob.
 +46 739 792 968
Dept of Biology (secr) +351-253 60 4310  | fax +351-253 678980