From marco.galardini at unifi.it Sat Sep 3 06:33:01 2011
From: marco.galardini at unifi.it (Marco Galardini)
Date: Sat, 03 Sep 2011 12:33:01 +0200
Subject: [Biopython] Fwd: Some ideas to add functionalities to Bio.Blast
In-Reply-To: <4E5E12E2.6000201@unifi.it>
References: <4E5E12E2.6000201@unifi.it>
Message-ID: <4E62025D.9040304@unifi.it>

Hi everybody,

I'm sending this e-mail to see whether the Biopython community would be interested in enhancing the Blast package. I've been using it for quite a long time now, and over the months I've written a set of functions that may be useful to other scientists using BLAST through the Biopython interface.

To be a little more specific, it would be desirable to have functions to build a database (makeblastdb), to retrieve sequences from a database (blastdbcmd), and to compute summary statistics for a whole BLAST run (average e-value, sequence identity, ...), and most importantly some classes to perform specific analyses such as Bi-directional Blast Hits (BBH), which could even be run genome-wide for comparative genomics purposes.

Do you think features like these would be useful to, or needed by, the Biopython community? I would be really glad to contribute on these issues.
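
To make the bi-directional best hit (BBH) idea above concrete, here is a minimal sketch in plain Python. It is only an illustration, not code from Biopython or from the functions described in this message: the helper names best_hits and bidirectional_best_hits, and the (query, subject, bit score) tuple layout, are assumptions made for the example. A real pipeline would fill these tuples from BLAST tabular output.

```python
def best_hits(rows):
    """Map each query ID to the subject ID with the highest bit score.

    `rows` is an iterable of (query_id, subject_id, bit_score) tuples.
    This tuple layout is an assumption made for the example.
    On a tie, the hit seen first is kept.
    """
    best = {}
    for query, subject, score in rows:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _score) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a_id, b_id) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy data standing in for two BLAST runs (genome A vs B, and B vs A).
a_vs_b = [("a1", "b1", 250.0), ("a1", "b2", 90.0),
          ("a2", "b2", 180.0), ("a3", "b1", 60.0)]
b_vs_a = [("b1", "a1", 245.0), ("b2", "a2", 175.0), ("b2", "a1", 88.0)]

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Here a3 finds b1 as its best hit, but b1's best hit is a1, so a3 is not reported; only pairs that agree in both directions survive.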
Regards,
Marco Galardini

--
-------------------------------------------------
Marco Galardini
DBE - Department of Evolutionary Biology
University of Florence - Italy

e-mail: marco.galardini at unifi.it
www: http://www.unifi.it/dblage/CMpro-v-p-51.html
phone: +39 055 2288249
mobile: +39 340 2808041
-------------------------------------------------

From p.j.a.cock at googlemail.com Sat Sep 3 11:14:03 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 3 Sep 2011 16:14:03 +0100
Subject: [Biopython] Fwd: Some ideas to add functionalities to Bio.Blast
In-Reply-To: <4E62025D.9040304@unifi.it>
References: <4E5E12E2.6000201@unifi.it> <4E62025D.9040304@unifi.it>
Message-ID:

On Sat, Sep 3, 2011 at 11:33 AM, Marco Galardini wrote:
> Hi everybody,
>
> i'm sending to you this e-mail in order to see if the BioPython
> community may be interested in enhancing the features related to the
> Blast package: i've been using it for quite a long time now and over the
> months i've created a set of functions that may be useful to other
> scientists using Blast through the BioPython interface.
> To be a little more specific, it would be desirable to have some
> functions to build a database (makeblastdb), to retrieve sequences from
> a db (blastdbcmd),...

The Bio.Blast.Applications module is currently lacking wrappers for those, so adding them would be very welcome:
https://github.com/biopython/biopython/blob/master/Bio/Blast/Applications.py

Could you sign up to the Biopython developers mailing list, which is where code contributions tend to be discussed? Thanks.

> to have some statistics from the overall blast run
> (average evalue, sequence identiy, ...) and most importantly some
> Classes to perform some specific analysis like Bi-directional Blast Hits
> (BBH) (which may be even genome-wide for comparative genomics purposes).
> Do you think that features like that may be useful/needed by the
> BioPython community? I would be really glad to give my contribution on
> that issues.

These latter ideas are harder to generalise into library functions, but if done well they could be very widely used. I've written reciprocal best hit BLAST code myself, for example (as a Python script for use in Galaxy).

If you've got some code already written that you'd like to share, please do so. Even if we decide some of it would be better as a Cookbook example rather than in the core library, it would still be very useful to the community.

Regards,

Peter

From nanatrapnest at hotmail.it Mon Sep 5 05:14:47 2011
From: nanatrapnest at hotmail.it (Nana Trapnest)
Date: Mon, 5 Sep 2011 09:14:47 +0000
Subject: [Biopython] help for Byopython
In-Reply-To:
References:
Message-ID:

Hello,
I am a new user and I'd like to write a program that lets me rotate a protein in 3D space. I think that the PDB parser could be useful, and I could use quaternion geometry as well... Could you give me some advice? Thank you very much

From p.j.a.cock at googlemail.com Mon Sep 5 05:38:26 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Sep 2011 10:38:26 +0100
Subject: [Biopython] help for Byopython
In-Reply-To:
References:
Message-ID:

On Mon, Sep 5, 2011 at 10:14 AM, Nana Trapnest wrote:
>
> Hello,
> I am a new user and I'd like to set up a programm that allow to
> rotate protein in the 3D space. I think that PDB parser could be
> useful and I could use quaternion geometry as well... could you
> give me some advice? Thank you very much

Do you just want to draw pictures of a rotating protein? If so, check out VMD, OpenRasMol or one of the many other PDB file viewers.

You can use Biopython to rotate 3D structures (e.g. PDB files). See for example:
http://www.warwick.ac.uk/go/peter_cock/python/protein_superposition/

Peter

From nmz787 at gmail.com Tue Sep 6 18:58:28 2011
From: nmz787 at gmail.com (Nathan McCorkle)
Date: Tue, 6 Sep 2011 18:58:28 -0400
Subject: [Biopython] Any way to add accessible meta-data/comments to FASTA files?
Message-ID:

Wikipedia says the FASTA format can deal with ; as a comment character, with software ignoring those lines. Is there a way to get at such comments from Biopython?

--
Nathan McCorkle
Rochester Institute of Technology
College of Science, Biotechnology/Bioinformatics

From p.j.a.cock at googlemail.com Tue Sep 6 19:17:03 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 7 Sep 2011 00:17:03 +0100
Subject: [Biopython] Any way to add accessible meta-data/comments to FASTA files?
In-Reply-To:
References:
Message-ID:

On Tuesday, September 6, 2011, Nathan McCorkle wrote:
> Wikipedia says the FASTA format can deal with ; as a comment
> character, software ignoring those lines. Is there a way to get to
> such comments from bioPython?

No, in practice very few if any tools accept such comment lines, so writing them would be counterproductive.

Peter

From gv1 at sanger.ac.uk Wed Sep 7 05:09:12 2011
From: gv1 at sanger.ac.uk (Giles Velarde)
Date: Wed, 7 Sep 2011 10:09:12 +0100
Subject: [Biopython] Any way to add accessible meta-data/comments to FASTA files?
In-Reply-To:
References:
Message-ID: <2C5EEB48-2A58-4450-BEA8-6D9CE0A2B3CC@sanger.ac.uk>

GFF3? You can create features in the first part, add any metadata you like to them in the 9th column, and have FASTA blocks at the bottom.

There's nothing stopping you having 1 feature per FASTA block, I think.

Best,
Giles.

From p.j.a.cock at googlemail.com Wed Sep 7 05:45:09 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 7 Sep 2011 10:45:09 +0100
Subject: [Biopython] Any way to add accessible meta-data/comments to FASTA files?
In-Reply-To: <2C5EEB48-2A58-4450-BEA8-6D9CE0A2B3CC@sanger.ac.uk>
References: <2C5EEB48-2A58-4450-BEA8-6D9CE0A2B3CC@sanger.ac.uk>
Message-ID:

On Wed, Sep 7, 2011 at 10:09 AM, Giles Velarde wrote:
> GFF3? You can create features in the first part, add any metadata you like
> them in the 9th column, and have FASTA blocks at the bottom.
>
> There's nothing stopping you having 1 feature per FASTA block, I think.
>
> Best, Giles.

I'm not sure I'd pick GFF3, but as alternatives to FASTA there certainly are many annotation-supporting sequence file formats out there, old and new (e.g. GenBank and SeqXML are both supported in Biopython).

Peter

From mjldehoon at yahoo.com Wed Sep 7 08:48:22 2011
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Wed, 7 Sep 2011 05:48:22 -0700 (PDT)
Subject: [Biopython] Bio.Unigene.UniGene
Message-ID: <1315399702.38063.YahooMailClassic@web161217.mail.bf1.yahoo.com>

Hi all,

Bio/UniGene/__init__.py contains a parser for UniGene files such as this one:

https://github.com/biopython/biopython/blob/master/Tests/UniGene/Eca.1.2425.data

Bio/UniGene/UniGene.py contains a parser for UniGene data in HTML format. I didn't find an example HTML file, but I somehow doubt that this parser is still relevant today.

Does anybody object to deprecating Bio/UniGene/UniGene.py?
(or should we start with a PendingDeprecationWarning)?
Of course, Bio/UniGene/__init__.py will not be deprecated and will remain.

Best,
--Michiel.

From p.j.a.cock at googlemail.com Wed Sep 7 08:57:00 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 7 Sep 2011 13:57:00 +0100
Subject: [Biopython] Bio.Unigene.UniGene
In-Reply-To: <1315399702.38063.YahooMailClassic@web161217.mail.bf1.yahoo.com>
References: <1315399702.38063.YahooMailClassic@web161217.mail.bf1.yahoo.com>
Message-ID:

On Wed, Sep 7, 2011 at 1:48 PM, Michiel de Hoon wrote:
> Hi all,
>
> Bio/UniGene/__init__.py contains a parser for UniGene files such as this one:
>
> https://github.com/biopython/biopython/blob/master/Tests/UniGene/Eca.1.2425.data
>
> Bio/UniGene/UniGene.py contains a parser for UniGene data in HTML format. I didn't find an example HTML file, but I somehow doubt that this parser is still relevant today.
>
> Does anybody object against deprecating Bio/UniGene/UniGene.py?
> (or should we start with a PendingDeprecationWarning)?
> Of course, Bio/UniGene/__init__.py will not be deprecated and will remain.

I didn't realise we still had any HTML parsers left. Since there is an easier-to-parse plain text format, we should absolutely phase out the HTML parser.

Peter

From youngcsong at gmail.com Wed Sep 7 18:36:26 2011
From: youngcsong at gmail.com (Young Song)
Date: Wed, 7 Sep 2011 15:36:26 -0700
Subject: [Biopython] How do I retrieve information regarding isolation_source and clones using Entrez
Message-ID:

Hi,

I have spent a few hours reading the Biopython manual, and I am currently trying to write a script that can retrieve the isolation_source and clone name for certain sequences.

This is the code that I have now:

>>> from Bio import Entrez
>>> Entrez.email = "youngcsong at gmail.com"
>>> entrez_handle = Entrez.efetch(db="protein", id="ADJ51069", retmode="xml")
>>> entrez_record = Entrez.read(entrez_handle)
>>> print(entrez_record[0].keys())

Then I get the following:

[u'GBSeq_moltype', u'GBSeq_source', u'GBSeq_sequence', u'GBSeq_primary-accession', u'GBSeq_definition', u'GBSeq_accession-version', u'GBSeq_topology', u'GBSeq_length', u'GBSeq_feature-table', u'GBSeq_create-date', u'GBSeq_other-seqids', u'GBSeq_division', u'GBSeq_taxonomy', u'GBSeq_comment', u'GBSeq_source-db', u'GBSeq_references', u'GBSeq_update-date', u'GBSeq_organism', u'GBSeq_locus']

I used the key "GBSeq_feature-table" to see what sort of values are stored there:

>>> print(entrez_record[0]["GBSeq_feature-table"])

Then I get the following, which seems rather confusing:

[{u'GBFeature_quals': [{u'GBQualifier_name': 'organism', u'GBQualifier_value': 'uncultured prokaryote'}, {u'GBQualifier_name': 'isolation_source', u'GBQualifier_value': 'contaminated river sediment'}, {u'GBQualifier_name': 'db_xref', u'GBQualifier_value': 'taxon:198431'}, {u'GBQualifier_name': 'clone', u'GBQualifier_value': 'Arthur_Kill_OTU4'}, {u'GBQualifier_name': 'environmental_sample'}, {u'GBQualifier_name': 'country', u'GBQualifier_value': 'USA: New Jersey'}], u'GBFeature_key': 'source', u'GBFeature_intervals': [{u'GBInterval_from': '1', u'GBInterval_to': '218', u'GBInterval_accession': 'ADJ51069.1'}], u'GBFeature_location': '1..218'},
 {u'GBFeature_quals': [{u'GBQualifier_name': 'product', u'GBQualifier_value': 'alkylsuccinate synthase'}], u'GBFeature_intervals': [{u'GBInterval_from': '1', u'GBInterval_to': '218', u'GBInterval_accession': 'ADJ51069.1'}], u'GBFeature_location': '<1..>218', u'GBFeature_key': 'Protein', u'GBFeature_partial5': StringElement('', attributes={u'value': u'true'}), u'GBFeature_partial3': StringElement('', attributes={u'value': u'true'})},
 {u'GBFeature_quals': [{u'GBQualifier_name': 'region_name', u'GBQualifier_value': 'RNR_PFL'}, {u'GBQualifier_name': 'note', u'GBQualifier_value': 'Ribonucleotide reductase and Pyruvate formate lyase; cl09939'}, {u'GBQualifier_name': 'db_xref', u'GBQualifier_value': 'CDD:186877'}], u'GBFeature_intervals': [{u'GBInterval_from': '1', u'GBInterval_to': '218', u'GBInterval_accession': 'ADJ51069.1'}], u'GBFeature_location': '<1..>218', u'GBFeature_key': 'Region', u'GBFeature_partial5': StringElement('', attributes={u'value': u'true'}), u'GBFeature_partial3': StringElement('', attributes={u'value': u'true'})},
 {u'GBFeature_quals': [{u'GBQualifier_name': 'gene', u'GBQualifier_value': 'assA'}, {u'GBQualifier_name': 'coded_by', u'GBQualifier_value': 'GU453639.1:<1..>658'}, {u'GBQualifier_name': 'codon_start', u'GBQualifier_value': '3'}, {u'GBQualifier_name': 'transl_table', u'GBQualifier_value': '11'}], u'GBFeature_intervals': [{u'GBInterval_from': '1', u'GBInterval_to': '218', u'GBInterval_accession': 'ADJ51069.1'}], u'GBFeature_location': '1..218', u'GBFeature_key': 'CDS', u'GBFeature_partial5': StringElement('', attributes={u'value': u'true'}), u'GBFeature_partial3': StringElement('', attributes={u'value': u'true'})}]

It seems there are entries called GBQualifier_name and GBQualifier_value, but I am not sure how to use them to get the values I am after (i.e. contaminated river sediment and Arthur_Kill_OTU4).

Your help here would be very much appreciated. Thank you in advance.

Young

--
Young C. Song
Masters Student
Graduate Program in Bioinformatics
The University of British Columbia
Department of Microbiology and Immunology
2350 Health Science Mall
Vancouver, BC V6T 1Z4, Canada

From p.j.a.cock at googlemail.com Thu Sep 8 03:41:14 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 8 Sep 2011 08:41:14 +0100
Subject: [Biopython] Bio.Unigene.UniGene
In-Reply-To:
References: <1315399702.38063.YahooMailClassic@web161217.mail.bf1.yahoo.com>
Message-ID:

On Wednesday, September 7, 2011, Peter Cock wrote:
> On Wed, Sep 7, 2011 at 1:48 PM, Michiel de Hoon wrote:
>> Hi all,
>>
>> Bio/UniGene/__init__.py contains a parser for UniGene files such as this one:
>>
>> https://github.com/biopython/biopython/blob/master/Tests/UniGene/Eca.1.2425.data
>>
>> Bio/UniGene/UniGene.py contains a parser for UniGene data in HTML format. I didn't find an example HTML file, but I somehow doubt that this parser is still relevant today.
>>
>> Does anybody object against deprecating Bio/UniGene/UniGene.py?
>> (or should we start with a PendingDeprecationWarning)?
>> Of course, Bio/UniGene/__init__.py will not be deprecated and will remain.
>
> I didn't realise we still had any HTML parsers left.
> Since there is an easier to parse plain text format, then
> absolutely we should phase out the HTML parser.
>
> Peter
>

Something in the recent change has upset 2to3, e.g.
http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.2/builds/282/steps/compile/logs/stdio

Peter

From mjldehoon at yahoo.com Thu Sep 8 09:29:59 2011
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 8 Sep 2011 06:29:59 -0700 (PDT)
Subject: [Biopython] How do I retrieve information regarding isolation_source and clones using Entrez
In-Reply-To:
Message-ID: <1315488599.55360.YahooMailClassic@web161212.mail.bf1.yahoo.com>

> I used the key, "GBSeq_feature-table" to see what sort of
> values are stored
> here,
>
> >>print records[0]["GBSeq_feature-table"]
>
> Then I get following, which seems rather confusing:
>
> [{u'GBFeature_quals': [{u'GBQualifier_name': 'organism',
> u'GBQualifier_value': 'uncultured prokaryote'},
> {u'GBQualifier_name':
> 'isolation_source', u'GBQualifier_value': 'contaminated
> river sediment'},
> {u'GBQualifier_name': ...

As records[0]["GBSeq_feature-table"] starts with a '[', it is a Python list, and you can use it as such. Try for example

>>> len(records[0]["GBSeq_feature-table"])

or

>>> records[0]["GBSeq_feature-table"][0]

Similarly, if you see something that starts with a '{', it is a dictionary.

Best,
--Michiel.

From p.j.a.cock at googlemail.com Thu Sep 8 10:10:25 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 8 Sep 2011 15:10:25 +0100
Subject: [Biopython] Bio.Unigene.UniGene
In-Reply-To:
References: <1315399702.38063.YahooMailClassic@web161217.mail.bf1.yahoo.com>
Message-ID:

On Thu, Sep 8, 2011 at 8:41 AM, Peter Cock wrote:
>
> Something in the recent change has upset 2to3, e.g.
> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.2/builds/282/steps/compile/logs/stdio
>

Looked like a new line problem, fixed:
https://github.com/biopython/biopython/commit/11c33d25f2560f37c6b5457243205ec6186ebd45

Peter

From pawan.mani2 at gmail.com Fri Sep 9 13:04:22 2011
From: pawan.mani2 at gmail.com (kakchingtabam pawankumar sharma)
Date: Fri, 9 Sep 2011 22:34:22 +0530
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
Message-ID:

Dear sir,

I have a problem removing Biopython from Ubuntu, which I am running as a virtual machine in Oracle VM VirtualBox. I had Python 2.7, and I installed Python 3 using apt-get. I then installed Biopython using apt-get, but I could not import Biopython in Python 3; it only works in Python 2.7. So I removed Python 2.7 using sudo rm, and then uninstalled Biopython using apt-get remove. Then I typed the command:

$ dpkg --list | grep 'biopython'

and got:

rF python-biopython 1.56-l Python library for Bioinformatics
ii python-biopython 1.56-l Documentation for the biopython library

How do I remove these two entries? I also started downloading the Biopython tar file but stopped before it completed, and when I run locate for biopython there is a file biopython-1.58.tar.gz.par. I want to delete this file.

I would like to know whether Biopython is compatible with Python 3.2. If yes, I want to install Biopython for both Python versions, 2.7.2 and 3.2.

Kindly guide me in removing Python 3 as well; I want to reinstall the two Python versions along with Biopython so that I can use it in both versions.

With best regards,
pawankumar

From p.j.a.cock at googlemail.com Sat Sep 10 07:03:41 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 10 Sep 2011 12:03:41 +0100
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
In-Reply-To:
References:
Message-ID:

On Fri, Sep 9, 2011 at 6:04 PM, kakchingtabam pawankumar sharma wrote:
> Dear sir,
>    I have been facing a problem in removing biopython from ubantu which i
> am using virtually using oracle VM vertualBox software. i have python2.7.
> but i have instal python3 using apt-get command. and then i install biopython
> using aptget

I believe the Ubuntu package for Biopython will be for Python 2 (whichever Python 2 is standard for that version of Ubuntu).

> . then i could not able to import biopython in python3. only it works in
> python2.7. so i have remove python2.7 using sudo rm command.

Was Python 2.7 installed by an Ubuntu package, or by you from source? If you install things from packages it is best to remove them using the package manager (apt-get remove here).

> then i
> uninstall biopython using apt-get remove. then i type the command:
>
>      $ dpkg --list | grep 'biopython'
> then i got :
>
> rF python-biopython           1.56-l     Python library for Bioinformatics
> ii python-biopython           1.56-l     Documentation for the biopython
> library
>
>
>
> How to remove this two files and I have downloanded biopython tar file but
> before completing i have stop and when i locate biopythone. there is a file
> biopython-1.58.tar.gz.par. i want to delete this file.

I'm confused about what you've done with your system.

> I would like to know whether biopython is compatible with python.3.2.
>

Somewhat. Most things work, but at this point we only support Biopython on Python 2.5, 2.6 and 2.7.
You are welcome to help test on Python 3, and fix bugs, but for a beginner I would recommend sticking with Biopython on Python 2 for now.

> if yes then i want to install Biopython for the two python version 2.7.2
> and 3.2.
>
> Kindly guide me to remove python3 also and I want to reinstall the two
> python version along with biopython. so that i ca used in both the version.

If you installed Python 3 using the Ubuntu packages, then use the package manager to uninstall it.

If you installed any copies of Biopython from source, you must manually remove them.

Normally Python libraries are installed separately for Python 2.4, 2.5, 2.6, 2.7, 3.0, 3.1, 3.2 etc. You can install Biopython for Python 2.7, and also install Biopython for Python 3.2. This will result in two copies in the system libraries. That is normal.

I hope that helps.

Peter

From eric.talevich at gmail.com Sat Sep 10 10:29:12 2011
From: eric.talevich at gmail.com (Eric Talevich)
Date: Sat, 10 Sep 2011 10:29:12 -0400
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
In-Reply-To:
References:
Message-ID:

Pawankumar,

Try this series of commands:

    sudo aptitude reinstall python
    sudo aptitude remove python-biopython
    sudo aptitude install python-setuptools
    sudo aptitude install python3-setuptools
    sudo easy_install biopython
    sudo easy_install3 biopython

This should fix your current Python 2.7 installation, which must be installed correctly for most of Ubuntu to work.

Then it installs the setuptools package for both Python 2.7 and Python 3, which includes the program "easy_install". As Peter mentioned, each Python version keeps its own set of installed libraries, so you need to install Biopython separately for each Python version you want to use.

Don't use apt-get to install Python packages, in general. Use the easy_install command instead -- easy_install for the default Python version (2.7), easy_install3 for Python 3.

Cheers,
Eric

From pawan.mani2 at gmail.com Sat Sep 10 13:09:06 2011
From: pawan.mani2 at gmail.com (kakchingtabam pawankumar sharma)
Date: Sat, 10 Sep 2011 22:39:06 +0530
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
In-Reply-To:
References:
Message-ID:

Thanks for your reply.

Dear Eric, I have reinstalled Ubuntu in Oracle VM VirtualBox. Right now my system has Python 2.6; instead of Python 2.6, I would like to have Python 2.7 and 3.1, with a separate Biopython for each. So kindly tell me the exact commands required to install them on my system.

Dear Peter, do you mean that it is not possible to use the Biopython package with Python 3? I was reading a book called PYTHON FOR BIOINFORMATICS, 2010 edition. The author teaches Biopython using Python 3 and keeps urging the reader to learn Python 3.

With best regards,
Pawankumar sharma

From eric.talevich at gmail.com Sat Sep 10 13:34:06 2011
From: eric.talevich at gmail.com (Eric Talevich)
Date: Sat, 10 Sep 2011 13:34:06 -0400
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
In-Reply-To:
References:
Message-ID:

On Sat, Sep 10, 2011 at 1:09 PM, kakchingtabam pawankumar sharma <pawan.mani2 at gmail.com> wrote:
>
> Dear Eric,I have reinstall the ubantu using oracle VM virtual box. rit now
> in my system i have python2.6 so instead of python2.6, i would like to have
> python 2.7 and 3.1. for this two separate biopython also. so kindly tel me
> the required exact commands to install in my system.
>

I assume you're new to Ubuntu.
If the default Python installation is 2.6, I recommend you use that instead of 2.7 for now. Using Python 3 is still OK.

Do this:

    sudo aptitude install python-setuptools python3 python3-setuptools

    sudo easy_install biopython
    sudo easy_install3 biopython

If you really want to use 2.7 also, then figuring out how to do that will be a fun exercise for you and you'll learn more about managing your system.

Cheers,
Eric

From pawan.mani2 at gmail.com Sat Sep 10 13:41:14 2011
From: pawan.mani2 at gmail.com (kakchingtabam pawankumar sharma)
Date: Sat, 10 Sep 2011 23:11:14 +0530
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
In-Reply-To:
References:
Message-ID:

Dear Eric, yes, I am new to Ubuntu Linux and Python. I saw on the Biopython website that the latest Biopython does not support Python 3.2.

How do I upgrade from Python 2.6 to 2.7? I want to install Biopython for Python 2.7, and I want Python 3.2 as well. What are the commands for upgrading to Python 2.7 and installing Biopython for it, and, beyond that, for installing Python 3.2?

Thanks for replying so quickly.

With best regards,
Pawan

From p.j.a.cock at googlemail.com Sat Sep 10 14:53:44 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 10 Sep 2011 19:53:44 +0100
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
In-Reply-To:
References:
Message-ID:

On Saturday, September 10, 2011, kakchingtabam pawankumar sharma wrote:
> Dear Pete,r u mean to say is that python 3 is not possible to use biopython
> package. but i was reading a book called PYTHON FOR BIOINFORMATICS 2010
> edition. this author taught us using biopython in python3 version. they
> kept emphasising the reader to learn python3 version.

Which book is that please? e.g. Author name? Publisher?

I'm a bit surprised there is a book out recommending using Biopython with Python 3 (given we don't officially support it yet, and don't even provide Windows installers for Biopython under Python 3).

We're still in a transition period where not all the major libraries have been converted from Python 2 to Python 3, so if you want/need to use extra Python libraries you shouldn't automatically start with Python 3.

Peter

From eric.talevich at gmail.com Sat Sep 10 15:36:42 2011
From: eric.talevich at gmail.com (Eric Talevich)
Date: Sat, 10 Sep 2011 15:36:42 -0400
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
In-Reply-To:
References:
Message-ID:

On Sat, Sep 10, 2011 at 1:41 PM, kakchingtabam pawankumar sharma <pawan.mani2 at gmail.com> wrote:
> DEar Eri, yap I am new to ubantu linux and python.
I show in biopython > website. the latest biothon does not support python 3.2. > > How to upgrade to python2.6 to 2.7. and i want install biopython for python > 2.7 and i want python3.2 also. > Since you're new to Ubuntu and Biopython, I recommend just using the versions you've installed already. Python 2.6 and 3.1 will meet your needs, and the more weird stuff you do to your system, the more likely you are to have problems later. Just try using the versions you have for a while, first. When you upgrade Ubuntu to the latest version (11.04 or 11.10 -- you're on 10.10 right now, I think), both Python versions will be upgraded automatically. If you have deeper questions about managing your Python installation on Ubuntu, or upgrading Ubuntu itself, please consult either the Ubuntu wiki or Python.org. Best, Eric From pawan.mani2 at gmail.com Sat Sep 10 15:46:30 2011 From: pawan.mani2 at gmail.com (kakchingtabam pawankumar sharma) Date: Sun, 11 Sep 2011 01:16:30 +0530 Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box. In-Reply-To: References: Message-ID: Thanks U two for ur kind reply i ll upgrade the ubatu. then i may have python 2.7. On Sun, Sep 11, 2011 at 1:06 AM, Eric Talevich wrote: > On Sat, Sep 10, 2011 at 1:41 PM, kakchingtabam pawankumar sharma < > pawan.mani2 at gmail.com> wrote: > >> DEar Eri, yap I am new to ubantu linux and python. I show in biopython >> website. the latest biothon does not support python 3.2. >> >> How to upgrade to python2.6 to 2.7. and i want install biopython for >> python 2.7 and i want python3.2 also. >> > > Since you're new to Ubuntu and Biopython, I recommend just using the > versions you've installed already. Python 2.6 and 3.1 will meet your needs, > and the more weird stuff you do to your system, the more likely you are to > have problems later. Just try using the versions you have for a while, > first. 
> > When you upgrade Ubuntu to the latest version (11.04 or 11.10 -- you're on > 10.10 right now, I think), both Python versions will be upgraded > automatically. > > If you have deeper questions about managing your Python installation on > Ubuntu, or upgrading Ubuntu itself, please consult either the Ubuntu wiki or > Python.org. > > Best, > Eric > From ndousis at gmail.com Sun Sep 11 21:00:56 2011 From: ndousis at gmail.com (Nasos Dousis) Date: Sun, 11 Sep 2011 18:00:56 -0700 Subject: [Biopython] align single sequence to MSA Message-ID: Hello, First, thank you to everyone who has contributed to the BioPython codebase and to the mailing list. I have a FASTA sequence, and I'd like to find the optimal alignment of that sequence to an MSA. I don't want to alter the MSA-- I just want to map the single sequence onto the MSA. Is there a simple way to do this by ClustalW or MUSCLE? Thanks, Nasos From steven.irvin at monsanto.com Tue Sep 13 16:51:03 2011 From: steven.irvin at monsanto.com (IRVIN, STEVEN (AG-Contractor/1000)) Date: Tue, 13 Sep 2011 20:51:03 +0000 Subject: [Biopython] Translations for BLAST hits "on the fly" using biopython Message-ID: <8F46CBF672774F4C8A6B288A246B4468A8209F@stlwexmbxprd02.na.ds.monsanto.com> Hell All, Is there a quick and easy way to obtain the (optimal) translations from nucleotide sequence [query or subject (db)] results ("hits"/HSP) derived from BLAST+ programs such as tblastn using BioPython? Steve Steven D Irvin, MS Bioinformatics Analyst [cid:image003.png at 01CC1925.F25B8430]CC214-A Monsanto Research Center Chesterfield Village, MO Steven.Irvin at monsanto.com (636) 737-1980 This e-mail message may contain privileged and/or confidential information, and is intended to be received only by persons entitled to receive such information. If you have received this e-mail in error, please notify the sender immediately. Please delete it and all attachments from any servers, hard drives or any other media. 
Other use of this e-mail by you is strictly prohibited. All e-mails and attachments sent and received are subject to monitoring, reading and archival by Monsanto, including its subsidiaries. The recipient of this e-mail is solely responsible for checking for the presence of "Viruses" or other "Malware". Monsanto, along with its subsidiaries, accepts no liability for any damage caused by any such code transmitted by or accompanying this e-mail or any attachment. The information contained in this email may be subject to the export control laws and regulations of the United States, potentially including but not limited to the Export Administration Regulations (EAR) and sanctions regulations issued by the U.S. Department of Treasury, Office of Foreign Asset Controls (OFAC). As a recipient of this information you are obligated to comply with all applicable U.S. export laws and regulations. -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 3721 bytes Desc: image001.png URL: From p.j.a.cock at googlemail.com Wed Sep 14 17:09:07 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 14 Sep 2011 22:09:07 +0100 Subject: [Biopython] Fwd: NETTAB 2011 on Clinical Bioinformatics: Call for posters and participation In-Reply-To: <201109141346.p8EDk67h021261@clus2.istge.it> References: <201109141346.p8EDk67h021261@clus2.istge.it> Message-ID: Hopefully this is of interest to some of you... ---------- Forwarded message ---------- From: Paolo Romano Date: Wed, Sep 14, 2011 at 2:46 PM Subject: NETTAB 2011 on Clinical Bioinformatics: Call for posters and participation To: biopython-owner at lists.open-bio.org Dear list owner, I would be glad if you would forward thsi message to the list. Many thanks in adavnce. Ciao. Paolo ==== Dear all, I'm glad to remind you of next NETTAB 2011 workshop on Clinical Bioinformatics and to send you the related Call for submissions of posters. 
A Supplement of BMC Bioinformatics will later be published with a selection of best extended and revised versions of contributions, both oral communications and posters, presented at the workshop. If you are interested, as I hope, you are also welcome to subscribe to the low trafic list nettab-announce at istge.it. It is a moderated mailing list, devoted exclusively to announcements related to the NETTAB workshops, a series of events on new ICT tools and their applicability to biological research and bioinformatics that are held annually in Italy. Please subscribe at http://www.nettab.org/listsub.php . Best regards. Paolo Romano ==== NETTAB 2011 on "Clinical Bioinformatics" October 12-14, 2011 Collegio Ghislieri, Pavia, Italy http://www.nettab.org/2011/ NETTAB 2011 is the eleventh in a series of international workshops on Network Tools and Applications in Biology. NETTAB 2011 is focused on Clinical Bioinformatics and it is aimed at presenting the methods, tools and infrastructures that are nowadays available. It will also show some of the most interesting applications in this field. PROGRAMME Invited Speakers + Network Models of Mesophenotypes in Personal Genomics and Targeted Therapies Yves A. 
LUSSIER, Institute for Translational Medicine, University of Chicago, USA + From Omics to Systems Biology - an Approach to Individualized Medicine Thomas ILLIG, Institute of Epidemiology, Helmholtz Zentrum M?nchen, Neuherberg, Germany + ICT Architectures for Biobanks to Support Clinical Research Jan-Eric LITTON, Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden + Computing our Patients' Future Using Data from our Healthcare Institutions Shawn N MURPHY, Harvard Medical School, Boston, USA Special Sessions + Why clinicians need e-Health Terry HANNAN, University of Tasmania, Launceston, Australia + Interoperability - HL7 Amnon SHABO, IBM Research Lab, Haifa, Israel Tutorials + Data Warehouse Carlo COMBI, Department of Computer Science, University of Verona, Verona, Italy + Natural Language Processing Pierre ZWEIGENBAUM, LIMSI - CNRS, Orsay, France + Search Computing Marco MASSEROLI (to be confirmed), Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milan, Italy CALL FOR POSTERS Submitted contributions should address one or more of the following topics: - Methods and Technologies: ? Data warehouse - Natural Language Processing - Data Mining - Ontologies, Interoperability, Standardisation - ? Search Computing - Decision Support - Tools: ? ICT architectures to support - Clinical research . BioBanks . Software for next generation sequencing - ? Data management and storage - Applications: ? Risk assessment . Diagnosis . Therapy planning- Drug Design Contributions length: - posters and demoes: no more than 3 pages Deadlines: ? September 16, 2011: Posters and demoes submission A slight delay may be accepted. Submit your contribution through the EasyChair system at http://www.easychair.org/conferences/?conf=nettab2011 All contributions submitted to NETTAB 2011 will be invited to a restricted Call for full research papers to be published in a Supplement of BMC Bioinformatics. 
REGISTRATION You can register to the NETTAB 2011 workshop by using the form at http://www.nettab.org/2011/rform.html Early registration ends within September 30, 2011. CONTACTS Visit the website http://www.nettab.org/2011/ Contact the organization by sending an email message to nettab2011 at unipv.it . Best regards. Paolo Romano, on behalf of the Conference Chairs Paolo Romano (paolo.romano at istge.it) Bioinformatics National Cancer Research Institute (IST) http://www.nettab.org/ NETTAB Workshops. Stay tuned! From abhishek.vit at gmail.com Fri Sep 16 18:42:37 2011 From: abhishek.vit at gmail.com (Abhishek Pratap) Date: Fri, 16 Sep 2011 15:42:37 -0700 Subject: [Biopython] comparing bam files Message-ID: Hi All This is my first post to the biopython mailing list. Basically I am new to both Python and BioP. So I have two bam files one contains the properly paired reads (file A) and the other has some of the singeltons (file B) either (read 1 / read 2). I have to find the mates of all the singletons from the properly paired bam file (file A) and then generate a bam file (file C)which has all the proper pairs for all the singletons I had. PS: Also the file A is guaranteed to have all the pairs which might exist as a singleton in file B. I want to do this on the binary files and avoid reading in the sam files. Is that something I can do using some of the bam readers in biopython ? Thanks! -Abhi From p.j.a.cock at googlemail.com Sat Sep 17 17:44:03 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sat, 17 Sep 2011 22:44:03 +0100 Subject: [Biopython] comparing bam files In-Reply-To: References: Message-ID: On Fri, Sep 16, 2011 at 11:42 PM, Abhishek Pratap wrote: > Hi All > > This is my first post to the biopython mailing list. Basically I am > new to both Python and BioP. > Welcome. > > So I have two bam files one contains the properly paired reads (file > A) and the other has some of the singeltons (file B) either (read 1 / > read 2). 
> > I have to find the mates of all the singletons from the properly > paired bam file (file A) and then generate a bam file ?(file C)which > has all the proper pairs for all the singletons I had. > > PS: Also the file A is guaranteed to have all the pairs which might > exist as a singleton in file B. I don't understand what you're trying to do - why are there singletons in file A if that is the file of properly paired reads? > I want to do this on the binary files and avoid reading in the sam > files. Is that something I can do using some of the bam readers in > biopython ? Biopython doesn't have a SAM/BAM interface, instead there is pysam which binds the samtools C API: http://code.google.com/p/pysam/ Try that. Peter From abhishek.vit at gmail.com Tue Sep 20 17:46:15 2011 From: abhishek.vit at gmail.com (Abhishek Pratap) Date: Tue, 20 Sep 2011 14:46:15 -0700 Subject: [Biopython] comparing bam files In-Reply-To: References: Message-ID: Thanks for the reply Peter. I know my requirement sure does confusing but this is something we need to do in order to extract the reads which are stranded. In our case we want the reads where read 1 maps to same strand and read 2 on the other strand and eliminate the cases where read 2 falls on the same strand and read 1 on the opposite strand. I am looking into pysam to see if it can help me . Does pysam have another mailing list or this is the right forum to ask pysam related questions ? I am sure I will have some as soon as I begin poking into it. -Abhi On Sat, Sep 17, 2011 at 2:44 PM, Peter Cock wrote: > On Fri, Sep 16, 2011 at 11:42 PM, Abhishek Pratap > wrote: > > Hi All > > > > This is my first post to the biopython mailing list. Basically I am > > new to both Python and BioP. > > > > Welcome. > > > > > So I have two bam files one contains the properly paired reads (file > > A) and the other has some of the singeltons (file B) either (read 1 / > > read 2). 
> > > > I have to find the mates of all the singletons from the properly > > paired bam file (file A) and then generate a bam file (file C)which > > has all the proper pairs for all the singletons I had. > > > > PS: Also the file A is guaranteed to have all the pairs which might > > exist as a singleton in file B. > > I don't understand what you're trying to do - why are there > singletons in file A if that is the file of properly paired reads? > > > I want to do this on the binary files and avoid reading in the sam > > files. Is that something I can do using some of the bam readers in > > biopython ? > > Biopython doesn't have a SAM/BAM interface, instead there > is pysam which binds the samtools C API: > > http://code.google.com/p/pysam/ > > Try that. > > Peter > From p.j.a.cock at googlemail.com Tue Sep 20 18:35:36 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 20 Sep 2011 23:35:36 +0100 Subject: [Biopython] comparing bam files In-Reply-To: References: Message-ID: On Tue, Sep 20, 2011 at 10:46 PM, Abhishek Pratap wrote: > Thanks for the reply Peter. I know my requirement sure does confusing but > this is something we need to do in order to extract the reads which are > stranded. In our case we want the reads where read 1 maps to same strand and > read 2 on the other strand and eliminate the cases where read 2 falls on the > same strand and read 1 on the opposite strand. Do you have something backwards, or perhaps I should go to sleep now... it sounds like you are talking about paired reads here (read 1 and read 2). It would be normal for Sanger (capillary) or Illumina paired end (or Illumina mate pairs) reads to map to opposite strands. Technically Roche paired end reads would map to the same strand (due to the way they sequence over the boundary of a circularised fragment) but I'm not 100% sure if any read aligners/assemblers reflect this. 
I know that sff_extract and MIRA flip one of the Roche 454 reads so that they act like classical Sanger or Illumina paired ends.

> I am looking into pysam to see if it can help me . Does pysam have another
> mailing list or this is the right forum to ask pysam related questions ? I
> am sure I will have some as soon as I begin poking into it.
> -Abhi

There are a few people on the Biopython lists who do use pysam, and might be able to help, but pysam has a separate mailing list.

Peter

From nanatrapnest at hotmail.it Wed Sep 21 10:34:23 2011
From: nanatrapnest at hotmail.it (Nana Trapnest)
Date: Wed, 21 Sep 2011 14:34:23 +0000
Subject: [Biopython] Help for PDBParser
Message-ID:

Hello,

I'd like to know how to print the structure of a protein using Biopython. I installed Python and Biopython, but where do I get the proteins? I use this:

from Bio.PDB import *
parser=PDBParser()
structure=parser.get_structure("Tripsina", "2PTC.pdb")
print structure

but there is an error...

Traceback (most recent call last):
  File "C:/Documents and Settings/Stefania/Desktop/PITON/prova", line 4, in <module>
    structure=parser.get_structure("Tripsina", "2PTC.pdb")
  File "C:\Python27\lib\site-packages\Bio\PDB\PDBParser.py", line 77, in get_structure
    file=open(file)
IOError: [Errno 2] No such file or directory: '2PTC.pdb'

Can you help me please? Where do I find 2PTC.pdb? Thanks

From mikael.trellet at gmail.com Wed Sep 21 10:47:42 2011
From: mikael.trellet at gmail.com (Mikael Trellet)
Date: Wed, 21 Sep 2011 16:47:42 +0200
Subject: [Biopython] Help for PDBParser
In-Reply-To:
References:
Message-ID:

The second argument you give to the get_structure function has to be the path to your PDB file. You will certainly find this PDB file in the PDB database: http://www.rcsb.org/pdb/home/home.do

Moreover, your "print structure" will only show the structure object itself, like this: "<Structure id=Tripsina>"

You will have to iterate over it and print models, chains, residues and/or atoms to get something understandable!
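
To make the "iterate over it" advice concrete, here is a minimal sketch of a chain, residue, atom walk. It deliberately does not use Bio.PDB: it reads the fixed-width ATOM/HETATM columns of the PDB format directly, so it runs without Biopython or a downloaded file. The function name read_atoms and the two sample records are made up for illustration (the values are not taken from 2PTC).

```python
# Two made-up ATOM records in the standard fixed-width PDB layout
# (illustrative values only, not copied from a real entry).
SAMPLE = [
    "ATOM      1  N   ILE A  16      18.521  45.012  55.337",
    "ATOM      2  CA  ILE A  16      19.100  46.250  54.800",
]


def read_atoms(lines):
    """Group ATOM/HETATM records as chain -> residue -> list of atoms."""
    chains = {}
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip HEADER, REMARK, TER and other record types
        chain_id = line[21]                                # column 22
        residue = (line[17:20].strip(), int(line[22:26]))  # (name, number)
        atom = (
            line[12:16].strip(),   # atom name, columns 13-16
            float(line[30:38]),    # x, columns 31-38
            float(line[38:46]),    # y, columns 39-46
            float(line[46:54]),    # z, columns 47-54
        )
        chains.setdefault(chain_id, {}).setdefault(residue, []).append(atom)
    return chains


# Walk the hierarchy and print one line per atom.
for chain_id, residues in read_atoms(SAMPLE).items():
    for (resname, resseq), atoms in residues.items():
        for name, x, y, z in atoms:
            print("%s %s%d %-4s %8.3f %8.3f %8.3f"
                  % (chain_id, resname, resseq, name, x, y, z))
```

With a real file, passing an open file handle instead of SAMPLE should work the same way, since the function only needs an iterable of lines.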
Don't hesitate to read the Biopython wiki to have more details and informations : http://biopython.org/wiki/Biopython Cordially, On Wed, Sep 21, 2011 at 4:34 PM, Nana Trapnest wrote: > > Hello,I'd like to know how to print structure of a protein using Biopython, > I istalled Python and Biopython, but where I get the proteins? I use this > from Bio.PDB import > *parser=PDBParser()structure=parser.get_structure("Tripsina", > "2PTC.pdb")print structure > but there is an error... > Traceback (most recent call last): File "C:/Documents and > Settings/Stefania/Desktop/PITON/prova", line 4, in > structure=parser.get_structure("Tripsina", "2PTC.pdb") File > "C:\Python27\lib\site-packages\Bio\PDB\PDBParser.py", line 77, in > get_structure file=open(file)IOError: [Errno 2] No such file or > directory: '2PTC.pdb' > Can you help me please??? Where I find 2PTC.pdb??? Thanks > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > -- Mikael TRELLET, Computational structural biology group, Utrecht University Bijvoet Center, The Netherlands From oriolebaltimore at gmail.com Fri Sep 23 15:12:21 2011 From: oriolebaltimore at gmail.com (Adrian Johnson) Date: Fri, 23 Sep 2011 15:12:21 -0400 Subject: [Biopython] annotation help Message-ID: Hi : I have mutation results in VCF format. Typically I want to take chromosome position reference base consensus base chr21 30576509 C Y (C/T) >From this data: 1. I want to find out if this is a missense mutation. 2. Amino acid change ( VAL to MET) 3. Protein position 3. Gene name (KRTAP24) and RefSeq transcript name (NM_****) 4. Name of drug that acts on this. Is it possible to get such annotation through biopython? Dear Sean: You are very active in both bioconductor and biopython and you might have worked exome-seq data and worked through this problem. 
I could do this kind of stuff using SeattleSeq, however I want to get a stand-alone program that will help getting this done locally. what is your opinion on this kind of problem. Are there any standalone programs now in addition to Duke Sequence Variant Analyzer or SeattleSeq? thank you. -Adrian. From sdavis2 at mail.nih.gov Fri Sep 23 15:33:32 2011 From: sdavis2 at mail.nih.gov (Sean Davis) Date: Fri, 23 Sep 2011 15:33:32 -0400 Subject: [Biopython] annotation help In-Reply-To: References: Message-ID: Hi, Adrian. See: annovar snpEff Ensembl Variant Effect Predictor others.... None of these (or any program that I know of) will include the name of the drug that acts on the gene, but that information can be gleaned from other sources once you have the gene names. If you want to build something from scratch, you could start with this if you are working in cancer: https://wiki.nci.nih.gov/display/ICR/Cancer+Gene+Index+End+User+Documentation There are commercial softwares that offer gene/compound information, but I do not know which is "best". Sean On Fri, Sep 23, 2011 at 3:12 PM, Adrian Johnson wrote: > Hi : > > I have mutation results in VCF format. > > Typically I want to take > > chromosome ? ? position ? ? ? reference base ? ? ? consensus base > > ? ? ?chr21 ? ? ? ? ?30576509 ? ? ? ? ? ? ? ? ?C ? ? ? ? ? ? ? ? ? ? Y (C/T) > > > > From this data: > > 1. I want to find out if this is a missense mutation. > 2. Amino acid change ( VAL to MET) > 3. Protein position > 3. Gene name (KRTAP24) and RefSeq transcript name (NM_****) > 4. Name of drug that acts on this. > > > Is it possible to get such annotation through biopython? > > > Dear Sean: You are very active in both bioconductor and biopython and > you might have worked exome-seq data and worked through this problem. > I could do this kind of stuff using SeattleSeq, however I want to get > a stand-alone program that will help getting this done locally. ? what > is your opinion on this kind of problem. 
Are there any standalone > programs now in addition to Duke Sequence Variant Analyzer or > SeattleSeq? > > > thank you. > > -Adrian. > From oriolebaltimore at gmail.com Fri Sep 23 16:09:01 2011 From: oriolebaltimore at gmail.com (Adrian Johnson) Date: Fri, 23 Sep 2011 16:09:01 -0400 Subject: [Biopython] annotation help In-Reply-To: References: Message-ID: Thanks Sean. I will look into those software you mentioned. -Adrian. On Fri, Sep 23, 2011 at 3:33 PM, Sean Davis wrote: > Hi, Adrian. > > See: > > annovar > snpEff > Ensembl Variant Effect Predictor > others.... > > None of these (or any program that I know of) will include the name of > the drug that acts on the gene, but that information can be gleaned > from other sources once you have the gene names. ?If you want to build > something from scratch, you could start with this if you are working > in cancer: > > https://wiki.nci.nih.gov/display/ICR/Cancer+Gene+Index+End+User+Documentation > > There are commercial softwares that offer gene/compound information, > but I do not know which is "best". > > Sean > > > On Fri, Sep 23, 2011 at 3:12 PM, Adrian Johnson > wrote: >> Hi : >> >> I have mutation results in VCF format. >> >> Typically I want to take >> >> chromosome ? ? position ? ? ? reference base ? ? ? consensus base >> >> ? ? ?chr21 ? ? ? ? ?30576509 ? ? ? ? ? ? ? ? ?C ? ? ? ? ? ? ? ? ? ? Y (C/T) >> >> >> >> From this data: >> >> 1. I want to find out if this is a missense mutation. >> 2. Amino acid change ( VAL to MET) >> 3. Protein position >> 3. Gene name (KRTAP24) and RefSeq transcript name (NM_****) >> 4. Name of drug that acts on this. >> >> >> Is it possible to get such annotation through biopython? >> >> >> Dear Sean: You are very active in both bioconductor and biopython and >> you might have worked exome-seq data and worked through this problem. >> I could do this kind of stuff using SeattleSeq, however I want to get >> a stand-alone program that will help getting this done locally. ? 
what
>> is your opinion on this kind of problem. Are there any standalone
>> programs now in addition to Duke Sequence Variant Analyzer or
>> SeattleSeq?
>>
>>
>> thank you.
>>
>> -Adrian.
>>
>

From nanatrapnest at hotmail.it Mon Sep 26 04:55:24 2011
From: nanatrapnest at hotmail.it (Nana Trapnest)
Date: Mon, 26 Sep 2011 08:55:24 +0000
Subject: [Biopython] help PDB parser
Message-ID:

I have a question... is it possible to save a txt file, or better a matrix, with the coordinates of all the atoms? For example...

ATOM1  C   16  27  32
ATOM2  CA  18  45  55
ATOMN  ..  ... ... ...

and save it as a txt file or as an N*3 matrix? Can you help me, please? :)

From anaryin at gmail.com Mon Sep 26 05:51:54 2011
From: anaryin at gmail.com (João Rodrigues)
Date: Mon, 26 Sep 2011 11:51:54 +0200
Subject: [Biopython] help PDB parser
In-Reply-To:
References:
Message-ID:

Hey Nana,

PDBIO will not be suitable for this. I'd suggest looping through the atoms and printing the information you need in the format you want using regular string formatting options.

Cheers,

João

From fkauff at biologie.uni-kl.de Mon Sep 26 10:54:15 2011
From: fkauff at biologie.uni-kl.de (Frank Kauff)
Date: Mon, 26 Sep 2011 16:54:15 +0200
Subject: [Biopython] align single sequence to MSA
In-Reply-To:
References:
Message-ID: <4E809217.5040004@biologie.uni-kl.de>

Hi,

Yes, clustal can do this easily. If I remember correctly the command line should be something like

clustalw -sequences -profile1=mas_file.fas -profile2=new_sequences.fas

Frank

On 09/12/2011 03:00 AM, Nasos Dousis wrote:
> Hello,
>
> First, thank you to everyone who has contributed to the BioPython
> codebase and to the mailing list.
>
> I have a FASTA sequence, and I'd like to find the optimal alignment of
> that sequence to an MSA. I don't want to alter the MSA-- I just want
> to map the single sequence onto the MSA. Is there a simple way to do
> this by ClustalW or MUSCLE?
>
> Thanks,
> Nasos
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>

From ndousis at gmail.com Tue Sep 27 19:16:16 2011
From: ndousis at gmail.com (Nasos Dousis)
Date: Tue, 27 Sep 2011 16:16:16 -0700
Subject: [Biopython] Biopython Digest, Vol 105, Issue 14
In-Reply-To:
References:
Message-ID:

Frank,

Thanks for your reply and suggestion. I tried that command line with ClustalW2 and a number of variations:

clustalw2 -sequences -profile1=mas_file.aln -profile2=new_sequences.fas (.aln = clustal format)
clustalw2 -sequences -profile1=mas_file.aln -profile2=new_sequences.aln
clustalw2 -sequences -profile1=new_sequences.fas -profile2=mas_file.aln
clustalw2 -sequences -profile2=new_sequences.fas -profile1=mas_file.aln
clustalw2 -profile1=mas_file.aln -profile2=new_sequences.fas
clustalw2 -profile1=mas_file.aln -profile2=mas_file.aln
clustalw2 -profile1=mas_file.aln
etc

and I always get the following error:

==================================================================
[ndousis at linux-machine ~]$ clustalw2 -profile1=mas_file.aln -profile2=new_sequences.aln

CLUSTAL 2.1 Multiple Sequence Alignments

Sequence format is CLUSTAL
ERROR: There are no sequences in profile2 file.
==================================================================

Nevertheless, I implemented a simple version of Needleman-Wunsch to align my sequence to the MSA and choose the highest scoring alignment.

Thanks and kind regards,

Nasos

On Mon, Sep 26, 2011 at 9:00 AM, wrote:
> Message: 3
> Date: Mon, 26 Sep 2011 16:54:15 +0200
> From: Frank Kauff
> Subject: Re: [Biopython] align single sequence to MSA
> To: biopython at lists.open-bio.org
> Message-ID: <4E809217.5040004 at biologie.uni-kl.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi,
>
> Yes, clustal can do this easily.
If I remember correctly the command
> line should be something like
>
> clustalw -sequences -profile1=mas_file.fas -profile2=new_sequences.fas
>
> Frank
>
>
> On 09/12/2011 03:00 AM, Nasos Dousis wrote:
>> Hello,
>>
>> First, thank you to everyone who has contributed to the BioPython
>> codebase and to the mailing list.
>>
>> I have a FASTA sequence, and I'd like to find the optimal alignment of
>> that sequence to an MSA. I don't want to alter the MSA-- I just want
>> to map the single sequence onto the MSA. Is there a simple way to do
>> this by ClustalW or MUSCLE?
>>
>> Thanks,
>> Nasos
>> _______________________________________________
>> Biopython mailing list - Biopython at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython
>>
>
>
>
> ------------------------------
>
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>
>
> End of Biopython Digest, Vol 105, Issue 14
> ******************************************
>

From nanatrapnest at hotmail.it Fri Sep 30 09:14:57 2011
From: nanatrapnest at hotmail.it (Nana Trapnest)
Date: Fri, 30 Sep 2011 13:14:57 +0000
Subject: [Biopython] Information PDB file
Message-ID:

Hello,

do you know if it is possible to overwrite a PDB file with other information, for example atomic coordinates, and save it under another file name? Thanks

From anaryin at gmail.com Fri Sep 30 10:32:44 2011
From: anaryin at gmail.com (João Rodrigues)
Date: Fri, 30 Sep 2011 16:32:44 +0200
Subject: [Biopython] Information PDB file
In-Reply-To:
References:
Message-ID:

Dear Nana,

Yes. Parse the structure as normal, and then just change the values of the fields that you want. For atomic coordinates, you can either set them manually (just change the atom.coord array) or use transform to rotate and translate the atom according to a rotation matrix and a translation vector.
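
As a rough illustration of what rotating and translating a coordinate with a matrix and a vector amounts to, the sketch below does the arithmetic in plain Python. It assumes the row-vector convention new = coord . rot + tran (a right-multiplying rotation), which is believed to be the convention Bio.PDB's transform uses; treat that as an assumption and check the Bio.PDB documentation before relying on it. The helper names transform and rotation_about_z are invented for this example.

```python
import math


def rotation_about_z(angle):
    """Right-multiplying matrix that turns row vectors by `angle` radians
    counter-clockwise about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        [c, s, 0.0],
        [-s, c, 0.0],
        [0.0, 0.0, 1.0],
    ]


def transform(coord, rot, tran):
    """Return coord . rot + tran, treating coord as a row vector."""
    return tuple(
        sum(coord[i] * rot[i][j] for i in range(3)) + tran[j]
        for j in range(3)
    )


# Rotate a point on the x axis by 90 degrees about z, then lift it by 5 along z.
moved = transform((1.0, 0.0, 0.0), rotation_about_z(math.pi / 2), (0.0, 0.0, 5.0))
print("%.3f %.3f %.3f" % moved)
```

The point starts on the x axis, ends up on the y axis after the rotation, and is then shifted along z by the translation vector.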
Then just use PDBIO to save that new structure to another file.

Cheers,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao

2011/9/30 Nana Trapnest
>
> Hello,
> do you know if is possible to overwrite a PDBfile with other information,
> for example, atomic coordinates and saving it with another file name??
> Thanks
>
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>

From marco.galardini at unifi.it Sat Sep 3 10:33:01 2011
From: marco.galardini at unifi.it (Marco Galardini)
Date: Sat, 03 Sep 2011 12:33:01 +0200
Subject: [Biopython] Fwd: Some ideas to add functionalities to Bio.Blast
In-Reply-To: <4E5E12E2.6000201@unifi.it>
References: <4E5E12E2.6000201@unifi.it>
Message-ID: <4E62025D.9040304@unifi.it>

Hi everybody,

I'm sending this e-mail to see whether the Biopython community may be interested in enhancing the features of the Blast package. I've been using it for quite a long time now, and over the months I've created a set of functions that may be useful to other scientists using BLAST through the Biopython interface.

To be a little more specific, it would be desirable to have functions to build a database (makeblastdb), to retrieve sequences from a database (blastdbcmd), and to compute statistics over a whole BLAST run (average e-value, sequence identity, ...), and most importantly some classes to perform specific analyses such as Bi-directional Blast Hits (BBH), possibly genome-wide for comparative genomics purposes.

Do you think features like these would be useful to, or needed by, the Biopython community? I would be really glad to contribute on these issues.
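
For readers unfamiliar with the bi-directional best hit idea mentioned above, a minimal sketch follows. It is not Marco's code and not a Biopython API: it assumes the BLAST results from both directions have already been reduced to (query, subject, bit score) tuples, and the function names best_hits and bidirectional_best_hits are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples;
    on ties the first hit seen is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy data: genome A searched against genome B, and the reverse search.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0),
          ("a2", "b2", 90.0), ("a3", "b1", 80.0)]
b_vs_a = [("b1", "a1", 198.0), ("b2", "a1", 140.0), ("b2", "a2", 145.0)]

for pair in bidirectional_best_hits(a_vs_b, b_vs_a):
    print(pair)
```

Here a3 is dropped because its best hit b1 prefers a1 in the reverse search. A real implementation would also need to decide how to rank hits (bit score versus e-value) and how to handle ties.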
Regards, Marco Galardini -- ------------------------------------------------- Marco Galardini DBE - Department of Evolutionary Biology University of Florence - Italy e-mail: marco.galardini at unifi.it www: http://www.unifi.it/dblage/CMpro-v-p-51.html phone: +39 055 2288249 mobile: +39 340 2808041 ------------------------------------------------- From p.j.a.cock at googlemail.com Sat Sep 3 15:14:03 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sat, 3 Sep 2011 16:14:03 +0100 Subject: [Biopython] Fwd: Some ideas to add functionalities to Bio.Blast In-Reply-To: <4E62025D.9040304@unifi.it> References: <4E5E12E2.6000201@unifi.it> <4E62025D.9040304@unifi.it> Message-ID: On Sat, Sep 3, 2011 at 11:33 AM, Marco Galardini wrote: > Hi everybody, > > i'm sending to you this e-mail in order to see if the BioPython > community may be interested in enhancing the features related to the > Blast package: i've been using it for quite a long time now and over the > months i've created a set of functions that may be useful to other > scientists using Blast through the BioPython interface. > To be a little more specific, it would be desirable to have some > functions to build a database (makeblastdb), to retrieve sequences from > a db (blastdbcmd),... The Bio.Applications module is currently lacking wrappers for those, so adding them would be very welcome: https://github.com/biopython/biopython/blob/master/Bio/Blast/Applications.py Could you sign up to the Biopython developers mailing list which is were code contributions tend to be discussed. Thanks. > to have some statistics from the overall blast run > (average evalue, sequence identiy, ...) and most importantly some > Classes to perform some specific analysis like Bi-directional Blast Hits > (BBH) (which may be even genome-wide for comparative genomics purposes). > Do you think that features like that may be useful/needed by the > BioPython community? I would be really glad to give my contribution on > that issues. 
These latter ideas are harder to generalise into library functions, but if done well could be very widely used. I've written reciprocal best hit BLAST code myself for example (as a Python script for use in Galaxy).

If you've got some code already written that you'd like to share, please do so - even if we decide some of it would be better as a Cookbook example rather than in the core library, it would still be very useful to the community.

Regards,

Peter

From nanatrapnest at hotmail.it Mon Sep 5 09:14:47 2011
From: nanatrapnest at hotmail.it (Nana Trapnest)
Date: Mon, 5 Sep 2011 09:14:47 +0000
Subject: [Biopython] help for Byopython
In-Reply-To: 
References: 
Message-ID: 

Hello,
I am a new user and I'd like to set up a program that allow to rotate protein in the 3D space. I think that PDB parser could be useful and I could use quaternion geometry as well... could you give me some advice? Thank you very much

From p.j.a.cock at googlemail.com Mon Sep 5 09:38:26 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Mon, 5 Sep 2011 10:38:26 +0100
Subject: [Biopython] help for Byopython
In-Reply-To: 
References: 
Message-ID: 

On Mon, Sep 5, 2011 at 10:14 AM, Nana Trapnest wrote:
>
>
> Hello,
> I am a new user and I'd like to set up a program that allow to
> rotate protein in the 3D space. I think that PDB parser could be
> useful and I could use quaternion geometry as well... could you
> give me some advice? Thank you very much

Do you just want to draw pictures of a rotating protein? If so check out VMD, OpenRasMol or the many other PDB file viewers.

You can use Biopython to rotate 3D structures (e.g. PDB files). See for example:
http://www.warwick.ac.uk/go/peter_cock/python/protein_superposition/

Peter

From nmz787 at gmail.com Tue Sep 6 22:58:28 2011
From: nmz787 at gmail.com (Nathan McCorkle)
Date: Tue, 6 Sep 2011 18:58:28 -0400
Subject: [Biopython] Any way to add accessible meta-data/comments to FASTA files?
Message-ID: Wikipedia says the FASTA format can deal with ; as a comment character, software ignoring those lines. Is there a way to get to such comments from bioPython? -- Nathan McCorkle Rochester Institute of Technology College of Science, Biotechnology/Bioinformatics From p.j.a.cock at googlemail.com Tue Sep 6 23:17:03 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 7 Sep 2011 00:17:03 +0100 Subject: [Biopython] Any way to add accessible meta-data/comments to FASTA files? In-Reply-To: References: Message-ID: On Tuesday, September 6, 2011, Nathan McCorkle wrote: > Wikipedia says the FASTA format can deal with ; as a comment > character, software ignoring those lines. Is there a way to get to > such comments from bioPython? No, in practice very few if any tools accept such comment lines, so writing them would be counter productive. Peter From gv1 at sanger.ac.uk Wed Sep 7 09:09:12 2011 From: gv1 at sanger.ac.uk (Giles Velarde) Date: Wed, 7 Sep 2011 10:09:12 +0100 Subject: [Biopython] Any way to add accessible meta-data/comments to FASTA files? In-Reply-To: References: Message-ID: <2C5EEB48-2A58-4450-BEA8-6D9CE0A2B3CC@sanger.ac.uk> GFF3? You can create features in the first part, add any metadata you like them in the 9th column, and have FASTA blocks at the bottom. There's nothing stopping you having 1 feature per FASTA block, I think. Best, Giles. On 7 Sep 2011, at 00:17, Peter Cock wrote: > On Tuesday, September 6, 2011, Nathan McCorkle > wrote: >> Wikipedia says the FASTA format can deal with ; as a comment >> character, software ignoring those lines. Is there a way to get to >> such comments from bioPython? > > No, in practice very few if any tools accept such comment > lines, so writing them would be counter productive. 
>
> Peter
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

From p.j.a.cock at googlemail.com Wed Sep 7 09:45:09 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 7 Sep 2011 10:45:09 +0100
Subject: [Biopython] Any way to add accessible meta-data/comments to FASTA files?
In-Reply-To: <2C5EEB48-2A58-4450-BEA8-6D9CE0A2B3CC@sanger.ac.uk>
References: <2C5EEB48-2A58-4450-BEA8-6D9CE0A2B3CC@sanger.ac.uk>
Message-ID: 

On Wed, Sep 7, 2011 at 10:09 AM, Giles Velarde wrote:
> GFF3? You can create features in the first part, add any metadata you like
> them in the 9th column, and have FASTA blocks at the bottom.
>
> There's nothing stopping you having 1 feature per FASTA block, I think.
>
> Best, Giles.

I'm not sure I'd pick GFF3, but as alternatives to FASTA there are certainly many annotation-supporting sequence file formats out there, old and new (e.g. GenBank and SeqXML are both supported in Biopython).

Peter

From mjldehoon at yahoo.com Wed Sep 7 12:48:22 2011
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Wed, 7 Sep 2011 05:48:22 -0700 (PDT)
Subject: [Biopython] Bio.Unigene.UniGene
Message-ID: <1315399702.38063.YahooMailClassic@web161217.mail.bf1.yahoo.com>

Hi all,

Bio/UniGene/__init__.py contains a parser for UniGene files such as this one:

https://github.com/biopython/biopython/blob/master/Tests/UniGene/Eca.1.2425.data

Bio/UniGene/UniGene.py contains a parser for UniGene data in HTML format. I didn't find an example HTML file, but I somehow doubt that this parser is still relevant today.

Does anybody object to deprecating Bio/UniGene/UniGene.py?
(or should we start with a PendingDeprecationWarning)? Of course, Bio/UniGene/__init__.py will not be deprecated and will remain. Best, --Michiel. From p.j.a.cock at googlemail.com Wed Sep 7 12:57:00 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 7 Sep 2011 13:57:00 +0100 Subject: [Biopython] Bio.Unigene.UniGene In-Reply-To: <1315399702.38063.YahooMailClassic@web161217.mail.bf1.yahoo.com> References: <1315399702.38063.YahooMailClassic@web161217.mail.bf1.yahoo.com> Message-ID: On Wed, Sep 7, 2011 at 1:48 PM, Michiel de Hoon wrote: > Hi all, > > Bio/UniGene/__init__.py contains a parser for UniGene files such as this one: > > https://github.com/biopython/biopython/blob/master/Tests/UniGene/Eca.1.2425.data > > Bio/UniGene/UniGene.py contains a parser for UniGene data in HTML format. I didn't find an example HTML file, but I somehow doubt that this parser is still relevant today. > > Does anybody object against deprecating Bio/UniGene/UniGene.py? > (or should we start with a PendingDeprecationWarning)? > Of course, Bio/UniGene/__init__.py will not be deprecated and will remain. I didn't realise we still had any HTML parsers left. Since there is an easier to parse plain text format, then absolutely we should phase out the HTML parser. Peter From youngcsong at gmail.com Wed Sep 7 22:36:26 2011 From: youngcsong at gmail.com (Young Song) Date: Wed, 7 Sep 2011 15:36:26 -0700 Subject: [Biopython] How do I retrieve information regarding isolation_source and clones using Entrez Message-ID: Hi, I have spent few hours reading the Biopython manual, and I am currently trying to write a script that can retrieve information regarding isolation_source and clone name for certain sequences. 
This is the code that I have now: >>from Bio import Entrez >>Entrez.email = "youngcsong at gmail.com" >>entrez_handle = Entrez.efetch(db="protein", id="ADJ51069", retmode="xml") >>entrez_record = Entrez.read(entrez_handle) >>print entrez_record[0].keys() Then I get the following: [u'GBSeq_moltype', u'GBSeq_source', u'GBSeq_sequence', u'GBSeq_primary-accession', u'GBSeq_definition', u'GBSeq_accession-version', u'GBSeq_topology', u'GBSeq_length', u'*GBSeq_feature-table*', u'GBSeq_create-date', u'GBSeq_other-seqids', u'GBSeq_division', u'GBSeq_taxonomy', u'GBSeq_comment', u'GBSeq_source-db', u'GBSeq_references', u'GBSeq_update-date', u'GBSeq_organism', u'GBSeq_locus'] I used the key, "GBSeq_feature-table" to see what sort of values are stored here, >>print records[0]["GBSeq_feature-table"] Then I get following, which seems rather confusing: [{u'GBFeature_quals': [{u'GBQualifier_name': 'organism', u'GBQualifier_value': 'uncultured prokaryote'}, *{u'GBQualifier_name': 'isolation_source', u'GBQualifier_value': 'contaminated river sediment'}*, {u'GBQualifier_name': 'db_xref', u'GBQualifier_value': 'taxon:198431'}, *{u'GBQualifier_name': 'clone', u'GBQualifier_value': '**Arthur_Kill_OTU4'}*, {u'GBQualifier_name': 'environmental_sample'}, {u'GBQualifier_name': 'country', u'GBQualifier_value': 'USA: New Jersey'}], u'GBFeature_key': 'source', u'GBFeature_intervals': [{u'GBInterval_from': '1', u'GBInterval_to': '218', u'GBInterval_accession': 'ADJ51069.1'}], u'GBFeature_location': '1..218'}, {u'GBFeature_quals': [{u'GBQualifier_name': 'product', u'GBQualifier_value': 'alkylsuccinate synthase'}], u'GBFeature_intervals': [{u'GBInterval_from': '1', u'GBInterval_to': '218', u'GBInterval_accession': 'ADJ51069.1'}], u'GBFeature_location': '<1..>218', u'GBFeature_key': 'Protein', u'GBFeature_partial5': StringElement('', attributes={u'value': u'true'}), u'GBFeature_partial3': StringElement('', attributes={u'value': u'true'})}, {u'GBFeature_quals': [{u'GBQualifier_name': 'region_name', 
u'GBQualifier_value': 'RNR_PFL'}, {u'GBQualifier_name': 'note', u'GBQualifier_value': 'Ribonucleotide reductase and Pyruvate formate lyase; cl09939'}, {u'GBQualifier_name': 'db_xref', u'GBQualifier_value': 'CDD:186877'}], u'GBFeature_intervals': [{u'GBInterval_from': '1', u'GBInterval_to': '218', u'GBInterval_accession': 'ADJ51069.1'}], u'GBFeature_location': '<1..>218', u'GBFeature_key': 'Region', u'GBFeature_partial5': StringElement('', attributes={u'value': u'true'}), u'GBFeature_partial3': StringElement('', attributes={u'value': u'true'})}, {u'GBFeature_quals': [{u'GBQualifier_name': 'gene', u'GBQualifier_value': 'assA'}, {u'GBQualifier_name': 'coded_by', u'GBQualifier_value': 'GU453639.1:<1..>658'}, {u'GBQualifier_name': 'codon_start', u'GBQualifier_value': '3'}, {u'GBQualifier_name': 'transl_table', u'GBQualifier_value': '11'}], u'GBFeature_intervals': [{u'GBInterval_from': '1', u'GBInterval_to': '218', u'GBInterval_accession': 'ADJ51069.1'}], u'GBFeature_location': '1..218', u'GBFeature_key': 'CDS', u'GBFeature_partial5': StringElement('', attributes={u'value': u'true'}), u'GBFeature_partial3': StringElement('', attributes={u'value': u'true'})}] It seems like there is some attributes called GBQualifier_name and GBQualifier_value, but I am not sure how to utilize these attributes to get the bolded values (i.e. contaminated river sediment and Arthur_Kill_OTU4). Your help here would be very much appreciated. Thank you in advance. Young -- Young C. 
Song Masters Student Graduate Program in Bioinformatics The University of British Columbia Department of Microbiology and Immunology 2350 Health Science Mall Vancouver, BC V6T 1Z4, Canada From p.j.a.cock at googlemail.com Thu Sep 8 07:41:14 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 8 Sep 2011 08:41:14 +0100 Subject: [Biopython] Bio.Unigene.UniGene In-Reply-To: References: <1315399702.38063.YahooMailClassic@web161217.mail.bf1.yahoo.com> Message-ID: On Wednesday, September 7, 2011, Peter Cock wrote: > On Wed, Sep 7, 2011 at 1:48 PM, Michiel de Hoon wrote: >> Hi all, >> >> Bio/UniGene/__init__.py contains a parser for UniGene files such as this one: >> >> https://github.com/biopython/biopython/blob/master/Tests/UniGene/Eca.1.2425.data >> >> Bio/UniGene/UniGene.py contains a parser for UniGene data in HTML format. I didn't find an example HTML file, but I somehow doubt that this parser is still relevant today. >> >> Does anybody object against deprecating Bio/UniGene/UniGene.py? >> (or should we start with a PendingDeprecationWarning)? >> Of course, Bio/UniGene/__init__.py will not be deprecated and will remain. > > I didn't realise we still had any HTML parsers left. > Since there is an easier to parse plain text format, then > absolutely we should phase out the HTML parser. > > Peter > Something in the recent change has upset 2to3, e.g. 
http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.2/builds/282/steps/compile/logs/stdio

Peter

From mjldehoon at yahoo.com Thu Sep 8 13:29:59 2011
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Thu, 8 Sep 2011 06:29:59 -0700 (PDT)
Subject: [Biopython] How do I retrieve information regarding isolation_source and clones using Entrez
In-Reply-To: 
Message-ID: <1315488599.55360.YahooMailClassic@web161212.mail.bf1.yahoo.com>

> I used the key, "GBSeq_feature-table" to see what sort of
> values are stored
> here,
>
> >>print records[0]["GBSeq_feature-table"]
>
> Then I get following, which seems rather confusing:
>
> [{u'GBFeature_quals': [{u'GBQualifier_name': 'organism',
> u'GBQualifier_value': 'uncultured prokaryote'},
> {u'GBQualifier_name':
> 'isolation_source', u'GBQualifier_value': 'contaminated
> river sediment'},
> {u'GBQualifier_name': ...

As records[0]["GBSeq_feature-table"] starts with a '[', it is a Python list, and you can use it as such. Try for example

>>> len(records[0]["GBSeq_feature-table"])

or

>>> records[0]["GBSeq_feature-table"][0]

Similarly, if you see something that starts with a '{', it is a dictionary.

Best,
--Michiel.

From p.j.a.cock at googlemail.com Thu Sep 8 14:10:25 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Thu, 8 Sep 2011 15:10:25 +0100
Subject: [Biopython] Bio.Unigene.UniGene
In-Reply-To: 
References: <1315399702.38063.YahooMailClassic@web161217.mail.bf1.yahoo.com>
Message-ID: 

On Thu, Sep 8, 2011 at 8:41 AM, Peter Cock wrote:
>
> Something in the recent change has upset 2to3, e.g.
> http://testing.open-bio.org/biopython/builders/Linux%2064%20-%20Python%203.2/builds/282/steps/compile/logs/stdio > Looked like a new line problem, fixed: https://github.com/biopython/biopython/commit/11c33d25f2560f37c6b5457243205ec6186ebd45 Peter From pawan.mani2 at gmail.com Fri Sep 9 17:04:22 2011 From: pawan.mani2 at gmail.com (kakchingtabam pawankumar sharma) Date: Fri, 9 Sep 2011 22:34:22 +0530 Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box. Message-ID: Dear sir, I have been facing a problem in removing biopython from ubantu which i am using virtually using oracle VM vertualBox software. i have python2.7. but i hav instal python3 using apt-get command. and then i install biopython using aptget . then i could not able to import biopython in python3. only it works in python2.7. so i have remove python2.7 using sudo rm command. then i uninstall biopython using apt-get remove. then i type the command: $ dpkg --list | grep 'biopython' then i got : rF python-biopython 1.56-l Python library for Bioinformatics ii python-biopython 1.56-l Documentation for the biopython library How to remove this two files and I have downloanded biopython tar file but before completing i have stop and when i locate biopythone. there is a file biopython-1.58.tar.gz.par. i want to delete this file. I would like to know whether biopython is compatible with python.3.2. if yes then i want to install Biopython for the two python version 2.7.2 and 3.2. Kindly guide me to remove python3 also and I want to reinstall the two python version along with biopython. so that i ca used in both the version. 
With best regards,
pawankumar

From p.j.a.cock at googlemail.com Sat Sep 10 11:03:41 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 10 Sep 2011 12:03:41 +0100
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
In-Reply-To: 
References: 
Message-ID: 

On Fri, Sep 9, 2011 at 6:04 PM, kakchingtabam pawankumar sharma wrote:
> Dear sir,
>     I have been facing a problem in removing biopython from ubantu which i
> am using virtually using oracle VM vertualBox software. i have python2.7.
> but i have instal python3 using apt-get command. and then i install biopython
> using aptget

I believe the Ubuntu package for Biopython will be for Python 2 (whichever Python 2 is standard for that version of Ubuntu).

> . then i could not able to import biopython in python3. only it works in
> python2.7. so i have remove python2.7 using sudo rm command.

Was Python 2.7 installed by an Ubuntu package, or by you from source? If you install things from packages it is best to remove them using the package manager (apt-get remove here).

> then i
> uninstall biopython using apt-get remove. then i type the command:
>
>      $ dpkg --list | grep 'biopython'
> then i got :
>
> rF python-biopython           1.56-l     Python library for Bioinformatics
> ii python-biopython           1.56-l     Documentation for the biopython
> library
>
>
>
> How to remove this two files and I have downloanded biopython tar file but
> before completing i have stop and when i locate biopythone. there is a file
> biopython-1.58.tar.gz.par. i want to delete this file.

I'm confused about what you've done with your system.

> I would like to know whether biopython is compatible with python.3.2.
>

Somewhat. Most things work but at this point we only support Biopython on Python 2.5, 2.6 and 2.7.
You are welcome to help test on Python 3, and fix bugs - but for a beginner I would recommend sticking with Biopython on Python 2 for now.

> if yes then i want to install Biopython for the two python version 2.7.2
> and 3.2.
>
> Kindly guide me to remove python3 also and I want to reinstall the two
> python version along with biopython. so that i ca used in both the version.

If you installed Python 3 using the Ubuntu packages, then use the package manager to uninstall it.

If you installed any copies of Biopython from source, you must manually remove them.

Normally Python libraries are installed separately for Python 2.4, 2.5, 2.6, 2.7, 3.0, 3.1, 3.2 etc. You can install Biopython for Python 2.7, and also install Biopython for Python 3.2 - this will result in two copies in the system libraries. That is normal.

I hope that helps.

Peter

From eric.talevich at gmail.com Sat Sep 10 14:29:12 2011
From: eric.talevich at gmail.com (Eric Talevich)
Date: Sat, 10 Sep 2011 10:29:12 -0400
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
In-Reply-To: 
References: 
Message-ID: 

Pawankumar,

Try this series of commands:

sudo aptitude reinstall python
sudo aptitude remove python-biopython
sudo aptitude install python-setuptools
sudo aptitude install python3-setuptools
sudo easy_install biopython
sudo easy_install3 biopython

This should fix your current Python 2.7 installation, which must be installed correctly for most of Ubuntu to work.

Then it installs the setuptools package for both Python 2.7 and Python 3, which includes the program "easy_install". As Peter mentioned, each Python version keeps its own set of installed libraries, so you need to install Biopython separately for each Python version you want to use.

Don't use apt-get to install Python packages, in general.
Use the easy_install command instead -- easy_install for the default Python version (2.7), easy_install3 for Python 3. Cheers, Eric On Fri, Sep 9, 2011 at 1:04 PM, kakchingtabam pawankumar sharma < pawan.mani2 at gmail.com> wrote: > Dear sir, > I have been facing a problem in removing biopython from ubantu which i > am > using virtually using oracle VM vertualBox software. i have python2.7. but > i > hav > instal python3 using apt-get command. and then i install biopython using > aptget > . then i could not able to import biopython in python3. only it works in > python2.7. so i have remove python2.7 using sudo rm command. then i > uninstall > biopython using apt-get remove. then i type the command: > > $ dpkg --list | grep 'biopython' > then i got : > > rF python-biopython 1.56-l Python library for Bioinformatics > ii python-biopython 1.56-l Documentation for the biopython > library > > > > How to remove this two files and I have downloanded biopython tar file but > before completing i have stop and when i locate biopythone. there is a file > biopython-1.58.tar.gz.par. i want to delete this file. > > > I would like to know whether biopython is compatible with python.3.2. > > if yes then i want to install Biopython for the two python version 2.7.2 > and > 3.2. > > Kindly guide me to remove python3 also and I want to reinstall the two > python > version along with biopython. so that i ca used in both the version. > > With best regards, > pawankumar > _______________________________________________ > Biopython mailing list - Biopython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython > From pawan.mani2 at gmail.com Sat Sep 10 17:09:06 2011 From: pawan.mani2 at gmail.com (kakchingtabam pawankumar sharma) Date: Sat, 10 Sep 2011 22:39:06 +0530 Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box. 
In-Reply-To: References: Message-ID: Thanks for your reply. Dear Eric,I have reinstall the ubantu using oracle VM virtual box. rit now in my system i have python2.6 so instead of python2.6, i would like to have python 2.7 and 3.1. for this two separate biopython also. so kindly tel me the *required exact commands to instal*l in my system. Dear Pete,r u mean to say is that python 3 is not possible to use biopython package. but i was reading a book called PYTHON FOR BIOINFORMATICS 2010 edition. this author taught us using biopython in python3 version. they kept emphasising the reader to learn python3 version. With best regards, Pawankumar sharma On Sat, Sep 10, 2011 at 7:59 PM, Eric Talevich wrote: > Pawankumar, > > Try this series of commands: > > sudo aptitude reinstall python > sudo aptitude remove python-biopython > sudo aptitude install python-setuptools > sudo aptitude install python3-setuptools > sudo easy_install biopython > sudo easy_install3 biopython > > > This should fix your current Python 2.7 installation, which must be > installed correctly for most of Ubuntu to work. > > The it installs the setuptools package for both Python 2.7 and Python 3, > which includes the program "easy_install". As Peter mentioned, each Python > version keeps its own set of installed libraries, so you need to install > Biopython separately for each Python version you want to use. > > Don't use apt-get to install Python packages, in general. Use the > easy_install command instead -- easy_install for the default Python version > (2.7), easy_install3 for Python 3. > > Cheers, > Eric > > > On Fri, Sep 9, 2011 at 1:04 PM, kakchingtabam pawankumar sharma < > pawan.mani2 at gmail.com> wrote: > >> Dear sir, >> I have been facing a problem in removing biopython from ubantu which i >> am >> using virtually using oracle VM vertualBox software. i have python2.7. but >> i >> hav >> instal python3 using apt-get command. and then i install biopython using >> aptget >> . 
then i could not able to import biopython in python3. only it works in >> python2.7. so i have remove python2.7 using sudo rm command. then i >> uninstall >> biopython using apt-get remove. then i type the command: >> >> $ dpkg --list | grep 'biopython' >> then i got : >> >> rF python-biopython 1.56-l Python library for Bioinformatics >> ii python-biopython 1.56-l Documentation for the biopython >> library >> >> >> >> How to remove this two files and I have downloanded biopython tar file but >> before completing i have stop and when i locate biopythone. there is a >> file >> biopython-1.58.tar.gz.par. i want to delete this file. >> >> >> I would like to know whether biopython is compatible with python.3.2. >> >> if yes then i want to install Biopython for the two python version 2.7.2 >> and >> 3.2. >> >> Kindly guide me to remove python3 also and I want to reinstall the two >> python >> version along with biopython. so that i ca used in both the version. >> >> With best regards, >> pawankumar >> _______________________________________________ >> Biopython mailing list - Biopython at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython >> > > From eric.talevich at gmail.com Sat Sep 10 17:34:06 2011 From: eric.talevich at gmail.com (Eric Talevich) Date: Sat, 10 Sep 2011 13:34:06 -0400 Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box. In-Reply-To: References: Message-ID: On Sat, Sep 10, 2011 at 1:09 PM, kakchingtabam pawankumar sharma < pawan.mani2 at gmail.com> wrote: > > Dear Eric,I have reinstall the ubantu using oracle VM virtual box. rit now > in my system i have python2.6 so instead of python2.6, i would like to have > python 2.7 and 3.1. for this two separate biopython also. so kindly tel me > the *required exact commands to instal*l in my system. > I assume you're new to Ubuntu. 
If the default Python installation is 2.6, I recommend you use that instead of 2.7 for now. Using Python 3 is still OK.

Do this:

sudo aptitude install python-setuptools python3 python3-setuptools

sudo easy_install biopython
sudo easy_install3 biopython

If you really want to use 2.7 also, then figuring out how to do that will be a fun exercise for you and you'll learn more about managing your system.

Cheers,
Eric

From pawan.mani2 at gmail.com Sat Sep 10 17:41:14 2011
From: pawan.mani2 at gmail.com (kakchingtabam pawankumar sharma)
Date: Sat, 10 Sep 2011 23:11:14 +0530
Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box.
In-Reply-To: 
References: 
Message-ID: 

DEar Eri, yap I am new to ubantu linux and python. I show in biopython website. the latest biothon does not support python 3.2.

How to upgrade to python2.6 to 2.7. and i want install biopython for python 2.7 and i want python3.2 also.

what are the possible commands upgration to python.2.7 and biopython for this. above this commands for installing oython3.2.

thanks for replying very soon.

with best regards,
Pawan

On Sat, Sep 10, 2011 at 11:04 PM, Eric Talevich wrote:

> On Sat, Sep 10, 2011 at 1:09 PM, kakchingtabam pawankumar sharma <
> pawan.mani2 at gmail.com> wrote:
>
>>
>> Dear Eric,I have reinstall the ubantu using oracle VM virtual box. rit now
>> in my system i have python2.6 so instead of python2.6, i would like to have
>> python 2.7 and 3.1. for this two separate biopython also. so kindly tel me
>> the *required exact commands to instal*l in my system.
>>
>
> I assume you're new to Ubuntu. If the default Python installation is 2.6, I
> recommend you use that instead of 2.7 for now. Using Python 3 is still OK.
> > Do this: > > sudo aptitude install python-setuptools python3 python3-setuptools > > sudo easy_install biopython > sudo easy_install3 biopython > > > If you really want to use 2.7 also, then figuring out how to do that will > be a fun exercise for you and you'll learn more about managing your system. > > Cheers, > Eric > > From p.j.a.cock at googlemail.com Sat Sep 10 18:53:44 2011 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sat, 10 Sep 2011 19:53:44 +0100 Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box. In-Reply-To: References: Message-ID: On Saturday, September 10, 2011, kakchingtabam pawankumar sharma wrote: > Dear Pete,r u mean to say is that python 3 is not possible to use biopython > package. but i was reading a book called PYTHON FOR BIOINFORMATICS 2010 > edition. this author taught us using biopython in python3 version. they > kept emphasising the reader to learn python3 version. Which book is that please? e.g. Author name? Publisher? I'm a bit surprised there is a book out recommending using Biopython with Python 3 (given we don't officially support it yet, and don't even provide Windows installlers for Biopython under Python 3). We're still in a transition period where not all the major libraries have been converted from Python 2 to Python 3, so if you want/need to use extra Python libraries you shouldn't automatically start with Python 3. Peter From eric.talevich at gmail.com Sat Sep 10 19:36:42 2011 From: eric.talevich at gmail.com (Eric Talevich) Date: Sat, 10 Sep 2011 15:36:42 -0400 Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box. In-Reply-To: References: Message-ID: On Sat, Sep 10, 2011 at 1:41 PM, kakchingtabam pawankumar sharma < pawan.mani2 at gmail.com> wrote: > DEar Eri, yap I am new to ubantu linux and python. 
I show in biopython > website. the latest biothon does not support python 3.2. > > How to upgrade to python2.6 to 2.7. and i want install biopython for python > 2.7 and i want python3.2 also. > Since you're new to Ubuntu and Biopython, I recommend just using the versions you've installed already. Python 2.6 and 3.1 will meet your needs, and the more weird stuff you do to your system, the more likely you are to have problems later. Just try using the versions you have for a while, first. When you upgrade Ubuntu to the latest version (11.04 or 11.10 -- you're on 10.10 right now, I think), both Python versions will be upgraded automatically. If you have deeper questions about managing your Python installation on Ubuntu, or upgrading Ubuntu itself, please consult either the Ubuntu wiki or Python.org. Best, Eric From pawan.mani2 at gmail.com Sat Sep 10 19:46:30 2011 From: pawan.mani2 at gmail.com (kakchingtabam pawankumar sharma) Date: Sun, 11 Sep 2011 01:16:30 +0530 Subject: [Biopython] request to solve unistallation and installation biopython in two version of python under ubantu linux vertually in window7 using Oracle box. In-Reply-To: References: Message-ID: Thanks U two for ur kind reply i ll upgrade the ubatu. then i may have python 2.7. On Sun, Sep 11, 2011 at 1:06 AM, Eric Talevich wrote: > On Sat, Sep 10, 2011 at 1:41 PM, kakchingtabam pawankumar sharma < > pawan.mani2 at gmail.com> wrote: > >> DEar Eri, yap I am new to ubantu linux and python. I show in biopython >> website. the latest biothon does not support python 3.2. >> >> How to upgrade to python2.6 to 2.7. and i want install biopython for >> python 2.7 and i want python3.2 also. >> > > Since you're new to Ubuntu and Biopython, I recommend just using the > versions you've installed already. Python 2.6 and 3.1 will meet your needs, > and the more weird stuff you do to your system, the more likely you are to > have problems later. Just try using the versions you have for a while, > first. 
> > When you upgrade Ubuntu to the latest version (11.04 or 11.10 -- you're on > 10.10 right now, I think), both Python versions will be upgraded > automatically. > > If you have deeper questions about managing your Python installation on > Ubuntu, or upgrading Ubuntu itself, please consult either the Ubuntu wiki or > Python.org. > > Best, > Eric > From ndousis at gmail.com Mon Sep 12 01:00:56 2011 From: ndousis at gmail.com (Nasos Dousis) Date: Sun, 11 Sep 2011 18:00:56 -0700 Subject: [Biopython] align single sequence to MSA Message-ID: Hello, First, thank you to everyone who has contributed to the BioPython codebase and to the mailing list. I have a FASTA sequence, and I'd like to find the optimal alignment of that sequence to an MSA. I don't want to alter the MSA-- I just want to map the single sequence onto the MSA. Is there a simple way to do this by ClustalW or MUSCLE? Thanks, Nasos From steven.irvin at monsanto.com Tue Sep 13 20:51:03 2011 From: steven.irvin at monsanto.com (IRVIN, STEVEN (AG-Contractor/1000)) Date: Tue, 13 Sep 2011 20:51:03 +0000 Subject: [Biopython] Translations for BLAST hits "on the fly" using biopython Message-ID: <8F46CBF672774F4C8A6B288A246B4468A8209F@stlwexmbxprd02.na.ds.monsanto.com> Hell All, Is there a quick and easy way to obtain the (optimal) translations from nucleotide sequence [query or subject (db)] results ("hits"/HSP) derived from BLAST+ programs such as tblastn using BioPython? Steve Steven D Irvin, MS Bioinformatics Analyst [cid:image003.png at 01CC1925.F25B8430]CC214-A Monsanto Research Center Chesterfield Village, MO Steven.Irvin at monsanto.com (636) 737-1980 This e-mail message may contain privileged and/or confidential information, and is intended to be received only by persons entitled to receive such information. If you have received this e-mail in error, please notify the sender immediately. Please delete it and all attachments from any servers, hard drives or any other media. 
From p.j.a.cock at googlemail.com Wed Sep 14 21:09:07 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 14 Sep 2011 22:09:07 +0100
Subject: [Biopython] Fwd: NETTAB 2011 on Clinical Bioinformatics: Call for posters and participation
In-Reply-To: <201109141346.p8EDk67h021261@clus2.istge.it>
References: <201109141346.p8EDk67h021261@clus2.istge.it>
Message-ID:

Hopefully this is of interest to some of you...

---------- Forwarded message ----------
From: Paolo Romano
Date: Wed, Sep 14, 2011 at 2:46 PM
Subject: NETTAB 2011 on Clinical Bioinformatics: Call for posters and participation
To: biopython-owner at lists.open-bio.org

Dear list owner,

I would be glad if you would forward this message to the list. Many thanks in advance.

Ciao.
Paolo

====

Dear all,

I'm glad to remind you of the upcoming NETTAB 2011 workshop on Clinical Bioinformatics and to send you the related Call for submissions of posters.
A Supplement of BMC Bioinformatics will later be published with a selection of the best extended and revised versions of contributions, both oral communications and posters, presented at the workshop.

If you are interested, as I hope, you are also welcome to subscribe to the low-traffic list nettab-announce at istge.it. It is a moderated mailing list, devoted exclusively to announcements related to the NETTAB workshops, a series of events on new ICT tools and their applicability to biological research and bioinformatics that are held annually in Italy. Please subscribe at http://www.nettab.org/listsub.php .

Best regards.
Paolo Romano

====

NETTAB 2011 on "Clinical Bioinformatics"
October 12-14, 2011
Collegio Ghislieri, Pavia, Italy
http://www.nettab.org/2011/

NETTAB 2011 is the eleventh in a series of international workshops on Network Tools and Applications in Biology. NETTAB 2011 is focused on Clinical Bioinformatics and it is aimed at presenting the methods, tools and infrastructures that are nowadays available. It will also show some of the most interesting applications in this field.

PROGRAMME

Invited Speakers

+ Network Models of Mesophenotypes in Personal Genomics and Targeted Therapies
  Yves A.
LUSSIER, Institute for Translational Medicine, University of Chicago, USA
+ From Omics to Systems Biology - an Approach to Individualized Medicine
  Thomas ILLIG, Institute of Epidemiology, Helmholtz Zentrum München, Neuherberg, Germany
+ ICT Architectures for Biobanks to Support Clinical Research
  Jan-Eric LITTON, Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
+ Computing our Patients' Future Using Data from our Healthcare Institutions
  Shawn N MURPHY, Harvard Medical School, Boston, USA

Special Sessions

+ Why clinicians need e-Health
  Terry HANNAN, University of Tasmania, Launceston, Australia
+ Interoperability - HL7
  Amnon SHABO, IBM Research Lab, Haifa, Israel

Tutorials

+ Data Warehouse
  Carlo COMBI, Department of Computer Science, University of Verona, Verona, Italy
+ Natural Language Processing
  Pierre ZWEIGENBAUM, LIMSI - CNRS, Orsay, France
+ Search Computing
  Marco MASSEROLI (to be confirmed), Dipartimento di Elettronica e Informazione, Politecnico di Milano, Milan, Italy

CALL FOR POSTERS

Submitted contributions should address one or more of the following topics:

- Methods and Technologies:
  Data warehouse - Natural Language Processing - Data Mining - Ontologies, Interoperability, Standardisation - Search Computing - Decision Support
- Tools:
  ICT architectures to support - Clinical research . BioBanks . Software for next generation sequencing - Data management and storage
- Applications:
  Risk assessment . Diagnosis . Therapy planning - Drug Design

Contributions length:
- posters and demoes: no more than 3 pages

Deadlines:
  September 16, 2011: Posters and demoes submission
  A slight delay may be accepted.

Submit your contribution through the EasyChair system at http://www.easychair.org/conferences/?conf=nettab2011

All contributions submitted to NETTAB 2011 will be invited to a restricted Call for full research papers to be published in a Supplement of BMC Bioinformatics.
REGISTRATION

You can register to the NETTAB 2011 workshop by using the form at http://www.nettab.org/2011/rform.html
Early registration ends on September 30, 2011.

CONTACTS

Visit the website http://www.nettab.org/2011/
Contact the organization by sending an email message to nettab2011 at unipv.it .

Best regards.
Paolo Romano, on behalf of the Conference Chairs

Paolo Romano (paolo.romano at istge.it)
Bioinformatics
National Cancer Research Institute (IST)
http://www.nettab.org/ NETTAB Workshops. Stay tuned!

From abhishek.vit at gmail.com Fri Sep 16 22:42:37 2011
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Fri, 16 Sep 2011 15:42:37 -0700
Subject: [Biopython] comparing bam files
Message-ID:

Hi All

This is my first post to the biopython mailing list. Basically I am new to both Python and BioP.

So I have two bam files one contains the properly paired reads (file A) and the other has some of the singletons (file B) either (read 1 / read 2).

I have to find the mates of all the singletons from the properly paired bam file (file A) and then generate a bam file (file C) which has all the proper pairs for all the singletons I had.

PS: Also the file A is guaranteed to have all the pairs which might exist as a singleton in file B.

I want to do this on the binary files and avoid reading in the sam files. Is that something I can do using some of the bam readers in biopython ?

Thanks!
-Abhi

From p.j.a.cock at googlemail.com Sat Sep 17 21:44:03 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 17 Sep 2011 22:44:03 +0100
Subject: [Biopython] comparing bam files
In-Reply-To:
References:
Message-ID:

On Fri, Sep 16, 2011 at 11:42 PM, Abhishek Pratap wrote:
> Hi All
>
> This is my first post to the biopython mailing list. Basically I am
> new to both Python and BioP.
>

Welcome.

>
> So I have two bam files one contains the properly paired reads (file
> A) and the other has some of the singletons (file B) either (read 1 /
> read 2).
>
> I have to find the mates of all the singletons from the properly
> paired bam file (file A) and then generate a bam file (file C) which
> has all the proper pairs for all the singletons I had.
>
> PS: Also the file A is guaranteed to have all the pairs which might
> exist as a singleton in file B.

I don't understand what you're trying to do - why are there singletons in file A if that is the file of properly paired reads?

> I want to do this on the binary files and avoid reading in the sam
> files. Is that something I can do using some of the bam readers in
> biopython ?

Biopython doesn't have a SAM/BAM interface, instead there is pysam which binds the samtools C API:

http://code.google.com/p/pysam/

Try that.

Peter

From abhishek.vit at gmail.com Tue Sep 20 21:46:15 2011
From: abhishek.vit at gmail.com (Abhishek Pratap)
Date: Tue, 20 Sep 2011 14:46:15 -0700
Subject: [Biopython] comparing bam files
In-Reply-To:
References:
Message-ID:

Thanks for the reply Peter. I know my requirement sure does sound confusing but this is something we need to do in order to extract the reads which are stranded. In our case we want the reads where read 1 maps to same strand and read 2 on the other strand and eliminate the cases where read 2 falls on the same strand and read 1 on the opposite strand.

I am looking into pysam to see if it can help me . Does pysam have another mailing list or this is the right forum to ask pysam related questions ? I am sure I will have some as soon as I begin poking into it.

-Abhi

On Sat, Sep 17, 2011 at 2:44 PM, Peter Cock wrote:
> On Fri, Sep 16, 2011 at 11:42 PM, Abhishek Pratap
> wrote:
> > Hi All
> >
> > This is my first post to the biopython mailing list. Basically I am
> > new to both Python and BioP.
> >
>
> Welcome.
>
> >
> > So I have two bam files one contains the properly paired reads (file
> > A) and the other has some of the singletons (file B) either (read 1 /
> > read 2).
> >
> > I have to find the mates of all the singletons from the properly
> > paired bam file (file A) and then generate a bam file (file C) which
> > has all the proper pairs for all the singletons I had.
> >
> > PS: Also the file A is guaranteed to have all the pairs which might
> > exist as a singleton in file B.
>
> I don't understand what you're trying to do - why are there
> singletons in file A if that is the file of properly paired reads?
>
> > I want to do this on the binary files and avoid reading in the sam
> > files. Is that something I can do using some of the bam readers in
> > biopython ?
>
> Biopython doesn't have a SAM/BAM interface, instead there
> is pysam which binds the samtools C API:
>
> http://code.google.com/p/pysam/
>
> Try that.
>
> Peter
>

From p.j.a.cock at googlemail.com Tue Sep 20 22:35:36 2011
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 20 Sep 2011 23:35:36 +0100
Subject: [Biopython] comparing bam files
In-Reply-To:
References:
Message-ID:

On Tue, Sep 20, 2011 at 10:46 PM, Abhishek Pratap wrote:
> Thanks for the reply Peter. I know my requirement sure does sound confusing but
> this is something we need to do in order to extract the reads which are
> stranded. In our case we want the reads where read 1 maps to same strand and
> read 2 on the other strand and eliminate the cases where read 2 falls on the
> same strand and read 1 on the opposite strand.

Do you have something backwards, or perhaps I should go to sleep now... it sounds like you are talking about paired reads here (read 1 and read 2).

It would be normal for Sanger (capillary) or Illumina paired end (or Illumina mate pairs) reads to map to opposite strands. Technically Roche paired end reads would map to the same strand (due to the way they sequence over the boundary of a circularised fragment) but I'm not 100% sure if any read aligners/assemblers reflect this.
I know that sff_extract and MIRA flip one of the Roche 454 reads so that they act like classical Sanger or Illumina paired ends.

> I am looking into pysam to see if it can help me . Does pysam have another
> mailing list or this is the right forum to ask pysam related questions ? I
> am sure I will have some as soon as I begin poking into it.
> -Abhi

There are a few people on the Biopython lists who do use pysam, and might be able to help, but pysam has a separate mailing list.

Peter

From nanatrapnest at hotmail.it Wed Sep 21 14:34:23 2011
From: nanatrapnest at hotmail.it (Nana Trapnest)
Date: Wed, 21 Sep 2011 14:34:23 +0000
Subject: [Biopython] Help for PDBParser
Message-ID:

Hello,
I'd like to know how to print structure of a protein using Biopython, I installed Python and Biopython, but where I get the proteins? I use this

from Bio.PDB import *
parser=PDBParser()
structure=parser.get_structure("Tripsina", "2PTC.pdb")
print structure

but there is an error...

Traceback (most recent call last):
  File "C:/Documents and Settings/Stefania/Desktop/PITON/prova", line 4, in <module>
    structure=parser.get_structure("Tripsina", "2PTC.pdb")
  File "C:\Python27\lib\site-packages\Bio\PDB\PDBParser.py", line 77, in get_structure
    file=open(file)
IOError: [Errno 2] No such file or directory: '2PTC.pdb'

Can you help me please??? Where I find 2PTC.pdb??? Thanks

From mikael.trellet at gmail.com Wed Sep 21 14:47:42 2011
From: mikael.trellet at gmail.com (Mikael Trellet)
Date: Wed, 21 Sep 2011 16:47:42 +0200
Subject: [Biopython] Help for PDBParser
In-Reply-To:
References:
Message-ID:

The second argument you give to the get_structure function has to be the path to your PDB file. The error means that there is no file named 2PTC.pdb in the directory the script is run from. You can download this PDB file from the PDB database: http://www.rcsb.org/pdb/home/home.do

Moreover, your "print structure" will only print the structure object itself, like this: "<Structure id=Tripsina>"

You will have to iterate over it to print models, chains, residues and/or atoms to get something understandable!
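
As an illustration of the iteration suggested above, here is a minimal sketch (it is not part of the original reply). It relies on the Bio.PDB accessors get_id(), get_resname(), get_name() and get_coord(); the helper names describe_structure and load_structure and the output layout are made up for this example, and the PDB file is assumed to have been downloaded already.

```python
def describe_structure(structure):
    """Return one text line per atom of a Bio.PDB-style structure.

    Walks structure -> model -> chain -> residue -> atom, which is the
    hierarchy that Bio.PDB exposes through plain iteration.
    """
    lines = []
    for model in structure:
        for chain in model:
            for residue in chain:
                # A residue id is a (hetero flag, sequence number, insertion code) tuple.
                resseq = residue.get_id()[1]
                for atom in residue:
                    x, y, z = atom.get_coord()
                    lines.append(
                        "%s %s %s %d %s %.3f %.3f %.3f"
                        % (model.get_id(), chain.get_id(), residue.get_resname(),
                           resseq, atom.get_name(), x, y, z)
                    )
    return lines


def load_structure(pdb_path, structure_id="Tripsina"):
    """Parse a PDB file that already exists on disk (hypothetical helper)."""
    # Imported here so describe_structure() can be reused without Biopython installed.
    from Bio.PDB import PDBParser

    return PDBParser().get_structure(structure_id, pdb_path)
```

With 2PTC.pdb saved next to the script, something like "for line in describe_structure(load_structure("2PTC.pdb")): print(line)" should print one atom per line; the same lines could be written to a text file if a coordinate table is wanted.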
Don't hesitate to read the Biopython wiki for more details and information: http://biopython.org/wiki/Biopython

Cordially,

On Wed, Sep 21, 2011 at 4:34 PM, Nana Trapnest wrote:
>
> Hello,I'd like to know how to print structure of a protein using Biopython,
> I installed Python and Biopython, but where I get the proteins? I use this
> from Bio.PDB import
> *parser=PDBParser()structure=parser.get_structure("Tripsina",
> "2PTC.pdb")print structure
> but there is an error...
> Traceback (most recent call last): File "C:/Documents and
> Settings/Stefania/Desktop/PITON/prova", line 4, in
> structure=parser.get_structure("Tripsina", "2PTC.pdb") File
> "C:\Python27\lib\site-packages\Bio\PDB\PDBParser.py", line 77, in
> get_structure file=open(file)IOError: [Errno 2] No such file or
> directory: '2PTC.pdb'
> Can you help me please??? Where I find 2PTC.pdb??? Thanks
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>

--
Mikael TRELLET,
Computational structural biology group, Utrecht University
Bijvoet Center, The Netherlands

From oriolebaltimore at gmail.com Fri Sep 23 19:12:21 2011
From: oriolebaltimore at gmail.com (Adrian Johnson)
Date: Fri, 23 Sep 2011 15:12:21 -0400
Subject: [Biopython] annotation help
Message-ID:

Hi :

I have mutation results in VCF format.

Typically I want to take

chromosome    position    reference base    consensus base
chr21         30576509    C                 Y (C/T)

From this data:

1. I want to find out if this is a missense mutation.
2. Amino acid change ( VAL to MET)
3. Protein position
4. Gene name (KRTAP24) and RefSeq transcript name (NM_****)
5. Name of drug that acts on this.

Is it possible to get such annotation through biopython?

Dear Sean: You are very active in both bioconductor and biopython and you might have worked exome-seq data and worked through this problem.
I could do this kind of stuff using SeattleSeq, however I want to get a stand-alone program that will help getting this done locally. what is your opinion on this kind of problem. Are there any standalone programs now in addition to Duke Sequence Variant Analyzer or SeattleSeq?

thank you.

-Adrian.

From sdavis2 at mail.nih.gov Fri Sep 23 19:33:32 2011
From: sdavis2 at mail.nih.gov (Sean Davis)
Date: Fri, 23 Sep 2011 15:33:32 -0400
Subject: [Biopython] annotation help
In-Reply-To:
References:
Message-ID:

Hi, Adrian.

See:

annovar
snpEff
Ensembl Variant Effect Predictor
others....

None of these (or any program that I know of) will include the name of the drug that acts on the gene, but that information can be gleaned from other sources once you have the gene names. If you want to build something from scratch, you could start with this if you are working in cancer:

https://wiki.nci.nih.gov/display/ICR/Cancer+Gene+Index+End+User+Documentation

There are commercial softwares that offer gene/compound information, but I do not know which is "best".

Sean

On Fri, Sep 23, 2011 at 3:12 PM, Adrian Johnson wrote:
> Hi :
>
> I have mutation results in VCF format.
>
> Typically I want to take
>
> chromosome    position    reference base    consensus base
>
> chr21         30576509    C                 Y (C/T)
>
>
>
> From this data:
>
> 1. I want to find out if this is a missense mutation.
> 2. Amino acid change ( VAL to MET)
> 3. Protein position
> 4. Gene name (KRTAP24) and RefSeq transcript name (NM_****)
> 5. Name of drug that acts on this.
>
>
> Is it possible to get such annotation through biopython?
>
>
> Dear Sean: You are very active in both bioconductor and biopython and
> you might have worked exome-seq data and worked through this problem.
> I could do this kind of stuff using SeattleSeq, however I want to get
> a stand-alone program that will help getting this done locally. what
> is your opinion on this kind of problem.
Are there any standalone
> programs now in addition to Duke Sequence Variant Analyzer or
> SeattleSeq?
>
>
> thank you.
>
> -Adrian.
>

From oriolebaltimore at gmail.com Fri Sep 23 20:09:01 2011
From: oriolebaltimore at gmail.com (Adrian Johnson)
Date: Fri, 23 Sep 2011 16:09:01 -0400
Subject: [Biopython] annotation help
In-Reply-To:
References:
Message-ID:

Thanks Sean. I will look into the software you mentioned.

-Adrian.

On Fri, Sep 23, 2011 at 3:33 PM, Sean Davis wrote:
> Hi, Adrian.
>
> See:
>
> annovar
> snpEff
> Ensembl Variant Effect Predictor
> others....
>
> None of these (or any program that I know of) will include the name of
> the drug that acts on the gene, but that information can be gleaned
> from other sources once you have the gene names. If you want to build
> something from scratch, you could start with this if you are working
> in cancer:
>
> https://wiki.nci.nih.gov/display/ICR/Cancer+Gene+Index+End+User+Documentation
>
> There are commercial softwares that offer gene/compound information,
> but I do not know which is "best".
>
> Sean
>
>
> On Fri, Sep 23, 2011 at 3:12 PM, Adrian Johnson
> wrote:
>> Hi :
>>
>> I have mutation results in VCF format.
>>
>> Typically I want to take
>>
>> chromosome    position    reference base    consensus base
>>
>> chr21         30576509    C                 Y (C/T)
>>
>>
>>
>> From this data:
>>
>> 1. I want to find out if this is a missense mutation.
>> 2. Amino acid change ( VAL to MET)
>> 3. Protein position
>> 4. Gene name (KRTAP24) and RefSeq transcript name (NM_****)
>> 5. Name of drug that acts on this.
>>
>>
>> Is it possible to get such annotation through biopython?
>>
>>
>> Dear Sean: You are very active in both bioconductor and biopython and
>> you might have worked exome-seq data and worked through this problem.
>> I could do this kind of stuff using SeattleSeq, however I want to get
>> a stand-alone program that will help getting this done locally.
what
>> is your opinion on this kind of problem. Are there any standalone
>> programs now in addition to Duke Sequence Variant Analyzer or
>> SeattleSeq?
>>
>>
>> thank you.
>>
>> -Adrian.
>>
>

From nanatrapnest at hotmail.it Mon Sep 26 08:55:24 2011
From: nanatrapnest at hotmail.it (Nana Trapnest)
Date: Mon, 26 Sep 2011 08:55:24 +0000
Subject: [Biopython] help PDB parser
Message-ID:

I have a question... is it possible to save a file txt or better a matrix with all coordinates of the atoms ? For example...

ATOM1   C    16   27   32
ATOM2   CA   18   45   55
ATOMN   ..   ...  ...  ...

and save it like a txt file or a matrix N*3? Can you help me,please??? :)

From anaryin at gmail.com Mon Sep 26 09:51:54 2011
From: anaryin at gmail.com (João Rodrigues)
Date: Mon, 26 Sep 2011 11:51:54 +0200
Subject: [Biopython] help PDB parser
In-Reply-To:
References:
Message-ID:

Hey Nana,

PDBIO will not be suitable for this. I'd suggest looping through the atoms and printing the information you need in the format you want using regular string formatting options.

Cheers,
João

From fkauff at biologie.uni-kl.de Mon Sep 26 14:54:15 2011
From: fkauff at biologie.uni-kl.de (Frank Kauff)
Date: Mon, 26 Sep 2011 16:54:15 +0200
Subject: [Biopython] align single sequence to MSA
In-Reply-To:
References:
Message-ID: <4E809217.5040004@biologie.uni-kl.de>

Hi,

Yes, clustal can do this easily. If I remember correctly the command line should be something like

clustalw -sequences -profile1=mas_file.fas -profile2=new_sequences.fas

Frank

On 09/12/2011 03:00 AM, Nasos Dousis wrote:
> Hello,
>
> First, thank you to everyone who has contributed to the BioPython
> codebase and to the mailing list.
>
> I have a FASTA sequence, and I'd like to find the optimal alignment of
> that sequence to an MSA. I don't want to alter the MSA-- I just want
> to map the single sequence onto the MSA. Is there a simple way to do
> this by ClustalW or MUSCLE?
>
> Thanks,
> Nasos
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>

From ndousis at gmail.com Tue Sep 27 23:16:16 2011
From: ndousis at gmail.com (Nasos Dousis)
Date: Tue, 27 Sep 2011 16:16:16 -0700
Subject: [Biopython] Biopython Digest, Vol 105, Issue 14
In-Reply-To:
References:
Message-ID:

Frank,

Thanks for your reply and suggestion. I tried that command line with ClustalW2 and a number of variations:

clustalw2 -sequences -profile1=mas_file.aln -profile2=new_sequences.fas (.aln = clustal format)
clustalw2 -sequences -profile1=mas_file.aln -profile2=new_sequences.aln
clustalw2 -sequences -profile1=new_sequences.fas -profile2=mas_file.aln
clustalw2 -sequences -profile2=new_sequences.fas -profile1=mas_file.aln
clustalw2 -profile1=mas_file.aln -profile2=new_sequences.fas
clustalw2 -profile1=mas_file.aln -profile2=mas_file.aln
clustalw2 -profile1=mas_file.aln
etc

and I always get the following error:

==================================================================
[ndousis at linux-machine ~]$ clustalw2 -profile1=mas_file.aln -profile2=new_sequences.aln

CLUSTAL 2.1 Multiple Sequence Alignments

Sequence format is CLUSTAL
ERROR: There are no sequences in profile2 file.
==================================================================

Nevertheless, I implemented a simple version of Needleman-Wunsch to align my sequence to the MSA and choose the highest scoring alignment.

Thanks and kind regards,
Nasos

On Mon, Sep 26, 2011 at 9:00 AM, wrote:
> Message: 3
> Date: Mon, 26 Sep 2011 16:54:15 +0200
> From: Frank Kauff
> Subject: Re: [Biopython] align single sequence to MSA
> To: biopython at lists.open-bio.org
> Message-ID: <4E809217.5040004 at biologie.uni-kl.de>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi,
>
> Yes, clustal can do this easily.
If I remember correctly the command
> line should be something like
>
> clustalw -sequences -profile1=mas_file.fas -profile2=new_sequences.fas
>
> Frank
>
>
> On 09/12/2011 03:00 AM, Nasos Dousis wrote:
>> Hello,
>>
>> First, thank you to everyone who has contributed to the BioPython
>> codebase and to the mailing list.
>>
>> I have a FASTA sequence, and I'd like to find the optimal alignment of
>> that sequence to an MSA. I don't want to alter the MSA-- I just want
>> to map the single sequence onto the MSA. Is there a simple way to do
>> this by ClustalW or MUSCLE?
>>
>> Thanks,
>> Nasos
>> _______________________________________________
>> Biopython mailing list - Biopython at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython
>>
>
>
>
> ------------------------------
>
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>
>
> End of Biopython Digest, Vol 105, Issue 14
> ******************************************
>

From nanatrapnest at hotmail.it Fri Sep 30 13:14:57 2011
From: nanatrapnest at hotmail.it (Nana Trapnest)
Date: Fri, 30 Sep 2011 13:14:57 +0000
Subject: [Biopython] Information PDB file
Message-ID:

Hello,
do you know if is possible to overwrite a PDBfile with other information, for example, atomic coordinates and saving it with another file name??
Thanks

From anaryin at gmail.com Fri Sep 30 14:32:44 2011
From: anaryin at gmail.com (João Rodrigues)
Date: Fri, 30 Sep 2011 16:32:44 +0200
Subject: [Biopython] Information PDB file
In-Reply-To:
References:
Message-ID:

Dear Nana,

Yes. Parse the structure as you normally would, and then just change the values of the fields that you want. For atomic coordinates, you can either set them manually (just change the atom.coord array) or use transform to rotate and translate the atom according to a rotation matrix and a translation vector.
Then just use PDBIO to save that new structure to another file.

Cheers,

João [...] Rodrigues
http://nmr.chem.uu.nl/~joao

2011/9/30 Nana Trapnest
>
> Hello,
> do you know if is possible to overwrite a PDBfile with other information,
> for example, atomic coordinates and saving it with another file name??
> Thanks
>
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>
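
The approach described in the reply above (parse, change coordinates, save with PDBIO) could look roughly like the sketch below; it is an illustration, not code from the thread. The helper names translate_atoms and save_structure are hypothetical. Bio.PDB documents the arguments of Atom.transform as NumPy arrays; passing plain lists, as done here, is an assumption that relies on NumPy converting them.

```python
# 3x3 identity matrix: Atom.transform() right-multiplies the coordinates
# by the rotation matrix, so the identity leaves the orientation unchanged.
IDENTITY = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]


def translate_atoms(atoms, offset):
    """Shift every atom by offset = (dx, dy, dz); return how many were moved."""
    count = 0
    for atom in atoms:
        # transform(rotation, translation) computes coord . rotation + translation
        atom.transform(IDENTITY, list(offset))
        count += 1
    return count


def save_structure(structure, out_path):
    """Write a (possibly modified) Bio.PDB structure to a new PDB file."""
    # Imported here so translate_atoms() can be used without Biopython installed.
    from Bio.PDB import PDBIO

    io = PDBIO()
    io.set_structure(structure)
    io.save(out_path)
```

With Biopython installed, one possible use is to parse with PDBParser().get_structure("Tripsina", "2PTC.pdb"), call translate_atoms(structure.get_atoms(), (1.0, 0.0, 0.0)), and then save_structure(structure, "2PTC_shifted.pdb"). The output file name is only an example; the original file is left untouched.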