From p.j.a.cock at googlemail.com Tue Jun 4 13:29:55 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Tue, 4 Jun 2013 18:29:55 +0100
Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs

Hi Bartek,

I'm hoping you or Michiel can investigate this issue,
http://www.biostars.org/p/73500/

I believe Ivan has correctly diagnosed a Biopython issue in the alphabet
handling of the motif class on this BioStars question, and he's given a
workaround. The problem code looks like this:

    if self.alphabet != IUPAC.unambiguous_dna:
        raise ValueError("Wrong alphabet! Use only with DNA motifs")

First, assuming the test is really for just IUPAC unambiguous DNA,
the error message is misleading - it sounds like using generic_dna
or IUPAC ambiguous DNA would be acceptable but it isn't.

The core problem here is that IUPAC.unambiguous_dna is just
one instance of the IUPACUnambiguousDNA() class, and other
instances should be equally acceptable but will fail the equality.

I have sometimes wondered if we could and should make some of
the Alphabet objects into singletons (only one instance allowed),
which might be one way to solve this issue.

Alternatively, perhaps all we need to do here is see if the alphabet
is DNA and which letter set it uses? Is that the key point for the matrix
calculations etc? e.g.

    from Bio.Alphabet import _get_base_alphabet, DNAAlphabet

    if not isinstance(_get_base_alphabet(self.alphabet), DNAAlphabet):
        raise ValueError("This only works for DNA motifs")
    if self.alphabet.letters != IUPAC.unambiguous_dna.letters:
        raise ValueError("Expected IUPAC.unambiguous_dna or similar")

(Untested, and these suggested error messages need some work)

Regards,

Peter

From clements at galaxyproject.org Tue Jun 4 15:29:57 2013
From: clements at galaxyproject.org (Dave Clements)
Date: Tue, 4 Jun 2013 12:29:57 -0700
Subject: [Biopython-dev] GCC2013 Regular Registration Closes June 14

Hello all,

This is the final registration reminder for the 2013 Galaxy Community
Conference (GCC2013), being held in Oslo, 30 June through July 2. GCC2013
is a great opportunity to share best practices and network with other
researchers who are also facing the challenges of data-intensive biology.

Registration closes June 14*, ten days from today. Register now and
guarantee your spot in the Training Day sessions you want to take.

Registration is still a bargain with the full 3-day registration starting
at ~ ?165 for post-docs and students (or just ?55 per day).

The program features 15 Training Day sessions in 5 tracks on 12 different
topics, 25 Talks on topics ranging from Reproducibility to Exploiting
Galaxy, 23 Posters (and counting), 2 Lightning Talk sessions, and an
end-of-conference event at a historic venue high above Oslo.

We look forward to seeing you in Oslo!

GCC2013 Organizing Committee

PS: Please help get the word out.

* Not June 7, as had been stated earlier in several places.
--
http://galaxyproject.org/GCC2013
http://galaxyproject.org/
http://getgalaxy.org/
http://usegalaxy.org/
http://wiki.galaxyproject.org/

From mjldehoon at yahoo.com Tue Jun 4 22:28:28 2013
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Tue, 4 Jun 2013 19:28:28 -0700 (PDT)
Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs
Message-ID: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com>

Hi Peter,

I have never quite understood why we need a separate class for each
alphabet. I would think that a single alphabet class (or maybe a DNA, an
RNA, and a protein alphabet class) is sufficient, and that the specific
alphabets are instances of this class.
Also, alphabets are essentially sets of letters, so an Alphabet class
should inherit from set, allowing us to use its associated methods to
compare alphabets to each other.

Best,
-Michiel.

From redmine at redmine.open-bio.org Tue Jun 4 23:31:25 2013
From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org)
Date: Wed, 5 Jun 2013 03:31:25 +0000
Subject: [Biopython-dev] [Biopython - Bug #3434] (New) PDB.PDBParser

Issue #3434 has been reported by Mirslaw Syzdek.

----------------------------------------
Bug #3434: PDB.PDBParser
https://redmine.open-bio.org/issues/3434

Author: Mirslaw Syzdek
Status: New
Priority: Normal
Assignee:
Category:
Target version:
URL:

Two months ago I downloaded a PDB file from NCBI (Database: Structure,
Name: 1EZQ). With this file the following code works fine:

    from Bio import PDB

    parser = PDB.PDBParser()
    struct = parser.get_structure('1EZQ.pdb', '1EZQ.pdb')
    ppb = PDB.PPBuilder()
    peptides = ppb.build_peptides(struct)

A few days ago I downloaded the PDB file one more time. With the new file
the code above stopped working: the PDBParser throws an error. The error
is caused by different column separation (see line 423 in the attached
files).

----------------------------------------
You have received this notification because this email was added to the
New Issue Alert plugin

--
You have received this notification because you have either subscribed to
it, or are involved in it.
To change your notification preferences, please click here and login:
http://redmine.open-bio.org

From barwil at gmail.com Wed Jun 5 04:13:00 2013
From: barwil at gmail.com (Bartek Wilczynski)
Date: Wed, 5 Jun 2013 10:13:00 +0200
Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs
In-Reply-To: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com>
References: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com>

I'm a bit out of the loop here, but to me it seems like a simple issue:

Why not change the problematic code:

    if self.alphabet!=IUPAC.unambiguous_dna:
        raise ValueError("Wrong alphabet! Use only with DNA motifs")

into:

    if type(self.alphabet)!=type(IUPAC.unambiguous_dna):
        raise ValueError("Wrong alphabet! Use only with DNA motifs")

and worry about fixing the Bio.Alphabet issues later (it does sound
reasonable to make sure that any alphabet instance is a singleton).

best
Bartek

--
Bartek Wilczynski
==================
Institute of Informatics
University of Warsaw
http://www.mimuw.edu.pl/~bartek

From p.j.a.cock at googlemail.com Wed Jun 5 05:32:11 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 5 Jun 2013 10:32:11 +0100
Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs
References: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com>

On Wed, Jun 5, 2013 at 9:13 AM, Bartek Wilczynski wrote:
> I'm a bit out of the loop here, but to me it seems like a simple issue:
>
> Why not change the problematic code:
>
>     if self.alphabet!=IUPAC.unambiguous_dna:
>         raise ValueError("Wrong alphabet! Use only with DNA motifs")
>
> into:
>
>     if type(self.alphabet)!=type(IUPAC.unambiguous_dna):
>         raise ValueError("Wrong alphabet! Use only with DNA motifs")
>
> and worry about fixing the Bio.Alphabet issues later (it does sound
> reasonable to make sure that any alphabet instance is a singleton).
>
> best
> Bartek

I would prefer a more duck-typing approach (is it DNA? Does it use
the expected set of letters?), but that sounds practical. Could you
try using isinstance instead though (see PEP8), and then make
that fix with a new unit test based on the original query please?

> On Wed, Jun 5, 2013 at 4:28 AM, Michiel de Hoon wrote:
>> Hi Peter,
>>
>> I have never quite understood why we need a separate class for each
>> alphabet.
>> I would think that a single alphabet class (or maybe a DNA, an RNA, and a
>> protein alphabet class) is sufficient, and that the specific alphabets are
>> instances of this class.
>> Also, alphabets are essentially sets of letters, so an Alphabet class should
>> inherit from set, allowing us to use its associated methods to compare
>> alphabets to each other.
>>
>> Best,
>> -Michiel.

I wouldn't want to subclass sets because in many existing uses
of the alphabets the order of the letters is important (and this
is not specified in a Python set).

But I agree that a rationalised alphabet system like that could
work better. Here equality testing could be on both being the
same type, e.g. DNA, and having the same letters - including
special letters for gaps or stop codons (which are the nastiest
part of the current alphabet object system)?
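
The three checks discussed in this thread (equality against one instance, a
type-based check, and a check on "is it DNA, with the same letters") can be
compared in a small sketch. It deliberately does not import Bio.Alphabet:
the classes below are simplified stand-ins that only borrow their names
from the thread, the letter strings are assumptions, and `same_alphabet` is
a hypothetical helper, not an existing Biopython function.

```python
# Simplified stand-ins; NOT the real Bio.Alphabet classes.
class Alphabet:
    letters = None


class DNAAlphabet(Alphabet):
    pass


class IUPACUnambiguousDNA(DNAAlphabet):
    letters = "GATC"  # assumed letter order, for illustration only


class IUPACAmbiguousDNA(DNAAlphabet):
    letters = "GATCRYWSMKHBVDN"  # assumed, for illustration only


# One shared module-level instance, plus a second instance a user might create.
unambiguous_dna = IUPACUnambiguousDNA()
other_instance = IUPACUnambiguousDNA()

# Without a custom __eq__, Python compares objects by identity, so a second
# instance of the same class is "not equal" to the shared one.
fails_equality = other_instance != unambiguous_dna

# The type-based check from the thread, written with isinstance.
passes_isinstance = isinstance(other_instance, IUPACUnambiguousDNA)


def same_alphabet(a, b):
    """Hypothetical duck-typing check: both DNA, same letters in the same order."""
    both_dna = isinstance(a, DNAAlphabet) and isinstance(b, DNAAlphabet)
    return both_dna and a.letters == b.letters


print(fails_equality, passes_isinstance)
print(same_alphabet(other_instance, unambiguous_dna))
print(same_alphabet(IUPACAmbiguousDNA(), unambiguous_dna))
```

Comparing the letter strings rather than sets keeps the order of the
letters significant, which matches the objection to subclassing set.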
Peter

From mjldehoon at yahoo.com Wed Jun 5 06:12:38 2013
From: mjldehoon at yahoo.com (Michiel de Hoon)
Date: Wed, 5 Jun 2013 03:12:38 -0700 (PDT)
Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs
References: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com>
Message-ID: <1370427158.91948.YahooMailNeo@web164001.mail.gq1.yahoo.com>

> I wouldn't want to subclass sets due to the fact that in many
> existing uses of the alphabets the order of the letters is
> important (and this is not specified in a Python set).

OK, then indeed a set wouldn't be appropriate.

> But I agree that a rationalised alphabet system like that could
> work better. Here equality testing could be on both being the
> same type, e.g. DNA, and having the same letters - including
> special letters for gaps or stop codons (which are the nastiest
> part of the current alphabet object system)?

I guess that it depends on how the alphabet is used. For example, for
the example in the bug report the order of the letters doesn't matter,
but for other cases it may matter. Personally I almost never use
alphabets. Can anybody give some real-life examples of how they
are used?

Best,
-Michiel

From p.j.a.cock at googlemail.com Wed Jun 5 06:29:58 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Wed, 5 Jun 2013 11:29:58 +0100
Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs
In-Reply-To: <1370427158.91948.YahooMailNeo@web164001.mail.gq1.yahoo.com>
References: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1370427158.91948.YahooMailNeo@web164001.mail.gq1.yahoo.com>

On Wed, Jun 5, 2013 at 11:12 AM, Michiel de Hoon wrote:
>> I wouldn't want to subclass sets due to the fact that in many
>> existing uses of the alphabets the order of the letters is
>> important (and this is not specified in a Python set).
>
> OK, then indeed a set wouldn't be appropriate.
>
>> But I agree that a rationalised alphabet system like that could
>> work better. Here equality testing could be on both being the
>> same type, e.g. DNA, and having the same letters - including
>> special letters for gaps or stop codons (which are the nastiest
>> part of the current alphabet object system)?
>
> I guess that it depends on how the alphabet is used. For example, for
> the example in the bug report the order of the letters doesn't matter,
> but for other cases it may matter.

What is the motif class doing that restricts it to IUPAC unambiguous
DNA, rather than any DNA alphabet, such as ambiguous DNA, or
mixed case sequences?

> Personally I almost never use
> alphabets. Can anybody give some real-life examples of how they
> are used?

The generic aim is to label Seq objects as either DNA, RNA or
protein (and restrict operations like addition or translation
accordingly). That doesn't need the letter level information.

Validating that sequences use only the expected letters (e.g. if
sending to a tool which does not understand U (selenocysteine) in a
protein, or if writing to a restricted file format). I think the NEXUS
code has this kind of constraint.

Counting amino acid or nucleotide frequencies - even if your
example proteins happen to lack proline, you'd probably want
to consider it in your list of amino acids. Depending on your data
structure that could be important (while a consistent order may
or may not matter, e.g. array indexing).

Peter

From yeyanbo289 at gmail.com Thu Jun 6 11:58:46 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Thu, 6 Jun 2013 23:58:46 +0800
Subject: [Biopython-dev] GSOC Project Introduction

Hi everyone,

I'm Yanbo Ye. I'm happy that I was accepted by NESCent for this year's
GSOC and that I can contribute to Biopython through this project. I will
work on two phylogenetic modules of the Phylo package: tree construction
and consensus tree searching.
To share my project progress, as Peter Cock suggested, I have set up a
blog on GitHub. Here is my first introduction post.

Cheers,
Yanbo

--
Yanbo Ye
Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of
Sciences

From jmb at ebi.ac.uk Thu Jun 13 15:01:39 2013
From: jmb at ebi.ac.uk (John Berrisford)
Date: Thu, 13 Jun 2013 20:01:39 +0100
Subject: [Biopython-dev] changing PDB file chains
Message-ID: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk>

Hi

I'm trying to use biopython to update a PDB file.

I'm trying to update the chain ID of a series of waters in a PDB file. I
have the original chain ID, new chain ID and water residue number in an
mmcif file which I parse using a separate parser. Then for each water I
have in the mmcif file I want to update the chain ID from the cif file.

I then want to write out the updated water line (to test it works) or
write out the updated PDB file.

Is this possible with biopython?

Regards

John

From davidjosephcain at gmail.com Thu Jun 13 15:17:18 2013
From: davidjosephcain at gmail.com (David Cain)
Date: Thu, 13 Jun 2013 15:17:18 -0400
Subject: [Biopython-dev] changing PDB file chains
In-Reply-To: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk>
References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk>

Yes, John, it's possible!

You'll first want to modify the parsed structure. Add your water molecules
to the desired chain (removing them from the old one, of course). To
actually do this, you may want to look at the source
(http://biopython.org/DIST/docs/api/Bio.PDB-module.html), specifically how
the SMCRA hierarchy is constructed.

Once you've modified your Structure (say it's in a variable `struct`), you
should create an instance of PDBIO(), then save your structure like so:

    pdb_writer = PDB.PDBIO()
    pdb_writer.set_structure(struct)
    pdb_writer.save("output_path.pdb")

Do note that PDBIO has some limitations (e.g. it cannot write out PDB
header data). It should probably suffice for your needs, though.
If you're not able to figure it out, feel free to email me back (preferably
with your code!) and I can help you out. StackOverflow works particularly
well for me, if you're amenable to that.

David Cain
+1 (339) 222 4452

From anaryin at gmail.com Fri Jun 14 05:05:41 2013
From: anaryin at gmail.com (João Rodrigues)
Date: Fri, 14 Jun 2013 11:05:41 +0200
Subject: [Biopython-dev] changing PDB file chains
References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk>

Hi,

If you simply want to update ids, you can just change them
(chain.id = newvalue) and then output the structure like David suggested.
No need to remove/add atoms. If you wish to play with the structure then
you should indeed modify the SMCRA hierarchy.

Cheers,

João

From jmb at ebi.ac.uk Fri Jun 14 05:48:21 2013
From: jmb at ebi.ac.uk (John Berrisford)
Date: Fri, 14 Jun 2013 10:48:21 +0100
Subject: [Biopython-dev] changing PDB file chains
References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk>
Message-ID: <51BAE6E5.1040401@ebi.ac.uk>

Hi João and David

Thank you for the help.
The part that confused me is how I change the chain ID for a specific
water.

e.g. I can select a water with

    atom = pdbFile[0]['W']['W', 1031, ' ']['O']

or maybe

    residue = pdbFile[0]['W']['W', 1031, ' ']

Now, how do I update the chain ID for this water?

Alternatively I can select a water with

    for model in pdbFile:
        for chain in model:
            for residue in chain:
                if residue.id[0] == 'W':
                    if residue.id[1] == 1031:

I presume that I can then do...

    chain.id = 'A'

and will this change the chain ID for this specific water or all atoms?

Regards

John

--
John Berrisford
PDBe
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD
Tel: 01223 492529
http://www.facebook.com/proteindatabank
http://twitter.com/PDBeurope

From anaryin at gmail.com Fri Jun 14 05:50:24 2013
From: anaryin at gmail.com (João Rodrigues)
Date: Fri, 14 Jun 2013 11:50:24 +0200
Subject: [Biopython-dev] changing PDB file chains
In-Reply-To: <51BAE6E5.1040401@ebi.ac.uk>
References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> <51BAE6E5.1040401@ebi.ac.uk>

Hi John,

Actually, David is absolutely right. I didn't really think it through. You
need to move the water atoms to the chain where you want them to be. So, if
they are in chain A and should be in chain B, you need to detach them from
chain A (calling the detach_child method for the whole residue is easiest)
and re-attach them to chain B (add method).
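
The difference between renaming a chain and moving one water can be shown
with a toy model. The classes below are minimal stand-ins written for this
sketch, not the Bio.PDB classes; only the method names `detach_child` and
`add` and the `('W', 1031, ' ')` style residue id come from the thread, and
`move_residue` is a hypothetical helper.

```python
# Toy stand-ins for the chain and residue levels of a structure hierarchy.
# These are NOT the Bio.PDB classes; they only mimic the parent/child idea.


class Residue:
    def __init__(self, res_id, resname):
        self.id = res_id        # (hetero flag, sequence number, insertion code)
        self.resname = resname
        self.parent = None      # the chain that currently holds this residue


class Chain:
    def __init__(self, chain_id):
        self.id = chain_id
        self.children = {}      # residue id -> Residue

    def add(self, residue):
        # Refuse duplicates so that two residues never share an id in a chain.
        if residue.id in self.children:
            raise ValueError("residue %r already in chain %s" % (residue.id, self.id))
        residue.parent = self
        self.children[residue.id] = residue

    def detach_child(self, res_id):
        residue = self.children.pop(res_id)
        residue.parent = None


def move_residue(old_chain, new_chain, res_id):
    """Hypothetical helper: detach one residue from old_chain, add it to new_chain."""
    residue = old_chain.children[res_id]
    old_chain.detach_child(res_id)
    new_chain.add(residue)
    return residue


chain_w = Chain("W")
chain_a = Chain("A")
chain_w.add(Residue(("W", 1031, " "), "HOH"))
chain_w.add(Residue(("W", 1032, " "), "HOH"))

# Move only water 1031; water 1032 stays where it was.
moved = move_residue(chain_w, chain_a, ("W", 1031, " "))
print(moved.parent.id, sorted(chain_w.children), sorted(chain_a.children))
```

In this toy model, setting `chain_w.id = "A"` instead would relabel the
container and therefore every residue still inside it, which is why a
single water has to be moved rather than renamed.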
From yeyanbo289 at gmail.com Fri Jun 14 07:24:10 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Fri, 14 Jun 2013 19:24:10 +0800
Subject: [Biopython-dev] Biopython Tutorial Chinese Translation

Hi everyone,

We have some people here who would like to translate the Biopython
tutorial into Chinese, so that more people in China can use Biopython for
their research and contribute to Biopython. We noticed there is a LaTeX
file for this tutorial that we can work on.

Before we start, we want to know what the right way to do this is. Is
there any previous or ongoing translation project that we can follow?

Thanks,
Yanbo

--
Yanbo Ye
Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of
Sciences

From zruan1991 at gmail.com Fri Jun 14 10:43:35 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Fri, 14 Jun 2013 10:43:35 -0400
Subject: [Biopython-dev] Fwd: Biopython Tutorial Chinese Translation

There was once a discussion on http://www.bioxxx.cn. You can find relevant
info at http://www.bioxxx.cn/thread-2354-1-1.html. However, this does not
seem to be an official translation. I found a copy of it in case you don't
have permission (I will send it to you off-list).

Best,
Zheng Ruan

From eric.talevich at gmail.com Fri Jun 14 11:59:00 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Fri, 14 Jun 2013 11:59:00 -0400
Subject: [Biopython-dev] Fwd: Biopython Tutorial Chinese Translation

Hi guys,

Great idea! Yes, there was an earlier "unofficial" effort to both port the
tutorial to Sphinx and translate it to Chinese:
http://www.bio-cloud.info/Biopython/en/index.html
http://www.bio-cloud.info/Biopython/cn/index.html
http://www.bio-cloud.info/blog/?p=57

I don't know much about Sphinx's support for multiple translations of the
same text (or LaTeX's, for that matter), but maybe a
Sphinx/reStructuredText port would make it easier to manage individual
sections of the document and keep them up to date with their English
equivalents. The bug report for this (long-term) task is:
https://redmine.open-bio.org/issues/3219

All the best,
Eric

From zruan1991 at gmail.com Fri Jun 14 23:18:07 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Fri, 14 Jun 2013 23:18:07 -0400
Subject: [Biopython-dev] Codon Alignment GSoC Homepage

Hi all,

Following Peter and Karen's suggestion, I set up my project homepage on
GitHub (http://zruanweb.com/). I also have a plan for the first four weeks
there (http://zruanweb.com/project-timeline.html). Thanks.

Best,
Ruan

From yeyanbo289 at gmail.com Fri Jun 14 23:22:15 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Sat, 15 Jun 2013 11:22:15 +0800
Subject: [Biopython-dev] Fwd: Biopython Tutorial Chinese Translation

Hi Eric,

That's great. It seems we have a good starting point. I'll contact him to
see how to join him and make an official version if possible.

Best,
Yanbo

On Fri, Jun 14, 2013 at 11:59 PM, Eric Talevich wrote:
> Hi guys,
>
> Great idea!
Yes, there was an earlier "unofficial" effort to both port the > tutorial to Sphinx and translate it to Chinese: > http://www.bio-cloud.info/Biopython/en/index.html > http://www.bio-cloud.info/Biopython/cn/index.html > http://www.bio-cloud.info/blog/?p=57 > > I don't know much about Sphinx's support for multiple translations of the > same text (or LaTeX's, for that matter), but maybe a > Sphinx/reStructuredText port would make it easier to manage individual > sections of the document and keep them up to date with their English > equivalents. The bug report for this (long-term) task is: > https://redmine.open-bio.org/issues/3219 > > All the best, > Eric > > > > > On Fri, Jun 14, 2013 at 10:43 AM, Zheng Ruan wrote: > >> There was once a discussion in http://www.bioxxx.cn. You can find >> relevant >> info at http://www.bioxxx.cn/thread-2354-1-1.html. However, this seems >> not >> to be an official translation. I find a copy of it in case you don't have >> permission (send you off-list). >> >> Best, >> Zheng Ruan >> >> >> >> On Fri, Jun 14, 2013 at 7:24 AM, Yanbo Ye wrote: >> >> > Hi everyone, >> > >> > Here we have some people like to translate the biopython tutorial into >> > Chinese, so that more people in China can use biopython for their >> research >> > and contribute to biopython. We noticed there is a LaTeX file for this >> > tutorial that we can work on. >> > >> > Before we start, we want to know what is the right way to do this. Is >> there >> > any previous or ongoing translation project that we can follow? >> > >> > Thanks, >> > Yanbo >> > -- >> > >> > ??? >> > >> > ???????????????? 
>> > >> > Yanbo Ye >> > >> > Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of >> > Sciences >> > >> > _______________________________________________ >> > Biopython-dev mailing list >> > Biopython-dev at lists.open-bio.org >> > http://lists.open-bio.org/mailman/listinfo/biopython-dev >> > >> >> >> _______________________________________________ >> Biopython-dev mailing list >> Biopython-dev at lists.open-bio.org >> http://lists.open-bio.org/mailman/listinfo/biopython-dev >> >> > -- ??? ???????????????? Yanbo Ye Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences From yeyanbo289 at gmail.com Fri Jun 14 23:38:17 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Sat, 15 Jun 2013 11:38:17 +0800 Subject: [Biopython-dev] Biopython Tutorial Chinese Translation In-Reply-To: References: Message-ID: I've contacted the bioxxx website admin. While he agreed, the copyright is still a problem. He said they just organized the translation from the original website and make a pdf version. As the original site is not accessible anymore, now it is hard to find the original translator. Maybe we can just write an explanation and wait for him/her to contact us. On Fri, Jun 14, 2013 at 11:36 PM, Zheng Ruan wrote: > Cool. Go ahead! > > Zheng Ruan > > > On Fri, Jun 14, 2013 at 11:34 AM, Yanbo Ye wrote: > >> Sure. The original site for this translation is not accessible anymore. >> But I know the bioxxx website admin. We can contact him about the author. >> ? 2013-6-14 ??11:17?"Zheng Ruan" ??? >> >> Yep, some old chapters do not change much. But the copyright seems to be >>> a concern. Do we need to contact the original author for such a permission. >>> I think they will be happy to grant it. >>> >>> Thanks, >>> Zheng Ruan >>> >>> >>> On Fri, Jun 14, 2013 at 10:55 AM, Yanbo Ye wrote: >>> >>>> Thanks, Zhen Ruan. >>>> I have that one. It's an old version tranlation and many chapters are >>>> not available. 
But we can work base on that one I think. >>>> ? 2013-6-14 ??10:40?"Zheng Ruan" ??? >>>> >>>> There was once a discussion in http://www.bioxxx.cn. You can find >>>>> relevant info at http://www.bioxxx.cn/thread-2354-1-1.html. However, >>>>> this seems not to be an official translation. I find a copy of it in case >>>>> you don't have permission (send you off-list). >>>>> >>>>> Best, >>>>> Zheng Ruan >>>>> >>>>> >>>>> On Fri, Jun 14, 2013 at 7:24 AM, Yanbo Ye wrote: >>>>> >>>>>> Hi everyone, >>>>>> >>>>>> Here we have some people like to translate the biopython tutorial into >>>>>> Chinese, so that more people in China can use biopython for their >>>>>> research >>>>>> and contribute to biopython. We noticed there is a LaTeX file for this >>>>>> tutorial that we can work on. >>>>>> >>>>>> Before we start, we want to know what is the right way to do this. Is >>>>>> there >>>>>> any previous or ongoing translation project that we can follow? >>>>>> >>>>>> Thanks, >>>>>> Yanbo >>>>>> -- >>>>>> >>>>>> ??? >>>>>> >>>>>> ???????????????? >>>>>> >>>>>> Yanbo Ye >>>>>> >>>>>> Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of >>>>>> Sciences >>>>>> >>>>>> _______________________________________________ >>>>>> Biopython-dev mailing list >>>>>> Biopython-dev at lists.open-bio.org >>>>>> http://lists.open-bio.org/mailman/listinfo/biopython-dev >>>>>> >>>>> >>>>> >>> > -- ??? ???????????????? Yanbo Ye Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences From zruan1991 at gmail.com Sat Jun 15 02:12:56 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Sat, 15 Jun 2013 02:12:56 -0400 Subject: [Biopython-dev] Biopython Tutorial Chinese Translation In-Reply-To: References: Message-ID: Hi, I played with sphinx for a while. Is this what we expected ( http://zruanweb.com/html/Tutorial.html), although there are some issue that needs manual curation. 
I built this using pandoc to convert Doc/Tutorial.tex to reStructuredText
and then used Sphinx to make the HTML. The Sphinx directory can be found at
(https://github.com/zruan/biopython/tree/master/Doc/sphinx). Thanks.

Best,
Ruan

From p.j.a.cock at googlemail.com Sat Jun 15 07:53:17 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 15 Jun 2013 12:53:17 +0100
Subject: [Biopython-dev] [Wg-phyloinformatics] Codon Alignment GSoC Homepage

On Saturday, June 15, 2013, Zheng Ruan wrote:
> Hi all,
>
> Following Peter and Karen's suggestion, I set up my project homepage in
> github (http://zruanweb.com/ ). I also have a first four weeks' plan
> there (http://zruanweb.com/project-timeline.html). Thanks.
>
> Best,
> Ruan

Thank you :)

Peter

From yeyanbo289 at gmail.com Sun Jun 16 02:39:37 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Sun, 16 Jun 2013 14:39:37 +0800
Subject: [Biopython-dev] Post for the first week

Hi Eric, Mark and Jeet,

I posted a blog entry describing the design of the tree construction
module and some work from the first week. Here is the link:
http://blog.yeyanbo.com/posts/google-summer-of-code-2.html

Best,
Yanbo

--
Yanbo Ye
Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of
Sciences

From p.j.a.cock at googlemail.com Sun Jun 16 09:21:14 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 16 Jun 2013 14:21:14 +0100
Subject: [Biopython-dev] Post for the first week

On Sunday, June 16, 2013, Yanbo Ye wrote:
> Hi Eric, Mark and Jeet,
>
> I post a blog describing the design idea of tree construction module and
> some works for the first week. Here is the link:
> http://blog.yeyanbo.com/posts/google-summer-of-code-2.html
>
> Best,
> Yanbo

Thanks for keeping us informed Yanbo :)

This old code might be useful for distance matrices,
https://redmine.open-bio.org/issues/2034

You can also get distance matrices from Bio.Cluster but they may not
make sense as input to a tree building algorithm...

Peter

From eric.talevich at gmail.com Sun Jun 16 13:15:26 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Sun, 16 Jun 2013 13:15:26 -0400
Subject: [Biopython-dev] Progress with ticket 3336
In-Reply-To: <1367961179.88206.YahooMailNeo@web122603.mail.ne1.yahoo.com>
References: <1367961179.88206.YahooMailNeo@web122603.mail.ne1.yahoo.com>

On Tue, May 7, 2013 at 5:12 PM, Nate Sutton wrote:
> Hi,
>
> Here is a progress follow up to
> http://lists.open-bio.org/pipermail/biopython-dev/2013-April/010548.html.
> I have added a commit to the github branch that adds an option to create
> clade branch lines using linecollection. The linecollection objects are
> stored in a tuple before adding them to the plot. It's in
> Bio/Phylo/_utils.py. Is this what the last bullet point was requesting in
> https://redmine.open-bio.org/issues/3336 ?
>
> Thanks!
>
> Nate
>
> P. S. I used a tuple to store the linecollection objects instead of a
> list because that was mentioned in the ticket but if that looks like it
> should be different let me know. Also, I got some global variables to work
> with the code but I was only able to do that after declaring them as
> globals twice. If there are suggestions on how to code that differently
> let me know.

Hi Nate,

I left some comments on your commits in your branch on GitHub. When you're
done, would you mind rebasing to the current master branch and doing a pull
request?

Regarding the global variables, I think you might have to declare them as
such in every new scope where they're used, or not at all, and in this
case, you don't need to declare them as global at all.

Thanks,
Eric

From jmb at ebi.ac.uk Sun Jun 16 16:33:26 2013
From: jmb at ebi.ac.uk (John Berrisford)
Date: Sun, 16 Jun 2013 21:33:26 +0100
Subject: [Biopython-dev] changing PDB file chains
References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> <51BAE6E5.1040401@ebi.ac.uk>
Message-ID: <016201ce6ad0$c1cafa90$4560efb0$@ebi.ac.uk>

Thanks for the advice.

Can you help with how to set a chain using the add method?

Also, the PDBIO writer appears to remove aniso records. Is there any way to
stop it doing this?

Currently my code is:

    pdbFile = PDBParser().get_structure(pdbid, pdbPath)
    waterChain = pdbFile[0]['W']
    newChain = pdbFile[0]['A']
    waterAtom = pdbFile[0]['W']['W', 1031, ' ']['O']
    waterResidue = pdbFile[0]['W']['W', 1031, ' ']
    print waterChain.id
    print waterResidue.get_parent()
    waterResidue.detach_parent() #this bit seems to work
    print waterResidue.get_parent()
    waterResidue.add(pdbid[0]["A"]) #I'm not sure exactly how to get a chainID.
    print waterResidue.get_parent()
    print waterResidue.id

which returns:

    W
    #original parent (chain) of the residue.
    None #detach parent seems to work fine
    Traceback (most recent call last):
      File "water_orig.py", line 53, in
        waterResidue.add(pdbid[0]["A"])
    TypeError: string indices must be integers, not str

My writer commands are:

    pdb_writer = PDBIO()
    pdb_writer.set_structure(pdbFile)
    pdb_writer.save("output_path.pdb")

Regards

John

From: João Rodrigues [mailto:anaryin at gmail.com]
Sent: 14 June 2013 10:50
To: John Berrisford
Cc: David Cain; biopython-dev at lists.open-bio.org
Subject: Re: [Biopython-dev] changing PDB file chains

Hi John,

Actually, David is absolutely right. I didn't really think it through.
You need to move the water atoms to the chain where you want them to be.
So, if they are in chain A and should be in chain B, you need to detach
them from chain A (detach_child method on the residue, easier) and
re-attach them to chain B (add method).

From davidjosephcain at gmail.com Sun Jun 16 22:52:12 2013
From: davidjosephcain at gmail.com (David Cain)
Date: Sun, 16 Jun 2013 22:52:12 -0400
Subject: [Biopython-dev] changing PDB file chains
In-Reply-To: <016201ce6ad0$c1cafa90$4560efb0$@ebi.ac.uk>
References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> <51BAE6E5.1040401@ebi.ac.uk> <016201ce6ad0$c1cafa90$4560efb0$@ebi.ac.uk>

Hi, John.

Your error is that you're using pdbid[0]["A"], where pdbid is a string (not
an instance of PDB.Structure, as you probably expect it to be).

You seem to have the parent detachment down; you just need to properly
attach to a new chain. Calling waterResidue.add(...) will add a child
object to the residue (which should be an atom:
http://biopython.org/DIST/docs/api/Bio.PDB.Residue-pysrc.html#L73).
Instead, you want to call newChain.add(waterResidue).

Just FYI, looking at how Structures are constructed by StructureBuilder
should help you with the mechanics of modifying the SMCRA hierarchy.
That is, StructureBuilder creates a Structure from scratch; if you
understand how a Structure is built, it should make modifying a Structure
trivial!

As for the aniso records, I don't believe the current implementation of
PDBIO can handle them. You could always modify the source code to fit your
needs, though! (I'm sure others would benefit from your changes.)

David Cain

From redmine at redmine.open-bio.org Tue Jun 18 02:06:50 2013
From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org)
Date: Tue, 18 Jun 2013 06:06:50 +0000
Subject: [Biopython-dev] [Biopython - Bug #3435] (New) Pyhlo.draw_ascii() type error

Issue #3435 has been reported by Giulio Valentino Dalla Riva.

----------------------------------------
Bug #3435: Pyhlo.draw_ascii() type error
https://redmine.open-bio.org/issues/3435

Author: Giulio Valentino Dalla Riva
Status: New
Priority: Normal
Assignee:
Category:
Target version:
URL:

I defined a simple tree in a Newick format file (the example in section
13.1 of the tutorial). I read it using
@tree = Phylo.read("simple.dnd", "newick")@
which works.
When I try to use draw_ascii() to draw it I get a TypeError:
@Phylo.draw_ascii(tree)@
produces
@Traceback (most recent call last):
  File "", line 1, in
    Phylo.draw_ascii(tree)
  File "C:\Python32\lib\site-packages\Bio\Phylo\_utils.py", line 253, in draw_ascii
    draw_clade(tree.root, 0)
  File "C:\Python32\lib\site-packages\Bio\Phylo\_utils.py", line 239, in draw_clade
    char_matrix[thisrow][col] = '_'
TypeError: list indices must be integers, not float@

I think the error is related to the fact that in Python 3.x @/@ returns a
float and not an integer.

The problem doesn't occur with Phylo.draw().

P.S. I hope the issue has not already been answered: I didn't find it. I'm
working on Python 3.2 and I tested the issue both with the latest git
release (compiled with mingw) and with the binary on a 32-bit Windows
machine.

----------------------------------------
You have received this notification because this email was added to the New Issue Alert plugin

--
You have received this notification because you have either subscribed to it, or are involved in it.
To change your notification preferences, please click here and login: http://redmine.open-bio.org

From natemsutton at yahoo.com Wed Jun 19 05:00:56 2013
From: natemsutton at yahoo.com (Nate Sutton)
Date: Wed, 19 Jun 2013 02:00:56 -0700 (PDT)
Subject: [Biopython-dev] Progress with ticket 3336
References: <1367961179.88206.YahooMailNeo@web122603.mail.ne1.yahoo.com>
Message-ID: <1371632456.20927.YahooMailNeo@web122602.mail.ne1.yahoo.com>

I appreciate your review of the code; the feedback helps me become better!
I made the pull request here:
https://github.com/biopython/biopython/pull/189 .

I have worked on fixing all the things you commented on and added comments
describing the edits to the new file version at:
https://github.com/nmsutton/biopython/commit/893a6508ad18278b9d5cdb10d3c81c823125c90f
and 2 minor changes at:
https://github.com/nmsutton/biopython/commit/c8215edba1bd796684722372cec3c94bffdafc91

-Nate

P.S. To acknowledge assistance I got from a friend, I added a
co-authored-by line to the commit, having read that this can be helpful for
authors who want to record that they were helped with their code. If you or
anyone else knows a better way to recognize others for providing assistance
with someone's code, let me know.

From zruan1991 at gmail.com Thu Jun 20 18:02:44 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Thu, 20 Jun 2013 18:02:44 -0400
Subject: [Biopython-dev] Codon Alignment for Biopython Project Update

Hi all,

I posted my first update diary for the Codon Alignment project at
http://zruanweb.com/1st-diary.html. The repository for the code is at
https://github.com/zruan/biopython/tree/master/Bio/CodonAlign. I'd be happy
to hear your suggestions. Thanks!

Best,
Ruan

From yeyanbo289 at gmail.com Mon Jun 24 00:57:36 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 24 Jun 2013 12:57:36 +0800
Subject: [Biopython-dev] Post of the second week

Hi guys,

I posted another blog here summarizing my work of the first week and my
plan for this week. Any feedback is welcome.

Thanks,
Yanbo

--
Yanbo Ye
Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of
Sciences

From redmine at redmine.open-bio.org Fri Jun 28 10:31:37 2013
From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org)
Date: Fri, 28 Jun 2013 14:31:37 +0000
Subject: [Biopython-dev] [Biopython - Bug #3436] (New) Fix slicing of SFF objects

Issue #3436 has been reported by Martin Mokrejš.

----------------------------------------
Bug #3436: Fix slicing of SFF objects
https://redmine.open-bio.org/issues/3436

Author: Martin Mokrejš
Status: New
Priority: Normal
Assignee: Peter Cock
Category:
Target version:
URL:

I am chasing a suspected bug in Biopython which happens during slicing. It
seems to be related to some internal cross-checks and NOT to the slicing
range itself. It is likely caused by conflicting quality trim points, whose
check triggers during slicing but NOT during SFF input parsing. But I think
this is a valid use case. Maybe Peter will be faster than me in finding out
what is going on.
$ python
Python 2.7.3 (default, Apr 20 2013, 18:28:22) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import SeqIO
>>> for _record in SeqIO.parse('/tmp/SRR088776_short_F95S1KP01DA6OU.sff', 'sff'):
...     print _record
... 
ID: F95S1KP01DA6OU
Name: F95S1KP01DA6OU
Number of features: 0
/flow_values=(95, 9, 110, 10, 30, 104, 15, 103, 1539, 33, 10, 20, 1501, 19, 8, 10, 1507, 22, 8, 8, 1482, 21, 8, 7, 1402, 22, 7, 11, 1394, 25, 10, 10, 1390, 27, 10, 12, 1333, 32, 14, 12, 1313, 59, 12, 14, 1258, 77, 13, 15, 1152, 91, 16, 16, 1144, 100, 14, 19, 1005, 107, 17, 20, 945, 109, 19, 20, 920, 113, 21, 21, 826, 113, 20, 25, 744, 105, 24, 25, 633, 110, 26, 26, 532, 103, 28, 30, 428, 103, 31, 33, 306, 106, 34, 37, 208, 106, 37, 39, 155, 113, 37, 46, 115, 103, 44, 57, 91, 113, 42, 57, 79, 111, 43, 67, 52, 103, 43, 78, 41, 102, 46, 82, 37, 102, 49, 91, 33, 103, 47, 104, 32, 100, 49, 98, 27, 111, 51, 109, 26, 113, 48, 114, 25, 116, 48, 120, 25, 111, 47, 119, 26, 100, 47, 109, 27, 90, 38, 112, 25, 85, 35, 105, 26, 74, 34, 102, 26, 78, 31, 90, 21, 67, 30, 78, 22, 49, 27, 67, 24, 45, 27, 62, 19, 46, 23, 48, 18, 41, 20, 45, 18, 37, 15, 44, 14, 33, 13, 39, 17, 32, 12, 34, 17, 27, 11, 24, 17, 26, 12, 20, 12, 26, 11, 17, 15, 22, 12, 15, 13, 20, 11, 18, 15, 20, 12, 16, 15, 22, 11, 14, 15, 16, 11, 13, 13, 18, 13, 11, 14, 17, 12, 13, 13, 17, 10, 12, 13, 15, 10, 12, 14, 15, 10, 13, 15, 15, 11, 14, 11, 14, 12, 11, 12, 15, 11, 10, 14, 12, 10, 10, 11, 18, 12, 9, 13, 14, 11, 10, 14, 13, 10, 13, 14, 14, 10, 13, 13, 15, 11, 9, 14, 17, 11, 13, 14, 18, 9, 15, 15, 16, 11, 11, 20, 16, 11, 14, 14, 14, 12, 14, 17, 16, 11, 12, 17, 14, 11, 11, 18, 15, 10, 11, 13, 14, 11, 12, 10, 19, 13, 10, 10, 16, 11, 10, 13, 14, 11, 12, 12, 12, 11, 13, 13, 14, 11, 15, 12, 16, 12, 15, 12, 14, 15, 11, 15, 15, 15, 9, 18, 14, 14, 12, 16, 13, 13, 13, 15, 13, 12, 12, 19, 15, 11, 12, 14, 15, 13, 11, 14, 14, 12, 12, 14, 16, 12, 13, 13, 17, 13, 11, 16, 15, 12, 12, 16, 16, 12, 12, 17, 13, 12, 11)
/flow_index=(1, 2, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 1, 3, 0, 0, 0, 1, 3, 0, 0, 1, 3, 0, 1, 3, 0, 1, 3, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 5, 4, 4, 4, 4, 4, 4, 4, 12, 76)
/flow_chars=TACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACG
/clip_adapter_right=0
/clip_qual_right=239
/clip_qual_left=248
/clip_adapter_left=4
/flow_key=TCAG
Per letter annotation for: phred_quality
Seq('tcagtttttttttttttttnttttttttttttttttttttttttttttttnttt...nnn', DNAAlphabet())
>>> _record[4:]
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/lib64/python2.7/site-packages/Bio/SeqRecord.py", line 460, in __getitem__
    answer._per_letter_annotations[key] = value[index]
  File "/usr/lib64/python2.7/site-packages/Bio/SeqRecord.py", line 78, in __setitem__
    "strings) of length %i." % self._length)
TypeError: We only allow python sequences (lists, tuples or strings) of length 316.
>>> 
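For reference, the record printed above has clip_qual_left=248 but clip_qual_right=239, so the left quality trim point lies beyond the right one. What follows is a minimal sketch, not Biopython's own implementation, of how the four clip annotations might be combined into Python slice bounds. The helper name trim_bounds is hypothetical, and the sketch assumes that a clip value of 0 means "not set"; the length 316 is taken from the error message.

```python
def trim_bounds(annotations, seq_length):
    """Return (start, end) slice bounds from SFF-style clip annotations.

    Hypothetical helper: a clip value of 0 is assumed to mean "not set".
    """
    # The left bound is the larger of the two left clip points (0 if neither is set).
    start = max(annotations.get("clip_qual_left", 0),
                annotations.get("clip_adapter_left", 0))
    # The right bound is the smallest right clip point that is actually set,
    # falling back to the full sequence length when none is set.
    rights = [annotations.get(key, 0)
              for key in ("clip_qual_right", "clip_adapter_right")]
    rights = [value for value in rights if value]
    end = min(rights) if rights else seq_length
    return start, end


# Clip values copied from the record shown above.
clips = {"clip_qual_left": 248, "clip_qual_right": 239,
         "clip_adapter_left": 4, "clip_adapter_right": 0}
start, end = trim_bounds(clips, 316)
if start >= end:
    print("conflicting trim points: start=%d, end=%d" % (start, end))
```

Whether these conflicting values are what triggers the TypeError is not established here; the sketch only makes the conflict visible before any slicing is attempted.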
Um, this was biopython 1.59. I just installed 1.61 but it is the same:

        _new = record[lval:]
      File "/usr/lib64/python2.7/site-packages/Bio/SeqRecord.py", line 461, in __getitem__
        answer._per_letter_annotations[key] = value[index]
      File "/usr/lib64/python2.7/site-packages/Bio/SeqRecord.py", line 79, in __setitem__
        "strings) of length %i." % self._length)
    TypeError: We only allow python sequences (lists, tuples or strings) of length 316.

From redmine at redmine.open-bio.org Fri Jun 28 10:49:38 2013
From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org)
Date: Fri, 28 Jun 2013 14:49:38 +0000
Subject: [Biopython-dev] [Biopython - Bug #3437] (New) SeqIO.write(): Do not write broken data for empty objects

Issue #3437 has been reported by Martin Mokrejš.

----------------------------------------
Bug #3437: SeqIO.write(): Do not write broken data for empty objects
https://redmine.open-bio.org/issues/3437

Author: Martin Mokrejš
Status: New
Priority: Normal
Assignee:
Category:
Target version:
URL:

While slicing SFF objects I have several times tripped over the case where
the left trim point is defined but the right trim point is not. Doing a
slice like _record[_lval:_rval] then, not surprisingly, results in an empty
object (e.g. _record[4:0]). I think the "SFF object" could be smart enough
to do just [4:] on my behalf. Alternatively, I would prefer a Warning
message.

While you might disagree with both proposals above, you might change your
view if you think about writing sliced objects into an output file, for
example in FASTQ, FASTA or QUAL format. These objects with an empty
sequence are written into the files, but of course only the FASTA/FASTQ
header line is written and nothing for the sequence or the qualities
themselves. So the output FASTA, QUAL or FASTQ file is actually broken, and
we must do something about this anyway. And users tend to forget, so at
least I am likely to trip over this again after a few months. ;-)

To recapitulate what I propose:

First, I would just replace [4:0] with [4:] in the example mentioned.

Second, SeqIO.write() must definitely check for a non-zero length of the
object's sequence.

This was on biopython-1.59 but also happens on 1.61.

From redmine at redmine.open-bio.org Fri Jun 28 10:56:51 2013
From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org)
Date: Fri, 28 Jun 2013 14:56:51 +0000
Subject: [Biopython-dev] [Biopython - Feature #3438] (New) Allow modifications of sequence in SFF objects

Issue #3438 has been reported by Martin Mokrejš.

----------------------------------------
Feature #3438: Allow modifications of sequence in SFF objects
https://redmine.open-bio.org/issues/3438

Author: Martin Mokrejš
Status: New
Priority: Normal
Assignee:
Category:
Target version:
URL:

I find it a bit awkward, and burdened with unnecessary overhead, to edit a
sequence in an SFF object. I am fine with the requirement that the length
of the sequence must stay the same as the length of the qualities and other
annotation lists. However, to edit a sequence I currently have to do:
    if _was_modified:
        _letter_annotations = _record.letter_annotations
        _annotations = _record.annotations
        _record.letter_annotations = {}
        _record.annotations = {}
        _record.seq = Seq(_sequence, generic_dna)
        _record.letter_annotations = _letter_annotations
        _record.annotations = _annotations

        _new_record = SeqRecord(Seq(_sequence, generic_dna), id=_record.id, name=_record.name, description=_record.description, annotations=_record.annotations, letter_annotations=_record.letter_annotations)

        _wrote = SeqIO.write(_new_record, _fh, 'sff')
    else:
        _wrote = SeqIO.write(_record, _fh, 'sff')
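The backup-and-restore steps above could be wrapped in a small helper. What follows is only a sketch of what such a convenience function might do; the name `replace_sequence` is hypothetical and is not part of Biopython. It assumes a record object with `seq` and `letter_annotations` attributes and assumes, as the report describes, that the per-letter annotations have to be emptied before a new sequence can be assigned.

```python
def replace_sequence(record, new_seq):
    """Swap in a same-length sequence, keeping per-letter annotations.

    Hypothetical helper, not a Biopython API. `record` is any object with
    `seq` and `letter_annotations` attributes, such as a SeqRecord.
    """
    if len(new_seq) != len(record.seq):
        raise ValueError("new sequence must be the same length as the old one")
    # Back up the per-letter annotations (e.g. qualities), then clear them
    # so that the sequence can be reassigned.
    saved = dict(record.letter_annotations)
    record.letter_annotations = {}
    record.seq = new_seq
    # Restore the annotations; they still fit because the length is unchanged.
    record.letter_annotations = saved
    return record
```

With such a helper the modified branch above would reduce to something like `replace_sequence(_record, Seq(_sequence, generic_dna))` followed by the usual `SeqIO.write(_record, _fh, 'sff')`.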
The whole work in backup&restore of annotation lists is not necessary in my eyes. I think providing _record.rewrite_sequence('tcagnnnnnnnn') would be quite helpful here. ---------------------------------------- You have received this notification because this email was added to the New Issue Alert plugin -- You have received this notification because you have either subscribed to it, or are involved in it. To change your notification preferences, please click here and login: http://redmine.open-bio.org From zruan1991 at gmail.com Fri Jun 28 17:10:07 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Fri, 28 Jun 2013 17:10:07 -0400 Subject: [Biopython-dev] Week2 Update for Codon Alignment for Biopython Project Message-ID: Hi all, I wrote a diary for the progress of my project in week 2 ( http://zruanweb.com/). Thanks for your feedback and suggestions. Best, Ruan From redmine at redmine.open-bio.org Fri Jun 28 18:12:59 2013 From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org) Date: Fri, 28 Jun 2013 22:12:59 +0000 Subject: [Biopython-dev] [Biopython - Bug #3439] (New) SwissProt parser breaks when parsing '[' crossref Message-ID: Issue #3439 has been reported by Iddo Friedberg. ---------------------------------------- Bug #3439: SwissProt parser breaks when parsing '[' crossref https://redmine.open-bio.org/issues/3439 Author: Iddo Friedberg Status: New Priority: Normal Assignee: Category: Target version: URL: It seems the SwissProt parser treats opening square brackets as comments in the cross-reference records. So if there is a '[' in the freetext, everything after that does not get parsed. Seems like the relevant function is "_read_dr" line 491 in BioSwissProt/__init__.py Thanks, Iddo ---------------------------------------- You have received this notification because this email was added to the New Issue Alert plugin -- You have received this notification because you have either subscribed to it, or are involved in it. 
To change your notification preferences, please click here and login: http://redmine.open-bio.org From p.j.a.cock at googlemail.com Fri Jun 28 20:17:53 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Sat, 29 Jun 2013 01:17:53 +0100 Subject: [Biopython-dev] Fwd: [biopython] MeltingTemp completely rewritten and extended (#192) In-Reply-To: References: Message-ID: Sebastian, could you comment on or review this? Thanks, Peter ---------- Forwarded message ---------- From: *Markus Piotrowski* Date: Saturday, June 29, 2013 Subject: [biopython] MeltingTemp completely rewritten and extended (#192) To: biopython/biopython More or less completely rewritten and largely extended. 1. Three different Tm calculations: one 'rule of thumb' (Tm_Wallace), one using approximative formulas basing on GC content (Tm_GC) and one using nearest neighbor calculations (Tm_NN). 2. The new Tm_NN allows the usage of different thermodynamic datasets (8 tables are included for Watson-Crick base-pairing) and includes tables for mismatches (including inosine) and dangling ends. The datasets are Python dictionaries; the user can use his own datasets or change/update existing tables for his needs. 3. Seven different formulas to correct for salt concentration, including correction for Mg2+ ions (method salt_correction). 4. Method chem_correction which allows for Tm correction when using DMSO and formaldehyde. 
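As an illustration of the simplest of the three approaches listed above, the 'rule of thumb' commonly called the Wallace rule estimates Tm as 2 degrees C per A/T plus 4 degrees C per G/C. The function below is a standalone sketch of that formula only; it is not the code from the pull request, and the name `wallace_tm` is made up for this example.

```python
def wallace_tm(seq):
    """Estimate melting temperature (deg C) with the Wallace rule of thumb.

    Tm = 2 * (A + T) + 4 * (G + C); only sensible for short oligos.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("expected only A, C, G and T")
    return 2 * at + 4 * gc


# 8 A/T and 8 G/C bases: 2 * 8 + 4 * 8
print(wallace_tm("ACGTTGCAATGCCGTA"))  # prints 48
```

The GC-content and nearest-neighbor methods described in items 1 and 2 need salt concentrations and thermodynamic tables, so they are not sketched here.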
------------------------------ You can merge this Pull Request by running git pull https://github.com/MarkusPiotrowski/biopython MeltingTemp Or view, comment on, or merge it at: https://github.com/biopython/biopython/pull/192 Commit Summary - MeltingTemp completely rewritten and extended File Changes - *M* Bio/SeqUtils/MeltingTemp.py(1107) Patch Links: - https://github.com/biopython/biopython/pull/192.patch - https://github.com/biopython/biopython/pull/192.diff From lomereiter at gmail.com Sat Jun 29 10:39:05 2013 From: lomereiter at gmail.com (Artem Tarasov) Date: Sat, 29 Jun 2013 18:39:05 +0400 Subject: [Biopython-dev] CFFI bindings for sambamba Message-ID: Hello BioPython, As you may know, during the previous GSoC I wrote a library for working with SAM/BAM in D. Only recently the language gained shared library support on Linux. And at about the same time PyPy 2.0 was released. I had some spare time last few days and made bindings for the library. At the time they are a bit incomplete and lack tests and documentation. They work on Linux with PyPy 2.0 or CPython 2.*. On PyPy, the performance is not worse than that of PySam on CPython, thanks to the powerful JIT. Features so far: * BAM reader and writer, both can use multiple threads for (de-)compression * Random access, creating BAI index * Fast pileup engine (optionally uses MD tags to determine reference bases) The repository is at https://github.com/lomereiter/sambamba-bindings Again, currently only Linux is supported. Bug-hunting and pull requests with tests and docs are much appreciated. -- Artem From zruan1991 at gmail.com Sat Jun 29 18:29:14 2013 From: zruan1991 at gmail.com (Zheng Ruan) Date: Sat, 29 Jun 2013 18:29:14 -0400 Subject: [Biopython-dev] Fwd: Week2 Update for Codon Alignment for Biopython Project In-Reply-To: References: Message-ID: Hi all, It seems my email yesterday had not been reached to the biopython-dev mailing list. So I try to send it again. 
I wrote a diary for the progress of the Codon Alignment project in week 2 ( http://zruanweb.com/). Thanks for your feedback and suggestions. Since I'll be traveling from July 1st to 3rd, I anticipate fewer updates next week. Thanks. Best, Ruan From p.j.a.cock at googlemail.com Tue Jun 4 17:29:55 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Tue, 4 Jun 2013 18:29:55 +0100 Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs Message-ID: Hi Bartek, I'm hoping you or Michiel can investigate this issue, http://www.biostars.org/p/73500/ I believe Ivan has correctly diagnosed a Biopython issue in the alphabet handling of the motif class on this BioStars question, and he's given a workaround. The problem code looks like this:
    if self.alphabet!=IUPAC.unambiguous_dna:
        raise ValueError("Wrong alphabet! Use only with DNA motifs")
First, assuming the test is really for just IUPAC unambiguous DNA, the error message is misleading - it sounds like using generic_dna or IUPAC ambiguous DNA would be acceptable but it isn't. The core problem here is that IUPAC.unambiguous_dna is just one instance of the IUPACUnambiguousDNA() class, and other instances should be equally acceptable but will fail the equality. I have sometimes wondered if we could and should make some of the Alphabet objects into singletons (only one instance allowed), which might be one way to solve this issue. Alternatively, perhaps all we need to do here is see if the alphabet is DNA and which letter set it uses? Is that the key point for the matrix calculations etc? e.g.
from Bio.Alphabet import _get_base_alphabet, DNAAlphabet if not isinstance(_get_base_alphabet(self.alphabet), DNAAlphabet): raise ValueError("This only works for DNA motifs") if not self.alphabet.letters == unambiguous_dna.letters: raise ValueError("Expected IUPAC.unambiguous_dna or similar") (Untested, and these suggested error messages need some work) Regards, Peter From clements at galaxyproject.org Tue Jun 4 19:29:57 2013 From: clements at galaxyproject.org (Dave Clements) Date: Tue, 4 Jun 2013 12:29:57 -0700 Subject: [Biopython-dev] GCC2013 Regular Registration Closes June 14 Message-ID: Hello all, This is the final registration reminder for the 2013 Galaxy Community Conference (GCC2013), being held in Oslo, 30 June through July 2. GCC2013 is a great opportunity to share best practices and network with other researchers who are also facing the challenges of data-intensive biology. Registration closes June 14*, ten days from today. Register now and guarantee your spot in the Training Day sessions you want to take.* *Registration is still a bargain with the full 3-day registration starting at ~ ?165 for post-docs and students (or just ?55 per day). The program features 15 Training Day sessions in 5 tracks on 12 different topics, 25 Talks on topics ranging from Reproducibility to Exploiting Galaxy , 23 Posters (and counting), 2 Lightning Talk sessions, and a end-of-conference event at an historic venue high above Oslo. Ser frem til ? se deg i Oslo! GCC2013 Organizing Committee PS: Please help get the word out . * Not June 7, as had been stated earlier in several places. 
-- http://galaxyproject.org/GCC2013 http://galaxyproject.org/ http://getgalaxy.org/ http://usegalaxy.org/ http://wiki.galaxyproject.org/ From mjldehoon at yahoo.com Wed Jun 5 02:28:28 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Tue, 4 Jun 2013 19:28:28 -0700 (PDT) Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs In-Reply-To: References: Message-ID: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com> Hi Peter, I have never quite understood why we need a separate class for each alphabet. I would think that a single alphabet class (or maybe a DNA, an RNA, and a protein alphabet class) is sufficient, and that the specific alphabets are instances of this class. Also, alphabets are essentially sets of letters, so an Alphabet class should inherit from set, allowing us to use its associated methods to compare alphabets to each other. Best, -Michiel. ________________________________ From: Peter Cock To: Bartek Wilczynski ; Michiel de Hoon Cc: Biopython-Dev Mailing List Sent: Wednesday, June 5, 2013 2:29 AM Subject: Alphabet bug in Bio.Motif and Bio.motifs Hi Bartek, I'm hoping you or Michiel can investigate this issue, http://www.biostars.org/p/73500/ I believe Ivan has correctly diagnosed a Biopython issue in the alphabet handling of the motif class on this BioStars question, and he's given a workaround. The problem code looks like this:
    if self.alphabet!=IUPAC.unambiguous_dna:
        raise ValueError("Wrong alphabet! Use only with DNA motifs")
First, assuming the test is really for just IUPAC unambiguous DNA, the error message is misleading - it sounds like using generic_dna or IUPAC ambiguous DNA would be acceptable but it isn't. The core problem here is that IUPAC.unambiguous_dna is just one instance of the IUPACUnambiguousDNA() class, and other instances should be equally acceptable but will fail the equality.
I have sometimes wondered if we could and should make some of the Alphabet objects into singletons (only one instance allowed), which might be one way to solve this issue. Alternatively, perhaps all we need to do here is see if the alphabet is DNA and which letter set it uses? Is that the key point for the matrix calculations etc? e.g.
    from Bio.Alphabet import _get_base_alphabet, DNAAlphabet

    if not isinstance(_get_base_alphabet(self.alphabet), DNAAlphabet):
        raise ValueError("This only works for DNA motifs")
    if not self.alphabet.letters == unambiguous_dna.letters:
        raise ValueError("Expected IUPAC.unambiguous_dna or similar")
(Untested, and these suggested error messages need some work) Regards, Peter From redmine at redmine.open-bio.org Wed Jun 5 03:31:25 2013 From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org) Date: Wed, 5 Jun 2013 03:31:25 +0000 Subject: [Biopython-dev] [Biopython - Bug #3434] (New) PDB.PDBParser Message-ID: Issue #3434 has been reported by Mirslaw Syzdek. ---------------------------------------- Bug #3434: PDB.PDBParser https://redmine.open-bio.org/issues/3434 Author: Mirslaw Syzdek Status: New Priority: Normal Assignee: Category: Target version: URL: Two months ago I downloaded a PDB file from NCBI (Database: Structure, Name: 1EZQ). With this file the following code works fine:
    parser = PDB.PDBParser()
    struct = parser.get_structure('1EZQ.pdb', '1EZQ.pdb')
    ppb = PDB.PPBuilder()
    peptides = ppb.build_peptides(struct)
A few days ago I downloaded the PDB file again. With the new file the above code stopped working: the PDBParser throws an error. The error is caused by different column separation (see line 423 in the attached files). ---------------------------------------- You have received this notification because this email was added to the New Issue Alert plugin -- You have received this notification because you have either subscribed to it, or are involved in it.
To change your notification preferences, please click here and login: http://redmine.open-bio.org From barwil at gmail.com Wed Jun 5 08:13:00 2013 From: barwil at gmail.com (Bartek Wilczynski) Date: Wed, 5 Jun 2013 10:13:00 +0200 Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs In-Reply-To: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com> References: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com> Message-ID: I'm a bit out of the loop here, but to me it seems like a simple issue: Why not change the problematic code:
    if self.alphabet!=IUPAC.unambiguous_dna:
        raise ValueError("Wrong alphabet! Use only with DNA motifs")
into:
    if type(self.alphabet)!=type(IUPAC.unambiguous_dna):
        raise ValueError("Wrong alphabet! Use only with DNA motifs")
and worry about fixing the Bio.Alphabet issues later (it does sound reasonable to make sure that any alphabet instance is a singleton). best Bartek On Wed, Jun 5, 2013 at 4:28 AM, Michiel de Hoon wrote: > Hi Peter, > > I have never quite understood why we need a separate class for each > alphabet. > I would think that a single alphabet class (or maybe a DNA, an RNA, and a > protein alphabet class) is sufficient, and that the specific alphabets are > instances of this class. > Also, alphabets are essentially sets of letters, so an Alphabet class should > inherit from set, allowing us to use its associated methods to compare > alphabets to each other. > > Best, > -Michiel. > > > ________________________________ > From: Peter Cock > To: Bartek Wilczynski ; Michiel de Hoon > > Cc: Biopython-Dev Mailing List > Sent: Wednesday, June 5, 2013 2:29 AM > Subject: Alphabet bug in Bio.Motif and Bio.motifs > > Hi Bartek, > > I'm hoping you or Michiel can investigate this issue, > http://www.biostars.org/p/73500/ > > I believe Ivan has correctly diagnosed a Biopython issue in the alphabet > handling of the motif class on this BioStars question, and he's given a > workaround.
The problem code looks like this: > > if self.alphabet!=IUPAC.unambiguous_dna: > raise ValueError("Wrong alphabet! Use only with DNA motifs") > > First, assuming the test is really for just IUPAC unambiguous DNA, > the error message is misleading - it sounds like using generic_dna > or IUPAC ambiguous DNA would be acceptable but it isn't. > > The core problem here is that IUPAC.unambiguous_dna is just > one instance of the IUPACUnambiguousDNA() class, and other > instances should be equally acceptable but will fail the equality. > > I have sometimes wondered if we could and should make some of > the Alphabet objects into singletons (only one instance allowed), > which might be one way to solve this issue. > > Alternatively, perhaps all we need is to here is see if the alphabet > is DNA and which letter set it uses? Is that the key point for the matrix > calculations etc? e.g. > > from Bio.Alphabet import _get_base_alphabet, DNAAlphabet > > if not isinstance(_get_base_alphabet(self.alphabet), DNAAlphabet): > raise ValueError("This only works for DNA motifs") > if not self.alphabet.letters == unambiguous_dna.letters: > raise ValueError("Expected IUPAC.unambiguous_dna or similar") > > (Untested, and these suggested error messages need some work) > > Regards, > > Peter > > -- Bartek Wilczynski ================== Institute of Informatics University of Warsaw http://www.mimuw.edu.pl/~bartek From p.j.a.cock at googlemail.com Wed Jun 5 09:32:11 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 5 Jun 2013 10:32:11 +0100 Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs In-Reply-To: References: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com> Message-ID: On Wed, Jun 5, 2013 at 9:13 AM, Bartek Wilczynski wrote: > I'm a bit out of the loop here, but to me it seems like a simple issue: > > Why not change the problematic code: > > if self.alphabet!=IUPAC.unambiguous_dna: > raise ValueError("Wrong alphabet! 
Use only with DNA motifs") > > into: > > if type(self.alphabet)!=type(IUPAC.unambiguous_dna): > raise ValueError("Wrong alphabet! Use only with DNA motifs") > > and worry about fixing the Bio.Alphabet issues later (it does sound > reasonable to make sure that any alphabet instance is a singleton). > > best > Bartek I would prefer a more duck-typing approach (is it DNA? Does it use the expected set of letters?), but that sounds practical. Could you try using isinstance instead though (see PEP8), and then make that fix with a new unit test based on the original query please? > On Wed, Jun 5, 2013 at 4:28 AM, Michiel de Hoon wrote: >> Hi Peter, >> >> I have never quite understood why we need a separate class for each >> alphabet. >> I would think that a single alphabet class (or maybe a DNA, an RNA, and a >> protein alphabet class) is sufficient, and that the specific alphabets are >> instances of this class. >> Also, alphabets are essentially sets of letters, so an Alphabet class should >> inherit from set, allowing us to use its associated methods to compare >> alphabets to each other. >> >> Best, >> -Michiel. I wouldn't want to subclass sets due to the fact that in many existing uses of the alphabets the order of the letters is important (and this is not specified in a Python set). But I agree that a rationalised alphabet system like that could work better. Here equality testing could be on both being the same type, e.g. DNA, and having the same letters - including special letters for gaps or stop codons (which are the nastiest part of the current alphabet object system)? 
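The equality rule suggested above (same molecule type and same letters, with letter order preserved) can be illustrated with a toy class. This is a minimal sketch under stated assumptions, not the Bio.Alphabet implementation: `SketchAlphabet`, its `molecule` attribute and the example letter strings are all invented for this illustration.

```python
class SketchAlphabet:
    """Toy alphabet: a molecule type plus an ordered string of letters."""

    def __init__(self, molecule, letters):
        self.molecule = molecule  # e.g. "DNA", "RNA" or "protein"
        self.letters = letters    # a string, so order is kept (unlike a set)

    def __eq__(self, other):
        if not isinstance(other, SketchAlphabet):
            return NotImplemented
        return (self.molecule, self.letters) == (other.molecule, other.letters)

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so define it explicitly.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        return hash((self.molecule, self.letters))


a = SketchAlphabet("DNA", "GATC")
b = SketchAlphabet("DNA", "GATC")  # a second, distinct instance
ambiguous = SketchAlphabet("DNA", "GATCRYWSMKHBVDN")

print(a is b)          # False: two separate objects
print(a == b)          # True: same type and same letters
print(a != ambiguous)  # True: same type but different letters
```

With value-based equality like this, a check such as `self.alphabet != IUPAC.unambiguous_dna` would no longer reject a second instance carrying the same letters, which is the failure reported in this thread.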
Peter From mjldehoon at yahoo.com Wed Jun 5 10:12:38 2013 From: mjldehoon at yahoo.com (Michiel de Hoon) Date: Wed, 5 Jun 2013 03:12:38 -0700 (PDT) Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs In-Reply-To: References: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com> Message-ID: <1370427158.91948.YahooMailNeo@web164001.mail.gq1.yahoo.com> > I wouldn't want to subclass sets due to the fact that in many > existing uses of the alphabets the order of the letters is > important (and this is not specified in a Python set). OK, then indeed a set wouldn't be appropriate. > But I agree that a rationalised alphabet system like that could > work better. Here equality testing could be on both being the > same type, e.g. DNA, and having the same letters - including > special letters for gaps or stop codons (which are the nastiest > part of the current alphabet object system)? I guess that it depends on how the alphabet is used. For example, for the example in the bug report the order of the letters doesn't matter, but for other cases it may matter. Personally I almost never use alphabets. Can anybody give some real-life examples of how they are used? Best, -Michiel ________________________________ From: Peter Cock To: Bartek Wilczynski Cc: Michiel de Hoon ; Biopython-Dev Mailing List Sent: Wednesday, June 5, 2013 6:32 PM Subject: Re: Alphabet bug in Bio.Motif and Bio.motifs On Wed, Jun 5, 2013 at 9:13 AM, Bartek Wilczynski wrote: > I'm a bit out of the loop here, but to me it seems like a simple issue: > > Why not change the problematic code: > > if self.alphabet!=IUPAC.unambiguous_dna: > raise ValueError("Wrong alphabet! Use only with DNA motifs") > > into: > > if type(self.alphabet)!=type(IUPAC.unambiguous_dna): > raise ValueError("Wrong alphabet! Use only with DNA motifs") > > and worry about fixing the Bio.Alphabet issues later (it does sound > reasonable to make sure that any alphabet instance is a singleton).
> > best > Bartek I would prefer a more duck-typing approach (is it DNA? Does it use the expected set of letters?), but that sounds practical. Could you try using isinstance instead though (see PEP8), and then make that fix with a new unit test based on the original query please? > On Wed, Jun 5, 2013 at 4:28 AM, Michiel de Hoon wrote: >> Hi Peter, >> >> I have never quite understood why we need a separate class for each >> alphabet. >> I would think that a single alphabet class (or maybe a DNA, an RNA, and a >> protein alphabet class) is sufficient, and that the specific alphabets are >> instances of this class. >> Also, alphabets are essentially sets of letters, so an Alphabet class should >> inherit from set, allowing us to use its associated methods to compare >> alphabets to each other. >> >> Best, >> -Michiel. I wouldn't want to subclass sets due to the fact that in many existing uses of the alphabets the order of the letters is important (and this is not specified in a Python set). But I agree that a rationalised alphabet system like that could work better. Here equality testing could be on both being the same type, e.g. DNA, and having the same letters - including special letters for gaps or stop codons (which are the nastiest part of the current alphabet object system)? Peter From p.j.a.cock at googlemail.com Wed Jun 5 10:29:58 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Wed, 5 Jun 2013 11:29:58 +0100 Subject: [Biopython-dev] Alphabet bug in Bio.Motif and Bio.motifs In-Reply-To: <1370427158.91948.YahooMailNeo@web164001.mail.gq1.yahoo.com> References: <1370399308.72906.YahooMailNeo@web164001.mail.gq1.yahoo.com> <1370427158.91948.YahooMailNeo@web164001.mail.gq1.yahoo.com> Message-ID: On Wed, Jun 5, 2013 at 11:12 AM, Michiel de Hoon wrote: >> I wouldn't want to subclass sets due to the fact that in many >> existing uses of the alphabets the order of the letters is >> important (and this is not specified in a Python set). 
> > OK, then indeed a set wouldn't be appropriate. > >> But I agree that a rationalised alphabet system like that could >> work better. Here equality testing could be on both being the >> same type, e.g. DNA, and having the same letters - including >> special letters for gaps or stop codons (which are the nastiest >> part of the current alphabet object system)? > > I guess that it depends on how the alphabet is used. For example, for > the example in the bug report the order of the letters doesn't matter, > but for other cases it may matter. What is the motif class doing that restricts it to IUPAC unambiguous DNA? Rather than any DNA alphabet, such as ambiguous DNA, or mixed case sequences? > Personally I almost never use > alphabets. Can anybody give some real-life examples of how they > are used? The generic aim is to label Seq objects as either DNA, RNA or protein (and restrict operations like additions or translation accordingly). That doesn't need the letter level information. Validating that sequences use the expected letters only (e.g. if sending to a tool which does not understand U as a protein, or if writing to a restricted file format). I think the NEXUS code has this kind of constraint. Counting amino acid or nucleotide frequencies - even if your example proteins happens to lack proline, you'd probably want to consider it in your list of amino acids. Depending on your data structure that could be important (while a consistent order may or may not matter, e.g. array indexing). Peter From yeyanbo289 at gmail.com Thu Jun 6 15:58:46 2013 From: yeyanbo289 at gmail.com (Yanbo Ye) Date: Thu, 6 Jun 2013 23:58:46 +0800 Subject: [Biopython-dev] GSOC Project Introduction Message-ID: Hi everyone, I'm Yanbo Ye. I'm happy that I was accepted by NESCent for this year's GSOC and that I can contribute to Biopython through this project. I will work on two phylogenetic modules of the Phylo package: tree construction and consensus tree searching. 
To share my project progress, as Peter Cock suggested, I have set up a blog on GitHub. Here is my first introduction post. Cheers, Yanbo -- Yanbo Ye Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences From jmb at ebi.ac.uk Thu Jun 13 19:01:39 2013 From: jmb at ebi.ac.uk (John Berrisford) Date: Thu, 13 Jun 2013 20:01:39 +0100 Subject: [Biopython-dev] changing PDB file chains Message-ID: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> Hi I'm trying to use biopython to update a PDB file. I'm trying to update the chain ID of a series of waters in a PDB file. I have the original chain ID, new chain ID and water residue number in an mmcif file which I parse using a separate parser. Then for each water I have in the mmcif file I want to update the chain ID from the cif file. I then want to write out the updated water line (to test it works) or write out the updated PDB file. Is this possible with biopython? Regards John From davidjosephcain at gmail.com Thu Jun 13 19:17:18 2013 From: davidjosephcain at gmail.com (David Cain) Date: Thu, 13 Jun 2013 15:17:18 -0400 Subject: [Biopython-dev] changing PDB file chains In-Reply-To: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> Message-ID: Yes, John, it's possible! You'll first want to modify the parsed structure. Add your water molecules to the desired chain (removing from the old, of course). To actually do this, you may want to look at the source ( http://biopython.org/DIST/docs/api/Bio.PDB-module.html), specifically how the SMCRA hierarchy is constructed. Once you've modified your Structure (say it's in a variable `struct`), you should create an instance of PDBIO(), then save your structure like so:
    pdb_writer = PDB.PDBIO()
    pdb_writer.set_structure(struct)
    pdb_writer.save("output_path.pdb")
Do note that PDBIO has some limitations (e.g. it cannot write out PDB header data). It should probably suffice for your needs, though.
If you're not able to figure it out, feel free to email me back (preferably with your code!) and I can help you out. StackOverflow works particularly well for me, if you're amenable to that. David Cain +1 (339) 222 4452 On Thu, Jun 13, 2013 at 3:01 PM, John Berrisford wrote: > Hi > > > > I'm trying to use biopython to update a PDB file. > > > > I'm trying to update the chain ID of a series of waters in a PDB file. I > have the original chain ID, new chain ID and water residue number in an > mmcif file which I parse using a separate parser. Then for each water I > have > in the mmcif file I want to update the chain ID from the cif file. > > I then want to write out the updated water line (to test it works) or write > out the updated PDB file. > > > > Is this possible with biopython? > > > > Regards > > > > John > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From anaryin at gmail.com Fri Jun 14 09:05:41 2013 From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=) Date: Fri, 14 Jun 2013 11:05:41 +0200 Subject: [Biopython-dev] changing PDB file chains In-Reply-To: References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> Message-ID: Hi, If you simply want to update ids, you can just change them (chain.id = newvalue) and then output the structure like David suggested. No need to remove/add atoms. If you wish to play with the structure then you should modify the SMCRA hierarchy indeed. Cheers, João 2013/6/13 David Cain > Yes, John, it's possible! > > You'll first want to modify the parsed structure. Add your water molecules > to the desired chain (removing from the old, of course). To actually do > this, you may want to look at the source ( > http://biopython.org/DIST/docs/api/Bio.PDB-module.html), specifically how > the SMCRA hierarchy is constructed.
> > Once you've modified your Structure (say it's in a variable `struct`), you > should create an instance of PDBIO(), then save your structure like so: > > pdb_writer = PDB.PDBIO() > pdb_writer.set_structure(struct) > pdb_writer.save("output_path.pdb") > > Do note that PDBIO has some limitations (e.g. it cannot write out PDB header > data). It should probably suffice for your needs, though. > > If you're not able to figure it out, feel free to email me back (preferably > with your code!) and I can help you out. > StackOverflow works > particularly well for me, if you're amenable to that. > > > > David Cain > +1 (339) 222 4452 > > > On Thu, Jun 13, 2013 at 3:01 PM, John Berrisford wrote: > > > Hi > > > > > > > > I'm trying to use biopython to update a PDB file. > > > > > > > > I'm trying to update the chain ID of a series of waters in a PDB file. I > > have the original chain ID, new chain ID and water residue number in an > > mmcif file which I parse using a separate parser. Then for each water I > > have > > in the mmcif file I want to update the chain ID from the cif file. > > > > I then want to write out the updated water line (to test it works) or > write > > out the updated PDB file. > > > > > > > > Is this possible with biopython? > > > > > > > > Regards > > > > > > > > John > > > > _______________________________________________ > > Biopython-dev mailing list > > Biopython-dev at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython-dev > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython-dev > From jmb at ebi.ac.uk Fri Jun 14 09:48:21 2013 From: jmb at ebi.ac.uk (John Berrisford) Date: Fri, 14 Jun 2013 10:48:21 +0100 Subject: [Biopython-dev] changing PDB file chains In-Reply-To: References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> Message-ID: <51BAE6E5.1040401@ebi.ac.uk> Hi João and David Thank you for the help.
The part that confused me is how to change the chain ID for a specific water. E.g. I can select a water with
    atom = pdbFile[0]['W']['W', 1031, ' ']['O']
or maybe
    residue = pdbFile[0]['W']['W', 1031, ' ']
Now, how do I update the chain ID for this water? Alternatively I can select a water with
    for model in pdbFile:
        for chain in model:
            for residue in chain:
                if residue.id[0] == 'W':
                    if residue.id[1] == 1031:
I presume that I can then do...
    chain.id = 'A'
Will this change the chain ID for this specific water only, or for all atoms in the chain? Regards John On 14/06/13 10:05, João Rodrigues wrote: > Hi, > > If you simply want to update ids, you can just change them (chain.id > = newvalue) and then output the structure like David > suggested. No need to remove/add atoms. If you wish to play with the > structure then you should modify the SMCRA hierarchy indeed. > > Cheers, > > João > > > 2013/6/13 David Cain > > > Yes, John, it's possible! > > You'll first want to modify the parsed structure. Add your water > molecules > to the desired chain (removing from the old, of course). To > actually do > this, you may want to look at the source ( > http://biopython.org/DIST/docs/api/Bio.PDB-module.html), > specifically how > the SMCRA hierarchy is constructed. > > Once you've modified your Structure (say it's in a variable > `struct`), you > should create an instance of PDBIO(), then save your structure > like so: > > pdb_writer = PDB.PDBIO() > pdb_writer.set_structure(struct) > pdb_writer.save("output_path.pdb") > > Do note that PDBIO has some limitations (e.g. it cannot write out > PDB header > data). It should probably suffice for your needs, though. > > If you're not able to figure it out, feel free to email me back > (preferably > with your code!) and I can help you out. > StackOverflow works > particularly well for me, if you're amenable to that.
> > > > David Cain > +1 (339) 222 4452 > > > On Thu, Jun 13, 2013 at 3:01 PM, John Berrisford > wrote: > > > Hi > > > > > > > > I'm trying to use biopython to update a PDB file. > > > > > > > > I'm trying to update the chain ID of a series of waters in a PDB > file. I > > have the original chain ID, new chain ID and water residue > number in an > > mmcif file which I parse using a separate parser. Then for each > water I > > have > > in the mmcif file I want to update the chain ID from the cif file. > > > > I then want to write out the updated water line (to test it > works) or write > > out the updated PDB file. > > > > > > > > Is this possible with biopython? > > > > > > > > Regards > > > > > > > > John > > > > _______________________________________________ > > Biopython-dev mailing list > > Biopython-dev at lists.open-bio.org > > > http://lists.open-bio.org/mailman/listinfo/biopython-dev > > > _______________________________________________ > Biopython-dev mailing list > Biopython-dev at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython-dev > > -- John Berrisford PDBe EMBL-EBI Wellcome Trust Genome Campus Hinxton, Cambridge CB10 1SD Tel: 01223 492529 http://www.facebook.com/proteindatabank http://twitter.com/PDBeurope From anaryin at gmail.com Fri Jun 14 09:50:24 2013 From: anaryin at gmail.com (=?UTF-8?Q?Jo=C3=A3o_Rodrigues?=) Date: Fri, 14 Jun 2013 11:50:24 +0200 Subject: [Biopython-dev] changing PDB file chains In-Reply-To: <51BAE6E5.1040401@ebi.ac.uk> References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> <51BAE6E5.1040401@ebi.ac.uk> Message-ID: Hi John, Actually, David is absolutely right.. I didn't really think it through. You need to move the water atoms to the chain where you want them to be. So, if they are in chain A and should be in chain B, you need to detach them from chain A (detach_child method on the residue, easier) and re-attach it to chain B (add method). 
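The detach-and-reattach step described above might look roughly like the sketch below. The function name `move_residue` is made up for this example and is not a Bio.PDB API; it relies only on the `detach_child` and `add` methods mentioned in the thread and assumes that both chains already exist in the model and that the target chain has no residue with the same id.

```python
def move_residue(model, residue_id, old_chain_id, new_chain_id):
    """Move one residue between two existing chains of a Bio.PDB model.

    `residue_id` is the full Bio.PDB residue id, e.g. ("W", 1031, " ")
    for a water; note that the residue number is an int, not a string.
    """
    old_chain = model[old_chain_id]
    new_chain = model[new_chain_id]
    residue = old_chain[residue_id]
    old_chain.detach_child(residue_id)  # remove it from the old chain
    new_chain.add(residue)              # attach it to the new chain
    return residue


# Possible usage, assuming Biopython is installed and "input.pdb" exists:
#   from Bio import PDB
#   struct = PDB.PDBParser().get_structure("x", "input.pdb")
#   move_residue(struct[0], ("W", 1031, " "), "W", "A")
#   writer = PDB.PDBIO()
#   writer.set_structure(struct)
#   writer.save("output_path.pdb")
```

Setting `chain.id = 'A'` instead would relabel the whole chain, so every residue in it would change, not just the one water.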
From yeyanbo289 at gmail.com Fri Jun 14 11:24:10 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Fri, 14 Jun 2013 19:24:10 +0800
Subject: [Biopython-dev] Biopython Tutorial Chinese Translation
Message-ID:

Hi everyone,

Some of us here would like to translate the Biopython tutorial into Chinese, so that more people in China can use Biopython for their research and contribute to Biopython. We noticed there is a LaTeX file for this tutorial that we can work on.

Before we start, we want to know the right way to do this. Is there any previous or ongoing translation project that we can follow?

Thanks,
Yanbo

--
Yanbo Ye
Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences

From zruan1991 at gmail.com Fri Jun 14 14:43:35 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Fri, 14 Jun 2013 10:43:35 -0400
Subject: [Biopython-dev] Fwd: Biopython Tutorial Chinese Translation
In-Reply-To:
References:
Message-ID:

There was once a discussion at http://www.bioxxx.cn. You can find the relevant info at http://www.bioxxx.cn/thread-2354-1-1.html. However, this does not seem to be an official translation. I found a copy of it in case you don't have permission (I will send it to you off-list).

Best,
Zheng Ruan

From eric.talevich at gmail.com Fri Jun 14 15:59:00 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Fri, 14 Jun 2013 11:59:00 -0400
Subject: [Biopython-dev] Fwd: Biopython Tutorial Chinese Translation
In-Reply-To:
References:
Message-ID:

Hi guys,

Great idea! Yes, there was an earlier "unofficial" effort to both port the tutorial to Sphinx and translate it to Chinese:

http://www.bio-cloud.info/Biopython/en/index.html
http://www.bio-cloud.info/Biopython/cn/index.html
http://www.bio-cloud.info/blog/?p=57

I don't know much about Sphinx's support for multiple translations of the same text (or LaTeX's, for that matter), but maybe a Sphinx/reStructuredText port would make it easier to manage individual sections of the document and keep them up to date with their English equivalents. The bug report for this (long-term) task is:
https://redmine.open-bio.org/issues/3219

All the best,
Eric

From zruan1991 at gmail.com Sat Jun 15 03:18:07 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Fri, 14 Jun 2013 23:18:07 -0400
Subject: [Biopython-dev] Codon Alignment GSoC Homepage
Message-ID:

Hi all,

Following Peter and Karen's suggestion, I set up my project homepage on GitHub (http://zruanweb.com/). I also have a plan for the first four weeks there (http://zruanweb.com/project-timeline.html). Thanks.

Best,
Ruan

From yeyanbo289 at gmail.com Sat Jun 15 03:22:15 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Sat, 15 Jun 2013 11:22:15 +0800
Subject: [Biopython-dev] Fwd: Biopython Tutorial Chinese Translation
In-Reply-To:
References:
Message-ID:

Hi Eric,

That's great. It seems we have a good starting point. I'll contact him to see how to join him and make an official version if possible.

Best,
Yanbo

--
Yanbo Ye
Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences

From yeyanbo289 at gmail.com Sat Jun 15 03:38:17 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Sat, 15 Jun 2013 11:38:17 +0800
Subject: [Biopython-dev] Biopython Tutorial Chinese Translation
In-Reply-To:
References:
Message-ID:

I've contacted the bioxxx website admin. While he agreed, the copyright is still a problem. He said they just collected the translation from the original website and made a PDF version. As the original site is not accessible anymore, it is now hard to find the original translator. Maybe we can just write an explanation and wait for him/her to contact us.

On Fri, Jun 14, 2013 at 11:36 PM, Zheng Ruan wrote:
> Cool. Go ahead!
>
> Zheng Ruan
>
> On Fri, Jun 14, 2013 at 11:34 AM, Yanbo Ye wrote:
>> Sure. The original site for this translation is not accessible anymore.
>> But I know the bioxxx website admin. We can contact him about the author.
>>
>> On 2013-6-14 11:17, "Zheng Ruan" wrote:
>>> Yep, some old chapters do not change much. But the copyright seems to be
>>> a concern. Do we need to contact the original author for such a permission?
>>> I think they will be happy to grant it.
>>>
>>> Thanks,
>>> Zheng Ruan
>>>
>>> On Fri, Jun 14, 2013 at 10:55 AM, Yanbo Ye wrote:
>>>> Thanks, Zheng Ruan.
>>>> I have that one. It's an old version of the translation and many chapters are
>>>> not available.
>>>> But we can work based on that one, I think.

--
Yanbo Ye
Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences

From zruan1991 at gmail.com Sat Jun 15 06:12:56 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Sat, 15 Jun 2013 02:12:56 -0400
Subject: [Biopython-dev] Biopython Tutorial Chinese Translation
In-Reply-To:
References:
Message-ID:

Hi,

I played with Sphinx for a while. Is this what we expected (http://zruanweb.com/html/Tutorial.html)? There are still some issues that need manual curation. I built this by using pandoc to convert Doc/Tutorial.tex to reStructuredText and then using Sphinx to make the HTML. The Sphinx directory can be found at https://github.com/zruan/biopython/tree/master/Doc/sphinx. Thanks.

Best,
Ruan

From p.j.a.cock at googlemail.com Sat Jun 15 11:53:17 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 15 Jun 2013 12:53:17 +0100
Subject: [Biopython-dev] [Wg-phyloinformatics] Codon Alignment GSoC Homepage
In-Reply-To:
References:
Message-ID:

Thank you :)

Peter

From yeyanbo289 at gmail.com Sun Jun 16 06:39:37 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Sun, 16 Jun 2013 14:39:37 +0800
Subject: [Biopython-dev] Post for the first week
Message-ID:

Hi Eric, Mark and Jeet,

I posted a blog entry describing the design of the tree construction module and some of the work from the first week. Here is the link:
http://blog.yeyanbo.com/posts/google-summer-of-code-2.html

Best,
Yanbo

--
Yanbo Ye
Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences

From p.j.a.cock at googlemail.com Sun Jun 16 13:21:14 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sun, 16 Jun 2013 14:21:14 +0100
Subject: [Biopython-dev] Post for the first week
In-Reply-To:
References:
Message-ID:

Thanks for keeping us informed Yanbo :)

This old code might be useful for distance matrices:
https://redmine.open-bio.org/issues/2034

You can also get distance matrices from Bio.Cluster but they may not make sense as input to a tree building algorithm...

Peter

From eric.talevich at gmail.com Sun Jun 16 17:15:26 2013
From: eric.talevich at gmail.com (Eric Talevich)
Date: Sun, 16 Jun 2013 13:15:26 -0400
Subject: [Biopython-dev] Progress with ticket 3336
In-Reply-To: <1367961179.88206.YahooMailNeo@web122603.mail.ne1.yahoo.com>
References: <1367961179.88206.YahooMailNeo@web122603.mail.ne1.yahoo.com>
Message-ID:

On Tue, May 7, 2013 at 5:12 PM, Nate Sutton wrote:
> Hi,
>
> Here is a progress follow up to
> http://lists.open-bio.org/pipermail/biopython-dev/2013-April/010548.html.
> I have added a commit to the github branch that adds an option to create
> clade branch lines using linecollection.
> The linecollection objects are stored in a tuple before adding them to
> the plot. It's in Bio/Phylo/_utils.py. Is this what the last bullet
> point was requesting in https://redmine.open-bio.org/issues/3336 ?
>
> Thanks!
>
> Nate
>
> P. S. I used a tuple to store the linecollection objects instead of a
> list because that was mentioned in the ticket but if that looks like it
> should be different let me know. Also, I got some global variables to work
> with the code but I was only able to do that after declaring them as
> globals twice. If there are suggestions on how to code that differently
> let me know.

Hi Nate,

I left some comments on your commits in your branch on GitHub. When you're done, would you mind rebasing to the current master branch and doing a pull request?

Regarding the global variables, I think you might have to declare them as such in every new scope where they're used, or not at all, and in this case, you don't need to declare them as global at all.

Thanks,
Eric

From jmb at ebi.ac.uk Sun Jun 16 20:33:26 2013
From: jmb at ebi.ac.uk (John Berrisford)
Date: Sun, 16 Jun 2013 21:33:26 +0100
Subject: [Biopython-dev] changing PDB file chains
In-Reply-To:
References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> <51BAE6E5.1040401@ebi.ac.uk>
Message-ID: <016201ce6ad0$c1cafa90$4560efb0$@ebi.ac.uk>

Thanks for the advice.

Can you help with how to set a chain using the add method?

Also, the PDBIO writer appears to remove aniso records. Is there any way to stop it doing this?

Currently my code is:

    pdbFile = PDBParser().get_structure(pdbid, pdbPath)
    waterChain = pdbFile[0]['W']
    newChain = pdbFile[0]['A']
    waterAtom = pdbFile[0]['W']['W', 1031, ' ']['O']
    waterResidue = pdbFile[0]['W']['W', 1031, ' ']
    print waterChain.id
    print waterResidue.get_parent()
    waterResidue.detach_parent()  # this bit seems to work
    print waterResidue.get_parent()
    waterResidue.add(pdbid[0]["A"])  # I'm not sure exactly how to get a chainID.
    print waterResidue.get_parent()
    print waterResidue.id

which returns:

    W
    <Chain id=W>  # original parent (chain) of the residue.
    None  # detach parent seems to work fine
    Traceback (most recent call last):
      File "water_orig.py", line 53, in <module>
        waterResidue.add(pdbid[0]["A"])
    TypeError: string indices must be integers, not str

My writer commands are:

    pdb_writer = PDBIO()
    pdb_writer.set_structure(pdbFile)
    pdb_writer.save("output_path.pdb")

Regards

John

From davidjosephcain at gmail.com Mon Jun 17 02:52:12 2013
From: davidjosephcain at gmail.com (David Cain)
Date: Sun, 16 Jun 2013 22:52:12 -0400
Subject: [Biopython-dev] changing PDB file chains
In-Reply-To: <016201ce6ad0$c1cafa90$4560efb0$@ebi.ac.uk>
References: <005801ce6868$7044af00$50ce0d00$@ebi.ac.uk> <51BAE6E5.1040401@ebi.ac.uk> <016201ce6ad0$c1cafa90$4560efb0$@ebi.ac.uk>
Message-ID:

Hi, John.

Your error is that you're using pdbid[0]["A"], where pdbid is a string (not an instance of PDB.Structure, as you probably expect it to be).

You seem to have the parent detachment down, you just need to properly attach to a new chain. Calling waterResidue.add(...) will add a child object to the residue (which should be an atom: http://biopython.org/DIST/docs/api/Bio.PDB.Residue-pysrc.html#L73). Instead, you want to call newChain.add(waterResidue).

Just FYI, looking at how Structures are constructed by StructureBuilder should help you with the mechanics of modifying the SMCRA hierarchy. That is, StructureBuilder creates a Structure from scratch; if you understand how a Structure is built, it should make modifying a Structure trivial!

As far as aniso records go, I don't believe the current implementation of PDBIO can handle that. You could always modify the source code to fit your needs, though! (I'm sure others would benefit from your changes.)

David Cain
+1 (339) 222 4452

From redmine at redmine.open-bio.org Tue Jun 18 06:06:50 2013
From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org)
Date: Tue, 18 Jun 2013 06:06:50 +0000
Subject: [Biopython-dev] [Biopython - Bug #3435] (New) Pyhlo.draw_ascii() type error
Message-ID:

Issue #3435 has been reported by Giulio Valentino Dalla Riva.

----------------------------------------
Bug #3435: Pyhlo.draw_ascii() type error
https://redmine.open-bio.org/issues/3435

Author: Giulio Valentino Dalla Riva
Status: New
Priority: Normal
Assignee:
Category:
Target version:
URL:

I define a simple tree in a Newick format file (the example one in the tutorial, section 13.1). I read it using @tree = Phylo.read("simple.dnd", "newick")@ which works.
When I try to use draw_ascii() to draw it I get a TypeError:

@Phylo.draw_ascii(tree)@

produces

@Traceback (most recent call last):
  File "", line 1, in
    Phylo.draw_ascii(tree)
  File "C:\Python32\lib\site-packages\Bio\Phylo\_utils.py", line 253, in draw_ascii
    draw_clade(tree.root, 0)
  File "C:\Python32\lib\site-packages\Bio\Phylo\_utils.py", line 239, in draw_clade
    char_matrix[thisrow][col] = '_'
TypeError: list indices must be integers, not float@

I think the error is related to the fact that Python 3.x returns a float from @/@, not an integer. The problem doesn't occur with Phylo.draw().

P.S. I hope the issue has not already been answered: I didn't find it. I'm working on Python 3.2 and I tested the issue both with the latest git version (compiled with mingw) and with the binary, on a 32-bit Windows machine.

----------------------------------------
You have received this notification because this email was added to the New Issue Alert plugin

--
You have received this notification because you have either subscribed to it, or are involved in it.
To change your notification preferences, please click here and login: http://redmine.open-bio.org

From natemsutton at yahoo.com Wed Jun 19 09:00:56 2013
From: natemsutton at yahoo.com (Nate Sutton)
Date: Wed, 19 Jun 2013 02:00:56 -0700 (PDT)
Subject: [Biopython-dev] Progress with ticket 3336
In-Reply-To:
References: <1367961179.88206.YahooMailNeo@web122603.mail.ne1.yahoo.com>
Message-ID: <1371632456.20927.YahooMailNeo@web122602.mail.ne1.yahoo.com>

I appreciate your review of the code, the feedback helps me become better! I made the pull request here: https://github.com/biopython/biopython/pull/189 .

I have worked on fixing all the things you commented on and added comments describing the edits to the new file version at:
https://github.com/nmsutton/biopython/commit/893a6508ad18278b9d5cdb10d3c81c823125c90f
and 2 minor changes at:
https://github.com/nmsutton/biopython/commit/c8215edba1bd796684722372cec3c94bffdafc91

-Nate

P.S. To acknowledge assistance I got from a friend, I added a co-authored-by line in the commit, having read that this can be helpful for authors who want to note that they were helped with their code. If you or anyone else knows a better way to let others get recognized for providing assistance on someone's code, let me know.

From zruan1991 at gmail.com Thu Jun 20 22:02:44 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Thu, 20 Jun 2013 18:02:44 -0400
Subject: [Biopython-dev] Codon Alignment for Biopython Project Update
Message-ID:

Hi all,

I posted my first update diary for the Codon Alignment project at http://zruanweb.com/1st-diary.html. The repository for the code is at https://github.com/zruan/biopython/tree/master/Bio/CodonAlign. I'd be happy to hear your suggestions. Thanks!

Best,
Ruan

From yeyanbo289 at gmail.com Mon Jun 24 04:57:36 2013
From: yeyanbo289 at gmail.com (Yanbo Ye)
Date: Mon, 24 Jun 2013 12:57:36 +0800
Subject: [Biopython-dev] Post of the second week
Message-ID:

Hi guys,

I posted another blog here summarizing my work of the first week and plan for this week. Any feedback is welcome.

Thanks,
Yanbo

--
Yanbo Ye
Bioinformatics Group, Wuhan Institute Of Virology, Chinese Academy of Sciences

From redmine at redmine.open-bio.org Fri Jun 28 14:31:37 2013
From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org)
Date: Fri, 28 Jun 2013 14:31:37 +0000
Subject: [Biopython-dev] [Biopython - Bug #3436] (New) Fix slicing of SFF objects
Message-ID:

Issue #3436 has been reported by Martin Mokrejš.

----------------------------------------
Bug #3436: Fix slicing of SFF objects
https://redmine.open-bio.org/issues/3436

Author: Martin Mokrejš
Status: New
Priority: Normal
Assignee: Peter Cock
Category:
Target version:
URL:

I am chasing a suspected bug in Biopython which happens during slicing. It seems to be related to some internal cross-checks and NOT to the slicing range itself. It is likely caused by conflicting quality trim points, whose check triggers during slicing but NOT during SFF input parsing. But I think this is a valid use case. Maybe Peter will be faster than me in finding out what is going on.
$ python
Python 2.7.3 (default, Apr 20 2013, 18:28:22) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import SeqIO
>>> for _record in SeqIO.parse('/tmp/SRR088776_short_F95S1KP01DA6OU.sff', 'sff'):
...     print _record
... 
ID: F95S1KP01DA6OU
Name: F95S1KP01DA6OU
Number of features: 0
/flow_values=(95, 9, 110, 10, 30, 104, 15, 103, 1539, 33, 10, 20, 1501, 19, 8, 10, 1507, 22, 8, 8, 1482, 21, 8, 7, 1402, 22, 7, 11, 1394, 25, 10, 10, 1390, 27, 10, 12, 1333, 32, 14, 12, 1313, 59, 12, 14, 1258, 77, 13, 15, 1152, 91, 16, 16, 1144, 100, 14, 19, 1005, 107, 17, 20, 945, 109, 19, 20, 920, 113, 21, 21, 826, 113, 20, 25, 744, 105, 24, 25, 633, 110, 26, 26, 532, 103, 28, 30, 428, 103, 31, 33, 306, 106, 34, 37, 208, 106, 37, 39, 155, 113, 37, 46, 115, 103, 44, 57, 91, 113, 42, 57, 79, 111, 43, 67, 52, 103, 43, 78, 41, 102, 46, 82, 37, 102, 49, 91, 33, 103, 47, 104, 32, 100, 49, 98, 27, 111, 51, 109, 26, 113, 48, 114, 25, 116, 48, 120, 25, 111, 47, 119, 26, 100, 47, 109, 27, 90, 38, 112, 25, 85, 35, 105, 26, 74, 34, 102, 26, 78, 31, 90, 21, 67, 30, 78, 22, 49, 27, 67, 24, 45, 27, 62, 19, 46, 23, 48, 18, 41, 20, 45, 18, 37, 15, 44, 14, 33, 13, 39, 17, 32, 12, 34, 17, 27, 11, 24, 17, 26, 12, 20, 12, 26, 11, 17, 15, 22, 12, 15, 13, 20, 11, 18, 15, 20, 12, 16, 15, 22, 11, 14, 15, 16, 11, 13, 13, 18, 13, 11, 14, 17, 12, 13, 13, 17, 10, 12, 13, 15, 10, 12, 14, 15, 10, 13, 15, 15, 11, 14, 11, 14, 12, 11, 12, 15, 11, 10, 14, 12, 10, 10, 11, 18, 12, 9, 13, 14, 11, 10, 14, 13, 10, 13, 14, 14, 10, 13, 13, 15, 11, 9, 14, 17, 11, 13, 14, 18, 9, 15, 15, 16, 11, 11, 20, 16, 11, 14, 14, 14, 12, 14, 17, 16, 11, 12, 17, 14, 11, 11, 18, 15, 10, 11, 13, 14, 11, 12, 10, 19, 13, 10, 10, 16, 11, 10, 13, 14, 11, 12, 12, 12, 11, 13, 13, 14, 11, 15, 12, 16, 12, 15, 12, 14, 15, 11, 15, 15, 15, 9, 18, 14, 14, 12, 16, 13, 13, 13, 15, 13, 12, 12, 19, 15, 11, 12, 14, 15, 13, 11, 14, 14, 12, 12, 14, 16, 12, 13, 13, 17, 13, 11, 16, 15, 12, 12, 16, 16, 12, 12, 17, 13, 12, 11)
/flow_index=(1, 2, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 0, 1, 3, 0, 0, 0, 0, 1, 3, 0, 0, 0, 1, 3, 0, 0, 1, 3, 0, 1, 3, 0, 1, 3, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 5, 4, 4, 4, 4, 4, 4, 4, 12, 76)
/flow_chars=TACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACG
/clip_adapter_right=0
/clip_qual_right=239
/clip_qual_left=248
/clip_adapter_left=4
/flow_key=TCAG
Per letter annotation for: phred_quality
Seq('tcagtttttttttttttttnttttttttttttttttttttttttttttttnttt...nnn', DNAAlphabet())
>>> _record[4:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/site-packages/Bio/SeqRecord.py", line 460, in __getitem__
    answer._per_letter_annotations[key] = value[index]
  File "/usr/lib64/python2.7/site-packages/Bio/SeqRecord.py", line 78, in __setitem__
    "strings) of length %i." % self._length)
TypeError: We only allow python sequences (lists, tuples or strings) of length 316.
>>> 
Um, this was biopython 1.59. I just installed 1.61 but it is the same:

    _new = record[lval:]
  File "/usr/lib64/python2.7/site-packages/Bio/SeqRecord.py", line 461, in __getitem__
    answer._per_letter_annotations[key] = value[index]
  File "/usr/lib64/python2.7/site-packages/Bio/SeqRecord.py", line 79, in __setitem__
    "strings) of length %i." % self._length)
TypeError: We only allow python sequences (lists, tuples or strings) of length 316.

----------------------------------------
You have received this notification because this email was added to the New Issue Alert plugin

-- 
You have received this notification because you have either subscribed to it, or are involved in it.
To change your notification preferences, please click here and login: http://redmine.open-bio.org

From redmine at redmine.open-bio.org Fri Jun 28 14:49:38 2013
From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org)
Date: Fri, 28 Jun 2013 14:49:38 +0000
Subject: [Biopython-dev] [Biopython - Bug #3437] (New) SeqIO.write(): Do not write broken data for empty objects
Message-ID:

Issue #3437 has been reported by Martin Mokrejš.

----------------------------------------
Bug #3437: SeqIO.write(): Do not write broken data for empty objects
https://redmine.open-bio.org/issues/3437

Author: Martin Mokrejš
Status: New
Priority: Normal
Assignee:
Category:
Target version:
URL:

While slicing SFF objects I have several times tripped over the case where the left trim point is defined but the right trim point is not. Doing a slice like _record[_lval:_rval] not surprisingly results in an empty object (e.g. _record[4:0]). I think the "SFF object" could be smart enough to do just [4:] on my behalf. Alternatively, I would prefer a Warning message. While you might disagree with both proposals above, you might change your view if you think about writing sliced objects into an outfile, for example in fastq or fasta or qual.
These objects with an empty sequence are written into the files but, of course, only the FASTA/FASTQ header line is written and nothing for the sequence or the qualities themselves. So the output FASTA, QUAL or FASTQ is actually broken, and something must be done about this anyway. Users tend to forget, so at least I am likely to trip over this again after a few months. ;-)

To recapitulate what I propose: First, I would just replace [4:0] with [4:] in the example mentioned. Second, SeqIO.write() must definitely check for a non-zero length of the object's sequence.

This was on biopython-1.59 but also happens on 1.61.

From redmine at redmine.open-bio.org Fri Jun 28 14:56:51 2013
From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org)
Date: Fri, 28 Jun 2013 14:56:51 +0000
Subject: [Biopython-dev] [Biopython - Feature #3438] (New) Allow modifications of sequence in SFF objects
Message-ID:

Issue #3438 has been reported by Martin Mokrejš.

----------------------------------------
Feature #3438: Allow modifications of sequence in SFF objects
https://redmine.open-bio.org/issues/3438

Author: Martin Mokrejš
Status: New
Priority: Normal
Assignee:
Category:
Target version:
URL:

I find it a bit awkward, and an unnecessary overhead, to edit a sequence in an SFF object. I am fine with the requirement that the length of the sequence must stay the same as the length of the qualities and other annotation lists. However, to edit a sequence I currently have to do:
    if _was_modified:
        _letter_annotations = _record.letter_annotations
        _annotations = _record.annotations
        _record.letter_annotations = {}
        _record.annotations = {}
        _record.seq = Seq(_sequence, generic_dna)
        _record.letter_annotations = _letter_annotations
        _record.annotations = _annotations

        _new_record = SeqRecord(Seq(_sequence, generic_dna), id=_record.id, name=_record.name, description=_record.description, annotations=_record.annotations, letter_annotations=_record.letter_annotations)

        _wrote = SeqIO.write(_new_record, _fh, 'sff')
    else:
        _wrote = SeqIO.write(_record, _fh, 'sff')
All the work of backing up and restoring the annotation lists is unnecessary in my eyes. I think providing _record.rewrite_sequence('tcagnnnnnnnn') would be quite helpful here.

From zruan1991 at gmail.com Fri Jun 28 21:10:07 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Fri, 28 Jun 2013 17:10:07 -0400
Subject: [Biopython-dev] Week2 Update for Codon Alignment for Biopython Project
Message-ID:

Hi all,

I wrote a diary entry on the progress of my project in week 2 (http://zruanweb.com/). Thanks for your feedback and suggestions.

Best,
Ruan

From redmine at redmine.open-bio.org Fri Jun 28 22:12:59 2013
From: redmine at redmine.open-bio.org (redmine at redmine.open-bio.org)
Date: Fri, 28 Jun 2013 22:12:59 +0000
Subject: [Biopython-dev] [Biopython - Bug #3439] (New) SwissProt parser breaks when parsing '[' crossref
Message-ID:

Issue #3439 has been reported by Iddo Friedberg.

----------------------------------------
Bug #3439: SwissProt parser breaks when parsing '[' crossref
https://redmine.open-bio.org/issues/3439

Author: Iddo Friedberg
Status: New
Priority: Normal
Assignee:
Category:
Target version:
URL:

It seems the SwissProt parser treats opening square brackets as comments in the cross-reference records. So if there is a '[' in the free text, everything after it does not get parsed.

The relevant function seems to be "_read_dr", line 491 in Bio/SwissProt/__init__.py

Thanks,
Iddo

From p.j.a.cock at googlemail.com Sat Jun 29 00:17:53 2013
From: p.j.a.cock at googlemail.com (Peter Cock)
Date: Sat, 29 Jun 2013 01:17:53 +0100
Subject: [Biopython-dev] Fwd: [biopython] MeltingTemp completely rewritten and extended (#192)
In-Reply-To:
References:
Message-ID:

Sebastian, could you comment on or review this?

Thanks,
Peter

---------- Forwarded message ----------
From: *Markus Piotrowski*
Date: Saturday, June 29, 2013
Subject: [biopython] MeltingTemp completely rewritten and extended (#192)
To: biopython/biopython

More or less completely rewritten and largely extended.

1. Three different Tm calculations: one 'rule of thumb' (Tm_Wallace), one using approximate formulas based on GC content (Tm_GC) and one using nearest neighbor calculations (Tm_NN).
2. The new Tm_NN allows the use of different thermodynamic datasets (8 tables are included for Watson-Crick base-pairing) and includes tables for mismatches (including inosine) and dangling ends. The datasets are Python dictionaries; the user can use his own datasets or change/update existing tables for his needs.
3. Seven different formulas to correct for salt concentration, including correction for Mg2+ ions (method salt_correction).
4. Method chem_correction which allows for Tm correction when using DMSO and formaldehyde.
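For readers unfamiliar with the 'rule of thumb' mentioned in item 1, here is a minimal sketch of the classic Wallace rule, Tm = 2 * (A + T) + 4 * (G + C) in degrees Celsius, which is usually quoted for short oligonucleotides. This is not the code from the pull request: the function name wallace_tm is hypothetical, and the Tm_Wallace in the PR may well differ, for example in how it treats ambiguous bases.

```python
def wallace_tm(seq):
    """Rule-of-thumb melting temperature (degrees Celsius) by the Wallace rule.

    Hypothetical helper for illustration only; not the Tm_Wallace from the PR.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    # This sketch deliberately rejects anything other than A, C, G and T.
    if at + gc != len(seq):
        raise ValueError("only the unambiguous DNA letters A, C, G, T are handled")
    # 2 degrees per A/T base plus 4 degrees per G/C base.
    return 2 * at + 4 * gc


print(wallace_tm("ACGTTGCAATGCCGTA"))  # 8 A/T and 8 G/C bases -> 48
```

The GC-content and nearest-neighbor methods listed above need salt concentrations and thermodynamic tables, so they are not sketched here.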
------------------------------
You can merge this Pull Request by running

  git pull https://github.com/MarkusPiotrowski/biopython MeltingTemp

Or view, comment on, or merge it at:

  https://github.com/biopython/biopython/pull/192

Commit Summary

- MeltingTemp completely rewritten and extended

File Changes

- *M* Bio/SeqUtils/MeltingTemp.py (1107)

Patch Links:

- https://github.com/biopython/biopython/pull/192.patch
- https://github.com/biopython/biopython/pull/192.diff

From lomereiter at gmail.com Sat Jun 29 14:39:05 2013
From: lomereiter at gmail.com (Artem Tarasov)
Date: Sat, 29 Jun 2013 18:39:05 +0400
Subject: [Biopython-dev] CFFI bindings for sambamba
Message-ID:

Hello BioPython,

As you may know, during the previous GSoC I wrote a library for working with SAM/BAM in D. Only recently the language gained shared library support on Linux. And at about the same time PyPy 2.0 was released.

I had some spare time over the last few days and made bindings for the library. At the moment they are a bit incomplete and lack tests and documentation. They work on Linux with PyPy 2.0 or CPython 2.*. On PyPy, the performance is not worse than that of PySam on CPython, thanks to the powerful JIT.

Features so far:
* BAM reader and writer, both can use multiple threads for (de-)compression
* Random access, creating BAI index
* Fast pileup engine (optionally uses MD tags to determine reference bases)

The repository is at https://github.com/lomereiter/sambamba-bindings

Again, currently only Linux is supported. Bug-hunting and pull requests with tests and docs are much appreciated.

-- 
Artem

From zruan1991 at gmail.com Sat Jun 29 22:29:14 2013
From: zruan1991 at gmail.com (Zheng Ruan)
Date: Sat, 29 Jun 2013 18:29:14 -0400
Subject: [Biopython-dev] Fwd: Week2 Update for Codon Alignment for Biopython Project
In-Reply-To:
References:
Message-ID:

Hi all,

It seems my email yesterday did not reach the biopython-dev mailing list, so I am sending it again.
I wrote a diary entry on the progress of the Codon Alignment project in week 2 (http://zruanweb.com/). Thanks for your feedback and suggestions.

Since I'll be traveling from July 1st to 3rd, I anticipate fewer updates next week. Thanks.

Best,
Ruan