From mmhohman at northwestern.edu Mon Feb 5 03:35:49 2007
From: mmhohman at northwestern.edu (Moses M.Hohman)
Date: Mon, 5 Feb 2007 00:35:49 -0800
Subject: [BioRuby] Download BioRuby API document
In-Reply-To: <839897.12672.qm@web36804.mail.mud.yahoo.com>
References: <839897.12672.qm@web36804.mail.mud.yahoo.com>
Message-ID:

Hi Li,

I was going to say that if you've installed BioRuby as a gem, you
could view its RDoc documentation by running "gem_server", which
allows you to browse the RDoc documentation using a web browser
pointed at localhost:8808. However, it doesn't look like BioRuby
supports that at the moment. This is probably something easy to do
and would show off the RDoc work people have been doing. Is there
another way?

Moses

On Jan 27, 2007, at 6:23 AM, chen li wrote:

> Hi all,
>
> I am new to this forum. I wonder if I can download the
> BioRuby's API document and install it to my computer
>
> Thanks,
>
> Li
>
> _______________________________________________
> BioRuby mailing list
> BioRuby at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioruby

From trevor at corevx.com Mon Feb 5 08:40:27 2007
From: trevor at corevx.com (Trevor Wennblom)
Date: Mon, 05 Feb 2007 07:40:27 -0600
Subject: [BioRuby] Download BioRuby API document
In-Reply-To:
References: <839897.12672.qm@web36804.mail.mud.yahoo.com>
Message-ID: <45C733CB.1070303@corevx.com>

I haven't done anything interesting with the site in a while, but the
BioRuby rdoc from v1.0 is available for browsing here:

http://bioruby-doc.org/rdoc/

While we've made great improvements to the RDoc formatting, we could do
a much better job (the shortfall is mostly due to technical
misunderstandings rather than a lack of documentation).
I would like to hold off on addressing those concerns until BioRuby 1.1
is released, since I think it's high time we made a major update public.
There's been quite a bit added to the CVS within the last year that
could really shine in a gem package. I'd really like to see more
frequent minor releases, perhaps even one every few months. It would
certainly attract a bit more attention to the project.

On another topic: has anyone heard from Toshiaki? I sent him an email
on 2007-01-05 regarding some issues with Bio::Sequence to_re and haven't
heard back.

Trevor

Moses M.Hohman wrote:
> Hi Li,
>
> I was going to say that if you've installed BioRuby as a gem, you
> could view its RDoc documentation by running "gem_server", which
> allows you to browse the RDoc documentation using a web browser
> pointed at localhost:8808. However, it doesn't look like BioRuby
> supports that at the moment. This is probably something easy to do
> and would show off the RDoc work people have been doing. Is there
> another way?
>
> Moses
>
> On Jan 27, 2007, at 6:23 AM, chen li wrote:
>
>> Hi all,
>>
>> I am new to this forum. I wonder if I can download the
>> BioRuby's API document and install it to my computer
>>
>> Thanks,
>>
>> Li

From yjchenx at gmail.com Thu Feb 8 17:54:09 2007
From: yjchenx at gmail.com (Yen-Ju Chen)
Date: Thu, 8 Feb 2007 14:54:09 -0800
Subject: [BioRuby] Bug in writing PDB ATOM
Message-ID:

In bio/db/pdb/pdb.rb line 1019,
the ATOM entry is written as:

sprintf("%-6s%5d %-4s%-1s%3s %-1s%4d%-1s

It results in an ATOM entry like:
ATOM     61 OD1  ASN A   8     102.025  27.929 144.984  1.00 88.56           O

But the right ATOM entry should be
ATOM     61  OD1 ASN A   8     102.025  27.929 144.984  1.00 88.56           O

Note there are 2 spaces after '61' and one space before 'ASN'.
I changed this line to:

sprintf("%-6s%5d  %-3s%-1s%3s %-1s%4d%-1s

and it works fine now.
But I am new to Ruby and not familiar with the format yet.
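
The difference between the two alignments reported above can be
reproduced with Ruby's own format call, without BioRuby. This is only a
sketch: the format strings below cover just the first 27 columns of an
ATOM record (through iCode), the second one assumes two spaces after
%5d as the report's note implies, and the variable names are
illustrative rather than taken from pdb.rb.

```ruby
# Sketch only: first 27 columns of an ATOM record (record name, serial,
# atom name, altLoc, resName, chainID, resSeq, iCode). These format
# strings are illustrative and are not BioRuby's actual code.
fields = ['ATOM', 61, 'OD1', '', 'ASN', 'A', 8, '']

# Atom name left-justified in the full 4-column field (columns 13-16).
left = format('%-6s%5d %-4s%-1s%3s %-1s%4d%-1s', *fields)

# Atom name moved one column to the right, leaving it 3 columns.
shifted = format('%-6s%5d  %-3s%-1s%3s %-1s%4d%-1s', *fields)

puts left.inspect     # "ATOM     61 OD1  ASN A   8 "
puts shifted.inspect  # "ATOM     61  OD1 ASN A   8 "
```

Both strings are 27 characters long, so every later column stays in
place; only the position of the atom name inside columns 13-16 changes.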

Yen-Ju

From yjchenx at gmail.com Thu Feb 8 20:44:59 2007
From: yjchenx at gmail.com (Yen-Ju Chen)
Date: Thu, 8 Feb 2007 17:44:59 -0800
Subject: [BioRuby] Bug in writing PDB ATOM
In-Reply-To:
References:
Message-ID:

On 2/8/07, Alex Gutteridge wrote:
> On 9 Feb 2007, at 07:54, Yen-Ju Chen wrote:
>
> > In bio/db/pdb/pdb.rb line 1019,
> > the ATOM entry is written as:
> >
> > sprintf("%-6s%5d %-4s%-1s%3s %-1s%4d%-1s
> >
> > It results in an ATOM entry like:
> > ATOM     61 OD1  ASN A   8     102.025  27.929 144.984  1.00 88.56           O
> >
> > But the right ATOM entry should be
> > ATOM     61  OD1 ASN A   8     102.025  27.929 144.984  1.00 88.56           O
> >
> > Note there are 2 spaces after '61' and one space before 'ASN'.
> > I changed this line to:
> >
> > sprintf("%-6s%5d  %-3s%-1s%3s %-1s%4d%-1s
> >
> > and it works fine now.
> > But I am new to Ruby and not familiar with the format yet.
> >
> > Yen-Ju
>
> Hi Yen-Ju,
>
> Thanks for your bug report. In fact (as far as I can tell) the PDB
> format (http://www.wwpdb.org/documentation/format23/sect9.html) is
> ambiguous in this case. Columns 13-16 are specified for the 'Atom
> name' ('OD1' in the case you mention), but the justification of the
> field is not specified. Note that the field requires four columns so
> your fix (which reduces it to three) may break if you encounter an
> atom name with 4 characters.
>
> However, you are quite correct that the convention in most PDB files
> is that when fewer than 4 characters are used for the atom name, the
> field is aligned as you show. In summary, any of the following is a
> valid name according to my reading of the specifications, but the
> convention in many files is to use the form shown in the third and
> fourth examples rather than the first and second.
> Note that the fifth
> example is also a valid atom name and may break your fix:
>
> OD1
> N
>  OD1
>  N
> OD12
>
> I will change the code to use the conventional form where possible,
> but be careful with your fix because it may break on some (rare) PDB
> files.
>
> An important general point: PDB files (particularly older ones) are
> *very* messy. Efforts have been made within the PDB and at the EBI
> MSD to clean these files up, but there are still issues. This means
> that it is very hard to write a parser that can read in any PDB file
> and then output it in exactly the same format (including spacing
> etc...). The BioRuby parser should be able to parse any valid PDB
> file and output the data back out as a valid PDB format string, but
> the input and output are *not* guaranteed to be identical.
>
> I have not had time to actively maintain the PDB parsing in BioRuby,
> so if you are interested in Ruby and PDB files feel free to submit
> more bug reports and patches.
>
> Thanks again.

Thanks. I understand that the PDB format is messy,
and PDB is not the only one in this field. :)
I noticed this bug because the output from BioRuby
cannot be read correctly by some programs I am using, such as RasMol.
Anyway, I have only just started using BioRuby and am still learning.
If I find more bugs, I will try to send reports and patches.

By the way, I work more on the structural side.
Currently BioRuby focuses more on sequences and databases.
If people are interested, I may submit some scripts
for common structural tasks in the future,
for example calculating symmetry-related positions in the unit cell
based on the space group,
converting positions from orthogonal to fractional coordinates,
converting the format of heavy-metal positions
for various crystallography packages, etc.
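
The alignment convention Alex describes in the quoted reply can be
sketched as a small helper in plain Ruby. The method name and the
optional element: keyword are hypothetical and not part of BioRuby. The
sketch assumes the usual convention that a one-letter element symbol
puts the name in column 14, a two-letter element symbol (for example
calcium, CA) starts it in column 13, and a four-character name fills
the whole field. It does not try to handle older hydrogen naming
schemes.

```ruby
# Hypothetical helper (not part of BioRuby): place an atom name in the
# 4-column field at columns 13-16 of an ATOM record.
def justify_atom_name(name, element: nil)
  name = name.to_s.strip
  if name.length > 4
    raise ArgumentError, "atom name longer than 4 characters: #{name}"
  end

  # A 4-character name such as OD12 occupies the whole field.
  return name if name.length == 4

  # Assumed convention: two-letter element symbols start in column 13.
  return name.ljust(4) if element.to_s.strip.length == 2

  # Otherwise leave column 13 blank and start the name in column 14.
  ' ' + name.ljust(3)
end

puts justify_atom_name('OD1').inspect                 # " OD1"
puts justify_atom_name('N').inspect                   # " N  "
puts justify_atom_name('OD12').inspect                # "OD12"
puts justify_atom_name('CA', element: 'CA').inspect   # "CA  "
puts justify_atom_name('CA', element: 'C').inspect    # " CA "
```

Because the result is always exactly four characters wide, a
four-character name cannot push the rest of the record one column to
the right, which is the risk with a three-column field.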

Yen-Ju

>
> Alex Gutteridge
>
> Bioinformatics Center
> Kyoto University

From alexg at kuicr.kyoto-u.ac.jp Thu Feb 8 20:15:46 2007
From: alexg at kuicr.kyoto-u.ac.jp (Alex Gutteridge)
Date: Fri, 9 Feb 2007 10:15:46 +0900
Subject: [BioRuby] Bug in writing PDB ATOM
In-Reply-To:
References:
Message-ID:

On 9 Feb 2007, at 07:54, Yen-Ju Chen wrote:

> In bio/db/pdb/pdb.rb line 1019,
> the ATOM entry is written as:
>
> sprintf("%-6s%5d %-4s%-1s%3s %-1s%4d%-1s
>
> It results in an ATOM entry like:
> ATOM     61 OD1  ASN A   8     102.025  27.929 144.984  1.00 88.56           O
>
> But the right ATOM entry should be
> ATOM     61  OD1 ASN A   8     102.025  27.929 144.984  1.00 88.56           O
>
> Note there are 2 spaces after '61' and one space before 'ASN'.
> I changed this line to:
>
> sprintf("%-6s%5d  %-3s%-1s%3s %-1s%4d%-1s
>
> and it works fine now.
> But I am new to Ruby and not familiar with the format yet.
>
> Yen-Ju

Hi Yen-Ju,

Thanks for your bug report. In fact (as far as I can tell) the PDB
format (http://www.wwpdb.org/documentation/format23/sect9.html) is
ambiguous in this case. Columns 13-16 are specified for the 'Atom
name' ('OD1' in the case you mention), but the justification of the
field is not specified. Note that the field requires four columns so
your fix (which reduces it to three) may break if you encounter an
atom name with 4 characters.

However, you are quite correct that the convention in most PDB files
is that when fewer than 4 characters are used for the atom name, the
field is aligned as you show. In summary, any of the following is a
valid name according to my reading of the specifications, but the
convention in many files is to use the form shown in the third and
fourth examples rather than the first and second.
Note that the fifth
example is also a valid atom name and may break your fix:

OD1
N
 OD1
 N
OD12

I will change the code to use the conventional form where possible,
but be careful with your fix because it may break on some (rare) PDB
files.

An important general point: PDB files (particularly older ones) are
*very* messy. Efforts have been made within the PDB and at the EBI
MSD to clean these files up, but there are still issues. This means
that it is very hard to write a parser that can read in any PDB file
and then output it in exactly the same format (including spacing
etc...). The BioRuby parser should be able to parse any valid PDB
file and output the data back out as a valid PDB format string, but
the input and output are *not* guaranteed to be identical.

I have not had time to actively maintain the PDB parsing in BioRuby,
so if you are interested in Ruby and PDB files feel free to submit
more bug reports and patches.

Thanks again.

Alex Gutteridge

Bioinformatics Center
Kyoto University

From ngoto at gen-info.osaka-u.ac.jp Sat Feb 10 02:32:40 2007
From: ngoto at gen-info.osaka-u.ac.jp (GOTO Naohisa)
Date: Sat, 10 Feb 2007 16:32:40 +0900
Subject: [BioRuby] Bug in writing PDB ATOM
In-Reply-To:
References:
Message-ID: <200702100732.l1A7WGui007466@idns103.gen-info.osaka-u.ac.jp>

Hi, Yen-Ju,

Which bioruby version do you use?
By using CVS HEAD 'pdb.rb,v 1.16 2006/06/27 14:23:45',
it seems fine.

#### sample script
require 'bio'
atom = Bio::PDB::Record::ATOM.new
atom.serial = 61
atom.name = 'OD1'
atom.altLoc = ''
atom.resName = 'ASN'
atom.chainID = 'A'
atom.resSeq = 8
atom.iCode = ''
atom.x = 102.025
atom.y = 27.929
atom.z = 144.984
atom.occupancy = 1.0
atom.tempFactor = 88.56
atom.segID = ''
atom.element = 'O'
atom.charge = ''
print atom.to_s
# "ATOM     61  OD1 ASN A   8     102.025  27.929 144.984  1.00 88.56           O  \n"
#### end of sample script

However, it still fails in some rare cases.

require 'bio'
# record from PDB 1CX1
str = "ATOM 376 HH TYR A 25 " +
"4.479 12.801 -3.919 1.00 1.72 H "
atom = Bio::PDB::Record::ATOM.new.initialize_from_string(str)
print atom.to_s
#
# "ATOM 376 HH TYR A 25 4.479 12.801 -3.919 1.00 1.72 H \n"
# ^ an excess space!!

I'll make changes in the CVS to give more accurate results,
but it'll be still imperfect (because of ambiguity, as Alex said).

Thanks,

Naohisa Goto
ng at bioruby.org / ngoto at gen-info.osaka-u.ac.jp

On Thu, 8 Feb 2007 14:54:09 -0800
"Yen-Ju Chen" wrote:

> In bio/db/pdb/pdb.rb line 1019,
> the ATOM entry is written as:
>
> sprintf("%-6s%5d %-4s%-1s%3s %-1s%4d%-1s
>
> It results in an ATOM entry like:
> ATOM     61 OD1  ASN A   8     102.025  27.929 144.984  1.00 88.56           O
>
> But the right ATOM entry should be
> ATOM     61  OD1 ASN A   8     102.025  27.929 144.984  1.00 88.56           O
>
> Note there are 2 spaces after '61' and one space before 'ASN'.
> I changed this line to:
>
> sprintf("%-6s%5d  %-3s%-1s%3s %-1s%4d%-1s
>
> and it works fine now.
> But I am new to Ruby and not familiar with the format yet.
>
> Yen-Ju

From yjchenx at gmail.com Sat Feb 10 09:13:25 2007
From: yjchenx at gmail.com (Yen-Ju Chen)
Date: Sat, 10 Feb 2007 06:13:25 -0800
Subject: [BioRuby] Bug in writing PDB ATOM
In-Reply-To: <45cd751d.28e949d0.4e3f.ffff9a4dSMTPIN_ADDED@mx.google.com>
References: <45cd751d.28e949d0.4e3f.ffff9a4dSMTPIN_ADDED@mx.google.com>
Message-ID:

On 2/9/07, GOTO Naohisa wrote:
> Hi, Yen-Ju,
>
> Which bioruby version do you use?
> By using CVS HEAD 'pdb.rb,v 1.16 2006/06/27 14:23:45',
> it seems fine.

I tried both 1.0 and CVS.
And your example indeed works fine.
Here is the data and script I used:

ATOM      1  CB  TYR A   4      46.803  20.433  46.159  1.00130.00
ATOM      2  CG  TYR A   4      46.708  19.122  46.931  1.00130.00
ATOM      3  CD1 TYR A   4      46.708  17.892  46.257  1.00130.00
ATOM      4  CE1 TYR A   4      46.596  16.691  46.961  1.00130.00
ATOM      5  CD2 TYR A   4      46.599  19.109  48.336  1.00130.00

#########
require 'bio'

file = File.new('a.pdb').gets(nil)
structure = Bio::PDB.new(file)

structure.each do |model|
  model.each do |chain|
    chain.each do |residue|
      residue.each do |atom|
        atom.resSeq += 400
      end
    end
  end
end

File.open('x.pdb', 'w') do |file|
  file << structure.to_s
end

Yen-Ju

>
> #### sample script
> require 'bio'
> atom = Bio::PDB::Record::ATOM.new
> atom.serial = 61
> atom.name = 'OD1'
> atom.altLoc = ''
> atom.resName = 'ASN'
> atom.chainID = 'A'
> atom.resSeq = 8
> atom.iCode = ''
> atom.x = 102.025
> atom.y = 27.929
> atom.z = 144.984
> atom.occupancy = 1.0
> atom.tempFactor = 88.56
> atom.segID = ''
> atom.element = 'O'
> atom.charge = ''
> print atom.to_s
> # "ATOM     61  OD1 ASN A   8     102.025  27.929 144.984  1.00 88.56           O  \n"
> #### end of sample script
>
> However, it still fails in some rare cases.
>
> require 'bio'
> # record from PDB 1CX1
> str = "ATOM 376 HH TYR A 25 " +
> "4.479 12.801 -3.919 1.00 1.72 H "
> atom = Bio::PDB::Record::ATOM.new.initialize_from_string(str)
> print atom.to_s
> #
> # "ATOM 376 HH TYR A 25 4.479 12.801 -3.919 1.00 1.72 H \n"
> # ^ an excess space!!
>
> I'll make changes in the CVS to give more accurate results,
> but it'll be still imperfect (because of ambiguity, as Alex said).
>
> Thanks,
>
> Naohisa Goto
> ng at bioruby.org / ngoto at gen-info.osaka-u.ac.jp
>
> On Thu, 8 Feb 2007 14:54:09 -0800
> "Yen-Ju Chen" wrote:
>
> > In bio/db/pdb/pdb.rb line 1019,
> > the ATOM entry is written as:
> >
> > sprintf("%-6s%5d %-4s%-1s%3s %-1s%4d%-1s
> >
> > It results in an ATOM entry like:
> > ATOM     61 OD1  ASN A   8     102.025  27.929 144.984  1.00 88.56           O
> >
> > But the right ATOM entry should be
> > ATOM     61  OD1 ASN A   8     102.025  27.929 144.984  1.00 88.56           O
> >
> > Note there are 2 spaces after '61' and one space before 'ASN'.
> > I changed this line to:
> >
> > sprintf("%-6s%5d  %-3s%-1s%3s %-1s%4d%-1s
> >
> > and it works fine now.
> > But I am new to Ruby and not familiar with the format yet.
> >
> > Yen-Ju
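
For the specific task in the script earlier in this thread (adding 400
to every resSeq), one way to sidestep the atom-name alignment question
might be to edit the fixed columns directly instead of parsing and
re-serializing each record. The following is a minimal sketch in plain
Ruby, not BioRuby code: the method name shift_res_seq is hypothetical,
it assumes resSeq is right-justified in columns 23-26 as in the PDB
ATOM layout, and the sample record assumes conventional column
alignment.

```ruby
# Sketch under stated assumptions: plain Ruby, no BioRuby. Only columns
# 23-26 (resSeq) of ATOM/HETATM lines are rewritten; every other column,
# including the atom name in columns 13-16, is left untouched.
def shift_res_seq(line, offset)
  return line unless line.start_with?('ATOM', 'HETATM') && line.length >= 26

  res_seq = line[22, 4].to_i + offset
  unless (-999..9999).cover?(res_seq)
    raise RangeError, "resSeq #{res_seq} does not fit in 4 columns"
  end

  out = line.dup
  out[22, 4] = format('%4d', res_seq)
  out
end

record = 'ATOM      1  CB  TYR A   4      46.803  20.433  46.159  1.00130.00'
puts shift_res_seq(record, 400)
# ATOM      1  CB  TYR A 404      46.803  20.433  46.159  1.00130.00
```

Applied line by line to a file (for example with File.foreach), this
keeps the input spacing exactly, at the cost of knowing nothing about
models, chains or residues, so it is only suitable for simple
column-level edits.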