[BioRuby] bio.pdb doubt

Wed Feb 20 15:22:47 UTC 2008

My posts don't seem to make it to the mailing list from this address,  
so I'm reposting - sorry if people get this twice!

Shameer,

There's no specific method for testing for multi-chain/single-chain
proteins, and there are some interesting edge cases (does a small
peptide bound to a large enzyme count as a chain, for instance?) where
Naohisa's method may give different answers, but one simple way is
just to grab the chains as an array and check its size:

Bio::PDB.new(IO.read('1TIM.pdb')).chains.size
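
One caveat, offered as a hedged sketch rather than a statement about the
library: if I remember correctly, chains collects the chains of every model,
so a multi-model entry (NMR, for example) could report each chain once per
model. Counting distinct chain IDs sidesteps that. FakeChain, FakeStructure
and distinct_chain_ids below are hypothetical stand-ins so the snippet runs
without the bio gem; only the chains and id accessors are assumed to behave
like the Bio::PDB ones:

```ruby
# Stand-ins for Bio::PDB objects (hypothetical, for illustration only).
# Assumption: the real objects expose `chains` and `id` the same way.
FakeChain = Struct.new(:id)
FakeStructure = Struct.new(:chains)

# Hypothetical helper: list the distinct chain identifiers of a structure.
def distinct_chain_ids(structure)
  structure.chains.map { |chain| chain.id }.uniq
end

# Imitates an entry with two models, each holding chains A and B.
nmr_like = FakeStructure.new(%w[A B A B].map { |id| FakeChain.new(id) })

puts nmr_like.chains.size               # 4
puts distinct_chain_ids(nmr_like).size  # 2
```

With a real entry the same idea would be written as
Bio::PDB.new(IO.read('1TIM.pdb')).chains.map { |c| c.id }.uniq.size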

And because this is Ruby you can always define your own convenience  
method on the Bio::PDB class:

irb(main):007:0> module Bio
irb(main):008:1> class PDB
irb(main):009:2> def multichain?
irb(main):010:3> self.chains.size > 1
irb(main):011:3> end
irb(main):012:2> end
irb(main):013:1> end
=> nil
irb(main):014:0> Bio::PDB.new(IO.read('1TIM.pdb')).multichain?
=> true
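
For the peptide-bound-to-enzyme edge case, one possible refinement is to
ignore very short chains before counting. This is only a sketch under stated
assumptions: StubChain and StubPDB are hypothetical stand-ins so it runs
without the bio gem, the residues accessor is assumed to return an array as
on Bio::PDB::Chain, and the threshold of 20 residues is an arbitrary choice
rather than any accepted definition of a peptide:

```ruby
# Hypothetical stand-ins for Bio::PDB::Chain and Bio::PDB.
StubChain = Struct.new(:id, :residues)
StubPDB   = Struct.new(:chains)

# Hypothetical predicate: true when more than one distinct chain has at
# least min_residues residues. The default of 20 is arbitrary.
def multichain?(structure, min_residues = 20)
  big = structure.chains.select { |c| c.residues.size >= min_residues }
  big.map { |c| c.id }.uniq.size > 1
end

enzyme  = StubChain.new('A', Array.new(250))
peptide = StubChain.new('B', Array.new(8))
partner = StubChain.new('B', Array.new(250))

p multichain?(StubPDB.new([enzyme, peptide]))     # false
p multichain?(StubPDB.new([enzyme, partner]))     # true
p multichain?(StubPDB.new([enzyme, peptide]), 1)  # true
```

Whether the peptide should count is really a question about what you want
to do with the answer, which is why the cutoff is left as a parameter.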

AlexG

On 20 Feb 2008, at 13:53, Naohisa GOTO wrote:

> Dear Shameer,
>
> Information about the chains of each macromolecule is described in the
> 'COMPND' record. In BioRuby, the Bio::PDB#record method can be used.
> Because the information returned by the method is sometimes
> raw, some processing of the data would be needed.
>
> Below is a sample program:
>
>  require 'bio'
>
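>  def parse_COMPND(pdb)
>    molecules = []
>    current_molecule = nil
>    # each element is used as a [key, value] pair, e.g. ['MOL_ID', '1']
>    pdb.record('COMPND')[0].compound.each do |a|
>      # start a new molecule at each MOL_ID (or when MOL_ID is missing)
>      if a[0] == 'MOL_ID' or current_molecule.nil?
>        current_molecule = {}
>        molecules.push current_molecule
>      end
>      # keep the chain IDs as an array under the :chains key
>      if a[0] == 'CHAIN'
>        current_molecule[:chains] = a[1].split(/\s*,\s*/)
>      end
>      current_molecule[a[0]] = a[1]
>    end
>    molecules
>  end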
>
>  pdb1 = Bio::FlatFile.open('pdb1fjg.ent') { |f| f.next_entry }
>  pdb2 = Bio::FlatFile.open('pdb1a0d.ent') { |f| f.next_entry }
>
>  [ pdb1, pdb2 ].each do |pdb|
>    compounds = parse_COMPND(pdb)
>    compounds.each do |c|
>      p c['MOLECULE']
>      p c[:chains]
>    end
>  end
>
> The meaning of the 'COMPND' record is described in the
> PDB file format document:
> http://www.wwpdb.org/documentation/format23/sect2.html#COMPND
>
> -- 
> Naohisa Goto
> ngoto at gen-info.osaka-u.ac.jp /ng at bioruby.org
>
>
> On Mon, 18 Feb 2008 10:46:03 +0530 (IST)
> "K. Shameer" <shameer at ncbs.res.in> wrote:
>
>> Dear Naohisa and Alex,
>>
>> Thanks for the links and the sample code.
>> I have one more doubt :) .
>> Is there any method to check whether a protein is multichain or
>> single-chain using BioRuby? I checked the BioRuby in Anger document
>> and the wiki, but I couldn't find it (maybe I am missing something
>> important).
>>
>> Thanks,
>> K. Shameer
>> NCBS - TIFR
>>
>>
>>> You can use Bio::PDB#find_atom or Bio::PDB#find_residue methods.
>>>
>>>  require 'bio'
>>>
>>>  # reading PDB data
>>>  pdb = Bio::FlatFile.open("pdb1a0d.ent") { |f| f.next_entry }
>>>
>>>  # using Bio::PDB#find_atom
>>>  atoms = pdb.find_atom do |atom|
>>>    (atom.chainID == "A" and atom.resSeq >= 22) or
>>>    (atom.chainID == "B" and atom.resSeq <= 50)
>>>  end
>>>  print atoms.join  # Array#to_s only concatenates on Ruby 1.8
>>>
>>>  print "\n"
>>>
>>>  # the same thing can be done by using Bio::PDB#find_residue
>>>  residues = pdb.find_residue do |residue|
>>>    (residue.chain.id == "A" and residue.resSeq >= 22) or
>>>    (residue.chain.id == "B" and residue.resSeq <= 50)
>>>  end
>>>  print residues.join  # Array#to_s only concatenates on Ruby 1.8
>>>
>>>
>>