[Bioperl-l] Need info re Prodom & Stockholm formats and univaln.t2

Florence Servant fservant@toulouse.inra.fr
Mon, 18 Sep 2000 10:27:28 +0200


Hi Peter,
	Here is an example of ProDom format:
	
ID   11309 p2000.1                           6 seq.
AC   PD009000
KW   GBB5(3) GBB(3)  // GUANINE NUCLEOTIDE-BINDING PROTEIN BETA SUBUNIT TRANSDUCER REPEAT WD MULTIGENE FAMILY
LA   43
ND   6
CC   -!- DIAMETER:      77 PAM
CC   -!- RADIUS OF GYRATION:    36 PAM
CC   -!- SEQUENCE CLOSEST TO CONSENSUS: GBB_DICDI 144-178 (distance:23 PAM)
                                                10        20        30        40
                                        ---------|---------|---------|---------|-----
AL Q20636|GBB5_CAEEL     143   185 0.88 DDIIQKKRQVATHTSYMSCCTFLRSDNLILTGSGDSTCAIWDV
AL O14775|GBB5_HUMAN     140   182 0.53 ENMAAKKKSVAMHTNYLSACSFTNSDMQILTASGDGTCALWDV
AL P54314|GBB5_MOUSE     140   182 0.53 ENMAAKKKSVAMHTNYLSACSFTNSDMQILTASGDGTCALWDV
AL O14435|GBB_CRYPA      154   188 1.05 .......RELSGHAGYLSCCRFIN-DRSILTSSGDMTCMKWDI
AL P36408|GBB_DICDI      144   178 1.05 .......RELNSHTGYLSCCRFLN-DRQIVTSSGDMTCILWDV
AL P18851|GBB_YEAST      179   209 1.97 ...........GHTCYISDIEFTD-NAHILTASGDMTCALWDI
CO                                      DNMAAKKRELSGHTCYLSCCEFTNSDRHILTASGDMTCALWDI
DR   PROSITE;     PS00678 PDOC00574 WD_REPEATS (29-43)
DR   PDB;         1SCG chain 2  (141-171) GBB_YEAST (179-209)
//

You can easily extract this information from the whole ProDom SRS file
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/prodom2000.1.srs.gz)
by using the fetchdom program
(ftp://ftp.toulouse.inra.fr/pub/prodom/current_release/fetchdom.tar.gz):
	fetchdom -db prodom2000.1 -a PD009000
NB: don't worry if the first fetchdom run is slow; it has to build the
index file. Subsequent runs are very fast.
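
For checking a parser against an entry like the one above, here is a
minimal Python sketch (not the AlignIO code, and not an official ProDom
tool). It assumes, judging only from this one record, that each AL line
holds a sequence name, start and end coordinates, one numeric value whose
meaning I do not spell out here, and the aligned sequence; that CO holds
the consensus; and that "//" ends the entry. The names AlignedSeq,
ProDomEntry and parse_prodom_entry are made up for illustration. Because
mail clients tend to wrap these long lines, the sketch also accepts the
sequence on the line following AL or CO.

```python
import re
from typing import List, NamedTuple, Optional


class AlignedSeq(NamedTuple):
    name: str     # e.g. "Q20636|GBB5_CAEEL"
    start: int    # first residue covered, in sequence coordinates
    end: int      # last residue covered
    value: float  # numeric column after the coordinates (meaning assumed unknown)
    seq: str      # aligned sequence, '.' and '-' kept as-is


class ProDomEntry(NamedTuple):
    entry_id: Optional[str]
    accession: Optional[str]
    rows: List[AlignedSeq]
    consensus: Optional[str]


# AL line: name, start, end, value, then an optional sequence on the same line.
_AL = re.compile(r"^AL\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+(?:\.\d+)?)\s*(\S*)\s*$")
# ID / AC / CO line: two-letter code, then optional content.
_TAG = re.compile(r"^(ID|AC|CO)(?:\s+(.*\S))?\s*$")


def parse_prodom_entry(text: str) -> ProDomEntry:
    """Parse one ProDom-style entry; tolerate mail-wrapped AL/CO lines."""
    lines = text.splitlines()
    entry_id = accession = consensus = None
    rows: List[AlignedSeq] = []
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if line.startswith("//"):  # end of entry
            break
        al = _AL.match(line)
        tag = _TAG.match(line)
        if al:
            name, start, end, value, seq = al.groups()
            if not seq and i < len(lines):  # sequence wrapped onto next line
                seq = lines[i].strip()
                i += 1
            rows.append(AlignedSeq(name, int(start), int(end), float(value), seq))
        elif tag:
            code, rest = tag.group(1), tag.group(2)
            if code == "ID":
                entry_id = rest
            elif code == "AC":
                accession = rest
            else:  # CO: consensus, possibly wrapped onto next line
                if not rest and i < len(lines):
                    rest = lines[i].strip()
                    i += 1
                consensus = rest
        # All other line types (KW, LA, ND, CC, DR, ruler) are ignored here.
    return ProDomEntry(entry_id, accession, rows, consensus)


# Two rows taken from the entry above: one unwrapped, one mail-wrapped.
SAMPLE = """\
ID   11309 p2000.1                           6 seq.
AC   PD009000
AL Q20636|GBB5_CAEEL     143   185 0.88 DDIIQKKRQVATHTSYMSCCTFLRSDNLILTGSGDSTCAIWDV
AL P18851|GBB_YEAST      179   209 1.97
...........GHTCYISDIEFTD-NAHILTASGDMTCALWDI
CO
DNMAAKKRELSGHTCYLSCCEFTNSDRHILTASGDMTCALWDI
//
"""

entry = parse_prodom_entry(SAMPLE)
for row in entry.rows:
    residues = sum(ch.isalpha() for ch in row.seq)
    print(row.name, row.start, row.end, residues)
```

A useful sanity check for a test script: in this record the number of
letters in each aligned row equals end - start + 1, and every row has the
same length as the consensus.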

Best wishes,
Florence.

Peter Schattner wrote:
> 
> In developing the new "AlignIO.pm" module, I have been using various
> multiple-alignment data files ("test.pfam", "test.mase", etc.) in the "/t"
> directory to confirm that the new module is parsing properly.
> 
> However, for the formats "prodom" and "stockholm" (which are included among
> the formats accepted by SimpleAlign.pm) I do not find any sample files. Can
> anyone out there send me sample prodom- and/or stockholm-format multiple
> sequence alignment files so that I can test that the new input modules for
> these formats are parsing correctly?
> 
> Also, in the module UnivAln.pm, there is reference to a script "univaln.t2"
> which more fully exercises the UnivAln module.  Does anyone out there (Ewan?,
> Georg?, Steve?) have a copy of this script they could send me?  I am trying
> to assess the feasibility and level of effort required to merge the
> functionalities of the SimpleAlign and UnivAln modules once multiple
> alignment IO has been stripped out using AlignIO.
> 
> Thanks.
> 
> Peter Schattner

-- 
Florence SERVANT
Laboratoire de Biologie Moleculaire
INRA - CNRS
BP 27, Chemin de Borde Rouge
31326 Castanet Tolosan Cedex
Tel : +33  05.61.28.50.53
mailto:fservant@toulouse.inra.fr