[EMBOSS] tfextract does not work properly with newer transfac site.dat file
    Mauleon, Ramil (IRRI) 
    R.MAULEON at CGIAR.ORG
       
    Tue Jun 22 09:53:49 UTC 2010
    
    
  
Hello,
I ran tfextract on the Transfac 6.4 site.dat file so that I could use
the output with tfscan, but tfextract does not parse the file properly.
The problems I saw with the Transfac 6.4 site.dat file were:
 
1 - Many entries had more than one motif sequence (several SQ lines); these
were not included in the parsed output.
 
AC  R00018
XX
ID  MOUSE$ACRD_01
XX
DT  20.06.1990 (created); ewi.
DT  24.08.1995 (updated); hiwi.
CO  Copyright (C), Biobase GmbH.
XX
TY  D
XX
DE  AChR delta (acetylcholine receptor, delta-subunit); Gene: G000457.
XX
SQ  TGCCTGG.
SQ  TGCCCTTG.
SQ  TGCCCTAA.
SQ  TGGCAAAC.
XX
SF  -148
 
.
.
.
 
 
2 - Some motif sequences were split across two SQ lines, for example:
 
AC  R00709
XX
ID  HA$HMGCR_02
XX
DT  20.06.1990 (created); ewi.
DT  06.09.1995 (updated); ewi.
CO  Copyright (C), Biobase GmbH.
XX
TY  D
XX
DE  HMGCOAR (HMG-CoA reductase); Gene: G000157.
XX
SQ  TGCTGGAACTCGACCAGCTATTGGTTGGCTCGGCCGTGGTGAGAGATGGTGCGGTGCCCG
SQ  TTCTCC.
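
To make the expected behaviour concrete, here is a minimal Python sketch of
how the SQ lines in the two entries above could be read. It is only an
illustration, not tfextract's actual code, and it rests on two assumptions
drawn from the examples: a motif ends at a trailing period, and an SQ line
without a trailing period continues on the next SQ line. The function name
parse_sites is made up for this example.

```python
def parse_sites(text):
    """Return {accession: [motif, ...]} from site.dat-style text.

    Assumptions (inferred from the example entries, not from a format spec):
    each entry starts with an AC line, a motif ends with '.', and an SQ line
    lacking the final '.' is continued on the following SQ line.
    """
    sites = {}
    motifs = None   # motif list of the entry currently being read
    partial = ""    # motif fragment still waiting for its closing '.'
    for line in text.splitlines():
        tag, value = line[:2], line[2:].strip()
        if tag == "AC":
            # Keep any unterminated fragment instead of silently dropping it.
            if motifs is not None and partial:
                motifs.append(partial)
            motifs = sites.setdefault(value, [])
            partial = ""
        elif tag == "SQ" and motifs is not None:
            partial += value
            if partial.endswith("."):
                motifs.append(partial[:-1])  # strip the terminating period
                partial = ""
    if motifs is not None and partial:
        motifs.append(partial)
    return sites


# Only the AC and SQ lines of the two entries quoted above.
example = """\
AC  R00018
XX
SQ  TGCCTGG.
SQ  TGCCCTTG.
SQ  TGCCCTAA.
SQ  TGGCAAAC.
XX
AC  R00709
XX
SQ  TGCTGGAACTCGACCAGCTATTGGTTGGCTCGGCCGTGGTGAGAGATGGTGCGGTGCCCG
SQ  TTCTCC.
"""

for accession, found in parse_sites(example).items():
    print(accession, len(found), found)
```

Read this way, R00018 keeps all four of its motifs and the two SQ lines of
R00709 are joined into a single motif.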
 
Thanks in advance for fixing tfextract.
 
Ramil
---------------------------------
Ramil P. Mauleon
Bioinformatics Specialist
International Rice Research Institute
DAPO Box 7777, Metro Manila, Philippines
email: r.mauleon at cgiar.org
phone: 632-580-5600 ext 2508 ; fax: 632-580-5699
---------------------------------
 
    
    