[Bioperl-l] a space in Feature key

Heikki Lehvaslaiho heikki at ebi.ac.uk
Thu Oct 23 04:41:31 EDT 2003


Checking with the EMBL databank guys, I found out that since the feature
table format is defined in terms of characters, space characters are not
specifically banned in feature keys, but they are actively avoided.

Also, I talked to Rodrigo Lopez who promised to release an updated
CpGIsle database soon (next week?) and remove the offending space.
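
Until the updated file is out, the workaround Henry describes below
(replacing the space in the key with an underscore) can be applied as a
small preprocessing step before the file is handed to SeqIO. What follows
is only a minimal sketch in Python, not part of BioPerl. It assumes the
usual EMBL feature table layout, with the feature key in columns 6-20 and
the location starting at column 22; the function name fix_feature_key is
hypothetical.

```python
# Column layout assumed from the EMBL feature table format:
# "FT" tag, key in columns 6-20, location from column 22 onward.
KEY_START = 5   # zero-based index of column 6
KEY_END = 20    # zero-based index just past column 20


def fix_feature_key(line: str) -> str:
    """Replace spaces inside an FT feature key with underscores.

    Lines that are not FT lines, and FT qualifier lines (whose key
    field is blank), are returned unchanged.
    """
    if not line.startswith("FT"):
        return line
    key_field = line[KEY_START:KEY_END]
    key = key_field.strip()
    if " " not in key:
        return line
    # Keep the original field width so the location stays in its column.
    fixed = key.replace(" ", "_").ljust(len(key_field))
    return line[:KEY_START] + fixed + line[KEY_END:]


def fix_lines(lines):
    """Apply fix_feature_key to every line of an iterable of lines."""
    return [fix_feature_key(line) for line in lines]
```

Qualifier lines such as /Obs/Exp CpG=0.82 are left alone on purpose,
because only the key field is inspected. A file that does not follow the
column layout assumed above would need a different rule.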

Yours,

	-Heikki


On Fri, 2003-10-17 at 21:03, Henry Hyun-il Paik wrote:
>  Hello Ewan,
> 
>  I downloaded data from 
> 
> 	ftp://ftp.ebi.ac.uk/pub/databases/cpgisle/
> 
>  the file name is cpgisle.dat
> 
>  - Henry
> 
> On Fri, 17 Oct 2003, Ewan Birney wrote:
> 
> > 
> > On Friday, October 17, 2003, at 07:15  pm, Henry Hyun-il Paik wrote:
> > 
> > >
> > >  Hello list,
> > >
> > >  It is impossible to have a space in a feature key, right?
> > >
> > >  I downloaded some data from EMBL cpgisle. It looks like the record below.
> > >
> > 
> > I don't think you are allowed spaces. Where did you get this from?
> > 
> > 
> > 
> > > ---------------------------------------------------------------------------
> > > ID   GAPDHG
> > > AC   J04038;
> > > LE   5378
> > > DE   Human glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene, complete cds.
> > > DE   7/95
> > > EX   Gene expression widespread
> > > FT   CpG island      871..1673
> > > FT                   /size=803
> > > FT                   /%(C+G)=69.12
> > > FT                   /Obs/Exp CpG=0.82
> > > FT   CpG island      1683..2063
> > > FT                   /size=381
> > > FT                   /%(C+G)=67.19
> > > FT                   /Obs/Exp CpG=0.77
> > > XX
> > > FT                   /CAAT-box.1="884"
> > > FT                   /CAAT-box.2_complement="2156"
> > > FT                   /GC-box="1064"
> > > FT                   /E2F_CS.1="1785"
> > > FT                   /SpI="158,1198,1244,1290,1310,1314"
> > > FT                   /SpI_complement="174,584,1519,1668,1736,2271"
> > > FT                   /SpI_complement="2625"
> > > FT                   /AccII="717,727,1093,1268,1334,1423"
> > > FT                   /AccII="1489,1531,1788,2006,3650,4278"
> > > //
> > > ---------------------------------------------------------------------------
> > >
> > >  I tried to parse this by using SeqIO. It didn't work.
> > >
> > >  I got an error message like below.
> > >
> > >   
> > > -----------------------------------------------------------------------
> > >
> > > Argument "island" isn't numeric in numeric gt (>) at /home/hy1001/bin/Bio/Location/Atomic.pm line 91, <GEN0> line 15.
> > > Argument "island" isn't numeric in numeric gt (>) at /home/hy1001/bin/Bio/Location/Atomic.pm line 91, <GEN0> line 15.
> > >
> > > ------------- EXCEPTION  -------------
> > > MSG: Got a sequence with no letters in - cannot guess alphabet []
> > > STACK Bio::PrimarySeq::_guess_alphabet /home/hy1001/bin/Bio/PrimarySeq.pm:817
> > > STACK Bio::PrimarySeq::seq /home/hy1001/bin/Bio/PrimarySeq.pm:276
> > > STACK Bio::PrimarySeq::new /home/hy1001/bin/Bio/PrimarySeq.pm:214
> > > STACK Bio::Seq::new /home/hy1001/bin/Bio/Seq.pm:498
> > > STACK Bio::Seq::RichSeq::new /home/hy1001/bin/Bio/Seq/RichSeq.pm:115
> > > STACK Bio::Seq::SeqFactory::create /home/hy1001/bin/Bio/Seq/SeqFactory.pm:126
> > > STACK Bio::SeqIO::embl::next_seq /home/hy1001/bin/Bio/SeqIO/embl.pm:344
> > > STACK toplevel extracted.pl:13
> > >
> > >   
> > > -----------------------------------------------------------------------
> > >
> > >  So I changed 'CpG island' to 'CpG_island'. Then it worked fine.
> > >
> > >  I am using Perl 5.8.0 and BioPerl 1.2.3 on Linux.
> > >
> > >  Thank you.
> > >
> > >  - Henry.
> > >
> > >
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________


