[Bioperl-l] Reading sequences without parsing them

Ewan Birney birney@ebi.ac.uk
Mon, 16 Jul 2001 15:20:14 +0100 (BST)


On Mon, 16 Jul 2001, Karger, Amir wrote:

>   
> So, sorry about the lack of clarity. Do a s/sequence/entry/g on my original
> email.
> 

There is no built-in way to do this nicely inside Bioperl.

Options:

   (a) Use IO::String, but the result will then depend on the bioperl
write_seq output. This is not what you want: when we change bioperl
write_seq for a format, you will think all your sequences have updates.

   (b) Trust the built-in accession.version system. This covers changes
to sequences, not annotations.

   (c) Trust the Date line for annotation updates (available in swissprot,
embl, genbank).
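
For (b) and (c), a minimal sketch of pulling those markers out of the raw
entry text without building sequence objects. The helper name change_markers
is made up for illustration, and the line patterns are assumptions: a
GenBank-style "VERSION" line and EMBL/swissprot-style "DT" lines. Check them
against your own files before relying on this.

```perl
use strict;
use warnings;

# Hypothetical helper: extract cheap "has this entry changed?" markers
# from the raw text of one flat-file entry.
sub change_markers {
    my ($entry) = @_;
    my %markers;

    # Assumed GenBank-style line: "VERSION     U49845.1 ..."
    ($markers{version}) = $entry =~ /^VERSION\s+(\S+)/m;

    # Assumed EMBL/swissprot-style date lines: "DT   ...", kept in file order
    my @dates = $entry =~ /^DT\s+(.+?)\s*$/mg;
    $markers{dates} = join '; ', @dates if @dates;

    return \%markers;
}
```

Comparing these markers against the values stored from a previous run says
whether the entry claims to have changed, which is only as reliable as the
database's own bookkeeping.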


If you are paranoid you will need to write your own Digest::MD5 system
based around the string from one // to the next // in the files. This could
perhaps become quite a nice system integrated into the SeqIO system: for
example, I could imagine a more complex system like:

   # fictional class
   use Bio::DB::AutoUpdate;

   my $auto = Bio::DB::AutoUpdate->new( -file   => 'some/file',
                                        -md5    => '/some/place/with/md5',
                                        -record => '//',
                                        -seqio  => 'swiss',
                                        -update => 1, # means update md5 on reading
                                      );

   # auto update complies with the implicit SeqIO interface of next_seq
   # but only gives back entries whose MD5 is new

   while( (my $updated_entry = $auto->next_seq()) ) {
      # do something with updated
   }


The MD5 store is probably best implemented as a DBM file.
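
As a rough sketch of the MD5 idea (not an existing Bioperl module), the
following reads a flat file one // terminated entry at a time, digests the
raw text, and returns only the entries whose digest differs from the stored
one. The function name changed_entries is hypothetical, and keying on the
first accession of a swissprot/embl-style "AC" line is an assumption.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: return the raw text of every entry whose MD5 is
# new or different from the digest recorded in %$seen.
# $seen is a plain hash reference here; it could equally be a tied DBM hash.
sub changed_entries {
    my ($fh, $seen) = @_;
    my @changed;

    local $/ = "//\n";    # read one entry (up to and including //) per read
    while (my $entry = <$fh>) {
        next unless $entry =~ /\S/;    # skip trailing whitespace

        # Assumption: the first accession on the "AC" line identifies the entry
        my ($acc) = $entry =~ /^AC\s+(\w+)/m or next;

        my $digest = md5_hex($entry);
        next if defined $seen->{$acc} && $seen->{$acc} eq $digest;

        $seen->{$acc} = $digest;       # update the stored MD5 on reading
        push @changed, $entry;
    }
    return @changed;
}
```

To keep digests between runs, the hash can be tied to a DBM file (for
example with the core SDBM_File module) before it is passed in. Each returned
entry could then be handed to Bio::SeqIO through IO::String if a parsed
sequence object is needed.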


If you wrote something like this, that would be great! If you wait 6 months
or so I'll probably get bored on a train sometime and might do it,
assuming half a ton of other interesting things are not happening ;)


Any other thoughts from people?
   

> -Amir
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
> 

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------