[Bioperl-l] A perl regex query

James Smith js5 at sanger.ac.uk
Tue Sep 18 12:58:36 UTC 2007


Neeti,

This isn't really a BioPerl question, but here is a simple solution that
should do what you want:

warn simplify( 'Cyclic-2,3-bisphospho-D-glycerate' );

sub simplify {
   local $_ = "-$_[0]-";
         ## Quick hack: add a "-" at each end so that every token,
         ## including the first and last, looks like "-token-"
   s/-(
     Cyclic | # the prefix "cyclic"
     \d+    | # a single number between two "-"s
     \d+,\d+| # number,number between two "-"s
     \w       # a single word character (e.g. D) between two "-"s
   )(?=-)//ixg;  ## case-insensitive, commented, multiple matches!
         ## zero-width positive lookahead: the closing "-" is not consumed,
         ## so consecutive -x- constructions all match in the same regexp
   s/-//g;
         ## remove the remaining "-"s from the string
   return $_;
         ## needed: otherwise the sub returns the substitution count of
         ## the last s///g rather than the cleaned string
}
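
To see why the lookahead matters, here is a small sketch on a made-up
string (the variable names are only for illustration). If the pattern
consumes the closing "-", the next token has no leading "-" left to
match against, so every other token is skipped:

```perl
use strict;
use warnings;

my $s = '-a-b-c-';

# Consuming the closing dash: "-a-" is removed, which leaves "b" without
# a leading dash, so "b" survives and only "-c-" is removed afterwards.
( my $consume = $s ) =~ s/-\w-//g;

# Only asserting the closing dash: it stays in the string and serves as
# the leading dash of the next token, so a, b and c are all removed.
( my $look = $s ) =~ s/-\w(?=-)//g;

print "consume: $consume\n";   # consume: b
print "look:    $look\n";      # look:    -
```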

I'm not sure what other test strings you have in mind, but most extra cases
can be handled by adding another alternative inside the ( ) group of the
first regexp in simplify.
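
One possible way to make that extension easier, as a sketch only: keep
the removable tokens in a list and build the alternation from it. The
names @REMOVABLE and simplify_name are hypothetical, and the "cis" and
"trans" entries are illustrative guesses at what you might want to add,
not something from your data.

```perl
use strict;
use warnings;

# Hypothetical list of tokens to drop when they stand alone between dashes.
# The first three cover the same cases as simplify(); the last entry is an
# illustrative addition only.
my @REMOVABLE = (
    qr/cyclic/i,         # the prefix "cyclic", any case
    qr/\d+(?:,\d+)*/,    # 2 or 2,3 or 1,2,3
    qr/[A-Za-z]/,        # a single letter such as D or L
    qr/cis|trans/i,      # assumed extra prefixes
);

sub simplify_name {
    my ($name) = @_;
    my $alt = join '|', @REMOVABLE;   # compiled patterns interpolate cleanly
    my $s   = "-$name-";              # same padding trick as simplify()
    $s =~ s/-(?:$alt)(?=-)//g;        # drop each "-token" that precedes a "-"
    $s =~ s/-//g;                     # then remove the remaining dashes
    return $s;
}

print simplify_name('Cyclic-2,3-bisphospho-D-glycerate'), "\n";  # bisphosphoglycerate
print simplify_name('trans-2-butene'), "\n";                     # butene
```

Adding a new case is then a one-line change to the list rather than an
edit to the regexp itself.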

James

On Tue, 18 Sep 2007, Andreas Kahari wrote:

> On Tue, Sep 18, 2007 at 04:00:34PM +0530, neeti somaiya wrote:
>> Hi,
>>
>> This isn't really a bioperl query.
>> But does anyone know how I can substitute all special characters (+ some
>> other things) in a string with nothing in perl?
>> I mean if I have a string like Cyclic-2,3-bisphospho-D-glycerate and I want
>> output as bisphosphoglycerate. I want to remove -D-, Cyclic-, 2,3- etc.
>
> This is in addition to the suggestions you've already had.
>
> If you always want to concatenate the 3rd and 5th part of the string, as
> delimited by dashes, then you could do this:
>
>  my $string = 'Cyclic-2,3-bisphospho-D-glycerate';
>  my $newstring = join( '', ( split( /-/, $string ) )[ 2, 4 ] );
>
>
> Cheers,
> Andreas
>
> -- 
> Andreas Kähäri :: Ensembl Software Developer
> European Bioinformatics Institute (EMBL-EBI)
> --------------------------------------------
>
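
A note on the split-and-slice approach quoted above: the list slice is
zero-based, so [ 2, 4 ] picks the 3rd and 5th fields, and the result
depends entirely on the fields being in fixed positions. A minimal
sketch, with a hypothetical helper pick_parts that skips indices past the
end of the list instead of joining undefined values:

```perl
use strict;
use warnings;

# Hypothetical helper: join the dash-delimited fields found at the given
# zero-based indices, ignoring any index beyond the last field.
sub pick_parts {
    my ( $string, @idx ) = @_;
    my @fields = split /-/, $string;
    return join '', map { $fields[$_] } grep { $_ < @fields } @idx;
}

# Five fields: indices 2 and 4 are "bisphospho" and "glycerate".
print pick_parts( 'Cyclic-2,3-bisphospho-D-glycerate', 2, 4 ), "\n";  # bisphosphoglycerate

# Without the "Cyclic" prefix everything shifts left by one field, so
# index 2 is now "D" and index 4 no longer exists.
print pick_parts( '2,3-bisphospho-D-glycerate', 2, 4 ), "\n";         # D
```

So the slice works well when every name has the same layout, and the
regexp approach is the safer choice when the layout varies.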


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 
