[BioPython] Re: Bioperl-guts: Prosite

Andrew Dalke dalke@acm.org
Thu, 6 Apr 2000 05:30:51 -0600


Ewan:
>(ps --- anyone fancy doing this in BioPerl. We have a SeqPattern
> class ;)).


How about
http://www.uni-bielefeld.de/mailinglists/BCD/vsns-bcd-perl/9906/0016.html ?
:)

Actually, I've since found a bug in that code: [G>] is a valid
pattern, and it should translate to (G|$).  Fixing that is left
for the reader.

I offer, without testing, the following improved algorithm:

$pat =~ s/\{/[^/g;            # convert "{" to "[^"
$pat =~ s/-//g;               # don't need these
$pat =~ tr/}()<>xX/]{}^$../;  # solely a syntactic transformation
# fix the bug here ...
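
For anyone who wants something to paste and poke at, here is a minimal
sketch of the three steps above wrapped in a sub, with one possible fix
for the [G>] case.  The name prosite_to_regex is made up for this
example (it is not a BioPerl routine).  Stripping the trailing "." and
treating ">" inside []s as "or end of sequence" are assumptions, and
"<" inside []s is not handled at all.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: convert a PROSITE pattern
# into a Perl regex string.
sub prosite_to_regex {
    my ($pat) = @_;

    $pat =~ s/\.$//;              # assumption: drop the "." that ends the pattern
    $pat =~ s/\{/[^/g;            # convert "{" to "[^"
    $pat =~ s/-//g;               # separators are not needed in a regex
    $pat =~ tr/}()<>xX/]{}^$../;  # purely syntactic, one character at a time

    # Assumed fix for the bug: after the tr above, "[G>]" has become
    # "[G$]", where "$" would be a literal dollar sign.  Rewrite it as
    # "one of these residues, or the end of the sequence".
    $pat =~ s/\[([A-Z]+)\$\]/(?:[${1}]|\$)/g;

    return $pat;
}

print prosite_to_regex('<A-x-[ST](2)-x(0,1)-{V}'), "\n";  # ^A.[ST]{2}.{0,1}[^V]
print prosite_to_regex('F-[GSTV]-P-R-L-[G>].'), "\n";     # F[GSTV]PRL(?:[G]|$)
```

The fix runs after the tr step on purpose: doing it earlier would put
"(" and ")" into the string, and tr would then turn them into braces.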

And here's a regex for the full prosite pattern:

^<?                   # starts with an optional "<"
(
  [A-Zx]|             # a character OR
  \[[A-Z]+\]|         # something in []s OR
  \{[A-Z]+\}          # something in {}s
)(\(\d+(,\d+)?\))?    # optional count of the form "(i,j)"; the ",j" is optional
(-                    # new terms separated by a '-'
 (
  [A-Zx]|             # a character OR
  \[[A-Z]+\]|         # something in []s OR
  \{[A-Z]+\}          # something in {}s
 )(\(\d+(,\d+)?\))?   # optional count
)*                    # repeat until done
>?                    # pattern ends with an optional ">"
\.$                   # description ends with a required "."

(again, excepting the bug, which can probably be fixed with
  \[<?[A-Z]+>?\]|         # something in []s OR
  \{<?[A-Z]+>?\}          # something in {}s

except that the exact format of when/how '<' and '>' are used
is not described in the prosite documentation!)
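
The annotated regex above can be tried out more or less as written by
compiling it with Perl's /x flag.  What follows is only a sketch: the
sub name is_prosite_pattern is invented for this example, the repeated
term is factored into its own qr// for readability, and the optional
"<" and ">" inside []s and {}s is the untested guess from the previous
paragraph rather than documented prosite syntax.

```perl
use strict;
use warnings;

# One term of a pattern, followed by an optional repeat count.
my $element = qr/
    (?:
        [A-Zx]                |  # a single residue, or "x" for any residue
        \[ <? [A-Z]+ >? \]    |  # something in []s (anchors here are a guess)
        \{ <? [A-Z]+ >? \}       # something in {}s (anchors here are a guess)
    )
    (?: \( \d+ (?: , \d+ )? \) )?  # optional count, "(i)" or "(i,j)"
/x;

# The whole pattern line.
my $prosite_line = qr/
    ^ <?                 # starts with an optional "<"
    $element
    (?: - $element )*    # further terms, each preceded by "-"
    >?                   # optional ">" at the end of the pattern
    \. $                 # required trailing "."
/x;

# Hypothetical helper: true if the string looks like a prosite pattern.
sub is_prosite_pattern {
    my ($line) = @_;
    return $line =~ $prosite_line ? 1 : 0;
}

for my $line ('<A-x-[ST](2)-x(0,1)-{V}.', 'F-[GSTV]-P-R-L-[G>].', 'A-x-[ST](2)') {
    printf "%-28s %s\n", $line, is_prosite_pattern($line) ? 'ok' : 'not a pattern';
}
```

The last test string is rejected only because it lacks the final ".".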

Enjoy!

                    Andrew


=========== Bioperl Project Mailing List Message Footer =======
Project URL: http://bio.perl.org
For info about how to (un)subscribe, where messages are archived, etc:
http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl-guts.html
====================================================================