[Biojava-l] three-letter Protein alphabet names
mark.schreiber at novartis.com
Tue Aug 1 08:20:30 UTC 2006
You mean something like this?
Pro Ala Tyr
Then yes, in this case you would want to write a WordTokenization.
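
To illustrate the kind of mapping such a tokenization has to perform, here is a minimal sketch in plain Java. It does not use the BioJava API: the class name ThreeLetterCodes and the method toOneLetter are hypothetical, and it assumes whitespace-separated codes for the 20 standard amino acids only. A real WordTokenization subclass would map each word to a Symbol in the protein alphabet rather than to a character.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of BioJava: converts whitespace-separated
// three-letter amino acid codes ("Pro Ala Tyr") to one-letter codes ("PAY").
final class ThreeLetterCodes {
    private static final Map<String, Character> CODES = new HashMap<>();

    static {
        // Pairs of three-letter code and one-letter code, standard residues only.
        String[] pairs = (
            "Ala A Arg R Asn N Asp D Cys C Gln Q Glu E Gly G His H Ile I "
          + "Leu L Lys K Met M Phe F Pro P Ser S Thr T Trp W Tyr Y Val V"
        ).split(" ");
        for (int i = 0; i < pairs.length; i += 2) {
            // Keys are stored in lower case so that lookup is case-insensitive.
            CODES.put(pairs[i].toLowerCase(Locale.ROOT), pairs[i + 1].charAt(0));
        }
    }

    private ThreeLetterCodes() {}

    static String toOneLetter(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        // Each whitespace-delimited word is treated as one token.
        for (String word : trimmed.split("\\s+")) {
            Character code = CODES.get(word.toLowerCase(Locale.ROOT));
            if (code == null) {
                throw new IllegalArgumentException("Unknown three-letter code: " + word);
            }
            out.append(code.charValue());
        }
        return out.toString();
    }
}
```

Calling ThreeLetterCodes.toOneLetter("Pro Ala Tyr") returns "PAY". Unknown words are rejected rather than skipped, so non-standard residue codes that the table does not cover are reported instead of being silently dropped.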
Best regards,
- Mark
Neil Bacon <neil at cambia.org>
Sent by: biojava-l-bounces at lists.open-bio.org
08/01/2006 03:41 PM
To: biojava-l at lists.open-bio.org
cc: (bcc: Mark Schreiber/GP/Novartis)
Subject: [Biojava-l] three-letter Protein alphabet names
Hi,
I'm looking at extending biojava sequence io to read sequences from
patents (initially current US data formats, later perhaps older formats
and other jurisdictions).
Anyone done this already or interested?
Protein data uses 3-letter codes. I found an old posting about 3-letter
codes:
[Biojava-dev] Protein alphabet names
http://lists.open-bio.org/pipermail/biojava-dev/2002-October/000143.html
> - Add an additional tokenization (probably called "three-letter"
>   unless someone comes up with a better suggestion) for people
>   who actually want 3-letter codes.
Did this happen (I can't find it)?
I'll try extending WordTokenization to do this unless someone has
already done it or can advise me better (I'm new here and advice would
be very welcome).
Cheers,
Neil Bacon
_______________________________________________
Biojava-l mailing list - Biojava-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biojava-l