[Biojava-l] Editing a RichSequence[Scanned]
Richard Holland
holland at ebi.ac.uk
Mon Feb 18 16:12:52 UTC 2008
OK, got it.
It's because ChunkedSymbolListFactory is creating a ChunkedSymbolList
for your sequence, because the sequence is longer than 1<<14 bp
(that's 16384 symbols). This is a hardcoded limit.
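To put numbers on that limit, here is a small standalone Java check. It is only a sketch and not BioJava code: the class name ChunkThresholdSketch and the helper exceedsThreshold are made up for illustration, and the lengths 15,000 and 4,000,000 are round approximations of the 15kb and 4Mb sequences mentioned in this thread.

```java
// Toy check of the size threshold described above; not BioJava code.
final class ChunkThresholdSketch {
    // 1 << 14 is 2^14 = 16384, the limit quoted in this post.
    static final int THRESHOLD = 1 << 14;

    // Hypothetical helper: true when a sequence is longer than the threshold.
    static boolean exceedsThreshold(int lengthInSymbols) {
        return lengthInSymbols > THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(THRESHOLD);                   // 16384
        System.out.println(exceedsThreshold(15_000));    // false (roughly the 15kb sequence)
        System.out.println(exceedsThreshold(4_000_000)); // true  (roughly the 4Mb sequence)
    }
}
```

That would be consistent with the 15kb sequence being editable while the 4Mb one is not.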
ChunkedSymbolList extends AbstractSymbolList, which is immutable and
therefore not editable.
I'm not sure who wrote ChunkedSymbolList - and I'm not sure how to (or
if I should) fix it. It's quite a deeply embedded piece of the system.
Does anyone out there know?
There is a workaround - create a new symbol list based on the
RichSequence ( SymbolList syms = new SimpleSymbolList(richSeq) ). The
copy will be mutable and edit() will work on it.
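The idea behind the workaround can be sketched with a toy model in plain Java. This is not BioJava's implementation: ReadOnlySymbols and EditableSymbols are hypothetical stand-ins for an immutable symbol list and a mutable copy of it, the edit(pos, length, replacement) signature only mimics the Edit(pos, length, replacement) calls in this thread, and the 1-based position is an assumption carried over from them.

```java
// Toy model of "copy into a mutable list, then edit the copy"; not BioJava code.
final class MutableCopySketch {

    // Hypothetical stand-in for a list whose edit() always refuses.
    static class ReadOnlySymbols {
        protected final StringBuilder symbols;

        ReadOnlySymbols(String seq) {
            this.symbols = new StringBuilder(seq);
        }

        // pos is assumed to be 1-based; length symbols are replaced.
        void edit(int pos, int length, String replacement) {
            throw new UnsupportedOperationException("symbol list is immutable");
        }

        String seqString() {
            return symbols.toString();
        }
    }

    // Hypothetical stand-in for a mutable copy built from another list.
    static class EditableSymbols extends ReadOnlySymbols {
        EditableSymbols(ReadOnlySymbols source) {
            super(source.seqString()); // copies the symbols, shares nothing
        }

        @Override
        void edit(int pos, int length, String replacement) {
            int start = pos - 1; // convert 1-based position to 0-based index
            symbols.replace(start, start + length, replacement);
        }
    }

    public static void main(String[] args) {
        ReadOnlySymbols original = new ReadOnlySymbols("ACGTACGTACGT");
        try {
            original.edit(5, 4, "TTTTTT");
        } catch (UnsupportedOperationException e) {
            System.out.println("vetoed: " + e.getMessage());
        }

        EditableSymbols copy = new EditableSymbols(original);
        copy.edit(5, 4, "TTTTTT");
        System.out.println(copy.seqString());     // ACGTTTTTTTACGT
        System.out.println(original.seqString()); // ACGTACGTACGT
    }
}
```

Note that the edit lands on the copy only; the original is left as it was, so any later work has to use the copy.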
cheers,
Richard
Jolyon Holdstock wrote:
> Hi,
>
> I tried using the readGenbank method with the following code...
>
> [code]
> import java.io.BufferedReader;
> import java.io.File;
> import java.io.FileNotFoundException;
> import java.io.FileReader;
> import java.io.IOException;
>
> import org.biojava.bio.BioException;
> import org.biojava.bio.symbol.Edit;
> import org.biojava.bio.symbol.SymbolList;
> import org.biojava.bio.seq.DNATools;
> import org.biojava.bio.seq.io.SymbolTokenization;
> import org.biojava.utils.ChangeVetoException;
>
> import org.biojavax.RichObjectFactory;
> import org.biojavax.bio.seq.RichSequence;
> import org.biojavax.bio.seq.io.RichSequenceBuilderFactory;
>
> public class EditBigSequence {
>     RichSequence richSeq;
>     Edit edit;
>
>     public EditBigSequence() {
>         try {
>             SymbolTokenization symbolTokenization =
>                 DNATools.getDNA().getTokenization("token");
>             richSeq = RichSequence.IOTools.readGenbank(
>                 new BufferedReader(new FileReader(new File("AF234172.gbk"))),
>                 symbolTokenization,
>                 RichSequenceBuilderFactory.FACTORY,
>                 RichObjectFactory.getDefaultNamespace()).nextRichSequence();
>
>             SymbolList insertSeq = DNATools.createDNA("AAAACCCCGGGGTTTT");
>             edit = new Edit(1000, 100, insertSeq);
>             richSeq.edit(edit);
>         }
>         catch (FileNotFoundException FNFE){
>             System.out.println("FileNotFoundException: " + FNFE);
>         }
>         catch (BioException BIOE){
>             System.out.println("BioException: " + BIOE);
>         }
>         catch (ChangeVetoException CVE){
>             CVE.printStackTrace();
>             System.out.println("ChangeVetoException: " + CVE);
>         }
>         catch (IOException IOE){
>             System.out.println("IOException: " + IOE);
>         }
>     }
>
>     public static void main(String args []){
>         EditBigSequence ebs = new EditBigSequence();
>     }
> }
> [/code]
>
> But I still got an error, for which the stack trace is below.
>
> org.biojava.utils.ChangeVetoException: AbstractSymbolList is immutable
> ChangeVetoException: org.biojava.utils.ChangeVetoException: AbstractSymbolList is immutable
>     at org.biojava.bio.symbol.AbstractSymbolList.edit(AbstractSymbolList.java:113)
>     at org.biojavax.bio.seq.DummyRichSequenceHandler.edit(DummyRichSequenceHandler.java:30)
>     at org.biojavax.bio.seq.ThinRichSequence.edit(ThinRichSequence.java:155)
>     at biojavahacks.EditBigSequence.<init>(EditBigSequence.java:47)
>     at biojavahacks.EditBigSequence.main(EditBigSequence.java:65)
>
>
> cheers,
>
> Jolyon
>
>
> -----Original Message-----
> From: Richard Holland [mailto:holland at ebi.ac.uk]
> Sent: 15 February 2008 15:17
> To: Jolyon Holdstock
> Cc: biojava-l at biojava.org
> Subject: Re: [Biojava-l] Editing a RichSequence[Scanned]
>
> I think it's because sequences are constructed internally in a
> ChunkedSymbolListFactory which compresses large sequences whereas small
> sequences are stored as normal uncompressed ones. Compressed sequences
> extend AbstractSymbolList, which is immutable (and therefore uneditable)
> whereas uncompressed ones do not, and hence are editable.
>
> You can disable the use of compressed sequences by using readGenbank()
> instead of readGenbankDNA() and passing in the DNA alphabet and the
> non-compressed sequence factory (see the static constants in
> RichSequenceBuilderFactory).
>
> If this still doesn't work, please could you post the full stacktrace so
> that we can see which class is throwing the exception and at what line
> etc.
>
> cheers,
> Richard
>
> On Fri, February 15, 2008 2:44 pm, Jolyon Holdstock wrote:
>> Hi,
>>
>> I am trying to edit a Genbank sequence.
>> The code I'm using is as follows:
>>
>> [code]
>> richSeq = RichSequence.IOTools.readGenbankDNA(new BufferedReader(new
>> FileReader(new File("U00096.gbk"))), null).nextRichSequence();
>>
>> SymbolList sl1 = DNATools.createDNA("AAAGGGTTTCCC");
>> Edit editOne = new Edit(47078, 2690, sl1);
>> richSeq.edit(editOne);
>>
>> [/code]
>>
>> When it runs it gives the following error
>>
>> ChangeVetoException: org.biojava.utils.ChangeVetoException:
>> AbstractSymbolList is immutable
>>
>>
>> I have used the code for a smaller sequence (15kb, compared with 4Mb)
>> and it works.
>>
>> Does anyone have an idea why this is not working?
>>
>> Thanks,
>>
>> Jolyon
>>
>>
>>
>>
>>
>> Jolyon Holdstock Ph.D.
>> Senior Computational Biologist,
>> Oxford Gene Technology,
>> Begbroke Science Park,
>> Sandy Lane, Yarnton
>> Oxford, OX5 1PF
>>
>> Tel: +44 (0)1865 856852
>> Fax: +44 (0)1865 842116
>>
>> _______________________________________________
>> Biojava-l mailing list - Biojava-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>
>
>
--
Richard Holland (BioMart)
EMBL EBI, Wellcome Trust Genome Campus,
Hinxton, Cambridgeshire CB10 1SD, UK
Tel. +44 (0)1223 494416
http://www.biomart.org/
http://www.biojava.org/