[Bioperl-l] Strange problem with Bio::Seq::LargePrimarySeq

Tue Mar 15 09:27:05 EST 2005

I see your point with regard to splitting it up.  I was debating how 
exactly to do the split myself, and I'll work on that later as I look at 
the code.  One quick question about the code snipet you posted, though.  
Are you trying to create a separate stream from the extant SeqIO?  If 
so, why?  Wouldn't it be redundant?

Stefan Kirov wrote:

> Your first problem is that you cannot access SeqFeature methods 
> (Annotation as well) as LargePrimarySeq inherits (unlike LargeSeq) 
> only from PrimarySeq. Therefore you don't have these methods 
> available. Two approaches: create an empty Seq object to hold the 
> annotation, or read the file into a new LargeSeq object:
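> my $id=<>;    # the first line of the FASTA file is the header
> # $largeseq is assumed to be an already created Bio::Seq::LargeSeq object
> while (<>) {
>     chomp;
>     $largeseq->add_sequence_as_string($_);    # append this line of sequence
> }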
> I don't know what the performance will be with the second object when 
> you add features. First one is far safer I think.
> The next problem is performance when screening the sequence. Unless you 
> have something BIG, you will likely have to split the sequence into at 
> least several chunks (then you can check whether the split disrupted a 
> signal, site, etc.) or get a few more gigs of RAM (best would be some 
> shared memory and a grid, if you want to kill a fly with a tank :-) ).
> Let me know if you have further questions.
> Hope this helps and good luck.
> Stefan
>
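A minimal sketch of the chunking idea above, assuming only that the sequence object offers length() and subseq($start, $end) with 1-based inclusive coordinates (as BioPerl PrimarySeqI objects do). The name find_motif_in_windows, the window size, and the AATAAA example motif are illustrative assumptions, not code from this thread. Consecutive windows overlap by one base less than the motif length, so a site that straddles a chunk boundary should still be reported exactly once. Only the forward strand is scanned here.

```perl
use strict;
use warnings;

# Scan a sequence object window by window instead of pulling the whole
# sequence into memory with $seq->seq. Works with any object providing
# length() and subseq($start, $end) (1-based, inclusive).
# find_motif_in_windows and its parameters are hypothetical names.
sub find_motif_in_windows {
    my ($seq, $motif, $window) = @_;
    $motif = uc $motif;
    my $mlen = length $motif;
    die "window must be at least as long as the motif\n" if $window < $mlen;

    my $total = $seq->length;
    my @hits;
    my $start = 1;
    while ($start + $mlen - 1 <= $total) {
        my $end = $start + $window - 1;
        $end = $total if $end > $total;
        my $chunk = uc $seq->subseq($start, $end);

        # Advance one base after each match so overlapping hits are found.
        my $pos = 0;
        while (($pos = index($chunk, $motif, $pos)) >= 0) {
            push @hits, $start + $pos;    # 1-based coordinate in the full sequence
            $pos++;
        }
        last if $end == $total;

        # Step back by (motif length - 1) so a motif straddling the
        # boundary is seen whole in the next window, but not counted twice.
        $start = $end - $mlen + 2;
    }
    return @hits;
}
```

With a real object this might be called as find_motif_in_windows($seq, 'AATAAA', 1_000_000), where $seq is whatever $seqio->next_seq() returned. The window size is a memory/speed trade-off to tune for your machine.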
> iluminati at earthlink.net wrote:
>
>> Thanks for asking the questions!  In hindsight, I realized that I 
>> glossed over the problem in my frustration.
>> Anyway, here's the drill.  I created a seq object from a 
>> chromosome-sized fasta file like so...
>>
>> my $seqio = Bio::SeqIO->new('-format' => 'largefasta',
>>                             '-file'   => Bio::Root::IO->catfile("/Thesis Stuff/Chr$Chromosome/chr$Chromosome.fa"));
>> # Create the seq object
>> my $seq = $seqio->next_seq();
>>
>> From there, I want to manipulate the sequence and use the functions 
>> generally available to a seq object.  Now, in order to build the 
>> seq object, I have to use the Bio::SeqIO::largefasta module.  The 
>> reason I need the Bio::Seq::LargePrimarySeq module is so that I can 
>> manipulate the sequence and get to the necessary functions.  However, 
>> I get this error running the script despite including the 
>> Bio::Seq::LargePrimarySeq module:
>>
>> Can't locate object method "add_SeqFeature" via package 
>> "Bio::Seq::LargePrimarySeq" (perhaps you forgot to load 
>> "Bio::Seq::LargePrimarySeq"?) at ThesisFrontEndScript.pl line 94, 
>> <GeneExpressionData> line 33294.
>>
>>
>> I can send you the code in question if you want to get a better 
>> look-see.
>> Now, the reason I need the whole sequence is two-fold.  For one, I 
>> need to be able to calculate CG% of genes as an experimental control 
>> of my project.  The other part is that I need to be able to scan the 
>> genome for polyA sites with respect to their orientation to L1 sites, 
>> and there's no simple way to do that other than flat-out scanning the 
>> code.  I'll definitely look into tweaking the /tmp directory if that 
>> helps, but other than that, I have to at least try and make it work.
>>
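For the CG% control mentioned above, one possible sketch is to compute the percentage on one gene-sized piece at a time rather than on the whole chromosome. The function name gc_percent is hypothetical, and treating the denominator as called bases only (A, C, G, T, ignoring N and other ambiguity codes) is an assumption you may want to change.

```perl
use strict;
use warnings;

# Percentage of G and C among the called bases (A, C, G, T) of a DNA string.
# Returns undef when the string has no called bases (e.g. all N).
# gc_percent is a hypothetical helper name.
sub gc_percent {
    my ($dna) = @_;
    # tr/// with an empty replacement list and no /d only counts characters;
    # it leaves the string unchanged.
    my $gc = ($dna =~ tr/GCgc//);
    my $at = ($dna =~ tr/ATat//);
    my $called = $gc + $at;
    return undef unless $called;
    return 100 * $gc / $called;
}
```

A gene's sequence could be fetched with something like $seq->subseq($gene_start, $gene_end) and passed straight in, where $gene_start and $gene_end are placeholders for coordinates from your own annotation.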
>> Stefan Kirov wrote:
>>
>>> First you have to answer a few questions: how do you get the object?
>>> "use Bio::Seq::LargePrimarySeq" does not create an object; it merely 
>>> makes the code available.
>>> If you post your code here it will be much easier to answer your 
>>> questions. How do you access the sequence? (I hope you have read the 
>>> documentation, which states that it is not generally a good idea to 
>>> call $seq->seq.)
>>> How big is your /tmp? What are you trying to accomplish, and why do you 
>>> need the whole seq in memory?
>>> Stefan
>>>
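To illustrate the point that loading a module with "use" does not change which methods an object has, here is a small self-contained sketch. Toy::PrimarySeq and Toy::Seq are made-up stand-ins, not BioPerl classes, and describe_method_support is a hypothetical helper. It relies only on ref() and the universal can() method to show which class an object belongs to and whether that class (or a parent) provides a given method.

```perl
use strict;
use warnings;

# Toy classes standing in for a real hierarchy (hypothetical names):
# Toy::PrimarySeq only stores sequence, Toy::Seq adds feature handling.
package Toy::PrimarySeq;
sub new { my ($class, %args) = @_; return bless { seq => $args{seq} // '' }, $class }
sub seq { return $_[0]{seq} }

package Toy::Seq;
our @ISA = ('Toy::PrimarySeq');
sub add_SeqFeature {
    my ($self, $feat) = @_;
    push @{ $self->{features} }, $feat;
    return 1;
}

package main;

# Report an object's class and whether it (or a parent) implements a method.
sub describe_method_support {
    my ($obj, $method) = @_;
    my $verdict = $obj->can($method) ? 'can' : 'cannot';
    return sprintf "%s %s %s", ref($obj), $verdict, $method;
}

print describe_method_support(Toy::PrimarySeq->new(seq => 'ACGT'), 'add_SeqFeature'), "\n";
print describe_method_support(Toy::Seq->new(seq => 'ACGT'), 'add_SeqFeature'), "\n";
```

Running the same two checks, ref($seq) and $seq->can('add_SeqFeature'), on the object returned by next_seq() is a quick way to see what you actually got back.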
>>> I've had the same issue... I ended up breaking down the sequences into
>>> manageable fragments but would really like to get the LargePrimarySeq
>>> working.  When I tried loading a chrom size sequence I just sat back
>>> and watched my RAM get used up (2 gigs), then the swap, then the
>>> crash....  So if anyone can help it'd benefit both of us!
>>>
>>> Thanks for any help,
>>> Garrett
>>>
>>>
>>> On Mon, 14 Mar 2005 16:22:40 -0500, iluminati at earthlink.net wrote:
>>>
>>>> I'm having this unusual problem with loading this particular 
>>>> module.  I need it b/c I'm working with chromosome-sized sequence 
>>>> files as a part of my project, but yet it seems to not want to load 
>>>> properly even when it's loaded using the following statement:
>>>>
>>>> use Bio::Seq::LargePrimarySeq;
>>>>
>>>> I checked my modules, and the necessary module is there.  It seems 
>>>> to just not want to load.  Can anyone be of service?
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at portal.open-bio.org
>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l