[Bioperl-l] Strange problem with Bio::Seq::LargePrimarySeq

Stefan Kirov skirov at utk.edu
Tue Mar 15 09:53:43 EST 2005



iluminati at earthlink.net wrote:

> I see your point with regard to splitting it up.  I was debating how 
> exactly to do the split myself, and I'll work on that later as I look 
> at the code.  One quick question about the code snippet you posted, 
> though.  Are you trying to create a separate stream from the extant 
> SeqIO?  If so, why?  Wouldn't it be redundant?
>
No, this just reads a standard file from STDIN (not through BioPerl SeqIO) 
and puts it in a LargeSeq instead of a LargePrimarySeq. I don't know why 
largefasta has been implemented to create a LargePrimarySeq rather than a 
LargeSeq object, but this is the current state.
Stefan
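
For the archive, a slightly fuller sketch of the approach described above: 
read one FASTA record from a filehandle into a Bio::Seq::LargeSeq, then 
attach a feature to it. The helper name load_large_seq, the record name 
"chrTest", and the feature coordinates are made up for illustration, and 
the demo reads from an in-memory string so it runs without a data file. 
It assumes a BioPerl installation that provides Bio::Seq::LargeSeq and 
Bio::SeqFeature::Generic; treat it as a starting point, not a tested recipe.

```perl
use strict;
use warnings;
use Bio::Seq::LargeSeq;
use Bio::SeqFeature::Generic;

# Hypothetical helper (not part of BioPerl): read a single FASTA record
# from a filehandle into a Bio::Seq::LargeSeq, one line at a time.
sub load_large_seq {
    my ($fh) = @_;
    my $header = <$fh>;
    die "no FASTA header found\n"
        unless defined $header && $header =~ /^>(\S+)/;
    my $largeseq = Bio::Seq::LargeSeq->new(-display_id => $1);
    while (my $line = <$fh>) {
        chomp $line;
        last if $line =~ /^>/;        # stop at the next record
        next unless length $line;     # skip blank lines
        $largeseq->add_sequence_as_string($line);
    }
    return $largeseq;
}

# Demo input: an in-memory record. Pass \*STDIN instead to read a real file.
my $fasta = ">chrTest example\nACGT\nGGCC\n";
open my $fh, '<', \$fasta or die "cannot open in-memory file: $!";
my $largeseq = load_large_seq($fh);
close $fh;

# LargeSeq inherits from Bio::Seq, so add_SeqFeature is available here.
my $feature = Bio::SeqFeature::Generic->new(
    -start       => 2,
    -end         => 5,
    -strand      => 1,
    -primary_tag => 'misc_feature',
);
$largeseq->add_SeqFeature($feature);

my @features = $largeseq->get_SeqFeatures;
printf "%s: %d bp, %d feature(s)\n",
    $largeseq->display_id, $largeseq->length, scalar @features;
```

Reading line by line keeps only one line of sequence in memory at a time; 
the rest is held in the temporary file that LargePrimarySeq manages.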

> Stefan Kirov wrote:
>
>> Your first problem is that you cannot access the SeqFeature methods 
>> (or the Annotation ones), because LargePrimarySeq, unlike LargeSeq, 
>> inherits only from PrimarySeq. Therefore you don't have these methods 
>> available. Two approaches: create an empty Seq object to hold the 
>> annotation, or read the file into a new LargeSeq object:
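>> use Bio::Seq::LargeSeq;
>> my $largeseq = Bio::Seq::LargeSeq->new();
>> my $id = <>;                  # first line of the file: the ID/header line
>> while (<>) {
>>     chomp;
>>     $largeseq->add_sequence_as_string($_);  # append this line of sequence
>> }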
>> I don't know what the performance of the second approach will be when 
>> you add features. The first one is far safer, I think.
>> The next problem is performance when screening the sequence. Unless 
>> you have a BIG machine, you will likely have to split the sequence into 
>> at least several chunks (then you can check whether the split disrupted 
>> a signal, site, etc.) or get a few more gigs of RAM (best would be some 
>> shared memory and a grid if you want to kill a fly with a tank :-) ).
>> Let me know if you have further questions.
>> Hope this helps and good luck.
>> Stefan
>>
>> iluminati at earthlink.net wrote:
>>
>>> Thanks for asking the questions!  In hindsight, I realized that I 
>>> glossed over the problem in my frustration.
>>> Anyway, here's the drill.  I created a seq object from a 
>>> chromosome-sized fasta file like so...
>>>
>>> my $seqio = new Bio::SeqIO('-format' => 'largefasta',
>>>                            '-file'   => Bio::Root::IO->catfile(
>>>                                "/Thesis Stuff/Chr$Chromosome/chr$Chromosome.fa"));
>>>    # Create the seq object
>>>    my $seq = $seqio->next_seq();
>>>
>>> From there, I want to manipulate the sequence and use the functions 
>>> generally available to a seq object.  Now, in order to build the 
>>> seq object, I have to use the Bio::SeqIO::largefasta module.  The 
>>> reason I need the Bio::Seq::LargePrimarySeq module is so that I can 
>>> manipulate the sequence and get to the necessary functions.  
>>> However, I get this error running the script despite including the 
>>> Bio::Seq::LargePrimarySeq module:
>>>
>>> Can't locate object method "add_SeqFeature" via package 
>>> "Bio::Seq::LargePrimarySeq" (perhaps you forgot to load 
>>> "Bio::Seq::LargePrimarySeq"?) at ThesisFrontEndScript.pl line 94, 
>>> <GeneExpressionData> line 33294.
>>>
>>>
>>> I can send you the code in question if you want to get a better 
>>> look-see.
>>> Now, the reason I need the whole sequence is two-fold.  For one, I 
>>> need to be able to calculate CG% of genes as an experimental control 
>>> of my project.  The other part is that I need to be able to scan the 
>>> genome for polyA sites with respect to their orientation to L1 
>>> sites, and there's no simple way to do that other than flat-out 
>>> scanning the code.  I'll definitely look into tweaking the /$tmp 
>>> directory if that helps, but other than that, I have to at least try 
>>> and make it work.
>>>
>>> Stefan Kirov wrote:
>>>
>>>> First you have to answer a few questions: how do you get the object?
>>>> "use Bio::Seq::LargePrimarySeq" does not create an object; it merely 
>>>> makes the code available.
>>>> If you post your code here it will be much easier to answer your 
>>>> questions. How do you access the sequence? (I hope you have read the 
>>>> documentation, which states that it is not generally a good idea to 
>>>> call $seq->seq.)
>>>> How big is your /tmp? What are you trying to accomplish, and why do 
>>>> you need the whole seq in memory?
>>>> Stefan
>>>>
>>>> I've had the same issue... I ended up breaking down the sequences into
>>>> manageable fragments but would really like to get the LargePrimarySeq
>>>> working.  When I tried loading a chrom size sequence I just sat back
>>>> and watched my RAM get used up (2 gigs), then the swap, then the
>>>> crash....  So if anyone can help it'd benefit both of us!
>>>>
>>>> Thanks for any help,
>>>> Garrett
>>>>
>>>>
>>>> On Mon, 14 Mar 2005 16:22:40 -0500, iluminati at earthlink.net wrote:
>>>>
>>>>> I'm having this unusual problem with loading this particular 
>>>>> module.  I need it b/c I'm working with chromosome-sized sequence 
>>>>> files as a part of my project, but yet it seems to not want to load 
>>>>> properly even when it's loaded using the following statement:
>>>>>
>>>>> use Bio::Seq::LargePrimarySeq;
>>>>>
>>>>> I checked my modules, and the necessary module is there.  It seems 
>>>>> to just not want to load.  Can anyone be of service?
>>>>>
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at portal.open-bio.org
>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>
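
The chunking suggested earlier in the thread (split the sequence, then 
check that no signal was cut at a boundary) could look roughly like the 
sketch below. scan_in_chunks is a made-up helper, not a BioPerl function. 
It takes a callback returning the 1-based, inclusive substring for a 
(start, end) pair, which is the same convention as $seq->subseq($start, 
$end), so it should work on a large sequence object as well as on the 
plain string used in the demo. The motif AATAAA and the chunk size are 
illustrative choices only, and only the forward strand is searched.

```perl
use strict;
use warnings;

# Hypothetical helper: count G/C bases and find motif start positions
# while fetching only one chunk of the sequence at a time.
sub scan_in_chunks {
    my ($fetch, $length, $motif, $chunk_size) = @_;
    my $overlap = length($motif) - 1;   # lets a hit span a chunk boundary
    die "chunk size must exceed motif length\n" if $chunk_size <= $overlap;

    my $gc = 0;
    my @hits;
    for (my $start = 1; $start <= $length; $start += $chunk_size) {
        my $end = $start + $chunk_size - 1;
        $end = $length if $end > $length;
        my $core_len = $end - $start + 1;

        # Fetch the chunk plus a short overlap into the next one.
        my $ext_end = $end + $overlap;
        $ext_end = $length if $ext_end > $length;
        my $chunk = uc $fetch->($start, $ext_end);

        # Count G/C in the non-overlapping core only, so no base is
        # counted twice.
        my $core = substr($chunk, 0, $core_len);
        $gc += ($core =~ tr/GC//);

        # Keep hits that start inside the core; a hit starting in the
        # overlap is picked up by the next chunk.
        my $pos = 0;
        while (($pos = index($chunk, $motif, $pos)) != -1) {
            last if $pos >= $core_len;
            push @hits, $start + $pos;   # 1-based position in the sequence
            $pos++;
        }
    }
    return ($gc, \@hits);
}

# Demo on a small in-memory string; for a large sequence object the
# callback would be: sub { $seq->subseq($_[0], $_[1]) }
my $dna   = "ccAATAAAgg" x 3;
my $fetch = sub { my ($s, $e) = @_; return substr($dna, $s - 1, $e - $s + 1) };
my ($gc, $hits) = scan_in_chunks($fetch, length $dna, 'AATAAA', 7);
printf "GC%% = %.1f, motif starts: %s\n",
    100 * $gc / length($dna), join(',', @$hits);
```

To cover the reverse strand, the same scan can be repeated with the 
reverse complement of the motif (TTTATT for AATAAA).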


