[Bioperl-l] Feature table comparison

Sean Davis sdavis2 at mail.nih.gov
Tue Jan 18 06:43:54 EST 2005


Rob,

Perhaps others have done something similar.  In general, it helps to 
post back to the list so we both benefit from others' knowledge and 
others benefit from your thoughts on what is not a straightforward 
problem.

As for my two cents' worth, can't you just go through the alignments for 
each strain, sort them in genomic order of one strain, and determine 
the segments that do not align from the end of one alignment and the 
beginning of the next?  Do the same sort for the other strain to get the 
unaligned blocks for that strain.  Then move on to the next pairing of 
strains and repeat.  That will give you the unaligned blocks for each 
strain with respect to each other strain.  Then you can go back to your 
feature table for each strain and look for overlaps between the 
unaligned segments and the annotated features; for this there are tools 
in BioPerl.  See Bio::DB::GFF, and possibly others.
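
The sort-and-gap idea above can be sketched roughly as follows.  This is 
plain Python rather than BioPerl, and it assumes you have already pulled 
the aligned coordinates out of your alignment output as (start, end) 
pairs on one strain, 1-based and inclusive.  The function names and the 
example numbers are made up for illustration only.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed 1-based, inclusive (start, end)


def unaligned_blocks(aligned: List[Interval], genome_length: int) -> List[Interval]:
    """Return the stretches of one genome not covered by any alignment."""
    # Normalise reverse-strand hits (start > end), then sort in genomic order.
    ordered = sorted((min(s, e), max(s, e)) for s, e in aligned)
    gaps = []
    next_uncovered = 1  # first position not yet covered by an alignment
    for start, end in ordered:
        if start > next_uncovered:
            # Gap between the end of the previous alignment and this one.
            gaps.append((next_uncovered, start - 1))
        next_uncovered = max(next_uncovered, end + 1)
    if next_uncovered <= genome_length:
        gaps.append((next_uncovered, genome_length))
    return gaps


def features_in_blocks(features: List[Tuple[str, int, int]],
                       blocks: List[Interval]) -> List[str]:
    """Return names of features overlapping at least one unaligned block."""
    hits = []
    for name, f_start, f_end in features:
        if any(f_start <= b_end and b_start <= f_end for b_start, b_end in blocks):
            hits.append(name)
    return hits


# Hypothetical data: alignments of strain A against strain B, in A's coordinates.
aligned = [(300, 500), (1, 120), (450, 700)]
features = [("geneA", 10, 90), ("geneB", 150, 280), ("geneC", 690, 760)]

blocks = unaligned_blocks(aligned, genome_length=800)
print(blocks)
print(features_in_blocks(features, blocks))
```

You would run this once per strain in each pairing.  A feature that only 
partly overlaps an unaligned block is still reported here, so you may 
want a stricter test (for example, requiring the whole feature to fall 
inside the block) depending on the question.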

Sean

On Jan 18, 2005, at 6:24 AM, Robert Minshall wrote:

> Thanks for your help. So far I can align the DNA EMBL files without any
> problem and work out by hand which bits are not aligned, but I have 6
> strains to compare against each other, and the first alignment has taken
> me months so far and I'm only halfway through it. All I wanted to do was
> find sections of DNA that are present in one strain but not in another
> and work out what the proteins were. The feature tables for most of my
> strains are not well annotated, so I now think a feature table
> comparison is not the correct way forward. I just want to separate out
> the sections of DNA that are "unique" to one particular strain, get the
> proteins they encode, and see whether they appear in the other strains.
> It seems simple in my head, but in practice it's not.
> --
> Robert J Minshall
> Postgraduate Researcher in Microbiology,
> Biosciences Research Institute,
> School of Environment and Life Sciences,
> Lab 209 Cockcroft Building,
> University of Salford,
> Salford,
> Greater Manchester.
> M5 4WT
> UK
> 0161 2952652
> r.j.mishall at pgr.salford.ac.uk
>
>
>
> Quoting Sean Davis <sdavis2 at mail.nih.gov>:
>
>> Rob,
>>
>> If you have files in EMBL format, you can use Bio::SeqIO to read them.
>> What is in the EMBL files--protein or DNA?  Are the features named in a
>> systematic manner (are the same genes called the same thing in both
>> strains if they are present)?  If they are, can you simply do an ID
>> matching between the two strains?  Judging from your email below,
>> probably not.
>>
>> If the question you are asking is truly the opposite of an alignment,
>> then you will need to do more work.  This is beyond my usual
>> bioinformatics realm, but I would imagine that you would need to align
>> the two genomes first (and how you do this will greatly affect your
>> results, I would suppose) and then look for what didn't align in each
>> strain.  I'm sure others on the list have done this kind of thing
>> before.  I'm just not sure what the state-of-the-art is for
>> whole-genome alignments these days.
>>
>> Sean
>>
>> On Jan 18, 2005, at 4:36 AM, Robert Minshall wrote:
>>
>>> I am basically trying to find the differences between 2 strains of
>>> bacteria in EMBL format. What I really need is an inverted ACT (Artemis
>>> Comparison Tool). diffseq from EMBOSS won't do what I need; I just need
>>> to somehow get a list of proteins that are in one strain and not the
>>> other. This can be done by hand but will take months. I was wondering
>>> if there is a program out there where I can input the 2 EMBL files and
>>> get a list of feature differences, or the opposite of an alignment.
>>> Thanks
>>> Rob
>>>
>>> Quoting Sean Davis <sdavis2 at mail.nih.gov>:
>>>
>>>> Rob,
>>>>
>>>> You will probably need to be a bit more specific.  What constitutes a
>>>> "genome" in your email below?  What are the features?  In what form
>>>> are you getting the data?  Do you have a specific question you are
>>>> trying to answer?
>>>>
>>>> Sean
>>>>
>>>> ----- Original Message -----
>>>> From: "Robert Minshall" <R.J.Minshall at pgr.salford.ac.uk>
>>>> To: <bioperl-l at bioperl.org>
>>>> Sent: Monday, January 17, 2005 8:39 AM
>>>> Subject: [Bioperl-l] Feature table comparison
>>>>
>>>>
>>>>>
>>>>> Hi, does anyone know of or have a script that can compare the feature
>>>>> tables of genomes and show what appears in one and not the other? I.e.
>>>>> I want to find the differences between the feature tables. Is this
>>>>> possible? I'm new to Perl and was hoping that someone could point me
>>>>> in the right direction. My email is r.j.minshall at pgr.salford.ac.uk
>>>>> Thanks in advance,
>>>>> Rob Minshall
>>>>>
>>>>> ----------------------------------------------------------------
>>>>> Concerns about content should be sent to abuse at salford.ac.uk
>>>>> _______________________________________________
>>>>> Bioperl-l mailing list
>>>>> Bioperl-l at portal.open-bio.org
>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>


