[Biopython-dev] [BioPython] about the SeqRecord slicing

Peter biopython at maubp.freeserve.co.uk
Thu Mar 26 11:07:33 EDT 2009


On Thu, Mar 26, 2009 at 12:24 PM, Jose Blanca <jblanca at btc.upv.es> wrote:
> On Thursday 26 March 2009 13:05:25 Peter wrote:
>> Can you give me an example of where you want to pull out a single
>> character from a SeqRecord, and its quality?  I would consider things
>> like this quite elegant:
>>
>> for letter, quality in zip(record.seq,
>>                            record.letter_annotations["phred_quality"]):
>>     pass  # do stuff
>
> I'm implementing a Contig class similar to the Alignment class but with the
> added capability of supporting sequences that do not start and end at the
> same position and with the capability of masking the sequences.
> I'm implementing the __getitem__ method.
> When I request a column, I take an int slice from every sequence and return the
> result of adding them all. I could solve the problem as you suggest. The
> problem is that this Contig class can work also with Seqs and strs (to
> simplify its use when we don't need a full SeqRecord). If SeqRecord behaves
> more like a Seq or a str I wouldn't need to check for the special SeqRecord
> case in the Contig.__getitem__ method.
> Best regards,

If you pull out a column from a Seq or string based alignment, there is no
annotation to worry about, and you can return the column as a Seq or string.
As things stand, if it was a SeqRecord based alignment, having my_string[i],
my_seq[i] and my_seqrecord[i] all return a single letter string is actually
rather nice for generic code - as long as you are happy returning a Seq or a
string for the column.
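
As a rough sketch of that generic case: the helper below (get_column is a
hypothetical name, not part of Biopython) relies only on row[index] giving
back a single letter string. It is demonstrated with plain strings here, and
it assumes, as described above, that Seq and SeqRecord objects answer integer
indexing the same way.

```python
def get_column(rows, index):
    """Return one alignment column as a plain string.

    Hypothetical helper: it only assumes that row[index] yields a
    single-letter string for every row (str here; Seq and SeqRecord
    are assumed to behave the same way, as described in the message).
    """
    return "".join(row[index] for row in rows)


# Plain strings stand in for the aligned sequences.
rows = ["ACGT", "AC-T", "ATGT"]
column = get_column(rows, 1)
print(column)
```

Because nothing in the helper inspects the row type, no special SeqRecord
case is needed as long as a string is an acceptable return value.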

However, if I understand you, when pulling a column from a SeqRecord
based alignment, in addition to the column's sequence you'd like to get the
per-letter-annotations as well.  This assumes that all the SeqRecord objects
in the alignment have the same per-letter-annotation present - some might
have quality and others might not!  But how would you want to store this
new column object?  A string or a Seq doesn't support any annotation, but
you *could* use a SeqRecord with no id, name, description, features or
annotations: just a sequence and any common per-letter-annotation.  Is this
what you had in mind?
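
To make the question concrete, here is a minimal sketch of such a column
object. It deliberately avoids Biopython: Record is a hypothetical stand-in
for a bare SeqRecord (a sequence plus a letter_annotations dict), and
get_annotated_column is an illustrative name. Only annotation keys present
in every record are carried over, and a non-empty list of records is assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for a bare SeqRecord: no id, name or features,
    just a sequence and its per-letter annotations."""
    seq: str
    letter_annotations: dict = field(default_factory=dict)


def get_annotated_column(records, index):
    """Build a column Record from position `index` of every record.

    Assumes `records` is non-empty. Only per-letter annotations that
    every record provides are kept in the result.
    """
    # Find the annotation keys shared by all records.
    common = set(records[0].letter_annotations)
    for rec in records[1:]:
        common &= set(rec.letter_annotations)

    letters = "".join(rec.seq[index] for rec in records)
    annotations = {
        key: [rec.letter_annotations[key][index] for rec in records]
        for key in sorted(common)
    }
    return Record(letters, annotations)


records = [
    Record("ACGT", {"phred_quality": [30, 31, 32, 33]}),
    # "extra" is only on this record, so it is dropped from the column.
    Record("AGGT", {"phred_quality": [20, 21, 22, 23], "extra": [1, 2, 3, 4]}),
]
column = get_annotated_column(records, 1)
print(column.seq, column.letter_annotations)
```

If some records carry quality scores and others do not, the shared key set
is empty and the column comes back with a sequence but no annotations.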

Peter


