[Biopython-dev] [Bug 2381] translate and transcribe methods for the Seq object (in Bio.Seq)
bugzilla-daemon at portal.open-bio.org
Thu Nov 6 17:32:52 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2381
------- Comment #48 from lpritc at scri.sari.ac.uk 2008-11-06 12:32 EST -------
(In reply to comment #47)
> (In reply to comment #46)
> > [seq.translate() for seq in seqlist if seq.is_cds()]
> >
> > I prefer the second option, for readability, but YMMV.
>
> Note the above wouldn't give you translations starting with methionine, you'd
> need something like:
>
> [seq.translate(cds_start=True) for seq in seqlist if seq.is_cds()]
>
> (assuming we call the "init" option "cds_start")
Fair point... my focus was on putting that filter into the list comprehension.
> Or, going with the complete_cds option you could build a list of translations
> of valid CDSs like this:
>
> proteins = []
> for seq in seqlist :
>     try :
>         proteins.append(seq.translate(complete_cds=True))
>     except ValueError :
>         #Not a valid CDS, excluded
>         pass
>
> Not a one liner, but I think in a real situation you'd want to do something
> with the invalid CDSs anyway (even if just logging them).
True enough. It comes down in part to a preference of style, as the same could
be achieved with
proteins = []
for seq in seqlist :
    if seq.is_cds():
        proteins.append(seq.translate(complete_cds=True))
    else:
        #Not a valid CDS, excluded
        pass
I think the clarity of this arrangement to my eyes comes from 'is/is not a cds'
being - naturally-speaking - a property or attribute of the sequence itself.
The 'cds_start' argument in your example is then an instruction to treat the
translation as though you have a CDS, and implement some specialised behaviour
that is appropriate under that circumstance, rather than to implement a test
that raises an error if it is failed. By separating the 'is_cds()' call from
the 'cds_start' argument, you gain the ability to translate the sequence with
either the methionine or the coded amino acid, without losing the test of the
sequence being a CDS.
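
A minimal sketch of that separation, for illustration only. ToySeq is a
hypothetical stand-in and not the Bio.Seq.Seq class; is_cds() and the
cds_start argument are the names floated in this thread, not an existing API.
The sketch assumes unambiguous upper-case DNA, the NCBI standard genetic code,
and that "valid CDS" means: length a multiple of three, a start codon first, a
stop codon last, and no in-frame stop in between.

```python
from itertools import product

# NCBI standard genetic code (table 1), with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
START_CODONS = {"ATG", "CTG", "TTG"}  # start codons listed for table 1
STOP_CODONS = {"TAA", "TAG", "TGA"}


class ToySeq:
    """Hypothetical stand-in for a sequence object (assumption, not Bio.Seq)."""

    def __init__(self, data):
        self.data = data

    def codons(self):
        # Complete codons only; a trailing partial codon is ignored.
        return [self.data[i:i + 3] for i in range(0, len(self.data) - 2, 3)]

    def is_cds(self):
        # A property of the sequence itself; nothing is translated here.
        codons = self.codons()
        return (
            len(self.data) % 3 == 0
            and len(codons) >= 2
            and codons[0] in START_CODONS
            and codons[-1] in STOP_CODONS
            and not any(c in STOP_CODONS for c in codons[:-1])
        )

    def translate(self, cds_start=False):
        # cds_start only changes how the first codon is read; it does not
        # validate the sequence. A non-start first codon is left as coded.
        codons = self.codons()
        protein = [CODON_TABLE[c] for c in codons]
        if cds_start and codons and codons[0] in START_CODONS:
            protein[0] = "M"
        return "".join(protein)


seq = ToySeq("TTGAAATAA")
print(seq.is_cds())                   # True
print(seq.translate())                # LK*  (TTG read as the coded leucine)
print(seq.translate(cds_start=True))  # MK*  (TTG read as initiator methionine)

seqlist = [seq, ToySeq("AAATTT")]
proteins = [s.translate(cds_start=True) for s in seqlist if s.is_cds()]
print(proteins)                       # ['MK*']
```

With the test and the translation option kept apart like this, the filter in
the list comprehension works the same whichever first residue is wanted.
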
Of course, using the 'cds_start=True' argument could force a call to
self.is_cds(), anyway. Your non-one-liner could then be as you originally
wrote:
proteins = []
for seq in seqlist :
    try:
        proteins.append(seq.translate(complete_cds=True))
    except ValueError:
        #Not a valid CDS, excluded
        pass
The two advantages I see to having the is_cds() method as a separate call are
that it separates determining the CDS status of the sequence from translating
it, and that it provides a filter that is more readable than attempting to
translate the sequence to find out whether it's a valid CDS. If the 'cds_start'
argument forces a self.is_cds() test, then the usage can be - I think - exactly
as you've been proposing throughout the thread.
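
A rough sketch of how the two styles would line up if the translation option
ran the CDS test itself. Everything here is hypothetical: is_cds() and
translate() are plain illustrative functions rather than Seq methods, the
option is spelled complete_cds as in the loops above, and the assumed
behaviour (raise ValueError for an invalid CDS, return the protein with an
initiator methionine and without the terminal stop) is one possible reading
of the proposal, not a settled design.

```python
import logging
from itertools import product

log = logging.getLogger(__name__)

# NCBI standard genetic code (table 1), with bases ordered T, C, A, G.
CODON_TABLE = dict(zip(
    ("".join(codon) for codon in product("TCAG", repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))
START_CODONS = {"ATG", "CTG", "TTG"}  # start codons listed for table 1
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(dna):
    """Split unambiguous upper-case DNA into complete codons."""
    return [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]


def is_cds(dna):
    """Hypothetical test: start codon, stop codon, no internal stop."""
    codons = split_codons(dna)
    return (
        len(dna) % 3 == 0
        and len(codons) >= 2
        and codons[0] in START_CODONS
        and codons[-1] in STOP_CODONS
        and not any(c in STOP_CODONS for c in codons[:-1])
    )


def translate(dna, complete_cds=False):
    """Hypothetical translate(); complete_cds=True forces the is_cds() test."""
    codons = split_codons(dna)
    if not complete_cds:
        return "".join(CODON_TABLE[c] for c in codons)
    if not is_cds(dna):
        raise ValueError("not a valid CDS: %r" % dna)
    # Initiator methionine first, terminal stop codon dropped.
    return "M" + "".join(CODON_TABLE[c] for c in codons[1:-1])


def proteins_by_exception(seqlist):
    """Translate first and treat ValueError as 'not a CDS'."""
    proteins = []
    for dna in seqlist:
        try:
            proteins.append(translate(dna, complete_cds=True))
        except ValueError:
            log.warning("excluded invalid CDS: %s", dna)
    return proteins


def proteins_by_filter(seqlist):
    """Ask the sequence first, then translate only the valid CDSs."""
    return [translate(dna, complete_cds=True) for dna in seqlist if is_cds(dna)]


seqlist = ["TTGAAATAA", "AAATTT", "ATGTAACCCTAA"]
print(proteins_by_exception(seqlist))  # ['MK']
print(proteins_by_filter(seqlist))     # ['MK']
```

Under these assumptions both loops return the same list; the filter version
runs the check twice per valid sequence, which is the price of the more
readable test.
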