[Biopython-dev] Biopython 1.60 plans and beyond

Peter Cock p.j.a.cock at googlemail.com
Sat Feb 18 17:24:02 EST 2012


On Sat, Feb 18, 2012 at 10:17 PM, Eric Talevich <eric.talevich at gmail.com> wrote:
>> If right now it has known failures, I don't want to squeeze this into
>> Biopython 1.59 next week.
>
> Agreed! But 1.60 sounds like a good goal.

OK.

>> Does your code manage to produce the same FASTA sequence as
>> the PDB themselves offer for download? That would be my expectation
>> as an end user. It should be easy enough to test if you've already
>> done a full local PDB download.
>
> If there are disordered regions (very common), the missing residues are
> replaced with 'X' characters. These residues can be listed in the SEQRES
> lines of the PDB header, if it's available, but they're not included with
> the atomic coordinates, so PdbIO can't reliably fill in these disordered
> residues for all PDB files. This matches the behavior of the tool I was
> using before (which is non-free and not widely used).
>
> I don't keep a local copy of PDB normally, but I'll download it and do the
> test before asking to merge PdbIO.

Great.
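
For illustration only, the gap-filling behaviour described above could be sketched roughly as follows. This is not the PdbIO code. The function name `sequence_with_gaps` and its input (a list of (residue number, one-letter code) pairs read from the coordinate records of one chain) are assumptions made up for this example, and it ignores insertion codes and non-standard residues.

```python
def sequence_with_gaps(residues, gap_char="X"):
    """Join (resseq, letter) pairs into a string, padding numbering gaps.

    Hypothetical helper: 'residues' must be sorted by residue number.
    """
    parts = []
    prev = None
    for resseq, letter in residues:
        # A jump in residue numbering means residues are missing from
        # the coordinates, so pad with one gap character per residue.
        if prev is not None and resseq > prev + 1:
            parts.append(gap_char * (resseq - prev - 1))
        parts.append(letter)
        prev = resseq
    return "".join(parts)


# Residues 3 and 4 are absent from the coordinates in this toy chain.
observed = [(1, "M"), (2, "K"), (5, "L"), (6, "V")]
print(sequence_with_gaps(observed))  # MKXXLV
```

A sequence built this way can only match the FASTA file offered by the PDB where the chain has no missing residues, since the real letters of the missing residues come from SEQRES rather than from the coordinates.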

>> I'm still uneasy about this making SeqIO depend on NumPy (even as
>> a soft dependency at runtime), given the fact that the rest of SeqIO
>> should work fine under Jython and PyPy. Support for the NumPy
>> API under PyPy is coming along, but isn't likely for Jython for now
>> (although PyPy's efforts may help there).
>
> As an alternative, I could copy the portions of PDBParser and
> StructureBuilder that are needed to read the amino acid sequence, but skip
> creating Atoms. That would avoid the need for NumPy, at the cost of some
> code duplication. Interested in that approach? If so, I can take a closer
> look and report back on the feasibility.

Rather than literally copying it, do you think it is realistic to make
some of Bio.PDB work without NumPy? e.g. fall back on tuples
of floats (x,y,z) for atom co-ordinates. Just brainstorming - this
might be a horrible idea?
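
To make the brainstorm concrete, a minimal sketch of such a fallback might look like the following. The helper names `make_coord` and `distance` are hypothetical and are not part of Bio.PDB. The sketch only assumes that a coordinate is either a NumPy array or a plain tuple of three floats.

```python
try:
    import numpy
except ImportError:
    # e.g. running under Jython, where NumPy is not available
    numpy = None


def make_coord(x, y, z):
    """Return a NumPy array if NumPy is installed, else a tuple of floats."""
    if numpy is not None:
        return numpy.array((x, y, z), dtype=float)
    return (float(x), float(y), float(z))


def distance(a, b):
    """Euclidean distance that works for either representation."""
    return sum((float(p) - float(q)) ** 2 for p, q in zip(a, b)) ** 0.5


print(distance(make_coord(0, 0, 0), make_coord(3, 4, 0)))  # 5.0
```

Anything that relies on array arithmetic (vector subtraction, rotations, superposition) would still need NumPy or a pure-Python equivalent, so this would at most cover the sequence-reading path.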

Peter

