[Biopython] Question about using Gapped Alignment in Bio.CAPS.CAPSMap (Differential Cutsites)

RYU Esther (INTERN) Esther.RYU.intern at 3ds.com
Tue Aug 15 17:04:57 UTC 2017


Hi all,

Two questions -


* Can CAPSMap() take an alignment that has gaps? I see errors both when the alignment contains gap characters and when the gaps are removed, which leaves the sequences with different lengths.

* What role does upper/lower case play in alignments? The Biopython test alignment sequences are gapless and mixed case. I am wondering what I'm missing here, especially regarding gaps. A pointer to documentation would be great.

Background -

I am having some difficulty understanding and using the Bio.CAPS module. If I pass an alignment with gap characters to CAPSMap() and call ._digest_with() on it, I receive an invalid character error. When I remove the offending characters ('-' and '.'), I instead receive an error from AlignIO.read() saying that the sequences are not all the same length. I looked at the provided test file (test_CAPS.py) and noticed that the aligned sequences contain no gap characters and that they include lowercase letters. Does this module only work when the alignment has no gap characters and all sequences have the same length? And what role does case play in these aligned sequences?
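The invalid character report can be made concrete without Biopython. The following is only a standard-library sketch: the helper name unexpected_characters is hypothetical, and the allowed alphabet (the IUPAC ambiguous DNA letters) is an assumption about what the restriction search accepts. Case is folded before checking, which is a choice made for this sketch and not a statement about how Bio.CAPS treats lowercase letters.

```python
# Standard-library sketch; this is not Biopython code.
# Assumption: only IUPAC ambiguous DNA letters are acceptable characters.
IUPAC_DNA = set("ABCDGHKMNRSTVWY")


def unexpected_characters(row):
    """Return the sorted characters in `row` that are not IUPAC DNA letters.

    The row is upper-cased first, so lowercase bases are not reported.
    """
    return sorted(set(row.upper()) - IUPAC_DNA)


# First row of gap_chars.txt, which uses '.' as its gap character.
gapped_row = "TCTGCTGGTTACAACACTTTCTTCTTTCAATAAC.CACAATACTGCAGTACAATGG.GGA"

print(unexpected_characters(gapped_row))   # ['.']
print(unexpected_characters("acgtACGT"))   # []
print(unexpected_characters("AC-GT.N"))    # ['-', '.']
```

Running a check like this on each row shows which characters would need special handling before a digest is attempted.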

Here is the script I ran:

from Bio import AlignIO
from Bio.CAPS import CAPSMap
from Bio.Restriction import EcoRI


def main():
    # Read the aligned sequences (FASTA format).
    alignment = AlignIO.read("gap_chars.txt", "fasta")
    caps_map = CAPSMap(alignment)
    enzymes = [EcoRI]
    for enzyme in enzymes:
        # _digest_with() is a private method; this call raises the
        # invalid character error when the alignment contains gaps.
        caps_map._digest_with(enzyme)


main()

Here are the two data files I used:

Data file 1: "gap_chars.txt"

>gi|2695850|emb|Y13260.1|ABY13260/1-573
TCTGCTGGTTACAACACTTTCTTCTTTCAATAAC.CACAATACTGCAGTACAATGG.GGA
>gi|2695854|emb|Y13264.1|ABY13264/1-547
.................TTTCTTCTTTCAATAAC.CACAATACTGCAGTACAATGG.GGA

Data file 2: "no_gaps.txt"

>gi|2695850|emb|Y13260.1|ABY13260 Acipenser baeri mRNA for immunoglobulin heavy chain, clone ScH 16.1
TCTGCTGGTTACAACACTTTCTTCTTTCAATAACCACAATACTGCAGTACAATGGGGA
>gi|2695854|emb|Y13264.1|ABY13264 Acipenser baeri mRNA for immunoglobulin heavy chain, clone ScH 113
TTTCTTCTTTCAATAACCACAATACTGCAGTACAATGGGGA
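The relationship between the two data files can be checked directly: stripping the gap characters from the rows of data file 1 yields the sequences in data file 2, and the rows then no longer have equal length. The sketch below uses only the standard library; the helper name strip_gaps and the GAP_CHARS constant are hypothetical names chosen for illustration.

```python
# Standard-library sketch; this is not Biopython code.
GAP_CHARS = "-."


def strip_gaps(row):
    """Return `row` with every gap character removed."""
    return "".join(ch for ch in row if ch not in GAP_CHARS)


# Rows of gap_chars.txt; the second row starts with 17 gap columns.
aligned_rows = [
    "TCTGCTGGTTACAACACTTTCTTCTTTCAATAAC.CACAATACTGCAGTACAATGG.GGA",
    "." * 17 + "TTTCTTCTTTCAATAAC.CACAATACTGCAGTACAATGG.GGA",
]

print([len(row) for row in aligned_rows])              # [60, 60]
print([len(strip_gaps(row)) for row in aligned_rows])  # [58, 41]
```

The aligned rows are both 60 columns wide, while the ungapped sequences are 58 and 41 bases long, which is consistent with the length error reported for no_gaps.txt.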

If someone could help me better understand how to use this module, I'd greatly appreciate it!

Regards,

Esther

