[BioPython] Is there a limit on Fasta parser? (and a bug spotted on LCC function)
Sebastian Bassi
sbassi at asalup.org
Wed Mar 26 16:54:52 EST 2003
Hi,
When I extract a sequence using the Fasta parser, I get at most 999950
nucleotides (and sometimes only 999932).
I'm using Biopython 1.10 on Python 2.2.2 on Win2000.
Is this a known limit?
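One way to narrow this down is to count the residues without going through the parser at all and compare the totals. The following is a minimal sketch in plain Python 3, not Biopython code: the helper name count_fasta_residues is hypothetical, and it assumes a standard FASTA layout in which header lines start with ">".

```python
def count_fasta_residues(path):
    """Return a list of (header, residue_count) pairs for a FASTA file.

    Hypothetical helper for cross-checking a parser's output. It assumes
    header lines start with '>' and ignores blank lines and surrounding
    whitespace.
    """
    counts = []
    header = None
    total = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record begins: store the previous one, if any
                if header is not None:
                    counts.append((header, total))
                header = line[1:]
                total = 0
            elif header is not None:
                total += len(line)
    # Store the last record in the file
    if header is not None:
        counts.append((header, total))
    return counts
```

If the count from this helper is larger than what the parser returns for the same record, the parser (or how it is being called) is the likely source of the truncation; if both agree, the file itself is probably shorter than expected.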
Regarding the LCC function, I found a bug: I forgot to reset a list, so on
each function call the returned list was bigger than the previous one
(because it included the previous results). Here is the corrected code:
import math
from string import count  # Python 2 string-module function

def lcc_mult(seq,wsize,start,end):
    """Return a vector called lccsal, the LCC, a complexity measure
    from a sequence, called seq."""
    l2=math.log(2)
    tamseq=end-start
    global compone
    #print "compone"+str(len(compone))
    global lccsal
    #print "lccsal"+str(len(lccsal))
    # Reset the lists on every call so earlier results are not carried over
    compone=[0]
    lccsal=[0]
    # compone[n] holds (n/wsize)*log2(n/wsize) for n = 1..wsize
    for i in range(wsize):
        compone.append(((i+1)/float(wsize))*((math.log((i+1)/float(wsize)))/l2))
    # Base counts and LCC value for the first window
    window=seq[0:wsize]
    cant_a=count(window,'A')
    cant_c=count(window,'C')
    cant_t=count(window,'T')
    cant_g=count(window,'G')
    term_a=compone[cant_a]
    term_c=compone[cant_c]
    term_t=compone[cant_t]
    term_g=compone[cant_g]
    lccsal[0]=(-(term_a+term_c+term_t+term_g))
    tail=seq[0]
    # Slide the window one base at a time; only the base that leaves (tail)
    # and the base that enters (window[-1]) change the counts
    for x in range (tamseq-wsize):
        window=seq[x+1:wsize+x+1]
        if tail==window[-1]:
            lccsal.append(lccsal[-1])
            #break
        elif tail=='A':
            cant_a=cant_a-1
            if window[-1]=='C':
                cant_c=cant_c+1
                term_a=compone[cant_a]
                term_c=compone[cant_c]
                lccsal.append(-(term_a+term_c+term_t+term_g))
            elif window[-1]=='T':
                cant_t=cant_t+1
                term_a=compone[cant_a]
                term_t=compone[cant_t]
                lccsal.append(-(term_a+term_c+term_t+term_g))
            elif window[-1]=='G':
                cant_g=cant_g+1
                term_a=compone[cant_a]
                term_g=compone[cant_g]
                lccsal.append(-(term_a+term_c+term_t+term_g))
        elif tail=='C':
            cant_c=cant_c-1
            if window[-1]=='A':
                cant_a=cant_a+1
                term_a=compone[cant_a]
                term_c=compone[cant_c]
                lccsal.append(-(term_a+term_c+term_t+term_g))
            elif window[-1]=='T':
                cant_t=cant_t+1
                term_c=compone[cant_c]
                term_t=compone[cant_t]
                lccsal.append(-(term_a+term_c+term_t+term_g))
            elif window[-1]=='G':
                cant_g=cant_g+1
                term_c=compone[cant_c]
                term_g=compone[cant_g]
                lccsal.append(-(term_a+term_c+term_t+term_g))
        elif tail=='T':
            cant_t=cant_t-1
            if window[-1]=='A':
                cant_a=cant_a+1
                term_a=compone[cant_a]
                term_t=compone[cant_t]
                lccsal.append(-(term_a+term_c+term_t+term_g))
            elif window[-1]=='C':
                cant_c=cant_c+1
                term_c=compone[cant_c]
                term_t=compone[cant_t]
                lccsal.append(-(term_a+term_c+term_t+term_g))
            elif window[-1]=='G':
                cant_g=cant_g+1
                term_t=compone[cant_t]
                term_g=compone[cant_g]
                lccsal.append(-(term_a+term_c+term_t+term_g))
        elif tail=='G':
            cant_g=cant_g-1
            if window[-1]=='A':
                cant_a=cant_a+1
                term_a=compone[cant_a]
                term_g=compone[cant_g]
                lccsal.append(-(term_a+term_c+term_t+term_g))
            elif window[-1]=='C':
                cant_c=cant_c+1
                term_c=compone[cant_c]
                term_g=compone[cant_g]
                lccsal.append(-(term_a+term_c+term_t+term_g))
            elif window[-1]=='T':
                cant_t=cant_t+1
                term_t=compone[cant_t]
                term_g=compone[cant_g]
                lccsal.append(-(term_a+term_c+term_t+term_g))
        tail=window[0]
    return lccsal
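
To cross-check the incremental function above, the same quantity can be recomputed from scratch for every window. This is only a sketch for Python 3, not the Biopython implementation, and the name lcc_windows_direct is hypothetical. It assumes, as the code above does, that each window's value is minus the sum over A, C, G and T of (n/wsize)*log2(n/wsize), with a zero count contributing nothing. It always covers the whole sequence (no start/end arguments) and builds a local list, so repeated calls cannot accumulate earlier results.

```python
import math


def lcc_windows_direct(seq, wsize):
    """Return one complexity value per window of length wsize.

    Hypothetical reference helper: it recounts the bases of every window
    instead of updating the counts incrementally, so it is slower but
    easier to verify.
    """
    values = []  # local list, so nothing carries over between calls
    for i in range(len(seq) - wsize + 1):
        window = seq[i:i + wsize]
        total = 0.0
        for base in "ACGT":
            n = window.count(base)
            if n:  # a zero count contributes nothing to the sum
                p = n / float(wsize)
                total += p * math.log(p) / math.log(2)
        values.append(-total)
    return values
```

For example, lcc_windows_direct("ACGT", 4) returns a single value of about 2.0, the maximum for four equally frequent bases, while a window made of a single repeated base gives 0. Unlike the incremental version, this sketch appends a value for every window even when the sequence contains characters other than A, C, G and T.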
--
Best regards,
//=\ Sebastian Bassi - Diplomado en Ciencia y Tecnologia, UNQ //=\
\=// IT Manager Advanta Seeds - Balcarce Research Center - \=//
//=\ Pro secretario ASALUP - www.asalup.org - PGP key available //=\
\=// E-mail: sbassi at genesdigitales.com - ICQ UIN: 3356556 - \=//
Linux para todos: http://Linuxfacil.info