[BioPython] Random sequence
pan at uchicago.edu
Thu Jun 17 00:23:41 EDT 2004
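Here is also a two-line function that generates a random coding sequence (no in-frame stop codons):
>>> import random
>>> def codingSeq(size=30, stopCodons=('TAG', 'TAA', 'TGA'), sep=''):
...     codons = [x+y+z for x in 'AGTC' for y in 'AGTC' for z in 'AGTC'
...               if x+y+z not in stopCodons]
...     # size // 3 codons; integer division keeps range() happy
...     return sep.join([random.choice(codons) for x in range(size // 3)])
...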
>>> codingSeq()
'AATGTTTCACTAGGTGACGTGTCGTGGCTA'
>>> codingSeq(sep=' ')
'GGT GCT AAG TTC CGA TCG AAC AGA AAC TGT'
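
For anyone who wants this as a module rather than an interactive session, a minimal sketch could look like the following. The names coding_seq, has_in_frame_stop and the rng parameter are made up for illustration and are not part of Biopython. Note that only the reading frame starting at position 0 is guaranteed to be free of stop codons; a stop can still appear across a codon boundary, as the 'CTA GGT' stretch in the first output above shows.

```python
import itertools
import random

STOP_CODONS = ("TAG", "TAA", "TGA")


def coding_seq(size=30, stop_codons=STOP_CODONS, sep="", rng=random):
    """Return size // 3 random codons joined by sep, none of them a stop codon."""
    # Build all 64 codons, then drop the stop codons.
    codons = ["".join(c) for c in itertools.product("AGTC", repeat=3)
              if "".join(c) not in stop_codons]
    return sep.join(rng.choice(codons) for _ in range(size // 3))


def has_in_frame_stop(seq, stop_codons=STOP_CODONS):
    """Check frame 0 only; stops that straddle two codons are not counted."""
    return any(seq[i:i + 3] in stop_codons
               for i in range(0, len(seq) - 2, 3))
```

Passing rng=random.Random(some_seed) should make the output reproducible, which helps when the sequences feed into tests.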
Quoting pan at uchicago.edu:
> You can make a random seq with one line of python code:
>
> >>> import random
>
> >>> ''.join([random.choice('AGTC') for x in range(10)])
> 'GGTTTCGGTA'
>
> >>> ''.join([random.choice('AGTC') for x in range(10)])
> 'GCGGGTCCGT'
>
> >>> ''.join([random.choice('AGTC') for x in range(10)])
> 'AAAAGCACTG'
>
> Isn't it beautiful?
>
> pan
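
If uniform base frequencies are not what you want, the one-liner extends naturally. The sketch below assumes Python 3.6 or later, where random.choices accepts weights and k; the function name random_dna and the gc parameter are hypothetical names chosen for this example.

```python
import random


def random_dna(length, gc=0.5, rng=random):
    """Random DNA string whose expected G+C fraction is gc (between 0 and 1)."""
    at = (1.0 - gc) / 2  # weight for each of A and T
    cg = gc / 2          # weight for each of G and C
    return "".join(rng.choices("ATGC", weights=[at, at, cg, cg], k=length))
```

With gc=0.5 this should behave like the random.choice version above; other values only shift the expected composition, so short sequences can still deviate a lot from the target.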
>
> Quoting ashleigh smythe <absmythe at ucdavis.edu>:
>
> > On Wed, 2004-06-16 at 07:45, Sebastian Bassi wrote:
> > > Is there a way to generate a random DNA sequence with biopython?
> > > If not, I could submit a function to do it, but before doing it, I'd
> > > want to see if it's not already done.
> >
> > Hi Sebastian. I wasn't able to find a random sequence generator in the
> > biopython modules so I wrote a simple little one of my own a few months
> > ago- it only uses biopython modules to add the sequence to a
> > biopython-parsed file. It is quite ugly and brute force as I'm a
> > beginner - I'd be curious to see what you come up with. In case you are
> > curious, here it is:
> >
> > #This is designed to generate random DNA sequence data and add
> > #it to the end of a biopython-parsed sequence record
> > #in fasta format.
> > #Modified 2-20 to just make random seq. data for a taxon,
> > #rather than adding it onto the existing sequence.
> >
> > import random
> > import string
> >
> > def generate(n):  #generate a dna sequence of length n
> >     bases=['A', 'T', 'G', 'C']
> >     dna_in_list=[]
> >
> >     while n > 0:
> >         abase=random.choice(bases)
> >         dna_in_list.append(abase)
> >         n=n-1
> >
> >     dnastring=str(dna_in_list)  #format the list into a string.
> >     #Take out unwanted characters: whitespace, commas, brackets, quotes
> >     better_dnastring="".join(dnastring.split())
> >     better2_dnastring=better_dnastring.strip()
> >     better3_dnastring=better2_dnastring.replace(',','')
> >     better4_dnastring=better3_dnastring.replace(']','')
> >     better5_dnastring=better4_dnastring.replace('[','')
> >     better6_dnastring=better5_dnastring.replace("'",'')
> >
> >     return better6_dnastring
> >
> >
> > def add_seq(file_to_add_to, n):  #this is how to start the program: seqgen.add_seq(file, n).
> >     import sys
> >     from Bio import Fasta
> >     parser=Fasta.RecordParser()
> >     afile=open(file_to_add_to, 'r')
> >     iterator=Fasta.Iterator(afile, parser)
> >
> >     out_file=open('randomadded.nex', 'w')
> >
> >     while 1:  #loop through each record and add the new sequence
> >         seq_to_add=generate(n)
> >         cur_record=iterator.next()
> >         if cur_record is None:
> >             break
> >         title_and_seq=string.split(cur_record.title)
> >         title='>' + title_and_seq[0] + '\n'
> >         new_record=title + 'N' + seq_to_add
> >         out_file.write(new_record)
> >         out_file.write('\n')
> >
> >     out_file.close()  #make sure the output is flushed to disk
> >     afile.close()
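
The same loop can be sketched without any Biopython dependency, which may be handy if the Fasta parser is not available. This is only an illustration under a few assumptions: the function name write_random_records is invented here, the input is plain FASTA with '>' title lines, and the leading 'N' that the script above prepends to each sequence is left out.

```python
import io
import random


def write_random_records(fasta_in, fasta_out, n, rng=random):
    """For every '>' title line in fasta_in, write the first word of the
    title and a fresh random sequence of n bases to fasta_out.
    Returns the number of records written."""
    count = 0
    for line in fasta_in:
        if not line.startswith(">"):
            continue  # skip the original sequence lines
        fields = line[1:].split()
        name = fields[0] if fields else "unnamed"
        seq = "".join(rng.choice("ATGC") for _ in range(n))
        fasta_out.write(">%s\n%s\n" % (name, seq))
        count += 1
    return count


# Example with in-memory files; real code would pass open file objects.
source = io.StringIO(">taxon1 some description\nACGT\n>taxon2\nGGCC\n")
target = io.StringIO()
write_random_records(source, target, 12)
```

Taking file objects rather than file names keeps the function easy to try out with io.StringIO, and leaves opening and closing the files to the caller.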
> >
> >
> > Ashleigh
> >
> > _______________________________________________
> > BioPython mailing list - BioPython at biopython.org
> > http://biopython.org/mailman/listinfo/biopython
> >