[Biopython] random peptide sequences
ferreirafm at usp.br
Thu Apr 5 00:01:42 UTC 2012
Hi Peter,
It seems I'm almost there, but I can't write the records to a file using
SeqIO.write as usual.
Fred
code:
def random_seq(fastafile):
    records = [ ]
    query = SeqIO.read(fastafile, "fasta")
    peplist = str(query.seq).split('GPGPG')
    peptup = tuple(str(query.seq).split('GPGPG'))
    for pep in peptup:
        outf = open("test.fasta", "w")
        peplist.remove(pep)
        for k in range(10):
            random.shuffle(peplist, random.random)
            peplist.insert(0, pep)
            rec = SeqRecord('GPGPG'.join(peplist), id="pep%s" % k)
            records.append(rec)
            print 'id: %s\nSeq: %s\n' % (rec.id, rec.seq)
            peplist.remove(pep)
        print records
        SeqIO.write(records, outf, "fasta")
        outf.close()
        sys.exit(1)
output:
$ random_pep.py --run br18.fasta
id: pep0
Seq:
EELRSLYNTVATLYCVHGPGPGRDLLLIVTRIVELLGRGPGPGKRWIILGLNKIVRMYSPTSIGPGPGNTSYRLISCNTSVIGPGPGGKIILVAVHVASGYIGPGPGALFYKLDVVPIDGPGPGQRPLVTIKIGGQLKEGPGPGQQLLFIHFRIGCRHSRIGGPGPGELLKTVRLIKFLYQSNPGPGPGTPVNIIGRNLLTQIGGPGPGSPEVIPMFSALSEGPGPGSELYLYKVVKIEPLGVAPGPGPGSLQYLALVALVAPKKGPGPGVLAIVALVVATIIAIGPGPGTMLLGMLMICSAAGPGPGVLEWRFDSRLAFHHVGPGPGDKELYPLASLRSLFGGPGPGEAIIRILQQLLFIHF
id: pep1
Seq:
EELRSLYNTVATLYCVHGPGPGRDLLLIVTRIVELLGRGPGPGALFYKLDVVPIDGPGPGSELYLYKVVKIEPLGVAPGPGPGKRWIILGLNKIVRMYSPTSIGPGPGVLAIVALVVATIIAIGPGPGQRPLVTIKIGGQLKEGPGPGELLKTVRLIKFLYQSNPGPGPGSLQYLALVALVAPKKGPGPGEAIIRILQQLLFIHFGPGPGVLEWRFDSRLAFHHVGPGPGTMLLGMLMICSAAGPGPGQQLLFIHFRIGCRHSRIGGPGPGNTSYRLISCNTSVIGPGPGSPEVIPMFSALSEGPGPGDKELYPLASLRSLFGGPGPGGKIILVAVHVASGYIGPGPGTPVNIIGRNLLTQIG
id: pep2
Seq:
EELRSLYNTVATLYCVHGPGPGVLAIVALVVATIIAIGPGPGSLQYLALVALVAPKKGPGPGQRPLVTIKIGGQLKEGPGPGDKELYPLASLRSLFGGPGPGSPEVIPMFSALSEGPGPGALFYKLDVVPIDGPGPGEAIIRILQQLLFIHFGPGPGKRWIILGLNKIVRMYSPTSIGPGPGGKIILVAVHVASGYIGPGPGELLKTVRLIKFLYQSNPGPGPGSELYLYKVVKIEPLGVAPGPGPGTPVNIIGRNLLTQIGGPGPGTMLLGMLMICSAAGPGPGRDLLLIVTRIVELLGRGPGPGVLEWRFDSRLAFHHVGPGPGQQLLFIHFRIGCRHSRIGGPGPGNTSYRLISCNTSVI
id: pep3
Seq:
EELRSLYNTVATLYCVHGPGPGGKIILVAVHVASGYIGPGPGTPVNIIGRNLLTQIGGPGPGTMLLGMLMICSAAGPGPGEAIIRILQQLLFIHFGPGPGQQLLFIHFRIGCRHSRIGGPGPGKRWIILGLNKIVRMYSPTSIGPGPGSELYLYKVVKIEPLGVAPGPGPGSLQYLALVALVAPKKGPGPGQRPLVTIKIGGQLKEGPGPGRDLLLIVTRIVELLGRGPGPGELLKTVRLIKFLYQSNPGPGPGSPEVIPMFSALSEGPGPGALFYKLDVVPIDGPGPGVLAIVALVVATIIAIGPGPGVLEWRFDSRLAFHHVGPGPGDKELYPLASLRSLFGGPGPGNTSYRLISCNTSVI
id: pep4
Seq:
EELRSLYNTVATLYCVHGPGPGVLAIVALVVATIIAIGPGPGVLEWRFDSRLAFHHVGPGPGKRWIILGLNKIVRMYSPTSIGPGPGQRPLVTIKIGGQLKEGPGPGELLKTVRLIKFLYQSNPGPGPGSELYLYKVVKIEPLGVAPGPGPGALFYKLDVVPIDGPGPGSPEVIPMFSALSEGPGPGNTSYRLISCNTSVIGPGPGSLQYLALVALVAPKKGPGPGTPVNIIGRNLLTQIGGPGPGEAIIRILQQLLFIHFGPGPGQQLLFIHFRIGCRHSRIGGPGPGTMLLGMLMICSAAGPGPGRDLLLIVTRIVELLGRGPGPGDKELYPLASLRSLFGGPGPGGKIILVAVHVASGYI
id: pep5
Seq:
EELRSLYNTVATLYCVHGPGPGSPEVIPMFSALSEGPGPGEAIIRILQQLLFIHFGPGPGTMLLGMLMICSAAGPGPGQQLLFIHFRIGCRHSRIGGPGPGELLKTVRLIKFLYQSNPGPGPGVLEWRFDSRLAFHHVGPGPGQRPLVTIKIGGQLKEGPGPGKRWIILGLNKIVRMYSPTSIGPGPGGKIILVAVHVASGYIGPGPGDKELYPLASLRSLFGGPGPGSLQYLALVALVAPKKGPGPGNTSYRLISCNTSVIGPGPGTPVNIIGRNLLTQIGGPGPGVLAIVALVVATIIAIGPGPGSELYLYKVVKIEPLGVAPGPGPGALFYKLDVVPIDGPGPGRDLLLIVTRIVELLGR
id: pep6
Seq:
EELRSLYNTVATLYCVHGPGPGTPVNIIGRNLLTQIGGPGPGSPEVIPMFSALSEGPGPGVLEWRFDSRLAFHHVGPGPGTMLLGMLMICSAAGPGPGALFYKLDVVPIDGPGPGKRWIILGLNKIVRMYSPTSIGPGPGDKELYPLASLRSLFGGPGPGSELYLYKVVKIEPLGVAPGPGPGRDLLLIVTRIVELLGRGPGPGELLKTVRLIKFLYQSNPGPGPGNTSYRLISCNTSVIGPGPGVLAIVALVVATIIAIGPGPGEAIIRILQQLLFIHFGPGPGGKIILVAVHVASGYIGPGPGQQLLFIHFRIGCRHSRIGGPGPGQRPLVTIKIGGQLKEGPGPGSLQYLALVALVAPKK
id: pep7
Seq:
EELRSLYNTVATLYCVHGPGPGTPVNIIGRNLLTQIGGPGPGTMLLGMLMICSAAGPGPGSLQYLALVALVAPKKGPGPGEAIIRILQQLLFIHFGPGPGSELYLYKVVKIEPLGVAPGPGPGVLAIVALVVATIIAIGPGPGELLKTVRLIKFLYQSNPGPGPGVLEWRFDSRLAFHHVGPGPGQQLLFIHFRIGCRHSRIGGPGPGSPEVIPMFSALSEGPGPGKRWIILGLNKIVRMYSPTSIGPGPGNTSYRLISCNTSVIGPGPGRDLLLIVTRIVELLGRGPGPGALFYKLDVVPIDGPGPGQRPLVTIKIGGQLKEGPGPGDKELYPLASLRSLFGGPGPGGKIILVAVHVASGYI
id: pep8
Seq:
EELRSLYNTVATLYCVHGPGPGTPVNIIGRNLLTQIGGPGPGRDLLLIVTRIVELLGRGPGPGNTSYRLISCNTSVIGPGPGELLKTVRLIKFLYQSNPGPGPGSPEVIPMFSALSEGPGPGVLEWRFDSRLAFHHVGPGPGEAIIRILQQLLFIHFGPGPGDKELYPLASLRSLFGGPGPGALFYKLDVVPIDGPGPGQQLLFIHFRIGCRHSRIGGPGPGQRPLVTIKIGGQLKEGPGPGSLQYLALVALVAPKKGPGPGGKIILVAVHVASGYIGPGPGTMLLGMLMICSAAGPGPGSELYLYKVVKIEPLGVAPGPGPGVLAIVALVVATIIAIGPGPGKRWIILGLNKIVRMYSPTSI
id: pep9
Seq:
EELRSLYNTVATLYCVHGPGPGVLEWRFDSRLAFHHVGPGPGSPEVIPMFSALSEGPGPGVLAIVALVVATIIAIGPGPGTPVNIIGRNLLTQIGGPGPGDKELYPLASLRSLFGGPGPGSELYLYKVVKIEPLGVAPGPGPGNTSYRLISCNTSVIGPGPGTMLLGMLMICSAAGPGPGGKIILVAVHVASGYIGPGPGALFYKLDVVPIDGPGPGQRPLVTIKIGGQLKEGPGPGKRWIILGLNKIVRMYSPTSIGPGPGRDLLLIVTRIVELLGRGPGPGEAIIRILQQLLFIHFGPGPGELLKTVRLIKFLYQSNPGPGPGSLQYLALVALVAPKKGPGPGQQLLFIHFRIGCRHSRIG
[SeqRecord(seq='EELRSLYNTVATLYCVHGPGPGRDLLLIVTRIVELLGRGPGPGKRWIILGLNKIVRMYSPTSIGPGPGNTSYRLISCNTSVIGPGPGGKIILVAVHVASGYIGPGPGALFYKLDVVPIDGPGPGQRPLVTIKIGGQLKEGPGPGQQLLFIHFRIGCRHSRIGGPGPGELLKTVRLIKFLYQSNPGPGPGTPVNIIGRNLLTQIGGPGPGSPEVIPMFSALSEGPGPGSELYLYKVVKIEPLGVAPGPGPGSLQYLALVALVAPKKGPGPGVLAIVALVVATIIAIGPGPGTMLLGMLMICSAAGPGPGVLEWRFDSRLAFHHVGPGPGDKELYPLASLRSLFGGPGPGEAIIRILQQLLFIHF', id='pep0', name='<unknown name>', description='<unknown description>', dbxrefs=[]), SeqRecord(seq='EELRSLYNTVATLYCVHGPGPGRDLLLIVTRIVELLGRGPGPGALFYKLDVVPIDGPGPGSELYLYKVVKIEPLGVAPGPGPGKRWIILGLNKIVRMYSPTSIGPGPGVLAIVALVVATIIAIGPGPGQRPLVTIKIGGQLKEGPGPGELLKTVRLIKFLYQSNPGPGPGSLQYLALVALVAPKKGPGPGEAIIRILQQLLFIHFGPGPGVLEWRFDSRLAFHHVGPGPGTMLLGMLMICSAAGPGPGQQLLFIHFRIGCRHSRIGGPGPGNTSYRLISCNTSVIGPGPGSPEVIPMFSALSEGPGPGDKELYPLASLRSLFGGPGPGGKIILVAVHVASGYIGPGPGTPVNIIGRNLLTQIG', id='pep1', name='<unknown name>', description='<unknown description>', dbxrefs=[]), SeqRecord(seq='EELRSLYNTVATLYCVHGPGPGVLAIVALVVATIIAIGPGPGSLQYLALVALVAPKKGPGPGQRPLVTIKIGGQLKEGPGPGDKELYPLASLRSLFGGPGPGSPEVIPMFSALSEGPGPGALFYKLDVVPIDGPGPGEAIIRILQQLLFIHFGPGPGKRWIILGLNKIVRMYSPTSIGPGPGGKIILVAVHVASGYIGPGPGELLKTVRLIKFLYQSNPGPGPGSELYLYKVVKIEPLGVAPGPGPGTPVNIIGRNLLTQIGGPGPGTMLLGMLMICSAAGPGPGRDLLLIVTRIVELLGRGPGPGVLEWRFDSRLAFHHVGPGPGQQLLFIHFRIGCRHSRIGGPGPGNTSYRLISCNTSVI', id='pep2', name='<unknown name>', description='<unknown description>', dbxrefs=[]), SeqRecord(seq='EELRSLYNTVATLYCVHGPGPGGKIILVAVHVASGYIGPGPGTPVNIIGRNLLTQIGGPGPGTMLLGMLMICSAAGPGPGEAIIRILQQLLFIHFGPGPGQQLLFIHFRIGCRHSRIGGPGPGKRWIILGLNKIVRMYSPTSIGPGPGSELYLYKVVKIEPLGVAPGPGPGSLQYLALVALVAPKKGPGPGQRPLVTIKIGGQLKEGPGPGRDLLLIVTRIVELLGRGPGPGELLKTVRLIKFLYQSNPGPGPGSPEVIPMFSALSEGPGPGALFYKLDVVPIDGPGPGVLAIVALVVATIIAIGPGPGVLEWRFDSRLAFHHVGPGPGDKELYPLASLRSLFGGPGPGNTSYRLISCNTSVI', id='pep3', name='<unknown name>', description='<unknown description>', dbxrefs=[]), 
SeqRecord(seq='EELRSLYNTVATLYCVHGPGPGVLAIVALVVATIIAIGPGPGVLEWRFDSRLAFHHVGPGPGKRWIILGLNKIVRMYSPTSIGPGPGQRPLVTIKIGGQLKEGPGPGELLKTVRLIKFLYQSNPGPGPGSELYLYKVVKIEPLGVAPGPGPGALFYKLDVVPIDGPGPGSPEVIPMFSALSEGPGPGNTSYRLISCNTSVIGPGPGSLQYLALVALVAPKKGPGPGTPVNIIGRNLLTQIGGPGPGEAIIRILQQLLFIHFGPGPGQQLLFIHFRIGCRHSRIGGPGPGTMLLGMLMICSAAGPGPGRDLLLIVTRIVELLGRGPGPGDKELYPLASLRSLFGGPGPGGKIILVAVHVASGYI', id='pep4', name='<unknown name>', description='<unknown description>', dbxrefs=[]), SeqRecord(seq='EELRSLYNTVATLYCVHGPGPGSPEVIPMFSALSEGPGPGEAIIRILQQLLFIHFGPGPGTMLLGMLMICSAAGPGPGQQLLFIHFRIGCRHSRIGGPGPGELLKTVRLIKFLYQSNPGPGPGVLEWRFDSRLAFHHVGPGPGQRPLVTIKIGGQLKEGPGPGKRWIILGLNKIVRMYSPTSIGPGPGGKIILVAVHVASGYIGPGPGDKELYPLASLRSLFGGPGPGSLQYLALVALVAPKKGPGPGNTSYRLISCNTSVIGPGPGTPVNIIGRNLLTQIGGPGPGVLAIVALVVATIIAIGPGPGSELYLYKVVKIEPLGVAPGPGPGALFYKLDVVPIDGPGPGRDLLLIVTRIVELLGR', id='pep5', name='<unknown name>', description='<unknown description>', dbxrefs=[]), SeqRecord(seq='EELRSLYNTVATLYCVHGPGPGTPVNIIGRNLLTQIGGPGPGSPEVIPMFSALSEGPGPGVLEWRFDSRLAFHHVGPGPGTMLLGMLMICSAAGPGPGALFYKLDVVPIDGPGPGKRWIILGLNKIVRMYSPTSIGPGPGDKELYPLASLRSLFGGPGPGSELYLYKVVKIEPLGVAPGPGPGRDLLLIVTRIVELLGRGPGPGELLKTVRLIKFLYQSNPGPGPGNTSYRLISCNTSVIGPGPGVLAIVALVVATIIAIGPGPGEAIIRILQQLLFIHFGPGPGGKIILVAVHVASGYIGPGPGQQLLFIHFRIGCRHSRIGGPGPGQRPLVTIKIGGQLKEGPGPGSLQYLALVALVAPKK', id='pep6', name='<unknown name>', description='<unknown description>', dbxrefs=[]), SeqRecord(seq='EELRSLYNTVATLYCVHGPGPGTPVNIIGRNLLTQIGGPGPGTMLLGMLMICSAAGPGPGSLQYLALVALVAPKKGPGPGEAIIRILQQLLFIHFGPGPGSELYLYKVVKIEPLGVAPGPGPGVLAIVALVVATIIAIGPGPGELLKTVRLIKFLYQSNPGPGPGVLEWRFDSRLAFHHVGPGPGQQLLFIHFRIGCRHSRIGGPGPGSPEVIPMFSALSEGPGPGKRWIILGLNKIVRMYSPTSIGPGPGNTSYRLISCNTSVIGPGPGRDLLLIVTRIVELLGRGPGPGALFYKLDVVPIDGPGPGQRPLVTIKIGGQLKEGPGPGDKELYPLASLRSLFGGPGPGGKIILVAVHVASGYI', id='pep7', name='<unknown name>', description='<unknown description>', dbxrefs=[]), 
SeqRecord(seq='EELRSLYNTVATLYCVHGPGPGTPVNIIGRNLLTQIGGPGPGRDLLLIVTRIVELLGRGPGPGNTSYRLISCNTSVIGPGPGELLKTVRLIKFLYQSNPGPGPGSPEVIPMFSALSEGPGPGVLEWRFDSRLAFHHVGPGPGEAIIRILQQLLFIHFGPGPGDKELYPLASLRSLFGGPGPGALFYKLDVVPIDGPGPGQQLLFIHFRIGCRHSRIGGPGPGQRPLVTIKIGGQLKEGPGPGSLQYLALVALVAPKKGPGPGGKIILVAVHVASGYIGPGPGTMLLGMLMICSAAGPGPGSELYLYKVVKIEPLGVAPGPGPGVLAIVALVVATIIAIGPGPGKRWIILGLNKIVRMYSPTSI', id='pep8', name='<unknown name>', description='<unknown description>', dbxrefs=[]), SeqRecord(seq='EELRSLYNTVATLYCVHGPGPGVLEWRFDSRLAFHHVGPGPGSPEVIPMFSALSEGPGPGVLAIVALVVATIIAIGPGPGTPVNIIGRNLLTQIGGPGPGDKELYPLASLRSLFGGPGPGSELYLYKVVKIEPLGVAPGPGPGNTSYRLISCNTSVIGPGPGTMLLGMLMICSAAGPGPGGKIILVAVHVASGYIGPGPGALFYKLDVVPIDGPGPGQRPLVTIKIGGQLKEGPGPGKRWIILGLNKIVRMYSPTSIGPGPGRDLLLIVTRIVELLGRGPGPGEAIIRILQQLLFIHFGPGPGELLKTVRLIKFLYQSNPGPGPGSLQYLALVALVAPKKGPGPGQQLLFIHFRIGCRHSRIG', id='pep9', name='<unknown name>', description='<unknown description>',
dbxrefs=[])]
Traceback (most recent call last):
  File "/home/ferreirafm/bin/random_pep.py", line 173, in <module>
    main()
  File "/home/ferreirafm/bin/random_pep.py", line 156, in main
    random_seq(fastafile)
  File "/home/ferreirafm/bin/random_pep.py", line 39, in random_seq
    SeqIO.write(records, outf, "fasta")
  File "/usr/lib64/python2.7/site-packages/Bio/SeqIO/__init__.py", line 412, in write
    count = writer_class(handle).write_file(sequences)
  File "/usr/lib64/python2.7/site-packages/Bio/SeqIO/Interfaces.py", line 271, in write_file
    count = self.write_records(records)
  File "/usr/lib64/python2.7/site-packages/Bio/SeqIO/Interfaces.py", line 256, in write_records
    self.write_record(record)
  File "/usr/lib64/python2.7/site-packages/Bio/SeqIO/FastaIO.py", line 136, in write_record
    data = self._get_seq_string(record) #Catches sequence being None
  File "/usr/lib64/python2.7/site-packages/Bio/SeqIO/Interfaces.py", line 164, in _get_seq_string
    % record.id)
TypeError: SeqRecord (id=pep0) has an invalid sequence.
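
A likely cause of the "invalid sequence" error: the SeqRecord is built from a plain Python string ('GPGPG'.join(peplist)), while the FASTA writer expects record.seq to be a Seq object. The printed records above show seq='...' as a bare string, which is consistent with that reading. Below is a minimal sketch of one possible fix, written for Python 3 and a current Biopython rather than the Python 2.7 setup in the traceback. The helper names shuffled_records and to_fasta are made up for illustration, and it keeps only the first peptide fixed, as the posted run does before sys.exit. Note also that the second argument to random.shuffle used in the posted code was removed in Python 3.11, so the sketch calls it with the list only.

```python
import random
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord


def shuffled_records(sequence, linker="GPGPG", n=10):
    """Return n records with the first peptide fixed and the rest shuffled.

    Hypothetical helper: the name and parameters are assumptions,
    not part of the posted script.
    """
    peptides = sequence.split(linker)
    first, rest = peptides[0], peptides[1:]
    records = []
    for k in range(n):
        random.shuffle(rest)  # shuffles the remaining peptides in place
        shuffled = linker.join([first] + rest)
        # Wrap the string in Seq so the FASTA writer accepts the record.
        records.append(SeqRecord(Seq(shuffled), id="pep%s" % k, description=""))
    return records


def to_fasta(records):
    """Write the records to an in-memory handle and return the FASTA text."""
    handle = StringIO()
    SeqIO.write(records, handle, "fasta")
    return handle.getvalue()


if __name__ == "__main__":
    # The first three peptides from the output above, joined by the linker.
    example = "GPGPG".join(
        ["EELRSLYNTVATLYCVH", "RDLLLIVTRIVELLGR", "KRWIILGLNKIVRMYSPTSI"]
    )
    print(to_fasta(shuffled_records(example, n=3)))
```

To write to disk instead, the same records list can be passed to SeqIO.write together with a filename or an open text handle. Opening the output file once, outside the loop over peptides, also avoids truncating test.fasta on every iteration.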
Quoting Peter Cock <p.j.a.cock at googlemail.com>:
> On Wed, Apr 4, 2012 at 8:56 PM, <ferreirafm at usp.br> wrote:
>>
>> Hi Peter,
>> Thanks for helping. I'll try something like that and let you know the
>> results.
>> Fred
>
> Good luck - and please reply on the list to let us know how you get on :)
>
> Peter
>