[Biopython] retrieving paired sequences from BAM
Mic
mictadlo at gmail.com
Mon Feb 20 06:37:59 UTC 2012
Hello,
I am trying to retrieve paired end sequences from BAM file. However, I am
only able to retrieve the A sequence with the following code:
''''''''''''''''''''''''''''''''''''''''
import pysam
samfile = pysam.Samfile("b.sorted.bam", "rb")
for read in samfile.fetch():
if read.is_paired:
print read.qname
print read.seq
samfile.close()
$ python a.py | head -n 2
HWI-ST226_0154:4:1206:12773:170407#CTTGTA
AGTATAAAACTAAGCAAACTGTTAGAACTTTGATTACGTTTTGTTTATCAGTGATACGCAAAAGTTTAAGATCCTTGAGTACCTCTTTCGATGGCGGATT
''''''''''''''''''''''''''''''''''''''''
How is it possible to retrieve also the B sequence?
What is the difference between is_proper_pair and is_paired?
-------------
$ grep -A 1 'HWI-ST226_0154:4:1206:12773:170407#CTTGTA' X_A.fastq
@HWI-ST226_0154:4:1206:12773:170407#CTTGTA/1
AGTATAAAACTAAGCAAACTGTTAGAACTTTGATTACGTTTTGTTTATCAGTGATACGCAAAAGTTTAAGATCCTTGAGTACCTCTTTCGATGGCGGATT
$ grep -A 1 'HWI-ST226_0154:4:1206:12773:170407#CTTGTA' X_B.fastq
@HWI-ST226_0154:4:1206:12773:170407#CTTGTA/2
GGCGCTGTGACTGAAGTCCTCAGATTTCGGTACGGTTTTGTCTATTTCTGGGTTCCTGCGGAAACACCTTCTCGATTATTTTCTAATCTCAATTAGGTTT
-------------
Thank you in advance.
Cheers
More information about the Biopython
mailing list