[Biopython] Gene position is shifting one nucleotide after using BCBio's GFF parser
Bastian Greshake
bgreshake at googlemail.com
Sat Jan 28 00:55:22 UTC 2017
Hey there,
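that’s because GFF files use 1-based, inclusive coordinates, while FeatureLocation objects in Biopython use zero-based, end-exclusive coordinates (like Python slices). So only the start shifts by one; the end stays the same. See http://biopython.org/DIST/docs/api/Bio.SeqFeature.FeatureLocation-class.html, specifically: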
"Note that the start and end location numbering follow Python's scheme, thus a GenBank entry of 123..150 (one based counting) becomes a location of [122:150] (zero based counting)."
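As a rough sketch of the conversion, here it is in plain Python. The helper names gff_to_python and python_to_gff are made up for illustration; they are not part of Biopython or BCBio.

```python
def gff_to_python(start, end):
    """Convert 1-based inclusive GFF coordinates to a 0-based, end-exclusive span."""
    return start - 1, end


def python_to_gff(start, end):
    """Convert a 0-based, end-exclusive span back to 1-based inclusive coordinates."""
    return start + 1, end


# Coordinates of gene:Bra000001 as written in the GFF3 file
gff_start, gff_end = 8774511, 8777095

py_start, py_end = gff_to_python(gff_start, gff_end)
print(py_start, py_end)                 # 8774510 8777095, as in the parsed location
print(python_to_gff(py_start, py_end))  # (8774511, 8777095)

# The feature length is the same in both conventions
assert py_end - py_start == gff_end - gff_start + 1
```

With a parsed feature, the same idea should apply: int(gene_feature.location.start) + 1 gives back the start as written in the GFF file, and the end needs no adjustment.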
Hope that helps!
Cheers,
Bastian
—
www.ruleofthirds.de
> On 28 Jan 2017, at 01:42, Islam Amin <eng.islamamin at gmail.com> wrote:
>
> Dear All,
>
> I'm new to parsing GFF files. While trying to parse one, I found that the start position of the gene is 8774510 instead of 8774511 (as in the original file). Could anyone explain that for me?
>
>> grep "gene:Bra000001" Brassica_rapa.IVFCAASv1.34.chr.gff3
>
>> A03 brad gene 8774511 8777095 . + . ID=gene:Bra000001;biotype=protein_coding;description=AT2G37440 (E%3D7e-179) | endonuclease/exonuclease/phosphatase family protein ;gene_id=Bra000001;logic_name=glean
> A03 brad mRNA 8774511 8777095 . + . ID=transcript:Bra000001.1;Parent=gene:Bra000001;biotype=protein_coding;transcript_id=Bra000001.1
>
>
> after using the following script
> =============================
> from BCBio import GFF
> in_file = "chr.gff3"
> limits = dict(gff_type=["gene", "mRNA", "exon"])
> gff_handle = open(in_file)
> for rec in GFF.parse(gff_handle, target_lines=1000, limit_info=limits):
>     for gene_feature in rec.features:
>         if gene_feature.id == 'gene:Bra000001':
>             print(gene_feature.id, gene_feature.location)
> ============================
> The result below shows that the start position for the same gene is 8774510 instead of 8774511:
>> ('gene:Bra000001', FeatureLocation(ExactPosition(8774510), ExactPosition(8777095), strand=1))
> _______________________________________________
> Biopython mailing list - Biopython at mailman.open-bio.org
> http://mailman.open-bio.org/mailman/listinfo/biopython