[Bioperl-l] fasta2gff_landmark ...

Dan Bolser dan.bolser at gmail.com
Wed May 13 17:01:27 UTC 2009


Hi,

I am trying to create a script to write one 'GFF landmark' line for
each entry in a fasta file. I'm doing this so that the landmarks show
up in GBrowse (along with my attached similarity 'features').

In a nutshell, I was told to add lines like:

C08HBa0216M19.1    .    contig    1    <length>    .    .    .
ID=C08HBa0216M19.1;Name=C08HBa0216M19.1


to the GFF so that GBrowse would pick up the features which have
C08HBa0216M19.1 as their 'landmark'. This seems to be working great
for the sequences where I added this line manually, but now I want to
write a script to automatically generate these GFF landmarks from a
fasta file.


So far I have a script that looks like this:

<source lang="perl">
#!/usr/bin/perl -w

use strict;

use Bio::SeqIO;
use Bio::SeqFeature::Generic;
use Bio::FeatureIO;

[snip]

my $gffIO =
    new Bio::FeatureIO( -format => 'GFF',
			-version => 3
			);

## Now read in the fasta file to create the landmarks

my $seqIO =
  new Bio::SeqIO( -file => $finished_bacs_repeatmasked,
		  -format => 'fasta'
		);

## Begwin

while(my $seq = $seqIO->next_seq){
  ## Create a new generic feature to hold the properties of the
  ## sequence;

  my $name = $seq->id;
  my $leng = $seq->length;

  my $accn;

  next unless $accn = $finished{$name};

  my $f =
    new Bio::SeqFeature::Generic( -seq_id  => $name,
				  -primary => "SGN",
				  -source  => "genomic_clone",
				  -start   =>     1,
				  -end     => $leng,
				  -tag     => { ID   => $name,
					        Name => $name,
						Dbxref => "GB:$accn",
						Ontology_term => "SO:0000040"
					      }
				);

  print $gffIO->write_feature($f);
  exit;
}
</source>


The script gives me the following error:

Use of uninitialized value in pattern match (m//) at
~/perl5/lib/perl5/Bio/FeatureIO/gff.pm line 107, <FB> line 688.

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: only Bio::SeqFeature::Annotated objects are writeable
STACK: Error::throw
STACK: Bio::Root::Root::throw ~/perl5/lib/perl5/Bio/Root/Root.pm:368
STACK: Bio::FeatureIO::gff::write_feature
~/perl5/lib/perl5/Bio/FeatureIO/gff.pm:272
STACK: ./fasta2gff_landmarks.plx:72
-----------------------------------------------------------


With all due respect to:

* http://www.bioperl.org/wiki/GFF_Refactor and
* http://www.bioperl.org/wiki/GFF_code_audit


I'm a bit stuck.

Any quick fix in the meantime? I realise that using a feature object
is overkill here, but I wanted to learn the 'BioPerl way' of doing
things. Should I drop that approach until the code audit and
refactorization is over, or is this the kind of use-case that can help
with that process?


Thanks for any pointers,
Dan.


P.S. Thanks also to the friendly people on
irc://ircfreenode.net/#bioperl and
irc://irc.freenode.net/#bioinformatics who helped me get this script
together as it is.



More information about the Bioperl-l mailing list