[Bioperl-l] New code

Chris Mungall cjm@fruitfly.bdgp.berkeley.edu
Sun, 13 Jan 2002 16:18:47 -0800 (PST)


I have committed scripts/bioperl.pl

(name isn't final, just wanted to get the code out there; also not sure
about the whole release schedule thing - should it be rolled back for that,
as it's still unfinished code?)

This is a command-line interface to bioperl. The easiest way to get a
handle on it is to fire it up and type "demo".

Still a ways to go yet, but you should get the basic idea.

I didn't make much use of Bio::Perl, as I needed some really hacky
subroutines that play around with global variables (I know they're
generally evil, but I think it's justified in this context).

On Mon, 31 Dec 2001, Ewan Birney wrote:

> 
> Chris - that would be great - this sort of "easy access" can work for a
> whole series of users as you mention.
> 
> 
> I have just committed the start of the Bio::Perl module - do you want to add
> to it?
> 
> 
> We should not add the BLAST parsing stuff until Jason tells us the "one
> blast parser to rule them all" 
> 
> 
> 
> I've attached the pod2text of Bio::Perl so far - newbies on the list
> (Elizabeth? Others?) does this look good to you?
> 
> 
> 
> NAME
>     Bio::Perl - Functional access to BioPerl for people who don't
>     like objects
> 
> SYNOPSIS
>        use Bio::Perl qw(read_sequence read_all_sequences write_sequence new_sequence get_sequence);
> 
>        # will guess file format from extension
>        $seq_object = read_sequence($filename); 
> 
>        # forces genbank format
>        $seq_object = read_sequence($filename,'genbank'); 
> 
>        # reads an array of sequences
>        @seq_object_array = read_all_sequences($filename,'fasta'); 
> 
>        # sequences are Bio::Seq objects, so the following methods work
>        # (for more info see Bio::Seq documentation - try perldoc Bio::Seq)
> 
>        print "Sequence name is ",$seq_object->display_id,"\n";
>        print "Sequence acc  is ",$seq_object->accession_number,"\n";
>        print "First 5 bases is ",$seq_object->subseq(1,5),"\n";
> 
>        # get the whole sequence as a single string
> 
>        $sequence_as_a_string = $seq_object->seq();
> 
>        # writing sequences
> 
>        write_sequence(">$filename",'genbank',$seq_object);
> 
>        write_sequence(">$filename",'genbank',@seq_object_array);
>      
>        # making a new sequence from just strings you have
>        # from something else
> 
>        $seq_object = new_sequence("ATTGGTTTGGGGACCCAATTTGTGTGTTATATGTA","myname","AL12232");
> 
>        # getting a sequence from a database (assumes internet connection)
> 
>        $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> 
>        $seq_object = get_sequence('embl',"AI129902");
> 
>        $seq_object = get_sequence('genbank',"AI129902");   
> 
> DESCRIPTION
>     Easy first time access to BioPerl via functions
> 
> FEEDBACK
>   Mailing Lists
> 
>     User feedback is an integral part of the evolution of this and
>     other Bioperl modules. Send your comments and suggestions
>     preferably to one of the Bioperl mailing lists. Your
>     participation is much appreciated.
> 
>       bioperl-l@bio.perl.org
> 
>   Reporting Bugs
> 
>     Report bugs to the Bioperl bug tracking system to help us keep
>     track of the bugs and their resolution. Bug reports can be
>     submitted via email or the web:
> 
>       bioperl-bugs@bio.perl.org
>       http://bio.perl.org/bioperl-bugs/
> 
> AUTHOR - Ewan Birney
>     Email bioperl-l@bio.perl.org
> 
>     Describe contact details here
> 
> APPENDIX
>     The rest of the documentation details each of the object
>     methods. Internal methods are usually preceded with a _
> 
>   read_sequence
> 
>      Title   : read_sequence
>      Usage   : $seq = read_sequence('sequences.fa')
>                $seq = read_sequence($filename,'genbank');
>        
>                # pipes are fine
>                $seq = read_sequence("my_fetching_program $id |",'fasta');
> 
>      Function: Reads the first sequence from the file. If no format is given, it will
>                try to guess the format from the filename. If a format is given, it
>                forces that format. The filename can be any valid Perl open() string;
>                in particular, you can put in pipes.
> 
>      Returns : A Bio::Seq object - see perldoc Bio::Seq for more information
>                (quick synopsis - 
>                 $seq_object->display_id - name of the sequence
>                 $seq_object->seq        - sequence as a string )
> 
>      Args    : Two strings, first the filename - any Perl open() string is ok
>                Second string is the format, which is optional
> 
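The pipe trick follows from how Perl's two-argument open() reads the mode out of the string itself. A minimal sketch in plain Perl, assuming the filename is handed through to open() unchanged; this is not the Bio::Perl source, and slurp_open_string is a made-up name for illustration:

```perl
use strict;
use warnings;

# Two-argument open() takes the mode from the string:
#   "file"       read from a file
#   ">file"      write to a file (truncating it)
#   "command |"  read from the output of a command
# slurp_open_string() is a hypothetical helper, not part of Bio::Perl.
sub slurp_open_string {
    my ($open_string) = @_;
    open(my $fh, $open_string)
        or die "cannot open '$open_string': $!";
    local $/;               # slurp mode: read everything in one go
    my $content = <$fh>;
    close $fh;
    return $content;
}
```

So slurp_open_string("sequences.fa") reads a file and slurp_open_string("my_fetching_program $id |") reads a command's output. The flip side is that a string from an untrusted source can run arbitrary commands, so this style only suits filenames you typed yourself.
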
>   read_all_sequences
> 
>      Title   : read_all_sequences
>      Usage   : @seq_object_array = read_all_sequences($filename);
>                @seq_object_array = read_all_sequences($filename,'genbank');
> 
>      Function: Just as the function above, but reads all the sequences in the
>                file and loads them into an array.
> 
>                For very large files, you will run out of memory. When this
>                happens, you've got to use the SeqIO system directly (this is
>                not so hard! Don't worry about it!). See perldoc Bio::SeqIO
>                for more information
> 
>      Returns : array of Bio::Seq objects
> 
>      Args    : two strings, first the filename (any open() string is ok)
>                second the format (which is optional)
> 
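For the large-file case, a rough sketch of using Bio::SeqIO directly so that only one sequence is held in memory at a time. Bio::SeqIO->new and next_seq are the standard BioPerl calls; count_residues and the example file name are made up for illustration:

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Stream through a sequence file one record at a time instead of
# loading every sequence into an array first.
# count_residues() is a hypothetical helper, not part of Bio::Perl.
sub count_residues {
    my ($filename, $format) = @_;
    my $in = Bio::SeqIO->new(-file => $filename, -format => $format);
    my ($n_seqs, $n_residues) = (0, 0);
    while (my $seq = $in->next_seq) {
        $n_seqs++;
        $n_residues += $seq->length;   # each $seq is freed after this pass
    }
    return ($n_seqs, $n_residues);
}
```

Called as my ($n, $total) = count_residues('big.fa', 'fasta'); the loop body is where per-sequence work would go.
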
>   write_sequence
> 
>      Title   : write_sequence
>      Usage   : write_sequence(">new_file.gb",'genbank',$seq)
>                write_sequence(">new_file.gb",'genbank',@array_of_sequence_objects)
> 
>      Function: Writes sequences in the specified format.
> 
>      Returns : Nothing
> 
>      Args    : filename as an open() output string (e.g. ">new_file.gb")
>                format as a string
>                one or more sequence objects
> 
>   new_sequence
> 
>      Title   : new_sequence
>      Usage   :
>      Function:
>      Example :
>      Returns : 
>      Args    :
> 
>   get_sequence
> 
>      Title   : get_sequence
>      Usage   : $seq_object = get_sequence('swiss',"ROA1_HUMAN");
> 
>      Function: If the computer has internet accessibility, gets
>                the sequence from internet accessible databases. Currently
>                this supports Swissprot, EMBL and GenBank.
> 
>                Swissprot and EMBL fetching is more robust than GenBank fetching.
> 
>      Returns : A Bio::Seq object
> 
>      Args    : database type - one of swiss, embl or genbank
>                identifier or accession number