NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
Hi all,

Something I expect to find useful in several analysis pipelines is
a Galaxy wrapper for the NCBI BLAST+ tools (or even the old
NCBI "legacy" BLAST tools if such a wrapper exists).

I've been looking over the tools in galaxy-dist and galaxy-central and
the only NCBI BLAST wrapper I can see is for MEGABLAST, under
tools/metag_tools.

Are there any more general NCBI BLAST+ wrappers that I have
missed? Or is anyone already working on this?

Thanks,

Peter
_______________________________________________
galaxy-dev mailing list
[hidden email]
http://lists.bx.psu.edu/listinfo/galaxy-dev

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
On Thu, Sep 9, 2010 at 11:29 AM, Peter <[hidden email]> wrote:

> Hi all,
>
> Something I expect to find useful in several analysis pipelines is
> a Galaxy wrapper for the NCBI BLAST+ tools (or even the old
> NCBI "legacy" BLAST tools if such a wrapper exists).
>
> I've been looking over the tools in galaxy-dist and galaxy-central and
> the only NCBI BLAST wrapper I can see is for MEGABLAST, under
> tools/metag_tools.
>
> Are there any more general NCBI BLAST+ wrappers that I have
> missed? Or is anyone already working on this?
>
> Thanks,
>
> Peter

Hi all,

I met Björn Grüning (CC'd) from the University of Freiburg
Pharmaceutical Bioinformatics at a workshop last week, and
they had a few simple BLAST+ wrappers set up. If I recall
correctly, all their databases were nucleotide databases.

For configuration, Björn re-used the existing blastdb.loc file
that comes with Galaxy for the NGS megablast_wrapper tool.
However, that (currently) only holds nucleotide BLAST
databases - and we would need to have separate lists for
nucleotide, protein, and RPS-BLAST protein domain databases.

I would suggest either:

(a) Add new loc files specific to proteins and rpsblast

or:

(b) Extend the blastdb.loc format to include a fourth column
giving the database type (which can default to nucleotide).
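To make option (b) concrete, here is a minimal sketch of how a reader of such a file might treat the fourth column as optional. It assumes the loc file is tab-separated with three existing columns; the meanings given to those columns here (identifier, caption, path), the type labels, and the names parse_blastdb_loc and BlastDb are all hypothetical, not existing Galaxy code.

```python
from collections import namedtuple

# Hypothetical record type; the meanings of the first three columns are assumed.
BlastDb = namedtuple("BlastDb", ["key", "caption", "path", "dbtype"])


def parse_blastdb_loc(lines, default_type="nucleotide"):
    """Parse tab-separated loc lines with an optional fourth type column."""
    databases = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # Not enough columns to describe a database.
        # Fall back to the default when the fourth column is absent or empty.
        dbtype = default_type
        if len(fields) > 3 and fields[3].strip():
            dbtype = fields[3].strip()
        databases.append(BlastDb(fields[0], fields[1], fields[2], dbtype))
    return databases


# Made-up example entries, not real database locations.
example = [
    "# id\tcaption\tpath\t[type]\n",
    "nt_demo\tDemo nucleotide db\t/data/blastdb/nt_demo\n",
    "nr_demo\tDemo protein db\t/data/blastdb/nr_demo\tprotein\n",
]
for db in parse_blastdb_loc(example):
    print(db.key, db.dbtype)
```

Existing three-column entries keep working unchanged under this scheme, which is the backward-compatibility argument for option (b).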

What would the Galaxy team prefer?

Thanks,

Peter


Re: NCBI BLAST+ wrappers in Galaxy?

James Taylor-2
Hi Peter, I think a separate loc file for proteins makes sense, easier to maintain backward compatibility that way. Only speaking for myself though.

-- jt

James Taylor
Assistant Professor
Department of Biology
Department of Mathematics & Computer Science
Emory University





Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
On Mon, Sep 20, 2010 at 6:00 PM, James Taylor <[hidden email]> wrote:
>
> Hi Peter, I think a separate loc file for proteins makes sense, easier to maintain
> backward compatibility that way. Only speaking for myself though.
>

Hi James,

That works for me. So we keep blastdb.loc as a list of nucleotide-only
databases, and introduce new files perhaps named blastp_db.loc and
rpsblast_db.loc (or maybe blastdb_p.loc and blastdb_rps.loc - I don't mind)
for protein and RPS-BLAST databases respectively.

Assuming I produce something generally useful for contribution to the project,
would the best route be:

(a) a patch
(b) an hg branch from http://bitbucket.org/galaxy/galaxy-central
(c) an hg branch from http://bitbucket.org/galaxy/galaxy-dist

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

James Taylor-2
A patch is probably best, or a fork of central. We don't integrate  
anything directly into dist. For tools, there is also the community  
site: usegalaxy.org/community

Thanks!


Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
On Tue, Sep 21, 2010 at 12:45 PM, James Taylor <[hidden email]> wrote:
>
> A patch is probably best, or a fork of central. We don't integrate anything
> directly into dist. For tools, there is also the community site:
> usegalaxy.org/community

I have a query about the existing Megablast wrapper, Python code here:
http://bitbucket.org/galaxy/galaxy-central/src/tip/tools/metag_tools/megablast_wrapper.py

Looking at the above, it is clearly trying to call the command line tool
'megablast' which is part of the NCBI 'legacy' BLAST suite. This is
replaced by the command line tool 'blastn' in the new NCBI BLAST+
suite (the default 'task' parameter is megablast).

Currently the wiki instructions appear to be wrong, quoting:
>> Megablast installation
>>
>> Megablast is a part of the BLAST+ suite of tools. To download it, go to the
>> Megablast page and go to the download link. Select the BLAST+ file
>> appropriate to your platform, noting that Galaxy uses version 2.2.22 currently.
>> There is some information about installation in the BLAST+ user manual,
>> available from the download page.

Quoted from http://bitbucket.org/galaxy/galaxy-central/wiki/NGSLocalSetup

Have I misunderstood? Perhaps the script expects to be able to call
'megablast' via legacy_blast.pl - but most likely the documentation is
out of sync and the Galaxy servers have both BLAST and BLAST+
installed.

I think it would make sense to update megablast_wrapper.py to call the
BLAST+ command line tool blastn instead of the legacy BLAST tool
megablast... would that change be welcome?
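As a rough illustration of what calling blastn in place of megablast could look like from a Python wrapper, here is a minimal sketch. The function names, file names and default values are hypothetical, and this is not the code in megablast_wrapper.py; only the blastn options themselves (-task, -query, -db, -out, -outfmt, -evalue, -word_size) are real BLAST+ options.

```python
import subprocess


def build_blastn_megablast_cmd(query_fasta, database, output_file,
                               evalue=0.001, word_size=28):
    """Return an argument list for blastn run in megablast mode."""
    return [
        "blastn",
        "-task", "megablast",  # explicit, although it is blastn's default task
        "-query", query_fasta,
        "-db", database,
        "-out", output_file,
        "-outfmt", "6",  # tabular output
        "-evalue", str(evalue),
        "-word_size", str(word_size),
    ]


def run_blastn(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    subprocess.check_call(cmd)


# Hypothetical file names, shown only to print the resulting command line.
cmd = build_blastn_megablast_cmd("reads.fasta", "/data/blastdb/demo",
                                 "hits.tabular")
print(" ".join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file paths.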

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
On Tue, Sep 21, 2010 at 2:13 PM, Peter <[hidden email]> wrote:

>
> I have a query about the existing Megablast wrapper, Python code here:
> http://bitbucket.org/galaxy/galaxy-central/src/tip/tools/metag_tools/megablast_wrapper.py
>
> Looking at the above, it is clearly trying to call the command line tool
> 'megablast' which is part of the NCBI 'legacy' BLAST suite. This is
> replaced by the command line tool 'blastn' in the new NCBI BLAST+
> suite (the default 'task' parameter is megablast).
>
> Currently the wiki instructions appear to be wrong, quoting:
>>> Megablast installation
>>>
>>> Megablast is a part of the BLAST+ suite of tools. To download it, go to the
>>> Megablast page and go to the download link. Select the BLAST+ file
>>> appropriate to your platform, noting that Galaxy uses version 2.2.22 currently.
>>> There is some information about installation in the BLAST+ user manual,
>>> available from the download page.
>
> Quoted from http://bitbucket.org/galaxy/galaxy-central/wiki/NGSLocalSetup
>
> Have I misunderstood? Perhaps the script expects to be able to call
> 'megablast' via legacy_blast.pl - but most likely the documentation is
> out of sync and the Galaxy servers have both BLAST and BLAST+
> installed.
>
> I think it would make sense to update megablast_wrapper.py to call the
> BLAST+ command line tool blastn instead of the legacy BLAST tool
> megablast... would that change be welcome?
>

Here is a fork of galaxy-central which updates megablast_wrapper.py
to actually use BLAST+, which seems to work for me:

http://bitbucket.org/peterjc/galaxy-central/changeset/ff54cf59749d

Follow up change to update both the XML and py files to use the
new BLAST+ arguments for the filter (yes/no instead of T/F) and
update the list of columns in the documentation:

https://bitbucket.org/peterjc/galaxy-central/changeset/71e6e7db6bea
https://bitbucket.org/peterjc/galaxy-central/changeset/2efff78a82de
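The T/F to yes/no change can be pictured with a small translation helper. This is only a sketch: it assumes the filter in question is the low-complexity filter, which blastn exposes as the -dust option, and the function names are hypothetical rather than taken from the changesets above.

```python
def legacy_filter_to_blastplus(value):
    """Translate a legacy T/F filter setting to the BLAST+ yes/no form."""
    mapping = {"T": "yes", "F": "no"}
    try:
        return mapping[value.strip().upper()]
    except KeyError:
        raise ValueError("Expected 'T' or 'F', got %r" % value)


def dust_arguments(legacy_value):
    """Return blastn arguments for the low-complexity (DUST) filter."""
    return ["-dust", legacy_filter_to_blastplus(legacy_value)]


print(dust_arguments("T"))
print(dust_arguments("F"))
```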

Note that 'legacy' BLAST 2.2.22 and BLAST+ 2.2.24 both
output 12 columns in tabular mode, so I think the old XML wrapper
documentation about 12 columns is wrong or was at least out of date.
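For reference, the 12 default tabular columns can be written down using the field names BLAST+ uses for them. The parser below is a minimal sketch with hypothetical names (BlastHit, parse_tabular_line), and the example line is made up rather than real BLAST output.

```python
from collections import namedtuple

# Field names of the 12 default BLAST+ tabular columns, in order.
BLAST_TABULAR_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
BlastHit = namedtuple("BlastHit", BLAST_TABULAR_COLUMNS)


def parse_tabular_line(line):
    """Convert one 12-column tabular line into a typed BlastHit record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(BLAST_TABULAR_COLUMNS):
        raise ValueError("Expected 12 columns, got %i" % len(fields))
    return BlastHit(
        fields[0], fields[1], float(fields[2]),
        int(fields[3]), int(fields[4]), int(fields[5]),
        int(fields[6]), int(fields[7]), int(fields[8]), int(fields[9]),
        float(fields[10]), float(fields[11]),
    )


# Made-up example line for illustration only.
hit = parse_tabular_line(
    "query1\tsubject1\t98.50\t200\t3\t0\t1\t200\t51\t250\t1e-100\t360\n")
print(hit.qseqid, hit.pident, hit.evalue)
```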

These updates are on my branch 'megablast'.

Could someone review these changes for possible inclusion in
Galaxy? Would you prefer me to prepare a single patch file?

Regards,

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

James Taylor-2
Peter, this is great and we will look at it. The main thing I want to  
think about is does this affect reproducibility in any way. We may  
want to keep the old tool, and have another tool for the NCBI version  
(I'd love to see a complete set of wrappers for NCBI blast+, which we  
could include with our cloud images right away). Thanks!

-- jt

James Taylor
Assistant Professor
Department of Biology
Department of Mathematics & Computer Science
Emory University


Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
On Tue, Sep 21, 2010 at 4:14 PM, James Taylor <[hidden email]> wrote:
>
> Peter, this is great and we will look at it. The main thing I want to think
> about is does this affect reproducibility in any way. We may want to keep
> the old tool, and have another tool for the NCBI version

Sure.

By old tool I'm assuming you mean the NCBI legacy BLAST?

> (I'd love to see a complete set of wrappers for NCBI blast+, which we could
> include with our cloud images right away). Thanks!

I'm working on it - we want to use it on our local server too.

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
On Tue, Sep 21, 2010 at 4:35 PM, Peter <[hidden email]> wrote:
>On Tue, Sep 21, 2010 at 4:14 PM, James Taylor <[hidden email]> wrote:
>> (I'd love to see a complete set of wrappers for NCBI blast+, which we could
>> include with our cloud images right away). Thanks!
>
> I'm working on it - we want to use it on our local server too.
>

I have a blastplus branch here, currently just minimal wrappers for blastn and
blastp: http://bitbucket.org/peterjc/galaxy-central

Early feedback is welcome - I'd like this to follow Galaxy conventions
and be taken up as part of the default installation.

Before I do lots of work defining most of the parameters, I started a thread
asking how to share definitions between wrapper XML files:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2010-September/003371.html

I've also written and wrapped a very simple script to split a FASTA file into
records with and without BLAST hits - something I plan to use in some
simple workflows later on.
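The idea of such a split can be sketched as follows. This is not the script mentioned above, just a minimal illustration that assumes the query identifier is the first column of the tabular BLAST output and the first word of each FASTA header line; the function names are hypothetical.

```python
def ids_with_hits(tabular_lines):
    """Collect query identifiers (first column) from tabular BLAST output."""
    ids = set()
    for line in tabular_lines:
        # Skip blank lines and comment lines starting with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        ids.add(line.split("\t", 1)[0].strip())
    return ids


def split_fasta_by_hits(fasta_lines, hit_ids):
    """Return (lines of records with hits, lines of records without hits)."""
    with_hits, without_hits = [], []
    target = None
    for line in fasta_lines:
        if line.startswith(">"):
            # The identifier is taken to be the first word after '>'.
            words = line[1:].split(None, 1)
            identifier = words[0] if words else ""
            target = with_hits if identifier in hit_ids else without_hits
        if target is not None:
            target.append(line)
    return with_hits, without_hits


# Made-up example data.
fasta = [">seq1 first record\n", "ACGT\n", ">seq2\n", "GGGG\n"]
tabular = ["# a comment line\n", "seq1\tsubjA\t100.00\t4\n"]
hits, no_hits = split_fasta_by_hits(fasta, ids_with_hits(tabular))
print("".join(hits), end="")
```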

I plan to add BLAST ASN1 as a new format, and wrap the blast_formatter
application added in BLAST 2.2.24+ to turn this into blastxml, plain text,
tabular etc.
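The two-step approach could look roughly like this: run the search once with -outfmt 11 (the BLAST archive format, in ASN.1), then call blast_formatter on the archive for each output format wanted. The helper names and file names are hypothetical; the command line options are real BLAST+ options.

```python
def build_blast_archive_cmd(program, query_fasta, database, archive_file):
    """Command to run a search and save it as a BLAST archive (ASN.1)."""
    return [program, "-query", query_fasta, "-db", database,
            "-outfmt", "11", "-out", archive_file]


def build_blast_formatter_cmd(archive_file, output_file, outfmt="6"):
    """Command to reformat an archive, e.g. 0 = text, 5 = XML, 6 = tabular."""
    return ["blast_formatter", "-archive", archive_file,
            "-outfmt", str(outfmt), "-out", output_file]


# Hypothetical file names; the search runs once, formatting can be repeated.
search = build_blast_archive_cmd("blastn", "reads.fasta",
                                 "/data/blastdb/demo", "hits.asn")
to_xml = build_blast_formatter_cmd("hits.asn", "hits.xml", outfmt=5)
to_tab = build_blast_formatter_cmd("hits.asn", "hits.tabular")
print(" ".join(search))
print(" ".join(to_xml))
print(" ".join(to_tab))
```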

There are other things I'd like to add like blastxml to tabular conversion.
In this case I'd like to use the Biopython BLAST XML parser - is adding
Biopython as a Galaxy dependency going to be a problem? You already
have numpy which is the main dependency of Biopython (there are
others - all optional).

Regards,

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

Bossers, Alex
Hi Peter,
Nice work. We are working on some general tool wrappers as well (blat, mummer, etc.), which have the same difficulty: there is a lot to maintain if something changes. So I am also following your thread on the shared XML parts with great interest.

Regarding the BLAST XML to table conversion: it's already in your distribution for megablast at metag_tools/megablast_xml_parser.xml. There is also a basic wrapper for megablast.
Keep up the good work!
Alex


Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
On Tue, Sep 28, 2010 at 3:43 PM, Bossers, Alex <[hidden email]> wrote:
>
> Hi Peter,
> Nice work. We are working on some general tool wrappers as well (blat, mummer,
> etc) which have the same difficulty of lots to maintain if something changes...So I
> also follow your thread on the shared XML parts with great interest.

Nice to know this use case (BLAST) isn't a special case.

> Regarding the blast xml to table. Its already in your distribution for megablast at
> metag_tools/megablast_xml_parser.xml there is also a basic wrapper for megablast.

I'd seen the megablast wrapper (currently for legacy NCBI BLAST, not BLAST+,
as discussed earlier in the thread).

The metag_tools/megablast_xml_parser.xml script is close to what I had in mind,
but not quite the same: I wanted to reproduce the default 12 column tabular
output from the BLAST+ tools from the XML output. My thinking was we'd have
lots of tools designed to work with the default tabular output from BLAST+,
so an option to go to XML if needed for some steps and recover the tabular
output later would be nice. [And similarly for the ASN.1 output.]
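A minimal sketch of recovering the 12 columns from BLAST XML is given here. Note the swap: it uses the Python standard library's xml.etree.ElementTree rather than Biopython, so that it is self-contained. The element names are those of NCBI BLAST XML; the function names and sample data are hypothetical. It takes the query identifier from the first word of Iteration_query-def and the subject identifier from Hit_id, and copies the e-value and bit score text as found in the XML, so the number formatting may differ from real tabular output.

```python
import xml.etree.ElementTree as ET


def count_gap_openings(aligned_seq):
    """Count runs of '-' characters in one aligned sequence."""
    openings = 0
    previous = ""
    for char in aligned_seq:
        if char == "-" and previous != "-":
            openings += 1
        previous = char
    return openings


def blastxml_to_tabular(xml_text):
    """Return a list of 12-column tab-separated rows, one per HSP."""
    rows = []
    root = ET.fromstring(xml_text)
    for iteration in root.iter("Iteration"):
        query_def = iteration.findtext("Iteration_query-def", "")
        qseqid = (query_def.split(None, 1) or [""])[0]
        for hit in iteration.iter("Hit"):
            sseqid = hit.findtext("Hit_id", "")
            for hsp in hit.iter("Hsp"):
                qseq = hsp.findtext("Hsp_qseq", "")
                hseq = hsp.findtext("Hsp_hseq", "")
                length = int(hsp.findtext("Hsp_align-len"))
                identity = int(hsp.findtext("Hsp_identity"))
                # Mismatches are aligned columns that differ, ignoring gaps.
                mismatch = sum(1 for q, h in zip(qseq, hseq)
                               if q != h and q != "-" and h != "-")
                gapopen = count_gap_openings(qseq) + count_gap_openings(hseq)
                rows.append("\t".join([
                    qseqid, sseqid,
                    "%.2f" % (100.0 * identity / length),
                    str(length), str(mismatch), str(gapopen),
                    hsp.findtext("Hsp_query-from"),
                    hsp.findtext("Hsp_query-to"),
                    hsp.findtext("Hsp_hit-from"),
                    hsp.findtext("Hsp_hit-to"),
                    hsp.findtext("Hsp_evalue"),
                    hsp.findtext("Hsp_bit-score"),
                ]))
    return rows


# Tiny made-up document containing only the elements this sketch reads.
SAMPLE_XML = """<BlastOutput><BlastOutput_iterations><Iteration>
<Iteration_query-def>q1 demo query</Iteration_query-def>
<Iteration_hits><Hit><Hit_id>s1</Hit_id><Hit_hsps><Hsp>
<Hsp_bit-score>10.5</Hsp_bit-score><Hsp_evalue>0.01</Hsp_evalue>
<Hsp_query-from>1</Hsp_query-from><Hsp_query-to>7</Hsp_query-to>
<Hsp_hit-from>3</Hsp_hit-from><Hsp_hit-to>10</Hsp_hit-to>
<Hsp_identity>6</Hsp_identity><Hsp_align-len>8</Hsp_align-len>
<Hsp_qseq>ACG-TACG</Hsp_qseq><Hsp_hseq>ACGTTACC</Hsp_hseq>
</Hsp></Hit_hsps></Hit></Iteration_hits>
</Iteration></BlastOutput_iterations></BlastOutput>"""

for row in blastxml_to_tabular(SAMPLE_XML):
    print(row)
```

One caveat with this approach: for databases built without parsed sequence identifiers, Hit_id can be a generated placeholder, in which case the first word of Hit_def may be the better choice for the subject column.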

I guess here (and in general) there is scope for supporting all the tab fields
which the NCBI command line tools support... I see Galaxy has some clever
metadata for tracking different columns in interval data types - we'd need to
do something similar for different BLAST columns. That is going to be more
work of course, and not something I need immediately.

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
In reply to this post by Peter-2-2
On Tue, Sep 28, 2010 at 3:07 PM, Peter <[hidden email]> wrote:
> On Tue, Sep 21, 2010 at 4:35 PM, Peter <[hidden email]> wrote:
>>On Tue, Sep 21, 2010 at 4:14 PM, James Taylor <[hidden email]> wrote:
>>> (I'd love to see a complete set of wrappers for NCBI blast+, which we could
>>> include with our cloud images right away). Thanks!
>>
>> I'm working on it - we want to use it on our local server too.
>
> I have a blastplus branch here, currently just minimal wrappers for blastn and
> blastp: http://bitbucket.org/peterjc/galaxy-central

I've extended that to cover the five main BLAST flavours by adding blastx,
tblastn and tblastx. The order and descriptions should match that used on
the NCBI BLAST webserver.

Still to do, RPS-BLAST and PSI-BLAST, and of course most of the
optional parameters.

Creating (small) databases from FASTA files seems potentially useful
(i.e. wrapping the BLAST+ tool makeblastdb which replaces formatdb)
but as these are made up of several files I need to know more about
how Galaxy works before tackling that.
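For orientation, a makeblastdb call from Python might be assembled as in this minimal sketch. The function name and file names are hypothetical; -in, -dbtype, -out and -title are real makeblastdb options. How Galaxy should track the resulting set of files is the open question and is not addressed here.

```python
def build_makeblastdb_cmd(fasta_file, output_base, is_protein=False,
                          title=None):
    """Return an argument list for building a BLAST database from FASTA.

    The database is written as several files sharing the output_base
    prefix (for example .nhr, .nin and .nsq for a small nucleotide database).
    """
    cmd = ["makeblastdb", "-in", fasta_file,
           "-dbtype", "prot" if is_protein else "nucl",
           "-out", output_base]
    if title is not None:
        cmd.extend(["-title", title])
    return cmd


# Hypothetical file names.
nucl_cmd = build_makeblastdb_cmd("contigs.fasta", "contigs_db")
prot_cmd = build_makeblastdb_cmd("proteins.fasta", "proteins_db",
                                 is_protein=True, title="Demo proteins")
print(" ".join(nucl_cmd))
print(" ".join(prot_cmd))
```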

Alternatively, the BLAST+ feature of FASTA file versus FASTA file
could be handy (using -subject instead of -db). I think I can see how
to handle this... but it would complicate supporting other parameters.
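One way to picture the choice between -db and -subject is a command builder that accepts exactly one of the two. This is a minimal sketch with hypothetical function and file names, not the wrapper code itself.

```python
def build_blast_cmd(program, query_fasta, output_file,
                    database=None, subject_fasta=None):
    """Build a BLAST+ command searching either a database or a FASTA file."""
    # Exactly one of the two targets must be supplied.
    if (database is None) == (subject_fasta is None):
        raise ValueError("Give exactly one of database or subject_fasta")
    cmd = [program, "-query", query_fasta, "-out", output_file,
           "-outfmt", "6"]
    if database is not None:
        cmd.extend(["-db", database])
    else:
        cmd.extend(["-subject", subject_fasta])
    return cmd


# Hypothetical file names.
db_cmd = build_blast_cmd("blastp", "query.fasta", "out.tabular",
                         database="/data/blastdb/demo_prot")
file_cmd = build_blast_cmd("blastp", "query.fasta", "out.tabular",
                           subject_fasta="subject.fasta")
print(" ".join(db_cmd))
print(" ".join(file_cmd))
```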

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
On Tue, Sep 28, 2010 at 4:52 PM, Peter <[hidden email]> wrote:

>
> Creating (small) databases from FASTA files seems potentially useful
> (i.e. wrapping the BLAST+ tool makeblastdb which replaces formatdb)
> but as these are made up of several files I need to know more about
> how Galaxy works before tackling that.
>
> Alternatively, the BLAST+ feature of FASTA file versus FASTA file
> could be handy (using -subject instead of -db). I think I can see how
> to handle this... but it would complicate supporting other parameters.
>

I've done the latter now:
http://bitbucket.org/peterjc/galaxy-central/changeset/17b2cb598b5e

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
Just FYI,

In related news, Bill Pearson's latest release of the FASTA suite
includes support for BLAST+ like tabular output files. I guess
wrappers for these in Galaxy would also be nice to have.

Peter

---------- Forwarded message ----------
From: William Pearson <[hidden email]>
Date: Fri, Oct 1, 2010 at 6:26 PM
Subject: fasta-36.2.7 available
To: [hidden email]




The latest version of the FASTA36 package, fasta-36.2.7, is available from:

http://faculty.virginia.edu/wrpearson/fasta/fasta36/fasta-36.2.7.tar.gz

A Mac OSX universal binary is also available.

This version of FASTA36 fixes a bug in the fastx36(_t) sub-alignment
code, that could cause the library sequence to be modified.  In
addition, it fixes some problems with sub-alignments that occurred
with very short query sequences.  There have also been some minor
output format changes to reduce presentation of redundant information
when multiple sub-alignments are shown.  The "-L" long library
descriptions has been re-enabled, and changing the -E threshold no
longer disables multiple sub-alignments.

The major new feature in this version is the introduction of two
BLAST+ compatible tabular output formats: -m 8 (BLAST+ tabular output,
equivalent to BLAST+ -outfmt=6) and -m 8C (BLAST+ -outfmt=7).

In addition, the FASTA36 programs can be compiled to read the sequence
database only once, and then compare multiple query sequences to the
library held in memory (see comments in doc/readme.v36 and
make/Makefile36m.common).  Holding the library in memory allows the
program to scale very efficiently on large multi-core computers (>40X
speedup on 48 cores).

This version of the program has not been thoroughly tested under MPI.

As always, let me know about problems.

Bill Pearson

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
In reply to this post by James Taylor-2
On Tue, Sep 21, 2010 at 4:14 PM, James Taylor <[hidden email]> wrote:
>
> Peter, this is great and we will look at it. The main thing I want to think
> about is does this affect reproducibility in any way. We may want to keep
> the old tool, and have another tool for the NCBI version (I'd love to see a
> complete set of wrappers for NCBI blast+, which we could include with our
> cloud images right away). Thanks!
>
> -- jt

Hi James et al.

Could you or someone from the Galaxy team take a look at my wrappers
for blastn, blastp, blastx, tblastn and tblastx and the BLAST XML to tabular
converter for possible inclusion in galaxy-central?

http://bitbucket.org/peterjc/galaxy-central/src/blastplus

The BLAST+ suite is so big and has so many options that this is by no
means a "complete set of wrappers" but it covers the immediate core
functionality that I expect to need personally.

Thanks,

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
On Mon, Oct 11, 2010 at 4:06 PM, Peter <[hidden email]> wrote:

> On Tue, Sep 21, 2010 at 4:14 PM, James Taylor <[hidden email]> wrote:
>>
>> Peter, this is great and we will look at it. The main thing I want to think
>> about is does this affect reproducibility in any way. We may want to keep
>> the old tool, and have another tool for the NCBI version (I'd love to see a
>> complete set of wrappers for NCBI blast+, which we could include with our
>> cloud images right away). Thanks!
>>
>> -- jt
>
> Hi James et al.
>
> Could you or someone from the Galaxy team take a look at my wrappers
> for blastn, blastp, blastx, tblastn and tblastx and the BLAST XML to tabular
> converter for possible inclusion in galaxy-central?
>
> http://bitbucket.org/peterjc/galaxy-central/src/blastplus
>
> The BLAST+ suite is so big and has so many options that this is by no
> means a "complete set of wrappers" but it covers the immediate core
> functionality that I expect to need personally.
>
> Thanks,
>
> Peter

P.S. One thing this is lacking is unit tests. I worry that these could be
specific to the version of BLAST+ and the version of any database
installed. Ultimately the reference platform here is the "official"
public Galaxy server, right?

Using the -subject feature we can BLAST one file against another
which avoids the database version issue.

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
In reply to this post by Peter-2-2
On Tue, Oct 5, 2010 at 2:47 PM, Peter <[hidden email]> wrote:
> Just FYI,
>
> In related news, Bill Pearson's latest release of the FASTA suite
> includes support for BLAST+ like tabular output files. I guess
> wrappers for these in Galaxy would also be nice to have.
>
> Peter

I've started looking at that now,
http://bitbucket.org/peterjc/galaxy-central/src/pearson_fasta

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
In reply to this post by Peter-2-2
On Wed, Oct 13, 2010 at 10:22 AM, Peter <[hidden email]> wrote:
> On Wed, Oct 13, 2010 at 12:58 AM, Kanwei Li <[hidden email]> wrote:
>>
>> Hi Peter,
>>
>> The BLAST+ branch has been transplanted into galaxy-central. Thank you!
>>
>> Kanwei
>
> Wow. Thanks :)

There are another few change sets on this follow up branch,
http://bitbucket.org/peterjc/galaxy-central/src/blastplus2

These are:

Include BLAST-XML to tabular in tool_conf.xml.sample
http://bitbucket.org/peterjc/galaxy-central/changeset/629448f1a17b

Workaround Issue 397 using 'if' in Cheetah template to avoid hidden parameters
http://bitbucket.org/peterjc/galaxy-central/changeset/02ce0b68936a

Ignore Unix editor backup files ending with tilde (not BLAST+ specific):
http://bitbucket.org/peterjc/galaxy-central/changeset/708816ffd451

Ignore # lines in BLAST tabular output in my filter script
http://bitbucket.org/peterjc/galaxy-central/changeset/b2dcd6b8d434

And a capitalisation fix in my XML caption:
http://bitbucket.org/peterjc/galaxy-central/changeset/6efadbea0cfc

>
> I mentioned it needs tests:
>

I've worked out how to add unit tests to the XML wrapper (that is well
documented) and run them via run_functional_tests.sh - so adding
some BLAST+ tests should be possible now.

Peter

Re: NCBI BLAST+ wrappers in Galaxy?

Peter-2-2
Hi again,

On Mon, Oct 18, 2010 at 10:01 AM, Peter <[hidden email]> wrote:
>
> There are another few change sets on this follow up branch,
> http://bitbucket.org/peterjc/galaxy-central/src/blastplus2

I've tidied that up a bit, please see:
http://bitbucket.org/peterjc/galaxy-central/src/blastplus_please_merge

I've been working on adding more arguments but this is contingent
on support for optional integer/float parameters (Issue 403):
http://bitbucket.org/galaxy/galaxy-central/issue/403/enhance-tool-parameters-to-be-optional
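On the command line side, the effect wanted from optional numeric parameters is that an unset value adds nothing to the command. A minimal sketch of that behaviour, with a hypothetical helper name and independent of how Galaxy itself would represent optional parameters:

```python
def optional_numeric_args(settings):
    """Turn (flag, value) pairs into arguments, skipping unset values.

    A value of None or the empty string means the user left it unset,
    so the tool's own default applies.
    """
    args = []
    for flag, value in settings:
        if value is None or value == "":
            continue
        float(value)  # Raises ValueError if the value is not numeric.
        args.extend([flag, str(value)])
    return args


# Only -evalue is set here; the other two are left at the tool defaults.
extra = optional_numeric_args([
    ("-evalue", "0.001"),
    ("-word_size", ""),
    ("-max_target_seqs", None),
])
print(extra)
```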

Thanks,

Peter