Wiki erratum...


Richard Bruskiewich-2
Galaxy Colleagues,

I don't know who is maintaining the Galaxy wiki page at http://bitbucket.org/galaxy/galaxy-central/wiki/NGSLocalSetup but I noticed that the Python script under the Megablast instructions has an error: the "defline" assignment that follows the "line.startswith" test should be moved *after* the "if length > 0" block. Otherwise, the defline is overwritten before the previous sequence is written out. This shifts the FASTA header line identifiers by one (i.e. the current sequence gets the next sequence's identifier).

I've commented out the erroneous defline below and added the right one:

import sys

length = 0
defline = ''
seq = []

for line in sys.stdin:
    line = line.rstrip( '\r\n' )
    if line.startswith( '>' ):
        # defline = line.split( "|" )[1] # defline should NOT be here
        if length > 0:
            print ">%s_%s" % ( defline, length )
            print "\n".join( seq )
            length = 0
            seq = []
        defline = line.split( "|" )[1] # defline should be here
    else:
        seq.append( line )
        length += len( line )

print ">%s_%s" % ( defline, length )
print "\n".join( seq )

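
To make the effect of the ordering concrete, here is a minimal Python 3 sketch of the same loop wrapped in a function. The function name `relabel_fasta`, the `assign_before_flush` switch, and the two sample records are illustrative assumptions and are not part of the wiki script; the switch only exists to reproduce the ordering reported above as wrong.

```python
def relabel_fasta(lines, assign_before_flush=False):
    """Rewrite FASTA headers as '>id_length', returning the output lines.

    assign_before_flush=True reproduces the ordering reported as wrong:
    the identifier is updated before the previous record is emitted.
    """
    out = []
    length = 0
    defline = ''
    seq = []
    for line in lines:
        line = line.rstrip('\r\n')
        if line.startswith('>'):
            if assign_before_flush:
                defline = line.split('|')[1]  # too early: previous record not yet written
            if length > 0:
                out.append('>%s_%s' % (defline, length))
                out.extend(seq)
                length = 0
                seq = []
            if not assign_before_flush:
                defline = line.split('|')[1]  # correct: previous record already written
        else:
            seq.append(line)
            length += len(line)
    # Emit the final record, as the original script does after the loop.
    out.append('>%s_%s' % (defline, length))
    out.extend(seq)
    return out


# Hypothetical two-record input with '|'-delimited headers.
records = [">gi|111|first", "ACGT", ">gi|222|second", "ACGTAC"]
print(relabel_fasta(records))
print(relabel_fasta(records, assign_before_flush=True))
```

With the early assignment, the first record is written under the second record's identifier and the first identifier never appears in the output.
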
While on the topic of this page, perhaps the software versions need to be revisited. Megablast has already been superseded by BLAST+. Perhaps new releases of Galaxy should update this?

BTW, when is the new Galaxy release (CloudMan AMI too...) coming out? I heard rumors that it was due this week.

Cheers
Richard Bruskiewich

--
Richard Bruskiewich, PhD
Senior Scientist, Computational and Systems Biology
Applications Team for Computational Genomics
T.T. Chang Genetic Resources Center
International Rice Research Institute


_______________________________________________
galaxy-user mailing list
[hidden email]
http://lists.bx.psu.edu/listinfo/galaxy-user

Re: Wiki erratum...

Jelle Scholtalbers
Hi Richard,

Any registered Bitbucket user can edit the Galaxy wiki. This way the
documentation quality should increase as more people can add their
experiences/fixes/comments to it. So I would say, feel free to edit.
The only real downside is that it is a Bitbucket wiki, which is rather limited.

Cheers,

Jelle

On Fri, Dec 10, 2010 at 4:04 AM, Richard Bruskiewich
<[hidden email]> wrote:

> Galaxy Colleagues,
>
> I don't know who is maintaining the Galaxy wiki page at
> http://bitbucket.org/galaxy/galaxy-central/wiki/NGSLocalSetup but I noticed
> that the Python script under the Megablast instructions has an error: the
> "defline" operation after the "line.startswith" should be moved *after* the
> if length > 0 statement, otherwise, the defline is reset incorrectly before
> the previous sequence is written out. This results in a frameshift in the
> FASTA header line identifiers (i.e. the current sequence gets the next
> sequence identifier).

Re: [galaxy-dev] Wiki erratum...

Anton Nekrutenko
In reply to this post by Richard Bruskiewich-2
Richard:

This beauty was mine. Thanks for pointing this out. It is now fixed.

Thanks,

anton


On Dec 9, 2010, at 10:04 PM, Richard Bruskiewich wrote:

Galaxy Colleagues,

I don't know who is maintaining the Galaxy wiki page at http://bitbucket.org/galaxy/galaxy-central/wiki/NGSLocalSetup but I noticed that the Python script under the Megablast instructions has an error: the "defline" operation after the "line.startswith" should be moved *after* the if length > 0 statement, otherwise, the defline is reset incorrectly before the previous sequence is written out. This results in a frameshift in the FASTA header line identifiers (i.e. the current sequence gets the next sequence identifier).





Re: Wiki erratum...

Peter-2-2
In reply to this post by Richard Bruskiewich-2
On Fri, Dec 10, 2010 at 3:04 AM, Richard Bruskiewich
<[hidden email]> wrote:
> Galaxy Colleagues,
>
> I don't know who is maintaining the Galaxy wiki page ...
>
> While on the topic of this page, perhaps the software versions need to be
> revisited. Megablast has been superseded already by Blast+. Perhaps new
> releases of Galaxy should update this?

Hi Richard,

Galaxy already has wrappers for the main BLAST+ tools, including
blastn which covers megablast. However they are not currently
available on the public Galaxy instance, in part due to load concerns.
You can enable them locally if you are running your own Galaxy -
that is what we are doing.
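
As a rough illustration of how blastn covers megablast: in BLAST+, megablast is selected as a task of the blastn program rather than being a separate executable. The sketch below only builds the command line; the helper name `megablast_command` and the file and database names are hypothetical, and nothing here reflects how the Galaxy wrappers themselves are implemented.

```python
def megablast_command(query, database, output, evalue=0.001):
    """Build an argument list for a megablast-style search with BLAST+ blastn."""
    return [
        "blastn",
        "-task", "megablast",   # megablast is a blastn task in BLAST+
        "-query", query,        # input FASTA file
        "-db", database,        # name of a formatted BLAST database
        "-out", output,         # file to write the results to
        "-evalue", str(evalue), # expectation value threshold
    ]


# Hypothetical file and database names, for illustration only.
cmd = megablast_command("reads.fasta", "my_db", "hits.txt")
print(" ".join(cmd))
```

On a machine with BLAST+ installed, a list like this could be passed to `subprocess.run(cmd, check=True)` to run the search.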

I did also offer to update the old Megablast tool to use blastn from
BLAST+ instead of blastall from legacy BLAST, but as I recall the
Galaxy team were cautious about this since it could break the
reproducibility of existing workflows.

Peter