grouping hsps in btab format

Chris Hemmerich

Hello,

Does the btab format retain the information needed to group HSPs that have
been combined to generate an E-value, beyond assuming that consecutive,
identical HSPs are grouped? We're looking at transcriptome data and could have
millions of HSPs per pipeline, so I'm hesitant to automate an assumption.
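
For illustration, one reading of the assumption in question can be sketched as below: adjacent rows are treated as one group when they share query, hit, and E-value. The column positions and the function name `group_consecutive` are assumptions made for this sketch and are not confirmed anywhere in the thread.

```python
from itertools import groupby

# Assumed 0-based positions in a tab-separated btab row (hypothetical,
# adjust to the actual layout produced by your converter).
QUERY_COL = 0    # query name
HIT_COL = 5      # hit name
EVALUE_COL = 19  # E-value, compared as the literal string in the file


def group_consecutive(rows):
    """Group adjacent rows that share query, hit and E-value string."""
    def key(row):
        return (row[QUERY_COL], row[HIT_COL], row[EVALUE_COL])
    return [list(group) for _, group in groupby(rows, key=key)]
```

The weakness of this heuristic is that two unrelated groups that happen to sit next to each other with the same E-value are merged, and nothing in the rows themselves records that this happened.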

BioPerl provides an 'n()' method for HSPs that resolves this unambiguously.
I'm considering putting it in the empty column 15 "segment_number" in our
local blast2btab.pl, or adding a new column at the end. Does anyone have a
better idea for handling this? Does anyone care enough to commit a
consistent method of doing it to svn?

Cheers and Thanks,
  Chris

------------------------------------------------------------------------------
This SF email is sponsored by:
Try Windows Azure free for 90 days Click Here
http://p.sf.net/sfu/sfd2d-msazure
_______________________________________________
Ergatis-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/ergatis-users
Re: grouping hsps in btab format

Joshua Orvis
Chris -

I think using that column would be appropriate for this need.  In fact, another component had a similar need to group alignment sections and established the first use of it (see the documentation for aat_aa within Ergatis).  It actually used two values, a chain ID along with a segment number, to group different alignment segments or HSPs.  I think chain ID is the more appropriate term here if we want to be consistent.  The segments were just increasing integers within each chain.
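
A minimal sketch of the chain ID plus segment number scheme described above, assuming each HSP already carries some group label (however it was obtained from the parser). The function names, the `labels` input, and the `chain_col` parameter are hypothetical; only the use of column 15 for the segment number comes from this thread, and the chain ID column is deliberately left for the caller to choose.

```python
def number_segments(labels):
    """Map per-HSP group labels to (chain_id, segment_number) pairs.

    Chain IDs are assigned 1, 2, 3... in order of first appearance;
    segment numbers are increasing integers within each chain.
    """
    chain_ids = {}     # label -> chain ID
    next_segment = {}  # chain ID -> next segment number to hand out
    pairs = []
    for label in labels:
        chain = chain_ids.setdefault(label, len(chain_ids) + 1)
        segment = next_segment.get(chain, 1)
        next_segment[chain] = segment + 1
        pairs.append((chain, segment))
    return pairs


def annotate(row, chain, segment, chain_col, segment_col=14):
    """Return a copy of a btab row with chain ID and segment number set.

    segment_col=14 is the 0-based index of column 15; chain_col is an
    assumption the caller must supply.
    """
    new_row = list(row)
    new_row[chain_col] = str(chain)
    new_row[segment_col] = str(segment)
    return new_row


# Usage: HSPs labelled a, a, b, a end up in two chains.
# number_segments(["a", "a", "b", "a"]) -> [(1, 1), (1, 2), (2, 1), (1, 3)]
```

Because the chain ID is stored explicitly, HSPs from the same chain no longer need to be adjacent in the file to be regrouped later.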

Joshua


On Sun, Mar 25, 2012 at 9:06 PM, Chris Hemmerich <[hidden email]> wrote:

Hello,

Does the btab format retain the information needed to group HSPs that have
been combined to generate an E-value, beyond assuming that consecutive,
identical HSPs are grouped? We're looking at transcriptome data and could have
millions of HSPs per pipeline, so I'm hesitant to automate an assumption.

BioPerl provides an 'n()' method for HSPs that resolves this unambiguously.
I'm considering putting it in the empty column 15 "segment_number" in our
local blast2btab.pl, or adding a new column at the end. Does anyone have a
better idea for handling this? Does anyone care enough to commit a
consistent method of doing it to svn?

Cheers and Thanks,
 Chris