\u0001 characters and blast2bsml.pl

Aaron Buechlein
We use BioPerl version 1.6.1 and noticed an important change to
Bio::SearchIO.  When retrieving a BLAST hit description, if there is
more than one description for the hit (as when using NCBI's nr), it
now inserts a control-A character where the old behavior was to use a
space.  This causes an error when running blast2bsml.pl.  Since \u0001
is not a valid XML character, you will receive the following error:

    Code point \u0001 is not a valid character in XML at
    line 500

Is this a known issue?  If so, is there a fix available?  I think the
simplest solution would be to substitute a space back in for the
control-A characters around line 224 in blast2bsml.pl.  Based on the
comments in BioPerl, the use of control-A in place of a space appears
to be intentional.
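
For illustration, the substitution I have in mind could look something
like the sketch below.  The helper name sanitize_description and the
sample string are hypothetical and not taken from blast2bsml.pl or
BioPerl; the idea is only to translate each control-A (\x01) back to a
space before the description is written out as XML.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not part of blast2bsml.pl or BioPerl):
# replace every control-A (\x01) in a hit description with a single space.
sub sanitize_description {
    my ($desc) = @_;
    return '' unless defined $desc;   # tolerate hits with no description
    $desc =~ tr/\x01/ /;              # translate control-A to space
    return $desc;
}

# Made-up sample: two descriptions joined by control-A.
my $raw   = "first hit description\x01second hit description";
my $clean = sanitize_description($raw);
print "$clean\n";   # prints: first hit description second hit description
```

In blast2bsml.pl this would presumably be applied to the value returned
by the hit's description call before it is passed to the BSML writer,
but I have not checked exactly which variable holds it at that point.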

Aaron Buechlein

Ergatis-users mailing list
[hidden email]