GBROWSE/CHADO/PostgreSQL problem: timeout (alarm clock)

GBROWSE/CHADO/PostgreSQL problem: timeout (alarm clock)

lpritc@scri.ac.uk
Hi,

I've got two near-identical GBROWSE/CHADO/PostgreSQL setups, one on a VM
(for testing) and the intended production setup; they are identical in terms
of data, and software configuration, but I'm getting inconsistent results in
the following odd circumstances.

If I enter a query term in Landmark or Region that doesn't have a match in
the database, I get a corresponding error message on the VM (fig1.png), but
entering the same query on the server proper reports a timeout (fig2.png).

The error that shows in /var/log/httpd/error_log is:

[Fri May 28 09:58:00 2010] [error] [client 143.234.97.119] alarm clock at (eval 222) line 1., referer: http://ppserver/cgi-bin/gb2/gbrowse/p_infestans_reference/?name=id:11002;dbid=general
[Fri May 28 09:58:00 2010] [error] [client 143.234.97.119] The search timed out; try a more specific search, referer: http://ppserver/cgi-bin/gb2/gbrowse/p_infestans_reference/?name=id:11002;dbid=general

This is an 'alarm clock' error, which I think is coming from PostgreSQL,
as an strace indicates that a SIGALRM occurs, though I can't tell why.
There's no corresponding error in the VM httpd server log.

The alarm clock error also occurs on the server, but not on the VM, when a
query returns more than around 2000 features.

Has anyone on the list seen this error before, or can you provide any advice
on how to avoid/fix it?

Cheers,

L.



--
Dr Leighton Pritchard MRSC
D131, Plant Pathology Programme, SCRI
Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA
e:[hidden email]       w:http://www.scri.ac.uk/staff/leightonpritchard
gpg/pgp: 0xFEFC205C       tel:+44(0)1382 562731 x2405



______________________________________________________
SCRI, Invergowrie, Dundee, DD2 5DA.  
The Scottish Crop Research Institute is a charitable company limited by guarantee.
Registered in Scotland No: SC 29367.
Recognised by the Inland Revenue as a Scottish Charity No: SC 006662.


DISCLAIMER:

This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries.  This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed.  It may not be disclosed or used by any other than that addressee.
If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify [hidden email] quoting the name of the sender and delete the email from your system.

Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any).
______________________________________________________
------------------------------------------------------------------------------


_______________________________________________
Gmod-schema mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gmod-schema

fig1.png (31K) Download Attachment
fig2.png (54K) Download Attachment
Re: [Gmod-gbrowse] GBROWSE/CHADO/PostgreSQL problem: timeout (alarm clock)

Lincoln Stein
If you are using different versions of GBrowse, then the handling of long-running SQL queries differs. It looks like in the first case, GBrowse waited the full time for the SQL query to come back empty. In the second case, GBrowse's internal timeout alarm clock went off while the query was pending and it reported a timeout error.

If there is no difference in GBrowse versions, then my guess is that you have a difference in Perl versions. Earlier versions will allow alarm signals to interrupt pending IO processes (such as waiting on a socket for a SQL result), while later versions automatically restart the IO process, which is very annoying to me.
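
The alarm-based timeout described above can be sketched roughly as follows. This is an illustration in Python rather than Perl, and it is not GBrowse's actual code: the names `run_with_timeout` and `SearchTimeout` are hypothetical, and the sketch assumes a Unix system with the call made from the main thread. It shows the two behaviours being contrasted: the pending call is only cut short if the signal handler raises, otherwise the interrupted call is simply resumed.

```python
import signal
import time


class SearchTimeout(Exception):
    """Raised when the wrapped call outlives its time budget."""


def run_with_timeout(func, seconds):
    """Run func(); raise SearchTimeout if it takes longer than `seconds`."""

    def on_alarm(signum, frame):
        # Raising here is what interrupts the blocking call. A handler that
        # returned normally would let Python resume the interrupted call.
        raise SearchTimeout(f"timed out after {seconds}s")

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # arm the alarm clock
    try:
        return func()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)   # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)   # restore the old handler


if __name__ == "__main__":
    try:
        # time.sleep stands in for waiting on a socket for a SQL result.
        run_with_timeout(lambda: time.sleep(2), 0.5)
    except SearchTimeout as exc:
        print("search gave up:", exc)
```

A call that finishes inside the budget just returns its value; the alarm is cancelled in the `finally` block either way.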

I guess the question is why are your landmark search queries taking so long to run?

Lincoln

--
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <[hidden email]>


Re: [Gmod-gbrowse] GBROWSE/CHADO/PostgreSQL problem: timeout (alarm clock)

lpritc@scri.ac.uk
Hi,

On 28/05/2010 Friday, May 28, 16:58, "Lincoln Stein"
<[hidden email]> wrote:

> If you are using different versions of GBrowse, then the handling of
> long-running SQL queries differs. It looks like in the first case, GBrowse
> waited the full time for the SQL query to come back empty. In the second
> case, GBrowse's internal timeout alarm clock went off while the query was
> pending and it reported a timeout error.
>
> If there is no difference in GBrowse versions, then my guess is that you
> have a difference in Perl versions. Earlier versions will allow alarm signals
> to interrupt pending IO processes (such as waiting on a socket for a SQL
> result), while later versions automatically restart the IO process, which is
> very annoying to me.

Both servers are running the same (up to date) version of CentOS 5 with the
same packaged Perl.  It's most likely then that the GBROWSE versions are out
of sync.  

> I guess the question is why are your landmark search queries taking so long
> to run?

I don't know why they should take quite so long. I still haven't got my
head around what's going on in the CHADO adaptor (and I speak Perl about as
well as I speak French, just enough to get home ;) ). Looking at the
PostgreSQL statement log and the DEBUG warnings, many queries appear to be
repeated several times, but I'm assuming that they need to be for the
purposes of the query.

After making some changes to my full-text search code in the light of
needing to find source features by alias, the hardware server implementation
now reports that it doesn't find nonexistent queries, rather than timing out
(recent development in last 30min).  Those full-text searches must have been
pushing the query time over that allowed.

However, my go-to query for returning large numbers of rows: "hypothetical
protein" still reports a timeout on that server.  The key query itself:

oomycete_reference=> SELECT COUNT(DISTINCT(feature_id)) FROM mv_all_feature_names WHERE searchable_name @@ to_tsquery('hypothetical & protein');
 count
-------
 10958
(1 row)

Time: 20.167 ms

So the count (11000 or so matching features) comes back in a very
reasonable time.  Forcing the query to return a maximum of 1000 rows by
hard-coding a LIMIT in the Chado adaptor (round about line 1058):

  my $query = $select_part . ' FROM ' . $from_part . $where_part . " LIMIT 1000";

  warn "first get_feature_by_name query:$query" if DEBUG;

  $sth = $self->dbh->prepare($query);

returned (after some time) the first 1000 of the matching features, using
the queries as reported in the httpd error_log:

[Fri May 28 17:12:42 2010] [error] [client 143.234.97.119] first
get_feature_by_name query:select f.feature_id  FROM  feature f where
lower(f.name) = ?  AND f.organism_id =19 LIMIT 1000 at
/usr/lib/perl5/site_perl/5.8.8/Bio/DB/Das/Chado.pm line 1060., referer:
http://ppserver/cgi-bin/gb2/gbrowse/p_infestans_reference/?name=PF00892
[Fri May 28 17:12:42 2010] [error] [client 143.234.97.119]  FROM
mv_all_feature_names afn where afn.searchable_name @@ to_tsquery(?)  AND
afn.organism_id =19 LIMIT 1000 at
/usr/lib/perl5/site_perl/5.8.8/Bio/DB/Das/Chado.pm line 1060., referer:
http://ppserver/cgi-bin/gb2/gbrowse/p_infestans_reference/?name=PF00892

So I guess it's possible that the long query time could be to do with the
way in which the adaptor, once it has retrieved the feature_ids, then goes
on to get the locations of each of the many features - perhaps this is too
slow for GBROWSE?
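
To illustrate the suspicion above (one lookup per feature after the ids are retrieved), here is a minimal sketch comparing per-feature location queries with a single joined query. It is an assumption about the access pattern, not a reading of what the Chado adaptor actually does. It uses an in-memory SQLite database as a stand-in for PostgreSQL, with heavily simplified `feature` and `featureloc` tables; the function names and demo data are hypothetical.

```python
import sqlite3


def build_demo_db(n_features=2000):
    """Create an in-memory stand-in with simplified feature/featureloc tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE featureloc (feature_id INTEGER, fmin INTEGER, fmax INTEGER)"
    )
    conn.executemany(
        "INSERT INTO feature VALUES (?, ?)",
        [(i, "hypothetical protein") for i in range(n_features)],
    )
    conn.executemany(
        "INSERT INTO featureloc VALUES (?, ?, ?)",
        [(i, i * 100, i * 100 + 50) for i in range(n_features)],
    )
    conn.execute("CREATE INDEX featureloc_fid ON featureloc (feature_id)")
    return conn


def locations_one_by_one(conn, name):
    """One query for the ids, then one more query per matching feature."""
    ids = [row[0] for row in conn.execute(
        "SELECT feature_id FROM feature WHERE name = ? ORDER BY feature_id",
        (name,))]
    locations = []
    for feature_id in ids:
        locations.extend(conn.execute(
            "SELECT feature_id, fmin, fmax FROM featureloc WHERE feature_id = ?",
            (feature_id,)))
    return locations


def locations_joined(conn, name):
    """The same rows fetched with a single joined query."""
    return list(conn.execute(
        "SELECT l.feature_id, l.fmin, l.fmax "
        "FROM feature f JOIN featureloc l ON l.feature_id = f.feature_id "
        "WHERE f.name = ? ORDER BY l.feature_id",
        (name,)))


if __name__ == "__main__":
    conn = build_demo_db()
    per_feature = locations_one_by_one(conn, "hypothetical protein")
    joined = locations_joined(conn, "hypothetical protein")
    print(len(per_feature), per_feature == joined)
```

Both functions return the same rows; the difference is the number of round trips, which for about 11000 features against a networked database could plausibly add up to more than a 15-second search_timeout.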

The GBROWSE settings were the same in the .conf file on each server:

# Performance settings
renderfarm             = 1
slave_timeout          = 45
global_timeout         = 60
search_timeout         = 15
max_render_processes   = 4   # try double number of CPU/cores

The VM implementation returned all 11000 features, but the hardware server
was capping me at about 1000 features before timeout, after a couple of
minutes' wait.  

I've raised search_timeout on the hardware server to 300, and now
(eventually) it reports the correct number of matches.  I feel quite daft
for not trying that already, but thank you for the nudge towards it.

So - problem sort of solved, though there are clearly still things to be
optimised, but at least it's now functioning in the way I want it.

Cheers,

L.

