« Previous 1 2 3 4 5
Asynchronous PostgreSQL
Sophisticated
Signals in View
Line 19 installs a watcher for the database socket, which ends the main event loop in line 29 once the result of the SQL operation exists. The two signal watchers created in line 25 are more interesting. The $cancel
function, which they call, cancels the database operation and ends the event loop. Again, to avoid a race condition, the initialized signals must not occur between installing the handler and the end of the execute operation, that is, between lines 25 and 26. The two sigprocmask
calls in lines 24 and 27 ensure this.
What happens in detail when Apache sends the SIGTERM signal? If it occurs before line 24, the program just stops, but because no SQL query is running, this is not critical. Between lines 24 and 27, delivery is delayed at the kernel level. If the signal arrives in this period, the process does not see it until after line 27. Here, the signal handlers are already installed, and the SQL operation is initiated. The signal handler installed by AnyEvent
at the C level writes to eventfd
or the self-pipe and returns. Now, at least, one file descriptor, eventfd
, is ready to be read. That is, the epoll
system call used by the main event loop reports this. The signal watcher is called.
Conclusions
The pitfalls you can encounter with a simple CGI program are amazing. Fortunately, the query function from final version can be fairly easily shifted out into a module and reused. Attentive programmers should know what to look for and what the potential pitfalls are. Unfortunately, all of the work so far only solves half the problem. If the timeout does not occur on the web server, but instead the problem is sitting in front of the screen reloading the page again and again, you will very quickly have a bottleneck in the database (see "What About Performance?"). With mod_perl
instead of mod_cgi
you can quite easily identify the disappearance of the browser. The timeout on the user side could thus be detected quickly and the current SQL operation aborted, which is the next solution I'll tackle.
What About Performance?
Will using asynchronous requests cause a bottleneck? Will it slow down normal operation? To answer these questions, I wrote a small benchmark program that compares the query in Listing 6 with the following function:
sub query_sync { my $sql=pop; state $dbh||=DBI->connect('dbi:Pg:dbname=r2', 'ipp', undef, {RaiseError=>1}); my $stmt=$dbh->prepare_cached($sql); return $stmt->execute(@_), $stmt; }
If the SQL statement select 1
is used for comparison, the synchronous version is in fact much faster. On my test machine, it achieves about 2,500 operations per second, whereas the asynchronous variant only just reaches 1,000.
However, if you consider real queries that can take a while, the whole thing shifts into perspective. If select pg_sleep (0.05)
is used (i.e., an SQL query that requires 50ms on the server), the synchronous function achieves 18.6 operations per second, and asynchronous 17.9 operations per second. The difference is small, but still noticeable.
If the operation on the server takes 200ms, the synchronous version achieves 4.83 operations per second, and the asynchronous 4.90 operations per second. The difference is negligible.
Real-life reliable requests, as commonly occurs in web applications, are usually somewhere between these two values.
Infos
- DBD::Pg: http://search.cpan.org/perldoc?DBD::Pg
- PostgreSQL functions for system administration: http://www.postgresql.org/docs/8.4/static/functions-admin.html
- Boost application performance using asynchronous I/O: http://www.ibm.com/developerworks/linux/library/l-async/
- pselect(2): http://linux.die.net/man/2/pselect
- Self-pipe trick: http://lwn.net/Articles/177897/
- Linux::FD: http://search.cpan.org/perldoc?Linux::FD
- AnyEvent: http://search.cpan.org/perldoc?AnyEvent
- Libev documentation: http://search.cpan.org/perldoc?libev
- Scherbaum, Andreas, PostgreSQL . Open Source Press, 2009
« Previous 1 2 3 4 5