
<head>
<title>Introduction to SpeedyCGI</title>
</head>
<center>
<h1>Introduction to SpeedyCGI</h1>
Sam Horrocks
<br>
&nbsp;<br>
Presented at YAPC North America<br>
14 June 2001
</center>
<h2>Overview</h2>
This paper will introduce SpeedyCGI, a persistent perl interpreter, and
will cover the following topics:
<ul>
    <li> What is SpeedyCGI?
    <li> Why was SpeedyCGI created?
    <li> How do I use SpeedyCGI?
    <li> Why would I want to use SpeedyCGI instead of normal perl?
    <li> How does SpeedyCGI compare to other persistent perl environments like
         mod_perl and FastCGI?
    <li> How does SpeedyCGI work?
    <li> Future directions for SpeedyCGI
</ul>

<h2>What is SpeedyCGI?</h2>

SpeedyCGI is a persistent perl interpreter.  In traditional perl, when
you run a script, a new process is created, your script
is compiled and executed, and then the perl process exits.  If the
same script is run again, all these steps are repeated.

<p>
SpeedyCGI behaves a little differently.  Just as in regular perl, the
first time a perl script is run, a new process is created and
the script is compiled and executed.  With SpeedyCGI, however, the perl
process is kept alive at this point instead of exiting.  If the
same script is run again, the retained process can execute it right away,
without re-reading and re-compiling the script.

<h2>Why was SpeedyCGI created?</h2>
SpeedyCGI was created because a solution was needed that would:
<ul>
<li> make perl CGI apps run much faster
<li> work completely outside the web server
<li> not require a lot of administration and tuning
<li> also speed up regular perl code, not just web-based apps
<li> be freely available
<li> allow code to be written that looks like a normal CGI app and can
     still be run when only regular perl is available
</ul>
There wasn't anything at the time that would meet all these requirements,
so SpeedyCGI was created.


<h2>How do I use SpeedyCGI?</h2>

The simplest way is to change the <tt>#!/usr/bin/perl</tt> line at the top of
your script to <tt>#!/usr/bin/speedy</tt>.  For higher performance, you
can also run SpeedyCGI via an Apache module (mod_speedycgi).
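
<p>
As a minimal sketch, the script below is an ordinary CGI-style program
whose only SpeedyCGI-specific part is the first line.  The variable name
<tt>$runs</tt> and the output format are made up for this illustration; they
are not part of SpeedyCGI.
<pre>    #!/usr/bin/speedy
    use strict;
    use warnings;

    # Package global: it keeps its value for as long as the same
    # persistent perl process keeps handling this script.
    our $runs;
    $runs++;

    print "Content-type: text/plain\n\n";
    print "run number $runs in process $$\n";</pre>
Run repeatedly under regular perl, this always reports run number 1.  Under
SpeedyCGI the number should climb whenever the same retained process handles
the request.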

<p>
Unfortunately, not all code works
correctly when run persistently, so you may also have to clean
up your code to make it work.  Using <tt>strict</tt> and the
<tt>-w</tt> switch will help you find and fix a lot of these problems.
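
<p>
One common kind of problem is code that assumes a global variable starts out
empty on every run.  The sketch below simulates two runs inside one process
by calling a subroutine twice; the names <tt>@seen</tt>,
<tt>run_without_reset</tt> and <tt>run_with_reset</tt> are hypothetical.
<pre>    use strict;
    use warnings;

    our @seen;   # package global: not emptied between runs in a persistent process

    # Relies on @seen being empty at the start of each run.
    sub run_without_reset {
        push @seen, 'item';
        return scalar @seen;
    }

    # Same body, but per-run state is initialized explicitly.
    sub run_with_reset {
        @seen = ();
        push @seen, 'item';
        return scalar @seen;
    }

    # Two calls stand in for two runs handled by the same process.
    my @leaky = (run_without_reset(), run_without_reset());
    @seen = ();
    my @clean = (run_with_reset(), run_with_reset());

    print "without reset: @leaky\n";
    print "with reset:    @clean\n";</pre>
This prints "without reset: 1 2" and "with reset:    1 1".  Under regular
perl each run starts a fresh process, so the first version would appear to
work there.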

<p>

<h2>Why would I use SpeedyCGI instead of normal perl?</h2>

Performance.  Under SpeedyCGI, if you run the same perl code over and
over, the script doesn't have to be compiled each time it is run.  This
means less CPU time is required for each run.

<p>
In addition, once your code is running persistently, it's possible to
speed up execution even further by caching data or objects in
global variables.  In SpeedyCGI, global variables retain their values between
runs of your script.  You can take advantage of this fact to cache
things like database handles or other resources instead of having to
re-initialize them each time the script is run.

<p>
For example, if you have a subroutine "get_db_handle" that returns a
database handle, the following code will cache this handle in a persistent
global variable:
<pre>    use vars '$dbh';
    $dbh ||= get_db_handle();</pre>
The first time this code is run, <tt>$dbh</tt> will be undefined, so the <tt>||=</tt>
operator will cause <tt>get_db_handle</tt> to be called.  On subsequent
runs of the script, <tt>$dbh</tt> will already be defined (it will already
hold the database handle) and <tt>get_db_handle</tt> will not be called.
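
<p>
The effect can be checked without a database.  In the sketch below,
<tt>get_db_handle</tt> is a stand-in that returns a plain hash reference and
counts its own calls, and <tt>run_script</tt> is a hypothetical wrapper that
represents one run of the script body.
<pre>    use strict;
    use warnings;

    our $dbh;            # persistent global, as in the snippet above
    my $connects = 0;    # counts how often the expensive setup runs

    # Stand-in for a real connection routine; no database is involved.
    sub get_db_handle {
        $connects++;
        return { connected_at => time };
    }

    # One "run" of the script body.
    sub run_script {
        $dbh ||= get_db_handle();
        return $dbh;
    }

    my $first  = run_script();
    my $second = run_script();

    print "connect calls: $connects\n";
    print "same handle reused: ", ($first == $second ? "yes" : "no"), "\n";</pre>
The second call finds <tt>$dbh</tt> already set, so the output is
"connect calls: 1" and "same handle reused: yes".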

<h2>Comparison to other persistent perl interpreters</h2>

There are other persistent perl environments around.
Below is a comparison between SpeedyCGI and two popular
ones: mod_perl and FastCGI.

<h3>Comparison to both mod_perl and FastCGI</h3>
<ul>
<li><h4>SpeedyCGI Advantages</h4>
<ul>
<p><li>
SpeedyCGI can be used for general purpose perl programming,
not just web-based scripts.
<p><li>
SpeedyCGI provides real files for STDIN, STDOUT and STDERR, including real unix
file descriptors, increasing compatibility with existing code.
<p><li>
SpeedyCGI always tries to use the fewest perl interpreters
possible for the given load by reusing the same interpreters over and over.
This results in fewer interpreters being used under a heavy load.
</ul>
<p>
<li><h4>SpeedyCGI Disadvantages</h4>
<ul>
<p><li>
No Win32 version yet - Unix only.
</ul>
</ul>

<h3>mod_perl Comparison</h3>
<ul>
<li><h4>SpeedyCGI Advantages</h4>
<ul>
<p><li>
The SpeedyCGI perl interpreter runs outside the web server, so bad
perl code can't affect the web server.
<p><li>
Each interpreter can be assigned to run only a single script, or only
a certain group of scripts.  This means you can keep one group of scripts
from affecting another group, or set different policies for different
groups of scripts.  In mod_perl there is no control over this:
each interpreter runs all of the scripts.
<p><li>
SpeedyCGI buffers the output from the script.  If the buffer is
large enough to hold the whole response, the perl interpreter can be
reused for another request as soon as it is done producing results, regardless
of how long it takes to send the buffered output to the client.

<p><li>
SpeedyCGI can work with any web server, not just Apache.
</ul>
<p>
<li><h4>SpeedyCGI Disadvantages</h4>
<ul>
<p><li>
mod_perl provides access to web server internals and allows for writing
request handlers which are faster than CGI.
<p><li>
mod_perl doesn't copy the output data twice; it goes directly from the
program to the client.
<p><li>
The perl interpreters in mod_perl can share pre-compiled code
because they are forked from a common base interpreter.  When this
feature is used it can mean a smaller amount of private memory used for
each perl interpreter.
</ul>
</ul>

<h3>FastCGI Comparison</h3>
<ul>
<li><h4>SpeedyCGI Advantages</h4>
<ul>
<p><li>
SpeedyCGI doesn't require that you add an "accept" loop to your code.
It requires no architectural changes to the code at all.
<p><li>
SpeedyCGI can run more than one script in each interpreter.
</ul>
<p>
<li><h4>SpeedyCGI Disadvantages</h4>
<ul>
<p><li>
FastCGI can use multiple systems to run the interpreters.  SpeedyCGI only runs
on the local system.
<p><li>
FastCGI supports languages other than perl.
</ul>
</ul>
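
<p>
To illustrate the "accept" loop point, the sketch below contrasts the two
script shapes.  It does not use the real FastCGI library:
<tt>accept_request</tt> and <tt>handle_request</tt> are hypothetical
stand-ins, and a small queue of names simulates incoming requests.
<pre>    use strict;
    use warnings;

    # The work done for one request; identical in both styles.
    sub handle_request {
        my ($name) = @_;
        return "Hello, $name\n";
    }

    # Hypothetical stand-in for a FastCGI-style accept call:
    # returns the next queued request, or undef when none are left.
    my @pending = ('alice', 'bob');
    sub accept_request { return shift @pending }

    # FastCGI style: the script itself owns the request loop.
    my @responses;
    while (defined(my $name = accept_request())) {
        push @responses, handle_request($name);
    }
    print @responses;

    # SpeedyCGI style: no loop in the script.  The body runs top to bottom
    # once per request, exactly as it would under regular perl.
    print handle_request($ENV{REMOTE_USER} || 'anonymous');</pre>
In the second style the persistent interpreter, not the script, takes care of
waiting for the next request, which is why existing CGI code needs no
restructuring.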

<h2>How does SpeedyCGI work?</h2>

When you run <tt>/usr/bin/speedy</tt>, you are not directly running a perl
interpreter.  The speedy executable is only a "frontend".  The actual
perl interpreter is contained in a different executable named
"speedy_backend".

<p>
When executed, the speedy frontend does the following:
<ol>
<p><li>
Looks for an available backend to run this script.
<p><li>
If no backends are available, starts a new one.
<p><li>
As soon as a backend is located, connects to it and
starts to send over %ENV, @ARGV and the STDIN data.
<p><li>
Brings back the STDOUT and STDERR data from the backend and sends it
to its output.
</ol>

<p>
The speedy backend does the following:
<ol>
<p><li>
Initializes the perl interpreter and compiles the perl script.
<p><li>
Waits for a frontend to contact it
<p><li>
Once a frontend contacts it, reads in and initializes
%ENV and @ARGV.
<p><li>
Sets up STDIN, STDOUT and STDERR so that they are connected
to the frontend via Unix sockets.
<p><li>
Executes the perl code.
<p><li>
Goes back and waits for another frontend to contact it.
</ol>

<p>
A data file in /tmp is used to keep track of the frontends and backends.
Once the two processes find each other, Unix sockets are used for
communication.
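
<p>
The division of labour described above can be sketched in plain perl.  This
is not SpeedyCGI's actual protocol or code: a single <tt>KEY=VALUE</tt> line
stands in for %ENV, @ARGV and STDIN, a socket at a known temporary path
replaces the lookup through the data file, and the backend serves just two
requests before exiting.  The names <tt>frontend</tt>, <tt>$script</tt> and
<tt>backend.sock</tt> are made up for the illustration.
<pre>    use strict;
    use warnings;
    use IO::Socket::UNIX;
    use File::Temp qw(tempdir);

    my $path = tempdir(CLEANUP => 1) . "/backend.sock";   # hypothetical location

    # Listen before forking so the frontend cannot connect too early.
    my $listener = IO::Socket::UNIX->new(Local => $path, Listen => 5)
        or die "listen: $!";

    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Backend: the "script" is set up once, then reused for every request.
        my $runs = 0;
        my $script = sub {
            my ($out, %env) = @_;
            $runs++;
            print {$out} "run $runs for $env{REMOTE_USER}\n";
        };
        for (1 .. 2) {   # a real backend would keep waiting for frontends
            my $conn = $listener->accept or die "accept: $!";
            chomp(my $line = readline($conn));    # stands in for %ENV
            my ($key, $value) = split /=/, $line, 2;
            $script->($conn, $key => $value);     # output goes back over the socket
            close $conn;
        }
        exit 0;
    }

    # Frontend: connect, send the environment, bring back the output.
    close $listener;
    sub frontend {
        my ($user) = @_;
        my $conn = IO::Socket::UNIX->new(Peer => $path) or die "connect: $!";
        print {$conn} "REMOTE_USER=$user\n";
        $conn->flush;
        my $reply = readline($conn);
        close $conn;
        return $reply;
    }

    my @replies = (frontend("alice"), frontend("bob"));
    waitpid($pid, 0);
    print @replies;</pre>
Both replies come from the same backend process, so the output is
"run 1 for alice" followed by "run 2 for bob": the counter inside the backend
survives between requests, just as compiled code and global variables do
under SpeedyCGI.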
<h2>Future Directions</h2>
<ul>
<p><li>
Win32 Port
<p><li>
Better buffering so that we don't start a perl interpreter until we've
received most of the STDIN data.  This should reduce the number of interpreters
needed during http post operations.
<p><li>
Fork from a common perl interpreter so we can share compiled perl code
like mod_perl does.
</ul>

<h2>More Information</h2>

For more information about SpeedyCGI see the SpeedyCGI home page at
<a href="http://daemoninc.com/SpeedyCGI/">http://daemoninc.com/SpeedyCGI/</a>