Getting Started With SCGI

This is just a quick HOWTO document showing you how to get your Ruby On Rails application running under the included SCGI server. Please don’t use this in production systems yet, as it’s still very raw code and could blow up, destroying the world. I’m not responsible if you melt a nuclear reactor to slag with this or kill an army of babies.

What Is SCGI

SCGI is an alternative protocol for a web server to run an application under the CGI system, but without starting the application over and over again. It’s like FastCGI but is much simpler to implement and simple is always a good thing. Getting it working actually turned out to be dead simple in Myriad and most of the work was spent hacking up the Ruby CGI classes to work with Rails.
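To give a feel for how simple the protocol is, here is a minimal sketch of the SCGI request framing as described in the public SCGI specification: the headers are NUL-separated name/value pairs wrapped in a netstring, with CONTENT_LENGTH first and SCGI set to 1, followed by the raw body. The method names scgi_request and parse_scgi_request are made up for this illustration and are not part of Myriad or scgi.rb.

```ruby
# Build an SCGI request: a netstring of headers followed by the raw body.
# Hypothetical helper for illustration, not the Myriad implementation.
def scgi_request(headers, body = "")
  # CONTENT_LENGTH has to be the first header and SCGI=1 must be present.
  pairs = { "CONTENT_LENGTH" => body.bytesize.to_s, "SCGI" => "1" }.merge(headers)
  payload = pairs.map { |name, value| "#{name}\0#{value}\0" }.join
  # Netstring format: "<byte length>:<payload>,"
  "#{payload.bytesize}:#{payload},#{body}"
end

# Split a complete SCGI request back into a header hash and the body.
def parse_scgi_request(data)
  length, rest = data.split(":", 2)
  size = Integer(length, 10)
  payload = rest.byteslice(0, size)
  unless rest.byteslice(size, 1) == ","
    raise ArgumentError, "netstring is missing its trailing comma"
  end
  fields = payload.split("\0", -1)
  fields.pop # drop the empty string left after the final NUL
  headers = fields.each_slice(2).to_h
  [headers, rest.byteslice(size + 1, rest.bytesize)]
end
```

A real server reads this incrementally off a socket instead of from one string, but the framing is the same.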

Warning, read the scgi_evil.rb file if you try to mix the built in scgi.rb
file with your own CGI implementation.

Running your Ruby On Rails application under the Myriad SCGI runner will make it easy for you to run your application independent of the fronting web server, and should give you great performance without a lot of the management headaches associated with FastCGI and Rails.

Install Ruby/Event

First, you have to install libevent on your system. Many systems already have this as a native package, and if you installed memcached then it is probably installed. The only restriction is that I’ve only tested with version 1.1a.

Once you’ve got libevent installed you’ll also need the cmdparse gem installed. Just do gem install cmdparse or go grab the source and install it yourself. After those two are installed you can grab the tar.bz2 and extract it. Then do:

  cd ruby_event-0.4.1
  sudo ruby setup.rb

This installs it in your system. If you don’t use sudo then you need to figure out how to install it yourself.

There's a huge problem on some systems where the damn setup.rb
script mangles the bin/scgi_rails file's permissions.  Make 
sure that you change them back to executable for all, editable
to you and make sure it installed it correctly.  Sorry, I have 
no idea why it does this.

Setup SCGI in Rails w/ lighttpd

If everything installed correctly then there should be a scgi_rails script in your path. Just change to your Rails application directory and do:

scgi_rails start -e production

It will run one child listener on 0.0.0.0:9999 for you. If you want to change this then do:

scgi_rails start -h localhost -p 9999 -c 3 -e production

And it will run on localhost port 9999 with three children listening at once. There are lots of other options, including a -D option to run the server in the foreground for testing purposes.

Configuring lighttpd

You’ll need to tell lighttpd how to connect to your SCGI server. The nice thing is that the scgi_rails script knows how to fork itself to create a cluster for you, so all you need to do is point lighttpd at one port. Here are the important parts of my configuration:

Make sure you have mod_scgi mentioned in the modules:

server.modules = ( "mod_rewrite", "mod_redirect", "mod_access", "mod_accesslog", "mod_compress", "mod_scgi" )

Tell lighttpd to route 404 errors to your SCGI server with this stupidity:

server.error-handler-404 = "/dispatch.scgi"

Next you have to tell lighttpd to route all requests for dispatch.scgi to SCGI and not to check for a local file:

scgi.server = ( "dispatch.scgi" => (( 
  "host" => "127.0.0.1",
  "port" => 9999,
  "check-local" => "disable" 
 )) )

What happens is that lighttpd checks whether the request matches a file in /public. It doesn’t find one, so it runs the 404 handler. You’ve got /dispatch.scgi as the 404 handler, which is configured to hand the request to the SCGI server at 127.0.0.1:9999 with check-local disabled. Then the bump on the frog on the log at the bottom of the ocean by the beach with the man with the tan will begin to work. Oh well, it’s much better than Apache’s configuration at least.
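If the chain of hops is hard to follow, this small Ruby model walks through the same decision. It is only a sketch of the flow described here, not anything lighttpd actually runs; the route method and both constants are hypothetical names, with values copied from the configuration above.

```ruby
# Hypothetical model of the lighttpd request flow; values mirror the config.
SCGI_BACKEND = { host: "127.0.0.1", port: 9999 }.freeze
ERROR_HANDLER_404 = "/dispatch.scgi"

def route(request_path, public_dir)
  candidate = File.join(public_dir, request_path)
  # Step 1: a real file under public/ is served directly by the web server.
  return [:static, candidate] if File.file?(candidate)
  # Step 2: no such file, so the 404 handler takes over with /dispatch.scgi.
  # Step 3: scgi.server matches that path and, because check-local is
  # disabled, forwards to the backend without looking for a file on disk.
  [:scgi, ERROR_HANDLER_404, SCGI_BACKEND]
end
```

So a request for a stylesheet that exists in public never reaches Rails, while something like /posts/1 falls through to the SCGI server.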

Then I turn on super debugging for fun:

scgi.debug=3

But make sure you turn this off in production. A common complaint people have is that lighttpd spams the lighttpd_error.log file with tons of connection messages and other stuff when scgi.debug=3 is set. Set scgi.debug=0 in production to keep the log readable.

lighttpd 1.4.3

You no longer need an empty executable dispatch.scgi file, thanks to Jan Kneschke’s configuration above, but you will have other problems with lighttpd prior to 1.4.3. The 1.4.x releases before 1.4.3 frequently segfault on exit. This version also did not work with the previously published configuration example that required the empty dispatch.scgi file.

With lighttpd 1.4.3 you can use the configuration above and delete the empty dispatch.scgi file.

Testing It All Out

You should now be able to go to the server and it should run. If it doesn’t then check the lighttpd_error.log for information on whether the backend crashed or not.
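Before digging through logs, it can save time to confirm that something is actually listening on the SCGI port. This is a small sketch using only Ruby’s standard socket library; backend_listening? is a hypothetical helper name, and the host and port you pass should be whatever you gave scgi_rails (127.0.0.1 and 9999 in the examples here).

```ruby
require "socket"

# Return true if a TCP connection to host:port succeeds within the timeout.
# This only proves a process is listening, not that Rails answers correctly.
def backend_listening?(host, port, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end
```

Calling backend_listening?("127.0.0.1", 9999) and getting false means lighttpd has nothing to talk to, so look at log/scgi.log before blaming the lighttpd configuration.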

Killing It And Status

When you ran your scgi_rails with the start command it should have created a file called children.yaml in the root of the Rails application. This file contains information on which children are running so that you can stop them later. It also logs important information to the log/scgi.log file which includes the SHA1 hash of the file’s contents before they were written. You should check the log file and jot this hash down to see if anyone else has messed with the file or done a restart since you last were there.
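Comparing the hash by hand gets old fast, so here is a minimal sketch for checking a file against a hash you noted earlier. It assumes the value in log/scgi.log is the usual hexadecimal SHA1 digest, which I have not confirmed here; file_sha1 and unchanged? are hypothetical helper names.

```ruby
require "digest/sha1"

# Hex SHA1 of the file's current contents.
def file_sha1(path)
  Digest::SHA1.file(path).hexdigest
end

# True when the file still matches the hash you wrote down earlier.
# Assumes the noted hash is a hex digest (an assumption about the log format).
def unchanged?(path, noted_hash)
  file_sha1(path) == noted_hash.strip.downcase
end
```

You would call it as unchanged?("children.yaml", hash_from_log) from the root of the Rails application.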

If you want to get status from all of the children you can just do:

scgi_rails status

And it will dump a series of status files beginning with scgi_ in the /tmp directory. Read the scgi_rails start -h output for the option to change where the status files are put. The status just lists the PID of the process and how many connections you have right now on each. Not much else. It also might take a little while if the children are really busy.

When you want to stop all of the children simply do:

scgi_rails stop

And each one will exit.

Lots of Children

You might be wondering how these “child listeners” work on a single socket. It’s actually a cool trick that makes it possible to multiplex a ton of connections onto a small number of processes that each do their own single threaded IO processing. What the scgi_rails script does is create one socket to listen for new SCGI connections, but spawns many children to listen on this socket.

The trick works because the OS wakes up all listening processes, but only one can actually get the incoming connection. This gives you a really cheap cluster off a single port and really simplifies the configuration. You only need to say how many children and which port to listen on rather than all the children.
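Here is a stripped-down sketch of that shared-socket trick using plain blocking accept calls and fork. It is not the scgi_rails code (which sits on top of Myriad and libevent), and spawn_listeners and stop_listeners are hypothetical names, but it shows several processes waiting on one listening socket with the OS handing each connection to exactly one of them. It needs a platform with fork, so no Windows.

```ruby
require "socket"

# Fork `count` children that all call accept on the same listening socket.
# Each child answers a connection with its own PID so you can see which
# process picked it up. Returns the list of child PIDs.
def spawn_listeners(server, count)
  Array.new(count) do
    fork do
      loop do
        client = server.accept # blocks until the OS gives this child a connection
        client.puts Process.pid
        client.close
      end
    end
  end
end

# Ask every child to exit, then reap them so no zombies are left behind.
def stop_listeners(pids)
  pids.each { |pid| Process.kill("TERM", pid) }
  pids.each { |pid| Process.wait(pid) }
end
```

To try it, open a TCPServer on some port, call spawn_listeners(server, 3), and connect a few times. Every reply carries the PID of one of the three children, even though only one port was ever opened.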

Before you go thinking this is a huge performance hit to wake up all the children, keep in mind that you’re only starting a few children, and that these children are handling tons of multiplexed IO all the time. They are frequently active anyway, so “waking” them all up is not really done as often as you think. In reality they all just go about their business anyway and really only accept clients when they are told by the OS to do so. If one of the processes is busy doing something more useful then it isn’t bothered.

Where this really shines is on computers that have multiple processors or hyperthreading. You’ll see your processing speed increase dramatically by only starting a few more children.

The only killer on this is that if you accidentally start many children in debugging mode then you’ll go insane with memory leaks under Rails. Rails has known memory leaks in debugging mode (not my fault). This is why the examples show running scgi_rails with the -e production option.

More Information

The source of bin/scgi_rails is pretty simple, but you have to know how the SCGIServer class was written and how the Myriad framework operates. Contact me if you have ideas for improvement.

I have also tested scgi_rails against Apache 2.0.54 and mod_scgi 1.7 and it seems to work. My configuration was pretty simple, so you’ll want to follow the mod_scgi README for apache2 and do your own tests.