Tech Musings

Wednesday, March 28, 2007

Backing Up OS 10.3.9 Server Using Rsync +hfsmode

I found some time to better document the process I've been using to mirror data from my OS X Server (which I primarily use for Web services) across the network to a box residing in another physical location. I regularly use Bombich's CCC psync feature to clone local drives, but I prefer to use remote network backups whenever possible. I'm using an "HFS+ aware" version of Rsync to mirror data from my OS X 10.3.x Server (source) over the network to a different box (target) daily through cron jobs.

Step 1:
Originally, I was going to use RsyncX for network backups but decided against it after reading this article on afp548. RsyncX sounds a little buggy to me. Instead, I download and install Andrew Reynhout's patched binary version of rsync, which addresses the HFS+ resource fork problem. I'm not too worried about potential lchown problems that could occur when copying symbolic links from one machine to another, mainly because I'm only syncing data files and not full directory architectures. Thus, I usually skip the Hoffman patch discussed in the article (plus I'm too dumb and lazy to figure out how to utilize it).

My hope is that there will be no need to go through this rigmarole and use this patched version after Apple finally gets their act together and includes a workable version of rsync bundled with the OS. In fact, Apple did introduce an "HFS aware" version of rsync in 10.4 Tiger, but I read somewhere that it's crap. Maybe Apple engineers will get on the ball and improve it when Leopard Server rolls out.

I install Reynhout's 2.6.3 version of rsync (named rsync+hfsmode) on BOTH my source and target machines. To install, download and mount the binary .dmg (linked above) and type the commands listed below into the Terminal window. These commands replace the stock rsync in /usr/bin with the rsync+hfsmode version on both boxes. Truthfully, I'm still a little fuzzy as to which box rsync actually runs on when backups take place (nice, huh?). Since rsync over SSH starts an rsync process on the remote machine as well as the local one, I always install it on both machines, which also covers my stupidity.

There's no installer package for rsync+hfsmode. Instead, make a backup of your current rsync program and then put the newer, "better" version from the disk image in its place.

$ sudo mv /usr/bin/rsync /usr/bin/rsync-apple
$ sudo cp /Volumes/rsync+hfsmode/rsync-2.6.3+hfsmode-1.2b2 /usr/bin/rsync
$ sudo chown root:wheel /usr/bin/rsync
$ sudo chmod 755 /usr/bin/rsync


Step 2: Next, prep for SSH transfers between machines using authorized keys as explained in the AFP548 article above. I generate a public/private DSA key pair on the source (OS X Server) under my user account with no passphrase, then copy the public key over to my target box. This allows the user account on my OS X Server (source) to authenticate to the target box without typing a password in the Terminal. I couldn't run this network backup as an unattended cron job without this key-based authentication. The instructions for doing this are in the aforementioned article on afp548 under "Setting Up SSH."

Here's a screen shot showing the commands for the generation of the key pairs in the terminal window:


Generating public and private dsa key pair in Terminal Window
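
In text form, the key setup can be sketched roughly as follows. This is a minimal sketch under assumptions, not the exact commands from the AFP548 article: the function names and file paths are hypothetical, and DSA appears only because that is the key type used in this post (newer OpenSSH releases disable or no longer support DSA, so a different key type may be needed there).

```shell
#!/bin/sh
# Sketch only: function names and paths are illustrative assumptions.

# Create a key pair with an empty passphrase so cron can use it unattended.
# $1 = key type (this post uses dsa), $2 = private key file to create
make_keypair() {
    ssh-keygen -q -t "$1" -N "" -f "$2"
}

# Append a public key to an authorized_keys file, once, with strict permissions.
# $1 = public key file, $2 = authorized_keys file of the account on the target
authorize_key() {
    pub_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # Only add the key if that exact line is not already present
    grep -qxF -e "$(cat "$pub_file")" "$auth_file" || cat "$pub_file" >> "$auth_file"
    chmod 600 "$auth_file"
}
```

On the source box this would be something like make_keypair dsa "$HOME/.ssh/id_dsa". After copying id_dsa.pub to the target, authorize_key id_dsa.pub "$HOME/.ssh/authorized_keys" would be run there under the account that receives the backups.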


Step 3: The next step is to implement Bombich's rsync wrapper shell script on my target box. Bombich devotes an entire page to rsync backups on his Web site. His instructions for setting up the public/private ssh keys were confusing to me because of his use of the words server and client. I feel like his instructions are backwards from the way I do it. Anyway, his wrapper adds a layer of security by limiting what the user logging in from my OS X Server can do to the functionality of the rsync script. You add the following line, followed by a space, to the beginning of the key in the authorized_keys file on the target. Use the vi editor rather than pico to make the edit, because the key has to stay on a single line and pico can introduce line breaks.

command="/private/etc/rsync-wrapper.sh"
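
To illustrate how a forced command like this restricts a key, here is a minimal sketch. It is not Bombich's actual rsync-wrapper.sh; the function names and the exact list of rejected characters are my own assumptions. The idea is that sshd runs the wrapper instead of whatever the client asked for, and hands the wrapper the requested command in the SSH_ORIGINAL_COMMAND variable.

```shell
#!/bin/sh
# Sketch of a forced-command check; not Bombich's actual rsync-wrapper.sh.

# Return 0 only if the requested command looks like a plain rsync server call.
is_allowed_command() {
    case "$1" in
        # Reject anything containing shell metacharacters
        *'&'*|*';'*|*'|'*|*'`'*|*'$'*|*'<'*|*'>'*) return 1 ;;
        # This is how an rsync client starts rsync on the remote end over ssh
        "rsync --server"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print an authorized_keys entry with the forced command in front of the key.
# $1 = wrapper path, $2 = public key line
forced_key_line() {
    printf 'command="%s" %s\n' "$1" "$2"
}

# Inside a wrapper, the check would be used roughly like this:
#   is_allowed_command "$SSH_ORIGINAL_COMMAND" && exec $SSH_ORIGINAL_COMMAND
```

With a wrapper along these lines, the key can start an rsync transfer but cannot open an interactive shell or run other commands on the target.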

Step 4: I searched for and found a decent rsync shell script that was originally created by Art Mulder which includes log rotations and email notifications. I modify it to suit my purposes including adding the appledouble flag to utilize the HFS fix. The source and destination directories targeted for backup are identified as variables $SOURCE and $DEST in the script in the screen shot.

rsync -e ssh --archive --update --delete --verbose --hfs-mode=appledouble "$SOURCE" "$LOGIN:$DEST" | tee -a "$LOG"
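
For context, here is a rough sketch of how a script might wrap that command with simple log rotation. It is not Art Mulder's script: the paths, the login, the function names and the number of logs kept are all placeholder assumptions, and email notification is left out.

```shell
#!/bin/sh
# Sketch only; not Art Mulder's script. All values below are placeholders.
SOURCE="/Library/WebServer/Documents/"
DEST="/Volumes/Backup/Documents/"
LOGIN="backupuser@target.example.com"
LOG="/var/log/rsync-backup.log"

# Keep a few old logs: LOG.2 -> LOG.3, LOG.1 -> LOG.2, LOG -> LOG.1
# $1 = log file, $2 = how many old logs to keep (default 3)
rotate_log() {
    log=$1
    keep=${2:-3}
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        [ -f "$log.$prev" ] && mv -f "$log.$prev" "$log.$i"
        i=$prev
    done
    [ -f "$log" ] && mv -f "$log" "$log.1"
    : > "$log"   # start a fresh, empty log
}

# Rotate the log, then mirror SOURCE to DEST on the target over ssh
run_backup() {
    rotate_log "$LOG"
    rsync -e ssh --archive --update --delete --verbose \
        --hfs-mode=appledouble "$SOURCE" "$LOGIN:$DEST" | tee -a "$LOG"
}
```

Calling run_backup at the end of the script would perform one backup pass.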




Step 5: After I test to make sure everything works, I add the shell script to the root account's crontab (type crontab -e in the Terminal window) on my source OS X Server box.
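
As an example of what the crontab entry could look like, here is a small sketch. The schedule (2:30 a.m. every day) and the script path are hypothetical, as is the helper function, which is just a non-interactive alternative to crontab -e.

```shell
#!/bin/sh
# Hypothetical schedule and script path: every day at 2:30 a.m.
# Fields: minute hour day-of-month month day-of-week command
CRON_ENTRY='30 2 * * * /usr/local/bin/rsync-backup.sh'

# Add the entry to the current user's crontab unless it is already there.
install_cron_entry() {
    current=$(crontab -l 2>/dev/null)
    case "$current" in
        *"$CRON_ENTRY"*) return 0 ;;   # already scheduled
    esac
    { [ -n "$current" ] && printf '%s\n' "$current"; printf '%s\n' "$CRON_ENTRY"; } | crontab -
}
```

Run as root (for example through sudo), install_cron_entry would schedule the backup in root's crontab.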

Wednesday, March 21, 2007

OS X (10.3.9) Server and CGI with Virtual Hosts

OS X Server's initial set up is designed for webmasters to house all cgi scripts in /Library/WebServer/CGI-Executables/. This is fine, secure and works well unless your server plays host to multiple sites (virtual hosting). In this situation, it might behoove each site to have its own individual cgi bin (i.e. multiple cgi bins on the server) in order to execute scripts unique to each site's own environment. It took me half a morning to figure out how to configure OS X Server to function in this capacity, but I finally prevailed and here is how I did it.

To start, OS X Server includes the directive ScriptAlias /cgi-bin/ "/Library/WebServer/CGI-Executables/" in the main httpd.conf file (/etc/httpd/httpd.conf). It maps every request whose URL path begins with /cgi-bin/, on any site, to /Library/WebServer/CGI-Executables/. Consequently, I commented out this directive in the main httpd.conf file so each virtual site could have its own unique cgi-bin directory.

#ScriptAlias /cgi-bin/ "/Library/WebServer/CGI-Executables/"
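
If you would rather not make that edit by hand, it can be sketched as a small shell function that keeps a backup copy first. This is only an illustration: the function name and the .bak suffix are my own choices, and on a live server it would need to run with root privileges.

```shell
#!/bin/sh
# Comment out the global /cgi-bin/ ScriptAlias in a config file, keeping a backup.
# $1 = path to httpd.conf (on this server: /etc/httpd/httpd.conf)
disable_global_cgi_alias() {
    conf=$1
    cp -p "$conf" "$conf.bak" || return 1
    # Prefix the uncommented directive with '#'; all other lines pass through
    sed 's|^\([[:space:]]*\)\(ScriptAlias[[:space:]][[:space:]]*/cgi-bin/\)|\1#\2|' \
        "$conf.bak" > "$conf"
}
```

After a change like this, apachectl configtest is a cheap way to catch syntax errors before restarting Apache.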

Then, in Server Settings I checked the box to enable CGI Execution (under Options) for each virtual site. This added the +ExecCGI option to each site's host configuration file in /etc/httpd/sites/.

[Screenshot: Enabling CGI Execution in OS 10.3 Server Settings]

<Directory "/Library/WebServer/Documents/site_1">
Options All +MultiViews -Indexes -Includes +ExecCGI
<IfModule mod_dav.c>
DAV Off
</IfModule>
AllowOverride All AuthConfig
</Directory>

I thought this would do it, but soon discovered there was one more tiny little step. Between the directory tags inside the site config file, I needed to insert AddHandler cgi-script .cgi. I also added the .pl extension to the end of the line so scripts with a .pl (Perl) extension would execute in addition to scripts with a .cgi extension. Basically, adding this line tells Apache that a .cgi or .pl script can be executed anywhere in the site, which could be deemed a security risk without careful consideration.

<Directory "/Library/WebServer/Documents/site_1">
Options All +MultiViews -Indexes -Includes +ExecCGI
<IfModule mod_dav.c>
DAV Off
</IfModule>
AllowOverride All AuthConfig
AddHandler cgi-script .cgi .pl
</Directory>
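
A quick way to check that the handler works is to drop a tiny test script into the site. The one below is a minimal sketch; the file name test.cgi mentioned afterwards is a hypothetical example.

```shell
#!/bin/sh
# Minimal test CGI script (sketch). A CGI response is a header block,
# then a blank line, then the body. Apache sets SERVER_NAME for CGI scripts.
cgi_response() {
    printf 'Content-Type: text/plain\n\n'
    printf 'CGI is working on %s\n' "${SERVER_NAME:-unknown host}"
}

cgi_response
```

Saved as test.cgi inside the site's folder and made executable with chmod 755, it should return one line of plain text when requested in a browser. If the script's source shows up instead, the AddHandler line is not being applied to that directory.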

Monday, March 12, 2007

Virtual Hosts, DNS, Domain Names and OS X Server

Setting up virtual hosts with DNS on OS X Server was an intimidating prospect. I don't have a sandbox with an outside IP address to test on, so all experimentation related to virtual hosting would need to take place on my production server. Scary. And THIS is why I didn't know beans about this subject... until now!

Over the years I've purchased domain names and tied them into Web sites I've housed with hosting services, but I was not fully versed in the details of how this would work when everything actually resided on my own server. In other words, I was somewhat mystified by DNS and virtual hosting. Also, I've read a multitude of articles on how to set up virtual hosting on Apache, but many seemed overly verbose and difficult to apply to the OS X Server environment and its kludgy (klugy?) GUI.

Apple implemented its own flavor of Apache with OS X Server which has caused me many a headache over the years. Yet, to my chagrin, I don't have easy physical access and control to re-install a different OS on my Web server... if I did you can bet the house I would choose something other than OS X Server to run my Web services!

My other consideration was SEO, or search engine optimization. I have a multitude of domain names I want to tie into my site, but I don't want to be penalized by Google or the other big players for promoting redundant content. Thus, I wanted to make sure I set up aliasing correctly so search engines would not find duplicate versions of my site.

Now, there are a few different ways to set up virtual hosting on OS 10.3+ server. I originally contemplated trying it with virtual aliasing, which was described by kiddailey in this post on oreillynet.com. But he didn't reference the Server Settings GUI, which gave me pause.

So, I started by posting a general probing question in the Apple discussion forum. Part of the problem was that I wasn't exactly sure how to ask for help because I didn't know enough to know what to ask!! This is evident in the discussion thread. I wanted to use the Server Settings GUI to set it up because I was afraid to muck around in the Apache config files for fear of breaking the production server beyond repair.

Based on the feedback I received, I decided to forge ahead and use the GUI to add a second site. My greatest fears were realized as my main site went offline instantly. Nice. I immediately posted a cry for help on the Apple board, again. As soon as I hit the submit button I started reading another discussion which helped me finally get my arms around how Apache handles virtual hosting. Conceptually, I was under the impression Apache needed to host each virtual site on its own port number or designated IP address. How could two sites share the same IP and port number? Well, they can, because Apache sorts incoming requests (i.e. traffic) according to what it's instructed to do in the configuration files. This is a pretty good article written in easy-to-understand language that explains the process: http://www.apacheweek.com/features/vhost

Based on what I read in the second discussion, I came to the understanding that ALL sites on OS X Server are set up as virtual hosts. In fact, each site that resides on a box has its own site configuration file located in /etc/httpd/sites. When you create a new site and enable it in Server Settings under Web Services, a new configuration file for that site is generated. It will probably look something like this 0001_214.43.212.41_80_photography.us.com.conf. Any directive featured in the main httpd.conf file can be overruled by directives in each site's individual configuration file. All sites can share the same port number because Apache will listen for an incoming request on a domain name and route that request to the appropriate site to fetch its content. So, when a request comes in on http://photography.us.com/page_1.html, Apache will actually recognize (i.e. read) the domain name (photography.us.com) in the headers sent by the browser and route the request on the server to the appropriate directory according to what it has been told in its config files. I'm beginning to truly appreciate how smart this piece of software really is!!
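
To make the "headers sent by the browser" part concrete, here is a small sketch that prints the kind of raw request a browser sends. The function name is made up; the host name, page and IP address are the examples used in this post.

```shell
#!/bin/sh
# Print the raw HTTP/1.1 request a browser would send (sketch).
# The Host header is what lets Apache pick the right virtual host.
# $1 = host name, $2 = path
http_request() {
    printf 'GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$2" "$1"
}
```

Piping it to the server, for example http_request photography.us.com /page_1.html | nc 214.43.212.41 80, shows the response for that site. Sending a different host name to the same IP address and port should return the other site's content.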

In theory, both sites should be able to be assigned to the same port, but this wasn't working for me. When I assigned my second site port 8080, both sites finally came online. However, after coming across a domain management service called EditDNS and entering the IP addresses for my server, I was stuck. Where do I input the port number 8080 in the A record for my domain name? I tried using an AAAA record but was told these are only for IPv6 addresses, and I had an IPv4 address (reference this post on nerdie nets). You can't include port numbers in A records. Sheee-it.

It turns out that Performance Cache was turned on in Server Settings, which was causing all of my problems. After unchecking this box and assigning both sites to port 80, it worked.

[Screenshot: Unchecking the performance cache in OS X Server Settings for virtual hosting]

I also needed to remove the quotes around the Include /etc/httpd/sites/*conf in httpd.conf, which, for some ridiculous reason, was not automatically done when I added another site using the Server Settings GUI. DAMN YOU APPLE! So much of my frustration has stemmed from OS X Server and its damn GUI!!!

[Screenshot: Remove the quotes around this Include statement in httpd.conf for virtual hosts to work in OS X Server]

The final piece of the puzzle for me was to properly set up domain aliases and 301 redirects. I added all my additional domain names in the Web Server Aliases window under the "Aliases" button, separated by hard returns, then added 301 permanent redirects in the .htaccess file. For example, if you want the www version of your domain name to resolve to the non-www version, you add the www version in the server alias window and then enter a 301 redirect in the .htaccess file.

[Screenshot: Adding additional domain names into the Sites Aliases window of Server Settings]

Entries in my site's .htaccess file:

Options +FollowSymLinks
RewriteEngine on

# www version of the main domain -> non-www version
RewriteCond %{HTTP_HOST} ^www\.photography\.us\.com$
RewriteRule (.*) http://photography.us.com/$1 [R=permanent,L]

# Additional domain names, with and without www -> main domain
# (in .htaccess the matched path has no leading slash, so the target adds one)
RewriteCond %{HTTP_HOST} ^www\.san-diego-photography\.org$
RewriteRule (.*) http://photography.us.com/$1 [R=permanent,L]

RewriteCond %{HTTP_HOST} ^san-diego-photography\.org$
RewriteRule (.*) http://photography.us.com/$1 [R=permanent,L]

RewriteCond %{HTTP_HOST} ^www\.san-diego-photographs\.com$
RewriteRule (.*) http://photography.us.com/$1 [R=permanent,L]

RewriteCond %{HTTP_HOST} ^san-diego-photographs\.com$
RewriteRule (.*) http://photography.us.com/$1 [R=permanent,L]
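
To check that the redirects behave as intended, the response headers can be inspected from the command line. The helper below is a sketch with a made-up name: it reads HTTP response headers on standard input and prints the status code and the redirect target.

```shell
#!/bin/sh
# Read HTTP response headers on stdin and print "<status> <Location>" (sketch).
redirect_info() {
    tr -d '\r' | awk '
        NR == 1 { status = $2 }                      # e.g. HTTP/1.1 301 ...
        tolower($1) == "location:" { target = $2 }   # redirect target, if any
        END { print status, target }'
}
```

For example, curl -sI http://www.photography.us.com/page_1.html | redirect_info should print 301 followed by the non-www URL of the same page if the rules are working.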