PostgreSQL Warm Fail Over using Write Ahead Logs

Introduction

A typical fail over method for any application is to have two identical
machines with all data stored on a shared SAN. This falls short on ensuring the
integrity of the database. Do we know that the database was properly shutdown
on the primary node before failing over? If not, data loss can occurs and the
only recovery method will be restoring from backup. This method also
introduces a single point of failure for the database in the shared storage.

Luckily, postgres provides a transaction logging method known as write
ahead logs (WAL). We can exploit these feature to create a live backup
of the production system on the fail over node. We’ll do this by
having the primary node copy it’s wal logs to the secondary node. The
secondary node will be in continuous recovery mode, reading the wal logs
and applying them to it’s database.

Pre-requisites

Before we begin, a few requirements:
1. Both machines must be the same postgresql version
This document assumes postgresql 8.x. Different solutions are available for postgresql 7.x and 9.x.
2. Both machines must be the same architecture (64 or 32 bit).
3. The primary node must be able to access the secondary node via ssh.
4. Install postgresql-contrib on secondary machine

Configure the primary node

1. Generate ssh key and place public key in authorized_hosts on secondary
# ssh-keygen -t dsa
# ssh-copy-id -i ~/.ssh/id_dsa.pub postgres@secondary
2. Enable WAL logging by editing postgresql.conf and setting the following values.

archive_mode = on
archive_command = 'rsync --delete-after -a %p postgres@secondary:/var/lib/pgsql/walfiles/%f'

Configure the secondary node

1. Create a directory for the WAL files
# mkdir /var/lib/pgsql/walfiles
2. Create /var/lib/pgsql/recovery.conf with the following contents:

restore_command = '/usr/bin/pg_standby -l -d -s 2 -t /tmp/pgsql.trigger.5432 \ /var/lib/pgsql/walfiles %f %p %r 2>>standby.log'

Initialize the cluster

1. On the primary run the following:
Please note that ‘dhreplication’ is an arbitrary tag, and can be set to anything useful to your configuration.
# psql -U postgres -c “SELECT pg_start_backup(‘dhreplication’);”
# rsync -avz /var/lib/pgsql/data/* secondary:/var/lib/pgsql/data/
# psql -U postgres -c “SELECT pg_stop_backup();”
2. On secondary run the following:
Edit /var/lib/pgsql/data/postgresql.conf and set ‘archive_mode = off’
# ln -s /var/lib/pgsql/recovery.conf /var/lib/pgsql/data/recovery.conf
A symlink is used above, as /var/lib/pgsql/recovery.conf will be removed when recovery is disabled.
# service postgresql start
3. Monitor for problems by watching /var/pgsql/pgsql.log and /var/lib/pgsql/data/standby.log on secondary

Testing fail over

1. On primary edit /var/lib/pgsql/data/postgresql.conf and set “archive_command = /bin/true”
then reload postgresql to make the changes active.
# service postgresql reload
2. On secondary
# tail /var/pgsql/pgsql.log /var/lib/pgsql/data/standby.log
# touch /tmp/pgsql.trigger.5432
3. You may now connect clients to the secondary server.

To resume running on primary

1. On secondary
# psql -U postgres -c “SELECT pg_start_backup(‘dhreplication’);”
# rsync -avz /var/lib/pgsql/data/* primary:/var/lib/pgsql/data/
# psql -U postgres -c “SELECT pg_stop_backup();”
2. Preform steps above to initialize the secondary server again.

External sources

http://www.xtuple.org/replication-how-to

http://www.postgresql.org/docs/8.4/static/high-availability.html

http://www.postgresql.org/docs/8.4/static/warm-standby.html

Exim

Introduction

I installed the debian packages, so I didn’t need to compile anything. The basic configuration was done via the debian installer.

I found these links helpful in further configuring exim:

Setting up virtual domains
To add a new virtual domain, create file in /etc/exim4/virtual and add domain to /etc/exim4/update-exim4.conf.conf

Catch all alias

Virus scanning

I also wanted to drop connections after too many unknown users, so I added this to /etc/exim4/conf.d/acl/30_exim4-config_check_rcpt

# Deny if more than 5 unknown users
  deny
    condition = ${if >{$rcpt_fail_count}{5}{yes}{no}}
    message = Too many unknown users

Stats

Stats are generating by running

eximstats -byemail -pattern "Rejected by SA" "/SAEximRejCond expand returned: '1'/"

from the log rotation cron job.

DNS Blacklists

I enabled the flagging of messages by adding the following to /etc/exim4/conf.d/main/02_exim4-config_options:

CHECK_RCPT_IP_DNSBLS = sbl-xbl.spamhaus.org:dnsbl.sorbs.net:combined.njabl.org:bl.spamcop.net:psbl.surriel.com

Command line arguments

To list all messages in the queue:
sudo /usr/sbin/exim4 -bp

To view a specific message header:
sudo /usr/sbin/exim4 -Mvh

To remove a message from the queue:
sudo /usr/sbin/exim4 -Mrm

To remove all frozen messages from the queue:
sudo /usr/sbin/exiqgrep -iz | xargs sudo /usr/sbin/exim4 -Mrm

To force a delivery attempt:
sudo /usr/sbin/exim4 -qf

ClamAV

I ended up having to install clamav from the volatile packages from /http://www.debian.org/volatile/ in order to keep it updated.

Antivirus test files: http://www.eicar.org/anti_virus_test_file.htm

Greylistd

This was silly easy to setup using these instructions

It boiled down to the following commands:

apt-get install greylistd

greylistd-setup-exim4 add

I did edit /etc/greylistd/config to change the length of time before a retry will be accepted to 30min (default is 60).

Spoofing work email

I wanted to be able to send email as $workaddress, and have it be accepted. To do this I had to add my user account to the MAIN_TRUSTED_USERS in 02_exim4-config_options

Being your own SSL Certificate Authority

SSL

Buying commercial SSL certificates are expensive, so I setup my own CA in /etc/ssl/CA

Renewing a certificate

First revoke the old certificate
openssl ca -revoke newcerts/02.pem -config ./openssl.cnf
Then re-sign the request
openssl ca -out cert.pem -config ./openssl.cnf -infiles server-req.pem

Create new certificate

Create request
openssl req -new -nodes -out server-req.pem -config ./openssl.cnf
Sign it
openssl ca -out server-cert.pem -config ./openssl.cnf -infiles server-req.pem

All information gathered from here

FastCGI + PHP

FastCGI + PHP

I run this webserver on a VPS hosted by quantact so the lower the memory usage, the
lower my monthly bill. This has led me to be pretty stingy with memory usage.
While using mod_php, each apache process was around 25MB each. Since switching
to PHP running as a fastcgi my apache processes are down to around 2MB with a
couple of php process at around 25MB.

Configuration

I followed these instructions to set everything up. The only real gotcha’s I ran into were
ensuring the file permissions were correct as php is now running as a different
user that www-data.

Benchmarks

I use ab to beat on the server a bit in both configurations, below are the results for mod_php:

Server Software:        Apache/2.2.9
Server Hostname:        www.digitaldogma.org
Server Port:            80

Document Path:          /2010/10/09/monitoring-with-cacti
Document Length:        20745 bytes

Concurrency Level:      25
Time taken for tests:   45.775 seconds
Complete requests:      1000
Failed requests:        229
   (Connect: 0, Receive: 0, Length: 229, Exceptions: 0)
Write errors:           0
Non-2xx responses:      96
Total transferred:      17081694 bytes
HTML transferred:       16537051 bytes
Requests per second:    21.85 [#/sec] (mean)
Time per request:       1144.374 [ms] (mean)
Time per request:       45.775 [ms] (mean, across all concurrent requests)
Transfer rate:          364.42 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       74   82   4.6     81     107
Processing:    80 1048 746.3    936    8988
Waiting:       80  863 586.0    757    6633
Total:        163 1131 746.0   1019    9069

Percentage of the requests served within a certain time (ms)
  50%   1019
  66%   1185
  75%   1299
  80%   1384
  90%   1734
  95%   2181
  98%   3174
  99%   3940
 100%   9069 (longest request)

and for FastCGI:

Server Software:        Apache/2.2.9
Server Hostname:        www.digitaldogma.org
Server Port:            80

Document Path:          /2010/10/09/monitoring-with-cacti
Document Length:        20745 bytes

Concurrency Level:      25
Time taken for tests:   52.335 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      21287344 bytes
HTML transferred:       20748806 bytes
Requests per second:    19.11 [#/sec] (mean)
Time per request:       1308.374 [ms] (mean)
Time per request:       52.335 [ms] (mean, across all concurrent requests)
Transfer rate:          397.22 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       74   82   4.9     81     123
Processing:   316 1201 234.2   1136    2588
Waiting:      155 1025 232.3    960    2408
Total:        398 1283 234.4   1219    2667

Percentage of the requests served within a certain time (ms)
  50%   1219
  66%   1317
  75%   1401
  80%   1469
  90%   1577
  95%   1710
  98%   1928
  99%   2103
 100%   2667 (longest request)

When I first looked at the results, it appears that the mod_php config was faster based on the Time per request, but then I noticed the Failed requests numbers. Of the 1000 requests, 229 had failed on the mod_php server, looking in the error logs I find quite a few of these:

[error] (12)Cannot allocate memory: fork: Unable to fork new process

Looks like the mod_php exhausted the system memory and wasn’t able to process all the requests. This would account for the faster response times, as it doesn’t take much to reply with an error :).

The above has cemented my decision to go with FastCGI on my hosted VPN. If you are able to throw more memory at it I get the feeling mod_php would be faster (note the max connect times, as the first hit need to spawn the fastcgi process), but for my small site and limited resources (money and ram :) FastCGI is the way to go.

Strange incoming connections from source port 80

I began getting many alerts from snort on incoming traffic from the dynamic
range of my ISP (insightbb). Connecting back to the source IPs on port 80
showed the google search page. A quick nmap -A showed it to be a linux box,
running Google httpd 2.0 (GFE) on port 80 and 443. After some more monkeying
around I made a connection with this traffic and watching videos on youtube. I
also noticed that if I completely blocked the incoming traffic, youtube would
break. If I used my hosted machine as a proxy with the traffic blocked,
youtube worked.

From this, my best guess is that my ISP is running a google appliance for some
kind of youtube cache/proxy. If anyone has any more information on this, I’d
love to hear it.

Monitoring with Cacti

Remote Monitoring

I added “agentaddress tcp:161″ to snmpd.conf which cases snmpd to listen on tcp port 161. Then it was easy to tunnel the tcp communication with ssh and eliminated the need for the buggy socat method.

SSH tunnel


ssh -f -N -L 6003:localhost:161 user@digitaldogma.org

Then I setup the host using tcp:127.0.0.1 and port 6003 in cacti and everything works as expected.

Monit

I also set a monit entry for localhost:6003 to monitor the ssh tunnel. If the connection fails it restarts the tunnel using the ssh command above.

Apache, Exim4, and MySQL stats

To graph apache, exim4 and mysql stats, I created the following cron jobs:

# collect exim stats
2,7,12,17,22,27,32,37,42,47,52,57 * * * * root /usr/local/bin/exim_perf.pl > /etc/snmp/exim.stats 2> /dev/null
# collect apache stats
2,7,12,17,22,27,32,37,42,47,52,57 * * * * root /usr/bin/lynx -dump localhost/server-status?auto | head -9 | sed -e 's/: /\n/' > /etc/snmp/apache.stats
# collect mysql stats
2,7,12,17,22,27,32,37,42,47,52,57 * * * * root  /usr/bin/mysqladmin -u root status | sed -e 's/: /\n/g' -e 's/[0-9|.]\+/& \n/g' -e 's/ //g' | sed '$d' > /etc/snmp/mysql.stats

This writes the stats to a file in /etc/snmp every 5 min, these are then available via snmp by adding the following to /etc/snmp/snmpd.conf:

exec .1.3.6.1.4.1.14464.25 exim_perf.pl /bin/cat /etc/snmp/exim_perf.stats
exec .1.3.6.1.4.1.14464.80 apache /bin/cat /etc/snmp/apache.stats
exec .1.3.6.1.4.1.14464.3360 mysql /bin/cat /etc/snmp/mysql.stats

For my own sanity, I used the tcp port of the daemon as the last number of the OID. As I understand it, this is a reserved OID for just such custom things.

After some final tweaking, I’ll keep an exported copy of the cacti templates here as well.

Snort

I also thought it would be neat to graph snort alerts. As cacti runs on the same server that hosts the snot db, this was pretty simple. I’m using the following shell script to get the number of alerts in the past 5 min:

#!/bin/sh
# list the number of snort events in the last 5min
# is protocol number is provided, show only events for that protocol
# TCP = 6
# UDP = 17
# ICMP = 1
# portscan = 255

PROTO=""
if [ $1 ]; then
        PROTO="and iphdr.ip_proto = '$1'"
fi
echo "select count(*) from iphdr, event where event.cid = iphdr.cid and DATE_SUB(NOW(),INTERVAL 5 MINUTE) <= event.timestamp $PROTO" | /usr/bin/mysql -s -u user --password=pass -D snort

Links

smoke ping graphs

Creating a quick play DVD in Linux

With all the extras and previews on kids movies now, it takes forever to get to the feature. This is a huge pain when trying to start a movie for the kids in the car, so I’ve started making copies of their movies to avoid all the unwanted extras. This serves a dual purpose in that they’re hard on the movies and being able to leave the original at home ensures it wont be lost of damaged.

To do this I use mencoder and dvdauthor.

# to re-encode the movie

mencoder -oac lavc -ovc lavc -of mpeg -mpegopts format=dvd:tsaf -vf scale=720:480,harddup -srate 48000 -af lavcresample=48000 -lavcopts vcodec=mpeg2video:vrc_buf_size=1835:vrc_maxrate=9800:vbitrate=5000:keyint=18:vstrict=0:acodec=ac3:abitrate=192:aspect=16/9 -ofps 30000/1001 -o movie.mpg dvdnav://1

# to just copy movie from dvd

mencoder -of mpeg -mpegopts format=dvd:tsaf -ovc copy -oac copy -o movie.mpg dvdnav://1

# example movie.xml

<?xml version="1.0"?>
<dvdauthor>
<vmgm />
<titleset>
<titles>
<audio lang="en"/>
<pgc>
<vob file="movie.mpg" />
</pgc>
</titles>
</titleset>
</dvdauthor>

# gen dvd filesystem

dvdauthor -o dvd/ -x movie.xml

# to play copy from disk

mplayer dvd://1 -dvd-device dvd/

# gen iso and burn

mkisofs -dvd-video -o dvd.iso dvd/
growisofs -dvd-compat -Z /dev/hdc=dvd.iso

The requested message could not be converted to an RFC-822 compatible format.

From time to time, archivemail run against the exchange server may fail with an error such as:

archivemail: Failed to fetch message sizes; server says 'The requested message could not be converted to an RFC-822 compatible format.' archivemail: unexpected error encountered - shutting down

When it does, determine the problem message by running fetchmail by hand:
fetchmail -v | grep couldn\'t
and delete it via IMAP manually as such


# openssl s_client -crlf -connect server:993

. login user@domain {password}
. select INBOX
. store flags \Deleted
. expunge
. logout

Note: the ‘.’ above are necessary.
http://bobpeers.com/technical/telnet_imap

Update: 04/03/2012
If your imap server uses TLS1 (as mine now does) you’ll need to connect like this:
# openssl s_client -crlf -connect server:143 -starttls imap

First Post from VIM

So I found Vimpress a vim plugin that allows you to publish and edit posts to wordpress from within VIM. One of my biggest grips about using blog software was being forced to use a web form based editor. With this removed, I figured I’d try out wordpress again.

Vimpress is neat, but Blogit is a much more complete tool. In addition to allowing you to publish/unpublish pages it also has support for multiple blogs. Very slick.