
PostgreSQL 9.0 Backup & Recovery


In PostgreSQL, backup and recovery are very user friendly compared with other databases. Many of you won't agree with this, but let's not get into that debate. Coming to backups, PostgreSQL doesn't support INCREMENTAL BACKUP; however, there are very consistent backup tools and OS-level workarounds to achieve this goal.

My pictorial presentation on PostgreSQL Backup and Recovery gives a complete conceptual idea. Looking at the diagram, you can make out which backups can be used to restore or recover.

Logical Backup

pg_dump, pg_restore and pg_dumpall are the utilities used for logical backups. pg_dump and pg_restore help in taking backups at the database, schema and table level. pg_dumpall is used for a cluster-level dump.

pg_dump supports three formats: plain SQL format, custom format and tar format. Custom and tar format dumps are compatible with the pg_restore utility, whereas plain SQL format dumps are restored with the psql utility.

Below are examples for each backup level, with the related restore commands.

Note: Set the defaults for PGDATABASE, PGUSER, PGPASSWORD and PGPORT in .bash_profile (environment variables on Windows), as in the sketch below.
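For example, a minimal .bash_profile for the postgres user might look like this (the values are illustrative; adjust them to your environment):
export PGDATABASE=postgres
export PGUSER=postgres
export PGPASSWORD=secret
export PGPORT=5432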

Plain SQL Format Dump and Restore
$ pg_dump -U username -Fp dbname  >  filename
or
$ pg_dump -U username dbname -f filename
or
$ pg_dump -Fp -U username dbname -f filename

For restoring, use the psql command

$ psql -U username -f filename dbname
or
postgres=# \i SQL-file-name //in psql terminal with \i option
Custom Format
$ pg_dump -Fc dbname -f filename
$ pg_restore -Fc -U username -d dbname filename.dmp
Tar Format
$ pg_dump -Ft dbname -f filename
$ pg_restore -U username -d dbname filename
or
$ cat tar-file.tar | psql -U username dbname
Note: Schema-level and table-level dumps can be performed in the same way by adding the related options, as shown below.
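For instance (the schema and table names here are placeholders):
Schema level:
$ pg_dump -Fc -n schema_name dbname -f filename
Table level:
$ pg_dump -Fc -t table_name dbname -f filename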

Cluster Level Dump:
$pg_dumpall -p portnumber > filename

For restoring, use the psql command

$ psql -f filename
There are many good approaches to taking dumps and restoring. In particular, Simon Riggs and Hannu Krosing's book "PostgreSQL 9 Administration Cookbook" (2010), published by www.2ndQuadrant.com, is a good way to start with PostgreSQL backup and recovery.

Physical Backup (File system Backup)

Cold Backup:

A cold backup is a simple file-system backup of the /data directory taken while the Postgres instance is down; that is, to achieve a self-consistent data directory backup, the database server should be shut down before copying. PostgreSQL gives the flexibility to keep pg_xlog and pg_tblspc on different mount points via soft links. To copy the /data directory including the data behind the soft links, use one of the commands below.
tar czf backup.tar.gz $PGDATA
or
cp -r $PGDATA /backup/
or
rsync -a $PGDATA /wherever/data

Hot Backup (Online Backup):

In a hot backup, the cluster is up and running, and the database should be in archive log mode. Two system functions notify the instance about starting and stopping the hot backup process (pg_start_backup() and pg_stop_backup()). Before going forward with online backup, let's discuss the database archive log mode, which is mandatory for online backups.

Enabling WAL Archiving:

Upcoming posts of mine will cover PITR and WAL tuning; presently we look into WAL archiving. The PostgreSQL database system writes every change to an additional file on disk called the write-ahead log (WAL). It contains a record of the writes made in the database system. In case of a crash, the database can be repaired/recovered from these records.

Normally, the write-ahead log is recycled at regular intervals (called checkpoints), once its changes have been matched against the database, because it is then no longer required. You can also use the WAL as a backup, because it holds a record of all writes made to the database.

Concept of WAL Archiving:

The write-ahead log is composed of 16 MB files called segments. The WAL segments reside under the pg_xlog directory, which is a subdirectory of the data directory. The filenames are numbered in ascending order by the PostgreSQL instance. To perform a backup on the basis of WAL, one needs a base backup, that is, a complete backup of the data directory, plus the WAL segments between the base backup and the current date.

Archiving of WAL segments is configured by setting the two configuration parameters archive_command and archive_mode in postgresql.conf. Putting the cluster into archive-log mode requires a RESTART.
archive_mode = on/off (boolean parameter)
archive_command = 'cp %p /archive/location/%f'
Note: %p is replaced by the path name of the file to archive, and %f by the file name only (without any directory part) for the destination.
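To verify archiving after the restart, you can force a WAL segment switch and then check the archive destination (a quick sanity check; the location is a placeholder):
postgres=# select pg_switch_xlog();
$ ls -l /archive/location/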

For further information about the archiver process, refer to the post PostgreSQL 9.0 Memory & Processes.

Online Backup :

To take online backup:
Step 1 : Issue pg_start_backup('label') in the psql terminal
postgres=# select pg_start_backup('fb');
Step 2 : At OS level, copy the $PGDATA directory to any backup location
$ cp -r $PGDATA /anylocation
Step 3 : Issue pg_stop_backup() in psql terminal.
postgres=# select pg_stop_backup();
Note: It is not necessary that these two functions run in the same database connection. The backup mode is global and persistent.
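Putting the three steps together, a minimal online backup script could look like this (a sketch; the label, the backup path and the pg_xlog exclusion are my assumptions):
$ psql -c "select pg_start_backup('nightly');"
$ rsync -a --exclude=pg_xlog $PGDATA /backup/nightly/
$ psql -c "select pg_stop_backup();"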

In PostgreSQL, there is no catalog that stores the start and stop time of an online backup. However, while an online backup is in progress, a couple of files are created and deleted.

pg_start_backup('label') and pg_stop_backup() are the two system functions that perform the online backup. pg_start_backup('label') creates a file backup_label under the $PGDATA directory, and pg_stop_backup() creates a file 'wal-segment-number.backup' under $PGDATA/pg_xlog. backup_label gives the start time and the checkpoint location of the WAL segment; it also notifies the PostgreSQL instance that the cluster is in BACKUP MODE. The 'wal-segment-number.backup' file under $PGDATA/pg_xlog describes the start and stop time and the checkpoint location, along with the WAL segment number.

Note: After pg_stop_backup(), the backup_label file is deleted by the PostgreSQL instance.

Do post your comments, suggestions.

--Raghav



PostgreSQL 9.0 Streaming Replication on Windows

A major milestone in PostgreSQL 9.0 is Streaming Replication (including DDL). Many of you have configured SR on Linux, but here I present SR on the Windows platform. The PostgreSQL wiki is the best guide for setting up Streaming Replication.

For setting up SR on Windows, I recommend following the PostgreSQL wiki steps, with the minor changes needed for the Windows platform. In this post I show only the changes you have to look out for on Windows.
http://wiki.postgresql.org/wiki/Streaming_Replication

Step 1. (Before configuring SR, add the port)

On the primary, you need to open the accepting port. The link below will guide you through adding the port.
http://support.microsoft.com/kb/842242

Note: Adding the port differs across Windows versions.

Step 2. (Before configuring SR, Create common mount point for Archives)

Create one common mount point from which the primary and standby write/read the archives. The mount point should be owned by the postgres user. My common mount point is on host 10.10.101.111.

Step 3.

On the primary, make these changes in postgresql.conf.
wal_level = hot_standby 
archive_mode = on
archive_command = 'copy %p \\\\10.10.101.111\\pg\\WAL_Archive\\%f'
max_wal_senders = 5
wal_keep_segments = 32

Step 4.

On Standby,
1) Edit the postgresql.conf file and change the below parameters.
listen_addresses = '*'
hot_standby = on

2) Add a replication entry for the standby in the primary's pg_hba.conf
host replication postgres standby.IP.address/22 trust

3) Create recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=10.10.101.111 port=5432 user=postgres'
trigger_file = 'C:\\stopreplication\\standby.txt'
restore_command = 'copy \\\\10.10.101.111\\pg\\WAL_Archive\\%f %p'

Note: Create the recovery.conf file by copying any of the .conf files from the data directory as a template.
These are the only changes you need to take care of when setting up SR on Windows; for the rest, follow the procedure on the PostgreSQL wiki.
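Once both servers are up, a quick sanity check is to compare WAL positions with the standard 9.0 functions:
On the primary:
postgres=# select pg_current_xlog_location();
On the standby:
postgres=# select pg_last_xlog_receive_location();
The two values should stay close to each other while replication is healthy.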

Regards
Raghav

Upgrading PostgreSQL

It is always a challenging task to move from one version to another on a new server. My presentation below upgrades PostgreSQL 8.3 to PostgreSQL 9.0.4 on a NEW SERVER. The steps are basically very simple, but you need to take some extra care when bouncing the new server before and after restoration. The latest PostgreSQL has a lot of fixes in it, so it is recommended to use the new binaries for the entire upgrade process.

Step 1 (On New Server PG 9.0.x):

The first step is to set SHMMAX/SHMALL at OS level, because shared_buffers purely depends on these settings. The script below will give you the figure for SHMMAX/SHMALL on the basis of the server's memory. I have taken the script, written by Greg Smith, which is very handy in setting SHMMAX/SHMALL.

For a better understanding of shmmax/shmall, here is the link:
http://www.puschitz.com/TuningLinuxForOracle.shtml#SettingSharedMemory
vi shmsetup.sh
#!/bin/bash
# Output lines suitable for sysctl configuration based
# on total amount of RAM on the system. The output
# will allow up to 50% of physical memory to be allocated
# into shared memory.
# On Linux, you can use it as follows (as root):
#
# ./shmsetup >> /etc/sysctl.conf
# sysctl -p
#
# Early FreeBSD versions do not support the sysconf interface
# used here. The exact version where this works hasn't
# been confirmed yet.

page_size=`getconf PAGE_SIZE`
phys_pages=`getconf _PHYS_PAGES`

if [ -z "$page_size" ]; then
echo Error: cannot determine page size
exit 1
fi

if [ -z "$phys_pages" ]; then
echo Error: cannot determine number of memory pages
exit 2
fi

shmall=`expr $phys_pages / 2`
shmmax=`expr $shmall \* $page_size`

echo \# Maximum shared segment size in bytes
echo kernel.shmmax = $shmmax
echo \# Maximum number of shared memory segments in pages
echo kernel.shmall = $shmall
:wq

Execute the script :
#chmod +x shmsetup.sh
#./shmsetup.sh
Step 2 (On New Server PG 9.0.x):

Install the latest version, PostgreSQL 9.0.4, on the new server. Links below for 32-bit and 64-bit:
http://get.enterprisedb.com/postgresql/postgresql-9.0.4-1-linux.bin
http://get.enterprisedb.com/postgresql/postgresql-9.0.4-1-linux-x64.bin

In most cases, and as recommended, keep "pg_xlog" on a different mount point. You can create a new cluster with the initdb command, selecting a different "pg_xlog" mount point:

$initdb -D DATA_DIRECTORY -X PG_XLOG_LOCATION
Note: the initdb command will not create the 'pg_log' directory under the new cluster; you need to create it explicitly.

After installation and creation of the cluster, set the environment variables PGDATA, PATH, PGDATABASE, PGPORT, PGUSER, etc. in ".bash_profile" under the postgres user.

Step 3 (On Old Server PG 8.3.x):

As I said, use the new binaries for all the commands you execute on this server. If you don't have the new binaries on this server, install a copy of the new binaries from source to a new location, without overriding the existing binaries.

Download:-
http://wwwmaster.postgresql.org/redir/198/h/source/v9.0.4/postgresql-9.0.4.tar.gz
#tar xvf postgresql-9.0.4.tar.gz
#cd postgresql-9.0.4
#./configure --prefix=/usr/pg904
#make
#make install
New binaries location will be "/usr/pg904/"

Step 4 (On Old Server PG 8.3.x):

The initial step is to take a dump of global objects like users, tablespaces, etc. using pg_dumpall.
$ /usr/pg904/bin/pg_dumpall -p $PGPORT -g > /pg83_backups/global_dump.sql

Step 5 (On Old Server PG 8.3.x):

Take a dump of all the databases in the cluster using the command below. Also generate a log for each dump, to analyze any issue that arises in the dumps.
$ /usr/pg904/bin/pg_dump -Fc -v -U PGUSER -p PGPORT DBNAME -f /pg83_backups/dbname.dmp  >> /pg83_backups/dbname.log 2>>/pg83_backups/dbname.log

If the database is bigger, run it in nohup:

$ nohup /usr/pg904/bin/pg_dump -Fc -v -U PGUSER -p PGPORT DBNAME -f /pg83_backups/dbname.dmp >> /pg83_backups/dbname.log 2>>/pg83_backups/dbname.log &

Step 6 (On Old Server PG 8.3.x):

Move all the dumps (/pg83_backups) to the new server.

Step 7 (On New Server PG 9.0.x):

As per STEP 2, the new server has the latest PG 9.0.4 binaries and cluster. To speed up the restoration process, we need to tune some settings in the $PGDATA/postgresql.conf file, before and after.

Settings in the postgresql.conf file before restoration (memory settings may differ as per the available RAM on the box):
Memory Settings:
---------------
shared_buffers= (as per the shmmax settings, Maximum 8 gigs on 64 bit, 4 gigs on 32 bit)
work_mem= (in between 40MB - 100 MB)
maintenance_work_mem = (in between 1GB - 5 GB)

Checkpoints Settings:
--------------------
checkpoint_segments=(in between 128 - 256)
checkpoint_timeout=(default is 5min; raise it to 1h)

Autovacuum settings:
-------------------
autovacuum=off
track_counts=off

Sync to Disk:
------------
fsync=off
full_page_writes=off
synchronous_commit=off

Background Writer settings:
--------------------------
bgwriter_delay=(default 200ms, change to 50ms)

These changes demand a restart of the cluster.

$pg_ctl -D $PGDATA restart
or
$pg_ctl -D $PGDATA stop -m f
$pg_ctl -D $PGDATA start

Step 8 (On New Server PG 9.0.x):

First, restore the global objects.
$PGPATH/psql -d DBNAME -p $PGPORT -U $PGUSER -f /pg83_backups/global_dump.sql

Step 9 (On New Server PG 9.0.x):

Restoring the databases can be done in parallel: from PG 8.4 onwards, pg_restore has an option -j that creates multiple connections to PostgreSQL in parallel and speeds up the restoration process.

http://www.postgresql.org/docs/current/static/app-pgrestore.html

The value for -j depends on the number of CPU cores the new server has; for example, with 4 cores I can go with -j 4, as each core can drive one extra pg_restore process. Use this option as per your CPU cores; you can get the number of processors with this command, and see the parallel restore example below:
$ grep -c ^processor /proc/cpuinfo
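For example, a parallel restore on a 4-core box (same placeholders as the commands below):
$PGPATH/pg_restore -d DBNAME -Fc -v -j 4 -p $PGPORT -U PGUSER /pg83_backups/dbname.dmp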

Start restoring each database from its dump onto the new server; if a database is bigger, run it in nohup. Also generate logs of the restore for further analysis.
$PGPATH/pg_restore -d DBNAME -Fc -v -p $PGPORT -U PGUSER /pg83_backups/dbname.dmp >>/pg83_backups/restore_dbname.log 2>>/pg83_backups/restore_dbname.log
or
nohup $PGPATH/pg_restore -d DBNAME -Fc -v -p $PGPORT -U PGUSER /pg83_backups/dbname.dmp >>/pg83_backups/restore_dbname.log 2>>/pg83_backups/restore_dbname.log &
While restoration is in progress, you can monitor it in two ways: at OS level using "top -cu postgres" or "ps -ef | grep postgres", and at DB level using "select * from pg_stat_activity".

Step 10 (On New Server PG 9.0.x):

Important step: after successful restoration, it is recommended to update the catalogs with the ANALYZE command.
$ $PGPATH/vacuumdb -p $PGPORT -a -Z -v >>/pg83_backups/analyze.log 2>>/pg83_backups/analyze.log
or
$nohup /usr/local/pgsql/bin/vacuumdb -p 5433 -a -Z -v >>/pg83_backups/analyze.log 2>>/pg83_backups/analyze.log &
Step 11 (On New Server PG 9.0.x):

After ANALYZE, change the settings back to normal, or as per the demands of the application, by editing the $PGDATA/postgresql.conf file.
Memory Settings:
---------------
shared_buffers= (as per the shmmax settings, Maximum 8 gigs on 64 bit, 4 gigs on 32 bit)
work_mem= (in between 5MB - 40MB)
maintenance_work_mem = (in between 1GB -- 2 GB)

Checkpoints Settings:
--------------------
checkpoint_segments=(in between 64 - 128)
checkpoint_timeout=(default)

Autovacuum settings:
-------------------
autovacuum=on
track_counts=on

Sync to Disk:
------------
fsync=on
full_page_writes=on
synchronous_commit=on

Background Writer settings:
--------------------------
bgwriter_delay=(50ms)
Step 12 (On New Server PG 9.0.x):

After the above changes, restart the cluster.
$pg_ctl -D $PGDATA restart
or
$pg_ctl -D $PGDATA stop -m f
$pg_ctl -D $PGDATA start
You also need to make some changes in the $PGDATA/pg_hba.conf file to allow application connections. Always keep a copy of the $PGDATA/*.conf files (PG 8.3.x) on the new server for reference when changing the .conf files.

Do post your comments or suggestions, which are greatly appreciated.

Regards
Raghav

pgmemcache Setup and Usage

Preloading or caching a table in PostgreSQL is a tough task, because PostgreSQL doesn't offer a single big synchronized memory-management layer; all its memory areas are independent. Caching is possible with third-party tools like memcached.

pgmemcache is a set of PostgreSQL user-defined functions (an API) that provide an interface to memcached. The pgmemcache prerequisites call for libmemcached, and it is also recommended to install memcached along with it. My presentation consists of installation, caching and monitoring using the pgmemcache API. As I am not a developer or hacker :), my way of implementation is a very simple one.
Points:

  • Stores values in the cache on a key/value basis, so keeping tables with a primary key/unique key is recommended.
  • No data redundancy - if memcached goes down or runs out of space, new records and updates will be lost.
  • Supports all memcached commands (set/get (single/multi)/delete/replace/incr/stats).
  • After keeping the data in memcached, if you drop the table on the backend, memcached won't throw any errors. It is entirely up to you how you manage it.
  • No ability to iterate over data or determine what keys have been stored.
  • You can never bring a memcached server down or add a new one to the pool while users are connected.
  • If the background updating process stops for any reason, updates do not occur and there is a possibility that the memcached server could fill up.
  • Every PostgreSQL backend has to bind to the memcached port before accessing the data.
  • Memcached runs on default port 11211.
Pre-requisites:
  1. PostgreSQL 8.4. or above
  2. libevent
  3. memcached
  4. libmemcached
  5. pgmemcache
  6. Monitoring tools (memcached-tool, damemtop, etc.)
Installation:
Step 1. (libevent)

The libevent API is important when configuring pgmemcache; I prefer to have the libraries installed as the first step. So let's start with configuring the libevent library in the default location.
Download link for libevent:
http://www.monkey.org/~provos/libevent-2.0.12-stable.tar.gz
tar -xvf libevent-2.0.12-stable.tar.gz
cd libevent-2.0.12-stable
./configure
make
make install
Step 2 (memcached)

Install memcached, enabling libevent.
Download link for memcached:
http://memcached.googlecode.com/files/memcached-1.4.6.tar.gz
cd /usr/local/src/memcached-1.4.6
------on 32-bit
export LD_LIBRARY_PATH=/usr/lib:/opt/PostgreSQL/9.0/lib:$LD_LIBRARY_PATH
./configure --prefix=/opt/PostgreSQL/9.0/bin/ --with-libevent=/usr/lib
------on 64-bit
export LD_LIBRARY_PATH=/usr/lib64:/opt/PostgreSQL/9.0/lib:$LD_LIBRARY_PATH
./configure --prefix=/opt/PostgreSQL/9.0/bin/ --with-libevent=/usr/lib64
make
make install
Step 3. (libmemcached)

pgmemcache is built on top of libmemcached. libmemcached looks for the memcached binary location, so set the path to the memcached binaries before proceeding.
export PATH=/opt/PostgreSQL/9.0/bin/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib:/opt/PostgreSQL/9.0/lib:$LD_LIBRARY_PATH
Download link:
http://launchpad.net/libmemcached/1.0/0.50/+download/libmemcached-0.50.tar.gz
cd libmemcached-0.50
./configure
make
make install
Step 4 (pgmemcache)

The pgmemcache API helps in interacting with memcached, e.g. caching and retrieving data.
Download link:
http://pgfoundry.org/frs/download.php/3018/pgmemcache_2.0.6.tar.bz2
cd pgmemcache
PATH=/opt/PostgreSQL/9.0/bin:$PATH make USE_PGXS=1 install
or
make
make install
The installation creates a pgmemcache.sql file, with all the API functions for interacting with memcached, under the PG contrib location. To create the pgmemcache API functions, just execute the pgmemcache.sql file in each database.
psql -p PGPORT -d PGDATABASE -f /opt/PostgreSQL/9.0/share/postgresql/contrib/pgmemcache.sql
pgmemcache API's list:
Note: While executing the .sql file you may face an error like "FATAL: could not load library "/opt/PostgreSQL/9.0/lib/postgresql/pgmemcache.so": libmemcached.so.8: cannot open shared object file: No such file or directory". It means the PG instance hasn't loaded the newly created library. Resolution: set PATH and LD_LIBRARY_PATH and restart the instance so it recognizes the libraries.
Eg:-
export PATH=/opt/PostgreSQL/9.0/bin/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib:/opt/PostgreSQL/9.0/lib:$LD_LIBRARY_PATH
$pg_ctl -D $PGDATA restart
If you want to load pgmemcache by default in your PG instance, edit the postgresql.conf file, change the following parameters, and restart the cluster.
shared_preload_libraries='pgmemcache'
custom_variable_classes='pgmemcache'
Configuration:
Step 1.

For caching data, first you need to initialize the memory; once the memory is allotted, it is the PG backend's responsibility to bind and push data into the cache. Here, I have started memcached on localhost with 512 MB on the default port 11211 (-d means start as a daemon). All my exercises are on localhost.
$./memcached -d -m 512 -u postgres -l localhost -p 11211
Note: To retrieve data from the cache, every PostgreSQL backend should first bind and then retrieve the data.

Step 2.

Bind the instance to the running memcached port. After binding, check out the memcached statistics, as below.
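For example, using the pgmemcache API functions created by pgmemcache.sql (a sketch, assuming the memcached instance started above on localhost:11211):
postgres=# select memcache_server_add('localhost:11211');
postgres=# select memcache_stats();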
Step 3.

Now it's time to cache data in memcached. Memcached uses keys/values to hold data in its memory, so make sure your table has a primary/unique key so retrieval will be easy. As mentioned, there are very good API functions to play around with for keeping values and accessing them; in my example, I use memcache_set() to keep a value and memcache_get() to retrieve data.

Once the value is set in memcached, it is your responsibility to bind your backend to memcached; with the help of the pgmemcache API you can then access the data. Each Postgres backend must bind before accessing. Please find the example below.
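A minimal sketch, assuming a hypothetical table foo(id int primary key, name text) and the server binding from Step 2:
-- cache the value under a key derived from the primary key
postgres=# select memcache_set('foo_1', (select name from foo where id = 1));
-- read it back from memcached
postgres=# select memcache_get('foo_1');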
Monitoring

If you are very good with Linux you can pull a lot of information about memcached memory yourself; however, a few tools come along with the memcached source pack, like memcached-tool, damemtop, etc. I am using the memcached-tool utility for monitoring memcached.
usage:-
memcached-tool localhost display
memcached-tool localhost dump
memcached-tool localhost stats | grep bytes
A small effort from my side to set up pgmemcache and understand the basics. Hope it was helpful. Keep posting your comments or suggestions; they are highly appreciated.

--Raghav

pgmemcache vs Infinite Cache

In my recent post on pgmemcache, there were a couple of questions asked which were really interesting and made me work on them. I should say thanks for that :)

Questions:
1. Is pgmemcache application transparent ?
2. Is there any synchronization between memcached and PostgreSQL Shared buffers ?

Answer:

pgmemcache (memcached) is not application transparent: you need to make changes in the application for pushing or retrieving the data from the cache.

EnterpriseDB's product Postgres Plus Advanced Server includes a feature called Infinite Cache, which is based on the production-proven memcached, the open source distributed object cache.

About EnterpriseDB: the Enterprise PostgreSQL company provides enterprise-class PostgreSQL products built on the world's most advanced open source database. The company's Postgres Plus products are ideally suited for transaction-intensive applications requiring superior performance, massive scalability and compatibility with proprietary database products.

Overview


The diagram above helps in understanding the architecture of pgmemcache vs Infinite Cache. With Infinite Cache, pages are first searched in shared_buffers and then in Infinite Cache. Synchronization between the shared buffer cache and Infinite Cache provides application transparency, which is not the case with pgmemcache.

Infinite Cache is faster and completely application transparent. No special code is needed from developers. It warms up your cache with multiple parallel processes and pre-loads the cache at startup, reducing warming time.

To avail of Infinite Cache, you have to download Postgres Plus Advanced Server, an Oracle-compatible product bundled with Infinite Cache.
Download Link:
http://www.enterprisedb.com/downloads/postgres-postgresql-downloads

Implementing Infinite Cache is as simple as memcached; the link below will help in setting it up.

http://www.enterprisedb.com/docs/en/8.4/perf/Postgres_Plus_Advanced_Server_Performance_Guide-04.htm

A very informative discussion on the PostgreSQL community forum:

http://archives.postgresql.org/pgsql-performance/2011-07/msg00001.php

--Raghav

Connection Pooling with Pgbouncer on PostgreSQL 9.0

Why do we go for connection pooling in PostgreSQL? When your application demands a very large number of concurrent connection hits, you need to approach it with a connection pool, which sits between your application and the database.

The idea behind a connection pool is that you have enough connections to use all of the available resources, and any incoming requests re-use existing connections without dropping the database connection, keeping it ready for a new client to use.

pgbouncer is a lightweight connection pooler. pgbouncer runs as a single process, not spawning a process per connection, and relies on a library named libevent for connection pooling.

pgbouncer setup on PostgreSQL 9.0 is very simple; however, there is a small change with the latest version: you need to create the auth file manually, which pgbouncer uses for user authentication. In versions before PostgreSQL 9.0 you could find a flat authentication file under $PGDATA/global/pg_auth; in the latest version that file has been removed, and the users now live in the pg_catalog relation pg_authid.

pgbouncer Setup:

1. First, download libevent library for pgbouncer.
Download link for libevent:
http://www.monkey.org/~provos/libevent-2.0.12-stable.tar.gz
tar -xvf libevent-2.0.12-stable.tar.gz
cd libevent-2.0.12-stable
./configure
make
make install
2. Download the latest pgbouncer tarball and configure it against your PostgreSQL 9.0.
http://pgfoundry.org/frs/download.php/2912/pgbouncer-1.4.tgz
tar -xvf pgbouncer-1.4.tgz
cd pgbouncer-1.4
./configure --prefix=/opt/PostgreSQL/9.0/bin
make
make install
3. Create a libevent-i386.conf file in /etc/ld.so.conf.d directory
vi /etc/ld.so.conf.d/libevent-i386.conf
/usr/local/lib
:wq!
4. Run the ldconfig to apply new changes.
#ldconfig
5. Change the ownership of the pgbouncer utility in the PostgreSQL binary directory to the postgres user.
chown -R postgres:postgres /opt/PostgreSQL/9.0/bin/bin/pgbouncer
6. Create the pgbouncer_auth file for user authentication, as sketched below.
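The auth file holds one user per line in the format "username" "password". Since 9.0 no longer maintains the flat file, one way is to generate it from pg_shadow (a sketch; the file location is my choice):
$ psql -p 5432 -Atq -c "select '\"'||usename||'\" \"'||coalesce(passwd,'')||'\"' from pg_shadow" postgres > /etc/pgbouncer_auth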

7. Create the pgbouncer.ini file, with postgres user permissions, under the /etc directory; a minimal sketch follows.
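A minimal /etc/pgbouncer.ini sketch (the wildcard database entry, pool settings and file paths are assumptions; max_client_conn matches the startup log shown below):
[databases]
* = host=127.0.0.1 port=5432

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer_auth
pool_mode = session
max_client_conn = 1000
default_pool_size = 20
logfile = /var/log/pgbouncer.log
pidfile = /var/run/pgbouncer.pid
admin_users = postgres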

8. Start pgbouncer
-bash-4.1$ ./pgbouncer -d /etc/pgbouncer.ini
2011-08-14 11:42:00.925 1949 LOG File descriptor limit: 1024 (H:1024), max_client_conn: 1000, max fds possible: 1010
9. Connect to the databases through pgbouncer, as shown below.
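For example, connecting to the postgres database through the pooler port configured above:
$ psql -p 6432 -U postgres postgres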

10. Getting help: connect to the pgbouncer admin database and ask for help.
$ psql -p 6432 -U postgres pgbouncer
pgbouncer=# show help;

For a better understanding of the auth file, see the below post by 'depesz':
http://www.depesz.com/index.php/2010/12/04/auto-refreshing-password-file-for-pgbouncer/

Do post your comments which are highly appreciated.

--Raghav

How to get Database Creation Time in PostgreSQL 9.0 ?

In PostgreSQL, the database creation time is not stored in any of the pg_catalog relations. So the question arises: how do we know when a database was created?

For every database, a directory named with the database OID is created under $PGDATA/base; it holds a set of files (OID, OID_fsm, OID_vm) for each object (table/index/view, etc.) plus a PG_VERSION file.

The OID, OID_fsm and OID_vm files get updated as per the changes made at database level. However, the PG_VERSION file never gets updated by any changes made to the database, so we are going to use the timestamp of the PG_VERSION file as the database creation time. I believe there is a chance of the PG_VERSION timestamp changing, but I am not sure in which case that would happen.

To get the timestamp of PG_VERSION, I need something which executes an OS command at the PG instance level. So I used a pl/perlu function created by one of my colleagues, Vibhor Kumar.

http://vibhork.blogspot.com/2011/04/plperl-functions-for-getting-number-of.html

pl/perlu Function
CREATE OR REPLACE FUNCTION execute_shell(text) returns setof text
as
$$
$output=`$_[0] 2>&1`;
@output=split(/[\n\r]+/,$output);
foreach $out (@output)
{ return_next($out);
}
return undef;
$$ language plperlu;
And one function to get the database OID and build the stat command.
CREATE OR REPLACE FUNCTION public.get_pg_version_loc(dbname varchar) RETURNS text AS
$body$
DECLARE
dbname ALIAS FOR $1;
data_dir text;
db_oid text;
os_execute text;
BEGIN
SELECT INTO db_oid oid from pg_database where datname = dbname;
show data_directory into data_dir;
os_execute := 'stat -c "%y" '||data_dir||'/base/'||db_oid||'/PG_VERSION';
return os_execute;
END;
$body$
LANGUAGE 'plpgsql';
Output:
=# select datname,execute_shell(get_pg_version_loc(datname::text)) as "DB_Creation_Time"
-# from pg_database where datname not in ('template0','template1');
datname | DB_Creation_Time
--------------+-------------------------------------
postgres | 2011-01-10 21:48:37.222016571 +0530
provider | 2011-05-26 11:40:14.253434477 +0530
pgbench_test | 2011-08-14 16:52:21.689198728 +0530
pgpool | 2011-08-26 12:30:19.864134713 +0530
(4 rows)

Will be back with more stuff :). Do post your comments if any, they will be highly appreciated.

--Raghav

Replication in PostgreSQL 9.0


Word "Replication" means a process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.

Replication is a very interesting subject in any database. In the database competition world, PostgreSQL has its own uniqueness in open-source RDBMS high availability. The latest PostgreSQL 9.1 has in-built support for synchronous and asynchronous replication. The in-built asynchronous replication options are warm standby, hot standby and streaming replication, and third-party tools include Slony, Londiste, Mammoth, etc. The chart below will help you to understand the available synchronous and asynchronous replication options.


WAL Shipping (Hot Standby and Warm Standby): 

PostgreSQL has the ability to ship WALs to another server, i.e. a standby. The standby server runs in recovery mode, with the pg_standby utility applying the WALs. The primary server generates archives (a copy of the WAL, usually a 16 MB file) and sends them to one or more slaves, where they are later applied by the pg_standby utility.
  • Warm Standby: The primary generates archives and feeds them to the slave; it is WAL shipping to the slave. The slave stays in continuous recovery and is not accessible for reads.
  • Hot Standby: Hot Standby is the name for the capability to run queries on a database that is currently performing archive recovery. In Hot Standby, slaves can be used for read-only access.
Disadvantages:
  • The slave applies WALs periodically, not continuously; only completed XLOG files become available to the slave as WAL archives and get applied. So the lag is the unfilled or incomplete WAL which has not yet been archived; data loss can be up to one WAL file (16 MB).

Trigger Based Replication :

In trigger-based replication, tools like Slony, Londiste and Mammoth use ON INSERT, ON UPDATE and ON DELETE triggers on tables to maintain replication between master and slave. The slave holds consistent snapshots.

Streaming Replication :

It is also called binary replication. In PostgreSQL, XLOG records generated at the primary are shipped to the standby via the network. The lag in streaming replication is very minimal, like a single transaction, depending on network speed and hot standby settings. Multiple slaves can be configured. Streaming replication comes with additional processes: 'WAL SENDER' at the primary and 'WAL RECEIVER' at the standby.

Advantages:
  • On a primary crash, the standby can be recovered in very little time.
  • The standby can be opened and will be in READ ONLY mode.
  • It can be used as a reporting server.
  • Load balancing can be configured with pgpool-II between primary and standby.
Disadvantages:
  • The standby server should have the same amount of memory/disk/CPU, etc., because, in case the primary crashes, the slave acts as the primary.
  • Minimal lag, i.e. one transaction behind the primary.
Slony Replication:

Slony is an asynchronous trigger-based replication system; it is a single-master to multiple-slave replication system for PostgreSQL. Every table or sequence on the master is replicated via remote triggers to the slave. Updates are committed to one database and are applied to the slave later as EVENTs. With Slony, switchover and switchback are possible.

Limitations of Slony-I
  • Tables must have a primary key or a unique key.
  • Only tables and sequences are allowed for replication.
  • Slave databases cannot be modified.
Advantages:
  • Slony-I supports switchback.
  • Using Slony-I, we can upgrade PG from one version to another without any downtime.
Disadvantages:
  • Slony cannot detect network failures; all the EVENTs created at the primary get queued and are released once the network catches up.
  • No DDL changes are allowed on the replicated tables while the Slony daemons are running.

Do post your comments, they will be highly appreciated. 

--Raghav


High Availability Clustering with PostgreSQL


Firstly, I should thank my company for giving me an opportunity to work mostly with PostgreSQL HA stuff. I have worked with very good clients who have implemented clustering with PostgreSQL. So, my article here is to give a little idea of how HA clustering works with PostgreSQL.

PostgreSQL has built-in functionality for high availability, like warm standby, hot standby and streaming replication, but it is missing a few features like switchover/switchback, failover automation and minimal downtime, which are most demanded by companies. Postgres community members are working on these demands aggressively, and I hope we will see a very new PostgreSQL soon with all these features bundled. For now, let's see clustering with PostgreSQL.


There are many detailed clustering architecture diagrams in the links I have shared below, but what I present here is just an overview.

What is High Availability clustering?

High availability clustering (HAC) is a feature which provides redundancy and fault tolerance. It is a number of connected devices processing and providing a service. HAC involves employing both hardware and software technologies, like server redundancy (including application failover and server clustering), storage redundancy (including RAID and I/O multipathing), network redundancy and power system redundancy.

Its goal is to ensure this service is always available, even in the event of a failure. If one server fails, the other servers will continue processing and take on the processing load of the failed server. HA cluster implementations attempt to use redundancy of cluster components to eliminate single points of failure.

Currently Available HA Products

There are many competitive high-availability software products in the market today, and deciding which one to purchase can be tough. The following is a list of features you should look for in any HAC product.
  • Clustering capability ( How many servers can be clustered together?)
  • Load-balancing capability 
  • Intelligent monitoring 
  • Centralized management capability
  • Application monitoring
  • Cost (Most importantly though :) )
  • Customer support (Most of the products do this)
I have seen two of them: one is Red Hat Cluster Suite (the commonly used HA package for the Linux operating system) and the other is SteelEye LifeKeeper.


Who needs it?

High availability clusters are often used by websites serving 24x7x365 that cannot afford any downtime (e.g. Amazon.com, music websites, customer service sites) and by companies with critical databases.

How does it work?

You need a minimum of two nodes to start with HA. HA clusters usually use a heartbeat private network connection to monitor the health and status of each node in the cluster. In any serious condition, if one of the cluster nodes goes down, the other node attempts to start its services and provides the same service.

Types of HAC :
  • Active/Passive: In this mode, one node is active (i.e., the primary) and processing the service, while the other node is in passive mode, meaning it is a standby that will only become active if the primary node fails.
  • Active/Active: In this mode, both nodes are active; traffic is load balanced between both nodes, and both process the service. If one node fails, the other node takes the full processing load until the failed node becomes active again.
Note: Active/Active mode is not supported with PostgreSQL.


Heartbeat:

A heartbeat is a sensing mechanism which sends a signal across to the primary node; if the primary node stops responding to the heartbeat for a predefined amount of time, a failover occurs automatically.

Failover Automation:

Automatic failover is the process of moving active services from the primary node to the standby node when the primary node fails. Usually the standby node continues the services until the primary node is back up and running. When a device fails, another device takes over; this process is referred to as failover.

Failover automation is usually implemented on hardware firewalls over networks. You need to configure the firewalls so that the standby node takes over in case the primary firewall fails.

HAC support with PostgreSQL

Currently, RHCS and LifeKeeper support Active/Passive clustering with PostgreSQL. There is no Active/Active support for PostgreSQL yet. As I said, PostgreSQL has no built-in failover automation, and the same goes for third-party replication tools like Slony-I, Londiste, etc. To achieve this, you may need a trick with OS-level scripting, or take the help of clustering.

The link below will help you to understand more about PostgreSQL clustering with RHCS, by Devrim Gunduz (Postgres community member).


Setup service from EnterpriseDB on RHCS:


Do post your comments..

--Raghav




~/.psqlrc file for DBA's


In our regular DBA monitoring, we use many combinations of pg_catalog queries to retrieve information like <IDLE> in transaction sessions, waiting queries, the number of connections, etc. Most DBAs create views to shorten big queries and keep them handy for later use as required.

PostgreSQL provides a startup file (.psqlrc) which executes before connecting to the database when using the psql utility. Using the .psqlrc file, you can place all your important queries in one spot, give each a one-word alias with the '\set' command, and execute the alias in the psql terminal instead of typing the big query. If you don't see a .psqlrc file in the 'postgres' user's home directory, you can create it explicitly. I tried it and found it very helpful.

Points on .psqlrc:
  • .psqlrc is a startup file, executed when connecting to the cluster.
  • The .psqlrc file resides in the 'postgres' user's home directory.
  • The psql options -X and -c do not read the .psqlrc file.
  • The .psqlrc file applies to the whole session, not to one database.


Let's see how to implement this.

Syntax:
\set <alias-variable-name>  'query'
Note: if your query has single or double quotes then use \' or \" in the query.

Sample Queries to put in .psqlrc file with alias:
vi ~/.psqlrc

\pset pager off

\set waits 'SELECT pg_stat_activity.procpid, pg_stat_activity.current_query, pg_stat_activity.waiting, now() - pg_stat_activity.query_start as "totaltime", pg_stat_activity.backend_start FROM pg_stat_activity WHERE pg_stat_activity.current_query !~ \'%IDLE%\'::text AND pg_stat_activity.waiting = true;;'

\set locks 'select pid,mode,current_query from pg_locks,pg_stat_activity where granted=false and locktype=\'transactionid\' and pid=procpid order by pid,granted;;'

:wq!

Usage:
postgres=# :waits
procpid | current_query | waiting | totaltime | backend_start
---------+-------------------------------+---------+-----------------+----------------------------------
9223 | insert into locks VALUES (1); | t | 00:00:18.901773 | 2011-10-08 00:29:10.065186+05:30
(1 row)

postgres=# :locks
pid | mode | current_query
------+-----------+-------------------------------
9223 | ShareLock | insert into locks VALUES (1);
(1 row)


Hope it was helpful. Enjoy... :). Will be back with some more stuff.

--Raghav



Deadlocks in PostgreSQL


Before discussing deadlocks, let's see the types of locks and their acquisition methodology in PostgreSQL.
Types of Locks:
  1. Table-Level Locks and
  2. Row-Level Locks

Table-Level Locks:
  1. AccessShareLock : It is acquired automatically by a SELECT statement on the table or tables it retrieves from. This mode blocks ALTER TABLE, DROP TABLE and VACUUM (AccessExclusiveLock) on the same table.
  2. RowShareLock : It is acquired automatically by a SELECT...FOR UPDATE clause. It blocks concurrent ExclusiveLock and AccessExclusiveLock on the same table.
  3. RowExclusiveLock: It is acquired automatically by an UPDATE, INSERT or DELETE command. It blocks ALTER TABLE, DROP TABLE, VACUUM and CREATE INDEX commands (ShareLock, ShareRowExclusiveLock, ExclusiveLock and AccessExclusiveLock) on the same table.
  4. ShareLock: It is acquired automatically by a CREATE INDEX command. It blocks INSERT, UPDATE, DELETE, ALTER TABLE, DROP TABLE and VACUUM commands (RowExclusiveLock, ShareRowExclusiveLock, ExclusiveLock and AccessExclusiveLock) on the same table.
  5. ShareRowExclusiveLock: This lock mode is nearly identical to ExclusiveLock, but it allows concurrent RowShareLock to be acquired.
  6. ExclusiveLock: "Every transaction holds an exclusive lock on its transaction ID for its entire duration. If one transaction finds it necessary to wait specifically for another transaction, it does so by attempting to acquire share lock on the other transaction ID. That will succeed only when the other transaction terminates and releases its locks." (regards, Tom Lane). The best definition, by Tom Lane; I believe every email from him is a lesson, he is Dr. PostgreSQL :). ExclusiveLock blocks INSERT, UPDATE, DELETE, CREATE INDEX, ALTER TABLE, DROP TABLE, SELECT...FOR UPDATE and VACUUM commands on the table (RowShareLock, RowExclusiveLock, ShareLock, ShareRowExclusiveLock, ExclusiveLock and AccessExclusiveLock).
  7. AccessExclusiveLock: It is acquired automatically by an ALTER TABLE, DROP TABLE or VACUUM command on the table it modifies. This blocks any concurrent command or other lock mode from being acquired on the locked table.
Row-Level Locks:

There are two types of row-level locks: share and exclusive. Don't fall into confusion over the LOCK naming; you can differentiate a row lock from a table lock by the column 'locktype' in pg_locks.
  1. Exclusive lock: It is acquired automatically when a row is hit by an UPDATE or DELETE. The lock is held until the transaction commits or rolls back. To manually acquire an exclusive lock, use SELECT...FOR UPDATE.
  2. Share lock: It is acquired when a row is hit by a SELECT...FOR SHARE.
Note: In either case of row-level locks, data retrieval is not affected at all. A row-level lock blocks only writers (i.e., a writer will block a writer), as the illustration below shows.
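A quick illustration of writer-blocking-writer (the table and values are hypothetical):
-- Session 1
BEGIN;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;
-- Session 2: blocks here until Session 1 commits or rolls back
UPDATE accounts SET balance = balance - 10 WHERE id = 1;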

DeadLocks:

Now to deadlocks. You have seen the lock modes and their lock acquisition methodology; there are situations where some transactions fall into deadlock. I believe application design is the culprit that forces transactions into deadlocks. Deadlocks are mostly caused by ExclusiveLocks, i.e. UPDATE or DELETE.
What is deadlock ?

Process A holds a lock on object X and waits for a lock on object Y. Process B holds a lock on object Y and waits for a lock on object X. At this point the two processes are in what is called 'deadlock': each is trying to obtain a lock on something owned by the other. They would both wait on each other forever if left in this state. One of them has to give up and release the locks it already has. The deadlock detector comes into the picture, allows one process to succeed and rolls the other back.
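The classic reproduction with two sessions (a hypothetical table t holding rows id = 1 and id = 2):
-- Session 1:
BEGIN;
UPDATE t SET v = 1 WHERE id = 1;
-- Session 2:
BEGIN;
UPDATE t SET v = 2 WHERE id = 2;
-- Session 1 (now waits for Session 2):
UPDATE t SET v = 1 WHERE id = 2;
-- Session 2 (the deadlock detector aborts this one):
UPDATE t SET v = 2 WHERE id = 1;
ERROR:  deadlock detected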

To overcome deadlock, design the application in such a way that any UPDATE or DELETE transaction succeeds with complete ownership of the table. Lock the table in 'SHARE UPDATE EXCLUSIVE MODE' or use 'SELECT...FOR UPDATE' or 'ACCESS EXCLUSIVE MODE', and complete the transaction; a sketch follows below. In this model, the deadlock detector will never report that it has been hit by ExclusiveLocks.
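For example, serializing the writers by taking the table lock up front (a sketch with the same hypothetical table; SHARE UPDATE EXCLUSIVE conflicts with itself, so two such transactions queue up instead of deadlocking):
BEGIN;
LOCK TABLE t IN SHARE UPDATE EXCLUSIVE MODE;
UPDATE t SET v = 1 WHERE id = 1;
UPDATE t SET v = 1 WHERE id = 2;
COMMIT;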

You can test the scenario given in the picture above with this resolution; you will see that the deadlock detector never throws an error.

Locking Query:

\set locks 'SELECT w.locktype AS waiting_locktype,w.relation::regclass AS waiting_table,w.transactionid, substr(w_stm.current_query,1,20) AS waiting_query,w.mode AS waiting_mode,w.pid AS waiting_pid,other.locktype AS other_locktype,other.relation::regclass AS other_table,other_stm.current_query AS other_query,other.mode AS other_mode,other.pid AS other_pid,other.granted AS other_granted FROM pg_catalog.pg_locks AS w JOIN pg_catalog.pg_stat_activity AS w_stm ON (w_stm.procpid = w.pid) JOIN pg_catalog.pg_locks AS other ON ((w.\"database\" = other.\"database\" AND w.relation  = other.relation) OR w.transactionid = other.transactionid) JOIN pg_catalog.pg_stat_activity AS other_stm ON (other_stm.procpid = other.pid) WHERE NOT w.granted AND w.pid <> other.pid;;'

Locking information Links
Hope you got some idea of PostgreSQL locks. See you all soon with another good blog... :)

--Raghav

Resizing a VARCHAR Column of a Large Table


Note: Recommended not to tamper pg_catalogs.

On a forum, I saw an interesting posting and also its solution; however, a few things in that solution made me test it. The scenario is: "How to resize a VARCHAR column on a large table in less time, and what are the best approaches?". The known standard way is to create a NEW column with the desired size, copy the OLD data to the newly created column, drop the OLD column and finally rename the NEW column to the OLD column name. But note that I am talking here about 100 million rows :)

Another approach is to modify the PostgreSQL pg_catalog with the new SIZE in the pg_attribute relation. Below are the steps.
  1. Drop any indexes you have on the column being resized
  2. Put the database into READ-ONLY mode (PG 9.x)
  3. Use an UPDATE command on the pg_attribute relation, setting atttypmod (the column size) for the matching attname (the column name)
Command:
update pg_attribute set atttypmod = atttypmod + (desired Resize) where attrelid=<relation OID> and attname='<column Name>';

The above command updates the pg_attribute relation with the new column SIZE and lets you insert data according to the new size. The table data is not rewritten with the new SIZE; instead, the new size is simply picked up from the changed pg_catalog entry. A worked example follows below.
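A worked sketch with a hypothetical table resize_test(name varchar(20)). Note that for varchar, atttypmod stores the declared length plus 4 bytes of header, so varchar(20) shows up as 24:
postgres=# select atttypmod from pg_attribute where attrelid = 'resize_test'::regclass and attname = 'name';
 atttypmod
-----------
        24
(1 row)

postgres=# update pg_attribute set atttypmod = 40 + 4 where attrelid = 'resize_test'::regclass and attname = 'name';
UPDATE 1
After this, the column behaves as varchar(40).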

Disadvantage:

You cannot decrease the size; if you do, the VARCHAR column size becomes ZERO and won't allow you to enter any data into the table. You get the error below.

ERROR:  value too long for type character(0)

Will be back with more stuff. All the best :).

-Raghav

Londiste Replication with PostgreSQL 9.0

Londiste is an asynchronous master/slave replication tool developed as part of Skytools. It is very simple and user friendly, created like Slony. The core logic behind both Londiste and Slony is remote triggering, but Londiste follows an event-queuing model, which is not there in Slony-I.

Overview on Skytools:
Skytools is a Python-based application; it comes as a bundle of three things, PgQ, Londiste & walmgr, and it also requires the Python-Postgres driver 'psycopg2'.

  • PgQ : A queue mechanism built with pl/pgsql, with a Python framework on top of it.
  • Londiste: A replication tool written in Python, using PgQ as the event transporter.
  • walmgr : Creates a WAL archiving setup.

I am not going to describe much here regarding the Londiste replication daemon process, etc., because you can find the best tutorial on Skytools (PgQ/Londiste/walmgr) at this link: http://skytools.projects.postgresql.org/doc/.

Basically, my demo covers how to proceed with Londiste replication on PostgreSQL 9.0, along with installation steps. I'd say the Skytools documentation and the PostgreSQL wiki (http://wiki.postgresql.org/wiki/Londiste_Tutorial) are more than enough to play around with Londiste replication.

Pre-requisites with download links: PostgreSQL 9.0, Skytools 2.1.12 and psycopg2 2.4.2.


My Demo includes following :-
OS                      : RHEL 6 32-bit
DB version              : PostgreSQL 9.0
Clusters & databases    : londiste_provider on 5432, londiste_subscriber on 5433
Table                   : one table (ltest)
Location of .ini files  : /opt/skytools-2.1.12/scripts
Location of Skytools    : /opt/skytools-2.1.12
Location of PG 9.0      : /opt/PostgreSQL/9.0/
As it is a simple demo with one table, I have tried it with RHEL 6 32-bit/PostgreSQL 9.0 with two clusters on my local box. You would need to tweak it as per your actual requirements.

Note: Before moving forward with the setup, I would like to remind you that all source installations must be done as the root user, and after installation those directories should be owned by the postgres user.

Step 1.
Install PostgreSQL 9.0 and create two clusters with the INITDB command, making sure they run on ports 5432 & 5433. (Remember, it's an old fact that the INITDB command does not create the pg_log directory under the data directory; you need to create it explicitly.)

Step 2.
Install Skytools by downloading it from the link above. It is best practice to keep all sources in one common standard location. I used '/usr/local/src', with Skytools under '/opt/'. Now configure Skytools with the PostgreSQL 9.0 'pg_config'.
# tar -xvf skytools-2.1.12.tar.gz
# cd /usr/local/src/skytools-2.1.12
# ./configure --prefix=/opt/skytools-2.1.12 --with-pgconfig=/opt/PostgreSQL/9.0/bin/pg_config
# make
# make install
Note: After the installation you will see two important contrib modules (pgq & londiste) under the PostgreSQL contrib location. Basically, these two contribs give you the functionality of Londiste replication.
# cd /opt/PostgreSQL/9.0/share/postgresql/contrib
# ll lond*
-rw-r--r--. 1 root root 29771 Jan 11 13:24 londiste.sql
-rw-r--r--. 1 root root 27511 Jan 11 13:24 londiste.upgrade.sql

# ll pgq*
-rw-r--r--. 1 root root 4613 Jan 11 13:24 pgq_ext.sql
-rw-r--r--. 1 root root 1170 Jan 11 13:24 pgq_lowlevel.sql
-rw-r--r--. 1 root root 69798 Jan 11 13:24 pgq.sql
-rw-r--r--. 1 root root 3940 Jan 11 13:24 pgq_triggers.sql
-rw-r--r--. 1 root root 54182 Jan 11 13:24 pgq.upgrade.sql
Step 3.
Install psycopg2; it is the Python-Postgres driver that is necessary for Skytools. Sometimes this driver doesn't come with Python, so here are the installation steps.
# tar -xvf psycopg2-2.4.2.tar.gz
# cd psycopg2-2.4.2
# python setup.py install --prefix=/usr/local
# python setup.py build_ext --pg-config /opt/PostgreSQL/9.0/bin/pg_config
Step 4.
Give ownership of the Skytools and Postgres installation locations to the postgres user. This makes sure that all files/executables have postgres user permissions.
# chown -R postgres:postgres /opt/skytools-2.1.12 
# chown -R postgres:postgres /opt/PostgreSQL/9.0/
Step 5.
Set LD_LIBRARY_PATH & PYTHONPATH and start the two newly created clusters. You can place them in the .bash_profile of the postgres user as a permanent solution.
$export PYTHONPATH=/opt/skytools-2.1.12/lib/python2.6/site-packages/
$export LD_LIBRARY_PATH=/opt/PostgreSQL/9.0/lib:/usr/lib:/usr/lib/perl5/5.10.0/i386-linux-thread-multi/CORE:
or
$ vi .bash_profile
export PYTHONPATH=/opt/skytools-2.1.12/lib/python2.6/site-packages/
export LD_LIBRARY_PATH=/opt/PostgreSQL/9.0/lib:/usr/lib:/usr/lib/perl5/5.10.0/i386-linux-thread-multi/CORE:
:wq
$ . .bash_profile (execute to take effect of new settings)

Now start the two clusters

$ pg_ctl -o "-p 5432" -D /opt/PostgreSQL/9.0/data start
$ pg_ctl -o "-p 5433" -D /opt/PostgreSQL/9.0/data_1 start
Step 6.
Create two databases: londiste_provider on 5432 and londiste_subscriber on 5433. Create one table with a primary key, named 'ltest', in both databases, and INSERT some data into the londiste_provider ltest table; after completing the replication setup, you should see that data on the londiste_subscriber side.

You may not need CREATE TABLE on the slave side; instead, you can use a structure dump/restore with pg_dump/pg_restore if you have many tables.
On 5432
psql -p 5432 -c "create database londiste_provider;"
psql -p 5432 londiste_provider
londiste_provider=# create table ltest(id int primary key);
londiste_provider=# insert into ltest VALUES (generate_series(1,10));
INSERT 0 10

On 5433
psql -p 5433 -c "create database londiste_subscriber;"
psql -p 5433 londiste_subscriber
londiste_subscriber=# create table ltest(id int primary key);
Step 7.
Create two .ini files, one for Londiste (londiste.ini) and another for the PgQ ticker (pgqadm.ini), as sketched below. You can also find sample .ini files in the base installation of Skytools, e.g. in the "/opt/skytools-2.1.12/share/doc/skytools/conf" location.
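As a sketch, minimal londiste.ini and pgqadm.ini files for this demo could look like the below (the job and queue names match the status output in Step 10; treat the exact values as assumptions based on the Skytools 2.x samples):
[londiste]
job_name = myfirstlondiste
provider_db = dbname=londiste_provider port=5432
subscriber_db = dbname=londiste_subscriber port=5433
pgq_queue_name = londiste.replica
logfile = /opt/PostgreSQL/9.0/log/%(job_name)s.log
pidfile = /opt/PostgreSQL/9.0/pid/%(job_name)s.pid

[pgqadm]
job_name = pgqadm_provider
db = dbname=londiste_provider port=5432
maint_delay_min = 600
loop_delay = 0.1
logfile = /opt/PostgreSQL/9.0/log/%(job_name)s.log
pidfile = /opt/PostgreSQL/9.0/pid/%(job_name)s.pid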

Step 8.
Create two directories for the log and PID files, and point the corresponding parameters of londiste.ini and pgqadm.ini at them.
$ cd /opt/PostgreSQL/9.0
$ mkdir log pid
Step 9.
Start the replication with the .ini files: first install Londiste on the provider and subscriber, and then start the ticker (PgQ) to replicate the tables.

Install Londiste on the provider and subscriber with the below commands, one by one:
$ cd /opt/skytools-2.1.12/bin
$ ./londiste.py ../scripts/londiste.ini provider install
2012-01-12 14:56:03,667 11073 INFO plpgsql is installed
2012-01-12 14:56:03,674 11073 INFO txid_current_snapshot is installed
2012-01-12 14:56:03,675 11073 INFO Installing pgq
2012-01-12 14:56:03,676 11073 INFO Reading from /opt/skytools-2.1.12/share/skytools/pgq.sql
2012-01-12 14:56:03,816 11073 INFO Installing londiste
2012-01-12 14:56:03,816 11073 INFO Reading from /opt/skytools-2.1.12/share/skytools/londiste.sql

-bash-4.1$ ./londiste.py ../scripts/londiste.ini subscriber install
2012-01-12 14:56:17,871 11081 INFO plpgsql is installed
2012-01-12 14:56:17,872 11081 INFO Installing londiste
2012-01-12 14:56:17,873 11081 INFO Reading from /opt/skytools-2.1.12/share/skytools/londiste.sql

--> Now, install PgQ and start the ticker with its .ini file.

-bash-4.1$ ./pgqadm.py ../scripts/pgqadm.ini install
2012-01-11 16:45:03,219 6348 INFO plpgsql is installed
2012-01-11 16:45:03,225 6348 INFO txid_current_snapshot is installed
2012-01-11 16:45:03,228 6348 INFO pgq is installed

-bash-4.1$ ./pgqadm.py -d ../scripts/pgqadm.ini ticker -d

-->Add the table to provider & subscriber to replicate.

-bash-4.1$ ./londiste.py ../scripts/londiste.ini provider add ltest
2012-01-12 15:03:39,583 11139 INFO Adding public.ltest

-bash-4.1$ ./londiste.py ../scripts/londiste.ini subscriber add ltest
2012-01-12 15:03:47,367 11146 INFO Checking public.ltest
2012-01-12 15:03:47,384 11146 INFO Adding public.ltest
After adding, start the replication of the table.
-bash-4.1$ ./londiste.py ../scripts/londiste.ini replay -d

Note: "-d" option is to run the londiste/PgQ daemons in background.
That completes the replication setup. Now you should see the "ltest" table data on the slave side (i.e. on port 5433).

Step 10.
Now let's understand what all happened in the background to the tables/logs/PIDs/data, etc. Let's see it one by one.
Logs Information:
Table Structure after replication:

Event Queue Status
Replication status can be checked with pgq utility as below:-
-bash-4.1$ ./pgqadm.py ../scripts/pgqadm.ini status
Postgres version: 9.0.1 PgQ version: 2.1.8

Event queue Rotation Ticker TLag
------------------------------------------------------------------------------
londiste.replica 3/7200s 500/3s/60s 6s
------------------------------------------------------------------------------

Consumer Lag LastSeen
------------------------------------------------------------------------------
londiste.replica:
myfirstlondiste 6s 6s
------------------------------------------------------------------------------
Note: There are very good options with the Londiste & PgQ utilities to do R&D with.
Hoping you all have a successful Londiste replication setup. Please do post your comments; they are highly appreciated. See you all soon with some more postings.

--Raghav

Duplicate Rows in a Primary Key Table

Back again; I am getting very little time for blogging :)

"ERROR: could not create unique index
DETAIL: Table contains duplicated values."

This error is thrown by Postgres when it encounters duplicate rows in a primary-key table, failing a REINDEX or CREATE UNIQUE INDEX command.

Why do duplicate rows exist in a table?

I am not sure exactly :), nor are there any proven explanations out there...
Two things come to my mind.

Firstly, it might be delayed index creation, or, if you have shared sequences in a database, sharing one sequence across two different primary-key tables might be the cause while restoring data into the table (pg_restore). Secondly, if a huge transaction was taking place on the table and someone abruptly stopped the instance at the backend, that might also leave the index (primary key) failing to point to the right rows.

How to fix it ?

Well, as a common practice, when we encounter duplicate rows in a table (whatever the reason), we first filter out the duplicate rows and delete them, and then a REINDEX should fix the issue.

Query for finding duplicate rows:
select count(*),primary_column from table_name group by primary_column having count(*) > 1;
If REINDEX or CREATE UNIQUE INDEX still fails even after deleting the duplicate rows, it means your index is not clean. The query above may not give the complete output you expect, because the planner will pick the index that is already corrupted with duplicate rows. See the explain plan below.
postgres=# explain select count(*),id from duplicate_test group by id having count(*) > 1;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
GroupAggregate (cost=0.00..5042.90 rows=99904 width=4)
Filter: (count(*) > 1)
-> Index Scan using duplicate_test_pkey on duplicate_test (cost=0.00..3044.82 rows=99904 width=4)
(3 rows)
We need to catch the CTID of the duplicate rows from the main table and delete them with a conditional statement on CTID + primary key value.

I have played a bit with pg_catalog to violate a primary-key table and reproduce the scenario with a similar error. (Please don't do this on a real system.)
postgres=# create unique index idup on duplicate_test(id);
ERROR: could not create unique index "idup"
DETAIL: Key (id)=(10) is duplicated.
My Table Definition & Data:
postgres=# \d duplicate_test
Table "public.duplicate_test"
Column | Type | Modifiers
--------+---------+-----------
id | integer | not null
name | text |
Indexes:
"duplicate_test_pkey" PRIMARY KEY, btree (id)

postgres=# select * from duplicate_test ;
id | name
----+---------
10 | Raghav ---Duplicate
20 | John H
30 | Micheal
10 | Raghav ---Duplicate
(4 rows)
Now, let's fix this...

Step 1. Create a new table from the affected table by pulling only two column values: CTID and the primary key.
postgres=# CREATE TABLE dupfinder AS SELECT ctid AS tid, id FROM duplicate_test;
SELECT 4
Step 2. Now, let's run the duplicate-finder query with CTID to get the exact duplicates.
postgres=# select * from dupfinder x where exists (select 1 from dupfinder y where x.id = y.id and x.tid != y.tid);
tid | id
-------+----
(0,1) | 10
(0,5) | 10
(2 rows)
Step 3. From the above result, you can now delete one row from the main (affected) table using its CTID.
postgres=# delete from duplicate_test where ctid='(0,5)' and id=10;
DELETE 1
Step 4. Now, your REINDEX or CREATE UNIQUE INDEX will be successful.
postgres=# create unique index idup on duplicate_test(id);
CREATE INDEX

postgres=# select * from duplicate_test ;
id | name
----+---------
10 | Raghav
20 | John H
30 | Micheal
(3 rows)
Step 5. Don't forget to run an immediate VACUUM ANALYZE on the table to update the statistics and system catalogs after the CTID movement.
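For completeness, a minimal sketch of that final step on the example table:

postgres=# VACUUM ANALYZE duplicate_test;
VACUUM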

Please do share your comments.

--Raghav

Caching in PostgreSQL

Caching...!! It's a little hard to cover in a single article, but I will try to share, in short, the knowledge I learnt from Heikki / Robert Haas / Bruce Momjian. In PostgreSQL there are two layers, PG shared buffers and the OS page cache, and any read/write has to pass through the OS cache (no bypassing as of now). Postgres writes data to the OS page cache and confirms to the user that it has been written to disk; later the OS cache writes to the physical disk at its own pace. PG shared buffers have no control over the OS page cache and do not even know what's in it. Hence, most of the recommendations given by Postgres DBAs/professionals are to have faster disks / a better cache.
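To see the size of the only layer PostgreSQL itself controls (the OS page cache is sized by the kernel, not by Postgres), check the shared_buffers setting; the value shown depends entirely on your configuration:

postgres=# show shared_buffers;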

Caches/buffers in PostgreSQL are as strong as in other databases and highly sophisticated. As I am from an Oracle background (mindset also…:) ), my questions to the people I learnt from were the how/when/what/why of the database buffer cache, pinned buffers, flushing the database buffer cache, preloading the database, etc. I got all my answers from them, although the approach is a bit different. Though my questions were bugging, they answered with great patience and clarified things to a good extent, and as a result you are reading this blog.... :)..


From some of those learnings (still learning), I have drawn a small overview of how data flows between memory and disk in Postgres, and also covered some of the important tools and the NEW patch by Robert Haas (pg_prewarm).

pg_buffercache
A contrib module which tells what's in the PostgreSQL buffer cache. Installation below:
postgres=# CREATE EXTENSION pg_buffercache;
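Once installed, one query gives a first look at what is cached; the sketch below follows the example query in the pg_buffercache documentation (top 10 relations of the current database by buffer count):

postgres=# SELECT c.relname, count(*) AS buffers
           FROM pg_buffercache b
           INNER JOIN pg_class c
                   ON b.relfilenode = pg_relation_filenode(c.oid)
                  AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                                            WHERE datname = current_database()))
           GROUP BY c.relname
           ORDER BY 2 DESC
           LIMIT 10;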
pgfincore
It provides functionality to report what data is in the OS page cache. pgfincore becomes very handy when clubbed with pg_buffercache: one can then see PG buffer cache & OS page cache information together. Thanks to Cédric Villemain. pgfincore's backbone is fadvise/fincore, which are Linux ftools; you can also use fincore/fadvise by installing them from source. Either way, the pgfincore contrib module and the ftools scripts give the same result. I tried both, and they are simply awesome.
Installation:
Download the latest version: http://pgfoundry.org/frs/download.php/3186/pgfincore-v1.1.1.tar.gz
As root user:
export PATH=/usr/local/pgsql91/bin:$PATH   # set the path so pg_config is found
tar -xvf pgfincore-v1.1.1.tar.gz
cd pgfincore-1.1.1
make clean
make
make install

Now connect to PG and run below command

postgres=# CREATE EXTENSION pgfincore;
pg_prewarm
Preloads a relation/index into the PG buffer cache. Is it possible in PostgreSQL? Oh yes, thanks to Robert Haas, who has recently submitted a patch to the community; hopefully it will be available in PG 9.2 or PG 9.3. In the meantime, you can apply the patch yourself for testing on PG 9.1.

pg_prewarm has three MODE's:

  1. PREFETCH: fetches data blocks asynchronously into the OS cache only, not into PG buffers (hits the OS cache only)
  2. READ: reads all the blocks into a dummy buffer and forces them into the OS cache (hits the OS cache only)
  3. BUFFER: reads all the blocks, or a range of blocks, into the database buffer cache
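All three modes are invoked the same way. A minimal sketch, assuming the "cache" example table used later in this post and the argument order this version of the patch expects (relation, fork, mode, first block, last block; NULLs mean the whole relation):

-- OS cache only, asynchronous
postgres=# select pg_prewarm('cache', 'main', 'prefetch', null, null);
-- OS cache only, synchronous
postgres=# select pg_prewarm('cache', 'main', 'read', null, null);
-- into PostgreSQL shared buffers
postgres=# select pg_prewarm('cache', 'main', 'buffer', null, null);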

Installation:
I am applying the pg_prewarm patch on my PG source installation; you will need to tweak the steps as per your setup.

  1. Untar location of PG source : /usr/local/src/postgresql-9.1.3
  2. PG installation location : /usr/local/pgsql91
  3. All downloads Location : /usr/local/src

Note: Install PG before applying pg_prewarm patch.

1. Download the patch to /usr/local/src/ location
http://archives.postgresql.org/pgsql-hackers/2012-03/binRVNreQMnK4.bin
Patch attached Email:
http://archives.postgresql.org/message-id/CA+TgmobRrRxCO+t6gcQrw_dJw+Uf9ZEdwf9beJnu+RB5TEBjEw@mail.gmail.com
2. After download go to PG source location and follow the steps.
# cd /usr/local/src/postgresql-9.1.3
# patch -p1 < ../pg_prewarm.bin         (I renamed the file after download)
# make -C contrib/pg_prewarm
# make -C contrib/pg_prewarm install
3. The above commands will create files under $PGPATH/contrib/extension. Now you are ready to add the contrib module.
postgres=# create EXTENSION pg_prewarm;
CREATE EXTENSION
postgres=# \dx
List of installed extensions
Name | Version | Schema | Description
----------------+---------+------------+----------------------------------------
pg_buffercache | 1.0 | public | examine the shared buffer cache
pg_prewarm | 1.0 | public | prewarm relation data
pgfincore | 1.1.1 | public | examine and manage the os buffer cache
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(4 rows)

Documentation:
/usr/local/src/postgresql-9.1.3/doc/src/sgml
[root@localhost sgml]# ll pgpre*
-rw-r--r-- 1 root root 2481 Apr 10 10:15 pgprewarm.sgml
dstat
A combination of vmstat, iostat, netstat, top, etc. together in one "dstat" Linux command. When the database behaves unusually and you want to find the cause at the OS level, you normally open a couple of terminals to pull process, memory, disk read/write and network information, which is a bit of a pain, shuffling between windows. dstat has several options built in which help to show all of that in one output, in one window.
Installation:
Dstat download link: (RHEL 6)
wget http://pkgs.repoforge.org/dstat/dstat-0.7.2-1.el6.rfx.noarch.rpm
or
yum install dstat
Documentation: http://dag.wieers.com/home-made/dstat/
Linux ftools
It's designed for working with modern Linux system calls including mincore, fallocate, fadvise, etc. ftools will help you figure out which files are in the OS cache. Using the perl/python scripts you can retrieve OS page cache information for object files (pg_class.relfilenode). pgfincore is based on this. You can use either pgfincore or the ftools scripts.
Installation:
Download the tar.gz from the link.
https://github.com/david415/python-ftools

cd python-ftools
python setup.py build
export PYTHONPATH=build/lib.linux-x86_64-2.5
python setup.py install

Note: You need to have python & psycopg2 installed before installing python-ftools.
Now, we are all set to proceed with example to check with the tools & utilities. In my example, I have a table, it has one index & sequence with 100+ MB of data in it.
postgres=# \d+ cache
Table "public.cache"
Column | Type | Modifiers | Storage | Description
--------+---------+-----------------------------------------+----------+-------------
name | text | | extended |
code | integer | | plain |
id | integer | default nextval('icache_seq'::regclass) | plain |
Indexes:
"icache" btree (code)
Has OIDs: no
Query to know the size occupied by table,sequence and its index.
postgres=# SELECT c.relname AS object_name,
CASE when c.relkind='r' then 'table'
when c.relkind='i' then 'index'
when c.relkind='S' then 'sequence'
else 'others'
END AS type,pg_relation_size(c.relname::text) AS size, pg_size_pretty(pg_relation_size(c.relname::text)) AS pretty_size
FROM pg_class c
JOIN pg_roles r ON r.oid = c.relowner
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE (c.relkind = ANY (ARRAY['r'::"char", 'i'::"char", 'S'::"char",''::"char"])) AND n.nspname = 'public';

object_name | type | size | pretty_size
-------------+----------+----------+-------------
icache_seq | sequence | 8192 | 8192 bytes
cache | table | 83492864 | 80 MB
icache | index | 35962880 | 34 MB
(3 rows)

Total object size 'cache'

postgres=# select pg_size_pretty(pg_total_relation_size('cache'));
pg_size_pretty
----------------
114 MB
(1 row)
I have written a small query clubbing pgfincore and pg_buffercache to pull information from the PG buffers & OS page cache together. I will be using this query throughout my example, pasting only its output.
select rpad(c.relname,30,' ') as Object_Name,
case when c.relkind='r' then 'Table' when c.relkind='i' then 'Index' else 'Other' end as Object_Type,
rpad(count(*)::text,5,' ') as "PG_Buffer_Cache_usage(8KB)",
split_part(pgfincore(c.relname::text)::text,','::text,5) as "OS_Cache_usage(4KB)"
from pg_class c inner join pg_buffercache b on b.relfilenode=c.relfilenode
inner join pg_database d on (b.reldatabase=d.oid and d.datname=current_database() and c.relnamespace=(select oid from pg_namespace where nspname='public'))
group by c.relname,c.relkind
order by "PG_Buffer_Cache_usage(8KB)"
desc limit 10;

object_name | object_type | PG_Buffer_Cache_usage(8KB) | OS_Cache_usage(4KB)
-------------+-------------+----------------------------+---------------------
(0 rows)

Note: I have bounced the cluster to flush the PG buffers & OS page cache, so there is no data in any cache/buffer.
Preloading relation/index using pg_prewarm:
Before bouncing the cluster, I fired a full-table sequential scan query on the "cache" table and noted the time, i.e., before warming the relation/index.
postgres=# explain analyze select * from cache ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Seq Scan on cache (cost=0.00..26192.00 rows=1600000 width=19) (actual time=0.033..354.691 rows=1600000 loops=1)
Total runtime: 427.769 ms
(2 rows)
Let's warm the relation/index using pg_prewarm and check the query plan.
postgres=# select pg_prewarm('cache','main','buffer',null,null);
pg_prewarm
------------
10192
(1 row)
postgres=# select pg_prewarm('icache','main','buffer',null,null);
pg_prewarm
------------
4390
(1 row)

Output of combined buffers:
object_name | object_type | PG_Buffer_Cache_usage(8KB) | OS_Cache_usage(4KB)
-------------+-------------+----------------------------+---------------------
icache | Index | 4390 | 8780
cache | Table | 10192 | 20384
(2 rows)
pgfincore output:
postgres=# select relname,split_part(pgfincore(c.relname::text)::text,','::text,5) as "In_OS_Cache" from pg_class c where relname ilike '%cache%';
relname | In_OS_Cache
------------+-------------
icache_seq | 2
cache | 20384
icache | 8780
(3 rows)

or for each object.

postgres=# select * from pgfincore('cache');
relpath | segment | os_page_size | rel_os_pages | pages_mem | group_mem | os_pages_free | databit
------------------+---------+--------------+--------------+-----------+-----------+---------------+---------
base/12780/16790 | 0 | 4096 | 20384 | 20384 | 1 | 316451 |
(1 row)
To retrieve similar information using the python-ftools scripts, you need to know the object's relfilenode number; check below.
postgres=# select relfilenode,relname from pg_class where relname ilike '%cache%';
relfilenode | relname
-------------+----------------
16787 | icache_seq /// you can exclude sequence.
16790 | cache /// table
16796 | icache /// index
(3 rows)
Using the python-ftools script:


Is it not interesting...!!
Now compare the explain plan after warming the table into the buffers.
postgres=# explain analyze select * from cache ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Seq Scan on cache (cost=0.00..26192.00 rows=1600000 width=19) (actual time=0.016..141.804 rows=1600000 loops=1)
Total runtime: 215.100 ms
(2 rows)
How to flush/prewarm relation/index in OS cache ?
Using pgfadvise, you can preload into or flush a relation from the OS cache. For more information, type \df pgfadvise* at the psql prompt to list all functions related to pgfadvise. Below is an example of flushing the OS cache.
postgres=# select * from pgfadvise_dontneed('cache');
relpath | os_page_size | rel_os_pages | os_pages_free
------------------+--------------+--------------+---------------
base/12780/16790 | 4096 | 20384 | 178145
(1 row)
postgres=# select * from pgfadvise_dontneed('icache');
relpath | os_page_size | rel_os_pages | os_pages_free
------------------+--------------+--------------+---------------
base/12780/16796 | 4096 | 8780 | 187166
(1 row)
postgres=# select relname,split_part(pgfincore(c.relname::text)::text,','::text,5) as "In_OS_Cache" from pg_class c where relname ilike '%cache%';
relname | In_OS_Cache
------------+-------------
icache_seq | 0
cache | 0
icache | 0
(3 rows)
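For the prewarm direction, pgfincore also ships the counterpart function pgfadvise_willneed, which asks the kernel to pull the relation's pages back into the OS cache (the output columns are the same as in the pgfadvise_dontneed example above):

postgres=# select * from pgfadvise_willneed('cache');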
While these things are going on in one window, you can check the read/write ratio using dstat in another. For more options use dstat --list.
dstat -s --top-io --top-bio --top-mem

Preloading a range of blocks using pg_prewarm's range functionality.
Assume that, for some reason, you want to bounce the cluster, but one big table sitting in the buffers is performing well. On bouncing, that table is no longer in the buffers; to get back to the state before the bounce, you have to know how many of the table's blocks were in the buffers and preload them using pg_prewarm's range option.

I have created a table by querying pg_buffercache, and later fed that block range information to pg_prewarm. With this, the shared buffers are back to holding the table that was loaded earlier. See the example.
select c.relname,count(*) as buffers from pg_class c 
inner join pg_buffercache b on b.relfilenode=c.relfilenode and c.relname ilike '%cache%'
inner join pg_database d on (b.reldatabase=d.oid and d.datname=current_database())
group by c.relname
order by buffers desc;
relname | buffers
---------+---------
cache | 10192
icache | 4390
(2 rows)
Note: These are the blocks currently in the buffers.

postgres=# create table blocks_in_buff (relation, fork, block) as select c.oid::regclass::text, case b.relforknumber when 0 then 'main' when 1 then 'fsm' when 2 then 'vm' end, b.relblocknumber from pg_buffercache b, pg_class c, pg_database d where b.relfilenode = c.relfilenode and b.reldatabase = d.oid and d.datname = current_database() and b.relforknumber in (0, 1, 2);
SELECT 14716
Bounce the cluster, then preload the range of blocks related to the table into the buffers from the "blocks_in_buff" table.
postgres=# select sum(pg_prewarm(relation, fork, 'buffer', block, block)) from blocks_in_buff;
sum
-------
14716
(1 row)

postgres=# select c.relname,count(*) as buffers from pg_class c
inner join pg_buffercache b on b.relfilenode=c.relfilenode and c.relname ilike '%cache%'
inner join pg_database d on (b.reldatabase=d.oid and d.datname=current_database())
group by c.relname
order by buffers desc;
relname | buffers
---------+---------
cache | 10192
icache | 4390
(2 rows)
See, my shared_buffers is back in play.

Enjoy…!!! Will be back with more interesting stuff. Do post your comments.

--Raghav

Autonomous Transaction in PostgreSQL 9.1

Currently I am working on migrations from Oracle to PostgreSQL. Though I am a DBA, these days I am learning a bit on the developer track too ... :)
Let's look at a small feature of Oracle and a similar way to achieve it in PostgreSQL.

Autonomous Transaction,what is it ?

An autonomous transaction is an independent transaction that is initiated by another transaction, and executes without interfering with the parent transaction. When an autonomous transaction is called, the originating transaction gets suspended. Control is returned when the autonomous transaction does a COMMIT or ROLLBACK.

Example in Oracle:
Create two tables and one procedure as shown below.

create table table_a(name varchar2(50));
create table table_b(name varchar2(50));

create or replace procedure insert_into_table_a is
begin
insert into table_a values('Am in A');
commit;
end;

Let's test it here.

SQL> begin
2 insert into table_b values('Am in B');
3 insert_into_table_a;
4 rollback;
5 end;
6 /

PL/SQL procedure successfully completed.

SQL> select * from table_a;

Am in A

SQL> select * from table_b;

Am in B
In my example above, line 3 committed the insert from line 2, which line 4 should have rolled back. What I want here is for the transaction blocks to behave independently; to achieve that in Oracle we declare PRAGMA autonomous_transaction in the procedure so it acts as an independent transaction block. Let's retake:
Truncate table table_a;
Truncate Table table_b;

create or replace procedure insert_into_table_a is pragma autonomous_transaction;
begin
insert into table_a values('Am in A');
commit;
end;

SQL> begin
2 insert into table_b values('Am in B');
3 INSERT_INTO_TABLE_A;
4 rollback;
5 end;
6 /

PL/SQL procedure successfully completed.

SQL> select * from table_a;

NAME
----------
Am in A

SQL> select * from table_b;

no rows selected
How to make it work in PostgreSQL?

Autonomous transactions are very well supported in Oracle. There is no equivalent functionality in PostgreSQL; however, you can achieve it with a hack using dblink. Below is the link where the hack was provided:
http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php
create extension dblink;

create or replace function insert_into_table_a() returns void as $$
begin
perform dblink_connect('pragma','dbname=edb');
perform dblink_exec('pragma','insert into table_a values (''Am in A'');');
perform dblink_exec('pragma','commit;');
perform dblink_disconnect('pragma');
end;
$$ language plpgsql;

edb=# begin;
BEGIN
edb=# insert into table_b VALUES ('am in B');
INSERT 0 1
edb=# select insert_into_table_a();
insert_into_table_a
---------------------

(1 row)

edb=# select * from table_a;
name
---------
Am in A
(1 row)

edb=# select * from table_b;
name
---------
am in B
(1 row)

edb=# rollback;
ROLLBACK
edb=# select * from table_a;
name
---------
Am in A
(1 row)

edb=# select * from table_b;
name
------
(0 rows)

Is it not simple? Thanks to the hack provider.

--Raghav



Compiling PL/Proxy with PostgresPlus Advance Server 9.1

PostgresPlus Advanced Server 9.1 (PPAS) is an EnterpriseDB product which comes with enterprise features in addition to community PostgreSQL. Most of the contrib modules (pgFoundry) can be plugged into this product using StackBuilder. However, currently Pl/Proxy is neither bundled nor downloadable with StackBuilder, so here is how you can compile Pl/Proxy with PPAS 9.1.

1. Download Pl/Proxy.
wget http://pgfoundry.org/frs/download.php/3274/plproxy-2.4.tar.gz
tar -xvf plproxy-2.4.tar.gz
make PG_CONFIG=/opt/PostgresPlus/9.1AS/bin/pg_config
make install PG_CONFIG=/opt/PostgresPlus/9.1AS/bin/pg_config

Note: Flex & Bison must be installed before compiling pl/proxy.

2. After successful compilation, you get two files: plproxy.so under $PGPATH/lib & plproxy--2.4.0.sql under $PGPATH/share/extension/.
Execute the .sql file, which creates the call handler & language.
bash-4.1$ psql -p 5444 -U enterprisedb -d edb -f /opt/PostgresPlus/9.1AS/share/extension/plproxy--2.4.0.sql
CREATE FUNCTION
CREATE LANGUAGE
CREATE FUNCTION
CREATE FOREIGN DATA WRAPPER

Now you can see the language installed.
edb=# \dL
List of languages
Name | Owner | Trusted
---------+--------------+---------
edbspl | enterprisedb | t
plpgsql | enterprisedb | t
plproxy | enterprisedb | f
(3 rows)

3. Let's test some sample code with pl/proxy.
create table users(username text,blog text);
insert into users values('Raghav','raghavt.blogspot.com');

CREATE or replace FUNCTION get_user_blog(i_username text)
RETURNS SETOF text AS $$
CONNECT 'dbname=edb';
SELECT blog FROM users WHERE username = $1;
$$ LANGUAGE plproxy;

edb=# select * from get_user_blog('Raghav');
get_user_blog
----------------------------------
raghavt.blogspot.com
(1 row)

All set to go testing with pl/proxy on PPAS 9.1. If you want to know how to set up pl/proxy, follow the links below.
http://www.depesz.com/2011/12/02/the-secret-ingredient-in-the-webscale-sauce/
http://kaiv.wordpress.com/2007/07/27/postgresql-cluster-partitioning-with-plproxy-part-i/

--Raghav

Upgrading Slony-I 2.0.x to latest version 2.1.x

Slony-I 2.1 has very good fixes and new features like adding bulk tables, improvements to WAIT FOR with MERGE SET/MOVE SET, support for TRUNCATE on replicated tables, and many more. I am using Slony-I 2.0.7, so I thought of upgrading it to the latest version. Upgrading Slony-I is very simple and can be achieved in a few steps. My upgrade procedure assumes there is already a Master/Slave setup running Slony 2.0.7.

Backup Plan:
1. Back up the existing slony schema (_slonyschema) on master & slave.
2. Back up the OLD Slony binaries.
3. Back up all initially created slony configuration files.

Upgrade Procedure:
1. Stop all running slon processes on all nodes.
2. Install the new version of Slony 2.1.x binaries.
3. Execute the SLONIK upgrade script.
4. Start slony with the new binaries on all nodes.

Link: http://slony.info/documentation/2.1/slonyupgrade.html
Current PostgreSQL & Slony version:

repdb=# select substr(version(),1,26) as "PostgreSQL-Version",_myrep.slonyversion();
PostgreSQL-Version | slonyversion
----------------------------+--------------
PostgreSQL 9.1.3 on x86_64 | 2.0.7
(1 row)
Install/Configure Latest version of Slony-I 2.1.x source
 wget http://main.slony.info/downloads/2.0/source/slony1-2.1.0.tar.bz2
./configure --prefix=/opt/PostgreSQL/9.1/bin --with-pgconfigdir=/opt/PostgreSQL/9.1/bin
make
make install

After installation, you can find three executables slon, slonik & slon_logshipper under "/opt/PostgreSQL/9.1/bin/bin".

-bash-4.1$ ./slon -v
slon version 2.1.0
Upgrade Script:
## Upgrade script

cluster name = myrep;
node 1 admin conninfo='host=localhost dbname=postgres user=postgres port=5432';
node 2 admin conninfo='host=localhost dbname=repdb user=postgres port=5433';
UPDATE FUNCTIONS ( ID = 1 );
UPDATE FUNCTIONS ( ID = 2 );

Note: Run UPDATE FUNCTIONS for all the nodes. I have two nodes: Master (5432) and Slave (5433).
Execute the script:
-bash-4.1$ slonik upgrade_207_201.slonik
Start the slony process with new binaries and check for the changes.
postgres=# select substr(version(),1,26) as "PostgreSQL-Version",_myrep.slonyversion();
PostgreSQL-Version | slonyversion
----------------------------+--------------
PostgreSQL 9.1.3 on x86_64 | 2.1.0
(1 row)
You can see my slony version has been upgraded to the latest. You can also perform a health check on the schema with a function provided by Slony-I in their documentation. The health-check function should return TRUE; otherwise your PG & Slony catalogs are damaged somewhere.
Function link: http://slony.info/documentation/2.1/function.slon-node-health-check.html
postgres=# select node_health_check();
node_health_check
-------------------
t
(1 row)
--Raghav

PostgreSQL Process names on Solaris

PostgreSQL processes are few and countable: the writer process, wal writer process, stats collector, autovacuum launcher, syslogger process, archiver process & the postmaster daemon. If replication is enabled, there will also be wal sender & wal receiver processes. In my trainings, I show process information by executing "ps -ef | grep postgres", but how could I show the same on Solaris? I checked the Solaris documentation and found it is very simple and easy to get the process names, just as on Linux.

The PostgreSQL documentation says to use /usr/ucb/ps with the -ww option to get process names instead of the regular /usr/bin/ps; however, even then most of the information remains hidden. Let's see how to retrieve the complete postgres process names on Solaris.

Below are my postgres 9.1 instance processes on Solaris:
bash-3.00$ /usr/ucb/ps -awwx | grep postgres
7778 ? S 0:04 /Desktop/postgres/9.1-pgdg/bin/64/postgres -D /Desktop/postgres/9.1-pgdg/data
7779 ? S 0:01 /Desktop/postgres/9.1-pgdg/bin/64/postgres -D /Desktop/postgres/9.1-pgdg/data
7780 ? S 0:00 /Desktop/postgres/9.1-pgdg/bin/64/postgres -D /Desktop/postgres/9.1-pgdg/data
7781 ? S 0:00 /Desktop/postgres/9.1-pgdg/bin/64/postgres -D /Desktop/postgres/9.1-pgdg/data
7776 pts/5 S 0:00 /Desktop/postgres/9.1-pgdg/bin/64/postgres -D /Desktop/postgres/9.1-pgdg/data
More extended way with pargs:
bash-3.00$  pargs `/usr/ucb/ps -awwx | grep postgres | awk '{print $1}'`
7778: /Desktop/postgres/9.1-pgdg/bin/64/postgres -D /Desktop/postgres/9.1-pgdg/data
argv[0]: postgres: writer process
argv[1]:
argv[2]:

7779: /Desktop/postgres/9.1-pgdg/bin/64/postgres -D /Desktop/postgres/9.1-pgdg/data
argv[0]: postgres: wal writer process
argv[1]:
argv[2]:

7780: /Desktop/postgres/9.1-pgdg/bin/64/postgres -D /Desktop/postgres/9.1-pgdg/data
argv[0]: postgres: autovacuum launcher process
argv[1]:
argv[2]:

7781: /Desktop/postgres/9.1-pgdg/bin/64/postgres -D /Desktop/postgres/9.1-pgdg/data
argv[0]: postgres: stats collector process
argv[1]:
argv[2]:

7776: /Desktop/postgres/9.1-pgdg/bin/64/postgres -D /Desktop/postgres/9.1-pgdg/data
argv[0]: /Desktop/postgres/9.1-pgdg/bin/64/postgres
argv[1]: -D
argv[2]: /Desktop/postgres/9.1-pgdg/data
7776 is the postmaster daemon process.
bash-3.00$ cat /Desktop/postgres/9.1-pgdg/data/postmaster.pid
7776
/Desktop/postgres/9.1-pgdg/data
1339917119
5432
/tmp
localhost
5432001 50331683
Though it seems simple, I believe it's worth knowing :).

--Raghav

Simple Slony-I Replication Setup.

Shown above is a short overview of Slony-I asynchronous replication. For more information, the Slony-I documentation is your best friend :).

Let's start with the replication methods. For the perltools method, you need to configure slony at source-installation time to enable the built-in perl scripts. These scripts start with "SLONIK_" and are designed to carry out replication administrative tasks.

My demo of the two methods, shell (slonik) & perl, is on a localhost single instance (5432) with two databases, master & slave, replicating one table, "rep_table". For replication, master/slave should hold the same table structure. If you have many tables, use the pg_dump/pg_restore structure-only dump option. Since I am replicating one table, I just created it on both master & slave.
Note: Set environment variables like PGDATA,PGPORT,PGHOST,PGPASSWORD & PGUSER.

Source Installation:
Download the Slony-I 2.1 source(http://slony.info/downloads/) 

#bunzip2 slony1-2.1.0.tar.bz2
#tar -xvf slony1-2.1.0.tar
# cd slony1-2.1.0
#./configure --prefix=/opt/PostgreSQL/9.1/bin
--with-pgconfigdir=/opt/PostgreSQL/9.1/bin
--with-perltools=/opt/PostgreSQL/9.1/bin
// Exclude --with-perltools if not needed
# make
# make install
Basic setup on Master/Slave
createdb -p 5432 master
createdb -p 5432 slave

psql -p 5432 -d master -c "create table rep_table(id int primary key);"
psql -p 5432 -d slave -c "create table rep_table(id int primary key);"

Insert some data on master to replicate to slave
psql -p 5432 -d master -c "insert into rep_table values(generate_series(1,10));"
Method 1:  --with-perltools :

1. Create a standard .conf file with information like the log location, number of nodes, set of tables, etc.
$CLUSTER_NAME = 'myrep';
$LOGDIR = '/opt/PostgreSQL/9.1/slonylogs';
$MASTERNODE = 1;
$DEBUGLEVEL = 2;

&add_node(node => 1,host => 'localhost',dbname => 'master',port => 5432,user => 'postgres',password => 'postgres');
&add_node(node => 2,host => 'localhost',dbname => 'slave',port => 5433,user => 'postgres',password => 'postgres');

$SLONY_SETS =
{
"set1" =>
{
"set_id" => 1,
"table_id" => 1,
"pkeyedtables" =>
[rep_table,],
},
};
Initialize, create set & subscribe set: these are the three phases of slony replication. For each phase, "slonik_" perl scripts were created in the location given at source-installation time with the "--with-perltools" option; in my case, "/opt/PostgreSQL/9.1/bin". The above CONF file is used in all phases.

2. Initialize the cluster. Here slonik, cross-checks the nodes connection.
cd /opt/PostgreSQL/9.1/bin
./slonik_init_cluster -c slon.conf
./slonik_init_cluster -c slon.conf| ./slonik
3. Create a set, means which set of tables to replicate from Node 1 to Node 2.
./slonik_create_set -c slon.conf 1 
./slonik_create_set -c slon.conf 1|./slonik
4. Start the slon daemons. Each node needs its own slon process to carry out the work, so the slon process for each node should be started.
./slon_start -c slon.conf 1
./slon_start -c slon.conf 2
5. Subscribe the set. From this point slony maintains data consistency between the two nodes by allowing all DML on the master and denying it on the slave.
./slonik_subscribe_set -c slon.conf 1 2 
./slonik_subscribe_set -c slon.conf 1 2|./slonik
After the above steps, your slave will now have the replicated data.
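A quick hedged check, assuming the demo objects above (databases master & slave, table rep_table): insert a fresh row on the master and look for it on the slave a moment later.

master=# insert into rep_table values(11);
INSERT 0 1
slave=# select * from rep_table where id = 11;
 id
----
 11
(1 row)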

Method 2: With standard scripts:

In the standard-scripts method there are many ways to implement it, but to keep things clear I have split it the same way as the perl method above: initialize, create set & subscribe set. All scripts are bound to the SLONIK command.

1. Create two .conf files for Master & Slave Node.
vi master_slon.conf
cluster_name=myrep
pid_file='/opt/PostgreSQL/9.1/data/master_slon.pid'
conn_info='host=localhost dbname=master user=postgres port=5432'

vi slave_slon.conf
cluster_name=myrep
pid_file='/opt/PostgreSQL/9.1/data/slave_slon.pid'
conn_info='host=localhost dbname=slave1 user=postgres port=5432'
2. Initialize the cluster.
#!/bin/bash
# Initialize Cluster (init_cluster.sh)

slonik <<_eof_
cluster name = myrep;
node 1 admin conninfo='host=127.0.0.1 dbname=master user=postgres port=5432';
node 2 admin conninfo='host=127.0.0.1 dbname=slave1 user=postgres port=5432';

#Add Node
init cluster (id = 1, comment = 'Primary Node For the Slave postgres');
store node (id = 2, event node = 1, comment = 'Slave Node For The Primary postgres');

#Setting Store Paths ...
echo 'Stored all nodes in the slony catalogs';
store path(server = 1, client = 2, conninfo='host=127.0.0.1 dbname=master user=postgres port=5432');
store path(server = 2, client = 1, conninfo='host=127.0.0.1 dbname=slave1 user=postgres port=5432');
_eof_

$./init_cluster.sh
3. Create a set.
#!/bin/bash
# Create Set for set of tables (create-set.sh)

slonik <<_eof_
cluster name = myrep;
node 1 admin conninfo='host=127.0.0.1 dbname=master user=postgres port=5432';
node 2 admin conninfo='host=127.0.0.1 dbname=slave1 user=postgres port=5432';

try { create set (id = 1 ,origin = 1 , comment = 'Set for public'); } on error { echo 'Could not create set1'; exit 1;}

set add table (set id = 1 , origin = 1, id = 1, full qualified name = 'public.rep_table1', comment = 'Table action with primary key');
_eof_

$./create-set.sh
4. To start the slon daemons, use the custom script that comes along with the source tarball under the "/tools" location, "start_slon.sh". Modify the script by changing the .conf file locations for the master/slave startup scripts. This script gives the flexibility to use and track all slon processes with the help of the PIDs mentioned in the .conf files.
Usage: ./master_start_slon.sh [start|stop|status]

-bash-4.1$ ./master_start_slon.sh start
-bash-4.1$ ./slave_start_slon.sh start

Sample STATUS output:

-bash-4.1$ ./master_start_slon.sh status
---------------------
Slony Config File : /opt/PostgreSQL/9.1/slony_scripts/bash_slony/master_slon.conf
Slony Bin Path : /opt/PostgreSQL/9.1/bin
Slony Running Status : Running...
Slony Running (M)PID : 28487
---------------------
5. Subscribe set.
#!/bin/bash
# Subscribe Set (subscribe-set.sh)

slonik <<_eof_
cluster name = myrep;
node 1 admin conninfo='host=127.0.0.1 dbname=master user=postgres port=5432';
node 2 admin conninfo='host=127.0.0.1 dbname=slave1 user=postgres port=5432';

try { subscribe set (id = 1, provider = 1 , receiver = 2, forward = yes, omit copy = false); } on error { exit 1; } echo 'Subscribed nodes to set 1';
_eof_

$./subscribe-set.sh
Now your slave database will have the replicated data in the "rep_table" table.
These two methods should help you understand the basic setup of slony replication. I will be back with more advanced slony concepts.

--Raghav