Friday 18 September 2015

How to setup a stress test on Oracle or equivalent databases

In a recent workshop on Oracle12c administration, I noticed that the participants had no idea how to let Oracle create archived redo logfles, i.e. they did not know how to create a relevant workload. In the exercises of the workshops you can find one, which performs some workload on CPU and I/O, but nothing about writing so much to the online redo logs, that the log writer process LGWR needs to copy the contents to the archived redo log files.
In summary, the students - all computer scientists working several years in IT companies - had never seen a stress test before. To solve the issue, I passed an old script of mine to them. To whom it may concern, I will publish it here, because it is really simple.


The principle is not difficult to understand: you need a tool, script or the like, which runs a series of users with a certain delay against a database and lets them perform executable SQL statements, which the database engine is forced to write into online redo logs. The size of redo logs is 50 MB per default, but one may create bigger ones - see the documentation. That means, you must have a considerable iteration. To keep it simple, we here repeat the same action. If you are familiar with the general idea, you may create more sophisticated  tests, instead of buying expensive tools on the market.

1. The master script: stresstest.sh - looks complicated, but is simple

#!/usr/bin/ksh
# a simple stresstest:
#
# HWK 2008: TODO beautify
# basically, that's what it comes down to.
#
# defaults
LOG=stresstest.log
USERS=1
ITERATIONS=1
DELAY=1

USAGE="# usage: stresstest.sh [options] filename\n"
USAGE+="# -u    number of concurrent users\n"
USAGE+="# -i    number of iterations per user\n"
USAGE+="# -d    seconds delay before starting additional users\n"
USAGE+="# filename      name of the programm, SQL-Statements, stored procedure, ...\n"
USAGE+="# filename is required: absolute, relative or ./filename\n"
USAGE+="# example: ./stresstest.sh -u 50 -i 100 -d 30 ./load.sql\n\n"

function doit {
print "\nstresstest.sh executes $EXE with $USERS users, each with  $ITERATIONS iterations and $DELAY sec delay \n"
print "\nLogfiles: $LOG, additionally USERn.log"
if [[ -e $LOG ]]; then
        rm $LOG *.log   > /dev/null 2>&1
else
        touch $LOG
fi

for (( i = 1; i <= $USERS; i++))
do
        $EXE "USER$i" "$ITERATIONS"  &
        print "starting USER $i" | tee -a $LOG
        sleep $DELAY
done
print "\nAll users started with following PIDs" | tee -a $LOG
jobs -p | tee -a $LOG
ps -l -p $(jobs -p) | tee -a $LOG
wait
}

function killalljobs {
        kill "$@" $(jobs -p)

}

clear

while getopts "u#i#d#" opt; do
        case $opt in
                u ) USERS=$OPTARG;;
                i ) ITERATIONS=$OPTARG;;
                d ) DELAY=$OPTARG ;;
                \?) print $USAGE;
                        exit 1;;
        esac
done
shift $(($OPTIND - 1))
##########
if [[ $# -lt 1 ]]; then
  print $USAGE ; exit 1
fi
EXE=$1
if [[ -e $EXE ]]; then
        doit
else
        print $EXE not found
        exit 1
fi

By the way: you can use this master-script to set up any sort of stress test, e.g. applications, programs, etc.

2. The program, SQL-Statement or alike: load.sql - holy simplicity

LOG=$1.log
ITER=$2
for ((i = 1; i <= $ITER; i++))
do
sleep 1
print $(date) "$0 $1 $i of $ITER "
sqlplus -S sh/sh<< EOT
insert into sales_bkp select * from sales;
commit;
EOT
done > $LOG


3. You need a user and at least one object

I assume here, that the EXAMPLES are installed - understand: you must do that before the test, otherwise you will have no test target.


alter user sh identified by sh account unlock;
conn sh/sh
create table sales_bkp as select * from sales where 0=1;
exit

And that's it.

The Automatic Database Diagnostic Monitor ADDM will slap you for connecting /disconnecting so many times and recommend to "analyze the application logic". Further more it will notice, that the table is too big and it needs partitioning and some other inconvenient truths. Try it: take a snapshot before and after the stress test and then call the ADDM - you will love this cute tool.

ADDM & AWR is explained in my next post.

Beware: 
After the fourth user, there will be already 2 archived redo log files. So choose less users and less iterations or another SQL statement. And take care of the fast_recovery_area_size,  otherwise the database will stop due to an archiver stuck.

But all this is very good for training: backup the database plus archivelog (RMAN), make a crosscheck and delete expired archivelogs, etc. 

No comments: