Installing DB2 on OpenSolaris

Why?

The other day, Ben Rockwell mentioned on his blog that the free edition of DB2 was available for 64-bit Solaris on x86 systems. I like learning about new server software, and databases in particular, so I figured I’d take a look at it.

The majority of my relational database administration experience has been with Oracle, Microsoft SQL Server, MySQL, and PostgreSQL. Of that mix, I expected DB2 to be much more like Oracle than the others: specifically, a complex installation (if you don’t want to use GUI tools that produce bloated default databases with every option under the sun enabled) and annoying command line tools (it’s 2009 and sqlplus still doesn’t have any sort of command completion or press-the-up-arrow-to-get-the-previous-command support). I was pleasantly surprised by how easy it was for a mere mortal to install DB2 and create a database entirely from the command line. The interactive tools, though, seem about as brain-dead as sqlplus, so my guess was partially right (perhaps a slight edge to DB2 for the help system in its interactive command processor, but ding it back down for requiring an explicit line continuation character…but I’m getting way ahead of myself, we haven’t even installed it yet!).

So enough philosophy, let’s get down to business. Please remember that there’s a “GUI setup wizard” that can do all of this in just a few clicks (for Solaris as well as the other supported platforms, Linux and Windows). But where’s the fun in that? I like to know exactly what’s going on in my systems, and I’ve found that doing things the manual way is a much better way to learn how to support a system in the long run. Also, I like to be able to script most server setup tasks for reliable repeatability. If you’re with me, here we go!

Installing DB2

I have downloaded DB2 9.5 Express-C for Solaris x64, and have the distribution extracted in /root/db2_9.5_expc, ready for installation.

First, I’ll install the software to the default location, /opt/IBM/db2/V9.5. The -p EXP option tells the installer which product to install (EXP is Express Edition). Before running the installer, I create the /usr/local/bin directory because DB2 puts a command (db2ls) in there. It doesn’t hurt if that command doesn’t get installed, but the installer will tell you that there were minor errors. So, on to installing the software:

root@lab01v04# mkdir -p /usr/local/bin
root@lab01v04# cd /root/db2_9.5_expc
root@lab01v04# ./db2_install -b /opt/IBM/db2/V9.5 -p EXP -n
The execution completed successfully.

For more information see the DB2 installation log at 
"/tmp/db2_install.log.18091".
root@lab01v04#

Creating an Instance

Easy enough. Now we need to create an instance. A DB2 instance is what holds databases and everything in them. One instance can hold several databases (like Microsoft SQL Server or MySQL, but unlike Oracle). DB2 instances are owned by and tied to a local user account. In addition to the instance owner, there is a “fenced user” that provides a security context in which to run certain code. So we’ll be creating two users: db2inst1, the instance owner, and db2fenc1, the fenced user. Note that the actual instance data will live inside the home directory of the instance owner. We’ll also create two groups, the instance admin group (db2iadm1) and the fenced admin group (db2fadm1). One physical server can run several instances of DB2, so the 1 on the end of the user and group names is just an easy way of identifying this particular instance we’re creating.

Create the groups, then the users, then set the users’ passwords using the regular Solaris tools:

root@lab01v04# groupadd db2iadm1
root@lab01v04# groupadd db2fadm1
root@lab01v04# useradd -g db2iadm1 -d /export/home/db2inst1 \
               -s /usr/bin/ksh93 -m db2inst1
64 blocks
root@lab01v04# useradd -g db2fadm1 -d /export/home/db2fenc1 \
               -s /usr/bin/ksh93 -m db2fenc1
64 blocks
root@lab01v04# passwd db2inst1
New Password: ...password...
Re-enter new Password: ...password...
passwd: password successfully changed for db2inst1
root@lab01v04# passwd db2fenc1
New Password: ...password...
Re-enter new Password: ...password...
passwd: password successfully changed for db2fenc1

We now have the users ready to go. Finally, our last task as root is to create the actual instance. We’ll do that with the db2icrt command, which takes an argument for the fenced user and the instance name/user:

root@lab01v04# /opt/IBM/db2/V9.5/instance/db2icrt -u db2fenc1 db2inst1
Sun Microsystems Inc.   SunOS 5.11      snv_104 November 2008
Sun Microsystems Inc.   SunOS 5.11      snv_104 November 2008
DBI1070I  Program db2icrt completed successfully.

Simple as that. The Sun banner that appears a couple of times is from the instance creation scripts logging in as the users to perform some setup.

Now that the instance is created, we can do the rest of the work as the db2inst1 user, so we’ll change logins. The instance creation tool added an entry to db2inst1’s .profile file to pull in the environment for all of the DB2 commands.
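For reference, that environment setup is just a short stanza that sources a script from the instance’s sqllib directory. On this install it looks roughly like the following (the path will differ if you chose different user names or home directories):

```shell
# Approximate contents appended to ~db2inst1/.profile by db2icrt;
# db2profile sets up PATH, DB2INSTANCE, and related variables.
if [ -f /export/home/db2inst1/sqllib/db2profile ]; then
    . /export/home/db2inst1/sqllib/db2profile
fi
```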

Our first task is to start the instance:

db2inst1@lab01v04$ db2start
SQL1063N  DB2START processing was successful.

Creating the First Database

Since this is a new instance, there isn’t actually anything in it yet: no databases. Now we can create our first database, which we’ll call mydb.

db2inst1@lab01v04$ db2 create database mydb
DB20000I  The CREATE DATABASE command completed successfully.

The db2 command is the DB2 Command Line Processor (CLP). The CLP is the primary interface for issuing commands to the server. You can either run a db2 command from the shell, which executes the command and exits, or use the CLP interactively by running db2 with no arguments. To run a couple of quick tests on our database, we’ll use the CLP interactively:

db2inst1@lab01v04$ db2
(c) Copyright IBM Corporation 1993,2007
Command Line Processor for DB2 Client 9.5.1

You can issue database manager commands and SQL statements from the command 
prompt. For example:
    db2 => connect to sample
    db2 => bind sample.bnd

For general help, type: ?.
For command help, type: ? command, where command can be
the first few keywords of a database manager command. For example:
 ? CATALOG DATABASE for help on the CATALOG DATABASE command
 ? CATALOG          for help on all of the CATALOG commands.

To exit db2 interactive mode, type QUIT at the command prompt. Outside 
interactive mode, all commands must be prefixed with 'db2'.
To list the current command option settings, type LIST COMMAND OPTIONS.

For more detailed help, refer to the Online Reference Manual.

db2 =>

When we launch the CLP, we get some basic usage information and then the prompt. We first need to connect to a database, so we’ll connect to the one we just created, mydb:

db2 => connect to mydb

   Database Connection Information

 Database server        = DB2/SUNX8664 9.5.1
 SQL authorization ID   = DB2INST1
 Local database alias   = MYDB

Looks good. We’re connected as the db2inst1 user to the mydb database. Now we can just issue regular SQL statements, so we’ll create a table and insert a couple of rows. Note that in the DB2 CLP (as in the Unix shell), you need to put a backslash at the end of a line to continue a command. Also, do not put semicolons at the end of SQL statements; they are not allowed in interactive mode.

db2 => create table testtab ( \
db2 (cont.) => id integer not null primary key, \
db2 (cont.) => name varchar(50) not null )
DB20000I  The SQL command completed successfully.
db2 => insert into testtab values (1, 'First entry')
DB20000I  The SQL command completed successfully.
db2 => insert into testtab values (2, 'Second entry')
DB20000I  The SQL command completed successfully.
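If the continuation characters get tiresome, the CLP can also read statements from a file, where an explicit terminator replaces the backslash convention. A sketch, with testtab.sql as a hypothetical file name; the -t option makes the semicolon the statement terminator, and -f reads commands from a file:

```shell
# Build a script file, then feed it to the CLP non-interactively.
cat > testtab.sql <<'EOF'
connect to mydb;
create table testtab (
    id integer not null primary key,
    name varchar(50) not null );
insert into testtab values (1, 'First entry');
insert into testtab values (2, 'Second entry');
connect reset;
EOF
db2 -t -f testtab.sql
```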

Not surprisingly, it’s working like a SQL database should. We have a table, testtab, which we’ve inserted two rows into.

Finally, we’ll disconnect from the database and quit the CLP:

db2 => connect reset
DB20000I  The SQL command completed successfully.
db2 => quit
DB20000I  The QUIT command completed successfully.

Enabling Network Connectivity

That works great, but so far we’ve only accessed the instance locally. To allow connections from clients on other systems, we need to configure the instance to accept TCP/IP connections. This is an instance-level setting, so all of the databases you create in this instance will be available to remote clients.

The first step is to tell the instance what port to listen on. We’ll use 50000. Note that svcename in the following command could also be the name of an entry in the /etc/inet/services file, which explains why the parameter is named svcename instead of something with the word “port” in it.

db2inst1@lab01v04$ db2 update dbm configuration using svcename 50000
DB20000I  The UPDATE DATABASE MANAGER CONFIGURATION command completed 
successfully.
SQL1362W  One or more of the parameters submitted for immediate modification 
were not changed dynamically. Client changes will not be effective until the 
next time the application is started or the TERMINATE command has been issued. 
Server changes will not be effective until the next DB2START command.
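If you prefer the service-name route, you could first register a name in /etc/inet/services and then point svcename at it. An untested sketch; the name db2c_db2inst1 is just a common convention, not a requirement:

```shell
# Register a symbolic name for the port, then use the name as the svcename.
printf "db2c_db2inst1\t50000/tcp\n" >> /etc/inet/services
db2 update dbm configuration using svcename db2c_db2inst1
```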

Next we need to enable TCP/IP as a communication protocol. This uses a new command, db2set:

db2inst1@lab01v04$ db2set DB2COMM=tcpip

Finally, restart the instance:

db2inst1@lab01v04$ db2stop
SQL1064N  DB2STOP processing was successful.
db2inst1@lab01v04$ db2start
SQL1063N  DB2START processing was successful.

How do we know if it worked? First, we’ll check whether there’s something listening on port 50000 on our system:

db2inst1@lab01v04$ netstat -an | grep 50000
      *.50000              *.*                0      0 49152      0 LISTEN

Looks good! netstat reports that a process is accepting connections on port 50000 on all interfaces. To really prove that we’re ready to start serving clients, though, we’ll test connectivity from another system.

Connecting From a Remote Client

We installed DB2 on lab01v04. Over on another machine, lab01v03, the DB2 client is installed. I’m logged in as mwilson, a user that the DB2 server knows nothing about, but I should be able to connect to DB2 as the db2inst1 user, which is the “superuser” or “root user” for the database.

The DB2 client software contains the same db2 command to launch the Command Line Processor. To connect to a remote server from a client, you first need to define the “node,” which represents the instance. Then you define the specific database within the instance. Once I have the database defined (cataloged in DB2 parlance), I can connect to it by name just like I did on the server, only I’ll add a username since I don’t have the benefit of being logged on locally to the server as an authorized user.

I’ll give you this one in one piece: starting the CLP, cataloging the node and database, then connecting and reading the data we inserted into the table earlier. All of this is happening from lab01v03, talking to the DB2 instance we created on lab01v04.

mwilson@lab01v03$ db2
(c) Copyright IBM Corporation 1993,2007
Command Line Processor for DB2 Client 9.5.1

You can issue database manager commands and SQL statements from the command 
prompt. For example:
    db2 => connect to sample
    db2 => bind sample.bnd

For general help, type: ?.
For command help, type: ? command, where command can be
the first few keywords of a database manager command. For example:
 ? CATALOG DATABASE for help on the CATALOG DATABASE command
 ? CATALOG          for help on all of the CATALOG commands.

To exit db2 interactive mode, type QUIT at the command prompt. Outside 
interactive mode, all commands must be prefixed with 'db2'.
To list the current command option settings, type LIST COMMAND OPTIONS.

For more detailed help, refer to the Online Reference Manual.

db2 => catalog tcpip node lab01v04 remote lab01v04 server 50000
DB20000I  The CATALOG TCPIP NODE command completed successfully.
DB21056W  Directory changes may not be effective until the directory cache is 
refreshed.
db2 => catalog database mydb at node lab01v04
DB20000I  The CATALOG DATABASE command completed successfully.
DB21056W  Directory changes may not be effective until the directory cache is 
refreshed.
db2 => connect to mydb user db2inst1
Enter current password for db2inst1: ...password...

   Database Connection Information

 Database server        = DB2/SUNX8664 9.5.1
 SQL authorization ID   = DB2INST1
 Local database alias   = MYDB

db2 => select * from testtab

ID          NAME                                              
----------- --------------------------------------------------
          1 First entry                                       
          2 Second entry                                      

  2 record(s) selected.

db2 => connect reset
DB20000I  The SQL command completed successfully.
db2 => quit
DB20000I  The QUIT command completed successfully.
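As on the server, the client-side cataloging can be scripted by running each command through db2 from the shell rather than interactively. The terminate at the end addresses the DB21056W warning above by refreshing the directory cache:

```shell
# Non-interactive equivalent of the cataloging session above.
db2 catalog tcpip node lab01v04 remote lab01v04 server 50000
db2 catalog database mydb at node lab01v04
db2 terminate    # refresh the directory cache so the new entries take effect
```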

Wrapping Up

It worked! With really just a handful of commands, entirely from the command line, we’ve a) installed DB2, b) created an instance, c) created a database, d) enabled network client connectivity, and e) connected to our database from a remote client. If you’re familiar with doing the exact same thing with Oracle (please, no GUI installer and no automatic bloated database creation with the Database Configuration Assistant), you’ll appreciate just how much of a breeze this was with DB2, despite it being every bit as “enterprise” a database as Oracle. Maybe I’ll put together an article going through all of my Oracle installation and instance creation scripts for comparison.
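In that scripting spirit, the entire server-side sequence from this article condenses into one short script. Consider it an untested sketch that assumes the same download location, paths, names, and port used above; setting the users’ passwords is left as an interactive step:

```shell
#!/usr/bin/ksh
# Untested sketch: repeatable DB2 Express-C server setup, as performed above.
# Run as root. Assumes the extracted distribution lives in /root/db2_9.5_expc.
set -e
DB2DIR=/opt/IBM/db2/V9.5

# Install the software (db2ls lands in /usr/local/bin).
mkdir -p /usr/local/bin
( cd /root/db2_9.5_expc && ./db2_install -b "$DB2DIR" -p EXP -n )

# Instance owner and fenced user, each with a dedicated group.
groupadd db2iadm1
groupadd db2fadm1
useradd -g db2iadm1 -d /export/home/db2inst1 -s /usr/bin/ksh93 -m db2inst1
useradd -g db2fadm1 -d /export/home/db2fenc1 -s /usr/bin/ksh93 -m db2fenc1
echo "Remember to set passwords: passwd db2inst1; passwd db2fenc1"

# Create the instance.
"$DB2DIR/instance/db2icrt" -u db2fenc1 db2inst1

# Start it, create a database, and enable TCP/IP on port 50000.
su - db2inst1 -c 'db2start &&
    db2 create database mydb &&
    db2 update dbm configuration using svcename 50000 &&
    db2set DB2COMM=tcpip &&
    db2stop && db2start'
```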

In any case, the only thing we didn’t do is create the DB2 Administration Server (DAS), which allows remote management with the DB2 GUI utilities, but I don’t plan on needing that for now. If I did, it’s literally just a matter of creating a user to own the DAS and running the command dascrt -u DASuser.

There is one thing that I’m not satisfied with at this point, though: the instance uses the local system user accounts for authentication. That means, for example, that if I want a user, mwilson, in my database, I need to create a Unix account for mwilson. Luckily, DB2 ships with an LDAP authentication plugin. This will allow me to store user information in an LDAP directory and create as many users as I want without making any changes to the operating system hosting DB2. We’ll get that up and running, using OpenDS, in the next installment.

Replacing a Bad Drive with ZFS

One of the drives in my home file server was making occasional nasty clicking noises, which in my experience always precedes death in a hard drive. The drive was only a couple of months old, so it must have just been a bad apple. Some quick testing showed that it was failing its SMART self-tests, which made it quick and easy to get a warranty replacement from Seagate. Replacing the drive was painless and all of the data was safe, since my data is on a zpool consisting of four terabyte drives in a RAID-Z configuration (if you’re familiar with traditional RAID, think RAID-5). Of course the data is also backed up, because RAID is not a substitute for backups, but as expected ZFS “just worked” and the new drive took over for the old one with no hassles.

If you want to relive the experience, here’s my session on the server after shutting down, replacing the failing drive, and starting back up. First, on the console, Solaris complained about something not being quite right as soon as the server booted:

SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Wed Jan 14 19:12:39 PST 2009
PLATFORM: PowerEdge 1800, CSN: BSQMN91, HOSTNAME: athena
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: c6647451-fa5a-4f4b-99fd-de1e76bb059d
DESC: The number of I/O errors associated with a ZFS device exceeded
      acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD for more
      information.
AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
      will be made to activate a hot spare if available. 
IMPACT: Fault tolerance of the pool may be compromised.
REC-ACTION: Run 'zpool status -x' and replace the bad device.

Yeah, it didn’t like booting with a totally different hard drive in place of a drive that was a member of a zpool. A quick check confirms that Solaris is, indeed, complaining about the drive I replaced:

mwilson athena:~ [1258]% zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver completed after 0h0m with 0 errors on Wed Jan 14 19:12:11 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          raidz1    DEGRADED     0     0     0
            c0t2d0  ONLINE       0     0     0
            c0t3d0  UNAVAIL      0     0     0  cannot open
            c0t4d0  ONLINE       0     0     0
            c0t5d0  ONLINE       0     0     0

errors: No known data errors

No surprises there; we’ll tell it to replace c0t3d0. Given only a single device argument, the zpool replace command assumes the replacement device is at the same location as the old one (c0t3d0, in this case).

mwilson athena:~ [1260]% pfexec zpool replace tank c0t3d0

That command executes immediately and returns me to the prompt. We can monitor the status while the pool is resilvering:

mwilson athena:~ [1262]% zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 0.40% done, 2h45m to go
config:

        NAME              STATE     READ WRITE CKSUM
        tank              DEGRADED     0     0     0
          raidz1          DEGRADED     0     0     0
            c0t2d0        ONLINE       0     0     0
            replacing     DEGRADED     0     0     0
              c0t3d0s0/o  FAULTED      0     0     0  corrupted data
              c0t3d0      ONLINE       0     0     0
            c0t4d0        ONLINE       0     0     0
            c0t5d0        ONLINE       0     0     0

errors: No known data errors
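Rather than re-running zpool status by hand, you can poll until the resilver finishes; a quick sketch:

```shell
# Poll every five minutes until "resilver in progress" disappears
# from the status output, then show the final state.
while zpool status tank | grep 'resilver in progress' > /dev/null; do
    sleep 300
done
zpool status tank
```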

And some time later…

mwilson athena:~ [1273]% zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 2h34m with 0 errors on Wed Jan 14 21:48:43 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0
            c0t4d0  ONLINE       0     0     0
            c0t5d0  ONLINE       0     0     0

errors: No known data errors

Everything happy again! I love it when things just work how they’re supposed to.

Solaris CIFS Server and ZFS ACLs: The Problem

I’m going to be switching my home file server over to Solaris soon (or, more specifically, Solaris Express Community Edition [SXCE] build 99), and one of the primary goals of this server is to serve up a few directories to Windows or other SMB clients. One of the reasons I’m switching from Linux to Solaris is because I’m significantly increasing the disk space and I want to use ZFS for my storage pool. At the same time, I’m hoping to take advantage of SXCE’s built-in CIFS server to serve SMB shares.

To prepare for the big switch, I installed the latest build on a test machine and am playing around with setting up a configuration similar to what I’ll want. Unfortunately, it quickly became clear that I was going to hit problems with the new NFSv4 ACLs implemented in ZFS and with how the CIFS server manipulates those ACLs on behalf of Windows clients.

So, in this post, I’ll walk through what I want to have happen, and what’s actually happening instead.

I have several users, all of whom belong to a group named data. There is a directory, /export/sandbox, that holds group project resources for everyone in the data group. Any user in the group should be able to create files and directories in sandbox, and the other members of the data group should be able to modify those files and directories. All users on the system should have read access to the sandbox tree.

This is very easy with traditional Unix permissions. You set the sandbox directory to mode 775, set the group to data, and set the setgid bit. For example, if I’m going to make sandbox a ZFS dataset, I can do:

# zfs create -o casesensitivity=mixed rpool/export/sandbox
# chown root:data /export/sandbox
# chmod 775 /export/sandbox
# chmod g+s /export/sandbox

(The casesensitivity option is there to make the dataset play well with Windows as a file share)

Finally, in each user’s profile I can set umask 002 and everything works as desired. Let’s log in as mwilson and do some tests:

$ umask
002
$ cd /export/sandbox
$ ls -l
total 0
$ touch test-file
$ mkdir test-dir
$ ls -l
total 2
drwxrwsr-x   2 mwilson  data           2 Oct  7 17:35 test-dir/
-rw-rw-r--   1 mwilson  data           0 Oct  7 17:35 test-file

Excellent! This is exactly what we wanted: the file’s group is set to data and it remains writable by the group, as does the directory. Both are read-only for the world. The setgid bit on the new directory is set, so this method will work as users continue making subdirectories deeper in the tree.

But now, we’re going to add a new requirement: some users will access the sandbox from Windows using SMB. My server is already set up to run the Solaris CIFS server, so it’s easy to share this folder:

# zfs set sharesmb=name=sandbox rpool/export/sandbox

Now from a Windows client, I can go to \\server\sandbox and sure enough I see the directory and its contents. I’m authenticated as a user that maps to the mwilson Unix user. Now I’ll create a text file from Windows, then look at the directory listing back in Unix:

$ ls -l
total 5
drwxrwsr-x   2 mwilson  data           2 Oct  7 17:35 test-dir/
-rw-rw-r--   1 mwilson  data           0 Oct  7 17:35 test-file
----------+  1 mwilson  data           0 Oct  7 17:47 windows-file.txt

Whoa! Look at the file we created, windows-file.txt. That’s different… the ZFS ACLs are beginning to rear their ugly heads. The + next to the Unix permissions indicates that this file contains extended ACLs. Let’s look at the ACL on this file:

$ ls -v windows-file.txt
----------+  1 mwilson  data           0 Oct  7 17:47 windows-file.txt
     0:user:mwilson:read_data/write_data/append_data/read_xattr/write_xattr
         /execute/delete_child/read_attributes/write_attributes/delete
         /read_acl/write_acl/write_owner/synchronize:allow
     1:group:2147483648:read_data/write_data/append_data/read_xattr
         /write_xattr/execute/delete_child/read_attributes/write_attributes
         /delete/read_acl/write_acl/write_owner/synchronize:allow

Okay. Deep breath. This file has ACL entries that say the user named mwilson is allowed to do, well, just about everything you could ever want to do to the file. The group with ID 2147483648 also has full permissions. Why the weird group number? 2147483648 is 2^31, the start of the range Solaris uses for ephemeral IDs when mapping Windows users and groups to Unix IDs; honestly, I don’t know exactly which Windows group it represents. Since I’m mapped to the mwilson user, I wish it would just apply the Unix user’s group as the effective group if nothing else.

In any case, there seems to be a problem here: the data group no longer has any access to this file! Nor, it seems, does the world have read access.

Let’s log in as another user, jsmith, who is in the data group and look at the sandbox directory.

$ cd /export/sandbox
$ ls -l
./windows-file.txt: Permission denied
total 4
drwxrwsr-x   2 mwilson  data           2 Oct  7 17:35 test-dir
-rw-rw-r--   1 mwilson  data           0 Oct  7 17:35 test-file

Wow… ls gives us an error just trying to list the directory! That’s pretty bad…

Just for kicks, we’ll make a directory from Windows and see what that looks like, as mwilson:

$ ls
total 6
drwxrwsr-x   2 mwilson  data           2 Oct  7 17:35 test-dir/
-rw-rw-r--   1 mwilson  data           0 Oct  7 17:35 test-file
d-----S---+  2 mwilson  data           2 Oct  7 20:45 windows-dir/
----------+  1 mwilson  data           0 Oct  7 17:47 windows-file.txt

Okay, there’s the directory (windows-dir), but it’s again different from what we’re used to. It looks like the Windows file, but has a capital S in the group mode. That indicates that the setgid bit is set, but the execute bit is not set for the group. Let’s check the ACL that’s in place:

$ ls -dv windows-dir
d-----S---+  2 mwilson  data           2 Oct  7 20:45 windows-dir/
     0:user:mwilson:list_directory/read_data/add_file/write_data
         /add_subdirectory/append_data/read_xattr/write_xattr/execute
         /delete_child/read_attributes/write_attributes/delete/read_acl
         /write_acl/write_owner/synchronize:allow
     1:group:2147483648:list_directory/read_data/add_file/write_data
         /add_subdirectory/append_data/read_xattr/write_xattr/execute
         /delete_child/read_attributes/write_attributes/delete/read_acl
         /write_acl/write_owner/synchronize:allow

This is just like the file, but with the directory versions of the permission names instead of the file versions of the names.

Where does that leave us? Creating files and directories from the Unix command line gives me the behavior I want, but creating files from Windows through the SMB share leads to dreadful results.

The problem almost certainly lies with ACL inheritance: I suspect I will need to define the ACLs that I want on the sandbox directory with the appropriate inheritance flags set; then files created in an “ACL-aware” fashion, such as by the CIFS server, should end up with the permissions I want. We shall see…hopefully soon I’ll have a follow-up post walking through the solution.
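For the record, the kind of thing I expect to need looks like the following untested sketch, in Solaris chmod ACL syntax. The compact permission strings grant full control to the owner and group and read/execute to everyone, and the fd flags request file_inherit/dir_inherit so that new files and directories (including those the CIFS server creates) pick the entries up:

```shell
# Untested sketch: replace the ACL on the sandbox directory with
# inheritable entries. "fd" = file_inherit/dir_inherit, so files and
# subdirectories created underneath should receive these entries.
chmod A=owner@:rwxpdDaARWcCos:fd:allow,\
group@:rwxpdDaARWcCos:fd:allow,\
everyone@:rxaRcs:fd:allow /export/sandbox
```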

Update: as if it isn’t obvious by now, I never did follow up and figure out the right ACLs. I ended up using Solaris 10 Update 6, instead of OpenSolaris, to build the new file server. Solaris 10 doesn’t have the integrated CIFS server yet, so I’m going the traditional Samba route, which doesn’t have this problem.