I’m in Japan!

I’m on vacation in Japan. If you want to follow along, you can check out my trip notes.

Three Required Programming Books

We were recently trying to hire some software engineers at work. Our usual approach with candidates involved a team interview session where the current developers all asked questions. A question that one of the developers on my team always asked was along the lines of, “What are three books that you think are important for all developers?” That’s not exactly how he asks it, but in my mind, I translate this into, “What three books would you expect any professional developer to be familiar with?”

It’s an interesting question and you can learn a few things about the candidate from the answer. I’ve thought about it, and I know what my answer is. I suspect it’s not the answer that would necessarily win me the most points if I were a candidate interviewing with that team. My answer certainly reveals two strong biases I have: I believe all professional programmers should be near-expert level C programmers (at least in terms of the language itself, not necessarily from a practical perspective of being able to successfully develop or manage a huge C project). I also believe that all professional developers should be familiar with The Unix Way. Because it is mostly The Right Way. Whatever the market says (clearly I’m about to disagree with the market…), it’s hard for me to consider Windows a serious enterprise application development and hosting platform, and it deserves little more than to be considered a passing fad.

Right. Back to the three books:

The C Programming Language, by Brian W. Kernighan and Dennis M. Ritchie. Given my first bias, this is an obvious choice. There has never been, nor will there likely ever be, as definitive or widely recognized a volume on any language. Read it. Know it. Love it. Everything that programmers do has C underneath it. While hiring is way too complex and there are too many other factors involved to boil things down to a simple litmus test, if life were that simple, I’d go so far as to say I wouldn’t hire any developer who hasn’t read K&R cover-to-cover at least once.

Advanced Programming In The Unix Environment, by W. Richard Stevens. Here you see my other bias playing out. I don’t really care what platform your current job is for. If you’re not able to dabble in Unix system programming (in C, of course), I’m not convinced that you have the same fundamental developer chops as people who can. This isn’t necessarily a read cover-to-cover type book after you get far enough in to understand the general Unix Way, but if you haven’t actually implemented C code that does string manipulation, file I/O, network sockets, memory management, threads, safe concurrency with critical sections, etc., then using higher level languages and frameworks is a crutch and you are more likely to make bad decisions. If you know how to do it in C, though, you can use those higher level languages for practicality and productivity, but know what their underlying implementation likely looks like and make correct decisions accordingly. If you’re a Windows Guy… sorry, Unix system programming is just so much more kick-ass than Windows system programming. (By the way, University CS programs that don’t make their Operating Systems students write a Unix shell in C that has program execution, pipe support, stdin and stdout redirection support, and a few other features are really doing their students a disservice.)

The Art of Computer Programming, by Donald Knuth. This is on my list because it simply has to be. This is the definitive monograph on the subject. I’m the first to admit that I haven’t actually read it all. The first few parts of Volume 1 deserve an honest read-through. The rest is great to skim, pick out topics you’re interested in to read more in depth…but really just the overall exposure is what you’re after: getting used to thinking about algorithms the way Knuth talks about them. These volumes are perhaps the least practical books in my entire technical library, but at the same time if you put some effort into reading them–or parts of them–you will come away smarter than you used to be. The information density in these books is impressive. And it turns out they do have a practical side, too… I have, on numerous occasions, wanted to do something and just picked an algorithm straight out of one of the books to implement.

It’s great if candidates–or any developer–have also read practical books related to the languages, frameworks, and trends that a company they are working for uses. In fact, reading the three books I listed does very little to prepare a developer to work in a real-world software company. However, without understanding these three books (level of understanding required is in the order that I listed them… fully understand The C Programming Language, have a good handle on Advanced Programming In The Unix Environment, and get what you can via osmosis with The Art of Computer Programming), I honestly believe developers are at a disadvantage, and it will show in their software.

Zero character password?

I was signing up for my city’s new online system for paying my water bill. Is it just me, or would a zero-character password in this situation be a bad thing?

[Image: web form with a password field that says it accepts zero characters for the password]

Automating Oracle Database Creation

Why?

I went through a period when, for one reason or another, I found myself creating lots of new Oracle databases on various systems. These databases were primarily on remote Solaris systems (because, as always, I don’t believe in running Oracle on Windows!).

The “obvious” way to create databases is with the Database Configuration Assistant (DBCA). However, I was unsatisfied with this approach for several reasons:

First, DBCA is a GUI tool and I only connect to the database server with SSH. To use DBCA, I ran a local X server and used X11 forwarding over SSH. Technically effective, but X over anything other than a fast local network is barely usable.

Second, I wanted to provision databases that were as “lean and mean” as possible. The databases were usually for development or quick testing of different applications, and most applications didn’t depend on too much Oracle-specific functionality or advanced Oracle features. The databases that come out of DBCA always seemed a bit bloated to me. Furthermore, for applications that do use specific Oracle features (such as the embedded Java runtime, Streams, CDC, etc.), I want to know specifically what needs to be added to the base database to enable the functionality rather than just relying on an install-everything approach.

Finally, I believe anything you need to do server-side to deploy applications should be automated (or at least support the ability to automate the tasks). Creating the databases using the same automated script across my environments is much lower risk than remembering to click all the same settings in a GUI tool as I move through environments. Another aspect of this is that databases I created using DBCA on different systems tended to vary in where various directories were created, depending on how Oracle was installed. Over time I’ve come to like a particular scheme for organizing multiple databases on a single server, so by scripting the process I can go to any server that I’ve created databases on and know exactly where to find everything.

With all of that in mind, I went in search of the deep dark secrets of creating Oracle databases by hand through SQL*Plus instead of with DBCA. This really boils down to three steps:

  1. Prepare to create the database
  2. Create the database
  3. Run post-creation scripts

Preparing to create the database really just involves making the directory structure you want and preparing the Oracle parameters file for the database you are going to create.

Then, creating the database is the big SQL statement to actually (duh!) create the database.

And finally, you need to run the SQL scripts to create the initial schema objects. This is also the first good opportunity to migrate the pfile to an spfile.

How?

The approach I took is to write a shell script that creates the directory structure and outputs the SQL and shell scripts to create the individual database (in the database’s admin directory so that the creation scripts used for a particular database are tucked away in that particular database’s directory structure for future reference).

The “creation script creator script” has some parameters you can change to indicate where Oracle is installed, and then of course the rest of the script builds paths based on how I normally set things up and like to see it organized. Very briefly: the Oracle software is installed under /u01, all of my data files go under /u02/oradata/database, and recovery files go under /u02/orarecovery/database. I throw two control files under /u02 and stash one under /u01 on the theory that /u01 and /u02 should be different LUNs. Any other administrative stuff goes under /u01/app/oracle/admin/database.
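
In other words, the layout the script assumes looks roughly like this (database stands for the SID you pass to the script):

/u01/app/oracle                       Oracle software (ORACLE_BASE)
/u01/app/oracle/admin/database        admin files, dump directories, creation scripts
/u01/app/oracle/oradata/database      third control file
/u02/oradata/database                 data files, redo logs, two control files
/u02/orarecovery                      flash recovery area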

The SID of the database you want to create is the only command-line parameter to the script. If you want anything else to be different, you need to edit the script ahead of time. If you don’t change the template for database creation and parameter file creation in the script, you’ll end up with a character set of AL32UTF8 and an SGA target of 384MB (a total footprint of roughly 512MB of RAM on the system).

So without further ado, here’s the script I use:

#!/bin/sh

DB_SID=$1
DB_DOMAIN=mattwilson.org

ORACLE_BASE=/u01/app/oracle
ORACLE_HOME=${ORACLE_BASE}/product/10.2.0/db_1
ORACLE_ADMIN=${ORACLE_BASE}/admin/${DB_SID}

DATA_PRIMARY=/u02/oradata/${DB_SID}
DATA_SECONDARY=/u01/app/oracle/oradata/${DB_SID}
DATA_RECOVERY=/u02/orarecovery

# Create admin directories
mkdir -p ${ORACLE_ADMIN}
for x in adump bdump cdump udump scripts
do
        mkdir ${ORACLE_ADMIN}/${x}
done

# Create data directories
mkdir -p $DATA_PRIMARY
mkdir -p $DATA_SECONDARY
mkdir -p $DATA_RECOVERY

# Create init.ora file for instance
cat - > ${ORACLE_ADMIN}/scripts/init.ora << __EOF__
db_name = $DB_SID
db_domain = $DB_DOMAIN

db_block_size = 8192
undo_management = auto
undo_tablespace = undotbs1

control_files = (${DATA_PRIMARY}/${DB_SID}_ctrl_01.ctl,
                 ${DATA_PRIMARY}/${DB_SID}_ctrl_02.ctl,
                 ${DATA_SECONDARY}/${DB_SID}_ctrl_03.ctl)

background_dump_dest = ${ORACLE_ADMIN}/bdump
core_dump_dest = ${ORACLE_ADMIN}/cdump
user_dump_dest = ${ORACLE_ADMIN}/udump
audit_file_dest = ${ORACLE_ADMIN}/adump

db_recovery_file_dest = $DATA_RECOVERY
db_recovery_file_dest_size = 2147483648

sga_target = 402653184
__EOF__

# Create database creation script
cat - > ${ORACLE_ADMIN}/scripts/create.sql << __EOF__
connect / as sysdba
set echo on
spool ${ORACLE_ADMIN}/scripts/create.log

startup nomount pfile=${ORACLE_ADMIN}/scripts/init.ora;

CREATE DATABASE "${DB_SID}"
MAXINSTANCES 1
MAXLOGHISTORY 1
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 100
CHARACTER SET AL32UTF8
NATIONAL CHARACTER SET UTF8
DATAFILE '${DATA_PRIMARY}/system01.dbf'
        SIZE 128M
        AUTOEXTEND ON
        NEXT 128M MAXSIZE UNLIMITED
        EXTENT MANAGEMENT LOCAL
SYSAUX DATAFILE '${DATA_PRIMARY}/sysaux01.dbf'
        SIZE 128M
        AUTOEXTEND ON
        NEXT 128M MAXSIZE UNLIMITED
UNDO TABLESPACE "UNDOTBS1" DATAFILE '${DATA_PRIMARY}/undotbs01.dbf'
        SIZE 128M
        AUTOEXTEND ON
        NEXT 16M MAXSIZE UNLIMITED
DEFAULT TEMPORARY TABLESPACE TEMP
        TEMPFILE '${DATA_PRIMARY}/temp01.dbf'
        SIZE 32M
        AUTOEXTEND ON
        NEXT 8M MAXSIZE UNLIMITED
DEFAULT TABLESPACE USERS DATAFILE '${DATA_PRIMARY}/users01.dbf'
        SIZE 64M
        AUTOEXTEND ON
        NEXT 64M MAXSIZE UNLIMITED
LOGFILE GROUP 1 ('${DATA_PRIMARY}/redo01.log') SIZE 64M,
        GROUP 2 ('${DATA_PRIMARY}/redo02.log') SIZE 64M,
        GROUP 3 ('${DATA_PRIMARY}/redo03.log') SIZE 64M;

@?/rdbms/admin/catalog.sql
@?/rdbms/admin/catproc.sql

connect system/manager
@?/sqlplus/admin/pupbld

connect / as sysdba
shutdown immediate;
connect / as sysdba
startup mount pfile=${ORACLE_ADMIN}/scripts/init.ora;
alter database archivelog;
alter database open;
create spfile='${ORACLE_HOME}/dbs/spfile${DB_SID}.ora'
        from pfile='${ORACLE_ADMIN}/scripts/init.ora';
shutdown immediate;
startup;

execute utl_recomp.recomp_serial();

exit;
__EOF__

# Create run script
cat - > ${ORACLE_ADMIN}/scripts/create.sh << __EOF__
#!/bin/sh
ORACLE_HOME=$ORACLE_HOME
ORACLE_SID=$DB_SID
export ORACLE_HOME ORACLE_SID
\$ORACLE_HOME/bin/sqlplus /nolog @create
__EOF__

chmod +x ${ORACLE_ADMIN}/scripts/create.sh

# All done!
echo -------------------------------------------------------------
echo Ready to run create database script.
echo Go to ${ORACLE_ADMIN}/scripts
echo Then run create.sh in that directory.
echo -------------------------------------------------------------

Just save that as something like create-setup-script.sh, make it executable, and you’re all set!
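
For example, to create a database with the SID DEVDB (an arbitrary name, just for illustration), the whole process looks something like this, run as the Oracle software owner so that the connect / as sysdba in the generated script works:

chmod +x create-setup-script.sh
./create-setup-script.sh DEVDB
cd /u01/app/oracle/admin/DEVDB/scripts
./create.sh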

IT Mergers and Acquisitions

My business unit was recently acquired by another company, along with all of us employees. So now I work for a new company that is going to keep us all in Portland to establish a west-coast office (they’re back east).

The upshot of all of this is that the IT guys at my old company are all staying with the old company, so the west coast office of my new company has a long list of IT infrastructure and support needs, but doesn’t have any local IT staff. Thus, I now have two full-time jobs: keeping up with my normal customer consulting duties as well as making the IT transition happen.

Truth be told, I am having fun. It’s been several years since I last unboxed and configured new Cisco gear, set up new file and print and directory servers, got into wiring closets to patch drops, etc. This particular case is interesting because it’s not just an outright merger of two companies–there are really three parties involved since it’s just the sale of a business unit: Company A, the group of people and resources that are moving from Company A to Company B, and Company B. At the moment the group in the middle is very dependent on IT resources at Company A, some that are clearly only useful to the business unit and can move as-is, and some that are shared so can’t just move over with the people. And we’re also trying to involve people in Company B’s operations as soon as possible, so we’re all accessing resources in both networks as the disentangling and migration is happening.

At heart I’m really a systems guy–server operating systems, networks, the hardware it all runs on, etc. (but please, leave the client desktop hardware and software to someone else!)–so it’s nice to get back into it for a little while with real production systems instead of my little Solaris test lab at home. I’m lucky that when I was an intern at Intel we had a lab with a few Cisco Catalyst 6500-series switches and Cisco 7000-series routers, and before they actually needed to be deployed the network guy let me and some other interns loose to play with them and learn all about how Cisco gear works. If not for that, I probably would be totally lost getting our new layer 3 core switch up and running, but surprisingly all of my IOS knowledge tucked away in long term memory has bubbled back up to the top of the stack pretty quickly. Coupled with Eric’s excellent checklist for new Cisco switch and router setup, I have our new network ready to start migrating services and eventually cut off all connectivity to the old company.

Apple, why do you do everything you can to keep me from buying another Mac?

I’ve had a 20″ Core Duo iMac for a while now. It’s been a good machine, Mac OS X is decent to work with for what I do at home, the display looks nice… really no major complaints except: it only supports 2GB of RAM.

Yeah. This is a machine purchased in June 2006, and it’s only expandable to 2GB. All of my PCs that are still around, which were purchased before the iMac, can hold at least 4GB of RAM. And they were all less expensive than the iMac.

Anyway, this iMac could easily be just fine for me for another couple years if only it could hold more RAM. 2GB just isn’t enough, and there’s something about the way Mac OS handles memory management that is horribly bad (this is something I noticed both with this iMac and my original Mac Mini before it). Both of my machines at work, one running Windows XP and one running Linux, also only have 2GB of RAM but seem to be able to handle much more of a workload before performance starts degrading.

In any case, 2GB of RAM just isn’t enough for a machine running Mac OS X. So I’m thinking, you know, I shouldn’t have to send loads of cash to Apple for a whole new machine just because they chose ridiculously memory-limited motherboards, but the performance of this machine is just killing me sometimes. My options with this machine are limited, so…

I head over to Apple’s site to look at the specs on the latest iMacs. And guess what: they’re already setting me up to have to buy another one in a couple more years. The whole iMac lineup is limited to 4GB of RAM! This from a company that so loudly boasts about the 64-bitness of their operating systems. It’s like they don’t expect anyone to actually run apps on their computers. I don’t even have high demands: I just want to be able to keep my web browser, iTunes, NetBeans, and a VMWare VM with only 384MB of RAM allocated to it running. If my 2GB machine can’t even handle that (it can’t), how long do I think a machine with only 4GB of RAM will last me? I hate that the only reason this computer won’t last me several more years is because Apple skimped out on how much RAM it can hold.

I can’t bring myself to buy a computer in 2009 that can only hold 4GB of RAM. So looking for other options, I think, “well, maybe I need to go to their ‘pro’ line of systems,” even though consumers deserve more than 4GB of RAM too without buying a new computer again in a couple years.

First up: MacBook Pro. Ignoring that it’s too expensive, I know a lot of people who use these as their main machines with a monitor and keyboard plugged in at the desk. Not my ideal scenario, but luckily I don’t need to worry about it: even on a supposedly “professional” machine, the 15″ MacBook Pro is limited to 4GB of RAM. So we can write that off as a very expensive short-term toy like the iMac.

That just leaves the Mac Pro. The starting price of this puppy is $2,800, and that doesn’t even get you a monitor like the iMacs or MacBooks. Sure, it holds as much RAM as you want to throw at it, but seriously, I’m not going to pay $2,800 for a computer.

For the price of an iMac (I’d probably go with the $1,800 one), you really should be able to expand to more than 4GB of RAM. To ask me to jump from $1,800 to $3,400 (remember, I need to buy a monitor with that Mac Pro) just to satisfy the requirement that I’m able to run more than a couple of applications at a time is ludicrous.

Just for comparison, I priced out a system that is much faster, can hold much more RAM than the iMac, comes with a 24″ display, etc. at Dell and the grand total is… $1,200.

So what do I do? I won’t run Windows at home, of course, but I have no objections to using Linux for my main home system, which I was doing before I got back into Macs. I could build a lightning fast box for much less than even the iMac. But I’d rather just stick with a Mac if only it could hold more RAM.

Listen up, Apple: I want to give you my money. Just not $3,400 of it! You’re making it so hard for me to be your customer. Can’t you at least try to keep up technical parity in your consumer line with the competition?

Using OpenDS for DB2 Authentication

As I mentioned in the previous article about installing DB2, the DB2 server uses operating system users for authentication. That means that if you want to give Bob Smith access to a database on the server, you need to create a Unix account for him. I like to keep application authentication separated from operating system authentication in most cases, so I didn’t like the way DB2 was working. Luckily, DB2 ships with LDAP authentication plugins to solve this problem.

With LDAP, I can keep all of my user authentication and group membership information in an LDAP directory. If you already have a directory set up, such as Microsoft Active Directory, Novell eDirectory, or an OpenLDAP directory that is in use for authentication, then you can just point at that.

In this case, though, I’m going to create a directory specifically for my DB2 instance. I’ll use OpenDS, an open source Java LDAP server.

After downloading OpenDS, I’ll unzip it in /opt, resulting in my installation being in /opt/OpenDS-1.0.0:

root@lab01v04# cd /opt
root@lab01v04# unzip /root/OpenDS-1.0.0.zip
Archive:  /root/OpenDS-1.0.0.zip
   creating: OpenDS-1.0.0/
   creating: OpenDS-1.0.0/QuickSetup.app/
   creating: OpenDS-1.0.0/QuickSetup.app/Contents/
   creating: OpenDS-1.0.0/QuickSetup.app/Contents/MacOS/
   creating: OpenDS-1.0.0/QuickSetup.app/Contents/Resources/
   creating: OpenDS-1.0.0/QuickSetup.app/Contents/Resources/Java/
   creating: OpenDS-1.0.0/Uninstall.app/
 [...]
  inflating: OpenDS-1.0.0/setup
  inflating: OpenDS-1.0.0/uninstall
  inflating: OpenDS-1.0.0/upgrade
root@lab01v04#

I’m going to create a new user, opends, under which to run the directory server, then change ownership of the installation directory to the new user:

root@lab01v04# cd /opt/OpenDS-1.0.0/
root@lab01v04# groupadd opends
root@lab01v04# useradd -g opends -d /export/home/opends -m \ 
> -s /usr/bin/ksh93 opends
64 blocks
root@lab01v04# passwd opends
New Password: ...password...
Re-enter new Password: ...password...
passwd: password successfully changed for opends
root@lab01v04# chown -R opends:opends /opt/OpenDS-1.0.0/

Now I can perform the rest of the steps as the new user. After logging in as the opends user, I change to the OpenDS directory and start the setup program. This will allow me to set up the basics of the directory service.

I’ll give you the full conversation below. In essence, I’m accepting most of the defaults. I’ll be running on port 1389, so I can start the server as a non-root user. The base DN for my directory will be dc=lab,dc=mattwilson,dc=org (“dc” is short for “domain component,” so this is equivalent to a DNS name of lab.mattwilson.org).

opends@lab01v04$ cd /opt/OpenDS-1.0.0/
opends@lab01v04$ ./setup --cli

OpenDS Directory Server 1.0.0
Please wait while the setup program initializes...

What would you like to use as the initial root user DN for the Directory
Server? [cn=Directory Manager]:
Please provide the password to use for the initial root user:
Please re-enter the password for confirmation:

On which port would you like the Directory Server to accept connections from
LDAP clients? [1389]:

What do you wish to use as the base DN for the directory data?
[dc=example,dc=com]: dc=lab,dc=mattwilson,dc=org
Options for populating the database:

    1)  Only create the base entry
    2)  Leave the database empty
    3)  Import data from an LDIF file
    4)  Load automatically-generated sample data

Enter choice [1]: 1

Do you want to enable SSL? (yes / no) [no]: no

Do you want to enable Start TLS? (yes / no) [no]: no

Do you want to start the server when the configuration is completed? (yes /
no) [yes]: yes


Setup Summary
=============
LDAP Listener Port: 1389
LDAP Secure Access: disabled
Root User DN:       cn=Directory Manager
Directory Data:     Create New Base DN dc=lab,dc=mattwilson,dc=org.
Base DN Data: Only Create Base Entry (dc=lab,dc=mattwilson,dc=org)


Start Server when the configuration is completed


What would you like to do?

    1)  Setup the server with the parameters above
    2)  Provide the setup parameters again
    3)  Cancel the setup

Enter choice [1]: 1

Configuring Directory Server ..... Done.
Creating Base Entry dc=lab,dc=mattwilson,dc=org ..... Done.
Starting Directory Server ........ Done.

See /var/tmp/opends-setup-23950.log for a detailed log of this operation.

To see basic server configuration status and configuration you can launch
/opt/OpenDS-1.0.0/bin/status
opends@lab01v04$

And with that, we have a directory server running! I’m going to update my path to make it easier to use the various LDAP utilities:

opends@lab01v04$ PATH=/opt/OpenDS-1.0.0/bin:$PATH

Now that the directory server is running, we need to create entries in it to support authentication. At the highest level, we’re going to create two “organizational units,” one for users and one for groups. To create LDAP entries, we use LDIF files. The LDIF file with the “ou” definitions, which we’ll call container-setup.ldif, contains the following:

# users
dn: ou=users,dc=lab,dc=mattwilson,dc=org
objectClass: organizationalUnit
ou: users

# groups
dn: ou=groups,dc=lab,dc=mattwilson,dc=org
objectClass: organizationalUnit
ou: groups

To actually import these records into the directory, we’ll use the ldapmodify command. I’m connecting as the directory manager to the LDAP server running on port 1389, and adding the records defined in the container-setup.ldif file:

opends@lab01v04$ ldapmodify -a -D "cn=Directory Manager" -p 1389 \
> -c -f container-setup.ldif
Password for user 'cn=Directory Manager':
Processing ADD request for ou=users,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN ou=users,dc=lab,dc=mattwilson,dc=org
Processing ADD request for ou=groups,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN ou=groups,dc=lab,dc=mattwilson,dc=org

Next we need to create the users. I need to create users to represent the two operating system users that DB2 is already dependent on: db2inst1 and db2fenc1. Additionally, I will create the bsmith user. When we are all done, we should be able to connect as bsmith even though there is no equivalent Solaris user. DB2 should allow the login based on the LDAP entry.

The following user definitions are in user-setup.ldif:

# db2inst1 user -- required to match instance owner
dn: uid=db2inst1,ou=users,dc=lab,dc=mattwilson,dc=org
objectClass: inetOrgPerson
uid: db2inst1
cn: DB2 Instance 1 Owner
sn: DB2 Instance 1 Owner

# db2fenc1 user -- required to match instance fenced user
dn: uid=db2fenc1,ou=users,dc=lab,dc=mattwilson,dc=org
objectClass: top
objectClass: inetOrgPerson
uid: db2fenc1
cn: DB2 Fenced User 1
sn: DB2 Fenced User 1

# "Bob Smith" user
dn: uid=bsmith,ou=users,dc=lab,dc=mattwilson,dc=org
objectClass: inetOrgPerson
uid: bsmith
cn: Bob Smith
sn: Smith
givenName: Bob

Now we’ll use the ldapmodify tool again to create these entries:

opends@lab01v04$ ldapmodify -a -D "cn=Directory Manager" -p 1389 \
> -c -f user-setup.ldif
Password for user 'cn=Directory Manager':
Processing ADD request for uid=db2inst1,ou=users,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN uid=db2inst1,ou=users,dc=lab,dc=mattwilson,dc=or
g
Processing ADD request for uid=db2fenc1,ou=users,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN uid=db2fenc1,ou=users,dc=lab,dc=mattwilson,dc=or
g
Processing ADD request for uid=bsmith,ou=users,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN uid=bsmith,ou=users,dc=lab,dc=mattwilson,dc=org

And now we’ll define the groups. Again, we need to create the current operating system groups that DB2 is using, db2iadm1 and db2fadm1. We also need to create the other security groups that DB2 uses by default, SYSADM, SYSMAINT, SYSCTRL, and SYSMON. Note also how we’re adding members to these groups.

The contents of the group-setup.ldif file are:

# db2iadm1 group
dn: cn=db2iadm1,ou=groups,dc=lab,dc=mattwilson,dc=org
objectClass: top
objectClass: groupOfEntries
cn: db2iadm1
member: uid=db2inst1,ou=users,dc=lab,dc=mattwilson,dc=org

# db2fadm1 group
dn: cn=db2fadm1,ou=groups,dc=lab,dc=mattwilson,dc=org
objectClass: top
objectClass: groupOfEntries
cn: db2fadm1
member: uid=db2fenc1,ou=users,dc=lab,dc=mattwilson,dc=org

# SYSADM group
dn: cn=SYSADM,ou=groups,dc=lab,dc=mattwilson,dc=org
objectClass: groupOfEntries
cn: SYSADM
ou: Groups
member: uid=db2inst1,ou=users,dc=lab,dc=mattwilson,dc=org

# SYSMAINT group
dn: cn=SYSMAINT,ou=groups,dc=lab,dc=mattwilson,dc=org
objectClass: groupOfEntries
cn: SYSMAINT
ou: Groups
member: uid=db2inst1,ou=users,dc=lab,dc=mattwilson,dc=org

# SYSCTRL group
dn: cn=SYSCTRL,ou=groups,dc=lab,dc=mattwilson,dc=org
objectClass: groupOfEntries
cn: SYSCTRL
ou: Groups
member: uid=db2inst1,ou=users,dc=lab,dc=mattwilson,dc=org

# SYSMON group
dn: cn=SYSMON,ou=groups,dc=lab,dc=mattwilson,dc=org
objectClass: groupOfEntries
cn: SYSMON
ou: Groups
member: uid=db2inst1,ou=users,dc=lab,dc=mattwilson,dc=org

And finally, create these records with ldapmodify:

opends@lab01v04$ ldapmodify -a -D "cn=Directory Manager" -p 1389 \
> -c -f group-setup.ldif
Password for user 'cn=Directory Manager':
Processing ADD request for cn=db2iadm1,ou=groups,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN cn=db2iadm1,ou=groups,dc=lab,dc=mattwilson,dc=or
g
Processing ADD request for cn=db2fadm1,ou=groups,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN cn=db2fadm1,ou=groups,dc=lab,dc=mattwilson,dc=or
g
Processing ADD request for cn=SYSADM,ou=groups,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN cn=SYSADM,ou=groups,dc=lab,dc=mattwilson,dc=org
Processing ADD request for cn=SYSMAINT,ou=groups,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN cn=SYSMAINT,ou=groups,dc=lab,dc=mattwilson,dc=or
g
Processing ADD request for cn=SYSCTRL,ou=groups,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN cn=SYSCTRL,ou=groups,dc=lab,dc=mattwilson,dc=org
Processing ADD request for cn=SYSMON,ou=groups,dc=lab,dc=mattwilson,dc=org
ADD operation successful for DN cn=SYSMON,ou=groups,dc=lab,dc=mattwilson,dc=org

The last thing we need to do is assign passwords to the users. OpenDS includes a utility, ldappasswordmodify, to do just that (“adminPassword” is the password I set during setup of OpenDS, and “userPassword” is what I want to set the user’s password to):

opends@lab01v04$ ldappasswordmodify -p 1389 -D "cn=Directory Manager" \
> --authzID "dn:uid=db2inst1,ou=users,dc=lab,dc=mattwilson,dc=org" \
> -w adminPassword -n userPassword
The LDAP password modify operation was successful

opends@lab01v04$ ldappasswordmodify -p 1389 -D "cn=Directory Manager" \
> --authzID "dn:uid=db2fenc1,ou=users,dc=lab,dc=mattwilson,dc=org" \
> -w adminPassword -n userPassword
The LDAP password modify operation was successful

opends@lab01v04$ ldappasswordmodify -p 1389 -D "cn=Directory Manager" \
> --authzID "dn:uid=bsmith,ou=users,dc=lab,dc=mattwilson,dc=org" \
> -w adminPassword -n userPassword
The LDAP password modify operation was successful

And with that, our LDAP directory is created and populated with users and groups, and the users have passwords so they should be able to log in.
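
Before moving on to DB2, it doesn’t hurt to sanity-check that the search base and filter actually find our users. For example, a quick optional check with the OpenDS ldapsearch tool (output omitted, but it should return the bsmith entry):

opends@lab01v04$ ldapsearch -p 1389 -D "cn=Directory Manager" -w adminPassword \
> -b "ou=users,dc=lab,dc=mattwilson,dc=org" "(uid=bsmith)"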

After setting up the directory, we need to configure DB2 to use the LDAP plugins. These require some configuration to tell them how to connect to the LDAP server, and how to find users and groups. The configuration is stored in the file sqllib/cfg/IBMLDAPSecurity.ini, relative to the instance root. In my case, that’s /export/home/db2inst1/sqllib/cfg/IBMLDAPSecurity.ini.

For the setup we created above, I’ve entered the following configuration in the file:

LDAP_HOST = localhost:1389
USER_OBJECTCLASS = inetOrgPerson
USERID_ATTRIBUTE = uid
AUTHID_ATTRIBUTE = uid
USER_BASEDN = ou=users,dc=lab,dc=mattwilson,dc=org
GROUP_OBJECTCLASS = groupOfEntries
GROUP_BASEDN = ou=groups,dc=lab,dc=mattwilson,dc=org
GROUPNAME_ATTRIBUTE = cn
GROUP_LOOKUP_METHOD = SEARCH_BY_DN
GROUP_LOOKUP_ATTRIBUTE = member

With the configuration in place, we’ll change the DB2 instance configuration to use the LDAP plugins:

db2inst1@lab01v04$ db2 update dbm cfg using srvcon_pw_plugin \
> IBMLDAPauthserver
DB20000I  The UPDATE DATABASE MANAGER CONFIGURATION command completed
successfully.

db2inst1@lab01v04$ db2 update dbm cfg using clnt_pw_plugin \
> IBMLDAPauthclient
DB20000I  The UPDATE DATABASE MANAGER CONFIGURATION command completed
successfully.

db2inst1@lab01v04$ db2 update dbm cfg using group_plugin IBMLDAPgroups
DB20000I  The UPDATE DATABASE MANAGER CONFIGURATION command completed
successfully.

For the changes to take effect, we need to restart the instance:

db2inst1@lab01v04$ db2 terminate
DB20000I  The TERMINATE command completed successfully.
db2inst1@lab01v04$ db2stop
SQL1064N  DB2STOP processing was successful.
db2inst1@lab01v04$ db2start
SQL1063N  DB2START processing was successful.

Now to test it: we’ll try to connect to the database we created in the last article, mydb, as the user bsmith. Since there’s no user on my Solaris system named bsmith, this wouldn’t have worked before the LDAP configuration. If we’re able to connect, it means DB2 is now using our LDAP directory for authentication:

db2inst1@lab01v04$ db2
(c) Copyright IBM Corporation 1993,2007
Command Line Processor for DB2 Client 9.5.1

You can issue database manager commands and SQL statements from the command
prompt. For example:
    db2 => connect to sample
    db2 => bind sample.bnd

For general help, type: ?.
For command help, type: ? command, where command can be
the first few keywords of a database manager command. For example:
 ? CATALOG DATABASE for help on the CATALOG DATABASE command
 ? CATALOG          for help on all of the CATALOG commands.

To exit db2 interactive mode, type QUIT at the command prompt. Outside
interactive mode, all commands must be prefixed with 'db2'.
To list the current command option settings, type LIST COMMAND OPTIONS.

For more detailed help, refer to the Online Reference Manual.

db2 => connect to mydb user bsmith
Enter current password for bsmith:

   Database Connection Information

 Database server        = DB2/SUNX8664 9.5.1
 SQL authorization ID   = BSMITH
 Local database alias   = MYDB

db2 =>

Success! You can see that we are logged in as bsmith. You can try other experiments to make sure this is really working—enter an incorrect password or an invalid username that isn’t defined in LDAP, for example, and the server will correctly reject the connection.

Installing DB2 on OpenSolaris

Why?

The other day, Ben Rockwell mentioned on his blog that the free edition of DB2 was available for 64-bit Solaris on x86 systems. I like learning about new server software, and databases in particular, so I figured I’d take a look at it.

The majority of my relational database administration experience has been with Oracle, Microsoft SQL Server, MySQL, and PostgreSQL. Of that mix, I was expecting DB2 to be much more like Oracle than the others—specifically, complex installation (if you don’t want to use GUI tools that produce bloated default databases with every option under the sun enabled) and annoying command line tools (it’s 2009 and sqlplus doesn’t have any sort of command completion or press-the-up-arrow-to-get-to-previous-command support). I was pleasantly surprised by how easy it was for a mere mortal to install and create a database entirely from the command line. The interactive tools, though, seem about as brain-dead as sqlplus, so my guess was partially right (perhaps a slight edge to DB2 for the help system in its interactive command processor, but ding it back down for requiring an explicit line continuation character…but I’m getting way ahead of myself, we haven’t even installed it yet!).

So enough philosophy, let’s get down to business. Please remember that there’s a “GUI setup wizard” that can do all of this in just a few clicks (for Solaris as well as the other supported platforms, Linux and Windows). But where’s the fun in that? I like to know exactly what’s going on in my systems, and I’ve found that doing things the manual way is a much better way to learn how to support a system in the long run. Also, I like to be able to script most server setup tasks for reliable repeatability. If you’re with me, here we go!

Installing DB2

I have downloaded DB2 9.5 Express-C for Solaris x64, and have the distribution extracted in /root/db2_9.5_expc, ready for installation.

First, I’ll install the software to the default location, /opt/IBM/db2/V9.5. The -p EXP option is to tell the installer what product to install — EXP for Express Edition in this case. Before running the installer, I create the /usr/local/bin directory because DB2 puts a command (db2ls) in there. It doesn’t hurt if that command doesn’t get installed, but the installer will tell you that there were minor errors. So, the installation of the software:

root@lab01v04# mkdir -p /usr/local/bin
root@lab01v04# cd /root/db2_9.5_expc
root@lab01v04# ./db2_install -b /opt/IBM/db2/V9.5 -p EXP -n
The execution completed successfully.

For more information see the DB2 installation log at 
"/tmp/db2_install.log.18091".
root@lab01v04#

Creating an Instance

Easy enough. Now we need to create an instance. A DB2 instance is what holds databases and everything in them. One instance can hold several databases (like Microsoft SQL Server or MySQL, but unlike Oracle). DB2 instances are owned by and tied to a local user account. In addition to the instance owner user, there is a “fenced user” that is used to provide a security context in which to run certain code. So we’ll be creating two users: db2inst1, the instance owner, and db2fenc1, the fenced user. Note that the actual instance data will live inside of the home directory of the instance owner. We’ll also create two groups, the instance admin group (db2iadm1) and the fenced admin group (db2fadm1). One physical server can run several instances of DB2, so the 1 on the end of the user and group names is just an easy way of identifying this particular instance we’re creating.

Create the groups, then the users, then set the users’ passwords using the regular Solaris tools:

root@lab01v04# groupadd db2iadm1
root@lab01v04# groupadd db2fadm1
root@lab01v04# useradd -g db2iadm1 -d /export/home/db2inst1 \
               -s /usr/bin/ksh93 -m db2inst1
64 blocks
root@lab01v04# useradd -g db2fadm1 -d /export/home/db2fenc1 \
               -s /usr/bin/ksh93 -m db2fenc1
64 blocks
root@lab01v04# passwd db2inst1
New Password: ...password...
Re-enter new Password: ...password...
passwd: password successfully changed for db2inst1
root@lab01v04# passwd db2fenc1
New Password: ...password...
Re-enter new Password: ...password...
passwd: password successfully changed for db2fenc1

We now have the users ready to go. Finally, our last task as root is to create the actual instance. We’ll do that with the db2icrt command, which takes an argument for the fenced user and the instance name/user:

root@lab01v04# /opt/IBM/db2/V9.5/instance/db2icrt -u db2fenc1 db2inst1
Sun Microsystems Inc.   SunOS 5.11      snv_104 November 2008
Sun Microsystems Inc.   SunOS 5.11      snv_104 November 2008
DBI1070I  Program db2icrt completed successfully.

Simple as that. The Sun banner that appears a couple of times is from the instance creation scripts logging in as the users to perform some setup.

Now that the instance is created, we can do the rest of the work as the db2inst1 user, so we’ll change logins. The instance creation tool added an entry to db2inst1’s .profile file to pull in the environment for all of the DB2 commands.

Our first task is to start the instance:

db2inst1@lab01v04$ db2start
SQL1063N  DB2START processing was successful.

Creating the First Database

Since this is a new instance, there isn’t actually anything in it yet (specifically, databases). Now we can create our first database, which we’ll call mydb.

db2inst1@lab01v04$ db2 create database mydb
DB20000I  The CREATE DATABASE command completed successfully.

The db2 command is the DB2 Command Line Processor (CLP). The CLP is the primary interface for issuing commands to the server. You can either pass a command to db2 on the shell command line, which executes the command and exits, or you can use the CLP interactively by running db2 with no arguments. To do a couple quick tests on our database, we’ll use the CLP interactively:

db2inst1@lab01v04$ db2
(c) Copyright IBM Corporation 1993,2007
Command Line Processor for DB2 Client 9.5.1

You can issue database manager commands and SQL statements from the command 
prompt. For example:
    db2 => connect to sample
    db2 => bind sample.bnd

For general help, type: ?.
For command help, type: ? command, where command can be
the first few keywords of a database manager command. For example:
 ? CATALOG DATABASE for help on the CATALOG DATABASE command
 ? CATALOG          for help on all of the CATALOG commands.

To exit db2 interactive mode, type QUIT at the command prompt. Outside 
interactive mode, all commands must be prefixed with 'db2'.
To list the current command option settings, type LIST COMMAND OPTIONS.

For more detailed help, refer to the Online Reference Manual.

db2 =>

When we launch the CLP, we get some basic usage information and then the prompt. We first need to connect to a database, so we’ll connect to the one we just created, mydb:

db2 => connect to mydb

   Database Connection Information

 Database server        = DB2/SUNX8664 9.5.1
 SQL authorization ID   = DB2INST1
 Local database alias   = MYDB

Looks good. We’re connected as the db2inst1 user to the mydb database. Now we can just issue regular SQL statements, so we’ll create a table and insert a couple rows. Note that in the DB2 CLP (like the Unix shell), you need to put a backslash on the end of a line if you want to continue the command. Also, do not put semicolons at the end of SQL commands; in interactive mode they are not allowed.

db2 => create table testtab ( \
db2 (cont.) => id integer not null primary key, \
db2 (cont.) => name varchar(50) not null )
DB20000I  The SQL command completed successfully.
db2 => insert into testtab values (1, 'First entry')
DB20000I  The SQL command completed successfully.
db2 => insert into testtab values (2, 'Second entry')
DB20000I  The SQL command completed successfully.

Not surprisingly, it’s working like a SQL database should. We have a table, testtab, which we’ve inserted two rows into.

Finally, we’ll disconnect from the database and quit the CLP:

db2 => connect reset
DB20000I  The SQL command completed successfully.
db2 => quit
DB20000I  The QUIT command completed successfully.
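
As an aside, the same statements could have been run one at a time from the shell without ever entering interactive mode; the CLP keeps the database connection alive between invocations. A hypothetical one-off session (output omitted) would look like:

db2inst1@lab01v04$ db2 connect to mydb
db2inst1@lab01v04$ db2 "select * from testtab"
db2inst1@lab01v04$ db2 connect reset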

Enabling Network Connectivity

That works great, but so far we’ve only accessed the instance locally. To allow connections from clients on other systems, we need to configure the instance to accept TCP/IP connections. This is an instance-level setting, so all of the databases you create in this instance will be available to remote clients.

The first step is to tell the instance what port to listen on. We’ll use 50,000. Note that svcename in the following command could also be the name of an entry in the /etc/inet/services file, which explains why the parameter is named svcename instead of something with the word “port.”

db2inst1@lab01v04$ db2 update dbm configuration using svcename 50000
DB20000I  The UPDATE DATABASE MANAGER CONFIGURATION command completed 
successfully.
SQL1362W  One or more of the parameters submitted for immediate modification 
were not changed dynamically. Client changes will not be effective until the 
next time the application is started or the TERMINATE command has been issued. 
Server changes will not be effective until the next DB2START command.
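
Incidentally, if you’d rather not hard-code a port number, svcename can point at a named entry in /etc/inet/services instead. A sketch of that approach (the service name db2c_db2inst1 is just an example):

root@lab01v04# echo "db2c_db2inst1  50000/tcp" >> /etc/inet/services

db2inst1@lab01v04$ db2 update dbm configuration using svcename db2c_db2inst1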

Next we need to enable TCP/IP as a communication protocol. This uses a new command, db2set:

db2inst1@lab01v04$ db2set DB2COMM=tcpip

Finally, restart the instance:

db2inst1@lab01v04$ db2stop
SQL1064N  DB2STOP processing was successful.
db2inst1@lab01v04$ db2start
SQL1063N  DB2START processing was successful.

How do we know if it worked? First, we’ll check to see if there’s something listening on port 50,000 on our system:

db2inst1@lab01v04$ netstat -an | grep 50000
      *.50000              *.*                0      0 49152      0 LISTEN

Looks good! netstat reports that a process is accepting connections on port 50000 on all interfaces. To really prove that we’re ready to start serving clients, though, we’ll test connectivity from another system.

Connecting From a Remote Client

We installed DB2 on lab01v04. Over on another machine, lab01v03, the DB2 client is installed. I’m logged in as mwilson, a user that the DB2 server knows nothing about, but I should be able to connect to DB2 as the db2inst1 user, which is the “superuser” or “root user” for the database.

The DB2 client software contains the same db2 command to launch the Command Line Processor. To connect to a remote server from a client, you first need to define the “node,” which represents the instance. Then you define the specific database within the instance. Once I have the database defined (cataloged in DB2 parlance), I can connect to it by name just like I did on the server, only I’ll add a username since I don’t have the benefit of being logged on locally to the server as an authorized user.

I’ll give you this one in one piece: starting the CLP, cataloging the node and database, then connecting and reading the data we inserted into the table earlier. All of this is happening from lab01v03, talking to the DB2 instance we created on lab01v04.

mwilson@lab01v03$ db2
(c) Copyright IBM Corporation 1993,2007
Command Line Processor for DB2 Client 9.5.1

You can issue database manager commands and SQL statements from the command 
prompt. For example:
    db2 => connect to sample
    db2 => bind sample.bnd

For general help, type: ?.
For command help, type: ? command, where command can be
the first few keywords of a database manager command. For example:
 ? CATALOG DATABASE for help on the CATALOG DATABASE command
 ? CATALOG          for help on all of the CATALOG commands.

To exit db2 interactive mode, type QUIT at the command prompt. Outside 
interactive mode, all commands must be prefixed with 'db2'.
To list the current command option settings, type LIST COMMAND OPTIONS.

For more detailed help, refer to the Online Reference Manual.

db2 => catalog tcpip node lab01v04 remote lab01v04 server 50000
DB20000I  The CATALOG TCPIP NODE command completed successfully.
DB21056W  Directory changes may not be effective until the directory cache is 
refreshed.
db2 => catalog database mydb at node lab01v04
DB20000I  The CATALOG DATABASE command completed successfully.
DB21056W  Directory changes may not be effective until the directory cache is 
refreshed.
db2 => connect to mydb user db2inst1
Enter current password for db2inst1: ...password...

   Database Connection Information

 Database server        = DB2/SUNX8664 9.5.1
 SQL authorization ID   = DB2INST1
 Local database alias   = MYDB

db2 => select * from testtab

ID          NAME                                              
----------- --------------------------------------------------
          1 First entry                                       
          2 Second entry                                      

  2 record(s) selected.

db2 => connect reset
DB20000I  The SQL command completed successfully.
db2 => quit
DB20000I  The QUIT command completed successfully.

Wrapping Up

It worked! With really just a handful of commands, entirely from the command line, we’ve a) installed DB2, b) created an instance, c) created a database, d) enabled network client connectivity, and e) connected to our database from a remote client. If you’re familiar with doing the exact same thing using Oracle (please, no GUI installer or automatic bloated database creation with the Database Configuration Assistant), you’ll appreciate just how much of a breeze this was with DB2, despite it being every bit as much an “enterprise” database as Oracle. Maybe I’ll put together an article going through all of my Oracle installation and instance creation scripts for comparison.

In any case, the only thing we didn’t do is create the DB2 Administration Server (DAS), which allows remote management with the DB2 GUI utilities, but I don’t plan on needing that for now. If I did, it’s literally just a matter of creating a user to own the DAS and running the command dascrt -u DASuser.
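
If you did want the DAS, a rough sketch of that (the dasusr1 user and dasadm1 group are just example names) would be:

root@lab01v04# groupadd dasadm1
root@lab01v04# useradd -g dasadm1 -d /export/home/dasusr1 \
               -s /usr/bin/ksh93 -m dasusr1
root@lab01v04# /opt/IBM/db2/V9.5/instance/dascrt -u dasusr1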

There is one thing that I’m not satisfied with at this point, though: the instance uses the local system user accounts for authentication. That means, for example, that if I want a user, mwilson, in my database, I need to create a Unix account for mwilson. Luckily, DB2 ships with an LDAP authentication plugin. This will allow me to store user information in an LDAP directory and create as many users as I want without making any changes to the operating system hosting DB2. We’ll get that up and running, using OpenDS, in the next installment.

Replacing a Bad Drive with ZFS

One of the drives in my home file server was making occasional nasty clicking noises, which always precedes death in hard drives. The drive is only a couple of months old, so it must have just been a bad apple. Anyway, some quick testing showed that it was failing its SMART self-tests, so it was quick and easy to get a warranty replacement from Seagate. Replacing the drive itself was painless and all of the data was safe, since my data is on a zpool consisting of four 1TB drives in a RAID-Z configuration (if you’re familiar with traditional RAID, think RAID-5). Of course the data is also backed up, because RAID is not a substitute for backups, but as expected ZFS “just worked” and the new drive took over for the old drive with no hassles.
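
For reference, a pool like this one would have been created with something along these lines (device names taken from the zpool status output later in this post):

pfexec zpool create tank raidz c0t2d0 c0t3d0 c0t4d0 c0t5d0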

If you want to relive the experience, here’s my session on the server after shutting down, replacing the failing drive, and starting back up. First, on the console, Solaris complained about something not being quite right as soon as the server booted:

SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Wed Jan 14 19:12:39 PST 2009
PLATFORM: PowerEdge 1800, CSN: BSQMN91, HOSTNAME: athena
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: c6647451-fa5a-4f4b-99fd-de1e76bb059d
DESC: The number of I/O errors associated with a ZFS device exceeded
      acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD for more
      information.
AUTO-RESPONSE: The device has been offlined and marked as faulted.  An attempt
      will be made to activate a hot spare if available. 
IMPACT: Fault tolerance of the pool may be compromised.
REC-ACTION: Run 'zpool status -x' and replace the bad device.

Yeah, it didn’t like booting with a totally different hard drive in place of a drive that was a member of a zpool. A quick check confirms that Solaris is, indeed, complaining about the drive I replaced:

mwilson athena:~ [1258]% zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver completed after 0h0m with 0 errors on Wed Jan 14 19:12:11 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          raidz1    DEGRADED     0     0     0
            c0t2d0  ONLINE       0     0     0
            c0t3d0  UNAVAIL      0     0     0  cannot open
            c0t4d0  ONLINE       0     0     0
            c0t5d0  ONLINE       0     0     0

errors: No known data errors

No surprises there; we’ll tell it to replace c0t3d0. Without any additional arguments, the zpool replace command will replace the old device with the new device that now sits at the same name (c0t3d0, that is).
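
(Had the replacement disk come up under a different device name, the new device would be named explicitly, along the lines of pfexec zpool replace tank c0t3d0 c0t9d0, where c0t9d0 is a made-up example.)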

mwilson athena:~ [1260]% pfexec zpool replace tank c0t3d0

That command executes immediately and returns me to the prompt. We can monitor the status while the pool is resilvering:

mwilson athena:~ [1262]% zpool status tank
  pool: tank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 0.40% done, 2h45m to go
config:

        NAME              STATE     READ WRITE CKSUM
        tank              DEGRADED     0     0     0
          raidz1          DEGRADED     0     0     0
            c0t2d0        ONLINE       0     0     0
            replacing     DEGRADED     0     0     0
              c0t3d0s0/o  FAULTED      0     0     0  corrupted data
              c0t3d0      ONLINE       0     0     0
            c0t4d0        ONLINE       0     0     0
            c0t5d0        ONLINE       0     0     0

errors: No known data errors

And some time later…

mwilson athena:~ [1273]% zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 2h34m with 0 errors on Wed Jan 14 21:48:43 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0
            c0t4d0  ONLINE       0     0     0
            c0t5d0  ONLINE       0     0     0

errors: No known data errors

Everything happy again! I love it when things just work how they’re supposed to.

Solaris CIFS Server and ZFS ACLs: The Problem

I’m going to be switching my home file server over to Solaris soon (or, more specifically, Solaris Express Community Edition [SXCE] build 99), and one of the primary goals of this server is to serve up a few directories to Windows or other SMB clients. One of the reasons I’m switching from Linux to Solaris is because I’m significantly increasing the disk space and I want to use ZFS for my storage pool. At the same time, I’m hoping to take advantage of SXCE’s built-in CIFS server to serve SMB shares.

To prepare for the big switch, I installed the latest build on a test machine and am playing around with setting up a configuration similar to what I’ll want. Unfortunately, it became clear quickly that I was going to hit problems with the new NFSv4 ACLs implemented in ZFS and how the CIFS server interacts with those ACLs on behalf of Windows clients.

So, in this post, I’ll walk through what I want to have happen, and what’s actually happening instead.

I have several users. They all belong to a group named data. There is a directory, /export/sandbox, that is for group project resources for everyone in the data group. All users in the group should be able to create files and directories in sandbox, and everyone else in the data group should then be able to modify those files and directories. All users on the system should have read access to the sandbox tree.

This is very easy with traditional Unix permissions. You set the sandbox directory to mode 775, set the group to data, and set the setgid bit. For example, if I’m going to make sandbox a ZFS dataset, I can do:

# zfs create -o casesensitivity=mixed rpool/export/sandbox
# chown root:data /export/sandbox
# chmod 775 /export/sandbox
# chmod g+s /export/sandbox

(The casesensitivity option is to make it play well with Windows as a file share.)

Finally, in each user’s profile I can set umask 002 and everything works as desired. Let’s log in as mwilson and do some tests:

$ umask
002
$ cd /export/sandbox
$ ls -l
total 0
$ touch test-file
$ mkdir test-dir
$ ls -l
total 2
drwxrwsr-x   2 mwilson  data           2 Oct  7 17:35 test-dir/
-rw-rw-r--   1 mwilson  data           0 Oct  7 17:35 test-file

Excellent! This is exactly what we wanted: the file’s group is set to data and it remains writable by the group, as does the directory. Both are read-only for the world. The setgid bit on the new directory is set, so this method will work as users continue making subdirectories deeper in the tree.

But now, we’re going to add a new requirement: some users will access the sandbox from Windows using SMB. My server is already set up to run the Solaris CIFS server, so it’s easy to share this folder:

# zfs set sharesmb=name=sandbox rpool/export/sandbox

Now from a Windows client, I can go to \\server\sandbox and sure enough I see the directory and its contents. I’m authenticated as a user that maps to the mwilson Unix user. Now I’ll create a text file from Windows, then look at the directory listing back in Unix:

$ ls -l
total 5
drwxrwsr-x   2 mwilson  data           2 Oct  7 17:35 test-dir/
-rw-rw-r--   1 mwilson  data           0 Oct  7 17:35 test-file
----------+  1 mwilson  data           0 Oct  7 17:47 windows-file.txt

Whoa! Look at the file we created, windows-file.txt. That’s different… the ZFS ACLs are beginning to rear their ugly heads. The + next to the Unix permissions indicates that this file contains extended ACLs. Let’s look at the ACL on this file:

$ ls -v windows-file.txt
----------+  1 mwilson  data           0 Oct  7 17:47 windows-file.txt
     0:user:mwilson:read_data/write_data/append_data/read_xattr/write_xattr
         /execute/delete_child/read_attributes/write_attributes/delete
         /read_acl/write_acl/write_owner/synchronize:allow
     1:group:2147483648:read_data/write_data/append_data/read_xattr
         /write_xattr/execute/delete_child/read_attributes/write_attributes
         /delete/read_acl/write_acl/write_owner/synchronize:allow

Okay. Deep breath. This file has ACL entries that say the user named mwilson is allowed to do, well, just about everything you could ever want to do to the file. The group with ID 2147483648 also has full permissions. Why the weird group number? It’s something to do with the mapping of Windows users and groups to Unix users and groups…honestly, I don’t know where it’s coming from. Since I’m mapped to the mwilson user, I wish it would just apply the Unix user’s group as the effective group if nothing else.
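(If I cared enough to chase it down, I believe the idmap service that the CIFS server uses is the place to look; something like the following should show what that ephemeral ID maps to. That’s a guess about where to look, though, not something I’ve actually verified on this box.)

$ idmap show gid:2147483648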

In any case, there seems to be a problem here: the data group no longer has any access to this file! Nor, it seems, does the world have read access.

Let’s log in as another user, jsmith, who is in the data group and look at the sandbox directory.

$ cd /export/sandbox
$ ls -l
./windows-file.txt: Permission denied
total 4
drwxrwsr-x   2 mwilson  data           2 Oct  7 17:35 test-dir
-rw-rw-r--   1 mwilson  data           0 Oct  7 17:35 test-file

Wow… ls gives us an error just trying to list the directory! That’s pretty bad…

Just for kicks, we’ll make a directory from Windows and see what that looks like, as mwilson:

$ ls
total 6
drwxrwsr-x   2 mwilson  data           2 Oct  7 17:35 test-dir/
-rw-rw-r--   1 mwilson  data           0 Oct  7 17:35 test-file
d-----S---+  2 mwilson  data           2 Oct  7 20:45 windows-dir/
----------+  1 mwilson  data           0 Oct  7 17:47 windows-file.txt

Okay, there’s the directory (windows-dir), but it again looks different from what we’re used to. It’s similar to the Windows-created file, but has a capital S in the group mode. That indicates that the setgid bit is set, but the execute bit is not set for the group. Let’s check the ACL that’s in place:

$ ls -dv windows-dir
d-----S---+  2 mwilson  data           2 Oct  7 20:45 windows-dir/
     0:user:mwilson:list_directory/read_data/add_file/write_data
         /add_subdirectory/append_data/read_xattr/write_xattr/execute
         /delete_child/read_attributes/write_attributes/delete/read_acl
         /write_acl/write_owner/synchronize:allow
     1:group:2147483648:list_directory/read_data/add_file/write_data
         /add_subdirectory/append_data/read_xattr/write_xattr/execute
         /delete_child/read_attributes/write_attributes/delete/read_acl
         /write_acl/write_owner/synchronize:allow

This is just like the file, but with the directory versions of the permission names instead of the file versions of the names.

Where does that leave us? Creating files and directories from the Unix command line gives me the behavior I want, but creating files from Windows through the SMB share leads to dreadful results.

The problem clearly lies with ACL inheritance: I suspect I’ll need to define the ACLs I want on the sandbox directory itself and set the appropriate inheritance flags, so that files created in an “ACL-aware” fashion, such as by the CIFS server, end up with the permissions I want. We shall see…hopefully soon I’ll have a follow-up post walking through the solution.
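For the record, here’s roughly what I expect the fix to look like, based on my reading of the ZFS ACL documentation so far. This is an untested sketch, not a verified solution: the compact permission letters are shorthand for allow sets similar to the ones in the ls -v output above, and the f/d flags request file and directory inheritance.

# zfs set aclinherit=passthrough rpool/export/sandbox
# chmod A=owner@:rwxpdDaARWcCos:fd:allow,group@:rwxpdDaARWcCos:fd:allow,everyone@:rxaRcs:fd:allow /export/sandbox
# ls -dv /export/sandbox

Whether passthrough is the right aclinherit value, and whether execute really belongs in the everyone@ entry once it gets inherited onto plain files, are exactly the kinds of details a follow-up would have to sort out.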

Update: as if it isn’t obvious by now, I never did follow up and figure out the right ACLs. I ended up using Solaris 10 Update 6, instead of OpenSolaris, to build the new file server. Solaris 10 doesn’t have the integrated CIFS server yet, so I’m going the traditional Samba route, which doesn’t have this problem.
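With Samba, the traditional knobs get me what I want without any ACL gymnastics. Here’s a minimal sketch of the kind of share definition I mean (these are standard smb.conf parameters, but the specific values are just my starting point, not a battle-tested config):

[sandbox]
        path = /export/sandbox
        read only = no
        force group = data
        create mask = 0664
        directory mask = 2775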

Dear Google: can you please add two features to GMail for me?

For several years, I ran my own server to handle my email. At first it was a fun project, gave me good real-world experience, and provided flexibility that I wouldn’t have had with most hosted options. Procmail and mutt were my friends. Over time, though, keeping up with anti-spam measures became more of a burden than it was fun, and in the grand scheme of things I just didn’t feel like spending my free time caring for and feeding a production mail server.

The death knell for my own server was the introduction of Google Apps For Your Domain. Having played with regular GMail in the past, I liked the interface and its threading model, and I buy into the philosophy of searching email archives instead of trying to organize them. For those and other reasons, moving email to Google Apps sounded like a good option, so I set up a test domain and eventually moved mattwilson.org to Google Apps.

In short, I’ve been happy with the service and their spam filter is amazingly accurate. So I’m a happy camper, but there is one area where I’d like to see a couple of improvements: handling email list subscriptions.

I subscribe to several mail lists, and GMail’s searching and conversation threading features particularly shine when reading list traffic. Each list gets its own label and messages “skip the inbox” so I can just go through and read the lists I’m interested in as I have time. But here’s where the problems arise:

First, GMail’s filters don’t allow me to reliably drop messages from particular lists into a particular label (for GMail neophytes, think of labels as folders). For some lists I’ve subscribed to, the only way to identify that a message came from that list is by looking for a specific header. Unfortunately, I can’t filter based on headers with GMail, so messages from those lists can’t be filed correctly. Even for the majority of my lists, which I filter based on the list address in the “To” field, I occasionally get messages in the inbox because the list was bcc’ed for that particular message. There’s another header that still identifies the list, but I can’t act on it. So feature request one: I’d like to filter based on headers.

Second, I don’t read every message on every list. My workflow is to click on a label, scan the subject lines, and read the messages that look interesting. This leaves several unread conversations, and in the best case it takes three clicks to mark the remaining conversations as read. If I’ve been on vacation or not reading list traffic for a couple days and the messages expand past the first list screen, it takes more work to mark them as read. So feature request two: while browsing a label, I’d like a “Catch Up” or “Mark All As Read” button right up there next to the Delete button.

GMail is a natural fit for managing an email account that subscribes to mail lists. The search is great, and the conversation interface is wonderful for following threads. With the addition of header-based filtering and a quick way to mark everything from a list as read, it would be truly fantastic.

Cell Phones (and yes, the iPhone)

By now, of course, you know it happened: Apple, Inc. announced the iPhone.

As cool as the phone looked throughout the entire demo, I was upset the whole time (and continued to rant all day…) that it’s a GSM/EDGE device. I am in no way a fan of Verizon Wireless as a company, but the bottom line is that they have the best network (in all measurable areas: coverage area, call quality, call setup time, etc.) in the area I live. EvDO is also significantly faster than EDGE, which for a mobile device such as the iPhone is going to be important. But more on the cell carriers later.

First, the iPhone itself. There’s not much to say other than “drool.” How can you not want one?

Warning: boring digression!

Perhaps a digression is in order: the iPhone announcement comes at an interesting time for me because I recently evaluated—and briefly tried—the switch to smartphone-land. My first attempt was a BlackBerry Pearl with T-Mobile, which had a fabulous web browser, but otherwise I wasn’t a fan of its interface and capabilities. RIM has its (admittedly large) niche, but I wasn’t necessarily looking for real-time Exchange integration to be my killer feature. I was coming from fantastic call quality and coverage with Verizon Wireless, so the BlackBerry experience wasn’t quite doing it for me, and I switched back to my old phone and old plan.

My other option, then, was the Treo 700p. I’d have to pay an arm and a leg for the device, but I used Palms long ago and knew I’d like their PDA functionality, so it was just a question of online data access. Sadly, it was a joke. The web browser (if it could render the page at all) was horrendous compared to the BlackBerry web browser, and the most important feature for me in a smartphone is web browsing. Also, despite being on a data network (EvDO) that is an order of magnitude faster than what the BlackBerry had access to, browsing the web on the Treo was painfully slow. It was clear the whole device was single-threaded at the operating system level, and it was just an awful experience. I’m sure the Treo is fantastic in every other way, but if it can’t browse the web decently, why even sell it?

I’m not interested in Windows Mobile-based devices, and I have a huge financial incentive to stick with my current (voice-only) cell phone and plan, so I left off my thoughts of smartphones around the end of November and decided to give the market some time to get better.

End digression! 

Which brings us back to the topic at hand: the iPhone just came along. It looks to be exactly what I want: non-Windows-based smartphone with a fantastic web browser and nice interface. And a mail client that can do direct IMAP or POP3 on top of that (this was a problem with the BlackBerry and, as far as I could tell, the Treo—they each had to proxy IMAP or POP3 stuff through the wireless provider, I think. This was an extra charge with T-Mobile and I don’t know how Verizon handled it. I want the phone to make a direct TCP connection to my mail server to check mail!).

I want one.

But… there are snags:

  1. Cingular??? Puh-leeze. They are the worst carrier (call quality/coverage/dropped calls) in this area from everything I’ve seen. At least Apple could have gone with T-Mobile to throw in the “hip and cool” angle.
  2. GSM/EDGE? This one is understandable (sadly), but still not what I want. CDMA/EvDO is just plain better, if for no other reason than EvDO is truly broadband-like speeds and EDGE isn’t. The international market is almost exclusively GSM, though, which is why this decision is understandable. I don’t know much about higher-speed GSM data technologies but we’ll have to see how quickly Cingular builds out their network with better tech and if Apple follows with a matching phone.
  3. This is a very expensive setup. The phone is very pricey and really isn’t a suitable iPod replacement (8GB in the most expensive model, which is a bit of a joke for their first “widescreen video iPod”), so you can’t use the “well, you’re getting a phone and an iPod for the price of one” argument. You will still want to buy the real widescreen video iPod when it comes out, so budget another few hundred bucks for that. Also, I don’t think most people realize how much an unlimited data plan costs: expect your cell phone bill to double if you have a regular 450-900 minute a month plan. In the Cingular case, unlimited data looks to be $45/month on top of your voice plan.

Points 1 and 2 aren’t likely to affect the mass market, I just don’t like them. Point 3, though, is interesting to me. What market is Apple going for with this phone? I don’t have any data on this, but I would guess that the majority of the cell phone accounts that have the extra $45/month data plan are corporate lines of service. There’s nothing out there yet indicating Apple has any kind of over-the-air Exchange integration story for the iPhone, which will prevent its adoption as a replacement for most of those corporate devices currently tied to data plans. That will still leave lots of people who are interested in doing this sort of setup on their own (like me), but this isn’t exactly something like an iPod where Mom-and-Dad can sink a one-time cost to buy the device and the kid is happy. Will this be a compelling device without a data plan? Perhaps. Is part of the Cingular/Apple deal a special service plan to get people on board? Perhaps. There are different data plan options (most BlackBerrys are on special BlackBerry data plans, with additional services like IMAP/POP3 mail checking requiring an additional charge), and Cingular looks to have a web-browsing only plan for certain smartphones, but in the case of the iPhone that gets back to the question of direct mail client connections versus proxying through some webmail service.

But forget all that: my big question about the iPhone is what “runs Mac OS X” means. It sure doesn’t mean that it’s literally the same operating system distribution that runs on my desktop machine. I suspect it does mean there are parts of Darwin underneath with some key APIs to make it look like Mac OS X for development purposes. (Which segues into the next question: what does developing for an iPhone look like? A new XCode module? Will there be a simulator? etc…)

On that note, potentially show-stopping (for me, at least, if not most people) news I ran across during my iPhone news roundup at the end of the day: is it true that there will be no third-party development for the iPhone? This seems to be confirmed by another source on the show floor.

Anyway, at the moment, I want one. We’ll see what’s happening in the second half of this year.

Oh, and as promised, my quick thoughts on cell providers in the Portland, Oregon area:

  • T-Mobile. I really like T-Mobile because they have great customer service and the best plans/prices. The downside is limited coverage area and GSM/EDGE.
  • Cingular. Good luck actually getting through a complete call with someone and having both parties actually be able to understand each other the whole time! If you could even make good calls, it would be unfortunate that it’s GSM/EDGE.
  • Verizon Wireless. Pure evil. They cripple their phones so that even if the phone is capable of (for example) sending pictures you took to your computer via Bluetooth, that feature is disabled so you have to use Verizon’s $0.25-per-picture over-the-air picture delivery service. There was a class-action suit against them because of this and folks got new phones, but unfortunately it didn’t result in Verizon changing the practice of crippling phones going forward; they just added more fine print to cover themselves against future lawsuits. The most mega of the mega corps when it comes to cell phones. BUT (and this is important) they have the best network in terms of coverage, reliability, etc. They are also CDMA and have great EvDO service around the country. At the end of the day I’m not paying my cell phone company to let me take pictures with my phone, I’m paying them to move my voice and data. It’s incredible how much better Verizon does this than the other carriers I’ve dealt with, so…sadly…Verizon gets my business.
  • Sprint/Nextel. Irrelevant. (yeah, I know, harsh! But at the moment, they are. Come on, you go to Radio Shack to buy them. That can’t be a good sign!)

That’s it for now. As I said, I’ll be curious to revisit the iPhone after the first round of people gets them and takes them for a spin.

Fixing busted fonts in Nevada build 46

Fonts seem to be broken slightly in Nevada builds 45 and 46. To correct, go to /usr/openwin/lib/X11/fonts/F3bitmaps/ and copy fonts.alias.all to fonts.alias.
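In other words, as root:

# cd /usr/openwin/lib/X11/fonts/F3bitmaps/
# cp fonts.alias.all fonts.alias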

(Forum thread here)

It’s been a while

Wow, well over a year since I’ve posted anything here. I guess I’m just not cool enough to truly jump on the blogging bandwagon!

Since I feel, for some reason, obligated to add something new at least once per calendar year, let me just say this:

Thank you, Sun Microsystems, for Solaris 10.

Seriously, Solaris 10 may be the best operating system ever. Zones, by themselves, would make this release of Solaris a Big Deal. But as an added bonus, we get ZFS as well. And while I haven’t quite figured out DTrace yet, it’s also a valuable new addition. I’m sure Sun’s marketing department would also want you to know about hundreds of other new features, but Zones and ZFS are what have been keeping me busy and having fun.

And don’t forget, it’s all open source now, so nobody has any reason not to run it!

Beach Pictures

I went to the beach Friday evening with camera in hand. Here are the results:

[Photos: Ocean, Ocean Meets Land, Sunset, Sunset]

Don’t Panic

The Hitchhiker’s Guide to the Galaxy motion picture is arriving April 29th. I love the trailer!

Camp Hancock Report

This weekend saw a trip out to OMSI’s (the Oregon Museum of Science and Industry, http://www.omsi.org) Camp Hancock for a star party. The site is about 190 miles from my apartment, a nice, scenic drive of a little over three hours, which takes you sufficiently far from Portland to have wonderfully dark skies.

Friday night’s weather was generally cooperative; observers will tell you the transparency was relatively poor, but I think most of us were more than happy that two far more important considerations, darkness and lack of cloud cover, were decent. I spent a good portion of the night trying my hand at astrophotography, the results of which were, as one might expect for my first serious attempt, rather lacking in success. However, it was a good learning experience and a necessary first step.

Saturday night I stuck with visual observing, but unfortunately the weather was far less cooperative and clouds covered us up early in the night. That didn’t stop me from hopping around the sky and catching a number of Messier objects while I had the chance. I wrote down the list of what I looked at that night, so without further ado I present Saturday Night’s Hit List:

The evening started with *M11*, an open cluster in Scutum. I tend not to be particularly impressed with most of the open clusters in Messier’s catalog, although from time to time one comes up that has nicely colored stars or an interesting arrangement. But M11, the Wild Duck Cluster, is really spectacular. It’s much more populated and compact than any other open cluster I’ve seen — at least 680 stars covering 13′ — making it look significantly more like a globular cluster. It also has a distinctly brighter and more colorful single star near its center.

Next up were *M10* and *M12*, both globular clusters, in Ophiuchus. Their relative closeness and the fact that Ophiuchus was already partially obscured by the horizon at this point made it hard for me to get my exact bearings within the constellation; I had to find both of them and compare their positions to determine if I was looking at M10 or M12. Average globulars, I’d say.

No viewing session is complete without at least a quick look at *M13*, the globular in Hercules. Everything ever written about M13 includes a statement along the lines of “the finest globular cluster in the northern skies” (in this case from _The Messier Objects_ by Stephen James O’Meara), so I’ll just leave it at that!

*M15* is another globular, this time in Pegasus. It’s another of the brighter globulars in the sky.

Technically in Vulpecula’s territory, I find *M27* using stars in Cygnus as a guide (because, honestly, who really knows Vulpecula?). Known as the Dumbbell Nebula, M27 is another staple of my observing sessions. It’s amazing to think that when you’re looking at the nebula, it’s expanding at a rate of about 20 miles per second. From Earth, that translates to a growth rate of 6″ per century.

I have looked for *M40* a couple of times before, but nothing ever stood out. This time, sure enough, there was a double star right where it’s supposed to be. The inclusion of M40 in Messier’s catalog strikes me as somewhat odd considering that it is just a double star (slightly above where the handle of the Big Dipper connects to the ladle), but according to my Messier Objects finder chart book, Messier was apparently looking for a nebula in the area and ended up just finding this double.

Next I tried looking for M26, another open cluster, but ended up landing on a “faint fuzzy” that looked like either a faint globular or a bright galaxy. I knew I wasn’t on M26, but I didn’t know what I had landed on instead. Careful consultation with a star chart (in this case Jim’s copy of _Uranometria 2000.0_, since I forgot my own atlas) led me to the hypothesis that I had stumbled upon *NGC 6712*, a magnitude 8.2 globular in Scutum. After some localized clouds passed (they were beginning to become a problem by this point), I did find M26 and, based on everything’s relative positions, was able to confirm that I had indeed found NGC 6712. Final confirmation was with a goto scope (you can’t be too careful, right?).

The quest for *M26*, as mentioned above, was also finally successful after an interesting diversion to NGC 6712. M26 is an open cluster in Scutum, and falls into that category of open clusters that just don’t excite me.

No observing session is complete without getting *M31*, *M32*, and *M110* in the same field of view of my telescope. The Andromeda Galaxy, and its companions, look quite marvelous in such dark skies.

*M33*, the Pinwheel Galaxy, is next on the list and close to Andromeda in the constellation Triangulum. Although moving from Andromeda on to M33 is completely unfair to M33, it is still a nice galaxy to look at. Apparently it may be a satellite galaxy of Andromeda, actually orbiting the larger galaxy.

Next up was *M34*, an open cluster in Perseus. I don’t really remember what this one looked like; I’ll have to take better notes next time. Easily overshadowed by the nearby Perseus Double Cluster, which I’m sure I prefer looking at.

While on open clusters, I next went to *M39*. I’m going to have to revisit this one with my new Messier book close at hand to see if I really did land on what is considered M39; it’s pretty open and sparse (only 30 stars across its 30′ size). Moving right along…

*M52* was the next open cluster, in Cassiopeia. Again, open cluster, not good notes, no idea what I thought of this one. Must take better notes on these things in the future!

Now on to a much more impressive open cluster: *M45*, the Pleiades in Taurus. Beautiful, bright stars that form a fuzzy splotch in the sky visible to the naked eye, and quite a sight in a nice wide-field view of the area.

Back to globulars, *M56* in Lyra was partially obscured by the clouds overhead, so I didn’t see it as much more than a spot that was brighter than the surrounding area. I’ll have to revisit this one on a clearer night to appreciate it more.

Perhaps the faintest fuzzy in the set of galaxies in Messier’s catalog, *M74* wasn’t too difficult to find in Pisces given the darkness of the skies out at Hancock. Definitely not one that I’ll be seeing too often outside of star parties given its 9.4 magnitude spread across 10′.5 by 9′.5.

Finally, before the clouds completely covered the sky, the last little opening was at the far end of Andromeda where we enter Perseus’ territory, home of the Little Dumbbell Nebula, *M76*. This is another one reserved for skies much darker than those close to home; it’s a very small magnitude 10.1 planetary nebula. The transparency at this point was extremely poor (I may have actually been looking through a thin cloud, given that the whole sky except this one spot was overcast), and I couldn’t really see the dumbbell shape. I do know it deserves its name, though, because the other time I’ve seen this object was on a much better night for observing and there was a clear resemblance to the larger and brighter Dumbbell Nebula.

That wraps up Saturday night’s batch of objects, sadly cut very short (10pm or so, compared to Friday night when I was up past 2am) by the clouds.

Some technical details about the objects in my descriptions above are from _The Messier Objects_, by Stephen James O’Meara. I don’t know magnitudes for these things off the top of my head, yet ;-).