Friday, April 15, 2011

Lesson learned moving clustered SQL Server 2008 system databases

I was schedule to perform some database maintenance on a SQL Server 2008 cluster on my last day of work before my move to Bermuda and it turned out to be quite the night because I immediately ran into issues when I attempted to move the model database from a LUN that was going to be decommissioned.  I was in such a rush (we had a small window to work with) that I didn’t really get to document the troubleshooting and error messages but what basically happened was that I was unable to attach the model database after detaching it.  Every time I ran the attach database transact SQL command, I would immediately get kicked out of SQL Server Management Studio even though the service was started in single user mode.  Traversing through the logs show that the reason why I get kicked out is because when I initiate the attach command, SQL server would attempt to create the tempDB which is derived from the model database and since the model did not exist, the command fails and the MSSQLServer service stops.

Since I ran out of options and the time left for the window became shorter and shorter, we initiated a call to Microsoft but had to wait a good 2 hours before a call back even though our issue was deemed critical.  When we finally got a call back from Microsoft, the engineer used the query:

use master

select * from sys.databases

Which would usually show the following:


… but since the model database was detached, we would see a missing entry for the model database like this:


I asked the engineer if there was a way to manually rewrite the entry into the table but he said it wasn’t possible.  The engineer proceeded to go through pretty much all the commands I tried and eventually ran out of ideas saying we should try to do a restore of the database.

So what ended up saving the day?  I have a habit of backing up databases before I move them and because the system databases were so small, I performed a backup of the mdf and ldf files before I started my work so I proceeded to rename the master.mdf file, copied the backup into the directory, started SQL server and managed to bring the cluster back online.

Lesson learned?  If you’re dealing with a SQL Server 2008 cluster, make sure you test the method you’re going to use to move the system databases before attempting to do so and also make a backup of your system databases (i.e. master) before you perform the work in case you run into the same problem as I did.

No comments: