Wednesday, September 15, 2010

Troubleshooting SQL Backup Job Failure – Logs appear to be truncated

I don’t do quite as much SQL DBA tasks these days as the projects I’ve been working on at the current company aren’t as involved with SQL as the previous company I was with. With that being said, there are a few environments with SQL clusters with issues that are escalated to me every so often and one of them began having backup jobs from maintenance plans that I’ve set up fail once every week. As I began troubleshooting and reviewing the logs, I noticed that Job log error had an error logged but the information was somewhat truncated. The following is the flow of troubleshooting that I went through to ultimately figuring out why the jobs were failing.

The first logs I went to look at were the Error Logs section:

image

Unfortunately, the logs didn’t really tell me much (I also had the SQL Server Agent logs selected but did not see any errors) so I went ahead and checked the SQL Server Agent Jobs logs and was able to see an error with a bit more information but it was truncated:

image

Here’s a dump of the text from the error. In order for it to be displayed problem, it needs to be copied to notepad with the size of the window bigger:

Date 9/8/2010 2:00:01 AM
Log Job History (Temp backup to X.Subplan_1)

Step ID 1
Server xxxSQLCLUSTER
Job Name Temp backup to X.Subplan_1
Step Name Subplan_1
Duration 02:12:13
Sql Severity 0
Sql Message ID 0
Operator Emailed
Operator Net sent
Operator Paged
Retries Attempted 0

Message
Executed as user: XXXX\svc_sqlclusteragent. ...531.0 for 64-bit Copyright (C) Microsoft Corp 1984-2005. All rights reserved. Started: 2:00:01 AM Progress: 2010-09-08 02:00:10.94 Source: {7505B5D7-1DFB-4550-B691-085284227C50} Executing query "DECLARE @Guid UNIQUEIDENTIFIER EXECUTE msdb..sp...".: 100% complete End Progress Progress: 2010-09-08 02:00:13.58 Source: Maintenance Cleanup Task Executing query "EXECUTE master.dbo.xp_delete_file 0,N'X:\',N'bak',...".: 100% complete End Progress Progress: 2010-09-08 02:00:16.17 Source: Back Up Database (Full) Executing query "EXECUTE master.dbo.xp_create_subdir N'X:\\XXXX.xxx...".: 11% complete End Progress Progress: 2010-09-08 02:00:16.17 Source: Back Up Database (Full) Executing query "EXECUTE master.dbo.xp_create_subdir N'X:\\XXXX.xxx...".: 22% complete End Progress Progress: 2010-09-08 02:00:16.17 Source: Back Up Database (Full) Executing query "EXECUTE master.dbo.xp_create_subdir N'X:\\XXXX.xxx...".: 33% complete End Progress Progress: 2010-09-08 02:00:16.17 Source: Back Up Database (Full) Executing query "EXECUTE master.dbo.xp_create_subdir N'X:\\XXXX.xxx...".: 44% complete End Progress Progress: 2010-09-08 02:00:16.17 Source: Back Up Database (Full) Executing query "EXECUTE master.dbo.xp_create_subdir N'X:\\XXXX.xxx...".: 55% complete End Progress Progress: 2010-09-08 02:00:16.17 Source: Back Up Database (Full) Executing query "EXECUTE master.dbo.xp_create_subdir N'X:\\XXXX.xxx...".: 66% complete End Progress Progress: 2010-09-08 02:00:16.17 Source: Back Up Database (Full) Executing query "EXECUTE master.dbo.xp_create_subdir N'X:\\master' ...".: 77% complete End Progress Progress: 2010-09-08 02:00:16.17 Source: Back Up Database (Full) Executing query "EXECUTE master.dbo.xp_create_subdir N'X:\\model' ".: 88% complete End Progress Progress: 2010-09-08 02:00:16.19 Source: Back Up Database (Full) Executing query "EXECUTE master.dbo.xp_create_subdir N'X:\\msdb' ".: 100% complete End Progress Progress: 2010-09-08 02:14:24.05 Source: Back Up Database (Full) Executing query "BACKUP DATABASE [XXXX.xxx.Production] TO DISK = N...".: 50% complete End Progress Progress: 2010-09-08 02:27:30.57 Source: Back Up Database (Full) Executing query "declare @backupSetId as int select @backupSetId =...".: 100% complete End Progress Progress: 2010-09-08 02:58:42.92 Source: Back Up Database (Full) Executing query "BACKUP DATABASE [XXXX.xxx.Production.DataStore] TO...".: 50% complete End Progress Progress: 2010-09-08 04:02:53.63 Source: Back Up Database (Full) Executing query "declare @backupSetId as int select @backupSetId =...".: 100% complete End Progress Progress: 2010-09-08 04:03:14.19 Source: Back Up Database (Full) Executing query "BACKUP DATABASE [XXXX.xxx.Production.MemberShip] T...".: 50% complete End Progress Progress: 2010-09-08 04:03:23.57 Source: Back Up Database (Full) Executing query "declare @backupSetId as int select @backupSetId =...".: 100% complete End Progress Progress: 2010-09-08 04:05:09.59 Source: Back Up Database (Full) Executing query "BACKUP DATABASE [XXXX.xxx.Training] TO DISK = N'X...".: 50% complete End Progress Progress: 2010-09-08 04:05:56.55 Source: Back Up Database (Full) Executing query "declare @backupSetId as int select @backupSetId =...".: 100% complete End Progress Progress: 2010-09-08 04:07:41.12 Source: Back Up Database (Full) Executing query "BACKUP DATABASE [XXXX.xxx.Training.DataStore] TO ...".: 50% complete End Progress Progress: 2010-09-08 04:08:27.17 Source: Back Up Database (Full) Executing query "declare @backupSetId as int select @backupSetId =...".: 100% complete End Progress Error: 2010-09-08 04:08:53.25 Code: 0xC00291EC Source: Back Up Database (F... The package execution fa... The step failed.

The error is indicated at the bottom where it reads:

Error: 2010-09-08 04:08:53.25 Code: 0xC00291EC Source: Back Up Database (F... The package execution fa... The step failed.

The information’s not very helpful with the truncated text so I went back to SQL Server Business Management Studio to look around. I ended up at the Maintenance Plan node’s actual job:

image

Drilling down to the actual job and clicking history gave me more information:

image

The error number is: -1073573396

The error message is: Failed to acquire connection "Local server connection". Connection may not be configured correctly or you may not have the right permissions on this connection.

From here I was able to review the time the job failed, what else was going on and noticed that there was an overlap during the time of the backups and database re-indexing which probably lead to the backup not able to obtain the lock it needed on the database. What I ended up doing was adjust the timing of the two jobs.

3 comments:

Doug Purnell said...

Thank you for this post Terence. I was trying to diagnose an error from the activity log and ran into the same trucation issue as you. Once I saw your trick to view the history from the maintenance plan, I was able to see the specific error and solve the problem. Thank you so much!

Anonymous said...

Thanks, good explanation.

digital id said...

Thank you so much for taking the time to share this information. A great read. I’ll certainly be back.