hahaha: update statement

This afternoon, I had to put together a SSIS package and inside it, I needed to write an UPDATE statement.  The only problem was that I haven’t written an UPDATE statement in quiet a while and had to pull up SQL BOL.

It was pretty funny and embarrassing at the same time.

Fix long running jobs without database modification

During an interview, the following question usually comes up, “How would you troubleshoot a long running job?” and most of the time the answers goes into details such as:

1. open up perfmon and identify the bottlenecks
2. use sql profiler or server side trace to identify the offending queries
3. obtain an execution plan and determine if any index are missing or one need to be created
4. create the needed index and test to see if it helps.

Those are some of the acceptable answers; however, there are times when none of those are needed, although it would help greatly.  Last week, I was tasked at trying to reduce a log running job from 5hrs to 2.5hrs or less.  As any DBA not knowing the system would do, I was able to identify the long running query and got an estimated execution plan.  It wasn’t a long runner query per-say but a long running INSERT using a SELECT * FROM vw_someView statement.  INSERTS as noted in a post of mine states that it does not make use of parallelism and is single threaded, unless you are using INSERT INTO … SELECT * FROM TABLE/VIEW.

In this case, the SELECT statement didn’t show any parallelism in the execution plan, so I know I don’t have to bother with any maxdop setting in sp_configure.  I did find out that they are using a mixture of uniqueidentifier and uniqueidentifier + nvarchar(4000) as their primary key/clustered index.  We know this is stupid because the index is not as narrow as it should be.  On top of that, there would be a lot of page splitting and fragmentation on the uniqueidentifier column.  In addition, cause a lot of random I/O and stresses the disk more than it needs to.  The uniqueidentifier wouldn’t be so bad had they partition that column out using Table Partitioning.

My goal was not to suggest a change in the database structure or add any new technology into the existing system but to advise on how to make the job run faster.  Therefore, I found and advised a list of things they can do to make the job run faster.  I was not given access to a physical server with similar spec (the have such an env) but was given a VM with 4GB of memory.  This VM has the offending job and the databases.  Prior run took 11hrs.

The existing jobs basically called a store procedure that makes a bunch of inserts using the INSERT INTO Table SELECT * FROM vw_someView.  So I asked the question, are there any dependency on any of these inserts?  The answer was no.  So my #1 suggestion was to break down the job and run it parallel.  After identifying which tables were taking the longest time, I created 6 new jobs.

1. Load Data
2. Inserts Table1
3. Inserts Table2
4. Inserts Table3
5. Inserts Table4
6. Call End Process

Load Data will call job 2-5 via exec msdb..sp_start_job @job_name = ‘Inserts Table1′ and so on.  So now all 4 jobs are running in parallel.  After each of the Inserts job are done, it will check to see if any of the other Inserts job are still running.  If yes, then it will not start the job ‘Call End Process.’  To determine if the jobs are still running or now, I used a bit of Euclid reverse method.

After my job ran in this VM environment, I was able to reduce the job by a good 4 hrs, from 11hrs to 7hrs.  It is definitely not a 50% reduction but I can only do so much on a VM with very little CPU and Memory.

So the next time you are at a job interview, instead of giving answer 1-4, maybe you can use this as a 5th and land that job :D .

New job, new challenges?

I left my old employer after 6 months and started a new one yesterday.  The old one had potential to be a great except that the projects kept getting sided line leaving with not very much to do.  And without doing, you start to forget what you know, which then weakens you.

When I took my old job, I was very excited.  This new job, I will withhold judgement until I get to know the team, workload, and environment.  There won’t be any more posts from me for a while until I either learn or work on something new.

Until then.

Bad query

Last week, I was putting together some reports for SSRS and at the end of the day, I wasn’t thinking straight as I just copied and pasted some previouly used code to generate the desire report.

The code I had used in the past collects data from all the databases and then store it into a table.  This table would then provide me with some historical information.  The new report I was working on, I wanted it to give me real time data.  So instead of modifing the code to give me data from a specific database, I had it query all databases and the filter it by the 1 database I needed, which is a waste of time and processing power.

Fixed it today and now the query is hundreds of time faster!

SSRS: location to save rdl as template

I’ve been playing with SSRS 2008 R2 recently and I made a template of what I want all my SSRS report to look like from the get-go.  The book I’m using suggested a path but it wasn’t correct.  Maybe it is a different verion or something.  

My version of VS 2008 is: 9.0.30729.4462 QFE.  After some trail and error, I found the location to be:

C:\Program Files (x86)\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies\ProjectItems\ReportProject

Store your .rdl file in this folder, next time you go and add a new report, you will see it listed.

This is WHY you don’t have spaces in database name!!!

Today, I was working on a new script that would collect database’s file size and store them in a central repository, I have to use linked servers, some dynamic sql, and while loops.  After spending time breaking down and going thru some trails and errors, it worked nicely until I encountered an error:

Msg 7313, Level 16, State 1, Line 2
An invalid schema or catalog was specified for the provider “SQLNCLI10″ for linked server “VipA\InstanceA”.

So after spending 20 some minutes scratching my head as to why this database is giving me grief, I found out the reason:  In the first few weeks of my current job, I found that one of the databases had a trailing white space. I have asked around and no one would give me an answer.

This shit really piss me off, of all the 270 databases, this is the ONLY bloody database with a white space. I want to so BADLY right this wrong but I’m not sure what kind of impact it would have.

To solved this, I simply put [ ] around the variable @dbname.

SQLMag: What are the seconds per read and seconds per write counters for?

A blog post by Denny Cherry.  I had the opportunity to meet and talk with Denny at the SQL In The City 2011 in LA.  Haven’t heard the last name Cherry since high school, where one of my attractive counselor, also had the last name Cherry.  Oddly enough, the only thing that I really remember about her was that she doesn’t have the ability to sweat…. That and she was nice to me.  Okay a few things I remembered about her.

SQLMag blog post: What are the seconds per read and seconds per write counters for?

Error Applying SP1 to SQL 2008 R2

In my time working as a SQL DBA, I have been forunate enough to have never counter any type of error while installing or upgrading SQL servers from one version to the next.  Last month, one of the sysadmins were applying SP1 to our new SQL 2008 R2 clusters and ran into failures.  The attempted was made on the passive nodes.  However, all four instances failed to upgrade to SP1, including the active node.

First thing anyone do is basically to look at the error log and look for anything with the word ‘error’ or ‘failed’ in it.  This method found a ton of ‘error’ and ‘failed.’  Apparently the best method is to look for ‘at microsoft.’  This will help you narrow it down faster.  The error message I got was:

2011-10-31 16:38:02 SQLEngine: : Checking Engine checkpoint ‘SQLCluster_ConfigureResourceType’
2011-10-31 16:38:02 SQLEngine: : The source DLL file exists = ‘True’.
2011-10-31 16:38:02 SQLEngine: : Type ‘SQL Server’ exists and DLL is installed …
2011-10-31 16:38:02 SQLEngine: : Type ‘SQL Server’ has 2 resources. Performing upgrade …
2011-10-31 16:38:02 SQLEngine: : The source DLL file is ‘D:\Program Files\Microsoft SQL Server\MSSQL10_50.AASQL\MSSQL\Binn\SQSRVRES.DLL’ and the target is ‘C:\Windows\system32\SQSRVRES.DLL’.
2011-10-31 16:38:02 SQLEngine: : File versions : ['D:\Program Files\Microsoft SQL Server\MSSQL10_50.AASQL\MSSQL\Binn\SQSRVRES.DLL':2009.2009.2500.0], ['C:\Windows\system32\SQSRVRES.DLL':2009.2009.1600.1].
2011-10-31 16:38:02 SQLEngine: : Upgrading resource type ‘SQL Server’ using resource for instance ‘AASQL’ as reference …
2011-10-31 16:38:23 Slp: Configuration action failed for feature SQL_Engine_Core_Inst during timing ConfigRC and scenario ConfigRC.
2011-10-31 16:38:23 Slp: The RPC server is unavailable
2011-10-31 16:38:23 Slp: The configuration failure category of current exception is ConfigurationFailure
2011-10-31 16:38:23 Slp: Configuration action failed for feature SQL_Engine_Core_Inst during timing ConfigRC and scenario ConfigRC.
2011-10-31 16:38:23 Slp: System.ComponentModel.Win32Exception: The RPC server is unavailable
2011-10-31 16:38:23 Slp:    at Microsoft.SqlServer.Configuration.Cluster.ClusterResource.UpgradeResourceDLL(String nodeName, String dllPathName)
2011-10-31 16:38:23 Slp:    at Microsoft.SqlServer.Configuration.SqlEngine.SQLEngineClusterFeature.UpgradeResourceDLL(SQLServiceResource sqlResource)
2011-10-31 16:38:23 Slp:    at Microsoft.SqlServer.Configuration.SqlEngine.SQLEngineClusterFeature.ConfigureSQLEngineResourceType()
2011-10-31 16:38:23 Slp:    at Microsoft.SqlServer.Configuration.SqlEngine.SqlEngineSetupPrivate.Patch_ConfigRC(EffectiveProperties properties)
2011-10-31 16:38:23 Slp:    at Microsoft.SqlServer.Configuration.SqlEngine.SqlEngineSetupPrivate.Patch(ConfigActionTiming timing, Dictionary`2 actionData, PublicConfigurationBase spcb)
2011-10-31 16:38:23 Slp:    at Microsoft.SqlServer.Configuration.SqlConfigBase.SlpConfigAction.ExecuteAction(String actionId)
2011-10-31 16:38:23 Slp:    at Microsoft.SqlServer.Configuration.SqlConfigBase.SlpConfigAction.Execute(String actionId, TextWriter errorStream)
2011-10-31 16:38:23 Slp: Exception: System.ComponentModel.Win32Exception.
2011-10-31 16:38:23 Slp: Source: Microsoft.SqlServer.Configuration.Cluster.
2011-10-31 16:38:23 Slp: Message: The RPC server is unavailable.
2011-10-31 16:38:23 Slp: Watson Bucket 1
 Original Parameter Values

You can find this message in the ‘detail.txt‘ in C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\yyyymmdd_HHmmss\instanceName.  Any detail.txt files after the initial failure won’t have this message!  Based on this error, I started digging around.  I checked the SQL instance and it was running, but when I told it to give me product version, using SERVERPROPERTY(‘ProductLevel’), it showed SP1 instead of RTM.

How the heck did it failed the upgrade but yet made it to SP1?  Having worked and troubleshoot csharp codes/errors in the past, it seem to me that it has something to do with the RPC service timing out while trying to replace the .DLL file.  The sysadmin and I wanted to manually replace the DLL file but to be safe, we contacted Microsoft’s CSS, since we have xx amount of free calls a year based on our agreement.  So I can only guess that, the SP1 patch updated everything but failed at replacing the .DLL file.

After some investigating on their part, they also suggested that we should manually replace the DLL file.  Having done so, the SP1 upgrade went without any additional glitch.  I asked CSS why this happen and how we can avoid it in the future, they said they will look but haven’t heard anything back.

Tomorrow is 11-11-11

Yahoo’s Who Knew segment gave an interesting view on 11-11-11.  I never thought much of it until now.

I personally don’t have a lucky number but it seem that 6 is always following me around.  I see enough of it to think it is stalking me.

In Buddhism, which I was when I was a kid, living in Thailand, but now, not so much, except that I’m into the wheel of life and not wanting to be reborn idea. It said that 6 is:

The six means of perfection or transcendent Virtues of the Buddhist: the patience, the charity, the energy, the wisdom or the science, the contemplation or the charity, the virtue or the purity. 

I am a bit at odd with ‘patience’ as I don’t have much of it.  I like to make and see things move forward.  I do; however, have a ton of patience when I’m calm and one with my surroundings.

Shrinking a log file that doesn’t want to shrink

Yesterday, I was approached by two of the system admins, stating that one of the database they are connecting to is having issue.  They are attempting to connect via a perl script, which I could care less about, not because perl suck or anything, because it doesn’t.  My goal is to keep SQL instances and databases humming along smoothly.

First thing I did was to look at the dataase via SSMS to see if it was in any kind of suspected or recovery mode.  When I opened SSMS, it showed (in recovery), so I looked at the error log, using xp_readerrorlog, and I noticed that the database was opening and closing everything a connection to it is made. Also, it was running a CHECKDB on the database everytime it came back online.  And I think this is why the perl script was failing.  It was timing out to quickly, although, the two sysadmins didn’t give me the error they got, I can only guess.  Having found this out, I went ahead and disabled the auto-closed.  All is well.

You can pretty tell that the database is in auto-closed mode because in the sql error log, you will see a lot of “Starting up database ‘dbname’ ” along with:

SQL server has encountered 4 occurrence(s) of cachestore flush for the ‘Object Plans’ cachestore (part of plan cache) due to some database maintenance …

SQL server has encountered 4 occurrence(s) of cachestore flush for the ‘SQL Plans’ cachestore (part of plan cache) due to some database maintenance …

SQL server has encountered 4 occurrence(s) of cachestore flush for the ‘Bound Trees’ cachestore (part of plan cache) due to some database maintenance …

Further, this database was in Bulk-Logged recovery mode so I performed a database & log backup of this database because the log file was 680GB!  It is a good thing that database and log backup only backup data and not empty spaces.  This morning, the log backup completed and the log file was still at 680GB.  I ran my ‘freespace’ script, which you can download in the scripts and applications section.  It showed that the log file has 99% free space and only 6GB was being used.

I ran my ‘shrinkdb’ script and it didn’t reduce the size of the log file, instead only increased it.  After a few different methods, I just went ahead and changed the recovery mode from Bulk-Logged to Simple, shrink the log file, and return the database to Bulk-Logged.  Lastly, I changed the database growth setting from default growth by % to 1024mb.