
SQL Server Management Objects – An Update

So, after my previous post, “Adventures with SQL Server Management Objects”, I decided to take a step back and see if I could improve my code. I wasn’t happy with the while(true) loop, as if the developer stars align in a certain way it could mean an infinite loop.

This is Bad™.

Originally I had the idea of using Tasks (from the Task Parallel Library) to monitor the status switching from Idle to Executing, and then from Executing to anything else (signifying failure), so I plugged away at my code for a while, coming to grips with the TPL as I went. However, as time went on, I realised this code was getting messier and messier, because a) I didn’t really know what I was doing, and b) as I got more familiar with Tasks and when/where to use them, I realised I technically didn’t need them (this being a console app, I didn’t really care about blocking the main thread). Not only that, but monitoring the status changes of the Job in this way was fraught with danger, because sometimes (yay consistency) the job can execute and fail in the time between the initial job.Start() and the next job.Refresh(), making the status monitoring flaky at best. So I threw away my changeset and started again (from my previous commit), this time without any usage of Tasks.

This is what I ended up with, edited for brevity:

    /* snip */
    var lastRunDate = job.LastRunDate;

    job.Start();
    job.Refresh();

    // The job.LastRunDate will only change once the job finishes executing, providing a way of monitoring the job
    while (job.LastRunDate == lastRunDate)
    {
        Thread.Sleep(_interval);
        job.Refresh();
    }

    if (job.LastRunOutcome == CompletionResult.Succeeded)
        WriteInfo("Hooray! Job Successful");
    else
    {
        WriteError("Oh Noes! Job failed!");
        WriteError("Job failed with message: " + job.GetLastFailedMessageFromJob());
    }
    /* snip */

By monitoring when the LastRunDate property of the Job object changed, I was able to accurately determine when the job had actually finished its current run. Thankfully, the LastRunDate property only changes when the Job finishes executing (whether it fails or succeeds).

If you want to look at the full changeset, the details are here.

Technically though, this still blocks the main thread. I realised after I finished this that you could wrap the entire chunk of code above inside a Task, using:


Task.Factory.StartNew(() =>
{
    /* monitoring code here */
});

Unfortunately, this doesn’t have any of the CancellationToken stuff implemented in it. And honestly, I’m not entirely sure where it would go, or if it’s even necessary. That’s for another post though 🙂
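If I had to guess in the meantime, it would look something like the rough sketch below (reusing job, lastRunDate and _interval from the snippet above); it’s untested, so treat it as a shape, not a recommendation.

var cts = new CancellationTokenSource();

var monitor = Task.Factory.StartNew(() =>
{
    while (job.LastRunDate == lastRunDate)
    {
        // Bail out co-operatively if the caller has cancelled the wait
        cts.Token.ThrowIfCancellationRequested();

        Thread.Sleep(_interval);
        job.Refresh();
    }
}, cts.Token);

// ... and elsewhere, cts.Cancel() abandons the wait, leaving monitor in the Canceled state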

Thanks for reading folks, have a great night/day.

Adventures with SQL Server Management Objects

Recently, I had an issue where I needed to be able to trigger a SQL Server Job from a Console Application.

Now initially, I had the knee-jerk reaction of hitting the good ol’ ADO.NET SQL libraries, to invoke my overused little friend SqlCommand and his cousin sp_start_job. And to be honest, there is nothing wrong with this approach; more often than not, that’s all you’ll need. However, for my particular situation I needed something a little more powerful, as the problem with invoking sp_start_job in this way is that it’s done in a fire-and-forget manner. You see, sp_start_job DOES have a return value when it’s called, but that only indicates whether the job started successfully, and says nothing about what happens if the job itself fails. Given the client definitely needed to know what the heck was going on behind the scenes (at the very least from a high-level perspective), I went in search of something to fulfil my requirement.
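For reference, the fire-and-forget version looks something like this minimal sketch (the connection string and job name here are just placeholders):

using System.Data;
using System.Data.SqlClient;

var connectionString = "Server=(local);Database=msdb;Integrated Security=true"; // placeholder

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("msdb.dbo.sp_start_job", connection))
{
    command.CommandType = CommandType.StoredProcedure;
    command.Parameters.AddWithValue("@job_name", "MyAwesomeJob");

    connection.Open();
    // Returns as soon as the job has been *started* - it tells you nothing
    // about whether the job subsequently succeeds or fails
    command.ExecuteNonQuery();
}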

And so, on my journey, I discovered SQL Server Management Objects. And I’ll be honest, I cried a little. In happiness. This little ripper of a library has everything you need to ping a SQL Server Agent and all of its Job related goodness in several very useful classes.

Side note: This library could probably be (ab)used in so many other ways, but for today I’ll just focus on the SQL Agent side of things.

You need to add four references to your project:

  • [your install location]\Microsoft SQL Server\[your version]\SDK\Assemblies\Microsoft.SqlServer.ConnectionInfo.dll
  • [your install location]\Microsoft SQL Server\[your version]\SDK\Assemblies\Microsoft.SqlServer.Management.Sdk.Sfc.dll
  • [your install location]\Microsoft SQL Server\[your version]\SDK\Assemblies\Microsoft.SqlServer.Management.Smo.dll
  • [your install location]\Microsoft SQL Server\[your version]\SDK\Assemblies\Microsoft.SqlServer.SqlEnum.dll

The first class I will focus on is the Server class. This represents a connection to a given SQL Server instance, and it’s quite easy to use.

var sqlServer = new Server("(local)");

Now, you can pass in a ServerConnection object (which has its own advantages), but for my purposes, passing in the server instance as a string was acceptable.

And that’s it. The Server object is now ready to accept whatever you want to throw at it! Even better, the connection isn’t open yet, so if we do:

 bool isOpen = sqlServer.ConnectionContext.IsOpen; 

This will return false. So no need to worry about connections hanging around every time you spin up a new Server object.

Ok, so now we have a Server object, let’s spin up the next object of our desire, the Job class. Now I’m going to cheat slightly by retrieving an existing Job on the Agent. This might seem obvious, but the SQL Server Agent service needs to be running when you ping the JobServer property on the Server class, otherwise the code will fail horribly. Not that I’ve encountered this… Nope, not me… Ok, I encountered this 😦

var job = sqlServer.JobServer.Jobs["MyAwesomeJob"];

And now we have a SQL Server Job in a nice, clean object, ready for us to play with! From here you can call the Start() method on the Job object, as well as iterate through the steps of the job. Honestly, the level of control you get over the SQL entities is nothing short of amazing, so I’d advise you to approach with caution.

The Refresh() method on the Job object sends a request to the server asking for the current state of the SQL Server Job in context, and updates the object’s properties to reflect any changes. However, there can be a delay between the Start() method being called and the Job actually starting, so my current workaround is a Thread.Sleep(1000) to give SQL Server time to process the Start() request. Once the SQL Server Agent kicks in and Refresh() is called, the status of the object updates to show the Job is executing (or that it has already failed). There is a better way of doing this than Thread.Sleep, I’m sure, but for now it’ll do. Hmm, I think I’ll put this on my backlog, try to figure out a nicer way of doing it, and share the code when/if I’ve figured it out.

Now, it’s a simple while loop to poll the Job status every second or so. Once the initial Refresh() has picked up the change to JobExecutionStatus.Executing (this should take less than a second), we keep checking until it’s no longer Executing (side note: WriteLine is basically just a Console.WriteLine helper that takes a given colour).

// Give SQL Server a moment to process the Start() request (the
// Thread.Sleep(1000) workaround mentioned above), then grab the job's state
Thread.Sleep(1000);
job.Refresh();

while (true)
{
    if (job.CurrentRunStatus != JobExecutionStatus.Executing)
    {
        if (job.LastRunOutcome == CompletionResult.Succeeded)
            WriteLine("Hooray! Job Successful", ConsoleColor.Cyan);
        else
            WriteLine("Oh Noes! Job failed!", ConsoleColor.Red);
        break;
    }

    WriteLine("Waiting...", ConsoleColor.Yellow);
    Thread.Sleep(1000);
    job.Refresh();
}

Once this is done, we check the LastRunOutcome property, and act accordingly.

If you plan on using this, it might be a good idea to specify a timeout, as this could potentially run forever if something unexpected happens on the SQL end.
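Something like the following would do it; a rough sketch only, and the 30-minute limit is completely arbitrary:

var timeout = TimeSpan.FromMinutes(30); // arbitrary; pick whatever suits your jobs
var waitStarted = DateTime.UtcNow;

while (job.CurrentRunStatus == JobExecutionStatus.Executing)
{
    if (DateTime.UtcNow - waitStarted > timeout)
        throw new TimeoutException("Gave up waiting for job '" + job.Name + "' to finish.");

    WriteLine("Waiting...", ConsoleColor.Yellow);
    Thread.Sleep(1000);
    job.Refresh();
}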

Now, I’ll address pulling out the error details from the Job, should the job fail.

Again, SQL Server Management Objects comes into its own here: finding the error details is a case of calling EnumHistory() on the Job object and filtering the returned DataTable down to the Message column for the correct row. Don’t bother pulling the error message from the job-level entry itself (step 0), as it’s more a high-level “Job failed, was invoked by user X” type message than anything useful. You may as well just output something like “An error has occurred, please contact Support.”, as it’s around the same level of usefulness. Overall, I think it depends on how informed your users need to be (or you want them to be!). It’s not the nicest way of going about things, but at this point it’s just what SMO provides.

In the end, I decided on going with a couple of extension methods for this, as I wanted to keep the flow of the code as smooth (or at least what I think is smooth) as possible.

The first is a method to retrieve the last failed step of the failed job.

public static JobStep GetLastFailedStep(this Job failedJob)
{
    if (failedJob == null)
        throw new ArgumentException("failedJob cannot be null");

    if (failedJob.JobSteps == null || failedJob.JobSteps.Count == 0)
        throw new ArgumentException("failedJob must have at least one job step");

    // Walk the steps backwards, so the most recently-run failed step is found first
    for (int i = failedJob.JobSteps.Count - 1; i >= 0; --i)
    {
        if (failedJob.JobSteps[i].LastRunOutcome == CompletionResult.Failed)
            return failedJob.JobSteps[i];
    }

    return null;
}

The second is the meat of the error-reporting code, and retrieves the error message for the failed step:

public static string GetLastFailedMessageFromJob(this Job failedJob)
{
    if (failedJob == null)
        throw new ArgumentException("failedJob cannot be null");

    JobStep failedStep = failedJob.GetLastFailedStep();

    if (failedStep == null)
        throw new ArgumentException("No failed step found for job " + failedJob.Name);

    // Restrict the history to failed runs of this job, newest entries first
    JobHistoryFilter filter = new JobHistoryFilter()
    {
        JobID = failedJob.JobID,
        OldestFirst = false,
        OutcomeTypes = CompletionResult.Failed
    };

    // EnumHistory() hands back a DataTable; filter it down to the rows for our failed step
    var jobHistory = failedJob.EnumHistory(filter).Select("StepName='" + failedStep.Name + "'", "");

    if (jobHistory.Length > 0)
        return jobHistory[0]["Message"].ToString();

    return string.Empty;
}

So, this changes the snippet of code that reports job failure, meaning this:

WriteLine("Oh Noes! Job failed!", ConsoleColor.Red);

Becomes this:

 WriteLine("Oh Noes! Job failed!", ConsoleColor.Red);
 WriteLine("Job failed with message: " + job.GetLastFailedMessageFromJob(), ConsoleColor.Red);

And there you have it. Some (probably imperfect) code to help you run SQL Server Jobs, monitor them, and report any failures with a fair degree of accuracy. Obviously, this isn’t a one-size-fits-all sort of solution, but it worked quite well for me.

I have also posted my complete solution on GitHub, located here, and welcome any feedback/improvements you can provide.

Good luck, and happy SQLing from C#!

SQL Filegroups & Blob Data: Part 2

Continuing on from this post, here is the SQL used to move our blob data to a different filegroup.

The first step in creating a new filegroup for your data is to create the filegroup itself. This is a logical entity which you associate with a file, so that when you point a table’s data at a particular filegroup, the data is written to that file instead of your primary db file. The syntax is below.

ALTER DATABASE [MyDb] ADD FILEGROUP [BlobData]


This doesn’t actually create the file backing your filegroup; you need to do that yourself as well. Syntax below.

ALTER DATABASE [MyDb] ADD FILE
(
    NAME='blob_data',
    FILENAME='C:\Data\BlobData.ndf',
    SIZE=200MB,
    MAXSIZE=UNLIMITED,
    FILEGROWTH=200MB
)
TO FILEGROUP [BlobData]


You might have noticed the extension of the filename. No, this isn’t a typo; .ndf is the standard extension for secondary data files in SQL Server databases.

Now comes the fun part. At this point you might be in one of three situations (disclaimer: there could be more).

  1. You have a table with no ntext/image data and are just looking to move this simple table to a different filegroup.
  2. You have a table with ntext/image data and need to move this data to a different filegroup.
  3. You have a table with ntext/image data and need to move this data as well as all non-blob data to a different filegroup.

If you are in situation 1 (you lucky thing), moving the table’s data to a different filegroup is a relatively painless operation, starting with dropping and recreating the clustered index for the table (this is usually the primary key).

DROP INDEX [Idx_MyBlobTable_Id] ON [MyBlobTable]
    WITH (MOVE TO [BlobData])


Dropping the index in this fashion moves the data it stores from your primary filegroup to your secondary filegroup. However, you still need to recreate the index: the data has moved to the secondary filegroup, but the index itself no longer exists, and performance could (and most likely will) suffer.

CREATE CLUSTERED INDEX [Idx_MyBlobTable_Id]
    ON [MyBlobTable] (Id)
    ON [BlobData]

This ensures all data created and used by this index will now be located on your secondary filegroup.

Now come situations 2 and 3, and this is where SQL Server is somewhat lacking. If you have ntext/image data on your table currently associated with your primary filegroup, you need to recreate the entire table, specifying the new filegroup for ntext/image data. The only way to do this is to create a new table with the correct filegroup associations, copy all your data from the existing table into the new one, and then drop the old table and rename the new table to the old table name (there’s a sketch of this shuffle further below). This is messy and icky (and yes, that’s a technical term!), but you really don’t have a choice in this case.

CREATE TABLE [dbo].[MyBlobTable_New] -- created under a temporary name; renamed once the data is copied over
(
    Id INT IDENTITY (1,1), -- Identity columns obviously aren't necessary for this to work, this is just the standard for our work
    BlobData NTEXT,
    MoreBlobData IMAGE
)   ON [Primary]
    TEXTIMAGE_ON [BlobData]


The TEXTIMAGE_ON clause defines, for the table in context, where ALL ntext/image data goes for any column within the table. If you are in situation 3, you would change ON [Primary] to ON [BlobData], as the code as it stands splits the table’s data across two filegroups. (Side note: there is nothing wrong with that; SQL Server is actually quite smart about handling this sort of physical split within one logical entity.)
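For completeness, the copy/drop/rename shuffle itself looks something like this sketch, assuming the new table was created as [MyBlobTable_New] as above (column names are illustrative):

-- Copy everything across, keeping the existing identity values intact
SET IDENTITY_INSERT [dbo].[MyBlobTable_New] ON;

INSERT INTO [dbo].[MyBlobTable_New] (Id, BlobData, MoreBlobData)
SELECT Id, BlobData, MoreBlobData
FROM [dbo].[MyBlobTable];

SET IDENTITY_INSERT [dbo].[MyBlobTable_New] OFF;

-- Swap the tables over
DROP TABLE [dbo].[MyBlobTable];
EXEC sp_rename 'dbo.MyBlobTable_New', 'MyBlobTable';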

A caveat with moving data between filegroups in this fashion: if you have any other existing non-clustered indexes, you will need to recreate them on the new filegroup after you have done whatever migration processes you feel necessary.

I hope this serves a purpose for some other poor sucker out there researching this particular area of SQL Server. Thankfully, there is a wealth of information out on the internet, some of which I’ll link below for further reading if you are interested. There are a lot of extra tidbits within these articles, which I highly recommend.

Sources:

MSDN Article on Indexes

MSDN Forum Thread on limitations of SQL Server

SQLServerCentral Article on FileGroups

 

SQL Filegroups and Blob Data

Every now and then I get asked to do something really fun, and outside my normal purview of technology. We had a very large table in our database, filled with the blob data from our document storage, and this table made up about 5/6 of our overall database! Because the .mdf for our database was so large, we decided to do something about it, as it was starting to impact our support tasks, and the nightly restore was blowing out to mid-morning!

We looked at a couple of options:

SQL Server Replication

Snapshot replication:
This was the closest fit to our existing process and at first seemed like the way to go. When I researched it a little further though, I found that snapshot replication locks the database for the entire duration of the process. So given the size of our db, and the fact that the current restore process was already blowing out into business hours, this was definitely off the cards!!

Transactional Replication:
This type of replication pushes each transaction that hits the db to another db. This is good when the data doesn’t change much on the replicated end, but you start to run into issues when changes are made on both ends. Plus, there’s an extra network performance hit, as you are literally doing every transaction twice.

Filegroups

A non-replication option is splitting the relevant table out to its own file and filegroup within the same db. We ended up going with this option, as it gives us a great deal of flexibility in manipulating the data files themselves: we can restore the smaller main chunk of data much more quickly, and simply restore the blob table at a longer interval to take the strain off the network.

Everybody’s happy 🙂

One caveat with the filegroup option is that any existing blob data columns (blob meaning any column with a data type of text, ntext, or image) cannot be moved off their current filegroup onto the new one without recreating the table from scratch. Yuck, I know, but I believe it’s worth the initial pain in the long run.

I will post the (sanitized, obviously) code over the next couple of days, as well as source links from my research, as I’m currently writing this from the after-hours GP in Subiaco on my iPod Touch 🙂

Get well soon Fran!

Night all 🙂
