Installing SQL Server Using Desired State Configuration

(Update: I’ve since discovered that SqlServerDsc has replaced xSQLServer.)

One of my growing passions is using PowerShell Desired State Configuration (DSC) to automate all the things. I started out with simple configurations for testing but wanted to dive into more complex\useful situations for my day-to-day DBA life. To be honest, I was intimidated by the idea of doing a SQL installation. Configuring simple parameters or creating a directory are easy enough to wrap my head around, but something as complex as a DBMS installation gave me pause. I’m here to tell you that my worries were unfounded, and that you should have none as well.

The blessing and curse of DSC is that it’s so new. It is without doubt a very powerful tool, but as of yet there isn’t a lot of documentation around the individual resources. Or worse yet, the pace of improvement moves so quickly that information from two years ago is now out of date. I plan on doing a separate post for how to approach DSC given these realities. With this post, however, I wanted to fill one of those documentation gaps. Specifically, how to install and configure an instance of SQL server. I based my file off of an example one provided by Microsoft in the “Examples” folder of the xSQLServer module named “SQLPush_SingleServer.ps1”. Pro tip: always look for example folders in the modules you want to work with. It should be noted that you can address much more complicated scenarios, such as setting up clusters or Availability Groups, but for simplicity this configuration will be creating a single instance on one node.

If you have experience with DSC or simply don’t want to listen to me drone on about the details, the full configuration is at the bottom. For those interested in the play by play, or just bored and looking for something to do, I’ll address each piece individually.

The script starts out with compulsory documentation on the script and what it does. Kidding aside, get into the habit of doing small sections like this. Your coworkers (and you years from now when you’ve forgotten what you did) will thank you.

Next, we hard-code a couple of items specific to your individual run of the script. List the computer(s) that you want to deploy to as well as a local path for the configuration file that DSC will create.

Following that, we will set how the Local Configuration Manager on the target nodes is to behave. We’re specifying that the configuration is being pushed to it, that it should automatically check every so often for compliance to this configuration and auto-correct anything that’s not aligned, that modules on the node can be overwritten, and that it can reboot if needed.

Following that is the actual configuration details, where all the fun is defined. Mine is named “SQLSA”, but it really doesn’t matter what you name it. This is like defining a function; so as long as you call it by that same name later, little else is relevant. You’ll see at the top of this section there are three “Import-DscResource” lines. This tells the configuration which DSC modules will be needed to perform the actions we’re requesting.

The WindowsFeature item is one of the most handy in DSC. This allows us, as you might guess, to install Windows Features (in this case the .NET Framework).

Next I’ve created a firewall rule to make sure our instance’s port will be open (this is defined later under xSQLServerNetwork). It’s worth noting that there is a resource built into xSQLServer that allows you to configure firewall rules for SQL. However, I did not like the behavior of it and found that xFirewall from the module xNetworking provided a lot more flexibility.

Up next is the actual meat of installing SQL Server. The if($Node.Features) block is something I picked up from the example file. I’d say it’s redundant to check for whether you’re installing SQL when you came here to install SQL, but hey, it works well so I left it.

One way I’ve altered this section from the original is to parameterize everything. If you look further down there is a $ConfigurationData section. Having all of our customizable fields there allows us to easily change them for each deployment (dev, test, prod) without having to search through the code. You and your team will know exactly where to go and what to change for each situation.

I’ve also included some examples of basic SQL Server tasks like creating a database, disabling the sa account, disabling a feature like xp_cmdshell, and configuring the network port (referenced earlier). The naming on these items looks odd but makes sense. By adding in the node name we can ensure that they are unique should we deploy to more than one target node. And adding a friendly name to the configuration item, like “sa”, makes it easy to tell DSC which item depends on which. Speaking of which, note that each of the configurations depends on the base installation. That way DSC will not run those if there is nothing to actually configure.

After the configuration definition we have the $ConfigurationData mentioned earlier. It’s a great idea to get in the habit of using sections like this. It will make your transition between various environments much easier.

The next section details what we’d like the instance name to be as well as what features should be installed. It’s very picky about the feature names, and they don’t line up exactly with a standard command line install. So be careful with what you place here. It won’t install anything incorrectly, just simply cause the configuration not to run and you to lose your mind.

Also in this section, we’re copying over the modules that each node will need in order to perform this configuration. This isn’t necessary when using DSC in pull mode, but that’s a story for a different post.

I know you thought it’d never come, but at last it’s time to actually do something with all of this. We call our “SQLSA” configuration, passing in the $ConfigurationData and specifying to place the resulting .mof file in $OutputPath. After that, configuration is started on each node using Start-DscConfiguration and calling the .mof that was just created. Lastly, the node is tested to make sure it’s not out of compliance.

If all goes well, your output will lack red and eventually will end in a message stating that the configuration tests as “True”.

 

And that’s all there is to it! Not so scary after all. I deployed my first DSC SQL Server while making tea and wondered why I’d been doing it any other way…

 

 

Dynamics CRM Install\Import “The SQL Server {sqlserver_name} is unavailable”

My apologies for the significant gap between my last post on our CRM 2016 Upgrade Adventure and this one. My time has been consumed with preparing for go-live, but I’ve been keeping track of the roadblocks and caveats we encounter as I go so that I can post about them at later dates.

One of the challenges with this deployment that we did not encounter with the last is a significant increase in firewall and context configuration. LU has, to their credit, made great efforts over the last several years to ensure our network is as secure as possible. With an increase in security, however, comes an increase in complexity.

While setting up CRM you might expect to open a port for the SQL instance (ie 1433). It might also occur to you that UDP 1434 should be opened for the SQL Browser Service. Now your app server has a clear line open to the SQL instance. Everything should be ready, so you go to create or import your organization only to encounter “The SQL Server {sqlserver_name} is unavailable”.

You might also encounter a message about not being able to verify that the SQL Agent is running. Being a thorough Sys Admin\DBA you check the SQL services for these and confirm both are up. You also use telnet or another utility to confirm that the ports are indeed open, so what on earth could CRM need in order to reach SQL?

TCP 445… that’s right. Because of the unique setup of CRM it requires TCP 445 to do any kind of setup. What is TCP 445 you ask? “Active Directory service required for Active Directory access and authentication.” (https://technet.microsoft.com/en-us/library/hh699823.aspx). Why an app server would need an AD authentication port opened to the SQL server is anybody’s guess, but it cleared our issue right up. All system checks passed and it happily imported our database.

It should be noted, if you’re using an Availability Group setup then this port will need to be opened to the other servers in the AG as well. I have had the most success when opening it to the AG listener name as well as all nodes.

Bonus Round

If none of this helps you, here are some other things I’ve found are necessary to appease the install\import wizards.

  • Make sure you’re in the local Administrators group on the app servers as well as every node in the SQL cluster or AG (added explicitly, not through a group) .
  • Make sure your account has the sysadmin role on the SQL instance.
  • Specify the SQL server name using the backslash notation, even if the AG name doesn’t contain it. For instance, if your AG is normally accessed as SQLAGINSTANCE,50000 you would use SQLAGINSTANCE\SQLAGINSTANCE,50000 in the wizard. It seems to be hard-coded to only accept it in that manner.

Dynamics CRM Import Fails on “Upgrade Indexes”

As I mentioned in the last post, I’m taking you through our adventure in upgrading the existing on-premise Dynamics CRM 2011 environment to 2016 (and eventually 2016 cloud). Previously I discussed the first show-stopper error we received, “Must declare the scalar variable “@table”.” Following that resolution the import continued past the stage “Metadata xml upgrade: pass 1” but then failed at “Upgrade Indexes”.

Through the use of trace logs obtained by using the CRM diagnostic tool, we discovered that the import wizard was marking a number of indexes to be dropped and then recreated. However, as observed through a SQL Profiler deadlock trace, it was trying to drop and add indexes on the same table at the same time. As I mentioned in my previous post, our database is in excess of 1.5TB. One of the largest tables is ActivityPointerBase, and it’s also one on which many index operations were being executed by the import wizard. The result is that some of the index operations would be chosen as the deadlock victim, causing the import wizard to throw an error and exit. Also, if you restarted the import it would process the entire list again, not taking into account any that it had dropped and recreated already.

My coworker, and local wizard, Bret Unbehagen used the trace logs to determine which tables the import wizard was using to store its index operation information. He then created the query below to produce a list of indexes that it was trying to recreate as well as generate a drop statement for each of those.

So, the basic workflow is 1) let the import wizard proceed until it fails at “Upgrade Indexes”, 2) run the script above against your organization database to list the indexes that it wants to rebuild, 3) use the generated drop statements to preemptively drop those indexes, 4) restart the import so that it continues were it left off (feature of 2013+).

In our experience, this allowed the import wizard to continue through the “Upgrade Indexes” section without deadlocking and proceed with the import. Hopefully it can help you achieve success as well. If you have any questions please feel free to comment. Also, if you’d like to see more from Bret his information is listed below.

Bret Unbehagen (Twitter: @dbaunbe; Web: dba.unbe.org)

Dynamics CRM Import Error “Must declare the scalar variable “@table”

This post is the first in a new series I’m going to call “CRM 2016 Upgrade Adventure”. Summary: my organization has taken on the ambitious challenge of not only upgrading our existing Dynamics CRM 2011 environment to the 2016 version but  of moving it to the cloud service as well. Aside from getting the vanilla components through three versions (2013, 2015, 2016) there are all of the custom integrations that have been tied into CRM over the years that must come along too. That is why it is neither a joke nor hyperbole when I label this as an adventure. We are only in the initial months of this effort and I promise you that plenty of adventure has already been had.

Our first headache… I mean, adventure… was encountered while importing the 2011 database to 2013.  (In order to get to 2016 you have to “hop” your database through 2013, 2015, and 2016 by importing it to each version.) Initially we encountered some messages about incompatible components from the CRM 4.0 days, which our database started out in. That was no surprise. The developers quickly updated our current system and we went to import again assuming the path was clear. What I encountered instead was almost instant failure. Within 20 minutes the import had failed. Knowing this was a process that should take several hours (our database exceeds 1.5TB), I took a look at the logs to see what the issue was. During the stage “Metadata xml upgrade: pass 1” the following was listed:

System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. —> System.Data.SqlClient.SqlException: Must declare the scalar variable “@table”.

I’m sure you can appreciate my confusion since this is a closed-source application written by Microsoft in which I have no option to declare variables. Googling only returned articles about people writing their own applications. Feeling we had no other recourse, we opened up a support ticket with Microsoft. That in itself was quite an adventure that spanned two weeks, but I’ll give you the short version. (Props to my coworker John Dalton for his endless hours on the phone with Microsoft through nights and weekends.) In the end the culprit was a trigger that was on our database, but not just any trigger. This one is named “tr_MScdc_ddl_event” and is created by Microsoft when you enable CDC on a database. After scripting out that trigger to make sure we could recreate it later, then dropping it, the import continued past “Metadata xml upgrade: pass 1” successfully.

TLDR version

Microsoft’s database level CDC trigger tr_MScdc_ddl_event interferes with the Dynamics CRM import operation. Drop that trigger before the import it then add it back once it’s finished and you shouldn’t have issues with this error.

So that’s the end of the adventure right? Everything worked fine after that? Not even close! Stay tuned…

Using Powershell to Refresh a Test SQL Server Instance from Production

A project I’ve been wanting to work on for a long time is how to automate restores of our Production databases to the Test instance. There are a number of challenges associated with this. First, the restore has to be able to find which backup it needs to use. Secondly, many of our databases do not have their files configured in the same way (for example one may have a single .mdf and log whereas another may have multiple .ndf files). Third, restoring from a different instance causes the SQL Authentication users to be orphaned and therefore connections to the databases to fail. And this is just at the high level. There are many smaller complications along each of those roads. Because of these our restore model has typically been per-request, manually executed, and filled with many small and tedious steps. I wanted to replace it with one process that would restore all of the production databases from the proper backup, configure any “with MOVE” options dynamically, and fix any orphaned users. A secondary goal was also to make it as portable as possible so that I could easily reconfigure it for our other PROD/Test/Dev instances.

The result of this labor of love is copied below. You’ll notice that at the top you can configure the Source and Destination SQL instances to be used, as well as the Data\Log paths for both. This is the section that allows me to reconfigure it for other instances. You can make these parameters that are passed in if you want to call it manually. For my purposes it is a scheduled process that always runs against  specific instance. Following that section is an exclusion list, built to avoid system databases as well as any others you want to skip (use cautiously to avoid angering coworkers in the Development or Testing departments).

The only function is one called “FixOrphans”. If you’ve guessed that’s where the orphaned users are repaired, then you’re today’s winner! This works by pulling a list of available logins from the destination, creating a list of orphans, and using the Source to find which login users should be matched to.  It will also match “dbo” to a different login if that’s the way the Source database was configured. Of course, this breaks down if the logins are not named the same on both instances. This is the case for some of mine as well. In those cases I have a section at the bottom to take care of one-off scenarios by hard coding them. It isn’t ideal, but will have to do until I can change some policies in my organization.

A fun side note about the FixOrphans function. It started as a small section of the script, then quickly grew to become its own function. I became so interested in what could be done with it that I side-tracked for a while and wrote a stand-alone script just for repairing orphans. Being a faithful reader, you will remember it being covered in Fixing Orphaned SQL Users via PowerShell. So, the restore project was actually first and fixing orphans spun out of that. I then took that work and rolled it back into the restore script, so the code will be very similar to that first post.

After that function is declared we move on to the real work. A variable is created to store Source backup information so that we know which backup to use. It then loops through each of the objects stored in that variable. If the database is in the exclusion list it notes that and moves on. Otherwise, it sets a baseline for the restoration query and starts building on that. This is the portion that allows me to account for differing file configurations per database on the instance. For each database it will pull Source information about the files and dynamically build the MOVE statements. At the end it packages them all together and adds any remaining options, such as keep_cdc. After the full statement is built the script will set the database to single user, closing any connections. It will then drop the database and execute the restoration. Dropping isn’t entirely necessary, but our Test environment is often short on space. Dropping first allows me to free up room before starting the restore.

There are two things to note at this section. The first is that, while documentation will tell you that not specifying a timeout value for Invoke-Sqlcmd means that it’s unlimited, that simply isn’t true. Unless you specify one it will die after thirty seconds. Secondly, once the script successfully kicks off that restore command it will truck happily along its way to the next line, whether your restore finishes instantaneously or not (my wager is that it doesn’t). For that reason I built a Do…While loop to monitor for restore commands to finish and then allow the script to proceed. Otherwise it gets awkward trying to set your database to simple when it doesn’t actually exist yet. The commands to set that recovery option and shrink the log file are also both in the interest of saving space.

Once all of that is finished, it’s just a matter of resetting variables so that the next database has a clean run and calling the aforementioned FixOrphans function to give those SQL Auth users a home. After all of the elements in the array have been processed I write an error string that I’ve been compiling to a file and call it a day. Two files will be created. RefreshLog.txt contains all of the operational information, and Errors.txt contains, you guessed it, errors.

As with my other scripts, much of this will be different in your environment than it is in mine. However, I hope that this will be easily adaptable for your use and, if nothing else, that you can find useful elements that can be incorporated into  your own scripts. As always, feel free to send me any questions, and I hope this helps.

P.S. I chose to use Invoke-Sqlcmd so that all versions of SQL, new or old, would be compatible. For newer instances feel free to use SMO.

 

Using PowerShell to Perform Availability Group Failovers

In the past we’ve explored how to use PowerShell to automate failovers on SQL Failover Clusters for Windows Updates and other scheduled maintenance. But what if you’re using Availability Groups instead of the traditional cluster? Fear not, there is still a PowerShell option for that. Usually I would have our night team use SSMS to fail over instances, but recently I transitioned to having them use the PowerShell method below. There are two primary reasons for this. 1) The system administrator does not need to have SQL Server rights in order to carry out the failover and 2) having pre-written commands helps cut down on human error.

For the purposes of this example we have two nodes (AGNODE1 and AGNODE2), each having their own instances (SQLINSTANCE1 and SQLINSTANCE2) that are part of an Availability Group (AGINSTANCE). This will walk through the process of installing Windows Updates on each of those nodes. We will assume that at the outset AGNODE1 is the primary for AGINSTANCE.

  1. Install updates on AGNODE2 and reboot as necessary.
  2. Log into AGNODE2.
  3. Right-click PowerShell and click “Run as Administrator”.
  4. Make AGNODE2 the primary by running the following command: Switch-SqlAvailabilityGroup -Path SQLSERVER:\Sql\AGNODE2\SQLINSTANCE2\AvailabilityGroups\AGINSTANCE
  5. Confirm that the AG instance is now owned by AGNODE2 using this command: Get-ClusterGroup
  6. Install updates on AGNODE1 and reboot as necessary.
  7. Log into AGNODE1.
  8. Right-click PowerShell and click “Run as Administrator”.
  9. Make AGNODE1 the primary by running the following command: Switch-SqlAvailabilityGroup -Path SQLSERVER:\Sql\AGNODE1\SQLINSTANCE1\AvailabilityGroups\AGINSTANCE
  10. Confirm that the AG instance is now owned by AGNODE1 using this command: Get-ClusterGroup

A couple of key points to keep in mind are this: 1) you must run the command from the destination server and 2) all of the confusing syntax is simply specifying the node, instance name, and Availability Group name for that destination server.

I hope this helps, and as always feel free to send me any questions!

Fixing Orphaned SQL Users via PowerShell

In SQL Server, a login is an instance level object that is used for authentication. It is mapped to a database user, which controls permissions at the database level. These objects (login and user) are tied to one another via a SID. If the login is deleted and then recreated, or if you restore your production database to a test environment, those SID’s will no longer match and orphaned users will result.

Information on how to resolve this situation can be found on MSDN. However, if you need to fix more than one user this can be painful. It also does not work for the “dbo” user, requiring you to take additional steps in order to repair both. In the interest of handling refreshes to our development and test instances in a more efficient way I’ve created the script below. It takes two parameters, one for the source instance and another for the destination. It will cycle through the databases on the destination instance and query the source for login information pertaining to any orphaned users. Some informational messages have been built in to help you identify issues, such as a login not existing on your destination instance.

There are a couple of disclaimers. This script assumes that your logins will have the same name on both instances. If your production instance has an account named “sqlsvc” and the test instance equivalent is named “sqlsvc_test”, then it will not sync the user to that login. For the situation I’m working with there is no reliable standard in account names for me to rely on. If your environment is more standardized then please feel free to build in that additional logic.

I hope this will be of help to those out there fatigued by running sp_change_users_login one user at a time. You call the script as “./FixOrphans.ps1 -SourceSQLInstance YourSourceName -DestSQLInstance -YourDestinationName”. If you don’t provide the parameters up front it will prompt you for them.

As always, if you have any questions please let me know.

 

Using PowerShell to Execute SQL Maintenance

It’s an odd truth that laziness leads to better systems administration. That is, so long as it spurs you to automate and thoroughly document a repetitive or tedious task. For instance, I was recently tasked with reducing the excessive size of some system tables in our Microsoft Dynamics CRM environment. To start with, I accomplished this the way you would any one-off task. I RDP’d to each of the app nodes, disabled the service that interacts with the tables we’re performing maintenance on, RDP’d to my utility box, used SSMS to disable each of the SQL Agent jobs that might interfere (on two different nodes), opened the script (provided by Microsoft), tweaked it to my liking, and executed it. The next morning I did all of this in reverse, starting with cancelling the script. For one evening this isn’t really a big deal. However, we soon realized that in order to get the record count down to where we wanted it that several iterations of this maintenance would have to occur over the course of multiple weekends. Reviewing all the steps I’d just performed, my thought was “ain’t nobody got time for that”.

Confronted with performing multiple GUI-based steps during each of these maintenance windows I did what any good/lazy Sys Admin does, I scripted it. Below you’ll find an example of what I used. I run it from PowerShell ISE, executing whichever block is applicable to what I want to do at the moment. This allowed me to go from starting up the maintenance in fifteen minutes to under one minute. (I know, 14 minutes isn’t a big deal. But when you’re tired and it’s late every minute counts.) As I mentioned before, my particular case is CRM maintenance. So basically I disable services on the app nodes, disable SQL Agent Jobs that might interfere (my database is in an Availability Group, so I disable them on both nodes), start the SQL Agent Job containing the Microsoft script referenced above, and then do it all in reverse the next morning at the end of the maintenance window. I included service status checks at the bottom because I’m paranoid and want to confirm the services are actually stopped before starting the SQL script. Also, I did not script the stopping of the job. I always hope (in vain) that the job will have finished, signaling the end of this particular maintenance need. Since both SSMS and the script run from my utilities box I check it in SSMS every morning and simply alt-tab over to ISE after stopping the job to start everything back up.

It’s unlikely that you’ll have the exact situation as me, but hopefully this can give you some ideas for how to incorporate these methods into your own work. In any case I hope this helps, and feel free to contact me with any questions.

 

Scripting Failovers for SQL Server Maintenance

As much as we’d all like to set up our SQL instances and never again let anyone touch them, that’s just not possible in real life. We live in a world where entities, for good or for evil, are constantly exposing vulnerabilities both in the SQL software and Windows itself. For that reason it is the responsibility of good DBA’s to keep their systems updated, and that means failovers.

It is typical in many environments to have a separation of duties, often keeping DBA’s from having control over the Windows Failover Cluster itself. In my particular case I actually have access to the cluster. However, our OCC (Operations Command Center) is tasked with carrying out Windows Updates on a regular basis. Letting them fail instances over means the servers get updated regularly (and also that I have to get up in the middle of the night for one less thing). I needed a way to allow their team members, regardless of experience level, to carry out Windows Updates on our cluster. Enter PowerShell scripting.

The code below is a modified version of what our OCC uses to carry out Windows Updates on our Failover Cluster. (There are options for Availability Groups too, but that’s a post for another day.) In the simplified example below there are six instances (SQL Instance 1-6) balanced across six nodes (sqlnode1-6). To keep things simple, a few assumptions are made in the script:

1. All instances are named with a particular convention. In this case that is “SQL Instance #”.
2. The instance numbers have their permanent homes on nodes of the same number (ie SQL Instance 1 belongs on sqlnode1).
3. The script is run on one of the nodes of the cluster.
4. The script is run in PowerShell ISE section by section, not all at once.

With that in mind I present the code to you below. It is written in a manner such that a Systems Administrator would be reading it as an instruction sheet and carrying out the tasks on your behalf. I am sure there are more elegant and efficient ways to accomplish this task, but this has been successful for us and I hope it can help you in some way. If you have any questions please feel free to comment or contact me on Twitter. Enjoy!

Is Optimizing for AdHoc Queries Worth It?

As you probably know, there is a server configuration option called “optimize for ad hoc workloads”. For more details on that option check out the MSDN article. The real question is, how do I know the extent to which my environment uses AdHoc queries? Well, as usual there is a DMV for that. Use the query below to get an idea of how many plan caches are being stored but only used once. The number changes often, so it’s a good idea to baseline your typical number. For instance, I’ve created a SQL Agent job to record the number to a table every ten minutes.  If you maintain high numbers then it might be a good idea to optimize for ad hoc workloads. As always, however, I would advise testing thoroughly first. Happy hunting.

USE Diagnostics

INSERT INTO Diagnostics.dbo.AdHocTracking
	SELECT COUNT (*) FROM sys.dm_exec_cached_plans
	WHERE objtype = 'Adhoc'
	AND usecounts = '1'

Credit to the Red Gate blog for my initial background information.