Windows Server Core Jumpstart

Recently I’ve been looking into the potential that Windows Server Core holds for our environment. Like most eager new Core users, I imagine, I jumped in with grand visions of spinning up a VM quickly and being off to the races administering it from my desktop. The reality wasn’t quite the same, as I ran into a chicken and egg situation wondering how I could set up the machine when I could not yet connect to it. To complicate the issue, I couldn’t find a concise list of information on exactly what is needed to simply make the machine available so that I could begin to work with it.

With that in mind, I’ve compiled the following information in hopes of saving others the same headache. There’s nothing earth shattering here, but hopefully it will allow people to get started with Server Core quickly so that they can move on to more important things, like how the server will actually be used.

Let me know if you have any questions or suggestions. Hope it’s a help to you.

  • Ports to request from your firewall team.
    •  TCP
      • 5985, 5986 (WinRM)
      • 445 (SMB) –This is up to you. I wanted to be able to move files to/from the server.
      • 135
  • Local firewall rules to allow remote administration.
    • Enable Remote Management groups
      (Note: If you enable “Remote Service Management” on the host first, then you can do the others via PowerShell remoting. This can be helpful since copy/paste in things like VMWare console doesn’t always work.)

Default outbound traffic to allow

Enable Ping (optional)

  • Remote management tools
    • Add the remote computer to Server Manager (available on Windows desktop and server versions).
      • Once added, you can easily launch Computer Management and PowerShell for that specific machine by right-clicking it.
    • Connect via PowerShell remoting.

      • Cross-domain PowerShell Remoting (ie Dev or Test domains)
        • If remoting isn’t enabled on your local machine, enable it.
        • Add machines to the TrustedHosts list. (Depending on your setup, you might have to substitute IP addresses for the machine names in -Value.)

          Verify with:
        • Use PSSession to connect
      • IIS management (run on remote machine)
        • Set HKLM\SOFTWARE\Microsoft\WebManagement\Server\EnableRemoteManagement to 1.
          (This can be achieved using the local regedit tool and connecting it to the remote machine.)
        • Restart the WMSVC service.
        • Connect from local IIS Manager for Remote Administration with the local administrator credentials of the remote machine.
      • You can either use sconfig or the following remote PowerShell commands to allow Remote Desktop. (This is especially helpful for quickly getting to sconfig and other commands that do not operate properly with remote PowerShell.)
  • Common configuration tasks
    • The utility “sconfig” can be used for most setup items.
    • For a more speedy and scriptable setup, below are some common configurations via PowerShell.
      • Change date\time
      • Change computer name
      • Add to the domain

         

 

Using Powershell to Refresh a Test SQL Server Instance from Production

A project I’ve been wanting to work on for a long time is how to automate restores of our Production databases to the Test instance. There are a number of challenges associated with this. First, the restore has to be able to find which backup it needs to use. Secondly, many of our databases do not have their files configured in the same way (for example one may have a single .mdf and log whereas another may have multiple .ndf files). Third, restoring from a different instance causes the SQL Authentication users to be orphaned and therefore connections to the databases to fail. And this is just at the high level. There are many smaller complications along each of those roads. Because of these our restore model has typically been per-request, manually executed, and filled with many small and tedious steps. I wanted to replace it with one process that would restore all of the production databases from the proper backup, configure any “with MOVE” options dynamically, and fix any orphaned users. A secondary goal was also to make it as portable as possible so that I could easily reconfigure it for our other PROD/Test/Dev instances.

The result of this labor of love is copied below. You’ll notice that at the top you can configure the Source and Destination SQL instances to be used, as well as the Data\Log paths for both. This is the section that allows me to reconfigure it for other instances. You can make these parameters that are passed in if you want to call it manually. For my purposes it is a scheduled process that always runs against  specific instance. Following that section is an exclusion list, built to avoid system databases as well as any others you want to skip (use cautiously to avoid angering coworkers in the Development or Testing departments).

The only function is one called “FixOrphans”. If you’ve guessed that’s where the orphaned users are repaired, then you’re today’s winner! This works by pulling a list of available logins from the destination, creating a list of orphans, and using the Source to find which login users should be matched to.  It will also match “dbo” to a different login if that’s the way the Source database was configured. Of course, this breaks down if the logins are not named the same on both instances. This is the case for some of mine as well. In those cases I have a section at the bottom to take care of one-off scenarios by hard coding them. It isn’t ideal, but will have to do until I can change some policies in my organization.

A fun side note about the FixOrphans function. It started as a small section of the script, then quickly grew to become its own function. I became so interested in what could be done with it that I side-tracked for a while and wrote a stand-alone script just for repairing orphans. Being a faithful reader, you will remember it being covered in Fixing Orphaned SQL Users via PowerShell. So, the restore project was actually first and fixing orphans spun out of that. I then took that work and rolled it back into the restore script, so the code will be very similar to that first post.

After that function is declared we move on to the real work. A variable is created to store Source backup information so that we know which backup to use. It then loops through each of the objects stored in that variable. If the database is in the exclusion list it notes that and moves on. Otherwise, it sets a baseline for the restoration query and starts building on that. This is the portion that allows me to account for differing file configurations per database on the instance. For each database it will pull Source information about the files and dynamically build the MOVE statements. At the end it packages them all together and adds any remaining options, such as keep_cdc. After the full statement is built the script will set the database to single user, closing any connections. It will then drop the database and execute the restoration. Dropping isn’t entirely necessary, but our Test environment is often short on space. Dropping first allows me to free up room before starting the restore.

There are two things to note at this section. The first is that, while documentation will tell you that not specifying a timeout value for Invoke-Sqlcmd means that it’s unlimited, that simply isn’t true. Unless you specify one it will die after thirty seconds. Secondly, once the script successfully kicks off that restore command it will truck happily along its way to the next line, whether your restore finishes instantaneously or not (my wager is that it doesn’t). For that reason I built a Do…While loop to monitor for restore commands to finish and then allow the script to proceed. Otherwise it gets awkward trying to set your database to simple when it doesn’t actually exist yet. The commands to set that recovery option and shrink the log file are also both in the interest of saving space.

Once all of that is finished, it’s just a matter of resetting variables so that the next database has a clean run and calling the aforementioned FixOrphans function to give those SQL Auth users a home. After all of the elements in the array have been processed I write an error string that I’ve been compiling to a file and call it a day. Two files will be created. RefreshLog.txt contains all of the operational information, and Errors.txt contains, you guessed it, errors.

As with my other scripts, much of this will be different in your environment than it is in mine. However, I hope that this will be easily adaptable for your use and, if nothing else, that you can find useful elements that can be incorporated into  your own scripts. As always, feel free to send me any questions, and I hope this helps.

P.S. I chose to use Invoke-Sqlcmd so that all versions of SQL, new or old, would be compatible. For newer instances feel free to use SMO.

 

Using PowerShell to Perform Availability Group Failovers

In the past we’ve explored how to use PowerShell to automate failovers on SQL Failover Clusters for Windows Updates and other scheduled maintenance. But what if you’re using Availability Groups instead of the traditional cluster? Fear not, there is still a PowerShell option for that. Usually I would have our night team use SSMS to fail over instances, but recently I transitioned to having them use the PowerShell method below. There are two primary reasons for this. 1) The system administrator does not need to have SQL Server rights in order to carry out the failover and 2) having pre-written commands helps cut down on human error.

For the purposes of this example we have two nodes (AGNODE1 and AGNODE2), each having their own instances (SQLINSTANCE1 and SQLINSTANCE2) that are part of an Availability Group (AGINSTANCE). This will walk through the process of installing Windows Updates on each of those nodes. We will assume that at the outset AGNODE1 is the primary for AGINSTANCE.

  1. Install updates on AGNODE2 and reboot as necessary.
  2. Log into AGNODE2.
  3. Right-click PowerShell and click “Run as Administrator”.
  4. Make AGNODE2 the primary by running the following command: Switch-SqlAvailabilityGroup -Path SQLSERVER:\Sql\AGNODE2\SQLINSTANCE2\AvailabilityGroups\AGINSTANCE
  5. Confirm that the AG instance is now owned by AGNODE2 using this command: Get-ClusterGroup
  6. Install updates on AGNODE1 and reboot as necessary.
  7. Log into AGNODE1.
  8. Right-click PowerShell and click “Run as Administrator”.
  9. Make AGNODE1 the primary by running the following command: Switch-SqlAvailabilityGroup -Path SQLSERVER:\Sql\AGNODE1\SQLINSTANCE1\AvailabilityGroups\AGINSTANCE
  10. Confirm that the AG instance is now owned by AGNODE1 using this command: Get-ClusterGroup

A couple of key points to keep in mind are this: 1) you must run the command from the destination server and 2) all of the confusing syntax is simply specifying the node, instance name, and Availability Group name for that destination server.

I hope this helps, and as always feel free to send me any questions!

Fixing Orphaned SQL Users via PowerShell

In SQL Server, a login is an instance level object that is used for authentication. It is mapped to a database user, which controls permissions at the database level. These objects (login and user) are tied to one another via a SID. If the login is deleted and then recreated, or if you restore your production database to a test environment, those SID’s will no longer match and orphaned users will result.

Information on how to resolve this situation can be found on MSDN. However, if you need to fix more than one user this can be painful. It also does not work for the “dbo” user, requiring you to take additional steps in order to repair both. In the interest of handling refreshes to our development and test instances in a more efficient way I’ve created the script below. It takes two parameters, one for the source instance and another for the destination. It will cycle through the databases on the destination instance and query the source for login information pertaining to any orphaned users. Some informational messages have been built in to help you identify issues, such as a login not existing on your destination instance.

There are a couple of disclaimers. This script assumes that your logins will have the same name on both instances. If your production instance has an account named “sqlsvc” and the test instance equivalent is named “sqlsvc_test”, then it will not sync the user to that login. For the situation I’m working with there is no reliable standard in account names for me to rely on. If your environment is more standardized then please feel free to build in that additional logic.

I hope this will be of help to those out there fatigued by running sp_change_users_login one user at a time. You call the script as “./FixOrphans.ps1 -SourceSQLInstance YourSourceName -DestSQLInstance -YourDestinationName”. If you don’t provide the parameters up front it will prompt you for them.

As always, if you have any questions please let me know.

 

Using PowerShell to Execute SQL Maintenance

It’s an odd truth that laziness leads to better systems administration. That is, so long as it spurs you to automate and thoroughly document a repetitive or tedious task. For instance, I was recently tasked with reducing the excessive size of some system tables in our Microsoft Dynamics CRM environment. To start with, I accomplished this the way you would any one-off task. I RDP’d to each of the app nodes, disabled the service that interacts with the tables we’re performing maintenance on, RDP’d to my utility box, used SSMS to disable each of the SQL Agent jobs that might interfere (on two different nodes), opened the script (provided by Microsoft), tweaked it to my liking, and executed it. The next morning I did all of this in reverse, starting with cancelling the script. For one evening this isn’t really a big deal. However, we soon realized that in order to get the record count down to where we wanted it that several iterations of this maintenance would have to occur over the course of multiple weekends. Reviewing all the steps I’d just performed, my thought was “ain’t nobody got time for that”.

Confronted with performing multiple GUI-based steps during each of these maintenance windows I did what any good/lazy Sys Admin does, I scripted it. Below you’ll find an example of what I used. I run it from PowerShell ISE, executing whichever block is applicable to what I want to do at the moment. This allowed me to go from starting up the maintenance in fifteen minutes to under one minute. (I know, 14 minutes isn’t a big deal. But when you’re tired and it’s late every minute counts.) As I mentioned before, my particular case is CRM maintenance. So basically I disable services on the app nodes, disable SQL Agent Jobs that might interfere (my database is in an Availability Group, so I disable them on both nodes), start the SQL Agent Job containing the Microsoft script referenced above, and then do it all in reverse the next morning at the end of the maintenance window. I included service status checks at the bottom because I’m paranoid and want to confirm the services are actually stopped before starting the SQL script. Also, I did not script the stopping of the job. I always hope (in vain) that the job will have finished, signaling the end of this particular maintenance need. Since both SSMS and the script run from my utilities box I check it in SSMS every morning and simply alt-tab over to ISE after stopping the job to start everything back up.

It’s unlikely that you’ll have the exact situation as me, but hopefully this can give you some ideas for how to incorporate these methods into your own work. In any case I hope this helps, and feel free to contact me with any questions.

 

Scripting Failovers for SQL Server Maintenance

As much as we’d all like to set up our SQL instances and never again let anyone touch them, that’s just not possible in real life. We live in a world where entities, for good or for evil, are constantly exposing vulnerabilities both in the SQL software and Windows itself. For that reason it is the responsibility of good DBA’s to keep their systems updated, and that means failovers.

It is typical in many environments to have a separation of duties, often keeping DBA’s from having control over the Windows Failover Cluster itself. In my particular case I actually have access to the cluster. However, our OCC (Operations Command Center) is tasked with carrying out Windows Updates on a regular basis. Letting them fail instances over means the servers get updated regularly (and also that I have to get up in the middle of the night for one less thing). I needed a way to allow their team members, regardless of experience level, to carry out Windows Updates on our cluster. Enter PowerShell scripting.

The code below is a modified version of what our OCC uses to carry out Windows Updates on our Failover Cluster. (There are options for Availability Groups too, but that’s a post for another day.) In the simplified example below there are six instances (SQL Instance 1-6) balanced across six nodes (sqlnode1-6). To keep things simple, a few assumptions are made in the script:

1. All instances are named with a particular convention. In this case that is “SQL Instance #”.
2. The instance numbers have their permanent homes on nodes of the same number (ie SQL Instance 1 belongs on sqlnode1).
3. The script is run on one of the nodes of the cluster.
4. The script is run in PowerShell ISE section by section, not all at once.

With that in mind I present the code to you below. It is written in a manner such that a Systems Administrator would be reading it as an instruction sheet and carrying out the tasks on your behalf. I am sure there are more elegant and efficient ways to accomplish this task, but this has been successful for us and I hope it can help you in some way. If you have any questions please feel free to comment or contact me on Twitter. Enjoy!