Running unsupported versions of SQL

As new releases of SQL Server keep coming, old versions keep going away. Lifecycle management is something that, as database professionals, we have to be mindful of and plan for.

If you are running an out-of-support version of SQL Server, then you need to start thinking about what your plan for it is. If you have a choice and need to keep the server running, you should strongly consider moving to a later version of SQL Server. Really, really try to do it. However, in several cases you are held to the limitations of a product, and as much as you would love to go all “Office Space” on it, you value your job more, so it sticks around and you have to deal with it. There are other reasons as well, such as licensing concerns, lack of employee participation, and resource constraints, but I believe these could and should be resolved.


Any time that you or your company make the choice to run on an old version of SQL Server (regardless of the reason), you are taking a risk, and it is important to understand what these risks are and communicate them to your company. At the very least, you want to provide cover for yourself in case something does happen.

Let’s start by looking at the SQL Server product lifecycle.

There are 2 dates that you need to be aware of regarding end of life.

  • End of Mainstream support – Up to this date, Microsoft will fully support the product as well as continue to make changes to product design and features.
  • End of Extended support – Up to this date, Microsoft will fully support the product, but will no longer make changes to product design and features. Anything beyond this date is unsupported software, and Microsoft will only support it if you pay for expensive extended support (which is usually limited to 3 years).
Version                    Release Date   End of Mainstream   End of Extended
SQL Server 2005 (SP4)      12/13/2010     4/12/2011           4/12/2016
SQL Server 2008 (SP4)      7/7/2014       7/8/2014            7/9/2019
SQL Server 2008 R2         7/20/2010      7/10/2012           7/10/2012
SQL Server 2008 R2 (SP3)   9/26/2014      7/8/2014            7/9/2019
SQL Server 2012            5/20/2012      1/14/2014           1/14/2014
SQL Server 2012 (SP3)      12/1/2015      7/11/2017           7/12/2022
SQL Server 2014            6/5/2014       7/12/2016           7/12/2016
SQL Server 2014 (SP2)      7/14/2016      7/9/2019            7/9/2024
SQL Server 2016            6/1/2016       1/9/2018            1/9/2018
SQL Server 2016 (SP1)      11/16/2016     7/13/2021           7/14/2026
SQL Server 2017            9/29/2017      10/11/2022          10/12/2027

If you have something about to come out of support, then you need to start planning for that. I recommend starting to plan and work through upgrades as soon as you can, and at minimum a couple of years in advance of end of extended support.
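A quick way to see exactly what you are running is to query the server properties. This is a minimal sketch using standard SERVERPROPERTY values; compare the result against the lifecycle table above.

```sql
-- Check the exact build and service pack level of the instance
SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion,  -- e.g. 13.0.5026.0
       SERVERPROPERTY('ProductLevel')   AS ProductLevel;    -- e.g. SP2
```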

So, tell me why running old versions of SQL Server is a concern and what I need to communicate.

  • Security – Once a version reaches end of extended support, Microsoft is no longer obligated to release security patches. So as new security attacks happen (insert most recent security attack), this leaves your SQL environment at risk. And not only are our databases at risk, but every application that relies on those databases is also susceptible to data theft and corruption.
  • Compliance issues – Several regulations such as HIPAA, SOX, GDPR, etc. require that you implement procedures for detecting, guarding against, and reporting malicious software that targets sensitive data. It is the responsibility of the company to take the appropriate technical and organizational measures. If you are using software that is no longer supported by Microsoft, then you may be out of compliance.
  • Maintenance support and\or costs – Once end of extended support has occurred, Microsoft is no longer supporting these versions, so in the event of any product-related issues you will not have support, or you may be required to pay Microsoft a significant amount of money to get it. Additionally, if you choose to accept the risks and try to isolate these servers within your environment (network segmentation), then you will incur higher maintenance costs, because you must maintain legacy servers and firewalls and perform unique setups to try to guard against intrusions.
  • Insufficient performance – The information landscape has changed significantly over the years. Depending on the version of SQL Server you are running, you could be using a piece of software that has not received enhancements in nearly a decade. Think of all the changes that have happened in the information landscape since then, and think about the sophistication of today’s cyber-attacks. The features that were enough then just do not meet the needs of current IT environments.
  • Numerous product improvements – There is a long list of benefits that comes with an upgrade that can positively affect your applications and user experience that you should take a look at.  

Be sure to understand the risks, tell the story of what continuing to run on this product means, and do everything that you can to try to get to a later version.

Tracking connections on your SQL Instance

Many times throughout my career, I have needed to complete tasks on databases that there was little to no information about. I often refer to these little guys as mystery databases. Whether it be migrations, decommissions, troubleshooting application issues, or anything else, it can sometimes be hard to figure out who or what is using one of these mystery databases. Even speaking to the people you think may know (or should know) the answers can prove fruitless. So how do you handle that?

Yes, you can run one of the various scripts (such as sp_who or sp_whoisactive) to see who\what is currently connected, but how confident can you be that is the complete list? What if an application only connects periodically? If you have little to no information on the database, can you know for sure? And yes, you could shut it down and see what breaks and who screams, but is this the most effective way?
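For reference, a point-in-time check looks something like this sketch against sys.dm_exec_sessions (database_id on this DMV is available in SQL 2012 and later). It only shows the current moment, which is exactly the limitation described above:

```sql
-- Who/what is connected RIGHT NOW -- a snapshot only, not history
SELECT DB_NAME(database_id) AS Database_Name,
       login_name,
       host_name,
       program_name,
       COUNT(*) AS SessionCount
FROM sys.dm_exec_sessions
WHERE is_user_process = 1
GROUP BY DB_NAME(database_id), login_name, host_name, program_name;
```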

I got tired of seeing blank stares and hearing “I don’t know” from application owners and developers, so I created a process that lets me track connection history. This allowed me to answer these questions on my own, or at least have more information available when talking to other teams. What I have found since we started running this process is that it is one of the most widely used tools on my team, especially once it has been running for months and that history is available.

The process is just a proc that pulls a distinct list from master.dbo.sysprocesses based on instance name, database name, login, host name, and program name and loads this data into a table. Each time it runs, a counter column is incremented by 1 for each distinct connection found, and a timestamp column is updated with the most recent time it was seen. I run this proc via a SQL Agent job on all of my servers on a 3-minute interval. The task is lightweight, so you can run it as often as you like, and the table should stay relatively small.

Let’s take a look at what you need to do to set this up. For the sake of this article, I am going to put it in the master database; however, I normally use a user-created DBA database that exists on all my servers for any objects I need in order to manage a server.

1. Create a table that you plan to log the data into.

USE [master]
GO

CREATE TABLE [dbo].[SQLConnections](
	[SQLConnections_ID] [int] IDENTITY(1,1) NOT NULL,
	[Instance_Name] [nvarchar](50) NOT NULL,
	[Database_Name] [nvarchar](100) NULL,
	[loginame] [nvarchar](128) NULL,
	[hostname] [nvarchar](128) NULL,
	[program_name] [nvarchar](128) NULL,
	[LastLogin] [datetime] NULL,
	[ConnectionCount] [bigint] NULL,
 CONSTRAINT [PK_SQLConnections] PRIMARY KEY CLUSTERED 
(
[SQLConnections_ID] ASC
))

2. Create the proc

USE [master]
GO
CREATE PROC [dbo].[sp_LoadCurrentConnections]

AS

-- Create a Temp Table to hold the current connections

IF OBJECT_ID('tempdb..#SQLConnections') IS NOT NULL DROP TABLE #SQLConnections
CREATE TABLE #SQLConnections
(Instance_Name nvarchar(50),
Database_Name nvarchar(100),
loginame nvarchar(128),
hostname nvarchar(128),
program_name nvarchar(128),
LastLogin datetime,
ConnectionCount bigint)

-- Capture the current connections into the temp table

INSERT INTO #SQLConnections
(Instance_Name, Database_Name, loginame, hostname, program_name, LastLogin, ConnectionCount)
SELECT @@Servername as Instance_Name,
DB_NAME(dbid) as Database_Name,
loginame,
hostname,
program_name,
MAX(login_time) as LastLogin,
1 as ConnectionCount
FROM sys.sysprocesses
GROUP BY DB_NAME(dbid), loginame, hostname, program_name

--- Update any existing data with latest login and add to count

UPDATE SC
SET SC.LastLogin = SCS.LastLogin,
SC.ConnectionCount = SC.ConnectionCount + 1
FROM #SQLConnections SCS
INNER JOIN master.dbo.SQLConnections SC ON SC.Instance_Name = SCS.Instance_Name
AND SC.Database_Name = SCS.Database_Name AND SC.loginame = SCS.loginame
AND SC.hostname = SCS.hostname AND SC.program_name = SCS.program_name

--- Insert any new rows into the table that do not exist

INSERT INTO master.dbo.SQLConnections
(Instance_Name, Database_Name, loginame, hostname, program_name, LastLogin, ConnectionCount)
Select SCS.Instance_Name, SCS.Database_Name, SCS.loginame, SCS.hostname, SCS.program_name, SCS.LastLogin, 1
FROM #SQLConnections SCS
LEFT JOIN master.dbo.SQLConnections SC ON SC.Instance_Name = SCS.Instance_Name
AND SC.Database_Name = SCS.Database_Name AND SC.loginame = SCS.loginame
AND SC.hostname = SCS.hostname AND SC.program_name = SCS.program_name
WHERE SC.Instance_Name is null

Note: You could probably use one of the other commonly used scripts out there and just log that data to a table as well. This script is simply what I use, since it's simple and accomplishes what I need it to.

3. Create a SQL Agent job to execute the stored proc on the schedule of your choosing. (Note: I typically run it every 3 minutes)
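If you prefer to script the job rather than use the GUI, a sketch using the msdb job procedures might look like the following. The job and schedule names are just placeholders:

```sql
USE [msdb]
GO
EXEC dbo.sp_add_job @job_name = N'DBA - Load Current Connections';

EXEC dbo.sp_add_jobstep @job_name = N'DBA - Load Current Connections',
     @step_name = N'Run proc',
     @subsystem = N'TSQL',
     @database_name = N'master',
     @command = N'EXEC dbo.sp_LoadCurrentConnections;';

EXEC dbo.sp_add_jobschedule @job_name = N'DBA - Load Current Connections',
     @name = N'Every 3 minutes',
     @freq_type = 4,               -- daily
     @freq_interval = 1,
     @freq_subday_type = 4,        -- units of minutes
     @freq_subday_interval = 3;    -- every 3 minutes

-- Attach the job to the local server
EXEC dbo.sp_add_jobserver @job_name = N'DBA - Load Current Connections';
```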

After everything is set up and running

Once you have the process running, you have a historical record of what is connecting to your servers and an idea of how often it is connecting.

Here is an example output from a server that I manage (names have been updated to generic values) that will give an idea of the result set.

[Screenshot: example connection history output]

From this table, you can see each connection being made to your databases and how often it is connecting. There are several things you can do with this information, and you may find that it comes in handy a lot more than expected. Anytime you need to see what connects to a database, this will be your go-to place.
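When you need to answer the “who uses this database?” question, a lookup against the history table is all it takes. The database name here is just a placeholder:

```sql
-- Everything that has ever connected to a given database, most recent first
SELECT loginame, hostname, program_name, LastLogin, ConnectionCount
FROM master.dbo.SQLConnections
WHERE Database_Name = 'MysteryDB'   -- placeholder name
ORDER BY LastLogin DESC;
```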

Getting the most from your SQL licensing

If you have a good understanding of SQL licensing (or read my previous blog post), then there are several things you can do to optimize your environment and make sure you are getting the most from your licensing. In this article, I will go through things you should be doing to make sure you are in compliance, but also things you can do that present significant cost savings to your organization.

I have worked with or for several different organizations throughout my career and have been able to help them discover millions in licensing savings and compliance issues. The information below is the process that I have gone through with each of them to discover this.

Make sure you are selecting the correct licensing model.

I would encourage you to read Understanding the Basics of SQL licensing if you have not done so already or if you are unsure. In my experience, the most costly mistake that people make is licensing physical hosts when they should be licensing VMs. I have seen several cases where this amounts to millions of dollars in cost difference. Licensing by host certainly has its place, but make sure you understand what that is. And if you are not sure which is best, then take the time to do the math and see which way is the most cost effective.

Correct any issues with resource over-provisioning.

Take a look at the servers running in your environment and see if they are actually using all of the resources that are allocated to them. The most important thing to pay attention to is CPU since you are licensing based on cores. The second most important thing to look at is memory and that is because this alone could affect whether you need Enterprise or Standard. Then you still may want to look at other resource metrics that could have an impact on your CPU.

    • CPU – Check your average CPU utilization over time (especially during critical processing hours) and see what it is running at. Do not just compile an average number, but look at the CPU usage over a period of time so you can understand your spikes. My general recommendation is to try to keep average CPU somewhere around 30%. If it's way under that, you may be over-provisioned, and if you are way over that, you could be experiencing performance issues as a result. What I am typically looking for here is obvious signs; for example, if you never extend beyond 5% utilization on an 8-core server, you may be over-provisioned and paying for something you are not using.
        • Remember that SQL Server has a 4-core licensing minimum (licenses are sold in 2-core packs), so running 2 cores costs the same as 4 cores from a licensing perspective.
    • Memory – While memory is not part of the licensing guidelines, it can play a supporting role, because it can define what edition you need. I have seen customers purchase Enterprise edition due to the memory resource constraints of Standard edition (remember the 64GB memory limit in SQL 2012?). I have also seen examples where this was not needed and the company effectively over-bought. When you look at this, do not just look at memory utilization, which is typically high, but look at internal SQL memory counters such as Page Life Expectancy and Buffer Cache Hit Ratio. You want to get an idea of how often SQL is able to read from memory vs. disk and determine if it is acceptable for performance.
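To check the internal memory counters mentioned above, you can query sys.dm_os_performance_counters. Note that Buffer cache hit ratio has to be computed against its base counter; this is a sketch:

```sql
-- Page Life Expectancy: seconds a page stays in the buffer pool
SELECT cntr_value AS PageLifeExpectancy
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%';

-- Buffer cache hit ratio must be divided by its base counter
SELECT 100.0 * r.cntr_value / NULLIF(b.cntr_value, 0) AS BufferCacheHitRatio
FROM sys.dm_os_performance_counters r
JOIN sys.dm_os_performance_counters b
  ON b.object_name = r.object_name
 AND b.counter_name = 'Buffer cache hit ratio base'
WHERE r.counter_name = 'Buffer cache hit ratio'
  AND r.object_name LIKE '%Buffer Manager%';
```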

Make sure you are running the correct edition of SQL.

Oftentimes companies will run Enterprise edition in situations where it is not needed. Make sure if you are running Enterprise edition you have a valid reason to do so.

NOTE: Microsoft does not have a supported downgrade path, so if you need to downgrade you will need to do an uninstall\reinstall.

My rule of thumb is to run the lowest edition of SQL that you can while still meeting business needs. Of course, Enterprise is nice and beneficial, but is it actually needed?
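A quick way to audit this is to check the edition and then, per database, whether any Enterprise-only features are actually in use. The DMV sys.dm_db_persisted_sku_features lists features that would block a move to a lower edition:

```sql
-- What edition is this instance running?
SELECT SERVERPROPERTY('Edition') AS Edition;

-- Run in each user database: lists edition-specific features in use
-- (an empty result means nothing is blocking a downgrade to Standard)
SELECT feature_name
FROM sys.dm_db_persisted_sku_features;
```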

Avoid running applications on the same server as the database.

If at all possible, the database should always be on an isolated server. You do not want to be paying expensive licensing costs only to be having some other application consuming those resources.

A good check in this situation is to look at the memory and CPU consumed by SQL and by everything else running, and view what the difference is. If there is a wide margin, you need to figure out what is going on and see if it is something that can be fixed.
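A starting point for that comparison is to look at how much memory the SQL Server process itself holds versus the box total; anything in between is being used by something else. This sketch uses DMVs available in SQL 2008 and later:

```sql
-- Memory held by the SQL Server process
SELECT physical_memory_in_use_kb / 1024 AS SQLMemoryMB
FROM sys.dm_os_process_memory;

-- Total and available memory on the box
SELECT total_physical_memory_kb / 1024     AS TotalMemoryMB,
       available_physical_memory_kb / 1024 AS AvailableMemoryMB
FROM sys.dm_os_sys_memory;
```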

Look for opportunities to virtualize your physical database servers. 

There is a good chance that if you have a physical database server in your environment, the workload on it has changed at some point, for better or worse. Databases and applications are retired, new ones are brought on, processes get moved, or any number of other things can have an impact on the server. With physical servers, you are more limited in your ability to grow or decrease resources (CPU\Memory); while you can do it, it takes provisioning or removing hardware. With a virtual server, you typically have an easier path to scalability (whether up or down). I am not saying you should virtualize everything, but it is something to consider when looking at your environment.

Spend time performance tuning where you can.

This encompasses a lot of stuff from server tuning, query tuning, index tuning, application tuning, etc. If you can perform work to reduce the CPU load then you can reduce the number of cores and ultimately reduce licensing. This is a loaded topic that deserves blog posts of its own, but there are several great resources out there to assist with this.

My experience with most of the customers or companies I have worked with is that most of the performance tuning that goes on is reactive, meaning someone is having a problem with something being slow or not performing up to standard. But I would argue that proactive performance tuning is also very important and can come with significant cost savings. It's just something that I do not think gets enough prioritization.

For example, I was once working on a customer's server that had 24 cores and was running Enterprise edition. Over a short period of time, I was able to tune it to the point where we could cut the core count in half, down to 12. Looking at retail numbers, this reduced the licensing cost on this server from $168K down to $84K, saving $84K in the process. While this may not always be the case, you probably have a lot of things that you could reduce a little, and a little can add up to a lot.

Look for consolidation opportunities, especially on servers that do not meet the minimum licensing core count.

Look at your environment and look for opportunities to consolidate databases. Not every database deserves or needs to be running on its own server.

There are a few things that you need to consider when looking at consolidation opportunities.

    • You need to make sure the database is not required to be on its own server; there are different scenarios where this may be required.
    • Check to see which servers do not meet the minimum licensing core count. These are prime candidates for consolidation, because they are costing you money that you are not getting any return on.
    • Look at resource utilization on your existing servers to see if they would be better combined.

I think this is a topic that can be expanded on greatly, and I plan to do so, because there is a bit of art to how you handle this. Consolidations can trigger a conversation of cost vs. ease of management. I do not recommend changes based on cost alone; you should always factor in the level of difficulty to manage. More to come on this topic.

Define and execute on a decommission process.

If something is not being used try to get it removed as soon as you can to free up those licenses.

Oftentimes servers stop getting used, yet they sit out in the environment taking up licenses, and you continue to pay for maintenance on them. I suggest performing occasional audits to see if a server is still being used (if you are unsure). I will be creating a new blog post soon that details how you can do this.

Determine if SQL licensing was included in the purchase of an application.

If you are using 3rd party vendor products, you may want to double-check whether the SQL licenses are included with the product, especially if the vendor is supporting the product for you and you are just hosting it. You do not want to be paying for a license a second time over. If you are unsure, check your contract or check with the vendor. I would warn against assuming, as I have been surprised before.

Keep in mind that if the licenses are included, they usually come with certain restrictions, so please work with your vendor or licensing manager to check this. Commonly, the only thing you are allowed to run on the server is their product, so running on a shared instance may void the license.

Wrap up:

SQL licensing is expensive and there is a lot that can be done to reduce these costs. I have worked with companies to save millions by going through the things listed above and you can too.

How to create dynamic data sources in SSIS

I use a SSIS package to gather information for all my SQL Servers and load it into a single repository on my DBA management server. Information such as database info, security info, HA info, version info, etc. This fully automated repository contains a ton of valuable information that allows me to manage my environment easier and from a central location (I will blog more about this process in the future).

For now, I just want to share a technique that I use to be able to dynamically build data source connections to each of my servers. You can use this to do some really cool things if you are wanting to automate pulling the same data from multiple servers.

Let's start with an example: I want to pull all information stored in master.sys.databases from all my servers into a single location, and I want to do this for 100 servers. You could certainly create 100 different data source connections and have 100 different SSIS tasks, but that is crazy, right? You could also create an individual package for each of the 100 servers, but still craziness! What I am going to show you is how you can dynamically build your data flow, so you only need 1 task per queried item. In this example, we are pulling just database information, so we need just 1 task created in the Control Flow.

Step 1: You need to have an account created that has the necessary access to all the SQL Instances you are going to be querying. In our example, the account will need read access to the master database or specifically just master.sys.databases. However you grant the access, you will need to make sure the permissions are the same on each server.

NOTE: If you are running this locally from your PC, the package may run in the context of your own account. If you are a DBA and have sysadmin rights on each server, it will probably work without any permissions needing to be assigned. However, when you deploy this to run from a server, it will run under the context of that server's service account or under a specified proxy account. I recommend creating a proxy account with a specified account (over using the service account) to access each of your servers, and only granting it the least access needed.
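For reference, creating a credential and Agent proxy can be scripted along these lines. The account name and password are placeholders, and the subsystem ID shown is the SSIS package execution subsystem in recent versions; check sysssubsystems on your instance:

```sql
USE [master]
GO
-- Windows account the packages will run as (placeholder name/password)
CREATE CREDENTIAL SSISProxyCredential
WITH IDENTITY = N'DOMAIN\svc_ssis_readonly',
     SECRET = N'StrongPasswordHere';
GO
USE [msdb]
GO
EXEC dbo.sp_add_proxy @proxy_name = N'SSISProxy',
     @credential_name = N'SSISProxyCredential';

-- Subsystem 11 = SSIS package execution
EXEC dbo.sp_grant_proxy_to_subsystem @proxy_name = N'SSISProxy',
     @subsystem_id = 11;
```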

Step 2: Create a new blank SSIS package or use an existing one. In our case, I am going to create a new one.

Step 3: Create a Connection to the server you want the data to be loaded into. In my case, I am wanting the data to be loaded into the DBServers database on one of my servers. Because I am a sysadmin I do not need to grant any additional access, but I would need to later for the proxy account. You can connect via whatever account you have setup.

In my example, the connection is going to be called DestinationServer.DBServers

[Screenshot: DestinationServer.DBServers connection]

Step 4: Create a new string variable called Servername. Set the Value to your destination server name you just used.

[Screenshot: Servername variable]

Step 5: Create a new data source that will serve as your target. The data source will be defined based on what is passed into the variable, but for the sake of creating something we have to put some information in here. For the server name, just use your destination server name and select any database. In my case, I am going to have it point to the master database since that is where the database information we are pulling from is. In my example, I am calling this connection TargetServer.master

NOTE: If you are using this technique, I recommend using 3-part names (Database.schema.object) in your code. The reason is that you may be connecting to different databases on a server to get information, and you may not want to create individual data source connections for each of them.

Step 6: Once your Target server connection is created, go into its properties and edit the Expression to connect it to your variable. Enter @[User::Servername] as the expression. You can hit the evaluate expression button, and you should see your destination server name (or whatever server you used for the target).

[Screenshot: connection expression set to @[User::Servername]]

Step 7: Check to make sure that your Target Server connection now shows an expression next to it.

[Screenshot: Target Server connection showing an expression]

Step 8: We are now ready to start creating the data task to pull database information.  Go ahead and create a new Foreach Loop Container. We will call this “Cycle through Servers”.

Step 9: Create a Data Flow Task in your Foreach Loop container that will perform the data pull\load. We will call this “Load Database information”

[Screenshot: Foreach Loop container with Data Flow Task]

Step 10: Go ahead and open your Data flow task and create your data flow. This is typical setup of pulling data from one server and loading it into another. I am just using a query to pull data from master.sys.databases and loading it into a table on my destination server. Nothing special about the data flow, just set it up like you were doing this for 1 server.

[Screenshot: data flow from source query to destination table]

Step 11: Now you need to create the list of servers you want the Foreach Loop Container to cycle through. There are 2 options here. One lets you store the names of your SQL Instances directly in the package, but I do not think that is a very manageable option. The other pulls the instance names from a table in a database. I am going to show you how to build your connections based on table values, since this is a lot more manageable and lets you change connections without editing the package each time.

Step 12: I already have a table that stores all of the SQL Servers in our environment. If you do not have one, I would suggest creating one in a location that you can manage. My table is stored within my destination server database (DBServers), so I can query that to get my list of servers. For the sake of this example, I am going to pull all servers, but you could create different sets to filter out certain things such as prod\non-prod, certain versions of SQL, HA, etc.
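If you need to create a server inventory table from scratch, a minimal sketch (table and column names here are illustrative) could look like this, along with the query the package would use:

```sql
USE [DBServers]
GO
CREATE TABLE dbo.SQLServers (
    SQLServers_ID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Instance_Name nvarchar(128) NOT NULL,
    Is_Production bit NOT NULL DEFAULT 0,   -- example filter flag
    Is_Active bit NOT NULL DEFAULT 1
);

-- Query used later by the "All Servers" Execute SQL Task
SELECT Instance_Name
FROM dbo.SQLServers
WHERE Is_Active = 1;
```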

Go ahead and create a new variable called AllServers and set the Data type to Object.

[Screenshot: AllServers variable with Object data type]

Step 13: On your Control Flow, create an Execute SQL Task. We will call this “All Servers”. This task will pull the list of servers we want to connect to.

Step 14: Open the properties for the Execute SQL Task you just created and define your connection. In my case, it is the same as my destination server, but if you have it in a different location then it could be different. Fill out the rest of the information, including the query you are using to pull your instance names.

[Screenshot: Execute SQL Task properties with connection and query]

Step 15: While still in the properties, go to the option in the left-hand pane called Result Set. You need to add a line to set the Result Name to the variable you created. It should look like this; the Result Name is always 0.

[Screenshot: Result Set mapping, Result Name 0 to User::AllServers]

Step 16: Go back to your Control Flow screen and open the properties for your Foreach Loop container (Cycle through Servers). Go to the option in the left-hand pane called Collection and click it to open the options there. At the top of the Foreach Loop Editor, change the Enumerator to Foreach ADO Enumerator. Under the ADO object source variable, change this to User::AllServers. Leave the other options as the default. It should look like this.

[Screenshot: Foreach Loop Editor Collection settings]

Step 17: Resolve any errors that the package may be giving you. Once they are resolved, you are ready to run the package. If you are able, go ahead and test it. As you watch the “Cycle through Servers” task, you should see it execute and complete multiple times as it cycles through all the servers assigned to the AllServers variable. The task will pull all database info from every server defined in the table and load it into a table on the destination server.

[Screenshot: package execution cycling through servers]

Step 18: Finish creating anything else that you want, and then you can deploy and schedule the task to run. Just remember that if you do, you will want to make sure the account the process runs under has the appropriate permissions.

As mentioned earlier, I use this functionality on a lot of the data we compile in a single location. I have also created several variables that define server lists based on flags or settings in my database, and I have lots of data flow tasks that use these. If you are trying to centralize a lot of your management data, you can get really creative in how you pull it.

Database Design Considerations

While working for a previous development company, I was asked to provide a list of high-level best practices for database design. For many of the developers I worked with there (and at many places in the past), SQL development is secondary to a primary language such as C# or another .NET language. As a result of this, and of oftentimes being under pressure to produce results quickly, many things slip through the cracks, or are not identified as a problem until it's too late, at which point they inevitably get added to the “technical debt” black hole.

These things were not meant to be the “end all be all” but a place to start and to get developers thinking early on.

  1. Plan your design
  • Understand your database’s purpose – Before you begin anything, you have to understand what you are building and why you are building it.
  • Requirement Gathering – What resources do you need? People, Infrastructure, Tools, etc. What size do you expect this to be? Is this heavily utilized or not? Is this business critical? Etc. While you may not be able to answer every question, you should have an idea.
  • Determine usage patterns – What type of database are you building (OLAP\OLTP, Normalized\De-normalized, etc.)? How do you expect that this database will be used?
  • Draw out your design – Draw out\Document your design somewhere. You may not know everything you need, but you can create a base to build from. Your design should be flexible so that you can easily add to it.
  • Get input from others – Some of the best design comes from working and discussing with others. Utilize the experience and knowledge of others
  • Consider all database options – Are there specific features or changes you want to make before getting started? For example, changing isolation levels, using compression, how to handle deletes, etc. These are things that are easy to implement in the beginning and much harder to do later. Think ahead about things you may need.
  2. Documentation
  • You should have adequate documentation that showcases your design and clearly communicates it to others. You should provide enough information that when you turn the database over to someone else to support it, they can figure out bugs and fix them.
  • By carefully naming your database objects, you can provide a lot of self-documentation on your database. A good naming convention will make it clear to anyone what your database is modeling.
  • Here are some reasons why documentation is important
    • Provides a common language between the business and IT
    • Makes it easier to locate areas of potential issues
    • Chances of making bad decisions are lower because risks are easier to assess
    • Makes maintenance\outages easier to manage and reduces risk
    • Acts as mediator between newcomers and existing staff
    • Reduces the likelihood of misunderstandings
  3. Create a naming standard and commit to it
  • There are several different naming conventions and many have their own style which is fine. However, the most important thing is to be consistent in how you name things and keep things readable. A good naming convention will make a database very readable for anyone who supports the system after you.
  • The following are some guidelines when creating your naming convention
    • Use unique descriptive names that make sense
    • Make your different objects have distinguishable names. You should be able to tell the difference between a proc, a table, a view, a function, etc. based on the name alone.
    • Avoid using just “ID” as the PK of each table. This makes the database harder to read and requires aliasing when reading multiple ID columns from multiple tables
    • Beware of using SQL Server reserved words in object names (ie. Date, User, Case, etc.). This can create syntax issues if brackets are not used.
    • Avoid using hyphens, spaces, quotes, etc. in object names
  • The following is an example of naming conventions for a few objects
    • Primary Key – PK_
    • Stored Proc – sp_ (note that SQL Server reserves the sp_ prefix for system procedures and checks master first, so some shops prefer usp_)
    • View – vw_
    • Function – fn_
    • Index – ix_
    • Default Constraint – DF_
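As an illustration, the convention above applied to a few hypothetical objects might look like this (the table, column, and object names are made up for the example):

```sql
-- Hypothetical objects illustrating the prefix convention above
CREATE TABLE dbo.Customer
(
    CustomerID   INT IDENTITY(1,1) NOT NULL,
    CustomerName VARCHAR(100) NOT NULL,
    CreatedDate  DATETIME NOT NULL
        CONSTRAINT DF_Customer_CreatedDate DEFAULT (GETDATE()),
    CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerID)
);
GO

CREATE NONCLUSTERED INDEX ix_Customer_CustomerName
    ON dbo.Customer (CustomerName);
GO

CREATE VIEW dbo.vw_CustomerSummary
AS
SELECT CustomerID, CustomerName, CreatedDate
FROM dbo.Customer;
```

Notice you can tell from the name alone that vw_CustomerSummary is a view and PK_Customer is a primary key, and the PK is CustomerID rather than just ID.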
  1. Testing
  • Test everything and be prepared to make changes.
  • If someone disagrees with you on a design decision, be prepared to generate tests that show the results. What someone thinks or is told is not an adequate reason to change a design. You should be able to back up any decision you make.
  1. Carefully consider what datatypes you use
  • You should always try to use the smallest datatype you can while taking into account future growth. Choosing the correct datatype can have a large impact on storage and performance. Take time to carefully consider what you use and determine if it is the best fit for the scenario.
  • Do not sell yourself too short either; you want to avoid needing to change datatypes later.
  • Understand Unicode vs Non-Unicode and when you should implement one over the other and make sure you are consistent.
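A hypothetical table sketching these trade-offs (the sizes and names are assumptions for the example, not rules):

```sql
-- Hypothetical table showing deliberate datatype choices
CREATE TABLE dbo.OrderDetail
(
    OrderDetailID BIGINT IDENTITY(1,1) NOT NULL,  -- expecting > 2 billion rows; otherwise INT
    Quantity      SMALLINT NOT NULL,              -- values fit comfortably in 2 bytes
    UnitPrice     DECIMAL(9,2) NOT NULL,          -- exact values; never FLOAT for money
    ProductCode   VARCHAR(20) NOT NULL,           -- ASCII-only codes: non-Unicode, 1 byte per character
    ProductName   NVARCHAR(100) NOT NULL          -- may hold international text: Unicode, 2 bytes per character
);
```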
  1. Understand normalization and how to implement it
  • I think the level of normalization depends on the project, but it is important to understand the benefits and disadvantages of it during your design.
  • Benefits of normalization
    • Smaller database. By eliminating duplicate data, you will be able to reduce the overall size of the database
    • Better performance – Narrower tables have fewer columns and fit more records per page. Fewer indexes per table mean faster writes and maintenance, and you only have to join the tables you need.
  • Disadvantages of normalization
    • More tables to join
    • Tables contain codes instead of real data, so you have to go back to the lookup tables for values
    • Difficult to query against. It is optimized for applications, not ad hoc querying.
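A minimal sketch of a normalized pair of tables (hypothetical names) that shows both the benefit and the trade-off described above:

```sql
-- Hypothetical normalized design: the status text lives once in a lookup table
CREATE TABLE dbo.OrderStatus
(
    OrderStatusID TINYINT NOT NULL CONSTRAINT PK_OrderStatus PRIMARY KEY,
    StatusName    VARCHAR(50) NOT NULL
);

CREATE TABLE dbo.SalesOrder
(
    SalesOrderID  INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_SalesOrder PRIMARY KEY,
    OrderStatusID TINYINT NOT NULL
        CONSTRAINT FK_SalesOrder_OrderStatus REFERENCES dbo.OrderStatus (OrderStatusID)
);

-- The trade-off: reading the real value back requires a join to the lookup table
SELECT o.SalesOrderID, s.StatusName
FROM dbo.SalesOrder AS o
JOIN dbo.OrderStatus AS s ON s.OrderStatusID = o.OrderStatusID;
```

The status text is stored once in one byte per row instead of repeating a string in every order row, which is where the smaller database and narrower table come from.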
  1. Use a single column as your Primary Keys
  • When you are designing a new database, try to design things so everything connects via single integer columns, with foreign key relationships from other tables. SQL Server processes numeric datatypes faster than character datatypes, so this increases performance. Single integer keys also reduce the size of the data being stored, both in the table itself and in any additional indexes, because the clustered key columns are carried as hidden columns in every nonclustered index.
  • Your primary key will be referenced in other tables, so you want it as small and as fast as possible. Composite business keys force you to duplicate data in foreign key relationships and cause more data to be read on joins.
  • Avoid using composite business keys as your primary key. If you need uniqueness, then apply a non-clustered unique index
  • If you do not have a natural primary key to use, then create an identity and let SQL Server assign the number
  • For existing development, it may not be possible or make sense to do this, so instead you may need to work within the confines of the existing database structure
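Put together, the guidance above might look like this (hypothetical table and column names):

```sql
-- Hypothetical table: a surrogate identity primary key, with the natural
-- business key protected by a unique non-clustered index instead
CREATE TABLE dbo.Employee
(
    EmployeeID       INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Employee PRIMARY KEY CLUSTERED,
    NationalIDNumber VARCHAR(15) NOT NULL,
    EmployeeName     NVARCHAR(100) NOT NULL
);

CREATE UNIQUE NONCLUSTERED INDEX ix_Employee_NationalIDNumber
    ON dbo.Employee (NationalIDNumber);
```

Other tables then reference the small EmployeeID integer rather than the 15-character business key.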
  1. Most everything should have a PK/Clustered Index defined
  • You should have some way to uniquely identify a row in each table
  • The only exceptions to this could be staging tables or ETL tables
  • A primary key is important for SQL Server, not only to allow for processes such as replication, Change Data Capture, FileTables, audit trails, and data synchronization, but also to simplify the creation of foreign key constraints.
  • A good clustered index also helps manage the fragmentation that occurs from updates, deletes, and inserts
  • A unique clustered index eliminates the need for uniquifiers when performing a key lookup
  1. Use SQL Server built-in functionality to protect data integrity
    1. Nullability – NULL is used explicitly to indicate when a value is unknown, undefined, or does not or cannot exist. If you use NULL values, make sure that your code and design can easily support this.
    2. Constraints – (Including NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY, CHECK, DEFAULT, INDEX). These are used to specify rules for the data in a table by limiting the type of data that can go into a table. This ensures the accuracy and reliability of the data stored.
    3. String length – Think about your string lengths before assigning a value. Plan each column carefully and define the size accordingly. Note that anything over 4000 characters will be stored as LOB data, which cannot be used as a regular index key column (only full-text indexes can search it).
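A hypothetical table leaning on each of those built-in protections:

```sql
-- Hypothetical table using built-in constraints to protect integrity
CREATE TABLE dbo.Account
(
    AccountID     INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_Account PRIMARY KEY,
    AccountNumber CHAR(10) NOT NULL CONSTRAINT UQ_Account_AccountNumber UNIQUE,
    AccountType   CHAR(1) NOT NULL
        CONSTRAINT CK_Account_AccountType CHECK (AccountType IN ('C', 'S')),
    Balance       DECIMAL(12,2) NOT NULL CONSTRAINT DF_Account_Balance DEFAULT (0),
    ClosedDate    DATE NULL  -- NULL deliberately means "the account is not closed"
);
```

Any insert or update that violates these rules is rejected by the engine itself, so the data stays trustworthy even if application code has a bug.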
  1. Use stored procedures for SQL code
  • All SQL code should be stored within stored procedures
    • Maintainability – Stored procedures provide a known interface to the data. When code that accesses the database is compiled in a different layer, tweaks cannot be made without a programmer’s involvement. This makes things easier to troubleshoot.
    • Encapsulation – Logic can be stored in a single place and changed without impacting the application
    • Security – Tighter security can be implemented, and specific, granular access can be granted
    • Performance – Plans get re-used more often with stored procs. Additionally, code stored here is much easier to performance tune.
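A minimal sketch of that interface (dbo.Customer and the AppUser principal are assumed to exist for the example):

```sql
-- Hypothetical procedure: a known, grantable interface to the data
CREATE PROCEDURE dbo.usp_GetCustomerByID
    @CustomerID INT
AS
BEGIN
    SET NOCOUNT ON;

    SELECT CustomerID, CustomerName
    FROM dbo.Customer
    WHERE CustomerID = @CustomerID;
END;
GO

-- Granular security: the application gets EXECUTE only, no direct table access
GRANT EXECUTE ON dbo.usp_GetCustomerByID TO AppUser;
```

The query text can later be tuned inside the procedure without touching or redeploying the application layer.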
  1. Proper Indexing
  • If you are reading from a table, it needs to have a clustered index
  • Create non-clustered indexes on the columns that are frequently used in predicates and join conditions. These are typically your foreign keys.
  • Understand the characteristics of the database and its most frequently used queries so you can index appropriately and create covering indexes for your more critical queries.
  • Determine the optimal storage location for an index. For example, it may benefit from being on a separate filegroup
  • Avoid over-indexing heavily updated tables and keep indexes narrow (as few columns as possible).
  • Consider the order of your columns in your index
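For example, a covering index for a hypothetical frequent query that filters orders by customer and selects the date and total might look like:

```sql
-- Hypothetical covering index: the key is the frequently filtered foreign key,
-- kept narrow, with the selected columns carried in the INCLUDE list
CREATE NONCLUSTERED INDEX ix_SalesOrder_CustomerID
    ON dbo.SalesOrder (CustomerID)
    INCLUDE (OrderDate, TotalDue);
```

The INCLUDE columns let the query be answered entirely from the index without widening the index key.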
  1. Keep it simple
  • While there are certainly times that complicated coding is required, try to keep things as simple as you can. Simple designs are much easier to manage
  1. Consider using flags instead of DELETE
  • Strongly consider whether you should be deleting data from this database. You can implement delete flags to indicate when a row is no longer active; if the row is needed again, it can always be flagged back. It is better to do this from the beginning so that checking the flag is built into the code.
  • The Delete flag should be stored in parent tables
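A sketch of the soft-delete pattern (reusing the hypothetical dbo.Customer table from earlier examples):

```sql
-- Hypothetical soft delete: flip a flag instead of removing the row
ALTER TABLE dbo.Customer
    ADD IsDeleted BIT NOT NULL
        CONSTRAINT DF_Customer_IsDeleted DEFAULT (0);

-- A "delete" becomes an update...
UPDATE dbo.Customer SET IsDeleted = 1 WHERE CustomerID = 42;

-- ...and every read has to check the flag
SELECT CustomerID, CustomerName
FROM dbo.Customer
WHERE IsDeleted = 0;
```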

 

Understanding the basics of SQL Licensing

Ever find yourself confused about how SQL licensing works? If you have, welcome to the club! There are so many components to it that it can get crazy fast. The information in this article lays out some of the more important aspects of SQL licensing to be mindful of, as well as situations where you may want to consider each. For this article, I am using the latest information from the SQL Server 2017 licensing guide, so any old-timers out there still on a grandfathered plan will need to work with their rep.

SQL 2017 Licensing Guide

I highly recommend working with Microsoft or a 3rd party re-seller to obtain licenses and talking with them about your plan. Additionally, I recommend getting any information they give you in writing. Unfortunately, people make mistakes on this topic, things get miscommunicated, and you do not want that. There are tons of horror stories where someone built their licensing structure on something that turned out to be wrong.

Disclaimer: I do not work for Microsoft or any 3rd party approved re-seller. I am just a guy who has spent a lot of time working on this subject sharing information I have learned with you.

Starting from the beginning let’s look at the primary editions of SQL and the differences between them. It is important to make sure that you are picking the correct one when installing SQL due to the licensing implications.

There are 5 primary editions of SQL Server that are currently available

    • Enterprise ($$$$) – This is the full version with no limitations. Should only be used when Standard does not meet the requirements for what you need
        • Cost: MSRP for a single Enterprise license (which covers 2 cores) is $14,256. The cost could be lower with your Microsoft enterprise agreement. Check with your licensing manager for exact costs
    • Standard ($) – Some enterprise functionality not available although will work for most stuff
        • Cost: MSRP for a single Standard license (which covers 2 cores) is $3,717. The cost could be lower with your Microsoft enterprise agreement. Check with your licensing manager for exact costs.
    • Web ($) – This version is somewhere between Standard and Express in terms of features. It is used by web hosters and Web VAPs and is only available under the Microsoft Services Provider License Agreement (SPLA)
        • Cost: Speak to your rep about the costs here
    • Express (free) – A free version of SQL that has several limitations
        • Cost: Free
    • Developer – This is the full version of SQL, but should exclusively be used in non-prod. Starting in SQL 2014, Microsoft made this free. If you are running earlier versions, you may be covered under a Visual Studio or MSDN subscription; otherwise you need to speak to your licensing rep to see if there are any costs here. I will create a separate post specific to this subject.

Ways to license 

When looking at how you can license, there are a handful of models to be familiar with for SQL. While this article does not specifically apply to Azure, a lot of this information does, although there are additional options in Azure.

There are 4 ways that you can license a production SQL Instance

    1. License the Physical Hosts for virtual environment by physical cores (Limited to Enterprise edition) – This means that you license all physical cores on each server in a virtual cluster
    2. License a Physical server by physical cores (non-virtual) – This means that you license all physical cores on the physical server for the version of SQL that is running on it.
    3. License the individual VM’s by virtual cores – In a virtual environment, you can license each individual VM with the version of SQL that you are running.
    4. License using server + CAL (limited to Standard edition) – This means that you must assign a license to the server running SQL Server and acquire a CAL for each user or device that accesses the server

There are a few other caveats to SQL licensing that you also need to be aware of.

Other important notes about licenses

    1. 1 license equals 2 physical cores. On a VM, 1 license equals 2 virtual cores.
    2. There is a 4 core minimum per physical processor. This means that there is a 2 license minimum per physical processor. For a VM, you license the virtual cores with a 4 core minimum.
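To make the arithmetic concrete, here is a worked example of those two rules as a T-SQL sketch (the core counts are made up; substitute your own hardware):

```sql
-- Hypothetical worked example of the 2-cores-per-license / 4-core-minimum rules
DECLARE @Processors        INT = 2;  -- physical processors (sockets)
DECLARE @CoresPerProcessor INT = 6;  -- physical cores on each processor

-- Each processor must be licensed for at least 4 cores
DECLARE @LicensableCores INT =
    @Processors * CASE WHEN @CoresPerProcessor < 4 THEN 4 ELSE @CoresPerProcessor END;

-- 1 license covers 2 cores
SELECT CEILING(@LicensableCores / 2.0) AS LicensesNeeded;  -- 12 cores -> 6 licenses
```

Note how the minimum bites on small hardware: a 2-core processor still licenses as 4 cores (2 licenses).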

Software Assurance

In addition to the licensing you apply, there is also software assurance to consider. Software assurance is typically around 30% of the cost of the license and is paid yearly, although this is more of a starting point and can be lower; you need to work with your rep on this. Also, be aware that if you have this and drop it, it cannot be added back at a later time. To reclaim software assurance you need to re-buy the licenses.

Here are some of the benefits\reasons why you should consider getting SA on all licensing purchases.

Benefits of Software Assurance (SA)

    • Upgrade – Gives you the ability to upgrade SQL to any version (up to the latest).
        • Without SA you are limited to the latest version of SQL released when your SA expired
    • Virtualization – Have the ability to move servers around a virtual environment without needing to license the hosts
        • You cannot have a Virtual environment without SA unless you license all physical hosts
    • High Availability and Disaster Recovery – For high availability you receive 1 passive node per 1 active node. The standby node does have to be truly passive and cannot be used for any type of production workload. You are also allowed a DR server, but there are specific requirements that must be met. I have listed some of the notable requirements.
        • For DR, the server cannot be running except for 1 week every 90 days, during a DR scenario in which production is down, or while it is being used to transfer from prod to DR.
        • For DR, the hosts running the DR servers must be dedicated to DR
        • For DR, you must have software assurance or you will need to license the DR node(s)
    • License Mobility – Ability to transfer licenses
    • Support benefits 
    • Training and deployment planning services (sometimes)

Looking at each of the Licensing models deeper

Here is a deeper explanation of each of the licensing models and the pros\cons of each.

License using server + CAL

CAL licenses can be purchased for servers running Standard edition only. This model is usually used when the number of clients (users or devices) is pre-determined. To license this way, you pay for a server license plus a fee for each Device or User CAL. Check with your licensing manager for costs.

License a Physical server by physical cores (non-virtual)

This one is pretty self-explanatory. If you are running SQL on physical servers you will need to license the physical cores of the servers they are running on. If you are running a passive\standby physical node then that would be covered under SA or have licenses applied if you do not have SA.

    • Be aware there is a 4 core minimum per physical processor.
    • You do not have to license hyper-threading, just the actual physical cores. Be sure to double-check this: the core count that SQL Server reports is not always the physical count because it can include hyper-threading, so verify the actual physical core count.
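One way to sanity-check the core counts from inside SQL Server itself is the sys.dm_os_sys_info DMV (note: the socket_count and cores_per_socket columns require SQL Server 2016 SP2 / 2017 or later; always confirm against the hardware spec before buying licenses):

```sql
-- Compare physical cores against the logical CPU count SQL Server sees
SELECT socket_count,
       cores_per_socket,
       socket_count * cores_per_socket AS physical_cores,  -- what you license
       cpu_count AS logical_cpus                           -- can include hyper-threading
FROM sys.dm_os_sys_info;
```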

Licensing the Physical Hosts for virtual environment by physical cores

Pros:

    • Easier to manage since you apply the licenses to the hosts. You can build whatever you want on it without needing to track as much
    • This model “can” function without software assurance (though could affect other things)
        • If you have a lot of licenses that do not have SA then you may want to look at using this as an interim solution
    • You can “potentially” save money by using hyper-threading, private cloud scenarios with VM density and using dynamic provisioning and de-provisioning of VM resources

Other Considerations:

    • This model can function without Software Assurance as long as the number of VM's does not exceed the physical core count. For example, if you have 40 physical cores, then you can only run 40 virtual machines. If you go over the limit, then an Enterprise license is required for each additional virtual machine you are over
    • With software assurance, you can run as many Virtual machines as you like without limit
    • This option oftentimes makes it more difficult to quantify costs for projects\budgets. It is harder to communicate the cost of an application requiring SQL since the licensing is shared by many things. This comes into play when you need to determine who is responsible for paying for what or whose budget it comes out of
    • You may end up paying for more licenses than you are using. To come out ahead, the number of licensed cores on the VM's has to exceed the number of cores on the hosts.
    • Must have a well-managed\well-designed infrastructure to maintain compliance.
        • You probably want to segregate SQL servers in your environment to dedicated virtual hosts to minimize your footprint. This requires consideration to be taken in the design stages of your infrastructure. Additionally, if you have some licenses that go up to SQL 2012 and others that go up to SQL 2014, then you may have to create dedicated Virtual clusters to host certain versions of SQL.
    • Limited opportunities to reduce licensing footprint. You cannot tune\consolidate a SQL Server and reduce the core count on it and affect the licensing, because you still have to license all of the hosts in the virtual cluster despite what is allocated to the VM’s.
    • Potential for more hardware needed to support this licensing model. If you have to create several different virtual clusters to contain different SQL versions, you could end up requiring more hardware than you need.

Recommendation: Do not select this option by default. I have worked with and seen several companies choose this option and cost themselves a significant amount of money that did not need to be spent. In fact, I think this is one of the biggest mistakes I have seen in terms of licensing. There are situations where it makes sense, but not for everything. If you are unsure, perform an exercise to look at the math and see if it makes sense. Check yourself!

License the individual VM’s by virtual cores

Pros:

    • Usually the lowest cost option and the better option from what I have seen. The reason is that it's less common to see more licensable cores allocated to the VM's than what the hosts have. Additionally, it is likely that you are running multiple editions of SQL (Standard\Enterprise) on the same cluster.
    • Cost is more easily determined for budgeting\spending. It is easy to communicate the cost to the business when a new SQL VM gets built since you know exactly how many licenses it needs.
    • You are only paying for what you need.
    • Grants immediate opportunity to reduce licensing. You can tune\optimize as much as you can and reduce cores. As you reduce the cores, you can reduce the licenses needing to be allocated to it.

Other Considerations:

    • Licenses have to be managed per VM to remain in compliance. This requires you to keep track of your inventory to understand how many licenses are allocated. There is a little more management here over licensing the hosts. You cannot just build something anytime you want and expect to be in compliance.
    • Software assurance is required if the virtual server can move around on hosts.
        • You can license per VM without SA, but you can only failover that server once every 90 days to remain compliant (excluding permanent hardware failure)

If you are unsure of the best model for you, CHECK!

In my experience it is almost always cheaper to license per individual VM than it is to license the host. However, if you are unsure, I would encourage you to do an exercise where you look at how many licenses you would need for the host vs how many you would need to license the VMs, and determine the cost difference.

Future Topics:

  1. What does building an inventory look like and how can you accomplish this? What tools are available?
  2. Advanced scenarios with licensing where you do not have software assurance
  3. Specifics to licensing a non-production environment
  4. Licensing for Analytic Platform System (Parallel Data Warehouse)
  5. Further licensing considerations with High Availability
  6. SQL licensing in Azure
  7. Tools that you can use to better manage your licensing

Welcome to my blog

Welcome to the premiere blog about everything you need to know about SQL Server…. well, maybe that is a bit of a stretch, but it is a blog about things in SQL Server!

I have worked with SQL Server for around 10 years now and have considered stepping out into blogging for some time. I have learned from so many others in the SQL community that I find this is a great way to start giving back. In addition, selfishly, I want a place to document things for myself so I have a resource to go back to when I forget how to do something. This never happens to me, but word on the street is that as you get older you might. So this provides a resource to me and hopefully a resource to others. And hey, if it helps the ole career, or if somewhere along the way a company decides they like what they see and wants to offer me a lot of money to come work for them, then that is okay too.

My hold-up has always been where to start and how you are supposed to blog. I read some other people's stuff pretty often, but have never been a blogger myself. But as they say, "every journey begins with the first step" and sometimes you just have to start. So here goes….