Tracking connections on your SQL Instance

Many times throughout my career, I have needed to complete tasks on databases that there was little to no information about. I often refer to these little guys as mystery databases. Whether it is a migration, a decommission, troubleshooting an application issue, or something else, it can be hard to figure out who or what is using one of these mystery databases. Even speaking to the people you think know (or should know) the answers can come up fruitless. So how do you handle that?

Yes, you can run one of the various scripts (such as sp_who or sp_whoisactive) to see who or what is currently connected, but how confident can you be that it is the complete list? What if an application only connects periodically? If you have little to no information on the database, can you know for sure? And yes, you could shut it down and see what breaks and who screams, but is that really the most effective way?

I got tired of seeing blank stares and hearing “I don’t know” from application owners and developers, so I created a process that lets me track connection history. It allows me to answer these questions on my own, or at least have more information available when talking to other teams. Since we started running this process, it has become one of the most widely used tools on my team, especially now that it has been running for months and all of that history is available to me.

The process is just a stored procedure that pulls a distinct list of connections from sys.sysprocesses, keyed on instance name, database name, login, host name, and program name, and loads that data into a table. Each distinct connection has a counter column that is incremented by 1 every time the connection is seen again, and a timestamp column that is updated with the most recent time it was found. I run this proc via a SQL Agent job on all of my servers on a 3-minute interval. The task is lightweight, so you can run it as often as you like, and the table should stay relatively small.

Let’s take a look at what you need to do to set this up. For the sake of this article, I am going to create everything in the master database; however, on my own servers I normally use a user-created DBA database that exists on every instance and holds all of the objects I need to manage that server.

1. Create a table that you plan to log the data into.

USE [master]
GO

CREATE TABLE [dbo].[SQLConnections](
	[SQLConnections_ID] [int] IDENTITY(1,1) NOT NULL,
	[Instance_Name] [nvarchar](50) NOT NULL,
	[Database_Name] [nvarchar](100) NULL,
	[loginame] [nvarchar](128) NULL,
	[hostname] [nvarchar](128) NULL,
	[program_name] [nvarchar](128) NULL,
	[LastLogin] [datetime] NULL,
	[ConnectionCount] [bigint] NULL,
 CONSTRAINT [PK_SQLConnections] PRIMARY KEY CLUSTERED 
(
[SQLConnections_ID] ASC
))

2. Create the proc

USE [master]
GO
CREATE PROC [dbo].[sp_LoadCurrentConnections]

AS

-- Create a Temp Table to hold the current connections

IF OBJECT_ID('tempdb..#SQLConnections') IS NOT NULL DROP TABLE #SQLConnections
CREATE TABLE #SQLConnections
(Instance_Name nvarchar(50),
Database_Name nvarchar(100),
loginame nvarchar(128),
hostname nvarchar(128),
program_name nvarchar(128),
LastLogin datetime,
ConnectionCount bigint)

-- Capture the current connections into the temp table

INSERT INTO #SQLConnections
(Instance_Name, Database_Name, loginame, hostname, program_name, LastLogin, ConnectionCount)
SELECT @@SERVERNAME as Instance_Name,
DB_NAME(dbid) as Database_Name,
loginame,
hostname,
program_name,
MAX(login_time) as LastLogin,
1 as ConnectionCount
FROM sys.sysprocesses
GROUP BY DB_NAME(dbid), loginame, hostname, program_name

--- Update any existing data with latest login and add to count

UPDATE SC
SET SC.LastLogin = SCS.LastLogin,
SC.ConnectionCount = SC.ConnectionCount + 1
FROM #SQLConnections SCS
INNER JOIN master.dbo.SQLConnections SC ON SC.Instance_Name = SCS.Instance_Name
AND SC.Database_Name = SCS.Database_Name AND SC.loginame = SCS.loginame
AND SC.hostname = SCS.hostname AND SC.program_name = SCS.program_name

--- Insert any new rows into the table that do not exist

INSERT INTO master.dbo.SQLConnections
(Instance_Name, Database_Name, loginame, hostname, program_name, LastLogin, ConnectionCount)
Select SCS.Instance_Name, SCS.Database_Name, SCS.loginame, SCS.hostname, SCS.program_name, SCS.LastLogin, 1
FROM #SQLConnections SCS
LEFT JOIN master.dbo.SQLConnections SC ON SC.Instance_Name = SCS.Instance_Name
AND SC.Database_Name = SCS.Database_Name AND SC.loginame = SCS.loginame
AND SC.hostname = SCS.hostname AND SC.program_name = SCS.program_name
WHERE SC.Instance_Name is null

Note: You could probably use one of the other commonly used scripts out there and just log its output to a table instead. This script is simply what I use, since it’s simple and accomplishes what I need it to.

3. Create a SQL Agent job to execute the stored proc on the schedule of your choosing. (Note: I typically run it every 3 minutes)
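
If you prefer to script the job rather than click through SSMS, here is a minimal sketch using the standard msdb job procedures. It assumes the proc lives in master as created above; the job, step, and schedule names are just examples.

USE [msdb]
GO

EXEC dbo.sp_add_job
    @job_name = N'DBA - Load Current Connections';

EXEC dbo.sp_add_jobstep
    @job_name = N'DBA - Load Current Connections',
    @step_name = N'Run sp_LoadCurrentConnections',
    @subsystem = N'TSQL',
    @database_name = N'master',
    @command = N'EXEC dbo.sp_LoadCurrentConnections;';

-- Run every 3 minutes, all day, every day
EXEC dbo.sp_add_jobschedule
    @job_name = N'DBA - Load Current Connections',
    @name = N'Every 3 minutes',
    @freq_type = 4,              -- daily
    @freq_interval = 1,
    @freq_subday_type = 4,       -- minutes
    @freq_subday_interval = 3;

-- Target the local server
EXEC dbo.sp_add_jobserver
    @job_name = N'DBA - Load Current Connections',
    @server_name = N'(LOCAL)';
GO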

After everything is set up and running

Once you have the process running, you have a historical record of what is connecting to your servers and an idea of how often it is connecting.

Here is an example of the output from a server that I manage (names have been changed to generic values) to give you an idea of the result set.

[Screenshot: example rows from the SQLConnections table]

From this table, you can see each connection being made to your databases and how often it connects. There are several things you can do with this information, and you may find that it comes in handy a lot more than expected. Anytime you need to see what connects to a database, this will be your go-to place.
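
As an example of how I use it, here is the kind of query I run against the table when someone asks what touches a particular database (the database name here is made up):

SELECT Database_Name, loginame, hostname, program_name, LastLogin, ConnectionCount
FROM master.dbo.SQLConnections
WHERE Database_Name = 'MysteryDB'
ORDER BY ConnectionCount DESC, LastLogin DESC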

How to create dynamic data sources in SSIS

I use an SSIS package to gather information from all my SQL Servers and load it into a single repository on my DBA management server: database info, security info, HA info, version info, and so on. This fully automated repository contains a ton of valuable information that lets me manage my environment more easily and from a central location (I will blog more about this process in the future).

For now, I just want to share a technique I use to dynamically build data source connections to each of my servers. You can use this to do some really cool things if you want to automate pulling the same data from multiple servers.

Let’s start with an example: I want to pull everything stored in master.sys.databases from all of my servers into a single location, and I want to do this for 100 servers. You could certainly create 100 different data source connections and 100 different SSIS tasks, but that is crazy, right? You could also create an individual package for each of the 100 servers, but that is still craziness! What I am going to show you is how to build your data flow dynamically, so you only need one task per queried item. In this example, we are pulling just database information, so we only need one task in the Control Flow.

Step 1: You need an account that has the necessary access to all the SQL instances you are going to query. In our example, the account needs read access to the master database, or specifically just master.sys.databases. However you grant the access, make sure the permissions are the same on each server.

NOTE: If you are running this locally from your PC, the package may run under the context of your own account. If you are a DBA with sysadmin rights on each server, it will probably work without any permissions needing to be assigned. However, when you deploy the package to run from a server, it will run under the context of that server's service account or under a specified proxy account. I recommend creating a proxy account with a dedicated login (rather than using the service account) to access each of your servers, and granting it only the least access needed.
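
As a rough sketch of that least-privilege setup for this particular data pull, something like the following run on each instance would cover reading master.sys.databases (the login name is only a placeholder for whatever account or proxy you decide to use):

USE [master]
GO
-- Hypothetical service account used by the package / proxy
CREATE LOGIN [DOMAIN\svc_ssis_repo] FROM WINDOWS;
-- VIEW ANY DATABASE lets the login see metadata for every database in sys.databases
-- (granted to public by default, but stated explicitly here in case it has been revoked)
GRANT VIEW ANY DATABASE TO [DOMAIN\svc_ssis_repo];
GO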

Step 2: Create a new blank SSIS package or use an existing one. In our case, I am going to create a new one

Step 3: Create a connection to the server you want the data to be loaded into. In my case, I want the data loaded into the DBServers database on one of my servers. Because I am a sysadmin I do not need to grant any additional access, but I would need to later for the proxy account. You can connect with whatever account you have set up.

In my example, the connection is going to be called DestinationServer.DBServers

[Screenshot 1: the DestinationServer.DBServers connection manager]

Step 4: Create a new string variable called Servername. Set its value to the destination server name you just used.

[Screenshot 2: the Servername variable]

Step 5: Create a new data source that will serve as your target. This data source will be defined at run time based on what is passed into the variable, but for the sake of creating something we have to put some information in now. For the server name, just use your destination server name and select any database. In my case, I am going to point it at the master database, since that is where the database information we are pulling lives. In my example, I am calling this connection TargetServer.master.

NOTE: If you are using this technique, I recommend using three-part names (Database.schema.object) in your code. The reason is that you may be connecting to different databases on a server to get information, and you may not want to create an individual data source connection for each of them.
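
For example, with three-part names a single dynamic connection pointed at each server can read from several databases without extra connection managers. Hypothetical queries might look like this:

-- Database catalog information from master
SELECT name, compatibility_level, recovery_model_desc
FROM master.sys.databases;

-- Agent job information from msdb over the same connection
SELECT name, enabled
FROM msdb.dbo.sysjobs;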

Step 6: Once your Target server connection is created, go into its properties and edit the Expressions collection to tie the ServerName property to your variable. Use @[User::Servername] as the expression. You can hit the Evaluate Expression button and you should see your destination server name, or whatever server you used for the target.

[Screenshot 3: the ServerName property expression set to @[User::Servername]]

Step 7: Check to make sure that your Target Server connection now shows an expression next to it.

[Screenshot 4: the TargetServer.master connection showing the expression indicator]

Step 8: We are now ready to start creating the data task to pull database information.  Go ahead and create a new Foreach Loop Container. We will call this “Cycle through Servers”.

Step 9: Create a Data Flow Task inside your Foreach Loop container that will perform the data pull and load. We will call this “Load Database information”.

[Screenshot 5: the Cycle through Servers container with the Load Database information Data Flow Task]

Step 10: Go ahead and open your Data Flow Task and create your data flow. This is a typical setup: pull data from one server and load it into another. I am just using a query to pull data from master.sys.databases and loading it into a table on my destination server. There is nothing special about the data flow; just set it up as if you were doing this for one server. A sample source query is shown after the screenshot below.

[Screenshot 6: the data flow from the source query to the destination table]
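
The source query itself is nothing exotic. Here is a sketch of the kind of SELECT you could point the source at; the exact column list is up to you and simply needs to match the destination table you load into:

SELECT @@SERVERNAME AS Instance_Name,
    name,
    database_id,
    create_date,
    compatibility_level,
    state_desc,
    recovery_model_desc
FROM master.sys.databases;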

Step 11: Now you need to create the list of servers you want the Foreach Loop Container to cycle through. There are two options here. One lets you store the names of your SQL instances directly in the package, but I do not think that is a very manageable option. The other lets you pull the instance names from a table in a database. I am going to show you how to build your connections from table values, since this is far more manageable and lets you change the server list without editing the package each time.

Step 12: I already have a table that stores all of the SQL Servers in our environment. If you do not have one, I suggest creating one in a location you can manage; a sketch is shown below. My table is stored in my destination server database (DBServers), so I can query it to get my list of servers. For the sake of this example, I am going to pull all servers, but you could create different sets to filter on things such as prod/non-prod, certain versions of SQL, HA, and so on.
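
If you do need to create one, a minimal sketch might look like this (the table and column names are just examples, not what I actually use):

USE [DBServers]
GO

CREATE TABLE dbo.SQLServers
(SQLServers_ID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
Instance_Name nvarchar(128) NOT NULL,
Environment nvarchar(20) NULL,      -- e.g. Prod / Non-Prod
IsActive bit NOT NULL DEFAULT (1))

-- The query the package will use to build its server list
SELECT Instance_Name
FROM dbo.SQLServers
WHERE IsActive = 1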

Go ahead and create a new variable called AllServers and set the Data type to Object.

[Screenshot 7: the AllServers variable with data type Object]

Step 13: On your Control Flow, create an Execute SQL Task. We will call this “All Servers”. This task will pull the list of servers we want to connect to.

Step 14: Open the properties for the Execute SQL Task you just created and define its connection. In my case, it is the same as my destination server, but if your server list lives somewhere else, it could be different. Fill out the rest of the information, including the query you are using to pull your instance names, and set ResultSet to Full result set.

[Screenshot 8: the All Servers Execute SQL Task properties]

Step 15: While still in the properties, go to the Result Set page on the left-hand side. Add a row that maps the Result Name to the variable you created (User::AllServers). It should look like this; the Result Name is always 0 for a full result set.

[Screenshot 9: the Result Set page mapping result name 0 to User::AllServers]

Step 16: Go back to your Control Flow and open the properties for your Foreach Loop container (Cycle through Servers). Go to the Collection page on the left-hand side. In the Foreach Loop Editor, change the Enumerator at the top to Foreach ADO Enumerator and change the ADO object source variable to User::AllServers. Leave the other options at their defaults. Then go to the Variable Mappings page and map User::Servername to Index 0, so that each instance name returned by the query is assigned to the Servername variable (and therefore to your Target server connection) as the loop runs. The Collection page should look like this.

[Screenshot 10: the Foreach Loop Collection page]

Step 17: Resolve any errors the package may be giving you. Once they are resolved, you are ready to run the package. If you are able to, go ahead and test it. As you watch the “Cycle through Servers” container you should see it execute and complete multiple times, once for each server returned into the AllServers variable. The task will pull the database info from every server defined in my table and load it into a table on my destination server.

[Screenshot 11: the package executing, cycling through the servers]

Step 18: Finish creating anything else you want, then deploy the package and schedule it to run. Just remember that when you do, you will want to make sure the account the process runs under has the appropriate permissions on every server.

As mentioned earlier, I use this technique for a lot of the data we compile into a single location. I have also created several variables that define server lists based on flags or settings in my database, and I have many data flow tasks that use them. If you are trying to centralize a lot of your management data, you can get really creative with how you pull it.