This is the third post in the “Parameterization” series. In the previous post I mentioned parameter sniffing. This is a very important concept in SQL Server, and it certainly deserves a dedicated post. It’s time to dive into all the juicy details about parameter sniffing…
But first, let’s begin with a short reminder of what we’ve learned so far. A batch can be either parameterized (contain parameters) or not. A parameterized batch plan in SQL Server can be one of two object types: “Prepared” or “Proc”. A “Prepared” plan corresponds to an execution of sys.sp_executesql with at least one parameter, whether it is executed directly from a T-SQL batch, or from an application through ADO.NET or any other client provider. A “Proc” plan corresponds to a stored procedure (surprised, aren’t you?).
Parameter sniffing is identical in both cases. It behaves exactly the same for “Prepared” plans and for “Proc” plans. So for the sake of this post, we’re going to just talk about parameterized batches regardless of their type.
So what is parameter sniffing?
When a batch contains one or more parameters, and it needs to be optimized (for example, because there isn’t already a plan in cache for this batch or because there is a plan, but it is not valid), the values of the parameters in this specific execution are known to the optimizer. This means the optimizer can use these values in order to estimate the number of rows to be returned at each step in the plan. Essentially, it is as if the values were hard coded in the batch text.
This is a very powerful thing, because if the optimizer didn’t know the values of the parameters, it would have been forced to guess the number of rows. It would make an intelligent guess as much as possible, based on average statistics and other metadata it can use, but in most cases it is still going to be far from reality. Wrong cardinality estimation often leads to a poor choice of execution plan and consequently to poor performance.
It is important to understand that the execution plan is generated based on the values of the parameters in the specific execution that triggered the creation of the execution plan. For example, if this is the first time a batch is executed after a server restart, it will trigger a compilation, because a corresponding plan doesn’t already exist in the plan cache. During this compilation, the values of the parameters will be used to generate the execution plan thanks to parameter sniffing. When the plan is created, it is put in the plan cache for reuse. The next time the same batch is executed, potentially with different parameter values, the plan from cache will be reused. The results of the second execution will be based on the parameter values passed in that execution, of course. But the plan used to execute the second execution is the same plan as before, which is based on the values from the first execution.
In the graphical representation of the actual execution plan of a parameterized batch, look at the properties of the outermost operator (this would usually be a “SELECT” operator), and look for the “Parameter List” property. When you expand this property, you will see the compiled values and the runtime value of each parameter. The compiled value is the one used with parameter sniffing in the process of generating the plan. The runtime value is the actual value used in this specific execution.
In fact, the plan might be very poor for the second execution, because the number of rows based on the parameter values in the second execution might be very different than the number of rows estimated by the optimizer when the plan was generated based on the parameter values in the first execution. This is purely a matter of luck, there is no better way to put it. The execution plan is determined by the values of the parameters passed in a specific execution, and we don’t have control over it, because we don’t know when a compilation will happen.
If the plan is good for most executions, then we’re OK, but what if it’s not? What if a user executed a stored procedure with a parameter value that is rarely used? This parameter value resulted in a plan that is very efficient for that specific value, but quite poor for most other values. Most users from this point forward will execute the same stored procedure with commonly used parameter values, and they will experience poor performance, because the plan used for the stored procedure is based on a rarely used parameter value.
So there is a question here: Is parameter sniffing a good thing or a bad thing?
As always, the answer is: “it depends”. In this case, it depends on the distribution of the data. Let’s look again at the stored procedure we used in the previous post:
CREATE PROCEDURE Marketing.usp_CustomersByCountry ( @Country AS NCHAR(2) ) AS SELECT Id , Name , LastPurchaseDate FROM Marketing.Customers WHERE Country = @Country; GO
In this case, we use the parameter @Country to filter rows based on values in the “Country” column in the “Marketing.Customers” table. If most countries have more or less the same number of rows in the table, and most executions of the stored procedure are used with these countries, then parameter sniffing is a good thing, because it is likely that we will get a plan that is good for most cases, and it is likely that the plan would be better than a plan generated without parameter sniffing (treating parameter values as unknown). On the other hand, if the distribution of values in the “Country” column is not uniform, then a plan based on one country might be a bad choice for other countries, and in this case parameter sniffing is not necessarily a good thing.
So what can we do in those cases where parameter sniffing is bad for us? This is exactly what I’m going to write about in the next post…