
Sunday, March 11, 2018

Estimated Number of Rows = 1 Query Very Slow or Not Returning Data



A query that is slow but should be fast (e.g. the underlying tables are not enormous, it has been fast in the past, etc. - I'll let you determine what should or should not be fast) is often, but not always, slow because of a "bad" query plan. Determining what is and what is not a good query plan is beyond the scope of this post, but if you're at your wits' end with a query that has run for hours and should be much faster, there's a good chance you don't have an optimal plan.

If you've landed on this page it's likely that you've already looked at your query plan and seen that some operators are showing Estimated Number of Rows = 1 when you know there are definitely more rows than that.

A key ingredient of a good query plan is accurate cardinality estimation. Cardinality refers to the uniqueness of the values contained in a particular column or index, and it has a strong influence on the SQL Server Query Optimizer. It drives decisions such as which join operator to use (e.g. hash match, nested loops, merge), which in turn can significantly impact query performance.

If you're seeing Estimated Number of Rows = 1, there's a good chance that your statistics need to be updated and/or the Cardinality Estimator is failing you.
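
Before updating anything, it can help to confirm that the statistics really are stale. Here's one way to check (a minimal sketch - the table name is a placeholder):

 -- Show when each statistics object on the table was last updated and how
 -- many rows have been modified since (available in SQL Server 2008 R2 SP2 / 2012 SP1 and later)
 SELECT s.name, sp.last_updated, sp.rows, sp.rows_sampled, sp.modification_counter
 FROM sys.stats AS s
 CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
 WHERE s.object_id = OBJECT_ID('dbo.table_or_indexed_view_name');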

To resolve the issue, first try updating statistics on the table(s) in question. Use the WITH FULLSCAN option to cover your bases.

 -- Rebuild the statistics by scanning all rows rather than a sample
 UPDATE STATISTICS table_or_indexed_view_name WITH FULLSCAN;
 -- Clear cached plans so the next execution compiles with the fresh statistics (SQL Server 2016+)
 ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;

After you've updated your statistics, take another look at your query plan. Has the Estimated Number of Rows value changed to something more accurate? If so, try running your query again. If not, it might be that you're having issues with the Cardinality Estimator.

The Cardinality Estimator received an overhaul with SQL Server 2014, and while it's better in most cases, it is not in all. In fact, I stumbled upon a bug in the CE just last year that Microsoft acknowledged and fixed. If you suspect the CE is not working as it should, there are still a few things you can try.

First, determine if you're having an issue that someone else has already encountered and that Microsoft has fixed. You can do this by enabling trace flag 4199, which enables all query optimizer hotfixes. You can first try this at the query level using the following option.

 OPTION(QUERYTRACEON 4199)  
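
For example (the table and column names here are hypothetical, just to show where the hint attaches):

 SELECT o.OrderID, o.OrderDate
 FROM dbo.Orders AS o
 WHERE o.CustomerID = 42
 OPTION (QUERYTRACEON 4199);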

If this works and you want to apply it globally, you can execute the following. Note that a trace flag enabled with DBCC TRACEON does not persist across a restart; to make it permanent, add -T4199 as a SQL Server startup parameter.

 DBCC TRACEON (4199, -1); -- the -1 enables the trace flag globally, for all sessions
 ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;

If you still haven't solved the problem, you can have SQL Server use the old or "legacy" CE, as Microsoft calls it. At the query level, append the following hint to your query (just as with QUERYTRACEON above).

 OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'))

or at the database level using the following script.

 ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON; 
 ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE; 

And if none of that works...
Take a look at your query plan. Do you have Merge Joins that require sorts on large data sets? Are there Nested Loops joins that are taking forever? (Try using Live Query Statistics to find your bottlenecks.) You can try using join hints to force a HASH join and see if that helps, as in the sketch below.
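
Here's a minimal sketch of forcing a hash join (the table and column names are made up for illustration). Keep in mind that specifying a join hint also causes SQL Server to enforce the join order written in the query.

 SELECT f.SalesAmount, d.CalendarDate
 FROM dbo.FactSales AS f
     INNER HASH JOIN dbo.DimDate AS d ON f.DateKey = d.DateKey;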

I know all too well how painful this type of situation can be. While we were very grateful that Microsoft fixed our problem, it took months to identify it and then months more before it was available in a service pack. Unfortunately for us, using the Legacy CE was not an option. Hopefully for you, one of these fixes will work!

Sunday, October 3, 2010

Nested Loops Join - No Join Predicate

I was recently troubleshooting a severe query performance issue when I ran into a query plan that used a nested loops join to combine data from two tables. The join was 97% of the cost. When I looked closer I saw that there was no join predicate specified. The join in the query looked pretty straightforward to me and used a single column on each side. Why would the optimizer do what amounted to a Cartesian join?

It turns out the query was a little more complicated than it looked. I was actually joining two views that were part of a view-based (non-persisted) dimensional model. I was joining on logical primary and foreign keys, which made sense.

When I dug into the views I found that one of the key columns was based on a user-defined function (UDF) that converted a datetime to an integer value so that it could be used to join to a date dimension view. Once I changed the underlying view to apply the same logic as the UDF inline at the view level, so that the call to the UDF was unnecessary, the query executed within a second as expected.
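
To illustrate the pattern (a hypothetical reconstruction - the actual function, views, and tables were client-specific):

 -- A scalar UDF that converts a datetime to an integer date key, e.g. 20100829
 CREATE FUNCTION dbo.fn_DateToKey (@dt DATETIME)
 RETURNS INT
 AS
 BEGIN
     RETURN CONVERT(INT, CONVERT(CHAR(8), @dt, 112));
 END;
 GO

 -- Problematic: the UDF on the join column hides the predicate from the
 -- optimizer, which can produce a nested loops join with no join predicate.
 SELECT f.SalesAmount, d.CalendarDate
 FROM dbo.FactSales AS f
     INNER JOIN dbo.DimDate AS d ON dbo.fn_DateToKey(f.SaleDate) = d.DateKey;

 -- Better: apply the same conversion inline so the optimizer can see and
 -- estimate the join predicate.
 SELECT f.SalesAmount, d.CalendarDate
 FROM dbo.FactSales AS f
     INNER JOIN dbo.DimDate AS d
         ON CONVERT(INT, CONVERT(CHAR(8), f.SaleDate, 112)) = d.DateKey;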

Another behavior I noticed was that if I changed the inner join to a left join, the optimizer came up with a different, much more efficient plan. This appears to be a flaw in the optimizer, but I would like to speak to someone at Microsoft before making that claim.

The lesson learned here is that if you have a poorly performing query due to a nested loops join with no join predicate(s), it's not necessarily the case that you're missing join criteria in your query (as all the posts I was able to find seemed to suggest). The culprit could be a UDF on a join column.

Sunday, August 29, 2010

OPTION (FORCE ORDER)

I was recently faced with a difficult performance issue. In an effort to save a client some time and money, I convinced them that a view-based dimensional model was the way to go over a table/ETL-based solution. I will blog about when this option is viable and the benefits of going this route in the near future. Anyhow, based on this decision I ended up with a performance issue that I wouldn't otherwise have had.

The view-based dimensional database consisted of 30-40 fact and dimension views. As I created these views I made sure to test them for performance issues and tuned accordingly. I tested each view on its own and I tested individual joins between fact and dimension views.

Everything was working fine until I began creating sample queries to show end users the kinds of queries they could write against the new database. What I found was that when a particular dimension view was joined to a fact view along with one or more other dimension views, I hit unexpected performance issues. Queries that should have taken a few seconds to execute were taking 45-50 seconds.

Looking at the execution plans, I saw that the optimizer was not doing what I expected based on the plans I had seen for the views executed on their own. I saw weird things, such as changing an INNER join to one of my views into a LEFT OUTER join resolving or partially resolving the issue.

I couldn't understand why the optimizer would treat the SQL in my views differently now that it was joined to other tables or views. To be honest, I'm still not sure. What I am sure of is that there is a query hint that forces the optimizer to preserve the table/join order contained in the view definition, effectively creating a plan for the view portion of the SQL irrespective of any other objects in the statement. This is the FORCE ORDER hint, and it can be added to the end of the SQL statement using the following syntax.

    OPTION(FORCE ORDER)

Below is an example.

SELECT *
FROM table1 t1
    INNER JOIN view1 v1 ON t1.col1 = v1.col3
OPTION (FORCE ORDER);