SQL Server SELECT: Nested Loops Join

Sunday, October 3, 2010

Nested Loops Join - No Join Predicate

I was recently troubleshooting a severe query performance issue when I ran into a query plan that used a nested loops join to combine data from two tables. The join was 97% of the cost. When I looked closer I saw that there were no join predicates specified. The join in the query looked pretty straightforward to me and used a single column on each side. Why would the optimizer do what equated to a cartesian join?

It turns out the query was a little more complicated than it looked. I was actually joining two views that were part of a view-based (non-persisted) dimension model. I was joining on logical primary and foreign keys which made sense.

When I dug into the views I found that one of the key columns was based on a user-defined fuction that was converting a datetime to an integer value so that it could be used to join to a date dimension view. Once I changed the underlying view to apply the same logic as the UDF at the view level, so that a call to the UDF was unecessary, the query executed within a second as expected.

Other behavior that I noticed was that if I changed the inner join to a left join, the optimizer came up with a different much more efficient plan. This appears to be a flaw in the optimizer but I would like to speak to someone at Microsoft before making that claim.

The lesson learned here is that if you have a poorly performing query due to a nested loops join with no join predicate(s) it's not ncessarily the case that you're missing join criteria in your query (as all the posts I was able to find seemed to point to). The culprit could be a UDF on a join column.

6 comments:

AnonymousDecember 20, 2010 at 8:37 AM
Awesome post. You saved me a ton of debugging time, and a query that was taking a half hour to return 17 rows got reduced to under a second just by moving the UDF code into the view.
ReplyDelete
Replies
Anonymous CasanovaMarch 11, 2013 at 9:16 PM
I have also noticed in some cases changing an INNER JOIN to a LEFT JOIN results in an order of magnitude decrease in query plan cost. Something odd seems to be happening there.
ReplyDelete
Replies
AnonymousAugust 4, 2013 at 5:23 PM
Great tips, thanks for sharing for the world to learn from. In my case, I had a view with a join clause that had a CASE statement in it. In one case it joined to one table, another case it joined to a different table. Imagine how the poor optimizer handled that.
ReplyDelete
Replies
SenaDecember 27, 2013 at 8:43 AM
Thanks for the tip. I was trying to optimize a simple query which looks OK. When I noticed the warning 'Nested Loops Join - No Join Predicate' in the execution plan, I started digging and eventually landed here.

I could not get rid of the function (due to business logic), I created a Temp table and first saved the results of the sub query with the function. The optimizer now have a physical file to look at, and worked as expected.
ReplyDelete
Replies
AnonymousJune 2, 2015 at 6:19 AM
OMG...thank you. I've spent an entire day on trying to figure out why I kept getting the Nested Loops - No Join Predicate....turns out I just had to change 2 of my joins to left joins. Luckily, it doesn't matter if I get nulls from these tables but this could be very aggravating if it did...
ReplyDelete
Replies
JoeJune 7, 2019 at 3:15 AM
You saved me a lot of time, thank you.

I had the same issue with a calculated join [join( select…) A on A.x=B.x].

select * from data
join (select …) A on A.DayDate=data.DayDate

Only the tipp with a tmp-Table worked, no CTE nor a dedicated view out of the inner join statement worked.
But changing the order of the joins and using a CTE like

with CTE(DayDate) as (select …)
select * from CTE
left join data on data.DayDate=CTE.DayDate

worked also
ReplyDelete
Replies

Add comment