GROUP BY and DISTINCT can return the same results when looking for uniqueness in a resultset. Consider the Sales table below.
Store Product Date
-------------------------------------
A Monopoly 1/10/2010
B Connect 4 1/12/2010
A Monopoly 1/14/2010
A Battle Ship 1/16/2010
B Othelo 1/16/2010
If you wanted to know what games each store sold from 1/10/2010 through 1/16/2010 you could write either of the following queries:
SELECT DISTINCT
Store,
Product
FROM
Sales
-OR-
SELECT
Store,
Product
FROM
Sales
GROUP BY
Store,
Product
So which one?? I hate to admit it but back when I first got into SQL performance tuning I was working in an Oracle 8i environment. If I remember correctly, GROUP BY performed better than DISTINCT for one reason or another. Well in the days of SQL Server 2005 and 2008, GROUP BY and DISTINCT perform equivalently. Therefore, my rule of thumb is to use DISTINCT when no aggregations on additional, non-unique columns are necessary. If you just want unique combinations of store and game use distinct. If you want to know how many of each game were sold by store then use GROUP BY.
SELECT [tips tricks and commentary] FROM [my experiences] WHERE subject IN ('database', 'data warehouse', 'business intelligence')
Monday, June 21, 2010
Wednesday, June 16, 2010
Unique Constraint or Unique Index?
A few things to keep in mind when deciding whether you want a unique index or constraint.
1) The index creation options other than FILLFACTOR that are available for a unique index are not available for a unique constraint.
2) A unique key can be referenced by a foreign key constraint but a column with a unique index cannot be referenced by a foreign key constraint.
3) A less subtle difference is related to the timing of validation. Constraints are checked before indexes and this can lead to a large multi-row insert/select or update to fail before modification. Often times indexes are validated at the end of a large modification. As a result it will take longer for a failure to occur with a rollback at the end of the modification.
1) The index creation options other than FILLFACTOR that are available for a unique index are not available for a unique constraint.
2) A unique key can be referenced by a foreign key constraint but a column with a unique index cannot be referenced by a foreign key constraint.
3) A less subtle difference is related to the timing of validation. Constraints are checked before indexes and this can lead to a large multi-row insert/select or update to fail before modification. Often times indexes are validated at the end of a large modification. As a result it will take longer for a failure to occur with a rollback at the end of the modification.
SQL Formatter Is Great for Auto Generated SQL
Ever had to try to decipher SQL that was generated by a report writer or sql generation tool such as Cognos Framework Manager? How about trying to read through a co-worker's messy SQL? Not fun.
One of my favorite tools is the Instant SQL Formatter. It will clean up any valid SQL statement based on user configurable options. I usually just stick with the defaults. It's a great tool and it's free. Can't beat that. One of the first things I do when I set up a machine at a new client site is to add it to my favorites in IE.
One of my favorite tools is the Instant SQL Formatter. It will clean up any valid SQL statement based on user configurable options. I usually just stick with the defaults. It's a great tool and it's free. Can't beat that. One of the first things I do when I set up a machine at a new client site is to add it to my favorites in IE.
SSIS Extracting Reading from Oracle 10g
Here are a few hints if your extracting data from an Oracle database using SSIS.
1) Use Oracle 11g client. The 10g client has bugs around the naming of the Program Files directory on 64-bit machines. It doesn't like the parantheses in Program Files (x86). There are workarounds for using the 10g client but it's been my experience that it's not worth the effort.
2) Make sure to install the 32-bit client for running SSIS in the IDE (debug mode) and the 64-bit client for the 64-bit SSIS runtime.
3) Use the Attunity connector for Oracle. This component can extract data up to 100 times faster than out of the box OLE DB for Oracle provider.
1) Use Oracle 11g client. The 10g client has bugs around the naming of the Program Files directory on 64-bit machines. It doesn't like the parantheses in Program Files (x86). There are workarounds for using the 10g client but it's been my experience that it's not worth the effort.
2) Make sure to install the 32-bit client for running SSIS in the IDE (debug mode) and the 64-bit client for the 64-bit SSIS runtime.
3) Use the Attunity connector for Oracle. This component can extract data up to 100 times faster than out of the box OLE DB for Oracle provider.
Subscribe to:
Posts (Atom)