As the first step in the decommissioning of sasCommunity.org the site has been converted to read-only mode.
Here are some tips for How to share your SAS knowledge with your professional network.
Tips:Counting and miscounting occurrences using SQL
Here's a simple question: How many company managers are not in the sales department? It sounds like a perfect candidate for an SQL sub-query. Count the number of managers who don't have an ID in the sales department.
proc sql; select count(distinct manager_id) 'Count of Non-Sales Managers' from company_master where manager_id not in (select employee_id from sales_department); quit;
In this particular situation, however, there's a catch. The company's CEO doesn't have a manager. Therefore, the CEO's manager_id value is missing in the company_master dataset. The SQL count function does not count missing values and the answer is wrong. The CEO, although being the highest level manager in the company, isn't counted.
One fix to this 'mis-count' is to add the counts of missing and existing values: Select count(distinct manager_id + nmiss(manager_id) 'Count of Non-Sales Managers'. But who would ever think of doing this when writing sub-queries? It's just not intuitive.
Instead of using a sub-query technique, an inline view would count correctly and provide the correct answer. Why? Because an inline view counts occurrences, not values.
proc sql; select count(∗) 'Count of Non-Sales Managers' from (select distinct manager_id from company_master except all select employee_id from sales_department); quit;
Programmers like sub-queries. They are easy to understand. They build look-up lists which can be searched with the outer query. However, they may not be bulletproof. Inline views are almost as easy to understand and may be more reliable.