As the first step in the decommissioning of the site has been converted to read-only mode.

Here are some tips for How to share your SAS knowledge with your professional network.

Be Aware of Implicit Pass-Through

From sasCommunity
Jump to: navigation, search

PROC SQL can automatically ask a non-SAS database to perform a join for it. While this can speed up query execution, it can also produce results which differ from those which PROC SQL alone would produce.

Here's a little example using Excel, of all things, as the supporting database. First we set up an Excel libref

libname xldemo excel 'xldemo.xls';

then a second libref, this time with the DIRECT_SQL=NO option, which prohibits implicit pass-through.

libname nopass excel 'xldemo.xls' direct_sql=no;

Next, we set up two little tables in the form of Excel worksheets:

proc datasets library=xldemo kill nolist; quit; 
data xldemo.a xldemo.b; do var = 1, (.), 3; output; end; run;

Now we ask SQL to join the two tables. First, we use the libref which prevents implicit pass-through:

proc sql;
title 'NO implicit pass-through join';
select *
 from nopass.a join nopass.b
  on a.var = b.var;

The result is

var       var
  1         1
  .         .
  3         3

Now we do the same join, but with the libref which, by default, permits implicit pass-through

title 'POSSIBLE implicit pass-through join';
select *
 from xldemo.a join xldemo.b
  on a.var = b.var;

The result is

a.var     b.var
    1         1
    3         3

The row with the two nulls (missing values) does not appear. That's because Microsoft join rules rather than SAS join rules are in effect. Microsoft's SQL implementation of SQL, like most but unlike the SAS implementation, does not consider two nulls to be equal.

There are warnings:

WARNING: The SCAN_TEXT option is ignored in the implicit pass-through joins.
WARNING: Use libname option DIRECT_SQL=NO to disable the implicit pass-through joins if you
         want the SCAN options on.

but they focus on the SCAN options whereas the implications of permitting implicit pass-through joins are broader.

Question: What warnings are issued when other data bases (not MS Office) are used?

Character Null vs. Blank in WHERE clauses

When comparing the value of a character value to blank in a WHERE clause, whether or not SAS or the DBMS evaluates the WHERE is important. In SAS, there is no null value for character. So character blank and character null are the same thing. However, in many databases, character fields with no value are likely stored with a null value rather than a blank value.

Thus in either SQL or the DATA step, the following where clause might not return any data:

  where VARIABLE = ' '

while a subsetting IF in the data step:

  if VARIABLE = ' '

will return data where the value of VARIABLE is null/or blank. The reason it works in the subsetting IF is because by the time the logical expression is evaluated, the value has been brought into SAS and so it is character blank and not null.

In order to avoid this trap, the is missing syntax should be used:

  where VARIABLE is missing

This issue was highlighted in a SAS-L thread.