PROC SQL by Example

From sasCommunity

Jump to: navigation, search

Contents

[edit] Overview

60500.gif

Title: PROC SQL by Example: Using SQL within SAS

Author: Howard Schreier

ISBN: 978-1-59994-297-1

First Printing: September 2008

Click on the image to order the book from SAS online bookstore.


[edit] Description

Howard Schreier's book explains and illustrates the use of PROC SQL in the context of the SAS DATA step and other SAS procedures (such as SORT, FREQ, MEANS/SUMMARY, APPEND, DATASETS, and TRANSPOSE) whose functionality overlaps and complements that of SQL. Using a side-by-side approach, this concise reference guide includes many extensively explained examples showing equivalent DATA step and SQL code. It enables SAS users to take advantage of existing SAS skills and knowledge while learning about SQL. Discussions cover the differences between SQL and the DATA step as well as situations where SQL and the DATA step are used together to benefit from the strengths of each.

[edit] Reviews

Read the first set of reviews about this book.

[edit] Errata

Page No(s). Severity Description
33 * The CREATE TABLE statement should be preceded by a PROC SQL statement and followed by a QUIT statement.
41 * The RUN statement which appears in lower case should be in upper case.
62 *** The first SELECT statement should be preceded by a PROC SQL statement and followed by a QUIT statement.
92 *** The hypothetical CREATE TABLE statement should be preceded by a PROC SQL statement and followed by a QUIT statement.
133, 135, 137, 139 ** The running head (chapter indicator preceding the page number) is incorrect.
135 *** In each of the two SELECT statements, whitespace is needed between the FROM and GROUP BY clauses.
  FROM         twelves
  GROUP BY     sex
135 ** There should be a blank TITLE statement (TITLE;) after the last SELECT statement. Without it, the last declared title will persist.
147 ** There should be blank TITLE and FOOTNOTE statements (TITLE; FOOTNOTE;) after the SELECT statement. Without them, the last declared title and footnote will persist.
150 ** The next-to-last %PUT statement is superfluous.
163 * In the DATA step, several tokens should be in upper case, as in
  DATA simple;
  DO Measure = 1 TO 3; OUTPUT; END;
  RUN;
176 *** Each of the two INSERT statements should be preceded by a PROC SQL statement and followed by a QUIT statement.
215 * The RUN statement which appears in lower case should be in upper case.
224 *** The SQL code should be preceded by a PROC SQL statement and followed by a QUIT statement.
252 *** The CREATE TABLE statement must be followed by a QUIT statement. The statements must be submitted together in order to get accurate timings.

Severity is reflected in the number of asterisks. One asterisk (*) indicates a minor issue such as a spelling or grammar mistake, or a deviation from style. Two asterisks (**) indicate ambiguity or other source of possible confusion. Three asterisks(***) indicate an incorrect statement (or code which produces incorrect results) where the mistake is fairly obvious. Four asterisks (****) indicate a subtle mistake.

[edit] Readers' Questions and Comments

Just edit this page or use the discussion tab.


Q: Some of the examples deliberately trigger ERROR conditions. How does one prevent those from "snowballing" and interfering with subsequent examples.

A: Within a PROC SQL session, run
RESET EXEC;
At the global level, run
OPTIONS NOSYNTAXCHECK OBS=MAX;

pp. 27-28: Is there a second form of CASE, parallel to SELECT (expression); etc.?

Yes. To illustrate, suppose that the destinations for the field trips depended only on age, with all 11-year-olds going to the zoo and all 12-year-olds to the museum. Adjusting the CASE structure in the book to these rules would result in
CASE WHEN age=11 THEN 'Zoo'
     WHEN age=12 THEN 'Museum'
     ELSE             '[None]'
     END
This can also be expressed more compactly as
CASE age WHEN 11 THEN 'Zoo'
         WHEN 12 THEN 'Museum'
         ELSE         '[None]'
         END

p. 29: What happens when PROC PRINT or DATA step are fed zero observations?

In both situations the behavior is like that of PROC SQL. If you run
PROC PRINT DATA=preteen;
WHERE age=10;
RUN;
you see
NOTE: No observations were selected from data set WORK.PRETEEN.
If you run
DATA tens;
SET preteen;
WHERE age=10;
RUN;
an empty (zero-observation) data set is created.

p. 45: Is this usage of DISTINCT (with a function) allowed in clauses other than SELECT?

p. 59: Names less cryptic than u1 and m1 would make things easier to follow.

p. 88: The documentation says that "NOT" can be coded directly before "IN" to negate the sense of comparison. Are there similar constructs for the other three subquery contexts?

pp. 89-91: Where are ANY and ALL documented? They don't appear in the SQL Procedure Component Dictionary?

ANY and ALL are explained as part of the general documentation of subqueries. See the section "Query Expressions (Subqueries)" in the "sql-expression" page, which does appear in the SQL Procedure Component Dictionary.

p. 95: Exactly how can the missing values be replaced with zeroes?

p. 119: The same table name (UNIONALL) was used in an earlier example. Elsewhere, SQL and DATA step examples use the same names for their output. It would be better to have distinct names for all these tables.

p. 173: How can PROC SQL drop the table when PROC DATASETS has just done exactly that?

p. 194: Why is the integrity constraint violation an ERROR in SQL but just a NOTE when the same thing is attempted in a DATA step?

p. 200: Shouldn't the SET statement in the second DATA step on the page be SET GDS_Demo? Similarly, shouldn't the second CREATE TABLE on p. 202 be FROM GDS_Demo? Wouldn't that also generate the recursion error?

pp. 229-230: The file names in the ODS HTML statements appear to be specific to Windows. Also, the code won't work unless &PATH is resolved.

p. 234: Why is the code to create ARRIVALS and DEPARTURES not shown?

p. 250: Why create the macro variable and include it in a %PUT statement when the outcome is not shown? Also, where did the the column heading in the output originate?

p. 261: The V. 9.2 doc includes more choices for object-item, specifically

*table-name.* 
specifies all columns in the PROC SQL table that is specified in table-name.
*table-alias.* 
specifies all columns in the PROC SQL table that has the alias that is specified in table-alias.
*view-name.* 
specifies all columns in the SAS view that is specified in view-name.
*view-alias.* 
specifies all columns in the SAS view that has the alias that is specified in view-alias.

Also, it says that a "SELECT Clause Lists the columns that will appear in the output", whereas this book seems to use "SELECT Clause" more broadly.

p. 271: There is no libref MYLIB assigned.


Visit the SAS Press site for this book.

E-mail the author.

Personal tools