Anatomy of a SQL query
proc sql; ... quit;
What makes a valid query?
The necessary clauses of a basic query are SELECT and FROM. Other components are optional. Within a clause, square brackets () represent optional components. Note that you can use lower case for these keywords. Using all uppercase letters is done here for demonstration; lowercase letters may be preferred style.
CREATE TABLE | VIEW new.table_name AS [VALIDATE] SELECT col1 [as newName format fmtName.] [* /* use an asterisk to select all columns from a table or series of tables */] [INTO :var1 [separated by "<character>"]] [except <subquery>] FROM library.table_name [optional alias letter(s)] (in-line view allowed here, e.g. subquery [EXCEPT subquery]) [inner/left/right/full join table_name on ...] WHERE (sub-query allowed here) and or GROUP BY HAVING (sub-query allowed here) ORDER BY var1 [DESC], var2, ...; [INTERSECT | OUTER UNION [CORR] | UNION [ALL] SELECT...]
For more information on joins, which are performed in the FROM clause, see SQL joins.
Subqueries and in-line queries
Subqueries are queries-within-queries. For example:
proc sql; create table current_students as select * from students where id not in (select id from graduated_students); quit;
In-line views are similar: 
proc sql; create table current_students select * from (select * from all_students where status = "current"); quit;
While ANSI standard SQL may be used in SAS, there are also some additional capabilities in SAS. For example,
and may be used for subsetting in the WHERE clause as an English equivalent of the math operators > and <. For more information, see SQL vs SAS.
If a column name contains spaces, use
"Example Name"n in your query.
You can use Pass Through Queries to query other databases (DB2, Hadoop, Oracle, SQL, Teradata, etc.) using query language native to that database. The syntax changes from the above to use:
proc sql; connect to <dbase engine> (user=... password=... server=... path=... <other parameter(s)>) ... FROM connection to <dbase engine> (<dbase engine-specific query)) disconnect from <dbase engine> ... quit;
You can alternatively use a libname statement to connect to an external database, and proceed as if it were a SAS library. By doing this, SAS will autogenerate the actual query passed, and may thus be less efficient than writing your own pass-through query.
There are several mnemonics for remembering the valid order of SQL queries, specifically Select - From - Where - Group by - Having - Order by. Some are:
- So Few Workers Go Home On-time
- Some Fishy Workers Get Hooked On-line
- Sure Fire Way to Get Huge Output
- Sly Foxes Watch Geese Hide Offspring
- Some French workers glue hardwood often
- San Francisco, where the Grateful Dead heads originated
Here are some example table manipulation statements:
insert into tableName set var1 = value1, var2 = value2 set var1 = value3, var2 = value4; insert into tableName(column1, column2, column3, column4) values (val1, val2, val3, val4); insert into tableName values (val5, val6, val7, val8); update tableName set var1 = newValueOrExpression where var2 is null; alter table tableName add var1 [type, format, label, etc.] drop var1 [, var2, ...] modify var1 [new format]; drop index index-name from table-name; create [unique] index index-name on table-name(col-name1, col-name2, ...); describe table tableName; drop table table-name [,table-name-2, ...];
Some examples of altering and updating may be found here.
PROC SQL options
You can also pass options in to the
proc sql statement. INOBS= and OUTOBS= define how many observations SAS will read and write, respectively. FEEDBACK will print to the log all column names when SELECT * is used. NOEXEC will allow you to verify your syntax, without executing the query or queries within your PROC SQL block. For more information, see SAS Support.
- An Introduction to SQL in SAS®
- PROC SQL vs. DATA Step
- infographic about SQL query basics
- SQL joins
- learn about a table's schema with Proc SQL as Proc Contents
- Comparing PROC SQL with the SAS DATA Step
- Tips and Tricks in SQL
- Many Uses of SQL Subqueries
- Note that a query contains only one semicolon after the final clause, whether it be FROM, WHERE, GROUP BY, HAVING, or ORDER BY. If one of the set operators are used (intersect, union, etc.), the semicolon is omitted and is inserted after the final query.
- Read more from the SAS blog post How many records are in that PROC SQL result?
- from the SAS SQL 1: Essentials training
- from Dear Miss SASAnswers: A Guide to Efficient PROC SQL Coding