As the first step in the decommissioning of sasCommunity.org the site has been converted to read-only mode.


Here are some tips for How to share your SAS knowledge with your professional network.


SQL Allows Multiple Columns with Same Name

From sasCommunity
Jump to: navigation, search


As part of the decommissioning effort for sasCommunity.org this article/tip has been migrated to communities.sas.com.

The new home for this article/tip is SQL Allows Multiple Columns with Same Name (https://communities.sas.com/t5/SAS-Communities-Library/SQL-Allows-Multiple-Columns-with-Same-Name/ta-p/475974)



In a DATA step, the program data vector does not allow two variables to have the same name. SQL is different. The namespace for a query can have multiple instances of the same column name.

Usually, the SQL processor will generate an error message if there is an ambiguous reference to a column name. There are still situations, however, where it is quite possible to get into trouble.

Here is an example:

proc sql;
 
create table one as
 select name, height, round(weight) as Size
  from sashelp.class
  where sex='M';
 
title 'Boys Taller Than 66 Inches';
 
create table two as
 select name, round(height) as Size
  from one
  where size > 66;
 
select * from two;

The result is

Boys Taller Than 66 Inches     
Name Size ------------------ Alfred 69 Henry 64 James 57 Jeffrey 63 John 59 Philip 72 Robert 65 Ronald 67 Thomas 58 William 67

There are lots of values of 66 or less. Why? It has to do with the order in which the parts of the query are processed. The WHERE filtering is done first. In fact, it is passed up to the data engine, which sees only the columns in the header of data set ONE. So it is not aware of the ambiguity and works with the SIZE values it finds in ONE, which of course reflect weights, not heights.

One way to fix this is to change the WHERE clause and specify (by means of the keyword CALCULATED) that SIZE refers to the new column derived from heights, as in:

create table three as
 select name, round(height) as Size
  from one
  where calculated size > 66;
 
select * from three;
 
quit;

That tells the processor to use the new SIZE columns and forces the WHERE filtering to be done after that column is derived. It produces, as expected:

Boys Taller Than 66 Inches     
Name Size ------------------ Alfred 69 Philip 72 Ronald 67 William 67

While multiple like-named columns are tolerated in SQL's workspace, they cannot be stored in SAS data sets. So if query results are feeding a CREATE TABLE or CREATE VIEW statement, only the first one will be saved, and warnings will be issued for subsequent columns with duplicative names. Of course unambiguous names can be specified to avoid this problem.