Solutions in the Round -- Making HASH of Arrays: Pros and Cons


Introduction

From the WUSS abstract for this topic: Do you use HASH, arrays, or both regularly? Do you avoid HASH and ARRAYS like the plague? If you answered yes to either question, then this session is for you. Join a discussion of the advantages and disadvantages of each, and how to use them effectively. Solutions in the Round (SITR) is a roundtable-format section where attendees discuss approaches and solutions to a programming problem or SAS topic. Users from different perspectives and all experience levels are encouraged to participate.

Discussion at WUSS

Facilitators:

  • Elizabeth Axelrod (EA)
  • Kirk Lafler (KL)


How many use hashes and arrays?

 Hashes 2 / 
 Array  10 / 

KL: Hash

 Objects
 Memory
 Data Step Construct
 Paul Dorfman  (Jacksonville)
 26 Methods
 Lookups
 Merges
 Sorts
 Transposes
 Random Access
 I/O

KL: Array

 Repetitive Data (e.g. month1-12)
 Data Step Construct
 Memory
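
A minimal sketch of the repetitive-data case (month1-12) mentioned above. The data set `work.sales`, the variable `customer`, and the values are made up for illustration; only the array technique comes from the discussion.

```sas
/* Hypothetical data: one row per customer with twelve monthly columns. */
data work.sales;
   input customer $ month1-month12;
   datalines;
A 10 20 30 40 50 60 70 80 90 100 110 120
B 5 . 15 20 25 30 35 40 45 50 55 60
;
run;

data work.sales_clean;
   set work.sales;
   array month{12} month1-month12;      /* group the 12 columns under one name */
   do i = 1 to dim(month);
      if missing(month{i}) then month{i} = 0;   /* same rule applied to every month */
   end;
   annual_total = sum(of month{*});     /* total across all array elements */
   drop i;
run;
```

The array exists only while this DATA step runs; the underlying variables month1-month12 are what get written to the output data set.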

KL: Use OPTIONS FULLSTIMER; (writes detailed time and memory statistics for each step to the log)
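
A short sketch of how the option might be switched on around a step you want to measure. The PROC SORT step and the output name `work.cars_sorted` are placeholders; any step could be measured the same way.

```sas
options fullstimer;     /* write detailed timing and memory statistics to the log */

proc sort data=sashelp.cars out=work.cars_sorted;
   by make model;
run;

options nofullstimer;   /* switch the extended statistics off again */
```

Comparing the log notes for a hash-based version and an array- or sort-based version of the same task is one way to check which is cheaper on your own data.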


Q: What hash algorithms does SAS use?

 EA:  Users can specify some algorithms in SAS, such as MD5 (available as the MD5 function)
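
A minimal sketch of calling the MD5 function directly. This only illustrates the function EA mentioned; it is not a description of how the hash object stores or locates keys internally, which the discussion did not cover. The input string is arbitrary.

```sas
data _null_;
   length digest $32;
   /* MD5 returns a 16-byte binary value; $HEX32. renders it as 32 hex characters. */
   digest = put(md5('sasCommunity'), $hex32.);
   put digest=;
run;
```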


KL: Use arrays / hash with SAS macros

 e.g. find "n-levels"
 -> data-driven programming
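
One way the "n-levels" idea could look in practice, as a sketch only: count the distinct levels of a variable, keep the count in a macro variable, and let it size an array. The names `n_levels`, `work.class_sorted`, and `work.level_counts` are hypothetical, the example assumes the variable has no missing values, and the TRIMMED keyword assumes SAS 9.3 or later.

```sas
/* Step 1: count the distinct levels of AGE and store the count in a macro variable. */
proc sql noprint;
   select count(distinct age) into :n_levels trimmed
   from sashelp.class;
quit;

/* Step 2: sort so that each level arrives as one BY group. */
proc sort data=sashelp.class out=work.class_sorted;
   by age;
run;

/* Step 3: size the array from the macro variable; one counter per level. */
data work.level_counts(keep=level1-level&n_levels);
   set work.class_sorted end=last;
   by age;
   array level{&n_levels} level1-level&n_levels;
   retain level1-level&n_levels 0;
   if first.age then idx + 1;   /* move to the next counter at each new level */
   level{idx} + 1;              /* count the rows in the current level */
   if last then output;         /* write one row holding all the counters */
run;
```

Nothing in steps 2 and 3 hard-codes the number of levels, so the same program adapts when the data changes.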


EA: For those who haven't used / don't use Hash

 -> lots of great papers, but very technical 
 -> Hash is part of Base SAS
 -> EXTREMELY useful
 e.g. Lookups with small tables in memory 
 ->  Reduce disk I/O, speed up program runs
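
A sketch of the small-table lookup described above. The lookup table `work.sex_lookup` and its labels are invented for illustration; the pattern (load once, then FIND for each row) is the common hash lookup idiom, not something specific to the session.

```sas
/* Hypothetical small lookup table: one row per sex code. */
data work.sex_lookup;
   length sex $1 sex_label $6;
   input sex $ sex_label $;
   datalines;
F Female
M Male
;
run;

data work.class_labeled;
   if 0 then set work.sex_lookup;   /* define SEX and SEX_LABEL without reading any rows */
   if _n_ = 1 then do;
      declare hash lk(dataset: 'work.sex_lookup');   /* load the lookup table into memory once */
      lk.defineKey('sex');
      lk.defineData('sex_label');
      lk.defineDone();
   end;
   set sashelp.class;
   /* FIND returns 0 on a match and copies SEX_LABEL into the current row. */
   if lk.find() ne 0 then call missing(sex_label);
run;
```

Neither data set needs to be sorted, and the lookup table is read from disk only once.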


 E.g. Build aggregate tables in memory
 ->  collect summary data during 1st pass!


Q: SORTs are expensive

 BJ:  Sorting is O(N log N)
 Hash 
 - compute aggregates on 1st pass
 - skip PROC SORT when saving summary


 JP: SORT can be less expensive
 -  Indexes 
 -  threading 
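
A sketch of the index alternative JP raised, assuming a copy of `sashelp.class` in WORK. Whether an index actually beats a sort depends on the data and how often it is reused, so treat this as an illustration rather than a recommendation.

```sas
data work.class;
   set sashelp.class;
run;

proc datasets library=work nolist;
   modify class;
   index create sex;      /* simple index on SEX */
quit;

/* BY-group processing can use the index, so no PROC SORT is needed here. */
data work.sex_counts(keep=sex n_students);
   set work.class;
   by sex;
   if first.sex then n_students = 0;
   n_students + 1;
   if last.sex then output;
run;
```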


 JP: Hash table/array vs MACRO list 
 -  Hash / array support numeric data types (macro variable values are text)
 -  Macro variables can be global (a hash / array is local to one DATA step)
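
A sketch of the macro-list side of this comparison. The macro variable name `make_list` and the filter on `origin` are chosen only for illustration.

```sas
/* Build a quoted, comma-separated list of values in one macro variable. */
proc sql noprint;
   select distinct quote(strip(make)) into :make_list separated by ','
   from sashelp.cars
   where origin = 'Europe';
quit;

/* The macro variable holds text and is still available in later steps. */
data work.europe_cars;
   set sashelp.cars;
   where make in (&make_list);
run;
```

A hash or array built in the first step would be gone by the time the second step runs, which is the scope difference noted above.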


KL: SASFILE (store a SAS file in memory)

 -  Great when reusing one or more datasets throughout a program (across multiple steps)
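
A sketch of SASFILE around several steps that read the same data set. The copy `work.cars` and the two procedures are placeholders for whatever steps reuse the data.

```sas
data work.cars;
   set sashelp.cars;
run;

sasfile work.cars load;      /* open the data set and hold it in memory */

proc means data=work.cars;   /* first step reads from memory */
   var msrp;
run;

proc freq data=work.cars;    /* second step reuses the same in-memory copy */
   tables origin;
run;

sasfile work.cars close;     /* release the memory when finished */
```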


REFS: www.lexjansen.com

 Dorfman, Paul
 Henderson, Don
 Eberhardt, Peter
 Loren, Judy
 Axelrod, Elizabeth
 Carpenter, Art
 Lafler, Kirk
 Miller, Ethan
 <lastname> , Marc (SAS)
 Burlew, Michelle 
 Secosky, Jason  (SAS)
 GOOGLE Search:  
   e.g. "sas hash filetype:sas"

Further Discussion -- Open to All

Please join the conversation! Also, if you were one of the live participants, please feel free to correct any mistakes or omissions from our original discussion.