Style guide for writing and polishing programs

From sasCommunity

TinyUrl for this page: http://tinyurl.com/25y2c4m

http://www.sascommunity.org/wiki/Style_guide_for_writing_and_polishing_programs

A Modest Proposal for

A Style Guide for Writing and Polishing Programs

for Reading and Reuse or Testing

Author: Ronald_J._Fehd

Introduction: Why bother?

According to Brooks, in his book The Mythical Man-Month, half of program development time is spent in testing. This means that a program is bound to be read many times, often by people other than its author.

How is reading a program different from reading literature?

Literature is read sequentially; as the King says in Alice's Adventures in Wonderland: "Begin at the beginning, and go on till you come to the end: then stop."

Programs are not 'read' with the same purpose as literature. Reading a program consists of three actions:

1. Scanning for step boundaries or other blocks of code

2. Searching, or drilling down, to statements within the block

3. Checking off statements and options

Hint: alphabetizing keywords increases readers' searching speed.

Reasons and Benefits of Using a Style Guide

Professional Reasons:

1. You are a professional.

2. Professionals have a discipline.

3. You practice that discipline, daily.


Professional Benefits:

1. Easier reading makes desk checking faster.

2. Clear layout makes testing and debugging faster.

3. Good documentation makes reuse easier.


Personal Benefits:

1. Faster development means you're finished sooner.

2. Finishing early or on-time gets you home sooner.

3. Sooner is better than later.

The Style Guide

Documentation

  • Documentation: Requirements Specifications and Data Dictionary
  • Documentation: Program Header
  • Documentation: internal comments in program
  • Documentation: test suite of the program, both unit and integration tests
  1. Requirements Specifications: http://en.wikipedia.org/wiki/Software_Requirements_Specification
  2. Data Dictionary: http://en.wikipedia.org/wiki/Data_dictionary
  3. Program Header documentation:
    1. http://en.wikipedia.org/wiki/Program_documentation
    2. http://en.wikipedia.org/wiki/Literate_programming
  4. Comments: http://en.wikipedia.org/wiki/Comment_%28computer_programming%29

Considerations while Typing

These are categories of ideas about formatting programs for readability.

  • Case
  • Comments
  • Indentation
  • Linesize: cross-platform issues
  • Naming Conventions

Case

  • use lower case

Exception: use ALL CAPS for DATA, PROC. Note: this facilitates scanning a page for step boundaries.

Exception: use Initial Caps or InternalCaps for nouns: names of data sets, variables, arrays, macro names, etc.

See also Celko's SQL Programming Style for citation of why initialLowCase slows down reading speed.
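
A minimal sketch of the case rules applied together. It reads SAShelp.Class, which ships with SAS; the output data set Work.Teens and the variable BodyMassIndex are names assumed here for illustration only.

```sas
* step keywords in ALL CAPS mark the step boundaries;
DATA Work.Teens;
     * nouns in InternalCaps: data sets and variables;
     set   SAShelp.Class;
     where Age ge 13;
     * statement keywords in lower case;
     BodyMassIndex = 703 * Weight / Height**2;
run;

PROC Print data = Work.Teens;
           var    Name Age BodyMassIndex;
run;
```

Scanning down the left margin, only DATA and PROC stand out, so each step boundary can be found without reading the statements in between.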

Comments

SAS provides three types of comments:

 /*comment block*/
*comment statement;
%*macro comment statement;

Any of the above may contain multiple lines.

Note: macro comment statements may occur inside SAS statements:

attrib RowID  length = 4 %*integer;
       Height length = 8 %*numeric real;
       ;

The slash+asterisk comment block is often used for banners:

/****************************************
This is a banner containing documentation
****************************************/

NOTE: On mainframe operating systems (MVS, z/OS, or BigIron), slash+asterisk in columns one and two, as illustrated in the above banner, is interpreted as EndOfJob, or EndSAS.

Therefore a recommendation:

 /********************************************
Ensure space on column 1 before slash+asterisk
**********************************************/
 
;/**************************
... or fill with a semicolon
***************************/

The slash+asterisk comment block is useful in debugging and testing since it can disable many statements at once. Therefore a recommendation: use SAS comment statements and macro comment statements in your code, so that when you wrap a section in a slash+asterisk comment block, no small slash+asterisk comment inside it closes your block early.
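
A short sketch of why this recommendation pays off during testing. The data set Work.Demo and the variables HeightCm and WeightKg are assumed names for illustration; SAShelp.Class ships with SAS.

```sas
* statement comments leave the block comment free for testing;
DATA Work.Demo;
     set SAShelp.Class;
 /* disabled while testing: the whole span is skipped
     * convert inches to centimeters;
     HeightCm = Height * 2.54;
     %*convert pounds to kilograms;
     WeightKg = Weight * 0.4536;
 */
run;
```

Had the two inner comments been written as slash+asterisk blocks, the first asterisk+slash would have closed the outer block, and the statements after it would have executed.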

Indentation

  • white space is A Good Thing, but where and how much? The purpose of white space is to
    • emphasize hierarchy: see indentation
    • align similar tokens in columns
    • separate paragraphs vertically with newlines

Celko addresses this issue in his book; see also rivers of white space: http://en.wikipedia.org/wiki/River_%28typography%29

  • tabs or spaces? use spaces; later readers may have assigned a different number of spaces per tab on their computer or editor, which effectively destroys your indentation

  • indentation: use hanging indent style, indent word-1 of succeeding lines to align with word-2 in first line

Example:

* data step;
if SomeCondition then do;
   Assignment1 = 'value';
   Assign2     = value;
   end;
 
* macro statements and else;
%if   &SomeCondition1. %then %do;
      %let Variable1 = value;
      %let Var2      = value2;
      %end;
%else %if &SomeCondition2. %then %do;
      %let Variable1 = value;
      %let Var2      = value2;
      %end;

compare to:

*data step;
if SomeCondition then do;  Assignment1 = 'value';
                           Assign2     = value;
                      end;
 
if SomeCondition then 
   do;  Assignment1 = 'value';
        Assign2     = value;
   end;
 
if SomeCondition 
   then do;  
      Assignment1 = 'value';
      Assign2     = value;
   end;

Note vertical alignment of equals signs (=)

  • Use Two.Level Data.Set Names
* note: illustrates procedure templates;
*       and use of white space to align columns;
 
PROC Freq data   = SAShelp.Class
          order  = freq;
          tables   Sex
                 / list missing noprint
             out = Work.ProcFreqOutput;
 
PROC Sort data = SAShelp.Class
          out  = Work.Sorted
                 nodupkey;
          by     Sex;
run;

Linesize

  • maximum line length: 72

For cross-platform portability ensure that no text on a line extends beyond column 72.

Naming Conventions

Use underline as first character of names

SAS uses underline as prefix and suffix for temporary variables in data step, e.g.: _All_ _Error_ _N_

If you use underline as the first character to indicate a temporary variable, then you can use this statement to drop all variables whose first character is underline:

drop _:;
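
As a sketch of the convention in context, the step below uses two temporary conversion factors and drops them with the name-prefix list. The names Work.ClassMetric, _InchToCm, _LbToKg, HeightCm, and WeightKg are assumptions made for this example.

```sas
DATA Work.ClassMetric;
     set SAShelp.Class;
     * temporary variables: names begin with an underline;
     _InchToCm = 2.54;
     _LbToKg   = 0.45359237;
     HeightCm  = Height * _InchToCm;
     WeightKg  = Weight * _LbToKg;
     * name-prefix list drops every variable starting with underline;
     drop _:;
run;
```

Adding another temporary variable later requires no change to the drop statement, as long as its name starts with an underline.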

Use sort order to your advantage.

  • Series: For list of variables with suffixes greater than 9: use leading zeroes: instead of Q1, Q2, ... Q9, Q10, use Q01, Q02, ... Q10
  • Date-stamps: in file or folder names: use ccYY-MM-DD, i.e.: 2012-01-01 == 2012-Jan-01. This convention provides a directory listing sorted from earliest to latest. Compare SysDate9: ddMmmccYY : 01Jan2012.
  • Date-Time stamp: A datetime requires 22 characters 2012.01.12 01:02:3.456; this can be reduced to 16 characters using hex16.
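
The three conventions above can be sketched as follows. The macro variable names DateStamp and TimeStamp and the data set Work.Survey are assumptions for illustration; the formats yymmdd10. and hex16. are standard SAS formats.

```sas
* date stamp as ccYY-MM-DD, sorts from earliest to latest;
%let DateStamp = %sysfunc(today(), yymmdd10.);
%put DateStamp = &DateStamp.;

* datetime stamp as 16 hexadecimal characters;
%let TimeStamp = %sysfunc(datetime(), hex16.);
%put TimeStamp = &TimeStamp.;

* series with leading zeroes, Q01 through Q10;
DATA Work.Survey;
     array Q[10] Q01 - Q10;
     do _I = 1 to dim(Q);
        Q[_I] = .;
        end;
     drop _:;
run;
```

The date stamp can then be appended to a file or folder name; an alphabetical listing of the variables in Work.Survey places Q10 after Q09 rather than after Q1.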

Echo: use SAS parameter names as the names of macro variables

%Let Data   = sashelp.class;
%Let Class  = Sex;
%Let Vars   = Height Weight;
 
PROC Which data   = &Data.;
           class    &Class.;
           var      &Vars.;

References

to Show Calling Sequence and Parameters of Routines]

Steve McConnell, Code Complete http://cc2e.com/ Chapters: 35.

Fred Brooks, The Mythical Man-Month http://en.wikipedia.org/wiki/The_Mythical_Man-Month

Joe Celko's SQL Programming Style http://www.elsevier.com/wps/find/bookdescription.cws_home/705199/description#description Chapters: 10, 200 p., resources, bibliography, index: 20 p.

See also: Journeymens_Tools

-- created by User:Rjf2 10:54, 7 May 2007 (EDT)

--Ronald_J._Fehd macro.maven == the radical programmer