4
Dec

Controlling your formats

During my 35 years of using SAS® software, I have found the CNTLIN and CNTLOUT options in the FORMAT procedure to be among the most useful features that I routinely suggest to other SAS users. The CNTLIN option enables you to create user-defined formats from a SAS data set (input control data set). The CNTLOUT option enables you to create a SAS data set (output control data set) containing format details from an entry in a SAS format catalog.

In this blog post, I provide a few examples demonstrating how to use the CNTLIN option. I also mention how to use the CNTLOUT option to store your format information in case you need to move to a new operating environment.

You can store all the format details from a SAS format catalog in a CNTLOUT data set and later restore them in a format catalog in your new operating environment using the CNTLIN option. For details, see SAS Usage Note 22194: “How to use the CNTLOUT= and CNTLIN= options in PROC FORMAT to move formats from one platform to another.”

A data set for the CNTLIN option contains variables that give specific information about ranges and values. At a minimum, the data set must contain the following variables:

FMTNAME specifies a character variable whose value is the format or informat name.
START specifies a variable that gives the range's starting value.
LABEL specifies a variable whose value is associated with a format or an informat.

For details about input and output control data sets, see the “FORMAT Procedure” section of Base SAS® 9.4 Procedures Guide, Seventh Edition.

Create a Numeric Format

The following simple example using the CNTLIN option creates a numeric format named respf:

 data test;                                         
    input response desc $20.;                       
 datalines;                                         
 1  Strongly Disagree                               
 2  Disagree                                        
 3  Neutral                                         
 4  Agree                                           
 5  Stongly Agree                                   
 ;                                                  
 run;                                               
 
 data crfmt;                                        
    set test;                                       
    start=response;                                 
    label=desc;                                     
    fmtname='respf';                                
 run;                                               
 
 proc format library=work cntlin=crfmt fmtlib;      
    select respf;                                   
 run;

Controlling Your Formats

Reveal Data Set Variables

To see the other variables that are included in data sets created by the CNTLIN and CNTLOUT options, use CNTLOUT to create a data set for the respf format created above:

 proc format library=work cntlout=outfmt;       
    select respf;                               
 run;                                      
 proc print data=outfmt;                        
 run;

Add Additional Ranges

To add another range to the respf format, you can use DATA step processing with the data set created by the CNTLOUT option. Then, re-create the format using the CNTLIN option:

data infmt;                                               
    set outfmt end=last;                                   
    output;                                                
    if last then do;                                       
       HLO='O';  /* indicates a special other range  */      
       label='NA';                                         
       output;                                             
    end;                                                   
 run;                                                     
 
 proc format library=work cntlin=infmt fmtlib;             
    select respf;                                          
 run;

Convert a State Name to Its Postal Abbreviation

One use for the CNTLIN option is to create a format that converts a state name to its 2-letter postal abbreviation. For example, this option can convert 'North Carolina' to 'NC'.  Because SAS does not have a function or format to convert state names to postal abbreviations, this is an excellent use of the CNTLIN option.

We can use data from the SASHELP.ZIPCODE data set to create a user-defined format using the CNTLIN option, as shown below:

proc sql noprint;                               
    create table crfmt as                        
    select distinct statename as start,          
           statecode as label,                   
           '$mystate' as fmtname                 
    from sashelp.zipcode;                        
 quit;                                           
 
 proc format library=work cntlin=crfmt fmtlib;   
    select $mystate;                             
 run;

Identify State Capitals

In a similar manner, we can use the MAPS.USCITY data set to create a user-defined format that identifies state capitals from the 2-letter state abbreviation. See the sample code and partial results below:

proc sql noprint;                                 
   create table crfmt as                          
   select distinct statecode as start,            
          city as label,                          
          '$mycity' as fmtname                    
   from maps.uscity                               
   where capital='Y';                             
 quit;                                            
 
proc format library=work cntlin=crfmt fmtlib;     
   select $mycity;                                
run;

Use External Data Sources

You can gather information from external data sources and read that information into a data set created by the CNTLIN option to create user-defined formats.

The following example uses ICD10 medical diagnosis codes. I downloaded a list of ICD10 codes and their descriptions into a Microsoft Excel file from the Center for Medicare & Medicaid Services website. Then, I created a user-defined format from the first 25 records:
Note: You can also download the codes as a text file.

/* This code reads in the Excel file.   */                                                                  
proc import out==myicd10                                              
   datafile= "C:Section111ValidICD10-2017.xlsx"   
   dbms=excelcs replace;                                                
   range="'Valid ICD10 2017 & NF Exclude$'";                            
   scantext=yes;                                                        
   usedate=yes;                                                         
   scantime=yes;                                                        
run;                                                                    
 
 
data crfmt;                                         
   set myicd10 (obs=25);                         
   fmtname='$myicd';                                
   start=code;                                      
   label=short_description;                         
run;  
 
title1 'ICD10 Format';                                                      
title3 'FMTLIB results only display the first 40 characters of the label';  
proc format library=work cntlin=crfmt fmtlib;       
   select $myicd;                                   
run;

A more complicated example that uses other data set variables created by the CNTLIN option is included in the linked sample program in Sample 47312: “Create a user-defined format containing decile ranges from PROC UNIVARIATE results.”

If you can think of a scenario in which the CNTLIN format would be helpful, give it a try. If you have questions, you can ask via  SAS Communities or contact us in SAS Technical Support.

Controlling your formats was published on SAS Users.

17
Nov

The ODS Destination for PowerPoint Has a New Option

Have you heard?  The ODS Destination for PowerPoint Has a New Option

It’s true.  The ODS destination for PowerPoint now has the STARTPAGE= option, which provides you with greater control and flexibility when creating presentations.

Added to the ODS POWERPOINT statement in SAS® 9.4TS1M4, the STARTPAGE= option enables you to force the creation of a new slide between procedures and between ODS layout containers.  Inserting a slide break between layout containers is one of the most impactful ways that you can use this option.

A new layout container does not automatically trigger a new slide within the presentation.  A new slide is started when the current slide is full.  That is the default, but the new STARTPAGE= option gives you the ability to start a new slide between containers even if the current slide is not full.

Examples

Shown below are four procedures placed within three layout containers.

  • The first PROC ODSTEXT step is placed in the first layout container.  Its purpose is to generate a slide of text, with that text roughly centered in the slide.
  • The second PROC ODSTEXT step is in the second container.  Its purpose is to provide useful information about the table and graph.
  • The PROC TABULATE and SGPLOT steps make up the third container.  They are the results of the analysis and, as such, need to be displayed side by side.

Default Behavior of ODS POWERPOINT, Without STARTPAGE=

Let’s look at the default behavior.  In this example, the STARTPAGE= option is not used.

ods powerpoint file='example1.pptx' options(backgroundimage="saslogo_pptx.png");
title;
ods layout gridded x=10% y=25%;
proc odstext;
	p "Have you heard?" /style=[just=c fontsize=42pt color=RoyalBlue];
	p "The STARTPAGE= option has been added to the ODS POWERPOINT statement!" /style=[just=c fontsize=24pt];
run;
ods layout end;
 
ods layout gridded rows=1 columns=1;
   ods region;
   proc odstext;
      p 'Table Shows Total Runs and Hits for Each League';
      p 'The Graph Contains One Bubble for Each Player.  The Size of Each Bubble Represents the Magnitude of the RBIs.';
   run;
ods layout end;
 
ods graphics / width=4.5in height=4in;
ods layout gridded columns=2 column_widths=(47% 47%) column_gutter=1pct;
   ods region;
   proc tabulate data=sashelp.baseball;
      class league;
      var nruns nhits nrbi;
      tables league='', sum='Totals'*(nruns nhits)*f=comma12.;
   run;
 
   ods region;
   proc sgplot data=sashelp.baseball;
      bubble x=nhits y=nruns size=nrbi/ group=league transparency=.3;
   run;
ods layout end;
ods powerpoint close;

Here is the resulting slide output:

ODS Destination for PowerPoint

Those results are not what we hoped they would be.  The output from the second PROC ODSTEXT step, which is to provide information about the table and graph, is on the first slide.  So is the graph!!!  And the graph does not look good because it is the wrong size.  The table is by itself on the second slide.  This is not the desired output at all.

Here Is Where STARTPAGE= Helps!

In this example, an ODS POWERPOINT statement with the STARTPAGE= option is added.  It is placed after the ODS LAYOUT END statement for the first container.

ods powerpoint file='example2.pptx' options(backgroundimage="saslogo_pptx.png");
title;
ods layout gridded x=10% y=25%;
proc odstext;
	p "Have you heard?" /style=[just=c fontsize=42pt color=RoyalBlue];
	p "The STARTPAGE= option has been added to the ODS POWERPOINT statement!" /style=[just=c fontsize=24pt];
run;
ods layout end;
 
ods powerpoint startpage=now; /* <---- Triggers a new slide */
 
ods layout gridded rows=1 columns=1;
   ods region;
   proc odstext;
      p 'Table Shows Total Runs and Hits for Each League';
      p 'The Graph Contains One Bubble for Each Player.  The Size of Each Bubble Represents the Magnitude of the RBIs.';
   run;
ods layout end;
 
ods graphics / width=4.5in height=4in;
ods layout gridded columns=2 column_widths=(47% 47%) column_gutter=1pct;
   ods region;
   proc tabulate data=sashelp.baseball;
      class league;
      var nruns nhits nrbi;
      tables league='', sum='Totals'*(nruns nhits)*f=comma12.;
   run;
 
   ods region;
   proc sgplot data=sashelp.baseball;
      bubble x=nhits y=nruns size=nrbi/ group=league transparency=.3;
   run;
ods layout end;
ods powerpoint close;

The STARTPAGE= option gave us exactly what we need.  The first slide contains just the text (from the first layout container).  The second slide contains more text along with the table and graph (from the second and third layout containers).

Use It Wisely!

The most important thing to know about using the STARTPAGE= option with layout containers is that it has to be placed between containers.  It cannot be placed within a container.  So you cannot put the statement between the ODS LAYOUT GRIDDED and ODS LAYOUT END statements.

For more information about ODS destination for PowerPoint and all of its options, visit this The Dynamic Duo: ODS Layout and the ODS Destination for PowerPoint.  Take a peek at it for more examples of using ODS Layout with the ODS destination for PowerPoint.

The ODS Destination for PowerPoint Has a New Option was published on SAS Users.

20
Oct

Tips for using the IMPORT procedure to read files that contain delimiters

using the IMPORT procedure to read files that contain delimitersReading an external file that contains delimiters (commas, tabs, or other characters such as a pipe character or an exclamation point) is easy when you use the IMPORT procedure. It's easy in that variable names are on row 1, the data starts on row 2, and the first 20 rows are a good sample of your data. Unfortunately, most delimited files are not created with those restrictions in mind.  So how do you read files that do not follow those restrictions?

You can still use PROC IMPORT to read the comma-, tab-, or otherwise-delimited files. However, depending on the circumstances, you might have to add the GUESSINGROWS= statement to PROC IMPORT or you might need to pre-process the delimited file before you use PROC IMPORT.

Note: PROC IMPORT is available only for use in the Microsoft Windows, UNIX, or Linux operating environments.

The following sections explain four different scenarios for using PROC IMPORT to read files that contain the delimiters that are listed above.

Scenario 1

In this scenario, I use PROC IMPORT to read a comma-delimited file that has variable names on row 1 and data starting on row 2, as shown below:

proc import datafile='c:tempclassdata.csv' 
out=class dbms=csv replace;
run;

 

When I submit this code, the following message appears in my SAS® log:

NOTE: Invalid data for Age in line 28 9-10.
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
28        Janet,F,NA,62.5,112.5 21
Name=Janet Sex=F Age=. Height=62.5 Weight=112.5 _ERROR_=1 _N_=27
NOTE: 38 records were read from the infile 'c:tempclassdata.csv'.
      The minimum record length was 17.
      The maximum record length was 21.
NOTE: The data set WORK.CLASS has 38 observations and 5 variables.

 

In this situation, how do you prevent the Invalid Data message in the SAS log?

By default, SAS scans the first 20 rows to determine variable attributes (type and length) when it reads a comma-, tab-, or otherwise-delimited file.  Beginning in SAS® 9.1, a new statement (GUESSINGROWS=) is available in PROC IMPORT that enables you to tell SAS how many rows you want it to scan in order to determine variable attributes. In SAS 9.1 and SAS® 9.2, the GUESSINGROWS= value can range from 1 to 32767.  Beginning in SAS® 9.3, the GUESSINGROWS= value can range from 1 to 2147483647.  Keep in mind that the more rows you scan, the longer it takes for the PROC IMPORT to run.

The following program illustrates the use of the GUESSINGROWS= statement in PROC IMPORT:

proc import datafile='c:tempclassdata.csv' out=class              dbms=csv replace;
guessingrows=100;
run;

 

The example above includes the statement GUESSINGROWS=100, which instructs SAS to scan the first 100 rows of the external file for variable attributes. You might need to increase the GUESSINGROWS= value to something greater than 100 to obtain the results that you want.

Scenario 2

In this scenario, my delimited file has the variable names on row 4 and the data starts on row 5. When you use PROC IMPORT, you can specify the record number at which SAS should begin reading.  Although you can specify which record to start with in PROC IMPORT, you cannot extract the variable names from any other row except the first row of an external file that is comma-, tab-, or an otherwise-delimited.

Then how do you program PROC IMPORT so that it begins reading from a specified row?

To do that, you need to allow SAS to assign the variable names in the form VARx (where x is a sequential number). The following code illustrates how you can skip the first rows of data and start reading from row 4 by allowing SAS to assign the variable names:

proc import datafile='c:tempclass.csv' out=class dbms=csv replace;
getnames=no;
datarow=4;
run;

 

Scenario 3

In this scenario, I want to read only records 6–15 (inclusive) in the delimited file. So the question here is how can you set PROC IMPORT to read just a section of a delimited file?

To do that, you need to use the OBS= option before you execute PROC IMPORT and use the DATAROW= option within PROC IMPORT.

The following example reads the middle ten rows of a CSV file, starting at row 6:

options obs=15; 
 
proc import out=work.test2  
            datafile= "c:tempclass.csv" 
            dbms=csv replace; 
            getnames=yes; 
            datarow=6; 
run; 
 
options obs=max; 
run;

 

Notice that I reset the OBS= option to MAX after the IMPORT procedure to ensure that any code that I run after the procedure processes all observations.

Scenario 4

In this scenario, I again use PROC IMPORT to read my external file. However, I receive more observations in my SAS data set than there are data rows in my delimited file. The external file looks fine when it is opened with Microsoft Excel. However, when I use Microsoft Windows Notepad or TextPad to view some records, my data spans multiple rows for values that are enclosed in quotation marks.  Here is a snapshot of what the file looks like in both Microsoft Excel and TextPad, respectively:

The question for this scenario is how can I use PROC IMPORT to read this data so that the observations in my SAS data set match the number of rows in my delimited file?

In this case, the external file contains embedded carriage return (CR) and line feed (LF) characters in the middle of the data value within a quoted string. The CRLF is an end-of-record marker, so the remaining text in the string becomes the next record. Here are the results from reading the CSV file that is illustrated in the Excel and TextPad files that are shown earlier:

That behavior is why you receive more observations than you expect.  Anytime SAS encounters a CRLF, SAS considers that a new record regardless of where it is found.

A sample program that removes a CRLF character (as long as it is part of a quoted text string) is available in SAS Note 26065, "Remove carriage return and line feed characters within quoted strings."

After you run the code (from the Full Code tab) in SAS Note 26065 to pre-process the external file and remove the erroneous CR/LF characters, you should be able to use PROC IMPORT to read the external file with no problems.

For more information about PROC IMPORT, see "Chapter 35, The IMPORT Procedure" in the Base SAS® 9.4 Procedures Guide, Seventh Edition.

 

 

Tips for using the IMPORT procedure to read files that contain delimiters was published on SAS Users.

15
Sep

PROC SGPLOT: There’s an ATTRS for that

ATTRS The SGPLOT procedure (as well as other ODS Graphics procedures) does a great job of creating nice- looking output with very little coding. However, there are times when you want to make adjustments to the output's appearance. For those occasions, we have an ATTRS for that!

The statements in PROC SGPLOT include many options that enable you to change the attributes for parts of the plot. Each of these options ends in ATTRS, which makes them easy to find in code.

Before you can change the attributes, you need to know which part of the plot you want to change.  For example, do you want to change the color of the line, the marker symbol, the size of the label font, and so on? Once you know the part of the graph that you want to change, you can search the PROC SGPLOT documentation for an ATTRS option.

In the following PROC SGPLOT code, we have added some ATTRS options to demonstrate the types of changes you can make to a graph.

proc sgplot data=sashelp.class;
vbar age / stat=freq datalabel datalabelattrs=(size=12pt color=blue)
fillattrs=(color=cx66A5A0) transparency=0.3 
dataskin=matte name='bar' 
legendlabel='Frequency of age';
vline age / stat=percent markers 
markerattrs=(symbol=circlefilled color= cx01665E size=12px) 
lineattrs=(color=cxD05B5B thickness=3px) 
curvelabel='Percent Line' 
curvelabelattrs=(size=11pt style=italic)
curvelabelloc=inside curvelabelpos=min 
name='vline' legendlabel='Percent of age' y2axis;
refline 4 / axis=y lineattrs=(pattern=2 thickness=2px) label='Refline' 
labelattrs=(size=12pt) labelpos=min labelloc=inside;
xaxis valueattrs=(size=10pt color=navy);
yaxis labelattrs=(size=12pt weight=bold) offsetmin=0;
keylegend 'bar' 'vline' / title='My legend' 
titleattrs=(color=blue size=14pt)
valueattrs=(size=12pt) noborder;
run;

 

The figure below shows the graph that is produced by this PROC SGPLOT code. In the figure, some labels are added to help you identify the part of the graph that is modified using an ATTRS option. Note that this graph depicts only some of the ATTRS options that are available. For other ATTRS options, see the SAS® 9.4 ODS Graphics: Procedures Guide, Sixth Edition for the specific plot statement that you want to use.

In this figure:

  • The LABELATTRS= option enables you to change the color, font family, font weight, font style, and size for the axis or reference line labels.
  • The LINEATTRS= option enables you to change the color, pattern, and thickness for the plot line.
  • The CURVELABELATTRS= option enables you to change the color, font family, font weight, font style, and size for the text that is added by the CURVELABEL= option.
  • The DATALABELATTRS= option enables you to change the color, font family, font weight, font style, and size for the text that is added by the DATALABEL= option.
  • The MARKERATTRS= option enables you to change the color, size, and symbol for the plot markers.
  • The FILLATTRS= option enables you to change the color and transparency of the bar colors.
  • The VALUEATTRS= option enables you to change the color, font family, font weight, font style, and size for the axis tick-value labels or legend value labels.
  • The TITLEATTRS= option enables you to change the color, font family, font weight, font style, and size for the legend title.

For more information about attribute options, see the Commonly Used Attribute Options section of the SAS® 9.4 ODS Graphics: Procedures Guide, Sixth Edition.

The ATTRS options affect all of the output that is produced by that statement. This means that if you include the GROUP= option, all of the groups use the attributes that are specified in the ATTRS options. This behavior is great if you want all of the lines to use the same line pattern, but it can be a problem if you want to specify colors for each of your lines.

Beginning with SAS 9.4, the STYLEATTRS (notice the ATTRS ending) statement is part of the SPLOT (and SGPANEL) procedure to enable you to define attributes for grouped data.

For example, the following code uses the DATACONTRASTCOLORS= option to specify the colors for the marker symbols and the DATASYMBOLS= option to specify the symbols that are to be used.

ods graphics / attrpriority=none;
 
proc sgplot data=sashelp.class;
styleattrs datacontrastcolors=(pink blue)
datasymbols=(circlefilled squarefilled);
scatter x=age y=height / group=sex markerattrs=(size=10px);
xaxis valueattrs=(size=12pt) labelattrs=(size=14pt);
yaxis valueattrs=(size=12pt) labelattrs=(size=14pt);
keylegend / valueattrs=(size=12pt) titleattrs=(size=14pt);
run;

 

You also might need to add the ATTRPRIORITY=NONE option in your ODS GRAPHICS statement to cycle the colors and symbols as expected. For more information about how the attributes are applied to the grouped values, see the How the Attributes Are Cycled section of the SAS® 9.4 ODS Graphics: Procedures Guide, Sixth Edition.

The attributes that are listed in the STYLEATTRS statement are associated with the group values in the order in which they appear in the data set. This behavior can cause the same value to be associated with a different color when you use the same code with another set of data.

To associate an attribute with a specific data value, you can define an attribute map. The attribute map is a data set, referenced in the DATTRMAP= option in the PROC SGPLOT statement, which includes variables that indicate to the SGPLOT procedure how to assign attributes to the group variable values.

Within the attribute map, the ID variable identifies the variables that are specific to a particular set of group values. The VALUE variable identifies the data value for the group variable that you want to associate with attributes. Note that if the variable for the GROUP= option has an associated format, the VALUE variable in the attribute map needs to contain the formatted value.

The other variables in the attribute map data set define attributes such as color, symbol, line thickness, and so on.

For example, the following code defines an attribute map to assign the color pink and the filled-circle  symbol to group value F and the color blue and the filled-square symbol to the group value M:

data myattrmap;
id='scattersymbols';
length markersymbol $12;
input value $ markercolor $ markersymbol $;
datalines;
F pink circlefilled
M blue squarefilled
;
 
proc sgplot data=sashelp.class dattrmap=myattrmap;
scatter x=age y=height / group=sex markerattrs=(size=10px) attrid=scattersymbols;
xaxis valueattrs=(size=12pt) labelattrs=(size=14pt);
yaxis valueattrs=(size=12pt) labelattrs=(size=14pt);
keylegend / valueattrs=(size=12pt) titleattrs=(size=14pt);
run;

 

Your attribute-map data set can contain multiple attribute maps, using a different value for the ID variable to distinguish each of the attribute maps. For more information about attribute maps, see the Using Attribute Maps to Control Visual Attributes section of the SAS® 9.4 ODS Graphics: Procedures Guide, Sixth Edition.

As you can see, there are many ways to assign attributes to plot elements. So, the next time you want to make a change to the visual appearance of your graph, remember that we have an ATTRS for that!

If you would like to see how to make attribute changes using a style template, read Dan Heath’s 2017 SAS Global Forum paper, Diving Deep into SAS® ODS Graphics Styles.

PROC SGPLOT: There’s an ATTRS for that was published on SAS Users.

21
Aug

Always save your code when creating a stored compiled macro

The stored compiled macro facility enables you to compile and save your macro definition in a permanent catalog in a library that you specify. The macro is compiled only once. When you call the macro in the current and subsequent SAS® sessions, SAS executes the compiled code from the macro catalog that you created when you compiled the macro.

The stored compiled facility has two main purposes. The first is that it enables your code to run faster because the macro code does not need to be compiled each time it is executed. The second purpose is to help you protect your code. Sometimes you need to share code that you’ve written with other users, but you do not want them to be able to see the code that is being executed. The stored compiled macro facility enables you to share the program without revealing the code. Compiling the macro with the SECURE option prevents the output of the SYMBOLGEN, MPRINT, and MLOGIC macro debugging options from being written to the log when the macro executes. This means that no code is written to the log when the code executes. After the macro has been compiled, there is no way to decompile it to retrieve the source code that created the catalog entry. This behavior prevents the user from being able to retrieve the code. However, it also prevents you from being able to recover the code.

It is very important to remember that there is no way to get back the code from a stored compiled macro. Because of this behavior, you should ALWAYS save your code when creating a stored compiled macro catalog. In order to update a stored compiled macro, you must recompile the macro. The only way to do this is to submit the macro definition again. Another important fact is that a stored compiled macro catalog can be used only on the same operating system and release of SAS that it was created on. So, in order to use a stored compiled macro on another operating system or release of SAS, that macro must be compiled in the new environment. Again, the only way to compile the macro is to resubmit the macro definition.

Save the Macro Source Code

To make it easier for you to save your code, the %MACRO statement contains the SOURCE option. When you create a stored compiled macro, the SOURCE option stores the macro definition as part of a catalog entry in the SASMACR catalog in the permanent SAS library listed on the SASMSTORE= system option.

Here is the syntax needed to create a stored compiled macro with the SOURCE option set:

libname mymacs 'c:my macro library';   ❶                                                                                                
options mstored sasmstore=mymacs;       ❷                                                                                              
 
%macro test / store source;             ❸                                                                                                          
 
  libname mylib1 'path-to-my-first-library';                                                                                            
  libname mylib2 'path-to-my-second-library';                                                                                           
 
%mend;

 

❶ The LIBNAME statement points to the SAS library that will contain my stored compiled macro catalog.

❷ The MSTORED system option enables the stored compiled facility. The SASMSTORE= option points to the libref that points to the macro library.

❸ The STORE option instructs the macro processor to store the compiled version of TEST in the SASMACR catalog in the library listed in the SASMSTORE= system option. The SOURCE option stores the TEST macro definition in the same SASMACR catalog.

Note that the contents of the SASMACR catalog do not contain an entry for the macro source. The source has been combined with the macro entry that contains the compiled macro. To verify that the source has been saved, add the DES= option to the %MACRO statement. The DES= option enables you specify a description for the macro entry in the SASMACR catalog. So for example, you could add the following description when compiling the macro to indicate that the source code has been saved:

%macro test / store source des=’Source code saved with entry’;

 

You can look at the contents of the macro catalog using the CATALOG procedure:

proc catalog cat=a.sasmacr;                                                                                                            
contents;                                                                                                                               
run;                                                                                                                                    
quit;

 

You see the description indicating that the source code was saved with the macro entry in the output from PROC CATALOG:

Retrieve the Macro Source Code

When you need to update the macro or re-create the catalog on another machine, you can retrieve the macro source code using the %COPY statement. The %COPY statement enables you to retrieve the macro source code and write the code to a file. Here is the syntax:

%copy test / source outfile='c:my macro librarytest.sas';

 

This %COPY statement writes the source code for the TEST macro to the TEST.SAS file. Using TEST.SAS, you are now able to update the macro or compile the macro on another machine.

Remember, you should always save your source code when creating a stored compiled macro. Without the source code, you will not be able to update the macro or move the macro to a new environment.

Here are the relevant links for this article:

Always save your code when creating a stored compiled macro was published on SAS Users.

21
Jul

Selecting the top n% and bottom n% of observations from a data set

n% of observations from a data setSAS® offers several ways that you can find the top n% and bottom n% of data values based on a numeric variable. The RANK procedure with the GROUPS= option is one method. Another method is The UNIVARIATE procedure with the PCTLPTS= option. Because there are several ways to perform this task, you can choose the procedure that you are most familiar with. In this blog post, I use the SUMMARY procedure to generate the percentile values and macro logic to dynamically choose the desired percentile statistics. After the percentiles are generated, I subset the data set based on those values. This blog post provides two detailed examples: one calculates percentiles for a single variable and one calculates percentiles within a grouping variable.

Calculate Percentiles of a Single Variable

Calculating percentiles of a single variable includes the following steps. Within the macro, a PROC SUMMARY step calculates the percentiles. The subsequent DATA step uses CALL SYMPUTX to create macro variables for the percentile values, and the final DATA step uses those macro variables to subset the data. Here is the code, which is explained in detail below:

/* Create sample data */
data test;                   
   do i=1 to 10000;                                                     
      x=ranuni(i)*12345;                                         
      output;                                                         
   end; 
   drop i; 
run;     
 
proc sort data=test;
   by x;
run; 
 
%macro generate_percentiles(ptile1,ptile2); 
/* Output desired percentile values */                         
proc summary data=test;                                               
   var x;                                                       
   output out=test1 &amp;ptile1= &amp;ptile2= / autoname;                               
run;                                                                 
 
/* Create macro variables for the percentile values */     
data _null_;                                                         
   set test1;                                                         
   call symputx("&amp;ptile1", x_&amp;ptile1);                                     
   call symputx("&amp;ptile2", x_&amp;ptile2);                                     
run;    
%put &amp;&amp;&amp;ptile1;
%put &amp;&amp;&amp;ptile2; 
 
data test2;                                                             
   set test;                                                           
/* Use a WHERE statement to subset the data */                         
   where x le &amp;&amp;&amp;ptile1 or x ge &amp;&amp;&amp;ptile2;                                       
run;  
 
proc print;
run; 
 
%mend;
 
options mprint mlogic symbolgen;
%generate_percentiles(p1,p99)
%generate_percentiles(p25,p75)

After creating and sorting the sample data, I begin my macro definition with two parameters that enable me to substitute the desired percentiles in my macro invocation:

%macro generate_percentiles(ptile1,ptile2);

The PROC SUMMARY step writes the desired percentiles for variable X to the Test1 data set. The AUTONAME option names the percentile statistics in the following format, <varname>_<percentile> (for example, x_p25).

proc summary data=test;                                               
   var x;                                                       
   output out=test1 &amp;ptile1= &amp;ptile2= / autoname;                               
run;

Next, I want to store the values of the percentile statistics in macro variables so that I can use them in later processing. I use CALL SYMPUTX to do this, which gives the macro variables the same name as the statistic. To see the resulting values in the log, I use

data _null_;                                                         
   set test1;                                                         
   call symputx("&amp;ptile1", x_&amp;ptile1);                                     
   call symputx("&amp;ptile2", x_&amp;ptile2);                                     
run;    
%put &amp;&amp;&amp;ptile1;
%put &amp;&amp;&amp;ptile2;

The SAS log shows the following:

MLOGIC(GENERATE_PERCENTILES):  %PUT &amp;&amp;&amp;ptile1
SYMBOLGEN:  &amp;&amp; resolves to &amp;.
SYMBOLGEN:  Macro variable PTILE1 resolves to p1
SYMBOLGEN:  Macro variable P1 resolves to 123.22158288
123.22158288
MLOGIC(GENERATE_PERCENTILES):  %PUT &amp;&amp;&amp;ptile2
SYMBOLGEN:  &amp;&amp; resolves to &amp;.
SYMBOLGEN:  Macro variable PTILE2 resolves to p99
SYMBOLGEN:  Macro variable P99 resolves to 12232.136483
12232.136483

I use these macro variables in a WHERE statement within a DATA step to subset the data set based on the percentile values:

data test2;                                                             
   set test;                                                           
/* Use a WHERE statement to subset the data */                         
   where x le &amp;&amp;&amp;ptile1 or x ge &amp;&amp;&amp;ptile2;                                       
run;

Finally, the macro invocations below pass in the desired percentile statistics:

%generate_percentiles(p1,p99)
%generate_percentiles(p25,p75)

The percentile statistics that are available with PROC SUMMARY are included in the documentation for the

/* Create sample data */
data test; 
 do group='a','b'; 
   do i=1 to 10000;                                                     
      x=ranuni(i)*12345;                                         
      output;                                                         
   end; 
 end;
   drop i; 
run;     
 
proc sort data=test;
   by group x;
run; 
 
%macro generate_percentiles(ptile1,ptile2); 
/* Output desired percentile values by group */                         
proc summary data=test; 
   by group; 
   var x;                                                       
   output out=test1 &amp;ptile1= &amp;ptile2= / autoname;                               
run;                                                                 
 
/* Create macro variables for each value of the BY variable */
/* Create macro variables for the percentile values for each BY group */ 
/* Create a macro variable that is the count of the unique
values of the BY variable */ 
data _null_;   
  retain count 0; 
   set test1;   
   by group;
   if first.group then do;
    count+1;
    call symputx('val'||left(count),group); 
    call symputx("&amp;ptile1"||'_'||left(count), x_&amp;ptile1);                                     
    call symputx("&amp;ptile2"||'_'||left(count), x_&amp;ptile2);  
   end; 
  call symput('last',left(count));
run;    
%put _user_;
 
/* Loops through each value of the BY variable */ 
%do i=1 %to &amp;last;
 
data test&amp;i;                                                             
   set test;  
   where group="&amp;&amp;val&amp;i"; 
/* Use an IF statement to subset the data */
   if x le &amp;&amp;&amp;ptile1._&amp;i or x ge &amp;&amp;&amp;ptile2._&amp;i;                                       
run;  
 
proc print;
run; 
 
%end;
%mend;
 
options mprint mlogic symbolgen;
%generate_percentiles(p1,p99)

Calculating percentiles has many applications, including ranking data, finding outliers, and subsetting data. Using a procedure in Base SAS® that enables you to request percentile statistics along with the power of the macro language, you can dynamically generate desired values that can be used for further processing and analysis.

Selecting the top n% and bottom n% of observations from a data set was published on SAS Users.

16
Jun

Using parameters within the macro facility

Using parameters within the macro facilityHave you ever written a macro and wondered if there was an easy way to pass values to the macro? You can by using macro parameters. Macro parameters enable you to pass values into the macro at macro invocation, and set default values for macro variables within the macro definition. In this blog post, I also discuss how you can pass in a varying number of parameter values.

There are two types of macro parameters: positional and keyword.

Positional Parameters

You can use positional parameters to assign values based on their position in the macro definition and at invocation. The order that you use to specify the values must match the order in which they are listed in the %MACRO statement. When specifying multiple positional parameters, use a comma to separate the parameters. If you do not pass a value to the macro when it is invoked, a null value is assigned to the macro variable specified in the %MACRO statement.

Here is an example:

%macro test(var1,var2,var3);                                                                                                            
 %put &amp;=var1;                                                                                                                           
 %put &amp;=var2;                                                                                                                           
 %put &amp;=var3;                                                                                                                           
%mend test;                                                                                                                             
 
/** Each value corresponds to the position of each variable in the definition. **/ 
/** Here, I am passing numeric values.                                         **/                                                            
%test(1,2,3)                                                                                                                            
/** The first position matches with var1 and is given a null value.            **/                                                             
%test(,2,3)                                                                                                                             
/** I pass no values, so var1-var3 are created with null values.               **/                                                             
%test()                                                                                                                                 
/** The first value contains a comma, so I use %STR to mask the comma.         **/                                                             
/** Otherwise, I would receive an error similar to this: ERROR: More           **/
/** positional parameters found than defined.                                  **/                                                             
%test(%str(1,1.1),2,3)                                                                                                                  
/** Each value corresponds to the position of each variable in the definition. **/ 
/** Here, I am passing character values.                                       **/                                                            
%test(a,b,c) 
/** I gave the first (var1) and second (var2) positions a value of             **/
/** b and c, so var3 is left with a null value.                                **/                                                             
%test(b,c)

 

Here are the log results:

173  /** Each value corresponds to the position of each variable in the definition. **/
174  /** Here, I am passing numeric values.                                         **/
175  %test(1,2,3)
VAR1=1
VAR2=2
VAR3=3
176  /** The first position matches with var1 and is given a null value.            **/                                                             
177  %test(,2,3)
VAR1=
VAR2=2
VAR3=3
 
178  /** I pass no values, so var1-var3 are created with null values.               **/
179  %test()
VAR1=
VAR2=
VAR3=
180  /** The first value contains a comma, so I use %STR to mask the comma.         **/                                                             
181  /** Otherwise, I would receive an error similar to this: ERROR: More           **/
182  /** positional parameters found than defined.                                  **/                                                             
183  %test(%str(1,1.1),2,3)
VAR1=1,1.1
VAR2=2
VAR3=3
184  /** Each value corresponds to the position of each variable in the definition. **/
185  /** Here, I am passing character values.                                       **/
186  %test(a,b,c)
VAR1=a
VAR2=b
VAR3=c
187  /** I gave the first (var1) and second (var2) positions a value of             **/
188  /** b and c, so var3 is left with a null value.                               **/
189  %test(b,c)
VAR1=b
VAR2=c
VAR3=

 

Keyword Parameters

The benefit of using keyword parameters is the ability to give the macro variables a default value within the macro definition. When you assign values using keyword parameters, you must include an equal sign after the macro variable name.

Here is an example:

%macro test(color=blue,id=123);                                                                                                         
 %put &amp;=color;                                                                                                                          
 %put &amp;=id;                                                                                                                             
%mend test;                                                                                                                             
 
/** Values passed to the macro overwrite default values from the definition. **/                                                                 
%test(color=red,id=456)                                                                                                                 
/** Passing in no values allows the default values to take precedence.      **/                                                                 
%test()                                                                                                                                 
/** You are not required to pass in a value for each keyword parameter.    **/                                                                 
%test(color=green)                                                                                                                      
/** The order of variables does not matter.                               **/                                                                                                 
%test(id=789,color=yellow)

 

Here are the log results:

270  /** Values passed to the macro overwrite default values from the definition. **/
271  %test(color=red,id=456)
COLOR=red
ID=456
272  /** Passing in no values allows the default values to take precedence.     **/
273  %test()
COLOR=blue
ID=123
274  /** You are not required to pass in a value for each keyword parameter.   **/
275  %test(color=green)
COLOR=green
ID=123
276  /** The order of variables does not matter.                              **/
277  %test(id=789,color=yellow)
COLOR=yellow
ID=789

 

If the macro definition combines positional and keyword parameters, positional parameters must come first. If you do not follow this order, this error is generated:

ERROR: All positional parameters must precede keyword parameters.

 

Here is an example:

%macro test(val,color=blue,id=123);                                                                                                     
 %put &amp;=color;                                                                                                                          
 %put &amp;=id;                                                                                                                             
 %put &amp;=val;                                                                                                                            
%mend test;                                                                                                                             
 
/** The positional parameter is listed first. **/                                                                 
%test(1,color=red,id=456)
 
Here are the log results:
 
318  /** The positional parameter is listed first. **/                                                                 319  %test(1,color=red,id=456)
COLOR=red
ID=456
VAL=1

 

PARMBUFF

The PARMBUFF option creates a macro variable called &SYSPBUFF that contains the entire list of parameter values, including the parentheses. This enables you to pass in a varying number of parameter values. In the following example, you can pass any number of parameter values to the macro. This following example illustrates how to parse each word in the parameter list:

%macro makes/parmbuff; 
  /** The COUNTW function counts the number of words within &amp;SYSPBUFF.            **/                                                                                                                 
   %let cnt=%sysfunc(countw(&amp;syspbuff)); 
  /** The %DO loop increments based on the number of words returned to the macro. **/
  /** variable &amp;CNT.                                                              **/                                
   %do i= 1 %to &amp;cnt;  
  /** The %SCAN function extracts each word from &amp;SYSPBUFF.                      **/                                                                                                                  
     %let make=%scan(&amp;syspbuff,&amp;i);                                                                                                     
     %put &amp;make;                                                                                                                        
   %end;                                                                                                                                
%mend makes;                                                                                                                            
 
%makes(toyota,ford,chevy)

 

Here are the log results:

19  %macro makes/parmbuff;
20    /** The COUNTW function counts the number of words within &amp;SYSPBUFF.            **/
21     %let cnt=%sysfunc(countw(&amp;syspbuff));
22    /** The %DO loop increments based on the number of words returned to the macro  **/
23    /** variable &amp;CNT.                                                              **/
24     %do i= 1 %to &amp;cnt;
25    /** The %SCAN function extracts each word from &amp;SYSPBUFF.                       **/
26       %let make=%scan(&amp;syspbuff,&amp;i);
27       %put &amp;make;
28     %end;
29  %mend makes;
30
31  %makes(toyota,ford,chevy)
toyota
ford
chevy

 

When you specify the PARMBUFF option and the macro definition includes both positional and keyword parameters, the parameters still receive values when you invoke the macro. In this scenario, the entire invocation list of values is assigned to &SYSPBUFF. Here is an example:

%macro test(b,a=300)/parmbuff;                                                                                                      
 %put &amp;=syspbuff;                                                                                                                        
 %put _local_;                                                                                                                          
%mend;                                                                                                                                  
 
%test(200,a=100)

 

Here are the log results:

SYSPBUFF=(200,a=100)
TEST A 100
TEST B 200

 

Notice that &SYSPBUFF includes the entire parameter list (including the parentheses), but each individual parameter still receives its own value.

If you need to know all the parameter values that are passed to the macro, specify the PARMBUFF option in the macro definition to get access to &SYSPBUFF, which contains all the parameter values. For more information about PARMBUFF, see %MACRO Statement in SAS® 9.4 Macro Language: Reference, Fifth Edition.

I hope this blog post has helped you understand how to pass values to a macro. If you have SAS macro questions that you would like me to cover in future blog posts, please comment below.

Using parameters within the macro facility was published on SAS Users.

19
May

Demystifying and resolving common transcoding problems

Technical Support regularly receives incoming calls from customers who have encountered the following transcoding warning:

WARNING: Some character data was lost during transcoding in the data set xxx.xxx. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding

People are not always exactly sure what this warning means nor what to do about it. No worries! This blog provides background information about why this warning occurs and offers resources that can help you understand and resolve any encoding or truncation issues that are the source of the warning above.

In most Western character sets, each character occupies only one byte of computer memory or storage. These are known as single-byte character sets (SBCS). Character sets for languages such as Japanese, Korean, and Simplified and Traditional Chinese require mixed-width encoding. The character sets for these languages are double-byte character sets (DBCS). (The term DBCS can be misleading because not all of the characters in a double-byte character set are two bytes. Thus, the term multibyte character set [MBCS] is sometimes used instead of DBCS.)

Unicode is a multibyte character set that was created to support all languages. It includes all characters from most modern written languages and historic scripts, even hieroglyphs and cuneiform. UTF-8 is one of the more common encoding forms of the Unicode standard, and this encoding is recommended as the SAS session encoding for multilingual environments if your data sources contain characters from many regions. Each character can be up to four bytes in UTF-8. UTF-8 is the SAS session encoding for SAS® Viya™. Clients such as SAS University Edition, SAS Studio, and SAS Visual Analytics typically execute statements in a server environment that run the UTF-8 encoding.

Now, let's look at how SAS handles data from different encodings along with two causes of the warning shown earlier.

Starting in SAS®9, SAS data sets store an encoding indicator in the descriptor portion. If the encoding value of the file differs from the encoding of the currently executing SAS session, the Cross Environment Data Access (CEDA) engine is invoked when SAS reads the data set. CEDA transcodes the data, by default. Transcoding is the process of converting the contents of SAS files from one encoding to another, and this process is necessary in order to read data from around the world.

When transcoding occurs, you receive an informational note about the use of CEDA. In addition, you might see the warning (shown above) in your SAS log.

The transcoding warning is issued when you have values for character variables that have either of the following characteristics:

  • The number of bytes for a character in one encoding do not match the number of bytes that are used for the same character in another encoding (as shown in the image below). If the length of the column is not wide enough to accommodate the additional bytes, truncation of the character data occurs.

  • The characters exist in one encoding, but they do not exist in another encoding. As a result, the transcoding is not successful.

If you use a procedure (for example, the PRINT or REPORT procedures), the procedure runs to completion. The variable is truncated if the additional bytes cause the value to exceed the length of the column. Alternatively, if a character from the data set does not also exist in the target encoding or if it occupies a different code point (after transcoding), the column might be blank or another character might be substituted for the original character. The procedure generates the transcoding warning.

If you have an output data set open, SAS transcodes the character data and stops writing observations to the file at the point it encounters the problematic characters. This behavior can result in either fewer observations than expected or a data set with zero observations. When this happens, SAS generates an error message instead of a warning. This error message contains the same content as the warning message.

ERROR: Some character data was lost during transcoding in the data set xxx.xxx. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding

New Video Available to Help You Resolve the Transcoding Error or Warning

Before you attempt to resolve the transcoding error or warning, it is helpful to understand a bit about encodings and how to handle multilingual SAS data. To help you in this endeavor, SAS Technical Support worked with Education and Training to create a new, short (12 minutes) video to help you quickly troubleshoot and resolve the most common reasons for the transcoding error or warning.

Click the image below to access the video. (Note: You will be prompted to create a SAS profile if you do not have one already.)

Click image to play video

 

If you want to practice the techniques in this video, you can download the data sets that are used in the video here.

This video is based on solutions that have helped customers who called SAS Technical Support for assistance. The video offers these solutions in a visual format that is easy to follow. In addition to showing you how to launch SAS in different encodings to resolve the error, the video also introduces the Character Variable Padding (CVP) engine (Read-only) for processing SAS data files that contain multilingual data.

This video is one of a number of resources available to help you. The next section lists several tips that might help prevent or resolve transcoding warnings or errors.

Helpful Tips from the Field

The following tips have been tested and used in the field, and you might find them helpful when you work with multilingual data. The first two tips are usage tips; the others are programming tips.

  • Invoke SAS in more than one encoding. For this first tip, it is important to know that if you try to change the value of the ENCODING system option during a SAS session, SAS ignores the option. When this situation occurs, the following warning is generated:

WARNING 30-12: SAS option ENCODING is valid only at startup of the SAS System. The SAS option is ignored.

If you work with data that is stored in more than one encoding, you can invoke SAS in more than one session encoding and manage the sessions easily by setting the title bar to display the session encoding in the main SAS window.

Note: This solution only works in the Microsoft Windows operating environment. Display of UTF-8 encoded data in the SAS windowing environment is not fully supported.  SAS Enterprise Guide and SAS Studio have full support for display of UTF-8 characters.

To customize the title bar of the main SAS window.

1.  In a Windows environment, select Start All Programs SAS Additional Languages.

2.  Under Additional Languages, select the shortcut for the session encoding that you want. For example, the following image shows that SAS 9.4 (Unicode Support) is selected

3.  Right-click the shortcut and select Properties. This opens the properties dialog box for that shortcut.

4. On the Shortcut tab, place your cursor at the end of the current string in the Target text box. If it is not already listed in the box, enter the location of the configuration file that you want to use by using the –CONFIG option followed by the –AWSTITLE option.

Then include in quotation marks the text that you want to see in the main SAS window, as shown in the following example:

The entire line should look similar to the following, only it will appear as one long string:

"C:Program FilesSASHomeSASFoundation9.4sas.exe" -CONFIG "C:Program FilesSASHomeSASFoundation9.4nls
u8sasv9.cfg" -awstitle "SAS UTF8"

The following display shows the –AWSTITLE option added to the Target field of the SAS 9.4 (Unicode Support) Properties dialog box:


5
. Click OK.

Now, when you open SAS, you can immediately recognize the session encoding based on the value in the title bar.

You can pin this shortcut to your task bar by dragging it from the Start menu to the task bar. Then you should see your custom title when you hover your cursor over the shortcut.

  • Prevent an unexpected transcoding warning or error: If you read or convert a SAS data set from LATIN1 or WLATIN1 encoding to UTF-8 that contains Microsoft Word smart quotes or dashes in character variables, it can cause truncation errors. Those are not ASCII characters, and they require more than one byte when converted to UTF-8. For example, performing the following steps causes an unexpected transcoding error or warning.
    1. You use SAS in the WLATIN1 encoding.
    2. You copy a quoted string from Microsoft Word and paste that string into the DATALINES section of a SAS program in the SAS windowing environment.
    3. You create a new variable that uses the quoted string in its value.
    4. You execute the SAS program to create a permanent SAS data set.
    5. You invoke a UTF-8 SAS session to read the WLATIN1 data set in a procedure or DATA step.

You can prevent this issue by clearing the "Straight quotes" with "smart quotes" option in Word, as shown below. This option is available by selecting File Options Proofing AutoCorrect Option. On the AutoFormat tab, the option is listed under the Replace category.

  • Set the CVP engine explicitly (for UTF-8 sessions). The transcoding warning is common if you use a UTF-8 session encoding and the input data source is in the WLATIN1 encoding. The first 128 code points of the UTF-8 code page are identical to the 7-bit ASCII encoding. If your character data only contains characters within those 128 code points, SAS does not generate a warning or error when it transcodes the data. For code points above 128, the character variables must be expanded to two or more bytes. Many characters that are used in Western European languages fall into this category. You can address this issue and avoid the warning or error by adding the CVP engine to the LIBNAME statement for your input data source.

The following example LIBNAME statement illustrates how to specify the CVP engine explicitly:

libname mySAS cvp 'SAS-library-path';
data utf8dat;
set mySAS.wlatindat;
run;

In this example, the code is submitted in a UTF-8 SAS session. The CVP engine is Read-only, and it is used to read the input data set from the WLATIN1 encoding.

  • Set the CVP engine implicitly. The CVP engine provides an easy way to convert your files and avoid truncation problems by using a default value of 1.5 times the original length of the character variable. However, that value might not be sufficient for the length of the characters in your data. The expansion length should be based on the characters that are contained in your data.

For example, if your data consists of double-byte character (DBCS) data (Chinese, Japanese, and so on), most characters are three bytes and emoji characters are four bytes. Therefore, the default 1.5 value for the CVP engine might not be enough for the number of additional bytes. A value of 2.5 is a reasonable expansion for DBCS characters, but you might need more.

In such cases, you might want to try setting the CVP engine implicitly. When you set the CVP engine implicitly, you can use either the CVPBYTES= or CVPMULTIPLIER= options. The following example uses the CVPBYTES= option to accommodate the four bytes that are required for the winking emoticon in UTF-8:

libname enc 'c:publicencdata' cvpbytes=4;

 

  • Convert text in a character variable to another encoding. Sometimes, the CVP engine does not correct the transcoding error or warning. In these cases, Technical Support usually finds that the destination encoding does not support all of the characters found in the data. For example, if you have a UTF-8 data set that contains French, Greek, Hindi, and other characters and you are running a WLATIN1 session, SAS will successfully transcode the French and other Western European characters to WLATIN1. However, the Greek and Hindi characters cannot be transcoded because there is no representation for them in the WLATIN1 encoding. As a result, an error is generated, and the data is lost.

If you have this kind of multilingual data, you can use the KPROPDATA() function to transcode a character string from and to any encoding and convert characters that are not represented in the target encoding to a character of your choice (a question mark or a space). Note that you must specify BINARY or ANY as the input encoding to prevent CEDA from transcoding your data. (See the next tip for information about suppressing transcoding.) A macro is available in the technical paper Multilingual Computing with SAS 9.4 (on page 7) that you can run to accomplish these tasks.

  • Suppress transcoding. There are certain cases in which you might want to bypass transcoding errors. For example, suppose that you know that your data contains only ASCII characters. You are providing data sets to users and you do not know whether the users will run SAS in WLATIN1 or UTF-8 session encodings. Because the characters are ASCII, SAS does not need to transcode them. You do not want CEDA notes nor the extra overhead that CEDA requires. In such a case, you can suppress transcoding without risking data loss. In the LIBNAME statement for the output data set, specify either the OUTENCODING=ASCIIANY option or the ENCODING=ASCIIANY data set option.

Example 1:

libname final 'SAS-library-path' outencoding=asciiany; 
proc copy in=sashelp out=final noclone;
   select dsname;
run;

Example 2:

libname final 'SAS-library-path'
data final.class(encoding=asciiany);
   set dsname;
run;

The value of ASCIIANY means no transcoding occurs on ASCII machines. When the data source is open on an EBCDIC machine, SAS assumes that the data is ASCII and transcodes it into the EBCDIC session encoding. Note: The CONTENTS procedure shows the encoding value: us-ascii ASCII (ANSI).

The encoding options shown above have two other values:

  • ANY: With this value, SAS does not transcode at all, which is, effectively, binary mode.
  • EBCDICANY: With this value, transcoding only occurs on ASCII machines. SAS assumes the data is EBCDIC and transcodes the character data into the ASCII session encoding.

Additional Self-Help Resources

This section lists several documents that provide more detail about encoding concepts and suggests ways to handle transcoding problems.

  • Encoding: Helping SAS speak your language. This blog by Bari Lawhorn briefly explains what an encoding is, how to determine the default encoding, what to do if your encoding differs from that of other people with whom you share SAS data, and where to find more in-depth information about encodings.
  • Multilingual Computing with SAS® 9.4: This paper, which describes new and enhanced features for handling multiple languages in SAS 9.4, covers topics such as installing and configuring the SAS System, changing the locale of a deployment, working with multibyte data, multilingual support in SAS® Visual Analytics, and more
  • SAS® and UTF-8: Ultimately the Finest. Your Data and Applications Will Thank You!: This SAS Global Forum 2017 paper by Elizabeth Bales and Wei Zheng is a perfect go-to resource when you are migrating data files and formats from another encoding to UTF-8.
  • SAS® Encoding: Understanding the Details: This SAS Press book, by Manfred Kiefer, explains the basics about character encoding that are required for creating, manipulating, and rendering any type of character. This book also provide examples for troubleshooting a variety of encoding problems.
  • SAS® 9.4 National Language Support (NLS): Reference Guide, Fifth Edition. This user guide provides detailed information about encodings, transcoding, and other NLS topics (including dictionaries of NLS autocall macros, data set options, system options, formats, informats, functions, and procedures)
  • SAS Note 15597, "How to convert SAS data set encoding." This note explains how you can convert the encoding for a SAS data set.
  • SAS Note 52716, "The error "Some character data was lost during transcoding in the data set" occurs when the data set encoding does not match the SAS® session encoding."
  • SAS Sample 55054, "PROC SQL can be used to identify data representation and encoding for all data sets in a library:" This sample enables you to query a library of data sets that are in mixed encodings so you can plan to use them in compatible session encodings or you can convert the data sets to another encoding. Using the SQL procedure in this sample, you can create output similar to the following that shows you the data representation and encoding for all data sets in your library.

 

 

Demystifying and resolving common transcoding problems was published on SAS Users.

21
Apr

Use SAS to send an email that embeds a graph in the body of the email

send an email that embeds a graphWhen using the SAS® system to email graphics output, a common request is to use SAS to send an email in which the graphics output is embedded in the body of the email. This functionality is not available until the second maintenance release for SAS® 9.4 (TS1M2). If you are using a version of SAS earlier than SAS 9.4 TS1M2, your best option is to create graphics output in a format such as RTF or PDF, and then attach the RTF or PDF file to your email.

Using the INLINED Option to Embed Graphics

If you are running SAS 9.4 TS1M2 or later, you can embed graphics output in an email. To do this, use the INLINED suboption with the ATTACH option in a SAS FILENAME statement that uses the EMAIL engine. Here is an example:

filename sendmail email to=("first.last@company.com")          from=("first.last@company.com")
    attach=("c:tempemail.png" inlined='sgplot')
    type='text/html' subject="Emailing graphics output";

 

Then, later in your code, reference the value specified for the INLINED option in DATA step code that creates custom HTML output, as shown below:

    data _null_;  
        file sendmail;  
  put '<html>';
  put '<body>';
  put '<img src=cid:sgplot>';
  put '</body>';
  put '</html>';
run;

With this technique, although the graph is sent as an attachment, the attachment is hidden. When the email recipient opens the email, the attached graph is automatically displayed in the email (so that it looks like the graph is embedded in the body of the email).

Note: When using SAS to email graphics output, you must first set the EMAILSYS system option to SMTP and the EMAILHOST system option to the name of the SMTP email server at your site.

Embedding Multiple Graphs

You can also send multiple graphs in a single email using a SAS FILENAME statement as shown here:

                        filename sendmail email to=("first.last@company.com") from=("first.last@company.com")
    attach=("c:tempemail1.png" inlined='sgplot1'  "c:tempemail2.png" inlined='sgplot2')
    type='text/html' subject="Emailing graphics output";

Then, create custom HTML output using DATA step code similar to the following:

     Data _null_;  
       file sendmail;  
 put '<html>';
 put '<body>';
 put '<img src=cid:sgplot1>';
 put '<img src=cid:sgplot2>';
 put '</body>';
 put '</html>';
    run;

Embedding a Graph and a PROC PRINT Table

This example shows how to embed a graph and PRINT procedure table in one email. Let us assume that you have a graph named sgplot.png stored in C:Temp. You want to send an email using SAS that displays the SGPLOT graph in the body of the email directly before a table created with PROC PRINT. The following sample code demonstrates how to do this using a TITLE statement with PROC PRINT:

filename sendmail email  to=("first.last@company.com") from=("first.last@company.com")
     attach=("c:tempsgplot.png" inlined='sgplot') 
     type='text/html' subject="Email test of GRAPH output";
      ods _all_ close; 
ods html file=sendmail; 
title1 '<img src=cid:sgplot>';
proc print data=sashelp.class; 
run;
ods html close; 
ods listing; 
filename sendmail clear;

Embedding a Graph (Complete Program)

Here is a complete sample program that demonstrates embedding graphics in an email using graphics output created with the SGPLOT procedure:

%let workdir=%trim(%sysfunc(pathname(work)));
ods _ALL_ close; 
ods listing gpath="&workdir";
ods graphics / reset=index outputfmt=PNG imagename='email';  
title1 'Graph output emailed using SAS';
proc sgplot data=sashelp.cars; 
  bubble x=horsepower y=mpg_city size=cylinders;
run;
filename sendmail email to=("first.last@company.com") from=("first.last@company.com")
     attach=("&workdir./email.png" inlined='sgplot')
     type='text/html' subject="Emailing graphics output";
      data _null_;
 file sendmail;  
 put '<html>';
 put '<body>';
 put '<img src=cid:sgplot>';
 put '</body>';
 put '</html>';
run; 
      filename sendmail clear;

In conclusion, if you are running SAS 9.4 TS1M2 or later, using the INLINED option in a FILENAME statement is an excellent option when emailing graphics output.  Note that you can use this technique to email any graphics file in PNG, GIF, or JPEG format created with the SAS SG procedures, ODS Graphics, or SAS/GRAPH® procedures (such as GPLOT and GCHART).  You can also use this technique to email graphics files created with software other than SAS.

Use SAS to send an email that embeds a graph in the body of the email was published on SAS Users.

20
Mar

Create accessible ODS results with SAS or "Why you should be running SAS 9.4m4!"

accessible ODS results with SASLet’s look at the term “accessible” and how it relates to the SAS world. Accessible output is output that can be read by a screen reader to someone with low or no vision, visualized by someone with low vision or color blindness, or navigated by someone with limited mobility. In January of 2017, the United States Access Board published a final rule that documents federal standards and guidelines for accessibility compliance specific to information and communication technology (ICT). The new standards and guidelines update the Section 508 law that was most recently amended in 1998, and adopt many of the Web Content Accessibility Guidelines (WCAG) 2.0 standards. Here’s a comparison document: Comparison Table of WCAG 2.0 to Existing 508 Standards. The final rule (as a parent, I appreciate this rule name!) is also known as the “508 refresh”.

To help SAS US customers comply with the visual-accessibility regulations outlined by the final rule, ODS developers are providing SAS programmers with the ability to create accessible results to present their SAS data. In SAS® 9.4 TS1M4 (9.4M4), there are three great improvements that I would like to highlight:

  • ACCESSIBLE_GRAPH option (preproduction) in the ODS HTML5 statement
  • SAS® Graphics Accelerator
  • ACCESSIBLE option (preproduction) in the ODS PDF statement

Note about “Preproduction” Status: ACCESSIBLE and ACCESSIBLE_GRAPH

Why are the ACCESSIBLE (in the ODS PDF FILE= statement) and ACCESSIBLE_GRAPH (in the ODS HTML5 FILE=statement) options preproduction? By setting the status as preproduction, the development team has greater flexibility to make changes to the syntax and underlying architecture. The development team has worked hard to provide these new features, and is very eager to hear feedback from the SAS programming community. They also encourage feedback from the compliance teams that work for those of you who are striving to make your SAS results accessible. Please request and install SAS 9.4M4 here, and start using these new features to generate your current results in the new formats. Ask your compliance team to assess the output, and let us know (accessibility@sas.com) how close we are to making your files compliant with the final rule.

ODS HTML5

The ODS HTML5 destination, which was introduced in SAS® 9.4, creates the most accessible output in SAS for consumption of tables and graphs on the web. This destination creates SVG graphs from ODS GRAPHICS results. SVG graphics scale when zoomed, which maintains the visual integrity of the image. To ensure that the results comply with the maximum number of WCAG standards, use one of the following styles:

  • DAISY (recommended)
  • VADARK
  • HIGHCONTRAST
  • JOURNAL2

In most cases, these styles provide a high level of contrast in graphics output and tabular output.[1]
When you use an accessibility checker such as the open-source accessibility testing tool aXe, you can see how the HTML5 destination compares to the HTML4 destination (the default destination in SAS® Foundation). Here is a comparison of a simple PRINT procedure step. Using this code, I generate two HTML files, html4.html and html5.html:

ods html file="html4.html" path="c:temp" ;
ods html5 file="html5.html" path="c:temp" style=daisy;
 
proc print data=sashelp.cars(obs=5);
var type origin;
run;
 
ods _all_ close;

 

An aXe analysis in Mozilla Firefox finds the following violations in the html4.html file:

Here is the analysis of the html5.html file:

Your success might vary with more complex procedures. The final paragraph of the blog post describes how to offer feedback after you test your code in the HTML5 destination using SAS 9.4M4.

SAS® Graphics Accelerator

The addition of the ACCESSIBLE_GRAPH preproduction option to the ODS HTML5 statement adds accessibility metadata (tags) around ODS GRAPHICS images routed to the HTML5 destination[2]. This metadata provides the ability to have bar charts, time series plots, heat maps, line charts, scatter plots, and histograms consumed by an exciting new add-in available for the Google Chrome and (coming soon) Firefox browsers: SAS GRAPHICS Accelerator. SAS Graphics Accelerator provides the following capabilities:

  • The interactive exploration of supported graphics using sound
  • The ability to download data in tabular format to a CSV file
  • Customization of visual and auditory settings for alternative presentations

Pay attention to the SAS Graphics Accelerator web page because improvements and features are being offered on a regular basis!

ODS PDF ACCESSIBLE

The web is “where it’s at” for most consumers of your organization’s information. However, many sites need PDF files for results requiring a longer storage time. Prior to SAS 9.4M4, using a screen reader with PDF files created by ODS did not work because the PDF files created by ODS are not “tagged.” Tags in a PDF file are not visible in Adobe Reader when the file is opened. But, when a PDF file is tagged, the file contains underlying metadata to facilitate screen readers verbalizing the results. Here’s an example of the same PROC PRINT step above written to two different PDF files using SAS 9.4M4. I created the tagged.pdf file using the ACCESSIBLE preproduction destination option, so the file includes tags, making it accessible using assistive technology.

ods pdf file="c:tempuntagged_default.pdf";
ods pdf (id=a) file="c:temptagged.pdf" accessible;
 
proc print data=sashelp.cars(obs=5);
var type origin;
run;
 
ods _all_ close;

 

 

To determine whether a PDF file is tagged, open the file and select View ► Show/Hide ► Navigation Panes ► Tags. For example, I checked the untagged_default.pdf file and saw the following, which means that this file is not useful to a screen reader:

Let’s compare the results from the tagged.pdf file:

A screen reader uses the HTML-like markup shown above to verbalize the file to someone with low or no vision.

Adobe Acrobat Pro has built-in accessibility checkers that enable us to examine the degree of accessibility of our files. You can display this setting by selecting View ► Tools ► Accessibility. A full discussion of the Adobe compliance-check features is outside the scope of this article. But, an initial examination of the tagged.pdf file shows that there are many accessible features included in the file, and that two of the features need a manual check:

Check with staff who are well-versed in compliance at your organization, and let us know if our files meet your standards.

I want to see sample code, and hear more! How do I get access to SAS 9.4M4 and more information about these new features?

Your SAS installation representative can order SAS 9.4M4 using the information on this page, Request a Maintenance Release. If you want to get a preview of SAS 9.4M4 to learn it before your site gets it, you can use SAS® University Edition (www.sas.com/universityedition).

Read these upcoming papers (available in April 2017) for code samples. And, if you are attending SAS Global Forum 2017, plan to attend the following presentations:

Here are links to the documentation and a previously published SAS Global Forum paper on the topic:

How did we do?

We welcome feedback regarding the results that you are generating with SAS 9.4M4, and look forward to offering the ODS statement options in production status with improved features and support for more procedures. Please send feedback to accessibility@sas.com.

 

[1] See “ODS HTML5 Statement Options Related to Accessibility” in Creating Accessible SAS 9.4 Output Using ODS and ODS Graphics for more information

 

Create accessible ODS results with SAS or "Why you should be running SAS 9.4m4!" was published on SAS Users.

Back to Top