The many faces of HTML

Generating HTML output might be something that you do daily. After all, HTML is now the default format for Display Manager SAS output, and it is one of the available formats for SAS® Enterprise Guide®. In addition, SAS® Studio generates HTML 5.0 output by default. The many faces of HTML are also seen in everyday operations, which can include the following:

  • Creating reports for the corporate intranet.
  • Creating a responsive design so that content is displayed well on all devices (including mobile devices).
  • Emailing HTML within the body of an email message.
  • Embedding figures in a web page, making the page easier to send in an email.

These tasks show the need for and the true power and flexibility of HTML. This post shows you how to create HTML output for each of these tasks with the Output Delivery System (ODS). Some options to use include the HTML destination (which generates HTML 4.01 output by default) and the HTML5 destination (which generates HTML 5.0 output by default).


Generating HTML Output

With the HTML destination and PROC REPORT, you can create a summary report that includes drill-down data along with traffic lighting.

   ods html path="c:\temp" file="summary.html";
   proc report data=sashelp.prdsale;
      column Country Actual Predict;
      define Country / group;
      define actual / sum;
      define predict / sum;
      compute Country;
         /* Build the target filename from the Country value */
         drillvar=cats(country,".html");
         call define(_col_,"url",drillvar);
      endcomp;
   run;
   ods html close;

   /* Create detail data */
   %macro detail(country);
   ods html path="c:\temp" file="&country..html";
   proc report data=sashelp.prdsale(where=(country="&country"));
      column Country region product Predict Actual;
      compute actual;
         if actual.sum > predict.sum then
            call define(_col_,"style","style={background=green}");
      endcomp;
   run;
   ods html close;
   %mend detail;

   /* Call the macro once per country value */
   %detail(CANADA)
   %detail(GERMANY)


In This Example

  • In the first PROC REPORT step, a COMPUTE block creates drill-down links for each value of the Country variable. The CALL DEFINE statement within the COMPUTE block uses the URL access method.
  • The second PROC REPORT step creates a target file for each of the drill-down values in the summary table by using the SAS macro language to subset the data. The filename is based on the Country value.
  • Traffic lighting is added to the drill-down data. The color is applied within a row when the value in the Actual Sales column is larger than the value in the Predicted Sales column.

HTML on Mobile Devices

One approach to generating HTML files is to assume that users access data from mobile devices first; each user who accesses a web page on a mobile device should have a good experience. However, the viewport (visible area) is smaller on a mobile device, which often creates a poor viewing experience. Using the VIEWPORT meta tag in the METATEXT= option tells the mobile browser how to size the content that is displayed. In the following code, the content width is set to the device width, and the initial-scale property controls the zoom level when the page first loads.

<meta name="viewport" content="width=device-width, initial-scale=1">

   ods html path="c:\temp" file="mobile.html"
       metatext='name="viewport" content="width=device-width, initial-scale=1"';
   proc print data=sashelp.prdsale;
      title "Viewing Output Using Mobile Device";
   run;
   ods html close;

In This Example

  • The HTML destination and the METATEXT= option set the width of the output to the width of the mobile device, and the zoom level for the initial load is set.

HTML within Email

Sending SMTP (HTML) email enables you to include HTML within the body of a message. The body can contain styled output as well as embedded images. To generate HTML within email, you must set the EMAILSYS= system option to SMTP and the EMAILHOST= system option to your email server. To generate the email, use a FILENAME statement with the EMAIL access method, along with an HTML destination. You can add an image by using the ATTACH= option along with the INLINED= option, which adds a content identifier that is referenced in a later TITLE statement. For content to appear properly in the email, the CONTENT_TYPE= option must be set to text/html.

The MSOFFICE2K destination is used here instead of the HTML destination because it holds the style better for non-browser-based applications, like Microsoft Office. The ODSTEXT procedure adds the text to the message body.

   filename mymail email to="chevell.parker@sas.com"
                         subject="Forecast Report"
                         attach=('C:\SAS.png' inlined="logo")
                         content_type="text/html";
   ods msoffice2k file=mymail rs=none style=htmlblue options(pagebreak="no");
     title j=l '<img src="cid:logo" width="120" height="100" />';
     title2 "Report for Company XYZ";
   proc odstext;
      h3 "Confidential!";
   run;
   proc print data=sashelp.prdsale;
   run;
   ods msoffice2k close;

In This Example

  • The FILENAME statement with the EMAIL access method is used.
  • The ATTACH= option specifies the image to include.
  • The INLINED= option specifies a content identifier.
  • The CONTENT_TYPE= option is text/html for HTML output.
  • The ODSTEXT procedure adds the text before the table.
  • The TITLE statement defines the “logo” content identifier.

Graphics within HTML

The ODS HTML5 destination has many benefits, such as the ability to embed graphics directly in an HTML file (and the default file format is SVG). The ability to embed the figure is helpful when you need to email the HTML file, because the file is self-contained. You can also add a table of contents inline to this file.

ods graphics / height=2.5in width=4in;
ods html5 path="c:\temp" file="html5output.html";
   proc means data=sashelp.prdsale;
   run;
   proc sgplot data=sashelp.prdsale;
      vbar product / response=actual;
   run;
ods html5 close;

In This Example

  • The ODS HTML5 statement creates a table along with an embedded figure. The image is stored as an SVG file within the HTML file.


HTML is used in many ways when it comes to reporting. Various ODS destinations can accommodate the specific output that you need.

The many faces of HTML was published on SAS Users.


SAS Data Studio Code Transform (Part 2)

This is a continuation of my previous blog post on SAS Data Studio and the Code transform. In this post, I will review some additional examples of using the Code transform in a SAS Data Studio data plan to help you prepare your data for analytic reports and/or models.

Create a Unique Identifier Example

The DATA step code below combines the _THREADID_ and the _N_ variables to create a UniqueID for each record.

SAS Data Studio Code Transform
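A minimal sketch of such a DATA step, assuming the {{...}} table variables that SAS Data Studio supplies to the Code transform (described in Part 1 of this series):

```sas
/* Sketch: build a unique ID from the thread number and the row counter */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
   length UniqueID $ 32;
   UniqueID = catx('_', _threadid_, _n_);
run;
```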

The variable _THREADID_ returns the number of the thread that the DATA step is running in within a server session. The variable _N_ is an internal system variable that counts the iterations of the DATA step as it automatically loops through the rows of an input data set. The _N_ variable is initially set to 1 and increases by 1 each time the DATA step loops past the DATA statement, which it does for every row that it encounters in the input data. Because the DATA step is a built-in loop that iterates through each row in a table, the _N_ variable can be used as a counter variable in this case.

_THREADID_ and _N_ are variables that are created automatically by the SAS DATA step and saved in memory. For more information on automatic DATA step variables, refer to the DATA step documentation.

Cluster Records Example

The DATA step code below combines the _THREADID_ and the counter variables to create a unique ClusterNum for each BY group.

This code uses the concept of FIRST.variable to increase the counter at the beginning of each new grouping. FIRST.variable and LAST.variable are variables that CAS creates for each BY variable. CAS sets FIRST.variable when it is processing the first observation in a BY group, and sets LAST.variable when it is processing the last observation in a BY group. These assignments enable you to take different actions based on whether processing is starting or ending for a BY group. For more information, refer to the related help topic on BY-group processing.
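A hedged sketch of such code, assuming a single hypothetical BY variable named State:

```sas
/* Sketch: number each BY group; the sum statement retains counter across rows */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
   by State;                          /* hypothetical BY variable */
   if first.State then counter + 1;  /* increase at each new group */
   ClusterNum = catx('_', _threadid_, counter);
run;
```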

De-duplication Example

The DATA step code below outputs the last record of each BY group, thereby de-duplicating the data set by writing out only one record per grouping.
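A hedged sketch, again assuming a hypothetical BY variable named State:

```sas
/* Sketch: keep only the last record in each BY group */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
   by State;                  /* hypothetical BY variable */
   if last.State then output;
run;
```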

Below are the de-duplication results on the data set used in the previous Cluster Records Example section.

For more information about the DATA step, refer to the DATA step documentation.

Below is the resulting customers2.xlsx file in the Public CAS library.

For more information on the available action sets, refer to the SAS® Cloud Analytic Services 3.3: CASL Reference guide.

For more information on SAS Data Studio and the Code transform, refer to the related documentation.

SAS Data Studio Code Transform (Part 2) was published on SAS Users.


SAS Data Studio Code Transform (Part 1)

SAS Data Studio is a new application in SAS Viya 3.3 that provides a mechanism for performing simple, self-service data preparation tasks to prepare data for use in SAS Visual Analytics or other applications. It is accessed via the Prepare Data menu item or tile on SAS Home. Note: A user must belong to the Data Builders group in order to have access to this menu item.

In SAS Data Studio, you can either select to create a new data plan or open an existing one. A data plan starts with a source table and consists of transforms (steps) that are performed against that table. A plan can be saved and a target table can be created based on the transformations applied in the plan.

SAS Data Studio Code Transform

SAS Data Studio

In a previous blog post, I discussed the Data Quality transforms in SAS Data Studio. This post is about the Code transform, which enables you to create custom code to perform actions or transformations on a table. To add custom code using the Code transform, select the code language from the drop-down menu, and then enter the code in the text box. The available code languages are CASL and DATA step.

Code Transform in SAS Data Studio

Each time you run a plan, the session table and library names might change. To avoid errors, you must use variables in place of table and caslib names in your code within SAS Data Studio. Literal values will cause the code to fail, because session table names can change during processing. Use the following variables:

  • _dp_inputCaslib – variable for the input CAS library name.
  • _dp_inputTable – variable for the input table name.
  • _dp_outputCaslib – variable for the output CAS library name.
  • _dp_outputTable –  variable for the output table name.

Note: For DATA step only, variables must be enclosed in braces, for example, data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});.
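For example, a minimal pass-through DATA step in the Code transform (the variables are resolved by SAS Data Studio at run time) might look like this:

```sas
/* Sketch: copy the input session table to the output table unchanged */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
run;
```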

The syntax "varname"n is needed for variable names that contain spaces and/or special characters. Refer to the Avoiding Errors When Using Name Literals help topic for more information.

CASL Code Example

The CASL code for this example uses the fedSQL action set to create a summary table of counts by the standardized State value. The results of this code are pictured below.
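A hedged CASL sketch of such a query, assuming the standardized value is stored in a hypothetical column named State_STND:

```sas
/* Sketch (CASL): counts by the standardized State value.
   State_STND is a hypothetical column name. */
loadactionset "fedSQL";
fedSql.execDirect query="create table " || _dp_outputTable ||
   " as select State_STND, count(*) as Frequency from " ||
   _dp_inputTable || " group by State_STND";
```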

Results from CASL Code Example

For more information on the available action sets, refer to the SAS® Cloud Analytic Services 3.3: CASL Reference guide.

DATA Step Code Example

In the DATA step code for this example, the BY statement is used to group all records with the same BY value. If you use more than one variable in a BY statement, a BY group is a group of records with the same combination of values for those variables. On the CAS server, there is no guarantee of global ordering between BY groups; each DATA step thread can group and order only the rows that are in the same data partition (thread). Refer to the related help topic for details.
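A hedged sketch of BY-group code for the Code transform, again assuming a hypothetical BY variable named State:

```sas
/* Sketch: act on the first record of each State BY group */
data {{_dp_outputTable}} (caslib={{_dp_outputCaslib}});
   set {{_dp_inputTable}} (caslib={{_dp_inputCaslib}});
   by State;                   /* hypothetical BY variable */
   if first.State then output;
run;
```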

Results from DATA Step Code Example

For more information about the DATA step, refer to the DATA step documentation. In my next blog post, I will review more code examples that you can use in the Code transform in SAS Data Studio. For more information on SAS Data Studio and the Code transform, refer to the related documentation.

SAS Data Studio Code Transform (Part 1) was published on SAS Users.


Move fraud detection from hindsight to insight to foresight

In the medical field, an autopsy is valuable because it helps you understand the cause of death. But what’s more valuable is identifying the leading indicators of an illness so that you can address it before the Grim Reaper comes knocking. Best-in-class organizations are taking a similar approach to their fraud detection, shifting from a purely hindsight view to insights and even foresight: getting out in front of the fraud before it happens, revenue is lost, reputations are damaged, and regulators apply even more pressure.

Proactively detecting fraud isn’t easy though. There is the nature of the challenge itself: fraud is a behavioral problem, and one that is dynamic, complex and often sophisticated. Then, there is the data challenge: lots of it, in many different formats, both structured and unstructured. Next is the analytics. Many techniques are available; some might work well, and others might not. Finally, the technology. There is no shortage of solutions, but they can be expensive, and organizations need to beware of ending up with a collection of siloed, single-point solutions that don’t tell the full story.

That said, unless you’re willing to close your business, which is the only surefire way to get to 0% fraud, you’ve got to tackle it.

How to tackle fraud?

For starters, I advise leaders to define their risk appetite and tolerance. What is the level of risk that you, and the organization, can live with? If you can live with 5%, say, then that’s your true north and the benchmark to measure against. Once the risk appetite is set, next comes the balancing act: the strategic long-term view against tactical short-term needs, fraud prevention against the customer experience, and more. Then, make sure you have the data, technology, people, processes, governance and analytics in place to continuously measure and refine.

What we are seeing today is that analytics is a key component of moving fraud detection from hindsight to foresight. It starts with dividing risk into three classes. The first is what you know: I have fraud, it’s happening, and I can put business rules in place to detect it. It’s a repeatable pattern that usually responds well to the “if x, then y” formula. The second class is what you do not know. This is about anomaly detection, and such cases can often be found by highlighting things that don’t happen often but stand out when they do. The third, and most challenging, class is when you don’t even know what you’re looking for. Is it a needle in a haystack? Maybe a rusty nail? This is where AI and machine learning come into play.

Applying best-in-class tools allows organizations to ingest enormous sets of data, including text, voice, social, structured and unstructured data. Adding best-in-class analytics helps to sort the signals from the noise, and advanced analytics, including artificial intelligence, machine learning and natural language processing, enables organizations to move faster by processing in real time and to benefit from iterative learning, where humans help models become progressively smarter until the models can improve themselves. And, of course, the best solutions provide an end-to-end analytics lifecycle from data to analytics to insights.

There’s no question that fraud is complex and challenging, but unless you’re willing to send your business to the morgue – and close your doors forever – you’ve got to tackle it. And, thanks to advances in analytics, we can help stop fraud before it starts.

Find out more at SAS Global Forum 2018

Join Constantine Boyadjiev for his “Suspect Behavior Identification through Sentiment Analysis and Communication Surveillance” Breakout Session at SAS Global Forum 2018 April 10 at 3 p.m. in Mile High Ballroom Theater C.





Move fraud detection from hindsight to insight to foresight was published on SAS Users.


The power behind a Hidden Data Role in SAS Visual Analytics

SAS Visual Analytics 8.2 introduces the Hidden Data Role. This role accepts one or more category or date data items, which are included in the query results but are not displayed with the object. You can use the Hidden Data Role in:

  • Mapping Data Sources.
  • Color-Mapped Display Rules.
  • External Links.

Note that the Hidden Data Role is not available for all objects. Also, a data item cannot be used as both a Hidden Data Role and a Data tip value; it can be assigned to only one of those roles.

In this example, we will look at how to use the Hidden Data Role for an External Link.

Here are a few applications of this example:

  • You want to show an index of available assets, and you have a URL to point directly to that asset.
  • Your company sells products, and you want to show a table summary of product profit but have a URL that points to each product’s development page.
  • As the travel department, you want to see individual travel reports rolled up to owner, but have a URL that can link out to each individual report.

The possible applications for customer needs are endless.

In my blog example, I have NFL data for Super Bowl wins. I have attached two columns of URLs for demonstration purposes:

  • One URL is for each Super Bowl event, so I have 52 URLs, one for each row of data.
  • The second URL is for each winning team. There have been 20 unique Super Bowl winning teams, so I have 20 unique URLs.

Hidden Data Role in SAS Visual Analytics

In previous versions of SAS Visual Analytics, if you wanted to link out to one of these URLs, you would have to include it in the visualization like in the List Table shown above. But now, using SAS Visual Analytics 8.2, you can assign a column containing these URLs to the Hidden Data Role and it will be available as an External URL.

Here is our target report. We want to be able to link to the Winning Team’s website.

In Visual Analytics 8.2, for the List Table, assign the Winning Team URL column to the Hidden Data Role.

Then, for the List Table, create a new URL Link Action. Give the Action a name and leave the URL section blank, because my data column contains a fully qualified URL. If you were linking to a destination and only needed to append a name-value pair, then you could put in the partial URL and pass the parameter value, but that’s a different example.

That example uses the Hidden Data Role column with 20 URLs, one matching each winning team. Now, what if we use the column with 52 URLs that link out to the individual Super Bowl events?

That’s right, the cardinality of the Hidden Data Role item does impact the object. Even though the Hidden data item is not visible on the Object, remember it is included in the results query; and therefore, the cardinality of the Hidden data item impacts the aggregation of the data.

Notice that some objects will just present an information warning that a duplicate classification of the data has caused a conflict.

In conclusion, the Hidden Data Role is an exciting addition to the SAS Visual Analytics 8.2 release. I know you'll enjoy and benefit from it.

The power behind a Hidden Data Role in SAS Visual Analytics was published on SAS Users.


SAS Viya 3.3 command-line interfaces for Administration

SAS Viya 3.3 introduces a set of command-line interfaces (CLIs) that SAS Viya administrators will find extremely useful. The CLIs allow administrators to perform numerous administrative tasks in batch as an alternative to using the SAS Environment Manager interface. In addition, calls to the CLIs can be chained together in scripts to automate more complex administration tasks. In this post I will introduce the administration CLIs and look at a few useful examples.

The sas-admin CLI is the main interface; it acts as a wrapper for the other CLIs, which operate as interfaces to functionality from within sas-admin. The CLIs provide a simplified interface to the SAS Viya REST services: they abstract the functionality of the REST services, allowing an administrator to enter commands on a command line and receive a response back from the system. If the CLIs do not surface all the functionality you need, calls to the REST API can be made to fill in the gaps.

In SAS Viya 3.3, the available interfaces (plug-ins) within sas-admin are:

Plug-in         Purpose
audit           Gets SAS audit information.
authorization   Gets general authorization information; creates and manages rules and permissions on folders.
backup          Manages backups.
restore         Manages restore operations.
cas             Manages CAS administration and authorization.
configuration   Manages the operations of the configuration service.
compute         Manages the operations of the compute service.
folders         Gets and manages SAS folders.
fonts           Manages SAS Visual Analytics fonts.
devices         Manages mobile device blacklist and whitelist actions and information.
identities      Gets identity information; manages custom groups and group membership.
licenses        Manages SAS product license status and information.
job             Manages the operations of the job flow scheduling service.
reports         Manages SAS Visual Analytics 8.2 reports.
tenant          Manages tenants in a multi-tenant deployment.
transfer        Promotes SAS content.


The command-line interfaces are located on a SAS Viya machine (any machine in the commandline host group in your ansible inventory file) in the directory /opt/sas/viya/home/bin.

There are two preliminary steps required to use the command-line interface: you need to create a profile and authenticate.

To create a default profile (you can also create named profiles):

sas-admin profile set-endpoint "http://myserver.demo.myco.com"
sas-admin profile set-output text

You can also simply enter the following and respond to the prompts.

sas-admin profile init

The default profile is stored in the user’s home directory in the file <homedir>/.sas/config.json.

The output options range from text, which provides a simplified text output of the result, to full JSON, which is the complete JSON output returned by the REST call that the CLI submits. The full JSON output is useful if you’re piping the output from one command into a tool that expects JSON.

To authenticate:

sas-admin auth login --user sasadm --password ********

The authentication step creates a token in a file stored in the user’s home directory which is valid for, by default, 12 hours.  The file location is <homedir>/.sas/credentials.json.

The syntax of a call to the sas-admin CLI is shown below. The CLI requires an interface (plug-in) and a command.

The example shows a call to the identities interface. This command will list all the users who are members of the SAS Administrators custom group.

SAS Viya 3.3 command-line interfaces
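Based on the options described in the bullets that follow, such a call presumably takes this shape (the group ID SASAdministrators is an assumption about the predefined ID of the SAS Administrators custom group):

```
./sas-admin --output text identities list-members --group-id SASAdministrators
```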

In this execution of sas-admin:

  • the interface is identities.
  • the global option --output is set so that the result is returned in basic text.
  • the command is list-members.
  • the command option --group-id specifies the group whose members you wish to list.

The built-in help of the CLIs is a very useful feature.

./sas-admin --help

This command provides help on the commands and interfaces(plugins) available, and the global options that may be used.

You can also display help on a specific interface by adding the interface name and then specifying --help.

./sas-admin authorization --help

Let’s look at an example of using the command-line interface to perform some common administrative tasks. In this example I will:

  • create a new folder that is a sub-folder of an existing folder.
  • create a rule to set authorization on a folder.
  • create and secure a caslib.

Many of the folders commands require the ID of a folder as an argument. The folder ID is displayed when you create the folder, when you list folders using the CLI, and in SAS Environment Manager.

To return a folder ID based on its path, you can use a REST call to the /folders/folders endpoint. The JSON that is returned can be parsed to retrieve the ID, which can then be used in subsequent calls to the CLI. The REST API call below requests the ID of the /gelcontent folder.

curl -X GET "http://myserver.demo.myco.com/folders/folders/@item?path=/gelcontent" -H "Authorization: bearer $TOKEN" | python -mjson.tool

It returns the following JSON (partial):

"creationTimeStamp": "2017-11-17T15:20:28.563Z",
"modifiedTimeStamp": "2017-11-20T23:03:19.939Z",
"createdBy": "sasadm",
"modifiedBy": "sasadm",
"id": "e928249c-7a5e-4556-8e2b-7be8b1950b88",
"name": "gelcontent",
"type": "folder",
"memberCount": 2,
"iconUri": "/folders/static/icon",
"links": [
        "method": "GET",
        "rel": "self",
NOTE: the authentication token ($TOKEN) in the REST call is read from the credentials.json file that was created when the user authenticated via sas-admin auth login. To see how this is done, check out the script at the end of this post.

The next step is to create a folder that is a sub-folder of the /gelcontent folder. The id of the parent folder, and name of the new folder is passed to the create command of the folders interface.

./sas-admin --output json folders create --description "Orion Star" --name "Orion" --parent-id e928249c-7a5e-4556-8e2b-7be8b1950b88

Next, using the folder ID from the previous step, set authorization on the folder. In this call to the authorization interface, I grant full control to the group gelcorpadmins on the new folder and its content.

./sas-admin authorization create-rule grant --permissions read,create,update,delete,add,remove,secure --group gelcorpadmins --object-uri /folders/folders/49b7ba6a-0b2d-4e32-b9b9-2536d84cfdbe/** --container-uri /folders/folders/49b7ba6a-0b2d-4e32-b9b9-2536d84cfdbe

Now in Environment Manager, check that the folder has been created and check the authorization settings. The authorization setting on the folder shows that a new rule has been created and applied providing explicit full access to gelcorpadmins (whose user-friendly name is “GELCorp Admins”).

The next task we might perform is to add a caslib and set authorization on it. We can do that with the following calls to the cas interface.

./sas-admin cas caslibs create path --name ordata --path /tmp/orion --server cas-shared-default
./sas-admin cas caslibs add-control --server cas-shared-default --caslib ordata --group gelcorpadmins --grant ReadInfo
./sas-admin cas caslibs add-control --server cas-shared-default --caslib ordata --group gelcorpadmins --grant Select
./sas-admin cas caslibs add-control --server cas-shared-default --caslib ordata --group gelcorpadmins --grant LimitedPromote
Here is the script, referenced earlier, that chains these steps together:

#Read the authentication token from the credentials file
export TOKEN=`grep access-token ~/.sas/credentials.json | cut -d':' -f2 | sed 's/[{}",]//g'`
#Get gelcontent folder id
curl -X GET "$endpoint/folders/folders/@item?path=/gelcontent" -H "Authorization: bearer $TOKEN" | python -mjson.tool > /tmp/newfolder.txt
id=$(grep '"id":' /tmp/newfolder.txt | cut -d':' -f2 | sed 's/[{}",]//g')
echo "The folder ID is" $id
#Create Orion folder
$clidir/sas-admin --output text folders create --name Orion --parent-id $id > /tmp/folderid.txt
orionid=$(grep "Id " /tmp/folderid.txt | tr -s ' ' | cut -f2 -d " ")
echo "The Orion folder ID is" $orionid
#Set permissions
$clidir/sas-admin authorization create-rule grant --permissions read,create,update,delete,add,remove,secure --group gelcorpadmins --object-uri /folders/folders/$orionid/** --container-uri /folders/folders/$orionid
$clidir/sas-admin authorization create-rule grant --permissions read --group gelcorp --object-uri /folders/folders/$orionid

The SAS Viya command-line interfaces are a very valuable addition to the administrator’s toolbox. There is obviously much more that can be done with the CLIs than we can cover in this article. For more information and details of the available interfaces, please check out the SAS Viya administration documentation.

SAS Viya 3.3 command-line interfaces for Administration was published on SAS Users.


Is it sensitive? Mask it with data suppression

Report data shared by educational institutions, government agencies, healthcare organizations, and human resource departments can contain sensitive or confidential data. Data in such reports are suppressed selectively to protect the identities of individuals or to prevent the report’s audience from easily inferring individual values. The Data Suppression feature in SAS Visual Analytics 8.2 is easy to use when you need to selectively suppress aggregated data values in your reports.

All you need to do is create a calculated data item for Data Suppression and apply it to a report object such as a list table or a crosstab.  You could apply Data Suppression to a variety of report objects, but suppressing data for cells in either list tables or crosstabs is a common practice.

Here are a couple of examples where data suppression is applicable:

  • Universities and schools that release data on their students often use a cell threshold value to reduce the risk of identifying specific students when the number of students in a class falls below the defined threshold and individual values for test scores or other criteria (such as race) could be easily determined by looking at the data.
  • In official reports with federal statistics that are provided by the Centers for Disease Control and Prevention in the U.S., certain data cells in the reports are suppressed to protect the confidentiality of patients and eliminate the risk of disclosing their identity. Patient data in such reports are suppressed by using a cell suppression threshold value of 16.

Before we jump into data suppression in SAS Visual Analytics, a quick note on understanding two kinds of data suppression.

Data Suppression by Using the withComplement Option

When a calculated data item is created for Data Suppression, SAS Visual Analytics applies the withComplement option by default, and an additional complementary value is hidden randomly (by displaying an asterisk) when you suppress the data for a single aggregated value. This is done to prevent easy inference of the suppressed value from the total, subtotals, or other cell values.

Data Suppression by Using the withoutComplement Setting

If a calculated data item for Data Suppression is created by using the withoutComplement option, SAS Visual Analytics suppresses (by using an asterisk) only the aggregated data values that you chose to suppress, and no other additional complementary values are hidden with asterisks.

Let’s Do It

As an instructional exercise for data suppression, I chose a small subset of the data for high school students and their SAT test scores in the state of North Carolina. I added three list tables to my report. My first list table has no data suppression (so we can see the data that I intend to suppress). My second list table will have data suppression without complementary values, and my third list table will have data suppression with complementary values.

In the first list table, the TESTED column shows the number of students that took the SAT test in each high school. If 14 or fewer students took the SAT test, I want to suppress the display of the number of students in the TESTED column for that high school.

Create the Calculated Data Item for Data Suppression Without Complementary Values

1.  In SAS Visual Analytics, I click on Data, right click on TESTED (the measure upon which my calculated item for data suppression will be created), and select New calculation.

2.  In the Create Calculation dialog, I change the Type to Suppression. By default, SAS Visual Analytics fills in a default value of 5 observations for the Suppress data if count less than: parameter field. I plan to change this value and the condition, but for now I keep the default value and click OK.

Edit the Calculated Data Item for Data Suppression Without Complementary Values

1.  To edit the calculated item that I just created, I click Data, right-click the calculated item (TESTED (Data suppression) 1), and choose Edit.

2.  In the Visual mode, I see the calculated item for data suppression.

3.  I click Text because I want to base the suppression on the value of the TESTED column (the number of students who took the test) being 14 or below, rather than on the number of observations (Frequency) that is used by default. So I edit the condition for data suppression and save it:

4.  My second list table already has roles assigned to it. Now I add the newly created calculated data item: TESTED (Data Suppression) 1.
This list table now shows asterisks in the TESTED column for any high school where 14 or fewer students took the SAT test.

In the new calculated column, the values of the TESTED measure that meet my condition are replaced with asterisk characters. It is important to note that although the suppressed values for TESTED are hidden from view with asterisks, they are still present in the data source. Therefore, I should hide the original measure (in this case, TESTED) from view in the report to prevent its accidental use in other report objects in the same report (we’ll take a quick look at that at the end).
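The "without complement" rule itself is easy to reason about. Here is a minimal Python sketch (not SAS Visual Analytics code) of that logic, using made-up school names and counts and the threshold of 14 from this example:

```python
# Toy stand-in for the "withoutComplement" rule: mask only the cells that
# meet the suppression condition (TESTED <= 14 in this example). The school
# names and counts are made up for illustration.
schools = {"Alpha High": 210, "Beta High": 13, "Gamma High": 98}

def suppress_without_complement(values, threshold=14):
    """Replace any value at or below the threshold with an asterisk."""
    return {k: ("*" if v <= threshold else v) for k, v in values.items()}

print(suppress_without_complement(schools))  # only Beta High is masked
```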

Create the Calculated Data Item for Data Suppression With Complementary Values

1.  I click on Data, right click on TESTED, and select New calculation.

2.  In the Create Calculation dialog, I change the Type to Suppression and click OK to save this new calculated item.

Edit the Calculated Data Item for Data Suppression With Complementary Values

1.  To edit the calculated item that I just created, I right click on the calculated item for data suppression and choose Edit.

2.  In the Edit Calculated Item dialog, I click Text to see the text version of the calculated data item, and I edit the condition so that data is suppressed for high schools where the total number of students tested equals 13.

My list table now shows values suppressed in the TESTED column for the high school where 13 students took the SAT test. In addition, another value in the TESTED column is suppressed at random by SAS Visual Analytics – in this case, for Creswell High School. This random suppression of another value prevents your audience from looking at the Totals column and deducing the number of students who took the SAT test in each high school.
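The arithmetic behind that precaution is worth spelling out. SAS Visual Analytics picks the complementary cell itself; the Python sketch below uses made-up numbers only to show the inference risk that the complement prevents:

```python
# Why withComplement matters: with a single cell masked, the column total
# gives the hidden value away; masking one complementary cell breaks the
# arithmetic. School names and counts are made up for illustration.
values = {"Alpha High": 210, "Beta High": 13, "Gamma High": 98}
total = sum(values.values())               # 321, still shown in the report

masked = dict(values, **{"Beta High": "*"})
visible = sum(v for v in masked.values() if v != "*")
inferred = total - visible                 # recovers the suppressed 13

masked["Gamma High"] = "*"                 # the complementary value (VA picks it at random)
visible = sum(v for v in masked.values() if v != "*")
# total - visible is now the sum of TWO hidden cells, so neither is revealed
```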

Be sure to follow the three best practices that are described for data suppression in the SAS Visual Analytics 8.2 documentation.

And as promised earlier: once the original data item is hidden, the TESTED measure does not display anymore.

For details on how to show or hide data items, see the SAS Visual Analytics documentation.

Is it sensitive? Mask it with data suppression was published on SAS Users.


Admin Notebook: Making the case for selectively backing up metadata

Let’s say that you are administering a SAS 9.4 environment that is working just fine. You’ve checked that your full backups are indeed happening, and you’ve even tried restoring from one of your backups. You are prepared for anything, right? Well, I’d like to propose a scenario to you. You probably have users responsible for creating reports, maybe even very important reports. What if something happened to one of these reports? Perhaps the user wants to revert to an earlier version. Perhaps the report was accidentally deleted or even corrupted – what then? Restoring a full backup in this situation might help this one user but would likely inconvenience most other users. With a little more preparation, you could “magically” restore a single report if needed. Here’s what you need to do: create a backup of only these critical reports using the promotion tools.

The promotion tools include:

  • the Export SAS Package Wizard and the Import SAS Package Wizard available in SAS Management Console, SAS Data Integration Studio, and SAS OLAP Cube Studio.
  • the batch export tool and the batch import tool.

Note: Starting with the third maintenance release of SAS 9.4, you can use the -disableX11 option to run the batch import and batch export tools on UNIX without setting the DISPLAY variable.

You can use the promotion tools on almost anything found in the SAS Folders tree, especially if you use SAS Management Console. The wizards in SAS Data Integration Studio and SAS OLAP Cube Studio allow you to access and export/import only the objects that pertain to that application – a subset of what is available in SAS Management Console.

You may be thinking that using an interactive wizard is not really the answer you are looking for and you may be right. The batch tools are a great solution if you want to schedule the exporting of some objects on a regular basis. If you are unfamiliar with the promotion tools, I would suggest you start with the interactive wizards. You will find that the log produced by the wizard includes the equivalent command line you would use. It’s a nice way to explore how to invoke the batch tools.

Creating the Export Package

How to invoke the Export SAS Package Wizard:

1.  Right-click on a folder or object in the SAS Folders tree and select Export SAS Package.

Selectively backing up metadata

2.  Enter the location and name of the package file to be created and set options as appropriate.

You can opt to Include dependent objects when retrieving initial collection of objects here or you can select specific dependent objects on the next screen.

Filtering offers some very interesting ways of selecting objects including:

  • By object name
  • By object type
  • By when objects were created
  • By when objects were last modified

3.  Select the objects to export. If you started the process with a folder, you will be presented with the folder and all of its contents selected by default. You can deselect specific objects as you like.

In this example, we only want the Marketing folder and its contents, so deselect the other folders. You also want to be careful not to create a package file that is too big.

You can click on individual objects and explore the dependencies each object has, the other metadata objects that use it, and its options and properties.

In this example, the Marketing Unit Report is dependent on the MEGACORP table whose metadata is found in the /Shared Data/LASR Data folder. When you import this report, you will need to associate the report with the same or similar table in order for the report to be fully functional.

If you had selected Include dependent objects when retrieving initial collection of objects on the previous screen, all of the dependent objects would be listed and be selected for export by default.

Bonus things you get by default in the export package include:

  • Permissions set directly on the objects
  • For most object types, the export tools include both metadata and the associated physical content. For example, with reports you get both the metadata and associated report XML. For a complete list of physical content promoted with metadata objects, refer to:

    5.  When the export process is complete (hopefully without errors) review the log.

    At the top of the log, you can see the location of the log file in case you want to refer to it later.

    If you scroll to the end of the log, you’ll find the command line to invoke the batch export tool to create the same package.

    Considerations for Exporting

    Importing to the Rescue

    Let’s talk about what happens if and when you actually need to import some or all of the objects in a package file.
    Let’s take a look at what we would need to do to replace an accidentally deleted report, Marketing Unit Report.

    How to invoke the Import SAS Package Wizard:

    5.  Right-click on the same folder from which you started the export (the SAS Folders folder in our example) and select Import SAS Package. It is important to initiate the import from the same folder where you started the export if you want to end up with the same folder structure.

    6.  If needed, use the Browse functionality to locate the correct package file.

    Include access controls

    By default, Include access controls is not selected. This option will import permission settings directly applied to the objects in the package. It will not import any permissions if there were only inherited permissions on the object in the source environment.

    Since we are bringing the report back into the folder it originally came from, it makes sense to also include direct permissions, if there were any.

    If you do not check the Include access controls box and there are in fact some direct permissions on objects being imported, you will get this warning later in the wizard:

    Select objects to import

    If you’re not sure whether to select to import All objects or New objects only, you can always start with all objects. You can use the Back buttons in the wizard to go back to previous prompts and change selections, at least before you kick off the actual import process.

    7.  If you selected All objects on the first screen, you will see a listing of all objects. Each object has an icon indicating whether the object currently exists where you are doing the import. A red exclamation mark indicates that the object already exists and importing it will overwrite the current object with the copy from the package. An asterisk icon indicates that the object does not currently exist and will be created by the import process.

    In our example, the Marketing Unit Report does not currently exist in the Marketing folder but is in the package file so it is labeled with an asterisk. The other two reports are both in the folder and the package file so they are labeled with red exclamation marks.

    You’ll want to make the appropriate selections here. If you want all of the contents of the package to be written to the Marketing folder, overwriting the first two reports and adding the Marketing Unit Report, leave all objects selected. If one of the reports had become corrupted, you could use this method to overwrite the current copy with the version stored in the package file.

    If you just want to replace the missing Marketing Unit Report, make sure only that object is selected as below:

    By default, objects are imported into the same folder structure they were in when the export package was created.

    8.  Part of the import process is to establish associations between the objects you are importing and metadata not included in the package. You are first presented with a list of the metadata values you will need to select.

    9.  Set the target value(s) as needed.

    In our example, we definitely want the report to use the same table it used originally.
    If we were moving objects to a new folder or a new environment, you might want to associate the report with a different table.

    If you use the batch import tool, changing these associations would be done in a substitution properties file.

    10.  Review the import summary and initiate the import process.

    11.  Hopefully, the process completes without errors and you can review the log.

    12.  Finish things off by testing the content you imported. In this case, we would log in to SAS Visual Analytics and view the Marketing Unit Report.

    Considerations for Importing

    • If you initiated the export from the SAS Folders folder and try to import the package from another folder, Marketing for example, the wizard will recreate everything in the package, including a new Marketing subfolder which is probably not what you intended.

    Notice the new Marketing folder inside the current Marketing folder. In addition, all three reports are considered new since the new Marketing subfolder does not currently exist.

    • The account you use to do the import should have enough access to metadata and the operating system.

    Next Steps

    • Decide what you want to export, how often, and how long you want to keep a specific package file.
    • Once you’ve gotten comfortable with the wizards and you want to schedule an export (or several), you should try out the batch export and import tools. When you name the export package, you can consider customizing the package name to include the date to avoid overwriting the same package file each time.
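The batch export command itself is best copied from your Export Wizard log; the sketch below only illustrates the date-stamping idea in Python. The ExportPackage flags shown are assumptions for illustration, so verify them against the command line in your own log.

```python
from datetime import date

# Sketch: build a date-stamped package name so a scheduled export does not
# overwrite the previous package. The ExportPackage flags below are
# assumptions for illustration -- copy the exact command from your
# Export Wizard log instead.
stamp = date.today().strftime("%Y%m%d")
package = f"/backups/marketing_{stamp}.spk"
cmd = (f'ExportPackage -profile "Admin" -package "{package}" '
       f'-objects "/Shared Data/Marketing(Folder)" -log "{package[:-4]}.log"')
print(cmd)
```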

    Review the documentation on both the wizards and the batch tools.

Admin Notebook: Making the case for selectively backing up metadata was published on SAS Users.


A conversation with SAS Global Forum 2018 Chair Goutam Chakraborty

Goutam Chakraborty is a busy man. In addition to serving as the Ralph A. and Peggy A. Brenneman professor of marketing at Oklahoma State University, Dr. Chakraborty is the director and founder of the SAS and Oklahoma State University Business Data Mining Certificate programs, and an award-winning author and professor. He teaches courses in such areas as business analytics, marketing analytics, data mining, marketing research, and web strategy, and has been preparing students to enter the workforce with advanced skills in marketing and analytics for more than 20 years. Throw in the regular consulting engagements he has with some of the world's top companies and it makes you wonder if Dr. Chakraborty has time to add anything else to his already full plate. Well, this year at least, you can add SAS Global Forum 2018 Chair to the list - likely at the expense of a good night's sleep.

As the largest gathering of SAS users in the world, SAS Global Forum will attract more than 5,000 SAS professionals for several days of learning and networking. Recently, I sat down with Dr. Chakraborty to talk with him a bit about this year's conference, which takes place April 8-11, 2018 in Denver. I left excited about SAS Global Forum 2018 and, at the expense of losing credibility as a fair and balanced reporter, convinced that Dr. Chakraborty is one of the nicest individuals you'll ever meet.

Larry LaRusso: I know you've been preparing to chair SAS Global Forum 2018 for more than three years, but now that the event is only a few weeks away, how excited are you to kick this thing off?
Goutam Chakraborty: More excited than you know Larry. I've participated in many SAS Global Forums, but serving as chair gives you the ability to influence every aspect of the event, from speaker and content selection to charity-related events and networking opportunities. It's been a wonderful opportunity to give back to the SAS user community, one I'll never forget.

LL: What excites you most about this year's event?
GC: There are so many new things about this year's conference, all geared toward providing an enriching experience for all SAS users. I'll mention three that immediately come to mind.

One thing we've tried to do well this year is connect industry with academics. While we'll have a full program of events and talks specifically geared toward students and professors, this year we'll emphasize partnerships with industries in a new way. I might be most excited about Sunday's Talent Connection. This event brings students and SAS professionals together to network, discuss career opportunities and share knowledge, research and partnership opportunities that might exist with each other. I anticipate it being a great success for both students and industry looking to connect with young analytical talent.

Another strong focus for us is career development and learning for SAS users at all levels. We'll have a full menu of traditional training and certification opportunities for data scientists, business and data analysts and SAS programmers, but we're also providing opportunities to build on soft-skills development, such as networking, analytical story-telling and much more. We'll also have an on-site Learning Lab, available for several hours each day, where users can explore more than 25 e-learning courses for free.

Finally, I'll mention our volunteer opportunities. We'll have several ways for users to give back, but I'm particularly excited about our STEM-related charity event. During meals and evening networking receptions, both Monday and Tuesday, attendees will have the opportunity to work with RAFT Colorado (Resource Area For Teaching), and build STEM-inspired teaching kits for local teachers to use in their classrooms. Each kit will repurpose educational items RAFT has collected and make them available to teachers as creative tools for teaching STEM – inspiring the next generation of thinkers, innovators, problem-solvers and creators. It's an extraordinary opportunity to impact local area children.

LL: Speaking of extraordinary, this year's conference theme is "Inspire the Extraordinary." What does that theme mean to you?
GC: It means never accept "good enough." I always tell my students to push for something above and beyond what's expected of them, to be extra-ordinary. We expect the same for this year's SAS Global Forum. Knowing the event like I do, I feel confident we're going to deliver a SAS Global Forum that surprises and delights our users in a way they didn't expect.

LL: We all know that one of the best things about SAS Global Forum is its incredible content. What can you tell us about the content you’re putting together for this year’s event?
GC: Thanks to tons of hard work and research from a lot of SAS users, we've selected fantastic content from renowned speakers from across the world. Perhaps the best part of our content planning this year is the variety. Topics range from deep hard-core programming to high-level strategic thinking about data and analytics. From sessions that will help you to develop yourself personally as a better human-being to learning about optimizing Monday night NFL schedule for best viewership to thinking strategically about data as a currency – there is something of value for everyone.

LL: SAS Global Forum is likely to attract more than 5,000 data scientists, analytics professionals and business leaders. Every year it amazes me how many of those users are attending SAS Global Forum for the first time. What advice would you give first-timers?
GC: First piece of advice: Have a plan and build a personalized agenda so you don’t get overwhelmed by the large number of available sessions. Second, take every opportunity to engage and network with other attendees. One of the best things about this conference is how willing veteran SAS users (regulars at this conference) are to help and welcome newcomers. So, take advantage of it. If you are sitting down for breakfast or lunch, take the time to introduce yourself to people around you. You may be surprised where it could lead. I'd also encourage attendees to take time to visit the Quad. The Quad is a casual and interactive space where attendees can network with other SAS enthusiasts, view demos and visit with experts from SAS and our sponsors. And, last but not least, have some fun! Attend the social events we have planned, especially the Kick Back Party at Mile High Stadium on Tuesday evening.

LL: As an academician, I know you’re passionate about learning. What additional learning opportunities, beyond the session talks, are available to attendees?
GC: There are so many learning opportunities at SAS Global Forum that it is mind-numbing. Of course, the 20 and 50 minute session talks are the main modes of content delivery, but there are also e-posters, table talks and super demos in the Quad. We'll also have dozens of pre-conference tutorials, post-conference training, and all the activity in the Learning Labs, including hands-on workshops and the ability to take individual e-learning courses.

LL: Given your personal interests, I know one of your goals for this year’s conference is to increase participation in the event for students and professors. Can you tell me a little more about the special events you have planned for this audience?
GC: For starters, SAS Global Forum is completely “free” for students! As long as you are a full-time enrolled student of an accredited, degree-granting academic institution you can attend free of charge. There are credit hour minimums that must be reached to be eligible, so I'd encourage students to visit the website for complete details.

Programmatically, we have the Sunday afternoon sessions entirely dedicated to academics. We have a fantastic academic keynote speaker, Temple Grandin from Colorado State University, and special training sessions for professors interested in teaching analytics at their universities. For students, we offer a number of opportunities to network and special courses, such as how to best use social media for networking while looking for a job, to help them make a successful transition from student to working professional. We also encourage students, and really anyone who has an interest, to attend the presentations students make as winners of the SAS Global Forum Student Symposium. Though closed now, the Symposium provides an opportunity for teams of two to four students and a faculty adviser to showcase their skills and compete with other teams in the application of SAS Analytics in solving a big data problem. This year, more than 60 teams entered; the top eight will present 20-minute talks during the event.

LL: Dr. Chakraborty, I've taken a lot of your time, but is there anything else you'd like to share with our readers?
GC: Actually, I'd like to thank the many volunteers who have helped put this conference together. From serving on our SAS Global Users Group Executive Board to helping evaluate and select talks, to serving in our Presenter Mentor Program, hundreds of users have invested their time to make this conference the best one yet. SAS Global Forum is truly a user's conference and we depend on the user community to plan, promote and execute so many tasks and activities related to the event. Though I can't call them out by name, I would be remiss if I didn't mention their contributions and take a minute to thank them.

LL: Well let's hope they're reading! Dr. Chakraborty, I want to thank you again for your time. I look forward to seeing you in Denver in April.

Visit the SAS Global Forum 2018 website for more information and to register. Conference Proceedings will be available shortly before the event begins.

Continue the conversation: Join our live Tweetchat, Wednesday, March 7, 2018

How are you inspiring the extraordinary?

The next extraordinary analytics use case is just waiting to be discovered. We believe that in the hands of lifelong learners, the future of data is unlimited, especially when education and business join forces. That is why we are warming up to SAS Global Forum 2018 in Denver with a tweetchat on Wednesday, March 7 (simply search #SASchat or #SASGF). We kick off at 6pm CET, 5pm UK, noon ET and 9am Pacific. Will you join us? The discussion will kick off with the following questions, posed to our expert panel:

  • Why is there more interest in lifelong learning now?
  • How does lifelong learning contribute to the analytics economy?
  • What are your favorite examples of analytics in the not-for-profit sector?
  • How is the education sector influencing the development of citizen data scientists?
  • What trends do you see in the consumption of analytics?

A conversation with SAS Global Forum 2018 Chair Goutam Chakraborty was published on SAS Users.


Beam your customers into invisibility: a data protection masked ball to get you up to speed with the GDPR

Here are some new tips for masking. The new EU General Data Protection Regulation (GDPR) requires your company to implement (to quote) “all necessary technical and organizational measures” and to take into consideration “the available technology at the time of the processing and technological developments.” So, how can you comply with this requirement in the real world? In Part 1, we anonymized field content or replaced it with aliases. That can be sufficient, but it doesn’t have to be. That’s why we’ll cover beta functions in this article (the ideal solution for pseudonymization), personal data that has slipped through the cracks, and the exciting question of ...

Read part 1 of this series: Pseudonymagical: masking data to get up to speed with GDPR

How random can your birth be?

The exact date of your birth is important to you, naturally. The analytics experts working with your data, on the other hand, aren’t looking to send you birthday wishes anyway (missing opt-in?!). What they’re interested in is your approximate age, maybe even just the decade. The SQL code from Part 1 moves the date of birth randomly plus or minus five days. Someone who knows your birth date would therefore be unable to locate your records within a stolen database. Privacy risk abated!
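In Python terms, the jitter idea from Part 1 looks like this (a sketch of the logic, not the original SQL):

```python
import random
from datetime import date, timedelta

def jitter_birthdate(birthdate, max_days=5):
    """Shift a birth date by a random offset in [-max_days, +max_days] days."""
    return birthdate + timedelta(days=random.randint(-max_days, max_days))

# An illustrative birth date: the masked value stays within five days,
# so the approximate age is preserved but exact-date lookups fail.
original = date(1980, 6, 15)
masked = jitter_birthdate(original)
assert abs((masked - original).days) <= 5
```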

But even that should be verified with respect to providing proof of “appropriate measures” – in other words, cluster size. In our example of around 5,000 VIP customers, there is only one who is in their 20s and has a postal code beginning with the numeral 1. The time required to indirectly identify this individual (Recital 21, GDPR) could be rather low here. In the worst case, legally too low.

Enter the beta function: the ideal solution for pseudonymization

Luckily, Recital 29 of the General Data Protection Regulation tells us how to handle this problem. The information required to pinpoint an individual is simply stored separately. That can be accomplished using a key or a mathematical function – in other words, a macro with a secret key that only I can use, without knowing the math hidden behind it. The law doesn’t tell us how tricky this logic has to be, though. This so-called beta function should satisfy two additional conditions from an analytical standpoint:

  • It must be invertible (a hash is not, for instance).
  • The result of the masking should be monotonic, which means: high original value = high new value (encryption doesn’t do this).

Why? Well, we don’t want to affect the analytic modelling too much - ideally, the function would output something linear or slightly exponential… Here is a √2 example I’ve kept simple:

proc fcmp outlib=SH.GDPR.BETA;
   /* forward transformation: "beam" the values away */
   function BETA1(typ $, wert);
      if typ = 'AGE1'  then return(wert*sqrt(2));
      if typ = 'DATE1' then return(wert+floor(3650*sqrt(2)));
   endsub;
   /* inverse transformation: beam them back again */
   function BETA1I(typ $, wert);
      if typ = 'AGE1'  then return(wert/sqrt(2));
      if typ = 'DATE1' then return(wert-floor(3650*sqrt(2)));
   endsub;
run;

Mathematically, this is a coordinate transformation - or you can also think of it in terms of Star Trek: people are being beamed to an unfamiliar planet. There is a different gravity field than the earth there (a different coordinate system), but it applies to everyone equally — which means that lightweight visitors on the planet can still jump higher there than their heavyweight colleagues. The same applies accordingly to age etc.

proc ds2;
   package GDPR / overwrite=yes language='fcmp' table='SH.GDPR';
   data pdp_de_demo.data.crm_customerbase_ds2_beta(type=view overwrite=yes
         keep=(birthdate birthdate_after_beta age age_after_beta));
      dcl package GDPR p();
      dcl date birthdate_after_beta having format ddmmyyp10.;
      dcl double age;
      dcl double age_after_beta;
      method run();
         set pdp_de_demo.data.CRM_CUSTOMERBASE;
         /* age in years, then both values through the beta function */
         age = round((today()-to_double(birthdate))/365.25,1);
         age_after_beta = round(p.BETA1('AGE1',age),1);
         birthdate_after_beta = to_date(p.BETA1('DATE1',to_double(birthdate)));
      end;
   enddata;
run; quit;

When using the birth date or the age, I, as an analytics expert, have no knowledge of how this beaming works technically, but I trust that when I’m developing (and later scoring) models, nothing about the behavior has changed. By the way, the computer and the correlation don’t care anyway - neither has any concept of age. (It just feels a bit strange for humans.)

We don’t lose the “true” age: it can be recalculated using another beta function, known as the inverse, which is available only to authorized employees - for instance, to fraud or legal staff during data protection lawsuits. In these cases, your customer can safely be beamed back to earth, so to speak.
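The two requirements – invertibility and monotonicity – are easy to check. Here is a Python sketch mirroring the BETA1/BETA1I age branch above, with √2 standing in for the secret key:

```python
import math

# Toy version of the BETA1/BETA1I pair, with sqrt(2) as the "secret" key
# (in production the key lives only in a protected central repository).
KEY = math.sqrt(2)

def beta_age(age):        # forward: beam the value away
    return age * KEY

def beta_age_inv(value):  # inverse: beam it back (authorized users only)
    return value / KEY

ages = [18, 35, 70]
masked = [beta_age(a) for a in ages]

assert masked == sorted(masked)  # monotonic: ordering is preserved
assert all(abs(beta_age_inv(m) - a) < 1e-9 for a, m in zip(ages, masked))  # invertible
```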

A complaint from my office mate

“But how do I explain to the boss my model’s behavior for these 300-year-olds?!” Well, in this era of machine learning, neural networks are gaining in popularity and are as selective as they are indescribable. On our side, the math behind the masking is at least deterministic and explainable. And it’s good to know that this key code is no longer stored on your PC, glued to its data source and target, but kept remote and safe - modern data protection, protecting both you and the data. And that’s a good thing.

Final aspect: the data for relevant columns has now been subjected to smart masking, the logic is in a central repository, and it’s working in secret. But what about those seemingly harmless fields way in the back, mostly empty and irrelevant, which then in the form of a sales memo or notice suddenly reveal the name of the wife, the second email address, or the former employer? The author who created them thought it was extremely practical, since they didn’t find anywhere else in the contract template where they could enter and save the information.

CASE
   WHEN SYSPROC.DQ.DQUALITY.DQEXTRACT( A.COMMENTFIELD,
        'PDP - Personal Data (Core)', 'Individual', 'DEDEU' ) ne ''
   THEN '***'
   ELSE A.COMMENTFIELD
END AS commentfield_without_name,

SAS Data Quality has pre-configured, transparent sets of rules that you can tweak as necessary to detect many of these types of cases using heuristics. That’s indispensable because if I don’t know about it, I can’t protect against it. (If I forget about the tiny basement window when installing the security system, I can be sure that the robbers won’t cooperate by breaking down the front door).

That is a prerequisite for an inventory of the data warehouse and for estimating the GDPR implementation expense - and here it serves as an additional safeguard. Because in the code above, a firewall filter is applied to the data: if the name of a human being slips through the cracks, only asterisks are displayed when it is output. The field “Note” is replaced by a description of the category, such as: “This is where a telephone number is hidden. After approval by the data protection officer, you may read it – but not for now.”
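The real detection is done by SAS Data Quality's curated, locale-aware rule sets; the Python sketch below is only a toy stand-in (the regex and name list are invented for illustration) showing the firewall idea of masking a free-text field once personal data is detected:

```python
import re

# Toy stand-in for the DQEXTRACT firewall: SAS Data Quality uses curated,
# locale-aware rule sets; this regex "detector" and its name list are
# invented here purely to illustrate the filtering idea.
KNOWN_NAMES = re.compile(r"\b(Smith|Mueller|Jones)\b")

def firewall(comment):
    """Return asterisks when personal data is detected, else the comment."""
    if KNOWN_NAMES.search(comment):
        return "***"
    return comment

assert firewall("Call back re: contract renewal") == "Call back re: contract renewal"
assert firewall("Spoke with Mrs. Mueller about the renewal") == "***"
```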

Are you ready for the GDPR? Learn how your peers are preparing in this global survey report.

Disclaimer: The author of this blog is not an attorney. None of the statements in this article can be construed as legal advice nor can they serve as a substitute for professional legal consultation. All code samples are for illustrative purposes only.

Beam your customers into invisibility: a data protection masked ball to get you up to speed with the GDPR was published on SAS Users.
