29
Jun

Key Value Object in SAS Visual Analytics

Introduced with SAS Visual Analytics 8.2 is a new object named Key Value. The intent of this object is to call attention to an aggregated value for a measure, a category, or both. For additional specifics, refer to the SAS Visual Analytics documentation.

I’ve mocked up several reports to show some of the combinations available, to give you an idea of what the Key Value object can look like. Toward the end of the blog, I will add additional reports to provide design ideas for placement and action assignments.

Text Style with Measure Value Highlight

Here you can see in the report that I am highlighting two measure values. Both represent the highest value, but I configured one to show the aggregation and the other not. I was able to mimic similar headings by renaming my data item. Be sure to look at the Options and Roles panes for the assignments to understand how I accomplished this.

Infographic Style with Measure Value Highlight

Here is the same information represented using the Infographic style. Seeing the same report using the different styles allows you to quickly determine the most powerful and appropriate visualization to meet your needs. We cannot control the size of the circle, only the color. In this case, the circles are different thicknesses because of the number of characters used to represent the measure values inside the visual.

Text Style displaying both Measure Value and Category

In this report, I have shown how to use the Text style to display both a category value and a measure value. As you can see, only one can be highlighted, that is, given the largest font. Notice that in this report I used the object’s Title to help explain the key value being displayed; this is a recommended best practice.

Infographic Style displaying both Measure Value and Category

Below I am using the Infographic style; notice in the Options pane that I had to use the Additional information attribute to better label the data. When you are in the design phase and toggling between the Text and Infographic styles, make sure to review and test the available Key Value Style attributes to better label the visual.

Text Style with Category Value Highlight

In this report, I show how to highlight the category value using the Text style. Since I chose not to use any of the available label attributes, it is critical that I use the object’s Title to better explain the key value displayed.

Infographic Style with Category Value Highlight

In this report, you can see how I changed the layout from the previous report to place the Key Value object side by side with the other report objects. If you are interested in using the Infographic style with the circle enabled, you may have to adjust your report design to accommodate the space the circle needs to display. Remember not to shy away from adding white space to your report; it can help add emphasis to a particular visual, in this case a key value.

Summary

Some important things to remember about using the Key Value object in your reports:

  • Use the Key Value object’s Title to inform your users what the number or category value represents, so there is no ambiguity as to whether they are looking at the maximum or minimum value.
  • When determining which style you prefer, Text or Infographic, it may be easier to make a duplicate of the Page and then adjust the style attributes till you find the desired combination.
  • Take time to adjust the arrangement of objects on your report to get the most pleasing configuration. Don’t shy away from leaving white space in your report. You can also experiment with the Container object using the Precision container type to layer the Key Value object.
  • The Key Value object will be affected by Report and Page Prompts like any other report object and you can even define Actions to filter the Key Value object.

Here are some additional examples of using the Key Value object:

In this example, you can see from the Actions Diagram how the Key Value object is being filtered: first by the two page prompts, and second by a direct filter action defined from the List Control object.

In this example, we can see from the Actions Diagram that I used the Container object. I then selected the Precision container type and overlaid the Key Value object on the Line Chart. The only filters applied to these report objects are the page prompts.

And in this last example, you can see that I have no Report or Page Prompts or any other filters impacting the Key Value objects. Therefore, these values represent the entire data source.

Key Value Object in SAS Visual Analytics was published on SAS Users.

21
Jun

SAS Viya connecting with SAS 9.4 One-Time-Passwords

As a follow up to my previous blog, I want to address connecting to SAS Viya 3.3 using a One-Time-Password generated by SAS 9.4. I will talk about how this authentication flow operates and when we are likely to require it.

To start with, a One-Time-Password is generated by a SAS 9.4 Metadata Server when we connect to a resource via the metadata. For example, whenever we connect to the SAS 9.4 Stored Process Server we leverage a One-Time-Password. Sometimes this is referred to as a “trusted connection,” in that the resource we are connecting to is configured to “trust” the single-use credential generated by the SAS 9.4 Metadata Server.

To make the connection, the client application connects to the SAS 9.4 Metadata Server and requests the One-Time-Password (OTP). This OTP is sent by the client to the resource along with the username that has “@!*(generatedpassworddomain)*!” appended to it. The resource then connects back to the SAS 9.4 Metadata Server to validate the OTP and allow access.

What Does OTP mean for SAS Viya?

First and foremost, we cannot use the OTP to access the SAS Viya 3.3 Visual Interfaces. OTP is not a mechanism to allow SAS Viya 3.3 to be authenticated by SAS 9.4.

The One-Time-Password enables a process running in SAS 9.4 Maintenance 5 (M5), that does not have the end-user credentials, to access SAS Cloud Analytic Services running on SAS Viya 3.3. The easiest and clearest example is that a SAS 9.4 M5 Stored Process can now access the advanced analytics features of SAS Cloud Analytic Services. Equally, the same process would work with a SAS 9.4 M5 Workspace Server that has been configured for “trusted authentication,” where the operating system process runs as a launch credential rather than the end user.

How Does the OTP Work?

If we continue the example of a SAS 9.4 M5 Stored Process, the SAS code in the Stored Process includes a CAS statement or CAS LIBNAME. In the CAS statement the authdomain is specified as _sasmeta_; this tells the Stored Process to connect to SAS 9.4 M5 Metadata to obtain credentials. The SAS 9.4 M5 Metadata returns a One-Time-Password to the Stored Process and this is used in the connection to SAS Cloud Analytic Services.
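For illustration, here is a minimal sketch of what that code might look like inside the Stored Process. The session name, host, port, and caslib values are hypothetical:

/* A minimal sketch: host, port, and caslib values are hypothetical */
cas mysess host="viya.example.com" port=5570 authdomain="_sasmeta_";
 
libname mycas cas sessref=mysess caslib="casuser";
 
/* CAS processing goes here, for example running CAS-enabled procedures on mycas tables */
 
cas mysess terminate;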

SAS Cloud Analytic Services authenticates the incoming connection using the OTP. Since the user is flagged with “@!*(generatedpassworddomain)*!” SAS Cloud Analytic Services knows not to authenticate the user against the PAM stack on the host. SAS Cloud Analytic Services instead connects to the SAS Viya 3.3 SAS Logon Manager to obtain an internal OAuth token to authenticate the connection.

The SAS Viya 3.3 SAS Logon Manager has been configured with information about the SAS 9.4 M5 environment, specifically, the host running the SAS Web Infrastructure Platform, in the form of a URL. Since the user is “@!*(generatedpassworddomain)*!”, SAS Viya 3.3 SAS Logon Manager knows to send this to the SAS 9.4 M5 Web Infrastructure Platform to validate the OTP. Once the OTP is validated, the SAS Viya 3.3 Logon Manager can generate an internal OAuth token, including retrieving the end-user’s group information from the Identities microservice. This internal OAuth token is returned to SAS Cloud Analytic Services and the session is launched.

The diagram below describes these steps:

SAS Viya connecting with SAS 9.4

The general steps include:

1.     The SAS 9.4 M5 SAS Server, running with a launch credential (Stored Process, Pooled Workspace, or Workspace Server) requests a One-Time Password from the Metadata Server for the connection to SAS Cloud Analytic Services.
2.     The SAS 9.4 M5 SAS Server connects to the CAS Server Controller, sending the One-Time Password.
3.     The CAS Controller connects to SAS Logon Manager to obtain an internal OAuth token using the One-Time Password.
4.     SAS Logon Manager connects via the SAS 9.4 M5 Middle-Tier to validate the One-Time Password.
5.     SAS 9.4 M5 Middle-Tier connects to the Metadata Server to validate the One-Time Password.
6.     SAS Logon Manager connects to the identities microservice to fetch custom and LDAP group information for the validated End-User.
7.     The identities microservice either looks up the validated End-User in its cache or connects to Active Directory using the LDAP Service Account to update the cache.
8.     SAS Logon Manager returns a valid internal OAuth token to the SAS CAS Server Controller.
9.     SAS CAS Server Controller launches the CAS Session Controller as the service account for the End-User.

Note that none of the processes are running as the end-user. The SAS 9.4 process is running with a launch credential, either sassrv or some other account, while the SAS Cloud Analytic Services session runs as the account starting the SAS Cloud Analytic Services process, by default the CAS account.

What do we need to configure?

Now that we understand how the process operates, we can look at what we need to configure to make this work correctly. We need to make changes on both the SAS 9.4 M5 side and the SAS Viya 3.3 side. For SAS 9.4 M5 we need to:

1.     Register the SAS CAS Server in Metadata. As of SAS 9.4 M5, the templates for adding a server include SAS Cloud Analytic Services.
2.     Optionally we might also register libraries against the SAS CAS Server in the SAS 9.4 M5 Metadata.

For SAS Viya 3.3 we need to:

1.     Configure SAS Logon Manager with the information about the SAS 9.4 M5 Web Infrastructure Platform, under sas.logon.sas9, as shown below.
2.     Ensure the usernames from SAS 9.4 M5 are the same as those returned by the SAS Identities microservice.

The SAS Viya 3.3 SAS Logon Manager will need to be restarted after adding the definition.
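As a purely hypothetical sketch (the property name here is an assumption for illustration; consult the sas.logon.sas9 configuration definition in your deployment for the exact key), the definition amounts to a URL pointing at the SAS 9.4 M5 middle tier:

sas.logon.sas9:
    url: http://sas94mid.example.com:7980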

Conclusion

By leveraging the One-Time-Password, we make the power of SAS Cloud Analytic Services directly available to a wider range of SAS 9.4 M5 server processes. This means our end-users, whether they are using the SAS Stored Process Server, Pooled Workspace Server, or even a Workspace Server with a launch credential, can now directly access SAS Cloud Analytic Services.

SAS Viya connecting with SAS 9.4 One-Time-Passwords was published on SAS Users.

19
Jun

Threads and CAS DATA Step

Cloud Analytic Services (CAS) is really exciting. It’s open. It’s multi-threaded. It’s distributed. And, best of all for SAS programmers, it’s SAS. It looks like SAS. It feels like SAS. In fact, you can even run DATA Step in CAS. But, how does DATA Step work in a multi-threaded, distributed context? What’s new? What’s different? If I’m a SAS programming wizard, am I automatically a CAS programming wizard?

Let’s start with a simple task: creating a unique record ID for each row.

SAS DATA Step

In single-engine SAS, this is easily accomplished with the _n_ automatic variable, as shown below:

DATA tableWithUniqueID;
SET tableWithOutUniqueID; 
 
        uniqueID = _n_;
 
run;

CAS DATA Step

Creating a unique ID in CAS DATA Step is a bit more complicated. Each thread maintains its own _n_. So, if we just use _n_, we’ll get duplicate IDs: each thread will produce uniqueID values of 1, 2, and so on. When the thread output is combined, we’ll have a bunch of records with a uniqueID of 1, a bunch with a uniqueID of 2, and so forth. This is not useful.

To produce a truly unique ID, you need to augment _n_ with something else. _threadID_ automatic variable can help us get our unique ID as shown below:

DATA tableWithUniqueID;
SET tableWithOutUniqueID;
 
        uniqueID = put(_threadID_,8.) || '_' || put(_n_,8.);
 
run;

While there are surely other ways of doing it, concatenating _threadID_ with _n_ ensures uniqueness because the _threadID_ uniquely identifies a single thread and _n_ uniquely identifies a single row output by that thread.

Aggregation with DATA Step

Now, let’s look at “whole table” aggregation (no BY Groups).

SAS DATA Step

Aggregating an entire table in SAS DATA Step usually looks something like below. We create an aggregator field (totSalesAmt) and then add the detail records’ amount field (SaleAmt) to it as we process each record. Finally, when there are no more records (eof), we output the single aggregate row.

DATA aggregatedTable ;
SET detailedTable end=eof;
 
      retain totSalesAmt 0;
      totSalesAmt = totSalesAmt + SaleAmt;
      keep totSalesAmt;
      if eof then output;
 
run;

CAS DATA Step

While the above code returns one row in single-engine SAS, the same code returns multiple rows in CAS — one per thread. When I ran this code against a table in my environment, I got 28 rows (because CAS used 28 threads in this example).

As with the unique ID logic, producing a total aggregate is just a little more complicated in CAS. To make it work in CAS, we need a post-process step to bring the results together. So, our code would look like this:

DATA aggregatedTable ;
SET detailedTable end=eof;
 
      retain threadSalesAmt 0;
      threadSalesAmt = threadSalesAmt + SaleAmt;
      keep threadSalesAmt;
      if eof then output;
 
run;
 
DATA aggregatedTable / single=yes;
SET aggregatedTable end=eof;
 
      retain totSalesAmt 0;
      totSalesAmt = totSalesAmt + threadSalesAmt;
      if eof then output;
 
run;

In the first data step in the above example, we ran basically the same code as in the SAS DATA Step example. In that step, we let CAS do its distributed, multi-threaded processing because our table is large. Spreading the work over multiple threads makes the aggregation much quicker. After this, we execute a second DATA Step but here we force CAS to use only one thread with the single=yes option. This ensures we only get one output row because CAS only uses one thread. Using a single thread in this case is optimal because we’ll only have a few input records (one per thread from the previous step).

BY-GROUP Aggregation

In CAS, individual threads are assigned to individual BY-Groups. Since each BY-Group is processed by one and only one thread, when we aggregate, we won’t see multiple output rows for a BY-Group. So, there shouldn’t be a need to consolidate the thread results like there was with “whole table” aggregation above.

Consequently, BY-Group aggregation DATA Step code should look exactly the same in CAS and SAS (at least for the basic stuff).
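For example, here is a minimal sketch of a BY-Group aggregation that returns one row per group in both engines. It assumes detailedTable contains a product classification variable and SaleAmt; in single-engine SAS the table must first be sorted by product:

DATA aggregatedByProduct;
SET detailedTable;
BY product;
 
      retain totSalesAmt 0;
      totSalesAmt = totSalesAmt + SaleAmt;
      if last.product then do;
            output;
            totSalesAmt = 0;
      end;
      keep product totSalesAmt;
 
run;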

Concluding Thoughts

Coding DATA Step in CAS is very similar to coding DATA Step in SAS. If you’re a wizard in one, you’re likely a wizard in the other. The major difference is accounting for CAS’ massively parallel processing capabilities (which manifest as threads). For more insight into data processing with CAS, check out the SAS Global Forum paper.

Threads and CAS DATA Step was published on SAS Users.

13
Jun

A look at the new data pane and data item features in SAS Visual Analytics 8.2

In SAS Visual Analytics 8.2 on SAS Viya 3.3, there are a number of new data features available. Some of these features are completely new, and some are features from the 7.x release that had not yet been included in the 8.1 release.  I’ll cover a few of these new features in this post.

First of all, the Data pane interface has changed to enable users to access actions via fewer and better organized menus.

Data item properties can also be displayed for viewing or editing with a single click.

The new Change data source action displays a Repair report window if report data items are not in the new data source.  The window enables you to replace the missing data items with replacement data items from the new data source before continuing with the change.

Speaking of mapping, in SAS Visual Analytics 8.2, linked selections and filters can automatically be added to objects, and the objects may use different data sources. In that case, you can manually map data sources from the Data pane.  The + icon enables you to add additional pairs of mappings.

When you create a new Geography data item in SAS Visual Analytics 8.2, in addition to using Predefined names and codes or your own custom latitude and longitude data items, you can now also use custom polygon shapes to display your own custom regions. Once you select Custom polygon shapes, you specify, in additional dialogs, the characteristics of your polygon provider.  You can use a CAS table or an Esri Feature Service.

For more information on custom polygons, see my previous blog here.

If you need to use an Esri shapefile for your polygon data, there are macros available in VA 8.2 to convert the data to a SAS data set and load it into CAS.

  • %SHPCNTNT displays the contents of the shapefile.
  • %SHPIMPRT converts the shapefile into a SAS data set and loads it into CAS. A usage sketch follows below.
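Here is a minimal sketch of how these macros might be invoked. The file path, id column, and CAS connection values are hypothetical, and the parameter names should be verified against the macro headers:

/* Hypothetical values throughout; verify parameter names against the macro documentation */
%shpcntnt(shapefilepath=/data/shapefiles/regions.shp);
 
%shpimprt(shapefilepath=/data/shapefiles/regions.shp,
          id=regionId,
          outtable=regions_polygons,
          cashost=localhost,
          casport=5570,
          caslib=casuser);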

The Custom Sort feature is also back in SAS Visual Analytics 8.2. Just right-click the data item, select Custom sort, and then select and order your data values.

For creating a new derived data item, there are several new calculations available for measures:

And speaking of creating calculated data items, you’ll want to check out three useful new operators that are available in SAS Visual Analytics 8.2:

A look at the new data pane and data item features in SAS Visual Analytics 8.2 was published on SAS Users.

8
Jun

Saving and reloading SAS Viya configuration

SAS Viya provides import and export functionality for user-created content like reports and data plans. Often, in addition to content, an administrator will want to save configuration so that it can be reloaded or updated and applied to a different system. SAS Viya provides the capability to save and reload configuration using the SAS Viya command-line interfaces that were covered in a previous blog post.


It is possible to save a set of configuration settings and reload them to the same or a different system. This can be useful when you have your configuration established and you wish to keep a backup, or make a selective backup of configuration prior to making a change.

The connection to LDAP is a key early step in a SAS Viya implementation. With the configuration CLI, once you have the SAS Viya LDAP configuration established, you can export it to a file, and then use that file (with any necessary modifications) to stage additional systems, or as a backup prior to making changes to your existing system’s configuration.

How to save and reload configuration

As always, when using the command-line interfaces, you must sign in first. Then, list the configuration instances for the LDAP user definition to obtain the id you need:

./sas-admin configuration configurations list --definition-name sas.identities.providers.ldap.user  --service identities

 

Next, using the id from the previous step you can list the configuration properties.

./sas-admin configuration configurations show -id b313a5a7-1c73-4f4a-9d3d-bba05b626939

 

Save LDAP Configuration

The save process creates json files. The following steps use the download command to save to json files the connection, user and group configuration instances for the SAS Viya connection to LDAP.

./sas-admin configuration configurations download --target /tmp/ldapconnection.json  --definition-name sas.identities.providers.ldap.connection  --service identities
 
 
./sas-admin configuration configurations download --target /tmp/user.json  --definition-name sas.identities.providers.ldap.user  --service identities
 
 
./sas-admin configuration configurations download --target /tmp/group.json  --definition-name sas.identities.providers.ldap.group  --service identities

 

You should open the json files and check that the correct configuration has been saved. It is possible for the process to complete without errors and return json that is not what you are expecting. This would cause problems with your reload, so checking the saved json is important.

You can keep the JSON file as is, or make changes to key attributes. You may want to do this if you are importing to a different system.

Load the SAS Viya LDAP Configuration

To load, you simply use the update command and pass the json file.

./sas-admin configuration configurations update --file /tmp/ldapconnection.json
 
./sas-admin configuration configurations update --file /tmp/user.json
 
./sas-admin configuration configurations update --file /tmp/group.json

 

The impact of isDefault

There is a value, isDefault, stored within the configuration which has an impact on the persistence of changes made to configuration.

isDefault impacts how services treat existing configuration when a service starts. When a service starts, a setting of:

  • isDefault=true in the existing configuration means the service will overwrite the configuration object with new defaults.
  • isDefault=false in the existing configuration means the service will NOT overwrite the existing configuration object.

In other words, if the configuration is flagged as “default” then the service is permitted to update or add to the default values.

Objects created by the services at startup always have isDefault set to true. Objects created in Environment Manager always have isDefault set to false. This means changes in Environment Manager are always respected by services on restart; they will not be overwritten. But services are allowed to overwrite their own defaults at startup.

When using the CLI, the administrator needs to decide what is the appropriate value for isDefault. If you require the configuration change to persist across service restarts then set isDefault=false.
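For reference, the isDefault flag lives under metadata in the downloaded json files. A minimal sketch of the relevant fragment (all other fields omitted; a real file contains many more attributes):

{
  "metadata": {
    "isDefault": false
  }
}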

Saving and Reloading Micro-Service Logging Levels

Let’s look at another use case for save and reload of configuration. Updating micro-service logging configuration levels in batch can be very useful. You may want to save your current logging configuration and modify it to raise logging levels. You may create multiple json files with different logging configurations for different scenarios. When debugging an issue in the environment you could load a verbose logging configuration. If you wish to keep the new configuration you would edit the json and set isDefault=false.

The step below saves all configuration instances created from the logging.level configuration definition. These configuration instances control the logging level for the SAS Viya microservices and servers.

./sas-admin configuration configurations download --definition-name logging.level --target /tmp/default_logging.level.txt

 

If you wish to persist your new logging configuration, edit the file to set metadata.isDefault=false, save the new file, and then update the logging configuration using the update command:

./sas-admin configuration configurations update --file /tmp/new_logging.level.txt

 

When you are done, you can use the original file to reset the logging level back to default values.

In most cases a server restart is not required after a configuration update; you can find details in the administration guide.

Saving and reloading SAS Viya configuration was published on SAS Users.

6
Jun

Log Management in SAS Viya 3.3

Logs. They can be an administrator’s best friend or a thorn in their side. Who among us hasn’t seen a system choked to death by logs overtaking a filesystem? Thankfully, the chances of that happening with your SAS Viya 3.3 deployment are greatly reduced due to the automatic management of log files by the SAS Operations Infrastructure, which archives log files every day.

With a default installation of SAS Viya 3.3, log files older than three days are automatically zipped, the original files are deleted, and the zip archives are stored in /opt/sas/viya/config/var/log. This process is managed by the sas-ops-agent process on each machine in your deployment. According to SAS R&D, this results in up to a 95% compression rate on the overall log file requirements.

The task that archives log files is managed by the sas-ops-agent on each machine.  Running ./sas-ops tasks shows that the LogFileArchive process runs daily at 0400:

 {
 "version": 1,
 "taskName": "LogfileArchive",
 "description": "Archive daily",
 "hostType": "linux",
 "runType": "time",
 "frequency": "0000-01-01T04:00:00-05:00",
 "maxRunTime": "2h0m0s",
 "timeOutAction": "restart",
 "errorAction": "cancel",
 "command": "sas-archive",
 "commandType": "sas",
 "publisherType": "none"
 },

While the logs are zipped to reduce size, the zip files are stored locally so additional maintenance may be required to move the zipped files to another location. For example, my test system illustrates that the zip files are retained once created.

[sas@intviya01 log]$ ll
total 928
drwxr-xr-x. 3 sas sas 20 Jan 24 14:19 alert-track
drwxrwxr-x. 3 sas sas 20 Jan 24 16:13 all-services
drwxr-xr-x. 3 sas sas 20 Jan 24 14:23 appregistry
...
drwxr-xr-x. 4 sas sas 31 Jan 24 14:20 evmsvrops
drwxr-xr-x. 3 sas sas 20 Jan 24 14:27 home
drwxr-xr-x. 3 root sas 20 Jan 24 14:18 httpproxy
drwxr-xr-x. 3 sas sas 20 Jan 24 14:35 importvaspk
-rw-r--r--. 1 sas sas 22 Jan 26 04:00 log-20180123090001Z.zip
-rw-r--r--. 1 sas sas 22 Jan 27 04:00 log-20180124090000Z.zip
-rw-r--r--. 1 sas sas 22 Jan 28 04:00 log-20180125090000Z.zip
-rw-r--r--. 1 sas sas 10036 Jan 29 04:00 log-20180126090000Z.zip
-rw-r--r--. 1 sas sas 366303 Mar 6 04:00 log-20180303090000Z.zip
-rw-r--r--. 1 sas sas 432464 Apr 3 04:00 log-20180331080000Z.zip
-rw-r--r--. 1 sas sas 22 Apr 4 04:00 log-20180401080000Z.zip
-rw-r--r--. 1 sas sas 22 Apr 5 04:00 log-20180402080000Z.zip
-rw-r--r--. 1 sas sas 15333 Apr 6 04:00 log-20180403080000Z.zip
-rw-r--r--. 1 sas sas 21173 Apr 7 04:00 log-20180404080000Z.zip
-rw-r--r--. 1 sas sas 22191 Apr 8 04:00 log-20180405080000Z.zip
-rw-r--r--. 1 sas sas 21185 Apr 9 04:00 log-20180406080000Z.zip
-rw-r--r--. 1 sas sas 21405 Apr 10 04:00 log-20180407080000Z.zip
drwxr-xr-x. 3 sas sas 20 Jan 24 14:33 monitoring

If three days is too short of a time to retain logs, you can adjust the default timeframe for archiving logs by modifying the default task list for the sas-ops-agent on each machine.

Edit the tasks.json file to suit your needs and then issue a command to modify the task template for sas-ops-agent processes. For example, this will modify the log archive process to retain seven days of information:

...  
{
 "version": 1,
 "taskName": "LogfileArchive",
 "description": "Archive daily",
 "hostType": "linux",
 "runType": "time",
 "frequency": "0000-01-01T04:00:00-05:00",
 "maxRunTime": "2h0m0s",
 "timeOutAction": "restart",
 "errorAction": "cancel",
 "command": "sas-archive -age 7",
 "commandType": "sas",
 "publisherType": "none"
 },
...
 
$ ./sas-ops-agent import -tasks tasks.json

Restart the sas-ops-agent processes on each of your machines and you will be good to go.

I hope you found this post helpful.

Additional Resources for Administrators

SAS Administrators Home Page
How-to Videos for Administrators
SAS Administration Community
SAS Administrators Blogs
SAS Administrator Training
SAS Administrator Certification

Log Management in SAS Viya 3.3 was published on SAS Users.

31
May

Using Dynamic Text in a SAS Visual Analytics Report

You will not find an object in SAS Visual Analytics named Dynamic Text. Instead, you will find a Text object that allows you to insert dynamically driven data items. By using the Text object’s dynamic capabilities you can build custom report titles, object titles, emphasize measures and even supply the last modified time of the data source in your SAS Visual Analytics report. In this post, I will outline the ways you can leverage the Text object’s dynamic capabilities.

In this example report below, I have used a red font color to indicate the dynamically driven text.

Let’s take a look at the available dynamic roles in the Text object. You can see from the Objects pane that the Text object is grouped under Other.

From the Data pane we have the ability to add both Measure and Parameter data items. From the interactive editor of the Text object shown below, we also have the ability to insert the Table Modified Time and Interactive Filters.

The following sections will demonstrate how to configure each of these dynamically driven elements of the Text object.

Interactive Filters

The out of the box display for Interactive Filters includes the selected values for control objects added to either the Report or Page Prompt areas.

To edit, be sure you are in Edit mode of Explore and Visualize. Click on the Text object to make it the active window and double-click inside; the interactive editor will open. Next, click on the Interactive Filters button. Use your cursor to position where you would like to add static text. In this case, I added the qualifier Default filter information:.

Multiple control object values are separated by commas, and multi-value control objects are also accommodated.

Parameters

While the Interactive Filter functionality is extremely useful, you may want to use prompt values more granularly to create custom report titles or even object titles. To do this, you must first create a parameter to hold the value selected in the control object, then use that parameter in the Text object.

In my example report, I have two prompts and two custom object titles leveraging parameters. Let’s look at each one individually.

First is the Report Prompt, which prompts for year.

1.     Create your prompt by using the Control object of your choice and assigning the desired data role.
2.     Create a parameter that corresponds to the data type and assign it to the Control object’s Parameter Role.
3.     For the Text object, assign the same parameter to the Text object’s Parameter Role.
4.     Double click on the Text object, use your cursor to add static text as you like.

The steps are similar for the Page Prompt, which prompts for region.

1.     Create your prompt by using the Control object of your choice and assigning the desired data role.
2.     Create a parameter that corresponds to the data type and assign it to the Control object’s Parameter Role.
3.     For the Text object, assign the same parameter to the Text object’s Parameter Role.
4.     Double click on the Text object, use your cursor to add static text as you like.

Even though I demonstrate how to do this for both Report and Page Prompts, this same technique can be used for report canvas prompts. You just have to be sure you store the selected value(s) in a parameter that you can then use in the Text object’s Parameter Role.

Measures

In much the same way the Text object’s Roles are used to assign parameter values, we can do the same thing with a measure. This measure will automatically be affected by any Report or Page Prompts, but if you want to use a report canvas prompt, you will need to define the appropriate Actions on the Text object.

Here you can see we are using the measure TotalExpense which is an aggregated measure of Expenses. Like in the previous examples, be sure to assign the measure to the Text object then double click to open the editor and use your cursor to add the static text.

The only applied filters for this aggregated measure are the selected year and region; therefore, this Sum _ByGroup_ will return the Total Expenses for that Year and Region.

Table Modified Time

The last capability of dynamic text available in the Text object is the Table Modified Time.

The out of the box display uses the fully qualified datetime stamp and cannot be altered to a different format. To edit, double click inside the Text object and the editor will open. Then click on the Table Modified Time button. Next, use your cursor to position where you would like to add static text. In this case, I added the qualifier Data last updated:.

Conclusion

There are two main takeaways from this blog post. First, you can easily build dynamic, customizable titles and emphasize measures or parameter values.

Second, look to use the Text object for your dynamic text needs.

Here is a quick mapping as a review of what was detailed in the steps above.

 

Using Dynamic Text in a SAS Visual Analytics Report was published on SAS Users.

19
Apr

SAS Visual Analytics filters on periodic calculations: Apply them or ignore them!

In SAS Visual Analytics 7.4 on 9.4M5 and SAS Visual Analytics 8.2 on SAS Viya, the periodic operators have a new additional parameter that controls how filtering on the date data item used in the calculation affects the calculated values.

The new parameter values are _ApplyAllFilters_ (the default), _IgnoreAllTimeFrameFilters_, and _IgnoreInteractiveTimeFrameFilters_.

These parameter values enable you to improve the appearance of reports based on calculations that use periodic operators. You can have periods that produce missing values for periodic calculations removed from the report, but still available for use in the calculations for later periods. These parameter settings also enable you to provide users with a prompt for choosing the data to display in a report, without having any effect on the calculations themselves.

The following will illustrate the points above, using periodic Revenue calculations based on monthly data from the MEGA_CORP table. New aggregated measures representing Previous Month Revenue (RelativePeriod) and Same Month Last Year (ParallelPeriod) will be displayed as measures in a crosstab. The default _ApplyAllFilters_ is in effect for both, as shown below, but there are no current filters on report or objects.

The Change from Previous Month and Change From Same Month Last Year calculations, respectively, are below:

The resulting report is a crosstab with Date by Month and Product Line in the Row roles, and Revenue, along with the 4 aggregations, in the Column roles.  All calculations are accurate, but of course, the calculations result in missing values for the first month (Jan2009) and for the first year (2009).

An improvement to the appearance of the report might be to only show Date by Month values beginning with Jan2010, where there are no missing values.  Why not apply a filter to the crosstab (shown below), so that the interval shown is Jan2010 to the most recent date?

With the above filter applied to the crosstab, the result is shown below—same problem, different year!

This is where the new parameter on our periodic operators is useful. We would like to have all months used in the calculations, but only the months with non-missing values for both of the periodic calculations shown in the crosstab. So, edit both periodic calculations to change the default _ApplyAllFilters_ to _IgnoreAllTimeFrameFilters_, so that the filters will filter the data in the crosstab, but not for the calculations. When the report is refreshed, only the months with non-missing values are shown:

This periodic operator parameter is also useful if you want to enable users to select a specific month, for viewing only a subset of the crosstab results.

For a selection prompt, add a Drop-Down list to select a MONYY value and define a filter action from the Drop-Down list to the Crosstab. To prevent selection of a month value with missing calculation values, you will also want to apply a filter to the Drop-Down list as you did for the crosstab, displaying months Jan2010 and after in the list.

Now the user can select a month, with all calculations relative to that month displayed, shown in the examples below:

Note that, at this point, since you’ve added the action from the drop-down list to the crosstab, you actually no longer need the filter on the crosstab itself.  In addition, if you remove the crosstab filter, then all of your filters will now be from prompts or actions, so you could use the _IgnoreInteractiveTimeFrameFilters_ parameter on your periodic calculations instead of the _IgnoreAllTimeFrameFilters_ parameter.

You will also notice that, in release 8.2 of SAS Visual Analytics, the performance of the periodic calculations has been greatly improved, with more of the work done in CAS.

Be sure to check out all of the periodic operators, documented for both SAS Visual Analytics 7.4 and SAS Visual Analytics 8.2.

SAS Visual Analytics filters on periodic calculations: Apply them or ignore them! was published on SAS Users.

13
Apr

An introduction to SAS Visual Forecasting 8.2

SAS Visual Forecasting 8.2 effectively models and forecasts time series at scale. It is built on SAS Viya and powered by SAS Cloud Analytic Services (CAS).

In this blog post, I will build a Visual Forecasting (VF) Pipeline, which is a process flow diagram whose nodes represent tasks in the VF process.  The objective is to show how to perform the full analytics life cycle with large volumes of data: from accessing data and assigning variable roles accurately, to building forecasting models, to selecting a champion model and overriding the system-generated forecast. In this blog post I will use 1,337 time series related to a chemical company and will illustrate the main steps you would use for your own applications and datasets.

In future posts, I will work in the Programming application, a collection of SAS procedures and CAS actions for direct coding or access through tasks in SAS Studio, and will develop and assess VF models via Python code.

In a VF pipeline, teams can easily save forecast components to the Toolbox to later share in this collaborative environment.

Forecasting Node in Visual Analytics

This section briefly describes what is available in SAS Visual Analytics; the rest of the blog discusses SAS Visual Forecasting 8.2.

In SAS Visual Analytics the Forecast object will select the model that best fits the data out of these models: ARIMA, Damped trend exponential smoothing, Linear exponential smoothing, Seasonal exponential smoothing, Simple exponential smoothing, Winters method (additive), or Winters method (multiplicative). Currently there are no diagnostic statistics (MAPE, RMSE) for the model selected.

You can do “what-if analysis” using the Scenario Analysis and Goal Seeking functionalities. Scenario Analysis enables you to forecast hypothetical scenarios by specifying the future values for one or more underlying factors that contribute to the forecast. For example, if you forecast the profit of a company, and material cost is an underlying factor, then you might use scenario analysis to determine how the forecasted profit would change if the material cost increased by 10%. Goal Seeking enables you to specify a target value for your forecast measure, and then determine the values of underlying factors that would be required to achieve the target value. For example, if you forecast the profit of a company, and material cost is an underlying factor, then you might use Goal Seeking to determine what value for material cost would be required to achieve a 10% increase in profit.

Another neat feature in SAS Visual Analytics is that one can apply different filters to the final forecast. Filters are underlying factors or different levels of the hierarchy, and the resulting plot incorporates those filters.

Data Requirements for Forecasting

There are specific data requirements when working in a forecasting project. You need a time series dataset that contains at least two variables: 1) the variable that you want to forecast, known as the target of your analysis (for example, Revenue), and 2) a time ID variable that contains the time stamps of the target variable. The intervals of this variable are regularly spaced. Your time series table can contain other time-varying variables. When your time series table contains more than one individual series, you might have classification variables, as shown in the photo below. Distribution Center and Product Key are classification variables. Optionally, you can designate numerical variables (for example, Discount) as independent variables in your models.

You also have the option of adding a table of attributes to the time series table. Attributes are categorical variables that define qualities of the time series. Attribute variables are similar to BY variables, but are not used to identify the series that you want to forecast. In this post, the data I am using includes Distribution Center, Supplier Name, Product Type, Venue and Product Category. Notice that the attributes are time invariant, and that the attribute table is much smaller than the time series table.

The two data sets (SkinProduct and SkinProductAttributes) used in this blog contain 1,337 time series related to a chemical company. This picture shows a few rows of the two data sets used in this post, note that DATE intervals are regularly spaced weeks. The SkinProduct dataset is referred as the Time Series table in the SkinProductAttributes dataset as the Attribute Data.


Developing Models in SAS Visual Forecasting 8.2

Step One: Create a Forecasting Project and Assign Variables

From the SAS Home menu, select the action Build Models, which takes you to SAS Model Studio. There, select New Project and enter 1) the name of your project, 2) the type of project (make sure you select “Forecasting”), and 3) the data source.


In the Data tab, assign the variable roles by using the icon in the upper right corner.

The roles assigned in this post are typical of role assignments in forecasting projects. Notice these variables are in the Time Series table. Also, notice that the classification variables are ordered from highest to lowest hierarchy:

Time Variable: Date
Dependent Variable: Revenue
Classification Variables: Distribution Center and Product Key

In the Time Series table, you might have additional variables you’d like to assign the role “independent” that should be considered for model generation. Independent variables are the explanatory, input, predictor, or causal variables that can be used to model and forecast the dependent variable. In this post, the variable “Discount” is assigned the role “independent”. To do this assignment: right click on the variable, and select Edit Variables.

To bring in the second dataset with the Attribute variables, follow the steps in this photo:

 

Step two: Automated Modeling with Pipelines

The objective in this step is to select a champion model.  Working in the Pipelines tab, one explores the time series plots, uses the code editor to modify the default model, adds a 2nd model to the pipeline, compares the models and selects a champion model.

After step one is completed, select the Pipelines tab.

The first node in the VF pipelines is the Data node. After right-clicking and running this node, one can see the time series by selecting Explore Time Series. Notice that one can filter by the attribute variables, and that the table shows the exact historical data values.

Auto-forecasting is the next node in the default pipeline. Remember that we are modeling 1,337 time series. For each time series, the Auto-forecasting node automatically diagnoses the statistical characteristics of the time series, generates a list of appropriate time series models, automatically selects the model, and generates forecasts. This node evaluates and selects for each time series the ARIMAX and exponential smoothing models.

One can customize and modify the forecasting models (except the Hierarchical Forecasting model) by editing the model’s code. For example, to add the class of models for intermittent demand to the auto-forecasting node, one could open the code editor for that node and replace these lines:

rc = diagspec.setESM();
rc = diagspec.setARIMAX();

with:

rc = diagspec.setIDM();

To open the code editor, see the photo below. After making changes, save the code and close the editor.

At this point, you can run the Auto Forecasting node, and after looking at its results, save it to the toolbox, so the editing changes are saved and later reused or shared with team members.

By expanding the Nodes pane and the Forecasting Modeling pane on the left, you can select from several models and add a 2nd modeling node to the pipeline.

The next photo shows a pipeline with the Naïve Forecast as the second model. It was added to the pipeline by dropping its node into the parent node (data node). This is the resulting pipeline:

After running the Model Comparison node, compare the WMAE (Weighted Mean Absolute Error) and WMAPE (Weighted Mean Absolute Percent Error) and select a champion model.

You can build several pipelines using different model strategies. In order to select a champion model from all the models developed in the pipelines one uses the Pipeline Comparison tab.

Before you work on any overrides for your forecasting project, you need to make sure that you are working with the best pipeline and modeling node for your data. SAS Visual Forecasting selects the best fit model in each pipeline. After each pipeline is run, the champion pipeline is selected based on the statistics of fit that you chose for the selection criteria. If necessary, you can change the selected champion pipeline.

Step Three: Overrides

The Overrides tab is used to manually adjust the forecasts in the future. For example, if you want to account for some promotions that your company and its competitors are running and that are not captured by the models.

The Overrides module allows users to select subsets of time series at the aggregate level by selecting attribute values in the attribute table that you defined in the Data tab. The filters based on the attributes are highly customizable and do not restrict you to the hierarchy that was used for the modeling. The selection of a filter using attributes is often referred to as faceted search. Whenever you create a new filter based on a selection of attribute values (also known as facets), the aggregate for all series that match the facets is displayed on your main panel.

There is a wealth of information in the Overrides overview: 1) a list of the BY variables, as well as attribute variables, available to use as filters.

2) a Plot of the Time Series Aggregation and Overrides displaying historical and forecast data, and

3) a Forecast and Overrides table, which can be used to create, edit, and submit override values for a time series based on external factors that are not included in the forecast models.

Conclusion

Using SAS Visual Forecasting 8.2 you can effectively model and forecast time series in large scale. The Visual Forecasting Pipeline greatly facilitates the automatic forecasting of large volumes of data, and provides a structured and robust method with efficient and flexible processes.


An introduction to SAS Visual Forecasting 8.2 was published on SAS Users.

12
Apr

An overview of SAS Data Quality 3.3 programming capabilities

The release of SAS Viya 3.3 has brought some nice data quality features. In addition to the visual applications like Data Studio or Data Explorer that are part of the Data Preparation offering, one can leverage data quality capabilities from a programming perspective.

For the time being, SAS Viya provides two ways to programmatically perform data quality processing on CAS data:

  • The Data Step Data Quality functions.
  • The profile CAS action.

To use Data Quality programming capabilities in CAS, a Data Quality license is required (or a Data Preparation license which includes Data Quality).

Data Step Data Quality functions

The list of the Data Quality functions currently supported in CAS is shown below:


They cover casing, parsing, field extraction, gender analysis, identification analysis, match codes, and standardization capabilities.

For now, they are only available in the CAS Data Step. You can’t use them in DS2 or in FedSQL.

To run in CAS certain conditions must be met. These include:

  • Both the input and output data must be CAS tables.
  • All language elements must be supported in the CAS Data Step.
  • Others.

Let’s look at an example:

cas mysession sessopts=(caslib="casuser") ;
 
libname casuser cas caslib="casuser" ;
 
data casuser.baseball2 ;
   length gender $1 mcName parsedValue tokenNames lastName firstName varchar(100) ;
   set casuser.baseball ;
   gender=dqGender(name,'NAME','ENUSA') ;
   mcName=dqMatch(name,'NAME',95,'ENUSA') ;   
   parsedValue=dqParse(name,'NAME','ENUSA') ;
   tokenNames=dqParseInfoGet('NAME','ENUSA') ;
   if _n_=1 then put tokenNames= ;
   lastName=dqParseTokenGet(parsedValue,'Family Name','NAME','ENUSA') ;
   firstName=dqParseTokenGet(parsedValue,'Given Name','NAME','ENUSA') ;
run ;

Here, my input and output tables are CAS tables, and I’m using CAS-enabled statements and functions. So, this will run in CAS, in multiple threads, in massively parallel mode across all my CAS workers on in-memory data. You can confirm this by looking for the following message in the log:

NOTE: Running DATA step in Cloud Analytic Services.
NOTE: The DATA step will run in multiple threads.

I’m doing simple data quality processing here:

  • Determine the gender of an individual based on his or her name, with the dqGender function.
  • Create a match code for the name for a later deduplication, with the dqMatch function.
  • Parse the name using the dqParse function.
  • Identify the name of the tokens produced by the parsing function, with the dqParseInfoGet function.
  • Write the token names in the log; the tokens for this definition are:
    Prefix, Given Name, Middle Name, Family Name, Suffix, Title/Additional Info
  • Extract the “Family Name” token from the parsed value, using dqParseTokenGet.
  • Extract the “Given Name” token from the parsed value, again using dqParseTokenGet.

I get the following table as a result:

Performing this kind of data quality processing on huge tables in memory and in parallel is simply awesome!

The dataDiscovery.profile CAS action

This CAS action enables you to profile a CAS table:

  • It offers 2 algorithms, one is faster but uses more memory.
  • It offers multiple options to control your profiling job:
    • Columns to be profiled.
    • Number of distinct values to be profiled (high-cardinality columns).
    • Number of distinct values/outliers to report.
  • It provides identity analysis using RegEx expressions.
  • It outputs the results to another CAS table.

The resulting table is a transposed table of all the metrics for all the columns. This table requires some post-processing to be analyzed properly.

Example:

proc cas; 
   dataDiscovery.profile /
      algorithm="PRIMARY"
      table={caslib="casuser" name="product_dim"}
      columns={"ProductBrand","ProductLine","Product","ProductDescription","ProductQuality"}
      cutoff=20
      frequencies=10
      outliers=5
      casOut={caslib="casuser" name="product_dim_profiled" replace=true}
   ;
quit ;

In this example, you can see:

  • How to specify the profiling algorithm (quite simple: PRIMARY=best performance, SECONDARY=less memory).
  • How to specify the input table and the columns you want to profile.
  • How to reduce the number of distinct values to process using the cutoff option (it prevents excessive memory use for high-cardinality columns, but might show incomplete results).
  • How to reduce the number of distinct values reported using the frequencies option.
  • How to specify where to store the results (casout).

So, the result is not a report but a table.

Because the table is transposed, the RowId column needs to be matched back to the corresponding column of the profiled input table in order to interpret each metric.

A few comments/cautions on this results table:

  • DoubleValue, DecSextValue, or IntegerValue fields can appear on the output table if numeric fields have been profiled.
  • DecSextValue can contain the mean (metric #1008), median (#1009), standard deviation (#1022) and standard error (#1023) if a numeric column was profiled.
  • It can also contain frequency distributions, maximum, minimum, and mode if the source column is of DecSext data type which is not possible yet.
  • DecSext is a 192-bit fixed-decimal data type that is not supported yet in CAS, and consequently is converted into a double most of the time. Also, SAS Studio cannot render correctly new CAS data types. As of today, those metrics might not be very reliable.
  • Also, some percentage calculations might be rounded due to the use of integers in the Count field.
  • The legend for metric 1001 is not documented. Here it is:

1: CHAR
2: VARCHAR
3: DATE
4: DATETIME
5: DECQUAD
6: DECSEXT
7: DOUBLE
8: INT32
9: INT64
10: TIME
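
Because the profile output is transposed, a small post-processing step can make it easier to review. Here is a minimal sketch, assuming the output table contains the Metric column described above (exact column names may vary by release):

/* A sketch only: the Metric column name is an assumption based on the description above */
data casuser.profile_stats ;
   set casuser.product_dim_profiled ;
   where Metric in (1008, 1009, 1022, 1023) ; /* mean, median, standard deviation, standard error */
run ;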

A last word on the profile CAS action. It can help you to perform some identity analysis using patterns defined as RegEx expressions (this does not use the QKB).

Here is an example:

proc cas; 
   dataDiscovery.profile /
      table={caslib="casuser" name="customers"}
      identities={
         {pattern="PAT=\(?999[\)-]? ?999[- ]9999",type="USPHONE"}, 
         {pattern= "PAT=^99999[- ]9999$",type="ZIP4"}, 
         {pattern= "PAT=^99999$",type="ZIP"}, 
         {pattern= "[^ @]+@[^ @]+.[A-Z]{2,4}",type="EMAIL"}, 
         {pattern= "^(?i:A[LKZR]|C[AOT]|DE|FL|GA|HI|I[ADLN]|K[SY]|LA|M[ADEINOST]|N[CDEHJMVY]|O[HKR]|PA|RI|S[CD]|T[NX]|UT|V[AT]|W[AIVY])$",type="STATE"}
      }
      casOut={caslib="casuser" name="customers_profiled" replace="true"}
   ;
quit ;

In this example, the patterns identify US phone numbers, ZIP+4 and ZIP codes, email addresses, and US state abbreviations.

I hope this post has been helpful.

Thanks for reading.

An overview of SAS Data Quality 3.3 programming capabilities was published on SAS Users.
