16
Jan

SAS Viya 3.3 authentication options

In this article, I want to give you an overview of the authentication options available with SAS Viya 3.3. SAS Viya 3.3, released in the second week of December 2017 and the second release built on the new microservices architecture, presents more options for authentication than previous releases. In future posts, we will delve into more detail on selected options.

Types of Deployment

Before we look at the options for authentication we need to define some terms to help us describe the type of environment. The first of these is the type of deployment. With SAS Viya 3.3 we can have two different types of deployment:

  1. Full Deployment
  2. Programming Only

As the name suggests, the full deployment is a deployment of all the different components that make up the ordered SAS Viya 3.3 product or solution. This includes the SAS Viya runtime engine, CAS (Cloud Analytic Services), the microservices, stateful services, and foundation components used by SAS® Studio.

The programming only deployment more closely resembles the deployment we saw in an earlier release; so, this includes CAS and all the parts for SAS Studio to function. A programming only deployment does not include the microservices and stateful services. The only interaction with CAS is via SAS Studio and the code end-users run within this.

Types of Interfaces

Following on from the type of deployment, we can classify the end-user interfaces used to access SAS Viya 3.3. An interface is either a visual interface or a programming interface. The visual interfaces are all the SAS Viya 3.3 web applications, excluding SAS Studio. By programming interface we mean SAS Studio; and when we say a programming interface accesses CAS, we could equally mean the Python, Lua, R or Java interfaces.

Similarly, as of the fifth maintenance release of SAS 9.4 we can interact directly with CAS. Previously, this interaction was based around the use of SAS/CONNECT® and remote submitting code to the SAS Viya programming interface. With SAS 9.4 M5, we can now directly connect to CAS from the SAS foundation. So, a third type of interface for us to consider is the SAS 9.4 M5 client.

Visual Interfaces Authentication

As we know, with SAS Viya 3.3 the end-user authenticates to the visual interfaces via the SAS® Logon Manager. The SAS Logon Manager is accessed via the HTTP proxy. The following picture summarizes the options for authenticating to the SAS Logon Manager in SAS Viya 3.3.

[Diagram: authentication options for SAS Logon Manager in SAS Viya 3.3]

The first thing to point out and something to always remember is the following:

The identities microservice must always connect to an LDAP provider to obtain user and group information.

This LDAP provider could be Microsoft Active Directory or any other LDAP provider such as OpenLDAP.

So, what are our options for authenticating the users accessing SAS Logon Manager? We have five options with SAS Viya 3.3:

  1. LDAP Provider (the default option)
  2. Kerberos or Integrated Windows Authentication
  3. OAuth/OpenID Connect
  4. SAML
  5. Multi-factor Authentication (new with SAS Viya 3.3)

Option 1, the LDAP provider, is the default authentication mechanism enabled out-of-the-box for SAS Viya 3.3. The same connection details used by the identities microservice are used by SAS Logon Manager to authenticate the credentials the end-user enters in the logon form. From a security perspective, we need to be concerned about what network connections these end-user credentials will be sent over. First, we have the network connection between the browser and the HTTP proxy, which is secured by default with HTTPS in SAS Viya 3.3. Then we have the network connection between SAS Logon Manager and the LDAP provider; here we can support LDAPS to encapsulate the LDAP connection in standard TLS encryption.

Option 2, as shown in the diagram, is to configure SAS Logon Manager for Kerberos authentication. This provides the end-user with Single Sign-On from their desktop where the browser is running. This is sometimes referred to as Integrated Windows Authentication (IWA). This will enable the end-user to access the SAS Viya 3.3 visual interfaces without being prompted to enter any credentials. However, it is important to remember that the identities microservice will still be connecting to the LDAP provider. The Kerberos authentication option completely replaces the option to use the default LDAP provider for the SAS Logon Manager. Introduced with SAS Viya 3.3 is the option to delegate the credentials from SAS Logon Manager through to CAS; more on this option below.

Option 3 enables the SAS Logon Manager to be integrated with an alternative OAuth/OpenID Connect provider. This provider could be something internal to the customer’s wider environment or external to the customer, such as Google Auth or Facebook. When the OAuth/OpenID Connect option is configured, it does not completely replace the default LDAP provider. Instead, when the end-user accesses the SAS Logon Manager they are presented with both a link to authenticate using OAuth/OpenID Connect and the standard login form using the LDAP provider. The end-user can then select which to use. This option can provide single sign-on from the OAuth/OpenID Connect provider; for example, sign into your Google account and access the SAS Viya 3.3 visual interfaces without further prompting for credentials. Custom code can be added to the SAS Logon Manager login form that automatically links to the external OAuth/OpenID Connect provider. This makes the single sign-on more seamless, since there is no need to select the link.

Option 4 supports configuring the SAS Logon Manager to be integrated with an external SAML Identity Provider. This SAML Identity Provider could be internal or external to the customer’s wider environment. If it is internal, it could be something like Oracle Access Manager or Active Directory Federation Services, whilst if it is external it could be something like salesforce.com. Again, like option 3, the use of SAML does not completely replace the default LDAP provider. End-users accessing the SAS Logon Manager will be able to choose SAML authentication or the default LDAP provider. This option also provides single sign-on with the third-party SAML provider. Custom code can be added to the SAS Logon Manager login form that automatically links to the external SAML provider, making the single sign-on more seamless, since there is no need to select the link.

Option 5 supports the use of multi-factor authentication with SAS Logon Manager. This is a new option with SAS Viya 3.3 and requires the configuration of a third-party Pluggable Authentication Module (PAM). The PAM module is the part of the system that integrates with the multi-factor authentication provider, such as Symantec’s VIP. The PAM module authenticates the end-user by causing the third party to push an out-of-band validation request to the end-user. Normally, this would be a push message to a smart phone application; approving the request forms the additional factor in the authentication of the end-user. When an end-user enters their username and password in the SAS Logon Manager form, the credentials are checked against the PAM provider. This means this option replaces the LDAP provider, just as with Kerberos.

For all five options listed above, the connection to CAS is performed using internal OAuth tokens generated by the SAS Logon Manager. In most cases the actual session started by the CAS Controller will now run on the operating system as the same user who launched the CAS operating system service. This account defaults to the name cas.

The exception to this is Option 2: Kerberos with delegation. In this case, while an OAuth token is generated and initially used to connect to CAS, a second authentication takes place with the delegated Kerberos credentials. This means that the CAS session is started as the end-user and not as the user who launched the CAS operating system service.

Programming Interfaces Authentication

Now that we’ve looked at the visual interfaces for SAS Viya 3.3, what about the programming interfaces, or SAS Studio? Unlike SAS 9.4, SAS Studio with SAS Viya 3.3 is not integrated with the SAS Logon Manager. The following diagram illustrates the case with SAS Studio.

[Diagram: authentication flow for SAS Studio and the programming interfaces in SAS Viya 3.3]

SAS Studio in the full deployment is integrated with the HTTP Proxy, so with SAS Viya 3.3 end-users do not directly connect to the SAS Studio web application. However, the username and password entered into SAS Studio are not passed to the SAS Logon Manager to authenticate. Instead the SAS® Object Spawner uses the PAM configuration on the host to validate the username and password. This could be a local account on the host or, depending on the PAM configuration, an account in an LDAP Provider. This authentication is sufficient to start the SAS® Workspace Server where the code entered in SAS Studio will be run.

When the SAS Workspace Server connects to CAS it uses the username and password that were used to start the SAS Workspace Server. The CAS Controller uses its own PAM configuration to validate the end-user’s credentials and launch the session process running as the end-user.

Since CAS is integrated into the visual components, and the username and password are passed from the SAS Workspace Server, the CAS Controller uses them to obtain an internal OAuth token from the SAS Logon Manager. This means that the username and password must be valid in the provider configured for the SAS Logon Manager otherwise CAS will not be able to obtain an OAuth token and the session launch will fail.

Therefore, in such a deployment it makes sense for all three components:

  1. PAM for SAS Studio (sasauth*)
  2. PAM for CAS (cas)
  3. SAS Logon Manager

to use the same LDAP provider. If these three components are not sending the username and password entered in SAS Studio to the same place, we are likely to see errors when trying to connect.

Programming Only Deployment

For a programming only deployment, we have SAS Studio and CAS but we don’t have any microservices or stateful services. Here, all authentication is via the PAM configuration for SAS Studio and CAS. Since CAS knows there are no microservices, it does not attempt to obtain an internal OAuth token from the SAS Logon Manager. This is the same type of setup we had with SAS Viya 3.1.

SAS 9.4 Maintenance 5 Integration

There are three main ways in which SAS 9.4 Maintenance 5 can integrate with CAS. First, if the SAS 9.4 M5 session has access to a Kerberos credential for the end-user, then Kerberos can be used for the authentication. For example, if Kerberos is used by the end-user to access the SAS 9.4 M5 client, such as a web application or SAS Enterprise Guide, the authentication can be delegated all the way through to CAS. Kerberos will then be used to authenticate to SAS Viya Logon Manager and obtain an OAuth token.

Second, if the SAS 9.4 M5 session has access to the end-user’s username and password (from the cached credentials used to launch the session, from an authinfo file, or from SAS 9.4 Metadata), then these credentials can be used to authenticate to CAS. The username and password will be used to launch the CAS session and obtain an OAuth token from SAS Viya Logon Manager. This is like the programming approach we detailed above.

Finally, for SAS 9.4 Maintenance 5 sessions which are not running as the end-user, we also have a solution. These sessions could be SAS® Stored Process or Pooled Workspace Server sessions, or even a SAS token launched workspace server. For these sessions, we can leverage the SAS® 9.4 Metadata Server to generate a one-time password. This is the same way in which the SAS Stored Process itself is accessed. To be able to leverage the one-time password with CAS, additional configuration is required in SAS Viya Logon Manager: it must be configured with the URL of the SAS® 9.4 Web Infrastructure Platform, whose services will be used to validate the one-time password. All this means that CAS can be accessed from a SAS 9.4 Metadata aware connection where end-user operating system credentials are not available.

Conclusion

I hope that this overview has provided some context to the different types of authentication happening within and to a SAS Viya 3.3 deployment. Understanding the types of authentication available will be important for helping customers to select the best options for them. In future blog posts, we’ll look at the different new options in more detail.

SAS Viya 3.3 authentication options was published on SAS Users.

12
Jan

Interested in being the SAS Global Forum 2021 Conference Chair? Apply Now!

Each year the SAS Global Users Group Executive Board (SGUGEB) solicits applications for the SAS Global Forum Conference Chair for the conference three years from now. Individuals are identified, applications are requested, submitted applications are reviewed, candidates are interviewed, and finally a choice is made.

We are asking for interested individuals to submit their application for SAS Global Forum 2021 Conference Chair. Yep, 2021! The SGUGEB wants to ensure that each conference chair has time to learn, gather ideas, generate ideas, learn from their predecessors and determine the focus for their conference.

Three years?

Is three years really necessary? Yep! The first year you will be working with the current conference team and begin to understand all the ins and outs of planning the content, organizing the content, and delivering the content. You will play a key role on the conference team, either on the Content Advisory Team or on the Content Delivery Team. This will help you in understanding the various roles and responsibilities of each team.   In the second year, you will again play a key role on the conference team and will utilize the experience gained from the previous year to begin developing and determining your content focus, identify potential new initiatives, and begin to build your team. The third year is all about your conference and the implementation of the focus and initiatives you identified… all with the aid of your team of course.

Who are we looking for?

Good candidates should be active SAS users, authors, administrators, managers, and/or practitioners. Individuals should be active in the SAS community and in other professional conferences and organizations as well. Good presentation and collaboration skills are a must. Also, candidates should have a vision for how they want to shape their conference to benefit the SAS community. As SASGF or regional conference attendees, we have all benefitted from the content and education we received. Those who have been a conference chair will tell you that it is an honor and a privilege to be able to shape the educational content delivered to our SAS community.

My experience

As conference chair for SASGF 2016, I can tell you it was one of the most rewarding professional and personal experiences I have had. I was given the opportunity to work with a lot of intelligent and talented individuals who, like me, wanted to ensure that current and future SAS users have a place to learn and grow professionally. With over 5,000 attendees and Livestream content available to millions, my institution had increased visibility, I developed additional leadership skills (by chairing such a large international conference), and I got to know and spend time with some exceptional SAS users, SAS leaders and executives. The experience was worth all the time and effort I expended.

Ready to Apply

So, are you interested? If so, we invite you to peruse information about Conference Leadership and SAS Global Forum Conference Chair roles and responsibilities, as well as the many different volunteer opportunities that exist before, during and after SAS Global Forum, and then make an informed decision about whether to apply for conference chair.

I would encourage anyone interested in applying to submit an application. Information on how to apply is available here. As well, share this information with anyone you feel would make a great conference chair and remember that the application deadline is February 18, 2018.

Interested in being the SAS Global Forum 2021 Conference Chair? Apply Now! was published on SAS Users.

10
Jan

Come on in, we're open: The openness of SAS® 9.4

The SAS® platform is now open to be accessed from open-source clients such as Python, Lua, Java, the R language, and REST APIs to leverage the capabilities of SAS® Viya® products and solutions. You can analyze your data in a cloud-enabled environment that handles large amounts of data in a variety of different formats. To find out more about SAS Viya, see the “SAS Viya: What's in it for me? The user.” article.

This blog post focuses on the openness of SAS® 9.4 and discusses features such as the SASPy package, the SAS kernel for Jupyter Notebook, and more, which act as clients to SAS. Note: This blog post is relevant for all maintenance releases of SAS 9.4.

SASPy

The SASPy package enables you to connect to SAS 9.4 and run your analysis from Python using object-oriented methods and objects as well as Python magic methods. SASPy translates these objects and methods into SAS code before executing it. To use SASPy, you must have SAS 9.4 and Python 3.x or later.
Note: SASPy is an open-source project that encourages your contributions.

After you have completed the installation and configuration of SASPy, you can import the SASPy package as demonstrated below:
Note: I used Jupyter Notebook to run the examples in this blog post.

1.   Import the SASPy package.

2.   Start a new session. The sas object is created as a result of starting a SAS session using a locally installed version of SAS under Microsoft Windows. After this session is successfully established, SASPy generates a note confirming the connection.
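
Here is a minimal sketch of those two steps; the configuration name winlocal is an assumption and should match a configuration defined in your own sascfg_personal.py:

import saspy

# Start a SAS session using a locally installed SAS under Windows.
# 'winlocal' is a placeholder configuration name; use the one from your SASPy setup.
sas = saspy.SASsession(cfgname='winlocal')
# On success, SASPy writes a note indicating the SAS connection was established.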

Adding Data

Now that the SAS session is started, you need to add some data to analyze. This example uses SASPy to read a CSV file that provides census data based on the ZIP Codes in Los Angeles County and to create a SASdata object named tabl, as sketched below.
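
A minimal sketch of that step; the file path and the target table name are assumptions:

# Read the ZIP Code census CSV into a SAS data set in the WORK library
# and return a SASdata object. The path below is a placeholder.
tabl = sas.read_csv('./zipcode_la_county.csv', table='zipcode', libref='work')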

To view the attributes of the SASdata object named tabl, use the Python print() function, which shows the libref and the SAS data set name. It also shows the results format as Pandas, which is the default result output for tables.
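
For example:

# Display the libref, table name, and current results format (Pandas by default)
print(tabl)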

Using Methods to Display and Analyze Data

This section provides some examples of how to use different methods to interact with SAS data via SASPy.

Head() Method

After loading the data, you can look at the first few records of the ZIP Code data, which is easy using the familiar head() method in Python. This example uses the head() method on the SASdata object tabl to display the first five records.
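
A sketch of that call:

# Display the first five records of the ZIP Code data
tabl.head()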

Describe() Method

After verifying that the data is what you expected, you can now analyze the data. To generate a simple summary of the data, use the Python describe() method in conjunction with the index [1:3]. This combination generates a summary of all the numeric fields within the table and displays only the second and third records. The subscript works only when the result is set to Pandas and does not work if set to HTML or Text, which are also valid options.
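
A sketch of that combination; the subscript assumes the default Pandas results format:

# Summary statistics for the numeric columns, keeping only the
# second and third rows of the resulting Pandas data frame
tabl.describe()[1:3]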

Teach_me_SAS() Method

The SAS code generated from the object-oriented Python syntax can also be displayed using SASPy with the teach_me_SAS() method. When you set the argument in this method to True, which is done using a Boolean value, the SAS code is displayed without executing the code:
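
A minimal sketch of that call:

# Ask SASPy to display the generated SAS code instead of executing it
sas.teach_me_SAS(True)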

ColumnInfo() Method

In the next cell, use the columnInfo() method to display the information about each variable in the SAS data set. Note: The SAS code is generated as a result of adding the teach_me_SAS() method in the last section:
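
A sketch of that cell:

# With teach_me_SAS(True) still in effect, this displays the PROC CONTENTS
# code that SASPy would run, rather than executing it
tabl.columnInfo()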

Submit() Method

Then, use the submit() method to execute the PROC CONTENTS step that is displayed in the cell above directly from Python. The submit() method returns a dictionary with two keys, LST and LOG. The LST key contains the results and the LOG key returns the SAS log. The results are displayed as HTML, and the HTML package is imported to display them.
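
A sketch of that pattern; the data set name work.zipcode carries over from the earlier read_csv sketch and is an assumption:

from IPython.display import HTML

# Turn code execution back on, then submit SAS code directly from Python
sas.teach_me_SAS(False)
result = sas.submit('proc contents data=work.zipcode; run;')
HTML(result['LST'])   # LST holds the results; result['LOG'] holds the SAS log

Any SAS code can be submitted this way, not just the generated PROC CONTENTS step.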

The SAS Kernel Using Jupyter Notebook

Jupyter Notebook can run programs in various programming languages, including SAS when you install and configure the SAS kernel. Using the SAS kernel is another way to run SAS interactively using a web-based program, which also enables you to save the analysis in a notebook. See the links above for details about installation and configuration of the SAS kernel. To verify that the SAS kernel installed successfully, you can run the following command: jupyter kernelspec list

From the command line, use the following command to start Jupyter Notebook: jupyter notebook. To execute SAS syntax from Jupyter Notebook, select SAS from the New drop-down list.

You can add SAS code to a cell in Jupyter Notebook and execute it; for example, a cell might contain a PRINT procedure and an SGPLOT procedure. The output is in HTML5 by default. However, you can specify a different output format if needed.

You can also use magics in the cell such as the %%python magic even though you are using the SAS kernel. You can do this for any kernel that you have installed.
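
A sketch of such a cell; the magic support is as described above, and the Python body is arbitrary:

%%python
print('This cell runs Python even though the notebook uses the SAS kernel')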

Other SAS Goodness

There are more ways of interacting with other languages with SAS as well. For example, you can use the Groovy procedure to run Groovy statements on the Java Virtual Machine (JVM). You can also use the LUA procedure to run LUA code from SAS along with the ability to call most SAS functions from Lua. For more information, see “Using Lua within your SAS programs.” Another very powerful feature is the DATA step JavaObject, which provides the ability to instantiate Java classes and access fields and methods. The DATA step JavaObject has been available since SAS® 9.2.

Resources

SASPy Documentation

Introducing SASPy: Use Python code to access SAS

Come on in, we're open: The openness of SAS® 9.4 was published on SAS Users.

10
Jan

Professional Awards provide first-time attendees a chance to attend SAS Global Forum 2018

This April, more than 5,000 SAS users and business leaders will converge on Denver, CO for the premier event for SAS professionals: SAS Global Forum 2018. The event provides an excellent forum to expand your SAS knowledge and network with users of all skill levels. (Last year I found myself having lunch one day sandwiched between a consultant who had built a three-decade career around SAS and a graduate student who started using SAS three months earlier. How's that for diversity!)

And because SAS Global Forum attracts users from across the globe; in every industry imaginable; and from countless government and academic institutions, it really is a user event not to be missed. Thanks to the SAS Global Users Group Executive Board there are a couple of award programs in place to help those who might otherwise have a hard time getting to the event... well, get to the event!

New SAS® Professional Award

For relatively new SAS users who want to experience the conference for the first time, there's the New SAS® Professional Award. This award provides full-time SAS professionals with five years or less of SAS experience the opportunity to earn a free conference registration and one free pre-conference tutorial. You are eligible if you have never attended a SAS Global Forum in the past and would not otherwise be able to attend without assistance.

SAS® Global Forum International Professional Award

A similar award, the SAS® Global Forum International Professional Award, provides users outside of the 48 contiguous U.S. states a similar opportunity. To qualify for this award, you must be a full-time SAS professional who has never attended a SAS Global Forum and would not otherwise be able to attend. This award provides free registration, including meals; one free pre-conference tutorial; and an invitation to an awards recognition luncheon on Sunday, April 8.

Both awards are managed by SAS users who will assume leadership roles in future conferences.

MaryAnne DePesquo, the 2019 SAS Global Forum Chair, is in charge of the 2018 International Professional Awards, while Lisa Mendez, SAS Global Forum Chair in 2020, manages the 2018 New SAS Professional Awards. Direct questions about either program to MaryAnne or Lisa.

To be considered for either program, you must submit your application by Jan. 29, 2018. You will be notified if you received an award no later than March 5, 2018.

Hope to see you in Denver!

Apply for the New SAS Professional Award.
Apply for the SAS Global Forum International Professional Award.

Interested? Hear more from a couple of last year's award recipients

Professional Awards provide first-time attendees a chance to attend SAS Global Forum 2018 was published on SAS Users.

5
Jan

Looking beyond the AI and deep learning hype

Deep learning is not synonymous with artificial intelligence (AI) or even machine learning. Artificial Intelligence is a broad field which aims to "automate cognitive processes." Machine learning is a subfield of AI that aims to automatically develop programs (called models) purely from exposure to training data.

Deep Learning and AI

Deep learning is one of many branches of machine learning, where the models are long chains of geometric functions, applied one after the other to form stacks of layers. It is one among many approaches to machine learning but not on equal footing with the others.

What makes deep learning exceptional

Why is deep learning unequaled among machine learning techniques? Well, deep learning has achieved tremendous success in a wide range of tasks that have historically been extremely difficult for computers, especially in the areas of machine perception. This includes extracting useful information from images, videos, sound, and others.

Given sufficient training data (in particular, training data appropriately labelled by humans), it’s possible to extract from perceptual data almost anything that a human could extract. Large corporations and businesses are deriving value from deep learning by enabling human-level speech recognition, smart assistants, human-level image classification, vastly improved machine translation, and more. Google Now, Amazon Alexa, ad targeting used by Google, Baidu and Bing are all powered by deep learning. Think of superhuman Go playing and near-human-level autonomous driving.

In the summer of 2016, an experimental short movie, Sunspring, was directed using a script written by a long short-term memory (LSTM) algorithm, a type of deep learning algorithm.

How to build deep learning models

Given all this success recorded using deep learning, it's important to stress that building deep learning models is more of an art than a science. To build a deep learning model, or any machine learning model for that matter, you need to consider the following steps (a generic sketch follows the list):

  • Define the problem: What data does the organisation have? What are we trying to predict? Do we need to collect more data? How can we manually label the data? Make sure to work with a domain expert, because you can’t interpret what you don’t know!
  • Choose metrics that reliably measure the success of our goals.
  • Prepare a validation process that will be used to evaluate the model.
  • Data exploration and pre-processing: This is where most time will be spent, on tasks such as normalization, manipulation, and joining of multiple data sources.
  • Develop an initial model that does better than a baseline model. This gives some indication of whether machine learning is suitable for the problem.
  • Refine the model architecture by tuning hyperparameters and adding regularization. Make changes based on validation data.
  • Avoid overfitting.
  • Once happy with the model, deploy it into the production environment. This may be difficult to achieve for many organisations given that deep learning score code is large. This is where SAS can help. SAS has developed a scoring mechanism called "astore" which allows deep learning methods to be pushed into production with just a click.
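
To make the list concrete, here is a purely generic, non-SAS sketch of a few of those steps (a simple baseline, held-out validation data, a small network, and some regularization) using Keras on synthetic data; every name, shape, and hyperparameter in it is an assumption chosen only for illustration:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic stand-in for "the data the organisation has"
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20)).astype("float32")
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype("float32")

# Baseline: always predict the majority class
baseline_accuracy = max(y.mean(), 1 - y.mean())

# A small model intended to beat the baseline
model = keras.Sequential([
    layers.Dense(16, activation="relu", input_shape=(20,)),
    layers.Dropout(0.2),  # regularization added while refining the architecture
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hold out validation data to measure success reliably
history = model.fit(X, y, validation_split=0.2, epochs=10, verbose=0)

print("baseline accuracy:", baseline_accuracy)
print("validation accuracy:", history.history["val_accuracy"][-1])

The point is the workflow rather than the model itself: establish the baseline first, measure against held-out data, and only then refine the architecture.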

Is the deep learning hype justified?

We're still in the middle of the deep learning revolution, trying to understand the limitations of this algorithm. Due to its unprecedented successes, there has been a lot of hype in the field of deep learning and AI. It’s important for managers, professionals, researchers and industrial decision makers to be able to separate the media-created hype from reality.

Despite the progress on machine perception, we are still far from human-level AI. Our models can only perform local generalization, adapting to new situations that must be similar to past data, whereas human cognition is capable of extreme generalization, quickly adapting to radically novel situations and planning for long-term future situations. To make this concrete, imagine you’ve developed a deep network controlling a human body, and you wanted it to learn to safely navigate a city without getting hit by cars. The net would have to die many thousands of times in various situations until it could infer that cars are dangerous and develop appropriate avoidance behaviors. Dropped into a new city, the net would have to relearn most of what it knows. On the other hand, humans are able to learn safe behaviors without having to die even once, thanks to our power of abstract modeling of hypothetical situations.

Lastly, remember that deep learning is a long chain of geometric functions. To learn its parameters via gradient descent, one key technical requirement is that the chain must be differentiable and continuous, which is a significant constraint.

Looking beyond the AI and deep learning hype was published on SAS Users.

21
Dec

Relative Period Report in SAS Visual Analytics

Another report requirement came my way and I wanted to share how to use our Visual Analytics’ out-of-the-box relative period calculations to solve it.

Essentially, we had a customer who wanted to see a metric for every month, the previous month’s value next to it, and lastly the difference between the two.


To do this in SAS Visual Analytics, available in versions 7.3 and above, use the relative period operators. I am going to use the Mega_Corp data, which has a date data item called Date by Month using the format MMMYYYY. SAS Visual Analytics supports relative period calculations for month, quarter and year.
The first two columns are straight from the data. The metric we are interested in for this report is Profit.

Next, we will create the last column, Profit (Difference from Previous Period), which is an aggregated measure that uses the periodic operators.

From the Data pane, select the metric used in the list table, Profit. Then right-click on Profit and navigate the menus: Create / Difference from Previous Period / Using: Date by Month.

A new aggregated measure will be created for you:

If you right-click on the aggregated measure and select Edit Aggregated Measure…, you will see this relative period calculation, where it is taking the current period (notice the 0) minus the value for the previous period (notice the -1).

Okay – that’s it. This out-of-the-box relative period calculation is ready to be added to the list table. Notice the other Period Operators available in the list. These support SAS Visual Analytics’ additional out-of-the-box aggregated measure calculations such as the Difference between Parallel Periods, Year to Date cumulative calculations, etc.

Now we have to create the final column to meet our report requirement: the Previous Period column.

To do this we are going to leverage the out-of-the-box functionality of the relative period calculation. Since this aggregated measure calculates the previous period for the subtraction – let’s use this to our advantage.

Duplicate the out-of-the-box relative period calculation by right-clicking on Profit (Difference from Previous Period) and select Duplicate Data Item.

Then right-click on the new data item, and select Edit Aggregated Measure….

Now delete everything highlighted in yellow below; remember to also delete the minus sign. Then give the data item a new name and click OK. This will create an aggregated measure that calculates the previous period.

The final result should look like this from either the Visual tab or Text tab:

Now we have all the columns to meet our report requirement:

Now that I’ve piqued your interest, I’m sure you are wondering whether you could use this technique to create aggregated data items that represent period -1, -2, or -3 offsets. YES! This is absolutely possible.
Also, I went ahead and plotted the Difference from Previous Period on a line chart. This is an extremely useful visualization to gauge whether the variance between periods is acceptable. You can easily assign display rules to this visualization to flag any periods that may need further investigation.

Relative Period Report in SAS Visual Analytics was published on SAS Users.

20
Dec

Manage remediation issues using SAS Data Management

With SAS Data Management, you can set up SAS Data Remediation to manage and correct data issues. SAS Data Remediation allows user- or role-based access to data exceptions.

When a data issue is discovered it can be sent, automatically or manually, to a remediation queue where it can be corrected by designated users. The issue can be fixed from within SAS Data Remediation without the need to go to the affected source system. For more efficiency, the remediation process can also be linked to a purpose-designed workflow.

Setting up a remediation process that allows you to correct data issues from within SAS Data Remediation involves a few steps:

  • Set up a Data Management job to retrieve and correct data in remediation.
  • Set up a Workflow to control the remediation process.
  • Register the remediation service.

Set up a Data Management job to retrieve and correct data in remediation

To correct data issues from within SAS Data Remediation we need two real-time Data Management jobs: a retrieve job and a send job. The retrieve job reads the record in question to populate its data in the remediation UI, and the send job writes the corrected data back to the data source, or to a staging area first.

Retrieve and send jobs

If the following remediation fields are available in the retrieve or send job’s External Data Provider node, data will be passed to the fields. The field values can be used to identify and work with the correct record:

  • REM_KEY (internal field that stores the issue record ID)
  • REM_USERNAME (the current remediation user)
  • REM_ITEM_NAME
  • REM_ISSUE_NAME
  • REM_APPLICATION
  • REM_SUBJECT_AREA
  • REM_PACKAGE_NAME
  • REM_STATUS
  • REM_ASSIGNEE

Retrieve Action

The "retrieve” action occurs when the issue is opened in SAS Data Remediation. Data Remediation will only pass REM_ values to the data management job if the fields are present in the External Data Provider node. Although the REM_ values are the only way the data management job can communicate with SAS Data Remediation but they are not all required, meaning you can just call the fields in the External Data Provider node you need.

The job’s output fields will be displayed in the Remediation UI as edit fields to correct the issue record. It’s best to use a Field Layout node as the last job node to pass out the wanted fields with the desired labels.

Note: The retrieve job should only return one record.

A simple example of a retrieve job would be to have the issue record id coming from REM_KEY into the data management job to select the record from the source system.

Send Action

The “send” action occurs when pressing the “Commit Changes” button in the Data Remediation UI. All REM_ values in addition to the output fields of the retrieve job (the issue record edit fields) are passed to the send job. The job will receive values for those fields present in the External Data Provider node.

The send job can now work with the remediation record and save it to a staging area or submit it to the data source directly.

Note: Only one row will be sent as an input to the send job. Any data values returned by the send job will be ignored by Data Remediation.

Move jobs to the Data Management Server

When both jobs are written and tested, you need to move them to the Data Management Server, into a Real-Time Data Services sub-directory, so that Data Remediation can call them.

When Data Remediation calls the jobs, it uses the credentials of the person logged on to Data Remediation. Therefore, you need to make sure that the jobs on the Data Management Server have been granted the right permissions.

Set up a Workflow to control the remediation process

Although you don’t need to involve a workflow in SAS Data Remediation, using one can improve efficiency.

You can design your own workflow using SAS Workflow Studio, or you can use a prepared workflow that comes with Data Remediation. You need to make sure that the desired workflow is loaded onto the Workflow Server in order to link it to the Data Remediation service.

Using SAS Workflow will help you to better control Data Remediation issues.

Register the remediation service

We can now register our remediation service in SAS Data Remediation. To do this, we go to the Data Remediation Administrator and select “Add New Client Application.”

Under Properties we supply an ID, which can be the name of the remediation service as long as it is unique, and a Display name, which is the name showing in the Remediation UI.

Next we set up the edit UI for the issue record. Under Issue User Interface we go to: Use default remediation UI…. Using Data Management Server:

The Server address is the fully qualified address for Data Management Server including the port it is listening on. For example: http://dmserver.company.com:21036.

The Real-time service to retrieve item attributes and Real-time service to send item attributes fields need to point to the retrieve and send jobs, respectively, on the Data Management Server, including the job suffix .ddf as well as any directories under Real-Time Data Services where the jobs are stored.

Under the tab Subject Area, we can register different subject categories for this remediation service. When calling the remediation service, we can categorize different remediation issues by setting different subject areas.

Under the tab Issues Types, we can register issue categories. This enables us to categorize the different remediation issues.

At Task Templates/Select Templates you can set the workflow to be used for a particular issue type.

After saving the remediation service, you will be able to use it. You can now assign data issues to the remediation service to efficiently correct the data and improve your data quality from within SAS Data Remediation.

Manage remediation issues using SAS Data Management was published on SAS Users.

19
Dec

Running SAS programs in batch under Unix/Linux

While SAS program development is usually done in an interactive SAS environment (SAS Enterprise Guide, SAS Display Manager, SAS Studio, etc.), when it comes to running SAS programs in a production or operations environment, it is routinely done in batch mode.

Why run SAS programs in batch mode?

First and foremost, this is done for automation, as the batch process does not require human participation at the time of run. It can be scheduled to run (using Operating System scheduler or other scheduling software) while we sleep, at any time of the day or at any time interval between two consecutive runs.

Running SAS programs in batch mode allows streamlining SAS processing by eliminating the possibility of human error and by submitting multiple SAS jobs (programs) all at once or in a sequence that secures program and/or data dependencies.

SAS batch processing also takes care of self-documenting, as it automatically generates and stores SAS logs and outputs.

Imagine the following scenario. Every night, a SAS batch process “wakes up” at 3 a.m. and runs an ETL process on a SAS Application server that extracts multiple tables from a database, transforms, combines, and loads them into a SAS datamart; then moves some data tables across the network and loads them into SAS LASR server, so when you are back to work in the morning your SAS Visual Analytics application has all its data refreshed and ready to roll. Of course, the process schedule can be custom-tailored to your particular needs; your batch jobs may run every 15 minutes, once a week, every first Friday of the month – you name it.

What is a batch script file?

To submit a single SAS program in batch mode manually, you could submit an OS command that looks something like the following:

Unix/Linux

sas /sas/code/proj1/job1.sas -log /sas/code/proj1/job1.log

DOS/Windows

"C:Program FilesSASHomeSASFoundation9.4Sas.exe" -SYSIN c:proj1job1.sas -NOSPLASH -ICON -LOG c:proj1job1.log

However, submitting an OS command manually has too many drawbacks: it’s too much typing, it only submits one SAS program at a time, and most importantly – it is manual, which means it is prone to human error.

Usually, these OS commands are packaged into so called batch files (shell scripts in Unix) that allow for sequential, parallel, as well as conditional execution of multiple OS line commands. They can be run either manually, or automatically – on schedule, or called by other batch scripts.

In a Windows/DOS Operating System, these script files are called batch files and have .bat filename extensions. In Unix-like operating systems, such as Linux, these script files are called shell scripts and have .sh filename extensions.

Since Windows batch files are similar to, but slightly different from, Unix (and its open source cousin Linux) shell scripts, in the examples below we are going to use Unix/Linux shell scripts only, in order to avoid any confusion. And we are going to use the terms Unix and Linux interchangeably.

Here is the typical content of a Linux shell script file to run a single SAS program:

#!/usr/bin/sh
dtstamp=$(date +%Y.%m.%d_%H.%M.%S)
pgmname="/sas/code/project1/program1.sas"
logname="/sas/code/project1/program1_$dtstamp.log"
/sas/SASHome/SASFoundation/9.4/sas $pgmname -log $logname

Note that the shell script syntax allows for some basic programming features like a current datetime function, formatting, and variables. It also provides some conditional processing similar to “if-then-else” logic. For detailed information on the shell scripting language you may refer to a BASH shell script tutorial or any other source covering the many dialects or flavors of shell scripting (C Shell, Korn Shell, etc.).

Let’s save the above shell script as the following file:
/sas/code/project1/program1.sh

How to submit a SAS program via Unix script

In order to run this shell script we would submit the following Linux command:
/sas/code/project1/program1.sh

Or, if we navigate to the directory first:
cd /sas/code/project1

then we can submit an abbreviated Linux command
./program1.sh

When run, this shell script not only executes a SAS program (program1.sas), but for every run it also creates and saves a uniquely named SAS log file. You may create the SAS log file in the same directory where the SAS code is stored, as specified in the shell script above, or specify another directory of your choice.

For example, it creates the following SAS log file:
/sas/code/project1/program1_2017.12.06_09.15.20.log

The file name uniqueness is achieved by adding a date/time stamp suffix between the SAS program name and .log file name extension, in this particular case indicating that this SAS log file was created on December 6, 2017, at 09:15:20 (hours:minutes:seconds).

Unix script for submitting multiple SAS programs

Unix scripts may contain not only OS commands, but also other Unix script calls. You can mix-and-match OS commands and other script calls.

When scripts are created for each individual SAS program that you intend to run in a batch, you can easily combine them into a program flow by creating a flow script containing those single program scripts. For example, let’s create a script file /sas/code/project1/flow1.sh with the following contents:

/sas/code/project1/program1.sh
/sas/code/project1/program2.sh
/sas/code/project1/program3.sh

When submitted as

/sas/code/project1/flow1.sh

it will sequentially execute three scripts - program1.sh, program2.sh, and program3.sh, each of which will execute the corresponding SAS program - program1.sas, program2.sas, and program3.sas, and produce three SAS logs - program1.log, program2.log, and program3.log.

Unix script file permissions

In order to be executable, UNIX script files must have certain permissions. If you create the script file and want to execute it yourself only, the file permissions can be as follows:

-rwxr-----, or 740 in octal representation.

This means that you (the Owner of the script file) have Read (r), Write (w) and Execute (x) permissions; the Group owning the script file has only Read (r) permission; and Others have no permissions to the script file at all.

If you want to give yourself (Owner) and Group execution permissions then your script file permissions can be as:

-rwxr-x---, or 750 in octal representation.

In this case, your group has Read (r) and Execute (x) permissions.

In Unix, file permissions are assigned using the chmod Unix command.

Note, that in both examples above we do not give Others any permissions at all. Remember that file permissions are a security feature, and you should assign them at the minimum level necessary.

Conditional execution of scripts and SAS programs

Here is an example of a Unix script file that allows running multiple SAS programs and OS commands at different times.

#!/bin/sh

#1 extract data from a database
/sas/code/etl/etl.sh

#2 copy data to the Visual Analytics autoload directory
scp -B userid@sasAPPservername:/sas/data/*.sas7bdat userid@sasVAservername:/sas/config/.../AutoLoad

#3 run weekly, every Monday
dow=$(date +%w)
if [ $dow -eq 1 ]
then
   /sas/code/alerts_generation.sh
fi

#4 run monthly, first Friday of every month
dom=$(date +%d)
if [ $dow -eq 5 -a $dom -le 7 ]
then
   /sas/code/update_history.sh
   /sas/code/update_transactions.sh
fi

In this script, the following logical operators are used: -eq (equal), -le (less or equal), -a (logical and).

As you can see, the script logic takes care of branching to execute different SAS programs when certain timing conditions are met. With such an approach, you would need to schedule only this single script to run at a specified time/interval, say daily at 3 a.m.

In this case, the script will “wake up” every morning at 3 a.m. and execute its component scripts either unconditionally, or conditionally.

If one of the included programs needs to run at a different, lesser frequency (e.g. every Monday, or monthly on first Friday of every month) the script logic will trigger those executions at the appropriate times.

In the above script example steps #1 and #2 will execute every time (unconditionally) the script runs (daily). Step #1 runs ETL program to extract data from a database, step #2 copies the extracted data across the network from SAS Application server to the SAS LASR Analytic server’s drop zone from where they are automatically loaded (autoloaded) into the LASR.

Step #3 will run conditionally every Monday ( $dow -eq 1). Step #4 will run conditionally every first Friday of a month ($dow -eq 5 -a $dom -le 7).

For more information on how to format date for use in shell scripts please refer to this post.

Do you run your SAS programs in batch?

Please share your batch experiences in the comment section below. I am sure the rest of us will really appreciate it!

Running SAS programs in batch under Unix/Linux was published on SAS Users.

18
Dec

SAS Admin Notebook: Managing SAS Configuration Directory Security for SAS Visual Analytics

In my last article, Managing SAS Configuration Directory Security, we stepped through the process for granting specific users more access without opening up access to everyone. One example addressed how to modify security for autoload. There are several other aspects of SAS Visual Analytics that can benefit from a similar security model.

You can maintain a secure environment while still providing one or more select users the ability to:

  • start and stop a SAS LASR Analytic Server.
  • load data to a SAS LASR Analytic Server.
  • import data to a SAS LASR Analytic Server.

Requirements for these types of users fall into two areas: metadata and operating system.

The metadata requirements are very well documented and include:

  • an individual metadata identity.
  • membership in appropriate groups (for example: Visual Analytics Data Administrators for SAS Visual Analytics suite level administration; Visual Data Builder Administrators for data preparation tasks; SAS Administrators for platform level administration).
  • access to certain metadata (refer to the SAS Visual Analytics 7.3: Administration Guide for metadata permission requirements).

Operating System Requirements

Users who need to import data, load data, or start a SAS LASR Analytic Server need the ability to authenticate to the SAS LASR Analytic Server host and write access to some specific locations.

If the SAS LASR Analytic Server is distributed, users need passwordless secure shell (SSH) access across the machines in the SAS LASR Analytic Server cluster.

If the compute tier (the machine where the SAS Workspace Server runs) is on Windows, users need the Log on as a batch job user right on the compute machine.

In addition, users need write access to the signature files directory, the path for the last action logs for the SAS LASR Analytic Server, and the PIDs directory in the monitoring path for the SAS LASR Analytic Server.

Signature Files

There are two types of signature files: server signature files and table signature files. Server signature files are created when a SAS LASR Analytic Server is started. Table signature files are created when a table is loaded into memory. The location of the signature files for a specific SAS LASR Analytic Server can be found on the Advanced properties of the SAS LASR Analytic Server in SAS Management Console.


On Linux, if your signature files are in /tmp you may want to consider relocating them to a different location.

Last Action Logs and the Monitoring Path

In the SAS Visual Analytics Administrator application, logs of interactive actions for a SAS LASR Analytic Server are written to the designated last action log path. The standard location is on the middle tier host in <SAS_CONFIG_ROOT>/Lev1/Applications/SASVisualAnalytics/VisualAnalyticsAdministrator/Monitoring/Logs. The va.lastActionLogPath property is specified in the SAS Visual Analytics suite level properties. You can access the SAS Visual Analytics suite level properties in SAS Management Console under the Configuration Manager: expand SAS Application Infrastructure, right-click on Visual Analytics 7.3 to open the properties, and select the Advanced tab.

The va.monitoringPath property specifies the location of certain monitoring process ID files and logs. The standard location is on the compute tier in <SAS_CONFIG_ROOT>/Lev1/Applications/SASVisualAnalytics/VisualAnalyticsAdministrator/Monitoring/. This location includes two subdirectories: Logs and PIDs. You can override the default monitoring path by adding the va.monitoringPath extended attribute to the SAS LASR Analytic Server properties.

Host Account and Group

For activities like starting the SAS LASR Analytic Server you might want to use a dedicated account such as lasradm or assign the access to existing users. If you opt to create the lasradm account, you will need to also create the related metadata identity.

For group level security on Linux, it is recommended that you create a new group, for example sasusers, to reserve the broader access provided by the sas group to only platform level administrators. Be sure to include in the membership of this sasusers group any users who need to start the SAS LASR Analytic Server or that need to load or import data to the SAS LASR Analytic Server.

Since the last action log path, the monitoring path, and the autoload scripts location all fall under <SAS_CONFIG_ROOT>/Lev1/Applications/SASVisualAnalytics/VisualAnalyticsAdministrator, you can modify the ownership of this folder to get the right access pattern.

A similar pattern can also be applied to the back-end store location for the data provider library that supports reload-on-start.

Don’t forget to change the ownership of your signature files location too!

SAS Admin Notebook: Managing SAS Configuration Directory Security for SAS Visual Analytics was published on SAS Users.

18
Dec

What can compression do for you?

Compressing a data set is a process that reduces the number of bytes that are required to represent each observation in a file. You might choose to enable compression to reduce the storage requirements for the file and to lessen the number of I/O operations that are needed to read from or write to the data during processing.

Compression is enabled by the COMPRESS= system option, the COMPRESS= option in the LIBNAME statement, and the COMPRESS= data set option. The COMPRESS= system option compresses all data sets that are created during a SAS session, and the COMPRESS= option in the LIBNAME statement compresses all data sets for a particular SAS® library. The COMPRESS= data set option is the most popular of these methods because you compress data sets individually as they are created.

The COMPRESS= data set option can be set to CHAR (or YES), NO, or BINARY. The following example illustrates using COMPRESS=YES:

data new(compress=yes);
set old;
run;

 

While compression is a useful tool in your programming toolbox, it isn't a tool that you should use on every data set. When you request compression by using the COMPRESS= option, SAS considers the following information:

  • the header information of the data set, to determine how many variables are in the program data vector
  • whether the variables are character or numeric
  • the lengths of the variables

SAS doesn't consider data values at all. The compression overhead for Microsoft 32-bit Windows and 64-bit Windows is 12 bytes, whereas 64-bit UNIX hosts require 24 bytes of overhead. When SAS determines that it is possible to recoup the 12 or 24 bytes of overhead per observation that compression requires, then SAS attempts to compress the data. If that 12 or 24 bytes per observation can't be recouped, the data set size is increased when the compression is completed. So, you should determine ahead of time whether your data set is a good candidate for compression.

In the following example, a data set is created in the Windows operating environment with two variables having lengths, respectively, of 3 and 5 bytes. Because it is impossible to recoup the 12 bytes that are needed per observation for compression overhead, SAS automatically disables compression and a note is written to the SAS log that indicates the same.

571  data new(compress=char);
572     x='abc';
573     y='defgh';
574  run;
 
NOTE: Compression was disabled for data set WORK.NEW because compression overhead would increase
      the size of the data set.
NOTE: The data set WORK.NEW has 1 observations and 2 variables.

 

The compression process doesn’t recognize individual variables within an observation. Instead, the process sees each observation as a large collection of bytes that are run together end to end. In the COMPRESS= data set option, you enable compression by specifying either CHAR (YES) or BINARY. These values for the option differ slightly in the types of data values that they target for compression.

Using the COMPRESS=CHAR|YES option

Specifying COMPRESS=CHAR (or YES) targets data with repeating single characters and variables with stored lengths that are longer than most of the values. As a result, blank spaces pad the end of values that are shorter than the number of bytes of storage.

In thinking about conserving space, customers often shorten the storage lengths of variables by using a LENGTH statement. When you shorten the lengths of your variables, you remove the best opportunity for SAS to compress. For example, if a numeric variable can be stored accurately in 4 bytes, the remaining 4 bytes (in an 8-byte variable) will all be zeros. This situation is perfect for compression. However, when you shorten the length to 4 bytes, the layout of the value is no longer suitable for compression. The only reason to truncate the storage length by using the LENGTH statement is to save disk space. All values are expanded to the full size of 8 bytes in the program data vector to perform computations in DATA and PROC steps. You'll use extra CPU resources to uncompress the data set as well as to expand variables back to 8 bytes.

Using the COMPRESS=BINARY option

When you use COMPRESS=BINARY, patterns of multiple characters across the entire observation are compressed. Binary compression uses two techniques at the same time. This option searches for the following:

  1. Repeating byte sequences (for example, 10 blank spaces or 10 zero bytes in a row)
  2. Repeating byte patterns (for example, the repeated pattern in the hexadecimal value 0102030405FAF10102030405FBF20102030405FCF3)

With that in mind, you can see that the bytes in a numeric variable are just as likely to be compressed as those in a character variable because the compression process does not consider those bytes to be numeric or character. They are just viewed as bytes. Consider a missing value that is represented in hexadecimal notation as FFFF000000000001. In the middle of that value is a string of five zero bytes (0x00) that can be replaced by two compression code-bytes. So, what starts as a sequence of 8 bytes ends up as a sequence of 5 bytes.

Keep in mind

As mentioned earlier, although compression saves space and is a great tool to keep handy in your SAS toolbox, it’s not meant for all your data sets. Some data sets are not going to compress well and the data set will grow larger, so know your data. Also, you’ll want to consider the extra CPU resources that are required to read a compressed file due to the overhead of uncompressing each observation.

What can compression do for you? was published on SAS Users.
