Creating an AI Assistant for SAS Viya in 5 steps (@sassoftware/viya-assistantjs) - Part I
Recent Library Articles
Recently in the SAS Community Library: SAS' @kumardeva debunks the myth that developing AI assistants is too hard. He shows you how to use the @sassoftware/viya-assistantjs library to jump start your development.
SAS 9.4M5 has been released for 3 months and it now provides a great integration with CAS. SAS 9.4M5 is a true CAS client and it makes very easy to work with CAS, as you would normally do with the SAS engine that is part of a Viya deployment. You no longer need to use SAS/CONNECT. Among the new features and in addition to the CAS integration, M5 comes with a new version of SAS Data Integration Studio: 4.903. This version takes advantage of this better CAS integration from the SAS 9 world. Let's dive into how DI Studio now works with CAS.
I've already blogged about DI Studio integration on 9.4M4 with CAS. So, if you or your customer haven't moved yet to M5, then you might want to check out this article.
The 9.4M4 (SAS DI Studio 4.902) data transformation that works with CAS is called "Cloud Analytic Services Transfer" and allows you to move data between SAS 9.4 and CAS using SAS/CONNECT behind the scenes. This transformation is no longer available in DI Studio 4.903. But if you have designed DI jobs with this node and then have migrated to 9.4M5, these jobs will continue to work as expected, assuming you still have SAS/CONNECT on both sides (SAS9.4 and SAS Viya).
Loading data into CAS in DI Studio 4.903 is much more straightforward. There's a new built-in data transformation that is called "Cloud Analytic Services Table Loader". Before using it, let's talk about the metadata that you need to setup in order to be able to use it.
SAS9.4M5 Metadata Setup
To use CAS from SAS9.4M5 and SAS Data Integration Studio 4.903, you need:
A metadata definition for your CAS Server
You just have to add to your existing metadata an entry for referencing your CAS Server properties. This is done through the definition of a “Cloud Analytic Services Server” metadata object in SAS Management Console.
Simply reference the CAS Controller host machine name and its port, and set Authentication type to “(None)”.
A metadata definition for your CAS engine library
A CAS engine library is different from a CASLIB. A CASLIB is an object managed and owned by CAS. A CAS engine library is a SAS library that points to a CASLIB. It's analogous to having an Oracle schema on one side and a SAS library that points to this Oracle schema on the other side.
The SAS Management Console lets you define a CAS engine library in the metadata, that points to an existing CASLIB in CAS. Simply choose the “SAS Cloud Analytic Services Library”
Unlike the similar configuration of the 9.4M4 bridge (relying on SAS/CONNECT), you must not assign this library to the SAS Viya application server (if it exists) but rather to the “local” SASApp application server. Indeed, this “SASApp” application server is based on a 9.4M5 engine and thus is now able to interact directly with CAS.
Define the libref and the CASLIB that you want to connect to. The CASLIB must exist in CAS.
Finally, attach the CAS Server definition that you just created in the previous step.
SAS users who can connect interchangeably to the SAS (9.4) Workspace Server and CAS
From an authentication perspective, if you share the same security layer between your SAS9.4 environment and your SAS Viya environment, then it should work smoothly. SAS 9.4M5 credentials get automatically passed to CAS for authentication.
In my testing environment, I have 2 systems (OS for SAS9.4M5 and LDAP for Viya) but have the same login/password in both environments. That's definitely not a best practice but it works.
Use the “Cloud Analytic Services Table Loader” transformation
Once you are ready with the metadata, you are all set to use the new “Cloud Analytic Services Table Loader” data transformation to load different types of data into CAS. This node is located in the “Access” folder.
You cannot fully take advantage of this transformation until you connect it to a target CAS table defined in the metadata. This allows the transformation to know about the CAS server and to interact dynamically with it.
Otherwise, the node will not be able to display files and tables that are currently available in CAS.
Thus, you need to import in the metadata already loaded CAS tables that will be used as targets or define yourself from scratch new CAS table metadata definitions.
All right. So, what could you do with this “CAS Table Loader”?
Load data from the SAS 9.4 environment to CAS (client-side loading)
This is probably the most natural way of using this transformation. You extract data from your SAS 9 data sources, transform it using various data manipulations and then want to make it available to your Viya users. This is done graphically by putting the right source table on the canvas and connect it to the CAS Table Loader. In most cases this should work smoothly without any change in the transformation (unless you want to change the default behavior).
You can also configure the "CAS Table Loader" by selecting the right “Load Technique” (“LOAD DATA”) and the right source table in the transformation:
As a result, the canvas will be refreshed and display the newly added source table.
Load files from the SAS 9.4 environment to CAS (client-side loading)
This is almost the same situation except that instead of SAS files, you want to load external files like CSV, Microsoft Excel, JMP files, etc. that are located on the SAS Workspace Server machine.
You can do this by selecting the right “Load Technique” (“LOAD FILE”) and the right data source file on your server.
This cannot be configured using the canvas. Therefore, the input file will not appear on the process flow.
NB: you could also prefer to use SAS DI Studio to read and parse your external file
if the format is not one of the standards supported by the “LOAD FILE” technique,
if you want to have more control on reading options,
if you want to transform the data file before loading
or if you want to have more traceability from a lineage perspective.
You will then move to a “LOAD DATA” technique using the canvas.
Load data from the Viya/CAS environment to CAS (server-side loading)
Finally, this transformation enables you to fully control and orchestrate data loading in CAS, even server-side loading. This is new in this transformation, compared to what we have with the “CAS Transfer” data transformation in the previous DI Studio version. This is the “LOAD CASDATA” load technique.
Using this “Load Technique”, you will be able to:
List the currently available CASLIBs (with respect to the CAS user authorizations)
Select a source CASLIB
It could be the same CASLIB as the one that supports your target table (=> natural loading)
It could be a different CASLIB than the one that supports your target table (=> cross loading)
List the source files associated with a CASLIB
The files list comes from an OS directory, or an HDFS directory, or a database schema, or a LASR library, etc.
Select a source file
And specify some additional options
As you can see, the transformation generates a PROC CASUTIL code that performs the server-side loading.
Options of interest
Replace
You can define the replacement strategy using the “Replace” option in the “Load Technique” tab. Notice that “Append” is only available for client-side loading of SAS data sets (LOAD DATA).
Be careful, if set to “Replace” and if the persistent copy is set to YES (see later), this option applies to both the in-memory table and the persistent copy of the table.
View output columns
In this dialog box in the “Load Technique” tab, you can select the variables to be output in the target table. It will update the target table’s metadata definition.
Where
You can specify filters to subset the data that you want to load.
Group By
You can specify here the partition variables. CAS will organize the target table in memory over multiple nodes according to those variables. It will help further operations that require by-group processing.
It only works with the “LOAD DATA” technique.
Order By
Ordering rows only makes sense within partitions. So, you can also “Order By” variables in this data transformation. It only works with the “LOAD DATA” technique and if you have defined a partition key.
Persist table
You can choose to persist the table in the target CAS library (“Options” tab). That means that you will load a table/file in CAS in-memory and this table will also be backed on disk in the CASLIB “datasource” folder (OS folder or HDFS folder).
This can be useful in client-side scenarios for fast re-loading or this can be painful. You might inadvertently generate many copies of your data. Use that with caution, default is YES! It works in conjunction with the REPLACE option, discussed earlier.
This only works with path-based CASLIBs (PATH, HDFS, DNFS) in Viya 3.2 because they provide write-back capabilities. This will work with all CASLIBs (DBMS CASLIBs for example) in Viya 3.3.
Advanced Options
Finally, in the “Table Options” tab, you can:
Load data in CAS in a compressed format
Specify the number of copies for failover
Load data as a repeated table (client-side loading of SAS data sets only)
Promote the table directly after loading
Certain CAS table options might not be supported for a given load technique and a given file format (especially SASHDAT).
Conclusion
SAS DI Studio 4.903 provides a nice way to load data from your 9.4 environment (but also from your CAS CASLIBs) into CAS with its new “CAS Table Loader” data transformation.
Keep in mind that you must restrict the usage of CAS in SAS DI Studio to the final loading step. Don’t try to process CAS data using the standard DI data transformations. It might work or not (with respect to the DATALIMIT option which prevents the CAS client to download an entire CAS table locally; default is 100MB). But it will use the SAS 9.4 Workspace Server and all the required data will be downloaded before.
If you want to process data in CAS from DI Studio, then a “User Written Code” or a new custom “Transformation” with the right syntax, are the only options.
... View more
I am trying to understand how filevar= option is used in infile statement. I am able to understand that a different file each time (varying file, filevar) will be given to read and it is read sequentially. But I am not able to understand the exact syntax. I am sharing the syntax below which I am trying.
filename qtr1 "path";
filename qtr2 "path";
filename qtr3 "path";
data new;
infile temp filevar = qtr1;
input A B C D;
run;
A B C D are four columns in .dat file qtr1, qtr2 and qtr3 and they are getting read properly without filevar= option. But when I am using filevar=option, the error is shown that "Invalid physical name."
I request to kindly guide me about this. Thanks in advance.
- Dr. Abhijeet Safai
... View more
hello. I am a SAS beginner. I have an experimental group, and I need to build a control group. A control group must be created by matching control group subjects 1:1 based on two variables: age and sex of the experimental group. Matching should be done using only those two variables, not the propensity score. Should I use proc psmatch or inner join? I would appreciate it if you could provide detailed code.
... View more
Välkommen till FANS Vinteravslutning (Endast för FANS-medlemmar)
Datum: 5 december 2024
Tid: 15:00 - 21:00
Plats: Stora Frösunda, Solna
Registrering: 14:30
14:30
Registrering och fika med något gott
15:00
Välkomna
Presentationer
17:00
Vi går ner till Orangeriet och får oss en dryck och har lite kul
18:00
Middag serveras och vi fortsätter ha kul
Registrera deltagare genom att klicka här!
Känner du till vårt månatliga SAS Nordic User Newsletter? Här kan du läsa om SAS Nordic User-aktiviteter och evenemang.
Håll dig uppdaterad och prenumerera mha denna länk. Du kan avbryta prenumerationen närhelst du vill.
Tillbaka till FANS - Nordic SAS User Group
... View more