Creating an AI Assistant for SAS Viya in 5 steps (@sassoftware/viya-assistantjs) - Part I
Recent Library Articles
Recently in the SAS Community Library: SAS' @kumardeva debunks the myth that developing AI assistants is too hard. He shows you how to use the @sassoftware/viya-assistantjs library to jump-start your development.
Hi there,
I am trying to create a CSV file that has both the variable names of the dataset and an extra line above them as a header. It should look like the example below.
I have tried PROC EXPORT, ODS CSVALL, and a DATA _NULL_ step, but I can't seem to get it to work. Examples of my code are below. Is there a way to add the ndar_subject header line while keeping the variable names as the headers below it? If I could get the variable names as a header in the DATA _NULL_ step, that would be ideal, but I have not been able to figure this out.
%let header1 = %str(ndar_subject,1);
%put &header1;

data _null_;
   file "R:\tsanche\REALM Data\Data\SAS Created Data\Data Repository\July 2024 CSV Upload Files\REALM Subject and Pedigree July 2024.csv" dsd;
   set repdata.pedig;
   if _n_ = 1 then do;
      put "&header1";
   end;
   put (_all_)(+0);
run;

proc export data=repdata.pedig
   outfile='R:\tsanche\REALM Data\Data\SAS Created Data\Data Repository\July 2024 CSV Upload Files\REALM Subject and Pedigree July 2024.csv'
   dbms=csv replace;
run;

ods csvall file="R:\tsanche\REALM Data\Data\SAS Created Data\Data Repository\July 2024 CSV Upload Files\REALM Subject and Pedigree July 2024.csv";
proc report data=repdata.pedig;
run;
ods csvall close;
Thanks in advance!
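One possible approach, shown here as an untested sketch: pull the variable names from DICTIONARY.COLUMNS into a macro variable, then write both header lines in the DATA _NULL_ step before the data rows. The ORDER BY VARNUM clause keeps the names in dataset order.

/* Sketch: build a comma-separated list of variable names */
proc sql noprint;
   select name into :varlist separated by ','
      from dictionary.columns
      where libname='REPDATA' and memname='PEDIG'
      order by varnum;
quit;

data _null_;
   file "R:\tsanche\REALM Data\Data\SAS Created Data\Data Repository\July 2024 CSV Upload Files\REALM Subject and Pedigree July 2024.csv" dsd;
   set repdata.pedig;
   if _n_ = 1 then do;
      put "&header1";   /* extra header line: ndar_subject,1 */
      put "&varlist";   /* variable names as the second header row */
   end;
   put (_all_)(+0);
run;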
Hi. I'm working with doses of limestone and studying soil pH. My model is a quadratic polynomial, and I made a scatter plot in PROC SGPLOT. But I want to know if I can put the equation inside the graph. How can I do that? I know how to do it in Excel, but I don't know if it's possible in SAS OnDemand. Below is the code I used. Thanks for the help.

proc sgplot data=mydata noautolegend;
   title height=11pt color=black "soil pH vs limestone";
   scatter y=pH x=DOSE;
   yaxis label="pH";
   xaxis min=0 max=1.92 values=(0 0.48 0.96 1.44 1.92) label="Limestone (t/ha)";
   reg y=pH x=DOSE / degree=2;
   keylegend / location=inside position=top across=5 down=3
               titleattrs=(weight=bold size=9pt) valueattrs=(color=black size=7pt);
run;
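One way to do this, sketched below under the assumption that the dataset and variable names match the code above: fit the quadratic with PROC REG (using a squared term created in a DATA step), build the equation text with CALL SYMPUTX, and display it with SGPLOT's INSET statement.

/* Sketch: create the squared term for the quadratic fit */
data work.quad;
   set mydata;
   dose2 = DOSE*DOSE;
run;

/* Fit the model and keep the coefficients */
proc reg data=work.quad outest=work.est noprint;
   model pH = DOSE dose2;
run;

/* Assemble the equation text as a macro variable */
data _null_;
   set work.est;
   call symputx('eq', cats('pH = ', put(Intercept, best8.),
                           ' + ', put(DOSE, best8.), '*DOSE',
                           ' + ', put(dose2, best8.), '*DOSE^2'));
run;

/* Display the equation inside the plot with INSET */
proc sgplot data=mydata noautolegend;
   scatter y=pH x=DOSE;
   reg y=pH x=DOSE / degree=2;
   inset "&eq" / position=topleft;
run;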
Hello,
I am building models in the SAS Visual Analytics (VA) interface. Is it possible to do an 80/20 train/test split in this interface, or do I need to use SAS Studio or VDMML pipelines to do that split? Thank you!
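For reference, one common way to create such a split in SAS Studio, sketched here with a hypothetical input table MYDATA: PROC SURVEYSELECT with the OUTALL option adds a Selected flag (1 = the 80% sample, 0 = the rest).

/* Sketch: flag an 80/20 split; MYDATA is a placeholder table name */
proc surveyselect data=mydata out=work.split
                  samprate=0.8 outall seed=12345;
run;
/* Selected=1 -> training rows (80%), Selected=0 -> test rows (20%) */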
Please, I need clarification on the code below. Thanks.
"data-type is enclosed in parentheses and specifies one of the following: CHARACTER (or CHAR) | VARCHAR | INTEGER (or INT)."
proc sql;
   create table work.discount
      (Destination char(3),
       BeginDate num format=date9.,
       EndDate num format=date9.,
       Discount num);
quit;
Based on the syntax description above, I expected the data-type to be enclosed in parentheses, e.g., Destination (char)(3). Please, what am I missing?
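For comparison, a small sketch of the forms PROC SQL accepts (table and column names here are invented): the parenthesized number is a width argument that follows the type keyword, not parentheses around the type itself.

proc sql;
   create table work.demo
      (code  char(3),       /* character column, width 3           */
       label varchar(20),   /* treated like CHAR in Base SAS       */
       qty   integer);      /* INTEGER maps to a SAS numeric       */
quit;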
Decision makers in cloud-savvy organisations continuously seek ways to reduce the unpredictability of the cost of consuming cloud services, while aiming to maximise their return on investment. Those investing in solutions based on the Lakehouse architecture, such as Databricks, anticipate cost savings and productivity gains from this unified storage solution for their cloud-migrated data. After all, Delta Lake storage, which underpins the Lakehouse architecture, combines the strengths of both Data Lakes and Data Warehouses, leveraging the inherent flexibility of the cloud.
However, what if certain business operations require processing the data in the Lakehouse using an application that is not co-located with it? Does this mean the data must be transferred to the remote application? That would introduce concerns about network latency, performance, and egress cost. Alternatively, should organisations consider replacing these remote applications with new, co-located ones? This approach also seems impractical, as it would be highly disruptive and counterproductive to the anticipated business objectives.
Some recently published blog posts by my colleagues on SAS Communities (see the reference links below) describe how SAS can harness the analytical power of data stored in Databricks. For example, Cecily Hoffritz discussed in her blog how the user-friendly interface of SAS Viya enables more users within an organisation to use the Databricks Lakehouse and participate in data analysis, decision-making, and innovation, regardless of their technical background. In another example, Patric Hamilton explained in his blog how SAS Data Quality can be applied to the Databricks Lakehouse for entity resolution, enhancing the effectiveness and accuracy of data-driven decisions. I will continue that theme in this blog and describe how organisations can extract even more value from their investment in Databricks through the "In-Database processing" capabilities of SAS.
As the name suggests, "SAS In-Database" processing allows processing to happen inside the database, using its resources much more efficiently and effectively. Examples of SAS In-Database processing with Databricks include implicit and explicit SQL pass-through, In-Database procedures, and In-Database model scoring. In-Database scoring for Databricks leverages the Massively Parallel Processing (MPP) architecture of the database and allows processing to happen locally at the database level; only the final result travels over the network, and only if it is required. The scoring logic remains in the SAS language and is executed by a lightweight SAS engine (the SAS Embedded Process for Spark) deployed within the Databricks cluster. Deployment of the SAS Embedded Process is a one-time task for the administrator, and a simplified view of the overall process could look as shown below.
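To make the pass-through idea concrete, here is a minimal sketch of explicit SQL pass-through, assuming SAS/ACCESS Interface to Spark is configured for the Databricks cluster; the connection values are placeholders (mirroring the caslib example later in this post), and the table and column names are invented.

/* Sketch: libname to Databricks (placeholder values) */
libname dbx spark platform=databricks
   server="myserver.azuredatabricks.net"
   user="token" password="authentication-token"
   httpPath="my-http-path" schema="my-schema";

/* Explicit pass-through: the inner query runs inside Databricks,
   and only its result set returns to SAS */
proc sql;
   connect using dbx;
   select * from connection to dbx
      (select product, sum(revenue) as total_revenue
         from sales
        group by product);   /* Spark SQL, evaluated in-database */
   disconnect from dbx;
quit;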
Once SAS Embedded Process has been deployed, users can develop models using SAS Studio (by writing SAS code), SAS Model Studio or SAS Intelligent Decisioning (by creating visual pipelines). SAS In-Database scoring supports the following platforms to publish and run the models in Databricks:
Databricks on AWS or on Azure
Spark Table (as of version 2023.10)
Amazon S3
Microsoft ADLS Gen2
Here is a list of steps and example code to publish a model to a Spark table and run it in Azure Databricks:
Step 1: Start a CAS session and assign a caslib to Spark
cas mysess;
caslib myspark datasource=(
srcType="spark",
platform="databricks",
server="myserver.azuredatabricks.net",
userName="token",
password="authentication-token",
clusterId="cluster-id",
jobManagementUrl="https://nnnnn.cloud.databricks.net/",
httpPath="my-http-path",
schema="my-schema"
);
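As an optional check (not part of the original steps), you can confirm the caslib assignment by listing its attributes:

caslib myspark list;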
Step 2: Publish a model to a Spark table
proc scoreaccel sessref=mysess;
publishmodel
exttype=databricks
caslib="myspark"
modelname="mymodel"
storefiles="/myfiles/mystore.ast"
programfile="/myfiles/myprogram.sas"
modeldatabase="mydatabase"
;
run;
quit;
Step 3: Start the SAS Embedded Process for Spark continuous session before you run the model
proc cas;
sparkEmbeddedProcess.startSparkEP
caslib="myspark";
run;
quit;
Step 4: Run the model
proc scoreaccel sessref=mysess;
runmodel
exttype=spark
caslib="myspark"
modelname="mymodel"
modeldatabase="mydatabase"
intable="mytable"
outtable="mytable_out";
run;
quit;
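The scored table from Step 4 lives in Databricks. As an optional follow-up sketch, reusing the caslib from Step 1 and the table names assumed above, you could load the result into CAS to inspect it:

proc casutil;
   load casdata="mytable_out" incaslib="myspark"
        outcaslib="myspark" casout="scored_results";
   contents casdata="scored_results" incaslib="myspark";
quit;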
Step 5: Stop the SAS Embedded Process for Spark continuous session
proc cas;
sparkEmbeddedProcess.stopSparkEP caslib="myspark";
run;
quit;
Conclusion
SAS In-Database processing enables organisations to extend the value of the data stored in a Databricks cluster by executing programs where the data resides, so that it is analysed without crossing the database boundary. This gives them the benefits of improved performance, productivity, and governance, while avoiding data egress costs.
Learn more about SAS and Databricks
Harness the analytical power of your Databricks platform with SAS
Data everywhere and anyhow! Gain insights from across the clouds with SAS
Elevated efficiency and reduced cost: SAS in the era of Cloud Adoption
SAS and Databricks: Your Practical Guide to Data Access and Analysis
Data to Databricks? No need to recode - get your existing SAS jobs to SAS Viya in the cloud
Maximize Coding and Data Freedom with SAS, Python and Databricks
Data Brilliance Unleashed: SAS Data Quality against Databricks - Precision, Performance, Perfection
Unlock Seamless Efficiency: SAS Viya's No-Code/Low-Code Experience to Democratize Databricks