Recently in the SAS Community Library: SAS' @Sundaresh1 highlights a sometimes-overlooked task when using document embeddings for similarity-based search: normalising the vectors, which helps return relevant matches.
I have a brand new installation of SAS 9.4 M8. When I write code to produce a plot in PROC POWER (the code is correct per the reference I am using, which shows the accompanying visualization), I receive the following error:
ERROR: Java virtual machine exception. java.lang.NoClassDefFoundError: Could not initialize class com.sas.graphics.applets.statgraph.sgchart.data.DataModel
Has anyone found a workaround for this?
When I subset a dataset in a DATA step, SAS continues to run for much longer than expected, so long in fact that I have never seen it finish. However, when I break the run and cancel the submitted statements, the log indicates that 1,271 observations were read, which is the number of observations I expect in the subset. Why does SAS keep running when all of the observations that match the WHERE condition have been read? In the DATA step I use a WHERE statement to subset for observations where the character variable SUB = '123'. The dataset is large (1.3M+ obs.), but as I mentioned, the resulting dataset "filtered_items" should only have 1,271 observations.
libname corpxin "\\filepath\folder";
data filtered_items;
set corpxin.items_202001;
where SUB = '123';
run;
When I converted the dataset back with %xpt2loc, I found that the XPT file created in V8 format via %loc2xpt was missing the dataset labels. Does anyone know why? Thanks!
Hello, I am getting the following error messages when trying to merge two datasets. One of the datasets comes from a csv file, so maybe the issue is there? I was trying to specify the length of the PID variable for the redcap_sort dataset from the redcap one, which is the one we got from the csv file. However, I keep getting messages that the variable has multiple lengths, and it keeps truncating the data. Any PID after 999 gets shortened, so 1000 and 1001 become 100, 1010 becomes 101, etc. Any help or a nudge in the right direction would be greatly appreciated, thank you so much.
Edit: The programming with the csv file already has:
data work.redcap;
%let _EFIERR_ = 0;
infile &csv_file delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=1 ;
informat pid $500. ;
informat pid_ini $500. ;
and the code for format:
format pid $500. ;
It has this for all the variables. I thought the above code would make it so that the variables would have that limit of 500 characters?
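The truncation described above (1000 becoming 100) is what happens when the other dataset in the MERGE stores PID with a length of 3 and is listed first, because the length of a variable in the output is taken from the first place the DATA step encounters it. A minimal sketch of one common remedy follows: declare the length before the MERGE statement. It assumes PID is the BY variable; the dataset names `other_sort` and `merged` and the sample values are hypothetical stand-ins, not taken from the post.

```sas
/* Hypothetical stand-in for the dataset that stores pid as $3 */
data other_sort;
    length pid $3;
    input pid $;
    datalines;
999
;

/* Hypothetical stand-in for the csv-derived dataset, pid stored as $500 */
data redcap_sort;
    length pid $500;
    input pid $;
    datalines;
1000
1001
;

data merged;
    length pid $500;               /* declared before MERGE, so pid is not cut to 3 characters */
    merge other_sort redcap_sort;  /* both inputs are already sorted by pid */
    by pid;
run;
```

Without the LENGTH statement in the last step, `pid` would inherit the length of 3 from `other_sort`, regardless of the informat and format attached to it in `work.redcap`.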
Hi, I would like to apply two types of floors, one to the lower segment and one to the upper segment of a dataset. Assume the dataset has two columns: Col1, which is unique and sorted in ascending order, and Col2, the actual data the floors apply to.
1) Floor 1 on the lower band of Col1: for any Col1 value less than 3 (a threshold set by the user), replace the Col2 value with the Col2 value at Col1 = 3.
2) Floor 2 on the upper band of Col1: for any Col1 value greater than 4 (also set by the user), replace the Col2 value with the Col2 value of the previous Col1, to ensure the values do not decrease as Col1 increases.
Below is an example of the dataset I have and what I want. Many thanks in advance.
data have;
input Col1 Col2;
datalines;
1 1
2 2
3 3
4 4
5 3
6 6
7 1
8 3
;
data want;
Col1 Col2
1 3
2 3
3 3
4 4
5 4
6 6
7 6
8 6
;
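One possible way to express the two floors described above is sketched here, not as a definitive solution. The macro variable names `lower_cut`, `upper_cut`, and `lower_val` and the helper variable `running_max` are hypothetical. The sketch assumes `have` is sorted by Col1 and contains exactly one row where Col1 equals the lower threshold.

```sas
%let lower_cut = 3;  /* user-set lower threshold (hypothetical name) */
%let upper_cut = 4;  /* user-set upper threshold (hypothetical name) */

data have;
    input Col1 Col2;
    datalines;
1 1
2 2
3 3
4 4
5 3
6 6
7 1
8 3
;

/* Look up the Col2 value at the lower threshold and store it in a macro variable */
proc sql noprint;
    select Col2 into :lower_val trimmed
    from have
    where Col1 = &lower_cut;
quit;

data want;
    set have;
    retain running_max;             /* Col2 value carried forward from the previous row */
    if Col1 < &lower_cut then Col2 = &lower_val;                    /* Floor 1 */
    else if Col1 > &upper_cut then Col2 = max(Col2, running_max);   /* Floor 2 */
    running_max = Col2;
    drop running_max;
run;
```

Traced by hand against the sample data, this yields Col2 values of 3, 3, 3, 4, 4, 6, 6, 6, matching the desired output in the post. The running maximum is what keeps Col2 from decreasing above the upper threshold.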