Recently in the SAS Community Library: SAS' @Sundaresh1 highlights a sometimes overlooked task when applying document embeddings for purposes of similarity-based search. Normalisation of vectors helps obtain relevant matches.
Hi all - ¿How can I connect files from a web server to sas viya? I want to automatically capture the log files in .gz, and then process them (ex: domain.com-2024-04-11.gz).
... View more
Hello! Using a macro, I am producing a series of crosstabs using several demographic variables (sex, age, income, etc.). Some of these variables have formats I created using PROC FORMAT. When I use PROC SURVEYFREQ, the ods output provides a table with a column that contains the values (say sex contains values 1 or 2) and another column with the formatted value (say F_sex contains Male or Female). I use the formatted columns and rename them Demographic. I then append the files created with ods output and I obtain a column named Demographic that contains the formatted values (ex. Male, Female, 18 to 30, 31 to 40, et.). However, when I do the same with PROC SURVEYMEANS, the ods output does not include columns with formatted values only columns with the actual values such as 0 and 2 and not Male and Female). Is there a way to force ods output to include the formatted values? Thanks in advance for your help! * Both PROC steps are part of a macro and &DemoVar contains the demographic variable ;
proc surveyfreq data=Mydataset VARHEADER=LABEL ;
tables year * &DemoVar * fg_use_choose
year * &DemoVar * fg_use_amt
year * &DemoVar * fg_use_assess
year * &DemoVar * fg_use_plan
year * &DemoVar * fg_use_wt
year * &DemoVar * fg_use_away / nostd row ;
weight wght ;
format income_adeq_G income_adeq_G_F.
fg_use_choose
fg_use_amt
fg_use_assess
fg_use_plan
fg_use_wt
fg_use_away yesno_F. ;
ods output CrossTabs=Results_Info ;
run ;
***********************************************;
proc surveymeans data=Mydataset mean VARHEADER=LABEL plots = none ;
var FG_trust ;
domain year * &DemoVar year ;
weight wght ;
format income_adeq_G income_adeq_G_F. ;
ods output Domain=Results_Trust ;
run ;
Erik
... View more
Hi, I have some code that sums up monthly values and assigns each month as a new column. The problem is the code is ugly and I feel like there should be a better way to do this. Here is the code:
PROC SQL;
CREATE TABLE TMP1DAY.MONTHLY_SUMMARY AS
SELECT t1.ID_VARIABLE,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201901" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201902", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201901,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201902" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201903", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201902,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201903" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201904", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201903,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201904" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201905", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201904,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201905" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201906", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201905,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201906" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201907", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201906,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201907" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201908", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201907,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201908" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201909", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201908,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201909" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201910", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201909,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201910" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201911", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201910,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201911" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201912", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201911,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201912" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "202001", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201912_01,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201912" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "202002", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201912_02,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202001" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202002", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202001,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202002" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202003", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202002,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202003" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202004", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202003,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202004" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202005", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202004,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202005" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202006", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202005,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202006" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202007", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202006,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202007" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202008", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202007,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202008" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202009", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202008,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202009" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202010", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202009,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202010" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202011", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202010,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202011" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202012", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202011,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202012" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202101", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202012_01,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202012" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202102", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202012_02
FROM WORK.PRE_MONTHLY_SUMMARY t1
GROUP BY 1;
QUIT;
How would I instead write this to do the same thing but programmatically/with macro variables? its already out of hand and I will need to add more years which will just make it even worse. Any and all advice is appreciated.
Thanks in advance!
... View more
Hello! I have a table that I have to export to excel using a proc report, where subtotals rows are computed for certain numeric columns. My problem is that in the table(so before the proc report) I also created a check variable (like if col A + col B = col C then "true" else "false") that need to be calculated only for the subtotal rows. Is ther a way to write this in a proc report
... View more
Hi everyone - a bit newer to SAS and running into a problem trying to adapt someone's older code. Essentially I have a character variable with responses, and am coding in skips to it. e.g. Add skip coded 777 into the open text variable OPENTEXT1 if responses are blank OPENTEXT1 has character responses maxing out at 161 characters, proc contents show format=161 informat=$161. Check responses with proc freq, all 777 and responses are normal Then I create a "label" using the proc format function with value to label the skip and keep responses proc format; value OPENTEXT '777'="Skip"; run; data dataset1; set dataset; format OPENTEXT $OPENTEXT. proc freq, this time 777 show "SKIP" but other responses truncated to 4 characters I checked proc contents, and it is still showing format and informat still remaining at the same length. Any help appreciated
... View more