Recently in the SAS Community Library: SAS' @Sundaresh1 highlights a sometimes overlooked task when applying document embeddings for purposes of similarity-based search. Normalisation of vectors helps obtain relevant matches.
Hi, I am new to SAS and I am trying to input data from a .txt file. Could someone please help me figure out why the IF-THEN statement only applies to the first row of the data? Thanks!
Data:
DATA work.condo_ranch;
INFILE '/home/u63748921/SAS 123 Problems/sas123_p13.txt' DSD;
INPUT style $ @;
IF style = 'RANCH' OR style = 'CONDO' THEN INPUT sqfeet bedrooms baths street $ price : dollar10.;
RUN;
Output:
As I recently posted here, I'm trying to create Table 1 for a research journal article. @Reeza pointed me to the Table 1 macro "TableN" and it looks like exactly what I need.
I downloaded the macro and ran it, and then I tried to call it based on the example provided.
/*Example macro call*/
/*%tablen(data=example, by=arm,
var=age date_on sex race1 smoke_st num_met,
type=1 3 2, outdoc=~/ibm/example1.rtf);*/
/*My version*/
%tablen(data=have, by=TM_group,
var=DEM_AGE DEM_SEX,
type=2 2, outdoc="C:\data_output\test.rtf);
But I got an error after "by TM_group": ERROR: All positional parameters must precede keyword parameters. I looked up the error and it has something to do with the commas between the parameters. After some trial and error of deleting commas, I got it to run without error, but it doesn't produce any output.
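One thing to check in the call above, as a sketch rather than a confirmed fix: the outdoc= value opens a double quote that is never closed, so the macro processor may not find the end of the call where it is expected. The example shipped with the macro passes its path unquoted, so the version below does the same. That this resolves the positional-parameter error or the missing output is an assumption.

```sas
/* Sketch only: same call as above with the unbalanced quote removed. */
/* Assumes %tablen accepts an unquoted path for outdoc=, as in the    */
/* example macro call provided with the macro.                        */
%tablen(data=have, by=TM_group,
        var=DEM_AGE DEM_SEX,
        type=2 2, outdoc=C:\data_output\test.rtf);
```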
Sample data is provided below:
data have;
infile datalines dsd dlm=',' truncover;
input DEM_AGE DEM_SEX cohort_flag TM_group;
datalines;
3,1,1,0
2,1,1,1
3,2,1,1
3,2,1,1
3,2,1,0
2,2,1,1
2,1,1,1
3,1,1,1
2,1,1,1
3,2,1,0
2,1,1,0
2,2,1,1
3,2,1,0
2,2,0,
3,2,1,1
3,2,1,1
3,1,1,0
3,2,1,0
2,1,1,0
3,1,1,1
3,2,1,1
3,2,1,0
3,2,1,1
3,2,1,1
3,2,1,1
; RUN;
proc format library=temp;
value age2grp
1='1:Age Group <65'
2='2:Age Group [65,75)'
3='3:Age Group >=75'
.='Inapplicable/Missing';
value sex
.='Inapplicable/Missing'
1='1:Male'
2='2:Female';
value yesfmt
1='1:Yes'
2='2:No'
.='Inapplicable/Missing'
;
RUN;
Hi everyone, I can see all the markers in the plot, but the corresponding admission dates on the x-axis are not fully visible. How can I adjust the plot so that every admission date is shown?
proc shewhart data=VV;
xchart LOS*admission_date / Markers outtable=outtable nohlabel TOTPANELS=1;
run;
proc print;
run;
Thanks,
I've been looking for a good, efficient method to create "Table 1", which has a more or less standard format in research journals:
Variable Name    Category 1      Category 2      P-value
Variable 1
  1              N(%)            N(%)            Chi-sq or Fisher's
  2              N(%)            N(%)
  3              N(%)            N(%)
Variable 2       Mean (SD)       Mean (SD)       t-test
                 Median (IQR)    Median (IQR)
I found this PDF paper. It seems to work pretty well, but I'm having 2 issues. The first is that the chi-square results are saved into a file and then merged into the main file. But the value of the "variable" variable is the var label in 1 file and the var name in the other, so they don't merge.
The second issue is that PROC REPORT does not produce a table and instead gives me this:
NOTE: Groups are not created because the usage of levels is DISPLAY. To avoid this note, change all GROUP variables to ORDER variables.
WARNING: A GROUP, ORDER, or ACROSS variable is missing on every observation.
data have;
infile datalines dsd dlm=',' truncover;
input DEM_AGE DEM_SEX cohort_flag TM_group;
datalines;
3,1,1,0
2,1,1,1
3,2,1,1
3,2,1,1
3,2,1,0
2,2,1,1
2,1,1,1
3,1,1,1
2,1,1,1
3,2,1,0
2,1,1,0
2,2,1,1
3,2,1,0
2,2,0,
3,2,1,1
3,2,1,1
3,1,1,0
3,2,1,0
2,1,1,0
3,1,1,1
3,2,1,1
3,2,1,0
3,2,1,1
3,2,1,1
3,2,1,1
; RUN;
/*Load formats from existing file in temp folder*/
options fmtsearch=(temp.formats);
/*DEM_SEX sex 1 Male 2 Female*/
/*DEM_AGE AGE2GRP 1:Age Group <65 2:Age Group [65,75) 3:Age Group >=75*/
/*TM_group yesno 1 Yes 2 No*/
/*Generate descriptive statistics*/
proc means data = temp.have noprint n sum mean;
class DEM_AGE DEM_SEX;
var TM_group /*MA_group*/;
ways 1;
output out = temp.expl_PreTable n =
sum =
mean = / autoname;
WHERE cohort_flag = 1;
run;
/*Format descriptive stats*/
data temp.expl_Table (keep = variable levels TM_group_N TM_group_sum
TM_group_mean pct ExpPct indexvar); set temp.expl_PreTable;
length variable $ 20; /* These four variables */;
length levels $ 20; /* will describe the first */;
length pct $ 8; /* four columns of the table */;
length ExpPct $ 15;
if DEM_AGE ne . then do; /*Building "variable" and "Levels" columns for "DEM_AGE"*/;
variable = 'Age category';
levels = put(DEM_AGE, age2grp.);
IndexVar = 1; /*This index is included just in case the order of data presentation needs to be changed*/;
end;
if DEM_SEX ne . then do; /*Building "variable" and "Levels" columns for "DEM_SEX"*/;
variable = 'Sex';
levels = put(DEM_SEX,sex.);
IndexVar = 2;
end;
pct = put(TM_group_mean*100,4.1); /*Calculate % exposed */;
ExpPct = compress(put(TM_group_sum,comma4.),' ')
||' '||'('||compress(pct,' ')||')'; /*creating data in the form of "count (%)" */;
run;
/*Run chi-square significance tests and use ODS to create a dataset of these results*/
ods trace on;
ods output chisq = temp.expl_ChiData;
proc freq data = temp.have;
table TM_group*DEM_AGE / chisq;
table TM_group*DEM_SEX / chisq;
WHERE cohort_flag = 1;
run;
ods trace off;
/*Rearrange chi-square dataset so it can be merged with descriptive stats table*/
data temp.expl_ChiData2 (keep = variable prob);
set temp.expl_ChiData (where = (statistic = 'Chi-Square'));
length variable $ 20;
variable = scan(table,-1,' '); /* Returns the last word in a character value from the "table" variable*/
run;
/*Sort both tables so they will merge*/
PROC SORT data=temp.expl_Table OUT=temp.expl_Table_sort; BY variable; RUN;
PROC SORT data=temp.expl_ChiData2 OUT=temp.expl_ChiData2_sort; BY variable; RUN;
/**************MERGE DOES NOT WORK BECAUSE The value of the "variable" variable is the var label in 1 file
and the var name in the other. DUE TO THE 'IF a' STATEMENT, NOTHING FROM THE CHIDATA2 FILE IS MERGED IN****/
/*Merge descriptive stats table with chi-square table*/
DATA temp.expl_TableData;
MERGE temp.expl_Table_sort (in = a) temp.expl_ChiData2_sort (in = b);
BY variable;
IF a;
RUN;
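One way the key mismatch described above might be resolved, sketched under assumptions: the Table column of the ODS ChiSq output is assumed to hold strings such as "Table TM_group * DEM_AGE", so scan(table,-1,' ') returns the variable name. The step below would replace the earlier temp.expl_ChiData2 step and run before the two sorts, recoding each name to the label used in temp.expl_Table so that BY variable can match. The name-to-label mapping is hand-written for illustration and is not part of the referenced paper.

```sas
/* Sketch only: recode variable names from the chi-square output to   */
/* the labels used in the descriptive table, so the merge keys agree. */
data temp.expl_ChiData2 (keep = variable prob);
  set temp.expl_ChiData (where = (statistic = 'Chi-Square'));
  length variable $ 20;
  /* Last word of "table" is assumed to be the column variable name */
  select (upcase(scan(table, -1, ' ')));
    when ('DEM_AGE') variable = 'Age category';
    when ('DEM_SEX') variable = 'Sex';
    otherwise        variable = scan(table, -1, ' ');
  end;
run;
```

After rerunning the two PROC SORT steps and the MERGE, the prob column would be expected to fill in for rows whose label matches one of the WHEN values.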
/*Use PROC REPORT to create final output table*/
proc report data = temp.expl_TableData nowd;
column variable levels TM_group_N ExpPct prob;
define variable / "Variable" group format = $variable.;
define levels / " " ;
define TM_group_N / "TM" /*format = comma5.*/;
define ExpPct / "TM Group/n (%)";
define prob / "p-value" group format = pvalue6.4;
Title "Table 1. Descriptive characteristics of individuals in the sample";
RUN;
Join MSUG for their 1-Day SAS Conference!
Date: Wednesday, June 12, 2024
Time: 8:00 AM - 4:30 PM
Place: VisTaTech Center, Schoolcraft College, 18600 Haggerty Rd, Livonia, MI 48152
Cost: $50 on or before May 28, 2024; $95 after May 28, 2024; $10 for students with proof of student status.
Register Now!
Agenda
Know Thy Data: Techniques for Data Exploration - Charu Shankar, SAS
Bayesian Time Series in PROC MCMC - Danny Modlin, SAS
Introduction to Data Simulation - Jason Brinkley, Abt Associates Inc.
SAS HPSPLIT: A Powerful Machine Learning Tool - Russ Lavery, Independent Consultant
NHANES Dietary Supplement Component: A Parallel Programming Project - Jay Iyengar, Data System Consultants, LLC
Being a Statistical Expert Witness - David Corliss, Grafham Analytics
Binning Procedures for Logistic Regression - Bruce Lund, Independent Consultant
Confessions of a PROC SQL Instructor - Charu Shankar, SAS
Missing Data in PROC MCMC - Danny Modlin, SAS
Regression Models for Count Data - Jason Brinkley, Abt Associates Inc.
An Animated Introduction to Git and GitHub - Russ Lavery, Independent Consultant
SAS Job Searching and Interviewing Tips – Strategies in the Post-Pandemic Era - Jay Iyengar, Data System Consultants, LLC
They are also planning to hold the following training classes before and after the conference. Cost is $185 for a half-day class and $370 for a full-day class. All classes will be held at the VisTaTech Center at Schoolcraft College (18600 Haggerty Rd, Livonia, MI 48152). Click on the titles for the course descriptions.
Tuesday, June 11
8:00 AM - 5:00 PM: SAS Macros in Cartoons: Complex Stuff Made Easy! - Russ Lavery, Independent Consultant
Thursday, June 13
8:00 AM - 12:00 PM: An Overview of Multivariate Statistical Analysis of Quantitative Data (PCA, FA, and Clustering) - Jason Brinkley, Abt Associates Inc.
1:00 - 5:00 PM: An Overview of Causal Inference, Counterfactual Data Analysis, and Propensity Score Methods - Jason Brinkley, Abt Associates Inc.