Recently in the SAS Community Library: SAS' @StuartRogers provides a close look at the new Microsoft Entra Gallery application and details how it can be used.
I need to combine same clinic names if the corresponding prior_span =1.
For example, if Clinic_name1 and Clinic_name2 where both Clinic A and Prior_span1 and prior_span2 both =1 then Clinic would = A.
Clinic_name_1 corresponds with Prior_span1
Clinic_name_2 corresponds with Prior_span2
Clinic_name_3 corresponds with Prior_span3
Below is my sample data.
Thanks!
Data Have;
input
ID$ Clinic_Name_1$ Clinic_Name_2$ Clinic_Name_3$ Prior_Span1$ Prior_Span2$ Prior_Span3$ Cost service$ ;
datalines;
1 B A . 0 0 0 168 A
2 A A . 1 0 0 46 B
2 A A . 0 1 0 99 C
3 B . . 0 0 0 24 D
4 D . . 0 0 0 54 A
5 B D F 0 1 0 405 B
7 A . . 1 0 0 90 F
8 E E A 0 0 1 87 J
9 F . . 0 0 0 0 G
10 C C C 0 0 0 40 C
10 C C C 0 0 1 374 B
10 C C C 0 1 1 60 A
;
run;
***Summary table-clinic var not created yet****;
proc sql;
create table want as
select 1 as ID1,'service' as a length=40,Clinic,
1 as ID2,put(service, $40.) as b length=40,
1 as ID3,(select count(distinct ID) from have where Clinic=a.Clinic and service=a.service and cost>0 ) as c format=comma20.,
1 as ID4,calculated c/(select count(distinct ID) from have where Clinic=a.Clinic and cost>0 ) as d format=percent8.1,
1 as ID5,sum(cost) as e format=dollar20.,
1 as ID6,sum(cost)/(select sum(cost) from have where Clinic=a.Clinic) as f format=percent8.1
from have as a
group by Clinic,service
union all
select 7 as id1,'09'x as a,Clinic,
7 as id2,'Total Clients' as b,
7 as id3,count(distinct ID) as f,
7 as id4,.,
7 as id5,.,
7 as id7,.
from have
where service ne 'j' and cost>0
group by Clinic
union all
select 7 as id1,'09'x as a,Clinic,
7 as id2,' Total Cost' as f,
7 as id3,. as c,
7 as id4,.,
7 as id5,sum(cost),
7 as id7,.
from have
where service ne 'j' and cost>0
group by Clinic
;quit;
options missing=' ';
proc report data=want nowd;
column id1 a b Clinic,(c d e f);
define id1/group noprint;
define a/group noprint ' ';
define b/group ' ';
define c/analysis 'Clients';
define d/analysis '% of Clients';
define e/analysis 'Cost';
define f/analysis '% of Cost';
define Clinic/across '';
compute before a/style={just=l color=black};
if id1=1 then len=0;
else len=10;
x=' ';
line x $varying10. len;
line a $40.;
endcomp;
run;
... View more
Hi, tsmk is my time-dependent covariate, deathstatus_inhance is censor, and stime_inhance is time. The "Change step" worked well, but the "Count step" only worked for four observations and stopped. Please help me to correct my code. DATA analysis.change;
SET analysis.smk_stime2;
ARRAY tsmk_(*) tsmk_1-tsmk_5; *call in the time-varying smoking variables;
ARRAY chng(5); *the new indicator variables;
t=1; *initialize the position variable for the indicator variables;
DO i = 2 TO 5;
IF tsmk_(i) NE tsmk_(i-1) THEN DO; *detects whether there is a change in smoking status;
chng(t) = i-1; *assigns the last year the status remained constant;
t=t+1;
END;
END;
RUN;
DATA analysis.count;
SET analysis.change;
ARRAY tsmk_(*) tsmk_1-tsmk_5; /* call in the time-varying smoking variables */
ARRAY chng(*) chng1-chng5; /* call in the indicator variables */
start = 0; /* initialize the beginning time for the study */
censor2 = 0; /* initialize the new censor variable */
t = 1; /* initialize the position variable for the indicator variables (chng1-chng5) */
DO i=1 TO stime_inhance; /* makes sure we only output the records that smoking status remains constant */
IF (chng(t) > . and chng(t) < stime_inhance) or i = stime_inhance THEN do;
/* assign the value of smoking status */
IF chng(t) > . THEN smoking_status = tsmk_(chng(t));
ELSE smoking_status = tsmk_(stime_inhance); /* assign the end time */
stop = min(chng(t), stime_inhance); /* assign the value of the censor variable */
IF i = stime_inhance THEN censor2 = deathstatus_inhance; /* assign the new start time */
IF t > 1 THEN start = chng(t-1); /* move the position variable */
t = t + 1;
OUTPUT; /* output the record to the new dataset */
end;
END;
RUN;
... View more
I have been looking at this code syntax for hours, the Teradata code runs in a different environment, but bringing it through Enterprise Guide is giving me issues. The error I am getting is: ERROR: Teradata execute: Syntax error: expected something between ')' and ';'. I can't seem to figure out what the issue is. I appreciate any help or insight on this.
PROC SQL NOERRORSTOP;
CONNECT TO TERADATA (AUTHDOMAIN=xx server="xxx" connection=global mode=teradata);
execute (drop table TestServer.Test3) BY Teradata;
execute (create table TestServer.Test3 as
(
select *
from TestServer.Test1
UNION ALL
select *
from TestServer.Test2
))BY Teradata;
quit;
... View more
Hi SAS community, I am a novice SAS user. I would like to analyze a dataset from an experiment. I have tree samples from two trials conducted at different sites. Each site was divided into several blocks, and tree seedlings from 4 origins were randomly planted within each black. 50+ years later, two blocks from each site were selected and 3 trees from each of the origins were harvested. Then each tree was "cut" into wood products using 3 different cutting methods (using a simulation software). I would like to test if there are differences in product recovery between the two sites, among the origins, and among the cutting methods. 1) I am not interested in blocks, so the block factor will be a random variable. Should I consider trees from each origin as subsampling or as another random factor, or maybe a replication? 2) The analysis I have done so far assumed the trees were randomly select from each origin within each block. What if the trees were randomly selected not at the origin level, but at the block level to cover the full range of tree sizes within the block (tree size have huge effect on recovery)? If this were the case, should I drop origin in the model, because its levels were not considered in the sampling? Here is what I have so far. I am not confident if this is a correct analysis and will be very appreciate if anyone could point me to a correct way to analyze the data (attached). PROC GLIMMIX data=data; class site origin block cut tree; model recovery = site*origin*cut/ddfm=kr; random block site*origin*cut*block; run; Thank you very much. Tess
... View more
Hi, I have some code that sums up monthly values and assigns each month as a new column. The problem is the code is ugly and I feel like there should be a better way to do this. Here is the code:
PROC SQL;
CREATE TABLE TMP1DAY.MONTHLY_SUMMARY AS
SELECT t1.ID_VARIABLE,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201901" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201902", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201901,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201902" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201903", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201902,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201903" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201904", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201903,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201904" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201905", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201904,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201905" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201906", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201905,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201906" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201907", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201906,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201907" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201908", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201907,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201908" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201909", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201908,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201909" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201910", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201909,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201910" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201911", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201910,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201911" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201912", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201911,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201912" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "202001", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201912_01,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201912" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "202002", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201912_02,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202001" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202002", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202001,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202002" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202003", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202002,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202003" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202004", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202003,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202004" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202005", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202004,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202005" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202006", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202005,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202006" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202007", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202006,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202007" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202008", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202007,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202008" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202009", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202008,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202009" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202010", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202009,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202010" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202011", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202010,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202011" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202012", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202011,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202012" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202101", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202012_01,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202012" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202102", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202012_02
FROM WORK.PRE_MONTHLY_SUMMARY t1
GROUP BY 1;
QUIT;
How would I instead write this to do the same thing but programmatically/with macro variables? its already out of hand and I will need to add more years which will just make it even worse. Any and all advice is appreciated.
Thanks in advance!
... View more