I have re-installed SAS on my personal laptop. Previously, when I opened a new file, whether a SAS data file or a SAS program, each one opened in a separate window. Now, when I open any kind of SAS file, all the files are grouped under one window, within the process flow, as shown in the attached screenshot. This makes it quite difficult to view the files separately or side by side. Could anyone please help me resolve this issue? Thanks
**ADDENDUM to original post:** I realized that this issue was being caused by starting with a "RETAIN" statement, which I use to put the variables in the desired order. But I'd still like to leave this question up because I'd appreciate any feedback on: How does a RETAIN statement work? When does it affect the outputs of a command in a DATA step? Does anyone have alternate/preferred strategies for reordering the variables in a dataset? Thanks!

***********************************************************************

Original post:

Hello SAS community,

I'm very confused about how SAS deciphers "IF" statements in the DATA step. In this specific case, I'm working with an account dataset that has some conflicting information about when accounts close, and I am constructing an "effective" close date. Earlier in my data step, I used some IF statements to construct my desired close date. The last step is to convert that numeric close date to a string variable in the format YYYYMM. Here's what I tried:

DATA WORK.dates_test;
SET WORK.raw_dates;
close_eff_n = acct_close_dte_n;
IF closed = 1 AND acct_close_dte_n = . THEN DO;
close_eff_n = maxdate_n;
END;
*(omitting some additional logic used here for parsimony);
IF close_eff_n > 0 THEN DO;
close_dte_eff = put(close_eff_n,yymmn.);
END;
RUN;

I had earlier written this last segment as:

close_dte_eff = put(close_eff_n,yymmn.);

but this populated the string variable close_dte_eff with a value of "." when close_eff_n was missing, which is why I'm now trying to implement this conditional logic.

The problem is: where this condition fails, SAS populates the close_dte_eff field with whatever the last non-failed value was, which is completely incorrect. e.g.

I have:

close_eff_n
01MAR2023
01APR2023
.
.
01JUL2021

I want:

close_eff_n   close_dte_eff
01MAR2023     202303
01APR2023     202304
.
.
01JUL2021     202107

But instead I get:

close_eff_n   close_dte_eff
01MAR2023     202303
01APR2023     202304
.             202304
.             202304
01JUL2021     202107

When I tried to replicate this problem with a simplified dataset, i.e. just taking the final input variables and creating the desired output, I got the result I want, so I suspect it might have something to do with the preceding IF statements.

I can think of plenty of workarounds to get this to work as intended, so my question is not so much how to fix this, but why is this happening? There's something fundamental about how the IF statement is being processed where rows that fail the "IF" condition are being populated with the value of the last row that met that condition, and I would like to understand when SAS applies this behavior and when it does not. I can see this being a useful feature in some limited cases, but it's generally not what I would want to do when applying conditional logic. I had thought that these sorts of situations, where SAS operates on one row depending on what was in the previous row, only happen when there is a "BY" statement, but obviously that's incorrect as there is no "BY" statement in this DATA step.
I'd really appreciate some explanation as to when actions are applied to rows that do not meet the specified condition in an "IF" statement, and how to control that behavior, so I can make sure that the commands I write are applying to the rows that I expect them to apply to. Please let me know if I can provide any other context or information that would be helpful. Many thanks, Scott
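
On the addendum's question about RETAIN, here is a minimal sketch. It assumes the RETAIN statement listed close_dte_eff (the post does not show that statement) and uses a hypothetical five-row dataset, work.raw_dates_demo, built from the example rows. Variables created by assignment are normally reset to missing at the top of each DATA step iteration; naming one in RETAIN switches that reset off, so a row that skips the IF keeps the previous row's value. The second step orders the variables with a LENGTH statement instead, which is one possible alternative and leaves the reset in place. The yymmn6. width is my choice, made so the six-digit YYYYMM is explicit.

```sas
/* Hypothetical input mirroring the example rows in the post */
data work.raw_dates_demo;
    input close_eff_n :date9.;
    format close_eff_n date9.;
    datalines;
01MAR2023
01APR2023
.
.
01JUL2021
;
run;

/* Assumed form of the original step: RETAIN used only to set column order. */
/* close_dte_eff is no longer reset to missing, so rows that skip the IF    */
/* carry the previous row's value forward.                                  */
data work.with_retain;
    retain close_eff_n close_dte_eff;
    set work.raw_dates_demo;
    if close_eff_n > 0 then close_dte_eff = put(close_eff_n, yymmn6.);
run;

/* One alternative: LENGTH fixes the order without retaining anything.      */
/* close_dte_eff is reset each iteration and stays blank when the IF fails. */
data work.with_length;
    length close_eff_n 8 close_dte_eff $6;
    set work.raw_dates_demo;
    if close_eff_n > 0 then close_dte_eff = put(close_eff_n, yymmn6.);
run;
```

Comparing work.with_retain and work.with_length side by side (for example with PROC PRINT) should show the carried-forward values only in the first dataset.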
proc surveyreg data=XXX varmethod=brr(fay);
model score_m=ESCS ST004D01T IMMIG ;
weight w_fstuwt;
repweights w_fsturwt;
run;

I have used PROC SURVEYREG in SAS with the BRR (Fay) method for my analysis of the PISA 2022 survey data. I am now uncertain whether it estimates using OLS or GLS. In my thesis, I have stated that the estimation is conducted using the OLS estimator. I am concerned that this might be incorrect. Could anyone clarify whether the estimation method is indeed OLS or if it is GLS? xx
Hi:

I am working on a study. Per an agency's requirements, multiple imputation is planned for missing values related to the primary endpoint. I have read many materials online and have some ideas, but I am still not sure.

Study information: The study's indication is epilepsy. Each patient is given a diary and is supposed to record how many seizures they experienced each day during the study period (the DB period is ~85 days). As you can imagine, some patients may forget to record their seizures on some days, and some patients may discontinue from the study before Day 85. So there are missing values for seizure counts on some days for some subjects.

My questions:

1. Our missingness type is considered 'missing at random', so this procedure (PROC MI) can be used, correct?

2. Since our missing data is in only one variable, i.e., seizure count, I think I should NOT use the methods in the SAS documentation (Imputation Methods, Table 5) with 'monotone', correct?

3. Since our data is 'seizure count', which should follow a Poisson distribution (correct?), not a normal one, I should NOT use methods with 'MCMC', since the MCMC method is based on the assumption of a multivariate normal distribution (MVN) for the variables, correct?

4. Then I thought I should use FCS: fcs reg or fcs regpmm. The SAS documentation says: "The predictive mean matching method ensures that imputed values are plausible; it might be more appropriate than the regression method if the normality assumption is violated (Horton and Lipsitz 2001, p. 246)." So I thought I should use 'fcs regpmm'. I also tried 'fcs reg', but the imputed values are non-integers, i.e. numbers with decimals. That does not seem to fit my case: our seizure variable is a count, so it should be an integer. If I use 'fcs regpmm', the imputed values are integers.

5. If using 'fcs regpmm' is correct for my case, what value of 'k' (the SAS option used with 'fcs regpmm') should I pick?

Here is the code I use.
proc mi data = post nimpute = 25 out = post_mi seed = 54321 noprint;
by subjid;
var qsdy count;
fcs regpmm (/k = 5);
run;

Note: 'qsdy' is the study day variable; it runs from Day 1 to Day 85. 'count' is the seizure count for each day. There are missing values in this variable.

Note: since the imputation is by subjid, covariates such as age, treatment, etc., are not needed (they do not change within an individual), correct?

If any detailed information is needed for this discussion, please ask me. Thanks a lot in advance.

Xiaoshu
I'm trying to follow the code on this site, "Test for the equality of two proportions in SAS - The DO Loop", in the section called "A chi-square test for association in SAS". I basically need to compare the proportion that tested positive in one area to the proportion that tested positive in another area and see whether the two proportions are significantly different, but I can't get the code to work right. I get this error:
NOTE: Invalid data for N in line 79 1-6.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
79 CountyB Yes 71
Group=CountyA Seq=No N=. _ERROR_=1 _N_=2
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
My full code is:
data underfive;
length Group $15 Test $3;
input Group Test N;
datalines;
CountyA Yes 55
CountyA No 45027
CountyB Yes 71
CountyB No 311726;
Once I had that in I figured I would run this:
proc freq data=underfive order=data;
weight N;
tables Group*Test/chisq;
run;
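
For reference, here is a minimal sketch of the conventional DATALINES layout with the same four records. Each record sits on its own line, separated by plain spaces, and the terminating semicolon sits on a line by itself, because SAS does not read the line that contains the semicolon as data. Whether layout is what triggered the log note above is an assumption on my part; tabs or non-breaking spaces pasted from a web page may produce a similar "Invalid data" note, so retyping the spaces by hand is worth trying.

```sas
data underfive;
    length Group $15 Test $3;
    input Group $ Test $ N;
    datalines;
CountyA Yes 55
CountyA No 45027
CountyB Yes 71
CountyB No 311726
;
run;

/* Counts are supplied through WEIGHT; ORDER=DATA keeps the rows in input order */
proc freq data=underfive order=data;
    weight N;
    tables Group*Test / chisq;
run;
```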