I'm having difficulty writing a piece of SAS code and I'm looking for the most effective approach within SAS. I have a variable named 'type' (with values 1, 2, 3, 4) and client IDs. I want to create another variable called 'modified_type' that follows these rules: if a client ID is unique (appears only once) and has only 1 in the 'type' variable, then 'modified_type' should be set to 0. However, if the same client ID appears multiple times with 1 in the 'type' variable, 'modified_type' should be set to 1. If there is a 2 in the 'type' variable, then 'modified_type' should be set to 2, and so on. I tried using the retain statement, something like this: retain modified_type 0; if first.client_id then modified_type = 0; if type = '1' or type = '1st' then modified_type = 1; else if type = '2' or type = '2nd' then modified_type = 2; and so on, but this does not seem to retain 0 in modified_type, which for some reason shows as 1. I'm not sure of the best way to do this. I'm a newbie at SAS, so I would appreciate even a pointer in the right direction:
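The rules above can be sketched outside SAS to pin down the intended logic before translating it into a DATA step. This is a minimal Python sketch, not SAS code, and it rests on one reading of the rules that is an assumption: 'modified_type' equals the numeric part of 'type', except that a client ID appearing in exactly one row with type 1 gets 0. The function names `normalize_type` and `add_modified_type` and the sample rows are hypothetical.

```python
from collections import Counter


def normalize_type(value):
    """Map values such as '1', '1st', '2nd' or 2 to their integer code."""
    digits = "".join(ch for ch in str(value) if ch.isdigit())
    return int(digits)


def add_modified_type(rows):
    """rows is a list of (client_id, type) pairs.

    Returns (client_id, type, modified_type) triples under the assumed rule:
    a client ID seen once with type 1 gets 0, everything else keeps its type code.
    """
    # Count how many rows each client ID has (needs a full pass over the data first).
    counts = Counter(client_id for client_id, _ in rows)
    result = []
    for client_id, type_value in rows:
        code = normalize_type(type_value)
        if code == 1 and counts[client_id] == 1:
            modified = 0
        else:
            modified = code
        result.append((client_id, type_value, modified))
    return result


# Hypothetical sample data for illustration only.
sample = [("A", "1"), ("B", "1"), ("B", "1st"), ("C", "2nd")]
for row in add_modified_type(sample):
    print(row)
```

The point of the sketch is that the per-client row count has to be known before any row is classified, which is why a single pass that only resets a retained value at first.client_id may not be enough on its own.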
Hi all - How can I connect files from a web server to SAS Viya? I want to automatically capture the log files in .gz format and then process them (e.g., domain.com-2024-04-11.gz).
Hi all,
Please help me code this. The dataset is below.
I have 2 drugs being used for each ID, i.e., SGA and DM. I have multiple start dates and end dates for each ID and drug. I want to see whether DM_start falls between any SGA_start and SGA_end under each ID. For example, the DM_start of period 3 can fall between the SGA_start and SGA_end of period 1.
I also want to calculate the number of overlapping days for each time period, i.e., the number of days for each orange period.
Thank you all in advance,
Any help would be appreciated.
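The overlap check described above comes down to comparing interval endpoints. Below is a minimal Python sketch of that arithmetic (not SAS code). It assumes both start and end dates are inclusive, so a period that starts and ends on the same day counts as 1 day. The function names and all dates are hypothetical.

```python
from datetime import date


def overlap_days(sga_start, sga_end, dm_start, dm_end):
    """Number of days shared by two date intervals, with inclusive endpoints."""
    latest_start = max(sga_start, dm_start)
    earliest_end = min(sga_end, dm_end)
    # A negative span means the intervals do not touch, so clamp at 0.
    return max(0, (earliest_end - latest_start).days + 1)


def dm_start_in_any_sga(dm_start, sga_periods):
    """True if dm_start lies inside at least one (sga_start, sga_end) period."""
    return any(start <= dm_start <= end for start, end in sga_periods)


# Hypothetical periods for a single ID, for illustration only.
sga_periods = [(date(2020, 1, 1), date(2020, 1, 31)),
               (date(2020, 3, 1), date(2020, 3, 15))]
dm_start, dm_end = date(2020, 1, 20), date(2020, 2, 10)

print(dm_start_in_any_sga(dm_start, sga_periods))
print([overlap_days(s, e, dm_start, dm_end) for s, e in sga_periods])
```

Within each ID, every DM period would be compared against every SGA period, which in SAS is typically done by joining the two sets of periods on ID before applying the same min/max logic.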
I'm working on an actuarial project to estimate monthly probabilities that someone becomes disabled. A portfolio of persons (each having a different 'PolicyNr') is observed during 12 months, and the time until disability is registered by the variable 'TimetoDisability'. When no disability occurred during the 12 months, the variable 'RightCensored' has the value 1. We further have the variables 'Gender', 'AgeatDisability' (which equals the age at the disability, or the age after 12 months for the right censored observations), and the variable 'OccupationClass'. We have data available in a wide format. For example, the following lines are part of the data:

PolicyNr  TimetoDisability  RightCensored  Gender  AgeatDisability  OccupationClass
001       2 months          0              Male    40 year          1
002       3 months          0              Male    30 year          2
003       12 months         1              Female  42 year          1

Intuitively, I would model this using a Cox proportional hazard model, with the variable 'TimetoDisability' as the time until the occurrence of the disability, and 'Gender', 'AgeatDisability', and 'OccupationClass' as covariates. Monthly probabilities are derived from the survival function.

Now assume that, for practical/technical reasons, it is only possible to perform a GLM Binomial regression. I read that performing a GLM Binomial regression on data with pseudo observations is analogous to a Cox Discrete Time Survival model. To prepare the analysis, I transform the wide dataset to a long dataset (with pseudo observations, see, e.g., https://grodri.github.io/glms/notes/c7s6), in which each line is duplicated according to the variable 'TimetoDisability'. For example, the first line from the table above is transformed to 2 lines, as it took 2 months to become disabled. The last line has a value 1 for the variable 'Disability', as the disability occurred in month 2. The variable 'AgeatDisability' is transformed into the variable 'Age', now representing the age during that month.
The right censored observation is transformed into 12 lines, all having the value zero for the variable 'Disability', as the disability is not observed. This becomes:

PolicyNr  Duration  Disability  Gender  Age                OccupationClass
001       1         0           Male    39 year 11 months  1
001       2         1           Male    40 year            1
002       1         0           Male    29 year 10 months  2
002       2         0           Male    29 year 11 months  2
002       3         1           Male    30 year            2
003       1         0           Female  41 year 1 months   1
003       2         0           Female  41 year 2 months   1
003       3         0           Female  41 year 3 months   1
003       4         0           Female  41 year 4 months   1
003       5         0           Female  41 year 5 months   1
003       6         0           Female  41 year 6 months   1
003       7         0           Female  41 year 7 months   1
003       8         0           Female  41 year 8 months   1
003       9         0           Female  41 year 9 months   1
003       10        0           Female  41 year 10 months  1
003       11        0           Female  41 year 11 months  1
003       12        0           Female  42 year            1

Question: In this long data format, the multiple rows (pseudo observations) for each person are not independent. We have repeated measures for each person. However, I read in Therneau and Grambsch: (quote) "One concern that often arises is that observations [on the same individual] are "correlated," and would thus not be handled by standard methods. This is not actually an issue. The internal computations for a Cox model have a term for each unique death or event time..." So for a Cox Discrete Time Survival model, the dependency is not an issue. However, I don't see why the dependency in the data is not an issue for a GLM Binomial regression. Given the dependency in the data, is it appropriate to perform a GLM to get trustworthy estimates of monthly probabilities? Or should I go for a mixed effect model? Thank you.
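The wide-to-long transformation described above can be sketched as follows. This is a minimal Python illustration (not SAS code) that only reproduces the expansion into pseudo observations; it does not address the dependency question. Age is carried in whole months, which is an assumption made for simplicity, and the function name `to_person_period` and the output key `AgeMonths` are hypothetical.

```python
def to_person_period(policy_nr, time_to_disability, right_censored, age_at_end_years):
    """Expand one wide record into one row per observed month.

    Disability is 1 only in the final month of a non-censored record.
    AgeMonths counts back from the age at the end of observation.
    """
    rows = []
    for month in range(1, time_to_disability + 1):
        is_last_month = month == time_to_disability
        disability = 1 if (is_last_month and right_censored == 0) else 0
        age_months = age_at_end_years * 12 - (time_to_disability - month)
        rows.append({
            "PolicyNr": policy_nr,
            "Duration": month,
            "Disability": disability,
            "AgeMonths": age_months,
        })
    return rows


# The three example policies from the wide table above.
wide = [("001", 2, 0, 40), ("002", 3, 0, 30), ("003", 12, 1, 42)]
long_rows = [row for record in wide for row in to_person_period(*record)]

for row in long_rows:
    years, months = divmod(row["AgeMonths"], 12)
    print(row["PolicyNr"], row["Duration"], row["Disability"], years, months)
```

Gender and OccupationClass are constant within a policy, so they can simply be copied onto every generated row.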