Hi all,
Please help me code this. The dataset is below.
I have 2 drugs being used for each ID, i.e., SGA and DM. I have multiple start and end dates for each ID and drug. I want to check whether DM_start falls between any SGA_start and SGA_end within each ID. For example, the DM_start of period 3 can fall between the SGA_start and SGA_end of period 1.
I also want to calculate the number of overlapping days for each time period, i.e., the number of days in each orange period.
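To make the request concrete, here is a minimal sketch of the kind of logic I have in mind. It rests on assumptions that are not in the dataset above: the periods sit in two hypothetical tables named sga and dm (one row per ID and period), DM periods also have an end date (dm_end), the sample dates are made up, and both the first and last day of a period count toward the overlap.

```sas
/* Hypothetical sample data: one row per ID and SGA period */
data sga;
  input id sga_start :date9. sga_end :date9.;
  format sga_start sga_end date9.;
  datalines;
1 01JAN2020 31JAN2020
1 01MAR2020 31MAR2020
;
run;

/* Hypothetical sample data: one row per ID and DM period (dm_end is assumed) */
data dm;
  input id dm_start :date9. dm_end :date9.;
  format dm_start dm_end date9.;
  datalines;
1 20JAN2020 10FEB2020
1 15APR2020 30APR2020
;
run;

/* Pair every DM period with every SGA period of the same ID */
proc sql;
  create table overlap as
  select s.id,
         s.sga_start, s.sga_end,
         d.dm_start, d.dm_end,
         /* 1 if DM_start lies inside this SGA period, else 0 */
         (d.dm_start between s.sga_start and s.sga_end) as dm_start_in_sga,
         /* shared days, counting both end points; 0 when the periods do not overlap */
         max(0, min(s.sga_end, d.dm_end) - max(s.sga_start, d.dm_start) + 1) as overlap_days
  from sga as s
       inner join dm as d
       on s.id = d.id
  order by s.id, d.dm_start, s.sga_start;
quit;
```

With these made-up dates, the DM period starting 20JAN2020 and the SGA period ending 31JAN2020 share 20JAN2020 through 31JAN2020, so overlap_days is 12 for that pair. MIN and MAX are called with two arguments here, so PROC SQL treats them as row-wise functions rather than summary functions.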
Thank you all in advance,
Any help would be appreciated.
Hi all - How can I connect to files on a web server from SAS Viya? I want to automatically capture the .gz log files and then process them (e.g., domain.com-2024-04-11.gz).
I'm working on an actuarial project to estimate monthly probabilities that someone becomes disabled. A portfolio of persons (each having a different 'PolicyNr') is observed during 12 months, and the time until disability is recorded in the variable 'TimetoDisability'. When no disability occurred during the 12 months, the variable 'RightCensored' has the value 1. We further have the variables 'Gender', 'AgeatDisability' (which equals the age at the disability, or the age after 12 months for the right censored observations), and 'OccupationClass'.

We have data available in a wide format. For example, the following lines are part of the data:

PolicyNr  TimetoDisability  RightCensored  Gender  AgeatDisability  OccupationClass
001       2 months          0              Male    40 year          1
002       3 months          0              Male    30 year          2
003       12 months         1              Female  42 year          1

Intuitively, I would model this using a Cox proportional hazard model, with the variable 'TimetoDisability' as the time until the occurrence of the disability, and 'Gender', 'AgeatDisability', and 'OccupationClass' as covariates. Monthly probabilities are derived from the survival function.

Now assume that, because of practical/technical reasons, it is only possible to perform a GLM Binomial regression. I read that performing a GLM Binomial regression on data with pseudo observations is analogous to a Cox Discrete Time Survival model.

To prepare the analysis, I transform the wide dataset to a long dataset (with pseudo observations, see, e.g., https://grodri.github.io/glms/notes/c7s6), in which each line is duplicated according to the variable 'TimetoDisability'. For example, the first line from the table above is transformed to 2 lines, as it took 2 months to become disabled. The last of these lines has the value 1 for the variable 'Disability', as the disability occurred in month 2. The variable 'AgeatDisability' is transformed into the variable 'Age', now representing the age during that month. The right censored observation is transformed into 12 lines, all having the value zero for the variable 'Disability', as the disability is not observed. This becomes:

PolicyNr  Duration  Disability  Gender  Age                OccupationClass
001       1         0           Male    39 year 11 months  1
001       2         1           Male    40 year            1
002       1         0           Male    29 year 10 months  2
002       2         0           Male    29 year 11 months  2
002       3         1           Male    30 year            2
003       1         0           Female  41 year 1 month    1
003       2         0           Female  41 year 2 months   1
003       3         0           Female  41 year 3 months   1
003       4         0           Female  41 year 4 months   1
003       5         0           Female  41 year 5 months   1
003       6         0           Female  41 year 6 months   1
003       7         0           Female  41 year 7 months   1
003       8         0           Female  41 year 8 months   1
003       9         0           Female  41 year 9 months   1
003       10        0           Female  41 year 10 months  1
003       11        0           Female  41 year 11 months  1
003       12        0           Female  42 year            1

Question: In this long data format, the multiple rows (pseudo observations) for each person are not independent. We have repeated measures for each person. However, I read in Therneau and Grambsch: (quote) "One concern that often arises is that observations [on the same individual] are "correlated," and would thus not be handled by standard methods. This is not actually an issue. The internal computations for a Cox model have a term for each unique death or event time..."

So for a Cox Discrete Time Survival model, the dependency is not an issue. However, I don't see why the dependency in the data would not be an issue for a GLM Binomial regression. Given the dependency in the data, is it appropriate to perform a GLM to get trustworthy estimates of monthly probabilities? Or should I go for a mixed effect model? Thank you.
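For reference, the wide-to-long step described above could be sketched in SAS roughly as follows. This is only an illustration under my own assumptions: the dataset names wide and long are hypothetical, durations are stored as plain numbers of months and ages as plain numbers of years, and PROC LOGISTIC (a binomial GLM with a logit link) stands in for whatever GLM routine is actually available. The three sample rows are far too few for a meaningful fit, so the model step is meant for the full portfolio.

```sas
/* Hypothetical wide data: durations in months, ages in years */
data wide;
  input PolicyNr $ TimetoDisability RightCensored Gender $ AgeatDisability OccupationClass;
  datalines;
001 2 0 Male 40 1
002 3 0 Male 30 2
003 12 1 Female 42 1
;
run;

/* One row per policy and observed month (pseudo observations) */
data long;
  set wide;
  do Duration = 1 to TimetoDisability;
    /* event indicator: 1 only in the final month of a policy that is not censored */
    Disability = (Duration = TimetoDisability and RightCensored = 0);
    /* age during this month, in years (e.g. 39 year 11 months = 39.9167) */
    Age = AgeatDisability - (TimetoDisability - Duration) / 12;
    output;
  end;
  keep PolicyNr Duration Disability Gender Age OccupationClass;
run;

/* Binomial regression on the pseudo observations; assumed model specification */
proc logistic data=long;
  class Duration Gender OccupationClass / param=ref;
  model Disability(event='1') = Duration Gender Age OccupationClass;
run;
```

Treating Duration as a CLASS variable gives each month its own baseline term; whether that, or a smoother function of Duration, is appropriate is a modelling choice and not something the sketch settles.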
Hi all - I have a specific report format that I am trying to achieve using PROC REPORT, and I'm getting pretty close, but there's a column subtotal element that I cannot figure out.
I have sales data, summarized by Client, Quarter, and Week. Each quarter contains an arbitrary number of weeks, up to around 13.
I would like to list Client information down the side, with:
Quarters listed across the top,
each Quarter's corresponding Weeks nested underneath,
the table populated with Sales for each Client-Week,
and Grand Totals on the far right and bottom. I've gotten this far on my own just fine.
Where I'm getting stuck is that I'd also like to show Quarterly subtotal columns, at the end of each Quarter. I have tried different variations of "break after" and the like, but I'm just not getting there.
Does anyone in the community have any suggestions? Below is some sample code that represents the progress I've made so far. I've also attached an image showing what I currently "have" versus what I "want." I greatly appreciate any and all help - cheers!
data one;
  /* the colon modifier reads client as a blank-delimited value of up to 9 characters */
  input client_rank client :$9. client_id quarter $ weekending sales;
  datalines;
1 Apple 12345 Q1 20240106 1000
1 Apple 12345 Q1 20240113 2000
1 Apple 12345 Q1 20240127 5000
1 Apple 12345 Q2 20240413 3000
1 Apple 12345 Q2 20240420 4000
1 Apple 12345 Q2 20240427 2000
2 Microsoft 67890 Q1 20240106 3000
2 Microsoft 67890 Q1 20240113 1000
2 Microsoft 67890 Q1 20240127 2500
2 Microsoft 67890 Q2 20240413 4000
2 Microsoft 67890 Q2 20240420 500
2 Microsoft 67890 Q2 20240427 1500
;
run;

proc report data = one nowd;
  /* weeks are nested under quarters; the grand total column sits on the far right */
  column client_rank client client_id quarter, weekending, sales ("Total" sales=tot);
  define client_rank / noprint group;
  define client / "Client" group;
  define client_id / "Client ID" group;
  define quarter / " " across;
  define weekending / " " across nozero;
  define sales / "Sales" analysis sum format=comma12.0;
  define tot / "Sales" analysis sum format=comma12.0;

  /* grand total row at the bottom, labelled "Total" */
  rbreak after / summarize;
  compute after;
    client = "Total";
  endcomp;
  define client_rank / order group;
run;