I'm just checking my understanding. From what I understand, if the following is correct:
Then is the following not correct? My responses would have been: 189.87%, 696.20% and 31.54%.
I'm trying to create a summary output table of frequencies. I have 4 different subgroups between TM and ADRD (0,0; 1,0; 0,1; 1,1) and I'm trying to line them up horizontally rather than vertically. This is what I have:
TM_group | ADRD_group | Table         | Value_Labels        | WgtFreq  | Percent
0        | 0          | Table DEM_AGE | 2:Age Group [65,75) | 48301097 | 56.025
0        | 0          | Table DEM_AGE | 3:Age Group >=75    | 37912485 | 43.975
0        | 0          | Table DEM_SEX | 1:Male              | 36957033 | 42.867
0        | 0          | Table DEM_SEX | 2:Female            | 49256549 | 57.133
0        | 1          | Table DEM_AGE | 2:Age Group [65,75) | 983069   | 26.877
0        | 1          | Table DEM_AGE | 3:Age Group >=75    | 2674626  | 73.123
0        | 1          | Table DEM_SEX | 1:Male              | 1363336  | 37.273
0        | 1          | Table DEM_SEX | 2:Female            | 2294360  | 62.727
1        | 0          | Table DEM_AGE | 2:Age Group [65,75) | 82517247 | 60.176
1        | 0          | Table DEM_AGE | 3:Age Group >=75    | 54609890 | 39.824
1        | 0          | Table DEM_SEX | 1:Male              | 63773136 | 46.507
1        | 0          | Table DEM_SEX | 2:Female            | 73354000 | 53.493
1        | 1          | Table DEM_AGE | 2:Age Group [65,75) | 1290895  | 25.83
1        | 1          | Table DEM_AGE | 3:Age Group >=75    | 3706852  | 74.171
1        | 1          | Table DEM_SEX | 1:Male              | 2215172  | 44.323
1        | 1          | Table DEM_SEX | 2:Female            | 2782576  | 55.677
And this is what I want:
              |                     | no TM, no ADRD     | no TM, ADRD       | TM, no ADRD        | TM, ADRD
Table         | Value_Labels        | WgtFreq  | Percent | WgtFreq | Percent | WgtFreq  | Percent | WgtFreq | Percent
Table DEM_AGE | 2:Age Group [65,75) | 48301097 | 56.025  | 983069  | 26.8767 | 82517247 | 60.1757 | 1290895 | 25.8295
Table DEM_AGE | 3:Age Group >=75    | 37912485 | 43.975  | 2674626 | 73.1233 | 54609890 | 39.8243 | 3706852 | 74.1705
Table DEM_SEX | 1:Male              | 36957033 | 42.867  | 1363336 | 37.2731 | 63773136 | 46.5066 | 2215172 | 44.3234
Table DEM_SEX | 2:Female            | 49256549 | 57.133  | 2294360 | 62.7269 | 73354000 | 53.4934 | 2782576 | 55.6766
I think Proc Transpose could be used here, if I could get it to work correctly. Another idea I thought of is to separate each group into its own datafile, and then to add the columns from the second file to the right of the columns in the first file, and then repeat for the 3rd and 4th files. But I don't know how to add the columns like that.
Here is code to create the dataset above
data have;
  infile datalines dsd dlm=',' truncover;
  /* Table and Value_Labels are character, so they need $ informats */
  input TM_group ADRD_group Table :$20. Value_Labels :$30. WgtFreq Percent;
  datalines;
0,0,Table DEM_AGE,"2:Age Group [65,75)",48301097,56.0249
0,0,Table DEM_AGE,3:Age Group >=75,37912485,43.9751
0,0,Table DEM_SEX,1:Male,36957033,42.8668
0,0,Table DEM_SEX,2:Female,49256549,57.1332
0,1,Table DEM_AGE,"2:Age Group [65,75)",983069,26.8767
0,1,Table DEM_AGE,3:Age Group >=75,2674626,73.1233
0,1,Table DEM_SEX,1:Male,1363336,37.2731
0,1,Table DEM_SEX,2:Female,2294360,62.7269
1,0,Table DEM_AGE,"2:Age Group [65,75)",82517247,60.1757
1,0,Table DEM_AGE,3:Age Group >=75,54609890,39.8243
1,0,Table DEM_SEX,1:Male,63773136,46.5066
1,0,Table DEM_SEX,2:Female,73354000,53.4934
1,1,Table DEM_AGE,"2:Age Group [65,75)",1290895,25.8295
1,1,Table DEM_AGE,3:Age Group >=75,3706852,74.1705
1,1,Table DEM_SEX,1:Male,2215172,44.3234
1,1,Table DEM_SEX,2:Female,2782576,55.6766
;
run;
/* The [65,75) labels are quoted because they contain a comma; with DSD, */
/* a comma inside double quotes is read as data rather than a delimiter. */
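The second idea in the post (one file per group, with columns added side by side) can be done in a single DATA step by merging four filtered copies of the same dataset. The following is only a sketch under some assumptions: it relies on every Table / Value_Labels combination appearing exactly once per group, and the names have_sorted, want and the _00/_01/_10/_11 suffixes are made up for illustration.

```sas
/* Sort once so each filtered subset is already in BY order */
proc sort data=have out=have_sorted;
  by Table Value_Labels TM_group ADRD_group;
run;

/* Merge the four subgroups side by side, renaming the measures */
/* so each group gets its own pair of columns.                  */
data want;
  merge have_sorted(where=(TM_group=0 and ADRD_group=0)
                    rename=(WgtFreq=WgtFreq_00 Percent=Percent_00))
        have_sorted(where=(TM_group=0 and ADRD_group=1)
                    rename=(WgtFreq=WgtFreq_01 Percent=Percent_01))
        have_sorted(where=(TM_group=1 and ADRD_group=0)
                    rename=(WgtFreq=WgtFreq_10 Percent=Percent_10))
        have_sorted(where=(TM_group=1 and ADRD_group=1)
                    rename=(WgtFreq=WgtFreq_11 Percent=Percent_11));
  by Table Value_Labels;
  drop TM_group ADRD_group;  /* no longer meaningful after the merge */
  label WgtFreq_00 = "no TM, no ADRD: WgtFreq"  Percent_00 = "no TM, no ADRD: Percent"
        WgtFreq_01 = "no TM, ADRD: WgtFreq"     Percent_01 = "no TM, ADRD: Percent"
        WgtFreq_10 = "TM, no ADRD: WgtFreq"     Percent_10 = "TM, no ADRD: Percent"
        WgtFreq_11 = "TM, ADRD: WgtFreq"        Percent_11 = "TM, ADRD: Percent";
run;
```

If a category is missing from one subgroup, its columns are simply left missing on that row. PROC TRANSPOSE could produce the same layout, but it would probably need one pass per measure (WgtFreq and Percent) followed by a merge, so the direct merge may be the shorter route here.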
Hi sasinators, does anyone know how the "Actionable_entity_nm" field gets populated in the Alert table in SAS VI V.03.05? I've been searching for the data mapping but can't locate it. Thank you
Hi everyone,

I am using claims data for analysis and am stuck on one part of continuous enrollment. Any help would be appreciated. Thanks!

I need patients who are continuously enrolled from 12 months before the index date through at least one month, and at most 12 months, after the index date. A gap of up to 30 days is acceptable. Also, I'm interested in the first enrollment period only.

Here's the data I have:

data test;
  input patid $ dtstart :yymmdd10. dtend :yymmdd10.;
  format dtstart yymmdd10. dtend yymmdd10.;
  cards;
001 2017-01-01 2017-01-31
001 2017-02-01 2017-02-28
001 2017-05-01 2017-05-31
002 2018-01-01 2018-01-31
002 2018-02-20 2018-04-30
003 2020-03-25 2020-12-31
003 2021-01-15 2021-08-31
;
run;

Output (intermediate):

001 2017-01-01 2017-02-28
001 2017-05-01 2017-05-31
002 2018-01-01 2018-04-30
003 2020-03-25 2021-08-31

Now, patid 001 has two periods of continuous enrollment, but I need only the first one. So the desired output should be:

001 2017-01-01 2017-02-28
002 2018-01-01 2018-04-30
003 2020-03-25 2021-08-31

Below is the code that I think would work. I haven't tried it yet, since the data takes a long time (about a day) to process, so I want to make sure I do it correctly. I just want to share my thought process:

/* Assumes test is sorted by patid and dtstart */
data test1;
  set test (rename=(dtstart=start dtend=end));
  by patid;
  retain dtstart dtend enrolcnt;
  format dtstart dtend yymmdd10.;
  label dtstart  = "Enrollment Date Start"
        dtend    = "Enrollment Date End"
        enrolcnt = "Enrollment Period Count";
  /* First record for a patient opens the first enrollment period */
  if first.patid then do;
    dtstart  = start;
    dtend    = end;
    enrolcnt = 1;
  end;
  /* Extend the period only when the gap is 30 days or less */
  else if dtend + 30 >= start then do;
    if dtend < end then dtend = end;
  end;
  /* One row per patient: the first continuous period */
  if last.patid then output;
  drop start end;
run;

Once I get the continuous enrollment period, I will select patients whose continuous enrollment starts at least 12 months before the index date and runs to at least one month, and at most 12 months, after the index date. What confuses me is how to code the "at least" part of that statement. Should I create a new date variable as (new = index_date + 30)?

SAS code (definitely incomplete):

proc sql;
  create table test3 as
  select a.*, b.dtstart, b.dtend
  from data_index_date as a
       right join test1 as b
       on a.patid = b.patid
  where intnx('day', a.index_date, -365) GE b.dtstart
    and a.index_date LT b.dtend
  order by patid;
quit;
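One possible way to express the "at least one month after" requirement is to compare dtend against the index date shifted forward by a month, in the same style as the look-back condition. This is a sketch, not a tested solution: it assumes "one month" means a calendar month (use a.index_date + 30 instead if a fixed 30 days is intended), it swaps the right join for an inner join because the WHERE clause on a.index_date already discards unmatched rows, and followup_end is a made-up column name that caps follow-up at 12 months.

```sas
proc sql;
  create table test3 as
  select a.*,
         b.dtstart,
         b.dtend,
         /* Hypothetical column: follow-up ends at the earlier of the   */
         /* enrollment end and 12 months after the index date           */
         min(b.dtend, intnx('month', a.index_date, 12, 'same'))
           as followup_end format=yymmdd10.
  from data_index_date as a
       inner join test1 as b
       on a.patid = b.patid
  /* Enrolled on or before the date 12 months prior to the index date */
  where b.dtstart <= intnx('month', a.index_date, -12, 'same')
  /* Still enrolled at least one month after the index date */
    and b.dtend   >= intnx('month', a.index_date, 1, 'same')
  order by a.patid;
quit;
```

Because test1 holds a single continuous period per patient, checking that the period starts early enough and ends late enough is sufficient to show coverage of the whole window; no separate gap check should be needed at this stage.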