Recently in the SAS Community Library: SAS' @BethEbersole reveals 4 steps to stop money laundering, solve law-enforcement cases, find missing children and more with SAS Visual Investigator.
Hi I am new to proc template . 1. I am displaying bar value at the top of bar using barlabel=true. How to display the value at the center of the bar instead of top? 2. Also, how to wrap x axis value. text variable have one value very long 3. what option in proc template i can use to display P value at the top. Thanks in advance
... View more
I need suggestions on how to widen my dataset. I need a single line for each ID while keeping all the data in the sample dataset. I plan on later merging this with another dataset by ID. I was unable to transpose the data because an ID can have multiple same_spans of the same number (see ID R in the data example).
Thanks!
Data Have;
input
ID$ Clinic$ Span_Beg :Date9. Span_End :Date9. Same_Span ;
format Span_Beg MMDDYY10. Span_End MMDDYY10.;
datalines;
A Clinic1 17Nov2022 4Jan2023 1
B Clinic2 1Jul2022 29Jul2022 1
B Clinic2 21Nov2022 4Apr2023 2
C Clinic1 1Jul2022 7Jul2022 1
D Clinic3 1Jul2022 12Jul2022 1
E Clinic1 22Nov2022 19Mar2023 1
F Clinic2 26Aug2022 15May2023 1
G Clinic4 1Oct2022 10Apr2023 1
H Clinic5 30Dec2022 19Jan2023 1
I Clinic6 1Jul2022 1Nov2022 1
I Clinic6 11Nov2022 10Jan2023 2
I Clinic6 1Feb2023 30Apr2023 3
J Clinic2 1Jul2022 21Aug2022 1
K Clinic2 9Mar2023 30Jun2023 1
L Clinic5 20Sep2022 20Feb2023 1
M Clinic5 1Jul2022 2Aug2022 1
N Clinic1 6Jul2022 1Sep2022 1
O Clinic7 1Jul2022 1Sep2022 1
P Clinic8 1Jul2022 24Oct2022 1
Q Clinic8 1Jul2022 4Dec2022 1
R Clinic9 19Dec2022 31May2023 1
R Clinic9 8Jun2023 18Jun2023 2
R Clinic1 28Aug2022 18Sep2022 1
S Clinic1 1Jul2022 10Jul2022 1
S Clinic3 4Aug2022 17Nov2022 1
;
run;
... View more
I have a process where I am:
reading 67M rows from SQL Server to SAS; transforming the data using SAS; writing 67M rows from SAS to SQL Server.
no rows are dropped or created in the transformation processing. it's just fixing some incorrect values in historic data.
the source and target tables have identical schema
the target table is a heap (no indexes, keys, constraints)
The read operation is taking approx. 5 minutes; the write operation is taking approx. 5 hours. So, yeah 😬😱😠
Here are some excerpts from the log:
Libname statement:
27 %libname_sqlsvr(libref=PROD,server=MYSERVER,port=,database=MYDB,schema=prod)
=================================================================================
LIBREF: PROD
CONNECT: NOPROMPT="Driver={SQL Server Native Client
10.0};Server=MYSERVER;Database=MYDB;Trusted_Connection=yes;"
INTERNAL OPTIONS: schema=prod bulkload=yes dbcommit=100000 direct_exe=delete preserve_names=yes
USER OPTIONS:
=================================================================================
ODBC: AUTOCOMMIT is NO for connection 3
MPRINT(LIBNAME_SQLSVR): libname PROD odbc NOPROMPT="Driver={SQL Server Native Client
10.0};Server=MYSERVER;Database=MYDB;Trusted_Connection=yes;" schema=prod bulkload=yes dbcommit=100000
direct_exe=delete preserve_names=yes ;
NOTE: Libref PROD was successfully assigned as follows:
Engine: ODBC
Physical Name:
Read data from SQL Server to SAS:
27 * Copy data from SQL Server ;
ODBC: AUTOCOMMIT is NO for connection 6
ODBC_28: Prepared: on connection 6
SELECT * FROM "prod"."RLDX_DEMOGRAPHIC_ORIG"
28 data workspde.RLDX_DEMOGRAPHIC;
29 set prod.RLDX_DEMOGRAPHIC_ORIG;
30 run;
ODBC_29: Executed: on connection 6
Prepared statement ODBC_28
NOTE: There were 67,766,170 observations read from the data set PROD.RLDX_DEMOGRAPHIC_ORIG.
NOTE: The data set WORKSPDE.RLDX_DEMOGRAPHIC has 67,766,170 observations and 34 variables.
NOTE: Compressing data set WORKSPDE.RLDX_DEMOGRAPHIC decreased size by 75.73 percent.
NOTE: DATA statement used (Total process time):
real time 5:05.05
user cpu time 6:15.26
system cpu time 57.09 seconds
memory 1401.65k
OS Memory 22828.00k
Timestamp 25/03/2020 07:17:50 AM
Write data from SAS to SQL Server:
27 proc append
ODBC_24: Prepared: on connection 6
SELECT * FROM "prod"."RLDX_DEMOGRAPHIC" WHERE 0=1
ODBC: AUTOCOMMIT is NO for connection 7
ODBC: COMMIT performed on connection 7.
ODBC_25: Prepared: on connection 7
SELECT * FROM "prod"."RLDX_DEMOGRAPHIC"
28 base=prod.RLDX_DEMOGRAPHIC (
29 bulkload=yes bl_options=tablock
30 )
31 data=workspde.RLDX_DEMOGRAPHIC6 (
32 keep=
33 rldx_record_key
34 rldx_audit_key
35 recnum
36 stay_number
37 episode_sequence_number
38 hospital_type
39 source
40 hoscode
41 tfrhosp
42 transfrom
43 mrn
44 surname
45 givnames
46 sex
47 dob
48 age
49 address
50 locality
51 pcode
52 cob
53 admdate
54 sepdate
55 dthdate
56 notifdate
57 patientnum
58 supi
59 moh_auid
60 AHS_hosp
61 dc_extract_date
62 cob_sacc
63 state_of_usual_residence
64 rldx_valid_from
65 rldx_valid_to
66 rldx_current_ind
67 );
68 run;
NOTE: Appending WORKSPDE.RLDX_DEMOGRAPHIC6 to PROD.RLDX_DEMOGRAPHIC.
NOTE: There were 67,766,170 observations read from the data set WORKSPDE.RLDX_DEMOGRAPHIC6.
NOTE: 67766170 observations added.
NOTE: The data set PROD.RLDX_DEMOGRAPHIC has . observations and 34 variables.
NOTE: PROCEDURE APPEND used (Total process time):
real time 5:34:15.35
user cpu time 18:04.33
system cpu time 37.90 seconds
memory 634.87k
OS Memory 21804.00k
Timestamp 25/03/2020 03:34:13 AM
The write operation is a TRUNCATE TABLE (not shown) and PROC APPEND approach.
I note this previous thread: https://communities.sas.com/t5/General-SAS-Programming/Get-data-into-sql-server-more-efficiently/td-p/95560
I will continue playing with the libname and/or dataset options, but at 5 hours a pop for performance metrics, I thought I'd post to see if someone has further ideas. For example, is it possible that bulkloading needs to be tweaked on SQL Server itself (not that I'd know how to diagnose that, and I'm not the DBA).
... View more
Hi,
I've a sample input data below and the expected output format.
I need to find number of active customers as per the start date of the year and their based on end date, I've to keep the count added for each year.
Example,
Customer A,H,I - Active from 2004, so we have 3 customers active from 2004 and I'm keeping the count added horizontally for each year and by 2009 one customer got dropped. So for 2010 - customer count was 2.
Similarly I've to populate the same for the next row active from 2005 and followed by other rows.
Could you please suggest me the best approach to get this done ?
Thanks in advance for your help.
... View more
The fact that I even have to post this tells u what kind of day I'm having 😞 If the HLT_ALZDEM variable doesn't have a valid value of 1 or 2 (yes/no), then the variables that are constructed based on HLT_ALZDEM should be given a value of missing. However, they sometimes are given a value of 0 instead.
Below is a portion of the output file, sorted for rows where HLT_ALZDEM is missing. Since missing is clearly not 1 or 2, all ADRD_group and non_ADRD_group values should also be missing. But they're not. Obs 9 and 10 are identical except for these 2 variables., when Obs 9 is missing and Obs 10 is 0 for both.
The only possible explanation I can think of is that there are non-printable characters for some values of HLT_ALZDEM. But even if there are, those non-printable characters are not 1 or 2. Frequencies for HLT_ALZDEM only show values of 1, 2, and missing.
IF cohort_flag = 1 AND ADM_MA_FLAG_YR = 3
THEN MA_group = 1; ELSE MA_group = 0;
IF cohort_flag = 0 THEN MA_group = .;
IF cohort_flag = 1 AND ADM_FFS_FLAG_YR = 3
THEN TM_group = 1; ELSE TM_group = 0;
IF cohort_flag = 0 THEN TM_group = .;
IF cohort_flag = 1 AND HLT_ALZDEM = 1
THEN ADRD_group = 1; ELSE ADRD_group = 0;
IF cohort_flag = 0 THEN ADRD_group = .;
IF HLT_ALZDEM ^in (1,2) THEN ADRD_group = .;
IF cohort_flag = 1 AND HLT_ALZDEM = 2
THEN non_ADRD_group = 1; ELSE non_ADRD_group = 0;
IF cohort_flag = 0 THEN non_ADRD_group = .;
IF HLT_ALZDEM ^in (1,2) THEN non_ADRD_group = .;
Obs
HLT_ALZDEM
cohort_flag
MA_group
TM_group
ADRD_group
non_ADRD_group
1
0
2
1
0
1
0
0
3
1
1
0
4
0
5
0
6
1
0
1
7
1
1
0
8
1
1
0
9
1
1
0
10
1
1
0
0
0
11
1
1
0
12
1
0
1
13
1
0
1
14
1
0
1
15
1
0
1
16
1
0
1
0
0
... View more