Hello, I produced a histogram for a single continuous variable (minimum value=0, mean=31, SD=14) using PROC SGPLOT. I included a normal density curve for the distribution, as well as a normal density curve for the same continuous variable from a standard population for comparison (lowest possible value=0, mean=43, SD=26.0). Given the lowest possible value is 0, I am not sure why the left tail of my normal curves extend past 0, and do not intersect at 0. Any insight would be very much appreciated! Thank you:)
... View more
Hi all, I have successfully created an annotated state map from geocoded address info using the states map. However, when I try to use the county map data set, it appears that the geocoding output does not match what information in the county map dataset. Here is a small program to illustrate the difficulty. Code doesn't get past proc gproject. It seems that the length of x and y of the map do not match the length of x and y in the geocoded output. thanks! Phil : libname lookup "D:\Documents\desi2023\frlanalyses\geocodedata__2006__ZIP4_Geocode_Data\data"; data zips; input c zip; anno_flag=1l cards; 1 65203 2 65211 3 65201 ; proc geocode /* Invoke geocoding procedure */ method=plus4 /* Specify geocoding method */ lookup=lookup.zip4 /* Lookup data from MapsOnline */ data=zips /* Input data set to geocode */ out=geocodedzips; /* Specify name of Output data set of locations */ run; data mocounties;set maps.counties;if state=29; data mocounties;set mocounties geocodedzips;run; proc gproject data=mocounties out=MOCountiesP; id state ; run; data my_map locations; set mocountiesP; if anno_flag=1 then output locations; else output my_map; run; data anno_locations; set locations; if c=1 then color="red"; if c=2 then color="bib"; if c=3 then color="viyg"; if c=4 then color="bippk"; xsys='2'; ysys='2'; hsys='3'; when='a'; function='pie'; rotate=360; size=.5; style='psolid'; output; length html $300; html='title='||quote(trim(left(citystate))||'0d'x||trim(left(put(zip,z5.)))); style='pempty'; color='deb'; output; run; pattern1 v=e; proc gmap data=mymap map=mymap anno=anno_locations; id county; choro segment / coutline=black levels=1 nolegend coutline=gray99;run;
... View more
Hi, I have some code that sums up monthly values and assigns each month as a new column. The problem is the code is ugly and I feel like there should be a better way to do this. Here is the code:
PROC SQL;
CREATE TABLE TMP1DAY.MONTHLY_SUMMARY AS
SELECT t1.ID_VARIABLE,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201901" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201902", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201901,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201902" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201903", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201902,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201903" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201904", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201903,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201904" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201905", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201904,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201905" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201906", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201905,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201906" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201907", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201906,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201907" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201908", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201907,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201908" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201909", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201908,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201909" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201910", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201909,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201910" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201911", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201910,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201911" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "201912", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201911,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201912" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "202001", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201912_01,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "201901" AND "201912" AND t1.SECOND_YYYYMM BETWEEN "201901" AND "202002", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_201912_02,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202001" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202002", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202001,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202002" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202003", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202002,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202003" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202004", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202003,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202004" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202005", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202004,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202005" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202006", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202005,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202006" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202007", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202006,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202007" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202008", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202007,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202008" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202009", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202008,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202009" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202010", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202009,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202010" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202011", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202010,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202011" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202012", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202011,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202012" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202101", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202012_01,
SUM(IFN(t1.FIRST_YYYYMM BETWEEN "202001" AND "202012" AND t1.SECOND_YYYYMM BETWEEN "202001" AND "202102", t1.SUM_COLUMN, 0)) AS SUM_COLUMN_202012_02
FROM WORK.PRE_MONTHLY_SUMMARY t1
GROUP BY 1;
QUIT;
How would I instead write this to do the same thing but programmatically/with macro variables? its already out of hand and I will need to add more years which will just make it even worse. Any and all advice is appreciated.
Thanks in advance!
... View more
Hi all - ¿How can I connect files from a web server to sas viya? I want to automatically capture the log files in .gz, and then process them (ex: domain.com-2024-04-11.gz).
... View more
Hello! Using a macro, I am producing a series of crosstabs using several demographic variables (sex, age, income, etc.). Some of these variables have formats I created using PROC FORMAT. When I use PROC SURVEYFREQ, the ods output provides a table with a column that contains the values (say sex contains values 1 or 2) and another column with the formatted value (say F_sex contains Male or Female). I use the formatted columns and rename them Demographic. I then append the files created with ods output and I obtain a column named Demographic that contains the formatted values (ex. Male, Female, 18 to 30, 31 to 40, et.). However, when I do the same with PROC SURVEYMEANS, the ods output does not include columns with formatted values only columns with the actual values such as 0 and 2 and not Male and Female). Is there a way to force ods output to include the formatted values? Thanks in advance for your help! * Both PROC steps are part of a macro and &DemoVar contains the demographic variable ;
proc surveyfreq data=Mydataset VARHEADER=LABEL ;
tables year * &DemoVar * fg_use_choose
year * &DemoVar * fg_use_amt
year * &DemoVar * fg_use_assess
year * &DemoVar * fg_use_plan
year * &DemoVar * fg_use_wt
year * &DemoVar * fg_use_away / nostd row ;
weight wght ;
format income_adeq_G income_adeq_G_F.
fg_use_choose
fg_use_amt
fg_use_assess
fg_use_plan
fg_use_wt
fg_use_away yesno_F. ;
ods output CrossTabs=Results_Info ;
run ;
***********************************************;
proc surveymeans data=Mydataset mean VARHEADER=LABEL plots = none ;
var FG_trust ;
domain year * &DemoVar year ;
weight wght ;
format income_adeq_G income_adeq_G_F. ;
ods output Domain=Results_Trust ;
run ;
Erik
... View more