As the first step in the decommissioning of sasCommunity.org the site has been converted to read-only mode.


Here are some tips for How to share your SAS knowledge with your professional network.


Converting shapefiles to KML for viewing in Google Earth and Google Maps

From sasCommunity
Jump to: navigation, search

KML is a file format used for displaying geographic data in an Earth browser, such as Google Earth or Google Maps. The KML format is useful for several reasons: it is open source, Google Earth and Google Maps are widely familiar to non-GIS users, and Google Earth and Google Maps provide orthoimagery and ample base layer data at no cost.

In disease surveillance there is a frequent need to assign census geography (blocks, block groups, tracts) to disease cases to assess patterns and trends. Most of this is done through automated text-matching processes, but for a small portion of cases it can be easier to locate the actual residential location in Google Earth.

The Census Bureau provides boundary files as shapefiles [1]. What follows is SAS code to convert these to KML files. While there are free tools available for doing this, such as shp2kml [2], none of them offer full control of behavior or appearance, and not all allow the conversion of many shapefiles at once.

The first program below downloads shapefiles from the census web site, unzips them using a free unzip program [3], and converts them to a SAS data set using PROC MAPIMPORT. It is shown here for census tracts within New York State counties, but could easily be modified for other states/census geographies. It is useful to keep things separated by counties, because attempting to view census geography for an entire state in Google Earth can overtax your computer's video card.

 
 
 libname ge 'C:\gemaps10';
 
 %let state=36; *two-digit FIPS code for New York, replace as needed; 
 
  %macro nycounties;
  %do i = 1 %to 123 %by 2; *download all shapefiles for NY (county codes 001 to 123, odd);
    dm 'wbrowse "http://www2.census.gov/geo/tiger/TIGER2010/TRACT/2010/tl_2010_&state%sysfunc(putn(&i,z3.))_tract10.zip"';
    run;
  %end;
 
  x unzip C:\temp\tl_2010_*.zip -d C:\gemaps10; *unzip and move files to preferred directory; 
 
  %do i = 1 %to 123 %by 2; *convert all shapefiles to SAS datasets;
   proc mapimport out=ge.county%sysfunc(putn(&i,z3.))_tracts datafile="C:\gemaps10\tl_2010_&state%sysfunc(putn(&i,z3.))_tract10.shp";
    run;
  %end;
  %mend;
 
 %nycounties;


The next program writes the KML code based on the SAS data sets. The code is designed to show each census unit (New York state tracts in this example) in Google Earth as a translucent orange polygon with a white paddle in the center. When the mouse is moved over a polygon, it turns a bolder orange and the census tract number appears. When a polygon is clicked, a balloon with geographic identifiers opens.

Most of the program consists of PUT statements. The trick from the SAS perspective is that there are three different types of polygons that require slightly different code: a single polygon, a multi-part polygon (as in a chain of islands), and a polygon with one or more holes in it (as in a town entirely surrounding a village). The first type of polygon is straightforward. The second is identified by the variable called "section" - each island has its own section number. The third is identified by rows of missing values. For these, I changed the section value from missing to 999 to make these easy to identify.

The resulting KML programs for all counties may be found at Frank Boscoe's web page [4] (replace 001 with any value from 001-123 odd)

 libname ge 'C:\gemaps10'; 
 dm log 'clear';
 
  %let statem=36; *FIPS code for New York, replace as needed; 
 
 %macro writekml;
 %do i=1 %to 123 %by 2; *again, these are the FIPS codes of New York counties;
   data county&i;
   set ge.county&i;
   rowid=_n_;
   if x=. then segment=999; *used for later identification of tracts with holes;
   run;
 
   proc sort; *order by tract number;
   by geoid10 rowid;
   run;
 
   *this classifies each polygon type: maxseg=1 is a simple polygon, 2 to 998=multi-part 
   polygons, and 999=polygon with holes;
   proc sql;
   create table tractlist&i as select unique geoid10, max(segment) as maxseg from county&i
   group by geoid10;
   quit;
   run;
 
   %obscount; *obtains the number of tracts in each county;
 
   *extracts the county name from the sashelp.zipcode table;
   data _null_;
   set sashelp.zipcode;
   where state=&statem and county=&i;
   call symputx('namem',countynm); 
   run;
 
   *this stores latitude and longitude extents for each county, used in setting the visibility
   level;
   proc sql noprint;
   select max(x), max(y), min(x), min(y) 
   into :maxx, :maxy, :minx, :miny
   from county&i;
   quit;
   run;
   %let mwest=%sysevalf(&minx-.2);
   %let meast=%sysevalf(&maxx+.2);
   %let msouth=%sysevalf(&miny-.2);
   %let mnorth=%sysevalf(&maxy+.2);
 
   FILENAME kml CLEAR ;
   filename kml  "C:\gemaps10\county%sysfunc(putn(&i,z3.)).kml";
 
   data _null_;
    file kml;
    put "<?xml version='1.0' encoding='UTF-8'?>";
    put "<kml xmlns='http://earth.google.com/kml/2.2'>";
    put "<Document>";
    put "<name>&namem. County </name>"; 
    put "<Schema name='balloon' id='balloon_schema'>";
    put "<SimpleField name='zcounty' type='char'>";
    put "<displayName><![CDATA[<b>County</b>:]]></displayName></SimpleField>";
    put "<SimpleField name='ztract' type='int'>";
    put "<displayName><![CDATA[<b>Tract</b>:]]></displayName></SimpleField>";
    put "</Schema>";
    put "<StyleMap id='MapRollOver1'>";
    put "<Pair>";
    put "<key>normal</key>";
    put "<styleUrl>#zn1</styleUrl>";
    put "</Pair>";
    put "<Pair>";
    put "<key>highlight</key>";
    put "<styleUrl>#zh1</styleUrl>";
    put "</Pair>";
    put "</StyleMap>";
 
    *defines normal style;
    put "<Style id='zn1'>";
    put "<IconStyle>";
    put "<color>FFDBF3E0</color>";
    put "<scale>0.8</scale>";
    put "<Icon>";
    put "<href>http://www.albany.edu/~fboscoe/ge10/wht-blank.png</href>";
    put "</Icon>";
    put "</IconStyle>";
    put "<LabelStyle>";
    put "<scale>0</scale>";
    put "</LabelStyle>";
    put "<BalloonStyle>"; 
    put "<BalloonStyle>"; 
    put "<bgColor>BFFFFF</bgColor>";
    put "<text><br/><![CDATA[";
    put "<table><tr><td>$[balloon/zcounty/displayName]</td>";
    put "<td>$[balloon/zcounty]</td></tr>";
    put "<tr><td>$[balloon/ztract/displayName]</td>";
    put "<td>$[balloon/ztract]</td></tr>";
	%if &geog=bg %then %do;
       put "<tr><td>$[balloon/zbg/displayName]</td>";
       put "<td>$[balloon/zbg]</td></tr>";
	%end;
	%if &geog=tabblock %then %do;
       put "<tr><td>$[balloon/zblock/displayName]</td>";
       put "<td>$[balloon/zblock]</td></tr>";
	%end;
    put "</table>";
    put "]]></text>";
    put "</BalloonStyle>";
    put "<LineStyle>";
    put "<color>FF0080FF</color>";
    put "<width>1</width>";
    put "</LineStyle>";
    put "<PolyStyle>";
    put "<color>63DBF3E0</color>"; 
    put "</PolyStyle>";
    put "</Style>";
 
    *define highlight style;
    put "<Style id='zh1'>";
    put "<IconStyle>";
    put "<color>FFDBF3E0</color>";
    put "<scale>0.8</scale>";
    put "<Icon>";
    put "<href>http://http://www.albany.edu/~fboscoe/ge10/orange-stars.png</href>";
    put "</Icon>";
    put "</IconStyle>";
    put "<LabelStyle>";
    put "<scale>1</scale>";
    put "</LabelStyle>";
    put "<BalloonStyle>"; 
    put "<bgColor>BFFFFF</bgColor>";
    put "<text><br/><![CDATA[<table><tr><td>$[balloon/zcounty/displayName]</td><td>$[balloon/zcounty]</td></tr><tr><td>$[balloon/ztract/displayName]</td><td>$[balloon/ztract]</td></tr></table>]]></text>"; 
    put "</BalloonStyle>";
    put "<LineStyle>";
    put "<color>FF0080FF</color>";
    put "<width>3</width>";
    put "</LineStyle>";
    put "<PolyStyle>";
    put "<color>BFDBF3E0</color>"; 
    put "</PolyStyle>";
    put "</Style>";
    put "<Folder>";
    put "<name>Census Tracts</name>";
 
    *define visibility level;
    put "<Region>";
    put "<LatLonAltBox>";
    put "<north>&mnorth</north>"; 
    put "<south>&msouth</south>";
    put "<east>&meast</east>";
    put "<west>&mwest</west>";
    put "</LatLonAltBox>";
    put "<Lod><minLodPixels>512</minLodPixels><maxLodPixels>-1</maxLodPixels></Lod>"; *use 512 for tracts and 1024 for block groups and 8192 for blocks;
    put "</Region>";
   run;
 
   %polygons; *writes polygon-specific kml code for the 3 different types of polygons;
 
    put "</Folder>";
    put "</Document>";
    put "</kml>";
    run;
 
 %end;
 %mend;
 
 %macro obscount; *gets the number of tracts for each county (e.g., Albany=75);
   data _NULL_;
   set tractlist&i NOBS=obsout;
   call symputx('obscountm',obsout);
   stop;
   run;
 %mend;
 
 %macro polygons; *writes polygon-specific kml code for the 3 different types of polygons;
 %do j = 1 %to &obscountm;
   data _null_;
   set tractlist&i;
   if _n_ = &j;
   call symputx('tractnm',geoid10);
   call symputx('segm',maxseg);
   run;
 
   data temp;
   set county&i;
   segmentlag=lag(segment);
   where geoid10="&tractnm";
   call symputx('tractm',name10); 
   call symputx('intptlat10m',intptlat10);
   call symputx('intptlon10m',intptlon10);
   call symputx ('xm',x);
   call symputx ('ym',y);
   run;
 
   data _null_;
   file kml mod;
   put "<Placemark>";
   put "<name>&tractm.</name>"; 
   put "<Snippet>&namem. County, &tractm.</Snippet>"; 
   put "<description></description>";
   put "<styleUrl>#MapRollOver1</styleUrl>"; 
   put "<ExtendedData><SchemaData schemaUrl='#balloon_schema'>";
   put "<SimpleData name='zcounty'>&namem.</SimpleData>";
   put "<SimpleData name='ztract'>&tractm.</SimpleData>";
   put "</SchemaData></ExtendedData>"; 
   put "<MultiGeometry>";
   put "<Polygon>";
   put "<outerBoundaryIs>";
   put "<LinearRing>";
   put "<coordinates>";
   run;
 
 %if &segm=1 %then %do; *simple polygon case;
   data _null_;
   set temp;
   xystring=cats(x,",",y,",0"); 
   file kml mod;
   put xystring;
   run;
 
   data _null_;
   file kml mod;
   put "</coordinates>";
   put "</LinearRing>";
   put "</outerBoundaryIs>";
   run;
 %end;
 
 %else %if &segm>1 and &segm<999 %then %do; *multi-part polygon case;
   data _null_;
   set temp;
   xystring=cats(x,",",y,",0"); 
   file kml mod;
   if segment=segmentlag or segmentlag eq . then put xystring;
   else do;
   put "</coordinates>";
   put "</LinearRing>";
   put "</outerBoundaryIs>";
   put "</Polygon>";
   put "<Polygon>";
   put "<outerBoundaryIs>";
   put "<LinearRing>";
   put "<coordinates>";
   put xystring;
   end;
   run; 
 
   data _null_;
   file kml mod;
   put "</coordinates>";
   put "</LinearRing>";
   put "</outerBoundaryIs>";
   run;
 %end;
 
 %else %if &segm=999 %then %do; *polygon with holes case;
   data _null_;
   set temp;
   if segment=999 then counter+1; 
   xystring=cats(x,",",y,",0");
   file kml mod;
   if segment=1 then put xystring;
   else if segment=999 and counter=1 then do; *first hole;
    put "</coordinates>";
    put "</LinearRing>";
    put "</outerBoundaryIs>";
    put "<innerBoundaryIs>";
    put "<LinearRing>";
    put "<coordinates>";
   end;
   else if segment=999 and counter>1 then do; *second and subsequent holes;
    put "</coordinates>";
    put "</LinearRing>";
    put "</innerBoundaryIs>";
    put "<innerBoundaryIs>";
    put "<LinearRing>";
    put "<coordinates>";
    end;
   run;
 
   data _null_;
   file kml mod;
   put "</coordinates>";
   put "</LinearRing>";
   put "</innerBoundaryIs>";
   run; 
 %end;
 
   data _null_; *add placemark;
   file kml mod;
   put "</Polygon>";
   put "<Point>";
   put "<coordinates>&intptlon10m.,&intptlat10m.,0 </coordinates>"; 
   put "</Point>";
   put "</MultiGeometry>";
   put "</Placemark>";
 %end;
 %mend;
 
 %writekml;