Creating IP address formats/informats

From sasCommunity

I spend a lot of time analyzing application log files that contain IP addresses. For some time I've wished SAS had built-in formats to read and display IPv4 addresses. Looking through Art Carpenter's Innovative SAS Techniques book, I realized I had all the tools I needed to create them myself!

The motivation is twofold. First, the longest text-based IPv4 address is xxx.xxx.xxx.xxx, which is 15 characters. Each xxx ranges from 0 to 255, which fits in 1 byte in binary, and that 4-byte form is how the address is carried on the Internet. Converting the text to a single 4-byte word therefore saves almost 75% of the space (4 bytes instead of 15). The second benefit is sort order: a character representation of an IPv4 address doesn't sort numerically, so it takes extra effort to see whether two IP addresses are adjacent to one another. It also breaks other things, such as formats that define ranges.
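
To make the sort-order problem concrete, here is a minimal sketch. The data set name text_ips and the sample addresses are made up for illustration. Sorting the text form compares character by character rather than octet by octet, so 10.0.0.9 lands after 10.0.0.100.

```sas
/* Hypothetical sample data: three addresses stored as plain text */
data text_ips;
  length ip $15;
  input ip;
datalines;
10.0.0.9
10.0.0.10
10.0.0.100
;
run;

/* A character sort compares the strings left to right,  */
/* so '9' sorts after '1' regardless of the octet value. */
proc sort data=text_ips;
  by ip;
run;

proc print data=text_ips noobs;
run;
```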

I had used user-defined functions extensively when they were first released, but hadn't realized they could serve as the function behind a function-powered format. Art's book provided the 'of course!' moment for that.

So here is my initial version of a format/informat pair that translates IPv4 addresses to and from a 4-byte character field. I'm sure it can be improved upon; please feel free to post ideas for improvement.

Thanks!

--Ben

/* This code snippet defines a pair of functions that convert a text-based IPv4 address into a 4-character field and back again. */
/* It then uses those functions in an informat/format pair. */
 
/* Author: Ben Conner */
/* Email: ben@webworldinc.com */
 
/* PROC FCMP lets you define your own functions. Two are defined here: */
/* cip2ch takes an IP address of the form xxx.xxx.xxx.xxx, where each xxx is from 0 to 255, and returns a 4-byte character representation of it; */
/* ch2ip converts the 4-byte field back to the original text form. */
 
proc fcmp outlib=work.functions.conversions;
 
function cip2ch(c $) $ 16;
  length ip $ 4;   /* packed result: one byte per octet */

  /* Pull out the four dot-separated octets as numbers */
  o1=input(scan(c,1,'.'),3.);
  o2=input(scan(c,2,'.'),3.);
  o3=input(scan(c,3,'.'),3.);
  o4=input(scan(c,4,'.'),3.);

  /* Basic error checking: if malformed, return the input unchanged */
  if 0<=o1<=255 and
     0<=o2<=255 and
     0<=o3<=255 and
     0<=o4<=255 then do;
    ip=byte(o1)||
       byte(o2)||
       byte(o3)||
       byte(o4);
    return(ip);
  end;
  else
    return(c);
endsub;
 
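/* Explicit lengths guard against truncating the result, which can be up to 15 characters */
function ch2ip(c $) $ 15;
  length ip $ 15;
  ip=put(rank(substr(c,1,1)),3.)||'.'||
     put(rank(substr(c,2,1)),3.)||'.'||
     put(rank(substr(c,3,1)),3.)||'.'||
     put(rank(substr(c,4,1)),3.);
  /* put(...,3.) right-aligns each octet; compress() removes the padding blanks */
  ip=compress(ip);
  return(ip);
endsub;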
run;
 
 
/* Now tell SAS to search the library where FCMP stored the new functions. */
options cmplib=(work.functions);
 
/* To make them easier to use, define an informat and a format that call the functions. */
 
proc format;
  invalue $inip (default=16)
        other=[cip2ch()];
  value $outip (default=16)
        other=[ch2ip()];
run;
 
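/* Test: read two text addresses through $inip., then convert them back with $outip. */
data ips;
  length b e $16;
  informat b e $inip.;
  r=cip2ch('8.27.163.66');   /* direct function call, bypassing the informat */
  input b  e ;
  bip=put(b,$outip.);        /* back to dotted text via the format */
  eip=put(e,$outip.);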
cards;
8.127.163.66 8.127.163.84
;;;
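
To check the round trip, the test data set can be printed. This is only a sketch: it assumes the functions, the $outip. format, and the ips data set above were all created in the same session. The packed fields mostly hold unprintable bytes, so b and e are displayed through $outip., and r is displayed in hex with the standard $HEX8. format (two hex digits per byte).

```sas
/* Display the packed fields readably: b and e through $outip., r as hex */
proc print data=ips noobs;
  var r b e bip eip;
  format b e $outip. r $hex8.;
run;
```

If the conversion works, bip and eip should match the dotted text that was read in.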