sasCommunity.org Planet

February 08, 2010

At the risk of stumbling over the description of another "new" function, I discovered the DIVIDE function alongside my discovery of the IFC/IFN functions. This is definitely new in V9.2; confirmed by the What's New in the Base SAS 9.2 Language web page.

It caught my eye because it handles those division scenarios where SAS normally issues notes or warnings to the log, e.g. divide by zero. In many cases you would want to be warned of missing values and divide by zero, there are some cases where you do not, but you need to create long-winded conditional coding around your division in order to avoid the messages. Well, with the DIVIDE function you don't.

Instead of returning notes or warnings, the DIVIDE function returns a special missing value. For example, it might return .I (indicating infinity) or .M (minus infinity). The full range of values and circumstances are in the DIVIDE documentation. The special missing values are understood by ODS too, and handled nicely. Typically, when SAS prints special missing values, it omits the dot and just prints the upper-cased letter.

If you're not familiar with special missing values (they're not new in V9.2, they've been around for many, many years), take a look at the V9.2 documentation "Creating Special Missing Values".

The DIVIDE function won't be useful to everybody, but for some SAS coders it'll be received with thanks.

It seems my recent post on the IFC and IFC functions caused a fair bit of interest. Not least in further related functions.

Firstly I must own-up to my mistake of originally stating IFC and IFN were new with 9.2. A number of correspondents pointed-out that they were available from the beginning of V9. Plus, I have it on good authority that they were experimental in V8.2 (albeit possibly with different names). They were implemented primarily to provide a logical construct that could be used interchangeably in both SQL and data step code. They happen to be a lot shorter than SQL's case/when/else construct too! My thanks to the little ex-SAS birdie and the correspondent who passed the message along.

Secondly, The SAS Plumber started a thread on news:comp.soft-sys.sas. In response, Data _Null_ pointed-out that the third conditional value only gets returned when the value of the first parameter is actually missing; hence my description of the functionality was incorrect. Jason Secosky pointed-out the same thing in comments on the original post. And Ron Fehd highlighted the safe, traditional option of using the SELECT statement.

Finally, Jack Hamilton suggested a look at CHOOSEC and CHOOSEN. These were new to me too - I've clearly been walking around with my eyes closed (and bumped into something, causing the poor quality of the original IFC/IFN post!). The syntax for both is:

CHOOSEx (index-expression, selection-1 <,...selection-n>)

The CHOOSEx function uses the value of index-expression to select from the arguments that follow. For example, if index-expression is three, CHOOSEx returns the value of selection-3. If the first argument is negative, the function counts backwards from the list of arguments, and returns that value.

In his comments to the original posting, Jack also mentioned that IFC/IFN are also available in the macro language via %SYSFUNC, and so provide a primitive IF/THEN/ELSE mechanism in open code. Jack highlighted the example in the sascommunity.org wiki.

So, I now know four functions that I'd not heard of just a few short weeks ago. And I've made a note to read the documentation more carefully before posting next time! My thanks to all of you who contributed to the discussion.

During Analytics Camp 2010 this past weekend, I facilitated a discussion with a great group of attendees looking to create or improve their own dashboards. The question asked was:

"What should all dashboards have to make them useful & successful?"

During this discussion I added *** within my notes around these points ~
1. The big IDEA of the discussion was to marry dashboards with the community via social media tools. Allowing users to vote on indicators they found important to the business and then have that drive the main display would allow the community to collaborate on the resulting data. Adding comments would provide a mechanism for the group to quickly discuss what the issue is or what is currently being done to address the indicator's response to the data. Both of these ensure that conversations are not lost and the tool is an active part of managing by data.

2. Keeping everything green might not be the best mode of business. (You know I'm not talking environment here, but indicator traffic colors. <grin>) There should be individuals looking and reviewing even the good items occasionally to determine if the organization can make even more improvements & therefore save even more money $$$!

3. Allowing dashboards to be created and modified in an iterative fashion is probably the most effective mechanism in implementing these tools. Many businesses will not know the most useful displays, indicators, etc until they try some. Debating the requirements could take much longer than anyone would desire. Of course, having incorrect data could drive incorrect business decisions, so I'm not advocating that direction. I would recommend using measures that are already used in more detailed reports in a pilot dashboard program.

Included below are all my notes from our brainstorming session. I have grouped our freeflow list into categories. If you feel that we missed something, please include your comments & join in this discussion!


  1. Levels
    • High level vantage point or aggregation so users can determine where if anywhere they should drill through to more detail
    • Multiple levels per different levels of organization
    • Security to ensure correct access at difference levels
  2. Visualization
    • Accessible / ADA Compliant Coloring Schemes
    • Printable/Exportable
    • Mobile Viewing
  3. Measures
    • Utilize $$$ within the display to ensure action
    • Include forecasting of what is expected to happen in the near future based on trends
    • Offer what-if analysis to determine an impact of potential change, assisting management with creating the proper action
    • Have the ability to view prior trends
    • Access archival results
    • Measures must be actionable! If noone can do anything about it, why display it?
    • A measure's implementation is completely dependent on relevant data availability
  4. Usefulness
    • Multi-user comments
      • Marry with SOCIAL MEDIA!
      • Voting on items in indicator or text descriptions ~ which then drive priority for dashboard displays to others
      • Idea: Provide tag clouds on common terms w/in comments
    •  Generate email alerts for specific events ~ driving traffic to dashboard
      • Assists with culture change of utilizing dashboard
  5. Deployments
    • Customizable, easy to modify
    • Exceptions to rule, exception reporting
    • Agile deployment, allow for quick iterative development


Kernel method is a very useful technique in data mining that is applicable to any algorithms relying on inner product [1]. The key is applying appropriate kernel function to the inner product of original data space. I show here SAS/STAT+BASE example code for Kernel PCA implementation. The example used is from Wikipedia @ here. Extention to other algorithms that suitable for Kernel method, such

February 07, 2010

Whenever you try to open a SAS data set which has permanent formats, you will get the error message in the log file saying “ERROR: The format XXXX was not found or could not be loaded.” This happens generally when you don't have the format library where the SAS dataset located.

You can do this manually or programmatically.Manual Approach: Right click on the data set in the explorer and go to properties, then go to the details tab and see the sorted by values (Works only if you use Windows operating system)


Whenever you try to open a SAS data set which has permanent formats, you will get the error message in the log file saying “ERROR: The format XXXX was not found or could not be loaded.” This happens generally when you don't have the format library where the SAS dataset located. What happens here is, SAS will not permit you to open the data set without the format library. But, if you use options nofmterr; at the top of your program, SAS will opens the dataset for you despite the fact that it does not have the format library. The FMTERR\NOFMTERR option determines whether or not SAS generates an error message when a variable's format cannot be found. (source: support.sas.com) The caveat here is, you will not be able to see the formatted values for your variables, but you will be able to open or use the dataset. OPTIONS nofmterr; *you will not be able to see the formatted values for your variables; If you want to see the formatted values for the variables, you need...

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]


You can do this manually or programmatically. Manual Approach: Right click on the data set in the explorer and go to properties, then go to the details tab and see the sorted by values (Works only if you use Windows operating system) Programmatic approach: Use proc contents and create an output dataset of the contents. The sorted and sortedby columns will give us the details in about how the dataset got sorted in place. Note: proc contents will give missing values if the dataset isn’t sorted. *Example; proc sort data=sashelp.class out=class; by name sex age; run; proc contents data=class out=contents noprint;run; proc print data=contents; var memname name sorted sortedby; run; ('’)

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]

February 06, 2010

I am finally ready with my SAS dataset reader/writer for .NET. It is written in 100% managed code using .NET 3.5. The dlls can be found
here.

A sample .NET console application can be found in the program file entries after the installation.


In my view, Display Manager commands didn’t get much attention of SAS programmers as they should be. It may be because... 1) SAS Documentation has very little information about how to use this facility. 2) Even Google searches aren’t helpful enough. Here are the list of Display Manager Commands I know… Using Manager commands we can open the dataset from any directory without touching the Explorer window… dm"log; clear; out; clear;";*Clears Output and Log Window; dm "vt work.dsn" ; *Opens the dataset DSN in a View table window;dm "vt &syslast"; *Opens the dataset recently created;dm 'next viewtable:work.dsn; end;';*Closes the VT window of DSN;dm 'keydef f12 submit';*Assigns the submit command to the F12 key ; dm "keydef F2 'cle log; cle output; submit'";*Assigns clear log and output commands to F2 Key; Following DM commands will open the specified window in detail view. dm "next explorer; detail";*applies detail view to explorer window; dm "next log; detail";...

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]

February 05, 2010

The Clipper 09-10 Round the World Yacht Race, featuring UK-based SAS consultant Andy Phillips onboard the 68-foot Team Finland, continues to offer incidents and accidents. As race 5 was building to a close-fought contest between a number of yachts, one of Team Finland's rivals, Cork, struck a submerged reef in the Java Sea, leaving its crew to hurriedly launch the life rafts and paddle to the nearby small island of Gosong Mampango. Thankfully nobody was hurt. All 16 crew were subsequently evacuated to two sister yachts, Team Finland and California.

After a good deal of inspection, surveyors and salvage experts advised that the boat was so weakened by the pounding she had taken on the reef that she would not survive being dragged off the shelf, made buoyant and then completing the 450 mile tow to Singapore. Even if this were possible, the extent of the damage meant that she would be beyond economic repair. As such, Chairman of Clipper Ventures Sir Robin Knox-Johnston accepted, with very great sadness, that Cork should remain on the reef and be lost to the sea. This is the first loss that the Clipper organisation has ever experienced. Since the first race in 1996, more than 2 million miles of racing has been safely and successfully completed.

Having been savaged by Hull & Humber at the start of race 4, causing several days of delay back in port for repairs, one has to feel sorry for Cork's crew. However, all crew members have been found berths onboard other yachts, and a replacement vessel will be in-place by the start of race 9 (starting in Panama).

At the end of race 5, after making due consideration for the delays incurred by Team Finland and others who went to the aid of Cork, the Race Committee placed Team Finland fourth. As a result, Andy Phillips's yacht now stands second in the overall results - having stood at the head of the leaderboard from the very start of Clipper 09-10.

Race 6, from Singapore to Qingdao, China started 4 days ago and is due to see the leaders cross the finish line around February 20th. Let's hope for a competitive but less incident-packed race!


let the SVD of X be: X=USV' use the following SAS code: ods output EigenValues=E; ods output EigenVectors=VT; PROC PRINCOMP NOINT COV data=X noprint; var_1....var_k; RUN; in order to get the diagonal elements of singular value matrix S of SVD output, use the following formula: S_ii=sqrt(E_ii*N) where E_ii is the diagonal elements of EigenValue matrix E and N is the number of observations


In this example, we show the a clustering exercise on Optical Recognition of Handwritten Digits Data Set available @ UCI data set repository (Link). This exercise is a standard application of HOSVD by stacking 8X8 matrix of digitalized bitmap of each letter where 1-mode is digits for each column and 2-mode corresponds to rows while 3-mode is each letter object, a standard Tensor representation. 

Have you ever asked to create a numeric ID variable with leading zeros? I mean 1 to 001 and 2 to 002. Even though there are many different ways to do this, here I will present one simple way.SAS has a special numeric format Zw.d. which can include leading zeros. Here is the example which shows how it works.

Often times SAS® programmers need to retain the value of a variable in the current observation to the value of the same variable in the next observation. The LAG function is a very useful tool for this purpose. A LAGn (n=1-100) function returns the value of the nth previous execution of the function.


Have you ever asked to create a numeric ID variable with leading zeros? I mean 1 to 001 and 2 to 002. Even though there are many different ways to do this, here I will present one simple way. SAS has a special numeric format Zw.d. which can include leading zeros. Here is the example which shows how it works. data leadzero; input ID NAME $ SEX $ AGE HEIGHT WEIGHT; datalines; 1 Alfred M 14 69 112.5 2 Alice F 13 56.5 84 3 Barbara F 13 65.3 98 4 Carol F 14 62.8 102.5 5 Henry M 14 63.5 102.5 6 James M 12 57.3 83 7 Jane F 12 59.8 84.5 ; run; data leadzero; set leadzero; id_num = put(id,z4.); run; *z4. format is used so 3 leading zeros will be added if the ID length is 1; ('’)

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]

February 04, 2010

Labels: DD-MM-YYYY, How to extract year information from three formats of dates in a dataset?, Imputed Dates, MM-YYYY, YYYY The LINKEDIN SAS Professionals group had a question on How to extract year information from three formats of dates in a dataset? (IMPUTED Dates). There were a number of good suggestions submitted.


The LINKEDIN SAS Professionals group had a question on How to extract year information from three formats of dates in a dataset? (IMPUTED Dates). There were a number of good suggestions submitted. Here is a summary of the suggestions: **************************************************************; *Method1: Using scan or substr with length functions; data temp; infile datalines; input date:$10.; datalines; 1998 2008 01-1998 01-2008 01-01-1998 01-01-2008 ; run; data temp; set temp; if length(date) = 4 then year = input(substr(date,1,4),best32.); else if length(date) = 7 then year = input(substr(date,4,4),best32.); else if length(date) = 10 then year = input(substr(date,7,4),best32.); run; *(or); data temp; set temp; if length(date) = 4 then year = input(date,best.); else if length(date) = 7 then year = input(scan(date,2,'-'),best.); else if length(date) = 10 then year = input(scan(date,3,'-'),best.); run;...

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]

You’d have to be living in the most desolate areas of the globe not to feel the onslaught of information clamoring for our attention. In the midst of it all, there are pieces that inform and enlighten, but most information is just noise. Those of us who spend time on our computers from 5 p.m. to 9 p.m. are able to take advantage of some very savvy, user-friendly, semi-efficient ways to cut through the clutter.

For example, I love Google Finance’s interactive stock charts and Bestiario’s Videosphere project, showing the semantic relationship between videos. But from 9 a.m. to 5 p.m., many of us are saddled with applications that were developed for much smaller volumes of data and that take much longer to respond to our information needs.

Most information research states that data is doubling every 18 months. Some say even faster. Regardless of the actual timeframe, data is growing faster than it can be consumed and digested. Companies and individuals are constantly being challenged to make decisions more quickly with less time to analyze the data that is required as supporting evidence for their conclusions and proposed courses of action.

What makes 5-9 applications so much more appealing? Much of it is their ability to present information visually and use graphs that the viewer can interact with. But the 9-5 world is catching up. It’s moved from data-heavy reports, to static graphs, to dashboard widgets, to today’s interactive graphs (many of them copied from the Internet).

It’s been well documented that one of the best ways to understand data, especially lots of it, is through good data visualization. Since most of us at JMP spend lots of time on our computers (much to the displeasure of our significant others), we are constantly on the lookout for better ways to present information to our users in an interactive, intuitive, visual way. We know we can learn a lot from 5-9 applications. What are your favorite 5-9 applications and why?

Here is the simple code which allows us to save the log file in the required location.Use Proc printto procedure to save or print the log file.

Today I will tell you how easy it is to find ERROR/WARNING/UNINITIALIZED messages inside the LOG window. What you need to do is….Create a list of shortcut keys inside your SAS Toolbar for the LOG window. This is quick and also saves time.


Today I will tell you how easy it is to find ERROR/WARNING/UNINITIALIZED messages inside the LOG window. What you need to do is….Create a list of shortcut keys inside your SAS Toolbar for the LOG window. This is quick and also saves time. You  can customize toolbar settings using the Customize tools dialog box. You can open the Customize tools dialog box by 1) Enter TOOLEDIT (one word) in the command bar or 2) Go to, TOOLS ________ CUSTOMIZE or 3) Right click on the Toolbar and select CUSTOMIZE. Adding a Tool to the Toolbar To add a tool to the toolbar, just click the Add tool button to add a blank tool to the toolbar list. Enter a SAS command in the Command box as mentioned below. You can also add icon to the SAS command as I did below. Adding these7 icons means you can look for ERROR/WARNING/UNINITIALIZED messages and the two arrow signs will help to find the next or previous ERROR/WARNING/UNINITIALIZED messages.

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]


To inverse a matrix in SAS, people usually resort to SAS/IML. But many analystsin the industries don't have access to IML because their bosses want to save money. However, almost all SAS analysts have ready access to SAS/STAT. Without doubt, matrix inversion without IML is one of the top listed questions on SAS-L. Here, I demonstrate how to inverse a matrix using SAS/STAT. There are several


Gap statistic is a method used to estimate the most possible number of clusters in a partition clustering, noticeablly k-means clustering. This measurement was originated by Trevor Hastie, Robert Tibshirani, and Guenther Walther, all from Standford University. A reference paper can be found at: R. Tibshirani, G. Walther, and T. Hastie. 2001. Estimating the number of clusters in a dataset via

suppose i’ve got some odd character dates defined as follows:

d1='29-31 DEC 2009,04-05 JAN 2010'; *broken across month, year;
d2='04-05,08-09,11-12 FEB 2010'; *broken within month;
d3='01-03 MAR 2010'; *contiguous within month;
d4='31 MAR-01 APR 2010'; *contiguous across month;
d5='29 DEC 2009-01 JAN 2010'; *contiguous across year;
d6='01 FEB 2010'; *single day;


trying to query based on dates would be next to impossible. so, prx those things.



Original: >>>29-31 DEC 2009,04-05 JAN 2010<<<
sd=29 sm=DEC sy=2009 sdate=29DEC2009
ed=05 em=JAN ey=2010 edate=05JAN2010

Original: >>>04-05,08-09,11-12 FEB 2010<<<
sd=04 sm=FEB sy=2010 sdate=04FEB2010
ed=12 em=FEB ey=2010 edate=12FEB2010

Original: >>>01-03 MAR 2010<<<
sd=01 sm=MAR sy=2010 sdate=01MAR2010
ed=03 em=MAR ey=2010 edate=03MAR2010

Original: >>>31 MAR-01 APR 2010<<<
sd=31 sm=MAR sy=2010 sdate=31MAR2010
ed=01 em=APR ey=2010 edate=01APR2010

Original: >>>29 DEC 2009-01 JAN 2010<<<
sd=29 sm=DEC sy=2009 sdate=29DEC2009
ed=01 em=JAN ey=2010 edate=01JAN2010

Original: >>>01 FEB 2010<<<
sd= sm= sy= sdate=01FEB2010
ed= em= ey= edate=01FEB2010


full code lives here.

February 02, 2010


Accurately Calculate AUC (Area Under the Curve) in SAS for a binary classifier rank ordered data In order to calculate AUC for a given SAS data set that is already rank ordered by a binary classifier (such as linear logistic regression), where we have the binary outcome Y and rank order measurement P_0 or P_1 (for class 0 and 1 respectively), we can use PROC NPAR1WAY to obtain Wilcoxon Rank Sum


Tensor, a math concept for high order array, is a very useful tool in modern data mining applications. HOSVD, the counter part of SVD in higher order array, is at the heart of modern applications, such as face recognition and clustering, segmentation of multi-dimensional transaction data, etc. 【"Matrix Method in Data Mining and Pattern Recognition" by Lars Eldén】 But implementation of math


In some predictive modelling projects, we may have variables that most of the observations have the same value, while the small percentage rest ones are populated with meaningful values. For example, 90% observations have values=0 but the rest 10% have value=1, 2, 3.....for a variable called WebHits, etc. We may have a large number of such variables, say web hits at different pages, and due to

So hows that for a double entendre?

One of the biggest challenges traditional SAS users have trouble with is the fact that WORK does not work the same way in BI/Web applications. Such SAS users are used to the work library containing all sorts of data sets and catalogs that are easily accessible later in the program (either in an interactive SAS session or a long batch job). Most BI tasks however are (and should be) structured as a series of short tasks. This is exactly where some SAS users get tripped up when building BI/Web based reports - each request starts with a brand new clean/emply WORK library. This is because the Web is a stateless environment – each request knows nothing about any prior request. This creates a simple environment in that each request is completely self contained. However, quite often it is desirable and necessary to maintain certain information from one request to the next (i.e. do what the WORK library does). This is known as maintaining state.

The good news is that that there are a number of ways to deal with this. The Stored Process Server (and the SAS/IntrNet Application Dispatcher) support the creation and use of Sessions as a way to share data and macro variables from one request to the next. A Session is simply the saving of data and parameters from one request to the next. Sessions provide the ability to save library members (data sets and catalogs) and macro variables. The program explicitly specifies what to save and restore.

The mechanics of using Sessions are straightforward. Once a session has been created, a library named SAVE is created. By creating or copying data sets and catalogs to this library, the user program can rely on them being there the next time a request is made that uses this session. For global macro variables, at the completion of any request that uses sessions, all the global macro variables whose name begin with SAVE_ are saved. At the beginning of any future request, they are restored.

And sometimes SAS will create sessions automatically. For example, when ODS is used to create an HTML page containing a graph, ODS will create a lightweight session. Since a single web request can only return one file, ODS will create the additional files and save them in SAS catalog entries. For an HTML page with an embedded graph, the graph (or graphs) will be created and saved as a graphic (e.g., gif or jpeg) entry in a catalog. For a frame, the table of contents file and the body file are saved in html entries in such a catalog. These entries are returned to the user’s browser by a link that reconnects to the session to replay the entry.

Check back later for some examples on creating and using sessions:

  • the mechanics of creation a session
  • using lightweight sessions so you can generate, for example, HTML, PDF and RTF output in one request
  • creating a session to extract/summarize data once so multiple reports can be generated from it.

February 01, 2010

In his January 28 Webcast on Advanced Tips and Tricks for Maximizing Your Efficiency, Sam Gardner demonstrated three keyboard shortcuts that are useful for applying commands or options to graphical reports.

1. CTRL/Click applies one red triangle hot spot command or option to all applicable graphical reports in an analysis window. For example, in a display of multiple Distributions, you may want to see the Normal Quantile Plot for all continuous variables. Simply hold down the Control key while you click the red hot spot and select Normal Quantile Plot. JMP is smart enough to identify the Distributions for the continuous variables and display the Normal Quantile Plot for each.



2. ALT/Click applies multiple red triangle hot spot commands or options to one graphical report. For example, you may want to Fit Line, display Histogram Borders and Fit Mean for a scatterplot. Simply hold down the Alt key while you click the hot spot and then JMP displays a checklist of options. Check the options you want, click OK, and JMP applies all the selected options to the graphical report .



3. Want to combine both shortcuts? CTRL/ALT/Click applies multiple commands or options to all applicable graphical reports in an analysis window.

Want more tips? A recorded demo that includes other advanced tips and tricks for using JMP is available for download. .

Want to join a live Webcast? Webcasts offering demos and helpful tips for JMP users are held at least once per week into May. Register for as many as you like.

‫אני חושב שמתישהו באמצע כיתה ב הבנתי כי לכל מספר שיעלה בדעתי ניתן למצוא מספר גדול יותר. אם אני חושב על מאה, אז יש מספר גדול יותר ממאה: מאה ואחד. מליון – מליון ואחד. פטרוזיליה – פטרוזיליה ואחד. כיום אני יודע שפשוט תפסתי אז באופן טבעי את מהות האקסיומה השניה של פיאנו (מתוך 5 אקסיומות [...]‬

January 31, 2010

Andy wrote in the NOTE: blog that he's stumbled upon a few custom tasks to extend features in SAS Enterprise Guide. He wondered aloud whether there might be more "free" tasks out there, available for use by anyone who can find them and download them.

The answer is: You betcha!*

Custom tasks provide a great way to fill a gap: to provide a feature that isn't built into SAS Enterprise Guide, to facilitate a process that is unique to your industry or organization, or to help create a bridge to SAS from a data source or process that isn't easy to get to from the SAS programming language. It's also a useful mechanism for us at SAS to provide early access to a feature that we're considering for the production version of SAS Enterprise Guide.

Here is a list of several useful tasks, some of which have been featured on this blog before. Some of the tasks were built for SAS Enterprise Guide 4.1, but they should run without changes in the 4.2 version. However, the tasks that were designed for the 4.2 version require 4.2 or later -- they won't work in 4.1.

  • Create a SAS format from a data set (4.2 only): lets you use the contents of a data set (from discrete values/labels or range values) to create a format.
  • Build a Top N report (4.1 or later): lets you click a few buttons to create a useful report to measure the top -- whatever it use you need to measure.
  • Run PROCs on your Facebook friends (4.2 only): this one is for fun, but it's an example of how a custom task can communicate with an external source (Facebook) and bring the information into SAS.
  • Import SPSS, JMP, and Stata files into SAS (4.1 only): this is an example of a useful feature that we previewed with SAS Enterprise Guide 4.1 in a custom task, and it's built into 4.2.
  • Open Data in JMP (4.1 or later): If you have SAS Enterprise Guide and JMP installed, this task provides a simple way to launch JMP with a selected data set -- even if the data is on a remote SAS server or database library.
  • Create KPI dashboards (4.2 only): created by Bill Sawyer in SAS Technical Support, this task drives the new GKPI procedure to create interesting indicators for use in your reports.
Custom tasks are simple to download and deploy. Check this sasCommunity.org article for some instructions.

* Translation: yes, there are more free tasks.

I have a huge SAS program that isn't working. The results I get are not right but there are no errors or warnings in the SAS log. How can I figure out where I went wrong?

CAT/CATS/CATT/CATX Functions. These Functions and Call Routines can be used to join two or more strings together.Even though we can use the concatenation operator in combination with the STRIP,TRIM/LEFT functions, these functions make it much easier to put strings together and if you wish, to place one or more separator characters between strings.

Have you been asked how to convert missing values for all the variables into zeros..... if you are.... here is the answer for that.....In this example the I have used array to do the same.

January 30, 2010

Standardising the Clinical Data Exchange

Because.. the most common mistake is that.. We assume that macro variable resolution process proceeds from right to left...Infact it isn't. The bottom line is ... For some reason.... the &&&dsn&n is taken as && &dsn&n, which resolves to &client05. This macro variable will not get resolved, because &client5.

There are a number of options available for the INFILE statement. Below you will find discussion of the following options: DLM='character', DSD, MISSOVER, and FIRSTOBS=value.

January 29, 2010

PROC TRANSPOSE: How to Convert Variables(columns) into Observations(ROWS) and Observations(ROWS) into Variables(Columns)

January 28, 2010

Out of the box, no members of the SASUSERS group are allowed to create OLAP Cubes. In 9.2, the error message states "The security package failed while authenticating a user", included below is a screen shot.



The fastest way to remediate this is to allow the SASUSERS group to have WriteMetadata Access on the OLAP Schema. However, I could also recommend that you estabilish an 'OLAP Creator' group and gave that ReadMetadata and WriteMetadata access (so a smaller trained group of OLAP developers would have the ability to create & update Cubes).

Either way, to update the security on the OLAP Schema, in SAS Management Console move to Authorization Manager -> Resource Management -> By Type -> OLAPSchema and Select Properties on the appropriate lowest level listing - by default in 9.2 it's: 'SASApp - OLAP Schema'.

January 27, 2010

‫רשימה זו נכתבת בעקבות דיון שנערך ביני ובין ידידי משכבר הימים ראובן בתגובות לרשימה "מה רע בקצת סטרואידים". (ניתן לראות כאן את שרשור התגובות, שהתחיל הקורא סמיילי). הדיון התנהל סביב הטענה שהעליתי, לפיה חוקים נוצרים כדי להמנע מנקודת שיווי המשקל של דילמת האסירים. ראובן אמר כי לדעתו זו גישה נאיבית, ולכן אנסה להסביר ברשימה הזו באופן [...]‬

Convert variable values using either the INPUT or PUT function. Convert a character value to a numeric value by using the INPUT function. Specify a numeric informat that best describes how to Read the data value into the numeric variable. When changing types a new variable name is required.

Have you discovered SAS V9.1's new IF functions yet: IFC and IFN? They don't provide any functionality that you couldn't achieve in V8 (nor V6 nor V5 nor...) but they could make a big change to your SAS coding style.

Consider the following piece of DATA step code (please excuse the sexual stereotyping!):

if gender eq 'M' then hobby='football'; /* Male */
else hobby='knitting';
/* Female */

You will be aware that missing input data (or an earlier code/logic error) will result in the code erroneously following the female route. You might resolve this issue by coding the following:

select (gender);
  when ('M') hobby='football';
  when ('F') hobby='knitting';
  otherwise hobby='data err';
end;

The SELECT statement offers a more robust coding solution, but our code is beginning to get a bit verbose. The IFC and IFN functions resolve this issue by offering three potential outcomes, viz:

hobby = ifc(gender eq 'M','football','knitting','data err');

Do you see how your coding style might be changed significantly? I think the IFC and IFN functions might easily be dismissed as insignificant additions to the SAS language, but when we look a little deeper we see that they are really rather significant.

Check-out the V9.2 documentation for the low-down, such as default lengths for the return values.

USING THE SCAN FUNCTION:SCAN(string,n,delimiters): returns the nth word from the character string string, where words are delimited by the characters in delimiters. It is used to extract words from a character value when the relative order of words is known, but their starting positions are not.

Subtitle: An accounting of accounts which, by all accounts, you should hold accountable.

With the introduction of the SAS metadata server for use with SAS Business Intelligence, the role of the SAS administrator became more important. In SAS 9.1.3 a number of new standard "accounts" -- user IDs that serve a specific purpose -- were introduced. You had to have these accounts in place before installing and configuring SAS. In SAS 9.2, the process for creating and using these accounts was simplified quite a bit.

Don Henderson created a tip for sasCommunity.org as a caution against overusing the "sasdemo" account. The SAS deployment process encourages the use of sasdemo as a way to exercise your environment the way an end user would, without having to designate a specific "guinea pig user" in your organization. But the use of sasdemo should not go beyond that. Don warns against the practice of using sasdemo to create/maintain production-level content. For real work, you should use real accounts that are backed up by real people.

JMP benefits heavily from the pictorial superiority effect. To save you a moment of searching in Wikipedia, here's what that is: "According to the picture superiority effect, concepts are much more likely to be remembered experientially if they are presented as pictures rather than as words." I was reminded of this effect when a friend sent me a link to Alex Lundy's talk on "Chart Wars: The Political Power of Data Visualization." It is a rather humorous (although please be warned that the video contains some expletives) and educational talk espousing the benefits of tools such as JMP.

In the video, Lundy displays the names of a few of the experts in the field of data visualization. We pay a lot of attention to these data viz gurus at JMP; specifically, you may have noticed that we frequently invite Stephen Few to speak at JMP events. I would add Ben Schneiderman to that list, and there is certainly a more comprehensive list to compile.

Lundy also shows a quote from H.G. Wells -- "statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write" -- and adds the word "visual." Add the word "dynamic" to this statement, and now you are describing a state-of-the-art statistical tool called JMP. To demonstrate this, let's use his example of published White House staff salaries. You can get the data set yourself.

I was able to generate the same graphic Lundy shows and distribute it in a few minutes with JMP. Using a combination of the Distribution platform with the Data Filter and Tabulate, I quickly gleaned more information about the sum and distributions of salaries of different subsets of the White House staff.



For example, the combined salaries of staff with the status "Detailee" is $3,966,222.00, and they have significantly higher mean salaries ($123,949.19)...



...than the rest of the White House staff ($77,320.55), who have a combined salary of $35,180,848.00.



You can easily get this information in JMP, and it is not trivial to do this with other software. This is why I believe JMP is the best statistically visual software product on the market.

The 4.2 version of SAS Information Map Studio provides data managers with the ability to create a dynamic filter based on dates. This is extremely valuable for things such as daily reports.
Included below is an example of a new filter that is built off a date formatted data item.
You are then allowed to select a parameterized date, such as Today, Tomorrow, or Yesterday.

Also note that I also circled the 'Hide from user' box. If you don't select this item, the WRS authors have the ability to choose whether or not to include this filter. However if you DO select this, ALL data out of this information map will be filtered accordingly. (In this situation, I always modify the name of the information map to make it apparent that a filter occurs.)


I was interested to see that FORTUNE has ranked SAS as number one best company to work for in America. FORTUNE lists child care, health insurance, and the Cary campus's medical centre & fitness centre as just some of the reasons for SAS's top result. For those of us that don't work for SAS this is still good news because we can rest assured that new versions of the software are being developed by happy and motivated staff. As Mikael Hagström (Executive Vice President, EMEA and Asia Pacific) says in his blog, SAS's focus on its staff ensures that it is able to "continue to attract top talent and domain experts".

Also on the radar for those who have invested in SAS software: SAS have reported revenue growth of 2.2% (to $2.31 billion) in 2009. Whilst this is a lot less than the fabled "double digit growth" that SAS have achieved for most of their 34 year history, it's still a good performance in the current market conditions. Indeed, in some markets, such as retail, double-digit growth has actually been achieved. However, a note of caution is required. The Bankers' Mantra says "turnover is vanity, profit is sanity". I've not seen any published profit figures (as a private company SAS are not obliged to publish any), so we can't be sure at what cost the increased revenue was produced.

"We live in a global economy," said SAS Senior Vice President Jim Davis. "Not everyone is turning the corner at the same time. In 2010, we need to realise where the opportunities lie and how to approach those opportunities based on whether that particular region is seeing the light at the end of the tunnel."

Importing Excel/DBF Files to SAS Datasets Using DDE Labels: DDE, Dynamic Data Exchange, Importing DBF files to SAS Datasets, Importing excel files into SAS

January 26, 2010

The fourth seminar in SAS's SAS Talks series was held last week. I need to get myself onto the right distribution list because I didn't find out about it until after the event :(

This event was lead by SAS's Vince DelGobbo and focused on providing an introduction to creating multi-sheet Microsoft Excel workbooks with SAS. It's a topic that Vince has presented at various forums before, e.g. PharmaSUG 2007, SUGI 31 and NESUG 09, but still contemporary nonetheless.


Vince described how to use the ExcelXP ODS tagset to create multi-sheet Microsoft Excel workbooks using almost any SAS procedure and incorporating Excel features and functionality such as worksheet names, headers & footers and AutoFilters. All of this is available with just SAS/BASE (on any operating system, not just Windows).

You can find more information at http://www.sas.com/reg/gen/corp/867226. Hopefully a video will be available soon. In the meantime, check-out Vince's SUGI 31 paper.


Importing Excel Files Into SAS Using DDE What in the world is DDE and why use it? Good question... DDE is “Dynamic Data Exchange” and is a method of accessing data from one MS-Windows application by another. As a SAS user, you can use DDE within a DATA step to import data into SAS and export data from SAS. Using DDE involves SAS code and statements that other MS-Windows applications understand. But, why bother with DDE when other methods are available? What about the IMPORT and EXPORT procedures? Or, Open Database Connectivity (ODBC)? How about SAS Enterprise Guide? These alternatives are useful under the right conditions. For example, the PROC IMPORT and PROC EXPORT are simple to use, but are limited in the way you can define your data and these procedures require you license SAS/ACCESS For PC Formats. Bummer. ODBC is also simple to use, but importing data into SAS using ODBC requires you license SAS/ACCESS For ODBC. Bummer again. And, the new Excel LIBNAME engine? Oh, that also...

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]

SAS Interview Questions: General(Part-2) Labels: Introduction to SAS Informats and Formats, SAS Interview Questions

It’s that time of year for academics, the start of a new semester. The JMP academics team is busy preparing for the academic Webcasts we deliver at the beginning of each semester. We thought it would be interesting to share with you our results from our fall Webcasts. We delivered four sessions called “JMP Basics for Professors and Students,” and 148 people attended with a 98% satisfaction rating.

The 148 people were made up of 28% each graduate students, research faculty and teaching faculty.



If we combine the categories for students and teachers, we can see that 90% of our attendees were students, teaching faculty and research faculty.



Somewhat surprising to us was that the No. 1 discipline among attendees was life, health or physical sciences, followed by general and social or behavioral sciences. These three groups made up 77% of our audience. In academics, we have traditionally had many users in the business and economics group as well as the engineering or computer science group, so we were happy to see so much interest from people in the first three groups as well.



Other fun facts about our attendees include that 28% were Mac users and 72% were Windows users. About 70% said they had little JMP experience. And we had attendees from 32 states.

Starting on January 28, we will be hosting the same Webcast as in the fall, “JMP Basics for Professors and Students,” as well as a new Webcast called “Teaching Statistical Concepts with JMP.”

“JMP Basics for Professors and Students” will be offered on Jan. 28 and Feb. 9, 12, 15, 19 and 24 at a variety of times. This Webcast covers an overview of basics for teaching or using JMP for elementary and intermediate statistics courses.

“Teaching Statistical Concepts with JMP” will be offered on Feb. 10 and 23 at 1:00 p.m. EST. This Webcast covers the resources available to help teach statistics using JMP.

You can register to attend one or both of these. Hope you’ll join us and spread the word!

The SAS 9.2 Stored Process Server includes an option to specify the name of a SAS program that is to be run before any request - this is conceptually similar to the autoexec facility that has existed in SAS back to it's Jurassic days.

And, as an added bonus, there is also an option to specify the name of a program to run after each stored process. This facility provides the facility to do many of the functions described in Chapter 9 of my SAS Press Book, Building Web Applications with SAS/IntrNet®: A Guide to the Application Dispatcher, such as defining the list of available libraries based on who the user is, etc.

NOTE: Do NOT try the examples that utilize ENDSAS; as that statement does not do the same thing in the Stored Process Server as it did in the SAS/IntrNet Application Dispatcher where it simply ended the current request. In the Stored Process Server it will shut down the Stored Process Server itself.

Just follow these simple instructions to specify the program to run before/after each Stored Process.

January 25, 2010

Labels: Guidelines for Coding of SAS® Programs, SAS ® PROGRAM EFFICIENCY FOR BEGINNERS, SAS® Programming Guidelines

I saw an interesting question posted to Twitter by @sastips. She asked experienced SAS programmers what advice they would give to new or junior SAS programmers. New user advice is always needed. On support.sas.com, we are frequently asked for advice on how to get started, so this question peaked my interest. I may have missed a few responses, but I managed to catch a few of them and post them here for you.


@SASJedi: Get certified!

@Jeff_Bailey: learn PROC SQL.if they deal w/ databases this is priceless: OPTIONS SASTRACE=',,,d' SASTRACELOC=saslog NOSTSUFFIX;

@gtcox76: my tip to jr programmer. learn the data. You can write great code but if logically wrong it's bad code.

@SSessoms: My advice to a jr SAS prgmr? Read "The Little SAS Book." It's a wealth of info! http://tinyurl.com/yaat224

@sasbi: My advice to a jr SAS pgmr? ALWAYS check the log first!

@sastip: just because it runs doesn't mean it's right.

These tips are interesting and useful. Let's continue this topic and offer more help to those just starting to learn SAS. If you have a tip, post your advice in the comments for this blog post.
What advice would you give?

My advice is more generic than the tips offered on Twitter. I suggest using the free resources available to you.

Two great resources are support.sas.com and sasCommunity.org. Both of these sites contain so much information. I recommend:
  • starting with a few sample programs from the support.sas.com samples database or the Tip of the Day on sasCommunity.org

  • signing-up for training or tutorials from SAS Training

  • Finally, leverage the knowledge of the user community. Take advantage of the discussion forums, sasCommunity.org, SAS users on Twitter, and your co-workers.


And one last thing for junior and senior SAS programmers. I saw this great thread on SAS-L Some Tips and Tricks from SAS-L. It is loaded with years of sage advice on all sorts of SAS topics.


This paper presents a set of programming guidelines and conventions that can be considered in developing code to ensure that it is clear, efficient, transferable and maintainable. These are not hard and fast rules but they are generally accepted good programming practices. These techniques can be used in all types of SAS® programs, on all platforms, by beginners and experts alike. SAS ® PROGRAM EFFICIENCY FOR BEGINNERS Guidelines for Coding of SAS® Programs

[[ This is a content summary only. Visit my website for full links, other content, and more! ]]

Conditional formatting for numeric values is pretty simple for users in Web Report Studio, but how about colors on categorical values - such as customer name, product type, issue severity etc. (This was requested from a WRS user in the SAS Discussion Forums)

Turns out this can be done, however it's a little tricky.
1. You will need to define the values within the data table itself.
Such as changing Region = 'Canada' to Region = <font color=RED>Canada</font>
Note & possible best practice: This could be done in a reference table and joined to the actual table within the relationship tab of Information Map Studio.

2. In Information Map Studio, select the data item's properties, move to the 'Classification, Aggregations and Formats' tab and check the box next to 'Display as a hyperlink'. This tells SAS to allow all the tags to go through to the displayed WRS report.



Here is an example of the final results:


January 22, 2010

During yesterday’s Mastering JMP live Webcast on Shaping Your Data, one of the JMP capabilities presenter Jeff Hawes demonstrated was Tabulate (Tables>Tabulate). Here is a tidbit that folks found interesting.

Jeff: Tabulate is one of the data management tools in JMP. It is akin to a Pivot Table in Excel, but more interactive. By dragging and dropping into zones on a template, you create a multidimensional table.



Question: What is the advantage of using Tabulate instead of Summary (Tables>Summary)?

Jeff: There are two main differences. First, Summary only summarizes for rows - and the table is made up of records in rows with single-header columns. With Tabulate, you can group your rows and columns into subcategories and nest factors within multiple categories. For example, you can build subcategories like vehicle size for each car brand.

Second, Summary provides a data table, whereas Tabulate is a an interactive report that responds to dynamic, animated data filtering. In this example I apply the data filter to highway mileage.



Want to learn more?

• View a recorded demo on shaping your data.
• Register for any of our weekly live Webcasts in the Mastering JMP series.
• If you have JMP and want to play with Tabulate using the same data Jeff used, get Cars 1993.jmp from your JMP sample directory.

We received several comments noting extremley slow loading pages, especially in the search and browse areas. I checked on the pages and the slowness seemed random and impossible to reproduce. Then I remembered that the world just found out that SAS is the #1 place to work according to Fortune Magazine. I know our Web sites saw a lot more traffic yesterday than usual. I jumped to ideas about pipes and traffic and such. Luckily, our Webmasters are always watching and checking and tweaking, so they were keeping an eye on it. Everything is good.

This gives me a chance to say that if you experience this or any other issues with support.sas.com, post a comment here or use this feedback form to let us know about your experience. We read every comment and do our best to correct and respond.

[I'm putting this topic in the Q&A topic category because I am responding to feedback received in our feedback forms. I hope that Customer Support Site users are starting to look here for answers to questions and comments that are submitted about support.sas.com.]


If you find yourself thinking "what search and browse pages?", read on.
We have a few sections of the site where the pages are tagged with metadata (words that we think you use to describe the topics within those pages). You can drill through the topic hierarchy to narrow the resulting document list to only those pages that were tagged with a specific topic value (browse). Searching is still a powerful way to locate content, so you can also use the search box to search the content in the same way that you search any part of support.sas.com.

Topic browse is available for