If you are familiar with the output delivery system (ODS), then you know that you can modify the tables and graphs that analytical procedures display by modifying table and graph templates. Perhaps less familiar is the fact that you can also modify dynamic variables. Procedures use dynamic variables to control […]Read More
In my quest for interesting data to graph, I found some Drug Enforcement Administration (DEA) data on US domestic cannabis eradication. Does the data say anything interesting? Read on to find out! ... While doing some searches for other data, I happened across a table on the DEA website titled […]Read More
Encryption and SAS is a wide ranging topic – so wide it gets its own book and features strongly in both the SAS(R) 9.4 Intelligence Platform: Security Administration Guide, Second Edition and SAS(R) 9.4 Intelligence Platform: Middle-Tier Administration Guide, Third Edition. In this blog we’ll take a high level look at […]Read More
Last week, SAS released the 14.1 version of its analytics products, which are shipped as part of the third maintenance release of 9.4. If you run SAS/IML programs from a 64-bit Windows PC, you might be interested to know that you can now create matrices with about 231 ≈ 2 […]Read More
SAS 9.4 Maintenance release 3 was released on July 14. The ODS Graphics procedures include many important, useful and cool features in this release, some that have been requested by you for a while. In the next few articles, I will cover some of these features. Last time I covered […]Read More
There's been quite a bit of controversy about the number of undocumented immigrants in the US lately - for example, Ann Coulter claims that number is 30 million, whereas others claim it's about 11 million (readers of my blog are data-savvy, and would dig into the details of such claims, […]Read More
- The labs usually take long time to finish, say 8-10 hours. If the host machine is closed, the RDDs will be lost and the pipeline has to be run again.
- Some RDD operations take a lot computation/communication powers, such as
distinct. Many of my 50k classmates complained about the waiting time. And my most used laptop is a Chromebook and doesn’t even have options to install Virtual Box.
I found that a Linux box with 1 GB memory and 1 CPU at DigitalOcean that costs 10 dollars a month will handle most labs fairly easy with IPython and Spark. A 2 GB memory and 2 CPU droplet will be ideal since it is the minimal requirement for a simulated cluster. It costs 20 dollars a month, but is still much cheaper than the cost to earn the big data certificate that is $100 (50 for each). I just need to write Python scripts to install IPython notebook with SSL, and download Spark and the course materials.
- The deployment pipeline is also at GitHub
I returned to work from a 2+ week vacation this morning. When I fired up SAS Enterprise Guide (as I do each work day and occasionally on weekends), I was greeted with this message: As a SAS insider, I knew this was coming. It's a new feature that was added […]Read More