As the first step in the decommissioning of sasCommunity.org the site has been converted to read-only mode.
Here are some tips for How to share your SAS knowledge with your professional network.
Importing CSV Files with All Character Data
As part of the decommissioning effort for sasCommunity.org this article/tip has been migrated to communities.sas.com.
When using PROC IMPORT to read CSV files, the SCANTYPE=MIXED option is not available. This means that we do not have an option to force variables to be character. The GUESSINGROWS option can be used, but it can be costly for large files, or potentially not sufficient. Although it does require two passes of the data, the following technique can be used to force all columns to be character when using PROC IMPORT on a CSV file.
The technique works by inserting an asterisk ( * ) in each data column, before the file is seen by PROC IMPORT. A potential side benefit is that this allows us to use a smaller value for the GUESSINGROWS option, however remember that the variable length is determined during the scanning process and also depends on GUESSINGROWS.
Insert a row of asterisks, one for each incoming column, into the data steam. This requires a first pass of the data.
We read the first row which contains the names of the variables. These are counted and a series of asterisks are written on the first data line following the variable names. The asterisk forces PROC IMPORT to see each column as a character variable.
* HOLDIT is a temporary location; filename holdit temp lrecl=32000; * Point to the csv file; filename rawcsv "YOUR PATH.csv" lrecl=32000; data _null_; file holdit; infile rawcsv end=done; * Read the first observation (variable names); input; * Write the var names; put _infile_; * Count the variables (the names are comma separated); wcount = countw(_infile_,','); * Write an asterisk for each variable - forces each to be character; * but will not set the var length; * string = catt('*',repeat(',*',wcount-2)); put string; * Read and write the data portion of the raw data; do until(done); input; ********** * Pre processing of the data could be done here **********; * Write the record; put _infile_; end; stop; run;
Reading the Data
PROC IMPORT is then used to import the modified CSV file into a SAS data set. The GUESSINGROWS option can be set to a small number, and the DATAROW=2 is used to make sure that the row of asterisks is included in the read, but not the variable names.
* Read the Altered CSV file into a SAS dataset; PROC IMPORT OUT= fromcsv DATAFILE= holdit DBMS=CSV REPLACE; guessingrows=4; GETNAMES=YES; DATAROW=2; run; * Clear the filerefs for the next data set; filename holdit; filename rawcsv;
Removing the Asterisk
The first time this new data set is used the asterisks can be removed using the FIRSTOBS option.
data noasterisk; set fromcsv(firstobs=2); * Doing other stuff here.; run;