Now You See It, Now You Don't -- Using SAS to De-Identify Data to Support Clinical Trial Data Transparency

From sasCommunity
Dave Handelsman


Research Triangle Park, North Carolina


Clinical trial data transparency initiatives will succeed only if the shared data is properly de-identified to protect patient confidentiality and comply with national regulations, while still supporting investigation and analysis. In this emerging area, however, the rules for de-identifying clinical trial data can be confusing and open to interpretation.

At a basic level, de-identification means that someone accessing trial data should not be able to match an individual patient's data to a real-life individual. This means not only obfuscating patient ID information (patient number and site number, for example), but also masking all dates, eliminating references to sensitive terms like "HIV", and applying a wide variety of additional, often confusing, rules. All of these data modifications must be done in such a way that the clinical trial data can still be successfully analyzed on its own, or when combined with additional trial data. To further complicate matters, this additional trial data may frequently be provided by multiple biopharmaceutical companies.
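The three basic operations named above (recoding patient identifiers, masking dates, and removing sensitive terms) can be illustrated with a minimal sketch. It is written in Python rather than SAS purely for illustration and is not taken from the paper. The field names, the `SECRET_SALT` key, the term list, and the per-patient date-shifting scheme are all assumptions. Shifting every date for a patient by the same offset is one common way to mask dates while preserving the intervals between visits; a real de-identification strategy would involve many more rules.

```python
import hashlib
import re
from datetime import date, timedelta

# Hypothetical secret key; in practice it would be stored securely and never shared.
SECRET_SALT = "replace-with-a-secret-value"

# Illustrative list of sensitive terms; "HIV" is the example given in the text.
SENSITIVE_TERMS = ["HIV"]
_TERM_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in SENSITIVE_TERMS) + r")\b",
    re.IGNORECASE,
)


def _digest(patient_id: str, site_id: str) -> bytes:
    """Keyed hash of the original identifiers (assumed scheme)."""
    key = f"{SECRET_SALT}|{site_id}|{patient_id}"
    return hashlib.sha256(key.encode("utf-8")).digest()


def pseudonym(patient_id: str, site_id: str) -> str:
    """Replace patient and site numbers with a stable recoded subject ID."""
    return "P" + _digest(patient_id, site_id).hex()[:10]


def date_offset(patient_id: str, site_id: str, max_days: int = 365) -> timedelta:
    """Derive one offset per patient (1..max_days days) so intervals survive."""
    n = int.from_bytes(_digest(patient_id, site_id)[-4:], "big")
    return timedelta(days=1 + n % max_days)


def redact(text: str) -> str:
    """Remove references to sensitive terms from free text."""
    return _TERM_PATTERN.sub("[REDACTED]", text)


def deidentify(record: dict) -> dict:
    """Return a de-identified copy of one visit record (site number is dropped)."""
    pid, site = record["patient_id"], record["site_id"]
    return {
        "subject": pseudonym(pid, site),
        "visit_date": record["visit_date"] - date_offset(pid, site),
        "comment": redact(record["comment"]),
    }


if __name__ == "__main__":
    visit = {
        "patient_id": "1001",
        "site_id": "07",
        "visit_date": date(2014, 3, 1),
        "comment": "Patient is HIV positive.",
    }
    print(deidentify(visit))
```

Because the pseudonym and the date offset are both derived from the same keyed hash, every record for a given patient receives the same recoded ID and the same shift, so within-patient analyses such as time between visits remain possible. Combining data across trials or companies would additionally require agreeing on compatible rules, which this sketch does not address.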

Many companies actively engaged in clinical trial data transparency initiatives are using SAS to perform de-identification. Additionally, they have published their individual de-identification strategies in order for patients to understand how their confidentiality will be protected, and to inform researchers how they will need to prepare to analyze the data. This paper will review the company strategies and the various SAS approaches to de-identification in use today.

Online resources

View the .pdf of this paper.