Contents

Simple frequency histogram, to check if need log transformation. 1

Use scan to extract bits of text, especially numbers and times in a text string. 2

Use TRANSTRN to replace or remove all occurrences of a substring in a character string. 2

Convert a number to a letter 2

Get age from date and date of birth. 2

Convert text date or time to sas date or time. 2

Input a time in text format and convert to SAS time. 3

Extract info from a variable name. 3

Assign a value to a macro variable from another macro variable. 3

Check for misspelt names in a huge dataset with repeated measurements. 3

Get the values of some variable(s) corresponding to the min or max of another variable. 4

Transpose the values of the variable Estimate giving the new variables the values in Effect 4

How to output results for log vs raw analyses. 4

Set up macro variables to put 2SD or whatever into an estimate statement 4

Suppress output in proc mixed in SAS Studio but keep the graph of the residuals. 4

Generate confidence limits from conf level and SE. 5

Relax convergence in proc mixed and proc glimmix. 5

Parametric bootstrap confidence limits for the ICC. 6

If proc univariate plot does not produce a plot 7

Import an excel file in the full SAS package. 7

Import an excel file in SAS Studio. 7

Print to an excel file. 8

Print to an excel file in full SAS program but not Studio. 8

Clear the listing and log windows in full SAS package. 9

Get old-fashioned listing-type output in Studio. 9

Make sounds, useful at the end of long-running program.. 9

Send an email alert to yourself when a long-running program has finished. 9

Check for misspelt names in a huge dataset with repeated measurements. 10

 

Simple frequency histogram, to check if need log transformation

ods graphics / reset width=12cm height=8cm;

proc sgplot data=dat0;

histogram PossessionLength;

by PossessionType;

run;

 

Use scan to extract bits of text, especially numbers and times in a text string

data temp;

RunTime="1:23";

d1=scan(RunTime,1,":");

d2=scan(RunTime,2,":");         

TimeInMin=d1*60+d2;

 

data temp;

RunTime="1:23:45.6";

d1=scan(RunTime,1,":"); 

d2=scan(RunTime,2,":");

d3=scan(RunTime,3,":");

TimeInSec=d1*60*60+d2*60+d3;

 

Use TRANSTRN to replace or remove all occurrences of a substring in a character string

if index(text,"Athlete=") then

  Name=transtrn(text,"Athlete=",trimn('')); *otherwise puts a space in front;

 

Convert a number to a letter

NameA=byte(64+NameX); *A,B,… for NameX=1,2,…;

NameA=byte(96+NameX); *a,b,… for NameX=1,2,…;

 

Get age from date and date of birth

Age=yrdif(DOB,CompetitionDate);

AgeRounded=round(Age,1);

 

Convert text date or time to sas date or time

options datestyle=dmy; *specifies sequence of day month year in inputted date data;

 

data temp;

TodaysDate='14OCT2010'd; *single or double quotes are OK;

BirthDate=input('11/9/01',anydtdte24.); *this is 11SEP2001 for datestyle=dmy, but 11NOV2001 for mdy;

CheckThis=input('14Feb2015',anydtdte24.);

Age=yrdif(BirthDate,TodaysDate,'ACT/ACT'); *ACT/ACT is the default and can be left blank;

format TodaysDate BirthDate CheckThis date9.;

 

proc print;

run;

 

Input a time in text format and convert to SAS time

data temp;

x="1:23";

TimeInSec1=input(x,ANYDTTME24.); *1 h, 23 min;

TimeInSec2=input("1:23:45.6",ANYDTTME24.); *1 h, 23 min, 45.6 s;

TimeInSec3="1:23:45.6"t; *single or double quotes are OK;

 

proc print;run;

 

Extract info from a variable name

The example shown was for data in a spreadsheet, in which there were column headings

of 1/1, 1/2, 1/3, 2/1, 2/2, etc. through 9/3,representing weeks (1 to 9) and sessions (1 to 3 each week).

The code in the data step extracts the number of the week and the number of the session.

 

array a _1_1--_9_3;

do over a;

  temp=vname(a);

  Week=substr(temp,2,1);

  Session=substr(temp,4,1);

  &dep=a;

  output;

  end;

drop _1_1--_9_3 temp;

 

Assign a value to a macro variable from another macro variable

%let predmhalf=%SYSEVALF(0.5*&predm);

 

Check for misspelt names in a huge dataset with repeated measurements

Go to the last item in this doc.

 

Get the values of some variable(s) corresponding to the min or max of another variable

proc means noprint data=peak1;

var Perform;

output out=peak2 min=PeakPerform minid(Perform(Age MidPeak))=PeakAge MidPeak;

*output out=peak2 min=PeakPerform minid(Perform(var1 var2…))=;

by Event Sex Competitor;

 

Transpose the values of the variable Estimate giving the new variables the values in Effect

Proc transpose is always difficult! You have to play with it to get it right.

proc transpose data=solf out=solftr; *prefix=Whatever;

var Estimate;

id Effect;

by PosGrp;

run;

 

How to output results for log vs raw analyses

%let logflag=1; *or 0, if not using logs;

Then make separate proc prints (or whatever) with: where &logflag; or with: where &logflag=0;

 

I also sometimes use this to create a macro variable &unitsrawlog,
for use in results tables and heading:

data _null_;

if &logflag then call symput('unitsrawlog',"percent");

else call symput('unitsrawlog',"raw");

 

Set up macro variables to put 2SD or whatever into an estimate statement

data _null_;

set meansd;

if Sex="Female" and TeacherRole="Assistant" then

   call symput('FA2sd',left(trim(round(2*SD, .01))));

 

Suppress output in proc mixed in SAS Studio but keep the graph of the residuals

*I stopped using this, preferring to use proc sgplot for residuals vs predicteds

 and residuals vs predictors;

N.B. must be preceded by a run;

run;

ods select none;

ods select StudentPanel;

 

proc mixed data=games plots(only)=StudentPanel(conditional) alpha=0.1;

class AthleteID;

model LnDepVar=.../residual;

random Intercept/subject=AthleteID;

run;

ods select all;

 

Generate confidence limits from conf level and SE

*normal dist;

lower=estimate+probit(alpha/2)*StdErr;

upper=estimate-probit(alpha/2)*StdErr;

*or;

Lower=estimate-quantile('NORMAL',1-alpha/2)*Stderr;

Upper=estimate+quantile('NORMAL',1-alpha/2)*Stderr;

*chi-squared dist for residuals;

Lower=DegFree*estimate/CINV(1-alpha/2,DegFree);

Upper=DegFree*estimate/CINV(alpha/2,DegFree);

*tdist;

tValue=tinv(1-&alpha/2,DF);

Lower=estimate-tValue*StdErr;

Upper=estimate+tValue*StdErr;

*same again in one hit:;

Lower=estimate-tinv(1-&alpha/2,DF)*Stderr;

Upper=estimate+tinv(1-&alpha/2,DF)*Stderr;

 

Relax convergence in proc mixed and proc glimmix

Use when you get "did not converge" or "too many likelihood evaluations".

When you get it to work with, say 1E-6, try 1E-5 to check that the answer is practically the same.

If it's not, then you have to live with the fact that the results may be only approximate.

 

proc mixed data=alldat3 covtest cl alpha=&alpha &nob CONVH=1E-6 convf=1E-6;

for proc glimmix it's pconv=1E-6;

Or use even smaller values than -6. (The default is -8.)  I have sometimes had to resort to -2 with glimmix.

 

Parametric bootstrap confidence limits for the ICC

data covicc;

merge covbet2(rename=(estimate=AthVar stderr=AthVarSE) keep=Condition estimate stderr alpha)

  covwthn2;

by Condition alpha;

ICC=AthVar/(Athvar+estimate);

*if Group ne "Condition Ruck";

do i=1 to 5000;

  AthVarX=AthVar+rannor(0)*AthVarSE;

  *if AthVarX<0 then AthVarX=0;

  ICCboot=AthVarX/(AthVarX+estimate*sqrt(cinv(ranuni(0),DF)/DF)); *checked by simulation;

  *ICCboot=AthVarX/(AthVarX+estimate+rannor(0)*StdErr);

  output;

  end;

keep ICC ICCboot Condition alpha;

 

proc univariate noprint data=covicc;

var ICCboot ICC;

output out=boot1(drop=d) mean=d ICC median=ICCboot ICC pctlpts=0.5, 5,95,99.5 pctlpre=CL pctlname=L99 L90 U90 U99;

by Condition;

 

data boot2;

set boot1;

CLpm90=(CLU90-CLL90)/2;

CLpm99=(CLU99-CLL99)/2;

 

option ls=90 ps=90;

title4 "Confidence limits via semi-bootstrapping";

title5 "CLpm90 = 90%CL in approx. ± form, etc.  CLL90 = lower 90% conf. limit, etc. ";

title6 "Unrealistic 99%CL reflects the inadequate sample size";

proc print noobs data=boot2;

var Condition icc clpm90 clpm99 cll90 clu90 cll99 clu99;

format _numeric_ 5.2;

run;

options ps=52;

 

If proc univariate plot does not produce a plot

This is relevant to the main SAS package only, I think, not SAS Studio.

ods _ALL_ close;

ods listing;

ods graphics off; 

run;

 

*to turn on html graphics again;

ods html;

ods graphics;

 

Import an excel file in the full SAS package

*this version allows the xlsx to be open but not with a cell active;

*Note the extra period before the xls, if you have a macro variable at the end. You may need extra periods, if you have several macro variables;

*the variable type is determined by the type in Row 2;

*this does not work with a filename statement replacing the address info;

*you can mix \ and / (but not in SAS Studio? all / needed;

*NB with several sheets, if one is Sheet1, SAS will read that by default if you do not specify the sheet;

PROC IMPORT

  DATAFILE="D:\Will's Documents\Projects\etc\data&year..xlsx"

  OUT=teams

  DBMS=EXCELCS replace;

*sheet=&sheet;

GUESSINGROWS=N OR MAX; *this works only with XLSX;

attrib _all_ label="" format=; *this deletes all annoying labels and misleading nominal formats;

run;

*DBMS=XLSX requires the file to be closed, and if there is a single non-numeric value (blanks excepted), the variable becomes nominal;

*DBMS=EXCELCS does not work in the latest 9.4, incredibly;

 

Import an excel file in SAS Studio

*SAS Studio produces most of this code for you. It does not work in the main SAS package (9.40;

PROC IMPORT DATAFILE=REFFILE

  DBMS=XLSX

  OUT=whatever

  GETNAMES=YES;

*the following are optional statements. See the documentation for other options;

SHEET="whatever"; *the name of the spreadsheet on the tab;

GUESSINGROWS=N OR MAX; *for proc import to guess nominal or numeric and the format;

attrib _all_ label="" format=; *this deletes all annoying labels and misleading nominal formats;

RUN;

 

MIXED=NO; *YES makes variable with blanks into nominal, NO looks for majority of type, maybe!;

*the above seems to have disappeared from the documentation, and it doesn't work;

 

Print to an excel file

You can proc print the file, then copy from the html file into a spreadsheet.

You can control the output by KEEPing variables and setting the FORMATs thereof before you proc print.

 

Here's another solution for Studio (it does not work in SAS 9.4).

This solution does not allow you to control the format.

You have to download the resulting Excel file to your computer.

 

libname out xlsx "/home/.../TheNameOfTheExcelFile.xlsx";

 

data out.x;

set whatever; *the dataset you want to save to an Excel file;

run;

libname out clear;

 

Print to an excel file in full SAS program but not Studio

/*

This is an older klunky solution. There are probably better ways now.

Open the file when prompted. The file appears to be in html format.  You have to View/Gridlines, fill all cells with No fill, and Save as… xlsx.

Note the extra period before the xls, if you have a macro variable at the end.

A better version is available in the latest version 9.4, using ods excel.

*/

ods listing close;

 

ods html file="D:\Will's Documents\Projects\Patria Hume\Steve Hollings\athletics2009\Men Track\Curves&gender.&event..xls";

 

proc print data=curves;

run;

 

ods html close;

ods listing;

run;

 

Clear the listing and log windows in full SAS package

dm "out;clear;log;clear;";

 

Get old-fashioned listing-type output in Studio

*I don't use this;

ods listing file='/folders/myshortcuts/ExternalFiles/Projects/_VU Melbourne/Alireza Esmaeili/Ali Study3/whatever.txt';

*etc as above;

ods listing close;

*You have to open the file in Word and process it quite a lot.

 

Make sounds, useful at the end of long-running program

*Alas, does not work in Studio, and no-one in the SAS communities could help;

*but you can get Studio to send you an email! See below;

data _null_;

call sound(440,200);

call sound(622,200);

call sound(880,200);

run;

 

Send an email alert to yourself when a long-running program has finished

*code provided by Cynthia Zender of SAS Global Academic <academic@sas.com>;

*put this block of code at the end of the program;

filename doemail email 

  to='willthekiwi@gmail.com'

  subject='Program is finished';

* cc='first3.last3@email3.com'; *if these two lines needed, remove semicolons and insert into the above;

* from='willthekiwi@gmail.com' ;

                               

data _null_;

  file doemail;

* put 'Message Line 1.';

* put 'Message Line 2.';

run;

 

Check for misspelt names in a huge dataset with repeated measurements

*ex NIH/Paul Solberg/Weights study;

*which was modified from more complex read rowing comps SAS.doc, in Brett Smith rowers/Rowing Competitions;

data check;

set dat1;

keep Athlete Born;

*if Sex="W";

if Sex="M"; *ran separately because >2000 athletes if both together;

 

proc sort;

by Sex Athlete;

 

data check1;

set check;

if lag(Athlete) ne Athlete;

 

data check2;

set check1;

rename Athlete=AthleteX Born=BornX;

 

*make Cartesian product;

proc sql;

create table check3 as

select *

from check1 as l, check2 as r;

quit;

run;

*where l.gender=r.gender; *put before semicolon before the quit;

 

data check4;

set check3;

if 0<compged(Athlete,AthleteX)<250;

if lag(Athlete)=Athlete and lag(AthleteX)=AthleteX then delete;

run;

 

*proc print;

run;

 

data check5;

set check4;

length ret1name1-ret1name2000 ret2name1-ret2name2000 $ 33;

array a $ ret1name1-ret1name2000;

array b $ ret2name1-ret2name2000;

retain ret1name1--ret2name2000;

a(_n_)=Athlete;

b(_n_)=AthleteX;

if _n_>1 then do;

  do i=1 to _n_-1;

    if Athlete=b(i) and AthleteX=a(i) then delete;

    end;

  end;

drop i ret1name1-ret1name2000 ret2name1-ret2name2000;

if lag(Athlete)=Athlete and lag(Athletex)=Athletex then delete;

if lag2(Athlete)=Athlete and lag2(Athletex)=Athletex then delete;

if lag3(Athlete)=Athlete and lag3(Athletex)=Athletex then delete;

if lag4(Athlete)=Athlete and lag4(Athletex)=Athletex then delete;

if lag5(Athlete)=Athlete and lag5(Athletex)=Athletex then delete;

run;

*the above might give index out of range error if too many athletes;

 

options ls=120;

proc print data=check5 noobs;

var Athlete Athletex Born BornX;

format Athlete Athletex $25.;

where Born=BornX;

run;