Translate SAS code to R (Roll model from Hasbrouck's book on market microstructure)

I'm not familiar with SAS and am focusing on learning R. I have some SAS code (from Hasbrouck's book on market microstructure); could someone here be so kind as to help me translate it into R? It is a SAS macro, and I want it to be an R function. Thank you very much!

I originally tried to run the code directly in SAS, and it worked fine with the included data sets. But when I tried to use my own data sets I ran into various problems with SAS, so I figured it would be much better to have this program in R, where I'm more comfortable learning.


    GRUnivariate(dsIn=, maOrder=, price=)

    macro to estimate a univariate generalized Roll model of prices.

    dsIn        Input dataset
    maOrder     Order of moving average estimated
    price       name of price variable (default=p)
                The price variable is assumed to be in
                level (or log) form. The routine computes the first-differences internally.


%macro GRUnivariate(dsIn=, maOrder=5, price=p);
proc arima data=&dsIn;
    identify var=&price(1) center nlag=10;
    estimate noint p=0 q=&maOrder;
    title "Univariate MA analysis of &price Input dataset=&dsIn maOrder=&maOrder";
    ods output parameterEstimates=parameterEstimates;
    ods output fitStatistics=fitStatistics;
title "Univariate random-walk analysis";
proc iml;
    start main;
    reset printadv=1;
    use fitStatistics;
    read next var {nValue1} into varEpsilon;
    print varEpsilon [label="Innovation variance"];
    use parameterEstimates;
    read all var {estimate} into theta;
    theta = -theta;
    rn = char(1:&maOrder);
    print (t(theta)) [colname=rn label='Thetas'];
    sumTheta = 1+sum(theta);
    print sumTheta [label="Sum of thetas, including theta(0)=1" f=50.5];
    varW = sumTheta##2 * varEpsilon;
    print varW [label="Random-walk variance" f=best30.5];
    sdW = sqrt(varW);
    print sdW [label="Random-walk standard deviation" f=best30.5];

    *   Cumulate sums of thetas;
    sCoeff = j(1,&maOrder,0);
    do i=1 to &maOrder;
        do j=i to &maOrder;
            sCoeff[i] = sCoeff[i] + theta[j];
        end;
    end;
    print sCoeff [label="Pricing error coefficients" f=12.5];
    sVar = sum( sCoeff##2 ) * varEpsilon;
    print sVar [label="Pricing error variance (lower bound)" f=50.10];
    sSD = sqrt(sVar);
    print sSD [label="Pricing error standard deviation (lower bound)" f=50.6];
    finish main;
    run main;
    quit;

%mend GRUnivariate;

It looks like the first part of the SAS code is generating an ARIMA model, which you can do in R with the arima function. The arima function has an order argument, which is a vector containing the AR, I, and MA (autoregressive, integration, and moving average) portions of the ARIMA model specification. arima also has a seasonal argument for specifying the seasonal portion of the model. Run ?arima to bring up the help file, which has detailed information on using the arima function.

It looks like the second part of the code is providing information on the model coefficients and goodness-of-fit statistics. The object returned by the arima function will have the model coefficients and some fit statistics as well, but I don't know if it will have the specific output you're looking for. Try fitting an ARIMA model to a built-in data set and you can see what's available. For example, with the built-in lh time series data set:

# Create model
x = arima(lh, order = c(3,0,2))

# Print to the console a summary of the model output
x

# Get model coefficients
coef(x)

# Look at structure of model object returned by arima function
str(x)

There are almost certainly additional R functions for generating additional model diagnostics, and you can also run tsdiag(x), which will output some diagnostic plots of the model residuals.
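As a rough sketch of where the two quantities the SAS code reads (the innovation variance and the MA parameter estimates) live in the R object, the snippet below fits a small MA model to the same built-in lh series. The MA(2) order and the names fit and ma_coefs are arbitrary choices for illustration, not something taken from the SAS macro.

```r
# Illustrative MA(2) fit on the built-in lh series (order chosen arbitrarily).
fit <- arima(lh, order = c(0, 0, 2))

# Estimated innovation variance (what the SAS code labels "Innovation variance").
fit$sigma2

# MA coefficients only; coef(fit) also contains the intercept here.
ma_coefs <- coef(fit)[grep("^ma", names(coef(fit)))]
ma_coefs

# Approximate standard errors of all estimated coefficients.
sqrt(diag(vcov(fit)))
```

The numbers will probably not match SAS exactly even on the same data: as far as I can tell, proc arima defaults to conditional least squares while R's arima defaults to CSS followed by maximum likelihood, and the variance estimates are scaled differently.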

I'm not familiar with the Roll model and have only basic knowledge of time series analysis, but here and here are an ARIMA tutorial and a free online book about forecasting, respectively, that use R and will hopefully help you get started.


Can you be more specific about the arima model implied by the SAS code? I find it difficult to identify the values of ar, i, ma implied by the SAS code.

I've never used SAS, but looking at the code, the MA order (the q of the ARIMA model) is 5 by default (q=&maOrder, with maOrder=5), and p, which normally refers to the AR order, is zero. The help for proc arima seems to confirm this. The I term (the "integrated" part of the ARIMA acronym), which is how many times the series is differenced to make it stationary, isn't set in the estimate statement. It comes from the identify statement, where var=&price(1) requests a first difference, and the macro's header also says the routine computes first differences internally. So the model is ARIMA(0,1,5), and in R it would be arima(data, order=c(0,1,5), ...other arguments...).
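Putting the pieces together, here is a minimal sketch of how the whole macro might look as an R function. It is one reading of the SAS code, not a verified port. The function name gr_univariate, its arguments, and the names in the returned list are made up for illustration. The macro's identify statement differences the price once (var=&price(1)) and centers it, so the sketch differences and demeans the series by hand and then fits a pure MA model with no intercept. SAS writes MA polynomials with minus signs, which seems to be why the macro does theta = -theta; R's arima already uses plus signs, so no flip is applied here.

```r
# Hypothetical R counterpart of the SAS macro GRUnivariate (a sketch, not a verified port).
gr_univariate <- function(price, ma_order = 5) {
  # First-difference and demean, mirroring `identify var=&price(1) center`.
  dp <- diff(as.numeric(price))
  dp <- dp - mean(dp)

  # Pure MA(q) without intercept, mirroring `estimate noint p=0 q=&maOrder`.
  fit <- arima(dp, order = c(0, 0, ma_order), include.mean = FALSE)

  var_eps <- fit$sigma2          # innovation variance
  theta   <- unname(coef(fit))   # MA coefficients, "+" sign convention

  # Random-walk variance: (1 + sum of thetas)^2 * innovation variance.
  sum_theta <- 1 + sum(theta)
  var_w     <- sum_theta^2 * var_eps

  # Pricing error coefficients: s_coeff[i] = theta[i] + ... + theta[q].
  s_coeff <- rev(cumsum(rev(theta)))
  s_var   <- sum(s_coeff^2) * var_eps

  list(fit = fit, var_epsilon = var_eps, theta = theta,
       sum_theta = sum_theta, var_w = var_w, sd_w = sqrt(var_w),
       s_coeff = s_coeff, s_var = s_var, s_sd = sqrt(s_var))
}

# Usage on simulated Roll-type prices: random-walk value plus a bid-ask bounce.
set.seed(1)
n <- 2000
m <- cumsum(rnorm(n))                              # efficient price
p <- m + 0.5 * sample(c(-1, 1), n, replace = TRUE) # observed trade price
res <- gr_univariate(p, ma_order = 3)
res$sd_w   # random-walk standard deviation
res$s_sd   # pricing error standard deviation (lower bound)
```

Estimates will likely differ a little from the SAS output because the two programs use different default estimation methods, so it is worth checking the function against the data sets that ship with the book before trusting it on your own data.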
