Friday, August 6, 2010

Excel Tips and Tricks II

I thought I would write another instalment of Excel Tips and Tricks. This time I will look at Excels capability for data analysis.

If you've used Minitab or another statistical package you might have been frustrated that you cant easily export results to Excel, or vice versa Minitab tends to be pretty clunky when you want to drop an Excel list into it for analysis. You might be wondering why Excel doesnt include some statistical analysis functionality.



Well, wonder no longer, Excel does in fact come pre-installed with some pretty powerful statistical analysis capability, it just happens to be hidden away in the Tools-Add Ins menu.


The first time you want to use the Data Analysis functionality you will need to activate it through the Tools - Add Ins menu option see the image below :

From the Add-Ins menu you will need to check the box for Analysis Toolpak and hit OK. There is no messy CD instal required, it is all preinstalled you just have to activate it this way.

Once you have activated the Analysis Toolpak, you will see that under your Tools menu bar, the Data Analysis option now appears. Once installed it will reside in your Excel Application forever, or until you uninstall it I guess.


Why Microsoft chose to hide away such great functionality is beyond me. Since I discovered this little nugget I would say I use it at least once a week to perform any basic statistical task.

Features accessed through the data analysis function include histograms, regressions, basic statistical data, ANOVA, covariance and correlation p tests and f tests .... the list goes on.

Unfortunately for Minitab, since discovering this great feature in Excel, I have never had the need to use their software again.

Easy Linear and Quadratic Regression in Excel

OK so you can use the Data Analysis add in for linear regression, but it can be a little bit technical, and sometimes you just quickly want to see the line of best fit and know the gradient, intercepts, and R squared.

This is where the Excel chart trendline function simply cannot be beaten.

In the chart below, I have a scatterplot of time based data values. The actual data I've used below is monthly demand of a material. I used the linear regression to help calculate a forecast for future demand.

First, right click one of the data points to get the context menu. From the context menu select "Add Trendline" as shown.

The following window pops up. Simply select the type of regression you want. In this case I've selected linear regression but you could just as easily select logarithmic or polynomial for non linear regressions. You can also select moving average but just remember that Excel wont give you a formula for moving average and it wont have any predictive value.



Once you've selected the type of regression you want, click on the options tab to jump to the next tab.


On the options tab you just want to click on the two checkboxes shown below - the first one to display the equation on the chart, and the second one to show R squared. Hit OK.



Voila ! Excel automatically calculates the linear or non linear regression, illustrates it on the chart with a trendline, and even gives you the regression equation.
This is an enormously powerful Excel statistical application and it is just so easy to do.

For the uber-nerds you can easily compare a linear regression to a logarithmic, quadratic, or even higher order regression (by selecting polynomial and adjusting the order value in the trendline popup window) to determine the highest R squared value, and consequently the most appropriate regression model.


No comments:

Post a Comment