Analysis of M&M data using Data Desk:
- Select all the data from the web page (including the headers) and choose
Edit > Copy.
- Open Data Desk and choose Edit > Paste. The "Import or Paste Variables"
dialog box should open saying "The first row of the data is: Type Color Count."
Click on "Use these variable names." A box labeled ClipBoard should open
with the three variables.
- Let's rename the data. Double click on the file cabinet if it isn't already
open. Double click on the Data folder. There will be an icon labeled ClipBoard.
Click on the name, type "M&Ms", and hit return. Move the Data box out of the
way (click and drag the title bar) to reveal the variables in the box, which
is now renamed M&Ms.
- Click on Color, then option/alt-click on Type. Both should be labeled with
a Y. Now shift-click on Count. It should be labeled with an X. Choose
Manip > Replicate Y by X. A new Data box will be created with variables named
Color:Count and Type:Count. These variables represent the raw data (one line
per M&M).
- Click on Color:Count (Y) and shift-click on Type:Count (X). Choose
Calc > Contingency Tables. You should get a contingency table with color as
the rows and type as the columns. Since we're interested in the color
distribution, let's add column percentages. Click on the little triangle in
the upper left corner of the contingency table window and choose Table
Options. Click on Percent of column total and then OK. The table should now
show the conditional distribution as a percent of the column totals.
- Here's what I did in class to make a bar chart that you can update to show
different conditional distributions. Select the Color:Count variable (Y)
and choose Plot > Bar Charts. Click on the little triangle in the upper
left corner of the bar chart and choose Selector > Use Hot Selector. Click
again on this triangle and choose Turn On Automatic Update. Now click and
hold on any column total in the contingency table and choose Select from
the pop-up menu. The bar chart will change to show the conditional distribution
of color if you select the "plain" or "peanut" column, or the marginal
distribution if you select the "total" column.
- Here's how you can make simultaneous bar charts of the conditional
distributions. Select the Type:Count variable (Y) and choose
Special > Group > Assign. A little black oval will appear at the bottom of
your Data Desk window that says Group Variable: Type:Count. Now click on
Color:Count (Y) and choose Plot > Bar Charts. Two bar charts will open, one
for each type, showing the conditional distributions.
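
If you'd like to check the numbers outside Data Desk, here is a rough Python
sketch of what the Replicate Y by X and Percent of column total steps compute.
It is not how Data Desk works internally. The counts and the colors "red" and
"blue" are made-up placeholders (not our class data), and the names summary,
raw, and column_percent are just for illustration.

```python
from collections import Counter

# Hypothetical summary rows (Type, Color, Count).
# Placeholder numbers for illustration only, NOT the class data.
summary = [
    ("plain", "red", 3),
    ("plain", "blue", 1),
    ("peanut", "red", 1),
    ("peanut", "blue", 3),
]

# Analogue of Replicate Y by X: repeat each (Type, Color) pair Count times,
# giving one row per M&M.
raw = [(typ, color) for typ, color, count in summary for _ in range(count)]

# Contingency table: count of M&Ms in each (color, type) cell,
# plus the column (type) totals.
cells = Counter((color, typ) for typ, color in raw)
col_totals = Counter(typ for typ, _ in raw)

colors = sorted({color for _, color in raw})
types = sorted(col_totals)


def column_percent(color, typ):
    """Percent of the column total: conditional distribution of color given type."""
    return 100.0 * cells[(color, typ)] / col_totals[typ]


# Marginal distribution of color (the "total" column).
marginal = {
    color: 100.0 * sum(cells[(color, typ)] for typ in types) / len(raw)
    for color in colors
}

# Print the table of column percentages, one row per color.
print(f"{'':8s}" + "".join(f"{typ:>8s}" for typ in types) + f"{'total':>8s}")
for color in colors:
    row = "".join(f"{column_percent(color, typ):8.1f}" for typ in types)
    print(f"{color:8s}{row}{marginal[color]:8.1f}")
```

Each column of the printed table adds up to 100, which is a quick way to
confirm you are looking at column percentages rather than row percentages.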
That's a pretty good start!
Keep playing with Data Desk and let me know what you discover!!