Empirical Project 2 Working in Google Sheets

Part 2.1 Collecting data by playing a public goods game

Learning objectives for this part

  • collect data from an experiment and enter it into Google Sheets
  • use summary measures, for example, mean and standard deviation, and line charts to describe and compare data.

Note

You can still do Parts 2.2 and 2.3 without completing this part of the project.

Before taking a closer look at the experimental data, you will play a public goods game like the one in the introduction with your classmates to learn how experimental data can be collected. If your instructor has not set up a game, follow the instructions below to set up your own game.

Instructions How to set up the public goods game

Form a group of at least four people. (You may also want to set a maximum of 8 or 10 players to make the game easier to play). Choose one person to be the game administrator. The administrator will monitor the game, while the other people play the game.

Administrator

  1. Create the game: Go to the ‘Economics Games’ website, scroll down to the bottom of the page, and click ‘Create a Multiplayer Game and Get Logins’. Then click ‘Externalities and public goods’. Under the heading ‘Voluntary contribution to a public good’, click ‘Choose this Game’. Enter in the number of people playing the game, and select ‘1’ for the number of universes. Then click ‘Get Logins’. A pop-up will appear, showing the login IDs and passwords for the players and for the administrator.
  2. Start the game: Give each player a different login ID. The game should be played anonymously, so make sure that players do not know the login IDs of other players. You are now ready to start the first round of the game. There are ten rounds in total.
  3. Confirm that all the rounds are complete: On the top right corner of the webpage, click ‘Login’, enter your login ID and password, and then click the green ‘Login’ button. You will be taken to the game administration page, which will show the average contribution in each round, and the results of the round just played. Wait until all the players have finished playing ten rounds before refreshing this page.
  4. Collect the game results: Once the players have finished playing ten rounds, refresh this page. The table at the top of the page will now show the average contribution (in euros) for each of the ten rounds played. Select the whole table, then copy and paste it into a new worksheet in Google Sheets.

Players

  1. Login: Once the administrator has created the game, go to the ‘Economics Games’ website. On the top right corner, click ‘Login’, enter the login ID and password that your administrator has given you, then click the green ‘Login’ button. You will be taken to the public goods game that your administrator has set up.
  2. Play the first round of the game: Read the instructions at top of the page carefully before starting the game. In each round, you must decide how much to contribute to the public good. Enter your choice for each universe (group of players) that you are a part of (if the same players are in two universes, then make the same contribution in both), then click ‘Validate’.
  3. View the results of the first round: You will then be shown the results of the first round, including how much each player (including yourself) contributed, the payoffs, and the profits. Click ‘Next’ to start the next round.
  4. Complete all the rounds of the game: Repeat steps 2 and 3 until you have played ten rounds in total, then collect the results of the game from your administrator.

The results from your game will look like Figure 2.1. In the questions below you will compare your results with those in Figure 3 of Herrmann et al. (2008), but first you need to reformat your table to look like Figure 2.2. Follow the steps in Google Sheets walk-through 2.1 to reformat your table.

Round 10 9 8 7 6 5 4 3 2 1
Average contribution                    

A table formatted with ‘Round’ and ‘Average contribution’ as the row variables.

Figure 2.1 A table formatted with ‘Round’ and ‘Average contribution’ as the row variables.

Round Average contribution
1
2
3
4
5
6
7
8
9
10

A table formatted with ‘Round’ and ‘Average contribution’ as the column variables.

Figure 2.2 A table formatted with ‘Round’ and ‘Average contribution’ as the column variables.

 

Google Sheets walk-through 2.1 Reformatting a table

Figure 2.3 How to reformat a table.

The data

The table generated from playing the public goods game will look like the one shown in Rows 1–2. We need to reformat it so that the columns show the different variables, rather than the rows.

Figure 2.3a The table generated from playing the public goods game will look like the one shown in Rows 1–2. We need to reformat it so that the columns show the different variables, rather than the rows.

Copy and paste the transposed table

The transpose option switches the rows and columns of the table, so the first column becomes the first row and so on.

Figure 2.3b The transpose option switches the rows and columns of the table, so the first column becomes the first row and so on.

Rearrange rows in the correct order

The rows in Column A are arranged in descending order (Round 10, Round 9, and so on), so we will sort the data in reverse order.

Figure 2.3c The rows in Column A are arranged in descending order (Round 10, Round 9, and so on), so we will sort the data in reverse order.

The reformatted table

The table is now reformatted as required.

Figure 2.3d The table is now reformatted as required.

Use the results of the game you have played to answer the following questions.

  1. Make a line chart with average contribution as the vertical axis variable, and period (from 1 to 10) on the horizontal axis. Describe how average contributions have changed over the course of the game.

Google Sheets walk-through 2.2 Drawing a line chart with multiple variables

Figure 2.4 How to plot a line chart with multiple variables.

The data

This is what the data looks like. Each column has data for a particular country, and each row has data for a given time period (1 to 10). We will draw Figure 3 from Herrmann et al. as an example; the steps to do Figure 2A are identical.

Figure 2.4a This is what the data looks like. Each column has data for a particular country, and each row has data for a given time period (1 to 10). We will draw Figure 3 from Herrmann et al. as an example; the steps to do Figure 2A are identical.

Draw a line graph

After completing step 4, the graph will look like this.

Figure 2.4b After completing step 4, the graph will look like this.

Move the legend to the right

After step 6, the legend will now be on the right-hand side of your chart. You can also experiment with the other positions to see which looks better.

Figure 2.4c After step 6, the legend will now be on the right-hand side of your chart. You can also experiment with the other positions to see which looks better.

Add horizontal axis labels for every period

After step 7, the horizontal axis will be labelled with all periods.

Figure 2.4d After step 7, the horizontal axis will be labelled with all periods.

  1. Compare your line chart with Figure 3 of Herrmann et al. (2008).1 Comment on any similarities or differences between the results (for example, the amount contributed at the start and end, or the change in average contributions over the course of the game).
  1. Can you think of any reasons why your results are similar to (or different from) those in Figure 3? You may find it helpful to read the ‘Experiments’ section of the Herrmann et al. (2008) study for a more detailed description of how the experiments were conducted.

Part 2.2 Describing the data

Learning objectives for this part

  • use summary measures, for example, mean and standard deviation, and column charts to describe and compare data.

Note

You can still do Parts 2.2 and 2.3 without completing this part of the project.

We will now use the data for Figures 2A and 3 of Herrmann et al. (2008), and evaluate the effect of the punishment option on average contributions. Rather than compare two charts showing all of the data from each experiment, as the authors of the study did, we will use summary measures to compare the data, and show the data from both experiments (with and without punishment) on the same chart.

mean
A summary statistic for a set of observations, calculated by adding all values in the set and dividing by the number of observations.
variance
A measure of dispersion in a frequency distribution, equal to the mean of the squares of the deviations from the arithmetic mean of the distribution. The variance is used to indicate how ‘spread out’ the data is. A higher variance means that the data is more spread out. Example: The set of numbers 1, 1, 1 has zero variance (no variation), while the set of numbers 1, 1, 999 has a high variance of 221,334 (large spread).

First, download and save the data. The spreadsheet contains two tables:

You can see that in each period (row), the average contribution varies across countries, in other words, there is a distribution of average contributions in each period.

The mean and variance are two ways to summarize distributions. We will now use these measures, along with other measures (range and standard deviation) to summarize and compare the distribution of contributions in both experiments.

Before answering these questions, make sure you understand mean and variance, and how to calculate these measures in Google Sheets.

  1. Using the data for Figures 2A and 3 of Herrmann et al. (2008):
  1. Instead of looking at all periods, we can focus on contributions in the first and last period. Plot a column chart showing the mean contribution in the first and last period for both experiments. Your chart should look like Figure 2.6.

Google Sheets walk-through 2.3 Drawing a column chart to compare two groups

Figure 2.5 How to draw a column chart to compare two groups.

The data

This is what the data looks like. Column R has the means for Figure 3 (without punishment). Column S has the means for Figure 2A (with punishment). We will use the cells in bold font (Cells R3, R12, S3, and S12) to make the chart.

Figure 2.5a This is what the data looks like. Column R has the means for Figure 3 (without punishment). Column S has the means for Figure 2A (with punishment). We will use the cells in bold font (Cells R3, R12, S3, and S12) to make the chart.

Create a table with relevant data only

When creating a new table, make sure that all the adjacent cells are empty as shown. Otherwise, Google Sheets might misinterpret your data and draw the chart incorrectly.

Figure 2.5b When creating a new table, make sure that all the adjacent cells are empty as shown. Otherwise, Google Sheets might misinterpret your data and draw the chart incorrectly.

Draw the column chart

Use the new table you created to make the column chart.

Figure 2.5c Use the new table you created to make the column chart.

Move the legend to the bottom and add axis titles

After step 4, the legend will now be underneath your chart. You can also experiment with the other positions to see which looks better.

Figure 2.5d After step 4, the legend will now be underneath your chart. You can also experiment with the other positions to see which looks better.

Add data labels on top of the columns

After completing step 7, numbers showing the height of the column will appear in the columns selected. Your chart should now look like Figure 2.6.

Figure 2.5e After completing step 7, numbers showing the height of the column will appear in the columns selected. Your chart should now look like Figure 2.6.

Average contributions in Periods 1 and 10, with and without punishment.

Figure 2.6 Average contributions in Periods 1 and 10, with and without punishment.

variance
A measure of dispersion in a frequency distribution, equal to the mean of the squares of the deviations from the arithmetic mean of the distribution. The variance is used to indicate how ‘spread out’ the data is. A higher variance means that the data is more spread out. Example: The set of numbers 1, 1, 1 has zero variance (no variation), while the set of numbers 1, 1, 999 has a high variance of 221,334 (large spread).

The mean is one useful measure of the ‘middle’ of a distribution, but is not a complete description of what our data looks like. We also need to know how ‘spread out’ the data is in order to get a clearer picture and make comparisons between the distributions. The variance is one way to measure spread—the higher the variance, the more spread out the data is.

standard deviation
A measure of dispersion in a frequency distribution, equal to the square root of the variance. The standard deviation has a similar interpretation to the variance. A larger standard deviation means that the data is more spread out. Example: The set of numbers 1, 1, 1 has a standard deviation of zero (no variation or spread), while the set of numbers 1, 1, 999 has a standard deviation of 46.7 (large spread).

A similar measure is standard deviation, which is the square root of the variance. Standard deviation is commonly used because it provides a handy rule of thumb for large datasets—most of the data (95% if there are many observations) will be less than two standard deviations away from the mean.

  1. Using the data for Figures 2A and 3 of Herrmann et al. (2008):

Google Sheets walk-through 2.4 Calculating the standard deviation

Figure 2.7 How to calculate and understand the standard deviation.

The data

We will compare the example data from Google Sheets walk-through 2.1 with some new data that is less spread out. You can see from Column H that all the values are between 10 to 12 (inclusive).

Figure 2.7a We will compare the example data from Google Sheets walk-through 2.1 with some new data that is less spread out. You can see from Column H that all the values are between 10 to 12 (inclusive).

Standard deviation calculation and interpretation

The STDEVP function will calculate the standard deviation over the selected cells. To enter in the formula, click on an empty cell.

Figure 2.7b The STDEVP function will calculate the standard deviation over the selected cells. To enter in the formula, click on an empty cell.

The relationship between the standard deviation and the variance

Both the variance and standard deviation measure spread. We need the variance to calculate the standard deviation, but we usually use the standard deviation to describe distributions because of the handy rule of thumb.

Figure 2.7c Both the variance and standard deviation measure spread. We need the variance to calculate the standard deviation, but we usually use the standard deviation to describe distributions because of the handy rule of thumb.

range
The interval formed by the smallest (minimum) and the largest (maximum) value of a particular variable. The range shows the two most extreme values in the distribution, and can be used to check whether there are any outliers in the data. (Outliers are a few observations in the data that are very different from the rest of the observations.)

Another measure of spread is the range, the interval formed by the smallest (minimum) and the largest (maximum) values of a particular variable. For example, we might say that the number of periods in the public goods experiment ranges from 1 to 10. Once we know the most extreme values in our dataset, we have a better picture of what our data looks like.

  1. Calculate the maximum and minimum value for Periods 1 and 10 separately, for both experiments.

Google Sheets walk-through 2.5 Finding the minimum, maximum, and range of a variable

Figure 2.8 How to find the minimum, maximum, and range of a variable.

The data

Here we are going to calculate the minimum and maximum value of the example data in Google Sheets walk-through 2.1.

Figure 2.8a Here we are going to calculate the minimum and maximum value of the example data in Google Sheets walk-through 2.1.

Calculate the minimum value

The MIN function calculates the minimum value of the selected cells. Here, the minimum value is 4.

Figure 2.8b The MIN function calculates the minimum value of the selected cells. Here, the minimum value is 4.

Calculate the maximum value and the range

The MAX function calculates the maximum value. The maximum value here is 18. The range is the interval formed by the minimum and the maximum value. In words, we say ‘the range is 4 to 18’ or ‘the average contribution ranges from 4 to 18’. In numbers, we say ‘the range is [4,18]’.

Figure 2.8c The MAX function calculates the maximum value. The maximum value here is 18. The range is the interval formed by the minimum and the maximum value. In words, we say ‘the range is 4 to 18’ or ‘the average contribution ranges from 4 to 18’. In numbers, we say ‘the range is [4,18]’.

  1. A concise way to describe the data is in a summary table. With just four numbers (mean, standard deviation, minimum value, maximum value), we can get a general idea of what the data looks like.
  Mean Standard deviation Minimum Maximum
Contribution (Period 1, without punishment)        

A summary table for contributions in a given period.

Figure 2.9 A summary table for contributions in a given period.

Part 2.3 Did changing the rules of the game affect behaviour?

Learning objectives for this part

  • calculate and interpret the p-value
  • evaluate the usefulness of experiments for determining causality, and the limitations of these experiments.

The punishment option was introduced into the public goods game in order to see whether it could help sustain contributions, compared to the game without a punishment option. We will now use a calculation called a p-value to compare the results from both experiments more formally.

By comparing the results in Period 10 of both experiments, we can see that the mean contribution in the experiment with punishment is 8.5 units higher than in the experiment without punishment (see Figure 2.6). Is it more likely that this behaviour is due to chance, or is it more likely to be due to the difference in experimental conditions?

  1. You can conduct another experiment to understand why we might see differences in behaviour that are due to chance.

The important point to note is that even when we conduct experiments under the same controlled conditions, due to an element of randomness, we may not observe the exact same behaviour each time we do the experiment.

Randomness arises because the statistical analysis is conducted on a sample of data (for example, a small group of people from the entire population), and the sample we observe is only one of many possible samples. Whatever differences we calculate between two samples would almost certainly change if we had observed another pair of samples. Importantly, economists aren’t really interested in whether two samples are actually different, but rather whether the underlying populations, from which the samples were drawn, differ in the characteristics we are interested in (for example, age, income, contributions to the public good). And this is the challenge faced by the empirical economist.

When we are interested in whether a treatment works — in this case, whether having the punishment option makes a difference in how much people contribute to the public good — we want a way to check whether any observed differences could just be due to sample variation.

The size of the difference alone cannot tell us whether it might just be due to chance. Even if the observed difference seems large, it could be small relative to how much the data vary. Figures 2.10 and 2.11 show the mean exam score of two groups of high school students and the size of house in which they live (represented by the height of the columns, and reported in the boxes above the columns), with the dots representing the underlying data. Figure 2.10 shows a relatively large difference in means that could have arisen by chance because the data is widely spread out (the standard deviation is large), while Figure 2.11 shows a relatively small difference that looks unlikely to be due to chance because the data is tightly clustered together (the standard deviation is very small). Note that we are looking at two distinct questions here: first, is there a large or small difference in exam score associated with the size of house of the student and second, is that difference likely to have arisen by chance. A social scientist is interested in the answer to both questions. If the difference is large but could easily have occurred by chance or if the difference is very small and unlikely to have occurred by chance, then the results are not suggestive of an important relationship between size of house and exam grade.

An example of a large difference in means that is likely to have happened by chance.

Figure 2.10 An example of a large difference in means that is likely to have happened by chance.

An example of a small difference in means that is unlikely to have happened by chance.

Figure 2.11 An example of a small difference in means that is unlikely to have happened by chance.

p-value
The probability of observing data at least as extreme as the data collected if a particular hypothesis about the population is true. The p-value ranges from 0 to 1: the lower the probability (the lower the p-value), the less likely it is to observe the given data, and therefore the less compatible the data are with the hypothesis.

To help us decide, we consider the hypothesis that the difference occurred by chance – in other words, we start by hypothesizing that house size does not matter for exam scores. Then we ask how likely it is that we would observe differences at least as extreme as those we actually observe in our sample groups, assuming that our hypothesis is true. The answer to this question is called a p-value. The smaller the p-value, the less likely that we would observe differences at least as extreme as those we did, given our hypothesis. So the smaller this p-value, the smaller our confidence will be in the hypothesis that in the population house size does not matter for exam grades.

Notice that the p-value is not the probability that the hypothesis is correct – the data cannot tell us that probability. It is the probability that we would find a difference as big as the one we have observed if the hypothesis were correct.

We can estimate the p-value from the data, using the sample means and sample deviations. It is calculated by comparing the difference in the means with the amount of variation in the data as measured by the standard deviations. This is a well-established method, although some other statistical assumptions, which we do not discuss, are required to ensure that it gives a good estimate.

When we look at the data in Figure 2.10, we cannot be absolutely certain that there really is a link between house size and exam scores. But if the p-value for the difference in means is very small (for example, 0.02) then we know that there would only be a 2% probability of seeing differences at least as extreme as those we did observe in the sample, given our hypothesis that in the population there was no relationship between house size and exam scores.

hypothesis test
A test in which a null (default) and an alternative hypothesis are posed about some characteristic of the population. Sample data is then used to test how likely it is that these sample data would be seen if the null hypothesis was true.

Find out more Hypothesis testing and p-values

The process of formulating a hypothesis about the data, calculating the p-value, and using it to assess whether what we observe is consistent with the hypothesis, is known as a hypothesis test. When we conduct a hypothesis test, we consider two hypotheses: either there is no difference between the populations, in which case the differences we observe must have happened by chance (known as the ’null hypothesis’); or the populations really are different (known as the ‘alternative hypothesis’). The smaller the p-value, the lower is the probability that the differences we observe could have happened simply by chance, i.e. if the null hypothesis was true. The smaller the p-value, the stronger is the evidence in favour of the alternative hypothesis.

It is a common, but highly debatable practice, to pick a cutoff level for the p-value, and reject the null hypothesis if the p-value is below this cutoff. This approach has been criticized recently by statisticians and social scientists because the cutoff level is quite arbitrary.

Instead of using a cutoff, we prefer to calculate p-values and use them to assess the strength of the evidence. Whether the statistical evidence is strong enough for us to draw a firm conclusion about the data will always be a matter of judgement.

In particular, you want to make sure that you understand the consequences of concluding that the null hypothesis is not true, and hence that the alternative is true. You may be quite easily prepared to conclude that house sizes and exam scores are related, but much more cautious about deciding that a new medication is more effective than an existing one if you know that this new medication has severe side effects. In the case of the medication, you might want to see stronger evidence against the null hypothesis before deciding that doctors should be advised to prescribe the new medication.

To calculate the p-value in Google Sheets, we use a function called TTEST.

  1. Using the data for Figures 2A and 3:

Google Sheets walk-through 2.6 Calculating and interpreting the p-value

Figure 2.12 How to calculate and interpret the p-value.

The data

Here we are going to calculate the p-values we need.

Figure 2.12a Here we are going to calculate the p-values we need.

Calculate the p-value for Period 1 data

The TTEST function will calculate the p-value for the two groups of cells selected. In the example shown, the formula to type is =TTEST(B3:Q3,B17:Q17,2,1). In this example there is a high probability that the difference we observe could have happened simply by chance.

Figure 2.12b The TTEST function will calculate the p-value for the two groups of cells selected. In the example shown, the formula to type is =TTEST(B3:Q3,B17:Q17,2,1). In this example there is a high probability that the difference we observe could have happened simply by chance.

Calculate the p-value for Period 10 data

Follow these steps to calculate the p-value for Period 10 (the cells in the dotted boxes). The only difference from the previous formula (for Period 1) is the data selected.

Figure 2.12c Follow these steps to calculate the p-value for Period 10 (the cells in the dotted boxes). The only difference from the previous formula (for Period 1) is the data selected.

  1. Using the data for Period 10:
spurious correlation
A strong linear association between two variables that does not result from any direct relationship, but instead may be due to coincidence or to another unseen factor.

An important point to note is that calculating p-values may not tell us anything about causation. The example of house size and exam scores shown in Figure 2.11, gives us evidence that some kind of relationship between house size and exam scores is very likely. However, we would not conclude that building an extra room automatically makes someone smarter. P-values cannot help us detect these spurious correlations.

However, calculating p-values for experimental evidence can help us determine whether there is a causal link between two variables. If we conduct an experiment and find a difference in outcomes with a low p-value, then we may conclude that the change in experimental conditions is likely to have caused the difference.

  1. Refer to the results from the public goods games.

Experiments can be useful for identifying causal links. However, if people’s behaviour in experimental conditions were different from their behaviour in the real world, our results would not be applicable anywhere outside the experiment.

  1. Discuss some limitations of lab experiments, and suggest some ways to address (or partially address) them. (You may find pages 158–171 of the paper ‘What do laboratory experiments measuring social preferences reveal about the real world?’ helpful, as well as the discussion on free riding and altruism in Section 2.6 of Economy, Society, and Public Policy.)
  1. Benedikt Herrmann, Christian Thöni, and Simon Gächter. 2008. Figure 3 in ‘Antisocial punishment across societies’. Science Magazine 319 (5868): p. 1365.