Empirical Project 11 Working in Excel
Part 11.1 Summarizing the data
Learning objectives for this part
- construct indices to measure attitudes or opinions
- use Cronbach’s alpha to assess indices for internal consistency
- practise recoding and creating new variables.
We will be using data collected from an internet survey sponsored by the German government.
First, download the survey data and documentation:
- Download the data. Read the ‘Data dictionary’ tab and make sure you know what each variable represents. (Later we will discuss exactly how some of these variables were coded.)
- Download a short description of the dataset, which explains the survey design and questions asked. You may find it helpful to read it before starting on the questions below.
- While contingent valuation methods can be useful, they also have shortcomings. Read Section 5 of the paper ‘Introduction to economic valuation methods’ (pages 16–19), and explain which limitations you think apply particularly to the survey we are looking at. You may also find it useful to look at Table 2 of that paper, which compares stated-preference with revealed-preference methods.
Before comparing between question formats (dichotomous choice (DC) and two-way payment ladder (TWPL)), we will first compare the people assigned to each question format to see if they are similar in demographic characteristics and attitudes towards related topics (such as beliefs about climate change and the need for government intervention). If the groups are vastly dissimilar then any observed differences in answers between the groups might be due to differences in attitudes and/or demographics rather than the question format.
- Likert scale
- A numerical scale (usually ranging from 1–5 or 1–7) used to measure attitudes or opinions, with each number representing the individual’s level of agreement or disagreement with a particular statement.
Attitudes were assessed using a 1–5 Likert scale, where 1 = strongly disagree, and 5 = strongly agree. The way the questions were asked was not consistent, so an answer of ‘strongly agree’ might mean high climate change skepticism for one question, but low skepticism for another question. In order to combine these questions into an index we need to recode (in this case, reverse-code) some of the variables.
- Recode or create the variables as specified:
- Reverse-code the following variables (so that 1 is now 5, 2 is now 4, and so on): cog_2, cog_5, scepticism_6, scepticism_7. (Hint: One way to do this is to create a new variable and use Excel’s IF function to fill in the values of the new variable based on the values of the original variable. For help on using the IF function, see Excel walk-through 6.1.)
- For the variables ‘WTP_plmin’ and ‘WTP_plmax’, create new variables with the values replaced as shown in Figure 11.1 (these are the actual amounts, in euros, that individuals selected in the survey, and will be useful for calculating summary measures later). In order to produce the correct summary statistics, you will need to use Excel’s IF function to preserve the blank cells and only fill in values for the new variable if the original variable is non-blank.
Original value | New value |
---|---|
1 | 48 |
2 | 72 |
3 | 84 |
4 | 108 |
5 | 156 |
6 | 192 |
7 | 252 |
8 | 324 |
9 | 432 |
10 | 540 |
11 | 720 |
12 | 960 |
13 | 1,200 |
14 | 1,440 |
WTP survey categories (original value) and euro amounts (new value).
- Create the following indices, giving them an appropriate name in your spreadsheet (make sure to use the reverse-coded variable where relevant):
- Belief that climate change is a real phenomenon: Take the mean of scepticism_2, scepticism_6, and scepticism_7.
- Preferences for government intervention to solve problems in society: Take the mean of cog_1, cog_2, cog_3, cog_4, cog_5, and cog_6.
- Feeling of personal responsibility to act pro-environmentally: Take the mean of PN_1, PN_2, PN_3, PN_4, PN_6, and PN_7.
- Cronbach’s alpha
- A measure used to assess the extent to which a set of items is a reliable or consistent measure of a concept. This measure ranges from 0–1, with 0 meaning that all of the items are independent of one another, and 1 meaning that all of the items are perfectly correlated with each other.
When creating indices, we may be interested to see if each item used in the index measures the same underlying variable of interest (known as reliability or consistency). There are two common ways to assess reliability: either look at the correlation between items in the index, or use a summary measure called Cronbach’s alpha (this measure is used in the social sciences). We will be calculating and interpreting both of these measures.
Cronbach’s alpha is a way to summarize the correlations between many variables, and ranges from 0 to 1, with 0 meaning that all of the items are independent of one another, and 1 meaning that all of the items are perfectly correlated with each other. While higher values of this measure indicate that the items are closely related and therefore measure the same concept, with values that are very close to 1 (or 1) we could be concerned that our index contains redundant items (for example, two items that tell us the same information, so we would only want to use one or the other, but not both). You can read more about this in the paper ‘Using and interpreting Cronbach’s Alpha’.
- Calculate correlation coefficients and interpret Cronbach’s alpha:
- For one of the indices you created in Question 3, create a correlation table to show the correlation between each of the items in the index. Remember to give the variables meaningful names in your table (refer to the ‘Data dictionary’ tab for descriptions of each variable). (For help on calculating correlation coefficients, see Excel walk-through 1.7.) Figure 11.2 shows an example for Question 3(a). (Remember that the correlation between A and B is the same as the correlation between B and A, so you only need to calculate the correlation for each pair of items once.) Are the items in that index strongly correlated?
exaggeration | not.human.activity | no.evidence | |
---|---|---|---|
exaggeration | 1 | – | – |
not.human.activity | 1 | – | |
no.evidence | 1 |
Correlation table for survey items on climate change scepticism: Climate change is exaggerated (exaggeration), Human activity is not the main cause of climate change (not.human.activity), No evidence of global warming (no.evidence).
- The Cronbach’s alpha for these indices are 0.66, 0.71, and 0.85 respectively. Interpret these values in terms of index reliability.
Now we will compare characteristics of people in the dichotomous choice (DC) group and two-way payment ladder (TWPL) group (the variable ‘abst_format’ indicates which group an individual belongs to). Since the groups are of different sizes, we will use percentages instead of frequencies.
-
For each group (DC and TWPL), create tables to summarize the distribution of the following variables (a separate table for each variable):
- gender (‘sex’)
- age (‘age’)
- number of children (‘kids_nr’)
- household net income per month in euros (‘hhnetinc’)
- membership in environmental organization (‘member’)
- highest educational attainment (‘education’).
For help on creating tables, see Excel walk-through 3.1. Using the tables you have created, and without doing formal calculations, discuss any similarities/differences in demographic characteristics between the two groups.
- Create a separate summary table as shown in Figure 11.3 for each of the three indices you created in Question 3. Without doing formal calculations, do the two groups of individuals look similar in the attitudes specified?
Mean | Standard deviation | Min | Max | |
---|---|---|---|---|
DC format | ||||
TWPL format |
Summary table for indices.
Part 11.2 Comparing willingness to pay across methods and individual characteristics
Learning objectives for this part
- compare survey measures of willingness to pay.
Before comparing WTP across question formats, we will summarize the distribution of WTP within each question format.
- For individuals who answered the TWPL question:
- Use the variables ‘WTP_plmin’ and ‘WTP_plmax’ to create column charts (one for each variable) with frequency on the vertical axis and category (the numbers 1–14) on the horizontal axis. Describe characteristics of the distributions shown on the charts.
- Using the variables you created in Question 2(b) in Part 11.1 (showing the actual WTP amounts), make a new variable that contains the average of the two variables (i.e. for each individual, calculate the average of the minimum and maximum willingness to pay).
- Calculate the mean and median of the variable you created in Question 1(b).
- Using the variable from Question 1(b), calculate the correlation between individuals’ average WTP and the demographic and attitudinal variables (see Questions 3 and 5 from Part 11.1 for a list of these variables). Interpret the relationships implied by the coefficients.
- For individuals who answered the DC question:
- Each individual was given one amount and had to decide ‘yes’, ‘no’, or ‘no vote/abstain from deciding’. Make a pivot table showing the frequency of ‘DC_ref_outcome’, with ‘costs’ as the row variable and ‘DC_ref_outcome’ as the column variable.
- Use this table to calculate the percentage of individuals who voted ‘no’ and ‘yes’ for each amount (in other words as a percentage of the row total, not the overall total). Count individuals who chose ‘abstain’ as voting ‘no’.
- Make a scatterplot showing the ‘demand curve’, with percentage of individuals who voted ‘yes’ as the vertical axis variable and amount (in euros) as the horizontal axis variable. (To connect the points, use the chart option ‘Scatter with Straight Lines and Markers’.) Describe features of this ‘demand curve’ that you find interesting.
- Repeat Question 2(b), this time excluding individuals who chose ‘abstain’ from the calculations. Plot this new ‘demand curve’ on the chart created for Question 2(c). Do your results change qualitatively, depending on how you count individuals who did not vote?
- Compare the mean and median WTP under both question formats:
- Complete Figure 11.4 and use it to calculate the difference in means (DC minus TWPL), the standard deviation of these differences, and the number of observations. (The mean of DC is the mean of ‘DC_ref_outcome’ for individuals who voted ‘yes’.)
Format | Mean | Standard deviation | Number of observations |
---|---|---|---|
DC | |||
TWPL |
Summary table for WTP.
- Use Excel’s CONFIDENCE.T function and the calculated values for Question 3(a) to make a 95% confidence interval for the difference in means. Discuss your findings. (For help on calculating confidence intervals for the difference in means, see Excel walk-through 6.4 and the discussion in Part 8.3.).
- Does the median WTP look different across question formats? (You do not need to do any formal calculations.)
- Using your answers to Questions 3(a)–(c), would you recommend that governments use the mean or median WTP in policy-making decisions? (That is, which measure appears to be more robust to changes in the question format?)
-
Leading think tanks estimate that the world needs 20 trillion USD of investment in low-carbon energy supplies and energy efficient technologies by 2030 to meet the Paris Agreement targets. This amount roughly corresponds to 3,273 euros total per adult (aged 15 and above), or 298 euros per adult per year (2020 to 2030 inclusive).
Compare this approximate number with the WTP estimates you have found. Assuming people around the world have the same attitudes towards climate change as the Germans surveyed, would a tax equal to the median WTP be enough to finance climate change mitigation? Discuss how governments could increase public support for and involvement in climate change mitigation activities.