-
Notifications
You must be signed in to change notification settings - Fork 3
/
RQ-week6.Rmd
156 lines (138 loc) · 12 KB
/
RQ-week6.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
# Week 6 - Reading questions
```{r include=FALSE}
knitr::opts_chunk$set(warning=FALSE, message=FALSE)
config <- yaml::yaml.load_file("config.yml")
showanswers = FALSE
```
#### Reference {-}
[Baron, R.M and Kenny, D.A. (1986) The moderator-Mediator Variable Distinction
in Social Psychological Research: Conceptual, Strategic, and Statistical
Considerations, Journal of Personality and Individual Differences, 51(6), p. 1173-
1182](http://webcom.upmf-grenoble.fr/LIP/Perso/DMuller/GSERM/Articles/Journal%20of%20Personality%20and%20Social%20Psychology%201986%20Baron.pdf)
#### Questions: {-}
1. What is according to Baron and Kenny the nature of a moderation effect?
(and can you give an example from an article or theory you had in a
previous course?
2. List the four methods hat according to Baron and Kenny are available to
study interaction effects in SPSS given different levels of measurement
(categorical or continuous) for the independent and moderating variable?
3. On page 1177, Baron Kenny mention the most common approach to deal
with unreliability in the mediating variable is to use multiple indicators.
Thinking back to the readings and lectures about factor analysis, can you
shortly explain why this is a good method?
4. The framework that Baron and Kenny present on page 1179 is one way to
study complicated models (mediating moderation, or moderating
mediation). They do not discuss alternatives to this framework, but
(surprise!) Structural Equation Modeling is one of them. What would you
think be the main advantage of SEM over the framework of Baropn and
Kenny.
5. How can you determine whether a variable is a mediator or moderator?
<!--config$show_text-->
```{r echo=FALSE, results = 'asis', eval = showanswers}
cat("#### Answers {-}
1. What is according to Baron and Kenny the nature of a moderation effect? (and can you give an example from an article or theory you had in a previous course?
* A: A moderator is a third variable that affects the relation between two other variables. E.g. the relation between drinking a lot of alcohol while going out and trying to pick up a one night stand might be different for boys and girls. It could be that there is a relation for boys, and not for girls, but also, it might be that the relation is linear for boys, while for girls, there is a quadratic effect (indicating girls one try to pick up a one-night stand when they are really drunk), or a stepwise effect for girls (indicating that girls need to drink more than a specific amount to engage in this behavior
2. List the four methods hat according to Baron and Kenny are available to study interaction effects in SPSS given different levels of measurement (categorical or continuous) for the independent and moderating variable?
* Case 1: IV dich, Mod dich -> 2x2 ANOVA
* Case 2: IV cont, mod dich-> test correlation for each level of the moderator
* Case 3: IV dich, mod cont -> include interaction effect (with quadratic or dichotomized term term if necessary -> theorize about this!)
* Case 4: IV cont, Mode. Cont. -> include interaction effect. (with possible quadratic term)
3. On page 1177, Baron Kenny mention the most common approach to deal with unreliability in the mediatimng variable is to use multiple indicators. Thinking back to the readings and lectures about factor analysis, can you shortly explain why this is a good method?
* Each variable contains measurement error. In factor analysis, scales are formed using multiple indicators. The idea is that random measurement errors here cancel eachother out, while at the same time, a better operationalization of the latent variable is achieved. In this way, measurement error in a mediating variable, would be incorporated in the model, and corrected for.
4. The framework that Baron and Kenny present on page 1179 is one way to study complicated models (mediating moderation, or moderating mediation). They do not discuss alternatives to this framework, but (surprise!) Structural Equation Modeling is one of them. What would you think be the main advantage of SEM over the framework of Baropn and Kenny.
* It is much more simple (practically), and all model parameters are estimated at once, so that complex interdependencies can be modeled in a better and more sophisticated way
5. How can you determine whether a variable is a mediator or moderator?
* this should not be done statistically, but theoretically. Is it a intermitting variable (mediator), or really an external variable that cannot be explained by something else (moderator)?. Statistically, one would want moderators to be unrelated to both X and Y, while in mediation, one wants a strong relation of the mediator with both X and Y
")
```
#### Reference {-}
[Weston, R. and Gore, Paul, A. (2006) A brief guide to structural equation
Modeling, The Counseling Psychologist 34, p.719-752 ](http://tcp.sagepub.com/content/34/5/719.full.pdf+html)
Notes before reading:
* To all you non-counseling psychologists: this article is very general, don’t
worry
* this article is quite general, and you might recognize a lot of things we have
discussed. This article provides an overview of things we discussed, but also
adds an important aspect: the combination of factor model and path models
into one Structural Equation Model
* skip the part on GFI (p. 741) – this index has been shown to be dependent
on sample size after all and is old-fashioned (in a bad way, in statistics we
don’t do vintage).
* skip the part on missing data – there is nothing wrong with this section, but
missing data is a very difficult topic, and there is a lot more to say about this.
Take the course ”Conducting a Survey” if you want to know more on missing
data!
#### Questions: {-}
1. The authors (Weston and Gore) state three similarities and two
big differences between SEM and other multivariate statistical techniques
(ANCOVA, regression). What are they?
2. The second difference that Weston and Gore mention is
presented as being a disadvantage, while some would think of it as an
advantage. Tests for the model fit can indicate whether your theory can be
confirmed or rejected based on your data! Along with this, the authors
miss another big difference between SEM and other general linear model
(ANOVA, regression). Any idea what this is?
3. Structural Equation Models are called hybrid models when
they contain a measurement and structural model. Can you explain what
is meant by the terms “measurement” and “structural” model?
4. On p. 726-726, the authors introduce the term item parcels.
Look up (on the internet) what is meant with an item parcel. Can you
think of an example of item parceling that you used during on of the
practicals of this course? Revisit this practical, and try to thank about the
disadvantages of item parceling in SEM.
5. The authors identify 6 steps in doing SEM-analyses. What are
they?
6. Go back to the unconstrained multigroup path model
estimated in the class exercise of practical 6 (question 8. last week). Try
to work out for yourself how many degrees of freedom there are based on
pages 732-733 of the article.
7. On page 745 the authors mention in the 6th line from the bottom of the
page, that it is a good idea to test the model using cross-validation.
Look up what cross-validation in SEM is about, and explain when it is a
good idea to do a cross-validation check
8. The model that the authors use as an example does not fit the data very
well (see figure 4 on page 740) and the discussion of the authors of their
model.
Without having the data, nor a good theories on the constructs used in
this paper, can you think of a possible reason why this model might not
fit? (there is no one right answer here).
Hint: have a look at the results that are presented and those not
presented in the paper
```{r echo=FALSE, results = 'asis', eval = showanswers}
cat("#### Answers {-}
1. The authors (Weston and Gore) state three similarities and two big differences between SEM and other multivariate statistical techniques (ANCOVA, regression). What are they?
* Similarities:
they are all general linear models
they are only valid when the assumptions are met (PL: and many assumptions are very similar!)
they cannot “prove” or imply causality
* Differences:
latent constructs van be included, that provide a better operationalisation of the theoretical construct and contain less measurement error
test for the model fit
2. the second difference that Weston and Gore mention is presented as being a disadvantage, while some would think of it as an advantage. Tests for the model fit can indicate whether your theory can be confirmed or rejected based on your data! Along with this, the authors miss another big difference between SEM and other general linear model (ANOVA, regression). Any idea what this is?
* In SEM, more complex interdependencies between your constructs can be modeled.
3. Structural Equation Models are called hybrid models when they contain a measurement and structural model. Can you explain what is meant by the terms “measurement” and “structural” model?
* measurement model = factor model, the parts where latent constructs are operationalised using two or more observed variables
Structural model = the path model, the part where regression coefficients are estimated. This often includes regression coefficients between latent variables!
4. On p. 726-726, the authors introduce the term item parcels. Look up (on the internet) what is meant with an item parcel. Can you think of an example of item parceling that you used during on of the practicals of this course? Revisit this practical, and try to thank about the disadvantages of item parceling in SEM.
* we used this in the first practical, where we analysed the data from the paper by Kestila. During the practical, we did EFA analysis on items in trust in politics and attitudes towardas immigrants, and saved the factor scores, which we later used in a regression model. This is a form of item parceling. The disadvantage to this approach is that the factor scores contain a lot of measurement model, which (generally) deflate (also called attenuate) the size of the regression coefficients to or from the factor scores.
5. The authors identify 6 steps in doing SEM-analyses. What are they?
* data collection
* specification
* identification
* estimation
* evaluation
* modification
6. Go back to the unconstrained multigroup path model estimated in the class exercise of practical 6 (question 8. last week). Try to work out for yourself how many degrees of freedom there are based on pages 732-733 of the article.
* 6 regression coefficients, 4 disturbances, 1 variance(depression) -> 6+4+1=11
Two groups = 11 X 2 = 22 parameters to be estimated.
The number of free parameters is 2x(5x(5+1)/2 = 2 x 5x6/2 = 5x6 = 30 sample moments
Degrees of freedom = sample moments – estimated parameters = 30-22 = 8
7. On page 745 the authors mention in the 6th line from the bottom of the page, that it is a good idea to test the model using cross-validation.
Look up what cross-validation in SEM is about, and explain when it is a good idea to do a cross-validation check
* after many alterations to the model, you might be modeling more data driven than theory driven, and you risk ending up with a theory that replicates badly (for example by blindly following modification indices). If you then check your model against a fresh (sub)sample from the same population, you can actually check whether that model fits!
8. The model that the authors use as an example does not fit the data very well (see figure 4 on page 740) and the discussion of the authors of their model.
Without having the data, nor a good theories on the concstructs used in this paper, can you think of a possible reason why this model might not fit? (there is no one right answer here).
Hint: have a look at the results that are presented and those not presented in the paper
* it probably has to do something with the factor structure, as that is not reported in the paper
")
```