-
Notifications
You must be signed in to change notification settings - Fork 0
/
1-theory.qmd
352 lines (215 loc) · 8.99 KB
/
1-theory.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
---
title: "Introduction"
author: "Thomas W. Valente"
date: 2024-12-05
date-modified: 2024-12-05
---
## Outline
- Introduction to the network diffusion model and the motivation for
netdiffuseR
- The classic datasets (MI, BF & KFP)
- Simulating diffusion with netdiffuseR
- Data management features
- The FCTC study -- multiple networks both static and dynamic
## Acknowledgements
NetDiffuseR was created with the support of grant R01 CA157577 from the
National Cancer Institute/National Institutes of Health.
It has benefited from input provided by participants of the Center for
Applied Network Analysis at the University of Southern California
## Motivation for NetdifusseR
Build a platform in which:
- network exposure terms can be easily calculated and compared;
- other diffusion network terms can be created (e.g., Moran's I,
infectiousness, susceptibility);
- multiple dynamic and static networks can be analyzed simultaneously;
- Graphical displays of diffusion processes can be made;
- Statistical tests both conventional and new can be conducted;
- Integrate network diffusion models with other analytic approaches;
- and more.
## Diffusion Networks
- Unique set of studies that combine social network data with behavioral
data
- Typically have data from zero or low prevalence to saturation or high
prevalence.
- Multiple time points presents unique opportunities and challenges.
- Data often do not include dis-adoption but the model accounts for it.
- Network data may be static or dynamic.
- Can have one or multiple networks
- Model multiple behaviors simultaneously (e.g., tobacco & alcohol,
awareness & adoption, VHS & Betamax)
- Because we combine networks and behavior
- Can have people in the network with no adoption data
- People with adoption data but not in the network
## Diffusion of Innovations
- Explains how new ideas and practices spread within and between
communities.
- One of the oldest and richest social science theories.
- Many factors influence diffusion
- Spatial
- Economic
- Cultural
- Developmental
- Biological
- Etc.
- One of the most significant is social networks
---
![](figs/slides-valente-2019.png){width="80%"}
## Diffusion Networks
![](figs/slides-diffusion-networks.png){width="80%"}
## Structural Equivalence is Associated with Influence
![](figs/slides-struct-equiv.png){width="80%"}
Burt, R. (1987) **Social contagion and innovation: Cohesion versus structural equivalence**. American Journal of Sociology, 92, 1287-1335.
## My article discovered network thresholds
![](figs/slides-valente-socnets.png){width="80%"}
## Graph of Time of Adoption by Network Threshold for One Korean Family Planning Community
![](figs/slides-kfamilies.png){width="80%"}
## Indirect Exposures Matter
![](figs/slides-indirect-expo.png){width="80%"}
## Alter Attributes May Affect Influence
![](figs/slides-attr-exposure.png){width="80%"}
## Centrality Weighted Exposures
![](figs/slides-centrality-expo.png){width="80%"}
---
![](figs/slides-toa-example.png){width="80%"}
---
- Node A: Threshold=3/5, Susceptibility=1, Susc_Normed=1/3
- Node E: Impact 3/5, Infectiousness=1, Infect_Normed=1/3
![](figs/slides-threshold-net.png){width="80%"}
## Other Features: Extending Exposure
- Thresholds
- Indirect exposures
- Attribute weighted
- Network weighted (i.e., centrality)
- Infectiousness
- Susceptibility
- Ego-alter table
- Graphing!
## 1995 Reported on the 3 Classic Studies
![](figs/valente_1995.jpg){width="40%" fig-align="center"}
- 3 classic diffusion network datasets
- MI: Medical Innovation
- BF: Brazilian Farmers
- KFP: Korean Family Planning
## Diffusion Curves for Classic Studies
![](figs/slides-classical-curve.png){width="80%"}
## 6 Diffusion Network Studies 3 Early Studies Lost
- Rogers' dissertation on 2-4D Weed Spray
- Rogers' work in Colombia
- Becker's study of public health officers in New York, Ohio, and
Michigan (compared 2 innovations)
## 3 Classic Diffusion Network Studies
| | Medical Innovation | Korean Family Planning | Brazilian Farmers |
|--------------------|-----------------------|-------------------|--|
| Country | USA | Korean | Brazil |
| \# Respondents | 125 Doctors | 1,047 Women | 692 Farmers |
| \# Communities | 4 | 25 | 11 |
| Innovation | Tetracycline | Family Planning | Hybrid Corn Seed |
| Time for Diffusion | 18 Months | 11 Years | 20 Years |
| Year Data Collected| 1955 | 1973 | 1966 |
| Ave. Time to 50% | 6 | 7 | 16 |
| Highest Saturation | 89% | 83% | 98% |
| Lowest Saturation | 81% | 44% | 29% |
| Citation | Coleman et al (1996) | Rogers & Kincaid (1981) | Rogers et al (1970) |
## Reconfigure to Event History Data
![](figs/slides-event-data.png){width="80%"}
## Medical Innovation (Coleman, Katz, & Menzel, 1966),
Data collected from physicians in the 4 cities in Illinois in 1955
(Galesburg, Bloomington, Quincy, & Peoria). 125 MDs interviewed about
the medical practice characteristics:
- 3 network questions: advice, discussion, and friends.
- Adoption data collected by analyzing the prescription records of local
pharmacies on the first 3 days of each month.
## Brazilian Farmers (Rogers, Ascroft, & Röling, 1970)
As part of the USAID funded 3-country study, farmers in selected
villages in India, Nigeria, and Brazil were interviewed about their
farming practices, use of media, and other personal characteristics.
Networks were measured by asking:
- their three best friends, (2) the three most influential people in
their community, (3) the three most influential people regarding various
farm innovations, and (4) the best person to organize a cooperative
project.
- Adoption data were collected by asking farmers when they first planted
hybrid seed corn
## Korean Family Planning (Rogers & Kincaid, 1981)
Scholars at Seoul National University\'s School of Public Health
collected data on the adoption of family planning methods among all
married women of child-bearing age in 25 Korean villages in 1973
(N=1,047). Networks Network data were obtained by asking respondents to
name five people from whom
- they sought advice about (1) family planning, (2) general information,
(3) abortion, (4) health, (5) the purchase of consumer goods, and (6)
children\'s education.
- Adoption data were obtained by asking respondents to state the year
they first used a modern family planning method.
---
![](figs/slides-no-time-deps.png){width="90%"}
---
![](figs/slides-late-adopt.png){width="90%"}
---
![](figs/slides-negative-tendency.png){width="90%"}
## Exposure results
| Term | MI | KFP | BF |
|------|----|-----|----|
| Detail Agents | 1.27 | | |
| Science Orientation | 0.60\*\* | | |
| Journals Subs. | 1.63\* | | |
| \# Sons | | 1.43\*\* | |
| Media Camp. Exp. | | 1.10\*\* | |
| Income | | | 1.18\*\* |
| Visits to City | | | 1.00 |
| Out Degree | 0.96 | 1.05 | 0.98 |
| In Degree | 1.04 | 1.06\*\* | 1.02\* |
| Exposure (Cohesion) | 0.94 | 1.16 | 2.16\*\* |
## Simulating diffusion
- Seeds:
- Type (central, bridging, marginal, random, etc.)
- Size (e.g., 5%, 10%, 15%)
- Network structure:
- Random
- Small world
- Empirical
- Scale-free (centralized)
- Attribute-based (e.g., homophily)
- Threshold distribution:
- Uniform
- Normal (varying variance)
- Skewed
- Beta
- Mechanism:
- Cohesion
- Structural equivalence
- Indirect ties
- Attribute-based (e.g., homophilous more persuasive)
- Adoption behavior:
- No-disadoption
- Disadoption
- Conflicting behaviors (perhaps coded as -1,0, 1)
- Incorporate uncertainty
## Reading Data Challenges
- Merging attribute and network data
- Nominated but no attribute data
- Attribute data no network data
- Data file formats
- Single flat file
- 2 files: edgelist and attribute file (flat)
- Cohort studies: 1 file separate waves of data
## Formats
- Survey Data (static & dynamic)
- Edge-list and Attribute data
- Panel Data
- STATNET, Igraph, ....
## Input Types: Single Flat File
- Flat File with IDs, Nodelist, and Time of Adoption
- Typically a retrospective study
- Examples, Medical Innovation, Korean Family Planning; Brazilian Farmers
![](figs/slides-single-flat.png){width="40%"}
## Input Types: Double Files
- Files will have mostly matching IDs
- One or both files may contain time information
- Edgelist may also, potentially, contain values or spell information
![](figs/slides-double-files.png){width="80%"}
## Input Types: Classic Cohort/Longitudinal
- IDs and Nodelist Network data
- Behavior is typically binary (e.g., ever smoked) or potentially valued (e.g., # of cigarettes
- Examples SNS
![](figs/slides-file-long.png){width="50%"}