-
Notifications
You must be signed in to change notification settings - Fork 1
/
desciptives.Rmd
94 lines (61 loc) · 3.32 KB
/
desciptives.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
---
title: "Network Analysis of Scientific Publications"
output: html_document
---
## Network Analysis
This section presents an analysis of a network of scientific publications, where the edges represent shared authors between papers. We calculate various network descriptives using the `igraph` package without directly importing libraries, utilizing the `::` operator.
```{r}
setwd('/Users/bocsar000/Documents/projects/scopus_scrape')
# Replace 'path/to/your/nodes.csv' and 'path/to/your/edges.csv' with the actual file paths
nodes <- arrow::read_feather("data/network/node_list.feather")
edges <- arrow::read_feather("data/network/edge_list.feather")
full_paper_sample <- arrow::read_feather("data/final_sample.feather")
library(dplyr)
nodes <- nodes %>%
mutate(
Domain = sapply(Domain, function(x) paste(x, collapse = "; ")), # Collapse list into a single string separated by "; "
`Search Term` = sapply(`Search Term`, function(x) paste(x, collapse = "; "))
)
# Create a graph from the edge list
paper_network <- igraph::graph_from_data_frame(d=edges, vertices=nodes, directed=FALSE)
# save source objects into RData
save(nodes, edges, full_paper_sample, paper_network, file = "data/network_analysis.RData")
```
### Network Density
The network density gives us an idea of how connected the graph is, representing the proportion of potential connections in the network that are actual connections.
```{r network-density}
density <- igraph::edge_density(g)
cat("Network Density: ", density, "\n")
```
### Triad Census
The triad census counts the number of triads (groups of three nodes) for each possible triad type, which helps us understand the local structure of the network.
```{r triad-census}
triad_census <- igraph::triad_census(g)
cat("Triad Census:\n")
print(triad_census)
cat("First is empty triad, 3rd is dyad with isolated third, 11th is A<->B<->C, last is full graph\n")
```
### Degree Distribution
The degree distribution shows how many nodes have a certain number of connections, which can indicate the presence of hubs in the network.
```{r degree-distribution}
degree_distribution <- igraph::degree_distribution(g)
cat("Degree Distribution (first few values):\n")
print(head(degree_distribution)) # Showing only the first few values for brevity
```
### Small-worldness
Small-world networks are characterized by higher clustering and shorter average path lengths compared to random graphs. Here, we calculate the clustering coefficient and average path length as a first step to assess small-worldness.
```{r small-worldness}
clustering_coefficient <- igraph::transitivity(g, type = "average")
average_path_length <- igraph::mean_distance(g, directed = FALSE)
cat("Clustering Coefficient:", clustering_coefficient, "\n")
cat("Average Path Length:", average_path_length, "\n")
# Note: For a complete small-worldness assessment, compare these to equivalent random graphs.
```
### Betweenness Centrality
Betweenness centrality measures the extent to which a node lies on paths between other nodes, highlighting potential points of control within the network.
```{r betweenness-centrality}
betweenness_centrality <- igraph::betweenness(g, directed = FALSE)
cat("Betweenness Centrality (first few nodes):\n")
# Assuming we want to print a summary of betweenness centrality values for the first few nodes:
print(head(betweenness_centrality))
```