Columbia University Libraries
Workshop: OpenRefine
Part 2: Enhancing data with OpenRefine
October 10, 2018
Butler Library 306
Ryan Mendenhall
212-851-2452
URLs for reconciliation services:
Library of Congress / NACO authority file via VIAF proxy (conciliator):
http://refine.codefork.com/reconcile/viafproxy/LC
Wikidata (should already be installed and available as a reconciliation service):
https://tools.wmflabs.org/openrefine-wikidata/en/api
EXERCISE 2:
GREL for extracting identifiers, preferred labels, and URIs from the reconciliation service:
Preferred label: cell.recon.match.name
Identifier (see below for URI): cell.recon.match.id
From the Identifier, you can construct the URI based on the controlled vocabulary's URL handle.
In our example, we are using LC/NAF, which has a preferred handle of:
http://id.loc.gov/authorities/names/
So:
URI: "http://id.loc.gov/authorities/names/" + cell.recon.match.id
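The concatenation above can also be sketched outside OpenRefine, which may help when checking a few values by hand. This is only an illustrative Python sketch: the function name build_uri and the placeholder identifier are assumptions, not part of the workshop materials.

```python
# Base URL ("handle") for LC/NAF, as given in the handout.
LC_NAMES_HANDLE = "http://id.loc.gov/authorities/names/"


def build_uri(match_id, handle=LC_NAMES_HANDLE):
    """Concatenate a vocabulary handle and a reconciled identifier."""
    return handle + match_id


# Stand-in for a matched cell (cell.recon.match); both values are
# placeholders made up for illustration.
sample_match = {"name": "Example, Name", "id": "n00000000"}

print(sample_match["name"])
print(build_uri(sample_match["id"]))
```

A different controlled vocabulary would only need a different handle argument.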
EXERCISE 4:
Sample query of FAST API for the term Theater:
http://fast.oclc.org/searchfast/fastsuggest?query=Theater&rows=30&queryReturn=suggestall+idroot+auth+cscore&suggest=autoSubject&queryIndex=suggest50&wt=json
Edit column --> Add column by fetching URLs
GREL:
"http://fast.oclc.org/searchfast/fastsuggest?query="+ value.replace(/[\s\-\.\,\:\(\)]/,"%20") +"&rows=30&queryReturn=suggestall%2Cidroot%2Cauth%2Ccscore&suggest=autoSubject&queryIndex=suggest50&wt=json"
On the returned results [make sure you are using Rows view, NOT Record view]:
forEach(value.parseJson().response.docs,v,if(v.auth == cells["Subject"].value,"http://id.worldcat.org/fast/"+substring(v.idroot,3),"")).uniques().join("")
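To check the two GREL expressions above outside OpenRefine, the same steps can be sketched in Python: build the FAST query URL, then pull the URI of the heading that exactly matches the subject. This is a rough sketch, not the workshop's method. The function names and the sample response are assumptions, the identifiers in the sample are made up, and the field list follows the sample query above.

```python
import json
import re

FAST_BASE = "http://fast.oclc.org/searchfast/fastsuggest"
FAST_URI_BASE = "http://id.worldcat.org/fast/"


def build_fast_url(term):
    """Build the query URL, replacing whitespace - . , : ( ) with %20."""
    query = re.sub(r"[\s\-.,:()]", "%20", term)
    return (
        FAST_BASE + "?query=" + query
        + "&rows=30&queryReturn=suggestall%2Cidroot%2Cauth%2Ccscore"
        + "&suggest=autoSubject&queryIndex=suggest50&wt=json"
    )


def fast_uri_for(response_text, subject):
    """Return the URI(s) of docs whose 'auth' equals the subject exactly."""
    docs = json.loads(response_text)["response"]["docs"]
    # idroot[3:] drops the first three characters, like substring(v.idroot,3).
    uris = [FAST_URI_BASE + d["idroot"][3:] for d in docs if d["auth"] == subject]
    # Drop duplicates (keeping order) and join, like .uniques().join("").
    return "".join(dict.fromkeys(uris))


# Hypothetical response body; the idroot values are invented for illustration.
SAMPLE_RESPONSE = json.dumps({
    "response": {
        "docs": [
            {"idroot": "fst00000001", "auth": "Theater"},
            {"idroot": "fst00000002", "auth": "Theaters"},
            {"idroot": "fst00000001", "auth": "Theater"},
        ]
    }
})

print(build_fast_url("Theater"))
print(fast_uri_for(SAMPLE_RESPONSE, "Theater"))
```

Because the comparison is an exact string match, a subject that differs from the authorized heading in case or punctuation yields an empty string, just as in the GREL version.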
EXERCISE 5:
Sample SPARQL query:
SELECT ?x ?label
{
  ?x skos:inScheme aat:;
     (xl:prefLabel|xl:altLabel)/gvp:term "painting"@en;
     skos:prefLabel ?label
  FILTER(lang(?label)="en").
}
The same query embedded in GREL for Edit column --> Add column based on this column:
'http://vocab.getty.edu/sparql.json?query=select+?x+?label{?x+skos:inScheme+aat:;(xl:prefLabel|xl:altLabel)/gvp:term"' + escape(value, 'url') + '"@en;skos:prefLabel+?label+filter(lang(?label)="en").}'
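The URL construction above can be sketched in Python to preview what will be sent to the Getty endpoint. The function name build_getty_url is hypothetical, and quote_plus is assumed to behave closely enough to GREL's escape(value, 'url') for simple terms; the two are not guaranteed to be identical for every character.

```python
from urllib.parse import quote_plus


def build_getty_url(term):
    """Embed a URL-encoded term in the AAT label query from the handout."""
    return (
        'http://vocab.getty.edu/sparql.json?query=select+?x+?label'
        '{?x+skos:inScheme+aat:;(xl:prefLabel|xl:altLabel)/gvp:term"'
        + quote_plus(term)
        + '"@en;skos:prefLabel+?label+filter(lang(?label)="en").}'
    )


print(build_getty_url("painting"))
```

Only the term is encoded here; the rest of the query string is passed through as written in the GREL expression.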
GREL to grab URI from returned query results:
forEach(value.parseJson().results.bindings,v,if(v.label.value == cells["Form"].value,v.x.value,"")).uniques().join("")
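The expression above walks the standard SPARQL JSON results structure (results, then bindings, then one object per variable with a "value" key). A minimal Python sketch of the same logic follows; the function name and the sample results are assumptions, and the AAT URIs in the sample are placeholders rather than real concept identifiers.

```python
import json


def uri_for_label(results_text, form):
    """Return the ?x value(s) whose ?label equals the form term exactly."""
    bindings = json.loads(results_text)["results"]["bindings"]
    uris = [b["x"]["value"] for b in bindings if b["label"]["value"] == form]
    # Drop duplicates (keeping order) and join, like .uniques().join("").
    return "".join(dict.fromkeys(uris))


# Hypothetical results document; the URIs are invented for illustration.
SAMPLE_RESULTS = json.dumps({
    "head": {"vars": ["x", "label"]},
    "results": {
        "bindings": [
            {
                "x": {"type": "uri", "value": "http://vocab.getty.edu/aat/000000001"},
                "label": {"type": "literal", "xml:lang": "en", "value": "painting"},
            },
            {
                "x": {"type": "uri", "value": "http://vocab.getty.edu/aat/000000002"},
                "label": {"type": "literal", "xml:lang": "en", "value": "paintings"},
            },
        ]
    },
})

print(uri_for_label(SAMPLE_RESULTS, "painting"))
```

If two different concepts share the same preferred label, their URIs are joined together with no separator, so such rows would need a manual check.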
EXERCISE 6:
Collection column: Edit column --> Add column based on this column
GREL:
cell.cross('CUL_DLC_Locations', 'Label').cells['URI'].value[0]
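The cross() call above is essentially a lookup: find the rows in the CUL_DLC_Locations project whose Label equals the current cell, then take the URI from the first match. A small Python sketch of that idea, with a hypothetical function name and invented rows standing in for the second project:

```python
def cross_lookup(value, rows, key_column="Label", result_column="URI"):
    """Return result_column from the first row whose key_column equals value."""
    matches = [row for row in rows if row[key_column] == value]
    # [0] in the GREL expression takes the first match; return None if
    # there is none instead of raising an error.
    return matches[0][result_column] if matches else None


# Invented stand-in for the CUL_DLC_Locations project.
LOCATIONS = [
    {"Label": "Example Collection A", "URI": "http://example.org/locations/a"},
    {"Label": "Example Collection B", "URI": "http://example.org/locations/b"},
]

print(cross_lookup("Example Collection B", LOCATIONS))
```

As with the GREL version, the match is exact, so stray whitespace or spelling variants in the Collection column should be cleaned up before the lookup.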