generated from statsim/port-template
index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width" />
<title>Visualize high-dimensional data | StatSim Vis</title>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/1.0.0/css/materialize.min.css">
<link type="text/css" rel="stylesheet" href="https://statsim.com/port/css/port.css" media="screen"/>
<link rel="icon" type="image/png" href="https://statsim.com/app/images/favicon-32x32.png" sizes="32x32">
<link rel="icon" type="image/png" href="https://statsim.com/app/images/favicon-16x16.png" sizes="16x16">
<link type="text/css" rel="stylesheet" href="https://statsim.com/assets/common.css" media="screen"/>
<style>
a { color: #3030B7 }
.logo { width: 75px; padding: 0; margin: 8px 0 0 0}
.btn, .port-btn { background: #3030B7 }
.btn:hover, .port-btn:hover { background: #21218B }
.file-field .btn { background: #BBB }
.file-field .btn:hover { background: #AAA }
.grey-bar { background: #f5f5f5 }
.spinner-green, .spinner-green-only { border-color: #3030B7 }
</style>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-7770107-2"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-7770107-2');
</script>
</head>
<body>
<div class="status-bar grey-bar">
<div class="container">
<div class="row">
<div class="col s12" style="font-size: 14px;">
<div id="menu"></div>
<a href="https://statsim.com/">StatSim</a> → <b>Vis</b>
<span class="version">Version: 0.3.1</span>
</div>
</div>
</div>
</div>
<div class="container">
<div class="row">
<div id="port-container"></div>
</div>
</div>
<div id="description" class="grey-bar">
<div class="container">
<div class="row">
<div class="col m12">
<h1>Visualize high-dimensional data online</h1>
<h2>Explore CSV datasets using dimensionality reduction and feature importance methods</h2>
<p>
Feature extraction is the process of reducing the number of variables (i.e., columns) in a dataset by obtaining a smaller representative set of features using dimensionality reduction methods (<a href="https://en.wikipedia.org/wiki/Dimensionality_reduction">Wiki</a>). You can apply dimensionality reduction to any tabular CSV dataset using StatSim Vis, a 100% free and open-source tool for feature extraction and visualization. It supports PCA, t-SNE, UMAP, SOM, and autoencoders. To understand your data even better, try the feature importance method to see how strongly the target depends on the input variables.
</p>
</div>
</div>
<div class="row features">
<div class="col m4 feature">
<h3>
Dimensionality reduction
</h3>
<p>
In many cases, real-world datasets contain more than 2 or 3 variables (i.e., columns). For us humans, it's tough to analyze and reason in high dimensions. Computers work well with high-dimensional data, but sometimes we still need a bird's-eye view of a dataset. Luckily, various projection methods can map data with many variables to a low-dimensional space we understand.
</p>
</div>
<div class="col m4 feature">
<h3>
2D/3D view of a dataset
</h3>
<p>
All dimensionality reduction techniques support mapping from a high-dimensional space to two dimensions, which works for most datasets. In 2D mode, you can also select and save a data subset using the lasso tool: <img src="lasso.png" alt="lasso tool icon" style="width:14px; height: auto;">. However, plotting data in three dimensions can make patterns easier to recognize. You can rotate and zoom the plot, then choose the most effective 3D-to-2D projection.
</p>
</div>
<div class="col m4 feature">
<h3>
Feature importance
</h3>
<p>
Sometimes we are interested in a specific column of a dataset and how it depends on other variables. Let's call that column the target variable. Historically, a correlation coefficient was used as a measure of such dependency. However, correlation captures only linear relationships and <a href="https://en.wikipedia.org/wiki/Correlation_and_dependence#/media/File:Correlation_examples2.svg">fails in many cases</a>. Feature importance relies on a more complex model under the hood (a random forest) that can estimate non-linear dependencies and variable interactions.
</p>
</div>
</div>
<div class="row">
<div class="col m12">
<small>
All processing and visualization happen in your browser. We don't see, collect, or sell the data you explore.<br>
</small>
<p>
<a class="github-button" href="https://github.com/statsim/vis" data-icon="octicon-star" data-show-count="true" aria-label="Star statsim/vis on GitHub">Star</a>
<a class="github-button" href="https://github.com/statsim/vis/issues" data-icon="octicon-issue-opened" data-show-count="true" aria-label="Issue statsim/vis on GitHub">Issue</a>
</p>
</div>
</div>
</div>
</div>
<script async defer src="https://buttons.github.io/buttons.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/1.0.0/js/materialize.min.js"></script>
<script src="https://statsim.com/port/dist/port.js"></script>
<script src="https://statsim.com/assets/common.js"></script>
<script>
var port = new Port({
portContainer: document.getElementById('port-container'),
schema: {
"model": {
"name": "Process",
"method": "run",
"type": "class",
"url": "process.js",
"worker": true
},
"render": {
"name": "Render",
"method": "render",
"type": "class",
"url": "render.js"
},
"design": {
"layout": "sidebar",
"colors": "light"
},
"inputs": [
{ "type": "file", "name": "File", "reactive": true },
{ "type": "select", "name": "Target variable", "options": ['None'], "default": 'None' },
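// The onchange handler shows or hides dependent inputs based on the selected method:
// 'None' hides Steps, Dimensions, and Transform; 'PCA' hides only Steps; all other methods show all three.
{ "type": "select", "name": "Projection method", "options": ['None', 'PCA', 'SOM', 't-SNE', 'UMAP', 'Autoencoder'], "default": "None", "onchange": (value) => {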
if (value === 'None') {
return {
'Steps': {'className': 'hidden'},
'Dimensions': {'className': 'hidden'},
'Transform': {'className': 'hidden'}
}
} else if (value === 'PCA') {
return {
'Steps': {'className': 'hidden'},
'Dimensions': {'className': ''},
'Transform': {'className': ''}
}
} else {
return {
'Steps': {'className': ''},
'Dimensions': {'className': ''},
'Transform': {'className': ''}
}
}
}},
{ "type": "select", "name": "Dimensions", "options": [2, 3], "default": 2},
{ "type": "select", "name": "Transform", "options": ['None', 'Scale', 'Log'], "default": 'None' },
{ "type": "int", "name": "Steps", "default": 200},
{ "type": "select", "name": "Feature importance", "options": ['None', 'Random Forest'], "default": 'None' },
]
}
})
</script>
</body>
</html>