index.html

<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width" />
    <title>Visualize high-dimensional data | StatSim Vis</title>
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/materialize/1.0.0/css/materialize.min.css">
    <link type="text/css" rel="stylesheet" href="https://statsim.com/port/css/port.css" media="screen"/>
    <link rel="icon" type="image/png" href="https://statsim.com/app/images/favicon-32x32.png" sizes="32x32">
    <link rel="icon" type="image/png" href="https://statsim.com/app/images/favicon-16x16.png" sizes="16x16">
    <link type="text/css" rel="stylesheet" href="https://statsim.com/assets/common.css" media="screen"/>
    <style>
      a { color: #3030B7 }
      .logo { width: 75px; padding: 0; margin: 8px 0 0 0}
      .btn, .port-btn { background: #3030B7 }
      .btn:hover, .port-btn:hover { background: #21218B }
      .file-field .btn { background: #BBB }
      .file-field .btn:hover { background: #AAA }
      .grey-bar { background: #f5f5f5 }
      .spinner-green, .spinner-green-only { border-color: #3030B7 }
    </style>
    <!-- Global site tag (gtag.js) - Google Analytics -->
    <script async src="https://www.googletagmanager.com/gtag/js?id=UA-7770107-2"></script>
    <script>
      window.dataLayer = window.dataLayer || [];
      function gtag(){dataLayer.push(arguments);}
      gtag('js', new Date());
      gtag('config', 'UA-7770107-2');
    </script>
  </head>
  <body>

    <div class="status-bar grey-bar">
      <div class="container">
        <div class="row">
          <div class="col s12" style="font-size: 14px;">
            <div id="menu"></div>
            <a href="https://statsim.com/">StatSim</a> → <b>Vis</b>
            <span class="version">Version: 0.3.1</span>
          </div>
        </div>
      </div>
    </div>

    <div class="container">
      <div class="row">
        <div id="port-container"></div>
      </div>
    </div>

    <div id="description" class="grey-bar">
      <div class="container">
        <div class="row">
          <div class="col m12">
            <h1>Visualize high-dimensional data online</h1>
            <h2>Explore CSV datasets using dimensionality reduction and feature importance methods</h2>
            <p>
              Feature extraction is the process of reducing the number of variables (i.e., columns) in a dataset by obtaining a smaller representative set of features using dimensionality reduction methods (<a href="https://en.wikipedia.org/wiki/Dimensionality_reduction">Wiki</a>). You can apply dimensionality reduction to any tabular CSV dataset using StatSim Vis, a 100% free and open-source tool for feature extraction and visualization. It supports PCA, t-SNE, UMAP, SOM, and Autoencoders. To understand your data even better, try the feature importance method to know how strong the dependency between input variables and the target.
            </p>
          </div>
        </div>

        <div class="row features">
          <div class="col m4 feature">
            <h3>
              Dimensionality reduction
            </h3>
            <p>
              In many cases, real-world datasets contain more than 2 or 3 variables (i.e., columns). For us humans, it's tough to analyze and reason in high dimensions. Computers work nicely with high-dimensional data, but sometimes we still need to get some bird-view over a dataset. Luckily various projections methods can map data with many variables to a low-dimensional space we understand.
            </p>
          </div>
          <div class="col m4 feature">
            <h3>
              2D/3D view of a dataset
            </h3>
            <p>
              All dimensionality reductions techniques support mapping from a high-dimensional space to two dimensions. That works for most datasets. In 2D mode, you can also select and save a data subset using the lasso tool: <img src="lasso.png" style="width:14px; height: auto;">. However, plotting data in three dimensions makes it possible to recognize patterns even better. You can rotate, zoom and then choose the most effective 3D to 2D projection.
            </p>
          </div>
          <div class="col m4 feature">
            <h3>
              Feature importance
            </h3>
            <p>
              Sometimes we are interested in a specific column of a dataset and how it depends on other variables. Let's call that column a target variable. Historically a correlation coefficient was used as a measure of such dependency. However, correlation works only with linear relationships and <a href="https://en.wikipedia.org/wiki/Correlation_and_dependence#/media/File:Correlation_examples2.svg">fails in many cases</a>. Feature importance relies on a more complex model under the hood. It estimates non-linear dependencies and variable interactions.
            </p>
          </div>
        </div>

        <div class="row">
          <div class="col m12">
            <small>
              All processing and visualization happens in your browser. We don't see, collect or sell data you explore <br> 
            </small>
            <p>
              <a class="github-button" href="https://github.com/statsim/vis" data-icon="octicon-star" data-show-count="true" aria-label="Star statsim/vis on GitHub">Star</a>
              <a class="github-button" href="https://github.com/statsim/vis/issues" data-icon="octicon-issue-opened" data-show-count="true" aria-label="Issue statsim/vis on GitHub">Issue</a>
            </p>
          </div>
        </div>
      </div>
    </div>

    <script async defer src="https://buttons.github.io/buttons.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/materialize/1.0.0/js/materialize.min.js"></script>
    <script src="https://statsim.com/port/dist/port.js"></script>
    <script src="https://statsim.com/assets/common.js"></script>
    <script>
      var port = new Port({
        portContainer: document.getElementById('port-container'),
        schema: {
          "model": {
            "name": "Process",
            "method": "run",
            "type": "class",
            "url": "process.js",
            "worker": true,
          },
          "render": {
            "name": "Render",
            "method": "render",
            "type": "class",
            "url": "render.js"
          },
          "design": {
            "layout": "sidebar",
            "colors": "light"
          },
          "inputs": [
            { "type": "file", "name": "File", "reactive": true },
            { "type": "select", "name": "Target variable", "options": ['None'], "default": 'None' },
            { "type": "select", "name": "Projection method", "options": ['None', 'PCA', 'SOM', 't-SNE', 'UMAP', 'Autoencoder'], "default": "None", "onchange": (value) => {
              if (value === 'None') {
                return {
                  'Steps': {'className': 'hidden'},
                  'Dimensions': {'className': 'hidden'},
                  'Transform': {'className': 'hidden'}
                }
              } else if (value === 'PCA') {
                return {
                  'Steps': {'className': 'hidden'},
                  'Dimensions': {'className': ''},
                  'Transform': {'className': ''}
                }
              } else {
                return {
                  'Steps': {'className': ''},
                  'Dimensions': {'className': ''},
                  'Transform': {'className': ''}
                }
              }
            }},
            { "type": "select", "name": "Dimensions", "options": [2, 3], "default": 2},
            { "type": "select", "name": "Transform", "options": ['None', 'Scale', 'Log'], "default": 'None' },
            { "type": "int", "name": "Steps", "default": 200},
            { "type": "select", "name": "Feature importance", "options": ['None', 'Random Forest'], "default": 'None' },
          ]
        }
      })
    </script>
  </body>
</html>