diff --git a/DOCS/_Plot_module.md b/DOCS/_Plot_module.md
index ce7f1ef..af29cbc 100644
--- a/DOCS/_Plot_module.md
+++ b/DOCS/_Plot_module.md
@@ -28,7 +28,8 @@ The method returns a plot, the number of classes and bin size of the histogram,
 def distribution(data,
                  plot=('hist', 'kde'),
                  avg=('amean', 'gmean', 'median', 'mode'),
-                 binsize='auto', bandwidth='silverman'):
+                 binsize='auto',
+                 bandwidth='silverman'):
     """ Return a plot with the ditribution of (apparent or actual) grain
     sizes in a dataset.
@@ -200,6 +201,6 @@ KDE bandwidth = 0.1
 =======================================
 ```
 
-![]()
+![](https://github.com/marcoalopez/GrainSizeTools/blob/master/FIGURES/new_normalized_median.png?raw=true)
 
 Note that in this case, the method returns the normalized inter-quartile range (IQR) rather than the normalized standard deviation. Also, note that the kernel density estimate appears smoother resembling an almost perfect normal distribution.
\ No newline at end of file
diff --git a/DOCS/_describe.md b/DOCS/_describe.md
index 9a60608..2352e4a 100644
--- a/DOCS/_describe.md
+++ b/DOCS/_describe.md
@@ -9,18 +9,18 @@ dataset = pd.read_csv(filepath, sep='\t')
 
 # estimate equivalent circular diameters (ECDs)
 dataset['diameters'] = 2 * np.sqrt(dataset['Area'] / np.pi)
-dataset
+dataset.head()
 ```
 
-![](https://github.com/marcoalopez/GrainSizeTools/blob/master/FIGURES/dataframe_output.png?raw=true)
+![](https://github.com/marcoalopez/GrainSizeTools/blob/master/FIGURES/dataframe_output_newcol.png?raw=true)
 
 ```python
-# Set the population properties
+# Set the population properties for the toy dataset
 scale = np.log(20)  # set sample geometric mean to 20
 shape = np.log(1.5)  # set the lognormal shape to 1.5
 
 # generate a random lognormal population of size 500
-np.random.seed(seed=1)  # this is to generate always the same population for reproducibility
+np.random.seed(seed=1)  # this is for reproducibility
 toy_dataset = np.random.lognormal(mean=scale, sigma=shape, size=500)
 ```
 
@@ -73,9 +73,7 @@ By default, the `summarize()` function returns:
 - The shape of the lognormal distribution using the multiplicative standard deviation (MSD)
 - A Shapiro-Wilk test warning indicating when the data deviates from normal and/or lognormal (when p-value < 0.05).
 
-Note that here the Shapiro-Wilk test warning tells us that the distribution is not normally distributed, which is to be expected since we know that this is a lognormal distribution. Note that the geometric mean and the lognormal shape are very close to the values used to generate the synthetic dataset, 20 and 1.5 respectively.
-
-Now, let's do the same using the dataset that comes from a real rock, for this, we have to pass the column with the diameters:
+In the example above, the Shapiro-Wilk test tells us that the distribution is not normally distributed, which is to be expected since we know that this is a lognormal distribution. Note that the geometric mean and the lognormal shape are very close to the values used to generate the synthetic random dataset, 20 and 1.5 respectively. Now, let's do the same using the dataset that comes from a real rock, for this, we have to pass the column with the diameters:
 
 ```python
 summarize(dataset['diameters'])
@@ -117,7 +115,7 @@ Lognormality test: 0.99, 0.03 (test statistic, p-value)
 ============================================================================
 ```
 
-Leaving aside the difference in numbers, there are some subtle differences compared to the results obtained with the toy dataset.
First, the confidence interval method for the arithmetic mean is no longer the modified Cox (mCox) but the one based on the central limit theorem (CLT) advised by the [ASTM](https://en.wikipedia.org/wiki/ASTM_International). As previously noted, the function ```summarize()``` automatically choose the optimal confidence interval method depending on distribution features. We show below the decision tree flowchart for choosing the optimal confidence interval estimation method, which is based on [Lopez-Sanchez (2020)](https://doi.org/10.1016/j.jsg.2020.104042).
+Leaving the actual numbers aside, there are some subtle differences compared to the results obtained with the toy dataset. First, the confidence interval method for the arithmetic mean is no longer the modified Cox (mCox) but the one based on the central limit theorem (CLT) advised by the [ASTM](https://en.wikipedia.org/wiki/ASTM_International). As previously noted, the function ```summarize()``` automatically chooses the optimal confidence interval method depending on the distribution features. We show below the decision tree flowchart for choosing the optimal confidence interval estimation method, which is based on [Lopez-Sanchez (2020)](https://doi.org/10.1016/j.jsg.2020.104042).
 
 ![](https://github.com/marcoalopez/GrainSizeTools/blob/master/FIGURES/avg_map.png?raw=true)
 
@@ -125,61 +123,56 @@ The reason why the CLT method applies in this case is that the grain size distri
 
 Now, let's focus on the different options of the ``summarize()`` method.
 
-```
-Signature:
-summarize(
-    data,
-    avg=('amean', 'gmean', 'median', 'mode'),
-    ci_level=0.95,
-    bandwidth='silverman',
-    precision=0.1,
-)
-Docstring:
-Estimate different grain size statistics. This includes different means,
-the median, the frequency peak grain size via KDE, the confidence intervals
-using different methods, and the distribution features.
-
-Parameters
-----------
-data : array_like
-    the size of the grains
-
-avg : string, tuple or list; optional
-    the averages to be estimated
-
-    | Types:
-    | 'amean' - arithmetic mean
-    | 'gmean' - geometric mean
-    | 'median' - median
-    | 'mode' - the kernel-based frequency peak of the distribution
-
-ci_level : scalar between 0 and 1; optional
-    the certainty of the confidence interval (default = 0.95)
-
-bandwidth : string {'silverman' or 'scott'} or positive scalar; optional
-    the method to estimate the bandwidth or a scalar directly defining the
-    bandwidth. It uses the Silverman plug-in method by default.
-
-precision : positive scalar or None; optional
-    the maximum precision expected for the "peak" kde-based estimator.
-    Default is 0.1. Note that this has nothing to do with the
-    confidence intervals
-
-Call functions
---------------
-- amean, gmean, median, and freq_peak (from averages)
-
-Examples
---------
->>> summarize(dataset['diameters'])
->>> summarize(dataset['diameters'], ci_level=0.99)
->>> summarize(np.log(dataset['diameters']), avg=('amean', 'median', 'mode'))
-
-Returns
--------
-None
-File: c:\users\marco\documents\github\grainsizetools\grain_size_tools\grainsizetools_script.py
-Type: function
+```python
+def summarize(data,
+              avg=('amean', 'gmean', 'median', 'mode'),
+              ci_level=0.95,
+              bandwidth='silverman',
+              precision=0.1):
+    """ Estimate different grain size statistics. This includes different means,
+    the median, the frequency peak grain size via KDE, the confidence intervals
+    using different methods, and the distribution features.
+
+    Parameters
+    ----------
+    data : array_like
+        the size of the grains
+
+    avg : string, tuple or list; optional
+        the averages to be estimated
+
+        | Types:
+        | 'amean' - arithmetic mean
+        | 'gmean' - geometric mean
+        | 'median' - median
+        | 'mode' - the kernel-based frequency peak of the distribution
+
+    ci_level : scalar between 0 and 1; optional
+        the certainty of the confidence interval (default = 0.95)
+
+    bandwidth : string {'silverman' or 'scott'} or positive scalar; optional
+        the method to estimate the bandwidth or a scalar directly defining the
+        bandwidth. It uses the Silverman plug-in method by default.
+
+    precision : positive scalar or None; optional
+        the maximum precision expected for the "peak" kde-based estimator.
+        Default is 0.1. Note that this is not related to the confidence
+        intervals.
+
+    Call functions
+    --------------
+    - amean, gmean, median, and freq_peak (from averages)
+
+    Examples
+    --------
+    >>> summarize(dataset['diameters'])
+    >>> summarize(dataset['diameters'], ci_level=0.99)
+    >>> summarize(np.log(dataset['diameters']), avg=('amean', 'median', 'mode'))
+
+    Returns
+    -------
+    None
+    """
 ```
diff --git a/DOCS/_first_steps.md b/DOCS/_first_steps.md
index d5ed27e..ed2de19 100644
--- a/DOCS/_first_steps.md
+++ b/DOCS/_first_steps.md
@@ -3,7 +3,7 @@ Installing Python for data science
 -------------
 
-GrainSizeTools script requires [Python](https://www.python.org/ ) 3.5+ or higher and the Python scientific libraries [*Numpy*](http://www.numpy.org/ ) [*Scipy*](http://www.scipy.org/ ), [*Pandas*](http://pandas.pydata.org ) and [*Matplotlib*](http://matplotlib.org/ ). If you have no previous experience with Python, I recommend downloading and installing the [Anaconda Python distribution](https://www.anaconda.com/distribution/ ) (Python 3.x version), as it includes all the required the scientific packages (> 5 GB disk space). In case you have a limited space in your hard disk, there is a distribution named [miniconda](http://conda.pydata.org/miniconda.html ) that only installs the Python packages you actually need. For both cases you have versions for Windows, MacOS and Linux.
+GrainSizeTools script requires [Python](https://www.python.org/ ) 3.5 or higher and the Python scientific libraries [*NumPy*](http://www.numpy.org/ ), [*SciPy*](http://www.scipy.org/ ), [*Pandas*](http://pandas.pydata.org ) and [*Matplotlib*](http://matplotlib.org/ ). If you have no previous experience with Python, I recommend downloading and installing the [Anaconda Python distribution](https://www.anaconda.com/distribution/ ) (Python 3.x version), as it includes all the required scientific packages (> 5 GB of disk space). If you have limited space on your hard disk, there is a distribution named [miniconda](http://conda.pydata.org/miniconda.html ) that only installs the Python packages you actually need. In both cases, there are versions for Windows, macOS and Linux.
 
 Anaconda Python Distribution: https://www.anaconda.com/distribution/
 
@@ -201,7 +201,7 @@ Let's first see how the data set looks like. Instead of calling the variable (as
 dataset.head() # returns 5 rows by default, you can define any number within the parenthesis
 ```
 
-![](https://github.com/marcoalopez/GrainSizeTools/blob/master/FIGURES/dataframe_output.png?raw=true)
+![](https://github.com/marcoalopez/GrainSizeTools/blob/master/FIGURES/dataframe_output_head5.png?raw=true)
 
 The example dataset has 11 different columns (one without a name).
To interact with one of the columns we must call its name in square brackets with the name in quotes as follows:
 
@@ -233,7 +233,7 @@ dataset = dataset.drop(' ', axis=1)
 dataset.head(3)
 ```
 
-![]()
+![](https://github.com/marcoalopez/GrainSizeTools/blob/master/FIGURES/dataframe_output_head3.png?raw=true)
 
 If you want to remove more than one column pass a list of columns instead as in the example below:
 
@@ -256,7 +256,7 @@ dataset['diameters'] = 2 * np.sqrt(dataset['Area'] / np.pi)
 dataset.head()
 ```
 
-![](https://github.com/marcoalopez/GrainSizeTools/blob/master/FIGURES/dataframe_diameters.png?raw=true)
+![](https://github.com/marcoalopez/GrainSizeTools/blob/master/FIGURES/dataframe_output_newcol.png?raw=true)
 
 You can see a new column named diameters.
 
@@ -285,9 +285,10 @@ dataset.info() # display info of the DataFrame
 dataset.shape() # (rows, columns)
 dataset.count() # number of non-null values
 
+# Data cleaning
 dataset.dropna() # remove missing values from the data
 
-# writing to disk
+# Writing to disk
 dataset.to_csv(filename) # save as csv file, the filename must be within quotes
 dataset.to_excel(filename) # save as excel file
 ```
diff --git a/FIGURES/dataframe_diameters.png b/FIGURES/dataframe_diameters.png
deleted file mode 100644
index a8e8943..0000000
Binary files a/FIGURES/dataframe_diameters.png and /dev/null differ
diff --git a/FIGURES/dataframe_output_head3.png b/FIGURES/dataframe_output_head3.png
new file mode 100644
index 0000000..d82a2ab
Binary files /dev/null and b/FIGURES/dataframe_output_head3.png differ
diff --git a/FIGURES/dataframe_output_head5.png b/FIGURES/dataframe_output_head5.png
new file mode 100644
index 0000000..97b1415
Binary files /dev/null and b/FIGURES/dataframe_output_head5.png differ
diff --git a/FIGURES/dataframe_output_newcol.png b/FIGURES/dataframe_output_newcol.png
new file mode 100644
index 0000000..38f608e
Binary files /dev/null and b/FIGURES/dataframe_output_newcol.png differ
diff --git a/grain_size_tools/GrainSizeTools_script.py b/grain_size_tools/GrainSizeTools_script.py
index bbd76d7..df1eab7 100644
--- a/grain_size_tools/GrainSizeTools_script.py
+++ b/grain_size_tools/GrainSizeTools_script.py
@@ -116,8 +116,8 @@ def summarize(data, avg=('amean', 'gmean', 'median', 'mode'), ci_level=0.95,
     precision : positive scalar or None; optional
         the maximum precision expected for the "peak" kde-based estimator.
-        Default is 0.1. Note that this has nothing to do with the
-        confidence intervals
+        Default is 0.1. Note that this is not related to the confidence
+        intervals.
 
     Call functions
     --------------
diff --git a/grain_size_tools/example_notebooks/grain_size_description.ipynb b/grain_size_tools/example_notebooks/grain_size_description.ipynb
index 7d788ec..0facaed 100644
--- a/grain_size_tools/example_notebooks/grain_size_description.ipynb
+++ b/grain_size_tools/example_notebooks/grain_size_description.ipynb
@@ -159,129 +159,24 @@
        "
2661 rows × 12 columns
\n", "" ], "text/plain": [ - " Area Circ. Feret FeretX FeretY FeretAngle MinFeret \\\n", - "0 1 157.25 0.680 18.062 1535.0 0.5 131.634 13.500 \n", - "1 2 2059.75 0.771 62.097 753.5 16.5 165.069 46.697 \n", - "2 3 1961.50 0.842 57.871 727.0 65.0 71.878 46.923 \n", - "3 4 5428.50 0.709 114.657 1494.5 83.5 19.620 63.449 \n", - "4 5 374.00 0.699 29.262 2328.0 34.0 33.147 16.000 \n", - "... ... ... ... ... ... ... ... ... \n", - "2656 2657 452.50 0.789 28.504 1368.0 1565.5 127.875 22.500 \n", - "2657 2658 1081.25 0.756 47.909 1349.5 1569.5 108.246 31.363 \n", - "2658 2659 513.50 0.720 32.962 1373.0 1586.0 112.286 20.496 \n", - "2659 2660 277.75 0.627 29.436 1316.0 1601.5 159.102 17.002 \n", - "2660 2661 725.00 0.748 39.437 1335.5 1615.5 129.341 28.025 \n", - "\n", - " AR Round Solidity diameters \n", - "0 1.101 0.908 0.937 14.149803 \n", - "1 1.314 0.761 0.972 51.210889 \n", - "2 1.139 0.878 0.972 49.974587 \n", - "3 1.896 0.528 0.947 83.137121 \n", - "4 1.515 0.660 0.970 21.821815 \n", - "... ... ... ... ... \n", - "2656 1.235 0.810 0.960 24.002935 \n", - "2657 1.446 0.692 0.960 37.103777 \n", - "2658 1.493 0.670 0.953 25.569679 \n", - "2659 1.727 0.579 0.920 18.805379 \n", - "2660 1.351 0.740 0.960 30.382539 \n", + " Area Circ. Feret FeretX FeretY FeretAngle MinFeret AR \\\n", + "0 1 157.25 0.680 18.062 1535.0 0.5 131.634 13.500 1.101 \n", + "1 2 2059.75 0.771 62.097 753.5 16.5 165.069 46.697 1.314 \n", + "2 3 1961.50 0.842 57.871 727.0 65.0 71.878 46.923 1.139 \n", + "3 4 5428.50 0.709 114.657 1494.5 83.5 19.620 63.449 1.896 \n", + "4 5 374.00 0.699 29.262 2328.0 34.0 33.147 16.000 1.515 \n", "\n", - "[2661 rows x 12 columns]" + " Round Solidity diameters \n", + "0 0.908 0.937 14.149803 \n", + "1 0.761 0.972 51.210889 \n", + "2 0.878 0.972 49.974587 \n", + "3 0.528 0.947 83.137121 \n", + "4 0.660 0.970 21.821815 " ] }, "execution_count": 2, @@ -296,7 +191,7 @@ "\n", "# estimate equivalent circular diameters (ECDs)\n", "dataset['diameters'] = 2 * np.sqrt(dataset['Area'] / np.pi)\n", - "dataset" + "dataset.head()" ] }, { @@ -391,9 +286,7 @@ "- The shape of the lognormal distribution using the multiplicative standard deviation (MSD)\n", "- A Shapiro-Wilk test warning indicating when the data deviates from normal and/or lognormal (when p-value < 0.05).\n", "\n", - "Note that here the Shapiro-Wilk test warning tells us that the distribution is not normally distributed, which is to be expected since we know that this is a lognormal distribution. Note that the geometric mean and the lognormal shape are very close to the values used to generate the synthetic dataset, 20 and 1.5 respectively.\n", - "\n", - "Now, let's do the same using the dataset that comes from a real rock, for this, we have to pass the column with the diameters:" + "In the example above, the Shapiro-Wilk test tells us that the distribution is not normally distributed, which is to be expected since we know that this is a lognormal distribution. Note that the geometric mean and the lognormal shape are very close to the values used to generate the synthetic random dataset, 20 and 1.5 respectively. Now, let's do the same using the dataset that comes from a real rock, for this, we have to pass the column with the diameters:" ] }, { @@ -450,7 +343,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Leaving aside the difference in numbers, there are some subtle differences compared to the results obtained with the toy dataset. 
First, the confidence interval method for the arithmetic mean is no longer the modified Cox (mCox) but the one based on the central limit theorem (CLT) advised by the [ASTM](https://en.wikipedia.org/wiki/ASTM_International). As previously noted, the function ```summarize()``` automatically choose the optimal confidence interval method depending on distribution features. We show below the decision tree flowchart for choosing the optimal confidence interval estimation method, which is based on [Lopez-Sanchez (2020)](https://doi.org/10.1016/j.jsg.2020.104042)."
+    "Leaving the actual numbers aside, there are some subtle differences compared to the results obtained with the toy dataset. First, the confidence interval method for the arithmetic mean is no longer the modified Cox (mCox) but the one based on the central limit theorem (CLT) advised by the [ASTM](https://en.wikipedia.org/wiki/ASTM_International). As previously noted, the function ```summarize()``` automatically chooses the optimal confidence interval method depending on the distribution features. We show below the decision tree flowchart for choosing the optimal confidence interval estimation method, which is based on [Lopez-Sanchez (2020)](https://doi.org/10.1016/j.jsg.2020.104042)."
   ]
  },
 {
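For reference, here is a minimal usage sketch of the workflow that the updated `summarize()` docstring above describes; it is an illustration, not part of the changeset itself. The import line is an assumption made so the snippet is self-contained: it presumes `GrainSizeTools_script.py` is on the Python path and can be imported as a module, whereas the documentation normally loads the script by running it directly.

```python
# Minimal sketch of the documented workflow (illustrative, not part of the diff).
# Assumption: GrainSizeTools_script.py is importable from the working directory;
# the docs usually load it by executing the script instead of importing it.
import numpy as np

from GrainSizeTools_script import summarize

# Synthetic lognormal population as in the docs: geometric mean ~20, shape ~1.5
np.random.seed(seed=1)  # fix the seed for reproducibility
toy_dataset = np.random.lognormal(mean=np.log(20), sigma=np.log(1.5), size=500)

# Default call: prints the arithmetic and geometric means, median, KDE-based
# mode, their confidence intervals, and the distribution shape statistics
summarize(toy_dataset)

# Same data with a 99% confidence level and a finer precision for the KDE peak
summarize(toy_dataset, ci_level=0.99, precision=0.05)
```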