From b0c29fa435506706a8a2019a4d0fe18f1d9f387e Mon Sep 17 00:00:00 2001
From: "Marco A. Lopez-Sanchez"
Date: Sun, 25 Mar 2018 19:30:33 +0200
Subject: [PATCH] update doc to release v1.4.4

---
 DOCS/Requirements.md     |  2 ++
 DOCS/Scope.md            | 40 +++++++++++++++++++++++++++-------------
 DOCS/brief_tutorial.md   | 39 ++++++++++++++++++++++++++++++---------
 DOCS/imageJ_tutorial.md  | 14 ++++++++------
 GrainSizeTools_script.py | 14 +++++++-------
 5 files changed, 74 insertions(+), 35 deletions(-)

diff --git a/DOCS/Requirements.md b/DOCS/Requirements.md
index 3818593..1528091 100644
--- a/DOCS/Requirements.md
+++ b/DOCS/Requirements.md
@@ -1,3 +1,5 @@
+*last update 2018/03/24*
+
 Requirements
 -------------

diff --git a/DOCS/Scope.md b/DOCS/Scope.md
index 8c7a6d6..7b876d6 100644
--- a/DOCS/Scope.md
+++ b/DOCS/Scope.md
@@ -1,36 +1,50 @@
+*Last update: 2018/03/25*
+
 Scope
 -------------

 GrainSizeTools (GST) script is primarily targeted at anyone who wants to:

-1. Visualize grain size features
-2. Obtain a set of single 1D measures of grain size to estimate the magnitude of differential stress (or rate of mechanical work) in dynamically recrystallized rocks or any other type of crystalline aggregate
-3. Estimate differential stress via paleopizometers (**New in version 1.4!**)
-4. Estimate the actual 3D distribution of grain sizes from a population of apparent grain sizes measured in thin sections. This includes the estimation of the volume occupied by a particular grain size fraction and the shape of the population of grain sizes (assuming that the distribution of grain sizes follows a lognormal distribution)
+1. Visualize the distribution of apparent grain sizes and extract different statistical parameters to describe the features of the distribution
+2. Estimate differential stress via paleopiezometers (**New in versions 1.4+!**)
+3. Approximate the actual 3D distribution of grain sizes from thin sections. 
This includes an estimate of the volume occupied by a particular grain size fraction and the shape of the population of grain sizes (assuming that the distribution of grain sizes is lognormal-like) -GST script requires as the input the measurement of the areas of the grain profiles (grain-by-grain) in a thin section. Hence, the script does not apply for determining mean grain sizes via the planimetric (Jeffries) (i.e. the number of grains per unit area) or intercept (the number of grains intercepted by a test line per unit length of test line) procedures. The reasons for using grain-by-grain methods over the planimetric/intercept procedures in rocks are detailed in [Lopez-Sanchez and Llana-Fúnez (2015)](http://www.solid-earth.net/6/475/2015/). The following is an overview of the key assumptions to consider so that the results obtained by the script are meaningful and reliable. +GST script only requires as input the areas of the grain profiles measured grain-by-grain in a thin section. The script is not intended to determine the mean grain size via the planimetric (Jeffries) (i.e. the number of grains per unit area) or intercept (the number of grains intercepted by a test line per unit length of test line) methods. The reasons for using grain-by-grain methods over the planimetric or intercept ones in natural rocks are detailed in [Lopez-Sanchez and Llana-Fúnez (2015)](http://www.solid-earth.net/6/475/2015/). Below is a brief outline of the key assumptions to consider so that the results obtained by the script are meaningful and reliable. ### Safety concerns > All this safety concerns assume that the calibration of the microscope and the scale of the micrographs were set correctly. 
-#### Getting unidimensional measures of apparent grain size +#### Determination of "average" (mean, median, peak) apparent grain sizes + +Unidimensional apparent grain size measures such as the **mean** or the **median** are only meaningful in specimens that show a **unimodal distribution of diameters** (or areas). It is therefore key to **always visualize if the distribution of apparent grain sizes is unimodal** (a single peak). In the case that the distribution is multimodal (two or more frequency peaks), you can use for comparative purposes the **location of the frequency peaks** based on the Kernel density estimate [(Lopez-Sanchez and Llana-Fúnez, 2015)](http://www.solid-earth.net/6/475/2015/). Despite this, the best option when a multimodal distribution appears is to separate the different populations of grain size using image analysis methods. + +Unfortunately, no general protocol exists in the earth science community on what unidimensional grain size measures to use when using grain-by-grain methods. For example, in the case of paleopiezometry studies, some authors have been using the mean, others the median and some the mode. In addition, some authors applied stereological corrections and others do not, and some estimate the logarithmic or the square root grain size instead of the linear grain size. Since it seems this is not going to change in the short term, it is advisable to always report all the different unidimensional grain size measures (mean, median, freq. peak). This will allow other scientists to directly compare their data with yours without using correction factors which always involve some assumptions. 
In any event, you will have to choose one of them in your study, so we advise following this rule of thumb:
+
+- use **mean and standard deviation (SD)** when your **distribution is normal-like**
+- use **median and interquartile (or interpercentile) range** when your **distribution is skewed**
+
+The rationale behind this rule is that the sample mean is generally more efficient than the median, but the sample median is always more robust. The latter feature makes the median more efficient when distributions have "thick" tails, as happens in most skewed distributions.
+
+When grains are equant (equiaxed) or near-equant (i.e. aspect ratios mostly < 2.0) any specimen orientation is acceptable for estimating unimodal grain size measures, and a **single section** could be enough to obtain a reliable estimate as long as you measure a minimum number of grain sections (see below for details). When grains show aspect ratios above 2.0 and preferred orientation throughout the rock volume, you will need to **estimate the grain size over three orthogonal sections and then average the results** to obtain meaningful results. Although specimens with equant grains accept any orientation to estimate unidimensional grain size measures, it is advisable to use a principal section. Specifically, we promote the use of the XZ section, i.e. parallel to the lineation and perpendicular to the foliation, since this will allow us: (i) to estimate whether the grains are far from equant using the aspect ratio; and (ii) to provide a fairer comparison between different specimens when near-equant grains and preferred orientation of the large axes exist.
+
+To obtain reliable grain size estimates you should measure (or your grain boundary map should contain) at least 433 grain sections, and preferably 965 when possible. 
This sample size will ensure that 95 % of the time the mean grain size estimated will have an error equal or less than ± 4 % (99% if you measure 965) (see [Lopez-Sanchez and Llana-Fúnez, 2015](http://www.solid-earth.net/6/475/2015/) for details). If you want to obtain an accurate **confidence interval** for your estimates you have to take several representative micrographs from the same specimen (three or more) and estimate the "average" (mean, median or peak) grain size in each of them. Then use the ```confidence_interval``` function implemented in the script to get a robust confidence interval. For details on how this determination works see the next section or the documentation of the function using the command ``help(confidence_interval)`` in the console. -Unidimensional apparent grain size measures such as the **mean** or **median** are only meaningful in specimens that show a **unimodal distribution of diameters** (or areas). Consequently, in all cases it is key to visualize the distribution of apparent grain sizes and **observe if the distribution is unimodal** (a single peak). In the case that the distribution is multimodal (two or more peaks), you can use for comparative purposes the modal interval or, better, the **location of frequency peaks** based on the Kernel density estimate [(Lopez-Sanchez and Llana-Fúnez, 2015)](http://www.solid-earth.net/6/475/2015/). Despite this, the best option when a multimodal grain size distribution occurs is to separate the different populations of grain size previously via image analysis methods. Unfortunately, no general protocol exists in the earth science community for unidimensional grain size measures. Consequently, if possible, it is advisable to always report all the different unidimensional grain size measures (mean, median, freq. peak). This will allow other scientists to compare their data with yours directly when using a different type of grain size measurement from that used in your study. 
+Regarding paleopiezometry estimates, **do not use central measures derived from distributions estimated via stereological methods; use apparent grain size measures directly**. The rationale for this is that stereological methods are always built on several (ill-conditioned) geometric assumptions and the results will always be, at best, only approximate. This means that the precision of the estimated 3D size distribution is **much poorer** than the precision of the original distribution of grain profiles since the latter is based on real data.

-When we estimate unimodal grain size measures from a **single section**, whatever the number of grain boundary maps used, the results will be only meaningful if grains are equant (equiaxed) or near-equant (i.e. aspect ratios mostly < 2.0). If grains systematically show aspect ratios above 2.0 and a shape preferred orientation of their large axes throughout the rock volume, you will need to **estimate the grain size over three orthogonal sections and then averaged the results**. Although specimens with equant grains accept any orientation to obtain a unidimensional grain size measure, it is advisable to use a principal section. Specifically, we promote the use of the XZ section, i.e. parallel to the lineation and perpendicular to the foliation, since this will allow us: (i) to visualize and measure whether the grains are far from equant via the aspect ratio; and (ii) to provide a fairer comparison between different specimens when near-equant grains and preferred orientation of the large axes exist.
+#### Determination of stress via paleopiezometers

-A common way to estimate a **confidence interval** of your grain size measurement is to take several representative micrographs from the same specimen (three or more) and then estimate the mean and the variation in the results reporting the standard deviation (SD) at a 2-sigma level of confidence, i.e. the confidence interval will be the mean ± two times the SD. 
To minimize variations in the results due to an insufficient number of grain measurements, a minimum of 433 (although use better 965) is required for each grain boundary map (see [Lopez-Sanchez and Llana-Fúnez, 2015)](http://www.solid-earth.net/6/475/2015/) for details).
+
+When using a piezometric relation, it is of paramount importance to check which type of grain size measure it requires. For example, if you want to use the piezometric relation established for quartz in Stipp and Tullis (2003), note that it was established using the **root mean square apparent diameter** instead of the *linear or the logarithmic mean diameter*. For more details see the step-by-step tutorial.

-For paleopizometry/wattmetry studies **do not report measures derived from distributions estimated via stereological methods but apparent grain size measures**. The reasoning behind this is that stereological methods are built on several (weak) geometric assumptions and the results will always be, at best, only approximate. This means that the precision of the estimated 3D size distribution is **much poorer** than the precision of the original distribution of grain profiles since the latter is based on real data. Lastly, when using a piezometer relation is of paramount importance to ensure what type of grain size measure should be used. For example, if you want to use the piezometric relation established for quartz in Stipp and Tullis (2003), note that they have been established using the **root mean square apparent diameter** not the *linear nor the logarithmic mean diameter*. For details see the step-by-step tutorial.
+You should always estimate a **confidence interval** for your paleopiezometry estimates, which means that you should take at least three independent measures (i.e. from different grain size maps). 
Since version 1.4.4, the GST script implements a function **to estimate robust confidence intervals using the Student's t-distribution** (see the step-by-step tutorial for details). This is a much more robust approach than using the mean plus two times the standard deviation when the sample size is small (< 10) and neither the mean nor the SD can be estimated accurately.

 #### Getting the shape of actual grain size distribution or the volume occupied by a particular grain size fraction

-Estimating the actual grain size distribution from thin sections using stereological methods requires spatial homogeneity and that **grains under study are equant or near-equant**. The Saltykov and two-step methods will not provide reliable results if most of the grains show aspect ratios above 2.0, regardless of whether a shape preference orientation exists or not. In any event, this assumption is acceptable most of the time for some of the most common dynamically recrystallized non-tabular grains in crustal and mantle shear zones, such as quartz, feldspar, olivine and calcite, as well as in ice or metals/alloys. However, be careful when recrystallized grains show very irregular/lobate grain boundaries (i.e. the main recrystallization mechanism was "fast" grain boundary migration).
+Estimating the actual grain size distribution from thin sections using stereological methods requires assuming spatial homogeneity and that **grains under study are equant or near-equant**. The Saltykov and two-step methods will not provide reliable results if most of the grains show aspect ratios above 2.0, regardless of whether a shape preferred orientation exists or not. In any event, this assumption is acceptable most of the time for the most common *recrystallized (dynamically or statically)* mineral phases in crustal and mantle shear zones, such as quartz, feldspar, olivine, and calcite, as well as in ice or metals/alloys. 
However, be careful when recrystallized grains show very irregular/lobate grain boundaries (i.e. the main recrystallization mechanism was "fast" grain boundary migration).

-The Saltykov method is suitable to estimate the volume of a particular grain fraction of interest (in percentage) and to visualize the aspect of the derived 3D grain size distribution using the histogram and a volume-weighted cumulative frequency curve. To provide reliable results, the method requires using a few number of classes and a large number of individual grain measurements. *Practical experience* indicates using more than 1000 grains and less than 20 classes. The number of classes has to be set by a trial and error approach. This will inevitably lead to different authors using a different number of classes across studies. Due to this, when estimating the volume of a grain size fraction based on a single grain boundary map it is necessary to take an absolute error of ± 5 to stay safe (see details in [Lopez-Sanchez and Llana-Fúnez, 2016](http://www.sciencedirect.com/science/article/pii/S0191814116301778)). If possible, take more than one representative grain boundary map and then estimate a confidence interval as explained above in this section.
+The Saltykov method is suitable to estimate the volume of a particular grain fraction of interest (in percentage) and to visualize the aspect of the derived 3D grain size distribution using the histogram and a volume-weighted cumulative frequency curve. To provide reliable results, the method requires using a small number of classes and a large number of individual grain measurements. Practical experience indicates using more than 1000 grains and fewer than 20 classes. The number of classes has to be set by a trial-and-error approach. This will inevitably lead to different authors using a different number of classes across studies. 
Due to this, when estimating the volume of a grain size fraction based on a single grain boundary map it is necessary to take an absolute error of ± 5 to stay safe (see details in [Lopez-Sanchez and Llana-Fúnez, 2016](http://www.sciencedirect.com/science/article/pii/S0191814116301778)). If possible, take more than one representative grain boundary map and then estimate a confidence interval as explained above in this section.

-The two-step method ([Lopez-Sanchez and Llana-Fúnez, 2016](http://www.sciencedirect.com/science/article/pii/S0191814116301778)) is suitable for describing quantitatively the shape of the actual 3D grain size distribution using a single parameter; the multiplicative standard deviation (MSD) value. The method also provides a reliable uncertainty value. The method assumes that the actual grain size distribution follows a lognormal distribution, **there is therefore critical to visualize the distribution using the Saltykov method first and ensure that the distribution is unimodal and lognormal-like**. The MSD estimate is independent of the chosen number of classes as long as the Saltykov method produces stable results (i.e. you do not lose the lognormal appearance of the distribution due to the use of an excessive number of classes).
+The two-step method ([Lopez-Sanchez and Llana-Fúnez, 2016](http://www.sciencedirect.com/science/article/pii/S0191814116301778)) is suitable for describing quantitatively the shape of the actual 3D grain size distribution using a single parameter called the multiplicative standard deviation (MSD) value. The method assumes that the actual grain size distribution follows a lognormal distribution; **it is therefore critical to visualize the distribution using the Saltykov method first and ensure that the distribution is unimodal and lognormal-like**. The MSD estimate is independent of the chosen number of classes as long as the Saltykov method produces stable results (i.e. 
you do not lose the lognormal appearance of the distribution due to the use of an excessive number of classes) and provides a reliable uncertainty value.

 [next section](https://github.com/marcoalopez/GrainSizeTools/blob/master/DOCS/brief_tutorial.md)
diff --git a/DOCS/brief_tutorial.md b/DOCS/brief_tutorial.md
index f58827c..f2415e5 100644
--- a/DOCS/brief_tutorial.md
+++ b/DOCS/brief_tutorial.md
@@ -1,8 +1,10 @@
+*last update 2018/03/25*
+
 Getting Started: A step-by-step tutorial
 -------------

 > **Important note:**
-> Please, **update to version 1.4.3**. It is also advisable to **update the plotting library matplotlib to version 2.x** since all the plots are optimized for such version.
+> Please, **update to version 1.4.4**. It is also advisable to **update the plotting library matplotlib to version 2.x** since all the plots are optimized for such version.

 ### *Open and running the script*

@@ -19,7 +21,7 @@ To use the script it is necessary to run it. To do this, just click on the green
 The following text will appear in the shell/console (Fig. 1):
 ```
 ======================================================================================
-Welcome to GrainSizeTools script v1.4.2
+Welcome to GrainSizeTools script v1.4.4
 ======================================================================================

 GrainSizeTools is a free open-source cross-platform script to visualize and characterize
@@ -27,9 +29,8 @@ the grain size in polycrystalline materials from thin sections and estimate diff
 stresses via paleopizometers.
 METHODS AVAILABLE
------------------
 ================== ==================================================================
-Function           Description
+Functions          Description
 ================== ==================================================================
 extract_areas      Extract the areas of the grains from a text file (txt, csv or xlsx)
 calc_diameters     Calculate the diameter via the equivalent circular diameter
@@ -38,11 +39,12 @@ derive3D Estimate the actual grain size distribution via steorology m
 quartz_piezometer  Estimate diff. stress from grain size in quartz using piezometers
 olivine_piezometer Estimate diff. stress from grain size in olivine using piezometers
 other_pizometers   Estimate diff. stress from grain size in other phases
+confidence_interval Estimate the confidence interval using the t distribution
 ================== ==================================================================

 You can get information on the different methods by:
-    (1) Typing help(name of the method) in the console. e.g. >>> help(derive3D)
-    (2) In the Spyder IDE by writing the name of the method and clicking Ctrl + I
+    (1) Typing help(function name) in the console. e.g. help(conf_interval)
+    (2) In the Spyder IDE by writing the name of the function and clicking Ctrl + I
     (3) Visit script documentation at https://marcoalopez.github.io/GrainSizeTools/


@@ -103,7 +105,7 @@ To sum up, the name following the Python keyword ```def```, in this example ```c
 The names of the Python functions in the script are self-explanatory and each one has been implemented to perform a single task. Although there are a lot of functions within the script, we will only need to call less than four functions to obtain the results.

-### *Using the script to visualize and estimate the grain size features* +### *Using the script to visualize and estimate the grain size* #### Loading the data and extracting the areas of the grain profiles @@ -231,7 +233,7 @@ Although we promote the use of frequency *vs* apparent grain size linear plot (F ```python >>> find_grain_size(areas, diameters, plot='area') ``` -in this example setting to use the area-weighted plot. The name of the different plots available are ```'lin'``` for the linear number-weighted plot (the default), ```'area'``` for the area-weighted plot (as in the example above), ```'sqrt'``` for the square-root grain size plot, and ```'log'``` for the logarithmic grain size plot. Note that the selection of different scales also implies to obtain different grain size estimations. Last, it is very important to note that **the mean of the square root or logarithmic grain sizes is not the same as the square root or the logarithm of the mean**! +in this example setting to use the area-weighted plot. The name of the different plots available are ```'lin'``` for the linear number-weighted plot (the default), ```'area'``` for the area-weighted plot (as in the example above), ```'sqrt'``` for the square-root grain size plot, and ```'log'``` for the logarithmic grain size plot. Note that the selection of different scales also implies to obtain different grain size estimations. Last, it is very important to note that **the mean of the square root or logarithmic grain sizes is not the same as the square root or the logarithm of the grain size mean**! The function includes different plug-in methods to estimate an "optimal" bin size, including an automatic mode. The default automatic mode ```'auto'``` use the Freedman-Diaconis rule when using large datasets (> 1000) and the Sturges rule for small datasets. 
Other available rules are the Freedman-Diaconis ```'fd'```, Scott ```'scott'```, Rice ```'rice'```, Sturges ```'sturges'```, Doane ```'doane'```, and square-root ```'sqrt'``` bin sizes. For more details on the methods see [here](https://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram.html). We encourage you to use the default method ```'auto'```. Empirical experience indicates that the ```'doane'``` and ```'scott'``` methods also work pretty well when you have lognormal- or normal-like distributions, respectively. To specify the method we write in the shell:
@@ -324,8 +326,27 @@ The constant values as put in the script are described in Table 2 below.
 *§ Holyoke and Kronenberg (2010) is a linear recalibration of the Stipp and Tullis (2003) piezometer*
 *¶ Cross et al. (2017) reanalysed the samples of Stipp and Tullis (2003) using EBSD data for reconstructing the grains. Specifically, they use grain maps with a 1 μm and a 200 nm (hr - high resolution) step sizes. This is the preferred piezometer for quartz when grain size data comes from EBSD maps*

+#### Estimating a robust confidence interval
+
+As pointed out in the scope section, the optimal approach is to obtain several measures of stress or grain sizes and then estimate a confidence interval. Since v1.4.4, the script implements a function called ```confidence_interval``` for this. The script assumes that the sample size is small (< 10) and hence it uses the Student's t-distribution with n-1 degrees of freedom to estimate a robust confidence interval. For large datasets the t-distribution approaches the normal distribution, so you can also use this method for large datasets. The function has two inputs: the dataset (required) and the confidence level (optional, set at 0.95 by default). 
For example:
+
+```python
+>>> my_results = [165.3, 174.2, 180.1]
+>>> confidence_interval(data=my_results, confidence=0.95)
+```
+
+The function will return the following information in the console:
+
+```
+Confidence set at 95.0 %
+Mean = 173.2 ± 18.51
+Max / min = 191.71 / 154.69
+Coefficient of variation = 10.7 %
+```
+
+The coefficient of variation expresses the confidence interval as a percentage of the mean, so it can be used to compare confidence intervals between samples with different mean values.

-#### *Derive the actual 3D distribution of grain sizes from thin sections*
+#### Derive the actual 3D distribution of grain sizes from thin sections

 The function responsible to unfold the distribution of apparent grain sizes into the actual 3D grain size distribution is named ```derive3D```. The script implements two methods to do this, the Saltykov and the two-step methods. The Saltykov method is the best option for exploring the dataset and for estimating the volume of a particular grain size fraction. The two-step method is suitable to describe quantitatively the shape of the grain size distribution assuming that they follow a lognormal distribution. This means that the two-step method only yields consistent results when the population of grains considered are completely recrystallized or when the non-recrystallized grains can be previously discarded using shape descriptors or any other relevant parameter such as the density of dislocations. It is therefore necessary to check first whether the linear distribution of grain sizes is unimodal and lognormal-like (i.e. skewed to the right as in the example shown in figure 7). For more details see [Lopez-Sanchez and Llana-Fúnez (2016)](http://www.sciencedirect.com/science/article/pii/S0191814116301778).
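The t-based interval reported by the ```confidence_interval``` example earlier in this tutorial diff can be sketched with the Python standard library alone. This is only an illustration under stated assumptions, not the script's implementation: the helper name ```t_confidence_interval``` is hypothetical, and the critical values are hard-coded from a standard two-sided 95 % Student's t table instead of being computed.

```python
import math
import statistics

# Two-sided 95 % critical values of Student's t for 1 to 9 degrees of freedom.
# Assumption: hard-coded from a standard t table to avoid external dependencies.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def t_confidence_interval(data):
    """Return (mean, half_width) of a 95 % confidence interval.

    Hypothetical helper: uses the t-distribution with n - 1 degrees
    of freedom, which is appropriate for small samples (n <= 10 here).
    """
    n = len(data)
    if n - 1 not in T_CRIT_95:
        raise ValueError("this sketch only covers 2 to 10 measurements")
    mean = statistics.mean(data)
    # standard error of the mean, using the sample standard deviation
    sem = statistics.stdev(data) / math.sqrt(n)
    return mean, T_CRIT_95[n - 1] * sem


mean, err = t_confidence_interval([165.3, 174.2, 180.1])
print(f"Mean = {mean:.1f} ± {err:.2f}")
print(f"Max / min = {mean + err:.2f} / {mean - err:.2f}")
print(f"Coefficient of variation = {100 * err / mean:.1f} %")
```

With only three measurements the critical value (4.303) is more than twice the factor of 2 used in a mean ± 2 SD rule, which is why the t-based interval is the safer choice for small samples.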
diff --git a/DOCS/imageJ_tutorial.md b/DOCS/imageJ_tutorial.md index 53d9925..28497e0 100644 --- a/DOCS/imageJ_tutorial.md +++ b/DOCS/imageJ_tutorial.md @@ -1,7 +1,9 @@ +*last update 2018/03/25* + How to measure the areas of the grain profiles with ImageJ ------------- -**Before you start:** This tutorial assumes that you have installed the ImageJ application. If this is not the case, go [here](http://imagej.nih.gov/ij/) to download and install it. You can also install different flavours of the ImageJ application that will work in a similar way (see [here](http://fiji.sc/ImageJ) for a summary) . As a cautionary note, this is not a detailed tutorial on image analysis using ImageJ at all, but a quick systematic tutorial on how to measure the areas of the grain profiles from a thin section to later estimate the grain size and grain size distribution using the GrainSizeTools script. If you are interested in image analysis methods (e.g. grain segmentation techniques, shape characterization, etc.) you should have a look at the list of references at the end of this tutorial. +> **Before you start:** This tutorial assumes that you have installed the ImageJ application. If this is not the case, go [here](http://imagej.nih.gov/ij/) to download and install it. You can also install different flavours of the ImageJ application that will work in a similar way (see [here](http://fiji.sc/ImageJ) for a summary) . As a cautionary note, this is not a detailed tutorial on image analysis methods using ImageJ, but a quick systematic tutorial on how to measure the areas of the grain profiles from a thin section to later estimate the grain size and grain size distribution using the GrainSizeTools script. If you are interested in image analysis methods (e.g. grain segmentation techniques, shape characterization, etc.) you should have a look at the list of references at the end of this tutorial. 
 ### *Previous considerations on the Grain Boundary Maps*

@@ -10,15 +12,15 @@ Grain size studies in rocks are usually based on measures performed in thin sect
 ![Figure 1. An example of a grain boundary map](https://raw.githubusercontent.com/marcoalopez/GrainSizeTools/master/FIGURES/GBmap.png)
 *Figure 1. An example of a grain boundary map*

-Nowadays, these measures are mostly made on digital images made by pixels (e.g. Heilbronner and Barret 2014), also known as raster graphics image. You can obtain some information on raster graphics [here](https://en.wikipedia.org/wiki/Raster_graphics). For example, in a 8-bit grayscale image -the most used type of grayscale image-, each pixel contains three values: information about its location in the image -their x and y coordinates- and its 'gray' value in a range that goes from 0 (white) to 256 (black) (i.e. it allows 256 different gray intensities). In the case of a grain boundary map (Fig. 1), we usually use a binary image where only two possible values exist, 0 for white pixels and 1 for black pixels.
+Nowadays, these measures are mostly made on digital images made by pixels (e.g. Heilbronner and Barret 2014), also known as raster graphics images. You can obtain some information on raster graphics [here](https://en.wikipedia.org/wiki/Raster_graphics). For example, in an 8-bit grayscale image -the most used type of grayscale image-, each pixel contains three values: information about its location in the image -its x and y coordinates- and its 'grey' value in a range that goes from 0 (white) to 255 (black) (i.e. it allows 256 different grey intensities). In the case of a grain boundary map (Fig. 1), we usually use a binary image where only two possible values exist, 0 for white pixels and 1 for black pixels.

-One of the key points on raster images is that they are resolution dependent, which means that each pixel have a physical dimension. Consequently, the smaller the size of the pixel, the higher the resolution. 
The resolution depends on the number of pixels per unit area or length, and it is usually measured in pixel per (square) inch (PPI) (more information about [Image resolution](https://en.wikipedia.org/wiki/Image_resolution) and [Pixel density](https://en.wikipedia.org/wiki/Pixel_density)). This concept is key since the resolution of our raw image -the image obtained directly from the microscope- will limit the precision of the measures. Known the size of the pixels is therefore essential and it will allow us to set the scale of the image to measure of the areas of the grain profiles. In addition, it will allow us to later make a perimeter correction when calculating the equivalent diameters from the areas of the grain profiles. So be sure about the image resolution at every step, from the raw image until you get the grain boundary map.
+One of the key points about raster images is that they are resolution dependent, which means that each pixel has a physical dimension. Consequently, the smaller the size of the pixel, the higher the resolution. The resolution depends on the number of pixels per unit area or length, and it is usually measured in pixels per (square) inch (PPI) (more information about [Image resolution](https://en.wikipedia.org/wiki/Image_resolution) and [Pixel density](https://en.wikipedia.org/wiki/Pixel_density)). This concept is key since the resolution of our raw image -the image obtained directly from the microscope- will limit the precision of the measures. Knowing the size of the pixels is therefore essential, since it allows us to set the scale of the image and measure the areas of the grain profiles. In addition, it will allow us to later make a perimeter correction when calculating the equivalent diameters from the areas of the grain profiles. So be sure about the image resolution at every step, from the "raw" image you get from the microscope until you get the grain boundary map.
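The arithmetic behind setting the scale and computing equivalent diameters, as described above, can be sketched as follows. This is a minimal illustration assuming square pixels of known side length; the function names and the numbers are hypothetical, it is not the script's ```calc_diameters``` function, and the perimeter correction mentioned above is deliberately left out.

```python
import math


def pixel_area_to_real(area_px, pixel_size):
    """Convert an area counted in pixels to squared length units.

    pixel_size is the side length of one (square) pixel, e.g. in microns.
    """
    return area_px * pixel_size ** 2


def equivalent_circular_diameter(area):
    """Diameter of the circle that has the same area as the grain profile."""
    return 2.0 * math.sqrt(area / math.pi)


# Hypothetical grain profile covering 1200 pixels, with pixels 0.5 microns wide
area_um2 = pixel_area_to_real(1200, 0.5)
diameter_um = equivalent_circular_diameter(area_um2)
print(f"area = {area_um2:.1f} square microns, diameter = {diameter_um:.2f} microns")
```

Because the area scales with the square of the pixel size, an error in the assumed pixel size propagates linearly to the diameter and quadratically to the area.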
-> Note: It is important not to confuse the pixel resolution with the actual spatial resolution of the image. The spatial resolution is the actual resolution of the image and it is limited physically not by the number of pixels per unit area/length. For example, conventional SEM techniques have a maximum spatial resolution of 50 to 100 nm whatever the pixels in the image recorded. Think in a digital image of a square inch in size and made of just one black pixel (i.e. with a resolution of ppi = 1). If we double the resolution of the image, we will obtain the same image but now formed by four black pixels instead of one. The new pixel resolution per unit length will be ppi = 2 (or ppi = 4 per unit area). In contrast, the spatial resolution of the image remains the same. Strictly speaking, the spatial resolution refers to the number of independent pixel values per unit area/length. +> Note: It is important not to confuse the pixel resolution with the actual spatial resolution of the image. The spatial resolution is the actual resolution of the image and it is limited physically, not by the number of pixels per unit area/length. For example, conventional SEM techniques have a maximum spatial resolution of 50 to 100 nm regardless of the number of pixels in the recorded image. Think of a digital image one square inch in size and made of just one black pixel (i.e. with a resolution of ppi = 1). If we double the resolution of the image, we will obtain the same image but now formed by four black pixels instead of one. The new pixel resolution per unit length will be ppi = 2 (or 4 per unit area); however, the spatial resolution of the image remains the same. Strictly speaking, the spatial resolution refers to the number of independent pixel values per unit area/length. -The techniques that make possible the transition from a raw image to a grain boundary map, known as grain segmentation, are numerous and depend largely on the type of image obtained from the microscope. 
Thus, digital images may come from transmission or reflected light microscopy, semi-automatic techniques coupled to light microscopy such as the CIP method (e.g. Heilbronner 2000), electron microscopy either from BSD images or EBSD grain maps, or even from electron microprobes through compositional mapping. All this techniques produce very different images (i.e. different resolutions, color *vs* gray scale, nature of the artefacts, grain size boundary *vs* phase maps, etc.). The presentation of this image analysis techniques is beyond the scope of this tutorial and the reader is referred to the references cited at the end of this document and, particularly, to the books written by Russ (2011) and Heilbronner and Barret (2014). Hence, this tutorial is focused on the features of the grain boundary maps by itself not in how to convert the raw images to grain boundary maps using manual, automatic or semi-automatic grain segmentation. +The techniques that make possible the transition from a raw image to a grain boundary map, known as grain segmentation, are numerous and depend largely on the type of image obtained from the microscope. Thus, digital images may come from transmission or reflected light microscopy, semi-automatic techniques coupled to light microscopy such as the CIP method (e.g. Heilbronner 2000), electron microscopy either from BSD images or EBSD grain maps, or even from electron microprobes through compositional mapping. All these techniques produce very different images (i.e. different resolutions, colour *vs* grey scale, nature of the artefacts, grain size boundary *vs* phase maps, etc.). The presentation of these segmentation techniques is beyond the scope of this tutorial and the reader is referred to the references cited at the end of this document and, particularly, to the books written by Russ (2011) and Heilbronner and Barret (2014). 
This tutorial is focused instead on the features of the grain boundary maps themselves, not on how to convert the raw images to grain boundary maps using manual, automatic or semi-automatic grain segmentation. -Once the grain segmentation is done, it is crucial to ensure that at the actual pixel resolution the grain boundaries have a width of two or more pixels (Fig. 2). This will prevent the formation of undesirable artefacts since when two black pixels belonging to two different grains are adjacent to each other, both grains will be considered the same grain by the image analysis software. +Once the grain segmentation is done, it is crucial to ensure that at the actual pixel resolution the grain boundaries have a width of two or three pixels (Fig. 2). This will prevent the formation of undesirable artefacts since when two black pixels belonging to two different grains are adjacent to each other, both grains will be considered the same grain by the image analysis software. *Figure 2. Detail of grain boundaries in a grain boundary map. The figure shows the boundaries (in white) between three grains in a grain boundary map. The squares represent the pixels in the image. The boundaries are two pixels wide approximately.* diff --git a/GrainSizeTools_script.py b/GrainSizeTools_script.py index 9423924..95f98bf 100644 --- a/GrainSizeTools_script.py +++ b/GrainSizeTools_script.py @@ -383,10 +383,10 @@ def derive3D(diameters, numbins=10, set_limit=None, fit=False, initial_guess=Fal def confidence_interval(data, confidence=0.95): - """Estimate the confidence interval using the t distribution with n-1 + """Estimate the confidence interval using the t-distribution with n-1 degrees of freedom t(n-1). This is useful when sample size is small and the standard deviation cannot be estimated accurately. For large - datasets, the t distribution approaches the normal distribution. + datasets, the t-distribution approaches the normal distribution. 
Parameters ---------- @@ -394,7 +394,7 @@ def confidence_interval(data, confidence=0.95): the dataset confidence: float between 0 and 1 - the confidence interval + the confidence interval, default = 0.95 Assumptions ----------- @@ -405,17 +405,17 @@ def confidence_interval(data, confidence=0.95): None """ - n = len(data) - degrees_freedom = n - 1 + degrees_freedom = len(data) - 1 sample_mean = np.mean(data) sd_err = sem(data) # Standard error of the mean SD / sqrt(n) low, high = t.interval(confidence, degrees_freedom, sample_mean, sd_err) err = high - sample_mean - + print(' ') + print('Confidence set at', confidence*100, '%') print('Mean =', round(sample_mean, 2), '±', round(err, 2)) print('Max / min =', round(high, 2), '/', round(low, 2)) - print('Coefficient of variation =', round(100 * err / sample_mean, 1), '(%)') + print('Coefficient of variation =', round(100 * err / sample_mean, 1), '%') return None
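Note on the `confidence_interval` change above: the calculation it performs can be tried in isolation with the minimal sketch below. It assumes NumPy and SciPy are installed. The function name `t_confidence_interval`, the returned tuple, and the sample values are hypothetical and are not part of the patch; the script's own function prints its results and returns `None`.

```python
import numpy as np
from scipy.stats import sem, t


def t_confidence_interval(data, confidence=0.95):
    """Return (mean, low, high) using the t-distribution with n-1
    degrees of freedom. Hypothetical helper, for illustration only."""
    data = np.asarray(data, dtype=float)
    degrees_freedom = data.size - 1
    sample_mean = np.mean(data)
    sd_err = sem(data)  # standard error of the mean, SD / sqrt(n)
    low, high = t.interval(confidence, degrees_freedom,
                           loc=sample_mean, scale=sd_err)
    return sample_mean, low, high


# Made-up sample values, used only to exercise the function
mean, low, high = t_confidence_interval([10, 12, 14, 16, 18])
err = high - mean  # half-width of the interval, as reported after '±'
print('Mean =', round(mean, 2), '±', round(err, 2))
```

Returning the values instead of printing them makes the interval easier to reuse or check, which is the only design difference from the patched function.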