diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index bca6e52..d95421b 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.6.7","generation_timestamp":"2023-12-28T18:43:39","documenter_version":"1.2.1"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.10.0","generation_timestamp":"2023-12-28T18:48:57","documenter_version":"1.2.1"}}
\ No newline at end of file
diff --git a/dev/acd/index.html b/dev/acd/index.html
index 617be1f..2fee5fd 100644
--- a/dev/acd/index.html
+++ b/dev/acd/index.html
@@ -28,4 +28,4 @@
 y = rand(rng, 3000)
 z = rand(rng, 10000)
 a = [Sound(x, 8000), Sound(y, 8000), Sound(z, 8000)]
-distinctiveness(a[1], a[2:3])
45165.66072314729

The number is effectively an index of how acoustically unique a word is in a language.

Function documentation

Phonetics.acdistMethod
acdist(s1, s2; [method=:dtw, dist=SqEuclidean(), radius=10])

Calculate the acoustic distance between s1 and s2 with method version of dynamic time warping and dist as the interior distance function. Using method=:dtw uses vanilla dynamic time warping, while method=:fastdtw uses the fast dtw approximation. Note that this is not a true mathematical distance metric because dynamic time warping does not necessarily satisfy the triangle inequality, nor does it guarantee the identity of indiscernibles.

Args

  • s1 Features-by-time array of first sound to compare
  • s2 Features-by-time array of second sound to compare
  • method (keyword) Which method of dynamic time warping to use
  • dist (keyword) Any distance function implementing the SemiMetric interface from the Distances package
  • dtwradius (keyword) Maximum warping radius for vanilla dynamic time warping; if no value is passed, no warping constraint is used; argument unused when method=:fastdtw
  • fastradius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw
source
Phonetics.acdistFunction
acdist(s1::Sound, s2::Sound, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10])

Convert s1 and s2 to a frequency representation specified by rep, then calculate acoustic distance between s1 and s2. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source
Phonetics.avgseqMethod
avgseq(S; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])

Return a sequence representing the average of the sequences in S using the dba method for sequence averaging. Supports method=:dtw for vanilla dtw and method=:fastdtw for fast dtw approximation when performing the sequence comparisons. With center=:medoid, finds the medoid as the sequence to use as the initial center, and with center=:rand selects a random element in S as the initial center.

Args

  • S An array of sequences to average
  • method (keyword) The method of dynamic time warping to use
  • dist (keyword) Any distance function implementing the SemiMetric interface from the Distances package
  • radius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw
  • center (keyword) The method used to select the initial center of the sequences in S
  • dtwradius (keyword) How far a time step can be mapped when comparing sequences; passed directly to DTW function from DynamicAxisWarping; if set to nothing, the length of the longest sequence will be used, effectively removing the radius restriction
  • progress Whether to show the progress coming from dba
source
Phonetics.avgseqFunction
avgseq(S::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])

Convert the Sound objects in S to a representation designated by rep, then find the average sequence of them. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source
Phonetics.distinctivenessMethod
distinctiveness(s, corpus; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])

Calculates the acoustic distinctiveness of s given the corpus corpus. The method, dist, and radius arguments are passed into acdist. The reduction argument can be any function that reduces an iterable to one number, such as mean, sum, or median.

For more information, see Kelley (2018, September, How acoustic distinctiveness affects spoken word recognition: A pilot study, DOI: 10.7939/R39G5GV9Q) and Kelley & Tucker (2018, Using acoustic distance to quantify lexical competition, DOI: 10.7939/r3-wbhs-kr84).

source
Phonetics.distinctivenessFunction
distinctiveness(s::Sound, corpus::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])

Converts s and corpus to a representation specified by rep, then calculates the acoustic distinctiveness of s given corpus. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source

References

Bartelds, M., Richter, C., Liberman, M., & Wieling, M. (2020). A new acoustic-based pronunciation distance measure. Frontiers in Artificial Intelligence, 3, 39.

Mielke, J. (2012). A phonetically based metric of sound similarity. Lingua, 122(2), 145-163.

Kelley, M. C. (2018). How acoustic distinctiveness affects spoken word recognition: A pilot study. Presented at the 11th International Conference on the Mental Lexicon (Edmonton, AB). https://doi.org/10.7939/R39G5GV9Q

Kelley, M. C., & Tucker, B. V. (2018). Using acoustic distance to quantify lexical competition. University of Alberta ERA (Education and Research Archive). https://doi.org/10.7939/r3-wbhs-kr84

Petitjean, F., Ketterlin, A., & Gançarski, P. (2011). A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition, 44(3), 678–693.

+distinctiveness(a[1], a[2:3])
45165.66072314727

The number is effectively an index of how acoustically unique a word is in a language.

Function documentation

Phonetics.acdistMethod
acdist(s1, s2; [method=:dtw, dist=SqEuclidean(), radius=10])

Calculate the acoustic distance between s1 and s2 with method version of dynamic time warping and dist as the interior distance function. Using method=:dtw uses vanilla dynamic time warping, while method=:fastdtw uses the fast dtw approximation. Note that this is not a true mathematical distance metric because dynamic time warping does not necessarily satisfy the triangle inequality, nor does it guarantee the identity of indiscernibles.
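As a rough illustration of the description above, the following is a minimal sketch of vanilla dynamic time warping over two features-by-time matrices, using squared Euclidean distance between frames. It is not the package's implementation (which, per the docs, relies on DynamicAxisWarping and Distances); the names sqeuclidean and dtw_cost are hypothetical, and no warping radius is applied.

```julia
# Hypothetical helper: squared Euclidean distance between two frames.
sqeuclidean(u, v) = sum(abs2, u .- v)

# Sketch of unconstrained DTW; columns of s1 and s2 are time frames.
function dtw_cost(s1::AbstractMatrix, s2::AbstractMatrix; dist=sqeuclidean)
    n, m = size(s1, 2), size(s2, 2)
    D = fill(Inf, n + 1, m + 1)   # accumulated-cost table with a border row/column
    D[1, 1] = 0.0
    for i in 1:n, j in 1:m
        c = dist(view(s1, :, i), view(s2, :, j))
        # Extend the cheapest of the three admissible predecessor cells.
        D[i + 1, j + 1] = c + min(D[i, j + 1], D[i + 1, j], D[i, j])
    end
    return D[n + 1, m + 1]
end

a = [0.0 1.0 2.0]        # 1 feature, 3 frames
b = [0.0 1.0 1.0 2.0]    # same contour with one frame repeated
dtw_cost(a, b)           # 0.0, although a and b are not identical
```

The last call shows concretely why the result is not a true metric: two different sequences can have a warped distance of zero, so the identity of indiscernibles does not hold.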

Args

  • s1 Features-by-time array of first sound to compare
  • s2 Features-by-time array of second sound to compare
  • method (keyword) Which method of dynamic time warping to use
  • dist (keyword) Any distance function implementing the SemiMetric interface from the Distances package
  • dtwradius (keyword) Maximum warping radius for vanilla dynamic time warping; if no value is passed, no warping constraint is used; argument unused when method=:fastdtw
  • fastradius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw
source
Phonetics.acdistFunction
acdist(s1::Sound, s2::Sound, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10])

Convert s1 and s2 to a frequency representation specified by rep, then calculate acoustic distance between s1 and s2. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source
Phonetics.avgseqMethod
avgseq(S; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])

Return a sequence representing the average of the sequences in S using the dba method for sequence averaging. Supports method=:dtw for vanilla dtw and method=:fastdtw for fast dtw approximation when performing the sequence comparisons. With center=:medoid, finds the medoid as the sequence to use as the initial center, and with center=:rand selects a random element in S as the initial center.

Args

  • S An array of sequences to average
  • method (keyword) The method of dynamic time warping to use
  • dist (keyword) Any distance function implementing the SemiMetric interface from the Distances package
  • radius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw
  • center (keyword) The method used to select the initial center of the sequences in S
  • dtwradius (keyword) How far a time step can be mapped when comparing sequences; passed directly to DTW function from DynamicAxisWarping; if set to nothing, the length of the longest sequence will be used, effectively removing the radius restriction
  • progress Whether to show the progress coming from dba
source
Phonetics.avgseqFunction
avgseq(S::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])

Convert the Sound objects in S to a representation designated by rep, then find the average sequence of them. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source
Phonetics.distinctivenessMethod
distinctiveness(s, corpus; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])

Calculates the acoustic distinctiveness of s given the corpus corpus. The method, dist, and radius arguments are passed into acdist. The reduction argument can be any function that reduces an iterable to one number, such as mean, sum, or median.
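The role of the reduction argument can be sketched as follows. This is only an illustration under stated assumptions: distinctiveness_sketch and toy_dist are hypothetical names, and a toy distance on plain numbers stands in for acdist so that the example is self-contained.

```julia
using Statistics: mean, median

# Sketch: compare s with every corpus item, then reduce the distances to one number.
distinctiveness_sketch(s, corpus; dist, reduction=mean) =
    reduction([dist(s, c) for c in corpus])

# Toy stand-in for an acoustic distance (assumption for illustration only).
toy_dist(a, b) = abs(a - b)

distinctiveness_sketch(1.0, [2.0, 4.0]; dist=toy_dist)                 # mean of [1.0, 3.0]
distinctiveness_sketch(1.0, [2.0, 4.0]; dist=toy_dist, reduction=sum)  # sum of [1.0, 3.0]
```

Any function that maps a collection of distances to a single number, such as median, can be passed the same way.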

For more information, see Kelley (2018, September, How acoustic distinctiveness affects spoken word recognition: A pilot study, DOI: 10.7939/R39G5GV9Q) and Kelley & Tucker (2018, Using acoustic distance to quantify lexical competition, DOI: 10.7939/r3-wbhs-kr84).

source
Phonetics.distinctivenessFunction
distinctiveness(s::Sound, corpus::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])

Converts s and corpus to a representation specified by rep, then calculates the acoustic distinctiveness of s given corpus. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source

References

Bartelds, M., Richter, C., Liberman, M., & Wieling, M. (2020). A new acoustic-based pronunciation distance measure. Frontiers in Artificial Intelligence, 3, 39.

Mielke, J. (2012). A phonetically based metric of sound similarity. Lingua, 122(2), 145-163.

Kelley, M. C. (2018). How acoustic distinctiveness affects spoken word recognition: A pilot study. Presented at the 11th International Conference on the Mental Lexicon (Edmonton, AB). https://doi.org/10.7939/R39G5GV9Q

Kelley, M. C., & Tucker, B. V. (2018). Using acoustic distance to quantify lexical competition. University of Alberta ERA (Education and Research Archive). https://doi.org/10.7939/r3-wbhs-kr84

Petitjean, F., Ketterlin, A., & Gançarski, P. (2011). A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition, 44(3), 678–693.

diff --git a/dev/alt_axes_vowel_plot.svg b/dev/alt_axes_vowel_plot.svg
index 8683aea..7dfe2f4 100644
--- a/dev/alt_axes_vowel_plot.svg
+++ b/dev/alt_axes_vowel_plot.svg
@@ -1,141 +1,141 @@
[SVG markup changes not shown]
diff --git a/dev/ellipse_vowel_plot.svg b/dev/ellipse_vowel_plot.svg
index 0ef967c..a25385e 100644
--- a/dev/ellipse_vowel_plot.svg
+++ b/dev/ellipse_vowel_plot.svg
@@ -1,144 +1,144 @@
[SVG markup changes not shown]
diff --git a/dev/index.html b/dev/index.html
index ab11f5e..98df184 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -1,2 +1,2 @@
-Home · Phonetics.jl

Home

This is a Julia package that provides a collection of functions to analyze phonetic data.

+Home · Phonetics.jl

Home

This is a Julia package that provides a collection of functions to analyze phonetic data.

diff --git a/dev/lc/index.html b/dev/lc/index.html
index 728421c..561d126 100644
--- a/dev/lc/index.html
+++ b/dev/lc/index.html
@@ -63,5 +63,5 @@
 ["M", "AA1", "R", "K"], # mark
 ["K", "AE1", "B"], # cab
 ]
-upt(sample_corpus, [["T", "AE1", "T"]]; inCorpus=false)
1×2 DataFrame
RowQueryUPT
Array…Any
1["T", "AE1", "T"]4

Here, [T AE1 T] tat cannot be uniquely identified until after the sequence is complete, so its uniqueness point is one longer than its length.

Function documentation

Phonetics.pndMethod
pnd(corpus::Array, queries::Array; [progress=true])

Calculate the phonological neighborhood density (pnd) for each item in queries based on the items in corpus. This function uses a vantage point tree data structure to speed up the search for neighbors by pruning the search space. This function should work regardless of whether the items in queries are in corpus or not.

Parameters

  • corpus The corpus to be queried for phonological neighbors
  • queries The items to query phonological neighbors for in corpus
  • progress Whether to display a progress meter or not

Returns

  • A DataFrame with the queries in the first column and the phonological neighborhood density in the second
source
Phonetics.phnprbMethod
phnprb(corpus::Array, frequencies::Array, queries::Array; positional=false,
-    nchar=1, pad=true)

Calculates the phonotactic probability for each item in a list of queries based on a corpus

Arguments

  • corpus The corpus on which to base the probability calculations
  • frequencies The frequencies associated with each element in corpus
  • queries The items for which the probability should be calculated

Keyword arguments

  • positional Whether to consider where in the query a given phone appears

(e.g., should "K" as the first sound be considered a different category than "K" as the second sound?)

  • nchar The number of characters for each n-gram that will be examined (e.g., 2 for diphones)
  • pad Whether to add padding to each query or not

Returns

A DataFrame with the queries in the first column and the probability values in the second

source
Phonetics.uptMethod
upt(corpus, queries; [inCorpus=true])

Calculates the phonological uniqueness point (upt) of the items in queries based on the items in corpus. If the items are expected to be in the corpus, this function will calculate the uniqueness point to be when a branch can be considered to only represent 1 word. If the items are not expected to be in the corpus, the uniqueness point will be taken to be the depth at which the tree can no longer be traversed.

Parameters

  • corpus The items comprising the corpus to compare against when calculating the uniqueness point of each query
  • queries The items for which to calculate the uniqueness point
  • inCorpus Whether the query items are expected to be in the corpus or not

Returns

  • A DataFrame with the queries in the first column and the uniqueness points in the second
source

References

Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and hearing, 19(1), 1.

Vitevitch, M. S., & Luce, P. A. (2016). Phonological neighborhood effects in spoken word perception and production. Annual Review of Linguistics, 2, 75-94.

+upt(sample_corpus, [["T", "AE1", "T"]]; inCorpus=false)
1×2 DataFrame
RowQueryUPT
Array…Any
1["T", "AE1", "T"]4

Here, [T AE1 T] tat cannot be uniquely identified until after the sequence is complete, so its uniqueness point is one longer than its length.

Function documentation

Phonetics.pndMethod
pnd(corpus::Array, queries::Array; [progress=true])

Calculate the phonological neighborhood density (pnd) for each item in queries based on the items in corpus. This function uses a vantage point tree data structure to speed up the search for neighbors by pruning the search space. This function should work regardless of whether the items in queries are in corpus or not.
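For comparison with the tree-based search described above, here is a minimal brute-force sketch that counts corpus items at a Levenshtein distance of exactly 1 from a query. It is not how pnd is implemented (the package uses a vantage point tree); levenshtein and pnd_bruteforce are hypothetical names used only for this illustration.

```julia
# Levenshtein distance between two phone sequences, using two rolling rows.
function levenshtein(a::AbstractVector, b::AbstractVector)
    m, n = length(a), length(b)
    prev = collect(0:n)       # distances from the empty prefix of a
    curr = similar(prev)
    for i in 1:m
        curr[1] = i
        for j in 1:n
            cost = a[i] == b[j] ? 0 : 1
            curr[j + 1] = min(prev[j + 1] + 1,   # deletion
                              curr[j] + 1,       # insertion
                              prev[j] + cost)    # substitution or match
        end
        prev, curr = curr, prev
    end
    return prev[n + 1]
end

# Naive neighborhood density: number of corpus items exactly one edit away.
pnd_bruteforce(corpus, query) = count(item -> levenshtein(item, query) == 1, corpus)

sample_corpus = [
    ["K", "AE1", "T"],       # cat
    ["K", "AA1", "B"],       # cob
    ["B", "AE1", "T"],       # bat
    ["T", "AE1", "T", "S"],  # tats
    ["M", "AA1", "R", "K"],  # mark
    ["K", "AE1", "B"],       # cab
]
pnd_bruteforce(sample_corpus, ["K", "AE1", "T"])  # 2 (bat and cab)
```

This baseline compares the query against every corpus item, which is the cost the vantage point tree is meant to reduce.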

Parameters

  • corpus The corpus to be queried for phonological neighbors
  • queries The items to query phonological neighbors for in corpus
  • progress Whether to display a progress meter or not

Returns

  • A DataFrame with the queries in the first column and the phonological neighborhood density in the second
source
Phonetics.phnprbMethod
phnprb(corpus::Array, frequencies::Array, queries::Array; positional=false,
+    nchar=1, pad=true)

Calculates the phonotactic probability for each item in a list of queries based on a corpus

Arguments

  • corpus The corpus on which to base the probability calculations
  • frequencies The frequencies associated with each element in corpus
  • queries The items for which the probability should be calculated

Keyword arguments

  • positional Whether to consider where in the query a given phone appears

(e.g., should "K" as the first sound be considered a different category than "K" as the second sound?)

  • nchar The number of characters for each n-gram that will be examined (e.g., 2 for diphones)
  • pad Whether to add padding to each query or not

Returns

A DataFrame with the queries in the first column and the probability values in the second

source
Phonetics.uptMethod
upt(corpus, queries; [inCorpus=true])

Calculates the phonological uniqueness point (upt) of the items in queries based on the items in corpus. If the items are expected to be in the corpus, this function will calculate the uniqueness point to be when a branch can be considered to only represent 1 word. If the items are not expected to be in the corpus, the uniqueness point will be taken to be the depth at which the tree can no longer be traversed.
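One way to read the two cases above is sketched below with plain prefix counting instead of a tree. This is an assumption about the behavior rather than the package's code: has_prefix and upt_sketch are hypothetical names, and returning length + 1 when no uniqueness point is reached follows the tat example earlier on this page.

```julia
# True if `item` begins with the phones in `prefix`.
has_prefix(item, prefix) =
    length(item) >= length(prefix) && item[1:length(prefix)] == prefix

function upt_sketch(corpus, query; inCorpus=true)
    # In-corpus queries become unique when exactly one item shares the prefix;
    # out-of-corpus queries when no item shares it any longer.
    target = inCorpus ? 1 : 0
    for k in 1:length(query)
        matches = count(item -> has_prefix(item, query[1:k]), corpus)
        matches == target && return k
    end
    return length(query) + 1   # never became unique within the query itself
end

sample_corpus = [
    ["K", "AE1", "T"], ["K", "AA1", "B"], ["B", "AE1", "T"],
    ["T", "AE1", "T", "S"], ["M", "AA1", "R", "K"], ["K", "AE1", "B"],
]
upt_sketch(sample_corpus, ["T", "AE1", "T"]; inCorpus=false)  # 4
upt_sketch(sample_corpus, ["K", "AE1", "T"])                  # 3
```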

Parameters

  • corpus The items comprising the corpus to compare against when calculating the uniqueness point of each query
  • queries The items for which to calculate the uniqueness point
  • inCorpus Whether the query items are expected to be in the corpus or not

Returns

  • A DataFrame with the queries in the first column and the uniqueness points in the second
source

References

Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and hearing, 19(1), 1.

Vitevitch, M. S., & Luce, P. A. (2016). Phonological neighborhood effects in spoken word perception and production. Annual Review of Linguistics, 2, 75-94.

diff --git a/dev/means_only_ellipse_vowel_plot.svg b/dev/means_only_ellipse_vowel_plot.svg
index 3a1785a..c179eeb 100644
--- a/dev/means_only_ellipse_vowel_plot.svg
+++ b/dev/means_only_ellipse_vowel_plot.svg
@@ -1,58 +1,58 @@
[SVG markup changes not shown]
diff --git a/dev/norm/index.html b/dev/norm/index.html
index ea3931d..783616d 100644
--- a/dev/norm/index.html
+++ b/dev/norm/index.html
@@ -1,2 +1,2 @@
-Vowel normalization · Phonetics.jl

Vowel normalization

Vowel normalization routines come from a variety of sources. In the case of the Nearey normalizations, they are intended to show how a listener takes formant information (which varies not only based on vowel category, but also on factors like gender and age) and transforms them to a space where formant information is more related to vowel category. In that sense, it is a perceptually motivated routine. Other routines, like the Lobanov normalization routine (Lobanov, 1971), are more purpose driven in that their goal is just to allow vowel comparisons between speakers, regardless of whether the technique is perceptually motivated or plausible.

Nearey normalization routines

Barreda and Nearey normalization routine

Lobanov normalization routine

References

Barreda, S., & Nearey, T. M. (2018). A regression approach to vowel normalization for missing and unbalanced data. The Journal of the Acoustical Society of America, 144(1), 500–520. https://doi.org/10.1121/1.5047742

Lobanov, B. M. (1971). Classification of Russian vowels spoken by different speakers. The Journal of the Acoustical Society of America, 49(2B), 606–608. https://doi.org/10.1121/1.1912396

Nearey, T. M. (1978). Phonetic feature system for vowels. Indiana University Linguistics Club.

+Vowel normalization · Phonetics.jl

Vowel normalization

Vowel normalization routines come from a variety of sources. In the case of the Nearey normalizations, they are intended to show how a listener takes formant information (which varies not only based on vowel category, but also on factors like gender and age) and transforms them to a space where formant information is more related to vowel category. In that sense, it is a perceptually motivated routine. Other routines, like the Lobanov normalization routine (Lobanov, 1971), are more purpose driven in that their goal is just to allow vowel comparisons between speakers, regardless of whether the technique is perceptually motivated or plausible.
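As a small illustration of the purpose-driven kind of routine, Lobanov normalization is commonly described as z-scoring each formant within a speaker. The sketch below assumes that formulation and the sample standard deviation; lobanov_sketch is a hypothetical name and is not taken from the package's API.

```julia
using Statistics: mean, std

# Z-score one speaker's measurements of a single formant (e.g. all F1 values).
# Assumption: sample standard deviation; some descriptions use the population version.
lobanov_sketch(f::AbstractVector{<:Real}) = (f .- mean(f)) ./ std(f)

f1_hz = [300.0, 500.0, 700.0]   # made-up F1 values for one speaker
z = lobanov_sketch(f1_hz)       # centered on 0 with unit spread
```

Applied per speaker and per formant, this puts different speakers' vowel spaces on a common scale without making any claim about perception.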

Nearey normalization routines

Barreda and Nearey normalization routine

Lobanov normalization routine

References

Barreda, S., & Nearey, T. M. (2018). A regression approach to vowel normalization for missing and unbalanced data. The Journal of the Acoustical Society of America, 144(1), 500–520. https://doi.org/10.1121/1.5047742

Lobanov, B. M. (1971). Classification of Russian vowels spoken by different speakers. The Journal of the Acoustical Society of America, 49(2B), 606–608. https://doi.org/10.1121/1.1912396

Nearey, T. M. (1978). Phonetic feature system for vowels. Indiana University Linguistics Club.

diff --git a/dev/phon_spectrogram/0f436fcf.svg b/dev/phon_spectrogram/0cc3b8eb.svg
similarity index 99%
rename from dev/phon_spectrogram/0f436fcf.svg
rename to dev/phon_spectrogram/0cc3b8eb.svg
index 5a17674..c8ee9ed 100644
--- a/dev/phon_spectrogram/0f436fcf.svg
+++ b/dev/phon_spectrogram/0cc3b8eb.svg
@@ -1,45 +1,45 @@
[SVG markup changes not shown]
diff --git a/dev/phon_spectrogram/e7b86cba.svg b/dev/phon_spectrogram/29bbe87d.svg
similarity index 98%
rename from dev/phon_spectrogram/e7b86cba.svg
rename to dev/phon_spectrogram/29bbe87d.svg
index f8292af..fbedfe7 100644
--- a/dev/phon_spectrogram/e7b86cba.svg
+++ b/dev/phon_spectrogram/29bbe87d.svg
@@ -1,45 +1,45 @@
[SVG markup changes not shown]
diff --git a/dev/phon_spectrogram/17f429f1.svg b/dev/phon_spectrogram/abadf348.svg
similarity index 99%
rename from dev/phon_spectrogram/17f429f1.svg
rename to dev/phon_spectrogram/abadf348.svg
index 33c5af1..f64fa4d 100644
--- a/dev/phon_spectrogram/17f429f1.svg
+++ b/dev/phon_spectrogram/abadf348.svg
@@ -1,45 +1,45 @@
[SVG markup changes not shown]
diff --git a/dev/phon_spectrogram/2d309846.svg b/dev/phon_spectrogram/e3672ad4.svg
similarity index 99%
rename from dev/phon_spectrogram/2d309846.svg
rename to dev/phon_spectrogram/e3672ad4.svg
index e11f6e0..6a1177b 100644
--- a/dev/phon_spectrogram/2d309846.svg
+++ b/dev/phon_spectrogram/e3672ad4.svg
@@ -1,45 +1,45 @@
[SVG markup changes not shown]
diff --git a/dev/phon_spectrogram/index.html b/dev/phon_spectrogram/index.html
index e9d975a..621cd7a 100644
--- a/dev/phon_spectrogram/index.html
+++ b/dev/phon_spectrogram/index.html
@@ -3,4 +3,4 @@
 using Plots
 s, fs = wavread("assets/iwantaspectrogram.wav")
 s = vec(s)
-phonspec(s, fs)
Example block output

A color scheme more similar to the Praat grayscale can be achieved using the col argument and the :gist_yarg color scheme. These spectrograms are created using the heatmap function from Plots.jl, so any color scheme available in the Plots package can be used, though not all of them produce legible spectrograms.

phonspec(s, fs, col=:binary)
Example block output

A narrowband style spectrogram can be plotted using the style argument:

phonspec(s, fs, style=:narrowband)
Example block output

And, the pre-emphasis can be disabled by passing in a value of 0 for the pre_emph argument. Pre-emphasis will boost the prevalence of the higher frequencies in comparison to the lower frequencies.

phonspec(s, fs, pre_emph=0)
Example block output

Function documentation

Phonetics.phonspecFunction
phonspec(s, fs; pre_emph=0.97, style=:broadband, dbr=55, kw...)

Rudimentary functionality to plot a spectrogram, with parameters familiar to phoneticians. Includes a pre-emphasis routine which helps increase the intensity of the higher frequencies in the display. Uses a Kaiser window with a parameter value of 2.

Argument structure inferred from using plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot.

Args

  • s A vector containing the samples of a sound
  • fs Sampling frequency of s in Hz
  • pre_emph The α coefficient for pre-emphasis; default value of 0.97 corresponds to a cutoff frequency of approximately 213 Hz before the 6 dB / octave increase begins
  • style Either :broadband or :narrowband; will affect the window length and window stride
  • dbr The dynamic range; all frequencies that are dbr decibels quieter than the loudest frequency will not be displayed; will specify the clim argument
  • kw... extra named parameters to pass to heatmap
source
+phonspec(s, fs)
Example block output

A color scheme more similar to the Praat grayscale can be achieved using the col argument and the :gist_yarg color scheme. These spectrograms are created using the heatmap function from Plots.jl, so any color scheme available in the Plots package can be used, though not all of them produce legible spectrograms.

phonspec(s, fs, col=:binary)
Example block output

A narrowband style spectrogram can be plotted using the style argument:

phonspec(s, fs, style=:narrowband)
Example block output

And, the pre-emphasis can be disabled by passing in a value of 0 for the pre_emph argument. Pre-emphasis will boost the prevalence of the higher frequencies in comparison to the lower frequencies.

phonspec(s, fs, pre_emph=0)
Example block output

Function documentation

Phonetics.phonspecFunction
phonspec(s, fs; pre_emph=0.97, style=:broadband, dbr=55, kw...)

Rudimentary functionality to plot a spectrogram, with parameters familiar to phoneticians. Includes a pre-emphasis routine which helps increase the intensity of the higher frequencies in the display. Uses a Kaiser window with a parameter value of 2.
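The pre-emphasis step mentioned above can be sketched as a first-order difference filter, y[n] = x[n] - α·x[n-1]. This is the textbook form and only an assumption about what phonspec does internally; preemphasize and cutoff_hz are hypothetical names, and the cutoff relation α = exp(-2πF/fs) is the convention used by Praat, assumed here for illustration.

```julia
# First-order pre-emphasis: y[n] = x[n] - α * x[n-1], with y[1] = x[1].
function preemphasize(x::AbstractVector{<:Real}, α::Real=0.97)
    y = float.(x)                 # floating-point copy; x itself is left untouched
    for n in 2:length(x)
        y[n] = x[n] - α * x[n - 1]
    end
    return y
end

# Cutoff frequency implied by α under the assumed relation α = exp(-2πF/fs).
cutoff_hz(α, fs) = -log(α) * fs / (2π)

preemphasize([1.0, 1.0, 1.0])     # a constant (0 Hz) signal is strongly attenuated
preemphasize([1.0, 2.0, 3.0], 0)  # α = 0 leaves the samples unchanged
cutoff_hz(0.97, 44100)            # close to the ~213 Hz quoted for the default
```

Under that assumed relation, the quoted cutoff matches a 44.1 kHz sampling rate; at other sampling rates the same α would imply a different cutoff.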

Argument structure inferred from using plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot.

Args

  • s A vector containing the samples of a sound
  • fs Sampling frequency of s in Hz
  • pre_emph The α coefficient for pre-emphasis; default value of 0.97 corresponds to a cutoff frequency of approximately 213 Hz before the 6 dB / octave increase begins
  • style Either :broadband or :narrowband; will affect the window length and window stride
  • dbr The dynamic range; all frequencies that are dbr decibels quieter than the loudest frequency will not be displayed; will specify the clim argument
  • kw... extra named parameters to pass to heatmap
source
diff --git a/dev/search_index.js b/dev/search_index.js index 7f1cd14..d139801 100644 --- a/dev/search_index.js +++ b/dev/search_index.js @@ -1,3 +1,3 @@ var documenterSearchIndex = {"docs": -[{"location":"lc/#Lexical-characteristics","page":"Lexical characteristics","title":"Lexical characteristics","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"There are some functions to calculate common lexical characteristics of words. These characteristics are a reflection of how a word relates to all the other words in a language, that is, how they relate to all other words in the lexicon.","category":"page"},{"location":"lc/#Phonological-neighborhood-density","page":"Lexical characteristics","title":"Phonological neighborhood density","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Phonological neighborhood density, as described by Luce & Pisoni (1998), as a concept is a set of words that sound similar to each other. Vitevitch & Luce (2016) explain that it's common to operationalize this concept as the number of words that have a Levenshtein distance (minimal number of segment additions, subtractions, or substitutions to transform one word or string into another) of exactly 1 from the word in question.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"The pnd function allows a user to calculate this value for a list of words based on a given corpus. The following example shows how to use the pnd function. 
Note that the entries in the sample corpus are given using the Arpabet transcription scheme.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\npnd(sample_corpus, [[\"K\", \"AE1\", \"T\"]])","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"As we can see, [K AA1 T] cat has 2 phonological neighbors in the given corpus, so it has a phonological neighborhood density of 2. The data is returned in a DataFrame so that processing that uses tabular data can be performed.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"A more likely scenario is calculating the phonological neighborhood density for each item in the CMU Pronouncing dictionary. For the purposes of this example, I'll assume you have already downloaded the CMU Pronouncing Dictionary. There is a bit of extra information at the top of the document that needs to be deleted, so make sure the first line in the document is the entry for \"!EXCLAMATION-POINT\".","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Now, the first thing we need to do is read the file into Julia and process it into a usable state. 
Because we're interested in the phonological transcriptions here, we'll strip away the orthographic representation.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics\ncorpus = Vector()\nopen(\"cmudict-0.7b\") do f\n lines = readlines(f)\n for line in lines\n phonological_transcription = split(split(line, \"  \")[2])\n push!(corpus, phonological_transcription)\n end\nend","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Notice that we called split twice. The first call separates the orthographic representation from the phonological one; the two are separated by two spaces. We wanted the phonological transcription, so we took the second element from the Array that results from that call to split. The second call to split was to split the phonological representation into another Array. This is necessary because the CMU Pronouncing Dictionary uses a modified version of the Arpabet transcription scheme and doesn't always use only 1 character to represent a particular phoneme. So we can't just process each individual character in a string as we might be able to do for a 1 character to 1 phoneme mapping like the International Phonetic Alphabet. Representing each phoneme as one element in an Array allows us to process the data correctly.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Now that we have the corpus set up, all we need to do is call the pnd function.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"neighborhood_density = pnd(corpus, corpus)","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"The output from pnd is a DataFrame where the queries are in the first column and the associated neighborhood densities are in the second column. 
This DataFrame can then be used in subsequent statistical analyses or saved to a file for use in another programming language or in software such as R.","category":"page"},{"location":"lc/#Implementation-note","page":"Lexical characteristics","title":"Implementation note","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"The intuitive way of coding phonological neighborhood density involves comparing every item in the corpus against every other item in the corpus and counting how many neighbors each item has. However, this is computationally inefficient, as there are approximately n^2 comparisons that must be performed. In this package, this process is sped up by using a spatial data structure called a vantage-point tree. This data structure is a binary tree where all the items on the left of a node are less than a particular distance away from the item in the node, and all those on the right are greater than or equal to that particular distance.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Because of the way that the data is organized in a vantage-point tree, fewer comparisons need to be made. While descending the tree, it can be determined whether any of the points in a branch from a particular node should be searched or not, limiting the number of branches that need to be traversed. In practical terms, this means that the Levenshtein distance is calculated fewer times for each item, and the phonological neighborhood density should be calculated faster for a data set than with the traditional approach that compares each item to all the other ones in the corpus. 
At the time of writing this document, I am not aware of any phonological neighborhood density calculator/script that offers this kind of speedup.","category":"page"},{"location":"lc/#Phonotactic-probability","page":"Lexical characteristics","title":"Phonotactic probability","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"The phonotactic probability is the likelihood of observing a sequence in a given language. It's typically calculated as either the co-occurrence probability of a series of phones or diphones, or the cumulative transitional probability of moving from one portion of the sequence to the next.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"This package currently provides the co-occurrence method of calculating the phonotactic probability, and this can be done taking the position of a phone or diphone into account, or just looking at the co-occurrence probability. By way of example:","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics # hide\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\nfreq = [1,1,1,1,1,1]\np = prod([4,4,4] / 20)\nphnprb(sample_corpus, freq, [[\"K\", \"AE1\", \"T\"]])","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"In this example, each phone in the query has 4 observations in the corpus, which contains 20 phones in total, so the likelihood of observing each of those phones is 4/20. Because there are 3 phones in the query, the phonotactic probability of this sequence is (4/20)^3, which is 0.008. 
Floating-point rounding can make the computed value differ very slightly from the exact result; this is unavoidable.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics # hide\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\nfreq = [1,1,1,1,1,1]\np = prod([3,2,3,2]/26)\nphnprb(sample_corpus, freq, [[\"K\", \"AE1\", \"T\"]]; nchar=2)","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"In this example, the input is padded so that the beginning and ending of the word are taken into account when calculating the phonotactic probability. There are 3 counts of [. K] (where [.] is the word boundary symbol), 2 counts of [K AE1], 3 counts of [AE1 T], and 2 counts of [T .]. There are 26 total diphones observed in the corpus, so the phonotactic probability is calculated as","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"(3/26) × (2/26) × (3/26) × (2/26)","category":"page"},{"location":"lc/#Uniqueness-point","page":"Lexical characteristics","title":"Uniqueness point","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"The uniqueness point of a word is defined as the segment in a sequence after which that sequence can be uniquely identified. In cohort models of speech perception, it is after this point that a listener will recognize a word while it's being spoken. 
As an example:","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\nupt(sample_corpus, [[\"K\", \"AA1\", \"B\"]]; inCorpus=true)","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Here, [K AA1 B] cob has a uniqueness point of 2. Looking at the corpus, we can be sure we're looking at cob after observing the [AA1] because nothing else begins with the sequence [K AA1].","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\nupt(sample_corpus, [[\"K\", \"AE1\", \"D\"]]; inCorpus=false)","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"As is evident, given this sample corpus, [K AE1 D] cad is unique after the 3rd segment. 
That is, it can be uniquely identified after hearing the [D].","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\nupt(sample_corpus, [[\"T\", \"AE1\", \"T\"]]; inCorpus=false)","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Here, [T AE1 T] tat cannot be uniquely identified until after the sequence is complete, so its uniqueness point is one longer than its length.","category":"page"},{"location":"lc/#Function-documentation","page":"Lexical characteristics","title":"Function documentation","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"pnd(corpus::Array, queries::Array; [progress=true])","category":"page"},{"location":"lc/#Phonetics.pnd-Tuple{Array, Array}","page":"Lexical characteristics","title":"Phonetics.pnd","text":"pnd(corpus::Array, queries::Array; [progress=true])\n\nCalculate the phonological neighborhood density (pnd) for each item in queries based on the items in corpus. This function uses a vantage point tree data structure to speed up the search for neighbors by pruning the search space. 
This function should work regardless of whether the items in queries are in corpus or not.\n\nParameters\n\ncorpus The corpus to be queried for phonological neighbors\nqueries The items to query phonological neighbors for in corpus\nprogress Whether to display a progress meter or not\n\nReturns\n\nA DataFrame with the queries in the first column and the phonological neighborhood density in the second\n\n\n\n\n\n","category":"method"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"phnprb(corpus::Array, frequencies::Array{Int64}, queries::Array;\n [positional=false, nchar=1, pad=true])","category":"page"},{"location":"lc/#Phonetics.phnprb-Tuple{Array, Array{Int64, N} where N, Array}","page":"Lexical characteristics","title":"Phonetics.phnprb","text":"phnprb(corpus::Array, frequencies::Array, queries::Array; positional=false,\n nchar=1, pad=true)\n\nCalculates the phonotactic probability for each item in a list of queries based on a corpus\n\nArguments\n\ncorpus The corpus on which to base the probability calculations\nfrequencies The frequencies associated with each element in corpus\nqueries The items for which the probability should be calculated\n\nKeyword arguments\n\npositional Whether to consider where in the query a given phone appears\n\n(e.g., should \"K\" as the first sound be considered a different category than \"K\" as the second sound?)\n\nnchar The number of characters for each n-gram that will be examined (e.g., 2 for diphones)\npad Whether to add padding to each query or not\n\nReturns\n\nA DataFrame with the queries in the first column and the probability values in the second\n\n\n\n\n\n","category":"method"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"upt(corpus::Array, queries::Array; [inCorpus=true])","category":"page"},{"location":"lc/#Phonetics.upt-Tuple{Array, Array}","page":"Lexical characteristics","title":"Phonetics.upt","text":"upt(corpus, 
queries; [inCorpus=true])\n\nCalculates the phonological uniqueness point (upt) for the items in queries based on the items in corpus. If the items are expected to be in the corpus, this function will calculate the uniqueness point to be when a branch can be considered to only represent 1 word. If the items are not expected to be in the corpus, the uniqueness point will be taken to be the depth at which the tree can no longer be traversed.\n\nParameters\n\ncorpus The items comprising the corpus to compare against when calculating the uniqueness point of each query\nqueries The items for which to calculate the uniqueness point\ninCorpus Whether the query items are expected to be in the corpus or not\n\nReturns\n\nA DataFrame with the queries in the first column and the uniqueness points in the second\n\n\n\n\n\n","category":"method"},{"location":"lc/#References","page":"Lexical characteristics","title":"References","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and hearing, 19(1), 1.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Vitevitch, M. S., & Luce, P. A. (2016). Phonological neighborhood effects in spoken word perception and production. Annual Review of Linguistics, 2, 75-94.","category":"page"},{"location":"phon_spectrogram/#Spectrograms","page":"Spectrograms","title":"Spectrograms","text":"","category":"section"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"A basic function is provided to plot spectrograms that look familiar to phoneticians. It makes use of the spectrogram function from DSP.jl to perform the short-time Fourier analysis. The plot specification is given using RecipesBase.jl to avoid depending on Plots.jl. 
It is necessary to specify using Plots before spectrograms can be plotted.","category":"page"},{"location":"phon_spectrogram/#Examples","page":"Spectrograms","title":"Examples","text":"","category":"section"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"A standard broadband spectrogram can be created without using optional parameters.","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"using Phonetics # hide\nusing WAV\nusing Plots\ns, fs = wavread(\"assets/iwantaspectrogram.wav\")\ns = vec(s)\nphonspec(s, fs)","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"A color scheme more similar to the Praat grayscale can be achieved using the col argument and the :gist_yarg color scheme. These spectrograms are created using the heatmap function from Plots.jl, so any color scheme available in the Plots package can be used, though not all of them produce legible spectrograms.","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"using Phonetics # hide\nusing WAV # hide\ns, fs = wavread(\"assets/iwantaspectrogram.wav\") # hide\ns = vec(s) # hide\nusing Plots # hide\nphonspec(s, fs, col=:binary)","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"A narrowband style spectrogram can be plotted using the style argument:","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"using Phonetics # hide\nusing WAV # hide\ns, fs = wavread(\"assets/iwantaspectrogram.wav\") # hide\ns = vec(s) # hide\nusing Plots # hide\nphonspec(s, fs, style=:narrowband)","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"And, the pre-emphasis can be disabled by passing in a value of 0 for the pre_emph argument. 
Pre-emphasis boosts the higher frequencies relative to the lower frequencies.","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"using Phonetics # hide\nusing WAV # hide\nusing Plots # hide\ns, fs = wavread(\"assets/iwantaspectrogram.wav\") # hide\ns = vec(s) # hide\nphonspec(s, fs, pre_emph=0)","category":"page"},{"location":"phon_spectrogram/#Function-documentation","page":"Spectrograms","title":"Function documentation","text":"","category":"section"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"phonspec","category":"page"},{"location":"phon_spectrogram/#Phonetics.phonspec","page":"Spectrograms","title":"Phonetics.phonspec","text":"phonspec(s, fs; pre_emph=0.97, style=:broadband, dbr=55, kw...)\n\nRudimentary functionality to plot a spectrogram, with parameters familiar to phoneticians. Includes a pre-emphasis routine which helps increase the intensity of the higher frequencies in the display. Uses a Kaiser window with a parameter value of 2.\n\nArgument structure inferred from using plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot.\n\nArgs\n\ns A vector containing the samples of a sound\nfs Sampling frequency of s in Hz\npre_emph The α coefficient for pre-emphasis; default value of 0.97 corresponds to a cutoff frequency of approximately 213 Hz before the 6 dB / octave increase begins\nstyle Either :broadband or :narrowband; will affect the window length and window stride\ndbr The dynamic range; all frequencies that are dbr decibels quieter than the loudest frequency will not be displayed; will specify the clim argument\nkw... 
extra named parameters to pass to heatmap\n\n\n\n\n\n","category":"function"},{"location":"vowelplot/#Vowel-plotting","page":"Vowel plotting","title":"Vowel plotting","text":"","category":"section"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"The function provided for plotting vowels offers a variety of visualization techniques for displaying a two-dimensional plot of vowel tokens. Traditionally, it is F1 and F2 that are plotted, but any pair of measures can be plotted, such as F2 and F3, F2-F1 and F3, etc. A traditional, vanilla vowel plot only requires three positional arguments, f1, f2, and cats. The plot specification is given using RecipesBase.jl to avoid depending on Plots.jl. It is necessary to specify using Plots before vowel plots can be created.","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"using Phonetics # hide\nusing Plots\ndata = generateFormants(30, gender=[\"w\"], seed=56) # hide\nvowelplot(data.f1, data.f2, data.vowel, xlab=\"F1 (Hz)\", ylab=\"F2 (Hz)\")\nsavefig(\"vanilla_vowel_plot.svg\") # hide\nnothing # hide","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"(Image: Vanilla vowel plot)","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"This is a traditional vowel plot, with F1 on the x-axis in increasing order and F2 on the y-axis in increasing order. Note that simulated data were generated using the generateFormants function. Specifying a seed value makes the results reproducible. 
(Keep in mind that if you are generating values for different experiments, reports, studies, etc., the seed value needs to be changed (or left unspecified) so that the same data are not generated every time.)","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"For those inclined to use the alternate axes configuration with F2 decreasing on the x-axis and F1 decreasing on the y-axis, the xflip and yflip arguments that the Plots.jl package makes use of can be passed in to force the axes to be decreasing, the F2 values can be passed into the first argument slot, and the F1 values can be passed into the second argument slot.","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"using Phonetics # hide\nusing Plots # hide\ndata = generateFormants(30, gender=[\"w\"], seed=56) # hide\nvowelplot(data.f2, data.f1, data.vowel,\n xflip=true, yflip=true, xlab=\"F2 (Hz)\", ylab=\"F1 (Hz)\")\nsavefig(\"alt_axes_vowel_plot.svg\") # hide\nnothing # hide","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"(Image: Vowel plot with alternate axes)","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"I don't personally prefer to look at vowel plots in this manner because I think it unfairly privileges articulatory characteristics of vowel production when examining acoustic characteristics, so subsequent examples will not be presented using this axis configuration. However, the same principle applies to switching the axes around.","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"The vowelplot function also allows for ellipses to be plotted around the values with the ell and ellPercent arguments. The ell argument takes a true or false value. 
The ellPercent argument should be a value greater than 0 and less than 1, and it represents the approximate proportion of the data that should be contained within the ellipse. This is in contrast to some packages available in R that allow you to specify the number of standard deviations that the ellipse should be stretched to. The reason is that the traditional cutoff values of 1 standard deviation for 68%, 2 standard deviations for 95%, etc. for univariate Gaussian distributions do not carry over to multiple dimensions. Instead, the appropriate amount of stretching of the ellipse can be determined from the percentage of data to contain (Wang et al., 2015).","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"using Phonetics # hide\nusing Plots # hide\ndata = generateFormants(30, gender=[\"w\"], seed=56) # hide\nvowelplot(data.f1, data.f2, data.vowel, ell=true, ellPercent=0.67,\n xlab=\"F1 (Hz)\", ylab=\"F2 (Hz)\")\nsavefig(\"ellipse_vowel_plot.svg\") # hide\nnothing # hide","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"(Image: Vowel plot with ellipses)","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"Each of the data clouds in the scatter has an ellipse overlaid on it so as to contain approximately 67% of the data. The ellipse calculation process is given in Friendly et al. (2013).","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"One final feature to point out is that the vowelplot function can also plot just the mean value of each vowel category with the meansOnly argument. 
Additionally, a label can be added to each category with the addLabels argument, which bases the labels on the category given in the cats argument.","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"using Phonetics # hide\nusing Plots # hide\ndata = generateFormants(30, gender=[\"w\"], seed=56) # hide\nvowelplot(data.f1, data.f2, data.vowel, ell=true,\n meansOnly=true, addLabels=true, xlab=\"F1 (Hz)\", ylab=\"F2 (Hz)\")\nsavefig(\"means_only_ellipse_vowel_plot.svg\") # hide\nnothing # hide","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"(Image: Vowel plot with ellipses and markers only for mean values)","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"The labels are offset from the mean value a bit so as to not cover up the marker showing where the mean value is.","category":"page"},{"location":"vowelplot/#Function-documentation","page":"Vowel plotting","title":"Function documentation","text":"","category":"section"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"vowelplot","category":"page"},{"location":"vowelplot/#Phonetics.vowelplot","page":"Vowel plotting","title":"Phonetics.vowelplot","text":"vowelplot(f1, f2, cats; meansOnly=false, addLabels=true, ell=false, ellPercent=0.67, nEllPts=500, kw...)\n\nCreate an F1-by-F2 vowel plot. The f1 values are displayed along the x-axis, and the f2 values are displayed along the y-axis, with each unique vowel class in cats being represented with a new color. The series labels in the legend will take on the unique values contained in cats. 
The alternate display whereby reversed F2 is on the x-axis and reversed F1 is on the y-axis can be created by passing the F2 values in for the f1 argument and F1 values in for the f2 argument, and then using the :flip magic argument provided by the Plots package.\n\nIf meansOnly is set to true, only the mean values for each vowel category are plotted. Using ell=true will plot a data ellipse that approximately encompasses the percentage of data specified by ellPercent. The ellipse is represented by a number of points specified with nEllPts. Other arguments to plot are passed in through the splatted kw argument. Setting the addLabels argument to true will add the text label of the vowel category above and to the right of the mean.\n\nArgument structure inferred from using plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot. Plot parameters markersize defaults to 3 and linewidth defaults to 3.\n\nArgs\n\nf1 The F1 values, or otherwise the values to plot on the x-axis\nf2 The F2 values, or otherwise the values to plot on the y-axis\ncats The vowel categories associated with each F1, F2 pair\nmeansOnly Plot only mean value for each category\naddLabels Add labels for each category to the plot near the mean\nell Whether to add data ellipses to the plot\nellPercent Percentage of the data distribution the ellipse should cover (approximately)\nnEllPts How many points should be used when plotting the ellipse\n\n\n\n\n\n","category":"function"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"ellipsePts","category":"page"},{"location":"vowelplot/#Phonetics.ellipsePts","page":"Vowel plotting","title":"Phonetics.ellipsePts","text":"ellipsePts(f1, f2; percent=0.95, nPoints=500)\n\nCalculates nPoints points on the perimeter of a data ellipse for f1 and f2 such that approximately the proportion of the data specified by percent is contained within the ellipse. 
Points are returned in counter-clockwise order as the polar angle of rotation moves from 0 to 2π.\n\nSee Friendly, Monette, and Fox (2013, Elliptical insights: Understanding statistical methods through elliptical geometry, Statistical science 28(1), 1-39) for more information on the calculation process.\n\nArgs\n\nf1 The F1 values or otherwise x-axis values\nf2 The F2 values or otherwise y-axis values\npercent (keyword) Percent of the data distribution the ellipse should approximately cover\nnPoints (keyword) How many points to use when drawing the ellipse\n\n\n\n\n\n","category":"function"},{"location":"vowelplot/#References","page":"Vowel plotting","title":"References","text":"","category":"section"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"Friendly, M., Monette, G., & Fox, J. (2013). Elliptical insights: understanding statistical methods through elliptical geometry. Statistical Science, 28(1), 1-39.","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"Wang, B., Shi, W., & Miao, Z. (2015). Confidence analysis of standard deviational ellipse and its extension into higher dimensional Euclidean space. PLOS ONE, 10(3), e0118537. https://doi.org/10.1371/journal.pone.0118537","category":"page"},{"location":"acd/#Acoustic-distance","page":"Acoustic distance","title":"Acoustic distance","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Recent work has used dynamic time warping on sequences of Mel frequency cepstral coefficient (MFCC) vectors to compute a form of acoustic distance (Mielke, 2012; Kelley, 2018; Kelley & Tucker, 2018; Bartelds et al., 2020). There are a number of convenience functions provided in this package. For the most part, they wrap the DynamicAxisWarping.jl and MFCC.jl packages. 
See also the Phonological CorpusTools page on acoustic similarity.","category":"page"},{"location":"acd/#Computing-acoustic-distance","page":"Acoustic distance","title":"Computing acoustic distance","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Let's start by creating some sample sounds to work with. You could also load your own sounds from file using the Sound constructor that takes a filename.","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"using Phonetics # hide\nusing Random\nrng = MersenneTwister(9)\nx = rand(rng, 1, 1000)\ny = rand(rng, 1, 3000)\nacdist(x, y)","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"The output value is the result of performing dynamic time warping on x and y. If x and y are Sound objects (in this example, they are not), they will first be converted to MFCC vectors with the sound2mfcc function. 
This value has been found to situate phonological similarity in terms of acoustics (Mielke, 2012), reflect aspects of the activation/competition process during spoken word recognition (Kelley, 2018; Kelley & Tucker, 2018), and judgments of nativelike pronunciation (Bartelds et al., 2020).","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"As an implementation note, the distance metric used to compare the MFCC vectors is the squared Euclidean distance between two vectors.","category":"page"},{"location":"acd/#Sequence-averaging","page":"Acoustic distance","title":"Sequence averaging","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Kelley & Tucker (2018) also used the dynamic barycenter averaging (Petitjean et al., 2011) technique to create \"average\" acoustic representations of English words, in an attempt to better model the kind of acoustic representation a listener may be accessing when hearing a word (given that a listener has heard most words more than just once). The interface for calculating the average sequence is with the avgseq function.","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"using Phonetics # hide\nusing Random\nrng = MersenneTwister(9)\nx = rand(rng, 1000)\ny = rand(rng, 3000)\nz = rand(rng, 10000)\na = [Sound(x, 8000), Sound(y, 8000), Sound(z, 8000)]\navgseq(a)","category":"page"},{"location":"acd/#Acoustic-distinctiveness","page":"Acoustic distance","title":"Acoustic distinctiveness","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Kelley (2018) and Kelley & Tucker (2018) introduced the concept of acoustic distinctiveness. It is how far away a word is, on average, from all the other words in a language. 
The distinctiveness function performs this calculation.","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"using Phonetics # hide\nusing Random\nrng = MersenneTwister(9)\nx = rand(rng, 1000)\ny = rand(rng, 3000)\nz = rand(rng, 10000)\na = [Sound(x, 8000), Sound(y, 8000), Sound(z, 8000)]\ndistinctiveness(a[1], a[2:3])","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"The number is effectively an index of how acoustically unique a word is in a language.","category":"page"},{"location":"acd/#Function-documentation","page":"Acoustic distance","title":"Function documentation","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"acdist(s1, s2; [method=:dtw, dist=SqEuclidean(), radius=10])","category":"page"},{"location":"acd/#Phonetics.acdist-Tuple{Any, Any}","page":"Acoustic distance","title":"Phonetics.acdist","text":"acdist(s1, s2; [method=:dtw, dist=SqEuclidean(), radius=10])\n\nCalculate the acoustic distance between s1 and s2 with method version of dynamic time warping and dist as the interior distance function. Using method=:dtw uses vanilla dynamic time warping, while method=:fastdtw uses the fast dtw approximation. 
Note that this is not a true mathematical distance metric because dynamic time warping does not necessarily satisfy the triangle inequality, nor does it guarantee the identity of indiscernibles.\n\nArgs\n\ns1 Features-by-time array of first sound to compare\ns2 Features-by-time array of second sound to compare\nmethod (keyword) Which method of dynamic time warping to use\ndist (keyword) Any distance function implementing the SemiMetric interface from the Distances package\ndtwradius (keyword) maximum warping radius for vanilla dynamic timew warping; if no value passed, no warping constraint is used argument unused when method=:fastdtw\nfastradius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw\n\n\n\n\n\n","category":"method"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"acdist(s1::Sound, s2::Sound, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10])","category":"page"},{"location":"acd/#Phonetics.acdist","page":"Acoustic distance","title":"Phonetics.acdist","text":"acdist(s1::Sound, s2::Sound, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10])\n\nConvert s1 and s2 to a frequency representation specified by rep, then calculate acoustic distance between s1 and s2. 
Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.\n\n\n\n\n\n","category":"function"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"avgseq(S; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])","category":"page"},{"location":"acd/#Phonetics.avgseq-Tuple{Any}","page":"Acoustic distance","title":"Phonetics.avgseq","text":"avgseq(S; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])\n\nReturn a sequence representing the average of the sequences in S using the dba method for sequence averaging. Supports method=:dtw for vanilla dtw and method=:fastdtw for fast dtw approximation when performing the sequence comparisons. With center=:medoid, finds the medoid as the sequence to use as the initial center, and with center=:rand selects a random element in S as the initial center.\n\nArgs\n\nS An array of sequences to average\nmethod (keyword) The method of dynamic time warping to use\ndist (keyword) Any distance function implementing the SemiMetric interface from the Distances package\nradius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw\ncenter (keyword) The method used to select the initial center of the sequences in S\ndtwradius (keyword) How far a time step can be mapped when comparing sequences; passed directly to DTW function from DynamicAxisWarping; if set to nothing, the length of the longest sequence will be used, effectively removing the radius restriction\nprogress Whether to show the progress coming from dba\n\n\n\n\n\n","category":"method"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":" avgseq(S::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, 
center=:medoid, dtwradius=nothing, progress=false])","category":"page"},{"location":"acd/#Phonetics.avgseq","page":"Acoustic distance","title":"Phonetics.avgseq","text":"avgseq(S::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])\n\nConvert the Sound objects in S to a representation designated by rep, then find the average sequence of them. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.\n\n\n\n\n\n","category":"function"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"distinctiveness(s, corpus; [method=:dtw, radius=10, reduction=mean])","category":"page"},{"location":"acd/#Phonetics.distinctiveness-Tuple{Any, Any}","page":"Acoustic distance","title":"Phonetics.distinctiveness","text":"distinctiveness(s, corpus; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])\n\nCalculates the acoustic distinctiveness of s given the corpus corpus. The method, dist, and radius arguments are passed into acdist. The reduction argument can be any function that reduces an iterable to one number, such as mean, sum, or median. 
\n\nFor more information, see Kelley (2018, September, How acoustic distinctiveness affects spoken word recognition: A pilot study, DOI: 10.7939/R39G5GV9Q) and Kelley & Tucker (2018, Using acoustic distance to quantify lexical competition, DOI: 10.7939/r3-wbhs-kr84).\n\n\n\n\n\n","category":"method"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"distinctiveness(s::Sound, corpus::Array{Sound}, rep=:mfcc; [method=:dtw, radius=10, reduction=mean])","category":"page"},{"location":"acd/#Phonetics.distinctiveness","page":"Acoustic distance","title":"Phonetics.distinctiveness","text":"distinctiveness(s::Sound, corpus::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])\n\nConverts s and corpus to a representation specified by rep, then calculates the acoustic distinctiveness of s given corpus. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.\n\n\n\n\n\n","category":"function"},{"location":"acd/#References","page":"Acoustic distance","title":"References","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Bartelds, M., Richter, C., Liberman, M., & Wieling, M. (2020). A new acoustic-based pronunciation distance measure. Frontiers in Artificial Intelligence, 3, 39.","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Mielke, J. (2012). A phonetically based metric of sound similarity. Lingua, 122(2), 145-163.","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Kelley, M. C. (2018). How acoustic distinctiveness affects spoken word recognition: A pilot study. Presented at the 11th International Conference on the Mental Lexicon (Edmonton, AB). 
https://doi.org/10.7939/R39G5GV9Q","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Kelley, M. C., & Tucker, B. V. (2018). Using acoustic distance to quantify lexical competition. University of Alberta ERA (Education and Research Archive). https://doi.org/10.7939/r3-wbhs-kr84","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Petitjean, F., Ketterlin, A., & Gançarski, P. (2011). A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition, 44(3), 678–693.","category":"page"},{"location":"#Home","page":"Home","title":"Home","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This is a Julia package that provides a collection of functions to analyze phonetic data.","category":"page"},{"location":"norm/#Vowel-normalization","page":"Vowel normalization","title":"Vowel normalization","text":"","category":"section"},{"location":"norm/","page":"Vowel normalization","title":"Vowel normalization","text":"Vowel normalization routines come from a variety of sources. In the case of the Nearey normalizations, they are intended to show how a listener takes formant information (which varies not only based on vowel category, but also on factors like gender and age) and transforms them to a space where formant information is more related to vowel category. In that sense, it is a perceptually motivated routine. 
Other routines, like the Lobanov normalization routine (Lobanov, 1971), are more purpose driven in that their goal is just to allow vowel comparisons between speakers, regardless of whether the technique is perceptually motivated or plausible.","category":"page"},{"location":"norm/#Nearey-normalization-routines","page":"Vowel normalization","title":"Nearey normalization routines","text":"","category":"section"},{"location":"norm/#Barreda-and-Nearey-normalization-routine","page":"Vowel normalization","title":"Barreda and Nearey normalization routine","text":"","category":"section"},{"location":"norm/#Lobanov-normalization-routine","page":"Vowel normalization","title":"Lobanov normalization routine","text":"","category":"section"},{"location":"norm/#References","page":"Vowel normalization","title":"References","text":"","category":"section"},{"location":"norm/","page":"Vowel normalization","title":"Vowel normalization","text":"Barreda, S., & Nearey, T. M. (2018). A regression approach to vowel normalization for missing and unbalanced data. The Journal of the Acoustical Society of America, 144(1), 500–520. https://doi.org/10.1121/1.5047742","category":"page"},{"location":"norm/","page":"Vowel normalization","title":"Vowel normalization","text":"Lobanov, B. M. (1971). Classification of Russian vowels spoken by different speakers. The Journal of the Acoustical Society of America, 49(2B), 606–608. https://doi.org/10.1121/1.1912396","category":"page"},{"location":"norm/","page":"Vowel normalization","title":"Vowel normalization","text":"Nearey, T. M. (1978). Phonetic feature system for vowels. 
Indiania University Linguistics Club.","category":"page"},{"location":"textvptree/#Text-VP-Tree","page":"Text VP Tree","title":"Text VP Tree","text":"","category":"section"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"A vantage-point tree is a data structure that takes advantage of the spatial distribution of data and lets allows for faster searching through the data by lowering the amount of comparisons that need to be made. Consider the traditional example of phonological neighborhood density calculation. The code would be written to compare each item to all the other items. For n items, there would be n-1 comparisons. So, to calculate the phonological neighborhood density for each item in a given corpus, there would need to be n times (n-1) = n^2-n comparisons. This is a lot of comparisons, especially when you're working with tens or hundreds of thousands of words!","category":"page"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"With a vantage-point tree, however, we might get an average of only needing log_2(n) comparisons per query because of the way the data are organized. This means we would only need n times log_2(n) comparisons in total, which can be substantially lower than n^2-n for larger corpora. 
Though, analyzing the runtime of a VP tree is difficult, so the actual speedup may not be as drastic, but it should still be faster than the naive phonological neighborhood density calculation.","category":"page"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"This impelentation is based on the description by Samet (2006).","category":"page"},{"location":"textvptree/#Function-documentation","page":"Text VP Tree","title":"Function documentation","text":"","category":"section"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"TextVPTree(items::Array, d::Function)","category":"page"},{"location":"textvptree/#Phonetics.TextVPTree-Tuple{Array, Function}","page":"Text VP Tree","title":"Phonetics.TextVPTree","text":"TextVPTree(items::Array, d)\n\nOuter constructor for a TextVPTree. Takes in an array of items items and a distance function d and proceeds to build a vantage-point tree from them.\n\n\n\n\n\n","category":"method"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"radiusSearch(tree::TextVPTree, query, epsilon)","category":"page"},{"location":"textvptree/#Phonetics.radiusSearch-Tuple{TextVPTree, Any, Any}","page":"Text VP Tree","title":"Phonetics.radiusSearch","text":"radiusSearch(tree::TextVPTree, query, epsilon)\n\nPerforms a search for all items in a VP tree tree that are within a radius epsilon from a query query.\n\nReturns\n\nA Vector of items that are within the given radius epsilon\n\n\n\n\n\n","category":"method"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"nneighbors(tree::TextVPTree, query, n)","category":"page"},{"location":"textvptree/#Phonetics.nneighbors-Tuple{TextVPTree, Any, Any}","page":"Text VP Tree","title":"Phonetics.nneighbors","text":"nneighbors(tree::TextVPTree, query, n)\n\nFind the n nearest neighbors in a VP tree tree to a given query query.\n\nReturns\n\nA PriorityQueue of items where the keys are the items 
themselves and the values are the distances from the items to query; the PriorityQueue is defined such that small values have higher priorities than large ones\n\n\n\n\n\n","category":"method"},{"location":"textvptree/#References","page":"Text VP Tree","title":"References","text":"","category":"section"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"Samet, H. (2006). Foundations of multidimensional and metric data structures. San Francisco, California: Morgan Kaufmann.","category":"page"}] +[{"location":"lc/#Lexical-characteristics","page":"Lexical characteristics","title":"Lexical characteristics","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"There are some functions to calculate common lexical characteristics of words. These characteristics are a reflection of how a word relates to all the other words in a language, that is, how they relate to all other words in the lexicon.","category":"page"},{"location":"lc/#Phonological-neighborhood-density","page":"Lexical characteristics","title":"Phonological neighborhood density","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Phonological neighborhood density, as described by Luce & Pisoni (1998), as a concept is a set of words that sound similar to each other. Vitevitch & Luce (2016) explain that it's common to operationalize this concept as the number of words that have a Levenshtein distance (minimal number of segment additions, subtractions, or substitutions to transform one word or string into another) of exactly 1 from the word in question.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"The pnd function allows a user to calculate this value for a list of words based on a given corpus. The following example shows how to use the pnd function. 
Note that the entries in the sample corpus are given using the Arpabet transcription scheme.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\npnd(sample_corpus, [[\"K\", \"AE1\", \"T\"]])","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"As we can see, [K AE1 T] cat has 2 phonological neighbors in the given corpus (bat and cab), so it has a phonological neighborhood density of 2. The data is returned in a DataFrame so that it can be passed directly to code that works with tabular data.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"A more likely scenario is calculating the phonological neighborhood density for each item in the CMU Pronouncing Dictionary. For the purposes of this example, I'll assume you have already downloaded the CMU Pronouncing Dictionary. There is a bit of extra information at the top of the document that needs to be deleted, so make sure the first line in the document is the entry for \"!EXCLAMATION-POINT\".","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Now, the first thing we need to do is read the file into Julia and process it into a usable state. 
Because we're interested in the phonological transcriptions here, we'll strip away the orthographic representation.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics\ncorpus = Vector()\nopen(\"cmudict-0.7b\") do f\n lines = readlines(f)\n for line in lines\n phonological_transcription = split(split(line, \" \")[2])\n push!(corpus, phonological_transcription)\n end\nend","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Notice that we called split twice. The first call separates the orthographic representation from the phonological one; the two are separated by two spaces. We wanted the phonological transcription, so we took the second element from the Array that results from that call to split. The second call to split breaks the phonological representation into another Array. This is necessary because the CMU Pronouncing Dictionary uses a modified version of the Arpabet transcription scheme and doesn't always use only 1 character to represent a particular phoneme. So we can't just process each individual character in a string as we might be able to do for a 1 character to 1 phoneme mapping like the International Phonetic Alphabet. Representing each phoneme as one element in an Array allows us to process the data correctly.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Now that we have the corpus set up, all we need to do is call the pnd function.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"neighborhood_density = pnd(corpus, corpus)","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"The output from pnd is a DataFrame where the queries are in the first column and the associated neighborhood densities are in the second column. 
This DataFrame can then be used in subsequent statistical analyses or saved to a file for use in another programming language or in software like R.","category":"page"},{"location":"lc/#Implementation-note","page":"Lexical characteristics","title":"Implementation note","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"The intuitive way of coding phonological neighborhood density involves comparing every item in the corpus against every other item in the corpus and counting how many neighbors each item has. However, this is computationally inefficient, as there are approximately n^2 comparisons that must be performed. In this package, this process is sped up by using a spatial data structure called a vantage-point tree. This data structure is a binary branching tree where all the items on the left of a node are less than a particular distance away from the item in the node, and all those on the right are greater than or equal to that particular distance.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Because of the way that the data is organized in a vantage-point tree, fewer comparisons need to be made. While descending the tree, it can be determined whether any of the points in a branch from a particular node should be searched or not, limiting the number of branches that need to be traversed. In practical terms, this means that the Levenshtein distance is calculated fewer times for each item, and the phonological neighborhood density should be calculated faster for a data set than with the traditional approach that compares each item to all the other ones in the corpus. 
At the time of writing this document, I am not aware of any phonological neighborhood density calculator/script that offers this kind of speedup.","category":"page"},{"location":"lc/#Phonotactic-probability","page":"Lexical characteristics","title":"Phonotactic probability","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"The phonotactic probability is the likelihood of observing a sequence in a given language. It's typically calculated as either the co-occurrence probability of a series of phones or diphones, or the cumulative transitional probability of moving from one portion of the sequence to the next.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"This package currently provides the co-occurrence method of calculating the phonotactic probability, and this can be done taking the position of a phone or diphone into account, or just looking at the co-occurrence probability. By way of example:","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics # hide\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\nfreq = [1,1,1,1,1,1]\np = prod([4,4,4] / 20)\nphnprb(sample_corpus, freq, [[\"K\", \"AE1\", \"T\"]])","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"In this example, each phone has 4 observations in the corpus, and the likelihood of observing each of those phones is 4/20. Because there are 3 phones in the query, the phonotactic probability of this sequence is (4/20)^3, which is 0.008. 
Floating point errors sometimes occur in the arithmetic in programming, but this is unavoidable.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics # hide\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\nfreq = [1,1,1,1,1,1]\np = prod([3,2,3,2]/26)\nphnprb(sample_corpus, freq, [[\"K\", \"AE1\", \"T\"]]; nchar=2)","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"In this example here, the input is padded so that the beginning and ending of the word are taken into account when calculating the phonotactic probability. There are 3 counts of [. K] (where [.] is the word boundary symbol), 2 counts of [K AE1], 3 counts of [AE1 T], and 2 counts of [T .]. There are 26 total diphones observed in the corpus, so the phonotactic probability is calculated as","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"frac326 times frac226 times frac326 times frac226 ","category":"page"},{"location":"lc/#Uniqueness-point","page":"Lexical characteristics","title":"Uniqueness point","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"The uniqueness point of a word is defined as the segment in a sequence after which that sequence can be uniquely identified. In cohort models of speech perception, it is after this point that a listener will recognize a word while it's being spoken. 
As an example:","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\nupt(sample_corpus, [[\"K\", \"AA1\", \"B\"]]; inCorpus=true)","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Here, [K AA1 B] cob has a uniqueness point of 2. Looking at the corpus, we can be sure we're looking at cob after observing the [AA1] because nothing else begins with the sequence [K AA1]. Thus, its uniqueness point is 2.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\nupt(sample_corpus, [[\"K\", \"AE1\", \"D\"]]; inCorpus=false)","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"As is evident, given this sample corpus, [K AE1 D] cad is unique after the 3rd segment. 
That is, it can be uniquely identified after hearing the [D].","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"using Phonetics\nsample_corpus = [\n[\"K\", \"AE1\", \"T\"], # cat\n[\"K\", \"AA1\", \"B\"], # cob\n[\"B\", \"AE1\", \"T\"], # bat\n[\"T\", \"AE1\", \"T\", \"S\"], # tats\n[\"M\", \"AA1\", \"R\", \"K\"], # mark\n[\"K\", \"AE1\", \"B\"], # cab\n]\nupt(sample_corpus, [[\"T\", \"AE1\", \"T\"]]; inCorpus=false)","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Here, [T AE1 T] tat cannot be uniquely identified until after the sequence is complete, so its uniqueness point is one longer than its length.","category":"page"},{"location":"lc/#Function-documentation","page":"Lexical characteristics","title":"Function documentation","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"pnd(corpus::Array, queries::Array; [progress=true])","category":"page"},{"location":"lc/#Phonetics.pnd-Tuple{Array, Array}","page":"Lexical characteristics","title":"Phonetics.pnd","text":"pnd(corpus::Array, queries::Array; [progress=true])\n\nCalculate the phonological neighborhood density (pnd) for each item in queries based on the items in corpus. This function uses a vantage point tree data structure to speed up the search for neighbors by pruning the search space. 
This function should work regardless of whether the items in queries are in corpus or not.\n\nParameters\n\ncorpus The corpus to be queried for phonological neighbors\nqueries The items to query phonological neighbors for in corpus\nprogress Whether to display a progress meter or not\n\nReturns\n\nA DataFrame with the queries in the first column and the phonological neighborhood density in the second\n\n\n\n\n\n","category":"method"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"phnprb(corpus::Array, frequencies::Array{Int64}, queries::Array;\n [positional=false, nchar=1, pad=true])","category":"page"},{"location":"lc/#Phonetics.phnprb-Tuple{Array, Array{Int64}, Array}","page":"Lexical characteristics","title":"Phonetics.phnprb","text":"phnprb(corpus::Array, frequencies::Array, queries::Array; positional=false,\n nchar=1, pad=true)\n\nCalculates the phonotactic probability for each item in a list of queries based on a corpus\n\nArguments\n\ncorpus The corpus on which to base the probability calculations\nfrequencies The frequencies associated with each element in corpus\nqueries The items for which the probability should be calculated\n\nKeyword arguments\n\npositional Whether to consider where in the query a given phone appears\n\n(e.g., should \"K\" as the first sound be considered a different category than \"K\" as the second sound?)\n\nnchar The number of characters for each n-gram that will be examined (e.g., 2 for diphones)\npad Whether to add padding to each query or not\n\nReturns\n\nA DataFrame with the queries in the first column and the probability values in the second\n\n\n\n\n\n","category":"method"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"upt(corpus::Array, queries::Array; [inCorpus=true])","category":"page"},{"location":"lc/#Phonetics.upt-Tuple{Array, Array}","page":"Lexical characteristics","title":"Phonetics.upt","text":"upt(corpus, queries; 
[inCorpus=true])\n\nCalculates the phonological uniqueness point (upt) of the items in queries based on the items in corpus. If the items are expected to be in the corpus, this function will calculate the uniqueness point to be when a branch can be considered to only represent 1 word. If the items are not expected to be in the corpus, the uniqueness point will be taken to be the depth at which the tree can no longer be traversed.\n\nParameters\n\ncorpus The items comprising the corpus to compare against when calculating the uniqueness point of each query\nqueries The items for which to calculate the uniqueness point\ninCorpus Whether the query items are expected to be in the corpus or not\n\nReturns\n\nA DataFrame with the queries in the first column and the uniqueness points in the second\n\n\n\n\n\n","category":"method"},{"location":"lc/#References","page":"Lexical characteristics","title":"References","text":"","category":"section"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and hearing, 19(1), 1.","category":"page"},{"location":"lc/","page":"Lexical characteristics","title":"Lexical characteristics","text":"Vitevitch, M. S., & Luce, P. A. (2016). Phonological neighborhood effects in spoken word perception and production. Annual Review of Linguistics, 2, 75-94.","category":"page"},{"location":"phon_spectrogram/#Spectrograms","page":"Spectrograms","title":"Spectrograms","text":"","category":"section"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"A basic function is provided to plot spectrograms that look familiar to phoneticians. It makes use of the spectrogram function from DSP.jl to perform the short-time Fourier analysis. The plot specification is given using RecipesBase.jl to avoid depending on Plots.jl. 
It is necessary to specify using Plots before spectrograms can be plotted.","category":"page"},{"location":"phon_spectrogram/#Examples","page":"Spectrograms","title":"Examples","text":"","category":"section"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"A standard broadband spectrogram can be created without using optional parameters.","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"using Phonetics # hide\nusing WAV\nusing Plots\ns, fs = wavread(\"assets/iwantaspectrogram.wav\")\ns = vec(s)\nphonspec(s, fs)","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"A color scheme more similar to the Praat grayscale can be achieved using the col argument and the :gist_yarg color scheme. These spectrograms are created using the heatmap function from Plots.jl, so any color scheme available in the Plots package can be used, though not all of them produce legible spectrograms.","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"using Phonetics # hide\nusing WAV # hide\ns, fs = wavread(\"assets/iwantaspectrogram.wav\") # hide\ns = vec(s) # hide\nusing Plots # hide\nphonspec(s, fs, col=:binary)","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"A narrowband style spectrogram can be plotted using the style argument:","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"using Phonetics # hide\nusing WAV # hide\ns, fs = wavread(\"assets/iwantaspectrogram.wav\") # hide\ns = vec(s) # hide\nusing Plots # hide\nphonspec(s, fs, style=:narrowband)","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"And, the pre-emphasis can be disabled by passing in a value of 0 for the pre_emph argument. 
Pre-emphasis boosts the higher frequencies relative to the lower frequencies.","category":"page"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"using Phonetics # hide\nusing WAV # hide\nusing Plots # hide\ns, fs = wavread(\"assets/iwantaspectrogram.wav\") # hide\ns = vec(s) # hide\nphonspec(s, fs, pre_emph=0)","category":"page"},{"location":"phon_spectrogram/#Function-documentation","page":"Spectrograms","title":"Function documentation","text":"","category":"section"},{"location":"phon_spectrogram/","page":"Spectrograms","title":"Spectrograms","text":"phonspec","category":"page"},{"location":"phon_spectrogram/#Phonetics.phonspec","page":"Spectrograms","title":"Phonetics.phonspec","text":"phonspec(s, fs; pre_emph=0.97, style=:broadband, dbr=55, kw...)\n\nRudimentary functionality to plot a spectrogram, with parameters familiar to phoneticians. Includes a pre-emphasis routine which helps increase the intensity of the higher frequencies in the display. Uses a Kaiser window with a parameter value of 2.\n\nArgument structure inferred from using plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot.\n\nArgs\n\ns A vector containing the samples of a sound\nfs Sampling frequency of s in Hz\npre_emph The α coefficient for pre-emphasis; default value of 0.97 corresponds to a cutoff frequency of approximately 213 Hz before the 6 dB / octave increase begins\nstyle Either :broadband or :narrowband; will affect the window length and window stride\ndbr The dynamic range; all frequencies that are dbr decibels quieter than the loudest frequency will not be displayed; will specify the clim argument\nkw... 
extra named parameters to pass to heatmap\n\n\n\n\n\n","category":"function"},{"location":"vowelplot/#Vowel-plotting","page":"Vowel plotting","title":"Vowel plotting","text":"","category":"section"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"The function provided for plotting vowels offers a variety of visualization techniques for displaying vowel tokens in a two-dimensional plot. Traditionally, it is F1 and F2 that are plotted, but any two measures can be plotted, such as F2 and F3, F2-F1 and F3, etc. A traditional, vanilla vowel plot only requires three positional arguments, f1, f2, and cats. The plot specification is given using RecipesBase.jl to avoid depending on Plots.jl. It is necessary to specify using Plots before vowel plots can be created.","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"using Phonetics # hide\nusing Plots\ndata = generateFormants(30, gender=[\"w\"], seed=56) # hide\nvowelplot(data.f1, data.f2, data.vowel, xlab=\"F1 (Hz)\", ylab=\"F2 (Hz)\")\nsavefig(\"vanilla_vowel_plot.svg\") # hide\nnothing # hide","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"(Image: Vanilla vowel plot)","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"This is a traditional vowel plot, with F1 on the x-axis in increasing order and F2 on the y-axis in increasing order. Note that simulated data were generated using the generateFormants function. Specifying a seed value makes the results reproducible. 
(Keep in mind that if you are generating values for different experiments, reports, studies, etc., the seed value needs to be changed (or left unspecified) so that the same data are not generated every time when they shouldn't be reproducible.)","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"For those inclined to use the alternate axes configuration with F2 decreasing on the x-axis and F1 decreasing on the y-axis, the xflip and yflip arguments that the Plots.jl package makes use of can be passed in to force the axes to be decreasing, the F2 values can be passed into the first argument slot, and the F1 values can be passed into the second argument slot.","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"using Phonetics # hide\nusing Plots # hide\ndata = generateFormants(30, gender=[\"w\"], seed=56) # hide\nvowelplot(data.f2, data.f1, data.vowel,\n xflip=true, yflip=true, xlab=\"F2 (Hz)\", ylab=\"F1 (Hz)\")\nsavefig(\"alt_axes_vowel_plot.svg\") # hide\nnothing # hide","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"(Image: Vowel plot with alternate axes)","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"I don't personally prefer to look at vowel plots in this manner because I think it unfairly privileges articulatory characteristics of vowel production when examining acoustic characteristics, so subsequent examples will not be presented using this axis configuration. However, the same principle applies to switching the axes around.","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"The vowelPlot function also allows for ellipses to be plotted around the values with the ell and ellPercent arguments. The ell argument takes a true or false value. 
The ellPercent argument should be a value greater than 0 and less than 1, and it represents the approximate proportion of the data that should be contained within the ellipse. This is in contrast to some packages available in R that allow you to specify the number of standard deviations that the ellipse should be stretched to. The reason is that the traditional cutoff values of 1 standard deviation for roughly 68%, 2 standard deviations for roughly 95%, etc. for univariate Gaussian distributions do not carry over to multiple dimensions. Instead, the appropriate amount of stretching of the ellipse can be determined from the percentage of data to contain (Wang et al., 2015).","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"using Phonetics # hide\nusing Plots # hide\ndata = generateFormants(30, gender=[\"w\"], seed=56) # hide\nvowelplot(data.f1, data.f2, data.vowel, ell=true, ellPercent=0.67,\n xlab=\"F1 (Hz)\", ylab=\"F2 (Hz)\")\nsavefig(\"ellipse_vowel_plot.svg\") # hide\nnothing # hide","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"(Image: Vowel plot with ellipses)","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"Each of the data clouds in the scatter has an ellipse overlaid on it so as to contain approximately 67% of the data. The ellipse calculation process is given in Friendly et al. (2013).","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"One final feature to point out is that the vowelplot function can also plot just the mean value of each vowel category with the meansOnly argument. 
Additionally, a label can be added to each category with the addLabels argument, which bases the labels on the category given in the cats argument.","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"using Phonetics # hide\nusing Plots # hide\ndata = generateFormants(30, gender=[\"w\"], seed=56) # hide\nvowelplot(data.f1, data.f2, data.vowel, ell=true,\n meansOnly=true, addLabels=true, xlab=\"F1 (Hz)\", ylab=\"F2 (Hz)\")\nsavefig(\"means_only_ellipse_vowel_plot.svg\") # hide\nnothing # hide","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"(Image: Vowel plot with ellipses and markers only for mean values)","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"The labels are offset from the mean value a bit so as to not cover up the marker showing where the mean value is.","category":"page"},{"location":"vowelplot/#Function-documentation","page":"Vowel plotting","title":"Function documentation","text":"","category":"section"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"vowelplot","category":"page"},{"location":"vowelplot/#Phonetics.vowelplot","page":"Vowel plotting","title":"Phonetics.vowelplot","text":"vowelplot(f1, f2, cats; meansOnly=false, addLabels=true, ell=false, ellPercent=0.67, nEllPts=500, kw...)\n\nCreate an F1-by-F2 vowel plot. The f1 values are displayed along the x-axis, and the f2 values are displayed along the y-axis, with each unique vowel class in cats being represented with a new color. The series labels in the legend will take on the unique values contained in cats. 
The alternate display whereby reversed F2 is on the x-axis and reversed F1 is on the y-axis can be created by passing the F2 values in for the f1 argument and F1 values in for the f2 argument, and then using the :flip magic argument provided by the Plots package.\n\nIf meansOnly is set to true, only the mean values for each vowel category are plotted. Using ell=true will plot a data ellipse that approximately encompasses the percentage of data specified by ellPercent. The ellipse is represented by a number of points specified with nEllPts. Other arguments to plot are passed in through the splatted kw argument. Setting the addLabels argument to true will add the text label of the vowel category above and to the right of the mean.\n\nArgument structure inferred from using plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot. The plot parameters markersize and linewidth both default to 3.\n\nArgs\n\nf1 The F1 values, or otherwise the values to plot on the x-axis\nf2 The F2 values, or otherwise the values to plot on the y-axis\ncats The vowel categories associated with each F1, F2 pair\nmeansOnly Plot only mean value for each category\naddLabels Add labels for each category to the plot near the mean\nell Whether to add data ellipses to the plot\nellPercent Percentage of the data distribution the ellipse should cover (approximately)\nnEllPts How many points should be used when plotting the ellipse\n\n\n\n\n\n","category":"function"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"ellipsePts","category":"page"},{"location":"vowelplot/#Phonetics.ellipsePts","page":"Vowel plotting","title":"Phonetics.ellipsePts","text":"ellipsePts(f1, f2; percent=0.95, nPoints=500)\n\nCalculates nPoints points of the perimeter of a data ellipse for f1 and f2 with approximately the percent of the data specified by percent contained within the ellipse. 
Points are returned in counter-clockwise order as the polar angle of rotation moves from 0 to 2π.\n\nSee Friendly, Monette, and Fox (2013, Elliptical insights: Understanding statistical methods through elliptical geometry, Statistical science 28(1), 1-39) for more information on the calculation process.\n\nArgs\n\nf1 The F1 values or otherwise x-axis values\nf2 The F2 values or otherwise y-axis values\npercent (keyword) Percent of the data distribution the ellipse should approximately cover\nnPoints (keyword) How many points to use when drawing the ellipse\n\n\n\n\n\n","category":"function"},{"location":"vowelplot/#References","page":"Vowel plotting","title":"References","text":"","category":"section"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"Friendly, M., Monette, G., & Fox, J. (2013). Elliptical insights: understanding statistical methods through elliptical geometry. Statistical Science, 28(1), 1-39.","category":"page"},{"location":"vowelplot/","page":"Vowel plotting","title":"Vowel plotting","text":"Wang, B., Shi, W., & Miao, Z. (2015). Confidence analysis of standard deviational ellipse and its extension into higher dimensional Euclidean space. PLOS ONE, 10(3), e0118537. https://doi.org/10.1371/journal.pone.0118537","category":"page"},{"location":"acd/#Acoustic-distance","page":"Acoustic distance","title":"Acoustic distance","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Recent work has used dynamic time warping on sequences of Mel frequency cepstral coefficient (MFCC) vectors to compute a form of acoustic distance (Mielke, 2012; Kelley, 2018; Kelley & Tucker, 2018; Bartelds et al., 2020). There are a number of convenience functions provided in this package. For the most part, they wrap the DynamicAxisWarping.jl and MFCC.jl packages. 
See also the Phonological CorpusTools page on acoustic similarity.","category":"page"},{"location":"acd/#Computing-acoustic-distance","page":"Acoustic distance","title":"Computing acoustic distance","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Let's start by creating some sample sounds to work with. You could also load in your own sounds from file as well using the Sound constructor that takes a filename.","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"using Phonetics # hide\nusing Random\nrng = MersenneTwister(9)\nx = rand(rng, 1, 1000)\ny = rand(rng, 1, 3000)\nacdist(x, y)","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"The output value is the result of performing dynamic time warping on the x and y. If x and y are Sound objects (in this example, they are not), they will first be converted to MFCC vectors with the sound2mfcc function. 
This value has been found to situate phonological similarity in terms of acoustics (Mielke, 2012), reflect aspects of the activation/competition process during spoken word recognition (Kelley, 2018; Kelley & Tucker, 2018), and judgments of nativelike pronunciation (Bartelds et al., 2020).","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"As an implementation note, the distance metric used to compare the MFCC vectors is the squared Euclidean distance between two vectors.","category":"page"},{"location":"acd/#Sequence-averaging","page":"Acoustic distance","title":"Sequence averaging","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Kelley & Tucker (2018) also used the dynamic barycenter averaging (Petitjean et al., 2011) technique to create \"average\" acoustic representations of English words, in an attempt to better model the kind of acoustic representation a listener may be accessing when hearing a word (given that a listener has heard most words more than just once). The interface for calculating the average sequence is with the avgseq function.","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"using Phonetics # hide\nusing Random\nrng = MersenneTwister(9)\nx = rand(rng, 1000)\ny = rand(rng, 3000)\nz = rand(rng, 10000)\na = [Sound(x, 8000), Sound(y, 8000), Sound(z, 8000)]\navgseq(a)","category":"page"},{"location":"acd/#Acoustic-distinctiveness","page":"Acoustic distance","title":"Acoustic distinctiveness","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Kelley (2018) and Kelley & Tucker (2018) introduced the concept of acoustic distinctiveness. It is how far away a word is, on average, from all the other words in a language. 
The distinctiveness function performs this calculation.","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"using Phonetics # hide\nusing Random\nrng = MersenneTwister(9)\nx = rand(rng, 1000)\ny = rand(rng, 3000)\nz = rand(rng, 10000)\na = [Sound(x, 8000), Sound(y, 8000), Sound(z, 8000)]\ndistinctiveness(a[1], a[2:3])","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"The number is effectively an index of how acoustically unique a word is in a language.","category":"page"},{"location":"acd/#Function-documentation","page":"Acoustic distance","title":"Function documentation","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"acdist(s1, s2; [method=:dtw, dist=SqEuclidean(), radius=10])","category":"page"},{"location":"acd/#Phonetics.acdist-Tuple{Any, Any}","page":"Acoustic distance","title":"Phonetics.acdist","text":"acdist(s1, s2; [method=:dtw, dist=SqEuclidean(), radius=10])\n\nCalculate the acoustic distance between s1 and s2 with method version of dynamic time warping and dist as the interior distance function. Using method=:dtw uses vanilla dynamic time warping, while method=:fastdtw uses the fast dtw approximation. 
Note that this is not a true mathematical distance metric because dynamic time warping does not necessarily satisfy the triangle inequality, nor does it guarantee the identity of indiscernibles.\n\nArgs\n\ns1 Features-by-time array of first sound to compare\ns2 Features-by-time array of second sound to compare\nmethod (keyword) Which method of dynamic time warping to use\ndist (keyword) Any distance function implementing the SemiMetric interface from the Distances package\ndtwradius (keyword) Maximum warping radius for vanilla dynamic time warping; if no value is passed, no warping constraint is used; argument unused when method=:fastdtw\nfastradius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw\n\n\n\n\n\n","category":"method"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"acdist(s1::Sound, s2::Sound, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10])","category":"page"},{"location":"acd/#Phonetics.acdist","page":"Acoustic distance","title":"Phonetics.acdist","text":"acdist(s1::Sound, s2::Sound, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10])\n\nConvert s1 and s2 to a frequency representation specified by rep, then calculate acoustic distance between s1 and s2. 
Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.\n\n\n\n\n\n","category":"function"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"avgseq(S; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])","category":"page"},{"location":"acd/#Phonetics.avgseq-Tuple{Any}","page":"Acoustic distance","title":"Phonetics.avgseq","text":"avgseq(S; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])\n\nReturn a sequence representing the average of the sequences in S using the dba method for sequence averaging. Supports method=:dtw for vanilla dtw and method=:fastdtw for fast dtw approximation when performing the sequence comparisons. With center=:medoid, finds the medoid as the sequence to use as the initial center, and with center=:rand selects a random element in S as the initial center.\n\nArgs\n\nS An array of sequences to average\nmethod (keyword) The method of dynamic time warping to use\ndist (keyword) Any distance function implementing the SemiMetric interface from the Distances package\nradius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw\ncenter (keyword) The method used to select the initial center of the sequences in S\ndtwradius (keyword) How far a time step can be mapped when comparing sequences; passed directly to DTW function from DynamicAxisWarping; if set to nothing, the length of the longest sequence will be used, effectively removing the radius restriction\nprogress Whether to show the progress coming from dba\n\n\n\n\n\n","category":"method"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":" avgseq(S::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, 
center=:medoid, dtwradius=nothing, progress=false])","category":"page"},{"location":"acd/#Phonetics.avgseq","page":"Acoustic distance","title":"Phonetics.avgseq","text":"avgseq(S::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])\n\nConvert the Sound objects in S to a representation designated by rep, then find the average sequence of them. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.\n\n\n\n\n\n","category":"function"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"distinctiveness(s, corpus; [method=:dtw, radius=10, reduction=mean])","category":"page"},{"location":"acd/#Phonetics.distinctiveness-Tuple{Any, Any}","page":"Acoustic distance","title":"Phonetics.distinctiveness","text":"distinctiveness(s, corpus; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])\n\nCalculates the acoustic distinctiveness of s given the corpus corpus. The method, dist, and radius arguments are passed into acdist. The reduction argument can be any function that reduces an iterable to one number, such as mean, sum, or median. 
\n\nFor more information, see Kelley (2018, September, How acoustic distinctiveness affects spoken word recognition: A pilot study, DOI: 10.7939/R39G5GV9Q) and Kelley & Tucker (2018, Using acoustic distance to quantify lexical competition, DOI: 10.7939/r3-wbhs-kr84).\n\n\n\n\n\n","category":"method"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"distinctiveness(s::Sound, corpus::Array{Sound}, rep=:mfcc; [method=:dtw, radius=10, reduction=mean])","category":"page"},{"location":"acd/#Phonetics.distinctiveness","page":"Acoustic distance","title":"Phonetics.distinctiveness","text":"distinctiveness(s::Sound, corpus::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])\n\nConverts s and corpus to a representation specified by rep, then calculates the acoustic distinctiveness of s given corpus. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.\n\n\n\n\n\n","category":"function"},{"location":"acd/#References","page":"Acoustic distance","title":"References","text":"","category":"section"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Bartelds, M., Richter, C., Liberman, M., & Wieling, M. (2020). A new acoustic-based pronunciation distance measure. Frontiers in Artificial Intelligence, 3, 39.","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Mielke, J. (2012). A phonetically based metric of sound similarity. Lingua, 122(2), 145-163.","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Kelley, M. C. (2018). How acoustic distinctiveness affects spoken word recognition: A pilot study. Presented at the 11th International Conference on the Mental Lexicon (Edmonton, AB). 
https://doi.org/10.7939/R39G5GV9Q","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Kelley, M. C., & Tucker, B. V. (2018). Using acoustic distance to quantify lexical competition. University of Alberta ERA (Education and Research Archive). https://doi.org/10.7939/r3-wbhs-kr84","category":"page"},{"location":"acd/","page":"Acoustic distance","title":"Acoustic distance","text":"Petitjean, F., Ketterlin, A., & Gançarski, P. (2011). A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition, 44(3), 678–693.","category":"page"},{"location":"#Home","page":"Home","title":"Home","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This is a Julia package that provides a collection of functions to analyze phonetic data.","category":"page"},{"location":"norm/#Vowel-normalization","page":"Vowel normalization","title":"Vowel normalization","text":"","category":"section"},{"location":"norm/","page":"Vowel normalization","title":"Vowel normalization","text":"Vowel normalization routines come from a variety of sources. In the case of the Nearey normalizations, they are intended to show how a listener takes formant information (which varies not only based on vowel category, but also on factors like gender and age) and transforms them to a space where formant information is more related to vowel category. In that sense, it is a perceptually motivated routine. 
Other routines, like the Lobanov normalization routine (Lobanov, 1971), are more purpose driven in that their goal is just to allow vowel comparisons between speakers, regardless of whether the technique is perceptually motivated or plausible.","category":"page"},{"location":"norm/#Nearey-normalization-routines","page":"Vowel normalization","title":"Nearey normalization routines","text":"","category":"section"},{"location":"norm/#Barreda-and-Nearey-normalization-routine","page":"Vowel normalization","title":"Barreda and Nearey normalization routine","text":"","category":"section"},{"location":"norm/#Lobanov-normalization-routine","page":"Vowel normalization","title":"Lobanov normalization routine","text":"","category":"section"},{"location":"norm/#References","page":"Vowel normalization","title":"References","text":"","category":"section"},{"location":"norm/","page":"Vowel normalization","title":"Vowel normalization","text":"Barreda, S., & Nearey, T. M. (2018). A regression approach to vowel normalization for missing and unbalanced data. The Journal of the Acoustical Society of America, 144(1), 500–520. https://doi.org/10.1121/1.5047742","category":"page"},{"location":"norm/","page":"Vowel normalization","title":"Vowel normalization","text":"Lobanov, B. M. (1971). Classification of Russian vowels spoken by different speakers. The Journal of the Acoustical Society of America, 49(2B), 606–608. https://doi.org/10.1121/1.1912396","category":"page"},{"location":"norm/","page":"Vowel normalization","title":"Vowel normalization","text":"Nearey, T. M. (1978). Phonetic feature system for vowels. 
Indiana University Linguistics Club.","category":"page"},{"location":"textvptree/#Text-VP-Tree","page":"Text VP Tree","title":"Text VP Tree","text":"","category":"section"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"A vantage-point tree is a data structure that takes advantage of the spatial distribution of data and allows for faster searching through the data by lowering the number of comparisons that need to be made. Consider the traditional example of phonological neighborhood density calculation. The code would be written to compare each item to all the other items. For n items, there would be n-1 comparisons. So, to calculate the phonological neighborhood density for each item in a given corpus, there would need to be n times (n-1) = n^2-n comparisons. This is a lot of comparisons, especially when you're working with tens or hundreds of thousands of words!","category":"page"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"With a vantage-point tree, however, we might get an average of only needing log_2(n) comparisons per query because of the way the data are organized. This means we would only need n times log_2(n) comparisons in total, which can be substantially lower than n^2-n for larger corpora. 
Analyzing the runtime of a VP tree is difficult, though, so the actual speedup may not be as drastic, but it should still be faster than the naive phonological neighborhood density calculation.","category":"page"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"This implementation is based on the description by Samet (2006).","category":"page"},{"location":"textvptree/#Function-documentation","page":"Text VP Tree","title":"Function documentation","text":"","category":"section"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"TextVPTree(items::Array, d::Function)","category":"page"},{"location":"textvptree/#Phonetics.TextVPTree-Tuple{Array, Function}","page":"Text VP Tree","title":"Phonetics.TextVPTree","text":"TextVPTree(items::Array, d)\n\nOuter constructor for a TextVPTree. Takes in an array of items items and a distance function d and proceeds to build a vantage-point tree from them.\n\n\n\n\n\n","category":"method"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"radiusSearch(tree::TextVPTree, query, epsilon)","category":"page"},{"location":"textvptree/#Phonetics.radiusSearch-Tuple{TextVPTree, Any, Any}","page":"Text VP Tree","title":"Phonetics.radiusSearch","text":"radiusSearch(tree::TextVPTree, query, epsilon)\n\nPerforms a search for all items in a VP tree tree that are within a radius epsilon from a query query.\n\nReturns\n\nA Vector of items that are within the given radius epsilon\n\n\n\n\n\n","category":"method"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"nneighbors(tree::TextVPTree, query, n)","category":"page"},{"location":"textvptree/#Phonetics.nneighbors-Tuple{TextVPTree, Any, Any}","page":"Text VP Tree","title":"Phonetics.nneighbors","text":"nneighbors(tree::TextVPTree, query, n)\n\nFind the n nearest neighbors in a VP tree tree to a given query query.\n\nReturns\n\nA PriorityQueue of items where the keys are the items 
themselves and the values are the distances from the items to query; the PriorityQueue is defined such that small values have higher priorities than large ones\n\n\n\n\n\n","category":"method"},{"location":"textvptree/#References","page":"Text VP Tree","title":"References","text":"","category":"section"},{"location":"textvptree/","page":"Text VP Tree","title":"Text VP Tree","text":"Samet, H. (2006). Foundations of multidimensional and metric data structures. San Francisco, California: Morgan Kaufmann.","category":"page"}] } diff --git a/dev/textvptree/index.html b/dev/textvptree/index.html index 1d62468..148fff8 100644 --- a/dev/textvptree/index.html +++ b/dev/textvptree/index.html @@ -1,2 +1,2 @@ -Text VP Tree · Phonetics.jl

Text VP Tree

A vantage-point tree is a data structure that takes advantage of the spatial distribution of data and allows for faster searching by lowering the number of comparisons that need to be made. Consider the traditional example of phonological neighborhood density calculation. A naive implementation compares each item to all the other items, so a single query among $n$ items requires $n-1$ comparisons. Calculating the phonological neighborhood density for every item in a given corpus therefore requires $n \times (n-1)\, = \, n^2-n$ comparisons. This is a lot of comparisons, especially when you're working with tens or hundreds of thousands of words!

With a vantage-point tree, however, a query might need only about $\log_2(n)$ comparisons on average because of the way the data are organized. This means we would need only about $n \times \log_2(n)$ comparisons in total, which can be substantially lower than $n^2-n$ for larger corpora. Analyzing the runtime of a VP tree is difficult, though, so the actual speedup may not be as drastic, but it should still be faster than the naive phonological neighborhood density calculation.

This implementation is based on the description by Samet (2006).

Function documentation

Phonetics.TextVPTreeMethod
TextVPTree(items::Array, d)

Outer constructor for a TextVPTree. Takes in an array of items items and a distance function d and proceeds to build a vantage-point tree from them.

source
Phonetics.radiusSearchMethod
radiusSearch(tree::TextVPTree, query, epsilon)

Performs a search for all items in a VP tree tree that are within a radius epsilon from a query query.

Returns

A Vector of items that are within the given radius epsilon

source
Phonetics.nneighborsMethod
nneighbors(tree::TextVPTree, query, n)

Find the n nearest neighbors in a VP tree tree to a given query query.

Returns

  • A PriorityQueue of items where the keys are the items themselves and the values are the distances from the items to query; the PriorityQueue is defined such that small values have higher priorities than large ones
source

References

Samet, H. (2006). Foundations of multidimensional and metric data structures. San Francisco, California: Morgan Kaufmann.

+Text VP Tree · Phonetics.jl

Text VP Tree

A vantage-point tree is a data structure that takes advantage of the spatial distribution of data and allows for faster searching through the data by lowering the number of comparisons that need to be made. Consider the traditional example of phonological neighborhood density calculation. The code would be written to compare each item to all the other items. For $n$ items, there would be $n-1$ comparisons per item. So, to calculate the phonological neighborhood density for each item in a given corpus, there would need to be $n \times (n-1)\, = \, n^2-n$ comparisons. This is a lot of comparisons, especially when you're working with tens or hundreds of thousands of words!
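As a rough illustration of the naive approach, the sketch below counts neighbors by comparing every word with every other word. It is not code from Phonetics.jl: the names `levenshtein` and `naive_density`, the edit-distance measure, and the default radius of 1 are all assumptions made for the example, and the words are assumed to be unique.

```julia
# Plain dynamic-programming Levenshtein (edit) distance between two strings.
# Hypothetical helper for this sketch; not part of Phonetics.jl.
function levenshtein(a::AbstractString, b::AbstractString)
    s, t = collect(a), collect(b)
    prev = collect(0:length(t))          # distances from the empty prefix of s
    for i in 1:length(s)
        curr = similar(prev)
        curr[1] = i
        for j in 1:length(t)
            cost = s[i] == t[j] ? 0 : 1
            curr[j + 1] = min(prev[j + 1] + 1,   # deletion
                              curr[j] + 1,       # insertion
                              prev[j] + cost)    # substitution or match
        end
        prev = curr
    end
    return prev[end]
end

# Naive neighborhood density: every word is compared with every other word,
# so n words require n * (n - 1) distance computations.
function naive_density(words; radius=1)
    comparisons = 0
    density = Dict{String,Int}()
    for w in words
        n_neighbors = 0
        for v in words
            v == w && continue           # skip the word itself (words assumed unique)
            comparisons += 1
            n_neighbors += levenshtein(w, v) <= radius
        end
        density[w] = n_neighbors
    end
    return density, comparisons
end

words = ["cat", "bat", "hat", "cot", "dog"]
density, comparisons = naive_density(words)
println(comparisons)   # number of distance computations performed
println(density)
```

With five words the loop performs 5 × 4 = 20 distance computations, which matches the $n^2-n$ count above.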

With a vantage-point tree, however, we might get an average of only needing $\log_2(n)$ comparisons per query because of the way the data are organized. This means we would only need $n \times \log_2(n)$ comparisons in total, which can be substantially lower than $n^2-n$ for larger corpora. Analyzing the runtime of a VP tree is difficult, however, so the actual speedup may not be as drastic, but it should still be faster than the naive phonological neighborhood density calculation.
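To get a feel for the size of the gap, the two comparison counts can be evaluated directly. This is only arithmetic on the formulas above; the $n \times \log_2(n)$ figure is the optimistic average case, not a measured result, and the function names are made up for the example.

```julia
# Comparison counts from the two formulas in the text.
naive_comparisons(n) = n^2 - n          # every item against every other item
vptree_comparisons(n) = n * log2(n)     # optimistic average case for a VP tree

for n in (1_000, 10_000, 100_000)
    ratio = naive_comparisons(n) / vptree_comparisons(n)
    println("n = ", n,
            ": naive = ", naive_comparisons(n),
            ", VP tree ≈ ", round(Int, vptree_comparisons(n)),
            ", ratio ≈ ", round(ratio; digits=1))
end
```

The ratio grows with $n$, which is why the difference matters most for corpora with tens or hundreds of thousands of words.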

This implementation is based on the description by Samet (2006).

Function documentation

Phonetics.TextVPTreeMethod
TextVPTree(items::Array, d)

Outer constructor for a TextVPTree. Takes in an array of items items and a distance function d and proceeds to build a vantage-point tree from them.

source
Phonetics.radiusSearchMethod
radiusSearch(tree::TextVPTree, query, epsilon)

Performs a search for all items in a VP tree tree that are within a radius epsilon from a query query.

Returns

A Vector of items that are within the given radius epsilon

source
Phonetics.nneighborsMethod
nneighbors(tree::TextVPTree, query, n)

Find the n nearest neighbors in a VP tree tree to a given query query.

Returns

  • A PriorityQueue of items where the keys are the items themselves and the values are the distances from the items to query; the PriorityQueue is defined such that small values have higher priorities than large ones
source
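
For comparison, a brute-force version of a nearest-neighbor query might look like the sketch below. It does not use a VP tree or a PriorityQueue; it simply scores every item and sorts, returning item-to-distance pairs with the smallest distances first, which mirrors the ordering described above. The names `hamming` and `brute_force_neighbors` are hypothetical, and the position-wise mismatch count assumes equal-length strings.

```julia
# Position-wise mismatch count; assumes both strings have the same length.
# Hypothetical distance function for this sketch only.
hamming(a, b) = count(((x, y),) -> x != y, zip(a, b))

# Score every item against the query, then keep the n closest.
# The result is a vector of `item => distance` pairs, closest first.
function brute_force_neighbors(items, query, n, d)
    scored = [item => d(item, query) for item in items]
    sort!(scored; by=last)               # smallest distances first
    return scored[1:min(n, length(scored))]
end

result = brute_force_neighbors(["cat", "cab", "dot", "dog"], "cat", 2, hamming)
println(result)
```

A tree-based search aims to return the same neighbors while computing far fewer distances than this exhaustive scan.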

References

Samet, H. (2006). Foundations of multidimensional and metric data structures. San Francisco, California: Morgan Kaufmann.

diff --git a/dev/vanilla_vowel_plot.svg b/dev/vanilla_vowel_plot.svg index 70b237c..e6d935a 100644 --- a/dev/vanilla_vowel_plot.svg +++ b/dev/vanilla_vowel_plot.svg @@ -1,141 +1,141 @@ diff --git a/dev/vowelplot/index.html b/dev/vowelplot/index.html index 9f932d3..348710d 100644 --- a/dev/vowelplot/index.html +++ b/dev/vowelplot/index.html @@ -3,4 +3,4 @@ vowelplot(data.f1, data.f2, data.vowel, xlab="F1 (Hz)", ylab="F2 (Hz)")

Vanilla vowel plot

This is a traditional vowel plot, with F1 on the x-axis in increasing order and F2 on the y-axis in increasing order. Note that simulated data were generated using the generateFormants function. Specifying a seed value makes the results reproducible. (Keep in mind that if you are generating values for different experiments, reports, studies, etc., the seed value needs to be changed (or left unspecified) so that the same data are not generated every time when they shouldn't be reproducible.)

For those inclined to use the alternate axes configuration with F2 decreasing on the x-axis and F1 decreasing on the y-axis, the xflip and yflip arguments that the Plots.jl package makes use of can be passed in to force the axes to be decreasing, the F2 values can be passed into the first argument slot, and the F1 values can be passed into the second argument slot.

vowelplot(data.f2, data.f1, data.vowel,
   xflip=true, yflip=true, xlab="F2 (Hz)", ylab="F1 (Hz)")

Vowel plot with alternate axes

I don't personally prefer to look at vowel plots in this manner because I think it unfairly privileges articulatory characteristics of vowel production when examining acoustic characteristics, so subsequent examples will not be presented using this axis configuration. However, the same principle applies to switching the axes around.

The vowelplot function also allows for ellipses to be plotted around the values with the ell and ellPercent arguments. The ell argument takes a true or false value. The ellPercent argument should be a value greater than 0 and less than 1, and it represents the approximate proportion of the data that should be contained within the ellipse. This is in contrast to some packages available in R that allow you to specify the number of standard deviations that the ellipse should be stretched to. The reason is that the traditional cutoff values of 1 standard deviation for 67%, 2 standard deviations for 95%, etc. for univariate Gaussian distributions do not carry over to multiple dimensions. The appropriate amount of stretching of the ellipse can instead be determined from the percentage of data to contain (Wang et al., 2015).

vowelplot(data.f1, data.f2, data.vowel, ell=true, ellPercent=0.67,
   xlab="F1 (Hz)", ylab="F2 (Hz)")

Vowel plot with ellipses

Each of the data clouds in the scatter has an ellipse overlaid on it so as to contain 67% of the data. The ellipse calculation process is given in Friendly et al. (2013).
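
The reason standard-deviation cutoffs do not transfer to two dimensions can be checked with a short calculation. Assuming the data in a category are roughly bivariate Gaussian, the squared Mahalanobis distance follows a chi-square distribution with 2 degrees of freedom, so an ellipse stretched to $r$ standard deviations contains a proportion $1 - e^{-r^2/2}$ of the data. The function names below are illustrative and are not part of Phonetics.jl.

```julia
# Proportion of a bivariate Gaussian that falls inside the ellipse
# stretched to Mahalanobis radius r (chi-square CDF with 2 degrees of freedom).
coverage(r) = 1 - exp(-r^2 / 2)

# Inverse: the radius needed for the ellipse to contain proportion p, 0 < p < 1.
radius_for(p) = sqrt(-2 * log(1 - p))

println(round(coverage(1); digits=3))        # 1-SD ellipse
println(round(coverage(2); digits=3))        # 2-SD ellipse
println(round(radius_for(0.67); digits=3))   # radius for a 67% ellipse
```

Under this assumption a 1-standard-deviation ellipse holds only about 39% of the data and a 2-standard-deviation ellipse about 86%, while a 67% ellipse needs a radius of roughly 1.49 standard deviations.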

One final feature to point out is that the vowelplot function can also plot just the mean value of each vowel category with the meansOnly argument. Additionally, a label can be added to each category with the addLabels argument, which bases the labels on the category given in the cats argument.

vowelplot(data.f1, data.f2, data.vowel, ell=true,
-  meansOnly=true, addLabels=true, xlab="F1 (Hz)", ylab="F2 (Hz)")

Vowel plot with ellipses and markers only for mean values

The labels are offset from the mean value a bit so as to not cover up the marker showing where the mean value is.

Function documentation

Phonetics.vowelplotFunction
vowelplot(f1, f2, cats; meansOnly=false, addLabels=true, ell=false, ellPercent=0.67, nEllPts=500, kw...)

Create an F1-by-F2 vowel plot. The f1 values are displayed along the x-axis, and the f2 values are displayed along the y-axis, with each unique vowel class in cats being represented with a new color. The series labels in the legend will take on the unique values contained in cats. The alternate display whereby reversed F2 is on the x-axis and reversed F1 is on the y-axis can be created by passing the F2 values in for the f1 argument and F1 values in for the f2 argument, and then using the :flip magic argument provided by the Plots package.

If meansOnly is set to true, only the mean values for each vowel category are plotted. Using ell=true will plot a data ellipse that approximately encompasses the percentage of data specified by ellPercent. The ellipse is represented by a number of points specified with nEllPts. Other arguments to plot are passed in through the splatted kw argument. Setting the addLabels argument to true will add the text label of the vowel category above and to the right of the mean.

The argument structure is inferred from the plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot. The plot parameters markersize and linewidth both default to 3.

Args

  • f1 The F1 values, or otherwise the values to plot on the x-axis
  • f2 The F2 values, or otherwise the values to plot on the y-axis
  • cats The vowel categories associated with each F1, F2 pair
  • meansOnly Plot only mean value for each category
  • addLabels Add labels for each category to the plot near the mean
  • ell Whether to add data ellipses to the plot
  • ellPercent Percentage of the data distribution the ellipse should cover (approximately)
  • nEllPts How many points should be used when plotting the ellipse
source
Phonetics.ellipsePtsFunction
ellipsePts(f1, f2; percent=0.95, nPoints=500)

Calculates nPoints points on the perimeter of a data ellipse for f1 and f2 with approximately the percent of the data specified by percent contained within the ellipse. Points are returned in counter-clockwise order as the polar angle of rotation moves from 0 to 2π.

See Friendly, Monette, and Fox (2013, Elliptical insights: Understanding statistical methods through elliptical geometry, Statistical science 28(1), 1-39) for more information on the calculation process.

Args

  • f1 The F1 values or otherwise x-axis values
  • f2 The F2 values or otherwise y-axis values
  • percent (keyword) Percent of the data distribution the ellipse should approximately cover
  • nPoints (keyword) How many points to use when drawing the ellipse
source

References

Friendly, M., Monette, G., & Fox, J. (2013). Elliptical insights: understanding statistical methods through elliptical geometry. Statistical Science, 28(1), 1-39.

Wang, B., Shi, W., & Miao, Z. (2015). Confidence analysis of standard deviational ellipse and its extension into higher dimensional Euclidean space. PLOS ONE, 10(3), e0118537. https://doi.org/10.1371/journal.pone.0118537

+ meansOnly=true, addLabels=true, xlab="F1 (Hz)", ylab="F2 (Hz)")

Vowel plot with ellipses and markers only for mean values

The labels are offset from the mean value a bit so as to not cover up the marker showing where the mean value is.

Function documentation

Phonetics.vowelplotFunction
vowelplot(f1, f2, cats; meansOnly=false, addLabels=true, ell=false, ellPercent=0.67, nEllPts=500, kw...)

Create an F1-by-F2 vowel plot. The f1 values are displayed along the x-axis, and the f2 values are displayed along the y-axis, with each unique vowel class in cats being represented with a new color. The series labels in the legend will take on the unique values contained in cats. The alternate display whereby reversed F2 is on the x-axis and reversed F1 is on the y-axis can be created by passing the F2 values in for the f1 argument and F1 values in for the f2 argument, and then using the :flip magic argument provided by the Plots package.

If meansOnly is set to true, only the mean values for each vowel category are plotted. Using ell=true will plot a data ellipse that approximately encompasses the percentage of data specified by ellPercent. The ellipse is represented by a number of points specified with nEllPts. Other arguments to plot are passed in through the splatted kw argument. Setting the addLabels argument to true will add the text label of the vowel category above and to the right of the mean.

The argument structure is inferred from the plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot. The plot parameters markersize and linewidth both default to 3.

Args

  • f1 The F1 values, or otherwise the values to plot on the x-axis
  • f2 The F2 values, or otherwise the values to plot on the y-axis
  • cats The vowel categories associated with each F1, F2 pair
  • meansOnly Plot only mean value for each category
  • addLabels Add labels for each category to the plot near the mean
  • ell Whether to add data ellipses to the plot
  • ellPercent Percentage of the data distribution the ellipse should cover (approximately)
  • nEllPts How many points should be used when plotting the ellipse
source
Phonetics.ellipsePtsFunction
ellipsePts(f1, f2; percent=0.95, nPoints=500)

Calculates nPoints points on the perimeter of a data ellipse for f1 and f2 with approximately the percent of the data specified by percent contained within the ellipse. Points are returned in counter-clockwise order as the polar angle of rotation moves from 0 to 2π.

See Friendly, Monette, and Fox (2013, Elliptical insights: Understanding statistical methods through elliptical geometry, Statistical science 28(1), 1-39) for more information on the calculation process.

Args

  • f1 The F1 values or otherwise x-axis values
  • f2 The F2 values or otherwise y-axis values
  • percent (keyword) Percent of the data distribution the ellipse should approximately cover
  • nPoints (keyword) How many points to use when drawing the ellipse
source
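
One common way to trace such a data ellipse, in the spirit of Friendly et al. (2013), is to map points on the unit circle through a Cholesky factor of the sample covariance matrix, scale by the radius for the requested coverage, and shift by the mean. The sketch below is an assumption about how this could be done, not the package's actual implementation; `sketch_ellipse_pts` and its keyword names are hypothetical, and the radius formula assumes roughly bivariate Gaussian data.

```julia
using Statistics, LinearAlgebra

# Hypothetical sketch of a data-ellipse perimeter; not Phonetics.ellipsePts itself.
function sketch_ellipse_pts(x, y; percent=0.95, npoints=500)
    mu = [mean(x), mean(y)]                   # center of the ellipse
    S = cov(hcat(x, y))                       # 2x2 sample covariance matrix
    r = sqrt(-2 * log(1 - percent))           # radius for the requested coverage
    L = cholesky(Symmetric(S)).L              # lower-triangular factor, L * L' == S
    theta = range(0, 2π; length=npoints)      # polar angle from 0 to 2π
    pts = [mu .+ r .* (L * [cos(t), sin(t)]) for t in theta]
    return first.(pts), last.(pts)            # x and y coordinates of the perimeter
end

# Small made-up data set, used only to exercise the function.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]
xs, ys = sketch_ellipse_pts(x, y; percent=0.67, npoints=100)
println(length(xs))
```

Every returned point lies at the same Mahalanobis distance from the mean, which is what makes the curve a contour of the fitted distribution rather than an arbitrary oval.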

References

Friendly, M., Monette, G., & Fox, J. (2013). Elliptical insights: understanding statistical methods through elliptical geometry. Statistical Science, 28(1), 1-39.

Wang, B., Shi, W., & Miao, Z. (2015). Confidence analysis of standard deviational ellipse and its extension into higher dimensional Euclidean space. PLOS ONE, 10(3), e0118537. https://doi.org/10.1371/journal.pone.0118537