diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index d95421b..17b0826 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.10.0","generation_timestamp":"2023-12-28T18:48:57","documenter_version":"1.2.1"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.9.4","generation_timestamp":"2023-12-28T19:13:22","documenter_version":"1.2.1"}}
\ No newline at end of file
diff --git a/dev/acd/index.html b/dev/acd/index.html
index 2fee5fd..df962ee 100644
--- a/dev/acd/index.html
+++ b/dev/acd/index.html
@@ -28,4 +28,4 @@
 y = rand(rng, 3000)
 z = rand(rng, 10000)
 a = [Sound(x, 8000), Sound(y, 8000), Sound(z, 8000)]
-distinctiveness(a[1], a[2:3])
45165.66072314727

The number is effectively an index of how acoustically unique a word is in a language.

Function documentation

Phonetics.acdistMethod
acdist(s1, s2; [method=:dtw, dist=SqEuclidean(), radius=10])

Calculate the acoustic distance between s1 and s2 with method version of dynamic time warping and dist as the interior distance function. Using method=:dtw uses vanilla dynamic time warping, while method=:fastdtw uses the fast dtw approximation. Note that this is not a true mathematical distance metric because dynamic time warping does not necessarily satisfy the triangle inequality, nor does it guarantee the identity of indiscernibles.

Args

  • s1 Features-by-time array of first sound to compare
  • s2 Features-by-time array of second sound to compare
  • method (keyword) Which method of dynamic time warping to use
  • dist (keyword) Any distance function implementing the SemiMetric interface from the Distances package
  • dtwradius (keyword) Maximum warping radius for vanilla dynamic time warping; if no value is passed, no warping constraint is used; argument unused when method=:fastdtw
  • fastradius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw
source
Phonetics.acdistFunction
acdist(s1::Sound, s2::Sound, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10])

Convert s1 and s2 to a frequency representation specified by rep, then calculate acoustic distance between s1 and s2. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source
Phonetics.avgseqMethod
avgseq(S; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])

Return a sequence representing the average of the sequences in S using the dba method for sequence averaging. Supports method=:dtw for vanilla dtw and method=:fastdtw for fast dtw approximation when performing the sequence comparisons. With center=:medoid, finds the medoid as the sequence to use as the initial center, and with center=:rand selects a random element in S as the initial center.

Args

  • S An array of sequences to average
  • method (keyword) The method of dynamic time warping to use
  • dist (keyword) Any distance function implementing the SemiMetric interface from the Distances package
  • radius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw
  • center (keyword) The method used to select the initial center of the sequences in S
  • dtwradius (keyword) How far a time step can be mapped when comparing sequences; passed directly to DTW function from DynamicAxisWarping; if set to nothing, the length of the longest sequence will be used, effectively removing the radius restriction
  • progress Whether to show the progress coming from dba
source
Phonetics.avgseqFunction
avgseq(S::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])

Convert the Sound objects in S to a representation designated by rep, then find the average sequence of them. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source
Phonetics.distinctivenessMethod
distinctiveness(s, corpus; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])

Calculates the acoustic distinctiveness of s given the corpus corpus. The method, dist, and radius arguments are passed into acdist. The reduction argument can be any function that reduces an iterable to one number, such as mean, sum, or median.

For more information, see Kelley (2018, September, How acoustic distinctiveness affects spoken word recognition: A pilot study, DOI: 10.7939/R39G5GV9Q) and Kelley & Tucker (2018, Using acoustic distance to quantify lexical competition, DOI: 10.7939/r3-wbhs-kr84).

source
Phonetics.distinctivenessFunction
distinctiveness(s::Sound, corpus::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])

Converts s and corpus to a representation specified by rep, then calculates the acoustic distinctiveness of s given corpus. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source

References

Bartelds, M., Richter, C., Liberman, M., & Wieling, M. (2020). A new acoustic-based pronunciation distance measure. Frontiers in Artificial Intelligence, 3, 39.

Mielke, J. (2012). A phonetically based metric of sound similarity. Lingua, 122(2), 145-163.

Kelley, M. C. (2018). How acoustic distinctiveness affects spoken word recognition: A pilot study. Presented at the 11th International Conference on the Mental Lexicon (Edmonton, AB). https://doi.org/10.7939/R39G5GV9Q

Kelley, M. C., & Tucker, B. V. (2018). Using acoustic distance to quantify lexical competition. University of Alberta ERA (Education and Research Archive). https://doi.org/10.7939/r3-wbhs-kr84

Petitjean, F., Ketterlin, A., & Gançarski, P. (2011). A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition, 44(3), 678–693.

+distinctiveness(a[1], a[2:3])
45165.66072314727

The number is effectively an index of how acoustically unique a word is in a language.

Function documentation

Phonetics.acdistMethod
acdist(s1, s2; [method=:dtw, dist=SqEuclidean(), radius=10])

Calculate the acoustic distance between s1 and s2 with method version of dynamic time warping and dist as the interior distance function. Using method=:dtw uses vanilla dynamic time warping, while method=:fastdtw uses the fast dtw approximation. Note that this is not a true mathematical distance metric because dynamic time warping does not necessarily satisfy the triangle inequality, nor does it guarantee the identity of indiscernibles.

Args

  • s1 Features-by-time array of first sound to compare
  • s2 Features-by-time array of second sound to compare
  • method (keyword) Which method of dynamic time warping to use
  • dist (keyword) Any distance function implementing the SemiMetric interface from the Distances package
  • dtwradius (keyword) Maximum warping radius for vanilla dynamic time warping; if no value is passed, no warping constraint is used; argument unused when method=:fastdtw
  • fastradius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw
source
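
The acdist description notes that dynamic time warping is not a true distance metric. The sketch below illustrates one part of that claim with a plain-Julia DTW cost function. It is a minimal illustration, not the package's implementation: dtw_cost is a hypothetical name, and the squared Euclidean local cost is hard-coded rather than taken from the Distances package.

```julia
# Minimal dynamic time warping cost between two features-by-time matrices.
# Columns are time steps; the local cost is the squared Euclidean distance.
# `dtw_cost` is a hypothetical helper, not part of Phonetics.jl.
function dtw_cost(a::AbstractMatrix, b::AbstractMatrix)
    n, m = size(a, 2), size(b, 2)
    # D[i + 1, j + 1] holds the best cumulative cost of aligning a[:, 1:i] with b[:, 1:j]
    D = fill(Inf, n + 1, m + 1)
    D[1, 1] = 0.0
    for i in 1:n, j in 1:m
        cost = sum(abs2, view(a, :, i) .- view(b, :, j))
        # Extend the cheapest of: diagonal match, repeating b[:, j], repeating a[:, i]
        D[i + 1, j + 1] = cost + min(D[i, j], D[i, j + 1], D[i + 1, j])
    end
    return D[n + 1, m + 1]
end

a = [1.0 1.0 2.0]   # 1 feature × 3 time steps
b = [1.0 2.0]       # 1 feature × 2 time steps
println(dtw_cost(a, b))
```

Here a and b are different sequences, yet the warping path can align the repeated first frame of a with the first frame of b at no cost, so the total cost is zero. That is the sense in which the identity of indiscernibles is not guaranteed.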
Phonetics.acdistFunction
acdist(s1::Sound, s2::Sound, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10])

Convert s1 and s2 to a frequency representation specified by rep, then calculate acoustic distance between s1 and s2. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source
Phonetics.avgseqMethod
avgseq(S; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])

Return a sequence representing the average of the sequences in S using the dba method for sequence averaging. Supports method=:dtw for vanilla dtw and method=:fastdtw for fast dtw approximation when performing the sequence comparisons. With center=:medoid, finds the medoid as the sequence to use as the initial center, and with center=:rand selects a random element in S as the initial center.

Args

  • S An array of sequences to average
  • method (keyword) The method of dynamic time warping to use
  • dist (keyword) Any distance function implementing the SemiMetric interface from the Distances package
  • radius (keyword) The radius to use for the fast dtw method; argument unused when method=:dtw
  • center (keyword) The method used to select the initial center of the sequences in S
  • dtwradius (keyword) How far a time step can be mapped when comparing sequences; passed directly to DTW function from DynamicAxisWarping; if set to nothing, the length of the longest sequence will be used, effectively removing the radius restriction
  • progress Whether to show the progress coming from dba
source
Phonetics.avgseqFunction
avgseq(S::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, center=:medoid, dtwradius=nothing, progress=false])

Convert the Sound objects in S to a representation designated by rep, then find the average sequence of them. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source
Phonetics.distinctivenessMethod
distinctiveness(s, corpus; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])

Calculates the acoustic distinctiveness of s given the corpus corpus. The method, dist, and radius arguments are passed into acdist. The reduction argument can be any function that reduces an iterable to one number, such as mean, sum, or median.

For more information, see Kelley (2018, September, How acoustic distinctiveness affects spoken word recognition: A pilot study, DOI: 10.7939/R39G5GV9Q) and Kelley & Tucker (2018, Using acoustic distance to quantify lexical competition, DOI: 10.7939/r3-wbhs-kr84).

source
Phonetics.distinctivenessFunction
distinctiveness(s::Sound, corpus::Array{Sound}, rep=:mfcc; [method=:dtw, dist=SqEuclidean(), radius=10, reduction=mean])

Converts s and corpus to a representation specified by rep, then calculates the acoustic distinctiveness of s given corpus. Currently only :mfcc is supported for rep, using defaults from the MFCC package except that the first coefficient for each frame is removed and replaced with the sum of the log energy of the filterbank in that frame, as is standard in ASR.

source

References

Bartelds, M., Richter, C., Liberman, M., & Wieling, M. (2020). A new acoustic-based pronunciation distance measure. Frontiers in Artificial Intelligence, 3, 39.

Mielke, J. (2012). A phonetically based metric of sound similarity. Lingua, 122(2), 145-163.

Kelley, M. C. (2018). How acoustic distinctiveness affects spoken word recognition: A pilot study. Presented at the 11th International Conference on the Mental Lexicon (Edmonton, AB). https://doi.org/10.7939/R39G5GV9Q

Kelley, M. C., & Tucker, B. V. (2018). Using acoustic distance to quantify lexical competition. University of Alberta ERA (Education and Research Archive). https://doi.org/10.7939/r3-wbhs-kr84

Petitjean, F., Ketterlin, A., & Gançarski, P. (2011). A global averaging method for dynamic time warping, with applications to clustering. Pattern Recognition, 44(3), 678–693.

diff --git a/dev/alt_axes_vowel_plot.svg b/dev/alt_axes_vowel_plot.svg
index 7dfe2f4..7bb8932 100644
--- a/dev/alt_axes_vowel_plot.svg
+++ b/dev/alt_axes_vowel_plot.svg
@@ -1,141 +1,141 @@
[SVG markup changes omitted]
diff --git a/dev/ellipse_vowel_plot.svg b/dev/ellipse_vowel_plot.svg
index a25385e..4667a78 100644
--- a/dev/ellipse_vowel_plot.svg
+++ b/dev/ellipse_vowel_plot.svg
@@ -1,144 +1,144 @@
[SVG markup changes omitted]
diff --git a/dev/index.html b/dev/index.html
index 98df184..ecf17ae 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -1,2 +1,2 @@
-Home · Phonetics.jl

Home

This is a Julia package that provides a collection of functions to analyze phonetic data.

+Home · Phonetics.jl

Home

This is a Julia package that provides a collection of functions to analyze phonetic data.

diff --git a/dev/lc/index.html b/dev/lc/index.html
index 561d126..1f6d2e4 100644
--- a/dev/lc/index.html
+++ b/dev/lc/index.html
@@ -64,4 +64,4 @@
 ["K", "AE1", "B"], # cab
 ]
 upt(sample_corpus, [["T", "AE1", "T"]]; inCorpus=false)
1×2 DataFrame
 Row │ Query              UPT
     │ Array…             Any
─────┼────────────────────────
   1 │ ["T", "AE1", "T"]  4

Here, [T AE1 T] tat cannot be uniquely identified until after the sequence is complete, so its uniqueness point is one longer than its length.

Function documentation

Phonetics.pndMethod
pnd(corpus::Array, queries::Array; [progress=true])

Calculate the phonological neighborhood density (pnd) for each item in queries based on the items in corpus. This function uses a vantage point tree data structure to speed up the search for neighbors by pruning the search space. This function should work regardless of whether the items in queries are in corpus or not.

Parameters

  • corpus The corpus to be queried for phonological neighbors
  • queries The items to query phonological neighbors for in corpus
  • progress Whether to display a progress meter or not

Returns

  • A DataFrame with the queries in the first column and the phonological neighborhood density in the second
source
Phonetics.phnprbMethod
phnprb(corpus::Array, frequencies::Array, queries::Array; positional=false,
-    nchar=1, pad=true)

Calculates the phonotactic probability for each item in a list of queries based on a corpus

Arguments

  • corpus The corpus on which to base the probability calculations
  • frequencies The frequencies associated with each element in corpus
  • queries The items for which the probability should be calculated

Keyword arguments

  • positional Whether to consider where in the query a given phone appears

(e.g., should "K" as the first sound be considered a different category than "K" as the second sound?)

  • nchar The number of characters for each n-gram that will be examined (e.g., 2 for diphones)
  • pad Whether to add padding to each query or not

Returns

A DataFrame with the queries in the first column and the probability values in the second

source
Phonetics.uptMethod
upt(corpus, queries; [inCorpus=true])

Calculates the phonological uniqueness point (upt) for the items in queries based on the items in corpus. If the items are expected to be in the corpus, this function will calculate the uniqueness point to be when a branch can be considered to only represent 1 word. If the items are not expected to be in the corpus, the uniqueness point will be taken to be the depth at which the tree can no longer be traversed.

Parameters

  • corpus The items comprising the corpus to compare against when calculating the uniqueness point of each query
  • queries The items for which to calculate the uniqueness point
  • inCorpus Whether the query items are expected to be in the corpus or not

Returns

  • A DataFrame with the queries in the first column and the uniqueness points in the second
source

References

Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and hearing, 19(1), 1.

Vitevitch, M. S., & Luce, P. A. (2016). Phonological neighborhood effects in spoken word perception and production. Annual Review of Linguistics, 2, 75-94.

+ nchar=1, pad=true)

Calculates the phonotactic probability for each item in a list of queries based on a corpus

Arguments

  • corpus The corpus on which to base the probability calculations
  • frequencies The frequencies associated with each element in corpus
  • queries The items for which the probability should be calculated

Keyword arguments

  • positional Whether to consider where in the query a given phone appears

(e.g., should "K" as the first sound be considered a different category than "K" as the second sound?)

  • nchar The number of characters for each n-gram that will be examined (e.g., 2 for diphones)
  • pad Whether to add padding to each query or not

Returns

A DataFrame with the queries in the first column and the probability values in the second

source
Phonetics.uptMethod
upt(corpus, queries; [inCorpus=true])

Calculates the phonological uniqueness point (upt) for the items in queries based on the items in corpus. If the items are expected to be in the corpus, this function will calculate the uniqueness point to be when a branch can be considered to only represent 1 word. If the items are not expected to be in the corpus, the uniqueness point will be taken to be the depth at which the tree can no longer be traversed.

Parameters

  • corpus The items comprising the corpus to compare against when calculating the uniqueness point of each query
  • queries The items for which to calculate the uniqueness point
  • inCorpus Whether the query items are expected to be in the corpus or not

Returns

  • A DataFrame with the queries in the first column and the uniqueness points in the second
source
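
The two cases described for upt can be sketched without a tree by counting how many corpus items share each prefix of the query. This is only an illustration under assumptions: uniqueness_point, shares_prefix, and toy_corpus are hypothetical names, the corpus is invented for the example, and the package's own tree-based implementation may differ in its details.

```julia
# Hypothetical helper: does `item` begin with the first k elements of `query`?
shares_prefix(item, query, k) = length(item) >= k && item[1:k] == query[1:k]

# Brute-force uniqueness point (an illustrative sketch, not the Phonetics.jl algorithm).
# in_corpus=true:  first depth at which exactly one corpus item shares the prefix.
# in_corpus=false: first depth at which no corpus item shares the prefix.
# If no such depth exists, the result is one longer than the query.
function uniqueness_point(corpus, query; in_corpus=true)
    for k in 1:length(query)
        matches = count(item -> shares_prefix(item, query, k), corpus)
        if (in_corpus ? matches == 1 : matches == 0)
            return k
        end
    end
    return length(query) + 1
end

# Invented toy corpus for illustration only
toy_corpus = [
    ["K", "AE1", "T"],          # cat
    ["K", "AE1", "B"],          # cab
    ["T", "AE1", "T", "ER0"],   # tatter
]

println(uniqueness_point(toy_corpus, ["T", "AE1", "T"]; in_corpus=false))
println(uniqueness_point(toy_corpus, ["K", "AE1", "T"]; in_corpus=true))
```

With this toy corpus, every prefix of [T AE1 T] is still shared by a longer item, so its uniqueness point falls one past its length, while [K AE1 T] only separates from [K AE1 B] at its third phone.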

References

Luce, P. A., & Pisoni, D. B. (1998). Recognizing spoken words: The neighborhood activation model. Ear and hearing, 19(1), 1.

Vitevitch, M. S., & Luce, P. A. (2016). Phonological neighborhood effects in spoken word perception and production. Annual Review of Linguistics, 2, 75-94.

diff --git a/dev/means_only_ellipse_vowel_plot.svg b/dev/means_only_ellipse_vowel_plot.svg
index c179eeb..95cd0a1 100644
--- a/dev/means_only_ellipse_vowel_plot.svg
+++ b/dev/means_only_ellipse_vowel_plot.svg
@@ -1,58 +1,58 @@
[SVG markup changes omitted]
diff --git a/dev/norm/index.html b/dev/norm/index.html
index 783616d..a88affc 100644
--- a/dev/norm/index.html
+++ b/dev/norm/index.html
@@ -1,2 +1,2 @@
-Vowel normalization · Phonetics.jl

Vowel normalization

Vowel normalization routines come from a variety of sources. In the case of the Nearey normalizations, they are intended to show how a listener takes formant information (which varies not only based on vowel category, but also on factors like gender and age) and transforms them to a space where formant information is more related to vowel category. In that sense, it is a perceptually motivated routine. Other routines, like the Lobanov normalization routine (Lobanov, 1971), are more purpose driven in that their goal is just to allow vowel comparisons between speakers, regardless of whether the technique is perceptually motivated or plausible.

Nearey normalization routines

Barreda and Nearey normalization routine

Lobanov normalization routine

References

Barreda, S., & Nearey, T. M. (2018). A regression approach to vowel normalization for missing and unbalanced data. The Journal of the Acoustical Society of America, 144(1), 500–520. https://doi.org/10.1121/1.5047742

Lobanov, B. M. (1971). Classification of Russian vowels spoken by different speakers. The Journal of the Acoustical Society of America, 49(2B), 606–608. https://doi.org/10.1121/1.1912396

Nearey, T. M. (1978). Phonetic feature system for vowels. Indiana University Linguistics Club.

+Vowel normalization · Phonetics.jl

Vowel normalization

Vowel normalization routines come from a variety of sources. In the case of the Nearey normalizations, they are intended to show how a listener takes formant information (which varies not only based on vowel category, but also on factors like gender and age) and transforms them to a space where formant information is more related to vowel category. In that sense, it is a perceptually motivated routine. Other routines, like the Lobanov normalization routine (Lobanov, 1971), are more purpose driven in that their goal is just to allow vowel comparisons between speakers, regardless of whether the technique is perceptually motivated or plausible.

Nearey normalization routines

Barreda and Nearey normalization routine

Lobanov normalization routine

References

Barreda, S., & Nearey, T. M. (2018). A regression approach to vowel normalization for missing and unbalanced data. The Journal of the Acoustical Society of America, 144(1), 500–520. https://doi.org/10.1121/1.5047742

Lobanov, B. M. (1971). Classification of Russian vowels spoken by different speakers. The Journal of the Acoustical Society of America, 49(2B), 606–608. https://doi.org/10.1121/1.1912396

Nearey, T. M. (1978). Phonetic feature system for vowels. Indiana University Linguistics Club.

diff --git a/dev/phon_spectrogram/29bbe87d.svg b/dev/phon_spectrogram/1f2bddcf.svg
similarity index 98%
rename from dev/phon_spectrogram/29bbe87d.svg
rename to dev/phon_spectrogram/1f2bddcf.svg
index fbedfe7..d431921 100644
--- a/dev/phon_spectrogram/29bbe87d.svg
+++ b/dev/phon_spectrogram/1f2bddcf.svg
@@ -1,45 +1,45 @@
[SVG markup changes omitted]
diff --git a/dev/phon_spectrogram/abadf348.svg b/dev/phon_spectrogram/b1569262.svg
similarity index 99%
rename from dev/phon_spectrogram/abadf348.svg
rename to dev/phon_spectrogram/b1569262.svg
index f64fa4d..3a47b95 100644
--- a/dev/phon_spectrogram/abadf348.svg
+++ b/dev/phon_spectrogram/b1569262.svg
@@ -1,45 +1,45 @@
[SVG markup changes omitted]
diff --git a/dev/phon_spectrogram/0cc3b8eb.svg b/dev/phon_spectrogram/c331bf3d.svg
similarity index 99%
rename from dev/phon_spectrogram/0cc3b8eb.svg
rename to dev/phon_spectrogram/c331bf3d.svg
index c8ee9ed..e258d83 100644
--- a/dev/phon_spectrogram/0cc3b8eb.svg
+++ b/dev/phon_spectrogram/c331bf3d.svg
@@ -1,45 +1,45 @@
[SVG markup changes omitted]
diff --git a/dev/phon_spectrogram/e3672ad4.svg b/dev/phon_spectrogram/d3e80988.svg
similarity index 99%
rename from dev/phon_spectrogram/e3672ad4.svg
rename to dev/phon_spectrogram/d3e80988.svg
index 6a1177b..dca004e 100644
--- a/dev/phon_spectrogram/e3672ad4.svg
+++ b/dev/phon_spectrogram/d3e80988.svg
@@ -1,45 +1,45 @@
[SVG markup changes omitted]
diff --git a/dev/phon_spectrogram/index.html b/dev/phon_spectrogram/index.html
index 621cd7a..73a633a 100644
--- a/dev/phon_spectrogram/index.html
+++ b/dev/phon_spectrogram/index.html
@@ -3,4 +3,4 @@
 using Plots
 s, fs = wavread("assets/iwantaspectrogram.wav")
 s = vec(s)
-phonspec(s, fs)
Example block output

A color scheme more similar to the Praat grayscale can be achieved using the col argument and the :gist_yarg color scheme. These spectrograms are created using the heatmap function from Plots.jl, so any color scheme available in the Plots package can be used, though not all of them produce legible spectrograms.

phonspec(s, fs, col=:binary)
Example block output

A narrowband style spectrogram can be plotted using the style argument:

phonspec(s, fs, style=:narrowband)
Example block output

The pre-emphasis can be disabled by passing in a value of 0 for the pre_emph argument. Pre-emphasis boosts the higher frequencies relative to the lower frequencies.

phonspec(s, fs, pre_emph=0)
Example block output

Function documentation

Phonetics.phonspecFunction
phonspec(s, fs; pre_emph=0.97, style=:broadband, dbr=55, kw...)

Rudimentary functionality to plot a spectrogram, with parameters familiar to phoneticians. Includes a pre-emphasis routine which helps increase the intensity of the higher frequencies in the display. Uses a Kaiser window with a parameter value of 2.

Argument structure inferred from using plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot.

Args

  • s A vector containing the samples of a sound
  • fs Sampling frequency of s in Hz
  • pre_emph The α coefficient for pre-emphasis; default value of 0.97 corresponds to a cutoff frequency of approximately 213 Hz before the 6 dB / octave increase begins
  • style Either :broadband or :narrowband; will affect the window length and window stride
  • dbr The dynamic range; all frequencies that are dbr decibels quieter than the loudest frequency will not be displayed; will specify the clim argument
  • kw... extra named parameters to pass to heatmap
source
+phonspec(s, fs)
Example block output

A color scheme more similar to the Praat grayscale can be achieved using the col argument and the :gist_yarg color scheme. These spectrograms are created using the heatmap function from Plots.jl, so any color scheme available in the Plots package can be used, though not all of them produce legible spectrograms.

phonspec(s, fs, col=:binary)
Example block output

A narrowband style spectrogram can be plotted using the style argument:

phonspec(s, fs, style=:narrowband)
Example block output

The pre-emphasis can be disabled by passing in a value of 0 for the pre_emph argument. Pre-emphasis boosts the higher frequencies relative to the lower frequencies.

phonspec(s, fs, pre_emph=0)
Example block output

Function documentation

Phonetics.phonspecFunction
phonspec(s, fs; pre_emph=0.97, style=:broadband, dbr=55, kw...)

Rudimentary functionality to plot a spectrogram, with parameters familiar to phoneticians. Includes a pre-emphasis routine which helps increase the intensity of the higher frequencies in the display. Uses a Kaiser window with a parameter value of 2.

Argument structure inferred from using plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot.

Args

  • s A vector containing the samples of a sound
  • fs Sampling frequency of s in Hz
  • pre_emph The α coefficient for pre-emphasis; default value of 0.97 corresponds to a cutoff frequency of approximately 213 Hz before the 6 dB / octave increase begins
  • style Either :broadband or :narrowband; will affect the window length and window stride
  • dbr The dynamic range; all frequencies that are dbr decibels quieter than the loudest frequency will not be displayed; will specify the clim argument
  • kw... extra named parameters to pass to heatmap
source
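
The pre_emph argument described above can be illustrated with a first-order filter. This is a minimal sketch and not necessarily how phonspec implements it: preemphasize and cutoff_hz are hypothetical names, and the relation α = exp(-2πF/fs) between the coefficient and the cutoff frequency F is an assumption, following the convention documented for Praat. Under that assumption, α = 0.97 gives a cutoff near 213 Hz only for a sampling rate around 44.1 kHz.

```julia
# First-order pre-emphasis filter: y[n] = x[n] - α * x[n-1].
# `preemphasize` is a hypothetical helper, not part of Phonetics.jl.
function preemphasize(x::AbstractVector{<:Real}, α::Real=0.97)
    y = float.(x)   # floating-point copy; the first sample is left unchanged
    for n in 2:length(x)
        y[n] = x[n] - α * x[n - 1]
    end
    return y
end

# Cutoff frequency implied by α under the assumed relation α = exp(-2π * F / fs)
cutoff_hz(α, fs) = -log(α) * fs / (2π)

println(preemphasize([1, 1, 1], 0.5))
println(cutoff_hz(0.97, 44_100))
```

Passing α = 0 leaves the signal unchanged, which matches the description of disabling pre-emphasis with pre_emph=0.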
diff --git a/dev/textvptree/index.html b/dev/textvptree/index.html
index 148fff8..e038ced 100644
--- a/dev/textvptree/index.html
+++ b/dev/textvptree/index.html
@@ -1,2 +1,2 @@
-Text VP Tree · Phonetics.jl

Text VP Tree

A vantage-point tree is a data structure that takes advantage of the spatial distribution of data and allows for faster searching through the data by lowering the number of comparisons that need to be made. Consider the traditional example of phonological neighborhood density calculation. The code would be written to compare each item to all the other items. For $n$ items, each item would require $n-1$ comparisons. So, to calculate the phonological neighborhood density for each item in a given corpus, there would need to be $n \times (n-1)\, = \, n^2-n$ comparisons. This is a lot of comparisons, especially when you're working with tens or hundreds of thousands of words!

With a vantage-point tree, however, we might need an average of only $\log_2(n)$ comparisons per query because of the way the data are organized. This means we would only need $n \times \log_2(n)$ comparisons in total, which can be substantially lower than $n^2-n$ for larger corpora. Analyzing the runtime of a VP tree is difficult, though, so the actual speedup may not be as drastic, but it should still be faster than the naive phonological neighborhood density calculation.

This implementation is based on the description by Samet (2006).

Function documentation

Phonetics.TextVPTreeMethod
TextVPTree(items::Array, d)

Outer constructor for a TextVPTree. Takes in an array of items items and a distance function d and proceeds to build a vantage-point tree from them.

source
Phonetics.radiusSearchMethod
radiusSearch(tree::TextVPTree, query, epsilon)

Performs a search for all items in a VP tree tree that are within a radius epsilon from a query query.

Returns

A Vector of items that are within the given radius epsilon

source
Phonetics.nneighborsMethod
nneighbors(tree::TextVPTree, query, n)

Find the n nearest neighbors in a VP tree tree to a given query query.

Returns

  • A PriorityQueue of items where the keys are the items themselves and the values are the distances from the items to query; the PriorityQueue is defined such that small values have higher priorities than large ones
source

References

Samet, H. (2006). Foundations of multidimensional and metric data structures. San Francisco, California: Morgan Kaufmann.

+Text VP Tree · Phonetics.jl

Text VP Tree

A vantage-point tree is a data structure that takes advantage of the spatial distribution of data and allows for faster searching through the data by lowering the number of comparisons that need to be made. Consider the traditional example of phonological neighborhood density calculation. The code would be written to compare each item to all the other items. For $n$ items, each item would require $n-1$ comparisons. So, to calculate the phonological neighborhood density for each item in a given corpus, there would need to be $n \times (n-1)\, = \, n^2-n$ comparisons. This is a lot of comparisons, especially when you're working with tens or hundreds of thousands of words!

With a vantage-point tree, however, we might need an average of only $\log_2(n)$ comparisons per query because of the way the data are organized. This means we would only need $n \times \log_2(n)$ comparisons in total, which can be substantially lower than $n^2-n$ for larger corpora. Analyzing the runtime of a VP tree is difficult, though, so the actual speedup may not be as drastic, but it should still be faster than the naive phonological neighborhood density calculation.
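
To get a feel for the two comparison counts discussed above, the short sketch below evaluates both formulas for a few corpus sizes. The function names are hypothetical, and the $n \times \log_2(n)$ figure is the idealized average from the text, not a measured property of TextVPTree.

```julia
# Number of comparisons for the naive all-pairs approach: n^2 - n
naive_comparisons(n) = n^2 - n

# Idealized total for a VP tree, assuming log2(n) comparisons per query
vptree_comparisons(n) = n * log2(n)

for n in (1_000, 10_000, 100_000)
    naive = naive_comparisons(n)
    ideal = vptree_comparisons(n)
    println("n = $n: naive = $naive, idealized VP tree ≈ $(round(Int, ideal)), ratio ≈ $(round(naive / ideal, digits=1))")
end
```

The ratio between the two counts grows with the corpus size, which is why the benefit is expected to be most noticeable for corpora with tens or hundreds of thousands of words.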

This implementation is based on the description by Samet (2006).

Function documentation

Phonetics.TextVPTreeMethod
TextVPTree(items::Array, d)

Outer constructor for a TextVPTree. Takes in an array of items items and a distance function d and proceeds to build a vantage-point tree from them.

source
Phonetics.radiusSearchMethod
radiusSearch(tree::TextVPTree, query, epsilon)

Performs a search for all items in a VP tree tree that are within a radius epsilon from a query query.

Returns

A Vector of items that are within the given radius epsilon

source
Phonetics.nneighborsMethod
nneighbors(tree::TextVPTree, query, n)

Find the n nearest neighbors in a VP tree tree to a given query query.

Returns

  • A PriorityQueue of items where the keys are the items themselves and the values are the distances from the items to query; the PriorityQueue is defined such that small values have higher priorities than large ones
source

References

Samet, H. (2006). Foundations of multidimensional and metric data structures. San Francisco, California: Morgan Kaufmann.

diff --git a/dev/vanilla_vowel_plot.svg b/dev/vanilla_vowel_plot.svg index e6d935a..7c2fb54 100644 --- a/dev/vanilla_vowel_plot.svg +++ b/dev/vanilla_vowel_plot.svg @@ -1,141 +1,141 @@ [SVG markup changes omitted]
diff --git a/dev/vowelplot/index.html b/dev/vowelplot/index.html index 348710d..30a3fff 100644 --- a/dev/vowelplot/index.html +++ b/dev/vowelplot/index.html @@ -3,4 +3,4 @@ vowelplot(data.f1, data.f2, data.vowel, xlab="F1 (Hz)", ylab="F2 (Hz)")

Vanilla vowel plot

This is a traditional vowel plot, with F1 on the x-axis in increasing order and F2 on the y-axis in increasing order. Note that simulated data were generated using the generateFormants function. Specifying a seed value makes the results reproducible. (Keep in mind that if you are generating values for different experiments, reports, studies, etc., the seed value needs to be changed (or left unspecified) so that the same data are not generated every time when they shouldn't be reproducible.)

For those inclined to use the alternate axes configuration with F2 decreasing on the x-axis and F1 decreasing on the y-axis, pass the F2 values into the first argument slot and the F1 values into the second argument slot, and use the xflip and yflip arguments from the Plots.jl package to make both axes decreasing.

vowelplot(data.f2, data.f1, data.vowel,
   xflip=true, yflip=true, xlab="F2 (Hz)", ylab="F1 (Hz)")

Vowel plot with alternate axes

I don't personally prefer to look at vowel plots in this manner because I think it unfairly privileges articulatory characteristics of vowel production when examining acoustic characteristics, so subsequent examples will not be presented using this axis configuration. However, the same approach of swapping the F1 and F2 arguments and flipping the axes works for any of the later examples.

The vowelplot function also allows for ellipses to be plotted around the values with the ell and ellPercent arguments. The ell argument takes a true or false value. The ellPercent argument should be a value greater than 0 and less than 1, and it represents the approximate proportion of the data that should be contained within the ellipse. This is in contrast to some packages available in R that allow you to specify the number of standard deviations that the ellipse should be stretched to. The reason is that the traditional cutoff values for univariate Gaussian distributions (roughly 68% of the data within 1 standard deviation, 95% within 2 standard deviations, etc.) do not carry over to multiple dimensions. Instead, the appropriate amount of stretching of the ellipse can be determined from the percentage of data to contain (Wang et al., 2015).
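A short calculation shows why the univariate cutoffs fail in two dimensions. For a bivariate Gaussian, the squared Mahalanobis distance from the mean follows a chi-square distribution with 2 degrees of freedom, whose cumulative distribution function is $1 - e^{-x/2}$. The sketch below uses that fact; the function names are hypothetical, and whether Phonetics.jl computes its scaling in exactly this way is an assumption.

```julia
# Proportion of a bivariate Gaussian inside the ellipse at k standard deviations.
coverage(k) = 1 - exp(-k^2 / 2)

# Scaling factor needed for the ellipse to contain a proportion p of the data.
scale_for(p) = sqrt(-2 * log(1 - p))

println(round(coverage(1); digits=3))       # ≈ 0.393, not 0.68
println(round(coverage(2); digits=3))       # ≈ 0.865, not 0.95
println(round(scale_for(0.67); digits=3))   # ≈ 1.489
println(round(scale_for(0.95); digits=3))   # ≈ 2.448
```

So an ellipse drawn at 1 standard deviation holds only about 39% of bivariate Gaussian data, and containing 67% requires stretching it to roughly 1.49 standard deviations.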

vowelplot(data.f1, data.f2, data.vowel, ell=true, ellPercent=0.67,
   xlab="F1 (Hz)", ylab="F2 (Hz)")

Vowel plot with ellipses

Each of the data clouds in the scatter has an ellipse overlaid on it so as to contain approximately 67% of the data. The ellipse calculation process is given in Friendly et al. (2013).
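For readers curious about what such a calculation involves, here is a minimal sketch of a data ellipse in the spirit of Friendly et al. (2013). It is not the Phonetics.jl source: `data_ellipse` and `fraction_inside` are hypothetical helpers, the data are simulated, and the Gaussian-based scaling factor is an assumption.

```julia
using LinearAlgebra, Statistics, Random

# Hypothetical helper: points on the perimeter of a data ellipse for x and y.
function data_ellipse(x, y; percent=0.95, npoints=500)
    center = [mean(x), mean(y)]
    S = cov(hcat(x, y))                  # 2×2 sample covariance matrix
    A = cholesky(Symmetric(S)).L         # lower-triangular factor, S = A * A'
    r = sqrt(-2 * log(1 - percent))      # scaling under a Gaussian assumption
    theta = range(0, 2π; length=npoints)
    # Map the unit circle through A, scale it by r, and shift it to the center.
    pts = [center + r * (A * [cos(t), sin(t)]) for t in theta]
    return first.(pts), last.(pts)
end

# Hypothetical helper: proportion of observations that fall inside the ellipse.
function fraction_inside(x, y; percent=0.95)
    center = [mean(x), mean(y)]
    S = cov(hcat(x, y))
    cutoff = -2 * log(1 - percent)       # squared Mahalanobis radius
    inside = count(eachindex(x)) do i
        v = [x[i], y[i]] - center
        dot(v, S \ v) <= cutoff
    end
    return inside / length(x)
end

# Simulated, correlated "formant" values for one vowel category.
rng = MersenneTwister(1)
f1 = 500 .+ 60 .* randn(rng, 2000)
f2 = 1500 .+ 2 .* (f1 .- 500) .+ 120 .* randn(rng, 2000)

ex, ey = data_ellipse(f1, f2; percent=0.67)
println("proportion inside: ", fraction_inside(f1, f2; percent=0.67))
```

The printed proportion should land near 0.67, which is the sense in which the ellipse "approximately" contains the requested share of the data.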

One final feature to point out is that the vowelplot function can also plot just the mean value of each vowel category with the meansOnly argument. Additionally, a label can be added to each category with the addLabels argument, which bases the labels on the category given in the cats argument.

vowelplot(data.f1, data.f2, data.vowel, ell=true,
-  meansOnly=true, addLabels=true, xlab="F1 (Hz)", ylab="F2 (Hz)")

Vowel plot with ellipses and markers only for mean values

The labels are offset from the mean value a bit so as to not cover up the marker showing where the mean value is.

Function documentation

Phonetics.vowelplotFunction
vowelplot(f1, f2, cats; meansOnly=false, addLabels=true, ell=false, ellPercent=0.67, nEllPts=500, kw...)

Create an F1-by-F2 vowel plot. The f1 values are displayed along the x-axis, and the f2 values are displayed along the y-axis, with each unique vowel class in cats being represented with a new color. The series labels in the legend will take on the unique values contained in cats. The alternate display whereby reversed F2 is on the x-axis and reversed F1 is on the y-axis can be created by passing the F2 values in for the f1 argument and F1 values in for the f2 argument, and then using the :flip magic argument provided by the Plots package.

If meansOnly is set to true, only the mean values for each vowel category are plotted. Using ell=true will plot a data ellipse that approximately encompasses the percentage of data specified by ellPercent. The ellipse is represented by a number of points specified with nEllPts. Other arguments to plot are passed in through the splatted kw argument. Setting the addLabels argument to true will add the text label of the vowel category above and to the right of the mean.

The argument structure is inferred from the plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot. The plot parameters markersize and linewidth both default to 3.

Args

  • f1 The F1 values, or otherwise the values to plot on the x-axis
  • f2 The F2 values, or otherwise the values to plot on the y-axis
  • cats The vowel categories associated with each F1, F2 pair
  • meansOnly Plot only mean value for each category
  • addLabels Add labels for each category to the plot near the mean
  • ell Whether to add data ellipses to the plot
  • ellPercent Percentage of the data distribution the ellipse should cover (approximately)
  • nEllPts How many points should be used when plotting the ellipse
source
Phonetics.ellipsePtsFunction
ellipsePts(f1, f2; percent=0.95, nPoints=500)

Calculates nPoints points on the perimeter of a data ellipse for f1 and f2 such that approximately the proportion of the data specified by percent is contained within the ellipse. Points are returned in counter-clockwise order as the polar angle of rotation moves from 0 to 2π.

See Friendly, Monette, and Fox (2013, Elliptical insights: Understanding statistical methods through elliptical geometry, Statistical science 28(1), 1-39) for more information on the calculation process.

Args

  • f1 The F1 values or otherwise x-axis values
  • f2 The F2 values or otherwise y-axis values
  • percent (keyword) Percent of the data distribution the ellipse should approximately cover
  • nPoints (keyword) How many points to use when drawing the ellipse
source

References

Friendly, M., Monette, G., & Fox, J. (2013). Elliptical insights: understanding statistical methods through elliptical geometry. Statistical Science, 28(1), 1-39.

Wang, B., Shi, W., & Miao, Z. (2015). Confidence analysis of standard deviational ellipse and its extension into higher dimensional Euclidean space. PLOS ONE, 10(3), e0118537. https://doi.org/10.1371/journal.pone.0118537

+ meansOnly=true, addLabels=true, xlab="F1 (Hz)", ylab="F2 (Hz)")

Vowel plot with ellipses and markers only for mean values

The labels are offset from the mean value a bit so as to not cover up the marker showing where the mean value is.

Function documentation

Phonetics.vowelplotFunction
vowelplot(f1, f2, cats; meansOnly=false, addLabels=true, ell=false, ellPercent=0.67, nEllPts=500, kw...)

Create an F1-by-F2 vowel plot. The f1 values are displayed along the x-axis, and the f2 values are displayed along the y-axis, with each unique vowel class in cats being represented with a new color. The series labels in the legend will take on the unique values contained in cats. The alternate display whereby reversed F2 is on the x-axis and reversed F1 is on the y-axis can be created by passing the F2 values in for the f1 argument and F1 values in for the f2 argument, and then using the :flip magic argument provided by the Plots package.

If meansOnly is set to true, only the mean values for each vowel category are plotted. Using ell=true will plot a data ellipse that approximately encompasses the percentage of data specified by ellPercent. The ellipse is represented by a number of points specified with nEllPts. Other arguments to plot are passed in through the splatted kw argument. Setting the addLabels argument to true will add the text label of the vowel category above and to the right of the mean.

The argument structure is inferred from the plot recipe. Parameters such as xlim, ylim, color, and size should be passed as keyword arguments, as with standard calls to plot. The plot parameters markersize and linewidth both default to 3.

Args

  • f1 The F1 values, or otherwise the values to plot on the x-axis
  • f2 The F2 values, or otherwise the values to plot on the y-axis
  • cats The vowel categories associated with each F1, F2 pair
  • meansOnly Plot only mean value for each category
  • addLabels Add labels for each category to the plot near the mean
  • ell Whether to add data ellipses to the plot
  • ellPercent Percentage of the data distribution the ellipse should cover (approximately)
  • nEllPts How many points should be used when plotting the ellipse
source
Phonetics.ellipsePtsFunction
ellipsePts(f1, f2; percent=0.95, nPoints=500)

Calculates nPoints points on the perimeter of a data ellipse for f1 and f2 such that approximately the proportion of the data specified by percent is contained within the ellipse. Points are returned in counter-clockwise order as the polar angle of rotation moves from 0 to 2π.

See Friendly, Monette, and Fox (2013, Elliptical insights: Understanding statistical methods through elliptical geometry, Statistical science 28(1), 1-39) for more information on the calculation process.

Args

  • f1 The F1 values or otherwise x-axis values
  • f2 The F2 values or otherwise y-axis values
  • percent (keyword) Percent of the data distribution the ellipse should approximately cover
  • nPoints (keyword) How many points to use when drawing the ellipse
source

References

Friendly, M., Monette, G., & Fox, J. (2013). Elliptical insights: understanding statistical methods through elliptical geometry. Statistical Science, 28(1), 1-39.

Wang, B., Shi, W., & Miao, Z. (2015). Confidence analysis of standard deviational ellipse and its extension into higher dimensional Euclidean space. PLOS ONE, 10(3), e0118537. https://doi.org/10.1371/journal.pone.0118537