barktools: Functions to help when working with Barks
I’m happy to announce that I’ve just released another small R package called
barktools. Now that I’ve got one R package out there already, I’ve sort of caught the bug and realized it’s kinda fun to put these small packages out there. This one is just a lightweight little guy that I thought up a few days ago while falling asleep that’ll help me when working with Barks. You can download the package from my GitHub.
Functions to help when working with Barks.
This package makes it easier to work with the Bark scale when analyzing and plotting acoustic data. It contains two pairs of functions. The first (
hz) convert between Hz and Barks. The other (
scale_y_bark) make it easy to plot data in ggplot2 using the Bark scale.
To install the package, you can do so by downloading it from github.
# install.packages("remotes") # <- if not already installed remotes::install_github("joeystanley/barktools")
You can then load it like any other library.
Load the data
For this little vignette, I’ll load some sample vowel data from my own speech. It’s got some outliers, but that’s the nature of automatically-processed data.
library(tidyverse) vowels <- read_csv("http://joeystanley.com/data/joey.csv")
Here’s a simple plot that shows my vowel space. For now, I’ll just keep it simple.
ggplot(vowels, aes(F2, F1)) + geom_point() + scale_x_reverse() + scale_y_reverse() + theme_minimal() + labs(title = "Joey's vowels", subtitle = "Data and plot are in Hz")
Now, let’s say I want to plot using Barks. You can use the
bark() function to convert the formant frequencies into Barks.
vowels_with_barks <- vowels %>% mutate(F1_bark = bark(F1), F2_bark = bark(F2)) ggplot(vowels_with_barks, aes(F2_bark, F1_bark)) + geom_point() + scale_x_reverse() + scale_y_reverse() + theme_minimal() + labs(title = "Joey's vowels", subtitle = "Data and plot are in Barks")
The Bark scale turns the nonlinear Hz data into something a little more linear, so the shape of the vowel space should change somewhat.
The problem is most people can’t readily interpret the Barks unit. What is the Hz equivalent of 6 Barks? We can look this up using the
hz(6) ##  631.1045
But it would be better if we could incorporate more interpretable values into the plot itself. I think the first time I saw this was in Harrington et al’s (2000) paper on how the Queen of England’s speech changes over time:
Notice how the axes are in Barks, but the data is still plotted in Hz. This is a perfect case for using the
scale_y_bark() functions. Like the other
scale_* functions in ggplot2, this will transform the axes of your plot. In this case, it’ll convert the plotting area to the Bark scale, but the values will be in Hz still.
ggplot(vowels, aes(F2, F1)) + geom_point() + scale_x_bark() + scale_y_bark() + theme_minimal() + labs(title = "Joey's vowels", subtitle = "Data is in Hz; plot is in Barks")
Now, you can see that the shape of the vowel space is identical to the plot above, except the axis labels are more useful: I have a better idea of what 500Hz means. Note that the axes are reversed as well, just like
At this point, it might be useful to modify the axes with some additional labels. Since
scale_*_bark is just a wrapper around
scale_*_continuous, any argument that you would normally include in the latter function will work just fine in the bark function. Specifically, I’ll modify which values get labels with
breaks and the gridlines with
ggplot(vowels, aes(F2, F1)) + geom_point() + scale_x_bark(breaks = c(c(500, 1000, 1500, 2000, 3000)), minor_breaks = seq(0, 4000, 100)) + scale_y_bark(breaks = c(c(200, 400, 600, 800, 1000, 1500)), minor_breaks = seq(0, 3000, 100)) + theme_minimal() + labs(title = "Joey's vowels", subtitle = "Data is in Hz; plot is in Barks")
Exactly which values you want to put is up to you, obviously, so play around with it until it looks good.
The other type of plot you might want to use
scale_y_bark for is something that looks like a spectrogram, that is a time-by-hz plot. You’ll have to transform the data a little bit. You can use the code that I provided in my tutorial with the new
pivot_longer function in
dplyr. I’ll just pull out my /aʊ/ vowel for this plot:
vowels_long <- vowels %>% filter(vowel == "AW") %>% select(contains("@")) %>% rowid_to_column("phoneme_id") %>% pivot_longer(cols = contains("@"), names_to = c("formant", "percent"), names_pattern = "(F\\d)@(\\d\\d)%", names_ptypes = list(formant = factor(levels = c("F1", "F2")), percent = integer()), values_to = "hz") %>% unite(traj_id, phoneme_id, formant, remove = FALSE)
Here’s what a spectrogram-like plot might look like
ggplot(vowels_long, aes(percent, hz, color = formant, group = traj_id)) + geom_path(alpha = 0.5) + theme_bw()
Now, a lot of the change in /ai/ happens along the F1 dimension, but because of the logarithmic nature of sound, F2 visually takes up most of the vertical space and F1 is sort of squished down at the bottom. We can emphasize F1 by transforming the y-axis into the Bark scale.
ggplot(vowels_long, aes(percent, hz, color = formant, group = traj_id)) + geom_path(alpha = 0.5) + scale_y_bark(rev = FALSE) + theme_bw()
Note that this time, I added the argument
rev = FALSE to
scale_y_bark. By default, the function will flip the axis (like
scale_y_reverse), but in this case that behavior is not desired. So, you can suppress that flip by specifying
rev = FALSE.
And that’s it! That’s the whole package. I thought it would be a useful thing for me. Perhaps you’ll find some use for it too.