Resources

On this page you’ll find links to all sorts of things that I have found useful, including software, tutorials, books, and general reading on R, statistics, Praat, and other topics.


My handouts, tutorials, and workshops

R Workshops

I’m currently giving a series of workshops on how to use R which will include a variety of topics. I have included PDFs and additional information on each installment of this series.

Formant extraction tutorial

This tutorial walks you through writing a Praat script that extracts formant measurements from vowels. If you’ve never worked with Praat scripting but want to work with vowels, this might be a good starting point.

Vowel plots in R tutorials

This is a multi-part tutorial on how to make the typical kinds of vowel plots in R. Part 1 shows plotting single-point measurements as scatter plots and serves as a gentle introduction to ggplot2. Part 2 shows how to plot trajectories, both in the F1-F2 space and in a Praat-like time-Hz space, and is a bit of an introduction to the tidyverse as well.

Make yourself googleable

I’m no expert, but I have given a workshop on how grad students can increase their online presence and make themselves more googleable, based in large part on ImpactStory’s fantastic 30-day challenge, which you can read here.

Excel Workshop

Last year I gave a workshop on Excel and ended up producing a long handout that goes from the very basics to relatively tricky techniques. The link above will take you to a blog post that summarizes the workshop, and there you can also find the handout itself.


R Resources

Here is a list of resources I’ve found for R. I’ve gone through some of them and others are on my to-do list. These are in no particular order.

General R Coding

  • The website for Tidyverse is a great go-to place for learning how to use dplyr, tidyr, and many other packages.

  • R for Data Science by Garrett Grolemund & Hadley Wickham is a fantastic overview of tidyverse functions.

  • Intro to Tidyverse by David Robinson.

  • Advanced R by Hadley Wickham with the solutions by Malte Grosser, Henning Bumann, Peter Hurford & Robert Krzyzanowski.

  • R Packages by Hadley Wickham.

  • Hands-On Programming with R by Garrett Grolemund & Hadley Wickham for writing functions and simulations. Haven’t read it, but it looks good.

  • r-statistics.co by Selva Prabhakaran which has great tutorials on R itself, ggplot2, and advanced statistical modeling.

  • Tidymodels is like the Tidyverse suite of packages, but it’s meant for better handling of many statistical models. Also see its GitHub page.

Data Visualization

  • ggplot2 by Hadley Wickham is a comprehensive resource for learning all the ins and outs of ggplot2.

  • The scico package has a bunch of colorblind-safe, perceptually uniform, ggplot2-friendly color palettes for use in visuals. Very cool.

  • The ColorBrewer website, while best for maps, offers great color palettes that are colorblind-safe and sometimes also printer-safe. They have native integration with ggplot2 through the scale_[color|fill]_[brewer|distiller] functions.

  • This blog post by Jesse Sadler is a great tutorial on how to use R to visualize network data.

  • Data Visualization: A Practical Introduction by Kieran Healy. I haven’t had the time to look through it, and as I write this it’s an incomplete draft of the forthcoming book, but it looks quite good. It covers data prep, basic plots, visualizing statistical models, maps, and a whole bunch of other stuff.

  • Edward Tufte is a statistician known for his series of four books that focus on best practices in the presentation of data: The Visual Display of Quantitative Information, Envisioning Information, Visual Explanations, and Beautiful Evidence. I haven’t read them, but have thumbed through them and they look very cool. As a practical application of them, this page by Lukasz Piwek shows how to implement many of these visualizations in R.

  • Joey Cherdarchuk of Darkhorse Analytics has put together some really succinct presentations on how to simplify things you might put in a paper, like maps, charts, and tables, and on improving the data-to-ink ratio.

Working with Text

RMarkdown and Bookdown


Statistics Resources

General Statistics Knowledge

  • The American Statistical Association, which is essentially the statistics equivalent in scope and prestige of the Linguistic Society of America, put out a statement on p-values. It is brief and written in accessible language, and in my opinion it should be required reading if you ever use or interpret p-values in your research.

  • Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing by Justin Matejka and George Fitzmaurice. This went viral in some circles and shows that you can get the exact same summary statistics with wildly different distributions. Very cool.

  • 15 Types of Regression You Should Know is a post on the blog Listen Data that is a nice overview of different kinds of regression and how to implement them in R.

  • Mixed Modeling as a Foreign Language, a blog post by Andrew McDonald, first off is a good explanation of what mixed modeling is all about. But more importantly, it makes the point that “if you only partly understand the words you are using, you will humiliate yourself eventually.” In other words, it’s important to know what you’re doing when you use statistics, and if you don’t, maybe you should reconsider before you do something wrong.

  • Here’s a BuzzFeed article by Stephanie M. Lee about a researcher who made the news because of his unbelievable amount of p-hacking and using “statistics” to lie about his data.
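The point of the Same Stats, Different Graphs entry above is easy to check for yourself. Here is a minimal sketch, written in Python only to keep it free of dependencies, that uses the first two sets of Anscombe's quartet (the classic example that paper builds on) rather than the paper's own datasets. The function names are just illustrative. Both sets come out with the same rounded means and correlation, even though one is a noisy line and the other is a smooth curve when plotted.

```python
from math import sqrt

# First two datasets of Anscombe's quartet; they share the same x values.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sqrt(sum((a - mx) ** 2 for a in xs))
    sy = sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Mean of x, mean of y, and correlation, each rounded to 2 decimals."""
    return (round(mean(xs), 2), round(mean(ys), 2), round(pearson_r(xs, ys), 2))


summary_1 = summarize(x, y1)
summary_2 = summarize(x, y2)
print(summary_1)
print(summary_2)
```

The takeaway is the same as the paper's: plot your data before trusting a table of summary statistics.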

How-Tos and Tutorials


Praat Resources


Forced Aligners

I’ve got a lot of audio that I need to process, so a crucial part of all that is force aligning the text to the audio. Smart people have come up with free software to do this. Here’s a list of the ones I’ve seen.

  • DARLA is probably my personal favorite. It’s actually a whole collection of tools available through a web interface from Dartmouth College. It can transcribe, align, and extract formants from your (English) audio files all in one go. Previously, its forced aligner was built on Prosodylab-Aligner, but it now uses the Montreal Forced Aligner (see below).

  • The Montreal Forced Aligner is a relatively new one that I heard about for the first time at the 2017 LSA conference. It is fundamentally different from the others in that it is built on a toolkit called Kaldi. It’s easy to install and set up, and I’ve used it on my own data. The benefit of this over DARLA is that it runs on your own computer, so you don’t have to wait for files to upload, and you can process files in bulk.

  • FAVE is probably the most well-known forced aligner. It’s open source and you can download it to your own computer from Joe Fruehwald’s GitHub page. Or, if you’d prefer, you can use UPenn’s web interface instead.

  • Prosodylab-Aligner is, according to their website, “a set of Python and shell scripts for performing automated alignment of text to audio of speech using Hidden Markov Models.” This is software available through McGill University that actually allows you to train your own acoustic model (e.g. on a non-English audio corpus). I haven’t used it yet, but if I ever need to process non-English audio, this’ll be my go-to.

  • SPPAS is a software package with several functions including forced alignment in several languages. Of the aligners you can download to your computer, this might be one of the easier ones to use.

  • WebMAUS is another web interface with multiple functions including a forced aligner for several languages.

  • Gentle advertises itself as a “robust yet lenient forced aligner built on Kaldi.” It’s easy to download and use and produces what appear to be very good word-level alignments of a provided transcript. It even ignored the interviewer’s voice in the file I tried. The output is a .csv file, so I’m not sure how to turn that into a TextGrid, and if you need phoneme-level acoustic measurements, a word-level transcription isn’t going to work.
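On the question of turning Gentle's .csv output into a TextGrid: here is a rough sketch in Python of one way it could be done. It assumes each row is laid out as word, aligned word, start time, end time with no header row, which is an assumption you should check against your own file. The function names and the total_duration parameter (the length of the audio in seconds) are made up for illustration. It writes a single word-level interval tier in Praat's long text format and fills the gaps between words with empty intervals, since interval tiers have to be contiguous. It does not solve the phoneme-level problem.

```python
import csv
import io


def read_word_rows(csv_text):
    """Parse rows ASSUMED to be: word, aligned word, start, end (no header)."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        # Skip rows without timestamps (words the aligner could not place).
        if len(record) < 4 or not record[2].strip() or not record[3].strip():
            continue
        rows.append((record[0], float(record[2]), float(record[3])))
    return rows


def words_to_textgrid(rows, total_duration, tier_name="words"):
    """Build the text of a one-tier TextGrid from (word, start, end) rows."""
    intervals = []
    cursor = 0.0
    for word, start, end in sorted(rows, key=lambda row: row[1]):
        start = max(start, cursor)  # clip any overlap with the previous word
        if end <= start:
            continue
        if start > cursor:
            intervals.append((cursor, start, ""))  # silence or unaligned gap
        intervals.append((start, end, word))
        cursor = end
    xmax = max(total_duration, cursor)
    if cursor < xmax:
        intervals.append((cursor, xmax, ""))  # trailing gap

    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {xmax}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        "        xmin = 0",
        f"        xmax = {xmax}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (lo, hi, text) in enumerate(intervals, start=1):
        escaped = text.replace('"', '""')  # Praat doubles quotes inside text
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {lo}",
            f"            xmax = {hi}",
            f'            text = "{escaped}"',
        ]
    return "\n".join(lines) + "\n"


# Made-up sample rows, not real aligner output.
sample_csv = "hello,hello,0.5,0.9\nworld,world,1.0,1.4\num,,,\n"
textgrid = words_to_textgrid(read_word_rows(sample_csv), total_duration=2.0)
print(textgrid)
```

Saving the returned string to a file ending in .TextGrid should give you something Praat can open alongside the audio, though I would spot-check the boundaries before relying on it.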


Beautiful Websites

I designed this website more or less from scratch, so I can appreciate the work others put into their own academic sites. Here are some beautiful websites that I really like.

  • Kieran Healy has one of the most beautiful academic websites I’ve ever seen. I created this category on this page just so I could include his page on here. Wow.

  • Practical Typography by Matthew Butterick was my gateway into typography. My font selection and many other little details on my site (slides, posters, CV, etc.) were influenced by this book.


Miscellaneous

Just random stuff that doesn’t fit elsewhere.

  • The great American word mapper is an interactive tool put together by Diansheng Guo, Jack Grieve, and Andrea Nini that lets you see regional trends in how words are used on Twitter.

  • Collecting, organizing, and citing scientific literature: an intro to Zotero is a great tutorial on how to use Zotero by Mark Dingemanse. Zotero is a fantastic tool for, well, collecting, organizing, and citing scientific literature and I’m not exaggerating when I say that I could not be in academics without it.

  • Pink Trombone is an interesting site that has an interactive simulator of the vocal tract. You can click around and make different vowels and consonants. Pretty fun resource for teaching how speech works.

  • Vulgar: A Language Generator is a site that automatically creates a new conlang, based on parameters that you specify. The free web version allows you to add whatever vowels and consonants you’d like to include, and it’ll create a full language: a language name; IPA chart for vowels and consonants; phonotactics; phonological rules; and paradigms for nominal morphology, definite and indefinite articles, personal pronouns, and verb conjugations; derivational morphology; and a lexicon of over 200 words. For $19 you can download the software and get a lexicon of 2000 words, derivational words, random semantic overlaps with natural languages, and the ability to customize orthography, syllable structure, and phonological rules. In addition to just being kinda fun, this is a super useful resource for creating homework assignments for students.

  • IPA Phonetics is an iPhone app that has what they call an “elaborated” IPA chart with lots of extra places and manners of articulation, complete with audio clips of all the sounds. You can play a game where it’ll play a sound and you guess what you heard. It’s just fun to see things like a voiced uvular affricate [ɢʁ] or a dentolabial fricative [θ̼] on an IPA chart. Credits to University of Victoria linguistics and John Esling’s “Phonetic Notation” (chapter 18 of the Handbook of Phonetic Sciences, 2nd ed.).

  • The EMU-webApp “is a fully fledged browser-based labeling and correction tool that offers a multitude of labeling and visualization features.” I haven’t given this enough time to learn to use it properly, but it seems very helpful.

  • Johannes Haushofer’s CV of Failures. Other people have written this more elegantly than I could, but sometimes it’s nice to see that other academics fail too. You’re not going to get into all the conferences you apply for, your papers are sometimes going to be rejected, and you’re definitely not getting all the funding you apply for. I find it therapeutic to put together a CV of failures like this researcher did and to keep it updated and formatted just as I would a regular CV. Don’t let impostor syndrome get in the way by thinking others haven’t failed too.