Last summer, I collected roughly 40 hours of conversation from people in Washington State. Not an enormous corpus, but I’m quite proud of that dataset. My goal was to transcribe one speaker gradually over the course of the year, finishing around now. Well, in 9 months, I’ve done about… one hour. I realized this week that I really, really need to get these done.
Now, I haven’t just been slacking off doing nothing for the past school year. I’ve been very busy with my research assistantships and with other tasks. I’ve given workshops and seminars here at UGA, and my name has been on a lot of conference presentations in the past few months, even if I haven’t been able to attend them. I’ve put together this website as well.
I’ve got my second qualifying paper coming up (edit: see my post about it here) and it occurred to me that if I want to graduate in a year, I had better get these recordings transcribed!
Because I am the way I am, I kept track of my transcription rate. I did the first hour or so in about 5 hours. Okay, so at that rate, it’ll take me around 200 hours of work to finish the transcriptions. At an hour a day, that’ll take me like 8 months. At two hours a day, I’ll finish in like August. Yikes. That’s a lot of work.
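For anyone who wants to run the same back-of-the-envelope projection, here is a minimal Python sketch. The function name `projected_days` is made up for illustration, and it assumes I transcribe every single calendar day, which is optimistic. Skipping weekends or taking breaks stretches the timeline quite a bit.

```python
def projected_days(audio_hours_left, work_hours_per_audio_hour, work_hours_per_day):
    """Return (total work hours, days of work) needed to finish transcribing.

    Assumes the same amount of transcription every day with no days off.
    """
    total_work = audio_hours_left * work_hours_per_audio_hour
    return total_work, total_work / work_hours_per_day


# 39 hours of audio left, at roughly 5 hours of work per hour of audio.
for daily in (1, 2):
    total, days = projected_days(39, 5, daily)
    print(f"{daily} h/day: {total} work hours, {days} days of transcribing")
```

Plugging in a different rate (say, one measured after a few more interviews) is just a matter of changing the second argument.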
39 hours of audio left. At two hours of transcribing a day I'll finish by… Thanksgiving. Am I old enough to hire undergrads yet?— Joey Stanley (@joey_stan) April 28, 2017
So I decided to just get to it. I can calculate and project and write about it all I want, but it’s not going to get done unless I just go for it. And I’ve got to be the one to do it. Even though there are automatic transcribers out there, I’d have to double-check their work anyway, and if I need to listen to it all I might as well do it all myself. Plus, it’s nice to go through the recordings again and listen for linguistic features I didn’t catch before. And there are a lot!
Edit: I’ve decided to tweet every so often to keep track of my progress and to keep myself motivated.
I put in four hours today and finished an entire speaker, averaging 3.2 minutes of work time for every minute of transcription time. Nice.— Joey Stanley (@joey_stan) May 2, 2017
I'm 15% finished transcribing. Just finished listening to my interview with a professional radio announcer. Wow. Talk about clear audio.— Joey Stanley (@joey_stan) May 12, 2017
Just hit the 20% mark for transcribing my interviews. I'm currently reliving the one with a guy who worked falling trees for 18 years.— Joey Stanley (@joey_stan) May 16, 2017
25% done with transcribing. Would have hit it earlier, but I've got two other projects going right now, which means more transcribing later…— Joey Stanley (@joey_stan) May 31, 2017
After a bit of a hiatus, I'm back to transcribing and just hit the 30% mark for my corpus! Hooray!— Joey Stanley (@joey_stan) August 3, 2017
After another long hiatus, I'm back to transcribing. I just hit the 35% mark of my corpus. I'm averaging around 4.6 hours of work for every hour of audio, and I've got about 26 hours of interviews left. https://t.co/AyjOL9gIXX— Joey Stanley (@joey_stan) April 17, 2018