Automated transcription for qualitative research: ethical, privacy, and security considerations

 





 'Transcription is vital to qualitative research, [however] it's a methodological and ethical blind spot, an onerous task, and it's expensive' - Julie Mooney-Somers.
 
This blog post covers some key ethical considerations for the use of automated transcription apps, including privacy and security, safe storage, and GDPR (General Data Protection Regulation). I draw primarily on my experience as a third-year PhD student conducting social science research, having recently recorded and fully transcribed in-depth interviews using the automated transcription software Otter.ai (NB: I'm not paid by, or affiliated with, this company - I just use their product!). Hopefully this information will be useful to other researchers considering the use of automated transcription software in their work.
 
*Update, Summer 2021* - I presented a free webinar on this topic in June. You can watch the recording on YouTube and view the presentation on Slideshare.
 
As someone who has spent hours transcribing research interviews for an undergraduate degree, two postgraduate degrees, and now a PhD, I'm enthusiastic about this process being automated. However, I'm aware of (and very interested in) the ethical considerations involved. The suitability of these apps will vary depending on the purpose, objectives, and context of the project (and institution/funding body), as well as the skills/requirements of the researcher and participant(s).

I previously wrote a two-part blog post on the use of automated transcription apps for qualitative research, which you can read on my website and blog. Part one is an introduction to automated transcription software (and what's available on the market in 2020), how it works, and why it's useful from a researcher's perspective. Part two is a tutorial based on my own experience, providing a step-by-step guide to using one transcription app (Otter.ai) and highlighting some features which I think are useful for qualitative research (alongside some potential limitations).

These previous posts have been circulated widely on Twitter, had around 700 hits, and have received some great comments and feedback (I'm really pleased they've been useful!). Some of the most frequently asked questions I've received have been regarding the accuracy of transcription software and its compliance with ethical guidelines and GDPR. For example, below are some questions I've been asked on Twitter (scroll down for privacy/security/storage/GDPR): 
 
FAQ: Why did you decide to use a specific transcription app?
 
I cover lots of these reasons (and more!) in my introductory blog post on automated transcription apps for qualitative research, which includes an infographic of the "Top 10 things I like about Otter.ai".
 
FAQ: How accurate is transcription, and does it present a barrier to familiarity and interpretation of qualitative data?
 
I discuss the accuracy of transcription apps (and how to edit your transcript in Otter.ai) here, and intend to do so in more depth in a later post. In my experience, Otter.ai is pretty accurate, and the time spent making small edits to the generated transcript is significantly less than typing the entire transcript by hand. You can edit the timestamped transcript live in the app (checking against the audio recording), which makes it really easy to correct any mistakes, and accuracy can be improved by teaching "known words" to the algorithm (useful for acronyms, names, and so forth). In addition, listening through your recording again while making small edits can help refresh your memory, increase your familiarity with the data, and give you time to reflect upon and add comments to the transcript (e.g. regarding emerging themes, key words, and perceived sentiment/tone).
 
FAQ: Did you find that these apps were secure, particularly regarding data protection and storage issues?
 
Considering potential issues around GDPR, data storage, and privacy/security is really important (check out this open access article for an in-depth discussion applicable to automated transcription). Below are some comments and insights from my own experience which will hopefully be of use/interest to other researchers.

Privacy, security, and GDPR

According to the Economic and Social Research Council (ESRC, see here), ethical considerations should be reviewed regularly. This includes data protection, privacy and security, and the location and safe storage of personal information on third-party servers.
 
Speaking from the perspective of a UK-based researcher, the location and storage of the information I have collected is a key consideration. Some commercially available transcription services (like those provided by California-based Otter.ai) may transfer data out of the European Union (EU) and European Economic Area (EEA). This is an important consideration for GDPR, a regulation in EU law on data protection and privacy for individuals within the EU and EEA. The rights and obligations contained within the EEA agreement still apply to the UK until the end of the transition period for the UK's departure from the EU (31st December 2020) - I won't say too much about this here, but scroll down to the bottom of this post for more details!
 
Even though personal data (interview transcripts) may be transferred outside of the EEA, this doesn't mean that they aren't covered by suitable privacy policies. Otter.ai, for example, is signed up to the Privacy Shield Framework; this provides companies with a mechanism to comply with data protection requirements when transferring data (i.e. from the European Union to the United States). You can view full details here, including statements on Otter.ai's purposes for data collection (cookies, advertising, third-party analytics tools, and so forth).
 
Below is a statement that was recently shared with me (with permission to include it in this post). It is included in the organisation's privacy notice to cover the use of Otter.ai to transcribe interviews for social research in a public body - I think the wording is extremely useful, and I've adapted it for my own research.
 
Example privacy notice statement to cover the use of Otter.ai transcription services for qualitative interviews: "The audio recording of the interview, which may contain personal data if you have shared this during the interview, may be shared with a commercially available transcription service for the purpose of transcribing the information you share during the interview. Occasionally, data may be transferred out of the EEA, however in such cases it will be covered by appropriate privacy policies. For example, your data may be transferred out of the EEA for the purposes of transcription, using a transcription service that is signed up to the Privacy Shield Framework."

For the use of Otter.ai in my own research, the University's ethics committee required me to consider the following when applying for ethical approval:
  1. Explicit consideration of the use of Otter.ai with regard to the privacy notices of both the research institute where my PhD is based, and the wider University. This included written statements on anonymity/confidentiality and data storage (e.g. the removal of identifiable information).
  2. Inclusion of statements regarding my use of Otter.ai in the 'informed consent' form for my PhD (for participants and the researcher to sign and retain before completion of an interview). This included the provision of an alternative option for transcription (manual) if participants did not agree to its use.
  3. Inclusion of statements about Otter.ai in the 'participant information sheet' for my research, with links to key information about their privacy policy and GDPR.
After review, my PhD fieldwork was granted full ethical approval (for context, my fieldwork consisted of semi-structured interviews conducted online via Zoom/Microsoft Teams).

I've included the statements used in the participant information and consent forms for my PhD, in case these are a useful point of reference for students considering the use of Otter.ai (or similar automated transcription software) in their own research:

"The interview will be recorded and transcribed verbatim by professional and confidential audio transcription software (Otter.ai - see https://otter.ai/about and https://blog.otter.ai/privacy-policy/ for more information). You will be given the opportunity to opt out of the use of this software before the interview."

Section of electronic informed consent form used for qualitative interviews (source: @CaitlinHafferty)

 

The UK's departure from the EU and membership of the EEA

The UK's departure from the EU may affect the wording of privacy notices for using some commercially available transcription software in the UK (e.g. the example privacy notice text I have included in this blog post). You could contact your institution's data manager if you have any questions about this.
 
You can read about this in the European Free Trade Association (EFTA) FAQs on Brexit. All relevant information is in these FAQs, but essentially the UK is covered by the EEA agreement until 31st December 2020 (the end of the transition period for Brexit). After this transition period, the trade relationship between the EEA EFTA states and the UK will have to be agreed in negotiations (including negotiations on issues regarding data protection). You can read the full separation agreement between the UK and the EEA EFTA states here.
 
 
Photo by Nick Morrison on Unsplash