The practical and ethical issues of automated transcription: five take-home messages
An infographic showing some key questions that researchers can ask themselves when thinking about using automated transcription (Author: @CaitlinHafferty)
Reading time: 4 minutes.
Automated transcription software is advancing rapidly and has huge potential to transform essential research tasks. However, there are important considerations for its use, particularly when conducting qualitative research interviews. Common questions include: "How accurate and reliable is automated transcription?", "Does it create a barrier to understanding, interpreting, and familiarising yourself with the data from qualitative interviews?", "Are there any issues with regards to privacy, security, and safe storage?", and "Are there any known ethical issues that I need to consider?".
This blog post shares some key take-home messages from a webinar I gave on this topic. I spoke about what automated transcription is, the various applications available in 2021, some of the key features of one particular app (Otter.ai) and how to use them, and some ethical and GDPR issues that researchers can consider when thinking about using automated transcription in their own work. You can view the webinar recording, PowerPoint slides, and transcripts via the links and the video below:
Five take-home messages: what are the practical and ethical considerations for automated transcription?
1. Which tool to choose? It's important to think about the key features and what is most suitable for your needs. There are lots of commercially available transcription tools, which can make it confusing to pick the best one for the job; it's really a matter of selecting the tool that is best suited to the context and purpose you are using it for. As of 2021, options include Dragon, Verbit, Otter.ai, Amazon Transcribe, Microsoft Azure Speech to Text, and many others (a rough code sketch of what calling one of these services can look like is included just below). There are also lots of helpful websites and articles which provide an overview of the top free and paid transcription tools - TechRadar is a great place to start, and you can also check out my previous blog posts. I used an app called Otter.ai during my PhD, but this is just down to personal preference (NB. I'm not sponsored by or affiliated with them!) - it works really well for me and my working style. Some of the key features which I find useful in a research context are outlined in the screenshot below from my presentation slides.
In the presentation slides, I also talk through some of the main features of Otter.ai and why I find them useful - e.g. how to view, navigate, summarise, and edit transcripts, before exporting them for further analysis.
Why Otter.ai? Some useful features of this automated transcription app.
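As a rough illustration of what these services look like under the hood, here is a minimal sketch of submitting an audio file to Amazon Transcribe using the boto3 Python SDK. This is not a recommendation of that particular service, and the bucket, file, and job names are placeholders I've invented for illustration; other cloud transcription APIs follow a broadly similar job-based workflow.

```python
# Minimal sketch (not a drop-in script): submit an audio file stored in S3
# to Amazon Transcribe and poll until the job finishes. The bucket, key, and
# job names below are hypothetical placeholders.
import time
import boto3

transcribe = boto3.client("transcribe")

job_name = "interview-01-transcription"                  # hypothetical job name
audio_uri = "s3://my-research-bucket/interview-01.mp3"   # hypothetical S3 location

transcribe.start_transcription_job(
    TranscriptionJobName=job_name,
    Media={"MediaFileUri": audio_uri},
    MediaFormat="mp3",
    LanguageCode="en-GB",
)

# Poll the job status; real code would add error handling and a timeout.
while True:
    job = transcribe.get_transcription_job(TranscriptionJobName=job_name)
    status = job["TranscriptionJob"]["TranscriptionJobStatus"]
    if status in ("COMPLETED", "FAILED"):
        break
    time.sleep(10)

if status == "COMPLETED":
    # URL of a JSON file containing the transcript text and word timings.
    print(job["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])
```

Whichever tool you pick, the output at this stage is still a raw machine transcript - which leads to the next point.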
2. Human input is key. This is a really important point to remember - it can be tempting to view this software as a 'magic fix' for all your transcription needs, but automated transcription is not a replacement for human input; it is a useful addition. Automated transcription is a bit of a black box - we know what goes in (audio) and what comes out (a transcript), but we understand less about what happens in between and how to evaluate it (one simple numeric check, the word error rate, is sketched at the end of this point). We need a person to interpret and provide feedback on the transcript, checking that it is meaningful and an accurate reflection of the audio recording (i.e. that it makes sense). It helps if this person has knowledge of the topic(s) covered in the recording, e.g. a subject-matter expert, to help with the editing process.
Try not to feel overwhelmed if the transcript looks messy, and remember that you can learn to spot common issues. It can be quite confusing to see your automated transcript for the first time - there can be huge blocks of text, random punctuation and capitalisation where it shouldn't be, and sentences split between multiple paragraphs (I provide some examples of these issues in my presentation slides). Remember that human language is messy and so is the verbatim transcript - the more experience you have editing automated transcripts, the easier it gets to intuitively spot errors (so keep at it!). Be aware that the transcript will likely not look perfect in the first instance and manage your expectations accordingly (e.g. anticipate that it might take some time to edit the transcript). That said, I find that most errors are minimal and don't take long to edit.
All in all, it might not be helpful to think about the 'manual vs automated transcription' debate as an either/or - in practice, you will always need a hybrid approach that draws on both!
Automated transcription is not a replacement for human input (Picture: transcriptionwing)
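If you want to put a rough number on how much editing a transcript needs, one simple check (assuming you have hand-corrected a short sample against the audio) is the word error rate: the proportion of word substitutions, insertions, and deletions required to turn the automated transcript into your corrected version. The snippet below is a minimal, self-contained sketch - the function name and the example sentences are my own invention, not part of any particular transcription tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between a hand-corrected reference transcript
    and an automated transcript, divided by the reference length."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()

    # Standard dynamic-programming edit distance over words.
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # substitution
    return dist[len(ref)][len(hyp)] / len(ref)


# Invented example: compare an automated transcript against a corrected one.
reference = "we spoke about the ethics of automated transcription"
hypothesis = "we spoke about the effects of automated transcription"
print(f"WER: {word_error_rate(reference, hypothesis):.2f}")  # 0.12 (1 error / 8 words)
```

Even a low error rate doesn't guarantee that the transcript captures the meaning of the interview, so a check like this is a complement to, not a substitute for, reading the transcript alongside the audio.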
3. Ethical considerations should be reviewed regularly. This includes data protection, privacy and security, and the location and safe storage of personal information on third-party servers. For example, if you're a researcher in the UK like me, it's important to know that some commercially available transcription services (like Otter.ai) may transfer data out of the European Union (EU) and European Economic Area (EEA). This is an important consideration for GDPR, a regulation in EU law on data protection and privacy for individuals within the EU and EEA (it also addresses the transfer of personal data out of these areas).
For example, Otter.ai's privacy notice states: "... Otter.ai may transfer, store, and process your operations with our partners and service providers based outside of the country in which you are based. Laws in those countries may differ from the laws applicable to your country of residence. Where we transfer, store, and process your Personal Information outside of the EEA or the UK we will ensure that the appropriate safeguards are in place to ensure an adequate level of protection."
Another important consideration is who can view and access the data (and what permissions are needed). For example, Otter.ai's privacy notice states that they: "... train our proprietary artificial intelligence technology on aggregated, de-identified audio recordings. Only with your explicit permission will we manually review certain audio recordings to further refine our model training data."
4. Always get informed consent when using automated transcription software. Your audio recordings may contain the personal information of third parties (e.g. research participants), so it is important that you have the necessary permissions from any third parties before using an automated transcription app.
For example, when gaining ethical approval for the use of automated transcription apps for research purposes, you could include something along the lines of: "The audio recording of the interview, which may contain personal data if you have shared this during the interview, may be shared with a commercially available transcription service for the purpose of transcribing the information you share during the interview. Occasionally, data may be transferred out of the EEA, however in such cases it will be covered by appropriate privacy policies. For example, your data may be transferred out of the EEA for the purposes of transcription, using a transcription service that is signed up to the Privacy Shield Framework."
Some more example GDPR statements for using Otter.ai in qualitative research are shown below (these are all available to view in the presentation slides).
Automated transcription and informed consent
5. Technology is never neutral. It's important to consider the wider ethical principles around the use of automated transcription software (and the algorithms on which it is based), for example core issues such as bias, fairness and equality, control, trust, transparency, and accountability. Remember that there are strong principles for artificial intelligence (AI) and sociotechnical design - automated transcription software is not independent of the social and institutional contexts in which it operates. The design and use of technology like automated transcription software involves a series of choices made by humans, who can be influenced and biased - who is making those choices, how, and how might this influence the core ethical principles associated with its use?
I'll finish this post with a quote from a fantastic blog by Tracey Gyateng on Open Heroines, which offers some critical points of consideration regarding the impact of technology on society and using 'tech for good':
"... Because these [societal] inequalities exist, we need to understand the context and environment in which technology will be deployed, and work with the people who are most likely to be affected... We must actively include the voices of people who hold less power in society and who are most likely to be disproportionately impacted."