Make Radio 4 properly searchable and quotable with transcripts
Transcribe BBC Radio 4
This is a Google+ Page ‘campaign’ experiment. I don’t plan to post much here. I simply ask that, if you agree with the following, you:
- +1 this page (or add it to one of your Circles)
- Reshare our proposition post
- Share your experience of trying to find something you'd heard said or wanted to research
. . . . .
Where we are now
Radio 4 is wonderful. But ever heard something on it and then wanted to find it later? Ever realised how difficult that is and how unlikely it is that you’ll find what you are looking for?
Web search is still dependent upon text, and the vast majority of what gets broadcast never makes it into searchable text. Instead, we’re stuck with:
- What the BBC marketing departments tell us on their bbc.co.uk pages
- What journalists and listeners publish in response to these pages and the programmes themselves
- A paltry set of transcripts available historically and on request
We’re also stuck with BBC iPlayer only giving us 7 days’ access to the Radio 4 broadcasting schedule (and it having limited non-UK availability). Apparently this is due to copyright reasons, but it’s probably also down to a need to keep hosting costs down (the BBC doesn’t have virtually boundless resources like Google) and to the value that is created by deliberate scarcity.
But I suggest the BBC could do better by exploiting auto transcription software.
. . . . .
Where we could be
Isn’t the time ripe for something akin to the eBook revolution of searchable text to come to radio? Auto transcribing has been around for a while. It’s far from flawless, but it’s getting better all the time. YouTube offers it, for example.
I propose that Radio 4 should auto transcribe any of its output that cutting-edge speech recognition software is reasonably up to the task of (we are surely further on than this 2008 Guardian article), and make the transcripts publicly accessible and searchable because:
- They don’t have to be perfect to be useful.
- Transcribing errors could be suitably disclaimed to cover legalities (and perhaps there could be a Wikipedia-like collaborative effort from the public to make corrections).
- The hosting requirements of text transcriptions would be orders of magnitude smaller than those of audio files.
Even if the transcripts had to be limited to 7-day access like iPlayer, they would still be invaluable.
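As a rough check on the hosting point above, here is a back-of-envelope sketch comparing one hour of speech stored as audio with the same hour stored as plain text. Every figure in it (the 128 kbps bitrate, the 150 words per minute speaking rate, the 6 bytes per word) is an illustrative assumption of mine, not a BBC number, and the function names are made up for this example.

```python
# Back-of-envelope comparison: one hour of speech as audio vs. as plain text.
# All constants are illustrative assumptions, not figures from the BBC.

AUDIO_BITRATE_KBPS = 128   # assumed constant audio bitrate, kilobits per second
WORDS_PER_MINUTE = 150     # assumed speaking rate for speech radio
BYTES_PER_WORD = 6         # assumed average: about 5 letters plus a space

SECONDS_PER_HOUR = 3600


def audio_bytes_per_hour(bitrate_kbps: int = AUDIO_BITRATE_KBPS) -> int:
    """Bytes needed for one hour of audio at a constant bitrate."""
    return bitrate_kbps * 1000 * SECONDS_PER_HOUR // 8


def text_bytes_per_hour(wpm: int = WORDS_PER_MINUTE,
                        bytes_per_word: int = BYTES_PER_WORD) -> int:
    """Bytes needed for an uncompressed plain-text transcript of one hour."""
    return wpm * 60 * bytes_per_word


audio = audio_bytes_per_hour()
text = text_bytes_per_hour()
ratio = audio / text

print(f"Audio: {audio / 1e6:.1f} MB per hour")
print(f"Text:  {text / 1e3:.1f} KB per hour")
print(f"Audio is roughly {ratio:.0f} times larger")
```

Under these assumptions an hour of audio comes to about 57.6 MB against about 54 KB of text, a difference of roughly a thousandfold. Different bitrates or speaking rates would shift the ratio, but not the broad conclusion.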
. . . . .
Why just Radio 4?
Couldn’t this approach in theory be applied to all broadcast output? Well, yes, but I think Radio 4’s programming is particularly suited as a testing ground.