'Voices' - The BBC Does Sociolinguistics

The BBC has dipped its toes into the water of popular linguistics before, with Radio 4 programmes such as Melvyn Bragg's Routes of English, but it's not every day that they launch a whole website and pan-BBC project devoted to linguistics. The project is named Voices, and encompasses a Radio 4 series called Word4Word and involves all of the BBC's nations and regions.


Amongst the sections of the website is Language Lab, which attempts to gather data from linguistic surveys. The most ambitious of these is Word Map, which aims to build a UK-wide 'map' of the words and phrases people use.

Creating maps of language use is nothing new; sociolinguists have been doing it for years. Early academics wanted to draw definitive maps which would show precisely where the different dialects of the UK were used. However, the linguists doing the research soon ran into problems. The usual method for drawing dialect maps is to pick a few features of dialect, such as whether people say the vowel in 'bath' as in 'maths' or 'tar', and to go around to different areas recording what people say. Unfortunately, the boundaries for different features never match up, leaving big overlaps. This is because dialects aren't distinct, discrete types but instead vary continuously across the country (and, as it turns out, across class and age groups).

The Word Map project differs slightly as it looks at the words or phrases people use for different concepts. This kind of research has been done by linguists before, but it's less common, both because word usage changes much more quickly than speech sounds and because it's a lot harder to gather the data (people use different words when prompted than they do in natural speech with their friends). However, given that this is a website-based project, surveying words is really the only possible option.

The survey is written in Flash, and starts by asking for basic data about your age, geographic upbringing and multilingual competencies (social class is conspicuously absent). You then pick from the six different spuriously-titled themes, and can enter your 'alternative word or phrase' for six different concepts. It's all displayed in a 'spider-diagram', with a nice bouncy effect as you add words. Despite appearances, though, this is only a method of presentation and doesn't depict any kind of meaningful network or taxonomy. As a UI it works nicely (apart from the lack of any way to delete words you've added if you made a typo).

The 'concepts' have all clearly been picked for their known variation across dialects, with 'long soft seat in main room' ('sofa' vs 'couch') and 'child's soft shoes worn for PE' ('plimsolls' vs 'pumps') being obvious examples. Quite a few have been oddly described though, such as 'narrow walkway alongside buildings', which points towards 'pavement' (or the American 'sidewalk'). But who says that pavements must be narrow or alongside buildings? 'Pedestrian walkway alongside roads' would have been better. Others include 'running water smaller than a river', suggesting 'stream', but the description doesn't say 'natural'. Even worse is 'main room of house (with tv)', suggestive of 'lounge' type rooms, but people might just as well consider their TV-containing dining room, kitchen or bedroom as their main room, and what about people without TVs (or, for that matter, lounges)?

The examples above all pick out fairly standard words, but some of the other concepts are aimed more at targeting the different slang words people use. The results for these will vary more and be much more interesting. Examples include 'to play truant', well known for its numerous slang words such as 'skive', 'bunk', 'play wag', etc., and 'drunk', which has even more slang words and phrases attached to it (see mine below).


One mistake that the site has made is to use words rather than descriptions for many of the concepts. As the text encourages you to 'enter alternative words or phrases', anyone who actually uses the word or phrase given, such as 'drunk' or 'rich', is unlikely to enter that same word, and so all this data is lost. Even worse, these people might feel that they're required to enter alternative words, and so might enter words that they know but rarely or never use. This will be especially true for fairly standard words like 'trousers' and 'clothes', 'baby' and 'friend'.

Other concepts given are so vague that they will simply attract people to enter as many standard synonyms as they can remember, rather than testing to see the different words people use for the same concept. An example is 'unwell', which might prompt you to enter 'sick', 'poorly', 'ill', 'frail', 'off-colour', etc, all of which are in fairly widespread use but which differ subtly in their connotations.

Finally, there's the bizarre. What words or phrases is 'left-handed' meant to attract, for example, other than insults? And what is 'young person in cheap trendy clothes and jewellery' meant to mean? Presumably they're looking for that derogatory term which seems to have suddenly hit mainstream consciousness, 'chav', but that doesn't seem like an accurate description to me ('trendy'??). Maybe it fits with 'townie' better, I don't know. Either way, it seems a pretty vague and meaningless description, and doesn't represent a 'concept' in my mind. Maybe I'll just put in 'twat'.

The different words that you enter are meant to show up on the map, presumably with shading to indicate in which areas people have submitted them most, but currently all the words I've entered are showing up as me being 'one of the first people to submit this word'. The words will apparently be added to the map once more than 10 people have submitted them, but as it even says this for 'happy', there's either no-one using the site or it takes a while to generate the graphics. I really hope that they're not 'moderating' the words first - that would be the ultimate insult.
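For what it's worth, the 'more than 10 people' rule is simple to express. Here is a minimal Python sketch of how I imagine it could work; it is purely my own guess, not anything the BBC has published, and the function name, the tuple layout and the constant are all hypothetical.

```python
from collections import Counter

# Assumption: the site says a word is mapped once "more than 10 people"
# have submitted it, so the count must exceed this value.
MIN_SUBMITTERS = 10


def words_ready_for_map(submissions, min_submitters=MIN_SUBMITTERS):
    """Return {(concept, word): number_of_people} for words past the threshold.

    `submissions` is an iterable of (user_id, concept, word) tuples
    (a hypothetical layout chosen for this sketch).
    """
    # Count each person once per (concept, word), ignoring case and
    # surrounding whitespace, so repeat submissions don't inflate totals.
    unique = {
        (user_id, concept, word.strip().lower())
        for user_id, concept, word in submissions
    }
    counts = Counter((concept, word) for _, concept, word in unique)
    return {key: n for key, n in counts.items() if n > min_submitters}
```

Counting people rather than raw submissions matters here: otherwise one keen user could push a word onto the map single-handedly.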


The final stats are meant to be available in June. Hopefully there will be some decent analysis and some interesting results. I hope too that the system will be able to match inflected word forms such as 'drizzle' and 'drizzling', as well as plural and singular forms.
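To show the kind of matching I mean, here is a deliberately crude Python sketch that groups submitted words by a stripped-down stem. It is only an illustration under my own assumptions: `crude_stem` and `group_variants` are names I made up, the suffix list is arbitrary, and a real system would want a proper stemmer or lemmatiser.

```python
from collections import defaultdict


def crude_stem(word):
    """Strip one common English suffix; a rough stand-in for real stemming."""
    w = word.strip().lower()
    # Try longer suffixes first, and keep at least three letters of stem
    # so that short words such as 'bus' are left alone.
    for suffix in ("ing", "es", "s", "e"):
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def group_variants(words):
    """Group submitted words that share the same crude stem."""
    groups = defaultdict(set)
    for word in words:
        groups[crude_stem(word)].add(word.strip().lower())
    return dict(groups)
```

With this, 'drizzle' and 'drizzling' both reduce to the same stem, as do 'pumps' and 'pump'. It will also make mistakes (it trims the final 's' of 'grass', for instance), which is exactly why the grouping would need checking by a human before any counts are trusted.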

Despite all my criticisms, I do like the idea. I can't expect the project to be a piece of serious sociolinguistic research. I suspect the BBC is more interested in being able to draw some pretty maps and a few fun news-friendly 'facts', but even that is OK. I would have preferred it if the whole thing had been better designed (I wonder how much input Dr Clive Upton actually had), but if the survey gets some more people interested in linguistics, then that's no bad thing.