New home for WebAL

After long and tough negotiations in a dimly lit room at the back of a bar, with cigar smoke lingering in the air, lots of arm waving, shouting, and a few cold drinks, WebAL (Web resources for African languages) has finally found a new home at:

It is now in the competent hands of Guy de Pauw, Gilles-Maurice de Schryver and David Joffe. The new WebAL will be done in wiki style, something that I should have done myself ages ago, but it takes good men to do something good. The new format will be a tremendous boost for WebAL’s continued existence.

While working on WebAL over the past 5 years, I got a lot of help and encouraging emails, even though I would have appreciated a free weekend at some fancy Cape Town lodge more; all expenses paid, of course. Nah, just kidding! All your contributions, suggestions and emails have been quite rewarding and satisfying.

Now go over there and download some grammar books to read.

Web resources for African languages

For a few years now, I have been maintaining a set of web pages collecting links to web accessible materials on African languages, the WebAL. Since I keep track of stuff like that for my own benefit, it has been little additional trouble to transform that data into a web format that is useful to others as well.

However, I feel I’ve done what I can with it, and as much as it pains me, I can’t quite devote the time it deserves anymore. The web accessible material on African languages keeps growing almost by the day, and it would be beneficial to many people if something like the WebAL pages could continue to exist.

Thus I’m now looking for someone who might be willing to take over the maintenance of WebAL — and do whatever they want with it. I’m sure my archaic HTML tagging needs updating, if nothing else.

Drop me a mail if you’re interested, or know of someone who might be. (And just to clarify, I’m interested only in serious proposals. For instance, a move to a university site, or similar, would be preferable.)

African languages and descriptional density

I have recently been having some fun with bibliographical data. Specifically, I have tried to determine a simple way to calculate the "descriptional density" for various African languages, especially with regard to grammar descriptions.

Descriptional density (a concept I’ve invented myself, I think) aims to determine how well-described any given language is in terms of existing grammar books and dictionaries. For instance, if a given language has only one grammar book written about it, and another language has fourteen grammar books written about it, then obviously the latter language is better described than the former. In other words, its descriptional density is higher.

There are no doubt many factors that should be taken into account when calculating something like descriptional density, such as number of publications or titles, size of description, number of authors involved, number of varieties described, availability of the grammar(s), and so on. However, many such factors are difficult to operationalize in simple ways. For instance, the size of a grammar book is not always related to its inherent usefulness, quality or even comprehensiveness. The availability of an item is difficult to determine easily (at least as a numerical value). Indeed, there are seemingly only two factors that can be handled without stumbling onto major difficulties while still giving a reasonably informative result: number of titles or works (W) and time span (T). These can be worked into a formula as follows:

DD = W1 + W2/3 + √T

In general, one grammar book equals a W value of 1. However, many grammar books appear in second, third, fourth, etc., editions. It seems unintuitive to give a second edition the same weight as a first edition. After all, it is still essentially the same book, albeit with some minor or major revisions. Hence it seems convenient to distinguish primary works (W1) from secondary works (W2). While primary works are given a value of 1, secondary works are given a value of 1/3 (a third).

T (time span) represents the number of years between the publication of the first and the latest grammar. For instance, my bibliography includes 135 primary works (grammar books) for Swahili. The earliest of these was published in 1850, and the latest in 2006. This gives a time span of 156 years. In order for this number not to inflate the calculations unnecessarily, it needs to be whittled down a bit, which is why I use the square root of the actual time span in the formula.

By adding together the total number of primary works (W1), a third of the total number of secondary works (W2), and the square root of the time span (T), we get a total index value representing the descriptional density (DD) for any given language.
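As a rough illustration, the calculation can be sketched in a few lines of Python. The function name descriptional_density and its parameter names are my own invention for this sketch; the input is the set of Swahili figures given in this post (135 primary works, 78 secondary works, first grammar in 1850, latest in 2006).

```python
import math


def descriptional_density(w1: int, w2: int, t: int) -> float:
    """Return DD = W1 + W2/3 + sqrt(T).

    w1: number of primary works (first editions)
    w2: number of secondary works (later editions)
    t:  years between the first and the latest grammar
    """
    return w1 + w2 / 3 + math.sqrt(t)


# Swahili: 135 primary works, 78 secondary works, 1850 to 2006.
swahili_dd = descriptional_density(135, 78, 2006 - 1850)
print(round(swahili_dd, 2))  # 173.49
```

The square root is what keeps a long publication history from swamping the count of works: 156 years contribute only about 12.5 points to the total.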

Here, then, is a list of fifteen of the largest Bantu languages spoken in Africa, ranked according to their DD (descriptional density) values:

    Language        DD        W1 + W2     T
    Swahili         173.49    135 + 78    156
    Zulu            70.53     42 + 48     157
    Kikongo         67.29     45 + 11     347
    Chewa/Nyanja    51.11     31 + 26     131
    Xhosa           42.15     20 + 27     173
    Shona           41.63     26 + 15     113
    Setswana        39.08     20 + 18     171
    Lingala         37.29     23 + 11     113
    Sesotho         31.45     16 + 9      155
    North Sotho     25.91     14 + 3      119
    Luba-Kasai      25.82     13 + 7      110
    Kirundi         24.23     12 + 7      98
    Kinyarwanda     21.54     9 + 9       91
    Sukuma          20.87     11 + 1      91
    Kikuyu          17.31     7 + 2       93

Notice how the ranking only roughly corresponds to the actual number of grammar descriptions (whether we look at primary works only or primary and secondary works jointly). By taking time span into account, we get a somewhat more sophisticated picture of how well-described any given language is. As already mentioned, I have only looked at grammar descriptions. For a more comprehensive look, I also need to look at dictionaries, but that is a project for another sleepless night.
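To see where the two rankings part ways, here is a small Python sketch that orders the fifteen languages once by DD and once by the number of primary works alone. The figures are copied from the list above; the names DATA, by_dd and by_w1 are just labels made up for this example, and ties on primary works are left in the order of the list.

```python
import math

# (language, W1, W2, T), copied from the list above.
DATA = [
    ("Swahili", 135, 78, 156),
    ("Zulu", 42, 48, 157),
    ("Kikongo", 45, 11, 347),
    ("Chewa/Nyanja", 31, 26, 131),
    ("Xhosa", 20, 27, 173),
    ("Shona", 26, 15, 113),
    ("Setswana", 20, 18, 171),
    ("Lingala", 23, 11, 113),
    ("Sesotho", 16, 9, 155),
    ("North Sotho", 14, 3, 119),
    ("Luba-Kasai", 13, 7, 110),
    ("Kirundi", 12, 7, 98),
    ("Kinyarwanda", 9, 9, 91),
    ("Sukuma", 11, 1, 91),
    ("Kikuyu", 7, 2, 93),
]


def descriptional_density(w1, w2, t):
    """DD = W1 + W2/3 + sqrt(T)."""
    return w1 + w2 / 3 + math.sqrt(t)


# Rank by DD, highest first.
by_dd = [
    row[0]
    for row in sorted(
        DATA, key=lambda r: descriptional_density(r[1], r[2], r[3]), reverse=True
    )
]

# Rank by number of primary works only, highest first.
by_w1 = [row[0] for row in sorted(DATA, key=lambda r: r[1], reverse=True)]

# Print the two rankings side by side and flag the positions that differ.
for rank, (dd_name, w1_name) in enumerate(zip(by_dd, by_w1), start=1):
    marker = "" if dd_name == w1_name else "  <- differs"
    print(f"{rank:2d}  {dd_name:<14} {w1_name:<14}{marker}")
```

For example, Kikongo has more primary works than Zulu (45 against 42), yet Zulu comes out ahead on DD because of its much larger number of secondary works (48 against 11).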

You can read more details about this here.