
Frequency of sounds in Standard Dutch

Posted on Monday, January 31, 2005 10:53 AM

The following table displays the frequency with which sounds occur in Standard Dutch. The count is based on the CELEX database of word forms. I took the phonological transcriptions of these (approx. 120,000) words and counted the occurrences of every symbol. This means that the frequency is based on types rather than tokens: the frequency of individual words is not taken into account. However, it is to be expected that the numbers would not change drastically if we considered a token-based corpus instead. E.g. coronal consonants will still be the most frequent ones, if only because most function words and inflectional endings consist of them.
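To make the difference between type and token counts concrete, here is a minimal sketch in Python. It is not the script used for the table; the tiny lexicon, its transcriptions and the word frequencies are made up for illustration, and it assumes that every character in a transcription stands for exactly one segment (as the one-character symbols in the table suggest).

```python
from collections import Counter


def count_symbols(transcriptions, word_freqs=None):
    """Count how often each symbol occurs in a set of transcriptions.

    transcriptions: dict mapping a word form to its transcription,
                    one character per segment (an assumption).
    word_freqs:     optional dict mapping a word form to its corpus
                    frequency. Without it every word counts once (types);
                    with it every word counts as often as it occurs (tokens).
    """
    counts = Counter()
    for word, transcription in transcriptions.items():
        weight = 1 if word_freqs is None else word_freqs.get(word, 0)
        for symbol in transcription:
            counts[symbol] += weight
    return counts


# Hypothetical toy data, NOT taken from CELEX.
lexicon = {"de": "d@", "kat": "kAt", "tak": "tAk"}
token_freqs = {"de": 100, "kat": 3, "tak": 2}

type_counts = count_symbols(lexicon)                # every word counts once
token_counts = count_symbols(lexicon, token_freqs)  # weighted by frequency

print(type_counts.most_common())
print(token_counts.most_common())
```

In the type count the schwa of the function word "de" weighs as much as any other segment; in the token count it dominates, simply because the word itself is so frequent.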

The table should be read as follows: I found 94777 occurrences of the symbol '@', which equals 8.7% of all segments in CELEX. '@' is the CELEX symbol for the sound that is represented as ə in Unicode IPA (i.e. schwa).
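The percentage column can be recomputed from the counts. The sketch below assumes that the table lists every symbol that was counted, so that the total number of segments is simply the sum of the second column; the function name `share` is my own.

```python
# Occurrence counts copied from the second column of the table, top to bottom.
counts = [
    97776, 94777, 91963, 72995, 60504, 50798, 47418, 39526, 37063, 34487,
    31650, 31217, 31202, 30125, 30029, 29009, 28685, 26343, 26260, 24103,
    20467, 17755, 17571, 17121, 16885, 14576, 14454, 14162, 10784, 9907,
    6857, 6305, 3772, 3173, 2102, 1087, 878, 625, 224, 101, 63, 20, 4,
]

# Assumption: the table is complete, so this is the number of segments.
total = sum(counts)


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


print(share(94777, total))  # share of '@' (schwa)
print(share(97776, total))  # share of 'r'
```

Under that assumption the two printed values should agree with the 8.7 and 8.9 given in the table.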

CELEX symbol   # of occurrences   % in CELEX   IPA symbol
r              97776              8.9          r
@              94777              8.7          ə
t              91963              8.4          t
s              72995              6.7          s
l              60504              5.5          l
n              50798              4.6          n
k              47418              4.3          k
a              39526              3.6          a
A              37063              3.4          ɑ
i              34487              3.2          i
d              31650              2.9          d
p              31217              2.9          p
m              31202              2.8          m
x              30125              2.8          x
o              30029              2.7          o
e              29009              2.6          e
I              28685              2.6          ɪ
E              26343              2.4          ɛ
O              26260              2.4          ɔ
b              24103              2.2          b
v              20467              1.9          v
K              17755              1.6          ɛi
f              17571              1.6          f
N              17121              1.6          ŋ
w              16885              1.5          w
G              14576              1.3          ɣ
z              14454              1.3          z
h              14162              1.3          h
j              10784              1.0          j
u              9907               0.9          u
L              6857               0.6          ʌy
y              6305               0.6          y
M              3772               0.3          ɒu
|              3173               0.3          ø
S              2102               0.2          ʃ
g              1087               0.1          g
Z              878                0.1          ʒ
)              625                0.1          ɛː
<              224                0.0          ɒː
_              101                0.0          ʤ
!              63                 0.0
*              20                 0.0          ɶː
(              4                  0.0

Feedback

# re: Frequency of sounds in Standard Dutch

4/26/2005 10:19 AM by Evie Coussé
Hi,

Interesting! In my research on vowel reduction in Standard Dutch I have done a similar exercise for vowels in the Spoken Dutch Corpus. I think you would be surprised how much highly frequent words determine the distribution of sounds. For instance, for vowels the @ turned out to represent a quarter of all vowels in the Spoken Dutch Corpus.

Evie Coussé
