Statistics

Russian Learner Corpus

Native language

Native language  Number of texts  Number of sentences  Number of words
unknown          319              5752                 65524
Swedish          182              3821                 37812
Estonian         2                13                   99
Vietnamese       127              2369                 28965
Romanian         16               248                  3202
Azerbaijani      6                99                   1187
Pashto           53               429                  6530
Macedonian       36               598                  7548
Hindi            28               363                  4657
Dutch            25               544                  5941
Korean           264              5981                 78556
Khmer            10               143                  972
Lao              4                106                  1058
Hungarian        9                211                  2373
Indonesian       58               1027                 11457
Georgian         4                35                   718
Turkish          72               1368                 14353
French           791              10358                110680
Abkhaz           1                32                   456
Norwegian        41               1125                 10046
Dari             2                4                    302
Bengali          27               419                  5350
Tajik            85               1021                 12533
Thai             24               378                  3811
Italian          257              4279                 62590
Kazakh           963              15971                177881
Turkmen          158              4713                 72015
Kurdish          18               135                  1312
Slovene          111              1800                 22118
Nepali           5                34                   448
Finnish          1238             26254                314875
Uzbek            5                37                   344
Albanian         6                114                  1273
Bulgarian        54               1245                 14491
Greek            8                87                   1135
Serbian          163              3283                 35210
Arabic           212              2220                 34044
Croatian         22               693                  12138
Portuguese       32               470                  5285
Chinese          984              14027                165577
Czech            101              2161                 33790
Japanese         1596             18111                139969
German           307              4256                 49198
Slovak           4                86                   939
Mongolian        73               1543                 14332
Spanish          385              5933                 105622
Urdu             29               386                  4277
Farsi            49               525                  6779
English          3406             53536                699194

Raw counts

Number of texts 12372
Number of words 2388966
Number of sentences 198343
Number of annotations 123600
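
The raw counts above are enough to derive a few per-unit averages. The sketch below is only an illustration: the variable names are made up for this example, and the averages are simple ratios of the published totals, not figures reported by the corpus itself.

```python
# Totals copied from the "Raw counts" list; names are illustrative only.
total_texts = 12372
total_words = 2388966
total_sentences = 198343
total_annotations = 123600

# Simple ratios of the totals (not statistics published by the corpus).
words_per_text = total_words / total_texts
words_per_sentence = total_words / total_sentences
sentences_per_text = total_sentences / total_texts
annotations_per_text = total_annotations / total_texts

print(f"words per text:       {words_per_text:.1f}")
print(f"words per sentence:   {words_per_sentence:.1f}")
print(f"sentences per text:   {sentences_per_text:.1f}")
print(f"annotations per text: {annotations_per_text:.1f}")
```

These are corpus-wide means, so they hide the large differences in text length between native-language groups.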

Language background

unknown 197
heritage 2994
foreign 9181

Gender counts

unknown 1827
male 3829
female 6716
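
The page does not state what the language background and gender figures count. A quick consistency check, assuming they are numbers of texts, is to compare each breakdown's sum with the total number of texts. The helper name `covers_all_texts` is hypothetical and exists only for this sketch.

```python
# Figures copied from the lists above; names are illustrative only.
total_texts = 12372

language_background = {"unknown": 197, "heritage": 2994, "foreign": 9181}
gender = {"unknown": 1827, "male": 3829, "female": 6716}


def covers_all_texts(breakdown, total):
    """Return True if the category counts add up to the given total."""
    return sum(breakdown.values()) == total


# Assumption: each breakdown assigns every text to exactly one category.
background_ok = covers_all_texts(language_background, total_texts)
gender_ok = covers_all_texts(gender, total_texts)

print("language background sums to total texts:", background_ok)
print("gender sums to total texts:", gender_ok)
```

Both checks hold for the figures shown here, which is consistent with each breakdown being a partition of the 12372 texts.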