Statistics

Russian Learner Corpus

Native language

Native language  Texts  Sentences   Words
unknown            319       5752   65524
Swedish            182       3821   37812
Estonian             2         13      99
Vietnamese         125       2334   28332
Romanian            15        239    3101
Azerbaijani          6         99    1187
Pashto              45        382    5508
Macedonian          36        598    7548
Hindi               23        339    4240
Dutch               25        544    5941
Korean             255       5865   77681
Khmer               10        143     972
Lao                  4        106    1058
Hungarian            9        211    2373
Indonesian          56       1002   11202
Georgian             4         35     718
Turkish             60       1175   12054
French             784      10259  109654
Abkhaz               1         32     456
Norwegian           37       1076    9689
Dari                 2          4     302
Bengali             24        378    4832
Tajik               85       1021   12533
Thai                22        354    3579
Italian            256       4268   62429
Kazakh             940      15760  174636
Turkmen            158       4713   72015
Kurdish             18        135    1312
Slovene            111       1800   22118
Nepali               4         17     322
Finnish           1238      26254  314875
Uzbek                5         37     344
Albanian             6        114    1273
Bulgarian           54       1245   14491
Greek                8         87    1135
Serbian            161       3264   35039
Arabic             201       2102   32440
Croatian            22        693   12138
Portuguese          31        462    5138
Chinese            835      11732  135388
Czech              101       2161   33790
Japanese          1595      18101  139726
German             307       4256   49198
Slovak               4         86     939
Mongolian           54       1117   10933
Spanish            374       5800  103628
Urdu                 4         69     908
Farsi               44        507    6069
English           3350      52930  691260

Raw counts

Number of texts 12012
Number of words 2327939
Number of sentences 193492
Number of annotations 124625
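
The raw counts can be turned into a few per-text and per-sentence averages. The sketch below is illustrative only: the variable names are made up for this example, and the averages are derived here from the totals listed above rather than reported by the corpus itself.

```python
# Totals copied from the "Raw counts" figures above.
texts = 12012
words = 2327939
sentences = 193492
annotations = 124625

# Derived ratios (an illustration, not figures published by the corpus).
words_per_text = words / texts
sentences_per_text = sentences / texts
words_per_sentence = words / sentences
annotations_per_text = annotations / texts

print(f"words per text:       {words_per_text:.1f}")
print(f"sentences per text:   {sentences_per_text:.1f}")
print(f"words per sentence:   {words_per_sentence:.1f}")
print(f"annotations per text: {annotations_per_text:.1f}")
```

These are corpus-wide means, so they hide the large differences between native-language groups visible in the table above.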

Language background

unknown 197
heritage 2967
foreign 8848

Gender counts

unknown 1826
male 3659
female 6527
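
As a quick sanity check, the language background and gender breakdowns can be compared against the total number of texts. The following is a minimal sketch with the counts copied from this page; the dictionary and variable names are hypothetical, and it assumes that each text carries exactly one value in each category.

```python
# Counts copied from the tables on this page.
total_texts = 12012
language_background = {"unknown": 197, "heritage": 2967, "foreign": 8848}
gender = {"unknown": 1826, "male": 3659, "female": 6527}

breakdowns = {"Language background": language_background, "Gender": gender}

for name, counts in breakdowns.items():
    subtotal = sum(counts.values())
    # If every text has exactly one label, the subtotal equals the total.
    status = "matches" if subtotal == total_texts else "differs from"
    print(f"{name}: {subtotal} ({status} total of {total_texts})")
    for label, count in counts.items():
        share = 100 * count / total_texts
        print(f"  {label}: {count} ({share:.1f}%)")
```

Both breakdowns add up to the 12012 texts given under raw counts, which is consistent with the one-label-per-text assumption.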