Statistics

Russian Learner Corpus

Native language

Native language  Number of texts  Number of sentences  Number of words
unknown                      319                 5752            65524
Swedish                      182                 3821            37812
Estonian                       2                   13               99
Vietnamese                   127                 2369            28965
Romanian                      15                  239             3101
Azerbaijani                    6                   99             1187
Pashto                        51                  414             6154
Macedonian                    36                  598             7548
Hindi                         27                  354             4555
Dutch                         25                  544             5941
Korean                       263                 5959            78324
Khmer                         10                  143              972
Lao                            4                  106             1058
Hungarian                      9                  211             2373
Indonesian                    58                 1027            11457
Georgian                       4                   35              718
Turkish                       67                 1270            13452
French                       784                10259           109654
Abkhaz                         1                   32              456
Norwegian                     41                 1125            10046
Dari                           2                    4              302
Bengali                       24                  378             4832
Tajik                         85                 1021            12533
Thai                          22                  354             3579
Italian                      257                 4279            62590
Kazakh                       963                15998           178022
Turkmen                      158                 4713            72015
Kurdish                       18                  135             1312
Slovene                      111                 1800            22118
Nepali                         5                   34              448
Finnish                     1238                26254           314875
Uzbek                          5                   37              344
Albanian                       6                  114             1273
Bulgarian                     54                 1245            14491
Greek                          8                   87             1135
Serbian                      163                 3283            35210
Arabic                       206                 2152            32934
Croatian                      22                  693            12138
Portuguese                    32                  470             5285
Chinese                      955                13687           161964
Czech                        101                 2161            33790
Japanese                    1596                18111           139969
German                       307                 4256            49198
Slovak                         4                   86              939
Mongolian                     56                 1152            11306
Spanish                      384                 5914           105404
Urdu                          15                  168             2197
Farsi                         46                  522             6327
English                     3376                53137           694601

Raw counts

Number of texts 12250
Number of words 2370527
Number of sentences 196615
Number of annotations 124224

Language background

unknown 197
heritage 2989
foreign 9064

Gender counts

unknown 1827
male 3768
female 6655
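
Consistency check

The figures above can be cross-checked with a few lines of Python. The sketch below assumes that the language background and gender counts are counts of texts (the page does not say so explicitly), and all variable and function names are illustrative, not part of the corpus tooling. It tests whether each breakdown adds up to the total number of texts and derives two simple averages from the raw counts.

```python
# Figures copied from the "Raw counts", "Language background" and
# "Gender counts" sections. All names here are illustrative.
total_texts = 12250
total_words = 2370527
total_sentences = 196615

# Assumption: both breakdowns count texts, not authors.
language_background = {"unknown": 197, "heritage": 2989, "foreign": 9064}
gender = {"unknown": 1827, "male": 3768, "female": 6655}


def partitions_total(breakdown, total):
    """Return True if the category counts add up exactly to the total."""
    return sum(breakdown.values()) == total


background_ok = partitions_total(language_background, total_texts)
gender_ok = partitions_total(gender, total_texts)

# Simple averages derived from the raw counts.
words_per_sentence = total_words / total_sentences
sentences_per_text = total_sentences / total_texts

print("language background sums to total:", background_ok)
print("gender sums to total:", gender_ok)
print("mean words per sentence:", round(words_per_sentence, 2))
print("mean sentences per text:", round(sentences_per_text, 2))
```

The same helper could be applied to the per-language text counts in the native language table if those rows were loaded into a dictionary.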