Statistics

Russian Learner Corpus

Native language

Native language  Number of texts  Number of sentences  Number of words
unknown          319              5752                 65524
Swedish          182              3821                 37812
Estonian         2                13                   99
Vietnamese       113              2134                 26013
Romanian         15               239                  3101
Azerbaijani      6                99                   1187
Pashto           41               357                  5025
Macedonian       30               529                  6784
Hindi            21               308                  3991
Dutch            25               544                  5941
Korean           245              5652                 75801
Khmer            10               143                  972
Lao              4                106                  1058
Hungarian        6                157                  1799
Indonesian       48               901                  9574
Georgian         4                35                   718
Turkish          39               842                  7695
French           773              10210                108008
Abkhaz           1                32                   456
Norwegian        36               1075                 9649
Dari             2                4                    302
Bengali          19               353                  4167
Tajik            85               1021                 12533
Thai             16               266                  2607
Italian          250              4175                 61102
Kazakh           885              15117                164967
Turkmen          158              4713                 72015
Kurdish          18               135                  1312
Slovene          108              1733                 21463
Nepali           4                17                   322
Finnish          1239             26270                315073
Uzbek            5                37                   344
Albanian         6                114                  1273
Bulgarian        52               1226                 14235
Greek            4                56                   820
Serbian          146              2900                 31109
Arabic           153              1564                 24300
Croatian         22               693                  12138
Portuguese       27               383                  4388
Chinese          770              10732                123784
Czech            101              2161                 33790
Japanese         1595             18101                139726
German           306              4735                 55530
Slovak           4                86                   939
Mongolian        39               826                  8448
Spanish          325              5240                 94481
Urdu             4                69                   908
Farsi            35               398                  4152
English          3315             52445                684718

Raw counts

Number of texts 11613
Number of words 2262153
Number of sentences 188519
Number of annotations 120856
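
The raw counts can be turned into per-text and per-sentence averages with simple division. The following is a minimal sketch in Python; the variable names are illustrative and not part of the corpus tooling, and the ratios are plain arithmetic on the four totals above, not figures published by the corpus.

```python
# Totals copied from the "Raw counts" list above.
texts = 11613
words = 2262153
sentences = 188519
annotations = 120856

# Derived averages (simple ratios of the totals).
words_per_text = words / texts
sentences_per_text = sentences / texts
words_per_sentence = words / sentences
annotations_per_text = annotations / texts

print(f"words per text:       {words_per_text:.1f}")
print(f"sentences per text:   {sentences_per_text:.1f}")
print(f"words per sentence:   {words_per_sentence:.1f}")
print(f"annotations per text: {annotations_per_text:.1f}")
```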

Language background

unknown 197
heritage 2888
foreign 8528

Gender counts

unknown 1826
male 3468
female 6319
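
In both the language background and the gender breakdown, the three figures add up to 11613, the number of texts given under Raw counts. This suggests, although the page does not say so explicitly, that both breakdowns count texts and not individual learners. The check and the resulting shares can be sketched in Python as follows; the dictionary and function names are illustrative only.

```python
# Figures copied from the lists above.
total_texts = 11613
language_background = {"unknown": 197, "heritage": 2888, "foreign": 8528}
gender = {"unknown": 1826, "male": 3468, "female": 6319}


def shares(counts):
    """Return each category's share of the category total, in percent."""
    subtotal = sum(counts.values())
    return {label: 100 * n / subtotal for label, n in counts.items()}


for name, counts in [("language background", language_background),
                     ("gender", gender)]:
    subtotal = sum(counts.values())
    # Compare each breakdown's subtotal with the overall number of texts.
    print(f"{name}: subtotal {subtotal}, matches total: {subtotal == total_texts}")
    for label, pct in shares(counts).items():
        print(f"  {label}: {pct:.1f}%")
```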