Statistics

Russian Learner Corpus

Native language

Native language  Number of texts  Number of sentences  Number of words
unknown          319              5752                 65524
Swedish          182              3821                 37812
Estonian         2                13                   99
Vietnamese       113              2134                 26013
Romanian         15               239                  3101
Azerbaijani      6                99                   1187
Pashto           41               357                  5025
Macedonian       30               529                  6784
Hindi            21               308                  3991
Dutch            25               544                  5941
Korean           245              5652                 75801
Khmer            10               143                  972
Lao              4                106                  1058
Hungarian        6                157                  1799
Indonesian       48               901                  9574
Georgian         4                35                   718
Turkish          39               842                  7695
French           773              10210                108008
Abkhaz           1                32                   456
Norwegian        36               1075                 9649
Dari             2                4                    302
Bengali          19               353                  4167
Tajik            85               1021                 12533
Thai             16               266                  2607
Italian          250              4175                 61102
Kazakh           906              15364                168519
Turkmen          158              4713                 72015
Kurdish          18               135                  1312
Slovene          108              1733                 21463
Nepali           4                17                   322
Finnish          1238             26254                314875
Uzbek            5                37                   344
Albanian         6                114                  1273
Bulgarian        52               1226                 14235
Greek            4                56                   820
Serbian          146              2900                 31109
Arabic           153              1564                 24300
Croatian         22               693                  12138
Portuguese       27               383                  4388
Chinese          770              10732                123784
Czech            101              2161                 33790
Japanese         1595             18101                139726
German           305              4219                 48681
Slovak           4                86                   939
Mongolian        39               826                  8448
Spanish          325              5240                 94481
Urdu             4                69                   908
Farsi            35               398                  4152
English          3315             52445                684718

Raw counts

Number of texts        11632
Number of words        2258658
Number of sentences    188234
Number of annotations  123558

Language background

unknown   197
heritage  2908
foreign   8527

Gender counts

unknown  1825
male     3474
female   6333
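
The language background and gender breakdowns above can each be summed and compared with the total number of texts as a quick consistency check. The following is a minimal Python sketch with the figures copied from the tables above; the names `TOTAL_TEXTS`, `language_background`, `gender` and `shares` are illustrative only and are not part of any corpus tooling.

```python
# Figures copied from the "Raw counts", "Language background"
# and "Gender counts" tables above.
TOTAL_TEXTS = 11632
language_background = {"unknown": 197, "heritage": 2908, "foreign": 8527}
gender = {"unknown": 1825, "male": 3474, "female": 6333}


def shares(counts):
    """Return each category's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


# Both breakdowns should account for every text in the corpus.
assert sum(language_background.values()) == TOTAL_TEXTS
assert sum(gender.values()) == TOTAL_TEXTS

print(shares(language_background))
print(shares(gender))
```

The same pattern could be applied to the per-language table by summing one column and comparing it with the matching raw count.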