Statistics

Russian Learner Corpus

Native language

Native language  Number of texts  Number of sentences  Number of words
unknown                      319                 5752            65524
Swedish                      182                 3821            37812
Estonian                       2                   13               99
Vietnamese                    82                 1595            18549
English                     3254                51587           674347
Pashto                        33                  279             3692
Hindi                          6                   74              712
Dutch                         25                  544             5941
Korean                       227                 5397            72581
Bulgarian                     24                  649             8223
Macedonian                    26                  464             5725
Turkish                        7                  131             1101
French                       709                 9203            97067
Abkhaz                         1                   32              456
Norwegian                     36                 1075             9649
Bengali                        7                  132             1345
Thai                           8                  148             1475
Italian                      246                 4108            60188
Kazakh                       723                12720           138127
Turkmen                      146                 4695            82122
Slovene                      103                 1643            20509
Finnish                     1239                26270           315073
Uzbek                          5                   37              344
Albanian                       2                   24              144
Indonesian                     9                  170             1813
Serbian                       92                 1504            17002
Arabic                        85                  730            13999
Croatian                      22                  693            12138
Portuguese                     9                  117             1181
Chinese                      183                 3448            42316
Czech                        101                 2161            33790
Japanese                    1590                17974           138698
German                       294                 4062            46795
Tajik                         85                 1021            12533
Mongolian                     11                  265             3171
Spanish                      192                 3656            71718
Farsi                         16                  154             1396

Raw counts

Number of texts          10101
Number of words        2017355
Number of sentences     166348
Number of annotations    89235

Language background

unknown    197
heritage  2728
foreign   7176

Gender counts

unknown  1851
male     2688
female   5562