This page presents Word Count Charts and Lexical Item Charts for the "Corpus of Historical Japanese" Version 2020.03.
Version 2020.03 of the "Corpus of Historical Japanese" contains approximately 18,910,000 Short Unit Words and 2,570,000 Long Unit Words. The breakdown by sub-corpus is set out below.
Period | Sub-corpus | Short Unit Word | Long Unit Word |
---|---|---|---|
Nara period | Nara Period Series I: Man'yōshū | 99,000 | 94,000 |
Nara period | Nara Period Series II: Senmyō | 21,000 | 17,000 |
Heian period | Heian Period Series | 1,030,000 | 912,000 |
Heian period / Kamakura period | Waka-shū Series | 269,000 | 252,000 |
Kamakura period | Kamakura Period Series I: Folktales and Essays | 844,000 | 792,000 |
Kamakura period | Kamakura Period Series II: Diaries and Travel Literature | 128,000 | 118,000 |
Muromachi period | Muromachi Period Series I: Kyōgen | 277,000 | 256,000 |
Muromachi period | Muromachi Period Series II: Christian Materials | 138,000 | 128,000 |
Edo period | Edo Period Series I: Share-bon | 218,000 | ― |
Edo period | Edo Period Series II: Ninjo-bon | 406,000 | ― |
Edo period | Edo Period Series III: Chikamatsu-Joruri | 255,000 | ― |
Meiji Era / Taishō Era / Shōwa Era | Meiji Era / Taishō Era Series I: Magazines | 14,180,000 | ― |
Meiji Era / Taishō Era / Shōwa Era | Meiji Era / Taishō Era Series II: Textbooks | 856,000 | ― |
Meiji Era / Taishō Era / Shōwa Era | Meiji Era / Taishō Era Series III: Early Meiji Spoken Language Materials | 193,000 | ― |
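The sub-corpus figures in the table can be cross-checked against the stated totals. The following is a minimal sketch of that arithmetic, using only the rounded values shown in the table. The names `COUNTS` and `totals` are illustrative and not part of any corpus tooling, and `None` stands for the "―" cells where no Long Unit Word figure is given.

```python
# Rounded (Short Unit Word, Long Unit Word) counts copied from the table.
# None marks a "―" cell, i.e. no Long Unit Word figure is listed.
COUNTS = {
    "Nara Period Series I: Man'yōshū": (99_000, 94_000),
    "Nara Period Series II: Senmyō": (21_000, 17_000),
    "Heian Period Series": (1_030_000, 912_000),
    "Waka-shū Series": (269_000, 252_000),
    "Kamakura Period Series I: Folktales and Essays": (844_000, 792_000),
    "Kamakura Period Series II: Diaries and Travel Literature": (128_000, 118_000),
    "Muromachi Period Series I: Kyōgen": (277_000, 256_000),
    "Muromachi Period Series II: Christian Materials": (138_000, 128_000),
    "Edo Period Series I: Share-bon": (218_000, None),
    "Edo Period Series II: Ninjo-bon": (406_000, None),
    "Edo Period Series III: Chikamatsu-Joruri": (255_000, None),
    "Meiji Era / Taishō Era Series I: Magazines": (14_180_000, None),
    "Meiji Era / Taishō Era Series II: Textbooks": (856_000, None),
    "Meiji Era / Taishō Era Series III: Early Meiji Spoken Language Materials": (193_000, None),
}


def totals(counts):
    """Return (short_total, long_total), skipping missing Long Unit Word cells."""
    short_total = sum(suw for suw, _ in counts.values())
    long_total = sum(luw for _, luw in counts.values() if luw is not None)
    return short_total, long_total


short_total, long_total = totals(COUNTS)
print(f"Short Unit Words: {short_total:,}")  # Short Unit Words: 18,914,000
print(f"Long Unit Words:  {long_total:,}")   # Long Unit Words:  2,569,000
```

Because the table entries are themselves rounded, the sums (18,914,000 and 2,569,000) differ slightly from the headline totals but agree with them when rounded to the nearest 10,000.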
The word counts for the data collected in the "Corpus of Historical Japanese" are presented in the following files. Word counts (both including and excluding punctuation marks) have been arranged according to Sample ID, Core/non-core, Main text type (including quotations), and Style.
Data on the Word Count Chart for Short Unit Words can be downloaded through the following links.
Download Word Count Chart for Short Unit Words tsv Data (Version 2020.03)
Download Word Count Chart for Short Unit Words Excel Data (Version 2020.03)
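Once a tsv file has been downloaded, its word counts can be aggregated by one of the arrangement fields. The sketch below shows one way this might be done with the Python standard library. The column names `style` and `word_count`, the file name in the comment, and the sample rows are all hypothetical placeholders, since the actual header row of the tsv is not described on this page; check the downloaded file and substitute its real column names.

```python
import csv
import io
from collections import defaultdict


def sum_counts(tsv_file, group_col, count_col):
    """Sum count_col for each distinct value of group_col in a TSV stream."""
    result = defaultdict(int)
    reader = csv.DictReader(tsv_file, delimiter="\t")
    for row in reader:
        # Tolerate thousands separators such as "1,030" in the count column.
        result[row[group_col]] += int(row[count_col].replace(",", ""))
    return dict(result)


# Made-up sample rows for illustration only (not real corpus figures).
# The column names here are assumptions, not the file's actual headers.
sample = io.StringIO(
    "sample_id\tstyle\tword_count\n"
    "S1\tA\t100\n"
    "S2\tB\t250\n"
    "S3\tA\t50\n"
)
by_style = sum_counts(sample, group_col="style", count_col="word_count")
print(by_style)  # {'A': 150, 'B': 250}

# With a real download (hypothetical file name), the call would look like:
# with open("word_count_suw.tsv", encoding="utf-8", newline="") as f:
#     by_style = sum_counts(f, group_col="...", count_col="...")
```

Passing the column names as parameters keeps the function usable for either unit type and for any of the fields (Sample ID, Core/non-core, Main text type, Style) once their actual header labels are known.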
The word counts for the data collected in the "Corpus of Historical Japanese" are presented in the following files. Word counts (both including and excluding punctuation marks) have been arranged according to Sample ID, Core/non-core, Main text type (including quotations), and Style.
Data on the Word Count Chart for Long Unit Words can be downloaded through the following links.
Download Word Count Chart for Long Unit Words tsv Data (Version 2020.03)
Download Word Count Chart for Long Unit Words Excel Data (Version 2020.03)
Word counts of individual lexemes (and word counts of lexical type and of part of speech) for the data collected in the "Corpus of Historical Japanese" have been arranged by historical period and by literary work.
They can be downloaded through the following links.