The BCCWJ word list (frequency list) is publicly available and may be used free of charge for research or educational purposes. A manual is also provided to make the list easier to use.
The word counts of Long Unit Words in the POS and word classification lists had been treated as identical to those of Short Unit Words; these counts have been corrected. (2014/01/07)
2011
Special "Japanese corpus" language policy research group.
Based upon the BCCWJ and the "Textbook Corpus", this report covers practical research related to the creation and use of word and kanji lists useful for Japanese language policy and education. The downloads further below also contain examples of research on the "BCCWJ principal word list", "Textbook Corpus word list", "School and Societal contrastive word list", "Educational subject-specific word list", and the "NDC genre-specific kanji frequency list".
2011
Special "Japanese corpus" language policy research group.
A list for comparing word frequencies and lexical levels across the BCCWJ's fixed-length samples ("Library books", "Published books", "Magazines", and "Newspapers") and its variable-length samples ("Yahoo! Answers" and "Yahoo! Blogs").
2011
Special "Japanese corpus" language policy research group.
A complete lexical listing of the "Textbook corpus", which is made up of textbooks for all subjects and grade levels used in primary, middle, and high schools in 2005. Because the grade level and subject of every textbook are known, it is also possible to find the frequencies of words as they are used in textbooks for each subject and grade level. The list also allows comparison with the BCCWJ's collection of fixed-length samples from library books.
2011
Special "Japanese corpus" language policy research group.
This list allows comparison between the words from the middle- and high-school textbooks in the "Textbook corpus" discussed above and a subset of words from the principal BCCWJ that are thought to be used very frequently. Words commonly associated with school and words used often in society can thus be compared. The word classification numbers from NINJAL's "Word List by Semantic Principles, revised edition" are also included. In the "Integrated edition", words with several classification numbers (e.g. polysemes) are treated as single lexical items. The PDF edition available below is designed to make this easy to view.
2011
Special "Japanese corpus" language policy research group.
The "Divided edition" contains the same lexical information as the "Integrated edition" above, but words that have multiple classification numbers are treated as separate lexical items, one for each number.
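The difference between the two editions can be pictured with a minimal sketch. The `Entry` type, the function names, the placeholder words, and the classification numbers here are all hypothetical and are not taken from the actual lists; the sketch only shows the row layout, not how frequencies are assigned.

```python
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    """A word and its classification numbers (hypothetical structure)."""
    word: str
    class_numbers: Tuple[str, ...]


def integrated_rows(entries: List[Entry]) -> List[Tuple[str, str]]:
    """One row per word; all classification numbers stay on that row."""
    return [(e.word, "/".join(e.class_numbers)) for e in entries]


def divided_rows(entries: List[Entry]) -> List[Tuple[str, str]]:
    """One row per (word, classification number) pair."""
    return [(e.word, n) for e in entries for n in e.class_numbers]


# Placeholder data: word_a stands in for a polyseme with two numbers.
entries = [
    Entry("word_a", ("1.1000", "2.2000")),
    Entry("word_b", ("1.3000",)),
]

for row in integrated_rows(entries):
    print("integrated:", row)
for row in divided_rows(entries):
    print("divided:   ", row)
```

With this placeholder data the integrated layout has two rows and the divided layout has three, because `word_a` is listed once per classification number.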
2011
Special "Japanese corpus" language policy research group.
Based on the "Textbook corpus" above and fixed-length samples from library book sources, a list of words specific to different scholastic subjects was compiled. This list summarizes the words particular to each subject in middle- and high-school curricula.
2011
Special "Japanese corpus" language policy research group.
This list contains the frequencies of individual kanji characters in each of the 10 genres defined by the Nippon Decimal Classification (NDC). It gives a general overview of the kanji found in each genre.
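As a rough illustration of what a genre-specific kanji frequency table involves, the sketch below counts kanji per genre from labeled text samples. It is not the procedure used to build the published list: the function name and the sample texts are invented for illustration, and the regular expression covers only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), which is a simplifying assumption.

```python
import re
from collections import Counter, defaultdict

# Assumption: only the basic CJK Unified Ideographs block counts as kanji.
KANJI = re.compile(r"[\u4e00-\u9fff]")


def kanji_frequencies_by_genre(samples):
    """Count kanji occurrences per genre.

    samples: iterable of (genre_label, text) pairs, where genre_label is
    a top-level NDC class digit given as a string ("0" to "9").
    """
    counts = defaultdict(Counter)
    for genre, text in samples:
        counts[genre].update(KANJI.findall(text))
    return counts


# Invented sample texts; "4" is natural sciences and "9" is literature in NDC.
samples = [
    ("4", "自然科学の本"),
    ("9", "日本文学の本"),
    ("9", "文学"),
]

freq = kanji_frequencies_by_genre(samples)
for genre in sorted(freq):
    print(genre, freq[genre].most_common(3))
```

Hiragana such as の is skipped by the pattern, so only kanji contribute to each genre's counts.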