This project is one of the subprojects under the integrated project entitled “Standards for Characters, Words and Sentence Patterns used in Chinese for Primary and Secondary Schools.” The purpose of the integrated project is to build a corpus of textbooks and other reference readings for primary and secondary school students, on which each of the other five subprojects will draw. The aim is to sort the Chinese characters, words, and sentence patterns used at each stage by frequency, combined where appropriate with the suggestions of experts and scholars. The intention is to provide reference criteria for introducing characters, words, and sentence patterns step by step, so that learners do not meet great difficulty. The results of this project would give textbook editors indicators for selecting characters and words suited to learners at different stages. In addition, they would offer curriculum designers reference samples should the curriculum be revised in the future.
Subprojects two to six will use the corpus to conduct their analyses. To make that analysis efficient and fast, subproject one is mainly responsible for developing the system tools that the other subprojects need for analysis and statistics as they build basic knowledge of characters, words, and sentence patterns. The system tools include corpus construction technology; domain-specific word segmentation tools; generation of character, word, and sentence pattern lists; and comparison of list contents. To achieve this goal, the project will be carried out mainly through system implementation and document analysis.
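As an illustration only, list generation and list comparison for characters might be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual tooling: the function names `char_frequency_list` and `compare_lists` are hypothetical, the sample sentences are invented, and a "character" is assumed to be any code point in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF). Word and sentence pattern lists would additionally need a word segmentation step, which is not shown.

```python
from collections import Counter


def char_frequency_list(text):
    """Return (character, count) pairs, most frequent first.

    Assumption: only code points in U+4E00..U+9FFF count as characters,
    so punctuation, digits, and Latin letters are ignored.
    """
    counts = Counter(ch for ch in text if "\u4e00" <= ch <= "\u9fff")
    # Sort by descending count; break ties by code point for a stable list.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def compare_lists(list_a, list_b):
    """Compare the character inventories of two frequency lists."""
    set_a = {ch for ch, _ in list_a}
    set_b = {ch for ch, _ in list_b}
    return {
        "only_a": sorted(set_a - set_b),
        "only_b": sorted(set_b - set_a),
        "shared": sorted(set_a & set_b),
    }


# Hypothetical sample texts standing in for two stages of the corpus.
lower_stage = char_frequency_list("我們上學。我們讀書。")
upper_stage = char_frequency_list("我們研究學問。")
comparison = compare_lists(lower_stage, upper_stage)

for key, chars in comparison.items():
    print(key, "".join(chars))
```

Keeping generation and comparison as separate steps means the same comparison could, in principle, be reused for word or sentence pattern lists once those have been produced.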