The use of corpus-based statistical methods in humanities and literary research is expanding. These methods can be applied in stylistics, literary criticism, and comparative literature. Identifying patterns of language change across language varieties, and examining how language is similar or different across linguistic contexts, is important for linguistic knowledge. The main question of this research is what lexical and syntactic differences exist among four registers of contemporary Persian, and how these differences can be analyzed and explained. For this purpose, four corpora of literary, news, scientific, and legal language were created and labeled. Counts and statistics were computed with software programs to obtain quantitative results. Finally, these results were examined and analyzed in light of situational context. The findings show that some linguistic features differ significantly across registers. For example, verbs, pronouns, and adverbs occur clearly more frequently in the literary register than in the other registers, and adjectives occur clearly more frequently in the scientific register. Taken together, these characteristic features can serve as a criterion for differentiating linguistic varieties.
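The counting step described above can be illustrated with a minimal sketch. It assumes each corpus is already available as a list of (token, part-of-speech tag) pairs. The function name `pos_relative_frequencies`, the tag labels, and the toy tokens are hypothetical and are not taken from the study; the study's actual software and tag set are not specified here.

```python
from collections import Counter


def pos_relative_frequencies(tagged_tokens):
    """Return each POS tag's share of all tokens in one corpus."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


# Hypothetical toy data (transliterated tokens), for illustration only.
# These are NOT the study's corpora or results.
corpora = {
    "literary": [("u", "PRON"), ("aram", "ADV"), ("raft", "VERB"), ("khane", "NOUN")],
    "scientific": [("ravesh", "NOUN"), ("amari", "ADJ"), ("daghigh", "ADJ"), ("ast", "VERB")],
}

# One relative-frequency profile per register.
profiles = {
    register: pos_relative_frequencies(tokens)
    for register, tokens in corpora.items()
}

for register, profile in profiles.items():
    print(register, profile)
```

Relative rather than raw frequencies are used in this sketch so that corpora of different sizes remain comparable; whether a difference between registers is statistically significant would need a separate test, which is not shown.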
Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.