Showing 20 results for Corpus
Tahereh Taremi, Masood Ghayoomi,
Volume 0, Issue 0 (2-2024)
Abstract
The study of scientific articles, as the main genre of scientific production and an important means of information exchange among members of the scientific community, has received increasing attention during the past few decades. In scientific discourse, textual structure and coherence require writers to use various meta-discourse markers, both interactive and interactional, together with appropriate strategies. The current research studies the category of interactive meta-discourse markers based on Hyland's model. We use a corpus-based approach to analyze Persian scientific research articles in the humanities and to determine the importance and role of interactive meta-discourse elements in Persian scientific papers.
For this purpose, we randomly select and analyze 800 abstracts of scientific research articles from 16 fields of the humanities, drawn from the Comprehensive Portal of Humanities. The data show how heavily the texts rely on meta-discourse: approximately one interactive meta-discourse marker occurs in every 15 words. The analysis also indicates that frame markers are the most frequent interactive meta-discourse markers in the corpus, with transitions and code glosses ranking next, only slightly behind frame markers. Endophoric markers and evidentials have the lowest frequency in the corpus. Finally, suggestions and corrections are offered to make Hyland’s concept more compatible with the discourse features of Persian scientific articles.
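The reported density (about one interactive marker in every 15 words) is a ratio of token count to marker count. The following is a minimal sketch of that calculation, assuming a hypothetical marker list and an English stand-in sentence; the study's actual Persian marker inventory and tokenization are not given in the abstract.

```python
import re

# Hypothetical marker inventory, for illustration only; the study's actual
# Persian marker list is not reproduced in the abstract.
MARKERS = {"first", "however", "namely", "finally"}

def words_per_marker(text, markers=MARKERS):
    """Return how many words occur per marker (15.0 means 1 marker in 15 words)."""
    tokens = re.findall(r"\w+", text.lower())
    hits = sum(1 for tok in tokens if tok in markers)
    if hits == 0:
        return float("inf")  # no markers found in the text
    return len(tokens) / hits

sample = "First we collect the data however the sample is small"
print(words_per_marker(sample))
```

Multi-word markers would need phrase matching rather than the single-token lookup used here.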
Mohammad Dahaghin, Gholamhosein Gholamhosein Zade, Zeinnab Saberpour,
Volume 0, Issue 0 (2-2024)
Abstract
The use of corpus-based statistical methods in research in the humanities and literature is expanding. These methods can be used in studies of stylistics, literary criticism and comparative literature. Finding patterns of language change across language varieties and investigating similarities and differences of language across linguistic contexts are very important from the point of view of linguistic knowledge. The main question of this research is what the lexical and syntactic differences between four registers of contemporary Persian are and how they can be analyzed and explained. For this purpose, four corpora of literary, news, scientific and legal language were created and labeled. Counts and statistics were computed with software programs and quantitative results were obtained. Finally, these results were examined and analyzed based on situational context. The findings showed that some linguistic features differ significantly across registers. For example, the frequency of verbs, pronouns and adverbs in the literary register and the frequency of adjectives in the scientific register are clearly higher than in the other registers. Taken together, these characteristic features can serve as a criterion for differentiating linguistic varieties.
Masoomeh Fijani, Amirsaeid Moloodi, Saeed Hessampour,
Volume 0, Issue 0 (2-2024)
Abstract
In corpus stylistics, computational tools are used to conduct qualitative and quantitative analyses of electronic corpora of literary works, through which the stylistic components of the texts are identified. This study aimed to determine the stylistic features of the works of Simin Daneshvar and Ebrahim Golestan using a corpus-based approach. For this purpose, the works of these two writers were examined using corpus analysis tools, including keyword and concordance analysis in the AntConc software. After the positive keywords in these works were extracted, each keyword was examined in its real context in the concordance menu, and the keywords were classified by semantic domain. The examination and comparison of the positive keywords showed that the semantic domains of "social behavior, work and profession, state and grammar" are common to the works of both writers. From a stylistic perspective, this commonality can be related to similar social norms and behaviors, as well as to the similar time and place in which the two writers lived and grew up. Golestan's works are writer-centered, while Daneshvar's works are reader-oriented. Daneshvar's story characters are much more numerous (14 names with a frequency of 504) than Golestan's, who used only 3 names with a frequency of 107. Golestan's stories make no reference to the semantic domain of religion, while Daneshvar's stories do address this domain. Daneshvar establishes a greater connection with the characters in her stories by mentioning specific individuals.
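Positive keywords are words that are unusually frequent in a target corpus relative to a reference corpus. The sketch below illustrates the idea with the log-likelihood (G2) statistic, a common keyness measure. The abstract does not say which statistic or threshold was used, so the choice of G2, the 3.84 threshold (the chi-square critical value for p < 0.05 at one degree of freedom) and all counts are assumptions for illustration.

```python
import math

def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood (G2) keyness of one word across two corpora."""
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_target, expected_target),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

def is_positive_keyword(freq_target, size_target, freq_ref, size_ref,
                        threshold=3.84):
    """True if the word is relatively overused in the target and G2 > threshold."""
    overused = freq_target / size_target > freq_ref / size_ref
    g2 = log_likelihood(freq_target, size_target, freq_ref, size_ref)
    return overused and g2 > threshold

# Hypothetical counts: 120 hits in 50,000 target tokens
# versus 40 hits in 50,000 reference tokens.
print(is_positive_keyword(120, 50_000, 40, 50_000))
```

Words that pass such a filter would then be inspected in context through a concordance, as the abstract describes.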
Niloofar Hesami, Shahram Modares Khiyabani,
Volume 4, Issue 3 (10-2013)
Abstract
Ellipsis is a frequent phenomenon that occurs at different linguistic levels. Although it does not usually affect communication, there are times when ellipsis leads to ambiguity and misunderstanding. In TV football reports, because of the context and the audience's prior familiarity with such matches, the reporter continuously omits linguistic units. This study deals with the nature and amount of ellipsis in TV football reports. The corpus includes the last 15 minutes of 12 TV football reports by four famous reporters. Data analysis covers the nature and amount of ellipsis in these reports based on Safavi (1390). In addition, ellipsis of linguistic categories is studied based on the model proposed by Halliday and Hasan (1976, 1987). In this study, ellipsis is categorized into four types: ellipsis leading to ambiguity; ellipsis with no effect on communication, which Safavi (1390) calls semantic reduction; ellipsis that distinguishes speech from writing; and ellipsis of optional linguistic units. The study shows that over half of the instances of ellipsis do not lead to any misunderstanding and, because of the nature of football reports, result in some sort of semantic reduction. On the other hand, the types of ellipsis found at the syntactic level differ considerably from those of Halliday and Hasan (1976, 1987), who categorize ellipsis into three types: verb phrase, noun phrase and clause ellipsis. Finally, the study shows that the classic view of ellipsis, which divides it into text-based and context-based ellipsis, cannot explain instances of ellipsis in TV football reports. The present research categorizes the ellipsis types in TV football reports and shows the failure of the classic view in explaining these instances of ellipsis.
Sajjad Asgari Matin, Ali Rahimi,
Volume 4, Issue 4 (12-2013)
Abstract
The multidisciplinary analysis of the relationship between language and law has been in the spotlight for many linguists over the last two decades. Forensic linguistics attempts to describe and, where possible, explain the features that distinguish the language used in legal settings from everyday language. Furthermore, discourse analysis can be applied in a wide variety of settings and contexts. The purpose of this paper is to outline the theory and practice of forensic discourse analysis as a tool for interpreting and analyzing legal contexts, with a particular focus on legal pragmatics in Persian legal events, so that both researchers in the legal system and forensic linguists can move beyond theory into the practice of discourse analysis in the Persian legal system. In this regard, we focus on legal speech acts based on the theory of Searle (1969). A collection of 20 files issued in legal contexts was analyzed, and the results and applications are discussed.
Ramin Golshaie, Arsalan Golfam, Seyyed Mostafa Assi, Ferdows Aghagolzadeh,
Volume 5, Issue 1 (3-2014)
Abstract
We examined two assumptions of "Conceptual Metaphor Theory" (CMT) using a corpus-based method. According to the first assumption, linguistic metaphors are merely reflections of conceptual metaphors, so linguistic metaphors have a marginal and secondary role. According to the second assumption, conventional linguistic metaphors are systematic. A 50-million-token sample of the Hamshahri collection of Persian texts was selected as the corpus of the study. All corpus analyses, namely calculating collocations and extracting concordances, were carried out using the AntConc corpus software. Data analysis failed to find evidence in support of the first assumption of CMT, but the second assumption was partially confirmed. The findings suggest that the semantic patterns of linguistic metaphors are more complex than those predicted by CMT, and that language-use factors play an undeniable role in shaping the semantics of metaphoric expressions.
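Extracting concordances means listing every occurrence of a search word together with its surrounding context (keyword in context, or KWIC). The sketch below shows the idea in plain Python on an English stand-in sentence. It is not AntConc's implementation, and the window size is an arbitrary assumption.

```python
def concordance(tokens, keyword, window=3):
    """Return (left context, keyword, right context) for each hit of `keyword`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines

# English stand-in text; the study's corpus is Persian.
tokens = "prices rose sharply and then prices fell again".split()
for left, kw, right in concordance(tokens, "prices", window=2):
    print(f"{left:>20} [{kw}] {right}")
```

Reading the aligned context lines is what lets an analyst judge whether a given occurrence is metaphorical.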
Reza Kheirabadi,
Volume 5, Issue 4 (12-2014)
Abstract
The “Persian Gulf”, a very important geopolitical region known as the global heartland, is the third great gulf of the world. There have been some controversies over the name of this region in recent decades, and recently some Persian Gulf Arab states, and sometimes American and European institutes, have tried to coin the fake name “Arab Gulf” instead. In this paper, after reviewing the literature and historical and international documents, we study the naming strategy of international media toward this important geographical entity. Within the Critical Discourse Analysis (CDA) framework, we compare the frequency, genre and content of the articles and news items in which the four referring expressions “Arabic/Arab/Arabian/Persian Gulf” have been used. The data are gathered from the Time magazine archive (1923-2008) and the Corpus of Contemporary American English (COCA) (1990-2012).
The findings show that the term “Persian Gulf” is used considerably more often than the other three terms: there are 969 occurrences of “Persian Gulf” in the Time magazine archive and 5003 in COCA, while the figure is around a hundred for all three of the coined terms.
The results further show that while “Persian Gulf” is widely used as the default name regardless of context, the term “Arab Gulf” is mostly used in economic contexts, especially in news and articles about oil.
Reza Ghafar Samar, Mohsen Shirazizadeh, G. Reza Kiany,
Volume 6, Issue 4 (10-2015)
Abstract
The present article takes a critical and analytic look at various dimensions of studying vocabulary in academic texts, thereby providing a clear overview of the requirements, methods and challenges of this line of inquiry. The main focus of the article, however, is to draw attention to the paucity of corpus-informed research on Persian academic texts as well as on the linguistic output of Persian speakers in other languages. In the first section, a holistic picture of the significance of learning academic vocabulary is drawn. Then, some academic word and phrase lists and some academic corpora are briefly introduced. In the next section, different aspects that should be taken into consideration in this type of research (e.g. collocation, lexical bundles, intra- and inter-text lexical variation) are elaborated, and some of the precautions researchers should take are discussed. In the final section, some of the challenges and limitations of this type of research are mentioned and a scheme of the ecology of “studying academic vocabulary” is given. The scheme is meant to act as a synoptic road map for interested researchers who are at the beginning of their academic careers.
Seyed Mohammad Hosseini-Maasoum, Maryam Ghiasian, Belgheis Roshan, Ashraf Sadat Shahidi,
Volume 7, Issue 1 (3-2016)
Abstract
Language change is an inseparable feature of the inherent nature of every language. This change is so slow and subtle that it becomes tangible for native speakers only after a long time and in comparison with the past. A diachronic view of the language is especially useful here. The present research examines the process by which the (negative or positive) semantic prosody of some presently neutral Persian verb compounds turns into connotation. To this end, studies on semantic prosody, connotation and their transformation in different languages, especially English, are reviewed, and the same trend is traced in some verb compounds in Persian. Two corpora from two different historical periods (12th century and modern Persian) were compiled, and the semantic prosody of seven verb compounds was established in both. The results show that the semantic prosody of some of these compounds has changed from positive to negative over time, and that this negative semantic prosody in some of the compounds, especially mojeb shodan (cause), is changing into negative connotation.
Fateme Yegane, Azita Afrashi,
Volume 7, Issue 5 (11-2016)
Abstract
The present research surveys orientational metaphors in the Quran from a cognitive perspective. Space and orientation in space are basic cognitive domains employed as source domains for conceptual metaphors. The research aims to explore the target-domain concepts formed on the basis of orientational concepts. To this end, the “Noor software” was searched with seven orientation-marking keywords. All the verses containing these keywords were identified in the Quran, and 60 instances of metaphorical use of these items were recognized. Some of the most prominent abstract concepts formed through orientational metaphors in the Quran are “degree and dignity; bliss; superiority and advantage”, among others. The findings show that the special use of orientational metaphors in the Quran is a stylistic and semantic feature.
Keywords: Quran; Conceptual metaphor; Orientational metaphor; Corpus
Javad Zare, Abbas Eslami-Rasekh, Azizollah Dabaghi,
Volume 7, Issue 7 (3-2016)
Abstract
Academic lecturing has turned into the major teaching method in higher education. Given the excess of verbal and visual information presented in a lecture and the importance of some of this information for the final assessment of a course, an understanding of how unimportant information is marked in lectures is useful. The present study investigates how lecturers mark unimportant information in Persian academic lectures. More specifically, it examines the discourse functions of markers of lesser importance. Using a mixed-methods approach, markers of lesser importance were extracted from the transcripts of 60 academic lectures in the Persian SOKHAN corpus. The extracted markers were then analyzed in terms of their discourse functions. Five discourse functions were found: discourse organization, audience engagement, subject status, topic treatment, and relating to exam. Topic treatment, followed by subject status, accounted for most of the discourse functions of the markers of lesser importance, whereas audience engagement, discourse organization, and relating to exam were the least frequent. On the whole, the findings suggest that marking lesser importance does not necessarily involve orientation to the audience or organizing the discourse into points and asides. Instead, it most often involves expressions that explicitly or implicitly demarcate boundaries between what the lecturer wishes to talk about, does not intend to go through, or tends to cover briefly.
Ramin Golshaie,
Volume 10, Issue 3 (7-2019)
Abstract
The problem of identifying anonymous authors has engaged human attention through the ages. Today, with the revolution brought about by digital computing and electronic corpora, and with the applications made available by stylometry research in forensic linguistics, the systematic analysis of texts in different languages has expanded researchers' understanding of different aspects of linguistic style.
The present study investigates the possibility of authorship attribution based on idiolect in Farsi. One of the linguistic elements claimed to be the seat of idiolect is function words. Function words have been a focus of authorship attribution research because they have been shown to be processed unconsciously, to have high frequency in texts, and to remain independent of text topic. This paper studies whether texts written by different authors can be differentiated using Farsi function words. The research questions were: 1) Are Farsi function words capable of differentiating authors in Farsi prose? 2) Of monograms, bigrams, and trigrams, which is the most efficient in differentiating author styles? 3) What is the minimum cut-off point for successful differentiation of author styles in Farsi?
First, a corpus of five Iranian scholars’ writings was compiled, normalized and divided into sample texts. Then the 20 most frequent words were extracted from the author samples, and n-gram sequences (up to trigrams) were analyzed using principal component analysis and cluster analysis in the Stylo package for R.
The findings showed that function words in Farsi were capable of differentiating authors’ writings, with monograms performing better than bigrams and trigrams in small samples. The findings also indicated that, under the experimental conditions used in this study, the minimum length for a text to be successfully attributed to an author is about 4000 words. This cut-off point is reached using the 20 most frequent function words. It is concluded that different authors do not use function words in the same manner: while some high-frequency function words appear in the writings of all authors, they are given different priorities by different authors.
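The feature-extraction step behind such an analysis can be sketched as follows: take the n most frequent words across all samples and describe each sample by the relative frequency of those words. This Python sketch is only an illustration of that step under assumed toy data. The study itself used the Stylo package in R, and the sample names and English stand-in texts here are hypothetical.

```python
from collections import Counter

def mfw_profile(samples, n=20):
    """Relative frequencies of the n most frequent words (pooled) in each sample."""
    pooled = Counter(tok for toks in samples.values() for tok in toks)
    top = [word for word, _ in pooled.most_common(n)]
    profiles = {}
    for name, toks in samples.items():
        counts = Counter(toks)
        profiles[name] = [counts[word] / len(toks) for word in top]
    return top, profiles

# Hypothetical English stand-in samples; the study's samples are Farsi prose.
samples = {
    "author_a_1": "the cat and the dog and the".split(),
    "author_b_1": "the bird and a cat".split(),
}
top, profiles = mfw_profile(samples, n=2)
print(top, profiles)
```

The resulting frequency vectors are the kind of input that principal component analysis or cluster analysis would then operate on.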
Maryam Zarei, Alireza Khormaee, Amirsaeid Moloodi,
Volume 10, Issue 5 (11-2019)
Abstract
Introduction
The dependents of the verb are among the most debated subjects, and a considerable body of research has been devoted to them. Yet researchers have consistently held diverse opinions about their real identities. The complement, as one of the dependents of the verb, is in the same boat. Some scholars have differentiated obligatory complements from optional ones, while others consider complements to be obligatory elements and do not recognize an optional category. This article, based on Langacker’s (1987, 2013) Cognitive Grammar and using a corpus-based method, seeks to find out whether the Persian corpus verifies the existence of optional complements and, if not, in what category we can place what is normally called an optional complement. In other words, this research seeks answers to the following questions: Are there any optional complements besides obligatory ones, based on Persian corpus data as well as Langacker’s Cognitive Grammar? If complements are solely obligatory, how can one categorize those elements called optional complements?
Methodology
To answer the above-mentioned questions, four dependents (subject, object, source and goal) of four salient motion verbs (raftan 'go', āmadan 'come', āvardan 'bring' and bordan 'take') in Persian were chosen for study. To this end, 300 tokens of each salient motion verb, along with their dependents and the related linguistic context, were randomly selected from the Hamshahri 2 corpus to observe their behavior in the corpus.
Discussion
Langacker (1987, 2013) distinguishes three types of dependents for heads, including verbs: “complements”, “modifiers” and “adjuncts”. He defines a complement as “a component structure that elaborates a salient substructure of the head. The head is thus dependent, and the complement is autonomous” (Langacker, 2013: 203). Conversely, a modifier is “a component structure that contains a salient substructure elaborated by the head. In this case the head is autonomous, and the modifier is dependent” (Langacker, 2013: 203). Finally, “a component structure which fails to either elaborate the head or be elaborated by it is called an adjunct” (Langacker, 2013: 205).
Regarding the four dependents of the salient motion verbs under study, subjects and objects are complements since they elaborate the salient substructures of the verbs. Subjects elaborate the schematic trajectors of the verbs, and objects elaborate their schematic landmarks. The verb is therefore, to a great extent, dependent on the subject and the object to complete its meaning. This high conceptual dependence of the verb brings about its syntactic dependence too; as a result, complements are obligatory and must always accompany the verb. The behavior of the complements (subjects and objects) in the corpus verifies this: among the 300 tokens of each verb in Persian, there was not a single sample in which the subject or the object was absent. Goals and sources, which tend to be considered optional complements in canonical views of Persian grammar, are modifiers under Langacker’s Cognitive Grammar, since the motion verb elaborates their schematic trajector, which is a schematic process denoting motion. As a result, they are conceptually dependent on the motion verbs and hence are modifiers.
Conclusion
The behavior in the corpus of subjects, objects, goals and sources as the dependents of the four salient motion verbs under study leads to the following conclusions:
- Complements are solely obligatory elements since they elaborate the schematic trajectors or landmarks of motion verbs; thus, motion verbs are so conceptually dependent on the complements that they can never appear without them, and as a result they become syntactically dependent on the complements as well. Sources and goals, on the other hand, are modifiers that are dependent on motion verbs to elaborate their schematic trajectors. Therefore, the relation that exists between the complement and the verb also exists between the modifier and the verb, but in the reverse direction.
- Although sources and goals are both modifiers under Langacker’s Cognitive Grammar, the results of the study show a goal-over-source preference. The frequency of goals is much higher than that of sources, and the result of the Chi-square test indicates a significant difference between the presence of these two elements with salient motion verbs (p < 0.05). This result aligns with Stefanowitsch and Rohde (2004), Kabata (2013) and Verkerk (2014).
- Although sources and goals are distributed asymmetrically, neither of them is an optional element. Their behavior in the text corpus shows that the presence of these modifiers is determined by the context, i.e. if the context needs them, they have to appear, and if not, they do not appear. For that reason, sources and goals are contextually obligatory and can be called “contextual supplements”.
Studying adjuncts in the corpus shows that they are not optional either. These elements, too, have to be present if the context requires them, and they are absent if the context does not call for them. So adjuncts, on a par with modifiers, are contextually obligatory and are termed “contextual supplements” in this study. Based on the results of the analysis of the Persian text corpus, it seems that Langacker’s three-way division of dependents (i.e. complements, modifiers and adjuncts) does not match the behavior of these dependents in the corpus.
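The goal-over-source comparison reported above rests on a Chi-square test. The abstract gives neither the counts nor the exact test design, so the sketch below assumes a one-degree-of-freedom goodness-of-fit test against an equal split and uses hypothetical counts. For one degree of freedom the upper-tail p-value has the closed form erfc(sqrt(chi2 / 2)), which keeps the example within the Python standard library.

```python
import math

def chi_square_equal_split(count_a, count_b):
    """Chi-square goodness-of-fit test (1 df) against an equal split of two counts."""
    expected = (count_a + count_b) / 2
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom, the upper-tail p-value is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts, NOT the study's data: 180 goal phrases vs. 60 source phrases.
chi2, p = chi_square_equal_split(180, 60)
print(f"chi2 = {chi2:.2f}, significant at 0.05: {p < 0.05}")
```

With real data, the two counts would be the number of sampled verb tokens accompanied by a goal and by a source respectively.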
Morteza Taghavi, Mohammadreza Hashemi,
Volume 13, Issue 1 (3-2022)
Abstract
One of the crucial topics discussed in descriptive translation studies is that of Translation Universals (TUs), which addresses the typical, salient features of translational language that distinguish it from other linguistic variants. Given the differences between languages, the key question is whether the purported universal features (mainly formulated on the basis of European languages) exist in non-European languages that have been little investigated or not investigated at all. Employing Chesterman’s categorization of universals into ‘S-universals’ and ‘T-universals’, the present study examines the latter, less investigated group. A comparable corpus of original and translated Persian expository texts was built to investigate two T-universals, namely simplification and explicitation. In the light of the linguistic features of translational Persian obtained, the present study challenges the purported universals, as none of the extracted features were in line with the propositions of previous studies.
1. Introduction
One of the most significant topics within Descriptive Translation Studies (DTS) is Translation Universals (TUs), first clearly articulated by Mona Baker in her work (1993). TU hypotheses are concerned with typical linguistic features that make translational language different from other linguistic variants. According to Hansen and Teich (2001), “it is commonly assumed in translation studies that translations are specific kinds of texts that are different not only from their original source language (SL) texts, but also from comparable original texts in the same language as the target language (TL)” (p. 44). Such claims can be examined either manually or by means of corpus-based analytical tools. Corpora have been a reliable and popular tool among researchers since the convergence of corpus-based empirical methodology and linguistic studies, including Translation Studies, during the 1990s.
Over the last three decades, many studies have been conducted on theories of TUs and evidence of specific linguistic features of translational language has been provided, but almost all of these studies have been on Western languages, especially English. Chesterman (2004) divides the TUs into "S-universals" and "T-universals." The first category refers to "universal differences between translations and their source texts" (ibid, p. 39) and the second category refers to the differences in the linguistic features of translations (target texts) as compared to non-translated, native TL texts. Although some TUs may usually be investigated within one category, some can be examined from the perspective of both groups.
Some scholars regard the so-called TUs as an inseparable part of any translational language, in line with the theories presented in the translation studies literature. It should be noted, however, that if a linguistic feature is to be considered a "universal," it should be found in translations into all languages, whereas in fact almost all the literature on TUs is devoted to research on Western languages, especially English. Only a few studies (e.g. Xiao & Hu, 2015) have examined universals in non-European languages. In addition, research on S-universals outweighs work on T-universals. Hence, claims about the existence of ‘universal’ features, in the strict sense of the word ‘universal’, are highly debatable unless they are scrutinized in other languages, especially those that differ from English in terms of word order, syntactic structures, stylistic features and the like. The hypothesis of the present study is that, due to the differences between Persian and other Indo-European languages, including English, in those aspects (word order, syntactic structures, stylistic features, etc.), the claimed universal features shown in the other languages are not present in Persian, at least not with the same quality.
The investigation of TUs in Persian has also been largely neglected and suffers from some drawbacks. Some of these drawbacks are due to research methods (such as manual investigation without the benefit of corpus investigation tools) and some are related to the limitation of data to literary texts and novels. Another limitation of such studies on Persian is that they compare source with target text(s) (addressing only S-universals using parallel corpora) and neglect T-universals (which are examined using comparable corpora). The present study examines the salient and distinctive linguistic features of translational Persian using corpus methodology, comparing translated texts with comparable original writings and thus shedding light on the existence of the claimed universal features in Persian. To this end, the T-universals of simplification and explicitation were selected, and specific linguistic features that can signal their presence were examined. As an instance of non-literary writing, a medium-sized corpus was analysed, consisting of two sets of expository academic and general humanities texts, namely philosophy and texts about literature (such as general informative texts about literature, review and critique of literary works, etc.).
2. Background
The search for TUs dates back to the mid-nineties, when the topic led to a surge of interest among researchers, especially since the emergence of corpora as a research tool in Translation Studies. The existing literature on TUs shows that research on S-universals outweighs studies on T-universals. Simplification and explicitation are the two T-universals investigated in the present study.
A number of studies have examined simplification as a universal feature of translation at the lexical, syntactic and stylistic levels (e.g. see Laviosa-Braithwaite, 1996; Malmkjær, 1997; Laviosa, 1998; Cvrček & Chlumská, 2015). A closer look at these studies reveals some disagreements. For example, regarding mean sentence length, Laviosa (1998) (English), Xiao and Yue (2009) (Chinese), and Ilisei et al. (2009) (Spanish) showed that mean sentence length in translational language is significantly higher than in original writings. Contrary to these three studies, however, Malmkjær (1997) argued that stronger punctuation may result in shorter sentences in translated texts. Also, Xiao (2010) and Xiao and Hu (2015) found that sentences in original Chinese are relatively longer than those in translated texts, although this difference was not significant.
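Mean sentence length, the simplification indicator discussed above, is the total number of word tokens divided by the number of sentences. The sketch below computes it with a deliberately naive splitter on sentence-final punctuation. The splitter, the whitespace tokenization and the English stand-in texts are assumptions for illustration, and real studies would use a dedicated sentence segmenter.

```python
import re

def mean_sentence_length(text):
    """Mean number of word tokens per sentence, splitting on . ! ? and Persian ؟."""
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]
    if not sentences:
        return 0.0
    lengths = [len(s.split()) for s in sentences]
    return sum(lengths) / len(lengths)

# Toy stand-in texts, not data from any of the cited studies.
original = "Short one. Another short sentence here."
translated = "This translated sentence is noticeably longer than the others."
print(mean_sentence_length(original), mean_sentence_length(translated))
```

Comparing the two values across matched subcorpora, together with a significance test, is what the cited studies report.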
A number of other studies have further demonstrated evidence for explicitation or the tendency in translational language to make explicit what has been implicit in the source text, thus making it different from original writings (Blum-Kulka, 1986; Toury, 1991; Baker, 1996; Øverås, 1998; Olohan & Baker, 2000; Xiao, 2010). Although this feature is found in translations at different lexical, syntactic, and textual levels, "there is variation even in these results, which could be explained in terms of the level of language studied, or the genre of the texts" (Mauranen, 2007, p. 39). There is still a long way to go to determine whether explicitation is a universal feature or not, as most of the data in the literature is based on Western languages, especially English.
Much of the criticism that the topic of TUs has attracted relates to the fact that most studies have only focused on Western languages and failed to move beyond and scrutinize others, a fact that is also reflected in the small body of literature on Persian. What we know about the possible presence of TUs in translational Persian is mostly based on studies that were carried out on S-universals and are limited in one way or another (e.g. Ghamkhah & Khazaee Farid, 2011; Salimi & Askarzadeh Torghabeh, 2015; Vahedi Kia & Ouliaeinia, 2016; Ahangar & Rahnemoon, 2019). In general, these limitations can be classified into seven categories:
- direction being restricted to comparison of source with target text(s) (addressing only S-universals)
- lack of variety in the source language (almost all studies feature English as the source)
- universals (all studies are on the four recurrent features of translation proposed by Baker (1996))
- genre (almost all studies focus on literary texts)
- size (very small-scale studies, mainly on selected parts of one or two books)
- methodology (using manual investigation and not benefiting from corpus investigation tools)
- source of data collection (all data were collected from books and published works, ignoring online translated materials available)
3. Corpus Design and Method
In the present study, a comparative corpus was used which includes two subcorpora: original Persian texts and English-Persian translated texts. Each component consisted of one hundred extracts, each of 3000-word length, taken randomly from books and webpages, thus amounting to 300,000 words for each subcorpus and 600,000 words on the whole. The current literature on Persian language has failed to move beyond literary texts. Contrary to the predominance of studies on European languages and small-sized corpus-based works on Persian, all confined to literary texts and books, the data for the present research was collected from books and webpages on two non-literary fields in Humanities, philosophy and texts about literature (such as general informative texts about literature, review and critique of literary works, etc.). Finally, both sides of the corpus are comparable in terms of number of samples, size, genre and sampling period.
After collecting each sample, a header was assigned to it. For samples collected from books, this header contains information about the book title and year of publication, and for websites, it includes the title of the text, date of the post and the webpage URL. To normalize the data, we employed Virastyar, a Persian MS-Word add-in. Moreover, for segmentation, tokenization and POS tagging, we utilized tools developed by Mojgan Seraji (2015) for Persian, namely SeTPer (sentence segmenter and tokenizer) and TagPer (POS tagger). In addition, after comparing different corpus analysis tools (namely WordSmith, AntConc, Sketch Engine, and LancsBox), WordSmith was found to be the best and most adaptable software for analyzing Persian texts.
Two universal features of simplification and explicitation were selected for investigation. The presence of universals was identified through a number of features. For simplification, the study used the three signs discussed in Laviosa-Braithwaite (1996) where she concluded that translational language uses lower lexical density, shows less lexical variety, and reports greater mean sentence length. For explicitation, the higher frequency of connectives and cohesive ties in translated than non-translated language was employed (Olohan & Baker, 2000; Chen, 2006).
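The three simplification indicators named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study (which relied on WordSmith and POS-tagged data): the function-word list is a hypothetical English stand-in, lexical density is taken as the share of non-function words among all tokens, and lexical variety is taken as a plain type-token ratio.

```python
import re

# Hypothetical closed list of function words (assumption for illustration);
# a real study would identify lexical words from POS tags instead.
FUNCTION_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def split_sentences(text):
    """Split on sentence-final punctuation (including the Persian question mark)."""
    return [s for s in re.split(r"[.!?؟]+", text) if s.strip()]

def lexical_density(tokens):
    """Share of lexical (non-function) words among all tokens."""
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(lexical) / len(tokens)

def type_token_ratio(tokens):
    """Lexical variety as distinct word forms divided by running words."""
    return len(set(tokens)) / len(tokens)

def mean_sentence_length(text):
    """Average number of tokens per sentence."""
    sentences = split_sentences(text)
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)

sample = "The cat sat. The dog ran in the park."
tokens = tokenize(sample)
print(lexical_density(tokens), type_token_ratio(tokens), mean_sentence_length(sample))
```

Because a raw type-token ratio falls as texts get longer, comparisons are only meaningful between samples of equal size, which is one reason equal-length extracts are convenient.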
4. Results
The four different lexical and syntactic features of translational Persian were examined in the corpus under investigation, namely lexical density, lexical variety, mean sentence length, and frequency of connectives. First, regarding simplification, it was found that translational Persian in the comparative corpus used in this study has a higher lexical density, although this difference was minor and was not statistically significant. Also, the lexical variety in translational Persian was greater than non-translational texts. In addition, the study of the mean sentence length showed that sentences in original texts are slightly longer than translated texts. Comparing the two subcorpora, the texts "about literature" show higher lexical density and variety (or richness), and the sentences in philosophical texts were longer, which can be interpreted as field (also genre) variations and their idiosyncratic linguistic features. Finally, regarding explicitation, the total number of connectives was higher in the original texts than in translated texts. However, no clear overall tendency was detected in either subcorpus favoring connectives more than the other. Some connectives were more frequent in translations and some in original texts. Further, some connectives followed no trend as they were more frequent in one field but less frequent in the other.
5. Discussion & Conclusion
The data and findings provide further support for the controversies over the strong version of TU hypotheses and raise intriguing questions regarding the presence of universal features in translations, as none of the results for the four features addressed were in line with previously proposed T-universals. Therefore, the results of this study support the hypothesis that the claimed universal features, due to linguistic differences, are not present in Persian (at least not to the same extent). Contrary to many previous studies (such as the detailed investigation of Ilisei et al. (2009)), features like lower lexical richness and density, greater mean sentence length and higher frequency of connectives might not be among the most salient, universal (at least in the global sense) features indicative of the simplification and explicitation hypotheses. Therefore, it can be concluded that the findings of this study indicate the specific features of translational (from an English source) and original Persian texts. In general, the present study shows that, in contrast to what might be assumed, simplification and explicitation as so-called translation universals may not be truly universal as discussed by Baker (1993) and Eskola (2004), because they are not universally present in all translated texts, at least as far as this research accounts for translational Persian psychology and sociology.
Whereas a number of studies support simplification and explicitation as translation universals, these linguistic features have been challenged by some other studies, especially when language pairs and genres vary and move from the more investigated languages and genres (Western languages, literary texts) to the less investigated ones (non-Western languages, non-literary texts) (Chesterman, 2004; Mauranen, 2007; Xiao & Hu, 2015). It seems that the assumption of the presence of similar linguistic features in all translations needs to be revised. Therefore, it is better to be cautious in presenting such generalizations and to reclassify them under what Eskola (2004) labels local translation law rather than universal translation law. In fact, it should be noted that some of the theories presented have been formulated using only specific language pairs or texts, may not apply to other languages or genres, and should therefore be limited and narrowed down. As shown in this study, neither of the T-universals examined here was present in Persian to the extent indicated by previous research.
Since it is not possible to proceed with any claim about the presence of universal tendencies in translations without validation, further work needs to be done to establish whether TU hypotheses are supported, at least in their current account, in other, especially unexamined, languages and genres. Although the results of the present study did not support any of the hypotheses presented in the previous studies, this may not be a good reason to dismiss the universals altogether. The authors believe that, instead of abandoning the whole possibility of translations displaying common features, we may find, at least, new tendencies that are different from those of the previous hypotheses; for example, simplification in translational language may be universally manifested through features other than lower lexical density or less lexical variety. Nevertheless, the present study indicated that the claim of the existence of "universal" features in its absolute sense (in all languages and text types) is unfounded. Much more research should be done on translational Persian and other non-European languages in order to clarify the validity and nature of TUs and the role of language, text type, translator skills and other intervening aspects involved in the manifestation of certain linguistic features in translations.
Eshrat Saghafi, Azita Afrashi, Mostafa Assi, Abdolhosein Farzad,
Volume 13, Issue 2 (5-2022)
Abstract
This paper investigates the conceptualization of conceptual metaphors of Bravery in contemporary Persian and English prose. The main question of this study is: "How is the concept of bravery, which is one of the target domains of Morality from Kövecses's point of view (2010: 23), constructed and understood in the minds of Persian and English speakers?" To achieve this goal, the authors prepared a corpus of 400 Persian sentences containing the word شجاعت and its synonyms and 400 English sentences containing the word Bravery and its synonyms, drawn from two databases, the Persian Language Database (PLDB) and the British National Corpus (BNC), and examined them through cognitive analysis of the extracted conceptual metaphors. A statistical study of the two corpora showed that, as a prototype, Persian speakers consider Bravery an "object" and English speakers a "property". There are also many common source domains shared by the two corpora: "property", "object", "physical force", "upward direction", "action", "matter" and "human behavior". Although Persian and English have many common conceptual metaphors for the conceptualization of Bravery, there are some differences between them, including source domains specific to each language: the source domain of "path", which belongs to Persian, and the source domain of "show", which belongs to English. The theoretical framework of the present research is based on the conceptual metaphor theory proposed by Lakoff and Johnson (1980) and Kövecses (2015).
1. Introduction
Universality and variation in metaphors across languages have become a main concern of many researchers seeking to uncover the conceptual system of language speakers and consequently to discover the similarities and differences between languages. Lakoff and Johnson (1980, p. 3) mention that the way we think, what we experience, and what we do every day is very much a matter of metaphor. The present study also attempts to investigate the conceptual metaphors of Bravery in Persian and English prose to find the similarities and differences between the two languages.
2. Research question
The main question of this study is: "How is the concept of bravery, which is one of the target domains of Morality from Kövecses's point of view (2010: 23), constructed and understood in the minds of Persian and English speakers?"
3. Hypothesis
The comparison of metaphorical expressions of Bravery in Persian and English prose shows some similarities in the extent to which specific source domains are used.
4. Literature Review
Kövecses (2005, p. 35) explains that "it should come as no surprise that at least some conceptual metaphors can be and are found in many languages. If some kinds of conceptual metaphors are based on embodied experience that is universal, these metaphors should occur – at least potentially – in many languages and cultures around the world".
Lakoff and Johnson (1980) discussed the conceptual metaphor HAPPINESS IS UP in English. Ning Yu (1995, 1998) noticed that Chinese shares with English all the basic metaphor source domains for happiness: UP, LIGHT and FLUID IN A CONTAINER. Chinese additionally has the metaphor HAPPINESS IS FLOWERS IN THE HEART, which English does not have. According to Ning Yu (1998), the application of this metaphor reflects "the more introverted character of Chinese".
5. Methodology
The theoretical framework of the present research is based on the conceptual metaphor theory proposed by Lakoff and Johnson (1980). Lakoff and Johnson (1980, p. 6) argue that human thought processes are largely metaphorical and the human conceptual system is metaphorically structured and defined. Kövecses (2015, p. 17) discusses the construal operations that bear directly on abstract concepts, including abstraction, schematization, attention, perspective, metonymy, metaphor, conceptual integration and differential cognitive styles. Kövecses (2005, p. 9) also believes that metaphor is a many-sided phenomenon that involves not only language, but also the conceptual system, as well as social–cultural structure and neural and bodily activity. This paper also attempts to investigate the conceptualization of the conceptual metaphors of Bravery in the minds of Persian and English speakers by examining contemporary prose in Persian and English. To achieve this goal, the writers prepared a corpus of 800 Persian and English sentences containing the words for Bravery and their synonyms from two databases: the Persian Language Database (PLDB) and the British National Corpus (BNC). Then the writers identified and extracted the relevant conceptual metaphors of Bravery from the corpus. The analysis of the two sets of metaphors reveals that the most frequent source domains for conceptualizing Bravery differ between the two languages: Persian speakers consider Bravery an "OBJECT" and English speakers consider it a "PROPERTY".
The common source domains of Bravery shared by the two groups are as follows: "PROPERTY", "OBJECT", "PHYSICAL FORCE", "UPWARD DIRECTION", "ACTION", "MATTER" and "HUMAN BEHAVIOR". The findings also show some differences between conceptual metaphors which reveal language-specific mappings of Bravery: the source domain of "PATH", which is specific to Persian, and the source domain of "SHOW", which is specific to English.
The findings of the present study support the Embodiment theory of Lakoff (1999) and Kövecses's claim (2005) that the same bodily experiences lead to the same bodily perceptions and conceptions. Thus the universal conceptual metaphors, which arise from bodily experiences, perceptions and conceptions, will be the same all around the world. Nevertheless, the different surrounding environment (culture) sometimes affects and changes these similar universal conceptual metaphors. Kövecses (2005, p. 13) proposes two large groups of causes of metaphor variation: differential experience and the differential application of universal cognitive processes, both of which can create interculturally and intraculturally different metaphors.
Fariba Ghatreh, Nasrin Kheradmand, Badri Sadat Seyedjalali,
Volume 13, Issue 2 (5-2022)
Abstract
Loanwords, as one of the consequences of language contact, can be widely used by native language speakers. The spread of loanwords varies depending on many linguistic and non-linguistic factors. The present study, based on a descriptive-analytical method, aims to investigate the usage of loanwords in spoken Persian from three different perspectives: semantics, pragmatics, and sociolinguistics. For this purpose, 600 minutes of the spoken Persian corpus of Al-Zahra University, including 14000 sentences in Persian for 100 different situations and subjects, used by 240 female speakers and 80 male speakers, have been extracted and studied according to loanwords’ “semantic fields”, “abstraction and non-abstraction”, and “usage frequency” as well as two sociolinguistic variables (“motivation” and “gender” of the speakers). The results of comparing the variety of loanwords and their usage frequency in different semantic fields show that the highest frequency of use belongs to the semantic fields of basic actions and technology, language and speech, and social and political relations. Moreover, research data indicate that loanwords are more related to abstract concepts and phenomena than to concrete ones. The research results, from the sociolinguistic view, also reveal that more women than men use loanwords with a common Persian equivalent. “Filling communication gaps in recipient language” and “social, cultural, political and scientific credibility of donor language” are the most important motivations for Persian speakers to use loanwords in their speech.
- Introduction
Following the contact and exchange between human societies, due to social, economic, historical, geographical, political, and cultural reasons, their languages influence each other and undergo changes. One of the remarkable instances of these interlinguistic changes is the emergence of loanwords which can be widely used in spoken speech. Today, as a result of the expansion of the mass media and the advancement of science and technology, we are witnessing the increasing use of loanwords in spoken Persian, which might have adverse consequences for our language over time.
The usage of loanwords is not limited to a specific context or field of language. Speakers of each language may use different loanwords in their everyday speech, depending on their individual and social needs or motivations. The current study aims to investigate the usage of loanwords in the spoken Persian from three perspectives: semantics, pragmatics, and sociolinguistics.
Thus, the following research questions are raised:
1. Which semantic fields of loanwords have the highest frequencies in spoken Persian?
2. Are loanwords more related to abstract or concrete concepts?
3. Regarding the gender of speakers, which group mostly uses loanwords with a common Persian equivalent?
4. What are the most important motivations for the use of loanwords by Persian speakers?
- Literature Review
Since the present study deals with the use of loanwords in the spoken variety of Persian, the literature review is presented under two subheadings:
A) Linguistic and sociological studies about loanwords, including Robins (1964), Sapir (1970) and Haspelmath (2009).
B) Corpus-based studies of Persian language, including Sharafi (2000), Mehryar (2003), Sattari (2009), Ketabi et al. (2010), Kargozari & Tafazzoli (2012), Mohammadi & Abdotajedini (2013).
A small number of the mentioned studies have been devoted to the investigation of spoken Persian, and the majority of researchers have studied loanwords in written literature. Moreover, in the limited number of works on spoken Persian, the researchers have explored controlled data, mostly recorded radio and television programs, which are far from normal speech. Thus, as can be seen, this is the first time that natural spoken Persian has been studied in terms of the usage of loanwords.
- Methodology
To answer the aforementioned research questions, based on a descriptive-analytical method, the usage of loanwords in spoken Persian was analyzed from three perspectives: semantics, pragmatics, and sociolinguistics. For this purpose, 600 minutes of the spoken Persian corpus of Al-Zahra University, including 14000 sentences in Persian for 100 different situations and subjects, used by 240 female speakers and 80 male speakers, have been extracted and studied according to loanwords’ “semantic fields”, “abstraction and non-abstraction”, and “usage frequency” as well as two sociolinguistic variables (“motivation” and “gender” of the speakers).
The corpus of this study, prepared in the Linguistics Department of Al-Zahra University, is the first and currently the only corpus of natural speech for spoken Persian recorded in various social situations. One of the most important features of this corpus is that, unlike other controlled databases, it gives researchers access to the natural speech of native speakers. Since the participants are not aware that their words are being recorded, the results and findings can reveal facts of natural speech and consequently are less biased. All privacy concerns were observed during data collection.
- Results
The results of comparing the variety of loanwords and their usage frequency in different semantic fields show that the highest frequency of use belongs to the semantic fields of basic actions and technology, language and speech, and social and political relations. Moreover, research data indicate that loanwords are more related to abstract concepts and phenomena than to concrete ones. The research results, from the sociolinguistic view, also reveal that more women than men use loanwords with a common Persian equivalent. “Filling communication gaps in recipient language” and “social, cultural, political and scientific credibility of donor language” are the most important motivations for Persian speakers to use loanwords in their speech.
List 1: loan words of the corpus
update, upload, application, atom, autobahn (freeway), autobus (bus), add, Adams (chewing gum), address, adrenaline, eau de Cologne (perfume), art brush, agency, SMS, ascenseur (elevator), spray, sport, speaking, spin, strategy, stress, story, astigmat (astigmatism), screen shot, skill, skill worker, slide, off, UK band (brand new), active, expire, express, expression, aklil (glitter), équipe (group), alarm, album, alzheimer's, ampoule, amphitheater, energy, Angry Birds, online, optic, average, urgence (emergency), origin, OK, Oh Yeah!, idea, ideal, immigration, email, intranet, internet, Internet Explorer, entry, battery, bascule (scale), baguette, band, …
List 2: Derived, compound, and Derived-compound words containing a non-Persian element
Atomi (Atomic), energy darmani (energy therapy), ba-class (high-class), Buddayi (Buddhist), post-e- electronic (e-mail), pomp-e-benzon (gas station), testi (by test), telephoni (by telephone), randomi (randomly), size-bandi (sizing), miyan term (midterm), …
- Conclusion
One of the most frequent linguistic consequences of language contacts is the emergence of loanwords. There are two main motivations for using loanwords: “filling communication gaps in recipient language” and “social, cultural, political and scientific credibility of donor language”. The results of data analysis show that, regarding the gender of participants, women tend to use more loanwords with common Persian equivalents than men.
Men mostly use loanwords that are common words in Persian and do not seem strange, and only a small percentage of their loanwords are uncommon and have a typical Persian equivalent; however, this percentage is higher for female participants. In other words, in most cases, men’s purpose in using loanwords is to “fill communication gaps in recipient language”, while women's motivation is “the social, cultural, political and scientific credibility of donor language”.
Ibrahim Halil Topal,
Volume 13, Issue 3 (8-2022)
Abstract
This small-scale corpus-based study delineates the most common and significant dialectal variations between the two most commonly spoken English varieties: American English (AmE) and British English (BrE). As a result of the corpus analysis, four main areas have emerged as to where dialectal variations take place: pronunciation, vocabulary, grammar, and orthography/punctuation. A total of 26 variations (f=10 in pronunciation, f=5 in vocabulary, f=6 in grammar, and f=5 in orthography/punctuation) was identified by analyzing a variety of sources, including books, articles, online dictionaries, and websites. The significance of the variations in the abovementioned language areas and their implications for language teaching were discussed empirically and pedagogically. Notwithstanding the limitations, the research is expected to contribute to our understanding and awareness of the dialectal variations and assist language learners and teachers with the learning and teaching of these variations pedagogically and systematically, since it might serve as a guide or a framework of reference.
Mohammad Hassanzadeh, Hadis Tamleh,
Volume 13, Issue 6 (3-2022)
Abstract
Lexical bundle research has recently come to the forefront of corpus-driven studies. Previous corpus studies have documented conflicting results regarding the frequency and function of lexical bundles (LBs) in academic prose. To date, however, no study has exclusively investigated LBs in the "discussion" sections of research articles generated by professional native English authors. The current study addressed this gap by examining the frequency, structure, and function of the most frequent four-word LBs. The corpus was composed of the discussions of published research papers authored by native (L1) writers. The data were extracted from five reputable international journals in the field of applied linguistics, consisting of over 300,000 words. Using AntConc, all the lexical sequences were retrieved with a frequency of 10 and a range of 5. The results revealed that LBs were predominantly used by English writers. Structurally, it was found that phrasal bundles were the most frequent in our corpus. The findings also demonstrated that functionally, referential bundles were extensively employed. In addition, stance bundles and discourse organizing bundles were the most prevalent after referential bundles. Finally, the findings are discussed in terms of the implications for non-native writers regarding the use of LBs in academic prose.
1. Introduction
Since research articles (RAs) are an indispensable part of academia, writing a high-quality paper entails the competent deployment of linguistic features. The current study investigated a particular type of morphological feature dubbed “lexical bundles” (LBs), which refer to frequently-occurring word combinations. With the growing interest in this area, some corpus-driven studies have examined LBs across different academic genres (Biber, Conrad, & Cortes, 2004), academic registers (Biber & Barbieri, 2007), disciplines (Cortes, 2006; Durrant, 2017), expertise levels (Staples, Egbert, Biber, & McClair, 2013), L1 versus L2 writing (Ädel & Erman, 2012; Esfandiari & Barbary, 2017), and rhetorical moves (Alamri, 2020). Prior research on L1 and L2 writing has yielded inconclusive results concerning the function and frequency of LBs. For instance, Ädel and Erman (2012) observed that native English writers relied on LBs to a greater extent than non-native writers. However, there have been corpus-based studies indicating that non-native writers utilized LBs with a higher frequency than their English counterparts (Bychkovska & Lee, 2017; Pan, Reppen, & Biber, 2016). By the same token, the frequency of functional patterns of LBs has been found to vary in a number of previous corpus studies (e.g., Ädel & Erman, 2012; Bychkovska & Lee, 2017). This study set out to contribute to this line of inquiry by investigating the frequency, structure, and function of the most frequent four-word LBs in a corpus of 'discussion' sections of RAs written by native English academic writers in applied linguistics.
1.1. Research Questions
1. What are the frequently used four-word lexical bundles in research articles' discussions written by native English academic authors in applied linguistics?
2. What are the structural and functional properties of these frequently used four-word lexical bundles?
2. Literature Review
2.1 Frequency of LBs
Frequency is the most basic attribute of LBs since a multi-word sequence must meet the requisite frequency threshold to be considered a bundle. Depending on the size of a corpus, the frequency threshold might vary from 10 (Biber et al., 1999) to 40 times per million words (pmw) (Pan et al., 2016). Preceding bundle studies of L1 and L2 writing have reported varying rates of use. As an example, Esfandiari and Barbary (2017) observed that English academic authors used significantly more LBs than Persian writers. Conversely, Bychkovska and Lee (2017) found that Chinese undergraduate students used more LBs in their essays than English students did.
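Since thresholds are reported per million words, they have to be rescaled for smaller corpora. The following sketch shows the conversion in both directions; the corpus size of exactly 300,000 words is an assumption chosen for illustration (the abstract only states "over 300,000 words"), and the function names are hypothetical.

```python
def per_million(raw_count, corpus_size):
    """Normalize a raw count to occurrences per million words (pmw)."""
    return raw_count / corpus_size * 1_000_000

def raw_from_per_million(rate_pmw, corpus_size):
    """Raw count that corresponds to a pmw rate in a corpus of the given size."""
    return rate_pmw * corpus_size / 1_000_000

corpus_size = 300_000  # assumed size, for illustration only

# A raw cut-off of 10 occurrences expressed as a pmw rate
print(round(per_million(10, corpus_size), 1))

# A 40 pmw threshold expressed as a raw count in the same corpus
print(raw_from_per_million(40, corpus_size))
```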
2.2 Range of LBs
Range or dispersion is another criterion for identifying LBs. Similar to frequency, the range threshold varies depending on the corpus size. For instance, Ädel and Erman (2012) set a low dispersion threshold of three texts owing to the size of their corpus, while for a corpus of 176 texts, the range threshold was set at 20 by Biber and Barbieri (2007).
2.3 Structure and Function of LBs
LBs fall into different structural and functional patterns. Following Biber et al.'s (2004) functional and structural taxonomies, LBs were structurally classified into three categories: NP/PP based bundles (phrasal bundles), VP-based bundles, and Dependent clause bundles. Functionally, they serve three primary functions, namely stance bundles, discourse organizing bundles, and referential bundles. Previous research has shown varying results regarding the frequency of stance expressions and discourse organizers in L1 writing.
3. Methodology
3.1 Corpus
The present study used a corpus of research article discussions produced by native English academic writers in applied linguistics. The RAs were extracted from five highly-ranked international journals (Language Learning, Applied Linguistics, TESOL Quarterly, Studies in Second Language Acquisition, and Second Language Writing). The corpus was composed of 243 discussion sections published between 2005 and 2019.
3.2 Bundle identification procedure
In the initial stage, all non-textual content was removed from the discussions (i.e., they were converted to plain text). Using AntConc (3.5.8.0), a list of four-word LBs with a minimum frequency of 10 and a minimum range of 5 was retrieved. Then, the LBs were structurally and functionally analyzed based on Biber et al.'s (2004) structural and functional taxonomy.
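The retrieval step can be approximated outside AntConc with a short Python sketch. It is an illustration under stated assumptions rather than a reproduction of the tool's behavior: the tokenizer is a simple hypothetical regular expression, n-grams are allowed to cross sentence boundaries, and "range" is counted as the number of distinct texts in which a sequence occurs.

```python
import re
from collections import Counter, defaultdict

def four_word_bundles(texts, min_freq=10, min_range=5, n=4):
    """Return n-word sequences meeting both the frequency and range thresholds.

    texts: list of plain-text strings, one per discussion section.
    """
    freq = Counter()                # total occurrences of each sequence
    text_range = defaultdict(set)   # ids of the texts each sequence occurs in
    for text_id, text in enumerate(texts):
        tokens = re.findall(r"[a-z']+", text.lower())
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            freq[gram] += 1
            text_range[gram].add(text_id)
    return {
        gram: count
        for gram, count in freq.items()
        if count >= min_freq and len(text_range[gram]) >= min_range
    }

# Toy usage with lowered thresholds so that the tiny sample yields a result
texts = ["On the other hand, it works."] * 3 + ["Nothing here at all really."]
print(four_word_bundles(texts, min_freq=3, min_range=3))
```

Keeping the range check separate from the frequency check matters because a sequence repeated many times by a single author would otherwise pass as a bundle.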
4. Results
After retrieval, 142 types and 2,637 tokens of LBs were found to be used in the discussions, suggesting the prevalence of LBs in the academic prose of native English writers. The most frequent LBs found in the corpus were "in the present study", "in the current study", "in the case of", "it is possible that", "the results of the", and "on the other hand", which occurred over 50 times across the corpus. Structurally, most LBs were phrasal bundles consisting of NP-based and PP-based bundles. The functional analysis revealed that referential bundles accounted for 60.6% of all LBs.
Volume 26, Issue 2 (9-2019)
Abstract
In this study, a corpus method was used to test an assumption of Conceptual Metaphor Theory (CMT) that systematic and conventionally fixed metaphorical expressions have literal meaning in the source domain. The conceptual metaphors LIFE IS A JOURNEY and IDEAS ARE PLANTS were selected for analysis, and three keywords from the source domain of each metaphor were chosen and matched with their English equivalents. The Hamshahri 2 collection of Farsi texts was selected as the corpus of the study. For ease of processing, one third of the corpus, comprising fifty million word tokens, was randomly sampled as the working corpus. Collocates of the source-domain keywords, as realizations of fixed metaphoric expressions, were extracted using AntConc software and their concordances were examined. It was found that 1) in conventionally fixed metaphorical expressions, when source-domain keywords were used metaphorically they had collocates that rarely appeared with the same source-domain keywords used literally, and 2) source-domain keywords had gradable degrees of metaphoricity. The findings were interpreted as suggesting that the meaning of fixed metaphoric expressions may not be systematically connected to the metaphor's source-domain meaning.
Volume 27, Issue 4 (10-2020)
Abstract
Hegemonies imposed from sources of power have been an issue of investigation for many years. In recent years, media and movies have gained particular attention due to their society-affecting power. The present study explores how male and female characters are represented in American movies based on Van Leeuwen’s (2008) social actor categorization. Hence, the researchers focus on the scripts of the movies available in the fiction genre of COCA (Corpus of Contemporary American English). A representative sample of words depicting each gender was chosen based on their frequencies, and accordingly, their collocations were extracted. The findings indicate that representations of men and women followed stereotypical depictions of gender roles; while men tended to be associated with high-ranked jobs, positions, activities, and identification categories, women were shown to be passively linked with inferior features, low-income jobs, child-bearing, and sexual aspects. More specifically, women were mostly objectified through a patriarchal perspective. The results might shed light on the archetypical imposition of power from above and may pave the way for unbiased media where depths, not just the appearances, of characters are of greater significance.