A Computer-Based Approach for Predicting the Translation Period of Early Chinese Buddhism Translation——Texts from the Eastern Han, Three Kingdoms and Western Jin Periods
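|Authors:||郭捷立|
|Keywords:||Medieval Chinese; Chinese translations of Buddhist scriptures; translator attribution; Fisher linear discriminant analysis; variable-length n-gram; Ancient Chinese; Chinese Translation; Authorship Attribution; Fisher Linear Discriminant Analysis; Variable Length n-Gram Feature Extraction|
|Issue Date:||Jun-2012|
|Abstract:||The translated works in the Chinese Buddhist canon are a treasure for the study of Buddhist culture, yet the translator records of some of these scriptures remain doubtful and unresolved. Because early historical sources are hard to collect in full, translator attribution is most problematic, and hardest to handle, for the earliest period of scripture translation: the Eastern Han, the Three Kingdoms and the Western Jin. In contrast to the qualitative methods of traditional philology, this study combines statistical, quantitative analysis with information technology to address the problems in the translator records of early Buddhist translations. Its main aim is to build a classification mechanism that can accurately determine in which of these three dynasties a text was translated. With this result, we can find the most likely translation period of a text of unknown attribution and so narrow the range of candidate translators. We first drew on the findings of traditional philologists to compile a list of reliable reference translations for the three dynasties, then used a variable-length n-gram algorithm to extract features from the texts, and applied Fisher linear discriminant analysis to the discriminating features. According to the experimental results, the mechanism is highly effective, reaching an accuracy of at least 89%. In addition, by further analysing the discriminant functions produced by Fisher linear discriminant analysis, we identified features characteristic of how texts of the three dynasties were translated; these features can be used to examine cases in which the same foreign term was rendered by different words in the three dynasties. Across the combined mechanism of this study, we found that this kind of quantitative classification can explain some of the phenomena observed in scripture translation.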
Buddhism has been spreading in China for about two thousand years since its first introduction in the Eastern Han dynasty (C.E. 25-220), and has become an important part of daily life and culture at large in China. A great number of Buddhist scriptures were translated from Indian originals from the Eastern Han dynasty to the Tang dynasty (C.E. 618-907) and beyond. Over the last few decades, scholars have become increasingly aware that the traditional attributions of authorship and translatorship for the early Chinese works are often unreliable. The current reference edition of the Chinese Buddhist canon (Taishō shinshū daizōkyō (Abbr.: T.) 大正新修大蔵經, collated 1924-1934) contains 3053 works in 85 volumes, including about 1000 texts of Indian (or allegedly Indian) provenance. However, about 150 of these texts are marked as shiyi 失譯, indicating that the name(s) of the translator(s) are unknown. Beyond these cases, many attributions of texts translated between the 2nd and the late 6th century are uncertain, problematic or simply incorrect. Text-critical and philological studies have significantly advanced research in the field. However, traditional philology is limited in scale. The research project from which the present thesis stems has therefore designed a statistical model that combines variable-length n-gram features with the Fisher Linear Discriminant to establish a highly accurate classification mechanism for predicting the translation period of Chinese texts. The present study focuses on three early Chinese dynasties: the Eastern Han (C.E. 25-220), the Three Kingdoms (C.E. 220-280) and the Western Jin (C.E. 266-316). These three dynasties constitute the earliest phase of Buddhist translation history, and most of the translations from these periods present attribution problems. In this research, we build a classification mechanism for each of the three dynasties.
These can be used to test whether the translation style of a text is similar to the one prevalent during a certain period. In our experiments, the classifiers for all three dynasties achieve an accuracy of more than 89%. By examining the classification results, we also extract the translation usages characteristic of Chinese sutras from different periods. By supplying statistical information on the characteristics and features of Chinese texts, this approach can not only provide new evidence relevant to uncertain authorships but also encourage scholars of Buddhism and of linguistics to pursue further studies.
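The per-dynasty classification described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not specify the variable-length n-gram algorithm, so the sketch simply counts all character n-grams within a length range and keeps the most frequent ones. The function names, the ridge term added to keep the scatter matrix invertible, the midpoint threshold, and the Latin-letter toy strings are all assumptions made for illustration only.

```python
from collections import Counter

def ngram_counts(text, n_min=1, n_max=3):
    """Count every character n-gram with length between n_min and n_max."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts

def build_vocab(texts, size, n_min=1, n_max=3):
    """Keep the `size` most frequent n-grams over all texts as features."""
    total = Counter()
    for t in texts:
        total.update(ngram_counts(t, n_min, n_max))
    # Sort by (-count, n-gram) so that ties are broken deterministically.
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:size]]

def featurize(text, vocab, n_min=1, n_max=3):
    """Frequency of each vocabulary n-gram, normalised by text length."""
    counts = ngram_counts(text, n_min, n_max)
    length = max(len(text), 1)
    return [counts[g] / length for g in vocab]

def mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_fisher(pos, neg, ridge=1e-3):
    """Two-class Fisher discriminant: w = (S_w + ridge*I)^-1 (m_pos - m_neg)."""
    m_pos, m_neg = mean(pos), mean(neg)
    d = len(m_pos)
    # Within-class scatter matrix, started from ridge*I (an assumption here).
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for rows, mu in ((pos, m_pos), (neg, m_neg)):
        for x in rows:
            diff = [xi - mi for xi, mi in zip(x, mu)]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += diff[i] * diff[j]
    w = solve(sw, [p - q for p, q in zip(m_pos, m_neg)])
    # Decision threshold: midpoint between the two projected class means.
    threshold = (dot(w, m_pos) + dot(w, m_neg)) / 2
    return w, threshold

def predict(x, w, threshold):
    """True if the text projects onto the 'in period' side of the threshold."""
    return dot(x, w) > threshold

# Toy placeholder strings, NOT real scripture: one period versus the rest.
in_period = ["abcabcabcab", "bcabcabcabc", "abcabcab"]
other = ["xyzxyzxyzx", "zyxzyxzyx", "xyzzyxxyz"]
vocab = build_vocab(in_period + other, size=6)
w, threshold = fit_fisher([featurize(t, vocab) for t in in_period],
                          [featurize(t, vocab) for t in other])
print(predict(featurize("cabcabca", vocab), w, threshold))
print(predict(featurize("yzxyzxy", vocab), w, threshold))
```

Training one such binary classifier per dynasty mirrors the "one mechanism per dynasty" design in the abstract, and inspecting the largest entries of `w` against `vocab` is one way to see which n-grams drive the decision, in the spirit of the feature analysis the abstract describes.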
|Appears in Collections:||Department of Buddhist Studies|