The web as a corpus: a resource for translation
Helia Vaezian
Khatam University
Published 2018-12-20


Keywords

corpora for translation purposes
translator training

How to Cite

Vaezian H. (2018) “The web as a corpus: a resource for translation”, Vertimo studijos, 110, pp. 62-75. doi: 10.15388/VertStud.2018.5.


[full article, abstract in English; abstract in Lithuanian]

Ready-made corpora are not always easy to access. This is especially true of less dominant languages such as Persian, for which very few corpora are available. Moreover, most existing corpora are domain-specific and therefore cover a limited range of genres and text types, so they may not contain the information a translator is looking for. The World Wide Web, used as a corpus, is not subject to these limitations: it can be regarded as a very large multilingual corpus containing texts in almost all languages and text types. This paper reports the results of a collaborative experience in which undergraduate English translation students from the Department of Translation Studies of Allameh Tabataba’i University used the Google search engine and the webascorpus web concordancer to extract translationally relevant data from the web.

This work is licensed under a Creative Commons Attribution 4.0 International License.
