Translation technologies: “It’s the people that are most important”.

Interview with Joss Moorkens, Dublin City University

Dalia Mankauskienė
Institute for Literary, Cultural and Translation Studies
Faculty of Philology
Vilnius University
dalia.mankauskiene@flf.vu.lt
https://orcid.org/0000-0003-2806-0892

We will be talking about the conference, but I wanted to start further back: what would you say is the area of your expertise in Translation Studies?

To the extent that I’m an expert in anything, I suppose it’s translation technology. I have worked a lot with machine translation and with user interaction with translation technology, including machine translation. More recently I’ve been thinking and writing about ethics and, more broadly, about the translation industry and the use of technology within it.

You mentioned several topics within translation technology that you are interested in. Why these specifically?

My PhD was on translation memory, but machine translation is changing quickly, as is translation technology in general. That makes it interesting, and it’s having quite a disruptive effect on translation processes. I think that, particularly with machine translation for dissemination1, there are problems with how humans interact with it. There are lots of different ways that translators can interact with machine translation, but in a lot of different platforms and interfaces there aren’t many customization options for those interactions.

For different translators, for different people, some modes of interaction might be more helpful than others. Trying to improve and promote customizable modes of interaction with machine translation is, I hope, a positive message. As part of that, we’ve worked on a couple of different interfaces, a mobile interface and a desktop touchscreen interface, to try to make that interaction more intuitive, helpful, and empowering for translator users.

I like that you mentioned a ‘positive message’, because at the beginning, when you were talking about this issue, you said that machine translation was disruptive for the translation industry. Do you mean that in a positive way?

I mean that in both positive and negative ways. I think it depends on what the focus is when an organization tries to work with machine translation. A focus purely on a cost strategy tries to minimize cost and maximize efficiency. I’m not necessarily against efficiency, of course, but against what I think Ellul called la technique2, this obsession with efficiency and rationalization that can come at the cost of other things.

If MT is implemented as part of a value strategy, where it’s used to automate where possible but technology is also used to empower translators, and perhaps translators can move to higher-value texts and content, I don’t think that is necessarily a negative disruption. But we see a mix of both in the translation industry, and probably many shades of gray in between.

Would you say that is especially true with the coming of neural machine translation (NMT)?

Yes. NMT means that there are certainly more text types and more content that can be translated without any human input. If it’s low-risk, short-lived, low-value text, then NMT enables its translation, especially for material that wouldn’t be very interesting for a human to translate. But for a high-value text, it seems very unwise to use machine translation. I think some of the problems we see are the result of the unilateral imposition of machine translation in a translation workflow, or the use of machine translation for content for which it’s not very wise to use it. That’s problematic, especially if it comes along with efforts to cut remuneration rates and to focus solely on efficiency, as I mentioned earlier.

What would you say is the status quo of machine translation at the moment?

In terms of the quality of output or in terms of how it’s used within commercial translation workflows?

I think the quality of output would have an impact on how it is used, no?

Yes, I suppose these things are interconnected. For well-supported languages, languages for which there’s a lot of data available, the quality of output is pretty good. It’s particularly fluent. There can be problems with register, there can be mistranslations, problems with consistent terminology. We used to be able to use forced decoding for SMT3 to force a term to be used, and that’s not possible for NMT.

There are problems with bias – gender bias, inconsistent gender use from one segment to another. If an adjective or a job title is usually associated with one gender or another, then we might see some flip-flopping when there’s no gender-specific pronoun. Those are problems.

Then if we move to less well-supported languages, there are two primary approaches in research to improving MT for them. Big tech companies like Google and Meta are trying to build huge multilingual NMT systems. A recent system that Google’s Bapna et al.4 published in 2022 had over 1,000 languages within one multilingual NMT system, and Meta are working on their No Language Left Behind project, covering around 200 languages. The idea here goes back to a discovery made in 2017: if a number of different language-labeled parallel corpora were used in training, it was possible to translate between language pairs for which there was no bilingual or parallel data. There’s also been a finding that, for some reason, the quality for less well-resourced languages rises within a multilingual NMT system.

Meanwhile, mostly academic research centers and researchers are trying to augment data, use back-translation, and use various methods, perhaps combining different models, to try to improve bilingual systems for less well-resourced languages.

So it looks like there’s been something of a plateau in quality for well-supported languages within NMT, and the publications on red and green AI5, which I might come back to, suggest that the improvement is at best logarithmic as more data is added. There’s an attempt to create bigger and bigger language models, but the improvement becomes smaller as we increase the amount of training data. Thus, the biggest space for improvement seems to be in those low-resourced languages.

It is very interesting what you’re saying about the plateau, because I had a similar idea, even though I’m no expert in the technical side of machine translation. I still remember what Google Translate was like before 2016, for example. (My students don’t remember that anymore.) There was that big switch. And DeepL only got Lithuanian last year, in March. It’s relatively new, but it has become so popular that it seems everybody knew it overnight. Lithuanian is in no way one of those under-resourced languages, which might be surprising to Lithuanians, actually, as we are a small country – but it is definitely not under-resourced. And I’m thinking now that we’re at that plateau. Of course, we want other languages to catch up, but will there still be improvements in the MT of well-supported languages? Will the investment keep coming for us to keep going, or have we maybe reached our potential and it’s not worth it anymore?

I think it’s impossible to tell. My sense is that because training data includes good translations, bad translations, medium translations, and usually some machine translation, averaging that out means there’s a limit to how good it can get. But that’s very much intuition rather than anything based on fact. I think it’s very difficult to predict how things will progress.

Talking about predictions, one of your slides showed some predictions about the future of MT and AI (when, what level, etc.). I think it’s very brave of those scholars to make such predictions in our current climate. But let’s get back to one of the thoughts you mentioned: term consistency, that we cannot tell NMT to keep terms consistent as we could with SMT. Why is that? Can’t we have any rules with NMT?

There have been a couple of different methods tried to make terminology consistent. We had a postdoc, Rejwanul Haque, who worked with us in ADAPT and is now a lecturer at South East Technological University in Ireland. Rej was looking at inserting a feedforward neural network within the larger neural network to focus solely on terminology. I don’t think he managed to do that successfully. Then there have been attempts to use rules to guarantee terminology, but from what I understand there’s been a degree of success rather than 100% accuracy in ensuring that terminology is used consistently.

The machine doesn’t want to play by the rules?

No, and I suppose the benefit of NMT is that you’ve got these word embeddings, where words with a syntactic or semantic connection or similarity are roughly proximate within the vector space. That means there’s a greater capacity for linguistic diversity than there was previously. That’s the basis for the newer neural automatic evaluation metrics being a little bit more forgiving.

Where previous automatic evaluation metrics stuck rigidly to the gold-standard human translations, an automatic metric that uses neural networks can take this syntactic or semantic relatedness into account. But then, because NMT output depends on context and on the number of words produced so far, things are not static. The translation is not static; the word produced might be different.

That’s the benefit and what makes NMT fluent, but it is also what makes it more difficult to restrict terminology because we have this fluid circumstance whereby it’s not just looking for the most popular translation of a word or the most predictable translation of a word, but rather it depends on the context and the words produced so far. Everything is a little bit fluid as we’ve got thousands of calculations taking place at once.
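To illustrate the vector-space idea described above: word embeddings place words with related meanings close together, and that proximity can be measured with cosine similarity. The sketch below uses invented three-dimensional vectors purely for illustration; real NMT systems learn embeddings with hundreds of dimensions from training data.

```python
import math

# Toy 3-dimensional "embeddings" -- the values are invented for
# illustration; real systems learn high-dimensional vectors from data.
embeddings = {
    "doctor":  [0.90, 0.80, 0.10],
    "surgeon": [0.85, 0.75, 0.20],
    "banana":  [0.10, 0.20, 0.90],
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, near 0 for unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Semantically related words sit close together in the vector space,
# while unrelated words are further apart.
print(cosine(embeddings["doctor"], embeddings["surgeon"]))  # close to 1
print(cosine(embeddings["doctor"], embeddings["banana"]))   # much lower
```

This proximity is also what lets a neural evaluation metric credit a translation that uses a near-synonym of the gold-standard word rather than the exact word itself.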

It’s really fascinating how sometimes DeepL, for example, gives you a translation that I’m not sure many humans could produce straight away. Maybe we humans also start out, in our brains, working in a statistical machine translation way, and only then impose our knowledge of how to rework it. It is really fascinating how these NMT systems work.

I was also wondering about some simple things that appear to be far from simple when we deal with MT. For example, I have noticed that in translations into Lithuanian we do not get the right typographic signs. Our dash has to be a long one, but it always gives you the short one, and to me it seems simple: “just tell the machine to do that.” Apparently, we cannot really do that. The machine needs to figure out by itself that it needs to give us the long dash.

I can’t imagine how that’s done. In tokenization, presumably, there’s no dash there, so it’s being inserted after the fact. Presumably during the tokenization process, at the start of the translation process, you’ve got words being reduced: the capitalization is removed, the punctuation is removed. Then if the dash is being artificially added back, it must be a case of trying to find the right rules, but I’m really not certain at all. It’s purely a guess.

It’s strange how machines are sometimes so clever with the text, but cannot give you the long dash when you need it. You have to do it yourself.

Some of the things that we think are really obvious can be quite difficult for machines. I saw a presentation by Andrzej Zydroń from XTM about the difficulties of having a consistent count of the number of words in a document. Different versions of Microsoft Word will give you different word counts for the same document – vastly different. To try and deal with this inconsistency, not just between Microsoft Word versions but across lots of different word processing tools and different TM tools, Andy and others tried to work on a standard for word counting.

There are still things within the standard they put forward, such as the number of characters in Thai or Japanese that represents a word, that are contentious. Something that would seem so straightforward can be quite difficult, and machines can be quite inconsistent in how they carry it out based on our instructions.

That’s exactly what we were doing with students yesterday in our translation project management module. We took one document and looked at how many words you would get from Word, from Trados, and from other software, and the results were quite different. The students were fascinated. It is one of my tricks to show them that technology is not always the smartest in the room.
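The inconsistency discussed above is easy to reproduce: the count depends entirely on what a tool treats as a word boundary. A minimal sketch in Python (the three rules below are hypothetical simplifications for illustration, not the actual logic of Word, Trados, or any other tool):

```python
import re

text = "The tool's output (e.g. state-of-the-art MT) isn't perfect."

# Rule 1: split on whitespace, as a simple text editor might.
count_whitespace = len(text.split())

# Rule 2: count runs of letters/digits, so hyphens and apostrophes
# break "state-of-the-art" into four words and "isn't" into two.
count_runs = len(re.findall(r"\w+", text))

# Rule 3: keep hyphenated and apostrophized forms as single words.
count_compound = len(re.findall(r"[\w'-]+", text))

print(count_whitespace, count_runs, count_compound)  # prints: 8 14 9
```

Three defensible rules, three different word counts for the same sentence – which is exactly why a shared counting standard is hard to agree on.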

We’ve covered many very interesting topics, Joss. What area would you say is under-researched in translation technologies or machine translation specifically?

There’s actually been very little research focused on localization in the last few years. It was a big research topic when I first started working with translation technology. In Ireland, we had the Localisation Research Centre, run by Reinhard Schäler, in the mid-to-late 2000s, but more recently it doesn’t seem to have been a hot topic. There’s the Journal of Internationalization and Localization, but sometimes it has only produced one issue per year, and that journal is now pivoting to call itself Digital Translation (or rather the editors, Minako O’Hagan and Julie McDonough Dolmaya, are pivoting it).

There doesn’t seem to be that much focus on localization despite the fact that it is a huge area of the industry and actually quite dynamic, particularly in software localization, as we move from fairly slow development through agile software development and sprints to so-called continuous development and continuous deployment. That means that, as a translator or localizer, you can be working on the translation of something almost as it’s being written or created.

What you’re working on can be deployed and published at any time, which is really interesting, as is the effect it might have. I know from speaking to people within game localization that the pace of localization has really increased. It means that companies try to ‘follow the sun’, where something will be passed on to a team during their daytime, moving from one region to another, so that a project never stops. The fact that we’re not working with a physical published item, and instead there are just continuous updates, means that there isn’t that feeling of achievement when a team develops or translates or localizes a product. Instead, it’s just one continuous sprint.

I know that in Keywords Studios they’ve created artificial time periods after which they celebrate achievement: “After two weeks or after two months we’ll celebrate this achievement. We’ve done this much in that two-month period.” Similarly, they’ve had to tier their translators and localizers to try to give them a sense of career progression. There are real repercussions to these repeated cycles.

I’m not sure it’s entirely clear how that changes things for people working in localization. How machine translation has been included in localization doesn’t seem to be entirely clear either. A lot of localizers and translators are working within TMSs in which MT is incorporated, but most of the research about working with MT is just about static post-editing. We’re not really looking beyond that. It’s very difficult to carry out workplace studies.

I know there was a call by Brian Mossop in 2006 for more workplace studies, and there’s been a big response to that from people like Hanna Risku and Maureen Ehrensberger-Dow, who have done a lot of workplace studies, but there’s a lot more to do there. There is a lot more we can learn about how MT is included in localization workplaces, and also about how localization itself has changed. That, I think, is an under-researched area.

It’s interesting to hear you talk about this issue, because in all of your explanations throughout our conversation you keep going back to people: how people use these technologies, how they feel about them. Usually, when we talk about translation technologies, it’s all about the machine: how does the machine work, how can we make it better or increase its output quality, and so on. It’s very nice to hear you focus on how it can benefit people, or how people feel about it. I’m wondering if that is also the reason why you are interested in the ethical side of machine translation?

I think you’re right. For me, it’s the people that are most important. Of the resources that the translation and localization industry relies on, there are people, there’s energy, and there’s now data. I suppose, arguably, with climate change, people and energy need to come equal first. Ultimately, if we don’t consider how people interact with technology and don’t make that a pleasant, satisfying, and motivating experience, then translation and localization will not be sustainable.

There’s talk of a talent crunch in subtitling because working conditions are not really being considered enough; there’s the imposition of MT and post-editing within very restrictive interfaces that are often owned by the translation agency – vendor portals, as they’re sometimes known. If we look at the 2020 survey from CSA Research by Pielmeier and O’Mara6, it’s already very common to use these platforms, and many of them are quite restrictive. We see this talk of a talent crunch in subtitling because remuneration is too little and jobs aren’t that satisfying, for reasons of cost, speed, or prevention of piracy.

Subtitlers might not get a full view of what happens on screen, or, to achieve a quick turnaround, a program or a film might be broken down into small chunks so that a translator only has visibility of a small section. There have been documented problems with having no access to terminology from previous film versions when working in subtitling – the two subtitlers from Sweden and Finland documented by Kristiina Abdallah in 2016 or ’17, for example. If work becomes too dissatisfying, people won’t want to do it, and translators are smart people; they’ll go and do something else.

How people interact with technology is really important, and it needs to be satisfying and motivating. The books by Lewis Mumford from 1967 and ’71 were quite prescient in that way, where he talks about the myth of the machine and how we shouldn’t try to compare the mechanical operation and productivity of humans with machines; our skills are elsewhere. It’s really a case of trying to make the best use of those skills and, I suppose, letting the machine help us where it can, but the human experience is key.

That’s a very good sentiment and I’m glad you are expressing it. Here I would like to go back to the example you gave in your talk (we didn’t really have enough time to discuss it during the conference), I mean scenario A, where a person creates some translation, the company feeds it to the machine, and then the machine replaces the person. What are your thoughts on that? Is there something a person can do to safeguard themselves somehow? Because I don’t really see how that could work differently.

I don’t see how it can work differently either. In that scenario, where the translation is used to improve NMT output and then post-editing is imposed with a unilateral discount applied, I’ve seen two variations. One is more or less as stated in that case study, where the employer says, “We’re going to be using post-editing from now on and you’ll be paid at 60%, 65%, 70% of your usual word rate.” There’s a new article by Bert Esselink in The Journal of Internationalization and Localization where he reflects on 30 years in the localization industry and talks about how most technological change tends to be top-down.

In other scenarios, I’ve seen that there’s been a discussion between the employer and the translator as to whether they’ll work with MT. The employer asks the translator to “give it a try and see what you think.” Initially, no discount is applied; then they have a discussion after the fact. In the example I’m thinking of, they used memoQ data to show how much technical effort had been put in: how much typing there was, and how much less typing – or keyboard interaction – there was after using MT.

Interestingly, I spoke to Peter Reynolds from memoQ about this afterwards and he said they thought no one was using it, and he wasn’t sure whether it was accurate. Anyway, that’s a whole other matter. In that scenario, the employer and the translator got together afterwards. If the translator was happy with the MT quality and felt it was useful and wanted to continue using it, then the employer suggested a certain word rate.

They were aiming for a 25% discount but were willing to discuss it, and were also willing to accept it if the translator was unhappy using MT, especially if it was someone they very much valued, because they wanted to build long-term relationships with their translators. They felt, “If we have a good translator, we don’t want to upset them, and we want to be able to rely on them over time.” That was more of an example of the value rather than the cost strategy, I thought, and a fairer and more participatory implementation of MT within a translation workflow.

That’s interesting to hear, that there are examples like that, because generally what you hear is the first scenario: “Give us a discount. Use the MT,” and that’s that. Of course, the opposite situation is also known. For instance, I hear quite a lot in Lithuania that it’s not the translators who don’t want to use machine translation; rather, translation agencies don’t want them to use machine translation unless they are specifically given a post-editing task, of course. I think it goes back to the kind of thinking that, “We’re giving you this translation, but if you use MT, you will not be doing as much work; we will not be able to check that, and we will not be able to pay you less for it.” At least that’s my personal impression of it.

That could be the thinking. There are more and more examples of, and opportunities for, remote monitoring, so it will be visible to an agency or an employer exactly how much work has been done, how many keystrokes have been expended, how long was spent on any specific task. That was something I mentioned in the talk the other day: translator activity data. If we look at the presentation by Alessandro Cattelan about T-Rank, their automatic job assignment tool within translated.com, they use data about how long someone was active on a job, and also ratings of the interaction – ratings of friendliness, those sorts of things – and how they adhered to deadlines. All of these things are included in the algorithm that decides whether a job is offered again.

Especially when there are vendor-owned or agency-owned platforms, they will have vast swathes of data and will be able to see clearly how long was spent on a task, how much effort was made, whether there was a break while you made some coffee or went to the bathroom. All of that will be visible, or is already visible. So those slightly arbitrary restrictions on the use of MT might change over time.

Some people like to use MT as an initial suggestion. Maybe that’s the way forward. There was a big controversy in Canada a few years ago when, within the translation service, they wanted to build an MT system. It became highly controversial – there were questions in parliament about it – but one of the things that emerged was that there were about one million hits on Google Translate from the translation service every week, I think. MT was already being used to an extent.

Also, it was interesting that when I gave a talk at the ITI Conference – the ITI is one of the translation associations in the UK – the people who were most open to using MT, and who came to talk to me at the end about how useful they found it, all looked to be towards the end of their careers. All gray-haired, all very positive about technology; they didn’t see it as a threat at all, and they actually managed to work with it in a really beneficial way.

Many of them said that they used it just as an initial suggestion and then were able to think in a different way. It seems to be a very individual decision as to whether that’s how you want to use MT. Others have said that once they see a translation suggestion, it’s difficult for them to think of another way to express the same idea.

Maybe it has to do with confidence in your own abilities, because those people know their worth. They can use machine translation to support them. If you are a younger person entering your career, I think that’s the issue: at least in my research, translation trainers worry that newcoming translators won’t learn to translate, that they will just use machine translation and, at best, learn only how to post-edit. If you already have that translation skill, you are free to use machine translation to your advantage.

Especially if there’s such a focus on velocity and throughput that you’re really just trying to get through the next segment, and the next, and the next. It’s very difficult to feel that you have the time to think of a different way to express an idea. As for the idea that there’s a translation muscle that can waste away, or that you never really build up if you’re post-editing, I don’t know that anyone has really tested that experimentally. It’s something people have written about more generally in relation to automation. It’s certainly an idea that Chris Durban, who was at the conference the other day, has put forward. I don’t know quite how you might test it, but it’s an interesting idea.

It is. And it made me think of a colleague who’s also older but does quite a lot of translation. The reason she likes to use MT when she can is that she doesn’t have to type as much. That’s the main thing. Sometimes you can really understand why that would be useful. We’ve mentioned our conference a few times now, and I have to ask you: why did you agree to come to our conference, during your sabbatical year especially? We are very grateful, of course, but why did you make this decision?

I think that ethics in translation is important, and the fact that it was foregrounded in your conference initially made me interested. I didn’t get to see Nike Pokorn’s talk because I couldn’t travel until the following day, but I contributed to Nike’s co-edited volume, The Routledge Handbook of Translation and Ethics, and the fact that she was involved as well was helpful. So, the people involved, and the fact that it was a new country, but the main thing was the topic. That was what made it most interesting.

I’ve tried to be quite careful with travel. Before COVID, I traveled maybe 15 or 16 times during the year, and it was just too much. My kids are quite young, and I found it exhausting. It wasn’t something I enjoyed anymore. Plus, from an environmental perspective, I thought that that’s just not sustainable. I’ve tried to pick and choose where I travel and to use Zoom as much as I can, but this seemed to be an instance when it was worthwhile and important to travel. That was why I wanted to come.

Well, that makes it even more important that you came. We highly appreciate it. Ethics seems to be an extremely popular topic this year. When we started organizing this conference almost two years ago, it wasn’t that prevalent. This year, we are the third or the fourth conference focusing on ethics. Do you think it is also one of those areas that are under-researched in some sense, or not anymore?

Not anymore. I think it was, but as I mentioned in my talk, we’ve got this recent handbook that Kaisa Koskinen and Nike Pokorn co-edited. There are so many of those handbooks these days, but there’s a huge effort behind them. I think there’s lots of really useful material in it, not all of which I’ve managed to read yet. And there’s the new book by Joseph Lambert on translation ethics coming out in March.

If we look back, there was Anthony Pym’s On Translator Ethics in 2012. Ethics has formed part of some work by Andrew Chesterman, Joanna Drugan, and Lawrence Venuti, and, less explicitly, it is part of the work of Maria Tymoczko. It appeared sporadically in publications or books here and there, whereas now it’s front and center in a glut of recent publications. I think that’s really important, particularly as we try to work out how our interactions with technology progress and how we view technology more generally. I think our view is a little more critical than it would’ve been a few years ago, and that’s probably part of it.

When you mentioned Kaisa Koskinen, I remembered she was the chair of CETRA this year. I was listening to her talk, and one of her ideas struck me: she said that for quite a while now translation trainers and researchers have been very much focused on technology – how to learn it, how to make it better, etc. – but now we should come back to talking about what translation actually is. One of her main ideas was that the concept of translation is not representative anymore. Would you agree?

I think you and I see within the EMT network that every so often there’s a network meeting that focuses on technology, and there’s always a bit of a backlash, where people are going, “Technology again, really?” Obviously, I don’t mind, because I find it interesting. It’s my area of research, and Tomáš Svoboda and I, along with you, work within the technology working group. But I know there’s a certain cohort of people who are just sick to death of hearing and talking about technology.

To an extent, I can understand that, because the discussion doesn’t always move on. We can hear the same talking points again and again, and sometimes it’s difficult to find something new to say in the area. I review a lot of articles and papers, many of which are surveys of how translators feel about post-editing. They do become quite repetitive, because it’s hard to think of a new way to express the same ideas. That doesn’t mean they cease to be important. Sometimes I wish they would look a little more into how the interaction takes place: What is post-editing here? Have translators just received a filled target-text window? Is it populated with MT, or have they used something like DeepMiner within Déjà Vu, where there’s been some sub-segment matching? At least something that makes it different from the previous survey. That’s probably a little unfair to say, but the discussions can be quite repetitive, and I can understand why there might be a backlash.

I know from people within the industry as well that they sometimes say they’re a little tired of the continual discussion of technology. But the process of translation has become so technologized that it’s hard to leave technology out of the discussion if we’re to be representative of how translation is carried out today. In my talk, I presented a typology of automation from Parasuraman et al. from 2000. As I mentioned, there are lots of different typologies that we might use. They’re all broadly similar, but at the bottom we have completely unaided work, where it’s just the human, no machine. In translation, that probably almost never happens anymore.

Technology is part of the discussion but maybe shouldn’t always be the only part of the discussion.

What a nice sentiment and a perfect way to end the interview. I’m really grateful for your time, and of course all of us at Vilnius University feel very lucky that you came to our conference and to Lithuania.

Thank you very much too, and for inviting me to Lithuania.

2 Ellul, Jacques. 1964. The technological society. New York: Vintage Books.

3 Statistical machine translation, D.M.

4 Bapna, Ankur et al. 2022. Building machine translation systems for the next thousand languages. arXiv. Available at: https://doi.org/10.48550/arXiv.2205.03983.

5 Schwartz, Roy; Dodge, Jesse; Smith, Noah A. and Oren Etzioni. 2020. Green AI. Communications of the ACM, 63(12). 54–63. Available at: https://doi.org/10.1145/3381831.

6 Pielmeier, Hélène and O’Mara, Paul. 2020. The State of the Linguist Supply Chain: Translators and Interpreters in 2020. Common Sense Advisory, January 2020.