Category: AI and LLMs

  • 7 thoughts: takeaways from the European Language Data Space Workshop

    On Monday 15th September, I was a panelist and participant at the country workshop for Austria on the European Language Data Space. I’ll profess that I went into the day relatively uninformed about the project, so greatly appreciated the introduction to the European LDS. It was also good for me to take off my “Aufsichtsbrille” (supervisory glasses) and understand what possibilities lie beyond my small corner.

    1. Language data is so much more than text-based corpora: while the initial training of LLMs used massive text-based corpora, language data can be so much more than text-based. With Speech-to-Text and Text-to-Speech being increasingly frequently used, there are naturally massive audio language data files. My voice alarms’ female British English accent probably primed me to think only of a very small number of voices covering a locale.
      However, particularly for training speech-to-text applications, you need vast audio files covering massive ranges of accents, dialects, ages and pitches of voice. New text-based language data sets are increasingly made up of synthetic data. This led me to wonder about the adequate labeling of datasets.
    2. Getting the language data in is comparatively easy, but the training stage is time-consuming and expensive. Even if the language data is readily available at a low cost, that is only part of the story. Training demands huge amounts of compute time – cases were mentioned of days, weeks and months – and that compute power is not cheap. And then if, say, a tiny part of the data is removed and must no longer be used in the model, how do you get it out?
      After all, retraining on a new “clean” dataset means more training weeks/months and compute time. This issue is why I am not surprised that Anthropic et al. are choosing billion-dollar settlements in legal cases, rather than the cost of retraining and/or the hard work of “getting the toothpaste back in the tube”.
    3. Monetization of language data in Europe is a new concept: BigTechs have also got around some of the issues surrounding data scraping by setting up deals with platforms to allow them to use their data for AI training. Google’s $60m a year deal with Reddit in February 2024 was cited on several occasions. As I put these thoughts into text, there are stories breaking of even bigger/closer ties.
      One intervention of mine during the panel was that monetization might present problems for authorities – for example, where supervisory authorities are funded by supervisory fees, supervised entities might not be happy that the data created in their being supervised is monetized and then only available to them for a fee.
    4. Europe does things differently: Europe thinks in hardware, software and innovation cycles that are still remarkably slow compared to the US or Asia. The LLM/GenAI age, with its increasing frequency of and shortening intervals between releases, shows these cycles to be too long. Maybe this is why there are no European BigTech players. Europe might be falling behind due to over-engineering. To use a language allegory, Europe is a “guardian of usage” stifling rather than driving innovation, possibly due to heavier-touch regulation.
      However, at the same time, Europe’s approach is more one of “for the greater good”, rather than the personal enrichment of the select few. The literary translation keynote given by Anton Hur at the FIT World Congress in Geneva touched on how Silicon Valley specialises in vaporware.
    5. Metadata plays a massive role: when I started using a CAT tool, it was as simple as having source and target languages in TUs. However, since I started at the FMA, I have really understood the need for additional metadata – e.g. to indicate the supervisory area, locales, various usage fields etc. The volume/amount of data is not the only indicator of its value. BigTech always takes the “big(ger) is (more) beautiful” approach regarding the amount of data used to train LLMs. So yes, small (with good metadata) can be just as beautiful.
    6. “Getting the toothpaste back in the tube” – how do you remove data from models? One of the biggest asymmetries is the difficulty of removing data from a model without retraining it, compared to getting data into the model. It really is a black box. Similarly, with language constantly changing, there is always the issue that data that is current now does not always remain current, and therefore needs replacing. It would be interesting to understand the breakdown between the training and production costs of models, to understand how big the asymmetry is (I can only assume it is a large one!).
    7. I’ve always understood the value of data, but not been able to value it: the platform for monetization of data really brought home the value of data – and reinforced that there is also real value in well-curated data (as I see from my work with translation memories and termbases). The “fuzziness” of synthetic data also brings home the real value that lies in human-curated data.
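    Thought 5 above can be pictured concretely: a TU enriched with metadata is just a record with a handful of extra fields alongside source and target. A minimal sketch in Python (the field names, such as supervisory_area, are my own illustrative assumptions, not any real CAT-tool or TMX schema):

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class TranslationUnit:
        """A translation unit (TU) enriched with metadata beyond source/target."""
        source: str
        target: str
        source_locale: str = "en-GB"
        target_locale: str = "de-AT"
        supervisory_area: str = ""                      # e.g. "banking" (illustrative)
        usage_fields: list[str] = field(default_factory=list)
        year: int = 2024                                # the age of a TU matters for reuse

    # A small, well-labelled TU: "small (with good metadata) can be just as beautiful"
    tu = TranslationUnit(
        source="supervisory fee",
        target="Aufsichtsgebühr",
        supervisory_area="banking",
        usage_fields=["fees", "legal"],
    )
    print(tu.supervisory_area)  # banking
    ```

    The point is not the code itself, but that value lies in the labels: two corpora of identical size differ enormously in usefulness depending on whether fields like these are populated.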

  • Who’s in/on the lead in early 2025?

    In December 2023, I wrote about the state of human-machine translation as we headed towards 2024. The technological march of machine translation had dominated 2023. In two decades of personal musings comparing my professional situation with that of 12 months before, 2022 and 2023 were the first time my outlook had been more pessimistic in consecutive years.

    My view was one of human translators being pushed towards fighting for scraps at dumping MTPE rates. More were considering moving away from translation or increasing their other activities than moving towards focussing solely on translation. In house, I had started to receive editing and revision requests that “didn’t seem quite right”. They seemed more fluent than their authors’ previous drafted texts, but also weren’t quite factually correct. In other cases, the inconsistency of terminology shone through. The sinking feeling was that my own descent towards MTPE drudgery had begun. The profession shared my pessimistic outlook. The fragility of (self-)employment relationships, needs for efficiency and cost-cutting amid difficult financial times were also apparent.

    2025: a turbulent start

    When I started sketching out this article in mid-December 2024, I didn’t know how 2025 would begin in terms of technological announcements. DeepSeek was not on the radar – by the end of January it was everywhere. Possibly from a translation professional’s perspective the most interesting aspect was OpenAI’s complaint in late January that new upstart DeepSeek was using “its” data. That’s right, the same data that OpenAI itself had unashamedly scraped to train itself. Excuse me Mr. Altman while I locate the sub-atomic-sized Stradivarius.

    In recent weeks, I’ve read a number of people saying that this could be positive for easing OpenAI’s (perceived?) monopoly. For many, ChatGPT has become a metonym for AI. Others think it could herald a torrent of new solutions – some fear one that might finally be able to translate (impacting their endangered volume of translation work and pushing them further towards MTPE’s clutches). And that was before the latest development of Elon Musk expressing his wish to buy OpenAI.

    The schism between the translation industry and the translation profession

    The trend of recent years of a divergence in approaches between the translation industry and the translation profession continues. It was a pandemic edition of the Translating Europe Forum (TEF) that first pushed the Human in the Loop (HITL) agenda. At first sight, its deceptive allure took me in. Over time, I became aware of the weaselliness of the term “Human in the Loop” for translation. HITL is misused: it fails to define the expertise level of the human, and does not advocate the human retaining control/leadership. The industry seems to be revising its estimation somewhat with the new term “Human at the Core”, which is closer to my “Expert in the Lead” approach than “Machine in the Loop”, but is still coined by the industry. My “Expert in the Lead” concept is also about coming down on the side of the profession over the industry.

    Fresh hope from the industry?

    A piece from late in 2024 by Arle Lommel for CSA did give me some hope that the industry is also coming round to the fact that HITL will not sustain human translators in the human-machine translation era. One remark in that piece captures why HITL gets it wrong, and how that “janitorial role” of HITL will not be fulfilling.

    “[…] “human in the loop” models – a sort of window dressing for post-editing – … often relegate expert linguists to an essentially janitorial role, sweeping up “bad MT” (quality checking and correction) and cleaning up AI messes. Instead, CSA Research has shifted to describing augmented translation as “human at the core” because, at the end of the day, empowered linguists will be making the decisions, aided by technology.”

    The Language Sector Slowdown: A Multifaceted Outlook, Arle Lommel for CSA Research

    Looking back to my assessment in December 2023, I opened with the following paragraph:

    The debate about the future of (human) translation and changing role of translators is the biggest topic in translator circles. 2023 has been the year of the (unstoppable?) march of machine translation. Within a year of bursting onto the scene as an unknown, OpenAI’s chatbot, ChatGPT, can apparently also translate. Human translators increasingly face tighter, more competitive markets. Many are not even consulted about their replacement by MT solutions, but maybe grudgingly offered MTPE work. And there are talks of tightened budgets and gloomy outlooks of recession. So are the days of out-and-out translators numbered?

    Michael Bailey, transl8r.eu blogpost – December 2023 – Who’s in/on the lead as we head into 2024?

    As I prepared to write this post, I asked fellow professionals on LinkedIn how they viewed the situation. A modest little poll among my network of fellow translators returned a slightly blurry snapshot. I asked pretty much the same question as I have been asking myself for over two decades. From over 80 responses, fewer than a quarter viewed their situation more optimistically. In contrast, 45% viewed their situation more pessimistically, and the remaining third saw it as unchanged from the previous year. Among the responses, a number of in-house translators and specialists in less common language pairs seemed more optimistic. Of the positively inclined, many were offering premium services with a narrow specialist focus. A few reported that new areas of specialism had emerged to compensate for the slowdown in business in other areas.

    Busy-ness and Business

    Some responses mentioned improved levels of “busy-ness”, but qualified that the improvement was due to time-consuming customer acquisition drives. For others, new services and specialist areas had arrested the slump, but hadn’t banished doubts about the long-term future. In a few cases, new revenue streams had opened up from (re)activating language pairs, although a number I connected with did not realistically view adding further language combinations as a potential solution. Others felt that the situation was no worse than a year ago, but that it had also not improved. For some of these, this kind of struggle was a “new normal” – the glass was neither half-full nor half-empty.

    Of those viewing the situation more pessimistically, several commented about an acceleration in the shift towards MTPE from “pure” translation work. Many freelancers lamented that their “valuable and valid” contribution was unable to outweigh their customers seeking “value for money”. By value for money – they mentioned diminishing rates (whether by line, character or page) or more MTPE work. A couple also said that work from major agency clients drying up had impacted them. In other cases the agencies had shifted towards an MTPE-based model instead of “classical translation”. Some others mentioned that reorganisations and mergers had meant that major customers had already reviewed the situation. A couple of respondents mentioned that smaller companies had been absorbed into larger groups with in-house language services.

    Payment Practices

    One contact also said that their pessimism was fuelled by longer payment times, although still within the agreed timeframe – a potential sign of agencies also suffering from cashflow issues. Amid ongoing cost of living issues (price inflation outstripping wage/salary increases, or downward pressure on rates), the financial squeeze becomes more apparent.

    By delaying this post, I also wanted to allow myself the opportunity to catch up with the first swathe of “Monthly Recap” posts on LinkedIn in 2025, in addition to “year-end round-up” posts. I’ve come to appreciate that it has been a busy month if I don’t even have time to consider writing one. However, this is where internal time and performance tracking negates the need for such a round-up. In 2024, H2 showed a remarkable uptick: in May, based on figures until the end of April, translation time was around 73% of productive hours. By year-end it was above 80%. In addition, I worked more hours on translation in 2024 than my total hours for 2023.

    How do you feel about the security/future of your role as a human translator, compared with 12 months ago?

    These figures are why I see the security/future of my role more optimistically going into 2025. But this might be due to the short-termism of recent successes masking and negating struggles earlier in the year. Looking back at the reasons behind my pessimism in the last two years, uncertainty weighed strongly on my mind. Transformation and reorganisation bring uncertainty and insecurity. As a digital transformation programme started, I had felt marginalised and sidelined. And I felt that remit creep was also disruptive for my “course” as a translator. So while doubts existed, along with the simmering AI hype, I remained pessimistic. Learning about a suggested roll-out of MT without harnessing our language data probably fed the pessimism. So what changed so much in twelve months for me to enter 2025 with renewed optimism?

    Getting back to business

    In previous years, the non-translation-based tasks I was logging increased. I advocate that 100% efficiency/productivity is an illusion, as is 100% productivity as a translator. However, translators are susceptible to worrying about a dilution of their time spent translating. At year-end 2023, my productivity tracking showed I was translating for less than 80% of my hours. When I started the job, the level was closer to 90%. I felt a need to arrest the drift towards my knowledge-based job becoming a non-translation-based one. So I enlisted the services of a coach, and focused on using my mid-year appraisal to shed some non-core commitments. It was a timely reboot, and boosted my translator’s esteem. Esteem is so important.

    Translating for a predominantly “non-public” domain means that a lot of my work’s impact never reaches the outside world. Internal visibility is therefore very important. Fortunately, the second half of 2024 served up a plethora of demanding, substantial and internally visible jobs. As a translator I still feel happiest translating, although I can use non-translation tasks to draw breath. I’ve learned to fuel my internal visibility. I am most visible where my translation results in the desired supervisory outcome at short notice. Internal visibility also builds momentum, as has been the case going into 2025.

    A public or private persona

    As wonderful as a very private persona sounds for less gregarious translators, I nevertheless need to maintain a public presence. Presentations and publications (e.g. in the ITI Bulletin and Universitas Mitteilungsblatt) also bolster the public impact of my work as a translator. The workshop I gave in Spiez and the contacts gained there were crucial in resolving a lot of self-esteem issues. Three days’ reflection proved a turning point for “getting back to being me” and steering out of the doldrums of silo-thinking. As I put the final touches to this piece, I already have three further presentations confirmed for 2025, a conference participation and other irons in the fire.

    In silo-like environments, especially for the “lone rangers” – i.e. SPLSU in-housers like me, or freelancers who do not work together with other translators in virtual teams – social media can become an ersatz barometer of success and a way to shout from the rooftops. The problem is that the algorithms can suck you in, but don’t pay the bills. Add the peacocking influencers to the equation, and they will tell you to post hourly/daily/weekly to feed the algorithm. However, my work’s confidential nature means that I can’t get sucked in by the siren-like call of the algorithms. I don’t have the fear of missing out that a freelancer has if they don’t take on a piece of work. And most of the messages are about successes – after all, you project success far more than failure.

    How are others feeling?

    From some of the end-of-year posts I read, some professionals certainly put in the hard yards and enjoyed exceptional years (in terms of acclaim and remuneration) in 2024. To them: congratulations – your messages show that there is plenty of life in professional translation. From viewing their profiles and websites, they all specialise in certain language combinations and some very interesting niches. The common key to their success also seems to have been their efforts in fresh customer acquisition and customer retention.

    Some found that new areas of specialisation were opening up: either related to their existing areas or fresh new ones. Others pleasingly reported old customers feared lost returning to them after a dalliance with the AI/MT “good enough” world. For every success story, however, there were also stories of people losing customers and work drying up. In some cases, agencies folded owing translators money. One such case was the bankruptcy of WCS Group and the agencies it ran (subsequently bought by Powerling). Many freelancers were left out of pocket. As I added to this post in mid-January 2025, there was a new twist to the Powerling story: the Dutch Society of Translators had just expelled Powerling as a member (h/t to Loek van Koeten for this information).

    Upskilling and job crafting for survival?

    Before I was able to actually narrow my remit, I had had to consider upskilling (i.e. obtaining alternative skills to complement my skills as a translator) and even dipped a toe in the water by actively pursuing courses to be fit for the new world of human-machine translation. However, obtaining new and possibly diametrically opposed skills to those I already possess as a translator proved counterproductive. Instead, with new areas of supervision coming online, my focus has now reverted to deepening my knowledge of the subject areas I cover. Some translation professionals have echoed this: those who will survive already possess all the skills and specialisations to survive.

    Teaching old dogs new tricks?

    Regarding the prospects of who will survive the AI deluge, I’ve read numerous estimates of the proportion of translators who will “survive” the AI revolution, with many stating between 10 and 25%, although the range is far wider. Part of the issue also relates to the stage of their career translators are at. As William Lise identified in a blog post of his, some are close enough to retirement, and others young enough to change position. However, there is a substantial group of translators, particularly mid-career ones, trapped by the roots they have put down.

    Whether people who have retrained from other professions are any safer is hard to tell. They may bring expertise from a past career, but lack translation experience. Being newer in the “trade” might work both ways: they may be more firmly tied to making it work because the cost of retraining hasn’t been recouped yet, or, in contrast, not so firmly embedded in the profession that they can’t “get out”. A number of my contacts who always viewed translation as a “safe Plan B” have changed their minds about wanting to commit to it.

    Expertise counters AI hype

    Nonetheless, the reality after the tidal wave of AI hype has proven that expertise remains essential – accountability and credibility of translations are areas where human translators still have an advantage. AI and NMT flush out generalists working for agencies and pseudo-specialists. Broad fields of specialisation (e.g. financial/legal) for agencies may stop people from standing out from the crowd. Others say they experience agency work being awarded purely by “fastest finger first” – an issue I mentioned when I blogged about the profession/industry schism in autumn 2023. In that case, expertise is unlikely to be given a chance to shine through.

    In contrast, genuine specialists in narrow fields remain an elusively rare commodity. Regarding AI, there is a healthy scepticism about how it can really be a substitute for expertise and experience. Simply throwing more scraped data at the problem isn’t the solution, particularly as synthetic language data now swamps the originally lush large language pastures trained on human generated language. In this regard there is a counter revolution of some boutique LSPs looking for high-end translators whose personal service commands premium rates. In a couple of cases, some freelancers have even reported that they have profited from customers turning to them due to unsatisfactory agency experiences, viewing them as a “perfect fit” after lacklustre past experience.

    And when the boot is on the other foot?

    Occasionally, I outsource work to freelancers. The objective remains to ensure the desired supervisory outcome. This also sheds a lot of light on the “black box of translation”, market practices and how much solid briefs help. I have come to get a good feeling for whether translators 1) want the job and 2) feel they can do justice to the job in hand. Genuine experts seem less fazed about not being able to take a job on. I also admire their honesty. Such a situation might be vastly different from dealing with an agency, where selling and margins are everything. The requirement of a satisfactory outcome allows me to use a best-bidder approach, rather than a cheapest-bidder one.

    Capitalising on AI’s vulnerabilities

    Amidst the OpenAI/DeepSeek saga, I used the opportunity to highlight the accountability, control and expertise that expert human translation offers and that AI and MT cannot. When “data scraping” allegations surfaced, I chose to capitalise by highlighting data confidentiality. My approach with the aficionados who brazenly claim how much time their ChatGPT Pro subscription saves is to ask how they feel prompting techniques have changed, how robust their sources are, and what they think about the size of the context window.

    The disarming tactic is to speak the fanboy’s language rather than coming across as too protectionist. Only then do you highlight the issues that impact your translation work, and therefore confirm why your expertise is required (e.g. in a zero/low-resource language combination, with high demands on confidentiality, and the necessity of avoiding hallucinations).

    Changing job remits

    In terms of job creation, I’ve observed a tendency towards not replacing departing staff, or at best retaining existing headcount. New translator jobs are rare. Looking at job descriptions, many advertised positions have been maternity cover, often initially limited to a year. It can easily take a year to get to grips with new procedures, practices and subject areas. Other vacancies have more of an emerging project manager/coordinator role rather than a “translator” remit.

    Monitoring open opportunities (I receive them through mailing lists from professional associations) is useful for gauging remit shift/creep. Job descriptions have clearly changed. Job creation, rather than replenishment, occurs in the area of LangTech. New LangTech units in larger language services are in-housing expertise. From conversations with people fitting the new profile, many highlight prominent “sponsors” within the organisation and strong links to IT as being behind the creation of the new position.

    Managing language data has definitely become more than a “rainy day” activity – as has terminology work. In a small language services unit, terminologists were traditionally considered a luxury. With the advent of Machine Translation, robust terminology has gained in importance. Machine translation-generated texts into German have demonstrated why I need terminology for all locales of German. My recent work has really brought home the differences between Swiss/German/Liechtenstein/Austrian banking terminology.

    Driven loopy – the expert/machine/human in the loop/lead

    As previously mentioned, the very strong industry-led approach to human-machine translation is of “machine in the loop” and “human in the loop”. The industry’s financial and PR clout dictates the way translation (both as an industry and a profession) moves forward. However, industry-led perspectives focus on leveraging technology to an extent where human involvement is negligible or a poorly-paid afterthought.

    This is quite apparent from the industry’s shift from humans predominantly “translating” to “post-editing”. In some cases the actual level of human expertise in the post-editing stage is questionable. Pitiful rates fail to motivate a professional: low per-word rates for MTPE require unrealistic output levels to earn enough. It would take raw output that is pretty close to publishable in the first place, which you could simply sign off. However, this realistically only works where translation is only required to be “good enough”. And the long-term job satisfaction of this approach is also negligible.

    The HITL narrative is pushed so far that the MITL approach barely gets a look-in. Rebranding translators as “language experts” is a mere sop, much in the way that the electorate in the UK may/may not have “had enough of experts”. “Language experts” is just another weaselly term: genuine expertise may often be found in far narrower areas or a single source-target language combination. Imagine the (justified) outrage if we were to rebrand microbiologists or astrophysicists as “science experts”.

    Throw more language data at it?

    The fact is that amid Messrs. Altman et al. scraping the Internet for content to build their LLMs, human-generated language data has been exhausted. Tech bros continue to recite their “more data = better results” mantra. Synthetic data has already flooded the Internet, creating new “reheated” synthetic language data. All that changes here is the consistency of the turgid porridge.

    The “more data = better results” approach is like a juggernaut or steamroller, or raging water trying to pass through a pipe of a certain diameter. Upgrading the pipes might permit a greater volume of water to flow, but unless done end-to-end the flood risk still exists.

    Many AI companies are still a long way from break-even, let alone posting profits. This raises ethical questions. Why should we allow tech companies to break human knowledge-based industries and accelerate climate change, only to line the pockets of the super-rich, if they ever turn a profit? Industry dictates the terms: amid skewed arguments of increased efficiency, knowledge-based work is still fraught with “hallucinations”. Why should translators tolerate such hallucinations?

    Resistance is (not) futile?

    My view about the Expert in the Lead results from my conviction that the role of the human in human-machine translation remains essential. I do concede that the days of “human translation” from the formative days of my career are gone. Instead, rather than resist the use of technology, the emphasis has shifted to ensuring human expertise remains in control. For me, this involves making the smart choice about the use of technology, rather than rejecting it. Experts in human-machine translation can resist by refusing to have their workflows dictated to them. Refusing to be a cog in the process keeps them in the lead rather than in the loop.

    My bespoke service revolves around correctly blending multiple translation memories (setting penalties in relation to the age of TUs, subject matter and incorrect locale/language variant) and really knowing what the translation is about. At the same time, I can also make a sound decision about the sources of reference material to access. This has far better chances of meaningful and fruitful success than the drudge of cleaning out the sodden cage of a stochastic parrot – an LLM prone to hallucination.
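    The TM-blending described above can be sketched as a simple penalty model: each candidate match starts from its raw fuzzy score and is docked points for age, locale mismatch and subject-matter distance. A hypothetical sketch in Python (the penalty weights and the field names are my own illustrative choices, not those of any particular CAT tool):

    ```python
    def penalised_score(fuzzy_score: float, tu_year: int,
                        tu_locale: str, wanted_locale: str,
                        tu_subject: str, wanted_subject: str,
                        ref_year: int = 2025) -> float:
        """Dock illustrative penalties from a raw fuzzy-match score (0-100)."""
        score = fuzzy_score
        age = ref_year - tu_year
        if age > 5:                       # age penalty: 1 point per year beyond 5
            score -= float(age - 5)
        if tu_locale != wanted_locale:    # e.g. a de-DE TU offered for a de-AT job
            score -= 5.0
        if tu_subject != wanted_subject:  # subject-matter mismatch weighs heaviest
            score -= 10.0
        return max(score, 0.0)

    # A "100% match" from an old de-DE TU in the wrong subject area drops to 75.0
    print(penalised_score(100.0, 2010, "de-DE", "de-AT", "insurance", "banking"))
    ```

    The design point is that a nominal 100% match is not automatically the best candidate: the expert in the lead decides how heavily age, locale and subject matter should count, rather than letting the tool's default ranking dictate the workflow.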

  • 7 thoughts/questions to start 2025: use of raw MT output

    It is New Year’s Day 2025, and I am currently finalising my “Who’s in/on the lead in 2025” post. I decided this year to also try to distil some of my comments on LinkedIn into mini blogposts. In this format, I’ll post seven thoughts/questions, throw them open to the hive mind and then try to draw the responses together in a response post.

    In recent years, I have seen a lot of posts pointing out particular machine translation errors. Their tone can vary wildly from “considered” to “downright dismissive”. The former explain the shortcomings of using MT (in particular its raw output), and how there is more to consider than the fluency that convinces a lay audience. The latter often attack the kind of output you expect to find on social media sites belittling signs found in English around the non-English-speaking world.

    Here are my seven thoughts on the use of raw MT output:

    1. To what extent do professionals (i.e. people in “white collar” positions) actually trust raw output from MT?
    2. If such a raw MT translation does go to print/screen, who is accountable for it?
    3. Imagine the outcome results in something with fatal/lethal consequences. Presuming that there are multiple levels of sign-off. Who takes the responsibility?
    4. How far away are we from litigation over translation quality when premium machine translation solutions make bold claims about accuracy?
    5. At which point will output get worse as synthetic data swamps training of MT engines?
    6. What does it take for output to be good enough/fit for purpose?
    7. Should we educate the users rather than blame the machine?

    This list was originally posted as a comment to a post on LinkedIn. Feel free to share your thoughts here or on LinkedIn.

  • LinkedIn Collaborative Articles: kryptonite, marmite or plain embarrassing?

    In September 2023, LinkedIn alerted me to the existence of its collaborative articles, written using AI. I took a look at them and contributed to some to understand the outcomes of doing so. I rapidly noticed a growing number of dissenting voices about them.

    “Contributing” has evolved into a commentary on how LinkedIn uses AI to tackle a question. Each question/article is split into around five to seven main points, with the possibility to add to each section. I have generally posted in the areas of translation, technical translation and linguistics (in relation to the discipline of translation). My “interventions” of typically between 250 and 750 characters garner between five and thirty reactions on average.

    As the number of reactions per intervention has increased, the number of followers and contact requests has also increased. And with it, LinkedIn recognised me as a “Top Voice” in certain areas. This status is no more than a small graphical badge on my profile. Aside from an initial burst of attention, the badge has not really made any big difference. If you have badges in multiple categories, as I quickly managed to do, you choose which one you want to display on your profile.

    However, the badge had a very negative side-effect too. “Compliance” by contributing to Collaborative Articles is poorly received by the kind of followers I want and actively engage with. We talk about subjects like Trados shortcuts or the state of the profession vs the industry. Their contributions are thought-provoking and enhance my knowledge. While my follower count has increased, the followers gained from Collaborative Articles are “fickle followers”: there is far less traction and less genuine engagement. In contrast, the contacts I nurture from events, networks and groups interact and produce stimulating content.

    Are these the kind of followers I want?

    A steady flow of 20 to 30 reactions sufficed to attract a large number of followers, albeit not necessarily those I in turn wish to follow back. It has triggered an inundation of contact requests and followers. Many of these requests have nothing in common with me other than being a translator – and no common specialisms or language pairs! Frequently there is no real explanation of why they wish to connect.

    In one case, a follower called my office, was put through to me and gave me a sales pitch. He then made a request to connect, and followed up by DM. His polished sales patter did not interest me in the slightest, and I have no intention of connecting. In this regard I am comparatively lucky: my profile picture shows that I am clearly a middle-aged male, so I dodge the unsolicited personal mails that others have to contend with.

    In a world of fresh and authentic content…


    Authenticity and freshness of content are a LinkedIn mantra. So when Collaborative Articles fail to deliver either, it becomes clear that there is little genuine intention for their content to enrich. Even their titles are prosaic, clumsy and repetitive. After reading only a few articles, it became clear that they were generic prompt-based sludge. The prompts invariably spat out similar responses to a vast number of questions, particularly questions/titles that differed by only one or two words. (It made me wonder if they use the RANDBETWEEN function in Excel to spit out new titles.)

    Some collaborative articles are clearly untouched by post-editing, and I suspect are clicked through by uninterested gophers on work experience, who genuinely have no idea about the subject matter or field. Many sections take a stance of mantra-like repetition of some false universal truth. This aspect of “universal truth” also prevails at industry-side conferences, where some speakers project the industry view onto professionals as if it were the only way to survive. #2023TEF seemed to go down this path to a certain extent, although it was good to see some professionals pushing back against the industry’s universal truth.

    Know your field

    As an in-house translator, I hold certain clear views on the use of Generative AI in translation, particularly the advantages and disadvantages that it poses for the industry and the profession. There are many divergent stances, depending on the area of translation you work in. I am still firmly in “Camp Profession”, and my stance is in line with professionals who are predominantly self-employed.

    Collaborative articles sit firmly in “Camp Industry”. The first wave of LinkedIn collaborative articles on translation seemed to read like “MT is gospel and the only way to work.” This is certainly not the case in the profession. Initially, the near-standard use of CAT tools in both profession and industry scarcely got a look-in. A few months in, there has clearly been retraining based on the contributions received: “Translation Memory” and “CAT tools” now figure more strongly. The AI still trips up, though: far too many sections of the collaborative articles deal with what a CAT tool is. While this suggests the initial bias in the training of the model has been overcome, it confirms that submitted contributions are duly being used to train LinkedIn’s AI.

    Sometimes collaborative article titles nevertheless remain downright incongruous. They choose to address subjects like “bilingual communication in global enterprises.” This must be due to the AI understanding translation as an exercise from a single source language to a single target language. But “global enterprises” communicate in many languages, although possibly only bilingually in an individual target market. Mere bilingual communication throughout a multinational corporation is as feasible as a multinational corporation with one desktop PC and a filing cabinet. (Note: this was a humorous dig at the Robinson Corporation from Neighbours.)

    LinkedIn still appears not to know my field

    LinkedIn still spews out subjects I ought to comment on – many of which are far outside my expertise. I reject any notification to contribute to a subject outside my field; probably the furthest “off remit” I’ll go is to contribute about content strategies, as I have half a clue about them.


    To understand the worthlessness of “Top Voice” status, I acquired one on a subject I was not qualified to talk about. That achieved, I stopped commenting on that subject area. After several months, they did eventually remove my “Top Voice”. This demonstrates that uptake is poor among professionals, while the decay period for losing the status is a long one.

    Contributing to Collaborative Articles: kryptonite?

    By contributing, you provide human-generated text data to train the LLM that LinkedIn uses. By contributing, you are siding with the industry to the detriment of professionals. If you are a language professional, you may be contributing to drowning out your own voice on the platform.

    Some professionals have therefore actively chosen not to participate – a very respectable decision. Others have submitted content that is itself AI-generated, to try to hasten the “rot” of the model. Putting it another way: they try to feed the snake its own tail. My reservation about this approach is that LinkedIn is a platform to promote skills, so such efforts might be futile. As contributions are not posted anonymously, they might also feed negatively into the algorithm and come back to haunt you.

    Collaborative Articles: Marmite?

    Another comparison would be with Marmite. But what is there to love and hate about them? What I have actually loved is that their pure mediocrity has helped me identify potential areas to blog about. However, LinkedIn might scrape my blogposts if I share them on LinkedIn, to feed its AI and in turn its collaborative articles. I need a better understanding of whether LinkedIn obeys a customised robots.txt file or simply scrapes my website. The “disruptor” in me loves the creative ways other contributors shovel nonsensical AI-generated content into the Collaborative Articles – hopefully with the eventual effect of “breaking” them.
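    On the robots.txt question: the file can at least state the intent that AI crawlers stay away. A minimal sketch is below – GPTBot, CCBot and Google-Extended are real published crawler tokens, but whether any given scraper (LinkedIn included) honours such directives is precisely what I would like to understand better:

    ```text
    # robots.txt – ask known AI-training crawlers to stay away
    User-agent: GPTBot
    Disallow: /

    User-agent: CCBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

    # Everyone else (including ordinary search indexing) remains welcome
    User-agent: *
    Allow: /
    ```

    Note that robots.txt is purely advisory: well-behaved crawlers check it before fetching pages, but nothing technically prevents a scraper from ignoring it.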

    I rapidly came to hate the mantra-like repetition of the questions. This has led to my answers becoming a repetitive stream of consciousness. I do not wish to invest my time in writing “snippets” of 250–750 characters, as they have a limited impact. One contributor I seem to spot regularly uses it to promote their CAT-related solution. Sadly, the system is too dense to pick up that this is thinly veiled advertising, and it is not really possible to report such contributions as self-advertising. Then again, why should users have to police the site’s output when it makes money from publishing such dross?

    Collaborative Articles: plain embarrassing


    Below, I have cited a couple of typical examples of frankly embarrassing statements I have encountered. There are countless others.

    “Human translation (HT) is the use of professional or native speakers to translate text or speech from one language to another.”

    Collaborative Article on: “You need to translate a document. How do you know which service to choose?”

    I understand that fast-moving technical advances blur boundaries. But I despair that LinkedIn fails to grasp the difference between translation and interpreting. Sadly, some non-lay human audiences also seem to struggle in this regard.

    Or then there are questions implying that professionals should be in thrall to the industry, a practice I call out wherever possible.

    “How can you use translation memory to optimize pricing for your clients?” This title advocates that translators should punish themselves for the effective use of assistive technology. CAT tools and discounts for fuzzy matches are already being used to exploit translators and drive costs down, effectively penalising their investment in CAT tools. This is part of the industry vs profession schism that agencies exploit. I’d urge translators to work directly with customers, and to forge close working relationships, but to take a balanced approach to offering discounts – give them something, but do not reduce yourself to slavery rates.

    My response to the Collaborative Article on “How can you use translation memory to optimize pricing for your clients?”
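    To make the discount mechanics concrete: the source text is run against the translation memory, segments are bucketed by match percentage, and each band is paid a fraction of the full rate. A minimal sketch in Python – the band labels and percentages are hypothetical, for illustration only, not recommended rates:

    ```python
    # Hypothetical fuzzy-match discount grid: fraction of the full rate paid per band.
    # These percentages are illustrative only, not recommended or typical rates.
    DISCOUNT_GRID = {
        "repetitions": 0.25,
        "100%": 0.30,
        "95-99%": 0.60,
        "75-94%": 0.80,
        "no match": 1.00,
    }

    def weighted_word_count(counts):
        """Effective word count for invoicing: words per band times its rate fraction."""
        return sum(words * DISCOUNT_GRID[band] for band, words in counts.items())

    def price(counts, rate_per_word):
        """Invoice total under the discount grid."""
        return weighted_word_count(counts) * rate_per_word

    # A 2,000-word job where the TM already covers much of the text
    analysis = {"repetitions": 200, "100%": 300, "95-99%": 100, "75-94%": 150, "no match": 1250}
    ```

    At a hypothetical EUR 0.15 per word, this 2,000-word job invoices as 1,570 weighted words, i.e. EUR 235.50 – the translator absorbs the remaining EUR 64.50 as the “discount” for their own investment in the technology.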

    Another recent case referred to CAT environments allowing translators to translate more content in more languages.

    The article states: “Lastly, scalability is increased by allowing translators to handle larger volumes of content and more languages.” This is a clumsily worded statement. Using a translation memory will not in its own right allow translators to handle more languages. It is no substitute for their mastery of a language, nor does its use unlock new languages. It would be far more accurate to say that TMs can allow LSPs to centrally store language data in a large number of languages and, under certain circumstances, to leverage this language data to assist translations in new source/target language combinations.

    My response to the Collaborative Article on “How can you choose the most effective translation memory tools and technologies for your project team?”

    Recently, I started calling out Collaborative Articles that fall into the plain embarrassing category. However, I have tried to take an approach that does not give any information away beyond what is wrong: correcting it could lead to my input being used to retrain the LLM. I rate a lot of Collaborative Articles as poor. Sadly, the feedback categories offered do not fit the issue I want to address: AI output being published with minimal human intervention.

    Will nobody rid me of this troublesome grift?

    Many users on the platform despair at LinkedIn collaborative articles and requests to contribute filling their feeds. It is possible to stop notifications about LinkedIn Collaborative Articles from appearing among your notifications. However, it is not possible to actively remove other people’s contributions to them from your feed. There is a way to block all content from a certain person, but not a certain type of content from that person.

    In this regard, there are other possibilities that I would like to see implemented on LinkedIn. I would also like to block carousels over a dozen slides – another case of “death by PowerPoint”, with inflated slide decks.

    I would also like to choose what kind of posts I see from companies I follow. For example, I follow a lot of internationally active banks to see what their news is. A lot of 3rd+ degree connections post about their new job at an institution on the other side of the planet. And then, because they tag their new employer, I also see the employer’s “Welcome on board!” comment to them, which LinkedIn assumes might interest me.

  • Who’s in/on the lead as we head into 2024?

    Who’s in/on the lead as we head into 2024?

    The debate about the future of (human) translation and the changing role of translators is the biggest topic in translator circles. 2023 has been the year of the (unstoppable?) march of machine translation. Within a year of bursting onto the scene as an unknown, OpenAI’s chatbot ChatGPT can apparently also translate. Human translators increasingly face tighter, more competitive markets. Many are not even consulted about their replacement by MT solutions, but are maybe grudgingly offered PEMT work. And there is talk of tightened budgets and gloomy outlooks of recession. So are the days of out-and-out translators numbered?

    The Chartered Institute of Linguists, which I recently joined, has released a white paper: CIOL Voices on AI and Translation. It addresses some initial reflections and major concerns. The White Paper points to a shift in professions: today’s professional translators will be the future’s language experts and consultants. Or will the new job titles be dismissed as a case of “old grapes in new bottles”?

    The introduction to the White Paper concludes:

    […] we can ensure that linguists remain at the forefront of AI integration in our field – the essential expert ‘humans in the loop’.

    Steve Doswell, Linguist, consultant and Chair of CIOL Council in CIOL Voices on AI and Translation

    The use of “expert ‘humans in the loop’” is telling here. Without “expert” attached, it would imply that the human involved may not be a linguistic expert. This ties in with concerns about the need for human judgement in using MT and LLMs for translation. It remains essential that users clearly understand their responsibility, as well as the pitfalls of using unsupervised MT. In-house language units must have an active role in training and onboarding users. Their involvement in the decision-making regarding the adoption of such approaches remains essential. It is not an out-and-out IT decision – even if the technological nature of the solution means IT must be on board. There is some very sensitive messaging in moving from a “human translation” approach to “human in the loop” if the intermediate “machine in the loop” stage is bypassed.

    Potential for upskilling and job crafting

    This presents possibilities for upskilling and job crafting – both useful tools for in-house staff retention. New remits might help retain senior staff members wishing to have a change from day-in, day-out translation. Any in-house solution will need dedicated language technologists. Language technologists are the new translators in terms of language services recruitment. Central banks and financial market supervision authorities have been hiring people with this profile for several years.

    It is also important to remember that any solution will need dedicated staff to work to its full potential. The quick and dirty approach might be to outsource, but such solutions, although quicker to implement, may not allow the desired level of control. An attractive interface is one thing, but there might not be the possibility to tweak the temperature of the underlying model, or to train it to your specifications – both of which help to extract the maximum benefit for your use case. However, this training isn’t possible on the fly – it needs a long-term training concept and commitment. And naturally, potential succession management issues need handling too, whether due to sabbaticals, secondments, retirements or maternity leave. Entrusting an entire solution to a single set of shoulders is also an operational risk.
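    On the “temperature” point: temperature rescales a model’s raw scores (logits) before they become a probability distribution over the next token – low values make output more deterministic, high values more varied. A minimal sketch in plain Python (not any vendor’s actual API):

    ```python
    import math

    def softmax_with_temperature(logits, temperature=1.0):
        """Divide logits by the temperature, then apply a numerically stable softmax."""
        scaled = [x / temperature for x in logits]
        m = max(scaled)  # subtract the max to avoid overflow in exp()
        exps = [math.exp(x - m) for x in scaled]
        total = sum(exps)
        return [e / total for e in exps]

    logits = [2.0, 1.0, 0.1]
    cold = softmax_with_temperature(logits, temperature=0.5)  # sharper: favours the top token
    hot = softmax_with_temperature(logits, temperature=2.0)   # flatter: more variety
    ```

    If the interface fixes the temperature, you cannot trade consistency against variety for your own use case – which is exactly the kind of control an outsourced solution may withhold.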

    In this case, human involvement is still in more of an expert capacity – training and refining the engine, and ironing out the wrinkles. (Rinse and repeat as required!) Other tasks include managing new versions of software, interfaces and plugins to CAT environments, as well as maintenance. With an outsourced solution, the situation is not so clear-cut. This brings us back to the issue of the position of the human expert in the loop – whether human or machine is subordinate – in the translation process as a whole, and the problems with the terms used.

    Driven loopy – the expert/machine/human in the loop/lead.

    I first heard “human in the loop” mentioned at the 2021 edition of the Translating Europe Forum (TEF), the European Commission’s annual translation *industry* event. Over the last two years, I have lost count of the number of discussions I have had with other people about it. The problem it throws up lies in the interpretation of the role of the human.

    Moving further back, “human-in-the-loop” was used in 2012 as a classification for autonomous weapons systems. In that context, a human must instigate the action of the weapon. “Human-on-the-loop” is a classification whereby a human may abort an action. Lastly, and most terrifyingly, “human-out-of-the-loop” is the classification where no human action is involved. In this sense, human-in-the-loop does not imply that the human is subordinate to the machine.

    An intermediate stage exists between human translation and human in the loop: “machine in the loop”. In that case the machine is subordinate to the human, or more likely an expert. Both “machine in the loop” and “human in the loop” are weaselly terms: both fail to mention the role of human expertise – which is why some prefer “human at the core” or “human in the lead”. Additionally, one experienced colleague recently pointed out on LinkedIn that anything “human” says nothing about that human’s expertise. This is why I actively try to opt for “expert in the lead” (should that maybe be EITL or XITL?).

    It can be very difficult to explain the delicacy of the situation to lay colleagues – they see a binary choice: human translation or machine translation.

    After all, if you are not in the lead, but only in the loop, then you are effectively “on the lead”. And naturally there is the issue of the subsequent drift from human in the loop to human on/out of the loop. In that situation, we are in the territory of fully autonomous self-driving vehicles.

    Resistance is futile?

    AI technology is clearly here to stay. While there is a certain hype cycle, it is not just a passing fad. At the same time, its limitations are well recognised: AI/MT cannot be used unsupervised in many settings. Enhanced technological assistance might also open up new seams for translation (e.g. MT is a suitable use case for translating Airbnb and travel-site reviews, where a gist translation is all that is needed). Humans will remain integral to training the underlying systems; otherwise, at some point there will only be synthetic data left to train systems that require high-quality human data. Increased efficiency needs to be offset against the lack of job satisfaction that some will experience from being relegated to post-editing.

    Resistance to the advancing AI/MT tide is futile – both in-house and as freelancers. The battle to fight is in educating the lay public and countering the assumption that machines are better, faster and cheaper. People need to understand the real risks and costs. However, part of this battle will also be to ensure that the current cohort of translators/language consultants/language technologists in the making learns the skills needed for the career of the future. Many university courses adapt to changing times at the pace of glacial creep. This is where professional associations come in – both in upskilling existing linguists and in supporting the next generation as it begins its journey.