In this episode of Immigration Uncovered, host James Pittman interviews Ian Hawes, managing partner of ImmiTranslate, about how artificial intelligence (AI) is transforming the translation industry. They discuss the capabilities of large language models, ImmiTranslate's new AI-powered translation tool called Catalyst, ethical considerations around using AI for sensitive documents, and the future of AI in translation.
James Pittman: Welcome to Immigration Uncovered, the Docketwise video podcast. I'm your host, James Pittman. This is episode 42, and today we are talking about the AI revolution and its effect on the translation industry. I have with me a perfect guest for this topic, Ian Hawes, who is the managing partner of ImmiTranslate. Ian, welcome to the program.
Ian Hawes: Thank you, James. It's really exciting to be on here with you.
James Pittman: Absolutely. And I should mention right off the bat that ImmiTranslate is a Docketwise integration partner. So I've known Ian for quite a while. I know his extreme acumen on the topic of translations, and he's been in this industry for a while. So this is wonderful. And, Ian, before we even get into the AI, how did you get into the translation aspect of it?
Ian Hawes: Sure. So it's actually a really interesting story. Way back in, like, 2013, 2014, my best friend came to me. He was deployed with the Air Force. He was overseas. He met his future wife. They dated for a little while, he fell in love, and he was like, this is great, let's get married, you can come back to the US. He was being deployed back stateside. So he and his fiancée went through the K-1 fiancée visa process, and then went through the process to get her green card. And at the end of it, he came to me in, like, 2014 and was like, hey, look, we did this process, and it wasn't hard. We did it ourselves. We didn't use an attorney. But what was really annoying was getting the documents that we needed translated actually translated and certified so that they would be accepted. And so he came to me and was like, we should make a business doing this, just focused on these translations. And I was like, that's a terrible idea. It's not gonna make any money. It's not gonna be successful. I don't know anything about translations. I don't speak any other languages. My background was all in software development. He kept bugging me for a few months, and then I was like, all right, fine, we'll do it. And of all the ideas that we came up with (we're both born entrepreneurs), this was the one that actually really stuck. Not only was it successful in that we had a viable business, but it was something that we felt good about. It was something that really filled a need. Of course, things in 2014 were the same, but kind of different.
And it was through that that I came into this industry and really embraced it. I say one thing that I think most entrepreneurs kind of shudder at, which is that this is my career now. This is something that I'm committed to and that I really enjoy. So it's been a big leap for me as I've grown up in this industry and seen different things. But, yes, that's the origin story of ImmiTranslate.
James Pittman: Yeah. I love hearing the back stories. And it's such a great time to be in the translation industry, because now we have the AI revolution, which is a lot of what we're gonna be talking about. Just as with many fields, that, I don't wanna say threatens, but looks like it's really gonna revolutionize, and is revolutionizing, the field. So it's fortuitous, because that was not there in 2014, and now you're in the industry and you're getting a front row seat for seeing how this amazing technology is gonna change so many aspects of it.
James Pittman: You were in software development, so you knew about natural language processing and things like that. But when AI really burst onto the consumer scene about two years ago, what was your first reaction as far as what it meant for the translation industry?
Ian Hawes: Sure. So being in the translation industry, in the wider language industry, we have obviously felt the effects of not just AI, but also the shift from what used to be much more of a human-based process to, hey, now the machines have taken over. So being in this industry, we're kind of used to hearing, the death of your industry is impending, what are you gonna do next? And that's actually really liberating for us, because we're not so focused on, what does this mean for us? It's kind of like, okay, we got through machine translation. Then there was this new neural machine translation. We got through both of those. So for us, and for a lot of other folks in the language industry, it's just kind of business as usual. As far as when I first saw the power of AI and large language models, it was a little bit earlier than ChatGPT. There were earlier AI models that did some things really well, but they weren't actually that intelligent. And that was sort of a running theme among software engineers: oh, yeah, this AI is not very smart at most things, but maybe a little bit smarter at some things. And so I remember, I believe it was August or September of 2022.
Ian Hawes: So before ChatGPT had been released, OpenAI had sent out a bunch of developer emails basically saying, hey, come use our new API. Back then, it was called DaVinci. GPT-3 was marketed as GPT-3, but the model was called DaVinci. And I remember the first time I played around with it, I didn't really know what I was doing. I got this invite and sort of figured out what was going on, but the idea of a prompt and these responses was very foreign. And I remember there was this tweet from someone I follow on Twitter that was basically using a prompt to extract information. And I thought, that's actually really interesting, because going back to 2022, which in AI years feels like a decade ago, there were a lot of things that we were not very good at. And one of them was doing simple analysis on a document. So take a birth certificate, for example. We didn't know that it was a birth certificate. You look at it and you go, okay, this is a birth certificate. But how do you actually process the text on that document to say, with certainty, this is a birth certificate? And then how do you go beyond that and say, all right, this is a birth certificate issued in this year, by this organization or whatever? So what I did was I took a crisp document that we have in our test library, and I ran it through a free OCR. And I plugged it into DaVinci. And I told it, okay, tell me what kind of document this is, and tell me when it was issued. And it spit back the exact document name and the exact date that it was issued. And I was like, okay, let's go a step further. Tell me everything you can tell me about this document. Who's the subject of this birth certificate? Who are the parents? What's the gender?
What date was it issued? What municipality? What region?
Ian Hawes: And it spit everything back out at me. I was like, okay, all right, let's try five other documents. And we got five other results, and they were all accurate. That was the first time that I was like, okay, I hope nobody else figures this out, because this is a really big competitive advantage for us. Back then, all we were trying to do was be better at standardizing our templated translations. For example, the biggest documents that we see are birth and marriage certificates. That's huge in the immigration industry. Just about every applicant has some form of that. And so for us, just being able to classify and extract that information from these documents would allow us to reduce the turnaround time on translations tremendously, because we could be templating them. And we were, in a way, but we didn't have a good way of analyzing the documents to figure out, okay, what are we actually looking at? So that was back in 2022. And like I said, one year in the AI era is like five years. So things have improved dramatically since that little experiment, and we've been able to roll stuff out into production. But, yeah, that was the first time I was like, uh-oh, we've got something going on here.
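The experiment Ian describes (OCR text in, document type and fields out) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not ImmiTranslate's code: the field names, the prompt wording, and the expectation that the model replies with JSON are all hypothetical, and the sketch stops short of calling any real model API.

```python
import json

# Hypothetical field names, chosen to mirror the questions asked in the conversation.
FIELDS = [
    "document_type", "issue_date", "subject_name",
    "parents", "gender", "municipality", "region",
]


def build_extraction_prompt(ocr_text: str) -> str:
    """Build a prompt asking a model to classify a document and extract fields."""
    return (
        "Below is OCR text from a scanned document.\n"
        "Reply with a single JSON object containing these keys: "
        + ", ".join(FIELDS)
        + ".\nUse null for any value that is not present in the text.\n\n"
        "OCR text:\n" + ocr_text
    )


def parse_extraction(response_text: str) -> dict:
    """Parse the model's JSON reply, keeping only the expected keys."""
    data = json.loads(response_text)
    # Missing keys become None; unexpected keys are dropped.
    return {key: data.get(key) for key in FIELDS}
```

The prompt string would be sent to whichever model is in use, and the reply passed to `parse_extraction`. Keeping a fixed key list makes it easy to slot the extracted values into a translation template.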
James Pittman: Amazing. So as of right now, how do you see AI making changes in the translation industry, and what are some of the most significant advancements that you see going on?
Ian Hawes: Sure. So AI and large language models are really built for translation. The transformer architecture, which has kind of revolutionized AI and been the bedrock of modern AI and large language models, actually came from an academic paper written by a few people at Google, and their goal was to create a better neural machine translation application in Google Translate. So, really, from the very get-go, LLMs have been very good at translation, because that's what they were effectively designed to do. They have gotten phenomenally better since the initial transformer paper. Looking at GPT-2 and GPT-3 and now GPT-4, we see huge improvements, not only in translation of the more popular languages, but even in these smaller, esoteric languages that don't get as much attention. It's been a massive improvement. Obviously, there are concerns that are present, certainly in use cases with LLMs that are not specific to translation. For example, hallucinations; we've talked a lot about that. Those are actually not as present in translations. You have to change some parameters when you're using LLMs to do translation, but we don't suffer from some of the same side effects. So the hallucinations are not always present. They come across occasionally. But LLMs are incredibly good at doing translation, and they really always have been, if you look at the evolution of machine translation. I consider LLMs to be a distinctly higher level above machine translation because of this thing called a context window.
So if you look at machine translation, Google Translate, DeepL, or neural machine translation, they have to break down the text that you input into fragments. They segment it into smaller pieces, and then they run them through the algorithm. And what happens when you do that is the engine is not made aware of the text that sits above or the text that sits below. As a result, LLMs have a distinct advantage because of this context window, where they can have all of this information in the mind of the AI before they even do any sort of translation. That immediately sets things like tone and audience and other attributes that a human translator is very good at identifying and that machine translation is not. They set those parameters very well up front, and as a result, you have a translation that's much more nuanced for the reader or the audience.
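The contrast between segment-by-segment input and a single full-context prompt can be illustrated with a small sketch. Everything here is an assumption made for illustration: the sentence-splitting rule is a simple stand-in for whatever segmentation a real machine translation engine uses, and the prompt wording is hypothetical.

```python
import re


def split_into_segments(text: str) -> list[str]:
    """Split text at sentence-ending punctuation, as a segment-based engine might."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def full_context_prompt(text: str, tone: str, audience: str) -> str:
    """Build one prompt that carries the whole document plus tone and audience."""
    return (
        "Translate the document below into English.\n"
        f"Tone: {tone}\nAudience: {audience}\n\n{text}"
    )


document = "Ella nació en Lima. Se casó allí en 1990."

# Segment-based view: each item would be translated in isolation.
segments = split_into_segments(document)

# Full-context view: the model sees every sentence, plus tone and audience, at once.
prompt = full_context_prompt(document, tone="formal", audience="immigration officer")
```

In the segmented view, the second sentence arrives without the first, so the word "allí" ("there") has lost the place it refers to. In the full-context prompt both sentences, along with the tone and audience, are available before any translation starts.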
James Pittman: Okay. And does ImmiTranslate actually incorporate any AI features at present?
Ian Hawes: So, actually, it's great that we're having this conversation. This is a really exciting time. At ImmiTranslate, we are publicly launching something called Catalyst. Catalyst is the first AI-powered tool specifically trained and built for certified translations. If you're familiar with translating a document with Google Translate or something, the output that you receive is just a string of text.
Ian Hawes: So with Catalyst, we've actually built an algorithm and an AI model that is designed to produce a translation that mirrors what a human would produce for a certified translation. There's nothing like it. There's no other application of LLMs producing a translation that looks like what Catalyst produces. And what it produces is akin to what our professional human translators produce. So it's a really exciting time for us. Like I said, we've been building this for over a year and a half now, and it's truly incredible. It's something that we've tested for the last 3 months in a very confined way, where we have guardrails with the existing human translators within our network. And we've been surprised at every turn, because it is surpassing even what we expected would be possible. There's a whole bunch of other stuff that goes into a certified translation that we'll probably talk about later. But for the most part, Catalyst produces professional-quality certified translations that really compete with human quality, effectively instantly.
James Pittman: Now, for example, in the immigration area, you've got the requirement by USCIS that you have to have a certificate of translation. So they're expecting a human translation. So I don't see a way that you'll be able to use this product specifically for immigration at present.
Ian Hawes: So, actually, it's a great question. With our process, what we've built in is that for every translation that is produced by Catalyst, if you elect to have us certify the translation, our human translators go in, review it, make any edits that they see as necessary for a certified translation, and then they are the ones signing off on the translation. There's another application that we've looked at. Let's say you have a paralegal doing translations, and you don't really want anyone else involved in it. You can use Catalyst simply as a translation tool, and then you can sign off on the translation the same way you would sign off on your own translation. We are leveraging the data that we've refined over the last, I don't know how many, years. And we're building in our expertise in certified translations: what we know about what makes a good certified translation, what we know about transliteration, all these things that we've learned. We've put this together in Catalyst, and we are not only using it internally to produce drafts for our own translation team, but we also want to expose that to nonprofits that maybe don't have the language resources that they need to manage their caseload. We wanna send it out to law firms that aren't comfortable sending full documents to a translation company like us. We wanna enhance their own language capabilities with Catalyst.
James Pittman: Well, I'm so glad we have a front row seat to hear about this, since it's such a great development. Is this something that you've developed totally in house, then?
Ian Hawes: That's correct. So Catalyst works as, basically, a series of algorithms on top of a bedrock LLM. So, for example, Anthropic's Claude or OpenAI's GPT-4o, or even Meta's Llama. It's a series of enhancements and sequences and other tools that go into preparing a document to be inserted into what we call the bedrock LLM. And then the output is taken back into Catalyst, which produces either a PDF for a certified translation or, if you want to be able to make edits to it, a Word document.
Ian Hawes: Catalyst is built to support both of those use cases. And, yes, this is something that we've developed over the course of a year and a half. I'd love to say that it's entirely in house, but we do use tools and other non-LLM AI models to enhance the input and the output of what the LLM produces. But, yeah, to my knowledge, there's no other translation company doing anything like it.
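The overall shape described here (prepare the document, hand it to an underlying model, then package the output as a PDF or Word draft) can be sketched as a small pipeline. This is a hedged illustration only: the function names, the prompt text, and the stand-in model are hypothetical, and nothing below reflects how Catalyst is actually implemented.

```python
from typing import Callable


def prepare_prompt(ocr_text: str, target_language: str = "English") -> str:
    """Pre-processing stage: wrap OCR text in translation instructions."""
    return f"Translate into {target_language}, preserving layout:\n\n{ocr_text}"


def run_pipeline(ocr_text: str, llm: Callable[[str], str], output_format: str = "pdf") -> dict:
    """Pre-process, call the underlying model, then package the draft."""
    if output_format not in ("pdf", "docx"):
        raise ValueError("output_format must be 'pdf' or 'docx'")
    draft = llm(prepare_prompt(ocr_text))
    # A draft stays uncertified until a human reviews and signs it.
    return {"format": output_format, "body": draft.strip(), "certified": False}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    return "  [draft translation]  "


result = run_pipeline("ACTA DE NACIMIENTO", fake_llm, output_format="docx")
```

Passing the model in as a plain callable is one way to keep the surrounding steps independent of any particular provider, which matches the idea of swapping the underlying LLM without rewriting the rest.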
James Pittman: And when are you debuting this on ImmiTranslate?
Ian Hawes: So this is gonna launch September 24th.
James Pittman: Amazing. Amazing. And is it something that's gonna be available to all subscribers, or will there be a separate tier? Or how will it work?
Ian Hawes: So, actually, at the moment, everyone that orders a translation through ImmiTranslate is able to take advantage of Catalyst. What that means is that you see, basically instantly, a draft of the translation. That doesn't mean that it's certified, and there's still an opportunity to request revisions the same way that we have with our existing certified translation service. But what this does is give you output immediately, so you can see not only the document that we are translating, but what it's going to look like. And you can see the changes that our human translators make as well. Later on, we wanna open up the self-service concept of Catalyst, where we would operate on a credit system. You buy a certain number of credits, you can upload as many documents as you want, and it produces the certified translation for you to download and then sign off on yourself. We see this being incredibly powerful for nonprofits that simply don't have language resources full time, that maybe don't have a budget for it. This works great as a draft. For more complex documents, we're at the point now where I still don't want any translations being sent out unreviewed, and obviously we can't, because of the legality behind a certified translation. We need somebody to sign off on it. I still see us being a little ways away from an environment where no one really needs to review it, where it's good enough to just sign off on. But it is significantly better than what's out there now. So for example, prior to Catalyst, our translators were taking scanned documents that are messy.
We share mutual clients, so you probably see the same sort of documents that we see, where it's a blurry scan, or the document is partially destroyed. It's been through who knows what. And you're not getting a high-quality, high-DPI scan or anything like that.
Ian Hawes: You're getting kind of a messy picture. So, yeah, in the days before Catalyst, our translators were first transcribing it. On one side of the screen, they have the source document. On the other side, they have a blank page to work out, okay, what does this look like? They're recreating fragments of the document, like tables, or recreating the positioning where images or emblems or signatures would be. And then they're doing the meat of the translation, which is the actual words within the document. In our line of work, that transcription part takes the most time, and it also introduces the most errors. One of the misconceptions that we see is, okay, you're a human translator, you've gotta be perfect, right? You don't make any errors. That's simply not the case. And that's not unique to our company. Any company that's doing translation of the documents that we see is inherently looking at a standard error rate. Typos happen, that sort of thing. With Catalyst and human translators, when you combine them, the errors go effectively to zero. AI effectively doesn't make typos. It's very knowledgeable and confident in what it produces, which may or may not be correct, but it's not making the sort of typos and omissions that a human translator might. When you add in the component of a professional human translator looking over the draft that Catalyst has produced, then you have a combination of two people, more or less. And it's a much more thorough and accurate translation than if you just had one person. So this concept of a draft translation is how we look at it. Catalyst produces certified translations that are still drafts. There's no signature attached to them.
You can't submit it to USCIS without having somebody sign off on it, but it will save you a significant amount of time in producing this initial translation.
James Pittman: Understood. Now, it's axiomatic that all industries use technology because they wanna be more efficient, reduce costs, be more accurate, and so forth. So when you're talking about having a human translator do a purely human translation versus using a tool like Catalyst, where the draft is produced by the AI, the human checks it and, if necessary, signs off on it, what are we talking about in terms of improvements in speed and improvements in accuracy? And is there a reduction in cost because the human being's not spending as much time translating every word?
Ian Hawes: Sure. So, actually, it's incredibly fast. I mean, the graph on time to translation, it's not even close. Catalyst, from start to finish, takes about 2 minutes to produce a translation. I'm not aware of any of our translators that are that quick.
Ian Hawes: On the human side, once they have access to the translation, the process of reviewing it and signing off on it takes anywhere from about 30 to 40 minutes, depending on the complexity of the document and how many pages there are. But it's a significant reduction in what we call our standard turnaround time. For most documents, less than 6 pages, we have about a 24 to 48 hour turnaround time. Catalyst produces those translations in about 3 minutes, and then give or take about an hour or two for a human translator to review it. You're talking about a significant time savings. That's more evident when you're looking at situations where you need something back right away. That's where Catalyst really shines, in the turnaround time. As far as the accuracy goes, we are continuing to improve the output that Catalyst produces. What goes into Catalyst is a series of things that we've kind of pioneered. So, a transformer OCR architecture; we use what we call OCR grafting to basically combine the document format with the OCR that's produced, alongside a prompt that limits how we want the translation to appear. And that's really cool too, because that's not something that we really see any other company doing. And so the accuracy is only getting better. When we first started Catalyst early this summer, the accuracy was around 78%. As the summer has gone on (and in AI time, summer start to end is like, what, a whole year), the bedrock LLMs have actually improved the accuracy without any change from us.
So we see the accuracy approaching 90% now. Where things are only gonna improve the accuracy is the introduction of pioneer models that are more fine-tuned for the sorts of input and output that we want. OpenAI is constantly producing new ways for us to input the information that we generate into their LLM to get an output that is satisfactory. So we see that improving the accuracy. OpenAI has a model that is geared towards the slower, more accurate production of content, versus the kind of instant content production that we see with ChatGPT. And we think that when we gain access to that model with Catalyst, it will allow us to produce an accuracy that we are even more comfortable with. So, yeah, in terms of accuracy, it's only going to get better. Where cost comes into play, it's been a challenge for us. The price that we start with has been the price that we've had since we started this business, and we've kind of moved heaven and earth to maintain that price. We are the Costco hot dog, so to speak, where the hot dog is still like a buck twenty-five, buck fifty, whatever. And so the certified translations we produce are still $25 a page. Catalyst allows us to maintain that price point for the foreseeable future. We would not be able to continue operating as a business without these technological improvements. And in terms of price, we're much more competitive than some of the other companies out there that focus on these sorts of translations. And you're not gonna have everything in this matrix of accuracy, quality, turnaround, and price.
It doesn't exist, but we try to skew closer to the price and turnaround-time side of the matrix than, say, the price and accuracy side. What that means is that we've given our clients the tools: there's always an opportunity to review the translation and request revisions. There's an opportunity to have somebody else take a look at it when you think that we may have made mistakes, which does happen. Those are tools that have been around for years with us, and that part isn't changing. There is gonna be a segment of people that are not comfortable with any of their documents being close to or even within the realm of AI technology, and we understand that. As a result, those folks may pay a higher price, but we're not talking significantly more. It's a modest increase. But the goal is that we maintain this affordable price and continue to serve our clients on price, versus having to raise rates and fight that battle.
James Pittman: Understood. Well, let's talk a little bit about the limitations, as it were, of the current state of AI with translations. You mentioned the involvement of the human translator to review the draft and do any revisions. We talked about the need for context. So do they then have to put in cultural context and rework language to be culturally appropriate?
James Pittman: And what might some of the other current limitations be? I'm not saying some of these things won't be overcome. Surely they will. But what are the current limitations, and how do you see some of these things evolving down the line?
Ian Hawes: Sure. So, yeah, there are limitations, just like with any new technology. And we've done our best to curtail those limitations and bring them to the top so that you can understand what may or may not be occurring. In terms of the specifics of a translation, yeah, there are always going to be issues where, culturally, what was said is not accurate. That's not specific to our AI translations; that's AI translations as a whole. I would say that LLMs are significantly less susceptible to those sorts of mistakes than the traditional machine translation algorithms. For example, DeepL is an NMT, neural machine translation, so it is susceptible to issues like that, whereas LLMs typically don't have those sorts of idiosyncrasies. Now, I will say that it's also dependent not only on the model, but also on the language pair. We run tests all the time between GPT-4, Claude 3.5, and Meta's Llama. And there are languages that the different LLMs are much better and much more accurate at translating into. So the beauty of Catalyst is that we can mix and match those models when we find that a certain LLM is better at one language pair than another. And there's a handful of languages that we are aware don't do well with AI translation. In fact, full disclosure: officially, we support 70-plus languages, but Catalyst is only launching with about 7 languages. So there is still a ways to go and more testing that needs to be done.
But those issues do arise, and that's why it's important that we still have this human-in-the-loop concept, where a human can look at something that would be sensitive, or a similarly confusing term, and make edits to it. The other thing too is that at ImmiTranslate, we have focused so much on these certified translations that it's not common for us to come across documents that have these sort of complex cultural nuances. And the majority of our documents are being translated into English. For that reason, overcoming those has not been as much of a concern with Catalyst as it would be if we were doing something like localization into multiple different languages. But it is still on our radar, and having the human in the loop is really the best guardrail that we could create right now.
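The idea of mixing and matching models by language pair, with a human reviewer always in the loop, can be sketched as a simple lookup. This is an illustration under stated assumptions: the table entries and the labels `model_a` and `model_b` are placeholders, not a statement about which real model performs best for any language, and the fallback to a human-only draft is an assumed policy.

```python
from typing import Optional

# Hypothetical routing table: (source, target) language codes -> model label.
MODEL_BY_PAIR = {
    ("es", "en"): "model_a",
    ("fr", "en"): "model_b",
    ("pt", "en"): "model_a",
}


def pick_model(source: str, target: str) -> Optional[str]:
    """Return the preferred model for a language pair, or None if unsupported."""
    return MODEL_BY_PAIR.get((source.lower(), target.lower()))


def plan_job(source: str, target: str) -> dict:
    """Decide who drafts the translation; a human always reviews it."""
    model = pick_model(source, target)
    return {
        # Pairs without a tested model fall back to a human-drafted translation.
        "draft_by": model if model else "human",
        "human_review": True,
    }
```

For example, `plan_job("es", "en")` routes the draft to `model_a`, while a pair missing from the table gets a human draft. In both cases `human_review` stays true, which mirrors the human-in-the-loop guardrail described above.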
James Pittman: Okay. Now, just out of curiosity, what are some of the languages that you find have more difficulty when using AI translation?
Ian Hawes: So, typically, Arabic, the Middle Eastern languages, have probably the lower accuracy, I would say. Spanish, French, Portuguese, those languages are going to be closer to the higher quality level. Less spoken languages, such as some of the languages of India that simply don't have as much training material in LLMs, suffer from the greatest risk of inaccuracy. And what's really cool is that because LLMs are trained differently than machine translation, LLMs are actually, like I said, less susceptible to cultural idiosyncrasies, because the majority of the LLM training data is Internet data. We speak on the Internet differently than in the material machine translation is trained on, which is more academic documents, or documents that are considered legalese, where you would have less slang or other cultural nuances. LLMs are, again, much better positioned to tackle those sorts of issues. Beyond that, in terms of which languages have the most issues, you're getting into, okay, what languages are less common on the Internet? That's been sort of a quick filter for what's better versus what's worse. The African languages that have less content on the Internet suffer. I'd love to put together a graph of the different languages and highlight what we've found, because it is a really interesting question: what does AI know and what does it not know? We're constantly wondering that.
And when it comes to language, it can pretend to know the language to a point where it's making things up, which is kind of, you know, another risk that comes up.
James Pittman: Right. Right. Interesting. And it's kind of the answer that I expected. I also figured that languages that have a great deal of regional or dialectal variation, such as Arabic, would present certain problems, because you have a lot of regional usages, which can really be quite different. I mean, Arabic in Morocco is really almost a totally different language than that which is spoken in the Levant, and so forth. So that's kind of what I figured. But I'd love to see that graph if you ever put it together. That would be of interest.
Ian Hawes: Yes. It would. It would.
James Pittman: Okay. Now, let's look at some of the ethical considerations, because those are things people think about. So when you're using AI for translation and you're handling, you know, sensitive or personal documents, let's say you're gonna do certified translation for immigration, what are the ethical considerations, the data security considerations, and how are you tackling them?
Ian Hawes: Sure. So there's a huge ethical dilemma, or not dilemma, but consideration, when it comes to LLMs and any sort of customer documentation. And so that's something that we obviously take seriously. None of the documents that we receive are used to train any sort of memory model. So what that means is that when you're sending a document to us, it's not going into some repository where it's being trained on. And we have a zero data retention policy when it comes to the AI LLMs that we use. So what that means is that they're not training on the documents either. And that's something that is really important, because these are documents that are confidential and proprietary. And not only do we not have a right to make that assignment of how it's gonna be trained, but we don't want anyone else down the line to be able to do that either. So that's really important. So before we use a tool in this sort of chain of Catalyst workflows, we have to make sure that it's not gonna be used by somebody else to train on it. So that's the first thing. The second thing is that we do see value in having a repository of documents to eventually train on, and that's why we've moved towards an opt-in-only model. So you have to specifically opt in to having your document entered into that. You know, it's a simple question: Hey, would you like to help us train our AI model? And it has a link to our privacy policy and our AI privacy policy as well. And that's a focus for us, to eventually have this data where we feel that it is ethical for us to train on it because people have specifically opted into it.
One of the more common things that I hear when I talk to, not less ethical, but, you know, people that are in the AI industry is, oh, you've been around for so long, you must have all this data. Well, yeah, like, we have all this data, but years ago we created a data retention policy where we basically said we're gonna delete all this data because it's a huge liability. And that's something that we've maintained. So as a result, yeah, we don't really have all this data.
Ian Hawes: The other thing is that, you know, we have a repository of translations that we've produced that don't have any personal information in them. So think of the birth certificates, marriage certificates, vital documents where we've scrubbed all the PII. There's not even a catalog number. It's basically the format of a document. And that's something that we use in some of our templated translations, but we are looking at how we can improve that to basically use AI to instantly scrub PII from documents, which again is an important step in being ethical with the data that we collect. So yeah. But in terms of data security beyond just the use of data for training, that hasn't changed. That's something that we've taken seriously since day one. The documents that we produce are all encrypted at rest. Everything is transmitted securely. We have a very rigorous security process for not only the translators, but the administrators that work with the documents that we come across and that we translate. And so that really hasn't changed at all. That's always been very rigorous, and we've always been thorough, since day one, effectively.
James Pittman: Okay. Great answer. And I'm sure all that security documentation is available on your website.
Ian Hawes: So we have a whole security page. Sure. And, actually, for data training, we take that very seriously, because we look at the types of customers that are sending us documents. So for example, if you're using the Docketwise integration, the Docketwise ImmiTranslate integration, we don't even ask you if you want to opt in to this, because we take the stance that if you are submitting the data on behalf of somebody else, you don't have a right to even opt in to this process. Right? We only look at documents where the system thinks the submitters are end users of the document. So if the document is about you, then we know, okay, presumably you have the right to opt into this process to let us use your document for training. But in a context where we know that you're, like, an attorney or a non-individual, you know, a business, something like that, then we take the assumption that you don't have the authorization. We're not even gonna bother showing it to you. You're not gonna be able to opt in.
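The opt-in rule Ian describes here can be sketched in a few lines of Python. This is only an illustration of the stated policy, not ImmiTranslate's actual code: the `Submission` type, the `SubmitterKind` values, and the function name are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class SubmitterKind(Enum):
    """Hypothetical account categories, named only for this sketch."""
    INDIVIDUAL = "individual"  # the person the document is about
    ATTORNEY = "attorney"      # submitting on behalf of a client
    BUSINESS = "business"      # any other non-individual account


@dataclass
class Submission:
    submitter_kind: SubmitterKind
    via_partner_integration: bool  # e.g. sent through a case-management integration


def should_offer_training_opt_in(submission: Submission) -> bool:
    """Show the training opt-in only when the submitter plausibly owns the document."""
    # Documents arriving through an integration are assumed to be submitted
    # on someone else's behalf, so the question is never shown.
    if submission.via_partner_integration:
        return False
    # Attorneys and businesses are assumed to lack authorization to consent.
    return submission.submitter_kind is SubmitterKind.INDIVIDUAL
```

Under this sketch the safe default is not to ask: anything that is not a direct submission by an individual falls through to `False`.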
James Pittman: Understood. Understood. Well, I wanna talk about the elephant in the room, which is, you know, about AI. It's the elephant in the room every time we talk about AI. Do you foresee a time when AI could eventually completely replace human translators, or do you think there's really always gonna be a need for a hybrid?
Ian Hawes: Well, no. There will always be the need for a human in the loop, for a hybrid option, because ultimately it comes down to, like most things, cost. Right? So the human translator is always going to cost more. That divide is only expanding. And in the immediate future, I still see translators existing kind of how they have before. I think that translators are very good at completing, but I also think that they're very good at adopting new technology. So I do see an opportunity for translators to be the entry point to using LLMs for translation. And I also think that companies that have a much larger budget for localization will continue to do what they have done, and they will look to expand their use of LLMs. But I think that they will find that the quality of a human in the loop is always going to be superior to just an AI translation. And I think for the next 10 years, they're still going to want that kind of security blanket of a human in the loop. Beyond that, who knows? You know, 10 years ago, Google Translate was kind of the more cutting-edge translation tool, and now it's not. And so things can change. Where I see the biggest job loss being is in the sort of junior translators, people that are doing this part time, you know, maybe they just came out of college. I certainly don't recommend anyone take a study in, like, translation studies. You know, it may be a little bit tougher, but there's a very large community of people that are studying translations in LLMs and helping to fine-tune LLMs that are specific to language translation. So we've talked about the larger LLMs. Right? The XL LLMs, which are, like, GPT-4 and Claude.
There's actually a fairly large number of LLMs that are fine-tuned just for specific language pairs. And so I think that if you wanna specialize in translation studies, for example, there's an opportunity for you to kind of augment some of these LLMs that are being created. And I also think there's an opportunity to be, you know, basically a localization engineer, where you have a specialty in computer science or some other engineering degree, and you can kind of cross that with being bilingual and focusing on another language as your translation language. I do see that. And, you know, there's this whole subset of companies like us, where we create technology and we sell it to companies that don't have their own in-house localization teams. Hey, look, my job's not going anywhere anytime soon.
Ian Hawes: And we are, as a company, growing. And so there's lots of opportunities, but, like most things, the opportunity that you thought you would have may not be what you end up with, because the industry is evolving and things are growing fast.
James Pittman: Now, do you think that AI-assisted translation is gonna help to expand access to language services for underserved populations or regions, and how might that be? I mean, sometimes nonprofit organizations may not have the budget for as much translation work as they need. Do you think this can help?
Ian Hawes: Yeah. Absolutely. So, I mean, that's been our biggest use case. We've been to countless AILA conferences, and we've talked to nonprofits that are like, look, we can't afford you. And so we've wanted, for a long time, to create an environment where we can offer them something. Right? You know, maybe you're not having the human aspect of the translation, but use our tools. Use the technology that we've created so that you have an opportunity to do better for your clients. Because ultimately there is this factor of a cost associated with it, but we'd like to offer you something so that you have this ability versus having nothing. And I've talked to other nonprofits where their language services are entirely volunteer. And being able to give them a draft translation and say, hey, here's a draft translation, we need this by Sunday, close of business, is a much better argument than, hey, here's a blank document, can you translate this for us? Pretty please. By the way, we need it Sunday. So the concept of a draft, I think, is very compelling to nonprofits. And in language access beyond that, AI translations will grow it. Not only that, but we saw the demo of GPT-4o, and it's an audio demo. Right? So you're talking to GPT-4o through an audio interface. I see that being massive: you would have an interpreter in your pocket. And that's gonna be a big thing. And the other thing too is that LLMs are improving, and we are seeing more LLMs that exist on device. So GPT-4, right? That's something that's entirely in the cloud, basically. So there's latency, right? There's connectivity issues.
If you're in the middle of nowhere and you need to talk to someone and neither of you speaks the language, and there's no signal, this concept of having an on-device LLM, or on-device AI translation, whatever it may be, is incredibly compelling because you don't need a connection. Right? You just need the device. So, yeah, that's inherently going to improve access to resources for those that are not able to communicate, you know, in any language. I think that would be a huge improvement. So I'm excited to see what the improvements are on device as well as with audio interpreting.
James Pittman: Yeah. Well, the audio interpreting, you know, I was gonna ask about that. It's a little bit off the main topic, but that, I think, has tremendous impact. I mean, just think about that for a second. If you had the ability for any government agency, anything dealing with the public, to have immediate audio interpretation, like conference interpretation, when dealing with the public in any of our agencies, that would be enormous. I mean, do you know of any major players who are working specifically on audio interpretation?
Ian Hawes: I don't offhand. As a company, we've kind of solely focused on these sort of document translations. But I will say in the wider language industry, there is definitely a desire to get towards AI for those use cases. AI and machine translation have affected the translation aspect tremendously. Right? And just for reference, if there are viewers that are not familiar, the way I describe it is: translation is typey typey, interpreting is talky talky. Right? That's the basic, standard convention. Yeah. A very, very basic understanding. But interpreting has been less subject to these sort of advances in technology, because it's not been as good as the advances in, like, machine translation and neural machine translation. So seeing these demos from GPT-4o, and then, you know, eventually this will matriculate to these open source models, it's incredibly exciting, because there is a huge industry of interpreters that are basically overworked. And so we'd like to think that there's an opportunity to use technology to improve access to interpreting that doesn't immediately eliminate all of the interpreters' jobs. But at the same time, there is a serious lack of qualified interpreters in almost any instance. We see horror stories all the time in the language industry where a police officer has used Google Translate to try and say something to a subject that they've unlawfully detained, and it's wound up in court. There's the horror stories of court litigation over how you interpret certain things that are said in court, where what the interpreter said does not reflect what the client said in testimony in court. And these things play out often.
And so not only is there a need for better AI interpretation, but there's still this need for interpreters that are able to handle some of these more complex and nuanced cases. So that's another situation of, like, the technology for AI interpreting will improve so that the interpreters can then focus on things that are actually hard and actually require a bit more work. And the folks that are just trying to communicate and find the nearest library or restaurant or something can use an app. It's like...
James Pittman: But how would you, I mean, when you debut Catalyst, getting back to Catalyst, when you debut this later in the month, let's say for our listeners who are subscribers to Docketwise and also use ImmiTranslate, what will their experience be? What will they see? And will they have to make any opt-ins?
Ian Hawes: Sure. So, actually, in essence, Catalyst already kind of exists for users of Docketwise that are using the ImmiTranslate integration. So when you select the files that you want to translate, Catalyst is the engine behind producing what we call these AI labels. So when you send us a document, we now immediately produce an extraction, a quick analysis of the document, to say, hey, this is the document, this is what language it's in, how many pages, whatever. But also, this is what category of document it is, these are the subjects of the document. So that's something that you kinda get for free. In terms of the translation, it's not necessarily something you have to opt into. And it's not something that is really going to change the quality or the end translation. It's just that we are going to be using this as sort of a bedrock for any future translation work that we do. The other aspects of Catalyst, where you can upload a document yourself and do your own translation, your own edits, your own draft, that's something that is a little bit more structured, so not everyone is going to have access upfront. But we are looking for people that are interested in this opportunity. You know, nonprofits are a big one, but really anyone that wants to help us test it out and give their feedback. Because ultimately we have translator feedback, and that's very valuable, but we also need feedback from our clients as well. But yeah. So when this launches, it's also more of an unveiling of what we do at ImmiTranslate that makes us different, and not necessarily, you know, how does it change things or what do you need to do extra. Right? This is more of a look at what is different about us versus some of our competitors.
James Pittman: Will the self translate interface be available at the same time?
Ian Hawes: Yeah. So we have sort of a beta list. You'll be able to sign up on the 24th, and then we're gonna kinda trickle in users and get feedback as things go on. But we hope to have a generally available Catalyst by the end of October.
James Pittman: Okay. Thanks for that. That's great information. Now, do you anticipate any concerns from users, or have you heard concerns in the past? I mean, have you ever received inquiries asking how much of the translation work that you do is done with machine translation and AI? Do you get those questions? And if so, how do you address them?
Ian Hawes: Well, so there's always sort of accusations that, hey, you just use Google Translate. And the reality is that we wish we could use Google Translate. It simply doesn't work for the majority of the documents that we have, and that's why Catalyst was invented. So...
James Pittman: Right.
Ian Hawes: There's concerns with anything that we do, especially when it comes to kind of a touchy subject like AI. And that's why we've taken this approach of, like, let's be transparent about how we're using AI. Let's talk about what Catalyst is. Let's talk about what goes into it. Let's talk about what it can and cannot do. And that way, anyone that still has a concern is obviously welcome to address it with us. And ultimately there will probably be kind of an opt-out, some level of, like, I don't want any AI, even the AI labels. That may come up. My hope is that people trust us enough that we don't have to get to that point, because ultimately we are not doing anything. You know, we've ensured that the documents that you provide to us and securely send us are not being trained on, through the methods that I talked about and through our AI privacy policy. But if you still don't trust us, we'd like to still be able to work with you, but you have to understand that there's an additional cost associated with it. And we've also dealt with this before: for the longest time, we've had this option to have not only a translator do the translation, but also a proofreader that comes in afterwards. And we have the stats behind it to say that that is a more accurate translation, that it produces fewer errors and fewer issues arise from those translations. But the number one complaint from our customers was that it was too expensive. And so this is kind of a way to say, okay, we're improving the accuracy, we're not going to raise the price, and it's going to be something that you feel comfortable with. That's a much better deal, I think, than anything else.
So, you know, we've been doing this long enough to know that there's always issues that arise with certain clients. But we feel that being as transparent as possible with our AI use is the right way to go.
James Pittman: Okay. And looking ahead, what aspect of all of this excites you the most about the potential for AI in the translation industry?
Ian Hawes: Well, every time new AI models come out, there's kind of a new exciting thing. It's like Christmas every week for me. And so what excites me is I do see a lot of value in the multimodal aspect of some of these LLMs. We use multimodal LLMs, so visual and text-based prompts. But I'd like to see how they improve that. We have certain edge cases that we have found in testing where the LLM has, like, blind spots. That's the best way to put it. And so I'm hoping that some of these blind spots are alleviated. Obviously, there's a desire for more accuracy in some of the translations where, hey, maybe this translation is accurate enough, but you could have expanded on it. More so than hallucinations, in our industry the concept of things being omitted is the biggest danger that we see. And so we've created sort of secondary prompts that look for issues where things were omitted. But we'd like to not have to do that in the future, because that adds time and that adds cost to the AI translation. So those are very technical things, and that's a result of spending most of my time the last, like, six months or whatever just focused solely on LLMs and translation. But there's a lot to be excited about. Every time there's a new model, it's a whole new toolbox for us. And that's really exciting. And we like to be six months, one year ahead of the other translation companies that are chasing this sort of technology. Because ultimately we feel that that produces a much better service than simply being a human-based translation tool.
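To give a feel for what an omission check can look like, here is a minimal Python sketch. It deliberately swaps in a much simpler technique than the secondary LLM prompts Ian mentions: it only compares the digit sequences (dates, certificate numbers) found in the source against those in the draft translation. The function name and the sample strings are made up for illustration, and a check like this would at best complement, not replace, a prompt-based or human review.

```python
import re
from collections import Counter


def missing_numbers(source: str, translation: str) -> list[str]:
    """Return digit sequences that appear in the source more often than in the translation."""
    source_counts = Counter(re.findall(r"\d+", source))
    translation_counts = Counter(re.findall(r"\d+", translation))
    # Counter subtraction keeps only the positive differences, i.e. numbers
    # that were dropped (or appear fewer times) in the translation.
    dropped = source_counts - translation_counts
    return sorted(dropped.elements())


# Hypothetical example: the record number never made it into the draft.
source_text = "Nacido el 12 de marzo de 1985, acta 4471"
draft_text = "Born on March 12, 1985"
print(missing_numbers(source_text, draft_text))  # ['4471']
```

Numbers are a convenient signal because they usually survive translation unchanged, so a missing one is a cheap hint that a field or sentence was skipped.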
James Pittman: Yeah. Absolutely. I mean, it's so great talking to you. Listening to you talk, I realize that you're someone who's really right in the thick of this, right on the cutting edge of it. You're so immersed in all of these details. So, Ian, it's been fantastic hearing about how AI is affecting ImmiTranslate and the translation industry as a whole, and so exciting hearing about Catalyst coming out on September 24th. So, really, I hope that you'll join us again in the future as the AI revolution continues, and we'll look back at some point down the road and see what's changed. But, yeah. So thanks very much for joining us for episode 42 of Immigration Uncovered. It's been a great conversation.
Ian Hawes: Thank you, James.