Yeop Lee, Head of Product at Coxwave, shows product teams how to move beyond mere accuracy metrics to a holistic, outcome-focused evaluation using Coxwave's powerful tool, Align. Discover how to measure what truly matters for business success.

 
 
 

296 Audio.mp3: Audio automatically transcribed by Sonix


YEOP: Coxwave. Our product Align is an analytics solution for LLM-powered conversational AI products. So the way we think about it is we're the Google Analytics for products like ChatGPT or Claude. We've always visited the US to understand, okay, what are the developments in Silicon Valley, how are builders approaching building these new types of products, and so forth. But I think this year, and especially next year, is when we'll more actively push into the North American markets.

CRAIG: Build the future of multi-agent software with AGNTCY. That's A-G-N-T-C-Y. Now an open source Linux Foundation project, AGNTCY is building the Internet of Agents, a collaborative layer where AI agents can discover, connect, and work across any framework. All the pieces engineers need to deploy multi-agent systems now belong to everyone who builds on AGNTCY, including robust identity and access management that ensures every agent is authenticated and trusted before interacting. AGNTCY also provides open, standardized tools for agent discovery, seamless protocols for agent-to-agent communication, and modular components for scalable workflows. Collaborate with developers from Cisco, Dell Technologies, Google Cloud, Oracle, Red Hat, and more than 75 other supporting companies to build next-generation AI infrastructure together. AGNTCY is dropping code, specs, and services, no strings attached. Visit agntcy.org to contribute. That's A-G-N-T-C-Y dot O-R-G.

YEOP: Hey! I'm Yeop. I'm the head of product at Coxwave. I joined Coxwave a little over two years back, but I've known the team since they were founded in 2021. The way I got to know Coxwave is, I met the founder of Coxwave, his name is Jejung, at a government competition. This was when Coxwave was just starting off, and at that time I had just left Reed and started my own company. So we were both just getting started, and that was during the peak of Covid as well. This government program was a three-day program, but none of the participants were allowed to interact with each other. Everyone was locked in separate rooms, and we had to watch each other's presentations over a screen. But Jejung had a wonderful presentation outlining the vision of Coxwave, and I had given my presentation, and during the award ceremony we connected, just swapping business cards. In that competition, Jejung's Coxwave got first place; my company at that time got second place. So that's how the relationship started. Then we went our separate ways, each building our companies: I built up my own company, and he was continuously building up Coxwave. Then I had the opportunity to sell my company in 2023, and that was also when Jejung got an offer from a public Korean company to sell two of the products he was building. He had built AI products since 2021, and this Korean public company, because they saw the rise of ChatGPT and the potential of AI, wanted to acquire these two products to leverage internally. That's when I joined Coxwave, because Jejung and Coxwave were transitioning from B2C to the B2B space. So that's the journey, yes.

CRAIG: Yeah. And the products that he sold, Coxwave no longer owns them? Or do you mean they were customers of Coxwave?

YEOP: So we no longer own those two products. I can give a brief introduction of them as well. The first was a product called Hama, which is an image editing tool, for things like background removal. It was used by a lot of companies in Korea that were sourcing foreign products, for example, and wanted to edit out different labels and redo them in Korean. That was a big use case. The second was a product called EnterPix, which was a search engine for generative-AI-created images. The company that acquired these products is Korea's largest font company, sort of the default font company in Korea, and they wanted to leverage this technology for their own B2C use cases.

CRAIG: And so the premise of Coxwave today is a platform that LLM companies can use to evaluate their models? Or explain exactly what Coxwave is doing.

YEOP: So Coxwave, our product Align, is an analytics solution for LLM-powered conversational AI products. The way we think about it is we're the Google Analytics for products like ChatGPT or Claude. What we focus on specifically is the conversations between a user and an LLM. We focus on the interaction that's happening, to help businesses and product managers understand: is the chatbot working properly? Are our users having a good or bad experience? If not, why are they having a bad experience, and how can we fix it? I'm sure you've seen a lot of these evaluation platforms that have been coming up, and I can explain a little about how we differentiate from them. The evaluation platforms that exist today focus on an area called objective evaluation. A product manager intends the LLM to say a certain thing, and you evaluate whether the LLM actually says that in a bunch of different scenarios, because you don't want the LLM to go off the rails. Where we focus is: let's say the LLM is already saying what the product manager intends it to say. That response may be a good response for a certain user, but it may also be a really bad response for someone else.

YEOP: Right. I'll give you a quick example. Say there are two users, and they ask the same question: what is the best food in Korea? And the LLM responds with a certain pork dish. One user says, oh, thank you so much for letting me know. But the other says, okay, but what restaurant sells this? You need to tell me the restaurant, right? In that case, for the second user, the response wasn't sufficient; it wasn't the most ideal response. So those are the areas we pinpoint and analyze, to help businesses understand: is this LLM response really a good response for this user? What is an optimal response? And so forth. That's the area we call subjective evaluation and feedback analytics. Yeah.

CRAIG: Yeah. It's an interesting area, because initially there was reinforcement learning with human feedback to try and guide LLMs to answer appropriately and not hallucinate, and that's been successful to a degree. And now there are systems, multiple models that talk to each other to provide an answer, and the reasoning models and all of that. And for a time, evaluation relied on benchmarks, and there were all these benchmarks that got increasingly complex. But users figured out, or the market figured out, pretty quickly that you can train to a benchmark. So, you know, Grok 3 comes out and they show their chart, and they do so much better on this or that benchmark than any of the others. It doesn't mean that the user experience is better. So this is, as you said, did you call it subjective evaluation?

YEOP: Correct. Yeah.

CRAIG: Yeah. So, from the point of view of a user, how do you do it? I would guess this is done by evaluation LLMs that talk to the primary LLM, and then there's some scoring or something. Is that right? Or how much human intervention is there in the judging?

YEOP: Yeah. So we also leverage LLM-as-a-judge, but we do have human interactions built into our evaluation system. For example, when a customer is onboarding, our system will do some initial judging and show those examples: is this an example of user dissatisfaction or not? The reason we do this is that even if we just use a standard model, every conversational product is very different. What we realized is that this area of subjective evaluation is ultimately about personalization to the user, but that also means our platform and our product have to be personalized to our customers as well. So that type of interaction is built in. One of the areas we've focused on a lot is our search engine, which is a search engine for conversational data. We have a natural language interface where you can type, for example, "I want to find messages where the user is expressing distrust against the chatbot," and our search engine will go across the conversational data and identify that. And this works even for conversations where the user isn't explicitly saying "I don't trust you"; any implicit feedback you can extract from the conversation, it will identify as well. So that's the system we've built for our analytics.

CRAIG: Yeah. And when you say there's a conversational interface, is that for the customer to query your platform about the LLM it's evaluating? So that's another layer. Is that right?

YEOP: That's correct. That's correct.

CRAIG: Yeah.

CRAIG: And you would have to evaluate that LLM as well.

YEOP: Correct. Yeah.

CRAIG: So what you do is kind of a house of mirrors.

YEOP: That's true. Yes. So we work very closely with our customers to help them understand the performance of our search engine as well as our copilot, so that it's ultimately up to par with their expectations. And honestly, right out of the box, the accuracy is, let's say, 80%, but the system gets better and better as our customers leave more feedback, because of how our search engine works. Take the same example of a user expressing distrust: there may be new cases of user distrust, and our system needs to learn to understand, okay, for this company, for this use case, what does distrust look like? So we always tell our customers: if you use our product more, if you give more feedback, it'll get better and better.

CRAIG: Yeah. And how does someone use this? Is this a SaaS product that they access through the web, or is it an API that they build into their system? Describe how that works.

YEOP: Yeah. So we have a SaaS product. As you mentioned, they can just go on the web, log in, and leverage our platform. We have an SDK that helps our customers ingest their conversational data into Align so that it's available for analytics. We also have an on-premise, self-hosted solution. That's for our enterprise customers who value privacy and security and have more rigid data governance requirements, so we provide that option as well. And we are looking to expand into an API option too; it's on our product roadmap.

CRAIG: Yeah. And does this run live as well? If you run a customer service AI system, will it monitor the exchanges live and then maybe signal to somebody that the customer service bot is not answering properly or is creating problems? And then the more important question: after the customer knows this, what do they do? How do you correct it?

YEOP: Yeah. So for our SaaS product, it runs in near real time. For our enterprise ones, we usually run it in batch, especially because there are some hardware limitations on our enterprise customers' side, as it's on-premise. For the fixing part: our job is, one, to identify issues, help them monitor and analyze the impact, and help them understand and prioritize the issues they should really be focusing on. In terms of fixing, our customers have many different ways of building these systems. Some are fine-tuning their models, some are just using prompt engineering, and so forth. So we can provide guidance; we have a separate consulting service that provides this guidance. Other times our customers just go and fix it themselves, and when that fix has been made, they can add that to the metadata they're logging into Align, so they'll understand: after this change was made, did the problems disappear, or are they still happening?

CRAIG: Yeah. How many conversational interactions do you typically screen with a customer? And then it's an ongoing thing, presumably; from your point of view, that's what you want it to be. Or do companies get to a point where they feel pretty confident in their model and stop evaluating?

YEOP: So it's such a huge range, from thousands of conversations to millions of conversations per day, especially because companies that are running contact centers with AI, for example, have a lot of volume. There are other companies running character chats that have a lot of daily usage as well. Up till now, the customers we've engaged haven't left us. And the reason is that this sort of evaluation and analytics is never over, especially because new LLMs come out, and things that were working before stop working. And as more and more users come onto the platform and there are new kinds of user engagement, this new user behavior leads to new problems that need to be analyzed and fixed. And again, these new insights can propel new feature ideas, new interaction ideas, and so forth. These require constant monitoring, analytics, and evaluation.

CRAIG: Yeah. Are you working primarily with application companies, companies that are building applications on top of either an open source model that they've fine-tuned or that they've hooked up to a vector database with a RAG system? Or do you also work with the foundation model companies themselves, companies that are coming out with their own models, listing them on Hugging Face and hoping to get a piece of that pie that's dominated, obviously, by OpenAI and Anthropic and a handful of others?

YEOP: Yeah. So as you mentioned, we mainly work with the application-layer companies, and these are also enterprises that are building these applications for their internal use cases. But we've begun to expand our work with foundation model companies as well. You mentioned Anthropic; we're one of Anthropic's closest partners in Korea. A couple of months back we actually hosted Asia's first Builder Summit, so we had Anthropic's chief product officer, Mike Krieger, in Korea as well, and we hosted this huge event. That sort of work is ongoing, as Anthropic is also looking to expand its presence in Asia and Korea. There's also a company called TWO AI. They are a foundation model company with a presence in Korea, India, and the US, and they have a consumer product called ChatSutra. They also offer their foundation model via API to enterprise customers. So we work closely with them too.

CRAIG: Yeah, yeah. I've had TWO AI on the podcast. And in most of the applications you're working with, are the models conversing in Korean? Or is it predominantly English, even though your market presumably is East Asia?

YEOP: So most of the language is actually in English. And very interestingly, when we first launched Align, we made the decision initially to not support Korean. So our product was just in English, and for our main target market, we actually started off with India. Most of our customers, enterprise customers as well, are based out of India, and our support for the Korean market, including localizing our product and supporting the Korean language, started early this year. Before that we had Korean customers, but they were kind of forced to use the English version of our product. Now we're seeing more and more Korean customers come on board. The reason for this was actually quite clear: when we first launched our product, there just weren't as many Korean products in the market, and as a startup we had to pick and choose and really focus. That's what led us to the Indian market.

CRAIG: Build the future of multi-agent software with AGNTCY. That's A-G-N-T-C-Y. Now an open source Linux Foundation project, AGNTCY is building the Internet of Agents, a collaborative layer where AI agents can discover, connect, and work across any framework. All the pieces engineers need to deploy multi-agent systems now belong to everyone who builds on AGNTCY, including robust identity and access management that ensures every agent is authenticated and trusted before interacting. AGNTCY also provides open, standardized tools for agent discovery, seamless protocols for agent-to-agent communication, and modular components for scalable workflows. Collaborate with developers from Cisco, Dell Technologies, Google Cloud, Oracle, Red Hat, and more than 75 other supporting companies to build next-generation AI infrastructure together. AGNTCY is dropping code, specs, and services, no strings attached. Visit agntcy.org to contribute. That's A-G-N-T-C-Y dot O-R-G. And you mentioned Align. Align AI is the flagship product from Coxwave, is that right?

YEOP: Correct.

CRAIG: And what LLM do you guys use on the conversational side, and then on the evaluation side, in looking at the conversational data that comes in from a customer?

YEOP: Yeah. So we use a bunch of different models, but our main one is actually Anthropic's Claude models, which also explains why we have a good partnership with Anthropic. We also have our own custom embedding models that we've trained to be specifically good at retrieving conversational data; that's our own internal technology. And there are a bunch of smaller analytics tasks, and depending on the task, we've found which model works best and put that LLM inside. Some tasks do run on OpenAI models as well. Yes.

CRAIG: Yeah. And you know the foundation field is changing quickly. I mean, it seems to have slowed down a little bit, but do you keep rotating in the latest foundation model so that you're keeping up with the market and not getting stale? I've always wondered how companies handle that.

YEOP: Yeah. So that's always a big challenge as well. What we do is we don't always switch to the newest model, but we do very rigorous internal testing to understand, okay, with these new models, how do they perform against the models we're currently leveraging on these specific analytics tasks? Because what we found is that even these newer models sometimes aren't the best at the analytics tasks we need completed. Also, for our enterprise customers, switching to the newest model every single time introduces new risk. So we always go through that internal evaluation process, and sometimes we make the switch, and sometimes we stick with the model we're using.

CRAIG: Yeah. You've been around, what did you say, since 2021?

YEOP: Yeah. So with the company, I started, I think, early 2023. And the company has been around since 2021.

CRAIG: That's what I meant. Yeah. And initially, the two products that Coxwave sold, were those Hama and EnterPix?

YEOP: Yes. Correct. Hama.

CRAIG: And EnterPix. Is that what you said? Yeah. And you've recently raised money and are putting all of your focus on Align AI. Is that right?

YEOP: That's correct.

CRAIG: Yeah. And is this a crowded market? I mean, everyone's talking about evaluation, but I haven't spoken to many companies in the evaluation space. Is it a growing market? How do you guys see the market globally?

YEOP: So it definitely is a growing market, especially as enterprises and more and more companies start leveraging generative AI not only for their internal use cases but for customer-facing use cases. So it is growing, and the need is very clear, especially because as a company and a builder, if you can't trust the output of your LLM, then you can't really introduce it to your customers, right? Especially for larger brands. The way we differentiate, though, is that area of subjective analytics, and honestly, we haven't seen many players in that space yet. But as the world moves more and more into hyper-personalization, where these LLM systems are hyper-tuned to a specific individual, and very flexible and adaptable, this area will become more and more important.

CRAIG: Yeah. How affordable is the platform for companies? Is it something that's within reach of a relatively small startup, or is it really focused on the Fortune 500 or Fortune 1000?

YEOP: Yeah. So we do have a SaaS plan that's $99 per month, so it is more affordable. And we also have a design partner plan for earlier companies, especially because at that stage these companies are just starting to understand how their users are interacting with their products, and we want to be good growth partners for them as well.

CRAIG: Yeah. Why did you settle on Anthropic as your base model? Was it through an evaluation, or was it based on business relationships and that sort of thing?

YEOP: Yeah. So it was both. One, we internally evaluated the Claude models, and they performed really well on the analytics and evaluation tasks. And the second was the shared philosophy that both our companies have on building safe, responsible, reliable AI; Anthropic, you know, has the concept of constitutional AI as well.

CRAIG: Right?

YEOP: And so those are very important values, especially when we were going to our enterprise customers. We wanted a foundation model partner that shared the same values as us, one we could trust, and so forth. So that also played a big role.

CRAIG: Yeah.

CRAIG: Do you work with, like, Hugging Face to provide evaluations there? I mean, their leaderboard is really crowdsourced, but there are all kinds of problems with crowdsourcing; it can be gamed pretty easily. Do you work with Hugging Face, or do you publish rankings of what's in the market?

YEOP: So we don't work with Hugging Face yet, or publish that, especially because what we're analyzing is not a specific model; we're evaluating the product itself. So, not yet. But one thing we do do is work very closely with consulting companies. One of our closest partners is PwC India. They were our first enterprise customer as well as an enterprise partner. I'll just give you the example of PwC India: they have 30,000 employees in India alone, and they have so many verticals they're working on. It was very clear that generative AI could be very transformative, not only for their internal use cases; they're also seeing a lot of demand from their customers on how they can leverage generative AI. So we've been a very close partner to them, providing not only analytics and evaluation but also, as I mentioned earlier, if these problems are there, how can we fix them? We provide that consulting-layer service as well.

CRAIG: I see. And so where is this going? I mean, this field, as you said, is growing; it's becoming increasingly integrated into the global economy, and the number of LLM-based applications has exploded. From your point of view, right now, are you just working to build market share, or is there something on the horizon that you see as a new product that Coxwave is looking to launch?

YEOP: Yeah. So increasing market share, of course, is very important. But another piece is multimodal analysis. Right now we focus mainly on text, so most of these are chatbots. But now, even with ChatGPT, you go across different modalities. So we want to provide that layer of analytics for these multimodal products, helping businesses understand, for example: for this user, when do they prefer one modality over another? Are there some modality features that aren't performing as well? And so forth, to help these companies make those feature design choices. That's a very important thing on the roadmap. Another thing is personalization. The reason I think it's such an interesting problem, but also very complex, is this: let's say you get all this user feedback coming in and you understand this user, but with that information you can't always make a change immediately in the product. For example, when I use ChatGPT, sometimes there's this problem with ChatGPT memory where it's just remembering everything. There are things that I don't want it to remember and act on. Some of my colleagues have made the decision to just shut off the memory feature, because it gets really annoying, right? So that type of decision, and how we can help product managers make it in a more informed way: what are the things that should be remembered and changed, and what are the things that users probably don't want changed? Actually helping them make those decisions is a very important thing on the horizon for us.

CRAIG: Yeah, actually that's interesting. That problem seems to be relatively new and getting worse. Even if you start a new instance or a new conversation or session, sometimes it'll mix the answers from your previous conversation into the current one. Which, you're right, can be annoying. And you have to repeat to the model that I've stopped talking about that; I'm talking about something new.

CRAIG: Yeah. Forget about that. I didn't realize you could turn off the memory feature. Have you guys looked at, excuse me, like, Grok 3? I mean, do you have customers that are building applications on the Grok models?

YEOP: Um, so far, off the top of my head, I don't think we have customers building on Grok models. Mainly OpenAI and Anthropic. And our enterprise customers are usually building on top of Llama models. And there are large enterprises in Korea that have their own models: LG has their own model, Samsung has their own Gauss model, and SKT has their own model as well. So they've been trying to figure out, okay, how can they leverage these internal models they've built and bring up the performance on those?

CRAIG: Yeah. Does Korea have a national model? Some countries are building, you know, sovereign models, whatever you want to call them.

YEOP: Yeah. So right now we don't. But our recently elected president, Lee Jae-myung, is pushing for this. So I think there's a new, almost $100 billion AI fund that'll be used to build sovereign AI as well as, you know, support startups, the ecosystem, and so forth.

CRAIG: Yeah. And I mean, I've been impressed by the Korean startups that I've spoken to, because there's so much government support for startups; as you were saying, that's how you met the founder. Is that increasing? And do you pay much attention to how things are organized in the United States, where it's largely private initiatives that fund AI startups?

YEOP: Yeah. So we pay attention to the US as well. And government support has been increasing too, especially with the rise of AI; the government is looking for new partners that they can work with to help not only build sovereign AI but also provide this level of support to the startup community. Recently there was a new government position created, I think a policy advisor position for AI. The person who's currently in that advisory position used to work for Naver, and I believe he worked on the foundation model at Naver. So he has an understanding of the role startups should play and how this ecosystem can play out. Just by having that new position created in the government, that provides a very strong channel and voice for startups to present to the government as well. Another thing we're doing is, we and a couple of other companies, including Wrtn, which is Korea's largest B2C AI company, and Liner, etc., created the Generative AI Startup Association. This is an association of AI startups and companies coming together. We organize a bunch of different events to share knowledge and learnings with one another, as well as discussions with government policymakers to understand, okay, what are their priorities, and what are the things we as startups really want to see changed in the market? How can we work together to create the good future that we all envision?

CRAIG: Yeah. Are most of the startups focused on the Korean and Japanese markets, or are most of them focused on the English-speaking market?

YEOP: Yeah. So I think traditionally it has been more focused on the Korean, Japanese, and Asian markets, but now you're seeing more and more English-speaking, US-facing products as well. And I think the reason for that is actually because of these foundation models and how good they are with languages. In the past, the Korean language has been a big barrier for large companies from abroad coming in, right? But now, with these LLMs, Claude speaking Korean is just so natural that language is no longer a barrier. So for these Korean companies now, if you don't go to the US market or these larger markets, there's just no way of surviving. So it ultimately comes down to survival.

CRAIG: Yeah. And on the multilingual side: because I spent a lot of my life in China, I work between English and Chinese with various LLMs. On Korean to English, can Claude, is it 2.7, 3.7? I always get that wrong. Does Claude 3.7 speak Korean as fluently as, for example, it does Chinese?

YEOP: Uh, yeah. It's so fluent that last time, when an Anthropic researcher came over for the Builder Summit, we asked him, does Anthropic just have a lot of Korean teammates? How is Claude so good at Korean? Even they were surprised that it's so good at Korean. But where in China did you live? I lived for like 13 years in China as well.

CRAIG: Oh, yeah. I saw that you had spent time in Shanghai. I was the New York Times bureau chief in Shanghai for quite a while, and then I was in Beijing, running the business side of the New York Times. My audience, I mean, there are another 150 countries or something that listen, but most of the listeners are in the US and Canada and the UK, the English-speaking markets. How big has your push been into these markets?

YEOP: So up till now we've been focusing primarily on the Indian markets as well as Korea. But now we're starting our push, um, for the US markets. So up till now, like we've always, like, visited the US to understand, okay, what are the sort of developments in Silicon Valley like how are builders approaching building these new types of products and so forth? Um, but I think this year and then especially next year, is when we'll more actively push into the North American markets as well.




 Eye On AI features a podcast with senior researchers and entrepreneurs in the deep learning space. We also offer a weekly newsletter tracking deep-learning academic papers.

