
Dominic Williams, the visionary behind The Internet Computer (ICP) and founder of DFINITY, reveals his plan to build a decentralized alternative to cloud computing.
Dominic:
I drew a lot of inspiration from Bitcoin originally, and Bitcoin is actually created by a sovereign network. So people have these things called ASICs that do proof-of-work computations, that mediate participation in the network, and they're used to drive consensus, a bit in the role of a random number generator. So the Internet Computer is also hosted on sovereign hardware. There are these things called node machines, which are built to standard specifications, and they're run by people called node providers. At the moment there are about 140 organizations, sometimes individuals but usually companies, around the world that run these node machines at scale, and they're all over the world. Generally speaking, the vast majority are installed in co-location facilities.
Dominic:
I've been around for a while. I've been writing software for more than 40 years, so that gives you an indication. And I did do a computer science degree. As you might expect, I did pretty well; I graduated with the highest-scoring first-class honors, some prizes and some other stuff. But I didn't go into academia. I became an entrepreneur. I'm originally from London and I've always been drawn to distributed systems.
Dominic:
I've done a lot of different things, but the first big thing I worked on was during dot-com. It was a thing called Smart Drives, a system for giving people fast access to remote files over dial-up lines, because back then we didn't have broadband. It used something called differential compression and some clever heuristics. So if I was trying to upload a file over my dial-up connection, it would look at the file and think, well, there might be a similar file on the remote file server, and so we'll do this differential compression and only have to upload a small amount of the data. That kind of got dot-commed, but we created an enterprise product from the technology and sold it quite widely. I then created a kind of platform for commercial publishers online and ran that for several years.
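To make the differential-compression idea concrete, here's a minimal sketch (not the actual Smart Drives code, which isn't public, and using fixed-size blocks where a real system would use rolling hashes): the client hashes blocks of its local file, compares them against the block hashes of a similar file already on the server, and uploads only the blocks that differ.

```python
import hashlib

BLOCK_SIZE = 4096  # bytes per block; real systems tune this and use rolling hashes

def block_hashes(data: bytes) -> list[str]:
    """Hash each fixed-size block of a file."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]

def diff_upload(local: bytes, remote_hashes: list[str]) -> dict[int, bytes]:
    """Return only the blocks the server doesn't already have."""
    changed = {}
    for index, h in enumerate(block_hashes(local)):
        if index >= len(remote_hashes) or remote_hashes[index] != h:
            changed[index] = local[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
    return changed

# A 1 MB file where only one byte changed needs a ~4 KB upload, not 1 MB.
old = bytes(1024 * 1024)
new = bytearray(old)
new[5000] = 1  # one byte changed, inside block 1
print(len(diff_upload(bytes(new), block_hashes(old))))  # -> 1 changed block
```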
Dominic:
Then I created a computer game, which was actually formative; prior to crypto, I probably gained my most useful formative experiences from that, in a number of ways. It was a massively multiplayer game directed at kids, and it was unlike any other game I think anyone's seen before or since. When I eventually closed it down, I just couldn't keep it running for a bunch of reasons, there was a change.org petition to bring it back. It grew to about 3 million users, and behind the scenes it used a whole lot of custom, sort of decentralized game-server technology I'd developed myself, which allowed me to indulge my interest in distributed computing. It was powered by a thing called Cassandra, one of the first decentralized NoSQL databases; we had the first sort of production installation of it.
Dominic:
And then eventually, you know, I'd been watching crypto from afar and it sucked me in, first, in 2013, Bitcoin of course. By the end of that year I thought, okay, look, I'm going to create a blockchain for games that will allow people to buy and sell virtual goods and move value between games, and that, of course, would have had some very different requirements. It needed fast finality. It needed to be able to process much larger numbers of transactions a second than Bitcoin could. And so I started looking around at things people were building at the time, started digging into the theory, found it was a bit shaky, nothing really worked like I wanted it to, and spent 2014 really researching how we could build blockchain networks that could run much more efficiently, run much faster, scale and so on. I'd calculated that I needed to be able to process hundreds of thousands of transactions a second for this recurring-payments vision I had at the time. So I was the first guy really pioneering the repurposing of classical Byzantine fault-tolerant computing techniques for the blockchain setting, and also proposing things like sharding and consensus groups as means of scaling out blockchains so they could process a lot more transactions and computation.
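A hedged back-of-the-envelope for why sharding was the answer: if one consensus group can only commit a few thousand transactions a second, you divide the target throughput by the per-shard capacity to get the number of shards (consensus groups) you need. The numbers below are illustrative, not figures from the actual 2014 design.

```python
import math

def shards_needed(target_tps: int, per_shard_tps: int) -> int:
    """Throughput scales roughly linearly with independent shards,
    since each shard runs its own consensus over a disjoint state partition."""
    return math.ceil(target_tps / per_shard_tps)

# Illustrative: a 300k TPS target at 2k TPS per consensus group -> 150 shards.
print(shards_needed(300_000, 2_000))
```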
Dominic:
And around this time I was a member of the early Ethereum community, before Ethereum launched, and someone came up with this phrase "world computer", which I think is just a wonderful expression. My interpretation of world computer was different to other people's interpretation at the time. My interpretation was that a blockchain could be used as a replacement for traditional IT and essentially extend the internet. The traditional internet, which launched in 1983 using the TCP/IP protocols and obviously started growing exponentially in the 1990s with the advent of the World Wide Web, connects software. It connects your web browser to a web server so the browser can download content and display it to you. It allows your WhatsApp client to send a message to another WhatsApp client. We're using it now with video streaming. It's essentially connecting software. And I thought, wow, you could extend the internet with an overlay network, and now the internet could host software as well. And there's this idea that you could build things like social media, enterprise apps and now AI on the internet itself. Obviously, there are a lot of advantages that come from building on a public network as opposed to a proprietary network.
Dominic:
But more than that, I saw something in smart contracts, which were being developed for Ethereum. Within the context of Ethereum, these smart contracts were designed and envisaged as things that would enable very basic DeFi, like ERC-20, creating your token, or a DEX with an automated market maker, and that was very much the community's vision of a world computer. But for me it was like, well, that's a world calculator, not a world computer. And I saw something in smart contracts that I still don't think many people have clocked, which is that it's a new form of software. I'd already spent decades coding, and this is a new kind of software. First of all, it's tamper-proof, which means it's immune to traditional forms of cyberattack. You don't need to protect it with firewalls and anti-malware. It's guaranteed to run the correct code against the correct data. And I saw that could be immensely valuable, because cybercrime and cyberattacks are a continually worsening problem.
Dominic:
So I thought, well, okay, that's pretty impressive, a new kind of software that's tamper-proof and doesn't need protecting by cybersecurity. It's also unstoppable, within the fault bounds of the network, so it's obviously much more resilient. So it's more secure, more resilient. But the unstoppability piece means that you can simplify software design, and so I saw an opportunity to create a platform not only where people could build web applications and services that didn't need protecting by cybersecurity and had this incredible resilience, but also where the actual software itself, the logic that makes these applications and services run, could be greatly simplified, so you can reduce the cost of ownership and time to market and things like that. So it's just a general improvement on software. And you also have these remarkable new capabilities, like autonomy. Autonomy is the property of running without a human controller or an organization that controls it. We're familiar with autonomy in applications like Uniswap on Ethereum. It's a DEX; the DAO sort of stamps out the latest version of the DEX, and it can't be changed afterwards.
Dominic:
But you could, I realized, create a new kind of open internet service, and these exist on the Internet Computer now, where there's a digital governance system that controls the software that creates that service and all the data inside, and that could be essentially driven by the community that uses the service. Developers would be part of that community and would propose software updates, and if they are adopted by the governance system, they'd be installed completely automatically (see the sketch below). It's a bit like open source projects, where there's a kind of governance system of sorts with pull requests and things like that, which you see on places like GitHub. But you could go much further than that and actually have a much more sophisticated governance system that governed a running service, and that could also be useful for enterprise systems, where you want to make sure there isn't just one administrator who holds the keys, who might turn rogue or have an accident and do something bad. So I saw that a world computer could be an alternative to traditional IT, which would host this kind of serverless cloud environment within which you'd install and run a new kind of software with these magical new properties that you can't get any other way. And so I basically set about building it. I proposed it in 2015; there's an old website on the Wayback Machine, dfinity.io, from 2015. It was difficult convincing people that the science would work, because people were so convinced that these networks couldn't scale and couldn't be performant. But in 2016, I was a co-founder at a tech incubator in Palo Alto, California, the heart of Silicon Valley, and persuaded other people there to start backing this DFINITY project. In October 2016, the DFINITY Foundation was created, which is where I work now.
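A minimal sketch of the governance idea described here, not the actual Network Nervous System logic: stake-weighted voters decide on an upgrade proposal, and if it passes a threshold, the new code installs automatically, with no single administrator holding the keys. The threshold and field names are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class UpgradeProposal:
    new_code_hash: str
    yes_stake: float = 0.0
    no_stake: float = 0.0

@dataclass
class GovernedService:
    code_hash: str
    adoption_threshold: float = 0.5  # fraction of voting stake; illustrative

    def tally_and_maybe_install(self, p: UpgradeProposal) -> bool:
        """If the proposal is adopted, the upgrade installs automatically."""
        total = p.yes_stake + p.no_stake
        if total > 0 and p.yes_stake / total > self.adoption_threshold:
            self.code_hash = p.new_code_hash  # no human holds upgrade keys
            return True
        return False

svc = GovernedService(code_hash="v1")
prop = UpgradeProposal(new_code_hash="v2", yes_stake=70.0, no_stake=30.0)
print(svc.tally_and_maybe_install(prop), svc.code_hash)  # True v2
```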
Dominic:
In February 2017, a ledger of the tokens that would be used to drive the tokenomics of the network was created. A bunch of them were assigned to DFINITY as an endowment. We raised a bit of money. We started scaling the team. We were very lucky; we benefited from some really super-smart early hires. Among the early team were a guy called Timo Hanke, an early Bitcoin guy who had been the CTO of TerraHashing and designed this thing called AsicBoost; Ben Lynn, who's the L in BLS cryptography; Andreas Rossberg, the co-inventor of WebAssembly; and so on. And somehow from that we were able to really scale out this just brilliant R&D team. By the end of 2017, we were the largest R&D team in crypto, and that has continued. You don't hear too much about us, for a bunch of reasons, but we've continued to be this R&D leviathan. Eventually, after an awful lot of work, the Internet Computer launched in May 2021, and it was a bumpy ride.
Dominic:
I think a lot of people were very worried about the Internet Computer and what it might do to the status quo in blockchain, and what it all meant. To put it in perspective, the Internet Computer today hosts things like social networks and AI models and all the associated data, content and computation, and that network-resident code, which is an evolution of smart contracts, is processing HTTP and directly creating user experiences, so the whole thing's fully on the network. All these services are running fully from the network. At the moment, traditional blockchains remain focused, really, on DeFi and the financial layer of decentralization, and they can't even store a single phone photo. If you took a photo with your phone, you couldn't store it on any of today's traditional blockchains, or any other blockchain that someone defines as non-traditional. They focus on DeFi and tokenization, whereas the Internet Computer does do DeFi and tokens, but it's also focused on hosting compute at scale and enabling the internet to host software as well as connect software.
Dominic:
The network's grown. We've spent the vast majority of our money on R&D, not on marketing and so on; we think in the end that's the right way forward. Despite that, the compute on the network grew by more than 500% last year, and we think it's going to grow much faster, mainly because of two big developments. One is just AI generally: the network will run AI, and that consumes an awful lot of compute, and cycles are the fuel for compute. The other is this thing called the self-writing internet, which is enabled by AI. The self-writing internet is a paradigm where people just chat to AI and it builds what you want according to your instructions, completely solo, and gives you a URL where you can access your custom website, custom web app, custom internet service, custom enterprise app. And even in production, while it's got data inside of it, you can keep on talking to the AI to update and improve it.
Dominic:
And we think this self-writing internet paradigm is going to be like an iPhone moment for crypto that's focused on compute, and for the world computer paradigm generally, because people will just be able to talk in a secure way with AI and build out their own corner of the internet ecosystem, their own internet functionality, in real time, and obtain this sovereign functionality. The sovereignty piece is very important: it's running on a public network where they have complete control and there's nobody that can make them a captive customer, where they own the underlying software, even though they're not writing it themselves, and all the underlying data. And they don't have to worry about security, of course, because it's tamper-proof. They don't have to worry about resilience, because it's unstoppable. And by talking to the AI, if they want, they can also create Web3 functionality and AI functionality, because these things they build on the Internet Computer can interact with AI via APIs. So, yeah, that's kind of a long summary.
Craig:
Yeah, well, let me start asking some questions. Maybe you can back up and give an overview of the physical side of this. Blockchain as it exists today is really a registry, right? It's a ledger, an immutable ledger, because it doesn't exist in any single place; it exists across all the nodes of the blockchain. But you were talking about how the blockchain couldn't even store a cell phone photo. So you have the blockchain. Why is that the case?
Dominic:
Right. So, starting off with the basics, I suppose: I drew a lot of inspiration from Bitcoin originally, and Bitcoin is actually created by a sovereign network. People have these things called ASICs that do proof-of-work computations, that mediate participation in the network and are used to drive consensus, a bit in the role of a random number generator. So the Internet Computer is also hosted on sovereign hardware. There are these things called node machines, which are built to standard specifications, and they're run by people called node providers. At the moment there are about 140 organizations, sometimes individuals but usually companies, around the world that run these node machines at scale, and they're all over the world; generally speaking, the vast majority are installed in co-location facilities. So it's sovereign hardware for a sovereign network. And the reason these machines are built to a standard specification, stop me if I get too technical, is that they're grouped into things called subnets behind the scenes by the network, and these subnets replicate the compute and data of units of software that are hosted by the network. All of these machines are required to keep up with that replication of compute and data for the units of software assigned to their subnet. If one machine was built to a substandard specification, not the public specification, and it fell behind, the network might decide there's something wrong with that node machine, that it's faulty, and it could be excluded from the network. So that's why they build these machines to a standard public spec. And there are a bunch of differences.
Dominic:
With the exception of Bitcoin and a few other early blockchains, most of what you'd call proof-of-stake blockchains actually reside mostly on cloud. Even Ethereum today: there are hundreds of thousands of validators, as they're called, like nodes, that exist on Amazon Web Services. Now, if you're an old-school crypto theoretician like me, you might question some of the value in that, because while it's true you're replicating the data and compute hundreds of thousands of times, you're doing so on a single infrastructure. In principle, Jeff Bezos could switch it off, right, or it could all go down together. And in fact Solana, which is one of the leading blockchains, had about 40% of its network on a European cloud called Hetzner, and overnight Hetzner decided they didn't want Solana validators and switched them off, and they just lost that huge chunk of the network in one go.
Dominic:
So that's why we have a different model, first of all. We want the Internet Computer to run on diverse sovereign hardware rather than on cloud. But the other challenge with the model that proof-of-stake blockchains have is they kind of want to give everybody a chance to create a validator, because that involves people staking their cryptocurrency. This has the effect of taking that cryptocurrency off the market, and that can help their token price because there's less sell pressure: all the cryptocurrency is locked up. There's a kind of dual purpose, and that's the hidden purpose of proof of stake.
Dominic:
And we would say that, for the purposes of compute at least, if you have a network like the Internet Computer focused on hosting tamper-proof, unstoppable compute, it doesn't make sense to replicate computation and data hundreds of thousands of times. It's not necessary to do that to obtain the security and resilience guarantees that the network provides. So instead the network uses something called deterministic decentralization. The network runs under the control of a governance system called the Network Nervous System, and that governance system actually structures the network, configures it and upgrades the Internet Computer Protocol that runs it. But it also creates these subnets, which replicate units of software, and the more subnets you have, the more units of software you can host. The subnets themselves are invisible to the software. If you were a unit of software and I was a unit of software, and we were hosted by different subnets, we could call each other's functions directly. It's like the internet, right? We're actually connected; this video between us is streaming over, I don't know, probably 10, 15, 20 networks, but we don't know anything about that. My laptop has an IP address, your laptop has an IP address, and that's enough. And it's the same on the Internet Computer: the software doesn't see the subnets, but those subnets scale out capacity. And the question is, how can you create a subnet with the minimum amount of replication, because that's costly, that still provides the required security and resilience guarantees? So deterministic decentralization works as follows.
Dominic:
First of all, each of these node machines must come from a different node provider. There are 140-odd of these companies, and sometimes individuals, that run these node machines at scale. So, for example, with an application subnet, which is a bit different to a fiduciary subnet, the network will pull 13 machines together and provide them with instructions to run a protocol to create a subnet. Those node machines will come from different node providers, and they will be installed in different independent data centers around the world, in different geographies and different jurisdictions.
Dominic:
It's common sense: obviously, if the node machines came from a single provider and that provider went bankrupt, the subnet would fail. That's no good. If the provider became evil, they could do what they liked. That's no good. So the node machines come from different node providers. Similarly, if the node machines came from different node providers but were all installed in the same data center, that wouldn't work either, because the data center could be blown up by a drone, go bankrupt or turn evil. So these node machines from different node providers have to be in different data centers. If those data centers are in the same geography, let's say they were all around Amsterdam and Vladimir Putin put a nuclear bomb on Amsterdam, well, guess what, that would also break, because all the data centers are in the same area. So you require the data centers to be geographically distributed. And similarly, they shouldn't all be in the same jurisdiction. Let's say you distributed them around Europe, and overnight the EU decides to ban decentralized networks and sends its agents out. Again, it would stop working, and you have to be tamper-proof and unstoppable. So the data centers not only have to be geographically distributed but also in different jurisdictions.
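A sketch of the selection constraint being described, under the assumption it can be modeled as "no two nodes in a subnet share a provider, data center, country, or jurisdiction" (the real Network Nervous System formalizes and weights this differently):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    provider: str
    data_center: str
    country: str
    jurisdiction: str

def pick_subnet(pool: list[Node], size: int) -> list[Node]:
    """Greedily pick nodes so no two share any decentralization attribute."""
    attributes = ("provider", "data_center", "country", "jurisdiction")
    chosen: list[Node] = []
    used = {a: set() for a in attributes}
    for n in pool:
        if all(getattr(n, a) not in used[a] for a in attributes):
            chosen.append(n)
            for a in attributes:
                used[a].add(getattr(n, a))
            if len(chosen) == size:
                return chosen
    raise ValueError("pool lacks enough diversity for this subnet size")

# 13 fully distinct candidates satisfy a 13-node application subnet.
pool = [Node(f"p{i}", f"dc{i}", f"c{i}", f"j{i}") for i in range(13)]
print(len(pick_subnet(pool, 13)))  # -> 13
```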
Dominic:
And using that paradigm, for just basic system software, the network runs at 13x replication. I actually think there's a strong argument it can come down to 7x, but we'll see. The actual numbers come from Byzantine fault-tolerant math considerations: the subnet size has to be something called 3f + 1. So you could do 13, you could do 10, you could do 7; probably that's the minimum. And of course, at every stage you're reducing replication, you're reducing cost. In the future it'll also be possible for people running software, within limits and within a framework of scaling privileges, to control the replication within that framework. At the moment, fiduciary subnets are replicated something like 50x plus. But that's how it works, and it's very effective. And it's all science-based. The question is, what is the required replication factor for this specific unit of software's needs? And it happens accordingly, which is, of course, perfectly logical and the right way to do it.
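The 3f + 1 arithmetic referred to here is the standard Byzantine fault tolerance result: a subnet of n nodes tolerates f arbitrarily faulty nodes when n is at least 3f + 1. A hedged reading of the numbers in the conversation, not ICP-specific tuning:

```python
def subnet_size(f: int) -> int:
    """Minimum BFT subnet size tolerating f arbitrary (Byzantine) faults."""
    return 3 * f + 1

def faults_tolerated(n: int) -> int:
    """Largest f such that 3f + 1 <= n."""
    return (n - 1) // 3

for f in (1, 2, 4):
    print(f, "->", subnet_size(f))   # 1 -> 4, 2 -> 7, 4 -> 13 nodes
print(faults_tolerated(13))          # a 13-node subnet tolerates 4 faults
```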
Craig:
Yeah, so you've got these distributed nodes. You talked about the units of software. Do you mean that in software terms, like a piece of software is a stack of units that work together, and then you're placing each unit in a different data center, and then they come together to operate?
Dominic:
Yeah. So let's say you're a developer, or in the future maybe you're just one of the eight billion people on the planet who can use natural-language instructions to tell the AI what to do in the self-writing internet paradigm. You're going to create some units of software, and they're going to be installed on the Internet Computer. You're not concerned with where they are. In a default configuration, your software unit is going to be replicated all over the world because of the deterministic decentralization concerns I just mentioned, and you're just dealing with the Internet Computer. So while there is some configurability, in the general case you're not even thinking about that. You're just uploading this unit of software to the Internet Computer. It's actually called a canister, because it's a bundle of something called WebAssembly, which is something that will also run inside your web browser, and persistent memory pages.
Dominic:
This canister, or smart contract if you want to call it that, can either be a complete system, not a social network, because for that you have to scale the storage and compute, but certainly quite a substantial complete enterprise app, or it can be something much smaller. If you look at OpenChat, which was the first open internet service in the world, it's been running for a few years now and growing steadily. Those guys give each individual user their own canister behind the scenes, and that canister is used to record that user's chat history, all their chat histories and stuff like that. That's an example of a canister being used as a small unit of software within a much larger system. But there are plenty of examples where people have built pretty substantial systems that are all contained inside a single canister.
Dominic:
And again, stop me if I get too technical: these canisters are what's known in theoretical computing terminology as a software actor. Actors were, I think, first used with a language called Erlang, which was very popular with telecoms companies and could create massively scalable, resilient, highly parallelized infrastructures. Anyway, a software actor is something that runs in parallel with other software actors and interacts through messages, except within the Internet Computer's serverless cloud framework, those messages are just function calls and function-call results. You're given the illusion that this code is synchronous, but it's not; it's actually asynchronous. I can call your function and I'll get a result, but we're actually running in parallel, and you're on one subnet and I'm on another subnet, kind of thing. And an actor has private state, and in this case the software logic runs in its own persistent memory.
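A rough model of this, assuming actors can be caricatured as asyncio tasks exchanging messages: each "canister" processes messages one at a time over private state, and a cross-subnet function call is just an awaited message round trip that reads like synchronous code. This is a conceptual sketch, not the Internet Computer's actual messaging machinery.

```python
import asyncio

class Actor:
    """Private state plus a mailbox; messages are processed one at a time."""
    def __init__(self, name: str):
        self.name = name
        self.counter = 0  # private state: no other actor can touch it directly
        self.mailbox: asyncio.Queue = asyncio.Queue()

    async def run(self):
        while True:
            method, args, reply = await self.mailbox.get()
            reply.set_result(getattr(self, method)(*args))

    def increment(self, by: int) -> int:
        self.counter += by
        return self.counter

async def call(actor: Actor, method: str, *args):
    """Looks like a synchronous function call, but is an async message."""
    reply = asyncio.get_running_loop().create_future()
    await actor.mailbox.put((method, args, reply))
    return await reply

async def main():
    a = Actor("a")
    worker = asyncio.create_task(a.run())
    print(await call(a, "increment", 5))  # 5: awaited like a direct call
    worker.cancel()

asyncio.run(main())
```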
Dominic:
And I don't want to get too technical again, but on the Internet Computer, logic and data are kind of the same thing; it's just state. Instead of having a database and files, you just define your programming abstractions. For example, define a blog post as title, content, likes, comments or whatever, and then you just define your blog as something along the lines of var blog = array of posts, and it all just lives in persistent memory. That's a paradigm called orthogonal persistence, and it's another big innovation you see on the Internet Computer. We've been trying to completely reimagine software in a more modern, effective form, and that actually turns out to be very important for working with AI, when AI builds completely solo on the self-writing internet.
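The blog example, sketched in Python purely as an illustration (on the Internet Computer you'd write this in a canister language such as Motoko or Rust): the point of orthogonal persistence is that the variable itself is the storage. There's no separate file or SQL layer, and, hypothetically, the `blog` variable below would simply still be there on the next call and across upgrades.

```python
from dataclasses import dataclass, field

@dataclass
class Post:
    title: str
    content: str
    likes: int = 0
    comments: list[str] = field(default_factory=list)

# Under orthogonal persistence this ordinary variable *is* the database:
# it would survive across calls with no files or SQL involved.
blog: list[Post] = []

def add_post(title: str, content: str) -> None:
    blog.append(Post(title, content))

def like(index: int) -> int:
    blog[index].likes += 1
    return blog[index].likes

add_post("Hello, cypherspace", "First post.")
print(like(0))  # 1 -- state lives in the variable, not a DB row
```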
Craig:
I'm sorry, you talk about talking to a chatbot that will code a piece of software. Where does the chatbot reside?
Dominic:
Well, the self-writing internet's a general concept, and we're obviously developing technology that we feel enables our own version. There's a platform called Caffeine, you can't log in yet, but it's at caffeine.ai. It's actually chatbot- and large-language-model-agnostic, so in principle any large language model could be used, and in fact we'll probably give people a choice. That being said, number one, we are tweaking these models and fine-tuning them for the specific task at hand so they can better use the special technologies involved, and that bot could be something like a fine-tuned version of GPT-4o, or it could be something like DeepSeek V3 or even R1, depending on needs, that's running on the Internet Computer itself.
Dominic:
At the moment the Internet Computer runs custom AIs, and although the capabilities are scaling pretty rapidly, with people working to extend the Internet Computer Protocol to push back the limits, currently you can run a neural network that does facial recognition very easily.
Dominic:
People are already running DeepSeek, but with a very small number of parameters. Later this year you'll probably get up to, I don't know, maybe 16 billion parameters, something like that. But oftentimes for coding tasks you want larger models, because, for example, you want them to be multilingual. Because the self-writing internet is a paradigm for the world, people need to be able to instruct the AI in their own native language. And the Internet Computer will support something called AI workers, which are kind of like AI inference that's been made deterministic and runs in replicated form, so you can be sure it's tamper-proof. That would be an alternative. So on the one hand it could be GPT-4o; on the other hand it could be DeepSeek, running on the Internet Computer in a deterministic, tamper-proof form. And there might be, or there will be, different considerations, like cost, suitability for the task, that kind of thing.
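A sketch of the "AI workers" idea as described here, under the assumption that deterministic inference means every replica must produce bit-identical output, so agreement can be checked by simple comparison (the actual protocol is more involved):

```python
from collections import Counter

def replicated_inference(replicas, prompt: str, f: int) -> str:
    """Run the same deterministic model on every replica and accept the
    output only if at least 2f + 1 replicas agree byte-for-byte."""
    outputs = Counter(r(prompt) for r in replicas)
    answer, votes = outputs.most_common(1)[0]
    if votes >= 2 * f + 1:
        return answer
    raise RuntimeError("no 2f+1 agreement; inference was not deterministic")

honest = lambda p: f"echo:{p}"   # deterministic replicas agree exactly
faulty = lambda p: "garbage"     # a Byzantine replica deviates
print(replicated_inference([honest] * 5 + [faulty], "hi", f=1))  # echo:hi
```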
Craig:
Yeah, let's stop for a sec. You said there are people running a scaled-down version of DeepSeek on the network. Okay, so is the source code for DeepSeek, and the weights, divided up by function or into units, with the units then distributed? Or is the entire model replicated across all the nodes? I'm just trying to figure out here.
Dominic:
Yeah. So currently, when people are running custom AI, it's on subnets dedicated to application logic, so it's replicated 13x, and I mentioned that 13x is probably overkill. In the future you can imagine subnets with lower levels of replication. Without going into technical details, while it's easy to create a subnet with low levels of replication, even 4x replication, there are security concerns, so there'll be some other work involved in updating the network so that it can handle lower-security, lower-resilience subnets. But I think in the future people might choose to run AI at 4x or 7x replication, with lower security privileges. At the moment it's running at 13x, and I haven't looked into how they're doing it, but I'd imagine it's just a few canisters.
Dominic:
A single unit of software can have up to 500 gig of persistent memory associated with it currently, but the current framework is 32-bit. Actually, as we speak, the Internet Computer Protocol is being upgraded to support 64-bit WebAssembly, or Wasm, compute. The 32-bit environment creates some limitations, and that's why you've got these models with relatively tiny numbers of parameters. But 64-bit is coming as we speak, and the amount of main memory, if you like, that you don't see, in a canister, which can be used to store caches, weights and so on, is going to be rapidly scaling. Having said that, I don't think that's suitable for the self-writing internet.
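A rough sizing illustration of why 32-bit Wasm was the bottleneck and what 64-bit changes. The arithmetic is illustrative; the only figures taken from the conversation are the 500 gig of persistent memory and the roughly 16-billion-parameter ballpark.

```python
GIB = 2**30

def model_gib(params_billions: float, bytes_per_param: int) -> float:
    """Approximate weight storage for a model, in GiB."""
    return params_billions * 1e9 * bytes_per_param / GIB

wasm32_heap_gib = 4     # a 32-bit address space caps a single memory at 4 GiB
stable_memory_gib = 500 # persistent memory per canister, per the conversation

# A 16B-parameter model at 8-bit weights needs ~15 GiB: far beyond a 32-bit
# heap, but comfortably inside 500 GiB of stable memory once 64-bit Wasm lands.
print(round(model_gib(16, 1), 1), "GiB")  # -> 14.9 GiB
```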
Dominic:
The self-writing internet will be a mass-market phenomenon, and you'd like the AI to run much more efficiently. So I think it'll actually run on standard models running on these things called AI workers within the network; that's something that's coming this year. So you'll see different models being used in the self-writing internet, and people will often make choices based on considerations like cost and security and the ability of the model and things like that. What I will say is, when we fine-tune DeepSeek, we've been getting some really, really good results. The best out-of-the-box proprietary model for coding is hands-down Claude Sonnet, but you can't fine-tune it. You can fine-tune GPT-4o, and when we fine-tune it, we can take it beyond Claude. DeepSeek is quite a large model and we've been playing around with it, but the results we've been getting are very good.
Craig:
Yeah, so am I correct that the model may be replicated in full across the nodes? And to access the model, you have to go through the blockchain. The blockchain either pulls together the different units and delivers… I mean, it's the interface between the user and the…
Dominic:
Yeah, no, no. So that's the difference. I think there's a lot of confusion in the space, because people can't really get their heads around the capabilities of the Internet Computer network. The Internet Computer basically creates a single serverless cloud environment that is essentially running on a blockchain, and the reason this is hard for people to grok is that today there's no other blockchain in existence that can even store a phone photo, let alone store the hundreds of gigabytes of weights associated with a large language model and actually run one. So the confusion, I think, comes about because a lot of people talk about, oh, we've got this decentralized cloud blockchain, when all they're really doing is creating a kind of tokenization and access-control layer.
Dominic:
People spin up Docker images, if you like, that you can acquire control of on payment of some tokens, and these Docker images, more often than not, are actually just running on Amazon Web Services, so you're just creating a tokenized interface to Amazon. They can also run on people's home computers and stuff like that. Look, the Internet Computer, hundreds of people have been working on it for years, very senior computer scientists; it's a whole different thing. Traditional blockchains are token databases, essentially, sometimes highly optimized. Solana can process a large number of transactions a second which have very little compute associated with them, for things like meme coins. But the Internet Computer is a very different kind of beast. It's much more complicated, involves much deeper science and technology, and provides a kind of network that is pretty unique. When you build on the Internet Computer, you're creating code that runs on the network itself, within this serverless cloud environment, where the software is really a form of smart contract. It's tamper-proof, so you don't need to protect it using cybersecurity. It's unstoppable. And you can make it autonomous, which means it can exist without a human or organizational owner and controller, either in unmodifiable form or under the control of a digital governance system. The network uses this scheme I mentioned, deterministic decentralization, to ensure that the individual units of software are run with the levels of security and resilience they actually need in order to meet the guarantees the network provides. And the Internet Computer is envisaged and designed as an alternative to traditional IT.
Dominic:
We're trying to reinvent compute. Currently, software systems and the services and applications they create are really the foundations of modern society. There are 8 billion people on Earth, and they couldn't possibly be supported, kept alive, without a very high degree of computerized automation, and things like social media are part of our everyday lives; it's how we live. So we believe the computer is a societal foundation, and therefore it should be part of the internet itself, not part of closed, proprietary, private computing infrastructures. Or at least there should be an alternative, so people can build on the internet rather than on these closed, proprietary, private infrastructures like Amazon Web Services and Google Cloud. So that's our vision. The vision is that you're building on the internet.
Dominic:
Now the internet can host software as well as connect it, and this software has new properties and capabilities, such as being tamper-proof and unstoppable and optionally autonomous and so on, but the whole thing is on the network. There's no traditional IT involved. It's not like a traditional blockchain that's just acting as a gatekeeper, through tokenization and some very simple access-control functionality, to centralized traditional IT, whether that's a cloud or somebody's gaming computer in their basement. Everything runs on the network.
Craig:
So the nodes of the computer network are, as you said, distributed across geographies and jurisdictions, and each node has a copy of the software, not just a unit?
Dominic:
Yeah, so your unit of software, or the unit of software you're interacting with, is going to be assigned to a subnet, and that subnet comprises a bunch of node machines from different node providers, installed in different data centers in different geographies and different jurisdictions. That subnet is efficiently replicating the compute and data of that software, but it's doing it using Byzantine fault-tolerant distributed computing techniques, which basically means that even if an adversary takes physical control of one of those node machines, or some subset of them, they cannot corrupt the data hosted by the subnet and they cannot interfere with its operation. And that's, of course, how you get these guarantees. It's the same as a blockchain, in a sense. That's why running a Bitcoin node won't allow you to steal bitcoins: you're just part of a much bigger network, and you need more than half the hashing power to corrupt it, at least in principle. But with the networks like Akash or something like that you mentioned, what's happening there is you've got a blockchain that has a ledger and has tokens, and that's really giving people access control to centralized computers, for example a gaming PC in someone's basement, or an Amazon Web Services instance, which ultimately, of course, is just running on some specific machine in Amazon Web Services data centers with a bit of virtualization going on.
Dominic:
This is very different. If you install some software on the Internet Computer, it has no location. Its location is cypherspace, and the network dynamically decides where its data and computation are replicated. And by the way, the membership of these subnets is dynamic. The network can add new nodes as needed; it can remove failed nodes, or remove them for other reasons.
Dominic:
If a subnet is overloaded, the network can actually split it into two subnets to divide the load dynamically. And so if you ask, where is this software that's been installed on the Internet Computer, or even, where is this AI model that's been installed on the Internet Computer, the answer is it's all over the world, and its particular location is also dynamic. It can change based on what the protocol does, through this thing called the Network Nervous System, a governance system that has automated control of the network and can add nodes to subnets, remove nodes, create new subnets, split subnets, that kind of thing. So really, what's the location of software and data on the Internet Computer? Cypherspace.
Craig:
Yeah. And what's the incentive for the node operators? You have this concept of reverse gas. Can you talk about that?
Dominic:
Sure. So on a traditional blockchain like Ethereum, when you interact with a smart contract, you have to pay some gas to pay for the computation; gas on Ethereum is like fuel for computation. With a traditional blockchain, every transaction is crafted manually by the user. You have a wallet like MetaMask, and you craft this transaction that will call into a smart contract. When you craft that transaction, you specify how much you'll pay for the computation, and there's a kind of fee market, so you add the gas as part of the transaction, effectively, and then, using the key that the wallet holds, you press send, it gets sent to the network, and you get told when it's executed. Very clearly, if you have to manually create every function call to a smart contract using your wallet, this can't be used as a basis for building a social network. You need smart contracts that can not only run much more efficiently and scalably, by many orders of magnitude, but that can do things like process HTTP to create interactive user experiences. So the whole thing is re-architected and re-imagined. On the Internet Computer there's a scheme called reverse gas, which means the software pays for its own computation, and it does so using something called cycles, which are analogous to gas on Ethereum. The best analogy is an electric car. If you have a Tesla, you plug it in to charge up the battery and you drive around. As it drives around, the battery gets depleted and you have to plug it in again, or if you're on the move, go to a supercharging station or something like that. Software on the Internet Computer is kind of similar: you charge it up with these things called cycles, and as it performs computations and stores data in its memory and so on, it depletes those cycles. If it runs out of cycles, it stops working, and you've got like three months to fill it up before it gets scavenged. But anybody can fill it up. Say you're a bit of software and I'm using you, and you run out of cycles and stop working; I can send more cycles and you'll start working again. So the tokenomics works as follows.
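The electric-car analogy as a sketch, with made-up pricing (the real cycle costs per instruction and per byte are set by the protocol): a canister carries a cycles balance, each operation depletes it, it freezes at zero, and anyone, not just the owner, can top it up.

```python
class Canister:
    """Reverse gas: the software pays for its own compute from a cycles balance."""
    def __init__(self, cycles: int):
        self.cycles = cycles

    def execute(self, instructions: int, cost_per_instruction: int = 1) -> None:
        cost = instructions * cost_per_instruction  # illustrative pricing
        if cost > self.cycles:
            raise RuntimeError("out of cycles: canister frozen until topped up")
        self.cycles -= cost

    def top_up(self, cycles: int) -> None:
        self.cycles += cycles  # anyone may deposit cycles, not just the owner

c = Canister(cycles=1_000)
c.execute(900)
try:
    c.execute(900)
except RuntimeError as e:
    print(e)          # frozen...
c.top_up(10_000)      # ...until any user sends more cycles
c.execute(900)
```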
Dominic:
There are these things called ICP tokens, which are a multi-purpose utility token. They play the role of cryptocurrency within the network, and they also allow you to participate in governance: you can stake them to create a voting neuron within the Network Nervous System. But for the purposes of understanding the basic economics, number one, you can convert ICP into cycles, and the network is aware of the value of ICP because it has oracle functionality. An XDR is a special drawing right, a virtual currency defined by the IMF; I don't know what it's worth now, but last time I looked it was like $1.40 or $1.50. If you convert ICP tokens worth 1 XDR into cycles, you get a trillion cycles, and cycles give you a predictable amount of compute. So it's like electricity: you know it's going to cost you so much to charge the battery of your electric car with this amount of electricity. The ICP token is named after the protocol, the Internet Computer Protocol. Because people have to convert ICP tokens into cycles to power their compute, in the mode of an electric battery, using the reverse gas model, that's deflationary. On the other hand, the network mints new ICP tokens to reward people participating in governance, not directly, but through a mechanism that can generate ICP, and it also mints new ICP for the people running the node machines. But it's not like a traditional blockchain where they're sort of speculating.
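The conversion described here, as arithmetic: ICP worth 1 XDR mints 1 trillion cycles, so the cycles you get per ICP float with the ICP price while the cost of compute stays fixed in cycles. The $1.40 XDR rate is the illustrative figure from the conversation; the ICP prices are made up.

```python
CYCLES_PER_XDR = 1_000_000_000_000  # 1 trillion cycles per XDR worth of ICP

def cycles_for_icp(icp_amount: float, icp_usd: float, xdr_usd: float = 1.40) -> float:
    """Convert ICP to cycles via its XDR value; compute cost stays stable."""
    xdr_value = icp_amount * icp_usd / xdr_usd
    return xdr_value * CYCLES_PER_XDR

# If ICP trades at $7.00, one ICP mints 5 trillion cycles; at $14.00, 10 trillion.
print(f"{cycles_for_icp(1, 7.00):.3e}")   # 5.000e+12
print(f"{cycles_for_icp(1, 14.00):.3e}")  # 1.000e+13
```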
Dominic:
The governance system of the network sets expected costs for purchasing node machine hardware, co-locating it, operating it, paying for the bandwidth and so on, and the network decides that they should get a certain return over four years. ICP is then issued in varying amounts, depending on its current price, in order to remunerate the node providers. And if you think about it, this is completely necessary, because node providers have fixed costs. A node machine costs about $20,000 to build, and some of them are operating quite large numbers of them. You have to co-locate it, so you have to have contracts with data centers. They're consuming bandwidth, so again you've got to have a bandwidth contract. So in order to have a stable network, the network remunerates them in a predictable way.
Dominic:
And then the network also modulates the remuneration available in different areas of the world, not only with respect to differing costs, but also to create incentives to deploy these node machines in different parts of the world, because the network relies on decentralization, just like the internet.
Dominic:
So, for example, compute might be a bit more expensive in Singapore, but at some point the network was not only willing to pay that but a bit more, to incentivize people to deploy these machines in Singapore data centers. And if you look at the dashboard of the Internet Computer, you'll see there are node machines installed in data centers all over the world. There's a data center in South Africa hosting node machines; there's one in Panama or something. They're all over the world. And the Network Nervous System, which is setting these policies with respect to how much financial remuneration you receive in real terms, is not only making sure that these node providers can run these machines in a stable way, but also creating financial incentives to maximize the decentralization of the network, which in turn enables it to host tamper-proof, unstoppable software, because it needs geographical and jurisdictional decentralization.
Craig:
Yeah, how many node machines do you have right now and is there sort of a critical mass that you need to make this viable? And how does someone use the internet computer? Like if I wanted to use it, what do I do?
Dominic:
So I think there are like 1,500 of these machines around the world at the moment. Actually, not all of them are used; most of them are just kept in a pool, so that if there's a spike in demand, there's a safety buffer that allows the network to scale up quickly. But using the Internet Computer is simple: you just interact with it over the internet. And that might sound mysterious. How can that work, right, because the software is not in one place, it's in lots of places?
Dominic:
Well, the network incorporates these subnets, but there's also a sort of perimeter of things called boundary nodes, and default domain names. Each unit of software actually has its own kind of URL, and when you interact with the Internet Computer, dynamic DNS will connect you, first of all, I think, to a boundary node that's nearest to you, so you get the lowest possible latency. And then, I'm oversimplifying here, but the boundary node will forward your request, and the response that results from it, to the replica in the subnet hosting the software you're interacting with that's nearest to you in terms of latency. So the domain name system and the routing that the boundary nodes perform determine which individual replica you're interacting with. Of course, when you submit a function call that changes memory, called an update call, your compute is replicated across all the nodes. But sometimes you just want to perform a query that doesn't need to modify memory, and that's done quickly by this kind of routing based on latency.
Dominic:
But of course there are other considerations too, because an individual replica inside a subnet might be down, or be faulty in some way, and there are load-balancing issues. It doesn't make any sense, if you've got something hosted on a subnet and, let's say, this piece of software is very popular in South Korea, and it turns out the subnet's got one node in South Korea. Of course that node is going to produce the lowest response time for query computation, but if all that massive load just goes to that one replica in South Korea, it could get overloaded. So you have to have some clever logic that will at some point begin distributing these requests to other replicas in the subnet, and you have to have things like DoS protections and rate limiting for individual IP addresses that might make requests. So in actual fact, behind the scenes it gets pretty complicated, but at a high level that's how it works, yeah.
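A hedged sketch of the routing logic outlined above: a boundary node ranks a subnet's replicas by latency to the caller, skips unhealthy ones, and spills traffic over to the next-nearest replica once the nearest is saturated. The capacity figure is illustrative, and the production logic adds rate limiting and DoS protection on top.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    name: str
    latency_ms: float
    healthy: bool = True
    load: int = 0
    capacity: int = 100  # illustrative requests-in-flight limit

def route_query(replicas: list[Replica]) -> Replica:
    """Prefer the lowest-latency healthy replica that has spare capacity."""
    for r in sorted(replicas, key=lambda r: r.latency_ms):
        if r.healthy and r.load < r.capacity:
            r.load += 1
            return r
    raise RuntimeError("no replica available; apply backpressure / rate limit")

subnet = [Replica("seoul", 20), Replica("zurich", 180), Replica("atlanta", 210)]
subnet[0].load = 100                  # the nearest node is saturated...
print(route_query(subnet).name)       # ...so the query spills to "zurich"
```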
Craig:
And so is there a URL that people access.
Dominic:
Yeah, so I don't know it off by heart. By the way, when you've got a piece of software on the Internet Computer, there are standard bits of software you can just load a website into. So if you go to internetcomputer.org, which is the website for the Internet Computer that we maintain, that website is actually hosted on the Internet Computer, and you can map your own custom URL, your own custom domain name, to your software on the Internet Computer. With respect to uploading that software, there's a developer kit called DFX, and when you create software that's going to be installed on the Internet Computer, there's some magic going on behind the scenes. It's interacting with various URLs, and through that process your software will get an ID, and this will determine future URLs you interact with and things like that. But that's all hidden.
Dominic:
The user experience is just this thing called the Internet Computer. It's part of the internet. You don't have to worry about how that works. If you've got an internet connection and you've got the software developer kit, you can deploy code on it. If you want to go to the Internet Computer website, you just type in internetcomputer.org and it's served off the Internet Computer. That first open internet service I mentioned, OpenChat, very cool: you just type in oc.app, I think, and that's mapping behind the scenes to some underlying network-resident software URL, but you as a user don't even know. I mean, what has actually been one of the problems we've experienced is that people don't know when something's on the Internet Computer, and in this world of Web3 there was this kind of slight subversion of the meaning of on-chain.
Dominic:
So people will say, oh, look, this social media service or this whatever is on Solana, it's on Ethereum, it's on-chain. Coinbase has this tagline, "onchain is the new online." People, when they hear that, think those things are running on the blockchain. But it doesn't mean that at all. It's a subversion of the original language I used as far back as 2015, 2016. What it means for them is that it's built on Amazon Web Services or Google Cloud or something, and it's on-chain because it has an associated token. It's a bit like, I have tokens and so I'm on-chain, kind of thing. The problem is, we did surveys, and more than 95% of people, when they hear this language, and I'm talking about crypto journalists, mainstream journalists, think it means running on the blockchain, when in actual fact it's nothing like that at all. It's just a token database holding an associated token for this completely centralized thing running on a cloud service. And then people can interact with OpenChat, the oc.app, and it's this highly interactive social-network chat service that is completely autonomous, runs under the control of a decentralized governance system, the community actually runs the service itself, there's no back door at all, and it's coming off the Internet Computer, and you can use it as a progressive web app. And the problem is that people see that and think, well, that's Web2 functionality, right? They don't understand the difference between that thing, which is running on the Internet Computer, on a decentralized network, and this other thing that's also described as being built on a blockchain.
Dominic:
They have a different meaning for on-chain. What they mean is, it's completely centralized and run by developers on a cloud infrastructure, but it has an associated token that's on the blockchain, which really functions as a decentralized token database. These are very, very different things. But the use of this on-chain language has confused the hell out of people. Nobody understands.
Dominic:
And when you try to explain, hey, this blockchain, you can't even store a phone photo on it, it's really just designed for DeFi and tokens, maybe access control like this Akash thing, people look at you like you're from Mars. They just can't grok it, because of the language. Originally, on-chain was like saying on cloud. If someone says it's on cloud, well, it's built on cloud. So on-chain should mean it's built on chain. But that's not how they mean it. They mean it like: centralized, built on Amazon Web Services or Google Cloud or something like that, with an associated token. And then you're trying to explain that, and it's really tough.
Dominic:
When we say on-chain, or on the network, we mean it literally, but we don't know what to do, because I was really an early pioneer of this kind of language, probably with the most claim to saying something is fully on-chain, and now the language we had in the early days has completely changed. The problem is, it's not clear anymore, in the context of the Internet Computer, what language you use to explain the difference. We're kind of completely stuck. And we found that even, I'd spoken to managing editors of very major mainstream publications, managing editors of crypto.
Dominic:
I had one shocking interview where someone connected us, and after about 20 minutes I'd been explaining this vision of extending the internet, so that not only does it connect software, it hosts software, and this new kind of software is tamper-proof and unstoppable and all these amazing things, and you can create open internet services. And I could see the guy was looking bad-tempered, so I paused, I do talk too much, it's true, and looked at him, and he had a real scowl on his face, and he shrugged his shoulders and said, "So what's new?" And I said, what do you mean, what's new? He said, well, Ethereum and Solana, they're decentralized clouds too. So what's new?
Dominic:
And that's when I realized this language has confused people. And, let's be honest, it probably was designed to confuse people, right? It was an overloading of the language that didn't need to happen. On-chain had a very clear meaning. Why did on-chain have to change to mean some kind of centralized thing, under the control of a bunch of developers, that has an associated token? It doesn't make any sense, but the language is so pervasive, and it's supported by all the venture capital backers of these projects that have an associated token, all the crypto press, all the crypto industry research, which of course sometimes tends to support vested interests. It's fooled everybody. This is the incredible thing. Even people who are managing editors of crypto at huge international publications have been completely misled and confused by this terminology of on-chain, this overloading of it to mean having an associated token. So it's very tough for us, because what language can we use to explain this very fundamental and important difference? These are completely different things, right?
Dominic:
And similarly, the other term we've had a lot of trouble with is this idea of decentralized cloud. For us, what we mean when we say decentralized cloud is that the cloud itself is decentralized: when you put software into this serverless decentralized cloud, or an AI model or whatever it is, it doesn't have a location, it's in cypherspace, and you're guaranteed that the logic will run, as written, against the correct data. It's tamper-proof, immune to cyber attack. Nobody has backdoor access to it. It's unstoppable. You can make it autonomous, and a bunch of other stuff, right?
Dominic:
But other people, when they talk about decentralized cloud, what they really mean is that there's a blockchain with some kind of tokenization and access control that gains you access to cloud computing that's completely centralized, right? It's running on someone's gaming PC, or it's an Amazon Web Services instance that someone has connected to the blockchain. But when someone says decentralized cloud, what you imagine is that the compute is decentralized, not that there's a blockchain gating access to centralized compute. Because centralized compute can be switched off, it can be corrupted by whoever's running it and holds the keys to it, it's certainly not tamper-proof, it's certainly not unstoppable, and it can't be autonomous, because autonomy implies no human or organizational control, just control by logic. So we do struggle, we really do struggle. And I think the other piece is just people asking how it's even possible.
Dominic:
Well, how is it even possible? I mean, it's a technical area. We discussed deterministic decentralization and how it enables you to host the software with less replication while still delivering on security and resilience guarantees, but there's a lot more to it than that, and of course, when you're talking about something like the Internet Computer, it's non-trivial. We have hundreds of people working on it; at this point, I think significantly more than 1,000 person-years of R&D have been dedicated to the effort, and we've been going for years. That's just, again, very different. You get different results if you're willing to invest that much in R&D than if you're just copy-pasting an existing blockchain, making some tweaks, and rebranding it.
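As a rough illustration of the replication point, here is a minimal Python sketch assuming the classical Byzantine fault-tolerance bound (n replicas tolerate f faulty ones when n ≥ 3f + 1) and a hypothetical 13-node subnet. The per-node compromise probabilities are made-up inputs, used only to show why forcing node independence (distinct providers, data centers, jurisdictions) matters more than raw replica count; none of these numbers are protocol constants.

```python
# Sketch: why deterministic node selection can let a small replica set
# deliver strong guarantees. Classical BFT bound: n >= 3f + 1, so a
# subnet of n nodes tolerates f = (n - 1) // 3 Byzantine nodes.
# Subnet size and failure rates below are illustrative assumptions.

from math import comb

def byzantine_tolerance(n: int) -> int:
    """Max faulty replicas a classical BFT protocol survives."""
    return (n - 1) // 3

def p_subnet_compromised(n: int, p_node: float) -> float:
    """Probability that more than f of n nodes are faulty, assuming
    each node fails or colludes independently with probability p_node."""
    f = byzantine_tolerance(n)
    return sum(comb(n, k) * p_node**k * (1 - p_node)**(n - k)
               for k in range(f + 1, n + 1))

n = 13  # hypothetical subnet size
print(f"n={n} tolerates f={byzantine_tolerance(n)} faulty nodes")

# Deterministic decentralization tries to force independence between
# nodes. An anonymous pool with correlated operators behaves like a
# higher effective per-node failure probability:
for p in (0.05, 0.20):
    print(f"per-node p={p:.2f} -> subnet compromise ~{p_subnet_compromised(n, p):.2e}")
```

The design point the sketch illustrates: if node failures really are independent, a small deterministic replica set already drives the compromise probability very low, whereas correlated, anonymous operators (the higher per-node probability) erode the guarantee no matter how many replicas you add.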
Dominic:
But it's again very, very difficult because of the noise in Web3 to actually communicate to people the profound nature of the differences.
Craig:
Yeah. And for people that want to explore this, you mentioned two URLs: internetcomputer.com and internetcomputer.org. Are those aliases for the same site?
Dominic:
No, just internetcomputer.org. It's a website where you can get a lot of resources, like developer resources and things like that. And then there's a new one for people interested in the self-writing internet; you can find that at icp.ai, but I think it's just a deck at the moment. Actually, there's a great URL to get started: internetcomputer.org/library. If you go there you can find all kinds of things: in-a-nutshell primers, short introductory documents, and decks covering not only the base network but also things like the self-writing internet. I think there's about 20, and if people want it, some cover not only the overall architecture employed but also some of the underlying mathematics, which is actually very beautiful and interesting, but sometimes quite complicated.
Craig:
Yeah, okay, so internetcomputer.org/library. That's where people should go.
Dominic:
Great place to start.