Jensen Huang, CEO of Nvidia, gave an eye-opening keynote talk at CES 2025 last week. It was highly appropriate, as Huang’s favorite subject of artificial intelligence has exploded across the world and Nvidia has, by extension, become one of the most valuable companies in the world. Apple recently passed Nvidia with a market capitalization of $3.58 trillion, compared to Nvidia’s $3.33 trillion.
The company is celebrating the 25th year of its GeForce graphics chip business and it has been a long time since I did the first interview with Huang back in 1996, when we talked about graphics chips for a “Windows accelerator.” Back then, Nvidia was one of 80 3D graphics chip makers. Now it’s one of around three or so survivors. And it has made a huge pivot from graphics to AI.
Huang hasn’t changed much. For the keynote, Huang announced a video game graphics card, the Nvidia GeForce RTX 50 Series, but there were a dozen AI-focused announcements about how Nvidia is creating the blueprints and platforms to make it easy to train robots for the physical world. In fact, in a feature dubbed DLSS 4, Nvidia is now using AI to make its graphics chip frame rates better. And there are technologies like Cosmos, which helps robot developers use synthetic data to train their robots. A few of these Nvidia announcements were among my 13 favorite things at CES.
After the keynote, Huang held a free-wheeling Q&A with the press at the Fontainebleau hotel in Las Vegas. At first, he engaged in a hilarious discussion with the audio-visual team in the room about the sound quality, as he couldn’t hear questions up on stage. So he came down among the press and, after teasing the AV team guy named Sebastian, he answered all of our questions, and he even took a selfie with me. Then he took a bunch of questions from financial analysts.
I was struck by how technical Huang’s command of AI was during the keynote, but it reminded me more of a SIGGRAPH technology conference than a keynote speech for consumers at CES. I asked him about that and you can see his answer below. I’ve included the whole Q&A from all of the press in the room.
Here’s an edited transcript of the press Q&A.
Question: Last year you defined a new unit of compute, the data center. Starting with the building and working down. You’ve done everything all the way up to the system now. Is it time for Nvidia to start thinking about infrastructure, power, and the rest of the pieces that go into that system?
Jensen Huang: As a rule, Nvidia–we only work on things that other people do not, or that we can do singularly better. That’s why we’re not in that many businesses. The reason why we do what we do, if we didn’t build NVLink72, who would have? Who could have? If we didn’t build the type of switches like Spectrum-X, this ethernet switch that has the benefits of InfiniBand, who could have? Who would have? We want our company to be relatively small. We’re only 30-some-odd thousand people. We’re still a small company. We want to make sure our resources are highly focused on areas where we can make a unique contribution.
We work up and down the supply chain now. We work with power delivery and power conditioning, the people who are doing that, cooling and so on. We try to work up and down the supply chain to get people ready for these AI solutions that are coming. Hyperscale was about 10 kilowatts per rack. Hopper is 40 to 50 to 60 kilowatts per rack. Now Blackwell is about 120 kilowatts per rack. My sense is that that will continue to go up. We want it to go up because power density is a good thing. We’d rather have computers that are dense and close by than computers that are disaggregated and spread out all over the place. Density is good. We’re going to see that power density go up. We’ll do a lot better cooling inside and outside the data center, much more sustainable. There’s a whole bunch of work to be done. We try not to do things that we don’t have to.
Question: You made a lot of announcements about AI PCs last night. Adoption of those hasn’t taken off yet. What’s holding that back? Do you think Nvidia can help change that?
Huang: AI started in the cloud and was created for the cloud. If you look at all of Nvidia’s growth in the last several years, it’s been the cloud, because it takes AI supercomputers to train the models. These models are fairly large. It’s easy to deploy them in the cloud. They’re called endpoints, as you know. We think that there are still designers, software engineers, creatives, and enthusiasts who’d like to use their PCs for all these things. One challenge is that because AI is in the cloud, and there’s so much energy and movement in the cloud, there are still very few people developing AI for Windows.
It turns out that the Windows PC is perfectly adapted to AI. There’s this thing called WSL2. WSL2 is a virtual machine, a second operating system, Linux-based, that sits inside Windows. WSL2 was created to be essentially cloud-native. It supports Docker containers. It has perfect support for CUDA. We’re going to take the AI technology we’re creating for the cloud and now, by making sure that WSL2 can support it, we can bring the cloud down to the PC. I think that’s the right answer. I’m excited about it. All the PC OEMs are excited about it. We’ll get all these PCs ready with Windows and WSL2. All the energy and movement of the AI cloud, we’ll bring it right to the PC.
Question: Last night, in certain parts of the talk, it felt like a SIGGRAPH talk. It was very technical. You’ve reached a larger audience now. I was wondering if you could explain some of the significance of last night’s developments, the AI announcements, for this broader crowd of people who have no clue what you were talking about last night.
Huang: As you know, Nvidia is a technology company, not a consumer company. Our technology influences, and is going to impact, the future of consumer electronics. But it doesn’t change the fact that I could have done a better job explaining the technology. Here’s another crack.
One of the most important things we announced yesterday was a foundation model that understands the physical world. Just as GPT was a foundation model that understands language, and Stable Diffusion was a foundation model that understood images, we’ve created a foundation model that understands the physical world. It understands things like friction, inertia, gravity, object presence and permanence, geometric and spatial understanding. The things that children know. They understand the physical world in a way that language models today don’t. We believe that there needs to be a foundation model that understands the physical world.
Once we create that, all the things you could do with GPT and Stable Diffusion, you can now do with Cosmos. For example, you can talk to it. You can talk to this world model and say, “What’s in the world right now?” Based on the season, it would say, “There’s a lot of people sitting in a room in front of desks. The acoustics performance isn’t very good.” Things like that. Cosmos is a world model, and it understands the world.
The question is, why do we need such a thing? The reason is, if you want AI to be able to operate and interact in the physical world sensibly, you’re going to have to have an AI that understands that. Where can you use that? Self-driving cars need to understand the physical world. Robots need to understand the physical world. These models are the starting point of enabling all of that. Just as GPT enabled everything we’re experiencing today, just as Llama is very important to activity around AI, just as Stable Diffusion triggered all these generative imaging and video models, we would like to do the same with Cosmos, the world model.
Question: Last night you mentioned that we’re seeing some new AI scaling laws emerge, specifically around test-time compute. OpenAI’s O3 model showed that scaling inference is very expensive from a compute perspective. Some of those runs were thousands of dollars on the ARC-AGI test. What is Nvidia doing to offer more cost-effective AI inference chips, and more broadly, how are you positioned to benefit from test-time scaling?
Huang: The immediate solution for test-time compute, both in performance and affordability, is to increase our computing capabilities. That’s why Blackwell and NVLink72–the inference performance is probably some 30 or 40 times higher than Hopper. By increasing the performance by 30 or 40 times, you’re driving the cost down by 30 or 40 times. The data center costs about the same.
The reason why Moore’s Law is so important in the history of computing is it drove down computing costs. The reason why I spoke about the performance of our GPUs increasing by 1,000 or 10,000 times over the last 10 years is because by talking about that, we’re inversely saying that we took the cost down by 1,000 or 10,000 times. In the course of the last 20 years, we’ve driven the marginal cost of computing down by 1 million times. Machine learning became possible. The same thing is going to happen with inference. When we drive up the performance, as a result, the cost of inference will come down.
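As a rough illustration of the arithmetic Huang describes, the sketch below holds data center cost fixed and scales throughput by 30 and 40 times. The dollar figure, the baseline throughput, and the function name `cost_per_million_tokens` are hypothetical and chosen only to make the ratio visible; only the 30x to 40x range comes from the answer above.

```python
def cost_per_million_tokens(datacenter_cost_per_hour, tokens_per_hour):
    """Cost to serve one million tokens at a given throughput."""
    return datacenter_cost_per_hour / tokens_per_hour * 1_000_000

# Hypothetical numbers, not Nvidia figures.
DATACENTER_COST_PER_HOUR = 1_000.0      # assumed constant across generations
BASELINE_TOKENS_PER_HOUR = 50_000_000   # assumed baseline throughput

baseline_cost = cost_per_million_tokens(
    DATACENTER_COST_PER_HOUR, BASELINE_TOKENS_PER_HOUR
)

for speedup in (30, 40):
    new_cost = cost_per_million_tokens(
        DATACENTER_COST_PER_HOUR, BASELINE_TOKENS_PER_HOUR * speedup
    )
    print(f"{speedup}x throughput -> cost falls by {baseline_cost / new_cost:.0f}x")
```

Because the cost term is unchanged, the cost per token falls by exactly the throughput multiple. That is the sense in which a performance gain and a cost reduction are the same statement.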
The second way to think about that question, today it takes a lot of iterations of test-time compute, test-time scaling, to reason about the answer. Those answers are going to become the data for the next time post-training. That data becomes the data for the next time pre-training. All of the data that’s being collected is going into the pool of data for pre-training and post-training. We’ll keep pushing that into the training process, because it’s cheaper to have one supercomputer become smarter and train the model so that everyone’s inference cost goes down.
However, that takes time. All these three scaling laws are going to happen for a while. They’re going to happen for a while concurrently no matter what. We’re going to make all the models smarter in time, but people are going to ask tougher and tougher questions, ask models to do smarter and smarter things. Test-time scaling will go up.
Question: Do you intend to further increase your investment in Israel?
Huang: We recruit highly skilled talent from almost everywhere. I think there’s more than a million resumes on Nvidia’s website from people who are interested in a position. The company only employs 32,000 people. Interest in joining Nvidia is quite high. The work we do is very interesting. There’s a very large option for us to grow in Israel.
When we purchased Mellanox, I think they had 2,000 employees. Now we have almost 5,000 employees in Israel. We’re probably the fastest-growing employer in Israel. I’m very proud of that. The team is incredible. Through all the challenges in Israel, the team has stayed very focused. They do incredible work. During this time, our Israel team created NVLink. Our Israel team created Spectrum-X and BlueField-3. All of this happened in the last several years. I’m incredibly proud of the team. But we have no deals to announce today.
Question: Multi-frame generation, is that still doing render two frames, and then generate in between? Also, with the texture compression stuff, RTX neural materials, is that something game developers will need to specifically adopt, or can it be done driver-side to benefit a larger number of games?
Huang: There’s a deep briefing coming out. You guys should attend that. But what we did with Blackwell, we added the ability for the shader processor to process neural networks. You can put code and intermix it with a neural network in the shader pipeline. The reason why this is so important is because textures and materials are processed in the shader. If the shader can’t process AI, you won’t get the benefit of some of the algorithm advances that are available through neural networks, like for example compression. You could compress textures a lot better today than the algorithms we’ve been using for the last 30 years. The compression ratio can be dramatically increased. The size of games is so large these days. When we can compress those textures by another 5X, that’s a big deal.
Next, materials. The way light travels across a material, its anisotropic properties, cause it to reflect light in a way that indicates whether it’s gold paint or gold. The way that light reflects and refracts across their microscopic, atomic structure causes materials to have those properties. Describing that mathematically is very difficult, but we can learn it using an AI. Neural materials is going to be completely ground-breaking. It will bring a vibrancy and a lifelike-ness to computer graphics. Both of these require content-side work. It’s content, obviously. Developers will have to develop their content in that way, and then they can incorporate these things.
With respect to DLSS, the frame generation is not interpolation. It’s literally frame generation. You’re predicting the future, not interpolating the past. The reason for that is because we’re trying to increase framerate. DLSS 4, as you know, is completely ground-breaking. Be sure to take a look at it.
Question: There’s a huge gap between the 5090 and 5080. The 5090 has more than twice the cores of the 5080, and more than twice the price. Why are you creating such a distance between those two?
Huang: When somebody wants to have the best, they go for the best. The world doesn’t have that many segments. Most of our users want the best. If we give them slightly less than the best to save $100, they’re not going to accept that. They just want the best.
Of course, $2,000 is not small money. It’s high value. But that technology is going to go into your home theater PC environment. You may have already invested $10,000 into displays and speakers. You want the best GPU in there. A lot of our customers, they just absolutely want the best.
Question: With the AI PC becoming more and more important for PC gaming, do you imagine a future where there are no more traditionally rendered frames?
Huang: No. The reason for that is because–remember when ChatGPT came out and people said, “Oh, now we can just generate whole books”? But nobody internally expected that. It’s called conditioning. We now condition the chat, or the prompts, with context. Before you can understand a question, you have to understand the context. The context could be a PDF, or a web search, or exactly what you told it the context is. The same thing with images. You have to give it context.
The context in a video game has to be relevant, and not just story-wise, but spatially relevant, relevant to the world. When you condition it and give it context, you give it some early pieces of geometry or early pieces of texture. It can generate and up-rez from there. The conditioning, the grounding, is the same thing you would do with ChatGPT and context there. In enterprise usage it’s called RAG, retrieval augmented generation. In the future, 3D graphics will be grounded, conditioned generation.
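The conditioning idea can be sketched in a few lines. This is a toy illustration, not Nvidia’s pipeline or any particular RAG library: keyword overlap stands in for a real retriever, no model is called, and the names `retrieve`, `build_prompt`, and `DOCUMENTS` are hypothetical.

```python
def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, documents):
    """Condition the question on retrieved context before it reaches a model."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}"

# Hypothetical documents standing in for a PDF or a web search result.
DOCUMENTS = [
    "The castle gate is on the north wall and opens at dawn.",
    "The blacksmith sells swords and shields in the village square.",
]

prompt = build_prompt("Where is the castle gate?", DOCUMENTS)
print(prompt)
```

The generator only ever sees the question together with the retrieved context. In the graphics case Huang describes, the early pieces of geometry or texture would play the role that the retrieved text plays here.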
Let’s look at DLSS 4. Out of 33 million pixels in these four frames – we’ve rendered one and generated three – we’ve rendered 2 million. Isn’t that a miracle? We’ve literally rendered 2 million and generated 31 million. The reason why that’s such a big deal–those 2 million pixels have to be rendered at precisely the right points. From that conditioning, we can generate the other 31 million. Not only is that amazing, but those 2 million pixels can be rendered beautifully. We can apply tons of computation because the computing we would have applied to the other 31 million, we now channel and direct that at just the 2 million. Those 2 million pixels are incredibly complex, and they can inspire and inform the other 31 million.
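The rounded figures in that answer are consistent with a 4K output and a single frame rendered internally at 1080p. Both resolutions are assumptions made here for illustration, since Huang gives only the totals. A quick check of the arithmetic under those assumptions:

```python
# Assumed resolutions; the transcript gives only rounded totals.
OUTPUT_WIDTH, OUTPUT_HEIGHT = 3840, 2160   # 4K output frame
RENDER_WIDTH, RENDER_HEIGHT = 1920, 1080   # assumed internal render resolution
FRAMES_SHOWN = 4                           # one rendered frame plus three generated

displayed_pixels = OUTPUT_WIDTH * OUTPUT_HEIGHT * FRAMES_SHOWN
rendered_pixels = RENDER_WIDTH * RENDER_HEIGHT   # only one frame is rendered
generated_pixels = displayed_pixels - rendered_pixels

print(f"displayed: {displayed_pixels / 1e6:.1f} million")
print(f"rendered:  {rendered_pixels / 1e6:.1f} million")
print(f"generated: {generated_pixels / 1e6:.1f} million")
```

Under these assumptions, one displayed pixel in sixteen is traditionally rendered, which matches the roughly 2 million out of 33 million that Huang cites.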
The same thing will happen in video games in the future. I’ve just described what will happen to not just the pixels we render, but the geometry we render, the animation we render and so on. The future of video games, now that AI is integrated into computer graphics–this neural rendering system we’ve created is now common sense. It took about six years. The first time I announced DLSS, it was universally disbelieved. Part of that is because we didn’t do a very good job of explaining it. But it took that long for everyone to now realize that generative AI is the future. You just need to condition it and ground it with the artist’s intention.
We did the same thing with Omniverse. The reason why Omniverse and Cosmos are connected together is because Omniverse is the 3D engine for Cosmos, the generative engine. We control completely in Omniverse, and now we can control as little as we want, as little as we can, so we can generate as much as we can. What happens when we control less? Then we can simulate more. The world that we can now simulate in Omniverse can be gigantic, because we have a generative engine on the other side making it look beautiful.
Question: Do you see Nvidia GPUs starting to handle the logic in future games with AI computation? Is it a goal to bring both graphics and logic onto the GPU through AI?
Huang: Yes. Absolutely. Remember, the GPU is Blackwell. Blackwell can generate text, language. It can reason. An entire agentic AI, an entire robot, can run on Blackwell. Just like it runs in the cloud or in the car, we can run that entire robotics loop inside Blackwell. Just like we could do fluid dynamics or particle physics in Blackwell. The CUDA is exactly the same. The architecture of Nvidia is exactly the same in the robot, in the car, in the cloud, in the game system. That’s the good decision we made. Software developers need to have one common platform. When they create something they want to know that they can run it everywhere.
Yesterday I said that we’re going to create the AI in the cloud and run it on your PC. Who else can say that? It’s exactly CUDA compatible. The container in the cloud, we can take it down and run it on your PC. The SDXL NIM, it’s going to be fantastic. The FLUX NIM? Fantastic. Llama? Just take it from the cloud and run it on your PC. The same thing will happen in games.
Question: There’s no question about the demand for your products from hyperscalers. But can you elaborate on how much urgency you feel in broadening your revenue base to include enterprise, to include government, and building your own data centers? Especially when customers like Amazon are looking to build their own AI chips. Second, could you elaborate more for us on how much you’re seeing from enterprise development?
Huang: Our urgency comes from serving customers. It’s never weighed on me that some of my customers are also building other chips. I’m delighted that they’re building in the cloud, and I think they’re making excellent choices. Our technology rhythm, as you know, is incredibly fast. When we increase performance every year by a factor of two, say, we’re essentially decreasing costs by a factor of two every year. That’s way faster than Moore’s Law at its best. We’re going to respond to customers wherever they are.
With respect to enterprise, the important thing is that enterprises today are served by two industries: the software industry, ServiceNow and SAP and so forth, and the solution integrators that help them adapt that software into their business processes. Our strategy is to work with those two ecosystems and help them build agentic AI. NeMo and blueprints are the toolkits for building agentic AI. The work we’re doing with ServiceNow, for example, is just fantastic. They’re going to have a whole family of agents that sit on top of ServiceNow that help do customer support. That’s our basic strategy. With the solution integrators, we’re working with Accenture and others–Accenture is doing critical work to help customers integrate and adopt agentic AI into their systems.
Step one is to help that whole ecosystem develop AI, which is different from developing software. They need a different toolkit. I think we’ve done a good job this last year of building up the agentic AI toolkit, and now it’s about deployment and so on.
Question: It was exciting last night to see the 5070 and the price decrease. I know it’s early, but what can we expect from the 60-series cards, especially in the sub-$400 range?
Huang: It’s incredible that we announced four RTX Blackwells last night, and the lowest performance one has the performance of the highest-end GPU in the world today. That puts it in perspective, the incredible capabilities of AI. Without AI, without the tensor cores and all of the innovation around DLSS 4, this capability wouldn’t be possible. I don’t have anything to announce. Is there a 60? I don’t know. It is one of my favorite numbers, though.
Question: You talked about agentic AI. Lots of companies have talked about agentic AI now. How are you working with or competing with companies like AWS, Microsoft, Salesforce who have platforms in which they’re also telling customers to develop agents? How are you working with those guys?
Huang: We’re not a direct to enterprise company. We’re a technology platform company. We develop the toolkits, the libraries, and AI models, for the ServiceNows. That’s our primary focus. Our primary focus is ServiceNow and SAP and Oracle and Synopsys and Cadence and Siemens, the companies that have a great deal of expertise, but the library layer of AI is not an area that they want to focus on. We can create that for them.
It’s complicated, because essentially we’re talking about putting a ChatGPT in a container. That end point, that microservice, is very complicated. When they use ours, they can run it on any platform. We develop the technology, NIMs and NeMo, for them. Not to compete with them, but for them. If any of our CSPs would like to use them, and many of our CSPs have – using NeMo to train their large language models or train their engine models – they have NIMs in their cloud stores. We created all of this technology layer for them.
The way to think about NIMs and NeMo is the way to think about CUDA and the CUDA-X libraries. The CUDA-X libraries are important to the adoption of the Nvidia platform. These are things like cuBLAS for linear algebra, cuDNN for the deep neural network processing engine that revolutionized deep learning, CUTLASS, all these fancy libraries that we’ve been talking about. We created those libraries for the industry so that they don’t have to. We’re creating NeMo and NIMs for the industry so that they don’t have to.
Question: What do you think are some of the biggest unmet needs in the non-gaming PC market today?
Huang: DIGITS stands for Deep Learning GPU Intelligence Training System. That’s what it is. DIGITS is a platform for data scientists and machine learning engineers. Today they’re using their PCs and workstations to do that. For most people’s PCs, to do machine learning and data science, to run PyTorch and whatever it is, it’s not optimal. We now have this little device that sits on your desk. It’s wireless. The way you talk to it is the way you talk to the cloud. It’s like your own private AI cloud.
The reason you want that is because if you’re working on your machine, you’re always on that machine. If you’re working in the cloud, you’re always in the cloud. The bill can be very high. We make it possible to have that personal development cloud. It’s for data scientists and students and engineers who need to be on the system all the time. I think DIGITS–there’s a whole universe waiting for DIGITS. It’s very sensible, because AI started in the cloud and ended up in the cloud, but it’s left the world’s computers behind. We just have to figure something out to serve that audience.
Question: You talked yesterday about how robots will soon be everywhere around us. Which side do you think robots will stand on – with humans, or against them?
Huang: With humans, because we’re going to build them that way. The idea of superintelligence is not unusual. As you know, I have a company with many people who are, to me, superintelligent in their field of work. I’m surrounded by superintelligence. I prefer to be surrounded by superintelligence rather than the alternative. I love the fact that my staff, the leaders and the scientists in our company, are superintelligent. I’m of average intelligence, but I’m surrounded by superintelligence.
That’s the future. You’re going to have superintelligent AIs that will help you write, analyze problems, do supply chain planning, write software, design chips and so on. They’ll build marketing campaigns or help you do podcasts. You’re going to have superintelligence helping you to do many things, and it will be there all the time. Of course the technology can be used in many ways. It’s humans that are harmful. Machines are machines.
Question: In 2017 Nvidia displayed a demo car at CES, a self-driving car. You partnered with Toyota that May. What’s the difference between 2017 and 2025? What were the issues in 2017, and what are the technological innovations being made in 2025?
Huang: First of all, everything that moves in the future will be autonomous, or have autonomous capabilities. There will be no lawn mowers that you push. I want to see, in 20 years, someone pushing a lawn mower. That would be very fun to see. It makes no sense. In the future, all cars–you could still decide to drive, but all cars will have the ability to drive themselves. From where we are today, which is 1 billion cars on the road and none of them driving by themselves, to–let’s say, picking our favorite time, 20 years from now. I believe that cars will be able to drive themselves. Five years ago that was less certain, how robust the technology was going to be. Now it’s very certain that the sensor technology, the computer technology, the software technology is within reach. There’s too much evidence now that a new generation of cars, particularly electric cars, almost every one of them will be autonomous, have autonomous capabilities.
If there are two drivers that really changed the minds of the traditional car companies, one of course is Tesla. They were very influential. But the single greatest impact is the incredible technology coming out of China. The neo-EVs, the new EV companies – BYD, Li Auto, XPeng, Xiaomi, NIO – their technology is so good. The autonomous vehicle capability is so good. It’s now coming out to the rest of the world. It’s set the bar. Every car manufacturer has to think about autonomous vehicles. The world is changing. It took a while for the technology to mature, and our own sensibility to mature. I think now we’re there. Waymo is a great partner of ours. Waymo is now all over the place in San Francisco.
Question: About the new models that were announced yesterday, Cosmos and NeMo and so on, are those going to be part of smart glasses? Given the direction the industry is moving in, it seems like that’s going to be a place where a lot of people experience AI agents in the future?
Huang: I’m so excited about smart glasses that are connected to AI in the cloud. What am I looking at? How should I get from here to there? You could be reading and it could help you read. The use of AI as it gets connected to wearables and virtual presence technology with glasses, all of that is very promising.
The way we use Cosmos, Cosmos in the cloud will give you visual penetration. If you want something in the glasses, you use Cosmos to distill a smaller model. Cosmos becomes a knowledge transfer engine. It transfers its knowledge into a much smaller AI model. The reason why you’re able to do that is because that smaller AI model becomes highly focused. It’s less generalizable. That’s why it’s possible to narrowly transfer knowledge and distill that into a much tinier model. It’s also the reason why we always start by building the foundation model. Then we can build a smaller one and a smaller one through that process of distillation. Teacher and student models.
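The teacher and student idea can be shown with a generic soft-target distillation loss. This is a common textbook formulation and not necessarily how Cosmos models are distilled. The logits are hypothetical, and `softmax` and `distillation_loss` are illustrative names rather than functions from any Nvidia library.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives softer targets."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(teacher, student))

# Hypothetical logits for a three-class task.
teacher_logits = [4.0, 1.0, 0.5]
good_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
poor_student = [0.5, 1.0, 4.0]   # ranks the classes in the opposite order

print(distillation_loss(teacher_logits, good_student))
print(distillation_loss(teacher_logits, poor_student))
```

Training the smaller model to drive this loss toward zero on a narrow set of inputs is one way a large model’s knowledge can be transferred into a more focused, less generalizable one.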
Question: The 5090 announced yesterday is a great card, but one of the challenges with getting neural rendering working is what will be done with Windows and DirectX. What kind of work are you looking to put forward to help teams minimize the friction in terms of getting engines implemented, and also incentivizing Microsoft to work with you to make sure they improve DirectX?
Huang: Wherever new evolutions of the DirectX API are, Microsoft has been super collaborative throughout the years. We have a great relationship with the DirectX team, as you can imagine. As we’re advancing our GPUs, if the API needs to change, they’re very supportive. For most of the things we do with DLSS, the API doesn’t have to change. It’s actually the engine that has to change. Semantically, it needs to understand the scene. The scene is much more inside Unreal or Frostbite, the engine of the developer. That’s the reason why DLSS is integrated into a lot of the engines today. Once the DLSS plumbing has been put in, particularly starting with DLSS 2, 3, and 4, then when we update DLSS 4, even though the game was developed for 3, you’ll have some of the benefits of 4 and so on. Plumbing for the scene understanding AIs, the AIs that process based on semantic information in the scene, you really have to do that in the engine.
Question: All these big tech transitions are never done by just one company. With AI, do you think there’s anything missing that is holding us back, any part of the ecosystem?
Huang: I do. Let me break it down into two. In one case, in the language case, the cognitive AI case, of course we’re advancing the cognitive capability of the AI, the basic capability. It has to be multimodal. It has to be able to do its own reasoning and so on. But the second part is applying that technology into an AI system. AI is not a model. It’s a system of models. Agentic AI is an integration of a system of models. There’s a model for retrieval, for search, for generating images, for reasoning. It’s a system of models.
The last couple of years, the industry has been innovating along the applied path, not only the fundamental AI path. The fundamental AI path is for multimodality, for reasoning and so on. Meanwhile, there is a hole, a missing thing that’s necessary for the industry to accelerate its process. That’s the physical AI. Physical AI needs the same foundation model, the concept of a foundation model, just as cognitive AI needed a classic foundation model. The GPT-3 was the first foundation model that reached a level of capability that started off a whole bunch of capabilities. We have to reach a foundation model capability for physical AI.
That’s why we’re working on Cosmos, so we can reach that level of capability, put that model out in the world, and then all of a sudden a bunch of end use cases will start, downstream tasks, downstream skills that are activated as a result of having a foundation model. That foundation model could also be a teaching model, as we were talking about earlier. That foundation model is the reason we built Cosmos.
The second thing that is missing in the world is the work we’re doing with Omniverse and Cosmos to connect the two systems together, so that it’s physics-conditioned, physics-grounded, so we can use that grounding to control the generative process. What comes out of Cosmos is highly plausible, not just highly hallucinatable. Cosmos plus Omniverse is the missing initial starting point for what is likely going to be a very large robotics industry in the future. That’s the reason why we built it.
Question: How concerned are you about trade and tariffs and what that possibly represents for everyone?
Huang: I’m not concerned about it. I trust that the administration will make the right moves for their trade negotiations. Whatever settles out, we’ll do the best we can to help our customers and the market.
Follow-up question inaudible.
Huang: We only work on things if the market needs us to, if there’s a hole in the market that needs to be filled and we’re destined to fill it. We’ll tend to work on things that are far in advance of the market, where if we don’t do something it won’t get done. That’s the Nvidia psychology. Don’t do what other people do. We’re not market caretakers. We’re market makers. We tend not to go into a market that already exists and take our share. That’s just not the psychology of our company.
The psychology of our company, if there’s a market that doesn’t exist–for example, there’s no such thing as DIGITS in the world. If we don’t build DIGITS, no one in the world will build DIGITS. The software stack is too complicated. The computing capabilities are too significant. Unless we do it, nobody is going to do it. If we didn’t advance neural graphics, nobody would have done it. We had to do it. We’ll tend to do that.
Question: Do you think the way that AI is growing at this moment is sustainable?
Huang: Yes. There are no physical limits that I know of. As you know, one of the reasons we’re able to advance AI capabilities so rapidly is that we have the ability to build and integrate our CPU, GPU, NVLink, networking, and all the software and systems at the same time. If that has to be done by 20 different companies and we have to integrate it all together, the timing would take too long. When we have everything integrated and software supported, we can advance that system very quickly. With Hopper, H100 and H200 to the next and the next, we’re going to be able to move every single year.
The second thing is, because we’re able to optimize across the entire system, the performance we can achieve is much more than just transistors alone. Moore’s Law has slowed. The transistor performance is not increasing that much from generation to generation. But our systems overall have increased in performance tremendously year over year. There’s no physical limit that I know of.
As we advance our computing, the models will keep on advancing. If we increase the computation capability, researchers can train with larger models, with more data. We can increase their computing capability for the second scaling law, reinforcement learning and synthetic data generation. That’s going to continue to scale. The third scaling law, test-time scaling–if we keep advancing the computing capability, the cost will keep coming down, and the scaling law of that will continue to grow as well. We have three scaling laws now. We have mountains of data we can process. I don’t see any physics reasons that we can’t continue to advance computing. AI is going to progress very quickly.
Question: Will Nvidia still be building a new headquarters in Taiwan?
Huang: We have a lot of employees in Taiwan, and the building is too small. I have to find a solution for that. I may announce something at Computex. We’re shopping for real estate. We work with MediaTek across several different areas. One of them is in autonomous vehicles. We work with them so that we can together offer a fully software-defined and computerized car for the industry. Our collaboration with the automotive industry is very good.
With Grace Blackwell, the GB10, the Grace CPU is in collaboration with MediaTek. We architected it together. We put some Nvidia technology into MediaTek, so we could have NVLink chip-to-chip. They designed the chip with us and they designed the chip for us. They did an excellent job. The silicon is perfect the first time. The performance is excellent. As you can imagine, MediaTek’s reputation for very low power is absolutely deserved. We’re delighted to work with them. The partnership is excellent. They’re an excellent company.
Question: What advice would you give to students looking forward to the future?
Huang: My generation was the first generation that had to learn how to use computers to do their field of science. The generation before only used calculators and paper and pencils. My generation had to learn how to use computers to write software, to design chips, to simulate physics. My generation was the generation that used computers to do our jobs.
The next generation is the generation that will learn how to use AI to do their jobs. AI is the new computer. Very important fields of science–in the future it will be a question of, “How will I use AI to help me do biology?” Or forestry or agriculture or chemistry or quantum physics. Every field of science. And of course there’s still computer science. How will I use AI to help advance AI? Every single field. Supply chain management. Operational research. How will I use AI to advance operational research? If you want to be a reporter, how will I use AI to help me be a better reporter?
Every student in the future will have to learn how to use AI, just as the current generation had to learn how to use computers. That’s the fundamental difference. That shows you very quickly how profound the AI revolution is. This is not just about a large language model. Those are very important, but AI will be part of everything in the future. It’s the most transformative technology we’ve ever known. It’s advancing incredibly fast.
For all of the gamers and the gaming industry, I appreciate that the industry is as excited as we are now. In the beginning we were using GPUs to advance AI, and now we’re using AI to advance computer graphics. The work we did with RTX Blackwell and DLSS 4, it’s all because of the advances in AI. Now it’s come back to advance graphics.
If you look at the Moore’s Law curve of computer graphics, it was actually slowing down. The AI came in and supercharged the curve. The framerates are now 200, 300, 400, and the images are completely raytraced. They’re beautiful. We have gone into an exponential curve of computer graphics. We’ve gone into an exponential curve in almost every field. That’s why I think our industry is going to change very quickly, but every industry is going to change very quickly, very soon.