It feels like overnight, everyone was talking about artificial intelligence. But why? This panel of industry insiders at Imagination In Action's ‘Forging the Future of Business with AI’ Summit breaks down the factors that made this moment in AI happen.

Subscribe to FORBES: https://www.youtube.com/user/Forbes?sub_confirmation=1

Fuel your success with Forbes. Gain unlimited access to premium journalism, including breaking news, groundbreaking in-depth reported stories, daily digests and more. Plus, members get a front-row seat at members-only events with leading thinkers and doers, access to premium video that can help you get ahead, an ad-light experience, early access to select products including NFT drops and more:

https://account.forbes.com/membership/?utm_source=youtube&utm_medium=display&utm_campaign=growth_non-sub_paid_subscribe_ytdescript

Stay Connected
Forbes newsletters: https://newsletters.editorial.forbes.com
Forbes on Facebook: http://fb.com/forbes
Forbes Video on Twitter: http://www.twitter.com/forbes
Forbes Video on Instagram: http://instagram.com/forbes
More From Forbes: http://forbes.com

Forbes covers the intersection of entrepreneurship, wealth, technology, business and lifestyle with a focus on people and success.

Transcript
00:00 Awesome.
00:01 Thank you so much.
00:03 It's great to be here today and to talk a little bit about generative AI futures.
00:08 You know, this really feels like an unprecedented time in the machine learning space and the
00:13 AI world.
00:14 I started working on this around 2009 and it's never been cool before.
00:19 So it's very, very interesting to have relatives back in Texas wanting to talk to me about that
00:24 generative AI stuff.
00:27 So excited to be here today to have a chance to kind of look under the hood a little bit
00:31 at some of the companies that are building out this generative AI future and to talk
00:37 about some things in the open source world and the large versus small model world.
00:43 And then also some of the implications for on-device machine learning.
00:47 Excellent.
00:48 And so with that, I'm going to go ahead and kind of introduce our great panelists.
00:55 My name is Paige.
00:56 I work on generative AI at Google, particularly our large language models, our large generative
01:02 models like Gemini.
01:04 I have here with me today Adi.
01:06 Do you want to give a brief introduction?
01:08 Sure.
01:09 So my name is Aditi Joshi.
01:11 I work at Google.
01:12 I've been there for five years.
01:14 I focus on open source in our core ML group.
01:18 Excellent.
01:19 And Kevin Schatz, Cameron?
01:21 Yeah.
01:22 So another Googler.
01:23 So I work in Google Cloud.
01:24 I lead a team in our conversational and generative applied AI organization focused on agentic
01:30 type things.
01:31 Awesome.
01:32 And Sundararajan?
01:33 I'm at Microsoft.
01:34 I lead a team on AI incubations here down the street in Kendall Square.
01:39 Wonderful.
01:40 And then also Marcus.
01:41 Thank you.
01:42 Thank you for hosting me.
01:43 Yeah, Marcus Ruhl.
01:44 I lead the Intel Developer Cloud and I'm building out very large supercomputers focused
01:48 on hosting various startups, as well as established companies, that are building a variety of large language
01:53 models.
01:54 Prior to that, I was at Nvidia, where I built out Nvidia's GPU cloud infrastructure.
01:58 Excellent.
01:59 And so are you based locally as well?
02:00 I'm from Silicon Valley.
02:02 Oh, gotcha.
02:03 So we have a nice mix of Silicon Valley and Boston coming here today.
02:07 And with that, I am going to get started.
02:11 I also was not trusting the Wi-Fi connection given how many people we have here in the
02:16 audience and how many people we have in all of the hallway conversations outside.
02:20 So we'll be reading some of the questions.
02:22 And if we have time, we might take a couple of audience questions.
02:25 But let's see how quickly we can get through these ideas.
02:32 So first off, we've already discussed generative AI is kind of at a peak in the hype cycle
02:38 that is unprecedented.
02:40 Why do y'all think this might be the case today?
02:45 And then Kevin, do you want to start?
02:46 Yeah, I'm happy to start.
02:47 I think, you know, in a past life, I spent a lot of time on this concept of digital transformation
02:52 that was like the big sledgehammer from top down of go transform, right?
02:58 Reorganize yourself, be agile, and all these other things.
03:02 And I think, obviously, it didn't work.
03:05 I think the reason it's at a hype-cycle peak is because I think there's a subtlety
03:10 to how it's getting adopted and integrated into applications and into general enterprise
03:15 tasks and just everyday life.
03:17 And it's not this big sledgehammer.
03:19 It's just all of a sudden, you start seeing these trailing indicators of improvement and
03:23 efficiency or new things getting created or rate of adoption of new applications and services.
03:29 And you kind of take that step back, like, why is that happening?
03:31 It's like, oh, gen AI is under the covers, right?
03:35 Software development, whatever we want to talk about.
03:37 I think it's so integrated into tasks that the bottoms up adoption is leading to a set
03:45 of transformations and subtlety in just about every job role.
03:49 And it's kind of neat to watch.
03:53 >> I think I can go next.
03:55 I think the obvious answer is just compute.
03:57 We have compute that's available to us, which allows us to do so much more than we could
04:02 have ever done before.
04:04 But I think what we're seeing now, and this was my lightning talk this morning, is how
04:08 it's really changed how we launch products.
04:11 And how has the role of AI product manager really changed along the way is the ability
04:16 to be able to use these models, to be able to use the tools that are available through
04:21 the use of AI for us as TPM copilots, your technical program manager.
04:26 So when you're going from your MVP, your minimum viable product, to your product market fit
04:30 from zero to one, it's a series of experiments that you're going through.
04:34 That rapid experimentation is so critical in making sure that a product launch is successful.
04:39 I think that's what we're really going to see, our ability to transform the way that
04:44 we launch our products and how fast we launch those products, incorporate the feedback that
04:49 we're getting early into the process, and then keep iterating on that at a scale that's
04:54 been I think unprecedented.
04:55 I think that's just about to get started.
04:59 I love the discussion around generating more interesting experiences in existing applications.
05:09 Some of the places where we've seen the broadest adoption of AI features are in things like
05:14 AI for software development.
05:16 So helping with things like code completion or code explanation in the context of an IDE,
05:21 helping with burning down bugs or code review, and then also in our developer product or
05:27 workplace productivity tools.
05:28 So things like Microsoft Word or Google Docs, Sheets, Excel, really meeting people where
05:35 they are and trying to accelerate the tasks that people would have been doing anyway.
05:40 So it's really, really cool to see.
05:42 Sundararajan?
05:43 I would say it's what Kostlak called it this morning: a ChatGPT moment.
05:50 I think ChatGPT really showcased the ability for democratic access not only to
05:57 technology but also to opportunities.
05:59 And it's also the fastest growing product in history of all products.
06:03 So experimentation, rapid prototyping, all proved in a matter of weeks to months.
06:10 But that's a combination of all the things that we are talking about.
06:13 The technologies have been evolving, bubbling up to this moment.
06:16 The mathematics have been bubbling up to this moment.
06:19 And now the opportunity and access have basically become available overnight, if not completely
06:26 accessible in terms of cost yet.
06:29 And maybe this is where the GPUs come in.
06:30 Indeed.
06:31 Indeed.
06:32 We like people who use a lot of compute.
06:35 Yeah, so I think at a high level, I think I'm always careful to try and predict the
06:38 peak of the hype cycle.
06:39 I've been around the block for a few years now, and I'm not sure it's the peak yet.
06:45 I always take the analogy back to the late '90s, when the cost of communication
06:51 went down to near zero.
06:52 I grew up at a time when I remember making a phone call and worrying about
06:55 how much it was costing me.
06:57 My parents would scold me for being on the phone for too long.
06:59 Growing up in Europe, a bit behind the US in that respect.
07:03 But what's happening right now is about the cost of certain compute, of doing things with large
07:07 language models: all of a sudden there are certain tasks that you just couldn't afford before.
07:09 Maybe the compute was available before, but you just couldn't afford it.
07:13 And all of a sudden, that compute is now available at the cost where it's just super attractive
07:17 to automate certain functions.
07:19 And I think it's not just the Moore's Law that, yes, you get more compute just from
07:23 the hardware, but it's just the rate at which these models are evolving and what you can
07:27 do with the same amount of compute power is just mind boggling.
07:30 I was just listening into a talk by Naveen Rao last week at a vision conference.
07:35 He was talking about a factor four of improvement per year, year over year.
07:39 So if you think about Moore's Law, we've been trying to squeeze more out of the physical
07:43 hardware, out of the compute, but you're getting twice the compute power every 18 months.
07:47 What we're seeing now is that, yes, you still get more out of that compute, but the speed
07:51 at which these algorithms are evolving is something like a factor four, which means
07:55 that if you're building a model today and you're spending $100 million building it,
07:58 you just only care about reducing the cost, but it's also the speed, the amount of effort
08:03 it takes to build this maybe in a year from now could be something like a quarter of that.
08:06 So the economics are just so mind boggling, and just the rapid, the pace at which things
08:10 are evolving is just so mind boggling.
08:12 It's incredible.
08:13 So I have a thought to add here, but I just want to ask, right, when you move
08:17 to Intel, are you contractually obligated to name-drop Moore's Law?
08:21 Part of my job description, I apologize.
08:27 Aside from that, I think it's accessibility in general, right?
08:30 Certainly compute matters, but just how accessible AI is because of large language models.
08:35 We spend so much time thinking about the great things that happen on the output side.
08:39 We tend to overlook how good large language models are at understanding what I meant, right?
08:44 The intent portion of it.
08:46 To me, that's the most fascinating superpower of large language models and generative AI.
08:51 It's great that we can build things with it, and there's a lot of outcomes with it, but
08:54 just the fact that it understands human behavior and human intent and human speech, that's
08:58 the fascinating part.
09:00 I love this conversation about accessibility and about how huge pieces of the world who
09:06 never had access to play with large language models are now being able to open up their
09:12 browser and sort of use them for their experiments, use them as part of their work.
09:17 I remember at Google when the first Palm models just came out, being able to test them out
09:23 internally, being able to test out some of the image generation models, but having that
09:28 all is just kind of like a sandbox within the context of the company.
09:33 Whereas now, there's really massive potential for everyone in the world to start experimenting
09:39 with these to be able to understand how they could be creative and to accelerate this process
09:45 of getting their ideas out into the world.
09:48 It's scary to think that that was just about a year ago, isn't it?
09:51 Just about a year ago.
09:52 And how many iterations there's been.
09:55 Absolutely.
09:56 And how much more efficient the compute has gotten.
09:58 It's really, really awesome to see.
10:00 And this is a great segue into our next section, which is all about open source versus closed
10:06 source models.
10:08 And I think all of the folks on the panel have experience working with both.
10:14 But really, open source models have massive potential for customization, adaptation.
10:21 How have we started to see how open source is really driving some of the new features
10:27 and the new excitement around these models?
10:29 Adi, do you want to go first?
10:31 I know that you work on a lot of really great open source projects within the context of
10:35 Google.
10:36 But I would focus on OpenXLA, which is the middleware.
10:39 So whether you use PyTorch or Jax or TensorFlow, we're offering a very open source middleware
10:45 way to be able to access different hardware, like Intel CPUs and GPUs and TPUs as well.
10:52 So that's one aspect of it.
10:54 But I think it's that collective wisdom that open source offers with the crowd.
10:59 Because not one company can do all of it.
11:02 We're just a part of the puzzle.
11:04 So working together as an ecosystem, I think that's really the critical piece of it that's
11:08 going to help us to catapult to levels that we've never seen before.
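
To make that middleware idea concrete, here is a minimal sketch assuming only stock JAX: JAX is one of the frameworks that compiles through XLA, so the same function can run on a CPU, GPU, or TPU backend without changes. This illustrates the general pattern rather than OpenXLA's internals.

import jax
import jax.numpy as jnp

@jax.jit  # traced and compiled through XLA for whatever backend is attached
def dense_layer(x, w, b):
    # A tiny dense layer with a ReLU; the same code targets CPU, GPU, or TPU.
    return jax.nn.relu(x @ w + b)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (8, 128))
w = jax.random.normal(key, (128, 64))
b = jnp.zeros(64)

print(jax.devices())               # e.g. [CpuDevice(id=0)], or a list of GPUs/TPUs
print(dense_layer(x, w, b).shape)  # (8, 64), computed on the default backend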
11:12 Absolutely.
11:13 Kevin, do you want to?
11:16 Yeah.
11:17 I mean, look, our model garden supports 140-ish models.
11:21 I think less than 20 of them are built by Google.
11:24 And a large percentage of them are open source coming through communities like Hugging Face
11:28 and GitLab and various others.
11:31 I think open source is a massively important part of this community.
11:35 And I think we've spent so much time in other segments of technology evolution kind of thinking
11:42 about the path from research to applied research towards production-type use cases.
11:50 And I think the gap between research, applied research, and production is pretty well near
11:53 zero at this point.
11:55 We're iterating so quickly and trying to get things out so quickly.
11:58 So I think open source pushes closed source, closed source pushes open source.
12:03 And every day, I come from a different background than AI.
12:06 I spent 20 years in telecom.
12:08 And I could probably take two to three years off in telecom and nothing changed.
12:14 Here you kind of take two to three days off and state of the art looks different.
12:19 Speaking of which, if folks haven't already seen, Llama 3 got released today.
12:25 So already available for folks to use, I believe, on Azure and also on Hugging Face.
12:30 If you haven't had a chance to test it out, that's definitely something to take a look
12:35 at.
12:36 And then also, I love seeing how, you know, as these open source models get released,
12:43 they're increasingly pushing the boundaries in terms of model capabilities and performance.
12:48 Like the Mixtral models that just recently got released.
12:52 And also it sounds like the largest versions of Llama 3, they're surpassing even GPT-4
12:58 in capabilities these days.
12:59 So it's pretty, pretty nifty.
13:01 Do you all want to add anything around open source?
13:05 >> Yeah.
13:06 So I think there are two aspects to these differences.
13:09 One is, as you mentioned, from a developer perspective, you want to have a mixture, you
13:13 want to provide as much variety and the right price point for all kinds of applications.
13:17 So that's just a model perspective.
13:20 But I think behind the hype, there's going to be this realization that we don't understand
13:24 a lot of how these things work.
13:26 And this is where the ability to leverage open source models is going to be a game changer.
13:31 So research has now taken a backseat, to Kevin's point, because the ability for universities,
13:38 for example, to continue to innovate and expand the science behind it is gated on access to
13:44 certain GPUs and there's sort of a cap to it.
13:47 So open source is not only -- I think about it not only at the model level, but up and down the
13:52 entire stack, and that is helping not only to build new applications.
13:56 I've been talking about, like, okay, you want to get your first application out with a closed
14:00 source model and then you want to use fine tuning with your open source model.
14:06 So that's sort of one aspect of it.
14:07 The other is, like, how do you understand where things fail, where things can get better?
14:11 And the only place you can do that is with open source.
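
A rough sketch of that "prototype with a closed model, then fine-tune an open one" step, assuming the Hugging Face transformers and datasets libraries; the model name and data file below are placeholders, not a recommendation, and a real run needs access to the weights and a sizable GPU.

from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

model_name = "meta-llama/Meta-Llama-3-8B"   # placeholder: any open-weight model you can access
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token   # some open tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# A small domain-specific dataset (placeholder path) drives the adaptation.
raw = load_dataset("json", data_files="my_domain_data.jsonl")["train"]
tokenized = raw.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                    batched=True, remove_columns=raw.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal) language-modeling labels
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()   # you can inspect losses and failures directly, which is part of the point of open weights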
14:15 >> I love that.
14:16 When I was first learning how to do machine learning, I relied so much on all of the great
14:21 resources that were put on GitHub, that were put online, and it was really only by reading
14:26 the documentation and reading through others' code that I was able to learn about model
14:30 internals.
14:32 So I think if we lose that, and if the world just shifts to closed source models, then
14:39 we've lost a really massive opportunity to help educate folks and to help build the next
14:45 generation of research scientists.
14:47 Marcus, do you want to?
14:49 >> Just a few examples.
14:50 If you go back in the history of software, there used to be a time when people used to
14:53 pay for the browsers, the web browsers.
14:55 There used to be a time when people paid for the operating systems in the data center,
14:58 they paid for the databases, and they paid huge premiums.
15:01 And it's just inevitable, I think, to all your points, it's just so much value to have
15:04 a platform that everybody can contribute to, academia, students, startups, anybody that
15:09 can contribute, that can improve it.
15:10 I think for a single company to outdo that in the R&D is going to be really, really difficult,
15:15 I think.
15:16 Again, that doesn't mean it can't happen in the short term.
15:17 Again, there were many examples where it happened in the past.
15:20 People built something that nobody had at that point in time, but then quickly somebody
15:23 else started open sourcing.
15:24 Llama 3 is a great example.
15:25 All of a sudden you have the whole world starting to build around that, and it's very hard to
15:29 keep up with that as a proprietary vendor.
15:32 >> I agree.
15:33 And that's also a great segue into the next section, which is talking about small models
15:38 versus sort of the larger models that we've seen on the most bleeding of edges.
15:43 I think what we've seen in the community is that oftentimes people will take the smaller
15:49 models, they might fine tune them for specific tasks or use cases, and they're also obviously
15:56 much more efficient to deploy, both from a latency perspective as well as a cost perspective,
16:02 and also like a hardware footprint perspective.
16:07 So if we wanted to discuss kind of the strengths and weaknesses of large and small models,
16:14 what have you all seen in terms of this within your own companies, and then also how you're
16:22 thinking about this in terms of generative AI futures?
16:25 Is it going to be one beating the other, or is it going to be a blend of both, as
16:31 Sundararajan mentioned?
16:32 >> Yeah, I'll go there.
16:35 I think it's going to be eventually a mixture.
16:37 There is definitely now a shift a little bit towards custom or fit-for-purpose type of
16:42 SLMs.
16:44 But another thing that maybe not many people know about SLMs is they tend to hallucinate
16:50 a little less, and initial research is showing that they seem
16:55 to offer better protection against harmful content.
16:58 So there are other benefits to SLMs beyond just the resources.
17:04 But yeah, I think in the future it's going to be a mixture of all of the above.
17:08 >> Excellent.
17:09 And for SLM, just for folks who might not be aware, is that small language model?
17:14 >> Good.
17:15 >> Good.
17:16 Excellent.
17:17 >> So I think it's interesting because maybe also about a year ago there were a lot of
17:21 conversations about large generalist models versus highly specialized models.
17:26 I think actually Microsoft Research was one of the first to come out and say, look, you
17:29 can take a really large generalist model and hyper-tune it and it actually performs considerably
17:35 better than some smaller models.
17:37 >> Just even prompt engineering.
17:40 >> Even prompt engineering as well.
17:41 I think that's spot on, right?
17:43 And I think we kind of lost sight and a lot of the conversation was like, do I really
17:47 need my model to understand the latest Taylor Swift songs?
17:51 And it's not that, right?
17:52 It's really the fact that it's trained on all of human language that matters in terms
17:56 of understanding the intent to be able to execute against tasks.
17:59 >> Especially for the emergent capabilities.
18:01 So there is also that concern with SLMs that you need a certain scale for the emergent capabilities
18:06 to show up more prominently.
18:08 >> 100%.
18:09 But I do think it will be a hybrid world.
18:12 I think on the small model side, some of the age old techniques around RNNs are starting
18:17 to prove really valuable to iterate and get high performance and high accuracy out of
18:21 smaller models.
18:22 And obviously some of the things that are happening with mixture of experts are really
18:26 driving down the latency of large models.
18:28 So we're going to live kind of in a hybrid world for a long period of time.
18:32 But once we solve the 10X compute problem, let's do everything large.
18:37 >> I love that you called out prompting strategies as well as retrieval methods.
18:43 I think there are a lot of great ways to augment the capabilities of models without necessarily
18:49 trying to bake in all of those smarts.
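
As a toy illustration of that retrieval idea, the sketch below fetches relevant text at request time and puts it in the prompt, so the model does not need the facts baked into its weights. The keyword-overlap scoring is a stand-in for a real embedding index, and the document snippets are invented.

# Tiny in-memory "knowledge base"; a real system would hold many chunked documents.
documents = {
    "returns": "Items can be returned within 30 days with a receipt.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
}

def retrieve(question: str, k: int = 1) -> list[str]:
    # Score by naive keyword overlap; swap in embeddings and a vector index in practice.
    words = set(question.lower().split())
    scored = sorted(documents.values(),
                    key=lambda doc: len(words & set(doc.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(question: str) -> str:
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("How long does shipping take?"))
# The assembled prompt is then sent to whichever model, large or small, you deploy.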
18:53 >> I think whether it's open or closed really depends on what the pain point is
18:57 that you're trying to solve for the customer at the end of the day.
19:01 Is there a problem?
19:02 Is there a pain point?
19:03 And then what kind of product or solution do you want to be able to build?
19:06 And then working backwards from there, depending on that particular use case, I think that's
19:11 what's really critical in deciding whether it's open or if it's closed source or not.
19:15 Now there's advantages and disadvantages to both.
19:17 Like if you use an open model, of course it's available out there, it's rapid experimentation.
19:22 Those are some of the pluses that you get with it.
19:24 But again, the drawback may be the fact that quality control might not really necessarily
19:29 exist or you have to build that in and that takes quite some time, it takes effort.
19:35 And then from a closed perspective, because they're large models,
19:39 if you do want to build something that requires multimodality, then you want to start thinking
19:42 about a closed model because it offers that multimodality, at least right now.
19:48 So it goes back to, again, the pain point and the customer and the solution that you
19:51 want to build for the pain point that they're experiencing.
19:55 I think it's also really interesting too in the sense that there are some companies that
20:01 are experimenting with releasing these much, much larger models as well as making them
20:10 Apache 2 licensed or making them licensed but only for academic use, in which case people
20:17 can experience the coolness of multimodality and all of these other use cases, even though
20:24 the models are open.
20:26 We had an awesome Gemma release yesterday on this.
20:30 CodeGemma, or one of the other specialized versions?
20:32 One of the other specialized ones.
20:34 CodeGemma also, but there was another.
20:37 Excellent.
20:38 And Markus?
20:39 First of all, we like all language models; so long as they're on Intel hardware,
20:41 we're happy.
20:42 Aside from that, no, I think there's just, and I think this sort of bleeds into the next
20:46 discussion, I think the next question is about what runs on the client versus
20:51 what runs on the server.
20:52 And I think just simple things like, at some point just economics will dictate that.
20:57 Let's take a copilot as an example.
20:58 Say, as a startup, I want to give out a copilot.
21:01 If I can run this on the client, if the client has enough compute power to do that, I can
21:05 just offer this as a free service and still not lose too much money.
21:09 If I try to do this on the server side, it's just going to bankrupt me.
21:12 So and vice versa, if that same user then comes and says, I want to turn on all these
21:15 other new features, great, I can then provide them with additional compute power in the
21:19 cloud and I can augment that and all of a sudden they get a much higher level of service
21:23 and maybe it's going to cost them a hundred dollars or so a month.
21:26 But now it's a professional developer that can afford that.
21:28 So I think the hybrid of those two is ultimately what's going to dictate that.
21:32 So I don't think it's one or the other, it's one and the other.
21:35 Yeah, that's awesome.
21:36 And one of the most interesting patterns that I've seen in the generative AI space is having
21:41 kind of a larger model be the planner model and being able to kind of take in a high level
21:48 task, break it into subcomponents and then be clever about which model to assign to which
21:54 of those subtasks.
21:56 So it might be that for one of the subtasks, you know, you need a model that's a little
22:00 bit more impressive, perhaps a little bit more expensive, needs to do its computation
22:04 on the server side.
22:05 But then for others, you can use a much smaller model locally or even, you know,
22:11 tools that might be available locally as opposed to relying too much on model smarts.
22:17 And to y'all's points, that breaks down the cost, makes the latency go down as well because
22:22 you're not sending everything out into the world.
22:25 And it also means that at the end of the day, your users are getting a much better experience.
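
A rough sketch of that planner-and-router pattern, with hypothetical stand-in functions for the model calls; in a real system the planner would itself be a model call returning structured subtasks rather than a hard-coded list.

def call_large_model(prompt: str) -> str:
    return f"[large, server-side model] {prompt}"   # stand-in for a hosted frontier model

def call_small_model(prompt: str) -> str:
    return f"[small, local model] {prompt}"         # stand-in for an on-device model or tool

def plan(task: str) -> list[dict]:
    # Stand-in planner: a larger model would break the task into subtasks like these.
    return [
        {"step": f"summarize the request: {task}", "needs_heavy_reasoning": False},
        {"step": "draft a policy-compliant reply", "needs_heavy_reasoning": True},
    ]

def run(task: str) -> list[str]:
    results = []
    for subtask in plan(task):
        # Route expensive reasoning to the large model; keep everything else local
        # to cut cost and latency, since those requests never leave the device.
        model = call_large_model if subtask["needs_heavy_reasoning"] else call_small_model
        results.append(model(subtask["step"]))
    return results

print(run("customer asks about a late refund"))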
22:29 I think this is going to drive a lot more compute power locally on the client also,
22:32 because certainly you just can't afford these things as
22:36 a SaaS provider.
22:37 It's just too expensive.
22:40 That doesn't look good.
22:41 It looks like we're about to get cut off.
22:42 Oh, wow.
22:43 John, are we getting cut off?
22:47 Okay, excellent.
22:48 Well, so I apologize.
22:52 It looks like we got too excited about the conversation, though
22:58 I think we had a really great discussion about the strengths and weaknesses today of generative
23:03 models.
23:04 If I could just ask one more question to everybody, very quick, one sentence only, top of mind,
23:09 rapid fire.
23:10 What are you most excited about for the next year of generative models, given everything
23:15 that's happened in this past year?
23:21 I'm excited about all the tooling that's surrounding it and bringing those, helping it bring those
23:25 capabilities to every user.
23:28 Excellent.
23:29 I think just the new use cases that we're going to be seeing.
23:31 There's just so much innovation going on, talking to so many startups.
23:33 It's just fascinating the speed at which things are evolving.
23:37 Again, back to the late nineties: once the cost of communication came down to near zero,
23:40 all of a sudden you see people ordering their Ubers and all these things.
23:43 None of that existed.
23:44 So I think we'll see something similar over the next year or the next few years.
23:48 I think AI allows us to be much more rigorous and disciplined in our experimentation approach.
23:53 And I think that's going to revolutionize how we build products.
23:57 I think it's learning for me, right?
23:59 When I stepped into this event and I saw everyone here and I started to listen to a number of
24:03 the sessions, it became clear to me that I know a fraction of a percentage of what's
24:06 actually happening in AI right now.
24:10 Excellent.
24:11 So it sounds like generative AI is going to be even more exciting.
24:17 Perhaps we're not at peak hype cycle just yet.
24:19 We have a little bit of ways to go and we should all be excited about what's yet to
24:24 come.
24:25 So thank you so much.
24:26 Thank you to our panelists.
24:27 You all did a great job.
24:27 Thank you.
24:28 Thank you.
24:28 Thank you.
