OpenAI’s Head of Preparedness, Aleksander Madry, spoke at Imagination In Action’s ‘Forging the Future of Business with AI’ Summit about how OpenAI looks toward the future and prepares for the challenges of building AI.
Subscribe to FORBES: https://www.youtube.com/user/Forbes?sub_confirmation=1
Fuel your success with Forbes. Gain unlimited access to premium journalism, including breaking news, groundbreaking in-depth reported stories, daily digests and more. Plus, members get a front-row seat at members-only events with leading thinkers and doers, access to premium video that can help you get ahead, an ad-light experience, early access to select products including NFT drops and more:
https://account.forbes.com/membership/?utm_source=youtube&utm_medium=display&utm_campaign=growth_non-sub_paid_subscribe_ytdescript
Stay Connected
Forbes newsletters: https://newsletters.editorial.forbes.com
Forbes on Facebook: http://fb.com/forbes
Forbes Video on Twitter: http://www.twitter.com/forbes
Forbes Video on Instagram: http://instagram.com/forbes
More From Forbes: http://forbes.com
Forbes covers the intersection of entrepreneurship, wealth, technology, business and lifestyle with a focus on people and success.
Transcript
00:00 Did you tell your colleagues and your faculty
00:03 that you are now working at OpenAI
00:05 or did you just leave the building
00:06 and everyone thinks you're still a faculty here?
00:08 - Well, I'm still a faculty member at MIT.
00:12 I just wanted to make a point.
00:14 Just don't leave, and very much still appear to be in the building.
00:17 So don't tell my colleagues because they might not know.
00:20 - Okay, all right, great.
00:21 So your secret is with us.
00:23 - Both sides are aware.
00:24 - And we're simulcasting this throughout the building.
00:26 So the 2000 people that are listening to this talk
00:29 it's a secret, everyone shh.
00:31 All right, okay, so you're the head of preparedness.
00:34 So who feels prepared for life?
00:38 Who thinks OpenAI is prepared?
00:41 All right, well, let's figure out what preparedness means.
00:44 Ramesh, you got a question for our distinguished colleague?
00:48 - First of all, to have a fellow MIT professor
00:52 looking at preparedness at OpenAI
00:54 makes me feel a little bit more,
00:56 a little more secure that the right people from MIT
01:00 are working on preparedness at OpenAI.
01:03 But also I heard this joke, Alex,
01:07 that the whole AGI scare was created six months ago
01:12 so that OpenAI has an excuse to start
01:16 an official computer science theory group at OpenAI
01:20 and you're kind of leading it.
01:21 Is that correct?
01:24 - No, essentially the point is,
01:27 I'm not sure about this scare.
01:29 I think there are a lot of people who are scared.
01:31 Some of them, you know, Jan was mentioning the doomers.
01:34 So there's definitely a spectrum of concerns.
01:38 But I don't think that being scared
01:41 is what's at the core
01:43 of what OpenAI is about, right?
01:45 OpenAI is about understanding what this technology,
01:49 like what kind of opportunities technology brings.
01:52 And in particular, my team, Preparedness, is about thinking
01:55 how do we prepare both at the company,
01:57 but also how we help prepare the world
02:00 for the technology that will change us, right?
02:03 Like it'll change how we live, it'll change how we work.
02:06 This will be big changes
02:07 and OpenAI just wants to be a
02:10 positive force in this space.
02:11 - So Ramesh, you worked at Apple, Google, Microsoft
02:16 and Facebook, which is now Meta.
02:17 You did what he did, you went on leave.
02:20 Do you have a question, not wearing your academic hat,
02:23 but wearing your hat of big industry?
02:25 You tell me every day how innovative
02:27 and entrepreneurial these big bureaucracies are.
02:31 - Thank you, thank you, that was so honest.
02:33 No, I think it's important for MIT professors
02:37 to take some time off and, you know, share our wisdom,
02:40 but also learn the wisdom from these other big tech companies.
02:42 - You don't get paid on Fridays, right?
02:43 Isn't that part of the deal?
02:44 - Yeah, that's right.
02:45 - That you're supposed to be externally focused.
02:46 - Exactly.
02:47 - And that goes for all faculty in case you wonder
02:49 if they're walking around with a cup saying,
02:51 "Hey, it's Friday."
02:52 - Yeah, we take a piece of MIT
02:54 and we sprinkle it in all these places.
02:56 So, but again, Alex, just popping up a level,
03:01 the US just announced kind of an AI safety task force.
03:06 So just walk us through how, at the national level,
03:10 at the big tech level and at kind of a consumer level,
03:14 we should be thinking about safety, AI safety.
03:18 - Sure, so yes, the US indeed just recently announced
03:21 the US AI Safety Institute.
03:24 So this is a very important initiative
03:26 that I'm really hoping will succeed,
03:28 and I want to do everything I can to make it successful.
03:31 But yeah, so first of all,
03:32 we definitely should think about AI safety,
03:34 but I want to just make one point
03:36 that I really think it's not just about safety.
03:38 And that's why we use this word preparedness also
03:41 in addition to safety,
03:42 because it's about just making sure
03:44 that we prepare for the changes that come.
03:49 This means making sure that the downsides
03:51 that this technology can bring do not happen,
03:55 but also the upsides do happen.
03:56 So I just want to view it through a bit of a broader lens here.
04:00 But yes, this is definitely something
04:01 that requires also attention at the national level.
04:04 And I'm happy to see that the US government is recognizing that.
04:07 And we need to think about many things.
04:10 So on one hand, we want to understand,
04:12 and this is something that my team in particular
04:14 focuses quite a bit on,
04:15 okay, what are the new potential risks,
04:18 catastrophic risks, that the increased capability
04:22 of the technology brings about?
04:25 And then how do we think about mitigating them?
04:27 How do we make sure that our decision making
04:29 about whether or not we deploy a model at OpenAI,
04:32 and hopefully the same process happening
04:33 in other companies,
04:34 you know, is the right decision?
04:38 But then again, this technology is diffusing into everything
04:41 one way or another.
04:42 So we, and especially the government, should also think about,
04:45 well, first of all, let's make sure
04:47 that nothing bad happens, but also like,
04:48 what does it mean for the labor market?
04:50 What does it mean for, you know,
04:52 for all these processes that we are doing?
04:55 What this means for cybersecurity, and so on and so on.
04:58 And many of these questions
05:01 are something that the government is in charge of.
05:03 So I'm very happy to see that they recognize
05:07 that this is part of their duty
05:10 and they're acting on this.
05:12 - I'm going to ask two related questions.
05:14 One is this confusion between explainability,
05:17 reliability, auditability, and trustworthiness.
05:21 I mean, the way I explain to people is,
05:23 you know, if somebody explains to me how a car works,
05:26 every part of the car, that's explainability.
05:29 And then when I start driving the car,
05:31 you know, I feel it's reliable.
05:34 At some point, when I see all my friends
05:36 and I see statistics in consumer reports,
05:38 that people are not dying from driving,
05:40 then it becomes trustworthy.
05:41 But at some point things go wrong
05:43 and then it becomes auditable.
05:45 So, you know, is there a similar analogy you can give us
05:48 from the AI point of view of how AI safety
05:51 is actually made up of all these pieces?
05:54 And, you know, we have to be careful
05:57 about which phrase is used for which aspect of AI.
06:00 - Yeah, so this is a great question.
06:03 And I really like how you nicely dissect
06:05 these four words that people sometimes use interchangeably
06:08 and they definitely shouldn't.
06:10 I think actually your car analogy works quite well.
06:14 The one thing that I would just want to add,
06:16 like, I think the key difference
06:18 with the cutting-edge AI right now is that,
06:21 well, if you have a bit of an engineering background,
06:24 you will be able, with enough time,
06:26 to understand how the car works,
06:28 like how there is combustion
06:30 and it is converted to some force that moves the wheels
06:33 and so on.
06:33 So you will really understand what's happening.
06:35 I think the level of complexity
06:37 of these existing large language models
06:41 is completely escaping our cognitive abilities.
06:44 So there's a bit of a false promise
06:46 of explainability, because yes,
06:48 at some point you will know exactly how the model works,
06:50 like here is a number and here's a number
06:52 and they get added up
06:53 and then you are applying ReLU operations.
06:56 So yes, at this microscopic level, you will understand,
06:59 but it's much, much harder, in some sense,
07:00 really, to some extent you could think
07:03 it might even be impossible, or at least greatly constrained,
07:06 to really understand what's actually happening here.
07:09 And this actually ties to much of my work,
07:12 which showed that the way AI solves problems,
07:15 the way AI models solve problems,
07:18 tends to be very different from the way we do.
07:19 So even if they solve the same tasks that we solve,
07:23 like let's say recognizing something in the images,
07:25 the way they approach this task is very different.
07:28 And this means that kind of in the context of explainability
07:31 sometimes there are explanatory methods
07:32 that seem to give us a feeling
07:34 as if we know why the model did something,
07:38 but actually it's misleading,
07:40 because the actual reasons are different;
07:41 there's just some explanation produced for our consumption.
07:43 So, but still the analogy works
07:45 and I definitely would like to in particular highlight
07:48 also the trustworthiness aspect,
07:50 because, okay, we need to, first of all,
07:52 define what reliability means.
07:54 And depending on that, you can say, you know,
07:57 AI is reliable right now,
07:58 or maybe it's not reliable right now,
07:59 and we can think about what we want to get there.
08:01 But at some point we will need to face
08:03 the question of trustworthiness.
08:05 And meaning like,
08:05 where will we be comfortable with using AI in our life?
08:09 And then, you know, and then the question also will be
08:12 a kind of, okay, auditability
08:14 is a part of building trustworthiness.
08:16 - So Alex, to what extent are you an X factor
08:20 for your team, or is there groupthink
08:23 and everyone thinks alike?
08:25 And are you creating a roadmap
08:28 for your kind of the charge that you're given,
08:31 or are you just kind of sharing some things
08:34 that people aren't thinking about?
08:35 And I know you probably can't get into too many details here
08:37 but I'm just curious, at this new company at this time,
08:41 when, you know, people are having to figure things out
08:44 that maybe hadn't had to be thought of with AI before,
08:47 how are you approaching it?
08:49 What can you share?
08:50 - Yeah, so that's a great question.
08:53 So first of all, definitely we are developing
08:56 very exciting technology.
08:58 Like I say quite often that the privilege
09:01 of being at OpenAI is that you can see into the future,
09:03 but you can see for half a year
09:04 or maybe at most a year into the future, right?
09:06 So we also, there are surprises for all of us,
09:10 but the company cares a lot about understanding
09:13 what's happening, both from the scientific level,
09:15 but also from this safety preparedness level.
09:18 Now, you mentioned something important, groupthink.
09:20 And like in my team, which is kind of in particular,
09:23 thinks about all the unwanted, undesirable consequences
09:27 of AI, this is something that I make very clear
09:30 to everyone on the team, that my biggest fear
09:33 is exactly groupthink,
09:34 because there's a lot of groupthink, even in the community,
09:37 because people tend to talk to each other
09:39 and they agree what the risks are.
09:40 And like, to me in particular,
09:42 like I think there will be something negative,
09:46 something catastrophic that AI will do in the future.
09:49 And I hope, and I'm working hard
09:51 to make sure this doesn't happen.
09:53 It probably will not be the things that we are afraid of.
09:56 It will be something completely different
09:58 that we are not thinking about,
09:59 because we have not yet dug enough into understanding
10:02 of all the subtleties and possibilities
10:05 that AI brings up.
10:06 So I always kind of drill it into my team,
10:08 just saying, look, our most important things,
10:10 we have some specific risk categories
10:12 in something called the Preparedness Framework,
10:13 where we list the kind of, okay,
10:14 these are the things that we worry about,
10:15 like cybersecurity, we worry about biorisk
10:20 and other things, but we have also explicitly
10:22 this process of unknown unknowns.
10:24 And we are striving in different ways
10:26 to make sure that we always change our thinking
10:28 and see, okay, is there something we don't have there?
10:31 You know, will we succeed?
10:32 I hope we will, but at the very least,
10:34 this is something that I'm very, very worried about
10:36 and trying to work hard to make sure that exactly like,
10:38 let's think outside of the box.
10:40 Let's not make assumptions if we don't have to,
10:42 and always kind of be doubtful.
10:43 Like I always tell my team that
10:47 when we make some assessment,
10:50 actually what I'm the most worried about
10:52 or kind of where we should really feel
10:53 the biggest responsibility
10:54 is not where we find something troublesome,
10:56 but when we actually don't find something troublesome,
10:58 kind of like we need to always try to second guess
11:00 and say, did we miss something?
11:02 Is there something that we are not seeing
11:03 that we should be seeing?
11:04 So anyway, but that's a--
11:05 - So I asked you about a roadmap.
11:07 Are you charged with creating a roadmap?
11:09 - It's part culture, part technology to get there.
11:11 - Alex, are you charged?
11:12 I know when Ramesh was at Apple,
11:14 he helped create a privacy roadmap.
11:16 Are you charged with creating some sort of roadmap
11:18 or are you just there to think great thoughts?
11:21 - Well, I definitely am a believer.
11:24 I'm always aspiring to thinking great thoughts,
11:28 but I really believe that these things
11:30 like avoiding groupthink and so on,
11:32 like you need some processes and frameworks
11:34 to actually drive the right thinking
11:37 and making sure that thinking about safety
11:40 in particular is part of people's routine.
11:42 Like it's something you don't just do from time to time,
11:45 but actually it's part of like,
11:46 even if you are working on developing a better model,
11:49 you're thinking about that.
11:50 So to this extent, like one of the things
11:52 that I was kind of part of the impetus to create
11:56 is something called the Preparedness Framework,
11:57 which is actually public, you can read it,
11:59 which outlines exactly how we think
12:02 about assessing this catastrophic risk,
12:04 what kind of technical work we do.
12:05 How do we also, and this is very important,
12:09 how do we have a governance piece
12:11 that helps feed this technical knowledge
12:13 into our decision-making about which models
12:15 we are deploying, which ones we are developing, and so on.
12:19 So this is something that I thought is extremely important.
12:22 I also do try to have a roadmap, because preparedness
12:26 is not only about things that are happening right now,
12:29 but what is coming, what we should be preparing for.
12:32 So I do have that, but I'm not sharing that.
12:35 - So Alex, for time, I wanna just jump in here.
12:39 So this is what I'd like, I wanna tell you
12:41 that later today we have Sean on stage.
12:45 He graduated from MIT five years ago.
12:47 He said his favorite class was yours.
12:48 He's Course 6, and he just left OpenAI
12:52 after being there for three years,
12:54 like three weeks ago, and he's gonna tell us
12:57 about what it was like, and I'll ask him about you
13:00 and see if he still likes you.
13:02 But Ramesh, for the final question--
13:05 - I still like him.
13:06 - Yeah, okay, all right.
13:07 So Ramesh, for the final question,
13:09 can you ask him three questions,
13:11 and let's see which one he wants to answer.
13:14 And then I'm curious, to wrap this up,
13:17 how are you using AI?
13:18 How much of your software are you using,
13:20 are you using AI to code?
13:22 What do you do with AI to kind of experiment
13:28 and get feedback on this technology today?
13:31 But Ramesh, go, three questions.
13:33 - So Alex, I mean, you're a computer scientist,
13:36 just like me, so it's more than just policy
13:38 and frameworks and roadmap.
13:40 Some of your recent work here at MIT talked about
13:43 how the failure of models comes mainly
13:45 from domain shifts in the data,
13:48 and you have done a lot of work on
13:50 where pre-training could help, where fine-tuning could help.
13:53 How does that dovetail with what's going on,
13:56 thinking about AI risks in general,
13:58 because kind of model failure is just a tiny piece of it.
14:01 That's kind of one question.
14:02 The second question, and John, you would like
14:03 to also hear this, everybody here,
14:05 that OpenAI talks about preparedness,
14:07 not in terms of risk mitigation,
14:10 but risk creation, so very specific word
14:12 that OpenAI is using.
14:13 They're preparing for risk creation tools,
14:17 how to avoid risk creation, not risk itself.
14:20 So I want to kind of get your thoughts on that.
14:22 And the third one is what John asked,
14:24 like how does this play out for tools in AI in general?
14:28 - So you don't have to answer all those, just pick one.
14:32 - Sure. - Or none.
14:34 - So let me try to actually answer all of them very quickly.
14:39 So first of all, I do use AI, not really for coding,
14:42 because I don't do much coding anymore,
14:44 but I do use it for all these knowledge things,
14:46 like, okay, there was a word I had in mind,
14:49 and what is this word?
14:51 Or essentially, like there is a PDF I want to parse,
14:54 and, like, ChatGPT helps me with that.
14:57 So I really use it, and it's interesting
14:59 how much it's a part of my routine right now.
15:01 I don't even notice it anymore,
15:02 so I had to think for a moment to answer that.
15:04 In terms of my background as a computer scientist,
15:06 I think this is really important.
15:08 Not that maybe the specific research I did
15:11 is immediately, directly applicable
15:13 to the settings we are in,
15:14 but like both the thinking,
15:16 the kind of structural thinking about this technology helps,
15:20 but also, to be honest, being comfortable
15:24 having a very technical conversation with someone,
15:27 and really understanding
15:28 the underpinnings of the technology,
15:30 and, because of that, the corresponding risks,
15:32 like, that's key, and that's, I think,
15:34 hopefully something positive
15:35 that I can bring to this mission.
15:38 Now, in terms of that, I'm actually not sure
15:40 what you're referring to in terms of risk creation,
15:42 so I will have to guess.
15:43 Again, in some sense, the thing about,
15:46 when we think about the negative,
15:48 potential negative impact of AI,
15:49 which, by the way, I think there's lots of positive ones,
15:51 but let's talk about negative,
15:52 because we want to make sure
15:53 that we modulate and mitigate this.
15:55 Like, I think AI is really a reagent.
15:58 It's just kind of an accelerant
16:00 of certain already known risks.
16:01 I think also there will be new risks,
16:02 but in some sense, there will be also
16:05 the old risks that are accelerated.
16:07 So, this is kind of what we are trying to understand.
16:10 If we think about this accelerant of AI,
16:13 like which kind of makes everything different,
16:16 we want to really understand
16:18 how the risk landscape changes as a result of this.
16:22 So, this means that you really need to understand,
16:24 and I really always pitch to my team,
16:27 saying, yes, you have this great mission,
16:29 important mission, but also you will need
16:30 to get extremely familiar
16:34 with the absolute cutting edge of the technology
16:37 that OpenAI is developing.
16:39 So, you are really kind of side by side
16:41 with the people developing this,
16:42 understanding what they are doing,
16:43 because you need to understand how AI impacts things.
16:47 And for that, you have to have a deep understanding of it,
16:49 and then you will know what new risks emerge.
16:52 So, hopefully this answers your question.
16:52 - So, Ramesh has one final question.
16:55 To the audience, are you underwhelmed by Alex,
16:58 raise your hand, about what you expected,
17:00 or you really, really like what he had to say,
17:02 and you're glad he's working on preparedness?
17:04 Okay, so Alex, I don't know if you can see,
17:06 but a lot of people are happy about you.
17:08 - Yeah, so I want to kind of channel Shantanu Bhattacharya,
17:12 who's a scientist in our group,
17:14 and he calls it kind of the Soviet model,
17:16 that any centralization eventually creates
17:19 potential for catastrophic failure,
17:21 and the way our society works is through decentralization.
17:24 And his point is that if you just decentralize,
17:26 things ultimately become stable,
17:28 they automatically cross-check
17:30 their references, and things reach an equilibrium.
17:33 Do you have this debate, and I know your group
17:34 and my group at MIT both work on decentralization,
17:37 but at OpenAI, which is a highly centralized organization,
17:40 do you think decentralization will be a way
17:42 to bring some stability?
17:44 - So, I think this is an excellent question,
17:47 and I think it's quite nuanced,
17:51 because on one hand, to make progress in current AI,
17:55 you need to have a lot of compute, and a lot of focus,
17:59 and that brings centralization, but there are
18:01 these challenges that you are talking about.
18:03 So this is, by the way, why OpenAI has
18:05 such a unique structure, because this is part
18:08 of the thinking, okay, if we succeed, yes,
18:11 if we create this great technology, what then, right?
18:15 Like, does it mean we are all powerful?
18:18 Should we be all powerful?
18:19 Probably not, right?
18:20 So there's a lot of thinking about that at OpenAI, exactly.
18:23 Just saying, like, yes, there is some centralization
18:26 that is needed to get the resources
18:28 to really build the technology, but this question
18:31 that you are raising, and the points you are making,
18:33 are very vital, and definitely something
18:35 we are thinking about, and again, our structure
18:38 is one way of trying to address it.
18:41 Probably we should think more, because, you know,
18:43 it's a very difficult problem.
18:45 - Alex, five stars, if you were an Uber driver, good job.
18:48 - Thank you, thank you, Alex.
18:49 - All right, who knows?