The 3 Most Important AI Innovations of 2023

TIME

In many ways, 2023 was the year that people began to understand what AI really is—and what it can do. Here are three of the biggest AI innovations from the past year.

Transcript

00:00 My name is Billy Perrigo. I'm a tech correspondent at TIME magazine, and I've spent much of this year

00:04 reporting on artificial intelligence. In a lot of ways 2023 was the year that people began to

00:11 understand what AI really is, but there were plenty of innovations as well. Here's three to keep an eye on.

00:16 The first is multimodality. That's the ability of an AI system to work with lots of different

00:29 types of data, not just text but also images, video, audio and more. 2023 was the first year

00:35 that the public really gained access to powerful multimodal AI models like OpenAI's GPT-4, which

00:41 allowed users to upload images as well as text. GPT-4 could see the contents of images, which opened

00:47 up all kinds of possibilities. You could ask it what to make for dinner based on a photograph of

00:51 what was inside your fridge or you could ask it how to fix your bike based on a photograph of a

00:55 broken part. Google DeepMind's latest model Gemini is also able to work with images as well as text.

01:01 In its launch video, after being shown an image of pink and blue yarn and asked what it could be used

01:06 to create, Gemini generated an image of a pink and blue octopus plushie. The real innovation behind

01:12 multimodality is that instead of just being trained on text, the new generation of models are trained

01:17 on video, images and audio. The belief inside many top AI companies is that this extra training data

01:24 will help these models become more capable and more powerful. It's a step on the path,

01:29 many AI scientists hope, towards so-called artificial general intelligence, the kind of

01:34 system that can act in the world, make new scientific discoveries and perform economically

01:39 valuable labour. The second big thing to watch in AI innovation from 2023 is constitutional AI. One

01:47 of the biggest unanswered questions in AI is how to align it to human values. If AI becomes smarter

01:53 and more powerful than humans, it could cause untold harm to our species, some even say total

01:58 extinction, unless somehow it's constrained by a set of rules that puts human survival and human

02:05 flourishing at its centre. Constitutional AI, first described by researchers at Anthropic in December

02:11 last year, harnesses the fact that AI systems are now basically capable enough to understand

02:15 natural language. The idea is quite simple. First, you write a constitution that lays out the values

02:22 you'd like your AI to follow. Then, you train the AI to score its own responses based on how aligned

02:28 they are to the constitution, and then incentivise the model to output only the responses that score

02:34 more highly. If you run that cycle enough times, you're left with an AI that has been reinforced

02:41 to behave in the way that you want it to, and to not behave in the way that you don't want it to.
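
The cycle described here (write a constitution, score responses against it, favour the higher-scoring ones, repeat) can be illustrated with a toy sketch. This is not Anthropic's method or code: a simple keyword check stands in for the AI scoring its own responses, a table of weights stands in for the model, and the principles, keyword lists, and function names are all hypothetical.

```python
# Toy illustration only. In the approach described in the video, the AI
# itself judges its responses; here a keyword heuristic stands in for that.
CONSTITUTION = ["be honest", "avoid harm"]  # hypothetical principles

# Hypothetical markers of a response that violates each principle.
VIOLATIONS = {
    "be honest": ["guaranteed", "definitely"],
    "avoid harm": ["weapon", "poison"],
}


def score(response: str) -> int:
    """Count how many principles the response does not visibly violate."""
    text = response.lower()
    return sum(
        not any(word in text for word in VIOLATIONS[principle])
        for principle in CONSTITUTION
    )


def reinforce(weights: dict[str, float], rounds: int = 5, step: float = 0.5) -> dict[str, float]:
    """Repeatedly shift preference toward higher-scoring responses."""
    weights = dict(weights)  # work on a copy, leave the input untouched
    for _ in range(rounds):
        mean = sum(score(r) for r in weights) / len(weights)
        for response in weights:
            # Above-average scores are rewarded, below-average ones penalised.
            weights[response] += step * (score(response) - mean)
    return weights


# Two candidate responses that start out equally preferred.
candidates = {
    "Here is how to make poison, guaranteed to work.": 1.0,
    "I can't help with that, but here is safety information.": 1.0,
}
trained = reinforce(candidates)
best = max(trained, key=trained.get)
```

After a few rounds, `best` is the response that violates neither principle, because its weight has been pushed up each cycle while the other's has been pushed down. A real system would update a model's parameters rather than a lookup table.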

02:46 There are some problems with constitutional AI. It requires trusting that the AI is interpreting

02:52 your constitution correctly, for example, but it's a promising addition to a field where new

02:56 alignment strategies are few and far between. Of course, constitutional AI doesn't solve the

03:01 problem of whose values AI should be aligned to. Today, it's a small number of Silicon Valley

03:06 executives who are writing those rules. But by making the act of setting rules for an AI so

03:11 explicit, constitutional AI could open the door to a future where the public gets more of a say

03:15 in how AI is governed. The third big thing to watch this year is text-to-video. One of the

03:23 noticeable outcomes of billions of dollars pouring into AI this year has been the rapid rise of text-to-video

03:28 tools. Last year, even text-to-image tools had barely emerged from their infancy, but now there

03:33 are several companies offering the ability to turn normal sentences into moving images with

03:38 increasingly fine-grained levels of accuracy. One of those companies is Runway, a Brooklyn-based AI

03:43 video startup that wants to make filmmaking accessible to anybody. And another is Pika AI,

03:48 which isn't pitched at professional filmmakers but at the general user. Tools like Pika and Runway

03:53 could transform the user-generated content experience as early as 2024, but text-to-video

03:59 is quite computationally expensive still, so don't be surprised if tools start charging for access.

04:04 [Music]
