Should Computers Run the World? – with Hannah Fry


[APPLAUSE] Hello, everyone. Sorry. Thank you. Thank you very much. All right, I thought I’d kick
off by telling you about one of my favourite science studies. It’s a little unusual study into
the diagnosis of breast cancer. So back in 2015, scientists took
a group of 16 complete rookies and they decided to
try and teach them how to diagnose breast
cancer based on pathology slides like this one here. So the idea was that at the
end of the training period, these testers would be able
to look at slides like this and tell whether they
were malignant or benign. Now despite the fact
that normally it takes years to train to be a
fully fledged pathologist, these testers who had never
diagnosed any kind of cancer before in their lives actually
did astonishingly well. So after only two
weeks of training, they managed to get about 85%
of these slides correct, which is astonishing, right? But the really amazing
thing about their study wasn’t the performance
of the testers. It was their identity. Because these weren’t
medical students. They weren’t strangers
off the street. They were, in fact, pigeons. Now I know it’s a bit difficult
to imagine pigeons diagnosing breast cancer. And so what I have here,
I’ve got a little photo here. This is proper science,
by the way, people. This is real science here. This is their little laboratory. They had a little
screen on one side, where the slide would pop up,
and they would peck on one side if they thought
it was malignant, peck on the other side if
they thought it was benign, and they’d get a little treat at
the end if they got it correct. Now these birds did
pretty well, right? But the paper talks about
one slightly stupid bird who didn’t really understand
what was going on. Even by the end
of the testing, he was just kind of pecking
randomly at the screen. But if you take that bird away,
it turns out that although these birds could get
85% of these slides right on their own, if you combined
the votes from all of the birds together– so flock-sourcing
your diagnosis, as it were– thank you, more of those to come– then, actually, the
accuracy of these pigeons shot up to an incredible 99%. And that is a number
that’s comparable to what a fully-fledged pathologist
would be able to manage. Now I don’t think that we
have to particularly worry about our hospitals being
overrun by bird doctors just yet. Well, there’s a
reason why I wanted to start with this
example because I think it actually illustrates
a really important point. Because I think that we
like to imagine ourselves, as humans, as being uniquely
capable of a whole range of different things, that we’re
uniquely talented, really. There’s some jobs that we and
we alone are capable of doing. And I think that this example
really clearly demonstrates that that’s just
often not the case. And if you can train
birds to diagnose cancer, why can’t you train a
computer to do it, too? And I think this is really
something that has changed in the last couple of years. I think that we’ve really
started seeing stories about machines being
able to outperform humans in things that we thought were
our job and our job alone. And I thought I’d pick
a subject in an area that, traditionally,
we think of as ours. So I thought I’d pick an
example from the world of music, something that only humans are
able to translate the emotion of being alive into song. Well, let’s see if
that’s really the case. Let’s see if machines
have come up to scratch and are as good as us now. So what I’ve got, I have got two
pieces of music to play to you. Both of them are
going to be chorales. Both of them are performed
by a live orchestra. One of them was composed by
the great Baroque master Johann Sebastian Bach. And the other one was
composed in the style of Bach by a computer. And I want to see if you
can tell the difference, OK? I want to see if you
can spot the real Bach. OK, here we go. So two options, here’s option 1. [AUDIO PLAYBACK] [ORCHESTRAL MUSIC] – (SINGING) [INAUDIBLE] [END PLAYBACK] And option two. [AUDIO PLAYBACK] [ORCHESTRAL MUSIC] – (SINGING) [INAUDIBLE] [END PLAYBACK] OK, didn’t say it
was going to be easy. OK, here we go. So your job now, between
those two options, is to decide which one
is the real Bach, OK? We’re going to take
a vote on this. So first things first–
who’s not going to vote? Oh. Great. A couple of people actually
put up their hands. Damn. OK, all right, so who thinks
that the real Bach there was hiding behind option 1? Hands up. Ooh, interesting. And who thinks it was option 2? 50/50, then. Basically guessing at random. [LAUGHTER] And you laughed at that pigeon. OK. In fact, the real
Bach was actually hiding behind option
two, so well done if you got it correct. Option one there,
option 1, instead was something
called “Experiments in Musical Intelligence.” It was an experiment done
by the composer David Cope. And the way that David
Cope’s experiment worked there was by
getting the machine to use a very simple algorithm
to construct that music. Now I’m very aware, just
as sort of a side note, in the work that I’ve
been doing recently, I’m very aware that the word
“algorithm” makes about 85% of people want to gouge
out their own eyes, right? But I mentioned this
at a tech conference that I was at to someone,
and they agreed with me. But they added that
it makes the remaining 15% of people mildly aroused. [LAUGHTER] So I’ll let you decide
which camp you’re in. But the thing is that
this particular algorithm, David Cope’s
algorithm, it actually works quite a lot
like predictive text does on your phone. So the way that an algorithm–
all an algorithm is, really, is it’s something
that takes an input, takes it through some
series of logical steps, and gives you an output
at the end, right? So a cake recipe,
in theory, it could be an algorithm, where your
input is the ingredients, and the output, at
the end, is your cake. But this particular
algorithm, it worked a lot like predictive
text does on your phone. So the input in this instance
was the vast catalogue of all of the chorales that
Bach had ever written. The logical step to get you
from one part to the end was just a very simple process. And what you would do is you
give the algorithm a chord, and then it will
tell you what chords were likely to come up next
in Bach’s original music. And you pick one of those
based on probability and repeat that process,
chaining them together until then you end up with
an original piece of music. So in that way,
actually, this algorithm was very, very
similar to those games that I don’t know if
you’ve seen people play. There’s a lot on the internet. Where you open up the
notes in your phone, you seed your sort of algorithm
with a very simple sentence like, “I was born,” and then
you let predictive text complete your own autobiography
for you that has been trained on
the things that you’ve typed into your phone. Now I thought I’d give this
a go, and I videoed it, and I thought I’d just
share it with you, just to give you a little
flavour of the type of messages I’m apparently
sending on my phone. It starts off fine. It gets a little bit
weird towards the end. “I was born to be a
good person, and I would be happy to be with you. A lot of people– I know that you are
not getting my emails– and I don’t have
any time for that.” There’s a little bit of an insight
into my life just there. It’s good. Now if I play you back that
little snippet of Cope’s algorithm, you can hear
these very, very simple chord transitions going on
in the background, and that really is the
giveaway that it’s fake. Bach’s original piece was
much more complicated. Here we go. [AUDIO PLAYBACK] [ORCHESTRAL MUSIC] – (SINGING) [INAUDIBLE] [END PLAYBACK] There you go. That’s how you spot the fake. But I think that
you could argue– I think if we’re
being honest here, I think you can
say that, actually, Cope’s algorithm there– it’s not really composing music
in the traditional meaning of the word. It’s more like it’s
taking Bach’s music, it’s kind of passing
it through a grater, and then sticking it
all back together again. But I think that
it really does– that algorithm does
go to demonstrate how something that
is incredibly simple can have these amazingly
impressive results, right? Enough to fool a
roomful of people as to which ended up being real. And I think that the stuff
that algorithms can do now is just incredibly impressive. I mean, we have algorithms
that can catch serial killers. We have algorithms
that can drive cars, even algorithms that
can diagnose cancer better than pigeons can. But you know, ultimately,
I didn’t really want– the book that I’ve
written and the talk that I want to give you today
isn’t really about algorithms. It’s really about humans. It’s about how we
fit into all of this and really what we want
our future to look like. Because I want to just tell
you a little story about what got me into writing this
book, something that happened to me a few years ago. So after I finished my PhD,
my first job as a postdoc was a collaboration with
the Metropolitan Police. And we were looking
at the riots that had happened across London and
the rest of England in 2011. And the idea was that we
were using all the data, and we were going to come
up with a predictive model that, if this were
to ever happen again, could be used by the police
to quash the unrest before it really got out of control. So we published this paper,
and a couple of years later, I went off to Berlin to go
and give a talk about it. There was this big
academic conference. And I was on stage making stupid
jokes, being really flippant, basically telling
everyone how great it was that you could control
a city’s worth of people by the police. And it just didn’t occur
to me that if there is one city in the world
where people really understand what it’s like to live
in a police state, it’s going to be Berlin. So as a result, in a Q&A, I
got absolutely torn apart. And I’ve managed to track
down a video of this talk, and I’ve found a photo of the
exact moment where I think I realise I’m really in trouble. You can see me, like,
pleading, begging the audience. But I think that this
really– this was the moment in my career when I
realised that when it comes to algorithms, you
can’t just build them, put them on a shelf, and decide
whether they’re good or bad in isolation. You have to think about
how they’re actually going to be used
by people– so both now and longer term in the future. And, of course, there are these
big ethical questions, like algorithms being turned and
being used by police states. But there’s also
much smaller stuff. Because I think what I’ve come
to realise is that if there is one thing that’s absolutely
for sure is that you can’t rely on people. Let me explain what I mean. I think the most
sensible place to do that is with some Japanese
tourists who went on a trip to Australia, as
this story will show. So OK, this group of Japanese
tourists, they decided, while they were on their
holiday in Brisbane, they wanted to go on
a little road trip, and they wanted to go from
where they were staying, which was here, and visit this
very popular tourist destination over here. So they hired a car,
popped it into their sat nav, and it said, well, it’s
basically a straight line, which is great. Slight problem that
you can’t necessarily see from this satellite image
is that unfortunately, there’s a great, big whopping body
of water between the two. Now in fairness to
these Japanese tourists, they didn’t notice
this immediately. Perhaps they didn’t speak
English particularly well and didn’t notice that
the word “island” appeared on the name of the place that
they were trying to drive to. But I think you
could forgive that. But you would think that once
it came to actually trying to drive on water,
they would know it was time to overrule
their machine, right? Apparently, they didn’t. I think it was
quite embarrassing when someone had to wade
out to come and help them. Definitely
embarrassing when they had to abandon their hire car. But most embarrassing of all was
about half an hour later when an actual ferry sailed past. But the thing is, I think
that we can all chuckle at the silliness of this story. But you know, I think
I’ve come to believe that, actually, these Japanese
tourists aren’t really alone. Because I think when
it comes to placing blind faith in a
machine, actually, this is a mistake that we’re
almost all capable of making. And, of course, there’s
really big examples where we’ve let algorithms
take the driving seat that aren’t necessarily deserving
of our trust, the big story about Cambridge Analytica
earlier this year being a really key example. But I think if you
look closely enough, you will see this story
over and over again, of people just giving up
their power to a machine. But I think that we’re all aware
now that, by now, you really can, if you have
enough data on people– you really can work out all
kinds of different things about them. So you can work out what
they’re going to buy. You can work out which
way they’re going to vote. But you can even work
out whether or not they’re going to go on to
commit a crime in future. And that’s what this
questionnaire here is from. This is from an algorithm– a so-called
recidivism algorithm– that is used when
defendants appear in court. And it’s used to
predict whether or not that defendant will go on to
commit a crime in the future. And it’s used by judges–
this example’s from America, but this is something you
find around the world, including in the UK,
and it’s used by judges to decide who should be awarded
bail and, in some cases, to decide how long someone’s
sentence should be. Now I imagine quite a
few people in this room have sort of heard of
this happening already, and I kind of wanted to
get your take on this. So I want you to imagine that
you’re the one in the dock, right? So I want you to imagine
that you committed a crime, has to be
something that you did, you’ve got to be guilty, right? Otherwise, this doesn’t work. So you’ve committed a crime,
you’re standing in the dock, and a decision needs to
be made about your future. Who would you rather
presided over your future? Would you rather it
was a human or would you rather it was an algorithm? OK, let’s take a show of hands. Who would rather an algorithm
took care of their future? Presided over their future. Oh, interesting. That’s quite high, actually. OK, who thinks human? Who says human? Well, you’re still the majority. OK, who said human who
doesn’t mind telling us why? Do you mind telling us why? Is there a mic going round? May I use this one? Let’s see if we can
get you to tell us why. Is it on? Oh, perfect. Can you tell us why
you’d like human– what crime did you
commit, first of all? [LAUGHTER] I’m teasing. I would probably have a human
because you could have bugs in the computers and stuff. And also, I just feel
more comfortable, because I just would like
it if a human would decide, if a computer decided my future. Yeah, you kind of want someone
to look you in the eyes as they’re sending you to gaol? It’s that kind of thing. Who else said human who
doesn’t mind sharing? I’m sorry. I’m gonna run up. OK. This wasn’t the plan, but OK. I’m doing it. OK. Thank you. For a human because I
think I’m of a demographic that would probably do
better from a human response than a computer. I think you make a really
important point there, because I think that
everybody’s probably quite comfortable with
the idea that humans make more mistakes, right? That there’s going
to be a bigger range of answers with
a human than there would be with an algorithm. But most audiences that I talked
to think that that error will go in their favour, really. And I think that
different kind of people end up with a
different experience. Actually, who said
algorithm that doesn’t mind saying why? Let’s go up here. Why not? Let’s go up here. Sorry. Gotta wait for me. Sorry. Here we go. Thank you. I think I trust
an algorithm more because they
wouldn’t be biassed, because it would be unbiased. Yeah. Yeah, I think you make
a really good point. I actually did this with
an audience the other day. And someone put their
hand up for algorithm. And I asked them why. And they said, because I
work for the judiciary. [LAUGHTER] I think you make a
really important point, which is that, actually,
there’s lots of evidence to show that this error that you
get in human decision-making is a real problem. So there’s evidence to show
that if you take the same case– I’m running out of breath after
running up and down the stairs. If you take the same
case to different judges, you’ll often get a
different result. If you take the same case to the
same judge on a different day, you can get a different
result. There’s evidence to show that
judges who have daughters tend to be a lot
stricter in cases that involve violence against women. My favourite one,
actually, is this. Judges tend to be a lot stricter
in towns where the local sports team has lost recently. There’s some studies
that aren’t– they’re not necessarily
that popular, but they talk about how the time
of day can make a difference, whether the judge is hungry. There’s evidence to
show that judges don’t like giving too many of
the same decision in a row, so that if you have four people
been successful before you, then suddenly they
become a lot stricter. There’s lots and lots of
stuff like this, lots and lots of things like this. And the consistency
issue is something that you really can get
rid of if you switch over to an algorithm. So I kind of think on balance,
I probably agree with the people who are slightly more in
favour of the algorithms. But there’s quite
a big but there, and I think that
that’s only the case– these algorithms
aren’t perfect, and I think it’s only
the case if we can trust the human judges to know
when to overrule the algorithm. Because the AI, or
these algorithms, they’re going to make mistakes. They’re going to make
calamitous mistakes. Let me give an example. There was a story about a young
man called Christopher Drew Brooks, who is a 19-year-old
man from Virginia, and he was convicted
of the statutory rape of a 14-year-old girl. Now they had been having
a consensual relationship, but she was under age,
and that’s illegal. So during his
trial, an algorithm assessed his risk
of reoffending. And it determined that because
he was such a young man and he was already
committing sexual offences, he had a very high
chance of continuing in this kind of life. So it deemed him high
risk and suggested that he be given 18 months gaol time. Now, in theory, there’s
nothing necessarily wrong with that assessment. But this case does really
illustrate, I think, just how inconsistent these
algorithms can sometimes be. Because this particular
algorithm put so much weight on this young man’s
age that, in fact, had Brooks been 36 years old,
that would have been enough– in fact, putting him at 22
years older than the victim there, which I think,
by any possible metric makes this crime
much worse, but that would have been enough for
this particular algorithm to kind of tip it over
the edge and suggest that this young man was
instead a low-risk individual and suggest that he
escape gaol entirely. Now you would hope, I think,
that in a situation like this, the judge would
have the foresight to overrule an
algorithm like this and to rely on their
own instinct instead. But it seems that
judges are actually a lot more like Japanese
tourists than we might imagine. Because, in this case, and
many, many others like it, the sentence of
the individual was increased on the say-so of this
logically flawed algorithm. Well, there is another
problem with these algorithms, something that you actually
spotted there, talking about bugs in the system. And that’s really that they
don’t make the same mistakes with everyone. They make mistakes,
but it’s not kind of uniform across the board. And I think what we’ve really
realised in the last couple of years is that the algorithms
that we’ve invited in to make decisions own our
lives, they have these– all kinds of deep-hidden biases. And I think that really part of
the way out of all of this mess is by acknowledging that
artificial intelligence or an algorithm, they’re just
not going to be perfect, right? And that’s because they just
don’t understand the world in the same way that we do. They don’t understand
context, and they don’t understand nuance. And that, I think, is something
that has never been clearer than when you ask an algorithm
to recognise what’s contained within an image. So this is an
experiment– here we go. This is an experiment that
was done by Janelle Shane, the blogger Janelle Shane– I’ve repeated it here for you. She noticed something
a little bit weird when you upload
photographs like this to image recognition software
that will automatically label your photos for you. So I’ve done this one
with Microsoft Azure. So, OK, here’s the label that
it gives this particular photo here. It Says that this
is a herd of cattle grazing on a lush green field. Now I can definitely see a lush
green field, looks very lovely. But I mean, I’ve spent
quite a long time looking at this photo,
and I fail to find any cattle contained within it. And that does make you wonder,
is this image recognition software hallucinating
farmyard animals? Well, let’s give
it another go, OK? So let’s try it with
this one, this image here. This one it labels
as a sheep standing on a lush, green field. It’s got quite a thing
for lush, green fields, this particular algorithm. OK, does it really understand
what a sheep is, then? It kind of makes
you wonder, does it know what a sheep actually is? Well, let’s try it. Let’s take a sheep in a
slightly more unusual situation, try that. This one, it labels
as a cat sitting on top of a wooden fence. Take a farmyard animal, put
it in the arms of a child, the algorithm thinks,
well, that can’t possibly be a living creature. Take farmyard animals,
put them in a tree, and it thinks, well,
they must be birds. And my favourite of all is
if you take those farmyard animals, leave them where
they are, but paint them pink, it thinks that suddenly
they must be flowers, which is kind of– let’s be honest, a far more
sensible explanation for what’s going on there than why those
sheep are actually pink. But you know, perhaps
all of this stuff, perhaps it doesn’t
really matter. But I think when
image recognition is used to start to really
change people’s lives, it starts to become,
actually, quite important. Let me just tell you
one last story about– a story of something
that happened in Idaho. So back in 2014, there were 16
disabled residents of Idaho, and they received some
unexpectedly bad news for the day. So the Department of
Health and Welfare, they just introduced
this new budget tool. And the idea of this
budget tool was it was going to automatically
calculate how much each of these residents– how much state benefits,
sorry, each of these residents were entitled to. Now these were people who had
quite severe disabilities, so this money was essential
to making sure they could keep their independence, really. They qualified for
institutional care but were being cared
for at home instead. So one by one, they each went
into the State Department to find out how much money
the algorithm decided that they were entitled to. And weirdly, some
people found out that they actually were entitled
to much more than they’d got in previous years while
other people, as you might expect, ended up having
deficits of tens of thousands of dollars. Now from the outside, no one
could work out how the hell this thing was
making its decisions. It looked like it was,
essentially, just plucking numbers at random. But the problem was it was
kind of impossible to argue with this computer. So the people who worked
for the government just trusted its output
a bit too much. So in the end, the
residents had to bring this class-action lawsuit to
have this algorithm turned over to be scrutinised. And when it was
scrutinised, they discovered that this algorithm– it had
so much power over their lives, it wasn’t some
super-sophisticated, artificial intelligence like they’d
sort of been led to believe. It wasn’t this super-slick
mathematical model. It was, in fact, an
Excel spreadsheet and, if you’ll forgive
me for being blunt, a quite crappy one at that. So this Excel spreadsheet, it
had errors all over the place, right? There was bugs in the data,
the formulas were a mess– in fact, the maths
in this spreadsheet was so bad that the
judge would eventually rule it unconstitutional. I love the idea of there
being unconstitutional maths. But I think that, ultimately,
the moral of this story is that once you kind of dress
something up as an algorithm or as a bit of
artificial intelligence, it can take on this air
of authority that makes it really hard to argue with. So I thought that
I would leave you with a much more
positive example, I think, of how to get around
all of these different issues that I’ve raised
during this talk– an example of where I
think people really do understand what the
future should look like. And I want to go back to the
example of breast cancer that I started with, where
I think people are doing some really incredible stuff. Now if you want to
design an algorithm to diagnose breast
cancer, there are two things that you want your
algorithm to be able to do. So on the one hand,
you want your algorithm to be really, really sensitive. You want to make sure that your
algorithm catches every single last tumour that’s hiding
amongst that vast array of cells. You want to make sure it doesn’t
miss any last single tumour. But you also want your algorithm
to be really, really specific. So you want to make sure
that your algorithm isn’t flagging loads of
normal tissue and saying that it’s suspicious
when it isn’t, OK? You want to make sure that
it’s really accurate in all of its assessments. So, OK, simple, then. If you want to design an
algorithm to diagnose breast cancer, just whack
up those two dials, and you can kind
of– you’re done– except, unfortunately, it’s
just not really that simple. Unfortunately, when it
comes to these algorithms, these two dials tend
to be locked together. So turning one up
often means having to turn the other one down. And that means you can
kind of inadvertently design something that’s
quite a crappy algorithm. Because, for instance, there
is something, a very simple algorithm, that matches
this profile here, a very simple algorithm. It’s just a single line. All it does is it just
says, everyone has cancer. Gets 100% sensitivity,
certainly, but not that much use in terms of
actually diagnosing people. So all you have to do when
you design these algorithms, you have to just do the
very best that you can. You have to play to the
strengths of your algorithm. And believe me when I
tell you, these algorithms have some absolutely
almighty strengths. So on the subject
of sensitivity, let me just give you a flavour
of the kind of things that these algorithms can do
now, using an example of one of my favourite data sets. Everyone has a favourite
data set, right? This data here, this is
data from something called “The Nun Study.” And this shows the cognitive
ability of 678 nuns. So this was some data that was
collected by the epidemiologist David Snowden. And at the beginning
of this study, these nuns were aged between
75 years old and 103 years old. And David Snowdon managed
to persuade these women that every year of
their life they would take a little cognitive test. So being asked things
like, how many animals can you name in a minute. It’s questions like that. What you can see here
is the data for how these women performed, right? So you can kind of get this
trend of cognitive decline as people get older,
because they appear on this every year of their lives. Along the bottom there,
you have the people who ended up getting dementia. And along the top
there, especially the top right-hand
corner, that’s where you’ve got people who
remained absolutely sharp well into their older years. Now the reason why this is one
of my favourite data sets is because, not only did David
Snowdon manage to persuade these women to take
these tests every year. He also managed to
persuade them to donate their brains to the
project after their deaths. Now if you’re a
little bit squeamish, I suggest you look
away, because I’m going to show you some
human brains in a second. But these are the women. These are incredibly generous
women, the School Sisters of Notre Dame from Kentucky. And in a moment, you’re
going to see the room where all of their brains are
kept, this wall of brains, essentially, the scientists use. Now the reason why
you want to do this is so that you can look to
see whether the people who had signs of
dementia in life are the same people who had signs– physical signs of the disease
within their brains in death. So when you dissect
their brains, whether you see
all of the lesions, all of the hallmarks of dementia
having affected their brains. Now you would think that
these two things should be straightforward, right? Signs of dementia in life,
signs of dementia in death, except it turns out
that’s not the case. There are some people who
really buck this trend. So take, for example,
Sister Mary here. Sister Mary died when
she was 101 years old. And as you can see from her
position on this graph here, she was incredibly sharp
right up until her death, doing crosswords, all of
this different kind of thing. And yet, when her brain was
dissected after her death, it showed all of the
hallmarks of having been ravaged by disease. In fact, inside her brain,
there was barely any difference between her brain and one
that would appear much lower down this chart. So what on earth
is going on here? Why are some people able to
resist showing the symptoms when they have this stuff
going on inside their minds? Well, it turns out
that a clue might be hiding in another
data set altogether, one that was created decades
before any of these women even showed any
signs of dementia. Because this team also
have access to the essays that these women wrote when
they entered the sisterhood when they were 19 and 20 years old. And if you do some very,
very simple analysis on the language that is
used in these essays, you can predict which women will
go on to develop dementia later in life. So here’s an example for you. This is a sentence
from someone who did not go on to develop dementia. And you see the complexity
of the language, how densely packed the ideas are, the
sort of vocabulary that’s used, and so on. And compare that to a
sentence from someone who did go on to develop dementia. I mean, you can kind
of see illustratively just how different they are. Now this is the stuff the
algorithms are amazing at, looking for tiny, tiny clues
in seemingly completely disconnected data sets
that can make really big predictions in
the very long term. So in fact, when it comes
to cancer diagnosis now, the algorithms that we
have can’t just tell– it’s not that they’re just telling
you what’s in your body right now. They can make a prediction
about your long-term chances of survival based not
only on the tumour itself, but on something in the
surrounding tissue that we’re still trying to work
out exactly what it is that they’re picking up on. Just incredibly, incredibly
sensitive these algorithms, right? Just absolutely amazing
what they can do. And yet remember that
sensitivity does not make a perfect algorithm. That’s only one half
of the equation. Now on the flip
side, humans, when it comes to being sensitive,
we’re rubbish, right? We’re totally rubbish at this. I think this is something
that was best illustrated by quite a mean trick that
was played by some Harvard scientists on some
radiologists back in 2014. So what they did, they got
these professional radiologists, just to see how sensitive
their eyes were, and they showed them
this image of a lung, and they asked them
to have a look at it. And they used eye
tracking software to see where they were going. And they failed to
tell them, and 83% of the professional
radiologists failed to spot that they had hidden an
actual gorilla inside the lung scan– [LAUGHTER] –despite the fact that
the eye-tracking software said that their eyes were
looking right at it. And if you’ve got professional
radiologists missing gorillas, you can imagine how many
tumours end up getting missed. We are terrible at
sensitivity, right? We are really, really bad. We miss things all the time. But, specificity,
being specific, that’s like our superpower. So if you have a fully trained
radiologist or pathologist, they will almost
never misidentify a perfectly normal set of cells
as cancerous when they’re not, right? Something that
almost never happens. We’re incredibly,
incredibly good at it. So here’s the idea, then. Why don’t we just accept that
neither humans nor algorithms are ever going to be perfect? And rather than choosing
between human and machine, which is kind of the
rhetoric that we get so much, why don’t we exploit
each other’s strengths and just create much
more of a partnership? And this is what’s
happening now in the way that these cancer diagnoses
are being designed. Because the algorithm
never gets tired, so let it trawl through
all of that data and just highlight a few
key areas of concern. And then the human
never misdiagnoses. So they can come in,
and just sweep up and have the final say. They’re playing a really
active role in all of this. And I think, ultimately, this
is the version of the future that I’m really hoping for. I think when will
we start embracing the flaws in the algorithms as
well as acknowledging our own, really? And I think, when will we
start taking our algorithms off of the pedestals and
start treating them like we would any
other source of power– by questioning how they
work and calling them out for their mistakes? Because I think,
ultimately, you really can’t think of technology,
and artificial intelligence, and algorithms in isolation. You have to think of
all of the failings and all of the trust issues of
the people who are using them. Thank you very much. [APPLAUSE]
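
The chord-chaining process described in the talk (give the algorithm a chord, look up which chords tended to follow it in the source music, pick one based on probability, and repeat) can be sketched roughly as below. This is only an illustration of the general idea, not David Cope's actual program. The function names and the toy chord progressions are made up for the example and are not real Bach chorales.

```python
import random
from collections import defaultdict


def build_transitions(pieces):
    """Count how often each chord is followed by each other chord."""
    transitions = defaultdict(lambda: defaultdict(int))
    for piece in pieces:
        for current, following in zip(piece, piece[1:]):
            transitions[current][following] += 1
    return transitions


def generate(transitions, seed_chord, length, rng):
    """Chain chords together, choosing each next chord in proportion
    to how often it followed the current chord in the input pieces."""
    sequence = [seed_chord]
    while len(sequence) < length:
        options = transitions.get(sequence[-1])
        if not options:  # this chord never had a successor in the input
            break
        chords = list(options)
        weights = [options[chord] for chord in chords]
        sequence.append(rng.choices(chords, weights=weights, k=1)[0])
    return sequence


# Hypothetical toy corpus: invented progressions standing in for the chorales.
TOY_CORPUS = [
    ["C", "F", "G", "C"],
    ["C", "Am", "F", "G", "C"],
    ["C", "G", "Am", "F", "C"],
]

if __name__ == "__main__":
    table = build_transitions(TOY_CORPUS)
    # Seed with a single chord, much like seeding predictive text with "I was born".
    print(generate(table, "C", 8, random.Random(0)))
```

Every transition in the output is one that already appeared somewhere in the input, which is why the result can sound convincing while containing nothing the source music did not already do.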
