Mobile Performance from the Radio Up: Battery, Latency and Bandwidth Optimization – Google I/O 2013


ILYA GRIGORIK: Hi everyone,
my name is Ilya Grigorik. I’m a developer advocate
on the Make the Web Fast Team at Google. And today I want to talk to you
about Mobile Performance from Radio Up, which is to say,
taking a lower level look at how the radio works, and what
we can learn about it to make our apps, mobile apps, and
web pages more performant. So these are some tips and
tricks that I’ve learned from working with different Google
properties, helping them optimize their applications,
and some general best practices. So first I think we should start
with the obvious, or maybe it’s not so obvious to
many of us, which is, wireless is fundamentally different
than wired. There are different
constraints. The technology, the underlying
technology, is, in fact, very different. And fortunately or
unfortunately, I’m not quite sure yet, our platform actually
abstracts that. TCP and our browsers make it
seem like it’s all the same, whether you’re accessing over
a wireless network, whether it’s Wi-Fi, a mobile network,
or a wired network, it feels like all the same. But once you actually dig a
little bit deeper, you’ll realize that there are different
design constraints, even within the different
wireless standards. And that actually dictates how
you need to design your applications to make them
feel performant. So first of all,
why do we care? It turns out that performance
is a top criteria for all applications, mobile
especially. And there’s a number of great
studies that have been done about performance on mobile. And the first takeaway is that
users expect mobile sites to perform just as well, if not
better, on mobile than they do on desktop. We don’t have more relaxed
expectations on mobile. If anything, it should
be faster. You’re on the go. You’re trying to check
something quickly. You don’t have much time. You want it to perform well. Now, out of all these users that
have asked this question, more than half have actually
reported feeling frustrated about having a problem when
accessing a site or an app over a mobile network. And most of those, the number
one complaint has been the slow load time for their
applications. And after that, almost half of
them said, if I had a problem, I couldn’t get the page to load
fast enough, it didn’t react fast enough, they
wouldn’t come back. So you’ve lost that user, and
they’re not returning to your site, which is the worst
possible outcome. And of course, this also
translates to dollars and cents, and even millions
of dollars. So this another great case study
that I’d like to share. This is done by Aberdeen
Group. They looked at a whole number,
100-plus e-commerce sites, and they realized that adding one
second of delay to their pages and to their applications
dropped their conversions significantly. So this is your actual
purchases. People viewed fewer pages,
and, of course, customer satisfaction decreased
as well. So this is just 1,000
milliseconds of latency added to your application. So it’s not just dollars
and cents. For many sites, large
sites, it’s literally millions of dollars. So our agenda today is to
actually look at Radio Performance 101. It’s kind of a propeller
hat talk, a little bit. And specifically, we’ll compare
Wi-Fi and mobile. And you’ll see why because it’s
important to understand that even something like Wi-Fi
and mobile networks are fundamentally different in
how we schedule, how the communication is done, and how
we design our applications, or what we can even do to optimize
our applications. And we’ll also look at some
practical tips for what you can do in your applications
to design for this. And then I’ll leave the last
part for you, which is to optimize the profit part. So there are tons and tons of
techniques for how to make your applications
more performant. There is application
optimizations. There’s HTTP-specific
optimizations, or HTTP1-specific optimizations. We are not going to
touch on those. We’re going to mention
them at the very end. We’re actually going to go
a little bit deeper. We’re going to go below the
application layer, even below the TCP layer, and kind of look
at, like, how does the radio actually work? So we’ll put the wired stuff
aside, look at the radio, and dive into Wi-Fi and mobile. So 3G and 4G, what’s so
special about that? How does battery life
play into this? It turns out the battery
life is kind of the central component. There’s the radio. Then there’s the battery,
those are two connected components that we need
to think about. And what is the radio
resource controller? Our friend and foe. So let’s dive right in. First of all, wireless, or
Wi-Fi, came around in the early ’90s. That’s when the first standards
were published. But really, it only became
popular in the late ’90s with the release of the 802.11b
standard, which is the 11 megabits per second standard. And the thing to realize about
Wi-Fi is, it was literally designed as an extension
to your Local Area Network, your LAN. So we reuse all the same
framing, protocols, mechanisms, all the rest of
them, and basically added a wireless interface to it. So it is not designed for a
mobile device, which is to say, a device with
limited battery. It was designed for
your desktop computer, your laptop computer. We weren’t constrained
by power at the time. Think back to the phone
that you had in 1999. We weren’t thinking about the
kinds of applications that we’re delivering at the time. And the thing about Wi-Fi is,
we need to understand how we actually schedule, how we
communicate, over Wi-Fi. So when we’re communicating over
a wired connection, we have a switch, or a router,
which routes all the packets between the client and
server, for example. In Wi-Fi, we share the
radio channel. And the radio channel
is a shared medium between all of us. All of us can’t talk
at the same time. That would just generate
a lot of noise. So we need some sort of a
mechanism to figure out who’s going to talk when. And Wi-Fi takes a very hands-off
approach to this. It basically says, let’s use a
very simple algorithm, which is, if you want to talk,
first listen if nobody else is talking. If nobody else is talking
then start talking. And if another person starts
talking while you’re talking, well then, both of you stop,
wait for random interval of time, and repeat the
whole process. So it’s kind of like this party
algorithm of hey, we’ll just trust that all of you
behave nicely, and you’ll all respect each other. And we’ll just kind of
share this medium. That’s Wi-Fi. Now you can actually model
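The party algorithm just described can be sketched as a toy simulation. This is a deliberately simplified model with made-up parameters (a fixed transmit probability and a small random backoff window), not the real 802.11 DCF, which adds slot timing, ACKs, and exponential backoff:

```python
import random

def simulate_csma(num_stations, slots, seed=42):
    """Toy 'listen before talk' model: each station that is not backing
    off transmits with some probability per slot; if two or more talk at
    once they collide, and each waits a random number of slots."""
    random.seed(seed)
    backoff = [0] * num_stations          # remaining silent slots per station
    delivered = collisions = 0
    for _ in range(slots):
        talkers = [s for s in range(num_stations)
                   if backoff[s] == 0 and random.random() < 0.3]
        if len(talkers) == 1:
            delivered += 1                # channel was clear: success
        elif len(talkers) > 1:
            collisions += 1               # everyone stops and backs off
            for s in talkers:
                backoff[s] = random.randint(1, 8)
        backoff = [max(0, b - 1) for b in backoff]
    return delivered, collisions

print(simulate_csma(2, 1000))    # few stations: mostly useful slots
print(simulate_csma(20, 1000))   # crowded room: far more collisions
```

Running it with more stations shows useful throughput dropping as collisions climb, which is exactly the congestion behavior described here.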
this and prove that the channel load, or the amount of
traffic, needs to be kept fairly low. In fact, the load of the network
needs to be below 10% in order to achieve
good performance. You’re always going to
have retransmissions. There will always be the case
when I start talking, and you start talking a little bit
later, and we collide. But that’s OK, because we will
wait, and then we’ll retransmit data, and
everything’s OK. You can get pretty
good performance. But after a while, you basically
run into this case where there are too many devices
trying to compete for the same medium, and the network
basically collapses. There’s no way to recover
from this state. And I’m pretty sure all of us
have experienced this at some point, right? Wi-Fi and a large group
of people in a room? Not a good combination,
oftentimes. So that’s, in part, why. And we can dig a little bit
deeper and actually look at how it’s implemented. So in the 2.4 gigahertz
spectrum, we actually have a limited amount of spectrum,
let’s say 60 megahertz. Within that 60 megahertz, we
can actually have three channels, which is to
say up to three devices can talk in parallel. And they won’t interfere
with each other. So these are the channels. If you guys have ever dug into
your settings on your Wi-Fi router, that would be your
channel one, channel six, or channel 11, or something
to that extent. Now the problem is, this is
nice and pretty– this is a little chart
that I pulled up on my own home network. And you can see that there’s
over a dozen networks. And they’re all overlapping. We’re all trying to talk
over each other. And basically, Wi-Fi to
some degree, is a victim of its own success. It has become so ubiquitous. There’s so many Wi-Fi routers
out there that it’s very hard to get a slot to talk to
somebody without interfering with anybody. So your neighbor, when they
start streaming that HD Netflix movie, they’re
interfering with you as well. Even though it’s a different
access point, they’re still using the same shared
spectrum. So there is limited capacity. And most locations have a dozen
or more overlapping networks. So here’s a fun experiment. I ran this at my house. So this is not definitive data,
but this is an example, a data point. There’s about 15 access
points just in and around my apartment. And I ran this very
simple test. I took my laptop, and I had
my router, which was about 15 feet away. And I was staring right at it. And I figured hey, how long does
it take for one packet to get from my laptop
to my router? Like what is the latency
of that first hop? And I gathered data
for several hours. So this is a good
sample of data. And if you look at the numbers
here, the 95th percentile is about 50 milliseconds. So to travel 15 feet from my
laptop to my router, it was taking about 50 milliseconds,
which is crazy because 50 milliseconds is about the amount
of time it takes to travel from the West Coast
to the East Coast. So here I am trying to shuttle
a packet 15 feet away to my wireless router. So this is a function of,
there’s just a lot of other activity in the vicinity,
so we’re competing for the same spectrum. So then I tried something
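You can summarize a ping trace like this yourself with a small nearest-rank percentile helper. The RTT samples below are illustrative stand-ins, not the actual dataset from this experiment:

```python
def percentile(samples, pct):
    """Nearest-rank percentile: sort the samples and index in."""
    ordered = sorted(samples)
    rank = max(1, round(pct / 100 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]

# Hypothetical first-hop RTTs in milliseconds from pinging a home router.
rtts_ms = [3, 5, 4, 48, 7, 52, 6, 5, 49, 4, 6, 5, 47, 5, 6, 4, 5, 51, 6, 5]
print(percentile(rtts_ms, 50))   # the typical sample is fast...
print(percentile(rtts_ms, 95))   # ...but the tail is dominated by contention
```

The point of looking at the 95th percentile rather than the average is that it exposes the long contention-driven tail that users actually feel.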
different. I actually went out and bought
a different router, which is using the new 802.11N standard
and it’s running on the 5 gigahertz band. And the 5 gigahertz band gives
us more bandwidth. And by the function of the fact
that it’s also new, not many people are using it. So I was the lone person
on that channel, and I ran the same test. And there you go, my 95th
percentile latency is down to two milliseconds. So as far as I’m concerned,
that was a fantastic investment. So if nothing else, you can go
back home, upgrade your router to 5 gigahertz, hope that not
many of you are living in the same neighborhood, and you’re
not competing with each other, and then you’ll get much
better performance, web performance. We just cut 50 milliseconds
of latency. That is huge for performance. So some takeaways from this. First of all, Wi-Fi makes no
guarantees about the bandwidth or the data rate you’re
going to get. Anybody else can start
transmitting at any point in time. So it’s unpredictable. One way to deal with
this is to adapt to the variable bandwidth. Not to predict it– you can’t predict it. You can only adapt to it. So we leverage this to, for
example, serve video. When we serve video on YouTube,
we serve it in small chunks, like 5 to 10 second
chunks where we stream a chunk of video, we see if you can
download it fast enough. If you can download it fast
enough, we can upgrade you to a higher quality stream. Vice versa, if your bandwidth
was all of a sudden cut in half because somebody else is
streaming a video, we will downgrade you seamlessly to
a lower resolution video. So that’s adaptive streaming. And that’s something you can do
as well if you’re streaming music, streaming video,
or other things. The other variable that you
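The adapt-don’t-predict idea can be sketched in a few lines: measure how fast the last chunk actually downloaded, then pick the next chunk’s bitrate with a safety margin. The bitrate ladder and margin here are made-up illustration values, not YouTube’s real logic:

```python
# Hypothetical bitrate ladder in kbit/s, lowest to highest quality.
LADDER = [250, 500, 1000, 2500, 5000]

def next_bitrate(chunk_bits, download_seconds, margin=0.8):
    """Pick the highest rung the measured throughput can sustain,
    keeping a safety margin so a small dip doesn't cause a stall."""
    throughput_kbps = chunk_bits / download_seconds / 1000
    usable = throughput_kbps * margin
    candidates = [r for r in LADDER if r <= usable]
    return candidates[-1] if candidates else LADDER[0]

# A 10-second chunk at 1000 kbit/s is 10,000,000 bits. If it arrived
# in 2 seconds, the link measured ~5000 kbps, so we can step up.
print(next_bitrate(10_000_000, 2))    # upgrade
print(next_bitrate(10_000_000, 40))   # bandwidth collapsed: floor quality
```

Because each decision reacts to a fresh measurement, the player tracks the channel as it changes instead of trusting a prediction that Wi-Fi cannot honor.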
can’t really control with Wi-Fi is, of course,
latency and jitter. So because we have these
collisions all the time, and your packets need to be
retransmitted, and there’s not much you can do. Wi-Fi provides no guarantees
about this. So if you’ve ever tried to
implement a real time chat or real time voice and video over
Wi-Fi, you’ve certainly run into this problem. So perhaps the only thing you
can really do is start looking into something like WebRTC which
allows you to leverage unreliable delivery and
basically UDP in a browser. And we had a great session
on WebRTC earlier today. So if you guys missed it, check
out the video for that. So that’s Wi-Fi. So that’s just a baseline. Now let’s take a look at
2G, 3G, 4G networks. How do they work and how
are they different? So first of all, when the mobile
standards started they set aside some specific
constraints that they needed to optimize for. First is they needed to
guarantee some level of performance. So they wanted to make sure
that they don’t have this congestion collapse scenario
which is present in Wi-Fi. They want to have some knobs to
tweak to say, if this is a really overloaded network,
we will lower the data rates for everybody. But you can still sort of
get a lower data rate. But we want to be able
to control that. And we need some
knobs for that. So that’s number one. Second is, of course we’re
covering a much larger geographical area, which means
that, once again, we need to handle more clients. And that’s something we
need to account for. And the third one is
our battery life. So it turns out, when you’re
optimizing for mobile devices, you really need to think
about battery life. I think, for many of us, if
we use our mobile device actively, you’ll find that your
batteries is draining very, very quickly. So radio is a very important
component of that. You add these two things
together, and you get the Radio Resource Controller. And what is the Radio
Resource Controller? So mobile networks, the 2G, 3G,
and 4G standards, take a different approach as to how
they schedule when you can communicate. Instead of doing this freebie
approach of you just start talking when you think nobody
else is talking, they’re actually employing a moderator,
which is to say, the radio tower actually tells
you when you can speak. You tell the radio tower
that hey, I would like to send some data. And then the radio tower
consults its schedule and tells you, OK, I’ve got
these five people that are queued up. You’re this far away. So you start transmitting at
this point in time with this signal power, with these
parameters, with this encoding, and then you have
this amount of time. So it’s a very different
scheduling mechanism. And as you can imagine, this
adds a lot more overhead. But it allows us to schedule
resources more efficiently within the network. And this obviously has
its pros and cons. So the Radio Resource Controller
lives within a different component of the
network infrastructure. In 2G and 3G network,
it was actually living in the core network. In the latest networks, it is
living right at the radio tower, which is one of the ways
of how we’re improving performance in 4G networks. By moving it closer to you,
we’re reducing the latency. But the Radio Resource
Controller has a number of implications. So the first of these is the
difference and the distinction between the control and the
user-plane latency. So whenever you talk to anybody
that’s working in the mobile space, they’ll often
throw this out and mention the control-plane latency. What is it? Recall that we need to talk to
the radio tower to figure out when we’re allowed to
transfer the data. This is what happens. First, you send a message to the
radio tower, and you say, I’d like to send data. The radio tower determines
when you can talk. And then it sends you
a message back. That is the control-plane
latency. And in 3G networks, this can
take up to 2 and 1/2 seconds. 2 and 1/2 seconds. We haven’t even sent a packet
of data, application data, from your phone, we’re just
basically trying to get a resource assignment from
the radio tower. In 4G networks, and the latest
generation networks, this time is significantly improved. It’s under 100 milliseconds. But nonetheless, every
millisecond counts, and 100 milliseconds is a significant
amount of time. And then only after we’ve
incurred this cost, and we’ve gotten our resource assignment,
can we then start transmitting data from our
phone to the radio tower. And that is known as the
user-plane latency. And for example, in 4G networks,
that can be about 10 milliseconds. So we are definitely in the
weeds here looking at how the mobile radio works. But some important
takeaways there. Let’s take this from
the ground up. Your phone is idle. It’s been asleep. You take it out of
your pocket. You start typing a URL. You hit go. What happens? So this is a 4G network. First, we need to talk
to the tower. So we’re going to incur
at least 100 milliseconds of latency. Then our phone is active and
starts transmitting data, or the radio is active, and it
starts transmitting data. After some time, it will
actually downgrade your radio into a short sleep cycle, which
is to say, it’s not going to listen for
transmissions all the time. It’ll sleep for some time, and
it’ll wake up periodically to save power. It turns out that the radio is
the second most expensive component in terms of power or
energy use in your phone after the screen. So this is why we want
to turn off the radio as quickly as possible. And in the case of 4G networks,
this is usually done after 100 milliseconds of
inactivity on your radio. So we’ve transmitted
some data. We wait for a 100 millisecond
pause, and then we downgrade your radio into this
half sleep state. And at that point, if you want
to transmit data again, we need to go through that whole
control-plane cycle once more. So you’re once again incurring
this same cost. But then, if your phone
continues to be idle, we wait for another 100 milliseconds. We go into the second, the
long sleep mode, and only after about 10 seconds,
we will go back into the idle mode. So a kind of pretty complicated
flow chart. But effectively, what this says
is, first, we have to incur the cost to upgrade
the radio. And then it takes roughly
10 seconds to get back to idle mode. And this is very important,
as you’ll see in a second. So let’s put this together. This is pretty low level. How does this affect an actual
data transfer, something like an HTTP data transfer? So let’s start from
the beginning. We want to send a request, which
means we may need to do
a DNS lookup. We need to do a TCP handshake,
followed by dispatching the actual HTTP request, and
then we actually need to fetch the content. And maybe optionally, we need
to do the TLS handshake, another up to two round
trips there. So let’s add up all
these numbers. First of all, we have the
control plane latency, which is the time to wake up
the radio and get into the active state. So I’m using HSPA here,
which is a 3G network. And I’m using an average round
trip time of about 200 milliseconds, which is roughly
the time that we see at Google for 3G networks. And then, for 4G networks, these
new generation networks, let’s use 80 milliseconds
which is actually fairly aggressive. So 200 to 2.5 seconds, so 200
milliseconds to 2 and 1/2 seconds, just to get out of control-plane for the 3G network. And then we have these round
trips to fetch the DNS, to establish TCP, and
all the rest. And very quickly, you add
up all these numbers. This is without your server
response time. We already have over
half a second of just network latency. And this is very important if
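The arithmetic above can be written down directly. This sketch uses the simplified round-trip counts from the talk (DNS, TCP, and the HTTP exchange at one round trip each, TLS adding up to two more); the control-plane figures are representative picks from the quoted ranges, not measurements:

```python
def request_latency_ms(rtt_ms, control_plane_ms, tls=False):
    """Network cost of one fresh HTTP request: radio wake-up, DNS
    lookup, TCP handshake, optional TLS, then request/response.
    Server processing time is not included."""
    round_trips = 3 + (2 if tls else 0)   # DNS + TCP + HTTP (+ TLS)
    return control_plane_ms + round_trips * rtt_ms

# 3G-ish: 200 ms RTT, control plane somewhere in the 200 ms - 2.5 s range.
print(request_latency_ms(rtt_ms=200, control_plane_ms=600))           # 1200
# 4G-ish: an aggressive 80 ms RTT, ~100 ms control plane.
print(request_latency_ms(rtt_ms=80, control_plane_ms=100))            # 340
print(request_latency_ms(rtt_ms=80, control_plane_ms=100, tls=True))  # 500
```

Even the optimistic 4G case lands in the 300 to 500 millisecond range before the server has done any work, which is the budget the rest of this section is built around.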
you’re trying to design a responsive application that
feels responsive to the user. So the good news is, things are
definitely getting better and better with 4G. We’re down into hundreds
of milliseconds and half a second territory. But we also can’t rely on 4G,
as you’ll see in a second. So really, you should assume
that there is literally seconds of network latency
overhead when you design your applications. So the first takeaway here is,
we know that there are some constants that good applications
need to follow. A number of different User
Experience Research studies have shown that, in order for
an app to feel instant, they need to respond to the user, it
needs to acknowledge user input within hundreds
of milliseconds. And we just saw that, even on
the latest 4G network, you’re incurring 300 to 500
milliseconds, which means that you, basically, you cannot wait
for the HTTP request, or any request, to complete. You need to acknowledge the user
input independently of dispatching the request. So acknowledge the input. Send the request in
the background. That is the only way you will
have an application that feels instant to the user. So all communication should
be asynchronous. Similarly, you should anticipate
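The acknowledge-then-dispatch pattern can be sketched with Python’s asyncio standing in for whatever async facility your platform provides; the function names here are hypothetical:

```python
import asyncio

async def send_request(payload):
    # Stand-in for the real network call; on 3G this can take seconds.
    await asyncio.sleep(0.5)
    return f"server ack for {payload}"

async def on_user_input(payload, ui_log):
    # 1. Acknowledge the input immediately -- never block the UI
    #    on the network round trip.
    ui_log.append(f"working on {payload}...")
    # 2. Dispatch the request in the background.
    return asyncio.create_task(send_request(payload))

async def main():
    ui_log = []
    task = await on_user_input("search query", ui_log)
    print(ui_log[0])       # the user saw feedback before the network replied
    print(await task)      # the response arrives whenever it arrives

asyncio.run(main())
```

The UI feedback lands within the hundreds-of-milliseconds "feels instant" window regardless of what the radio and the network do afterwards.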
the RRC latency. It’s a very common complaint
about mobile networks that, oh my god, they’re so
unpredictable, the variability is so high. Turns out, once you’re aware of
the control-plane latency, or of this negotiation at the
beginning, which is the time to wake up your radio, you can
model this stuff very well. And you can build, and you can
bake these things into the design process of your
application. Once you talk to the designers,
and you let them know that, hey, it may actually
take two seconds before we can do anything, like
even send an application packet, they can design your
application in a way that provides some sort of feedback
to the user that, hey, we’re working on processing
your input, but it may take a while. All right, so moving on,
we talked about energy. So this is another gotcha that,
I think, is top of mind for a lot of applications that
are native applications, but is not yet top of mind for a
lot of web applications. But it certainly will be soon. So notice that, in our earlier
diagram, we said it takes about 10 seconds to cycle from
a high power state into a low power state. This actually causes what is
known as an energy tail, where it doesn’t matter how much data
you have transferred, the radio will be active for a
certain period of time. Like, you could have transferred
one bit or one byte of data. It doesn’t matter. Or you could have transferred
10 kilobytes, or 100 kilobytes of data. The radio’s going to be on,
effectively, for 10 seconds after that. So you’re not going
to save much by sending data bit by bit. What you want to do is, you want
to send as much data as possible as early as possible. And in fact, intermittent data
transfers are a huge, huge performance anti-pattern
on mobile. So I have an example of this. An average mobile device today
has about five Watt-hours of battery capacity. And don’t worry about
the units too much. This is just to illustrate
a point. And five Watt-hours is
about 18,000 joules. And it turns out that an average
phone today on a 4G connection, in order to cycle
from a low state, to high state, back to low state,
will take about 10 joules of energy. Let’s do our math. Let’s say we have one minute
polling interval. So I built an application,
like a Gmail application, which, every minute, will wake
up and just ping the server and say, do I have an email? Are there any messages for me? You multiply that out, it
turns out that this application will consume 3%
of battery life per hour. Now you have a couple of
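That back-of-the-envelope calculation looks like this, using the figures just quoted (5 watt-hours of capacity, about 10 joules per radio cycle on 4G):

```python
BATTERY_JOULES = 5 * 3600   # 5 watt-hours is ~18,000 joules
CYCLE_JOULES = 10           # one idle -> active -> idle radio cycle (4G, approx.)

def battery_pct_per_hour(poll_interval_s):
    """Share of the battery burned per hour just cycling the radio,
    ignoring the cost of the actual bytes transferred."""
    cycles_per_hour = 3600 / poll_interval_s
    return cycles_per_hour * CYCLE_JOULES / BATTERY_JOULES * 100

print(round(battery_pct_per_hour(60), 1))       # 1-minute polling: ~3.3%/hour
print(round(battery_pct_per_hour(15 * 60), 2))  # batch to 15 min: ~0.22%/hour
```

Stretching the interval from one minute to fifteen cuts the radio-cycling cost by 15x, which is why batching and deferring beats polling.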
these applications running on your phone. They have non-overlapping
intervals. You have your phone in your
pocket for half a day, you take it out, your battery
is at 25%. And you’re wondering what
the heck happened there? That’s what happened there. We’re doing these intermittent
things and that was draining the battery. So intermittent data transfers
are extremely expensive. And there’s a really cool case
study that was done as a joint case study between
AT&T and Pandora. They analyzed the native Pandora
application, which is, of course, a music streaming
application. And then discovered something
interesting. Pandora was doing all
the right things. When they were streaming
music, you would start playing the song. They would stream the entire
song down to the client and just play the whole thing. So they would turn off the
radio, which is exactly what you want to do. But then, about every, I think,
58 seconds, or 60 seconds, Pandora would fire an
analytics beacon which was reporting how far along did you
listen in the song, did you like the song, and
all the rest– it seems reasonable. They analyzed it and realized
that those data transfers were accounting for 0.2% of the total
bytes transferred, but it was 46% of the battery life
of that application, which is a huge, of course, performance
problem. So all they had to
do was to move– these are not critical
beacons. They could simply defer that
until later, until the next song data transfer that could
accumulate this data. And they did exactly that, and
they significantly improved the performance of their
application. So here are a couple
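The Pandora-style fix can be sketched as a tiny batching layer. This is a hypothetical illustration of the idea, not their actual code:

```python
class BeaconBatcher:
    """Buffer non-critical analytics events and flush them only when the
    radio is already awake for a real transfer, such as the next song
    download, instead of waking it up for each beacon."""
    def __init__(self, send):
        self.pending = []
        self.send = send             # callback that performs the real upload

    def record(self, event):
        self.pending.append(event)   # no network, no radio wake-up

    def flush(self):
        """Call this while the radio is active anyway."""
        if self.pending:
            self.send(list(self.pending))
            self.pending.clear()

sent = []
batcher = BeaconBatcher(sent.append)
batcher.record({"event": "progress", "pos": 58})
batcher.record({"event": "thumbs_up"})
batcher.flush()                  # piggyback on an already-active radio
print(len(sent), len(sent[0]))   # one upload carrying two events
```

The events still arrive, just coalesced into transfers the radio was going to make anyway, which is what turned 46% of the battery cost back into almost nothing.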
of examples. It’s a little bit small, but I
was looking at, for example CNN.com here. And I noticed that they have a
real time analytics beacon installed on their site. So whenever you’re reading a
CNN article on your mobile phone, about every five seconds
Chartbeat sends a beacon saying,
I’m still here. Yep, I’m still here. Yep, I’m still alive. And so, as you’re reading
CNN, you’re just draining your battery, which is not
a great experience. So in Google Analytics, we
actually caught this problem early on. And we fixed it. And we don’t do this. We have a different way, where
we provide real time analytics, but we have a way to
do that without requiring these beacons. So the short takeaway is, if you
are sending these kinds of beacons in your application,
web or native, definitely something to reconsider because it’s extremely expensive. Now unfortunately today, I don’t
think many platforms provide very good
instrumentation or visibility into how much energy does
my application consume– whether that’s web or native. But one tool that is actually
very good is AT&T’s ARO (Application Resource Optimizer) tool, which is a free and
open source tool. And I will show you guys
a quick demo of this. I have it running here. So what it allows you to do is,
it allows you to capture a trace file. You can actually install an
application on your Android device, you hit Record, and
you interact with your application. Then you export a trace, and you
can analyze it with this analyzer tool. So I’ve already prerecorded a
trace, and I’m just going to show it to you guys here. So let me load this app here. And you can also run it in an
emulator, but I prefer to do it on my phone. So I loaded a website. I have this trace. One of the cool things is
that it can actually record your screen. So here I’m starting
Collector. And I’ll just fast forward
a little bit. So I’m loading the site
Red Robin here. And you can actually look at the
diagnostics screen here. And you can see this
line moving here. So this is us downloading the
page, and you can see the throughput, the different RRC
states, or the states of the radio, and the energy
consumption on your app. So let me pause this. And one of the cool things
about this tool is, it’ll actually also analyze the
content that you’re downloading and point
out common performance anti patterns. Things like, hey, you’re
not caching the data. Or you have intermittent
data transfers. But then, on top of all of that,
it’ll also tell you the energy consumed during this
session that you’ve recorded. So you can actually
model this. Now one thing to call out
is, this is not an exact measurement. This is based on a model. So it has some assumptions about
the phone that you’re using and the network
type you’re on. So you can actually switch those
and say, I want to run this model on a 3G network
instead of a 4G network. So if nothing else, this is
a great tool to play with. And I encourage you guys
to check it out. I hope that in the future we’ll
have more tools like this baked right into our kind
of day to day dev tools, both on native and web platforms. So it is all about
the battery. It literally is all
about the battery. Whenever you’re wondering why
doesn’t my mobile radio behave in a certain way, always ask
the question how does it impact my battery life? And you often find
the answer there. Radio at full power, if it’s
on all the time, will literally drain your battery
in a matter of hours. If you’ve ever had your phone
burning your leg, you know how that feels, what’s usually
happening there is, you can’t get a connection, and some
application is just continuously trying
to reconnect. And that turns on the
radio at full power. It drains your battery like
there’s no tomorrow. And the actual transfer
size does not matter. It doesn’t matter if
you’re transferring one byte or 100 kilobytes. That’s the other important
takeaway. So the consequence of that is,
you want to pre-fetch data. I’ll fetch the previews and the
thumbnails of my awesome application, and then, as you
scroll, I’ll fetch the rest. That’s an anti pattern
on mobile. You want to pre-fetch as much
data as possible, turn off the radio, and then hopefully
keep it off for as long as possible. And we already talked about
periodic data transfers. So what can you do there? You can defer the request
until later. You can combine them. You can stash that data in
a local database, like local storage, and then fire
the request later. Of course, we provide some
tools for this, both on Android and Chrome. So Google Cloud Messaging is
definitely something that you guys should check out
and leverage. And what it allows you to do is,
you can push notifications to our servers, to the Google
Cloud Messaging servers, and then those servers will try to
determine an optimal strategy to deliver those messages to
your phone by minimizing the amount of data transfers. So for example, we
just launched the support on Chrome. We don’t yet have feature
parity with Android. I hope we get there quick. But for example, on Android,
you can actually mark a message as delay while idle– which is to say, if the user’s
phone is idle, don’t push it right now. But when they wake up their
phone, push the message there. And not only that, but if you’re
going to delay the message, this is the
time to live. If the user doesn’t wake up
their phone within the next hour, just drop the message on
the floor because it’s no longer relevant. And it’s the combination of
these things that allow you to build efficient applications,
where you can have efficient delivery of these messages to
the phone without necessarily having to wake up the phone or
having your phone wake up periodically and query
your service, which clearly does not scale. So that’s a little bit about
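A downstream message using those two knobs looks roughly like this. The JSON fields (registration_ids, delay_while_idle, time_to_live) are from the GCM HTTP API of that era; this sketch only builds the payload and omits the authenticated POST to the GCM endpoint:

```python
import json

def build_gcm_message(registration_id, data, ttl_seconds=3600):
    """Downstream GCM message that defers delivery while the device is
    idle and expires if not delivered within ttl_seconds."""
    return json.dumps({
        "registration_ids": [registration_id],
        "delay_while_idle": True,     # don't wake an idle phone for this
        "time_to_live": ttl_seconds,  # drop the message after this window
        "data": data,
    })

# "DEVICE_REG_ID" is a placeholder for a real registration token.
print(build_gcm_message("DEVICE_REG_ID", {"badge": "3"}, ttl_seconds=3600))
```

Combining delay-while-idle with a time-to-live is exactly the pairing described here: the message rides along when the user wakes the phone, or quietly expires once it is stale.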
how the radio works on the phone itself. Now let’s take a deep dive into
how the mobile network, the core network, actually
works, and what implications it has on performance. So at a very high level, the
mobile network effectively has a couple of important
components. The first one is the
packet gateway. And the way to think about the
packet gateway is, it’s basically like your wireless
router, or your router at your house. It’s a NAT device which
accepts all of the connections. It terminates all of
the connections. And then it forwards the packets
to your device, in this case, a mobile device. And there’s an important point
here, which is, the connection
between the server and your phone is not
an end to end connection. It is being terminated
by this router. So just the fact that you’re
turning off your radio, or you’re pressing the power button
on your phone, does not terminate the TCP connection,
which I think is a common misconception. And if you’re ever seeing this
kind of pattern, where you have your code basically saying,
look, if I turn off the radio, I’m going to lose
my WebSocket connection or something else, that is not true
because the connection is still maintained by
the radio network. And it will wake up your radio
when it’s necessary. So if you have code like this
in your application, you definitely want to
turn that off. Most carriers actually have timeouts of anywhere from five to 30 minutes for a TCP connection. If anything, I've found that, by
working with many different apps, it’s usually the
application server on your side that’s terminating the
connection early on. Like you have aggressive
timeouts of 60 seconds. And then, because of that, your
app needs to periodically poll your server to
keep it open. So make sure that that
plumbing is correct. It’s not the carriers
in most cases. OK, so we’ve got the
packet gateway. The packet gateway has
no idea about the location of your device. What it actually does is, it
forwards the packets to the serving gateway. And the serving gateway needs
to figure out, where are you and where is your device? The trick is, the
serving gateway doesn’t know that either. It basically needs to query a
local database, like a user database, to say which radio
tower are you currently associated with. And how is your billing
status? Should I even be forwarding
this packet to begin with? So that’s this mobility
management entity. So just like a user database. That’s all it is. All right, so let’s try this. Let’s say we want to send a data
packet from your phone. We can actually connect all
the pieces together now. First, my device is
going to wake up. And the first step is, it's going to talk to the radio tower. It's going to negotiate the
times when it can send data. That will take hundreds of
milliseconds, or up to two seconds on 3G networks. After that, it will transfer
data to the radio tower. The radio tower will transfer
it to the serving gateway, which will forward it to the
packet gateway, and only then will it hit the external
network. And this is when the data is
transferred to your server and all the rest. The latencies for this
end-to-end transfer here, without the external network, are anywhere from 50 milliseconds to hundreds
of milliseconds on different networks. I pulled out these numbers from
the AT&T technical FAQ. So basically, the takeaway here
is 4G, 50 milliseconds, plus the transfer time to your server; on 3G, as high as 400 milliseconds; and much, much
worse for 2G networks. It turns out, that’s
a simple case. Let’s try the more
complicated case. And I hope you guys
stay with me here. Let’s say we actually want to
push a packet to the device. So we’re trying to wake
up the device. Our server pushes a packet
to the mobile carrier. It hits the packet gateway. That goes to the serving
gateway. The serving gateway,
once again, has no idea where you are. One of the nice properties of
mobile networks is, well, you guys are mobile. You hop into a car. Now you’re on your way
out of San Francisco. We have no idea where you are. So the serving gateway actually
talks to the mobility management here, and it says,
OK, I need to send this packet to this user. Tell me where to forward
the packet to. The mobility management entity doesn't
actually know your current physical location. It knows, kind of
geographically, where you are. OK, last time he checked in,
he was in San Francisco. So it sends a ping to
all of the radio towers in the vicinity. And says just flood the entire
network with a beacon that says, hey you, user number blah,
blah, blah, there’s a packet waiting for you. Identify yourself, please. Your radio wakes up. It gets that packet. It then talks to the tower. It says, hey, I’m here. I’m associated with
you right now. That data gets transferred
back. It goes back to the serving
gateway, and now it can transfer the data back to the
tower and to your device. This is pretty complicated. And the reason I’m pointing
this out is, oftentimes, there’s a question of
why does it take 200 milliseconds to do this? But clearly, this is a pretty complicated problem to solve. So the fact that we can do this
in 40 to 50 milliseconds in 4G is actually rather
impressive, to be honest. So this is a little crazy. But is it worth it? I think that’s a valid
question to ask. We went from Wi-Fi, where
we had nothing. We just said, just talk
and hope for the best. Cross our fingers. We went to this 4G network
interface, where now we have all these routers, radio towers
talking to each other, it’s all crazy. Where are we going? So this is a great case study
that was done a couple years back, actually, in 2011, where
they measured performance of Wi-Fi networks, typical
home router networks. I think this was done with
802.11g, so a fairly new standard, and they compared
it against LTE. So just focus on these
two clusters here. And what I want to point out
is, first of all, the throughput is actually better. But most importantly is this
graph right here, which is the round trip time for
our packets. With Wi-Fi, there’s no
guarantees about the latency of the delivery of
your packets. And that’s why you have these
giant tails here. Generally speaking, the latency
is fairly low, but then you have these outliers,
which is why this line is stretched. With LTE, even though it’s
so much more seemingly complicated, or not just
seemingly, it is more complicated, we can actually
deliver more reliable latency and lower jitter across
the network. So this is good. This is great. 4G will make things better. So a couple of things. Mobile radio is optimized for
burst data transfers. It is not optimized for sending
bits in small chunks. What you want to do is, you want
to transfer as much data as possible and then
turn off the radio. And in fact, the latest 4G
networks can transfer on the order of tens of megabits per
second, which is really high data rates. But it will do so in very
small assignments. In fact, the bandwidth
assignments are done in millisecond and lower chunks. So you get a chunk of airspace
for about one millisecond, and you can transfer huge
amounts of data. So pack as much data
as you can. Group your requests. Don’t delay your requests. The not so good news part of the
4G world is that 4G will take a while to come. Despite the fact that there are ads everywhere, across all the highways and everywhere else, saying that 4G will solve everything, 4G will take a while to deploy. Current carriers have
deployed a lot of infrastructure for 3G. And they’re continuing to
improve that, in part because they can, and in part because deploying new infrastructure is very expensive. So the dominant network type
of this decade, not just of this year, or in the coming
year, of this decade, will be 3G networks. And granted, they are being
enhanced to deliver higher data rates. But nonetheless, you can see
that the growth, overall growth projections, are
for 3G networks. The good news is, at least
within North America, LTE and HSPA+, which are the 4G networks,
are actually taking off quite well. In fact, we are way ahead of the
curve compared to all the other countries. But nonetheless, you have to
basically assume that your users will be using a mix
of 3G and 4G networks. And even if you have a 4G
network data plan on your phone, your phone is switching
between 4G and 3G all the time, depending on coverage, where you currently are, and load within the network. So because of that, you have to
design for variable network performance and availability. It is truly a multi-generation
future. You can’t assume that you’ll get
great performance with 4G. You need to plan for 3G
networks as well. Bandwidth and latency
are variable. And of course, as we’ve all
experienced, connectivity is intermittent. So if you’re building a mobile
application, or an application for the mobile web, you have to
assume that there should be an offline mode or some fallback
to say what happens when I can’t actually connect. And I mentioned this before,
but I think this is an important point to
make once again. You should have some sort
of a backoff strategy. Oftentimes, the reason we have
poor battery performance is because some application does
not plan for intermittent connectivity, or the fact that
there is no connectivity, and it just continues to poll
the server to say, are you there yet? Are you there yet? Are you there yet? And that’s what’s burning your
leg and burning the battery on your device. What you want to do instead is
to say, I’m going to try to connect now. And then every time I fail, I
will just punt further out into the future, have some sort
of a decay function, and then say I’m going to stop
after this interval, and you’ll retry later. Nine cases out of 10, whenever
I have a performance or a battery life problem, I track
it down to exactly this. It’s some application
that’s just sitting there, in a loop– while not connected,
keep trying. Not a good strategy. So with that, I think we’ve gone
through the full cycle. And we’re back to application
best practices. Of course, all the radio stuff
is very important. But optimizing TCP best
practices, TLS, and HTTP is very important. The takeaways here are, of
course, measure first. Make sure you’re using real user
measurement, and you’re measuring performance across
real networks. Do that first, and then
optimize later. And I’ll just call out a couple
of examples that I think are very important. The fastest request is
a request not made. This is an obvious one. But it turns out, when we
analyze a lot of mobile applications, they’re
not caching data. This is the number one problem
for a lot of applications. So make sure you do that. Bytes are literally expensive
for a lot of users. You need to compress
resources. And you know, it’s funny because
we’ve been talking about compressing resources,
compressing images, for years. But we still find that a lot of
applications don’t do it. And of course, leveraging
formats like WebP, as you guys heard at the keynote,
and we had a session earlier today on WebP– definitely encourage you to try
it both on web and native. And then, finally, shameless
self promotion. I’ve been writing a book on this
stuff, and, specifically, on mobile radio performance
and other things. And it’s available
online for free. So if you guys are curious to
learn more about this, and I can talk about it all day,
please check it out and please offer feedback. You can actually comment
right on it. And with that, I think we have
some time for questions. If you can grab one of the
mics, and I’d be happy to answer them. [APPLAUSE] AUDIENCE: Hi. My name is Maurice. I work for Verizon Wireless. I have a question regarding APIs
or services that can be provided by the Android
framework itself to allow developers to make applications more efficient. For example, the same way Android today doesn't allow you to put network access in your main thread, is there anything planned in that regard? ILYA GRIGORIK: So I have more
experience with the web part of the stack. So I can’t necessarily comment
on the Android part. I would actually direct you to
the Android guys sitting outside for that. But something like GCM, we
continue to enhance GCM, or Google Cloud Messaging, and
that’s what you want to use for a lot of communication. We just recently announced the
ability to actually send data through GCM as well. AUDIENCE: Thank you. AUDIENCE: So my understanding
of why LTE is much better is partially because of the
modulation that it uses. Since LTE uses OFDM, and Wi-Fi
uses something different. Do you know if there’s any
industry trends to move the modulation scheme of the access
points that you might have at home, to use that
technology over what we have today, since it’s so jittery? ILYA GRIGORIK: So there’s a
couple questions in there. So LTE has a host of different
improvements. So they’re basically redesigning
the network from the ground up. They’re moving the scheduling
stuff into the edges of the network. They’re using a new modulation
scheme, as you described. So it’s not just a modulation. Because that helps you
with throughput. But there’s a number of other
variables, like the energy use, the latency of establishing the connection, and all the rest. For moving that same technology
into, let’s say, your local access points,
yes to some degree. So Wi-Fi standards and LTE,
or 3GPP, standards are completely separate. But just watching both of them,
they do borrow and steal from each other. So with the latest Wi-Fi
standards, you’ll find that the way they’re achieving
gigabit data rates is by using the same tricks, to
a large degree. AUDIENCE: So do you know
if that's in 802.11ac? ILYA GRIGORIK: Yeah, that's
in part how they’re– yeah, exactly. AUDIENCE: Thank you. AUDIENCE: I’m fairly new
to Android development. And this may have been covered
in the messaging, cloud messaging feature. I was just wondering, is there
any currently, or plans to provide support for aggregating
requests from the OS level or framework to, if
you’ve got multiple apps running that all have a stupid
beacon request, is there any support for ganging these
together, so multiple apps kind of work with each other
instead of against each other to reduce radio usage? ILYA GRIGORIK: Yeah, so I
think that’s definitely something that we’re
looking at. And that’s something that only
the platform can provide. As an application designer,
you can’t control what the other applications are doing. So this is definitely something
that we’re thinking about, both on the web, like
on Chrome, for example, how can Chrome leverage something
like this, and also within the Android platform. So a good example of this is
turning off the radio early. If we know that there’s no other
connections being made, we can terminate the
connections early. That can be done by the
operating system in conjunction with the Radio
Resource Controller. So yes, there’s definitely
work in that direction. AUDIENCE: So that’s
nothing current, but it’s in the works? ILYA GRIGORIK: Yeah, yeah. AUDIENCE: Do you have
recommendations for tools just as a user, to help figure out
what’s killing your battery on iOS and Android? ILYA GRIGORIK: Yeah, so
I hope we make that tooling much better. I guess a couple of tips that
I use, one is, you actually have the battery panel
within Android. If you go into your settings and
kind of navigate down, it actually gives you a really
good breakdown of which applications consume power. And if you click on the graph,
it will also show you kind of a breakdown of when the power
is being consumed. So I’ve found that that’s the
most effective way to identify specific applications. I think we need to do more
to isolate, like, at a webpage level. There’s nothing really
like that. And I think we need something
like that. So I hope we’ll have a better
answer in a year’s time. AUDIENCE: Thanks. AUDIENCE: Hi, could you talk a
little bit about strategies that you would use for games? Because they have a completely
different data access pattern. And they typically talk
UDP instead of TCP. And they require lower latency
in case you’re playing with someone else. ILYA GRIGORIK: So all of the
same optimization strategies would apply there. It doesn’t matter if it’s
TCP or UDP, you need to wake up the radio. So to the extent possible, you
want to aggregate data. You don’t want to be beaconing
back every single achievement unlocked by the user if that
can be kept on the device. So once again, leveraging things
like GCM and others to aggregate that data,
it’s all the same. The protocol doesn’t
actually matter. AUDIENCE: Thanks. AUDIENCE: As a JavaScript
developer, the best practices, as you said, is to batch API
requests or other requests. Could you comment on the beacon
API, which would help us tap into when the radio is actually active? ILYA GRIGORIK: The beacon API. So are you talking about
the new proposal? AUDIENCE: Yeah, and then what
the status of that is. ILYA GRIGORIK: Yeah, OK, so in
the Web Performance Working Group, we're working on a couple
of proposals, I guess. One is the beacon API, which is
to say, I want you to send this request, but I don’t
actually care when you send it. Just send it when it’s
convenient for you. A good example of that
is analytics beacons. You don’t want to wake up
the radio right now. You’re just saying, defer
this and dispatch it at some later time. So there’s no implementation
of that yet. I think we have early
drafts of the spec. If you guys are interested, I
would definitely encourage you to check out the working group
and comment on it. But hopefully, soon, we’ll
have something. AUDIENCE: OK, thanks. AUDIENCE: Can you talk about
what tools you used to look at your Wi-Fi networks
in those slides? ILYA GRIGORIK: What is the
tool that I used in– you know, I don’t remember off
the top of my head, but if you talk to me afterwards, I can
find it on my laptop. AUDIENCE: A question
about analytics. ILYA GRIGORIK: Yes. AUDIENCE: So what solutions are
you currently working on at Google to not lose data, and to reduce battery consumption? ILYA GRIGORIK: So for analytics,
what you generally want to do– so this is actually
related to the previous question about
the beacon API. So oftentimes, you can report
the data later. So one strategy is to literally
stash the data into your local database and
then report it when the radio is active. This requires some modifications
of your application itself. But for example, on the real
time analytics, we don’t actually ping back every
so often to just say that you’re there. We do the real time analytics
differently, which is to say, we track when you first visit
the page, which is when you hit that beacon, and then we say
that you’re active on the page for the next 30 seconds. And if you navigate to
another page, then you’re still active. So we’re not forcing this
five second interval. And I think we need to talk to
the other vendors and get them to move in the same direction
because they’re literally costing a lot in terms
of performance. AUDIENCE: And about
Google Analytics, an additional question. Will there be a separate API
extended for request batching? ILYA GRIGORIK: Can you say
that again, sorry. AUDIENCE: So for example, we
have a start of an activity. And we are reporting when the user starts an activity, and when he leaves it. And there's a chain of them. And will the API of Google Analytics be extended, so we can buffer this, like put events into a buffer, and report them later, or something like this? ILYA GRIGORIK: That's
a great idea. So there’s three ways
to tackle that. Google Analytics can provide
a new API to basically say, stash this and send it later. You can implement your own
wrapper around it, and basically say, I’m going to
fire this later, when I consider that it’s active. And then the best approach, I
think, is a combination of the two, which is the beacon API,
which we were just talking about, which is basically a
browser mechanism, to say we’ll defer this to the browser
to determine because it’s actually in the
best position. So today the fastest way to get
what you’re talking about is to implement it in
your application. I can certainly talk to the
Google Analytics Team and see if there is interest
in implementing something like this. I think that’s a great idea. And then, maybe in a year’s
time, we’ll have something like the Beacon API. AUDIENCE: Thank you. ILYA GRIGORIK: Yeah. SPEAKER 1: We’re
way over time. [INAUDIBLE] ILYA GRIGORIK: All
right, one more. AUDIENCE: My question is
related to the Nexus 4. I’ve always used Verizon, and
I’ve never had any signal reception issues until I
switched to T-Mobile. At my house, on the first
floor, I only get edge. On the second floor, I get
edge, but it’s iffy. On the third floor, I’ll
get edge and HSPA+. Now this is the thing, though. With this phone, the decibel
rating, from what I’m reading, is like around 100. If I hold the phone in my
hand, I lose signal. ILYA GRIGORIK: You’re
holding it wrong. AUDIENCE: But this
is the thing. If I’m on a phone call, and
then I hold it, it doesn’t drop signal. So my question is, is there a
way that I, as an end user, can make the phone put more
power to the radio to hold the signal because I can’t
change T-Mobile. ILYA GRIGORIK: So the
short answer is no. The energy use is actually
dictated by the radio tower. It actually tells you, depending
on where you are, the distance between you and the
radio tower, the amount of other people talking, the exact
signal power that you should be transmitting. So in fact, the radio tower is
actually trying to help you optimize your battery use, which
sounds like, in this case, it may be working
against you. But there’s no explicit
control of that, unfortunately. AUDIENCE: OK. ILYA GRIGORIK: Great,
thanks guys.
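One editor's note to close out the transcript: the connection-retry pattern Ilya recommends (try to connect, back off with a decay function after each failure, and give up after some interval instead of looping while disconnected) can be sketched in a few lines of JavaScript. The function names and parameters below are illustrative, not from the talk or from any particular API:

```javascript
// Sketch of the "decay function" retry strategy from the talk:
// instead of polling in a tight loop while disconnected, wait
// longer after each failed attempt, then give up and defer the
// retry to some later event (user action, push message, etc.).

// Delay schedule: baseMs, baseMs*factor, baseMs*factor^2, ...
function backoffDelays(baseMs, factor, maxAttempts) {
  const delays = [];
  for (let i = 0; i < maxAttempts; i++) {
    delays.push(baseMs * Math.pow(factor, i));
  }
  return delays;
}

// Generic retry driver: tryConnect() resolves on success and
// rejects on failure; sleep(ms) resolves after ms milliseconds.
async function connectWithBackoff(tryConnect, sleep, baseMs, factor, maxAttempts) {
  for (const delayMs of backoffDelays(baseMs, factor, maxAttempts)) {
    try {
      return await tryConnect(); // connected, stop retrying
    } catch (err) {
      await sleep(delayMs); // failed: punt further out into the future
    }
  }
  return null; // give up for now; retry later, don't spin
}
```

With baseMs of 1000 and a factor of 2, the schedule is 1s, 2s, 4s, 8s, and so on. A production version would also add random jitter so that many clients do not retry in lockstep, and would reset the schedule after a successful connection.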
