(Editor's note: transcripts don't do talks justice.
This transcript is useful for searching and reference, but we recommend watching the video rather than reading the transcript alone!
For a reader of typical speed, reading this will take 15% less time than watching the video, but you'll miss out on body language and the speaker's slides!)
I totally didn't realize I was clicking, backstage. [LAUGH] All right. Hey, everyone! Thanks for coming to my talk. I'm especially impressed, because y'all could have taken an after-lunch nap, but you're here. So we know that shit is necessary. All right.
So, I'm super-excited to be here. My talk is called "Send Me Your Echolocation." It's about monitoring performance of a distributed network. Any distributed-systems people here?
[VARIOUS CALLS OF "WOO!" FROM AUDIENCE]
I love how distributed y'all are. [LAUGH] Any networking people here?
[A FEW "WOOS"]
All right, sweet! So, when I started building these tools, I found that they're a lot like echolocation that whales and dolphins use. Any whales and dolphins in the building?
Oh my god! We are many! OK. So I'm Vicky Nguyen. I'm a network systems engineer at Fastly, which is an edge cloud platform. I'll get into that later.
I'm Wicky Vinn on the internet. I mostly tweet drunken nonsense to my 100 or so followers, half of whom I'm pretty sure are bots. So, if you like that kind of thing, you should follow-- or if you're a bot. OK.
So this talk will have four-ish parts to it. The first part will be fundamental networking and routing vocabulary and concepts we should become familiar with. The second part will be how a distributed network works and what we're looking for when we're monitoring this huge distributed network. And the last bit will be tools, tools, tools-- what are tools that we can use, to do that?
And, spoiler alert, the conclusion will be about how distributed systems need more evolved networking tools than the usual, because, when you operate a distributed set of data centers, the network can break at any time, between any pair of hosts. And so you need something that monitors and collects data about how the network is behaving at any time, real-time.
So we built an API that does that. And when I say "we," I mean "us," as, like, a collective group at Fastly, not me. But I'm the one that's here. So, look at me. OK. So, um--
But, in order for me to explain how monitoring a network works, we kind of need to know the elementary concepts of routing. So let's do that! So maybe you've heard the analogy about how the internet is like a postal service. So you have a package. You put it in a mailbox, which is kind of like the gateway to the internet, or to the postal service, kind of like the internet.
And it makes it to the postal service. And then it gets scanned for its destination. And then it gets put on a truck, gets put to another post office, puts on another truck, another post office, all the way until it makes it to its destination, right?
So data travels through the internet by way of packets, which are, in this analogy, like packages. And those contain bytes of data and metadata that help them get to their destination, including the source IP address and the destination IP address. So the internet is like the postal service. Right? It's a bunch of separate corporate entities, represented by networks. And you, as a user, say you want to make a request to a website.
So your computer will mail a packet to your router, the gateway to the internet. And the router will then forward your packet to an internet service provider, which is kind of like the post office. And, hopping from place to place, your packet travels networks, to finally make it to its destination.
And in networking we call each place a packet lands a "hop," like what bunnies do. And somehow each hop is smart enough to send your packet to its destination. But how? So, for this talk, we'll denote hops as this little router-server-y looking thing, or maybe the most adorable e-cigarette you've ever seen in your life.
I don't know, Seattle-- whatever works for you. So hops know how to send your packet from one place to another. And they do that kind of like the post office. They have an address book. And it's called the "routing table."
And so here's a simplified version of the routing table you see in Linux. And when a packet comes in, the host checks to see if the packet's destination IP address-- in this case, let's say it's 184.108.40.206-- closely matches any of the destinations, through an algorithm called "longest prefix matching." I'm not going to go into that now, but if you're, like, into trees or algorithms about radix trees, you should look into that-- ya nerd.
And, after finding the closest match, the host will forward the packet through the nearest gateway. So let's say that your packet is going from your computer to your router. So your packet will say, hey, I'm trying to come here, and your computer will be like, all right. You should probably go through this gateway-- which will then hop to the next router. And so on, until you reach the destination.
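To make the routing-table lookup described above concrete, here is a minimal sketch of longest prefix matching. It is in Python purely for illustration; the route entries, gateway addresses, and interface names are made up, and a real kernel uses a trie rather than this linear scan.

```python
import ipaddress

# Hypothetical, simplified routing table: (destination prefix, gateway, interface).
ROUTES = [
    ("0.0.0.0/0", "192.168.1.1", "eth0"),    # default route, matches everything
    ("10.0.0.0/8", "10.0.0.1", "eth1"),
    ("10.20.0.0/16", "10.20.0.1", "eth2"),
]


def lookup(destination, routes=ROUTES):
    """Return (gateway, interface) for the most specific prefix containing destination."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, gateway, interface in routes:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            # A longer prefix length means a more specific (closer) match.
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, gateway, interface)
    if best is None:
        raise LookupError(f"no route to {destination}")
    return best[1], best[2]
```

A destination such as 10.20.3.4 matches all three entries, and the /16 wins because it is the longest prefix; anything outside 10.0.0.0/8 falls through to the default route.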
So let's summarize what we have so far. We have packets, like packages in a postal service. They move from hop to hop through a routing system. And each hop knows how to forward packets to the next hop, using routing tables.
That's pretty good, so far? All right, Routing 101-- check. Routing 102, let's go. OK.
So we have transit providers. Right? The internet isn't really as simple as this illustration we've had. It's not a linear path. That would make things a lot easier, but it would not at all be resilient or reliable.
Instead, the internet is-- it's more distributed to be able to withstand failures. So it looks a little bit more like this. Can you all see that? All right. Oh, well.
So packets can take many permutations from their source to destination. And so, if we think of the internet as being made up of a bunch of separate corporate entities, each owning a different part, internet service providers, transit providers, offer passage from one part to another. Which is why, so often, Networking 101 classes portray networks like this. They're a bunch of, like, interconnected hops, with numbers between them signifying some kind of cost measurement to get between them, whether it's time or distance. But maybe networks should be more often portrayed like this, like weird, blobby things that indicate where the boundaries of networks are, and how routers sit between them.
So let's summarize what we have so far. Routing paths are not linear, nor are they consistent. In fact, if you were to send an email, then your email would be broken up into many different packets, and those packets are going to take many different paths to get from your computer to your mail server. And transit providers offer passage through networks through different paths.
Routing 102-- check. You guys are acing this class! OK, so let's talk about echolocation.
So dolphins-- you guys in the audience, apparently-- use echolocation to figure out how far things are. Right? A dolphin makes a noise. The noise travels through water. The noise eventually hits an object and then comes back to the dolphin.
And the dolphin now has this mental map of what's in front of it. So, very similarly, in networking, we have a utility called "ping." So ping has a source emit an ICMP packet to a destination. Does anyone here know what ICMP stands for? Shout it out?
I can't see or hear you! I just realized. Yeah, you're right. [LAUGH] Internet Control Message Protocol-- ICMP packet to a destination. And if the host is alive and responding, the packet will be returned. And the time that the packet takes to get from source to destination and back is called the "round-trip time."
So, if there's a slowdown somewhere on the wire, or if the destination is down and a packet is dropped, this will be reflected in the results of the ping. But otherwise you'll see something like this. It will have the round-trip time of how long your packet took. And, since we sent one packet and we received one packet, then the packet loss was zero.
And you've probably seen this dialog before. You've maybe used it to test your internet connection, from your laptop. We are sending 10 ICMP packets to Google's public-facing DNS IPv4 address, which is 8.8.8.8. And each packet takes around 7 to 11 milliseconds to get there and back. And no packet loss.
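The two numbers ping reports, round-trip time and packet loss, are simple arithmetic over the replies that came back. A small sketch, assuming the RTTs have already been collected (the function name and the result layout are illustrative, not ping's actual output format):

```python
from statistics import mean


def summarize(sent, rtts_ms):
    """Summarize a ping run: `sent` probes went out, `rtts_ms` has one RTT per reply."""
    received = len(rtts_ms)
    # Loss is the share of probes that never got a reply.
    loss_pct = 100.0 * (sent - received) / sent
    if not rtts_ms:
        return {"sent": sent, "received": 0, "loss_pct": loss_pct}
    return {
        "sent": sent,
        "received": received,
        "loss_pct": loss_pct,
        "min_ms": min(rtts_ms),
        "avg_ms": mean(rtts_ms),
        "max_ms": max(rtts_ms),
    }
```

One probe sent and one reply received gives 0% loss, as in the single-packet example; ten sent with eight replies gives 20%.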
So that's ping! Pretty good, so far? Cool.
So ping has an older sister, and she just got back from studying abroad in Paris. And she smokes cigarettes and drinks coffee, and she's super-sophisticated now. And her name is "Traceroute." So--
[LAUGH] Traceroute also sends ICMP packets from a source to a destination, but it does it for every time to live. And now we have to ask ourselves, what is "time to live"?
Well, "time to live" is a misnomer. It's a liar, and you should never believe it. Because "time to live" has nothing to do with time. It's an integer that refers to the number of hops a packet will go. Allow me to demonstrate.
All right, so you start a traceroute tool from your command line. The tool sets TTL, Time To Live, to 1. And it sends packets to only one hop. And then it returns and records its findings, including the IP address, RTT, and loss.
And then traceroute increments time to live to 2 and sends packets to only two hops. And then it returns and records its findings, including the IP address. And now see what happens, here.
Traceroute sets time to live to 3, and it tries to ping the third hop, but the hop is not responsive. We can't actually access it. And this could be because it's a private network, or maybe the hop has been configured to drop packets without sending an ICMP message back.
And sometimes this happens in networks, where you can access destinations but you can't access the hops between them. Because packets will drop there. And it really does testify to the usefulness of a distributed network, because you don't actually need to have connectivity from your source to your destination, necessarily; you just need to make sure that every hop next to each other has connectivity to each other. So, this way, networks can configure part of themselves to be private. Which is pretty cool.
Anyway, traceroute records its findings, which is confusion and betrayal.
And then it will finally reach its destination, report its last finding, and that's it! So, IRL, this is what traceroute looks like. It's called My Traceroute, MTR, on the command line-- My Traceroute, which is the cutest name. And it sends three packets to each hop, from here to Google's public IPv4 address, 8.8.8.8.
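The TTL walk described above can be sketched as a tiny simulation. Nothing here touches the network: the path is a made-up list of hop addresses, with None standing in for a hop that drops probes without answering.

```python
# Hypothetical path: each hop's address, or None for a hop that silently drops probes.
PATH = ["192.168.1.1", "10.10.0.1", None, "8.8.8.8"]


def probe(path, ttl):
    """Simulate one probe: every hop decrements TTL, and the hop where it hits 0 replies."""
    for hop in path:
        ttl -= 1
        if ttl == 0:
            return hop  # None means the probe timed out at this hop
    return path[-1]  # TTL was larger than the path, so the destination answers


def traceroute(path, max_ttl=30):
    """Raise TTL one step at a time and record which hop answered at each step."""
    destination = path[-1]
    results = []
    for ttl in range(1, max_ttl + 1):
        hop = probe(path, ttl)
        results.append((ttl, hop if hop is not None else "*"))
        if hop == destination:
            break
    return results
```

Running `traceroute(PATH)` records the first two hops, a `*` for the silent third hop, and then the destination, which mirrors the walk-through in the talk.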
So say you're a network engineer, and you get a Zendesk ticket from a customer that's complaining about there being down times between two hops. You would hop on to a hop and just start pinging and tracerouting the shit out of it. And that is actually how a lot of that debugging happens. The majority of networking tools in network optimization rely on some derivative of these tools. And traditionally they'll suffice.
So let's summarize what we have so far. We have tools called "ping" and "traceroute," and they give us information about paths taken from a host to a destination, with stats like RTT and hop addresses and your packet loss.
So now you're all super-knowledgeable in networking. And someone comes up to you and says, hey, now that you know that stuff, can you please make sure that this distributed network is being monitored properly? And you're like, oh, yeah, totally I can do that. And then you scurry off to figure out what that means. [LAUGH] So now that comes to Fastly.
Also, I hope it's becoming clear that this whole, like, hypothetical situation is autobiographical. I'm just telling you my coming-of-age story through a convoluted networking talk. OK.
So, Fastly. Fastly is an edge cloud platform. We have data centers all over the world. Here's a map of the data centers that Fastly has now. Of course, one day we'll all be under water. And--
--the world is just the Midwest and Wakanda.
But, um-- yeah, Wakanda forever. But, for now, we'll need to provide a global network to handle our customer traffic. Right? And our customer traffic could range from your tiny, little github.io page that you use to, like, talk about yourself, to supporting huge Twitter surges during Coachella because Beyoncé's on.
And how it works is like this. Say you're a customer, and you run a website called Sports.company. And you're based in Seattle. So your origin server is here.
And you've decided that you want a global audience. You want everyone to know about sports. So you decide that you're going to use Fastly, and you can leverage our data centers to use as edge servers.
So your global audience has users in Tokyo. And there's a user in Tokyo that says, hey, what's the score to the game? Then that request is going to be routed to the closest data center, which is in Narita-- which is in Japan, in case you didn't know.
But Narita doesn't know the score yet, so Narita has to send a request to the origin server. And remember that there are all these, like, intermediate hops between each of these hosts. So Narita data center makes a request to the origin server. And the origin server has a sports database on disk. Right? So it says, yeah, I know the score, actually.
And that request comes back, and the results are cached in the Narita data center, so that all the other requests coming from Tokyo in that area will then go to Narita-- which is a lot faster than going all the way to Seattle. So you're effectively killing 2 million birds with one stone. It's like, if packets learned how to carpool, or packets have been, like, taking Lyfts this whole time-- they're like, oh, you can take Lyft Line! And so that's what they're doing.
So your origin server now is getting real hot, from all these data centers making requests, asking for the latest score. And they're like, well, it'd be really nice if there was a data center that could kind of handle all these, like, often-made requests for us. And so you, as Sports.company, could designate a shield. And so what the shield would do is this.
So, whenever the origin has new data, then the data center right next to it could grab data from it, real quick, because it's real close. And then requests that would normally go from the other data centers to your origin will actually go to your shield, increasing the chance of a hit for that resource. So your origin server could rest easy, because now it doesn't have as much work. And generally, uh--
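The edge-plus-shield arrangement can be sketched as a chain of caches that each ask the next one up on a miss. This is a toy model only: the class names, the extra edge locations, and the score value are invented, and there is no expiry or purging, which a real cache needs when scores change.

```python
class Origin:
    """Stands in for the customer's origin server and counts how often it is hit."""

    def __init__(self, data):
        self.data = data
        self.requests = 0

    def get(self, key):
        self.requests += 1
        return self.data[key]


class CacheNode:
    """A data center that answers from its own cache, else asks its upstream."""

    def __init__(self, name, upstream):
        self.name = name
        self.upstream = upstream  # an Origin, or another CacheNode acting as shield
        self.cache = {}
        self.requests = 0

    def get(self, key):
        self.requests += 1
        if key not in self.cache:  # miss: fetch once from upstream and keep it
            self.cache[key] = self.upstream.get(key)
        return self.cache[key]


origin = Origin({"/score": "3-2"})
shield = CacheNode("seattle-shield", origin)
edges = [CacheNode(name, shield) for name in ("narita", "london", "sydney")]

for edge in edges:
    for _ in range(100):  # 100 users per edge ask for the score
        edge.get("/score")
```

In this model 300 user requests turn into three requests to the shield (one miss per edge) and a single request to the origin, which is the load reduction the shield is meant to provide.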
[LAUGH] Some Homestar Runner fans in this space. [LAUGH] So, normally, Fastly data centers are more equipped to handle a lot of requests than origin servers are. So we have our shield. And we have all these data centers querying this shield for the latest scores. Right?
And that's totally fine. But, as you know, scores change. And so, when the shield has a new score, data centers will make another request, and so on.
And so we have all these data centers that are, like, hella thirsty, and they're all over the world, sliding into the shield's DMs.
And the user experience is now completely-- not completely, but, like, super-based on the efficiency of these hops. Right? So we actually want to optimize paths the best we can, make them as fast as possible. And so you can see that optimizing the path between data center and data center is really important, here.
So let's summarize what we have so far. We've established how Fastly can enable your website, application, service, whatever, to cache data on edge servers, and how a shield data center could act as a second origin. And, because of that, we now realize that we need to monitor inter-data-center traffic, to make sure that those are taking the fastest paths possible.
And there are a lot of networking tools out there. Like, if you ever can, you always want to steal before you build something yourself. And we use a myriad of vendors. But the problem is that none of them really enable the multipath data-center-to-data-center measurements that we need. And I'll show you what I mean.
So, for instance, we use a monitoring software that sets up probes strategically everywhere in the world. And then they'll just, like, probe our data centers. And so here we have a probe, near the user in Tokyo. And it's making requests to our data centers.
But the problem is, we don't actually know what's going on between data center and data center-- because that's what's important, right? And we aren't able to measure all the other paths that are available. Like, we're only measuring the path that's used.
Here's another example. We use an open-source software that provides a monitoring UI. And it allows us to set up daemons in data centers that ping destinations of our choice.
But that only focuses on the used path, not all the other hops that we could be taking. And if we want to find the fastest path, we need to look at those other hops, right? So is there a way for us to actually get multipath measurements? And obviously the answer is yes, because otherwise I'd just leave right now.
So we talked about our Fastly data centers. Let's zoom in on what's going on inside one of those. So we have this Fastly cache. And we call them "caches" because they cache things. But just know that they're hosts, like everybody else. And we talked about hosts.
When a packet lands on a Fastly cache, it would act the same way as every other host. It'd look on its routing table, to decide where to send the packet, right? To find the next hop.
But routing tables in Fastly caches are a little bit more interesting, because we can leverage this thing called a "forwarding mark." And a forwarding mark is a socket option you can set on an ICMP connection. And so we're very sneaky and very clever, and we decide to assign a forwarding mark to every transit provider and interface pair. In other words, we're assigning a forwarding mark to a certain route that a packet can take from that source.
So let's say we pair every little interface that we have with a transit provider. So we're like, OK, VLAN 100 and Cogent, you guys get number 1. And VLAN 200 and Cogent, you guys get forwarding mark number 2. And we do that all the way down, on every transit provider that we're peering with.
And so, if a packet arrives on a Fastly cache, and no forwarding mark is set, then the cache will just be like, all right, I know what to do. I'm going to look at the main routing table and send you through the default path. But, if a forwarding mark is set-- say it's set to 1-- then the cache will be like, oh, I'm going to look on the routing table that someone has configured me with, associated with the forwarding mark 1, and send you through Cogent and VLAN 100.
And so on, with, like, forwarding mark 4. I'm going to look up the routing table associated with my forwarding mark number 4, which is Telia and VLAN 400. So I'm going to send you through that path. And if some words that I said are just complete jargon to you, don't worry about it. Just know that we can use routing tables and routing rules on caches, to make packets take the path we want them to.
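The mark-to-table lookup just described can be sketched as a dictionary lookup with a fallback. This is a simulation, not real configuration: marks 1, 2, and 4 and their provider and VLAN pairings follow the talk, while the main table's contents and the fallback for an unknown mark are assumptions. On Linux a mark like this is typically set with the SO_MARK socket option and matched by policy-routing rules; whether Fastly's caches are wired exactly that way is not stated in the talk.

```python
# Main routing table: used when a packet carries no forwarding mark.
# The "default-provider" label is a placeholder, not real configuration.
MAIN_TABLE = {"provider": "default-provider", "interface": "vlan100"}

# Forwarding mark -> dedicated routing table, one per (provider, interface) pair.
MARK_TABLES = {
    1: {"provider": "cogent", "interface": "vlan100"},
    2: {"provider": "cogent", "interface": "vlan200"},
    4: {"provider": "telia", "interface": "vlan400"},
}


def select_route(fwmark=None):
    """Pick the routing table for a packet based on its forwarding mark."""
    if fwmark is None:
        return MAIN_TABLE  # no mark: take the default path
    # Assumption: a mark with no dedicated table falls back to the main table.
    return MARK_TABLES.get(fwmark, MAIN_TABLE)
```

Because every mark maps to exactly one provider and interface, sending the same ping once per mark measures every available path, not just the default one.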
So let's summarize what we have so far. We use forwarding marks assigned to transit providers and interfaces, to control the paths that packets take. And, because of that, we can keep measurements on every path to figure out which one is best.
All right! So we're getting closer to the type of data that we want to collect. Right? And let's just review, really quick, the things that we do have. We use a lot of vendors for monitoring. They're awesome, but none of them really enable the multipath data-center-to-data-center measurement that we need.
So, if we're going to finally have to build something of our own, what do we need it to look like? Well, it would be really nice if it was something we could request to ping multiple paths. Like, what can somebody request?
And it would be really cool if it can handle concurrent requests, because we're going to do a lot of these requests. We're collecting a lot of data, all the time. And it shouldn't have to matter whether or not you're doing one request or 1,000 a second.
And so you guys are like, um, duh, you could build an API that takes HTTP requests to trigger measurements from caches remotely. Right? And to that I say, well, you don't have to be so smug about it. All right. So.
Then obviously you're like, OK, I need a project name. So you hop on Wikipedia and, after 30 minutes of reading Greek and Roman mythology about seamen and aquatic gods, you're like, OK, I'm just going to use Sonar, because that's echolocation but for submarines.
So Sonar is something that we built. It's an HTTP API. And you can make a request to a data center like Narita that will send pings to Seattle, using every path. Does that make sense?
And it's written in Go, because I like Go. But it's also super-easy to use Go's standard library that sends and receives ICMP packets. And you can also set forwarding marks by simply setting the socket option on the file descriptor of an ICMP connection. And Go happens to run quite lightweight. And so enabling concurrent requests is quite trivial, with goroutines. And, because nobody wants to DDoS themselves or let people DDoS them, we set up rate limiting with Go channels. And then we deployed it on all caches in all data centers.
So now we have this HTTP API! And through the API we can set a forwarding mark on the ICMP connection for each one. So, if the forwarding mark is set to 1, the cache will look up the correct routing table and say, OK, go through VLAN 100 and Cogent, OK? And so on, with the second forwarding mark and the third forwarding mark.
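The concurrency-with-a-limit idea can be sketched outside Go as well. The version below swaps goroutines and channels for Python's asyncio and a semaphore, so it caps how many probes are in flight and does not limit requests per second; the path names, the fake RTT range, and both function names are invented for illustration, and no packets are sent.

```python
import asyncio
import random


async def measure(path, limiter):
    """Pretend to ping one path; a real version would send ICMP probes."""
    async with limiter:              # waits here if too many probes are in flight
        await asyncio.sleep(0.01)    # stand-in for waiting on the network
        return path, round(random.uniform(80, 120), 1)  # fake RTT in milliseconds


async def measure_all(paths, max_in_flight=2):
    """Measure every path concurrently, with at most max_in_flight at a time."""
    limiter = asyncio.Semaphore(max_in_flight)
    results = await asyncio.gather(*(measure(p, limiter) for p in paths))
    return dict(results)
```

A call such as `asyncio.run(measure_all(["cogent/vlan100", "cogent/vlan200", "telia/vlan400"]))` returns one RTT per path. A buffered Go channel used as a token bucket plays the same role as the semaphore here.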
And now our API will send pings through multiple paths available. And we'll finally see if there's-- as a network engineer, you'll see that there's a better route than the one that's currently taken. And if there is a better route, then you would push a new configuration to the cache, so that it uses the new preferred provider.
And then the main routing table would then reflect the new default, which is faster than the old default. And so, if the forwarding mark is not set, it'll use the new route, the one that we like. So Sonar will then continue to take measurements, using all the forwarding marks, to continue optimizing performance.
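Deciding whether there is "a better route than the one that's currently taken" comes down to comparing per-path measurements. A minimal sketch, assuming median RTT and a loss threshold as the selection rule; the talk does not say which criteria Fastly actually uses, and the data layout is invented.

```python
from statistics import median


def best_path(measurements, max_loss_pct=0.0):
    """Pick the path with the lowest median RTT among paths with acceptable loss.

    measurements: {path: {"sent": int, "rtts_ms": [float, ...]}}
    Returns the winning path name, or None if no path qualifies.
    """
    candidates = {}
    for path, m in measurements.items():
        received = len(m["rtts_ms"])
        loss_pct = 100.0 * (m["sent"] - received) / m["sent"]
        if received and loss_pct <= max_loss_pct:
            candidates[path] = median(m["rtts_ms"])
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


sample = {
    "cogent/vlan100": {"sent": 3, "rtts_ms": [110.0, 112.0, 111.0]},
    "telia/vlan400": {"sent": 3, "rtts_ms": [95.0, 97.0, 96.0]},
    "lossy/vlan300": {"sent": 3, "rtts_ms": [50.0]},  # fast, but drops 2 of 3 probes
}
```

With the sample data, `best_path(sample)` picks the Telia path: the lossy path is faster on the one reply it returned, but it is excluded for dropping packets.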
And the data collector we built runs on Google Cloud Platform, with Kubernetes. Any Kubernetes people, out there?
[ONE PERSON SAYING "WOO!" FAR IN THE DISTANCE]
Some, like, quiet whispers. [LAUGH] So that now makes requests to the API, like, every 20 seconds. And it's completely trivial for us to handle that, because of Kubernetes. And when the measurement data comes in, we store it in BigQuery.
I'll mention VCL really briefly. I'm not going to get too much into it. Feel free to come up to me afterwards. But, for our purposes, VCL-- which is Varnish Configuration Language-- enables consistent hashing between caches, so that measurements are spread out, and we'll only send requests to the caches that are up, not the ones that are down on purpose. Anyway, you don't have to really know that.
And so now we have all these frequent measurements stored in BigQuery. And eventually we can get to a point where we're automating these decisions. So, whenever we see better routes, we'll just automatically push new configurations that reflect those better routes.
And, lastly, we made a slackbot during a hackathon. And it was super-cool, so I'm going to show you what it is now. Is that starting? Cool.
So anyone at Fastly can hop on to Slack and ping between hosts. Which is really useful during incident management and emergencies. It's also super-useful for our customer-service engineers, who don't need to hop on caches all the time. They can just now hop on Slack and let customers know what they're seeing.
So there's no sudo access, no SSH-ing necessary. Which just really goes to show that, if you're a network person or a distributed-platform person, you feel like you're doing really cool things, but nobody knows or cares about it. Just write a slackbot, because it's a really easy interface to show off stuff that you made.
And it was after the demo of the slackbot where I was bombarded with these messages, like "Sonar's pretty cool!" And "you have forever changed my life!" Which no one had said to me before.
And then, what is this sorcery? And my personal favorite was "Holy fuck, do you accept hugs?" To which the answer was "No."
So, thanks, everyone.