Caching Hudl’s news feed with ElastiCache for Redis

Every coach and athlete who logs into Hudl immediately lands on their news feed. Each feed is tailored to the individual user and consists of content from the teams they’re on as well as accounts they choose to follow. This page is our users’ first impression, so performance is critical. Our solution: ElastiCache for Redis.

Before we talk about caching, though, it’s important to understand at a high level the data model used in the feed. There are six main collections used by the feed: users, followers, friends, content (posts), user timelines, and home timelines.

To help illustrate this, let’s look at two users: Sally and Pete. Sally decides to follow Pete. We now say that Sally is a follower of Pete and that Pete is a friend of Sally. When Pete posts something, that post (a.k.a. content) gets added to his user timeline as well as the home timeline for Sally. When Sally logs into Hudl to view her feed, she sees her home timeline presented in reverse chronological order. If she then clicks on Pete, she views his user timeline and can see all the posts he’s created.
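
To make the fan-out concrete, here’s a simplified in-memory sketch of what happens when Pete posts. Plain Python dicts stand in for our real data stores, and all of the names are illustrative, not our actual code:

```python
from collections import defaultdict

followers = defaultdict(set)        # user ID -> set of follower IDs
user_timelines = defaultdict(list)  # user ID -> post IDs, newest first
home_timelines = defaultdict(list)  # user ID -> post IDs, newest first

def follow(follower, friend):
    # "follower" now follows "friend"
    followers[friend].add(follower)

def add_post(author, post_id):
    user_timelines[author].insert(0, post_id)  # author's user timeline
    for f in followers[author]:                # fan out to each follower
        home_timelines[f].insert(0, post_id)   # their home timeline

follow("sally", "pete")
add_post("pete", "post-1")
add_post("pete", "post-2")
print(home_timelines["sally"])  # ['post-2', 'post-1'] (reverse chronological)
```

Writing the post to every follower’s home timeline at post time is what lets the feed read be a single cheap list fetch later.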

Let’s take a look now at what gets loaded every time a user hits their feed. We first grab a batch of post IDs from their home timeline, fetch those posts, and load all users referenced by those posts. Since the feed was created back in April 2015, the database has grown rapidly and its total size is up to 120 GB. We currently have 18 million follower relationships and 30 million pieces of content. So where does caching come into play?
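
Here’s a rough sketch of that three-step read path, again with plain dicts standing in for the real stores (all names are hypothetical):

```python
posts = {
    "post-1": {"author": "pete", "text": "Great game!"},
    "post-2": {"author": "pete", "text": "Highlights up."},
}
users = {"pete": {"name": "Pete"}, "sally": {"name": "Sally"}}
home_timeline = ["post-2", "post-1"]  # newest first

def load_feed(timeline, batch_size=20):
    post_ids = timeline[:batch_size]          # 1. grab a batch of post IDs
    batch = [posts[pid] for pid in post_ids]  # 2. fetch those posts
    # 3. load all users referenced by those posts
    authors = {p["author"]: users[p["author"]] for p in batch}
    return batch, authors

batch, authors = load_feed(home_timeline)
print([p["text"] for p in batch])  # ['Highlights up.', 'Great game!']
```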

Redis

In the world of caching, there are primarily two options: Memcached and Redis. For the longest time at Hudl, the default option was Memcached. It’s a proven technology and had previously served the vast majority of needs across our services. However, with the introduction of the news feed, we decided to dig a little deeper into the data structures Redis had to offer, and we were really excited by what we found:

Lists

This alone would’ve been reason enough to use Redis. Timelines are naturally stored as lists, so being able to represent them that way in cache is amazing. As posts are added to timelines, we simply do an LPUSH (add to the front) followed by an LTRIM (used to cap the list at a max size). The best part: we don’t have to invalidate the cache as posts are added because it’s always kept in sync with the DB.
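
Here’s a minimal model of the LPUSH + LTRIM pattern in plain Python (the max size here is an illustrative value, not our actual setting):

```python
MAX_TIMELINE_SIZE = 3  # illustrative cap, not Hudl's real limit

def push_and_trim(timeline, post_id, max_size=MAX_TIMELINE_SIZE):
    timeline.insert(0, post_id)  # LPUSH: add to the front
    del timeline[max_size:]      # LTRIM 0 max_size-1: cap the list
    return timeline

timeline = []
for pid in ["p1", "p2", "p3", "p4"]:
    push_and_trim(timeline, pid)
print(timeline)  # ['p4', 'p3', 'p2'] (capped at 3, newest first)
```

With the real driver these would be two Redis commands on the timeline key; the point is that the cached list stays current on every write instead of being thrown away.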

Hashes

Displaying the number of followers and friends for a given user is a critical component of any feed. By storing these as fields on a hash for each user, we can quickly call HINCRBY to keep the values in sync with the DB without the need to invalidate the cache every time a follow or unfollow happens.
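
A quick sketch of the counter hash, modeling HINCRBY-style increments in plain Python (the field names are illustrative):

```python
from collections import defaultdict

# user ID -> hash of counter fields
user_counts = defaultdict(lambda: {"followers": 0, "friends": 0})

def hincrby(user_id, field, amount):
    # HINCRBY: atomically bump one field on the user's hash
    user_counts[user_id][field] += amount
    return user_counts[user_id][field]

# Sally follows Pete: Pete gains a follower, Sally gains a friend.
hincrby("pete", "followers", 1)
hincrby("sally", "friends", 1)
# Sally unfollows: decrement in place, no cache invalidation needed.
hincrby("pete", "followers", -1)
print(user_counts["pete"]["followers"])  # 0
```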

Sets

We love to use RabbitMQ to retry failed operations. Sets are the perfect way for us to guarantee we don’t accidentally insert the same post on a user’s timeline more than once, without an extra DB call. We use the post ID as the cache key, each user ID as the member, and then call SISMEMBER and SADD.
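
Here’s a small model of that dedup guard in plain Python (names are illustrative):

```python
# post ID -> set of user IDs the post has already been delivered to
delivered = {}

def deliver_once(post_id, user_id):
    members = delivered.setdefault(post_id, set())
    if user_id in members:  # SISMEMBER: already on this user's timeline
        return False
    members.add(user_id)    # SADD: record the delivery
    return True

print(deliver_once("post-1", "sally"))  # True: first delivery goes through
print(deliver_once("post-1", "sally"))  # False: a RabbitMQ retry is a no-op
```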

ElastiCache

Once we decided on Redis, the next question was how to get a server spun up and configured so we could start testing with it. We love AWS and had heard about Amazon ElastiCache for Redis as an option, so we decided to give it a try. Within minutes, we had our first test node spun up running Redis and were connecting to it through the StackExchange.Redis C# driver.

With ElastiCache, we were easily able to configure our Redis deployment, node size, and security groups, and to use Amazon CloudWatch to monitor all key metrics. We were able to create separate test and production clusters, all without waiting for an infrastructure engineer to set up and configure the servers manually. Here’s what we used for our production cluster:

  • Node type: cache.r3.4xlarge (118 GB)
  • Replication Enabled
  • Multi-AZ
  • 2 Read Replicas
  • Launched in VPC

The final step in completing our deployment was configuring alerts through Stackdriver. It seamlessly supports integrating with the ElastiCache service, and within a few minutes we had our alerts configured. We were most interested in three key metrics:

  • Current Connections: if these drop to 0, our web servers are no longer able to access the cache, which requires immediate attention.
  • Average Bytes Used for Cache Percentage: if this reaches 95% or higher, it’s a good signal that we may need to consider moving to a larger node type or lowering our expiration times.
  • Swap Usage: if this gets to 1 GB or higher, the Redis server is in a bad state and requires immediate attention.

Results

The feed launched back in April 2015, and since then we couldn’t be happier with its performance. Hudl’s traffic is highly seasonal, and football season is our prime time. Starting around August, coaches and athletes from all over the country get back into football mode and log into Hudl daily. During the week of September 5–11, there were 1.2 million unique users accessing their feeds. The feed service averaged 300 requests per second, with a peak of 800. Here are some quick stats from ElastiCache during that same week:

  • Total Cached Items: 21 million
  • Cache hits: 175K/min (average), 350K/min (peak)
  • Network in: 43 MB/min (average), 101 MB/min (peak)
  • Network out: 600 MB/min (average), 1.25 GB/min (peak)

Let’s take a closer look at two calls in the feed service: getting the timeline and hydrating the timeline. The first call covers just the operation of fetching the timeline list from Redis, with no other dependencies. The second call takes the post IDs in the timeline and loads all referenced users and posts. It’s important to note that this includes time spent loading records from the database if they are not cached, and then caching them. This is the primary call used when loading the feed on the web as well as on our iOS and Android apps.

Based on the success of the feed, ElastiCache for Redis is quickly becoming our default option for caching. In the last year, five other key services at Hudl have made the switch from Memcached. It’s easy to set up, offers blazing fast performance, and gives users all the benefits that Redis has to offer. If you haven’t tried it out yet, I would strongly recommend giving it a shot and letting us know how it works out for you.