[01-28-2014][10:16] Van Miranda: hit me
[01-28-2014][10:17] Musicalmindz: OK
[01-28-2014][10:17] Musicalmindz: i will hit you baby one more time
[01-28-2014][10:17] Musicalmindz: so its not amazingly complicated but its kinda cool
[01-28-2014][10:17] Musicalmindz: so
[01-28-2014][10:18] Musicalmindz: before: 100 nodes, 20 threads per node = 2000 workers, $700/month in node costs
[01-28-2014][10:18] Musicalmindz: after: 300 nodes, 20 threads per node = 6000 workers, $2100/month
[01-28-2014][10:18] Musicalmindz: so here's how we deal with the throttling
[01-28-2014][10:19] Musicalmindz: it took a while to figure out how their network stack was handling things
[01-28-2014][10:19] Musicalmindz: but if you send 100 nodes (each has a unique IP) at amazon
[01-28-2014][10:19] Musicalmindz: around 10-20% will get throttled almost instantly, within the first few requests
[01-28-2014][10:19] Musicalmindz: our original naive implementation was just restarting every node as it got throttled
[01-28-2014][10:19] Musicalmindz: which is what caused the Heroku API spam and brought them down lol
[01-28-2014][10:20] Musicalmindz: so now we organize everything into blocks of 100 nodes
[01-28-2014][10:20] Musicalmindz: each is a "workforce"
[01-28-2014][10:20] Musicalmindz: start 3 blocks of 100 nodes at once
[01-28-2014][10:20] Musicalmindz: with an arbiter that sits around watching their status
[01-28-2014][10:21] Musicalmindz: each thread in each node is doing requests and as soon as a thread gets throttled it assumes it wont be able to get any more requests in and just puts itself to sleep
[01-28-2014][10:21] Van Miranda: ahhhhh
[01-28-2014][10:21] Musicalmindz: literally a permanent sleep command, sleep with no value
[01-28-2014][10:21] Musicalmindz: so slowly over time, more and more of each block of nodes workforce gets throttled
[01-28-2014][10:21] Musicalmindz: what we noticed over time is that at between 70-80% blockage
[01-28-2014][10:21] Musicalmindz: the remaining dynos would never get blocked
[01-28-2014][10:22] Musicalmindz: but at that point you've only got 20-30% of that block working
[01-28-2014][10:22] Musicalmindz: so we set the arbiter to cycle all of the dynos in that block (only consumes one API command vs tons to do each node individually)
[01-28-2014][10:22] Musicalmindz: once it hits a certain threshold
[01-28-2014][10:23] Musicalmindz: currently its at 70%
[01-28-2014][10:23] Musicalmindz: might knock it down to 60%
[01-28-2014][10:23] Van Miranda: ah - in aggregate single command so you dont have to micro
[01-28-2014][10:23] Musicalmindz: yaaaa
[01-28-2014][10:23] Musicalmindz: less API spam
[01-28-2014][10:23] Van Miranda: smarty
[01-28-2014][10:23] Van Miranda: ok continue
[01-28-2014][10:23] Musicalmindz: so
[01-28-2014][10:23] Van Miranda: maspurtating fruious
[01-28-2014][10:23] Van Miranda: ly
[01-28-2014][10:23] Musicalmindz: lol
[01-28-2014][10:23] Musicalmindz: because it actually takes time to scale down 100 nodes
[01-28-2014][10:23] Musicalmindz: and let them die semi gracefully
[01-28-2014][10:24] Musicalmindz: we actually have 6 blocks of 100 nodes
[01-28-2014][10:24] Musicalmindz: and we roundrobin each block of 100 in
[01-28-2014][10:24] Musicalmindz: so when one block gets retired
[01-28-2014][10:24] Musicalmindz: it gets sent to the back of the list
[01-28-2014][10:24] Musicalmindz: and a fresh set of troops (who have all been scaled down nicely) are subbed in
[01-28-2014][10:25] Musicalmindz: so
[01-28-2014][10:25] Musicalmindz: even with lots of nodes throttled per block we're still at minimum have ~100 dynos working
[01-28-2014][10:25] Musicalmindz: and up to 300
[01-28-2014][10:25] Musicalmindz: early on
[01-28-2014][10:25] Van Miranda: so wait do those nodes all still have the same IP address that they did when they were throttled?
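The block rotation described up to this point can be sketched roughly as follows. This is only an illustration under assumptions: the names `Workforce`, `Arbiter`, `mark_throttled`, and `check` are hypothetical, no real Heroku API call is made, and `reset()` stands in for the single "cycle the whole block" command. The sketch also resets a retired block instantly, whereas the chat notes that scaling 100 nodes down takes time, which is the stated reason for keeping 6 blocks in rotation while only 3 run.

```python
from collections import deque
from dataclasses import dataclass, field

# Figures taken from the chat; everything else here is a hypothetical sketch.
NODES_PER_BLOCK = 100     # each "workforce" is a block of 100 nodes
ACTIVE_BLOCKS = 3         # 3 blocks run at once
TOTAL_BLOCKS = 6          # 6 blocks take turns in the rotation
RECYCLE_THRESHOLD = 0.70  # retire a block once 70% of it is throttled


@dataclass
class Workforce:
    """Bookkeeping for one block of nodes (not a real Heroku object)."""
    name: str
    throttled: set = field(default_factory=set)

    def mark_throttled(self, node_id: int) -> None:
        # A throttled node just goes to sleep; we only record that it did.
        self.throttled.add(node_id)

    def throttled_fraction(self) -> float:
        return len(self.throttled) / NODES_PER_BLOCK

    def reset(self) -> None:
        # Placeholder for one aggregate restart of the block (fresh IPs).
        self.throttled.clear()


class Arbiter:
    """Watches the active blocks and rotates worn-out ones to the back."""

    def __init__(self) -> None:
        blocks = [Workforce(f"block-{i}") for i in range(TOTAL_BLOCKS)]
        self.active = blocks[:ACTIVE_BLOCKS]
        self.standby = deque(blocks[ACTIVE_BLOCKS:])

    def check(self) -> list:
        """Retire every active block at or over the threshold.

        Returns the names of the blocks that were retired on this pass.
        """
        retired = []
        for i, block in enumerate(self.active):
            if block.throttled_fraction() >= RECYCLE_THRESHOLD:
                block.reset()
                self.standby.append(block)               # back of the list
                self.active[i] = self.standby.popleft()  # fresh block in
                retired.append(block.name)
        return retired


if __name__ == "__main__":
    arbiter = Arbiter()
    worn = arbiter.active[0]
    for node_id in range(70):          # 70 of 100 nodes get throttled
        worn.mark_throttled(node_id)
    print(arbiter.check())
    print([b.name for b in arbiter.active])
```

One decision per block, rather than one per node, is the point of the design: the arbiter only acts when the whole block crosses the threshold, so the number of control calls stays small no matter how many individual nodes get throttled.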
[01-28-2014][10:26] Musicalmindz: nope
[01-28-2014][10:26] Musicalmindz: each time you spin it back up
[01-28-2014][10:26] Musicalmindz: it gets a new IP
[01-28-2014][10:26] Van Miranda: so when you spin them down they get a new ip
[01-28-2014][10:26] Musicalmindz: yup
[01-28-2014][10:26] Van Miranda: HAT
[01-28-2014][10:26] Van Miranda: err HAWT
[01-28-2014][10:26] Musicalmindz: ya
[01-28-2014][10:26] Van Miranda: hahah HATS jason
[01-28-2014][10:26] Musicalmindz: thats the only reason it works
[01-28-2014][10:26] Van Miranda: ya i figured
[01-28-2014][10:26] Van Miranda: now riddle me this
[01-28-2014][10:26] Musicalmindz: not sure how many IPs we're cycling through but its a lot
[01-28-2014][10:26] Van Miranda: yeah when do you exhaust IP range of Heroku
[01-28-2014][10:26] Musicalmindz: we dont
[01-28-2014][10:26] Van Miranda: or have they said anything to you about IPs
[01-28-2014][10:27] Musicalmindz: they dont care
[01-28-2014][10:27] Musicalmindz: they have unlimited IPs
[01-28-2014][10:27] Musicalmindz: its based on how many IPs EC2 has
[01-28-2014][10:27] Musicalmindz: which is... ginormous
[01-28-2014][10:27] Van Miranda: so wait heroku runs on EC2?
[01-28-2014][10:27] Musicalmindz: ya its just a layer on top of EC2
[01-28-2014][10:27] Van Miranda: bahahahah
[01-28-2014][10:27] Musicalmindz: we're slurping up amazon data directly from amazon haha
[01-28-2014][10:27] Van Miranda: DDoS amazing with amazon
[01-28-2014][10:27] Musicalmindz: yessss
[01-28-2014][10:27] Van Miranda: racks on racks on bezos
[01-28-2014][10:27] Musicalmindz: if we set this all up ourselves on EC2 directly we'd save a decent amount of cash
[01-28-2014][10:28] Musicalmindz: it didnt make sense before consider our total monthly cost was $700
[01-28-2014][10:28] Van Miranda: yeah but you wouldn't have all the heroku niceties
[01-28-2014][10:28] Musicalmindz: ya and that