Simulating Chutes and Ladders

I was inspired by reading Ethan Markowitz's article on Chutes and Ladders simulation, but I wanted to simulate a lot of games, not just 10,000. I wanted to simulate 100,000,000 or more. I decided to do it in my current language of choice, Ruby. So I did, and you can find all my code on GitHub at twohlix/chutes_and_ladders.

With my initial setup it took about 34 seconds to simulate 1M games, but I wanted to run 100x that many. I didn't want to wait around for an hour for the results, and I knew there was a common way to speed up simulations like this: multi-threading. So I threw in some threading.

I ran the app with 2 threads and… zero speedup. Zilch. Nada. It still took about 34 seconds of real time to get the results. What the eff? It turns out Ruby, or at least Matz's Ruby (MRI) out of the box, supports native threads, but they never run in parallel due to the Global Interpreter Lock. After a quick googling I decided to install Rubinius (aka rbx). A few minutes later I was down to about 20s for 1M games. Almost 50% faster, nice. With 4 threads I was at 12s; with 8 threads, 10s. At 16 threads it hung for over 2 minutes, so maybe my CPU can only really handle 8 threads anyway. All that work got me down to ~18 minutes to simulate 100M 2-player games of Chutes and Ladders with a maximum of 150 turns per game.
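A minimal sketch of that threading split might look like the following. This is not the actual code from twohlix/chutes_and_ladders; `play_game` here is a simplified random walk to square 100 that ignores the real board's chutes and ladders, and the game count is scaled down. (On MRI these threads still won't run in parallel, which is exactly the GIL problem above.)

```ruby
THREAD_COUNT = 8
TOTAL_GAMES  = 80_000

# Simplified stand-in for a real game: spin a 1-6 spinner until the
# player reaches square 100 or the turn cap is hit; return turns taken.
def play_game(max_turns = 150)
  position = 0
  max_turns.times do |turn|
    position += rand(1..6)
    return turn + 1 if position >= 100
  end
  max_turns # game hit the turn cap
end

# Split the games evenly across threads and collect every turn count.
per_thread = TOTAL_GAMES / THREAD_COUNT
turn_counts = Array.new(THREAD_COUNT) do
  Thread.new { Array.new(per_thread) { play_game } }
end.flat_map(&:value) # Thread#value joins the thread and returns its result

puts "#{turn_counts.size} games, avg #{turn_counts.sum.fdiv(turn_counts.size).round(2)} turns"
```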

Here's some output from the simulation:

Locally Caching API Backed Rails Models

So I was working on a project that had fairly high coupling between our system and the data we got from an API. Then someone wanted a dashboard summarizing a bunch of that data, and instead of building "cache models" in some database I decided to just cache the attributes the API sent us using the standard Rails cache. Which caching engine you use is up to you, but this will be faster if held in a memory-based cache (duh).

Since at least 3 models were being fed by the API, I figured let's use a Concern! Check it out:
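Here's a minimal sketch of what such a concern might look like; only the `local` and `write_local_cache` names come from the text, everything else is an assumption. The `STORE` hash stands in for `Rails.cache` (which would take an `expires_in:` option) so the sketch runs without Rails, and in a real app the module would extend `ActiveSupport::Concern`.

```ruby
module LocallyCacheable
  STORE = {} # stand-in for Rails.cache

  def self.included(base)
    base.extend ClassMethods
  end

  module ClassMethods
    # Read a model out of the cache, falling back to the API-backed
    # find on a miss (find re-warms the cache; see Customer below).
    def local(id)
      STORE.fetch(local_cache_key(id)) { find(id) }
    end

    def local_cache_key(id)
      "#{name.downcase}/#{id}"
    end
  end

  # Write this instance into the cache under its key.
  def write_local_cache
    STORE[self.class.local_cache_key(id)] = self
  end
end

# Hypothetical API-backed model: find "hits the API", then warms
# the local cache with the result before returning it.
class Customer
  include LocallyCacheable
  attr_reader :id, :total_due

  def initialize(id, total_due)
    @id = id
    @total_due = total_due
  end

  def self.find(id)
    attrs = { "total_due" => 0 } # stand-in for an API response
    new(id, attrs["total_due"]).tap(&:write_local_cache)
  end
end
```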

Since we had to define the class-level find for our API-backed models anyway, we decided to couple that find method with the write_local_cache method.

So to grab a model from the cache I simply had to do

ModelName.local id

in the controllers feeding this dashboard. Alternatively you can do this in relationship methods on a model.
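A relationship-method version might look like this sketch. The `Customer` stub here is hypothetical and just stands in for a real API-backed model whose class-level `local` reads from the Rails cache.

```ruby
class Customer
  Record = Struct.new(:id)

  # Stand-in: the real method returns the cached model.
  def self.local(id)
    Record.new(id)
  end
end

class Assignment
  attr_reader :customer_id

  def initialize(customer_id)
    @customer_id = customer_id
  end

  # Association resolved through the local cache instead of the API.
  def customer
    Customer.local customer_id
  end
end
```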

So this ended up taking us from a 65-second page load that resulted in 400 API requests to about a 1.5-second page load with 0 API requests. The business was fine with this data being anywhere from 1 to 12 hours out of date. And since the cache is so long-lived, I wrote a rake task to preheat it, which I run every day at 6am before the app starts getting used heavily:

namespace :app_name do
  desc "Pre-heats the cache based off current work assignments"
  task preheat_cache: :environment do
    Assignment.not_worked.find_each do |assignment|
      customer = Customer.find assignment.customer_id
      customer.total_due # this preheats a relationship to another API backed model
    end
  end
end

Ruby Array#each vs Enumerable#any?

So I had a little snippet of code that a coworker of mine mentioned I could replace:

collection_thingy.each do |val|
  return if val.prop == :particular_thing
end

with:

return if collection_thingy.any? { |val| val.prop == :particular_thing }

And that got me thinking about which was faster. So I did some quick benchmarking. Results and code are embedded below.
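The shape of that benchmark might look like the following sketch (not the original embedded code). It times a full traversal where no element matches, using a plain equality check in place of `val.prop == :particular_thing`, and far smaller arrays than the post's 1M to 1B runs.

```ruby
require "benchmark"

# Early-return loop with Array#each, as in the original snippet.
def find_with_each(arr, target)
  arr.each do |val|
    return true if val == target
  end
  false
end

# Equivalent check with Enumerable#any?.
def find_with_any(arr, target)
  arr.any? { |val| val == target }
end

[100_000, 1_000_000].each do |size|
  arr = Array.new(size) { rand(100) }
  target = 1_000 # never present, so both methods traverse everything

  each_time = Benchmark.realtime { find_with_each(arr, target) }
  any_time  = Benchmark.realtime { find_with_any(arr, target) }
  puts format("%-9d elements: each %.4fs  any? %.4fs", size, each_time, any_time)
end
```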

Clearly Array#each is faster than the equivalent Enumerable#any?, by 13-28% depending on the array size. That seems like a lot. I'm not entirely sure why it's that much faster, but it's interesting. Still, in most cases the difference is negligible (especially in a Rails app); just be aware of it if you're traversing giant arrays (at 1 billion elements it's as much as a 14-second difference in my environment).

Here are some graphs showing the speedup more visually (imgur album of graphs):
1M Elements
10M Elements
100M Elements
1B Elements

All that being said, if performance isn't an issue in the particular code block, Enumerable#any? is more readable. Feel free to post your own results, as actual runtimes will vary from environment to environment.