WordPress jumblings

I updated WordPress this morning (boy, they do make it pretty easy) and installed the SyntaxHighlighter Plus plugin. I edited an entry from a few days ago containing some Ruby code, and it looks a lot better now.

Maybe someone can help: I’m looking for a plugin that lets you mark words or insert a placeholder of some kind meaning “insert a link here before you publish”, and that warns you if you try to publish with any of those items still in there.  When writing these haxpact blags over the past few days, I usually just write and then go back and link things, but that can be tedious.

Sports-related Rails DB model layouts

I mentioned previously that I’ve run an NFL pick-the-winners site for the past three seasons, the most recent one on Rails.  I wanted to write a few words on some model-related stuff for both this and the upcoming World Cup site.  I recently discovered the excellent Railroad gem and used it to generate model diagrams of both the NFL site and the (WIP) World Cup site.

Let’s start with the NFL site for an example of something current.  The diagram has been cropped down to focus on what I want to talk about today: the games themselves.

So we’ve got a Team model with one record for each NFL team; this is obviously needed and will be the same for the World Cup site.  The issue at hand is the games.  As I see it, there are two options, the first of which is modeled here.

There is a single model for all games: it contains the time, away and home teams (as references), and the score (recorded as “awayscore” and “homescore”).  The NFL releases its schedule early on, so all games can be pre-entered at the beginning of the season.  As games are played (or moved to different times), these Game records are updated and saved.

This is a reasonable system, and takes advantage of the fact that we know the schedule ahead of time.  However, it has some downsides.  One is that the relationship from a Team to a Game can go through either the “away” or the “home” field, meaning that to get all the Games for a Team, you have to fetch both the “away” and “home” relations and combine them, sorted by each game’s time.  This results in some slightly bigger queries than one might expect.
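To make the away/home merge concrete, here is a minimal plain-Ruby sketch.  It is not the site’s actual code: the Structs stand in for the Rails models, and the helper name games_for, the team names, and the dates are all made up for illustration.

```ruby
# Plain-Ruby stand-ins for the Rails models; every name here is illustrative.
Team = Struct.new(:name)
Game = Struct.new(:time, :away, :home, :awayscore, :homescore)

# Collect a team's games from both sides, then order them by kickoff time.
def games_for(team, games)
  away_games = games.select { |g| g.away == team }
  home_games = games.select { |g| g.home == team }
  (away_games + home_games).sort_by(&:time)
end

# Made-up teams and dates, purely for the example.
team_a = Team.new("Team A")
team_b = Team.new("Team B")
team_c = Team.new("Team C")

games = [
  Game.new(Time.utc(2009, 9, 20, 17), team_b, team_a, nil, nil),
  Game.new(Time.utc(2009, 9, 13, 17), team_a, team_c, nil, nil),
  Game.new(Time.utc(2009, 9, 27, 17), team_c, team_b, nil, nil)
]

schedule = games_for(team_a, games)
```

In the real app each select would be a database relation, which is where the two lookups plus a merge come from.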

Another issue could be that the Game records themselves are constantly being updated.  I have yet to decide if that really is a bad thing or not, though.

The World Cup let me see the setup in a different light.  In the World Cup, there are 64 total games: 48 group games (8 groups of 6 games each) and then 16 knockout stage games.  At the beginning we know the times and locations of every game that will be played, but we do not know the participants of the knockout games until the group games have been played.
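The game counts are easy to double-check.  This quick sketch assumes four-team groups where every pair of teams meets once, and a knockout stage made up of a round of 16, quarter-finals, semi-finals, a third-place match, and the final; the variable names are just for illustration.

```ruby
groups          = 8
teams_per_group = 4   # assumption: four teams per group

# Every pair of teams in a group meets once: C(4, 2) = 6 games per group.
games_per_group = (1..teams_per_group).to_a.combination(2).count
group_games     = groups * games_per_group

# Round of 16, quarter-finals, semi-finals, third-place match, final.
knockout_games = 8 + 4 + 2 + 1 + 1

total_games = group_games + knockout_games
```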

This led me to think of another way to model the situation, one with three models all linked together.  Visually:

In this setup, we have three classes working together with the Team model.  ScheduleGame contains all the information about the time/date of the games, and is the central place to find out information about a game through relations.  ScheduleMatchup contains the two teams who will be competing in a game: an away team and a home team, both relations to the Team model.  Finally, ScheduleResults are entered as the games complete.

This setup means model instances never need to be edited once they are created.  We can pre-instantiate all ScheduleGames, and 48 ScheduleMatchups.  I made the ScheduleGame model completely read-only: its instances are created through “rake db:seed” out of YAML fixtures.
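Here is a rough plain-Ruby sketch of the “create once, never edit” idea.  It is an illustration only: the Structs stand in for the Rails models, the field names (id, location, game_id) and the singular name ScheduleResult are my guesses, and the sample values are invented.

```ruby
# Illustrative stand-ins for the three models; field names are assumptions.
ScheduleGame    = Struct.new(:id, :time, :location)
ScheduleMatchup = Struct.new(:game_id, :away, :home)
ScheduleResult  = Struct.new(:game_id, :awayscore, :homescore)

# Seeded up front and frozen: a game's time and place never change.
game = ScheduleGame.new(1, Time.utc(2010, 6, 12, 18, 30), "Stadium A").freeze

# Known at seed time for group games, created later for knockout games.
matchup = ScheduleMatchup.new(game.id, "Team A", "Team B").freeze

# Created only once the game has finished; nothing above gets edited.
result = ScheduleResult.new(game.id, 2, 1).freeze

# Any attempt to edit a frozen record raises instead of silently changing it.
begin
  game.time = Time.utc(2010, 6, 13)
  edit_rejected = false
rescue RuntimeError  # FrozenError is a RuntimeError subclass on modern Rubies
  edit_rejected = true
end
```

Freezing is only a stand-in here; in Rails the same effect would have to come from the model itself.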

This still does not solve the issue above of “away” and “home” and having to combine the two to get games for a team.  I debated making two ScheduleMatchups per game, each taking a Team ref, a Game ref, and a “side” enum (Away or Home).  (Can you do enums in Rails models?  Haven’t tried before.  A simple boolean would work too, if you could remember which side meant Home or Away.)
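For what it’s worth, newer Rails versions (4.1 and later) do ship an enum macro for ActiveRecord models.  The sketch below sidesteps that and uses a plain symbol for the side.  It is a plain-Ruby illustration of the two-rows-per-game idea, with hypothetical names (SideMatchup, add_game) and made-up teams and dates.

```ruby
# Hypothetical per-side matchup rows: one row per team per game.
SideMatchup = Struct.new(:team, :game_time, :side)

# Adds both rows for a game; the :away / :home symbols stand in for the enum.
def add_game(rows, time, away, home)
  rows << SideMatchup.new(away, time, :away)
  rows << SideMatchup.new(home, time, :home)
end

rows = []
add_game(rows, Time.utc(2010, 6, 12), "Team A", "Team B")
add_game(rows, Time.utc(2010, 6, 18), "Team C", "Team A")
add_game(rows, Time.utc(2010, 6, 23), "Team B", "Team C")

# One lookup now covers both sides; no merge of two relations is needed.
team_a_rows = rows.select { |r| r.team == "Team A" }.sort_by(&:game_time)
```

The trade-off is twice as many matchup rows in exchange for a single relation from Team to its games.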

Any thoughts from the internets?  Is my quest to have no models edited while the site is running just silly?  Would you combine the World Cup model into the NFL style model?

Lame haxpact Saturday

I spent the entire day working on our guest room:

  • Changing switches and plugs
  • Removing painting tape
  • Cleaning out old painting equipment
  • Getting paint off floor
  • Listening to The Killers on repeat

No, this doesn’t quite count, but I barely had a chance to do anything else on Saturday, so here’s a picture:

World Cup pick’em: Team group standing possibilities, part 2

Yesterday, I wrote about the World Cup pick’em site and doing some prediction “math” to figure out the likely group finishing positions of each team in a group.  Today, I planned to rewrite parts of it to speed it up.

My naive approach was to use transactions to create the DB records needed so that a team could compute its group record, then back those transactions out to go on to the next possible finishes of games.  This was SLOW, and I knew it was going to be slow.  I began the rewrite by changing the way the group record was calculated.  Instead of relying solely on DB records, the method now accepts an array of these records; if that array isn’t present, it looks them up in the DB for me.  This allowed me to create instances of models, but not save them to the DB, and just pass those into the Team.group_record method.  This dropped the number of individual database accesses needed to get the group record from about 16 to 2.  The savings are noticeable.
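The optional-array idea can be sketched in plain Ruby.  This is not the real Team model: the Result struct, the load_results stand-in for the DB query, the lookup counter, and the wins/draws/losses return value are all assumptions made for the example.

```ruby
# Hypothetical result record: goals from one team's point of view.
Result = Struct.new(:goals_for, :goals_against)

class Team
  attr_reader :name, :db_lookups

  def initialize(name, stored_results)
    @name = name
    @stored_results = stored_results  # stands in for rows in the database
    @db_lookups = 0
  end

  # Stand-in for the real query; counts how often the "DB" gets hit.
  def load_results
    @db_lookups += 1
    @stored_results
  end

  # Use the records passed in, or fall back to the lookup when none are given.
  def group_record(results = nil)
    results ||= load_results
    record = { wins: 0, draws: 0, losses: 0 }
    results.each do |r|
      if r.goals_for > r.goals_against
        record[:wins] += 1
      elsif r.goals_for == r.goals_against
        record[:draws] += 1
      else
        record[:losses] += 1
      end
    end
    record
  end
end

# Made-up scores: one stored result, plus unsaved "what if" results.
team = Team.new("Team A", [Result.new(1, 1)])
what_if = [Result.new(1, 1), Result.new(2, 0), Result.new(0, 3)]

from_db     = team.group_record           # falls back to the lookup
from_memory = team.group_record(what_if)  # never touches the "DB"
```

The unsaved records are just objects in memory, so trying out thousands of hypothetical scorelines costs no extra queries.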

After making this change (and various other necessary changes based on it), I had the tools to rewrite the prediction methods.  That is when I discovered a major logical problem with the way I had written it yesterday.  It would generate all the possible scores 0..5 for both sides, but then use that scoreline for BOTH games.  This meant that instead of the 6^2 possibilities I was generating, there really were 6^4 possible combinations of games/scorelines.

Luckily, I had stumbled across a neat collection of Ruby mixin methods for Arrays to do things like permutations, combinations, and (what I used) repeating permutations.  I dropped this code into config/initializers/array.rb:

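class Array
  # All orderings of the elements. Assumes the elements are distinct,
  # since (self - [e]) removes every element equal to e.
  def permutations
    return [self] if size < 2
    perm = []
    each { |e| (self - [e]).permutations.each { |p| perm << ([e] + p) } }
    perm
  end

  # Yields every length-n sequence of elements, repetition allowed.
  def rep_perm_block(n)
    if n < 0
      # A negative length yields nothing.
    elsif n == 0
      yield([])
    else
      rep_perm_block(n-1) do |x|
        each do |y|
          yield(x + [y])
        end
      end
    end
  end

  # Collects everything rep_perm_block yields into an array and returns it.
  def rep_permutations(n)
    ret = []
    rep_perm_block(n) do |x|
      ret << x
    end
    ret
  end
end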
I had a bit of trouble converting from yield-style syntax to the normal “build an array and return it” style, hence the helper method at the bottom.  Rails automatically loads anything in config/initializers/, so this mixin is always loaded for any Rails environment.  To generate all possible scorelines, I do this:
>> (0..5).to_a.rep_permutations(4)
=> [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 2], [0, 0, 0, 3],
   [0, 0, 0, 4], [0, 0, 0, 5], [0, 0, 1, 0], ...

Then I just pair them up to get a scoreline for each of the two games.
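The pairing step could look something like this.  It is a sketch rather than the site’s code: it uses Ruby’s built-in Array#repeated_permutation (Ruby 1.9.2 and later) in place of the mixin, and which score in each pair belongs to which side is an assumption.

```ruby
scores = (0..5).to_a

# Built-in equivalent of rep_permutations(4): every ordered 4-tuple of scores.
tuples = scores.repeated_permutation(4).to_a

# Split each tuple into two [score, score] pairs, one pair per game.
scorelines = tuples.map { |t| t.each_slice(2).to_a }

first_combo = scorelines.first  # both games finish 0-0
last_combo  = scorelines.last   # both games finish 5-5
```

That gives 6^4 = 1296 combinations, each holding one scoreline per game.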

This results in some improved numbers for USA’s chances:

=> [{"USA"=>[0.969135802469136, 0.0308641975308642, 0.0, 0.0]},
    {"ENG"=>[0.0308641975308642, 0.529320987654321, 0.266203703703704, 0.173611111111111]},
    {"SVN"=>[0.0, 0.335648148148148, 0.259259259259259, 0.405092592592593]},
    {"ALG"=>[0.0, 0.104166666666667, 0.474537037037037, 0.421296296296296]}]
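As a sanity check, these fractions line up with whole-number counts out of 6^4 = 1296 scoreline combinations.  That reading is my inference from the numbers rather than anything the code prints; the sketch below just multiplies the values back out and checks the sums.

```ruby
chances = {
  "USA" => [0.969135802469136, 0.0308641975308642, 0.0, 0.0],
  "ENG" => [0.0308641975308642, 0.529320987654321, 0.266203703703704, 0.173611111111111],
  "SVN" => [0.0, 0.335648148148148, 0.259259259259259, 0.405092592592593],
  "ALG" => [0.0, 0.104166666666667, 0.474537037037037, 0.421296296296296]
}

total = 6**4  # 1296 combinations of two scorelines

# Convert each fraction back into a whole number of combinations.
counts = chances.transform_values { |row| row.map { |p| (p * total).round } }

# Each team finishes somewhere in every combination...
rows_ok = counts.values.all? { |row| row.sum == total }
# ...and each finishing position is filled exactly once per combination.
cols_ok = counts.values.transpose.all? { |col| col.sum == total }
```

Both checks pass, so each row and each column accounts for all 1296 combinations; USA, for example, finishes first in 1256 of them.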

There are plenty more places this can be taken.  For instance, when there is an extremely low % chance for a team to finish in a spot, let’s say less than 2.5% or so, I could get a list of all the game scorelines that would generate that position, then find the commonality between them so it can be displayed in English, e.g. “Egypt must win by 5 goals and Scotland must lose” or something like that.  There’s a lot more to do on this application though, not just this prediction model…