April 15, 2015

Tom Copeland: Class methods and singleton methods

2015/04/15 04:00 AM

Class methods are methods on an object's singleton class. Everyone knows this (1). I think I sort of knew it also, but recently I was working on a thing and this was brought home to me.

I was working on integration tests for filter_decrufter, so I wanted to define a sort of stubbed out ActionController::Base (2) with a class method that simulated defining a before action:

module ActionController
  class Base
    def self.before_action(*args)
    end
  end
end

And I had a subclass that attempted to call that before_action class method:

class AnotherFakeController < ApplicationController
  before_action :foo, :only => [:bar]
  def foo
  end
  def self.my_action_methods
    [:foo]
  end
end

Then filter_decrufter could define a singleton method that would check the before_action arguments and flag any options for missing actions:

# in a loop where filter_sym is before_action, after_action, around_filter, etc
ActionController::Base.define_singleton_method(filter_sym) do |*args, &blk|
  # ... gather some data about *args ...
  super(*args, &blk)
end

What I was seeing, though, was that AnotherFakeController would raise an exception when I loaded it and it attempted to call the parent class method as part of the class definition:

  1) Error:
FilterDecrufterTest#test_finds_problems:
NoMethodError: super: no superclass method `before_action' for AnotherFakeController:Class
    /lib/filter_decrufter/checker.rb:100:in `block in patch_method'
    /test/integration/another_fake_controller.rb:3:in `<class:AnotherFakeController>'

But why? The before_action method is declared right there in ActionController::Base!

The problem was that the before_action method that ActionController::Base defined was living on ActionController::Base's singleton class. No need to take my word for it though; you can verify this by defining a class method and checking the singleton methods:

irb(main):001:0> class Foo ; def self.bar ; puts "Foo#bar" ; end ; end
=> :bar
irb(main):002:0> Foo.singleton_methods
=> [:bar]

So when I defined a singleton method on ActionController::Base I was not intercepting the method call like I intended. Instead, I was redefining the existing method. And my new method definition called super, but since I'd redefined the only method with that name in this class's ancestor chain, there was no superclass method by that name available, and so bam, exception.
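
Here's a minimal reproduction of that failure mode outside of filter_decrufter (the class and method names are just for illustration):

class Parent
  def self.greet
    "hello from Parent"
  end
end

# This *replaces* the only greet in the singleton class's ancestry
Parent.define_singleton_method(:greet) do
  super()
end

Parent.greet # raises NoMethodError: no superclass method `greet'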

As a side note, singleton_methods looks up the inheritance chain, so it's not quite reliable for saying "this method is defined right here":

irb(main):001:0> class Foo ; def self.bar ; end ; end
=> :bar
irb(main):002:0> class Buz < Foo ; end ; class Biz < Buz ; end
=> nil
irb(main):003:0> Biz.singleton_methods
=> [:bar]
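
If you want to know whether a singleton method is defined directly on the receiver, pass false to exclude inherited (and module-supplied) singleton methods:

irb(main):004:0> Biz.singleton_methods(false)
=> []
irb(main):005:0> Foo.singleton_methods(false)
=> [:bar]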

Back to the original problem - how to solve it? By defining the method not on the singleton class but instead further up the ancestor chain. And how to do that? By defining the method in a module and extend'ing that module:

# Define a module with our method in it
irb(main):001:0> module Buz ; def bar ; puts "Buz#bar" ; end ; end
=> :bar
# extend that module so that it's a class method
# rather than include'ing which would make it an instance method
irb(main):002:0> class Foo ; extend Buz ; end
=> Foo
# now define a singleton method that will intercept invocations of that method
irb(main):003:0> Foo.define_singleton_method(:bar) { puts "Foo#bar" ; super() }
=> :bar
# and demonstrate that the interceptor gets called first and then calls up the ancestor chain
irb(main):004:0> Foo.bar
Foo#bar
Buz#bar
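
Applied back to the original problem, the stub ends up looking something like this (a sketch; the FakeFilters module name is mine):

# Put the fake filter methods in a module...
module FakeFilters
  def before_action(*args)
  end
end

# ...and extend it, so before_action lives above the singleton class
module ActionController
  class Base
    extend FakeFilters
  end
end

# Now the interceptor's super call can find the module's method
ActionController::Base.define_singleton_method(:before_action) do |*args, &blk|
  # ... gather some data about *args ...
  super(*args, &blk)
end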

I think the lessons learned are the usual ones. Unexpected exceptions are an opportunity for learning something. Don't confuse Java static methods with Ruby's class methods. Verify expected behavior in irb or in a small program. And read books written by people who have poured a lot of time and energy into the topic that's currently giving you trouble!

(1) Because everyone's read Paolo Perrotta's excellent Metaprogramming Ruby 2nd Ed.

(2) You could argue that I should just declare a dependency on actionpack and use it. That probably would be better; I might do that.

April 14, 2015

Ryan Davis: Great Expectations

2015/04/14 07:00 PM

minitest/spec has existed since v1.3.0 (6.5 years). It started as a simple proof of concept to show that there was a 1:1 mapping between “spec-style” and “unit-style” tests (not to be confused with BDD vs TDD, which is a development process). It showed (in 67 lines!) that every describe was just a Test subclass, and every it was just a test method (with strange inheritance).

minitest/spec has only grown to 152 lines since then but it really hasn’t changed all that much. Until yesterday.

The older spec system relied on a thread-local variable in order to understand what the current test method was. It was a limitation of how you called the expectation methods. So, when you wrote a test like this:

describe Something do
  it "should blah-blah" do                                      
    (1+1).must_equal 2
  end                                                                       
end

we call must_equal on the result of 1+1. This implies that the method is on Object in order to be called on the value 2. In order to map must_equal back to Minitest::Test’s assert_equal, minitest needed to know what test the expectation was in and that was done via Minitest::Spec.current, which just accesses a thread-local variable. That works great 99.9% of the time, until, you know, you use threads in your tests. (Why you would use threads inside your tests is irrelevant.)

So, if your test changed to this:

describe Something do
  it "should blah-blah threaded" do                                      
    my_threaded_test_thingy do
      (1+1).must_equal 2
    end
  end                                                                       
end

then it would blow up, because you’d be wrapping must_equal in your own thread and that would make the thread-local variable unavailable (they’re not scoped to parent threads, for better or worse).
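
For reference, here's that thread-local behavior in isolation (the :spec key is just for illustration):

Thread.current[:spec] = "outer test"
Thread.new { p Thread.current[:spec] }.join   # prints nil, not "outer test"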

Yesterday’s release of 5.6 brings a new expect system from tenderlove that adds a struct and one new method, _. The method wraps the resultant value in an Expectation value monad. All in all, it only adds 11 lines of code to minitest. All of the usual expectation methods are defined, but this time around, they know what test they’re in because it was stored off when the monad was created.
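
Here's a rough sketch of the shape of that idea (this is not the actual minitest source; names are illustrative):

# Wrap the value together with the test instance...
Expectation = Struct.new(:target, :ctx) do
  def must_equal(expected)
    ctx.assert_equal expected, target
  end
  # ... other expectations delegate to assertions the same way ...
end

# ...so _ just builds the wrapper, capturing the running test (self)
module ExpectationSugar
  def _(value)
    Expectation.new(value, self)
  end
end

Since the test instance rides along inside the struct, it no longer matters which thread eventually calls must_equal.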

So, one character change later, and the following works:

describe Something do
  it "should blah-blah threaded" do                                      
    my_threaded_test_thingy do
      _(1+1).must_equal 2
    end
  end                                                                       
end

The method _ is also aliased to value if you prefer more verbose tests. It is also aliased to expect, to make transitioning easier, but <bikeshed>[1] I really dislike that name</bikeshed> because it is non-descriptive. value is much better, and imo, _ is even better because it gets out of your way and lets you focus on the actual meat of the test.

At some point (pre-6.0) the old expectations are going to be deprecated in favor of the new system. Until then, they’re fine to use and won’t make any noise until the deprecation is officially planned. That will remove the monkeypatches on Object for all of the expectations that minitest supports. Hopefully that’ll reclaim the 11 lines we added.

Apologies are owed to Dickens. I couldn’t help myself, even tho I hated that book.

  1. <bikeshed> means: don’t argue with me about it. I don’t care to argue about it.

Ryan Davis: Debride gets better rails support

2015/04/14 07:00 AM

I just released debride 1.3.0. Debride statically analyzes your ruby for potentially uncalled / dead methods. It should help you strip the dead weight out of your large (or small) projects.

Thanks to contributions from Pete and Ian, it now has an --exclude flag as well as better rails DSL support.

Pete’s also released debride-haml 1.0.0 as debride’s second plugin.

If you have any problems with it or have suggestions on how to make it work better for you, please file an issue.

March 26, 2015

Ryan Davis: Debride gets Plugins!

2015/03/26 07:00 PM

I just released debride 1.2.0 and extended it with a flexible plugin system that’ll allow it to process multiple types of files.

I also released debride-erb 1.0.0 as debride’s first plugin.

Between the whitelisting and ERB support, I’ve been able to run this across seattlerb.org’s website and actually found some stuff we could remove!

If you have any problems with it or have suggestions on how to make it work better for you, please file an issue.

March 18, 2015

Ryan Davis: Debride gets Whitelisting

2015/03/18 09:00 AM

I just released debride 1.1.0 and extended it with whitelisting. So previously when you ran debride on your code you might see something like this:

% debride lib

These methods MIGHT not be called:

MyClass
  good_method                         lib/some/file.rb:16
  bad_method                          lib/some/file.rb:20
...

But if you know that good_method is exempt (perhaps because it is public API), then you can whitelist it:

% echo good_method > whitelist.txt
% debride --whitelist whitelist.txt lib

These methods MIGHT not be called:

MyClass
  bad_method                          lib/some/file.rb:20
...

You can also use regexps in your whitelist by delimiting them with slashes (eg. /^good_/).

If you have any problems with it or have suggestions on how to make it work better for you, please file an issue.

March 12, 2015

Ryan Davis: Meet Debride

2015/03/12 07:00 PM

I released a new tool named debride while on the road for Ruby on Ales and Mountain West Ruby Conf. It’s a fairly simple tool that can analyze code for potentially uncalled / dead methods. If you suspect that you have dead code but don’t know how to go about identifying it, you might want to check out debride.

% debride lib

These methods MIGHT not be called:

MyClass
  method!                            lib/some/file.rb:16

What’s dead code? To debride, it is any method defined that doesn’t appear to have a call anywhere else. So this tool isn’t as good for public API unless you also run your tests against it (but that is prone to keeping well-tested dead code alive).

caveats

There are obvious problems with using a static analyzer for a highly dynamic language like ruby. The first is send. I’m toying with the idea of providing a send wrapper that’ll log all method names to a file that you can then review and whitelist for subsequent runs of debride, but I’m totally open to suggestions & PRs.
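
One rough sketch of that idea (this is not part of debride; the log file name is arbitrary):

# Log every dynamically dispatched method name to a file
module SendLogger
  def send(name, *args, &blk)
    File.open("sent_methods.log", "a") { |f| f.puts name }
    super
  end
end

Object.prepend(SendLogger)

The resulting list could then be de-duplicated, reviewed, and fed back in as a whitelist on later debride runs.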

March 09, 2015

Ryan Davis: Standing on the Shoulders of Giants

2015/03/09 06:00 PM

I just posted my slides for my talk at both Ruby on Ales and MountainWest Ruby Conf 2015.

Enjoy! Let me know if you have any questions or comments!

February 10, 2015

Tom Copeland: String method use and misuse

2015/02/10 05:00 AM

Consider this little code snippet:

[1,2,3].select {|x| x > 1 }.first

This should be replaced by [1,2,3].detect {|x| x > 1 }; that's what pippi is for. Pippi also finds a bunch of variants - select followed by any?, etc.

But after writing a couple rules I ran out of ideas on method sequences to detect. There's an infinite number of method sequence possibilities, of course, but most of them are nonsense, and pippi is slow enough as it is without filling it with checks that probably won't find anything.

So I wanted to figure out which sequences of method calls were happening. This isn't looking for just any method sequence, though, it's focusing on method invocations on objects where that invocation returns that same type. So Array#select is interesting, because it returns another Array, but Array#first is not because the array could be holding a variety of types. After a bit of flailing around I came up with a gizmo to find such sequences and ran it (e.g., USE_PIPPI=true PIPPI_CHECKSET=research bundle exec rake test:units | grep occurred) on some code that was lying around, producing:

select followed by each occurred 34 times
select followed by map occurred 20 times
select followed by size occurred 13 times
select followed by empty? occurred 10 times
select followed by any? occurred 7 times
select followed by first occurred 7 times
select followed by sort_by occurred 4 times
select followed by reject occurred 3 times
select followed by select occurred 3 times
select followed by count occurred 3 times
select followed by last occurred 3 times
select followed by detect occurred 2 times
select followed by length occurred 2 times
select followed by group_by occurred 2 times
select followed by none? occurred 2 times
select followed by to_a occurred 1 times
select followed by each_with_index occurred 1 times
select followed by slice occurred 1 times
select followed by inject occurred 1 times
select followed by all? occurred 1 times
select followed by collect occurred 1 times
select followed by map! occurred 1 times
select followed by join occurred 1 times

The grep is there because the utility prints out the location of each hit so you can go take a look at it, and I wanted to filter those out here. I'm printing out a frequency map, but that's not to say that low numbers on the histogram aren't interesting; if a Pippi rule only finds one hit in a 10KLOC codebase it's still a good check if it actually found something to fix. But while we're accumulating data we may as well get a feel for frequency.
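
For the curious, a toy version of that sequence-finding gizmo might look something like this (this is not pippi's actual implementation, and the tracked method list is abbreviated):

# Tag each Array returned by a tracked method with that method's name,
# and count (previous, current) pairs as they happen.
SEQUENCES = Hash.new(0)

module SequenceCounter
  [:select, :map, :reject, :sort_by, :each, :first, :size, :empty?].each do |name|
    define_method(name) do |*args, &blk|
      prev = instance_variable_defined?(:@__last_call) && @__last_call
      SEQUENCES[[prev, name]] += 1 if prev
      result = super(*args, &blk)
      if result.instance_of?(Array) && !result.frozen?
        result.instance_variable_set(:@__last_call, name)
      end
      result
    end
  end
end

Array.prepend(SequenceCounter)

at_exit do
  SEQUENCES.sort_by { |_, count| -count }.each do |(a, b), count|
    puts "#{a} followed by #{b} occurred #{count} times"
  end
end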

There's some good stuff there - for example, that select followed by all? should probably be stamped out - but we've already got a bunch of checks around Array#select. How about String#strip?

strip followed by empty? occurred 15 times
strip followed by downcase occurred 10 times
strip followed by split occurred 7 times
strip followed by strip occurred 4 times
strip followed by gsub occurred 4 times
strip followed by encode occurred 3 times
strip followed by length occurred 3 times
strip followed by encoding occurred 3 times
strip followed by force_encoding occurred 3 times
strip followed by index occurred 3 times
strip followed by count occurred 2 times
strip followed by size occurred 2 times
strip followed by ascii_only? occurred 2 times
strip followed by unpack occurred 1 times
strip followed by strip! occurred 1 times
strip followed by gsub! occurred 1 times
strip followed by match occurred 1 times
strip followed by tr occurred 1 times
strip followed by dump occurred 1 times
strip followed by b occurred 1 times
strip followed by delete occurred 1 times
strip followed by encode! occurred 1 times

Definitely some good fodder for rules there. Why would you ever call strip on a String that you got as a result of calling String#strip? Maybe strip followed by empty? could be replaced with a regex check like if foo.match(/^\s*$/). And you can imagine some other possibilities when seeing that list.

To repeat the warning in the pippi README, not all of these issues are fixable. A method might call strip on a String and return it, and the caller might then invoke strip on that same object because a different code path would bring an un-strip'd String to that same location. I don't think there's a good way around that short of a refinement or some such that would tell pippi to ignore that hit, and I'm not sure that'd be worth the trouble.

Anyhow, there should be some good check potential here. We'll see!

January 15, 2015

Ryan Davis: You can go home now, I solved testing

2015/01/15 08:00 PM

I need to work on a real blog post about this topic, but all I could come up with yesterday was this snarky snippet of code:

def test_some_method
  assert_something_output do
    assert_nothing_raised do
      assert_some_side_effect do
        assert_some_response do
          object.some_method # testing SOLVED
        end
      end
    end
  end
end

This is, of course, me responding to more resistance to minitest not having useless assertions like assert_nothing_raised. I’ve talked about it before. I’ve blogged about it before. Pretty sure it’ll never die.

January 07, 2015

Ryan Davis: Emacs Automodes

2015/01/07 08:00 PM

I have a system that I’ve already detailed called “autohooks”, but I also hinted about something I use called “automodes”. Automodes? What are they and how do they differ from autohooks?

Hooks are intended to be run every time you load a file and go into a mode. When I load a ruby file, my enh-ruby-mode-hook gets triggered. That’s what my autohook system is for. Automodes, tho, are intended to load all the stuff that needs to be loaded before the hook triggers. An easy example is setting up mappings from file patterns (eg “*.rb” files) to their modes (enh-ruby-mode), or setting up variables needed for the mode to be set on load (eg enh-ruby-program).

How simple is it? Just 4 lines long:

(defun rwd-load-modes ()
  (interactive)
  (dolist (path (directory-files (concat user-init-dir "modes") t ".*el$"))
    (load path)))

This is triggered right after loading emacs via my after-init autohook file:

;; after-init.el:
(when (and running-osx (string= "/" default-directory))
  (cd "~"))

(global-whitespace-mode 1)

(rwd-load-modes)

As an example, here is my modes/enh-ruby-mode.el:

(autoload 'enh-ruby-mode "enh-ruby-mode" "Major mode for ruby files" t)

(dolist (spec '(("\\.mab$"   . enh-ruby-mode)
                ("\\.rb$"    . enh-ruby-mode)
                ("Rakefile"  . enh-ruby-mode)
                ("Gemfile"   . enh-ruby-mode)
                ("\\.rake$"  . enh-ruby-mode)))
  (add-to-list 'auto-mode-alist spec))

(add-to-list 'interpreter-mode-alist '("ruby" . enh-ruby-mode))

(add-to-list 'load-path (expand-file-name "~/Work/git/enhanced-ruby-mode"))
(setq enh-ruby-program (expand-file-name "/usr/bin/ruby"))

It is a really simple system and really only exists so that I can break up a monolithic setup into many small single-topic files. This has made things a lot more maintainable over time.

December 19, 2014

Ryan Davis: Updated iTunes Bedtime script

2014/12/19 08:00 AM

Minor changes this time. This works in iTunes v12, which is the latest release for Yosemite. The only real fixes centered around verbiage changes to menu items. I wish it had AppleScript support that didn’t require UI scripting to do basic things like listen to a smart playlist or a particular band.

This script starts up iTunes and Airfoil, sets the volumes of both, picks a specified playlist, and plays it. I fire this one up just before going to bed. I have another copy that sets up my programming playlist and another for cooking.

They’re always available via the script menu by putting them in "~/Library/Scripts/".

I hope this helps.

Bedtime.scpt:

property myPlaylist : "Mellow"
property myVolume : 50
property airfoilVolume : 0.5

tell application "iTunes"
    activate
    stop application
    
    my setShuffle()
    
    set sound volume to myVolume
    
    activate
    my setSource()
    my minimizeITunes()
end tell

my setSpeakers()

on setShuffle()
    tell application "System Events"
        tell process "iTunes"
            tell menu 1 of menu bar item "Controls" of menu bar 1
                tell menu 1 of menu item "Shuffle"
                    if "Turn On Shuffle" is in (title of menu items) then
                        click menu item "Turn On Shuffle"
                    end if
                    
                    click menu item "Songs"
                end tell
                
                tell menu 1 of menu item "Repeat"
                    click menu item "All"
                end tell
            end tell
        end tell
    end tell
end setShuffle

on minimizeITunes()
    tell application "iTunes" to close every window
    tell application "System Events" to tell process "iTunes"
        try
            click menu item "Switch to MiniPlayer" of menu 1 of menu bar item "Window" of menu bar 1
        end try
    end tell
end minimizeITunes

on setSource()
    tell application "iTunes"
        tell playlist myPlaylist
            reveal
            play
        end tell
        next track
    end tell
end setSource

on setSpeakers()
    set volume output volume 100
    set volume input volume 100
    set volume alert volume 100
    set volume without output muted
    
    tell application "Airfoil"
        activate
        close every window
        set current audio source to application source "iTunes"
        set linked volume to false
        
        disconnect from every speaker
        
        set (volume of every speaker) to airfoilVolume
        
        connect to speaker "airport2"
        connect to speaker "Computer"
    end tell
end setSpeakers

November 10, 2014

Evan Phoenix: On Portable Computation…

evanphx @ 2014/11/10 07:42 AM

Since the first time two computers communicated directly over a network, programmers have mused about the programs running on those computers. Back in those early days, there was careful external synchronization of programs: "Bob, will you load number 3 while I do the same over here?" We like to think that we're well past that, but in reality, it's basically still the same situation today. We've invented versioning schemes to make that external synchronization easier to swallow, but we're still just manually putting 2 programs on different computers and hoping they can talk to each other.

And so the industry has slowly tried to evolve solutions to this problem. Java, for instance, strove to allow a server to push down the client version of a program to the browser to run, allowing the server to decide at load time which to push (you remember applets, don't you?). Modern web programmers effectively worked their butts off to avoid the problem and instead push a very heavy standards-driven approach (hi there, W3C!). And because web programming has dominated so much of the industry for the past decade, there hasn't been a big push to work on portable computing.

Disclaimer: I'm sure people will comment and say "OMG Evan! This person and that person have been working on it for 20+ years!" I'm sure that's true, but almost nothing has gone mainstream, so it has had no real effect on everyday programmers.

There are 3 really interesting attempts at bringing portable computing to the everyday programmer that I've been thinking about this weekend, so I thought I'd do a quick post about them.

  1. NaCL – Google's approach to applets, basically. With a heavy emphasis on C++ as the target language, the aim is squarely at game developers. But I think if you look past immediate browser usage, you can see a separate idea: the ability to generate managed machine code that only runs inside a special harness and can be easily transported. I'm sure the irony of that last sentence NOT being applied to Java will be lost on no one. And in that irony, a nugget of truth: this approach isn't new. The exact same thing could be applied to Java or really any VM. The place where NaCL deviates is that, because it generates machine code and the toolchain is a normal C/C++ compiler, it's possible to compile a VM itself with NaCL and deploy it. Thus NaCL can be seen as a safe way to distribute binaries, which makes it pretty interesting. I'll call this approach to portable computing the specialized environment approach.
  2. Docker – I'm talking about Docker rather than containers generically or other similar technologies because it's Docker's pairing of containers with images that are delivered over a network. Certainly a service to download a binary and run it would be trivial; it's the use of containers that turns Docker into a proper portable computing environment. The interesting aspect of Docker is that its approach is squarely on the computing "backend". NaCL and its spiritual father, the Java applet, focus on getting new computation units to an end user directly, whereas Docker is about the management of backend services. It's just that the code for those backend services is delivered remotely on demand, making it a portable computing environment. This is the constrained environment type of portable computing.
  3. Urbit – Urbit is, well, it feels like the implementation of an idea from a scifi novel. Part of that is on purpose; its founder is known to be quite esoteric. Cutting through all the weirdness, I find a kernel of an idea for portable computing. They've defined a VM that can run a very simple bytecode and a whole ecosystem around it. The piece I found most interesting is the fact that all programs are in that bytecode and all programs are stored in their global filesystem service. This means that it's trivial for programs written and stored by one computer to be run by another computer. Add in versioning, public identities, and encryption and you've got yourself a really nice little portable computing environment. Because the VM is so effectively stateless and the services promote storing immutable data, it's easy to see how a distributed system would work. The problem? It's all in such an esoteric environment that it's going to be incredibly difficult for even the most motivated programmers to get into. This is the isolated environment type.

So, what's the point of cataloging these here? Well, I started looking at Urbit and thinking about how difficult it will be for it to get uptake. But the kernel of the idea is super interesting. So could that kernel be transplanted to something like NaCL or Docker to provide a meaningful environment for distributed programming? Urbit's immutable global file system seems perfect to apply CRDTs to, allowing multiple nodes to share a meaningful view of some shared data. Wire in a discovery mechanism that will tie environments together and I think there could be something big there.


October 22, 2014

Tom Copeland: Finding suboptimal Ruby class API usage

2014/10/22 04:00 AM

Consider this little array:

[1,2,3]

Now suppose we want to find the first element in that array that's greater than one. We can use Array#select and then use Array#first on the result:

[1,2,3].select {|x| x > 1 }.first

Of course that's terribly inefficient. Since we only need one element we don't need to select all elements that match the predicate. We should use Array#detect instead:

[1,2,3].detect {|x| x > 1}

This is an example of refactoring to use the Ruby standard library. These are small optimizations, but they can add up. More importantly, they communicate the intent of the programmer; the use of Array#detect makes it clear that we're just looking for the first item to match the predicate.

This sort of thing can be found during a code review, or maybe when you're just poking around the code. But why not have a tool find it instead? Thus, pippi. Pippi observes code while it's running - by hooking into your test suite execution - and reports misuse of class-level APIs. Currently pippi checks for "select followed by first", "select followed by size", "reverse followed by each", and "map followed by flatten". No doubt folks will think of other checks that can be added.
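
To make that concrete, here's a toy illustration of the general approach (this is not pippi's code, and pippi's real checks do considerably more bookkeeping):

# Tag arrays produced by select, and complain if first is later called on one
module SelectFirstCheck
  def select(*args, &blk)
    result = super
    unless result.frozen?
      result.instance_variable_set(:@__select_site, caller(1, 1).first)
    end
    result
  end

  def first(*args)
    site = instance_variable_get(:@__select_site)
    warn "select followed by first at #{site} -- consider Array#detect" if site
    super
  end
end

Array.prepend(SelectFirstCheck)

[1, 2, 3].select { |x| x > 1 }.first   # warns, and still returns 2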

Here's an important caveat: pippi is not, and more importantly cannot be, free of false positives. That's because of the halting problem. Pippi finds suboptimal API usage based on data flows as driven by a project's test suite. There may be alternate data flows where this API usage is correct. For example, in the code below, if rand < 0.5 is true, then the Array will be mutated and the program cannot correctly be simplified by replacing "select followed by first" with "detect":

x = [1,2,3].select {|y| y > 1 }
x.reject! {|y| y > 2} if rand < 0.5
x.first

At one point I considered pulling the plug on this utility because of this fundamental issue. But after thinking it over and seeing what pippi found in my code, I decided to stay with it because there were various techniques that eliminated most of these false positives. For example, after flagging an issue, pippi watches subsequent method invocations and if those indicate the initial problem report was in error it'll remove the problem from the report.

There are many nifty Ruby static analysis tools - flay, reek, flog, etc. This is not like those. It doesn't parse source code; it doesn't examine an abstract syntax tree or even sequences of MRI instructions. So it cannot find the types of issues that those tools can find. Instead, it's focused on runtime analysis; that is, method calls and method call sequences.

To see what pippi finds in your Rails project, follow the instructions in the pippi_demo example Rails app. TLDR: Add gem 'pippi' to your Gemfile, add a snippet to test_helper.rb, and run a test with USE_PIPPI=true to see what it finds.

Note that pippi is entirely dependent on the test suite to execute code in order to find problems. If a project's test code coverage is small, pippi probably won't find much.

Here's how pippi stacks up using the Aaron Quint Ruby Performance Character Profiles system:

  • Specificity - very specific, finds actual detailed usages of bad code
  • Impact - very impactful, slows things down lots
  • Difficulty of Operator Use - easy to install, just a new gemfile entry
  • Readability - results are easy to read
  • Realtimedness - finds stuff right away
  • Special Abilities - ?

Finally, why "pippi"? Because Pippi Longstocking was a Thing-Finder, and pippi finds things.

October 21, 2014

Ruby Inside: Raptor: A Forthcoming Ruby Web Server for Faster App Deployment

Peter Cooper @ 2014/10/21 01:20 PM

Raptor bills itself as a new Ruby “app server” and it claims to blow everything else out of the water performance-wise (by 2-4x!) whether that’s Unicorn, Puma, Passenger, or even TorqueBox on JRuby. The bad news for now is there’s no source or repo yet and only a handful of people (including me) have been given a sneak peek, although a public beta is promised on November 25th.

The history of Ruby webapp deployment

The deployment of Ruby (and therefore Rails) webapps was a painful mess for years, a state I lamented 7 years ago in No True ‘mod_ruby’ is Damaging Ruby’s Viability on the Web. Thankfully, shortly thereafter a number of projects came out to make life easier, the most famous being Phusion Passenger (then known as mod_rails) in April 2008.

Things have continued to improve gradually over the years, with Passenger getting consistently better, and new approaches such as those offered by Unicorn and Puma, using JRuby, as well as proxying through Nginx, coming into the picture.

Enter Raptor

Raptor, a new entry to the burgeoning world of Ruby Web servers, boasts some compelling features. "Visibility" is cited as a key feature so that you can look ‘into’ your app and analyze its performance as easily as possible using a JSON API (so building your own tools around the API should be simple). Raptor also uses the HTTP parser from Node which itself was derived from Nginx’s HTTP parser; both are renowned for their speed and stability. Raptor boasts a zero-copy, concurrent, evented architecture which makes it efficient memory and IO-wise - so even if you have slow clients or a slow network, these won’t bring your app server to a stuttering standstill.

Another feature that jumped out at me is integrated caching. Raptor doesn’t rely on external services like memcached or Redis at all; its cache is truly internal and optimized specifically for Web workloads. If you’ve never set up caching before, this could provide a big boost as with Raptor it’ll be available “out of the box”.

The initial results seem promising. Fabio Akita has already shared some early benchmark results which broadly mirror my own experience (disclaimer: as someone with rather little experience and authority in benchmarking, my benchmarks are oriented around Raptor’s own benchmarking suite) but, as always, YMMV and such benchmarks are often criticized.

The waiting game..

The team behind Raptor promise they’ll be releasing some interesting blog posts soon about the technology behind it, including how the cache is implemented and has been optimized, how the zero-copy system works and how it’ll benefit your code, and similar things. So keep an eye on rubyraptor.org, especially around November 25th.

October 18, 2014

Tom Copeland: Good technical videos

2014/10/18 04:00 AM

I rarely get a chance to go to technical conferences, but I do watch a lot of videos. I tweet about the good ones; here's a collection for posterity:

October 16, 2014

Ruby Inside: Ruby's Unary Operators and How to Redefine Their Functionality

Peter Cooper @ 2014/10/16 09:34 AM

In math, a unary operation is an operation with a single input. In Ruby, a unary operator is an operator which only takes a single 'argument' in the form of a receiver. For example, the - on -5 or ! on !true.

In contrast, a binary operator, such as in 2 + 3, deals with two arguments. Here, 2 and 3 (which become one receiver and one argument in a method call to +).

Ruby only has a handful of unary operators, and while it's common to redefine binary operators like + or [] to give your objects some added syntactic sugar, unary operators are less commonly redefined. In my experience, many Rubyists aren't aware that unary operators can be redefined and... technically you can't "redefine an operator," but Ruby's operators frequently use specially named methods behind the scenes, and as you'll know... redefining a method is easy in Ruby!

A Quick Example with -@

Let's ease into things with the - unary operator. The - unary operator is not the same thing as the - binary operator (where a binary operator has two operands). By default, the - unary operator is used as notation for a negative number, as in -25, whereas the - binary operator performs subtraction, as in 50 - 25. While they look similar, these are different concepts, different operators, and resolve to different methods in Ruby.

Using the - unary operator on a string in irb:

> -"this is a test"
NoMethodError: undefined method `-@' for "this is a test":String

The String class doesn't have unary - defined but irb gives us a clue on where to go. Due to the conflict between the unary and binary versions of -, the unary version's method has a suffix of @. This helps us come up with a solution:

str = "This is my STRING!"

def str.-@
  downcase
end

p str     # => "This is my STRING!"
p -str    # => "this is my string!"

We've defined the unary - operator by defining its associated -@ method to translate its receiving object to lower case.

Some Other Operators: +@, ~, ! (and not)

Let's try a larger example where we subclass String and add our own versions of several other easily overridden unary operators:

class MagicString < String
  def +@
    upcase
  end

  def -@
    downcase
  end

  def !
    swapcase
  end

  def ~
    # Do a ROT13 transformation - http://en.wikipedia.org/wiki/ROT13
    tr 'A-Za-z', 'N-ZA-Mn-za-m'
  end
end

str = MagicString.new("This is my string!")
p +str         # => "THIS IS MY STRING!"
p !str         # => "tHIS IS MY STRING!"
p (not str)    # => "tHIS IS MY STRING!"
p ~str         # => "Guvf vf zl fgevat!"
p +~str        # => "GUVF VF ZL FGEVAT!"
p !(~str)      # => "gUVF VF ZL FGEVAT!"

This time we've not only redefined -/-@, but the + unary operator (using the +@ method), ! and not (using the ! method), and ~.

I'm not going to explain the example in full because it's as simple as I could get it while still being more illustrative than reams of text. Note what operation each unary operator is performing and see how that relates to what is called and what results in the output.

Special Cases: & and *

& and * are also unary operators in Ruby, but they're special cases, bordering on 'mysterious syntax magic.' What do they do?

& and to_proc

Reg Braithwaite's The unary ampersand in Ruby post gives a great explanation of &, but in short & can turn objects into procs/blocks by calling the to_proc method upon the object. For example:

p ['hello', 'world'].map(&:reverse)  # => ["olleh", "dlrow"]

Enumerable#map usually takes a block instead of an argument, but & calls Symbol#to_proc and generates a special proc object for the reverse method. This proc becomes the block for the map and thereby reverses the strings in the array.

You could, therefore, 'override' the & unary operator (not to be confused by the equivalent binary operator!) by defining to_proc on an object, with the only restriction being that you must return a Proc object for things to behave. You'll see an example of this later on.

* and splatting

There's a lot of magic to splatting but in short, * can be considered to be a unary operator that will 'explode' an array or an object that implements to_a and returns an array.

To override the unary * (and not the binary * - as in 20 * 32), then, you can define a to_a method and return an array. The array you return, however, will face further consequences thanks to *'s typical behavior!

A Full Example

We've reached the end of our quick tour through Ruby's unary operators, so I wanted to provide an example that shows how to override (or partially override) them that should stand as its own documentation:

class MagicString < String
  def +@
    upcase
  end

  def -@
    downcase
  end

  def ~
    # Do a ROT13 transformation - http://en.wikipedia.org/wiki/ROT13
    tr 'A-Za-z', 'N-ZA-Mn-za-m'
  end

  def to_proc
    Proc.new { self }
  end

  def to_a
    [self.reverse]
  end

 def !
   swapcase
 end
end

str = MagicString.new("This is my string!")
p +str                   # => "THIS IS MY STRING!"
p ~str                   # => "Guvf vf zl fgevat!"
p +~str                  # => "GUVF VF ZL FGEVAT!"
p %w{a b}.map &str       # => ["This is my string!", "This is my string!"]
p *str                   # => "!gnirts ym si sihT"

p !str                   # => "tHIS IS MY STRING!"
p (not str)              # => "tHIS IS MY STRING!"
p !(~str)                # => "gUVF VF ZL FGEVAT!"

It's almost a cheat sheet of unary operators :-)

A Further Example: The TestRocket

TestRocket is a tiny testing library I built for fun a few years ago. It leans heavily on unary operators. For example, you can write tests like this:

+-> { Die.new(2) }
--> { raise }
+-> { 2 + 2 == 4 }

# These two tests will deliberately fail
+-> { raise }
--> { true }

# A 'pending' test
~-> { "this is a pending test" }

# A description
!-> { "use this for descriptive output and to separate your test parts" }

The -> { } sections are just Ruby 1.9+ style 'stabby lambdas' but, with assistance from Christoph Grabo, I added unary methods to them so that you can prefix +, -, ~, or ! to get different behaviors.

Hopefully you can come up with some more useful application for unary methods on your own objects ;-)

Ryan Davis: Happy Birfday to me!

2014/10/16 01:13 AM

Today is my Fourteenth Year Anniversary with Ruby! Yay!

823 gem releases (up by 93), 9805 commits (up by 823), and bazillions of test runs later, and I’m still going strong. Rawr!

October 07, 2014

Tom Copeland: Cleaning up a Rails routes file

2014/10/07 04:00 AM

Large Rails apps have a way of gathering cruft. One popular cruft hideout is config/routes.rb. It doesn't change very often, and when it does change it's usually a person getting in there, adding a new resource or a new action, and getting out. So from time to time it's worth checking the routes file and tidying things up.

Probably the most obvious cleanup is removing unused routes. A route which has no corresponding action won't produce a warning, so you have to ferret these out yourself. Take a look at the routes file and see if anything obvious jumps out at you; if you know that a bunch of admin code was just extracted into another app you can look around for routes in an admin namespace. Do a git whatchanged -p app/controllers/ config/routes.rb (or as Brian Ewins suggests, git log -p --no-merges app/controllers/ config/routes.rb because whatchanged is deprecated) and see if recent controller or action removals included removal of orphaned routes. Grab the traceroute gem and see what it finds. This will lead to a smaller file, faster load time, and generally less confusion.

There are some minor cleanups that are worth a look. Sometimes I'll see resources with empty blocks:

resources :products, :only => [:index, :show] do
end

This is usually left over from an earlier cleanup; just delete those empty blocks. Same goes for empty member or collection blocks. This is trivial, but it makes the file easier to read.

The default resource declaration - e.g., resources :products - results in routes to seven actions: new, create, edit, update, index, show, and destroy. Occasionally you'll see a resource declaration with all seven actions declared explicitly in an only constraint; in that case you can remove the entire constraint. So this:

resources :products, :only => [:create, :new, :edit, :update, :index, :show, :destroy]

can be boiled down to:

resources :products

The same applies to resource declarations, of course, but with six actions instead of seven.

A variant on this is to replace only blocks with except blocks when those would be shorter. That is, you can replace this:

resources :products, :only => [:create, :new, :edit, :update, :index, :show]

with this:

resources :products, :except => [:destroy]

Some people prefer wrapping single actions in an array when declaring resource restrictions; that is, they prefer resources :products, :only => [:create] to resources :products, :only => :create. I tend to use the bare symbol, but I can see where those folks are coming from - makes it easier to add other actions - so whatever works for your team.

This one is not like the others in that it makes your routes file bigger. But it reduces overall complexity, so I think it belongs here anyway. If you have an action that is serving only to redirect requests to another URL, you can replace that action with a route. So you can replace:

class ProductsController < ApplicationController
  def show
    redirect_to root_path
  end
end

With this route:

match '/products/:id' => redirect('/')

This trades five lines of code and a file for a single route, which seems like a win to me. There are times when this isn't quite as clearcut, like when the redirect is to a more complicated route, but generally this is just a more effective use of the Rails API. Shades of 'refactor to the standard library'!

Another change which increases the routes file size but reduces the number of routes generated is to add only blocks where appropriate. I've found that many resources don't need a destroy action since that's handled in an admin app. So add only blocks and reduce your rake routes output.
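
For instance, a resource that's only ever read in this app might be pared down to:

resources :products, :only => [:index, :show]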

There are other cleanups which could be made around shared constraints, unnecessary controller specifications, and redundant path_names usage. But I feel like usage of these features is rare enough that if you're using them you're probably reading the documentation.

When making changes like this, one option is to do a clean sweep and tidy up everything at once. I can see why people do this; it's one code review, one deploy, etc. But I usually do these things piecemeal - one or two changes at a time. I do this because there's lower risk; if I'm only changing a few things it's easy to locate the culprit if something goes wrong. It also lets me do the work in small chunks; instead of devoting a full day to routes file cleanup I can do 10 minutes of work here and there whenever I have time. Also, small changes are more helpful for co-workers to see what's going on. If someone sees a commit with a small and straightforward route cleanup, that person might be more likely to do a similar cleanup next time they see a similar issue. And if I'm headed down a wrong road with a cleanup strategy, someone can catch it earlier on in the process before I do too much damage.

October 01, 2014

Tom Copeland: Preserving technical knowledge

2014/10/01 04:00 AM

If you've worked for a single company in a tech job for a while you know that there's a fair bit of turnover in our world. Folks move on to other jobs, new folks come on, people move between technical teams and over into management, etc. Why, chances are you might even have switched jobs yourself a few times!

It's nice to be able to preserve some technical knowledge during these transitions. If the person is switching roles within the company this is not such a big deal as that person will still be available in a crisis. But even in that case, there are a couple of ways to approach this.

One way is to have the outgoing person do a bunch of diagrams explaining stuff. To me this doesn't help a whole lot; these diagrams all end up looking the same: "this box connects to this cloud where things happen, then there are these other boxes". But it can't hurt. My advice here is to keep the diagrams simple. Also, do them in a tool that the next system owner knows how to use, otherwise they bitrot. I suspect there are folks who really learn a lot from these diagrams, so if you are one of those, have at it. But they don't seem to work for me. I wonder if a way to ensure these pictures have meaning is for the outgoing person to point to a connecting line and say "what would happen if this got cut" and the incoming person has to take a crack at it. That's probably a decent way to learn about things in general - drop a raise "BOOM" in a random place and think/talk through what effect that would have.

Another way is to do face to face interviews where the outgoing person explains the system. I think this is a bit more effective, however, the incoming person has to know something about what's going on otherwise none of the information makes sense. "We've had a longstanding problem with frobnicators not cromulating" doesn't mean much unless you know why a frobnicator should cromulate. However, this interview method can result in some frank and earnest sharing about system weaknesses that the outgoing person may be embarrassed to write down. Meetings like this have to have some structure, though, otherwise we just end up shooting the breeze for an hour and the knowledge passing doesn't actually happen.

My preferred way is to work out a shared handover. So Sally has been running the payroll system, she's leaving, and Fred is getting it. Sally should start redirecting all emails to Fred. If a report fails once or twice a day, Sally can make sure that there's some kind of alert set up, but Fred has to be the one fixing it. This is a fine opportunity for Sally to say something like "yeah I could never figure out how to get these gizmos to be 100%, but maybe you'll be able to sort it out". Humility goes a long way when you're outgoing; got to set Fred up for success. If there's some functionality move in progress - the tax calculations are getting extracted to a third party system - then Fred goes to the next status meeting and Sally doesn't. He'll be clueless, but then he'll know he's clueless, rather than sitting there with glazed eyes while Sally talks shop. Once the meeting is over he can race back to Sally's desk with a list of questions and now the answers will mean something to him.

The last chapter of Dave Copeland's excellent book The Senior Software Engineer talks about an outgoing dev's responsibility to leave a project in capable hands and covers some of this same material: one on one interviews, reviewing current or potential issues, etc. That's assuming that it's an amicable departure, but so is everything in this blog post.

Finally, I think it's helpful if the outgoing person can give the incoming developer a feel for the importance and impact of the job. "You're the key person for the payroll team", "you'll be able to save 200 accountants hours each day", etc. A little vision goes a long way towards motivating a developer to make a system better.

September 22, 2014

Tom Copeland: Various ways to create structs

2014/09/22 04:00 AM

The first chapter of Understanding Computation does a whirlwind tour of Ruby programming - syntax, control structures, etc. One of the constructs the author calls out is the Struct class. I've always used it like this:

Team = Struct.new(:players, :coach)
class Team
  def slogan
    "A stitch in time saves nine"
  end
end
some_team = Team.new(some_players, some_coach)

That is, I'd use the Struct constructor as a factory and assign the result to a constant. Then, if I needed to, I'd reopen the class that had just been created in order to add methods.

The example in the book does it a different way:

class Team < Struct.new(:players, :coach)
  def slogan
    "It takes one to know one"
  end
end

This creates an anonymous Struct, uses it as a superclass, assigns it to a constant, and reopens the new class all in one fell swoop. I poked around various gems and this seems like a pretty common approach - for example, Unicorn::App::Inetd::CatBody does this, and so does Arel::Attributes::Attribute.

The latter approach does result in more noise in the ancestor chain:

irb(main):001:0> Foo = Struct.new(:x) ; Foo.ancestors
=> [Foo, Struct, Enumerable, Object, Kernel, BasicObject]
irb(main):002:0> class Bar < Struct.new(:x) ; end ; Bar.ancestors
=> [Bar, #<Class:0x007ff4abb8b508>, Struct, Enumerable, Object, Kernel, BasicObject]

That's because Struct.new returns an anonymous subclass of Struct, and that subclass hasn't been assigned to a constant, so Ruby has to synthesize a to_s value from the type and object id. You could get around that using the two-arg constructor, but that creates the new class as a constant in the class Struct, which is a little weird:

irb(main):001:0> Struct.new("Bar", :x).ancestors
=> [Struct::Bar, Struct, Enumerable, Object, Kernel, BasicObject]

Sometimes I see code create Struct subclasses using the class keyword without adding methods; for example from WebSocket::Driver:

class OpenEvent < Struct.new(nil) ; end

Is there a reason to do the above rather than OpenEvent = Struct.new(nil)?

Edit: Bruno Michel noted that the Struct constructor also accepts a block:

Foo = Struct.new(:x) do
  def hello
    "hi"
  end
end

This is a nice technique because it results in a clean ancestor chain.
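
A quick irb check bears that out:

irb(main):001:0> Foo = Struct.new(:x) { def hello ; "hi" ; end } ; Foo.ancestors
=> [Foo, Struct, Enumerable, Object, Kernel, BasicObject]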

Edit 2: Tom Stuart (the author of 'Understanding Computation') pointed out this great writeup on Structs by James Edward Gray II where he explains why he's using Structs in the way he does. Definitely worth a read, especially with Ara Howard weighing in with a comment about his Map implementation.

Thanks to Thomas Olausson for reviewing this post!
