P O S T M O D E R N

Hexdump 0.2.x

hexdump, ruby

Recently mephux began working on a hexdump.js library (epic demo). Naturally, I thought I should go back and improve my Ruby Hexdump library.

Hexdump 0.2.x

Hexdump now supports word-sizes and endianness. This is useful for dumping packed unsigned ints:

Hexdump.dump("\x12\x34\x00\x00\x42\x42", :word_size => 2, :endian => :big)
# 00000000  1234 0000 4242                           |ሴ䉂|

:word_size can have any value, and is not restricted to powers of 2.

Now that Hexdump parses multi-byte words from data, displaying Unicode characters was the next logical step:

Hexdump.dump("\xb6\x80" * 10, :word_size => 2)
# 00000000  80b6 80b6 80b6 80b6 80b6 80b6 80b6 80b6  |肶肶肶肶肶肶肶肶|
# 00000010  80b6 80b6                                |肶肶|

Finally, Hexdump received some performance tuning which cut benchmark times in half on Ruby 1.9.2:

Before

                                user     system      total        real
hexdump (block)             7.740000   0.030000   7.770000 (  8.138029)
hexdump                     9.590000   0.050000   9.640000 ( 10.178203)
hexdump width=256 (block)   7.280000   0.020000   7.300000 (  7.534507)
hexdump width=256           8.130000   0.030000   8.160000 (  8.342448)
hexdump ascii=true (block)  7.740000   0.030000   7.770000 (  7.958550)
hexdump ascii=true          9.550000   0.050000   9.600000 (  9.803758)

After

                                 user     system      total        real
hexdump (block)              3.010000   0.010000   3.020000 (  3.529396)
hexdump                      5.430000   0.030000   5.460000 (  6.216174)
hexdump width=256 (block)    3.010000   0.020000   3.030000 (  3.308961)
hexdump width=256            4.700000   0.040000   4.740000 (  5.520189)
hexdump ascii=true (block)   3.050000   0.010000   3.060000 (  3.501436)
hexdump ascii=true           5.450000   0.040000   5.490000 (  6.352144)
hexdump word_size=2 (block)  7.420000   0.050000   7.470000 (  9.174734)
hexdump word_size=2          9.500000   0.070000   9.570000 ( 11.228204)
hexdump word_size=4 (block)  4.110000   0.030000   4.140000 (  4.849785)
hexdump word_size=4          5.380000   0.060000   5.440000 (  6.209022)
hexdump word_size=8 (block)  3.350000   0.070000   3.420000 (  4.147304)
hexdump word_size=8          4.430000   0.040000   4.470000 (  5.930758)

Using ruby-prof and the Rubinius Profiler (rbx -Xprofile) I found that the majority of time spent was in:

  1. Proc#call
  2. Array#join
  3. Integer#chr
  4. String#%

Optimizing out these excess method calls involved:

  • Frequently called lambdas were replaced with methods.
  • Array#join was replaced with incremental String concatenating code.
  • A lookup table of bytes and printable characters were added, to reduce excess Integer#chr calls.
  • Interestingly, Kernel#sprintf is slightly faster than String#%.

Uses

Both hexdump.js and Ruby Hexdump aim to provide similar features and behaviors. hexdump.js is definitely useful for when you want to offload the work of Hexdumping to the Browser, or when writing node.js Apps.

The Ruby Hexdump library is more suited for Ruby CLI Apps, but can also be used in Web Apps:

dump = []

Hexdump.dump(data) do |index,numeric,printable|
  dump << [index, numeric, printable]
end

render :json => dump