Stefan Kaes - Optimizing Rails

Posted by kev Fri, 23 Jun 2006 22:08:00 GMT

Stefan went very, very fast during his presentation, so I’ve missed bits and pieces. I’ll link his slides if I can (though they may not be available except for the $50 video). Sorry about that.

Performance Tuning

  • Trying to improve performance without measuring is foolish
  • In favor of optimization at design time

Performance Parameters

  • Latency
    • How fast can you answer a request?
  • Throughput
    • How many requests can you process per second?
  • Utilization
    • Are your servers idle most of the time?
  • Cost efficiency
    • Performance per unit cost
  • Compute mean, min, max, and standard deviation (if applicable). The standard deviation will tell you how reliable your data is.
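
A minimal sketch of the statistics mentioned above, in plain Ruby. The `summarize` method and the sample timings are made up for illustration; they are not from the talk or from railsbench.

```ruby
# Summary statistics for a list of request timings (in seconds).
def summarize(samples)
  raise ArgumentError, "need at least one sample" if samples.empty?

  n    = samples.size
  mean = samples.sum.to_f / n
  # Sample variance (n - 1 in the denominator); 0.0 when there is only one sample.
  variance = n > 1 ? samples.sum { |x| (x - mean)**2 } / (n - 1) : 0.0

  { mean: mean, min: samples.min, max: samples.max, stddev: Math.sqrt(variance) }
end

timings = [0.120, 0.135, 0.110, 0.480, 0.125] # made-up numbers
stats = summarize(timings)
puts format("mean=%.3f min=%.3f max=%.3f stddev=%.3f",
            stats[:mean], stats[:min], stats[:max], stats[:stddev])
```

A standard deviation that is large relative to the mean (as the single 0.480 outlier causes here) is a hint that the run was noisy and should probably be repeated.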

Benchmarking Tools

  • Rails log files (debug level >= Logger::DEBUG)
  • Rails Analyzer Tools (requires logging to syslog)
  • Rails benchmarker script (script/benchmarker)
  • Tools provided by DB vendor
  • Apache Bench (ab or ab2)
  • httperf
  • railsbench
    • downloadable from http://rubyforge.org/projects/railsbench

railsbench

  • Measures raw performance of Rails request processing configured through:
    • benchmark definitions
      • $RAILS_ROOT/config/benchmarks.yml
      • defines which urls you want to visit in yaml
    • benchmark class configuration
      • $RAILS_ROOT/config/benchmarks.rb
      • creates a benchmarking instance with an ActiveRecordStore
      • Can also define user locking etc.
    • stores benchmark data in $RAILS_PERF_DATA
    • indexed by date and benchmark time
    • uses additional Rails environment benchmarking
  • Usage
    • perf_run 100 "-bm=welcome options" [data file]
      • Run 100 iterations of benchmark with given options, print data
    • perf_diff 100 "-bm=all opts" "opts1" "opts2" [file1] [file2]

railsbench options

  • -log[=level]
    • turn on logging (defaults to no logging). Optionally override log level.
  • -nocache
    • turn off rails caching
  • -path
  • -svlPV
    • run test using Ruby Performance Validator
  • patched_gc
    • use patched GC

Ruby Profiling Tools

  • Ruby Profiler
  • Zen Profiler
  • rubyprof
  • Rails profiler script
  • Ruby Performance Validator (commercial, Windows only)
  • All but the last are pretty much useless for Rails performance work.
  • railsbench has builtin support for RPVL:
    • run_urls 100 -svlPV -bm=welcome ...
  • will start RPVL and run the named benchmark with given options

Please send an email to the RPV guys if you think it should have UNIX support

Top Rails Performance Problems

  • Depends on who you ask, but these are my favorites:
    • slow helper methods
    • complicated routes
    • associations
    • retrieving too much from DB
    • slow session storage
  • Judging from my experience, DB performance is usually not a bottleneck.
  • Instantiating ActiveRecord objects is more expensive

Available Session Containers

  • In Memory
    • Fastest but you lose all sessions on server crash/restart. Restricted to 1 app. Doesn’t scale.
  • File System.
    • Easy setup, one file for each session. Scales by using NFS or NAS (beware 10k active sessions!). Slower than
  • Database/ActiveRecordStore
    • Easy setup (comes with Rails distribution). Much slower than
  • Database/SQLSessionStore
    • Uses ARStore
    • More info at http://railsexpress.de/blog/articles/2005/12/19/roll-your-own-sql-session-store
  • memcached
    • Slightly faster than SQLSessionStore. Presumably scales best. Very tunable. Automatic session cleaning. Harder to obtain statistics. setup
  • DrbStore
    • Can be used on platforms where memcached is not available.

Cachable Elements

  • Pages
    • Fastest. Complete pages are stored on the file system. Web server bypasses app for rendering. Scales through NFS or NAS. Problematic if app requires login.
  • Actions
    • Second fastest. Caches the result of invoking actions on controllers. User login id can be used as part of the storage key.
  • Fragments
    • Very useful for caching small fragments (hence the name) of HTML produced during request processing. Can be made user aware.
  • Action caching is just a special case of fragment caching.
  • Several storage containers are available for fragment caching.

Storage Options for Fragment Caching

  • In Memory
    • Very very fast. If your app is running fast enough with 1 app server process, go for it!
  • File System
    • Reasonably fast.
  • DrbStore
  • memcached

ActionController Issues

  • Components
    • I suggest avoiding components. I haven’t found any good use for them yet.
    • Each embedded component will be handled using a fresh request cycle.
    • Can always be replaced by helper methods and partials.
  • Filters
    • If you are using components, make sure you don’t rerun your filters for every request.

ActionView Issues

  • Instance Variables
    • For each request, one controller instance and one view instance will be instantiated.
    • Instance vars created during controller processing will be transferred to the view instance
    • So: avoid creating instance vars you don’t need. (PARAPHRASE, NEED TO FIND SLIDES)

Slow Helper Methods

  • pluralize(n, 'post')
    • Creates a new inflector instance and tries to derive the correct plural. This is expensive.
    • Do pluralize(n, 'post', MISSING_ARG_NEED_TO_FIND_SLIDES) instead
  • link_to and url_for
    • Much more efficient to construct your own urls, but you only need to do it on pages with large numbers of links.

ActiveRecord Issues

  • You can prefetch associated objects using :include
    • Article.find(:all, :include => :author)
  • Use piggy backing for has_one or belongs_to relations.
    • piggy.back :author_name, :from => :author, :attributes => [:name]
    • article = Article.find(:all, :piggy => :author)
    • puts article.author.name

Caching Column Formatting

  • Computationally expensive transformation on AR fields can be cached (in the DB, using memcached, a DRb process)
  • Example: textilize
    • I’ve analyzed an application where 30% CPU was saved by storing the textilized value

Ruby’s Interpreter is Slow

  • no byte code, no JIT, interprets ASTs directly
  • doesn’t perform any code optimization at compile time:
    • method inlining

Complexity of Ruby Language Elements

  • Local Var access: O(1)
  • Instance Var access: expected O(1)
  • Method Call: expected O(1)
    • hash access to determine literal value {"f" => :f}
    • method search
  • Recommendation:
    • don’t add method abstractions needlessly
    • use attr_accessors as external interfaces only
    • use local variables to short circuit repeated hash access
    • Avoid repeated hash access
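
To illustrate the last two recommendations, here is a small sketch that compares repeated hash lookups with a lookup stored in a local variable. The method names and data are made up. No timings are quoted because they depend on the Ruby version and machine, and the difference on a current Ruby may be small.

```ruby
require "benchmark"

# Looks up params["user"] twice.
def summary_repeated(params)
  "#{params["user"]["name"]} has #{params["user"]["posts"]} posts"
end

# Looks up params["user"] once and reuses the local variable.
def summary_local(params)
  user = params["user"]
  "#{user["name"]} has #{user["posts"]} posts"
end

params = { "user" => { "name" => "kev", "posts" => 3 } } # made-up data
iterations = 200_000

Benchmark.bm(16) do |bm|
  bm.report("repeated lookup") { iterations.times { summary_repeated(params) } }
  bm.report("local variable")  { iterations.times { summary_local(params) } }
end
```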

Caching Data in Instance Variables/Class variables

  • see slides for example
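
Until the slides turn up, here is a generic sketch of caching a computed value in an instance variable with `||=`. It is not the example from the talk. The `Article` class and its `expensive_render` method are stand-ins for any costly computation.

```ruby
class Article
  attr_reader :body, :render_count

  def initialize(body)
    @body = body
    @render_count = 0
  end

  # Stand-in for an expensive transformation such as textilize.
  def expensive_render
    @render_count += 1
    body.upcase
  end

  # Computed on first call, then served from the instance variable.
  def rendered_body
    @rendered_body ||= expensive_render
  end
end

article = Article.new("hello")
article.rendered_body
article.rendered_body
puts article.render_count # prints 1
```

Note that `||=` recomputes whenever the cached value is nil or false, so this form only suits results that are always truthy.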

Coding Variable Caching Efficiently

  • see slides for example

Defining Constants vs. Inlining

  • see slides for example

Local Variables are Cheap

  • see slides for example

Be Careful With Regards to Logging

ObjectSpace.each_object

  • see slides for example
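
My notes don’t say what the slide showed, so this is only a reminder of what the call does, not the example from the talk. `ObjectSpace.each_object` visits every live object of the given class, which means its cost grows with the size of the heap. `live_count` is a made-up helper name.

```ruby
# Count live objects of a class. When given a block, each_object returns
# the number of objects it visited.
def live_count(klass)
  ObjectSpace.each_object(klass) { |_obj| }
end

strings = Array.new(1_000) { |i| "string #{i}" } # keep references so they stay alive
puts "live String objects: #{live_count(String)}"
puts "live Class objects:  #{live_count(Class)}"
```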

Ruby’s Memory Management

  • Designed for batch scripts, not long running server apps
  • tries to minimize memory usage
  • simple mark and sweep algorithm
  • uses malloc to manage contiguous blocks of Ruby objects
  • complex datastructures
    • only references to C structs are stored on Ruby heap
    • comprises strings, arrays, hashes, local variables maps, scopes etc
  • eases writing C extensions
  • Current C interface makes it hard to implement generational GC

Why Ruby GC is a problem for Rails

  • ASTs are stored on the Ruby heap and will be processed on each collection
    • usually the biggest part of non garbage for Rails apps
  • Sweep phase depends on size of heap, not size of non garbage
    • can’t increase the heap size above certain limits
  • More heap gets added, if
    • the size of the freelist after collection is < FREE_MIN, a constant defined in gc.c as 4096
    • 200,000 heap slots are a good lower bound for live data for typical Rails applications

Improving GC Performance

  • Control GC from the Rails dispatcher:
    • RailsFCGIHandler.process! nil, 50
      • Will disable Ruby GC and call GC.start after 50 requests have been processed
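
The idea can be sketched in plain Ruby with the standard `GC` module. `GcThrottle` is a hypothetical wrapper written for this post. It is not how `RailsFCGIHandler` is implemented, it only mimics the “collect every N requests” behaviour described above.

```ruby
# Hypothetical wrapper: keeps the GC disabled and forces a collection
# after every `every` handled requests.
class GcThrottle
  attr_reader :collections

  def initialize(every)
    @every = every
    @handled = 0
    @collections = 0
    GC.disable
  end

  # Runs the block (the "request") and returns its result.
  def handle
    result = yield
    @handled += 1
    if (@handled % @every).zero?
      GC.enable
      GC.start   # collect now, between requests
      GC.disable
      @collections += 1
    end
    result
  end
end

throttle = GcThrottle.new(50)
120.times { throttle.handle { "response " * 10 } }
puts throttle.collections # prints 2 (after requests 50 and 100)
GC.enable                 # restore normal behaviour
```

With the GC off, memory grows until the next forced collection, so the request count has to be chosen with the available RAM in mind.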

Patching Ruby’s Garbage Collector

  • Download the latest railsbench package. Patch Ruby using the file rubygc.patch, recompile and reinstall binaries and docs.
  • Tune GC using environment variables
  • RUBY_HEAP_MIN_SLOTS
  • RUBY_HEAP_FREE_MIN
  • RUBY_GC_MALLOC_LIMIT
  • Recommended values are in the slides (sorry)

Compile Time Template Optimization

  • Many helper calls in Erb templates can be evaluated at template compile time.
  • <%= end_form_tag %> ==> </form>
  • It’s a complete waste to do it over and over again on a per request basis.
  • For some calls, we know what the output should be like, even if we don’t have all arguments available.
  • see slides

Rails Template Optimizer

  • Uses Ryan Davis’ ParseTree package and ruby2ruby class
  • Retrieves AST of ActionView render method after initial compilation
  • Transforms AST to simplify AST
  • Optimizes AST into optimized render method

Optimizer Customization and Restrictions

  • see slides
