• Categories
    • Business
    • News
    • Product
    • Tips & Tricks
    • Work Culture
  • onehub.com
  • Sign In
  • Try For Free
  • Categories
    • Business
    • News
    • Product
    • Tips & Tricks
    • Work Culture
  • onehub.com
  • Sign In
  • Try For Free
Home » Uncategorized

Fixing a long standing bug in the Ruby Amazon S3 Library

Brian Moran Posted On February 2, 2012
0


0
Shares
  • Share On Facebook
  • Tweet It

The AWS::S3 library for Ruby has been around since the release of Amazon S3 in 2006; hundreds, if not thousands, of applications use it. Consequently, it is not usually “the suspect” when looking for the cause of intermittent access errors to S3. However, we recently found and fixed an error that has been present in the signature calculation method since the library was first released.

We use S3 as the backing store for Onehub Workspaces, and we do a lot of S3 operations. During routine log monitoring we noticed a slow, but persistent, stream of HTTP 403 (Unauthorized Access) errors from S3. These errors were not frequent enough to cause problems for our customers; applications using S3 should be designed anticipate errors, and retry. Still, we felt that further investigation was warranted.

To manage the logs generated by all of the Onehub services, we use Papertrail. Papertrail allows us to run a real-time search against our production logs, showing us requests to S3 like this:

https://s3.amazonaws.com/<bucket>/<object>?AWSAccessKeyId=<ouraccesskey>&Expires=1328127911&Signature=l74ewTX9hh0s2oiLoIY83V%2BlLuM%3D

The components used to calculate the signature are well documented by Amazon. When a signature fails, S3 will provide the components it attempted to use in the XML returned with the error message. We noticed that the signatures in these errors were different than those that should have been calculated for the provided Expires time. We monkey-patched the #canonical_signature method of AWS::S3::Authentication::Signature to handle a closure.

module AWS
  module S3
    class Authentication
      class Signature
        private
        def encoded_canonical
          digest = OpenSSL::Digest::Digest.new('sha1')
          b64_hmac = [OpenSSL::HMAC.digest(digest, secret_access_key, canonical_string)].pack("m").strip
          if options[:debug_proc]
            options[:debug_proc].call(sprintf("AWS::S3::Authentication::Signature - request %s encoded canonical: %s %s  canonical: [%s]", @request.path, b64_hmac, CGI.escape(b64_hmac), canonical_string))
          end
          url_encode? ? CGI.escape(b64_hmac) : b64_hmac
        end
      end
    end
  end
end

This enabled us to pass in an option to AWS::S3#url_for containing a closure with our debugging method.

options = options.merge({:debug_proc => lambda{|x| logger.warn(x)}})
the_url = AssetStore.url_for(key_name, options)

We put this through testing, and into production, then waited for the next error to appear.

AWS::S3::Authentication::Signature - request /<bucket>/<keyname> encoded canonical: l74ewTX9hh0s2oiLoIY83V+lLuM= l74ewTX9hh0s2oiLoIY83V%2BlLuM%3D canonical: [GET#012#012#0121328127912#012/<bucket>/<keyname>]
https://s3.amazonaws.com/<bucket>/<object>?AWSAccessKeyId=<OURACCESSKEY>&Expires=1328127911&Signature=l74ewTX9hh0s2oiLoIY83V%2BlLuM%3D

From here we could see the error. The Expires time used to calculate the signature was different that the time provided in the URL. The value 1328127911 is in the URL, while 1328127912 was used to calculate the signature!

But why?

It took a bit of digging through the AWS::S3 source, but we found the culprit. When generating these S3 URLs, we pass an expires_in option to AWS::S3#url_for.

# Signature is the abstract super class for the Header and QueryString authentication methods. It does the job
# of computing the canonical_string using the CanonicalString class as well as encoding the canonical string. The subclasses
# parameterize these computations and arrange them in a string form appropriate to how they are used, in one case a http request
# header value, and in the other case key/value query string parameter pairs.
class Signature < String #:nodoc:
  attr_reader :request, :access_key_id, :secret_access_key, :options

  def initialize(request, access_key_id, secret_access_key, options = {})
    super()
    @request, @access_key_id, @secret_access_key = request, access_key_id, secret_access_key
    @options = options
  end

  private

    def canonical_string
      options = {}
      options[:expires] = expires if expires?
      CanonicalString.new(request, options)
    end
    memoized :canonical_string

    def encoded_canonical
      digest   = OpenSSL::Digest::Digest.new('sha1')
      b64_hmac = [OpenSSL::HMAC.digest(digest, secret_access_key, canonical_string)].pack("m").strip
      url_encode? ? CGI.escape(b64_hmac) : b64_hmac
    end

    def url_encode?
      !@options[:url_encode].nil?
    end

    def expires?
      is_a? QueryString
    end

    def date
      request['date'].to_s.strip.empty? ? Time.now : Time.parse(request['date'])
    end
end

# Provides query string authentication by computing the three authorization parameters: AWSAccessKeyId, Expires and Signature.
# More details about the various authentication schemes can be found in the docs for its containing module, Authentication.
class QueryString < Signature #:nodoc:
  constant :DEFAULT_EXPIRY, 300 # 5 minutes
  def initialize(*args)
    super
    options[:url_encode] = true
    self << build
  end

  private

    # Will return one of three values, in the following order of precedence:
    #
    #   1) Seconds since the epoch explicitly passed in the +:expires+ option
    #   2) The current time in seconds since the epoch plus the number of seconds passed in
    #      the +:expires_in+ option
    #   3) The current time in seconds since the epoch plus the default number of seconds (60 seconds)
    def expires
      return options[:expires] if options[:expires]
      date.to_i + expires_in
    end

    def expires_in
      options.has_key?(:expires_in) ? Integer(options[:expires_in]) : DEFAULT_EXPIRY
    end

    # Keep in alphabetical order
    def build
      "AWSAccessKeyId=#{access_key_id}&Expires=#{expires}&Signature=#{encoded_canonical}"
    end
end

The #initialize method is the entry point, but most of the work is done by #build. The bug was immediately apparent once we looked at #expires. Because #build calls #expires, and then #encoded_canonical calls it later, the date used can change. The #date method uses Time.now, if these calls happened on different seconds, they would result in different values. The solution is to memoize the time; it could be done in #expires or #date.

def expires
  return options[:expires] if options[:expires]
  @expires ||= date.to_i + expires_in
end

Interestingly, this error is only possible if the expires_in option is used. We suspect most people either use the library’s DEFAULT_EXPIRY or pass in an expires option, both of which cause #expires to avoid the call to #date.

After a bit of testing we put this code into production and have eliminated these errors, resulting in better performance for our customers. We have also submitted a pull request to the library maintainer.

0
Shares
  • Share On Facebook
  • Tweet It




Author

Brian Moran


Read Next

Introducing Advanced Permissions


Onehub. All rights reserved.
Press enter/return to begin your search