A software engineer focused on writing expressive code for humans (and computers).

September 7, 2020

Validating, normalizing, and formatting phone numbers in Rails

Oftentimes in applications, the phone column is stored as a string (or an integer 😱) in the database and is a mess at the whim of the users. There’s occasionally a presence validation thrown in there for good measure, or maybe even a format check for 10 digits, but these can fall short. Hyphens and parentheses are conventions for human readability; that is, they belong to formatted values. At the database level, we want valid and uniform data where practical.
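To make “these can fall short” concrete, here is a minimal sketch of a 10-digit format check in plain Ruby. The TEN_DIGITS constant and the ten_digits? helper are hypothetical names for illustration, not part of Rails or any gem.

```ruby
# A naive "exactly 10 digits" check, similar in spirit to a format
# validation on a model. TEN_DIGITS and ten_digits? are hypothetical names.
TEN_DIGITS = /\A\d{10}\z/

def ten_digits?(value)
  TEN_DIGITS.match?(value.to_s)
end

ten_digits?("4155552671")       #=> true
ten_digits?("(415) 555-2671")   #=> false (same number, human formatted)
ten_digits?("+44 20 7183 8750") #=> false (a UK number in international format)
ten_digits?("0000000000")       #=> true  (ten digits, yet US area codes cannot start with 0)
```

The check rejects numbers a user would reasonably type and accepts one that cannot be dialed, which is the gap a dedicated library is meant to close.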

Is it valid?

If a phone number is invalid, then there is no need to normalize… it’s useless. But how do we determine validity? Phone numbers are tricky to get right, especially once you need to support international numbers. So a library or API proves essential for covering most cases. Maybe this is overkill, but if usable phone numbers are necessary for your application, then they’re (maybe?) worth getting right.

One name rises to the top when digging into phone number validation: libphonenumber by Google. This library has many derivatives in many languages and is referenced and ported freely. Even Twilio, the developer platform for communications, uses it for their phone Lookup API:

Twilio will use libphonenumber, Google’s open source phone number handling library, to properly format possible phone numbers for a given region, using length and prefix.

libphonenumber is for C++, Java, and JavaScript… what about Ruby?

Yeah, that’s a problem. Sadly we can’t use the library directly, but there are several ports available. libphonenumber directly references the telephone_number gem, but I typically reach for another called Phonelib. It has a lightweight interface that works great for utility purposes. Here’s what it looks like:

Phonelib.valid?("123456789")
#=> false

And it includes a phone validator for ActiveRecord models:

validates :attribute, phone: true

Boom, easy, done.

The phone number is valid… now what?

Great! Now we need to store it. But have you considered the variety of phone number formats: dot vs space vs hyphen separated, parens vs no-parens, plus all the international variants? We could use a JS mask and enforce it with a format regex, but not all numbers should be formatted the same way.

Enter stage left: “normalization,” the process of making something “normal”…

normal: conforming to a type, standard, or regular pattern
– Merriam-Webster Dictionary

…and based on that definition, it sounds a lot like something we’d want in our applications.

But how do we define normal?

Luckily there’s a pretty good answer to this: E.164. This is a spec for formatting phone numbers that ensures each device has a “globally unique number.” That sounds like exactly the kind of thing we should be storing. In fact, this is the recommendation by Twilio:

If you plan to store the phone number after validating, we recommend storing the full E.164 formatted number[.]

The E.164 format is most easily identified by a “+” prefixing a number. At a high level, the structure is “+”, then country code, then subscriber number. Here’s an example: “+14155552671”.
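As a rough illustration of that structure, the sketch below checks only the shape of an E.164 string: a “+”, a non-zero leading digit, and no more than 15 digits in total. E164_SHAPE and e164_shaped? are hypothetical names, and passing this check does not mean the number is valid or assigned.

```ruby
# Shape check only: "+", a non-zero first digit, at most 15 digits overall.
# E164_SHAPE and e164_shaped? are hypothetical names for illustration.
E164_SHAPE = /\A\+[1-9]\d{1,14}\z/

def e164_shaped?(value)
  E164_SHAPE.match?(value.to_s)
end

e164_shaped?("+14155552671")      #=> true
e164_shaped?("14155552671")       #=> false (missing "+")
e164_shaped?("(415) 555-2671")    #=> false (formatted for humans)
e164_shaped?("+1234567890123456") #=> false (16 digits)
```

A value with an extension appended would not match either, since extensions sit outside the E.164 number itself.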

But how do we convert something like “(415) 555-2671” into its proper E.164 format for storage? Again, Phonelib can help:

Phonelib.parse("(415) 555-2671", "US").full_e164
#=> "+14155552671"

But isn’t a phone number just a number?

In Google’s Falsehoods Programmers Believe About Phone Numbers (definitely worth a read through), they specifically address this in #25 Phone numbers are numbers:

Never try to store phone numbers as an int or any other kind of numeric data type. You can’t do arithmetic on them[.]

So there you have it. You can’t do math on phone numbers, so a string is probably a better candidate.
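Beyond arithmetic, an integer column also throws information away. Here is a small plain-Ruby sketch of what survives a round trip through an integer; round_trip_as_integer is a hypothetical helper used only for this demonstration.

```ruby
# Simulates storing a phone value in an integer column and reading it back.
# round_trip_as_integer is a hypothetical helper for illustration.
def round_trip_as_integer(phone)
  phone.to_i.to_s
end

round_trip_as_integer("+14155552671")     #=> "14155552671" (the "+" is gone)
round_trip_as_integer("0612345678")       #=> "612345678"   (the leading zero is gone)
round_trip_as_integer("+14155552671;123") #=> "14155552671" (the extension is gone)
round_trip_as_integer("(415) 555-2671")   #=> "0"           (nothing parseable)
```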

But now the phone number is not human-friendly 🙄

This is true. Normalized data is great for computers, but not so much for the users. We should now circle back around and offer a way to display human-friendly phone numbers. Again, a technically difficult problem, as conventions for formatted phone numbers vary by country.

According to Google’s style guide for phone numbers, there are two main phone number formats to handle: North American (NANP) and international. So for the USA, Canada, and other NANP countries (indicated by country code “1”), we use the national format: “(415) 555-2671”, but elsewhere the international format: “+44 20 7183 8750”. Fortunately, these conventions are both supported by Phonelib:

Phonelib.parse("+14155552671").full_national # US
#=> "(415) 555-2671"
Phonelib.parse("+442071838750").full_international # UK
#=> "+44 20 7183 8750"

And now for adding a validated, normalized, formatted phone to an ActiveRecord model in Rails, phew 😅

Add the Phonelib gem

# Gemfile
gem "phonelib"

Add some global config to avoid variances in usage

# config/initializers/phonelib.rb
Phonelib.default_country = "US"
Phonelib.extension_separate_symbols = ["x", ";"]

  • default_country is used when parsing a phone number without a country prefix.
  • extension_separate_symbols defines which characters indicate a phone number extension. The default is “;” but we also want to support “x”, “ex”, and “ext”… all covered by specifying “x”.

Validate the phone number to prevent bad data

The Phonelib library adds a PhoneValidator class, activated by phone: true. Depending on whether the field is required or not, you can also add an allow_blank flag.

# app/models/business.rb
class Business < ApplicationRecord
  validates :phone, phone: true, allow_blank: true
end

NOTE: phone: true refers to the PhoneValidator, not the phone attribute on the model.

Normalize the phone number before persisting

We want to normalize the record just before persisting to the database. Callbacks might be out of vogue, but they prove the point here in plain speak, “before save normalize phone.”

class Business < ApplicationRecord
  before_save :normalize_phone
  # ...
  private

  def normalize_phone
    self.phone = Phonelib.parse(phone).full_e164.presence
  end
end

NOTE: The #presence call at the end is to handle an empty phone number. For a blank phone, #full_e164 returns an empty string, and it is preferable to store that in the database as NULL rather than as "".
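For readers less familiar with #presence, here is a plain-Ruby approximation of what it does for strings, so it can run outside Rails. The blank_to_nil helper is a hypothetical stand-in, not ActiveSupport’s implementation.

```ruby
# Approximates ActiveSupport's #presence for strings: blank values become nil,
# anything else is returned unchanged. blank_to_nil is a hypothetical name.
def blank_to_nil(value)
  (value.nil? || value.strip.empty?) ? nil : value
end

blank_to_nil("")             #=> nil
blank_to_nil("   ")          #=> nil
blank_to_nil("+14155552671") #=> "+14155552671"
```

Storing nil gives “no phone” a single representation in the database, instead of a mix of NULL and empty strings.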

Format the phone number for display

The #phone method returns the E.164 representation of the phone number, so let’s add a second #formatted_phone method as a display-friendly variation.

class Business < ApplicationRecord
  # ...
  def formatted_phone
    parsed_phone = Phonelib.parse(phone)
    return phone if parsed_phone.invalid?

    formatted =
      if parsed_phone.country_code == "1" # NANP
        parsed_phone.full_national # (415) 555-2671;123
      else
        parsed_phone.full_international # +44 20 7183 8750
      end
    formatted.gsub!(";", " x") # (415) 555-2671 x123
    formatted
  end
  # ...
end

Firstly, we have to consider that the phone attribute might not be valid. For instance, during form validation the user could enter a garbage value. So instead of displaying nothing on re-render, we actually want to present the user with exactly what they entered along with an error message. That’s what the guard is determining: if invalid, then show phone unmodified, else proceed with formatting.

For formatting, we want NANP numbers to use the “national” phone number format, but all others the “international” format. We can make this distinction by looking for a “1” country code. Now let’s zoom in to what’s going on in the national formatting case:

parsed_phone = Phonelib.parse("+14155552671;123")
#=> #<Phonelib::Phone>
parsed_phone.full_national
#=> "(415) 555-2671;123"
parsed_phone.full_national.gsub(";", " x")
#=> "(415) 555-2671 x123"

We first parse the phone number to get back a Phonelib::Phone instance. Then, #full_national returns the national formatted phone number, but with the extension separated by a semicolon… not very pretty. So we use #gsub to swap that out for something that looks a bit nicer. NOTE: the extension separator needs to match up with the list specified in the initializer.
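The NANP branch in #formatted_phone keys off the country code. Since no other country code begins with “1”, the same decision can be sketched directly against a stored E.164 string. This assumes the value is already normalized; nanp? and display_style are hypothetical helpers and not part of Phonelib.

```ruby
# Decides which display convention applies to a normalized E.164 value.
# nanp? and display_style are hypothetical helpers for illustration.
def nanp?(e164)
  e164.to_s.start_with?("+1")
end

def display_style(e164)
  nanp?(e164) ? :national : :international
end

display_style("+14155552671")  #=> :national
display_style("+442071838750") #=> :international
```

In the model itself, comparing parsed_phone.country_code is still the safer route, because it works from the parsed number rather than trusting the stored string.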

Full Example

class Business < ApplicationRecord
  before_save :normalize_phone

  validates :phone, phone: true, allow_blank: true

  def formatted_phone
    parsed_phone = Phonelib.parse(phone)
    return phone if parsed_phone.invalid?

    formatted =
      if parsed_phone.country_code == "1"
        parsed_phone.full_national
      else
        parsed_phone.full_international
      end
    formatted.gsub!(";", " x")
    formatted
  end

  private

  def normalize_phone
    self.phone = Phonelib.parse(phone).full_e164.presence
  end
end

In Summary

Leaning on libphonenumber is an industry-recognized option, but it doesn’t guarantee 100% confidence. Still, thinking about phone numbers in the separate terms of validation, normalization, and formatting is a valuable exercise that can (hopefully) lead to better phone number handling as needed.