Validating, normalizing, and formatting phone numbers in Rails
Often in applications, the phone column is stored as a string (or an integer 😱) in the database and is a mess at the whim of the users. There’s occasionally a presence validation thrown in for good measure, or maybe even a format check for 10 digits, but these can fall short. Hyphens and parentheses are conventions for human readability, that is, for formatted values. But at the database level, we want valid and uniform data where practical.
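To make “these can fall short” concrete, here is a minimal plain-Ruby sketch of a 10-digit format check. The `TEN_DIGITS` constant and `ten_digits?` helper are hypothetical names used only for illustration.

```ruby
# A naive format check: exactly ten digits and nothing else.
# TEN_DIGITS and ten_digits? are hypothetical, not part of any library.
TEN_DIGITS = /\A\d{10}\z/

def ten_digits?(value)
  TEN_DIGITS.match?(value)
end

ten_digits?("4155552671")     # passes: ten bare digits
ten_digits?("(415) 555-2671") # fails: the same number with human formatting
ten_digits?("+442071838750")  # fails: an international number
ten_digits?("0000000000")     # passes, although NANP area codes never start with 0
```

The check rejects perfectly good input and accepts at least one number nobody can dial, which is the gap a dedicated library is meant to close.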
Is it valid?
If a phone number is invalid, there is no need to normalize it… it’s useless. But how do we determine validity? Phone numbers are tricky to get right, especially once you need to support international numbers. So a library or API proves essential for covering most cases. Maybe this is overkill, but if usable phone numbers are necessary for your application, then they’re (maybe?) worth getting right.
One name rises to the top when digging into phone number validation: libphonenumber by Google. This library has many derivatives in many languages and is referenced and ported freely. Even Twilio, the developer platform for communications, uses it for their phone Lookup API:
Twilio will use libphonenumber, Google’s open source phone number handling library, to properly format possible phone numbers for a given region, using length and prefix.
libphonenumber is for C++, Java, and JavaScript… what about Ruby?
Yeah, that’s a problem. Sadly we can’t use the library directly, but there are several ports available. libphonenumber directly references the telephone_number gem, but I typically reach for another called Phonelib. It has a lightweight interface that works great for utility purposes. Here’s what it looks like:
Phonelib.valid?("123456789")
#=> false
And it includes a phone validator for ActiveRecord models:
validates :attribute, phone: true
Boom, easy, done.
The phone number is valid… now what?
Great! Now we need to store it. But have you considered the variety of phone number formats: dot vs space vs hyphen separated, parens vs no-parens, plus all the international variants? We could use a JS mask and enforce it with a format regex, but not all numbers should be formatted the same way.
Enter stage left: “normalization,” the process of making something “normal”…
normal: conforming to a type, standard, or regular pattern
- Merriam-Webster Dictionary
…and based on that definition, it sounds a lot like something we’d want in our applications.
But how do we define normal?
Luckily there’s a pretty good answer to this: E.164. This is a spec for formatting phone numbers that ensures each device has a “globally unique number.” That sounds like exactly the kind of thing we should be storing. In fact, this is the recommendation by Twilio:
If you plan to store the phone number after validating, we recommend storing the full E.164 formatted number[.]
The E.164 format is most easily identified by a “+” prefixing a number. At a high level, the structure is “+”, then country code, then subscriber number. Here’s an example: “+14155552671”.
But how do we convert something like “(415) 555-2671” into its proper E.164 format for storage? Again, Phonelib can help:
Phonelib.parse("(415) 555-2671", "US").full_e164
#=> "+14155552671"
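The “+”, country code, subscriber number structure can also be approximated with a purely structural check. This is a sketch that assumes only the E.164 limit of 15 digits. `E164_SHAPE` and `e164_shaped?` are hypothetical names, and matching the pattern says nothing about whether the number actually exists.

```ruby
# Structural check only: a "+", a non-zero first digit, and at most 15 digits.
# E164_SHAPE and e164_shaped? are hypothetical helpers for illustration.
E164_SHAPE = /\A\+[1-9]\d{1,14}\z/

def e164_shaped?(value)
  E164_SHAPE.match?(value)
end

e164_shaped?("+14155552671")   # matches the E.164 shape
e164_shaped?("(415) 555-2671") # does not: human formatting, no "+"
e164_shaped?("4155552671")     # does not: missing "+" and country code
```

Something like this can serve as a cheap sanity check in tests, but it is no substitute for real validation with a library.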
But isn’t a phone number just a number?
In Google’s Falsehoods Programmers Believe About Phone Numbers (definitely worth a read through), they specifically address this in #25 Phone numbers are numbers:
Never try to store phone numbers as an int or any other kind of numeric data type. You can’t do arithmetic on them[.]
So there you have it. You can’t do math on phone numbers, so a string is probably a better candidate.
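Beyond arithmetic, numeric storage also loses information. The plain-Ruby sketch below round-trips two strings through an Integer. The `round_trip` helper is a hypothetical name used only for this demonstration.

```ruby
# Round-tripping through an Integer silently drops characters that matter.
# round_trip is a hypothetical helper, used only for this demonstration.
def round_trip(value)
  value.to_i.to_s
end

round_trip("02071838750")  # => "2071838750"  (leading zero lost)
round_trip("+14155552671") # => "14155552671" (plus sign lost)
```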
But now the phone number is not human-friendly
This is true. Normalized data is great for computers, but not so much for the users. We should now circle back around and offer a way to display human-friendly phone numbers. Again, a technically difficult problem, as conventions for formatted phone numbers vary by country.
According to Google’s style guide for phone numbers, there are two main phone number formats to handle: North American (NANP) and international. So for USA, Canada, and other NANP countries (indicated by country code “1”), we use the national format: “(415) 555-2671”, but elsewhere the international format: “+44 20 7183 8750”. Fortunately, these conventions are both supported by Phonelib:
Phonelib.parse("+14155552671").full_national # US
#=> "(415) 555-2671"
Phonelib.parse("+442071838750").full_international # UK
#=> "+44 20 7183 8750"
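The NANP-versus-international decision can be sketched against a stored E.164 string without any library, since “1” is the NANP country code. The `display_style` helper is a hypothetical name, and it assumes its input is already in E.164 form.

```ruby
# Choose a display style from an E.164 string.
# display_style is a hypothetical helper; it assumes E.164 input.
def display_style(e164)
  e164.start_with?("+1") ? :national : :international
end

display_style("+14155552671")  # => :national
display_style("+442071838750") # => :international
```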
And now for adding a validated, normalized, formatted phone to an ActiveRecord model in Rails, phew
Add the Phonelib gem
# Gemfile
gem "phonelib"
Add some global config to avoid variances in usage
# config/initializers/phonelib.rb
Phonelib.default_country = "US"
Phonelib.extension_separate_symbols = ["x", ";"]
default_country is used when parsing a phone number without a country prefix.
extension_separate_symbols defines which characters indicate a phone number extension. The default is “;” but we also want to support “x”, “ex”, and “ext”… all covered by specifying “x”.
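Why would “x” alone cover “ex” and “ext”? The sketch below illustrates the idea in plain Ruby: split on the first separator, then keep only the digits on each side. It is an assumption-laden illustration, not Phonelib’s implementation, and `SEPARATORS` and `split_extension` are hypothetical names.

```ruby
# Illustration only, NOT Phonelib's implementation: split the input on the
# first separator, then keep just the digits on each side of it.
SEPARATORS = ["x", ";"].freeze # mirrors the initializer above

def split_extension(input)
  number, extension = input.split(Regexp.union(SEPARATORS), 2)
  [number.to_s.gsub(/\D/, ""), extension.to_s.gsub(/\D/, "")]
end

split_extension("415-555-2671 x123")    # => ["4155552671", "123"]
split_extension("415-555-2671 ext 123") # => ["4155552671", "123"]
split_extension("415-555-2671;123")     # => ["4155552671", "123"]
split_extension("415-555-2671")         # => ["4155552671", ""]
```

Because the leftover “e” and “t” are not digits, they fall away once the split has happened on “x”.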
Validate the phone number to prevent bad data
The Phonelib library adds a PhoneValidator class, activated by phone: true. Depending on whether the field is required or not, you can also add an allow_blank flag.
# app/models/business.rb
class Business < ApplicationRecord
  validates :phone, phone: true, allow_blank: true
end
NOTE: phone: true refers to the PhoneValidator, not the phone attribute on the model.
Normalize the phone number before persisting
We want to normalize the record just before persisting to the database. Callbacks might be out of vogue, but they prove the point here in plain speak, “before save normalize phone.”
class Business < ApplicationRecord
  before_save :normalize_phone
  # ...
  private
  def normalize_phone
    self.phone = Phonelib.parse(phone).full_e164.presence
  end
end
NOTE: The #presence call at the end is to handle an empty phone number. In that case the #full_e164 method returns an empty string, but it would be preferable to store that in the database as NULL, so #presence converts it to nil.
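For readers outside Rails, the effect of #presence can be approximated in plain Ruby. This is a stand-in sketch, not ActiveSupport’s code, and `presence_of` is a hypothetical name.

```ruby
# Plain-Ruby approximation of ActiveSupport's #presence for strings:
# nil or blank values become nil, anything else is returned unchanged.
# presence_of is a hypothetical stand-in, not the ActiveSupport method.
def presence_of(value)
  (value.nil? || value.strip.empty?) ? nil : value
end

presence_of("")             # => nil
presence_of("+14155552671") # => "+14155552671"
```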
Format the phone number for display
The #phone method returns the E.164 representation of the phone number, so let’s add a second #formatted_phone method as a display-friendly variation.
class Business < ApplicationRecord
  # ...
  def formatted_phone
    parsed_phone = Phonelib.parse(phone)
    return phone if parsed_phone.invalid?
    formatted =
      if parsed_phone.country_code == "1" # NANP
        parsed_phone.full_national # (415) 555-2671;123
      else
        parsed_phone.full_international # +44 20 7183 8750
      end
    formatted.gsub!(";", " x") # (415) 555-2671 x123
    formatted
  end
  # ...
end
Firstly, we have to consider that the phone attribute might not be valid. For instance, during form validation the user could enter a garbage value. So instead of displaying nothing on re-render, we actually want to present the user with exactly what they entered along with an error message. That’s what the guard clause does: if the number is invalid, show phone unmodified; otherwise proceed with formatting.
For formatting, we want NANP numbers to use the “national” phone number format, but all others the “international” format. We can make this distinction by looking for a “1” country code. Now let’s zoom in to what’s going on in the national formatting case:
parsed_phone = Phonelib.parse("+14155552671;123")
#=> #<Phonelib::Phone>
parsed_phone.full_national
#=> "(415) 555-2671;123"
parsed_phone.full_national.gsub(";", " x")
#=> "(415) 555-2671 x123"
We first parse the phone number to get back a Phonelib::Phone instance. Then, #full_national returns the national formatted phone number, but with the extension separated by a semicolon… not very pretty. So we use #gsub to swap that out for something that looks a bit nicer. NOTE: the extension separator needs to match up with the list specified in the initializer.
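One detail worth spelling out is why #formatted_phone calls #gsub! and then returns formatted on its own line. In plain Ruby, String#gsub! returns nil when nothing was replaced, so returning its result directly would break for numbers without an extension. The `humanize_extension` helper below is a hypothetical name for illustration.

```ruby
# String#gsub always returns a string; String#gsub! returns nil when it
# made no substitution. humanize_extension is a hypothetical helper.
def humanize_extension(formatted)
  formatted.gsub(";", " x")
end

humanize_extension("(415) 555-2671;123") # => "(415) 555-2671 x123"
humanize_extension("(415) 555-2671")     # => "(415) 555-2671"

"(415) 555-2671".dup.gsub!(";", " x")    # => nil (no substitution was made)
```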
Full Example
class Business < ApplicationRecord
  before_save :normalize_phone
  validates :phone, phone: true, allow_blank: true
  def formatted_phone
    parsed_phone = Phonelib.parse(phone)
    return phone if parsed_phone.invalid?
    formatted =
      if parsed_phone.country_code == "1"
        parsed_phone.full_national
      else
        parsed_phone.full_international
      end
    formatted.gsub!(";", " x")
    formatted
  end
  private
  def normalize_phone
    self.phone = Phonelib.parse(phone).full_e164.presence
  end
end
In Summary
Leaning on libphonenumber is a good, industry-recognized option, but it doesn’t guarantee 100% confidence. Still, thinking about phone numbers in the separate terms of validation, normalization, and formatting is a valuable exercise that can (hopefully) lead to better phone number handling as needed.