Copyright 2008 – 2023 GetYourGuide. Made in Zurich & Berlin.
Constantin Șerban-Rădoi, backend engineer, walks us through the fascinating art of scaling up booking reference numbers. As we’re undergoing significant growth here at GetYourGuide, Constantin explains how the FinTech team succeeded in creating more booking code combinations. The team was challenged with keeping codes short and easy for customers to read over the phone, and with making sure the new module didn’t accidentally generate any bad words.
At GetYourGuide, we have many moving parts that keep the platform running smoothly. One of the most important cogs in the machine is booking reference numbers. Serving as an intermediary between travelers and our partners, such as tour companies, requires us to have a powerful system for keeping track of the activities sold on our platform.
From the customer's perspective, the booking reference number is the single identifier for the activity they have purchased on GetYourGuide. This identifier is required by the Customer Service team so they can look up the booking in our internal systems.
For our partners, the booking reference number is the magic link between an activity they offered to a customer and the payment that they will receive from GetYourGuide for that booking.
In the early days of GetYourGuide, the booking reference number looked like GYGxxxxxxxx, where xxxxxxxx was an 8-digit number. This served us well until this summer, when it became clear that we would soon need more than 8 digits to refer to newer bookings. We needed a new way of generating identifiers that would scale for many years to come while also being backward-compatible.
Given the problem at hand, we had to make sure that we did not break the existing functionality, and that we could still look up bookings that had been completed years ago. The simplest solution for this problem was to separate the implementation into two parts: the existing handling of the old numeric reference numbers, and a new algorithm for generating new ones.
The first part of the solution was easy because it was already implemented. The interesting part was the new algorithm, which had to satisfy several constraints:
First, we wanted to keep the reference number as short as possible. This ensures that our customers can easily read and reproduce it over the phone or on the web when they need assistance from our customer service. Second, we did not want to change the algorithm every time we exceeded another digit, e.g. 10 digits, then 11, and so on.
Given these constraints, we decided to use a larger set of symbols made up of letters and digits. In theory, this meant we could increase the number of symbols from 10 to a maximum of 62 if we used all the digits and all the letters of the English alphabet in both upper and lower case. However, when read from a paper or on-screen voucher and transmitted over the phone, some of these characters are ambiguous. Typical examples are the digit 0 and the letter O, or the digit 1 and the letters I and l.
Because mixing lower and upper case characters would cause confusion, we decided to use only upper case characters. Additionally, we eliminated most vowels from the character set to avoid unintentionally forming bad words, and we removed some further characters that sound alike and could cause confusion when spelled out over the phone, like ‘C’, ‘D’, ‘S’ and ‘T’.
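To make the idea concrete, here is a minimal sketch of encoding a number with a reduced character set. The article does not list the actual characters, so the 22-symbol `ALPHABET` below is a hypothetical choice (digits plus a few upper case consonants), and the function names `encode` and `decode` are illustrative, not the team's actual module.

```python
# Hypothetical 22-symbol alphabet: digits plus a handful of upper case
# consonants. The real character set is not given in this article.
ALPHABET = "0123456789FGHJKLMPRWXY"
BASE = len(ALPHABET)
CODE_LENGTH = 9  # the article mentions reference numbers of 9 characters


def encode(number: int, length: int = CODE_LENGTH) -> str:
    """Encode a non-negative integer as a fixed-length string over ALPHABET."""
    if number < 0 or number >= BASE ** length:
        raise ValueError("number does not fit in the requested length")
    symbols = []
    for _ in range(length):
        # Peel off the least significant "digit" in base BASE.
        number, remainder = divmod(number, BASE)
        symbols.append(ALPHABET[remainder])
    # Digits were collected least significant first, so reverse them.
    return "".join(reversed(symbols))


def decode(code: str) -> int:
    """Inverse of encode; raises ValueError on a symbol outside ALPHABET."""
    number = 0
    for symbol in code:
        number = number * BASE + ALPHABET.index(symbol)
    return number
```

Because the alphabet contains no vowels in this sketch, no encoded value can spell an ordinary English word, which is the property the vowel removal is meant to provide.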
This meant that with reference numbers of just 9 characters we could encode up to about 10^12 booking references. By going from 8 to 9 characters, we can refer to about 1000 times more bookings than 9 plain digits (0-9) would allow.
The old 8-digit format offered a total of up to 99,999,999 reference numbers. With the new algorithm, the reference numbers look like this:
This variant allows us to encode up to 10^12 (1,000,000,000,000) booking references, 1000 times more than using just digits. Below is a table with the various lengths and number of symbols considered.
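The capacity of a format is simply the number of symbols raised to the power of the length, so figures like these are easy to recompute. The sketch below is illustrative only: 10 and 62 are the symbol counts mentioned in the text, 36 corresponds to digits plus upper case letters, and 22 is an assumed size chosen because 22^9 is roughly 1.2 × 10^12, consistent with the "about 10^12" figure. The function name `capacity` is hypothetical.

```python
def capacity(symbols: int, length: int) -> int:
    """Number of distinct codes of the given length over `symbols` characters."""
    return symbols ** length


# Print one row per alphabet size, one column per code length.
lengths = (8, 9, 10)
print("symbols", *[f"length {n}" for n in lengths], sep="\t")
for symbols in (10, 22, 36, 62):
    row = [f"{capacity(symbols, n):,}" for n in lengths]
    print(symbols, *row, sep="\t")
```

With the assumed 22 symbols, 9 characters give 1,207,269,217,792 combinations, compared with 1,000,000,000 for 9 plain digits.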
While this change gave us room for 1000x growth in terms of additional booking reference numbers, it also meant that we had to make sure we could properly handle the new encoding elsewhere in our system.
In the early days of GetYourGuide, the assumption was that we would always use numeric values to represent booking references. As a result, some parts of our codebase took shortcuts and were only able to handle integer types for these values.
Thus, we needed to make sure that every place where we create or read booking references, whether from external or internal sources, and every intermediary step that uses them could handle both the old numeric form and the new alphanumeric form.
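One way to handle both forms is to treat every booking reference as a string and classify it at the boundary, instead of casting it to an integer. The following is a sketch under assumptions: the legacy pattern follows the GYGxxxxxxxx format described earlier, while the new pattern (a GYG prefix followed by 9 characters from a hypothetical alphabet) and the names `BookingReference` and `parse_reference` are invented for illustration.

```python
import re
from dataclasses import dataclass

# Legacy format from the article: "GYG" followed by exactly 8 digits.
LEGACY_PATTERN = re.compile(r"GYG(\d{8})")
# Hypothetical new format: "GYG" plus 9 characters from an assumed alphabet.
NEW_PATTERN = re.compile(r"GYG([0-9FGHJKLMPRWXY]{9})")


@dataclass(frozen=True)
class BookingReference:
    value: str       # always kept as a string, never converted to an int
    is_legacy: bool  # True for the old all-numeric form


def parse_reference(raw: str) -> BookingReference:
    """Accept either form; raise ValueError for anything else."""
    candidate = raw.strip().upper()
    if LEGACY_PATTERN.fullmatch(candidate):
        return BookingReference(candidate, is_legacy=True)
    if NEW_PATTERN.fullmatch(candidate):
        return BookingReference(candidate, is_legacy=False)
    raise ValueError(f"not a valid booking reference: {raw!r}")
```

Using `fullmatch` keeps the two forms distinguishable by length: 8 digits is always legacy, and 9 characters is always the new form, even when all 9 happen to be digits.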
As one would imagine, testing such substantial changes was very important, since we did not want to risk breaking any functionality regarding the bookings of our customers.
The introduction of unit tests for the new module that generates booking references was a big help. Additionally, the existing end-to-end tests for the customer booking flow made it easier as well.
Furthermore, we did additional manual tests on our testing clusters, which are a small-scale representation of the production system where we could see how the system would behave once everything was released.
This additional step is critical. For changes that affect all systems, there can always be something that one overlooks when writing unit and functional/end-to-end tests.
Having production data at hand is also very helpful for discovering any previously unidentified issues. Finally, we released everything behind a feature flag so that we could easily roll back in case anything popped up. This entire process, although quite elaborate, was the right choice for us to ensure a smooth transition from the old to the new booking reference number system.
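As a rough illustration of the feature-flag approach, the sketch below switches only the generation of new references on a flag. Everything here is an assumption made for the example: the names `legacy_reference` and `make_generator`, the way the flag is read, and the stand-in for the new format. Only the legacy GYGxxxxxxxx shape comes from the article.

```python
from typing import Callable

LEGACY_LIMIT = 10 ** 8  # the legacy format has room for 8 digits


def legacy_reference(sequence_number: int) -> str:
    """Old format from the article: "GYG" followed by an 8-digit number."""
    if not 0 <= sequence_number < LEGACY_LIMIT:
        raise ValueError("legacy format cannot represent this number")
    return f"GYG{sequence_number:08d}"


def make_generator(
    new_reference: Callable[[int], str],
    flag_enabled: Callable[[], bool],
) -> Callable[[int], str]:
    """Build a generator that checks the flag on every call."""

    def generate(sequence_number: int) -> str:
        if flag_enabled():
            return new_reference(sequence_number)
        return legacy_reference(sequence_number)

    return generate


# Usage with a stand-in for the new format and an in-memory flag store.
flags = {"new_booking_references": False}
generate = make_generator(
    new_reference=lambda n: f"NEW-{n}",  # placeholder, not the real format
    flag_enabled=lambda: flags["new_booking_references"],
)
```

Note that in a setup like this the flag should gate only generation. Code that reads references has to keep accepting the new form after a rollback, because references issued while the flag was on remain valid.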