Search Algorithm Experiments
The search experience offered some of our most fertile ground for A/B experimentation. It was also a feature we hadn't been able to test on the old site, and one of the main reasons we chose Optimizely Server Side as our testing platform. We knew search was one of the most pivotal steps in the funnel and a complex experience full of possibilities.
It was also one of the most highly visible features on the website. Many internal users relied on search in their daily work, particularly the Guest Experience (CX), Homeowner Experience, and Sales teams. In addition, homeowners themselves would often use search to check on their homes. This could make changes contentious, and nothing was more contentious than the default sort order. Obviously, every homeowner wanted to be at the top of the results, which just wasn't possible. So we had to justify our default sort order by showing that it generated the most rental value for users and for homeowners as a whole.
While we tested all kinds of UI elements in search, I want to focus on two here: default sort algorithm and map display.
Basic Sort Testing
When we first launched the search experience on the new website, we had a number of sort options built in: Price, Capacity, Bedrooms, Distance, and Reviews (more would be added later). To begin with, we chose Distance as the default sort, as it seemed the most objective and the closest match to users' expectations.
Our plan was to eventually implement a multivariate algorithm that incorporated several factors, but in order to start learning quickly we ran a series of simple sort order tests.
Distance vs. Capacity - our exec team was unhappy with the default distance sort, and for good reason: in many of TurnKey's markets, our larger, more lucrative homes were outside the city center, while smaller, less profitable units (apartments and condos) tended to be concentrated in the urban core. A search for Nashville, for instance, would show a page full of condos before we ever got to a 5-bedroom house. Exec pushed us to rank homes by capacity (the number of people permitted to stay in a rental), highest first, instead. They turned out to be right: while overall conversion rate declined with the capacity sort, overall revenue increased. That made sense: we were pushing users to more expensive homes first, which turned some off, but the higher prices made up for it. Capacity became our new default sort.
Capacity vs. Reviews - the most interesting result of the basic sort testing series was when we tested Capacity vs. Reviews. Review sort showed homes that had the highest number of good reviews (in other words, the top results were 5-star homes with the most 5-star reviews). The results surprised us: this test hit statistical significance quickly and decisively, with review sort showing a +15% increase in conversion rate and revenue over capacity. We’d found a big win!
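To make the mechanics of these early tests concrete, here is a minimal sketch of what the basic comparators might look like; the Listing shape and field names are illustrative assumptions, not our actual data model.

```typescript
// Illustrative sketch of the basic sort comparators (field names are assumptions).
interface Listing {
  id: string;
  capacity: number;        // max guests permitted
  distanceMiles: number;   // distance from the search center
  avgRating: number;       // average review score
  fiveStarReviews: number; // count of 5-star reviews
}

type Comparator = (a: Listing, b: Listing) => number;

const byDistance: Comparator = (a, b) => a.distanceMiles - b.distanceMiles; // closest first
const byCapacity: Comparator = (a, b) => b.capacity - a.capacity;           // largest first
const byReviews: Comparator = (a, b) =>
  b.avgRating - a.avgRating || b.fiveStarReviews - a.fiveStarReviews;       // best-rated, most-reviewed first

// Each A/B test simply swapped which comparator served as the default:
function sortResults(
  listings: Listing[],
  variation: "distance" | "capacity" | "reviews"
): Listing[] {
  const comparators = { distance: byDistance, capacity: byCapacity, reviews: byReviews };
  return [...listings].sort(comparators[variation]);
}
```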
But it wasn't quite the win it appeared to be. What we found, after some time, was that review sort was creating a rich-get-richer situation: homes that already performed well and had a lot of reviews were getting even more reservations and more reviews, while homes that had struggled to acquire good reviews weren't getting as many bookings and so were stuck. Given that TurnKey's growth depended on keeping current homeowners satisfied and onboarding new ones, particularly new ones that wouldn't have reviews yet, this wasn't a sustainable solution.
Multivariate Sort Experiments
In our next stage, we sought to combine multiple variables as weights for our default sort. We knew the following from our basic sort tests:
Review sort was good for conversion, but we needed a way to spread the love to homes that didn’t have the best/most reviews
Capacity was next-best for revenue
In addition, user data, internal feedback, and executive opinion produced the following new considerations:
While a pure distance sort was bad for revenue, ignoring distance entirely wasn't ideal either. A home at the very edge of the search area could be ranked at the top. This was particularly counterintuitive when the user moved the map to a new search center and the top results were nowhere near it.
Executives and homeowners were unhappy with our 30-result pagination: in large markets, a home could be many pages deep.
Execs also wanted to see a more densely-populated map to visually convey how much inventory TK had
Priority Score
After a series of tests, it became evident that the variables available to the website alone were not going to solve one of our biggest concerns: balancing revenue maximization with "spreading the love" to under-booked, and particularly new, homes. We needed a more sophisticated way to sprinkle in these homes while still improving conversion rate.
The answer was to bring in our Revenue Management team. They'd always had an interest in the sort algorithm for revenue-maximization purposes, but we'd yet to build anything that would allow them, and most importantly their robust data on unit performance, to contribute. So together we built the Priority score: a value Revenue Management could use to dynamically prioritize some homes over others, both to boost homes with higher revenue potential at a given time and to assist homes that were struggling to get more reservations. We made Priority score the heaviest weight in the algorithm, followed by capacity, and saw conversion rate improve while also spreading reservations among more homes.
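As a rough sketch of how such a weighted sort might be composed (the weights, field names, and normalization below are illustrative assumptions, not our production values):

```typescript
// Hypothetical weighted scoring for the multivariate default sort.
// Weights and normalization are illustrative; the real values were tuned via testing.
interface RankedListing {
  id: string;
  priorityScore: number;   // 0-1, set by Revenue Management
  capacity: number;        // max guests
  fiveStarReviews: number; // count of 5-star reviews
}

const WEIGHTS = { priority: 0.5, capacity: 0.3, reviews: 0.2 }; // assumed; Priority weighted heaviest

function rankScore(l: RankedListing, maxCapacity: number, maxReviews: number): number {
  return (
    WEIGHTS.priority * l.priorityScore +
    WEIGHTS.capacity * (l.capacity / maxCapacity) +
    WEIGHTS.reviews * (maxReviews > 0 ? l.fiveStarReviews / maxReviews : 0)
  );
}

function rankListings(listings: RankedListing[]): RankedListing[] {
  const maxCapacity = Math.max(...listings.map(l => l.capacity), 1);
  const maxReviews = Math.max(...listings.map(l => l.fiveStarReviews), 0);
  return [...listings].sort(
    (a, b) => rankScore(b, maxCapacity, maxReviews) - rankScore(a, maxCapacity, maxReviews)
  );
}
```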
Map Clusters
Still, we needed to solve the problems of pagination and filling the map.
The first step was to test increasing the number of listings per page of results from 30 to 50. This proved to be neutral, and since it was meant to assuage homeowners by getting more homes onto the first page of results, we rolled it out.
The next was to plot all of the results on the map on the first load. The problem was that the map became so full that it was hard to see how many homes were in an area and where. This was exacerbated by our old "balloon" pin design.
So we did three things:
1) Redesigned the pin to be more compact, with an interior TK checkmark and new outline to add contrast when pins overlap
2) Changed the hover state from darkened-teal to red, so that when hovering over a listing it’s clearer which pin belongs to that home
3) Implemented pin clustering
Of these, pin clustering was by far the most difficult to get right, both technically and from a UX perspective. It meant a lot of tweaking of the clustering behavior: the radius of each cluster, the interaction on click, and how clusters behaved at various zoom levels. To get the feel right, we leveraged internal users for testing and feedback.
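Most map libraries provide clustering out of the box, but as a rough illustration of the knobs we were tuning (cluster radius and zoom sensitivity), a naive grid-based clusterer might look like the following; the parameters shown are assumptions, not our production settings.

```typescript
// Naive grid-based pin clustering sketch. Real map-library clusterers are more
// sophisticated; this just shows the behavior and parameters being tuned.
interface Pin { id: string; lat: number; lng: number; }
interface Cluster { lat: number; lng: number; pins: Pin[]; }

// Assumed tuning parameter: pins within roughly `radiusPx` pixels at the current zoom collapse into one cluster.
function clusterPins(pins: Pin[], zoom: number, radiusPx = 60): Cluster[] {
  // Approximate pixel width of one degree at this zoom (Web Mercator, ignoring latitude distortion).
  const pxPerDegree = (256 * Math.pow(2, zoom)) / 360;
  const cellDegrees = radiusPx / pxPerDegree;

  const grid = new Map<string, Cluster>();
  for (const pin of pins) {
    const key = `${Math.round(pin.lat / cellDegrees)}:${Math.round(pin.lng / cellDegrees)}`;
    const cluster = grid.get(key) ?? { lat: 0, lng: 0, pins: [] };
    cluster.pins.push(pin);
    grid.set(key, cluster);
  }
  // Place each cluster marker at the centroid of its pins.
  for (const cluster of grid.values()) {
    cluster.lat = cluster.pins.reduce((s, p) => s + p.lat, 0) / cluster.pins.length;
    cluster.lng = cluster.pins.reduce((s, p) => s + p.lng, 0) / cluster.pins.length;
  }
  return [...grid.values()];
}
```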
Onion Sort
Our search experience had massively improved through all of these tests, but we were still stuck with the problem of how to incorporate distance into results.
Instead of tweaking distance’s weight in the algorithm, we wanted to try a different means of geographic distribution: the top few results should be close to the center of the search map, but the first page shouldn't be dominated by central properties. A user should see a good geographic spread as they scrolled through the list.
So we came up with what we called "Onion Sort". We'd create a series of concentric-circle "layers" radiating out from the search center. Each layer would be ranked internally by the standard, Priority-score search algorithm, and we'd then show a set of results from each layer in turn.
In the original version, the total search radius was 15 miles. The first onion layer covered a 5-mile radius, followed by a 5-10 mile layer and then a 10-15 mile layer. Each layer was ranked by the Priority algorithm. To build the actual ranking order, we'd pull the top 10 homes from each layer and concatenate them: 10 from the 0-5 mile layer, then 10 from 5-10 miles, then 10 from 10-15 miles, repeating the loop until all of the results were shown.
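Here's a minimal sketch of the layering-and-interleaving logic, assuming results already carry a distance and a precomputed Priority rank (the field names and the Result shape are assumptions; the boundaries are the original 5/10/15-mile tiers):

```typescript
// Sketch of Onion Sort: bucket results into concentric distance layers,
// rank each layer by the Priority algorithm, then interleave the layers
// in chunks of 10 until every result has been emitted.
interface Result { id: string; distanceMiles: number; priorityRank: number; }

const LAYER_BOUNDARIES_MILES = [5, 10, 15]; // original tiers; a quarter-mile tier was added later
const CHUNK_SIZE = 10;

function onionSort(results: Result[]): Result[] {
  // Bucket each result into the innermost layer whose outer boundary contains it.
  const layers: Result[][] = LAYER_BOUNDARIES_MILES.map(() => []);
  for (const r of results) {
    const i = LAYER_BOUNDARIES_MILES.findIndex(b => r.distanceMiles <= b);
    if (i >= 0) layers[i].push(r); // results beyond the outermost boundary are dropped
  }
  // Rank within each layer (here by the precomputed Priority rank).
  layers.forEach(layer => layer.sort((a, b) => a.priorityRank - b.priorityRank));

  // Interleave: 10 from 0-5 miles, 10 from 5-10, 10 from 10-15, then repeat.
  const ordered: Result[] = [];
  const cursors = layers.map(() => 0);
  const total = layers.reduce((sum, layer) => sum + layer.length, 0);
  while (ordered.length < total) {
    for (let i = 0; i < layers.length; i++) {
      const chunk = layers[i].slice(cursors[i], cursors[i] + CHUNK_SIZE);
      ordered.push(...chunk);
      cursors[i] += chunk.length;
    }
  }
  return ordered;
}
```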
Later, we tweaked Onion Sort by adding an additional quarter-mile layer, based on feedback that the first 5-mile tier wasn't producing central enough results.
Through this process of A/B experimentation and gathering internal user feedback, we were able to balance improving conversion rate and revenue with satisfying business considerations — a win-win for everyone.