Daniel Boyle dot net / eBay Discovery Recommendations

The Problem

eBay never really knows why a user clicks on an item. Intent can be inferred from filters and search terms, but at the end of the day, eBay doesn’t know whether, in my example, I searched for Adidas because I love their shoes or because it’s the first brand I thought of.

Initial viability through version 1

Hypothesis and prototype

I hypothesized that we could embrace this uncertainty by using recommendations to provide a directed browsing experience based on the top aspects of an item (brand, color, style, etc.), generating recommendations that help users discover what’s important to them. Even if the recommendations weren’t perfect, they should help users refine their search and ultimately find their perfect match.

This was before structured data was mandatory for many items on eBay, so I needed to build a prototype to validate that the data was reasonably likely to exist before our data scientists would invest in training an algorithm. I created a script that would select random popular items from eBay’s inventory, and extract their aspects. Then, it would generate a set of recommendations using an algorithm that found similar (but not identical) items. Those items would then be analyzed for matching aspects, then sorted into buckets by value. When over half of the items returned valid recommendations with aspects, we decided to invest time in a proper algorithm.
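The validation step described above can be sketched roughly as follows. This is a minimal illustration, not the original script: the item shape (`id`, `aspects`), the `find_similar` callable, and the rule that an item counts as "valid" when at least one aspect bucket exists are all assumptions made for the example.

```python
from collections import defaultdict


def bucket_by_aspect(seed, similar_items):
    """Group similar items by aspect value, for aspects the seed item also has.

    Returns {aspect_name: {aspect_value: [item ids]}}.
    The dict-based item shape is hypothetical.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for item in similar_items:
        for name, value in item["aspects"].items():
            if name in seed["aspects"]:
                buckets[name][value].append(item["id"])
    return {name: dict(values) for name, values in buckets.items()}


def coverage(seeds, find_similar):
    """Fraction of seed items that produce at least one aspect bucket.

    `find_similar` stands in for whatever similar-item lookup is available.
    """
    if not seeds:
        return 0.0
    valid = sum(1 for seed in seeds if bucket_by_aspect(seed, find_similar(seed)))
    return valid / len(seeds)


# Tiny made-up sample to exercise the functions.
seed = {"id": "s1", "aspects": {"brand": "Nike", "color": "black"}}
no_data_seed = {"id": "s2", "aspects": {}}
similar = [
    {"id": "a", "aspects": {"brand": "Adidas", "color": "black"}},
    {"id": "b", "aspects": {"brand": "Nike", "color": "red"}},
    {"id": "c", "aspects": {}},
]

buckets = bucket_by_aspect(seed, similar)
rate = coverage([seed, no_data_seed], lambda s: similar)
```

In this sketch, a coverage rate above 0.5 would correspond to the "over half of the items" threshold that justified further investment.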

Mixed, but promising, results

The first set of recommendations attempted to display two or three groups representing different values for the top aspect (such as 3 different colors for a shoe), but it wasn’t effective at filtering out values that shouldn’t change, like shoe size. In the actual experience, this led to inappropriate recommendation paths, or to a lack of recommendations where the data didn’t support enough variation. In categories where this data did exist, we saw good results, providing hope for future iteration.

We decided to delay further investment, as eBay’s push for more structured data was ramping up, giving us hope we’d see better performance soon.

Iterate, improve, and expand

For our second attempt, we wanted to ensure we'd have a more complete result set. As structured data became more reliable, we built an algorithm from scratch that reflected a new strategy: focusing on the variety in key aspects, as determined by the algorithm, the user's original search, and the item itself.

For example, if you’re looking at black Nike running shoes, we would show suggestions for different brands of black running shoes or different colors of Nike running shoes. In this test, not only were people buying directly from the placement, but they were finding the right item more often in general.
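The "change one aspect, hold the rest" idea in this example can be illustrated with a short sketch. It is not the production algorithm, which also weighed the user's search and the item itself; the function name, the item shape, and the sample inventory are hypothetical.

```python
def vary_one_aspect(seed_aspects, inventory, vary):
    """Return ids of items that match the seed on every aspect except `vary`,
    where they must have a different (and known) value."""
    results = []
    for item in inventory:
        aspects = item["aspects"]
        same_elsewhere = all(
            aspects.get(name) == value
            for name, value in seed_aspects.items()
            if name != vary
        )
        differs = aspects.get(vary) not in (None, seed_aspects.get(vary))
        if same_elsewhere and differs:
            results.append(item["id"])
    return results


# Made-up inventory for illustration only.
seed_aspects = {"brand": "Nike", "color": "black", "type": "running shoes"}
inventory = [
    {"id": "i1", "aspects": {"brand": "Adidas", "color": "black", "type": "running shoes"}},
    {"id": "i2", "aspects": {"brand": "Nike", "color": "red", "type": "running shoes"}},
    {"id": "i3", "aspects": {"brand": "Nike", "color": "black", "type": "running shoes"}},
    {"id": "i4", "aspects": {"brand": "Puma", "color": "white", "type": "sandals"}},
]

other_brands = vary_one_aspect(seed_aspects, inventory, "brand")  # ["i1"]
other_colors = vary_one_aspect(seed_aspects, inventory, "color")  # ["i2"]
```

Holding the remaining aspects fixed is also what keeps values such as shoe size from drifting, as long as they are included in the seed's aspects.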

Examples of the first successful UI and recommendations

Mobile Launch

The next step was to provide more options - not just a mixed list of brands or colors, but to actually group them into meaningful sets. We started with our native mobile apps, where users frequently complained that bouncing between search and item pages was tedious and frustrating. Once we showed that carousels were a viable UI for our existing recommendations when inserted at a user's decision points (replacing the earlier grids of items at the bottom of each page), we began exploring ways to enable exploration.

Our first mobile prototype

Our first prototype showed 1 or 2 carousels per aspect, with the additional recommendations triggered by the user to enable an infinite browse experience. Unfortunately, this didn’t go far: in development, it turned out to be too slow, too clunky, and too memory intensive, and it made the app worse to use.

Eventually we settled on using the buttons as mini-tabs to switch out the recommendation sets. This launch produced a boost in performance and engagement similar to the desktop launch.

The final product on Mobile

Beta launch

We opened access to 200 sellers and guided them through a three‑step creation flow before letting them create their first ad.

Key metrics:

Click‑through rate
Ad sales
Return on ad spend (ROAS)
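For reference, click-through rate and ROAS are simple ratios. The sketch below uses the standard definitions (clicks divided by impressions, and ad-attributed sales divided by ad spend); the function names and sample numbers are illustrative and do not come from the beta data.

```python
def click_through_rate(clicks, impressions):
    """Clicks divided by impressions; 0.0 when there were no impressions."""
    return clicks / impressions if impressions else 0.0


def return_on_ad_spend(ad_sales, ad_spend):
    """Ad-attributed sales divided by ad spend, in the same currency."""
    if ad_spend <= 0:
        raise ValueError("ad_spend must be positive")
    return ad_sales / ad_spend


# Illustrative numbers only.
ctr = click_through_rate(clicks=50, impressions=1000)
roas = return_on_ad_spend(ad_sales=300.0, ad_spend=100.0)
```

A ROAS above 1.0 means a campaign generated more in attributed sales than it cost in ad spend.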

Outcomes

Tracking from actual ad campaigns indicated the product could work, returning positive ROAS when ads featured recognizable brands and real discounts. This signaled the need to integrate with seller promotion tools (bundles, discounts) to improve creative inputs. The other choice, allowing free-form content without review, immediately caused quality and trust issues with campaigns. That led to an immediate change to allow only pre-vetted campaign text, which prevented sellers from advertising discounts and bundles.

This meant our two limitations would prove to be blockers for long-term adoption.

Other issues would be easy to address through iteration. Among our dissatisfied users, the top complaint was a lack of transparency and controls around keyword targeting, which was originally abstracted to simplify the campaign creation process and limit spam. To address this, I created a UX that would provide an option for more visibility into the process and worked with the science team to create a workflow that would allow customization of the keywords without risking harm to our buyers.

Retrospective

Not every project can be a winner, but the important thing is that failures don’t carry a high cost and that we learn our lessons. In this case, I learned an important lesson about standing my ground and escalating when obvious danger signs appeared.

I transitioned to management when this project was completed, providing me with a new perspective on this problem, and invested time learning how to prevent my new team from encountering the same problem.

Wireframes for keyword targeting and performance tracking updates

Helping eBay shoppers find their perfect match with recommendations.

Impact