You know nothing about me

One challenge that a number of online services (apps, websites, etc.) experience is making predictions and decisions about new customers.

We know what to do with the customers who have been with us for a while now. We have data on their preferences, usage characteristics, demographics, payments, and a lot more.

But what can we do with a customer who just joined, maybe minutes ago, so that we can make clever inferences about them right away? How should we interact with them? What should we offer them? How do we keep them active and ensure that they won’t just ‘test us’ briefly and immediately leave?

If only there were a way to learn some things about our new customer quickly, and with high precision, that would be useful in making rapid decisions about them.

Show me your friends and I will tell you who you are

New research I was involved with suggests one way to glean some insights about your newcomers as soon as they join. Scientists call this line of research “solving the cold start problem”. You have a person with a “thin file” (the term credit bureaus in the U.S. use for such customers) and you need to make predictions about their future behavior, or quick promotion decisions that may affect their journey with your service now.

The solution our research proposes: look at their friends’ behavior and assume that they are similar. With a little twist. Not all their friends, and not just by averaging the friends’ behavior. Rather, by using a little network analysis.

Here is how it works. Imagine that you manage a website such as, say, PayPal, and a new user, Sandra, joined your service at 10:00 this morning. You know very little about her at this stage. Maybe you know her username and password. You might even be able to infer from her name that she is a woman, or her age. Maybe even her location and device type from the traffic details. Not a lot. Certainly not something that can be helpful in deciding how many transactions she is likely to produce, or how much money she is going to deposit, or whether you should offer her a different fee scheme.

But maybe within a minute Sandra generates her first transaction. This transaction is going to a user, Frank, whom you know well. Frank has been with you for a while and is a frequent user who generates 5 transactions a week at an amount of roughly $100 each, and who consistently pays on time. The probability that Sandra will behave similarly to Frank is higher than chance. Just by knowing something about one of Sandra’s transactions you can start making much more intelligent predictions about her similarity to him.

Our research shows that it actually goes deeper than that. If we know things about Sandra’s network, through Frank, we can make even more intelligent predictions. If Sandra, for example, sends transactions to Frank and to Jutta, then you have more information than if she transacts only with Frank. But if you also know that Frank interacts with Jutta, meaning you have a “triangle” of interactions where A, B, and C all interact with one another, your prediction accuracy is substantially higher. Averaging Frank and Jutta’s data will give you predictive accuracy above 80% on Sandra’s behavior. Remember, Sandra has done only two things since she joined the service, and you can already predict her behavior with better than 80% accuracy.
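The triangle idea above can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions, not the method from the paper: the user "Omar", the weekly transaction figures, and the helper names (`neighbors`, `triangle_partners`, `cold_start_estimate`) are all hypothetical, and interactions are treated as undirected. The estimate averages only those contacts of the newcomer who also interact with each other, and falls back to all known contacts when no triangle exists.

```python
from itertools import combinations
from statistics import mean

# Hypothetical data: weekly transaction counts for established users.
weekly_transactions = {"Frank": 5.0, "Jutta": 3.0, "Omar": 20.0}

# Hypothetical undirected interactions observed so far.
edges = {
    frozenset({"Sandra", "Frank"}),
    frozenset({"Sandra", "Jutta"}),
    frozenset({"Sandra", "Omar"}),
    frozenset({"Frank", "Jutta"}),  # closes the Sandra-Frank-Jutta triangle
}


def neighbors(user, edges):
    """Everyone who shares at least one interaction with `user`."""
    return {other for edge in edges if user in edge
            for other in edge if other != user}


def triangle_partners(user, edges):
    """Contacts of `user` that also interact with another contact of `user`."""
    closed = set()
    for a, b in combinations(sorted(neighbors(user, edges)), 2):
        if frozenset({a, b}) in edges:
            closed.update({a, b})
    return closed


def cold_start_estimate(user, edges, known):
    """Average the known behavior of triangle contacts, else of all contacts."""
    contacts = {c for c in neighbors(user, edges) if c in known}
    basis = (triangle_partners(user, edges) & contacts) or contacts
    if not basis:
        return None  # no usable contact yet
    return mean(known[c] for c in basis)


estimate = cold_start_estimate("Sandra", edges, weekly_transactions)
print(estimate)
```

In this toy data Omar is ignored because he does not close a triangle with Sandra's other contacts, so the estimate rests on Frank and Jutta alone.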

Triangles of connections on a network prove to be extremely powerful in making predictions about people who have just joined the network, before the newcomer has done much of anything.

Other important predictors of behavior include the centrality of Frank in the network. If Frank has a lot of triangles, then the likelihood that Sandra will behave more like Frank is actually lower (Frank is more ‘unique’, and Sandra is likely to behave more like one of his friends than like him). The list of predictive properties gets longer, with many nuances (is the interaction between Frank and Jutta bi-directional, or does Jutta talk to Frank more than he replies; does the network of connections change rapidly, or is it relatively stable; etc.).
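To make the point about Frank's triangles concrete, here is a small Python sketch that counts how many triangles each user takes part in. The users "Omar" and "Lena", the edge list, and the `similarity_weight` formula are assumptions made purely for illustration; the paper's actual measures may differ. The weight simply shrinks as a contact's triangle count grows, mirroring the observation that a newcomer is less likely to resemble a contact who sits in many triangles.

```python
from itertools import combinations

# Hypothetical undirected interactions among established users.
edges = {
    frozenset({"Frank", "Jutta"}),
    frozenset({"Frank", "Omar"}),
    frozenset({"Jutta", "Omar"}),
    frozenset({"Frank", "Lena"}),
    frozenset({"Omar", "Lena"}),
}


def neighbors(user, edges):
    """Everyone who shares at least one interaction with `user`."""
    return {other for edge in edges if user in edge
            for other in edge if other != user}


def triangle_count(user, edges):
    """Number of pairs of `user`'s contacts that also interact with each other."""
    contacts = sorted(neighbors(user, edges))
    return sum(1 for a, b in combinations(contacts, 2)
               if frozenset({a, b}) in edges)


def similarity_weight(user, edges):
    """Illustrative weight (an assumption): more triangles, lower weight."""
    return 1.0 / (1.0 + triangle_count(user, edges))


for user in ("Frank", "Jutta", "Omar", "Lena"):
    print(user, triangle_count(user, edges), round(similarity_weight(user, edges), 2))
```

A weight like this could be used to scale each contact's contribution when averaging, although how strongly to discount highly connected contacts is a modeling choice that would need to be tested on real data.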

Nodes in a network

Instead of listing all of the elements that can be used to predict Sandra’s behavior at t0 (the so-called “cold start”), I will refer you to the academic paper describing the work, which is available online. You will find a lot of the technical details there.

But the main take-home message that I want to leave you with is that knowing nothing about a person who joined a service does not mean you know nothing about them. It just means that you have to dig outside of “their” box. The idea that we are the sum of the five people we interact with the most is true in psychology, and it is even more profound in online behavior. We are just not as unique as we think we are: a lot can be learned about us from the people we surround ourselves with.

At a time when third-party cookies are becoming scarcer, and data on a user is becoming more limited and harder to gather across platforms because of privacy constraints, making predictions about an individual from similar peers’ behavior might become a prominent way for marketing managers to know their customers. Note that the example we used, transactions on the platform, is not the only option. If your service does not have transactions, it might have referrals (a friend invites another friend to join the service). Maybe it allows for sharing of articles, liking others’ pictures, or simply going to someone else’s personal page first. All of those behaviors are “network behaviors”. If I join now and immediately go do something on some other user’s domain, I am a node in a network, and there is information about me that can be gleaned from the network behavior.

With users sampling apps quickly, churning fast, or dropping off before we have gathered much information, what we do with our customers early in the encounter is crucial. Knowing how to leverage information about them to tailor the experience to their preferences is critical, and solving the cold-start problem is increasingly essential to doing that right.

Featured image: Good Studio / Shutterstock.com