Always interesting to see outsiders writing papers about this, using anecdote and unrelated data (mostly political and real world purchase data in this case) to argue that ML doesn't make useful predictions. Meanwhile I look at randomized controlled trial data showing millions of dollars in revenue uplift directly attributable to ML vs non-ML backed conversion pipelines, offsetting the cost of doing the ML by >10x.
It reminds me a lot of other populist folk-science beliefs, like vaccine hesitancy. Despite overwhelming data to the contrary, a huge portion of the US population believes that they are somehow better off contracting COVID-19 naturally versus getting the vaccine. I think when effect sizes per individual are small and only build up across large populations, people tend to believe whatever aligns best with their identity.
> Always interesting to see outsiders writing papers about this, using anecdote and unrelated data (mostly political and real world purchase data in this case) to argue that ML doesn't make useful predictions. Meanwhile I look at randomized controlled trial data showing millions of dollars in revenue uplift directly attributable to ML vs non-ML backed conversion pipelines, offsetting the cost of doing the ML by >10x.
I regularly buy the same brand of toilet paper, socks, and sneakers. Machine learning can predict that.
But, machine learning can't predict that I spent the night at my parents house, really liked the fancy pillow they put on the guest bed, and then had to buy one for myself. (This is essentially the conclusion in the abstract.)
Such a prediction requires mind reading, which is impossible.
The key insight missed by this paper (and by people from the marketing field in general) is that cases like that are extremely rare compared to easy-to-predict cases. Right now they don't matter at all for most products from the perspective of marketing ROI.
Also ML can predict that, BTW. Facebook knows you are connected to your parents. If the pillow seller tells Facebook that your parents bought the pillow, then Facebook knows and may choose to show you an ad for that pillow.
> Also ML can predict that, BTW. Facebook knows you are connected to your parents. If the pillow seller tells Facebook that your parents bought the pillow, then Facebook knows and may choose to show you an ad for that pillow.
I think you're letting your imagination run away, and I think you're trying to exceed the limits of the kind of information that you can collect and act upon.
What you're trying to do is mind reading, and computers physically cannot do that. (Nor can people)
You are friends with your parents on Facebook.
Your parents buy the pillow.
The pillow seller tells facebook that your parents bought the pillow.
Now Facebook knows that somebody who is your friend recently bought the pillow.
Facebook may decide to show you an ad for that pillow because somebody who is your friend recently bought the pillow.
The result may look like "mind reading", but it's actually very simple in terms of actual prediction.
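To make the chain concrete, here is a minimal sketch of the lookup involved. This is my own illustration under stated assumptions: social_graph and purchase_events are hypothetical stand-ins, not anything Facebook actually exposes.

    # Minimal sketch of the "friend bought X" targeting chain described above.
    # social_graph and purchase_events are hypothetical stand-ins, not a real API.
    from collections import defaultdict

    social_graph = {
        "you": {"mom", "dad"},
        "mom": {"you", "dad"},
        "dad": {"you", "mom"},
    }

    # Reported to the ad platform by the pillow seller (e.g. via a conversion pixel).
    purchase_events = [("mom", "fancy_pillow")]

    # Every friend of a buyer becomes an ad candidate for that product.
    ad_candidates = defaultdict(set)
    for buyer, product in purchase_events:
        for friend in social_graph.get(buyer, set()):
            ad_candidates[friend].add(product)

    print(ad_candidates["you"])  # {'fancy_pillow'} -- a simple lookup, no mind reading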
I think you may be conflating the topics and goals of adjacent exercises; predicting consumer behavior is not the same thing as optimizing a conversion pipeline.
The examples they give in section two are directly relevant to optimizing conversion pipelines. They pretty clearly intend to be describing something relevant to the e-commerce user experience.
Are you really sure you're not just fooling yourselves with your randomized controlled trials? As Feynman famously said, the easiest person to fool is yourself. And in business even more than science, you might even like the results.
Have you ever put this data up against something similar to the peer review system in academia, where several experts from a competing department (or ideally a competing company) try to pick your results apart and disprove your hypothesis?
Well, it's certainly possible to fool yourselves with A/B testing, but that doesn't mean you must be fooling yourselves. I've also seen similar results in recommendation settings in mobile gaming, not once but over and over again, across a portfolio of dozens of games and hundreds of millions of players. You don't need to predict 20% better on whatever you are predicting to get a 20% increase in LTV, and it's even better if you are doing RL, since then you are optimizing directly for your KPIs.
The actual conclusion of the study is so absurd that it's not worth engaging with seriously.
> That is, to maximally understand, and therefore predict, consumer preferences is likely to require information outside of data on choices and behavior, but also on what it is like to be human.
I was responding to the interpretation from the blog post, which is more reasonable.
Yes, the review paper appears to be roughly conditioned on "using data that academics can readily access or generate".
Clearly, this doesn't generalise to cases where you have highly specific data (e.g. if you're Google).
However, the cases with large societal impact are more likely to be the latter. They may perhaps be better viewed as conditioned on "data so valuable that nobody is going to publish or explain it", which is more or less the complement of what the review covers.
If your ML model were able to perfectly predict what consumers are going to buy, the revenue lift would be zero.
Let's say I go to the store to buy milk. The store has a perfect ML model, so they're able to predict that I'm about to do that. I walk into the store and buy the milk as planned. So how does the ML help drive revenue? The store could make my life easier by having it ready for me at the door, but I was going to buy it anyway, so the extra work just makes the store less profitable.
Maybe they know I'm driving to a different store, so they could send me an ad telling me to come to their store instead. But I'm already on my way, so I'll probably just keep going.
Revenue comes from changing consumer behavior, not predicting it. The ideal ML model would identify people who need milk, and predict that they won't buy it.
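For what it's worth, that last idea has a name: uplift modeling, which scores people by how much a nudge changes their purchase probability. Here is a minimal two-model sketch, assuming scikit-learn is available and using random data purely for illustration:

    # Two-model uplift sketch: score = P(buy | treated) - P(buy | untreated).
    # High scores flag "persuadables" -- people unlikely to buy without a nudge.
    # Data is random and purely illustrative; assumes scikit-learn is installed.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))            # customer features
    treated = rng.integers(0, 2, size=1000)   # 1 = was shown the milk promotion
    bought = rng.integers(0, 2, size=1000)    # 1 = bought milk

    model_t = LogisticRegression().fit(X[treated == 1], bought[treated == 1])
    model_c = LogisticRegression().fit(X[treated == 0], bought[treated == 0])

    uplift = model_t.predict_proba(X)[:, 1] - model_c.predict_proba(X)[:, 1]
    persuadables = np.argsort(uplift)[::-1][:100]  # advertise only to the top 100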
This is incorrect. You can predict many things that drive incremental revenue lift.
The simplest: predict which features a user is most interested in and drive them to that page (increasing their predicted conversion rate) -> purchases that would not otherwise have occurred.
Similarly: predict products a user is likely to purchase given that they made a different purchase. The user may not have seen these incremental products. For example, a user buys an orange couch, so show them brown pillows.
Like the above, the same actually works for entirely unrelated product views: if a user views products x, y, z, we can predict they will be interested in product w and advertise it (see the co-occurrence sketch after this list).
Or we predict a user was very likely to have made a purchase, but hasn’t yet. Then we can take action to advertise to them (or not advertise to them).
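A minimal sketch of the "viewed x, y, z, so show w" idea from the list above, using simple co-occurrence counts. The session data is hypothetical, and real systems would use something like matrix factorization rather than raw counts:

    # Co-occurrence sketch of "users who viewed x, y, z also viewed w".
    # Sessions are hypothetical; production systems typically use matrix
    # factorization or neural recommenders rather than raw counts.
    from collections import Counter
    from itertools import permutations

    sessions = [
        ["orange_couch", "brown_pillows", "rug"],
        ["orange_couch", "brown_pillows"],
        ["rug", "lamp"],
    ]

    co_views = Counter()
    for items in sessions:
        for a, b in permutations(set(items), 2):
            co_views[(a, b)] += 1

    def recommend(viewed, k=2):
        scores = Counter()
        for v in viewed:
            for (a, b), n in co_views.items():
                if a == v and b not in viewed:
                    scores[b] += n
        return [item for item, _ in scores.most_common(k)]

    print(recommend(["orange_couch"]))  # ['brown_pillows', 'rug']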
ML is useful for many things. I'm asking the question of whether prediction is useful, and whether it is accurate to describe ML as making predictions.
The reason to raise those questions is that for many people, the word prediction has connotations of surveillance and control, so it is best not to use it loosely.
The meaning of the word "predict" is to indicate a future event, so it doesn't make grammatical sense to put a present tense verb after it, as you have done in "Predict what features a user is most interested in." Aside from the verb being in the present tense, being interested in something is not an event.
You can't predict a present state of affairs. If I look out the window and see that it is raining, no one would say that I've predicted the weather. If I come to that conclusion indirectly (e.g. a wet umbrella by the door), that would not be considered a prediction either because it's in the present. The accurate term for this is "inference", not "prediction".
The usage of the word predict is also incorrect from the point of view of an A/B test. If your ML model has truly predicted that your users will purchase a particular product, they will purchase it regardless of which condition they are in. But this is the null hypothesis, and the ML model is being introduced in the treatment group to disprove this.
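For concreteness, the standard way such an A/B test is analyzed is a two-proportion z-test against exactly that null hypothesis. A sketch with hypothetical counts:

    # Two-proportion z-test for an A/B test of an ML-backed pipeline.
    # Counts are hypothetical. H0: conversion rate is equal in both arms.
    from math import sqrt
    from statistics import NormalDist

    conv_a, n_a = 480, 10_000   # control: conversions / visitors
    conv_b, n_b = 552, 10_000   # treatment: ML pipeline enabled

    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    print(f"lift = {(p_b - p_a) / p_a:.1%}, z = {z:.2f}, p = {p_value:.4f}")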
You can predict a present state of affairs if they are unknown to you.
I predict the weather in NYC is 100F. I don’t know whether or not that is true.
Really a pedantic argument, but to satisfy your phrasing concerns you can reword my comment as "We predict an increase in conversion rate if we assume the user is interested in feature x more than feature y."
That is a normal usage in the tech industry, but that's not how ordinary people use that word. More importantly, it's not how journalists use that word.
In ordinary language, you are making inferences about what users are interested in, then making inferences about what products are relevant to that interest. The prediction is that putting relevant products in front of users will make them buy more - but that is a trivial prediction.
Exactly. I know someone who does this for a certain class of loans, based on data sold by universities (and more).
Philosophically -- personally -- I think this is just another way big data erodes our autonomy and humanity while _also_ providing new forms of convenience. We have no way of knowing where suggestions come from, or which options are concealed. Evolution provides no defense against this form of manipulation. It's a double-edged sword, and an invisible one.
If the store knows you will want to buy milk, it can stock milk according to demand. If it doesn't have a perfect understanding of whether or not people want to buy milk, the store will overstock or understock and lose money.
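That over/under-stocking tradeoff is the classic newsvendor problem: the optimal stock level is a quantile of the demand forecast set by the ratio of understocking to overstocking costs. A sketch with hypothetical numbers:

    # Newsvendor sketch of the milk-stocking tradeoff. All numbers hypothetical.
    # Understocking loses the margin on missed sales; overstocking loses the
    # wholesale cost of spoiled gallons. Optimal stock = critical-fractile quantile.
    from statistics import NormalDist

    price, cost = 3.00, 2.00        # retail price / wholesale cost per gallon
    underage = price - cost         # profit lost per gallon of unmet demand
    overage = cost                  # loss per unsold, spoiled gallon

    critical_fractile = underage / (underage + overage)  # = 1/3 here

    demand = NormalDist(mu=200, sigma=30)  # forecast daily demand, in gallons
    print(f"stock {demand.inv_cdf(critical_fractile):.0f} gallons")  # ~187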
No, I'm the person who doesn't know the great things to buy with my Raspberry Pi. Thanks to great predictions on Amazon's part, they get me to buy more. Similar to how Netflix does a pretty good job of recommending movies.
I know this is slightly off what the article is concerned with, but the important question in a business context is whether this prediction is worth anything, i.e. whether it can be turned into revenue that wouldn't be generated in the absence of the prediction.