10 minutes with... #2:
Stefan Hellfritzsch, Software Developer, on Detecting Fake Reviews
1. What role do reviews play in e-commerce?
Reviews support buyers in their purchase process. In an online shop you usually have pictures and marketing texts next to detailed product information. But you can't examine the products as closely as you would in a brick-and-mortar shop, or try them out on the spot. Even good product descriptions cannot replace a personal consultation or trial use: What does the finish look like, can you see glued joints, does the product still keep its promises after several weeks of testing?
Reviews can close this gap, reducing returns and increasing conversion. The customer receives independent information about the actual use of the product and can decide whether it meets their personal requirements. And, let’s be honest, who wants to try out five washing machines just to compare them? All in all, reviews are helpful for vendors and consumers alike.
Fake reviews, however, can render this service useless, because they either falsely present a product as high quality or attack it in a way that resembles hate speech. Users who want to make an objective purchase decision do not want to see fake reviews.
2. Can an ordinary customer recognize fakes?
Yes, if you know the product! Then you see immediately that the description contains wrong information. But then you don’t need help with the purchase decision.
In general, it is difficult to pin a fake review down to one specific feature. There are indications, such as an overly positive portrayal when most other reviews express a range of opinions. Negative claims without specific reasons can also point to a fake review. If a reviewer has described a large number of similar products, it is very likely that they are not an ordinary user but a professional writer.
3. How can machine logic approach this?
Machine learning methods work in a way similar to how humans learn: they learn from examples whether a review is real or fake. The training data is a set of fake and truthful reviews, each of which is known for certain to be one or the other. Based on this data, the machine learning model learns to distinguish a fake from a truthful review. There are limits here, of course, but in our experiments this works very well. A 100% recognition rate is not possible because, for example, copies of real reviews that were posted for other products are difficult to separate from real reviews.
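To make the idea of learning from labeled examples concrete, here is a minimal sketch of a naive Bayes text classifier in plain Python. The interview does not say which model was used, so the choice of naive Bayes is an assumption, and the toy training data and the function names `train` and `fake_probability` are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical toy training data: (review text, known label).
# A real training set would be far larger.
TRAINING_DATA = [
    ("best product ever amazing perfect buy now", "fake"),
    ("amazing amazing perfect best deal buy now", "fake"),
    ("terrible worst product ever do not buy", "fake"),
    ("works fine but the hose is a bit short", "truthful"),
    ("quiet motor and the door seal feels solid", "truthful"),
    ("after three weeks the drum still runs fine", "truthful"),
]


def train(examples):
    """Count words per label; these counts are the whole model."""
    word_counts = {"fake": Counter(), "truthful": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def fake_probability(text, model):
    """Return P(fake | text) under naive Bayes with add-one smoothing."""
    word_counts, label_counts = model
    vocabulary = set(word_counts["fake"]) | set(word_counts["truthful"])
    log_scores = {}
    for label in ("fake", "truthful"):
        total_words = sum(word_counts[label].values())
        # Start from the prior: the share of training examples with this label.
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for word in text.lower().split():
            count = word_counts[label][word] + 1  # add-one smoothing
            score += math.log(count / (total_words + len(vocabulary)))
        log_scores[label] = score
    # Turn the two log scores into a probability for the "fake" label.
    diff = log_scores["truthful"] - log_scores["fake"]
    return 1.0 / (1.0 + math.exp(diff))


model = train(TRAINING_DATA)
print(fake_probability("amazing best product buy now", model))
print(fake_probability("the door seal still feels solid", model))
```

With this toy data, the first review scores well above 0.5 and the second well below it. A review copied word for word from a truthful one would score as truthful, which illustrates why a 100% recognition rate is out of reach.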
4. How does the machine recognize a fake review?
First, the reviews must be prepared so that they can be processed by a machine learning model. This is a kind of conversion of language into numerical vectors. It is done with various Natural Language Processing methods that take language apart, analyze sentence structures, and find and normalize word forms so that, for example, "walk, walking, walked" can be reduced for the machine to one unit of meaning, just "walk". Sentiment is also analyzed to determine the writer's attitude toward a topic, along with whether long or short sentences were used and even whether multiple exclamation marks are present. But you can't say that a review is fake just because one of these features occurs. Machine learning models decide on the basis of all this information, as you do. The result is a probability that indicates how likely the review is to be fake.
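The preparation steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the prototype's pipeline: the suffix-stripping `normalize` function is a crude stand-in for real stemming or lemmatization, the word lists are hypothetical, and the weights in `WEIGHTS` are invented by hand, whereas a real model would learn them from training data.

```python
import math
import re

# Hypothetical weights, invented for illustration; a trained model learns these.
WEIGHTS = {"sentiment": 1.5, "avg_sentence_len": -0.1, "exclamation_runs": 0.8}
BIAS = 0.0


def normalize(word):
    """Crude suffix stripping, a stand-in for real stemming or lemmatization."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical sentiment lexicons, normalized the same way as review tokens.
POSITIVE = {normalize(w) for w in ("great", "perfect", "amazing", "best")}
NEGATIVE = {normalize(w) for w in ("bad", "terrible", "worst", "broken")}


def extract_features(review):
    """Convert a review into a small dictionary of numeric features."""
    tokens = [normalize(t) for t in re.findall(r"[a-z']+", review.lower())]
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    positive = sum(t in POSITIVE for t in tokens)
    negative = sum(t in NEGATIVE for t in tokens)
    return {
        # Strength of the sentiment, regardless of direction.
        "sentiment": abs(positive - negative) / max(len(tokens), 1),
        "avg_sentence_len": len(tokens) / max(len(sentences), 1),
        # Number of places with two or more consecutive exclamation marks.
        "exclamation_runs": len(re.findall(r"!{2,}", review)),
    }


def fake_probability(features):
    """Combine all features into one probability with a logistic function."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


print([normalize(w) for w in ("walk", "walking", "walked")])
print(fake_probability(extract_features("Amazing!!! Best product ever!! Perfect!!!")))
```

No single feature decides the outcome here: every feature contributes to the weighted sum, and only the combined score is turned into a probability.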
5. How does this improve our product?
We have developed a prototype in the form of a microservice for our Synaptic Ecosystem. The additional service can be hosted separately in Azure and linked to our product via a REST interface. As a result, the shop manager receives a list of existing reviews sorted by the probability that each one is fake. The shop manager can then apply their own standards to reject or approve the reviews that are to be visible in the shop. The service can thus be used to increase the efficiency of existing processes.
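The sorted list that the shop manager receives could be produced along the following lines. This is only a sketch: the `ScoredReview` type, its field names, and the sample scores are hypothetical, and the actual response format of the service's REST interface is not described here.

```python
from dataclasses import dataclass


@dataclass
class ScoredReview:
    review_id: str
    text: str
    fake_probability: float  # assumed to be supplied by the detection service


def rank_for_moderation(reviews):
    """Return the reviews with the most suspicious ones first."""
    return sorted(reviews, key=lambda r: r.fake_probability, reverse=True)


# Hypothetical sample data with made-up scores.
scored = [
    ScoredReview("r1", "Works fine, hose is a bit short.", 0.08),
    ScoredReview("r2", "BEST PRODUCT EVER!!! Buy now!!!", 0.93),
    ScoredReview("r3", "Terrible. Do not buy.", 0.61),
]

for review in rank_for_moderation(scored):
    print(f"{review.fake_probability:.2f}  {review.review_id}  {review.text}")
```

Sorting rather than filtering keeps the decision with the shop manager, who works down the list from the most suspicious review and stops wherever their own threshold lies.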
If you are interested in more information about this research or service, please contact S.firstname.lastname@example.org