PostNL AI sentiment analysis

During a jumpstart with PostNL, we investigated how machine learning can contribute to a more personalized user experience for their chatbot, Daan.

Increasing customer satisfaction
PostNL has been successfully using its chatbot Daan for several years to support its customer service. The chatbot helps customers quickly find an answer to their question. PostNL, for which we are already building the app, asked us to help in its search for even better customer service: how can the user experience of the chatbot be improved and expanded? One of the ideas we came up with together was to apply sentiment analysis, an established machine learning technique for analyzing subjective information in text, such as customer satisfaction. We wanted to investigate whether it is technically possible to use sentiment analysis in the PostNL chatbot to help end users better and increase customer satisfaction.

Challenge
Sentiment analysis is typically used to measure sentiment in reviews (of movies or restaurants, for example) and in posts on social media and forums (such as Twitter and Reddit). Moreover, the analysis is usually performed on English text. The challenge in this project was therefore twofold: applying sentiment analysis in a new context, namely chatbots, and applying it to messages in Dutch.
This type of challenge is ideal for a jumpstart. In a jumpstart, we build a technically working prototype in one or two weeks and then test it with real users. This quickly gives us answers to questions such as: what is technically feasible? And: what do users think? Those answers are a good way to validate whether an idea is worth further investment, as was the case with the jumpstart with PON, for example.

Prototype and user test
We validated the technical feasibility by building a prototype. We looked at whether sentiment analysis can be applied within the existing chatbot tech stack, and what quality we can expect from the ML models. To test the quality of various machine learning models, we anonymized historical data and then analyzed it. This showed us which model is most suitable. We then used the prototype in the user test, which clearly showed how users relate to a chatbot. This allowed us to answer questions about the role of sentiment in a conversation, and about which forms of personalization work and which do not.
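
To illustrate what comparing models on anonymized historical messages can look like, here is a minimal Python sketch. The sample messages, their labels, the two keyword-based models and all function names are assumptions made up for this example; they are not the data or the models used in the jumpstart, and the masking shown is far from a complete anonymization.

```python
import re
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (text, label), label is "pos", "neg" or "neu"

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
LONG_NUMBER = re.compile(r"\d{4,}")
WORD = re.compile(r"[^\W\d_]+")  # runs of letters, including accented ones


def anonymize(text: str) -> str:
    """Mask e-mail addresses and long digit runs (illustrative only)."""
    text = EMAIL.sub("<email>", text)
    return LONG_NUMBER.sub("<number>", text)


def lexicon_model(positive: set, negative: set) -> Callable[[str], str]:
    """Build a toy classifier that counts positive and negative keywords."""
    def predict(text: str) -> str:
        words = WORD.findall(text.lower())
        score = sum(w in positive for w in words) - sum(w in negative for w in words)
        if score > 0:
            return "pos"
        if score < 0:
            return "neg"
        return "neu"
    return predict


def accuracy(model: Callable[[str], str], data: List[Message]) -> float:
    """Share of messages for which the model predicts the human label."""
    correct = sum(model(anonymize(text)) == label for text, label in data)
    return correct / len(data)


def best_model(models: Dict[str, Callable[[str], str]], data: List[Message]):
    """Score every model on the same data and return the best name plus all scores."""
    scores = {name: accuracy(model, data) for name, model in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical hand-labeled Dutch chat messages.
SAMPLE: List[Message] = [
    ("Bedankt, mijn pakket is goed aangekomen!", "pos"),
    ("Mijn pakket is kwijt, dit is echt slecht.", "neg"),
    ("Waar is mijn pakket 3SABCD1234567?", "neu"),
    ("Super snel geleverd, top!", "pos"),
    ("Al drie dagen te laat, belachelijk.", "neg"),
]

# Two hypothetical candidate models with different vocabularies.
MODELS = {
    "broad_lexicon": lexicon_model({"bedankt", "goed", "top", "super"}, {"slecht", "kwijt"}),
    "narrow_lexicon": lexicon_model({"goed"}, {"slecht"}),
}

if __name__ == "__main__":
    name, scores = best_model(MODELS, SAMPLE)
    print(name, scores)
```

In practice the candidates would be trained machine learning models rather than keyword lists, and a single accuracy figure on a handful of messages says little; the point of the sketch is only that every candidate is scored on the same anonymized, labeled data so the results are comparable.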

Result
This jumpstart consisted of three parts: a data study, a user test and a technical prototype, so the result was threefold as well. Thanks to this one-week jumpstart, PostNL was able to validate, with a small investment, whether an innovative concept such as sentiment analysis currently adds value to its chatbot.