Testing our work with users has been a favourite Yoomee activity since we started over a decade ago. We've tested many websites and apps over these years, but never a chatbot.
We are currently developing a chatbot which will help people in Sheffield struggling with their mental health find support using natural conversation. The information will come from the Sheffield Mental Health Guide. So what did we learn from our first foray into testing this chatbot with users?
Assumptions versus reality
At a basic level, I had assumed each testing session would be shorter and simpler than its website equivalent. After all, a chatbot's functionality is pared back compared to that of a website. And the six scenarios drawn up by Sheffield Flourish — the fabulous charity behind the Sheffield Mental Health Guide from which our chatbot will draw its data — seemed pretty straightforward.
Wrong, wrong, wrong! I based this assumption on my perspective as someone involved in developing the chatbot (its language rather than its functionality, I hasten to add) rather than putting myself in the shoes of people interacting with it for the first time. How easy it is to forget this is all about the users. Getting used to conversing with a chatbot brings challenges, and these may be further compounded if you're struggling with your mental health. So the testing process was just as time-consuming, messy and illuminating as it is when testing websites.
Limited scope, limited learning?
Another assumption I brought to the testing table was that because the chatbot's scope and functionality are more limited than those of a website, the associated learning from testing would be similarly limited. Wrong again!
Each user viewed the chatbot through the prism of their individual personality. All testers were male, but their age, ethnic background, digital literacy, mental state and a whole host of other factors shaped the way they interacted with the chatbot and what, for them, were stumbling blocks. We learned so much. Our one regret was that we had no women testers — a first for Yoomee, as the gender balance for testing normally leans towards women rather than men. Sadly, our only no-show on the day was a woman.
Words matter more than ever
I had already been thinking about how a chatbot offers fewer opportunities for user interface design than a website. On the other hand, it offers plenty of scope around language and the way it's used, and testing only helped to underline this fact. If a chatbot is to make itself understood within the confines of a pop-up window, then it needs to do so clearly and succinctly, and in a tone which allays any fears in people with mental health problems.
I had already spent a little time sharpening the draft initial text — a placeholder, if you like — conceived during prototyping. Testing confirmed I need to invest a whole lot more time on this. People with mental health problems will not necessarily focus all their energies on decoding the chatbot's prompts. So the words used must leave no room for error or misinterpretation.
To illustrate this fact, one of our testers struggled because the chatbot asks: "Please tell me in one short sentence what your problem is." He focused on the word "short" and, in keeping his sentences suitably short, overlooked the need to use the very keywords which would have enabled the chatbot to identify the nature of the help he was looking for. Conversely, another tester felt the "short" stopped him from sharing a cathartic essay on the nature of his mental health problems. One for me to go away and contemplate further.
Chatbot: fit for purpose?
So, what did testers make of the concept of a chatbot? Initially, some struggled to get their heads around whether it was a human or a computer they were interacting with, and the answer they settled on seemed to depend on their existing ease with technology and what they read into the scenarios. For some, the chatbot was a welcome alternative to a website, as it does the hard graft of filtering for you and provides a concise answer. Some described it as "Googling done for you". Others admitted they want a real person, not a computer, to interact with when they're feeling vulnerable — though resources don't always make this possible.
Testing confirmed that simply scraping data as it appears on the Sheffield Mental Health Guide website into the chatbot and presenting it in a small window is overwhelming for users. We definitely need to change the way information is structured and presented to avoid people being confused, or even agitated, by it.
So at the end of the day, there were more similarities than I'd thought between testing a website and testing a chatbot. In both scenarios, there's no hiding. All work and assumptions to date are laid bare. Critiqued with brutal honesty. And equally quickly, you learn what does and doesn't work. "Fail fast!" goes the agile mantra, and user testing helps you do just that.
As a result of our sessions, before our next round of user testing we'll be reviewing the way the chatbot is accessed from the homepage of the Sheffield Mental Health Guide, the messaging and language across the chatbot, and how results are displayed — and then we'll make further changes.
Watch this space!