Wizard of Oz Experiments
Excellently Cheap
Wizard of Oz experiments are an excellently cheap way to test ideas and concepts without actually building a complete, polished, working product.
If you’re like me and haven’t watched the movie in a while, you might’ve forgotten the story and what this “Wizard” is all about. Watch this clip that reveals the wizard:
It’s a funny yet relieving scene: behind the big green head, there’s nothing scary at all. In testing our ideas, of course, we don’t want to scare our participants. The big green head represents our service or product idea, presented realistically enough that people think it’s real. The real wizard, the man behind the curtain, represents us: we operate the service or product manually behind the scenes to make it seem real.
Here’s an obvious example:
Say you work for Slack and you want to test whether chatbots are received well by participants. Building a chatbot would take engineers, psychologists, linguists, and writers weeks or months just to get it to a point where it is not only functional but also behaves the way you want it to. And after you’ve spent all that time building and perfecting this feature, what if you launch it only to find that nobody uses it, wants to use it, or, worse yet, needs it, because it doesn’t solve their problem? However, you have a chatroom service that’s already available today. You could recruit participants and tell them that they are here to test a new feature. Their task is to chat with the chatbot but, behind the scenes, there is no chatbot at all. You, a human being, pretend to be the chatbot and reply to the participant’s messages, making your responses sound as close as possible to how an automated chatbot would respond.
Here the participant believes that they are speaking to a computer (the green head), but you (the wizard) are actually pretending to be the chatbot. You still get genuine responses from participants as long as you present the test in a way that does not reveal that any of this is fake.
Here’s another story I heard:
A consumer electronics company (like Best Buy) wanted to test whether there was interest in a vending machine in hospital hallways. Installing those vending machines is costly, so it wouldn’t make sense to ship one out there, set it up, see how many hospital patients and visitors purchase something, and then break it down again if there is no interest. So instead, they Wizard of Oz-ed it. They created a simple purchase form on an iPad and set up the iPad on a stand in one of the hallways. An employee hid in the closet behind the iPad with the goods and came out to deliver the purchased item once the form was submitted. A little creepy? Maybe a little. But this was a much cheaper method and still produced a count of interested customers.
How We Used It at FordLabs
Recently, my team and I also conducted a Wizard of Oz experiment for our product. We wanted to find out how much engagement could be driven if users could receive points (redeemable for rewards) for driving and for driving well. The first idea was to just test with a clickable prototype. However, earning points is really personal: showing participants a screen with a number of points and telling them this is what they’ve earned is not the same as earning those points themselves. The answers and reactions we would get from a clickable prototype test would just be claims: what they say they would do, not what they would actually do.
My next thought was, well, they would really need to use it, earn their own points, and redeem those points for something real in order to give us a true reaction and a closer-to-reality understanding of their level of engagement. So maybe we should just build a simple product that tracks their driving, add simple algorithms to calculate their points, and have them come in person to redeem their reward. No, still not lean enough. Setting up location tracking, mileage, hard braking, and speeding detection is NOT easy. It would still take months to build even just that.
So I had another thought. Well, there are tons of apps out there that already do exactly all that. Why build our own? They also have 1-week free trials we could use, and we could Wizard of Oz the points and rewards ourselves. So we did. We recruited participants for a 1-week driving study and set them up with a free trial account on an existing driving tracker app in our group (so that we could count their driving events). At the end of each day, we calculated how many miles they drove and how many driving events they had, then emailed them a summary of the points they’d accumulated, written to look like an automated email (see image). We also tracked if and when they opened their emails. When they came back in to redeem their points for gift cards, we asked them about their experience and received the feedback we were looking for about engagement and how the app and points affected their driving habits.
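To make the daily tally concrete, here is a minimal sketch of how a wizard could compute the points and draft the "automated" summary by hand-run script. Everything specific in it is an assumption for illustration: the scoring rules (10 points per mile, minus 5 per driving event), the names DayLog, daily_points, and summary_email, and the email wording are hypothetical placeholders rather than a record of the study's actual formula.

```python
from dataclasses import dataclass

# Hypothetical scoring rules, invented for this example only.
POINTS_PER_MILE = 10
PENALTY_PER_EVENT = 5


@dataclass
class DayLog:
    """One participant's driving for one day, copied by hand from the tracker app."""
    miles: float
    events: int  # e.g. hard-braking or speeding events flagged by the app


def daily_points(log: DayLog) -> int:
    """Points for a single day: reward distance, penalize events, never go below zero."""
    earned = int(log.miles * POINTS_PER_MILE) - log.events * PENALTY_PER_EVENT
    return max(earned, 0)


def summary_email(name: str, logs: list[DayLog]) -> str:
    """Draft the end-of-day summary; the last entry in logs is treated as today."""
    today = logs[-1]
    total = sum(daily_points(log) for log in logs)
    return (
        f"Hi {name},\n"
        f"Today you drove {today.miles:.1f} miles with {today.events} driving event(s).\n"
        f"Points earned today: {daily_points(today)}\n"
        f"Total points so far: {total}\n"
    )
```

Calling summary_email("Sam", [DayLog(12.5, 2), DayLog(8.0, 0)]) returns the text of one email that the wizard can paste into a message and send. The point of keeping it this small is the same as the experiment's: the participant sees something that looks automated, while the only thing actually built is a few lines of arithmetic.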
Creating an Experiment
You may think it is difficult to turn your test into a Wizard of Oz experiment. That’s true, and sometimes a Wizard of Oz experiment is just not suitable for what you’re doing. But when it is suitable, it just takes some resourcefulness, creativity, and thought to convert your test into one. I’ve given you some examples here and shown you the thought process I used to convert my own test. Read up on other examples and start brainstorming with your team!
A snippet from Aardvark using Wizard of Oz
The difference between Wizard of Oz testing and Concierge testing