Rating, expectations, and experiments
Last night I watched Snowpiercer. I'd heard good things about it online, I like to support simultaneous releases on iTunes, and it had a whopping 95% rating on Rotten Tomatoes. I was hugely disappointed. It wasn't a terrible movie. It wasn't a great movie either. But that 95% had set such an expectation for me that when I watched it, the massive flaws made it so much worse, perceptually, than if I'd gone in thinking it was a 40% to 60% movie. I watch all sorts of silly sci-fi, and enjoy it. I just go into it expecting silly sci-fi. How the rating influenced my perception and enjoyment of the film got me thinking. How do we rate things on iMore, and how can we do it better?
Ratings systems are something almost every organization that reviews almost anything has to figure out. (Even if that figuring leads to them not using a ratings system at all.) There are different systems and pros and cons to each, as well as to using one or not.
By way of another example, I recently tried out a game that I didn't like. I found the first run experience and the design to be less than good. Georgia really liked the gameplay, however, and Chris the community. We all cared about different things. How do you account for that?
5-star scales are common. iTunes uses them. Amazon uses them. Each star can either be whole, allowing for a 5-point spread (0 to 100% in increments of 20), or halved, allowing for a 10-point spread (0 to 100% in increments of 10). They also allow for relative measure. A 4-star app is better than a 3-star app, for example. They're not so good at qualifying those measures. Why is the 4-star app better?
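To make the arithmetic concrete, here's a minimal sketch of how stars map to percentages. It assumes the lowest possible rating is one step (1 star, or half a star), which is what gives exactly 5 or 10 distinct values; if a scale also allows zero stars, add one more. The function name `star_scale` is made up for illustration.

```python
def star_scale(max_stars=5, step=1.0):
    """List every (stars, percent) pair from one step up to max_stars.

    Assumption: the lowest rating is one step, not zero.
    """
    count = int(round(max_stars / step))  # number of distinct ratings
    return [(i * step, round(i * 100 / count)) for i in range(1, count + 1)]


whole = star_scale(step=1.0)  # whole stars: 5 possible ratings
half = star_scale(step=0.5)   # half stars: 10 possible ratings

for stars, percent in half:
    print(f"{stars} stars = {percent}%")
```

Either way, the number only orders the apps. Nothing in it says why one app landed on 80% and another on 60%.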
Thumbs up vs. thumbs down (recommended vs. not recommended) are also common. Instead of only showing a lack of positives, they actually highlight negatives as well (-100% to +100% with potentially 0 in the middle). You can easily tell if some apps are good and others bad, but they lack any relative measures. You can't tell how good or bad they are compared to other apps.
Sometimes elements of both are combined, and you get a recommendation scale. For example, must avoid, not recommended, recommended, must have.
All of them can suffer from similar problems. Should 1-star, 2-star, or not-recommended apps be reviewed at all? What's the real difference between a 3-star and 4-star or recommended and must-have app, beyond the personal opinion of the reviewer?
What happens if you rate an app 5-stars and a better app comes along? If that app gets better? What happens if parts of an app are great and others... not so much?
Most of all, how do you overcome the lack of nuance and specificity that, while making ratings highly glance-able, also makes them incredibly shallow?
One of the things I've been thinking about is to tie ratings to specific criteria. It dawned on me during Apple's user experience evangelist Mike Stern's talk on Designing Intuitive User Interfaces at WWDC 2013 that a lot of ideals set for developers and designers could be used as measures for the resulting apps.
For example, how usable is an app? How simple, clear and intuitive are the navigation and controls? How useful is it? Are the features well defined, focused, and implemented? How well designed is it? Is the interface attractive and the interactions enjoyable? How accessible is it? Can it be used by as wide a range of people as possible?
I've not yet finished thinking it through, and there's a lot still to consider, weigh, and figure out how to map to ratings, but I think it could ultimately lead to something that provides really glance-able information that's backed by solid criteria. If those criteria are elaborated upon in the review that includes the rating, it lets everyone know not only relative measures and like or dislike, but also the areas where apps excel and where they don't. The information density increases, the precision increases, and hopefully the value increases.
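Here's a rough sketch of what criteria-backed ratings could look like in code. Everything in it is an assumption for illustration: the four criteria come from the questions above, but the equal weights, the example scores, the 3-star threshold, and the function names are made up, and none of it is a system iMore actually uses.

```python
import math

# Criteria taken from the questions above.
CRITERIA = ("usability", "usefulness", "design", "accessibility")

# Hypothetical weights: equal by default, still to be figured out.
WEIGHTS = {name: 1.0 for name in CRITERIA}


def overall_stars(scores, weights=WEIGHTS):
    """Weighted average of per-criterion star scores, rounded to the nearest half star."""
    total_weight = sum(weights[name] for name in CRITERIA)
    average = sum(scores[name] * weights[name] for name in CRITERIA) / total_weight
    return math.floor(average * 2 + 0.5) / 2  # round halves up, in 0.5 steps


def breakdown(scores, threshold=3.0):
    """Split criteria into where the app excels and where it doesn't."""
    excels = [name for name in CRITERIA if scores[name] >= threshold]
    falls_short = [name for name in CRITERIA if scores[name] < threshold]
    return excels, falls_short


# Made-up scores for an imaginary app, each out of 5 stars.
example = {"usability": 2.0, "usefulness": 4.5, "design": 2.5, "accessibility": 3.5}

rating = overall_stars(example)
excels, falls_short = breakdown(example)
print(f"{rating} stars; excels at {excels}; falls short on {falls_short}")
```

The single number stays glance-able, but the breakdown travels with it, so a 3-star app that's hugely useful and badly designed no longer looks the same as one that's mediocre across the board.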
So, here's where I ask for all of your help. What ratings systems do you like most? What provides you with the most value? What would you like to see on iMore?
Assorted other stuff:
- I also saw Guardians of the Galaxy, which is at 92% on Rotten Tomatoes. I find that score equally ridiculous, but I also enjoyed the hell out of it. It wasn't transcendent by any means, but it was a ton of fun.
- Here's some more of Rotten Tomatoes at the movies: Star Wars at 92%, The Matrix at 87%, The Avengers at 92%, The Godfather at 100%, and so on.
- Guy English had some smart follow-up to our recent Debug podcast with Marco Arment of Overcast, which he posted on [Kickingbear](http://kickingbear.com/blog/archives/464).
- Speaking of smart stuff, read Ben Thompson's piece on the app business being a business.
- Our friends Cali Lewis and John P. opened their new Geek House over the weekend. Huge congrats to both of them. Phil Nickinson went to celebrate along with them on our behalf.