Dispelling Myths about Dictation / Speech Recognition

When I first started writing this I had just finished listening to episode 156 of The Bestseller Experiment; as a patron supporter I get early access to episodes, as well as being a member of the wonderful BXP Team. The marvellous episode focused on interviewing the author Julian Barr about his new book The Way Home. Julian is also a long-time listener and member of the BXP Team. I highly recommend Julian’s book, a gripping, well-paced tale whose characters have clear connections and motivations. His book has also now earned an Amazon bestseller tag! I’m very much looking forward to the next book in the series.

Important paranoid associated thought: like many writers I feel like a fraud who just needs to write more, and thus I feel awkward about asking for advice; after all, I’ve already answered my own request for advice: “Write more!” Anyway, later in the episode the two Marks discussed writing using Speech Recognition (SR) and gave a call-to-action regarding listeners’ experiences with writing via dictation. I was surprised to find that I felt empowered and not a fraud, since this is a topic I know quite well.

As someone with long-term chronic Repetitive Strain Injury (RSI) in both of my wrists I have a lot of experience with speech recognition, going back nearly twenty years to the horrendous days of massively inaccurate software; the frustration and stress of trying to use the software often made me feel even worse! Fortunately the various programs have improved so much in the last ten years that I find dictating to be dramatically faster, easier and shockingly more efficient. The vast improvements have come about because of the following factors:

  1. Understanding of what is involved in analysing language (technical).
  2. Improved code efficiency (technical).
  3. Substantially increased computer processing power (brute force).

This also means that modern speech recognition is better at recognising accent and voice differences. With training, software should adapt to work near perfectly for most users; I appreciate that is quite a bold claim.

As someone who used to be able to maintain a decent enough typing speed of between 70 and 80 words per minute (WPM), having that ability taken away from me was devastating; I was unable to work or partake in most of my hobbies. Having struggled through the horrid early years of dictation I can appreciate why people are loath to give speech recognition a try; however, just about every problem has gone away these days.

In general many people are not up to date with the latest information when it comes to cutting-edge technology; after all there is so much to do/learn. This is in part because non-specialist media outlets are often years behind when reporting non-sensational things; there is so much to talk about, and typically they repeat the same core points. In this ever-accelerating technological era I suspect anyone who has not used modern speech recognition has heard opinions that are about software from 10+ years ago. My title was not an attempt at clickbait. When I discuss or read things about speech recognition there is an understandable fixation on accuracy, but with modern software claiming accuracy of 90%+ for most people with little to no training, and 95%+ with some training, I wonder why accuracy is still considered a barrier to entry. My system seems to be about 99% accurate, but I appreciate it has been used a lot over many years. My point is that most people make errors when typing anyway; even with grammar and spell checkers, mistakes slip through. Even for those who manage a rare 100% accuracy the first time they type something, the result should still be double-checked. Mistakes are still made and accuracy is a concern whether typing or speaking, so why not do the vast majority of the work via speech?

When I was working in adult social services I had a severe RSI flare-up, in fact my worst ever, which caused a domino effect of problems. When I returned to work, for a while I was able to cope thanks to speech recognition, despite being in a large busy office. I was surprised at how accurate it was even with all the background conversations. Additionally, instead of using a mouse to navigate the screen, I found that voice commands were finally efficient. How things had changed!

During long bouts of sleep deprivation I can somewhat rest my eyes whilst dictating. Thankfully I rarely get headaches, but dictation has also proved helpful when I have; I find it’s better to do something than nothing, since I’ll be suffering either way.

I’d like to highlight that a hybrid approach can be used. If you can still type and you want to, then do so. This can be quite easy with today’s smartphones; maybe you can use speech recognition whilst away from your normal work area. For the following reasons I’d recommend at least experimenting.

Speech Recognition Pros & Cons

Pro 1: Health

When dictating we don’t need to be sat down or stood still; we are not tied to a keyboard. Since we can move about I often do so. Over the years I have done all manner of things whilst dictating, from physiotherapy and light exercise/stretching to things like cleaning or ironing. When I am having a particularly painful wrist episode my arms, shoulder and back all become problematic, resulting in difficulties sitting or standing for any length of time, so on particularly bad days I’ve even dictated whilst resting in bed.

Con 1: Training Time Investment

Like any new skill there can be a learning curve, which can vary dramatically from person to person. These days, though, even without any training on a modern device and software, dictation can start out at 90%+ accuracy.

I appreciate that getting out of comfort zones and allocating time to learn something can be challenging. Saying embrace the challenge is all well and good, but people and their situations can vary wildly. It is sensible to decide during an epically busy time that doing something new is too much of a risk, but because life is strange maybe the change will quickly be beneficial, even with regard to time, which links to Pro 2 …

Pro 2: Speed

Personally, I think the health reason is reason enough, but just in case, here is another. Just because a person is good at typing does not mean they should stick with that method, since dictating can allow them to be faster. I often find it easy to dictate over 100 WPM, sometimes as high as 150 WPM; granted, a few typists with specialist keyboards can beat that, but for the vast majority of people dictation is twice as fast as typing.

Following on from Con 1, it is worth learning the extra functions like how to navigate via dictation, as well as the various advanced commands. Going from quick dictation to struggling to carry out navigation commands can make you feel like a writing session was ruined; writers typically have enough reasons to procrastinate without imagining new ones 😉

Speed is a major factor for writing events like #NaNoWriMo, thus the speed advantage of dictation can really pay off.
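To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes the standard NaNoWriMo target of 50,000 words, takes 75 WPM as the midpoint of my old typing speed and 150 WPM as my top dictation speed, and pretends the words flow non-stop with no pauses for thinking, so treat the results as best-case figures. The function names are just illustrative.

```python
# Rough, best-case comparison of typing versus dictating a first draft.
# Assumption: output is continuous, with no pauses for thinking or breaks.

NANOWRIMO_TARGET = 50_000  # words


def drafting_hours(word_count: int, wpm: float) -> float:
    """Hours of continuous output needed to produce word_count words."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return word_count / wpm / 60


def words_to_correct(word_count: int, accuracy: float) -> int:
    """Approximate number of words needing a fix at a given accuracy (0 to 1)."""
    if not 0 <= accuracy <= 1:
        raise ValueError("accuracy must be between 0 and 1")
    return round(word_count * (1 - accuracy))


typing_hours = drafting_hours(NANOWRIMO_TARGET, 75)      # about 11.1 hours
dictating_hours = drafting_hours(NANOWRIMO_TARGET, 150)  # about 5.6 hours

print(f"Typing at 75 WPM:     {typing_hours:.1f} hours")
print(f"Dictating at 150 WPM: {dictating_hours:.1f} hours")
print(f"Fixes at 95% accuracy: {words_to_correct(NANOWRIMO_TARGET, 0.95)} words")
print(f"Fixes at 99% accuracy: {words_to_correct(NANOWRIMO_TARGET, 0.99)} words")
```

Even at the 95% accuracy claimed for lightly trained software, that is roughly 2,500 words to fix across a whole draft, which is worth weighing against the hours saved.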

Con 2: Initial Costs

Not everyone has a computer (desktop/laptop/tablet) or smartphone (I’m only differentiating because so many people typically do, as it is really just a computer with a phone function). Free speech recognition exists, but I find Dragon NaturallySpeaking to be better overall, although it isn’t cheap.

Then there is the topic of what microphone to use. Whilst you can use a laptop’s built-in microphone it is better to have a decent microphone, although I’ve found that a £25 microphone works just as well as my more expensive Yeti, so you don’t have to buy crazy equipment.

Other extras: I’ve also invested in a microphone stand, pop-filter, USB cable extension and a high quality wireless headset. The extension and the wireless headset are the reason I can exercise or tidy my room whilst dictating.

One of the problems I found using my fantastic quality Yeti microphone was there were a few delays/problems with the software, but this was because I had leaned back in my chair and thus wasn’t close enough to the microphone. So before you rush off to buy an expensive microphone consider how your setup can be altered to get improvements.

Pro 3: Speaking is Natural + Rhythm of Speaking

Based on this subtitle you can see why Nuance called their software NaturallySpeaking 😉 Particularly when dictating dialogue I find I can write a better scene; I think this is down to being able to somewhat act the scene out, as I feel more in character when I switch back and forth between character perspectives. I’ve even experimented with literally acting a scene out, although that led to some comedy moments of frantically changing my position to be the correct character, like a stand-up performance.

Sometimes we can spend a lot of time thinking about a subject only to find that when we speak we change what we had intended to say. There is something about speaking out loud; maybe it is because we engage more of our body, thus more of our brain. I also think this is probably a knock-on effect of evolution with regard to us being such a social species: we need to be careful of what we say to others.

One of the best tips for writers is: “Read your writing out loud.” Dictating can be a big help here: you get used to speaking out loud, so when it comes time to edit your work you are more likely to give it a try. This also links to one of the key tips from The Bestseller Experiment, “Make a public declaration.”

There is another advantage to dictating. If you think of a sentence and then struggle to dictate it, then that is a sign there is a problem. Typically you’ll easily find a rhythm, indicating where commas and full stops best fit; granted you have to say “comma”, but I think that is no different to having to press the comma key. Maybe somebody who struggles with grammar could benefit from dictation?

Con 3: Editing

As I mentioned above I think this is a con that gets too much attention, since work should be double-checked anyway. Still, it can be particularly irksome during the training period, when correcting (editing) as you go is highly recommended. I think a valid point about the accuracy aspect is that dictation errors are typically ones we are aware of, unlike when most people type and things slip through.

Crucially this is a problem that fades over time; I rarely need to correct things. Since I write fantasy fiction and role-playing games I also have lots of additions for my fantasy proper nouns, and my system mostly recognises these new words after the initial correction or two. Just like with typing it is more important to get something written first; then you have something to edit.

Pro 4: Flow

Due to the pain from my disability, I lost my ability to enter a flow state whilst writing/typing. It was 2009 when this feeling briefly appeared during dictation. My comfort level with dictating slowly grew over the years; by 2009 I found talking to my computer to be not only comfortable but also empowering.

Con 4: Habits

Initially when first learning to use speech recognition a user can feel they are wasting their time. Why bother stressing yourself out, fighting your habits? I’ve separated this point from Con 1: Training, because I think habits/traditions are such a powerful part of our psychology.

Habits are typically difficult to break; various people can react differently to the same thing. Decades ago I had the regular association of being denied the use of my wrists to type a decent work session, the threat of pain from typing as well as sitting too long, plus stress and sleep deprivation. Since back then speech recognition was lacking, I quickly developed justifications about putting things off. In the light of pain-paranoia and frustration it became easy to justify thoughts like “I need to minimise computer usage even using dictation, so I need to work out as much as possible upfront.” Once I developed this habit I found it hard to break it, even as the ability of speech recognition improved.

Pro 5: Focus

I find I do not get distracted as much when I am dictating. Maybe because I am typically away from my desk, so I cannot easily check emails or browse. It can seem like our hands have a mind of their own when within a split second of thinking about a website we’ve switched to that. This is why so many writers use blocking software that restricts their access to the Internet. Following on from Pro 3, I find that if I do start giving my computer commands to browse non-important things I quickly stop myself.

Con 5: Stream of Consciousness

Dictating does not dictate quality. The fact we can dictate more WPM means we can also have more to edit. This is a minor Con, yes I’m being nit-picky, but over the years I have dictated a lot of garbage. I think I have solved this by writing more, showing others my work, learning more about writing; not just practice, but learning to carry out skilled practice. If you feel that when you start dictating you are writing garbage, don’t worry I think you’ll quickly adapt.

Bonus Pro: Moving is Thinking

Linking back to Pro 3: Speaking is Natural, there is something about moving and thinking; dictation means you don’t have to be sat still at a keyboard. When we move we are activating different brain regions, plus getting the blood flowing, etc. Physical intelligence is one of the many types of intelligence being researched, plus whilst kinaesthetic learners are typically separated from other learning types, the majority of people can learn in all manner of ways including kinaesthetic. Quick interesting point: animals have a more developed brain than plants because they need to navigate; the sea squirt is a fascinating creature that, once it finds a permanent spot for its next stage of life, eats its own brain. It is also worth looking into the tools of memory specialists and how they utilise virtual spaces to associate memories for better recall.

Some speech recognition software allows for the transcribing of previously recorded speech. You can even transcribe a recording of another person, although I’ve never done this and I am not sure of the efficiency of the process.

I’ll be making a video version of the blog in the New Year, but before I finish here are some extra points. Dictating role-playing mechanics is not a big deal; I’ve even used speech recognition to dictate computer code years ago, and I am contemplating giving it another go with the vastly improved software and machine power of today.

Whether walking outside or in bed trying to sleep (chronic pain is hell), I’ve dictated notes via my smartphone’s built-in software. Granted it is not as powerful as Dragon, but it is easy to do and I don’t have to get out of bed. I’ve also made use of a Dictaphone with a headset whilst walking, which I’ve later dictated at home; this counted as a first draft. Dragon Anywhere allows for dictating on the go, but I cannot afford it, I am rarely out, and I already have Dragon 15.

In conclusion if you are still not sure if speech recognition is for you, I highly recommend giving it a go, at least go hybrid, mix things up. The future is already happening!

Links

I’ve written about The Bestseller Experiment before.

The Bestseller Experiment Podcast

Julian Barr

NaNoWriMo

RPG Power of PBM: Social

This post about RPG and Play by Mail Games (PBM) continues on from the previous article RPG Power of PBM: Time.

When discussing PBM RPG, occasionally someone will be concerned that there is a lack of social interaction in such a game type. They envision a lone player reading something like a Choose Your Own Adventure or Fighting Fantasy book. Even before the explosion in email access or the World Wide Web took off, PBM games were very social. Granted some players were playing smaller games with none of their local friends involved, so they had to wait for a letter to arrive by post from other players. Whilst phoning someone was possible, back then the cost was off-putting, particularly for younger players; the further back in time we go the more likely players did not have easy telephone access. For the vast majority of people these days, these are no longer concerns; if you have access to email or the web you are able to be involved in any number of games.

It’s understandable that some players of tabletop games, and in particular LARP, would assume that socialising is an issue in a PBM game. Consider how many people refer to the online world as not being real; there is just something disconnecting about a lack of physical presence. This lack of face-to-face interaction, however, does not prevent a PBM player from developing strong social ties. Besides curiosity, many games promote alliances, and given the strategising power of PBM, contacting other players is normal in all the various games I’ve played. Obviously other players are going to form alliances, and information gathering is vital.

Like with Massively Multiplayer Online (MMO) RPGs, meeting somebody in game randomly could lead to long-time close friendships. Many people will be familiar with online players deciding to meet up, going to large group events, and some players forming close relationships or even marrying. This level of friendship has been happening with some PBM players for decades.

Direct social interaction, face-to-face, whether physically or virtually, is not something everyone wants to do. There could be any one of a number of reasons, such as: chronic illness/injury (whether minor or fully bedbound), social anxiety, autism, and returning to the previous article’s point: a lack of time. Please don’t think of PBM games as being only suited for people with health issues, non-neurotypicals, or any type of disability; this list just highlights another benefit of this game type.

Virtual Socialising, Diversity and Identity

Another interesting aspect of PBM is that of identity, how we present ourselves and how others perceive us; of course sadly some people treat any discussion of identity as an excuse to attack others, particularly minorities. In many of the diverse PBM games a player might choose to hide their identity and present themselves how they want, which some people feel is their best course of action even nowadays. This is another advantage that PBM games can offer.

During a tabletop game, and even more so with LARP, the emotional intensity and sense of connection can be quite strong. It may seem that PBM will lack this level, but just like with any other role-playing games, whether playing with others or reading a turn result by themselves, players can still achieve emotional highs from succeeding or failing. Given the strategy aspect I previously emphasised, having a long-time ambitious plan succeed certainly provides an emotional high. Other players also tend to be interested in what is going on in the game, so for those that want to there are still plenty of chances to socialise with others, as well as telling friends and family about your latest game exploits.

The rise of the modern Massively Multiplayer Online (MMO) game owes its roots to tabletop RPG, Multi-User Dungeon (MUD), and PBM. Within these games large numbers of players come together to form alliances, either to compete with other players, or the game world. Organising things with other players is a big part of the MMO genre: fleet co-ordination in EVE Online, dungeon raids in World of Warcraft, etc. The same applies to other competitive games (FPS, RTS, MOBA, etc.), where it is normal for players to organise themselves into teams/clans.

Coalition

As mentioned above, players were forming alliances in PBM games decades ago, and some professional games were quite popular, leading to massive groups. Playing a bigger PBM means more players to interact with, and this scaling of game size translates to more people to keep up to date with, as well as game positions to track. The end result is that a player could choose to spend a lot of time communicating with other players, and this certainly addresses the query of socialising with a PBM. Some players can be communicating with many players a day, all year, a level of socialising that tabletop or LARP rarely achieves.

My first PBM game was Quest by KJC Games, which I eventually ended up running and redesigning as a moderated RPG. As a kid I had seen PBM adverts in the old White Dwarf magazine (Games Workshop), but the money I earned from my paper round went on RPG books and wargame models. Whilst at college I met some other gamers, and via these people I eventually gave Quest a go, which also led to me trying other games like the massively successful It’s A Crime. Their Quest alliance consisted only of people they were close friends with, which also helped keep in-game information secure.

Information security and trading are a major part of socialising and fun with most PBMs.

Before a tabletop gaming session they often discussed their PBM plans and this co-ordination eventually resulted in devastating attacks on their enemies. When Magic: The Gathering came along, the group would often bounce PBM ideas around whilst playing cards; fun times. I appreciate I was lucky with regard to joining such an organised group of players. Out of the many groups that I played with, this PBM & TTRPG social group (plus a bit of wargaming) was a big help with regard to developing ideas and eventually getting a job at KJC Games. Working at my local games shop Tower Models also helped.

In a future article I’ll tackle a question I have been asked many times: “But how do you actually role-play during a PBM?” Due to the sheer diversity of PBM games I view this as a complex question, although the easy answer is: make a character, play it 😉

My Curious Cthulhu Visits

Several weeks ago I was abruptly awoken, not by my normal constant pain interrupting my sleep, but by an abnormal feeling. I was quite surprised to see upon the ceiling a multitude of writhing tentacles. As my brain struggled to comprehend what I was perceiving, and my heart raced, the tentacles slowly retreated back to what I eventually determined to be a porthole. Whilst I motionlessly observed things, I became aware of a single bizarre eye that was staring back at me, comprehending me in an alien way. For a moment I was gripped with terror, surprised by the insanity of the vista before me. I knew I was most clearly awake, but for a split second I wondered if reality had changed, or I was somewhere else. Seconds later I saw a small tumbling gorilla hovering between myself and the now closing porthole. A calming thought reminded me of the psychology experiment with the video of the basketball game and the person in the gorilla suit… Focus on the gorilla, not the eldritch horror! My feelings of dread subsided as the last tentacle finally closed the porthole, then the gorilla left.

Over the week I experienced more interrupted sleep. Nothing unusual there, but the regular eldritch visitations certainly were, as were the tumbling gorillas; I even had one night when I saw a 4-by-4 grid of them.

I attempted to write about my experiences, but my words felt inadequate, failing to capture the depth of the visitations. I was struck by how ludicrously clichéd the idea of a sleep-deprived writer going crazy was, plus as I analysed my writings/ramblings my normal hypercritical opinion of my work screamed at me that I was writing poor fanfiction.

Although I have been suffering from sleep deprivation for a year, what had recently changed was I had been prescribed Amitriptyline due to its common side effect of causing drowsiness. My doctor thought that using Amitriptyline combined with the Remedeine (opiate) could help over-ride the pain interrupting my sleep. I did initially manage to get a few nights of basic uninterrupted sleep before returning to lacking sleep; this was slightly better than my last year’s pattern. Whilst I might have had a bit more sleep that week, I decided to stop taking the drug, since most of the sleep I was getting resulted in a horrifying moment of anxiety when I woke. With the drug out of my system the hallucinations no longer occurred.

All in all a fascinating set of experiences that one day might result in some interesting fiction, and most certainly aid me in running role-playing sessions, but I’d rather not have had to go through the process. Sleep-wise I have struggled on; I am currently attempting some new non-pill approaches to improving my sleep, and I continue to try to do as little as possible so I can prioritise healing.

One possible positive takeaway is that it seems during the initial hallucination I was quickly able to appreciate my situation, and crucially calm my freaking-out heart rate and panicking thoughts. The later visions just felt like I had to wait for the visitations to wrap up. A paranoid thought follows this, that my brain chemistry might have been badly affected by the drug, on top of the long-term sleep issues, somehow sowing the seeds for a future psychotic breakdown, because one never knows. Thankfully I’ve had no hallucinations since, which beats my paranoia back into a sensible state.
