The New York State Court Reporters Association is promoting Project Steno’s June 6 outreach webinar, as told by NYSCRA’s Transcript Weekly, posted earlier today by NYSCRA Social Media Committee Chair Marina Dubson. Though stenographers have made great strides in recruitment and introducing people to this field through efforts like NCRA A to Z, Open Steno, and Project Steno, there remains a need to get word out to high school students and staff that court reporting is a viable and vibrant career that young people should give serious consideration. Resources will be provided, and it can all only be seen as a wonderful complement to the resources already published by the National Court Reporters Association. If you’ve got some time to attend at 7:00 p.m. Eastern Time this Sunday, definitely consider registering today!
In my Collective Power of Stenographers post, we explored how court reporters collectively out-earn every company in business today. In Aggressive Marketing — Growth or Flailing, we took a look at VIQ Solutions, parent of Net Transcripts, and saw how a transcription company could be making millions in revenue but be unprofitable. This all set me down a path of learning about zombie companies, companies that are not making enough to meet debt obligations, or just barely enough to make interest payments. You can watch Kerry Grinkmeyer describe how that happens here. This isn’t very rare. A Bloomberg analysis of 3,000 publicly-traded companies found one in five were zombies. The main takeaway? Companies can make lots of money and still be taking losses.
I had the pleasure of looking through the Kentley Insights June 2019 Court Reporting and Stenotype Services market research report. I do want to be upfront about it: I have some reservations about the methodologies and some of the reporting. Very much like the Ducker Report, as best I can tell, it’s based off a sampling of respondents from in or around the field. There are parts of the report that are arguably a little incomplete or unclear. For example, being industry experts, we all know the vast majority of the work is done by independent contractors. Independent contractor isn’t a term that appears in the report. Unsurprisingly, when we reach the job pay bands and employment section, it says there isn’t detailed data on the industry and compares us to the telephone call centers industry. So this report is not a must-have for court reporters, but it does have some interesting insights.
Those remarks aside, when we get to the profitability section of the report, we get to see something pretty striking. Based on their data, more than 1 in 4 court reporting companies are not profitable. Average net income as a percent of revenue for the ones that are profitable? About 9.3 percent. For the ones that are not profitable, a loss of about 9.6 percent. And a pretty chart that says as much.
On the following page, there’s a forecast for operating expenses and industry revenue. That’s summed up in another pretty chart.
If we look at the trends here, it’s pretty clear that the forecast is for expense growth to eclipse and outpace revenue growth. If that keeps up, the unprofitable companies are going to be looking at bigger losses year after year. Given all the information I have today, I surmise that the smaller court reporting companies are the more profitable ones and the bigger ones are the ones struggling. There are sure to be some outliers, like small court reporting shops that go bankrupt and leave their independent contractors unpaid. But overall, the smaller companies can’t afford to remain unprofitable for very long, so it’s probably the “big dogs” eating that 10 percent loss. If I’m right, that may also mean the push to go digital is the dying breath of companies that can’t figure out any other way forward. In February, I wrote “…we only lose if we do not compete.” That is becoming more evident with time and data. It is a great time for the stenographic reporter to open up shop and be a part of the 74%.
Speaking of data, if everybody that read this blog donated $1.50, we’d have enough money to stay ad-free for the next two decades. To all donors we’ve had to date, thank you so much, put your wallets away. To everybody else, check out this cool song from M.I.A. about taking your money.
When you care about something, how difficult is it to do? I can only go by my own experiences here. I hate calling lawyers. A family member got fired and there was potentially an attached legal issue. I was on the phone chain calling lawyers for them until I found one that could speak to the family member that same day. I don’t have any desire to be a public speaker, but I figured it out when I thought our profession might need it. US Legal, by all appearances, cares a lot about attracting digital reporters and strengthening AAERT.
In fairness, US Legal does have a reporter corner and a few spots on their site where they specifically mention stenography. But we have to look at the totality of the circumstances to decide whether this is out of genuine care or whether it’s a facade to point at and say “look, we care!” It’s been known for a while that US Legal is backing digital reporting. They bought out Stenotrain, made some announcements to look good, and killed it. Now reporters are getting offers to join USL as long as reporters drop the stenotype and fall in line with whatever junk USL wants to peddle to consumers. Again, I have to look at my own experiences, and when I don’t advertise very much, my site can get as little as 500 views a year.
Meanwhile, when I spend a few hundred bucks on an ad, I get the word steno in front of thousands of people.
Hopefully the point is pretty clear. If and when they cry shortage and say they just can’t fill the seats, it’s a lie. According to Owler, they have a revenue of over $100 million. They’re taking that money and betting it against stenographic court reporters. There are national, state, and nonprofit databases of reporters. This is a game to take our relatively high-paying jobs and organized, educated workforce, and replace them with low-paying jobs and people who won’t have the same ethics culture we do.
It’s a game I need some help winning. All corporations are made up of people. Educate those people on the truth, and just maybe they’ll realize they’re risking everything by backing the losing horse. If you happen to get a message from one of the recruiters working on this, please don’t blast them, but let them know what’s happening. Chances are good they have no idea.
The New York State Unified Court System commissioned the Future Trials Working Group to look at many possibilities for use of technology in the courtroom. In April 2021, the Future Trials Working Group released a report with recommendations for the court system. On page 13 of that report, there was a section regarding the possibility of automatic transcription, and specifically automatic trial transcription.
The report had a strange view on the possibility of automatic transcription. In one area, it noted “the most foreseeable endgame in the evolution of trial transcription likely is full automation.” In another part just down the page, it stated there were “…obstacles to the use of such technology on a fully automated or even predominantly automated basis for the foreseeable future”, going on to note “…automated transcription — at least at its current stage — could threaten access to justice if widely employed.” The most foreseeable endgame is automatic, but in the foreseeable future, the technology is unreliable. This is, in my view, a strange view to take. The report goes on to recommend that the court system study outside vendor offerings for automated/remote transcription or translation.
Court reporters and the people that represent them did not sit in silence. A response was prepared by the New York State Court Reporters Association and the Association of Surrogate’s and Supreme Court Reporters. Several unions supported the response, and the full letter and list of supporting unions can be read below. My personal favorite quote? “…use of automated speech technology for trial transcripts, by all available information, would not threaten access to justice, it would implode it.” We have, as a profession, put our foot down and said “we are here to guard the record, we have been guarding the record for over a century, and we will do all we can to educate the system on why other technologies are inadequate.” State and national association membership has never been more important. Union membership has never been more important. When you contribute to these organizations, you give them strength to advocate for you.
In full disclosure, I did contribute to the letter. But without the work of ASSCR President Eric Allen and NYSCRA President Joshua Edwards, this would not have been possible. Again, it all points to the importance of association and union membership. Members empower leaders. Leaders fight for and advocate on behalf of members. It’s a symbiotic relationship, and if you are not currently a part of it, you certainly want to be.
This month I had a chance to sit down with Marc Russo of MGR Reporting. Marc’s a working reporter and business owner. We got to hit a lot of topics in this video, including Marc’s history in the field, how reporter skill relates to reporter treatment, and how scheduling ahead can help reporting firms fill their clients’ needs.
Using Marc’s words, it’s about treating reporters like people instead of numbers.
Don’t take my word for it, check out the interview here!
With the news that Verbit has bought VITAC, there was some concern on steno social media. For a quick history on Verbit, it’s a company that claimed 99 percent accuracy in its series A funding. In its series B funding it was admitted that their technology would not replace the human. Succinctly, Verbit is a transcription company where its transcribers are assisted by machine learning voice recognition. Of course, this all has the side effect of demoralizing stenographers who sometimes think “wow, the technology really can do my job” because nobody has the time to be a walking encyclopedia.
But this idea that Verbit, a company started in 2016, figured out some super secret knowledge is not realistic. To put voice recognition into perspective, it’s estimated to be a market worth many billions of dollars. Microsoft is seeking to buy Nuance, the maker of Dragon, for about $20 billion. Microsoft has reportedly posted revenue over $40 billion and profit of over $15 billion. Verbit, by comparison, has raised “over $100 million” in investor money. It reports revenue in the millions and positive cash flow. Another company that reports revenue in the millions and positive cash flow? VIQ Solutions, parent of Net Transcripts. As described in a previous post, VIQ Solutions has reported millions in revenue and a positive cash flow since 2016. What’s missing? The income. Since 2016, the company hasn’t been profitable.
Obviously, things can turn around, companies can go long periods of time without making a profit, bounce back, and be profitable. Companies can also go bankrupt and dissolve a la Circuit City or be restructured like JCPenney. The point is not to disparage companies on their financials, but to give stenographic captioners real perspective on the information they’re reading. So, when you see this blurb here, what comes to mind?
Hint. What’s not being mentioned? Profit. While this is not conclusive, the lack of any mention of profit tells me the cash flow and revenue are fine, but there are no big profits as of yet. Cash flow can come from many things, including investors, asset sales, and borrowing money. Most of us probably make in the ballpark of $50,000 to $100,000. Reading that a company raised $60 million, ostensibly to cut in on your job, can be pretty disheartening. Not so once you see that they’re a tiny fraction of the overall picture and that players far bigger than them have not taken your job despite working on the technology for decades.
Moreover, we have a consumer protection crisis on our hands. At least one study in 2020 showed that automatic speech recognition can be 25 to 80 percent accurate depending on who’s speaking. There are many caption advocates out there, such as Meryl Evans, trying to raise awareness on the importance of caption quality. The messaging is very clear: automatic captions are crap (autocraptions), they are often worse than having no captions, and a single wrong word can cause great confusion for someone relying on the captions. Just go see what people on Twitter are saying about #autocraptions. “#NoMoreCraptions. Thank you content creators that do not rely on them!”
This isn’t something I’m making up. Anybody in any kind of captioning or transcription business agrees a human is required. Just check out Cielo24’s captioning guide and accuracy table.
If someone’s talking about an accuracy level of 95 percent or better, they’re talking about human-verified captions. If you, captioner, were not worried about Rev taking away your job with its alleged 50,000 transcribers, then you should not throw in the towel because of Verbit and its alleged 30,000 transcribers. We do not know how much of that is overlap. We do not know how much of that is “this transcriber transcribed for us once and is therefore part of our ‘team.’” We do not know how well transcription skills will fit into the fix-garbage-AI-transcription model. The low pay and mistreatment that comes with “working for” these types of companies is going to drive people away. Think of all the experiences you’ve had to get you to your skill level today. Would you have gotten there with lower compensation, or would you have simply moved on to something easier?
Verbit’s doing exceptionally well in its presentation. It makes claims that would cost quite a bit of time and/or money to disprove, and the results of any such investigation would be questioned by whoever it did not favor. It’s a very old game of making claims faster than they can be disproven and watching the fact checkers give you more press as they attempt to parse what’s true, partially true, and totally false. This doesn’t happen just in the captioning arena, it happens in legal reporting too.
This seems like a terrifying list of capabilities. But, again, this is an old game. Watch how easy it is.
It took me 15 seconds to say six lies, one partial truth, and one actual truth. Many of you have known me for years. What was what? How long will it take you to figure out what was what? How long would it take you to prove to another person what’s true and what’s false? This is, in part, why it is easier for falsehoods to spread than the truth. This is why in court and in science, the person making a claim has to prove their claim. We have no such luxury in the business world. As an example, many years ago in the gaming industry Peter Molyneux got up on stage and demo’d Milo. He said it was real tech. Here was this dynamically interactive virtual boy who’d be able to understand gamers and their actions. We watched it with our own eyes. It was so cool. It was BS. It was very likely scripted. There was no such technology and there is no such technology today, over eleven years later. Do you think Peter, Microsoft, or anybody got in trouble for that? Nope. In fact, years later, he claimed “it was real, honest.”
Here’s the point: Legal reporters and captioners are going to be facing off with these claims for an indeterminate amount of time. These folks are going to be marketing to your clients hard. And I just showed you via the gaming industry that there are zero consequences for lying and that anything that is lied about can just be brushed up with another lie. There will be, more or less, two choices for every single one of you.
- Compete / Advocate. Start companies. Ally with deaf advocates.
- Watch it happen.
I have basically dedicated Stenonymous to providing facts, figures, and ways that stenographers can come out of the “sky is falling” mindset. But I’m one guy. I’m an official in New York. Science says there’s a good chance what we expect to happen will happen and that’s why I fight like hell to get all of you to expect us to win. That’s also why these companies repeat year after year that they’re going to automate away the jobs even when there’s zero merit or demand for an idea. You now see that companies can operate without making any profit, companies can lie, much bigger companies haven’t muscled in on your job, and that the giant Microsoft presumably looked at Verbit, looked at Nuance, and chose Nuance.
I’m not a neo-luddite. If the technology is that good, let it be that good. Let my job vanish. Fire me tomorrow. But facts are facts, and the fact is that tech sellers take the excellent work of brilliant programmers and say the tech is ready for prime time way before it is. They never bother to mention the drawbacks. Self-driving cars and trucks are on the way, don’t worry about whether it kills someone. Robots can do all these wonderful things, forget that injuries are up where they’re in heaviest use. Solar Roadways were going to solve the world’s energy problems but couldn’t generate any energy or be driven on. In our field, lives and important stakeholders are in danger. What happens when there’s a hurricane on the way and the AI captioning tells deaf people to drive towards danger?
Again, two choices, and I’m hoping stenographic captioners don’t watch it happen.
Very often on stenographer social media, we get questions about whether something should be reflected as said, sic’d, or “corrected.” There has been plenty of discussion over the years on whether to correct lawyers’ or witnesses’ speaking in transcription. There are a lot of ways to take this conversation, and in the spirit of keeping this fun, I’ll hit the highlights.
Necessary in this discussion is: “What is my transcript?” The bulk of freelance work goes to deposition reporting. When a case is filed and initial motions to dismiss are decided, if the case is not dismissed, it moves to discovery. Discovery is where the parties exchange information that they have so that when it is time for trial, there are few or no “surprise” pieces of evidence. At the conclusion of discovery, the parties can ask the court to decide the case as a matter of law if there are no factual questions in dispute. If the case cannot be resolved as a matter of law, it goes on to trial. An integral part of the discovery phase is deposition testimony. Parties have an opportunity to question the other side’s witnesses under oath. Witness testimony is evidence, and the evidence unveiled during the discovery phase is ultimately what helps parties settle cases, courts decide whether a matter can be decided as a matter of law, lawyers impeach witnesses at trial, and appellate courts review the decisions of the trial court. In America, the testimony of one witness can convict beyond a reasonable doubt. Your transcript is the verbatim record of what occurred during the testimony, and again, that testimony is powerful evidence.
Unsurprisingly, there are many different takes on what “verbatim” means. We can all read the dictionary definition: “in exactly the same words that were used originally.” But court reporting and transcription are service industries, and there have been many times where court reporters are pressured by a client or company to change that verbatim record in some small way. In my view, that pressure gave life to a lot of court reporter conventions that are daunting for students, new reporters, and even veteran reporters to master. For example, as a young reporter, I was told to take out false starts, never ever report “um,” and to even physically remove strikes and withdrawns from deposition transcripts. Now, wherever you are, the laws in your jurisdiction supersede my advice or opinion, but I am going to share the way I look at each in the hopes that this can be shared with others who struggle with these. For sure, anything I write can and will be debated, but debate can only improve our field.
Removing False Starts
This was drilled into me by agencies as a young reporter. “Always remove false starts.” It’s still being pushed on young reporters today, to the point where some may not even be taking them down. Frankly, I see this as bad advice. The essential factors for a reporter to consider in the way something is transcribed are context and readability. Does my transcription of the verbatim notes change the context of this testimony? Does my transcription degrade the readability of this testimony? In my view, removing most false starts will not actually change context, and doing so will improve readability. As an example:
“Q. Are you — did you go to the store?”
It would be difficult to argue that removing the words “are you” and simply changing the question to “Did you go to the store?” hurts the context. Nothing has changed. And so to the extent removing false starts is looked at favorably in our field, I get it. But what about when it would change context?
“Q. Are you — I mean, did you go — did you go to the — sorry. Did you, if you remember, go to the store?”
“A. I’m sorry. I don’t understand your question.”
What happens in a world where a young reporter, told that they must remove false starts, removes all that and changes it to “Did you, if you remember, go to the store?” The context is unequivocally changed. Verbatim, it’s very clear that the question was not clear. There was a lot of extra “stuff” in there. If such a question is cleaned up, it makes the witness look like they’re unintelligent or not paying attention. Removing false starts can hurt the context and stop legal professionals from doing their job. Imagine that the deposition is taken by a young associate and the trial lawyer is a seasoned vet who did not sit in on the deposition. Reading a “cleaned up” version, the trial lawyer might believe the witness is a bumbling mess. When that witness gets on the stand and is given clear questions, it’s going to be a surprise for that trial lawyer. So even where law may allow the removal of false starts, it’s a decision the court reporting practitioner should make using their own sound judgment, and not on the whims of an agency or client. You may also want to see NCRA Advisory Opinion 4 to the extent it touches on this topic.
Never Ever Report Um
Again, I see the reporting of “um” as a matter of context and readability. Let’s say that you’re taking a motion argument, and it looks something like:
“MS. ATTORNEY: Um, um, um, um, um, um, um, um, um — your Honor, based on the hearing that we just had, there is no set of facts under which the people may prevail. I therefore ask you to dismiss this case in the interest of justice.”
Does it really change anything if you don’t report the ums in that specific instance? Nope. And this isn’t a hypothetical. I recall a situation just like this, where the attorney had, without question, made the point they were trying to make, and then became very flustered asking the court to make a decision. But what if the situation was a trial situation?
“Q. Did you see Mr. Vanhorten shoot Mr. Gorfasi?”
“A. Um, well — um, yes.”
If you transcribe that sentence as “well, yes” the context is destroyed. The witness seems crystal clear on what they saw. Those ums carry a kiloton of context that transforms what is being said. I’m not here to say anyone who omits an um is a bad reporter, but think twice before subscribing blindly to the “truism” that we do not report ums.
Physically Remove Strike That or Withdrawn
Often, strike that is seen as a false start. Just imagine the typical scenario:
“Q. Were you — strike that. Were you ever an employee of ABC Corporation?”
Again, the rule of context comes into play. In the above scenario, I can’t say I see a big problem with the omission of the false start strike that. But as a mentor to many over the years, I’ve come across the following scenario:
“Q. Were you ever an employee of ABC Corporation?”
“A. Well, I wasn’t an employee at the time.”
“MR. GUY: Move to strike.”
What have mentees come back and said? “Chris, my agency says remove strikes. Do I remove that whole thing?” Working reporters have had to counsel many a new reporter. “No. We cannot remove portions. That motion to strike is the attorney preserving their motion on the record, which will be later reviewed by a court.”
Ultimately, with these three categories, leaving things in as they are said is often the way to go. A court can always seal, strike, or disregard something that shouldn’t be in the transcript. On the other hand, a reporter that does not put something in the transcript can be questioned about why it was removed, or even have their neutrality called into question.
Now that we’ve explored some of the common things that impact context, let’s explore some more “what ifs.” Since I was a newbie, the discussion has come up, “Someone said a word incorrectly. Should I sic this?” This comes from a very literal way of thinking sometimes cleverly but pejoratively termed in our field as “the literati.” The pressure is turned up to make something “perfectly verbatim” when there is a video, which brings up the question “are we not being verbatim when the video camera’s not on?” There are two major schools of thought, literal verbatim and readability, and within those schools of thought, you have many different situations and many different gradients. I could not possibly address each one, but let’s hit some common examples.
“Let me ax you a question.” It’s obvious to anyone that the speaker means to say ask. Many speakers do not enunciate clearly. It does not change the context to transcribe “ask,” and it greatly improves the readability, so for such moments where the context is not endangered and the word is obvious, there’s no harm in having the correct word rather than some kind of phonetic spelling. I would say the same for names. Let’s say someone’s name is Dr. Giglio. One person says “Jig-lee-oh” and the other says “Gig-lee-oh.” Again, if it’s clear that this is the same person, and the context is not endangered, transcribing the correct name is the way to go. If it’s not clear, then it’s time to speak up and get some clarification on the spelling! This is not to say you can never write a name phonetically, but try to make these spellings consistent throughout the transcript to the extent people are saying the same word, even if they say it a little differently.
“It’s supposably true.” In addition to not changing context by being too verbatim, we have to be mindful that sometimes people use words that sound like other words. If someone says a “wrong” word or a word we are not accustomed to hearing, we must resist the urge to correct, because that actually can alter context. We must also take the time to research things we are not a hundred percent sure on. In my book, supposably was not a word. The WordPress spellchecker says it’s not a word. I came to learn, a decade into my career, that supposably means “as may be conceived or imagined.” Supposedly is more of a synonym for allegedly. Was this true 10 years ago? I have no idea. As court reporters, we face the harsh reality of language drift. Words fall in and out of use. People do not speak as we were taught. So while you might correct something like axing a question, you have to think twice before you correct something that’s “supposably wrong.” If you have three minutes, check out my favorite video illustrating language drift. You can go back about 700 years before English starts sounding like gibberish and giraffes were camelopards. Through a mix of self-initiated research and our continuing education culture, we keep ourselves ahead of the average transcriber.
Whether there is video or not, you want a clear and logical reason why you have transcribed something the way you transcribed it. In my view, the strongest reason for a transcription choice is “transcribing it any other way would change the context or was not verbatim.” Reporter convention and training take a backseat to that.
Court reporters are masters of English dialects even when we have no training. There is a study out there that pretty much shows we are twice as accurate as laypeople when transcribing the AAVE dialect. The thing that makes us, as humans, so much better than computers at transcribing speech that has a dialect or an accent is our ability to understand context. For example, in the Northern Cities Vowel Shift dialect, someone might say something that sounds like “she went down the black.” Dependent upon the context, we know that that sentence can be “she went down the block.” In brief, our ability to look at the totality of a statement is important. What a reporter may hear is “down the black.” But what must be transcribed, in the interest of both context and readability, is “down the block,” unless there’s some context that tells us “black” is actually correct.
This is also where our ability to speak up for the record comes into play, because if a reporter is unsure, they can seek clarification. For purposes of our work, dialects and accents are very much like garden-path sentences where a sentence goes in a different direction from what you were anticipating; we can discern what’s said from the context. Though accents are a different animal from dialects, the same rules apply. Early in my career, I had a gentleman say something that sounded like “I got up and leave her.” Through context I knew the statement was “I gotta pull a lever.” He was explaining how to open bus doors! Another man talked about the “zeh bruh lies or stripes” on the road, which could only be “zebra lines or stripes.” We’re not here to pick apart how something was said, we’re here to take down what was said.
“Vice-a versa” versus “vice versa.” “Neezy preezy” versus “nisi prius.” “Nun pro tunc” versus “nunc pro tunc.” “In forma papyrus” versus “in forma pauperis.” Because of Latin’s considerable history and various modern regional pronunciation schemes, this is another thing that gets confusing fast. My advice? Treat it like mispronunciations. Treat it like dialects. Treat it like all these other examples and look at the context. If someone says, objectively, the wrong phrase, then don’t change it for them, but if you know exactly what they said, don’t transcribe it phonetically for the sake of “verbatim.” Take a look.
“MR. GUY: Quid pro quo is the Latin phrase for ‘from possibility to actuality.’”
So we head over to Google, and we can see clearly that “a posse ad esse” is the Latin phrase for that. Quid pro quo means “something for something.” No correction is necessary here. We knew what was meant, but the wrong thing was said. Verbatim is our friend. But what if it’s just a butchered pronunciation?
“MR. GUY: vee-low-shee-yee-yus quam asparagi coke-a-tor is the Latin phrase for ‘faster than asparagus can be cooked.’”
“MR. GUY: velocius quam asparagi coquantur is the Latin phrase for ‘faster than asparagus can be cooked.’”
If you’re following along, you can probably tell that I think the second one is the obvious choice. No matter how butchered that pronunciation might be, if it’s clear, transcribing the wrong word or a series of phonetic jabs is what a computer would do. You’re better than that, use it to your advantage. And do not be too hard on yourself for making a mistake. I have had colleagues that were told the incorrect spelling of Latin phrases by people far more educated than many of us are. Whatever the issue, learn from various mistakes and situations, try not to become so rigid with regard to language that it endangers context, and continue to grow.
But I Was Taught This Way
Whenever stuff like this comes up, inevitably you’ll get responses like “but I was taught this way,” or “I’ve been doing it my way for 30 years.” Nobody can really fight with that. We have to respect one another and those various perspectives, backgrounds, and experiences. But I’ve come to look at it from a liability and reputation perspective for the freelance court reporter. If someone questioned you on a transcript, how would you respond? “My agency told me to” is a very unsafe response, because the agency can just say they didn’t, and if you’re an independent contractor, they’re not supposed to have direction and control over you. So take a look at the practice, and imagine being questioned on it. “That’s what you said” is a much stronger response than “everybody does it this way.”
We have to deal with the fact that, while we may live in a world of “truisms,” like “clients expect us to clean up the record,” these things are not universal, and in fact, as a young reporter, I had a lawyer tell me “you can’t change [false starts], it’s part of the record!” Imagine being about 20, and repeatedly told that “everyone cleans it up,” “this is normal,” “this is expected,” “you’re a bad reporter if you don’t fix it,” and then being slammed with “you can’t take that out.” It’s not surprising to me that there are reporters of all ages and experience levels that struggle with this. I’m really hoping this helps the strugglers: I was you. You’re not going to have an immediate answer for every situation, but having an objective or neutral method for how you make these decisions is imperative. If problems arise, and they occasionally do, you’re going to be defending your work. Remember, this is all about having an accurate record for review by the parties, trial courts, and appellate courts. Our expertise is what stops errors like “lawyer dog” from making it into the record and ruining people’s lives. If your work hasn’t changed the context of a statement and the transcript is readable, you’re off to a great start.
Allie Hall is a reporter and educator who has made amazing strides in getting schools to pick up court reporting programs and getting students to fill those programs. Some months ago, a group of working reporters came together under Allie’s guidance and leadership, and with additional help from co-admin Traci Mertens, the group has managed to donate thousands to new reporters and students in need.
If you are a working reporter or CART writer looking to give back, please reach out about joining the group. There is a fundraiser currently ongoing, and working reporters may donate ten to twenty dollars to help meet students’ needs.
Working reporters may donate via:
Google Pay: email@example.com
There is truly no contribution too small. If you’ve got an extra ten dollars to put down on a student, consider sending it along to Allie today! I am a contributing member of the group, and I have rarely ever seen such energy and accountability in a grassroots fundraiser. This is something special, it’s something I really support, and I know the money is going to making the road that young professionals have to travel just a little bit less bumpy. Most of us can look back at our student years and say “I wish I had…” Now we get to be a part of making sure the students of tomorrow have!
There’s a lot of conjecture when it comes to automatic speech recognition (ASR) and its ability to replace the stenographic reporter or captioner. You may also see ASR referred to as NLP or natural language processing. An important piece of the puzzle is understanding the basics behind artificial intelligence and how complex problems are solved. This can be confusing for reporters because in any of the literature on the topic, there are words and concepts that we simply have a weak grasp on. I’m going to tackle some of that today. In brief, computer programmers are problem solvers. They utilize datasets and algorithms to solve problems.
What is an algorithm?
An algorithm is a set of instructions that tell a computer what to do. You can also think of it as computer code for this discussion. To keep things simple, computers must have things broken down logically for them. Think of it like a recipe. For example, let’s look at a very simple algorithm written in the Python 3 language:
Line one tells the computer to put the words “The stenographer is _.” on the screen. Line two creates something called a Stenographer, and the Stenographer is equal to whatever you type in. If you input the word “awesome” with a lowercase or uppercase “a,” the computer will tell you that you are right. If you input anything else, it will tell you the correct answer was “awesome.” Again, think of an algorithm like a recipe. The computer is told what to do with the information or ingredients it is given.
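The recipe described above can be sketched roughly as follows. This is a sketch under assumptions, not the original listing: the function names `check_answer` and `main` and the exact wording of the replies are made up for illustration, and the check is wrapped in its own function so it can be tried without typing at a prompt.

```python
def check_answer(stenographer):
    """Return the program's reply for whatever was typed in."""
    # Accept "awesome" with a lowercase or an uppercase "a".
    if stenographer in ("awesome", "Awesome"):
        return "You are right!"
    # Anything else gets the correct answer.
    return "The correct answer was awesome."


def main():
    # Put the prompt on the screen.
    print("The stenographer is _.")
    # Store whatever the user types in a variable named stenographer.
    stenographer = input()
    # Tell the user whether the answer matched.
    print(check_answer(stenographer))
```

Calling `main()` runs the prompt interactively. Calling `check_answer("fast")` directly shows what the computer does with an ingredient it was not expecting.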
What is a dataset?
A dataset is a collection of information. In the context of machine learning, it is a collection that is put into the computer. An algorithm then tells the computer what to do with that information. Datasets will look very different depending on the problem that a computer programmer is trying to solve. As an example, for enhancing facial recognition, datasets may be composed of pictures. A dataset may be a wide range of photos labeled “face” or “not face.” The algorithm might tell the computer to compare millions of pictures. After doing that, the computer has a much better idea of what faces “look like.”
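As a rough picture, a labeled dataset can be thought of as a list of examples, each paired with its label. The file names and the `count_labels` helper below are hypothetical and not taken from any real dataset. A real one would hold the pictures themselves and be vastly larger.

```python
from collections import Counter

# A toy "dataset": each entry pairs an imaginary picture with a label.
# The file names are made up for illustration.
dataset = [
    ("photo_001.jpg", "face"),
    ("photo_002.jpg", "not face"),
    ("photo_003.jpg", "face"),
    ("photo_004.jpg", "face"),
]


def count_labels(examples):
    """Count how many examples carry each label."""
    return Counter(label for _, label in examples)


label_counts = count_labels(dataset)
print(label_counts["face"], "examples labeled face")
print(label_counts["not face"], "examples labeled not face")
```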
What is machine learning?
As demonstrated above, algorithms can be very simple steps that a computer goes through. Algorithms can also be incredibly complex math equations that help a computer analyze datasets and decide what to do with similar data in the future. One issue that comes up with any complex problem is that no dataset is perfect. For example, with regard to facial recognition, there have been situations with almost 100 percent accuracy with lighter male faces and only 80 percent accuracy with darker female faces. There are two major ways this can happen. One, the algorithm may not accurately instruct the computer on how to handle the differences between a “lighter male” face and a “darker female” face. Two, the dataset may not equally represent all faces. If the dataset has more “lighter male” faces in this example, then the computer will get more practice identifying those faces, and will not be as good at identifying other faces, even if the algorithm is perfect.
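One way to spot a gap like this is to score each group separately instead of reporting a single overall number. The sketch below uses made-up outcomes chosen only to mirror the percentages mentioned above. The `accuracy_by_group` helper is hypothetical and is not drawn from any study.

```python
def accuracy_by_group(results):
    """Take (group, was_correct) pairs and return {group: fraction correct}."""
    totals = {}
    correct = {}
    for group, was_correct in results:
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (1 if was_correct else 0)
    return {group: correct[group] / totals[group] for group in totals}


# Made-up outcomes: 10 of 10 right for one group, 8 of 10 right for the other.
results = (
    [("lighter male", True)] * 10
    + [("darker female", True)] * 8
    + [("darker female", False)] * 2
)

scores = accuracy_by_group(results)
overall = sum(1 for _, ok in results if ok) / len(results)
print(scores)
print("overall:", overall)
```

The overall figure of 90 percent hides the fact that one group is at 100 percent and the other at 80 percent, which is why the per-group breakdown matters.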
Artificial intelligence / AI / voice recognition, for purposes of this discussion, are all synonymous with each other and with machine learning. The computer is not making decisions for itself, like you see in the movies; it is being fed lots of data and using that to make future decisions.
Why Voice Recognition Isn’t Perfect and May Never Be
Computers “hear” sound by taking the air pressure from a noise into a microphone and converting that to electronic signals or instructions so that it can be played back through a speaker. A dataset for audio recognition might look something like a clip of someone speaking paired with the words that are spoken. There are many factors that complicate this. Datasets might be focused on speakers that speak in a grammatically correct fashion. Datasets might focus on a specific demographic. Datasets might focus on a specific topic. Datasets might focus on audio that does not have background noises. Creating a dataset that accurately reflects every type of speaker in every environment, and an algorithm that tells the computer what to do with it, is very hard. “Training” the computer on imperfect datasets can result in a word error rate of up to 75 percent.
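Word error rate is commonly computed as the number of word substitutions, deletions, and insertions needed to turn the machine's output into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that calculation. The function name `word_error_rate` and the sample sentences are assumptions for illustration. Published studies typically normalize the text first, which this sketch does not do.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # reference word was dropped
                           cur[j - 1] + 1,       # extra word was inserted
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)


reference = "faster than asparagus can be cooked"
hypothesis = "faster that asparagus can cooked"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.0%}")
```

Here one word is substituted and one is dropped out of six reference words, so the word error rate is about 33 percent.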
This technology is not new. There is a patent from 2000 that seems to be a design for audio and stenographic transcription to be fed to a “data center.” That patent was assigned to Nuance Communications, the owner of Dragon, in 2009. From the documents, as I interpret them, it was thought that 20 to 30 hours of training could result in 92 percent accuracy. One thing is clear: as far back as 2000, 92 percent accuracy was in the realm of possibility. As recently as April 2020, the systems studied from Apple, IBM, Google, Amazon, and Microsoft showed 65 to 80 percent accuracy. Assuming, from Microsoft’s intention to purchase Nuance for $20 billion, that Nuance is the best voice recognition on the market today, there’s still zero reason to believe that Nuance’s technology is comparable to court reporter accuracy. Nuance Communications was founded in 1992. Verbit was founded in 2016. If the new kid on the block seriously believes it has a chance of competing, and it seems to, that’s a pretty good indicator that Nuance’s lead is tenuous, if it exists at all. There’s a list of problems for automation of speech recognition, and even though computer programmers are brilliant people, there’s no guarantee any of them will be “perfectly solved.” Dragon trains to a person’s voice to get its high level of accuracy. It simply would not make economic sense to spend hours training software on the voice of everyone who is going to speak in court forever until the end of time, and the process would be susceptible to sabotage or mistake if it were unmonitored and/or self-guided (AKA cheap).
This is all why legal reporting needs the human element. We are able to understand context and make decisions even when we have no prior experience with a situation. Think of all the times you’ve heard a qualified stenographer, videographer, or voice writer say “in 30 years, I’ve never seen that.” For us, it’s just something that happens, and we handle whatever the situation is. For a computer that has never been trained with the right dataset, it’s catastrophic. It’s easy, now, to see why even AI proponents like Tom Livne have said that they will not remove the human element.
Why Learning About Machine Learning Is Important For Court Reporters
Machine learning, or applications fueled by machine learning, are very likely to become part of our stenographic software. If you don’t believe me, just read this snippet about Advantage Software’s Eclipse AI Boost.
If you’ve been following along, you’ve probably figured out, and it pretty much lays it out here, that datasets are needed to train “AI.” There are a few somewhat technical questions that stenographic reporters will probably want answered at some point:
- Is this technology really sending your audio up to the Cloud and Google?
- Is Google’s transcription reliable?
- How securely is the information being sent?
- Is the reporter’s transcription also being sent up to the Cloud and Google?
The reasons for answering?
- The sensitive nature of some of our work may make it unsuitable for being uploaded. To the extent stuff may be confidential, privileged, or ex parte, court reporters and their clients may simply not want the audio to go anywhere.
- Again, as shown in “Racial disparities in automated speech recognition” by Allison Koenecke, et al., Google’s ASR word error rate can be as high as 30 percent. Having to fix 30 percent of a job is a frightening possibility that could be more a hindrance than a help. I’m a pretty average reporter, and if I don’t do any defining on a job, I only have to fix 2 to 10 percent of any given job.
- If we assume that everyone is fine with the audio being sent to the cloud, we must still question the security of the information. I assume that the best encryption possible would be in use, so this would be a minor issue.
- The reporter’s transcription carries not only all the same confidential information discussed in point 1, but also would provide helpful data to make the AI better. Reporters will have to decide whether they want to help improve this technology for free. If the reporter’s transcription is not sent up with the audio, then the audio would only ostensibly be useful if human transcribers went through the audio, similar to what Facebook was caught doing two years ago. Do we want outside transcribers having access to this data?
Our technological competence changes how well we serve our clients. Nobody reading this needs to become a computer genius, but being generally aware of how these things work and some of the material out there can only benefit reporters. In one of my first posts about AI, I alluded to the fact that just because a problem is solvable does not mean it will be solved. I didn’t have any of the data I have today to assure me that my guess was correct. But I saw how tech news was demoralizing my fellow stenographers, and I called it as I saw it even though I risked looking like an idiot.
It’s my hope that reporters can similarly let go of fear and start to pick apart the truth about what’s being sold to them. Talk to each other about this stuff, pros and cons. My personal view, at this point, is that a lot of these salespeople saw a field with a large percentage of women sitting on a nice chunk of the “$30 billion” transcription industry, and assumed we’d all be too risk averse to speak out on it. Obviously, I’m not a woman, but it makes a lot of sense. Pick on the people that won’t fight back. Pick on the people that will freeze their rates for 20 or 30 years. Keep telling a lie and it will become the truth because people expect it to become the truth. Look how many reporters believe audio recording is cheaper even when that’s not necessarily true.
Here’s my assumption: a little bit of hope and we’ve won. Decades ago, a scientist named Richter did an experiment where rats were placed in the water. It took them a few minutes to drown. Another group of rats were taken out of the water just before they drowned. The next time they were submerged, they swam for hours to survive. We’re not rats, we’re reporters, but I’ve watched this work for humans too. Years ago, doctors estimated a family member would live about six more months. We all rallied around her and said “maybe they’re wrong.” She went another three years. We have a totally different situation here. We know they’re wrong. Every reporter has a choice: sit on the sideline and let other people decide what happens or become advocates for the consumers we’ve been protecting for the last 140 years, before the stenotype design we use today was even invented. People have been telling stenographers that their technology is outdated since before I was born, and it’s only gotten more advanced since that time. Next time somebody makes such a claim, it’s not unreasonable for you to question it, learn what you can, and let your clients know what kind of deal they’re getting with the “new tech.”
Some readers checked in with the Eclipse AI Boost, and as it was relayed to me, the agreement is that Google will not save the audio and will not be taking the stenographic transcriptions. Assuming that this is true, my current understanding of the tech is that stenographers would not be helping improve the technology by using it unless there’s some clever wordplay going on, such as “we’re not saving the audio, we’re just analyzing it.” At this point, I have no reason to suspect that kind of a game. In my view, our software manufacturers tend to be honest because there’s simply no truth worth getting caught in a lie over. The worst I have seen are companies using buzzwords to try to appease everyone, and I have not seen that from Advantage.
Admittedly, I did not reach out to Advantage myself because this was meant to assist reporters with understanding the concepts as opposed to a news story. But I’m very happy people took that to heart and started asking questions.
As a stenographic court reporter, I have been amazed by the strides in technology. Around 2016, I, like many of you, saw the first claims that speech recognition was as good as human ears. Automation seemed inevitable, and a few of my most beloved colleagues believed there was not a future for our amazing students. In 2019, the Testifying While Black study was published in the journal Language. The study and its pilot studies showed that court reporters were twice as good at understanding the AAVE dialect as your average person, even though we have no training whatsoever in that dialect, yet the news media focused on the fact that we certify at 95 percent and only had 80 percent accuracy in the study. Some of the people involved with that study, namely Taylor Jones and Christopher Hall, introduced Culture Point, just one provider that could help make that 80 percent so much higher. In 2020, a study from Stanford showed that automatic speech recognition had a word error rate of 19 percent for “white” speakers, 35 percent for “black” speakers, and “worse” for speakers with a high dialect density. How much worse?
75 percent word error rate in a study done three or four years after the first claim that automatic speech recognition had 94 percent accuracy. But in all my research and all that has been written on this topic, I have not seen the following point addressed:
What Is An Error?
NCRA, many years ago, set out guidelines for what constituted an error. Word error guidelines take up about a page. Grammatical error guidelines take up about a page. What this means is that when you sit down for a steno test, you’re not being graded on your word error rate (WER), you’re being graded on your total errors. We have decades of failed certification tests where a period or comma meant a reporter wasn’t ready for the working world yet. Even where speech recognition is amazing on that WER, I’ve almost never seen appreciable grammar, punctuation, Q&A, or anything that we do to make the transcript readable. It’s so bad that advocates for the deaf, like Meryl Evans, refer to automatic speech recognition as “autocraptions.”
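To make the difference concrete, here is a toy comparison. It does not implement NCRA's grading guidelines. The `word_only_errors` and `strict_errors` helpers and the sample sentences are assumptions, meant only to show how a count that ignores capitalization and punctuation can report zero errors on output that a stricter count would mark up.

```python
import string


def edit_distance(ref_tokens, hyp_tokens):
    """Count substitutions, deletions, and insertions between token lists."""
    prev = list(range(len(hyp_tokens) + 1))
    for i, ref_tok in enumerate(ref_tokens, 1):
        cur = [i]
        for j, hyp_tok in enumerate(hyp_tokens, 1):
            cost = 0 if ref_tok == hyp_tok else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def normalize(text):
    """Lowercase the text and strip ASCII punctuation before splitting."""
    stripped = text.lower().translate(str.maketrans("", "", string.punctuation))
    return stripped.split()


def word_only_errors(reference, hypothesis):
    """Errors when only the words themselves are compared."""
    return edit_distance(normalize(reference), normalize(hypothesis))


def strict_errors(reference, hypothesis):
    """Errors when capitalization and punctuation must also match."""
    return edit_distance(reference.split(), hypothesis.split())


reference = "No, I did not see him."
hypothesis = "no i did not see him"
print("word-only errors:", word_only_errors(reference, hypothesis))
print("strict errors:", strict_errors(reference, hypothesis))
```

On this six-word sentence the word-only count finds nothing wrong, while the strict count flags three of the six tokens.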
Unless the bench, bar, and captioning consumers want word soup to be the standard, the difference in how we describe errors needs to be injected into the discussion. Unless we want to go from a world where one reporter, perhaps paired with a scopist, completes the transcript and is accountable for it, to a world where up to eight transcribers are needed to transcribe a daily, we need to continue to push this as a consumer protection issue. Even where regulations are lacking, this is a serious and systemic issue that could shred access to justice. We have to hit every medium possible and let people know the record — in fact, every record in this country — could be in danger. The data coming out is clear. Anyone selling recording and/or automatic transcription says 90-something percent accuracy. Any time it’s actually studied? Maybe 80 percent accuracy, maybe 25; maybe they hire a real expert transcriber, or maybe they outsource all their transcription to Kenya or Manila. Perception matters; court administrators are making industry-changing decisions based on the lies or ignorance of private sector vendors.
The point is recording equipment sellers are taking a field which has been refined by stenographic court reporters to be a fairly painless process where there are clear guidelines for what happens when something goes wrong, adding lots of extra parts to it, and calling it new. We’ve been comparing our 95 percent total accuracy to their “94 percent” word error rate. In 2016, perhaps there were questions that needed answering. This is April 2021, there’s no contest, and proponents of digital recording and automatic transcription have a moral obligation to look at the facts as they are today and not what they’d like them to be.
If you are a reporter that wants more information or ideas on how to talk about these issues with clients, check out the NCRA Strong Resource Library, and Protect Your Record Project. Even reporters that have never engaged in any kind of public speaking can pick up valuable tips on how to educate the public about why stenographic reporting is necessary. Lawyers, litigants, and everyday people do not have time to go seeking this information; together, we can bring it to them.