Trolls and You

We try to keep political stuff from being published here unless it’s educational, about court reporting, or about the industry. I’ve been pretty good about this. Commentators have been great about it. The occasional guest writer has been amazing with it. This topic touches on politics, but it’s not strictly political, so it should be fun to learn about.

It’s established that the United Kingdom, United States, China, Russia and several other countries view the internet as, more or less, another theater of war. They’ve had operatives and hired people create fake posts, write false comments, and advance the interests and ideas of the government. The prices reported? Eight dollars for a social media post, $100 for ten comments, and $65 for contacting a media source. In the case of China, they’re reportedly working for less than a dollar. If the host country allows it, you have trolls for hire.

So in the context of stenography and the court reporting industry, seems like whenever we get into the news, there are regular comments from regular people, such as “why not just record it?” Typical question. Anyone would ask this question. There are fun comments like “Christopher Day the stenographer looks like he belongs on an episode of Jeopardy.” Then there are comments that go above and beyond that. They make claims like — well, just take a look.

“…I gonna tell you that in modern technology we can record something like court testimony for hundreds of years back very easily…” “…the technology is smarter every single second…” “…if you store data in the digital format we can use an AI to extract the word from the voice in the data, it will be very accurate so much so the stenographer loses their jobs.” Wow! Lose our jobs? I felt that in my heart! Almost like it was designed to hurt a stenographer’s feelings. Right?

We can store the video for hundreds of years? Maybe. But consider that text files, no matter what way you swing it, are ten times smaller than audio files. They can be thousands of times smaller than video files. Take whatever your local court is paying for storage today and multiply that by 8,000. Unless we want a court system that is funded by advertisements a la YouTube, the taxpayer will be forced to cough up much more money than they are today. That’s just storing stuff.

The technology is getting smarter every second? No, not really. Whenever it’s analyzed by anybody who isn’t selling it, it’s actually pretty dumb and has been that way for a while. Take Wade Roush’s May 2020 article in the Scientific American (pg 24). “But accuracy is another matter. In 2016 a team at Microsoft Research announced that it had trained its machine-learning algorithms to transcribe speech from a standard corpus of recordings with record-high 94 percent accuracy. Professional human transcriptionists performed no better than the program in Microsoft’s tests, which led media outlets to celebrate the arrival of ‘parity’ between humans and software in speech recognition.”

“…And four years after that breakthrough, services such as Temi still claim no better than 95 percent — and then only for recordings of clear, unaccented speech.” Roush concludes, in part, “ASR systems may never reach 100 percent accuracy…” So technology isn’t getting smarter every second. It’s not even getting smarter every half decade at this point.

“…we can use an AI to extract the word from the voice in the data…” This technology exists, kind of, but perfecting it would be like perfecting speech recognition. Nobody’s watching 500 hours of video to see if it accurately returns every instance of a word. Ultimately, you’re paying for the computer’s best guess. Sometimes that’ll be pretty good. Sometimes you won’t find the droid you’re looking for.

Conclusion? This person’s probably not in the media transcoding industry, probably doesn’t know what they’re talking about, and is in all likelihood a troll. Were they paid to make that comment? We don’t know. But I think it’s time to realize that marketplaces are ripe for deception and propaganda. So when you see especially mean, hateful, targeted comments, understand that there’s some chance that the person writing the comment doesn’t live in the same country as you and doesn’t actually care about the topic they’re writing about. There’s some chance that person was paid to spread an opinion or an idea. Realizing this gives us power to question what these folks are saying and be agents of truth in these online communities. Always ignoring trolling leads to trolling leading the conversation. So dropping the occasional polite counterview when you see an obvious troll can make a real impact on perception. The positive perception of consumers and the public is what keeps steno in business.

The best part of all this? You can rest easier knowing some of those hateful things you see online about issues you care about are just hired thugs trying to divide us. If a comment is designed to hurt you, you might just be talking to a Russian operative.

Addendum:

I understand readers will be met with the Scientific American paywall. I would open myself up to copyright problems to display the entire article here. If you’d like to speak out against the abject tyranny of paywalls, give me money! I’m kidding.

What Verbit Leadership Needs To Know

I had a lot of fun writing the Verbit investors article. But the more I explore opinions and ideas outside our steno social circles, the more I see that most people totally don’t get stenographers or the work we put in. A lot of us have had sleepless nights trying to get a daily out, time lost for ourselves or our families trying to do the job we signed up for, or some amount of stress from someone involved with the proceeding being unhelpful or antagonistic. It happens, we take it in stride, and we make the job look easy. So it doesn’t surprise me very much when people say “why not just record it?” It doesn’t surprise me that investors threw money into the idea that technology could disrupt the court reporting market. But I can only hope that proponents of digital really take the time to understand and step back from the cliff they’re being pushed towards.

For this exercise, we’re going to be exploring Verbit’s own materials. They recently circulated a graphic that showed the “penetration” of digital into the court reporting market. It shows 5 to 10 percent of the deposition market taken by digital, and 65 to 75 percent of the court market taken by digital. It also notes that only 25 to 35 percent of courts are digitally transcribed. I take this to mean that while they have 75 percent of the “court market,” they only transcribe about 25 percent of it. This is a massive problem. So the technology, when it’s not breaking down in the middle of court (29:20), is ready to record all these proceedings. But you only have the capacity to transcribe about a third of that. So in this magical world where suddenly you have every deposition, EUO, and court proceeding, where are you going to get all of these people? We’re talking about multiplying your current workforce by 28 assuming that every person you hire is as efficient as a stenographer. And the math shows that every stenographer is about as efficient as 2 to 6 transcribers. So we’re really talking about multiplying your current workforce by 56 to 168 times, or just creating larger backlogs than exist today. By not using stenographers, Verbit and digital proponents are setting themselves up for an epic headache.

Of course, this is met with, “well, there’s a stenographer shortage.” But what you have to understand is that we’ve known that for seven years now. All kinds of things have happened since then. You’ve got Project Steno, Open Steno, StenoKey, A to Z, Allison Hall reportedly getting over a dozen school programs going. Then you have lots of people just out there promoting or talking about the field through podcasts, TV, and other news. Showcasing the shortage and stenography has brought renewed interest in this field, and we are on track to solve this. Again, under the current plan, you would need as many as 60,000 transcribers just to fill our gap, and the turnover will probably be high because the plan promotes using a workforce that does not require a lot of training. So if you’re talking about training and retraining 60,000 people again and again over the next decade, I am quite sure you can find 10,000 or so people who want to be stenographic court reporters.

Look, I get it, nobody goes into business without being an optimist. But trying to upend a field with technology that doesn’t exist yet is just a frightening waste of investor money. How come when you sell ASR, it’s 99 percent accurate, but when Stanford studies the ASR from the largest companies in the world, it’s 60 to 80 percent accurate? How come when you sell digital it’s allegedly cheaper and better, but when it’s looked at objectively it’s more expensive and comes with “numerous gaps and missing testimony?” These are the burning questions you are faced with. There’s an objectively easier way of partnering with and hiring stenographers. If you don’t, you’re looking at filling a gap of 10,000 with 60,000 people, or multiplying the current transcription workforce of 50,000 by 56 (2.8 million). In a world of just numbers, this sounds great. Three million jobs? Who wouldn’t want that? But not far into this experiment you’ll find that people don’t grow on trees and the price of the labor will skyrocket unless you offshore all of the work. What happens when attorneys catch onto the fact that everything is being offshored and start challenging transcripts? Does anyone believe that someone in Manila is going to honor subpoenas from New York? Again, epic headache.
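
For anyone who wants to check the workforce arithmetic, here is a minimal sketch. Every figure in it is copied from the estimates in this post (the 28x multiplier, the 2 to 6 transcribers per stenographer, the 10,000 gap, and the 50,000 current transcribers); the variable names are made up for illustration and none of this is measured data.

```python
# All figures are the post's own estimates, not measured data.
WORKFORCE_MULTIPLIER = 28         # growth needed if every hire matched a stenographer
TRANSCRIBERS_PER_STENO = (2, 6)   # one stenographer is about as efficient as 2 to 6 transcribers
STENO_GAP = 10_000                # stenographer shortfall cited in the post
CURRENT_TRANSCRIBERS = 50_000     # current transcription workforce cited in the post

# Scale the multiplier by the low and high efficiency ratios.
low, high = (WORKFORCE_MULTIPLIER * ratio for ratio in TRANSCRIBERS_PER_STENO)

# Worst case: six transcribers for every missing stenographer.
transcribers_for_gap = STENO_GAP * TRANSCRIBERS_PER_STENO[1]

# Even the low multiplier applied to today's transcription workforce.
scaled_workforce = CURRENT_TRANSCRIBERS * low

print(f"Workforce multiplier: {low}x to {high}x")
print(f"Transcribers needed to fill the gap (worst case): {transcribers_for_gap:,}")
print(f"Current workforce at the low multiplier: {scaled_workforce:,}")
```

Change any of the four constants and the headcount moves with it, which is the point: the plan only works if the inputs are far kinder than the ones above.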

So if I could get just one message out to Verbit leadership and all the people begging for us to “just accept technology,” it would be to really re-examine your numbers and your tech. The people under you are going to tell you that a new breakthrough is just around the corner, that things are going well, and that you shouldn’t worry. But you should worry, because you very well might find yourself a pariah in your industry like Peter Molyneux ended up in his. If you’re not familiar, Peter became famous for promising without delivering. One of the most prominent examples of this was E3 2009, where he stood up on stage and introduced Milo. This tech was going to be interactive. It was going to sense what you were doing and respond to it. It turns out it was heavily scripted. The technology did not and still does not exist to do what was being talked about and presented to consumers. Now, anyone with a bit of sense doesn’t listen to Peter.

If the ASR tech worked, why not sell it to us at 10,000 a pop multiplied by the 25,000 stenographers in your graphic and walk away with a cool 250 million dollars? It does what we do, right? So why aren’t we using it? Why aren’t you marketing it to us? It’s got to be a hell of a lot easier to convince 25,000 stenographers than it is to convince 1.3 million lawyers. Sooner or later, Legal Tech News and all the other news people are going to pick up on the fact that what you are selling is hype and hope. So, again, consider a change of direction. Stop propping up STTI, shoot some money over to the organizations that promote stenography, and partner up with steno. You’d be absolutely amazed how short people’s memories are when you’re not advocating for their jobs to be replaced with inferior tech. Take it from somebody who’s done the sleepless nights and endless hours in front of a monitor transcribing, this business isn’t easy. But if you trust stenographers, we’re going to keep making it look easy, and we’re going to make every pro-steno company a lot of money.

What Verbit Investors Need To Know

I had touched pretty gently on Verbit when its series A funding came in at $23 million. The series B funding came in at about $31 million earlier this year. Now Verbit’s announced a strategic partnership with the STTI and professional flip flopper, Jim Cudahy. Migliore & Associates already came out with the hard truth of what this means: ASR doesn’t make the cut for the production of legal transcripts without a qualified court reporter no matter what you name it, NLP, ASR, AI, computer magic, automated transcription.

Do I come off as angry? I am angry. I’m angry that investors are being led down a path of burning capital where there’s just not a bright future. When the series A funding was happening, Verbit used words like automated, “save an enormous amount of manual labor.” “Adaptive speech recognition” with over 99 percent accuracy. Series B is out. They “would not take the human transcriber out,” “the AI will enhance the human.” So investors are fundamentally paying millions of dollars so that they can be another Rev. I doubt very much that that’s what was sold to investors. I don’t think anybody would be putting down millions on that.

Then the partnership with STTI? A complete joke. I have already gone into how, without any doubt, stenographers and NCRA are by far the best equipped to deal with the court reporter shortage. AAERT and the STTI just don’t have the funding, infrastructure, or experience to tackle the problem, and it shows in their data. By their estimates, court reporting companies stand to save $250,000 over the next decade by adopting digital tools. First, I would love to know if this is individual savings or cumulative. We don’t know because there are no sources linked or cited. If this is cumulative, it’s embarrassing that they would even post that. That would mean $25,000 in savings a year across all companies. If that’s the projected individual savings per company, it’s only slightly less embarrassing. That’s less than the average annual salary of a single court reporter. This may come as a shock to Jim Cudahy, but court reporting companies adopted digital tools throughout digital’s birth in the 70s and into the 80s and 90s. Stenographers are already a part of the Information Age, utilize AI, and produce quality records daily. The idea that investors are going to dump $50 million into “technology” expected to save $250,000 over 10 years and expect a return is terrifying. “Most courts are digital,” again, assuming everything they have to say is true, and yet judicial candidates show a preference for stenographic court reporters and returning them to courtrooms. The growth here is in stenographers, stenography jobs, and stenography schools, and Verbit’s current leadership is missing this boat completely.

Let’s just tell it like it is. When a grassroots-funded stenography blog can give you some pretty solid reasons you’re backing the wrong horse, it’s time to give investors nothing less than what they deserve. Open up a Steno Department, throw down some money on us, and we will make sure you’ve got real and steady returns. Verbit, with proper leadership from Tom Livne, can still save the day. Just not with this bait and switch technology-to-transcription model that amounts to little more than a repackaging of old tech. The only other viable alternative I see is buying this blog for a good $8 million and hoping investors don’t see it before then. Not a difficult decision. Come on over to the winning team. Vote for sten!

Buying Hype

Seems like every day now there’s a new article talking about the great advances of AI transcription. Notice in what I just linked, the author is “Wire Contributor,” which to me means that it’s probably a Trint employee. The September 2019 article goes on to link an April 2017 article where the Wire apparently said something they did was unprecedented.

If you’re not looking at dates and glancing over it, it looks like AI transcription is making leaps and bounds. It’s coming. Their app is to be released at the end of 2019! What will we do? I am here to hopefully get everyone thinking critically. Why are these articles always sporting a technology that’s critically acclaimed but not ready to be publicly released? Because it’s a pitch. It’s an effort to get more investors. It’s a bid to get more people to throw money at it.

Not to get too controversial, but I’ve long watched a YouTuber scientist named ThunderF00t (Phil Mason). He’s made many videos to raise consumer awareness on products including inventions like the Free Electric, Solar Roadways, Zero Breeze, Fontus. All of these amazing things have a common theme: They sound cool. The media doesn’t understand the concepts behind them. Their creators make positive claims about them. These inventions have had millions of dollars put into them only for kickstarters and stakeholders to be let down. This is despite walls of positive press from various sites and media forums.

What can we learn? Sellers sell. That’s what they do. When there’s millions of dollars to be made, does the seller really care if the product only meets 90, 80, or 70 percent of the buyer’s needs? Will most buyers spend more time and money holding the seller accountable, or will they eat the loss or attempt to justify the purchase to themselves? That’s why you see claim after claim and never a bad word unless you have colossal levels of fraud, like Theranos. What else can we learn? These things can raise millions of dollars and never hurt a market. Solar Roadways raised over a million dollars and never threatened existing energy companies.

Buying hype can only serve to dampen our morale and make us cede market share. It can only serve to silence us. You don’t have to be a computer scientist to investigate claims about computer science. Let’s start selling facts and raising consumer awareness. If nothing else, remember: If their product worked, you would be using it.

Can Verbit Replace Verbatim?

I had had some thoughts with regard to AI and stenography. I stand by what I said there. Verbit has been, according to online commentators, soliciting people’s business and offering to assist with their workload. There are even some who have said — though I have not seen documentary proof of this — that Veritext is using Verbit or a similar process for their digital reporters. Succinctly, running the audio through a computer program and having a human fix up what the computer spits out. Oddly enough, sounds a lot like what we do when we are taking down every word on a stenotype these days.

The bottom line is these companies are hungry for money. They need revenue to prove to their investors that they are a good investment. Verbit reportedly raised $23 million. Trint reportedly raised at least 150 million euros, or 168 million dollars. That should give you an idea of just how big of an expense it is — in their estimation — to create a program to do what we already do.

When we talk about solving problems, and specifically solving problems with AI and computers, two of the largest jumps in technology are machine learning and modeling the human brain. Modeling the human brain seems an arduous task that is difficult to do on modern hardware. Machine learning is giving the computer training data, and then having the computer make “educated” guesses based on the training data.
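
To make “educated guesses from training data” concrete, here is a toy sketch. It is an assumption-laden illustration, not how Verbit, Trint, or any real product works: it counts which word followed which in three made-up sentences, then guesses the most common follower. The function names and the sample sentences are invented for this example.

```python
from collections import Counter, defaultdict

def train(sentences):
    """Count, for every word, which words followed it in the training data."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for first, second in zip(words, words[1:]):
            follows[first][second] += 1
    return follows

def guess_next(follows, word):
    """Return the word seen most often after `word`, or None if it was never seen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Made-up training sentences, purely for illustration.
training_data = [
    "please state your name for the record",
    "please state your address for the record",
    "let the record reflect the witness nodded",
]
model = train(training_data)

print(guess_next(model, "the"))        # a word it has seen often
print(guess_next(model, "state"))      # another word from the training data
print(guess_next(model, "objection"))  # a word it has never seen
```

The last call is the interesting one. The program has nothing to say about a word that never appeared in its training data, which is why these companies want as much of that data as they can get.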

So why bring this up again? Well, to caution all of us. The simple truth is the more training data that you give these folks, meaning the more audio files they have that show the computer what we do, the more they’ll be able to sculpt the program. If you make the business decision to help them in that way, that’s fine. But you know what? Demand a premium! There are hundreds of millions of dollars involved in developing these computer programs right now. They should probably be paying YOU to transcribe YOUR work, because quite frankly, if they perfect the program, you might be out of business. If they haven’t yet perfected the program, you’re helping them perfect it! Sounds like a premium service to me.

So make sure everybody out there knows: They don’t want your business, they need it, and they should probably be paying you.

A Word on AI and Stenography

I’ve said this before, but it feels like AI is ubiquitous and in everything these days. It spreads a lot of bad press for us stenographers in that people believe we are or will soon be replaceable. We can further extrapolate from the Pygmalion effect that those beliefs impact reality.

As many know, I’m an amateur programmer. I know relatively little about the top-of-the-line tech and can only code on a very basic level. That said, the more I learn conceptually, the more I’m in awe of just how far computers have come, and how far they have to go. You see it every day on your smartphone and in your steno software. Computers are hard at work and designed to do amazing things.

Here is the thing about computers: They only do what you tell them to do. You have to come up with a set of instructions, an algorithm, that gets the computer from point A to point B. They solve problems, but only using the instructions you give them. Even if you come up with the instructions, the results can be useless. We can imagine problems as mathematically solvable and insolvable, finite or infinite. An example of an infinite problem is the Fibonacci sequence. Each number in the sequence is the sum of the two numbers before it. This stretches into infinity. You can easily write a program to generate Fibonacci numbers, and the computer would die before generating them all because there are infinitely many of them.
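
A minimal sketch of such a program, assuming the usual convention that the sequence starts 0, 1 (the function name is just for illustration):

```python
from itertools import islice

def fibonacci():
    """Yield Fibonacci numbers forever: each is the sum of the two before it."""
    previous, current = 0, 1
    while True:
        yield previous
        previous, current = current, previous + current

# The sequence never ends, so we only ever take a finite slice of it.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)
```

The generator itself is a handful of lines and will happily run until the machine gives out. The instructions are easy; finishing the job is impossible.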

Then there are solvable problems. Chess is considered a solvable problem because it is a game with a finite number of pieces, spaces, and moves. There’s a problem, though. There are so many moves in chess that just the datasets for having 7 pieces at the end of the game (Lomonosov tablebases) are said to be 140 terabytes of information. To put that into perspective, it’s been estimated that all the books in the world would fit on about 60 terabytes. Even if you had a supercomputer capable of generating every possible move in Chess, the information would be absolutely useless to you, because to digest all of it would be the equivalent of reading every book ever written thousands of times.

So let’s think of AI and audio in terms of problem solving. The most basic way to describe Alexa and Siri is that they listen to you for keywords, and check what you say against their database, and decide what to do based on that algorithm we talked about. Let’s face it, there are only maybe 200,000 words in the English language. You could store every single one as a large audio file with less than 700 GB. Here is the deal: computers don’t hear in the traditional sense. They’re taking what you say and presenting educated guesses based on all the data they have. So now, if you will, imagine all 200,000 English language words and every combination they could possibly be in. To put it in perspective, it is a way bigger number than this. Now let’s add all the different ways words might be said, or all the different scenarios that might interfere with how the computer is “hearing.” Let’s add all the different accents and dialects of English.
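
A rough sketch of how fast that number grows, using the 200,000-word figure from above. It counts every ordered string of words, grammatical or not, so treat it as an upper bound on the search space rather than a model of how any speech recognizer actually works.

```python
VOCABULARY_SIZE = 200_000  # rough English word count used in this post

def possible_sequences(length: int) -> int:
    """Number of ordered word sequences of the given length, repeats allowed."""
    return VOCABULARY_SIZE ** length

for length in (1, 2, 5, 10):
    count = possible_sequences(length)
    # len(str(count)) - 1 is the power of ten the count has reached.
    print(f"{length:>2}-word sequences: about 10^{len(str(count)) - 1}")
```

That is before a single accent, cough, or slammed door enters the picture.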

Let me say this: It is very likely, in my mind, that someday computers will be programmed to hear as well as stenographers in any given situation. It’s a solvable problem. It’s a winnable game. But right now, based on what I know, there’s an indeterminate amount of time and money that it’ll take to get to a point where it is perfect and 95 percent or better in most or all scenarios in a reasonable amount of time. Take for a moment the example of Solar Roadways. Pave the roads in solar panels to solve America’s energy crisis. Millions of dollars were poured into this solution, and it failed. Remember, solvable problem, winnable game. Finite number of people with finite energy needs. Failed anyway. Speech-to-text is estimated to be worth billions of dollars. But what if it takes 100 more years to solve? How many millions or billions of dollars need to be lost before the solution is declared “good enough?” Remember, they can sell Alexa and Dragon today for piles of money. They don’t need 95 percent. The exponential growth of computers has ended, and unless the experts bring us quantum computing or some other huge leap in technology, we’re looking at computers costing more money to upgrade.

Those companies you see that are touting transcription AI in 2019 are doubtlessly having transcribers fix AI-prepared transcripts at best. Their game is psychological. It’s not cost saving, it’s cost shifting from the worker to the boss. That’s why it’s not being sold to the public. It’s a magic trick. Look to the left while the magician rolls the coin to the right. It is in our best interest as stenographers to call this out when appropriate, and continue to bolster our own magic skills and industry as the go-to for the hearing impaired and legal communities. Could some geniuses come along and program your replacement next year? Sure. But one thing that you should understand is that it’s not very likely, and buying the hype before they have a product to sell is only going to hurt our morale and livelihoods. We have our method. We have a product. We’ve got more brains, voters, and history in the field. So do yourself and all of us a favor, don’t buy the hype, and the next time you meet a transcriber working for Fake AI Transcription Corp, LLC, tell them they can double their earnings and better themselves by joining the stenographic legion. If a supercomputer is required to solve Chess, what do you believe is required to get automatic speech recognition to 95 percent?

May 26, 2019 Edit:
I should add that it’s obvious computers are becoming ruthlessly good at transcribing one speaker, especially in a closed or suitable environment. There are hours of video on that. It’s the introduction of multiple speakers in a less-than-perfect environment where the thing struggles, probably because of all those mathematical issues talked about above.

June 18, 2019 Edit:

A post recently made its rounds on social media claiming a computer science PhD couldn’t see the perfect transcription coming out any time soon. It stands in stark contrast to the claims of some that the technology is already perfect.

August 17, 2019 Edit:

Another article came to light showing that Facebook Messenger and other automatic transcription apps are actually using human transcribers behind the scenes. Using my amateur knowledge of computer coding, I can say this is clear evidence that they need data (the transcriptions) to feed into the machine learning algorithms. Further, if they’re not paying their transcribers exceptionally well and bad data is being inputted, it could ultimately make automatic transcription programs worse. Expect some pretty big delays on the AI transcription front.

August 25, 2019 Edit:

I had created a “mock voice recognition video” just to prove how easy it would be for a company to lie about its voice recognition progress. I coded a computer program that spits back whatever text you give it at a set words per minute. So next time you’re at an automatic transcription demonstration, ask yourself if what you’re seeing is automatic or staged. I often give the example of Project Natal and Peter Molyneux. Gamers were made to believe that the Milo demonstration of Project Natal was a showcase of technology that was coming out. The truth broke years later that the demonstration was heavily scripted, and over ten years later, no such technology exists. Similarly, when someone tells you that their audio transcription program is flawless — question whatever you’re seeing and realize how easy it is to stage and sell things.
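
Here is a minimal sketch of that kind of program. It is not the original code from the video, just an illustration of the idea, and the function names are invented: it takes text that was typed ahead of time and prints it back one word at a time at a chosen words per minute.

```python
import time

def paced_words(text: str, words_per_minute: float):
    """Yield (delay_in_seconds, word) pairs that replay text at a fixed pace."""
    delay = 60.0 / words_per_minute
    for word in text.split():
        yield delay, word

def fake_transcription(text: str, words_per_minute: float = 200, sleep=time.sleep) -> None:
    """Print prepared text word by word so it looks like live recognition."""
    for delay, word in paced_words(text, words_per_minute):
        sleep(delay)  # pause so the words appear at a human speaking pace
        print(word, end=" ", flush=True)
    print()

if __name__ == "__main__":
    # No audio is involved at any point; the "transcript" is simply replayed.
    fake_transcription("This transcript was typed ahead of time.", words_per_minute=240)
```

There is no microphone and no audio anywhere in it. On a screen at a trade show, it would look exactly like a flawless recognizer.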

Steno V Digital (Archive Post)

Consider this a gentle touch on an important topic. There’s been a true memetic shift in the way stenographers are interacting and spreading ideas. Content is honestly popping up faster than we can even really digest it, so let this post serve as a staging point for some of what’s happening this Court Reporting and Captioning Week 2019. This weekend I’ve had the pleasure of reading a flyer from the DRA in California (Photo Archive). Read about Idaho’s need for reporters (Photo Archive). Finally, got to see Cleveland Reporting Partners’ whole take on digital v steno (Photo Archive).

In very brief summary we are seeing many people put into writing what I have opined over Facebook. Yes, technology is amazing. But right now it struggles with certain things. It can transcribe one speaker quite well, but if you throw in some stray sounds or a second speaker, it can have a hard time. This makes the market for captioners and legal reporters a little more promising because we have the skill and training to give them what they need now and train others to do it. Make no mistake, there’s a big market in that, so if a company is having you train a digital, make sure you’re getting at least the next ten years of your annual income upfront.

Technological growth is no longer exponential. Don’t get me wrong, it’s impressive. But until Quantum Computing is cheap and accessible there are probably things we won’t see, like a JARVIS-like AI. We will see imitation AI, that’s for sure, but there is an indeterminate clock on when we will see quantum tech. The running idea and current study that’ll probably lead to true ideas is machine learning. This takes data training sets, like pictures, or recordings, or text — whatever it is programmed to take — and it takes that information and uses it as a basis for its decisions. Sometimes this is entertaining. Sometimes this goes horribly wrong. The bottom line is it is limited by the speed at which it can process its training data and the speed at which it can retrieve that information for later.

I imagine that the training data set for an AI to “do depositions” would look something like recorded depositions paired with their transcripts. There are three big hurdles there: building the training set, processing the training set, and retrieving the right data when it’s time to “do deposition.” In a classic computer we have, in very laypeople terms, little transistors firing on and off to tell the computer what’s going on. Tech is running into a problem where it can’t get these little transistors much smaller, and making bigger processors uses more electricity. For example, I wrote a Fibonacci-generating program. The basic concept is that every number is the sum of the two numbers before it. The computer is happy to make these calculations, but very quickly, the processing power needed to calculate these numbers begins to run dry, and the files we store these numbers in become too large to be opened on a weak laptop. The simplest algorithm in existence busts up a classic computer. This is probably the trouble they have making something that can seamlessly listen to people and transcribe: the computer just doesn’t have the power to process it quickly. Look how long it takes videographers to burn disks or Go To Meeting to process audio. Now imagine adding another layer where the machine is transcribing everything perfectly. In Quantum Computing they’re talking about these very small units being able to calculate everything at once, or large batches of things at once. If they crack that, we’re probably back to exponential technological growth.
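
To give a feel for why those files balloon, here is a rough sketch (not the program mentioned above, and the function name is made up) that reports how many digits a Fibonacci number has at a given position, assuming the sequence starts 0, 1.

```python
def fibonacci_number(n: int) -> int:
    """Return the n-th Fibonacci number, counting from F(0) = 0 and F(1) = 1."""
    previous, current = 0, 1
    for _ in range(n):
        previous, current = current, previous + current
    return previous

for n in (10, 100, 1_000, 10_000):
    digits = len(str(fibonacci_number(n)))
    print(f"F({n}) has {digits} digits")
```

Each number is longer than the last, so a file that stores every number along the way grows much faster than the count of numbers in it.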

In the meantime, fight for your jobs. Fight for market share. It’s not a question of whether we’re outdated. Today the answer to that is no. What matters heavily is perception. Perception can change outcomes. One of the most effective tactics in war has been to get the enemy army to rout, and that’s exactly what digital reporting advocates are trying to get you to do: Give up and go home without a fight. Don’t buy into it, make the technology prove itself. Even the worst stenographer puts in words four or five times faster than the average typist, yet there are still typists.

Keep competing. We are well on our way to winning this thing.