Highlight Reel

The Challenges and Realities of Managing Complex Systems

This script discusses the complexities and challenges in managing multiple systems and open-source technologies like WordPress and Apache web server. It highlights the shift in perspective towards system stability, referencing Facebook’s approach to handling issues. The speaker criticizes the inefficiencies and pressures that arise when technology systems fail, such as the burden on IT professionals and the professional sacrifices they make. It questions the system reliance on often inexperienced personnel and emphasizes the need for fair compensation and recognition for the intense demands of tech support roles.

00:00 Introduction: The Problem with Multiple Systems

00:13 The Rise of Open Source Solutions

00:20 The Impact of Facebook’s Approach

00:42 Challenges with Open Source and Stability

01:30 The Human Cost of System Failures

01:42 Conclusion: Seeking Balance and Compensation

Transcript

[00:00:00] Do you like conversation on a variety of topics? Feel like no one wants to talk about the things that interest you? Tired of only hearing the same political, sports, or catastrophe talk? We feel that way too. Join two high functioning geeks as they discuss just about anything under the sun. We can’t tell you what we’ll be talking about each week because we don’t know where our brains will take us.

It will be an interesting conversation though, so hang on and join us. Here comes the Relentless Geekery.

Alan: Seeing your picture. I see that we’re recording. Let’s see. My camera is turned on. I see you. Okay. It’s bringing things up. Now I see you. There we go. There we go. [00:01:00] Okay. Let’s see. I should probably get a different background than my real background. Do I have

Stephen: a picture of that background and then use it as your virtual background and say, this is my real background.

Alan: This is what it really looks like to be up in Skynet here with all my stacks. Exactly.

Stephen: Yeah, you

Alan: could have done that from the cruise ship. Exactly. You know what? We’re just going to brave it out. Here you are. Okay. Visual effects not available in my browser. For everybody wondering what we’re doing in this uncharacteristic podcast, it’s because Zoom is down and we’re in Google Meetings.

And so it still works and I can see and hear you and stuff like that, but it’s a new environment to wander around So

Stephen: this is actually funny and because for work i’ve got a big project i’m working on now It’s a project that probably should have been done and kept up for years now but the gist of it is We have like different [00:02:00] websites across three different hosting platforms And then we have six different six to ten Different services depending on how you look at it some of which have quote unquote web pages landing pages for that service and stuff so When things break, which they do people get upset and I’m like, okay, we’ll go look at it.

It happens. You can’t run this many systems with this many things going on without having a problem. And that’s right. The basic gist of it is it’s we’re tired of having all these problems. It needs to be stable. And I’m like, if it needs to be stable, then we should not be using 20 different systems in 12 different places.

We should not be making updates. We should not be using WordPress with plugins at all because, we need to know if that’s going to get updated and break something. Great. Hold on. Oh there you [00:03:00] go. It says, try again next. And they don’t understand that, we have this event plugin And we paid 70 bucks for a year for it and they just upgraded it.

Broke things part of it was Just things interacted, but part of it was the plugin developer Like wanted to rush it out so they didn’t put all the same functionality in so functionality we had In version three no longer existed in four, but they made some other improvements and changes, but it broke other things They rushed it out too quick They actually had to get into our website to put some code in to fix some things before the new version came out and i’m getting like, this can’t break this can’t do this.

I’m like, okay You next time I can create a developer staging, section, go test it. And then I said, but every time we update a plugin, I will have to do that. So you need to count on two hours of my time for every plugin update. We have 10 websites, [00:04:00] each with 15 plugins. You do the math. What do you want me to do?

Some of these update every week. And then they said, we’ll call them. I’m like, call who? Call WordPress? It doesn’t work that way. WordPress is an open source thing. Why are we using it? Cause everybody does. And it’s really good. We can’t have this. Okay. This other service costs about 25,000 a year.

That’s too much. Yeah. You can’t have both. And then they’re like, we’ll call the plugin developer. They don’t have a support. Why not? We need to call them. They’re like two guys in their basement. They don’t have the staff to answer the phone. If they charge 500 for the plugin, we’d get support.

They charge 70 bucks, this is on, we can’t do this. We need, okay, then we got to pay for it. We can’t, and I’m in the middle and they think, I don’t know what I’m doing because things break and I’m like you know what, you just updated things and with it just on the live site.

Again, I can take an extra two hours, [00:05:00] but that means this other thing you’re rushing to get done, can’t get done. What do you want? We need to call somebody.
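Stephen's "you do the math" can be worked through directly. The sketch below uses only the figures he gives (10 websites, 15 plugins each, two hours of staging and testing per plugin update); the 40-hour work week used for the conversion is an assumption added for illustration.

```python
# Figures quoted in the conversation.
WEBSITES = 10
PLUGINS_PER_SITE = 15
HOURS_PER_STAGED_UPDATE = 2  # staging plus testing for one plugin update

# Assumption for illustration only: a 40-hour work week.
HOURS_PER_WORK_WEEK = 40


def staging_hours(updates_per_plugin: int = 1) -> int:
    """Hours needed to stage and test every plugin on every site."""
    return WEBSITES * PLUGINS_PER_SITE * HOURS_PER_STAGED_UPDATE * updates_per_plugin


if __name__ == "__main__":
    total = staging_hours()
    print(f"One update round across all sites: {total} hours")
    print(f"Roughly {total / HOURS_PER_WORK_WEEK:.1f} work weeks")
```

Even one update per plugin comes to 300 hours, which is why testing every update in staging competes directly with the other work being rushed.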

Alan: Wow. I’ll tell you, you’ve just encapsulated the incredible shift in development over the course of our careers. At least, wow.

It used to be that releasing a new version was an incredibly huge thing. Yeah. Was. All that, development went to test. You tested the bejesus out of it. There was an entire, an incredible science about, do you have all the test cases that will exercise every bit of your code? Do you have, if you’re dependent on anything, like you’re saying, any kind of plugin or subroutine, you have to make sure that subroutine is in every part of your installation everywhere.

And so if you’ve got, mainframe days, mobile locations in the windows days, do you have, do you have multiple versions of windows? Testing was an enormous thing, and a big thing was you had to make sure that you had a place to retreat to. Facebook, in my opinion, was the progenitor.

It changed [00:06:00] everything. They did all kinds of changes on the fly, and when things broke, people just said it’s free, and I guess I’ll wait. And then as people got more and more addicted, they lost the desire to wait. Not just Facebook; Google started doing it too. They actually had multiple versions out in the field where they did, like, A/B testing and said what works better, what makes people happier, and they collected the kind of stuff you used to collect in test on the fly in production.

And then when you had instabilities: it used to be a cardinal rule, no data loss, especially not user data, that you always had that place to retreat to and you backed up everything. So for instance, when I went out to California and did my dot-com things for a couple of years, we did an overnight release of various different things.

And we planned it for holidays and times when there wouldn’t be a lot of users on the system, so you didn’t tax it while you were doing things. Already they had the idea of server farms where you’d roll it out slowly, test it to see whether [00:07:00] all the testing you have done is going to survive the real world, because the real world has, like, lag and different communications environments.

And if you didn’t attract that, you can keep track of what we have running here. As much as we try to make sure the functionality is good, the practical functionality of how quick is it to respond. And country to country, is there going to be the translation tables to help you with languages, all that kind of stuff we had to contend with, we used to write huge scripts to make sure that everybody knew what they were going to do, and it was a synchronized, enormous, it was storming the beach at Normandy in terms of who’s going to work out what and when, the database guy had to help.

All the things he had programs that had to run to convert a database from one version to the next. And he had to have the ability to retreat from the new version if things were broken. So you could bring the old version back up and the old version of the database tables because it was all synced and that kind of thing.

So the fact that you have people nowadays who are like [00:08:00] it’s broken, must be fixed. And that all the things you just said, there is no accountability. There’s things that as soon as the Apache web server came out, And it was, it runs two thirds of the internet, but there’s no guy to call. It’s all open source and it’s really stable because so many people have worked on it, tested it.

It really has that level of rigor, but it’s like the quality between Wikipedia and Encyclopedia Britannica. There’s nobody to vouch for. There’s nobody to like, say, I did the research and you can tell me what I got wrong. It just, you count on the massive humanity to discover those bugs and the pride of.

All the geeks that love working on this kind of stuff to fix things as they come out. So like Linux is like that, anybody who’s running on a Linux platform, you started off with Linus Torvalds, but quickly people took over various different parts of that. And I guess it’s for anybody who hasn’t been through the generations of that, that didn’t go through development tests and production and the clear differences between them they don’t [00:09:00] have not only an appreciation, they have no patience for it.

And yeah.

Stephen: And things have changed with contact, because it used to be, here’s your tech support number. Even if you had to wait on hold with Muzak, you had tech support. But a lot of them don’t, and Microsoft, you give them the info and they will call you back or get back with you. If it’s not a high priority item, they’ll say it’s up to 24 hours before we get back to you.

There’s not just a number to call. I absolutely love when Exchange, our email, is not working. That’s a huge deal. That’s our main thing. Everybody communicates that way. We’re all remote, Exchange is not working. It’s down. There was a big announcement on the Microsoft Exchange status website.

Everything’s down. We’re working on it. Okay.

Alan: Estimate as to when it might be done. Sometimes it’s like we, we are still figuring out what’s wrong. We’ll give you an

Stephen: update in an hour. That’s you know, what you usually see, and I’ve been in that situation. If. If I’m [00:10:00] work, I had a server go down and I had to work with Tennessee, the main headquarters.

And my phone was ringing off the hook because everybody in the office is calling me. It’s, hey, this is down. When’s it gonna be up? This is down. When’s it gonna be up? Hey, can you, my boss is knocking on the door and I’m like, folks, if I’m answering all your emails to tell you, yeah, I know it’s down.

I’m not fixing it. So I’m told call Microsoft and tell them we need this back up. I don’t even know how to respond to the, that, that statement because they are working on it, they know it needs up. They’re probably one of the most stable environments around. They have a horde of very smart engineers on that right now.

And the thing is, we’re not the only people affected right now. It’s thousands of companies. So I’m not calling Bill on the phone and he’s not going, oh my God, you guys are down. We’ll fix it immediately. If I do get ahold of somebody it’s the 23 year old, [00:11:00] scared as hell intern that’s, oh my God, things are down.

Everybody’s calling me. Yes. I know. It’s a, he doesn’t know anything. What? No more than I do. He has the status page up. He’s hitting refresh like crazy because he’s Oh, please get back up so I can tell these people to lay off. I can call them all you want. It won’t fix us sooner.

They’re not going to say, okay, here’s the 500,000 companies that are down. Move these guys to the top because they called, but I say it won’t help nothing. No, you need to do it immediately. Okay. You’re paying me to sit here for nothing. Oh, look at that. I got some little tech guy that told me they know it’s down and it’ll be up as soon as possible.

Good little tech guy. Thank you. That was 45 minutes.

Alan: This is only counting on it being software, when they have a hardware out, when actually like a server crashes, an entire server farm a main pipe of the internet gets cut. Then like immediately, not everywhere that people are knowing about it, but then it’s how soon, get the [00:12:00] programmer on it.

This is actually going to be digging ditches to get to the cable that’s been cut because they didn’t call JULIE, or whatever the state’s term is for that check-this-before-you-do-it thing. It’s amazing the level of how unawareness also goes with no appreciation for what it really takes to keep this whole thing going.

There’s famous lines of if the whole, if all like architecture was built like Microsoft Windows is one woodpecker could destroy civilization because it looks stable, but it’s really fragile in certain ways. I used to go to tech conferences where and I, because I handle databases at Progressive and it was good databases for a large part of my career.

But then you start talking about how do you really test those kinds of things? If you’ve got stored procedures, if you’ve got multiple databases that are, that you’re tapping into and stuff like that, it was they had a whole session at one point about fuzzing where it isn’t even trying to anticipate what the code should do, or even what a user [00:13:00] might do, they just did all kinds of random stuff to look at.

attack surface of a program, not even if you’re like being an intelligent hacker trying to get in, but just users really will. Spill something into their keyboard. The cat will walk across it. It’ll have a sunspot activity that will cause some kind of random text stream. And once in a while that random text stream looks like an intrusion attack.

You know what I mean? It looks like they injected code. And they tried to go to a part of memory that you’ll send it to execute even though it shouldn’t have been. So they just talked about how much stuff they’ve discovered. By random things happening, but they also then said that’s oddly like a real user case that we’re working on where a user Like you can’t idiot proof it enough to not have the most cunning idiot And that’s really an incredible term because they’re just a normal person trying to get their job done but if if you say choose one two or three and then they go and hit um, slash at the other corner of the keyboard Like you can’t do that You [00:14:00] can’t even anticipate what things people might try or that they might have six programs operating and that the memory thing that goes wrong on their personal machine can do things to an online environment, so I, the tales that I would come away from to share back at Progressive where you, we really think we’re on top of this.

You can’t believe what Microsoft is having to do for SQL Server or Oracle is having to do with their database, and especially, Oracle had expanded from databases to have various different modules: here’s the accounting module, here’s the HR module and stuff. And just the tales of terror that they had about how they thought they had bulletproofed this and then they went into the real world and just had things go,

Oh my God, who would do that?
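The fuzzing idea Alan describes, throwing random input at a program instead of anticipating what users will type, can be sketched in a few lines. This is a minimal illustration rather than any tool mentioned at those conferences: `parse_menu_choice` and `fuzz` are hypothetical names, and the "choose one, two or three" menu is borrowed from Alan's example.

```python
import random
import string
from typing import Callable, List, Optional


def parse_menu_choice(raw: str) -> Optional[int]:
    """Return 1, 2, or 3 for a valid choice and None for anything else."""
    text = raw.strip()
    if text in {"1", "2", "3"}:
        return int(text)
    return None


def fuzz(parser: Callable[[str], object], runs: int = 10_000, seed: int = 0) -> List[str]:
    """Feed random printable strings to parser and collect the inputs that break it."""
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    failures: List[str] = []
    for _ in range(runs):
        length = rng.randint(0, 20)
        raw = "".join(rng.choice(string.printable) for _ in range(length))
        try:
            result = parser(raw)
        except Exception:
            failures.append(raw)  # the parser crashed on this input
            continue
        if result not in (None, 1, 2, 3):
            failures.append(raw)  # the parser accepted something off the menu
    return failures


if __name__ == "__main__":
    print("defensive parser failures:", len(fuzz(parse_menu_choice)))
    print("bare int() failures:", len(fuzz(int)))
```

Passing the built-in `int` as the parser stands in for code that trusts its input: it raises on a stray slash or an empty line, which is the cat-on-the-keyboard case.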

Stephen: Honestly, it’s way more stable than they were 10 years ago, 20 years ago, 30 years ago. They’re more robust, more stable, more easy to use. And the video games are a great example of how things have changed. Look at video games. [00:15:00] They spend five years on some of these top AAA games and the day they come out.

They have a six gigabyte download for updates and fixes because in the old days, when it went gold and went to production, that was it. You were done. It was on CD. It was on cartridge. You didn’t get to do any more, but the games were smaller. Now these things are, look at Skyrim, hundreds of hours and MDs, it’s huge.

And, I can see the big boss phone. I don’t want there to be any problems in these 300 gazillion lines of code. Find them all. It’s still humanity, man. I had a good one. I was scrambling. All I got was a. The website’s down and all the links are broken. Okay. If the website’s down, the links will be broken.

So maybe that’s the same thing. Which I, I look, I’m like, which website we, again, we have 20 I’m looking, I don’t see which one they’re broken. It needs fixed immediately. Okay. That’s not real helpful. Looking. I spend [00:16:00] our, can you tell me, give me the actual link you’re looking at.

Can you give me an error message? Can you tell me what you were doing? Et cetera. No answer. Call. No answer. Text. No answer. And I know they’re irritated and mad, but techs have to ask these questions and narrow it down. You would not take your car to the shop, drop it off and say, it’s broke, fix it.

You say, It’s making this noise. It’s not turning correctly. It’s blowing cold air or it’s knocking, you tell them Here’s what the problem is. You don’t just say it’s broken and walk away But that’s what I get so often and then why come you haven’t fixed it yet as soon as I know what the actual problem is, i’ll find it out I’ll let you know and how many times we’ve said that How long is it going to take to fix it?

It’s only going to take about five minutes to fix it But there’s probably five or six hours of investigation to figure out what that one thing to do is You know, how do you tell people that [00:17:00] and it’s not i’m just not doing the scotty thing and multiplying it by four This is reality if I don’t know what it is.

Here’s the 20 things. It could be i’ll take the top three It’s probably one of those if not I gotta go through the other 17!

Alan: Honestly I even had I think I’ve even given a talk in the Mets about that. That, the act of debugging something is Like that also has changed over the course of time. That used to be on the mainframe.

So you’d look for where there’s bad data. An S0C7 error was where you’ve got numeric data in a character field or the other way around. You know what I mean? Where it really expects certain things, and you can’t add “B”, you can add numbers. So that idea of how do you approach that?

How do you narrow the possible problem space and do binary searches so that you cut out we know it can’t be this, we did that test and walking them through how Logical and rigorous and patient. You have to be in order to get to now. We know what the problem is. Now we can fix it, whereas they think it’s going to be like a big blinking red light and you’ll walk over to that part of the machine.

It’s [00:18:00] about to blow up and cool that down. And you only mean just how I had movies. Some of the odd things. I don’t know. I’m sure you had these. We had, I really was went through Y2K. I really went through all the things like if we roll this number over. And we don’t have the ability to have the year 2000 and thereafter, you need a four digit field to show that, or it’s going to think that year 00 is less than year 99, and all the things that you’ve based on how years always go up in size, there is no year 100.

It’s going to be year 0, and it’s going to break things. And trying to explain that often showed. The terrible thing to say, people don’t really get not only computers, but they don’t get their business. If you were dealing with a financial application, you had to say that number can never be negative.

I went through your data in preparation for trying to [00:19:00] work on this, and there’s all kinds of negative values where you told me it always has to be a positive thing. I’m not sure how it happened, but they’re in there. And if you’re not going to do the data cleansing beforehand, I can guarantee you that the first time we try to run this, you’re going to get results you’ve never seen before because you’ve got.

Bad data in certain fields or data that you said can’t possibly exist and yet it does. And so just, being This is an odd thing by taking that approach of Saying i’m going to do this and i’m going to go for what is the most common Possibilities or the ones that if they could cause the most damage i’ll go on them first whatever else might be Taking a calm approach Angered people like you can’t believe they went ballistic of no fix it right away Like you could hear their You can see the steam coming out of the ears.

It’s there’s no way we’re going to get to a solution without knowing that once we get to it, we looked at a whole bunch of stuff. And in fact, how many times have you been on a bug hunt and you found and fixed one, but you uncovered three more that have to be taken care of. And then he’s how do we prioritize [00:20:00] these?

These are bombs waiting to go off in your code. And. All of those explanations, honestly, by being the calm guy in the room, it didn’t seem to calm them down. I would have thought that it would have a reassuring, I know what I’m doing. This is the way to solve it.
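The Y2K rollover Alan mentions in this turn is easy to reproduce. The sketch below shows a two-digit year comparison going wrong and one way of mapping two-digit years onto four digits. The windowing function and its pivot of 50 are assumptions for illustration; Alan only says that a four-digit field was needed.

```python
def is_later_two_digit(year_a: int, year_b: int) -> bool:
    """Naive comparison on two-digit years, the way legacy records stored them."""
    return year_a > year_b


def expand_year(two_digit: int, pivot: int = 50) -> int:
    """Map a two-digit year to four digits.

    Assumption for illustration: values at or above the pivot belong to
    the 1900s and values below it to the 2000s.
    """
    if not 0 <= two_digit <= 99:
        raise ValueError("expected a two-digit year")
    return (1900 if two_digit >= pivot else 2000) + two_digit


if __name__ == "__main__":
    # Year "00" (2000) compared with year "99" (1999).
    print("two-digit says 00 is later than 99:", is_later_two_digit(0, 99))
    print("four-digit says 2000 is later than 1999:", expand_year(0) > expand_year(99))
```

Any logic that assumes years only ever increase, such as sorting, ageing, or interest calculations, inherits the same wrong answer from the two-digit comparison.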

Stephen: idiot and not know what you’re doing. No. Yeah. I don’t understand that either. That’s what I get all the times. Okay. It’s been 10 minutes. You haven’t fixed this yet. You need to call somebody else. What do you want me to call and explain our customized system? That’s going to get it fixed any sooner.

Okay, I’ll get right on that. Hang up, fix the problem. All right, me and Mr. X fixed it. It’s all good now. If you trust me to be the IT guy, it’s not just now. This has been over 30, 35 years of doing this. If you trust me to be the IT guy, trust me to be the IT guy. And these things can’t always be fixed immediately.

If a Boeing 747 is having aileron problems, they don’t [00:21:00] just say, we’ll go fix it so we can get them back on the air. They ground them and say, We’ll let you know, cause we want it to be safe and good. If you don’t want your system to keep crashing, you need the time to spend on it, but when it’s, got to do more, got to do next, next more, change it, change.

I, oh my God, I get that all the time. It’s when I do web stuff, dynamic web pages before WordPress, I would say, look, if it takes the designer an hour to create this page, it takes me four or five hours to make it actually work and be dynamic. But you won’t see a single change. It’ll look exactly the same.

And they would be like it’s already done, isn’t it? No, it’s not. How come it’s taking so much longer? It looks done. It’s a painting that you’re looking at exactly,

Alan: Another thing we used to have, it’s like the that describing how it takes to make I can get something as a demo up and running really quickly.

And that’s not 90 percent of the way there. That’s 10 percent of the way there. And if that work and [00:22:00] bulletproof and account, like the minute we had things, okay, we’re going to start selling in Canada. Oh boy, you’ve got provinces now instead of States. And so you have to be able to account for, and they have different ways that they do addressing, but it’s not.

This core thing of your system is going to break dramatically. We need to get ahead of that and work on it. Oh no, we, we did all the new features that they wanted were, and I don’t know, even currency transaction, all that kind of stuff, all the things to keep something stable were always put behind compared to new features, new whiz bang, because they always want to be able to say new in this version, et cetera, et cetera.

And I always. The maintenance programmers would beg for just please give us this ability to stabilize things before we have to add new features because the act of new features when here’s some other terms that people might have heard when you do regression testing, when you do testing that just checks to see what you’ve already gotten to see that it maintains its functionality, that what you did new doesn’t break anything like you were [00:23:00] just talking about that they added this new they updated this plugin and it added new features, but it broke things that were at work.

Okay. And they didn’t even have a deprecation report that says we’re taking these features out. I don’t even know that they did their testing to say everything we have already is guaranteed to keep working. No, they don’t make that guarantee at all anymore. So we, boy that when I used to talk through test beds with various different, Real quick.

I did work for Ameritech and they had five different states in the Ameritech region, and they were various different degrees of tech savviness and even of having tech besides somebody in a room with a binder that’s keeping track of assets. And like we, we just had to go through so many times we’re heading towards this, but there’s going to be stutter steps along the way because not everybody is equally, not only technically proficient, but like in their mind, they that there’s going to be a difference with.

An asset that you know exactly what it costs versus something that what you did was you bought [00:24:00] 10,000 of those and you took the total cost of asset and various other things. And then you just divided the total of all that spending by 10,000 to get to an average cost. But then when you have to replace it, there’s no guarantee that the average cost is going to be what you replace it with, because there’s overhead in terms of we had to warehouse it, and then that inventory had a cost, and I had to explain again and again, the differences between those various different things, and it wasn’t like, I’m not trying to do this, I’m not trying to confuse. That’s the way big companies have had to do things that when they wanted to just be able, historically, to keep I don’t need one, I need a hundred more of those assets.

About what is the cost of that so I can put it in my budget? And then there was always a true up that came between. The average, and honestly, sometimes it wasn’t an average cost. It was a trumped up cost. They just made something up so they could get it in the budget. And then they had to go on and find out, Oh, no, we [00:25:00] underestimate how much that really costs.

What do we put in for a budget overrun? Oh, no, you can’t do that. That makes us look like you were fools by not getting a good estimate before you said this. So how far are you going to get behind in truing this up before it’s just. It’s just fakery. This is wow. I had such conversations with that’ll cost me my job.

What you’re going to do to explain it to other people, but oh how far it had gotten in some cases, it was like 10, 12 years of it getting increasingly spiraling out of control. It’s did nobody ever something that says that’s more, more non realistic. Who’s going to fix this? Kick the can down the road.

Somebody else that might, whoever’s in my job after me, we’ll have to find out how much this is a house of cards. It is not based on reality.

Stephen: I know that’s a story, America, but

Computer running COBOL and nobody understands it. Just reboot it. It always works. [00:26:00] So my buddy Casey has an idea for an app that he wants to work on. So he’s, I got the idea, so you make it happen, and I’m like, that comes from everybody. Ideas are nothing. And so I’m like, okay.

We gotta do this. I’m like, okay, so I can work on it, but we really should go talk to a lawyer because the functionality of this app We got to make sure we’re doing something that’s legal. Because it is It’s involving the newer marijuana personal industry. Nothing bad and illegal that, but I said, we need to make that for sure.

We need to make sure that what we are doing doesn’t cross some line or step on something wrong.

Alan: Enabling could be aiding and abetting and illegal. Exactly. Yeah. And

Stephen: not only that, we’re going to need a privacy policy. We’re going to need cause it’s got authentication. We got to have security in there.

What if the security gives any, he’s that sounds like a whole lot of work. [00:27:00] Yeah, that’s why your idea sucks when you say I should get 20 percent or something like that. Nobody cares about your stupid idea. It’s the guy that can make it happen that it matters.

Alan: That’s, boy, we that’ll, i, I agree with what you’re saying. They, that the due diligence, all the things you have to put in to make sure that it’s like an expert system, there’s a human being embedded in there that if you were going to go to a counter and all the questions that they would ask you, they not only are asking you those questions and jotting down your answers, they’ve got things that are like that can’t be right.

That’s not true or real or fully the answer. And so you start putting in all those things. Another cardinal rule for us was Like you capture bad data as early in the process as possible, right? At the point of data entry of where you can, because if it gets into the system and starts running through overnight transaction processing or something like that, you get multiplier effects.

You get things that get embedded in the system and then. Taking them out and undoing transactions that shouldn’t have ever happened [00:28:00] was really an expensive thing to do. And not only, if you will, in terms of real computer time and cost of that, but just human time to figure all this kind of stuff out.

And yet, and if people already had a, an error cycle where something went wrong and it spit out an error code, and then for the next day you fixed that, It, the people were very unrealistic if they had any kind of routine, just go for throughput, go for as many transactions as you possibly can. And don’t worry, the few, the 1 percent errors we’ll fix those and run it through again.

And then you start showing them stats. It’s not 1%, it’s 2 and then 4 and then 8, because the kinds of things that we’re allowing them to do in order to get people to type things up real quickly, data entry, and then move on you have to, Except that there’s going to be less data entry efficiency if you’re not going to cause problems just down the line.

It’s you can’t put pollution into a river and not worry about what are the people downstream going to do with this. And the act of getting pollution out of a [00:29:00] river is almost always harder than stopping the pipe that’s putting the crap out in the first so I analogies sometimes work and sometimes we’re not a river.

Oh god, I lost you.
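Alan's rule of catching bad data at the point of entry, before it reaches overnight processing, might look something like the sketch below. The `Payment` record, its fields, and the two checks are hypothetical, chosen to echo his earlier example of a value that was supposed to never be negative.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Payment:
    account: str
    amount_cents: int


def validate(payment: Payment) -> List[str]:
    """Return the problems found in one record; an empty list means it is clean."""
    problems = []
    if not payment.account.strip():
        problems.append("account is required")
    if payment.amount_cents < 0:
        problems.append("amount must not be negative")
    return problems


def accept(batch: List[Payment]) -> Tuple[List[Payment], List[Tuple[Payment, List[str]]]]:
    """Split a batch at data entry so only clean records flow downstream."""
    accepted, rejected = [], []
    for payment in batch:
        problems = validate(payment)
        if problems:
            rejected.append((payment, problems))
        else:
            accepted.append(payment)
    return accepted, rejected


if __name__ == "__main__":
    batch = [Payment("A-100", 2500), Payment("", 900), Payment("A-101", -50)]
    good, bad = accept(batch)
    print(len(good), "accepted,", len(bad), "sent back to data entry")
```

Rejecting at entry slows the typist down a little, which is the trade Alan describes: less data entry throughput in exchange for not having to undo transactions later.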

Stephen: I just had that same thing where it was a fire drill because there was something put online that was really wrong. It’s oh my gosh, that’s should never be there and everyone’s coming on me and I’m like who gave you this? Where did it come from? Why does it say this?

Hold on. I get past Dozens of these a day from multiple people through email through online task list through slack through texting I get multiple things go update this go put this here I did not sit and study it and read it. I did not question it. I was given something. I copied it. I pasted it. I moved on to the next one because everything is exclamation red.

Don’t do this immediately needs done. Why isn’t it done? So under a first priority. Yeah. So if everything you get is a [00:30:00] high priority that has to be done ASAP, that doesn’t leave time to double check it. Besides. I’m the last guy in the line shouldn’t everybody else have double checked it before me So i’m getting like before it gets to you

Alan: exactly So

Stephen: i’m like getting you know Blasted here.

It’s hold on a second. Where did it come from track it down? Oh, it came from this person that person was on the meeting and they were keeping their mouth shut and they’re like, oh Yeah, I think I sent that over and I’m like, you were gonna throw me under the bus and that is where did you get it?

Because that’s wrong. You shouldn’t have sent it out. Did you make it? No, I got it sent from the client. And it’s like, everybody got it, nobody double checked. It wasn’t my fault necessarily. But when you’re in that mode, like you said, where it’s a rush and it just needs done immediately.

We need the numbers, we need to get these done. It takes longer later. It took much longer to fix that problem than it would have taken initially. But it would have gone up live [00:31:00] a day later. It was broken for a day and then took a couple hours to fix. How is that any worse?

Alan: I’ll tell you So I worked at Progressive, and one of the most impressive things about them was, when they had a system out, they really had a great process for, so what happened, and it wasn’t about who did it, Bill or Bob or Tom, it was let’s make sure we understand fully how it happened and then put things in place.

So we’ll never let that happen again. It’s absolutely worth it. Downtime was a big thing when you’re an online insurance company, and every day they did stats and said every minute we’re down costs us $18,000, and maybe that’s not the right number to remember. But it was really like that. People would go back in their chair and say, that’s a lot. If you’re thinking that you have a hundred million, trans users across the United States, et cetera, et cetera. So when we used to do that, it’s often called risk management, but it has all kinds of tentacles into other things.

It’s, what’s the possibility of this thing breaking, and what’s the cost if it [00:32:00] does? And you multiply those together and then say, how much am I willing to pay to stop this bad thing from happening? Real corner cases that only happen one in a million, you don’t really need to worry about those as much as the thing that is breaking every single day and causing pain, et cetera.
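Editor's sketch: the "multiply those together" step Alan describes can be written out in a few lines of Python. This is only an illustration under stated assumptions. The `Risk` class, the `rank_by_expected_loss` helper, and every number below are hypothetical and do not come from Progressive or from the conversation.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    events_per_year: float  # how often we expect the failure (assumed figure)
    cost_per_event: float   # dollars lost each time it happens (assumed figure)

    @property
    def expected_annual_loss(self) -> float:
        # Likelihood times cost: the "multiply those together" step.
        return self.events_per_year * self.cost_per_event


def rank_by_expected_loss(risks):
    """Return risks ordered from most to least costly per year."""
    return sorted(risks, key=lambda r: r.expected_annual_loss, reverse=True)


# Hypothetical examples: a rare but expensive failure and a cheap daily one.
risks = [
    Risk("rare corner case", events_per_year=0.001, cost_per_event=50_000),
    Risk("breaks every single day", events_per_year=365, cost_per_event=2_000),
]

for risk in rank_by_expected_loss(risks):
    print(f"{risk.name}: about ${risk.expected_annual_loss:,.0f} per year")
```

The expected annual loss is also a rough ceiling on what it is worth paying per year to prevent each problem, which is the "how much am I willing to pay" question.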

So there was a big prioritization of what had to be fixed first. And there’s always, like, when people think, well, it’s stable and you just wait for bugs, right? There’s already a list of 120 different bugs that we’re actively working on, and sometimes you can get them into the next feature release. Sometimes you just put them out and you don’t tell anybody that there’s a new release, because it’s not based on features, it’s based on getting to stability. And how many times people were like, if you’re going to do that, can I just get this one little new thing in there?

It’s, but you’ve just expanded all the testing and all the guarantee that we have to do to make sure. I know you know what you’re doing, but you don’t know how tangled all the things we have going on here are. And tangled is not a bad word when [00:33:00] the system is 20 years old. And cruft is the term that I have often heard that describes it. There’s all kinds of people that have worked on this code over the course of time, and they didn’t have the same coding standards or naming standards.

They weren’t all equally competent. You really have to make sure that, in fact, take a step back from Progressive. One of the things I did when I worked at Peat Marwick in their catalyst group: we had big programs that did technology transition, getting you off assembler code, where increasingly there were no people that understood it well enough to work on it.

And so when this breaks Morton salt, Morton cycle, it’s going to break where it might not be fixable unless you bring somebody in from Denmark that still understands this. And we took old spaghetti COBOL code, which was every big bank, big insurance company, big utility. Everything was COBOL back then.

That was the business language. How many versions based on the law changes every year, or there’s a new competitive pressure from somebody else? And so taking that spaghetti [00:34:00] COBOL and making it structured COBOL, where, the fiery bus crash scenario: if this guy dies and all of what he knows goes with him, you’re not going to be able to get things fixed. Maybe you can get them fixed, but it’s going to take five times as long because nobody knows this system.

It’s the guy that works at Campbell’s Soup that knows what dial to turn when the soup tastes funny. He’s the one that knows how it all hangs together. And the business case for making use of Peat Marwick’s products was: this guy who costs you a hundred thousand dollars a year is the only guy that knows how to do this.

You have the systemic risk of, this code is fragile and he’s the only guy that knows it. And it’s also, wouldn’t you rather have code that a 30k person could maintain, and let this $100,000 guy go and work on the hard projects that are new features? Or just prepare yourself, because you can’t be addicted. Not only is it the fiery bus crash, this guy might say, I’m going to need to make $200,000 a year because you can’t live without me.

And it didn’t [00:35:00] happen that often, but I was in enough situations where places were being held for ransom. You know what I mean? That people knew their power. They knew they were the only one that understood how this worked, and they weren’t jerks about it, but they sure started to command a premium for, nobody else knows this and you really need me.

And once in a while, people also left because they went crazy. They were under such pressure to make sure this stuff worked, and worked all the time. They tired of being called at two in the morning. They wrecked their lives, their marriages, their health over it. I can’t be under this sustained pressure. It’ll kill you. You don’t want to be on whatever cortisol is flooding through your system.

You just get jerky and you fly off the handle in anger over nothing, because that’s been bottled up and you’re trying to be a software professional for so long. Hey, everybody, it’s very cool that 14 year olds can now write video games and make apps and viruses and stuff like that.

But there’s something to be [00:36:00] said for people who have been through a lot of these wars and have an appreciation for some of those things really were worth doing. And when you get to what you’re talking about, like someone that yells at you, hey, call somebody, get this fixed. You have to be able to explain a little bit of that’s just not possible or it’s not practical.

Yeah. It’s not practicable. You know what I mean? You have to be the calm guy, even if what they want to think is, if I lash at you more, you’ll work harder. You know what I mean? Sorry you’re going through this, because this is going into fourth quarter, going into, like, the holidays, when sales things gear up. And there’s different seasons of the year where things are easy, medium, hard, and people get, what I want to be is on vacation and yet this just broke.

So they called me back from being at my Thanksgiving dinner with my family. You know what I mean? The guy’s been there, done that. Good Lord, I had any number, Colleen too, of holidays wrecked [00:37:00] because when you’re the person that really knows what’s going on, they call you in with no compunction. And is there anything more galling than, okay?

They called you in and then they go and leave and have their Thanksgiving because now you’re there. It’s like, man, you might have made a mistake here, because long term I will never do this for you again. You know what? I do have that power. If you’re the smart guy that knows how it works, you’re not a slave.

You’re not an indentured servant. You really I’ll be happy to fix this. And what additional vacation and bonus do I get? Because I don’t have to do this. Sure. Fire me. And then you’re even in a worse spot. So point, sometimes it gets weird and ugly because exactly the boss that is a bastard about it is the one that you happen to have to be working for right at the right time.

Most times I had good negotiations with people who really understood: you’re, like, the only guy that could have figured this out. I will court you to keep you, to make sure that we don’t take on risk that we don’t have to, [00:38:00] because keeping you happy is an important thing.

Stephen: And not everything is an emergency that the world’s on fire.

If there’s, A bad link on a page that is causing people to go to something incorrectly. Yeah, that’s a bad thing to have happen But if that page has been there for seven years without being changed And only four people have ever gone to it in that seven years That’s not the highest priority. We’ve not affected that many people.

Those are four that could be managed and figured out. But when every little problem like that is just as much an on-fire, massive problem as the whole server going down, then it’s like, you need to re-evaluate here.

Alan: I’ll tell you, we’ve laughed about using the choosenator, for making life decisions for us.

I used to talk about that kind of stuff all the time, that you need to build in. Like monitoring and feedback loops. And it might be that the [00:39:00] monitoring actually slows your system down by 2%. But when you have a feel for what really needs to get done, how often has this feature been used? How critical is it dollars wise or time wise or whatever, and you have numbers, not just panic, not just.

I’m personally offended that this thing is down. You need to be able to say, there will never be a time when it’s only one thing. There’s always going to be choosing amongst options, and how will we make that choice? You have to be able to prioritize, to make a decision on numbers, not just a feel for it, and sometimes people were pretty good with numbers.

How many times it was also, okay, I fixed that, but this is still blowing up over here. And then next time, let’s not start at the top left. Let’s really look at what has the most traffic through it, what has the most impact, all those kinds of things. And I’ve had good bosses where they were like, it was really good that we were able to track this, because oftentimes

Bosses are not only, I need to get [00:40:00] this fixed, but, I need to be able to report to my higher ups what we’re doing and why we’re doing it. And giving them that information that says, we really have prioritized this correctly, we can prove it, we’ve got the numbers, it gets them off the hot seat by not only saying, I yelled louder and that was my solution. It was like, no, you took 15 minutes to talk this through and we think we’re doing this.

And in fact, there’s a feedback loop in on this process that says in 15 minutes. I don’t want to go knocking on the guy’s door, but we can check to see has anything broken more severely and then we will let’s call it an hour. Let’s call it two hours. We will shift resources based on the next 10 bad hours.

We’re going to have to make sure that each of those things that gets solved. It gets us out of hot water to the public quicker, less customers affected, less dollars lost all that kind of stuff.
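Editor's sketch: "look at what has the most traffic through it" amounts to sorting the open problems by a measured impact number instead of by who is yelling loudest. The Python below is a minimal illustration, assuming you already log visits per day for each broken thing. The `Incident` type, the `triage` function, and all figures are hypothetical; the first entry borrows Stephen's example of a page with four visits in seven years.

```python
from typing import NamedTuple


class Incident(NamedTuple):
    name: str
    visits_per_day: float     # measured traffic through the broken thing
    dollars_per_visit: float  # assumed loss for each affected visit


def daily_impact(incident: Incident) -> float:
    """Rough dollars lost per day while the incident stays open."""
    return incident.visits_per_day * incident.dollars_per_visit


def triage(incidents):
    """Order open incidents by measured impact, biggest first."""
    return sorted(incidents, key=daily_impact, reverse=True)


open_incidents = [
    Incident("bad link on a seven-year-old page", 4 / (7 * 365), 10.0),
    Incident("whole server down", 20_000, 1.5),
    Incident("checkout button misaligned", 3_000, 0.2),
]

for rank, incident in enumerate(triage(open_incidents), start=1):
    print(rank, incident.name, round(daily_impact(incident), 2))
```

Re-running `triage` with fresh measurements every hour or two would be one way to get the feedback loop described above, where resources shift as the numbers change.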

Stephen: I did a, yeah. I did a little search once and took the last 500 emails I’d gotten and broke it down if it included ASAP or [00:41:00] immediately, it was about 300 out of 500 emails.

That said,

Alan: the top priority. Like you’re saying, once people find that there’s a priority system, they learn to game that system and say, I’ve got to put three exclamation points or the magic words, ASAP, mission critical. They don’t necessarily judge whether it matters that much to the company, but it matters to them.

And they’re going to claim as much of you as a resource as they possibly can, because it matters that their system is cruising along correctly. That was a lot.
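Editor's sketch: Stephen's email count (about 300 of his last 500 messages containing "ASAP" or "immediately") is easy to reproduce on any list of message texts. The Python below is a minimal illustration. The marker list, the `urgent_share` function, and the sample messages are hypothetical, and loading real mail is left out.

```python
import re

# Hypothetical marker list; extend it with whatever "magic words" show up.
URGENT = re.compile(r"\b(asap|immediately|urgent)\b", re.IGNORECASE)


def urgent_share(messages):
    """Return (flagged, total, fraction) for messages with an urgency marker."""
    flagged = sum(1 for text in messages if URGENT.search(text))
    total = len(messages)
    fraction = flagged / total if total else 0.0
    return flagged, total, fraction


sample = [
    "Please update the banner ASAP!!!",
    "Needs done immediately.",
    "FYI, notes from Tuesday's meeting",
    "Can you swap this image when you get a chance?",
    "URGENT: wrong price on the landing page",
]

flagged, total, fraction = urgent_share(sample)
print(f"{flagged} of {total} messages ({fraction:.0%}) are marked urgent")
```

When well over half of the requests carry an urgency marker, the marker no longer separates anything, which is the gaming problem Alan describes.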

Stephen: Hey, I see your Batman. Hold on. Hold on one second.

So look what I got: my Justice League identity card. Excellent. Colin had been at the comic store and gave me one last night. It’s the Justice League. It says Justice League on it. So

Alan: there we go. So now you’re officially a Justice League insider. You’re Snapper Carr. You’re like a mascot of the Justice League. [00:42:00] There’s a voice from the past.

Does anybody even know who that is anymore? Because he’s Not really around anymore.

Stephen: In today’s world you’re only relevant. You’re only relevant until you’re not. And then everyone forgets you in two seconds

Alan: or else they’ll bring you back as a villain. Oh yeah. He had developed a drug habit and that’s why he had to sell out the Justice League.

I think that’s really what they did with Snapper. It’s, oh man, he was like comic relief and then you turned him into a POS. That’s not fair. And yet you’ve got to do something for stories after 80 years. Exactly. So there’s things coming out. The new Venom movie is coming out. I’ve been seeing really interesting press about the Thunderbolts movie, and now I’m hearing they might call it Dark Avengers because they don’t want to lose the Avengers brand name and stuff.

But that was a very cool idea when it first came out, that They had reformed villains that were going to take care of things like the Avengers, but because they’re villains, they might not have all the moral components of no killing and stuff like that. And then you find out [00:43:00] sometimes villains They’re using that as a way of masking.

Now I got the government behind me, but I can still do all kinds of evil evil stuff. So honestly, it’s going to be really cool to see how they handle all that. Moral ambiguity and quandary. Cause not to be weird, tell me that’s not screaming from today’s headlines. So many people lost trust in institutions because there have been people that said.

If I can act as if I’m a good guy, I can get away with murder as long as I can maintain the ruse. You know what I mean? Whether that’s the government or the priesthood, name the thing that has had horrific corruption because someone was able to look you in the eyes and lie to you and JD Vance, like someone that just, it doesn’t seem to be, there’s no moral bone.

There’s not, like, a tell, where you can tell that he knows, he looks to the side because he’s trying to remember what lie he told last time to make sure that this lie ties into it. No, they really are soulless liars that lie as easily as breathing. And wasn’t there a TV series, called The Mentalist or [00:44:00] something, where a guy could tell by micro expressions whether someone is lying? Lie to Me, with Tim Roth.

Tim Roth, that’s what it was. Exactly. And that was a great show. It was a great show, but I wonder whether that would apply to some of these true psychopaths and sociopaths. They don’t have a moral compass. They don’t have anything.

Stephen: And that show Bull with what’s his name from NCIS.

Alan: I’m sorry, you’re cutting out a little bit.

Stephen: Oh, yeah, sorry. The show Bull with the one guy from NCIS. It’s the same type of thing where he reads the juries and stuff. Okay. Have you watched Agatha at all?

Alan: Not yet. It’s just right on.

It’s been,

Stephen: it’s been fun. I’ve been enjoying

Alan: the continuation of the Scarlet Witch. They even

Stephen: bring that

Alan: up.

Stephen: They even mentioned it.

Alan: Okay.

Stephen: Yeah. Did you happen to watch the penguin on Max?

Alan: Haven’t watched that yet, honestly, because of being on our little Baltic cruise excursion, and then putting life back together after that.

And so [00:45:00] I am behind. I have a list of, wow, I can’t believe I haven’t watched yet: The Penguin, especially because it’s Colin Farrell. Colin Farrell’s someone that I really admire as an actor. And I was like, let’s see what he’s doing. This sounds really interesting.

Stephen: So we’ve been watching it.

Okay. Think of it as The Sopranos, but set in Gotham City. It’s not superheroes. Batman’s not in it. They’re not fighting the Riddler. It’s not bad guy of the week. It really is a drama with the Penguin as the focus. The stories have been fantastic. I’m like, oh my god. When I saw Logan in the theater, I was like, oh my god, that was just a good movie regardless of it being quote unquote superheroes. The movie itself was just so good.

Anybody should be able to enjoy this. If you enjoy any of those crime, mafia type shows, you can watch this. There’s no [00:46:00] Batman superheroics going on. It’s a gritty drama. The story has just been engaging and so fantastic.

Alan: Very good. I’ll have to, I’m ready to watch it. I just have not done so very good.

And

Stephen: like you said, Colin Farrell, I didn’t even recognize him with all the prosthetics on and he does such a good job acting

Alan: guy and the penguin is not a handsome guy. And so how did they submerge him into this?

Stephen: You can’t hardly tell it’s Colin Farrell and the prosthetics look so real you forget it’s not his real face and he does such a good job acting through it.

I’ve been like, dear god, if this show does not win some awards, somebody is doing something wrong. So I love all the MCU stuff. I love Marvel. That’s been my thing. Spider-Man’s always been my guy. I used to read tons of that and DC, but mostly Marvel, and I’ve loved all the movies, and I was really looking forward to Agatha.

Colin said, hey, we got new episodes [00:47:00] of Agatha and Penguin. I said, oh, let’s watch Penguin. I could forget about Agatha for a couple days. I need to see what happened in Penguin.

Alan: Got it. I love the fact that a lot of the comic folks were really good early on, like with the Kingpin and Spider-Man, at saying, it’s not all going to be costumed thugs.

It’s going to be guys that are really good about crime, really good about money and its influence, really good about just being intimidating to their standard citizens. And then that’s so much how crime happens. It’s not knocking over a bank. It’s all those little protection rackets and prostitution and numbers and all those other, the mafia type things.

It’s a very interesting comparison to The Sopranos. I’m really curious. They did a really interesting transformation. Like, the Penguin back on the Batman TV series in the 60s was Burgess Meredith, wak, wak, crazy umbrellas and stuff like that. But they absolutely did transform him into just being a criminal, amoral guy.

And his superpower, if you will, was he’ll do anything to anybody. [00:48:00] He’ll sell out anybody. He wasn’t a physical, Bane type character. He’s not going to win a fight, but he sure can make plans and arrange for all of those little things to happen, and he’ll get his cut.

You know what I mean? So

Stephen: it’s got a lot of intrigue and a lot of separate little storyline stuff going on. And it gave me and Colin this idea for a really cool video game, and somebody really should do this. So it takes place in the new The Batman world from the movie. And this is right after the wall, the water, all the dam broke and it destroyed everything.

So that’s where we’re starting at now. How would this be for a cool game? So a top-down-ish game, but you’re the Penguin and you have to build up your forces. So you have to rebuild your base, you have to, you know, get your gang in there to rebuild it. So it’s partly like a crafting, [00:49:00] building, SimCity thing, but it’s also a little GTA, where you’re gathering your people and you send them out to go deliver drugs or go do prostitution and little side missions. But you’re not doing them. You’re sending your people to do them, like an RTS, and that would be so much fun.

Alan: Yeah, I’ll tell you, this is a while back, like decades where they had, there had been the tremendous storm, all the bridges got severed, Gotham City was isolated and they had the various different parts of the city.

Trying to think what it was called. There was something like only the villainous, the venal and the insane are left on the island. And they showed how they really got more into the, geography of Gotham City than ever before. Here’s why this guy, Killer Croc, owns this section, because he’s the guy that really will intimidate you to the point of biting your head off.

Whereas this was the Penguin doing his crime thing and they had the various different around Arkham Asylum, whatever else it might have been. And part of what they went into that series very well was, [00:50:00] like, no matter how badass you are, You can’t be everywhere. So your power extends to are you looming over somebody at the time and intimidating them?

But once you walk away People can go back to what they were doing as long as they’re not visible to you

And whereas the penguin was great at All the little arrangements. You know what? I have money. I don’t need to be a big guy. I can hire a big guy to be my heavy, my muscle and like the alliances that were caused or you don’t have to make it that it’s a nice deal.

It can be, I know something about you and I’m gonna blackmail you unless you work for me. I’ve got your sister captive and I’m going to harm her unless you... And so that was incredibly intense, to find out that it wasn’t just biff, bam, pow. Far from it. All the Batman stuff, people betrayed him.

People betrayed each other because people had learned how you really can have all different kinds of influence that isn’t only inspiring people to fight back against crime. People that were trapped on the island, they want to be neighborhood [00:51:00] watch. But then you find out that it’s really hard to do that if they are going to kill your mom unless... oh my god. And it lasted a year. There were all the interconnected Batman titles and Catwoman, everything was going on. And if I remember right, there was something, well, why didn’t the card-carrying Justice League come in and help clean up the city? Because Batman said, stay out of my city, it’s mine. You know what I mean? They had to have, like way long ago, why didn’t the Justice Society just join World War Two and whack the Nazis and fix the world? Because of the Spear of Destiny, remember, that would actually have power over them. They had to have some kind of big excuse as to why a whole bunch of heroes didn’t clean out the city and say, we’re not going to do all the disaster relief, but at least we can stop it being as horrifically bad as it has got. The Mad Hatter is here, and just how to deal with the insane. If you have a profit motive, you can say, we’ll cut a deal.

If you’re the joker and all you want is chaos, it doesn’t pay as much. Okay.

Stephen: [00:52:00] Oh, I was going to say, oh, so you mentioned that in the penguin, his mother is alive, but he keeps her well off the grid because everybody else thinks she’s dead. So that’s going to come into play. I’m sure at some point.

Alan: Interesting. They’ve had some, The Long Halloween, the holidays, there’s been a couple other things where they’ve talked about that, that the Roman, the various different mafia presences in Gotham, were hating the fact that the costumed people had come around and attracted all this attention to crime fighting, because they liked it when it was just, hey, we want everything to be nice and calm as long as we’re making our bit off the numbers, our bit off all the various different little sins. And in order to keep things calm, they really were willing to do terrible things, just like people, they almost rubbed each other out.

You don’t want to have a gang war, you want to have territories and everybody make their bets. I’m curious about how they’re going to handle it. Who, Black Mask? There’s others that have been not super villains, maybe just wearing a mask to conceal [00:53:00] their identity, but that really were crime bosses, like the Penguin, much more than they were Kite Man or other crazy villains.

You know what I mean? And not everybody ends up going to Arkham Asylum. Some people just go to prison because most of their crime was stealing, not trying to poison Gotham’s water supply or whatever else it might be, unless there was money to be made. So I’m feeling bad now that I’m behind.

I got to catch up. I got to see, is it where it’s being released once a week? And so you have to wait for the next, so you can’t binge it? Cause I like that. To jump, we just started watching the next season of the British baking show. And while we were catching up on like 13 seasons, the butler did it used to binge three, four nights.

And now it’s going to be, we really are like, what do they call it, destination TV, appointment TV, where it’s, okay, the next episode is out on Friday. So that’s going to be a steak for you and pizza for me, and we plan on cuddling on the couch. We’re ready for the next [00:54:00] episode, and then it really is,

Okay, that only took an hour I want to see more. I, whatever we had established as a way of immersing ourself in it. Now you got to wait for 10 weeks.

Stephen: It’s so funny now, because when Survivor came on, not just Survivor, Big Brother and Survivor, when those two started, they were like the beginning of the whole reality TV thing.

And now it’s become such a thing. But people don’t understand, even though it’s quote unquote reality TV, not just the baking contest, that they’re scripted. They are as much a fiction thing. And I’ve argued with people. I know people who have been on these shows. I’ve talked to actual people on some of these shows and they’re always like, oh yeah.

It’s not, it’s the

Alan: professional wrestling of television. Yeah,

Stephen: exactly. It’s just one step away from a soap opera. And it’s not that it’s scripted, but real things happen. They just may pause and say, hold on, let’s go back. We need to make this more dramatic. You need to say this.

We’re going to do this later. Hint at it here and then do it [00:55:00] again. So it’s that mixture, but it is definitely not all reality, and people don’t believe that. And I’m like, that explains our political landscape at the moment. Completely.

Alan: Exactly.

Stephen: Hey, I hate to cut this a little short. I got a call coming up and I’ve got some other issue here.

I got to get first. So

Alan: we’re able to fit this in, brief. We’ve had some really serious medical stuff going on. I know people who have now had aorta problems and detached retina problems and things like that. And so thank God for medical science and the advances in that, because we laugh, but we really say as long as Colleen and I stay alive, they’re fixing things that soon it’s going to be all the stem cells and all the telomere lengtheners that we all stick around and we’ll be able to live longer.

But in the meantime, there’s still terrible, serious stuff that can happen. And the fact that they can be cured and fixed, it’s just amazing. Like, what used to kill people 20 years ago, now it’s, okay, we know what to do. Cool the guy down so that [00:56:00] his metabolism slows. And it was amazing how these things, they’re still a big deal, but they sure aren’t

Uncharted territory anymore. We learn and add to that knowledge base all the time and thank God it’s available to us. You know what I mean? I just I’m around because I’ve had two big cures and you know there was the time when they didn’t know that hey put the guy between two big metal plates and Zap him with electricity to start his heart on the correct rhythm.

Okay, that’s mad scientist stuff

Cardioversion, oh my gosh. But right there,

Stephen: It’s a shame that we just can’t trust our scientists or our medical people ’cause they’re all in some big conspiracy against us.

Alan: It sucks. It’s only about money. There’s no desire to actually help. There’s no Hippocratic oath you take about do no harm.

Alright man. Cool. Take care. Take care. Enjoy your baking show. Alright. You bet.

Let’s see.

You have been listening to the Relentless Geekery [00:57:00] Podcast. Come back next week and join Alan and Stephen’s conversation on Geek Topics of the Week.