This is a story where a one-line change to a hardcoded value actually went well.
I could imagine a scenario where somebody stored the number of months of backlog as a 2-bit value: 0, 1, 2, or 3, you know, to be smart and clever. This might not show up as a problem during testing because it could be hidden many layers down, in some untested downstream service. Maybe in some low-code automation service...
Changing it to 4 would mean the stored backlog wraps around to 0. Who knows what the consequences might be? Would that service go and cancel all jobs in the production queue?
Would it email all customers telling them their stuff is cancelled?
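To make the wraparound concrete, here is a minimal sketch, assuming a hypothetical downstream service that packs the value into two bits (none of these names come from the article):

```python
# Hypothetical: a downstream service that "cleverly" packs the backlog
# window into 2 bits, so only 0-3 can be represented.
BACKLOG_MONTHS_MASK = 0b11


def pack_backlog_months(months: int) -> int:
    # Anything above 3 silently truncates to the low two bits.
    return months & BACKLOG_MONTHS_MASK


assert pack_backlog_months(3) == 3
assert pack_backlog_months(4) == 0  # the service now sees "no backlog"
```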
I get that this is a seemingly easy change, but if a change of policy reaches the software team as an urgent problem, it seems like the management team needs better planning rather than randomly trying to reprioritize issues...
The audit trail likely represents actual risk reduction against someone undoing or misunderstanding the change later, since the change has no meaning outside the context of the request.
"Fixing preexisting errors that violate new company policy" also arguably involves real risk reduction; you gotta do that work sometime, and if everyone in the company agrees the time is now, the best time is now.
Using Marge instead of Homer is not "risk reduction", but presumably testing the accounting close is also critical.
Tony's request is also reasonable, unless you want to leave the next dev in the same shithole you were in re. the wiki state.
On the flip side, if nearby things are never updated to match our changing understanding of the system, then very shortly the code will be cluttered with possibly dozens of different styles: multiple naming conventions, constants in some places, hard-coded values in others, values read from a parameter file in still others, and so on. The result will be a chaotic scramble with no clear structure, one that requires programmers to know and understand multiple different ways of expressing the same business concept.
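As a purely hypothetical illustration (invented names, nothing from the article), "multiple ways of expressing the same business concept" ends up looking something like this:

```python
import configparser

# Module A's style: a named constant.
BACKLOG_MONTHS = 3


# Module B's style: the same policy as a bare magic number.
def is_beyond_backlog(age_in_months: int) -> bool:
    return age_in_months > 3


# Module C's style: the same value read from a parameter file,
# under yet another naming convention.
_cfg = configparser.ConfigParser()
_cfg.read("scheduler.ini")  # hypothetical file name
backlogMonths = _cfg.getint("queue", "backlog_months", fallback=3)
```

Changing the policy now means hunting down every one of these, which is exactly the scramble being described.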
You said, "It's a bad idea to rush through refactors".
What constitutes "rushing through" a refactor, and what forces and context make it bad to do so? What can we do, if anything, to make it so that refactoring is as much a part of everyday development as the CI/CD process, and thus becomes just part of the work that's done, not something to be put off until the business decides there's nothing else with a higher priority?
In the linked article, the situation was that they absolutely needed to change some hard-coded MONTHS value from 3 to 4 in order to keep the factory running.
That change should be shipped in isolation. With manual testing to compensate for the lack of coverage, presumably.
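A sketch of what "shipped in isolation" could look like, assuming the value lives in a plain Python constant (the names are invented for illustration):

```python
# The entire production change: one line, one commit, nothing else touched.
SCHEDULING_BACKLOG_MONTHS = 4  # was 3; policy change effective immediately


# A characterization test added alongside it, pinning the new value until
# proper coverage and a refactor land later.
def test_backlog_window_is_four_months():
    assert SCHEDULING_BACKLOG_MONTHS == 4
```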
Refactoring doesn't have the same urgency as keeping the factory running, no matter how much we all believe in keeping the campground clean. It can wait until Monday in this particular case.
OK, but that's not the question I asked. What I want to know is how we can make refactoring as non-negotiable a part of the process as code review, tests, CI/CD, or whatever else you consider essential and non-skippable, even under a short timeframe.
In the context they were in, the answer to all of those questions is "shut the fuck up, we can talk about it later".
In the normal course of business, it's a different conversation. Even then, if you're making some poor dev refactor a bunch of code because they had the misfortune to touch it, maybe you should have done the work yourself a long time ago. Or written a linter and ticketed owners.
You don't want to make feature development some sort of pain lottery.
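On the "written a linter" option above: a minimal sketch, assuming a Python codebase and an entirely made-up rule, of how you could flag the offending spots and ticket them to the owning teams instead of whoever touches the file next:

```python
"""Hypothetical lint rule: flag comparisons of backlog-ish names against literal numbers."""
import re
import sys

HARDCODED_BACKLOG = re.compile(r"\bbacklog\w*\s*[<>]=?\s*\d+", re.IGNORECASE)


def lint(path: str) -> int:
    violations = 0
    with open(path, encoding="utf-8") as src:
        for lineno, line in enumerate(src, start=1):
            if HARDCODED_BACKLOG.search(line):
                print(f"{path}:{lineno}: hard-coded backlog value; use the shared constant")
                violations += 1
    return violations


if __name__ == "__main__":
    # Exit non-zero so CI can fail the build; file the findings as tickets
    # for the owning teams rather than blocking unrelated feature work.
    sys.exit(1 if sum(lint(p) for p in sys.argv[1:]) else 0)
```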
Unplanned work being arbitrarily scoped into a sprint is pain. Doesn't matter the source.
In practice, your approach often turns into "junior engineer abuse" where they have to clean up a bunch of unrelated pre-existing stuff to make the seniors happy as a condition of shipping their feature.
There's all sorts of ways it could go wrong. Perhaps the real question is where blame will fall if it does. If the big boss says "I decided to take the risk and push this through, I accept this was a consequence of that", great. If the programmers get beatings, not so great.
The other thing I noticed from the story was that something considered mission-critical was not given an update within 24 hours.
IT should have volunteered how far back in the backlog this landed as soon as that prioritization was made. "Behind 14", with many people on the testing side occupied, is obviously not going to help with a "layoff-level priority".
To me, the classification of "enhancement" just doesn't seem to capture the urgency.
I think the right people and processes were followed, but they could have saved a great deal of time by putting together a meeting with the leads to align on the importance and priority of the task.
For a time-sensitive and critical update to core functionality, the director of operations should have been aware of the mean time to deployment for the software and put together a team to fast track it, instead of entering it into the normal development pipeline with a high priority.