New ask Hacker News story: Ask HN: Best approach to inheriting an out of control system?

2 by jimmynopension | 3 comments on Hacker News.
I can't find any good advice for managing what is a reasonably common situation: how to firefight successfully.

The system is on-prem and cobbled together, with multiple parts that are completely unknown. Everyone is afraid of making a change, and if it goes down it might not come back up.

Two devs (both quite junior, quick-and-dirty types) were building a little project on top of a system and adding bits to it as they got new ideas. Much of the codebase they have never touched; it's just there, running with their stuff on top. They deployed it to an OVH remote managed rack (yeah, I know!) and offered it to a client. The client loved it: hockey-stick growth, huge demand. It's totally not production ready, but it's throwing off cash (lots and lots and lots).

The devs are burnt out and one is taking an offline holiday. They aren't mature about the situation, and they are also a self-contained business unit away from the rest of the company.

An emergency consultant and senior devs from the parent company started taking a look over the evenings this week. The emergency plan is to get a copy deployed and running in Azure so there are two instances, and if the OVH one goes down again we can swap traffic (pending P&L sign-off).

There's potential for a great product to die and cause reputational damage. There's also potential to beat it into shape and turn it into a production product with multi-million revenue.

Thoughts?