As the company’s business experiences explosive growth, both the scale of requirements and the user base expand rapidly. This challenges the system on the “three highs” (high performance, high concurrency, high availability), as well as on scalability and maintainability. The old system, due to various limitations in its early design (such as the expertise of early participants, the foresight of the architectural design, and the impatience of management), gradually becomes inadequate to meet current and future demands, exposing various issues. Developers find themselves dragging an old, worn-out car down the highway, which is a daunting task. In simpler terms, the codebase of the old system has become too problematic to fix, leading to a situation where developers either get buried in its issues or abandon the project altogether.
At this point, a common question arises: should we continue trying to patch the issues, or should we choose to refactor? Patching is simply not feasible, not in this lifetime. Refactoring, on the other hand, requires the courage of a true hero because it’s a complex and time-consuming task. Moreover, it can impact ongoing business development or even bring it to a standstill. Often, product managers and executives are not supportive because they only care about one thing: when will the next feature be ready? Everything else is your development team’s problem.
If you choose the path of refactoring, you must be prepared to see it through, no matter what. How can you ensure a successful refactoring from the get-go? Based on common practices in internet projects and my personal experience in refactoring projects, here is an outline of the common steps for refactoring systems of various sizes:
Refactoring is not just the responsibility of the development team; it’s a collective effort involving the entire project team. Refactoring can improve the system’s performance, availability, and scalability, as well as optimize and streamline business processes to meet new demands. It requires a significant investment of resources and must have the support of stakeholders. Typically, this requires explaining the benefits and drawbacks of refactoring, as well as the critical issues that would arise if refactoring is not done. Once you have their support, the refactoring work can officially begin.
Participants: Technical Leader
Refactoring is a long-term endeavor; it’s not something that can be completed in one or two iterations, or even within a few months. It requires a substantial investment of manpower, resources, time, and effort. So, what are our goals in this prolonged battle? Are we aiming to meet the system’s high-performance requirements through a more efficient architecture? Or do we want to enhance code quality through refactoring? Perhaps we aim to introduce new technologies and frameworks to upgrade the entire system or optimize business processes to address previously unmet requirements. Once you have clear goals, you can work purposefully.
Participants: Technical Leader, Architect
Refactoring typically falls into several levels:
- Platform-level refactoring: Refactoring the entire platform, such as Alibaba transitioning from the LAMP stack to the Java platform.
- System-level refactoring: Refactoring specific business systems, such as introducing microservices or SOA to break down monolithic applications.
- Architecture-level refactoring: Improving the existing architecture through adjustments and redesign, addressing architectural shortcomings, like decoupling business logic through layered design or introducing caching for improved concurrency.
- Business-level refactoring: Addressing specific business requirements that cannot be met due to the limitations of the current system, often involving the refactoring of business processes or database structures.
- Module/code-level refactoring: The most common form of refactoring, typically involving the use of design patterns, encapsulation, and code optimization to improve code structure and performance.
Determine the level of refactoring required, the overall scope, and the technology stack. Then assess and estimate the refactoring work systematically: identify the costs, required resources, and time commitments, and assess whether ongoing business requirements can be accommodated during the refactoring. With these estimates in hand, you can give stakeholders a clear answer, especially when they ask when new requirements can be delivered.
Participants: Technical Leader, Architect, Developers
Refactoring is not about abandoning the old system; it’s about continuing to work with it. Knowing your enemy is the key to victory. Refactoring requires not only a clear understanding of the new system’s goals and future, but also a deep familiarity with the old system, especially its pitfalls. At this stage, the participants in the refactoring project, especially those who worked on the old system, should document and organize the old system’s business and technical details.
The following are common preparation tasks before refactoring the old system:
- Gathering information and documentation related to the old system, including design documents, technical documents, architectural diagrams, UML diagrams, ER diagrams, and other graphical materials.
- Mapping business lines and processes, and documenting the resulting project and business flows.
- Reviewing key code and database designs in the old system.
Any issues or uncertainties should be addressed promptly through communication with relevant personnel from the business side, ensuring that problems are resolved early in the process.
Participants: Technical Leader, Architect, Developers
If the refactoring involves changes to the database, database refactoring is typically the first step. Many refactoring initiatives are triggered by issues related to the database. During database refactoring, the deficiencies and obstacles in the old system’s database design are addressed. This may involve redesigning tables using normalization or denormalization techniques, considering sharding or partitioning strategies, and more.
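As one illustration of a sharding strategy, the sketch below routes a row to a table by hashing its key. It is a minimal example, not a recommendation for any particular system: the shard count, the `orders_NN` table naming, and the choice of user ID as the sharding key are all assumptions made for illustration.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count, chosen only for this example


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a sharding key to a shard index using a stable hash.

    A cryptographic hash is used only because it is stable across
    processes and machines (unlike Python's built-in hash()).
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def table_for(user_id: str) -> str:
    """Return the (hypothetical) physical table name for this user."""
    return f"orders_{shard_for(user_id):02d}"
```

Simple modulo routing like this forces most keys to move when the shard count changes, so the shard count and key are worth settling during the design stage rather than after data has been migrated.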
Participants: DBA, Architect
Before starting the backend system refactoring, it’s essential to have design and technical documentation in place, as mentioned earlier. Once these documents are finalized through discussions and planning, the architect can proceed with system architecture design, and backend developers can begin coding. This phase is often the most time-consuming and critical part of the refactoring process. The quality of the backend architecture directly affects the quality of the business code and the success of the refactoring as a whole.
Due to the extended timeline of this phase and the fact that its results may not be immediately visible, Agile development methodologies are often used. This allows for iterative development, ensuring effective planning and continuous progress. The advantages of using iterations include:
- Effective planning and quantification of the entire refactoring process.
- Visible achievements at each stage, preventing the team from getting stuck in a long refactoring process.
- The ability to test or observe refactored parts promptly during iterations, allowing continuous learning and improvement.
During backend system refactoring, it’s essential to have clear, quantifiable goals and standards. For example, define the QPS (queries per second) that each system and business module must support and the expected response times for its interfaces. This enables the team to focus on achieving these goals during refactoring.
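A quantifiable goal can be written down as data and checked automatically after each load test. The sketch below assumes a list of measured latencies and a test duration are already available; the `PerformanceTarget` type, its field names, and the use of the 99th percentile as the response-time measure are illustrative choices, not something prescribed by the process described here.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class PerformanceTarget:
    """Hypothetical target for one interface."""
    min_qps: float      # minimum sustained queries per second
    max_p99_ms: float   # maximum allowed 99th-percentile latency


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty sample."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_target(latencies_ms: Sequence[float],
                 duration_s: float,
                 target: PerformanceTarget) -> bool:
    """Check measured throughput and tail latency against the target."""
    qps = len(latencies_ms) / duration_s
    p99 = percentile(latencies_ms, 99)
    return qps >= target.min_qps and p99 <= target.max_p99_ms
```

Running a check like this at the end of every iteration turns "is the new system fast enough?" into a pass or fail result that can be tracked over time.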
Regular code reviews should also be conducted throughout the refactoring process to identify and address issues with the refactoring itself and the quality of the code. This helps prevent the introduction of poor designs or subpar code that could harm the entire system.
Participants: Technical Leader, Architect, Developers
If database refactoring is part of the project, data migration becomes a crucial step. It generally involves two types of migration: full migration and incremental migration. Full migration transfers all data from the old system to the new one in one go, while incremental migration handles data created in the old system after full migration until the old system is retired. These migrations are typically scripted or programmed to avoid manual errors.
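The two migration types can be sketched with Python's standard `sqlite3` module. This is a simplified illustration under several assumptions: a single hypothetical `users` table, identical schemas on both sides, and an `updated_at` column on the old table that increases whenever a row is written. Real migrations usually also have to handle schema changes and deleted rows, which this sketch does not.

```python
import sqlite3

UPSERT = "INSERT OR REPLACE INTO users (id, name, updated_at) VALUES (?, ?, ?)"


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the hypothetical users table used in this example."""
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
    )


def full_migration(old: sqlite3.Connection, new: sqlite3.Connection) -> int:
    """Copy every row once; return the highest updated_at seen (the watermark)."""
    rows = old.execute("SELECT id, name, updated_at FROM users").fetchall()
    new.executemany(UPSERT, rows)
    new.commit()
    return max((row[2] for row in rows), default=0)


def incremental_migration(old: sqlite3.Connection,
                          new: sqlite3.Connection,
                          watermark: int) -> int:
    """Copy rows written after the watermark; return the new watermark."""
    rows = old.execute(
        "SELECT id, name, updated_at FROM users WHERE updated_at > ?",
        (watermark,),
    ).fetchall()
    new.executemany(UPSERT, rows)
    new.commit()
    return max((row[2] for row in rows), default=watermark)
```

The incremental step is meant to be run repeatedly, passing each returned watermark into the next run, until the old system is retired.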
After migration, it’s essential to compare the data between the old and new systems. This comparison can also be automated through scripts or programs to identify discrepancies and perform any necessary adjustments or investigations.
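A comparison script can be as simple as the sketch below. It assumes the rows from both systems have already been loaded into dictionaries keyed by primary key, which is only practical for small tables or for one batch at a time; the function name and the shape of the report are illustrative.

```python
from typing import Any, Dict, Hashable, List, Tuple

Row = Tuple[Any, ...]


def compare_tables(old_rows: Dict[Hashable, Row],
                   new_rows: Dict[Hashable, Row]) -> Dict[str, List[Hashable]]:
    """Report keys that are missing, unexpected, or different in the new system."""
    missing = sorted(k for k in old_rows if k not in new_rows)
    extra = sorted(k for k in new_rows if k not in old_rows)
    mismatched = sorted(
        k for k in old_rows if k in new_rows and old_rows[k] != new_rows[k]
    )
    return {"missing": missing, "extra": extra, "mismatched": mismatched}
```

An empty report for every table is a reasonable precondition for moving on; any non-empty list points at the rows to investigate.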
Participants: DBA, Developers
As the backend system refactoring progresses, scripts and programs should be developed to validate the business interfaces between the old and new systems. This ensures that issues in the refactoring process are detected promptly, and, if necessary, architectural and database adjustments can be made. Additionally, increasing unit test coverage during refactoring is highly beneficial.
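One way to validate interfaces is to replay the same requests against both systems and diff the responses. In the sketch below, `old_api` and `new_api` are hypothetical callables standing in for whatever client calls each system (for example an HTTP request), and `ignored_fields` lists response fields that are expected to differ, such as trace IDs or timestamps.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Request = Dict[str, Any]
Response = Dict[str, Any]


def without_fields(response: Response, ignored: Iterable[str]) -> Response:
    """Drop top-level fields that are allowed to differ between systems."""
    ignored_set = set(ignored)
    return {k: v for k, v in response.items() if k not in ignored_set}


def find_mismatches(old_api: Callable[[Request], Response],
                    new_api: Callable[[Request], Response],
                    requests: Iterable[Request],
                    ignored_fields: Iterable[str] = (),
                    ) -> List[Tuple[Request, Response, Response]]:
    """Send each request to both systems and collect differing responses."""
    ignored = tuple(ignored_fields)
    mismatches = []
    for request in requests:
        old_resp = without_fields(old_api(request), ignored)
        new_resp = without_fields(new_api(request), ignored)
        if old_resp != new_resp:
            mismatches.append((request, old_resp, new_resp))
    return mismatches
```

This only suits read-only or otherwise side-effect-free interfaces; replaying requests that write data would need isolated test environments on both sides.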
Once the dependencies between systems and modules are resolved, integration testing can begin. Comprehensive testing should be performed, including functional, stability, and performance testing, both locally and in a simulated production environment. Any issues identified during testing should be fixed and verified so that the system meets the standards required for a smooth release.
Participants: Architect, Developers, Testers
When the backend system refactoring reaches a certain level of stability, it’s time to initiate gradual deployment. During this phase, only a portion of the traffic is directed to the new system. This allows for real-time tracking and analysis of logs and monitoring alarms. Any issues or anomalies can be addressed promptly. As confidence in the new system’s stability grows, the scope and volume of the deployment can be gradually increased. Continuous monitoring of logs and alarms should be maintained throughout this phase.
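Directing a portion of traffic to the new system is often done by assigning each user to a stable bucket and comparing it with the current rollout percentage. The sketch below shows that idea only; in practice the routing usually lives in a gateway or load balancer, and the function names and the choice of user ID as the routing key are assumptions made for illustration.

```python
import hashlib


def bucket_for(user_id: str) -> int:
    """Assign a user to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def route(user_id: str, rollout_percent: int) -> str:
    """Return "new" for users inside the rollout, "old" otherwise."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    return "new" if bucket_for(user_id) < rollout_percent else "old"
```

Because the bucket depends only on the user ID, a user who is routed to the new system at 10% stays there as the percentage is raised, and setting the percentage back to 0 sends everyone to the old system again.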
Participants: DevOps Team, Testers, Developers
When it comes to transitioning to the new system, it’s crucial to have a well-defined transition plan in place. This plan should include detailed processes, workflows, and contingency plans, including rollback procedures in case unexpected issues arise. This step ensures that the transition is smooth and minimizes disruption to the business.
Participants: DevOps Team, Testers
After completing the above steps, the system has undergone successful refactoring. However, it’s essential to understand that refactoring is a substantial undertaking, and even after the process, the system may not be flawless. Refactoring is not the endpoint but rather a new beginning.