You’ve built (or paid for) a piece of software that works as expected when tested internally … but stalls and sputters when you turn it over to users. What went wrong?

Software is often built to perform rather than scale. Minimizing the time to handle a single query—by any means necessary—is tempting, but sacrificing long-term scalability for speed can be risky.

“There is often a trade-off between performance and scalability,” says Jerry Goodnough, Cognitive Medical Systems’ Chief Architect. “Systems designed for scalability may not be as fast for single users, but are implemented to remain responsive under load. Systems designed purely for performance may slow considerably at scale.”

Plan Carefully Before You Build
The number of expected software users or transactions can grow substantially over a system’s effective lifetime. The key to architecture that can grow with your organization is anticipating your eventual business needs and planning for them before you build. Try to estimate the number of users and transactions beyond the near term. The same goes for the complexity of transactions and the business context. For example, if you’ve built a scheduling program for an outpatient surgical center with a limited number of cases per year, will it be able to scale to handle a busy family practice seeing 5,000 patients a month?
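To make this kind of estimate concrete, here is a minimal sketch in Python of projecting monthly volume under compound annual growth. The function name, the growth rates, and the planning horizon are illustrative assumptions, not figures from any real system; only the 5,000-patients-a-month starting point comes from the example above.

```python
def project_monthly_load(current_monthly: int, annual_growth_rate: float, years: int) -> int:
    """Project monthly volume after `years` of compound annual growth.

    All inputs are planning assumptions supplied by the caller;
    annual_growth_rate is a fraction (0.25 means 25% per year).
    """
    if current_monthly < 0 or years < 0:
        raise ValueError("current_monthly and years must be non-negative")
    return round(current_monthly * (1 + annual_growth_rate) ** years)


if __name__ == "__main__":
    # Hypothetical scenario: a practice seeing 5,000 patients a month today.
    for rate in (0.10, 0.25, 0.50):
        projected = project_monthly_load(5000, rate, years=5)
        print(f"{rate:.0%} annual growth -> about {projected} patients/month in 5 years")
```

Running a few growth scenarios side by side like this can show how quickly a comfortable margin disappears, which is the point of estimating beyond the near term.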

Which Direction(s) to Scale?
Architects have a couple of scaling options: horizontal and vertical. Scaling horizontally involves adding more components to your system (e.g., servers, services, databases, or connections). Vertical scaling involves increasing the size of a system’s existing components (e.g., more memory, larger and faster processors).

Initially, it may be expedient to stand up a single server and scale it vertically, especially if you can accurately predict your eventual capacity requirements. Keep in mind, however, that it is often expensive to accommodate unexpected demand with this approach, as reserve capacity must be allocated up front. If demand suddenly exceeds capacity, it may be necessary to replace the entire server to satisfy service level agreements, and these procurement and retirement costs compound over the system’s lifecycle.

Horizontal scaling is generally more expensive to architect initially than a vertical solution, but it allows for “capacity on demand.” You can start small and add capacity as needed by supplementing, rather than replacing, components. On the flip side, you can reduce capacity as demand wanes, so you effectively pay only for what you need at the time. The additional nodes also introduce redundancy, which increases fault tolerance and system availability.
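The “capacity on demand” idea can be sketched as a simple calculation: given current demand and the capacity of one node, how many nodes should be running? The Python below is a simplified illustration only. The function name, the per-node capacity, and the demand figures are hypothetical, and a real autoscaler would also account for headroom, startup time, and redundancy targets.

```python
def nodes_needed(demand: int, per_node_capacity: int, min_nodes: int = 1) -> int:
    """Return how many identical nodes are needed to serve `demand`.

    demand and per_node_capacity use the same unit (e.g. requests per minute).
    min_nodes keeps a floor in place so the service never scales to zero.
    """
    if per_node_capacity <= 0:
        raise ValueError("per_node_capacity must be positive")
    # Integer ceiling division: round up so demand is always covered.
    required = (max(demand, 0) + per_node_capacity - 1) // per_node_capacity
    return max(required, min_nodes)


if __name__ == "__main__":
    # Hypothetical demand over a day; each node is assumed to handle 100 units.
    for demand in (40, 250, 900, 120):
        print(f"demand={demand} -> {nodes_needed(demand, 100)} node(s)")
```

The node count rises and falls with demand, whereas a single vertically scaled server has to be sized for the peak from day one.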

So Which is Right for You?
If capacity requirements are known in advance and growth is manageable with a single server for the long term, vertical scaling could be the most appropriate option. If capacity may vary or exceed a single server, horizontal scaling may be the only viable solution.
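As a rough illustration, the rule of thumb above can be written as a small Python function. This is a deliberately simplified sketch: the function name and its three inputs are assumptions made for the example, and a real decision would also weigh cost, availability requirements, and team experience.

```python
def choose_scaling(peak_demand: int, single_server_max: int, demand_is_predictable: bool) -> str:
    """Apply the rule of thumb: vertical only when demand is known and fits one server.

    peak_demand and single_server_max share a unit (e.g. transactions per hour);
    single_server_max is the largest capacity one server could realistically reach.
    """
    if demand_is_predictable and peak_demand <= single_server_max:
        return "vertical"
    # Demand may vary or may exceed what a single server can provide.
    return "horizontal"


if __name__ == "__main__":
    print(choose_scaling(800, 1000, demand_is_predictable=True))
    print(choose_scaling(800, 1000, demand_is_predictable=False))
    print(choose_scaling(5000, 1000, demand_is_predictable=True))
```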

If you’re working with a vendor, make sure they consider the trade-offs between vertical and horizontal scaling, that they understand how database replication and sharding affect data storage and access, and that they take into account the impact of user interface design on data load.