Analytics applications often have extremely demanding infrastructure requirements. CPU, RAM, disk, network, and other components can all become heavily stressed. However, the components are rarely stressed equally; an analytics application is typically constrained by one primary chokepoint. Add resources to alleviate that chokepoint, and another bottleneck, caused by the next most constraining resource, will quickly appear.
Perhaps that is a bit simplified. Rather than only adjusting physical infrastructure, you can tune applications themselves to help meet the demands of different types of analysis as well as different types of data. Still, a well-architected infrastructure remains critical to ensure analytics applications run efficiently and economically.
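To make the chokepoint idea concrete, here is a minimal sketch in Python. It assumes each resource can be summarized by a single demand figure and a single capacity figure, which is a simplification; the function name and all numbers are hypothetical and chosen only for illustration.

```python
def find_bottleneck(demand, capacity):
    """Return the resource with the highest demand-to-capacity ratio."""
    ratios = {name: demand[name] / capacity[name] for name in demand}
    worst = max(ratios, key=ratios.get)
    return worst, ratios[worst]


# Hypothetical load figures, in arbitrary units per resource.
demand = {"cpu": 90.0, "ram": 60.0, "disk": 80.0, "network": 30.0}
capacity = {"cpu": 100.0, "ram": 100.0, "disk": 100.0, "network": 100.0}

first, first_ratio = find_bottleneck(demand, capacity)

# Relieve the primary chokepoint by doubling its capacity.
capacity[first] *= 2

second, second_ratio = find_bottleneck(demand, capacity)

print(f"primary chokepoint: {first} ({first_ratio:.0%} of capacity)")
print(f"next chokepoint:    {second} ({second_ratio:.0%} of capacity)")
```

With these made-up numbers, CPU is the first constraint; once it is relieved, disk takes its place.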
From an economics perspective, several variables must be considered. Some relate to cost and others to value. What is the capital investment required to build the infrastructure? What is the cost of processing a given dataset? How quickly can a given dataset be processed? Can processing be completed fast enough that processing of the next dataset can begin as soon as it is available? What is the average utilization of the infrastructure, both while processing data and over longer periods that include idle time?
As these questions suggest, coming up with the "best" architecture is not an easy exercise. In fact, when considering how infrastructure requirements change over time, you could argue that no single, best architecture exists. After all, as long as infrastructure demands vary over time, any fixed infrastructure will at times be underutilized and at other times be unable to keep up with demand.
This is where cloud computing has outstanding potential. Cloud computing is not about fixed infrastructure. It is about dynamic allocation of infrastructure resources to meet the needs of an application at the time it actually needs those resources. When a cloud-ready analytics application is not running, you are not paying for idle compute capacity (persistent storage may still be billed). This largely eliminates the cost of underutilized infrastructure.
When it comes to controlling costs, cloud computing has an additional advantage to consider. You can access massive cloud infrastructures with no capital investment. This means cloud users don't have to try to architect the "best"-sized infrastructure in advance, committing large sums of capital in the process. They simply deploy what they need, when it is needed, and pay for what they consume.
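The difference between paying for fixed capacity and paying for consumption can be sketched with some simple arithmetic. The example below is illustrative only: the function names, the hourly rate, and the usage figures are all assumptions, and it treats the amortized hourly cost of an owned server as equal to the cloud rate, which will not hold in general.

```python
def utilization(server_hours_used, servers, hours_in_period):
    """Fraction of the available server-hours that did useful work."""
    return server_hours_used / (servers * hours_in_period)


def fixed_cost(hourly_rate, servers, hours_in_period):
    """Fixed infrastructure is paid for whether or not it is busy."""
    return hourly_rate * servers * hours_in_period


def on_demand_cost(hourly_rate, server_hours_used):
    """Pay-per-use infrastructure is billed only for the hours consumed."""
    return hourly_rate * server_hours_used


# Hypothetical figures: 10 servers over a 720-hour month,
# of which only 1,800 server-hours were spent processing data.
HOURLY_RATE = 1.0          # assumed cost per server-hour, same for both models
SERVERS = 10
HOURS_IN_PERIOD = 720
SERVER_HOURS_USED = 1800

used = utilization(SERVER_HOURS_USED, SERVERS, HOURS_IN_PERIOD)
fixed = fixed_cost(HOURLY_RATE, SERVERS, HOURS_IN_PERIOD)
elastic = on_demand_cost(HOURLY_RATE, SERVER_HOURS_USED)

print(f"average utilization: {used:.0%}")
print(f"fixed cost:          {fixed:.2f}")
print(f"on-demand cost:      {elastic:.2f}")
```

Under these assumptions, the lower the average utilization, the wider the gap between the two cost figures. In practice, per-hour cloud rates are often higher than the amortized cost of owned hardware, so the comparison depends on how variable the workload really is.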
Cloud computing also addresses performance and scale issues. Cloud-ready analytics applications are designed to scale out. Adding more cloud servers or other cloud infrastructure means quicker processing for a given dataset. It also means the ability to process larger quantities of data at the same time.
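As a rough illustration of scaling out, the sketch below assumes ideal linear scaling: the work divides evenly across servers with no coordination overhead. Real applications rarely scale this cleanly, and the function names, dataset size, and throughput figure are hypothetical.

```python
def processing_hours(dataset_gb, gb_per_server_hour, servers):
    """Elapsed time to process a dataset, assuming perfectly parallel work."""
    return dataset_gb / (gb_per_server_hour * servers)


def server_hours(dataset_gb, gb_per_server_hour, servers):
    """Total billable server-hours for the same job."""
    return processing_hours(dataset_gb, gb_per_server_hour, servers) * servers


DATASET_GB = 1200.0         # hypothetical dataset size
GB_PER_SERVER_HOUR = 50.0   # assumed throughput of a single server

hours_with_4 = processing_hours(DATASET_GB, GB_PER_SERVER_HOUR, 4)
hours_with_8 = processing_hours(DATASET_GB, GB_PER_SERVER_HOUR, 8)

billed_with_4 = server_hours(DATASET_GB, GB_PER_SERVER_HOUR, 4)
billed_with_8 = server_hours(DATASET_GB, GB_PER_SERVER_HOUR, 8)

print(f"4 servers: {hours_with_4:.1f} h elapsed, {billed_with_4:.0f} server-hours")
print(f"8 servers: {hours_with_8:.1f} h elapsed, {billed_with_8:.0f} server-hours")
```

Under the ideal-scaling assumption, doubling the servers halves the elapsed time while the total server-hours, and therefore the pay-per-use bill, stay the same. Any overhead from coordination or data movement would erode that result.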
Cloud computing isn't the perfect solution for every analytics problem. However, the flexibility, elasticity, scale, and cost attributes it delivers are certainly worth understanding and considering.