Big data is changing the world. IDC has defined it as one of four key technologies enabling digital transformation, and companies across the world are rushing to take advantage of the insights it can offer. Processing tomorrow’s data won’t be easy, though. Companies must be able to handle more kinds of information than ever before. The increasing volume and velocity of data will place extra strain on computing systems that weren’t designed to handle it.
So how can companies build a big data infrastructure without breaking the bank? They must be prepared to deal with large data sets in innovative ways, but many won’t have the appetite for a large-scale systems upgrade. Fortunately, there are ways to improve the speed and efficiency of your data analysis without investing valuable IT dollars in a system-wide upgrade. Cloud computing can help enterprises bring more data processing capabilities on board without a huge capital investment. It promises customers more elastic storage and computing resources, which they can pay for on demand.
Public cloud providers such as Amazon and Microsoft offer infrastructure as a service (IaaS), which lets IT departments extend their existing computing power using virtual machines. They can quickly spin up development and testing servers on platforms such as Amazon Web Services and, as they gain experience with the public cloud, eventually deploy production systems there, too. Many applications that previously ran on their own premises will be usable in the cloud.
Companies that are uncomfortable with putting sensitive data in a public cloud environment can still take advantage of cloud computing’s elasticity by running a private cloud infrastructure. This uses virtual machines controlled by a layer of software that automates management tasks behind the scenes. It keeps some of the elasticity you find in a public cloud environment, and it enables administrators to allocate computing resources on demand to compute-intensive data processing tasks in ways that were not possible before. Sophisticated IT departments can even marry both cloud computing approaches in a hybrid cloud environment, where private and public cloud services work together to complete certain workloads. Companies using this approach can put data wherever it makes the most sense.
IaaS can run existing applications focused on handling traditional transactions, such as customer orders, but data workloads are evolving and creating new data-processing demands. The Internet of Things is an example of this data revolution, generating high-volume data streams from vast arrays of distributed sensors. For example, next-generation car company Tesla produces sensor-heavy vehicles that monitor everything from location to speed and road conditions. At this time last year, Tesla was collecting data from cars driving 1 million miles every 10 hours, and it’s using that data to generate new insights about how its customers are using its vehicles and how the cars are performing.
Even consumer-facing mobile applications are forcing companies to process data in new ways. Companies analyzing information from an online dating or photo-sharing app look for different things than companies processing flight bookings. Unstructured data such as audio files, statistical machine learning models, or social media posts will feature in tomorrow’s data sets, and enterprise data processing systems haven’t had to deal with these before.
To meet these needs, cloud providers have also layered big data analytics services into what they offer. Many of them now provide analytics as a software service that customers can access programmatically and combine with their own applications. This makes it possible to quickly build analytics services that meet evolving business needs without having to invest in complex, expensive hardware infrastructure that may go unused for large parts of the day.
In many cases, these analytics services use a form of artificial intelligence known as machine learning, which builds statistical models from large amounts of data. Enterprise customers can then use these services for analytics, drawing on patterns in historical data to make business decisions. Machine learning concepts have existed for years, but they only became a commercial reality relatively recently, thanks to cloud computing and big data, which helped put the vast computing resources they required at users’ disposal. It’s a good example of how on-demand computing services are making it possible to do new things. Handling those tasks on equipment that you purchase yourself isn’t impossible, but it’s far from easy.
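To make the idea of "using patterns in historical data" concrete, here is a minimal sketch of the simplest kind of statistical model: a straight line fitted to past figures and then used to forecast the next one. It is an illustration only, not how any particular cloud analytics service works. The function names and the monthly order figures are hypothetical, and real services train far richer models on far more data.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical historical data: month number and units ordered that month.
months = [1, 2, 3, 4, 5, 6]
orders = [120, 135, 150, 160, 178, 190]

model = fit_line(months, orders)
forecast = predict(model, 7)
print(f"Forecast for month 7: {forecast:.1f} units")
```

The same pattern of fitting on history and then predicting on new inputs underlies much larger models. What changes is the volume of data and the computing power needed, which is where on-demand resources come in.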
IT departments that want the benefits of newer, more powerful kinds of data processing shouldn’t despair. Though few companies have the budget for a complete system and software refresh, the good news these days is that, with cloud computing, that’s no longer necessary.
Read the e-book “A Practical Big Data Analytics Strategy to Help Manufacturers Own the Future” for a look at how Hortonworks is helping customers take advantage of big data in the cloud.