One of the problems with big data applications is that they have to handle big data: we’re talking huge data sets.
As Supreet Oberoi, vice-president of Concurrent Inc., a maker of the Cascading application development framework, points out in a column for GigaOM, if these applications aren’t tough enough they may fail in production.
The solution is to build resilient, well-tested applications before they go out the door. “This is a matter of philosophy and architecture as much as technology,” he says, in putting forward eight tips for building big data apps that can hold up to demanding environments:
–Define a blueprint for resilient applications, with a systemic enterprise architecture and methodology for how your company approaches big data applications.
This means answering a number of questions, including where your current architecture is failing;
–Size shouldn’t matter. Too often apps are tested with small-scale datasets, then fail or take too long with larger ones. They have to handle all sizes of data;
–Have a transparent process for finding problems, so developers and operations staff can diagnose and respond to problems when they happen;
–Abstraction and simplicity work. “Resilient applications tend to be future-proof because they employ abstractions that simplify development, improve productivity and allow substitution of implementation technology,” he writes. Developers should be able to build apps without being mired in the implementation details. Then data scientists should be able to use the app and access any type of data source;
–Build in security, auditing and compliance;
–Test-driven development should provide the ability to step through the code, establish invariants, and utilize other defensive programming techniques;
–Be portable. Applications should be designed to run on a variety of platforms and products;
–No black arts. Code should be shared, reviewed and commonly owned by multiple developers, not dependent on one person.
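To make two of these tips concrete (handling all sizes of data, and establishing invariants with defensive checks), here is a minimal Python sketch. It is only an illustration and does not come from Oberoi's column or from Cascading: the function names `running_mean`, `synthetic_records` and `check_all_sizes` are hypothetical, and the one-pass mean stands in for whatever aggregation a real application performs.

```python
from typing import Iterable, Iterator


def running_mean(values: Iterable[float]) -> float:
    """Mean of a stream, computed in one pass with constant memory."""
    count = 0
    mean = 0.0
    for value in values:
        # Defensive check: reject non-numeric or NaN records early
        # instead of letting them corrupt the result downstream.
        if not isinstance(value, (int, float)) or value != value:
            raise ValueError(f"invalid record at position {count}: {value!r}")
        count += 1
        # Incremental update, so the full data set is never held in memory.
        mean += (value - mean) / count
    if count == 0:
        raise ValueError("empty input")
    return mean


def synthetic_records(n: int) -> Iterator[float]:
    """Hypothetical test data: 0.0, 1.0, ..., n-1, generated lazily."""
    return (float(i) for i in range(n))


def check_all_sizes() -> None:
    """The same invariant must hold for tiny and large inputs alike."""
    for n in (1, 10, 1_000, 100_000):
        expected = (n - 1) / 2  # mean of 0..n-1
        result = running_mean(synthetic_records(n))
        assert abs(result - expected) <= 1e-6 * max(1.0, expected), (n, result)


if __name__ == "__main__":
    check_all_sizes()
```

Because the input is consumed as a stream, the same code path runs for ten records or a hundred thousand, and the size-parameterized check exercises it at several scales rather than only on a small sample.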
“If companies follow these eight rules, they will create resilient, scalable applications that allow them to tap into the full power of big data,” Oberoi writes.
How many of these rules does your developer team follow — or break?