This mandate is the inspiration behind InfoWorld’s list of top 10 emerging enterprise technologies of 2009. We believe this is an amazing time in IT, with a swarm of new technologies that have the potential to reduce costs, change the way we work, and open up new frontiers. So we decided to brush aside the high-level trends trumpeted by analysts and ask ourselves: Which enterprise technologies shipping now, but not yet widely adopted, will have the greatest impact?
We have purposely avoided specific product mentions or recommendations, because we have set our sights on long-term potential rather than current implementation. If it’s your job to concoct your organization’s technology strategy and decide where to place your bets, then our list of the top 10 emerging enterprise technologies is for you.
10. Whitelisting
Some whitelisting software can fingerprint and block a wider range of files than executables, including scripts and macro modules, and even write-protect any text or configuration file. The latter is useful for noting unauthorized modifications, such as the changes that many malware programs make to the DNS Hosts file.
Obviously, whitelisting requires a cultural shift. In many enterprises today, users still have some measure of control over what they run on their own desktop or laptop computers. But due to the relentless ramp-up in new and smarter malware — and the increased involvement of organized crime in malware-based attacks — whitelisting may be our only hope in the losing battle over enterprise security.
— Eric Knorr
9. Cross-platform mobile app dev
The iPhone boom has brought many things to programmers beyond the urge to simulate bodily functions with apps like iFart. The most enduring legacy is familiarity with Objective-C, a language first introduced with Steve Jobs’ NeXT Computer in 1988.
If you’re a Java programmer, learning Objective-C means figuring out how to handle memory allocations for yourself. If you’re a JavaScript jock, you must grasp the concept of a compiler. There is no other choice if you want to write code that can be downloaded by millions of iPhone owners.
So why not start with something simple written in the languages spoken by every Web developer? When I built Web versions of my book, “Free for All,” I added a special markup that let the iPhone install the Web page as if it were a regular app. All of this code will work on other WebKit-enabled browsers, like the one in Android, and it’s not hard to make it work on the BlackBerry.
Some dev kits are moving beyond the browser to provide better access to deeper corners of the API. Appcelerator’s Titanium Architecture, Nitobi’s PhoneGap, and the LiquidGear fork of PhoneGap build apps for the major platforms that are ostensibly native but rely upon creating an embedded version of the browser. Most of the important logic is crafted in JavaScript, which runs inside the embedded browser. The code has access to the accelerometer and the GPS even though it’s just JavaScript.
Others are porting popular languages like Ruby. The Rhomobile tool, for instance, embeds a complete Ruby interpreter and Web server inside your app so that you can write everything in Ruby. The folks at Apple forced them to remove the eval function because it hurt their ability to completely test each app, but aside from that, it’s like building a Web site in Ruby. The code runs on the major platforms.
All of these approaches are surprisingly good, provided you’re not looking for superfast performance or perfection. Game developers can use the accelerometer with these apps, but only to build simpler, two-dimensional games that don’t need access to the deepest levels of the video hardware. Fonts and layouts are sometimes just a bit different from platform to platform, and this can be annoying. But if your requirements are simple and you already know Web development languages, these approaches are much easier than learning Objective-C.
For enterprises, cross-platform app dev eliminates a key barrier to developing and deploying mobile applications developed in-house. It’s difficult to mandate that all employees use the same smartphone, and even if you could, coding your apps for a specific platform locks you in. With cross-platform app dev, you can write it once — without having to learn the quirks of a specific platform — and run it across many devices. At last, widespread deployment of mobile enterprise applications may become a reality.
—Peter Wayner
8. Hardware power conservation
We all know the “two kinds of green” cliché: Save the planet and save money by reducing power consumption. The technologies to accomplish that dual purpose have already found their way into servers, desktops, and other hardware, but in some cases, the benefits will accrue only as better software support emerges.
More efficient power supplies, along with hard drives that reduce speed or shut themselves off when they aren’t needed, are delivering the goods right now. But multicore CPUs that “park” inactive cores, and motherboards or other components that go to sleep, generally need to be told to do so at the OS or application level.
Several storage vendors produce hard drives that can spin down or power off when not in use. Most of the systems shipping now limit the functionality to slowing down drives, since the time required to spin up or shut down a drive is longer than most applications support. There are generally three levels of power savings, each conserving more power and requiring more time to return to full functionality; think of them as slow, slower, and off. The first state can be recovered from in 1 to 2 seconds and the second in less than 30 seconds, while recovery from the powered-off state can take as long as two minutes. The powered-off state causes problems with most applications, so most vendors don’t use it.
The latest CPUs support core parking, powering down cores that aren’t needed when loads are light. The feature is supported in Windows 7 and Windows Server 2008 R2. It’s most useful in servers that are intermittently loaded or lightly used outside of business hours. A two-, four-, six-, or eight-core processor can shut down all but one core and still respond to requests, and return to full functionality if the load on the single core increases beyond a set limit.
Motherboards and add-ons such as network interface cards are introducing the capability to power down components when not in use. For example, some motherboards, particularly laptop systems, support two video systems: one built into the motherboard and one discrete. The built-in adapter uses less power, while the discrete one offers higher performance. The motherboard can switch between the two as necessary to offer either power savings or high performance.
Network interface cards can shut down when the network is not in use, and other components are adding similar capabilities. But until these features are supported by the operating system — and, in some cases, individual applications — they are of little use. It’s great to have a NIC that powers itself down, but you need an operating system that can power the thing up again.
— Logan Harbaugh
7. Many-core chips
The next decade will see an explosion of cores in new chips. This era, dubbed “many core” — a term that refers to more than eight cores — is set to break out shortly. Intel, for example, has already shown working demos of a chip from its Tera-scale project that contains 80 cores and is capable of 1 teraflop using only 62 watts of power. (To put that in perspective, note that a system capable of 18 teraflops would qualify for the current list of the top 500 supercomputers.)
Non-x86 processor vendors are also deeply involved in this fray. For example, Tilera currently sells a 16-core chip and expects to ship a 100-core monster in 2010. What will IT do with so many cores? In the case of Tilera, the chips go into videoconferencing equipment enabling multiple simultaneous video streams at HD quality. In the case of Intel, the many cores enable the company to explore new forms of computing on a single processor, such as doing graphics from within the CPU. On servers, the many-core era will enable huge scalability and provide platforms that can easily run hundreds of virtual machines at full speed.
It’s clear the many-core era — which will surely evolve into the kilo- and megacore epoch — will enable us to perform large-scale operations with ease and at low cost, while enabling true supercomputing on inexpensive PCs.
— Andrew Binstock
6. Solid-state drives
SSDs (solid-state drives) have been around since the last century, but recently, we’ve seen an explosion of new products and a dramatic drop in SSD prices. In the past, SSDs have been used primarily for applications that demand the highest possible performance. Today we’re seeing wider adoption, with SSDs being used as external caches to improve performance in a range of applications. Gigabyte for gigabyte, SSDs are still a lot more expensive than disk, but they are cheaper than piling on internal server memory.
Compared to hard drives, SSDs are not only faster for both reads and writes, they also support higher transfer rates and consume less power. On the downside, SSDs have limited life spans, because each cell in an SSD supports a limited number of writes.
But the most dramatic story is pricing. A 32GB SSD has gone from over $1,000 to under $100 in the last five years, though this is still about 46 times as expensive as a SATA drive in dollars per gigabyte. As new solutions to the wear problem emerge from the lab, we expect SSD adoption to accelerate even more, as the hunger for high performance in cloud computing and other widely shared applications increases.
— Logan Harbaugh
5. NoSQL databases
Data is flowing everywhere like never before. And the days when “SQL” and “database” were interchangeable are fading fast, in part because old-fashioned relational databases can’t handle the flood of data from Web 2.0 apps.
The hottest Web sites are spewing out terabytes of data that bear little resemblance to the rows and columns of numbers from the accounting department. Instead, the details of traffic are stored in flat files and analyzed by cron jobs running late at night. Diving into and browsing this data require a way to search for and collate information, which a relational database might be able to handle if it weren’t so overloaded with mechanisms to keep the data consistent in even the worst possible cases.
The solution? Relax the strictures and come up with a new approach: NoSQL. Basic NoSQL databases are simple key/value pairs that bind together a key with a pile of attributes. There’s no table filled with blank columns and no problem adding new ad hoc tags or values to each item. Transactions are optional.
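To make the key/value idea concrete, here is a minimal sketch in Python that uses an in-memory dictionary as a stand-in for a real store. The names `store`, `put_item`, and `get_item` are invented for illustration and do not correspond to the API of any NoSQL product.

```python
# Toy in-memory key/value store: each key is bound to a free-form
# dictionary of attributes. There is no fixed schema.
store = {}

def put_item(key, **attributes):
    """Create the item if needed, then merge in any attributes, known or new."""
    store.setdefault(key, {}).update(attributes)

def get_item(key):
    """Return the attributes stored under key, or an empty dict if absent."""
    return store.get(key, {})

put_item("user:42", name="Ada", city="London")
put_item("user:43", name="Grace")             # no 'city', and no blank column either
put_item("user:42", tags=["admin", "beta"])   # ad hoc attribute added later

print(get_item("user:42"))
print(get_item("user:43"))
```

Each item carries only the attributes it actually has, which is why there is no table of blank columns. Nothing here is transactional, in keeping with the idea that transactions are optional.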
Today’s NoSQL solutions include Project Voldemort, Cassandra, Dynomite, HBase, Hypertable, CouchDB, and MongoDB, and it seems like more are appearing every day. Each offers slightly different ways to access the data. CouchDB, for instance, wants you to write your query as a JavaScript function. MongoDB has included sharding (where a large database is broken into pieces and distributed across multiple servers) from the beginning.
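As a rough illustration of sharding in general, the sketch below routes each key to one of several "servers" by hashing the key. This is generic hash partitioning written for this example, not a description of how MongoDB or any other product assigns data to shards; the shard names and the helpers `shard_for`, `shard_put`, and `shard_get` are hypothetical.

```python
import hashlib

SHARDS = ["shard-a", "shard-b", "shard-c"]   # hypothetical server names

def shard_for(key, shards=SHARDS):
    """Pick a shard deterministically from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]

# One dictionary per shard stands in for one server's storage.
cluster = {name: {} for name in SHARDS}

def shard_put(key, value):
    """Store the value on whichever shard owns the key."""
    cluster[shard_for(key)][key] = value

def shard_get(key):
    """Look up the value on the shard that owns the key."""
    return cluster[shard_for(key)].get(key)

for i in range(1000):
    shard_put(f"page:{i}", {"hits": i})

# Show how many keys landed on each shard, then read one back.
print({name: len(data) for name, data in cluster.items()})
print(shard_get("page:7"))
```

Because the shard is computed from the key alone, any client can find the right server without a central lookup. The trade-off of this simple modulo scheme is that changing the number of shards moves most keys.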
Simple key/value pairs are just the start. Neo4J, for instance, offers a graph database that uses queries that are really routines for wandering around a network. If you want the names of the dogs of all of the friends of a friend, the query takes only a few lines to code.
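The friends-of-a-friend query can be sketched as a short traversal. The example below uses plain Python dictionaries rather than Neo4J’s own API, and the people, dogs, and the function name `dogs_of_friends_of_friends` are made up for illustration.

```python
# Adjacency lists: who is friends with whom, and who owns which dogs.
friends = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "erin"],
    "dave": ["bob"],
    "erin": ["carol"],
}
dogs = {"bob": ["Max"], "dave": ["Rex"], "erin": ["Fido", "Spot"]}

def dogs_of_friends_of_friends(person):
    """Walk two 'friend' hops out from person, then collect each owner's dogs."""
    direct = set(friends.get(person, []))
    two_hops = {fof for f in direct for fof in friends.get(f, [])}
    two_hops -= direct | {person}   # keep only people exactly two hops away
    return sorted(dog for owner in two_hops for dog in dogs.get(owner, []))

print(dogs_of_friends_of_friends("alice"))   # ['Fido', 'Rex', 'Spot']
```

The query is a routine for wandering the network: follow edges, filter, collect. A graph database performs the same kind of walk against data stored on disk.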
The real game is keeping the features that are necessary while avoiding the ones that aren’t. Project Cassandra, for instance, promises to offer consistent answers “eventually,” which may be several seconds in a heavily loaded system. Neo4J requires the addition of Lucene or some other indexing package if you want to look for particular nodes by name or content because Neo4J will only help you search through the network itself.
All of these new projects are just the latest to rediscover the speed that might be found by relaxing requirements. Look for more adjustments that relax the rules while enhancing backward compatibility and ease-of-use. And expect a new era of data processing like nothing we’ve experienced before.
— Peter Wayner
4. I/O virtualization
I/O virtualization addresses an issue that plagues servers running virtualization software such as VMware or Microsoft Hyper-V. When a large number of virtual machines runs on a single server, I/O becomes a critical bottleneck, both for VM communication with the network and for connecting VMs to storage on the back end. I/O virtualization not only makes it easier to allocate bandwidth across multiple VMs on a single server, it paves the way to dynamically managing the connections between pools of physical servers and pools of storage.
But let’s start with the individual server. Take, for example, VMware’s recommendation to allocate one gigabit Ethernet port per VM. A server that supports 16 VMs would therefore need four four-port gigabit Ethernet NICs, plus additional Ethernet (iSCSI), SCSI, or Fibre Channel adapters for the necessary storage. Many servers don’t have enough empty slots to support that many adapters, even if the cooling capacity were adequate. And 16 VMs per host is barely pushing it, considering that today’s Intel and AMD servers pack anywhere from 8 to 24 cores and support hundreds of gigabytes of RAM. Consolidation ratios can go much higher.
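The adapter arithmetic in that example can be checked in a few lines. This is a minimal sketch that assumes one gigabit port per VM and four ports per NIC, as in the text; the function name `nics_needed` is invented for illustration, and storage adapters are not counted.

```python
import math

def nics_needed(vms, ports_per_vm=1, ports_per_nic=4):
    """Number of multi-port NICs required to give every VM its own port(s)."""
    return math.ceil(vms * ports_per_vm / ports_per_nic)

print(nics_needed(16))   # 4 four-port NICs, matching the example in the text
print(nics_needed(50))   # 13 NICs at a higher consolidation ratio
```

At higher consolidation ratios the count quickly exceeds the number of slots in a typical server, which is the bottleneck I/O virtualization is meant to remove.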
Typically, a single adapter resides in each server, connected by a single cable to the appliance or switch, which then provides both network and storage ports to connect to storage and other networks. This simplifies datacenter cabling, as well as the installation of each server. It also eases the task of transferring adapters to another system if a server fails. In solutions such as Cisco UCS, I/O virtualization makes server provisioning, repurposing, and failover extremely flexible and potentially completely automated, as it’s handled entirely in software. Further, because the I/O virtualization systems can emulate either multiple Ethernet or Fibre Channel connections running at varying speeds, available bandwidth can be tailored to the requirements of VM migration or other heavy loads.
Virtualizing I/O does require drivers that support the specific OS in use. The major operating systems and virtualization platforms are supported, including VMware ESX and Windows Server 2008 Hyper-V, but not necessarily all versions of Linux and Xen or other open source virtualization platforms. If you’re using supported OSes, I/O virtualization can make running a large datacenter much simpler and far less expensive, particularly as increased processing power and memory support allow servers to handle ever larger numbers of virtual machines.
— Logan Harbaugh
3. Data deduplication
Data is the lifeblood of any business. The problem is what to do with all of it. According to IDC, data in the enterprise doubles every 18 months, straining storage systems to the point of collapse. The blame for this bloat often falls on compliance regulations that mandate the retention of gobs of messages and documents. More significant, though, is that there’s no expiration date on business value. Analyzing data dating back years allows users to discover trends, create forecasts, predict customer behavior, and more.
Surely there must be a way to reduce the immense storage footprint of all of this data, without sacrificing useful information. And there is, thanks to a technology known as data deduplication.
Every network contains masses of duplicate data, from multiple backup sets to thousands of copies of the employee handbook to identical file attachments sitting on the same e-mail server. The basic idea of data deduplication is to locate duplicate copies of the same file and eliminate all but one original copy. Each duplicate is replaced by a simple placeholder pointing to the original. When users request a file, the placeholder directs them to the original and they never know the difference.
Deduplication takes several forms, from simple file-to-file detection to more advanced methods of looking inside files at the block or byte level. Basically, dedupe software works by analyzing a chunk of data, be it a block, a series of bits, or the entire file. This chunk is run through an algorithm to create a unique hash. If the hash is already in the index, that chunk of data is a duplicate and doesn’t need to be stored again. If not, the chunk is stored and its hash is added to the index.
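The hash-and-index loop can be sketched as follows. This is a simplified illustration rather than any vendor’s implementation: it assumes fixed-size chunks and SHA-256 as the hash, and the names `dedupe_store`, `restore`, and `index` are hypothetical. The chunk size is kept tiny so the demo is easy to follow.

```python
import hashlib

CHUNK_SIZE = 4   # bytes; real systems use far larger chunks

index = {}       # hash -> the single stored copy of that chunk

def dedupe_store(data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and store each unique chunk once.

    Returns the list of chunk hashes (placeholders) needed to rebuild the data.
    """
    recipe = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in index:      # new chunk: store it and index its hash
            index[digest] = chunk
        recipe.append(digest)        # duplicate or not, record a pointer
    return recipe

def restore(recipe):
    """Rebuild the original bytes by following the placeholders."""
    return b"".join(index[digest] for digest in recipe)

first = dedupe_store(b"AAAABBBBAAAACCCC")
second = dedupe_store(b"AAAABBBBDDDD")

print(len(index))                             # 4 unique chunks stored
print(restore(second) == b"AAAABBBBDDDD")     # True
```

The two inputs total 28 bytes, but only four 4-byte chunks are kept. A hash collision would in principle make two different chunks look identical, which is why such systems rely on strong hash functions.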
Data deduplication isn’t just for data stored in a file or mail system. The benefits in backup situations, especially with regard to disaster recovery, are massive. On a daily basis, the percentage of changed data is relatively small. When transferring a backup set to another datacenter over the WAN, there’s no need to move the same bytes each and every night. Use deduplication and you vastly reduce the backup size. WAN bandwidth usage goes down and disaster recovery ability goes up.
More and more backup products are incorporating data deduplication, and deduplication appliances have been maturing over the past few years. File system deduplication is on its way too. When it comes to solving real-world IT problems, few technologies have a greater impact than data deduplication.
— Keith Schultz
2. Desktop virtualization
Desktop virtualization has been with us in one form or another seemingly forever. You could probably even say that it’s been emerging since the mid-1990s. But there’s more to desktop virtualization today than most of us could have imagined even two or three years ago. Yet another milestone is just around the corner: truly emergent technology in the guise of the desktop hypervisor.
Long the leader in this space, Citrix Systems’ XenApp and XenDesktop are examples of how desktop virtualization just might put a desktop server farm in every datacenter and a thin client on every desktop. XenApp weaves together all the prevalent desktop and application virtualization technologies into a single package: traditional application and desktop sessions, application streaming, and VDI (Virtual Desktop Infrastructure). No matter which way you turn, the detriments of each are generally offset by the benefits of another.
Regardless of what solutions are available today and what solutions may be on the horizon, enterprise desktop management remains one of the biggest points of pain in any organization. While the model for datacenter architecture has changed systemically in the past 20 years, the model for deploying desktops hasn’t. In most places, it’s still one fat box per user, with some mishmash of management tools layered across the top to protect the users from themselves and protect the network from the users.
Whether any of the desktop virtualization technologies are applicable to your enterprise is wholly dependent on the nature of the business. Call center and health care treatment room terminals are a relative no-brainer, but you can quickly run into problems with noncompliant applications in other implementations. As the blend of desktop virtualization technologies reaches a critical mass, the wide variety of ways to ship a Start menu to a user offers a better chance that at least one will apply in every instance. Certainly, if the world turns its back on fat clients at every desk, IT will be a happier place. As for the users, the client hypervisor may give both IT and the most ardent fat client holdouts what they need.
—Paul Venezia
1. MapReduce
Why on earth would InfoWorld pick a programming framework for distributed data processing as the most important emerging technology of 2009? Because MapReduce enables enterprises to plunge into analyzing undreamed of quantities of data at commodity prices, a capability that promises to change business forever.
IDC has predicted a tenfold growth in digital information between 2006 and 2011, from just under 180 exabytes to 1,800 exabytes (that’s 1 trillion and 800 billion gigabytes!). This explosion represents a challenge, of course (how to store, retrieve, and archive all that data), but also a huge opportunity for enterprises. After all, everything in that sea of data is potentially information — information that could be used to guide business decisions.
Until recently, enterprises that might want to process petabytes of independent data to find business-relevant relationships would need an extremely good reason to invest in such a venture; the costs and time required were prohibitive. But this is quickly changing as enterprises begin to adopt highly distributed processing techniques, most notably MapReduce, a programming framework that has enabled Google, Yahoo, Facebook, MySpace, and others to process their vast data sets.
In its simplest form, MapReduce divides processing into many small blocks of work, distributes them throughout a cluster of computing nodes (typically commodity servers), and collects the results. Supporting highly scalable parallel processing, MapReduce is fast, cheap, and safe. If one node goes down, the lost work is confined to that individual node.
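The divide, distribute, and collect pattern can be illustrated with the classic word-count example. The sketch below runs in a single Python process, so it shows only the shape of the computation, not the distribution or fault tolerance that a real framework provides; the function names are invented for this example and are not the API of Hadoop or any other implementation.

```python
from collections import defaultdict

def map_phase(block):
    """Map: turn one block of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in block.split()]

def shuffle(pairs):
    """Group all values by key so each key can be reduced independently."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def map_reduce(blocks):
    # In a real cluster each block would be mapped on a different node.
    mapped = [pair for block in blocks for pair in map_phase(block)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

blocks = ["the quick brown fox", "the lazy dog", "The fox"]
counts = map_reduce(blocks)
print(counts["the"], counts["fox"])   # 3 2
```

Because each block is mapped independently and each key is reduced independently, the work can be spread across many machines, and a failed block can simply be rerun on another node.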
Google introduced the MapReduce framework in 2004, but there are many implementations today, including Apache Hadoop, Qizmt, Disco, Skynet, and Greenplum. Apache Hadoop is the leading open source implementation. Amazon taps Hadoop to offer MapReduce as an Amazon Web Service. Cloudera, which bills itself as offering “Apache Hadoop for the Enterprise,” is making significant inroads.
Support for MapReduce programming is also delivered in several enterprise software products such as GigaSpaces eXtreme Application Platform, GridGain Cloud Development Platform, IBM WebSphere eXtreme Scale, and Oracle Coherence, to name a few.
The inexorable growth of data is a fact of life. As vendors drive the MapReduce framework into product offerings, we have a new window into what all those petabytes mean. It’s difficult to imagine how, just 30 years ago, businesses could function without the benefit of business intelligence software or even spreadsheets. When MapReduce becomes part of the culture, business strategists in the not-too-distant future may look back on our era in the same way.
— Savio Rodrigues
From InfoWorld.com