Tracking and cracking network performance problems is no easy task. More than a matter of identifying often mystifying bottlenecks, ensuring network efficiency requires an almost preternatural understanding of your organization’s IT operations, as well as a thick skin for withstanding the heat when problems inevitably arise.
To keep your network humming, we’ve outlined 10 areas where tweaking and moderate investment can lead to significant performance gains. After all, as more and more organizations seek to conduct business at wire speed, making sure your systems blaze is essential to the competitive edge your organization needs.
Speed up that WAN
IT has long been caught in the web of leased lines and costly WAN charges. Linking multiple sites with T1 lines, MPLS, and even Frame Relay used to be the only way to guarantee connectivity, but the scene has changed. Rather than curse at your monthly WAN bill, it’s high time to investigate your alternatives.
Shifting to a fibre optic provider might mean a substantial increase in site-to-site bandwidth at a significant cost savings — it’s all a matter of location. Even bringing a few sites into a new WAN design can save enough money to increase bandwidth to the sites that aren’t accessible by the same carrier.
You may wind up running your own VPN between these sites, but if the carrier’s SLA is strong enough and the network is as low-latency as it should be, this won’t be an issue.
Sites outside the footprint of the larger carriers, and thus destined to remain on leased-line connections for the foreseeable future, could benefit from a WAN accelerator: if you can't increase bandwidth to those satellite sites, WAN optimization tools can make better use of the bandwidth you have.
Lose the leased lines
Ditch leased-line Net access. There's bound to be a better, cheaper way to bring high-speed Internet into your environment. Granted, T1 and T3 leased lines provide more of a guarantee against latency, but the cost differential is extraordinary, and the maturity of the alternatives, especially the business-class products, has grown substantially.
Drop old apps
Many businesses cling desperately to elderly application platforms. The result is increased costs, downtime, and fragility of core business systems. Don't migrate an old app to new infrastructure; replace it with something new. The upfront costs may be higher, but they will pale against the long-term costs you'll incur by not severing these ties.
Build a lab
For the cost of a single server, you can build a monster IT test lab. A cheap, dual-CPU, 12-core AMD Istanbul-based 1U server can run several dozen virtual machines in a test scenario for about US$1,500. Using VMware Server on Linux or VMware ESXi, you can avoid software licensing fees, while maintaining a perfectly valid platform for testing anything, from software upgrades to new packages, new operating systems, or even network architectures.
Combine a virtualized server lab with tools such as GNS3, and you can build and test just about any planned network or system infrastructure you want. There's no easier way to determine where resource bottlenecks reside than in a test bed, and when that test bed is as easy to build as a virtual lab, there's no excuse not to go looking for them.
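As a rough illustration, here's a minimal Python sketch of the kind of throughput probe you might run between two lab VMs to see whether a planned design is network-bound. The port, buffer size, and duration are arbitrary placeholders, and a real lab would more likely lean on a purpose-built tool such as iperf; this is just the idea in its simplest form.

# throughput_probe.py -- a crude throughput check between two lab VMs.
# Run it with "server" on one VM and pass the server's IP on the other.
import socket
import sys
import time

PORT = 5201          # arbitrary test port (placeholder)
CHUNK = 64 * 1024    # 64KB send buffer
DURATION = 10        # seconds to transmit

def server():
    with socket.create_server(("", PORT)) as srv:
        conn, addr = srv.accept()
        total, start = 0, time.time()
        with conn:
            while True:
                data = conn.recv(CHUNK)
                if not data:
                    break
                total += len(data)
        elapsed = time.time() - start
        print(f"Received {total / 1e6:.1f} MB from {addr[0]} "
              f"at {total * 8 / elapsed / 1e6:.1f} Mbps")

def client(host):
    payload = b"\x00" * CHUNK
    with socket.create_connection((host, PORT)) as conn:
        end = time.time() + DURATION
        while time.time() < end:
            conn.sendall(payload)

if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: throughput_probe.py server|<server-ip>")
    if sys.argv[1] == "server":
        server()
    else:
        client(sys.argv[1])

If the number that comes back is far below what the virtual NICs are rated for, you've found a bottleneck worth chasing before the design ever reaches production.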
Watch everything
Whether you prefer proprietary or open source tools, there are myriad options for monitoring everything from network latency and throughput to RAM and CPU utilization, SAN performance, and disk queue lengths. You name it, and there's a tool that can watch it.
And when implementing network monitoring, be sure to leave no stone unturned. Monitor the CPU utilization of your routers and switches; watch the error rates on Ethernet interfaces; have your routers and switches log to central syslog servers and implement some form of logfile analysis to alert you when there are reports of anything from IP conflicts to circuits going down. Careful, conscientious implementation and tweaking of your monitoring framework will save enormous amounts of time and energy, especially when it counts the most.
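To make the logfile-analysis idea concrete, here's a bare-bones Python sketch that tails a central syslog file and flags a few trouble patterns. The file path and regular expressions are assumptions; adapt them to whatever your routers and switches actually log, and wire the alert into email or your ticketing system rather than a print statement.

# syslog_watch.py -- tail the central syslog file and flag trouble patterns.
import re
import time

LOGFILE = "/var/log/network.log"          # central syslog target (assumption)
PATTERNS = {
    "possible IP conflict": re.compile(r"duplicate (ip )?address", re.I),
    "circuit or interface down": re.compile(r"(line protocol|interface).*down", re.I),
    "high CPU on device": re.compile(r"cpu.*(9[0-9]|100)%", re.I),
}

def follow(path):
    """Yield new lines appended to the file, like 'tail -f'."""
    with open(path, "r", errors="replace") as fh:
        fh.seek(0, 2)                      # start at the end of the file
        while True:
            line = fh.readline()
            if not line:
                time.sleep(1)
                continue
            yield line

for line in follow(LOGFILE):
    for label, pattern in PATTERNS.items():
        if pattern.search(line):
            print(f"ALERT [{label}]: {line.strip()}")   # replace with a real alert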
Know your apps
Infrastructure performance monitoring will only get you so far. All the computing and storage resources that you are offering up on your network are being consumed by your applications. For too many of us, those applications form something akin to a black hole — we can easily observe their effects on our infrastructure, but it’s often difficult to see inside them to know what’s going on.
Many IT shops are content to let software vendors install and implement the applications on their networks; after all, that’s less work for IT. But be careful — you’re on the hook when the network later slows to a crawl.
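One low-effort way to start peering inside that black hole is to time the application itself rather than just the plumbing underneath it. The Python sketch below, built around a hypothetical health-check URL, simply measures how long an app endpoint takes to answer; trend that number alongside your infrastructure metrics and you'll know whether a slowdown lives in the application or in the network.

# app_probe.py -- time an application endpoint so application slowness can be
# separated from network or infrastructure slowness. The URL is hypothetical.
import time
import urllib.request

URL = "http://appserver.example.local/health"   # placeholder endpoint

def probe(url, timeout=5):
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except Exception as exc:
        return None, str(exc)
    return (time.perf_counter() - start) * 1000, status

elapsed_ms, status = probe(URL)
if elapsed_ms is None:
    print(f"probe failed: {status}")
else:
    print(f"{URL} answered {status} in {elapsed_ms:.0f} ms")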
Small can be better
While it may be possible to fit 2TB of data onto a single 7,200-rpm SATA disk, you'll still be limited to an average randomized transactional throughput of perhaps 80 IOPS (I/O operations per second) per disk. Unless you're storing a mostly static data boneyard, be prepared to be thoroughly unhappy with the performance you'll get out of these new drives compared with twice the number of 1TB disks.
If your applications require a lot of randomized reads and writes — database and email servers commonly fit this bill — you’ll need a lot of individual disks to obtain the necessary transactional performance. While huge disks are great for storing less frequently used data, your most prized data must still sit on disk arrays made up of faster and smaller disks.
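The arithmetic is worth doing before you buy. The Python sketch below uses rule-of-thumb per-disk IOPS figures and a made-up workload, not vendor specifications, to estimate how many spindles a random I/O load really needs once a RAID write penalty is factored in.

# spindle_math.py -- back-of-the-envelope spindle count for a random I/O load.
# Per-disk IOPS figures are rough rules of thumb, not vendor specs.
PER_DISK_IOPS = {"7,200-rpm SATA": 80, "10K SAS": 130, "15K SAS/FC": 180}

required_iops = 5000     # hypothetical database workload (assumption)
write_fraction = 0.4     # 40 percent writes (assumption)
raid_penalty = 4         # RAID 5: each host write costs roughly 4 disk I/Os

# Back-end IOPS = reads + (writes x RAID write penalty)
backend_iops = (required_iops * (1 - write_fraction)
                + required_iops * write_fraction * raid_penalty)

for disk, iops in PER_DISK_IOPS.items():
    disks_needed = int(-(-backend_iops // iops))   # ceiling division
    print(f"{disk:>15}: ~{disks_needed} disks for {required_iops} host IOPS")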
Beware the 10-pound server in the 5-pound bag
Virtualization has to be just about the coolest thing to happen to the enterprise datacenter in a long time. Used incorrectly, however, virtualization technology can shoot you in the foot.
By way of example, let’s say you have a hundred physical servers you’d like to virtualize. They’re all essentially idling on three-year-old hardware and require 1GHz of CPU bandwidth, 1GB of memory, and 250 IOPS of transactional disk performance.
You might imagine that an eight-socket server packed with six-core Xeons and 128GB of RAM would be able to run this load comfortably. After all, you'd have more than 20 percent of CPU and memory headroom, right? Sure, but bear in mind that you're going to need the equivalent of about 140 15,000-rpm Fibre Channel or SAS disks attached to that server to provide the transactional load you'll require. It's not just about compute performance.
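Here's that sizing exercise as a quick calculation, using the numbers from the example above plus an assumed 2.66GHz clock and a rule-of-thumb 180 IOPS per 15,000-rpm disk. The point it makes is that CPU and RAM fit with room to spare while the storage back end becomes the real constraint.

# consolidation_check.py -- sanity-check the 100-VM example: CPU and RAM fit,
# but the disk subsystem is the real limit. Clock speed and per-disk IOPS are
# assumptions, not measurements.
vm_count = 100
per_vm = {"cpu_ghz": 1.0, "ram_gb": 1.0, "iops": 250}

host_cpu_ghz = 48 * 2.66       # 48 cores at an assumed 2.66GHz
host_ram_gb = 128
per_15k_disk_iops = 180        # rough figure for one 15,000-rpm disk

need_cpu = vm_count * per_vm["cpu_ghz"]
need_ram = vm_count * per_vm["ram_gb"]
need_iops = vm_count * per_vm["iops"]

print(f"CPU : need {need_cpu:.0f} GHz, have {host_cpu_ghz:.0f} GHz")
print(f"RAM : need {need_ram:.0f} GB,  have {host_ram_gb} GB")
print(f"Disk: need {need_iops} IOPS, or about "
      f"{-(-need_iops // per_15k_disk_iops)} x 15,000-rpm disks")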
To dedupe or not to dedupe
Deduplication is great for the backup tier. Whether you implement it in your backup software or in an appliance such as a virtual tape library, you can potentially keep months of backups in a near-line state ready to restore at a moment’s notice.
Like most great ideas, however, deduplication has its drawbacks. Chief among these is that deduplication requires a lot of work. It should come as no surprise that NetApp, one of the few major SAN vendors to offer deduplication on primary storage, is also one of the few major SAN vendors to offer controller hardware performance upgrades through its Performance Acceleration Modules. Identifying and consolidating duplicated blocks on storage requires a lot of controller resources. In other words, saving capacity comes at a performance price.
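To see why, consider what dedupe does for every block written: read it, fingerprint it, and look the fingerprint up in an index. The toy Python sketch below does exactly that for a single file; it is a gross simplification of an inline or post-process dedupe engine, but it shows where the controller's CPU cycles and I/Os go.

# dedupe_sketch.py -- a toy illustration of the work behind deduplication:
# every block gets read, hashed, and checked against an index of fingerprints.
import hashlib
import sys

BLOCK_SIZE = 4096                 # 4KB blocks, a common dedupe granularity

def dedupe_stats(path):
    seen = set()                  # index of block fingerprints
    total = unique = 0
    with open(path, "rb") as fh:
        while block := fh.read(BLOCK_SIZE):
            total += 1
            digest = hashlib.sha256(block).digest()
            if digest not in seen:
                seen.add(digest)
                unique += 1
    saved = 1 - unique / total if total else 0
    print(f"{total} blocks, {unique} unique, roughly {saved:.0%} capacity saved")

if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: dedupe_sketch.py <file>")
    dedupe_stats(sys.argv[1])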
Accelerate your backups
If you're backing up directly to tape, it's likely you're underfeeding your tape drives. Very few backup sources can sustain read rates that match a modern tape drive's write speed, and because many tape drives become significantly less efficient when their buffers run empty, this mismatch is the root cause of most backup performance problems.
In other words, the problem isn't your tape drive; it's the storage in the servers you're backing up. There may not be a great deal you can do about this without investing heavily in a large, high-performance intermediate disk-to-disk backup solution, but you have more options if you have a SAN. Depending on the kind of SAN you have and the backup software you run, host backups, which read directly from the SAN rather than over the network, can be a great solution to this particularly vexing problem.
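A quick feed-rate check makes the mismatch obvious. The figures below pair approximate LTO-4 and LTO-5 native write speeds with an assumed sustained read rate from the backup client; when the source can't keep up, the drive stops and restarts (the dreaded shoe-shine) and the backup window balloons.

# tape_feed_check.py -- will the backup source keep the tape drive streaming?
# Drive speeds are approximate native LTO figures; the source rate and job
# size are placeholders for what your own servers actually deliver.
TAPE_NATIVE_MBPS = {"LTO-4": 120, "LTO-5": 140}    # MB/s, uncompressed

source_read_mbps = 45      # assumed sustained read from the backup client
backup_size_gb = 500       # assumed job size

hours = backup_size_gb * 1024 / source_read_mbps / 3600
print(f"{backup_size_gb}GB at {source_read_mbps} MB/s is about {hours:.1f} hours on the wire")

for drive, speed in TAPE_NATIVE_MBPS.items():
    if source_read_mbps >= speed:
        print(f"{drive}: the source keeps the drive streaming")
    else:
        print(f"{drive}: the drive is starved ({source_read_mbps} of {speed} MB/s); "
              "expect shoe-shining and a much longer backup window")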
Matt Prigge is contributing editor to the InfoWorld Test Center, and the systems and network architect for the SymQuest Group, a consultancy based in Burlington, Vt.
Paul Venezia is senior contributing editor of the InfoWorld Test Center and writes The Deep End blog.