Introducing Servers

Jarrod Spiga, 21 February 2009, 3:56 PM

The difference between the hardware in a typical server and that found in a PC - and the attributes that maximise a server's uptime.


Servers are the cornerstone of corporate infrastructure, relied upon to provide the services that employees and customers need to perform day-to-day operations in a timely and efficient manner.  The single most important attribute of most enterprise-grade servers is reliability – and a good level of fault tolerance is factored into the design of most servers in order to increase uptime.

Many readers run servers in their own homes. They are the headless Linux boxes in the corner of the study that provide email, web, DNS, routing and file sharing services for the home. While these machines still constitute servers in a raw sense, it would take a brave Technology Officer to put their faith in these types of servers to fulfil the IT requirements of a business.

This guide demonstrates what differentiates business-class servers from the typical white-box server that you can build from off-the-shelf components. It also highlights some of the many factors of a server's design that need to be carefully considered in order to provide reliable services for business.

Form Factor



Servers come in all shapes and sizes. The tower server is designed for organisations or branch offices whose entire infrastructure consists of a server or two. From the outside, they wouldn’t look out of place on or under someone’s desk – but the components that make up the server are often of a higher build quality than workstation components. Tower cases are generally designed to minimise cost whilst providing smaller businesses some sense of familiarity with the design of the enclosure.

For larger server infrastructures, the rack-mount case is used to hold a server's components. As the name suggests, rack-mount servers are almost always installed within racks and located in dedicated data rooms, where power supply, physical access, temperature and humidity (among other things) can be closely monitored. Rack-mount servers come in standard sizes - they are 19 inches wide and have heights in multiples of 1.75 inches, where each multiple is 1 Rack Unit (RU). They are often designed with flexibility and manageability in mind.
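
To put the rack unit arithmetic above into concrete terms, here is a minimal Python sketch. The 1.75-inch figure comes from the text; the 42RU rack in the usage note is only an assumed example size, and the function names are made up for illustration.

```python
RACK_UNIT_INCHES = 1.75  # height of 1RU, as stated in the text
MM_PER_INCH = 25.4


def rack_height_inches(units: int) -> float:
    """Height of a chassis that occupies `units` rack units."""
    if units < 1:
        raise ValueError("a rack-mount chassis occupies at least 1RU")
    return units * RACK_UNIT_INCHES


def rack_height_mm(units: int) -> float:
    """The same height in millimetres."""
    return rack_height_inches(units) * MM_PER_INCH


def servers_per_rack(rack_units: int, server_units: int) -> int:
    """Upper bound on identical servers that fit in a rack of `rack_units`."""
    if server_units < 1:
        raise ValueError("server height must be at least 1RU")
    return rack_units // server_units
```

For example, `rack_height_inches(4)` gives 7.0 inches for a 4RU server, and `servers_per_rack(42, 2)` suggests that an assumed 42RU rack could hold at most 21 2RU servers, before allowing for switches, patch panels or cable management.
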

Lastly, the blade server is designed for dense server deployment scenarios. A blade chassis provides the base power, management, networking and cooling infrastructure for numerous, space-efficient servers. Most of the top 500 supercomputers these days are made up of clusters of blade servers in large data centre environments.

Processors



With the proliferation of quad-core processors in computing's mainstream performance sector, the main difference between servers and workstations comes down to the support for multiple sockets.

Consumer-class Core 2 and Phenom-based systems are built around single-socket designs that feature multiple cores per socket – and cannot be used in multi-socket configurations.

Xeon and Opteron processors, on the other hand, provide interconnects that allow processes to be scheduled across separate processors featuring multiple cores, contributing towards the total processing power of a server. It's not uncommon to see four-core processors in quad-socket configurations in some high-end servers, providing a total of 16 processing cores at upwards of 3.0GHz per core. The scary thing is that six- and eight-core processors are just around the corner.

The other main difference that you see between consumer and enterprise processors is the amount of cache. Xeon and Opteron processors often have significantly larger Level 2 and Level 3 caches to reduce the amount of data that has to be shifted to memory, generally resulting in slightly faster computation times depending on the application.

A server’s form factor will also impact on the type of processor that can be used. For instance, blade servers often need more power-efficient, cooler processors due to their increased deployment density. Similarly, a 4RU server may be able to run faster and hotter processors than a 1RU server from the same vendor.

Memory



While the physical RAM modules that you see in today’s servers don’t differ dramatically from consumer parts, there are numerous subtle differences that provide additional fault-tolerance features.

Most server memory controllers feature Error Checking and Correction (ECC) capabilities, and the RAM modules installed in such servers need to support this feature. Essentially, ECC-capable memory performs a quick parity check before and after a read or write operation to verify that the contents of memory have been read or written properly. This feature minimises the likelihood of memory corruption.
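
Real ECC goes a step beyond a plain parity check: extra check bits let the controller work out which bit flipped and put it right. The toy Hamming(7,4) code below is only an illustration of that idea, not what any particular memory controller implements. Server ECC protects much wider words, and the function names here are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits as a 7-bit codeword (list of bits, positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0 is unused so indexes match bit positions
    code[3], code[5], code[6], code[7] = d
    # Each check bit covers the positions whose index has that bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]


def hamming74_decode(bits):
    """Return (nibble, corrected_position); position 0 means no error seen."""
    code = [0] + list(bits)
    # XOR of the positions of all set bits is zero for a valid codeword,
    # and equals the position of the flipped bit after a single-bit error.
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    if syndrome:
        code[syndrome] ^= 1  # correct the single-bit error
    data = [code[3], code[5], code[6], code[7]]
    nibble = sum(bit << i for i, bit in enumerate(data))
    return nibble, syndrome
```

Flipping any one bit of an encoded word and passing it to `hamming74_decode` should return the original four data bits along with the position that was repaired. A code like this cannot correct two flipped bits in the same word, which is one reason faulty modules still need replacing.
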

The other main difference in memory controller design is how much RAM is supported. The newest Intel-based servers now use a memory controller built onto the processor die, as has been the case with AMD-based systems for years. Even the newest mainstream memory controllers support a maximum of 16GB of RAM. HP have recently announced a “virtualisation-ready” Nehalem-based server design that will support 128GB of RAM.

Many modern servers provide mirrored memory features. A memory mirror essentially provides RAID-1 functionality for RAM – the contents of your system memory are written to two separate banks of identical RAM modules. If one bank develops a fault it is taken offline and the second bank is used exclusively. The memory controller of the server can usually handle this failover without the operating system even being aware of the change, preventing unscheduled downtime of the server.

Hot-spare memory can also be installed in a bank of some servers. The idea is that if the memory in one bank is determined to be faulty, the hot-spare bank can be brought online and used in place of the faulty bank. In this scenario some memory corruption can occur, depending on the operating system and memory controller combination in use. The worst-case scenario usually involves a crash of the server, followed by an automated reboot by server recovery mechanisms. Upon reboot, the memory controller brings the hot-spare RAM online, limiting downtime.

Hot-swappable memory is often used in conjunction with both of these features – giving you the ability to swap out faulty RAM modules without having to shut down the entire server.

Storage Controllers

Drive controllers are dramatically different in servers. Forget on-board firmware-based SATA RAID controllers that provide RAID 0, 1 and 1+0 and consume CPU cycles every time data is read from or written to the array. Server-class controllers have dedicated application-specific integrated circuits (ASICs) and a bucket full of cache (sometimes as much as 512MB) in order to boost the performance of the storage subsystem. These controllers also frequently support advanced RAID levels including RAID 5 and 6.

The controller cache can be one of the most critical components of a server, depending on the application. At my place of employment we have a large number of servers that capture HD-quality video in real time. A separate “ingest” server often pulls this data from the encode server immediately after it has been captured for further processing and transcoding. Having 512MB of cache installed on the drive controller allows data to be pushed out via the network interface before it has been physically written to disk, significantly boosting performance. Testing has revealed that if we reduce the cache size to 64MB, data has to be physically written to disk and then physically read back when the ingest process takes place, placing significant additional load on the server. Finally, consider that most mainstream controllers have no cache whatsoever – the impact on performance in this scenario would probably prevent us from working with HD-quality content altogether.

But what happens if there is a power outage and the data that is in the controller cache has not yet been written to the disk? In order to prevent data loss, some controllers feature battery backup units (BBUs) that are capable of keeping the contents of the disk cache intact for in excess of 48 hours or until power is restored to the server. Once the server is switched on again, the controller commits the data from the cache to the disk array before flushing the cache and continuing with the boot process. No data is lost. BBUs are another feature missing from mainstream controllers.
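
The behaviour described above can be modelled in a few lines of Python. This is a deliberately simplified sketch of a write-back cache, not how any real controller firmware works; the class and method names are invented for illustration, and plain dictionaries stand in for the cache and the disk array.

```python
class WriteBackCache:
    """Toy model of a controller cache that acknowledges writes early."""

    def __init__(self, battery_backed):
        self.battery_backed = battery_backed
        self.dirty = {}  # block number -> data not yet written to disk
        self.disk = {}   # block number -> data committed to disk

    def write(self, block, data):
        # The write is acknowledged as soon as it is in the cache.
        self.dirty[block] = data

    def read(self, block):
        # Recently written data can be served without touching the disk.
        if block in self.dirty:
            return self.dirty[block]
        return self.disk.get(block)

    def flush(self):
        # Commit everything still in the cache to the disk array.
        self.disk.update(self.dirty)
        self.dirty.clear()

    def power_cycle(self):
        # Without a battery, cached writes vanish when the power goes.
        if not self.battery_backed:
            self.dirty.clear()
        # On restart, whatever survived is committed before booting.
        self.flush()
```

With `battery_backed=True`, a block written just before `power_cycle()` ends up in `disk`. With `battery_backed=False`, the same write is silently lost, which is the failure a BBU is there to prevent.
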

External Storage

Any computer chassis has a physical limitation to the number of drives that you can install. This limitation is overcome in enterprise servers by connections to Storage Area Networks (SANs). This is typically accomplished in two ways – via fibre channel or iSCSI interfaces.

iSCSI is generally the cheaper option of the two because data transferred between the SAN and server is encapsulated in frames sent over ubiquitous Ethernet networks, meaning that existing Ethernet interfaces, cabling and switches can be used (aside from the cost of the SAN enclosure itself, the only additional costs are generally an Ethernet interface module for the SAN and software licenses).

On the other hand, fibre channel requires its own fibre-optic interfaces, cabling and switches, which significantly drives up cost. However, having a dedicated fibre network means that bandwidth isn’t shared with other Ethernet applications. Fibre channel presently offers interface speeds of 4Gb/s compared to the 1Gb/s often seen in most enterprise networks. Fibre channel also has less overhead than Ethernet, which provides an additional boost to comparative performance.

Disk Drives



For years, enterprise servers have utilised SCSI hard disk drives instead of ATA variants, and for several reasons. SCSI allows for up to 15 drives on a single parallel channel versus the two on a PATA interface. PATA drives ship with the drive electronics (the circuitry that physically controls the drive) integrated on the drive (hence IDE, Integrated Drive Electronics), whereas SCSI controllers perform this function in a more efficient manner. Many SCSI interfaces provide support for drive hot-swapping, thus reducing downtime in the event of a drive failure. Finally, the SCSI interface allows for faster data transfer rates than could be obtained via PATA, giving better performance, especially in RAID configurations.

However, over the last year, Serial-Attached SCSI (SAS) drives have all but superseded SCSI in the server space in much the same way that SATA drives have replaced their PATA brethren. The biggest problem with the parallel interface was synchronising clock rates on the many parallel connections – serial connections don't require this synchronisation, allowing clock rates to be ramped up and increasing bandwidth on the interface.

SAS drives are the same as SCSI drives in many ways – the SAS controller is still responsible for issuing commands to the drive (there is no IDE), SAS drives are hot-swappable and data transfer over the interface is faster compared to SATA. SAS drives come in both 2.5- and 3.5-inch form factors, 2.5-inch drives proving popular in servers as they can be installed vertically in a 2RU enclosure.

In addition, SAS controllers can support 128 directly-attached devices on a single controller, or in excess of 16,384 devices when the maximum 128 port expanders are in use (however, the maximum amount of bandwidth that all devices connected to a port expander can use equals the amount of bandwidth between the controller and the port expander). In order to support this many devices, SAS also uses higher signal voltages in comparison to SATA, which allows the use of 8m cables between controller and device. Without using higher signal voltages, I'd like to see anyone install 16,384 devices to a disk controller with a maximum cable length of 1 metre (the current SATA limitation).
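
The fan-out and bandwidth-sharing figures above come down to simple arithmetic, sketched here in Python. The two limits of 128 are taken from the text; the 3Gb/s uplink and the 12 busy drives in the usage note are assumed example values, and the function names are hypothetical.

```python
MAX_PORT_EXPANDERS = 128    # from the text
DEVICES_PER_EXPANDER = 128  # from the text


def max_sas_devices():
    """Devices reachable when every port expander is fully populated."""
    return MAX_PORT_EXPANDERS * DEVICES_PER_EXPANDER


def share_per_device(uplink_gbps, busy_devices):
    """Even split of an expander's uplink among devices transferring at once."""
    if busy_devices < 1:
        raise ValueError("at least one device must be transferring")
    return uplink_gbps / busy_devices
```

`max_sas_devices()` reproduces the 16,384 figure quoted above. With an assumed 3Gb/s link between controller and expander, `share_per_device(3.0, 12)` works out to 0.25Gb/s for each of 12 drives transferring at the same time, which is why a heavily populated expander can become a bottleneck.
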

In the next few months, there will be another major advantage to using SAS over SATA in servers: support for multipath I/O. Suitable dual-port SAS drives can then connect to multiple controllers within a server, which provides additional redundancy in the event of a controller failure.

GPUs and Video

One of the areas where enterprise servers are inferior to regular PCs is graphics acceleration. Personally, I’m yet to see a server in a data centre with a PCI-Express graphics adapter, although that’s not to say that it’s not possible to install one in an enterprise server. In general though, most administrators find the on-board adapters more than adequate for server operations.

Networking

Modern-day desktops and laptops feature Gigabit Ethernet adapters, and the base adapters seen on servers are generally no different. However, like most other components in servers, there are a few subtle differences that improve performance in certain scenarios.

In order to provide network fault tolerance, two or more network adapters are integrated on most server boards. In most cases, these adapters are able to be teamed. Like RAID fault tolerance schemes, there are numerous types of network fault tolerance options available, including:

  • Network Fault Tolerance (NFT) – In this configuration, only one network interface is active at any given time, with the rest remaining in a slave mode. If the link to the active interface is severed, a slave interface will be promoted to be the active one. Provides fault-tolerance, but does not aggregate bandwidth.
  • Transmit Load Balancing (TLB) – Similar to NFT, but slave interfaces are capable of transmitting data, provided that all interfaces are in the same broadcast domain. This provides aggregation of transmission bandwidth, but not receive – and also provides fault-tolerance.
  • Switch-assisted Load Balancing (SLB) and 802.3ad Dynamic – provides aggregation of both transmit and receive bandwidth across all interfaces within the team, provided that all interfaces are connected to the same switch. Provides fault-tolerance on the server side (however, if the switch connected to the server fails, you have an outage). 802.3ad Dynamic requires a switch that supports the 802.3ad Link Aggregation Control Protocol (LACP) in order to dynamically create teams, whereas SLB must be manually configured on both the server and the switch.
  • 802.3ad Dynamic Dual-Channel – provides aggregation of both transmit and receive bandwidth across all interfaces within the team and can span multiple switches, provided that they are all in the same broadcast domain and that all switches support LACP.
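
The simplest of the teaming modes above, NFT, can be sketched as a small state machine. This is a conceptual Python model only, not a real teaming driver; the class name and the interface names are made up for illustration, and real drivers differ in details such as whether a repaired link takes over again.

```python
class NftTeam:
    """Toy model of Network Fault Tolerance: one active link, the rest standby."""

    def __init__(self, interfaces):
        self.link_up = {name: True for name in interfaces}
        self.active = interfaces[0]

    def link_down(self, name):
        self.link_up[name] = False
        if name == self.active:
            self._promote()

    def link_restored(self, name):
        self.link_up[name] = True
        # A repaired link only takes over if nothing else is active.
        if self.active is None:
            self.active = name

    def _promote(self):
        # Pick the first standby interface that still has link.
        healthy = [name for name, up in self.link_up.items() if up]
        self.active = healthy[0] if healthy else None
```

With a team of `["eth0", "eth1"]`, calling `link_down("eth0")` moves traffic to `eth1`. Only one interface ever carries traffic, so the team provides fault tolerance without any extra bandwidth, as described in the first bullet above.
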

Just about all server network interface cards (NICs) support Virtual Local Area Network (VLAN) trunking. Imagine that you have two separate networks – an internal one that connects to all devices on your LAN, and an external one that connects to the Internet, with a router in between. In conventional networks, the router needs to have at least two network interfaces – one dedicated to each physical network.

Provided that your network equipment and router support VLAN trunking, your two networks could be set up as separate VLANs. In general, your switch would keep track of which port is connected to which VLAN (this is known as a port-based VLAN), and your router is trunked across both VLANs utilising a single NIC (physically, it becomes a router-on-a-stick). Frames sent between the switch and router are tagged – so that each device knows which network the frame came from or is destined to go to.
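
To show what "tagged" means in practice, here is a minimal Python sketch that inserts and reads a VLAN tag. It assumes the tagging scheme is IEEE 802.1Q, which the text does not name explicitly; the function names are hypothetical, and the frame check sequence that a real NIC would recalculate is ignored.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def add_vlan_tag(frame, vlan_id, priority=0):
    """Insert a 4-byte 802.1Q tag after the destination and source MACs."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable VLAN IDs run from 1 to 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority is a 3-bit value")
    # Tag control information: 3-bit priority, 1-bit DEI (0), 12-bit VLAN ID.
    tci = (priority << 13) | vlan_id
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


def read_vlan_id(frame):
    """Return the VLAN ID of a tagged frame, or None if it is untagged."""
    if len(frame) < 16:
        return None
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return tci & 0x0FFF
```

The switch and the router-on-a-stick both look at these four bytes: the router uses the VLAN ID to decide which logical interface a frame arrived on, and the switch uses it to decide which ports may receive the frame.
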

VLANs operate in the same manner as physical LANs – but network reconfigurations can be made in software as opposed to forcing a network administrator to physically move equipment.

Because of the sheer amount of data received on Gigabit and Ten-Gigabit interfaces, it can become exhausting for the CPU to process the TCP headers of every Ethernet frame sent its way. As a rough rule of thumb, it takes around 1GHz of processor power to transmit TCP data at Gigabit Ethernet speeds.

As a result, TCP Offload Engines are often incorporated into server network adapters. These integrated circuits process TCP headers on the interface itself instead of pushing each frame off to the CPU for processing. This has a pronounced effect on overall server performance in two ways – not only does the CPU benefit from not having to process this TCP data, but less data is transmitted across PCI Express lanes toward the Northbridge of the server. Essentially, TCP Offload Engines free up resources in the server so that they can be assigned to other data transfer and processing needs.
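
One small piece of the per-packet work that an offload engine takes away from the CPU is checksum calculation. The Python sketch below implements the 16-bit ones' complement Internet checksum used by TCP. It is only an illustration of the kind of arithmetic involved, with a hypothetical function name; a real TCP checksum also covers a pseudo-header built from the IP addresses, which is left out here.

```python
def internet_checksum(data):
    """16-bit ones' complement checksum over `data` (bytes)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

A loop like this has to touch every byte of every segment, which hints at why pushing Gigabit traffic through the CPU is expensive and why doing the work on the adapter frees up cycles.
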

The final difference that you see between server NICs and consumer ones is that the buffers on enterprise-grade cards are usually larger. Part of the reason is the additional features mentioned above but there is also a small performance benefit to be gained in some scenarios (particularly inter-VLAN routing).

Power Supplies

One of the great features of ATX power supplies is standardisation. ATX power supplies are always the same form factor and feature the same types of connectors (even if the number of those connectors can vary). But while having eight 12-volt Molex connectors is great in a desktop system, so many connectors are generally not required in a server, where the cable clutter could cause cooling problems.

Power distribution within a server is well thought out by server manufacturers. Drives are typically powered via a backplane instead of individual Molex connectors and fans often drop directly into plugs on the mainboard. Everything else that requires power draws it from other plugs on the mainboard. Even the power supplies themselves have PCB-based connectors on them. All of this is designed to help with the hot-swapping of components in order to minimise downtime.

Most servers are capable of handling redundant power supplies. The first advantage here is if one power supply fails, the redundant supply can still supply enough juice to keep the server running. Once aware of the failure, you can then generally replace the failed supply while the server is still running.

The second advantage requires facility support. Many data centres will supply customer racks with power feeds on two separate circuits (which are usually connected to isolated power sources). Having redundant power supplies allows you to connect each supply up to a different power source. If power is cut to one circuit, your server remains online because it can still be powered by the redundant circuit.

Server Management

Most servers support the Intelligent Platform Management Interface (IPMI), which allows administrators to manage aspects of the server and to monitor server health – including when the server is powered off.

For example, say that you have a remote Linux server that encountered a kernel panic – you could access the IPMI on the server and initiate a reboot, instead of having to venture down to the data centre, gain access and press the power button yourself. Alternatively, say that your server is regularly switching itself on and off every couple of minutes, too short a time for you to log in and perform any kind of troubleshooting. By accessing the IPMI, you could quickly determine that a fan tray has failed and that the server is automatically shutting down once temperature thresholds are exceeded. These are two scenarios where having access to IPMI has saved my skin.
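
As a rough sketch of the two scenarios above, the Python below wraps the open-source ipmitool command-line utility. The article does not name a tool, so ipmitool, the host name and the user name are all assumptions, and the server's management controller must be reachable over the network for any of this to work.

```python
import os
import subprocess


def ipmitool_command(host, user, *args):
    """Build an ipmitool command line for a remote management controller."""
    # -E tells ipmitool to read the password from the IPMI_PASSWORD
    # environment variable instead of the command line.
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]


def run_ipmi(host, user, password, *args):
    """Run ipmitool and return its output (requires ipmitool to be installed)."""
    env = dict(os.environ, IPMI_PASSWORD=password)
    result = subprocess.run(
        ipmitool_command(host, user, *args),
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical usage against an assumed management address:
# run_ipmi("bmc.example.net", "admin", secret, "chassis", "power", "cycle")
# print(run_ipmi("bmc.example.net", "admin", secret, "sdr", "type", "Fan"))
```

The first call would power-cycle a hung server, and the second would list fan sensor readings, which is the sort of output that points to a failed fan tray.
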

Many servers also incorporate watchdog timers. These devices perform regular checks on whether the operating system on the server is responding and will reboot the server if the response time is greater than a defined threshold (usually 10 minutes). These devices can often minimise downtime in the event of a kernel panic or blue-screen.
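
The logic of a watchdog timer is simple enough to sketch in Python. This is a conceptual model rather than a driver for any real watchdog hardware; the class name is invented, and the clock is passed in as a parameter purely so that the example can be tested without waiting.

```python
import time


class WatchdogTimer:
    """Toy watchdog: expires if no heartbeat arrives within the timeout."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self.last_heartbeat = clock()

    def heartbeat(self):
        # Called regularly by a healthy operating system.
        self.last_heartbeat = self.clock()

    def expired(self):
        # True once the system has been silent for longer than the timeout;
        # real hardware would reset the server at this point.
        return self.clock() - self.last_heartbeat > self.timeout
```

With the 10-minute threshold mentioned above, `WatchdogTimer(600)` would report `expired()` as true once more than 600 seconds pass without a call to `heartbeat()`, which is what happens after a kernel panic.
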

Finally, most server vendors will also supply additional Simple Network Management Protocol (SNMP) agents and software that allow administrators to monitor and manage their servers more closely. The agents often supplied provide just about every detail about the hardware installed that you could ever want to know – how long a given hard disk drive has been operating in the server, the temperature within a power supply or how many read errors have occurred in a particular stick of RAM. All of this data can be polled and retrieved with an SNMP management application (even if your server provider doesn't supply you with one of these, there are dozens of GPL packages available that utilise the Net-SNMP project).

The future...

All of the points detailed in this article highlight the differences between today’s high-end consumer gear (which is typically used to make the DIY server) and enterprise-level kit. However, emerging technologies will continue to have an impact on both the enterprise and consumer markets.

As the technology becomes more refined, solid-state drives (SSDs) will start to emerge as a serious alternative to SAS hard-disk drives for some server applications. Initially, they’ll most likely be deployed where lower disk capacity and lower disk access times are required (such as database servers). When the capacity of these drives increases, they’ll start to become more prominent – but will probably never replace the hard-disk drive for storing large amounts of data.

The other big advantage to using SSDs is that RAID-5 failures become less of an issue. RAID 5 arrays can tolerate the failure of a single drive in the array. If, during the time that it takes to replace the faulty drive and rebuild the array, a second drive fails or an unrecoverable read error (URE) occurs on one of the surviving drives in the array, the rebuild will fail and all data on the array will be lost. SSDs shouldn't exhibit UREs – once data is written to the disk, it's stored physically, not magnetically. A good SSD will also verify the contents of a block, including whether it can be read, before the write operation is deemed to have succeeded. Thus, if the drive can't write to a specific block, that block should be marked as bad and a reallocation block should be brought online to take its place. Your SNMP agents can then inform you when the drive starts using up its reallocation blocks, indicating that a drive failure will soon happen. In other words, you'll be able to predict when an SSD fails with more certainty, which could give RAID-5 a new lease of life.
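
To get a feel for the rebuild risk described above, the Python sketch below estimates the chance of hitting at least one URE while reading every surviving drive in full. The default rate of one error per 10^14 bits read is an assumption of the kind often quoted on hard-drive spec sheets, not a figure from this article, and the model treats every bit read as an independent trial.

```python
import math


def rebuild_ure_probability(surviving_drives, drive_bytes, ure_per_bit=1e-14):
    """Chance of at least one URE while reading all surviving drives in full."""
    bits_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))
```

For a four-drive RAID 5 array of 1TB disks, `rebuild_ure_probability(3, 10**12)` comes out at a little over 20 percent under these assumptions. The risk grows with both drive capacity and drive count, which is why a storage medium that avoids UREs would make RAID 5 rebuilds far less nerve-racking.
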

Moving further forward, the other major break from convention in server hardware will most likely be the use of more application-specific processor units instead of the CPU as we know it today. There's already some movement in this area – Intel's Larrabee is an upcoming example of a CPU/GPU hybrid, and the Cell Broadband Engine Architecture (otherwise known as the Cell architecture) that is used in Sony's PlayStation 3 is also used in the IBM Roadrunner supercomputer (the first to sustain performance over the 1 petaFLOPS mark).

Next: Visual tour of the insides of a server.



Comments


Peter Allport (New user):

I've just purchased my first copy of APC while overseas and I missed out on the article "What makes a server a server?". Is there a link to this? I'm eager to learn more about it. Also, I found your magazine to be very good, with all the technical advice that we in the Cook Islands are somewhat limited in.

30 March 2009, 9:55 AM

obsidianjaguar (New user):

This is really an amazing article! I have been working in tech support for many years and am at the stage where I want to move towards system admin. I have honestly never found an article that breaks it all down in such a readable and accessible way. Thank you so very, very much for providing this level of info: advanced but accessible, without dumbing it down. Truly a gift.

15 April 2009, 9:20 AM
