
A WHITE PAPER FROM

FUTURE FACILITIES INCORPORATED

Cooling Path Management for the Mission Critical Facility (MCF): A Case Study
A simulation-based methodology to maximize equipment
resilience and cooling energy efficiency

Akhil Docca and Sherman Ikemoto


2/28/2008

Executive summary

The proliferation of modern, high-powered IT equipment is creating a new set of cooling challenges in the data center that can reduce equipment resilience well before the cooling capacity of the room is reached. This is forcing owner/operators to take a conservative approach, overcooling the data center and paying excessive amounts to operate the cooling systems. This paper is a case study of cooling path management, a simulation-based methodology to maximize IT equipment resilience and cooling energy efficiency.
A Case Study for Cooling Path Management
The Cooling Path
Cooling path management is a simulation-based methodology for data center cooling system
management. The methodology is based on the Virtual Facility approach to data center
modeling which takes all of the major elements of the data center into full account, including the
infrastructure, cooling components, cabinets and IT equipment. The cooling path is defined as
the route taken by the cooling air from the ACU supplies, to the inlets of each individual unit of
IT equipment and back to the ACU returns. Cooling paths as they exist in the data center are intricate and complex, but they are intuitive as a management methodology and straightforward to control within the Virtual Facility model to achieve operational objectives for the cooling system.

Cooling path management is the process of stepping through the full route taken by the cooling
air and systematically minimizing or eliminating cooling breakdowns and inefficiencies with the
ultimate goal of meeting the air intake requirement for each unit of IT equipment. Strict adherence to the methodology eliminates the need to know in advance where to look for problems and enables design options to be addressed holistically across the full scale of the facility, from the equipment inlets and exhausts to the room itself.

The cooling path can be split into three primary segments that simplify the methodology as
shown in Figure 1.

Figure 1: The Cooling Path is the route taken by the cooling air from the ACU supply to the
perforated tiles, to the equipment inlets and back to the ACU return

Each segment has its own specific objectives for improvement and associated set of change
options for achieving the objectives. This makes the methodology easy to use, repeatable and
applicable to all combinations of IT equipment, cabinets and rooms. The objectives and
associated change options are shown in Table 1.

Segment 1: ACU supply to perforated tile
  Design objective: meet the flow rate and temperature specification for each perforated tile
  Problem areas: low pressure zones due to low airflow or flow vortices
  Design options: ACU selection and placement; blockage relocation; tile placement; baffles/diffusers

Segment 2: Perforated tile to equipment inlets
  Design objective: meet the flow rate and temperature specification for each piece of equipment
  Problem areas: bypass flow from the perforated tiles to the ACUs
  Design options: blanking; ACU controls; equipment redistribution; hot/cold aisle arrangement

Segment 3: Equipment exhausts to ACU return
  Design objective: meet the flow rate and temperature specification for each piece of equipment
  Problem areas: recirculation from equipment exhausts to inlets
  Design options: ducting; hot/cold aisle containment

Table 1: The cooling path is divided into three segments to simplify the methodology

The cooling paths are influenced by the room configuration, the IT equipment and how they are
arranged relative to each other. Any change to the facility, including ACU settings, cabinet arrangement and equipment placement, will fundamentally change the cooling paths. Cooling path management is therefore appropriate both for the initial design of the room and for configuration management throughout the data center's life span, in order to manage cooling problems or inefficiencies that creep in over time.

More information on cooling path management and the Virtual Facility can be found on the
Future Facilities website (www.futurefacilities.com).

Case Study Description


Cooling path management with the Virtual Facility is best illustrated with a case study. Figure 2
shows a small but representative Virtual Facility that was built in the 6SigmaRoom design
software. The main objective of the methodology is to minimize cooling problems and maximize
cooling system efficiency by managing the cooling path for each unit of IT equipment in
accordance with design objectives. The process should be re-applied for every major change to
the room configuration or IT equipment.

Figure 2: Isometric view of the data center case study, showing PDUs 01-03, ACUs 01-04, the perforated tiles and the raised floor cutout

The case study room is designed to the following specifications:

• 1,000 sq. ft.
• 23 cabinets, 3.75 kW/cabinet
• 90 kW total load, 90 W/sq. ft.
• 4 ACUs, 120 kW total cooling capacity, N+1 redundancy
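As a quick sanity check on these figures, the sketch below is our own illustrative calculation, not part of the original design documentation. It confirms that the N+1 arrangement still covers the IT load with one ACU out of service; note that 23 cabinets at 3.75 kW each is roughly 86 kW, quoted above as approximately 90 kW.

```python
# Illustrative check of the room specification figures listed above.
cabinets = 23
load_per_cabinet_kw = 3.75
room_area_sqft = 1000
acu_count = 4
acu_capacity_kw = 30.0                                   # per ACU (see the ACU section)

it_load_kw = cabinets * load_per_cabinet_kw              # ~86 kW, quoted as ~90 kW
load_density_w_sqft = it_load_kw * 1000 / room_area_sqft

total_cooling_kw = acu_count * acu_capacity_kw           # 120 kW installed
n_plus_1_cooling_kw = (acu_count - 1) * acu_capacity_kw  # 90 kW with one ACU failed

print(f"IT load: {it_load_kw:.1f} kW ({load_density_w_sqft:.0f} W/sq ft)")
print(f"Cooling available with one ACU down: {n_plus_1_cooling_kw:.0f} kW")
assert n_plus_1_cooling_kw >= it_load_kw                 # N+1 covers the load
```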

Three IT equipment configurations were studied to illustrate how cooling path management is
performed as part of the inventory management process. The configurations selected for study
are:

• Initial design or pre-commissioning
• Loaded to 30% of cooling capacity
• Loaded to 80% of cooling capacity

Now, let’s take a detailed look at each section of the facility starting from the raised floor.

Raised Floor:
The raised floor stands 2 feet off the ground and is non-rectangular to accommodate an entrance ramp. The ACUs reside within the room, and the chilled water supply plumbing lies under the raised floor alongside the data and power cables. These must be included in the Virtual Facility model as they have a significant impact on Segment #1 of the cooling path.

Figure 3: Under-floor obstructions: chilled water pipes, cable trays and power cables

Air Conditioner Unit (ACU):


There are 4 ACUs in the room, each capable of providing 30 kW of cooling and 4,200 CFM of airflow at a maximum return air temperature of 20 ºC. The set point for each is 22 ºC. It is important to model the ACU control system properly in the Virtual Facility in order to simulate the actual behavior of the ACUs in the data center. The ACU libraries and generic models available within 6SigmaRoom model the controls properly and can accurately predict the potential cooling and efficiency problems that can occur at any point along the cooling path.
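To illustrate why the control behavior matters, the minimal sketch below models an ACU that modulates its coil against a return-air set point. The control law and coil limit are our own simplifying assumptions for illustration only, not the 6SigmaRoom ACU model; the point is that cool return air (for example, diluted by bypass) throttles the coil and produces warmer supply air, a behavior that becomes important later in the case study.

```python
RHO_AIR = 1.2          # kg/m^3
CP_AIR = 1006.0        # J/(kg K)
CFM_TO_M3S = 1.0 / 2118.88

def acu_supply_temp(return_temp_c, setpoint_c=22.0, airflow_cfm=4200.0,
                    rated_capacity_w=30000.0):
    """Supply air temperature for a return-air-controlled ACU (simplified)."""
    m_dot = airflow_cfm * CFM_TO_M3S * RHO_AIR           # air mass flow, kg/s
    max_delta_t = rated_capacity_w / (m_dot * CP_AIR)    # coil limit, ~12.5 K

    # Assumed proportional control: cooling effort scales with how far the
    # return air sits above the set point, saturating at the coil limit.
    delta_t = min(max(return_temp_c - setpoint_c, 0.0), max_delta_t)
    return return_temp_c - delta_t

# Warm return air: the coil works hard. Cool, bypass-diluted return air: the
# coil throttles back and the ACU supplies relatively warm air.
print(acu_supply_temp(return_temp_c=32.0))   # 22.0 (10 K of cooling delivered)
print(acu_supply_temp(return_temp_c=20.0))   # 20.0 (almost no cooling delivered)
```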

Figure 4: ACUs are rated at 30 kW of cooling at a maximum return air temperature of 20 ºC

Cabinets:
There are a total of 23 cabinets in the room. Twenty of these have been specified to house servers and storage equipment (2000 mm height x 600 mm width x 900 mm depth) and the remaining three have been specified to house networking equipment (2000 mm height x 800 mm width x 900 mm depth). All of the cabinets are on casters and sit 50 mm (2") above the raised floor. The cable penetrations at the rear of the cabinets have cold-locks with a sealing efficiency of 80%. The cabinet libraries available within 6SigmaRoom contain all of the detail necessary to predict, with a high level of accuracy, potential cooling and efficiency problems that can occur along cooling path Segments #2 and #3.

Figure 5: Cabinet and equipment view, showing the cabinet exterior shell and the cabinet interior (mounting rails, equipment and empty U slots)

Cooling Path Design at Three Stages of Life


Initial design or pre-commissioning:
The initial design of the data center is often done without specific knowledge of the equipment that will eventually populate the cabinets. At this stage, the focus of cooling path design is limited to Segment #1, as design of Segments #2 and #3 has little meaning given the lack of information about the IT equipment that will eventually be deployed. Here, a simple 3.75 kW load evenly distributed over the vertical height of each cabinet is used to approximate the fully loaded condition. In reality, this is far from the worst-case thermal loading condition, but it has become the standard configuration for cooling design at the pre-commissioning stage.

Segment #1 is the path from the ACU supplies to the perforated tiles. The design goal is to supply a minimum of 550 CFM of cooling air to each perforated tile; an amount, in theory, more than sufficient to hold the 3.75 kW cabinets to a temperature rise of 15 ºC. The ACUs, tiles and under-floor obstructions (chilled water pipes, data cables and power cables) can be configured in the room to achieve the design goal.
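The tile flow target can be sanity-checked with a simple heat balance. The sketch below is our own back-of-envelope calculation assuming standard air density and specific heat; it shows that roughly 440 CFM is the theoretical minimum needed to hold a 3.75 kW cabinet to a 15 ºC rise, so the per-tile specification carries some margin.

```python
# Back-of-envelope check of the per-tile flow target, assuming standard air
# properties (our assumption; the 3.75 kW and 15 C figures come from the text).
RHO_AIR = 1.2          # kg/m^3
CP_AIR = 1006.0        # J/(kg K)
M3S_TO_CFM = 2118.88

cabinet_load_w = 3750.0
max_rise_c = 15.0

# Airflow needed to carry 3.75 kW with no more than a 15 C temperature rise.
required_cfm = cabinet_load_w / (RHO_AIR * CP_AIR * max_rise_c) * M3S_TO_CFM
print(f"Theoretical minimum: {required_cfm:.0f} CFM per cabinet")        # ~440 CFM
print(f"Margin in a 550 CFM tile target: {550 / required_cfm - 1:.0%}")  # ~25%
```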

The Virtual Facility reveals a breakdown in Segment #1 of the cooling path in the form of a flow rate 100 CFM below specification at tile O8, as shown in Figure 6.

Figure 6: Airflow rate at each perforated tile, coded by color; the flow from tile O8 is below specification by 100 CFM

The under-floor airflow and pressure distribution reveals the root of the problem as shown in
Figure 7.

Figure 7: Velocity vector plot of airflow under the raised floor, showing the airflow vortex and the resulting low pressure zone

The vortex creates a low pressure zone that reduces the airflow at the corresponding perforated tile (O8) to 450 CFM. This vortex has to be minimized or eliminated to bring Segment #1 to specification. Referring to Table 1, under-floor problems can be addressed by four methods. We start by examining the location of ACU 04, which is supplying the air that is swirling under tile O8. After testing a few different locations in the Virtual Facility, a 2-foot shift of ACU 04 to the left (Figure 8) was found to reduce the vortex enough to meet the airflow specification.

Figure 8: ACU 04 was shifted 2 feet to the left to reduce the vortex generated on the downstream side of the floor cutout

This design change resulted in a 30% increase in flow rate at tile O8, from 450 CFM to 586 CFM, above the required 550 CFM specification, as shown in Figure 9.

Figure 9: Reducing the under-floor vortex increases the flow at the corresponding perforated tile from 450 CFM to approximately 600 CFM

More optimization can be done, but given that the specifications are met, cooling path design for the pre-commissioning stage is complete. Without knowledge of the equipment, Segments #2 (tile to equipment inlet) and #3 (equipment exhaust to ACU) of the cooling path are undefined, which makes cooling path design for these segments ineffectual.

In addition to the tile flow rate specification, a hot aisle/cold aisle arrangement is specified to further ensure the resilience of the equipment that will eventually populate the room. As we will see later in the case study, room-side thermal design guidelines like hot aisle/cold aisle can easily be defeated by high-power IT equipment.

Loaded to 30% of cooling capacity


The case study continues at the stage when the first wave of equipment is deployed and cooling
path Segments #2 (from the tiles to the equipment inlets) and #3 (from the equipment exhausts
to the ACUs) come into play for the first time. The room is loaded to 30% of its cooling capacity
as shown in Figure 10.

Figure 10: Data center loaded to 30% of cooling capacity: 27 computing units in 7 cabinets, 36 storage units in 3 cabinets and 4 networking units in 2 cabinets

The Virtual Facility reveals a cooling path problem for a networking unit as shown in Figure 11.

Figure 11: The color plot shows an over-temperature problem (an overheated networking unit) within the networking cabinet

We know from the design of Segment #1 that sufficient air is being supplied to the tile in front of the cabinet; the problem must therefore exist in cooling path Segment #2 or #3 for this piece of equipment. Let's start with Segment #2, where bypass air is the problem to be examined.

6SigmaRoom can calculate a useful metric called the ACU Supply Effectiveness Index to quantify the amount of air that bypasses the equipment and returns directly to the ACUs.

ACU Supply Effectiveness: the overall percentage of cooling air supplied by the ACUs that enters the equipment intake vents (as opposed to returning directly to the ACUs)
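Conceptually, the index is simply the ratio of the air that actually reaches the equipment intakes to the total air supplied by the ACUs. The sketch below illustrates the arithmetic with assumed flow figures; the function name and numbers are ours, since 6SigmaRoom reports the value directly.

```python
def acu_supply_effectiveness(flow_to_equipment_cfm, total_acu_supply_cfm):
    """Fraction of ACU supply air that reaches equipment intakes rather than
    short-circuiting straight back to the ACU returns."""
    return flow_to_equipment_cfm / total_acu_supply_cfm

# Assumed figures for illustration: four ACUs at 4,200 CFM each. A 24% index
# means only ~4,000 of ~16,800 CFM of supplied air actually enters equipment.
total_supply_cfm = 4 * 4200.0
useful_cfm = 0.24 * total_supply_cfm
print(f"{acu_supply_effectiveness(useful_cfm, total_supply_cfm):.0%}")   # 24%
```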

In the current configuration, only 24% of the cooling air supplied enters the equipment. This
value can be confirmed graphically by the airflow patterns in front of the networking cabinets as
shown in Figure 12.

Figure 12: The amount of bypass air is significant in the current configuration; 76% of the cooling air supplied through the perforated tiles completely bypasses the equipment inlets and returns directly to the ACUs

A secondary effect of the bypass is poor operating efficiency for ACUs 01 and 02 as shown in
Figure 13 on the right. The air returns at a relatively cool temperature of 20 ºC which reduces
the cooling effect of ACUs 01 and 02 according to the cooling profile shown in Figure 4. As a
result, ACUs 01 and 02 supply the networking equipment with air that is warmer than desired as
shown in Figure 13 on the left.

Figure 13: Non-uniform supply temperature (left), dictated primarily by the ACU operating efficiency shown on the right; warm air is being supplied by ACUs 01 and 02

Both the effectiveness index and the non-uniform supply temperature strongly indicate that reducing bypass will resolve the cooling problem for the networking unit. At the very least, a significant amount of cooling energy could be saved.

Referring to Table 1, bypass reduction for Segment #2 can be accomplished in this case by four methods:

• Shutting down ACUs 01 and 02 (this option has the added benefit of reducing cooling costs), allowing the remaining ACUs to operate at higher efficiency
• Reducing the supply flow rate from the ACUs by 50% (assuming variable speed drives are in use)
• Shutting off the floor grilles in front of cabinets without any equipment, to force the air to reach the cabinets that do contain equipment
• Redistributing the networking equipment equally across 4 cabinets instead of 2

The options were implemented one at a time to assess their impact individually. Ultimately, all four options had to be implemented to achieve the best possible result for the resilience of the networking equipment without incurring significant cost. The resulting 'overheat' plot in Figure 14 shows that all of the networking units that were originally operating above the specified temperature limit are now below the limit.

Figure 14: The over temperature condition has been eliminated as a result of reducing bypass
air associated with the networking equipment

This result is somewhat counterintuitive in that the over temperature condition was eliminated by
cutting the cooling supply in half. Figure 15 illustrates why this happened.

Figure 15: Segment #2 cooling paths before the proposed changes (high bypass) and after shutting off ACUs 01 and 02 (reduced bypass)

With all the changes made, less cooling air is being supplied, but the supply temperature is
lower by 5 ºC. Also, the ACU supply effectiveness has increased from 24% to 47%, which is a
marked improvement over the original facility layout. These combine to solve the temperature
problem and reduce the cooling system operating cost by 50%.
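A rough comparison, sketched below with the per-ACU flow assumed from the ACU specification, shows why halving the supply did not starve the equipment: the useful airflow reaching the intakes is essentially unchanged, while it arrives colder and is produced by half the number of running ACUs.

```python
# Rough before/after comparison using the figures quoted in the text
# (24% -> 47% effectiveness, supply halved, supply air ~5 C colder).
# The 4,200 CFM per-ACU flow is taken from the ACU specification.
acu_flow_cfm = 4200.0

before_supply_cfm = 4 * acu_flow_cfm          # all four ACUs running
after_supply_cfm = 2 * acu_flow_cfm           # ACUs 01 and 02 shut down

before_useful_cfm = 0.24 * before_supply_cfm  # ~4,030 CFM reaches the intakes
after_useful_cfm = 0.47 * after_supply_cfm    # ~3,950 CFM reaches the intakes

print(f"Useful airflow before: {before_useful_cfm:.0f} CFM")
print(f"Useful airflow after:  {after_useful_cfm:.0f} CFM")
# Nearly the same volume of air reaches the equipment, but it arrives ~5 C
# colder and is moved by half the ACUs, hence the ~50% cost reduction.
```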

Cooling path design summary at 30% loading:

• IT equipment cooling problems can occur in a partially loaded data center

• IT equipment cooling problems are most often associated with cooling path Segments #2 and #3, as Segment #1 is typically within specification when the room is commissioned
• In this case, the problem was too much bypass air in Segment #2
• The problem was fixed by shutting down two ACUs, closing the floor grilles in front of empty cabinets and distributing the networking equipment evenly across the networking row of cabinets

Loaded to 80% of cooling capacity


Our case study continues with the introduction of new equipment that loads the room to 80% of its cooling capacity, as shown in Figure 16. Eighty units have been added and the room power dissipation has increased from 40 kW to 77 kW. All four ACUs are now operational to accommodate the increased equipment load, and the damper settings have been removed from the perforated floor tiles to facilitate maximum airflow to all cabinets.

Figure 16: Data center loaded to 80% of cooling capacity: 58 computing units in 13 cabinets, 84 storage units in 7 cabinets and 8 networking units in 4 cabinets

Thermal problems are expected to reappear with the change, and they do as shown in Figure
17.

Figure 17: Overheat plot of the cabinets, showing new over-temperature problems within one computing cabinet (C5) and one networking cabinet (N6)

The equipment within cabinets C5 and N6 is receiving cooling air that is above its specified maximum. A look inside cabinets C5 and N6 (Figure 18) shows the specific units that are overheating.

Figure 18: Equipment overheat plot for cabinets C5 (Rackable C3106 and IBM Blade Center 1) and N6 (Cisco 6509 networking switches)

A walk-through of the cooling path in the Virtual Facility shows that Segments #1 and #2 are fine for both the blade server and the networking switch. However, the Virtual Facility reveals that Segment #3 problems exist for both units, as a significant amount of hot air is being re-circulated from the exhausts to the inlets, as shown in Figures 19 and 20. At this stage, the concept of effectiveness indices will be used again as a metric to improve the design. This time, the effectiveness index is associated with the equipment inlets.

Equipment Supply Effectiveness: the percentage of cooling air entering a specific piece of equipment that comes directly from an ACU supply (as opposed to the exhaust of a neighboring piece of equipment)

In the current configuration, only 51% of the air entering the IBM Blade Center 1 comes from the ACU supply, as shown in Figure 20.
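A simple two-stream mixing estimate shows why a 51% figure is a problem. The sketch below uses assumed supply and exhaust temperatures purely for illustration; the values are not taken from the simulation.

```python
def inlet_temp_c(effectiveness, supply_temp_c, exhaust_temp_c):
    """Mixed inlet temperature: 'effectiveness' is the fraction of intake air
    drawn straight from an ACU supply; the remainder is recirculated exhaust."""
    return effectiveness * supply_temp_c + (1.0 - effectiveness) * exhaust_temp_c

# Assumed temperatures for illustration: 15 C supply, 35 C recirculated exhaust.
print(inlet_temp_c(0.51, 15.0, 35.0))   # 24.8 C, likely above the intake limit
print(inlet_temp_c(0.90, 15.0, 35.0))   # 17.0 C, comfortably within limits
```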

Figure 20: Hot air from the server across the hot aisle and from its own exhaust is re-ingested

Specifically, the IBM Blade Center 1 ingests warm exhaust air from two locations: a) its own exhaust and b) the HP DL360 G5 server that sits across the hot aisle. In this case, the hot aisle/cold aisle arrangement was defeated by the equipment it was implemented to protect.

The remaining cabinets in row C house HP DL360 G4 servers, which have a different fan characteristic from the G5 servers and blow air into the hot aisle at a lower velocity. Without the additional hot air from across the aisle, the IBM Blade Center servers in the remaining cabinets (C4, C7, C8, C9 and C10) do not overheat.

For the Cisco unit, 52% of its cooling air comes directly from an ACU as shown in Figure 21.

Figure 21: Hot air from the exhaust re-circulates into the intake of the Cisco 6509

Specifically, the Cisco 6509 power supply at the bottom of the unit is ingesting air exhausted from the line cards that sit above it in the same chassis.

The re-circulating exhaust air must be reduced to improve the Segment #3 cooling paths for the
overheating IBM Blade Center 1 and Cisco 6509 units. Referring to Table 1, a simple and cost-
effective way of reducing re-circulation is to install blanking panels in the empty slots of the
cabinet. For the IBM unit, most of the exhaust air is prevented from flowing to the front of the
cabinet where it can be entrained into the inlets as shown in Figure 22.

Figure 22: Blanking solves the cooling problem for the IBM Blade Center 1 by preventing
enough exhaust air from re-circulating to the intake

However, more must be done to solve the cooling problem for the Cisco unit. Here, baffling is installed within the cabinet to segregate the intake and exhaust air, as shown in Figure 23.

Figure 23: Internal baffling solves the cooling problem for the Cisco 6509 by segregating the
intake and exhaust air between the PSU and the Line Cards

The resulting 'overheat' plot in Figure 24 shows that all of the equipment in the data center is operating below the maximum allowable inlet temperature with the room at 80% cooling load. In other words, the full cooling path for each unit of equipment has been designed and implemented properly.

Figure 24: The cooling problems have been fixed for the room at 80% of cooling capacity

It is important to emphasize once again that each individual unit of equipment in the Virtual Facility is modeled explicitly to attain sufficient modeling resolution for full cooling path design. Equipment from different vendors has unique power dissipation and airflow characteristics, such as intake and exhaust size and location, and fan flow rates. Capturing these details is critical to fully defining cooling path Segments #2 and #3 in the Virtual Facility. In this case, lacking these details would have prevented the insight necessary to solve the cooling problems, maximize equipment resilience and improve cooling system efficiency.

Summary Points
• High power dissipation and high-power cooling fans within the IT equipment are driving the need for a new simulation-based methodology called cooling path design
• Room-side design guidelines such as specified tile flow rates and hot aisle/cold aisle arrangements do not ensure resilience for modern IT equipment
• Cooling path design is the systematic improvement of the entire cooling path for every unit of equipment in the data center
• Cooling path design must be performed for every change in inventory or room configuration, as these have a fundamental impact on the cooling path definition for the affected equipment
• The Virtual Facility, with its ability to model and track changes to the inventory explicitly, provides an effective platform for cooling path design
• In this case study, the following objectives were achieved:

Stage: Pre-commissioning
  Problem area: swirling flow under a perforated tile leading to low tile flow
  Design achievement: met the tile flow rate specification for the room

Stage: Loaded to 30% of cooling capacity
  Problem areas: overheating equipment; wasted cooling capacity
  Design achievements: solved the cooling problem for the networking equipment; reduced cooling cost by 50%

Stage: Loaded to 80% of cooling capacity
  Problem area: overheating equipment
  Design achievement: solved a complex cooling problem for the networking and blade server equipment


Authors
Akhil Docca is the Engineering Services Manager at Future Facilities Inc.
Sherman Ikemoto is the General Manager of Future Facilities Inc.

Future Facilities Incorporated. All rights reserved. No part of this publication may be used, reproduced, photocopied, transmitted, or stored in a retrieval system of any nature without the permission of the copyright owner. www.futurefacilities.com
