
High Performance Computing Platforms Based on Intel Architecture
Scientists, engineers, and data analysts need ever-higher performance for their
technical computing applications to speed time to results, handle today's unprecedented growth in
data volumes, and improve the accuracy and precision of their
modeling and simulation applications. The share of TOP500 systems built on Intel architecture
continues to increase: it accounts for 97 percent of the more recent additions to the list.
Intel architecture provides comparable value across the full range of technical
computing needs, from entry-level workstations to high-performance server
clusters.

Intel provides higher value through balanced platform solutions that help
to improve both performance and cost models.
The performance you obtain depends on the degree of parallelism in your software and on the
parallel execution resources of the CPU.
Intel processors also have wide vector units, so they can execute a single instruction simultaneously
across multiple data points.
These parallel execution resources can dramatically improve performance, but some applications
require more parallelism than others for best performance.
Intel architecture gives you the flexibility to add Intel Xeon Phi™ coprocessors where more
parallelism is needed. Intel processors and coprocessors include a variety of technologies
that help to improve parallel throughput, overall performance, and security,
while reducing energy consumption.
These technologies can increase performance by as much as two times.
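To make the idea of a vector unit concrete, here is a minimal sketch in C of a loop that a compiler can typically map onto vector instructions. The function name `saxpy` and its signature are illustrative choices, not something taken from the Intel material, and whether the loop is actually vectorized depends on the compiler and the optimization flags used.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for every element.
 * Every iteration is independent, so one vector instruction can
 * process several consecutive elements at once.
 * `restrict` promises the compiler that x and y do not overlap, which
 * removes the aliasing doubt that often blocks auto-vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Calling `saxpy(4, 2.0f, x, y)` on two four-element arrays updates `y` in place. The same source runs correctly with or without vectorization; only the speed changes.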

Types of processors to choose from:


Table 1. Optimized Processors for the Full Range of Technical Workloads (in the PDF)

Technical differences between:

Intel Xeon processor E5 v3 product family:


The Right Processor for Most Technical Applications
The Intel Xeon processor E5 v3 family provides the best balance of per-core
performance, parallelism, energy efficiency, and cost for the vast majority of
technical applications.

Intel Xeon Phi™ Coprocessor 7100


A Powerful Boost for Highly Parallel Workloads, without Recoding Your Applications
Add Intel Xeon Phi coprocessors to your Intel Xeon processor E5 v3 family-based workstations and servers to increase performance for highly parallel code.
These powerful coprocessors provide:
A supercomputer in every chip, with up to 1.2 teraflops of performance [3,7,12].
More performance per watt [3,15].
Compatibility with x86 software.

Table 2. Intel Processor Technologies to Increase Performance and Security.
(Two or three examples from this table that sound really powerful could be included here.)

Other technologies provided by Intel:

Intel offers two solutions that address the full range of networking needs for technical
computing: 1) 10/40 Gigabit Intel Ethernet technology, included in the Intel Ethernet Controller
XL710, provides a flexible, high-performance solution for connecting nodes in a loosely coupled
cluster or for connecting a workstation or cluster to a site network; and 2)
Intel True Scale Fabric is a purpose-built interconnect solution for HPC clusters, designed
to support the most demanding performance requirements.

Right Storage: (Table 4. Choose the Right Storage)

Optimize Your Software for Fast, Parallel Execution with Intel Parallel Studio XE
Professional Edition and Cluster Edition
High-Performance Compilers, Libraries, Parallel Models, and Analysis Tools:
Intel Fortran and C++ Compilers help to boost application performance through
explicit vectorization.
Intel Math Kernel Library provides high performance for linear algebra, Fast Fourier
Transforms (FFT), vector math, and statistics functions on the latest Intel architectures.
Standards-based parallel models such as OpenMP 4.0.
Intel MPI Library provides sustained scalability with low latencies.
Powerful analysis tools help to accelerate software development (Intel Advisor XE,
Intel Inspector XE, Intel VTune™ Amplifier XE, Intel Trace Analyzer and
Collector).
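As an illustration of the OpenMP 4.0 parallel model mentioned in the list above, the following sketch combines thread-level and vector-level parallelism in a single directive. The function `dot` is a hypothetical example written for these notes, not code from the Intel material. OpenMP must be enabled at compile time (for example `-fopenmp` with GCC or Clang); without it the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of two vectors of length n.
 * `parallel for` splits the iterations among threads, `simd` asks the
 * compiler to vectorize each thread's chunk, and `reduction(+:sum)`
 * gives every thread a private partial sum that is added together
 * at the end of the loop. */
double dot(long n, const double *x, const double *y)
{
    assert(x != NULL && y != NULL);
    double sum = 0.0;
    #pragma omp parallel for simd reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Because the partial sums may be combined in a different order from one run to the next, floating-point results can differ in the last bits for large inputs; this is a general property of parallel reductions.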

Instruction Optimizations

Enabling Auto-vectorization. The paper focuses on the performance of 3D simulations.


Compiling with the -vec-report3 and -vec-report5 options to obtain vectorization reports.

Improving Auto-vectorization and Cache Behavior. Even though the vast majority of the
loads encountered during our stencil will be contiguous, the compiler may choose to use a
sequence of relatively expensive gather/scatter operations instead of simple packed loads.
Two alternative optimizations remove the edge-case handling that causes this: 1) peeling the
first and last iterations from the innermost loop, such that all of the remaining iterations
are known not to handle any edge cases; or 2) introducing "halo" or "ghost" cells (i.e. a layer
of additional cells around the grid which store values representative of the boundary condition).
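The halo-cell approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the paper: it assumes a cubic grid with `n` interior cells per side, a halo one cell wide, and a simple 7-point averaging stencil. The names `idx` and `stencil_sweep` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Linear index into a cubic grid that stores n interior cells per side
 * plus one halo cell on each face, so each side holds n + 2 cells. */
static size_t idx(size_t n, size_t x, size_t y, size_t z)
{
    size_t side = n + 2;
    return (z * side + y) * side + x;
}

/* One sweep of a 7-point averaging stencil over the interior cells.
 * The halo cells of `in` must already hold the boundary values, so the
 * loop body contains no edge-case branches. Halo cells of `out` are
 * left untouched. */
void stencil_sweep(size_t n, const double *restrict in, double *restrict out)
{
    assert(n > 0 && in != NULL && out != NULL);
    for (size_t z = 1; z <= n; ++z) {
        for (size_t y = 1; y <= n; ++y) {
            /* Unit-stride inner loop: neighbours in x are contiguous,
             * which favours packed loads over gather operations. */
            for (size_t x = 1; x <= n; ++x) {
                out[idx(n, x, y, z)] =
                    (in[idx(n, x, y, z)] +
                     in[idx(n, x - 1, y, z)] + in[idx(n, x + 1, y, z)] +
                     in[idx(n, x, y - 1, z)] + in[idx(n, x, y + 1, z)] +
                     in[idx(n, x, y, z - 1)] + in[idx(n, x, y, z + 1)]) / 7.0;
            }
        }
    }
}
```

Both arrays must be allocated with `(n + 2) * (n + 2) * (n + 2)` elements. The trade-off is a small amount of extra memory in exchange for a branch-free inner loop.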

Such halo cells are a commonly used design pattern in many high performance
applications. For example, adding halo cells to an array of 1024x1024x1024
doubles increases its 8 GB footprint by only 48 MB.
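The 48 MB figure can be checked with a short calculation, assuming a halo one cell wide on every face (the text does not state the halo width; this is the value consistent with the quoted number). The helper `halo_overhead_bytes` is a hypothetical name used only for this check.

```c
#include <assert.h>
#include <stdint.h>

/* Extra bytes needed when a cubic grid of n cells per side is padded
 * with `halo` additional cells on every face.
 * The padded side length is n + 2 * halo. */
uint64_t halo_overhead_bytes(uint64_t n, uint64_t halo, uint64_t elem_size)
{
    assert(n > 0 && elem_size > 0);
    uint64_t side = n + 2 * halo;
    uint64_t interior = n * n * n;
    uint64_t padded = side * side * side;
    return (padded - interior) * elem_size;
}
```

For `n = 1024`, `halo = 1`, and 8-byte doubles, the padded grid has 1026^3 = 1,080,045,576 cells against 1024^3 = 1,073,741,824 interior cells, so the overhead is 6,303,752 cells, or 50,430,016 bytes, which is roughly 48 MB on top of the 8 GB array.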
These are some of the optimizations that allow the compiler to exploit these CPUs most
effectively.
Other examples of optimizations that can be applied when running applications are
the following:
