Introduction:
What is a Data Center?
An overview of the complexity of a data center; Definition; Examples; Its main components; Other
components and resources; Data center tiers; An overview of the course, its references, and its logistics.
Data center
A data center is a facility that houses
information storage and other physical information technology (IT)
resources for processing, communicating, and storing information.
Visit: https://www.google.com/about/datacenters/inside/streetview
to take a tour of Google's data center
Microsoft has more than 1 million servers, according to CEO Steve Ballmer (July 2013)
Discussion and exercises
Why centralize computing resources in a data center?
Relate this to the fact that many companies have geographically
distributed data centers.
List the (main) types of applications provided by a data center.
Is it justified for a hospital data center (fewer than 100 servers) to be Tier
IV while we find hosting data centers (more than 1,000 servers)
at Tier II or III?
Recommended reading
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale
Machines
Luiz André Barroso and Urs Hölzle, 2009
02 IT Infrastructure
Information Storage
Growth of data and of the importance of information; Types of data; Evolution of
storage technologies; Data center structure and requirements; Information Lifecycle;
Information and data
Information: increasingly important
Exponential growth in the importance and volume of information,
and in the corporate world's dependence on it
Consequently, the challenges of protecting
and managing data also increase
Exponential growth
http://www.computerworld.com/s/article/9217988/World_s_data_will_grow_by_50X_in_next_decade_IDC_study_predicts
Computerworld - In 2011 alone, 1.8 zettabytes (or 1.8 trillion gigabytes) of data will be created, the equivalent to every
U.S. citizen writing 3 tweets per minute for 26,976 years. And over the next decade, the number of servers managing
the world's data stores will grow by ten times.
http://www.emc.com/leadership/digital-universe/2014iview/executive-summary.htm
Growth: example 1
10,000,000,000 photos
2-3 Terabytes of photos are being uploaded
to the site every day
Serve over 15 billion photo images per day
Photo traffic now peaks at over 300,000
images served per second
Growth: example 2
England: one surveillance camera
for every 14 citizens
4 million cameras recording
images every day
Do you have, or can you find, other
examples on the Internet?
Challenge
The storage challenge:
Store, protect, optimize, and leverage* this huge and growing
quantity of data
*leverage: think about how storage supports the ability to generate information from the data
Types of data
Structured
vs.
Unstructured
Big Data:
New challenges for data storage
in information centers
Storage devices
Storage devices vary according to the type of data, the
speed at which the data is created and used, and the capacity.
Evolution
Evolution of storage devices:
from non-intelligent internal storage to
intelligent networked storage.
Redundant Array of Independent Disks (RAID)
Direct-attached storage (DAS)
Storage area network (SAN): a dedicated, high-performance Fibre Channel (FC) network that facilitates block-level
communication between servers and storage.
Network-attached storage (NAS): dedicated storage for
file-serving applications. Unlike a SAN, it connects to an existing
communication network (LAN) and provides file access to
heterogeneous clients.
Internet Protocol SAN (IP-SAN): one of the latest evolutions in
storage architecture, IP-SAN is a convergence of technologies
used in SAN and NAS.
Typical architecture
A typical data center processing architecture using a storage area network
(SAN)
Key characteristics of a DC
ILM process
ILM benefits
Improved utilization: tiered storage platforms
Low costs
Simplified management: processes, tools and automation
Discussion and exercises
Does exponential growth of data and of data centers imply an equal increase in
IT professionals and resources ($) in the coming years?
Consider the data from a sale at a supermarket checkout. Is the value of that information the
same over time (for example, in the first days, after months, and after a year)?
Name features or capabilities you would expect from an ILM automation tool.
In your opinion, which type of data, structured or unstructured, appears to be growing faster
today, and why?
What advantages do you see in networked storage over internal storage?
Recommended reading
Chapter 1
Information Storage and Management: Storing, Managing, and Protecting Digital Information in
Classic, Virtualized, and Cloud Environments
2nd Edition, edited by Somasundaram Gnanasundaram and Alok Shrivastava
03 IT Infrastructure
Storage Environment
Main components of hosts and storage; Connectivity types: PCI, IDE/ATA, SCSI, etc.;
Disk drive components; Disk drive performance; File systems; LVM (Logical
Volume Manager)
Main components
Application: A computer program that provides the logic
for computing operations
Database management system (DBMS): Provides a
structured way to store data in logically organized tables
that are interrelated
Host or compute: A computing platform (hardware,
firmware, and software) that runs applications and
databases
Network: A data path that facilitates communication
among various networked devices
Storage: A device that stores data persistently for
subsequent use.
Components of the storage environment
CPU
Storage
Disk device and internal memory
I/O device
Host to host communications, Network Interface Card (NIC)
Host to storage device, Host Bus Adapter (HBA)
Operating system
Resides between the applications and the hardware
Controls the environment
File System
A file is a collection of related records or data stored as a unit
A file system is a hierarchical structure of files
Examples: FAT32, NTFS, UNIX FS, EXT2/3, and HDFS
Connectivity
Protocols define a format for
communication between sending
and receiving devices
Tightly connected entities such as central processor to RAM, or storage buffers to controllers (example PCI)
Directly attached entities connected at moderate distances such as host to storage (example IDE/ATA)
Network connected entities such as networked hosts, NAS or SAN (example SCSI or FC)
Connectivity
PCI (Peripheral Component Interconnect) is used as a local bus system
It interconnects the microprocessor and attached devices, and supports Plug and Play
PCI is 32/64 bit, with a throughput of 133 MB/s
PCI Express is an enhanced version of the PCI bus with higher throughput and clock speed
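The 133 MB/s figure for PCI can be checked with simple arithmetic: a parallel bus moves one bus-width of data per clock cycle. The sketch below assumes a 32-bit bus clocked at roughly 33.33 MHz, which is the usual figure for conventional PCI but is not stated in these slides; the function name is illustrative only.

```python
def parallel_bus_throughput_mb_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Peak throughput in MB/s, assuming one transfer per clock cycle."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * clock_mhz


# Assumed clock rates (not given in the slides): 33.33 MHz and 66.66 MHz.
print(f"32-bit @ 33.33 MHz: {parallel_bus_throughput_mb_s(32, 33.33):.0f} MB/s")
print(f"64-bit @ 66.66 MHz: {parallel_bus_throughput_mb_s(64, 66.66):.0f} MB/s")
```

This is a peak value; sustained throughput on a shared bus is lower in practice.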
Storage Media
Magnetic Tape
Low cost solution for long term data storage
Limitations
Sequential data access, Single application access at a time, Physical wear and tear and Storage/retrieval overheads
Optical Disks
Popularly used as distribution medium in small, single-user computing environments
Write once and read many (WORM): CD-ROM, DVD-ROM
Limited in capacity and speed
Disk Drive
Most popular storage medium with large storage capacity
Random read/write access
Ideal for performance intensive online application
Solid-state media or flash drives
Expensive
No moving parts, like the integrated circuits and motherboards in computers
Seek Time
Rotational Latency
Approx. 5.5 ms for a 5400-rpm drive, 2.0 ms for a 15000-rpm drive
Which is larger?
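The rotational latency figures quoted for 5400-rpm and 15000-rpm drives follow from taking the average latency as the time of half a rotation. A minimal sketch of that arithmetic (the function name is illustrative):

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: the time taken by half a rotation."""
    ms_per_rotation = 60_000.0 / rpm  # 60,000 ms in one minute
    return ms_per_rotation / 2.0


for rpm in (5400, 15000):
    print(f"{rpm} rpm: {average_rotational_latency_ms(rpm):.2f} ms")
```

The 5400-rpm result is about 5.56 ms, which the slide rounds to approximately 5.5 ms.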
Utilization vs. performance
Consider a disk I/O system in which I/O requests arrive at a rate of 100 I/Os per second. The service time, Rs, is 4 ms.
Utilization of the I/O controller: U = a × Rs, where a is the arrival rate
Total response time: R = Rs / (1 - U)
Repeat the calculation with the service time doubled (Rs = 8 ms):
100 IOPS means one request every 10 ms
U = 8 ms / 10 ms = 0.8 = 80%
R = 8 ms / (1 - 0.8) = 40 ms
Average queue size = U² / (1 - U) = 3.2
Time spent in queue = U × R = 32 ms
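The utilization and response-time formulas above can be sketched in a few lines of Python. The queue size and time-in-queue relations used here, U² / (1 - U) and U × R, are the usual single-queue approximations; they are an assumption in the sense that the slides only show their numeric results (3.2 and 32 ms). The function and key names are illustrative.

```python
def disk_io_metrics(arrival_rate_iops: float, service_time_ms: float) -> dict:
    """Utilization, response time and queue figures for a single I/O controller."""
    arrival_per_ms = arrival_rate_iops / 1000.0
    utilization = arrival_per_ms * service_time_ms           # U = a * Rs
    if utilization >= 1.0:
        raise ValueError("utilization must stay below 1 for the queue to be stable")
    response_time_ms = service_time_ms / (1.0 - utilization)  # R = Rs / (1 - U)
    return {
        "utilization": utilization,
        "response_time_ms": response_time_ms,
        "avg_queue_size": utilization ** 2 / (1.0 - utilization),
        "time_in_queue_ms": utilization * response_time_ms,
    }


for rs in (4.0, 8.0):
    m = disk_io_metrics(100, rs)
    print(f"Rs = {rs} ms: U = {m['utilization']:.0%}, R = {m['response_time_ms']:.2f} ms")
```

Doubling the service time from 4 ms to 8 ms raises utilization from 40% to 80%, and the response time grows from about 6.67 ms to 40 ms, far more than double.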
Discussion and exercises
Give examples of PCI and SCSI connections.
A database requires a 2 TB disk, but the available disk drives are only 500 GB each.
Which logical component of the system can be used to solve this problem, and how?
Does a 500 GB disk really have 500 GB of usable space?
A system uses 10 disks of 500 GB each and has been showing I/O performance problems
(high response time). With only additional disk volumes available, how would you solve this
problem?
Change the disk performance calculation example to 3000 IOPS. What response time and
queue size do you obtain?
Recommended reading
Chapter 2
Information Storage and Management: Storing, Managing, and Protecting Digital Information in
Classic, Virtualized, and Cloud Environments
2nd Edition, edited by Somasundaram Gnanasundaram and Alok Shrivastava
03 IT Infrastructure
RAID
MTBF; RAID Protection; Mirroring and Parity; RAID levels; write penalty
RAID
[Figure: a host, a RAID controller, and the hard disks of a RAID array]
RAID: SW vs. HW
Hardware (usually a specialized disk controller card)
Best choice!
o Controls all drives attached to it
o Array(s) appear to host operating system as a regular disk drive
o Provided with administrative software
Software
o Runs as part of the operating system
o Performance is dependent on CPU workload
o Does not support all RAID levels
RAID levels
Disk Stripes
RAID 3, 4
Stripes data for high performance and uses parity for improved fault tolerance. One drive is dedicated to
parity information. If a drive fails, its data can be reconstructed using the data on the parity drive.
For RAID 3, data reads and writes are done across the entire stripe.
This provides good bandwidth for large sequential data
access such as video streaming.
For RAID 4, data reads and writes can be done independently on
a single disk.
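Parity-based reconstruction can be illustrated with XOR, the operation commonly used for RAID parity. The sketch below is a toy model with three data strips and one parity strip held in memory; the strip contents and helper name are made up for illustration and do not represent any controller's implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data strips (illustrative contents) and their parity strip.
data_strips = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_strips)

# Simulate losing the second drive: rebuild its strip from the survivors plus parity.
surviving = [data_strips[0], data_strips[2], parity]
rebuilt = xor_blocks(surviving)

print(rebuilt == data_strips[1])
```

Because XOR is its own inverse, XOR-ing the surviving strips with the parity strip yields exactly the missing strip.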
RAID 5, 6
RAID 5 is similar to RAID 4, except that the parity is distributed
across all disks instead of stored on a dedicated disk.
This overcomes the write bottleneck on the parity disk.
It is widely used by database systems
RAID 6 is similar to RAID 5, except that it includes a second
parity element to allow survival in the event of two disk failures.
The probability of this happening increases as the
number of drives in the array increases.
RAID comparison
RAID | Min disks | Storage efficiency (%) | Cost | Read performance | Write performance
0 | 2 | 100 | Low | Very good | -
1 | 2 | 50 | High | Good; better than a single disk | Good; slower than a single disk, as every write must be committed to two disks
3 | 3 | (n-1)*100/n | Moderate | Good for random reads and very good for sequential reads | Poor to fair for small random writes; good for large, sequential writes
5 | 3 | (n-1)*100/n | Moderate | - | -
6 | 4 | (n-2)*100/n | - | - | -
1+0 and 0+1 | 4 | 50 | High | Very good | Good
where n = number of disks
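The storage-efficiency formulas in the comparison, (n-1)*100/n and (n-2)*100/n, can be evaluated for a given array size. In this sketch the mapping of formulas to RAID levels follows the usual definitions (single parity for RAID 3, 4 and 5, double parity for RAID 6, mirroring for RAID 1 and the nested levels); treat that mapping and the function name as assumptions.

```python
def storage_efficiency_percent(raid_level: str, n_disks: int) -> float:
    """Usable capacity as a percentage of raw capacity for n_disks drives."""
    if raid_level == "0":
        return 100.0                                # striping only, no redundancy
    if raid_level in ("1", "1+0", "0+1"):
        return 50.0                                 # every block is mirrored
    if raid_level in ("3", "4", "5"):
        return (n_disks - 1) * 100.0 / n_disks      # one disk's worth of parity
    if raid_level == "6":
        return (n_disks - 2) * 100.0 / n_disks      # two disks' worth of parity
    raise ValueError(f"unsupported RAID level: {raid_level}")


for level, n in (("5", 5), ("6", 6), ("1+0", 4)):
    print(f"RAID {level} with {n} disks: {storage_efficiency_percent(level, n):.1f}%")
```

Adding disks to a parity group raises efficiency, since the parity overhead is spread over more drives.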
Discussion and exercises
Why is there a WRITE penalty but no READ penalty in RAID mechanisms?
Local disk controllers in servers generally implement RAID 1, whereas large
storage systems generally opt for RAID 5 or its variants. Why?
Compare the mirroring and parity mechanisms.
Change the write penalty calculation example under the condition that only [...] of the operations are
writes. Is there a penalty for RAID 0?
Which kind of bottleneck does RAID 3 have compared with RAID 5?
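A write-penalty calculation of the kind the exercises refer to can be sketched as follows. The penalty values used (2 for mirrored levels, 4 for RAID 5, 6 for RAID 6, none for RAID 0) are the commonly quoted ones and are not stated in these slides; the workload figures are hypothetical, chosen only for illustration.

```python
# Commonly quoted back-end I/Os per host write (assumed, not from the slides).
WRITE_PENALTY = {"0": 1, "1": 2, "1+0": 2, "5": 4, "6": 6}


def backend_disk_iops(host_iops: float, write_fraction: float, raid_level: str) -> float:
    """Disk load generated by a host workload once the write penalty is applied."""
    penalty = WRITE_PENALTY[raid_level]
    reads = host_iops * (1.0 - write_fraction)
    writes = host_iops * write_fraction
    return reads + penalty * writes


# Hypothetical workload: 5200 host IOPS, 40% of them writes.
for level in ("0", "1", "5", "6"):
    print(f"RAID {level}: {backend_disk_iops(5200, 0.4, level):.0f} disk IOPS")
```

Reads pass through one-to-one, so the extra load grows with the share of writes in the workload.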
Recommended reading
Chapter 3
Information Storage and Management: Storing, Managing, and Protecting Digital Information in
Classic, Virtualized, and Cloud Environments
2nd Edition, edited by Somasundaram Gnanasundaram and Alok Shrivastava
03 IT Infrastructure
Intelligent Storage
Components of an intelligent storage system; Benefits of an intelligent storage system; I/O optimization; Front end; Back end; Intelligent cache algorithms and protection
Increased capacity
Improved performance
Easier data management
Improved data availability and protection
Enhanced Business Continuity support
Improved security and access control
[Figure: components of an intelligent storage system: connectivity (FC SAN), front end (ports and controllers), cache, back end, and physical disks]
[Figure, two panels: I/O requests, the I/O processing order set by the front-end controller, and the disk cylinders accessed]
[Figure: write operation with cache (write-back): the host sends a write request through the FC SAN, the data is placed in cache, and an acknowledgement is returned; the back end and physical disks sit behind the cache]
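[Figure: cache flushing as new data arrives, relative to the low watermark (LWM), the high watermark (HWM), and 100% cache utilization: idle flushing, high watermark flushing, and forced flushing]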
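The three flushing modes named in the figure (idle, high watermark, forced) can be modelled as a simple decision on cache utilization. This is a sketch under assumptions: idle flushing is taken to apply between the LWM and the HWM, no flushing is assumed below the LWM, and the 40% and 80% thresholds are hypothetical values, not figures from the slides.

```python
def flush_mode(utilization_pct: float, lwm_pct: float, hwm_pct: float) -> str:
    """Pick a cache flushing mode from the current cache utilization."""
    if not 0 <= lwm_pct < hwm_pct < 100:
        raise ValueError("expected 0 <= LWM < HWM < 100")
    if utilization_pct >= 100:
        return "forced"          # cache is full: flush with the highest priority
    if utilization_pct >= hwm_pct:
        return "high watermark"  # dedicate extra resources to flushing
    if utilization_pct >= lwm_pct:
        return "idle"            # flush continuously at a modest rate
    return "none"                # assumption: nothing to flush below the LWM


# Hypothetical thresholds: LWM = 40%, HWM = 80%.
for used in (20, 60, 85, 100):
    print(f"{used}% used -> {flush_mode(used, 40, 80)}")
```

The watermarks let the array absorb bursts of writes in cache while keeping flushing activity from competing with host I/O most of the time.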
Cache vaulting
Cache is exposed to the risk of uncommitted data loss due to power
failure
[Figure: physical disks in the back end presented through the cache and front end (ports and controllers) as LUN 0 and LUN 1 to hosts connected over the FC SAN, including Host 2]
LUN Masking
Logical Unit Number
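LUN masking is generally described as access control applied at the storage array: it determines which LUNs each host is allowed to see. A minimal sketch of that idea follows; the class, its methods, and the host-to-LUN assignments are hypothetical and do not correspond to any vendor's API.

```python
class LunMaskingTable:
    """Toy model of a masking table: host identifier -> set of visible LUN numbers."""

    def __init__(self) -> None:
        self._visible = {}

    def grant(self, host: str, lun: int) -> None:
        """Make a LUN visible to a host."""
        self._visible.setdefault(host, set()).add(lun)

    def can_access(self, host: str, lun: int) -> bool:
        """Return True only if the LUN has been unmasked for this host."""
        return lun in self._visible.get(host, set())


# Hypothetical assignment: each host sees only its own LUN.
masking = LunMaskingTable()
masking.grant("Host 1", 0)
masking.grant("Host 2", 1)

print(masking.can_access("Host 1", 0))
print(masking.can_access("Host 2", 0))
```

Hosts that were never granted a LUN are denied by default, which is the point of masking: a LUN stays hidden unless it is explicitly exposed.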
[Figure, two panels: a host connected through two ports to controllers A and B of a storage array serving a LUN. In the first panel both controllers are active; in the second (active-passive configuration) one controller is active and the other is passive]
Discussion and exercises
Name at least two mechanisms found in intelligent storage systems.
Explain the two main cache management mechanisms found in intelligent
storage systems.
Why does the front-end command queue in the systems studied make sense for access to
solid-state disks?
What differences do READ and WRITE operations present in the cache?
How do you think we can measure the cache efficiency of an intelligent
storage system?
Why do we not find this intelligence in local internal storage systems?
Recommended reading
Chapter 4
Information Storage and Management: Storing, Managing, and Protecting Digital Information in
Classic, Virtualized, and Cloud Environments
2nd Edition, edited by Somasundaram Gnanasundaram and Alok Shrivastava