The Challenge of Parallelism and Multicore
Intel Software College
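Motivation
Better performance is required
[Figure: timeline of PC capabilities that keep raising performance demands, with year markers 1980, 1981 (first PC), 1990, 1995, 2004, and 2006. Labels: Windows, mouse, color monitor, Internet, multimedia, joystick, multitasking, lower power consumption, wireless, mobility, Plug and Play, games, video input, PVR.]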
Copyright 2006, Intel Corporation. All rights reserved.
Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.
Moore's Law
Who Cares About the Memory Hierarchy?
Processor-DRAM Memory Gap (latency)
[Figure: performance (log scale, 1 to 1000) versus time, 1980 to 2000. CPU ("Moore's Law"): 60% per year (2X every 1.5 years). DRAM ("Less' Law?"): 9% per year (2X every 10 years). The processor-memory performance gap grows about 50% per year.]
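As a rough check on the figure's numbers, the following sketch compounds the two annual growth rates quoted in the figure (60% for the CPU, 9% for DRAM) and reports how fast their ratio grows. The function name `gap_growth` and the 20-year horizon are illustrative assumptions, not part of the original slides.

```python
def gap_growth(cpu_rate=0.60, dram_rate=0.09, years=20):
    """Return (yearly gap growth, cumulative gap after `years`).

    Assumes both rates compound annually, as the figure's
    "60% per year" and "9% per year" labels suggest.
    """
    yearly = (1 + cpu_rate) / (1 + dram_rate)   # ratio of the two growth factors
    return yearly - 1, yearly ** years


if __name__ == "__main__":
    yearly, total = gap_growth()
    print(f"gap grows about {yearly:.0%} per year")
    print(f"after 20 years the gap is roughly {total:,.0f}x")
```

With the default rates the yearly growth comes out a little under 50%, which is consistent with the figure's "grows 50% per year" annotation.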
Implications of Moore's Law
Memory speed is not increasing as fast as microprocessor speed
~1980: an i486 CPU takes ~8 clock cycles to access memory
~1990: an Intel Pentium takes ~224 clock cycles
Power consumption
Intel Pentium: ~3 million transistors
Intel Itanium 2: ~1 billion transistors
At this rate we will have more heat per square centimeter than the surface of the sun
The multi-core approach improves performance using a divide-and-conquer strategy
Performance / Power
[Figure: bar chart of performance and required power, relative to single-core frequency and Vcc.]
Maximum frequency: performance 1.00x, required power 1.00x
Over-clocking
[Figure: bar chart of performance and required power, relative to single-core frequency and Vcc.]
Over-clocked (+20%): performance 1.13x, required power 1.73x
Maximum frequency: performance 1.00x, required power 1.00x
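Under-clocking
[Figure: bar chart of performance and required power, relative to single-core frequency and Vcc.]
Over-clocked (+20%): performance 1.13x, required power 1.73x
Maximum frequency: performance 1.00x, required power 1.00x
Under-clocked (-20%): performance 0.87x, required power 0.51x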
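Multi-Core Performance and Power Consumption
[Figure: bar chart of performance and required power, relative to single-core frequency and Vcc.]
Over-clocked (+20%): performance 1.13x, required power 1.73x
Maximum frequency: performance 1.00x, required power 1.00x
Dual-core (-20%): performance 1.73x, required power 1.02x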
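The power figures on these charts are consistent with a common first-order model of dynamic power, $P \propto C V^2 f$. The sketch below additionally assumes that supply voltage scales linearly with frequency, so that per-core power grows with the cube of the frequency scale. That scaling assumption and the name `relative_power` are not taken from the slides; they are an illustrative approximation.

```python
def relative_power(freq_scale, cores=1):
    """Power relative to one core at maximum frequency.

    Assumption (not from the slides): dynamic power ~ C * V^2 * f and
    V scales linearly with f, so each core costs freq_scale ** 3.
    """
    return cores * freq_scale ** 3


if __name__ == "__main__":
    print(round(relative_power(1.2), 2))           # over-clocked single core
    print(round(relative_power(0.8), 2))           # under-clocked single core
    print(round(relative_power(0.8, cores=2), 2))  # dual-core, each core under-clocked
```

Under this assumption the three cases evaluate to 1.73, 0.51, and 1.02, matching the power values on the charts. The performance values (1.13x, 0.87x, 1.73x) come from the slides and are not modeled here.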
Applications on Single-Core Processors
2001
Increases in clock frequency increase application performance
The application gains with no action required
Advances in compiler technology boost performance
Rebuild your application to take advantage of the additional performance benefits
Cache improves performance
Applications benefit with little or no change
Application performance improves as GHz (clock frequency) increases
Multi-Core Processors (starting with Dual-Core)
2005, 2006, 2007,
Several processor cores available
Replicating the entire processor core on a single chip improves the performance of a parallelized application
A natural evolution of HT
Performance improves for parallelized applications
Advances in compiler technology continue to boost performance
OpenMP, auto-parallelization: compilers incorporate the latest processor innovations
Cache and clock frequency are no longer the main drivers of application performance
Cache and clock frequency still contribute
Performance improvement is achieved through parallelization
Performance Through Multi-Core
[Figure: normalized performance vs. the initial Intel Pentium 4 processor, plotted over 2000, 2004, and 2008+, with a 3X marker on the performance axis. Source: Intel.]
Performance Through Multi-Core
[Figure: normalized performance vs. the initial Intel Pentium 4 processor, plotted over 2000, 2004, and 2009+. The single-core curve is marked 3X and the forecast multi-core curve is marked 10X, with a "We are here" marker. Source: Intel.]
The Road to Many-Core
Standard core
Hyper-Threading
Dual-Core
Multi-Core
Many-Core
Summary
Processor speed (GHz) is no longer the main contributor to application performance
Multi-core makes several processors (cores) available on a single chip
A properly designed application can scale its performance as the number of cores increases
Multi-core platforms are here!
                        2005      2006**    2007**
Desktop performance*    Launch    >70%      >90%
Mobile performance*     Launch    >70%      >90%
Server                  Launch    >85%      ~100%
* Mobile & Desktop Pentium
Multi-core is a major technology transition for application developers
Multicore: There Is No Free Lunch
Stating the Problem
Some in the industry believe that parallelizing applications is not necessary to get the most out of multi-core, because they assume the OS scheduler does all the work for them.
For the past few decades, software has been developed as single-threaded (serial) applications
Threads have not been commonly used for concurrent tasks, for example GUI events
Facts About Parallelism
Parallelism used to be reserved for high-performance applications (not anymore)
Parallel computing is not going away
Developing parallel applications is not simple
Evaluating parallel performance is complex
Implications of Parallelism
Design of parallel computers
Design of efficient algorithms
Evaluation of parallel algorithms
Development of parallel programming languages
Development of parallel programming tools
Portability of parallel applications
Parallel Machines
There are two main classes of parallel machines:
Multicomputers
Message passing, distributed-memory systems, networks of workstations, clusters, NUMA (non-uniform memory access), etc.
Multiprocessors
Shared-memory multiprocessors, symmetric multiprocessing, UMA (uniform memory access) systems.
Shared-Memory Architecture
Any memory address is accessible from any processor or core: there is a single address per memory location
[Figure: processors and cores, each with a cache, connected by a bus to the memory banks.]
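As a small illustration of a shared address space, the sketch below starts several threads that all write into the same list object. Python threads are used only because they share one process's memory; the function names are hypothetical, and each thread writes to its own slot so no locking is needed.

```python
import threading


def fill_slot(shared, index):
    # Every thread sees the same `shared` list; it writes only its own slot.
    shared[index] = index * index


def run_shared_memory_demo(n_threads=4):
    shared = [0] * n_threads            # one buffer visible to all threads
    threads = [
        threading.Thread(target=fill_slot, args=(shared, i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                        # wait until every thread has written
    return shared
```

After the threads are joined, the caller reads every result from the same list, without copying data between threads or passing messages.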
Structure of a Process
[Figure: a process, made up of code, heap, stack, instruction pointer (IP), Int. routines, and descriptors.]
Threads
[Figure: two threads inside one process. Each thread has its own instruction pointer (IP) and stack; the code, heap, Int. routines, and descriptors are shared.]
High-Level View of Parallelism
PROCESS
An instance of a running program, together with the state needed to keep it running; most applications are processes
Creating a new process can be costly in CPU time and memory
THREAD
An instance of a subtask split off to run in parallel; many applications are divided into multiple threads
Threads can be created without replicating the whole process
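The relationship between threads and their process can be observed from inside a program. The following sketch (function names are illustrative) creates several threads and records, for each one, the process ID and the thread ID it runs under. A barrier keeps all threads alive at the same time so that their IDs are distinct.

```python
import os
import threading


def record_ids(results, index, barrier):
    barrier.wait()   # hold every thread until all of them are running
    results[index] = (os.getpid(), threading.get_ident())


def thread_identities(n_threads=4):
    results = [None] * n_threads
    barrier = threading.Barrier(n_threads)
    threads = [
        threading.Thread(target=record_ids, args=(results, i, barrier))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

All entries share one process ID while the thread IDs differ, which reflects threads being created inside an existing process rather than replicating it.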
Types of Parallelism
By functionality
Assign threads to separate functions performed by the application
The easiest method, since the tasks that can overlap are obvious (for example, waiting on a user-interface update)
Generally improves responsiveness and functionality
Often done through a functional-decomposition programming model
By performance
Parallelize to improve turnaround time or throughput
Harder, since developers need a deep understanding of data flow and data structures
Generally improves overall performance
Usually done through a data-decomposition programming model
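The two models can be sketched with Python's standard `concurrent.futures` module. The helper names and the sample workload are assumptions made for illustration. Note that in standard CPython builds, threads mainly help responsiveness and I/O-bound work; CPU-bound speedups usually need processes or native code.

```python
from concurrent.futures import ThreadPoolExecutor


def functional_decomposition(tasks):
    """Run different functions concurrently, one thread per function."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [f.result() for f in futures]


def data_decomposition(data, n_chunks):
    """Apply the same operation (sum) to disjoint chunks of the data."""
    size = max(1, (len(data) + n_chunks - 1) // n_chunks)   # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)
```

In the first function each thread does a different job; in the second every thread does the same job on a different slice of the data, which is why it requires knowing how the data can be split safely.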
Highway Traffic Flow: The Analogy
Imagine
A highway: a multi-core processor
With many lanes: the cores
Where the vehicles are the threads of an application
And the length of a vehicle is the execution time of a thread
And the whole traffic flow is the processor's execution
Highway Traffic Flow: The Analogy
4 lanes = 4 cores
The finish line represents execution
The vehicles are the threads of an application
Highway Traffic Flow: Splitting Processes
A single compute-intensive thread
A few large, dependent threads and perhaps a few short independent threads
Or a bunch of short independent threads, or even shorter threads
Highway Traffic Analogy: What Can Be Achieved with Parallelism?
Suppose we have one long convoy carrying a load of logs: a non-parallelized application
Let's split the load across 4 smaller trucks: four independent threads
Now we can put one truck in each lane: exploiting the multiple cores
The load arrives much faster! Result: throughput improves!
Highway Traffic Analogy: Vehicles on the Highway
Vehicles (threads) of different colors are independent of each other, so they can move in parallel in different lanes (be executed in parallel on multiple cores)
Vehicles of the same color (dependent threads) depend on each other, so they cannot move in parallel in other lanes; they must follow one another
Large vehicles (large, unoptimized threads) leave lanes empty (inefficient use of the processor)
Highway Traffic Analogy: Operating System Scheduling
OS scheduler (selects the lane)
The operating system cannot split an application into threads; it can only schedule existing threads
[Animation: lanes 1 to 4. The OS considers lane 1 or lane 2; a vehicle must wait for the truck.]
Highway Traffic Analogy: Example Scenario
Large vehicles (large threads) and vehicles of the same color (dependent threads) cause sluggish traffic. Smaller cars (small independent threads) allow a faster flow.
Highway Traffic Analogy: Summary
Dividing a large thread into n smaller, independent threads lets the OS schedule them on different cores, increasing throughput.
Just as on the highway, large vehicles slow traffic down; small ones allow a faster flow.
The OS helps manage the traffic, but the developer must ensure that every lane has enough traffic for the OS to schedule, allowing a higher throughput of cargo (data).
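The analogy can be illustrated with a toy scheduler. The sketch below is not how a real OS scheduler works; it assumes independent threads and greedily assigns each one to whichever core becomes free first, then reports when the last thread finishes. The name `finish_time` and the thread lengths are hypothetical.

```python
import heapq


def finish_time(thread_lengths, cores):
    """Time at which the last thread finishes under greedy assignment."""
    lanes = [0.0] * cores            # time at which each core becomes free
    heapq.heapify(lanes)
    for length in thread_lengths:
        free_at = heapq.heappop(lanes)           # earliest available core
        heapq.heappush(lanes, free_at + length)  # the thread occupies it
    return max(lanes)
```

One thread of length 8 on four cores still finishes at time 8, while four independent threads of length 2 finish at time 2: the scheduler can only spread work that the developer has already divided.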
Speedup
Concurrency is limited by the nature of the application
This is the most important consideration
If $s$ is the fraction of intrinsically serial work and $t_s$ is the serial execution time, the computation time using $p$ cores is:
$t_p = s\,t_s + \frac{(1-s)\,t_s}{p}$
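Parallel Time:
$t_p = s\,t_s + \frac{(1-s)\,t_s}{p}$
[Figure: the serial time $t_s$ is split into a serial section of length $s\,t_s$ and a parallel section of length $(1-s)\,t_s$. On $p$ cores the parallel section shrinks to $(1-s)\,t_s/p$, giving a total time of $t_p$.]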
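Amdahl's Law
$t_p = s\,t_s + \frac{(1-s)\,t_s}{p}$
Speedup:
$S(p) = \frac{t_s}{t_p} = \frac{t_s}{s\,t_s + \frac{(1-s)\,t_s}{p}} = \frac{p}{1 + (p-1)\,s}$
If $s$ is the fraction of sequential time, then the speedup can never exceed $1/s$, no matter how many cores are used:
$\lim_{p \to \infty} S(p) = \frac{1}{s}$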
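A minimal sketch of Amdahl's law in the form $S(p) = p / (1 + (p-1)\,s)$, useful for trying different serial fractions. The function name and the sample values are illustrative assumptions.

```python
def amdahl_speedup(s, p):
    """Speedup S(p) = p / (1 + (p - 1) * s) for serial fraction s on p cores."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("serial fraction s must be between 0 and 1")
    if p < 1:
        raise ValueError("p must be at least 1")
    return p / (1 + (p - 1) * s)


if __name__ == "__main__":
    for p in (1, 2, 4, 16, 1024):
        print(p, round(amdahl_speedup(0.1, p), 2))
```

With s = 0.1 the printed values approach, but never reach, 1/s = 10.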
Amdahl's Law
Speedup and Efficiency
Speedup (Simple)
Measures how much faster a computation runs compared with the best serial code
Serial time divided by parallel time
Example: painting a picket fence
30 minutes of preparation (serial)
One minute to paint one picket
30 minutes of cleanup (serial)
Therefore, 300 pickets take 360 minutes (serial time)
Calculating Speedup
Number of painters    Time (minutes)           Speedup
1                     30 + 300 + 30 = 360      1.0X
2                     30 + 150 + 30 = 210      1.7X
10                    30 + 30 + 30 = 90        4.0X
100                   30 + 3 + 30 = 63         5.7X
Infinite              30 + 0 + 30 = 60         6.0X
Illustrates Amdahl's law: the potential speedup is restricted by the serial portion
What if the fence owner uses a spray gun to paint 300 pickets in one hour?
A better serial algorithm
If no spray guns are available for multiple painters, what is the maximum parallel speedup?
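The fence example and the spray-gun question can be explored with a short calculation. The sketch assumes the spray gun paints all 300 pickets in 60 minutes with the same 30 minutes of preparation and 30 minutes of cleanup, so the best serial time becomes 120 minutes. That reading of the question, and the function names, are assumptions.

```python
def fence_time(painters, pickets=300, minutes_per_picket=1.0,
               prep=30.0, cleanup=30.0):
    """Total minutes when only the painting phase is shared among painters."""
    return prep + pickets * minutes_per_picket / painters + cleanup


def speedup(best_serial_minutes, parallel_minutes):
    """Best serial time divided by parallel time."""
    return best_serial_minutes / parallel_minutes


if __name__ == "__main__":
    brush_serial = fence_time(1)                 # 360 minutes
    for n in (1, 2, 10, 100):
        print(n, fence_time(n), round(speedup(brush_serial, fence_time(n)), 1))

    # Assumed spray-gun baseline: 30 + 60 + 30 = 120 minutes.
    spray_serial = 30.0 + 60.0 + 30.0
    print(round(speedup(spray_serial, fence_time(10)), 2))
```

Measured against the assumed 120-minute spray-gun time, ten painters with brushes (90 minutes) give a speedup of only about 1.33, and even an unlimited number of painters (60 minutes) cannot exceed 2.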
Efficiency
Measures how effectively the computing resources are kept busy
Speedup divided by the number of threads
Expressed as the average percentage of non-idle time
Number of painters    Time (minutes)           Speedup    Efficiency
1                     360                      1.0X       100%
2                     30 + 150 + 30 = 210      1.7X       85%
10                    30 + 30 + 30 = 90        4.0X       40%
100                   30 + 3 + 30 = 63         5.7X       5.7%
Infinite              30 + 0 + 30 = 60         6.0X       Very low
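Efficiency follows directly from the speedup. The sketch below recomputes it for the fence example (30 minutes of preparation, 300 one-minute pickets, 30 minutes of cleanup); the function names are illustrative. Because it divides the unrounded speedup, the two-painter case comes out slightly above the 85% shown in the table, which corresponds to the rounded 1.7X.

```python
def fence_time(painters, pickets=300, prep=30.0, cleanup=30.0):
    """Minutes to finish when one picket takes one minute to paint."""
    return prep + pickets / painters + cleanup


def efficiency(painters):
    """Speedup divided by the number of painters, as a fraction of 1."""
    speedup = fence_time(1) / fence_time(painters)
    return speedup / painters


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(n, f"{efficiency(n):.1%}")
```

The falling percentages show that adding painters keeps shortening the job while leaving each painter idle for a growing share of the time.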
Speedup + Synchronization
S