NVIDIA Tesla P4

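The Tesla P4 brings the highest energy efficiency to the data center. Its small form factor and low-power design, starting at 50 watts, fit into any server and make production inference workloads up to 40 times as energy-efficient as on CPUs (3). For video inference workloads, a single server fitted with one Tesla P4 can replace 13 CPU-only servers (4), and total cost of ownership, including server and power costs, can be cut by up to 8 times.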


(3) Comparison of img/sec using the Caffe AlexNet neural network with batch size = 128. CPU: E5-2690v4 with Intel MKL 2017, running Intel-optimized Caffe and AlexNet from https://github.com/intel/caffe. GPU: Tesla P4, with GPU power measured.
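The comparison in footnote (3) comes down to dividing throughput by power draw. Below is a minimal sketch of that arithmetic, assuming energy efficiency is expressed as images per second per watt. The function names and every input number are hypothetical placeholders, not measured results.

```python
def images_per_sec_per_watt(images_per_sec: float, watts: float) -> float:
    """Energy efficiency: throughput divided by power draw."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return images_per_sec / watts


def efficiency_ratio(gpu_ips: float, gpu_watts: float,
                     cpu_ips: float, cpu_watts: float) -> float:
    """How many times more energy-efficient the GPU is than the CPU."""
    gpu_eff = images_per_sec_per_watt(gpu_ips, gpu_watts)
    cpu_eff = images_per_sec_per_watt(cpu_ips, cpu_watts)
    return gpu_eff / cpu_eff


# Placeholder inputs only (NOT benchmark data from NVIDIA or Intel).
ratio = efficiency_ratio(gpu_ips=1000.0, gpu_watts=50.0,
                         cpu_ips=200.0, cpu_watts=100.0)
print(f"GPU is {ratio:.1f}x as energy-efficient as the CPU")
```

To reproduce a figure such as the quoted 40x, the inputs would have to come from the measured img/sec and power values of the actual benchmark runs.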


(4) Intel-optimized GoogLeNet on a dual-socket CPU server with Xeon E5-2650v4 and Intel MKL 2017, compared with a server with one Tesla P4 GPU using the DeepStream SDK. Video streams are 720p at 30 FPS.



http://www.pny.eu/professional/explore-all-products/nvidia-tesla/775-nvidia-tesla-p4






鴻鵠國際股份有限公司

Sales contact: Mr. Tsai (蔡先生)
Mobile: 0910-218-322
Office: 02-2929-9388 #10
Fax: 02-2929-7579
Email: sales1@honghutech.com


ACCELERATE YOUR CUSTOMERS’ DEEP LEARNING WITH
THE NEW NVIDIA TESLA P4 AND P40 

At GTC China last week, we announced the latest NVIDIA® Tesla® GPUs based on the NVIDIA Pascal™ architecture—the computational engine for the new era of artificial intelligence. Our Tesla GPUs deliver amazing user experiences by accelerating today’s deep learning applications at scale.

The Tesla P40 is purpose-built to deliver maximum throughput for deep learning workloads. 

The ultra-efficient Tesla P4’s small form factor and 75-Watt design help accelerate any scale-out server and provide 40X higher energy efficiency compared to CPUs. 

NVIDIA CEO Jen-Hsun Huang unveils technology that will accelerate the deep learning revolution that is sweeping across industries. “AI computing will let us create machines that can learn and behave as humans do. It’s the reason why we believe this is the beginning of the age of AI.” 

Huang was joined onstage by Andrew Ng, chief scientist of Baidu, China’s largest search engine, who got a rousing reception from the GTC China attendees. With AI becoming “the new electricity,” Ng described how the early bets his company has made on GPU technology and deep learning are paying off in a host of AI services that have the potential to transform industries.



TESLA P4

The Tesla P4 delivers the best image throughput per watt (images per second per watt).




TESLA P4 clock speed

The default is 1113 MHz.



The P4's predecessor is the M4; the P4 uses the new architecture. The Tesla M4 was positioned as follows.
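THE POWER OF TESLA IN HYPERSCALE DATA CENTERS
The explosive growth of user-generated data has redefined what hyperscale data centers need. Today's cloud applications draw on this data, using modern video and image processing and deep learning to deliver smarter, real-time experiences. GPU acceleration in the data center greatly improves the efficiency of these applications.

The NVIDIA® Tesla® M4 is the world's first accelerator designed specifically for hyperscale servers, helping customers keep up with growing data volumes. Its small, low-power design accelerates application throughput, cuts data center costs in half, and is five times as energy-efficient as CPUs for deep learning inference, machine learning prediction, and video workloads.

Combining the Tesla Accelerated Computing Platform with the NVIDIA Hyperscale Suite provides an end-to-end solution for building and deploying modern hyperscale applications.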




The new-generation P4, by contrast:


CPU: 22-Core Intel Xeon E5-2699V4, MKL2017 IntelCaffe+VGG19, Batch Size: 4 | GPU Tesla M4 (TensorRT + FP32) and P4 (TensorRT + Int8) , nvCaffe + VGG19, Batch Size: 4


NVIDIA TESLA P4
CPU: Intel Xeon E5-2690V4, MKL2017 IntelCaffe + GoogLeNet and AlexNet, Batch Size: 128 | GPU: Tesla M4 (TensorRT + FP32) and P4 (TensorRT + Int8), nvCaffe + GoogLeNet and AlexNet, Batch Size: 128



Note: Dual CPU Xeon E5-2650V4 | Tesla GPU M4 and P4 | Ubuntu 14.04. H.264 benchmark with FFMPEG slow preset | HD = 720p at 30 frames per second. 

ULTRA-EFFICIENT DEEP LEARNING IN SCALE-OUT SERVERS
In the new era of AI and intelligent machines, deep learning is shaping our world like no other
computing model in history. Interactive speech, visual search, and video recommendations
are a few of many AI-based services that we use every day.

Accuracy and responsiveness are key to user adoption for these services. As deep learning
models increase in accuracy and complexity, CPUs are no longer capable of delivering a
responsive user experience.

The NVIDIA Tesla P4 is powered by the revolutionary NVIDIA Pascal™ architecture and
purpose-built to boost efficiency for scale-out servers running deep learning workloads,
enabling smart responsive AI-based services. It slashes inference latency by 15X in any
hyperscale infrastructure and provides an incredible 60X better energy efficiency than CPUs.
This unlocks a new wave of AI services previously impossible due to latency limitations.

FEATURES 
Small form-factor, 50/75-Watt design fits any scale-out server.
INT8 operations slash latency by 15X. 
Hardware-decode engine capable of transcoding and inferencing 35 HD video streams in real time.
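The 35-stream figure can be turned into an aggregate frame rate with simple arithmetic. The sketch below assumes "HD" means 720p at 30 frames per second, as stated in the benchmark note, and that a 720p frame is 1280 x 720 pixels. The per-frame time budget is only an illustration and assumes frames are handled strictly one after another, which a real pipeline need not do.

```python
STREAMS = 35               # simultaneous HD streams, from the feature list
FPS = 30                   # frames per second per stream (720p at 30 fps)
WIDTH, HEIGHT = 1280, 720  # pixel dimensions of a 720p frame

# Total frames that must be decoded and run through inference each second.
frames_per_sec = STREAMS * FPS

# Raw pixel throughput implied by that frame rate.
pixels_per_sec = frames_per_sec * WIDTH * HEIGHT

# Average time available per frame if frames were processed sequentially.
frame_budget_ms = 1000 / frames_per_sec

print(f"{frames_per_sec} frames/s across {STREAMS} streams")
print(f"{pixels_per_sec:,} pixels/s")
print(f"{frame_budget_ms:.3f} ms per frame (sequential assumption)")
```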


SPECIFICATIONS 
GPU Architecture: NVIDIA Pascal™
Single-Precision Performance: 5.5 TeraFLOPS*
Integer Operations (INT8): 22 TOPS* (Tera-Operations per Second)
GPU Memory: 8 GB
Memory Bandwidth: 192 GB/s
System Interface: Low-Profile PCI Express Form Factor
Max Power: 50W/75W
Enhanced Programmability with Page Migration Engine: Yes
ECC Protection: Yes
Server-Optimized for Data Center Deployment: Yes
Hardware-Accelerated Video Engine: 1x Decode Engine, 2x Encode Engine
* With Boost Clock Enabled
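A few ratios follow directly from the specification figures. This is a rough sketch, not an official metric: it assumes the peak FP32 and INT8 numbers apply to both the 50 W and 75 W configurations, which the specification list does not state. The variable names are illustrative.

```python
FP32_TFLOPS = 5.5     # single-precision peak, with boost clock enabled
INT8_TOPS = 22.0      # INT8 peak, with boost clock enabled
POWER_W = (50, 75)    # the two max-power configurations

# Peak INT8 throughput relative to peak FP32 throughput.
int8_speedup = INT8_TOPS / FP32_TFLOPS

# Peak operations per watt, ASSUMING the same peak at both power limits.
int8_tops_per_watt = {w: INT8_TOPS / w for w in POWER_W}
fp32_gflops_per_watt = {w: FP32_TFLOPS * 1000 / w for w in POWER_W}

print(f"INT8 vs FP32 peak: {int8_speedup:.1f}x")
for w in POWER_W:
    print(f"{w} W: {int8_tops_per_watt[w]:.2f} INT8 TOPS/W, "
          f"{fp32_gflops_per_watt[w]:.0f} FP32 GFLOPS/W")
```

These are peak-rate ratios only. Delivered inference performance depends on the model, batch size, and software stack, as the benchmark notes above indicate.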


蔡長明, February 5, 2017, 7:18 PM
蔡長明, February 5, 2017, 9:57 PM