NVIDIA Tesla P4

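The Tesla P4 brings the highest energy efficiency to the data center. Its small form factor and low-power design, starting at 50 watts, fit in any server and deliver up to 40X the energy efficiency of CPUs for inference in production workloads. For video inference workloads, a single server with one Tesla P4 can replace 13 CPU-only servers, and total cost of ownership, including server and power costs, can be up to 8X lower.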

(3) Comparison of img/sec using the Caffe AlexNet neural network, batch size = 128. CPU: E5-2690v4 with Intel MKL 2017, using Intel-optimized Caffe and AlexNet, source: https://github.com/intel/caffe. GPU: Tesla P4, with measured GPU power.

(4) CPU: Intel-optimized GoogLeNet on a dual-socket CPU server with Xeon E5-2650v4 and Intel MKL 2017. GPU: a server with one Tesla P4 using the DeepStream SDK. Video streams are 720p @ 30FPS.
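Footnote (3) compares throughput in img/sec and records GPU power, so the underlying efficiency metric is images per second per watt. The following is a minimal Python sketch of that calculation. The function names and the sample inputs are hypothetical placeholders chosen for illustration; they are not measurements from the benchmark described above.

```python
def images_per_second_per_watt(images_per_second: float, watts: float) -> float:
    """Return throughput normalized by power draw (img/sec/W)."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return images_per_second / watts


def efficiency_ratio(gpu_ips: float, gpu_watts: float,
                     cpu_ips: float, cpu_watts: float) -> float:
    """Return how many times more energy-efficient the GPU run is than the CPU run."""
    gpu_eff = images_per_second_per_watt(gpu_ips, gpu_watts)
    cpu_eff = images_per_second_per_watt(cpu_ips, cpu_watts)
    return gpu_eff / cpu_eff


# Placeholder inputs only (NOT benchmark results): substitute measured values.
ratio = efficiency_ratio(gpu_ips=600.0, gpu_watts=50.0,
                         cpu_ips=300.0, cpu_watts=100.0)
print(f"GPU is {ratio:.1f}x more energy-efficient in this made-up example")
```

Both runs should use the same network and batch size, as the footnote does, so that the ratio reflects the hardware rather than the workload.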

http://www.pny.eu/professional/explore-all-products/nvidia-tesla/775-nvidia-tesla-p4

PDF download: http://images.nvidia.com/content/pdf/tesla/184457-Tesla-P4-Datasheet-NV-Final-Letter-Web.pdf

鴻鵠國際股份有限公司

Sales contact: Mr. Tsai (蔡先生)

Mobile: 0910-218-322

Office phone: 02-2929-9388 #10

Fax: 02-2929-7579

Email: sales1@honghutech.com

ACCELERATE YOUR CUSTOMERS’ DEEP LEARNING WITH THE NEW NVIDIA TESLA P4 AND P40

At GTC China last week, we announced the latest NVIDIA® Tesla® GPUs based on the NVIDIA Pascal™ architecture—the computational engine for the new era of artificial intelligence. Our Tesla GPUs deliver amazing user experiences by accelerating today’s deep learning applications at scale.

The Tesla P40 is purpose-built to deliver maximum throughput for deep learning workloads.

The ultra-efficient Tesla P4’s small form factor and 75-Watt design help accelerate any scale-out server and provide 40X higher energy efficiency compared to CPUs.

NVIDIA CEO Jen-Hsun Huang unveils technology that will accelerate the deep learning revolution that is sweeping across industries. “AI computing will let us create machines that can learn and behave as humans do. It’s the reason why we believe this is the beginning of the age of AI.”

Huang was joined onstage by Andrew Ng, chief scientist of Baidu, China’s largest search engine, who got a rousing reception from the GTC China attendees. With AI becoming “the new electricity,” Ng described how the early bets his company has made on GPU technology and deep learning are paying off in a host of AI services that have the potential to transform industries.

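TESLA P4

The Tesla P4 delivers the best images-per-second-per-watt performance.

The default value is 1113 MHz.

The P4's predecessor is the M4, and the P4 comes at a new price. The Tesla M4 is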

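THE POWER OF TESLA IN HYPERSCALE DATA CENTERS

The explosive growth of user-generated data has redefined what hyperscale data centers require. Today's cloud applications draw on a wide range of critical data and use modern video and image processing together with deep learning to deliver smarter real-time experiences. GPU acceleration in the data center can greatly improve the efficiency of these applications.

The NVIDIA® Tesla® M4 is the world's first accelerator designed specifically for hyperscale servers, helping customers cope with ever-growing data volumes. Its small, low-power design accelerates application throughput, cuts data center costs in half, and delivers five times the energy efficiency of CPUs for deep learning inference, machine learning prediction, and video workloads.

Together, the Tesla accelerated computing platform and the NVIDIA hyperscale software suite form an end-to-end solution for building and deploying modern hyperscale applications.

But the new-generation P4,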

CPU: 22-Core Intel Xeon E5-2699V4, MKL2017 IntelCaffe + VGG19, Batch Size: 4 | GPU: Tesla M4 (TensorRT + FP32) and P4 (TensorRT + INT8), nvCaffe + VGG19, Batch Size: 4

CPU: Intel Xeon E5-2690V4, MKL2017 IntelCaffe + GoogLeNet and AlexNet, Batch Size: 128 | GPU: Tesla M4 (TensorRT + FP32) and P4 (TensorRT + INT8), nvCaffe + GoogLeNet and AlexNet, Batch Size: 128

Note: Dual CPU Xeon E5-2650V4 | Tesla GPU M4 and P4 | Ubuntu 14.04. H.264 benchmark with FFMPEG slow preset | HD = 720p at 30 frames per second.

ULTRA-EFFICIENT DEEP LEARNING IN SCALE-OUT SERVERS

In the new era of AI and intelligent machines, deep learning is shaping our world like no other computing model in history. Interactive speech, visual search, and video recommendations are a few of many AI-based services that we use every day.

Accuracy and responsiveness are key to user adoption for these services. As deep learning models increase in accuracy and complexity, CPUs are no longer capable of delivering a responsive user experience.

The NVIDIA Tesla P4 is powered by the revolutionary NVIDIA Pascal™ architecture and purpose-built to boost efficiency for scale-out servers running deep learning workloads, enabling smart responsive AI-based services. It slashes inference latency by 15X in any hyperscale infrastructure and provides an incredible 60X better energy efficiency than CPUs. This unlocks a new wave of AI services previously impossible due to latency limitations.

FEATURES

Small form-factor, 50/75-Watt design fits any scale-out server.

INT8 operations slash latency by 15X.

Hardware-decode engine capable of transcoding and inferencing 35 HD video streams in real time.
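To put the 35-stream figure in perspective, the short Python sketch below works out the aggregate frame rate it implies. It assumes each HD stream is 720p at 30 frames per second, as stated in the benchmark note earlier in this document, and it assumes frames are handled strictly one at a time when deriving a per-frame time budget; real pipelines batch and overlap work, so the budget is only a rough illustration. The variable names are illustrative.

```python
STREAMS = 35          # HD video streams, from the feature list
FPS_PER_STREAM = 30   # assumption: HD = 720p at 30 frames per second

# Total frames that must be decoded and inferenced every second.
aggregate_fps = STREAMS * FPS_PER_STREAM

# Average time available per frame if frames were processed serially
# (an assumption made for illustration only).
frame_budget_ms = 1000.0 / aggregate_fps

print(f"aggregate frame rate: {aggregate_fps} frames/sec")
print(f"serial per-frame budget: {frame_budget_ms:.3f} ms")
```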

SPECIFICATIONS

GPU Architecture: NVIDIA Pascal™

Single-Precision Performance: 5.5 TeraFLOPS*

Integer Operations (INT8): 22 TOPS* (Tera-Operations per Second)

GPU Memory: 8 GB

Memory Bandwidth: 192 GB/s

System Interface: Low-Profile PCI Express Form Factor

Max Power: 50W/75W

Enhanced Programmability with Page Migration Engine: Yes

ECC Protection: Yes

Server-Optimized for Data Center Deployment: Yes

Hardware-Accelerated Video Engine: 1x Decode Engine, 2x Encode Engine

* With Boost Clock Enabled
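A few derived ratios make the specification figures easier to compare. The Python sketch below computes the INT8-to-FP32 peak throughput ratio and the peak INT8 throughput per watt at each listed power limit. It assumes, for illustration, that the peak figures apply at both the 50 W and 75 W limits, which the specifications above do not state; the variable names are illustrative.

```python
FP32_TFLOPS = 5.5        # single-precision peak, with boost clock enabled
INT8_TOPS = 22.0         # INT8 peak, with boost clock enabled
MAX_POWER_W = (50, 75)   # the two listed max-power configurations

# Peak INT8 throughput relative to peak FP32 throughput.
int8_to_fp32 = INT8_TOPS / FP32_TFLOPS

# Peak INT8 TOPS per watt at each power limit (assumes the same peak
# figure at both limits, which is an assumption, not a datasheet claim).
tops_per_watt = {watts: INT8_TOPS / watts for watts in MAX_POWER_W}

print(f"INT8 vs FP32 peak ratio: {int8_to_fp32:.1f}x")
for watts, value in tops_per_watt.items():
    print(f"{watts} W: {value:.3f} TOPS/W")
```

These are peak-rate ratios only; delivered inference performance depends on the network, batch size, and software stack used.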