FPGAs and SoCs: From Monsters to the Solution at the Edge

João Dullius, BP&M

Page 1

FPGAs and SoCs: From Monsters to the Solution at the Edge

João Dullius

BP&M

Page 2

The speaker

• Applications Engineer
• Embedded Processing
• FPGAs

João Dullius, BP&M Representações

[email protected]

Page 3

Industry Trends

The Monster

The Solution

What about Rhinos?

Page 4

Industry Trends

• Cloud to Edge
• AI Proliferation
• Heterogeneous Compute

Page 5

Internet of Things

Internet of Everything

Pages 6-10

VoT - Video of Things

Page 11

VoT - Video of Things

4K: 3840 × 2160

Page 12

VoT - Video of Things

Resolution          H.264                 MJPEG
1MP (1280×720)      2 Mbps per camera     6 Mbps per camera
2MP (1920×1080)     4 Mbps per camera     12 Mbps per camera
5MP (2560×1960)     10 Mbps per camera    30 Mbps per camera
4K (3840×2160)      18 Mbps per camera    64 Mbps per camera
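The per-camera figures above add up quickly across an installation. The following is a minimal sketch of that arithmetic, using only the bitrates from the table; the names `Resolution`, `Codec`, `CameraGroup`, `per_camera_mbps`, and `total_mbps` are hypothetical and introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rows and columns of the bitrate table (names are illustrative).
enum class Resolution { MP1, MP2, MP5, UHD4K };
enum class Codec { H264, MJPEG };

// Returns the per-camera bitrate in Mbps, copied from the table.
int per_camera_mbps(Resolution r, Codec c) {
    static const int h264[]  = {2, 4, 10, 18};
    static const int mjpeg[] = {6, 12, 30, 64};
    const std::size_t i = static_cast<std::size_t>(r);
    return c == Codec::H264 ? h264[i] : mjpeg[i];
}

// Hypothetical description of one group of identical cameras.
struct CameraGroup {
    Resolution resolution;
    Codec codec;
    int count;
};

// Sums the bitrate of every camera in every group, in Mbps.
int total_mbps(const std::vector<CameraGroup>& groups) {
    int total = 0;
    for (const CameraGroup& g : groups) {
        assert(g.count >= 0);
        total += per_camera_mbps(g.resolution, g.codec) * g.count;
    }
    return total;
}
```

For example, `total_mbps({{Resolution::UHD4K, Codec::H264, 16}})` gives 288 Mbps for sixteen 4K H.264 cameras, and the same sixteen cameras streaming MJPEG would need 1024 Mbps.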

Page 13

Industry Trend: Cloud/Edge Unification

Page 14

Industry Trend: AI Proliferation

Genomics, Video Analytics, Healthcare, Finance, Data Center, 5G, Autonomous Driving, Security

Power-efficient inference along with traditional software

Page 15

Industry Trend: Heterogeneous Compute

• 1980-2000: SINGLE CORE (CPU), 2x / 1.5 years, process → Dennard scaling
• 2000-2010: MULTICORE (multicore CPU), 2x / 3.5 years, multithreading → Amdahl’s law
• 2010-2020: HETEROGENEOUS (multicore CPU + accelerator), 2x / 10 years, density → Moore’s law
• ADAPTIVE HETEROGENEOUS: FPGA, ACAP

Scaling from: silicon process → architecture-aware software → software-aware architecture
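The doubling periods on this slide are easier to compare as annual growth rates. As a worked conversion (the symbols $T$, $g$, $p$, $n$, and $S$ are introduced here and do not appear on the slide), a quantity that doubles every $T$ years grows by a factor $g = 2^{1/T}$ per year:

```latex
\begin{aligned}
g &= 2^{1/T} \\
T = 1.5\ \text{years}: \quad g &= 2^{1/1.5} \approx 1.59 \quad (\text{about } 59\%\ \text{per year}) \\
T = 3.5\ \text{years}: \quad g &= 2^{1/3.5} \approx 1.22 \quad (\text{about } 22\%\ \text{per year}) \\
T = 10\ \text{years}: \quad g &= 2^{1/10} \approx 1.07 \quad (\text{about } 7\%\ \text{per year})
\end{aligned}

% Amdahl's law, for reference: speedup of a program whose parallelizable
% fraction is p when it runs on n cores.
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}} \le \frac{1}{1 - p}
```

The bound on $S(n)$ shows why adding cores alone stops paying off: the serial fraction $1 - p$ caps the speedup no matter how large $n$ becomes.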

Page 16

Speed of Innovation Outpaces Silicon Cycles

[Figure: Top-1 accuracy (%), with axis ticks from 50 to 80, of image-classification networks from 2012 to 2018: AlexNet, BN-AlexNet, BN-NIN, ENet, GoogLeNet, ResNet-18, VGG-16, VGG-19, ResNet-34, ResNet-50, ResNet-101, ResNet-152, ResNeXt-101, Inception v3, Inception-v4, DenseNet-264, ShuffleNet 2x, SENet-154, MobileNet v2. Labels: Silicon Design Cycle, Pace of AI/ML Innovation, Innovation Cycle.]

Page 17

Architecture Adaptability

• Custom Data Flow
• Custom Precision
• Custom Memory

APPLICATION

DOMAIN

ARCHITECTURE

Page 18

ARCHITECTURE ADAPTABILITY

Page 19

Programmable OR Adaptable

[Figure: applications 1, 2, and 3 mapped onto an architecture, with the resulting COMPUTE EFFICIENCY. PROGRAMMABLE: CPU, GPU, ASSP. ADAPTABLE (once): ASIC.]

Page 20

Why Not Programmable AND Adaptable?

[Figure: applications 1 and 2 mapped onto domain-specific architectures DSA1 and DSA2, with the resulting COMPUTE EFFICIENCY. PROGRAMMABLE and ADAPTABLE: FPGA, ACAP.]

Page 21

Deep Learning vs. Traditional Software

Timeline (axis: 1997, 2004, 2009, 2014, 2019, 2024):

• Deep Blue (traditional software) beats Garry Kasparov
• Complexity: 10^120
• IBM Watson becomes Jeopardy champion!
• Image classification
• Classification better than humans
• AlphaGo beats Lee Sedol
• AlphaZero chess champion!
• ADAS
• Robo-taxis (geofenced)
• Fully autonomous vehicles

Page 22

The Monster

Pages 23-25

FPGAs

Page 26

The Solution

Page 27

SoC

FPGA Fabric
• 7 Series FPGA Fabric
• Custom Engines

Tightly Coupled Domains
• 3000+ interconnects
• Up to 100Gb/s Bandwidth

Integrated Analog
• Temp & Power Monitor
• 12-bit 1MSPS ADC

Integrated Peripherals
• USB, GigE, CAN
• UART, SDIO, I2C, SPI

High BW Memory
• L1/L2 Cache, OCM
• DDR2/3, LPDDR2 w/ECC

Application Processor
• Single or Dual Core
• Up to 1GHz A9

Device variants
• Dual-Core 1GHz, Kintex-7 FPGA Fabric
• Dual-Core 800MHz, Artix-7 FPGA Fabric
• Single-Core 766MHz, Artix-7 FPGA Fabric

Page 28

More peripherals?

Drag, Drop, and Customize

FPGA / SoC: Softcore / ARM Cortex (Memory Management Unit, Instruction Cache, Data Cache) with Ethernet, USB, UART, I2C Controller, SPI Controller, Ext Mem Controller, Ethernet Controller, DDR Controller, ...

Peripherals: UART1, UART2, ..., UARTN, USB, PWM, ADC, MIPI, HDMI, Ethernet, DDR2/3, WiFi, CAN, SPI, I2C, ML, ...

IP Catalog and Partner IP (Drag & Drop, 100s of IP & Peripherals)
• Automotive & Industrial
• Video & Image Processing
• Embedded
• Networking
• Digital Signal Processing

✓ Expand Interfaces and Features
✓ Adopt New Protocols (e.g., EtherCAT, TSN, …)
✓ Develop a “Future-Proof” project that evolves with market trends

Page 29

Application Example: Smart Camera

Trace

cv.cpp:

    // Convert the packed YUV 4:2:2 camera frame (UYVY or YUYV byte order) to BGR.
    if (is_uyvy) {
        uyvy2bgr(in_mat, in_rgb);
    } else {
        yuyv2bgr(in_mat, in_rgb);
    }

    // Resize with area interpolation; the template parameters fix the
    // maximum input and output frame sizes at compile time.
    resize<INTERPOLATION_AREA,
           MAX_IN_HEIGHT,
           MAX_IN_WIDTH,
           MAX_OUT_HEIGHT,
           MAX_OUT_WIDTH,
           NPC,
           MAX_DOWN_SCALE>(in_r, out_r);
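To make the color-conversion step concrete, here is a minimal plain C++ sketch of what a YUYV or UYVY to BGR conversion does for one pair of pixels. It is not the accelerated `uyvy2bgr` / `yuyv2bgr` code from the slide: the names `Bgr`, `clamp_u8`, `yuv_to_bgr`, and `packed422_to_bgr` are hypothetical, and the full-range BT.601 coefficients are an assumption, since the slide does not say which constants the library uses.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

using Bgr = std::array<std::uint8_t, 3>;  // channel order {B, G, R}

// Clamp a rounded value to the 0..255 range of an 8-bit channel.
static std::uint8_t clamp_u8(long v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return static_cast<std::uint8_t>(v);
}

// Convert one Y/U/V sample to BGR.
// Assumption: full-range BT.601 coefficients.
Bgr yuv_to_bgr(int y, int u, int v) {
    const double d = u - 128.0;  // centered blue-difference chroma
    const double e = v - 128.0;  // centered red-difference chroma
    const long r = std::lround(y + 1.402 * e);
    const long g = std::lround(y - 0.344136 * d - 0.714136 * e);
    const long b = std::lround(y + 1.772 * d);
    return {clamp_u8(b), clamp_u8(g), clamp_u8(r)};
}

// YUYV packs two pixels into four bytes as Y0 U Y1 V; UYVY packs them as
// U Y0 V Y1. Both pixels share the same U and V samples.
std::array<Bgr, 2> packed422_to_bgr(const std::uint8_t p[4], bool is_uyvy) {
    const int y0 = is_uyvy ? p[1] : p[0];
    const int u  = is_uyvy ? p[0] : p[1];
    const int y1 = is_uyvy ? p[3] : p[2];
    const int v  = is_uyvy ? p[2] : p[3];
    return {{yuv_to_bgr(y0, u, v), yuv_to_bgr(y1, u, v)}};
}
```

The only difference between the two branches of the `if (is_uyvy)` test is the byte order inside each four-byte group, which is why a single flag is enough to select the conversion.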

Page 30

Architecture for Smart Camera

Preprocess → AI → Postprocess

System Performance

ML Latency

Page 31

Adaptive Architecture for Smart Camera

[Figure: several implementations of the Preprocess → AI → Postprocess pipeline. Panel labels: Running in CPU; AI acceleration (In Programmable Logic, In AI Engine); Preprocess Acceleration in Programmable Logic; Vitis Dataflow Pipelining. Throughput labels: 6 FPS, 30 FPS, 40 FPS, 80 FPS.]
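The gain from dataflow pipelining can be sketched with a simple throughput model: when stages run back to back, a frame takes the sum of the stage latencies; when they overlap on consecutive frames, a frame completes every time the slowest stage finishes. The functions `sequential_fps` and `pipelined_fps` below are hypothetical names, and this model is an illustration, not a description of how Vitis schedules the pipeline.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Frames per second when each frame passes through every stage before the
// next frame starts: the per-frame time is the sum of the stage latencies.
double sequential_fps(const std::vector<double>& stage_ms) {
    assert(!stage_ms.empty());
    const double total = std::accumulate(stage_ms.begin(), stage_ms.end(), 0.0);
    return 1000.0 / total;
}

// Frames per second when the stages overlap on consecutive frames: once the
// pipeline is full, one frame completes per period of the slowest stage.
double pipelined_fps(const std::vector<double>& stage_ms) {
    assert(!stage_ms.empty());
    const double slowest = *std::max_element(stage_ms.begin(), stage_ms.end());
    return 1000.0 / slowest;
}
```

With invented stage latencies of 5 ms, 12.5 ms, and 7.5 ms for preprocess, AI, and postprocess, `sequential_fps` returns 40 and `pipelined_fps` returns 80. These latencies are made up for illustration and are not measurements from the slide.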

Page 32

Vitis: Unified Software Platform

• Domain-specific development environment: Vitis AI, Vitis Video, Partners
• Vitis accelerated libraries: OpenCV Library, BLAS Library, Finance Library, Genomics, Data Analytics, and more
• Vitis core development kit: Compilers, Analyzers, Debuggers
• Xilinx runtime libraries (XRT)
• Vitis target platform

Coming soon…

Page 33

Putting it All Together

• Hardware Developers
• Application Software Developers
• AI Scientists (iterations in minutes)
• Embedded Developers

Shell

Page 34

© Copyright 2019 Xilinx

VITIS AI Model Zoo

Application: Face
• Face detection
• Landmark Localization
• Face recognition
• Face attributes recognition

Application: Pedestrian
• Pedestrian Detection
• Pose Estimation
• Person Re-identification

Application: Video Analytics
• Object detection
• Pedestrian Attributes Recognition
• Car Attributes Recognition
• Car Logo Detection
• Car Logo Recognition
• License Plate Detection
• License Plate Recognition

Application: ADAS/AD
• Object Detection
• 3D Car Detection
• Lane Detection
• Traffic Sign Detection
• Semantic Segmentation
• Drivable Space Detection

✓ Open for all users
✓ Leveraging mainstream frameworks and networks
✓ Deployable and re-trainable

Page 35

Single-chip Deployment of Multiple Models

✓ Multi-task

✓ Multi-model

✓ Multi-framework

✓ Cascaded inference

✓ One or more DPU instances

✓ Custom layer types

✓ Graph segmentation

✓ One bitstream supports many CNNs

Page 36

Edge Deployment of Custom Models

Page 37

Extensive Open Source Libraries

400+ functions across 8 libraries

Open source, performance-optimized out-of-the-box acceleration

Library: Docs, Source, Tests, Examples, Benchmarks

[Figure: per-library counts. Surviving labels: 25 functions, 12, 99, 114, 365525, 37 Models.]

Page 38

Committed to Open Source

LLVM
• User Since 2001
• Contributor Since 2007
• Now Core to Xilinx Strategy

Contributions, 2007 to 2019: Compilers, AI optimization, Runtime, Libraries, AI Models

Page 39

AI Developer Hub

Page 40

What about Rhinos?

Page 41

More than 900 rhinos are still being poached each year.

In the last decade, 8,889 African rhinos have been lost to poaching.

Source: https://www.savetherhino.org/rhino-info/poaching-stats/

Page 42

Kutleng Engineering Technologies - SmartCAM

[Figure: CNN, DPU, AWS IoT Greengrass]

Page 43

FPGAs - Brave New World

Page 44

Building the Adaptable, Intelligent World