HC29 (2017)

Flint Center for the Performing Arts, Cupertino, California, Sunday-Tuesday, August 20-22, 2017.

Zipfile of all program/poster slides.

[accordion] [tabs] [tab title="Videos"]

Tutorial 1: P4 for Software Defined Networks: Language and Hardware Implementation

Tutorial 2: Building Autonomous Vehicles with NVIDIA’s DRIVE Platform

Session 1: GPU & Gaming

Session 2: IoT/Embedded

Keynote 1: Direct Human/Machine Interface

Session 3: Automotive

Session 4: Processors

Session 5: FPGA

Session 6: Neural Net I

Keynote 2: Advances in AI

Session 7: Neural Net II

Session 8: Architecture

Session 9: Server

[/tab] [tab title="At A Glance"]
  • Sunday 8/20: Tutorials
    • 8:00 AM – 9:00 AM: Breakfast
    • 9:00 AM – 12:20 PM: Tutorial 1: P4 for Software Defined Networks: Language and Hardware Implementation
    • 12:20 PM – 1:35 PM: Lunch
    • 1:35 PM – 4:30 PM: Tutorial 2: Building Autonomous Vehicles with NVIDIA’s DRIVE Platform
    • 4:30 PM – 6:00 PM: Reception
  • Monday 8/21: Conference Day 1
    • 7:30 AM – 9:15 AM: Breakfast
    • 9:15 AM – 9:30 AM: Introduction
    • 9:30 AM – 10:00 AM: GPU & Gaming
    • 10:00 AM – 10:30 AM: Break (Eclipse Viewing)
    • 10:30 AM – 11:30 AM: GPU & Gaming (cont)
    • 11:30 AM – 12:30 PM: IoT/Embedded
    • 12:30 PM – 1:45 PM: Lunch
    • 1:45 PM – 2:45 PM: Keynote 1: Direct Human/Machine Interface…
    • 2:45 PM – 3:45 PM: Automotive
    • 3:45 PM – 4:15 PM: Break
    • 4:15 PM – 6:15 PM: Processors
    • 6:15 PM – 7:15 PM: Reception (Wine & Snacks)
  • Tuesday 8/22: Conference Day 2
    • 7:15 AM – 8:15 AM: Breakfast
    • 8:15 AM – 10:15 AM: FPGA
    • 10:15 AM – 10:45 AM: Break
    • 10:45 AM – 11:45 AM: Neural Net
    • 11:45 AM – 12:45 PM: Keynote 2: Advances in AI…
    • 12:45 PM – 2:00 PM: Lunch
    • 2:00 PM – 3:30 PM: Neural Net (cont)
    • 3:30 PM – 4:30 PM: Architecture
    • 4:30 PM – 5:00 PM: Break
    • 5:00 PM – 7:00 PM: Server
    • 7:00 PM – 7:15 PM: Closing Remarks
[/tab] [tab title="Tutorials"]

Tutorials

Sun 8/20 Tutorial Title Presenter Affiliation
8:00 AM Breakfast
Tutorial 1: P4 for Software Defined Networks: Language and Hardware Implementation

The flexibility offered by Software Defined Networks (SDNs) is appealing to network operators, since it allows the customization of network behavior for application-specific needs. In SDNs, this flexibility comes at the cost of running the network on general-purpose hardware, sacrificing the performance that specialization can offer. A recent trend in the industry is to use specialized hardware (ASICs and FPGAs) to combine the best of SDN programmability and flexibility with hardware execution efficiency. Programmable networking hardware allows both enhancing legacy protocols (e.g., adding monitoring to traditional L2/L3 switching) and developing new protocols at a much faster pace.

To support programmability for network devices, P4 (www.p4.org) has been developed as a new programming language for describing how network packets should be processed on a variety of targets, ranging from general-purpose CPUs to NPUs, FPGAs, and custom ASICs. P4 was designed with three goals in mind: (i) protocol independence: devices should not “bake in” specific protocols; (ii) field reconfigurability: programmers should be able to modify the behavior of devices after they have been deployed; and (iii) portability: programs should not be tied to specific hardware targets. P4 is the first widely adopted domain-specific language for packet processing. Several vendors have developed FPGA-based implementations and, with the arrival of Tofino, there is already at least one domain-specific processor optimized as a compiler target for P4 programs – a processor with a small instruction set that can process one header per cycle. The P4 community has created – and continues to maintain and develop – the language specification, a set of open-source tools (compilers, debuggers, code analyzers, libraries, software P4 switches, etc.), and sample P4 programs, all with the goal of making it easy for P4 users to quickly and correctly author new data-plane behaviors. Specialized backend compilers and optimizers for vendor targets are built upon the open-source framework. Using P4 and the open-source tools, it is easy to prototype new ideas for networking hardware and applications simply by augmenting the compiler with support for the new hardware.
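P4 programs are organized around match-action tables: the program declares how packets are parsed and which tables they traverse, while the control plane populates table entries at run time. As a rough illustration of that model, here is a minimal Python sketch (not P4 itself, and not from the tutorial; all names are invented for illustration):

```python
# Minimal sketch of the match-action abstraction that P4 programs describe:
# a table matches on a header field and applies an action to the packet.
# All class and field names here are illustrative, not part of the P4 language.

class MatchActionTable:
    def __init__(self, key_field, default_action):
        self.key_field = key_field          # header field to match on
        self.default_action = default_action
        self.entries = {}                   # match value -> action

    def add_entry(self, value, action):
        """Populate the table at run time (a control-plane operation)."""
        self.entries[value] = action

    def apply(self, packet):
        """Look up the packet's key field and run the matching action."""
        action = self.entries.get(packet.get(self.key_field),
                                  self.default_action)
        return action(packet)

def forward(port):
    def action(packet):
        packet["egress_port"] = port
        return packet
    return action

def drop(packet):
    packet["egress_port"] = None
    return packet

# A tiny L2 forwarding table keyed on destination MAC address.
l2_table = MatchActionTable("dst_mac", default_action=drop)
l2_table.add_entry("aa:bb:cc:dd:ee:01", forward(1))
l2_table.add_entry("aa:bb:cc:dd:ee:02", forward(2))

pkt = {"dst_mac": "aa:bb:cc:dd:ee:02", "payload": b"hello"}
out = l2_table.apply(pkt)
print(out["egress_port"])  # 2
```

In real P4 the table structure is fixed at compile time and mapped onto hardware pipeline stages, which is what lets targets like Tofino process a header per cycle; only the entries change at run time.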

The goal of the tutorial is to introduce attendees to the domain of specialized hardware and programming tools for SDN. We discuss the basic operations in networking and how they have influenced the design of the P4 language and of specialized, programmable networking hardware. We will show how the design goals of the P4 language are met through examples of programs that can run on a variety of architectures. We provide several examples of applications (e.g., monitoring) that are enabled only by the combination of programmability and hardware execution efficiency. We expect that, at the end of the tutorial, attendees will be familiar with the application domain and with a set of implementations from different vendors that demonstrate various trade-offs. We aim to encourage researchers to consider programmable networking devices as an area where domain-specific specialization is already moving into commercial devices, and to contribute to the development of the P4-based ecosystem.

9:00 AM T1 Background on Software Defined Networking Johann Tonsing, Netronome
9:20 AM T1 P4 Language and Applications, Part 1 & 2 Jeongkeun Lee, Barefoot Networks; and Robert Halstead, Xilinx
10:20 AM Break
10:40 AM T1 Overview of the P4 tools Andy Fingerhut, Cisco
11:10 AM T1 P4 Hardware Implementations Jeongkeun Lee, Barefoot Networks; Johann Tonsing, Netronome; and Robert Halstead, Xilinx
12:10 PM T1 Future Directions: Research Problems, Getting Involved, and Resources Andy Fingerhut, Cisco
12:20 PM Lunch
Tutorial 2: Building Autonomous Vehicles with NVIDIA’s DRIVE Platform

It is not hard to imagine a day in the near future when explaining the concept of driving will be analogous to explaining the concept of playing music on a cassette player today. The path to that day is being paved by the rise of autonomous vehicles, which promise to make our roads safer and, in doing so, to redefine transportation as we know it.

NVIDIA’s DRIVE platform provides the foundation and the building blocks for developing autonomous vehicles and taking them to production. The hardware’s rich sensor-input capability enables data acquisition for use cases such as neural network training and HD-map generation and updates. The provided software accelerates the development of autonomous vehicle technologies such as perception, localization, and path planning. The platform offers a seamless transition for algorithms and tools created and tested on the development platform onto products that demand the quality and safety certification supplied by our Tier 1 partners.

We will be covering the advancements in hardware that facilitate optimized computation for self-driving tasks, including the detection of objects and obstacles around the vehicle. Until recently, the perception of the world around the vehicle was derived primarily from hand-crafted computer vision algorithms. While these handcrafted algorithms were adequate for basic ADAS, it is simply impossible to manually write code for every possible scenario an autonomous vehicle (AV) might encounter. AVs require a new computing model: deep learning. Deep learning algorithms help address the issues of robustness, accuracy, and scalability, and they have become more practical through advances in the availability of big data, compute power, and accelerated frameworks.

The goal of this tutorial is to provide an overview of the autonomous vehicle landscape through NVIDIA’s platform and to highlight how deep neural networks are changing the autonomous vehicle landscape.

1:35 PM T2 An Overview of NVIDIA’s Autonomous Vehicles Platform Pradeep Kumar Gupta NVIDIA
2:50 PM Break
3:15 PM T2 Deep Neural Networks – Changing the Autonomous Vehicles Landscape Dennis Lui NVIDIA
4:30 PM Reception
6:00 PM End of Reception
[/tab] [tab title="Conf. Day 1"]

Conference Day 1

Mon 8/21 Session Title Presenter Affiliation
7:30 AM Breakfast
9:15 AM Introduction
9:30 AM GPU and Gaming The Xbox One X Scorpio Engine John Sell Microsoft
10:00 AM Eclipse Viewing Break
10:30 AM GPU and Gaming (cont) AMD’s Radeon Next Generation GPU Michael Mantor & Ben Sander AMD
NVIDIA’s Volta GPU: Programmability and Performance for GPU Computing Jack Choquette NVIDIA
11:30 AM IOT/Embedded SiFive Freedom SoCs: Industry’s First Open-Source RISC-V Chips Yunsup Lee SiFive
Self-timed ARM M3 Microcontroller for Energy Harvested Applications David Baker ETA Compute
12:30 PM Lunch
1:45 PM Keynote 1 The Direct Human/Machine Interface and hints of a General Artificial Intelligence Dr. Phillip Alvelda Wiseteachers.com, Former DARPA PM
Abstract: Dr. Alvelda will speak about the latest and future developments in Brain-Machine Interface, and how new discoveries and interdisciplinary work in neuroscience are driving new extensions to information theory and computing architectures.
2:45 PM Automotive R-Car Gen3: Computing Platform for Autonomous Driving Era Mitsuhiko Igarashi & Kazuki Fukuoka Renesas Electronics Corporation
Localization for Next Generation Autonomous Vehicles Fergus Noble Swift Navigation
3:45 PM Break
4:15 PM Processors XPU: A programmable FPGA Accelerator for diverse workloads Jian Ouyang Baidu
Knights Mill: Intel Xeon Phi Processor for Machine Learning Jesus Corbal (Lead Presenter), Nawab Ali, Dennis Bradford, Sundaram Chinthamani, Ken Janik, Adhiraj Hassan Intel
Celerity: An Open Source RISC-V Tiered Accelerator Fabric Scott Davidson (UC San Diego), Khalid Al-Hawaj (Cornell), and Austin Rovinski (U. Michigan)
Graph Streaming Processor (GSP): A Next-Generation Computing Architecture Val Cook ThinCI
6:15 PM Reception
7:15 PM End of Reception
[/tab] [tab title="Conf. Day 2"]

Conference Day 2

Tue 8/22 Session Title Presenter Affiliation
7:15 AM Breakfast
8:15 AM FPGA Xilinx RFSoC: Monolithic Integration of RF Data Converters with All Programmable SoC in 16nm FinFET for Digital-RF Communications Brendan Farley Xilinx
Stratix 10: Intel’s 14nm Heterogeneous FPGA System-in-Package (SiP) Platform Sergey Shumarayev Altera/Intel
Xilinx 16nm Datacenter Device Family with In-Package HBM and CCIX Interconnect Gaurav Singh & Sagheer Ahmad Xilinx
FPGA Accelerated Computing Using AWS F1 Instances David Pellerin Amazon
10:15 AM Break
10:45 AM Neural Net 1 A Dataflow Processing Chip for Training Deep Neural Networks Chris Nicol Wave Computing
Accelerating Persistent Neural Networks at Datacenter Scale Eric Chung & Jeremy Fowers Microsoft
11:45 AM Keynote 2 Recent Advances in Artificial Intelligence via Machine Learning and the Implications for Computer System Design Jeff Dean Google
12:45 PM Lunch
2:00 PM Neural Net 2 DNN ENGINE: A 16nm Sub-uJ Deep Neural Network Inference Accelerator for the Embedded Masses Paul Whatmough Harvard University/ARM Research
DNPU: An Energy-Efficient Deep Neural Network Processor with On-Chip Stereo Matching Dongjoo Shin & Hoi-Jun Yoo KAIST
Evaluation of the Tensor Processing Unit: A Deep Neural Network Accelerator for the Datacenter Cliff Young Google
3:30 PM Architecture A 400Gbps Multi-Core Network Processor James Markevitch & Srinivasa Malladi Cisco
ARM DynamIQ: Intelligent Solutions using Cluster Based Multi-Processing Peter Greenhalgh ARM
4:30 PM Break
5:00 PM Server The Next Generation IBM Z Systems Processor Christian Jacobi & Anthony Saporito IBM
The Next Generation AMD Enterprise Server Product Architecture Kevin Lepak AMD
The New Intel® Xeon® Processor Scalable Family (Formerly Skylake-SP) Akhilesh Kumar Intel
Qualcomm Centriq 2400 Processor Thomas Speier & Barry Wolford Qualcomm
7:00 PM Closing Remarks
7:15 PM End of Conference
[/tab] [tab title="Posters"]

Posters

Title Presenter
Using Texture Compression Hardware for Neural Network Inference Hardik Sharma, Tom Olson and Alex Chalfin (Georgia Institute of Technology and ARM)
SoundTracing: Real-time Sound Propagation Hardware Accelerator Dukki Hong, Tae-Hyung Lee, Woonam Chung, Jinseok Hur, Yejong Joo, Juwon Yun, Imjae Hwang and Woo-Chan Park (Sejong University)
A Memory-Efficient Persistent Key-value Store on eNVM SSDs Arup De and Zvonimir Bandic (Western Digital)
Accelerating Big Data Workloads with FPGAs Balavinayagam Samynathan, Shahrzad Mirkhani, Weiwei Chen, John Davis, Maysam Lavasani and Behnam Robatmili (Bigstream)
Loom: A Precision Exploiting Neural Network Accelerator Sayeh Sharify (University of Toronto)
EPIPHANY-V: A TFLOPS scale 16nm 1024-core 64-bit RISC Array Processor Andreas Olofsson (Adapteva, Inc.)
Fully-Integrated Surround Vision and Mirror Replacement SoC for ADAS/Automated Driving Mihir Mody, Piyali Piyali, Rajat Sagar, Gregory Shurtz, Abhinay Armstrong, Yashwant Dutt, Kedar Chitnis, Peter Labaziewicz, Jason Jones and Manoj Koul (Texas Instruments)
GRVI Phalanx On Xilinx Virtex UltraScale+: A 1680-core, 26 MB RISC-V FPGA Parallel Processor Overlay Jan Gray (Gray Research LLC)
[/tab] [/tabs] [/accordion]