FPT'18

Program

10 Dec, 2018 (Mon) Workshop Day-1
11 Dec, 2018 (Tue) Workshop Day-2
12 Dec, 2018 (Wed) FPT'18 Conference Day-1
13 Dec, 2018 (Thu) FPT'18 Conference Day-2
14 Dec, 2018 (Fri) FPT'18 Conference Day-3, Workshop Day-3

10 Dec. (Mon)

10:00	Workshop: PYNQ place: Meeting room 5+6, Okinawaken Shichoson Jichi Hall
17:30

11 Dec. (Tue)

10:00	Workshop: Xilinx AWS place: Tenbusu Gallery, 3rd floor	Workshop: HPC-FPGA place: Tenbusu hall, 4th floor
am pm	Workshop: Xilinx AWS place: Tenbusu Gallery, 3rd floor	Workshop: RECONF-HPC place: Tenbusu hall, 4th floor	Workshop: INTEL-VINO place: Meeting room 1, 3rd floor
17:30

12 Dec. (Wed)

09:20	Opening
09:40	Keynote #1: Hideharu Amano session chair: Yuichiro Shibata
10:30	Break (30 min)
11:00	Oral Session #1: Neural Networks (4 talks x 20 min) session chair: Hiroki Nakahara
12:20	Lunch & Poster Session #1 (100 min) Design Competition (Preliminary)
14:00	Oral Session #2: High Level Synthesis (5 talks x 20 min) session chair: Lingli Wang
15:40	Break (30 min)
16:10	Oral Session #3: Stream Processing (4 talks x 20 min) session chair: Jason Anderson
~~17:30~~ 18:00	Ph.D. Forum, Demo Session, Design Competition (Final)
~~18:30~~ 19:00	Welcome Reception place: Main hall
20:30

13 Dec. (Thu)

09:20	Keynote #2: Mike Strickland session chair: Kentaro Sano
10:10	Break (30 min)
10:40	Oral Session #4: Design Methodologies and Tools (3 talks x 20 min) session chair: Oliver Diessel
11:40	Invited Talk: Hiroshi Miyata session chair: Yoshiki Yamaguchi
12:20	Lunch & Poster Session #2 (100 min)
14:00	Oral Session #5: Networking and Data Applications (5 talks x 20 min) session chair: David Thomas
15:40	Break (30 min)
16:10	Oral Session #6: Security and Dependability (4 talks x 20 min) session chair: Martin Herbordt
17:30	Bus transportation
18:30
19:00	Banquet LAGUNA GARDEN HOTEL
21:00

14 Dec. (Fri)

09:20	Keynote #3: Kees Vissers session chair: Wayne Luk
~~10:10~~ 10:40	Poster Session #3 (50 min)
~~11:00~~ 11:30	Oral Session #7: FPGA Architectures (3 talks x 20 min) session chair: Brad Hutchings
~~12:00~~ 12:30	Closing
~~12:20~~ 12:40	Lunch
14:00	Workshop: Embedded ML place: Tenbusu hall, 4th floor	Workshop: ZYNQ-HLS place: Meeting room 1+2, 3rd floor
17:30		Workshop: ZYNQ-HLS place: Meeting room 1+2, 3rd floor
18:00

Keynote Speakers

Hideharu Amano, Professor, Keio University

Title: Accelerator-in-Switch: a novel cooperation framework for FPGAs and GPUs
Abstract: Large-scale FPGAs have been used as high-performance switches that connect powerful computational accelerators, including GPUs. In most cases, the FPGA has room to implement additional accelerators that can process on-the-fly data directly in the switch. Here, we introduce two implementation examples: one is PEACH3, a low-latency switching hub for high-performance scientific computing, and the other is FiC-SW for AI computing in a cloud. The key technique is how to use partial reconfiguration and HLS description for the accelerator-in-switch.
Biography: Hideharu Amano received a Ph.D. from Keio University, Japan, in 1986. He is now a professor in the Dept. of Information and Computer Science, Keio University. His research interests include parallel architectures and reconfigurable computing.

Mike Strickland, Data Center Solution Architect, Intel Programmable Solutions Group, Director, Intel PSG

Title: FPGA Accelerated HPC and Data Analytics
Abstract: There are increasing opportunities in the data center for FPGA algorithm, networking, and data access acceleration. Microsoft has announced that they accelerate Bing search, machine learning, and networking with FPGAs, and they recently announced integration of FPGA-based Project Brainwave with Azure Machine Learning for real-time AI. FPGAs also deliver some futureproofing for AI, with current support for reduced-precision floating point and bfloat16 for better performance with minor loss of scoring accuracy. Intel and partners are also integrating FPGAs into data analytics frameworks and existing databases to enable enterprise customers to run unmodified applications without requiring any FPGA expertise for use with unstructured, NoSQL, and traditional relational databases. One such partner, Swarm64, is using a single FPGA for multiple acceleration roles to deliver 2X+ performance on the industry-standard TPC-H benchmark.
Biography: Mike Strickland has more than twenty years of computer, networking, and storage experience with companies such as Hewlett Packard, Silverback Systems, and Altera, which is now a part of Intel. He is currently leading the FPGA High Performance Computing vision across the Programmable Solutions Group at Intel. Previously, Strickland led the development and launch of numerous products including networking, storage management, TCP/IP Offload, and iSCSI. He holds a B.S. degree in electrical engineering from Brown University and an M.S. degree in management from the Sloan School of Management at M.I.T.

Kees Vissers, Xilinx, fellow

Title: Novel Neural Network Applications on New Python Enabled Platforms
Abstract: Reconfigurable technology is very well suited for novel implementations of neural networks. In this presentation we will show the range of implementations for neural networks on reconfigurable technology. This includes a direct dataflow implementation of networks, and implementations of a ‘soft’ processor using an array of DSP blocks. We will show the trade-offs in implementation cost and power over a range of implementations with varying bit-precisions. These implementations leverage novel Python-based abstractions that support a Jupyter Notebook interface. We will illustrate this with complete implementations on embedded platforms, including the Pynq platform, and on cloud-based platforms, including the AWS F1 platform. Finally, we will project the new possibilities of the new 7nm-based Xilinx platforms that contain specialized hardware that is very efficient for these neural network applications.
Biography: Kees Vissers graduated from Delft University in the Netherlands. He worked at Philips Research in Eindhoven, the Netherlands, for many years. The work included digital video system design, HW-SW co-design, VLIW processor design, and dedicated video processors. He was a visiting industrial fellow at Carnegie Mellon University, where he worked on early High Level Synthesis tools. He was a visiting industrial fellow at UC Berkeley, where he worked on several models of computation and dataflow computing. He was a director of architecture at Trimedia, and CTO at Chameleon Systems. For more than a decade he has headed a team of researchers at Xilinx, including a significant part of the Xilinx European Laboratories. The research topics include next-generation programming environments for processors and FPGA fabric, high-performance video systems, machine learning applications and architectures, wireless applications, and new datacenter applications. He has been instrumental in the High-Level Synthesis technology and is one of the technical leads in the novel ACAP technology. He is now a Fellow at Xilinx.

Invited Speaker

Hiroshi Miyata, FUJITSU LABORATORIES LTD, fellow

Title: Digital Transformation of Automobile and Mobility Service
Abstract: The automobile traffic system has not changed its physical, industrial, or social structure in the more than 100 years since its start. It has been deployed on a large scale and has played an important role in mobility. Physically, it consists of the driver, the automobile, and the road. These system elements have been in physical contact with each other and have been managed only by humans. Electric and electronic technology has improved the performance of automobile mechanical systems for more than 30 years. The digital transformation era is now beginning. The system elements will have a digital data connection, and the system's value, size, range, and role will change dramatically. CASE (Connected Car, Autonomous Driving, Sharing Car and Mobility as a Service, Electrification) will bring large-scale innovation not only to the automobile but also to the automobile traffic system, the automobile industry, and society. We will outline these new system and service concepts. We will then explain that this digital transformation process will continue for a long time, because the systems and services, in which people are involved, will change year by year. We will discuss the need for a data cycle to improve these systems and services. We will mention technologies needed for future Mobility IoT and automobile digital transformation, such as communication technologies, drive-by-wire technology, drive-by-information technologies, ICT-based traffic management technology, IoT technology, and cloud technology that offers scalability, flexibility, security, traceability, safety, and reliability.
Biography: Hiroshi Miyata is a fellow at FUJITSU LABORATORIES LTD. He has researched Mobility IoT-related technologies and systems since 2017. He received an M.S. from the University of Tokyo in 1981. He was an engineer at Toyota Motor Corp. (TMC) from 1981 to 2004. He developed the first automobile shock absorber electronic control system and developed suspension systems, a cruise control system, a throttle-by-wire system, Electronic Control Units, sensors, and actuators. He planned car electronics technology strategies in 1994. He promoted system integration called “Intelligent Transportation System” at Fujitsu Ten Ltd. from 1995 to 1996. He developed car multimedia systems, such as car navigation, emergency-call, and connected car systems, from 1997 to 2004 at TMC. He was president of Toyota InfoTechnology Center, a research center for information and communication technologies for automobiles, from 2005 to 2007. He was general manager of Electronics Engineering Div. I at TMC from 2008 to 2009. He was a Managing Officer at TMC from 2009 to 2011. He was an Executive Vice President of Toyota Technical Development Ltd from 2011 to 2016.

Best Paper Award

Dither NN: An Accurate Neural Network with Dithering for Low Bit-Precision Hardware
Kota Ando, Kodai Ueyoshi, Yuka Oba, Kazutoshi Hirose, Ryota Uematsu, Takumi Kudo, Masayuki Ikebe, Tetsuya Asai, Shinya Takamaeda-Yamazaki, Masato Motomura (Hokkaido University)

Program

* Best Paper Candidates

Oral Session 1: Neural Networks

* Kota Ando, Kodai Ueyoshi, Yuka Oba, Kazutoshi Hirose, Ryota Uematsu, Takumi Kudo, Masayuki Ikebe, Tetsuya Asai, Shinya Takamaeda-Yamazaki and Masato Motomura. Dither NN: An Accurate Neural Network with Dithering for Low Bit-Precision Hardware
* Hongxiang Fan, Shuanglong Liu, Martin Ferianc and Wayne Luk. A Real-Time Object Detection Accelerator with Compressed SSDLite on FPGA
Benjamin Morcos, Terrence C. Stewart, Chris Eliasmith and Nachiket Kapre. Implementing NEF Neural Networks on Embedded FPGAs
Shuanglong Liu, Chenglong Zeng, Hongxiang Fan, Ho-Cheung Ng, Jiuxi Meng and Wayne Luk. Memory-Efficient Architecture for Accelerating Generative Networks on FPGA

Oral Session 2: High Level Synthesis

* Anuj Vaishnav, Khoa Pham and Dirk Koch. Live Migration for OpenCL FPGA Accelerators
Ahmed Sanaullah, Rushi Patel and Martin Herbordt. Optimizing FPGA OpenCL Kernels for High Performance Computing
Jing Chen, Xue Liu and Jason Anderson. Software-Specified FPGA Accelerators for Elementary Functions
Leandro Rosa, Christos Bouganis and Vanderlei Bonato. Scaling Up Loop Pipelining For High-Level Synthesis: A Non-Iterative Approach
Jaume Bosch, Xubin Tan, Antonio Filgueras, Miquel Vidal, Marc Mateu, Daniel Jiménez-González, Carlos Alvarez, Xavier Martorell, Eduard Ayguade and Jesus Labarta. Application Acceleration on FPGAs with OmpSs@FPGA

Oral Session 3: Stream Processing

Philippos Papaphilippou, Chris Brooks and Wayne Luk. FLiMS: Fast Lightweight Merge Sorter
Makoto Saitoh and Kenji Kise. Very Massive Hardware Merge Sorter
Phong Tran, Thinh Hung Pham, Siew-Kei Lam, Meiqing Wu and Bhavan A. Jasani. Stream-based ORB Feature Extractor with Dynamic Power Optimization
Oscar Rahnama, Stuart Golodetz, Tommaso Cavallari and Philip Torr. R3SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems

Oral Session 4: Design Methodologies and Tools

Kevin E. Murray and Vaughn Betz. Tatum: Parallel Timing Analysis for Faster Design Cycles and Improved Optimization
Chirag Ravishankar, Henri Fraisse and Dinesh Gaitonde. SAT based Place-And-Route for High-Speed Designs on 2.5D FPGAs
Kuang-Ping Niu and Jason Anderson. Compact Area and Performance Modelling in CGRA Architecture Evaluation

Oral Session 5: Networking and Data Applications

Koya Mitsuzuka, Yuta Tokusashi and Hiroki Matsutani. MultiMQC: A Multilevel Message Queuing Cache Combining In-NIC and In-Kernel Memories
Yunhui Qiu, Hankun Lv, Jinyu Xie, Wenbo Yin and Lingli Wang. Ultra-Low-Latency and Flexible In-Memory Key-Value Store System Design on CPU-FPGA
Toshitaka Ito, Yuri Itotani, Shin'Ichi Wakabayashi, Shinobu Nagayama and Masato Inagi. A Nearest Neighbor Search Engine Using Distance-based Hashing
Siddhartha Siddhartha and Nachiket Kapre. DaCO: A High-Performance Token Dataflow Coprocessor Overlay for FPGAs
Nadeen Gebara, Jiuxi Meng, Paolo Costa and Wayne Luk. Scheduling Algorithms for High Performance Network Switches on FPGAs: A Survey

Oral Session 6: Security and Dependability

Thibaut Marty, Tomofumi Yuki and Steven Derrien. Enabling Overclocking with HLS Tools through Algorithm-Level Error Detection
Festus Hategekimana, Joel Mandebi Mbongue, Md Jubaer Hossain Pantho and Christophe Bobda. Secure Hardware Kernels Execution in CPU+FPGA Heterogeneous Cloud
Farnoud Farahmand, Malik Umar Sharif, Kevin Briggs and Kris Gaj. A High-Speed Constant-Time Hardware Implementation of NTRUEncrypt SVES
Shane Fleming and David Thomas. Injecting FPGA Configuration Memory Faults in Parallel

Oral Session 7: FPGA Architectures

Tian Tan, Eriko Nurvitadhi, David Shih and Derek Chiou. Evaluating The Highly-Pipelined Intel® Stratix® 10 FPGA Architecture Using Open-Source Benchmarks
Jin Hee Kim, Jongeun Lee and Jason Anderson. FPGA Architecture Enhancements for Efficient BNN Implementation
Brett Grady and Jason Anderson. Synthesizable Heterogeneous FPGA Fabrics

Poster Session 1

Siva Satyendra Sahoo, Tuan D. A. Nguyen, Bharadwaj Veeravalli and Akash Kumar. QoS-aware Cross-layer Reliability-integrated FPGA-based Dynamic Partially Reconfigurable System Partitioning
Jakub Cabal, Lukáš Kekely and Jan Kořenek. High-Speed Computation of CRC Codes for FPGAs
Bo Liu and James Xu. FCLNN: A Flexible Framework for Fast CNN Prototyping on FPGA with OpenCL and Caffe
Qiang Li, Shane Fleming, David Thomas and Peter Cheung. Accelerating Top-k ListNet Training for Ranking Using FPGA
Yu Zou and Mingjie Lin. GridGAS: An I/O Efficient Heterogeneous FPGA+CPU Computing Platform for Very Large-Scale Graph Analytics
Pok Yee Kwan, Gary C.T. Chow, Tim Todman, Wayne Luk and Wenguang Xu. Lossy Multiport Memory
Takuya Yamazaki and Tsutomu Maruyama. An FPGA Implementation of Robust Matting
Sadegh Yazdanshenas and Vaughn Betz. The Costs of Confidentiality in Virtualized FPGAs
Naohito Nakasato, Hiroshi Daisaka and Tadashi Ishikawa. High Performance High-Precision Floating-Point Operations on FPGAs using OpenCL
Tobias Drewes, Balasubramanian Gurumurthy, Jan Moritz Joseph, David Broneske, Gunter Saake and Thilo Pionteck. Efficient Inter-Kernel Communication for OpenCL Database Operators on FPGAs
Kento Tajiri and Tsutomu Maruyama. FPGA Acceleration of a Supervised Learning Method for Hyperspectral Image Classification
Liang Xie, Xitian Fan, Wei Cao and Lingli Wang. High Throughput CNN Accelerator Design Based on FPGA

Poster Session 2

Christopher Blochwitz, Julian Wolff, Mladen Berekovic, Dennis Heinrich, Sven Groppe, Jan Moritz Joseph and Thilo Pionteck. Hardware-Triplestore – a Hardware-centric Database for Semantic Web
Qiangpu Chen, Minghua Shen and Nong Xiao. DP-Pack: Distributed Parallel Packing for FPGAs
Dennis Gnad, Sascha Rapp, Jonas Krautter and Mehdi Tahoori. Checking for Electrical Level Security Threats in Bitstreams for Multi-Tenant FPGAs
Hossein Omidian and Guy Lemieux. An Accelerated OpenVX Overlay for Pure Software Programmers
Andre Bannwart Perina and Vanderlei Bonato. Mapping Estimator for OpenCL Heterogeneous Accelerators
Hiroki Nakahara, Masayuki Shimoda and Shimpei Sato. A Tri-State Weight Convolutional Neural Network for an FPGA: Applied to YOLOv2 Object Detector
Krystine Dawn Sherwin, Ben Stappers, Prabu Thiagaraj, Kevin I-Kai Wang and Oliver Sinnen. Investigating how hardware architectures are expressed in high-level languages for an SKA algorithm
Siddhartha Siddhartha, David Boland, Steve Wilton, Barry Flower, Phillip Leong and Perry Blackmore. Simultaneous Inference and Training using On-FPGA Weight Perturbation Techniques
Akira Jinguji, Tomoya Fujii, Shimpei Sato and Hiroki Nakahara. An FPGA Realization of OpenPose based on a Sparse Weight Convolutional Neural Network
Ryota Yasudo, Jose Gabriel Figueiredo Coutinho, Ana Lucia Varbanescu, Wayne Luk, Hideharu Amano and Tobias Becker. Performance Estimation for Exascale Reconfigurable Dataflow Platforms
Teng Yu, Bo Feng, Mark Stillwell, Liucheng Guo, Yuchun Ma and John Thomson. Topological Ranking-Based Resource Scheduling for Multi-FPGA Systems
Emmanouil Pissadakis, Nikolaos Alachiotis, Panagiotis Skrimponis, Dimitris Theodoropoulos, Thanasis Korakis and Dionisios Pnevmatikatos. ReFiRe: efficient deployment of Remote Fine-grained Reconfigurable accelerators

Poster Session 3

Jorge Echavarria, Stefan Wildermann and Jürgen Teich. A Combinatorial Two-Level Logic Approximation DSE Technique Targeting FPGAs
William Diehl, Farnoud Farahmand, Abubakr Abdulgadir, Jens-Peter Kaps and Kris Gaj. Face-off between the CAESAR Lightweight Finalists: ACORN vs. Ascon
Kristiyan Manev and Dirk Koch. Large Utility Sorting on FPGAs
Dionysios Diamantopoulos and Christoph Hagleitner. A System-level Transprecision FPGA Accelerator for BLSTM with On-chip Memory Reshaping
Md Jubaer Hossain Pantho, Joel Mandebi Mbongue, Christophe Bobda and David Andrews. Transparent Acceleration of OpenCV Library Kernels on FPGA-Attached Hybrid Memory Cube Computers
Wenzhi Fu, Jianlei Yang, Pengcheng Dai, Yiran Chen and Weisheng Zhao. A Scalable Pipelined Dataflow Accelerator for Object Region Proposals on FPGA Platform
Robert Hale and Brad Hutchings. Distributed-Memory Based FPGA Debug: Design Timing Impact
Matthew Ashcraft and Jeffrey Goeders. Unified On-Chip Software and Hardware Debug for HLS-Accelerated Programs
Haomiao Wang, Ben Stappers, Prabu Thiagaraj and Oliver Sinnen. Optimisation of Convolution of Multiple Different Sized Filters in SKA Pulsar Search Engine
Jia Liu and Qiang Liu. Speed and Resource Optimization of BFGS Quasi-Newton Implementation on FPGA using Inexact Line Search Method for Neural Network Training
Alexander Kroh and Oliver Diessel. A short-transfer model for tightly-coupled CPU-FPGA platforms

PhD Forum

Yu Xie, Chen He, Yi-Zhuang Xie, Chuang-An Mao and Bing-Yi Li. An Automated FPGA-Based Fault Injection Platform for Granularly-Pipelined Fault Tolerant CORDIC
Junichiro Kadomoto, Toru Koizumi, Akifumi Fukuda, Reoma Matsuo, Susumu Mashimo, Akifumi Fujita, Ryota Shioya, Hidetsugu Irie and Shuichi Sakai. An Area-Efficient Out-of-Order Soft-Core Processor Without Register Renaming
Antoniette Mondigo, Kentaro Sano and Hiroyuki Takizawa. Enhancing Memory Bandwidth in a Single Stream Computation with Multiple FPGAs

Demonstration Session

Lukáš Kekely, Martin Špinler, Štepán Friedl, Jiří Sikora, Jan Kořenek and Viktor Puš. Demonstration of Full-Duplex Packet Transfers over PCI Express with Sustained 200 Gbps Throughput
Donald Bailey, Yuan Chang and Steven Le Moan. Lens Distortion Self-Calibration using the Hough Transform
Shaoxia Fang, Lu Tian, Junbin Wang, Shuang Liang, Dongliang Xie, Zhongmin Chen, Lingzhi Sui, Qian Yu, Xiaoming Sun, Song Yao, Yi Shan and Yu Wang. Real-time Object Detection and Semantic Segmentation Hardware System with Deep Learning Networks
Daniel Holanda Noronha, Kahlan Gibson, Bahar Salehpour and Steve Wilton. LeFlow: Automatic Compilation of TensorFlow Machine Learning Applications to FPGAs

Design Competition

Kyosuke Mori, Yuuki Saito and Naohito Nakasato. Introduction of MNSTbot
Musashi Aoto, Yousuke Numata and Yasutaka Wada. Development of an FPGA controlled "Mini-Car" toward Autonomous Driving
Yuya Kudo, Atsushi Takada, Soji Tsuda, Takumi Sakai and Tomonori Izumi. A Platform on All-Programmable SoC for Micro Autonomous Robots
Yohei Shimmyo, Maiko Arakawa, Shunsuke Mie, Hiroaki Saito, Miyuka Nakamura, Yuichi Okuyama, Eriko Kayama, Misaki Kozakai and Hiroki Yomogita. Implementation of an Autonomous Driving System for FPT2018 FPGA Design Competition Using the Zynqberry Processing Board
Akira Kojima and Yohei Nose. Development of an Autonomous Driving Robot Car using FPGA
Hiromichi Wakatsuki, Takao Kido, Kenta Arai, Yuhei Sugata, Takashi Yokota, Kanemitsu Ootsu and Takeshi Ohkawa. Development of a Robot Car Based on Lane Detection with FPGA
Hiroki Bingo. Development of a Control Target Recognition for Autonomous Vehicle using FPGA with Python
Sou Tamura, Yasuhiro Nitta and Hideki Takase. A Study on Introducing FPGA to ROS based Autonomous Driving System
Kaijie Wei, Koki Honda and Hideharu Amano. FPGA Design for Autonomous Vehicle Driving Using Binarized Neural Networks