10 Dec. (Mon)
10:00 Workshop: PYNQ
place: Meeting room 5+6, Okinawaken Shichoson Jichi Hall
11 Dec. (Tue)
10:00 Workshop: Xilinx AWS
place: Tenbusu Gallery, 3rd floor
Workshop: HPC-FPGA
place: Tenbusu hall, 4th floor
Workshop: RECONF-HPC
place: Tenbusu hall, 4th floor
Workshop: INTEL-VINO
place: Meeting room 1, 3rd floor
12 Dec. (Wed)
09:20 Opening
09:40 Keynote #1: Hideharu Amano
session chair: Yuichiro Shibata
10:30 Break (30 min)
11:00 Oral Session #1:
Neural Networks
(4 talks x 20 min)
session chair: Hiroki Nakahara
12:20 Lunch &
Poster Session #1
(100 min)
Design Competition
14:00 Oral Session #2:
High Level Synthesis
(5 talks x 20 min)
session chair: Lingli Wang
15:40 Break (30 min)
16:10 Oral Session #3:
Stream Processing
(4 talks x 20 min)
session chair: Jason Anderson
Ph.D Forum
Demo Session
Design Competition (Final)
Welcome Reception
place: Main hall
13 Dec. (Thu)
09:20 Keynote #2: Mike Strickland
session chair: Kentaro Sano
10:10 Break (30 min)
10:40 Oral Session #4:
Design Methodologies and Tools
(3 talks x 20 min)
session chair: Oliver Diessel
11:40 Invited Talk: Hiroshi Miyata
session chair: Yoshiki Yamaguchi
12:20 Lunch &
Poster Session #2
(100 min)
14:00 Oral Session #5:
Networking and Data Applications
(5 talks x 20 min)
session chair: David Thomas
15:40 Break (30 min)
16:10 Oral Session #6:
Security and Dependability
(4 talks x 20 min)
session chair: Martin Herbordt
17:30 Bus transportation
19:00 Banquet
14 Dec. (Fri)
09:20 Keynote #3: Kees Vissers
session chair: Wayne Luk
Poster Session #3
(50 min)
Oral Session #7:
FPGA Architectures
(3 talks x 20 min)
session chair: Brad Hutchings
14:00 Workshop: Embedded ML
place: Tenbusu hall, 4th floor
Workshop: ZYNQ-HLS
place: Meeting room 1+2, 3rd floor

Keynote Speakers

Hideharu Amano, Professor, Keio University

Title: Accelerator-in-Switch: a novel cooperation framework for FPGAs and GPUs
Abstract: Large-scale FPGAs have been used as high-performance switches that connect powerful computational accelerators, including GPUs. In most cases, the FPGA has room to implement additional accelerators that can process on-the-fly data directly in the switch. Here, we introduce two implementation examples: one is PEACH3, a low-latency switching hub for high-performance scientific computing, and the other is FiC-SW for AI computing in a cloud. The key technique is how to use partial reconfiguration and HLS description for the accelerator-in-switch.
Biography: Hideharu Amano received his Ph.D. from Keio University, Japan, in 1986. He is now a professor in the Dept. of Information and Computer Science, Keio University. His research interests include parallel architectures and reconfigurable computing.

Mike Strickland, Data Center Solution Architect, Intel Programmable Solutions Group, Director, Intel PSG

Title: FPGA Accelerated HPC and Data Analytics
Abstract: There are increasing opportunities in the data center for FPGA algorithm, networking, and data access acceleration. Microsoft has announced that they accelerate Bing search, machine learning, and networking with FPGAs – and they recently announced integration of FPGA-based Project Brainwave with Azure Machine Learning for real-time AI. FPGAs also deliver some futureproofing for AI, with current support for reduced-precision floating point and bfloat16 for better performance with minor loss of scoring accuracy. Intel and partners are also integrating FPGAs into data analytics frameworks and existing databases to enable enterprise customers to run unmodified applications without requiring any FPGA expertise for use with unstructured, NoSQL, and traditional relational databases. One such partner, Swarm64, is using a single FPGA for multiple acceleration roles to deliver 2X+ performance on the industry-standard TPC-H benchmark.
Biography: Mike Strickland has more than twenty years of computer, networking and storage experience with companies such as Hewlett Packard, Silverback Systems, and Altera, which is now a part of Intel. He is currently leading the FPGA High Performance Computing vision across the Programmable Solutions Group at Intel. Previously, Strickland led the development and launch of numerous products including networking, storage management, TCP/IP Offload and iSCSI. He holds a B.S. degree in electrical engineering from Brown University and an M.S. degree in management from the Sloan School of Management at M.I.T.

Kees Vissers, Xilinx, fellow

Title: Novel Neural Network Applications on New Python Enabled Platforms
Abstract: Reconfigurable technology is very well suited for novel implementations of neural networks. In this presentation we will show the range of implementations for neural networks on reconfigurable technology. This includes a direct dataflow implementation of networks, and implementations of a ‘soft’ processor using an array of DSP blocks. We will show the trade-offs in implementation cost and power over a range of implementations with varying bit-precisions. These implementations leverage novel Python-based abstractions that support a Jupyter Notebook interface. We will illustrate this with complete implementations on embedded platforms including the Pynq platform and on cloud-based platforms including the AWS F1 platform. Finally, we will project the new possibilities of the new 7nm-based Xilinx platforms that contain specialized hardware that is very efficient for these neural network applications.
Biography: Kees Vissers graduated from Delft University in the Netherlands. He worked at Philips Research in Eindhoven, the Netherlands, for many years. The work included Digital Video system design, HW-SW co-design, VLIW processor design and dedicated video processors. He was a visiting industrial fellow at Carnegie Mellon University, where he worked on early High Level Synthesis tools. He was a visiting industrial fellow at UC Berkeley, where he worked on several models of computation and dataflow computing. He was a director of architecture at Trimedia, and CTO at Chameleon Systems. For more than a decade he has been heading a team of researchers at Xilinx, including a significant part of the Xilinx European Laboratories. The research topics include next generation programming environments for processors and FPGA fabric, high-performance video systems, machine learning applications and architectures, wireless applications and new datacenter applications. He has been instrumental in the High-Level Synthesis technology and is one of the technical leads in the novel ACAP technology. He is now a Fellow at Xilinx.

Invited Speaker

Hiroshi Miyata, FUJITSU LABORATORIES LTD, fellow

Title: Digital Transformation of Automobile and Mobility Service
Abstract: The automobile traffic system has not changed its physical, industrial, and social structure in the more than 100 years since its start. It has been deployed on a large scale and plays an important role in mobility. Physically, it consists of the driver, the automobile, and the road. The system elements have been in physical contact with each other and have been managed only by humans. Electric and electronic technologies have improved the performance of automobile mechanical systems for more than 30 years. The era of digital transformation is beginning. The system elements will have a digital data connection. The system's value, size, range, and role will change dramatically. CASE (Connected Car, Autonomous Driving, Sharing Car and Mobility as a Service, Electrification) will bring large-scale innovation not only to the automobile but also to the automobile traffic system, the automobile industry, and society. We will outline these new system and service concepts. Then we will describe how this digital transformation process will continue for a long time, because the system and service, in which people are involved, will change year by year. We will discuss the need for a data cycle to improve this system and service. We will mention technologies needed for future Mobility IoT and automobile digital transformation, such as communication technologies, drive-by-wire technology, drive-by-information technologies, ICT-based traffic management technology, IoT technology, and cloud technology that offers scalability, flexibility, security, traceability, safety, and reliability.
Biography: Hiroshi Miyata is a fellow at FUJITSU LABORATORIES LTD. He has researched Mobility IoT related technologies and systems since 2017. He received an M.S. from the University of Tokyo in 1981. He was an engineer at Toyota Motor Corp. (TMC) from 1981 to 2004. He developed the first automobile shock absorber electronic control system and developed suspension systems, cruise control systems, throttle-by-wire systems, Electronic Control Units, sensors, and actuators. He planned car electronics technology strategies in 1994. He promoted system integration called “Intelligent Transportation System” at Fujitsu Ten Ltd. from 1995 to 1996. He developed car multimedia systems, such as car navigation, emergency-call, and connected car systems, from 1997 to 2004 at TMC. He was president of Toyota InfoTechnology Center, a research center for information and communication technologies for automobiles, from 2005 to 2007. He was general manager of Electronics Engineering Div. I at TMC from 2008 to 2009. He was a Managing Officer at TMC from 2009 to 2011. He was an Executive Vice President of Toyota Technical Development Ltd. from 2011 to 2016.

Best Paper Award

  • Dither NN: An Accurate Neural Network with Dithering for Low Bit-Precision Hardware
    Kota Ando, Kodai Ueyoshi, Yuka Oba, Kazutoshi Hirose, Ryota Uematsu, Takumi Kudo, Masayuki Ikebe, Tetsuya Asai, Shinya Takamaeda-Yamazaki, Masato Motomura (Hokkaido University)


* Best Paper Candidates
Oral Session 1: Neural Networks
  • * Kota Ando, Kodai Ueyoshi, Yuka Oba, Kazutoshi Hirose, Ryota Uematsu, Takumi Kudo, Masayuki Ikebe, Tetsuya Asai, Shinya Takamaeda-Yamazaki and Masato Motomura. Dither NN: An Accurate Neural Network with Dithering for Low Bit-Precision Hardware
  • * Hongxiang Fan, Shuanglong Liu, Martin Ferianc and Wayne Luk. A Real-Time Object Detection Accelerator with Compressed SSDLite on FPGA
  • Benjamin Morcos, Terrence C. Stewart, Chris Eliasmith and Nachiket Kapre. Implementing NEF Neural Networks on Embedded FPGAs
  • Shuanglong Liu, Chenglong Zeng, Hongxiang Fan, Ho-Cheung Ng, Jiuxi Meng and Wayne Luk. Memory-Efficient Architecture for Accelerating Generative Networks on FPGA
Oral Session 2: High Level Synthesis
  • * Anuj Vaishnav, Khoa Pham and Dirk Koch. Live Migration for OpenCL FPGA Accelerators
  • Ahmed Sanaullah, Rushi Patel and Martin Herbordt. Optimizing FPGA OpenCL Kernels for High Performance Computing
  • Jing Chen, Xue Liu and Jason Anderson. Software-Specified FPGA Accelerators for Elementary Functions
  • Leandro Rosa, Christos Bouganis and Vanderlei Bonato. Scaling Up Loop Pipelining For High-Level Synthesis: A Non-Iterative Approach
  • Jaume Bosch, Xubin Tan, Antonio Filgueras, Miquel Vidal, Marc Mateu, Daniel Jiménez-González, Carlos Alvarez, Xavier Martorell, Eduard Ayguade and Jesus Labarta. Application Acceleration on FPGAs with OmpSs@FPGA
Oral Session 3: Stream Processing
  • Philippos Papaphilippou, Chris Brooks and Wayne Luk. FLiMS: Fast Lightweight Merge Sorter
  • Makoto Saitoh and Kenji Kise. Very Massive Hardware Merge Sorter
  • Phong Tran, Thinh Hung Pham, Siew-Kei Lam, Meiqing Wu and Bhavan A. Jasani. Stream-based ORB Feature Extractor with Dynamic Power Optimization
  • Oscar Rahnama, Stuart Golodetz, Tommaso Cavallari and Philip Torr. R3SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems
Oral Session 4: Design Methodologies and Tools
  • Kevin E. Murray and Vaughn Betz. Tatum: Parallel Timing Analysis for Faster Design Cycles and Improved Optimization
  • Chirag Ravishankar, Henri Fraisse and Dinesh Gaitonde. SAT based Place-And-Route for High-Speed Designs on 2.5D FPGAs
  • Kuang-Ping Niu and Jason Anderson. Compact Area and Performance Modelling in CGRA Architecture Evaluation
Oral Session 5: Networking and Data Applications
  • Koya Mitsuzuka, Yuta Tokusashi and Hiroki Matsutani. MultiMQC: A Multilevel Message Queuing Cache Combining In-NIC and In-Kernel Memories
  • Yunhui Qiu, Hankun Lv, Jinyu Xie, Wenbo Yin and Lingli Wang. Ultra-Low-Latency and Flexible In-Memory Key-Value Store System Design on CPU-FPGA
  • Toshitaka Ito, Yuri Itotani, Shin'Ichi Wakabayashi, Shinobu Nagayama and Masato Inagi. A Nearest Neighbor Search Engine Using Distance-based Hashing
  • Siddhartha Siddhartha and Nachiket Kapre. DaCO: A High-Performance Token Dataflow Coprocessor Overlay for FPGAs
  • Nadeen Gebara, Jiuxi Meng, Paolo Costa and Wayne Luk. Scheduling Algorithms for High Performance Network Switches on FPGAs: A Survey
Oral Session 6: Security and Dependability
  • Thibaut Marty, Tomofumi Yuki and Steven Derrien. Enabling Overclocking with HLS Tools through Algorithm-Level Error Detection
  • Festus Hategekimana, Joel Mandebi Mbongue, Md Jubaer Hossain Pantho and Christophe Bobda. Secure Hardware Kernels Execution in CPU+FPGA Heterogeneous Cloud
  • Farnoud Farahmand, Malik Umar Sharif, Kevin Briggs and Kris Gaj. A High-Speed Constant-Time Hardware Implementation of NTRUEncrypt SVES
  • Shane Fleming and David Thomas. Injecting FPGA Configuration Memory Faults in Parallel
Oral Session 7: FPGA Architectures
  • Tian Tan, Eriko Nurvitadhi, David Shih and Derek Chiou. Evaluating The Highly-Pipelined Intel® Stratix® 10 FPGA Architecture Using Open-Source Benchmarks
  • Jin Hee Kim, Jongeun Lee and Jason Anderson. FPGA Architecture Enhancements for Efficient BNN Implementation
  • Brett Grady and Jason Anderson. Synthesizable Heterogeneous FPGA Fabrics
Poster Session 1
  • Siva Satyendra Sahoo, Tuan D. A. Nguyen, Bharadwaj Veeravalli and Akash Kumar. QoS-aware Cross-layer Reliability-integrated FPGA-based Dynamic Partially Reconfigurable System Partitioning
  • Jakub Cabal, Lukáš Kekely and Jan Kořenek. High-Speed Computation of CRC Codes for FPGAs
  • Bo Liu and James Xu. FCLNN: A Flexible Framework for Fast CNN Prototyping on FPGA with OpenCL and Caffe
  • Qiang Li, Shane Fleming, David Thomas and Peter Cheung. Accelerating Top-k ListNet Training for Ranking Using FPGA
  • Yu Zou and Mingjie Lin. GridGAS: An I/O Efficient Heterogeneous FPGA+CPU Computing Platform for Very Large-Scale Graph Analytics
  • Pok Yee Kwan, Gary C.T. Chow, Tim Todman, Wayne Luk and Wenguang Xu. Lossy Multiport Memory
  • Takuya Yamazaki and Tsutomu Maruyama. An FPGA Implementation of Robust Matting
  • Sadegh Yazdanshenas and Vaughn Betz. The Costs of Confidentiality in Virtualized FPGAs
  • Naohito Nakasato, Hiroshi Daisaka and Tadashi Ishikawa. High Performance High-Precision Floating-Point Operations on FPGAs using OpenCL
  • Tobias Drewes, Balasubramanian Gurumurthy, Jan Moritz Joseph, David Broneske, Gunter Saake and Thilo Pionteck. Efficient Inter-Kernel Communication for OpenCL Database Operators on FPGAs
  • Kento Tajiri and Tsutomu Maruyama. FPGA Acceleration of a Supervised Learning Method for Hyperspectral Image Classification
  • Liang Xie, Xitian Fan, Wei Cao and Lingli Wang. High Throughput CNN Accelerator Design Based on FPGA
Poster Session 2
  • Christopher Blochwitz, Julian Wolff, Mladen Berekovic, Dennis Heinrich, Sven Groppe, Jan Moritz Joseph and Thilo Pionteck. Hardware-Triplestore – a Hardware-centric Database for Semantic Web
  • Qiangpu Chen, Minghua Shen and Nong Xiao. DP-Pack: Distributed Parallel Packing for FPGAs
  • Dennis Gnad, Sascha Rapp, Jonas Krautter and Mehdi Tahoori. Checking for Electrical Level Security Threats in Bitstreams for Multi-Tenant FPGAs
  • Hossein Omidian and Guy Lemieux. An Accelerated OpenVX Overlay for Pure Software Programmers
  • Andre Bannwart Perina and Vanderlei Bonato. Mapping Estimator for OpenCL Heterogeneous Accelerators
  • Hiroki Nakahara, Masayuki Shimoda and Shimpei Sato. A Tri-State Weight Convolutional Neural Network for an FPGA: Applied to YOLOv2 Object Detector
  • Krystine Dawn Sherwin, Ben Stappers, Prabu Thiagaraj, Kevin I-Kai Wang and Oliver Sinnen. Investigating how hardware architectures are expressed in high-level languages for an SKA algorithm
  • Siddhartha Siddhartha, David Boland, Steve Wilton, Barry Flower, Phillip Leong and Perry Blackmore. Simultaneous Inference and Training using On-FPGA Weight Perturbation Techniques
  • Akira Jinguji, Tomoya Fujii, Shimpei Sato and Hiroki Nakahara. An FPGA Realization of OpenPose based on a Sparse Weight Convolutional Neural Network
  • Ryota Yasudo, Jose Gabriel Figueiredo Coutinho, Ana Lucia Varbanescu, Wayne Luk, Hideharu Amano and Tobias Becker. Performance Estimation for Exascale Reconfigurable Dataflow Platforms
  • Teng Yu, Bo Feng, Mark Stillwell, Liucheng Guo, Yuchun Ma and John Thomson. Topological Ranking-Based Resource Scheduling for Multi-FPGA Systems
  • Emmanouil Pissadakis, Nikolaos Alachiotis, Panagiotis Skrimponis, Dimitris Theodoropoulos, Thanasis Korakis and Dionisios Pnevmatikatos. ReFiRe: efficient deployment of Remote Fine-grained Reconfigurable accelerators
Poster Session 3
  • Jorge Echavarria, Stefan Wildermann and Jürgen Teich. A Combinatorial Two-Level Logic Approximation DSE Technique Targeting FPGAs
  • William Diehl, Farnoud Farahmand, Abubakr Abdulgadir, Jens-Peter Kaps and Kris Gaj. Face-off between the CAESAR Lightweight Finalists: ACORN vs. Ascon
  • Kristiyan Manev and Dirk Koch. Large Utility Sorting on FPGAs
  • Dionysios Diamantopoulos and Christoph Hagleitner. A System-level Transprecision FPGA Accelerator for BLSTM with On-chip Memory Reshaping
  • Md Jubaer Hossain Pantho, Joel Mandebi Mbongue, Christophe Bobda and David Andrews. Transparent Acceleration of OpenCV Library Kernels on FPGA-Attached Hybrid Memory Cube Computers
  • Wenzhi Fu, Jianlei Yang, Pengcheng Dai, Yiran Chen and Weisheng Zhao. A Scalable Pipelined Dataflow Accelerator for Object Region Proposals on FPGA Platform
  • Robert Hale and Brad Hutchings. Distributed-Memory Based FPGA Debug: Design Timing Impact
  • Matthew Ashcraft and Jeffrey Goeders. Unified On-Chip Software and Hardware Debug for HLS-Accelerated Programs
  • Haomiao Wang, Ben Stappers, Prabu Thiagaraj and Oliver Sinnen. Optimisation of Convolution of Multiple Different Sized Filters in SKA Pulsar Search Engine
  • Jia Liu and Qiang Liu. Speed and Resource Optimization of BFGS Quasi-Newton Implementation on FPGA using Inexact Line Search Method for Neural Network Training
  • Alexander Kroh and Oliver Diessel. A short-transfer model for tightly-coupled CPU-FPGA platforms
PhD Forum
  • Yu Xie, Chen He, Yi-Zhuang Xie, Chuang-An Mao and Bing-Yi Li. An Automated FPGA-Based Fault Injection Platform for Granularly-Pipelined Fault Tolerant CORDIC
  • Junichiro Kadomoto, Toru Koizumi, Akifumi Fukuda, Reoma Matsuo, Susumu Mashimo, Akifumi Fujita, Ryota Shioya, Hidetsugu Irie and Shuichi Sakai. An Area-Efficient Out-of-Order Soft-Core Processor Without Register Renaming
  • Antoniette Mondigo, Kentaro Sano and Hiroyuki Takizawa. Enhancing Memory Bandwidth in a Single Stream Computation with Multiple FPGAs
Demonstration Session
  • Lukáš Kekely, Martin Špinler, Štepán Friedl, Jiří Sikora, Jan Kořenek and Viktor Puš. Demonstration of Full-Duplex Packet Transfers over PCI Express with Sustained 200 Gbps Throughput
  • Donald Bailey, Yuan Chang and Steven Le Moan. Lens Distortion Self-Calibration using the Hough Transform
  • Shaoxia Fang, Lu Tian, Junbin Wang, Shuang Liang, Dongliang Xie, Zhongmin Chen, Lingzhi Sui, Qian Yu, Xiaoming Sun, Song Yao, Yi Shan and Yu Wang. Real-time Object Detection and Semantic Segmentation Hardware System with Deep Learning Networks
  • Daniel Holanda Noronha, Kahlan Gibson, Bahar Salehpour and Steve Wilton. LeFlow: Automatic Compilation of TensorFlow Machine Learning Applications to FPGAs
Design Competition
  • Kyosuke Mori, Yuuki Saito and Naohito Nakasato. Introduction of MNSTbot
  • Musashi Aoto, Yousuke Numata and Yasutaka Wada. Development of an FPGA controlled "Mini-Car" toward Autonomous Driving
  • Yuya Kudo, Atsushi Takada, Soji Tsuda, Takumi Sakai and Tomonori Izumi. A Platform on All-Programmable SoC for Micro Autonomous Robots
  • Yohei Shimmyo, Maiko Arakawa, Shunsuke Mie, Hiroaki Saito, Miyuka Nakamura, Yuichi Okuyama, Eriko Kayama, Misaki Kozakai and Hiroki Yomogita. Implementation of an Autonomous Driving System for FPT2018 FPGA Design Competition Using the Zynqberry Processing Board
  • Akira Kojima and Yohei Nose. Development of an Autonomous Driving Robot Car using FPGA
  • Hiromichi Wakatsuki, Takao Kido, Kenta Arai, Yuhei Sugata, Takashi Yokota, Kanemitsu Ootsu and Takeshi Ohkawa. Development of a Robot Car Based on Lane Detection with FPGA
  • Hiroki Bingo. Development of a Control Target Recognition for Autonomous Vehicle using FPGA with Python
  • Sou Tamura, Yasuhiro Nitta and Hideki Takase. A Study on Introducing FPGA to ROS based Autonomous Driving System
  • Kaijie Wei, Koki Honda and Hideharu Amano. FPGA Design for Autonomous Vehicle Driving Using Binarized Neural Networks