Live Virtual Developer Event Tuesday, October 26 from 10:00 AM to 7:00 PM PT
Thank you for your interest in this one-day developer deep-dive, part of Intel Innovation 2021. Reserve your seat today and join technology innovators from around the world—early adopters, Intel architects, academia, and tech leaders—who share a common interest in oneAPI, an open, standards-based, cross-architecture programming model.
When you do, you’ll get to participate in hands-on technical tutorials, tech talks, and lightning talks, plus attend multiple sessions across 4 tech tracks:
Parallel Programming
AI Analytics
Cross-platform oneAPI performance libraries
FPGA Programming
Additionally, you’ll get to hear how companies (Facebook, Laika, Red Hat, Amazon, BittWare, and more) are optimizing their products using oneAPI, and explore the latest oneAPI-fueled solutions to real-world problems for AI, Cloud, Edge, and 5G.
Takeaways
At the end of the oneAPI DevFest, you will walk away with examples and foundational training on what oneAPI delivers:
An open alternative to proprietary lock-in
Application performance across CPUs, GPUs, FPGAs, and other accelerators
A complete set of Intel cross-architecture libraries and tools for quick and efficient heterogeneous development
Next Steps
Register and check the “Yes, I would like to attend oneAPI DevFest” box.
Joe Curley will open the summit and welcome the audience. He will introduce the inspiration behind oneAPI and the industry problem we hope to solve together, and set the stage for the rest of the conference.
Speaker:
Joe Curley, Intel
Joseph (Joe) Curley serves Intel Corporation as Senior Director, oneAPI Products, Solutions & Ecosystem. His primary responsibilities include supporting the oneAPI industry initiative, product management of Intel’s oneAPI product implementation, and supporting the oneAPI developer ecosystem. Mr. Curley joined Intel Corporation in 2007, and has served in multiple other strategic planning, ecosystem development, and business leadership roles. Prior to joining Intel, Joe worked at Dell, Inc. leading the global workstation product line, the consumer and small business desktop product line, and in a series of engineering roles. He began his career at computer graphics pioneer Tseng Labs.
10:40 – 11:10 am
The Evolution of Open Standards Programming for HPC and AI
An increasing number of developers are seeking the parallelism delivered by accelerator processors such as GPUs for their HPC and AI applications. To get performance in AI and HPC programming, developers have often needed to use closed, proprietary programming models.
oneAPI is based on open standards that deliver both performance and portability, and at its core is the SYCL open standard. This open and unified programming model gives developers a way to take advantage of the growing diversity of processor platforms. This presentation will explore how the community is creating an ecosystem using open standards and how we can get the whole industry to work together on this initiative.
Speaker:
Michael Wong, Codeplay
Michael Wong is a Distinguished Engineer at Codeplay Software, a Scottish company that produces compilers, debuggers, runtimes, testing systems, and other specialized tools to aid software development for heterogeneous systems, accelerators, and special-purpose processor architectures, including GPUs and DSPs. He is a member of the open consortium groups Khronos, MISRA, and AUTOSAR, and is Chair of the Khronos C++ heterogeneous programming language SYCL. For twenty years, he was the Senior Technical Strategy Architect for IBM compilers.
11:10 – 11:20 am
Break
11:20 – 12:05 pm
oneAPI Architecture
Paul Petersen, the architect for oneAPI, will explain oneAPI in terms of the technical challenges it seeks to solve in an open, multivendor, multiarchitecture manner. Paul will share his views on oneAPI’s technical accomplishments and challenges: past, present, and future. This promises to be an informative, must-not-miss talk.
Speaker:
Paul Petersen, Intel
Paul Petersen is a Sr. Principal Engineer in SATG (Software and Advanced Technology Group), and oneAPI Tools Architect. He received a Ph.D. in Computer Science from the University of Illinois in 1993.
He started at Kuck and Associates, Inc. (KAI), where his responsibilities included enhancing the auto-parallelizing compiler (KAP) and the early definition and implementations of OpenMP. While at KAI, he developed the Assure line of parallelization/correctness products for Fortran, C++, and Java. In 2000, Intel Corporation acquired KAI, and he joined the software tools group creating the Thread Checker products, which evolved into the Inspector and Advisor components of the Intel® Parallel Studio. Inspector uses dynamic binary instrumentation to detect memory and concurrency bugs, and Advisor uses similar techniques along with performance measurement and modeling to assist developers in transforming existing serial applications to be ready for parallel execution. His passion for software architecture grew to cover the architecture of all of Parallel Studio XE and its components. After a few years leading software tools pathfinding, with a focus on defining next-generation features for parallel runtimes and software analysis tools, Paul returned to software architecture in his current role leading the oneAPI Tools Architecture team.
12:05 – 12:30 pm
Lunch
12:30 – 1:00 pm
Using oneAPI Containers and Distributed Computing to Offload Computation to GPUs (Demo)
This demo explains how to use Intel® oneAPI HPC Toolkit Containers to build a distributed computing system that offloads the computation to one or more devices. Message Passing Interface (MPI), which is available in HPC containers, is a standardized message-passing system developed for distributed and parallel computing.
Speakers:
Loc Q Nguyen, Intel
Loc Q Nguyen received an MBA from University of Dallas, a master’s degree in Electrical Engineering from McGill University, and a bachelor's degree in Electrical Engineering from École Polytechnique de Montréal. He is currently a senior software engineer with Intel Corporation's Architecture, Graphics and Software Group. His areas of interest include Machine Learning, computer networking, and parallel computing.
Alberto Villarreal, Intel
Alberto Villarreal is a software engineer working on performance analysis and optimization on current/future Intel parallel architectures. Alberto has worked on software optimization and algorithm development/analysis in the energy industry for 12 years. Alberto joined Intel in 2016. He holds an M.Sc. in Mathematics and Computer Science and an M.Sc. in Geophysics (Colorado School of Mines, USA).
Improve Rotoscoping Efficiency for the Film Industry with AI on Intel AI SW & HW
Rotoscoping is largely a manual process requiring a significant investment of time by artists. This problem was addressed by applying both Intel Xeon hardware and oneAPI software products, in a close collaboration between the Intel Applied ML team and LAIKA’s visual effects team. The collaboration demonstrated a 50% saving in artists’ rotoscoping time.
Speakers:
Louie Tsai, Intel
Louie Tsai is responsible for driving customer engagements with and adoption for Intel® Performance Libraries, leveraging the synergies between Intel Optimization for TensorFlow* and the Intel® oneAPI AI Analytics Toolkit. Louie holds a Master’s degree in Computer Science and Information Engineering from National Chiao Tung University.
Niharika Maheshwari, Intel
Niharika leads the machine learning software engineering horizontal in the Applied ML team. She has designed and developed several applied machine learning applications for internal and external customers in fields including autonomous driving, content creation, and translation. She has contributed to the software stack in the areas of optimization, hardware benchmarking, and exploration. She has a Master’s in Electrical & Computer Engineering from the University of Michigan, Ann Arbor.
James Pina, Laika
James is a digital paint and compositing artist for feature film and television, currently working at Laika as Lead Paint Artist in the Visual Effects department. His film credits include Missing Link, Kubo and the Two Strings, Captain America, and Iron Man 2, among many others.
Real Time XPU Path Tracing
Learn how the Intel® oneAPI Rendering Toolkit can be used to create a path tracer running at real-time rates on client GPUs in conjunction with a Hydra delegate, enabling interactive artist workflows in major digital content creation (DCC) applications.
Speakers:
Johannes Meng, Intel
Johannes Meng is a graphics software engineer at Intel. His current work focuses on high-performance data structures for scalable volumetric rendering. Before Intel, he worked as a rendering researcher at Weta Digital. Johannes holds a PhD in computer science from Karlsruhe Institute of Technology.
Sebastian Herholz, Intel
Sebastian Herholz recently joined Intel as a graphics software engineer. His work focuses on developing tools for advanced light transport simulation algorithms. Before Intel, he worked at the University of Tübingen on research projects on Monte Carlo-based rendering algorithms, such as path guiding. During that time, he completed several internships at Weta Digital and Charles University in Prague.
Sean McDuffee, Intel
Sean McDuffee is a graphics software engineer at Intel. He possesses a holistic understanding of the animated film and VFX industries born from his working and leadership experience in R&D on major international films. He is dedicated to open source technologies for the film industry.
Efficient Sharing of FPGA Resources in oneAPI
It’s common to have multiple FPGA kernels working independently on shared interfaces or attached memories including HBM2. Arbitrating access to these “endpoints” can be complex and unpredictable, and simple interleaving or multiplexing is inadequate. BittWare has created a lightweight configurable crossbar switch IP to alleviate this problem.
The crossbar switch IP is written using Intel’s oneAPI, so it can be adjusted to fit application requirements, for example by modifying the data path widths or the number of ports. These adjustments do not need the extra shim logic or design restrictions that a typical fixed IP implementation imposes.
Speaker:
Richard Chamberlain, BittWare
Richard started his career at MBDA UK before joining Nallatech in 2001. For the last 20 years he has pioneered the use of FPGAs for HPC and is a trusted industry expert in the field of heterogeneous acceleration. Richard currently works as a Principal Systems Engineer in the applications team at BittWare, part of the Molex group.
1:00 – 1:30 pm
Deep Look at Adopting SYCL and DPC++
A deep technical look at what is involved in adopting SYCL with DPC++. This is not an introduction to DPC++/SYCL (read our book!). It is a deep look at the real ins and outs of embracing SYCL for heterogeneous computing, with a specific look at using the LLVM support for SYCL that comes from the DPC++ compiler project. This is a must-not-miss look at DPC++ and SYCL for anyone considering them: technical, yes, but accessible to anyone with a programming background.
Speakers:
James Brodman, Intel
James Brodman works on runtimes and compilers for parallel programming. James is one of the architects of DPC++. James is active in the Khronos SYCL working group.
James Reinders, Intel
James Reinders has more than three decades of experience in parallel computing and has co-authored ten technical books related to parallel programming. James works at Intel teaching and promoting parallel programming in a heterogeneous world.
Ben Ashbaugh, Intel
Ben Ashbaugh has more than two decades of experience with software drivers for Intel graphics products. For the past decade, Ben has focused on parallel programming models for general-purpose computation on graphics processors, including SYCL and DPC++. Ben is active in the Khronos SYCL, OpenCL, and SPIR working groups.
Facebook M2M-100, a state-of-the-art NLP solution running on Intel
Facebook AI introduced M2M-100, the first multilingual machine translation (MMT) model that can translate between any pair of 100 languages without relying on English data. TNG Consulting has adapted the open-source code to run on Intel architecture, powered by the Intel® oneAPI AI Analytics Toolkit. In this talk, we present the translation software we developed around M2M-100 and talk about the challenges we encountered along the way.
Speakers:
Shailen Sobhee, Intel
Shailen Sobhee is an AI Solutions Engineer at Intel. He is the link between the core engineering team and Intel's customers. Shailen assists and trains users of Intel technology on how to use machine learning and deep learning frameworks that capitalize on highly-optimized mathematical libraries for best performance on Intel hardware. He holds a Master's degree in Computational Science and Engineering from the Technical University of Munich.
Jonas Mayer, TNG Consulting
Jonas Mayer works in the Innovation Hacking Team of TNG Technology Consulting, where his main focus lies on creating innovative showcases and prototypes in both software and hardware. Since 2018 he has been working on numerous projects, ranging from real-time deepfakes and mixed reality art experiences all the way to autonomous miniature drones.
Prior to joining TNG, Jonas studied Informatics: Games Engineering at the TU Munich. Apart from the obvious game projects throughout the course of his studies, he focused mainly on artificial intelligence and high performance computing.
Thomas Endres, TNG Consulting
In his role as a Partner for TNG Technology Consulting in Munich, Thomas Endres works as an IT consultant. Besides his normal work for the company and its customers, he creates various prototypes, such as a telepresence robotics system with which you can see reality through the eyes of a robot, or an augmented reality AI that shows the world from the perspective of an artist. He works on various applications in the fields of AR/VR, AI, and gesture control, putting them to use in, for example, autonomous or gesture-controlled drones. He is also involved in other open-source projects written in Java, C#, and all kinds of JavaScript languages.
Workflow and Profiling for Heterogeneous Computing
This session covers accelerating performance through direct programming of GPUs. We will use the Intel oneAPI profiling tools, Intel® VTune™ Profiler and Intel® Advisor, to identify performance bottlenecks and develop strategies to optimize applications at Argonne.
Speakers:
JaeHyuk Kwack, Argonne National Laboratory
JaeHyuk Kwack works in the performance engineering group at the Argonne Leadership Computing Facility. He received his B.S. and M.S. in engineering from Seoul National University, South Korea. He earned his Ph.D. and did his post-doctoral training in Computational Mechanics for Computational Fluid Dynamics (CFD) and Fluid Solid Interaction (FSI) problems from the University of Illinois at Urbana-Champaign, USA. Before joining Argonne, he worked for the Blue Waters supercomputing project at the National Center for Supercomputing Applications. He joined Argonne in 2018. At Argonne he has been focused on the OpenMP offloading model, and associated performance tools and math libraries for the coming US DOE Exa-Scale system, Aurora, at Argonne.
Michael D’Mello, Intel
Michael D’Mello has spent the last 30 years in the computer industry specializing in parallel computing. Past employers include Thinking Machines Corporation, Convex Computer Corporation, and the Hewlett-Packard Company. He has been with Intel since 2003 and is the current manager of the Intel Center of Excellence at the Argonne Leadership Computing Facility. His primary interest is code development and optimization for scientific computing. He holds a Ph.D. in Chemical Physics from the University of Texas, Austin.
Kevin O’Leary, Intel
Kevin O’Leary is a Lead Technical Consulting Engineer whose expertise includes compilers, debuggers, and software performance tools. He has worked at Intel for the last 17 years. He is currently responsible for performance optimization using Intel tools and was one of the original developers of the Intel® Parallel Studio XE development suite. Prior to joining Intel, Kevin spent many years as a debugger engineer for IBM/Rational Software. He holds a Bachelor’s in Computer Science from the University of Massachusetts and a Master’s in Computer Science from Oregon Health and Science University.
Real World Monte Carlo Derivative Valuation with oneAPI and Intel FPGAs
Having already developed a oneAPI replacement for an RTL design and shown significant advantages in terms of development process, CSS demonstrated an incremental evolution to deal with a wider range of cases requiring resources significantly in excess of on-chip memory limits. As a result of the compiler inferring many of the optimizations required, CSS was able to build a highly scalable implementation with comparable performance to RTL and a significantly quicker time to market.
Speakers:
Marcus Defrettes, Creative Solutions Space
Marcus is an early adopter and exponent of agile methods and leads CSS’ Engineering, Research and Development activities.
1:30 – 2:00 pm
Unpacking oneAPI for the IoT Ecosystem
Since their debut, and despite their potential to take advantage of heterogeneous computing and their noble intent of hardware freedom, the oneAPI Toolkits have yet to catch the attention of many developers and solution builders. With that in mind, this session connects use cases to the oneAPI assets that suit the different ecosystem players in IoT, AI, and edge computing.
Speakers:
Kana Murugayah, Intel
Kana has been with the Intel IoT Group for more than a decade and has worked in multiple roles supporting customers in designing innovative IoT solutions. Kana is experienced in Linux, virtualization, and security applications on IA and ARM. He now evangelizes and promotes oneAPI as a must-have set of developer tools.
Noah Clemons, Intel
Noah is a staff technical consulting engineer within the Intel Developer Tools Consulting and Support Division. He has been teaching customers how to write, analyze, and debug code with the latest tools and methods for the last 12 years.
Speedup from Intel® Optimization for TensorFlow* and related analysis with TF Timeline and oneDNN Verbose Log
This session uses Model Zoo for Intel Architecture to showcase the performance benefit of Intel Optimization for TensorFlow* and analyzes the performance improvement with TF Timeline and the oneDNN verbose log.
Speaker:
Louie Tsai, Intel
Louie Tsai is responsible for driving customer engagements with and adoption for Intel® Performance Libraries, leveraging the synergies between Intel Optimization for TensorFlow* and the Intel® oneAPI AI Analytics Toolkit. Louie holds a Master’s degree in Computer Science and Information Engineering from National Chiao Tung University.
StencilStream, a Stencil Simulation Library for FPGAs
In this session we will introduce StencilStream, a Stencil Library for FPGAs, developed with oneAPI and DPC++. Based on the features of DPC++ over pure OpenCL, StencilStream allows for a true separation of concerns between application functionality and performance optimizations. This is demonstrated with three applications and a seamless transition in the backend from row buffers to a tile-based architecture.
Speakers:
Tobias Kenter, Paderborn University, Paderborn Center for Parallel Computing
As an HPC consultant for FPGA acceleration at the Paderborn Center for Parallel Computing, Tobias Kenter works on porting scientific codes to reconfigurable hardware using high-level synthesis tools. Application domains include libraries and simulations with stencils and on unstructured meshes.
Jan-Oliver Opdenhövel, Paderborn University, Paderborn Center for Parallel Computing
As a student assistant at the Paderborn Center for Parallel Computing, Jan-Oliver Opdenhövel evaluates and develops emerging tools and workflows for reconfigurable hardware in an HPC context.
2:00 – 2:15 pm
Break
2:15 – 2:45 pm
3 Lightning Talks
Xjoin: Portable, parallel hash joins across XPU architectures
Modern cloud-native databases run on increasingly heterogeneous hardware with a diverse mix of XPU architectures deployed across CPUs, GPUs, and FPGAs. To date, however, database developers have had to rely on either proprietary, architecture-specific solutions (like CUDA) or low-level, cross-architecture solutions that complicate development (like OpenCL). The lack of portable parallelism caused by the absence of a common high-level programming framework is one of the main reasons preventing wider adoption of XPUs by database systems. In this talk, we will present the first steps towards solving this problem using oneAPI. In particular, we port a recently proposed, highly optimized, GPU-based hash join algorithm from CUDA to Data Parallel C++ (DPC++). We then execute the hash join on multicore CPUs, integrated GPUs (Intel GEN9), and discrete GPUs (Intel DG1 and NVIDIA GeForce) without changing a single line of kernel code to demonstrate that DPC++ enables portable parallelism. We compare the performance of DPC++ kernels with hand-optimized CUDA kernels and model-based theoretical performance bounds to demonstrate the performance–portability trade-off in using DPC++.
Speaker:
Raja Appuswamy, EURECOM
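For readers unfamiliar with the algorithm named in the Xjoin abstract, the following is a minimal single-threaded sketch of a generic build/probe hash join. It is written in Python purely for illustration and is not the speaker's DPC++ or CUDA implementation; the function name `hash_join` and the sample tables are hypothetical.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dicts on the given key columns."""
    # Build phase: index the (usually smaller) relation by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: look up each probe row in the hash table. Each lookup
    # is independent of the others, which is why this phase maps well
    # onto data-parallel hardware.
    joined = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            joined.append({**match, **row})
    return joined


# Hypothetical sample data.
customers = [{"cid": 1, "name": "Ada"}, {"cid": 2, "name": "Lin"}]
orders = [{"oid": 10, "cid": 2}, {"oid": 11, "cid": 1}, {"oid": 12, "cid": 3}]

result = hash_join(customers, orders, "cid", "cid")
for row in result:
    print(row)
```

Order 12 has no matching customer, so it is dropped, as expected for an inner join. A parallel version would typically partition the probe rows across work-items while sharing the hash table built in the first phase.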
Ginkgo - An Open-Source Math Library in the oneAPI Ecosystem
Ginkgo is an open-source math library designed for GPU-accelerated supercomputers. In this talk, we will present the path we took to prepare Ginkgo for Intel GPUs. We will report our experiences in porting the NVIDIA-focused software stack to Intel’s DPC++ environment, elaborate on how to enable performance portability across a diverse hardware portfolio, and demonstrate how Ginkgo’s DPC++ backend can be used to prepare scientific applications for the oneAPI ecosystem.
Speaker:
Hartwig Anzt, Karlsruhe Institute of Technology and University of Tennessee
Hartwig Anzt focuses on the development of numerical methods for modern hardware architectures. As the author of the MAGMA-sparse software package and the managing lead of the Ginkgo math library, he has substantial experience in the development of sustainable and production-ready HPC software.
Precise and Fast Gene Sequences Clustering with GPU
Speaker:
Zhen Ju, University of the Chinese Academy of Sciences
Zhen Ju received his master’s degree from the University of the Chinese Academy of Sciences (UCAS) in 2016 and is now a Ph.D. candidate in computer science at UCAS. His research is in the fields of high-performance computing and heterogeneous acceleration, and he has experience accelerating codes on heterogeneous devices. He developed a CUDA application that removes redundant sequences from biological sequence data and migrated it to oneAPI.
Empowering AI on Red Hat OpenShift Data Science with Intel® oneAPI AI Analytics Toolkit
Learn how Intel and Red Hat have partnered to enable Intel’s most popular AI developer optimizations on Red Hat OpenShift Data Science. Experience the power of drop-in acceleration for popular frameworks such as TensorFlow and Pandas while eliminating complex Kubernetes cloud and infrastructure setup by using the Intel® oneAPI AI Analytics Toolkit on Red Hat OpenShift Data Science for an easy developer experience. Watch the power and ease of use of the AI Kit on Red Hat OpenShift Data Science in real time with a live demonstration.
Speakers:
Rachel Oberman, Intel
Rachel is an AI Technical Consulting Engineer who helps customers optimize their workflows with data analytics and machine learning algorithms from Intel. She holds a bachelor’s degree in Computer Science and Data Science from the College of William & Mary with a background in geospatial analysis.
Karl Eklund, Red Hat
Karl Eklund is a Principal Architect aligning customer goals to solutions provided by the open source community and commercial vendors within the Red Hat OpenShift Data Science platform.
Using gdb-oneapi to Debug SYCL ZFP Compression Library
It is often challenging to find bugs in complex libraries. To ease this challenge, debuggers like gdb-oneapi can be used on both CPUs and GPUs. In this session, we use the gdb-oneapi debugger to show how we pinpointed obscure bugs in the ZFP library, which we recently migrated to SYCL. ZFP is an open-source library for compressed floating-point arrays that supports high-throughput read and write random access and is often used in HPC oil and gas workloads. We will demonstrate how we used gdb-oneapi to step through the ZFP library to identify and resolve these bugs, and share best known methods for characterizing and resolving runtime issues.
Speakers:
Sunny Gogar, Intel
Sunny Gogar is a senior applications engineer on the developer ecosystem engineering team at Intel. He has expertise in developing and optimizing HPC and AI applications for CPUs and GPUs. He holds a Bachelor of Engineering in Electronics & Telecommunications from the University of Mumbai and a Master’s in High-Performance Computing from the University of Florida.
Aravind Neelakantan, Intel
Aravind Neelakantan is a Software Engineering Intern who is pursuing his PhD in High Performance Heterogeneous Computing at the University of Florida. He has expertise in performance modeling and optimization of HPC workloads on CPUs and GPUs.
Building Domain Specific FPGA Platforms using Intel OFS for oneAPI Applications
Intel OFS and oneAPI provide a fast and efficient development flow that shortens time to market and allows reuse across multiple hardware platforms. The presentation will highlight the developer experience of using oneAPI to build FPGA offload applications. It will explain the value of FPGA + Platforms + custom Board Support Packages (BSP) for offload and inline hardware acceleration. The presentation will introduce the Intel OFS architecture and highlight its value proposition as upstreamed, scalable, and source-accessible software for targeting custom hardware platforms. It will also showcase the domain-specific FPGA accelerator platform solutions of a partner, Hitek Systems, and its developer journey using Intel’s latest FPGA technology.
Speakers:
Haris Tauqeer, Hitek Systems LLC.
Haris Tauqeer is a Senior Principal Engineer at Hitek Systems. He obtained a Master’s degree in Electrical Engineering from Johns Hopkins University and has over 15 years of experience working with complex designs and cutting-edge technologies.
His key areas of expertise include FPGA design, development, and verification for various wired and wireless communication technologies, including Ethernet, PCIe, and satellite communications. He has successfully taped out multiple generations of satellite gateway designs and Ethernet IP cores at rates up to 800 Gbps.
Haris is responsible for supporting an array of activities across the company, including system integration, hardware bring-up, and managing the FPGA and embedded firmware development teams.
He led the effort of porting Intel OFS and oneAPI to Hitek Systems’ Low Profile Agilex PCIe accelerator card.
Tamara Lin, Intel
Tamara is a product marketing specialist in PSG focused on the development and go-to-market strategy for Intel Open FPGA Stack. She has technical marketing experience working across Intel’s power, intellectual property, and business unit segments.
Adonay Berhe, Intel
Adonay is the product marketing manager for oneAPI with FPGAs in Intel, PSG. He has experience working on software development teams using languages such as Python, C, and C++.
2:45 – 3:15 pm
oneAPI Effective Performance Portability
Speakers:
Jason Sewall, Intel
Jason is a Senior HPC Application Engineer at Intel Corporation. His research interests include parallel algorithms, numerical analysis, and practices for improving performance, portability, and productivity in software. He received the Ph.D. degree in computer science from the University of North Carolina at Chapel Hill in 2010.
John Pennycook, Intel
John is an HPC application engineer at Intel Corporation. His research is focused on improving application performance portability and programmer productivity. He received a Ph.D. in computer science from the University of Warwick in 2013.
Deep Learning Performance Profiling Tool for Intel XPUs
A new tool integrates Intel VTune and TensorBoard to bring detailed Intel XPU profiling information to deep learning workloads. It presents microarchitecture (uarch) details at the DL op level in a popular DL profiling tool, TensorBoard.
Speaker:
Louie Tsai, Intel
Louie Tsai is responsible for driving customer engagements with and adoption for Intel® Performance Libraries, leveraging the synergies between Intel Optimization for TensorFlow* and the Intel® oneAPI AI Analytics Toolkit. Louie holds a Master’s degree in Computer Science and Information Engineering from National Chiao Tung University.
Accelerating 4G, 5G and beyond with Intel oneAPI Integrated Performance Primitives (IPP)
In a cellular mobile communication system, the base station is the bridge between the mobile station and the mobile center, so its position is extremely important. Intel oneAPI IPP signal processing accelerates base stations on Intel processors with multiple APIs (FFT/DFT, CRCs, etc.).
Speakers:
Ruqiu Cao, Intel
Ruqiu Cao is a Software Technical Consulting Engineer at Intel, where he enables products for customers and software developers through technical support, training, and hands-on assistance in the areas of code development, performance tuning, and scaling of software applications. He has more than 10 years of software development experience in the wireless domain. In the past he worked on 3G, 4G, and 5G base station implementation and optimization.
Abhinav Singh, Intel
Abhinav Singh is a Software Technical Consulting Engineer at Intel, where he enables products for customers and software developers through technical support, training, and hands-on assistance in the areas of code development, debugging, tuning, and scaling of software applications. He has a background in software engineering, with a Master’s degree in computer science from University College Dublin, Ireland. In the past he worked on enabling Intel compression and crypto accelerators on server platforms.
Narasimha Rao, Mavenir
Narasimha Rao is a Senior Architect in wireless L1/PHY and multimedia DSP software development. He has over two decades of experience leading DSP software development, sustenance activities, and customer interaction across wireless communication infrastructure and telecom media gateway products. He brings rich expertise in technologies such as 2G, 3G, WiMAX, and 4G LTE, as well as multimedia speech/audio systems. He has worked extensively on architecting multiple physical layer designs and optimizing DSP software on state-of-the-art multicore SoCs and GPPs. He has a strong background in digital communication theory and signal processing theory, along with hands-on experience in system debugging. He has proven experience in leading, mentoring, and building teams. He has a master’s degree in Electronics and Communications from Mysore University, India.
Signal Processing via MVDR Beamforming Design on Intel FPGAs
This session will showcase a Minimum Variance Distortionless Response (MVDR) beamformer design implemented using Intel FPGAs and Intel oneAPI Toolkits. This smart antenna application demonstrates an adaptive beamforming technique that cancels interfering signals (by placing nulls) and steers a strong beam toward the target signal according to the calculated weight vectors. Developers will gain insight into the design and architectural choices that allow packet processing and beamforming within the FPGA device kernels, as well as the Board Support Package (BSP) and platform-specific changes required to enable IO Channels (streaming data directly from Ethernet). The session will also feature a recorded demo that uses MATLAB to generate input signals that are routed to the FPGA to demonstrate the beamforming capabilities of the Intel Agilex FPGA.
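For readers new to MVDR, the weight calculation mentioned above can be illustrated with a small sketch. This is not the session's FPGA design. It is a minimal Python illustration under stated assumptions: a 2-element uniform linear array with half-wavelength spacing, a single interferer, and a known covariance matrix. All function names are hypothetical. It uses the textbook MVDR solution w = R^-1 a / (a^H R^-1 a), which keeps unit gain toward the target while suppressing the interferer.

```python
import cmath
import math


def steering_vector(theta_deg, n_elements=2, spacing_wavelengths=0.5):
    """Steering vector of a uniform linear array for a plane wave from theta_deg."""
    phase = 2 * math.pi * spacing_wavelengths * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * phase * k) for k in range(n_elements)]


def covariance(interferer, interferer_power, noise_power):
    """Assumed model: R = p_i * a_i a_i^H + noise_power * I."""
    n = len(interferer)
    return [[interferer_power * interferer[r] * interferer[c].conjugate()
             + (noise_power if r == c else 0.0)
             for c in range(n)] for r in range(n)]


def solve_2x2(m, b):
    """Solve m x = b for a 2x2 complex matrix using Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(b[0] * m[1][1] - m[0][1] * b[1]) / det,
            (m[0][0] * b[1] - m[1][0] * b[0]) / det]


def mvdr_weights(r, a):
    """Textbook MVDR weights: w = R^-1 a / (a^H R^-1 a)."""
    r_inv_a = solve_2x2(r, a)
    denom = sum(x.conjugate() * y for x, y in zip(a, r_inv_a))
    return [y / denom for y in r_inv_a]


def response(w, a):
    """Array response w^H a toward the direction described by a."""
    return sum(x.conjugate() * y for x, y in zip(w, a))


# Illustrative values only: target at 0 degrees, interferer at 40 degrees.
target = steering_vector(0.0)
interferer = steering_vector(40.0)
weights = mvdr_weights(covariance(interferer, 100.0, 1.0), target)
print("gain toward target:    ", abs(response(weights, target)))
print("gain toward interferer:", abs(response(weights, interferer)))
```

The gain toward the target stays at one (the distortionless constraint), while the gain toward the interferer shrinks as the interferer power grows relative to the noise. A real design estimates the covariance from received samples and uses more array elements.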
Speakers:
Adonay Berhe, Intel
Adonay is the product marketing manager for oneAPI with FPGAs at Intel PSG. He has experience working on software development teams using languages such as Python, C, and C++.
Greg Nash, Intel
Greg is a System Architect for HPC, AI, and Government Analytics in the Military, Aerospace, and Government business group at Intel Programmable Solutions Group. He is responsible for putting together solutions in these areas across IP, tools, and devices. Prior to this role, he was a tools specialist for signal processing applications, developing PoCs and writing papers on FFTs and radar processing, and he designed radio heads at Motorola. He holds an MSEE from UM Ann Arbor.
Mike Tucker, Intel
Mike spent much of his career as an Altera customer designing video processing equipment for the broadcast, streaming, and video production industries. He has been in the High Level Design group in PSG for 5 years, and is currently focused on writing example designs in oneAPI and helping to improve the performance of the DPC++ compiler.
3:15 – 3:45 pm
Write your first SYCL program, in under 30 minutes, using DPC++ on DevCloud
Meet some book authors and experts on SYCL/DPC++! Using the Intel DevCloud (get an account at https://tinyurl.com/getdevcloud, ideally in advance), attendees will write and run their first SYCL program. We will talk about how to navigate key features of DevCloud and start exploring the catalogue of hands-on labs, which can be completed in a self-paced format. Intel experts will speak briefly, lead an exercise to get a first program written, and be on hand to answer questions. Key resources that will be discussed include DevCloud (https://tinyurl.com/getdevcloud), online tutorials (https://tinyurl.com/learnsycl), and our DPC++ book (https://tinyurl.com/booksycl). Recommended prework: have a DevCloud account (free - https://tinyurl.com/getdevcloud), download our DPC++ book (free - https://tinyurl.com/booksycl), and visit https://tinyurl.com/essentialDPCpp, where you can try out Module 0. If you do Module 1 too, you can skip the class unless you have questions!
Speaker:
James Reinders, Intel
James Reinders has more than three decades of experience in parallel computing and has co-authored ten technical books related to parallel programming. James works at Intel teaching and promoting parallel programming in a heterogeneous world.
Migrating AutoDock-GPU, a Drug Screening Code to Data Parallel C++
This talk describes key learnings and experiences from successfully migrating AutoDock-GPU to DPC++. AutoDock-GPU is an application that simulates the docking (or binding) of molecules and is used in drug discovery for health care. oneAPI provides a CUDA migration tool called the Intel® DPC++ Compatibility Tool (DPCT). This talk will describe issues in the migration of complex macros and uncommon constructs such as subgroups, shuffles, reductions, pointer arithmetic, and atomics in local memory.
Speaker:
Edward Mascarenhas, Intel
Edward Mascarenhas, Senior Strategic Planner at Intel Corporation, has a background in engineering management, systems software development, and high-performance network technologies. He currently does strategic planning for HPC ecosystem enabling for discrete GPUs at Intel, along with hands-on migration, porting, and optimization of HPC applications on discrete GPUs. He holds a PhD in Computer Science from Purdue University.
3:45 – 4:15 pm
Open DevCloud Lab Time (optional)
This is optional extra time for attendees of the prior session to stay and try more programming on DevCloud. Intel experts will be on hand to answer questions. Key resources include DevCloud (https://tinyurl.com/getdevcloud), online tutorials (https://tinyurl.com/learnsycl), and our DPC++ book (https://tinyurl.com/booksycl). Recommended prework: attend the prior half hour, or have a DevCloud account (free - https://tinyurl.com/getdevcloud), download our DPC++ book (free - https://tinyurl.com/booksycl), and visit https://tinyurl.com/essentialDPCpp, where you should finish at least Modules 0 and 1. We’ll be here to answer any questions!
Speaker:
James Reinders, Intel
James Reinders has more than three decades of experience in parallel computing and has co-authored ten technical books related to parallel programming. James works at Intel teaching and promoting parallel programming in a heterogeneous world.
Efficacy of Intel Developer Tools in detecting performance blockers for AI workload
With no code changes or recompilation, we will demonstrate how the Intel oneAPI Toolkits can accurately depict performance blockers, along with the Intel-optimized library assets that can be used to help optimize AI performance at the edge.
Speakers:
Joel Lin, Intel
Joel is a Technical Consulting Engineer specializing in power and performance analysis tools for the Embedded/IoT segment. His 10+ years of software development experience spans drivers, media codecs, and performance optimizations on Windows* and Linux* operating systems.
Kana Murugayah, Intel
Kana has been with the Intel IoT Group for more than a decade and has worked in multiple roles supporting customers in designing innovative IoT solutions. Kana is experienced in Linux, virtualization, and security applications on IA and ARM. He now evangelizes and promotes oneAPI as a must-have set of developer tools.
oneAPI Unified Cross-Architecture Programming in Healthcare
oneAPI is a single, unified programming model that supports work on heterogeneous systems by eliminating the need for separate code bases and multiple programming languages. oneAPI’s complete set of cross-architecture libraries allows users to focus on applying their skills to new innovations rather than rewriting existing code for different hardware platforms. In this session, the speakers will discuss how the healthcare space will benefit from the use of oneAPI, with an example from the medical imaging domain. The session will include customer testimonials on applying oneAPI to non-AI and AI use cases. The audience will learn how Intel, in collaboration with healthcare partners, is leading unified cross-architecture programming with comparable performance in the healthcare domain.
Speakers:
Beenish Zia, Intel
Beenish Zia is a Platform Architect in the Health and Life Sciences business at Intel Corporation. Beenish is an electrical engineer by training and has a deep technical background in digital circuit design, CAD, hardware prototyping, and system integration, especially for healthcare, HPC, and AI. Since starting at Intel, Beenish has worked on various engineering projects including technology enablement for Intel® Xeon® Scalable processors, Artificial Intelligence (AI), and High-Performance Computing (HPC). Beenish’s current role spans engineering work from advancing medical imaging devices to genomics sequencers to telehealth appliances. She is at her best when she uses her ingenuity to push the boundaries of technology and works with her customers to solve technically challenging problems that will positively impact all living beings. Beenish is seen as a tenacious, emerging leader who cultivates growth in the people around her and radiates positivity. She has been recognized by her workplace several times for various achievements in her area of expertise, and for her customer obsession, her fearlessness, her enthusiasm, her growth mindset, and her inclusive nature.
Evgeny Drapkin, GE Healthcare
Evgeny is a well-recognized industry leader with over 25 years of experience in adopting new technologies in medical imaging. For the last 20 years Evgeny has worked at GE Healthcare, where he currently serves as Chief Engineer of GE Healthcare Digital Platforms. Evgeny's contributions to the field define modern medical imaging: he designed the first PC-based ultrasound, led 3D ultrasound projects, and revolutionized CT and PET image reconstruction by introducing and leading the adoption of state-of-the-art technologies.
Evgeny has held a variety of leadership positions in the medical equipment industry: Project Manager with Diasonics Ultrasound, SW Manager and Site Support Manager with Insightec, and Functional Manager, Principal Engineer, and Chief Engineer with GE Healthcare.
In his current role as Chief Engineer with GE Healthcare Digital Platforms, Evgeny owns the compute platform and technology roadmap for GE Healthcare's latest innovation, Edison HealthLink. He manages technical relations with GE Healthcare's strategic partners in the computing industry and leads the evaluation and adoption of cutting-edge computing technologies.
Evgeny has 9 US patents and is a recipient of the prestigious GE R&D Dushman Award.
Evgeny holds a Master's degree in Computer Science from the Hebrew University in Jerusalem.
4:15 – 4:30 pm
Break
4:30 – 5:30 pm
What does oneAPI need to do in the future?
Come interact with a panel of experts on the future of heterogeneous computing, including a discussion of how oneAPI should continue growing to meet the evolving needs of heterogeneous computing. Join Discord to discuss this topic and ask the panel your questions live!
Speakers:
Moderator: James Reinders, Intel
James Reinders has more than three decades of experience in parallel computing and has co-authored ten technical books related to parallel programming. James works at Intel teaching and promoting parallel programming in a heterogeneous world.
Timothy Mattson, Intel
Tim Mattson is a parallel programmer obsessed with every variety of science (Ph.D. Chemistry, UCSC, 1985). He is a senior principal engineer in Intel’s parallel computing lab. Tim has been with Intel since 1993 and has worked with brilliant people on great projects including: (1) the first TFLOP computer (ASCI Red), (2) MPI, OpenMP, and OpenCL, (3) two different research processors (Intel's TFLOP chip and the 48-core SCC), (4) data management systems (Polystore systems and array-based storage engines), and (5) the GraphBLAS API for expressing graph algorithms as sparse linear algebra. Tim has over 150 publications including five books on different aspects of parallel computing, the latest (published November 2019) titled “The OpenMP Common Core: making OpenMP Simple Again”.
Paul Petersen, Intel
Paul Petersen is a Sr. Principal Engineer in SATG (Software and Advanced Technology Group), and oneAPI Tools Architect. He received a Ph.D. in Computer Science from the University of Illinois in 1993.
Starting at Kuck and Associates, Inc. (KAI), his responsibilities included enhancing the auto-parallelizing compiler (KAP) and the early definition and implementations of OpenMP. While at KAI, he developed the Assure line of parallelization/correctness products for Fortran, C++, and Java. In 2000, Intel Corporation acquired KAI, and he joined the software tools group creating the Thread Checker products, which evolved into the Inspector and Advisor components of the Intel® Parallel Studio. Inspector uses dynamic binary instrumentation to detect memory and concurrency bugs, and Advisor uses similar techniques along with performance measurement and modeling to assist developers in transforming existing serial applications to be ready for parallel execution. His passion for software architecture grew to cover the architecture of all of Parallel Studio XE and its components. After a few years leading the software tools pathfinding with a focus on defining next-generation features for parallel runtimes and software analysis tools, Paul returned to software architecture in his current role leading the oneAPI Tools Architecture team.
Bhavesh Patel, Dell
Bhavesh is creating AI-based strategies for Dell ISG products as part of the ISG CTIO “Integrated & Solutions Ecosystems” group. Activities include meeting regularly with customers and product teams to formulate winning Dell ISG strategies in areas of Accelerators & AI/ML. Bhavesh is also responsible for defining platform architectures which are more focused in areas of High performance computing, Deep learning, database acceleration and working on technologies that can improve these architectures.
Bhavesh Patel has over 25 years of industry experience in hardware, signal integrity and system design at various companies in the field of modems, telecom and servers.
Bhavesh received his BS in Electrical Engineering from DSCE, Bangalore. He has patents in the field of silicon, PCB and system design.
Sheng Zha, Apache MXNet PPMC
Sheng Zha is a committer and PPMC member of Apache MXNet (Incubating), a former steering committee member of Linux AI Foundation ONNX, a member of the Consortium for Python Data API Standard, and a senior applied scientist at Amazon. In his research, Sheng focuses on the intersection between deep learning based natural language processing and computing systems, with the aim of enabling large-scale and accessible representation learning.
Kevin Harms, Argonne Leadership Computing Facility
Kevin Harms is a team lead of the Performance Engineering group within the Argonne Leadership Computing Facility (ALCF). His interests include working on HPC programming models and high performance storage.