Student: Jash Vora
Organization: INCF - GeNN Development Team
Mentors: Jamie Knight, Thomas Nowotny
Project Duration: May 2025 - August 2025
This report presents the development of an Intel SPMD Program Compiler (ISPC) backend for GeNN (GPU-enhanced Neuronal Networks), a code generation framework for simulating spiking neural networks. The primary goal of this project was to narrow the performance gap between single-threaded CPU implementations and GPU-accelerated simulations by exploiting the SIMD (Single Instruction, Multiple Data) parallelism available in modern processors.
The project involved the development of an ISPC-based code generation backend within GeNN. This included kernel generation for neuron updates, synaptic processing, and custom model operations. Benchmarking and performance evaluation demonstrate that the ISPC backend achieves considerable speedups over the single-threaded CPU implementation while retaining full compatibility with existing GeNN models. At the same time, it is easier to use and more broadly accessible than GPU solutions.
Traditional artificial neural networks (ANNs), as shown in panel (a), process real-valued inputs and outputs in a series of layers. Each neuron produces a continuous activation value that is passed forward through weighted connections.
Panel (b) illustrates how these activations are typically represented as real numbers, such as 0.3 or 0.8, which are updated every time step during training or inference.
Spiking neural networks (SNNs), shown in panel (c), work differently. Instead of passing continuous values, neurons communicate through discrete spikes that occur at particular points in time. Information is encoded in the timing and frequency of these spikes, making SNNs closer to how biological neurons operate. This event-driven style of computation can be much more energy efficient, since neurons are mostly idle and only update when spikes occur.
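As a concrete example, a standard spiking neuron model used in such simulations is the leaky integrate-and-fire (LIF) neuron, whose membrane potential $V$ evolves as

$$\tau_m \frac{dV}{dt} = -(V - V_{\text{rest}}) + R_m I(t),$$

and which emits a spike and resets to $V_{\text{reset}}$ whenever $V$ crosses a threshold $V_{\text{th}}$. Between spikes a neuron's state evolves locally and it sends nothing downstream, which is where the event-driven efficiency comes from.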
GeNN (GPU-enhanced Neuronal Networks) is a framework designed to accelerate simulations of spiking neural networks. It uses code generation to produce optimized kernels for different backends, such as GPUs and CPUs. This makes it possible for researchers to test large-scale SNN models efficiently, without having to write low-level code themselves.
The need for an ISPC backend arises from several limitations in the existing GeNN ecosystem:
Hardware Accessibility: Not all users have access to high-end GPUs, limiting the adoption of GeNN’s GPU-accelerated features. The ISPC compiler is also easier to set up than CUDA.
Performance Gap: The single-threaded CPU implementation often performs poorly compared to the GPU versions, leaving users without GPU access with a substantial performance penalty.
SIMD Underutilization: Modern CPUs feature powerful SIMD instruction sets (SSE, AVX, AVX-512) that remain largely untapped in traditional scalar CPU implementations. Explicit vectorization constructs such as ISPC’s foreach can unlock major performance gains in these computations (see the sketch after this list).
Cross-Platform Portability: ISPC provides a unified programming model that can target multiple architectures (x86, ARM) and instruction sets, offering better portability than CUDA.
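To make the SIMD point concrete, here is a minimal, hypothetical ISPC fragment (illustrative only, not GeNN code). A scalar C++ loop processes one element per iteration; the foreach version below is compiled down to SSE/AVX/AVX-512 instructions that process a whole gang of elements per iteration.

```ispc
// Minimal illustration with assumed names. The scalar C++ equivalent,
//   for (int i = 0; i < n; i++) { out[i] = a[i] * b[i] + c[i]; }
// touches one element at a time; foreach vectorizes it automatically.
export void multiplyAdd(uniform float out[], uniform const float a[],
                        uniform const float b[], uniform const float c[],
                        uniform int n)
{
    foreach (i = 0 ... n) {
        out[i] = a[i] * b[i] + c[i];
    }
}
```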
The primary goal of the project was to develop a fully functional ISPC backend for GeNN that enables SIMD-accelerated neural network simulations on CPU hardware.
Development Environment:
Programming Languages:
Testing and Benchmarking:
1. Code Generation Pipeline: The ISPC backend follows GeNN’s established code generation pattern:
2. Kernel Development Strategy:
3. Memory Layout Optimization:
4. Testing Methodology:
Backend Infrastructure Development: The initial phase focused on establishing the foundational architecture for the ISPC backend within GeNN’s existing framework. This involved creating the key backend source files as well as the Array and Preferences classes.
A Backend class in the ISPC namespace inherits from BackendBase, implementing the essential virtual function signatures required by GeNN’s code generation pipeline.
Key Technical Contributions:
Neuron and Synapse Update Kernels: This phase involved the systematic adaptation of the existing single-threaded CPU kernels to leverage ISPC’s SIMD capabilities. The approach focused on identifying parallelizable operations and implementing atomic operations for thread safety.
foreach constructs to process multiple neurons simultaneously
Technical Implementation Strategy:
foreach parallelization constructs
PyGeNN Integration and System Configuration: The integration phase focused on making the ISPC backend accessible through GeNN’s Python interface and ensuring usability on different platforms.
System Integration Achievements:
Custom Update Operations and Benchmarking: The final phase focused on extending functionality to support user-defined operations and conducting comprehensive performance evaluation across multiple scenarios.
foreach parallelization to user-defined mathematical operations and reduction algorithms
Benchmarking Methodology:
SIMD Kernel Adaptation Strategy:
The core technical achievement involved the systematic refactoring of existing single-threaded algorithms into SIMD-optimized implementations. This was accomplished through strategic application of ISPC’s foreach construct, which enabled automatic vectorization while preserving functional correctness.
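A minimal sketch of what such a refactored kernel can look like follows; the function name, constants, and simplified integrate-and-fire update are illustrative assumptions, not GeNN’s actual generated code.

```ispc
// Hypothetical SIMD neuron-update kernel (names and model simplified).
export void updateNeurons(uniform float V[], uniform const float Isyn[],
                          uniform int spiked[], uniform int numNeurons,
                          uniform float dt)
{
    // foreach distributes iterations across the SIMD gang: each program
    // instance updates one neuron per vector iteration.
    foreach (i = 0 ... numNeurons) {
        // Forward-Euler step of a leaky integrate-and-fire membrane
        float v = V[i] + (dt / 20.0f) * (-(V[i] - (-65.0f)) + Isyn[i]);

        // Threshold crossing: record the spike and reset the membrane
        if (v >= -50.0f) {
            spiked[i] = 1;
            v = -65.0f;
        }
        else {
            spiked[i] = 0;
        }
        V[i] = v;
    }
}
```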
Backend Architecture Implementation:
```cpp
// Core backend methods implemented (simplified signatures)

// Generates the per-timestep neuron state update kernel
void genNeuronUpdate(CodeStream &os, const ModelSpec &model,
                     const NeuronGroupInternal &ng,
                     const Substitutions &popSubs) const override;

// Generates the synaptic propagation and weight update kernel
void genSynapseUpdate(CodeStream &os, const ModelSpec &model,
                      const SynapseGroupInternal &sg,
                      const Substitutions &popSubs) const override;

// Generates user-defined custom update kernels
void genCustomUpdate(CodeStream &os, const ModelSpec &model,
                     const CustomUpdateInternal &cu,
                     const Substitutions &popSubs) const override;
```
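For custom updates that perform reductions, ISPC’s cross-lane primitives map naturally onto the problem. The fragment below is a hypothetical sketch (the function name and use case are assumptions, not GeNN’s generated code) of how a user-defined reduction can be expressed with foreach plus reduce_add:

```ispc
// Hypothetical custom-update reduction: each program instance accumulates
// a private partial sum, and reduce_add() combines the lanes into a
// single uniform result at the end.
export uniform float sumSquaredError(uniform const float err[], uniform int n)
{
    float partial = 0.0f;              // varying: one accumulator per lane
    foreach (i = 0 ... n) {
        partial += err[i] * err[i];    // masked correctly on the ragged tail
    }
    return reduce_add(partial);        // cross-lane horizontal sum
}
```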
Vectorization Methodology:
foreach constructs to enable SIMD processing
Integration Achievements:
Test Configuration:
Benchmark Models: Vogels-Abbott Network, a sparsely connected balanced network of excitatory and inhibitory integrate-and-fire neurons with conductance-based synapses (Vogels & Abbott, 2005).
Detailed Performance Data: Complete benchmarking results, including raw timing data, memory usage statistics, and cross-platform comparisons are available in the Performance Analysis Spreadsheet.
Comprehensive Benchmarking Across Multiple Scales:
Sparse Networks:
Dense Networks:
Cross-Platform Performance Comparison:
1. Understanding GeNN’s Complex Architecture: GeNN is a well-structured library with intricate code generation pipelines and backend methods. Before any implementation could begin, I invested time in studying the existing backends and their design patterns. With guidance from my mentor, I developed a clear understanding of how different components interact, which formed the foundation for all subsequent development work.
2. Build System Integration: Integrating the ISPC compiler and its build dependencies into GeNN’s existing CMake-based build system was tricky. My mentor’s assistance in configuring library linking and cross-platform compilation was particularly helpful in getting the ISPC backend to build.
3. Dual Code Generation Strategy: ISPC keywords are not recognised in a standard C++ file, so the backend has to manage two separate code streams: C++ host code (in .cc files) and ISPC kernel code (in .ispc files), each with its own dependencies. Initialization is handled in the C++ files, while the parallel computations live in the ISPC ones. This split keeps the code organization clean while preserving performance.
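As an illustration of the kernel side of this split (file and symbol names here are assumptions): any function in a generated .ispc file that the C++ host code must call is marked export, and the ISPC compiler’s -h option emits a C++-compatible header through which the .cc file invokes it.

```ispc
// neuronUpdate.ispc -- hypothetical kernel-side file of the dual stream.
// 'export' makes stepNeurons callable from C++; compiling with
//   ispc neuronUpdate.ispc -h neuronUpdate_ispc.h -o neuronUpdate_ispc.o
// emits a header declaring it inside 'namespace ispc' for the host .cc file.
export void stepNeurons(uniform float V[], uniform const float Isyn[],
                        uniform int numNeurons, uniform float dt)
{
    foreach (i = 0 ... numNeurons) {
        V[i] += dt * Isyn[i];   // placeholder update; real bodies are generated
    }
}
```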
1. Batch Size Optimization:
2. Automatic Instruction Set Detection:
3. Native ISPC Implementation of Core Functions:
The development of an ISPC backend for GeNN successfully addresses the performance gap between single-threaded CPU and GPU implementations. The project achieved its primary objectives by delivering a fully functional backend that provides significant performance improvements while maintaining compatibility with existing GeNN models.
The ISPC backend significantly lowers the barrier to entry for high-performance neural network simulations. Researchers without access to specialized GPU hardware can now achieve considerable speedups for medium-scale simulations. This democratization of computational neuroscience tools aligns with GeNN’s mission to make neural network simulation accessible to a broader research community.
The successful completion of this project establishes a foundation for future developments in CPU-based neural network acceleration and demonstrates the viability of SIMD programming for computational neuroscience applications.
I would like to express my sincere gratitude to my mentors, Dr. Jamie Knight and Dr. Thomas Nowotny, whose invaluable guidance, expertise, and continuous support made this project possible. Their deep knowledge of GeNN’s architecture and SIMD programming principles was instrumental in navigating the complexities of backend development and achieving the project’s objectives.
Special thanks to Dr. Knight for his assistance with the build system integration and initialization architecture. His mentorship not only helped me complete this project successfully but also significantly aided my understanding of high-performance computing and computational neuroscience.
I am also grateful to the INCF organization and the GeNN development team for providing this opportunity through Google Summer of Code 2025, and for their commitment to advancing open-source tools in computational neuroscience.
Intel Corporation. (2023). Intel SPMD Program Compiler User’s Guide.
Intel Corporation. (2013). SIMD Made Easy with Intel ISPC.
Pharr, M., & Mark, W. R. (2012). ispc: A SPMD compiler for high-performance CPU programming. Proceedings of Innovative Parallel Computing (InPar).
Yavuz, E., Turner, J., & Nowotny, T. (2016). GeNN: a code generation framework for accelerated brain simulations. Scientific Reports, 6, 18854.
Knight, J. C., Komissarov, A., & Nowotny, T. (2021). PyGeNN: A Python Library for GPU-Enhanced Neural Networks. Frontiers in Neuroinformatics, 15, 659005.
Vogels, T. P., & Abbott, L. F. (2005). Signal propagation and logic gating in networks of integrate-and-fire neurons. Journal of Neuroscience, 25(46), 10786-10795.
Hennessy, J. L., & Patterson, D. A. (2019). Computer Architecture: A Quantitative Approach. Morgan Kaufmann.
This report was prepared as part of the Google Summer of Code 2025 program.