MSc Projects

Below is a selection of the currently available MSc projects. More information about other MSc projects can be found in the presentation.

A LaTeX template for an MSc thesis report can be downloaded here.

Current MSc Projects

[CE-PRJ-2017-02] Application-Specific Processor Architecture

Multicore systems are facing increasing limitations in finding sources of thread-level parallelism in applications to effectively utilize their resources. At the same time, the increasing power consumption of multicore systems is forcing a structural reduction of their maximum achievable performance to prevent exceeding their thermal design power budget. One of the main limitations of using multicore systems is the mismatch between the resources available in the architecture (number of cores, memory size, etc.) and the requirements of the applications. This project aims to investigate the opportunities of creating reconfigurable processor architectures on FPGAs that can match the resources needed by the application, thereby ensuring efficient utilization of the available hardware. We will first identify a set of applications with various processing, memory and interconnect requirements and evaluate their architectural needs. Then we will define a number of optimal architectural templates that can process these applications efficiently. Finally, we will merge these templates into a unified reconfigurable architecture and evaluate its performance for the selected applications. Such an architecture promises to strike an optimal trade-off between performance, power efficiency and programmability of processing systems. Interested? Contact Zaid Al-Ars
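The template-matching idea at the core of this project can be illustrated with a small sketch: given an application's resource requirements, pick the architectural template that satisfies them with the least wasted hardware. All template names and resource numbers below are hypothetical.

```python
# Hypothetical architectural templates and their resources (illustrative only).
TEMPLATES = {
    "many-core":  {"cores": 16, "mem_kb": 64,  "bw_gbps": 4},
    "big-memory": {"cores": 4,  "mem_kb": 512, "bw_gbps": 2},
    "streaming":  {"cores": 8,  "mem_kb": 128, "bw_gbps": 16},
}

def fits(template, req):
    """A template fits if it meets every resource requirement."""
    return all(template[k] >= v for k, v in req.items())

def best_template(req):
    """Pick the fitting template that leaves the least unused slack."""
    candidates = [(sum(t[k] - req[k] for k in req), name)
                  for name, t in TEMPLATES.items() if fits(t, req)]
    return min(candidates)[1] if candidates else None

app = {"cores": 6, "mem_kb": 100, "bw_gbps": 8}
print(best_template(app))  # -> streaming (only template meeting all needs)
```

A real version of this matching step would of course work on profiled application characteristics rather than hand-written requirement dictionaries.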

[CE-PRJ-2017-01] Multi-GPU Implementation of HPC Brain Simulations

High-performance solutions have become imperative for efficient simulation efforts conducted within Computational Neuroscience, the field that attempts to develop advanced in-silico models of brain functionality as a basis for extensive behavioral studies. Such biophysically meaningful networks are often very computationally demanding. GPUs have been identified as a crucial HPC technology for use in the field. Even though rudimentary single-node acceleration is relatively easy to implement (especially with GPU-MATLAB integration, for example), the performance it provides is often not enough for simulating networks large enough to approach the biological systems under study. In this topic, we want to explore the possibilities of multiple GPU fabrics running simultaneously to meet this performance requirement. Of special interest are low-latency accelerator interconnects such as the NVIDIA NVLink technology. As a representative benchmark, we use a highly advanced, computationally demanding brain model, developed by Erasmus MC for detailed in-house behavioral studies of the olivocerebellar system in the human cerebellum. This thesis topic is a joint venture between the Erasmus Medical Center (Neuroscience dept.) and TU Delft (Computer Engineering). Interested? Contact Zaid Al-Ars

[CE-PRJ-2016-04] FPGAs on FPGAs: Enabling FPGA Design Using Intermediate HW Logic Fabrics

At the Computer Engineering laboratory at TU Delft we are striving to find solutions to implement a heterogeneous cluster or cloud, which consists of nodes with CPUs, GPUs and FPGAs as computing units. This heterogeneous cluster/cloud should make effective use of the power of GPU/FPGA accelerators to solve Big Data problems. The long-term vision is to map applications or parts of applications automatically to specific computing units, without the application programmer having to deal with low-level issues. When such a mechanism is available, it is expected that high-level programmers who program in, for example, Java or Scala, on top of frameworks such as Spark, will be able to unlock more of the raw computing power that is made available by the hardware. Especially for FPGAs, HLS tools can already provide some of this functionality, but they still do not solve the problem of having no binary compatibility between different FPGAs, which might be essential in distributing compiled applications in a cluster. For normal CPUs, this is often solved by using a virtual machine such as the JVM. Analogous to this mechanism, FPGAs could implement a coarse-grain reconfigurable array (of basic logic blocks), which in the context of FPGAs is often called an overlay architecture or intermediate fabric.

The thesis project will consist of the following parts / research questions:

1) Find out the current state of the art in overlay architectures / intermediate fabrics.
2) Determine a set of common use cases within the Big Data context that could benefit from acceleration with such a fabric.
3) Implement such a fabric, thereby gaining detailed understanding of the challenges for such architectures, possibly improving on the state of the art, which could result in publishable work.
4) Map the use-case application onto the fabric (manual mapping is acceptable at this stage of the project).
5) Report performance measurements and give recommendations regarding the potential of the implemented platform with respect to the long-term vision.

Interested? Contact Zaid Al-Ars
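To make the overlay idea concrete, the toy model below treats an intermediate fabric as a row of coarse-grain functional units driven by a small, device-independent configuration, analogous to bytecode for the JVM. The configuration format is purely illustrative, not that of any existing overlay tool.

```python
import operator

# Operations the coarse-grain functional units can be configured to perform.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def run_overlay(config, inputs):
    """config: one (op, src_a, src_b) tuple per functional unit.
    Sources index into `values`, which starts with the primary inputs,
    so later units can consume earlier units' results (a simple dataflow).
    The last unit's output is the fabric's result."""
    values = list(inputs)
    for op, a, b in config:
        values.append(OPS[op](values[a], values[b]))
    return values[-1]

# Compute (x + y) * (x - y) on the fabric, with x=5, y=3:
cfg = [("add", 0, 1), ("sub", 0, 1), ("mul", 2, 3)]
print(run_overlay(cfg, [5, 3]))  # -> 16
```

The key property being modeled is portability: the same `cfg` "bitstream" would run on any physical FPGA that hosts this overlay, sidestepping the binary-compatibility problem mentioned above.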

[CE-PRJ-2016-03] Modelling Processor Execution Time Performance

As a technical leader, ASML would like to promote cutting-edge research to find a long-term, structural solution that mitigates, very early in a project, the risk of software not being able to meet its worst-case execution time (WCET) deadlines. If this is achieved, the decision about the computing platform can be made very early in the project cycle, which will save a considerable amount of software developer effort and project costs. Moreover, it will help bring certainty to project plans. The long-term research objective hence is to build a model for high-accuracy estimation of the workload and WCET performance. A toolset (CARM2G) has already been developed which can accurately predict performance for a multi-processor, multi-core execution platform. The first step in this assignment is to evaluate whether the CARM2G toolset meets the estimation accuracy requirements and whether it is sufficient for modelling the application and the computing platform for one of the subsystems (SPM) developed by ESD. Interested? Contact Zaid Al-Ars
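As a rough illustration of the estimation problem (not of how CARM2G works), the sketch below derives a measurement-based high-water-mark estimate: the maximum observed execution time inflated by a safety margin. Real WCET analysis is model-based and must bound, not merely observe, the worst case.

```python
import time

def measure(task, runs=100):
    """Collect execution-time samples for a callable (seconds)."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        task()
        samples.append(time.perf_counter() - t0)
    return samples

def wcet_estimate(samples, margin=1.2):
    """High-water-mark estimate: max observed time times a safety margin.
    The 20% margin is an arbitrary illustrative choice; it gives no
    formal guarantee, which is exactly why model-based WCET tools exist."""
    return max(samples) * margin

samples = measure(lambda: sum(range(10_000)))
print(f"estimated WCET: {wcet_estimate(samples) * 1e6:.1f} us")
```

Comparing such empirical estimates against a model's predictions is one simple way to sanity-check the model's accuracy on a given platform.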

[CE-PRJ-2016-02] Acceleration of Big Data Algorithms for Behavioral Experiments

The movement of mouth whiskers in many mammals is characteristic of their brain activity, similar to what finger movement is in humans. Neuroscientists can deduce a plethora of information on behavior by mounting whisker-tracking experiments, i.e. experiments where animals (typically mice or rats) are tracked for their whisker movements subject to various stimuli, such as an air puff in their eyes, auditory stimuli, and so on. The Erasmus MC developed an experimental setup which records whisker movements on head-fixed mice. Recording is done through a high-speed camera that generates large amounts of image stacks, which are then sent to a computer for post-processing through a powerful yet slow Matlab program. Current experiment runs generate 15 seconds of whisker-tracking video, which occupies 2-4 GB of disk space to store and takes about 2 weeks of post-processing in Matlab. At the moment, dozens of videos are generated per week, which not only puts high pressure on the storage equipment needed but is also detrimental to the fast and efficient analysis of the behavioral experiments. The goal of the student in this thesis is to study the open-source Matlab code and port the compute- and data-intensive parts of it to a high-performance, FPGA-based computing platform (Maxeler). This will not only accommodate experiments in the lab but will also be the first, crucial step toward supporting closed-loop behavioral experiments, where specific whisker movements evoke (in real time) a suitable response by the analysis machine, enabling a crucial class of neuroscientifically relevant experiments. Interested? Contact Zaid Al-Ars
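One of the compute kernels such a tracker needs is estimating a whisker's angle in a single frame. A minimal stdlib sketch, fitting a line through bright pixels by least squares (the threshold value and frame layout are illustrative assumptions, not those of the actual Matlab code):

```python
import math

def whisker_angle(frame, threshold=128):
    """frame: 2D list of grayscale values (rows of pixel intensities).
    Fit a line through all pixels at or above the threshold and return
    its angle in degrees, via least squares on the centered coordinates."""
    pts = [(x, y) for y, row in enumerate(frame)
                  for x, v in enumerate(row) if v >= threshold]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n          # centroid x
    my = sum(y for _, y in pts) / n          # centroid y
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return math.degrees(math.atan2(sxy, sxx))

# A synthetic 4x4 frame with a bright diagonal (slope 1, i.e. 45 degrees):
frame = [[255 if x == y else 0 for x in range(4)] for y in range(4)]
print(whisker_angle(frame))  # close to 45.0
```

In the real pipeline this per-frame kernel runs over thousands of frames and many whiskers, which is precisely the kind of embarrassingly parallel workload that maps well onto an FPGA dataflow engine.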

[CE-PRJ-2016-01] Public Gateway to Massive Neuron Simulations

The modeling of neurons, especially at biological and physiological accuracy, has proven to be a very complex and computationally demanding task. Also, in many cases, command of the neuron-modeling implementation is difficult for non-experts, as is its optimal mapping onto the proper computing infrastructure. Moreover, handling lengthy simulations at runtime is an especially difficult task, given the workload complexity. As a result, it is very desirable to hide the complex details of neuron modeling under a user interface that accepts the parameters of a neuron simulation and only provides the respective result. It is noteworthy that communities facing similar computational workloads do follow the paradigm of a centralized gateway to heavy simulations: for instance, the nanotechnology community features the nanoHUB portal. The envisioned interface is an online gateway that will accept neuron-modeling workloads and return the respective results. The gateway will function as a front-end to the greater neuron-modeling infrastructure, while the actual workload will be delegated to multi-/many-core or accelerator-based platforms. The student will receive the back-end implementation as legacy code and will be required to develop the online front-end. Specifications will be drawn up for the functionality and parametrization of the front-end, and a working prototype will be developed. The intention is that, by the end of the thesis, an alpha version of the gateway is publicly released. Interested? Contact Zaid Al-Ars
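A minimal sketch of what such a front-end could look like, using only Python's standard library: an HTTP handler that accepts simulation parameters as JSON and hands them to a stubbed back-end. The endpoint shape, parameter names and the `run_backend` stub are all hypothetical; the real gateway would delegate to the legacy simulation code.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_backend(params):
    """Stub standing in for the legacy neuron-simulation back-end."""
    return {"neurons": params["neurons"], "status": "queued"}

class GatewayHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body containing the simulation parameters.
        length = int(self.headers["Content-Length"])
        params = json.loads(self.rfile.read(length))
        result = run_backend(params)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(result).encode())

server = HTTPServer(("localhost", 0), GatewayHandler)  # port 0: pick any free port
print("gateway ready on port", server.server_address[1])
# server.serve_forever()  # blocks; uncomment to actually serve requests
```

A production gateway would add authentication, job queuing and result retrieval on top of this request/response skeleton.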

[CE-PRJ-2015-05] Multi-FPGA Implementation of Artificial-Cerebellum Computational Model

Over the last decade, an increasing amount of effort has been spent on constructing and then simulating powerful brain models that can greatly help in unraveling the mysteries of the human brain (see, for instance, the EU flagship project). Whereas these models are powerful and constantly come closer to real brain functionality, they are typically very computationally intensive, to the point that common platforms such as multicore CPUs fall short of reasonable execution times. We have thus turned to a more powerful platform: FPGAs. While FPGAs receive high marks when it comes to performance acceleration, their limited capacity is not sufficient for implementing large-scale brain simulations comprising (hundreds of) thousands of neurons. The subject of this topic is to extend a currently implemented, biologically accurate simulation platform (comprising a single FPGA) to incorporate multiple FPGAs. If the inter-FPGA communication challenges are recognized and sufficiently dealt with, this extension is expected to double the achievable real-time brain simulation capabilities with every new FPGA on the stack. The platform is to be used for biophysically meaningful simulations of cerebellar microsections in the Neuroscience Department of the Erasmus MC, Rotterdam. The student is expected to analyze the original single-FPGA neural models, identify latency-sensitive sections and potential optimizations, and then deploy (through the use of suitable EDA tools, e.g. Compaan) the original application onto a multi-FPGA arrangement. Interested? Contact Zaid Al-Ars
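A first-order concern when moving from one FPGA to several is how many synapses end up crossing chip boundaries, since each of those becomes inter-FPGA traffic. A toy sketch of block partitioning and cut counting (the partitioning scheme and network are illustrative only):

```python
def partition(n_neurons, n_fpgas):
    """Assign neurons to FPGAs in contiguous blocks (simplest scheme)."""
    per = -(-n_neurons // n_fpgas)            # ceiling division
    return [i // per for i in range(n_neurons)]

def cut_synapses(synapses, placement):
    """Count connections whose endpoints sit on different FPGAs;
    these are the ones that incur inter-FPGA communication latency."""
    return sum(placement[a] != placement[b] for a, b in synapses)

place = partition(8, 2)                        # neurons 0-3 on FPGA 0, 4-7 on FPGA 1
links = [(0, 1), (3, 4), (2, 7), (5, 6)]       # a hypothetical synapse list
print(cut_synapses(links, place))              # -> 2 cross-chip synapses
```

Better-than-block partitioners (e.g. graph partitioning heuristics) would minimize exactly this cut count, which is one concrete way to frame the inter-FPGA communication challenge mentioned above.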

[CE-PRJ-2015-04] Proposing a computational reference platform for genomics analysis based on the IBM Power 8

In this project, the student is expected to perform a detailed profiling of widely used genomics applications, identify commonalities, and analyze computational bottlenecks of these applications. The student would then optimize these applications for a cutting-edge IBM Power 8 system and construct a hardware cost function for optimal system utilization, covering processor, memory, as well as I/O. Based on this information, the student would provide recommendations to modify the system architecture and create an optimal reference platform for genomics applications. Interested? Contact Zaid Al-Ars
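The profiling step can be sketched with Python's built-in cProfile; a real study would profile the genomics tools themselves with native profilers, but the workflow is the same: run the pipeline, rank functions by cumulative time, and read off the hotspots. The function names below are hypothetical stand-ins.

```python
import cProfile
import io
import pstats

def alignment_kernel():        # hypothetical stand-in for a compute hotspot
    return sum(i * i for i in range(200_000))

def write_output():            # hypothetical stand-in for cheap glue code
    return "done"

def pipeline():
    write_output()
    return alignment_kernel()

# Profile one pipeline run and rank functions by cumulative time.
pr = cProfile.Profile()
pr.enable()
pipeline()
pr.disable()

buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(8)
report = buf.getvalue()
print(report)  # alignment_kernel dominates the cumulative-time ranking
```

The same ranking, produced over a representative set of genomics workloads, is the raw material for the commonality analysis and the hardware cost function described above.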

[CE-PRJ-2015-03] Design of big data model to predict possible disease onset based on biometric measurements data

In this project, the student is presented with a large set of human health data to be analyzed, such as blood tests, heart rate measurements, x-ray images, etc. This information is treated as an unstructured big data problem that could be scanned for possible data correlations. The student is expected to create automated analysis methods, such as machine learning or genetic algorithms, with specific objective functions that should be able to predict the onset of possible diseases in the data set. Interested? Contact Zaid Al-Ars
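A first, deliberately simple analysis step along these lines is a correlation scan: rank biometric features by how strongly they correlate with a known outcome label, to shortlist candidates for a proper predictive model. All data below is fabricated purely for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Fabricated measurements for six hypothetical patients:
records = {
    "heart_rate":  [62, 75, 88, 91, 70, 95],
    "blood_sugar": [4.8, 5.1, 7.9, 8.3, 5.0, 9.1],
    "height_cm":   [170, 182, 165, 174, 179, 168],
}
onset = [0, 0, 1, 1, 0, 1]  # 1 = disease observed later

# Rank features by absolute correlation with the outcome label.
ranked = sorted(records, key=lambda f: -abs(pearson(records[f], onset)))
print(ranked[0])  # -> blood_sugar (strongest correlation in this toy data)
```

Such a scan only flags linear associations; the machine-learning methods the project calls for would then model non-linear and multi-feature effects.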

[CE-PRJ-2015-02] Acceleration of cancer diagnosis algorithms on supercomputing FPGA platforms

In this project, the student is expected to analyze computationally expensive cancer diagnostics algorithms and identify time-consuming functions suitable for acceleration on FPGA fabrics. The student would then design and implement hardware kernels of these functions and show their computational effectiveness. Furthermore, a system-level implementation should be created with the kernels embedded as part of the algorithms to demonstrate the practical viability of the FPGA implementation. Interested? Contact Zaid Al-Ars

[CE-PRJ-2015-01] Evaluation of the accuracy and efficiency of DNA assembly algorithms

This project aims at identifying an optimal cost-accuracy trade-off for different DNA sequencing and assembly strategies. Various possible assembly strategies will be investigated, including short- and long-read sequencing from the Illumina and PacBio platforms. In this project, we will collaborate with the Broad Institute of MIT and Harvard and the US Joint Genome Institute on their Illumina as well as PacBio data to evaluate the accuracy and efficiency of assembly pipelines using completely finished genomes such as E. coli, M. tuberculosis and S. cerevisiae. Interested? Contact Zaid Al-Ars
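One standard contiguity metric used when evaluating assemblies against finished genomes is N50: the contig length such that contigs of that length or longer contain at least half of the total assembled bases. A minimal sketch:

```python
def n50(contig_lengths):
    """N50: walk contigs from longest to shortest and return the length
    at which the running total first reaches half of all assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length

print(n50([100, 200, 300, 400, 500]))  # -> 400
```

With finished reference genomes available, contiguity metrics like this would be combined with correctness metrics (mismatches and misassemblies against the reference) to compare pipelines fairly.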
