Digital Scholarship Centre

If you are interested in incorporating digital methods into your research, the DSC is your one-stop shop for the many resources available at the University.

What is eResearch and High-Performance Computing?

What is eResearch?
eResearch is the use of advanced information and communication technologies (ICTs) in research practice, including collaboration, computing (such as high-performance computing), visualisation, research data management and other tools. eResearch enables researchers to be more innovative, to accelerate their research, and to collaborate more effectively.

What is High-Performance Computing (HPC)?
An HPC system is a collection (cluster) of computers connected through a fast network. It enables researchers to use multiple computers together to solve a single computational problem, and it lets a researcher run a large analysis on a remote system, freeing up their own computer for other work. The HPC is used to perform, within a reasonable time, massive simulations and calculations that would be virtually impossible on a single desktop computer.
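A minimal sketch of the idea, using MPI (a common way to program clusters, though not necessarily what this HPC's users run): many cooperating processes, possibly on different computers, each handle a slice of one large calculation.

    from mpi4py import MPI  # requires an MPI installation

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # this process's ID within the cluster job
    size = comm.Get_size()   # total number of cooperating processes

    # Each process sums a different slice of the same large range,
    # so the work is split across every core/node in the job.
    partial = sum(range(rank, 10_000_000, size))
    total = comm.reduce(partial, op=MPI.SUM, root=0)

    if rank == 0:
        print(f"sum computed by {size} processes: {total}")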

Who uses the HPC?
The HPC unit has over 300 researcher accounts from 30 different departments. Most of the departments fall under the Faculty of Natural and Agricultural Sciences, but there are also users from the Faculty of Health Sciences and, in 2018, the first researcher from the Faculty of Humanities started using the HPC.

The impact of using the HPC
An example of the impact the HPC can have on research comes from a research group in the Humanities (Department of Afrikaans and Dutch, German and French) collaborating with researchers in the Department of Mathematical Statistics and Actuarial Sciences on a quantitative approach to language comparison. In the study, the researchers had to perform 13 different statistical analyses on 195 different languages (over 8 billion words). The principal investigator could have run these analyses on her own computer, but it would have taken an estimated 54 days, during which she would not have been able to use her computer effectively and would most likely have suffered interruptions (such as load shedding). After modifying a small portion of her statistical analysis to run on the HPC, she retrieved the initial results within eight hours.
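The speed-up comes from the fact that each analysis-language pair is independent, so the 13 x 195 = 2 535 tasks can run side by side instead of one after another. A minimal sketch of the pattern, not the researchers' actual code (the analysis and corpus names below are placeholders):

    from concurrent.futures import ProcessPoolExecutor
    from itertools import product

    LANGUAGES = [f"lang_{i:03d}" for i in range(195)]    # placeholder corpus IDs
    ANALYSES = [f"analysis_{i:02d}" for i in range(13)]  # placeholder analysis names

    def run_analysis(task):
        """Hypothetical stand-in for one statistical analysis on one language."""
        analysis, language = task
        # ... load the corpus for `language` and compute `analysis` here ...
        return (analysis, language, "result")

    if __name__ == "__main__":
        tasks = list(product(ANALYSES, LANGUAGES))  # 2 535 independent tasks
        # Spread the tasks over every available CPU core instead of running
        # them one after another on a single desktop machine.
        with ProcessPoolExecutor() as pool:
            results = list(pool.map(run_analysis, tasks))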

Statistics for 2019
The HPC delivered 7 603 169 CPU hours in 2019, the equivalent of 867 years, 343 days, 15 hours and 59 minutes of continuous computation. In other words, if all the calculations were performed on a single computer with a single CPU, it would take over 867 years to complete the work the HPC performed in 2019.
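That conversion is straightforward arithmetic; a quick sanity check of the headline figure (using 365-day years, which the figure above implies):

    cpu_hours = 7_603_169                    # total CPU hours delivered in 2019
    years = cpu_hours / (24 * 365)           # single-CPU years, assuming 365-day years
    print(f"{years:.1f} single-CPU years")   # -> 867.9 single-CPU years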

The HPC's computational capacity

The current HPC has 36 compute nodes that perform the calculations, with a total of 5 560 CPU cores and 13.8 TB of system memory (RAM). The HPC also recently installed two new GPU servers, used predominantly for image processing, artificial intelligence (AI) and machine learning. Each GPU server has four NVIDIA® Tesla V100 GPU cards installed. Each of these GPUs can perform 7.8 teraFLOPS in double precision (7.8 trillion double-precision floating-point operations per second), giving each GPU card the equivalent throughput of about 18 midrange desktop computers, or about 72 midrange computers per GPU server.
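Those equivalences follow directly from the per-card rating; a quick check (the roughly 0.43-teraFLOPS figure for a midrange desktop is implied by the comparison, not stated):

    v100_tflops = 7.8                      # double-precision teraFLOPS per Tesla V100
    gpus_per_server = 4
    print(v100_tflops * gpus_per_server)   # 31.2 TFLOPS per GPU server
    print(round(v100_tflops / 18, 2))      # ~0.43 TFLOPS implied per midrange desktop
    print(gpus_per_server * 18)            # 72 desktop equivalents per server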

The current 90 terabytes of storage are made available to all computers in the HPC over a fast Intel® Omni-Path network. The Omni-Path network can transmit 200 Gb/s between compute nodes, which is at least 200 times faster than a typical 1 Gb/s office network. The HPC was also the first in Africa to make use of this network technology.

The overall calculation capacity of the system is referred to as its theoretical peak performance; for the HPC this is 137.432 teraFLOPS (double precision). A teraFLOP is one trillion (10 to the power of 12) floating-point operations per second. Imagine it were humanly possible to perform one mathematical calculation every second of the day, 24/7, every day of the year: you would have to keep that up, non-stop, for 31 688.77 years to match the number of calculations a one-teraFLOP computer performs in a single second.
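The 31 688.77-year figure follows from dividing 10 to the power of 12 operations by one operation per second; a quick check, assuming 365.25-day years:

    flops = 1e12                              # one teraFLOP: 10**12 operations per second
    seconds_per_year = 60 * 60 * 24 * 365.25  # assuming 365.25-day years
    print(f"{flops / seconds_per_year:,.2f} years")  # -> 31,688.77 years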

HPC services

The HPC team has extensive technical knowledge and experience gained through years of engagement with the research community, follows international standards, and keeps up with trends in eResearch. Its services include:

  • Assisting newcomers
    • Training
    • Training videos
    • Technical documentation
    • Referrals
  • Automation and workflows
    • Automation of research processes, e.g. writing scripts to perform repetitive tasks
    • Working with researchers to understand their research processes and identify where they can be automated
  • Software optimisation
    • Understanding hardware features and compiling software to take advantage of them
  • System design and maintenance
    • Hardware compatibility
    • Securing infrastructure
    • Regular backups of data
  • System operations, e.g. job scheduling to ensure fair sharing of resources among researchers (see the sketch after this list)
  • Training
    • Linux
    • HPC usage (using the software relevant to the different research groups)
    • HPC lectures (technical training and cluster building for computer science students)
    • Data Carpentry (collaboration with the Data Carpentry Foundation)
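Job scheduling is central to how the cluster is shared fairly: a researcher describes a job's resource needs in a small script, submits it to a queue, and the scheduler starts it when resources become free. A minimal sketch, assuming a SLURM-style scheduler (the scheduler this HPC runs is not named here) and a hypothetical run_analysis.py:

    import subprocess
    import textwrap

    # Assumed: a SLURM-style scheduler; the directives below are illustrative only.
    job_script = textwrap.dedent("""\
        #!/bin/bash
        #SBATCH --job-name=language_stats
        #SBATCH --ntasks=1
        #SBATCH --cpus-per-task=16      # request 16 cores on one node
        #SBATCH --time=08:00:00         # wall-time limit: 8 hours
        python run_analysis.py          # hypothetical analysis script
    """)

    with open("job.sh", "w") as f:
        f.write(job_script)

    # The scheduler queues the job and starts it when resources are free,
    # which is how fair sharing among researchers is enforced.
    subprocess.run(["sbatch", "job.sh"], check=True)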

For more information on training or to book a session, contact the DSC.