Olympus - a computing cluster at DSV
Olympus is a computing cluster based on the Slurm workload manager. It is mostly geared towards GPU computation, but it can also be used for more general CPU-bound tasks. The cluster is currently in its infancy and will be growing significantly over the course of the spring semester 2026.
All inquiries regarding Olympus should be sent to slurmsupport@dsv.su.se. This includes requests for access to the cluster.
All users, both students and staff, must apply for access for each semester in which they want to use Olympus.
For thesis students
If you are writing, or intend to write, a thesis and need access to Olympus, follow these steps:
- Produce a working proof of concept
- Ask your supervisor to request access to Olympus on your behalf
- Adapt your code to run on the cluster
- Run your experiment
1. Produce a working proof of concept
Running code on a cluster differs from running code on your own computer. Your code does not run interactively, which makes debugging more difficult, and there is no guarantee that your code runs immediately when you submit a job. Because of this, it is crucial that you have a testbed somewhere else where you can test changes and validate your workflow before you attempt to run your code on the cluster. Usually this means setting up a smaller version of your experiment that you can run either on your own computer or in the computer labs. This proof of concept should remain your testbed for developing your program once you start running your experiments on Olympus, since diagnosing problems is much easier in that limited environment than on the full cluster.
As an example, if you want to run an image classification experiment where a machine learning model is to identify the number and species of birds in photographs, you might produce a version of your experiment that uses the smallest possible version of the machine learning model you intend to use, and feed it a single picture of a bird. This should be possible to run on most hardware, and you will be able to do any initial debugging of your program in this limited implementation. Once you have a working implementation of your program in this limited form, you are ready to move on to requesting access to Olympus.
In order to make things easier going forward, do your best to isolate the portions of your program that will need modifying when you scale up your experiment to the size that you intend to run on Olympus.
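One way to isolate the parts that change with scale is to collect them in a single configuration object, as in the minimal sketch below. The class, the preset names and all values (model names, image counts, batch sizes) are hypothetical placeholders for illustration, and `run_experiment` is only a stand-in for your real code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Everything that changes between the proof of concept and the full run."""
    model_name: str   # hypothetical model identifier
    max_images: int   # how many input images to process
    batch_size: int


# Hypothetical presets; the values are placeholders, not recommendations.
PROOF_OF_CONCEPT = ExperimentConfig(model_name="tiny", max_images=1, batch_size=1)
FULL_RUN = ExperimentConfig(model_name="large", max_images=50_000, batch_size=64)


def select_images(all_paths: list[str], config: ExperimentConfig) -> list[str]:
    """Limit the input to the number of images the chosen preset allows."""
    return all_paths[: config.max_images]


def run_experiment(all_paths: list[str], config: ExperimentConfig) -> dict:
    """Stand-in for the real experiment: reports what would be processed."""
    images = select_images(all_paths, config)
    return {"model": config.model_name, "images": len(images)}


if __name__ == "__main__":
    paths = [f"bird_{i}.jpg" for i in range(10)]
    print(run_experiment(paths, PROOF_OF_CONCEPT))
```

With this layout, scaling up means passing a different preset, while the rest of the program stays untouched.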
2. Ask your supervisor to request access to Olympus on your behalf
When your proof of concept is in place, you can ask your supervisor to request access to Olympus on your behalf. The information they will need to submit in order for you to be granted access is:
- A couple of sentences describing your experiment. This lets us determine whether Olympus is a good fit or whether we need to provide some other solution for your project.
- The SU usernames of all authors of the thesis.
Access requests will be processed within a few working days of submission. If we determine that your project is unsuitable for Olympus (for example if it requires a very tight feedback loop or must run on Windows), we will work with you and your supervisor to provide an alternative solution. We have some machines in reserve for such projects.
3. Adapt your code to run on the cluster
For your code to run on Olympus, you will need to write a wrapper script that tells Slurm which resources your job needs, what program to run, where to store output, and so on. Documentation on how to write this script will be published here shortly. Regardless of the state of the documentation when you write your script, you will likely run into issues that you need help sorting out. Contact slurmsupport@dsv.su.se for help with this or any other questions regarding Olympus.
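To give a rough idea of what such a wrapper script contains, the sketch below generates the text of a generic Slurm batch script. It is not the official Olympus template: the `#SBATCH` options used are standard Slurm options, but the helper function, its default values and the file names are assumptions made for illustration, and the settings that apply on Olympus may differ.

```python
def render_batch_script(job_name: str, command: str, *, gpus: int = 1,
                        cpus: int = 4, memory: str = "16G",
                        time_limit: str = "01:00:00") -> str:
    """Return the text of a generic Slurm batch script.

    All default values are placeholders, not Olympus recommendations.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --output={job_name}-%j.out",  # %j expands to the job ID
        f"#SBATCH --time={time_limit}",         # wall-clock limit, HH:MM:SS
        f"#SBATCH --cpus-per-task={cpus}",
        f"#SBATCH --mem={memory}",
        f"#SBATCH --gres=gpu:{gpus}",           # number of GPUs requested
        "",
        command,                                # the program Slurm should run
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job name and command.
    script = render_batch_script("bird-count", "python3 experiment.py")
    with open("bird-count.sh", "w") as handle:
        handle.write(script)
```

The resulting file would then be submitted with `sbatch bird-count.sh`. Writing the script by hand works just as well; generating it is mainly useful if you submit many similar jobs.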
While adapting your code, make sure that your proof of concept stays up to date and remains a representative minimal version of your experiment. This will help you sort out problems more quickly and easily than if you only run your code on the cluster. Ideally, the only differences between your on- and off-cluster code should be the size of the model you run and the amount of input it processes.
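One possible way to keep a single code base is to let the program choose its scale depending on whether it runs inside a Slurm job. The sketch below relies on the `SLURM_JOB_ID` environment variable, which Slurm sets for running jobs. The function names and the scale values are hypothetical, and an explicit command-line flag would be an equally reasonable alternative.

```python
import os
from collections.abc import Mapping


def running_under_slurm(env: Mapping[str, str] = os.environ) -> bool:
    """Slurm sets SLURM_JOB_ID inside a job; it is normally absent elsewhere."""
    return "SLURM_JOB_ID" in env


def choose_scale(env: Mapping[str, str] = os.environ) -> dict:
    """Pick model size and input count; the rest of the program stays identical."""
    if running_under_slurm(env):
        return {"model_size": "large", "max_images": 50_000}  # placeholder values
    return {"model_size": "tiny", "max_images": 1}            # proof-of-concept scale


if __name__ == "__main__":
    print(choose_scale())
```

Passing the environment as a parameter makes it possible to check both branches off the cluster, without submitting a job.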
4. Run your experiment
Once you have your wrapper script in place and your proof of concept code works, you can run your experiment.