After playing with Linux OSes recently, I also tried to figure out other ways to contribute to GIMPS. In this blog, I will summarize the software recommended for the GIMPS project and its usage. GIMPS relies on PrimeNet for task distribution and result collection. Currently, three kinds of tasks are distributed in the search for new Mersenne primes: trial factoring (TF), P-1 factorization (PM1), and probable prime testing (PRP). I will mainly cover their usage and my recommended strategy in this blog.
First of all, you can directly execute these programs if you don't want to get credits. But I recommend creating an account on the GIMPS website so that you can use PrimeNet to automatically get new tasks and submit results. This is also the most effective way to help search for the next Mersenne prime.
There are generally two ways to run the programs. You can pass the arguments (such as the exponent) through the command line, or you can put the tasks in a `worktodo.txt` file. The format of these `worktodo.txt` files follows the official definition by PrimeNet. If you get tasks through manual assignment, what you receive is a block of text in which each line defines one task. The programs will recognize the tasks once you paste that text into the `worktodo.txt` file.
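For illustration, here is a hedged sketch of what those task definitions can look like; the `AID` assignment IDs, exponents, and bounds below are placeholders, and the PrimeNet definition mentioned above is the authoritative reference for the field layout:

```bash
# Create a worktodo.txt with one example entry per task type (placeholder values):
#   Factor  = TF of 2^p-1: assignment ID, exponent p, start and end bit level
#   Pminus1 = P-1: assignment ID, k, b, n, c (for k*b^n+c), B1, B2
#   PRP     = probable-prime test, same k*b^n+c notation
cat > worktodo.txt <<'EOF'
Factor=AID,115000001,76,77
Pminus1=AID,1,2,115000001,-1,800000,30000000
PRP=AID,1,2,115000001,-1
EOF
```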
## `mprime` for all tasks on CPU
`mprime` is the official program for primality testing from the GIMPS project. It's also called `prime95` (or P95) on Windows, where it's widely used for stress testing. The program supports all PrimeNet task types but computes only on the CPU. Nevertheless, it is extensively optimized for various CPU models, so it is the go-to choice for all tasks on the CPU.
The official releases of `mprime` can be obtained from the GIMPS website, while the latest beta versions can be downloaded from the mirror site[^1].
On Linux, you may need to install the GMP library before running `mprime` (this can be done with `sudo apt install libgmp10`); on Windows, all required libraries are bundled with the program, so no additional installation is needed. When you run the program for the first time, you will be prompted for your username (not the nickname) on GIMPS. Afterwards, you can set task preferences, the number of hours per day the tasks will run, and the amount of CPU and RAM available to the program. `mprime` consists of multiple workers that can perform different types of tasks; changing preferences per worker gives you more fine-grained control over computing resources.
Note that on Linux, after you configure the preferences for the first time, the program won't output any information to the terminal. If you want the menu back, start the program with `mprime -m`.
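Putting this together, a first run on Linux might look like the following sketch; the archive name is a placeholder for whichever release you actually downloaded:

```bash
# First-time mprime setup on Linux (archive name is a placeholder)
sudo apt install libgmp10              # GMP runtime needed on Linux
tar xzf p95v30.8.linux64.tar.gz        # unpack the release you downloaded
./mprime                               # first run: asks for your GIMPS username and preferences
./mprime -m                            # later runs: -m brings the interactive menu back
```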
## `mfaktc` for TF tasks on GPU
GPUs are a better fit for parallel computation than CPUs, and TF is a task that can be massively parallelized, so TF is where GPUs show the biggest advantage over CPUs. In the other two task types, the benefit of a GPU comes only from faster large-integer multiplication with FFT algorithms; the GPU's broader parallel capabilities go unused.
The programs commonly used and recommended by GIMPS are `mfaktc` (based on CUDA) and `mfakto` (based on OpenCL), both of which can be downloaded from the mirror site[^1]. It's worth noting that the compiled versions on the mirror site incorporate some additional optimizations (like adjusting parameters for newer GPUs), so if you want to compile the programs yourself, you can also download the modified source code from the mirror site. Besides, if you download a compiled version, you might need to install the CUDA runtime whose version matches the executable.
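Before picking a build, it may help to check which CUDA version your driver supports; `nvidia-smi` prints this in its banner:

```bash
# The banner of nvidia-smi reports the driver version and the highest
# CUDA version it supports, which the mfaktc build must not exceed.
nvidia-smi
```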
These two programs are easy to use. Take `mfaktc` for example: download the tasks from the GIMPS website and put the task definitions into a `worktodo.txt`, then execute `mfaktc` from the same directory and you are good to go. At the beginning of each run, `mfaktc` performs a quick check and notifies you if there are problems with your GPU driver or hardware, which is actually a good way to verify that your GPU is working normally.
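A minimal session might look like this sketch, reusing the placeholder `Factor=` format from earlier (the binary name varies by build):

```bash
# Put one TF assignment into worktodo.txt, then run mfaktc from the same directory.
cat > worktodo.txt <<'EOF'
Factor=AID,115000001,76,77
EOF
./mfaktc    # runs its quick check first, then works through worktodo.txt
```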
`mfaktc` is also used by some prime-finding projects on BOINC. SRBase is an example that distributes tasks to users with this program and uploads the factoring results back to PrimeNet. Therefore, if you participate in those BOINC projects, you are also contributing to GIMPS; however, the credits will be counted on the BOINC project, not on your GIMPS account.
## `GpuOwl` for PRP tasks on GPU
`GpuOwl` is open-source software developed by Mihai Preda[^2]; it's the only one of the GPU programs for GIMPS that is still being actively developed, so the instructions here may become out of date. The latest version of `GpuOwl` at the time of writing is v7.2. Its executables for Windows can be downloaded from the mirror site[^1], but on Linux you will most likely have to compile it yourself, since there are no compiled builds on the website.
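If you do build it yourself, a rough sketch looks like this (it assumes a working OpenCL development environment, e.g. the `ocl-icd-opencl-dev` package on Ubuntu, plus the GMP development headers):

```bash
# Build GpuOwl from source; the repository is the one linked in the footnotes.
git clone https://github.com/preda/gpuowl
cd gpuowl
make            # produces the gpuowl binary in the repository root
```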
`GpuOwl` supports both PRP and PM1 tasks, but it's designed for PRP on GPUs. Since only `GpuOwl` can produce the certificate file for PRP tasks, other similar software (such as `CudaLucas`) is no longer recommended. The `primenet.py` script[^3] provided with `GpuOwl` can automatically obtain and submit tasks.
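A hedged example invocation (the flag names here are assumptions from memory, so check `python3 primenet.py --help` for your version):

```bash
# Run from the GpuOwl work directory so the script can manage worktodo.txt
# and upload result files. -u/-p are assumed to be the username/password flags.
cd /path/to/gpuowl/work
python3 primenet.py -u your_gimps_username -p your_password
```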
`GpuOwl` also supports multi-GPU computation: launch multiple `GpuOwl` processes, specify the target GPU for each process with the `-device` argument, and use the `-pool` argument to point them all at a shared workspace. Put the task definitions into the `worktodo.txt` in this workspace, and every `GpuOwl` process will fetch its tasks from that file.
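For example, a two-GPU setup could look like this sketch (directory names are placeholders):

```bash
# One shared pool directory holds the common worktodo.txt;
# each process claims tasks from it and works on its own GPU.
mkdir -p pool && cp worktodo.txt pool/
./gpuowl -pool pool -device 0 &
./gpuowl -pool pool -device 1 &
```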
## `GpuOwl` + `mprime` for PM1 tasks on GPU
The software with support for PM1 tasks includes `GpuOwl` and `CUDAPm1`. The code of the latter hasn't been updated for almost a decade and it only supports NVIDIA GPUs, so `GpuOwl` is still the best option for PM1. After a recent update, `GpuOwl` supports a clever way to compute PM1 tasks: use `GpuOwl` for the stage 1 calculation, then feed the result into `mprime`, which performs the stage 2 calculation. This way we benefit both from the fast computation of the GPU and from a large amount of memory for stage 2 with `mprime`. (Generally speaking, the VRAM of a GPU is smaller than the RAM of the host machine, and the more memory available for stage 2 the better.) Right now `GpuOwl` will not directly output the PM1 result because it doesn't perform the final GCD step of stage 1; therefore it has to be paired with `mprime` for PM1 tasks.
The way to do this is to specify a directory with the `-mprimeDir` argument, where the stage 1 results will be stored in a format readable by `mprime`. Multiple files will be generated under this directory: one file per exponent, plus a `worktodo.add` file. Copy these files to the directory of `mprime`; `mprime` will then recognize them and merge the tasks into its own `worktodo.txt`.
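A sketch of the handoff, with placeholder paths (the `-mprimeDir` flag is the one just described; other arguments are omitted):

```bash
# Stage 1 on the GPU; stage 1 results are written to ./for-mprime
# in an mprime-readable format, together with a worktodo.add file.
./gpuowl -mprimeDir ./for-mprime
# Hand everything over to mprime, which will merge worktodo.add
# into its own worktodo.txt and run stage 2.
cp ./for-mprime/* /path/to/mprime/
```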
It’s recommended to limit
mprime
from receiving new tasks and only take the tasks generated byGpuOwl
. According to the tutorial by the author, add following config to theprime.txt
under the directory ofmprime
:
```ini
SequentialWorkToDo=1
MaxExponents=1000
UnreserveExponents=1000
NoMoreWork=1
```
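As I read these settings, `NoMoreWork=1` tells `mprime` not to request any new assignments from PrimeNet, `SequentialWorkToDo=1` makes it process the worktodo entries in order, and the two exponent limits control how large the local queue is allowed to grow. (This is my understanding; the `undoc.txt` file shipped with `mprime` documents these options.)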
## Comparison of credit efficiency between task types
If you want to earn more credits on GIMPS, it's necessary to know that the efficiencies of the three task types are different. Although GIMPS credits are assigned based on the amount of computation, that amount is just an estimate, since hardware models and software optimizations differ, and it is not proportional to the time you actually spend. The unit GIMPS uses to measure computation is GHz-days (GHD): 1 GHD is roughly the amount of computation done by a 1 GHz CPU running for 24 hours.
In my experience, TF is the most efficient way to earn GHD credits, then PM1, and lastly PRP. Although I haven't run any CPU experiments, my experience on GPUs is that with a graphics card similar to a 1080 Ti, the credit efficiency of TF can reach 700 GHD per day, PM1 gets 200 to 300, and PRP only 100 to 200 GHD per day. So I choose the task type based on these considerations: run TF if you want the most credits, run PM1 if you want to take advantage of your beefy RAM or to find the biggest factors, and run PRP if you want to be the next person to discover a Mersenne prime. If you want to work out the efficiency yourself, you can use the credit calculator.
Besides, based on my experience (on GPUs), the efficiencies of the different task types behave as follows:
- TF: The lower the bit level and the lower the exponent, the higher the efficiency. (Therefore the wavefront TF tasks on PrimeNet are actually the most efficient.) According to `mfaktc`'s own estimates, running exponents around 200M to 74 bits produces about 750 GHD per day (on a 1080 Ti), but only about 700 GHD per day on exponents around 900M.
- PM1: Both the credits and the computation time mainly depend on B2, so it's better to set a lower value for B1. However, the influence of exponent size and B2 on efficiency is not clear to me yet.
Here are some reference data points (using an RTX 4090 for stage 1 and a 13700K CPU for stage 2):
| Exponent | B1 | B2 | Stage 1 time | Stage 2 time | Final credits | Credit efficiency |
| --- | --- | --- | --- | --- | --- | --- |
| 119M | 2.4M | 260M | 2 hr | 2.5 hr | 126 GHD | 28 GHD/h |
| 119M | 3.2M | 260M | 2.5 hr | 2.5 hr | 132 GHD | 26 GHD/h |
| 332M | 3.2M | 210M | 7 hr | 19.5 hr | 362 GHD | 13.6 GHD/h |
| 332M | 3.2M | 180M | 7 hr | 18.5 hr | 323 GHD | 12.7 GHD/h |

- PRP: It seems the higher the exponent, the higher the efficiency. On an RTX 4080 it takes about 53 hours to complete an exponent around 113M, for final credits of ~500 GHD, an efficiency of 9.4 GHD per hour; meanwhile, it takes about 500 hours to complete an exponent around 323M, for final credits of ~5000 GHD, an efficiency of 10 GHD per hour.
In this blog, I covered the basic usage of the software above. For advanced usage, you can refer to the official documentation (although most of it is not very user-friendly) or this summary post on the Mersenne forum.
The strategy I currently use is:
- For TF: run on GPU only. Use the work distribution table to find the available TF tasks with the smallest exponents and bit levels, then reserve them through the manual assignment page.
- For PM1: on CPU, use `mprime` to automatically obtain first-time PM1 tasks with the smallest exponents. On GPU, select the PM1 tasks with the smallest exponents, or those with over-100M-digit exponents and a small B1[^4].
- For PRP: run on GPU only. Use `GpuOwl` to automatically obtain the PRP tasks with the smallest exponents or with over-100M-digit exponents.
[^1]: The website [mersenne.ca](https://www.mersenne.ca/) built by James Heinrich provides various tools, data, and rankings for GIMPS. It also provides builds of almost all the unofficial software.

[^2]: The GitHub repository for GpuOwl is [github.com/preda/gpuowl](https://github.com/preda/gpuowl).
[^3]: You can specify the type of tasks to obtain with `primenet.py`; the definition of each type name can be found on the GIMPS website.

[^4]: You can find the exponents with over 100M decimal digits and no PM1 done with this link.