NONMEM Users Network Archive

Hosted by Cognigen

Re: setup of parallel processing and supporting software - help wanted

From: Mark Sale <msale>
Date: Wed, 9 Dec 2015 13:42:51 +0000

Maybe a little more clarification:

Thanks to Bob for pointing out that the [...] option implements some code for load balancing, and there really is no downside, so it should probably always be used.

Contrary to other comments, NONMEM 7.3 (and 7.2) does parallelize the covariance step. Ruben is correct that the $TABLE step is not parallelized in 7.[...]

WRT "sometimes it works and sometimes it doesn't", we can be more specific than this. The parallelization takes place at the level of the calculation of the objective function. The data are split up and the OBJ calculation for each subset of the data is sent to one of multiple processes. When all processes are done, the results are compiled by the manager program. The total round trip time for one process is then the calculation time + I/O time. Without parallelization, there is no I/O time. For each parallel process, the I/O time is essentially fixed (in our benchmarks maybe 20-40 msec per process on a single machine). The variable of interest then is the calculation time.

If the calculation time is 1 msec and the I/O time is 20 msec, and you parallelize to 2 cores, you cut the calculation time to 0.5 msec but now have 40 msec (2*20 msec) of I/O time, for a total of 40.5 msec, much slower. If the calculation time is 500 msec and you parallelize to 2 cores, the total time is 250 msec (for calculation) + 2*20 msec (for I/O) = 290 msec.

The key parameter then is the time for a single objective function evaluation (not the total run time). If the time for a single function evaluation is > 500 msec, parallelization will be helpful (on a single machine). There really isn't anything very mystical about when it helps and when it doesn't. The efficiency depends very little on the size of the data set, except that the limit of parallelization is the number of subjects (the data set must be split up by subject).
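
The arithmetic above can be written down as a small Python sketch. The function name and parameters are hypothetical, and the model is only the simplified one described in this message (calculation time divided evenly across processes, plus a fixed I/O cost per process); it is not a NONMEM API and not a measured benchmark.

```python
def round_trip_ms(calc_ms, io_ms_per_process, n_processes):
    """Estimated time for one objective function evaluation, in msec.

    Simplified model from the message: the calculation is divided evenly
    across the processes, and each process adds a fixed I/O cost.
    """
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    if n_processes == 1:
        return float(calc_ms)  # no parallelization, so no I/O time
    return calc_ms / n_processes + n_processes * io_ms_per_process


# The two cases worked through in the text (20 msec of I/O per process).
print(round_trip_ms(1, 20, 2))    # 40.5, slower than the 1 msec serial time
print(round_trip_ms(500, 20, 2))  # 290.0, faster than the 500 msec serial time
```

Plugging in your own measured time per function evaluation gives a quick first guess at whether adding processes is likely to help.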

Mark Sale M.D.
Vice President, Modeling and Simulation
Nuventra, Inc. ™
2525 Meridian Parkway, Suite 280
Research Triangle Park, NC 27713
Office (919)-973-0383


From: owner-nmusers On Behalf Of Faelens, Ruben (Belgium) <Ruben.Faelens
Sent: Wednesday, December 9, 2015 5:42 AM
To: Pavel Belo; nmusers
Subject: RE: [NMusers] setup of parallel processing and supporting software - help wanted

Hi Pavel,

In general, parallelization discussions always revolve around the following question: “Can you create independent blocks of work?”

You should make a clear distinction here between parallelizing NONMEM and running several NONMEM runs in parallel. Let’s talk about estimating a single model, doing a covariate search, and doing a bootstrap.

Very roughly speaking, NONMEM works as follows:

1) Pick a THETA

2) Estimate the probability curve of all ETAs for all subjects

3) Compute the integral over all probability curves to find a probability for THETA

4) Pick a more likely THETA, rinse and repeat

Parallelizing NONMEM means parallelizing step #2. Steps #1, #3 and #4 cannot be parallelized.

In practice, we simply split up the subjects into N groups. Each worker calculates the probability curve for all subjects in its group and sends the results back to the main worker, which can then carry out steps #3 and #4.
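
As a rough illustration of that split, here is a toy Python sketch. It is not how NONMEM is implemented: `subject_contribution` is a hypothetical stand-in for the per-subject ETA step, threads stand in for NONMEM worker processes, and the round-robin grouping is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def split_subjects(subject_ids, n_groups):
    """Deal subjects round-robin into groups; never more groups than subjects."""
    n_groups = min(n_groups, len(subject_ids))
    return [subject_ids[i::n_groups] for i in range(n_groups)]


def subject_contribution(subject_id):
    """Hypothetical stand-in for the per-subject work of step #2."""
    return float(subject_id) ** 2


def worker(group):
    """Each worker handles every subject in its own group."""
    return sum(subject_contribution(s) for s in group)


def objective(subject_ids, n_workers):
    """Main worker: farm out step #2, then combine the partial results."""
    if not subject_ids:
        return 0.0
    groups = split_subjects(subject_ids, n_workers)
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        partials = list(pool.map(worker, groups))
    return sum(partials)  # combining happens in one place, as in step #3


print(split_subjects([1, 2, 3, 4, 5], 2))  # [[1, 3, 5], [2, 4]]
print(objective([1, 2, 3, 4, 5], 2))       # 55.0
```

Note that asking for more workers than there are subjects simply yields one subject per group, since the data can only be split by subject.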

This works well for very complex models with a considerable estimation step: ODE systems. If you have a very fast model (e.g. simple $PRED section) and a huge dataset, it might be faster to run all of this locally on a single [...]

Note that NONMEM 7.3 does not parallelize anything other than the subject ETA estimation step! NONMEM 7.4 will also parallelize the TABLE and/or COVARIANCE estimation step.

Conclusion: Parallelizing NONMEM only works well in specific cases. You can execute the parallelization by specifying a parafile. If your system administrator specified a parametric parafile, you can also choose the number of CPUs to parallelize over using [nodes]=x in nmfe, or with -nodes=xxx in PsN execute.

Let’s now talk about a covariate search. In this case, we want to evaluate 12 models; we can evaluate them concurrently, as there is no dependence between them. PsN works wonderfully here: you can configure the number of parallel runs using the -threads=xxx switch.

1) Estimate the base model

2) Create 12 instances of the base model, adding a single covariate to each instance. Launch all of these instances in parallel.

3) Once these 12 instances have completed, select the most significant covariate [...]

4) Create 11 instances of the model from step #3, adding a single covariate to each instance. Launch all of these instances in parallel.

5) etc.
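
The loop above can be sketched in Python as follows. This is only an illustration of the control flow, not PsN's scm implementation: `fit` is a hypothetical callable that returns an objective function value for a given set of covariates, `min_drop` is an assumed stopping threshold that does not come from this thread, and threads stand in for concurrent NONMEM runs.

```python
from concurrent.futures import ThreadPoolExecutor


def forward_search(candidates, fit, min_drop, threads):
    """Forward covariate selection; the models of one round run concurrently.

    fit(selected) is a hypothetical callable returning an objective function
    value (lower is better) for a model with the covariates in `selected`.
    min_drop is an assumed stopping threshold.
    """
    selected = []
    remaining = list(candidates)
    best_ofv = fit(tuple(selected))  # step 1: the base model
    while remaining:
        # Steps 2 and 4: one candidate model per remaining covariate.
        trials = [tuple(selected + [c]) for c in remaining]
        with ThreadPoolExecutor(max_workers=threads) as pool:
            ofvs = list(pool.map(fit, trials))  # independent runs
        # Step 3: needs all results of the round before it can choose.
        winner = min(range(len(remaining)), key=lambda i: ofvs[i])
        if best_ofv - ofvs[winner] < min_drop:
            break
        best_ofv = ofvs[winner]
        selected.append(remaining.pop(winner))
    return selected, best_ofv
```

With 12 candidates the first round launches 12 independent runs and the second 11, but each round still has to wait for its slowest run before the next one can start.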

As you can see, there is still some dependence: we need all results from step #2 to evaluate step #3. On top of that, parallelizing step #2 means you will have to collect all of those results back over the network to do step #3 (I/O impact). If you do not parallelize, they will already be sitting in main memory.

In practice, we use the following calculation:

· #threads = #max_covariate_steps

· #CPU_available / #max_covariate_steps = #nodes

So for a cluster of 20 CPUs:

· A covariate search with 5 covariates would be launched using: scm myModel.scm -threads=5 -nodes=4

· A covariate search with 20 covariates would be launched using: scm myModel.scm -threads=20 -nodes=1
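
The same rule of thumb as a small Python helper. The function names are hypothetical; rounding down with integer division and never going below one node are assumptions for the cases the rule above does not cover (for example, more covariates than CPUs, where the runs would oversubscribe the cluster).

```python
def psn_parallel_settings(n_cpus, n_covariates):
    """Return (threads, nodes) following the rule of thumb above."""
    if n_cpus < 1 or n_covariates < 1:
        raise ValueError("n_cpus and n_covariates must be at least 1")
    threads = n_covariates                # one concurrent run per covariate
    nodes = max(1, n_cpus // threads)     # assumption: round down, minimum 1
    return threads, nodes


def scm_command(config_file, n_cpus, n_covariates):
    """Build the scm command line in the form used in this message."""
    threads, nodes = psn_parallel_settings(n_cpus, n_covariates)
    return f"scm {config_file} -threads={threads} -nodes={nodes}"


print(scm_command("myModel.scm", 20, 5))   # scm myModel.scm -threads=5 -nodes=4
print(scm_command("myModel.scm", 20, 20))  # scm myModel.scm -threads=20 -nodes=1
```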

Remember that running multiple NONMEM runs concurrently should always be preferred over parallelizing a single NONMEM run.

Finally, let’s talk about bootstrapping:

In this case, there is no dependence between the results. This problem can be perfectly parallelized. Always prefer to run multiple NONMEM runs concurrently, instead of parallelizing a single NONMEM run.

bootstrap myModel -samples=2000 -threads=2000 -nodes=1
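
To show why the bootstrap parallelizes so cleanly, here is a toy Python sketch. It is not PsN's bootstrap code: `estimate` is a hypothetical stand-in for a full NONMEM run, resampling subjects with replacement is assumed, and threads stand in for concurrent runs.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def resample_subjects(subject_ids, seed):
    """Draw one bootstrap data set: subjects sampled with replacement."""
    rng = random.Random(seed)  # one seed per sample keeps runs reproducible
    return [rng.choice(subject_ids) for _ in subject_ids]


def run_bootstrap(subject_ids, estimate, samples, threads):
    """Run `samples` independent estimations, `threads` at a time.

    estimate(ids) is a hypothetical stand-in for one full model run.
    """
    def one_run(seed):
        return estimate(resample_subjects(subject_ids, seed))

    with ThreadPoolExecutor(max_workers=threads) as pool:
        # No run needs the result of any other run, so no waiting between them.
        return list(pool.map(one_run, range(samples)))
```

Because each sample depends only on its own seed, the results only have to be collected once, at the very end.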

Final summary:

For a single NONMEM run, parallelization may or may not work, depending on how complex your model $PRED code is.

For a covariate search, prefer running multiple runs at the same time, rather than parallelizing single runs.

For bootstraps, always run multiple runs at the same time. Never parallelize a single NONMEM run.

Kind regards,


From: owner-nmusers On Behalf Of Pavel Belo
Sent: Tuesday, 8 December 2015 22:54
To: nmusers
Subject: [NMusers] setup of parallel processing and supporting software - help wanted

Hello The Team,

We hear different opinions about the effectiveness of parallel processing with NONMEM, from very helpful to less helpful. It can be task dependent. How useful is it in phase 3 for basic and covariate models, as well as for bootstrap [...]

We have reached a non-exploratory (production) point where popPK is on a critical path and sophisticated but slow home-made utilities may be insufficient. Are there efficient/quick companies/institutions that set up parallel processing, supporting software and, possibly, some other utilities (cloud computing, ...)? A group that used to help us a while ago disappeared somew[...]




Received on Wed Dec 09 2015 - 08:42:51 EST
