Regarding point 2, keep in mind that PARSE_TYPE=2 or 4 (algorithms you helped with) perform empirical load balancing, improving the assessment with each iteration, so the idle time spent waiting for all processes to finish is reduced.
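For context, PARSE_TYPE is an option in the parallelization profile (.pnm file) passed to nmfe. A minimal sketch, assuming the NONMEM 7 parafile layout (record name and option values should be checked against the sample .pnm files shipped with the installation):

```
; sketch of a parallelization profile requesting dynamic load balancing
; NODES and timeout values here are illustrative placeholders
$GENERAL NODES=8 PARSE_TYPE=4 PARSE_NUM=200 TIMEOUTI=600 TIMEOUT=10000 PARAPRINT=0
```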
Robert J. Bauer, Ph.D.
Vice President, Pharmacometrics R&D
ICON Early Phase
Office: (215) 616-6428
Mobile: (925) 286-0769
From: owner-nmusers_at_globomaxnm.com [mailto:owner-nmusers_at_globomaxnm.com] On Behalf Of Mark Sale
Sent: Tuesday, December 08, 2015 3:00 PM
To: Pavel Belo; nmusers_at_globomaxnm.com
Subject: Re: [NMusers] setup of parallel processing and supporting software - help wanted
The loss of efficiency with parallel computing in NONMEM has two sources:
1. I/O time. Each process has to do its calculation, then write the results to a disk file. (On a single machine, even with the MPI method the results are written to a file; that file may or may not be physically written to disk by the operating system, depending on the file size and whether the OS decides the file may be used again soon. The same is true of the FPI method, where the OS may decide to buffer the file rather than actually write it to disk.) This inefficiency grows with the number of processes, and grows substantially when you go to multiple machines, as they must send data over the network (and must actually write the data to disk, with either the MPI or FPI method). You can actually run parallel NONMEM over a VPN, but as you might imagine, this slows it down substantially.
2. Inefficiency due to one process finishing its slice of the data before the others. The manager process must wait until the last worker is finished before it can do the management step (sum the OBJ, calculate the gradient, get the next parameter values, send them out to the processes). This also grows with more processes. In a well-conditioned problem, where every individual takes roughly the same amount of time to calculate the OBJ, this isn't too bad. But occasionally, with stiff ODEs, you'll find a small number of individuals who take much, much longer to solve the ODEs, and you'll find that efficiency drops substantially.
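The idle-time cost described in point 2 can be sketched numerically. Below is a plain-Python toy model (the timings are hypothetical illustrations, not NONMEM measurements) showing how a single stiff-ODE subject caps the speedup of the whole iteration:

```python
# Sketch: idle time when one slow subject dominates a parallel slice.
# Subject evaluation times are hypothetical, not measured NONMEM values.

def wall_time(subject_times, n_workers):
    """Deal subjects round-robin across workers; the wall time per
    iteration is set by the slowest worker (the straggler)."""
    loads = [0.0] * n_workers
    for i, t in enumerate(subject_times):
        loads[i % n_workers] += t
    return max(loads)

# 100 subjects at 0.1 s each, except one stiff-ODE subject at 5 s
times = [0.1] * 99 + [5.0]

serial = sum(times)            # 14.9 s total work
par8 = wall_time(times, 8)     # 6.2 s: limited by the worker holding the 5 s subject
print(serial, par8, serial / par8)   # speedup ~2.4x on 8 processes, not 8x
```

All the other workers sit idle while the straggler finishes, which is exactly why the empirical load balancing in PARSE_TYPE=2/4 helps.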
Together these make up Amdahl's law:
https://en.wikipedia.org/wiki/Amdahl%27s_law
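For reference, Amdahl's law is a one-liner. A minimal sketch in plain Python, computing the theoretical speedup when a fraction s of the work (manager step, I/O) is inherently serial:

```python
# Amdahl's law: theoretical speedup with n processes when a
# fraction s of the run is serial and cannot be parallelized.
def amdahl_speedup(s, n):
    return 1.0 / (s + (1.0 - s) / n)

# Even a 5% serial fraction caps the gain well below the process count:
print(amdahl_speedup(0.05, 8))     # ~5.93x on 8 processes
print(amdahl_speedup(0.05, 1000))  # ~19.6x, approaching the 1/0.05 = 20x ceiling
```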
All that said, here are my recommendations:
Don't bother trying to parallelize a run that takes less than 10 minutes; the I/O time will cancel out any gain in execution time.
If the execution time for a single function evaluation (note that a run is often between 1000 and 5000 function evaluations) is less than 0.5 seconds, you probably cannot improve performance with parallel execution. Note that 1000 function evaluations at 0.5 seconds each = 500 seconds, about 8 minutes.
Assuming a 1 Gbit network, if the execution time for a single function evaluation is > 1 second, you probably can improve performance with parallel execution.
I have personally never found a problem that benefited from more than 24 processes, but in theory some very large problems (run time of weeks) may.
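These rules of thumb can be turned into a rough break-even check. A sketch in plain Python (the overhead figure is a hypothetical placeholder, not a measured NONMEM value): the compute time per function evaluation shrinks with the process count, but the fixed communication cost does not.

```python
# Rough break-even model for parallel runs.
# overhead_s: fixed per-function-evaluation cost of gathering worker
# results (file I/O or network); treat it as a tunable guess.

def run_time_s(eval_s, n_evals, n_proc, overhead_s=0.05):
    per_eval = eval_s / n_proc + overhead_s  # compute shrinks, overhead doesn't
    return n_evals * per_eval

serial = run_time_s(0.4, 1000, 1, overhead_s=0.0)  # 400 s: under the ~10 min cutoff
par8 = run_time_s(0.4, 1000, 8)                    # 100 s: gain mostly eaten by overhead
slow = run_time_s(2.0, 2000, 8)                    # 600 s vs 4000 s serial: clear win
print(serial, par8, slow)
```

Plugging in your own per-evaluation time and a measured overhead gives a quick sanity check before committing a run to the cluster.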
Here is a link to a nice paper from the Gibianskys and Bob Bauer with more recent benchmarks than our early work:
Comparison of Nonmem 7.2 estimation methods and parallel ... J Pharmacokinet Pharmacodyn. 2012 Feb;39(1):17-35. doi: 10.1007/s10928-011-9228-y. Epub 2011 Nov 19.
Mark Sale M.D.
Vice President, Modeling and Simulation
Nuventra, Inc. (tm)
2525 Meridian Parkway, Suite 280
Research Triangle Park, NC 27713
Empower your Pipeline
From: owner-nmusers_at_globomaxnm.com <owner-nmusers_at_globomaxnm.com> on behalf of Pavel Belo <nonmem_at_optonline.net>
Sent: Tuesday, December 8, 2015 4:54 PM
Subject: [NMusers] setup of parallel processing and supporting software - help wanted
Hello The Team,
We hear different opinions about the effectiveness of parallel processing with NONMEM, from very helpful to less helpful. It can be task dependent. How useful is it in phase 3 for basic and covariate models, as well as for bootstrap?
We have reached a non-exploratory (production) point where popPK is on a critical path and sophisticated but slow home-made utilities may be insufficient. Are there efficient/quick companies/institutions which set up parallel processing, supporting software and, possibly, some other utilities (cloud computing, ...)? A group which used to help us a while ago disappeared somewhere.
Received on Tue Dec 08 2015 - 18:27:22 EST