NONMEM Users Network Archive

Hosted by Cognigen

Re: [NMusers] Context-free lexer for NM-TRAN

From: Devin Pastoor <devin.pastoor_at_gmail.com>
Date: Thu, 14 Jun 2018 11:39:18 -0400

Hi all,

A couple of thoughts on this. First, I would suggest constructing a parse tree
rather than an abstract syntax tree, as it will likely be important to retain
additional metadata such as comments. I presume such a tool would provide a
foundation for automated tasks such as refactoring, reformatting, and updating
parameter values, so it would need to retain the state of the initial
document.
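
To make that concrete, here is a minimal Python sketch. It is not taken from
any existing tool; the token kinds and the helper names (tokenize, render,
set_first_theta) are made up for illustration. It keeps comments and
whitespace as tokens, so the original text can be reproduced exactly and a
value can be changed without disturbing anything else. It deliberately
ignores the context problem discussed in this thread (a "$PK" inside a
$PROBLEM title would be mislabelled as a record).

```python
import re

# Hypothetical token kinds, for illustration only (not from the NM-TRAN docs).
# Every character matches exactly one alternative, so nothing is ever dropped.
TOKEN_RE = re.compile(
    r"(?P<COMMENT>;[^\n]*)"
    r"|(?P<NEWLINE>\n)"
    r"|(?P<SPACE>[ \t]+)"
    r"|(?P<RECORD>\$[A-Za-z]+)"
    r"|(?P<NUMBER>\d+(?:\.\d*)?(?:[eE][+-]?\d+)?)"
    r"|(?P<WORD>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<OTHER>.)"
)


def tokenize(text):
    """Split text into (kind, value) pairs, keeping comments and whitespace."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def render(tokens):
    """Concatenate token values back into source text."""
    return "".join(value for _, value in tokens)


def set_first_theta(tokens, new_value):
    """Return a copy of tokens with the first NUMBER after $THETA replaced."""
    out = list(tokens)
    in_theta = False
    for i, (kind, value) in enumerate(out):
        if kind == "RECORD":
            in_theta = value.upper().startswith("$THETA")
        elif in_theta and kind == "NUMBER":
            out[i] = (kind, new_value)
            break
    return out


SOURCE = "$THETA 1.5 ; CL (L/h)\n$OMEGA 0.1 ; IIV on CL\n"
UPDATED = render(set_first_theta(tokenize(SOURCE), "2.0"))
```

Because render(tokenize(SOURCE)) gives back SOURCE unchanged, the update only
touches the one value and the "; CL (L/h)" comment survives. An AST that
throws comments away could not do that.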

I don't know whether ANTLR is the right approach, as:

1) you'd also need to parse FORTRAN snippets that can be embedded in a
control stream, would you not?
2) you'd have to write bindings for the ANTLR targets (e.g. to get into R, I
guess you would need to go through the C++ ANTLR target)
3) does NONMEM even have a formal specification to compare a grammar against?

Rather than focusing on formally defining the language to be compatible
with ANTLR, we could treat it as a DSL and construct a simpler lexer
"by hand", which allows flexibility in handling some of the
particularities that the DSL imposes (mix of Fortran dialects, etc.).
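
As a rough idea of what "by hand" could look like, here is a small Python
sketch. Everything in it is an assumption made for illustration: the set of
"code" records, the rule that a record only starts at the beginning of a
line, treating the rest of a $PROBLEM line as free text, and the token names.
The lexer tracks which record it is in and picks its rules accordingly, so
'.EQ.' is an operator inside $PK but 'foo.eq.bar' on $DATA stays one word.
Whitespace is dropped for brevity.

```python
import re

# Assumed set of records that contain abbreviated (Fortran-like) code.
CODE_RECORDS = {"$PK", "$PRED", "$ERROR", "$DES"}

RECORD_START = re.compile(r"\$[A-Za-z]+")

# Token rules used only inside code records.
CODE_TOKEN = re.compile(
    r"(?P<OP>\.(?:EQ|NE|LT|LE|GT|GE|AND|OR)\.)"
    r"|(?P<NUMBER>\d+(?:\.\d+)?)"
    r"|(?P<NAME>[A-Za-z_]\w*)"
    r"|(?P<PUNCT>[-+*/=(),])",
    re.IGNORECASE,
)


def lex(text):
    """Return (kind, value) tokens, choosing rules from the current record."""
    tokens = []
    record = None  # name of the record we are currently inside
    for line in text.splitlines():
        body = line.strip()
        m = RECORD_START.match(body)
        if m:
            # Assumption: a record only starts at the beginning of a line.
            record = m.group().upper()
            tokens.append(("RECORD", record))
            body = body[m.end():]
            if record.startswith("$PROB"):
                # Free text: a '$PK' inside the title is not a record.
                if body.strip():
                    tokens.append(("TEXT", body.strip()))
                continue
        body, sep, comment = body.partition(";")
        if record in CODE_RECORDS:
            tokens += [(t.lastgroup, t.group()) for t in CODE_TOKEN.finditer(body)]
        else:
            # Outside code records, keep whitespace-delimited words whole.
            tokens += [("WORD", w) for w in body.split()]
        if sep:
            tokens.append(("COMMENT", comment.strip()))
    return tokens


SOURCE = """\
$PROBLEM This is a problem with special $PK section
$DATA foo.eq.bar
$PK ;Refer to $ERROR for more information
CL=THETA(1)
IF (SEX.EQ.1) CL=THETA(2)
$ERROR
Y = W*F
"""
TOKENS = lex(SOURCE)
```

On Ruben's example (plus a made-up $DATA line and IF statement) this finds
four records, keeps the "$PK" in the title and the "$ERROR" in the comment as
plain text, and only reports '.EQ.' as an operator inside $PK. A real
implementation would need many more record-specific rules, which is exactly
where the flexibility of a hand-written lexer helps.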

Perhaps taking a step back and figuring out what the most common uses of
such a tool would be would help drive the discussion about the most
reasonable implementation objective. For example: an auto-checking CLI tool
for grammar errors, a tool for updating parameters, or a tool that
integrates with developer tools such as RStudio or VS Code.

I would also be happy to continue discussing on GitHub or elsewhere.

Devin

On Thu, Jun 14, 2018 at 10:59 AM Ruben Faelens <ruben.faelens_at_gmail.com>
wrote:

> Hi Bill,
>
> Nice to see you're interested. I have something basic working in XText,
> that can already create an AST and editor for an example nonmem file.
> However, it suffers from the context-free aspect of the lexer, and
> therefore errors out in some cases...
>
> See http://github.com/rfaelens/nmparser/demo/
> Feel free to git pull and continue to implement the full language.
> Although note that the lexing is currently the weak part...
>
> / Ruben
>
> On Thu, Jun 14, 2018 at 3:18 PM Bill Denney <wdenney_at_humanpredictions.com>
> wrote:
>
>> Hi Ruben,
>>
>>
>>
>> I’m also interested in a lexer-parser for NONMEM. The regexp-based ones
>> that I’ve used have typically had issues (I’ve tried about 4 different ones
>> including one that I wrote), and they are working for many but not all
>> models. I’m unaware of a reasonably complete lexer-parser for NONMEM
>> (though I know of at least one non-public effort; I’ve contacted that
>> author to see if he is interested in joining this conversation).
>>
>>
>>
>> I’ve wanted to build the abstract syntax tree for NONMEM to help with
>> computational model-building, and I’ve been looking into ANTLR as well.
>> Three questions: Are you interested in collaborating on the parser (can
>> you create a GitHub project for it)? Why ANTLRv3 instead of v4? Do you
>> have a way to get an ANTLR parse tree into R?
>>
>>
>>
>> Thanks
>>
>>
>>
>> Bill
>>
>>
>>
>> *From:* owner-nmusers_at_globomaxnm.com <owner-nmusers_at_globomaxnm.com> *On
>> Behalf Of *Ruben Faelens
>>
>> *Sent:* Thursday, June 14, 2018 8:55 AM
>> *To:* Tim Bergsma <Tim.Bergsma_at_certara.com>
>> *Cc:* nmusers_at_globomaxnm.com
>> *Subject:* Re: [NMusers] Context-free lexer for NM-TRAN
>>
>>
>>
>> Hi Tim,
>>
>>
>>
>> Thanks for pointing to that.
>>
>> Unfortunately, nonmemica uses regular expressions to simply split the
>> character stream into subsections.
>>
>> This is not the way to go. As an example, nonmemica would get confused by
>> the following input:
>>
>> $PROBLEM This is a problem with special $PK section
>>
>> $PK ;Refer to $ERROR for more information
>>
>> CL=THETA(1)
>>
>> $ERROR
>>
>> Y = W*F
>>
>>
>>
>> Probably a contextual lexer is the way to go; fortunately ANTLRv3 has
>> functionality for this.
>>
>>
>>
>> Kind regards,
>>
>> Ruben
>>
>>
>>
>> On Thu, Jun 14, 2018 at 12:42 PM Tim Bergsma <Tim.Bergsma_at_certara.com>
>> wrote:
>>
>>
>>
>> Hi Ruben.
>>
>>
>>
>> Related: the CRAN package “nonmemica” has a function as.model() that
>> parses NONMEM control streams. Type “?nonmemica” at the R prompt after
>> loading. See also https://github.com/MikeKSmith/rspeaksnonmem . Happy
>> to discuss further.
>>
>>
>>
>> Kind regards,
>>
>>
>>
>> Tim
>>
>>
>>
>> *Tim Bergsma, PhD*
>>
>> Associate Director
>>
>> Certara Strategic Consulting
>>
>> m. 860.930.9931
>>
>> tim.bergsma_at_certara.com
>>
>>
>>
>> *From:* owner-nmusers_at_globomaxnm.com <owner-nmusers_at_globomaxnm.com> *On
>> Behalf Of *Ruben Faelens
>> *Sent:* Thursday, June 14, 2018 4:33 AM
>> *To:* nmusers_at_globomaxnm.com
>> *Subject:* [NMusers] Context-free lexer for NM-TRAN
>>
>>
>>
>> Hi all,
>>
>>
>>
>> Calling all computer scientists and computer language experts.
>>
>> In my spare time, I am working on a lexer and parser for NM-Tran.
>> Primarily to teach myself about grammars and DSLs, but perhaps something
>> useful will come out of this (e.g. a context-sensitive editor with code
>> completion).
>>
>>
>>
>> When lexing, I am having a hard time describing the keywords used by
>> nm-tran.
>>
>> Let us take '.EQ.' as an example.
>>
>> 1) It seems that *.EQ. *is a keyword used to describe a comparison.
>>
>> 2) However, a filename could also be 'foo.eq.bar'
>>
>> The same thing applies for keywords on the '$ESTIMATION' record. These
>> keywords could also be used as variable names.
>>
>>
>>
>> Am I right in saying that NM-TRAN cannot be tokenized with a context-free
>> lexer? And that I should focus my efforts on building a lexer-less parser?
>> (Or building my own lexer-parser, see
>> https://en.wikipedia.org/wiki/The_lexer_hack )
>>
>> I assume building a parser for NM-TRAN was already done in the DDMoRe
>> project, but I failed to find the source code...
>>
>>
>>
>> Kind regards,
>>
>> Ruben Faelens
>>
>>
>>
>>
>>




Received on Thu Jun 14 2018 - 11:39:18 EDT

The NONMEM Users Network is maintained by ICON plc. Requests to subscribe to the network should be sent to: nmusers-request_at_iconplc.com. Once subscribed, you may contribute to the discussion by emailing: nmusers@globomaxnm.com.