Welcome to Vigges Developer Community - Open, Learning, Share
Welcome To Ask or Share your Answers For Others

antlr4 - how to write a custom error reporter in go target of antlr

I am trying to migrate an antlr project from c++ to go. The grammar and code generation is mostly done (based on the solution provided in 65038949), but one pending item is to write the custom error reporter in go.

I am looking for a custom error reporter for these purposes:

  1. I would like to print my custom message, maybe with extra information (eg the file name, which is not printed by the default error printer).

  2. On every error, the error reporter updates a global counter, and in the main program if this error_count>0 then further processing is skipped.

So here is what was done in the c++ project:

  1. A custom message was defined in this function:

    string MyErrorMessage(unsigned int l, unsigned int p, string m) {
        stringstream s;
        s << "ERR: line " << l << "::" << p << " " << m;
        global->errors++;
        return s.str();
    }
    
  2. And the antlr runtime (ConsoleErrorListener.cpp) had been updated to call the above function:

    void ConsoleErrorListener::syntaxError(IRecognizer *, Token * ,
      size_t line, size_t charPositionInLine, const std::string &msg, std::exception_ptr)  {
      std::cerr << MyErrorMessage(line, charPositionInLine, msg) << std::endl;
    }
    
  3. Finally, the main program would skip the further processing like this:

    parser.top_rule();
    if(global->errors > 0) {
        exit(0);
    }
    

How can these pieces of c++ code be re-written for the go target of antlr?

Some additional notes, after browsing the antlr runtime code (from github.com/antlr/antlr4/runtime/Go/antlr):

  • parser.go has a variable "_SyntaxErrors", which gets incremented on every error, but nobody seems to be using it. What is the purpose of this variable, and how do I use it after parsing, to check if any errors have occurred? I did the following, but obviously that did not work! (A workaround is to add a new variable MyErrorCount in the parser, and increment it whenever _SyntaxErrors also gets incremented, but that does not look like an elegant solution, because here I am editing the runtime code!)

    tree := parser.Top_rule() // this is ok
    fmt.Printf("errors=%d\n", parser._SyntaxErrors) // this gives a compiler error
    //fmt.Printf("errors=%d\n", parser.MyErrorCount) // this is ok
    
  • In the above note, I had introduced a new variable in antlr code, and reading it in the user code - bad coding style, but works. But I also need to do the reverse - the antlr error reporter (error_listener.go:SyntaxError()) needs to read the user code's variable having the file name and print it. I can do it by adding a new function in antlr to pass this argument and register this filename in a new variable, but is there a better way to do this?

question from: https://stackoverflow.com/questions/66067549/how-to-write-a-custom-error-reporter-in-go-target-of-antlr


1 Answer


Antlr is awesome. One caveat, however, is that its error handling is not idiomatic Go, which makes the whole error process unintuitive to a Go engineer.

In order to inject your own error handling at each step (lexing, parsing, walking), you have to inject error listeners/handlers with panics. Panic & recovery is very much like a Java exception, and I think that is why it is designed this way (Antlr is written in Java).

Lex/Parse Error Collecting (easy to do)

You can implement as many ErrorListeners as you'd like. The default one is antlr.ConsoleErrorListenerINSTANCE. All it does is print to stderr whenever SyntaxError is called, so we remove it. The first step to custom error reporting is to replace it. I made a basic one that just collects the errors in a custom type that I can use or report later.

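type CustomSyntaxError struct {
    line, column int
    msg          string
}

// Error implements the error interface (needs "fmt"), so that
// *CustomSyntaxError values can be stored in the []error slice below.
func (c *CustomSyntaxError) Error() string {
    return fmt.Sprintf("line %d:%d %s", c.line, c.column, c.msg)
}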

type CustomErrorListener struct {
    *antlr.DefaultErrorListener // Embed default which ensures we fit the interface
    Errors []error
}

func (c *CustomErrorListener) SyntaxError(recognizer antlr.Recognizer, offendingSymbol interface{}, line, column int, msg string, e antlr.RecognitionException) {
    c.Errors = append(c.Errors, &CustomSyntaxError{
        line:   line,
        column: column,
        msg:    msg,
    })
}

You can inject the error listener (while clearing that default one) on the parser/lexer.

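// is: the input antlr.CharStream (for example from antlr.NewFileStream)
lexerErrors := &CustomErrorListener{}
lexer := NewMyLexer(is)
lexer.RemoveErrorListeners()
lexer.AddErrorListener(lexerErrors)

parserErrors := &CustomErrorListener{}
stream := antlr.NewCommonTokenStream(lexer, antlr.TokenDefaultChannel)
parser := NewMyParser(stream)
parser.RemoveErrorListeners()
parser.AddErrorListener(parserErrors)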

When lexing/parsing finishes, both listeners will hold the syntax errors found during that stage. You can play around with the fields given in SyntaxError. You'll have to look elsewhere for the other interface functions like ReportAmbiguity.

    if len(lexerErrors.Errors) > 0 {
        fmt.Printf("Lexer %d errors found\n", len(lexerErrors.Errors))
        for _, e := range lexerErrors.Errors {
            fmt.Println("", e.Error())
        }
    }

    if len(parserErrors.Errors) > 0 {
        fmt.Printf("Parser %d errors found\n", len(parserErrors.Errors))
        for _, e := range parserErrors.Errors {
            fmt.Println("", e.Error())
        }
    }
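
To cover the two needs from the question (a file name in the message, and an error count the main program can check), the collected errors can carry the file name themselves. The following is only a minimal sketch using the standard library: SyntaxIssue, FileErrorCollector, Report and Count are hypothetical names, not part of the antlr runtime, and the message format just mimics the C++ one from the question. The assumption is that a listener like CustomErrorListener holds such a collector and calls Report from inside SyntaxError.

```go
package main

import (
	"fmt"
	"os"
)

// SyntaxIssue is a hypothetical record of one reported syntax error.
type SyntaxIssue struct {
	File         string
	Line, Column int
	Msg          string
}

// Error makes *SyntaxIssue satisfy the error interface.
func (s *SyntaxIssue) Error() string {
	return fmt.Sprintf("ERR: %s: line %d::%d %s", s.File, s.Line, s.Column, s.Msg)
}

// FileErrorCollector (hypothetical) remembers the file being parsed
// and every error reported for it.
type FileErrorCollector struct {
	File   string
	Errors []error
}

// Report records one error. The assumption is that it is called from
// the listener's SyntaxError method with the line, column and message
// that antlr passes in.
func (c *FileErrorCollector) Report(line, column int, msg string) {
	c.Errors = append(c.Errors, &SyntaxIssue{
		File: c.File, Line: line, Column: column, Msg: msg,
	})
}

// Count replaces the global error counter from the C++ version.
func (c *FileErrorCollector) Count() int { return len(c.Errors) }

func main() {
	c := &FileErrorCollector{File: "input.txt"}
	c.Report(3, 7, "mismatched input ';'") // simulated parser error

	for _, e := range c.Errors {
		fmt.Fprintln(os.Stderr, e)
	}
	if c.Count() > 0 {
		// The main program would skip further processing here.
		fmt.Println("skipping further processing, errors:", c.Count())
	}
}
```

Because the count lives on the collector rather than in the runtime, nothing in the antlr source needs to be edited, and the main program decides what to do once parsing returns.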

Lex/Parse Error Aborting (unsure how solid this is)

WARNING: This really feels jank. If just error collecting is needed, just do what was shown above!

To abort a lex/parse midway, you have to throw a panic in the error listener. I don't get this design to be honest, but the lexing/parsing code is wrapped in panic recovers that check whether the panic value is a RecognitionException. That exception is passed as an argument to your ErrorListener, so modify the SyntaxError method to panic with it:

func (c *CustomErrorListener) SyntaxError(recognizer antlr.Recognizer, offendingSymbol interface{}, line, column int, msg string, e antlr.RecognitionException) {
  // ...
  panic(e) // Feel free to only panic on certain conditions. This stops parsing/lexing
}
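
The panic-and-recover pattern described above can be sketched in plain Go. This is a rough illustration only, not the antlr runtime's code: recognitionError is a hypothetical stand-in for antlr.RecognitionException, and parseWithAbort is a hypothetical wrapper of the kind you might put around the top-level rule call.

```go
package main

import "fmt"

// recognitionError is a hypothetical stand-in for antlr.RecognitionException.
type recognitionError struct{ msg string }

func (r *recognitionError) Error() string { return r.msg }

// parseWithAbort runs step and turns a *recognitionError panic into an
// ordinary returned error. Any other panic value is re-raised untouched.
func parseWithAbort(step func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			re, ok := r.(*recognitionError)
			if !ok {
				panic(r) // not a parse abort, so let it propagate
			}
			err = re
		}
	}()
	step()
	return nil
}

func main() {
	// Simulates an error listener that panics on the first syntax error.
	err := parseWithAbort(func() {
		panic(&recognitionError{msg: "line 1:4 unexpected token"})
	})
	fmt.Println("aborted:", err)

	// A step that does not panic yields a nil error.
	fmt.Println("clean run error:", parseWithAbort(func() {}))
}
```

Checking the type of the recovered value matters: it keeps genuine bugs, such as a nil dereference, from being silently swallowed as parse errors.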

This panic is caught and passed to the ErrorHandler, which implements ErrorStrategy. The function we care about is Recover(), which attempts to recover from the error by consuming the token stream until the expected pattern/token is found. Since we want to abort, we can take inspiration from BailErrorStrategy. That strategy still sucks, as it uses panics to stop all work. Instead, you can simply override the relevant methods with empty bodies.

type BetterBailErrorStrategy struct {
    *antlr.DefaultErrorStrategy
}

var _ antlr.ErrorStrategy = &BetterBailErrorStrategy{}

func NewBetterBailErrorStrategy() *BetterBailErrorStrategy {

    b := new(BetterBailErrorStrategy)

    b.DefaultErrorStrategy = antlr.NewDefaultErrorStrategy()

    return b
}

func (b *BetterBailErrorStrategy) ReportError(recognizer antlr.Parser, e antlr.RecognitionException) {
    // pass, do nothing
}


func (b *BetterBailErrorStrategy) Recover(recognizer antlr.Parser, e antlr.RecognitionException) {
    // pass, do nothing
}

// Make sure we don't attempt to recover from problems in subrules.
func (b *BetterBailErrorStrategy) Sync(recognizer antlr.Parser) {
    // pass, do nothing
}

Then set it on the parser:

parser.SetErrorHandler(NewBetterBailErrorStrategy())

That being said, I'd advise just collecting the errors with the listeners and not bothering to abort early. The BailErrorStrategy doesn't really seem to work all that well, and using panics to recover feels so clunky in Go that it's easy to mess up.

