Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

report line and column of syntax error when parsing #36

Open
mewmew opened this issue Dec 4, 2019 · 2 comments
Open

report line and column of syntax error when parsing #36

mewmew opened this issue Dec 4, 2019 · 2 comments

Comments

@mewmew
Copy link
Contributor

mewmew commented Dec 4, 2019

To help troubleshoot the cause of a syntax errors when parsing, it would be really helpful if the parser generated by textmapper reported both line and column of syntax errors when parsing.

Currently only the line number is reported, and this creates an additional step for users of the parser to figure out the exact cause of the parse error. One recent such example is llir/llvm#105 (comment), the extract of which is included here for completeness.

By @dannypsnl:

By the way, would you like to use official IR Parser by submodule LLVM-IR-parser and mapping ast only? textmapper didn't provide a reasonable result for parsing error. I didn't know which step it stuck.

All I got was: syntax error at line 1

Here is the test code:

package asm

import (
	"testing"

	"github.com/llir/ll/ast"
)

func TestParse(t *testing.T) {
	testCode := `!7 = !DIExpression(DW_OP_LLVM_convert, 16, DW_ATE_unsigned, DW_OP_LLVM_convert, 32, DW_ATE_signed)`
	_, err := ast.Parse("", testCode)
	if err != nil {
		t.Error(err)
	}
}

To get a better parse error, we ended up breaking the line into multiple lines, resulting in syntax error at line 5, which was more helpful. The same could be achieved with a line:column pair when reporting syntax errors.

ref: llir/llvm#105 (comment)

!7
=
!DIExpression(DW_OP_LLVM_convert,
16,
DW_ATE_unsigned,
DW_OP_LLVM_convert,
32,
DW_ATE_signed
)

Note, this might be me failing to use functionality already included in Textmapper. I simply use the generated parser and don't try to do anything fancy handling errors. So perhaps this information is already available?

Cheers,
Robin

@inspirer
Copy link
Owner

inspirer commented Dec 4, 2019

I used to have column tracking in parsers written in Java but it turned out we needed it so rarely that I dropped the support for it. Instead, I optionally track the offset of the current line (or rather of the token start line) in the lexer:

tokenLineOffset = true // enables this

Besides, columns are very weird as there are multiple interpretations for this number: bytes, unicode runes, or JavaScript offsets (measured in utf-16 code units). All browsers use the last one.

I can embed the line offset into the SyntaxError type if you need it.

@mewmew
Copy link
Contributor Author

mewmew commented Dec 4, 2019

Besides, columns are very weird as there are multiple interpretations for this number: bytes, unicode runes, or JavaScript offsets (measured in utf-16 code units). All browsers use the last one.

This being in Go, I'd probably go with Rune count (where tabs count as a single rune, since the tab width is otherwise subjective).

I can embed the line offset into the SyntaxError type if you need it.

That would be very helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants