Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in double-quoting.js when handling quoted ) and $ characters #58

Open
cetfor opened this issue Aug 9, 2018 · 4 comments
Open
Assignees

Comments

@cetfor
Copy link

cetfor commented Aug 9, 2018

Hey, awesome job on bash-parser. I'm finding this extremely useful in my own work.

I've run into an issue when parsing ) and $ characters in quoted strings. Let me give an example to help explain.

Consider the following line:

command="$(echo "$input" | sed -e "s/^[ \t]*\([^ \t]*\)[ \t]*.*$/\1/g")"

This line is a part of a larger, more complex bash script I'm analyzing. There are actually two (2) issues here.

  1. At column 64, the $ causes an error.
  2. At column 55, the ) causes an error.

Removing the $ leads the parser to believe the ) at column 55 is the closing parentheses rather than a string literal.

Both issues boil down to line 50 in double-quoting.js, where reducers is undefined.

I understand this is a strange case since the issue here exists in parsing a string pattern. I'm honestly not too sure how to go about addressing this yet. If I come up with anything I'll make sure to comment here. Also, if you have any ideas please share them, I'd be happy to work on a fix as soon as I think of the best way to handle these cases.

@parro-it parro-it self-assigned this Aug 10, 2018
@cetfor
Copy link
Author

cetfor commented Aug 21, 2018

Hey @parro-it, I'm currently working on a fix for this issue and hope to get a PR in today. FYI, I've resolved the "build failures" due to test-loc.js which is just a silly double-quoting issue that throws a ton of errors - nothing critical. I hope we can ping back and forth on the PR and get something implemented to fix this. Thanks!

@cetfor
Copy link
Author

cetfor commented Aug 21, 2018

As far as the issue goes that causes Error: TypeError: Cannot read property 'doubleQuoting' of undefined it appears line 38 of expansion-start.js, the method previousReducer can take a third argument of reducers, but it's not set. See expansion-parameters.js line 31 for that definition.

Changing:
return state.previousReducer(state, [char].concat(source));

to:
return state.previousReducer(state, [char].concat(source), reducers);

Resolves the issue Error: TypeError: Cannot read property 'doubleQuoting' of undefined - all unit tests run through npm test continue to pass with this change.

@cetfor
Copy link
Author

cetfor commented Aug 21, 2018

With the above change in place, and using the following script:

const ast = parse('command="$(echo "$input" | sed -e "s/^[ \t]*\([^ \t]*\)[ \t]*.*$/\1/g")"');
console.log(JSON.stringify(ast, null, 2));

I get the following error:

/Users/john/Documents/cetfor/bash-parser/src/index.js:52
                throw new Error(err.stack || err.message);
                ^

Error: TypeError: Cannot read property 'char' of undefined
    at expansion.map.xp (/Users/john/Documents/cetfor/bash-parser/src/utils/tokens.js:80:21)
    at Array.map (<anonymous>)
    at tokenOrEmpty (/Users/john/Documents/cetfor/bash-parser/src/utils/tokens.js:76:45)
    at start (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/tokenizer/reducers/start.js:51:18)
    at tokenizer (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/tokenizer/index.js:193:13)
    at tokenizer.next (<anonymous>)
    at Object.next (/Users/john/Documents/cetfor/bash-parser/node_modules/iterable-lookahead/index.js:51:24)
    at Object.next (/Users/john/Documents/cetfor/bash-parser/node_modules/map-iterable/index.js:33:30)
    at filterIterator (/Users/john/Documents/cetfor/bash-parser/node_modules/filter-iterator/index.js:6:12)
    at filterIterator.next (<anonymous>)
    at parse (/Users/john/Documents/cetfor/bash-parser/src/index.js:52:9)
    at Object.<anonymous> (/Users/john/Documents/cetfor/bash-parser/ex.js:9:13)
    at Module._compile (module.js:652:30)
    at Object.Module._extensions..js (module.js:663:10)
    at Module.load (module.js:565:32)
    at tryModuleLoad (module.js:505:12)
    at Function.Module._load (module.js:497:3)
    at Function.Module.runMain (module.js:693:10)
    at startup (bootstrap_node.js:191:16)
    at bootstrap_node.js:612:3

The issue here exists in tokens.js line 80 in the tokenOrEmpty function. In this case, xp.loc has no end property!

I added the following dirty patch to see if I can continue debugging the issue:

// console.log('aaa', {token: state.loc, xp: xp.loc});
// return Object.assign({}, xp, {loc: {
// 	start: xp.loc.start.char - state.loc.start.char,
// 	end: xp.loc.end.char - state.loc.start.char
// }});
if(xp.loc.hasOwnProperty('end')) {
	return Object.assign({}, xp, {loc: {
		start: xp.loc.start.char - state.loc.start.char,
		end: xp.loc.end.char - state.loc.start.char
	}});
} else {
	return Object.assign({}, xp, {loc: {
		start: xp.loc.start.char - state.loc.start.char
	}});
}

This isn't ideal - just a way to contiue moving towards the issue. When I re-run the example with this "patch" I get the following error:

/Users/john/Documents/cetfor/bash-parser/src/index.js:50
                        throw err;
                        ^

SyntaxError: Unclosed "
    at tk (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/rules/syntaxerror-oncontinue.js:7:10)
    at Object.next (/Users/john/Documents/cetfor/bash-parser/node_modules/map-iterable/index.js:35:18)
    at Object.next (/Users/john/Documents/cetfor/bash-parser/node_modules/map-iterable/index.js:33:30)
    at Object.lex (/Users/john/Documents/cetfor/bash-parser/src/shell-lexer.js:6:31)
    at lex (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/built-grammar.js:317:27)
    at Parser.parse (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/built-grammar.js:330:26)
    at parse (/Users/john/Documents/cetfor/bash-parser/src/index.js:34:22)
    at setCommandExpansion (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/rules/command-expansion.js:17:21)
    at tokensUtils.setExpansions.token.expansion.map.xp (/Users/john/Documents/cetfor/bash-parser/src/modes/posix/rules/command-expansion.js:38:12)
    at Array.map (<anonymous>)

I uncommented the console.log({value, xp}) statement on line 55 of command-expansion.js

// console.log({value, xp})
to see what the command was being seen as, and I get this:

{ loc: { start: 9, end: 50 },
  command: 'echo "$input" | sed -e "s/^[ \t]*([^ \t]*',
  type: 'command_expansion' }
{ loc: { start: 1, end: 6 },
  parameter: 'input',
  type: 'parameter_expansion' }

So for some reason, the command is being truncated where the \) is. I would expect this command to be echo "$input" | sed -e "s/^[ \t]*\([^ \t]*\)[ \t]*.*$/\1/g". Perhaps state.escaping is not being set when this occurs?

I'm starting to think all of this happens within the lexerPhases. Any insight would be greatly appreciated.

@daniel-sc
Copy link

Hi there,
stumbled upon this problem as well (daniel-sc/bash-shell-to-bat-converter#13) any updates/progress here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants