Unable to apply patch using contextual information when line numbers don't match #47

Open
JonathanHolvey opened this issue Nov 5, 2016 · 2 comments


@JonathanHolvey

I have a unified patch file which can be applied successfully using the GNU patch utility. The line numbers in the source files do not match the patch, but there is enough context around each hunk (three lines, the default) for the process to complete anyway.

When I try the same patch file with python-patch, the apply() method fails and the debug log shows:

INFO  hunk no.1 doesn't match source file at line 275

When looking through the source for apply(), I can't see any attempt to make corrections for non-matching line numbers.
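To illustrate the kind of correction I mean, here is a minimal sketch of locating a hunk by its content instead of trusting the line number. It is not code from python-patch or GNU patch; the function name `locate_hunk` and its parameters are made up for this example, and it assumes the hunk's "old" lines (context plus removed lines) are already available as a list of strings.

```python
def locate_hunk(source, old_lines, expected):
    """Return the 0-based index where old_lines occurs in source,
    trying positions closest to `expected` first, or None if absent.

    Hypothetical helper for illustration only; `expected` is assumed
    to be a non-negative 0-based line index taken from the hunk header.
    """
    n = len(old_lines)
    last = len(source) - n  # last index where the hunk could still fit
    if last < 0:
        return None
    # Widen the search one line at a time, looking above and below.
    for distance in range(max(expected, len(source)) + 1):
        for start in (expected - distance, expected + distance):
            if 0 <= start <= last and source[start:start + n] == old_lines:
                return start
    return None


source = ["a", "b", "c", "d", "e", "f"]
# The hunk header claims line index 1, but the text actually sits at index 3.
print(locate_hunk(source, ["d", "e"], 1))
```

The difference between the returned index and the expected one is the offset that would then be applied when writing the patched file.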

Is this useful feature of the unified diff format beyond the scope of the project, something that hasn't been implemented yet, or am I just using it wrong?

@JonathanHolvey
Author

@techtonik

I've spent a bit of time modifying your code so that files can still be patched, even if they don't match the original exactly. The main changes are as follows:

  1. Hunks can appear in the source file in any order, and any position. Matching is done in a single pass in _match_file_hunks().
  2. If a hunk can be matched with the source file in multiple locations, then the one with the smallest offset (closest to the original line numbers) is used.
  3. The file is only patched if a combination of offsets can be found (if hunks are matched in multiple locations) where no line is being modified by more than one hunk. The context lines at the beginning and end of the hunk are ignored during this check and final writing, in case one hunk modifies the context required by another. All context lines are still used for hunk matching.
  4. The Hunk object has additional properties offset, contextstart and contextend to allow this mechanism to work.
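Roughly, points 2 and 3 can be sketched as below. This is a simplified illustration and not the code in my branch: `SketchHunk`, `candidate_starts`, `changed_range` and `choose_starts` are hypothetical names, the brute-force search over combinations is only there to keep the example short, and "smallest offset" is interpreted here as the smallest total absolute offset across all hunks, which is an assumption.

```python
from collections import namedtuple
from itertools import product

# Hypothetical stand-in for a parsed hunk; not python-patch's Hunk class.
# expected: 0-based line index from the hunk header
# old_lines: context + removed lines, as they should appear in the source
# leading / trailing: number of context lines at the start / end of the hunk
SketchHunk = namedtuple("SketchHunk", "expected old_lines leading trailing")


def candidate_starts(source, hunk):
    """Every index where the hunk's old lines match, context included."""
    n = len(hunk.old_lines)
    return [i for i in range(len(source) - n + 1)
            if source[i:i + n] == hunk.old_lines]


def changed_range(hunk, start):
    """Half-open line range the hunk touches, ignoring outer context."""
    return (start + hunk.leading, start + len(hunk.old_lines) - hunk.trailing)


def choose_starts(source, hunks):
    """Pick one start per hunk so that no line is touched twice,
    preferring the smallest total distance from the expected lines.
    Returns a tuple of starts, or None if no valid combination exists."""
    options = [candidate_starts(source, h) for h in hunks]
    best, best_cost = None, None
    for combo in product(*options):
        ranges = sorted(changed_range(h, s) for h, s in zip(hunks, combo))
        if any(prev[1] > cur[0] for prev, cur in zip(ranges, ranges[1:])):
            continue  # two hunks would modify the same line
        cost = sum(abs(s - h.expected) for h, s in zip(hunks, combo))
        if best is None or cost < best_cost:
            best, best_cost = combo, cost
    return best


source = ["a", "b", "c", "x", "a", "b", "c", "y"]
hunk = SketchHunk(expected=0, old_lines=["a", "b", "c"], leading=1, trailing=1)
# Both hunks match at index 0 and index 4; they cannot share a position.
print(choose_starts(source, [hunk, hunk]))
```

Trying every combination grows quickly with the number of ambiguous hunks, so a real implementation would want something smarter; the sketch only shows the acceptance rule.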

So far I've only tested with a couple of files I'm particularly interested in, but my next step will be to validate the changes using your unit test script. Running it now results in 9 failures out of 44, compared with 7 for your master branch. I also need to check for Python 2 compatibility and look at speed optimisation.

You may find these changes too drastic to merge into your project, but please let me know if you are interested in including them. I don't imagine it will take me too long to get the code into a state where it could be released.

Take a look at my branch here.

Cheers,
Jon

@JonathanHolvey
Author

JonathanHolvey commented Dec 18, 2016

After fixing a few errors in the unit tests, your master branch passes without issue. My modifications result in 4 errors in the same tests, which I think should be straightforward to correct.
