Convert ticket v3 HTML to JSON tickets #22
I have noticed that weighted items (at least in cs/cz) do not work currently. @diogotcorreia This should be the case for your implementation at https://github.com/diogotcorreia/lidl-to-grocy/blob/master/lidl/src/html_receipt.rs as well. I'm seeing the following entries in the HTML for a weighted item (apples):
There's no easy way to parse this AFAICS. One way would be to store the first non- |
8419bd7 to 18af73a
I implemented the method for the weighted items that I mentioned in my last comment. I believe this PR is ready for a review.
@Kuba314 Does your receipt not have May I ask, what is the value of |
@diogotcorreia It does, just not for weighted items. These 3 lines are the only information that I have in the receipt. I'm not seeing what you're seeing. I only see
Yeah... this is very possible. Maybe detecting
Do you mean |
@Kuba314 That's unfortunate, I'm not sure how you would fix it then, since you don't have an article number either :/
18af73a to 3f81806
I have changed the VAT line detection from what is essentially |
FYI: the Hungarian Lidl Plus API broke as well a couple of days ago (some days after 09.21), but this PR solves the issue for me. Thanks @Kuba314
So no barcode, no match with openfoodfacts?
@salvadorbs unfortunately yeah, there's no way to get the barcode now :/
I was able to get receipts that didn't have discounts. I just started using this so can't tell for sure if everything else is working fine. Will continue testing. So far everything else looks good! Thanks!
        }
    )
elif node.attrib["class"] == "discount":
    discount = abs(parse_float(node.text.split()[-1]))
This throws an IndexError when I run it, because some of the span elements contain only whitespace, so node.text.split() returns an empty list.
Here's the HTML on my receipt:
<span id="purchase_list_line_3" class="discount css_bold" data-promotion-id="promo_id"> Coupon Plus reward</span>
<span id="purchase_list_line_3" class="discount" data-promotion-id="promo_id"> </span>
<span id="purchase_list_line_3" class="discount" data-promotion-id="promo_id"> </span>
<span id="purchase_list_line_3" class="discount css_bold" data-promotion-id="promo_id">-0.69</span>
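The failure can be reproduced outside the parser. This is only a minimal sketch: it assumes the four spans are wrapped in a `<div>` root so they parse as one document, and it uses the standard library's `xml.etree.ElementTree`, which may not be the parser the PR uses. `last_token` is a hypothetical helper, not part of the project.

```python
import xml.etree.ElementTree as ET

# The four spans from the receipt, wrapped in a <div> root (an assumption
# made here only so the fragment parses as a single document).
HTML = """<div>
<span id="purchase_list_line_3" class="discount css_bold" data-promotion-id="promo_id"> Coupon Plus reward</span>
<span id="purchase_list_line_3" class="discount" data-promotion-id="promo_id"> </span>
<span id="purchase_list_line_3" class="discount" data-promotion-id="promo_id"> </span>
<span id="purchase_list_line_3" class="discount css_bold" data-promotion-id="promo_id">-0.69</span>
</div>"""


def last_token(text):
    """Hypothetical helper: last whitespace-separated token, or None if empty."""
    tokens = (text or "").split()
    return tokens[-1] if tokens else None


spans = ET.fromstring(HTML).findall("span")
# Only spans whose class is exactly "discount" reach the failing branch.
plain = [s for s in spans if s.attrib["class"] == "discount"]
# Both of them hold a single space, so split() yields [] and [-1] would raise.
tokens = [last_token(s.text) for s in plain]
```

Guarding on the empty list, as `last_token` does, avoids the exception but still leaves the actual amount (in the `discount css_bold` span) unparsed.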
Man, they just can't be consistent... Thank you for providing another data point with which we can figure out all the formats they use for this! I'll try to implement the format you provided once I have time to actually do this though... You can always suggest changes and I'll be happy to use them of course.
For now I'm thinking a regex searching for something like -\d+[\.,]\d{2}$ would be best.
Btw, shouldn't the code currently fail when parsing the first line's "reward" word as a float, rather than with the whitespace-split IndexError that you're describing?
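A rough sketch of that regex idea follows. The pattern is the one suggested above with a capture group added around the digits; `parse_discount` and `DISCOUNT_RE` are hypothetical names, and the sample strings in the usage note are made up rather than taken from a real receipt.

```python
import re

# Pattern from the comment above, with a capture group added around the
# number so the amount can be extracted; the minus sign stays outside it.
DISCOUNT_RE = re.compile(r"-(\d+[.,]\d{2})$")


def parse_discount(text):
    """Return the discount as a positive float, or None if no amount is found."""
    match = DISCOUNT_RE.search((text or "").strip())
    if match is None:
        return None
    # Accept both "." and "," as the decimal separator.
    return float(match.group(1).replace(",", "."))
```

With this, `parse_discount("-0.69")` gives `0.69`, while whitespace-only text and label text such as `" Coupon Plus reward"` both give `None` instead of raising.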
No rush to implement this. I would have done it myself but I was hesitant to suggest changes cause I'm still trying to understand what's happening. I will for sure once I get more familiar with the project. :)
I think it's because the class of the first line is discount css_bold instead of just discount, so it's not parsed at all.
So, does the HTML differ from country to country? Or does it depend on the coupon you use and whether it's a percentage/flat discount? That's the only receipt with a coupon I have so that's my only data point :/
I was hesitant to suggest changes cause I'm still trying to understand what's happening. I will for sure once I get more familiar with the project.
No worries :) This is not even that tied to this specific project, but more to the actual lidl API since AFAIK there's no public documentation for it and people just somehow reverse engineered it.
I think it's because the class of the first line is discount css_bold instead of just discount, so it's not parsed at all.
Right, of course, missed that.
So, does the HTML differ from country to country? Or does it depend on the coupon you use and whether it's a percentage/flat discount?
There's definitely some difference for some reason. See diogotcorreia/lidl-to-grocy's lidl/test/receipt.html. I think it uses the same format as what I saw and implemented in this PR. It's weird that your receipt is different, but we'll probably have to implement common parsing for all possible formats. Currently I'm blocked on #23 though, so I can't verify whether anything changed recently in my receipts, but in my lidl-plus Android app I don't see any discounts in bold, as you probably would.
We could probably do something like this to support both formats:
if ...:
    ...
elif {"discount", "css_bold"}.issubset(node.attrib["class"].split()) and try_parse_float(node.text):
    ...
elif node.attrib["class"] == "discount":
    ...
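Filled in as a standalone function, the branches above could look roughly like this. It is a sketch under assumptions: `try_parse_float` is not defined anywhere in this thread, so the version here is a guess at its intent, and `discount_from_span` is a hypothetical wrapper that takes the class and text directly instead of a node.

```python
def try_parse_float(text):
    """Hypothetical helper: parse a '.' or ',' decimal number, else None."""
    try:
        return float((text or "").strip().replace(",", "."))
    except ValueError:
        return None


def discount_from_span(css_class, text):
    """Return the discount as a positive float, or None if the span has none."""
    classes = set(css_class.split())
    if {"discount", "css_bold"}.issubset(classes):
        # Bold format: the amount is the whole text, e.g. "-0.69".
        value = try_parse_float(text)
        return None if value is None else abs(value)
    if css_class == "discount":
        # Plain format: the amount is the last token; blank spans are skipped.
        tokens = (text or "").split()
        value = try_parse_float(tokens[-1]) if tokens else None
        return None if value is None else abs(value)
    return None
```

Returning `None` rather than relying on truthiness keeps a `0.0` amount distinguishable from "not a number", which the `and try_parse_float(node.text)` condition in the sketch above would not.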
Is there any way to instruct the API to return results only in JSON (NATIVE)?
@bchhabra Not that I could find
Port diogotcorreia/lidl-to-grocy's fix for the ticket v2 API change. The fix is to try the v2 API and, if it fails, fall back to the v3 API, which returns the ticket formatted as HTML. The JSON data is then constructed from that HTML.
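The control flow described above can be sketched as follows. All names are hypothetical: `fetch_v2`, `fetch_v3_html` and `html_to_json` are stand-ins passed in by the caller, not functions from this project or from lidl-to-grocy.

```python
def get_ticket(ticket_id, fetch_v2, fetch_v3_html, html_to_json):
    """Try the v2 JSON endpoint first; on failure, rebuild JSON from v3 HTML.

    All three callables are hypothetical stand-ins supplied by the caller.
    """
    try:
        return fetch_v2(ticket_id)
    except Exception:  # a real implementation would catch a narrower error
        html = fetch_v3_html(ticket_id)
        return html_to_json(html)
```

Passing the fetchers in as arguments keeps the fallback logic testable without network access.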
Credit to https://github.com/diogotcorreia/lidl-to-grocy/blob/master/lidl/src/html_receipt.rs.
Closes #20
This is a draft, because this implementation has not been tested much. Feel free to test it yourself.