Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core improvements #14

Open
wants to merge 32 commits into
base: master
Choose a base branch
from
Open

Core improvements #14

wants to merge 32 commits into from

Conversation

mjmeans
Copy link

@mjmeans mjmeans commented Nov 17, 2024

Alot of changes here but collision avoidance resolution is fixed now when reading multiple tags at one time regardless of the tag's UID as long as there is either more than one or less than one tag with a UID ending in '0'. More details about this are in the 'tidy up the code a bit' commit second from the end. I'm going to continue to work on resolving that issue in another branch that will be created based on this one so that I can use all the additional debugging macro features I already have in place here.

…mmunicationMode options in setRF_on()"

This reverts commit 2bef556.
Somewhere along the line the DEBUG define was not in DEBUG, which results in formatHex() not being included. Now it is conspicuously uncommentable.
Expanded defines for all registers for reference and future use
Expanded defines for all Host Interface Commands for reference and future use
Fix #ifdef nesting
Improve code readabiility and allow eventual expansion to full ISO18593-3 compatibility.
The bug is: Calling getInventoryMultiple() hangs and times out if there is only one NFC tag and that tag's UID ends in '0'. The library doesn't when calling getInventory() with the same tag.

Still unable to see why this is happening. So expand the PN5180ERROR to show deeper calls.
PN5180 Direct Commands
These functions are the low level interface to the device:
1. Verify command's parameters
2. Prepare the transmission frame
3. Call transceiveCommand() to send the command
4. No state checking

The comments for these direct commands are based on the PN5180A0XX_C3_C4.pdf data sheet.

The original setRF_on(), setRF_off() and sendData() calls are the versions that do parameter and state checking before calling the direct commands. This will allow variations of the original calls with different state checking where necessary.
The bug that prevented getInventoryMultiple() from doing collision resolution is fixed. Four cards with UIDs ending in '0' were able to be read at one time. I also had two cards with UIDs ending in 'DD' that were recognized, so the collision avoidance resolution is at least getting to the third round. I do not have any cards with UIDs that all end in the same three or more identical nibbles to see how far it can go, but I think getting to the third round is proves that the algorithm is correct.

That alll being said, although my initial problem of the library hanging when reading a stack of cards with at least one ending in '0' has only partially been resolved. In any number of cards within the RF field, if only one of them has a UID ending in '0', getInventoryMultiple() still hangs. Adding a additional tags ending in '0' however does not hang. Even with all the debugging macros I added I still haven't discovered why this occurs only when there is a single tag ending in '0' within the field. And if using the regular single card reading call, getInventory(), cards ending in '0' read just fine.

While the latter problem is not yet solved, I have made significant progress on the other two issues arising from errors in collision resolution, so I'd ending this branch here and will send a PR upstream. I will work on the remaining solution in another branch. based on this one.

By the way, I have on occasion been able to read 9 NFC Type 5 ICODE SLIX card tags at one time on the PN5180 if the cards are held just above, but not touching, the PN5180 board antenna. But I do find that not all manufacturer's of these cards are able to be read that many at a time. Different manufacturer's use different size or shape or windings of their antennas and possibly use lower quality tag chips inside the cards. I haven't investigated this actual cause of this inconsistency or how to make sure the tags I get are ones that can operate in larger stacks.

On another note, I've had communication with NXP regarding the PN5180 and tyriing to get the chip to resolve at a full NFC Type 5 ranges over 10 cm since NFC Type 5 is supposed to be able to be read up to 1.5 m. NXP replied that their PN5180 is NFC compliant, but that their chip is only design to read at ranges up to 10 cm. So word to the wise, if you are looking for the ability to read a single NFC Type 5 tag at ranges longer than 10 cm, you probably don't want the PN5180 unless you are going to design a very complex powerful amplified antenna circuit. NXP offered me no guidance in this respect, other than to suggest contacting another company specializing in long range RFID. Sigh!
Copy link

@alastaira alastaira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't maskLen expressed in nibbles? (as calculated here:

)

*** EDIT ***
Sorry, my bad. I have a different PN5180 library, so ignore this!

@mjmeans
Copy link
Author

mjmeans commented Nov 20, 2024

The collision resolution algorithm with 16 slots asks all cards that match any specified mask to respond in the slot (0 to 15) corresponding to the least significant nibble after the mask. In the first round, the mask is 0 and the mask length is 0, so all cards will try to respond in the slot number of their least significant nibble. There are three possible results in each slot: no card response, single card response, and multiple card collision.

If a single card responds in any slot, it's UID is saved in the list of UIDs resolved. If a collision occurs, each slot where it occurs is pushed to array for later resolution.

At the end of each round of 16 slots, if there are any saved collisions yet to be resolved, then one is popped from the array and used as a mask for another round. The mask tells only those cards that match the mask are allowed to respond in the next round. Additional

So, for example if there were cards with UIDs that end in ... 50:34 and 5E:65 and 60:65, then the two ending in --:-5 then the single card without a collision would respond in time slot 4, and be resolved. But in time slot 5 there would be a collision.

So the next round would set the mask to 00:05 with a masklen of 4 bits (one nibble) telling only those cards that end in 5 to respond. The mask and masklen defines a wildcard equivalent to **:*5". In the second round only the cards that match that would respond and choose a time slot to respond in by the lowest 4 bits that were not in the mask. In this example both also have the same second nibble, a 6. So a collision would occur again for the same two cards. The third round would specify a mask of 00:65 and a masklen of 8 bits which would result in an equivalent wildcard search of **:65". In this round all cards that end in 65 would respond in the time slot according to the lowest nibble not in the mask. In this case the card with 60:65 would respond in slot 0 and the card with 5E56 would respond in slot 14 (E). Since at this point there are no other collisions to resolve in this group and there are no other collisions pending from any other round, the collision resolution process ends.

@alastaira
Copy link

Right, but I believe there's a flaw in the way the mask is applied (apologies for commenting on your pull request - it's been an existing issue in this library for some time but for some reason I've been unable to raise an issue on the repo itself. As you seem to be trying to address issues with collisions it seems appropriate to leave them here!).

Specifically, the issue is with the way maskLen is calculated, here:
https://github.com/tueddy/PN5180-Library/blob/d54712e5869de3e256a5ee744f3fd952de0c5467/PN5180ISO15693.cpp#L111C1-L118C5

This loops through the mask value, one nibble at a time, incrementing maskLen until all the remaining bits in the mask are equal to zero. The (false) assumption is that, by counting the number of non-zero nibbles, this gives the length of the mask.

The problem is that if a collision occurs in the first timeslot (i.e. multiple tags match the provided inventory request, and the last nibble of both of their UIDs is 0x0), then the mask will contain the actual value of the timeslot 0x0, as here:

if(maskLen > 0) collision[*numCol] = collision[0] | (slot << (maskLen * 2));
else collision[*numCol] = slot << (maskLen * 2); // Yes, store position of collision

So, 0x0 in the mask at this point doesn't mean "this mask slot has not been set yet", it means "this mask slot is 0x0". But the calculation of maskLen will see 0x00 and think it has reached the end of the mask, and fail to increment the maskLen.
Consequently, whenever a collision occurs in the first time slot, inventoryPoll() will be called recursively but using the same mask each time, which leads to an infinite collision loop, eventually crashing.

I believe the solution is not to calculate maskLen from the value of collision[0], but explicitly keep track of maskLen in a separate variable that is passed to inventoryPoll() (and incremented each time the mask is extended).

@mjmeans
Copy link
Author

mjmeans commented Nov 21, 2024

I agree that the collision array length should be tracked individually. It should be a LIFO.

This PR solves the case when there is a collision of two or more cards ending in 0 with a maskLen of 0. However, it does not solve the case where there is only 1 card ending in 0. IN that case something else happens which causes the transceive to be out of sync and fail, IIRC at send/3. The BUSY check never succeeds and a timeout causes it to abort.

If you have a suggestion, and my PR hasn't been added to this fork yet, please fork my fork and create a new branch off of my core-improvements branch and send me a PR for your changes to test.

I've been pulled off this project for now and won't be able to get back to it for a week or two, but I can spare enough time to test a fix if you submit one. Otherwise, I will be a few weeks before I can look into it. I wish I had some UID programmable Type 5 tags so that I could also test collision up to 8 nibbles deep to see if there are any other issues that occur if it goes 8 rounds.

@mjmeans
Copy link
Author

mjmeans commented Nov 21, 2024

We can also discuss this on my fork of the repo. I've enabled the issues and discussions tabs and created an issue concerning a reading multiple when a single x0 nibble tag is present i the field. here.

@alastaira
Copy link

Thanks. I've been working with my own fork which I took when ATrappman's repo was archived a few years ago, but it's sufficiently different now that I can't submit a PR that can be merged directly into this branch and it'll take me a little time to review any changes that have been committed since then anyway.
But I'll comment or submit a PR when I think I have something.

@mjmeans
Copy link
Author

mjmeans commented Nov 21, 2024

Now that I look at it, numCol is implemented in this fork. But poorly.

  1. In a single round, there can be up to 16 collsions. Therefore, the length of collsion[] is not reliable to calculate the mask for the next round in the current code.

I think the solution that preserves backward compatibility is for getInventoryMultiple() to do more work.

  1. Add a new resolutionPending[] (or whatever you want to call it) to cache all collisions returned from inventoryPoll(). It should be implemented IFLO (first in last out) like a stack.
  2. Calculate a numCol and collsion[] to send to inventoryPoll().
  3. Save any collisions returned from inventoryPoll() in the resolutionPending[] array.
  4. Loop back to 2 if there are any resolutionPending[].

@mjmeans
Copy link
Author

mjmeans commented Nov 21, 2024

If backward compatibility of inventoryPoll() is not needed, then returning which slots had collisions can be done as a single uint16_t bitmap in collision[0]. This could make inventoryPoll() faster, but I don't know if that's necessary.

In any case, the resolutionsPending[] could be a uint32_t array. Each element holds the most significant nibble of the mask << 16 and the collisions in the lower 16 bits for the last return of inventoryPoll(). That gives you enough information to quickly build a numCols and collision[] to pass to the next inventoryPoll(). The resolutionsPending[] array have a size of 24 which is the maximum possible number of nibbles that can be in any mask (16) + the maximum number of collisions that can result (16). And if there is any collision with a mask of 16 nibbles, then that exceeds the 8 bytes possible in the UID, so we would have to bail out at that point since either there are two cards with the same ID that are colliding, or there is some RF interference causing a problem, or maybe could be resolved by including AFI or DSFID which isn't supported here.

Well, I didn't test after turning deactivating all the DEBUG defines. This resolves it.
@mjmeans
Copy link
Author

mjmeans commented Nov 21, 2024

The last last commit resolves compilation errors with all the DEBUGs off.

@alastaira
Copy link

I agree it could be implemented better (and I don't personally have any issues with introducing breaking changes to inventoryPoll() to make it so), but I can't see that there's actual any flaws in the logic of the collision array at the moment...
If I understand correctly:
numCol is incremented each time a collision is found and pushed onto the end of the collision array:

if((rxStatus >> 18) & 0x01 && *numCol < maxTags){ // 7+ Determine if a collision occurred
if(maskLen > 0) collision[*numCol] = collision[0] | (slot << (maskLen * 2));
else collision[*numCol] = slot << (maskLen * 2); // Yes, store position of collision
*numCol = *numCol + 1;

And it's reduced again as getInventoryMultiple iterates through the array:

numCol--;
for(int i=0; i<numCol; i++){
collision[i] = collision[i+1];
}

If re-polling a collision with the new mask leads to another collision (because the UIDs of the colliding tags share two consecutive nibbles), that new, more specific collision replaces the previous one.

@alastaira
Copy link

For clarity, I'd probably prefer implementing a simple "collision" struct with seperate member variables for the mask and the nibble at which the collision occurred, rather than pack them together into a single uint32_t, but the result would be the same.

@mjmeans
Copy link
Author

mjmeans commented Nov 22, 2024

Consider a collision scenario where the first pol has 3 collisions. This will result in all three being stored in the collisions array before checking to see if any collisions need to be resolved. If there are only none or single collisions on subsequent polls then the library works fine.

However, if the second poll gets more than one collision, all of them will be added to the collisions array but only one of the previous collisions will be replaced with a new value. This means that the third poll can't calculate the new mask by simply walking back through consecutive entries in the array, as it does now.

@alastaira
Copy link

alastaira commented Dec 1, 2024

I'm encountering occasional crashes when attempting to read the reception buffer, which I'm attempting to track down.
After requesting an inventory, the RX_STATUS buffer reports that 10 bytes have been received (as expected), but then attempting to perform a SPI.transfer to fill the buffer hangs, (on the line `PN5180_SPI.transfer(recvBuffer, recvBufferLen); ).

I tried replacing the transceive command to receive only a single byte of the readbuffer at a time (i.e.

for (int i=0; i<recvBufferLen; i++){
    recvBuffer[i] = SPI.transfer(0x00);
}

but this also fails on attempting to receive the first byte.

This situation normally occurs after the code has been polling continuously for 30-40 seconds or so, with a card placed continuously right in front of the reader, so I'd expect every call to getInventory() to be successful.

One thing I'm unclear of is whether there could be a more robust test as to whether the PN5180 is ready to send received data to the controller before attempting to read the buffer? As I see it, there are several possible checks:

  • The BUSY line is LOW.
  • The RX_IRQ bit has been set to indicate reception complete
  • The IRQ_STATUS status is IDLE
  • RX_NUM_FRAMES_RECEIVED is > 0

I'm testing these out to see if I can prevent crashes (or, at least perform graceful recovery from them)

@alastaira
Copy link

Ok, so further update... I can reproduce this on several controllers, but it's hanging sooner on an UNO than it does on an ESP32, which makes me wonder it's a memory leak/overrun etc. After a cursory look through the code, I notice several calls to malloc(), yet none to free(), so are temporary pointers not getting released?

@mjmeans
Copy link
Author

mjmeans commented Dec 3, 2024

I only have ESP32 at this time. I used to have an STM I think, but I don't know where it is. I do have a few different ESP32's an S3, an S6, a C3 and a WROOM. I'm currently only using the WROOM since my use case also requires BT host capability to remotely operate a BT device. I might be able to look into what you reported later this week.

As far as hangs when reading cards over a long time, I didn't see that in my testing of this PR. However, I was only doing long run testing using getInventoryMultiple(), not getInventory(). I can look into that too.

As far as doing more checks to see if the device is in a ready state, I don't expect more checks to hurt as long as any xceive command is allowed to complete, i.e. I don't think the chip allows reading IRQ_STATUS after a send but before its response. If that's happening, then it's wrong. I think I saw some Microsoft code for NFC reading that caches the IRQ_STATUS register specifically to avoid reading IRQ_STATUS between the send and its response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants