Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resuming TLSv1.3 sessions #6172

Closed
PaulMartinsen opened this issue Mar 8, 2023 · 18 comments · Fixed by #6187
Closed

Resuming TLSv1.3 sessions #6172

PaulMartinsen opened this issue Mar 8, 2023 · 18 comments · Fixed by #6187
Assignees

Comments

@PaulMartinsen
Copy link

Version

5.5.4

Description

I would like to use TLSv1.3 session resumption on an embedded client to improve performance, but I can't figure out the intended architecture.

The client-tls13-resume.c example saves the session information in a local variable but this doesn't appear to use the library's session cache. So a useful test, but not the practical solution I'm looking for. Using this example, I can see the printf("Session ID reused; Successful resume\n"); message.

I found that if I call wolfSSL_SetServerID from my client, passing the host name for the id, I get a resumed session on the second call (to the same server) of wolfSSL_connect(…). But when making the third connection, I get a SOCKET_ERROR_E on the client and on the server DoTls13ClientHello, return -423 (BAD_BINDER) leaving me very confused. The connection is closed each time.

I'm using the echoserver example on the server end. Is there an example for session resumption that employs the cache?

@anhu
Copy link
Member

anhu commented Mar 8, 2023

Hi,
Are you enabling secure renegotiation?

--enable-secure-renegotiation

Warm regards, Anthony

@anhu anhu self-assigned this Mar 8, 2023
@PaulMartinsen
Copy link
Author

Thanks @anhu . I have HAVE_SECURE_RENEGOTIATION defined to support server initiated renegotiation for TLSv1.2 but not HAVE_SERVER_RENEGOTIATION_INFO. Are these needed for TLSv1.3 session resumption? I noticed your documentation doesn't recommend them.

@PaulMartinsen
Copy link
Author

I noticed the documentation for wolfSSL_get_session "recommends defining NO_SESSION_CACHE_REF and using wolfSSL_get1_session for session resumption".

Should I be implementing my own session cache and not using wolfSSL_SetServerID for TLSv1.3 connections?

@anhu
Copy link
Member

anhu commented Mar 8, 2023

Yes, they are for TLSv1.3 resumption. You can find good examples of how to use wolfSSL_get1_session in our examples.

In general, here at wolfSSL we take a very conservative stance regarding renegotiation and resumption. We always encourage a full and clean restart with a new connection whenever possible. I do not know anything about your use case, so my position remains the default which is that you should probably avoid resumption and/or renegotiation.

Can you let us know a little about your project and your goals? We always love know what people are doing with our code.

Warm regards, Anthony

@PaulMartinsen
Copy link
Author

Thanks @anhu . Rich can tell you about our project.

I have found that if I modified the client-tls13-resume.c client example to use wolfSSL_SetServerID instead of wolfSSL_set_session then the client session can be retrieved from the internal cache. And I can reconnect 10 times, reusing the session from the cache each time.

However, in my application I see:

  1. first connection succeeds, not using resumption.
  2. second connection succeeds, using resumption.
  3. third connection fails; DoTls13ClientHello on the server reports SSL_accept error = -423, binder does not verify from within DoPreSharedKeys, just after the call to BuildTls13HandshakeHmac. The binder information indeed doesn't match, though the length is the same.

I can't see the difference between what I am doing and the example code, but this is quite new to me. Can you suggest reasons the binding information wouldn't match or how I could figure out what's gone wrong?

@PaulMartinsen
Copy link
Author

Thanks @SparkiDev . It does seem similar to client initiated secure renegotiation in 1.2, which I understand is not desirable. Would this be called session resumption in TLS1.3?

@SparkiDev
Copy link
Contributor

Hi @PaulMartinsen

Resumption is the name for an abbreviated handshake using session tickets.

Are you using the same ticket each time or a new ticket with each new connection?

Sean

@PaulMartinsen
Copy link
Author

Thanks for clarifying @SparkiDev . That's exactly what I'm trying to do for TLSv1.3. And, that was a spectacularly good question.

I wasn't exactly sure what you meant by the ticket so I added a call to wolfSSL_get_session and wolfSSL_get_sessionID to the working example to get that ID, which broke resumption.

To cut a long day short, when using wolfSSL_SetServerID to do session resumption for TLSv1.3 with the session cache:

  1. if you don't call wolfSSL_get_session after making the first connection the next resumption will fail.
  2. if you do call wolfSSL_get_session after resuming a session, the next resumption will fail.

By fail I mean, for:

  1. wolfSSL_connect will make a connection with a fresh handshake (i.e., not resume the session),
  2. wolfSSL_connect will fail to make a connection and the server will protest SSL_accept error = -423, binder does not verify.

I couldn't find an example for how to use wolfSSL_SetServerID properly so I've been making it up as I go along. I attached my test code. It has two pre-processor symbols BREAK_SESSION_RESUMPTION_1 and BREAK_SESSION_RESUMPTION_2. They correspond to the two cases above. Setting either (or both) to 1 will cause the corresponding faults described above. 1 wins when both BREAK_SESSION_RESUMPTION_1 and BREAK_SESSION_RESUMPTION_2 are 1.

What I'm trying to do is:

  1. call wolfSSL_SetServerID to associate the next session with a server by its host name (i.e., its id).
  2. call wolfSSL_connect to create a TLSv1.3 connection. Either a new one if (this is the first connection or the session info has expired), or resume with the cached credentials if they are available.
  3. if the connection fails for some reason: clear the cached credentials and negotiate a new session next time.

I couldn't find a function to do #3. How should I be doing 1 & 2?

@anhu anhu assigned SparkiDev and unassigned anhu Mar 9, 2023
@SparkiDev
Copy link
Contributor

Hi @PaulMartinsen

I tried to reproduce the issue - your test code was helpful but I can't compile it.

I did see that the session is no longer being automatically stored in the client cache.
I've attached a patch that fixes that.
issue6172_patch_1.txt

What I didn't see was a binder error.
Please send me your user_settings.h file.
That may help us reproduce your error.

I can't see anyway currently to remove the cached credentials.
At the end of the handshake a new session ticket will be added in that will replace the old one so that will work.
But if a ticket is not sent then the invalid session will persist.
We can look at removing the cached session information on failed reuse.
Note there is a timeout value.

Sean

@PaulMartinsen
Copy link
Author

Hi @SparkiDev ,

Thanks for the patch. I'll try it over the weekend.

I've made a zip file with the user settings. It contains folders:

  • Client: user_settings.h for the program that's calling that TestResume function.
  • Lib: user_settings.h used to build the library that the client links to
  • Discovery Wolf - most of the server program that's reporting binding error. It won't build as there's some libraries I can't share but they just deal with locating credential paths which should be a trivial change. I did include (in x64) the executable with wolfSSL_debugging_ON, credential files and the wolfssl.lib file I'm linking with in case those are useful to you.

Removing the cached credentials only came up with the fault. Once it starts faulting, it just just stuck with failures. Aside from this issue, I assume there are cases where the session wouldn't resume properly (e.g., wifi died?) so some kind of recovery scenario would be needed. Great for me if that's handled in the library :)

Have a good day,
Paul.

@PaulMartinsen
Copy link
Author

Patch fixed both cases, with my TLSv1.3 connection successfully resuming for all four permutations of BREAK_SESSION_RESUMPTION_1 and BREAK_SESSION_RESUMPTION_2.

This resolves the issue I encountered. Has this an issue with persistence of an invalid session that you mentioned above? Can I leave you to consider and close this issue when it makes sense for you please?

Thanks for your help resolving this so quickly @SparkiDev . Much appreciated!

PaulMartinsen pushed a commit to PaulMartinsen/wolfssl that referenced this issue Mar 10, 2023
@PaulMartinsen
Copy link
Author

PaulMartinsen commented Mar 11, 2023

A related issue I spotted while tidying up: wolfSSL_SetServerID silently truncates the id supplied to fit an internal buffer.

Although an authority might be logical id to use, this buffer is significantly smaller than the size permitted for hostname. So hostname:port could work if the host name is short, but could create a hard to identify bug in some situations.

The id size is SERVER_ID_LEN, in internal.h, although that doesn't sound like a header file a library user should use. It turns out it is the same size as a sha-1 hash, so possibly something like this would work:

    wc_Sha ServerId;
    if (0 == wc_InitSha(&ServerId)
      && 0 == wc_ShaUpdate(&ServerId, (const byte*)host, strlen(host))
      && 0 == wc_ShaUpdate(&ServerId, (const byte*)&port, sizeof(port))
      && 0 == wc_ShaFinal(&ServerId, abyServerId))
    {
      if (WOLFSSL_SUCCESS != wolfSSL_SetServerID(soap->ssl, (const byte*)abyServerId, sizeof(abyServerId), 0))
      {
         //wolfSSL_SetServerID failed
      }
    }
    else
    {
      // Failed to create server id hash
    }

Note to future readers: if you use something like this, be sure to call wc_ShaFree(&ServerId) otherwise you might have other problems.

@SparkiDev
Copy link
Contributor

Hi @PaulMartinsen

I've put up a pull request with the previous patch and a change to hash long server IDs.
#6187

Let me know if this PR works for you.

Thanks,
Sean

@PaulMartinsen
Copy link
Author

Thanks @SparkiDev . It looks good to me, though I was a bit confused by the lines where it seems to check for the absence of SHA-256 then uses it. Perhaps its a different SHA-256.

#if defined(NO_SHA) && defined(NO_SHA256)
        if (wc_Sha256Hash(id, len, idHash) != 0) // <-- is this available when NO_SHA256 is defined
            return WOLFSSL_FAILURE;
#else
        if (wc_ShaHash(id, len, idHash) != 0)
            return WOLFSSL_FAILURE;
#endif

@SparkiDev
Copy link
Contributor

Hi @PaulMartinsen,

My mistake!
I've fixed this to be !defined(NO_SHA256)

Thanks for looking at the PR.

Sean

@SparkiDev
Copy link
Contributor

Hi @PaulMartinsen,

The PR has been merged and will go out in the next release which is due in the next week or so.

Is there anything else for the ticket?

Sean

@SparkiDev SparkiDev reopened this Mar 21, 2023
@PaulMartinsen
Copy link
Author

I think its resolved now. Thanks very much @SparkiDev & @anhu for your help in resolving this. Much appreciated!

@SparkiDev
Copy link
Contributor

Thank you @PaulMartinsen!

We fixed an important issue with resuming TLS v1.3.

Sean :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants