Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testcase with STM and Async not failing as expected #383

Open
NicolasT opened this issue Mar 23, 2023 · 2 comments
Open

Testcase with STM and Async not failing as expected #383

NicolasT opened this issue Mar 23, 2023 · 2 comments

Comments

@NicolasT
Copy link

NicolasT commented Mar 23, 2023

While trying some things using DejaFu, I ran into a case where a test-case gives (intentionally) inconsistent results, as witnessed when running the test in IO, however, according to testAuto its results are consistent. Only when I add some extra code to align the stars, the inconsistent results are also reported by DejaFu.

Am I doing something wrong, or is there a bug somewhere?

The case boils down to the following:

runComputation :: (MonadConc m) => m a -> (a -> m ()) -> (a -> m ()) -> m Int
runComputation mkLock unLock waitLock = do
  v <- newTVarConc 0

  l <- mkLock

  let act n = do
        waitLock l
        atomically $ writeTVar v n

  withAsync (act 1) $ \a ->
    withAsync (act 2) $ \b -> do
      unLock l
      wait a
      wait b
      readTVarConc v

I.e., run two transactions atomically over some TVar within two distinct threads, each putting another value in the TVar, wait for the threads to finish, and read the value from the TVar.

First, assume mkLock, unLock and waitLock are pure (), pure and pure no-ops. when running this in IO, either 1 or 2 is returned, which seems correct. Indeed, as a unit-test, running this 1000 times has a fairly high probability of distinct values being returned (using the threaded RTS and -N2). When testing with DejaFu, however, no deadlocks or exceptions are detected (good!), but the Consistent Result test succeeds as well, which is unexpected.

If instead we use newEmptyMVar, flip putMvar () and readMVar as mkLock, unLock and waitLock, then running the computation in IO multiple times still yields inconsistent results, and now DejaFu also detects this:

dejafu-test
  Without MVar
    IO:                  FAIL
      Main.hs:26:
      expected: 1
       but got: 2
      Use -p '/Without MVar.IO/' to rerun this test only.
    DejaFu
      Never Deadlocks:   OK
      No Exceptions:     OK
      Consistent Result: OK
  With MVar
    IO:                  FAIL
      Main.hs:35:
      expected: 1
       but got: 2
      Use -p '/With MVar.IO/' to rerun this test only.
    DejaFu
      Never Deadlocks:   OK (0.03s)
      No Exceptions:     OK (0.03s)
      Consistent Result: FAIL (0.07s)
        Failed:
                2 S0----------S1----------S0--S2----------S0---------------
        
                1 S0----------S1-P2----------S1---------S0----------------
        
        Use -p '/With MVar.DejaFu.Consistent Result/' to rerun this test only.

3 out of 8 tests failed (0.10s)

I created a Gist with the full test-suite (and Cabal file etc) to share the complete code, and make it easier to reproduce: https://gist.github.com/NicolasT/d2f456112221679ecac2680685627027

Maybe I'm misunderstanding what DejaFu is supposed to do here, and it's acting as expected, but it's non-obvious to me 😃

@barrucadu
Copy link
Owner

Hmm, that's definitely weird. I've managed to reduce it to:

runComputation5 :: (MonadConc m) => m Int
runComputation5 = do
    v <- atomically $ newTVar 0

    -- using an MVar for `blockA` and `blockB` fixes the problem
    blockA <- atomically newEmptyTMVar
    blockB <- atomically newEmptyTMVar

    mask $ \restore -> fork $ act restore v blockA 1
    mask $ \restore -> fork $ act restore v blockB 2

    -- swapping these two lines, or removing either of them, fixes the problem
    atomically $ readTMVar blockA
    atomically $ readTMVar blockB

    atomically $ readTVar v
  where
    act restore var block n = do
      -- removing the `try` fixes the problem
      -- removing the `restore` fixes the problem
      e <- try (restore (atomically $ writeTVar var n))
      atomically $ putTMVar block (e :: Either SomeException ())

So it seems to be some interaction between STM and exceptions (even though no exceptions are thrown in this code)

@NicolasT
Copy link
Author

I wanted to reduce my test-case to something "smaller" but didn't get to it yet. Indeed, as your test shows, replacing async by what it basically does behind the scenes also exposes the problem.

Glad to hear this is not caused by a misunderstanding of DejaFu on my end 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants