I am getting the error "TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')" and context_recall as 0 #1796

Closed
tvsathish opened this issue Dec 26, 2024 · 8 comments
Labels
bug Something isn't working module-testsetgen Module testset generation question Further information is requested

Comments

tvsathish commented Dec 26, 2024

[ ] I checked the documentation and related resources and couldn't find an answer to my question.

Your Question
Why am I getting the error "TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')", and why is my context_recall value showing up as 0.000000?

Code Examples

import pprint

import pandas as pd
from langchain_openai import AzureChatOpenAI
from langchain_openai import AzureOpenAIEmbeddings
from ragas import SingleTurnSample, EvaluationDataset
from ragas import evaluate
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness, SemanticSimilarity


def empty_nan_value(cell_value):
    return '' if pd.isna(cell_value) else cell_value


def create_turn_sample(row):
    return SingleTurnSample(
        user_input=row['user_input'],
        retrieved_contexts=[empty_nan_value(row['context1']), empty_nan_value(row['context2']),
                            empty_nan_value(row['context3']),
                            empty_nan_value(row['context4'])],
        response=row['response'],
        reference=row['reference'])


df = pd.read_excel("<RESULTS_EXCEL_FILE>")

eval_dataset = EvaluationDataset([create_turn_sample(row) for index, row in df.iterrows()])

azure_config = {
    "base_url": "<AZURE_CHAT_COMPLETIONS_URL>",
    # your endpoint
    "model_deployment": "<AZURE_DEPLOYMENT>",  # your model deployment name
    "model_name": "gpt-4o"  # your model name
}

evaluator_llm = LangchainLLMWrapper(AzureChatOpenAI(
    openai_api_version="2024-08-01-preview",
    azure_endpoint=azure_config["base_url"],
    azure_deployment=azure_config["model_deployment"],
    model=azure_config["model_name"],
    validate_base_url=False,
))

metrics = [
    LLMContextRecall(llm=evaluator_llm),
    FactualCorrectness(llm=evaluator_llm),
    Faithfulness(llm=evaluator_llm)
]
results = evaluate(dataset=eval_dataset, metrics=metrics)
pprint.pprint(results)

df = results.to_pandas()
df.head()

AZURE_OPENAI_KEY is set as an environment variable.

Additional context
Output:

Evaluating:  14%| 25/174 [01:55<02:04,  1.19it/s] Exception raised in Job[31]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  23%| 40/174 [02:55<03:08,  1.41s/it] Exception raised in Job[34]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  25%| 44/174 [02:58<01:56,  1.12it/s] Exception raised in Job[37]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  30%| 52/174 [03:59<06:44,  3.31s/it] Exception raised in Job[58]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  34%| 59/174 [04:17<06:32,  3.41s/it] Exception raised in Job[43]: TimeoutError()
Evaluating:  35%| 61/174 [04:59<19:02, 10.11s/it] Exception raised in Job[46]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  41%| 71/174 [05:02<02:13,  1.29s/it] Exception raised in Job[70]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  41%| 72/174 [05:03<01:46,  1.05s/it] Exception raised in Job[73]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  44%| 76/174 [05:44<19:13, 11.77s/it] Exception raised in Job[61]: TimeoutError()
Evaluating:  56%| 98/174 [07:04<05:08,  4.06s/it] Exception raised in Job[91]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  58%| 101/174 [07:05<01:56,  1.59s/it] Exception raised in Job[100]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  67%| 116/174 [08:06<04:46,  4.93s/it] Exception raised in Job[109]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  67%| 117/174 [08:07<03:35,  3.78s/it] Exception raised in Job[115]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  75%| 131/174 [09:10<05:46,  8.06s/it] Exception raised in Job[136]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  79%| 137/174 [09:11<01:04,  1.74s/it] Exception raised in Job[142]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  80%| 140/174 [09:11<00:30,  1.10it/s] Exception raised in Job[139]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  84%| 146/174 [09:27<02:06,  4.53s/it] Exception raised in Job[124]: TimeoutError()
Evaluating:  88%| 153/174 [10:13<00:58,  2.77s/it] Exception raised in Job[148]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  90%| 157/174 [10:14<00:16,  1.01it/s] Exception raised in Job[157]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating:  91%| 158/174 [10:15<00:16,  1.01s/it] Exception raised in Job[169]: TypeError(ufunc 'invert' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe'')
Evaluating: 100%| 174/174 [12:19<00:00,  4.25s/it]
{'context_recall': 0.0000, 'factual_correctness': 0.1174, 'faithfulness': 0.0996}
@tvsathish tvsathish added the question Further information is requested label Dec 26, 2024
@dosubot dosubot bot added the bug Something isn't working label Dec 26, 2024
@jjmachan
Member

jjmachan commented Jan 9, 2025

hey @tvsathish are you still facing this?

If so, could you run evaluate() again with raise_exceptions=True? That will show us the full traceback and help us pinpoint the line that raises the error.

cheers 🙂

@sahusiddharth sahusiddharth added the module-testsetgen Module testset generation label Jan 10, 2025
jjmachan pushed a commit that referenced this issue Jan 14, 2025
#1796 TypeError(ufunc 'invert' not supported for the input types, and
the inputs could not be safely coerced to any supported types according
to the casting rule ''safe'') will trigger when evaluating by
`FactualCorrectness`
@jjmachan
Member

Thanks to @oslijunw, this issue has been fixed 🔥
@tvsathish could you check it out? Closing this for now, but if anybody still faces this, feel free to comment below.

@tvsathish
Author

I upgraded ragas to the latest version, but I am still seeing this error message in the output.

Name: ragas
Version: 0.2.11
Summary: 
Home-page: 
Author: 
Author-email: 
License:

@oslijunw
Contributor

Can you provide some sample data in Excel, or mock up a table with a similar structure? @tvsathish

@tvsathish
Author

Hi @oslijunw, I have tried to attach a sample Excel (can't share full data) - hope it helps in troubleshooting.

This is the detailed exception I get:

Traceback (most recent call last):
  File "<home_dir>/Desktop/rag_eval.py", line 52, in <module>
    results = evaluate(dataset=eval_dataset, metrics=metrics, raise_exceptions=True)
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/_analytics.py", line 227, in wrapper
    result = func(*args, **kwargs)
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/evaluation.py", line 318, in evaluate
    raise e
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/evaluation.py", line 298, in evaluate
    results = executor.results()
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/executor.py", line 213, in results
    results = asyncio.run(self._process_jobs())
  File "<project_path>/.venv/lib/python3.9/site-packages/nest_asyncio.py", line 30, in run
    return loop.run_until_complete(task)
  File "<project_path>/.venv/lib/python3.9/site-packages/nest_asyncio.py", line 98, in run_until_complete
    return f.result()
  File "<home_dir>/.pyenv/versions/3.9.19/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "<home_dir>/.pyenv/versions/3.9.19/lib/python3.9/asyncio/tasks.py", line 256, in __step
    result = coro.send(None)
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/executor.py", line 141, in _process_jobs
    await self._process_coroutines(
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/executor.py", line 191, in _process_coroutines
    result = await future
  File "<home_dir>/.pyenv/versions/3.9.19/lib/python3.9/asyncio/tasks.py", line 611, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "<home_dir>/.pyenv/versions/3.9.19/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "<home_dir>/.pyenv/versions/3.9.19/lib/python3.9/asyncio/tasks.py", line 256, in __step
    result = coro.send(None)
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/executor.py", line 48, in sema_coro
    return await coro
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/executor.py", line 100, in wrapped_callable_async
    raise e
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/executor.py", line 96, in wrapped_callable_async
    result = await callable(*args, **kwargs)
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/metrics/base.py", line 541, in single_turn_ascore
    raise e
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/metrics/base.py", line 534, in single_turn_ascore
    score = await asyncio.wait_for(
  File "<home_dir>/.pyenv/versions/3.9.19/lib/python3.9/asyncio/tasks.py", line 479, in wait_for
    return fut.result()
  File "<home_dir>/.pyenv/versions/3.9.19/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "<home_dir>/.pyenv/versions/3.9.19/lib/python3.9/asyncio/tasks.py", line 256, in __step
    result = coro.send(None)
  File "<project_path>/.venv/lib/python3.9/site-packages/ragas/metrics/_factual_correctness.py", line 261, in _single_turn_ascore
    fp = sum(~reference_response)

Regards,
Paddy
Test Data.xlsx
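The last frame of the traceback above, `fp = sum(~reference_response)`, applies NumPy's bitwise invert operator to an array. A minimal sketch of how that operator can fail is below. It assumes, without confirmation from this thread, that `reference_response` ended up as a non-boolean array (for example an empty one, which NumPy creates as float64). The helper `count_false` is hypothetical and is not the ragas fix.

```python
import numpy as np


def count_false(verdicts):
    """Count falsy entries, coercing to bool first so `~` is always defined.

    Hypothetical helper for illustration; not part of ragas.
    """
    arr = np.asarray(verdicts, dtype=bool)
    return int(np.sum(~arr))


# A boolean array inverts without problems.
ok = np.array([True, False, True])
print("false entries in boolean array:", int(sum(~ok)))

# An empty list becomes a float64 array, and `~` is not defined for floats.
empty = np.array([])
try:
    sum(~empty)
except TypeError as exc:
    print("dtype", empty.dtype, "raised", type(exc).__name__)

# Coercing to bool first handles the empty case as well.
print("false entries in empty input:", count_false([]))
```

Forcing a boolean dtype before inverting is one defensive option; whether it matches the change made in the referenced commits is not stated here.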

@oslijunw
Contributor

I found out why the error still occurs: my sample triggered the exception at a different point than yours, so the earlier fix did not cover your case. I will submit another PR later. Also, is your context recall still 0?

@oslijunw
Contributor

Your context recall score is still 0 because the content of your document is a URL. The evaluation framework does not fetch the linked documents for you, so you have to load them yourself. @tvsathish
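A rough sketch of loading URL-valued cells before building the samples, using only the standard library. The helper names (`is_url`, `fetch_url_text`, `resolve_text`) are made up for illustration and are not part of ragas, and the decoded body is returned as-is, so any HTML markup would still need to be stripped separately.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def is_url(value):
    """Return True when value is a string that looks like an http(s) URL."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def fetch_url_text(url, timeout=30):
    """Download a URL and decode the response body as text."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def resolve_text(value, fetch=fetch_url_text):
    """Return the document text for value, downloading it first if it is a URL.

    `fetch` is injectable so the download step can be swapped or stubbed.
    """
    return fetch(value.strip()) if is_url(value) else value
```

In the script from the original post this could be wired in as, for example, `reference=resolve_text(row['reference'])` inside `create_turn_sample`; which columns actually hold URLs depends on the spreadsheet.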

@tvsathish
Author

Hi @oslijunw,
I didn't get the error when I read the contents of the URL as a string and passed it to the reference variable. I also got a non-zero context recall, but factual_correctness and faithfulness were zero. The program also ran much faster than before, which makes me wonder whether everything was computed properly.
Thanks,
Paddy
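One way to check whether the scores were really computed is to look at the per-row values rather than the aggregate. The sketch below assumes that rows whose job failed are recorded as NaN when exceptions are not raised; the function `summarize_metric` is hypothetical and works on plain dictionaries so it does not depend on ragas itself.

```python
import math


def summarize_metric(rows, metric):
    """Return (valid_count, missing_count, mean_of_valid) for one metric.

    rows is a list of dicts, one per evaluated sample; a value that is
    NaN, None, or absent is counted as missing.
    """
    values = [row.get(metric) for row in rows]
    valid = [
        v for v in values
        if isinstance(v, (int, float)) and not math.isnan(v)
    ]
    missing = len(values) - len(valid)
    mean = sum(valid) / len(valid) if valid else float("nan")
    return len(valid), missing, mean


# Example with made-up numbers: one row failed and was stored as NaN.
rows = [
    {"faithfulness": 0.5},
    {"faithfulness": float("nan")},
    {"faithfulness": 0.0},
]
print(summarize_metric(rows, "faithfulness"))
```

With the script from the original post, the rows could come from `results.to_pandas().to_dict("records")`. Many missing values would point to failed jobs, while genuine zeros would point to the inputs or the metric itself.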

jjmachan pushed a commit that referenced this issue Jan 18, 2025
#1796 Compatibility is being fixed due to different sample trigger
points