fix: printing signature fields in verbose mode for signature_opt #547

ragul-kachiappan-dev · 2024-03-04T14:14:57Z

#482 introduced _print_signature() for verbose mode in the signature optimizer:

def _print_signature(self, predictor):
        if self.verbose:
            if (hasattr(predictor, 'extended_signature')):
                signature = predictor.extended_signature
            else:
                signature = predictor.extended_signature1
            print(f"i: {signature.instructions}")
            print(f"p: {list(signature.fields().values())[-1].json_schema_extra['prefix']}")
            print()

However, fields is a property on the SignatureMeta class, not a regular method:

@property
def fields(cls):
    # Make sure to give input fields before output fields
    return {**cls.input_fields, **cls.output_fields}

Because fields is a property, signature.fields already evaluates to a dict, so _print_signature() should not call it. The current implementation uses signature.fields(), which raises "TypeError: 'dict' object is not callable" when running the signature optimizer as described in the tutorial at https://dspy-docs.vercel.app/docs/deep-dive/teleprompter/signature-optimizer
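
The failure can be reproduced outside DSPy with a minimal sketch. SignatureMetaSketch, DemoSignature, and the two helper functions below are hypothetical stand-ins, not DSPy code; plain dicts replace the real field objects and their json_schema_extra attribute.

```python
class SignatureMetaSketch(type):
    """Hypothetical stand-in for SignatureMeta; only `fields` is modeled."""

    @property
    def fields(cls):
        # Input fields first, then output fields, as in the snippet above.
        return {**cls.input_fields, **cls.output_fields}


class DemoSignature(metaclass=SignatureMetaSketch):
    # Plain dicts stand in for the real field objects (assumption).
    input_fields = {"question": {"prefix": "Question:"}}
    output_fields = {"answer": {"prefix": "Answer:"}}


def last_prefix_buggy(signature):
    # `signature.fields` is already a dict; calling it raises TypeError.
    return list(signature.fields().values())[-1]["prefix"]


def last_prefix_fixed(signature):
    # Access the property without parentheses.
    return list(signature.fields.values())[-1]["prefix"]
```

Calling last_prefix_buggy(DemoSignature) raises the same kind of TypeError reported above, while last_prefix_fixed(DemoSignature) returns the prefix of the last output field.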

Separately, the signature optimizer tutorial in the docs feels a bit incomplete: its example switches between the HotPotQA and GSM8K datasets. When I tried the example with HotPotQA, I got an error saying the input keys were not set. The cause appears to be that with_inputs() is not called when the Example objects are created for HotPotQA, whereas it is called for GSM8K. I am not sure whether this is a known issue.

Example object creation for HotPotQA:

    def _shuffle_and_sample(self, split, data, size, seed=0):
        '''
            The setting (seed=s, size=N) is always a subset
            of the setting (seed=s, size=M) for N < M.
        '''

        data = list(data)

        # Shuffle the data irrespective of the requested size.
        base_rng = random.Random(seed)

        if self.do_shuffle:
            base_rng.shuffle(data)

        data = data[:size]
        output = []

        for example in data:
            output.append(Example(**example, dspy_uuid=str(uuid.uuid4()), dspy_split=split))
        
        # TODO: NOTE: Ideally we use these uuids for dedup internally, for demos and internal train/val splits.
        # Now, some tasks (like convQA and Colors) have overlapping examples. Here, we should allow the user to give us
        # a uuid field that would respect this in some way. This means that we need a more refined concept than
        # uuid (each example is unique) and more like a group_uuid.

        # rng = random.Random(seed)
        # rng.shuffle(data)

        return output

Example object creation for GSM8K:

        trainset = [dspy.Example(**x).with_inputs('question') for x in trainset]
        devset = [dspy.Example(**x).with_inputs('question') for x in devset]
        testset = [dspy.Example(**x).with_inputs('question') for x in testset]
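
The difference between the two loaders can be illustrated with a small sketch. ExampleSketch is a hypothetical, simplified stand-in for dspy.Example; its internals and error message are assumptions, and only the general pattern (input keys stay unset until with_inputs() is called) is meant to mirror the behaviour described above.

```python
class ExampleSketch:
    """Hypothetical, simplified stand-in for dspy.Example."""

    def __init__(self, **fields):
        self._store = dict(fields)
        self._input_keys = None  # unset until with_inputs() is called

    def with_inputs(self, *keys):
        # Return a copy so the original example is left untouched.
        copied = ExampleSketch(**self._store)
        copied._input_keys = set(keys)
        return copied

    def inputs(self):
        if self._input_keys is None:
            raise ValueError("Input keys have not been set for this example.")
        return {k: v for k, v in self._store.items() if k in self._input_keys}


raw = [{"question": "What is 2 + 2?", "answer": "4"}]

# HotPotQA-style creation: no input keys are marked.
without_inputs = [ExampleSketch(**x) for x in raw]

# GSM8K-style creation: 'question' is marked as the input field.
with_inputs = [x.with_inputs("question") for x in without_inputs]
```

Under these assumptions, a possible workaround for the HotPotQA case is to call with_inputs('question') on each loaded example before passing the data to the optimizer, as the last line of the sketch does.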

I tested this change with a custom dataset built from the "math_qa" Hugging Face dataset, and verbose mode now works as expected. I am open to a better explanation of the cause, and I can raise the tutorial concerns as a separate issue.

arnavsinghvi11 · 2024-03-05T17:21:16Z

LGTM!

fix: printing signature fields in verbose mode for signature_opt

c2737f6

arnavsinghvi11 merged commit 58030c9 into stanfordnlp:main Mar 5, 2024
4 checks passed

arnavsinghvi11 mentioned this pull request Mar 5, 2024

Fix _print_signature() for verbose=True #551

Closed
