gemini-1.5-pro on ChatGoogleGenerativeAI generates a newline token in every response #24648
Replies: 2 comments
-
Hey @tanmaygupta9! I'm here to help you out with any bugs, questions, or contributions you may have. Let's get to the bottom of this together!

To prevent the trailing newline, you can modify the `_generate` method so that each candidate's text is stripped before it is wrapped in a `Generation`:

```python
def _generate(
    self,
    prompts: List[str],
    stop: Optional[List[str]] = None,
    run_manager: Optional[CallbackManagerForLLMRun] = None,
    **kwargs: Any,
) -> LLMResult:
    generations: List[List[Generation]] = []
    generation_config = {
        "stop_sequences": stop,
        "temperature": self.temperature,
        "top_p": self.top_p,
        "top_k": self.top_k,
        "max_output_tokens": self.max_output_tokens,
        "candidate_count": self.n,
    }
    for prompt in prompts:
        if self.is_gemini:
            # Gemini models: join each candidate's content parts, then
            # strip any trailing newline before building the Generation.
            res = completion_with_retry(
                self,
                prompt=prompt,
                stream=False,
                is_gemini=True,
                run_manager=run_manager,
                generation_config=generation_config,
            )
            candidates = [
                "".join([p.text for p in c.content.parts]).rstrip("\n")
                for c in res.candidates
            ]
            generations.append([Generation(text=c) for c in candidates])
        else:
            # Non-Gemini (PaLM) models: same idea, stripping the trailing
            # newline after the existing leading-space cleanup.
            res = completion_with_retry(
                self,
                model=self.model_name,
                prompt=prompt,
                stream=False,
                is_gemini=False,
                run_manager=run_manager,
                **generation_config,
            )
            prompt_generations = []
            for candidate in res.candidates:
                raw_text = candidate["output"]
                stripped_text = _strip_erroneous_leading_spaces(raw_text).rstrip("\n")
                prompt_generations.append(Generation(text=stripped_text))
            generations.append(prompt_generations)
    return LLMResult(generations=generations)
```

In this modification, the added `.rstrip("\n")` calls remove trailing newline characters from the generated text before the `Generation` objects are created.
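For illustration, here is a tiny self-contained sketch of what the stripping comprehension does. The objects below are stand-ins built with `SimpleNamespace`, not the real response types from the Google client library:

```python
# Stand-in objects mimicking res.candidates[i].content.parts,
# purely for illustration.
from types import SimpleNamespace

part = SimpleNamespace(text="my_key\n")
candidate = SimpleNamespace(content=SimpleNamespace(parts=[part]))

candidates = [
    "".join([p.text for p in c.content.parts]).rstrip("\n")
    for c in [candidate]
]
assert candidates == ["my_key"]  # trailing newline removed
```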
-
I created PR 598 for this issue.
-
Description
I'm trying to use `gemini-1.5-pro` for a task, accessed through Langchain's wrapper `ChatGoogleGenerativeAI`, and each of its responses has a trailing " \n" character in it. This seems to be the behavior only for Gemini 1.5 Pro, because Gemini 1.0 does not do this. This is important for my use case because I'm asking Gemini to produce a key for a dictionary, but it produces the key with a trailing " \n", which makes my code break.

I can obviously just post-process the output, but I'm trying to see if there's a cleaner solution that fixes this at the generation step itself, because I want my app to be LLM-neutral.
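In the meantime, one model-agnostic way to do that post-processing is to pipe the model through a small runnable, so the stripping lives in the chain rather than in app code. This is a sketch, assuming `langchain-google-genai` is installed and `GOOGLE_API_KEY` is set; the names are illustrative:

```python
# Sketch of an LLM-neutral post-processing step: any chat model can be
# swapped in for `llm` without touching the rest of the chain.
from langchain_core.runnables import RunnableLambda
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro")

# Strip trailing whitespace/newlines from the model's text output,
# so downstream code never sees the stray "\n".
strip_trailing = RunnableLambda(lambda msg: msg.content.rstrip())

chain = llm | strip_trailing
key = chain.invoke("Return a single dictionary key, nothing else.")
```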
System Info