Generate and Understand the Given Code Context. #149
-
Hi, I'm wondering whether Semantic Kernel can generate code based on a given GitHub repository. I have already tried the "GitHub Q&A" sample at https://github.com/microsoft/semantic-kernel/tree/main/samples/apps/github-qna-webapp-react , but unfortunately it did not generate code that matches the required code context. I would like to know how to help Large Language Models (LLMs) understand code context that spans multiple files, so that they can generate the code correctly.
Replies: 4 comments
-
Hi @1openwindow, yes that's possible; you can start from the sample code and make a few changes. The sample app currently indexes only Markdown files, so the first change would be to also index source code files, but here it gets tricky. Unless you're indexing only small files, a large codebase has to be split into "chunks" that fit the maximum input size accepted by the embeddings generator. Chunking code can be hard, depending on the language. Once you get past that, the rest should work fine: you use semantic memory and semantic search to retrieve relevant code snippets, fit them into a prompt, and use a semantic function (i.e. an LLM prompt) to generate the code you want. Here's an example of a prompt; it has two parts:
See at the bottom the answer given by the LLM. You can tweak the question to get what you need, as long as the injected code example is relevant. So the main problem is injecting relevant code using semantic search (a rough sketch of that chunk-index-retrieve step follows below).
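As a rough illustration of that pipeline (not the sample app's actual code), here is a minimal sketch: it chunks source files, indexes them, retrieves the chunks most relevant to a question, and injects them into a prompt. The chunk size, file extensions, and helper names are assumptions, and the keyword-overlap scoring is only a stand-in for the embedding-based semantic search you would get from Semantic Kernel's memory and an embedding backend.

# Sketch of the chunk / index / retrieve / inject pipeline described above.
# The storage and scoring here are naive stand-ins: in a real app each chunk
# would be saved into semantic memory (backed by an embedding model) and
# retrieved with semantic search instead of keyword overlap.
import os

CHUNK_SIZE = 1500  # characters per chunk; pick a size that fits the embedding model's input limit

def chunk_file(path: str, size: int = CHUNK_SIZE) -> list[str]:
    """Split a source file into fixed-size chunks (real chunking would respect language structure)."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        text = f.read()
    return [text[i : i + size] for i in range(0, len(text), size)]

def index_repo(root: str, extensions=(".py", ".cs", ".ts")) -> list[dict]:
    """Walk the repository and collect chunks of source files (the sample only indexes Markdown)."""
    chunks = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                for n, chunk in enumerate(chunk_file(path)):
                    chunks.append({"id": f"{path}#{n}", "text": chunk})
    return chunks

def search_chunks(chunks: list[dict], question: str, limit: int = 3) -> list[dict]:
    """Stand-in for semantic search: score chunks by keyword overlap with the question."""
    words = set(question.lower().split())
    scored = sorted(chunks, key=lambda c: len(words & set(c["text"].lower().split())), reverse=True)
    return scored[:limit]

def build_prompt(question: str, snippets: list[dict]) -> str:
    """Inject the retrieved snippets into the prompt that the semantic function will receive."""
    context = "\n\n".join(f"// {s['id']}\n{s['text']}" for s in snippets)
    return f"Relevant code from the repository:\n{context}\n\nTask: {question}\n"

if __name__ == "__main__":
    repo_chunks = index_repo(".")
    question = "Add a retry policy to the HTTP client"
    prompt = build_prompt(question, search_chunks(repo_chunks, question))
    print(prompt[:500])

The same structure holds when you swap in real semantic memory: the only parts that change are how chunks are stored and how relevance is scored.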
LLM output:

# Copyright (c) Microsoft. All rights reserved.

import asyncio

import semantic_kernel as sk

sk_prompt = """
ChatBot can have a conversation with you about any topic.
It can give explicit instructions or say 'I don't know'
when it doesn't know the answer.
{{$chat_history}}
User:> {{$user_input}}
ChatBot:>
"""

kernel = sk.create_kernel()

api_key, org_id = sk.openai_settings_from_dot_env()

# Change the backend to gpt4
kernel.config.add_openai_completion_backend(
    "gpt-4", "text-gpt-4", api_key, org_id
)

prompt_config = sk.PromptTemplateConfig.from_completion_parameters(
    max_tokens=2000, temperature=0.7, top_p=0.4
)

prompt_template = sk.PromptTemplate(
    sk_prompt, kernel.prompt_template_engine, prompt_config
)

function_config = sk.SemanticFunctionConfig(prompt_config, prompt_template)
chat_function = kernel.register_semantic_function("ChatBot", "Chat", function_config)


async def chat() -> None:
    context = sk.ContextVariables()
    context["chat_history"] = ""

    try:
        user_input = input("User:> ")
        context["user_input"] = user_input
    except KeyboardInterrupt:
        print("\n\nExiting chat...")
        return False
    except EOFError:
        print("\n\nExiting chat...")
        return False

    if user_input == "exit":
        print("\n\nExiting chat...")
        return False

    # Change the function to use the gpt4 backend
    answer = await kernel.run_on_vars_async(context, chat_function)
    context["chat_history"] += f"\nUser:> {user_input}\nChatBot:> {answer}\n"
    print(f"ChatBot:> {answer}")
    return True


async def main() -> None:
    chatting = True
    while chatting:
        chatting = await chat()


if __name__ == "__main__":
    asyncio.run(main())
-
Note how I had to be very specific in the question ("Ask Smart to Get Smart") to force the LLM to use a GPT-4 model ID. Also, the model name is wrong because the LLM doesn't have that information. That aside, the LLM can generate code, which is pretty much how GitHub Copilot works. Then it's up to you whether to trust the generated code (I would not just run it; you'll need some form of validation, e.g. as sketched below).
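One lightweight form of validation, as a minimal sketch (not a substitute for review or tests): reject generated Python that does not even parse, using only the standard library, before anything else looks at it.

# Minimal sanity check for LLM-generated Python: refuse anything that fails to
# parse. This is only a first gate; it says nothing about whether the code is
# safe or correct, so keep human review and tests on top of it.
import ast

def is_syntactically_valid(generated_code: str) -> bool:
    """Return True if the generated source parses as Python."""
    try:
        ast.parse(generated_code)
        return True
    except SyntaxError as err:
        print(f"Generated code rejected: {err}")
        return False

if __name__ == "__main__":
    sample = "def add(a, b):\n    return a + b\n"
    print(is_syntactically_valid(sample))  # True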
-
@dluc, I appreciate your guidance and clarification; it has been both inspiring and beneficial for me. I'm eager to explore our AI-driven project using the Semantic Kernel approach. Thank you so much!
-
@1openwindow thanks for raising this and thanks to @dluc for answering! Closing this discussion.