Generative AI: Integrate the OpenAI API with Python


Posted by Pierre-Edouard Guerin · 14 min read · Published on May 27, 2024

I was fortunate to attend Sven Warris's workshop on software tools for integrating generative AI into your own work and applications. The course is aimed at data scientists and bioinformaticians. The workshop uses the ChatGPT web interface for prompt engineering and testing ideas, and the OpenAI API from Python. Participants receive a temporary API key before the workshop, which expires afterwards.

Although we focus on ChatGPT, concepts such as prompt engineering, API access and retrieval-augmented generation also apply to many open-source models found on Hugging Face, such as Llama 3, the powerful new LLM from Meta.

Introduction

Process of Tokenizing and Embedding

Applications and Developments in Generative AI

Prompt engineering using the ChatGPT web interface

The ChatGPT Web Interface is an accessible and user-friendly platform provided by OpenAI, designed to interact with GPT (Generative Pre-trained Transformer) models, specifically tailored for conversation.

Historically, we have relied on search engines like Google to find relevant websites for our queries. Now, we explore how asking a similar question differs when using a Large Language Model (LLM) like ChatGPT.

LLMs are much better at answering fully constructed sentences. When the model generates its answer, the wording of the question provides context, which leads to a better and more to-the-point answer.

What usually works best is to view an LLM as a human assistant and write the prompt as if you were addressing a co-worker:

Could you elaborate on …
I have written the following lines of Python code. It should connect to the API, but I’m getting an internal server error: my code.

It takes a bit of getting used to, but it generally improves the quality and appropriateness of the answer.

Similar to a human assistant, most LLMs work best when there is a balance between the amount of input given and the length of the requested answer. A prompt such as:

Write a full report on the concept of natural selection.

will of course provide an answer, but it will most likely result in a short and generic report about natural selection.

A common strategy is to have a conversation with the LLM on the topic of interest. So you could start with:

I'm writing a report on natural selection in plants. The target audience is first-year biology students, so it needs to be clear and without too many abstract concepts. What are key topics I need to address in this report?

followed by prompts asking for additional information on these topics, to write specific sections such as the introduction, etc. You can also ask it to change the tone of voice, explain concepts and summarize sections.
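When such a conversation is later moved from the web interface into code, it can be represented as a list of role-tagged messages, which is the shape the chat endpoint used further below expects. The following is only a minimal sketch: `build_conversation` is a hypothetical helper name, and the assistant reply is a placeholder rather than real model output.

```python
def build_conversation(system, turns):
    """Turn (user_prompt, assistant_reply) pairs into a chat message list.

    A reply of None marks the prompt that is still waiting for an answer.
    """
    messages = [{"role": "system", "content": system}]
    for user_prompt, assistant_reply in turns:
        messages.append({"role": "user", "content": user_prompt})
        if assistant_reply is not None:
            messages.append({"role": "assistant", "content": assistant_reply})
    return messages


messages = build_conversation(
    "You are a helpful writing assistant.",
    [
        ("I'm writing a report on natural selection in plants. "
         "What are key topics I need to address?",
         "(placeholder for the model's first answer)"),
        ("Please write the introduction for first-year biology students.", None),
    ],
)
print([m["role"] for m in messages])
```

Keeping the earlier turns in the list is what gives the follow-up prompt its context, just as the chat history does in the web interface.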

Ethical considerations

When using ChatGPT through the web interface, there are some important things you need to take into consideration:

The OpenAI API

The OpenAI API provides access to advanced artificial intelligence models developed by OpenAI, including the GPT (Generative Pre-trained Transformer) series and other specialized models. This API allows developers to integrate state-of-the-art natural language processing capabilities into their applications and services.

Installing and connecting OpenAI

To use the OpenAI API from Python, you need to install the OpenAI Python module, along with python-dotenv and requests, which are used in the examples below:

pip install openai
pip install python-dotenv
pip install requests

Setting up environment

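import os
import openai
from openai import OpenAI
import dotenv

# Load OPENAI_API_KEY (and any other settings) from a local .env file
dotenv.load_dotenv(".env", override=True)

# Read the key from the environment instead of writing it in the code
openai.api_key = os.getenv("OPENAI_API_KEY")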
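If the .env file is missing or the variable name is misspelled, `os.getenv` silently returns `None` and the error only shows up at the first API call. A small sketch of failing early and of logging the key safely is given below; `load_api_key` and `mask_key` are hypothetical helper names, and the key in the example is made up.

```python
import os


def load_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key


def mask_key(key, visible=4):
    """Hide all but the last few characters so the key is safe to log."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


# Example with a fake environment and a made-up key
fake_env = {"OPENAI_API_KEY": "sk-example-1234"}
print(mask_key(load_api_key(fake_env)))
```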

Security Advice

The API key is directly linked to your account and, more importantly, to your credit card. So there are some security issues you need to take into account:

Creating a first prompt

OpenAI offers several models you can use in your LLM application. Generally speaking, newer models work better than older ones, but they are usually also more expensive. Check the documentation for the most recent models and costs. For now we can select:

MODELS = ["gpt-3.5-turbo-1106", "gpt-4", "gpt-4o", "gpt-3.5-turbo-16k"]
MODEL = MODELS[0]
# Pass the key loaded from .env rather than pasting it into the code
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def query_openai(client, prompt, model):
    # Send a fixed system message plus the user's prompt to the chat endpoint
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You will be asked to help with programming questions."
            },
            {
                "role": "user",
                "content": prompt
            }
        ],
        # Cap the length of the reply
        max_tokens=256
    )
    return response.choices[0].message.content

First basic example usage:

prompt_text = "Provide some Perl code to write `Hello world` in five different ways."
response_text = query_openai(client, prompt_text, MODEL)
print(response_text)
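Because every call costs credits, it can help to check the plumbing of `query_openai` without contacting the API. The sketch below repeats the function so that it is self-contained and passes in a fake client. `FakeCompletions` is a hypothetical stand-in that only mimics the attributes the function touches, and the canned reply is not real model output.

```python
from types import SimpleNamespace


def query_openai(client, prompt, model):
    # Same shape as the function above, repeated so the sketch is self-contained
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You will be asked to help with programming questions."},
            {"role": "user", "content": prompt},
        ],
        max_tokens=256,
    )
    return response.choices[0].message.content


class FakeCompletions:
    """Hypothetical stand-in that records the request and returns a canned reply."""

    def __init__(self):
        self.last_request = None

    def create(self, **kwargs):
        self.last_request = kwargs
        message = SimpleNamespace(content="print 'Hello world';")
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_completions = FakeCompletions()
fake_client = SimpleNamespace(chat=SimpleNamespace(completions=fake_completions))

answer = query_openai(fake_client, "Say hello in Perl.", "gpt-3.5-turbo-1106")
print(answer)
```

Since the client is passed in as an argument, the same function works unchanged with the real `OpenAI` client.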

In this second example we get data from a public API and ask ChatGPT to reason about it. First, we need code to access the API. In our case we use the Gene Ontology database, but you can of course use another API. The idea is to collect data from Gene Ontology and produce a summary and a report using the OpenAI API.

import requests

def get_go(GOterm):
    # Make a GET request to the Gene Ontology (QuickGO) API
    response = requests.get("http://www.ebi.ac.uk/QuickGO/services/ontology/go/terms/{}".format(GOterm))

    # Check if the request was successful
    if response.status_code == 200:
        # Convert the response to JSON
        data = response.json()
        return data
    else:
        # Report the failure and return None so the caller can check for it
        print("Failed to get information for {}".format(GOterm))
        return None

Let's have ChatGPT make a summary of the data. First we need a more general function to query the OpenAI API.

def query_openai(client, system, prompt, model):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": system
            },
            {
                "role": "user",
                "content": prompt
            }
        ],
        max_tokens=256
    )
    return response.choices[0].message.content

Let's put them together.

my_go_term = "GO:0030445"
system_prompt = "You are a skilled biologist and a good lecturer."

go_json = get_go(my_go_term)
summary = query_openai(client=client, 
                       system=system_prompt, 
                       prompt="I have this json from the Gene Ontology database. Could you create a nice summary of the biological data? There is no need to comment on the structure of the JSON: `{}`".format(go_json), 
                       model=MODEL)
print(summary)
report = query_openai(client=client,
                      system=system_prompt,
                      prompt="Given the summary of this GO term, could you provide a markdown report on this GO term with more background information? Please only markdown, no other comments or explanations. `{}`".format(summary),
                      model=MODEL)
print(report)

From that point you can automate the writing of reports for any Gene Ontology term.
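Sending the full JSON spends tokens on fields the summary does not need. Below is a minimal sketch of trimming the response first. It assumes the QuickGO payload has a `results` list whose entries carry `id`, `name`, `aspect` and `definition.text`, so check a real response before relying on these field names. `compact_go_term` is a hypothetical helper, and the sample record is illustrative rather than real database content.

```python
def compact_go_term(data):
    """Keep only the fields that matter for a biological summary.

    Returns None when the payload is missing or has no results, which also
    covers the case where the download failed.
    """
    if not data or not data.get("results"):
        return None
    term = data["results"][0]
    return {
        "id": term.get("id"),
        "name": term.get("name"),
        "aspect": term.get("aspect"),
        "definition": (term.get("definition") or {}).get("text"),
    }


# Illustrative payload in the assumed shape (not real database content)
sample = {
    "results": [{
        "id": "GO:0000000",
        "name": "example term",
        "aspect": "cellular_component",
        "definition": {"text": "An example definition."},
        "history": ["many", "fields", "the", "prompt", "does", "not", "need"],
    }]
}
print(compact_go_term(sample))
```

Checking for `None` before building the prompt also avoids asking the model to summarize a failed download.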

Retrieval Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is a technique used in natural language processing that enhances the capabilities of generative models by combining them with retrieval mechanisms. This approach integrates information retrieved from a large knowledge base or data set into the generation process. Typically, the user's question is used to retrieve the most relevant passages, and these passages are added to the prompt before the model generates its answer.

RAG generates more informed, accurate, and contextually rich outputs. This is particularly useful in applications requiring factual correctness and depth, such as question answering, content creation, and summarization tasks.
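The retrieve-then-generate idea can be illustrated without any API call. The sketch below ranks text chunks by plain word overlap with the question and places the best match in front of the question. Word overlap is a deliberately simple stand-in for the embedding-based similarity search that real RAG systems usually rely on, and all function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, chunks, top_k=1):
    """Rank chunks by how many words they share with the question."""
    q_words = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q_words & tokenize(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, chunks):
    """Retrieve the best chunks and place them in front of the question."""
    context = "\n\n".join(retrieve(question, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "RECIPE: Sweet Pepper Soup. Fry onion and garlic, add tomatoes and peppers.",
    "RECIPE: Carrot Salad. Grate carrots, mix with olive oil and lemon juice.",
]
prompt = build_prompt("Which recipe uses garlic and peppers?", chunks)
print(prompt)
```

Only the retrieved chunk reaches the model, which keeps the prompt short but means the answer can be no better than the retrieval step.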

Here we build a RAG assistant: a chef that knows a small set of recipes.

recipes_text = """
RECIPE: Sweet Pepper Soup
INGREDIENTS:
- 1 red onion
- 1 clove garlic
- 1 can peeled tomatoes
- 2 grilled peppers
METHOD:
1. Fry onion and garlic.
2. Add tomatoes and stock.
3. Add peppers and blend.

RECIPE: Carrot Salad
INGREDIENTS:
- 3 carrots
- Olive oil
- Lemon juice
METHOD:
1. Grate carrots.
2. Mix with oil and lemon.
"""

Create the assistant and give it the recipes as context, so that it answers like a cook:

assistant = client.beta.assistants.create(
    name="La Chef",
    instructions=(
        "You are a professional chef. Here are some recipes you know:\n\n"
        + recipes_text +
        "\n\nAnswer only by referring to the recipes you know."
    ),
    model="gpt-4o"
)

We ask the chef assistant for a recipe with garlic:

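# Create a thread and add the question as a user message; run-level
# `instructions` would override the assistant's instructions (the recipes)
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Please provide a recipe which includes some garlic.",
)
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=assistant.id,
    event_handler=EventHandler(),
) as stream:
    stream.until_done()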

Further fine-tune the instructions

You might have noticed that the assistant falls back on general knowledge if the document does not contain a matching recipe. Instruct the assistant to ignore unrelated requests and to stick to the supplied recipes:

# vector_store is assumed to be an existing vector store that holds the recipe document
psychorigid_assistant = client.beta.assistants.create(
  name="La Chef psychorigide",
  instructions="You are an expert cook. You say 'ATCHOUM' every time after you say the word 'tomato'. You have access to recipes of Bejo Zaden. Ignore any request that is not related to cooking or the recipes in the document. Stick to the supplied recipes in the document.",
  model="gpt-4o",
  tools=[{"type": "file_search"}],
)

psychorigid_assistant = client.beta.assistants.update(
  assistant_id=psychorigid_assistant.id,
  tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)


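# Ask the question as a user message so that the assistant's instructions stay in force
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="I don't have a can of peeled tomatoes. Can you suggest a proper substitute?",
)
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=psychorigid_assistant.id,
    event_handler=EventHandler(),
) as stream:
    stream.until_done()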

Image generation and processing through the API

OpenAI models can analyze images to extract information, recognize objects or write a description in natural language. This is called image analysis. The API can also generate images, allowing users to create new images from textual descriptions or modify existing images.

In this example, we use the chef assistant to provide a recipe and then ask DALL-E 3 to create an image based on this recipe. You need to change the event handler for this: instead of printing the message, you need to store it.

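from typing_extensions import override
from openai import AssistantEventHandler

class EventHandler(AssistantEventHandler):
    @override
    def on_text_created(self, text) -> None:
        # Start with an empty message for each new reply
        self.message = ""
        print("\nassistant running ", end="", flush=True)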

    @override
    def on_tool_call_created(self, tool_call):
        print(f"\nassistant > {tool_call.type}\n", flush=True)

    @override
    def on_message_done(self, message) -> None:
        # Store the reply, replacing file annotations with numbered citations
        message_content = message.content[0].text
        annotations = message_content.annotations
        citations = []
        for index, annotation in enumerate(annotations):
            message_content.value = message_content.value.replace(
                annotation.text, f"[{index}]"
            )
            if file_citation := getattr(annotation, "file_citation", None):
                cited_file = client.files.retrieve(file_citation.file_id)
                citations.append(f"[{index}] {cited_file.filename}")

        self.message += message_content.value + "\n"
        self.message += "\n".join(citations)

event_handler = EventHandler()

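# Ask the question as a user message so that the assistant's instructions stay in force
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Please provide a recipe which includes some garlic and perhaps peppers.",
)
with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=assistant.id,
    event_handler=event_handler,
) as stream:
    stream.until_done()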

print(event_handler.message)
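response = client.images.generate(
  model="dall-e-3",
  # f-string, so that the recipe text is actually inserted into the prompt
  prompt=f"Please provide a photo-realistic plate of food, given the following description and ingredients: `{event_handler.message}`",
  size="1024x1024",
  quality="standard",
  n=1,
)

# Show the generated image inside a notebook
from IPython.display import display, HTML
image_url = response.data[0].url
display(HTML(f'<img src="{image_url}" alt="Image" />'))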

Figure: image generated by DALL-E 3 from the garlic recipe.
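The recipe text returned by the assistant can be long, while image endpoints limit the prompt length. The sketch below trims the description before it is sent. The default of 4000 characters is an assumption about the DALL-E 3 limit that should be checked against the current API reference, and `build_image_prompt` is a hypothetical helper name.

```python
def build_image_prompt(description, max_chars=4000):
    """Build an image prompt and cut the description so the total fits max_chars."""
    prefix = ("Please provide a photo-realistic plate of food, given the "
              "following description and ingredients: ")
    budget = max_chars - len(prefix)
    if budget <= 0:
        raise ValueError("max_chars is too small for the fixed prompt prefix")
    # Collapse whitespace so line breaks in the recipe do not waste characters
    compact = " ".join(description.split())
    return prefix + compact[:budget]


long_recipe = "Sweet Pepper Soup with garlic.\n" * 500
image_prompt = build_image_prompt(long_recipe)
print(len(image_prompt))
```

Cutting at a fixed character count can drop the end of the method section, so for long recipes it may be better to pass only the title and the ingredient list.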

Processing large documents with langchain

LangChain is a framework for building LLM applications. Combined with the OpenAI API, it can be used to analyze, summarize and generate content from documents that are too large to fit into a single prompt.

We apply LangChain's map-reduce chain to summarize and discuss large documents. The main difference between RAG and map-reduce is that RAG uses the prompt to select only the relevant parts of the data, whereas map-reduce eventually sends the entire document to the LLM. This makes map-reduce much more complete, but also much slower and more expensive, so you need to decide whether this completeness is warranted for your application.

We analyze scientific papers because they follow a specific format, which makes them relatively easy to process.
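Stripped of the framework, map-reduce over a document is a small loop: ask the question of every chunk, then ask the model to merge the partial answers. The sketch below shows only that control flow. `llm` is any callable that maps a prompt string to an answer string, and `fake_llm` is a stand-in that makes no API call, so the printed count illustrates why the cost grows with the number of chunks.

```python
def map_reduce(chunks, question, llm):
    """Ask the question of every chunk (map), then merge the answers (reduce).

    `llm` is any callable that takes a prompt string and returns a string.
    """
    partial_answers = [llm(f"{chunk}\n\n{question}") for chunk in chunks]
    combined = "\n".join(partial_answers)
    return llm(f"Combine these partial answers into one:\n{combined}")


calls = []


def fake_llm(prompt):
    """Stand-in model: records the prompt and echoes its first line in upper case."""
    calls.append(prompt)
    return prompt.splitlines()[0].upper()


result = map_reduce(["intro text", "methods text", "results text"],
                    "Summarize this part.", fake_llm)
print(len(calls), "LLM calls")
```

Three chunks lead to four calls: one per chunk plus the final reduce step.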

You need to install the following modules:

pip install langchain langchain-community html2text tiktoken langchain-openai pypdf

Langchain needs to know the model we would like to use and how much data it can send to the LLM:

MODEL = "gpt-4o"
chunk = 10000  # target number of tokens sent to the LLM per map call

And these are the modules we will be using:

import time  # used for the pause between API calls in runMapReduce

from langchain_community.document_transformers import Html2TextTransformer
from langchain_community.document_loaders import PyPDFLoader
from langchain.chains.llm import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains import ReduceDocumentsChain, MapReduceDocumentsChain
from langchain_openai import ChatOpenAI
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

PDFs are built for printing, not so much for data processing, so extracting text from a PDF can be challenging. The same holds true for HTML pages. Some PDFs will therefore parse nicely, while others might create a mess. Given your set of documents, you might need to try out several PDF readers and parsers to find the one that gives the best results. In this case we will be using PyPDFLoader.

After reading the PDF we need to split it up into chunks. Picking the optimal chunk_size is tricky and depends on the context length and the LLM used. Setting it too low, however, will limit the reasoning capabilities of the LLM, because not enough context will be provided.

loader = PyPDFLoader("crop_rotation_sugar_beet.pdf")
docs = loader.load()

html2text = Html2TextTransformer()
docs = html2text.transform_documents(docs)

text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=chunk, chunk_overlap=0
)
split_docs = text_splitter.split_documents(docs)
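To see what chunk size and overlap do, here is a small stand-alone sketch of the splitting step. It counts whitespace-separated words as a stand-in for tokens (the splitter above counts tokens with tiktoken), and `split_into_chunks` is a hypothetical helper, not part of LangChain.

```python
def split_into_chunks(text, chunk_size, overlap=0):
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words, so that a sentence cut at a
    boundary still appears with some context in the next chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


text = " ".join(f"w{i}" for i in range(10))
print(split_into_chunks(text, chunk_size=4, overlap=1))
```

A larger overlap preserves more context across boundaries, at the price of sending some text to the LLM twice.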

Next we need to run the map-reduce approach on our document. Note: we add a sleep() to the method to avoid making too many calls to the API in a short time. It is also possible to get a subscription to the OpenAI API with much higher limits.

Create the map and reduce templates:

llm = ChatOpenAI(temperature=0.3, model_name=MODEL, streaming=True)
map_template = """The following is a set of documents which combined form a full scientific paper and should therefore be considered as one long, single paper.
{docs}
{question}
Helpful Answer:"""

reduce_template = """{description}
{doc_summaries}
{question}
Helpful Answer:"""

We need to define a method which will chain the map-reduce approach.

def runMapReduce(map_template, reduce_template, docs, llm, model = "gpt-4o"):
    map_prompt = PromptTemplate.from_template(map_template)
    map_chain = LLMChain(llm=llm, prompt=map_prompt )

    # Reduce
    reduce_prompt = PromptTemplate.from_template(reduce_template)
    reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)

    # Takes a list of documents, combines them into a single string, and passes this to an LLMChain
    combine_documents_chain = StuffDocumentsChain(
        llm_chain=reduce_chain, document_variable_name="doc_summaries"
    )

    #print("Reduce phase")
    # Combines and iteratively reduces the mapped documents
    reduce_documents_chain = ReduceDocumentsChain(
        # This is final chain that is called.
        combine_documents_chain=combine_documents_chain,
        # If documents exceed context for `StuffDocumentsChain`
        collapse_documents_chain=combine_documents_chain,
        # The maximum number of tokens to group documents into.
        #token_max=tokens,
    )

    #print("Mapping phase")
    # Combining documents by mapping a chain over them, then combining results
    map_reduce_chain = MapReduceDocumentsChain(
        # Map chain
        llm_chain=map_chain,
        # Reduce chain
        reduce_documents_chain=reduce_documents_chain,
        # The variable name in the llm_chain to put the documents in
        document_variable_name="docs",
        # Return the results of the map steps in the output
        return_intermediate_steps=False,
    )

    if model == "gpt-4o": # let's wait for a while
        time.sleep(10)

    return(map_reduce_chain.run(docs)) 
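A fixed `time.sleep(10)` always waits the full ten seconds, even when the previous call already took a while. One possible refinement, sketched below, is to wait only for the remainder of a minimum interval. `Throttle` is a hypothetical helper, and the example uses a fake clock and a fake sleep function so that it finishes instantly.

```python
import time


class Throttle:
    """Make sure at least min_interval seconds pass between two calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        """Sleep only for the part of the interval that has not yet elapsed."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


# Demonstration with a fake clock so the example finishes instantly
fake_time = [0.0]
naps = []


def fake_sleep(seconds):
    naps.append(seconds)
    fake_time[0] += seconds


throttle = Throttle(10, clock=lambda: fake_time[0], sleep=fake_sleep)
throttle.wait()        # first call: no waiting
fake_time[0] += 4      # pretend the API call took 4 seconds
throttle.wait()        # second call: waits for the remaining 6 seconds
print(naps)
```

With the default arguments, calling `throttle.wait()` before each map-reduce run would pause on the real clock instead.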

Using the map-reduce method we can tell langchain to process the entire paper. Let’s start with some basic information, and extract information on authors and journal. You need to provide information to the two templates and call runMapReduce.

map_q = "Please identify the publisher of this scientific paper and the authors of this paper. Provide some background on the journal. Is it for example considered high impact? What are generally the topics and results shared in this journal? If you can not extract this information from this part of the text, just provide an empty string as answer."
reduce_qDescription = "The following contains an author list and information on the journal from a scientific paper:"
reduce_q = "Take these and provide only the author list and information on the first mentioned journal and publisher. The output needs to be in Markdown file format."
result = runMapReduce(
    map_template.format(question=map_q, docs="{docs}"),
    reduce_template.format(description=reduce_qDescription, question=reduce_q, doc_summaries="{doc_summaries}"),
    split_docs,
    llm=llm,
    model=MODEL,
)
print(result)

The map-reduce method presented here is not very efficient: for each map-reduce question the entire paper is processed, even if you only ask the LLM to summarize the introduction, for example. If the document contains information on its structure, using that is highly recommended. Another trick is to ask only the questions that are still relevant: if you decide, based on the themes mentioned, that the paper is of no interest, there is no point in processing the remaining questions.

And although processing a paper like this might seem costly (a lengthy paper can cost around 1 euro), the efficiency gain can be enormous.
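The idea of skipping the remaining questions can be sketched as a simple loop with an early exit. In the sketch below, `ask` stands for whatever sends a question through the map-reduce chain, `is_relevant` is a check you define yourself, and `fake_ask` is a stand-in that makes no API call. All names and the canned answer are hypothetical.

```python
def answer_questions(questions, ask, is_relevant):
    """Ask questions in order and stop as soon as the paper looks irrelevant.

    questions   : list of (name, prompt) pairs; the first one is the screening question
    ask         : callable that sends a prompt to the LLM and returns its answer
    is_relevant : callable that decides from the first answer whether to go on
    """
    answers = {}
    for index, (name, prompt) in enumerate(questions):
        answers[name] = ask(prompt)
        if index == 0 and not is_relevant(answers[name]):
            break  # skip the remaining, paid, questions
    return answers


questions = [
    ("themes", "List the main themes of this paper."),
    ("methods", "Summarize the methods."),
    ("results", "Summarize the results."),
]

asked = []


def fake_ask(prompt):
    """Stand-in for a map-reduce run; always reports the same themes."""
    asked.append(prompt)
    return "soil chemistry, irrigation"


answers = answer_questions(questions, fake_ask,
                           is_relevant=lambda text: "crop rotation" in text)
print(list(answers), len(asked))
```

Here the screening answer does not mention crop rotation, so only the first question is sent and the two remaining runs are saved.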

Conclusion

Pros

Cons

To conclude: Generative AI excels at... generating content, but it has many limitations.
