Generative AI: Integrate the OpenAI API with Python

Posted by Pierre-Edouard Guerin · 14 min read · Published on May 27, 2024

I was fortunate to attend Sven Warris's workshop on software tools for integrating generative AI into your own work and applications. The course is aimed at data scientists and bioinformaticians. The workshop uses the ChatGPT web interface for prompt engineering and testing ideas, and the OpenAI API from Python. A temporary API key is supplied to participants before the workshop and expires after it.

Although we focus on ChatGPT, concepts such as prompt engineering, API access and retrieval augmented generation also apply to many open-source models found on Hugging Face, such as Llama 3, the powerful new LLM provided by Meta.

Introduction

Artificial Intelligence (AI): Technology aimed at mimicking human abilities like reasoning, learning, problem-solving, and perception.
- Machine Learning (ML): Uses algorithms and statistical models to execute tasks based on pattern recognition and inference, without explicit instructions.
- Statistics: The foundational science for analyzing and interpreting data, essential for AI development.
- Deep Learning (DL): A subset of ML that utilizes layered neural networks, excelling in complex tasks such as speech and image recognition.
- Generative AI: Focuses on creating new content (e.g., images, texts, sounds) from trained data, emphasizing the versatility in output types.

Process of Tokenizing and Embedding

Tokenization: Breaking text down into smaller units called tokens, e.g. words or parts of words.
Embedding: Converts tokens into numerical vectors that represent both the literal and contextual meanings of the tokens. This process is critical for machine understanding and manipulation of language.
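To make the two steps concrete, here is a toy sketch in plain Python. It is not how an LLM does it: real tokenizers split text into sub-word units and real embeddings are learned vectors. Here tokens are lowercase words and the "embedding" is a simple word-count vector. The function names and example sentences are assumptions made for illustration.

```python
import math
import re

def tokenize(text):
    """Split text into lowercase word tokens (a stand-in for a real sub-word tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def embed(tokens, vocabulary):
    """Toy embedding: count how often each vocabulary word occurs in the tokens."""
    return [tokens.count(word) for word in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

sentences = [
    "The plant needs water",
    "Water the plant daily",
    "Python reads the API key",
]
token_lists = [tokenize(s) for s in sentences]
vocabulary = sorted({token for tokens in token_lists for token in tokens})
vectors = [embed(tokens, vocabulary) for tokens in token_lists]

# Sentences that share more words end up with more similar vectors
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

Once text is represented as vectors, "similar meaning" becomes "small angle between vectors", which is the property that retrieval relies on.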

Applications and Developments in Generative AI

Prompt Engineering: Often referred to as prompt hacking, it involves crafting queries that guide AI to produce desired outcomes. Used in various applications from grant proposals to blog posts and image generation.
Data Analytics and Application Integration: LLMs can be integrated with APIs like OpenWeather for real-time data processing and activity suggestions based on weather conditions.
Automated Document Processing: Tools like Langchain and ChatGPT can process and generate structured outputs from documents such as scientific papers.

Prompt engineering using the ChatGPT web interface

The ChatGPT Web Interface is an accessible and user-friendly platform provided by OpenAI, designed to interact with GPT (Generative Pre-trained Transformer) models, specifically tailored for conversation.

Historically, we have relied on search engines like Google to find relevant websites for our queries. Now, we explore how asking a similar question differs when using a Large Language Model (LLM) like ChatGPT.

LLMs are much better at answering fully constructed sentences than loose keywords. The complete question gives the model context, which leads to a better and more to-the-point answer.

What usually works best is to view an LLM as a human assistant and write the prompt as if you are addressing a co-worker:

Could you elaborate on …
I have written the following lines of Python code. It should connect to the API, but I’m getting an internal server error: my code.

It takes some getting used to, but it will noticeably increase the quality and appropriateness of the answer.

As with a human assistant, most LLMs work best when there is a balance between the amount of input given and the length of the requested answer. A prompt such as:

Write a full report on the concept of natural selection.

will of course provide an answer, but it will most likely result in a short and generic report about natural selection.

A common strategy is to have a conversation with the LLM on the topic of interest. So you could start with:

I'm writing a report on natural selection in plants. The target audience is first-year biology students, so it needs to be clear and without too many abstract concepts. What are key topics I need to address in this report?

followed by prompts asking for additional information on these topics, to write specific sections such as the introduction, etc. You can also ask it to change the tone of voice, explain concepts and summarize sections.
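The same conversational strategy carries over to the API, with one difference: the chat completions endpoint used later in this post does not remember earlier turns, so the caller has to send the history along with every request as a list of role/content messages. A minimal sketch of keeping such a history, in which the helper names start_conversation and add_turn and the system prompt are illustrative assumptions:

```python
def start_conversation(system_prompt):
    """Return a new message history that starts with the system prompt."""
    return [{"role": "system", "content": system_prompt}]

def add_turn(messages, role, content):
    """Append one user or assistant turn and return the updated history."""
    if role not in ("user", "assistant"):
        raise ValueError(f"Unexpected role: {role}")
    messages.append({"role": role, "content": content})
    return messages

history = start_conversation("You are a helpful biology teaching assistant.")
add_turn(history, "user", "I'm writing a report on natural selection in plants "
                          "for first-year students. What are the key topics?")
# In a real session this would be the text returned by the model
add_turn(history, "assistant", "(the model's answer would be stored here)")
add_turn(history, "user", "Please draft the introduction based on those topics.")

for message in history:
    print(f"{message['role']:>9}: {message['content']}")
```

The whole history list is what would be passed as the messages argument of a request, which is also why long conversations become more expensive over time.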

Ethical considerations

When using ChatGPT through the web interface, there are some important things you need to take into consideration:

- Any code, idea or concept you add to the prompt, or that is generated by ChatGPT, might be used for training. This means that you effectively share it with all users of ChatGPT. The code you provided could well be generated later for another user working on a similar approach.
- Never share personal information, passwords, API keys / tokens, or private company information.
- ChatGPT (or any other tool from OpenAI, such as DALL-E) is designed not to produce results that are inappropriate or that might violate copyrights. Generally speaking, this is a good idea. However, what is considered inappropriate by some might be a key cultural aspect to others, and is open to debate; for ChatGPT, OpenAI has the final say. Many other (open-source) LLMs generally do not have these filters, so it is then up to the user to decide what is inappropriate. When you ask DALL-E to generate images that might contain copyrighted material, it will likewise refuse, or make something up instead. You can always ask ChatGPT or DALL-E why you are not getting what you expected.
- Open-source LLMs that are becoming very good, such as Llama 3, can run on your local infrastructure. This means that all data, communications and results are kept private.

The OpenAI API

The OpenAI API provides access to advanced artificial intelligence models developed by OpenAI, including the GPT (Generative Pre-trained Transformer) series and other specialized models. This API allows developers to integrate state-of-the-art natural language processing capabilities into their applications and services.

Installing and connecting OpenAI

To make use of OpenAI, you need to install the OpenAI Python module, along with python-dotenv and requests, which are used in the examples below:

pip install openai
pip install python-dotenv
pip install requests

Setting up environment

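import os
import openai
from openai import OpenAI
import dotenv

# Load variables such as OPENAI_API_KEY from a local .env file
dotenv.load_dotenv(".env", override=True)

# Read the key from the environment instead of hard-coding it
openai.api_key = os.getenv("OPENAI_API_KEY")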

Security Advice

The API key is directly linked to your account and, more importantly, to your credit card. So there are some security issues you need to take into account:

Never add a key to your code which you will push to a git repository. Once it is in there, it is very difficult to remove, due to the nature of the version control software. Also, public repositories are actively scanned by hackers for passwords, keys, etc.
Create separate keys for each of your projects. This way you can not only track usage much better (and detect potential misuse), but in the event your key has been exposed or hacked, you only need to update that single project.
Read keys from the environment or from a (secrets) file. In cloud infrastructure you can set these environment variables outside your code, function or VM, for example.
By using the API you are providing access to advanced models for which you normally have to pay when using the ChatGPT web interface. This means that users might try to use your (web) interface to access these models for free.
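The advice on reading keys from the environment can be sketched as a small helper that fails early when the key is missing and never prints the key in full. This is only an illustration: the function names load_api_key and mask_key are assumptions, not part of the OpenAI library.

```python
import os

def load_api_key(var_name="OPENAI_API_KEY"):
    """Read an API key from the environment and fail loudly if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return key

def mask_key(key, visible=4):
    """Return the key with all but the last few characters replaced by '*' for logging."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]

# Demonstration with a dummy value; a real key would come from the shell or a .env file
os.environ["DEMO_API_KEY"] = "sk-test-1234"
print(mask_key(load_api_key("DEMO_API_KEY")))
```

Failing at start-up is usually preferable to discovering a missing key halfway through a long batch of requests.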

Creating a first prompt

OpenAI has several options available to use for your LLM application. Generally speaking, newer models work better than older ones but are usually also more expensive. Check the documentation for the most recent models and costs. For now we can select:

MODELS = ["gpt-3.5-turbo-1106", "gpt-4", "gpt-4o", "gpt-3.5-turbo-16k"]
MODEL = MODELS[0]
# Take the key from the environment rather than writing it in the code
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def query_openai(client, prompt, model):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "You will be asked to help with programming questions."
            },
            {
                "role": "user",
                "content": prompt
            }
        ],
        max_tokens=256
    )
    return response.choices[0].message.content

First basic example usage:

prompt_text = "Provide some Perl code to write `Hello world` in five different ways."
response_text = query_openai(client, prompt_text, MODEL)
print(response_text)

In this second example we will get data from a public API and ask ChatGPT to reason about it. First, we need the code to access the API. In our case we are looking at the Gene Ontology database, but you can of course use another API as well. The idea is to collect the data from Gene Ontology and to produce a summary and a report using the OpenAI API.

import requests

def get_go(GOterm):
    # Make a GET request to the gene ontology API (QuickGO)
    response = requests.get(
        "http://www.ebi.ac.uk/QuickGO/services/ontology/go/terms/{}".format(GOterm),
        timeout=30,  # do not wait forever if the service does not respond
    )

    # Check if the request was successful
    if response.status_code == 200:
        # Convert the response to JSON
        return response.json()

    # Report the failure and return None so that the caller can skip this term
    print("Failed to get information for {}".format(GOterm))
    return None

Let's have ChatGPT make a summary of the data. First we need a more general function to query the OpenAI API.

def query_openai(client, system, prompt, model):
    response = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": system
            },
            {
                "role": "user",
                "content": prompt
            }
        ],
        max_tokens=256
    )
    return response.choices[0].message.content

Let's put them together.

my_go_term = "GO:0030445"
system_prompt = "You are a skilled biologist and a good lecturer."

go_json = get_go(my_go_term)
summary = query_openai(client=client, 
                       system=system_prompt, 
                       prompt="I have this json from the Gene Ontology database. Could you create a nice summary of the biological data? There is no need to comment on the structure of the JSON: `{}`".format(go_json), 
                       model=MODEL)
print(summary)

report = query_openai(client=client,
                      system=system_prompt,
                      prompt="Given the summary of this GO term, could you provide a markdown report on this GO term with more background information? Please only markdown, no other comments or explanations. `{}`".format(summary),
                      model=MODEL)
print(report)

From that point on, you can automate the writing of reports for any Gene Ontology term.
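One possible shape for that automation is a loop that takes the fetching and summarizing steps as functions, so that it can be tried out without network access. This is a sketch under assumptions: build_reports is a hypothetical helper, and the two stand-in functions below only imitate get_go and query_openai.

```python
def build_reports(go_terms, fetch, summarize):
    """Fetch each GO term and summarize it; terms that cannot be fetched are skipped."""
    reports = {}
    for term in go_terms:
        data = fetch(term)
        if data is None:  # the fetch step signals failure by returning None
            continue
        reports[term] = summarize(data)
    return reports

# Stand-ins so the sketch runs offline; they are not real API calls
def fake_fetch(term):
    known = {"GO:0030445": {"id": "GO:0030445"}}
    return known.get(term)

def fake_summarize(data):
    return "Summary for {}".format(data["id"])

reports = build_reports(["GO:0030445", "GO:unknown"], fake_fetch, fake_summarize)
print(reports)
```

In the real workflow, fetch would be get_go and summarize would be a small function that wraps the two query_openai calls shown above.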

Retrieval Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) is a technique used in natural language processing that enhances the capabilities of generative models by combining them with retrieval mechanisms. This approach integrates information retrieved from a large knowledge base or data set into the generation process. Here’s how it typically works:

Retrieval Phase: When a query or prompt is received, the model first performs a search to retrieve relevant documents or data snippets from a structured database or a large corpus of text. This retrieval is based on the similarity of the content to the input query, ensuring that the information is pertinent to the task at hand.
Augmentation: The retrieved content is then used to augment the input to the generative model. This augmentation can provide additional context, facts, or examples that are not inherently known by the model but are useful for generating accurate and contextually relevant responses.
Generation Phase: Equipped with both the original input and the retrieved information, the generative model (often a transformer-based model) synthesizes this information to produce a coherent and contextually enriched output. This output aims to reflect both the direct query and the supplementary information obtained through retrieval.

RAG generates more informed, accurate, and contextually rich outputs. This is particularly useful in applications requiring factual correctness and depth, such as question answering, content creation, and summarization tasks.
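The three phases can be imitated in a few lines of plain Python. The sketch below is a deliberate simplification: retrieval is done by counting shared words rather than by comparing embeddings, and the generation phase is left out, so the result is only the augmented prompt that would be sent to the model. The two documents are the ingredient lists of the recipes used in this post; the function names are assumptions.

```python
import re

DOCUMENTS = {
    "Sweet Pepper Soup": "red onion, garlic, peeled tomatoes, grilled peppers",
    "Carrot Salad": "carrots, olive oil, lemon juice",
}

def words(text):
    """Return the set of lowercase words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, documents, top_k=1):
    """Retrieval phase: rank documents by the number of words shared with the query."""
    query_words = words(query)
    scored = [
        (len(query_words & words(title + " " + body)), title)
        for title, body in documents.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for score, title in scored[:top_k] if score > 0]

def build_prompt(query, documents, top_k=1):
    """Augmentation phase: put the retrieved text in front of the question."""
    titles = retrieve(query, documents, top_k)
    context = "\n".join(f"{title}: {documents[title]}" for title in titles)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Which recipe uses garlic?", DOCUMENTS))
```

Only the soup recipe reaches the prompt, which is the point of retrieval: the model sees the relevant part of the data instead of all of it.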

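Here we build a small RAG-style example: an assistant that plays a chef and is supplied with a few recipes. In this first version the recipes are simply pasted into the assistant's instructions; the file_search variant further down retrieves them from a vector store instead.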

recipes_text = """
RECIPE: Sweet Pepper Soup
INGREDIENTS:
- 1 red onion
- 1 clove garlic
- 1 can peeled tomatoes
- 2 grilled peppers
METHOD:
1. Fry onion and garlic.
2. Add tomatoes and stock.
3. Add peppers and blend.

RECIPE: Carrot Salad
INGREDIENTS:
- 3 carrots
- Olive oil
- Lemon juice
METHOD:
1. Grate carrots.
2. Mix with oil and lemon.
"""

Create the assistant and give it the recipes as context, so that it answers like a chef:

assistant = client.beta.assistants.create(
    name="La Chef",
    instructions=(
        "You are a professional chef. Here are some recipes you know:\n\n"
        + recipes_text +
        "\n\nAnswer only by referring to the recipes you know."
    ),
    model="gpt-4o"
)

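We ask the chef assistant for a recipe with garlic. The question is added to a new thread as a user message, because passing it as instructions to the run would override the assistant's own instructions, and with them the recipes. EventHandler is a subclass of AssistantEventHandler, such as the one defined in the image generation section below:

# Start a thread that contains the user's question
thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "Please provide a recipe which includes some garlic."}]
)

with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=assistant.id,
    event_handler=EventHandler(),
) as stream:
    stream.until_done()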

Further fine-tune the instructions

You might have noticed that the assistant will fall back on general knowledge if the document does not contain a matching recipe. Instruct the assistant to:

- Ignore any request that is not related to cooking or the recipes in the document.
- Answer general cooking questions, but stick to the supplied recipes.

# Assumption: vector_store already exists and holds the uploaded recipe document
psychorigid_assistant = client.beta.assistants.create(
  name="La Chef psychorigide",
  instructions="You are an expert cook. You say 'ATCHOUM' every time after you said the word 'tomato'. You have access to recipes of Bejo Zaden. Ignore any request that is not related to cooking or the recipes in the document. Stick to the supplied recipes in the document.",
  model="gpt-4o",
  tools=[{"type": "file_search"}],
)

# Attach the vector store so that file_search can retrieve from the recipes
psychorigid_assistant = client.beta.assistants.update(
  assistant_id=psychorigid_assistant.id,
  tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)


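# Add the question to the existing thread as a user message; passing it as
# run-level instructions would override the assistant's own instructions
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="I don't have a can of peeled tomatoes. Can you suggest a proper substitute?",
)

with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=psychorigid_assistant.id,
    event_handler=EventHandler(),
) as stream:
    stream.until_done()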

Image generation and processing through the API

OpenAI models can analyze images to extract information, recognize objects, or write a description in natural language. This is called image analysis. Conversely, the API can also generate images, allowing users to create new images from textual descriptions or modify existing ones.

In this example, we use the chef assistant to provide a recipe and then ask DALL-E 3 to create an image based on this recipe. You need to change the event handler for this: instead of printing the message, you need to store it.

from typing_extensions import override
from openai import AssistantEventHandler

class EventHandler(AssistantEventHandler):
    @override
    def on_text_created(self, text) -> None:
        self.message = ""
        print(f"\nassistant running ", end="", flush=True)

    @override
    def on_tool_call_created(self, tool_call):
        print(f"\nassistant > {tool_call.type}\n", flush=True)

    @override
    def on_message_done(self, message) -> None:
        # print a citation to the file searched
        message_content = message.content[0].text
        annotations = message_content.annotations
        citations = []
        for index, annotation in enumerate(annotations):
            message_content.value = message_content.value.replace(
                annotation.text, f"[{index}]"
            )
            if file_citation := getattr(annotation, "file_citation", None):
                cited_file = client.files.retrieve(file_citation.file_id)
                citations.append(f"[{index}] {cited_file.filename}")

        self.message += message_content.value + "\n"
        self.message += "\n".join(citations)

event_handler = EventHandler()

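# Add the request to the existing thread as a user message; passing it as
# run-level instructions would override the assistant's own instructions
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Please provide a recipe which includes some garlic and perhaps peppers.",
)

with client.beta.threads.runs.stream(
    thread_id=thread.id,
    assistant_id=assistant.id,
    event_handler=event_handler,
) as stream:
    stream.until_done()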

from IPython.display import HTML, display  # assumes the code runs in a Jupyter/IPython notebook

print(event_handler.message)
response = client.images.generate(
  model="dall-e-3",
  # f-string, so that the recipe text is actually inserted into the prompt
  prompt=f"Please provide a photo-realistic plate of food, given the following description and ingredients: `{event_handler.message}`",
  size="1024x1024",
  quality="standard",
  n=1,
)

image_url = response.data[0].url
display(HTML(f'<img src="{image_url}" alt="Image" />'))

[Image: picture generated by DALL-E 3 for the garlic recipe]

Processing large documents with LangChain

LangChain is a powerful tool designed to handle and process large documents effectively, especially when combined with the capabilities of the OpenAI API. This combination allows for advanced document analysis, summarization, and content generation.

We apply LangChain's map-reduce chaining to summarize and discuss large documents. The main difference between RAG and map-reduce is that RAG uses the prompt to identify the relevant parts of the data, whereas map-reduce ends up sending the entire document to the LLM. This makes map-reduce much more complete, but also much slower and more expensive, so you need to determine whether this completeness is warranted for your application.
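Stripped of any library, the map-reduce idea looks roughly like the sketch below. The two stand-in functions only imitate LLM calls (the first keeps the first sentence of a chunk, the second joins the partial results), and a counter shows where the cost comes from: one call per chunk plus one for the reduce step. All names here are illustrative assumptions, not LangChain functions.

```python
def map_reduce(chunks, map_fn, reduce_fn):
    """Apply map_fn to every chunk, then combine all partial results with reduce_fn."""
    partial_results = [map_fn(chunk) for chunk in chunks]
    return reduce_fn(partial_results)

calls = {"count": 0}  # counts the simulated LLM calls

def fake_llm_summary(text):
    """Stand-in for the map prompt: keep only the first sentence of the chunk."""
    calls["count"] += 1
    return text.split(".")[0].strip() + "."

def combine(summaries):
    """Stand-in for the reduce prompt: merge the partial summaries into one text."""
    calls["count"] += 1
    return " ".join(summaries)

chunks = [
    "Chunk one opens here. More detail follows.",
    "Chunk two opens here. Even more detail.",
]
summary = map_reduce(chunks, fake_llm_summary, combine)
print(summary)
print("simulated LLM calls:", calls["count"])
```

Every chunk is processed regardless of the question, which is what makes the approach complete but costly compared with retrieving only the relevant parts.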

We analyze scientific papers, as they follow a specific format, making it relatively easy to process.

You need to install the following modules:

pip install langchain langchain-community html2text tiktoken langchain-openai pypdf

LangChain needs to know the model we would like to use and how much data it can send to the LLM:

MODEL = "gpt-4o"
chunk = 10000 # maximum number of tokens sent to the LLM per map call

And these are the modules we will be using:

import time  # needed for the sleep() call in runMapReduce
from langchain.document_transformers import Html2TextTransformer
from langchain.document_loaders import PyPDFLoader
from langchain.chains.llm import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.text_splitter import CharacterTextSplitter
from langchain.chains import ReduceDocumentsChain, MapReduceDocumentsChain
from langchain_openai import ChatOpenAI
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

PDFs are built for printing, not for data processing, so extracting text from a PDF can be challenging. The same holds true for HTML pages. Some PDFs will therefore parse nicely, while others might create a mess. Given your set of documents, you might need to try out several PDF readers and parsers to find the one that gives the best results. In this case we will be using PyPDFLoader.

After reading the PDF we need to split it up into chunks. Picking the optimal chunk_size is tricky and depends on the context length and the LLM used. Setting it too low, however, will limit the reasoning capabilities of the LLM, because not enough context will then be provided.
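To show what splitting with a chunk size and an overlap does, here is a library-free sketch. It works on a plain list of items, with whitespace-separated words standing in for the tokens that a real tokenizer would count. The function name split_into_chunks is an assumption and not part of LangChain.

```python
def split_into_chunks(items, chunk_size, overlap=0):
    """Split a list into chunks of at most chunk_size items; consecutive chunks share `overlap` items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    chunks = []
    start = 0
    while start < len(items):
        chunks.append(items[start:start + chunk_size])
        if start + chunk_size >= len(items):
            break  # the chunk just added reached the end of the input
        start += chunk_size - overlap
    return chunks

text = "one two three four five six seven eight nine ten"
for piece in split_into_chunks(text.split(), chunk_size=4, overlap=1):
    print(" ".join(piece))
```

A small overlap repeats the last few items of one chunk at the start of the next, so that a sentence cut at a chunk boundary still appears with some of its context.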

loader = PyPDFLoader("crop_rotation_sugar_beet.pdf")
docs = loader.load()

html2text = Html2TextTransformer()
docs = html2text.transform_documents(docs)

text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=chunk, chunk_overlap=0
)
split_docs = text_splitter.split_documents(docs)

Next, we need to run the map-reduce approach on our document. Note: we will add a sleep() to the method to prevent too many calls to the API in a short time. It is possible to have a subscription to the OpenAI API with much higher limits, though.
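A fixed sleep() is the simplest way to stay under a rate limit. A slightly more flexible variant waits only as long as needed to keep a minimum interval between calls. The sketch below is one way to do that; the Throttle class is a hypothetical helper, and the clock and sleep functions can be swapped out so that it can be tried without actually waiting.

```python
import time

class Throttle:
    """Enforce a minimum number of seconds between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Sleep just long enough that min_interval has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now

# Demonstration with a simulated clock, so nothing really sleeps
simulated_time = [0.0]
naps = []

def fake_clock():
    return simulated_time[0]

def fake_sleep(seconds):
    naps.append(seconds)
    simulated_time[0] += seconds

throttle = Throttle(10, clock=fake_clock, sleep=fake_sleep)
throttle.wait()  # first call: no waiting
throttle.wait()  # second call right away: waits the full interval
print(naps)
```

In real use, throttle.wait() would be called just before each request to the API, with the default clock and sleep functions.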

Create the map and reduce templates:

llm = ChatOpenAI(temperature=0.3, model_name=MODEL, streaming=True)
map_template = """The following is a set of documents which combined form a full scientific paper and should therefore be considered as one long, single paper.
{docs}
{question}
Helpful Answer:"""

reduce_template = """{description}
{doc_summaries}
{question}
Helpful Answer:"""

We need to define a method which will chain the map-reduce approach.

def runMapReduce(map_template, reduce_template, docs, llm, model = "gpt-4o"):
    map_prompt = PromptTemplate.from_template(map_template)
    map_chain = LLMChain(llm=llm, prompt=map_prompt )

    # Reduce
    reduce_prompt = PromptTemplate.from_template(reduce_template)
    reduce_chain = LLMChain(llm=llm, prompt=reduce_prompt)

    # Takes a list of documents, combines them into a single string, and passes this to an LLMChain
    combine_documents_chain = StuffDocumentsChain(
        llm_chain=reduce_chain, document_variable_name="doc_summaries"
    )

    #print("Reduce phase")
    # Combines and iteratively reduces the mapped documents
    reduce_documents_chain = ReduceDocumentsChain(
        # This is final chain that is called.
        combine_documents_chain=combine_documents_chain,
        # If documents exceed context for `StuffDocumentsChain`
        collapse_documents_chain=combine_documents_chain,
        # The maximum number of tokens to group documents into.
        #token_max=tokens,
    )

    #print("Mapping phase")
    # Combining documents by mapping a chain over them, then combining results
    map_reduce_chain = MapReduceDocumentsChain(
        # Map chain
        llm_chain=map_chain,
        # Reduce chain
        reduce_documents_chain=reduce_documents_chain,
        # The variable name in the llm_chain to put the documents in
        document_variable_name="docs",
        # Return the results of the map steps in the output
        return_intermediate_steps=False,
    )

    if model == "gpt-4o": # let's wait for a while
        time.sleep(10)

    return(map_reduce_chain.run(docs))

Using the map-reduce method we can tell langchain to process the entire paper. Let’s start with some basic information, and extract information on authors and journal. You need to provide information to the two templates and call runMapReduce.

map_q = "Please identify the publisher of this scientific paper and the authors of this paper. Provide some background on the journal. Is it for example considered high impact? What are generally the topics and results shared in this journal? If you can not extract this information from this part of the text, just provide an empty string as answer."
reduce_qDescription = "The following contains an author list and information on the journal from a scientific paper:"
reduce_q = "Take these and provide only the author list and information on the first mentioned journal and publisher. The output needs to be in Markdown file format."
result = runMapReduce(map_template.format(question=map_q, docs="{docs}"), reduce_template.format(description=reduce_qDescription, question = reduce_q, doc_summaries="{doc_summaries}"), split_docs, llm=llm, model=MODEL)
print(result)

The map-reduce method presented here is not very efficient: for each map-reduce question the entire paper is processed, even when you only ask the LLM to summarize the introduction, for example. If the document contains more information on its structure, using that is highly recommended. Another trick could be to process only relevant questions: if you decide, based on the themes mentioned, that the paper is of no interest, there is of course no point in processing the remaining questions.

And although processing a paper like this might seem costly (a lengthy paper can be around 1 euro), the efficiency win can be enormous.

Conclusion

Pros

Many generative AI models exist today, each specialized for different kinds of use cases.
All these models are available online through the Hugging Face community.
It is possible to automate prompt generation and therefore automate text generation with ChatGPT.
For example, instead of manually asking ChatGPT to summarize a single article, you can write a program that asks it to summarize 10,000 articles automatically.
- Sven’s LEGO app: bricks
- Microsoft AI used to discover new battery elements: Unlocking a new era for scientific discovery with AI: How Microsoft’s AI screened over 32 million candidates to find a better battery

Cons

GPT can be useful in plant breeding for tasks such as literature summarization, idea generation, or drafting documentation.
However, its scientific reliability is limited. Prototypes trained on scientific publications already exist, for example BioGPT.
Using GPT at scale has a financial cost. Example: Sven spent 20 euros for 8 participants during a GPT‑4 workshop.
Working with GPT often requires trial‑and‑error to find effective prompts, which increases both time and cost.
GPT models are slow compared to traditional software. For instance, summarizing a 10‑page article took about 1 minute, although this is still faster than a human.
GPT does not return explicit error messages. When it produces incorrect output, we call this a hallucination.
If the input data is incomplete or poorly formatted, GPT is more likely to hallucinate.
It is difficult to force GPT to strictly follow a dataset. During the practical session, it was not possible to restrict the assistant to the provided content only.
GPT outputs are non‑deterministic. The same input can produce different results, making experiments hard to reproduce.
There is no built‑in way to verify whether GPT’s answer is complete or correct. Example: you cannot be sure it retrieved all garlic‑based recipes from the source data.
GPT is a black box. When errors occur, the only lever you can adjust is the prompt.
Prompt engineering is not programming and not natural language. It follows its own logic, with hidden rules such as:
- maintaining a consistent lexical field
- avoiding restricted content (censorship rules)
- respecting copyright constraints
- managing time and context windows
Even though GPT uses human language, you cannot interact with it as with a human.
Using GPT raises concerns about data privacy and confidentiality, especially when handling sensitive or proprietary information.

To conclude: Generative AI excels at... generating content, but it has many limitations.

References

Generative AI workshop: Sven Warris Materials
Example of webapp using openAI API: BIONIC GPT
Another example: assistant based on a PDF, LEGO Set Guide: bricks
Planeks tech blog article: How to Integrate ChatGPT API with Python
LangChain Docs: LangChain python overview

Relevant Tags

software-engineering

About the Author

Pierre-Edouard Guerin

Bioinformatician