An Introduction to OpenAI Function Calling | by David Hundley | Jul, 2023


No more unstructured data outputs; turn ChatGPT’s completions into structured JSON!

David Hundley
Towards Data Science
Title card created by the author

A few months ago, OpenAI released their API to the general public, which excited many developers who wanted to make use of ChatGPT’s outputs in a systematic way. As exciting as this has been, it’s equally been a bit of a nightmare, since we programmers tend to work in the realm of structured data types. We like integers, booleans, and lists. The unstructured string can be unwieldy to deal with, and in order to get consistent results, a programmer is required to face their worst nightmare: developing a regular expression (Regex) for proper parsing. 🤢

Of course, prompt engineering can actually help quite a bit here, but it’s still not perfect. For example, if you want to have ChatGPT analyze the sentiment of a movie review for positivity or negativity, you might structure a prompt that looks like this:

prompt = f'''
Please perform a sentiment analysis on the following movie review:
{MOVIE_REVIEW_TEXT}
Please output your response as a single word: either "Positive" or "Negative".
'''

This prompt honestly does pretty decently, but the results aren’t perfectly consistent. For example, I have seen ChatGPT produce outputs for the movie sentiment example that look like the following:

  • Positive
  • positive
  • Positive.

This might not seem like a big deal, but in the world of programming, those are NOT equal. Again, you can get around a simpler example like this with a bit of Regex, but beyond the fact that most people (including myself) are terrible at writing regular expressions, there are simply some instances where even Regex can’t parse the information correctly.
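To make that concrete, here is the sort of regex-based cleanup you’d be stuck writing for even this simple case (a minimal sketch of my own; the helper name and pattern are illustrative, not anything the API requires):

```python
import re

def normalize_sentiment(raw_output: str) -> str:
    '''Normalizes ChatGPT's free-text sentiment output to "Positive" or "Negative".'''
    # Lowercase and strip whitespace so "Positive", "positive", and "Positive."
    # all collapse to the same match
    match = re.search(r'\b(positive|negative)\b', raw_output.strip().lower())
    if match is None:
        raise ValueError(f'Could not parse sentiment from: {raw_output!r}')
    return match.group(1).capitalize()

print(normalize_sentiment('Positive.'))  # → Positive
print(normalize_sentiment('positive'))   # → Positive
```

This handles the three variants above, but it’s exactly the kind of brittle glue code we’d rather not maintain.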

As you can tell, programmers have been hoping that OpenAI would add functionality to support structured JSON outputs, and OpenAI has delivered in the form of function calling. Function calling is exactly what it sounds like: it allows ChatGPT to produce arguments that can interact with a custom function in a manner that uses structured data types. Yup, no more fancy prompt engineering and Regex while crossing your fingers and hoping you get the right outcome. In this post, we’ll cover how to make use of this new functionality, but first, let’s start with an example of how we used to attempt to produce structured data outputs with prompt engineering and Regex.

Before we jump into the bulk of our post, please allow me to share a link to this Jupyter notebook in my GitHub. This notebook contains all the code I will be running (and more) as part of this blog post. Additionally, I would encourage you to check out OpenAI’s official function calling documentation for anything that I may not cover here.

To demonstrate what we used to do in the “pre-function calling days”, I wrote a small bit of text about myself, and we’ll be using the OpenAI API to extract bits of information from this text. Here is the “About Me” text we’ll be working with:

Hello! My name is David Hundley. I am a principal machine learning engineer at State Farm. I enjoy learning about AI and teaching what I learn back to others. I have two daughters. I drive a Tesla Model 3, and my favorite video game series is The Legend of Zelda.

Let’s say I want to extract the following bits of information from that text:

  • Name
  • Job title
  • Company
  • Number of children as an integer (This is important!)
  • Car make
  • Car model
  • Favorite video game series

Here’s how I would engineer a few-shot prompt in order to produce a structured JSON output:

# Engineering a prompt to extract as much information from "About Me" as a JSON object
about_me_prompt = f'''
Please extract information as a JSON object. Please look for the following pieces of information.
Name
Job title
Company
Number of children as a single integer
Car make
Car model
Favorite video game series

This is the body of text to extract the information from:
{about_me}
'''

# Getting the response back from ChatGPT (gpt-3.5-turbo)
openai_response = openai.ChatCompletion.create(
    model = 'gpt-3.5-turbo',
    messages = [{'role': 'user', 'content': about_me_prompt}]
)

# Loading the response as a JSON object
json_response = json.loads(openai_response['choices'][0]['message']['content'])
json_response

Let’s check out how ChatGPT returned this completion to me:

The “Pre-Function Calling” Days (Captured by the author)

As you can see, this actually isn’t bad. But it’s not ideal and could prove to be risky for the following reasons:

  • We are not guaranteed that OpenAI’s response will provide a clean JSON output. It could have produced something like “Here is your JSON:” followed by the JSON output, meaning that in order to use json.loads() to parse the string into a JSON object, we’d first have to strip out that little bit of text that opens the response.
  • We are not guaranteed that the keys in the key-value pairs of the JSON object will be consistent from API call to API call. Recall the example from above of the 3 instances of the word Positive. This is precisely the same risk you run trying to have ChatGPT parse out keys through few-shot prompt engineering. The only way you could maybe lock this down is with Regex, which comes with its own baggage as we already discussed.
  • We are not guaranteed to receive our responses in the proper data type format. While our prompt engineering to extract number of children did parse into a proper integer, we’re at the mercy of crossing our fingers and hoping we get that consistent result for every API call.
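To make the first risk concrete, here’s the sort of defensive parsing you’d need without function calling (a sketch of my own; the helper name is hypothetical):

```python
import json
import re

def parse_json_from_completion(completion_text: str) -> dict:
    '''Attempts to pull a JSON object out of a free-text ChatGPT completion.'''
    try:
        # Happy path: the completion is pure JSON
        return json.loads(completion_text)
    except json.JSONDecodeError:
        # Fallback: the model wrapped the JSON in prose like "Here is your JSON:"
        match = re.search(r'\{.*\}', completion_text, re.DOTALL)
        if match is None:
            raise ValueError('No JSON object found in completion')
        return json.loads(match.group(0))

clean = parse_json_from_completion('{"name": "David Hundley"}')
wrapped = parse_json_from_completion('Here is your JSON:\n{"name": "David Hundley"}')
```

Both calls produce the same dictionary, but notice we still haven’t addressed the other two risks: inconsistent keys and inconsistent data types.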

We could summarize these issues into a single statement: Without function calling, we are not guaranteed to get consistent results that are important for the precision required for systematic implementation. It’s a nontrivial issue that can be very challenging to remedy through prompt engineering and regular expressions.

Now that we’ve built an intuition around why getting structured outputs from ChatGPT was formerly problematic, let’s move into looking at the new function calling capability introduced by OpenAI.

Function calling is actually a bit of a misnomer. OpenAI is not actually running your code in a true function call. Rather, it’s simply setting up the structured arguments you’d need to execute your own custom functions, and I’d argue this is preferred behavior. While you might be thinking that it doesn’t make sense that the OpenAI API isn’t executing your custom function, consider that in order to do that, you’d have to pass that function code into ChatGPT. This function code probably contains proprietary information that you would NOT want to expose to anybody, which is why it’s good that you don’t actually have to pass this code to make use of OpenAI’s function calling.

Let’s jump into an example of how to enable function calling with a single custom function. Using our “About Me” sample text from the previous section, let’s create a custom function called extract_person_info. This function needs just three bits of information: person name, job title, and number of children. (We’ll revisit extracting the rest of the information in the next section; I just want to start simpler for now.) This custom function is intentionally very simple and will simply take our arguments and print them together in a single string. Here’s the code for this:

def extract_person_info(name, job_title, num_children):
    '''
    Prints basic "About Me" information

    Inputs:
        - name (str): Name of the person
        - job_title (str): Job title of the person
        - num_children (int): The number of children the parent has
    '''

    print(f'This person\'s name is {name}. Their job title is {job_title}, and they have {num_children} children.')

In order to make use of function calling, we need to set up a JSON object in a specific way that notes the name of our custom function and what data elements we are hoping ChatGPT will extract from the body of the text. Because of the specificity on how this JSON object should look, I would encourage you to reference OpenAI’s developer documentation if you want to know any details that I don’t cover here.

(Note: In the OpenAI documentation, the parameters object follows the JSON Schema specification, which includes an optional element called required: a list of property names the model should always supply. In my own testing, I couldn’t observe a clear difference in behavior with and without it, so I’ll transparently admit I’m not certain how strictly it’s enforced. 😅)

Here is how we need to structure our JSON object to make use of our custom function:

my_custom_functions = [
    {
        'name': 'extract_person_info',
        'description': 'Get "About Me" information from the body of the input text',
        'parameters': {
            'type': 'object',
            'properties': {
                'name': {
                    'type': 'string',
                    'description': 'Name of the person'
                },
                'job_title': {
                    'type': 'string',
                    'description': 'Job title of the person'
                },
                'num_children': {
                    'type': 'integer',
                    'description': 'Number of children the person is a parent to'
                }
            }
        }
    }
]
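If you do want to experiment with the required element I mentioned in the note above, here is how it would slot in alongside properties per the JSON Schema convention (a sketch under that assumption; as noted, I can’t vouch for exactly how the model enforces it):

```python
# The extract_person_info schema from above, with an added "required" array.
# Per JSON Schema, "required" lists the property names the model should
# always populate in its arguments.
person_info_schema = {
    'name': 'extract_person_info',
    'description': 'Get "About Me" information from the body of the input text',
    'parameters': {
        'type': 'object',
        'properties': {
            'name': {'type': 'string', 'description': 'Name of the person'},
            'job_title': {'type': 'string', 'description': 'Job title of the person'},
            'num_children': {'type': 'integer', 'description': 'Number of children the person is a parent to'}
        },
        'required': ['name', 'job_title', 'num_children']
    }
}
```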

You’re probably already familiar with JSON syntax, although let me draw attention for a moment to the data type associated to each property. If you are a Python developer like myself, be aware that the data typing for this JSON structure is NOT directly equivalent to how we define data structures in Python. Generally speaking, we can find equivalencies that work out alright, but if you want to know more about the specific data types associated with this JSON structure, check out this documentation.
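As a rough guide, here is my own mapping (not an official table) between the JSON Schema type names used in the function definition and the Python types that json.loads() produces for them:

```python
import json

# Rough correspondence between JSON Schema type names and Python types
JSON_SCHEMA_TO_PYTHON = {
    'string': str,
    'integer': int,
    'number': float,     # JSON Schema "number" allows any numeric value
    'boolean': bool,
    'array': list,
    'object': dict,
    'null': type(None),
}

# Arguments parsed from a function call come back as the matching Python types
args = json.loads('{"name": "David", "num_children": 2}')
assert isinstance(args['num_children'], JSON_SCHEMA_TO_PYTHON['integer'])
```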

Now we’re ready to make our API call to get the results! Using the Python client, you’ll notice the syntax is very similar to how we obtain completions in general. We’re just going to add some additional arguments into this call that represent our function calling:

# Getting the response back from ChatGPT (gpt-3.5-turbo)
openai_response = openai.ChatCompletion.create(
    model = 'gpt-3.5-turbo',
    messages = [{'role': 'user', 'content': about_me}],
    functions = my_custom_functions,
    function_call = 'auto'
)

print(openai_response)

As you can see, we simply pass in our list of custom functions (or in our case for now, our singular custom function) as the functions parameter, and you’ll also notice an additional parameter called function_call that we’ve set to auto. Don’t worry about this for now as we’ll revisit what this auto piece is doing in the next section.

Let’s run this code and take a look at the full API response from ChatGPT:

Function calling with a single function (Captured by the author)

For the most part, this response looks the same as a non-function call response, but now there’s an additional field in the response called function_call, and nested under this dictionary are two additional items: name and arguments. name indicates the name of our custom function that we will be calling with ChatGPT’s output, and arguments contains a string that we can load using json.loads() to load our custom function arguments as a JSON object.
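Pulling those two items out looks like this (sketched against a hard-coded stand-in for the response, with illustrative values, so it runs on its own; a live call would populate openai_response instead):

```python
import json

# A trimmed stand-in for the API response shown above (values illustrative)
openai_response = {
    'choices': [{
        'message': {
            'function_call': {
                'name': 'extract_person_info',
                'arguments': '{"name": "David Hundley", "job_title": "principal machine learning engineer", "num_children": 2}'
            }
        }
    }]
}

function_call = openai_response['choices'][0]['message']['function_call']
function_name = function_call['name']                   # which custom function to invoke
function_args = json.loads(function_call['arguments'])  # arguments as a proper dict

print(function_name)
print(function_args['num_children'])  # a real Python int, not a string
```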

Notice now that we’re getting much more consistency than we were in our pre-function calling methodology. Now we can be guaranteed that the keys of the key-value pairs WILL be consistent, and the data types WILL be consistent. No need for fancy prompt engineering or regular expressions!

That’s the core of OpenAI’s function calling! Of course, this was a very simplistic example to get you going, but you probably have additional questions. Let’s cover those in this next section.

The previous section covered a very simple example of how to enable function calling, but if you’re like me, you probably have some additional questions beyond this point. Naturally, I can’t cover all these questions, but I do want to cover two big ones that are slightly more advanced than what we covered in the previous section.

What if the prompt I submit doesn’t contain the information I want to extract per my custom function?

In our original example, our custom function sought to extract three very specific bits of information, and we demonstrated that this worked successfully by passing in my custom “About Me” text as a prompt. But you might be wondering, what happens if you pass in any other prompt that doesn’t contain that information?

Recall that we set a parameter in our API client call called function_call that we set to auto. We’ll explore this even deeper in the next subsection, but what this parameter is essentially doing is telling ChatGPT to use its best judgment in figuring out when to structure the output for one of our custom functions.

So what happens when we submit a prompt that doesn’t match any of our custom functions? Simply put, it defaults to typical behavior as if function calling doesn’t exist. Let’s test this out with an arbitrary prompt: “How tall is the Eiffel Tower?”

Function calling but with a prompt that doesn’t match the function (Captured by the author)

As you can see, we are getting a typical “Completions” output even though we passed in our custom function. Naturally, this makes sense since this arbitrary Eiffel Tower prompt contains none of the specific information we are looking for.
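In practice, this means your handling code needs to cover both cases. A branch like this does the trick (sketched against a hard-coded stand-in for the response, since a live call needs an API key):

```python
import json

# Stand-in for the response to "How tall is the Eiffel Tower?" — with
# function_call set to 'auto', no custom function matches, so the model
# returns a plain completion (content text is illustrative)
openai_response = {
    'choices': [{
        'message': {
            'role': 'assistant',
            'content': 'The Eiffel Tower is about 330 meters tall.'
        }
    }]
}

message = openai_response['choices'][0]['message']
if message.get('function_call'):
    # The model matched one of our custom functions: parse the arguments
    args = json.loads(message['function_call']['arguments'])
    print('Function arguments:', args)
else:
    # Ordinary completion: no function matched the prompt
    print(message['content'])
```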

What if I want to pass multiple custom functions and some of them have overlapping parameters?

In short, ChatGPT intelligently handles this without a problem. Where we previously passed in one custom function as essentially a list of Python dictionaries, we just need to keep adding additional Python dictionaries to this same list, each representing its own distinct function. Let’s add two new functions: one called extract_vehicle_info and another called extract_all_info. Here’s what our adjusted syntax looks like:

# Defining a function to extract only vehicle information
def extract_vehicle_info(vehicle_make, vehicle_model):
    '''
    Prints basic vehicle information

    Inputs:
        - vehicle_make (str): Make of the vehicle
        - vehicle_model (str): Model of the vehicle
    '''

    print(f'Vehicle make: {vehicle_make}\nVehicle model: {vehicle_model}')

# Defining a function to extract all information provided in the original "About Me" prompt
def extract_all_info(name, job_title, num_children, vehicle_make, vehicle_model, company_name, favorite_vg_series):
    '''
    Prints the full "About Me" information

    Inputs:
        - name (str): Name of the person
        - job_title (str): Job title of the person
        - num_children (int): The number of children the parent has
        - vehicle_make (str): Make of the vehicle
        - vehicle_model (str): Model of the vehicle
        - company_name (str): Name of the company the person works for
        - favorite_vg_series (str): Person's favorite video game series
    '''

    print(f'''
    This person\'s name is {name}. Their job title is {job_title}, and they have {num_children} children.
    They drive a {vehicle_make} {vehicle_model}.
    They work for {company_name}.
    Their favorite video game series is {favorite_vg_series}.
    ''')

# Defining how we want ChatGPT to call our custom functions
my_custom_functions = [
    {
        'name': 'extract_person_info',
        'description': 'Get "About Me" information from the body of the input text',
        'parameters': {
            'type': 'object',
            'properties': {
                'name': {
                    'type': 'string',
                    'description': 'Name of the person'
                },
                'job_title': {
                    'type': 'string',
                    'description': 'Job title of the person'
                },
                'num_children': {
                    'type': 'integer',
                    'description': 'Number of children the person is a parent to'
                }
            }
        }
    },
    {
        'name': 'extract_vehicle_info',
        'description': 'Extract the make and model of the person\'s car',
        'parameters': {
            'type': 'object',
            'properties': {
                'vehicle_make': {
                    'type': 'string',
                    'description': 'Make of the person\'s vehicle'
                },
                'vehicle_model': {
                    'type': 'string',
                    'description': 'Model of the person\'s vehicle'
                }
            }
        }
    },
    {
        'name': 'extract_all_info',
        'description': 'Extract all information about a person including their vehicle make and model',
        'parameters': {
            'type': 'object',
            'properties': {
                'name': {
                    'type': 'string',
                    'description': 'Name of the person'
                },
                'job_title': {
                    'type': 'string',
                    'description': 'Job title of the person'
                },
                'num_children': {
                    'type': 'integer',
                    'description': 'Number of children the person is a parent to'
                },
                'vehicle_make': {
                    'type': 'string',
                    'description': 'Make of the person\'s vehicle'
                },
                'vehicle_model': {
                    'type': 'string',
                    'description': 'Model of the person\'s vehicle'
                },
                'company_name': {
                    'type': 'string',
                    'description': 'Name of the company the person works for'
                },
                'favorite_vg_series': {
                    'type': 'string',
                    'description': 'Name of the person\'s favorite video game series'
                }
            }
        }
    }
]

Notice specifically how the extract_all_info function covers some of the same parameters as our original extract_person_info function, so how does ChatGPT know which one to select? Simply put, ChatGPT looks for the best match. If we pass in a prompt that contains all the arguments needed for the extract_all_info function, that’s the one it’ll select. But if we pass in a prompt that contains just simple information about me, or just information about my vehicle, it’ll leverage the respective narrower function. Let’s execute that in code here with a few samples:

  • Sample 1: The original “About Me” text. (See above.)
  • Sample 2: “My name is David Hundley. I am a principal machine learning engineer, and I have two daughters.”
  • Sample 3: “She drives a Kia Sportage.”
Sample #1’s Results (Captured by the author)
Sample #2’s Results (Captured by the author)
Sample #3’s Results (Captured by the author)

With each of the respective prompts, ChatGPT selected the correct custom function, and we can specifically note that in the name value under function_call in the API’s response object. In addition to this being a handy way to identify which function to use the arguments for, we can programmatically map our actual custom Python function to this value to run the correct code appropriately. If that doesn’t make sense, perhaps looking at this in code would make this more clear:

# The three sample prompts from above
samples = [
    about_me,
    'My name is David Hundley. I am a principal machine learning engineer, and I have two daughters.',
    'She drives a Kia Sportage.'
]

# Iterating over the three samples
for i, sample in enumerate(samples):

    print(f'Sample #{i + 1}\'s results:')

    # Getting the response back from ChatGPT (gpt-3.5-turbo)
    openai_response = openai.ChatCompletion.create(
        model = 'gpt-3.5-turbo',
        messages = [{'role': 'user', 'content': sample}],
        functions = my_custom_functions,
        function_call = 'auto'
    )['choices'][0]['message']

    # Checking to see that a function call was invoked
    if openai_response.get('function_call'):

        # Checking to see which specific function call was invoked
        function_called = openai_response['function_call']['name']

        # Extracting the arguments of the function call
        function_args = json.loads(openai_response['function_call']['arguments'])

        # Invoking the proper function
        if function_called == 'extract_person_info':
            extract_person_info(*list(function_args.values()))
        elif function_called == 'extract_vehicle_info':
            extract_vehicle_info(*list(function_args.values()))
        elif function_called == 'extract_all_info':
            extract_all_info(*list(function_args.values()))

Final programmatic results! (Captured by the author)

Beware one thing: In the spirit of full transparency, I had to run that code multiple times to get it to produce that output. The trouble is that because extract_person_info and extract_all_info are similar in nature, ChatGPT kept confusing the two. I guess the lesson to be learned here is that your functions should extract distinct information. I also only tested using gpt-3.5-turbo, so it’s possible that a more powerful model like GPT-4 could handle this better.

I hope you can see now why function calling can be so powerful! When it comes to building applications that leverage Generative AI, this kind of function calling is a godsend for programmers. By not having to worry so much about the output JSON structure, we can now focus our time on building out other parts of the application. It’s an awesome time to be working in this space 🥳


