Structured Output with Pydantic

import os
# os.environ['ANTHROPIC_LOG'] = 'debug'

model = models[1] # sonnet -- haiku was quite flaky on the failure cases.

Pydantic provides a way of extracting structured outputs from text. This is useful for integrating an LLM as a single component of a pipeline.

Much of this was inspired by Instructor.

Anthropic function calling is based on JSON Schema (with a few small tweaks). You can easily get the the JSON Schema from any Pydantic mdoel with the model_json_schema classmethod.

class User(BaseModel):
    "Create a new user"
    username: str
    password: str
    email: str
    success: bool  = Field(..., description="Indicate if user creation is a success.")
    failure_reason: str = Field(default="", description="Failure reason. This should be \"\" if `success=True`. If `success=False` you **must** give a failure reason.")

User.model_json_schema()

{'description': 'Create a new user',
 'properties': {'username': {'title': 'Username', 'type': 'string'},
  'password': {'title': 'Password', 'type': 'string'},
  'email': {'title': 'Email', 'type': 'string'},
  'success': {'description': 'Indicate if user creation is a success.',
   'title': 'Success',
   'type': 'boolean'},
  'failure_reason': {'default': '',
   'description': 'Failure reason. This should be "" if `success=True`. If `success=False` you **must** give a failure reason.',
   'title': 'Failure Reason',
   'type': 'string'}},
 'required': ['username', 'password', 'email', 'success'],
 'title': 'User',
 'type': 'object'}

Notable differences between JSON Schema and Anthropic’s function calling:

title -> name
properties -> input_schema: { properties }
No title in parameters, instead key is used.

I’m not really sure why they’d break a spec for such small differences. We can create a new claude_schema based on model_json_schema.

source

BaseModel.claude_schema

 BaseModel.claude_schema ()

Create tool schema for claude

Exported source

@patch(cls_method=True)
def claude_schema(cls: BaseModel):
    "Create tool schema for claude"
    def _filter_title(obj):
        if isinstance(obj, dict): return {k:_filter_title(v) for k,v in obj.items() if k != 'title'}
        elif isinstance(obj, list): return [_filter_title(item) for item in obj]
        else: return obj

    schema = cls.model_json_schema()
    name = schema.pop('title')
    try:
        description = schema.pop('description')
    except KeyError:
        raise KeyError("Provide a docstring")
    return {
        "name": name,
        "description": description,
        "input_schema": _filter_title(schema)
    }

User.claude_schema()

{'name': 'User',
 'description': 'Create a new user',
 'input_schema': {'properties': {'username': {'type': 'string'},
   'password': {'type': 'string'},
   'email': {'type': 'string'},
   'success': {'description': 'Indicate if user creation is a success.',
    'type': 'boolean'},
   'failure_reason': {'default': '',
    'description': 'Failure reason. This should be "" if `success=True`. If `success=False` you **must** give a failure reason.',
    'type': 'string'}},
  'required': ['username', 'password', 'email', 'success'],
  'type': 'object'}}

Just using claude_schema, we can now use Pydantic models using only our existing tools:

c = Client(model)
pr = "create a user for sarah adams, email sarah@gmail.com, and give them a strong password"
r = c(pr, tools=[User.claude_schema()], tool_choice=mk_tool_choice('User'))
cts = contents(r)
mod = call_func(cts, ns=[User])
mod

User(username='sarahadams', password='X9#mK2$pL7@qR4', email='sarah@gmail.com', success=True, failure_reason='')

And creating a new chat messages:

mk_msg([mk_funcres(cts.id, mod)])

{'role': 'user',
 'content': [{'type': 'tool_result',
   'tool_use_id': 'toolu_01KUWkLVxiwnkVJPybTbod9f',
   'content': "username='sarahadams' password='X9#mK2$pL7@qR4' email='sarah@gmail.com' success=True failure_reason=''"}]}

Let’s create a nicer function that wraps the Chat.__call__. This takes an unintialized Pydantic BaseModel and returns an initialized BaseModel.

source

Client.struct

 Client.struct (msgs:list, resp_model:type[pydantic.main.BaseModel],
                sp='', temp=0, maxtok=4096, prefill='', stream:bool=False,
                stop=None, tools:Optional[list]=None,
                tool_choice:Optional[dict]=None,
                metadata:MetadataParam|NotGiven=NOT_GIVEN,
                stop_sequences:List[str]|NotGiven=NOT_GIVEN, system:Union[
                str,Iterable[TextBlockParam]]|NotGiven=NOT_GIVEN,
                temperature:float|NotGiven=NOT_GIVEN,
                top_k:int|NotGiven=NOT_GIVEN,
                top_p:float|NotGiven=NOT_GIVEN,
                extra_headers:Headers|None=None,
                extra_query:Query|None=None, extra_body:Body|None=None,
                timeout:float|httpx.Timeout|None|NotGiven=NOT_GIVEN)

Parse Claude output into the Pydantic resp_model

	Type	Default	Details
msgs	list		List of messages in the dialog
resp_model	type		Non-initialized pydantic struct
sp	str		The system prompt
temp	int	0	Temperature
maxtok	int	4096	Maximum tokens
prefill	str		Optional prefill to pass to Claude as start of its response
stream	bool	False	Stream response?
stop	NoneType	None	Stop sequence
tools	Optional	None	List of tools to make available to Claude
tool_choice	Optional	None	Optionally force use of some tool
metadata	MetadataParam \| NotGiven	NOT_GIVEN
stop_sequences	List[str] \| NotGiven	NOT_GIVEN
system	Union[str, Iterable[TextBlockParam]] \| NotGiven	NOT_GIVEN
temperature	float \| NotGiven	NOT_GIVEN
top_k	int \| NotGiven	NOT_GIVEN
top_p	float \| NotGiven	NOT_GIVEN
extra_headers	Headers \| None	None
extra_query	Query \| None	None
extra_body	Body \| None	None
timeout	float \| httpx.Timeout \| None \| NotGiven	NOT_GIVEN
Returns	BaseModel		Initialized pydantic struct

Exported source

@patch
@delegates(Client.__call__)
def struct(self:Client,
             msgs:list, # List of messages in the dialog
             resp_model: type[BaseModel], # Non-initialized pydantic struct
             **kwargs
          ) -> BaseModel: # Initialized pydantic struct
    "Parse Claude output into the Pydantic `resp_model`"
    kwargs["tool_choice"] = mk_tool_choice(resp_model.__name__)
    kwargs["tools"] = [resp_model.claude_schema()] # no other tools needed -- model is forced by tool_choice
    fc = self(msgs=msgs, **kwargs)
    res = _mk_struct(contents(fc).input, resp_model)
    return res

This will always return a BaseModel “struct”

c.struct(pr, resp_model=User)

User(username='sarahadams', password='X9#mK2$pL7@qR4', email='sarah@gmail.com', success=True, failure_reason='')

Even if we try not to:

c.struct('what is 2+2', resp_model=User)

User(username='<UNKNOWN>', password='<UNKNOWN>', email='<UNKNOWN>', success=False, failure_reason="The query is unrelated to user creation. It's a simple arithmetic question.")

Now let’s implement this in Chat. The most non-invasive way I can think of to do this is add a new struct function that adds the function result to the history

source

Chat.struct

 Chat.struct (resp_model:type[pydantic.main.BaseModel],
              treat_as_output=True, sp='', temp=0, maxtok=4096,
              prefill='', stream:bool=False, stop=None,
              tools:Optional[list]=None, tool_choice:Optional[dict]=None,
              metadata:MetadataParam|NotGiven=NOT_GIVEN,
              stop_sequences:List[str]|NotGiven=NOT_GIVEN, system:Union[st
              r,Iterable[TextBlockParam]]|NotGiven=NOT_GIVEN,
              temperature:float|NotGiven=NOT_GIVEN,
              top_k:int|NotGiven=NOT_GIVEN,
              top_p:float|NotGiven=NOT_GIVEN,
              extra_headers:Headers|None=None,
              extra_query:Query|None=None, extra_body:Body|None=None,
              timeout:float|httpx.Timeout|None|NotGiven=NOT_GIVEN)

	Type	Default	Details
resp_model	type		Non-initialized pydantic struct
treat_as_output	bool	True	Usually using a tool
sp	str		The system prompt
temp	int	0	Temperature
maxtok	int	4096	Maximum tokens
prefill	str		Optional prefill to pass to Claude as start of its response
stream	bool	False	Stream response?
stop	NoneType	None	Stop sequence
tools	Optional	None	List of tools to make available to Claude
tool_choice	Optional	None	Optionally force use of some tool
metadata	MetadataParam \| NotGiven	NOT_GIVEN
stop_sequences	List[str] \| NotGiven	NOT_GIVEN
system	Union[str, Iterable[TextBlockParam]] \| NotGiven	NOT_GIVEN
temperature	float \| NotGiven	NOT_GIVEN
top_k	int \| NotGiven	NOT_GIVEN
top_p	float \| NotGiven	NOT_GIVEN
extra_headers	Headers \| None	None
extra_query	Query \| None	None
extra_body	Body \| None	None
timeout	float \| httpx.Timeout \| None \| NotGiven	NOT_GIVEN
Returns	BaseModel		Initialized pydantic struct

Exported source

@patch
@delegates(Client.struct)
def struct(self:Chat,
             resp_model: type[BaseModel], # Non-initialized pydantic struct
             treat_as_output=True, # Usually using a tool
             **kwargs) -> BaseModel:
    self._append_pr(kwargs.pop("pr", None))
    res = self.c.struct(self.h, resp_model=resp_model, **kwargs)
    if treat_as_output:
        msgs = [mk_msg(repr(res), "assistant")] # alternatively: res.json()
    else:
        r = self.c.result
        tool_id = contents(r).id
        msgs = [mk_msg(r, "assistant"),
                mk_msg([mk_funcres(tool_id, repr(res))], "user")]
    self.h += msgs
    return res

gen_pass = True
def generate_password() -> dict:
    """generate a strong user password.

    @returns { "success": <indicates function success>, "pass": <password> }
    """
    if gen_pass:
        return {"success": True, "pass": "qwerty123"}
    return {"success": False, "pass": "<UNKNOWN>"}

sp = """You are a user generation system.
Refer to only the **most recent** user generation request. Do not attend to previous requests.

<instructions>
1. If creating a user with only an email, pick a relevant username.
  a) If no email is given, fail user creation. Do not ask for more information.
2. You must use `generate_password` tool to create passwords. You must NOT create your own passwords.
  a) if `generate_password` returns `success=False` and `pass=<UNKNOWN>` fail user creation.
3. If you are given `tool_choice=User`, refer to <user_creation> for your response. Else respond in plain english.
</instructions>

<user_creation>
if user creation is successful:
    create user using the `User` tool
else: # user creation has failed
    refer to <fail_user>
</user_creation>

<fail_user>
This should only be run if and only if `User` is given in `tool_choice` and user creation has failed.

1. Mark failed fields as <UNKNOWN>
2. Set `success = False`
3. Give relevant details for failure in `failure_reason`
</fail_user>
"""

chat = Chat(model, tools=[generate_password], sp=sp, cont_pr="use the tool specified")
chat("create a user with username jackAdam12 and email jack@email.com")

Certainly! I’ll create a user with the username jackAdam12 and email jack@email.com. To do this, we need to generate a strong password using the generate_password function. Let’s proceed with the user creation process.

id: msg_019SpTmBh9LtM7tERkY4Rxko
content: [{‘text’: “Certainly! I’ll create a user with the username jackAdam12 and email jack@email.com. To do this, we need to generate a strong password using the generate_password function. Let’s proceed with the user creation process.”, ‘type’: ‘text’}, {‘id’: ‘toolu_01LHxVZsNKUSpfkvHGa5u9BX’, ‘input’: {}, ‘name’: ‘generate_password’, ‘type’: ‘tool_use’}]
model: claude-3-5-sonnet-20240620
role: assistant
stop_reason: tool_use
stop_sequence: None
type: message
usage: {‘input_tokens’: 681, ‘output_tokens’: 89}

chat.struct(User)

User(username='jackAdam12', password='qwerty123', email='jack@email.com', success=True, failure_reason='')

Now let’s make the gen_pass function fail:

gen_pass = False
print(generate_password())
chat('cool, can you create another user for sarahjones@gmail.com?')

{'success': False, 'pass': '<UNKNOWN>'}

Certainly! I’ll create a user for sarahjones@gmail.com. Since only the email was provided, I’ll generate a relevant username based on the email address. Then, we’ll use the generate_password function to create a secure password for this user.

First, let’s generate the password:

id: msg_01S6SUDmhmrDKns5nxZRc9Ae
content: [{‘text’: “Certainly! I’ll create a user for sarahjones@gmail.com. Since only the email was provided, I’ll generate a relevant username based on the email address. Then, we’ll use the generate_password function to create a secure password for this user., let’s generate the password:”, ‘type’: ‘text’}, {‘id’: ‘toolu_018LXGQEsMjhEZzrwyW2eJ5b’, ‘input’: {}, ‘name’: ‘generate_password’, ‘type’: ‘tool_use’}]
model: claude-3-5-sonnet-20240620
role: assistant
stop_reason: tool_use
stop_sequence: None
type: message
usage: {‘input_tokens’: 1080, ‘output_tokens’: 103}

We get a user creation failure with an appropiate failure message.

chat.struct(User)

User(username='sarahjones', password='<UNKNOWN>', email='sarahjones@gmail.com', success=False, failure_reason='Unable to generate a secure password')

Finally let’s try to create a user with no email:

gen_pass = True
chat('finally can you create an account for Adam?')

I apologize, but I’m unable to create an account for Adam with just the name provided. To create a user account, we need at least an email address. Without an email address, I cannot proceed with the user creation process as per the instructions I’ve been given.

Here’s why I can’t create the account:

An email address is a required piece of information for user creation.
We don’t have enough information to generate a unique username or to associate the account with a valid email address.
The instructions specifically state that if no email is given, we should fail user creation and not ask for more information.

If you’d like to create an account for Adam, you would need to provide at least an email address. Once you have an email address for Adam, please feel free to ask again, and I’ll be happy to assist you with creating the account.

Is there anything else I can help you with, or would you like to provide an email address for Adam to proceed with account creation?

id: msg_01EHSXCJ4nACtApWNT89fdmy
content: [{‘text’: “I apologize, but I’m unable to create an account for Adam with just the name provided. To create a user account, we need at least an email address. Without an email address, I cannot proceed with the user creation process as per the instructions I’ve been given.‘s why I can’t create the account:. An email address is a required piece of information for user creation.. We don’t have enough information to generate a unique username or to associate the account with a valid email address.. The instructions specifically state that if no email is given, we should fail user creation and not ask for more information.you’d like to create an account for Adam, you would need to provide at least an email address. Once you have an email address for Adam, please feel free to ask again, and I’ll be happy to assist you with creating the account.there anything else I can help you with, or would you like to provide an email address for Adam to proceed with account creation?“, ’type’: ‘text’}]
model: claude-3-5-sonnet-20240620
role: assistant
stop_reason: end_turn
stop_sequence: None
type: message
usage: {‘input_tokens’: 1655, ‘output_tokens’: 217}

chat.struct(User)

User(username='Adam', password='<UNKNOWN>', email='<UNKNOWN>', success=False, failure_reason='Insufficient information provided: missing email address and password')