Structured Output with Pydantic

import os
# os.environ['ANTHROPIC_LOG'] = 'debug'
model = models[1] # sonnet -- haiku was quite flaky on the failure cases.

Pydantic provides a way of extracting structured outputs from text. This is useful for integrating an LLM as a single component of a pipeline.

Much of this was inspired by Instructor.

Anthropic function calling is based on JSON Schema (with a few small tweaks). You can easily get the the JSON Schema from any Pydantic mdoel with the model_json_schema classmethod.

class User(BaseModel):
    "Create a new user"
    username: str
    password: str
    email: str
    success: bool  = Field(..., description="Indicate if user creation is a success.")
    failure_reason: str = Field(default="", description="Failure reason. This should be \"\" if `success=True`. If `success=False` you **must** give a failure reason.")

User.model_json_schema()
{'description': 'Create a new user',
 'properties': {'username': {'title': 'Username', 'type': 'string'},
  'password': {'title': 'Password', 'type': 'string'},
  'email': {'title': 'Email', 'type': 'string'},
  'success': {'description': 'Indicate if user creation is a success.',
   'title': 'Success',
   'type': 'boolean'},
  'failure_reason': {'default': '',
   'description': 'Failure reason. This should be "" if `success=True`. If `success=False` you **must** give a failure reason.',
   'title': 'Failure Reason',
   'type': 'string'}},
 'required': ['username', 'password', 'email', 'success'],
 'title': 'User',
 'type': 'object'}

Notable differences between JSON Schema and Anthropic’s function calling:

I’m not really sure why they’d break a spec for such small differences. We can create a new claude_schema based on model_json_schema.


source

BaseModel.claude_schema

 BaseModel.claude_schema ()

Create tool schema for claude

Exported source
@patch(cls_method=True)
def claude_schema(cls: BaseModel):
    "Create tool schema for claude"
    def _filter_title(obj):
        if isinstance(obj, dict): return {k:_filter_title(v) for k,v in obj.items() if k != 'title'}
        elif isinstance(obj, list): return [_filter_title(item) for item in obj]
        else: return obj

    schema = cls.model_json_schema()
    name = schema.pop('title')
    try:
        description = schema.pop('description')
    except KeyError:
        raise KeyError("Provide a docstring")
    return {
        "name": name,
        "description": description,
        "input_schema": _filter_title(schema)
    }
User.claude_schema()
{'name': 'User',
 'description': 'Create a new user',
 'input_schema': {'properties': {'username': {'type': 'string'},
   'password': {'type': 'string'},
   'email': {'type': 'string'},
   'success': {'description': 'Indicate if user creation is a success.',
    'type': 'boolean'},
   'failure_reason': {'default': '',
    'description': 'Failure reason. This should be "" if `success=True`. If `success=False` you **must** give a failure reason.',
    'type': 'string'}},
  'required': ['username', 'password', 'email', 'success'],
  'type': 'object'}}

Just using claude_schema, we can now use Pydantic models using only our existing tools:

c = Client(model)
pr = "create a user for sarah adams, email sarah@gmail.com, and give them a strong password"
r = c(pr, tools=[User.claude_schema()], tool_choice=mk_tool_choice('User'))
cts = contents(r)
mod = call_func(cts, ns=[User])
mod
User(username='sarahadams', password='X9#mK2$pL7@qR4', email='sarah@gmail.com', success=True, failure_reason='')

And creating a new chat messages:

mk_msg([mk_funcres(cts.id, mod)])
{'role': 'user',
 'content': [{'type': 'tool_result',
   'tool_use_id': 'toolu_01KUWkLVxiwnkVJPybTbod9f',
   'content': "username='sarahadams' password='X9#mK2$pL7@qR4' email='sarah@gmail.com' success=True failure_reason=''"}]}

Let’s create a nicer function that wraps the Chat.__call__. This takes an unintialized Pydantic BaseModel and returns an initialized BaseModel.


source

Client.struct

 Client.struct (msgs:list, resp_model:type[pydantic.main.BaseModel],
                sp='', temp=0, maxtok=4096, prefill='', stream:bool=False,
                stop=None, tools:Optional[list]=None,
                tool_choice:Optional[dict]=None,
                metadata:MetadataParam|NotGiven=NOT_GIVEN,
                stop_sequences:List[str]|NotGiven=NOT_GIVEN, system:Union[
                str,Iterable[TextBlockParam]]|NotGiven=NOT_GIVEN,
                temperature:float|NotGiven=NOT_GIVEN,
                top_k:int|NotGiven=NOT_GIVEN,
                top_p:float|NotGiven=NOT_GIVEN,
                extra_headers:Headers|None=None,
                extra_query:Query|None=None, extra_body:Body|None=None,
                timeout:float|httpx.Timeout|None|NotGiven=NOT_GIVEN)

Parse Claude output into the Pydantic resp_model

Type Default Details
msgs list List of messages in the dialog
resp_model type Non-initialized pydantic struct
sp str The system prompt
temp int 0 Temperature
maxtok int 4096 Maximum tokens
prefill str Optional prefill to pass to Claude as start of its response
stream bool False Stream response?
stop NoneType None Stop sequence
tools Optional None List of tools to make available to Claude
tool_choice Optional None Optionally force use of some tool
metadata MetadataParam | NotGiven NOT_GIVEN
stop_sequences List[str] | NotGiven NOT_GIVEN
system Union[str, Iterable[TextBlockParam]] | NotGiven NOT_GIVEN
temperature float | NotGiven NOT_GIVEN
top_k int | NotGiven NOT_GIVEN
top_p float | NotGiven NOT_GIVEN
extra_headers Headers | None None
extra_query Query | None None
extra_body Body | None None
timeout float | httpx.Timeout | None | NotGiven NOT_GIVEN
Returns BaseModel Initialized pydantic struct
Exported source
@patch
@delegates(Client.__call__)
def struct(self:Client,
             msgs:list, # List of messages in the dialog
             resp_model: type[BaseModel], # Non-initialized pydantic struct
             **kwargs
          ) -> BaseModel: # Initialized pydantic struct
    "Parse Claude output into the Pydantic `resp_model`"
    kwargs["tool_choice"] = mk_tool_choice(resp_model.__name__)
    kwargs["tools"] = [resp_model.claude_schema()] # no other tools needed -- model is forced by tool_choice
    fc = self(msgs=msgs, **kwargs)
    res = _mk_struct(contents(fc).input, resp_model)
    return res

This will always return a BaseModel “struct”

c.struct(pr, resp_model=User)
User(username='sarahadams', password='X9#mK2$pL7@qR4', email='sarah@gmail.com', success=True, failure_reason='')

Even if we try not to:

c.struct('what is 2+2', resp_model=User)
User(username='<UNKNOWN>', password='<UNKNOWN>', email='<UNKNOWN>', success=False, failure_reason="The query is unrelated to user creation. It's a simple arithmetic question.")

Now let’s implement this in Chat. The most non-invasive way I can think of to do this is add a new struct function that adds the function result to the history


source

Chat.struct

 Chat.struct (resp_model:type[pydantic.main.BaseModel],
              treat_as_output=True, sp='', temp=0, maxtok=4096,
              prefill='', stream:bool=False, stop=None,
              tools:Optional[list]=None, tool_choice:Optional[dict]=None,
              metadata:MetadataParam|NotGiven=NOT_GIVEN,
              stop_sequences:List[str]|NotGiven=NOT_GIVEN, system:Union[st
              r,Iterable[TextBlockParam]]|NotGiven=NOT_GIVEN,
              temperature:float|NotGiven=NOT_GIVEN,
              top_k:int|NotGiven=NOT_GIVEN,
              top_p:float|NotGiven=NOT_GIVEN,
              extra_headers:Headers|None=None,
              extra_query:Query|None=None, extra_body:Body|None=None,
              timeout:float|httpx.Timeout|None|NotGiven=NOT_GIVEN)
Type Default Details
resp_model type Non-initialized pydantic struct
treat_as_output bool True Usually using a tool
sp str The system prompt
temp int 0 Temperature
maxtok int 4096 Maximum tokens
prefill str Optional prefill to pass to Claude as start of its response
stream bool False Stream response?
stop NoneType None Stop sequence
tools Optional None List of tools to make available to Claude
tool_choice Optional None Optionally force use of some tool
metadata MetadataParam | NotGiven NOT_GIVEN
stop_sequences List[str] | NotGiven NOT_GIVEN
system Union[str, Iterable[TextBlockParam]] | NotGiven NOT_GIVEN
temperature float | NotGiven NOT_GIVEN
top_k int | NotGiven NOT_GIVEN
top_p float | NotGiven NOT_GIVEN
extra_headers Headers | None None
extra_query Query | None None
extra_body Body | None None
timeout float | httpx.Timeout | None | NotGiven NOT_GIVEN
Returns BaseModel Initialized pydantic struct
Exported source
@patch
@delegates(Client.struct)
def struct(self:Chat,
             resp_model: type[BaseModel], # Non-initialized pydantic struct
             treat_as_output=True, # Usually using a tool
             **kwargs) -> BaseModel:
    self._append_pr(kwargs.pop("pr", None))
    res = self.c.struct(self.h, resp_model=resp_model, **kwargs)
    if treat_as_output:
        msgs = [mk_msg(repr(res), "assistant")] # alternatively: res.json()
    else:
        r = self.c.result
        tool_id = contents(r).id
        msgs = [mk_msg(r, "assistant"),
                mk_msg([mk_funcres(tool_id, repr(res))], "user")]
    self.h += msgs
    return res
gen_pass = True
def generate_password() -> dict:
    """generate a strong user password.

    @returns { "success": <indicates function success>, "pass": <password> }
    """
    if gen_pass:
        return {"success": True, "pass": "qwerty123"}
    return {"success": False, "pass": "<UNKNOWN>"}

sp = """You are a user generation system.
Refer to only the **most recent** user generation request. Do not attend to previous requests.

<instructions>
1. If creating a user with only an email, pick a relevant username.
  a) If no email is given, fail user creation. Do not ask for more information.
2. You must use `generate_password` tool to create passwords. You must NOT create your own passwords.
  a) if `generate_password` returns `success=False` and `pass=<UNKNOWN>` fail user creation.
3. If you are given `tool_choice=User`, refer to <user_creation> for your response. Else respond in plain english.
</instructions>

<user_creation>
if user creation is successful:
    create user using the `User` tool
else: # user creation has failed
    refer to <fail_user>
</user_creation>

<fail_user>
This should only be run if and only if `User` is given in `tool_choice` and user creation has failed.

1. Mark failed fields as <UNKNOWN>
2. Set `success = False`
3. Give relevant details for failure in `failure_reason`
</fail_user>
"""

chat = Chat(model, tools=[generate_password], sp=sp, cont_pr="use the tool specified")
chat("create a user with username jackAdam12 and email jack@email.com")

Certainly! I’ll create a user with the username jackAdam12 and email jack@email.com. To do this, we need to generate a strong password using the generate_password function. Let’s proceed with the user creation process.

  • id: msg_019SpTmBh9LtM7tERkY4Rxko
  • content: [{‘text’: “Certainly! I’ll create a user with the username jackAdam12 and email jack@email.com. To do this, we need to generate a strong password using the generate_password function. Let’s proceed with the user creation process.”, ‘type’: ‘text’}, {‘id’: ‘toolu_01LHxVZsNKUSpfkvHGa5u9BX’, ‘input’: {}, ‘name’: ‘generate_password’, ‘type’: ‘tool_use’}]
  • model: claude-3-5-sonnet-20240620
  • role: assistant
  • stop_reason: tool_use
  • stop_sequence: None
  • type: message
  • usage: {‘input_tokens’: 681, ‘output_tokens’: 89}
chat.struct(User)
User(username='jackAdam12', password='qwerty123', email='jack@email.com', success=True, failure_reason='')

Now let’s make the gen_pass function fail:

gen_pass = False
print(generate_password())
chat('cool, can you create another user for sarahjones@gmail.com?')
{'success': False, 'pass': '<UNKNOWN>'}

Certainly! I’ll create a user for sarahjones@gmail.com. Since only the email was provided, I’ll generate a relevant username based on the email address. Then, we’ll use the generate_password function to create a secure password for this user.

First, let’s generate the password:

  • id: msg_01S6SUDmhmrDKns5nxZRc9Ae
  • content: [{‘text’: “Certainly! I’ll create a user for sarahjones@gmail.com. Since only the email was provided, I’ll generate a relevant username based on the email address. Then, we’ll use the generate_password function to create a secure password for this user., let’s generate the password:”, ‘type’: ‘text’}, {‘id’: ‘toolu_018LXGQEsMjhEZzrwyW2eJ5b’, ‘input’: {}, ‘name’: ‘generate_password’, ‘type’: ‘tool_use’}]
  • model: claude-3-5-sonnet-20240620
  • role: assistant
  • stop_reason: tool_use
  • stop_sequence: None
  • type: message
  • usage: {‘input_tokens’: 1080, ‘output_tokens’: 103}

We get a user creation failure with an appropiate failure message.

chat.struct(User)
User(username='sarahjones', password='<UNKNOWN>', email='sarahjones@gmail.com', success=False, failure_reason='Unable to generate a secure password')

Finally let’s try to create a user with no email:

gen_pass = True
chat('finally can you create an account for Adam?')

I apologize, but I’m unable to create an account for Adam with just the name provided. To create a user account, we need at least an email address. Without an email address, I cannot proceed with the user creation process as per the instructions I’ve been given.

Here’s why I can’t create the account:

  1. An email address is a required piece of information for user creation.
  2. We don’t have enough information to generate a unique username or to associate the account with a valid email address.
  3. The instructions specifically state that if no email is given, we should fail user creation and not ask for more information.

If you’d like to create an account for Adam, you would need to provide at least an email address. Once you have an email address for Adam, please feel free to ask again, and I’ll be happy to assist you with creating the account.

Is there anything else I can help you with, or would you like to provide an email address for Adam to proceed with account creation?

  • id: msg_01EHSXCJ4nACtApWNT89fdmy
  • content: [{‘text’: “I apologize, but I’m unable to create an account for Adam with just the name provided. To create a user account, we need at least an email address. Without an email address, I cannot proceed with the user creation process as per the instructions I’ve been given.‘s why I can’t create the account:. An email address is a required piece of information for user creation.. We don’t have enough information to generate a unique username or to associate the account with a valid email address.. The instructions specifically state that if no email is given, we should fail user creation and not ask for more information.you’d like to create an account for Adam, you would need to provide at least an email address. Once you have an email address for Adam, please feel free to ask again, and I’ll be happy to assist you with creating the account.there anything else I can help you with, or would you like to provide an email address for Adam to proceed with account creation?“, ’type’: ‘text’}]
  • model: claude-3-5-sonnet-20240620
  • role: assistant
  • stop_reason: end_turn
  • stop_sequence: None
  • type: message
  • usage: {‘input_tokens’: 1655, ‘output_tokens’: 217}
chat.struct(User)
User(username='Adam', password='<UNKNOWN>', email='<UNKNOWN>', success=False, failure_reason='Insufficient information provided: missing email address and password')