Integrating LLMs
What are we trying to accomplish?
In this lesson, students transition from using LLMs through a browser interface to integrating them programmatically — the way real applications work. Using Python and the Gemini API, students will authenticate with a hosted LLM provider, construct prompts as code, and send them over an API. The second lecture extends this foundation with four production-relevant capabilities: structured outputs that return typed data instead of plain text, multi-turn conversations with persistent message history, multimodal inputs that incorporate images alongside text, and thinking mode for problems requiring deeper reasoning. By the end of this lesson, students will have a working Python integration that reflects the full range of what a real AI-powered feature requires.
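The first step described above can be sketched in a few lines. This is a minimal, hedged example assuming the `google-genai` package is installed and a `GEMINI_API_KEY` environment variable is set; the specific model name is an illustrative choice, not a course requirement.

```python
import os

def ask_gemini(prompt: str) -> str:
    """Send a single-turn prompt to the Gemini API and return the text reply."""
    # Imported lazily so the sketch can be read/loaded without the SDK installed.
    from google import genai

    # Authenticate by constructing a client with the API key.
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

    # One request, one response: the simplest possible integration.
    response = client.models.generate_content(
        model="gemini-2.0-flash",  # model name is an assumption
        contents=prompt,
    )
    return response.text

if __name__ == "__main__" and os.environ.get("GEMINI_API_KEY"):
    print(ask_gemini("In one sentence, what is an API?"))
```

Everything the rest of the lesson covers (structured output, history, images, thinking) layers configuration onto this same `generate_content` call.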
Lectures and Assignments
Lectures
Assignments
TLOs (Terminal Learning Objectives)
- Build a Python application that integrates a hosted LLM via API, returning structured data across a multi-turn conversation.
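A sketch of what the finished objective might look like: a Pydantic schema for typed responses plus a message-history list carried across turns. The schema fields, helper names, and model name are illustrative assumptions; only the history bookkeeping runs offline, the API call requires a key.

```python
import os
from pydantic import BaseModel

class Answer(BaseModel):
    """Hypothetical response schema; fields are illustrative."""
    summary: str
    confidence: float

def add_turn(history: list[dict], role: str, text: str) -> list[dict]:
    """Append one message in the role/parts shape the Gemini API expects."""
    history.append({"role": role, "parts": [{"text": text}]})
    return history

def send(history: list[dict]) -> Answer:
    """One structured-output turn: pass the whole history, parse typed JSON back."""
    from google import genai  # lazy import so the sketch loads without the SDK

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model="gemini-2.0-flash",  # model name is an assumption
        contents=history,  # full conversation so far, oldest first
        config={
            "response_mime_type": "application/json",
            "response_schema": Answer,  # SDK constrains output to this shape
        },
    )
    answer = Answer.model_validate_json(response.text)
    add_turn(history, "model", response.text)  # persist the reply for the next turn
    return answer

history: list[dict] = []
add_turn(history, "user", "Summarize what an LLM is.")
# send(history) would hit the API; the history structure itself works offline.
```

The key idea the TLO targets: the API is stateless, so "multi-turn" is just the client resending an ever-growing `contents` list, and "structured data" is just a schema attached to the request config.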
ELOs (Enabling Learning Objectives)
- Authenticate with a hosted LLM provider by configuring an API key and initializing the SDK client.
- Explain the role of a system prompt and construct one that sets a model's persona, scope, and response constraints.
- Select an appropriate model from the Gemini family based on cost, capability, and task requirements.
- Send a single-turn prompt to the Gemini API and print the model's response.
- Define a structured output schema using Pydantic and configure the API call to return JSON that matches that schema.
- Build a multi-turn conversation by maintaining and passing a message history list across requests.
- Incorporate image inputs into a Gemini API request alongside text prompts.
- Apply thinking mode to a request and explain how extended reasoning affects response quality and token cost.
- Identify when each capability — structured output, multi-turn, multimodal, thinking mode — is the right tool for a given task.
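The last two capabilities in the list, multimodal input and thinking mode, can be combined in one request. A hedged sketch follows; the file path, model name, and token budget are illustrative assumptions (thinking budgets apply to Gemini 2.5-series models).

```python
import os

def describe_image(image_path: str, question: str) -> str:
    """Ask a question about a local image, with extra reasoning enabled."""
    # Lazy imports so the sketch loads without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

    # Wrap raw image bytes as a Part so it can sit alongside the text prompt.
    with open(image_path, "rb") as f:
        image_part = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

    response = client.models.generate_content(
        model="gemini-2.5-flash",  # model name is an assumption
        contents=[image_part, question],  # image and text in one request
        config=types.GenerateContentConfig(
            # A larger budget buys deeper reasoning at higher token cost.
            thinking_config=types.ThinkingConfig(thinking_budget=1024),
        ),
    )
    return response.text
```

This pairing also illustrates the final ELO: thinking mode is worth its cost for genuinely hard inputs (ambiguous images, multi-step questions), while a plain single-turn call remains the right tool for simple lookups.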