Building an Automated Coding Problem Solver
In this cookbook, we'll create an intelligent agent that can automatically solve programming challenges from websites like LeetCode, HackerRank, and CodeSignal. Our agent will:
- Visit the coding problem URL
- Extract the problem description, requirements, and constraints
- Analyze the input/output format and code templates
- Generate a complete, working solution in the required programming language
This approach combines:
- Hyperbrowser for web extraction and data parsing
- OpenAI's GPT-4o for code analysis and solution generation
After going through this cookbook, you'll have a powerful AI assistant that can tackle coding challenges across various platforms, helping you learn programming concepts or prepare for technical interviews!
Prerequisites
Before starting, you'll need:
- A Hyperbrowser API key (sign up at hyperbrowser.ai if you don't have one)
- An OpenAI API key with access to GPT-4o
Store these API keys in a .env file in the same directory as this notebook:
HYPERBROWSER_API_KEY=your_hyperbrowser_key_here
OPENAI_API_KEY=your_openai_key_here
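A quick way to confirm both keys are visible to Python is sketched below. The helper name `missing_keys` is hypothetical and not part of either SDK. It only inspects the environment, so call it after `load_dotenv()` has run.

```python
import os

# The two variable names used in the .env file above
REQUIRED_KEYS = ("HYPERBROWSER_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None):
    """Return the required key names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    absent = missing_keys()
    if absent:
        print("Missing API keys:", ", ".join(absent))
    else:
        print("All API keys found.")
```

Failing early here tends to be easier to diagnose than an authentication error raised in the middle of the agent loop.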
Step 1: Initialize Environment and Import Libraries
import asyncio
import json
import os

from dotenv import load_dotenv
from hyperbrowser import AsyncHyperbrowser
from hyperbrowser.tools import WebsiteExtractTool
from openai import AsyncOpenAI
from openai.types.chat import (
    ChatCompletionMessageParam,
    ChatCompletionMessageToolCall,
    ChatCompletionToolMessageParam,
)

load_dotenv()
Step 2: Initialize API Clients
Here we create instances of the Hyperbrowser and OpenAI clients using our API keys. These clients will be responsible for web data extraction and AI-powered code generation respectively.
hb = AsyncHyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))
# AsyncOpenAI() reads OPENAI_API_KEY from the environment when no key is passed
llm = AsyncOpenAI()
Step 3: Implement the Tool Handler
The handle_tool_call function processes requests from the LLM to interact with external tools - in this case, the WebsiteExtractTool from Hyperbrowser.
This function:
- Identifies which tool the LLM is requesting to use
- Gets the parameters for the tool call
- Executes the tool with those parameters
- Returns the results back to the LLM for further processing
For our code solver, we primarily use the extract_data tool to extract structured information about coding problems from websites.
async def handle_tool_call(
    tc: ChatCompletionMessageToolCall,
) -> ChatCompletionToolMessageParam:
    print(f"Handling tool call: {tc.function.name}")
    try:
        if (
            tc.function.name
            != WebsiteExtractTool.openai_tool_definition["function"]["name"]
        ):
            raise ValueError(f"Tool not found: {tc.function.name}")
        args = json.loads(tc.function.arguments)
        print(args)
        content = await WebsiteExtractTool.async_runnable(hb=hb, params=args)
        return {"role": "tool", "tool_call_id": tc.id, "content": content}
    except Exception as e:
        err_msg = f"Error handling tool call: {e}"
        print(err_msg)
        return {
            "role": "tool",
            "tool_call_id": tc.id,
            "content": err_msg,
            "is_error": True,  # type: ignore
        }
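To see the handler's control flow without network access, here is a minimal offline sketch of the same pattern. `FakeToolCall`, `FakeFunction`, `fake_extract`, and `handle_fake_tool_call` are hypothetical stand-ins, not part of the OpenAI or Hyperbrowser SDKs. The tool name `extract_data` is taken from the sample run in this cookbook.

```python
import asyncio
import json
from dataclasses import dataclass


@dataclass
class FakeFunction:
    name: str
    arguments: str  # JSON-encoded string, mirroring the shape of a tool call


@dataclass
class FakeToolCall:
    id: str
    function: FakeFunction


async def fake_extract(params: dict) -> str:
    """Hypothetical stand-in for the real extraction call; no network access."""
    return json.dumps({"urls": params["urls"], "task": "stub"})


async def handle_fake_tool_call(tc: FakeToolCall) -> dict:
    try:
        if tc.function.name != "extract_data":
            raise ValueError(f"Tool not found: {tc.function.name}")
        args = json.loads(tc.function.arguments)
        content = await fake_extract(args)
        return {"role": "tool", "tool_call_id": tc.id, "content": content}
    except Exception as e:
        # Report the failure back to the model instead of raising
        return {
            "role": "tool",
            "tool_call_id": tc.id,
            "content": f"Error handling tool call: {e}",
        }


if __name__ == "__main__":
    good = FakeToolCall("call_1", FakeFunction("extract_data", '{"urls": ["https://example.com"]}'))
    bad = FakeToolCall("call_2", FakeFunction("unknown_tool", "{}"))
    print(asyncio.run(handle_fake_tool_call(good)))
    print(asyncio.run(handle_fake_tool_call(bad)))
```

Returning the error as a tool message, rather than raising, gives the model a chance to correct its arguments on the next turn. Inside a notebook, replace `asyncio.run(...)` with a plain `await`, since an event loop is already running there.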
Step 4: Create the Agent Loop
The agent loop is the core function that manages the conversation between the LLM and external tools. It implements an iterative pattern (a simple while loop) where:
- The current state of the conversation is sent to the LLM
- The LLM either provides a final answer or requests more information via tool calls
- If tool calls are made, they're processed and their results are added to the conversation
- This process repeats until the LLM provides a final answer
This architecture allows the agent to gather information iteratively, making multiple web extraction requests if necessary to fully understand the coding problem before generating a solution.
async def agent_loop(messages: list[ChatCompletionMessageParam]) -> str:
    while True:
        response = await llm.chat.completions.create(
            messages=messages,
            model="gpt-4o",
            tools=[
                WebsiteExtractTool.openai_tool_definition,
            ],
            max_completion_tokens=8000,
        )
        choice = response.choices[0]

        # Append response to messages
        messages.append(choice.message)  # type: ignore

        # Handle tool calls
        if (
            choice.finish_reason == "tool_calls"
            and choice.message.tool_calls is not None
        ):
            tool_result_messages = await asyncio.gather(
                *[handle_tool_call(tc) for tc in choice.message.tool_calls]
            )
            messages.extend(tool_result_messages)
        elif choice.finish_reason == "stop" and choice.message.content is not None:
            return choice.message.content
        else:
            print(choice)
            raise ValueError(f"Unhandled finish reason: {choice.finish_reason}")
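The loop above runs until the model stops requesting tools, so a model that keeps retrying would loop indefinitely. One possible variation, sketched here with stubs instead of real API calls, caps the number of model turns. `bounded_loop`, `StepLimitExceeded`, `stub_model`, and `stub_tool` are hypothetical names introduced for illustration, and plain dicts stand in for the SDK's message objects.

```python
import asyncio


class StepLimitExceeded(RuntimeError):
    """Raised when the model keeps requesting tools past the allowed turns."""


async def bounded_loop(ask_model, run_tool, messages, max_steps=5):
    """Alternate between model and tools for at most `max_steps` model turns.

    `ask_model` and `run_tool` are hypothetical async callables standing in
    for the chat completion request and the tool handler.
    """
    for _ in range(max_steps):
        reply = await ask_model(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            # No tool requests: treat the reply as the final answer
            return reply["content"]
        # Run all requested tools concurrently and feed the results back
        results = await asyncio.gather(*(run_tool(tc) for tc in tool_calls))
        messages.extend(results)
    raise StepLimitExceeded(f"No final answer after {max_steps} model turns")


async def stub_model(messages):
    """Stub: request one tool call first, then answer once a result exists."""
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "content": None, "tool_calls": [{"id": "call_1"}]}
    return {"role": "assistant", "content": "final answer"}


async def stub_tool(tc):
    """Stub tool handler returning a canned result."""
    return {"role": "tool", "tool_call_id": tc["id"], "content": "page data"}


if __name__ == "__main__":
    history = [{"role": "user", "content": "Solve this coding problem"}]
    print(asyncio.run(bounded_loop(stub_model, stub_tool, history)))
```

The same guard could be added to `agent_loop` by replacing `while True` with a bounded `for` loop. The right limit depends on how many extraction retries you are willing to pay for.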
Step 5: Design the System Prompt
The system prompt dictates the LLM's behavior. Our prompt establishes the agent as an expert coder tasked with solving programming challenges. It provides detailed instructions on:
- What information to extract from the coding problem page
- How to analyze the problem requirements
- What format to use when returning the solution
- The specific elements that should be included in the final response
By structuring the prompt this way, we ensure the agent returns consistent, well-organized solutions that address all aspects of the coding challenge.
SYSTEM_PROMPT = """
You are an expert coder. You have access to an 'extract_data' tool which can be used to get structured data from a webpage. When providing data to the extract API, make sure you provide a properly formatted JSON schema, structured as you would for a structured output API call.

This is the link to a piece of code {link}. You are required to find the input parameters, the output parameters, the template in which the code is to be provided, the language in which the code is to be written, the task to be performed, and the list of examples provided (in input and output format).

Once you have the information, you need to use those parameters to provide code that will adequately solve the given task.

You are required to respond with
1. The task to be solved
2. The input parameters format
3. The output parameters format
4. The code template provided
5. The language in which the solution is required
6. The list of examples provided
7. Finally, and most importantly, the complete solution for the coding task given.
""".strip()
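The prompt asks the model to supply a properly formatted JSON schema. In the sample run later in this cookbook, the first tool call passes a bare map of field definitions and is rejected with "Invalid JSON schema". The retry, which wraps the same fields in a top-level object, goes through. The sketch below builds such an object-rooted schema using the field names from that run. `build_extract_schema` and `looks_object_rooted` are hypothetical helpers, and the check is only a rough sanity test, not a full schema validator.

```python
import json

# Field names as they appear in the sample run's tool-call arguments
STRING_FIELDS = ["task", "input_parameters", "output_parameters", "code_template", "language"]


def build_extract_schema() -> dict:
    """Build an object-rooted JSON schema for the fields the prompt asks for."""
    properties = {name: {"type": "string"} for name in STRING_FIELDS}
    properties["examples"] = {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "input": {"type": "string"},
                "output": {"type": "string"},
            },
        },
    }
    return {"type": "object", "properties": properties}


def looks_object_rooted(schema: dict) -> bool:
    """Cheap check: the top level declares an object with a properties map."""
    return schema.get("type") == "object" and isinstance(schema.get("properties"), dict)


if __name__ == "__main__":
    print(json.dumps(build_extract_schema(), indent=2))
```

If the rejected first call turns out to be common, one option is to include an example of the expected schema shape in the system prompt so the model gets it right on the first attempt.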
Step 6: Create the Agent Factory Function
The make_coding_agent function wraps the setup needed to run the agent on a specific coding problem. Despite its name, it does not return an agent object: it is an async function that runs the agent loop and returns the final answer as a string. This approach provides several benefits:
- Encapsulation: It hides the details of building the system prompt and the initial messages
- Reusability: We can call it once for each coding problem we want solved
- Configurability: The function handles URL normalization and system prompt formatting
Calling the function with a problem URL starts the problem-solving process; the user message is fixed to "Solve this coding problem".
async def make_coding_agent(link_to_code: str):
    # Default to HTTPS when the link is given without a scheme.
    if not (
        link_to_code.startswith("http://") or link_to_code.startswith("https://")
    ):
        link_to_code = f"https://{link_to_code}"

    sysprompt = SYSTEM_PROMPT.format(
        link=link_to_code,
    )

    return await agent_loop(
        [
            {"role": "system", "content": sysprompt},
            {"role": "user", "content": "Solve this coding problem"},
        ]
    )
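The URL normalization and prompt formatting steps can be tried in isolation, as in the sketch below. `normalize_link`, `build_prompt`, and `PROMPT_TEMPLATE` are hypothetical names, and the template is a shortened placeholder rather than the real system prompt. The example also illustrates one caveat of `str.format`: any literal braces in the template, such as an inline JSON example, must be doubled.

```python
# Shortened placeholder template; doubled braces survive .format() as literals
PROMPT_TEMPLATE = 'Problem link: {link}. Example schema root: {{"type": "object"}}'


def normalize_link(link: str) -> str:
    """Prefix https:// when the link has no http(s) scheme."""
    if not link.startswith(("http://", "https://")):
        return f"https://{link}"
    return link


def build_prompt(link: str) -> str:
    """Fill the template with the normalized link."""
    return PROMPT_TEMPLATE.format(link=normalize_link(link))


if __name__ == "__main__":
    print(build_prompt("leetcode.com/problems/two-sum/description/"))
```

Keeping normalization in its own small function makes it easy to unit-test separately from the network-bound agent loop.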
Step 7: Test the Agent with a LeetCode Problem
Now let's put our agent to the test with the classic "Two Sum" problem from LeetCode. This will demonstrate the full workflow:
- The agent will visit the problem page
- Extract the problem description, requirements, and examples
- Analyze the required input/output format, the programming language, and the code template
- Generate an optimal solution with explanations
In this case, we're solving the Two Sum problem, which asks us to find the indices of two numbers in an array that add up to a target value.
link_to_coding_task = "https://leetcode.com/problems/two-sum/description/"
response = await make_coding_agent(link_to_coding_task)
print(response)
Handling tool call: extract_data
{'urls': ['https://leetcode.com/problems/two-sum/description/'], 'prompt': 'Extract the input parameters, output parameters, code template, language, task description, and example test cases for the Two Sum problem.', 'schema': '{"task":{"type":"string"},"input_parameters":{"type":"string"},"output_parameters":{"type":"string"},"code_template":{"type":"string"},"language":{"type":"string"},"examples":{"type":"array","items":{"type":"object","properties":{"input":{"type":"string"},"output":{"type":"string"}}}}}', 'max_links': 5}
Error handling tool call: schema - Invalid JSON schema - Status: 400 - Caused by HTTPStatusError: Client error '400 Bad Request' for url 'https://app.hyperbrowser.ai/api/extract'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400
Handling tool call: extract_data
{'urls': ['https://leetcode.com/problems/two-sum/description/'], 'prompt': 'Extract the input parameters, output parameters, code template, language, task description, and example test cases for the Two Sum problem.', 'schema': '{"type":"object","properties":{"task":{"type":"string"},"input_parameters":{"type":"string"},"output_parameters":{"type":"string"},"code_template":{"type":"string"},"language":{"type":"string"},"examples":{"type":"array","items":{"type":"object","properties":{"input":{"type":"string"},"output":{"type":"string"}}}}}}', 'max_links': 5}
1. **The task to be solved**: Two Sum

2. **The input parameters format**: An array of integers `nums` and an integer `target`.

3. **The output parameters format**: An array of indices of the two numbers that add up to the target.

4. **The code template provided**:
```cpp
class Solution {
public:
    vector<int> twoSum(vector<int>& nums, int target) {

    }
};
```

5. **The language in which the solution is required**: C++

6. **The list of examples provided**:
- Input: `nums = [2,7,11,15], target = 9` Output: `[0,1]`
- Input: `nums = [3,2,4], target = 6` Output: `[1,2]`
- Input: `nums = [3,3], target = 6` Output: `[0,1]`

7. **Complete solution for the coding task**:

Here is a C++ solution for the Two Sum problem:

```cpp
#include <vector>
#include <unordered_map>
using namespace std;

class Solution {
public:
    vector<int> twoSum(vector<int>& nums, int target) {
        unordered_map<int, int> num_map;
        for (int i = 0; i < nums.size(); ++i) {
            int complement = target - nums[i];
            if (num_map.find(complement) != num_map.end()) {
                return {num_map[complement], i};
            }
            num_map[nums[i]] = i;
        }
        return {}; // Solution assumes one valid solution exists
    }
};
```

This solution utilizes a hash map to track each number's index, allowing efficient lookup for complement pairs that sum up to the given target.
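Since the agent's output is generated text, it is worth checking the approach against the extracted examples before trusting it. The sketch below is a Python port of the same hash-map technique, written for this check rather than produced by the agent, and run against the three examples listed in the output above.

```python
def two_sum(nums, target):
    """Return indices of the two numbers adding up to target, or [] if none."""
    seen = {}  # maps a value to the index where it was first seen
    for i, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return [seen[complement], i]
        seen[value] = i
    return []


# Examples taken from the agent's output above
EXAMPLES = [
    ([2, 7, 11, 15], 9, [0, 1]),
    ([3, 2, 4], 6, [1, 2]),
    ([3, 3], 6, [0, 1]),
]

if __name__ == "__main__":
    for nums, target, expected in EXAMPLES:
        result = two_sum(nums, target)
        status = "ok" if result == expected else "MISMATCH"
        print(f"{nums}, target={target} -> {result} ({status})")
```

Each element is visited once and each dictionary lookup takes constant time on average, so the approach runs in linear time with linear extra space.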
Conclusion
In this cookbook, we've built a powerful AI code solver that can tackle programming challenges from platforms like LeetCode. Our agent:
- Extracts problem descriptions, requirements, and constraints from coding challenge websites
- Analyzes the input/output formats and programming language requirements
- Generates optimized, working solutions with clear explanations
- Presents the information in a structured, easy-to-understand format
This tool can be invaluable for:
- Learning programming concepts by seeing optimal solutions to common problems
- Preparing for technical interviews by analyzing different solution approaches
- Debugging your own solutions by comparing them with an AI-generated reference
- Exploring different implementation strategies for the same problem
Next Steps
To enhance this tool further, you could:
- Test and tune the agent on more coding platforms like HackerRank, CodeSignal, and Codewars
- Implement solution generation in multiple programming languages
- Add time and space complexity analysis for the generated solutions
- Create a comparison feature to analyze multiple solution approaches
- Build a web interface where users can input problem URLs and get immediate solutions
Happy coding!