Building an Automated Coding Problem Solver

In this cookbook, we'll create an intelligent agent that can automatically solve programming challenges from websites like LeetCode, HackerRank, and CodeSignal. Our agent will:

  1. Visit the coding problem URL
  2. Extract the problem description, requirements, and constraints
  3. Analyze the input/output format and code templates
  4. Generate a complete, working solution in the required programming language

This approach combines:

  • Hyperbrowser for web extraction and data parsing
  • OpenAI's GPT-4o for code analysis and solution generation

After going through this cookbook, you'll have a powerful AI assistant that can tackle coding challenges across various platforms, helping you learn programming concepts or prepare for technical interviews!

Prerequisites

Before starting, you'll need:

  1. A Hyperbrowser API key (sign up at hyperbrowser.ai if you don't have one)
  2. An OpenAI API key with access to GPT-4o

Store these API keys in a .env file in the same directory as this notebook:

HYPERBROWSER_API_KEY=your_hyperbrowser_key_here
OPENAI_API_KEY=your_openai_key_here

Step 1: Initialize Environment and Import Libraries

import asyncio
import json
import os
from dotenv import load_dotenv
from hyperbrowser import AsyncHyperbrowser
from hyperbrowser.tools import WebsiteExtractTool
from openai import AsyncOpenAI
from openai.types.chat import (
    ChatCompletionMessageParam,
    ChatCompletionMessageToolCall,
    ChatCompletionToolMessageParam,
)
load_dotenv()
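
A missing key usually surfaces later as an authentication error from one of the clients, so it can help to check the environment right after `load_dotenv()`. The helper below is a minimal sketch; `missing_keys` and `REQUIRED_KEYS` are hypothetical names introduced here, not part of either SDK.

```python
import os

# The two variable names used in the .env file above.
REQUIRED_KEYS = ("HYPERBROWSER_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None):
    """Return the names of required keys that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_keys()
if missing:
    print(f"Missing API keys: {', '.join(missing)}")
```

Passing a plain dict instead of `os.environ` makes the check easy to try without touching the real environment.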

Step 2: Initialize API Clients

Here we create instances of the Hyperbrowser and OpenAI clients using our API keys. These clients will be responsible for web data extraction and AI-powered code generation respectively.

hb = AsyncHyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))
llm = AsyncOpenAI()

Step 3: Implement the Tool Handler

The handle_tool_call function processes requests from the LLM to interact with external tools - in this case, the WebsiteExtractTool from Hyperbrowser.

This function:

  1. Identifies which tool the LLM is requesting to use
  2. Gets the parameters for the tool call
  3. Executes the tool with those parameters
  4. Returns the results back to the LLM for further processing

For our code solver, we primarily use the extract_data tool to extract structured information about coding problems from websites.

async def handle_tool_call(
    tc: ChatCompletionMessageToolCall,
) -> ChatCompletionToolMessageParam:
    print(f"Handling tool call: {tc.function.name}")
    try:
        if (
            tc.function.name
            != WebsiteExtractTool.openai_tool_definition["function"]["name"]
        ):
            raise ValueError(f"Tool not found: {tc.function.name}")
        args = json.loads(tc.function.arguments)
        print(args)
        content = await WebsiteExtractTool.async_runnable(hb=hb, params=args)
        return {"role": "tool", "tool_call_id": tc.id, "content": content}
    except Exception as e:
        err_msg = f"Error handling tool call: {e}"
        print(err_msg)
        return {
            "role": "tool",
            "tool_call_id": tc.id,
            "content": err_msg,
            "is_error": True,  # type: ignore
        }
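
To see the handler's control flow without any network access, the sketch below replays the same steps against stand-in objects. `FakeToolCall`, `FakeFunction`, and `fake_extract` are hypothetical names that only mimic the shape of the real tool-call payload (a tool name plus a JSON-encoded argument string); they are not part of the OpenAI or Hyperbrowser SDKs.

```python
import asyncio
import json
from dataclasses import dataclass


@dataclass
class FakeFunction:
    name: str
    arguments: str  # JSON-encoded string, like the real tool-call arguments


@dataclass
class FakeToolCall:
    id: str
    function: FakeFunction


async def fake_extract(params: dict) -> str:
    # Stand-in for the real extraction call; it only echoes the URLs it received.
    return json.dumps({"extracted_from": params["urls"]})


async def handle_fake_tool_call(tc: FakeToolCall) -> dict:
    try:
        if tc.function.name != "extract_data":
            raise ValueError(f"Tool not found: {tc.function.name}")
        args = json.loads(tc.function.arguments)
        content = await fake_extract(args)
    except Exception as e:
        # Errors are returned as tool output so the model can react to them.
        content = f"Error handling tool call: {e}"
    return {"role": "tool", "tool_call_id": tc.id, "content": content}


ok = FakeToolCall("call_1", FakeFunction("extract_data", '{"urls": ["https://example.com"]}'))
bad = FakeToolCall("call_2", FakeFunction("unknown_tool", "{}"))
ok_msg = asyncio.run(handle_fake_tool_call(ok))
bad_msg = asyncio.run(handle_fake_tool_call(bad))
print(ok_msg)
print(bad_msg)
```

Inside a notebook, where an event loop is already running, use `await handle_fake_tool_call(ok)` instead of `asyncio.run(...)`. Returning the error text as the tool result, rather than raising, is what lets the model retry with corrected arguments.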

Step 4: Create the Agent Loop

The agent loop is the core function that manages the conversation between the LLM and external tools. It implements an iterative pattern (a plain while loop, not recursion) where:

  1. The current state of the conversation is sent to the LLM
  2. The LLM either provides a final answer or requests more information via tool calls
  3. If tool calls are made, they're processed and their results are added to the conversation
  4. This process repeats until the LLM provides a final answer

This architecture allows the agent to gather information iteratively, making multiple web extraction requests if necessary to fully understand the coding problem before generating a solution.

async def agent_loop(messages: list[ChatCompletionMessageParam]) -> str:
    while True:
        response = await llm.chat.completions.create(
            messages=messages,
            model="gpt-4o",
            tools=[
                WebsiteExtractTool.openai_tool_definition,
            ],
            max_completion_tokens=8000,
        )
        choice = response.choices[0]
        # Append response to messages
        messages.append(choice.message)  # type: ignore
        # Handle tool calls
        if (
            choice.finish_reason == "tool_calls"
            and choice.message.tool_calls is not None
        ):
            tool_result_messages = await asyncio.gather(
                *[handle_tool_call(tc) for tc in choice.message.tool_calls]
            )
            messages.extend(tool_result_messages)
        elif choice.finish_reason == "stop" and choice.message.content is not None:
            return choice.message.content
        else:
            print(choice)
            raise ValueError(f"Unhandled finish reason: {choice.finish_reason}")
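
The loop above runs until the model stops asking for tools, so a model that keeps requesting extractions would never return. One possible safeguard is a turn limit. The sketch below illustrates the idea with a scripted stand-in for the LLM; `scripted_model`, `bounded_agent_loop`, and `max_turns` are hypothetical names, and the message dictionaries are simplified compared with the real OpenAI types.

```python
import asyncio


async def scripted_model(messages: list[dict]) -> dict:
    """Stand-in for the LLM: requests one tool call, then gives a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"finish_reason": "tool_calls", "tool_calls": ["extract_data"]}
    return {"finish_reason": "stop", "content": "final answer"}


async def bounded_agent_loop(messages: list[dict], max_turns: int = 5) -> str:
    for _ in range(max_turns):
        reply = await scripted_model(messages)
        if reply["finish_reason"] == "tool_calls":
            # Record the assistant turn, then one result per requested tool.
            messages.append({"role": "assistant", "content": None})
            for name in reply["tool_calls"]:
                messages.append({"role": "tool", "content": f"result of {name}"})
        elif reply["finish_reason"] == "stop":
            return reply["content"]
        else:
            raise ValueError(f"Unhandled finish reason: {reply['finish_reason']}")
    raise RuntimeError(f"No final answer after {max_turns} turns")


history = [{"role": "user", "content": "Solve this coding problem"}]
answer = asyncio.run(bounded_agent_loop(history))
print(answer)
```

As with any `asyncio.run` call, replace it with a plain `await` when running inside a notebook.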

Step 5: Design the System Prompt

The system prompt dictates the LLM's behavior. Our prompt establishes the agent as an expert coder tasked with solving programming challenges. It provides detailed instructions on:

  1. What information to extract from the coding problem page
  2. How to analyze the problem requirements
  3. What format to use when returning the solution
  4. The specific elements that should be included in the final response

By structuring the prompt this way, we ensure the agent returns consistent, well-organized solutions that address all aspects of the coding challenge.
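
The prompt in this step asks the model to pass a JSON schema to the extract tool. In the test run shown in Step 7, the model's first schema was a bare mapping of field names and was rejected with "Invalid JSON schema", while the second attempt wrapped the same fields in a top-level `{"type": "object", "properties": {...}}` and went through. The sketch below shows that wrapping; `as_object_schema` is a hypothetical helper, and the exact validation rules of the extract API are an assumption based only on that run.

```python
import json


def as_object_schema(properties: dict) -> dict:
    """Wrap a mapping of field definitions in a top-level JSON Schema object."""
    return {"type": "object", "properties": properties}


# A subset of the fields the agent asked for in the Step 7 run.
fields = {
    "task": {"type": "string"},
    "language": {"type": "string"},
    "examples": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "input": {"type": "string"},
                "output": {"type": "string"},
            },
        },
    },
}

schema = as_object_schema(fields)
print(json.dumps(schema, indent=2))
```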

SYSTEM_PROMPT = """
You are an expert coder. You have access to an 'extract_data' tool which can be used to get structured data from a webpage. When providing data to the extract API, make sure you provide a properly formatted JSON schema, structured as you would for a structured output API call.
This is the link to a coding problem: {link}. You are required to find the input parameters, the output parameters, the template in which the code is to be provided, the language in which the code is to be written, the task to be performed, and the list of examples provided (in input and output format).
Once you have the information, you need to use those parameters to provide code that will adequately solve the given task.
You are required to respond with:
1. The task to be solved
2. The input parameters format
3. The output parameters format
4. The code template provided
5. The language in which the solution is required
6. The list of examples provided
7. Finally, and most importantly, the complete solution for the coding task given.
""".strip()

Step 6: Create the Agent Factory Function

The make_coding_agent function sets up and runs our code-solving agent for a specific coding problem. This approach provides several benefits:

  1. Encapsulation: It wraps all the complexity of setting up the agent with the appropriate system prompt
  2. Reusability: We can call it again for each new coding problem
  3. Configurability: The function handles URL normalization and system prompt formatting

Despite the "factory" name, the function does not return a callable. It awaits the agent loop with a fixed user message ("Solve this coding problem") and returns the agent's final answer as a string.

async def make_coding_agent(link_to_code: str):
    # Default to https:// when the link is given without a scheme.
    if not (link_to_code.startswith("http://") or link_to_code.startswith("https://")):
        link_to_code = f"https://{link_to_code}"
    sysprompt = SYSTEM_PROMPT.format(
        link=link_to_code,
    )
    return await agent_loop(
        [
            {"role": "system", "content": sysprompt},
            {"role": "user", "content": "Solve this coding problem"},
        ]
    )
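
The URL normalization step can be tried on its own. The sketch below pulls the same check into a small function; `normalize_url` is a hypothetical name used only for illustration.

```python
def normalize_url(link: str) -> str:
    """Prepend https:// when the link has no http(s) scheme."""
    if link.startswith(("http://", "https://")):
        return link
    return f"https://{link}"


for raw in ("leetcode.com/problems/two-sum/", "http://example.com", "https://example.com"):
    print(raw, "->", normalize_url(raw))
```

`str.startswith` accepts a tuple of prefixes, which keeps the check to a single call.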

Step 7: Test the Agent with a LeetCode Problem

Now let's put our agent to the test with the classic "Two Sum" problem from LeetCode. This will demonstrate the full workflow:

  1. The agent will visit the problem page
  2. Extract the problem description, requirements, and examples
  3. Analyze the required input/output format, the programming language, and the code template
  4. Generate an optimal solution with explanations

In this case, we're solving the Two Sum problem, which asks us to find the indices of two numbers in an array that add up to a target value.

link_to_coding_task = "https://leetcode.com/problems/two-sum/description/"
response = await make_coding_agent(link_to_coding_task)
print(response)
Handling tool call: extract_data

{'urls': ['https://leetcode.com/problems/two-sum/description/'], 'prompt': 'Extract the input parameters, output parameters, code template, language, task description, and example test cases for the Two Sum problem.', 'schema': '{"task":{"type":"string"},"input_parameters":{"type":"string"},"output_parameters":{"type":"string"},"code_template":{"type":"string"},"language":{"type":"string"},"examples":{"type":"array","items":{"type":"object","properties":{"input":{"type":"string"},"output":{"type":"string"}}}}}', 'max_links': 5}

Error handling tool call: schema - Invalid JSON schema - Status: 400 - Caused by HTTPStatusError: Client error '400 Bad Request' for url 'https://app.hyperbrowser.ai/api/extract'

For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400

Handling tool call: extract_data

{'urls': ['https://leetcode.com/problems/two-sum/description/'], 'prompt': 'Extract the input parameters, output parameters, code template, language, task description, and example test cases for the Two Sum problem.', 'schema': '{"type":"object","properties":{"task":{"type":"string"},"input_parameters":{"type":"string"},"output_parameters":{"type":"string"},"code_template":{"type":"string"},"language":{"type":"string"},"examples":{"type":"array","items":{"type":"object","properties":{"input":{"type":"string"},"output":{"type":"string"}}}}}}', 'max_links': 5}

1. **The task to be solved**: Two Sum

2. **The input parameters format**: An array of integers `nums` and an integer `target`.

3. **The output parameters format**: An array of indices of the two numbers that add up to the target.

4. **The code template provided**:

   ```cpp
   class Solution {
   public:
       vector<int> twoSum(vector<int>& nums, int target) {

       }
   };
   ```

5. **The language in which the solution is required**: C++

6. **The list of examples provided**:

   - Input: `nums = [2,7,11,15], target = 9`
     Output: `[0,1]`

   - Input: `nums = [3,2,4], target = 6`
     Output: `[1,2]`

   - Input: `nums = [3,3], target = 6`
     Output: `[0,1]`

7. **Complete solution for the coding task**:

   Here is a C++ solution for the Two Sum problem:

   ```cpp
   #include <vector>
   #include <unordered_map>

   using namespace std;

   class Solution {
   public:
       vector<int> twoSum(vector<int>& nums, int target) {
           unordered_map<int, int> num_map;
           for (int i = 0; i < nums.size(); ++i) {
               int complement = target - nums[i];
               if (num_map.find(complement) != num_map.end()) {
                   return {num_map[complement], i};
               }
               num_map[nums[i]] = i;
           }
           return {}; // Solution assumes one valid solution exists
       }
   };
   ```

This solution utilizes a hash map to track each number's index, allowing efficient lookup for complement pairs that sum up to the given target.
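
For readers following along in Python, the same hash-map idea can be sketched as below. This is an illustrative translation written for this cookbook, not output from the agent.

```python
def two_sum(nums: list[int], target: int) -> list[int]:
    """Return indices of two numbers that add up to target, or [] if none exist."""
    seen: dict[int, int] = {}  # value -> index where it was first seen
    for i, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return [seen[complement], i]
        seen[value] = i
    return []


print(two_sum([2, 7, 11, 15], 9))  # [0, 1]
```

Each element is visited once and each dictionary lookup takes constant time on average, so the function runs in linear time with linear extra space.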

Conclusion

In this cookbook, we've built a powerful AI code solver that can tackle programming challenges from platforms like LeetCode. Our agent:

  1. Extracts problem descriptions, requirements, and constraints from coding challenge websites
  2. Analyzes the input/output formats and programming language requirements
  3. Generates optimized, working solutions with clear explanations
  4. Presents the information in a structured, easy-to-understand format

This tool can be invaluable for:

  • Learning programming concepts by seeing optimal solutions to common problems
  • Preparing for technical interviews by analyzing different solution approaches
  • Debugging your own solutions by comparing them with an AI-generated reference
  • Exploring different implementation strategies for the same problem

Next Steps

To enhance this tool further, you could:

  • Test and tune the agent on more coding platforms like HackerRank, CodeSignal, and Codewars
  • Implement solution generation in multiple programming languages
  • Add time and space complexity analysis for the generated solutions
  • Create a comparison feature to analyze multiple solution approaches
  • Build a web interface where users can input problem URLs and get immediate solutions

Happy coding!