Building a News Analysis Agent with Hyperbrowser and GPT-4o-mini
In this cookbook, we'll build an intelligent News Analysis Agent that can summarize news on any topic by automatically searching news aggregators, analyzing multiple sources, and generating comprehensive summaries with source-specific insights.
Our agent will:
- Access news aggregator websites like Google News
- Extract articles from multiple news sources on a specific topic
- Analyze content across different publications
- Generate a comprehensive summary with source-specific details
- Answer user questions based on the aggregated news content
We'll use these tools to build our agent:
- Hyperbrowser for web extraction and accessing news sources
- OpenAI's GPT-4o-mini for intelligent analysis and report generation
By the end of this cookbook, you'll have a versatile news analysis tool that can keep you informed on any topic with balanced, multi-source perspectives!
Prerequisites
To follow along you'll need the following:
- A Hyperbrowser API key (sign up at hyperbrowser.ai if you don't have one; it's free)
- An OpenAI API key (sign up at openai.com if you don't have one; creating an account is free, but API usage is billed)
Both API keys should be stored in a .env file in the same directory as this notebook, in the following format:
HYPERBROWSER_API_KEY=your_hyperbrowser_key_here
OPENAI_API_KEY=your_openai_key_here
Step 1: Set up imports and load environment variables
import asyncio
import json
import os

from dotenv import load_dotenv
from hyperbrowser import AsyncHyperbrowser
from hyperbrowser.tools import WebsiteExtractTool
from openai import AsyncOpenAI
from openai.types.chat import (
    ChatCompletionMessageParam,
    ChatCompletionMessageToolCall,
    ChatCompletionToolMessageParam,
)
from typing import List
from IPython.display import Markdown, display
import urllib.parse

load_dotenv()
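Before creating any clients, it can help to confirm that both keys actually loaded, since a missing key otherwise only surfaces later as an authentication error. The sketch below is not part of the original notebook: `missing_keys` and `REQUIRED_KEYS` are hypothetical names, and the helper accepts any mapping so it can be checked without touching the real environment.

```python
import os
from typing import Iterable, List, Mapping

# Variable names taken from the .env format shown in the prerequisites.
REQUIRED_KEYS = ("HYPERBROWSER_API_KEY", "OPENAI_API_KEY")


def missing_keys(
    env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS
) -> List[str]:
    """Return the required variable names that are absent or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


# Typical use right after load_dotenv(): report anything that did not load.
missing = missing_keys(os.environ)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Printing rather than raising keeps the notebook cell from failing outright; swap in an exception if you would rather stop early.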
Step 2: Initialize API clients
Next, we'll initialize our API clients for Hyperbrowser and OpenAI using the environment variables we loaded. These clients will handle web scraping and AI-powered analysis respectively. AsyncOpenAI() reads OPENAI_API_KEY from the environment, so it needs no explicit argument.
hb = AsyncHyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))
llm = AsyncOpenAI()
Step 3: Create a tool handler function
The handle_tool_call function processes requests from the LLM to interact with external tools. In this case, it handles the WebsiteExtractTool from Hyperbrowser, which allows our agent to extract structured data from news websites.
This function:
- Takes a tool call request from the LLM
- Checks if it's for a supported tool (in this case, the extract_data tool)
- Parses the parameters and executes the tool
- Returns the results back to the LLM
- Catches any errors and returns the error message as the tool result, so the LLM can retry the call with different arguments
This is essential for enabling our agent to dynamically access and extract news content.
async def handle_tool_call(
    tc: ChatCompletionMessageToolCall,
) -> ChatCompletionToolMessageParam:
    print(f"Handling tool call: {tc.function.name}")

    try:
        if (
            tc.function.name
            == WebsiteExtractTool.openai_tool_definition["function"]["name"]
        ):
            args = json.loads(tc.function.arguments)
            content = await WebsiteExtractTool.async_runnable(hb=hb, params=args)

            return {"role": "tool", "tool_call_id": tc.id, "content": content}
        else:
            raise ValueError(f"Tool not found: {tc.function.name}")

    except Exception as e:
        err_msg = f"Error handling tool call: {e}"
        print(err_msg)
        return {
            "role": "tool",
            "tool_call_id": tc.id,
            "content": err_msg,
            "is_error": True,  # type: ignore
        }
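To see the error path without calling Hyperbrowser, the same dispatch pattern can be exercised with a stand-in tool. Everything in this sketch is illustrative rather than part of the original notebook: `TOOLS`, `fake_extract`, `dispatch`, and `make_call` are hypothetical names, the `urls` parameter is an assumption made for the stub, and `SimpleNamespace` objects stand in for the OpenAI tool call type.

```python
import asyncio
import json
from types import SimpleNamespace
from typing import Awaitable, Callable, Dict


async def fake_extract(params: dict) -> str:
    """Stand-in for a real extraction tool: echoes the requested URLs."""
    return json.dumps({"urls": params["urls"]})


# Registry of tool name -> async callable (hypothetical, for illustration).
TOOLS: Dict[str, Callable[[dict], Awaitable[str]]] = {"extract_data": fake_extract}


async def dispatch(tc) -> dict:
    """Run the requested tool; on any failure, return the error text instead."""
    try:
        tool = TOOLS.get(tc.function.name)
        if tool is None:
            raise ValueError(f"Tool not found: {tc.function.name}")
        content = await tool(json.loads(tc.function.arguments))
    except Exception as e:
        # The error goes back as the tool result so the model can retry.
        content = f"Error handling tool call: {e}"
    return {"role": "tool", "tool_call_id": tc.id, "content": content}


def make_call(call_id: str, name: str, arguments: str) -> SimpleNamespace:
    """Build an object with the same attributes the handler reads."""
    return SimpleNamespace(
        id=call_id, function=SimpleNamespace(name=name, arguments=arguments)
    )
```

Calling `dispatch` with an unknown tool name or malformed JSON arguments returns a normal tool message whose content describes the failure, which is what lets the model correct itself on the next turn.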
Step 4: Implement the agent loop
The agent loop is the heart of our news analysis system. It implements an iterative conversation pattern where:
- The current state of the conversation is sent to the LLM (GPT-4o-mini in this case)
- The LLM either provides a final answer or requests more information via tool calls
- If tool calls are made, they're processed and their results are added to the conversation
- This process repeats until the LLM provides a final analysis
This architecture allows the agent to gather information from multiple news sources iteratively before generating a comprehensive summary. We're using GPT-4o-mini here for efficient processing while maintaining high-quality analysis.
async def agent_loop(messages: list[ChatCompletionMessageParam]) -> str:
    while True:
        response = await llm.chat.completions.create(
            messages=messages,
            model="gpt-4o-mini",
            tools=[
                WebsiteExtractTool.openai_tool_definition,
            ],
            max_completion_tokens=8000,
        )

        choice = response.choices[0]

        # Append response to messages
        messages.append(choice.message)  # type: ignore

        # Handle tool calls
        if (
            choice.finish_reason == "tool_calls"
            and choice.message.tool_calls is not None
        ):
            tool_result_messages = await asyncio.gather(
                *[handle_tool_call(tc) for tc in choice.message.tool_calls]
            )
            messages.extend(tool_result_messages)

        elif choice.finish_reason == "stop" and choice.message.content is not None:
            return choice.message.content

        else:
            print(choice)
            raise ValueError(f"Unhandled finish reason: {choice.finish_reason}")
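The loop above runs until the model stops requesting tools. If you want a cap so that a model that keeps calling tools cannot loop indefinitely, one option is a turn limit. The sketch below shows only that control flow and is not the notebook's implementation: `run_bounded`, `step`, and `max_turns` are hypothetical names, and `step` stands in for one request/response round that returns the final text, or None while tool work is still pending.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def run_bounded(
    step: Callable[[int], Awaitable[Optional[str]]], max_turns: int = 10
) -> str:
    """Call step(turn) until it yields a final answer or the limit is hit."""
    for turn in range(max_turns):
        result = await step(turn)
        if result is not None:
            return result
    raise RuntimeError(f"No final answer after {max_turns} turns")
```

In the real agent, the body of `step` would be the completion call plus tool handling from `agent_loop`; raising on exhaustion makes the failure explicit rather than leaving the cell hanging.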
Step 5: Design the system prompt
The system prompt is crucial for guiding the LLM's behavior. Our prompt establishes the agent as an expert news analyst and provides detailed instructions on how to:
- Access and extract information from news sources
- Generate an overall summary based on multiple sources
- Provide source-specific analysis with attribution
- Answer user questions based on the aggregated news content
This structured approach ensures that our news summaries are comprehensive, balanced, and properly sourced.
SYSTEM_PROMPT = """
You are an expert news analyst. You have access to an 'extract_data' tool which can be used to get structured data from a webpage.

This is the link to a news aggregator {link}. You are required to summarize the overall news in a concise manner. In addition, you are required to provide the list of what the analysis is from the various news sources in detail. Your overall analysis should be based on the summary from the various individual news sources. In addition, if the user asks a question, you are required to provide the answer based on your overall summary.

In summary, you are required to provide the following:
1. Overall summary of the news, based on the individual news sources
2. List of what the analysis is from the various news sources in detail. This analysis should be based on the particular news source itself.
3. If the user asks any questions, you are required to provide the answer based on your overall summary.
""".strip()
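The prompt is filled in later with `str.format`, which treats every `{...}` as a placeholder. If you extend the prompt with literal braces, for example to describe a JSON reply, they need to be doubled. The template below is a hypothetical shortened prompt used only to show that behavior; the JSON line is not part of the original prompt, and `render_prompt` is an illustrative name.

```python
# Hypothetical, shortened template: {link} is a placeholder, while the
# doubled braces render as literal { and } after formatting.
TEMPLATE = """
This is the link to a news aggregator {link}.
Reply with JSON shaped like {{"summary": "..."}}.
""".strip()


def render_prompt(link: str) -> str:
    """Substitute the aggregator link into the template."""
    return TEMPLATE.format(link=link)


print(render_prompt("https://news.google.com"))
```

With single braces around the JSON example, `format` would raise a `KeyError` instead of producing the prompt.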
Step 6: Create a factory function for generating news analysis agents
Now we'll create a factory function that generates specialized news analysis agents for specific topics or sources. This approach provides several benefits:
- Reusability: We can create multiple agents for different news topics
- Configurability: Each agent can be configured with a specific news source URL
- Flexibility: The function handles optional user questions for interactive analysis
The factory returns an async function that can be called to generate news summaries or answer topic-specific questions.
from typing import Coroutine, Any, Callable, Optional


def make_news_analyst(
    link_to_aggregator: str,
) -> Callable[..., Coroutine[Any, Any, str]]:
    # Default to HTTPS when the link is given without a scheme.
    if not (
        link_to_aggregator.startswith("http://")
        or link_to_aggregator.startswith("https://")
    ):
        link_to_aggregator = f"https://{link_to_aggregator}"

    sysprompt = SYSTEM_PROMPT.format(
        link=link_to_aggregator,
    )

    async def news_agent(question: Optional[str] = None) -> str:
        messages: List[ChatCompletionMessageParam] = [
            {"role": "system", "content": sysprompt},
        ]

        # Only add a user turn when a question was supplied.
        if question is not None:
            messages.append(
                {
                    "role": "user",
                    "content": f"The user asked the following question: {question}",
                }
            )

        return await agent_loop(messages)

    return news_agent
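The scheme check inside the factory can also be written as a small standalone helper, which makes it easy to test in isolation. This is a sketch of the same idea, not the notebook's code: `ensure_scheme` and `default_scheme` are hypothetical names, and stripping surrounding whitespace is an added assumption.

```python
def ensure_scheme(link: str, default_scheme: str = "https") -> str:
    """Prefix the link with a scheme unless it already has http(s)."""
    link = link.strip()
    # str.startswith accepts a tuple, so both schemes are checked at once.
    if link.startswith(("http://", "https://")):
        return link
    return f"{default_scheme}://{link}"


print(ensure_scheme("news.google.com"))
```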
Step 7: Get the news topic from the user's question
We can now summarize the news and answer the user's questions, but we can't yet work out what the user wants to search for. So we'll run a minimal LLM query that extracts a news search query from the user's question, then use that query to form the link to the news aggregator.
Note that the query is encoded with the quote_plus function so that it is safe to embed in the URL's query string.
async def convert_to_news_query(question: str) -> Optional[str]:
    messages: List[ChatCompletionMessageParam] = [
        {
            "role": "system",
            "content": """Convert the user's question into a concise news search query. Focus on key terms and remove unnecessary words. The query should be suitable for searching news articles.
For example:
"What are the latest developments in artificial intelligence regulation?" -> AI regulation news
"How is the current state of the NBA?" -> NBA news""",
        },
        {
            "role": "user",
            "content": f"Convert this question to a news search query: {question}",
        },
    ]

    response = await llm.chat.completions.create(
        model="gpt-4o-mini", messages=messages, temperature=0.3, max_tokens=50
    )

    return response.choices[0].message.content


async def generate_news_url(question: str) -> Optional[str]:
    query = await convert_to_news_query(question)
    print(query)
    if query is not None:
        return f"https://news.google.com/search?q={urllib.parse.quote_plus(query)}"
    return None


async def get_news_summary(query: str) -> Optional[str]:
    news_url = await generate_news_url(query)
    if news_url is not None:
        news_agent = make_news_analyst(news_url)
        response = await news_agent(query)
        if response:
            return response
        else:
            return "**No response from the agent when summarizing news**"
    else:
        return "**Could not generate news url**"
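To make the encoding step concrete, here is a small sketch of URL construction on its own, without the LLM call. It assumes the model's reply may arrive with stray whitespace or wrapping quotes, which the notebook does not handle explicitly; `build_google_news_url` is a hypothetical helper name, while the URL pattern is the one used above.

```python
import urllib.parse
from typing import Optional


def build_google_news_url(raw_query: Optional[str]) -> Optional[str]:
    """Clean a model-produced query and embed it in a Google News search URL."""
    if raw_query is None:
        return None
    # Drop surrounding whitespace and any quotes wrapped around the query.
    query = raw_query.strip().strip("\"'").strip()
    if not query:
        return None
    # quote_plus turns spaces into '+' and percent-encodes reserved characters.
    return f"https://news.google.com/search?q={urllib.parse.quote_plus(query)}"


print(build_google_news_url("bird flu human concern news"))
# → https://news.google.com/search?q=bird+flu+human+concern+news
```

Characters such as `&` would otherwise be read as parameter separators; `quote_plus` encodes them (`AI & regulation` becomes `AI+%26+regulation`).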
Step 8: Set up the user question
Now we'll define the question we want the agent to answer. The news topic is not set separately; it is derived from the question by the functions in Step 7. In this example, we're:
- Asking the specific question "Is bird flu a serious concern for humans?"
- Letting the agent turn that question into a Google News search for bird flu articles
This demonstrates how the agent can not only summarize news but also answer specific questions about the topic.
query = "Is bird flu a serious concern for humans?"
Step 9: Run the news analyst agent
Finally, we'll create an instance of our news analysis agent for the specified topic and run it with our question. This demonstrates the full workflow:
- The agent accesses Google News for "bird flu" articles
- It extracts content from multiple news sources on this topic
- It analyzes and summarizes the content from different publications
- It generates a comprehensive summary with source-specific details
- It answers our specific question about human health concerns
The formatted output will include an overall summary, source-by-source analysis, and an answer to our question.
news_summary = await get_news_summary(query)
display(Markdown(news_summary))
bird flu human concern news
Handling tool call: extract_data
1. **Overall Summary of the News**:
Recent news articles indicate that bird flu, particularly the H5N1 strain, poses a growing concern for human health. Experts are alarmed by the increasing risk of transmission to humans due to recent cases in livestock, including pigs and cattle. There are fears of possible mutations that might allow for human-to-human transmission. The World Health Organization (WHO) has expressed significant concern over the potential for a pandemic and has stressed the importance of readiness and vigilance in monitoring the situation.
2. **List of Analysis from Various News Sources**:
- Nature.com: Reports indicate a rising risk of bird flu potentially sparking a human pandemic, with scientists warning that the threat level is increasing.
- UC Berkeley School of Public Health: Discusses public health concerns related to bird flu, underlining the need for public awareness.
- TODAY: Covers the first recorded death of a patient in the U.S. from H5N1, emphasizing the urgency of addressing bird flu risks.
- News-Medical.Net: Details a study linking some H5N1 variants in cattle to milder human cases, yet notes that other strains remain a significant threat.
- UPI News: Expresses concerns that the U.S. may not be sufficiently prepared for a potential outbreak of bird flu among humans.
- STAT: Reports the detection of bird flu in a U.S. pig for the first time, escalating concerns regarding risks to humans.
- The Associated Press: Highlights the first U.S. case of bird flu found in a pig, pointing out the potential dangers to human health.
- NBC News: Discusses the urgent need for developing new vaccines as fears about bird flu continue to grow.
- Fox News: Reports on mutations detected in a bird flu patient, raising alarms about the risk of human transmission.
- USA TODAY: Summarizes the ongoing situation with bird flu outbreaks and their implications for human health.
If you have any specific questions about bird flu and its risks to humans, feel free to ask!
Step 10: Try it with your own questions
Now that we've seen the agent in action, you can have it analyze your own queries.
# Example: Get the news summary for another topic
# news_topic = "How's the crypto market doing?"
# news_summary = await get_news_summary(news_topic)
# print(news_summary)
Feel free to experiment with different questions and news aggregation sites!
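If you want summaries for several questions at once, the calls can run concurrently. The sketch below is one way to do that under stated assumptions: `summarize_many` is a hypothetical helper, and it takes the summarizing coroutine function as a parameter so that it works with `get_news_summary` in the notebook or with a stub elsewhere.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional


async def summarize_many(
    questions: List[str],
    summarize: Callable[[str], Awaitable[Optional[str]]],
) -> Dict[str, Optional[str]]:
    """Run one summary per question concurrently and map question -> summary."""
    # asyncio.gather returns results in the same order as its inputs.
    results = await asyncio.gather(*(summarize(q) for q in questions))
    return dict(zip(questions, results))


# In the notebook, usage might look like:
# summaries = await summarize_many(
#     ["How's the crypto market doing?", "How is the current state of the NBA?"],
#     get_news_summary,
# )
```

Each question triggers its own extraction and model calls, so API usage grows with the number of questions.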
Conclusion
In this cookbook, we built a powerful news analysis agent using Hyperbrowser and OpenAI's GPT-4o-mini. Our agent can:
- Access and extract information from news aggregators like Google News
- Process content from multiple news sources on any topic
- Generate comprehensive, balanced summaries with source attribution
- Answer specific questions based on the aggregated news content
- Present information in a well-structured, easy-to-understand format
This tool can be invaluable for:
- Staying informed on complex news topics while reducing reliance on any single source's framing
- Quickly understanding the consensus and divergent views across publications
- Getting concise answers to specific questions about current events
- Saving time by consolidating information from multiple sources
Next Steps
To enhance this news analysis system further, you could:
- Add sentiment analysis to detect media bias
- Implement tracking of topics over time to identify trend changes
- Create a web interface for easier access to news summaries
- Add support for specialized news sources in specific domains (finance, technology, etc.)
- Implement translation for international news coverage
Happy news analyzing!