Building a Twitter Profile Analyzer with Hyperbrowser and GPT-4o-mini
In this cookbook, we'll build a Twitter (X) Profile Analyzer that can extract detailed information from any Twitter profile using a persistent authenticated browser session. This approach allows us to access Twitter data even when a login is required.
We'll use these tools to build our agent:
- Hyperbrowser for authenticated web browsing and data extraction
- OpenAI's GPT-4o-mini for parsing Twitter profiles and answering questions about the data
By the end of this cookbook, you'll have a reusable agent that can analyze any Twitter profile and answer questions about their activity!
import asyncio
import json
import os
from typing import List

from dotenv import load_dotenv
from hyperbrowser import AsyncHyperbrowser
from hyperbrowser.tools import WebsiteExtractTool
from openai import AsyncOpenAI
from openai.types.chat import (
    ChatCompletionMessageParam,
    ChatCompletionMessageToolCall,
    ChatCompletionToolMessageParam,
)
from IPython.display import Markdown, display

load_dotenv()
Prerequisites
To follow along, you'll need the following:
- A Hyperbrowser API key (sign up at hyperbrowser.ai if you don't have one, it's free)
- An OpenAI API key (sign up at openai.com if you don't have one)
- Python 3.9+ installed
Both API keys should be stored in a .env file in the same directory as this notebook with the following format:
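For example (the variable names below match what the initialization code reads; `OPENAI_API_KEY` is the standard variable the OpenAI client picks up automatically):

```
HYPERBROWSER_API_KEY=<your-hyperbrowser-api-key>
OPENAI_API_KEY=<your-openai-api-key>
```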
Step 1: Initialize clients
hb = AsyncHyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))
llm = AsyncOpenAI()
Step 2: Set up a persistent browser profile
A key feature of this cookbook is the use of a persistent browser profile. This lets us maintain a login session across different runs, which is essential for accessing Twitter content that requires authentication.
profile_id = "[If a profile already exists, use it here]"

### If you are using this notebook for the first time, uncomment the following lines and run them
# from hyperbrowser.models import CreateSessionParams, CreateSessionProfile
# profile = await hb.profiles.create()
# print(profile)
# session = await hb.sessions.create(
#     CreateSessionParams(
#         profile=CreateSessionProfile(id=profile.id, persist_changes=True)
#     )
# )
# print(session.live_url)
# profile_id = profile.id
When you uncomment and run the code above for the first time, it will:
- Create a new browser profile
- Start a browser session with that profile
- Show a live URL where you can interact with the browser
- Save the profile ID for future use
You can use the live URL to manually log in to Twitter. Once logged in, the session will be saved to your profile for future API calls.
# await hb.sessions.stop(session.id)
Use the code above to stop the browser session when you're done with manual interaction. Stopping the session is critical: the login data is only persisted to the profile once the session ends.
Step 3: Implement tool handling for web extraction
Now we'll create a function that extracts data from Twitter profiles using Hyperbrowser's API. Notice how the profile_id is injected into the extract params. This ensures the agent is always authenticated when accessing a flow that requires a login.
async def handle_tool_call(
    tc: ChatCompletionMessageToolCall,
) -> ChatCompletionToolMessageParam:
    print(f"Handling tool call: {tc.function.name}")
    try:
        if (
            tc.function.name
            != WebsiteExtractTool.openai_tool_definition["function"]["name"]
        ):
            raise ValueError(f"Tool not found: {tc.function.name}")
        args = json.loads(tc.function.arguments)
        print(args)
        # Inject the persistent profile so the extraction runs in the
        # authenticated browser session.
        extract_job_params = dict(
            **args,
            session_options=dict(
                profile=dict(id=profile_id),
            ),
        )
        content = await WebsiteExtractTool.async_runnable(
            hb=hb,
            params=extract_job_params,
        )
        return {"role": "tool", "tool_call_id": tc.id, "content": content}
    except Exception as e:
        err_msg = f"Error handling tool call: {e}"
        print(err_msg)
        return {
            "role": "tool",
            "tool_call_id": tc.id,
            "content": err_msg,
            "is_error": True,  # type: ignore
        }
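To make the profile injection concrete, here's a small self-contained sketch (the helper name `build_extract_params` is ours, not part of Hyperbrowser) of how the model's JSON tool arguments are merged with the session profile:

```python
import json


def build_extract_params(arguments_json: str, profile_id: str) -> dict:
    # Hypothetical helper mirroring the logic in handle_tool_call: parse
    # the model's JSON tool arguments, then attach the persistent profile
    # so the extraction runs inside the authenticated browser session.
    args = json.loads(arguments_json)
    return dict(**args, session_options=dict(profile=dict(id=profile_id)))


params = build_extract_params('{"urls": ["https://x.com/NASA"]}', "prof_123")
# params now carries both the model's arguments and the session profile
```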
Step 4: Create the agent loop
This function handles the conversation flow between the user, the LLM, and the tools.
It's fairly straightforward; in summary, it:
- Takes a list of messages, including the system prompt and the user's question
- Sends them to OpenAI
- Processes any tool calls
- Repeats the loop until the model returns a final (stop) message
async def agent_loop(messages: list[ChatCompletionMessageParam]) -> str:
    while True:
        response = await llm.beta.chat.completions.parse(
            messages=messages,
            model="gpt-4o-mini",
            tools=[
                WebsiteExtractTool.openai_tool_definition,
            ],
            max_completion_tokens=8000,
        )
        choice = response.choices[0]

        # Append response to messages
        messages.append(choice.message)  # type: ignore

        # Handle tool calls
        if (
            choice.finish_reason == "tool_calls"
            and choice.message.tool_calls is not None
        ):
            tool_result_messages = await asyncio.gather(
                *[handle_tool_call(tc) for tc in choice.message.tool_calls]
            )
            messages.extend(tool_result_messages)
        elif choice.finish_reason == "stop" and choice.message.content is not None:
            return choice.message.content
        else:
            print(choice)
            raise ValueError(f"Unhandled finish reason: {choice.finish_reason}")
Step 5: Design the system prompt
The system prompt guides the LLM's behavior, telling it what to extract from Twitter profiles, what to do with the user's question if provided, and how it should respond.
SYSTEM_PROMPT = """
You are an expert social media manager. You have access to an 'extract_data' tool which can be used to get structured data from a webpage. You can use this tool to get the data from the twitter profile.

Here is the link to the twitter profile: {link}

From the scraped information, you are required to extract the following information:
1. The username of the twitter profile
2. The number of followers of the twitter profile
3. The number of following of the twitter profile
4. The recent tweets of the twitter profile
    - The text of the tweet
    - The number of likes of the tweet
    - The number of replies of the tweet
    - The number of retweets of the tweet
    - The date and time of the tweet
    - The url of the tweet
    - The number of views of the tweet

The user may also have some questions about the twitter profile. You are required to answer the questions based on the information extracted from the twitter profile.

Respond in markdown format.
""".strip()
Step 6: Create a factory function for generating Twitter analyzer agents
Now we'll create a factory function that generates specialized Twitter profile analyzers.
This function:
- Takes the Twitter profile link
- Injects it into the system prompt
- Bundles everything into a reusable agent function that can be called repeatedly
def make_twitter_profile_agent(link_to_profile: str):
    # Ensure the link has a scheme; bare links like "x.com/NASA" also work.
    if not (
        link_to_profile.startswith("http://")
        or link_to_profile.startswith("https://")
    ):
        link_to_profile = f"https://{link_to_profile}"
    sysprompt = SYSTEM_PROMPT.format(
        link=link_to_profile,
    )

    async def twitter_profile_agent(question: str) -> str:
        return await agent_loop(
            [
                {"role": "system", "content": sysprompt},
                {"role": "user", "content": f"The user asked: {question}"},
            ]
        )

    return twitter_profile_agent
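The scheme check at the top of the factory can be isolated and tested on its own. This standalone sketch (the function name is ours, not from the cookbook) shows the behavior:

```python
def normalize_profile_link(link: str) -> str:
    # Same logic as the check inside make_twitter_profile_agent:
    # bare links like "x.com/NASA" get an https:// scheme prepended,
    # while fully qualified URLs pass through unchanged.
    if not (link.startswith("http://") or link.startswith("https://")):
        link = f"https://{link}"
    return link


print(normalize_profile_link("x.com/NASA"))          # https://x.com/NASA
print(normalize_profile_link("https://x.com/NASA"))  # https://x.com/NASA
```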
Step 7: Test the Twitter analyzer with NASA's profile
Let's test our agent by analyzing NASA's Twitter profile and asking a question about their latest news. Here's what the full flow looks like:
- The agent receives a question about NASA's Twitter profile
- It navigates to NASA's Twitter profile and performs a structured extraction
- GPT-4o-mini processes the structured extraction result
- It uses the tool call result to answer the user's question
- It returns an answer satisfying the user's question(s)
You can see the flow below, along with which tools were called.
twitter_profile_agent = make_twitter_profile_agent("https://x.com/NASA")
question = "What's the latest news from NASA?"
answer = await twitter_profile_agent(question)
Handling tool call: extract_data
{'urls': ['https://x.com/NASA'], 'prompt': 'Extract recent tweets from the NASA Twitter profile, including the text, likes, replies, retweets, date and time, URL, and views for each tweet.', 'schema': '{"type":"object","properties":{"tweets":{"type":"array","items":{"type":"object","properties":{"text":{"type":"string"},"likes":{"type":"integer"},"replies":{"type":"integer"},"retweets":{"type":"integer"},"date_time":{"type":"string"},"url":{"type":"string"},"views":{"type":"integer"}}}}}}', 'max_links': 5}
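The `schema` string the model passed in the tool call above is plain JSON Schema. Parsing it shows the shape of the structured extraction the agent requested:

```python
import json

# The JSON Schema string from the tool call above, parsed for inspection.
schema = json.loads(
    '{"type":"object","properties":{"tweets":{"type":"array","items":'
    '{"type":"object","properties":{"text":{"type":"string"},'
    '"likes":{"type":"integer"},"replies":{"type":"integer"},'
    '"retweets":{"type":"integer"},"date_time":{"type":"string"},'
    '"url":{"type":"string"},"views":{"type":"integer"}}}}}}'
)

tweet_fields = schema["properties"]["tweets"]["items"]["properties"]
print(sorted(tweet_fields))
# ['date_time', 'likes', 'replies', 'retweets', 'text', 'url', 'views']
```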
Step 8: Display the formatted results
display(Markdown(answer))
Here's the latest news from NASA as per their Twitter profile:
- **Turn @NASAHubble sights into Hubble sounds!**
  - Text: With Hearing Hubble, you can make your own sonifications out of our universe's most spectacular galaxies and nebulae. Choose an image, create your own symphony, and share your masterpiece. Start now: link
  - Likes: 70
  - Replies: 169
  - Retweets: 958
  - Date & Time: 1h ago
  - Views: 160,000
- **Can't stop, won't stop going to space 🚀**
  - Text: NASA's @SpaceX #Crew10 mission is scheduled to send four new crew members to the @Space_Station at 7:48pm ET (2348 UTC) on Wednesday, March 12. Live coverage starts at 3:45pm ET (1945 UTC)—watch with us here on X.
  - Likes: 184
  - Replies: 309
  - Retweets: 1800
  - Date & Time: 4h ago
  - Views: 250,000
- **LIVE: Two missions - one launch!**
  - Text: Watch with us as PUNCH and SPHEREx share a ride to space. They're set to lift off from California's @SLDelta30 at 11:10pm ET (0310 UTC March 11).
  - Likes: 220
  - Replies: 615
  - Retweets: 3400
  - Date & Time: 18h ago
  - Views: 5,100,000

You can follow @NASA for more updates!
Conclusion
In this cookbook, we built a Twitter Profile Analyzer using Hyperbrowser and OpenAI's GPT-4o-mini. This agent can:
- Access Twitter profiles using an authenticated browser session
- Extract profile information and recent tweets
- Analyze the content and provide insights based on user questions
- Format the output in readable markdown
This pattern can be extended to work with other social media platforms or to perform more complex analyses of social media activity.
Next Steps
To take this further, you might consider:
- Adding sentiment analysis of tweets
- Implementing tracking of engagement metrics over time
- Building a web interface for easier interaction
- Implementing scheduled runs to monitor profile activity
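As a starting point for the engagement-tracking idea, a minimal sketch (the file format and function name are our choice, not part of the cookbook) could append timestamped metric snapshots to a JSONL file for later comparison:

```python
import json
import time


def log_engagement_snapshot(path: str, profile: str, tweets: list) -> None:
    # Append one timestamped snapshot of tweet metrics per run; diffing
    # consecutive lines shows how likes/retweets/views evolve over time.
    record = {"ts": time.time(), "profile": profile, "tweets": tweets}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")


log_engagement_snapshot(
    "nasa_metrics.jsonl",
    "https://x.com/NASA",
    [{"url": "https://x.com/NASA/status/1", "likes": 70, "views": 160000}],
)
```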
Happy social media analyzing!