GitHub Changelog Generator with Hyperbrowser

In this cookbook, we'll build an AI-enhanced changelog generator that extracts commits from a GitHub repository and turns them into organized, human-readable release notes. This tool:

  • Compares any two commits, branches, or tags in a GitHub repository
  • Extracts detailed information about each commit and file change
  • Organizes changes into logical categories (fixes, features, improvements)
  • Generates clear, concise release notes using AI to interpret technical changes

This approach removes the manual work of collecting and organizing commits for release documentation, and it produces more consistent, user-friendly changelogs.

Prerequisites

Before starting, you'll need:

  • Hyperbrowser API key (sign up at hyperbrowser.ai if you don't have one)
  • OpenAI API key (for AI-generated changelog enhancements)
  • Python 3.9+ installed

Store your API keys in a .env file in the same directory as this notebook:

HYPERBROWSER_API_KEY=your_hyperbrowser_key_here
OPENAI_API_KEY=your_openai_key_here
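
Before running the later cells, it can help to confirm that both keys were actually loaded. The helper below is a minimal sketch: missing_keys is a hypothetical name (not part of either SDK), and it only checks that the variables are non-empty, not that the keys are valid. In the notebook it would be called after load_dotenv().

```python
import os
from typing import Iterable, List, Mapping

# The two variable names used in the .env file above.
REQUIRED_KEYS = ("HYPERBROWSER_API_KEY", "OPENAI_API_KEY")


def missing_keys(
    env: Mapping[str, str] = os.environ,
    required: Iterable[str] = REQUIRED_KEYS,
) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_keys()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required API keys are set.")
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary instead of the real process environment.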

Step 1: Import Libraries

We start by importing the necessary packages and loading the API keys from the .env file. The key components include:

  • AsyncHyperbrowser: For web extraction and GitHub comparison data retrieval
  • AsyncOpenAI: For intelligent formatting and categorization of commit messages
  • BaseModel from Pydantic: For structured data parsing and validation

These tools work together to extract, process, and present GitHub commit data in a standardized format.

import os
from typing import List, Optional

from dotenv import load_dotenv
from hyperbrowser import AsyncHyperbrowser
from hyperbrowser.models.extract import StartExtractJobParams
from IPython.display import Markdown, display
from openai import AsyncOpenAI
from openai.types.chat import ChatCompletionMessageParam
from pydantic import BaseModel

# Load API keys from the .env file into the environment
load_dotenv()

Step 2: Initialize API Clients

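Now we'll create instances of the API clients we'll be using. The AsyncHyperbrowser client handles web extraction tasks, while the AsyncOpenAI client generates the changelog text. AsyncOpenAI() is called without arguments because it reads OPENAI_API_KEY from the environment by default.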

hb = AsyncHyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))
llm = AsyncOpenAI()

Step 3: Define Data Models

Before we can extract GitHub comparison data, we need to define structured models that represent the information we want to capture. These Pydantic models will:

  1. Validate the types of the extracted data
  2. Provide a clear schema for the extraction API to follow
  3. Make the data easily serializable and traversable

Our model hierarchy includes:

  • FileChange: Details about modifications to individual files
  • Commit: Information about each commit in the comparison
  • GitComparisonSchema: The top-level container for all extracted data

Following the models, we define the extract_git_comparison function that uses Hyperbrowser's extraction capabilities to pull structured data from GitHub comparison pages.

# Define Pydantic models for the data schema
class FileChange(BaseModel):
    file_path: str
    additions: int
    deletions: int
    code_change: str
    raw_code_original: str
    raw_code_changed: str
    is_visible: bool

    def __str__(self):
        return f"Path:{self.file_path}\naddition=>+{self.additions}|removals=>-{self.deletions}\n{self.code_change}"


class Commit(BaseModel):
    message: str
    # Commits without a body have no description, so default to None
    description: Optional[str] = None
    committer_name: str
    is_verified: bool

    def __str__(self):
        return f"Commit:{self.message}\nDescription:{self.description}\nCommitter:{self.committer_name}"


class GitComparisonSchema(BaseModel):
    num_commits: int
    num_files_changed: int
    commits: List[Commit]
    file_changes: List[FileChange]

    def __str__(self):
        # Join outside the f-string: backslashes inside f-string
        # expressions are a syntax error before Python 3.12.
        commits = "\n".join(str(commit) for commit in self.commits)
        file_changes = "\n".join(str(file) for file in self.file_changes)
        return f"""# Git Comparison Data
number of commits: {self.num_commits}
number of files changed: {self.num_files_changed}
commits:
{commits}
file changes:
{file_changes}
"""

async def extract_git_comparison(comparison_url: str) -> Optional[GitComparisonSchema]:
    """Extract git comparison data using Hyperbrowser."""
    try:
        api_key = os.getenv("HYPERBROWSER_API_KEY")
        if not api_key:
            print("❌ Error: HYPERBROWSER_API_KEY not set in .env file")
            return None
        print("🔄 Extracting comparison data from GitHub...")
        result = await hb.extract.start_and_wait(
            params=StartExtractJobParams(
                urls=[comparison_url],
                schema=GitComparisonSchema.model_json_schema(),
                prompt="""
                Extract the following information from this GitHub comparison page:
                1. The total number of commits in this comparison
                2. The total number of files changed
                3. For each commit:
                   - The main commit message (title)
                   - The commit description (body, if any)
                   - The name of the committer
                   - Whether the commit is verified (true/false)
                4. For each file changed:
                   - The file path
                   - The number of additions (green lines)
                   - The number of deletions (red lines)
                   - A summary of the actual changes
                   - Whether the file change is visible in the UI (true/false)
                """,
            )
        )
        print("✅ Successfully extracted comparison data!")
        # Validate the raw result against the schema before returning it
        return GitComparisonSchema.model_validate(result.data)
    except Exception as e:
        print(f"❌ Error extracting data: {str(e)}")
        return None
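
To make the shape of the extracted data concrete, here is a small sketch that works on a plain dictionary laid out like GitComparisonSchema. The sample payload is invented for illustration, and summarize_file_changes is a hypothetical helper, not part of Hyperbrowser or Pydantic.

```python
from typing import Any, Dict, List


def summarize_file_changes(comparison: Dict[str, Any]) -> List[str]:
    """Render one line per changed file, largest change first."""
    files = sorted(
        comparison.get("file_changes", []),
        key=lambda f: f["additions"] + f["deletions"],
        reverse=True,
    )
    return [
        f"{f['file_path']}: +{f['additions']} / -{f['deletions']}"
        for f in files
    ]


# Invented sample data in the same shape as GitComparisonSchema.
sample = {
    "num_commits": 2,
    "num_files_changed": 2,
    "commits": [],
    "file_changes": [
        {"file_path": "README.md", "additions": 1, "deletions": 0},
        {"file_path": "src/app.py", "additions": 10, "deletions": 4},
    ],
}

for line in summarize_file_changes(sample):
    print(line)
```

With a validated model, comparison_data.model_dump() should produce a dictionary of this shape that can be passed to the helper.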

Step 4: Define Changelog Generation Functions

With our data models in place, we now implement the core functionality for generating changelogs. This section contains two key functions:

  1. get_comparison_url: Constructs a valid GitHub comparison URL from a repository URL and two reference points (commits, branches, or tags)

  2. generate_ai_changelog: The primary function that orchestrates the entire process:

  • Fetches comparison data using our extraction function
  • Formats the data for the AI model
  • Sends the data to OpenAI for intelligent analysis and formatting
  • Returns a well-structured changelog

The AI component then:

  • Categorizes changes (fixes, features, refactoring, etc.)
  • Merges related commits
  • Translates technical descriptions into user-friendly language
  • Highlights the most significant changes

def get_comparison_url(git_url: str, start: str, end: str) -> str:
    # Remove .git extension if present
    if git_url.endswith(".git"):
        git_url = git_url[:-4]
    return f"{git_url}/compare/{start}...{end}"


async def generate_ai_changelog(git_url, starting_commit, ending_commit):
    """Generate AI-enhanced changelog from GitHub repository comparison."""
    comparison_url = get_comparison_url(git_url, starting_commit, ending_commit)
    print(f"🔗 Comparison URL: {comparison_url}")
    # Extract data
    comparison_data = await extract_git_comparison(comparison_url)
    if not comparison_data:
        return None
    repo_name = git_url.split("/")[-1].replace(".git", "")
    messages: List[ChatCompletionMessageParam] = [
        {
            "role": "system",
            "content": """You are a helpful assistant that formats git commit data into a readable changelog. Here are your parameters for making good changelogs
Guiding Principles
- Changelogs are for humans, not machines.
- There should be an entry for every single version.
- The same types of changes should be grouped.
- Versions and sections should be linkable.
- The latest version comes first.
- The release date of each version is displayed.
- Mention whether you follow Semantic Versioning.
Types of changes
- Added for new features.
- Changed for changes in existing functionality.
- Deprecated for soon-to-be removed features.
- Removed for now removed features.
- Fixed for any bug fixes.
- Security in case of vulnerabilities.
""",
            "name": "changelog_assistant",
        },
        {
            "role": "user",
            # str() on the model uses the __str__ methods defined in Step 3
            "content": f"Here is the comparison data for {repo_name}:\n\n{str(comparison_data)}",
        },
    ]
    # Generate AI changelog
    print("🤖 Generating AI-enhanced changelog...")
    ai_changelog = await llm.chat.completions.create(
        messages=messages,
        model="gpt-4o-mini",
    )
    return ai_changelog
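
The grouping itself is left to the model, but the category names from the system prompt can also be applied with a simple keyword heuristic, for example as a rough sanity check on the model's output. The sketch below is an assumption made for illustration: guess_category, group_commits, and the keyword table are hypothetical, and this is not how the model decides.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword table. Order matters: the first match wins.
# Matching is by substring, so it is deliberately crude
# (for instance, "address" would match "add").
CATEGORY_KEYWORDS: List[Tuple[str, Tuple[str, ...]]] = [
    ("Security", ("security", "vulnerability", "cve")),
    ("Fixed", ("fix", "bug", "patch")),
    ("Deprecated", ("deprecate",)),
    ("Removed", ("remove", "delete", "drop")),
    ("Added", ("add", "introduce", "enable")),
]


def guess_category(message: str) -> str:
    """Map a commit message to a changelog category, defaulting to Changed."""
    lowered = message.lower()
    for category, keywords in CATEGORY_KEYWORDS:
        if any(keyword in lowered for keyword in keywords):
            return category
    return "Changed"


def group_commits(messages: List[str]) -> Dict[str, List[str]]:
    """Group commit messages by their guessed category, keeping input order."""
    grouped: Dict[str, List[str]] = {}
    for message in messages:
        grouped.setdefault(guess_category(message), []).append(message)
    return grouped


print(group_commits(["Fix typo", "Add docs", "Update build configuration"]))
```

A heuristic like this cannot merge related commits or rewrite them in user-friendly language, which is the part the model is used for here.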

Step 5: Execute the Changelog Generator

Now we'll test our changelog generator with a real-world example: comparing two specific commits in the React repository. This demonstrates the entire workflow:

  1. We specify the GitHub repository URL and the two commits to compare
  2. The system fetches the comparison data from GitHub
  3. The AI processes the raw commit data and generates a structured changelog

While this example uses specific commit hashes, the same approach works with branch names (e.g., "main...develop") or tags (e.g., "v1.0.0...v1.1.0").
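
As a sketch of that flexibility, the variant below builds comparison URLs for tags and branches and also tolerates a trailing slash. build_compare_url is a hypothetical helper written for this example, and the repository URL and reference names are placeholders. It mirrors the get_comparison_url logic and makes no network requests.

```python
def build_compare_url(repo_url: str, base: str, head: str) -> str:
    """Build a GitHub compare URL of the form <repo>/compare/<base>...<head>."""
    cleaned = repo_url.rstrip("/")
    if cleaned.endswith(".git"):
        cleaned = cleaned[: -len(".git")]
    return f"{cleaned}/compare/{base}...{head}"


# Placeholder repository and references, for illustration only.
print(build_compare_url("https://github.com/example/project.git", "v1.0.0", "v1.1.0"))
print(build_compare_url("https://github.com/example/project/", "main", "develop"))
```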

git_url = "https://github.com/facebook/react"  # Replace with your repository URL
starting_commit = "50ab2dde940bf0027773a944da005277b3d5598a"  # Replace with starting commit/branch/tag
ending_commit = "0ca3deebcf20d2514771a568e1be08801da5cf85"  # Replace with ending commit/branch/tag

# Run the function with the provided inputs
ai_changelog = await generate_ai_changelog(git_url, starting_commit, ending_commit)

🔗 Comparison URL: https://github.com/facebook/react/compare/50ab2dde940bf0027773a944da005277b3d5598a...0ca3deebcf20d2514771a568e1be08801da5cf85

🔄 Extracting comparison data from GitHub...

✅ Successfully extracted comparison data!

🤖 Generating AI-enhanced changelog...

Step 6: Display the AI-Generated Changelog

Finally, we'll display the AI-generated changelog in a clean, readable format. This output includes:

  1. A formatted Markdown version of the changelog
  2. Changes organized by type (fixes, changes, additions, deletions)
  3. A link to the original GitHub comparison for reference

The resulting changelog is more readable than raw commit logs and can serve as a draft for release notes or documentation.

if ai_changelog:
    print("\n🤖 AI-Generated Changelog:")
    message = ai_changelog.choices[0].message
    # content can be None, so fall back to an empty string
    display(Markdown(message.content or ""))
    # Show link to comparison
    comparison_url = get_comparison_url(git_url, starting_commit, ending_commit)
    print(f"\n🔗 View original comparison on GitHub: {comparison_url}")
else:
    print("❌ Failed to generate AI changelog. Check the errors above.")

🤖 AI-Generated Changelog:
# Changelog

[Unreleased] - TBD

Fixed

  • Fix: Addressed an issue with finished view transitions animations that could cause conflicts with new view transitions, resolving visual glitches in Safari. Commit by sebmarkbage

  • Fix: Resolved a critical problem with the moveBefore function to ensure it only operates on moved elements, optimizing performance. Commit by sebmarkbage

  • Fix: Corrected the output platform configuration in esbuild due to a copy-paste error. Commit by poteto

Changed

  • Changed: Updated the ReactFiberConfigDOM.js file to ensure proper configuration handling. Commit by sebmarkbage

  • Changed: Modified the tsup.config.ts file to improve build configurations. Commit by sebmarkbage

  • Changed: Adjusted the ReactFeatureFlags.js to better reflect feature management strategies. Commit by sebmarkbage

Added

  • Added: Enabled the moveBefore feature in experimental releases, allowing early feature detection with caution based on current browser support. Commit by sebmarkbage

Versioning

This project adheres to Semantic Versioning.


Note: Entries are linked to specific commits for reference.

Extension Ideas

The changelog generator could be enhanced in several ways:

  1. Release Notes Generator: Extend with version numbering and release date tracking
  2. Integration with GitHub Actions: Automatically generate changelogs on new tags or releases
  3. Custom Templates: Allow different formatting styles based on project requirements

Each of these enhancements builds on the foundation we've created, making the changelog process even more valuable for development teams.
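
As one possible starting point for the release notes idea, the sketch below places a new dated release section at the top of existing changelog text. It is an illustration under assumptions: add_release_section is a hypothetical helper, the "## [version] - date" heading style is one common convention rather than something this notebook produces, and the version number and date must be supplied by the caller.

```python
from datetime import date
from typing import Optional


def add_release_section(
    existing: str, version: str, body: str, released: Optional[date] = None
) -> str:
    """Return changelog text with a new dated release section placed first."""
    released = released or date.today()
    section = f"## [{version}] - {released.isoformat()}\n\n{body.strip()}\n"
    title = "# Changelog\n"
    # Drop an existing top-level title so it is not duplicated.
    rest = existing[len(title):] if existing.startswith(title) else existing
    rest = rest.strip()
    parts = [title, section]
    if rest:
        parts.append(rest + "\n")
    return "\n".join(parts)


previous = "# Changelog\n\n## [1.0.0] - 2024-01-01\n\n- Initial release\n"
print(add_release_section(previous, "1.1.0", "### Fixed\n- Example fix", date(2024, 2, 1)))
```

The body argument could be the Markdown returned by generate_ai_changelog, and the result could then be written back to a changelog file.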

Conclusion

We've built an AI-driven changelog generator that transforms basic Git commit information into structured, human-readable release notes. This tool demonstrates how combining web extraction with AI can dramatically improve developer workflows by:

  1. Automating tedious documentation tasks that developers typically avoid or rush through
  2. Creating consistent, well-structured changelogs that follow best practices in technical documentation
  3. Translating developer-centric commit messages into user-friendly change descriptions
  4. Organizing changes by category to highlight the most important information

This notebook provides a foundation that can be extended for more sophisticated changelog generation, including integration with CI/CD pipelines, release management systems, or documentation automation.