GitHub Changelog Generator with Hyperbrowser
In this cookbook, we'll create a powerful AI-enhanced changelog generator that automatically extracts commits from GitHub repositories and transforms them into organized, human-readable release notes. This tool:
- Compares any two commits, branches, or tags in a GitHub repository
- Extracts detailed information about each commit and file change
- Organizes changes into logical categories (fixes, features, improvements)
- Generates clear, concise release notes using AI to interpret technical changes
This approach eliminates the tedious manual work of aggregating and organizing commits for release documentation, saving developers hours of work while producing more consistent and user-friendly changelogs.
Prerequisites
Before starting, you'll need:
- Hyperbrowser API key (sign up at hyperbrowser.ai if you don't have one)
- OpenAI API key (for AI-generated changelog enhancements)
- Python 3.9+ installed
Store your API keys in a .env file in the same directory as this notebook:
HYPERBROWSER_API_KEY=your_hyperbrowser_key_here
OPENAI_API_KEY=your_openai_key_here
Step 1: Import Libraries and Load Environment Variables
We start by importing the necessary packages and loading the API keys from the .env file. The key components include:
- AsyncHyperbrowser: For web extraction and GitHub comparison data retrieval
- AsyncOpenAI: For intelligent formatting and categorization of commit messages
- BaseModel from Pydantic: For structured data parsing and validation
These tools work together to extract, process, and present GitHub commit data in a standardized format.
import os
from typing import List, Optional

from dotenv import load_dotenv
from IPython.display import display, Markdown
from hyperbrowser import AsyncHyperbrowser
from hyperbrowser.models.extract import StartExtractJobParams
from pydantic import BaseModel
from openai import AsyncOpenAI
from openai.types.chat import ChatCompletionMessageParam

# Load environment variables
load_dotenv()
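Once load_dotenv() has run, it can help to confirm that both keys are actually present before any API call is made. The following is a minimal sketch, not part of the original notebook; missing_keys and REQUIRED_KEYS are hypothetical names introduced here for illustration.

```python
import os
from typing import Iterable, List, Mapping

# Names of the variables this cookbook expects in the .env file.
REQUIRED_KEYS = ("HYPERBROWSER_API_KEY", "OPENAI_API_KEY")


def missing_keys(
    env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS
) -> List[str]:
    """Return the required variable names that are absent or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


# Typical use after load_dotenv(): check the live process environment.
missing = missing_keys(os.environ)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Passing the environment in as a mapping keeps the check easy to exercise with a plain dictionary.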
Step 2: Initialize API Clients
Now we'll create instances of the API clients we'll be using. The AsyncHyperbrowser client handles web extraction tasks, while the AsyncOpenAI client, which reads OPENAI_API_KEY from the environment by default, powers the AI-driven changelog generation.
hb = AsyncHyperbrowser(api_key=os.getenv("HYPERBROWSER_API_KEY"))
llm = AsyncOpenAI()
Step 3: Define Data Models
Before we can extract GitHub comparison data, we need to define structured models that represent the information we want to capture. These Pydantic models will:
- Enforce strict typing and validation for our extracted data
- Provide a clear schema for the extraction API to follow
- Make the data easily serializable and traversable
Our model hierarchy includes:
- FileChange: Details about modifications to individual files
- Commit: Information about each commit in the comparison
- GitComparisonSchema: The top-level container for all extracted data
Following the models, we define the extract_git_comparison function that uses Hyperbrowser's extraction capabilities to pull structured data from GitHub comparison pages.
# Define Pydantic models for the data schema
class FileChange(BaseModel):
    file_path: str
    additions: int
    deletions: int
    code_change: str
    raw_code_original: str
    raw_code_changed: str
    is_visible: bool

    def __str__(self):
        return f"Path:{self.file_path}\naddition=>+{self.additions}|removals=>-{self.deletions}\n{self.code_change}"


class Commit(BaseModel):
    message: str
    description: Optional[str]
    committer_name: str
    is_verified: bool

    def __str__(self):
        return f"Commit:{self.message}\nDescription:{self.description}\nCommitter:{self.committer_name}"


class GitComparisonSchema(BaseModel):
    num_commits: int
    num_files_changed: int
    commits: List[Commit]
    file_changes: List[FileChange]

    def __str__(self):
        # Join outside the f-string: a backslash inside an f-string
        # expression is a syntax error before Python 3.12.
        commits_text = "\n".join(str(commit) for commit in self.commits)
        files_text = "\n".join(str(file) for file in self.file_changes)
        return (
            "# Git Comparison Data\n"
            f"number of commits: {self.num_commits}\n"
            f"number of files changed: {self.num_files_changed}\n"
            f"commits:\n{commits_text}\n"
            f"file changes:\n{files_text}"
        )


async def extract_git_comparison(comparison_url: str):
    """Extract git comparison data using Hyperbrowser."""
    try:
        api_key = os.getenv("HYPERBROWSER_API_KEY")
        if not api_key:
            print("❌ Error: HYPERBROWSER_API_KEY not set in .env file")
            return None

        print("🔄 Extracting comparison data from GitHub...")
        result = await hb.extract.start_and_wait(
            params=StartExtractJobParams(
                urls=[comparison_url],
                schema=GitComparisonSchema.model_json_schema(),
                prompt="""Extract the following information from this GitHub comparison page:
1. The total number of commits in this comparison
2. The total number of files changed
3. For each commit:
   - The main commit message (title)
   - The commit description (body, if any)
   - The name of the committer
   - Whether the commit is verified (true/false)
4. For each file changed:
   - The file path
   - The number of additions (green lines)
   - The number of deletions (red lines)
   - A summary of the actual changes
   - Whether the commit change is visible in the UI (true/false)""",
            )
        )
        print("✅ Successfully extracted comparison data!")
        return GitComparisonSchema.model_validate(result.data)
    except Exception as e:
        print(f"❌ Error extracting data: {str(e)}")
        return None
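Because extract_git_comparison reports failure by returning None rather than raising, a caller may want to try again after a transient error. The following is a minimal sketch of one way to do that, assuming that retrying is safe for your use case; retry_until_result is a hypothetical helper, not part of Hyperbrowser or the original notebook.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def retry_until_result(
    make_call: Callable[[], Awaitable[Optional[T]]],
    attempts: int = 3,
    delay_seconds: float = 2.0,
) -> Optional[T]:
    """Await make_call() up to `attempts` times; return the first non-None result."""
    for attempt in range(1, attempts + 1):
        result = await make_call()
        if result is not None:
            return result
        if attempt < attempts:
            # Pause before the next try; no pause after the final attempt.
            await asyncio.sleep(delay_seconds)
    return None


# In the notebook this could wrap the extraction function, for example:
# data = await retry_until_result(lambda: extract_git_comparison(comparison_url))
```

The helper takes a zero-argument callable so that each attempt creates a fresh coroutine, since a coroutine object cannot be awaited twice.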
Step 4: Define Changelog Generation Functions
With our data models in place, we now implement the core functionality for generating changelogs. This section contains two key functions:
- get_comparison_url: Constructs a valid GitHub comparison URL from a repository URL and two reference points (commits, branches, or tags)
- generate_ai_changelog: The primary function that orchestrates the entire process:
  - Fetches comparison data using our extraction function
  - Formats the data for the AI model
  - Sends the data to OpenAI for intelligent analysis and formatting
  - Returns a well-structured changelog
The AI component then:
- Categorizes changes (fixes, features, refactoring, etc.)
- Merges related commits
- Translates technical descriptions into user-friendly language
- Highlights the most significant changes
def get_comparison_url(git_url: str, start: str, end: str) -> str:
    # Remove .git extension if present
    if git_url.endswith(".git"):
        git_url = git_url[:-4]
    return f"{git_url}/compare/{start}...{end}"


async def generate_ai_changelog(git_url, starting_commit, ending_commit):
    """Generate AI-enhanced changelog from GitHub repository comparison."""
    comparison_url = get_comparison_url(git_url, starting_commit, ending_commit)
    print(f"🔗 Comparison URL: {comparison_url}")

    # Extract data
    comparison_data = await extract_git_comparison(comparison_url)
    if not comparison_data:
        return None

    messages: List[ChatCompletionMessageParam] = [
        {
            "role": "system",
            "content": """You are a helpful assistant that formats git commit data into a readable changelog. Here are your parameters for making good changelogs

Guiding Principles
- Changelogs are for humans, not machines.
- There should be an entry for every single version.
- The same types of changes should be grouped.
- Versions and sections should be linkable.
- The latest version comes first.
- The release date of each version is displayed.
- Mention whether you follow Semantic Versioning.

Types of changes
- Added for new features.
- Changed for changes in existing functionality.
- Deprecated for soon-to-be removed features.
- Removed for now removed features.
- Fixed for any bug fixes.
- Security in case of vulnerabilities.""",
            "name": "changelog_assistant",
        },
        {
            "role": "user",
            "content": f"Here is the git comparison data:\n\n{str(comparison_data)}",
        },
    ]

    # Generate AI changelog
    print("🤖 Generating AI-enhanced changelog...")
    ai_changelog = await llm.chat.completions.create(
        messages=messages,
        model="gpt-4o-mini",
    )
    return ai_changelog
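get_comparison_url as written handles a trailing .git but not a trailing slash or an empty reference. The following is a minimal sketch of a slightly more defensive variant; build_comparison_url is a hypothetical name, and the example.com-style repository URL in the comments is a placeholder rather than a real project.

```python
def build_comparison_url(git_url: str, start: str, end: str) -> str:
    """Build a GitHub compare URL, tolerating a trailing slash or .git suffix."""
    if not start or not end:
        raise ValueError("both start and end references are required")
    # Normalize the repository URL before appending the compare path.
    base = git_url.strip().rstrip("/")
    if base.endswith(".git"):
        base = base[:-4]
    return f"{base}/compare/{start}...{end}"


# Placeholder repository; tags, branches, and commit hashes are all passed the same way.
url = build_comparison_url("https://github.com/example/project.git", "v1.0.0", "v1.1.0")
print(url)
```

Raising on an empty reference surfaces the mistake before an extraction job is started against a malformed URL.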
Step 5: Execute the Changelog Generator
Now we'll test our changelog generator with a real-world example - comparing two specific commits in the React repository. This demonstrates the entire workflow:
- We specify the GitHub repository URL and the two commits to compare
- The system fetches the comparison data from GitHub
- The AI processes the raw commit data and generates a structured changelog
While this example uses specific commit hashes, the same approach works with branch names (e.g., "main...develop") or tags (e.g., "v1.0.0...v1.1.0").
git_url = "https://github.com/facebook/react"  # Replace with actual repo URL
starting_commit = "50ab2dde940bf0027773a944da005277b3d5598a"  # Replace with starting commit/branch/tag
ending_commit = "0ca3deebcf20d2514771a568e1be08801da5cf85"  # Replace with ending commit/branch/tag

# Run the function with the provided inputs
ai_changelog = await generate_ai_changelog(git_url, starting_commit, ending_commit)
🔗 Comparison URL: https://github.com/facebook/react/compare/50ab2dde940bf0027773a944da005277b3d5598a...0ca3deebcf20d2514771a568e1be08801da5cf85
🔄 Extracting comparison data from GitHub...
✅ Successfully extracted comparison data!
🤖 Generating AI-enhanced changelog...
Step 6: Display the AI-Generated Changelog
Finally, we'll display the AI-generated changelog in a clean, readable format. This output includes:
- A formatted Markdown version of the changelog
- Changes organized by type (fixes, changes, additions, deletions)
- A link to the original GitHub comparison for reference
The resulting changelog is significantly more readable and useful than raw commit logs, making it immediately ready for release notes or documentation.
if ai_changelog:
    print("\n🤖 AI-Generated Changelog:")
    message = ai_changelog.choices[0].message
    display(Markdown(message.content))

    # Show link to comparison
    comparison_url = get_comparison_url(git_url, starting_commit, ending_commit)
    print(f"\n🔗 View original comparison on GitHub: {comparison_url}")
else:
    print("❌ Failed to generate AI changelog. Check the errors above.")
🤖 AI-Generated Changelog:
# Changelog
[Unreleased] - TBD
Fixed
- Fix: Addressed an issue with finished view transitions animations that could cause conflicts with new view transitions, resolving visual glitches in Safari. Commit by sebmarkbage
- Fix: Resolved a critical problem with the moveBefore function to ensure it only operates on moved elements, optimizing performance. Commit by sebmarkbage
- Fix: Corrected the output platform configuration in esbuild due to a copy-paste error. Commit by poteto
Changed
- Changed: Updated the ReactFiberConfigDOM.js file to ensure proper configuration handling. Commit by sebmarkbage
- Changed: Modified the tsup.config.ts file to improve build configurations. Commit by sebmarkbage
- Changed: Adjusted the ReactFeatureFlags.js to better reflect feature management strategies. Commit by sebmarkbage
Added
- Added: Enabled the moveBefore feature in experimental releases, allowing early feature detection with caution based on current browser support. Commit by sebmarkbage
Versioning
This project adheres to Semantic Versioning.
Note: Entries are linked to specific commits for reference.
🔗 View original comparison on GitHub: https://github.com/facebook/react/compare/50ab2dde940bf0027773a944da005277b3d5598a...0ca3deebcf20d2514771a568e1be08801da5cf85
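display(Markdown(...)) only renders inside a notebook. If the changelog should also end up on disk, something like the following could work. This is a minimal sketch: changelog_text and save_changelog are hypothetical helpers, and fake_response is a stand-in object shaped like the chat completion used above, included only so the example runs without an API call.

```python
from pathlib import Path
from types import SimpleNamespace
from typing import Any, Optional


def changelog_text(response: Any) -> Optional[str]:
    """Return the first choice's message content, or None if there is none."""
    if response is None or not getattr(response, "choices", None):
        return None
    return response.choices[0].message.content


def save_changelog(response: Any, path: str = "CHANGELOG.md") -> bool:
    """Write the changelog text to path; return False if there is nothing to write."""
    text = changelog_text(response)
    if not text:
        return False
    Path(path).write_text(text.rstrip() + "\n", encoding="utf-8")
    return True


# Stand-in for the real response object (illustration only).
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            message=SimpleNamespace(content="# Changelog\n\nFixed\n- Example entry")
        )
    ]
)
print(changelog_text(fake_response))
```

In the notebook, save_changelog(ai_changelog) would take the place of the stand-in object.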
Extension Ideas
The changelog generator could be enhanced in several ways:
- Release Notes Generator: Extend with version numbering and release date tracking
- Integration with GitHub Actions: Automatically generate changelogs on new tags or releases
- Custom Templates: Allow different formatting styles based on project requirements
Each of these enhancements builds on the foundation we've created, making the changelog process even more valuable for development teams.
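As a starting point for the release notes idea, the generated text could be wrapped with a version and date heading in place of the "[Unreleased] - TBD" placeholder seen in the sample output. The following is a minimal sketch under that assumption; with_release_header is a hypothetical helper, and the heading format is one common convention rather than anything the notebook prescribes.

```python
from datetime import date
from typing import Optional


def with_release_header(
    body: str, version: str, released: Optional[date] = None
) -> str:
    """Prefix changelog text with a '## [version] - YYYY-MM-DD' heading."""
    # Default to today's date when no release date is supplied.
    released = released or date.today()
    return f"## [{version}] - {released.isoformat()}\n\n{body.strip()}\n"


notes = with_release_header("Fixed\n- Example entry", "1.2.0", date(2025, 1, 15))
print(notes)
```

Keeping the date as an explicit parameter makes the output reproducible when regenerating notes for an older release.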
Conclusion
We've built an AI-driven changelog generator that transforms basic Git commit information into structured, human-readable release notes. This tool demonstrates how combining web extraction with AI can dramatically improve developer workflows by:
- Automating tedious documentation tasks that developers typically avoid or rush through
- Creating consistent, well-structured changelogs that follow best practices in technical documentation
- Translating developer-centric commit messages into user-friendly change descriptions
- Organizing changes by category to highlight the most important information
This notebook provides a foundation that can be extended for more sophisticated changelog generation, including integration with CI/CD pipelines, release management systems, or documentation automation.