Claude Sonnet 4 coding agent

AI
Published

July 16, 2025

This week I explored Claude Sonnet 4 and GitHub Copilot’s new Agent Mode. Claude Sonnet 4, recently released by Anthropic, is particularly strong at coding tasks. Agent Mode was added to Copilot and VS Code earlier this year. I was inspired to try them both after reading a recent post that advocates delegating all development tasks to a coding agent.

What Is Agent Mode?

Until now, I’ve mainly interacted with LLMs via chat and through Copilot’s inline suggestions in VS Code. Agent Mode is something more: it’s like a true pair programmer. It reads files in your project, executes commands, makes and reviews changes, commits code, and creates pull requests. It supports nearly the full programming workflow autonomously.

What I Tried

I used the Claude coding agent for several tasks—code reviews, feature implementation, and changelog updates—and was impressed across the board.

Code Reviews

First, I asked Claude to review multiple projects. For two current ones, it gave helpful feedback and practical improvement suggestions. For two legacy codebases, Claude’s assessments were insightful: one codebase was maintainable but needed refactoring, while the other was outdated beyond repair—though Claude acknowledged the embedded domain expertise.

Issue Resolution

Next, I asked Claude to fix an open GitHub issue, replacing one package with another offering similar functionality. Claude reviewed the GitHub issue, checked the documentation for both packages, implemented the change, validated it, and refined it after testing. It then created a branch, made the commit, and opened a PR. I only had to supervise and approve command execution. The entire task, from reading the issue to a working PR, was handled autonomously.

Changelog Updates

Finally, I asked Claude to generate and update changelogs for several projects. I provided a template from one project, and Claude analyzed commits, PRs, and issues to generate changelogs for each project, following the structure I shared.

What Didn’t Work?

In one code review, Claude overemphasized legacy documentation included in the repo, resulting in an overly critical assessment.

Additionally, I realized that the way I phrased my prompt had a significant impact on its interpretation. Once reframed, Claude’s feedback was more balanced.

Final Thoughts

Claude Sonnet 4 in VS Code Copilot agent mode impressed me. It felt less like a tool and more like a colleague. I also tested GPT-4 in a similar setup, but Claude was noticeably more capable. To me, these tools signal a real shift in how we write and maintain code—one where LLMs become integral to the development workflow.