GitHub Copilot’s New Metered Billing: A Team Guide

Mayank Vijay
June 4, 2026

On June 1, 2026, GitHub quietly retired flat-rate pricing for Copilot and replaced it with a raw token-based consumption model called AI Credits. If you manage an engineering team, this is not a minor billing update. It is a fundamental shift in how your AI tooling budget works. Your cost is no longer determined by the number of questions your team asks. It is determined by the depth of context in every single interaction.

Our 5-person engineering team consumed 50% of our monthly credit pool in the first 3 days after the switch. That experience forced us to rethink everything about how we use AI-assisted development. This post is the playbook we built as a result.

What Changed on June 1, 2026

For context, GitHub offers two paid Copilot plans for teams: Copilot Business at $19/seat/month and Copilot Enterprise at $39/seat/month. Both plans now operate on the AI Credits system instead of the old Premium Request (PR) model. The allowances differ by plan, but the billing mechanics are identical.

Dimension | Before (May 2026) | After (June 2026)
Currency | Premium Requests (PR) | AI Credits (1 Credit = $0.01)
Allowance (Business, $19/seat) | 300 PR per seat (individual, not pooled) | 3,000 credits per seat, pooled across team
Allowance (Enterprise, $39/seat) | 1,000 PR per seat (individual, not pooled) | 7,000 credits per seat, pooled across team
Pricing model | Flat per prompt (1 prompt = fixed multiplier regardless of size) | Token-based (cost depends on how much text goes in and out)
Agent tool calls / file reads | Free (only your prompt counted) | Billed (every token the model reads or generates counts)
Overage cost | $0.04 per additional PR | $0.01 per additional AI credit
Billing scope | Individual per user (your allowance, your usage) | Shared team pool (all seats draw from one org-level pool)

The pooled allowance means that for a 5-person team on Business, you share 15,000 credits per month. On Enterprise, that pool is 35,000 credits. Sounds generous until you see what a single heavy session costs.

Note: These allowances are temporarily expanded through October 2026 as part of a promotional period. After October, both plans will see reduced credit allowances, making efficiency even more critical.

The Per-Question Cost Comparison

The simplest way to understand the impact: take one identical question and compare what it cost before versus what it costs now. Below is a comparison using Claude Opus 4.6, one of the most capable (and expensive) models available in Copilot.

Scenario: You ask Copilot in Agent Mode to refactor a service module. The agent reads 12 files and produces a multi-file edit.

Metric | Legacy (PR model) | New (AI Credits model)
Cost of your prompt | 3 Premium Requests (fixed) | ~50 credits (depends on prompt length)
Cost of agent reading 12 files | Free | ~400 credits (billed per token read)
Cost of model generating the response | Included in the 3 PRs | ~250 credits (output tokens are 5x input cost)
Total cost | 3 PRs = $0.12 | ~700 credits = $7.00
Pool impact (5-person Business team) | 0.2% of the team's combined 1,500 PRs (5 x 300, tracked per user) | 4.7% of 15,000 shared credits
Identical question with a simple model (GPT-4.1) | 1 PR = $0.04 | ~50 credits = $0.50

The critical difference: under the old model, it did not matter if the agent read 5 files or 50 files to answer your question. The cost was fixed. Under the new model, every file the agent opens, every line of code it reads, and every token it generates is individually metered.

This is the trap that catches teams off guard. Agent Mode is enormously useful because it explores your codebase autonomously. But that autonomous exploration is now billable. You did not ask Copilot to read 50 files. Its architecture decided to. But you pay for it.

Model Pricing: The Full Breakdown

Not all models cost the same. The pricing dashboard in GitHub reveals a massive disparity between frontier models and lightweight daily-driver models. Here is the complete pricing table as of June 2026 (cost in AI Credits per 1 million tokens):

Model | Context Size | Input Cost | Output Cost | Cache Cost
Claude Haiku 4.5 | 200K | 100 | 500 | 10
Claude Opus 4.5 | 200K | 500 | 2500 | 50
Claude Opus 4.6 | 200K | 500 | 2500 | 50
Claude Sonnet 4.5 | 200K | 300 | 1500 | 30
Claude Sonnet 4.6 | 200K | 300 | 1500 | 30
Gemini 2.5 Pro | 173K | 125 | 1000 | 12
Gemini 3 Flash (Preview) | 173K | 50 | 300 | 5
Gemini 3.5 Flash | 192K | 150 | 900 | 15
GPT-5 mini | 192K | 25 | 200 | 2
GPT-5.2 | 400K | 175 | 1400 | 17
GPT-5.4 | 400K | 250 | 1500 | 25
GPT-5.4 mini | 400K | 75 | 450 | 7
GPT-5.5 | 400K | 500 | 3000 | 50

How to Read This Table

Take Claude Opus 4.6 as an example. The input cost is 500 credits per 1M tokens, and the output cost is 2,500 credits per 1M tokens. If you send a prompt that includes 50,000 tokens of context (about 12 medium-sized source files) and the model generates 2,000 tokens of response, your cost is:

Component | Claude Opus 4.6 | GPT-5 mini
Input (50K tokens) | 50,000 / 1M x 500 = 25 credits | 50,000 / 1M x 25 = 1.25 credits
Output (2K tokens) | 2,000 / 1M x 2,500 = 5 credits | 2,000 / 1M x 200 = 0.4 credits
Total per interaction | 30 credits = $0.30 | 1.65 credits = $0.017

That is an 18x cost difference for the same question. For routine tasks (debugging syntax, renaming variables, writing tests, asking clarifying questions), the cheaper model is perfectly adequate and will save your team thousands of credits per month.
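The arithmetic in the table can be reproduced in a few lines. This is a minimal sketch, not a billing tool: the rates are copied from the pricing table, the function name `interaction_credits` is our own, and cached tokens are ignored for simplicity.

```python
# Rates in AI Credits per 1 million tokens, copied from the pricing table above.
RATES = {
    "Claude Opus 4.6": {"input": 500, "output": 2500},
    "GPT-5 mini": {"input": 25, "output": 200},
}

CREDIT_USD = 0.01  # 1 credit = $0.01


def interaction_credits(model: str, input_tokens: int, output_tokens: int) -> float:
    """Credits for one interaction: tokens / 1M x rate, summed over input and output."""
    rate = RATES[model]
    return (input_tokens * rate["input"] + output_tokens * rate["output"]) / 1_000_000


# The worked example: 50,000 tokens of context in, 2,000 tokens out.
opus = interaction_credits("Claude Opus 4.6", 50_000, 2_000)
mini = interaction_credits("GPT-5 mini", 50_000, 2_000)

print(f"Opus 4.6:   {opus:.2f} credits (${opus * CREDIT_USD:.2f})")
print(f"GPT-5 mini: {mini:.2f} credits (${mini * CREDIT_USD:.4f})")
print(f"Ratio:      {opus / mini:.1f}x")
```

Adding another model is a matter of copying its input and output rates into `RATES`.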

Model Tiers for Team Policy

Tier | Models | Relative Cost | Best For
Cheap (daily driver) | GPT-5 mini, GPT-5.4 mini, Gemini 3 Flash, Claude Haiku 4.5 | 1x (base) | Routine chat, quick questions, simple edits, test generation
Medium | GPT-5.2, Gemini 2.5 Pro, Gemini 3.5 Flash | 3-5x | Multi-file edits, moderate agent tasks, code review
Expensive | Claude Sonnet 4.5/4.6, GPT-5.4 | 6-10x | Complex reasoning, architecture decisions
Very Expensive | Claude Opus 4.5/4.6, GPT-5.5 | 15-50x | Only for the hardest problems where cheaper models fail
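One way to make the tiers concrete is to write the policy down as a lookup that starts cheap and escalates only after a failed attempt. The sketch below is purely illustrative: the task category names, the default model chosen per tier, and the `pick_model` helper are our own assumptions, and nothing here is a Copilot setting or API.

```python
# Hypothetical team policy based on the tier table above.
# Task names and per-tier defaults are illustrative assumptions.
TIER_ORDER = ["cheap", "medium", "expensive", "very_expensive"]

DEFAULT_MODEL = {
    "cheap": "GPT-5 mini",
    "medium": "GPT-5.2",
    "expensive": "Claude Sonnet 4.6",
    "very_expensive": "Claude Opus 4.6",
}

TASK_TIER = {
    "quick_question": "cheap",
    "simple_edit": "cheap",
    "test_generation": "cheap",
    "multi_file_edit": "medium",
    "code_review": "medium",
    "architecture_decision": "expensive",
}


def pick_model(task: str, failed_attempts: int = 0) -> str:
    """Start at the task's tier and move up one tier per failed attempt."""
    start = TIER_ORDER.index(TASK_TIER.get(task, "cheap"))  # unknown tasks start cheap
    tier = TIER_ORDER[min(start + failed_attempts, len(TIER_ORDER) - 1)]
    return DEFAULT_MODEL[tier]


print(pick_model("quick_question"))            # GPT-5 mini
print(pick_model("architecture_decision", 1))  # Claude Opus 4.6
```

The point of the escalation counter is that the most expensive tier is reached only after a cheaper model has actually failed, which matches the "Best For" column.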

Practical Guide: Staying Within Your Credit Pool

Our team developed the following practices after the billing switch. These are not theoretical suggestions. They are hard-won lessons from actual usage. Engineering leaders should share this section directly with their teams.

1. Choose the Right Model for Every Task

This is the single highest-impact change. Using Opus 4.6 to ask "what does this function do?" costs roughly 80 credits. Using GPT-5 mini for the same question costs 5 credits. That is a 16x difference for the same answer quality on a simple question.

Establish a team rule: GPT-5 mini, Claude Haiku 4.5, or Gemini 3 Flash for everyday work. Escalate to Sonnet for multi-file reasoning. Reserve Opus for genuinely hard architectural problems where cheaper models produce wrong answers.

2. Be Surgical with Context

Before the billing change, you could tell the agent "look at the entire codebase and find the bug." The agent would read 50 files and it cost nothing extra. Now that same exploration burns 2,000+ credits on an expensive model.

Instead, do local exploration first. Use grep, find, or your IDE's search to identify the specific file and line number. Then tell the agent: "The bug is in UserService.tsx around line 165, the sessionId is not updating after refresh." That targeted prompt costs 100 credits instead of 2,000.
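If you would rather script the local search than run grep by hand, something like the following works. It is a minimal sketch using only the Python standard library; the `find_matches` helper, the default file suffixes, and the example search term are our own choices for illustration.

```python
import re
from pathlib import Path


def find_matches(root: str, pattern: str, suffixes=(".ts", ".tsx")) -> list[tuple[str, int, str]]:
    """Return (path, line_number, line) for every source line matching the pattern."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only read source files with the suffixes we care about.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_matches("src", r"sessionId")` returns the file paths and line numbers to paste into a targeted prompt, so the agent does not have to spend credits discovering them.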

The principle: every file the agent reads is now billable. You are paying for its curiosity. Constrain it.

3. Close Agent Sessions When Done

This catches people by surprise. Each follow-up message in a chat session resends the entire conversation history as input tokens. A conversation that started with 5,000 tokens of context grows to 50,000 tokens after 10 back-and-forth exchanges. Every message costs more than the last because it includes everything that came before.

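When you finish a task, close the session. Start fresh for the next topic. Never keep a thread open for multiple days. Because every message resends the full history, each message costs more than the one before it, and the cumulative cost of a thread grows roughly quadratically with its length rather than linearly.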
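To see how quickly a long thread adds up, here is a small simulation. It assumes every exchange adds the same 5,000 tokens to the history (so the tenth message carries 50,000 tokens, as in the example above) and ignores any cache discount; real conversations will be lumpier.

```python
def cumulative_input_tokens(messages: int, tokens_per_exchange: int = 5_000) -> int:
    """Total input tokens billed when every message resends the full history."""
    total = 0
    history = 0
    for _ in range(messages):
        history += tokens_per_exchange  # the thread grows by one exchange
        total += history                # the whole thread is sent as input again
    return total


one_thread = cumulative_input_tokens(10)
fresh_sessions = 10 * 5_000  # ten separate sessions with 5,000 tokens each

print(f"One 10-message thread: {one_thread:,} input tokens")
print(f"Ten fresh sessions:    {fresh_sessions:,} input tokens")
print(f"Overhead:              {one_thread / fresh_sessions:.1f}x")
print(f"A 20-message thread:   {cumulative_input_tokens(20):,} input tokens")
```

Under these assumptions the 10-message thread bills 275,000 input tokens against 50,000 for ten fresh sessions, and doubling the thread to 20 messages nearly quadruples the total. That is quadratic growth, which is why closing sessions matters more the longer they run.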

4. Use Auto Model Selection for the 10% Discount

If you do not explicitly pick a model, Copilot selects the best fit for your question and gives you a 10% discount on that interaction. For routine prompts where any capable model will do, this is free savings.

5. Break Large Tasks into Smaller Prompts with Cheaper Models

One massive Opus session to "implement the entire feature" might cost 1,500 credits. Instead, break it into steps:

  • Step 1: GPT-5 mini to plan the implementation (10 credits)
  • Step 2: GPT-5 mini to create the interfaces and types (15 credits)
  • Step 3: Sonnet to implement the complex logic (200 credits)
  • Step 4: GPT-5 mini to write the tests (30 credits)
  • Total: 255 credits instead of 1,500 (roughly 6x cheaper)
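The same breakdown as a quick calculation, which also shows where the credits go. The figures are the estimates from the list above, not measurements.

```python
# (step, model, estimated credits) as listed above; estimates, not measurements.
steps = [
    ("plan the implementation", "GPT-5 mini", 10),
    ("create the interfaces and types", "GPT-5 mini", 15),
    ("implement the complex logic", "Sonnet", 200),
    ("write the tests", "GPT-5 mini", 30),
]
SINGLE_SESSION_CREDITS = 1_500  # the one-shot Opus estimate

total = sum(credits for _, _, credits in steps)

# Group the estimates by model to see which step dominates.
by_model: dict[str, int] = {}
for _, model, credits in steps:
    by_model[model] = by_model.get(model, 0) + credits

print(f"Stepwise total: {total} credits")
print(f"Savings factor: {SINGLE_SESSION_CREDITS / total:.1f}x")
for model, credits in by_model.items():
    print(f"{model}: {credits} credits ({credits / total:.0%} of total)")
```

The single Sonnet step accounts for about 78% of the stepwise total, so that is the step where trimming context pays off most.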

6. Be Selective with Copilot PR Reviews

Copilot as a PR reviewer reads the entire diff, all comments, and related context. On large PRs, this is expensive. Do not auto-assign Copilot as a reviewer on every pull request. Reserve AI review for complex PRs where you genuinely need a second pair of eyes on tricky logic.

7. Use Personal Tools for Non-Contextual Learning

If you want to learn about a concept, ask a general programming question, or explore an idea that does not require your codebase context, use a personal ChatGPT or Claude subscription instead. Copilot's value is in its workspace context. If you do not need that context, do not pay for it.

8. Set Budget Alerts and Track Consumption

For the initial few days after the switch, closely monitor your team's token consumption patterns. GitHub provides a detailed usage report that can be downloaded directly from the billing portal (Settings, Billing, Usage, filter by product:copilot). This report breaks down credits consumed per user per day and which models are eating the pool.

Set a budget alert at 80% of your pool (12,000 credits for a 15,000 pool) so you have time to course-correct before overages hit. More importantly, consider disabling overage spending entirely during the first month. This prevents surprise bills while your team learns the new consumption patterns. You can always re-enable it once you have confidence in your team's usage discipline.
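A back-of-the-envelope projection helps decide how urgently to react. The sketch below is our own helper, not part of GitHub's billing tooling: it takes the credits consumed so far (read manually from the usage report), assumes a constant daily burn rate, and assumes a 30-day month.

```python
def pool_status(seats: int, credits_per_seat: int, used: int, day: int,
                days_in_month: int = 30, alert_percent: int = 80) -> dict:
    """Summarize pool usage and project it forward at the current burn rate."""
    pool = seats * credits_per_seat
    alert_at = pool * alert_percent // 100
    daily_burn = used / day
    return {
        "pool": pool,
        "alert_at": alert_at,
        "alert_triggered": used >= alert_at,
        "projected_month_total": daily_burn * days_in_month,
        # With no usage yet there is nothing to project.
        "days_until_empty": (pool - used) / daily_burn if daily_burn else float("inf"),
    }


# Our first week: 5 Business seats, half of the pool gone after 3 days.
status = pool_status(seats=5, credits_per_seat=3_000, used=7_500, day=3)
for key, value in status.items():
    print(f"{key}: {value}")
```

For our numbers, the 80% alert had not fired yet, but the projection showed the pool running dry three days later and a month-end total of 75,000 credits, five times the allowance. A burn-rate projection flags the problem earlier than a threshold alert alone.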

The Bigger Picture: Industry-Wide Shift to Per-Token Billing

GitHub is not alone in this transition. This is part of a broader industry move away from flat-rate AI subscriptions toward usage-based pricing. Anthropic has already signaled that future Claude integrations will adopt per-token billing for enterprise tooling. OpenAI's ChatGPT Team and Enterprise plans are expected to follow a similar path as model costs increase with capability.

The economic logic is straightforward: as AI models become more capable (larger context windows, better reasoning, agentic behavior), the compute cost per interaction rises significantly. Flat-rate pricing becomes unsustainable for providers when a single agent session can cost $18 in raw compute. The subsidy had to end eventually.

For engineering leaders, this means the practices outlined above are not just Copilot-specific advice. They represent the new normal for managing any AI-assisted development tool. Teams that build disciplined AI usage habits now will be well-positioned regardless of which vendor or model they use in the future.

Our Experience: Week One Lessons

In our team of 5 engineers, the first few days after the switch were an eye-opener. Normal Agent Mode usage patterns that had been perfectly fine under the old billing suddenly became expensive. The problem was that nobody had adjusted their habits.

After implementing the practices above, we saw a roughly 50% improvement in credit efficiency within the first week. The quality of AI assistance did not noticeably decrease. We simply became more intentional about when to use expensive models and how much context to provide.

The adjustment period is real, but it is manageable. In the long run, token-based billing actually rewards efficient engineering practices. Teams that write clean, well-organized code with clear naming conventions need less AI context to get good results. The billing model, intentionally or not, incentivizes better software engineering.


Pricing data and model availability are based on GitHub's public documentation as of June 2026 and are subject to change.

Written by Mayank Vijay

Mayank is the Co-founder of ZipTier and shares insights on AI strategy, engineering, and AI-assisted productivity.
