
Using WebRTC in Chrome Extensions

A technical guide to implementing WebRTC features in Chrome extensions, covering screen sharing, peer-to-peer data channels, video capture, and signaling via extension messaging.

CWS Kit Team

Real-time communication superpowers meet browser extension APIs.

WebRTC gives web applications the ability to capture audio and video, share screens, and establish peer-to-peer connections — all without plugins or servers relaying media streams. Chrome extensions can use these same capabilities, but the execution context is different. Extensions have service workers instead of long-running pages, permission models that interact with WebRTC's getUserMedia prompts, and messaging channels that can serve as signaling infrastructure.

This guide covers practical WebRTC implementation inside Chrome extensions. Not the theory — the actual code patterns that work in Manifest V3's constrained environment.

Why WebRTC in an Extension?

Extensions are uniquely positioned for certain WebRTC use cases. They can capture the screen of any tab without the user needing to visit a specific website. They can establish P2P data channels between browser instances for real-time sync. They can record and stream tab audio. And they can do all of this from a privileged context that persists across page navigations.

Common use cases:

  • Screen recording extensions that capture tab or desktop content and stream or save it
  • Collaboration tools that share cursor position, selections, or page annotations in real time
  • Remote assistance tools where one user sees another user's screen through the extension
  • Peer-to-peer sync for extension data without a central server
  • Tab audio capture for transcription, noise cancellation, or streaming

The guide proceeds in seven steps:

  1. Understand the context. Learn which extension contexts (popup, background, offscreen, content script) can access WebRTC APIs, and what their limitations are.
  2. Request permissions. Configure manifest permissions for screen capture, tab capture, and media access.
  3. Set up media capture. Use chrome.tabCapture or getDisplayMedia to capture screen, tab, or camera streams.
  4. Create peer connections. Establish an RTCPeerConnection with proper STUN/TURN configuration for NAT traversal.
  5. Build signaling. Use extension messaging (chrome.runtime) as your signaling channel for offer/answer exchange.
  6. Handle data channels. Set up an RTCDataChannel for low-latency P2P data transfer between extension instances.
  7. Debug and optimize. Use chrome://webrtc-internals and extension-specific debugging techniques.

WebRTC in Different Extension Contexts

Not every extension context has equal access to WebRTC APIs. Understanding which context to use for what is the first decision you need to make.

The popup can access getUserMedia and RTCPeerConnection. However, the popup closes when the user clicks away, terminating any active WebRTC connections. Do not establish long-lived connections in the popup. Use it to initiate connections and then hand off to a more persistent context.

The service worker (background) cannot access getUserMedia or RTCPeerConnection directly. These are DOM APIs, and service workers have no DOM. In Manifest V2, the background page had full DOM access and was the natural home for WebRTC. In MV3, you need an alternative.

Offscreen documents are the MV3 solution. An offscreen document is an invisible HTML page that your extension can create from the service worker. It has full DOM access, including getUserMedia, RTCPeerConnection, and MediaRecorder. It persists as long as you need it (subject to Chrome's idle timeout policies).

Content scripts can access getUserMedia and RTCPeerConnection through the host page's context, but this is generally a bad idea. You are running WebRTC inside someone else's page, subject to their CSP, their JavaScript environment, and potential conflicts.
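As a quick runtime check of the differences described above, a context can probe its own global scope before attempting any WebRTC work. The following is a minimal sketch; `detectWebRTCSupport` and the `GlobalLike` shape are illustrative names, not part of any Chrome API.

```typescript
// Minimal structural view of a global scope: only the members we probe.
interface GlobalLike {
  RTCPeerConnection?: unknown;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

interface WebRTCSupport {
  peerConnection: boolean;
  userMedia: boolean;
}

// Reports which WebRTC entry points the given scope exposes.
function detectWebRTCSupport(scope: GlobalLike): WebRTCSupport {
  return {
    peerConnection: typeof scope.RTCPeerConnection === 'function',
    userMedia:
      typeof scope.navigator?.mediaDevices?.getUserMedia === 'function',
  };
}

// In an extension context you would pass the real global scope:
//   const support = detectWebRTCSupport(globalThis as unknown as GlobalLike);
```

In a service worker both flags would be expected to come back false, which is the signal to hand the work to an offscreen document.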

Setting Up Screen Capture

Screen capture is the most common WebRTC use case in extensions. There are two approaches: chrome.tabCapture for capturing a specific tab, and getDisplayMedia for capturing the entire screen or a window.

Tab Capture with chrome.tabCapture

// service-worker.ts — Initiate tab capture
chrome.action.onClicked.addListener(async (tab) => {
  // First, ensure the offscreen document exists
  await ensureOffscreenDocument();
 
  // Get a media stream ID for the current tab
  const streamId = await chrome.tabCapture.getMediaStreamId({
    targetTabId: tab.id,
  });
 
  // Send the stream ID to the offscreen document
  chrome.runtime.sendMessage({
    type: 'START_CAPTURE',
    target: 'offscreen',
    streamId,
    tabId: tab.id,
  });
});
 
async function ensureOffscreenDocument() {
  const existingContexts = await chrome.runtime.getContexts({
    contextTypes: [chrome.runtime.ContextType.OFFSCREEN_DOCUMENT],
  });
 
  if (existingContexts.length > 0) return;
 
  await chrome.offscreen.createDocument({
    url: 'offscreen.html',
    reasons: [chrome.offscreen.Reason.USER_MEDIA],
    justification: 'WebRTC media capture for screen recording',
  });
}
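One caveat with the check-then-create pattern in ensureOffscreenDocument: if two events fire close together, both calls can pass the existence check before either has created the document, and the second createDocument call is then expected to fail because only one offscreen document may exist at a time. A common mitigation is to share a single in-flight promise. The sketch below shows that idea with the Chrome calls injected as plain functions (`exists` and `create` are hypothetical parameter names), so the guard itself stays independent of the extension APIs.

```typescript
type AsyncCheck = () => Promise<boolean>;
type AsyncCreate = () => Promise<void>;

// Returns an ensure() function that never runs two creations concurrently.
function createOffscreenGuard(exists: AsyncCheck, create: AsyncCreate) {
  let inFlight: Promise<void> | null = null;

  const run = async (): Promise<void> => {
    if (!(await exists())) {
      await create();
    }
  };

  return function ensure(): Promise<void> {
    if (inFlight) return inFlight; // a creation is already underway: reuse it

    // Clear the shared promise on success and on failure,
    // so the next call performs a fresh check.
    inFlight = run().then(
      () => {
        inFlight = null;
      },
      (err: unknown) => {
        inFlight = null;
        throw err;
      },
    );
    return inFlight;
  };
}

// Hypothetical wiring in the service worker:
//   const ensureOffscreenDocument = createOffscreenGuard(
//     async () => (await chrome.runtime.getContexts({
//       contextTypes: [chrome.runtime.ContextType.OFFSCREEN_DOCUMENT],
//     })).length > 0,
//     () => chrome.offscreen.createDocument({ /* same options as above */ }),
//   );
```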
// offscreen.ts — Handle the media stream
chrome.runtime.onMessage.addListener(async (message) => {
  if (message.type !== 'START_CAPTURE' || message.target !== 'offscreen') return;
 
  try {
    const stream = await navigator.mediaDevices.getUserMedia({
      audio: {
        mandatory: {
          chromeMediaSource: 'tab',
          chromeMediaSourceId: message.streamId,
        },
      } as any,
      video: {
        mandatory: {
          chromeMediaSource: 'tab',
          chromeMediaSourceId: message.streamId,
          maxWidth: 1920,
          maxHeight: 1080,
          maxFrameRate: 30,
        },
      } as any,
    });
 
    // Stream is now available — record it, send it via WebRTC, etc.
    startRecording(stream);
  } catch (error) {
    console.error('Tab capture failed:', error);
  }
});
 
function startRecording(stream: MediaStream) {
  const recorder = new MediaRecorder(stream, {
    mimeType: 'video/webm;codecs=vp9,opus',
    videoBitsPerSecond: 2500000,
  });
 
  const chunks: Blob[] = [];
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = () => {
    const blob = new Blob(chunks, { type: 'video/webm' });
    // Save or upload the recording
    const url = URL.createObjectURL(blob);
    chrome.runtime.sendMessage({ type: 'RECORDING_COMPLETE', url });
  };
 
  recorder.start(1000); // Capture in 1-second chunks
}
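The hard-coded mimeType in startRecording assumes VP9 with Opus is available. MediaRecorder.isTypeSupported() can confirm that before the recorder is constructed. The sketch below picks the first supported entry from a preference list; the candidate list and the `pickMimeType` helper are illustrative assumptions, and the support check is passed in as a function so the helper can be exercised outside a browser.

```typescript
// Preference order is an assumption: adjust it to your quality and size targets.
const PREFERRED_TYPES: readonly string[] = [
  'video/webm;codecs=vp9,opus',
  'video/webm;codecs=vp8,opus',
  'video/webm',
];

// Returns the first candidate accepted by isSupported, or undefined if none is.
function pickMimeType(
  isSupported: (type: string) => boolean,
  candidates: readonly string[] = PREFERRED_TYPES,
): string | undefined {
  return candidates.find((type) => isSupported(type));
}

// In the offscreen document:
//   const mimeType = pickMimeType((t) => MediaRecorder.isTypeSupported(t));
//   const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : {});
```

Falling back to an empty options object lets the browser choose its default container and codecs when nothing on the list is supported.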

Full Screen Capture with getDisplayMedia

If you need to capture the entire screen or a specific window (not just a tab), use getDisplayMedia. This requires user interaction to trigger — Chrome will show a picker dialog.

// offscreen.ts — Full screen capture
async function captureScreen() {
  const stream = await navigator.mediaDevices.getDisplayMedia({
    video: {
      width: { ideal: 1920 },
      height: { ideal: 1080 },
      frameRate: { ideal: 30, max: 60 },
    },
    audio: true, // Request audio too; whether tab or system audio is captured depends on the user's selection and platform
  });
 
  return stream;
}

The key difference: tabCapture can be initiated programmatically without a user prompt (once the user clicks the extension action). getDisplayMedia always shows a picker dialog. For screen recording extensions, tabCapture provides a smoother UX for tab-specific recording, while getDisplayMedia is necessary for desktop-wide capture.
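That decision can be captured in a small helper. This is a sketch only; the `CaptureTarget`, `CaptureApi`, and `planCapture` names are made up for illustration.

```typescript
type CaptureTarget = 'tab' | 'window' | 'screen';
type CaptureApi = 'tabCapture' | 'getDisplayMedia';

interface CapturePlan {
  api: CaptureApi;
  showsPicker: boolean; // whether Chrome shows its source picker
}

// Maps what the user wants to record to the API discussed above.
function planCapture(target: CaptureTarget): CapturePlan {
  if (target === 'tab') {
    // Tab capture starts from the extension action click, with no picker.
    return { api: 'tabCapture', showsPicker: false };
  }
  // Windows and whole screens go through the picker dialog.
  return { api: 'getDisplayMedia', showsPicker: true };
}
```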

Establishing Peer Connections

Once you have a media stream, you might want to send it to another user in real time. This is where RTCPeerConnection comes in.

  1. Create RTCPeerConnection. Initialize with STUN/TURN servers for NAT traversal. Both peers create their own connection object.
  2. Add media tracks. The sending peer adds their media stream tracks to the connection using addTrack().
  3. Create and send offer. The initiating peer creates an SDP offer and sends it to the remote peer via signaling.
  4. Receive and answer. The remote peer receives the offer, sets it as the remote description, creates an answer, and sends it back.
  5. Exchange ICE candidates. Both peers exchange ICE candidates via signaling as they are discovered. Each peer adds received candidates to their connection.
  6. Connection established. Once ICE negotiation completes, media flows directly between peers (or through TURN if a direct connection fails).

// webrtc-manager.ts — Reusable WebRTC connection manager
interface SignalingChannel {
  send(data: any): void;
  onMessage(callback: (data: any) => void): void;
}
 
class ExtensionWebRTC {
  private pc: RTCPeerConnection;
  private signaling: SignalingChannel;
 
  constructor(signaling: SignalingChannel) {
    this.signaling = signaling;
    this.pc = new RTCPeerConnection({
      iceServers: [
        { urls: 'stun:stun.l.google.com:19302' },
        { urls: 'stun:stun1.l.google.com:19302' },
        // Add TURN server for production use
        // {
        //   urls: 'turn:your-turn-server.com:3478',
        //   username: 'user',
        //   credential: 'pass',
        // },
      ],
    });
 
    // Forward ICE candidates to the remote peer
    this.pc.onicecandidate = (event) => {
      if (event.candidate) {
        this.signaling.send({
          type: 'ice-candidate',
          candidate: event.candidate.toJSON(),
        });
      }
    };
 
    // Handle incoming signaling messages
    this.signaling.onMessage(async (data) => {
      switch (data.type) {
        case 'offer':
          await this.handleOffer(data.sdp);
          break;
        case 'answer':
          await this.pc.setRemoteDescription(new RTCSessionDescription(data.sdp));
          break;
        case 'ice-candidate':
          await this.pc.addIceCandidate(new RTCIceCandidate(data.candidate));
          break;
      }
    });
  }
 
  async startCall(stream: MediaStream) {
    // Add local tracks
    for (const track of stream.getTracks()) {
      this.pc.addTrack(track, stream);
    }
 
    // Create and send offer
    const offer = await this.pc.createOffer();
    await this.pc.setLocalDescription(offer);
    this.signaling.send({ type: 'offer', sdp: offer });
  }
 
  private async handleOffer(sdp: RTCSessionDescriptionInit) {
    await this.pc.setRemoteDescription(new RTCSessionDescription(sdp));
    const answer = await this.pc.createAnswer();
    await this.pc.setLocalDescription(answer);
    this.signaling.send({ type: 'answer', sdp: answer });
  }
 
  onRemoteStream(callback: (stream: MediaStream) => void) {
    this.pc.ontrack = (event) => {
      callback(event.streams[0]);
    };
  }
}
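One gap in the manager above: an ice-candidate message can arrive before the offer or answer has been applied, and addIceCandidate() rejects when no remote description is set. A common workaround is to buffer early candidates and flush them once setRemoteDescription() resolves. The sketch below is one way to do that; `CandidateBuffer` is a hypothetical helper, and the actual addIceCandidate call is injected as a callback.

```typescript
// Serialized candidate, mirroring the main fields of RTCIceCandidateInit.
interface CandidateInit {
  candidate?: string;
  sdpMid?: string | null;
  sdpMLineIndex?: number | null;
}

type ApplyCandidate = (candidate: CandidateInit) => Promise<void>;

class CandidateBuffer {
  private pending: CandidateInit[] = [];
  private ready = false;
  private apply: ApplyCandidate;

  constructor(apply: ApplyCandidate) {
    this.apply = apply;
  }

  // Call for every incoming 'ice-candidate' signaling message.
  async add(candidate: CandidateInit): Promise<void> {
    if (this.ready) {
      await this.apply(candidate);
    } else {
      this.pending.push(candidate); // remote description not set yet
    }
  }

  // Call right after setRemoteDescription() resolves.
  async markRemoteDescriptionSet(): Promise<void> {
    this.ready = true;
    const queued = this.pending;
    this.pending = [];
    for (const candidate of queued) {
      await this.apply(candidate); // flush in arrival order
    }
  }
}

// Hypothetical wiring inside ExtensionWebRTC:
//   const buffer = new CandidateBuffer((c) => this.pc.addIceCandidate(c));
```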

Signaling via Extension Messaging

Traditional WebRTC applications need a signaling server: a WebSocket server, a Firebase instance, or similar. Chrome extensions have a built-in alternative for some cases: chrome.runtime.sendMessage and chrome.runtime.onMessage. These APIs only deliver messages between contexts of the same extension installation, so when both peers live in the same browser profile you can relay signaling data through your extension's service worker.

// signaling-adapter.ts — Extension messaging as a signaling channel
class ExtensionSignaling implements SignalingChannel {
  private peerId: string;
  private messageCallback: ((data: any) => void) | null = null;
 
  constructor(peerId: string) {
    this.peerId = peerId;
 
    // Listen for signaling messages routed through the background
    chrome.runtime.onMessage.addListener((message) => {
      if (message.type === 'webrtc-signal' && message.from === this.peerId) {
        this.messageCallback?.(message.payload);
      }
    });
  }
 
  send(data: any) {
    // Send to background worker, which routes to the target peer
    chrome.runtime.sendMessage({
      type: 'webrtc-signal',
      to: this.peerId,
      payload: data,
    });
  }
 
  onMessage(callback: (data: any) => void) {
    this.messageCallback = callback;
  }
}

This works for extensions where both peers are on the same browser profile or where you have a way to relay messages between different installations. For cross-device signaling, you still need an external server — but you can keep it minimal. A simple WebSocket relay that forwards JSON messages between connected clients is all you need for signaling. The actual media data flows peer-to-peer.
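The adapter above assumes the service worker forwards each webrtc-signal message to its target, but that routing is not shown. A minimal sketch of the routing table follows. `SignalRouter`, `register`, and the `deliver` callback are hypothetical names; in a real extension `deliver` would wrap whatever transport reaches that peer, such as a chrome.runtime port or a WebSocket.

```typescript
interface SignalEnvelope {
  type: 'webrtc-signal';
  to: string;
  from?: string;
  payload: unknown;
}

type Deliver = (message: SignalEnvelope) => void;

class SignalRouter {
  private peers = new Map<string, Deliver>();

  // Registers a peer and returns a function that unregisters it.
  register(peerId: string, deliver: Deliver): () => void {
    this.peers.set(peerId, deliver);
    return () => {
      this.peers.delete(peerId);
    };
  }

  // Forwards a message to its target, stamping the sender's id.
  // Returns false when the target is unknown so the caller can report it.
  route(senderId: string, message: SignalEnvelope): boolean {
    const deliver = this.peers.get(message.to);
    if (!deliver) return false;
    deliver({ ...message, from: senderId });
    return true;
  }
}
```

Because MV3 service workers can be shut down when idle, an in-memory map like this is lost on restart, so peers would need to register again after the worker wakes up.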

Data Channels for P2P Extension Sync

WebRTC is not just for audio and video. RTCDataChannel provides low-latency, peer-to-peer data transfer that is perfect for syncing extension state between devices without a central server.

// data-channel.ts — P2P sync for extension data
function setupDataChannel(pc: RTCPeerConnection) {
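  // Reliable, ordered delivery: leave maxRetransmits and maxPacketLifeTime unset,
  // since setting either one makes the channel only partially reliable.
  const channel = pc.createDataChannel('extension-sync', {
    ordered: true,
  });
 
  // The handler is async so the storage read can be awaited
  channel.onopen = async () => {
    console.log('Data channel open, syncing extension state');
    // Send current state to the peer
    const state = await chrome.storage.local.get(null);
    channel.send(JSON.stringify({ type: 'full-sync', data: state }));
  };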
 
  channel.onmessage = (event) => {
    const message = JSON.parse(event.data);
    switch (message.type) {
      case 'full-sync':
        // Merge remote state with local state
        // (mergeState is an application-specific helper, not shown here)
        mergeState(message.data);
        break;
      case 'incremental':
        // Apply a single change
        chrome.storage.local.set({ [message.key]: message.value });
        break;
    }
  };
 
  // Watch for local storage changes and push them to the peer
  chrome.storage.onChanged.addListener((changes) => {
    for (const [key, { newValue }] of Object.entries(changes)) {
      if (channel.readyState === 'open') {
        channel.send(JSON.stringify({
          type: 'incremental',
          key,
          value: newValue,
        }));
      }
    }
  });
 
  return channel;
}
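A full-sync message carries everything in chrome.storage.local, which can be far larger than a single data channel message should be. Keeping messages at or below roughly 16 KiB is a commonly cited guideline for interoperability, so large payloads are usually split and reassembled. The sketch below shows one way to do that; the envelope fields (`id`, `index`, `total`, `data`) and the 4096-character default are assumptions chosen conservatively, not part of any API and not a guaranteed byte limit.

```typescript
interface Chunk {
  id: string;    // groups chunks that belong to one message
  index: number; // position of this chunk, starting at 0
  total: number; // number of chunks in the message
  data: string;  // slice of the original text
}

// Splits text into JSON-encoded chunks, each suitable for channel.send().
function toChunks(id: string, text: string, size = 4096): string[] {
  const total = Math.max(1, Math.ceil(text.length / size));
  const out: string[] = [];
  for (let index = 0; index < total; index++) {
    const data = text.slice(index * size, (index + 1) * size);
    const chunk: Chunk = { id, index, total, data };
    out.push(JSON.stringify(chunk));
  }
  return out;
}

// Collects chunks per id and returns the full text once all have arrived.
class Reassembler {
  private parts = new Map<string, string[]>();
  private counts = new Map<string, number>();

  accept(raw: string): string | undefined {
    const chunk = JSON.parse(raw) as Chunk;
    const slots = this.parts.get(chunk.id) ?? new Array<string>(chunk.total);
    if (slots[chunk.index] === undefined) {
      slots[chunk.index] = chunk.data; // ignore duplicates of the same index
      this.counts.set(chunk.id, (this.counts.get(chunk.id) ?? 0) + 1);
    }
    this.parts.set(chunk.id, slots);
    if (this.counts.get(chunk.id) !== chunk.total) return undefined;
    this.parts.delete(chunk.id);
    this.counts.delete(chunk.id);
    return slots.join('');
  }
}
```

On the sending side each string returned by toChunks would be passed to channel.send(); on the receiving side, channel.onmessage would feed event.data to accept() and parse the result once it is defined.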

Data channels support both reliable (TCP-like) and unreliable (UDP-like) modes. For syncing settings or bookmarks, use reliable ordered delivery. For streaming real-time cursor positions or collaborative annotations, unreliable unordered delivery gives you lower latency.
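In terms of createDataChannel() options, the two modes might be expressed as below. This is a sketch; the `SyncKind` names are invented here, and `ChannelOptions` mirrors only the RTCDataChannelInit fields that matter for reliability.

```typescript
interface ChannelOptions {
  ordered: boolean;
  maxRetransmits?: number; // leave unset for fully reliable delivery
}

type SyncKind = 'state' | 'cursor';

function channelOptionsFor(kind: SyncKind): ChannelOptions {
  if (kind === 'state') {
    // Settings, bookmarks: every message must arrive, in order.
    return { ordered: true };
  }
  // Cursor positions: a stale update has no value, so never retransmit.
  return { ordered: false, maxRetransmits: 0 };
}

// Usage in a document context:
//   pc.createDataChannel('cursor', channelOptionsFor('cursor'));
```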

Troubleshooting WebRTC in Extensions

Debugging with chrome://webrtc-internals

Chrome's built-in WebRTC debugger at chrome://webrtc-internals is invaluable. It shows every active RTCPeerConnection, including ICE candidate exchange, SDP offers and answers, connection state transitions, and statistics about media streams (bitrate, packet loss, jitter).
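The same statistics are available programmatically through RTCPeerConnection.getStats(), which can be handy for logging from an offscreen document. As a sketch, outgoing bitrate can be derived from two samples of an outbound-rtp report. The `RtpSample` shape below is a trimmed-down assumption containing only the `bytesSent` and `timestamp` (milliseconds) fields.

```typescript
interface RtpSample {
  bytesSent: number;
  timestamp: number; // milliseconds
}

// Average outgoing bitrate between two samples, in kilobits per second.
// Returns 0 when the samples are not in increasing time order.
function bitrateKbps(previous: RtpSample, current: RtpSample): number {
  const elapsedMs = current.timestamp - previous.timestamp;
  if (elapsedMs <= 0) return 0;
  const bits = (current.bytesSent - previous.bytesSent) * 8;
  return bits / elapsedMs; // bits per millisecond equals kilobits per second
}
```

For example, 125,000 bytes sent over one second works out to 1,000,000 bits per 1,000 ms, or 1000 kbps.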

When something is not working, open chrome://webrtc-internals before triggering the connection. It captures events in real time and you can see exactly where the process stalls — whether at ICE gathering, DTLS handshake, or media negotiation.

For extension-specific debugging, remember that WebRTC connections in offscreen documents appear in webrtc-internals just like connections from regular pages. Look for your connection by the STUN/TURN server URLs you configured.
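When a connection stalls at ICE, it often helps to know which candidate types were gathered: without any relay candidates, TURN is not in play. The type appears after the `typ` token in each candidate string. The helpers below are an illustrative sketch (the function names are made up) that extract and count it.

```typescript
type CandidateType = 'host' | 'srflx' | 'prflx' | 'relay';

// Extracts the candidate type from an ICE candidate line, or undefined if absent.
function candidateType(candidateLine: string): CandidateType | undefined {
  const match = / typ (host|srflx|prflx|relay)(?: |$)/.exec(candidateLine);
  return match ? (match[1] as CandidateType) : undefined;
}

// Counts gathered candidates per type for a quick log summary.
function summarizeCandidates(lines: string[]): Record<CandidateType, number> {
  const counts: Record<CandidateType, number> = {
    host: 0,
    srflx: 0,
    prflx: 0,
    relay: 0,
  };
  for (const line of lines) {
    const type = candidateType(line);
    if (type) counts[type] += 1;
  }
  return counts;
}

// Possible use: collect event.candidate.candidate strings in onicecandidate,
// then log summarizeCandidates(collected) once gathering completes.
```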

Manifest Permissions

Your manifest.json needs the right permissions for WebRTC features:

{
  "permissions": [
    "tabCapture",
    "offscreen"
  ],
  "optional_permissions": [
    "desktopCapture"
  ]
}

tabCapture is required for capturing tab audio/video. offscreen is required to create offscreen documents. desktopCapture is needed for full-screen capture via chrome.desktopCapture.chooseDesktopMedia() — make it optional since not all users need it.

Do not request microphone or camera permissions in the manifest unless your extension explicitly needs the user's camera or microphone. Tab capture and screen capture do not require these permissions — they use different APIs. Requesting unnecessary permissions increases review scrutiny and decreases install rates. For more on permissions strategy, see Manifest V3 permissions best practices.
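Because desktopCapture is declared optional, it has to be requested at runtime, and chrome.permissions.request() must be called from a user gesture such as a button click. A sketch of that check-then-request flow follows. The `PermissionsApi` interface is a hand-written subset of chrome.permissions so the helper can be tried with a stub, and `ensurePermission` is a hypothetical name.

```typescript
interface PermissionsApi {
  contains(query: { permissions: string[] }): Promise<boolean>;
  request(query: { permissions: string[] }): Promise<boolean>;
}

// Resolves to true if the permission is already held or the user grants it.
async function ensurePermission(
  api: PermissionsApi,
  permission: string,
): Promise<boolean> {
  const query = { permissions: [permission] };
  if (await api.contains(query)) return true;
  return api.request(query); // shows Chrome's permission prompt
}

// In a popup click handler:
//   const granted = await ensurePermission(chrome.permissions, 'desktopCapture');
//   if (!granted) { /* explain why the feature is unavailable */ }
```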

Key Takeaway

WebRTC in Chrome extensions works best through offscreen documents: they provide the DOM access that service workers lack and stay open when the popup closes. Use chrome.tabCapture for tab-specific capture (no picker after the initial click on the extension action), getDisplayMedia for full-screen or window capture, and RTCDataChannel for P2P data sync. The extension's messaging system can serve as the signaling channel when both peers run in the same browser profile; peers on different devices still need a small external relay. Always configure TURN servers for production: with STUN alone, connections fail for an estimated 10-15% of users, typically those behind symmetric NATs.
