InterviewVision

Documentation

Learn how to use InterviewVision Analyzer to evaluate communication skills and technical abilities in interviews

Quick Start Guide
Get started with video analysis in 3 simple steps
1. Upload Your Video

Navigate to the Analyzer page and upload an MP4 or WebM video file (max 15 minutes). Optionally add a GitHub repository URL for code quality analysis.

2. Processing & Analysis

Our AI will analyze the video for eye contact, speech patterns, pacing, screen sharing activities, and code quality. This typically takes 1-2 minutes.

3. Review Results

View detailed metrics, timeline flags, screen activity analysis, code quality scores, and an overall assessment. Export or share the report.

Analysis Metrics Explained

Eye Contact

Measures how often the candidate maintains direct eye contact with the camera.

Excellent: 80%+
Good: 60-79%
Needs Work: <60%

Flags: Camera avoidance, hiding from view, excessive head turning

Speech Pacing

Analyzes words per minute (WPM) and pause patterns to assess speaking rhythm.

Optimal: 120-160 WPM
Too Fast: >160 WPM
Too Slow: <120 WPM

Flags: Rushed delivery, excessive pauses
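
The pacing bands above can be sketched as a small classifier. The function name and band labels here are illustrative, not part of the product's API:

```typescript
// Classify speech pacing from words per minute, using the bands above.
// classifyPacing and the band labels are illustrative names only.
type PacingBand = "optimal" | "too-fast" | "too-slow";

function classifyPacing(wpm: number): PacingBand {
  if (wpm > 160) return "too-fast";
  if (wpm < 120) return "too-slow";
  return "optimal"; // 120-160 WPM
}
```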

Filler Words

Counts filler words like "um", "uh", "like", "you know" per minute of speech.

Excellent: 0-2/min
Acceptable: 3-5/min
High: >5/min

Custom filler words can be configured in Settings
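
The per-minute filler rate, including a customizable word list, can be sketched as follows. The function name and default list are illustrative; the actual detection runs on the transcript as described under Technical Details:

```typescript
// Count filler words per minute of speech from a transcript.
// fillerRate is an illustrative name; the default list mirrors the examples
// above, and extra entries stand in for the custom list from Settings.
function fillerRate(
  transcript: string,
  durationMinutes: number,
  fillers: string[] = ["um", "uh", "like", "you know"],
): number {
  // Escape regex metacharacters, then match each filler as a whole word or
  // phrase, case-insensitively.
  const escaped = fillers.map(f => f.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  const pattern = new RegExp("\\b(" + escaped.join("|") + ")\\b", "gi");
  const count = (transcript.match(pattern) ?? []).length;
  return count / durationMinutes;
}
```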

Screen Share Analysis

Analyzes screen recordings to detect coding activities, tool usage, and productivity patterns.

Coding: IDE/Editor
Terminal: CLI Tools
Documentation: Reading

Detects programming languages and frameworks, and computes a productivity score

Code Quality Review

Reviews GitHub repositories for code quality, architecture, and best practices.

Architecture: Structure
Consistency: Style
Documentation: Comments

Analyzes up to 50 files for comprehensive quality assessment

Overall Score

Composite score combining all metrics with configurable weights.

Strong Communicator: 80-100
Competent: 60-79
Needs Improvement: <60

Default weights: Eye Contact 35%, Pacing 25%, Fillers 20%, Engagement 20%
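
The weighted combination can be sketched as below; the metric property names are illustrative, but the default weights match those listed above:

```typescript
// Combine per-metric scores (0-100) into the overall score using the
// default weights above. Property names are illustrative, not a schema.
const DEFAULT_WEIGHTS = { eyeContact: 0.35, pacing: 0.25, fillers: 0.2, engagement: 0.2 };

function overallScore(
  scores: Record<keyof typeof DEFAULT_WEIGHTS, number>,
  weights: typeof DEFAULT_WEIGHTS = DEFAULT_WEIGHTS,
): number {
  let sum = 0;
  for (const k of Object.keys(weights) as (keyof typeof DEFAULT_WEIGHTS)[]) {
    sum += scores[k] * weights[k];
  }
  return sum;
}
```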

Technical Details
How the analysis works under the hood

Vision Analysis

Uses MediaPipe Face Mesh for real-time facial landmark detection. Gaze direction and head pose are calculated from iris and facial geometry. Video frames are sampled every 1-2 seconds for efficiency.
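
The frame-sampling step can be sketched as a simple timestamp generator. The 1.5-second default interval is an assumption within the 1-2 second range stated above:

```typescript
// Generate the timestamps (in seconds) at which video frames are sampled
// for face-mesh analysis. The 1.5 s default is an assumed value inside
// the 1-2 second range described in the documentation.
function sampleTimestamps(durationSec: number, intervalSec = 1.5): number[] {
  const times: number[] = [];
  for (let t = 0; t < durationSec; t += intervalSec) times.push(t);
  return times;
}
```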

Speech Analysis

Audio transcription uses the Web Speech API for browser-native processing. Filler word detection uses regex pattern matching on the transcript. WPM is calculated from total words divided by speaking time (excluding pauses longer than 2 seconds).
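
The WPM rule above (total words over speaking time, excluding pauses longer than 2 seconds) can be sketched from per-word timestamps. The `Word` shape is an assumption for illustration, not the Web Speech API's result format:

```typescript
// Compute WPM: total words divided by speaking time, where gaps between
// words longer than 2 s are treated as pauses and excluded.
// The Word interface is illustrative, not the Web Speech API's shape.
interface Word { text: string; startSec: number; endSec: number; }

function wordsPerMinute(words: Word[], maxPauseSec = 2): number {
  if (words.length === 0) return 0;
  let speakingSec = 0;
  for (let i = 0; i < words.length; i++) {
    speakingSec += words[i].endSec - words[i].startSec;
    if (i > 0) {
      const gap = words[i].startSec - words[i - 1].endSec;
      if (gap <= maxPauseSec) speakingSec += gap; // short gaps count as speaking time
    }
  }
  return (words.length / speakingSec) * 60;
}
```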

Privacy & Performance

All processing happens client-side in your browser using Web Workers. Videos are not uploaded to external servers for analysis, ensuring privacy. Results can be exported as JSON for integration with other systems.
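
A possible shape for the exported JSON is sketched below. All field names and values here are assumptions for illustration; inspect an actual export for the real schema:

```typescript
// Illustrative sketch of an exported report. Field names and values are
// assumptions, not the product's actual export schema.
interface AnalysisReport {
  overallScore: number;
  metrics: { eyeContactPct: number; wpm: number; fillersPerMin: number };
  flags: { timeSec: number; type: string; note: string }[];
}

const report: AnalysisReport = {
  overallScore: 82,
  metrics: { eyeContactPct: 78, wpm: 138, fillersPerMin: 2.4 },
  flags: [{ timeSec: 94, type: "camera-avoidance", note: "looked away ~6 s" }],
};

const json = JSON.stringify(report, null, 2); // ready to download or POST
```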

Configuration Options
Customize thresholds and analysis parameters

You can adjust analysis thresholds and scoring weights in the Settings page to match your hiring criteria:

  • Minimum eye contact percentage threshold
  • Optimal WPM range for speech pacing
  • Maximum acceptable filler words per minute
  • Custom filler word detection list
  • Scoring weights for each metric
  • Processing quality and performance options
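
The options above might map onto a settings object like the sketch below. Property names are assumptions for illustration, not the product's persisted schema:

```typescript
// Illustrative settings object mirroring the configurable options above.
// Property names are assumptions, not InterviewVision's actual schema.
const settings = {
  minEyeContactPct: 60,
  optimalWpmRange: [120, 160] as [number, number],
  maxFillersPerMin: 5,
  customFillers: ["basically", "actually"],
  weights: { eyeContact: 0.35, pacing: 0.25, fillers: 0.2, engagement: 0.2 },
  processing: { frameSampleIntervalSec: 1.5, quality: "balanced" },
};
```

Keeping the weights summing to 1 ensures the overall score stays on the same 0-100 scale as the individual metrics.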
Export & Integration
Use analysis results in your workflow

Analysis results can be exported in multiple formats:

JSON Report

Structured data with all metrics, timestamps, and flags

PDF Summary

Human-readable report with charts and highlights

CSV Data

Time-series data for custom analysis in Excel/Sheets

Shareable Link

Send results to candidates or team members