InterviewVision

Documentation

Learn how to use InterviewVision Analyzer to evaluate communication skills and technical abilities in interviews

Quick Start Guide
Get started with video analysis in 3 simple steps
1. Upload Your Video

Navigate to the Analyzer page and upload an MP4 or WebM video file (max 15 minutes). Optionally add a GitHub repository URL for code quality analysis.

2. Processing & Analysis

Our AI will analyze the video for eye contact, speech patterns, pacing, screen sharing activities, and code quality. This typically takes 1-2 minutes.

3. Review Results

View detailed metrics, timeline flags, screen activity analysis, code quality scores, and an overall assessment. Export or share the report.

Analysis Metrics Explained

Eye Contact

Measures how often the candidate maintains direct eye contact with the camera.

Excellent: 80%+
Good: 60-79%
Needs Work: <60%

Flags: Camera avoidance, hiding from view, excessive head turning

Speech Pacing

Analyzes words per minute (WPM) and pause patterns to assess speaking rhythm.

Optimal: 120-160 WPM
Too Fast: >160 WPM
Too Slow: <120 WPM

Flags: Rushed delivery, excessive pauses
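
The pacing bands above can be sketched as a small classifier. The function name and band labels here are illustrative, not part of the product's API:

```typescript
// Classify speech pacing from words per minute, using the bands above.
// classifyPacing and the band labels are illustrative names only.
type PacingBand = "optimal" | "too-fast" | "too-slow";

function classifyPacing(wpm: number): PacingBand {
  if (wpm > 160) return "too-fast";
  if (wpm < 120) return "too-slow";
  return "optimal"; // 120-160 WPM
}
```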

Filler Words

Counts filler words like "um", "uh", "like", "you know" per minute of speech.

Excellent: 0-2/min
Acceptable: 3-5/min
High: >5/min

Custom filler words can be configured in Settings
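
The per-minute filler rate, including a customizable word list, can be sketched as follows. The function name and default list are illustrative; the actual detection runs on the transcript as described under Technical Details:

```typescript
// Count filler words per minute of speech from a transcript.
// fillerRate is an illustrative name; the default list mirrors the examples
// above, and extra entries stand in for the custom list from Settings.
function fillerRate(
  transcript: string,
  durationMinutes: number,
  fillers: string[] = ["um", "uh", "like", "you know"],
): number {
  // Escape regex metacharacters, then match each filler as a whole word or
  // phrase, case-insensitively.
  const escaped = fillers.map(f => f.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  const pattern = new RegExp("\\b(" + escaped.join("|") + ")\\b", "gi");
  const count = (transcript.match(pattern) ?? []).length;
  return count / durationMinutes;
}
```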

Screen Share Analysis

Analyzes screen recordings to detect coding activities, tool usage, and productivity patterns.

Coding: IDE/Editor
Terminal: CLI Tools
Documentation: Reading

Detects programming languages and frameworks, and computes a productivity score

Code Quality Review

Reviews GitHub repositories for code quality, architecture, and best practices.

Architecture: Structure
Consistency: Style
Documentation: Comments

Analyzes up to 50 files for comprehensive quality assessment

Overall Score

Composite score combining all metrics with configurable weights.

Strong Communicator: 80-100
Competent: 60-79
Needs Improvement: <60

Default weights: Eye Contact 35%, Pacing 25%, Fillers 20%, Engagement 20%
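
The weighted combination can be sketched as below; the metric property names are illustrative, but the default weights match those listed above:

```typescript
// Combine per-metric scores (0-100) into the overall score using the
// default weights above. Property names are illustrative, not a schema.
const DEFAULT_WEIGHTS = { eyeContact: 0.35, pacing: 0.25, fillers: 0.2, engagement: 0.2 };

function overallScore(
  scores: Record<keyof typeof DEFAULT_WEIGHTS, number>,
  weights: typeof DEFAULT_WEIGHTS = DEFAULT_WEIGHTS,
): number {
  let sum = 0;
  for (const k of Object.keys(weights) as (keyof typeof DEFAULT_WEIGHTS)[]) {
    sum += scores[k] * weights[k];
  }
  return sum;
}
```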

Technical Details
How the analysis works under the hood

Vision Analysis

Uses MediaPipe Face Mesh for real-time facial landmark detection. Gaze direction and head pose are calculated from iris and facial geometry. Video frames are sampled every 1-2 seconds for efficiency.
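
The frame-sampling step can be sketched as a simple timestamp generator. The 1.5-second default interval is an assumption within the 1-2 second range stated above:

```typescript
// Generate the timestamps (in seconds) at which video frames are sampled
// for face-mesh analysis. The 1.5 s default is an assumed value inside
// the 1-2 second range described in the documentation.
function sampleTimestamps(durationSec: number, intervalSec = 1.5): number[] {
  const times: number[] = [];
  for (let t = 0; t < durationSec; t += intervalSec) times.push(t);
  return times;
}
```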

Speech Analysis

Audio transcription uses the Web Speech API for browser-native processing. Filler word detection uses regex pattern matching on the transcript. WPM is calculated from total words divided by speaking time (excluding pauses longer than 2 seconds).
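
The WPM rule above (total words over speaking time, excluding pauses longer than 2 seconds) can be sketched from per-word timestamps. The `Word` shape is an assumption for illustration, not the Web Speech API's result format:

```typescript
// Compute WPM: total words divided by speaking time, where gaps between
// words longer than 2 s are treated as pauses and excluded.
// The Word interface is illustrative, not the Web Speech API's shape.
interface Word { text: string; startSec: number; endSec: number; }

function wordsPerMinute(words: Word[], maxPauseSec = 2): number {
  if (words.length === 0) return 0;
  let speakingSec = 0;
  for (let i = 0; i < words.length; i++) {
    speakingSec += words[i].endSec - words[i].startSec;
    if (i > 0) {
      const gap = words[i].startSec - words[i - 1].endSec;
      if (gap <= maxPauseSec) speakingSec += gap; // short gaps count as speaking time
    }
  }
  return (words.length / speakingSec) * 60;
}
```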

Privacy & Performance

All processing happens client-side in your browser using Web Workers. Videos are not uploaded to external servers for analysis, ensuring privacy. Results can be exported as JSON for integration with other systems.
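
A possible shape for the exported JSON is sketched below. All field names and values here are assumptions for illustration; inspect an actual export for the real schema:

```typescript
// Illustrative sketch of an exported report. Field names and values are
// assumptions, not the product's actual export schema.
interface AnalysisReport {
  overallScore: number;
  metrics: { eyeContactPct: number; wpm: number; fillersPerMin: number };
  flags: { timeSec: number; type: string; note: string }[];
}

const report: AnalysisReport = {
  overallScore: 82,
  metrics: { eyeContactPct: 78, wpm: 138, fillersPerMin: 2.4 },
  flags: [{ timeSec: 94, type: "camera-avoidance", note: "looked away ~6 s" }],
};

const json = JSON.stringify(report, null, 2); // ready to download or POST
```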

Configuration Options
Customize thresholds and analysis parameters

You can adjust analysis thresholds and scoring weights in the Settings page to match your hiring criteria:

  • Minimum eye contact percentage threshold
  • Optimal WPM range for speech pacing
  • Maximum acceptable filler words per minute
  • Custom filler word detection list
  • Scoring weights for each metric
  • Processing quality and performance options
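
The options above might map onto a settings object like the sketch below. Property names are assumptions for illustration, not the product's persisted schema:

```typescript
// Illustrative settings object mirroring the configurable options above.
// Property names are assumptions, not InterviewVision's actual schema.
const settings = {
  minEyeContactPct: 60,
  optimalWpmRange: [120, 160] as [number, number],
  maxFillersPerMin: 5,
  customFillers: ["basically", "actually"],
  weights: { eyeContact: 0.35, pacing: 0.25, fillers: 0.2, engagement: 0.2 },
  processing: { frameSampleIntervalSec: 1.5, quality: "balanced" },
};
```

Keeping the weights summing to 1 ensures the overall score stays on the same 0-100 scale as the individual metrics.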
Export & Integration
Use analysis results in your workflow

Analysis results can be exported in multiple formats:

JSON Report

Structured data with all metrics, timestamps, and flags

PDF Summary

Human-readable report with charts and highlights

CSV Data

Time-series data for custom analysis in Excel/Sheets

Shareable Link

Send results to candidates or team members