450 free credits on signup · Plans from $16.90/mo

HappyHorse 1.0 — #1 AI Video Model on LMArena

Elo 1392 in image-to-video, Elo 1333 in text-to-video. 40-layer Self-Attention Transformer with 8-step denoising and integrated audio. It topped both no-audio Arena leaderboards, ahead of every other AI video generator listed.

#1 LMArena · Elo 1392 Image-to-Video · Elo 1333 Text-to-Video · 8-Step Denoising

HappyHorse 1.0 AI Video Gallery

Browse AI Video Examples — Text-to-Video & Image-to-Video

Real AI video outputs generated by HappyHorse 1.0. See what text-to-video and image-to-video can produce — then recreate with your own assets.

30 template examples · 100% Real AI Video Assets · Model: HappyHorse 1.0

Reproduction quality depends on uploaded assets, prompt clarity, and selected mode.

Confidence score combines mode constraints, shot determinism, and duration complexity.

Template Feed

Street Magic Character Swap (Subject Reference)
Fast illusion pacing with character continuity and social-ready rhythm.

High-Impact Visual Trick (Subject Reference)
Designed for replay value with continuous motion and unexpected reveal.
Best result came from tighter camera wording in line 2.

Cinematic Transition Chain (Multi Frame)
Multi-shot transition pacing for creators who need strong visual momentum.
This one converts really well for social hooks.

Abstract Motion Concept (All Reference)
A stylized long-form concept clip focused on atmosphere and texture.

Epic Scene Build (All Reference)
Long sequence with strong production value and layered environmental detail.
Prompt structure is clean. Upload order matters a lot here.

Warm Lifestyle Motion (First/Last Frame)
Short and clean lifestyle movement optimized for product story hooks.
Matched the motion curve after replacing with my own character set.

City Rhythm Shot (First/Last Frame)
Urban pacing with controlled start/end framing for cleaner edits.

Galaxy Portal Journey (All Reference)
A long sci-fi sequence with transitions across deep-space locations.
This one converts really well for social hooks.

Split Planet Sequence (Multi Frame)
High-contrast celestial motion with dramatic visual progression.
Needed two retries, then quality was close to reference.

Water Drop Macro (First/Last Frame)
Nature-focused macro style with shallow depth and detail emphasis.

Forest Micro Scene (First/Last Frame)
Close-up environmental storytelling with calm movement and focus pull.
Matched the motion curve after replacing with my own character set.

Atmospheric Story Cut (All Reference)
Long atmospheric scene optimized for mood-driven visual storytelling.
Best result came from tighter camera wording in line 2.

Moody Mushroom Study (First/Last Frame)
Dark, moody close-up texture shot ideal for aesthetic edits.

Vintage Street Follow (Multi Frame)
Zoom-out and follow movement through a period-style city environment.
Needed two retries, then quality was close to reference.

Graceful Daily Moment (First/Last Frame)
Fixed camera lifestyle beat with graceful hand motion and subtle realism.
Prompt structure is clean. Upload order matters a lot here.

Urban Chase Sequence (Subject Reference)
Fast side tracking chase with crowd chaos and subject identity retention.

Longform Motion Reel (All Reference)
A production-style long take useful for premium ad storytelling.
Best result came from tighter camera wording in line 2.

Retro Fighter Standoff (Multi Frame)
16-bit arcade duel scene inspired by classic fighting game aesthetics.
This one converts really well for social hooks.

HappyHorse Meme Glow (Subject Reference)
Playful, self-referential meme-style promo clip with dramatic glow energy.

Neon Rain Operative (Subject Reference)
From a neon skyline to a masked heroine close-up, built for sleek sci-fi ads.
Prompt structure is clean. Upload order matters a lot here.

Van Gogh Brawl (All Reference)
Expressionist brushstroke action with close-up tension in a classic room.
Matched the motion curve after replacing with my own character set.

Studio Face Hold (First/Last Frame)
Clean black-background portrait with subtle expression shifts.

Night Formula Street Run (Subject Reference)
High-speed Formula-style race car chase through neon city avenues.
This one converts really well for social hooks.

Dojo Duel Storyboard (Multi Frame)
Storyboard-to-final anime sword duel with cinematic pacing.
Needed two retries, then quality was close to reference.

Candy Surgery Twist (All Reference)
Medical drama setup that flips into absurd candy-filled surrealism.

Reef Predator Hybrid (Subject Reference)
Underwater chase featuring a tiger-shark and zebra-fish hybrid concept.
Matched the motion curve after replacing with my own character set.

Ruin to Clean Room (Multi Frame)
Same-frame room restoration from decay to clean interior.
Best result came from tighter camera wording in line 2.

Monkey Orders Milk Tea (Subject Reference)
Comedic cafe scene with a monkey customer and bichon barista.

Midnight Floor Monologue (First/Last Frame)
Black-and-white reflection mood transitioning into warm floor close-up.
Needed two retries, then quality was close to reference.

Fighter Style Match (All Reference)
Match real choreography to anime-style fighters with consistent poses.
Prompt structure is clean. Upload order matters a lot here.

LMArena Rankings — April 2026

HappyHorse 1.0: The Mystery AI Video Model That Dominated LMArena Then Vanished

In early April 2026, HappyHorse 1.0 appeared on the Artificial Analysis Video Arena, instantly topped both text-to-video and image-to-video leaderboards — beating Seedance 2.0, Kling 3.0, and PixVerse V6 — then disappeared within days. Here is the complete analysis.

HappyHorse 1.0 Model Overview

Model Type: Text + Image to Video with integrated audio synthesis

Architecture: 40-layer single-stream Self-Attention Transformer, no Cross-Attention

Inference Steps: Only 8 denoising steps, no CFG (Classifier-Free Guidance)

Languages: Chinese, English, Japanese, Korean, German, French

Release Artifacts: Base model / Distilled model / Super-resolution model / Inference code

Appeared On: Artificial Analysis Video Arena & LMArena Video Track

Current Status: V1/V2 removed from public leaderboards; GitHub/Model Hub marked "Coming Soon"

Suspected Origin: Asian team; community speculates Wan 2.7 or Seedance lineage (unconfirmed)

Timeline: The Rise and Disappearance

2026 is the Year of the Horse in the Chinese zodiac. The name "HappyHorse" echoes this, and overseas media noted a wave of horse-themed AI releases from Chinese teams, which many observers read as a clue to an Asian origin.

1. V1 appeared anonymously on Artificial Analysis Video Arena and reached the text-to-video top 3 within hours
2. V2 appeared almost simultaneously; both variants briefly held #1 and #2 on the image-to-video leaderboard
3. HappyHorse 1.0 moved ahead of Seedance 2.0 720p, Kling 3.0, and PixVerse V6 on both leaderboards
4. Within days, V1 and V2 were both removed from public rankings. An official page later appeared with "base model open-source coming soon"

This "sudden appearance → domination → quiet removal" pattern typically means either an anonymous A/B test by a lab, or a vendor caught by traffic exposure before a planned launch.

Text-to-Video (No Audio)

1. HappyHorse 1.0: Elo 1333
2. Seedance 2.0 720p: Elo 1273
3. SkyReels V4: Elo 1244
4. Kling 3.0 1080p: Elo 1241
5. PixVerse V6: Elo 1239

Image-to-Video (No Audio)

1. HappyHorse 1.0: Elo 1392
2. Seedance 2.0 720p: Elo 1355
3. PixVerse V6: Elo 1338
4. Grok Imagine Video: Elo 1333
5. Kling 3.0 Omni 1080p: Elo 1297

Text-to-Video (With Audio)

1. Seedance 2.0 720p: Elo 1219
2. HappyHorse 1.0: Elo 1205

[Figure: HappyHorse 1.0 image-to-video Arena rankings, Elo 1392, leading Seedance 2.0 and PixVerse V6]

[Figure: HappyHorse 1.0 architecture analysis, 40-layer single-stream Self-Attention Transformer diagram]

Key Performance Observations

Image-to-video: 1392 vs 1355, a 37-point Elo lead over Seedance 2.0. Assuming the Arena uses the standard 400-point Elo scale, that corresponds to an expected head-to-head win rate of about 55%
Text-to-video is the larger lead: 1333 vs 1273, a 60-point gap over Seedance 2.0 (about 59% expected win rate) even without a reference image, suggesting stronger shot composition and motion
Audio track: currently #2 (1205 vs 1219) behind Seedance 2.0, which still leads in audio-visual synchronization
Multi-language support and human-centric optimization are the likely drivers of the image-to-video lead; digital human scenarios appear to be its core strength
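To put the Elo gaps above in perspective, the short sketch below converts a rating difference into an expected head-to-head win rate. It assumes the Arena uses the conventional Elo logistic curve with a 400-point scale, which the source does not state; the function name is illustrative.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of pairwise wins for A over B under the standard Elo model.

    Assumption: the leaderboard uses the conventional 400-point logistic scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings as quoted in the leaderboards: (HappyHorse 1.0, Seedance 2.0 720p).
matchups = {
    "image-to-video": (1392, 1355),
    "text-to-video": (1333, 1273),
    "text-to-video with audio": (1205, 1219),
}

for track, (happyhorse, seedance) in matchups.items():
    win_rate = elo_expected_score(happyhorse, seedance)
    print(f"{track}: gap {happyhorse - seedance:+d}, expected win rate {win_rate:.1%}")
```

Under that assumption, a lead of a few dozen points means winning somewhat more than half of blind comparisons, not all of them, which is a useful sanity check when reading leaderboard margins.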

Architecture Deep-Dive: 40-Layer Single-Stream Transformer

Traditional video generation models use multi-stream architectures where text, video, and audio each have separate encoders interacting via Cross-Attention. HappyHorse 1.0 collapses this into a single pipeline.

Single-Stream Self-Attention Replaces Multi-Stream Complexity

One 40-layer Self-Attention Transformer processes text, video, and audio tokens simultaneously — no Cross-Attention, no modality-specific sub-networks. All modalities are unified into a single token sequence attending to each other in the same attention space.

Higher parameter efficiency — no redundant parameters for modality isolation
Shorter inference path — no cross-modal data transfers, more continuous kernels
Unified training objective — text, visual, and audio share one loss function for end-to-end optimization
Native audio-video joint generation — sound and visuals are tokens in the same sequence with built-in synchronization
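The single-stream idea can be illustrated with a toy example: tokens from all three modalities are concatenated into one sequence, and one self-attention pass lets every token attend to every other token. This is a minimal sketch in pure Python, not HappyHorse's actual code (which is not public); the token counts, the 4-dimensional embeddings, and the identity Q/K/V projections are all simplifying assumptions.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (toy only)."""
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(w * value[i] for w, value in zip(row, tokens)) for i in range(dim)]
        )
    return outputs, weights


def fake_tokens(rng, count, dim=4):
    """Random stand-ins for embedded tokens of one modality."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
text, video, audio = fake_tokens(rng, 3), fake_tokens(rng, 5), fake_tokens(rng, 2)

# One sequence, one attention space: no cross-attention between separate streams.
sequence = text + video + audio
modality = ["text"] * 3 + ["video"] * 5 + ["audio"] * 2

outputs, weights = self_attention(sequence)

# How much attention the last audio token spends on each modality.
last_audio = len(sequence) - 1
by_modality = {"text": 0.0, "video": 0.0, "audio": 0.0}
for tag, weight in zip(modality, weights[last_audio]):
    by_modality[tag] += weight
print(by_modality)
```

Because the attention row for an audio token spans text and video positions as well, synchronization between sound and picture can in principle be learned inside the same layers rather than through a separate cross-modal module.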

8-Step Denoising + No CFG: Extreme Inference Speed

Only 8 denoising steps, with no Classifier-Free Guidance, are needed to produce Arena-leading quality. This suggests Consistency Distillation, Rectified Flow, or Progressive Distillation during training, compressing multi-step sampling into near-direct prediction. Combined with the announced (not yet released) distilled and super-resolution models, the full stack targets both edge-friendly and high-throughput server deployment.
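As a rough illustration of why few steps and no CFG matter for speed, here is a toy 8-step Euler sampler in the style of rectified flow. Rectified flow is only one of the techniques the analysis speculates about, so this is a sketch under that assumption; `toy_velocity` is a hypothetical stand-in for the network and simply knows the straight-line path to a fixed target.

```python
def toy_velocity(x, t, target):
    """Stand-in for the network: exact velocity of the straight path
    x_t = (1 - t) * target + t * noise, i.e. v = (x_t - target) / t."""
    return [(xi - ti) / t for xi, ti in zip(x, target)]


def sample(noise, target, steps=8):
    """Euler integration from t = 1 (pure noise) to t = 0 (clean sample).

    Without CFG each step costs one model call. With CFG a step typically
    needs a conditional and an unconditional pass, doubling the call count.
    """
    x = list(noise)
    calls = 0
    dt = 1.0 / steps
    for i in range(steps):
        t = 1.0 - i * dt
        velocity = toy_velocity(x, t, target)
        calls += 1
        x = [xi - dt * vi for xi, vi in zip(x, velocity)]
    return x, calls


noise = [0.9, -1.3, 0.4]
target = [0.1, 0.2, 0.3]
result, model_calls = sample(noise, target, steps=8)
print(result, model_calls)
print("calls with CFG at the same step count:", 2 * model_calls)
```

With a perfectly straight path the sampler lands on the target in any number of steps; real models only approximate this, which is why distillation is usually needed to make 8 steps sufficient.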

Estimated Parameter Count

Weights are not yet public, but given the 40-layer single-stream architecture, 6-language support, and Arena performance, the model likely falls in the 10B–30B parameter range — comparable to Wan 2.x, Seedance 1.x, and Hunyuan Video.
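The 10B–30B range can be sanity-checked with the usual back-of-envelope rule for a standard Transformer block: about 4·d² parameters for attention plus 8·d² for an MLP with 4x expansion, or roughly 12·L·d² in total. The hidden sizes below are hypothetical, since the source gives only the layer count, and the estimate ignores embeddings, norms, biases, and any VAE or audio codec.

```python
def transformer_params(layers: int, d_model: int, mlp_ratio: int = 4) -> int:
    """Rough parameter count for a stack of standard Transformer blocks.

    attention: Q, K, V and output projections -> 4 * d^2
    MLP:       up and down projections        -> 2 * mlp_ratio * d^2
    """
    attention = 4 * d_model ** 2
    mlp = 2 * mlp_ratio * d_model ** 2
    return layers * (attention + mlp)


LAYERS = 40  # from the published model overview

# Hypothetical hidden sizes; none of these are confirmed for HappyHorse 1.0.
for d_model in (4096, 5120, 6144, 8192):
    billions = transformer_params(LAYERS, d_model) / 1e9
    print(f"d_model={d_model}: ~{billions:.1f}B parameters")
```

By this estimate, a 40-layer model needs a hidden size of roughly 5,000 to 8,000 to fall inside the 10B–30B range, which is in line with other large video models but remains a guess until weights are published.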

[Figure: HappyHorse 1.0 performance benchmarks, comparison across text-to-video, image-to-video, and audio tracks]

Ideal Use Cases for HappyHorse 1.0

Based on its emphasis on human-centric scenarios, facial performance, and lip-sync, HappyHorse 1.0 is best suited for:

Virtual presenters and digital human short videos
AI-generated short dramas with accurate facial acting
Multi-language promotional videos with lip-sync
Advertising clips featuring human talent
Social media content creation with talking-head formats

The Mystery: Who Made HappyHorse 1.0?

Three main theories have emerged in the AI community. None are officially confirmed.

#1 Alibaba Wan 2.7 variant

Evidence: Wan 2.7 released in the same period; Alibaba Tongyi Lab is aggressive in the video track; "Horse" name fits the zodiac year

Counter: Wan 2.7 official description focuses on image/thinking mode, architecture description doesn't match HappyHorse's single-stream 40-layer design

#2 ByteDance Seedance team experiment

Evidence: Seedance 2.0 is a top Chinese Arena contender; ByteDance has motive for anonymous testing

Counter: Seedance 2.0 still leads HappyHorse in the audio track — ByteDance has no reason to upload a stronger version under a different name

#3 Undisclosed lab / academic consortium

Evidence: "Full open-source + distilled model + super-res model" bundle is more research-style; quirky naming, minimal website

Counter: Model quality has reached commercial tier — a pure academic team would struggle to train at this scale independently

Community opinion is leaning toward theory #3: HappyHorse 1.0 likely comes from a new team using an open-source strategy to break through overnight. Anonymous Arena submission first builds credibility through blind testing data, then formal release follows. This "leaderboard first → open-source → then product" playbook has been validated by multiple Asian labs in the past 18 months. But until the GitHub repo and Model Hub go live, no claim of "it's definitely X" should be treated as fact.

Three Layers of Industry Impact

Architecture Paradigm Shift

For two years, mainstream models refined multi-stream Diffusion + Cross-Attention. HappyHorse proves "single-stream Self-Attention + minimal-step inference" can also reach SOTA — and is cleaner to engineer. More teams will reconsider whether Cross-Attention is a "complexity tax" worth paying.

Open-Source Strategy Evolution

HappyHorse chose "anonymous leaderboard → announce open-source → release weights" instead of "paper first → weights later". This is closer to a consumer product launch, putting user-perceived data before academic publication. If it open-sources as promised, it could become the next heavily-forked video foundation model after Wan, Hunyuan Video, and Open-Sora.

Blind Testing Credibility

The "instant domination then vanish" pattern is a wake-up call for platforms like Artificial Analysis and LMArena. As anonymous entries multiply, distinguishing "genuine new models" from "existing model checkpoints" will become a critical challenge for leaderboard maintainers.

Frequently Asked Questions

Can I download and use HappyHorse 1.0 now?

Not yet. The official page still marks GitHub and Model Hub links as "Coming Soon". Weights and inference code are not publicly available. Be cautious of any channel claiming otherwise.

Why did HappyHorse disappear from the Arena leaderboard?

No confirmed explanation exists. Two mainstream theories: (1) the model authors voluntarily withdrew to prepare a formal release, or (2) the platform removed anonymous entries pending identity verification. Neither implies the model is flawed.

Is HappyHorse 1.0 the same as Wan 2.7?

No official confirmation. Wan 2.7 from Alibaba focuses on "thinking mode" and long-text rendering. HappyHorse emphasizes 40-layer single-stream Transformer and 8-step denoising. Their technical narratives diverge significantly — they appear to be two separate products in the same era.

Can HappyHorse generate video with audio?

Yes. The 40-layer Transformer jointly processes text, video, and audio tokens, natively supporting "text input → video with audio output". It currently ranks #2 in the audio-included Arena track.

What business scenarios is HappyHorse 1.0 best for?

Based on its emphasis on facial performance, lip-sync, and multi-language support: virtual presenters, digital human short videos, AI short dramas, multi-language promotional clips, and ad content featuring people. For landscape or product-focused shots, Seedance 2.0, Veo 3.1, and Kling 3.0 remain more proven choices.

How should developers prepare?

Keep your toolchain model-neutral: integrate video generation through a unified multi-model platform, prepare your prompts, shot scripts, and review workflows. When HappyHorse open-sources or launches an API, you only need to switch the model parameter.
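One way to keep a toolchain model-neutral is to route every request through a small registry keyed by model name, so that adopting a new model means registering one more backend. The sketch below is illustrative only: the model identifiers, `VideoRequest` fields, and stub backends are hypothetical, and no real vendor API is called.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class VideoRequest:
    """Model-independent description of a generation job (hypothetical fields)."""
    prompt: str
    duration_s: int = 5
    resolution: str = "720p"


Backend = Callable[[VideoRequest], str]
_BACKENDS: Dict[str, Backend] = {}


def register(model_name: str) -> Callable[[Backend], Backend]:
    """Decorator that adds a backend to the registry under `model_name`."""
    def decorator(fn: Backend) -> Backend:
        _BACKENDS[model_name] = fn
        return fn
    return decorator


@register("seedance-2.0")
def _seedance_stub(request: VideoRequest) -> str:
    # A real backend would call the vendor SDK or HTTP API here.
    return f"[seedance-2.0] queued {request.duration_s}s clip: {request.prompt}"


@register("happyhorse-1.0")
def _happyhorse_stub(request: VideoRequest) -> str:
    # Placeholder until an official API or open weights are available.
    return f"[happyhorse-1.0] queued {request.duration_s}s clip: {request.prompt}"


def generate(model: str, request: VideoRequest) -> str:
    """Dispatch a request to whichever backend is registered for `model`."""
    try:
        backend = _BACKENDS[model]
    except KeyError:
        raise ValueError(f"unknown model {model!r}; known: {sorted(_BACKENDS)}") from None
    return backend(request)


request = VideoRequest(prompt="presenter greets the camera in a bright studio")
print(generate("seedance-2.0", request))
print(generate("happyhorse-1.0", request))  # only the model parameter changes
```

Prompts, shot scripts, and review steps stay attached to `VideoRequest`, so they carry over unchanged when the model name is swapped.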

Source: Full Analysis on APIYi — HappyHorse Model Mystery: AI Video LMArena Analysis

How AI Video Generation Works

Create AI Video in 3 Simple Steps

From text-to-video or image-to-video input to finished AI video in minutes — with HappyHorse 1.0 control at each stage.

Upload Your Materials

Drop product shots, reference clips, voice tracks, or mood images for AI video generation into one workspace.

Text-to-video and image-to-video inputs
Drag-and-drop upload flow

Direct The AI Video Output

Write intent with @references so each asset has a clear role in the HappyHorse 1.0 generated scene — more control than Sora or Veo offer.

Reference assets with @mention
Control shot style and camera movement
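The @reference step can be pictured as a small parsing pass that maps each mention in the prompt to an uploaded asset and flags anything unresolved before generation starts. The exact mention syntax used by the product is not documented here, so the `@name` pattern, the asset names, and the `resolve_mentions` helper are assumptions for illustration.

```python
import re

# Assumed syntax: "@" followed by letters, digits, underscores, or hyphens.
MENTION = re.compile(r"@([\w-]+)")


def resolve_mentions(prompt: str, assets: dict) -> dict:
    """Map every @mention in the prompt to its uploaded asset path.

    Raises KeyError listing any mention that has no matching upload.
    """
    names = MENTION.findall(prompt)
    missing = sorted({name for name in names if name not in assets})
    if missing:
        raise KeyError(f"no uploaded asset for: {', '.join(missing)}")
    return {name: assets[name] for name in names}


# Hypothetical uploads for one project.
uploads = {
    "hero": "uploads/hero_portrait.png",
    "street": "uploads/night_street.mp4",
    "theme": "uploads/theme_track.wav",
}

prompt = "Slow push-in on @hero walking through @street, cut on the beat of @theme."
print(resolve_mentions(prompt, uploads))
```

Checking mentions up front gives each asset an explicit role in the scene and avoids spending credits on a run that silently ignores a misspelled reference.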

Generate AI Video, Review, Ship

Run AI video variants with HappyHorse 1.0, Sora 2, or Veo 3 — compare versions and export the best result.

Rapid AI video iteration cycles
Export-ready for ads and social

HappyHorse 1.0 Core Capabilities

#1 on LMArena — Built Different from Every Other AI Video Model

HappyHorse 1.0 uses a 40-layer single-stream Self-Attention Transformer with unified token processing for text, video, and audio. Only 8 denoising steps — no CFG needed.

Text-to-Video Generation (Elo 1333)

HappyHorse 1.0 ranks #1 on LMArena text-to-video, ahead of Seedance 2.0 (1273), Kling 3.0 (1241), and PixVerse V6 (1239). Generate videos from text in 6 languages.

Image-to-Video Generation (Elo 1392)

The highest-ranked image-to-video model on LMArena. HappyHorse 1.0 beats Seedance 2.0 (1355) and PixVerse V6 (1338) by a significant margin.

Integrated Audio Synthesis

Generate videos with synchronized audio in a single pass. Speech, music, and sound effects are synthesized alongside the visual output.

Facial & Lip-Sync Precision

Optimized for human-centric scenarios: virtual presenters, digital people, and short dramas with accurate facial performance and lip synchronization.

8-Step Ultra-Fast Denoising

Only 8 denoising steps vs. 20-50+ for competitors. No Classifier-Free Guidance needed. Dramatically faster inference without quality loss.

40-Layer Self-Attention Transformer

Single-stream architecture with unified token processing for text, video, and audio. No Cross-Attention — a fundamentally different design from Sora, Veo, or Kling.

AI Video Use Cases

AI Video Generation For Revenue-Critical Work

HappyHorse 1.0 is tuned for teams that ship AI video content fast. Use text-to-video and image-to-video to outpace Sora and Veo workflows.

Ads

AI Video for E-commerce Product Ads

Turn product stills into conversion-focused short ads for paid acquisition and PDP loops.

Lower AI video production cost

Creative

AI Video Creative Testing

Generate multiple AI video hooks and shot styles from the same source assets.

Faster A/B cycle turnaround

Music

Music & Rhythm AI Video

Pair tracks with beat-synced AI video transitions for socials and teaser drops.

Consistent clip cadence

Film

AI Video Previsualization

Prototype camera language and pacing before expensive live production starts.

Reduced pre-production uncertainty

Social

Social Media Clips

Create scroll-stopping short clips optimized for Reels, TikTok, and YouTube Shorts.

Higher engagement rate

Narrative

AI Storytelling & Narrative

Build scene-by-scene animated narratives with consistent characters and style.

Scalable story production

Real Estate

Real Estate & Architecture

Transform renders and photos into immersive property walkthrough videos.

Faster listing turnaround

Education

Education & Training

Generate explainer videos and visual tutorials from text-based course material.

Reduced content creation time

HappyHorse 1.0 Pricing

AI Video Generation Plans — HappyHorse 1.0, Sora 2 & Veo 3

Start generating AI video for free with text-to-video and image-to-video. Scale when your video generation volume grows.

Save 33%

Lite

$12.42 per month

💰 Pay yearly only: $149 (was $202.80)

Save $53.8

Perfect for getting started with AI video

5000 credits included monthly

5,000 credits/month
HappyHorse 1.0, Veo 3.1, Sora 2 included
AI image models (Seedream 5.0, NB2)
Download videos in MP4 format
All multi-modal modes
Access to all templates
No watermark
Community support

1080p resolution
Priority queue

Pro

Popular

$16.58 per month

💰 Pay yearly only: $199 (was $298.80)

Save $99.8

Built for individual creators and small teams

9000 credits included monthly

9,000 credits/month
Full HappyHorse 1.0 access — all modes
AI image models (Seedream 5.0, NB2)
1080p resolution (HappyHorse & Veo)
Priority generation queue
Download videos in MP4 format
Access to all templates
No watermark
Keep your videos forever
API access

Team workspace
Batch processing

Business

$49.92 per month

💰 Pay yearly only: $599 (was $838.80)

Save $239.8

For high-frequency production teams

30000 credits included monthly

30,000 credits/month
All Pro features included
Team collaboration workspace
Batch processing
Dedicated account support
Fastest generation queue
SLA and invoicing options
Keep your videos forever

Enterprise customization by request
Overage usage via credit packs
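The per-month figures and savings on the three plans follow from the yearly prices, and the arithmetic can be reproduced as below. The yearly prices, list prices, and monthly credit allowances are taken from the plan cards; the cost per 1,000 credits is a derived comparison and not a number the page publishes.

```python
PLANS = {
    "Lite": {"yearly": 149.0, "list": 202.8, "credits": 5000},
    "Pro": {"yearly": 199.0, "list": 298.8, "credits": 9000},
    "Business": {"yearly": 599.0, "list": 838.8, "credits": 30000},
}


def summarize(plan: dict) -> dict:
    """Derive the displayed per-month price and savings from the yearly price."""
    monthly = plan["yearly"] / 12
    saving = plan["list"] - plan["yearly"]
    return {
        "per_month": round(monthly, 2),
        "saving": round(saving, 2),
        "saving_pct": round(100 * saving / plan["list"], 1),
        # Derived metric: effective price of 1,000 monthly credits.
        "usd_per_1000_credits": round(monthly / plan["credits"] * 1000, 2),
    }


for name, plan in PLANS.items():
    print(name, summarize(plan))
```

By this arithmetic the "Save 33%" badge corresponds to the Pro plan's discount, while Lite and Business come out somewhat lower, and the effective price per credit falls as the plan size grows.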

FAQ

AI Video Generation — Common Questions

If you are evaluating HappyHorse 1.0 for AI video production, these are the most common questions from teams.
