11 June 2025 Midjourney Office Hours
Your Style References Are About to Break (But the Video Revolution Makes It Worth It)
Here’s what you need to know before Monday: Midjourney is about to ship its biggest update ever, and your current workflows are going to need adjustments.
The video model everyone’s been waiting for is finally entering launch phase, but that’s just the beginning of what’s changing.
If you’re using style references in production work right now, stop what you’re doing and archive your codes. The new S-Ref system dropping imminently will break compatibility with everything you’ve built. But before you panic, the tradeoff looks solid—we’re getting legitimate mood board capabilities and a randomization feature that could replace hours of manual experimentation.
The S-Ref Overhaul Changes Everything About Style Consistency
The updated style reference system requires appending --sv4 or --sv5 to use your existing codes.
Yes, this means updating every single workflow, template, and saved prompt you’re currently using. The silver lining: the new URL handling system actually works properly, and the mood board functionality transforms how we can maintain brand consistency across projects.
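If your saved prompts live in plain text templates, most of that migration is mechanical. Here's a minimal sketch in Python, assuming prompts sit one per line in a prompts/ folder and that the flag takes exactly the --sv4 form described above (the file layout and helper name are my own, not anything Midjourney ships):

```python
from pathlib import Path

# Hypothetical layout: one saved prompt per line in prompts/*.txt.
# Appends --sv4 to any prompt that uses --sref and doesn't already
# pin a style version, so legacy codes keep resolving after the update.
PROMPTS_DIR = Path("prompts")

def pin_style_version(prompt: str, flag: str = "--sv4") -> str:
    if "--sref" in prompt and "--sv" not in prompt:
        return f"{prompt.rstrip()} {flag}"
    return prompt

for path in PROMPTS_DIR.glob("*.txt"):
    lines = path.read_text(encoding="utf-8").splitlines()
    updated = [pin_style_version(line) for line in lines]
    path.write_text("\n".join(updated) + "\n", encoding="utf-8")
    print(f"updated {path}")
```

Run it against a copy of your templates first; the point is that pinning the legacy flag is a one-pass edit, not a manual rewrite.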
What really caught my attention is the S-ref randomization feature. Instead of manually testing dozens of reference combinations, you’ll be able to let the system generate variations automatically. For agency work where clients want “something like this but different,” this could cut exploration time by 80%.
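Until the built-in randomization ships, you can approximate the idea by sampling reference combinations yourself. A rough sketch, assuming you keep a pool of archived S-ref codes (the codes, base prompt, and weight values below are made up for illustration):

```python
import random

# Hypothetical pool of archived S-ref codes; the real feature will
# presumably vary styles server-side, this just mimics the exploration loop.
SREF_POOL = ["1234567890", "2345678901", "3456789012", "4567890123"]
BASE_PROMPT = "product hero shot, studio lighting"

def random_sref_prompt(n_refs: int = 2) -> str:
    refs = " ".join(random.sample(SREF_POOL, k=n_refs))
    weight = random.choice([50, 100, 200])  # style-weight values to sweep
    return f"{BASE_PROMPT} --sref {refs} --sw {weight}"

for _ in range(5):
    print(random_sref_prompt())
```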
Video Generation: Beautiful, Limited, and Shipping Without Key Features
Let me save you from disappointment: the video model launching soon is image-to-video only.
No text-to-video. No high resolution. No extended lengths. It’s shipping with what the team calls “medium quality” output, deliberately constrained to ensure everyone gets access.
But here’s what actually matters: it works with everything from V4 to V7, including Niji images. That means your entire back catalog of generated assets can potentially become animated content. The visual consistency with Midjourney’s signature style reportedly holds up better than any text-to-video system currently available.
For client work, this translates to finally being able to create simple animated assets from approved stills.
Think subtle logo animations, environmental loops for websites, or social media content that moves just enough to catch attention. The initial length limitations mean you’re looking at short loops rather than narrative sequences, but that covers 90% of commercial use cases.
Pricing Strategy Reveals Long-Term Platform Direction
Midjourney is considering low-entry pricing for video despite the brutal computational costs.
This isn’t charity—it’s a calculated bet on adoption over immediate profit. Early access will likely require yearly subscriptions or Mega Plan membership, so if you’re month-to-month and want in early, now’s the time to upgrade.
The lack of relaxed mode support at launch tells you everything about the server requirements. Video generation will likely consume 2x the current infrastructure capacity, and they’re actively negotiating with three different providers to make it work. Translation: expect queue times and potential generation limits until they scale up.
The Niji Video Model Could Leapfrog Everything Else
Anime and illustration studios, pay attention: the Niji-specific video model scheduled for release within a month of the main launch might actually ship with text-to-video capabilities. The structured nature of anime visuals makes them easier to train for video generation, potentially giving Niji users features the main model won’t have for months.
For studios doing animated content, webtoons, or motion graphics, this could be game-changing. The ability to maintain consistent anime-style animation from text prompts would eliminate the current workflow of generating stills first, then animating.
Infrastructure Reality Check: What This Means for Your Workflows
The server load implications are serious. Video generation requiring double the current capacity means we’re looking at potential access restrictions, queue management, or tiered availability. If you’re running time-sensitive client projects, build buffer time into your workflows now.
The team is being transparent about including “broken” content in the initial rating parties—heads spinning around, glitchy outputs, the works. This isn’t a polished launch; it’s a functional one. Plan accordingly for the first few weeks of quirks and quality variations.
Future Roadmap: V8 and the Evolution of Visual Understanding
V7.1 is borrowing learnings from the video model to improve image coherence. More interesting: V8 is in early development focused on “visual understanding.” This suggests movement toward models that actually comprehend spatial relationships and object permanence, not just pattern matching.
The new Style Explore system building on updated S-ref capabilities could finally give us the granular control over aesthetic elements we’ve been requesting. Combined with renewed experimentation on the problematic O-ref feature, we’re looking at potentially having full control over style, subject, and composition as separate elements.
What You Should Do Right Now
If you’re on production deadlines: Archive all your current S-ref codes with clear documentation. Set up parallel workflows using --sv4 flags to ensure continuity when the update drops (a minimal archiving sketch follows this list).
If you want early video access: Upgrade to yearly or Mega Plan membership before the launch. The computational costs mean free tier access could be months away.
If you’re pitching animated content: Start preparing still assets now using V7 or Niji. Having a library ready for image-to-video conversion gives you first-mover advantage when video ships.
If server stability matters: Build alternative workflows now. The 2x capacity requirement means potential downtime or queues during peak hours, especially in the first month.
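For the archiving step, the format matters less than capturing each code alongside the project it belongs to and a note on what it produces. A minimal sketch, assuming your codes are already tracked in a CSV (the filename and column names are hypothetical):

```python
import csv
import json
from datetime import date

# Hypothetical input: sref_codes.csv with columns code,project,notes.
# Writes a dated JSON archive so legacy codes and their context survive
# the switch to the new S-ref system.
with open("sref_codes.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

archive = {
    "archived_on": date.today().isoformat(),
    "legacy_flag": "--sv4",  # flag named in the session for legacy codes
    "codes": rows,
}

out_path = f"sref_archive_{date.today().isoformat()}.json"
with open(out_path, "w", encoding="utf-8") as f:
    json.dump(archive, f, indent=2)

print(f"archived {len(rows)} codes to {out_path}")
```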
The Character Consistency Problem Finally Gets Priority
User voting has pushed character consistency to the top of the development priority list. For anyone doing sequential art, storyboards, or brand mascot work, this is huge. Combined with video capabilities, consistent characters could enable actual narrative content creation within Midjourney’s ecosystem.
The focus on user-requested features like angle shifting and extended video lengths for post-launch development shows a team actually listening to professional use cases. This isn’t adding features for their own sake—it’s building toward a complete creative workflow.
Bottom Line: Disruption With Purpose
This isn’t incremental improvement. Between the S-ref overhaul, video launch, and infrastructure scaling, Midjourney is attempting to transform from an image generator into a complete visual content platform.
The next 60 days will be messy, with broken workflows and limited features, but the trajectory is clear.
For professionals, this means opportunity. While others complain about broken style codes or limited video lengths, those who adapt quickly will have new capabilities their competition won’t understand for months. The platforms that win aren’t the ones that stay stable—they’re the ones that push forward fast enough to maintain competitive advantage.
Get ready for a wild ride. Your workflows are about to break, but what you’ll be able to create by year’s end will make today’s capabilities look primitive.