Digital Marionette

An Object-Aware Generation Workflow

Project Overview

Digital Marionette represents a groundbreaking R&D collaboration with Lightcraft Technology, the creators of Jetset Cine. This innovative workflow solves a fundamental challenge in AI-powered VFX: how to apply different AI styles to specific objects within a single scene without "concept bleeding" between elements.

The system uses 3D proxy geometry and Cryptomatte masks to control multiple specialized LoRAs simultaneously, creating precise, object-aware generation that maintains consistency across animated sequences. This approach enables professional VFX workflows to harness generative AI with the precision required for commercial production, bridging the gap between creative AI capabilities and industry-standard control requirements.

🎯

Goal

Develop a production-ready workflow for object-aware AI generation using 3D geometry to control multiple LoRAs within a single scene.

⏱️

Timeline

3 months intensive R&D (June - August 2025)

Client

Lightcraft Technology

🧠

Role

Lead AI/VFX Developer, AI Training Specialist, AI Application Specialist, Custom Tool Creator, Workflow Architect.

🛠️

Tools & Technologies

ComfyUI, Blender, Custom LoRA Trainer, Cryptomatte, PartPacker, Custom Nodes, Python Scripting, Docker.

Challenge & Solution

The Challenge

Generative AI's lack of precise control creates major obstacles for professional VFX integration. The core challenges included:

  • Style Interference: When multiple AI models are active in a scene, their outputs interfere with each other, creating unpredictable results across different objects.
  • Lack of Spatial Control: Traditional AI generation cannot distinguish between different objects in a scene, making targeted styling impossible.
  • Temporal Consistency: Maintaining object identity and style consistency across animated sequences requires sophisticated control mechanisms.
  • Production Integration: The workflow needed to integrate seamlessly with existing professional VFX pipelines and industry-standard tools.
  • Scalability Requirements: Manual processes were too time-intensive for practical production implementation.

The Solution

Digital Marionette controls AI generation with "invisible strings," much as a puppeteer controls a marionette:

  • 3D Proxy Control: 3D objects serve as invisible guides, with Cryptomatte masks defining exactly where each LoRA should be applied in the final composite.
  • Automated LoRA Pipeline: Custom-built automated training system generates specialized LoRAs from concept images, including data augmentation and captioning.
  • Modular Generation System: Each object is generated separately with its dedicated LoRA, then composited using the Cryptomatte masks for pixel-perfect precision.
  • Industry-Standard Integration: Built specifically for Blender/Cryptomatte workflows, making it immediately adoptable by existing VFX teams.
  • Custom ComfyUI Nodes: Developed specialized nodes for metadata handling, automated processing, and seamless pipeline integration.
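
The modular generation step above, where each object is generated with its own LoRA and then combined through its mask, can be sketched as a simple per-pixel blend. This is a minimal illustration rather than the project's actual compositing code: the function name `composite_by_masks`, the object names, and the use of plain nested lists in place of image arrays are all assumptions made to keep the example dependency-free.

```python
from typing import Dict, List

# A grayscale image as rows of pixel values in [0, 1] (stand-in for real image arrays).
Image = List[List[float]]


def composite_by_masks(
    background: Image,
    layers: Dict[str, Image],
    masks: Dict[str, Image],
) -> Image:
    """Blend each object's generated layer over the background using its mask.

    masks[name][y][x] is the coverage of that object at the pixel, so a
    value of 0.5 on an edge mixes half layer and half of what is underneath.
    """
    height, width = len(background), len(background[0])
    result = [row[:] for row in background]  # copy so the input is untouched
    for name, layer in layers.items():
        mask = masks[name]
        for y in range(height):
            for x in range(width):
                coverage = mask[y][x]
                result[y][x] = coverage * layer[y][x] + (1.0 - coverage) * result[y][x]
    return result


# Hypothetical one-row frame with two objects.
background = [[0.0, 0.0, 0.0]]
layers = {"chariot": [[1.0, 1.0, 1.0]], "tent": [[0.5, 0.5, 0.5]]}
masks = {"chariot": [[1.0, 0.5, 0.0]], "tent": [[0.0, 0.0, 1.0]]}
frame = composite_by_masks(background, layers, masks)
print(frame)  # [[1.0, 0.5, 0.5]]
```

Because every layer only contributes where its own mask has coverage, a style learned by one LoRA cannot leak into pixels that belong to another object.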

Early chariot generation attempt with unwanted roof

Early challenge: AI generated unwanted elements (roof) despite specific prompts

Final successful chariot generation

Solution: Precise control achieved through workflow refinement

Process & Methodology

This project required pioneering new techniques in AI-powered VFX, combining cutting-edge generative models with traditional production pipelines through systematic R&D and iterative refinement.

1. Concept Development & Model Research

The project began with extensive research into the latest generative models and concept refinement. Initial attempts using standard prompting revealed the need for more sophisticated control mechanisms.

Generative model chariot samples

Model A: Concept variations

Advanced generative model chariot samples

Model B: Refined outputs

Through systematic testing of leading-edge generative models, we identified the optimal foundation approaches and began developing historically accurate concepts for the target assets.

2. 3D Asset Generation & Workflow Pivot

Instead of complex 3D texturing, we developed a LoRA pipeline that proved more robust and production-friendly.

3D chariot mesh from PartPacker

3D-generated chariot geometry

3D tent mesh from PartPacker

3D-generated tent geometry

Using PartPacker, a 3D generation tool, we created the proxy geometry that serves as the foundation for the workflow.

3. Automated LoRA Training Pipeline

This phase involved creating a sophisticated automated system for LoRA training. A critical breakthrough was solving the "Knowledge Stitching" problem for consistent AI captioning.

First successful turntable generation

Breakthrough: First successful turntable dataset generation

Data variety testing with different lighting

Data augmentation: Automated lighting variations for robust training
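
A turntable dataset with lighting variations can be planned as a grid of camera angles crossed with light settings. The sketch below only computes that shot list and does not render anything. The function `turntable_plan`, its parameters, and the energy values are hypothetical, since the write-up does not describe the actual render setup.

```python
import itertools
import math
from typing import Dict, List


def turntable_plan(
    num_views: int,
    radius: float,
    light_energies: List[float],
) -> List[Dict[str, float]]:
    """Return one shot per (camera angle, light energy) combination.

    The camera is placed on a circle of the given radius around the origin,
    which is where the object is assumed to sit.
    """
    shots = []
    for view, energy in itertools.product(range(num_views), light_energies):
        angle = 2.0 * math.pi * view / num_views
        shots.append({
            "angle_deg": round(math.degrees(angle), 3),
            "cam_x": round(radius * math.cos(angle), 6),
            "cam_y": round(radius * math.sin(angle), 6),
            "light_energy": energy,
        })
    return shots


# Hypothetical settings: 8 views around the object, 2 light intensities each.
plan = turntable_plan(num_views=8, radius=5.0, light_energies=[500.0, 1000.0])
print(len(plan))  # 16
```

Each entry could then drive one render, so the training set covers every viewing angle under every lighting condition.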

4. GPU Optimization & Workflow Architecture

As the workflow became more sophisticated, GPU memory constraints threatened the project's viability. This led to a strategic decision to architect the system as multiple interconnected workflows rather than one monolithic process.
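
One way to picture the split is a chain of stages that hand results to each other through files, so only one stage's models need to be in memory at a time. The sketch below is an assumption about the general pattern, not the project's implementation: `run_staged`, the stage functions, and the JSON handoff format are hypothetical, and the stages here only manipulate small dictionaries.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

Stage = Callable[[Dict], Dict]


def run_staged(stages: List[Stage], payload: Dict, work_dir: Path) -> Dict:
    """Run stages one after another, passing results through files on disk."""
    for index, stage in enumerate(stages):
        result = stage(payload)
        handoff = work_dir / f"stage_{index}.json"
        handoff.write_text(json.dumps(result))
        # Drop the in-memory result; the next stage starts from the file alone.
        del result
        payload = json.loads(handoff.read_text())
    return payload


def generate_layers(payload: Dict) -> Dict:
    # Hypothetical stage: would run one generation workflow per object.
    return {**payload, "layers": [f"{name}.png" for name in payload["objects"]]}


def composite(payload: Dict) -> Dict:
    # Hypothetical stage: would combine the layers using the masks.
    return {**payload, "final": "frame_0001.png"}


with tempfile.TemporaryDirectory() as tmp:
    output = run_staged(
        [generate_layers, composite],
        {"objects": ["chariot", "tent"]},
        Path(tmp),
    )
print(output["final"])  # frame_0001.png
```

In a real setup each stage would be a separate workflow run, and the files on disk would also let a failed stage be retried without repeating the earlier ones.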

First successful LoRA test: Proof of concept demonstrating consistent Persian war chariot generation

Breakthrough Moment: The first successful LoRA training proved the core concept. The trained model consistently generated Persian war chariots that maintained the desired aesthetic while allowing for natural variation - exactly what professional VFX workflows require.

5. Cryptomatte Integration & Final Delivery

The final phase involved integrating Blender's Cryptomatte system for precise object isolation and developing the final compositing workflow. This stage proved the entire system could work with production 3D scenes and animated camera movements.

Production Integration: Rather than requiring video footage, the system uses animated Cryptomatte sequences from Blender to define exactly where each LoRA should be applied. This provides the precision needed for professional VFX while maintaining the creative flexibility of generative AI.
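
Cryptomatte stores, for each pixel, a ranked list of (ID, coverage) pairs, and an object's matte is the sum of the coverage values whose ID matches that object. The sketch below shows that extraction for an animated sequence. It uses nested lists instead of EXR data, and the small ID values are stand-ins (real Cryptomatte IDs are floats derived from a hash of the object name). The function names are hypothetical.

```python
from typing import List, Tuple

Pixel = List[Tuple[float, float]]  # ranked (id, coverage) pairs for one pixel
Frame = List[List[Pixel]]          # rows of pixels

# Stand-in IDs for illustration only.
CHARIOT_ID = 1.0
TENT_ID = 2.0


def extract_matte(frame: Frame, object_id: float) -> List[List[float]]:
    """Sum the coverage of every rank whose ID matches the requested object."""
    return [
        [sum((cov for pid, cov in pixel if pid == object_id), 0.0) for pixel in row]
        for row in frame
    ]


def extract_sequence(frames: List[Frame], object_id: float) -> List[List[List[float]]]:
    """Build one matte per frame, giving an animated mask for a single object."""
    return [extract_matte(frame, object_id) for frame in frames]


# Two hypothetical 1x2 frames: the chariot moves out of the first pixel.
frames = [
    [[[(CHARIOT_ID, 0.75), (TENT_ID, 0.25)], [(TENT_ID, 1.0)]]],
    [[[(TENT_ID, 1.0)], [(TENT_ID, 1.0)]]],
]
chariot_mattes = extract_sequence(frames, CHARIOT_ID)
print(chariot_mattes)  # [[[0.75, 0.0]], [[0.0, 0.0]]]
```

The fractional coverage on edge pixels is what allows clean, anti-aliased boundaries between objects when the generated layers are composited.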

Siggraph Showcase: The project culminated in a showcase-ready demonstration delivered in time for Siggraph 2025, proving the workflow's readiness for industry presentation and potential commercial adoption.

Technical Innovation & Custom Tools

🧠 Automated Training System

Developed proprietary automation techniques for generating specialized AI models with consistent, production-quality results.

🔧 Custom Tools & Integration

Created custom ComfyUI nodes for metadata handling, captioning, and export that solved key integration challenges.

🎯 Production-Ready Pipeline

Engineered a complete workflow enabling professional VFX teams to achieve precise, reliable results with advanced AI capabilities.

📜 Commercial Framework

Structured as a professional R&D engagement with clear licensing and IP terms.

Results & Impact

Digital Marionette successfully delivered a production-ready workflow that bridges the gap between generative AI capabilities and professional VFX requirements. The final demonstration showcased a 100-frame animated sequence with perfect object-specific styling.

Before & After

Before: 3D proxy geometry. After: object-aware generation.

Siggraph 2025 Showcase

Finalized the project during the industry's premier computer graphics and interactive techniques conference.

3-Month R&D Sprint

Rapid development cycle from initial concept to production-ready workflow demonstration.

Production Ready

Delivered a complete automated pipeline that integrates seamlessly with professional VFX workflows.

Key Achievements

Beyond the technical metrics, this project achieved several groundbreaking outcomes:

  • Workflow Innovation: Created what is, to my knowledge, the first documented system for object-aware AI generation using 3D geometry control, solving a fundamental VFX integration challenge.
  • Production Readiness: Delivered a complete, automated pipeline that integrates seamlessly with existing Blender/Cryptomatte workflows used by professional VFX teams.
  • Industry Validation: Finalized the project during Siggraph 2025, demonstrating the workflow's commercial viability and technical sophistication.
  • Technical Breakthroughs: Solved multiple complex problems including automated LoRA training, Knowledge Stitching for captioning, and GPU memory optimization through modular architecture.

"Fantastic! It looks like the process is working."

— Eliot Mack, CEO, Lightcraft Technology

Reflection & Learnings

Digital Marionette represented a fascinating intersection of R&D innovation and practical problem-solving, requiring both cutting-edge experimentation and disciplined engineering to deliver a functional system within tight deadlines.

What Worked Well

  • Strategic Workflow Pivots: Abandoning complex 3D texturing in favor of the LoRA approach proved more elegant and production-friendly.
  • Modular Architecture: Splitting the system into interconnected workflows solved GPU constraints while maintaining quality and enabling parallel development.
  • Automated Systems: Building comprehensive automation for LoRA training made the workflow scalable and repeatable for production use.
  • Custom Node Development: Creating specialized ComfyUI nodes for exporting solved critical integration challenges.
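
For readers unfamiliar with ComfyUI, a custom node is a Python class that declares its inputs, outputs, and entry-point method through a few class attributes. The sketch below follows that convention, but the node itself is hypothetical: `FrameMetadataNode`, its fields, and its category are invented for illustration and are not the nodes built for this project.

```python
import json


class FrameMetadataNode:
    """Hypothetical node that packs per-object settings into a JSON string."""

    @classmethod
    def INPUT_TYPES(cls):
        # ComfyUI reads this mapping to build the node's input sockets and widgets.
        return {
            "required": {
                "object_name": ("STRING", {"default": "chariot"}),
                "lora_name": ("STRING", {"default": "chariot_lora"}),
                "frame": ("INT", {"default": 1, "min": 0}),
            }
        }

    RETURN_TYPES = ("STRING",)
    RETURN_NAMES = ("metadata_json",)
    FUNCTION = "build"  # name of the method ComfyUI calls
    CATEGORY = "digital_marionette/metadata"  # hypothetical menu location

    def build(self, object_name, lora_name, frame):
        metadata = {"object": object_name, "lora": lora_name, "frame": frame}
        # Nodes return a tuple with one entry per declared output.
        return (json.dumps(metadata, sort_keys=True),)


# ComfyUI discovers nodes through these module-level mappings.
NODE_CLASS_MAPPINGS = {"FrameMetadataNode": FrameMetadataNode}
NODE_DISPLAY_NAME_MAPPINGS = {"FrameMetadataNode": "Frame Metadata (sketch)"}

# The class can also be exercised directly, outside ComfyUI.
(metadata_json,) = FrameMetadataNode().build("tent", "tent_lora", 12)
print(metadata_json)
```

Keeping the node logic in a plain method like this makes it testable without launching ComfyUI.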

Challenges & Solutions

  • Knowledge Stitching Problem: Inconsistent AI captioning was solved by developing a custom node that integrated multiple captioning models with refined prompting.
  • GPU Memory Constraints: Out-of-memory errors led to strategic architectural decisions, splitting workflows to maintain quality while working within hardware limits.
  • Historical Accuracy: Achieving specific Persian war chariot details required iterative prompt engineering and reference image integration.
  • Client Communication: Balancing ambitious R&D exploration with practical delivery timelines required clear progress updates and expectation management.
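
The write-up does not describe how the Knowledge Stitching node merges the outputs of its captioning models. One plausible reading is that it combines their captions into a single consistent caption per training image. The sketch below shows that idea under stated assumptions: `stitch_captions`, the trigger token, and the comma-separated caption format are all hypothetical.

```python
from typing import Iterable, List


def stitch_captions(trigger: str, captions: Iterable[str]) -> str:
    """Merge comma-separated captions, dropping duplicates and keeping order.

    The trigger token always comes first so every training caption names
    the subject in the same way.
    """
    seen = set()
    phrases: List[str] = []
    for caption in captions:
        for phrase in caption.split(","):
            cleaned = phrase.strip().lower()
            if cleaned and cleaned != trigger.lower() and cleaned not in seen:
                seen.add(cleaned)
                phrases.append(cleaned)
    return ", ".join([trigger] + phrases)


# Hypothetical outputs from two different captioning models for one image.
stitched = stitch_captions(
    "persian_chariot",
    ["a wooden chariot, two wheels", "Two wheels, desert background"],
)
print(stitched)  # persian_chariot, a wooden chariot, two wheels, desert background
```

A fixed trigger token and de-duplicated phrases give the LoRA trainer captions that describe the same subject in the same terms across the whole dataset.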

Future Applications

  • Live Action Integration: The workflow could be extended to work with real video footage, applying object-specific AI styles to live action elements.
  • Real-Time Applications: Optimization for real-time generation could enable interactive VFX applications and game engine integration.
  • Multiple Object Scenes: Scaling to handle complex scenes with dozens of different objects, each with specialized LoRAs.
  • Industry Adoption: The modular approach makes this workflow adaptable to various VFX studio pipelines and production requirements.

Personal Takeaway

This project reinforced the importance of balancing innovation with practicality. While pushing the boundaries of what's possible with AI-powered VFX, the real value came from creating systems that professionals could actually use in production environments. The experience of building custom tools, solving undocumented technical challenges, and delivering within client timelines strengthened my ability to translate experimental research into robust, commercial-ready solutions. Working with Lightcraft Technology also demonstrated how collaborative R&D can accelerate innovation when both parties bring complementary expertise to complex problems.