The Art of Control: Balancing Autonomy, Authority, and Initiative in Human-AI Co-Creation

TL;DR for operators

Most AI product debates still treat “control” as a single slider: more automation on the right, more human control on the left. Convenient, tidy, and wrong in exactly the way tidy models usually are.

The MOSAAIC paper argues that control in human-AI co-creation has at least three separable dimensions: autonomy, or who can choose creative actions; initiative, or who can proactively contribute; and authority, or who can decide and direct the process.¹ This matters because a system can be highly autonomous but still reactive, proactive but not authoritative, or authoritative in small tactical ways while leaving the human responsible for the final artifact.

The paper’s evidence is not a new benchmark or controlled experiment. It is a framework derived from a systematic literature review of 172 full-length papers, then applied to six existing co-creative systems: Cyborg, LuminAI, Snake Story, ChatGPT, Reframer, and Shimon. That makes the contribution conceptual and diagnostic. It helps teams describe control clearly before they start arguing about UX features, “agentic” roadmaps, or whether the model is being too helpful again.

For business use, MOSAAIC is most valuable as a product design map. It tells teams to ask three separate questions: should the AI choose actions, should it take initiative, and should it have decision authority? Then it offers two broad ways to manage the balance: AI-controlled adaptation, where the system adjusts its control level based on context, and human-controlled configuration, where users set or steer the balance themselves.

The practical lesson is simple: co-creative AI is not made better by blindly increasing AI power. It is made better by allocating the right kind of control to the right party at the right moment. A shocking concept, apparently.

Control is not one dial

Imagine a writing assistant inside a brand team. It can generate campaign concepts, suggest alternative headlines, rewrite copy for different segments, and critique tone. Now ask a basic question: who is in control?

The easy answer is “the user,” because the user approves the final output. But that answer hides the real machinery. The AI may be choosing the options the user sees. It may be deciding when to interrupt. It may be steering the style of the campaign by framing which alternatives look plausible. The user may still hold final approval while the system quietly controls the creative pathway.

That is the problem MOSAAIC tries to name. Control in co-creative AI is not a single property. It is a bundle of powers distributed across the human and the machine.

The paper defines control as the power to determine, initiate, and direct the process of co-creation. Those verbs are doing the heavy lifting. “Determine” points to choice. “Initiate” points to proactive contribution. “Direct” points to decision power. In MOSAAIC, these become three dimensions: autonomy, initiative, and authority.

This is a useful correction to a common misconception: that co-creative control is basically autonomy with nicer furniture. It is not. Autonomy answers whether the AI can choose creative actions. Initiative answers whether it can act before being asked. Authority answers whether it can shape the direction of the work.

Once those dimensions separate, many product design puzzles become clearer. A model that generates a whole storyboard from a one-line prompt has high autonomy, but if it only acts after the prompt, it may have low initiative. A music robot that starts improvising with a human performer has shared initiative, but may not have final authority over the composition. A design tool that offers sliders and variants may preserve human authority while still giving the AI creative autonomy within a defined space.

This is the article’s core mechanism: control is not about whether AI is “in the loop” or “out of the loop.” It is about which part of the loop the AI controls.

MOSAAIC turns co-creation into a three-axis design space

The paper positions MOSAAIC as a framework for “Managing Optimization towards Shared Autonomy, Authority, and Initiative in Co-creation.” The acronym is doing a lot of decorative work, but the framework itself is useful.

It maps control across three axes. Each axis ranges from full human control to full AI control, with shared control in the middle.

Control dimension	Core question	Full human end	Shared middle	Full AI end
Autonomy	Who chooses creative actions?	AI follows direct instructions	Human and AI both make creative choices	AI independently generates outputs
Initiative	Who starts or pushes the process forward?	AI waits for the user	Both parties can initiate and respond	AI proactively contributes without prompting
Authority	Who directs decisions and outcomes?	Human decides direction and final use	Human and AI negotiate or mutually adapt	AI determines creative direction

This is not just taxonomy for taxonomy’s sake, although academia does enjoy a labelled box. The framework matters because different combinations create very different products.

A prompt-based image tool may offer wide autonomy inside generation but little initiative: it does not usually decide to begin a creative exchange on its own. A dance improvisation system may share initiative because it can respond, start, or continue movement dynamically. A writing assistant may appear powerful while still leaving authority with the human, because the human selects the final argument, tone, and publication decision.

That difference matters operationally. If a user complains that a tool “takes over,” the problem may not be autonomy. It may be excessive initiative or misplaced authority. If a tool feels “passive,” the issue may not be output quality. It may be that the AI never initiates, probes, reframes, or contributes at the right moment.

This is where MOSAAIC becomes useful for product teams: it turns vague user discomfort into diagnosable control allocation.
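One way to make the three axes concrete is to record them as a small data structure. The sketch below is illustrative only: the `Level` and `ControlProfile` names are hypothetical, and the example profile for a prompt-based image tool is an assumption for demonstration (especially its authority value), not a coding taken from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """Position on one MOSAAIC axis, from full human to full AI control."""
    FULL_HUMAN = "full-human"
    SHARED = "shared"
    FULL_AI = "full-AI"


@dataclass(frozen=True)
class ControlProfile:
    """Hypothetical record of where a system sits on each axis."""
    autonomy: Level    # who chooses creative actions
    initiative: Level  # who starts or pushes the process forward
    authority: Level   # who directs decisions and outcomes

    def describe(self) -> str:
        return (f"autonomy={self.autonomy.value}, "
                f"initiative={self.initiative.value}, "
                f"authority={self.authority.value}")


# Assumed profile for a prompt-based image tool: creative room inside
# generation, but it waits for the user and the user decides the outcome.
image_tool = ControlProfile(Level.SHARED, Level.FULL_HUMAN, Level.FULL_HUMAN)
print(image_tool.describe())
```

Keeping the axes as separate fields is the point: a complaint that a tool "takes over" can then be traced to one field rather than to a single agency score.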

Autonomy is about creative freedom, not conversational boldness

Autonomy is the easiest axis to understand and the easiest one to overuse. In MOSAAIC, autonomy means the agent’s ability to choose creative action.

A fully human-autonomous system is closer to a passive tool. The user decides what to do, and the AI executes. A fully AI-autonomous system generates outputs independently, with minimal human oversight. Shared autonomy sits between those extremes: both human and AI have meaningful freedom to make creative choices.

The paper notes that co-creativity research has often treated autonomy as central to AI’s creative role. That makes sense. A system that merely formats text or applies a filter is not a creative partner in any serious sense. To participate creatively, the AI needs some room to choose.

But autonomy alone does not create balanced co-creation. A model can generate surprising, high-quality material and still produce a weak collaboration if the human cannot influence direction, timing, or decision flow. The system may be creative, but the interaction may feel like receiving a parcel from a very imaginative warehouse.

For operators, this is the first design warning. Increasing model autonomy may improve output variety, but it does not automatically improve user agency. In many enterprise settings, the user does not simply want “more AI.” They want controlled creative leverage: options, suggestions, reframing, and execution without losing responsibility for intent.

That distinction matters in domains such as brand, product design, education, game development, and professional writing. Users often want AI to expand the search space, not quietly redefine the goal. Autonomy is valuable when it creates new reachable possibilities. It becomes risky when it substitutes the system’s interpretation for the user’s purpose.

Initiative is where assistants become collaborators

Initiative is the second axis, and it is where many “agentic” products either become useful or become irritating.

In MOSAAIC, initiative means the ability to proactively contribute to the creative process. A system with full human initiative waits. It responds when asked. A system with full AI initiative contributes without user prompting. Shared initiative allows both parties to start, respond, and shift the exchange dynamically.

This is a more subtle dimension than autonomy. A system may be autonomous in generation but completely reactive in interaction. ChatGPT, as coded in the paper’s case study, has shared autonomy in writing because it interprets prompts and contributes creative strategies. But it is classified as full-human initiative because it generally responds to the user rather than independently beginning the creative process.

That classification is useful because it cuts through the lazy claim that all powerful generative AI is already a co-creator in the same way. It is not. A reactive system can produce impressive work while still leaving the rhythm of collaboration entirely human-led.

The paper’s case studies show the contrast. LuminAI, an improvisational dance partner, can begin dancing if the human does not start within a threshold period. Shimon, a robotic marimba player, can generate phrases, set tempo, and lead parts of a performance. Snake Story alternates contributions between human and AI in a game-design story process. These systems are not merely producing outputs. They are participating in the timing of the exchange.

For business design, initiative is the difference between a tool that waits in the corner and a collaborator that notices the work is stuck. A creative AI system with well-designed initiative might ask for missing constraints, propose next steps, flag contradictions, or introduce alternatives when the user is looping. Poorly designed initiative, by contrast, becomes interruption dressed as intelligence.

The lesson is not “make AI more proactive.” That is how products become haunted office furniture. The lesson is to decide where proactivity reduces friction and where it violates the user’s sense of authorship.
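The LuminAI behaviour described above, where the AI begins if the human has not started within a threshold period, can be sketched as a simple timeout rule. This is a minimal illustration, not LuminAI's actual implementation: the function name, parameters, and the five-second default are all hypothetical.

```python
def ai_should_start(seconds_since_invite: float,
                    human_has_started: bool,
                    threshold_seconds: float = 5.0) -> bool:
    """Return True when the AI should take initiative and begin.

    The AI defers while the human is active and only steps in once the
    (assumed) waiting threshold has elapsed.
    """
    if human_has_started:
        return False
    return seconds_since_invite >= threshold_seconds


print(ai_should_start(6.0, human_has_started=False))  # waited long enough
print(ai_should_start(6.0, human_has_started=True))   # human already moving
```

The threshold is the design lever: set it too low and initiative becomes interruption, too high and the system is effectively reactive.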

Authority is the dimension teams avoid until something breaks

Authority is the most politically sensitive axis because it concerns decision-making power. In MOSAAIC, authority refers to who directs the creative process and decides the tactical means for achieving the goal.

A full-human authority system keeps the human in charge. The AI may suggest, generate, or critique, but the user directs the process and decides what counts. Full AI authority means the system determines creative direction without human intervention. Shared authority allows negotiation and mutual adaptation.

This is the dimension enterprise teams often under-specify. Product requirements mention “assist,” “recommend,” “generate,” or “automate,” but they do not explicitly state whether the AI is allowed to overrule, redirect, prioritize, or finalize. Then everyone acts surprised when users either ignore the system or resent it.

In the six systems analysed by the paper, authority remains mostly human-controlled. Four out of six systems are coded as full-human authority: Cyborg, Snake Story, ChatGPT, and Reframer. Only LuminAI and Shimon are coded as shared authority. The paper points out an interesting pattern: both shared-authority systems are improvisational, so the creative process itself is the product rather than a fixed artifact requiring final approval.

That distinction is important. Authority is easier to share when there is no single document, design, legal claim, or brand asset that must be approved at the end. In improvisational dance or music, the interaction unfolds as the output. In professional writing, product design, marketing, or software work, there is usually an artifact that someone must own. Where ownership exists, authority becomes harder to distribute.

This gives operators a practical boundary. Shared authority is not inherently better. It depends on accountability. If the output affects compliance, brand identity, customer promises, safety, or money, authority needs to be explicit. The AI may have local authority over suggestions or transformations, but final authority may need to remain human.

A system can still be sophisticated without pretending the model is a co-equal executive. Sometimes the right architecture is not shared authority. Sometimes it is strong AI autonomy plus human authority, with transparent initiative rules. Not glamorous, but neither are audit logs, and yet civilisation persists.

The paper’s evidence is diagnostic, not experimental

The paper builds MOSAAIC from a systematic literature review and then applies the framework to six case studies. That matters because the evidence supports a specific kind of claim.

The authors searched the ACM Digital Library and the Association for Computational Creativity (ACC) database, using keyword groups related to control dimensions, balancing strategies, human-AI co-creation, mixed-initiative co-creativity, AI autonomy, adaptive AI, and variable autonomy. They retrieved 446 papers, removed duplicates to reach 317 unique papers, screened abstracts down to 243 papers, and then conducted full-text review to produce a final corpus of 172 papers: 146 from ACM and 26 from ACC.

From that literature, they developed MOSAAIC through iterative thematic analysis. They then coded six co-creative AI systems using the framework.
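The screening funnel reported above is easy to sanity-check. The short sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# Counts as reported in the text above.
funnel = [
    ("retrieved", 446),
    ("after duplicate removal", 317),
    ("after abstract screening", 243),
    ("final corpus after full-text review", 172),
]
final_by_source = {"ACM": 146, "ACC": 26}

# Papers dropped between each stage and the next.
dropped = {
    later_name: earlier_count - later_count
    for (_, earlier_count), (later_name, later_count) in zip(funnel, funnel[1:])
}

# The per-source split should add up to the final corpus.
assert sum(final_by_source.values()) == funnel[-1][1]
print(dropped)
```

The stage-to-stage differences come out as 129 duplicates removed, 74 papers excluded at abstract screening, and 71 excluded at full-text review.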

Paper component	Likely purpose	What it supports	What it does not prove
Systematic literature review	Main evidence for framework construction	Control can be organised around autonomy, initiative, and authority	That these are the only possible dimensions
3D control model	Conceptual mechanism	Control can be mapped as a multi-axis distribution	That the exact optimal point can be calculated
Strategy model	Design interpretation	Control can be balanced through AI adaptation and human configuration	That either strategy improves performance in every domain
Six system case studies	Applicability demonstration and qualitative comparison	MOSAAIC can classify existing systems and reveal design patterns	Statistical generalisability or causal effects
Case coding by researchers	Implementation detail for applying the framework	The framework can be operationalised for system analysis	Objective ground truth about each system’s lived user experience

This distinction should not be treated as a weakness. The paper is not trying to prove that a given interface increases productivity by 17.3%, the traditional imaginary precision of product theatre. It is trying to supply a vocabulary and structure for analysing control.

That said, the boundary is real. The paper does not provide user trials showing that MOSAAIC-designed systems outperform alternatives. It does not estimate business impact. It does not produce a quantitative optimisation rule for how much autonomy, initiative, or authority a system should have in a given task. Its contribution is conceptual clarity and design diagnosis.

For leaders, that means MOSAAIC should not be used as a procurement scorecard or ROI calculator. It should be used earlier in the product process: when defining agent behaviour, interaction patterns, permission boundaries, and user configuration options.

Six cases show that autonomy has advanced faster than authority

The case study section is useful because it shows where current co-creative systems already distribute control, and where they still remain conservative.

The six systems span theatre, dance, game design, writing, drawing, and music:

System	Domain	Autonomy	Initiative	Authority	Balancing strategy
Cyborg	Theatre	Shared	Full-human	Full-human	Human-controlled configuration
LuminAI	Dance	Shared	Shared	Shared	Both AI adaptation and human configuration
Snake Story	Game design	Shared	Shared	Full-human	Both
ChatGPT	Writing	Shared	Full-human	Full-human	Both
Reframer	Drawing	Full-human	Full-human	Full-human	Human-controlled configuration
Shimon	Music	Shared	Shared	Shared	Both

The pattern is revealing. Five out of six systems are coded as having shared autonomy. Only Reframer is coded as full-human autonomy: it follows user prompts and offers variation control, but it does not depart creatively from the prompt the way more emergent systems do.

Initiative is more divided. Three systems show shared initiative: LuminAI, Snake Story, and Shimon. Three remain full-human initiative: Cyborg, ChatGPT, and Reframer. This suggests that many systems can generate creatively once invoked, but fewer are designed to participate proactively in the timing of collaboration.

Authority is the most constrained. Four out of six systems retain full-human authority. Only LuminAI and Shimon share authority, and both operate in improvisational performance contexts. That is not accidental. Where the creative process is live and emergent, authority can be distributed moment by moment. Where there is a final artifact, humans tend to keep decision rights.
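The tallies in the preceding paragraphs can be reproduced directly from the case-study table. The sketch below copies the codings from that table into a dictionary (the layout is an illustrative choice) and counts each level per axis.

```python
from collections import Counter

# Codings copied from the case-study table above.
# Tuple order: autonomy, initiative, authority.
systems = {
    "Cyborg":      ("shared", "full-human", "full-human"),
    "LuminAI":     ("shared", "shared", "shared"),
    "Snake Story": ("shared", "shared", "full-human"),
    "ChatGPT":     ("shared", "full-human", "full-human"),
    "Reframer":    ("full-human", "full-human", "full-human"),
    "Shimon":      ("shared", "shared", "shared"),
}

axes = ("autonomy", "initiative", "authority")
tallies = {
    axis: Counter(coding[i] for coding in systems.values())
    for i, axis in enumerate(axes)
}

for axis in axes:
    print(axis, dict(tallies[axis]))
```

The counts fall from five shared on autonomy, to three on initiative, to two on authority, which is the gradient the section title describes.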

The business interpretation is straightforward: generative capability has moved faster than control architecture. Many products can now generate, transform, and suggest. Fewer have a mature theory of when to interrupt, when to lead, when to defer, and when to ask the user to decide.

That gap is where many enterprise AI tools become awkward. They can produce content but cannot manage collaboration. They know how to answer but not how to share the work.

The two balancing strategies are adaptation and configuration

MOSAAIC identifies two broad strategies for balancing control: AI-controlled adaptation and human-controlled configuration.

AI-controlled adaptation means the system adjusts autonomy, initiative, or authority based on task, context, workload, expertise, or user state. For example, the AI might take more responsibility during routine steps and defer during complex or high-stakes steps. It might become more proactive when the user appears stuck, or less directive when the user is exploring.

Human-controlled configuration means users can explicitly shape the control dynamic. They may set the AI’s level of involvement, choose between variants, adjust sliders, define stylistic constraints, or decide when the system should suggest, critique, or act.

These strategies should not be treated as rivals. The case studies show that four of the six systems use both. That combination is likely where many practical products will land: the system adapts in real time, but the user can still configure the terms of collaboration.

For product design, the distinction maps neatly onto two questions.

First, what should the system infer? If the user is a novice, under time pressure, or asking for broad exploration, the AI may reasonably take more initiative. If the user is an expert refining a precise artifact, the AI may need to stay quieter and preserve human authority.

Second, what should the user explicitly control? Experts may want configuration over style, constraints, thresholds, authority boundaries, and intervention frequency. Novices may prefer simpler modes: “guide me,” “collaborate with me,” or “stay in assistant mode.”

The trap is to hide too much control inside adaptation. When the AI silently changes its level of authority or initiative, users may experience the system as inconsistent or manipulative. The opposite trap is to expose every control as a setting and turn creativity into cockpit management. A product that requires users to configure twelve collaboration sliders before writing a paragraph has not empowered anyone. It has relocated the burden with a nicer font.

The design challenge is to combine adaptive intelligence with legible configuration.
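A minimal sketch of that combination, under stated assumptions: the AI proposes an initiative level from context (adaptation), and a user-set ceiling caps it (configuration). The three level names, the `UserConfig` class, and the inference rule are hypothetical simplifications, not something the paper specifies.

```python
from dataclasses import dataclass

# Hypothetical initiative levels, ordered from least to most proactive.
LEVELS = ["quiet", "suggest", "lead"]


@dataclass
class UserConfig:
    """Human-controlled configuration: the user sets a ceiling."""
    max_initiative: str = "suggest"


def adapt_initiative(user_is_stuck: bool, high_stakes: bool) -> str:
    """AI-controlled adaptation: a toy rule inferring a level from context."""
    if high_stakes:
        return "quiet"  # defer during high-stakes steps
    return "lead" if user_is_stuck else "suggest"


def effective_initiative(config: UserConfig, user_is_stuck: bool,
                         high_stakes: bool) -> str:
    """Combine both strategies: adapt, but never exceed the user's ceiling."""
    proposed = adapt_initiative(user_is_stuck, high_stakes)
    ceiling = LEVELS.index(config.max_initiative)
    return LEVELS[min(LEVELS.index(proposed), ceiling)]


# The user is stuck, so the AI would like to lead, but the ceiling holds.
print(effective_initiative(UserConfig("suggest"), user_is_stuck=True,
                           high_stakes=False))
```

Because the ceiling is a single visible setting, the adaptation stays legible: the system can change its behaviour, but only inside limits the user chose.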

What MOSAAIC changes in product planning

MOSAAIC is especially useful before teams build features. It gives product managers, designers, and AI engineers a shared language for specifying collaboration behaviour.

Instead of asking, “How agentic should this tool be?” teams can ask a better set of questions:

Product design question	MOSAAIC dimension	Operational consequence
Can the AI choose among creative actions, or only execute the user’s chosen action?	Autonomy	Determines whether the tool expands the creative search space or simply accelerates execution
Can the AI speak first, interrupt, suggest next steps, or reframe the task?	Initiative	Determines whether the system behaves like a passive assistant or active collaborator
Can the AI direct the workflow, reject weak options, or decide what should happen next?	Authority	Determines accountability, user agency, and governance requirements
Should the system adjust its role based on user behaviour or task state?	AI-controlled adaptation	Determines whether collaboration changes dynamically
Should the user be able to set the AI’s role, assertiveness, or decision rights?	Human-controlled configuration	Determines transparency, personalisation, and trust

This is the practical value of the framework. It converts fuzzy “AI role” discussions into design decisions.

For a marketing platform, the team may decide that AI should have high autonomy in generating variants, moderate initiative in suggesting campaign angles, and low authority over final brand claims. For an educational creativity tool, the AI may have more initiative with novices but less authority with advanced learners. For a game-design assistant, shared initiative may be valuable during ideation, while human authority remains essential for final narrative direction.
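The marketing example can be written down as a small specification that a team reviews alongother requirements. This is a sketch under assumptions: the low/moderate/high scale follows the wording of the example, while the `validate` and `needs_human_signoff` helpers and the sign-off rule are hypothetical.

```python
# AI's share of control per axis, using the wording of the marketing example.
marketing_spec = {
    "autonomy": "high",        # generating variants
    "initiative": "moderate",  # suggesting campaign angles
    "authority": "low",        # final brand claims stay with humans
}

ALLOWED = {"low", "moderate", "high"}


def validate(spec: dict) -> None:
    """Raise if an axis is missing or uses an unknown level."""
    for axis in ("autonomy", "initiative", "authority"):
        if spec.get(axis) not in ALLOWED:
            raise ValueError(f"{axis} must be one of {sorted(ALLOWED)}")


def needs_human_signoff(spec: dict) -> bool:
    """Hypothetical rule: anything short of high AI authority needs sign-off."""
    return spec["authority"] != "high"


validate(marketing_spec)
print(needs_human_signoff(marketing_spec))
```

Forcing all three axes to be present is the useful part: a spec that leaves authority blank fails validation instead of leaving the question to be settled in production.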

The same logic applies beyond explicitly creative products. Strategy tools, coding assistants, analytics copilots, slide generators, and workflow agents all involve co-creation in a broader sense. They produce artifacts with humans. They choose, suggest, and sometimes direct. MOSAAIC gives teams a way to state exactly what kind of partner the AI is supposed to be.

That clarity matters because “agentic AI” has become a product label stretched over everything from autocomplete to autonomous workflow execution. MOSAAIC does not solve that semantic inflation, but it does make it harder to hide behind it.

Business value comes from better control diagnosis, not immediate ROI proof

The business relevance of MOSAAIC is not that it proves co-creative AI increases revenue. It does not. The paper does not run commercial deployments or measure ROI. The business value is earlier and more architectural.

First, MOSAAIC helps diagnose user dissatisfaction. If users say the tool is “too controlling,” “too passive,” or “not collaborative,” teams can map the complaint to autonomy, initiative, or authority. That is better than randomly adjusting prompt style and calling it iteration.

Second, it helps design role-specific experiences. Experts and novices often need different control distributions. Experts may want autonomy from the AI but not authority. Novices may welcome more directive scaffolding. A one-size-fits-all assistant is usually a one-size-annoys-several product.

Third, it supports governance. Authority is directly linked to accountability. If an AI system can direct decisions, approve outputs, or shape final artifacts, the organisation needs explicit rules. Who owns the result? Who can override whom? What decisions must remain human? These are not philosophical decorations. They become audit, compliance, brand, and liability questions.

Fourth, it improves roadmap discipline. Rather than saying “we will make the AI more agentic,” a team can specify: “we will add shared initiative during early ideation, preserve human authority at final approval, and allow users to configure intervention frequency.” The second statement is a product requirement. The first is a conference sentence looking for a budget.

Where the framework stops

MOSAAIC is useful, but it should not be overclaimed.

The first boundary is empirical. The framework is derived from a literature review and demonstrated through qualitative case analysis. That is appropriate for conceptual design work, but it does not prove that a MOSAAIC-balanced system performs better, produces more creative outputs, or improves user satisfaction in deployment.

The second boundary is optimisation. The paper identifies dimensions and strategies, but it does not provide a formula for the optimal distribution of control. The authors explicitly leave context-appropriate optimisation for future work. In practice, the right balance will depend on task type, user expertise, risk level, creative phase, and organisational accountability.

The third boundary is coverage. The literature review focuses on ACM and ACC sources. The authors acknowledge that relevant work outside those libraries may be omitted. For a framework paper, that does not invalidate the contribution, but it does mean the map may expand as adjacent literatures are included.

The fourth boundary is case sampling. The six systems are chosen by convenience sampling from systems familiar to the authors and relevant across domains. They demonstrate applicability; they do not represent the full population of co-creative AI tools.

Finally, MOSAAIC does not fully resolve external factors such as trust, ownership, emotional attachment, legal responsibility, or cultural expectations of authorship. These factors influence control, and the authors flag them as future directions. Product teams should treat MOSAAIC as the control layer, not the entire human-AI relationship model.

The real product question is who gets to shape the work

The paper’s strongest contribution is not that it coins three terms. It is that it forces a cleaner question.

When humans and AI co-create, who gets to shape the work?

Not just who presses the button. Not just who writes the final prompt. Not just who approves the output. Who chooses the actions? Who starts the next move? Who directs the process when there is ambiguity?

Those questions are becoming more important as generative systems move from passive tools into collaborative work environments. The more capable the model becomes, the less useful it is to describe it only by output quality. Capability without control architecture creates brittle collaboration. The model may be impressive, but the user may still feel displaced, interrupted, or quietly managed.

MOSAAIC gives teams a practical way to avoid that failure. It says: separate autonomy, initiative, and authority. Decide where each belongs. Use adaptation where the system can responsibly infer the right balance. Use configuration where users need explicit control. Then test whether the resulting collaboration actually supports the work.

The future of co-creative AI will not be decided by autonomy alone. It will be decided by whether systems can share control without turning the human into either a button-pusher or a compliance officer for machine imagination.

A little less magic. A little more architecture. Progress, in other words.

Cognaptus: Automate the Present, Incubate the Future.

¹ Alayt Issak, Jeba Rezwana, and Casper Harteveld, “MOSAAIC: Managing Optimization towards Shared Autonomy, Authority, and Initiative in Co-creation,” arXiv:2505.11481, 2025, https://arxiv.org/abs/2505.11481.
