TL;DR for operators

Small teams should stop asking whether they need “AI automation” and start asking what kind of human agency each task deserves. Full automation is attractive when the work is repetitive, low-value, easy to verify, and not politically radioactive. Semi-automation is better when the task depends on context, judgment, interpersonal trust, creative control, or reversible-but-annoying decisions that still consume human attention.

The useful operating model is not “AI versus people.” That is a fine slogan if one’s ambition is to sell conference lanyards. The better model is task portfolio design: some tasks should be delegated, some should be co-produced, and some should remain human-led with AI quietly fetching the coffee, metaphorically or otherwise.

A recent large-scale audit of AI agents and occupational tasks gives this argument a sturdier spine [1]. Across 844 tasks and 104 occupations, workers wanted AI agent automation for many low-value and repetitive tasks, but not evenly, not universally, and not simply because the technology could do the work. The same study found that workers often preferred more human agency than AI experts judged technically necessary. Translation for managers: feasibility is not adoption.

For a small team, the practical takeaway is simple: Taylor Swift the stack. Curate collaborators. Do not invite every tool onto the album. Give AI the parts that improve tempo without stealing authorship, accountability, or taste.

The real question is not “can AI do it?” but “how much agency should remain?”

A four-person team does not suffer from a lack of software. It suffers from a lack of slack.

The calendar is full. The project tracker is half-fiction. Someone promised a client revision by Thursday, but the promise lives in a Slack thread beneath a GIF and three unrelated invoices. At this point, a tool that turns messy notes into task dependencies is useful. A tool that silently emails the client, reschedules the contractor, rewrites the proposal, and updates the budget because it “understood the goal” is either impressive or a lawsuit rehearsing in private.

That is the distinction this article cares about.

Full automation means the system completes the task with little or no human involvement. Semi-automation means the system reduces the burden while preserving human direction, review, or final judgment. The difference is not philosophical. It changes risk, cost, adoption, and workflow design.

The mistake many small teams make is treating automation as a maturity ladder:

Manual work → assisted work → full automation → operational enlightenment.

Nice staircase. Shame about the missing floorboards.

The better view is that different tasks deserve different levels of agency. Some should be automated because they are repetitive and verifiable. Some should be assisted because the cost of being wrong is social, reputational, or strategic. Some should not be touched yet because the tool can technically act but the team cannot meaningfully supervise it.

The paper audits tasks, not job titles, because job titles are lazy maps

The central research behind this revision, “Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce,” does something useful: it avoids asking whether “marketing managers” or “accountants” will be automated [1]. Job titles are bundles. Bundles hide the decision that matters.

A marketing manager might compile competitor lists, write positioning copy, negotiate with vendors, analyse campaign data, and calm down a client who has discovered “brand refresh” means “work.” These tasks do not have the same automation profile. One is database work. One is judgment. One is diplomacy with slides.

The paper builds the WORKBank database using O*NET occupational tasks and combines two perspectives:

| Input | What it captures | Why it matters |
| --- | --- | --- |
| Ratings from 1,500 domain workers | Whether people doing the work actually want AI agents to automate or assist the task | Adoption is not just technical permission |
| Assessments from 52 AI experts | Whether current agent technology appears capable of performing the task | Desire without capability is a product backlog, not a deployment plan |
| 844 tasks across 104 occupations | Task-level variation inside real jobs | “This role is exposed to AI” is too crude to be operationally useful |

The paper also introduces a Human Agency Scale, from H1 to H5. H1 means the AI agent can handle the task entirely. H3 means human and AI form an equal partnership. H5 means the task cannot be completed without continuous human involvement.

That scale is more useful for small teams than the usual “automate more” dashboard. It lets a founder, operations lead, or agency manager ask: is this task a candidate for delegation, collaboration, or human-led execution with AI support?
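One way to make that question concrete is to record an agency level for each task. The sketch below is a minimal illustration, not the paper's method. H2 and H4 are left generic because this article only describes H1, H3, and H5. The grouping into "delegate", "collaborate", and "human-led" bands is an assumption made here, and the task names are hypothetical.

```python
from enum import IntEnum


class HumanAgency(IntEnum):
    """Human Agency Scale levels H1-H5 (higher means more human involvement)."""
    H1 = 1  # AI agent handles the task entirely
    H2 = 2  # between H1 and H3 (not spelled out in this article)
    H3 = 3  # equal human-AI partnership
    H4 = 4  # between H3 and H5 (not spelled out in this article)
    H5 = 5  # continuous human involvement required


def posture(level: HumanAgency) -> str:
    """Map an agency level to a coarse operating posture.

    Assumption: the H1-H2 / H3 / H4-H5 banding is illustrative only.
    """
    if level <= HumanAgency.H2:
        return "delegate"
    if level == HumanAgency.H3:
        return "collaborate"
    return "human-led with AI support"


# Hypothetical task portfolio for a small agency.
portfolio = {
    "send meeting reminders": HumanAgency.H1,
    "draft client proposal": HumanAgency.H3,
    "final creative direction": HumanAgency.H5,
}

for task, level in portfolio.items():
    print(f"{task}: {level.name} -> {posture(level)}")
```

Storing the level as an ordered enum keeps the comparison explicit, so a later review can move a task up or down one step without renaming categories.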

This is where the “semi or full” decision becomes less theatrical.

The evidence supports selective automation, not a robot coup with nicer branding

The paper finds that workers expressed positive attitudes toward AI agent automation for 46.1% of assessed tasks. That number is large enough to matter and small enough to ruin simplistic automation sermons.

The reason matters more than the headline. Among pro-automation responses, the most common motivation was freeing time for higher-value work, selected in 69.38% of cases. Repetitiveness and quality improvement were also common reasons. Workers were not saying, “Please remove me from my profession.” They were saying, with admirable clarity, “Please stop making me wrestle with administrative lint.”

The top automation-desired tasks are telling: scheduling appointments, maintaining files, recording payroll adjustments, converting files, maintaining customer databases, tracking quality-control data, and preparing routine charts or tables. These are not sacred acts of human civilisation. If your competitive advantage is manually scheduling tax appointments, the market has already sent a memo; perhaps check the spam folder.

The bottom of the list is equally instructive. Workers showed low desire for automating tasks such as writing editorial content, creating design concepts, reviewing layouts, handling unusual information requests, and making judgment-heavy assessments. These are not all “creative” in the romantic sense, but they involve ownership, interpretation, taste, responsibility, or context.

A small team should read this as permission to be selective. Automate the drag. Assist the judgment. Protect the work that carries trust.

The desire-capability map is the small-team procurement filter

The most useful part of the paper is its desire-capability landscape. It divides tasks into four zones:

| Zone | Worker desire | AI capability | Small-team decision |
| --- | --- | --- | --- |
| Automation Green Light | High | High | Automate aggressively, but keep audit trails |
| Automation Red Light | Low | High | Do not deploy just because the demo works |
| R&D Opportunity | High | Low | Prototype carefully or wait for better tools |
| Low Priority | Low | Low | Ignore, unless boredom is the strategy |

This table should be taped above every small-team AI procurement discussion. Preferably next to a reminder that “agentic” is not a synonym for “worth paying for.”
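The four zones can be expressed as a small classifier. This is a hedged sketch: it assumes both ratings sit on a 1-to-5 scale and that "high" means a score at or above a midpoint threshold of 3.0. The function name, the threshold, and the example scores are illustrative assumptions, not values taken from the paper.

```python
def classify_zone(worker_desire: float, ai_capability: float,
                  threshold: float = 3.0) -> str:
    """Place a task in one of the four desire-capability zones.

    Assumption: scores are on a 1-5 scale and 'high' means >= threshold.
    """
    high_desire = worker_desire >= threshold
    high_capability = ai_capability >= threshold
    if high_desire and high_capability:
        return "Automation Green Light"
    if high_capability:
        return "Automation Red Light"
    if high_desire:
        return "R&D Opportunity"
    return "Low Priority"


# Hypothetical scores gathered from the team (desire) and a tool trial (capability).
tasks = {
    "schedule appointments": (4.6, 4.3),
    "write editorial content": (1.8, 3.9),
    "reconcile payroll exceptions": (4.2, 2.1),
    "rename legacy folders": (1.5, 2.0),
}

for name, (desire, capability) in tasks.items():
    print(f"{name}: {classify_zone(desire, capability)}")
```

A small team would supply its own scores, for example from a short internal survey for desire and a time-boxed trial for capability.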

The red-light zone is especially important. These are tasks where AI may be technically capable, but workers do not want full automation. That resistance is not automatically irrational. It may reflect risk, professional identity, client trust, accountability, or the simple fact that the person doing the task understands hidden constraints the vendor demo skipped.

The paper also finds a market mismatch: 41.0% of mapped Y Combinator company activity fell into the Low Priority and Automation Red Light zones, while many promising Green Light and R&D Opportunity tasks remained under-addressed. For small teams, that is a warning. Startup density is not the same as operational relevance. Sometimes the market produces ten tools for the person holding a venture cheque and none for the person reconciling the payroll exception.

Semi-automation is a design choice, not a moral compromise

Semi-automation is often described as a temporary stage before full automation. That framing is lazy. Some tasks are not waiting to become fully automated. They are collaborative by nature.

The Human Agency Scale results make this visible. In the WORKBank study, H3—equal human-agent partnership—was the dominant worker-desired level in 47 of 104 occupations. Workers and experts matched on the desired or feasible agency level for only 26.9% of tasks, while workers preferred higher human agency than experts deemed technically necessary in 47.5% of tasks.

That gap is the entire management problem in miniature.

An AI expert may look at a task and say, “The system can do this.” A worker may respond, “Yes, and if it does, I still get blamed when it misses the client nuance.” Both can be correct. Capability is about performance under a defined task. Agency preference is about control, accountability, and the right to intervene before the machine confidently ruins Tuesday.

Collaborative-agent research supports this point. In Collaborative Gym, agents working with humans outperformed fully autonomous counterparts in several evaluated tasks, with reported win rates of 86% in travel planning, 74% in tabular analysis, and 66% in related-work writing under real-user evaluation [2]. That does not prove collaboration is always superior. It does show that human involvement is not merely a safety blanket for nervous managers. In some workflows, it is part of the performance mechanism.

Semi-automation works when the AI does one or more of the following:

  • drafts a first version but leaves authorship with the human;
  • extracts options but lets the human choose;
  • monitors routine events and escalates exceptions;
  • turns messy inputs into structured artefacts;
  • proposes actions with clear evidence and reversible approval.

This is not “half automation.” It is automation with a deliberate control surface.
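The last pattern in the list, proposing actions with evidence and reversible approval, can be sketched as a small approval gate. Everything here is hypothetical: the `Proposal` and `ApprovalGate` names, the log format, and the deadline example are assumptions for illustration, not a reference to any particular product.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Proposal:
    """An action the AI suggests but does not execute on its own."""
    description: str
    evidence: str
    apply: Callable[[], None]
    undo: Callable[[], None]


@dataclass
class ApprovalGate:
    """Applies proposals only after human approval and logs every decision."""
    audit_log: List[str] = field(default_factory=list)
    applied: List[Proposal] = field(default_factory=list)

    def review(self, proposal: Proposal, approved: bool, reviewer: str) -> bool:
        decision = "approved" if approved else "rejected"
        self.audit_log.append(
            f"{reviewer} {decision}: {proposal.description} ({proposal.evidence})"
        )
        if approved:
            proposal.apply()
            self.applied.append(proposal)
        return approved

    def revert_last(self, reviewer: str) -> None:
        """Undo the most recently applied proposal, if there is one."""
        if not self.applied:
            return
        proposal = self.applied.pop()
        proposal.undo()
        self.audit_log.append(f"{reviewer} reverted: {proposal.description}")


# Hypothetical example: the AI proposes moving a deadline.
deadlines = {"client revision": "Thursday"}

proposal = Proposal(
    description="move client revision to Friday",
    evidence="contractor reported a one-day delay in the project thread",
    apply=lambda: deadlines.update({"client revision": "Friday"}),
    undo=lambda: deadlines.update({"client revision": "Thursday"}),
)

gate = ApprovalGate()
gate.review(proposal, approved=True, reviewer="ops lead")
gate.revert_last(reviewer="ops lead")
```

The control surface is the `review` call: nothing changes until a named person says yes, and every decision, including a reversal, leaves a line in the log.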

Where full automation actually earns its keep

Full automation is not the villain. It is simply miscast too often.

For small teams, full automation makes sense when five conditions hold:

| Condition | Practical test |
| --- | --- |
| The task is repetitive | It happens often enough that manual handling creates real drag |
| The output is easy to verify | A human can quickly spot whether it worked |
| The downside is limited | Failure is annoying, not existential |
| The process is stable | Inputs and rules do not change every three days |
| The accountability path is clear | Someone knows what the system did and can reverse it |

Invoice reminders, file backups, appointment scheduling, database updates, routine report generation, lead follow-up sequences, internal status summaries, and data-format conversion are natural candidates. The AI does not need to “understand the business” in any grand metaphysical sense. It needs to perform a bounded function reliably.
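The five conditions work as an all-or-nothing checklist. A minimal sketch follows, assuming each condition is answered yes or no by someone who knows the task. The field names and the two example profiles are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class TaskProfile:
    """Yes/no answers to the five full-automation conditions."""
    repetitive: bool
    easy_to_verify: bool
    limited_downside: bool
    stable_process: bool
    clear_accountability: bool


def failed_conditions(profile: TaskProfile) -> List[str]:
    """Return the names of the conditions the task does not meet."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


def full_automation_candidate(profile: TaskProfile) -> bool:
    """A task qualifies only when all five conditions hold."""
    return not failed_conditions(profile)


# Hypothetical profiles filled in by the team.
invoice_reminders = TaskProfile(True, True, True, True, True)
pricing_email = TaskProfile(
    repetitive=True,
    easy_to_verify=False,
    limited_downside=False,
    stable_process=True,
    clear_accountability=True,
)

print(full_automation_candidate(invoice_reminders))
print(failed_conditions(pricing_email))
```

Returning the failed conditions, rather than a bare yes or no, tells the team what would need to change before the task could be handed over fully.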

This is where small teams can win quickly. They do not need a sovereign AI employee with a profile photo and suspicious confidence. They need fewer dropped tasks, cleaner handoffs, and less clerical sediment in the workday.

The evidence from real AI use also points in this direction. A study of millions of Claude conversations found that AI usage was concentrated in software development and writing tasks, but also that observed usage split between augmentation and automation: 57% of usage suggested augmentation, while 43% suggested automation [3]. Even in actual usage, not vendor prophecy, the pattern is mixed.

Mixed is not disappointing. Mixed is reality having standards.

The operational playbook: Taylor Swift the stack

The title’s “Taylor Swift” line is not a demand that small teams run their workflows like a stadium tour. Though, honestly, the average stadium tour has better operations discipline than many SaaS implementations.

The point is curation. Pick collaborators that fit the work. Do not invite every shiny tool into the stack because the demo had gradients and a confident narrator.

A small team can use this simple diagnostic:

| Task question | Answer | Automation posture |
| --- | --- | --- |
| Is this repetitive, low-risk, and easy to verify? | Yes | Full automation candidate |
| Does this require judgment, taste, negotiation, or domain context? | Yes | Semi-automation candidate |
| Would workers resist automation even if the tool can technically do it? | Yes | Red-light review before deployment |
| Is worker desire high but tools are weak? | Yes | Prototype, do not operationally depend on it |
| Is desire low and capability low? | Yes | Move on with your life |

For a small agency, this could mean fully automating meeting reminders, file organisation, recurring invoice nudges, and draft status updates. It could mean semi-automating client proposals: AI prepares the outline, extracts prior project data, checks consistency, and suggests pricing language, but a human owns the promise. It could mean refusing to automate final creative direction, sensitive client communication, or hiring decisions, even if a vendor insists the model has “enterprise-grade empathy.” A phrase that should make everyone reach gently for the exit.

What Cognaptus infers for business use, and what the paper does not prove

Here is the clean separation.

| Claim | Evidence | Cognaptus business interpretation | Boundary |
| --- | --- | --- | --- |
| Many workers want AI automation for low-value repetitive tasks | 46.1% of tasks received positive automation desire in WORKBank | Start with administrative drag, not strategic theatre | The database covers selected computer-compatible tasks and 104 occupations |
| Worker desire and technical capability do not perfectly align | The paper’s four-zone landscape shows green, red, R&D, and low-priority zones | Procurement should evaluate desire and capability together | Expert ratings reflect early-2025 capability |
| Human agency remains important in many tasks | H3 was dominant in 47 of 104 occupations | Semi-automation is often the target state, not a stepping stone | Agency preferences may change as tools and incentives change |
| Full autonomy increases governance stakes | Prior work argues risks rise as users cede more control to autonomous agents [4] | Use full automation only where failure modes are bounded and auditable | Risk varies by domain, task, data access, and reversibility |
| Worker attitudes are nuanced, not purely anti-automation | A multinational worker-perspective survey found many workers see potential benefits from automation, depending on job design and incentives [5] | Adoption improves when automation visibly improves work, not just margins | Survey evidence does not replace local workflow testing |

The business implication is not “buy fewer AI tools.” It is more annoying and therefore more useful: buy tools with a theory of agency.

Before adopting an AI workflow product, ask the vendor to state the intended agency level. If the answer is vague, assume the product team has confused autonomy with ambition. Ask what the human sees, approves, edits, reverses, and learns. Ask what happens when the model is uncertain. Ask where logs live. Ask how the tool behaves when the input is incomplete, contradictory, or politically delicate.

A small team cannot afford invisible complexity. Full automation can remove labour, but it can also create monitoring work, exception handling, integration debt, and a new ritual in which everyone asks, “Why did the agent do that?” This is not productivity. This is mystery with a subscription plan.

Boundaries: this is a baseline, not a prophecy

The WORKBank evidence is valuable because it grounds the automation discussion at the task level and includes both worker preference and expert capability. But it is not a universal operating manual dropped from the heavens, which is good, because heaven’s API documentation would probably also be incomplete.

The limits matter.

First, the paper reflects the state of AI agents and worker expectations in early 2025. Model capability, tool integration, reliability, cost, and user familiarity are moving targets.

Second, the tasks come from existing occupational definitions. AI may create new tasks, dissolve old ones, or change what “good work” means inside a role. A database of today’s tasks cannot fully describe tomorrow’s work design.

Third, worker preference is not fixed. People may resist automation because they distrust poor tools, fear surveillance, or have not seen a useful implementation. They may also embrace automation too quickly when incentives reward short-term speed over long-term quality. Humans, regrettably, are not perfectly calibrated instruments.

Fourth, small-team context differs from large-enterprise context. A small team has faster feedback loops but less redundancy. A bad automation in a large firm creates a ticket. A bad automation in a small firm may create a client call, a cash-flow problem, and a long evening.

So the framework should be repeated periodically. Review the task portfolio every quarter. Move tasks between full automation, semi-automation, and human-led categories as tools mature and failure data accumulates.
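A quarterly review can be as simple as checking failure data against a tolerance. The sketch below is one possible rule, stated under assumptions: the 5% failure tolerance, the 20-run minimum, and the task records are all hypothetical. It only ever steps a task down one level, so any move toward more autonomy stays a human decision.

```python
from dataclasses import dataclass

# Ordered from least to most autonomy.
CATEGORIES = ["human-led", "semi-automation", "full automation"]


@dataclass
class TaskRecord:
    """One quarter of outcomes for a task in the portfolio."""
    name: str
    category: str
    runs: int
    failures: int


def recommend(record: TaskRecord, max_failure_rate: float = 0.05,
              min_runs: int = 20) -> str:
    """Recommend next quarter's category from this quarter's failure data.

    Assumption: a task steps down one level when its failure rate exceeds
    the tolerance. The rule never promotes a task automatically.
    """
    if record.runs < min_runs:
        return record.category  # too little data to justify a move
    failure_rate = record.failures / record.runs
    if failure_rate > max_failure_rate:
        level = CATEGORIES.index(record.category)
        return CATEGORIES[max(level - 1, 0)]
    return record.category


# Hypothetical quarter of data.
quarter = [
    TaskRecord("invoice reminders", "full automation", runs=120, failures=1),
    TaskRecord("lead follow-up sequences", "full automation", runs=40, failures=6),
    TaskRecord("proposal drafts", "semi-automation", runs=5, failures=3),
]

for record in quarter:
    print(f"{record.name}: {record.category} -> {recommend(record)}")
```

The thresholds are placeholders. A team with tighter client commitments might set a lower tolerance, and one with few runs per quarter might review over a longer window.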

Conclusion: curate autonomy before autonomy curates you

Small teams do not need to choose between heroic manual work and full robotic delegation. That binary is mostly useful to vendors and people who enjoy drawing arrows on strategy slides.

The better answer is selective autonomy.

Fully automate work that is repetitive, verifiable, low-risk, and unloved. Semi-automate work that benefits from machine speed but still needs human context, taste, responsibility, or trust. Keep humans firmly in charge where the task carries authorship, accountability, negotiation, or judgment that cannot be safely reduced to a prompt.

That is what it means to Taylor Swift the tech stack: choose the collaborators that amplify the performance, not the ones that drown out the artist.

Start with one task. Classify it honestly. Decide the right level of agency. Then automate only as much as the work deserves.

Cognaptus: Automate the Present, Incubate the Future.


  1. Yijia Shao, Humishka Zope, Yucheng Jiang, Jiaxin Pei, David Nguyen, Erik Brynjolfsson, and Diyi Yang, “Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce,” arXiv:2506.06576, 2025. https://arxiv.org/abs/2506.06576

  2. Yijia Shao, Vinay Samuel, Yucheng Jiang, John Yang, and Diyi Yang, “Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration,” arXiv:2412.15701, 2024. https://arxiv.org/abs/2412.15701

  3. Kunal Handa, Alex Tamkin, Miles McCain, Saffron Huang, Esin Durmus, Sarah Heck, Jared Mueller, Jerry Hong, Stuart Ritchie, Tim Belonax, Kevin K. Troy, Dario Amodei, Jared Kaplan, Jack Clark, and Deep Ganguli, “Which Economic Tasks are Performed with AI? Evidence from Millions of Claude Conversations,” arXiv:2503.04761, 2025. https://arxiv.org/abs/2503.04761

  4. Margaret Mitchell, Avijit Ghosh, Alexandra Sasha Luccioni, and Giada Pistilli, “Fully Autonomous AI Agents Should Not be Developed,” arXiv:2502.02649, 2025. https://arxiv.org/abs/2502.02649

  5. Ben Armstrong, Valerie K. Chen, Alex Cuellar, Alexandra Forsey-Smerek, and Julie A. Shah, “Automation from the Worker’s Perspective,” arXiv:2409.20387, 2024. https://arxiv.org/abs/2409.20387