Introduction: Seeing Opportunity in the Unseen

In developing economies across Southeast Asia and Latin America, many businesses, especially SMEs, still rely heavily on paper-based workflows. According to the World Bank, over 65 million SMEs operate across these regions, and in countries like the Philippines, Vietnam, Colombia, and Peru, up to 70% of invoicing remains manual and paper-driven [1][2]. Despite the growth of digital tools, invoice scanning, expense tracking, and compliance reporting remain highly fragmented and inefficient.

The opportunity: deploying AI-powered OCR systems that are fine-tuned to local language, format, and compliance requirements. Global tools fall short here—not because of lack of sophistication, but because of lack of localization. We believe in building for the Eyeconomy—a new frontier where machine vision sees and understands localized business documents.

Market Gap

Common Pain Points

  • Existing OCR tools struggle with rotated, skewed, or handwritten receipts
  • Invoices often mix multiple languages or dialects, e.g., Spanish + indigenous terms
  • General-purpose LLMs like GPT-4o can interpret extracted text and accept image input, but they are not dedicated OCR engines; for production extraction they are typically paired with an OCR front end or pre-processed text
  • Local tax formats (e.g., BIR in the Philippines) and business rules (e.g., document signatures, approval codes) are rarely recognized correctly by off-the-shelf OCR

Why Global Tools Fail Locally

Tools like Google Document AI and Microsoft Form Recognizer (now Azure AI Document Intelligence) provide powerful OCR APIs, but they lack:

  • Country-specific tax form mapping
  • Support for regional handwriting variations
  • Localized pricing structures for SMEs
  • The ability to adapt quickly to changing local invoice formats or regulations

Market Potential (with References)

  • SME Volume: Over 65 million SMEs in SEA and LATAM combined [1]
  • Paper Dominance: Up to 70% of invoices still handled manually [2][3]
  • Market Value: Estimated $3–5B annual opportunity for digital invoicing and compliance tools, including OCR and workflow automation [4]

The Eyeconomy Solution

Our approach focuses on fine-tuning vision-language models, localized for a specific country first (e.g., the Philippines), then scaling outward regionally.

Phase 1: Country-Level Specialization

Dataset Collection

  • Target: 5,000–20,000 real receipts/invoices
  • Include:
    • Scanned or photographed invoices (including rotated/angled samples)
    • Handwritten, thermal, and printed forms
    • Bilingual examples (e.g., English + Tagalog)
  • Address legal and privacy concerns via NDAs or synthetic dataset bootstrapping

Model Selection & Tuning

| Model | Role & Capability |
|---|---|
| Donut | End-to-end OCR-free image-to-JSON generation |
| LayoutLMv3 | Combines layout, image, and text embeddings |
| TrOCR | High-accuracy OCR model for text recognition |

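We fine-tune Donut and/or LayoutLMv3 on augmented data that simulates poor capture conditions (e.g., blur, tilt), and align outputs with a target JSON schema. Because LayoutLMv3 consumes OCR text and bounding boxes in addition to the image, it is paired with an OCR model such as TrOCR, whereas Donut works directly from pixels.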
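To make "a target JSON schema" concrete, here is a minimal sketch of what such a schema and its parser might look like. The field names (`vendor_name`, `vendor_tin`, `invoice_date`, `total`, `vat`, `items`) and the `parse_model_output` helper are illustrative assumptions, not the actual schema used in the project.

```python
import json
from dataclasses import dataclass, field
from decimal import Decimal, InvalidOperation
from typing import List


@dataclass
class LineItem:
    description: str
    quantity: Decimal
    unit_price: Decimal


@dataclass
class Invoice:
    # Hypothetical target fields; a real schema would follow local tax forms.
    vendor_name: str
    vendor_tin: str
    invoice_date: str  # kept as the raw string; date parsing is out of scope
    total: Decimal
    vat: Decimal
    items: List[LineItem] = field(default_factory=list)


def parse_model_output(raw: str) -> Invoice:
    """Parse a model's JSON string into an Invoice; raise ValueError on mismatch."""
    try:
        data = json.loads(raw)
        if not isinstance(data, dict):
            raise TypeError("top-level JSON must be an object")
        items = [
            LineItem(
                description=str(it["description"]),
                # Go through str() so floats do not leak binary rounding noise.
                quantity=Decimal(str(it["quantity"])),
                unit_price=Decimal(str(it["unit_price"])),
            )
            for it in data.get("items", [])
        ]
        return Invoice(
            vendor_name=str(data["vendor_name"]),
            vendor_tin=str(data["vendor_tin"]),
            invoice_date=str(data["invoice_date"]),
            total=Decimal(str(data["total"])),
            vat=Decimal(str(data["vat"])),
            items=items,
        )
    except (json.JSONDecodeError, KeyError, TypeError, InvalidOperation) as exc:
        raise ValueError(f"output does not match the target schema: {exc}") from exc
```

Rejecting malformed output at this boundary means every later step can assume typed, complete fields. Amounts are held as `Decimal` so that later total checks do not suffer floating-point drift.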

Knowledge Base (KB) Enhancement

  • TIN validation
  • Employee signature verification
  • Known vendor list matching (e.g., top suppliers)
  • Company metadata for invoice cross-checking
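Two of the checks above, TIN validation and known-vendor matching, can be sketched with the standard library alone. The TIN layout assumed here (9 digits plus an optional 3- to 5-digit branch code, with optional hyphens) should be confirmed against current BIR rules, and the vendor names are made-up placeholders.

```python
import re
from difflib import get_close_matches
from typing import Optional

# Assumed TIN layout: 9 digits plus an optional 3- to 5-digit branch code,
# with optional hyphens between groups. Verify against current BIR rules.
TIN_PATTERN = re.compile(r"\d{3}-?\d{3}-?\d{3}(-?\d{3,5})?")

# Hypothetical knowledge-base entries, for illustration only.
KNOWN_VENDORS = ["Acme Office Supplies", "Manila Paper Trading", "Luzon Hardware Corp"]


def is_plausible_tin(tin: str) -> bool:
    """Check only the format of a TIN string, not whether it is registered."""
    return TIN_PATTERN.fullmatch(tin.strip()) is not None


def match_vendor(name: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the closest known vendor name, or None if nothing is similar enough."""
    lowered = {v.lower(): v for v in KNOWN_VENDORS}
    hits = get_close_matches(name.strip().lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None
```

Fuzzy matching is useful because OCR output often drops or swaps a character: `match_vendor("Acme Office Suplies")` still resolves to `"Acme Office Supplies"`. The `cutoff` of 0.8 is an arbitrary starting point that would need tuning on real data.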

From Image to Structured Accounting

📸 Capture (via phone or scanner)
↓
🧹 Preprocess (de-skew, de-noise, auto-rotate)
↓
🤖 OCR Model (Donut or LayoutLMv3)
↓
🧾 Structured Extraction (TIN, Items, VAT, etc.)
↓
🧠 LLM Validator (cross-check using KB)
↓
💼 Integration (accounting system, tax filing API)
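The flow above can be read as a chain of functions that each take a document record and return an enriched one. The sketch below shows that shape only. `fake_extract` is a stand-in for the fine-tuned model, and a simple rule-based totals check stands in for the LLM validator. The 12% VAT rate is the standard Philippine rate and is treated here as a configurable assumption.

```python
from decimal import Decimal
from typing import Callable, List

# Assumed standard VAT rate for the Philippines; keep it configurable.
VAT_RATE = Decimal("0.12")
TOLERANCE = Decimal("0.01")  # allow one-cent rounding differences

Stage = Callable[[dict], dict]


def run_pipeline(doc: dict, stages: List[Stage]) -> dict:
    """Pass a document record through each stage in order."""
    for stage in stages:
        doc = stage(doc)
    return doc


def fake_extract(doc: dict) -> dict:
    """Stand-in for the fine-tuned model; returns hard-coded fields."""
    return {**doc, "fields": {"net": "100.00", "vat": "12.00", "total": "112.00"}}


def validate_totals(doc: dict) -> dict:
    """Rule-based cross-check: VAT matches the rate, and net + VAT equals total."""
    f = {k: Decimal(v) for k, v in doc["fields"].items()}
    issues = []
    if abs(f["net"] * VAT_RATE - f["vat"]) > TOLERANCE:
        issues.append("vat_mismatch")
    if abs(f["net"] + f["vat"] - f["total"]) > TOLERANCE:
        issues.append("total_mismatch")
    return {**doc, "issues": issues}


result = run_pipeline({"source": "receipt.jpg"}, [fake_extract, validate_totals])
```

Keeping every stage to the same `dict -> dict` signature lets preprocessing, extraction, and validation steps be swapped or reordered per country without touching the runner.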

Overcoming Practical Implementation Challenges

Adoption Barriers

  • Connectivity: Offline capture and sync support needed in rural areas
  • Training: Intuitive UI + guided onboarding
  • Cost Sensitivity: Entry-level free tier or pay-per-invoice model

Pricing Strategy

| Tier | Description | Price |
|---|---|---|
| Starter | 500 pages/month | $19/mo |
| Growth | 5,000 pages/month | $99/mo |
| Enterprise | Custom + on-prem option | Negotiated |

Pilot programs with local tax authorities or mid-sized ERPs can incentivize early adoption and help validate performance.

Scaling and Sustainability

Localization Strategy

  1. Pilot in the Philippines with 1–2 tax-compliant accounting platforms
  2. Expand to Spanish-speaking LATAM markets (e.g., Colombia, Peru)
  3. Use a modular architecture to plug in local rules and forms easily
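One way to read "modular architecture" in step 3 is a per-country rule registry: each market contributes its own validation rules without changes to the core. The sketch below is an assumption about how that could look. The decorator, the country codes, and the two example rules are hypothetical.

```python
from typing import Callable, Dict, List

# A rule takes extracted fields and returns a list of issue codes.
Rule = Callable[[dict], List[str]]

_RULES: Dict[str, List[Rule]] = {}


def register_rule(country: str) -> Callable[[Rule], Rule]:
    """Decorator that attaches a validation rule to a country code."""
    def decorator(rule: Rule) -> Rule:
        _RULES.setdefault(country, []).append(rule)
        return rule
    return decorator


@register_rule("PH")
def ph_requires_tin(fields: dict) -> List[str]:
    # Hypothetical rule: Philippine invoices must carry a vendor TIN.
    return [] if fields.get("vendor_tin") else ["missing_tin"]


@register_rule("CO")
def co_requires_nit(fields: dict) -> List[str]:
    # Hypothetical rule: Colombian invoices must carry a vendor NIT.
    return [] if fields.get("vendor_nit") else ["missing_nit"]


def validate(country: str, fields: dict) -> List[str]:
    """Run every rule registered for the country; unknown countries run none."""
    return [issue for rule in _RULES.get(country, []) for issue in rule(fields)]
```

With this layout, adding Peru would mean adding a new module of `@register_rule("PE")` functions. The shared pipeline and the `validate` entry point stay unchanged.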

Updating the Model

  • Maintain “human-in-the-loop” review dashboards
  • Schedule monthly retraining with new formats or tax rule updates
  • Use crowdsourced data tagging + local partnerships
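A human-in-the-loop dashboard needs some rule for deciding what a reviewer sees and what flows back into retraining. The sketch below assumes the model exposes a per-field confidence score and uses an arbitrary threshold of 0.9. Both are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Correction:
    field_name: str
    predicted: str
    corrected: str


def fields_needing_review(confidences: Dict[str, float], threshold: float = 0.9) -> List[str]:
    """Return field names whose model confidence falls below the threshold."""
    return sorted(name for name, score in confidences.items() if score < threshold)


def to_training_examples(corrections: List[Correction]) -> List[Correction]:
    """Keep only corrections where the reviewer actually changed the value."""
    return [c for c in corrections if c.predicted != c.corrected]
```

For example, `fields_needing_review({"tin": 0.95, "total": 0.62, "date": 0.88})` returns `["date", "total"]`, so the reviewer checks only those two fields. The corrections that survive `to_training_examples` become candidates for the next monthly retraining run.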

Competitors and Differentiators

Competitors

  • Google Document AI
  • Microsoft Form Recognizer
  • Rossum, Veryfi, Mindee

Our Edge

  • Local data, local teams, local compliance
  • Language-aware and format-flexible models
  • Seamless integration into local tax ecosystems (e.g., BIR, DIAN)

Business Vision: Building the Eyeconomy

“The future of compliance is not just paperless—it’s context-aware.”

Our vision is not just to scan better, but to understand better—tailoring OCR and validation models to each local economy, starting with one, then scaling smartly.

With continued investment in local partnerships, legal safeguards, and user-first design, Eyeconomy can become the foundation of next-gen accounting and compliance in emerging markets.


Eyeconomy is a vision-driven initiative by Cognaptus Insights, focused on AI applications in emerging economies.


  1. World Bank SME Finance Forum, 2023

  2. Philippine Department of Trade and Industry (DTI), 2022 Report on Digital Transformation

  3. Inter-American Development Bank Report on Digital Financial Inclusion, 2023

  4. IDC Forecast: Digital Document Management in LATAM, 2024