LLM | Cognaptus

Serverless Bulls and Bears: How One Developer Built a Real-Time Stock Analyst with Zero Infrastructure

Most real-time financial systems rely on deep stacks of infrastructure, from custom APIs to cloud VMs and high-frequency data ingestion pipelines. But what if a single developer could deploy a daily-updating, AI-powered stock analysis engine without a single server? That’s exactly what Taniv Ashraf set out to do — and accomplished — in his recent case study on a fully serverless architecture using Google Gemini, GitHub Actions, and static web hosting. The result is an elegantly simple yet conceptually powerful demonstration of how qualitative LLM analysis and automation tools can replace entire categories of financial tooling — if wielded strategically. ...

Tables Turned: Why LLM-Based Table Agents Are the Next Big Leap in Business AI

When most people think of AI today, they picture text generation, image synthesis, or copilots answering emails. But beneath the surface of digital transformation lies an often-overlooked backbone of enterprise work: tables. Spreadsheets, databases, and semi-structured tabular documents are still where critical operations happen — from finance to health records to logistics. A recent survey paper, Toward Real-World Table Agents, pushes us to rethink how AI interacts with tabular data. Instead of treating tables as static inputs, the authors argue that tables are evolving into active data canvases — and LLM-based Table Agents are poised to become their intelligent orchestrators. ...

The Retrieval-Reasoning Tango: Charting the Rise of Agentic RAG

In the AI race to make large language models both factual and reasoned, two camps have emerged: one focused on retrieval-augmented generation (RAG) to fight hallucination, the other on long-chain reasoning to mimic logic. But neither wins alone. This week’s survey by Li et al. (2025), Towards Agentic RAG with Deep Reasoning, delivers the most comprehensive synthesis yet of the field’s convergence point: synergized RAG–Reasoning. It’s no longer a question of whether retrieval helps generation or reasoning helps retrieval—but how tightly the two can co-evolve, often under the coordination of autonomous agents. ...

Plug Me In: Why LLMs with Tools Beat LLMs with Size

The latest research out of Heriot-Watt University doesn’t just challenge the notion that bigger is better — it quietly dismantles it. In their newly released Athena framework, Nripesh Niketan and Hadj Batatia demonstrate how integrating external APIs into LLM pipelines can outperform even the likes of GPT-4o and LLaMA-Large on real tasks like math and science. And they didn’t just beat them — they lapped them. Why GPT-4 Still Fumbles Math Ask GPT-4o to solve a college-level math problem, and it might hallucinate steps or miss basic arithmetic. The reason? LLMs, even at trillion-parameter scale, are not calculators. They’re probabilistic machines trained on patterns, not deterministic reasoners. ...

The Rise of the Self-Evolving Scientist: STELLA and the Future of Biomedical AI

When was the last time a machine truly surprised you—not with a quirky ChatGPT poem or a clever image generation, but with scientific reasoning that evolved on its own? Meet STELLA, an AI agent for biomedical research that doesn’t just solve problems—it gets better at solving them while solving them. The Static Curse of Smart Agents Modern AI agents have shown promise in navigating the labyrinth of biomedical research, where each inquiry might require cross-referencing papers, running custom bioinformatics analyses, or interrogating molecular databases. But the vast majority of these agents suffer from a fatal limitation: they rely on static, pre-installed toolkits and hard-coded logic trees. Like a PhD student who memorized a textbook but never updated it, they can’t adapt to new tasks or new knowledge without human intervention. ...

LLMs Meet Logic: SymbolicThought Turns AI Relationship Guesswork into Graphs

If AI is going to understand people, it first has to understand relationships. But when it comes to parsing character connections from narrative texts — whether news articles, biographies, or novels — even state-of-the-art language models stumble. They hallucinate links, miss cross-sentence cues, and often forget what they’ve just read. Enter SymbolicThought, a hybrid framework that gives LLMs a logic-boosted sidekick: symbolic reasoning. Developed by researchers at King’s College London and CUHK, the system doesn’t just extract character relationships from text; it builds editable graphs, detects logical contradictions, and guides users through verification with a smart, interactive interface. ...

The Meek Shall Compute It

The Meek Shall Compute It For the past five years, discussions about AI progress have centered on a simple formula: more data + more compute = better models. This scaling paradigm has produced marvels like GPT-4 and Gemini—but also entrenched a new aristocracy of compute-rich players. Is this inequality here to stay? According to a provocative new paper from MIT CSAIL, the answer may be: not for long. The authors argue that due to the laws of diminishing returns, the performance gap between state-of-the-art (SOTA) models and smaller, cheaper “meek” models will shrink over time. If true, this reframes the future of AI as one not of centralized supremacy, but of widespread, affordable competence. ...

Echo Chamber in a Prompt: How Survey Bias Creeps into LLMs

Large Language Models (LLMs) are increasingly deployed as synthetic survey respondents in social science and policy research. But a new paper by Rupprecht, Ahnert, and Strohmaier raises a sobering question: are these AI “participants” reliable, or are we just recreating human bias in silicon form? By subjecting nine LLMs—including Gemini, Llama-3 variants, Phi-3.5, and Qwen—to over 167,000 simulated interviews from the World Values Survey, the authors expose a striking vulnerability: even state-of-the-art LLMs consistently fall for classic survey biases—especially recency bias. ...

Beyond the Pareto Frontier: Pricing LLM Mistakes in the Real World

For all the hype about model accuracy, inference cost, and latency, most organizations are still squinting at scatter plots to decide which large language model (LLM) to use. But what if we could cut through the tradeoff fog with a single number that tells you exactly which model is worth deploying—for your use case, under your constraints? That’s the bold proposal in a recent paper by Zellinger and Thomson from Caltech: treat LLM selection as an economic decision. Rather than searching for models on the accuracy-cost “Pareto frontier,” they suggest an approach grounded in price-tagging errors, delays, and abstentions in dollar terms. Think of it as a model selection framework that answers: How much is a mistake worth to you? ...

The Phantom Menace in Your Knowledge Base

Retrieval-Augmented Generation (RAG) may seem like a fortress of AI reliability—until you realize the breach happens at the front door, not in the model. Large Language Models (LLMs) have become the backbone of enterprise AI assistants. Yet as more systems integrate RAG pipelines to improve their factuality and domain alignment, a gaping blindspot has emerged—the document ingestion layer. A new paper titled “The Hidden Threat in Plain Text” by Castagnaro et al. warns that attackers don’t need to jailbreak your model or infiltrate your vector store. Instead, they just need to hand you a poisoned DOCX, PDF, or HTML file. And odds are, your RAG system will ingest it—invisibly. ...