
Cite Before You Write: Agentic RAG That Picks Graph vs. Vector on the Fly

Paper: Open-Source Agentic Hybrid RAG Framework for Scientific Literature Review (Nagori et al., 2025)

One-line: The authors wrap a hybrid RAG pipeline (Neo4j GraphRAG + FAISS VectorRAG) inside an agent (Llama-3.3-70B) that decides per query which retriever to use, then instruction-tunes generation (Mistral-7B) and quantifies uncertainty via bootstrapped evaluation. It’s open-source and genuinely useful.

Why this paper matters (beyond research circles)

Business pain: Knowledge workers drown in PDFs. Static “semantic search + summarize” tools miss citation structure and provenance; worse, they hallucinate under pressure.

What’s new: Dynamic query routing between graph queries (Cypher over Neo4j) and semantic + keyword retrieval (FAISS + BM25 + rerank). Then DPO nudges the generator to prefer grounded answers.

So what: For regulated sectors (healthcare, finance, legal), this is a pattern you can implement today for auditable reviews with traceable sources and tunable confidence bands.

The blueprint (concrete, reproducible)

Ingestion: Pull bibliometrics (DOI, title, abstract, year, authors, PDF URL, source) from PubMed, arXiv, and Google Scholar. Deduplicate and filter by cosine similarity of TF-IDF keywords (keep top-quartile relevance). ...
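The top-quartile TF-IDF relevance filter from the ingestion step can be sketched in pure Python. This is a minimal illustration, not the authors’ implementation; the function names and the whitespace tokenizer are assumptions:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute sparse TF-IDF vectors for a list of tokenized documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse (dict) vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def top_quartile(query, abstracts):
    """Keep abstracts whose TF-IDF cosine similarity to the query
    falls in the top quartile, per the paper's relevance filter."""
    docs = [query.lower().split()] + [a.lower().split() for a in abstracts]
    vecs = tfidf_vectors(docs)
    scores = [cosine(vecs[0], v) for v in vecs[1:]]
    cutoff = sorted(scores)[int(0.75 * len(scores))]  # 75th-percentile score
    return [a for a, s in zip(abstracts, scores) if s >= cutoff]
```

A real pipeline would tokenize more carefully (stemming, stopwords) and dedupe on DOI first; the quartile cutoff is the part the paper specifies.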

August 11, 2025 · 6 min · Zelina

The Sentiment Edge: How FinDPO Trains LLMs to Think Like Traders

Financial markets don’t reward the loudest opinions. They reward the most timely, well-calibrated ones. FinDPO, a new framework by researchers from Imperial College London, takes this lesson seriously. It proposes a bold shift in how we train language models to read market sentiment. Rather than relying on traditional supervised fine-tuning (SFT), FinDPO uses Direct Preference Optimization (DPO) to align a large language model with how a human trader might weigh sentiment signals in context. And the results are not just academic — they translate into real money. ...
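The DPO objective FinDPO builds on can be written down in a few lines. This is a generic sketch of the standard DPO loss for one preference pair, not FinDPO’s training code; the argument names are assumptions:

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are summed log-probabilities of the chosen / rejected
    completions under the trained policy (pi_*) and a frozen
    reference model (ref_*). Minimizing the loss pushes the policy
    to prefer the chosen completion more strongly than the
    reference model does, without a separate reward model.
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))
```

When the policy and reference agree, the margin is zero and the loss sits at log 2; as the policy pulls ahead of the reference on the preferred completion, the loss falls toward zero.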

July 27, 2025 · 3 min · Zelina

Mirror, Mirror in the Model: How MLLMs Learn from Their Own Mistakes

When multimodal large language models (MLLMs) like Gemini or Janus are asked to generate an image and then assess whether that image matches a prompt, you’d expect agreement. But a new study shows this harmony is often missing: the model’s own understanding branch disagrees with what its generation branch creates. This phenomenon—called self-contradiction—isn’t just an embarrassing quirk. As it turns out, it may be the most valuable feedback signal MLLMs have. ...

July 23, 2025 · 4 min · Zelina

Guardians of the Chain: How Smart-LLaMA-DPO Turns Code into Clarity

When the DAO hack siphoned millions from Ethereum in 2016, the blockchain world learned a hard lesson: code is law, and bad law can be catastrophic. Fast forward to today, and smart contract security still walks a tightrope between complexity and automation. Enter Smart-LLaMA-DPO, a reinforced large language model designed not just to find vulnerabilities in smart contracts—but to explain them, clearly and reliably.

🧠 Beyond Detection: Why Explanations Matter

Most smart contract vulnerability detectors work like smoke alarms—loud when something’s wrong, but not exactly helpful in telling you why. The core innovation of Smart-LLaMA-DPO is that it speaks the language of developers. It explains vulnerabilities with clarity and technical nuance, whether it’s a reentrancy flaw or an oracle manipulation scheme. And that clarity doesn’t come from magic—it comes from Direct Preference Optimization (DPO), a training method where the model learns not just from correct labels, but from expert-ranked explanations. ...

June 24, 2025 · 3 min · Zelina

Overqualified, Underprepared: Why FinLLMs Matter More Than Reasoning

General-purpose language models can solve math puzzles and explain Kant, but struggle to identify a ticker or classify earnings tone. What the financial world needs isn’t more reasoning—it’s better reading. Over the past year, large language models (LLMs) have surged into every corner of applied AI, and finance is no exception. But while the promise of “reasoning engines” captivates headlines, the pain point for financial tasks is much simpler—and more niche. ...

April 20, 2025 · 4 min