Choosing Topics Without Counting: When LDA Meets Black-Box Intelligence
Opening: Why this matters now

Topic modeling has matured into infrastructure. It quietly powers search, document clustering, policy analysis, and exploratory research pipelines across industries. Yet one deceptively simple question still wastes disproportionate time and compute: how many topics should my LDA model have?

Most practitioners answer this the same way they did a decade ago: grid search, intuition, or vague heuristics (“try 50, see if it looks okay”). The paper behind this article takes a colder view. Selecting the number of topics, T, is not an art problem; it is a budget-constrained black-box optimization problem. Once framed that way, some uncomfortable truths emerge. ...
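To make the framing concrete, here is a minimal sketch of what "budget-constrained black-box optimization over T" can look like in code. It is an illustration, not the paper's method: random search is used only because it is the simplest black-box baseline, and `budgeted_search`, `toy_objective`, the search range, and the budget are all hypothetical names and values chosen for this example. In practice the objective would train an LDA model with T topics and return a validation score, which is what makes each evaluation expensive.

```python
import random
from typing import Callable, Dict, Tuple


def budgeted_search(
    objective: Callable[[int], float],
    t_min: int,
    t_max: int,
    budget: int,
    seed: int = 0,
) -> Tuple[int, Dict[int, float]]:
    """Random search over integer T using at most `budget` objective calls.

    The objective is treated as a black box: we only observe its value
    at the points we choose to evaluate. Lower is better.
    """
    rng = random.Random(seed)
    n_candidates = t_max - t_min + 1
    # Sample distinct values of T so no evaluation is wasted on a repeat.
    candidates = rng.sample(range(t_min, t_max + 1), k=min(budget, n_candidates))
    scores = {t: objective(t) for t in candidates}
    best_t = min(scores, key=scores.get)
    return best_t, scores


def toy_objective(t: int) -> float:
    # Hypothetical stand-in for "fit LDA with t topics, return a validation loss".
    # A real objective would be noisy and far more expensive to evaluate.
    return (t - 40) ** 2 / 100.0 + 1.0


if __name__ == "__main__":
    best_t, scores = budgeted_search(toy_objective, t_min=5, t_max=200, budget=10)
    print(f"evaluated {len(scores)} values of T, best T = {best_t}")
```

The point of the sketch is the interface rather than the search strategy: the optimizer sees only `objective(t)` and a hard cap on evaluations, so any smarter method has to justify itself by finding a better T within the same budget.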