Edge AI

Talk Less, Coordinate More: MARL Meets the Real World

A warehouse robot fleet does not fail because one robot forgot how to move. It fails because three robots each saw a slightly different world, one message arrived late, another was dropped, and the coordination policy confidently optimised against yesterday’s reality. Very modern. Very autonomous. Very expensive. That is the uncomfortable premise behind Robust and Efficient Communication in Multi-Agent Reinforcement Learning, a survey of how multi-agent reinforcement learning, or MARL, behaves when the communication layer is no longer treated as magic plumbing.[1] The paper is not presenting a new benchmark champion. Its value is quieter and more useful: it organises a scattered body of work around the communication failures that actually matter in deployed multi-agent systems. ...

Decoding Intelligence: When Spikes Meet Hyperdimensions

Edge AI has a habit of turning every efficiency problem into a hardware problem. Buy a better chip. Quantise the model. Move the workload closer to the sensor. Reduce the precision until the accuracy team starts twitching. This paper takes a quieter route. It asks whether part of the energy problem comes not from the sensor, the chip, or even the whole network, but from the way the network is asked to speak. ...

Edge of Reason: Orchestrating LLMs Without a Conductor

TL;DR for operators: Symphony is not just another “let several agents chat until something sensible happens” framework. The paper’s real contribution is more specific: it proposes a decentralised orchestration pattern where agents advertise capabilities, subtasks are routed to the best-matching available worker, and final answers are selected through weighted voting across multiple reasoning paths.[1] ...
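The routing and voting steps described in this teaser can be sketched in a few lines. This is a minimal illustration, not Symphony's actual implementation: the function names `route` and `weighted_vote`, the set-overlap match score, and the per-path weights are all assumptions made for the example.

```python
from collections import defaultdict


def route(subtask_skills, workers):
    """Pick the available worker whose advertised skills best overlap the subtask.

    workers maps a worker name to (advertised_skills, is_available).
    The overlap count is a stand-in score (assumption), not the paper's matcher.
    """
    available = {name: skills for name, (skills, ok) in workers.items() if ok}
    if not available:
        raise ValueError("no available worker")
    return max(available, key=lambda name: len(subtask_skills & available[name]))


def weighted_vote(candidates):
    """Return the answer with the largest total weight.

    candidates is a list of (answer, weight) pairs, one per reasoning path.
    """
    if not candidates:
        raise ValueError("no candidate answers")
    totals = defaultdict(float)
    for answer, weight in candidates:
        totals[answer] += weight
    return max(totals, key=totals.get)


# Hypothetical workers: name -> (advertised skills, currently available?)
workers = {
    "a": ({"math"}, True),
    "b": ({"math", "code"}, False),  # best match on paper, but busy
    "c": ({"code", "math"}, True),
}
chosen = route({"code", "math"}, workers)

# Three reasoning paths, two of which agree on "42".
winner = weighted_vote([("42", 0.5), ("41", 0.6), ("42", 0.3)])
```

Note that the single most confident path ("41" at 0.6) loses to two weaker paths that agree, which is the point of voting across paths rather than trusting one.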

Shattering the Spectrum: How PRISM Revives Signal Processing in Time-Series AI

TL;DR for operators: PRISM is a useful reminder that the cheapest model is not always the dumbest model. It classifies multivariate time series by first treating each input channel separately, applying symmetric convolutional filters at several temporal resolutions, then mixing those resolution-specific features into a compact representation.[1] The business message is straightforward: for sensor-heavy classification tasks, especially wearables, activity recognition, sleep staging, ECG-like biomedical signals, and industrial monitoring, PRISM suggests that a well-chosen signal-processing prior can cut model size and inference cost without turning accuracy into a charity case. ...
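To make the per-channel, multi-resolution idea concrete, here is a toy sketch in plain Python. It is not PRISM's architecture: using dilation to stand in for "temporal resolution", summarising each response by its mean absolute value, and concatenating the results in place of the paper's mixing step are all assumptions, as are the function names.

```python
def symmetric_kernel(half):
    """Mirror a half-kernel around its last tap: [a, b, c] -> [a, b, c, b, a]."""
    return half + half[-2::-1]


def dilated_conv(signal, kernel, dilation):
    """Valid 1-D convolution with taps spaced `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(k * signal[t + i * dilation] for i, k in enumerate(kernel))
        for t in range(len(signal) - span)
    ]


def multires_features(channels, half_kernel, dilations=(1, 2, 4)):
    """One feature per (channel, dilation): mean absolute filter response.

    Each channel is filtered on its own; concatenating the summaries is a
    simplification of the mixing step (assumption).
    """
    kernel = symmetric_kernel(half_kernel)
    features = []
    for signal in channels:
        for d in dilations:
            out = dilated_conv(signal, kernel, d)
            features.append(sum(abs(v) for v in out) / len(out) if out else 0.0)
    return features


# Two hypothetical channels: a flat line and a fast alternating signal.
channels = [[1.0] * 16, [0, 1] * 8]
features = multires_features(channels, half_kernel=[-1, 2])  # kernel [-1, 2, -1]
```

With this kernel the flat channel produces no response at any resolution, while the alternating channel responds only at the finest one, so the six features separate the two signals with a single three-tap filter.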

From Tadpole to Titan: How DEVFT Grows LLMs Like a Brain

TL;DR for operators: Federated LLM fine-tuning sounds attractive until someone asks the rude operational question: who is actually paying for the compute, memory, and communication on the devices? The paper behind DevFT proposes a useful answer: do not fine-tune the full model end-to-end from the first round. Start with a compact submodel, train it with federated learning, transfer the learned LoRA parameters forward, then expand the model in stages until it reaches the full target size.[1] The authors call this Developmental Federated Tuning, and yes, the developmental psychology metaphor is a little enthusiastic. Fortunately, the mechanism is more interesting than the metaphor. ...

Graft and Go: How Knowledge Grafting Shrinks AI Without Shrinking Its Brain

TL;DR for operators: A field robot does not care that your neural network is elegant. It cares whether the model fits on the device, runs without draining the battery, and still recognises the weed before the sprayer makes an expensive little mistake. The paper introduces knowledge grafting, a mechanism for taking selected intermediate features from a larger donor model and attaching them to a smaller deployable model, called the rootstock.[1] In the reported DeepWeeds experiment, the authors reduce a VGG16-derived model from 64.39 MB to 7.38 MB, cutting parameters from 16,880,201 to 1,934,665, while reporting 90.45% test accuracy on unseen images. ...
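The size and parameter figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes 32-bit float weights and sizes measured in units of 2^20 bytes; the teaser states neither, so treat both as assumptions that happen to reproduce the reported numbers.

```python
ORIGINAL_PARAMS = 16_880_201  # VGG16-derived model, as reported
GRAFTED_PARAMS = 1_934_665    # grafted model, as reported

BYTES_PER_PARAM = 4  # assumption: float32 weights
MIB = 2 ** 20        # assumption: "MB" in the report means 2^20 bytes


def size_mib(params):
    """Storage needed for the weights alone, in MiB."""
    return params * BYTES_PER_PARAM / MIB


original_mib = size_mib(ORIGINAL_PARAMS)
grafted_mib = size_mib(GRAFTED_PARAMS)

# How many times fewer parameters the grafted model has.
compression = ORIGINAL_PARAMS / GRAFTED_PARAMS
# Share of parameters removed, in percent.
reduction_pct = 100 * (1 - GRAFTED_PARAMS / ORIGINAL_PARAMS)

print(f"original: {original_mib:.2f} MiB, grafted: {grafted_mib:.2f} MiB")
print(f"compression: {compression:.1f}x, reduction: {reduction_pct:.1f}%")
```

Under those assumptions the parameter counts account for the reported 64.39 and 7.38 figures, which works out to roughly 8.7 times fewer parameters, or a reduction of about 88.5%.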

Divide, Route, and Conquer: DriftMoE's Smart Take on Concept Drift

TL;DR for operators: Production data does not politely wait for quarterly retraining. Sensor readings shift, fraud patterns mutate, market microstructure changes, network traffic acquires new habits, and customer behaviour performs its usual interpretive dance. This is concept drift: the model is still running, but the world it learned from has moved on. ...

Talk is Flight: How RALLY Bridges Language and Learning in UAV Swarms

TL;DR for operators: RALLY is not a chatbot with propellers. It is a hybrid control framework for UAV swarms where the LLM supplies structured semantic reasoning and the reinforcement-learning layer decides how agents should divide responsibility.[1] The practical insight is the separation of labour. A drone swarm does not only need to know where to fly; it needs to agree who should lead, who should coordinate, who should follow, and when those roles should change. RALLY handles that by combining two-stage LLM consensus with RMIX, a role-value mixing network trained to assign Commander, Coordinator, and Executor roles under partial observability and limited communication. ...

Unsafe at Any Bit: Patching the Safety Gaps in Quantized LLMs

TL;DR for operators: Quantizing an LLM is not a harmless cost-saving step. It changes the model, and the paper analysed here shows that those changes can weaken safety even when familiar utility scores still look respectable. That is the uncomfortable part: the dashboard can say “performance preserved” while the model has become more willing to comply with harmful requests. Very efficient. Very modern. Very easy to miss. ...

The Outlier Is a Lie: Quantization Breakthroughs with OSP

TL;DR for operators: If your deployment plan depends on squeezing a language model into cheap inference hardware, this paper is worth reading because it changes the timing of the quantization problem. Most quantization work asks: “How do we repair a model after training so it survives 4-bit inference?” Outlier-Safe Pre-Training asks a more irritating question: “Why did we train a quantization-hostile model in the first place?”[1] ...