Encoding and Decoding Process LLM

The hidden bottleneck in LLM inference and the impact on MLPerf benchmarking

Recent frontier LLM inference benchmarks have highlighted a recurring pattern. GPU-based systems deliver outstanding ...

15d

MeMo's memory model lets teams upgrade their LLM without retraining it — and performance jumps 26%

Researchers' MeMo keeps AI memory separate from reasoning, so teams can upgrade their LLM without retraining it and see a 26% ...

Semiconductor Engineering

The Edge LLM Offload Story

Modern edge devices demand heterogeneous AI architectures that can mix and match subsystems to accelerate different aspects ...

EDN

MLPerf and the rise of latency-aware LLM benchmarking

Here is a sneak peek at the evolution of the MLPerf benchmark and how generative AI forced a radical shift in AI hardware ...

Tech Xplore

Making LLMs faster and more efficient across multiple languages

Large language models (LLMs), which are the artificial intelligence (AI) systems behind modern chatbots, translation tools, ...

Context compression finally works in production: new research cuts LLM input 16x without the accuracy hit

LCLMs compress LLM context before decode — 8.8x faster at 16x compression, beating every KV cache method tested. Open-sourced by NYU and Columbia.

14d

Axial encoding unlocks up to eightfold faster 3D microscopy with less light

A research team from HKU Engineering has pioneered a fundamentally new imaging strategy known as AIMED (Arbitrary illumination microscopy with encoded depth), which utilizes a sub-sampling approach.

Tech Xplore

LLMs help robots understand vague instructions and focus on key details

Imagine working at a warehouse or office sometime in the near future, and you're asked to help a new trainee learn the basics ...
