LLM & NLP

Breakthroughs in LLMs have shifted NLP from task-specific methods to a generalized, data-driven approach, revolutionizing both research and applications. Modern LLMs are increasingly integrated with external tools, such as search engines, APIs, and symbolic reasoning systems, to tackle complex tasks that require specialized knowledge. However, their growing adoption has highlighted challenges in fairness, controllability, transparency, and explainability, qualities that are especially critical in domains such as HR, law, finance, and healthcare.

At Megagon Labs, we strive to harness the potential of LLMs while addressing these limitations. Our research focuses on three key areas: 

  1. Understanding LLM Behavior and Limitations: Investigating how LLMs perform and the challenges they face in real-world production use cases.
  2. Advancing LLM Capabilities: Developing novel systems, hybrid neuro-symbolic approaches, and domain-specific innovations to enhance LLM performance.
  3. Robust Evaluation Methods: Creating effective methods to assess LLMs on complex, real-world tasks, ensuring their reliability and effectiveness in diverse applications.

By advancing these three areas, we aim to improve the quality, consistency, fairness, and truthfulness of AI solutions tailored for HR and related domains, driving impactful progress in both research and practical applications. Our work encompasses fundamental research, applied projects, and open-source contributions, ensuring that our innovations make a meaningful impact both within and beyond the lab.

Highlighted Projects

We benchmark retrieval-augmented LLMs to understand when retrieval enhances performance and when it hinders it. These insights inform the development of reliable QA systems built on retrieval-augmented language models.
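
As a rough illustration of this kind of comparison, the minimal sketch below (an assumption-laden example, not Megagon Labs' actual benchmark; `retrieve` and `ask_llm` are hypothetical callables) contrasts the same model's answers with and without retrieved context.

```python
from typing import Callable, Dict, Sequence

def exact_match(prediction: str, answer: str) -> bool:
    """Very strict string match; real benchmarks use more forgiving metrics."""
    return prediction.strip().lower() == answer.strip().lower()

def closed_vs_open_book_accuracy(
    questions: Sequence[str],
    answers: Sequence[str],
    retrieve: Callable[[str], str],  # question -> retrieved passage (hypothetical)
    ask_llm: Callable[[str], str],   # prompt -> model answer (hypothetical)
) -> Dict[str, float]:
    closed, open_ = 0, 0
    for question, answer in zip(questions, answers):
        # Closed-book: the model answers from parametric knowledge alone.
        if exact_match(ask_llm(f"Question: {question}\nAnswer:"), answer):
            closed += 1
        # Open-book: the same model answers with a retrieved passage prepended.
        context = retrieve(question)
        prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
        if exact_match(ask_llm(prompt), answer):
            open_ += 1
    n = len(questions)
    return {"closed_book_accuracy": closed / n, "open_book_accuracy": open_ / n}
```

Comparing the two accuracies per question category is one simple way to surface cases where retrieval helps versus cases where a noisy passage leads the model astray.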

We investigate LLMs' sensitivity in multiple-choice question answering, a task commonly used to study the reasoning and fact-retrieval capabilities of LLMs.
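
One concrete sensitivity that can be probed is robustness to the ordering of answer options. The sketch below is illustrative only (not the study's protocol): `ask_llm` is a hypothetical prompt-to-letter callable, and at most four options labeled A to D are assumed.

```python
import itertools
from typing import Callable, Sequence

def option_order_consistency(
    question: str,
    options: Sequence[str],          # assumes at most four options
    ask_llm: Callable[[str], str],   # prompt -> option letter, e.g. "B" (hypothetical)
) -> float:
    """Fraction of option orderings on which the model picks its most common choice."""
    picks = []
    for perm in itertools.permutations(options):
        labels = "ABCD"[: len(perm)]
        prompt = question + "\n" + "\n".join(f"{l}. {o}" for l, o in zip(labels, perm))
        letter = ask_llm(prompt).strip()[:1].upper()
        if letter in labels:
            # Map the chosen letter back to the option text under this ordering.
            picks.append(perm[labels.index(letter)])
    if not picks:
        return 0.0
    return max(picks.count(option) for option in set(picks)) / len(picks)
```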

AmbigNLG

Addressing ambiguity in natural language generation (NLG) instructions by identifying unclear specifications and refining them for better output quality.
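
A minimal sketch of this idea appears below. The prompts and function names are illustrative assumptions rather than AmbigNLG's actual implementation, and `ask_llm` stands in for any instruction-following model: first ask which aspects the instruction leaves unspecified, then fold explicit constraints back into the instruction before generating.

```python
from typing import Callable

def refine_instruction(instruction: str, ask_llm: Callable[[str], str]) -> str:
    """Detect underspecified aspects of an NLG instruction and make them explicit."""
    # Step 1: ask which aspects (length, tone, format, audience, ...) are unspecified.
    ambiguities = ask_llm(
        "List the aspects this writing instruction leaves unspecified, "
        "such as length, tone, format, or audience:\n" + instruction
    )
    # Step 2: turn those gaps into explicit constraints appended to the instruction.
    constraints = ask_llm(
        "Rewrite the following unspecified aspects as explicit, concrete constraints:\n"
        + ambiguities
    )
    return instruction + "\nAdditional constraints:\n" + constraints

def generate_with_refinement(instruction: str, ask_llm: Callable[[str], str]) -> str:
    # Generate from the clarified instruction rather than the ambiguous original.
    return ask_llm(refine_instruction(instruction, ask_llm))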

Less Is More

“Extract then Evaluate” is an innovative approach to evaluating long-document summaries with LLMs that not only significantly reduces evaluation costs but also aligns more closely with human evaluations.
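
The sketch below gives a rough sense of such a two-stage pipeline. It is a simplified illustration under stated assumptions, not the paper's exact setup: `salience` and `ask_llm` are hypothetical callables, and the 1-to-5 rating scale is assumed. Salient sentences are extracted first, and the LLM then judges the summary against that much shorter extract.

```python
from typing import Callable, Sequence

def extract_then_evaluate(
    document_sentences: Sequence[str],
    summary: str,
    salience: Callable[[str], float],  # sentence -> importance score (hypothetical)
    ask_llm: Callable[[str], str],     # prompt -> rating as text (hypothetical)
    budget: int = 20,                  # sentences kept, capping prompt length and cost
) -> float:
    # Extract: keep only the top-scoring sentences so the judging prompt stays short.
    extract = sorted(document_sentences, key=salience, reverse=True)[:budget]
    # Evaluate: ask the LLM to rate the summary against the extract, not the full document.
    prompt = (
        "Source extract:\n" + "\n".join(extract)
        + f"\n\nSummary:\n{summary}\n\n"
        + "Rate the summary's faithfulness to the extract from 1 to 5. "
        + "Reply with the number only:"
    )
    # Assumes the reply begins with the numeric rating.
    return float(ask_llm(prompt).strip().split()[0])
```

Because the judge sees only the extract, prompt length, and therefore cost, stays roughly constant regardless of how long the source document is.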

November 20, 2025 · 6 Min Read
Explore the key takeaways from COLM 2025, including breakthroughs in Reasoning & RL, Multimodal LLMs, and Retrieval & Embedding, as highlighted by Megagon Labs research scientists and engineers.

November 7, 2025 · 6 Min Read
“Mixed Signals” exposes hidden biases in VLMs, with major implications for healthcare, RAG systems, and AI safety.

September 17, 2025 · 5 Min Read
We share Megagon Labs’ key takeaways from ACL 2025, highlighting the trends, debates, and breakthroughs shaping the future of NLP, agentic AI, and trustworthy evaluation.