


Jimmy Lin
Person information
- affiliation: University of Waterloo, David R. Cheriton School of Computer Science
- affiliation: Twitter Inc., San Francisco, USA
- affiliation: University of Maryland, College Park, Institute for Advanced Computer Studies (UMIACS)
- affiliation: Massachusetts Institute of Technology (MIT), Artificial Intelligence Laboratory
2020 – today
- 2026
[c405]Jimmy Lin, Ronak Pradeep, Gilad Mishne, Pankaj Gupta:
Assembling Your Personal AI Council in Yupp to Provide Multiple Perspectives. WSDM 2026: 1349-1350
[i221]Zhichao Xu, Shengyao Zhuang, Crystina Zhang, Xueguang Ma, Yijun Tian, Maitrey Mehta, Jimmy Lin, Vivek Srikumar:
LACONIC: Dense-Level Effectiveness for Scalable Sparse Retrieval via a Two-Phase Training Curriculum. CoRR abs/2601.01684 (2026)
[i220]Sahel Sharifymoghaddam, Jimmy Lin:
Rerank Before You Reason: Analyzing Reranking Tradeoffs through Effective Token Cost in Deep Search Agents. CoRR abs/2601.14224 (2026)
[i219]Manveer Singh Tamber, Hosna Oyarhoseini, Jimmy Lin:
Unifying Adversarial Robustness and Training Across Text Scoring Models. CoRR abs/2602.00857 (2026)
- 2025
[j67]Anas Dorbani, Sunny Yasser, Jimmy Lin, Amine Mhedhbi:
Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB. Proc. VLDB Endow. 18(12): 5415-5418 (2025)
[c404]Jimmy Lin:
Operational Advice for Dense and Sparse Retrievers: HNSW, Flat, or Inverted Indexes? ACL (6) 2025: 865-872
[c403]Jessica Ojo, Odunayo Ogundepo, Akintunde Oladipo, Kelechi Ogueji, Jimmy Lin, Pontus Stenetorp, David Ifeoluwa Adelani:
AfroBench: How Good are Large Language Models on African Languages? ACL (Findings) 2025: 19048-19095
[c402]Xueguang Ma, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Wenhu Chen, Jimmy Lin:
VISA: Retrieval Augmented Generation with Visual Source Attribution. ACL (1) 2025: 30154-30169
[c401]Xueguang Ma, Xi Victoria Lin, Barlas Oguz, Jimmy Lin, Wen-tau Yih, Xilun Chen:
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers. ACL (1) 2025: 30170-30186
[c400]Daniel Gwon, Nour Jedidi, Jimmy Lin:
Study on LLMs for Promptagator-Style Dense Retriever Training. CIKM 2025: 4748-4752
[c399]Ronak Pradeep, Nandan Thakur, Sahel Sharifymoghaddam, Eric Zhang, Ryan Nguyen, Daniel Campos, Nick Craswell, Jimmy Lin:
Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track. ECIR (1) 2025: 132-148
[c398]Manveer Singh Tamber, Ronak Pradeep, Jimmy Lin:
LiT and Lean: Distilling Listwise Rerankers Into Encoder-Decoder Models. ECIR (3) 2025: 156-164
[c397]Andrew Liu, Edward Xu, Crystina Zhang, Jimmy Lin:
The Impact of Incidental Multilingual Text on Cross-Lingual Transfer in Monolingual Retrieval. ECIR (3) 2025: 165-173
[c396]Crystina Zhang, Sebastian Hofstätter, Patrick Lewis, Raphael Tang, Jimmy Lin:
Rank-Without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models. ECIR (2) 2025: 233-247
[c395]Tommaso Teofili, Jimmy Lin:
Patience in Proximity: A Simple Early Termination Strategy for HNSW Graph Traversal in Approximate k-Nearest Neighbor Search. ECIR (3) 2025: 401-407
[c394]Yijun Ge, Zijian Chen, Jimmy Lin:
QuackIR: Retrieval in DuckDB and Other Relational Database Management Systems. EMNLP (Industry Track) 2025: 492-500
[c393]Manveer Singh Tamber, Forrest Sheng Bao, Chenyu Xu, Ge Luo, Suleman Kazi, Minseok Bae, Miaoran Li, Ofer Mendelevitch, Renyi Qu, Jimmy Lin:
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards. EMNLP (Industry Track) 2025: 799-811
[c392]Nandan Thakur, Crystina Zhang, Xueguang Ma, Jimmy Lin:
Hard Negatives, Hard Lessons: Revisiting Training Data Quality for Robust Information Retrieval with LLMs. EMNLP (Findings) 2025: 9064-9083
[c391]Sheng-Chieh Lin, Chankyu Lee, Mohammad Shoeybi, Jimmy Lin, Bryan Catanzaro, Wei Ping:
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs. ICLR 2025
[c390]Shivani Upadhyay, Ronak Pradeep, Nandan Thakur, Daniel Campos, Nick Craswell, Ian Soboroff, Jimmy Lin:
A Large-Scale Study of Relevance Assessments with Large Language Models Using UMBRELA. ICTIR 2025: 358-368
[c389]Manveer Singh Tamber, Jimmy Lin:
Illusions of Relevance: Arbitrary Content Injection Attacks Deceive Retrievers, Rerankers, and LLM Judges. IJCNLP-AACL (Findings) 2025: 1112-1127
[c388]Nadia Sheikh, Daniel Buades Marcos, Anne-Laure Jousse, Akintunde Oladipo, Olivier Rousseau, Jimmy Lin:
CURE: A dataset for Clinical Understanding & Retrieval Evaluation. KDD (2) 2025: 5270-5277
[c387]Zijian Chen, John-Michael Gamble, Jimmy Lin:
Zero-Shot ATC Coding with Large Language Models for Clinical Assessments. NAACL (Industry Track) 2025: 226-232
[c386]Nandan Thakur, Suleman Kazi, Ge Luo, Jimmy Lin, Amin Ahmad:
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems. NAACL (Long Papers) 2025: 274-298
[c385]Crystina Zhang, Jing Lu, Vinh Q. Tran, Tal Schuster, Donald Metzler, Jimmy Lin:
Tomato, Tomahto, Tomate: Do Multilingual Language Models Understand Based on Subword-Level Semantic Concepts? NAACL (Findings) 2025: 1821-1837
[c384]Manveer Singh Tamber, Jasper Xian, Jimmy Lin:
Can't Hide Behind the API: Stealing Black-Box Commercial Embedding Models. NAACL (Findings) 2025: 1958-1969
[c383]Sahel Sharifymoghaddam, Shivani Upadhyay, Wenhu Chen, Jimmy Lin:
UniRAG: Universal Retrieval Augmentation for Large Vision Language Models. NAACL (Findings) 2025: 2026-2039
[c382]Ronak Pradeep, Nandan Thakur, Shivani Upadhyay, Daniel Campos, Nick Craswell, Ian Soboroff, Hoa Trang Dang, Jimmy Lin:
The Great Nugget Recall: Automating Fact Extraction and RAG Evaluation with Large Language Models. SIGIR 2025: 180-190
[c381]Shengyao Zhuang, Ekaterina Khramtsova, Xueguang Ma, Bevan Koopman, Jimmy Lin, Guido Zuccon:
Document Screenshot Retrievers are Vulnerable to Pixel Poisoning Attacks. SIGIR 2025: 414-423
[c380]Nandan Thakur, Ronak Pradeep, Shivani Upadhyay, Daniel Campos, Nick Craswell, Ian Soboroff, Hoa Trang Dang, Jimmy Lin:
Assessing Support for the TREC 2024 RAG Track: A Large-Scale Comparative Study of LLM and Human Evaluations. SIGIR 2025: 2759-2763
[c379]Zijian Chen, Ronak Pradeep, Jimmy Lin:
Accelerating Listwise Reranking: Reproducing and Enhancing FIRST. SIGIR 2025: 3165-3172
[c378]Jimmy Lin, Arthur Haonan Chen, Carlos Lassance, Xueguang Ma, Ronak Pradeep, Tommaso Teofili, Jasper Xian, Jheng-Hong Yang, Brayden Zhong, Vincent Zhong:
Gosling Grows Up: Retrieval with Learned Dense and Sparse Representations Using Anserini. SIGIR 2025: 3223-3233
[c377]Sahel Sharifymoghaddam, Ronak Pradeep, Andre Slavescu, Ryan Nguyen, Andrew Xu, Zijian Chen, Yilin Zhang, Yidi Chen, Jasper Xian, Jimmy Lin:
RankLLM: A Python Package for Reranking with LLMs. SIGIR 2025: 3681-3690
[c376]Xueguang Ma, Luyu Gao, Shengyao Zhuang, Jiaqi Samantha Zhan, Jamie Callan, Jimmy Lin:
Tevatron 2.0: Unified Document Retrieval Toolkit across Scale, Language, and Modality. SIGIR 2025: 4061-4065
[i218]Shengyao Zhuang, Ekaterina Khramtsova, Xueguang Ma, Bevan Koopman, Jimmy Lin, Guido Zuccon:
Document Screenshot Retrievers are Vulnerable to Pixel Poisoning Attacks. CoRR abs/2501.16902 (2025)
[i217]Manveer Singh Tamber, Jimmy Lin:
Illusions of Relevance: Using Content Injection Attacks to Deceive Retrievers, Rerankers, and LLM Judges. CoRR abs/2501.18536 (2025)
[i216]Kenneth C. Enevoldsen, Isaac Chung, Imene Kerboua, Márton Kardos, Ashwin Mathur, David Stap, Jay Gala, Wissam Siblini, Dominik Krzeminski, Genta Indra Winata, Saba Sturua, Saiteja Utpala, Mathieu Ciancone, Marion Schaeffer, Gabriel Sequeira, Diganta Misra, Shreeya Dhakal, Jonathan Rystrøm, Roman Solomatin, Ömer Çagatan, Akash Kundu, Martin Bernstorff, Shitao Xiao, Akshita Sukhlecha, Bhavish Pahwa, Rafal Poswiata, Kranthi Kiran GV, Shawon Ashraf, Daniel Auras, Björn Plüster, Jan Philipp Harries, Loïc Magne, Isabelle Mohr, Mariya Hendriksen, Dawei Zhu, Hippolyte Gisserot-Boukhlef, Tom Aarsen, Jan Kostkan, Konrad Wojtasik, Taemin Lee, Marek Suppa, Crystina Zhang, Roberta Rocca, Mohammed Hamdy, Andrianos Michail, John Yang, Manuel Faysse, Aleksei Vatolin, Nandan Thakur, Manan Dey, Dipam Vasani, Pranjal A. Chitale, Simone Tedeschi, Nguyen Tai, Artem Snegirev, Michael Günther, Mengzhou Xia, Weijia Shi, Xing Han Lù, Jordan Clive, Gayatri Krishnakumar, Anna Maksimova, Silvan Wehrli, Maria Tikhonova, Henil Panchal, Aleksandr Abramov, Malte Ostendorff, Zheng Liu, Simon Clematide, Lester James V. Miranda, Alena Fenogenova, Guangyu Song, Ruqiya Bin Safi, Wen-Ding Li, Alessia Borghini, Federico Cassano, Hongjin Su, Jimmy Lin, Howard Yen, Lasse Hansen, Sara Hooker, Chenghao Xiao, Vaibhav Adlakha, Orion Weller, Siva Reddy, Niklas Muennighoff:
MMTEB: Massive Multilingual Text Embedding Benchmark. CoRR abs/2502.13595 (2025)
[i215]Zhichao Xu, Fengran Mo, Zhiqi Huang, Crystina Zhang, Puxuan Yu, Bei Wang, Jimmy Lin, Vivek Srikumar:
A Survey of Model Architectures in Information Retrieval. CoRR abs/2502.14822 (2025)
[i214]Xueguang Ma, Xi Victoria Lin, Barlas Oguz, Jimmy Lin, Wen-tau Yih, Xilun Chen:
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers. CoRR abs/2502.18460 (2025)
[i213]Manveer Singh Tamber, Suleman Kazi, Vivek Sourabh, Jimmy Lin:
Teaching Dense Retrieval Models to Specialize with Listwise Distillation and LLM Data Augmentation. CoRR abs/2502.19712 (2025)
[i212]Shengyao Zhuang, Xueguang Ma, Bevan Koopman, Jimmy Lin, Guido Zuccon:
Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning. CoRR abs/2503.06034 (2025)
[i211]Anas Dorbani, Sunny Yasser, Jimmy Lin, Amine Mhedhbi:
Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB. CoRR abs/2504.01157 (2025)
[i210]Nandan Thakur, Jimmy Lin, Sam Havens, Michael Carbin, Omar Khattab, Andrew Drozdov:
FreshStack: Building Realistic Benchmarks for Evaluating Retrieval on Technical Documents. CoRR abs/2504.13128 (2025)
[i209]Ronak Pradeep, Nandan Thakur, Shivani Upadhyay, Daniel Campos, Nick Craswell, Jimmy Lin:
The Great Nugget Recall: Automating Fact Extraction and RAG Evaluation with Large Language Models. CoRR abs/2504.15068 (2025)
[i208]Nandan Thakur, Ronak Pradeep, Shivani Upadhyay, Daniel Campos, Nick Craswell, Jimmy Lin:
Support Evaluation for the TREC 2024 RAG Track: Comparing Human versus LLM Judges. CoRR abs/2504.15205 (2025)
[i207]Sahel Sharifymoghaddam, Shivani Upadhyay, Nandan Thakur, Ronak Pradeep, Jimmy Lin:
Chatbot Arena Meets Nuggets: Towards Explanations and Diagnostics in the Evaluation of LLM Responses. CoRR abs/2504.20006 (2025)
[i206]Xueguang Ma, Luyu Gao, Shengyao Zhuang, Jiaqi Samantha Zhan, Jamie Callan, Jimmy Lin:
Tevatron 2.0: Unified Document Retrieval Toolkit across Scale, Language, and Modality. CoRR abs/2505.02466 (2025)
[i205]Manveer Singh Tamber, Forrest Sheng Bao, Chenyu Xu, Ge Luo, Suleman Kazi, Minseok Bae, Miaoran Li, Ofer Mendelevitch, Renyi Qu, Jimmy Lin:
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards. CoRR abs/2505.04847 (2025)
[i204]Nour Jedidi, Yung-Sung Chuang, James R. Glass, Jimmy Lin:
Don't "Overthink" Passage Reranking: Is Reasoning Truly Necessary? CoRR abs/2505.16886 (2025)
[i203]Nandan Thakur, Crystina Zhang, Xueguang Ma, Jimmy Lin:
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval. CoRR abs/2505.16967 (2025)
[i202]Manveer Singh Tamber, Suleman Kazi, Vivek Sourabh, Jimmy Lin:
Conventional Contrastive Learning Often Falls Short: Improving Dense Retrieval with Cross-Encoder Listwise Distillation and Synthetic Data. CoRR abs/2505.19274 (2025)
[i201]Sahel Sharifymoghaddam, Ronak Pradeep, Andre Slavescu, Ryan Nguyen, Andrew Xu, Zijian Chen, Yilin Zhang, Yidi Chen, Jasper Xian, Jimmy Lin:
RankLLM: A Python Package for Reranking with LLMs. CoRR abs/2505.19284 (2025)
[i200]Odunayo Ogundepo, Akintunde Oladipo, Kelechi Ogueji, Esther Adenuga, David Ifeoluwa Adelani, Jimmy Lin:
Improving Multilingual Math Reasoning for African Languages. CoRR abs/2505.19848 (2025)
[i199]Shivani Upadhyay, Messiah Ataey, Shariyar Murtuza, Yifan Nie, Jimmy Lin:
On the Comprehensibility of Multi-structured Financial Documents using LLMs and Pre-processing Tools. CoRR abs/2506.05182 (2025)
[i198]Jiaqi Samantha Zhan, Crystina Zhang, Shengyao Zhuang, Xueguang Ma, Jimmy Lin:
MAGMaR Shared Task System Description: Video Retrieval with OmniEmbed. CoRR abs/2506.09409 (2025)
[i197]Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Jimmy Lin:
Overview of the TREC 2021 deep learning track. CoRR abs/2507.08191 (2025)
[i196]Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Hossein A. Rahmani, Daniel Campos, Jimmy Lin, Ellen M. Voorhees, Ian Soboroff:
Overview of the TREC 2023 deep learning track. CoRR abs/2507.08890 (2025)
[i195]Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Jimmy Lin, Ellen M. Voorhees, Ian Soboroff:
Overview of the TREC 2022 deep learning track. CoRR abs/2507.10865 (2025)
[i194]Zijian Chen, Xueguang Ma, Shengyao Zhuang, Ping Nie, Kai Zou, Andrew Liu, Joshua Green, Kshama Patel, Ruoxi Meng, Mingyi Su, Sahel Sharifymoghaddam, Yanxi Li, Haoran Hong, Xinyu Shi, Xuye Liu, Nandan Thakur, Crystina Zhang, Luyu Gao, Wenhu Chen, Jimmy Lin:
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent. CoRR abs/2508.06600 (2025)
[i193]Yijun Ge, Sahel Sharifymoghaddam, Jimmy Lin:
Lighting the Way for BRIGHT: Reproducible Baselines with Anserini, Pyserini, and RankLLM. CoRR abs/2509.02558 (2025)
[i192]Daniel Gwon, Nour Jedidi, Jimmy Lin:
Study on LLMs for Promptagator-Style Dense Retriever Training. CoRR abs/2510.02241 (2025)
[i191]Nour Jedidi, Jimmy Lin:
Revisiting Feedback Models for HyDE. CoRR abs/2511.19349 (2025)
- 2024
[j66]Xinyu Zhang, Kelechi Ogueji, Xueguang Ma, Jimmy Lin:
Toward Best Practices for Training Multilingual Dense Retrieval Models. ACM Trans. Inf. Syst. 42(2): 39:1-39:33 (2024)
[c375]Jimmy Lin, Junkai Li, Jiasi Gao, Weizhi Ma, Yang Liu:
Jointly Modeling Spatio-Temporal Features of Tactile Signals for Action Classification. AAAI 2024: 13817-13825
[c374]Mofetoluwa Adeyemi, Akintunde Oladipo, Ronak Pradeep, Jimmy Lin:
Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages. ACL (Short Papers) 2024: 650-656
[c373]Mohammad Dehghan, Mohammad Ali Alomrani, Sunyam Bagga, David Alfonso-Hermelo, Khalil Bibi, Abbas Ghaddar, Yingxue Zhang, Xiaoguang Li, Jianye Hao, Qun Liu, Jimmy Lin, Boxing Chen, Prasanna Parthasarathi, Mahdi Biparva, Mehdi Rezagholizadeh:
EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval for Citation-based Question Answering Systems. ACL (1) 2024: 14169-14187
[c372]Ronak Pradeep, Jimmy Lin:
Towards Automated End-to-End Health Misinformation Free Search with a Large Language Model. ECIR (4) 2024: 78-86
[c371]Ronak Pradeep, Daniel Lee, Ali Mousavi, Jeffrey Pound, Yisi Sang, Jimmy Lin, Ihab F. Ilyas, Saloni Potdar, Mostafa Arefiyan, Yunyao Li:
ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA Datasets with Large Language Models. EMNLP (Industry Track) 2024: 1176-1206
[c370]Shengyao Zhuang, Xueguang Ma, Bevan Koopman, Jimmy Lin, Guido Zuccon:
PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval. EMNLP 2024: 4375-4391
[c369]Raphael Tang, Xinyu Zhang, Lixinyu Xu, Yao Lu, Wenyan Li, Pontus Stenetorp, Jimmy Lin, Ferhan Ture:
Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation. EMNLP 2024: 5441-5454
[c368]Xueguang Ma, Sheng-Chieh Lin, Minghan Li, Wenhu Chen, Jimmy Lin:
Unifying Multimodal Retrieval via Document Screenshot Embedding. EMNLP 2024: 6492-6505
[c367]Nandan Thakur, Luiz Bonifacio, Xinyu Zhang, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Boxing Chen, Mehdi Rezagholizadeh, Jimmy Lin:
"Knowing When You Don't Know": A Multilingual Relevance Assessment Dataset for Robust Retrieval-Augmented Generation. EMNLP (Findings) 2024: 12508-12526
[c366]Jheng-Hong Yang, Jimmy Lin:
Toward Automatic Relevance Judgment using Vision-Language Models for Image-Text Retrieval Evaluation. LLM4Eval@SIGIR 2024: 113-123
[c365]Xinyu Zhang, Minghan Li, Jimmy Lin:
CELI: Simple yet Effective Approach to Enhance Out-of-Domain Generalization of Cross-Encoders. NAACL (Short Papers) 2024: 188-196
[c364]Raphael Tang, Xinyu Zhang, Xueguang Ma, Jimmy Lin, Ferhan Ture:
Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models. NAACL-HLT 2024: 2327-2340
[c363]Nandan Thakur, Jianmo Ni, Gustavo Hernández Ábrego, John Wieting, Jimmy Lin, Daniel Cer:
Leveraging LLMs for Synthesizing Training Data Across Many Languages in Multilingual Dense Retrieval. NAACL-HLT 2024: 7699-7724
[c362]Minghan Li, Xilun Chen, Ari Holtzman, Beidi Chen, Jimmy Lin, Scott Yih, Victoria Lin:
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution. NeurIPS 2024
[c361]Sheng-Chieh Lin, Luyu Gao, Barlas Oguz, Wenhan Xiong, Jimmy Lin, Scott Yih, Xilun Chen:
FLAME: Factuality-Aware Alignment for Large Language Models. NeurIPS 2024
[c360]Mofetoluwa Adeyemi, Akintunde Oladipo, Xinyu Zhang, David Alfonso-Hermelo, Mehdi Rezagholizadeh, Boxing Chen, Abdul-Hakeem Omotayo, Idris Abdulmumin, Naome A. Etori, Toyib Babatunde Musa, Samuel Fanijo, Oluwabusayo Olufunke Awoyomi, Saheed Abdullahi Salahudeen, Labaran Adamu Mohammed, Daud Olamide Abolade, Falalu Ibrahim Lawan, Maryam Sabo Abubakar, Ruqayya Nasir Iro, Amina Abubakar Imam, Shafie Abdi Mohamed, Hanad Mohamud Mohamed, Tunde Oluwaseyi Ajayi, Jimmy Lin:
CIRAL: A Test Collection for CLIR Evaluations in African Languages. SIGIR 2024: 293-302
[c359]Nandan Thakur, Luiz Bonifacio, Maik Fröbe, Alexander Bondarenko, Ehsan Kamalloo, Martin Potthast, Matthias Hagen, Jimmy Lin:
Systematic Evaluation of Neural Retrieval Models on the Touché 2020 Argument Retrieval Subset of BEIR. SIGIR 2024: 1420-1430
[c358]Ehsan Kamalloo, Nandan Thakur, Carlos Lassance, Xueguang Ma, Jheng-Hong Yang, Jimmy Lin:
Resources for Brewing BEIR: Reproducible Reference Models and Statistical Analyses. SIGIR 2024: 1431-1440
[c357]Minghan Li, Honglei Zhuang, Kai Hui, Zhen Qin, Jimmy Lin, Rolf Jagerman, Xuanhui Wang, Michael Bendersky:
Can Query Expansion Improve Generalization of Strong Cross-Encoder Rankers? SIGIR 2024: 2321-2326
[c356]Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin:
Fine-Tuning LLaMA for Multi-Stage Text Retrieval. SIGIR 2024: 2421-2425
[c355]Akintunde Oladipo, Mofetoluwa Adeyemi, Jimmy Lin:
On Backbones and Training Regimes for Dense Retrieval in African Languages. SIGIR 2024: 2564-2568
[c354]Ehsan Kamalloo, Shivani Upadhyay, Jimmy Lin:
Towards Robust QA Evaluation via Open LLMs. SIGIR 2024: 2811-2816
[c353]Shi Zong, Santosh Kolagati, Amit Chaudhary, Josh Seltzer, Jimmy Lin:
Reflections on the Coding Ability of LLMs for Analyzing Market Research Surveys. SIGIR 2024: 2900-2904
[c352]Jasper Xian, Tommaso Teofili, Ronak Pradeep, Jimmy Lin:
Vector Search with OpenAI Embeddings: Lucene Is All You Need. WSDM 2024: 1090-1093
[i190]Jimmy Lin, Junkai Li, Jiasi Gao, Weizhi Ma, Yang Liu:
Jointly Modeling Spatio-Temporal Features of Tactile Signals for Action Classification. CoRR abs/2404.15279 (2024)
[i189]Shengyao Zhuang, Xueguang Ma, Bevan Koopman, Jimmy Lin, Guido Zuccon:
PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval. CoRR abs/2404.18424 (2024)
[i188]Sheng-Chieh Lin, Luyu Gao, Barlas Oguz, Wenhan Xiong, Jimmy Lin, Wen-tau Yih, Xilun Chen:
FLAME: Factuality-Aware Alignment for Large Language Models. CoRR abs/2405.01525 (2024)
[i187]Shivani Upadhyay, Ehsan Kamalloo, Jimmy Lin:
LLMs Can Patch Up Missing Relevance Judgments in Evaluation. CoRR abs/2405.04727 (2024)
[i186]Sahel Sharifymoghaddam, Shivani Upadhyay, Wenhu Chen, Jimmy Lin:
UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models. CoRR abs/2405.10311 (2024)
[i185]Minghan Li, Xilun Chen, Ari Holtzman, Beidi Chen, Jimmy Lin, Wen-tau Yih, Xi Victoria Lin:
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution. CoRR abs/2405.19325 (2024)
[i184]Shivani Upadhyay, Ronak Pradeep, Nandan Thakur, Nick Craswell, Jimmy Lin:
UMBRELA: UMbrela is the (Open-Source Reproduction of the) Bing RELevance Assessor. CoRR abs/2406.06519 (2024)
[i183]Raphael Tang, Xinyu Zhang, Lixinyu Xu, Yao Lu, Wenyan Li, Pontus Stenetorp, Jimmy Lin, Ferhan Ture:
Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation. CoRR abs/2406.08482 (2024)
[i182]Manveer Singh Tamber, Jasper Xian, Jimmy Lin:
Can't Hide Behind the API: Stealing Black-Box Commercial Embedding Models. CoRR abs/2406.09355 (2024)
[i181]Mohammad Dehghan, Mohammad Ali Alomrani, Sunyam Bagga, David Alfonso-Hermelo, Khalil Bibi, Abbas Ghaddar, Yingxue Zhang, Xiaoguang Li, Jianye Hao, Qun Liu, Jimmy Lin, Boxing Chen, Prasanna Parthasarathi, Mahdi Biparva, Mehdi Rezagholizadeh:
EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval for Citation-based Question Answering Systems. CoRR abs/2406.10393 (2024)
[i180]Xueguang Ma, Sheng-Chieh Lin, Minghan Li, Wenhu Chen, Jimmy Lin:
Unifying Multimodal Retrieval via Document Screenshot Embedding. CoRR abs/2406.11251 (2024)
[i179]Ronak Pradeep, Nandan Thakur, Sahel Sharifymoghaddam, Eric Zhang, Ryan Nguyen, Daniel Campos, Nick Craswell, Jimmy Lin:
Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track. CoRR abs/2406.16828 (2024)
[i178]Shi Zong, Jimmy Lin:
Categorical Syllogisms Revisited: A Review of the Logical Reasoning Abilities of LLMs for Analyzing Categorical Syllogism. CoRR abs/2406.18762 (2024)
[i177]Nandan Thakur, Luiz Bonifacio, Maik Fröbe, Alexander Bondarenko, Ehsan Kamalloo, Martin Potthast, Matthias Hagen, Jimmy Lin:
Systematic Evaluation of Neural Retrieval Models on the Touché 2020 Argument Retrieval Subset of BEIR. CoRR abs/2407.07790 (2024)
[i176]Jheng-Hong Yang, Jimmy Lin:
Toward Automatic Relevance Judgment using Vision-Language Models for Image-Text Retrieval Evaluation. CoRR abs/2408.01363 (2024)
[i175]Ronak Pradeep, Daniel Lee, Ali Mousavi, Jeff Pound, Yisi Sang, Jimmy Lin, Ihab F. Ilyas, Saloni Potdar, Mostafa Arefiyan, Yunyao Li:
ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA datasets with Large Language Models. CoRR abs/2408.05948 (2024)
[i174]Jimmy Lin:
Operational Advice for Dense and Sparse Retrievers: HNSW, Flat, or Inverted Indexes? CoRR abs/2409.06464 (2024)
[i173]Nandan Thakur, Suleman Kazi, Ge Luo, Jimmy Lin, Amin Ahmad:
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems. CoRR abs/2410.13716 (2024)
[i172]Sheng-Chieh Lin, Chankyu Lee, Mohammad Shoeybi, Jimmy Lin, Bryan Catanzaro, Wei Ping:
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs. CoRR abs/2411.02571 (2024)
[i171]Xinyu Zhang, Jing Lu, Vinh Q. Tran, Tal Schuster, Donald Metzler, Jimmy Lin:
Tomato, Tomahto, Tomate: Measuring the Role of Shared Semantics among Subwords in Multilingual Language Models. CoRR abs/2411.04530 (2024)
[i170]Zijian Chen, Ronak Pradeep, Jimmy Lin:
An Early FIRST Reproduction and Improvements to Single-Token Decoding for Fast Listwise Reranking. CoRR abs/2411.05508 (2024)
[i169]Shivani Upadhyay, Ronak Pradeep, Nandan Thakur, Daniel Campos, Nick Craswell, Ian Soboroff, Hoa Trang Dang, Jimmy Lin:
A Large-Scale Study of Relevance Assessments with Large Language Models: An Initial Look. CoRR abs/2411.08275 (2024)
[i168]Ronak Pradeep, Nandan Thakur, Shivani Upadhyay, Daniel Campos, Nick Craswell, Jimmy Lin:
Initial Nugget Evaluation Results for the TREC 2024 RAG Track with the AutoNuggetizer Framework. CoRR abs/2411.09607 (2024)
[i167]Nadia Sheikh, Anne-Laure Jousse, Daniel Buades Marcos, Akintunde Oladipo, Olivier Rousseau, Jimmy Lin:
CURE: A dataset for Clinical Understanding & Retrieval Evaluation. CoRR abs/2412.06954 (2024)
[i166]Zijian Chen, John-Michael Gamble, Micaela Jantzi, John P. Hirdes, Jimmy Lin:
Zero-Shot ATC Coding with Large Language Models for Clinical Assessments. CoRR abs/2412.07743 (2024)
[i165]Xueguang Ma, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Wenhu Chen, Jimmy Lin:
VISA: Retrieval Augmented Generation with Visual Source Attribution. CoRR abs/2412.14457 (2024)
[i164]Jimmy Lin, Pankaj Gupta, Will Horn, Gilad Mishne:
Musings About the Future of Search: A Return to the Past? CoRR abs/2412.18956 (2024)
- 2023
[j65]Sheng-Chieh Lin, Minghan Li, Jimmy Lin:
Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval. Trans. Assoc. Comput. Linguistics 11: 436-452 (2023)
[j64]Xinyu Zhang, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, Jimmy Lin:
MIRACL: A Multilingual Retrieval Dataset Covering 18 Diverse Languages. Trans. Assoc. Comput. Linguistics 11: 1114-1131 (2023)
[j63]Joel Mackenzie, Andrew Trotman, Jimmy Lin:
Efficient Document-at-a-time and Score-at-a-time Query Evaluation for Learned Sparse Representations. ACM Trans. Inf. Syst. 41(4): 96:1-96:28 (2023)
[j62]Sheng-Chieh Lin, Jimmy Lin:
A Dense Representation Framework for Lexical and Semantic Matching. ACM Trans. Inf. Syst. 41(4): 110:1-110:29 (2023)
[c351]Ehsan Kamalloo, Xinyu Zhang, Odunayo Ogundepo, Nandan Thakur, David Alfonso-Hermelo, Mehdi Rezagholizadeh, Jimmy Lin:
Evaluating Embedding APIs for Information Retrieval. ACL (industry) 2023: 518-526
[c350]Aleksandra Piktus, Odunayo Ogundepo, Christopher Akiki, Akintunde Oladipo, Xinyu Zhang, Hailey Schoelkopf, Stella Biderman, Martin Potthast, Jimmy Lin:
GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration. ACL (demo) 2023: 588-598
[c349]Luyu Gao, Xueguang Ma, Jimmy Lin, Jamie Callan:
Precise Zero-Shot Dense Retrieval without Relevance Labels. ACL (1) 2023: 1762-1777
[c348]Ji Xin, Raphael Tang, Zhiying Jiang, Yaoliang Yu, Jimmy Lin:
Operator Selection and Ordering in a Pipeline Approach to Efficiency Optimizations for Transformers. ACL (Findings) 2023: 2870-2882
[c347]Raphael Tang, Linqing Liu, Akshat Pandey, Zhiying Jiang, Gefei Yang, Karun Kumar, Pontus Stenetorp, Jimmy Lin, Ferhan Ture:
What the DAAM: Interpreting Stable Diffusion Using Cross Attention. ACL (1) 2023: 5644-5659
[c346]Zhiying Jiang, Matthew Y. R. Yang, Mikhail Tsirlin, Raphael Tang, Yiqin Dai, Jimmy Lin:
"Low-Resource" Text Classification: A Parameter-Free Classification Method with Compressors. ACL (Findings) 2023: 6810-6828
[c345]Minghan Li, Sheng-Chieh Lin, Barlas Oguz, Asish Ghoshal, Jimmy Lin, Yashar Mehdad, Wen-tau Yih, Xilun Chen:
CITADEL: Conditional Token Interaction via Dynamic Lexical Routing for Efficient and Effective Multi-Vector Retrieval. ACL (1) 2023: 11891-11907
[c344]Xueguang Ma, Tommaso Teofili, Jimmy Lin:
Anserini Gets Dense Retrieval: Integration of Lucene's HNSW Indexes. CIKM 2023: 5366-5370
[c343]Wei Zhong, Yuqing Xie, Jimmy Lin:
Answer Retrieval for Math Questions Using Structural and Dense Retrieval. CLEF 2023: 209-223
[c342]Ronak Pradeep, Haonan Chen, Lingwei Gu, Manveer Singh Tamber, Jimmy Lin:
PyGaggle: A Gaggle of Resources for Open-Domain Question Answering. ECIR (3) 2023: 148-162
[c341]Manveer Singh Tamber, Ronak Pradeep, Jimmy Lin:
Pre-processing Matters! Improved Wikipedia Corpora for Open-Domain Question Answering. ECIR (3) 2023: 163-176
[c340]Christopher Akiki, Odunayo Ogundepo, Aleksandra Piktus, Xinyu Zhang, Akintunde Oladipo, Jimmy Lin, Martin Potthast:
Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face. EMNLP (Demos) 2023: 140-148
[c339]Akintunde Oladipo, Mofetoluwa Adeyemi, Orevaoghene Ahia, Abraham Toluwase Owodunni, Odunayo Ogundepo, David Ifeoluwa Adelani, Jimmy Lin:
Better Quality Pre-traini

