Andrew O. Arnold
  • Principal Applied Machine Learning Engineer, Shopify
  • Adjunct Professor, New York University
  • Quantitative Portfolio Manager
  • Ph.D., Machine Learning, Carnegie Mellon University


About me

I am a hands-on technical and scientific expert with experience as both a senior individual contributor and an org leader, delivering complex, high-impact applied machine learning research projects in science, technology, commerce and quantitative trading. I am currently a Principal Applied Machine Learning Engineer at Shopify, building machine learning-based products to help make commerce better for everyone. Previously, I was Chief Scientist at Oracle Alpha, leading machine learning and natural language processing research and production for an emerging systematic fundamental hedge fund. I am also an Adjunct Professor in NYU's Department of Finance and Risk Engineering, lecturing on natural language processing and machine learning applied to quantitative trading and finance. I received my Ph.D. in Machine Learning from Carnegie Mellon University and my BA in Computer Science and Artificial Intelligence from Columbia University.

With expertise in machine learning, natural language processing and quantitative trading, I focus my personal research on robust machine learning: developing robust models and features. I am particularly interested in applications of robust machine learning to time series and natural language processing models in financial and other domains.
Professional
Teaching
Patents
Publications

  • Jing Wang, Jie Shen, Xiaofei Ma and Andrew O. Arnold
    Uncertainty-based Active Learning for Reading Comprehension
    Transactions on Machine Learning Research (TMLR), 2022. [paper, video, code]

  • Yuantong Li, Xiaokai Wei, Zijian Wang, Shen Wang, Parminder Bhatia, Xiaofei Ma and Andrew O. Arnold
    Debiasing Neural Retrieval via In-batch Balancing Regularization
    NAACL Workshop on Gender Bias in Natural Language Processing (NAACL:GeBNLP), 2022. [paper]

  • Zhihan Zhou, Dejiao Zhang, Wei Xiao, Nicholas Dingwall, Xiaofei Ma, Andrew O. Arnold and Bing Xiang
    Learning Dialogue Representations from Consecutive Utterances
    North American Chapter of the Association for Computational Linguistics (NAACL), 2022. [paper]

  • Xisen Jin, Dejiao Zhang, Henghui Zhu, Wei Xiao, Shang-Wen Li, Xiaokai Wei, Andrew O. Arnold and Xiang Ren
    Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora
    North American Chapter of the Association for Computational Linguistics (NAACL), 2022. [paper]

  • Danilo Neves Ribeiro, Shen Wang, Xiaofei Ma, Xiaokai Wei, Henghui Zhu, Rui Dong, Xinchi Chen, Peng Xu, Zhiheng Huang, Andrew O. Arnold and Dan Roth
    Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner
    Findings of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022. [paper]

  • Zheng Li, Zijian Wang, Ming Tan, Ramesh Nallapati, Parminder Bhatia, Andrew O. Arnold, Dan Roth and Bing Xiang
    DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization
    Association for Computational Linguistics (ACL), 2022. [paper, blog]

  • Mufan Sang, Haoqi Li, Fang Liu, Andrew O. Arnold and Li Wan
    Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization
    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022. [paper]

  • Shen Wang, Xiaokai Wei, Cicero Nogueira dos Santos, Zhiguo Wang, Ramesh Nallapati, Andrew O. Arnold and Philip S. Yu
    Knowledge Graph Representation via Hierarchical Hyperbolic Neural Graph Embedding
    IEEE International Conference on Big Data (BigData), 2021. [paper]

  • Dejiao Zhang, Wei Xiao, Henghui Zhu, Xiaofei Ma and Andrew O. Arnold
    Virtual Augmentation Supported Contrastive Learning of Sentence Representations
    Findings of the Association for Computational Linguistics (ACL), 2022. [paper, code, video]

  • Andy T. Liu, Wei Xiao, Henghui Zhu, Dejiao Zhang, Shang-Wen Li and Andrew O. Arnold
    QaNER: Prompting Question Answering Models for Few-shot Named Entity Recognition
    arXiv:2203.01543, 2022. [paper]

  • Xiaokai Wei, Shen Wang, Dejiao Zhang, Parminder Bhatia and Andrew O. Arnold
    Knowledge Enhanced Pretrained Language Models: A Comprehensive Survey
    arXiv:2110.08455, 2021. [paper]

  • Dejiao Zhang, Shang-Wen Li, Wei Xiao, Henghui Zhu, Ramesh Nallapati, Andrew O. Arnold and Bing Xiang
    Pairwise Supervised Contrastive Learning of Sentence Representations
    Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021. [paper, code]

  • Yifan Gao, Henghui Zhu, Patrick Ng, Cicero Nogueira dos Santos, Zhiguo Wang, Feng Nan, Dejiao Zhang, Ramesh Nallapati, Andrew O. Arnold and Bing Xiang
    Answering Ambiguous Questions through Generative Evidence Fusion and Round-Trip Prediction
    Association for Computational Linguistics (ACL), 2021. [paper]

  • Feng Nan, Cicero Nogueira dos Santos, Henghui Zhu, Patrick Ng, Kathleen McKeown, Ramesh Nallapati, Dejiao Zhang, Zhiguo Wang, Andrew O. Arnold and Bing Xiang
    Improving Factual Consistency of Abstractive Summarization via Question Answering
    Association for Computational Linguistics (ACL), 2021. [paper]

  • Xiaofei Ma, Cicero Nogueira dos Santos and Andrew O. Arnold
    Contrastive Fine-tuning Improves Robustness for Neural Rankers
    Findings of the Association for Computational Linguistics (ACL), 2021. [paper]

  • Dejiao Zhang, Feng Nan, Xiaokai Wei, Shang-Wen Li, Henghui Zhu, Kathleen McKeown, Ramesh Nallapati, Andrew O. Arnold and Bing Xiang
    Supporting Clustering with Contrastive Learning
    North American Chapter of the Association for Computational Linguistics (NAACL), 2021. [paper, code]

  • Shen Wang, Xiaokai Wei, Cicero Nogueira dos Santos, Zhiguo Wang, Ramesh Nallapati, Andrew O. Arnold, Bing Xiang, Isabel F. Cruz and Philip S. Yu
    Mixed-Curvature Multi-relational Graph Neural Network for Knowledge Graph Completion
    The Web Conference (WWW), 2021. [paper]

  • Andrew O. Arnold and William W. Cohen
    Instance-based Transfer Learning for Multilingual Deep Retrieval
    The Web Conference Workshop on Multilingual Search (WWW), 2021. [paper]

  • Haitian Sun, Andrew O. Arnold, Tania Bedrax-Weiss, Fernando Pereira and William W. Cohen
    Faithful Embeddings for Knowledge Base Queries
    Neural Information Processing Systems (NeurIPS), 2020. [paper]

  • Cheng Tang and Andrew O. Arnold
    Neural Document Expansion for Ad-hoc Information Retrieval
    arXiv:2012.14005, 2020. [paper]

  • Andrew O. Arnold
    Exploiting Domain and Task Regularities for Robust Named Entity Recognition
    Ph.D. Thesis, Carnegie Mellon University (CMU), 2009. [paper, slides, proposal, proposal slides]

  • Andrew O. Arnold and William W. Cohen
    Information Extraction as Link Prediction: Using Curated Citation Networks to Improve Gene Detection
    International AAAI Conference on Weblogs and Social Media (ICWSM), 2009. [paper, extended version, poster]

  • Amr Ahmed, Andrew O. Arnold, Luis Pedro Coelho, Joshua Kangas, Abdul-Saboor Sheikh, Eric Xing, William Cohen and Robert F. Murphy
    Structured Literature Image Finder
    ISMB BioLINK Special Interest Group (BioLINK), 2009. [paper]

  • Andrew O. Arnold and William W. Cohen
    Intra-document Structural Frequency Features for Semi-supervised Domain Adaptation
    Conference on Information and Knowledge Management (CIKM), 2008. [paper, slides]

  • Andrew O. Arnold, Ramesh Nallapati and William W. Cohen
    Exploiting Feature Hierarchy for Transfer Learning in Named Entity Recognition
    Association for Computational Linguistics: Human Language Technologies (ACL:HLT), 2008. [paper, slides]

  • Xiubo Geng, Tie-Yan Liu, Tao Qin, Andrew O. Arnold, Hang Li and Harry Shum
    Query Dependent Ranking Using K-Nearest Neighbor
    Special Interest Group on Information Retrieval (SIGIR), 2008. [paper]

  • Andrew O. Arnold, Ramesh Nallapati and William W. Cohen
    A Comparative Study of Methods for Transductive Transfer Learning
    International Conference on Data Mining Workshop on Mining and Management of Biological Data (ICDM), 2007. [paper, extended version, slides]

  • Andrew O. Arnold, Yan Liu and Naoki Abe
    Temporal Causal Modeling with Graphical Granger Methods
    International Conference on Knowledge Discovery and Data Mining (KDD), 2007. [paper, slides, video]

  • Andrew O. Arnold, Joseph E. Beck and Richard Scheines
    Feature Discovery in the Context of Educational Data Mining: An Inductive Approach
    AAAI Workshop on Educational Data Mining (AAAI), 2006. [paper]

  • Andrew O. Arnold, Richard Scheines, Joseph E. Beck and Bill Jerome
    Time and Attention: Students, Sessions, and Tasks
    AAAI Workshop on Educational Data Mining (AAAI), 2005. [paper]

  • Eleazar Eskin, Andrew O. Arnold, Michael Prerau, Leonid Portnoy and Salvatore Stolfo
    A Geometric Framework for Unsupervised Anomaly Detection: Detecting Intrusions in Unlabeled Data
    Applications of Data Mining in Computer Security, 2002. [paper]

  • Kristinn R. Thorisson, Hrvoje Benko, Denis Abramov, Andrew O. Arnold, Sameer Maskey and Aruchunan Vaseekaran
    Constructionist Design Methodology for Interactive Intelligences
    AI Magazine (AAAI), 2004. [paper, abstract, article, video]

  • Andrew O. Arnold and Andrew Howard
    Reinforcement Learning in the Presence of Hidden States
    Computer Science Department, Columbia University, 2002. [paper]
Invited Talks
Software