Research

Machine Learning for a better future.

Nayan Saxena is a researcher working at the intersection of statistical machine learning, artificial intelligence, and computational cognitive sciences. His research is both fundamental and applied in nature with a focus on the development of algorithms that promote generalization across various tasks. He is part of an extensive network of accomplished researchers with varied backgrounds spanning from machine learning, artificial intelligence to computational cognitive sciences.

He analyzes and manipulates big datasets to create robust machine learning models, enhancing our comprehension of intelligence in humans and machines. Alongside esteemed researchers from the University of Toronto, Brown University, University of Edinburgh and beyond, Saxena continues to generate research outputs that extend to interdisciplinary areas such as multi-armed bandit algorithms, clinical trials, and population dynamics in relation to COVID-19, demonstrating the potency and breadth of his AI research.

Publications

Links: Paper | Code
Venues:
- Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI), Vienna, Austria. (Spotlight Paper)
- Workshop on Machine Learning for the Developing World, 35th Conference on Neural Information Processing Systems (NeurIPS) (Oral Paper)
Authors: Steven Kolawole, Opeyemi Osakuade, Nayan Saxena, & Babatunde Kazeem Olorisade
Abstract: Through this paper, we seek to reduce the communication barrier between the hearing-impaired community and the larger society who are usually not familiar with sign language in the sub-Saharan region of Africa with the largest occurrences of hearing disability cases, while using Nigeria as a case study. The dataset is a pioneer dataset for the Nigerian Sign Language and was created in collaboration with relevant stakeholders. We pre-processed the data in readiness for two different object detection models and a classification model and employed diverse evaluation metrics to gauge model performance on sign-language to text conversion tasks. Finally, we convert the predicted sign texts to speech and deploy the best performing model in a lightweight application that works in real-time and achieves impressive results converting sign words/phrases to text and subsequently, into speech.
Links: Paper | Poster | Code
Venue: Proceedings of the 44th Annual Meeting of the Cognitive Science Society
Authors: Nayan Saxena, Rebekah Gelpi, Daphna Buchsbaum, Chris Lucas
Abstract: We test whether people flexibly shift their sampling strategy for learning a functional relationship, based on the strategy’s perceived effectiveness. While general-purpose heuristics such as gathering information evenly across the environment may often approximate optimal sampling policies when opportunities for learning are sparse, these strategies may systematically fail when much of the environment is uninformative. Across several different classes of arbitrary smooth functions, participants (N = 89) made sampling choices that were initially consistent with a simple heuristic, but shifted their sampling strategy when this heuristic failed to be informative. People were subsequently more accurate at approximating the true function for smoother functions that can be more easily predicted by sampling evenly, t(645.3) = 6.803, p < .001; nevertheless, while individuals vary considerably, aggregate judgments across all functions are very accurate, suggesting that people show inductive biases in active learning, but effectively learn when to adjust their policies.
Links: Paper | Code
Venue: Proceedings of the 36th AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada
Authors: Nayan Saxena, Robert Wu, & Rohan Jain
Abstract: We evaluate the robustness of a Neural Architecture Search (NAS) algorithm known as Efficient NAS (ENAS) against data agnostic poisoning attacks on the original search space with carefully designed ineffective operations. We empirically demonstrate how our one shot search space poisoning approach exploits design flaws in the ENAS controller to degrade predictive performance on classification tasks. With just two poisoning operations injected into the search space, we inflate prediction error rates for child networks upto 90% on the CIFAR-10 dataset.
Links: Paper | Code
Venue: Proceedings of the 36th AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada (Oral Paper; Top 20)
Authors: Robert Wu, Nayan Saxena & Rohan Jain
Abstract: Neural Architecture Search (NAS) algorithms automate the task of finding optimal deep learning architectures given an initial search space of possible operations. Developing these search spaces is usually a manual affair with pre-optimized search spaces being more efficient, rather than searching from scratch. In this paper we present a new framework called Neural Architecture Type System (NeuralArTS) that categorizes the infinite set of network operations in a structured type system. We further demonstrate how NeuralArTS can be applied to convolutional layers and propose several future directions.

2022

2021

Links: Paper | Poster | Talk
Venues:
- Proceedings of the 19th International Conference on Cognitive Modeling
- Abstract in proceedings of the 43rd Annual Meeting of the Cognitive Science Society, Vienna, Austria, 2021
Authors: Rebekah Gelpi, Nayan Saxena, George Lifchits, Daphna Buchsbaum & Christopher G Lucas
Abstract: People are capable of learning diverse functional relationships from data; nevertheless, they are most accurate when learning linear relationships, and deviate further from estimating the true relationship when presented with non-linear functions. We investigate whether, when given the opportunity to learn actively, people choose samples in an efficient fashion, and whether better sampling policies improve their ability to learn linear and non-linear functions. We find that, across multiple different function families, people make informative sampling choices consistent with a simple, low-effort policy that minimizes uncertainty at extreme values without requiring adaptation to evidence. While participants were most accurate at learning linear functions, those who more closely adhered to the simple sampling strategy also made better predictions across all non-linear functions. We discuss how the use of this heuristic might reflect rational allocation of limited cognitive resources.
Links: Paper | Code
Venue: Proceedings of the 21st IEEE International Conference on Advanced Learning Technologies, Tartu, Estonia, 2021
Authors: Nayan Saxena, Varun Lodaya & Trisha Thakur
Abstract: In this paper, we explore an extension of the Item Response Theory (IRT) model to predict student responses using dichotomous data and formulate approaches to improve the predictive accuracy of the traditional algorithm. We present a simple extension to the IRT modelling approach called IRT++, which combines both the 1-parameter and 2-parameter IRT models and modulates parameter optimisation through simple machine learning techniques like adaptive gradient descent and random-normal initialisation of parameters. By experimentation on real-world education data, we show how the IRT++ modelling framework other baselines at predicting student responses, and achieves better performance while sacrificing very little in model interpretability and rate of convergence.
Links: Paper | Preprint | Spotlight Talk | Code
Venue: Workshop on Reinforcement Learning for Education, 14th International Conference on Educational Data Mining , Paris, France, 2021 (Spotlight Paper)
Authors: Nayan Saxena, Pan Chen, Emmy Liu
Abstract: Multi-Armed-Bandit frameworks have often been used by researchers to assess educational interventions, however, recent work has shown that it is more beneficial for a student to provide qualitative feedback through preference elicitation between different alternatives, making a dueling bandits framework more appropriate. In this paper, we explore the statistical quality of data under this framework by comparing traditional uniform sampling to a dueling bandit algorithm and find that dueling bandit algorithms perform well at cumulative regret minimisation, but lead to inflated Type-I error rates and reduced power under certain circumstances. Through these results we provide insight into the challenges and opportunities in using dueling bandit algorithms to run adaptive experiments.
Links: Preprint | Workshop Paper | Poster | Code
Venue: Workshop on Adversarial Machine Learning, 38th International Conference on Machine Learning, 2021
Authors: Robert Wu*, Nayan Saxena* & Rohan Jain* (*equal contribution)
Abstract: Deep learning has proven to be a highly effective problem-solving tool for object detection and image segmentation across various domains such as healthcare and autonomous driving. At the heart of this performance lies neural architecture design which relies heavily on domain knowledge and prior experience on the researchers' behalf. More recently, this process of finding the most optimal architectures, given an initial search space of possible operations, was automated by Neural Architecture Search (NAS). In this paper, we evaluate the robustness of one such algorithm known as Efficient NAS (ENAS) against data agnostic poisoning attacks on the original search space with carefully designed ineffective operations. By evaluating algorithm performance on the CIFAR-10 dataset, we empirically demonstrate how our novel search space poisoning (SSP) approach and multiple-instance poisoning attacks exploit design flaws in the ENAS controller to result in inflated prediction error rates for child networks. Our results provide insights into the challenges to surmount in using NAS for more adversarially robust architecture search.

2020

Links: Paper
Venue: Technical Report
Authors: Nayan Saxena & Alexander Shum
Links: Paper
Venue:
- SocArXiv
- Work performed under the University of Toronto COVID-19 Student Engagement Grant
Authors: Liam Keating, Nayan Saxena, Emma Cooper, Jordan Tirico, Daniel Khain, Daphne Imahori
Abstract: The Social Distancing Index (SDI) measures social distancing behaviour every day across all fifty American states. This study leverages SDI data to model social distancing behaviour with time-series COVID-19 data, as well as an array of political and economic variables. The central aim of this study is to examine three hypotheses: (i) COVID-19 outbreaks within a state will induce social distancing by fear of the virus, (ii) states with more low-income workers will engage in less social distancing due to the nature of essential work, and (iii) political beliefs will influence social distancing behaviour, through the public debate over social distancing policy and a partisan logic defining state stay-at-home orders. We use Vector Autoregressive (VAR) and Beta Regression models to determine the most influential variables in this study. VAR models for time-series relationships between cases and social distancing behaviour in California and Texas, and corresponding Granger-Cause Test results, are investigated through case studies. Significant Beta model variables influencing social distancing behaviour are closely examined through visual data analysis and qualitatively contextualized to describe relationships present in the data. Our results indicate statistically significant relationships between the severity of state outbreaks, age and income distribution, change in governor approval ratings, and social distancing behaviour. There are also clear relationships between the partisan make-up of a state and social distancing behaviour. The results found in this study contribute to growing evidence regarding the impact political polarization has on various aspects of American social life, while giving insight to behavioural dynamics that play a critical role in mitigating the spread of COVID-19.