IBM at IJCAI 2023
- Macao, China
About
The AI research teams from various IBM Research locations will be at the 32nd International Joint Conference on Artificial Intelligence, the premier international gathering of researchers in AI. These IBMers will showcase some of our exciting projects by giving talks, hosting interactive workshops, and taking part in panel discussions.
Read our accepted papers at IJCAI 2023 (including workshop papers not listed as part of the agenda).
For presentation times of workshops, demos, papers, and tutorials from our teams, see the agenda section below.
Agenda
- Description:
This tutorial reviews the design of common meaning representations, SoTA models for predicting meaning representations, and the applications of meaning representations in a wide range of downstream NLP tasks and real-world applications. Presented by a diverse team of NLP researchers from academia and industry with extensive experience in designing, building, and using meaning representations, our tutorial has three components: (1) an introduction to common meaning representations, including basic concepts and design challenges; (2) a review of SoTA methods for building models for meaning representations; and (3) an overview of applications of meaning representations in downstream NLP tasks and real-world applications.
This is an **all-day tutorial.**
**Presenters:** Nianwen Xue; Julia Bonn; Jeffrey Flanigan; Timothy O’Gorman; Jan Hajic; Ishan Jindal; Yunyao Li
- Description:
A team from IBM Research Africa is co-organizing the Neuro-Symbolic Agents (NSA) workshop. This workshop combines ML and symbolic AI for reinforcement learning; more details are available on the workshop site: https://nsa-wksp.github.io/.
Organizers: Alessandra Russo; Daiki Kimura _(primary contact)_; Ndivhuwo Makondo; Steven James
- Description:
While AI Planning and Reinforcement Learning communities focus on similar sequential decision-making problems, these communities remain somewhat unaware of each other on specific problems, techniques, methodologies, and evaluations. This workshop aims to encourage discussion and collaboration between researchers in the fields of AI planning and reinforcement learning. We aim to bridge the gap between the two communities, facilitate the discussion of differences and similarities in existing techniques, and encourage collaboration across the fields. We solicit interest from AI researchers that work in the intersection of planning and reinforcement learning, in particular, those that focus on intelligent decision-making.
More details are available here.
Organizing Committee: Cameron Allen, Timo P. Gros, Michael Katz, Harsha Kokel, Hector Palacios, Sarath Sreedharan
- Description:
Open Retrieval Question Answering (OpenQA) systems are made of two key components: retrievers and readers, both of which have seen rapid advancements in their cross-lingual capabilities across modalities. This brings up a new challenge of ensuring efficiency, reproducibility, and easy reusability of these state-of-the-art (SOTA) retrievers and readers. In this tutorial, we will first cover recent advances in multilingual and multi-modal retrievers and readers across structured tabular data and unstructured text data, with a specific focus on using them in OpenQA. We will also introduce a novel toolkit that hosts multiple SOTA retrievers and readers, and hold hands-on sessions to demonstrate how to build robust OpenQA systems with SOTA models using this toolkit.
This is a **3-hour tutorial** with presenters from IBM, Stanford, UIUC, and UWaterloo.
Presenters: Avi Sil; Bhavani Iyer; Revanth Reddy Gangi Reddy; Jaydeep Sen; Wenhu Chen; Christopher Potts
- Description:
The Logical Credal Network or LCN is a recent probabilistic logic designed for effective aggregation and reasoning over multiple sources of imprecise knowledge. An LCN specifies a set of probability distributions over all interpretations of a set of logical formulas for which marginal and conditional probability bounds on their truth values are known. Inference in LCNs involves the exact solution of a non-convex non-linear program defined over an exponentially large number of non-negative real valued variables and, therefore, is limited to relatively small problems. In this paper, we present ARIEL — a novel iterative message-passing scheme for approximate inference in LCNs. Inspired by classical belief propagation for graphical models, our method propagates messages that involve solving considerably smaller local non-linear programs. Experiments on several classes of LCNs demonstrate clearly that ARIEL yields high quality solutions compared with exact inference and scales to much larger problems than previously considered.
This work will be presented in the **Uncertainty in AI (1/2) Session.**
Authors: Radu Marinescu; Haifeng Qian; Alexander Gray; Debarun Bhattacharjya; Francisco Barahona; Tian Gao; Ryan Riegel
- Description:
In this paper we discuss how AI can contribute to support the documentation and vitalization of Indigenous languages and how that involves a delicate balancing of ensuring social impact, exploring technical opportunities, and dealing with ethical constraints. We start by surveying previous work on using AI and NLP to support critical activities of strengthening Indigenous and endangered languages and discussing key limitations of current technologies. After presenting basic ethical constraints of working with Indigenous languages and communities, we propose that creating and deploying language technology ethically with and for Indigenous communities forces AI researchers and engineers to address some of the main shortcomings and criticisms of current technologies. Those ideas are also explored in the discussion of a real case of development of large language models for Brazilian Indigenous languages.
This work will be presented in the AI for Social Good – Humans and AI Session.
Authors: Claudio S. Pinhanez; Paulo Cavalin; Marisa Vasconcelos; Julio Nogima
- Description:
Event sequences are widely available across application domains and there is a long history of models for representing and analyzing such datasets. Summary Markov models are a recent addition to the literature that help identify the subset of event types that influence event types of interest to a user. In this paper, we introduce logical summary Markov models, which are a family of models for event sequences that enable interpretable predictions through logical rules that relate historical predicates to the probability of observing an event type at any arbitrary position in the sequence. We illustrate their connection to prior parametric summary Markov models as well as probabilistic logic programs, and propose new models from this family along with efficient greedy search algorithms for learning them from data. The proposed models outperform relevant baselines on most datasets in an empirical investigation on a probabilistic prediction task. We also compare the number of influencers that various logical summary Markov models learn on real-world datasets, and conduct a brief exploratory qualitative study to gauge the promise of such symbolic models around guiding large language models for predicting societal events.
This work will be presented in the Uncertainty in AI (1/2) Session.
Authors: Debarun Bhattacharjya; Oktie Hassanzadeh; Ronny Luss; Keerthiram Murugesan
- Description:
The research team in Africa will present new work on efficient characterization of deep learning representations for detection tasks at the **African Spotlight Poster Session.** This work proposes a method to characterize the representations of DNNs using node-specific histograms to compute p-values of observed activations without retaining already-known inputs. The approach shows promising potential when validated with multiple network architectures across various downstream detection tasks and compared with kernel density estimation and brute-force empirical baselines. It reduces memory usage by 30% and computes p-values faster, while maintaining state-of-the-art detection power in downstream tasks.
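As a rough illustration of the histogram idea described above, the following minimal sketch computes an approximate right-tail p-value for one node's activation from a stored histogram of reference activations. It is not the authors' implementation: the function names, the equal-width binning, the right-tail definition, and the add-one smoothing are all assumptions made for this example.

```python
def fit_histogram(reference, n_bins=10):
    """Summarize reference activations of one node as equal-width bin counts."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all reference values identical; avoid division by zero
    counts = [0] * n_bins
    for x in reference:
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    # Only these three values are kept; the raw activations can be discarded.
    return lo, width, counts


def histogram_p_value(hist, observed):
    """Approximate P(reference >= observed) from the bin counts."""
    lo, width, counts = hist
    idx = int((observed - lo) // width)
    idx = max(0, min(idx, len(counts) - 1))  # clip values outside the range
    tail = sum(counts[idx:])
    # Add-one smoothing (an assumption) keeps the p-value strictly positive.
    return (tail + 1) / (sum(counts) + 1)


hist = fit_histogram(list(range(100)), n_bins=10)
p_extreme = histogram_p_value(hist, 95)  # activation in the upper tail
p_typical = histogram_p_value(hist, 5)   # activation in the lowest bin
```

Because only the bin origin, bin width, and counts are stored per node, memory use depends on the number of bins rather than on the number of reference inputs.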
This contribution is part of the **African Spotlight Poster Session.**
Speakers: Girmaw Abebe Tadesse (Principal Research Scientist and Manager, Microsoft); Adebayo Oshingbesan (Research Engineer, IBM Research); Edward McFowland III (Assistant Professor, Harvard Business School)
- Description:
We consider the problem of risk-aware Markov Decision Processes (MDPs) for Safe AI. We introduce a theoretical framework, Extended Markov Ratio Decision Processes (EMRDP), that incorporates risk into MDPs and embeds environment learning into this framework. We propose an algorithm to find the optimal policy for EMRDP with theoretical guarantees. Under a certain monotonicity assumption, this algorithm runs in strongly-polynomial time both in the discounted and expected average reward models. We validate our algorithm empirically on a Grid World benchmark, evaluating its solution quality, required number of steps, and numerical stability. We find its solution quality to be stable under data noising, while its required number of steps grows with added noise. We observe its numerical stability compared to global methods.
This work will be presented in a Planning and Scheduling (1/3) Session.
Authors: Alexander Zadorojniy; Takayuki Osogami; Orit Davidovich
- Description:
Trading networks are an indispensable part of today’s economy, but to compete successfully with others, they must be efficient in maximizing the value they provide to the external market. While prior work relies on truthful disclosure of private information to achieve efficiency, we study the problem of designing mechanisms that result in efficient trading networks by incentivizing firms to truthfully reveal their private information to a third party. Additional desirable properties of such mechanisms are weak budget balance (WBB; the third party need not invest) and individual rationality (IR; firms get non-negative utility). Unlike combinatorial auctions, there may not exist mechanisms that simultaneously satisfy these properties ex post for trading networks. We propose an approach for computing or learning truthful and efficient mechanisms for given networks in a Bayesian setting, where WBB and IR, respectively, are relaxed to ex ante and interim for a given distribution over the private information. We incorporate techniques to reduce computational and sample complexity. We empirically demonstrate that the proposed approach successfully finds mechanisms with the relaxed properties for trading networks where achieving ex post properties is impossible.
This work will be presented in the GTEP: Mechanism Design Session.
Authors: Takayuki Osogami; Segev Wasserkrug; Elisheva S. Shamash
- Description:
Top-k planning, the task of finding k top-cost plans, is a key formalism for many planning applications, and K* search is a well-established approach to top-k planning. The algorithm iteratively runs A* search and Eppstein’s algorithm until a sufficient number of plans is found. The performance of the K* algorithm is therefore inherently limited by the performance of A*, and in order to improve K* performance, that of A* must be improved. In cost-optimal planning, orbit space search improves A* performance by exploiting symmetry pruning, essentially performing A* in the orbit space instead of the state space. In this work, we take a similar approach to top-k planning. We show theoretical equivalence between the goal paths in the state space and in the orbit space, which allows us to perform K* search in the orbit space instead and to reconstruct plans from the paths found there. We prove that our algorithm is sound and complete for top-k planning and empirically show that it achieves state-of-the-art performance, outperforming all existing top-k planners to date. The code is available at https://github.com/IBM/kstar.
This work will be presented in a Planning and Scheduling (2/3) Session.
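To make the top-k task itself concrete, here is a minimal sketch that enumerates the k cheapest loop-free paths in a small weighted graph with a plain best-first search over partial paths. This is a deliberately naive stand-in, not K* search, Eppstein’s algorithm, or orbit space search; the function name `top_k_paths`, the graph encoding, and the restriction to loop-free paths are assumptions made for illustration.

```python
import heapq


def top_k_paths(graph, start, goal, k):
    """Return up to k cheapest loop-free paths from start to goal.

    graph maps each node to a list of (neighbor, cost) pairs with
    non-negative costs. Results are (total_cost, path) in cost order.
    """
    # Each heap entry is (cost so far, path as a tuple of nodes).
    frontier = [(0, (start,))]
    found = []
    while frontier and len(found) < k:
        cost, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            # With non-negative costs, goal paths are popped cheapest first.
            found.append((cost, list(path)))
            continue
        for neighbor, step_cost in graph.get(node, []):
            if neighbor not in path:  # keep paths loop-free
                heapq.heappush(frontier, (cost + step_cost, path + (neighbor,)))
    return found


graph = {
    "s": [("a", 1), ("b", 4)],
    "a": [("b", 1), ("g", 5)],
    "b": [("g", 1)],
}
plans = top_k_paths(graph, "s", "g", k=2)
```

This brute-force enumeration expands every partial path separately, so it scales poorly; that cost is the kind of overhead that dedicated top-k planners aim to avoid.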
Authors: Michael Katz; Junkyu Lee
- Description:
SiWare is an AI-powered knowledge discovery system that helps unlock new insights and accelerates data-driven decisions with contextualized industrial data. SiWare links and fuses heterogeneous data sources with an industry semantic model, leveraging multiple AI capabilities to provide system-wide visibility into operational characteristics. As part of this demo paper, we describe the requirements for such a system and its deployment aspects, and demonstrate the benefits in two industrial scenarios.
Authors: Anuradha Bhamidipaty; Elham Khabiri; Bhavna Agrawal; Yingjie Li
- Description:
One of the ultimate goals of Artificial Intelligence is to assist humans in complex decision making. A promising direction for achieving this goal is Neuro-Symbolic AI, which aims to combine the interpretability of symbolic techniques with the ability of deep learning to learn from raw data. However, most current approaches require manually engineered symbolic knowledge, and where end-to-end training is considered, such approaches are either unable to learn solutions to problems of computational complexity greater than P, or are restricted to training binary neural networks. In this paper, we introduce Neuro-Symbolic Inductive Learner (NSIL), an approach that trains a general neural network to extract latent concepts from raw data, whilst learning symbolic knowledge that maps latent concepts to target labels. The novelty of our approach is a method for biasing the learning of symbolic knowledge, based on the in-training performance of both neural and symbolic components. We evaluate NSIL on three problem domains of different complexity, including an NP-complete problem. Our results demonstrate that NSIL learns expressive knowledge, solves computationally complex problems, and achieves state-of-the-art performance in terms of accuracy and data efficiency. The code and technical appendix can be found here.
This work will be presented in the **ML: Neuro-symbolic Methods Session**.
Authors: Daniel Cunnington; Mark Law; Jorge Lobo; Alessandra Russo
- Description:
The AI research team at IBM Research Africa has been working on trustworthy models in deep learning for the last six years. On this occasion, we show a method to detect attacks in optical flow models; this is an extension of previous work detecting adversarial attacks in an unconstrained search manner. In recent years, optical flow estimation has significantly improved with advances in deep neural networks. However, these flow networks have recently been shown to be vulnerable to patch-based adversarial attacks, which pose security risks in real-world applications such as self-driving cars and robotics. The team proposes a spatially constrained adversarial attack detection and localization framework. An attacked input sequence is detected via iterative optimization on the features from the inner layers of flow networks without any prior knowledge of the attacks. Our method, described in Figure 2, ensures that the detected anomalous subset of features comes from a local region. We can provide a subset of nodes within a spatial neighborhood that contributes more to the detection, which is then used to localize the attack in the input sequence.
This work was accepted at the main track, and will be presented in the Computer Vision (3/6) Session.
Speakers: Hannah Kim (Applied Scientist, Amazon); Girmaw Abebe Tadesse (Principal Research Scientist and Manager, Microsoft)
- Description:
Agriculture faces unprecedented challenges due to climate change, population growth, and water scarcity. These challenges highlight the need for efficient resource usage to optimize crop production. Conventional techniques for forecasting hydrological response features, such as soil moisture, rely on physics-based and empirical hydrological models, which require significant time and domain expertise. Drawing inspiration from traditional hydrological modeling, a novel temporal graph convolution neural network has been constructed. This involves grouping units based on their time-varying hydrological properties, constructing graph topologies for each cluster based on similarity using dynamic time warping, and utilizing graph convolutions and a gated recurrent neural network to forecast soil moisture. The method has been trained, validated, and tested on field-scale time series data spanning 40 years in the northeastern United States. Results show that using domain-inspired clustering with time series graph neural networks is more effective in forecasting soil moisture than existing models. This framework is being deployed as part of a pro bono social impact program that leverages hybrid cloud and AI technologies to enhance and scale non-profit and government organizations. The trained models are currently being deployed on a series of small-holding farms in central Texas.
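The abstract mentions dynamic time warping (DTW) as the similarity measure used to build graph topologies. The sketch below shows the classic dynamic-programming form of DTW for two one-dimensional series. It is a textbook version written for illustration, not the authors' code; the function name and the absolute-difference local cost are assumptions.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Two series with the same shape but different timing align at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted_level = dtw_distance([0, 0], [1, 1])
```

A pairwise matrix of such distances could then be thresholded or used to pick nearest neighbors when connecting similar units in a graph; how the paper turns distances into edges is not stated in the abstract.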
This work will be presented in the AI for Social Good – ML 2 Session.
Authors: Muneeza Azmat; Malvern Madondo; Arun Bawa; Kelsey Dipietro; Raya Horesh; Michael Jacobs; Raghavan Srinivasan; Fearghal O’Donncha
- Description:
In the reinforcement learning space, the team presents NIAGRA for automated theorem proving tasks. Current approaches use representations of logical statements that often rely on the names used in these statements, and, as a result, the models are generally not transferable from one domain to another. The size of these representations, and whether to include the whole theory or only part of it, are other important decisions that affect the performance of these approaches as well as their runtime efficiency. In this paper, the team presents NIAGRA, an improved Graph Neural Network for learning name-invariant formula representations that is tailored to the unique characteristics of logical formulas and coupled with an efficient ensemble approach for automated theorem proving.
This work will be presented in a Knowledge Representation and Reasoning (4/4) Session.
Speakers: Achille Fokoue-Nkoutche (Research Scientist, IBM Research); Maxwell Crouse (Research Scientist, IBM Research); Shajith Ikbal (Senior Research Scientist, IBM Research); Akihiro Kishimoto (Senior Research Scientist, IBM Research)
- Description:
Planning tasks succinctly represent labeled transition systems, with each ground action corresponding to a label. This granularity, however, is not necessary for solving planning tasks and can be harmful, especially for model-free methods. In order to apply such methods, the label sets are often manually reduced. In this work, we propose automating this manual process. We characterize a valid label reduction for classical planning tasks and propose an automated way of obtaining such valid reductions by leveraging lifted mutex groups. Our experiments show a significant reduction in the action label space size across a wide collection of planning domains. We demonstrate the benefit of our automated label reduction in two separate use cases: improved sample complexity of model-free reinforcement learning algorithms and speeding up successor generation in lifted planning. The code and supplementary material are available at https://github.com/IBM/Parameter-Seed-Set.
This work will be presented in a Planning and Scheduling (2/3) Session.
Authors: Harsha Kokel; Junkyu Lee; Michael Katz; Kavitha Srinivas; Shirin Sohrabi
- Description:
Accurate completion of archaeological artifacts is a critical aspect of several archaeological studies, including documentation of variations in style, inference of chronological and ethnic groups, and trends in trading routes, among many others. However, most available pottery is fragmented, leading to missing textural and morphological cues. Currently, the reassembly and completion of fragmented ceramics is a daunting and time-consuming task, done almost exclusively by hand, which requires the physical manipulation of the fragments. To overcome the challenges of manual reconstruction, reduce the materials' exposure and deterioration, and improve the quality of reconstructed samples, we present IberianVoxel, a novel 3D Autoencoder Generative Adversarial Network (3D AE-GAN) framework tested on an extensive database with complete and fragmented references. We generated a collection of 100$ 3D voxelized samples and their fragmented references from Iberian wheel-made pottery profiles. The fragments generated are stratified into different size groups and across multiple pottery classes. Lastly, we provide quantitative and qualitative assessments to measure the quality of the voxelized samples reconstructed by our proposed method, together with archaeologists' evaluation.
Our team will present this collaboration on robust generative models and cultural heritage in the AI and Arts: Arts, Design and Crafts Session.
Speakers: Pablo Navarro (Research Scientist, Instituto Patagónico de Ciencias Sociales y Humanas (CONICET), Centro Nacional Patagónico); Manuel Lucena (Professor, Research University Institute for Iberian Archaeology, University of Jaén); José Manuel Fuertes (Professor, Research University Institute for Iberian Archaeology, University of Jaén); Antonio Rueda (Professor, Research University Institute for Iberian Archaeology, University of Jaén); Rafael Segura (Professor, Research University Institute for Iberian Archaeology, University of Jaén); Carlos Ogayar-Anguita (Professor, Center for Advanced Studies in Information and Communication Technologies, University of Jaén); Rolando González-José (Research Scientist, Instituto Patagónico de Ciencias Sociales y Humanas (CONICET), Centro Nacional Patagónico); Claudio Delrieux (Professor, Universidad del Sur)
- Description:
Disease progression modeling (DPM) plays an essential role in characterizing patients’ historical progressive pathways and predicting their future risks. Apprenticeship learning (AL) seeks to induce decision-making policies via observing and imitating experts’ demonstrated behaviors. In this paper, we investigate the incorporation of patterns derived from AL for DPM, utilizing a Time-aware Hierarchical EM Energy-based Subsequence (THEMES) AL approach. To the best of our knowledge, this is the first study incorporating AL-derived interventional patterns for DPM, and we evaluate its efficacy on a challenging task of septic shock early prediction. Our results demonstrate that integrating AL-derived intervention patterns can significantly enhance the performance of DPM.
This work will be presented in the DM: Mining Spatial and/or Temporal Data Session.
Authors: Xi Yang; Ge Gao; Min Chi
- Description:
Careful choice of feature transformations in a dataset can help predictive model performance, data understanding, and data exploration. However, finding useful features is a challenge, and while recent Automated Machine Learning (AutoML) systems provide some limited automation for feature engineering or data exploration, it is still mostly done by humans. We demonstrate a system called SemFORMS (Semantic Transforms), which attempts to mine useful expressions for a dataset from a repository of code that may target the same or a similar dataset. In many enterprises, numerous data scientists often work on the same or similar datasets but are largely unaware of each other’s work. SemFORMS finds appropriate code in such a repository and normalizes it into an actionable transform that can be prepended to any AutoML pipeline. We demonstrate SemFORMS operating over example datasets from the OpenML benchmarks, where it sometimes leads to significant improvements in AutoML performance.
Authors: Ibrahim Abdelaziz; Julian Dolby; Udayan Khurana; Horst Samulowitz; Kavitha Srinivas
- Description:
Moderator: F. Amilcar Cardoso (University of Coimbra)
Panelists: Carlos Cancino-Chacón (Johannes Kepler University), Celia Cintas (IBM Research Africa), Jivko Sinapov (Tufts University), Philippe Pasquier (Simon Fraser University)