Week Ending 1.28.2024
RESEARCH WATCH: 1.28.2024
Scientific Large Language Models: A Survey on Biological & Chemical Domains
The paper on scientific language models provides a comprehensive survey of how large language models can be applied to facilitate discoveries in specialized scientific domains like biology and chemistry. These models have the potential to enhance natural language understanding and representation in these fields.
Authors: Qiang Zhang, Keyang Ding, Tianwen Lyv, Xinda Wang, Qingyu Yin, Yiwen Zhang, Jing Yu, Yuhao Wang, Xiaotong Li, Zhuoyi Xiang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Mengyao Zhang, Jinlu Zhang, Jiyu Cui, Renjun Xu, Hongyang Chen, Xiaohui Fan, Huabin Xing, Huajun Chen
Link: https://arxiv.org/abs/2401.14656v1
Date: 2024-01-26
Summary:
Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension, representing a significant stride toward artificial general intelligence. The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines. This growing interest has led to the advent of scientific LLMs, a novel subclass specifically engineered for facilitating scientific discovery. As a burgeoning area in the community of AI for Science, scientific LLMs warrant comprehensive exploration. However, a systematic and up-to-date survey introducing them is currently lacking. In this paper, we endeavor to methodically delineate the concept of "scientific language", whilst providing a thorough review of the latest advancements in scientific LLMs. Given the expansive realm of scientific disciplines, our analysis adopts a focused lens, concentrating on the biological and chemical domains. This includes an in-depth examination of LLMs for textual knowledge, small molecules, macromolecular proteins, genomic sequences, and their combinations, analyzing them in terms of model architectures, capabilities, datasets, and evaluation. Finally, we critically examine the prevailing challenges and point out promising research directions along with the advances of LLMs. By offering a comprehensive overview of technical developments in this field, this survey aspires to be an invaluable resource for researchers navigating the intricate landscape of scientific LLMs.
--------------------------------------------------------------------------------------------------------
Alternative Speech: Complementary Method to Counter-Narrative for Better Discourse
The paper on alternative speech proposes a new technique to directly combat hate speech by providing corrected responses tailored to the context. This method could complement existing approaches and mitigate issues like racial discrimination.
Authors: Seungyoon Lee, Dahyun Jung, Chanjun Park, Seolhwa Lee, Heuiseok Lim
Link: https://arxiv.org/abs/2401.14616v1
Date: 2024-01-26
Summary:
We introduce the concept of "Alternative Speech" as a new way to directly combat hate speech and complement the limitations of counter-narratives. An alternative speech provides practical alternatives to hate speech in real-world scenarios by offering speech-level corrections to speakers while considering the surrounding context and encouraging speakers to reform. Further, an alternative speech can combat hate speech alongside counter-narratives, offering a useful tool to address social issues such as racial discrimination and gender inequality. We propose the new concept and provide detailed guidelines for constructing the necessary dataset. Through discussion, we demonstrate that combining alternative speech and counter-narrative can be a more effective strategy for combating hate speech by complementing the specificity and guiding capacity of counter-narratives. This paper presents another perspective for dealing with hate speech, offering viable remedies to complement the constraints of current approaches to mitigating harmful bias.
--------------------------------------------------------------------------------------------------------
Driving Towards Inclusion: Revisiting In-Vehicle Interaction in Autonomous Vehicles
The paper on interaction design for autonomous vehicles reviews inclusive human-computer interaction to ensure accessible passenger experiences. The proposed design framework could help developers and policymakers enable safe, comfortable rides.
Authors: Ashish Bastola, Julian Brinkley, Hao Wang, Abolfazl Razi
Link: https://arxiv.org/abs/2401.14571v1
Date: 2024-01-26
Summary:
This paper presents a comprehensive literature review of the current state of in-vehicle human-computer interaction (HCI) in the context of self-driving vehicles, with a specific focus on inclusion and accessibility. This study's aim is to examine the user-centered design principles for inclusive HCI in self-driving vehicles, evaluate existing HCI systems, and identify emerging technologies that have the potential to enhance the passenger experience. The paper begins by providing an overview of the current state of self-driving vehicle technology, followed by an examination of the importance of HCI in this context. Next, the paper reviews the existing literature on inclusive HCI design principles and evaluates the effectiveness of current HCI systems in self-driving vehicles. The paper also identifies emerging technologies that have the potential to enhance the passenger experience, such as voice-activated interfaces, haptic feedback systems, and augmented reality displays. Finally, the paper proposes an end-to-end design framework for the development of an inclusive in-vehicle experience, which takes into consideration the needs of all passengers, including those with disabilities or other accessibility requirements. This literature review highlights the importance of user-centered design principles in the development of HCI systems for self-driving vehicles and emphasizes the need for inclusive design to ensure that all passengers can safely and comfortably use these vehicles. The proposed end-to-end design framework provides a practical approach to achieving this goal and can serve as a valuable resource for designers, researchers, and policymakers in this field.
--------------------------------------------------------------------------------------------------------
The paper on PET imaging presents a technique to generate high-quality attenuation maps from low-dose PET, reducing radiation exposure. The method could enable accurate, low-dose PET imaging to improve clinical workflows.
Authors: Bo Zhou, Jun Hou, Tianqi Chen, Yinchi Zhou, Xiongchao Chen, Huidong Xie, Qiong Liu, Xueqi Guo, Yu-Jung Tsai, Vladimir Y. Panin, Takuya Toyonaga, James S. Duncan, Chi Liu
Link: https://arxiv.org/abs/2401.14285v1
Date: 2024-01-25
Summary:
Low-dose PET offers a valuable means of minimizing radiation exposure in PET imaging. However, the prevalent practice of employing additional CT scans for generating attenuation maps (μ-maps) for PET attenuation correction significantly elevates radiation doses. To address this concern and further mitigate radiation exposure in low-dose PET exams, we propose POUR-Net - an innovative population-prior-aided over-under-representation network that aims for high-quality attenuation map generation from low-dose PET. First, POUR-Net incorporates an over-under-representation network (OUR-Net) to facilitate efficient feature extraction, encompassing both low-resolution abstracted and fine-detail features, for assisting deep generation on the full-resolution level. Second, complementing OUR-Net, a population prior generation machine (PPGM) utilizing a comprehensive CT-derived μ-map dataset provides additional prior information to aid OUR-Net generation. The integration of OUR-Net and PPGM within a cascade framework enables iterative refinement of μ-map generation, resulting in the production of high-quality μ-maps. Experimental results underscore the effectiveness of POUR-Net, showing it to be a promising solution for accurate CT-free low-count PET attenuation correction that surpasses the performance of previous baseline methods.
--------------------------------------------------------------------------------------------------------
The paper on long-term conversations leverages commonsense knowledge and context to expand limited persona descriptions. This approach to improving response relevance could benefit conversational agents.
Authors: Hana Kim, Kai Tzu-iunn Ong, Seoyeon Kim, Dongha Lee, Jinyoung Yeo
Link: https://arxiv.org/abs/2401.14215v1
Date: 2024-01-25
Summary:
Memorizing and utilizing speakers' personas is a common practice for response generation in long-term conversations. Yet, human-authored datasets often provide uninformative persona sentences that hinder response quality. This paper presents a novel framework that leverages commonsense-based persona expansion to address such issues in long-term conversation. While prior work focuses on not producing personas that contradict others, we focus on transforming contradictory personas into sentences that contain rich speaker information, by refining them based on their contextual backgrounds with designed strategies. As the pioneer of persona expansion in multi-session settings, our framework facilitates better response generation via human-like persona refinement. The supplementary video of our work is available at https://caffeine-15bbf.web.app/.
--------------------------------------------------------------------------------------------------------
The Boundaries of Tractability in Hierarchical Task Network Planning
The paper on hierarchical task network planning identifies theoretical complexity boundaries for key problems. The work provides insights on achieving tractable solutions for real-world automated planning.
Authors: Cornelius Brand, Robert Ganian, Fionn Mc Inerney, Simon Wietheger
Link: https://arxiv.org/abs/2401.14174v1
Date: 2024-01-25
Summary:
We study the complexity-theoretic boundaries of tractability for three classical problems in the context of Hierarchical Task Network Planning: the validation of a provided plan, whether an executable plan exists, and whether a given state can be reached by some plan. We show that all three problems can be solved in polynomial time on primitive task networks of constant partial order width (and a generalization thereof), whereas for the latter two problems this holds only under a provably necessary restriction to the state space. Next, we obtain an algorithmic meta-theorem along with corresponding lower bounds to identify tight conditions under which general polynomial-time solvability results can be lifted from primitive to general task networks. Finally, we enrich our investigation by analyzing the parameterized complexity of the three considered problems, and show that (1) fixed-parameter tractability for all three problems can be achieved by replacing the partial order width with the vertex cover number of the network as the parameter, and (2) other classical graph-theoretic parameters of the network (including treewidth, treedepth, and the aforementioned partial order width) do not yield fixed-parameter tractability for any of the three problems.
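The first of the three problems, plan validation, is easy to illustrate for primitive task networks. Below is a minimal sketch in Python, assuming a STRIPS-style state model with illustrative precondition/add/delete sets; the paper's formal setting, and its complexity analysis, are more general.

```python
def validate_plan(plan, order, prec, add, dele, init):
    """Toy validator for a primitive task network: check that the plan is a
    linearization consistent with the partial order and executable from the
    initial state (STRIPS-style preconditions and add/delete effects).
    The state model and task names are illustrative."""
    pos = {t: i for i, t in enumerate(plan)}
    if any(pos[a] > pos[b] for a, b in order):   # respect the partial order
        return False
    state = set(init)
    for t in plan:
        if not prec[t] <= state:                 # precondition must hold
            return False
        state = (state - dele[t]) | add[t]       # apply the task's effects
    return True
```

For example, a "pick then place" network validates in order but fails when the linearization violates the ordering constraint.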
--------------------------------------------------------------------------------------------------------
The paper proposes aligning language models with embodied environments through reinforcement learning for improved decision-making. This framework, which requires no prepared datasets or prior knowledge of the environment, could enable more accurate, sample-efficient agents.
Authors: Weihao Tan, Wentao Zhang, Shanqi Liu, Longtao Zheng, Xinrun Wang, Bo An
Link: https://arxiv.org/abs/2401.14151v1
Date: 2024-01-25
Summary:
Despite the impressive performance across numerous tasks, large language models (LLMs) often fail in solving simple decision-making tasks due to the misalignment of the knowledge in LLMs with environments. On the contrary, reinforcement learning (RL) agents learn policies from scratch, which makes them always align with environments but difficult to incorporate prior knowledge for efficient explorations. To narrow the gap, we propose TWOSOME, a novel general online framework that deploys LLMs as decision-making agents to efficiently interact and align with embodied environments via RL without requiring any prepared datasets or prior knowledge of the environments. Firstly, we query the joint probabilities of each valid action with LLMs to form behavior policies. Then, to enhance the stability and robustness of the policies, we propose two normalization methods and summarize four prompt design principles. Finally, we design a novel parameter-efficient training architecture where the actor and critic share one frozen LLM equipped with low-rank adapters (LoRA) updated by PPO. We conduct extensive experiments to evaluate TWOSOME. i) TWOSOME exhibits significantly better sample efficiency and performance compared to the conventional RL method, PPO, and prompt tuning method, SayCan, in both classical decision-making environment, Overcooked, and simulated household environment, VirtualHome. ii) Benefiting from LLMs' open-vocabulary feature, TWOSOME shows superior generalization ability to unseen tasks. iii) Under our framework, there is no significant loss of the LLMs' original ability during online PPO finetuning.
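The policy-formation step described above, querying the probability of each valid action and normalizing, can be sketched as follows. This is an illustration, not the paper's implementation: the per-token log-probs are assumed given, and dividing by token count stands in for the paper's normalization methods, whose exact forms differ.

```python
import math

def action_policy(action_token_logprobs, length_normalize=True):
    """Form a behavior policy over valid actions from per-token log-probs.

    action_token_logprobs: dict mapping an action phrase to the list of
    log-probabilities the LM assigned to its tokens. With
    length_normalize=True, each joint log-prob is divided by its token
    count so longer action phrases are not unfairly penalized."""
    scores = {}
    for action, logps in action_token_logprobs.items():
        joint = sum(logps)  # log of the joint probability of the phrase
        scores[action] = joint / len(logps) if length_normalize else joint
    # softmax over scores to obtain a proper distribution over actions
    m = max(scores.values())
    exps = {a: math.exp(s - m) for a, s in scores.items()}
    z = sum(exps.values())
    return {a: e / z for a, e in exps.items()}
```

Without normalization, a long phrase like "pick up the onion" would be dominated by short ones purely because it accumulates more negative log-probabilities.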
--------------------------------------------------------------------------------------------------------
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
The paper introduces a system to efficiently serve quantized language models using custom GPU kernels with Tensor Core support. It could achieve practical speedups for deploying large models in applications.
Authors: Haojun Xia, Zhen Zheng, Xiaoxia Wu, Shiyang Chen, Zhewei Yao, Stephen Youn, Arash Bakhtiari, Michael Wyatt, Donglin Zhuang, Zhongzhu Zhou, Olatunji Ruwase, Yuxiong He, Shuaiwen Leon Song
Link: https://arxiv.org/abs/2401.14112v1
Date: 2024-01-25
Summary:
Six-bit quantization (FP6) can effectively reduce the size of large language models (LLMs) and preserve the model quality consistently across varied applications. However, existing systems do not provide Tensor Core support for FP6 quantization and struggle to achieve practical performance improvements during LLM inference. It is challenging to support FP6 quantization on GPUs due to (1) unfriendly memory access of model weights with irregular bit-width and (2) high runtime overhead of weight de-quantization. To address these problems, we propose TC-FPx, the first full-stack GPU kernel design scheme with unified Tensor Core support of floating-point weights for various quantization bit-widths. We integrate the TC-FPx kernel into an existing inference system, providing new end-to-end support (called FP6-LLM) for quantized LLM inference, where better trade-offs between inference cost and model quality are achieved. Experiments show that FP6-LLM enables the inference of LLaMA-70b using only a single GPU, achieving 1.69x-2.65x higher normalized inference throughput than the FP16 baseline. The source code will be publicly available soon.
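At the numeric level, FP6 quantization means rounding each weight onto a 6-bit floating-point grid. A minimal sketch, assuming an illustrative 1-bit sign / 3-bit exponent / 2-bit mantissa split with subnormals; the exact format choices and the Tensor Core kernel design are the paper's contribution and are not reproduced here:

```python
def fp6_grid(exp_bits=3, man_bits=2):
    """Enumerate every value representable in a simple FP6 format:
    1 sign bit, `exp_bits` exponent bits, `man_bits` mantissa bits,
    IEEE-style bias, subnormals included, no inf/NaN encodings.
    This is an assumed illustrative format."""
    bias = 2 ** (exp_bits - 1) - 1
    vals = set()
    for e in range(2 ** exp_bits):
        for m in range(2 ** man_bits):
            if e == 0:  # subnormal: no implicit leading 1
                v = (m / 2 ** man_bits) * 2 ** (1 - bias)
            else:       # normal: implicit leading 1
                v = (1 + m / 2 ** man_bits) * 2 ** (e - bias)
            vals.add(v)
            vals.add(-v)
    return sorted(vals)

def quantize_fp6(x, grid=None):
    """Round x to the nearest representable FP6 value."""
    grid = grid or fp6_grid()
    return min(grid, key=lambda v: abs(v - x))
```

Under this assumed format the largest magnitude is 28.0, which is why per-channel scaling typically precedes quantization in real systems.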
--------------------------------------------------------------------------------------------------------
The paper applies a feature selection technique to identify relevant inputs for wind speed prediction using neural networks. The model could support forecasting for wind energy applications.
Authors: Hasmat Malik, Amit Kumar Yadav, Fausto Pedro García Márquez, Jesús María Pinar-Pérez
Link: https://arxiv.org/abs/2401.14065v1
Date: 2024-01-25
Summary:
Wind power generation is non-schedulable due to the stochastic nature of meteorological variables. Energy trading and control of wind power generation therefore require prediction of wind speed (WS) from a few seconds to several time steps in advance. To deal with prediction shortcomings, various WS prediction methods have been used. Predictive data mining offers a variety of methods for WS prediction, among which the artificial neural network (ANN) is one of the most reliable and accurate. The results of this study show that ANNs give better accuracy than conventional models. The accuracy of WS prediction models is found to depend on the input parameters and the architecture of the algorithms used, so the selection of the most relevant input parameters is an important research area in WS prediction. The objective of the paper is twofold: first, an extensive review of ANNs for wind power and WS prediction is carried out. Then, feature selection using the Relief Algorithm (RA) for WS prediction is discussed and analyzed for different Indian sites. RA identifies atmospheric pressure, solar radiation, and relative humidity as the relevant input variables. Based on these variables, a cascade ANN model is developed and its prediction accuracy is evaluated. The root mean square error (RMSE) between predicted and measured WS is found to be 1.44 m/s for training and 1.49 m/s for testing. The developed cascade ANN model can be used to predict wind speed at Indian sites where no WS measuring instruments are installed.
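The Relief Algorithm mentioned above weights each feature by how well it separates nearest neighbors of different classes. Below is a minimal sketch of the classic binary-classification variant; a regression target such as wind speed would use the RReliefF extension, so treat this as an illustration of the idea rather than the paper's exact procedure.

```python
import numpy as np

def relief(X, y, n_iter=None, rng=None):
    """Basic Relief feature weighting for binary labels y and a feature
    matrix X (features roughly on a common scale). For each sampled
    instance, the nearest same-class neighbor (hit) and nearest
    other-class neighbor (miss) pull the weights: features that differ
    at the miss but agree at the hit gain weight."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    n_iter = n_iter or n
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        x, c = X[i], y[i]
        dists = np.abs(X - x).sum(axis=1)   # L1 distance to every instance
        dists[i] = np.inf                   # exclude the instance itself
        same = (y == c) & (dists < np.inf)
        hit = np.argmin(np.where(same, dists, np.inf))
        miss = np.argmin(np.where(y != c, dists, np.inf))
        w += np.abs(x - X[miss]) - np.abs(x - X[hit])
    return w / n_iter
```

On synthetic data where only the first feature tracks the label, the first weight dominates, which is the behavior the paper relies on to rank meteorological inputs.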
--------------------------------------------------------------------------------------------------------
Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution
The paper puts forth a strategy for agents to consolidate and reuse prior experience across tasks. This self-evolution approach may increase robustness while reducing API calls and the demand for model capability.
Authors: Cheng Qian, Shihao Liang, Yujia Qin, Yining Ye, Xin Cong, Yankai Lin, Yesai Wu, Zhiyuan Liu, Maosong Sun
Link: https://arxiv.org/abs/2401.13996v1
Date: 2024-01-25
Summary:
This paper introduces Investigate-Consolidate-Exploit (ICE), a novel strategy for enhancing the adaptability and flexibility of AI agents through inter-task self-evolution. Unlike existing methods focused on intra-task learning, ICE promotes the transfer of knowledge between tasks for genuine self-evolution, similar to human experience learning. The strategy dynamically investigates planning and execution trajectories, consolidates them into simplified workflows and pipelines, and exploits them for improved task execution. Our experiments on the XAgent framework demonstrate ICE's effectiveness, reducing API calls by as much as 80% and significantly decreasing the demand for the model's capability. Specifically, when combined with GPT-3.5, ICE's performance matches that of raw GPT-4 across various agent tasks. We argue that this self-evolution approach represents a paradigm shift in agent design, contributing to a more robust AI community and ecosystem, and moving a step closer to full autonomy.
--------------------------------------------------------------------------------------------------------
Investigating the Efficacy of Large Language Models for Code Clone Detection
The paper explores using large language models like ChatGPT for code clone detection, a non-generative task. By testing different prompts, the method shows promise for finding duplicated code across programming languages. This could enhance code analysis and maintenance.
Authors: Mohamad Khajezade, Jie Wu, Fatemeh Hendijani Fard, Gema Rodríguez-Pérez, Mohamed Sami Shehata
Link: https://arxiv.org/abs/2401.13802v1
Date: 2024-01-24
Summary:
Large Language Models (LLMs) have demonstrated remarkable success in various natural language processing and software engineering tasks, such as code generation. LLMs are mainly utilized in the prompt-based zero/few-shot paradigm to guide the model in accomplishing the task. GPT-based models are among the popular ones studied for tasks such as code comment generation or test generation. These are 'generative' tasks. However, there is limited research on the usage of LLMs for 'non-generative' tasks such as classification using the prompt-based paradigm. In this preliminary exploratory study, we investigated the applicability of LLMs for Code Clone Detection (CCD), a non-generative task. By building a mono-lingual and cross-lingual CCD dataset derived from CodeNet, we first investigated two different prompts using ChatGPT to detect Type-4 code clones in Java-Java and Java-Ruby pairs in a zero-shot setting. We then conducted an analysis to understand the strengths and weaknesses of ChatGPT in CCD. ChatGPT surpasses the baselines in cross-language CCD, attaining an F1-score of 0.877, and achieves comparable performance to fully fine-tuned models for mono-lingual CCD, with an F1-score of 0.878. Also, the prompt and the difficulty level of the problems have an impact on the performance of ChatGPT. Finally, we provide insights and future directions based on our initial analysis.
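The prompt-based zero-shot setup amounts to assembling a classification question around the two snippets. A sketch of such a prompt builder follows; the wording is illustrative, not one of the two prompts the paper actually evaluated:

```python
def clone_detection_prompt(code_a, code_b):
    """Assemble a zero-shot prompt asking an LLM whether two snippets are
    semantic (Type-4) code clones. The phrasing here is an assumed
    example, not the paper's exact prompt."""
    return (
        "Determine whether the following two code snippets solve the same "
        "problem, i.e. whether they are Type-4 code clones. "
        "Answer 'yes' or 'no'.\n\n"
        f"Snippet 1:\n{code_a}\n\n"
        f"Snippet 2:\n{code_b}\n"
    )
```

The same builder works for cross-language pairs (e.g. a Java snippet as `code_a` and a Ruby snippet as `code_b`), which is the setting where ChatGPT beat the baselines.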
--------------------------------------------------------------------------------------------------------
Multi-Agent Diagnostics for Robustness via Illuminated Diversity
The paper addresses vulnerabilities in multi-agent reinforcement learning systems. It puts forth an approach to systematically test policies in diverse adversarial scenarios. Identifying weaknesses could lead to more robust real-world applications like coordination of autonomous vehicles.
Authors: Mikayel Samvelyan, Davide Paglieri, Minqi Jiang, Jack Parker-Holder, Tim Rocktäschel
Link: https://arxiv.org/abs/2401.13460v1
Date: 2024-01-24
Summary:
In the rapidly advancing field of multi-agent systems, ensuring robustness in unfamiliar and adversarial settings is crucial. Notwithstanding their outstanding performance in familiar environments, these systems often falter in new situations due to overfitting during the training phase. This is especially pronounced in settings where both cooperative and competitive behaviours are present, encapsulating a dual nature of overfitting and generalisation challenges. To address this issue, we present Multi-Agent Diagnostics for Robustness via Illuminated Diversity (MADRID), a novel approach for generating diverse adversarial scenarios that expose strategic vulnerabilities in pre-trained multi-agent policies. Leveraging the concepts from open-ended learning, MADRID navigates the vast space of adversarial settings, employing a target policy's regret to gauge the vulnerabilities of these settings. We evaluate the effectiveness of MADRID on the 11vs11 version of Google Research Football, one of the most complex environments for multi-agent reinforcement learning. Specifically, we employ MADRID for generating a diverse array of adversarial settings for TiZero, the state-of-the-art approach which "masters" the game through 45 days of training on a large-scale distributed infrastructure. We expose key shortcomings in TiZero's tactical decision-making, underlining the crucial importance of rigorous evaluation in multi-agent systems.
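MADRID's regret-guided generation can be caricatured as an evolutionary loop that keeps the scenarios on which the target policy falls furthest below a reference. The toy sketch below makes that concrete; the actual method uses open-ended, MAP-Elites-style search over Football scenarios, and `regret` and `mutate` here are problem-specific stand-ins.

```python
import random

def madrid_style_search(init, regret, mutate, generations=100, seed=0):
    """Minimal regret-guided evolutionary search in the spirit of MADRID:
    maintain an archive of scenarios, mutate the highest-regret one, and
    keep mutants whose regret (the target policy's shortfall versus a
    reference) increases."""
    rng = random.Random(seed)
    archive = list(init)
    for _ in range(generations):
        parent = max(archive, key=regret)   # most revealing scenario so far
        child = mutate(parent, rng)
        if regret(child) > regret(parent):  # retain only improving mutants
            archive.append(child)
    return max(archive, key=regret)
```

With a scalar "scenario" and a concave regret landscape, the loop climbs toward the setting that most embarrasses the target policy.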
--------------------------------------------------------------------------------------------------------
Beyond Accuracy-Fairness: Stop evaluating bias mitigation methods solely on between-group metrics
The paper argues prevailing fairness metrics don't fully capture model impacts within subgroups. It advocates first generating the most precise ranking for each subgroup and then selecting candidates from those rankings to meet fairness standards. Adopting this process could improve equity in high-stakes predictions.
Authors: Sofie Goethals, Toon Calders, David Martens
Link: https://arxiv.org/abs/2401.13391v1
Date: 2024-01-24
Summary:
Artificial Intelligence (AI) finds widespread applications across various domains, sparking concerns about fairness in its deployment. While fairness in AI remains a central concern, the prevailing discourse often emphasizes outcome-based metrics without a nuanced consideration of the differential impacts within subgroups. Bias mitigation techniques not only affect the ranking of pairs of instances across sensitive groups, but often also significantly affect the ranking of instances within these groups. Such changes are hard to explain and raise concerns regarding the validity of the intervention. Unfortunately, these effects largely remain under the radar in the accuracy-fairness evaluation framework that is usually applied. This paper challenges the prevailing metrics for assessing bias mitigation techniques, arguing that they do not take into account changes within groups and that the resulting prediction labels fall short of reflecting real-world scenarios. We propose a paradigm shift: initially, we should focus on generating the most precise ranking for each subgroup. Following this, individuals should be chosen from these rankings to meet both fairness standards and practical considerations.
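The proposed two-stage process, rank within each subgroup first and select afterward, can be sketched as follows. Per-group quotas stand in here for whatever fairness standard and practical considerations apply; they are an assumption of the sketch, not the paper's prescription.

```python
def select_candidates(scores, groups, k_per_group):
    """Two-stage selection: rank individuals within each sensitive group
    by predicted score, then take each group's top candidates to fill a
    per-group quota. Returns the selected indices, sorted."""
    ranked = {}
    for idx, g in enumerate(groups):
        ranked.setdefault(g, []).append(idx)
    for g in ranked:
        # within-group ranking: best predicted score first
        ranked[g].sort(key=lambda i: scores[i], reverse=True)
    selected = []
    for g, quota in k_per_group.items():
        selected.extend(ranked[g][:quota])
    return sorted(selected)
```

Because selection happens only after each group's internal ranking is fixed, an intervention can change quotas without reshuffling individuals within a group, which is exactly the within-group stability the paper argues current evaluations ignore.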
--------------------------------------------------------------------------------------------------------
No Longer Trending on Artstation: Prompt Analysis of Generative AI Art
The paper analyzes millions of text prompts for AI image generation. It finds the prompts focus heavily on superficial attributes versus deeper meaning. The study suggests current systems promote conventional styles rather than empowering users creatively.
Authors: Jon McCormack, Maria Teresa Llano, Stephen James Krol, Nina Rajcic
Link: https://arxiv.org/abs/2401.14425v1
Date: 2024-01-24
Summary:
Image generation using generative AI is rapidly becoming a major new source of visual media, with billions of AI generated images created using diffusion models such as Stable Diffusion and Midjourney over the last few years. In this paper we collect and analyse over 3 million prompts and the images they generate. Using natural language processing, topic analysis and visualisation methods we aim to understand collectively how people are using text prompts, the impact of these systems on artists, and more broadly on the visual cultures they promote. Our study shows that prompting focuses largely on surface aesthetics, reinforcing cultural norms, popular conventional representations and imagery. We also find that many users focus on popular topics (such as making colouring books, fantasy art, or Christmas cards), suggesting that the dominant use for the systems analysed is recreational rather than artistic.
--------------------------------------------------------------------------------------------------------
XAI for All: Can Large Language Models Simplify Explainable AI?
The paper introduces a method to simplify explanations from complex AI systems for non-experts. By tailoring outputs for different audiences, the approach makes techniques more accessible. This could support trust and adoption of AI across domains.
Authors: Philip Mavrepis, Georgios Makridis, Georgios Fatouros, Vasileios Koukos, Maria Margarita Separdani, Dimosthenis Kyriazis
Link: https://arxiv.org/abs/2401.13110v1
Date: 2024-01-23
Summary:
The field of Explainable Artificial Intelligence (XAI) often focuses on users with a strong technical background, making it challenging for non-experts to understand XAI methods. This paper presents "x-[plAIn]", a new approach to make XAI more accessible to a wider audience through a custom Large Language Model (LLM), developed using ChatGPT Builder. Our goal was to design a model that can generate clear, concise summaries of various XAI methods, tailored for different audiences, including business professionals and academics. The key feature of our model is its ability to adapt explanations to match each audience group's knowledge level and interests. Our approach still offers timely insights, facilitating the decision-making process by the end users. Results from our use-case studies show that our model is effective in providing easy-to-understand, audience-specific explanations, regardless of the XAI method used. This adaptability improves the accessibility of XAI, bridging the gap between complex AI technologies and their practical applications. Our findings indicate a promising direction for LLMs in making advanced AI concepts more accessible to a diverse range of users.
--------------------------------------------------------------------------------------------------------
Sparse identification of nonlinear dynamics in the presence of library and system uncertainty
The paper demonstrates an improved algorithm for identifying dynamical system equations from data. By addressing uncertainties in variables and mathematical representations, it could expand applications in control systems, robotics, and other physical domains.
Authors: Andrew O'Brien
Link: https://arxiv.org/abs/2401.13099v1
Date: 2024-01-23
Summary:
The SINDy algorithm has been successfully used to identify the governing equations of dynamical systems from time series data. However, SINDy assumes the user has prior knowledge of the variables in the system and of a function library that can act as a basis for the system. In this paper, we demonstrate on real world data how the Augmented SINDy algorithm outperforms SINDy in the presence of system variable uncertainty. We then show SINDy can be further augmented to perform robustly when both kinds of uncertainty are present.
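At SINDy's core is sequentially thresholded least squares over a library of candidate functions: fit dx/dt = Θ(x)ξ, zero out small coefficients, and refit on the surviving terms. A minimal sketch of that regression step (the Augmented SINDy machinery for handling variable and library uncertainty is beyond this illustration):

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: the sparse regression at
    the heart of SINDy. theta is the library matrix Theta(x) evaluated on
    the data; dxdt holds the measured derivatives."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0                 # prune negligible library terms
        big = ~small
        if big.any():                   # refit on the surviving terms
            xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi
```

For data generated by dx/dt = -2x with a library of {1, x, x²}, the recovered coefficient vector is (0, -2, 0), i.e. the governing equation is identified exactly.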
--------------------------------------------------------------------------------------------------------
Free Form Medical Visual Question Answering in Radiology
The paper presents an approach to visual question answering for radiology images. By advancing medical image understanding and generating textual responses, the method could assist clinicians' decision-making and patient care.
Authors: Abhishek Narayanan, Rushabh Musthyala, Rahul Sankar, Anirudh Prasad Nistala, Pranav Singh, Jacopo Cirrone
Link: https://arxiv.org/abs/2401.13081v1
Date: 2024-01-23
Summary:
Visual Question Answering (VQA) in the medical domain presents a unique, interdisciplinary challenge, combining fields such as Computer Vision, Natural Language Processing, and Knowledge Representation. Despite its importance, research in medical VQA has been scant, only gaining momentum since 2018. Addressing this gap, our research delves into the effective representation of radiology images and the joint learning of multimodal representations, surpassing existing methods. We innovatively augment the SLAKE dataset, enabling our model to respond to a more diverse array of questions, not limited to the immediate content of radiology or pathology images. Our model achieves a top-1 accuracy of 79.55\% with a less complex architecture, demonstrating comparable performance to current state-of-the-art models. This research not only advances medical VQA but also opens avenues for practical applications in diagnostic settings.
--------------------------------------------------------------------------------------------------------
The paper discusses ethical issues with using AI for organizing information about historical atrocities. It advocates balancing virtues like beneficence and respect to preserve memory responsibly. Adopting guidelines proposed could enable development of tools that promote education and healing.
Authors: Mykola Makhortykh
Link: https://arxiv.org/abs/2401.13079v1
Date: 2024-01-23
Summary:
The growing application of artificial intelligence (AI) in the field of information retrieval (IR) affects different domains, including cultural heritage. By facilitating organisation and retrieval of large volumes of heritage-related content, AI-driven IR systems inform users about a broad range of historical phenomena, including genocides (e.g. the Holocaust). However, it is currently unclear to what degree IR systems are capable of dealing with multiple ethical challenges associated with the curation of genocide-related information. To address this question, this chapter provides an overview of ethical challenges associated with the human curation of genocide-related information using a three-part framework inspired by Belmont criteria (i.e. curation challenges associated with respect for individuals, beneficence and justice/fairness). Then, the chapter discusses to what degree the above-mentioned challenges are applicable to the ways in which AI-driven IR systems deal with genocide-related information and what can be the potential ways of bridging AI and memory ethics in this context.
--------------------------------------------------------------------------------------------------------
The paper introduces a technique leveraging transfer learning and ensembling to address limited training data for Question Answering on Quranic text. Improving performance on these datasets could support the development of assistive tools for Islamic religious study.
Authors: Mohammed Alaa Elkomy, Amany Sarhan
Link: https://arxiv.org/abs/2401.13060v1
Date: 2024-01-23
Summary:
In this paper, we present our approach to tackle Qur'an QA 2023 shared tasks A and B. To address the challenge of low-resourced training data, we rely on transfer learning together with a voting ensemble to improve prediction stability across multiple runs. Additionally, we employ different architectures and learning mechanisms for a range of Arabic pre-trained transformer-based models for both tasks. To identify unanswerable questions, we propose using a thresholding mechanism. Our top-performing systems greatly surpass the baseline performance on the hidden split, achieving a MAP score of 25.05% for task A and a partial Average Precision (pAP) of 57.11% for task B.
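The abstract's two core mechanisms are a voting ensemble across multiple runs and a threshold for flagging unanswerable questions. The sketch below illustrates that combination under stated assumptions: span names, scores, and the threshold value are all hypothetical, and the averaging scheme is one simple form of voting, not necessarily the authors' exact aggregation.

```python
from statistics import mean

# Hypothetical span scores from three fine-tuned model runs for one question:
# each run maps candidate answer spans to a confidence score.
run_scores = [
    {"span_a": 0.62, "span_b": 0.30, "span_c": 0.08},
    {"span_a": 0.55, "span_b": 0.35, "span_c": 0.10},
    {"span_a": 0.70, "span_b": 0.20, "span_c": 0.10},
]

NO_ANSWER_THRESHOLD = 0.5  # would be tuned on dev data in practice (assumption)

def ensemble_predict(runs: list[dict], threshold: float):
    """Average span scores across runs (a simple voting ensemble),
    then apply a threshold to flag unanswerable questions."""
    spans = runs[0].keys()
    avg = {s: mean(r[s] for r in runs) for s in spans}
    best_span, best_score = max(avg.items(), key=lambda kv: kv[1])
    if best_score < threshold:
        return None, best_score  # treated as unanswerable
    return best_span, best_score

pred, score = ensemble_predict(run_scores, NO_ANSWER_THRESHOLD)
print(pred, round(score, 3))
```

Averaging across runs also addresses the prediction-stability problem the abstract mentions: a single fine-tuning run on low-resource data can vary noticeably between seeds, while the ensemble's mean score is less sensitive to any one run.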
--------------------------------------------------------------------------------------------------------
Enabling Global Image Data Sharing in the Life Sciences
The white paper makes recommendations for collaborative frameworks to share biological imaging data globally. Enabling open access to research images could accelerate discoveries and applications in healthcare.
Authors: Peter Bajcsy, Sreenivas Bhattiprolu, Katy Borner, Beth Cimini, Lucy Collinson, Jan Ellenberg, Reto Fiolka, Maryellen Giger, Wojtek Goscinski, Matthew Hartley, Nathan Hotaling, Rick Horwitz, Florian Jug, Anna Kreshuk, Emma Lundberg, Aastha Mathur, Kedar Narayan, Shuichi Onami, Anne L. Plant, Fred Prior, Jason Swedlow, Adam Taylor, Antje Keppler
Link: https://arxiv.org/abs/2401.13023v1
Date: 2024-01-23
Summary:
Coordinated collaboration is essential to realize the added value of and infrastructure requirements for global image data sharing in the life sciences. In this White Paper, we take a first step at presenting some of the most common use cases as well as critical/emerging use cases of (including the use of artificial intelligence for) biological and medical image data, which would benefit tremendously from better frameworks for sharing (including technical, resourcing, legal, and ethical aspects). In the second half of this paper, we paint an ideal world scenario for how global image data sharing could work and benefit all life sciences and beyond. As this is still a long way off, we conclude by suggesting several concrete measures directed toward our institutions, existing imaging communities and data initiatives, and national funders, as well as publishers. Our vision is that within the next ten years, most researchers in the world will be able to make their datasets openly available and use quality image data of interest to them for their research and benefit. This paper is published in parallel with a companion White Paper entitled Harmonizing the Generation and Pre-publication Stewardship of FAIR Image Data, which addresses challenges and opportunities related to producing well-documented and high-quality image data that is ready to be shared. The driving goal is to address remaining challenges and democratize access to everyday practices and tools for a spectrum of biomedical researchers, regardless of their expertise, access to resources, and geographical location.
--------------------------------------------------------------------------------------------------------