Colloquium Archive

Toxicity in Two On-line Platforms: Dissenter and GitHub

Robert Beverly and Erik Rye
Naval Postgraduate School


Understanding the efforts of community-driven on-line platforms to enforce legal and policy-based norms is especially timely as these services rise in prominence and importance. For instance, recent actions to curb toxicity in such platforms have driven debate on censorship, de-platforming, and the emergence of unrestricted alternatives.

In this talk, we present a data-driven analysis of toxicity in two on-line platforms: the fringe "Dissenter" overlay and the widely used GitHub code collaboration service. Dissenter is a browser that provides a conversational overlay for any web page, thereby removing the power of content creators to control conversation on their content. In our IMC 2020 work, we obtain the full history of Dissenter comments, users, and websites being discussed, and analyze more than 1.6M comments to characterize users, toxicity, and the conditional probability of hateful comments given website political bias.
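
As a rough illustration of the last quantity, the conditional probability of a hateful comment given a site's political bias can be estimated by counting. The sketch below is not the authors' pipeline: the bias labels, the boolean hateful flag, and the toy data are all hypothetical, and it assumes some upstream classifier has already labeled each comment.

```python
from collections import Counter


def conditional_hate_rate(comments):
    """Estimate P(hateful | bias) from (bias_label, is_hateful) pairs."""
    totals = Counter()   # comments seen per bias label
    hateful = Counter()  # hateful comments seen per bias label
    for bias, is_hateful in comments:
        totals[bias] += 1
        if is_hateful:
            hateful[bias] += 1
    # Relative frequency of hateful comments within each bias group.
    return {bias: hateful[bias] / totals[bias] for bias in totals}


# Hypothetical toy data: (site bias label, was the comment flagged hateful?)
sample = [
    ("left", False), ("left", True), ("left", False), ("left", False),
    ("right", False), ("right", False),
    ("center", True), ("center", False),
]

rates = conditional_hate_rate(sample)
for bias, rate in sorted(rates.items()):
    print(f"P(hateful | {bias}) = {rate:.2f}")
```

With real data, each group would also need enough comments for the estimate to be meaningful; the toy counts here are far too small for that.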

Last, we describe our initial work in mining and characterizing non-inclusive language and toxicity in program code and commit messages publicly available on GitHub. Rather than residing in a fringe community, toxicity on GitHub may have important implications for broadening participation in STEM.

How does AI Perceive You?

Nina Marhamati
Assistant Professor
Computer Science, Sonoma State University


Have you ever wondered how Siri, Alexa, or any chat-bot perceives you? We are interested in answering this question by making it possible to receive feedback from interactions with AI. Understanding the human message and emotion when approaching a machine is a valuable psychological experience and can tell us a lot about ourselves and our society. In collaboration with the Art Department, we are using AI methods to create a reaction to people’s interaction with machines. The interaction received through sensory inputs is processed by the machine and the content of the interaction is decoded. Using deep learning models for natural language processing, the emotional state of the interaction is detected from the content. The detected state is used to visualize, in 2D or 3D, what the machine has perceived from the interaction. The final product will give the user a personalized interaction with the machine through a piece of art that reflects the user's emotion. This product can be combined with VR equipment to provide a more realistic experience for game players, online training participants, or people interacting with chat-bots.

Juneau: Managing and Guiding Data Science

Zachary Ives
Adani President's Distinguished Professor and Department Chair, Computer and Information Science Department
University of Pennsylvania


How do we promote large-scale data science and data sharing, e.g., in the sciences or across organizations? Many modern data science applications have been leveraging data lakes: schema-agnostic repositories of data files and data products, which offer limited organization and management capabilities. There is a need to build a new generation of data science environments, which leverage data lakes so scientists and analysts can find tables, schemas, workflows, and datasets useful to their task at hand. Juneau incorporates search and management solutions into the Jupyter Notebook data science platform, to enable scientists to augment training data, find potential features to extract, clean data, and find joinable or linkable tables. Our core methods also generalize to other settings where computational tasks involve execution of programs or scripts.
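
To give a feel for what finding joinable tables can involve, here is a minimal sketch that scores column pairs by the overlap of their values. It is a generic illustration, not Juneau's method, which the abstract does not detail; the function names, the Jaccard measure, the 0.5 threshold, and the toy tables are all assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of values, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def joinable_columns(left, right, threshold=0.5):
    """Return (left_col, right_col, score) for column pairs whose values overlap.

    Tables are plain dicts mapping column name -> list of values.
    """
    matches = []
    for lname, lvals in left.items():
        for rname, rvals in right.items():
            score = jaccard(lvals, rvals)
            if score >= threshold:
                matches.append((lname, rname, score))
    # Best candidates first.
    return sorted(matches, key=lambda m: m[2], reverse=True)


# Hypothetical toy tables.
patients = {"patient_id": [1, 2, 3, 4], "age": [34, 51, 29, 62]}
visits = {"pid": [2, 3, 4, 5], "cost": [120, 80, 300, 45]}

candidates = joinable_columns(patients, visits)
print(candidates)
```

Comparing every column pair exactly does not scale to a large data lake; a practical system would presumably rely on indexing or sketching techniques instead.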

How We Got Here: A brief history of the evolution of parallel processing in high-performance computing

David Barkai
HPC Consultant


HPC has undergone several dramatic changes in system architecture over the last 50 years: from the pre-1970s mainframe, to the vector processors and multiprocessors of the 1980s, followed by the emergence of the “killer micros” and the shift to commodity processors and clusters; and from shared-memory to distributed-memory systems, with the addition of accelerators (GPUs). Along the way, we track the transition from single-threaded program execution to ever-increasing levels of parallelism, and its implications for software tools and application end users. HPC today is much more than the domain of numerical simulations; it also includes data analytics and AI.

Addressing Climate Change through Drone Swarms

Emily Spahn
Software Engineer


Climate change is a major issue of concern worldwide. Trees are currently among the best ways to capture carbon. DroneSeed has been a leader in mass reforestation by drone, and is uniquely able to reforest after wildfires. We'll explore the evolution of technological needs to support this goal. Let's talk about how we went from the idea of combining biology with the emerging drone industry, to arrive at the realities of a startup putting seeds on the ground in post-wildfire environments.