CS Colloquium

Spring 2021

Presented by the Computer Science Department
Wednesdays 12:00 - 12:50pm, Online
All lectures free and open to all: Zoom link

Toxicity in Two On-line Platforms: Dissenter and GitHub

Robert Beverly and Erik Rye
Naval Postgraduate School

02/03/2021

Understanding efforts by community-driven on-line platforms to enforce legal and policy-based norms is especially timely as these services rise in prominence and importance. For instance, recent actions to curb toxicity in such platforms have driven debate on censorship, de-platforming, and the emergence of unrestricted alternatives.

In this talk, we present a data-driven analysis of toxicity in two on-line platforms: the fringe "Dissenter" overlay and the widely used GitHub code collaboration service. Dissenter is a browser that provides a conversational overlay for any web page -- thereby removing the power of content creators to control conversation on their content. In our IMC 2020 work, we obtain the full history of Dissenter comments, users, and websites being discussed, and analyze more than 1.6M comments to characterize users, toxicity, and the conditional probability of hateful comments given website political bias.

Last, we describe our initial work in mining and characterizing non-inclusive language and toxicity in program code and commit messages publicly available on GitHub. Rather than residing in a fringe community, toxicity on GitHub may have important implications for broadening participation in STEM.

How does AI Perceive You?

Nina Marhamati
Assistant Professor
Computer Science, Sonoma State University

02/10/2021

Have you ever wondered how Siri, Alexa, or any chat-bot perceives you? We are interested in answering this question by making it possible to receive feedback from interactions with AI. Understanding the human message and emotion when approaching a machine is a valuable psychological experience and can tell us a lot about ourselves and our society. In collaboration with the Art Department, we are using AI methods to create a reaction to people’s interaction with machines. The interaction received through sensory inputs is processed by the machine and the content of the interaction is decoded. Using deep learning models for natural language processing, the emotional state of the interaction is detected from the content. The detected state is used to visualize, in 2D or 3D, what the machine has perceived from the interaction. The final product will give the user a personalized interaction with the machine through a piece of art that reflects the user's emotion. This product can be combined with VR equipment to provide a more realistic experience for game players and online training participants, or for interactions with chat-bots.

Juneau: Managing and Guiding Data Science

Zachary Ives
Adani President's Distinguished Professor and Department Chair, Computer and Information Science Department
University of Pennsylvania

02/17/2021

How do we promote large-scale data science and data sharing, e.g., in the sciences or across organizations? Many modern data science applications have been leveraging data lakes: schema-agnostic repositories of data files and data products, which offer limited organization and management capabilities. There is a need to build a new generation of data science environments, which leverage data lakes so scientists and analysts can find tables, schemas, workflows, and datasets useful to their task at hand. Juneau incorporates search and management solutions into the Jupyter Notebook data science platform, to enable scientists to augment training data, find potential features to extract, clean data, and find joinable or linkable tables. Our core methods also generalize to other settings where computational tasks involve execution of programs or scripts.

How We Got Here: A brief history of the evolution of parallel processing in high-performance computing

David Barkai
HPC Consultant

02/24/2021

HPC has undergone several dramatic changes in system architecture over the last 50 years: from the pre-1970s mainframe, to the vector processors and multiprocessors of the 80s, followed by the emergence of the “killer micros” and the shift to commodity processors and clusters; and from shared memory to distributed memory systems, with the addition of accelerators (GPUs). This talk traces that journey alongside the transition from single-threaded program execution to ever-increasing levels of parallelism, and the implications for software tools and the application end user. HPC today is much more than the domain of numerical simulations; it also includes data analytics and AI.

Addressing Climate Change through Drone Swarms

Emily Spahn
Software Engineer
DroneSeed

03/03/2021

Climate change is a major issue of concern worldwide. Trees are currently among the best ways to capture carbon. DroneSeed has been a leader in mass reforestation by drone, and is uniquely able to reforest after wildfires. We'll explore the evolution of technological needs to support this goal. Let's talk about how we went from the idea of combining biology with the emerging drone industry, to arrive at the realities of a startup putting seeds on the ground in post-wildfire environments.

Distributed Cache Invalidation at Scale

Greg Cooper
Software Engineer
Google

03/10/2021

Dr. Cooper will describe some of the challenges involved in building a large-scale distributed cache invalidation system and will present the design for one such system, called Thialfi, which was built and operated at Google for most of the past decade. He will also discuss the limitations of that design and touch on ways in which modern infrastructure allows improvements to it. The work is joint with Atul Adya, Phil Bogle, Dennis Geels, Brice Hulse, Larry Kai, Vishesh Khemani, Nick Kline, Colin Meek, Amanda Moreton, Daniel Myers, and Michael Piatek.

Securing Drone Identity

Zachary Peterson
Associate Professor
Computer Science, Cal Poly, San Luis Obispo

03/17/2021

The future of drones and other autonomous vehicles is exciting: they promise to change the way we do business, manufacturing, travel, and delivery logistics, all while increasing convenience, lowering cost, and lessening the impact on the environment. They will also significantly transform our cityscapes and put new security pressures on the critical infrastructure that supports them. Among these pressures is the need for technologies that support secure remote identity: the ability to prove an assertion of identity from a vehicle whose operator may be many miles away (or not exist at all!). There have been numerous proposals, including new rules just established by the FAA, for remotely identifying drones. Sadly, all existing schemes either ignore or leave optional the elements that secure identity, leaving open the possibility of impersonation, forgeries, and other malicious behavior. In this talk, we discuss the setting, the requirements, and the challenges in deploying a secure identity system for drones and other autonomous vehicles.

Machine Learning Enhanced Video Accessibility for Blind and Low Vision Individuals

Ilmi Yoon
Professor, Computer Science Dept.
San Francisco State University

04/07/2021

The blind or visually impaired often miss out on the visual information conveyed through videos. The vast majority of online video material is currently not accessible to millions of visually-impaired people who would significantly benefit from improved access to videos for education, employment, and entertainment purposes.

This work addresses two major issues:

  1. Enhancing video accessibility for blind or visually-impaired individuals.
  2. Generating well-structured training data to advance the state of the art in video understanding.

How Do Film, Television, and other Media Influence Girls to Pursue STEM?

Kim Bishop
Mechanical Engineer

04/14/2021

What types of female STEM role models do girls see in television and film today? Are they represented at all? We will explore what the current STEM media landscape looks like, what plans are for the future, and how STEM professionals and media professionals can work together to expand female STEM roles in media.

Creating Autonomous Mobility from the Ground Up

Amirhossein Tamjidi
Software Engineer
Zoox

04/21/2021

Zoox is transforming mobility-as-a-service by developing a fully autonomous, purpose-built fleet that is designed for AI to drive and humans to enjoy. In the first part of this presentation, I will introduce the main components of Zoox's software stack and its hardware. I will discuss the challenges of localization, perception, prediction, and planning in autonomous vehicles. In the second part of this presentation, I will share some resources that students can use to learn more about different aspects of autonomous vehicle technology and prepare to become a researcher or practitioner in this field.

Advanced Software Design Project - CS 470 - Virtual Showcase

Anamary Leal
Assistant Professor, Computer Science Dept.
Sonoma State University

04/28/2021

Dr. Leal will facilitate a virtual showcase of students’ advanced software design projects from CS 470 this semester.

Spring 2021 Short Presentations Of Student Research

STUDENT PRESENTATIONS

05/05/2021

Short presentations of research carried out by Sonoma State Computer Science Students.

  • Carson Whitt
    Title: Human Audio Emotional Classification
    Research Mentor: Dr. Nina Marhamati
    Abstract: Natural language processing in the realm of computer science has many facets, with one of the most difficult being human vocal classification. Many techniques have been developed to address challenges such as voice recognition and language classification, but one area that has been growing with the rise of deep learning is classification of human emotion. Past research papers have explored many techniques for extracting and abstracting useful data from human audio. The goal of our research is to use those techniques, such as spectrogram analysis and vocal embedding, to design and complete a working model that takes raw human audio and classifies the emotion present, using Robert Plutchik's wheel of emotions as a reference. For audio data we have been using the RAVDESS database, which contains over 2000 samples and eight emotional categories, all recorded in a North American accent. We use a basic deep learning model to train and classify based on vocal embeddings extracted from YAMNet. In addition, we have used multiple techniques and augmentations to overcome the lack of readily available audio data. Classifying into three basic classes (neutral, pleasant, unpleasant) has so far given poor accuracy and model convergence, but overall we have made good strides towards a working solution to the emotional classification challenge.
     
  • Ari Encarnacion
    Title: Machine Learning in Geology: A Pipeline for Automatic Classification of Shear-Sense Indicating Clasts
    Research Mentor: Dr. Gurman Gill
    Abstract: We are constructing a machine learning (ML) powered, automated pipeline for classifications and detections of shear-sense indicating clasts in photomicrographs. Classifications include Sinistral (Counter-Clockwise aka CCW) and Dextral (Clockwise aka CW) shearing. Detections refer to the location of clasts in photomicrographs. Current efforts involve improving final classification results, gathering more data, and experimentation with different combinations of object detectors and classifiers. This presentation focuses on the current pipeline structure and how detections could improve classification results. Future work includes pipeline assembly and providing user access to the model via an app. This app will employ our pipeline to provide automatic classification & detections to the user. This will provide users with vital data, and feedback on app-generated results will benefit our pipeline.
     
  • Brandon Fong
    Title: Elliptic Curve Cryptography
    Research Mentor: Dr. Mark Gondree
    Abstract: I will summarize select topics covered in a recent directed study course on the topic of modern cryptography. In particular, I will focus on some well-known cryptographic schemes and the practical consideration of key lengths for those systems.   Suggestions on key lengths are based on best known attacks against cryptographic systems. Some problems yield new schemes that outperform current schemes.  In this presentation, I discuss Elliptic Curve (EC) cryptosystems and compare these against the well-known Rivest-Shamir-Adleman (RSA) cryptosystem.  

Spring 2021 Short Presentations Of Student Research

STUDENT PRESENTATIONS

05/12/2021

Short presentations of research carried out by Sonoma State Computer Science Students.

  • Vincent Valenzuela
    Title: Interactive NLP application for classifying pleasant and unpleasant emotions from text
    Research Mentor: Dr. Nina Marhamati
    Abstract: My project was to develop a lightweight model for sentiment analysis and use it to classify everyday speech into either pleasant or unpleasant emotions. Using the model, I then developed an interface that accepts written text or speech as input and displays a visual representation of the classified input.