Sihwa Park: Uncertain Facing


Artist(s):

Sihwa Park

Title:

Uncertain Facing

Exhibition:

SIGGRAPH Asia 2020: Untitled & Untied

Artist Statement:

Summary

Abstract

Uncertain Facing is a data-driven, interactive audiovisual installation that aims to represent the uncertainty of data points whose positions in 3D space are estimated by machine learning techniques. It also raises concerns about the possible unintended use of machine learning together with synthetic or fake data. Uncertain Facing visualizes the real-time clustering of fake faces in 3D space through t-SNE, a non-linear dimensionality reduction technique, applied to the face embeddings of those faces. This clustering reveals which faces are similar to each other, based on an assumed probability distribution over the data points. However, unlike the original purpose of t-SNE, which is meant for objective data exploration in machine learning, the installation represents data points as metaballs, in which two or more face images become a merged face when they are close enough, to reflect the uncertain and probabilistic nature of the data locations that the t-SNE algorithm yields. As a result, metaball rendering serves as an abstract, probabilistic representation of data, as opposed to the exactness that we expect from scientific visualizations. Along with the t-SNE and metaball-based visualization, Uncertain Facing sonifies the change of the overall data distribution in 3D space using a granular sound synthesis technique. Uncertain Facing also reflects the error values that t-SNE measures at each iteration between the distribution in the original high-dimensional space and the deduced low-dimensional distribution, representing the uncertainty of data as jittery motion and inharmonic sound. As an interactive installation, Uncertain Facing allows the audience to see the relationship between a picture of their own face and the fake faces, implying that machine learning could be misused in unintended ways, since face recognition technology does not distinguish between real and fake faces.

In the era of the quantified self, a plethora of personal data is used to train intelligent machines that heavily affect our daily lives. Our text, images, and videos on social media, our locations that are always trackable on our smartphones, and the health information that smartwatches capture have been used not only for the data owner’s explicit self-representative purposes but also for the implicit goals of machine vision, which attempts to understand and recognize human behaviors.

Then, how do machines understand and see us?

Because real-world data has high dimensionality, which causes the curse of dimensionality, machine learning techniques aim to reduce it to low-dimensional, principal data features that are then used to categorize and recognize us or to predict our preferences and behaviors. The uncertainty of both data and algorithms is involved in this process. Biased data can result in biased machine models, and the stochastic nature of machine learning algorithms can yield errors or ignore significant outliers, which can affect the inclusiveness and fairness of artificial intelligence.

Also, with advances in machine learning techniques, it has become easier and more feasible to generate fake and synthetic data based on our real data. This means we will live in a world where it is harder to tell whether the digital images we see are fake or real.

Uncertain Facing is a project that explores these aspects in the era of personal data, machine vision, and fake/synthetic data. As a data-driven audiovisual art piece, Uncertain Facing tries to reveal the uncertainty of data and machine learning algorithms with an abstract, probabilistic representation of data as opposed to the exactness that we expect from the use of scientific visualizations.

Technical Information:

As a data-driven audiovisual piece, Uncertain Facing consists of three major components: 1) face data, comprising synthetic images of faces generated by StyleGAN2, a generative adversarial network (GAN) for generating portraits of fake human faces, and their face embeddings obtained from FaceNet, a deep neural network trained to find 128-dimensional feature vectors from face images; 2) t-SNE (t-distributed Stochastic Neighbor Embedding), a non-linear dimensionality reduction technique for the visualization of high-dimensional datasets; and 3) multimodal data representation based on metaball rendering, an implicit surface modeling technique in computer graphics, and granular sound synthesis.

Uncertain Facing visualizes the real-time clustering of fake faces in 3D space by applying t-SNE to the face embeddings of those faces. This clustering reveals which faces are similar to each other by deducing a 3-dimensional map from the 128 dimensions, based on an assumed probability distribution over the data points. However, unlike the original purpose of t-SNE, which is meant for objective data exploration in machine learning, the installation represents data points as metaballs to reflect the uncertain and probabilistic nature of the data locations the algorithm yields. Face images are mapped onto the surfaces of the metaballs as textures, and when two or more data points come close enough, their faces begin to merge, creating a fragmented, combined face. This serves as an abstract, probabilistic representation of data, as opposed to the exactness that we expect from scientific visualizations. Uncertain Facing also reflects the error value that t-SNE measures at each iteration between the distribution in the original high-dimensional space and the deduced low-dimensional distribution, showing it as the jittery motion of data points.
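To make the error value more concrete: in t-SNE, the mismatch between the high-dimensional and low-dimensional distributions is the Kullback-Leibler divergence between two sets of pairwise affinities. The following is a minimal sketch of that measurement, not the installation's code. It assumes a single fixed Gaussian bandwidth `sigma` instead of t-SNE's per-point perplexity search, uses random vectors in place of real FaceNet embeddings, and the `jitter` function with its `scale` parameter is a hypothetical mapping from error to motion.

```python
import math
import random

def sq_dists(points):
    """Squared Euclidean distance between every pair of points."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]

def high_dim_affinities(points, sigma):
    """Symmetric Gaussian affinities p_ij over the original vectors.

    Simplification: one fixed sigma for all points (real t-SNE tunes it per point).
    """
    n = len(points)
    d2 = sq_dists(points)
    cond = []
    for i in range(n):
        row = [math.exp(-d2[i][j] / (2 * sigma ** 2)) if j != i else 0.0
               for j in range(n)]
        total = sum(row)
        cond.append([v / total for v in row])
    # Symmetrize the conditional probabilities so that all p_ij sum to 1.
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)] for i in range(n)]

def low_dim_affinities(points):
    """Student-t affinities q_ij over the 3D map positions."""
    n = len(points)
    d2 = sq_dists(points)
    w = [[1.0 / (1.0 + d2[i][j]) if j != i else 0.0 for j in range(n)]
         for i in range(n)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]

def kl_divergence(p, q):
    """KL(P || Q), summed over all pairs with nonzero p_ij."""
    return sum(pij * math.log(pij / qij)
               for prow, qrow in zip(p, q)
               for pij, qij in zip(prow, qrow) if pij > 0.0)

def jitter(position, error, rng, scale=0.05):
    """Offset a 3D position by uniform noise whose range grows with the error.

    Hypothetical mapping: the source does not specify how error drives motion.
    """
    amp = scale * error
    return tuple(c + rng.uniform(-amp, amp) for c in position)

rng = random.Random(0)
embeddings = [[rng.random() for _ in range(128)] for _ in range(20)]   # stand-in data
layout = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]   # a 3D map
P = high_dim_affinities(embeddings, sigma=2.0)
Q = low_dim_affinities(layout)
error = kl_divergence(P, Q)
jittered = [jitter(pos, error, rng) for pos in layout]
print(f"KL error: {error:.4f}")
```

A layout that matches the high-dimensional neighborhoods well gives a small divergence and therefore little jitter, while a poor layout shakes more.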

Along with the t-SNE and metaball-based visualization, Uncertain Facing sonifies the change of the overall data distribution in 3D space using a granular sound synthesis technique. The data space is divided into eight subspaces, and the locations of data points in these subspaces are tracked during the t-SNE operation. The density of data points in each subspace contributes to the parameters of granular synthesis, for example, grain density. The error values are also used to determine the frequency range and duration of sound grains, representing the uncertainty of data in sound.
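The subspace tracking described above could be sketched as follows. This is an illustration under stated assumptions rather than the installation's actual mapping: the eight subspaces are taken to be the octants around a center point, and `grain_density` with its `min_rate` and `max_rate` bounds is a hypothetical linear mapping from point density to grains per second.

```python
from collections import Counter

def octant_index(point, center=(0.0, 0.0, 0.0)):
    """Return 0..7 depending on which side of the center the point lies on each axis."""
    x, y, z = point
    cx, cy, cz = center
    return int(x >= cx) + 2 * int(y >= cy) + 4 * int(z >= cz)

def subspace_densities(points, center=(0.0, 0.0, 0.0)):
    """Fraction of all points that fall into each of the eight octants."""
    if not points:
        return [0.0] * 8
    counts = Counter(octant_index(p, center) for p in points)
    return [counts.get(k, 0) / len(points) for k in range(8)]

def grain_density(fraction, min_rate=5.0, max_rate=80.0):
    """Hypothetical linear map from point density to grains per second."""
    return min_rate + fraction * (max_rate - min_rate)

points = [(1, 1, 1), (1, 1, 1), (-1, -1, -1), (1, -1, 1)]
densities = subspace_densities(points)
rates = [grain_density(d) for d in densities]
print(densities)
print(rates)
```

Recomputing the densities after every t-SNE iteration would let the sound follow the points as they migrate between subspaces.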

As an interactive installation, Uncertain Facing allows the audience to explore the data in detail or view their overall structure through a web-based UI on an iPad. Moreover, audience members can take a picture of their face and send it to the data space to see its relationship with the data. Given the new face image, Uncertain Facing restarts t-SNE after obtaining the face embedding of the audience’s face image in real time. While the real audience face is being mixed and merged with the fake faces in 3D space, Uncertain Facing also shows the top nine similar faces on the UI. Here, it implies that machine learning could be misused in unintended ways, since FaceNet does not distinguish between real and fake faces.
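Finding the top nine similar faces amounts to a nearest-neighbor search over embeddings. The sketch below assumes Euclidean distance between the 128-dimensional vectors, which is how FaceNet embeddings are commonly compared; the source does not state which metric the installation uses. The random vectors stand in for real embeddings, and the function name `top_k_similar` is illustrative.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k_similar(query, embeddings, k=9):
    """Indices of the k stored embeddings closest to the query, nearest first."""
    ranked = sorted(range(len(embeddings)),
                    key=lambda i: euclidean(query, embeddings[i]))
    return ranked[:k]

rng = random.Random(0)
fake_faces = [[rng.random() for _ in range(128)] for _ in range(50)]  # stand-in data
# Simulate an audience face that happens to lie very close to fake face 7.
audience_face = [v + 0.001 for v in fake_faces[7]]
nearest = top_k_similar(audience_face, fake_faces)
print(nearest)
```

The search makes no distinction between real and synthetic inputs: the audience's embedding is ranked against the fake ones exactly as another fake face would be.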

Process Information:

The uniqueness of this project lies in the audiovisual representation of uncertainty in machine learning and data. In the visualization, a metaball, a spherical representation of a data instance, signifies the uncertain boundary of the data point in space, and the combined face textures on the metaball surface represent the synthetic possibility of fake data. The jittery motion of the metaballs also reflects the degree of error that t-SNE estimates when evaluating the inferred data locations. In the data sonification, the distribution of data points in 3D space, along with the error values, is reflected as changes in sound such as pitch and amplitude.
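The way metaballs blur the boundary between nearby data points can be illustrated with a classic inverse-square field function. This is a sketch of the general technique, not the installation's renderer: the falloff function, the threshold of 1.0, and the `blend_weights` helper (one hypothetical way to mix face textures by each ball's share of the field) are all assumptions.

```python
def field(point, balls):
    """Sum of inverse-square contributions r^2 / d^2 from every (center, radius) ball."""
    total = 0.0
    for center, radius in balls:
        d2 = sum((p - c) ** 2 for p, c in zip(point, center))
        total += radius ** 2 / max(d2, 1e-12)  # clamp to avoid division by zero
    return total

def is_inside(point, balls, threshold=1.0):
    """A point belongs to the merged surface when the summed field reaches the threshold."""
    return field(point, balls) >= threshold

def are_merged(ball_a, ball_b, threshold=1.0):
    """For two balls of equal radius, test whether their surfaces have fused.

    The midpoint is where the field along the connecting segment is weakest,
    so the balls are joined exactly when the midpoint is inside.
    """
    mid = tuple((a + b) / 2 for a, b in zip(ball_a[0], ball_b[0]))
    return is_inside(mid, [ball_a, ball_b], threshold)

def blend_weights(point, balls):
    """Hypothetical texture mix: each ball's normalized share of the field at a point."""
    shares = [field(point, [ball]) for ball in balls]
    total = sum(shares)
    return [s / total for s in shares]

a = ((0.0, 0.0, 0.0), 1.0)
near = ((2.5, 0.0, 0.0), 1.0)   # plain spheres of radius 1 would not touch here
far = ((4.0, 0.0, 0.0), 1.0)
print(are_merged(a, near), are_merged(a, far))
print(blend_weights((1.25, 0.0, 0.0), [a, near]))
```

With this field, two unit balls fuse once their centers are closer than about 2.83 units, before plain spheres of the same radius would even touch, which is what gives the merged faces their soft, uncertain boundary.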

Metaball rendering with a large amount of data was challenging. In particular, it had to be fast enough to reflect the t-SNE visualization process in real time. Applying multiple texture images to the metaball surfaces was another difficult part.

Other Information:

Inspiration Behind the Project

The biggest motivation for this project came from StyleGAN2. When I saw the realistic fake faces generated by StyleGAN2, I thought it would become more difficult to distinguish real images from fake ones when we see them on digital media. As machine learning increasingly affects our society, it is important to be aware of the probabilistic nature of machine learning processes. What we get from machine learning models is not always as correct as we believe, and it carries uncertainty. Dimensionality reduction, which is used to find hidden patterns in data, is one example that shows the uncertain and stochastic characteristics of machine learning. In this project, I wanted to represent the uncertainty of both machine learning processes and synthetic data generated by generative models by letting audiences find the fake faces similar to their own, fragmented and combined.