2024

Algorithmic Mirror

Social media platforms track, aggregate, and monetize adolescents' data while giving them almost no visibility into how algorithms construct their digital identities. Adolescents are left behind a one-way mirror: algorithms build detailed profiles that shape what they see, but there is no way to look back and understand what those profiles actually contain or how they were assembled. Current approaches to algorithmic literacy (simulations, games, synthetic data) tend to stay abstract and disconnected from lived experience, so the knowledge does not transfer.

Algorithmic Mirror upload interface showing drag-and-drop data export from TikTok, YouTube, and Netflix with upload progress indicator

Algorithmic Mirror inverts this relationship. Developed in collaboration with Yui Kondo at the Oxford Internet Institute and built on the Latent Lab platform, it takes a user's real watch history data from YouTube, TikTok, and Netflix and transforms it into an interactive semantic visualization. Users drag and drop their platform data exports, and the system handles the rest: parsing heterogeneous file formats, enriching sparse metadata, harmonizing descriptions across platforms with an LLM, embedding the content, and reducing it to a navigable 2D map. Each video appears as a colored dot (color indicates platform), and semantically similar content clusters together. Automated topic labels emerge at multiple scales, so zooming in reveals increasingly specific subcategories of interest.
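The parsing step can be pictured with a minimal sketch. It assumes heavily simplified export shapes: the field names (`title`, `time`, `caption`, `date`, `Title`, `Start Time`), the timestamp formats, and the `WatchEvent` record are all hypothetical and chosen for illustration. Real platform exports differ and change over time, and this is not the project's actual parser.

```python
import csv
import io
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class WatchEvent:
    """Hypothetical unified record shared by all platforms."""
    platform: str
    title: str
    watched_at: datetime  # always timezone-aware (UTC assumed)


def _utc(text: str, fmt: str) -> datetime:
    # Assumption: timestamps in the exports are UTC.
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)


def parse_youtube(raw: str) -> list[WatchEvent]:
    # Assumed shape: JSON list of {"title": ..., "time": "2023-05-01T18:30:00"}
    return [WatchEvent("youtube", r["title"], _utc(r["time"], "%Y-%m-%dT%H:%M:%S"))
            for r in json.loads(raw)]


def parse_tiktok(raw: str) -> list[WatchEvent]:
    # Assumed shape: JSON list of {"caption": ..., "date": "2023-05-02 09:15:00"}
    return [WatchEvent("tiktok", r["caption"], _utc(r["date"], "%Y-%m-%d %H:%M:%S"))
            for r in json.loads(raw)]


def parse_netflix(raw: str) -> list[WatchEvent]:
    # Assumed shape: CSV with "Title" and "Start Time" columns
    rows = csv.DictReader(io.StringIO(raw))
    return [WatchEvent("netflix", r["Title"], _utc(r["Start Time"], "%Y-%m-%d %H:%M:%S"))
            for r in rows]


PARSERS = {"youtube": parse_youtube, "tiktok": parse_tiktok, "netflix": parse_netflix}


def unify(exports: dict[str, str]) -> list[WatchEvent]:
    """Parse each export with its platform parser and merge into one timeline."""
    events = [e for platform, raw in exports.items() for e in PARSERS[platform](raw)]
    return sorted(events, key=lambda e: e.watched_at)


sample = {
    "youtube": '[{"title": "Easy pasta recipe", "time": "2023-05-01T18:30:00"}]',
    "tiktok": '[{"caption": "#pasta #fyp", "date": "2023-05-02 09:15:00"}]',
    "netflix": "Title,Start Time\nChef's Table: Season 1,2023-04-30 21:00:00\n",
}
timeline = unify(sample)
for event in timeline:
    print(event.platform, event.watched_at.date(), event.title)
```

Keeping one parser per platform behind a shared record type means the later steps (enrichment, harmonization, embedding) only ever see one format, whichever platforms a user uploads.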



Four-panel datafication view showing topic clusters with temporal evolution and multi-platform overlay across a participant's viewing history

Pre-processing pipeline: TikTok, YouTube, and Netflix data exports are parsed, enriched, and harmonized into a unified format for embedding

A key design challenge was cross-platform clustering. Early versions produced separate clusters for each platform because Netflix synopses, TikTok hashtag captions, and YouTube descriptions are syntactically very different even when describing the same content. We solved this with a homogenization preprocessing step that standardizes all descriptions to focus on core content, which allows the embedding space to cluster by meaning rather than by platform syntax.
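A toy sketch can show why platform syntax gets in the way. The project's homogenization step uses an LLM and a learned embedding space. The rule-based `homogenize` function, the `PLATFORM_NOISE` word list, and the Jaccard word-overlap score below are simplified stand-ins invented for this example, not the actual method.

```python
import re

# Tokens treated as platform boilerplate in this toy example (an assumption).
PLATFORM_NOISE = {"fyp", "foryou", "subscribe", "like", "and", "for"}


def raw_tokens(text: str) -> set[str]:
    """Whitespace tokens, as a naive pipeline would see them."""
    return set(text.lower().split())


def homogenize(text: str) -> set[str]:
    """Reduce a description to its core content words (stand-in for the LLM step)."""
    text = re.sub(r"https?://\S+", " ", text.lower())  # drop links
    words = re.findall(r"[a-z]+", text)  # also turns "#pasta" into "pasta"
    return {w for w in words if w not in PLATFORM_NOISE}


def jaccard(a: set[str], b: set[str]) -> float:
    """Word-overlap similarity in [0, 1] (stand-in for embedding similarity)."""
    return len(a & b) / len(a | b) if a | b else 0.0


# Two made-up descriptions of similar content in different platform styles.
tiktok = "#pasta #recipe #fyp #foryou easy dinner"
youtube = "Easy pasta recipe for dinner! Subscribe and like: https://example.com/channel"

before = jaccard(raw_tokens(tiktok), raw_tokens(youtube))
after = jaccard(homogenize(tiktok), homogenize(youtube))
print(f"similarity before: {before:.2f}, after: {after:.2f}")
```

In this made-up pair, the raw descriptions share almost no tokens because hashtags, punctuation, and calls to action dominate. Once both are reduced to content words, they overlap fully, which is the same effect the homogenization step aims for in the embedding space.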

The temporal slider lets users watch their content landscape evolve over time, revealing how interests emerge, shift, and fade across months or years. This was one of the most impactful features in our study: many participants had no idea platforms retained data going back to their early childhood, and seeing several years of viewing history animated was both surprising and deeply personal.
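The logic behind a slider like this can be sketched as a filter over timestamped points. The sketch assumes a cumulative view (everything watched up to the slider position stays visible). The source does not say whether the real slider is cumulative or windowed, and the `history` data and topic names here are invented.

```python
from collections import Counter
from datetime import date


def visible_points(points, cutoff):
    """Return the (date, topic) points watched on or before the slider position."""
    return [p for p in points if p[0] <= cutoff]


def topic_mix(points, cutoff):
    """Count topics visible at the slider position (cumulative view, an assumption)."""
    return Counter(topic for _, topic in visible_points(points, cutoff))


# Made-up viewing history: (watch date, topic label).
history = [
    (date(2021, 3, 14), "minecraft"),
    (date(2021, 9, 2), "minecraft"),
    (date(2022, 6, 20), "cooking"),
    (date(2023, 1, 5), "cooking"),
    (date(2023, 8, 11), "football"),
]

# Step the slider to the end of each year and show the topic mix at that point.
for year in (2021, 2022, 2023):
    print(year, dict(topic_mix(history, date(year, 12, 31))))
```

Stepping the cutoff forward and redrawing only the visible points is enough to animate how new topic clusters appear over time.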

Zoom depth and temporal evolution: top row shows progressive zoom from overview to individual video details, bottom row shows how a participant's content landscape evolves over time

In a qualitative study with 27 adolescents (ages 12 to 16) conducted in collaboration with Yui Kondo and Jun Zhao at the Oxford Internet Institute, participants uploaded watch histories covering over 750,000 videos and spanning up to five years of activity. Three findings stood out. First, seeing years of data visualized made datafication concrete and personal, eliciting emotions ranging from curiosity to fear. Second, participants could see how algorithms construct a "digital self" from their viewing histories; some felt accurately reflected, while others felt misrepresented. Third, teens wanted tools like this integrated into platforms so they could access and manage their data, yet many doubted their ability to actually control or change these systems.

Algorithmic Mirror data mirror view showing a participant's full content landscape with video thumbnails, topic labels, and platform color legend

The work highlights a tension at the core of adolescent data autonomy: teens care deeply about how they are represented in algorithmic systems and want agency over those representations, but the platforms are not designed to give it to them. Algorithmic Mirror is an attempt to shift the framing from explaining algorithms to letting people explore their own data and draw their own conclusions.


AI Ethics, Visualization, Social Computing