Diversify my playlist

A symphonic experiment to burst social bubbles

9 min read · Aug 6, 2021

I am a firm believer in what we could here call “social bubble bursting”, namely the act of intentionally investigating social doing (social arrangements, social behaviors, social norms, etc.) outside of my own social bubbles. I was taught, both by my family upbringing and by my social science education, that taking part in “diverse” social doings encourages us to understand and question our surroundings through the lens of multiple viewpoints. Ultimately, I believe that these diversified environments and the multiplicity of viewpoints can incubate opportunity spaces for healthy, diverse social interactions among all of us across social bubbles.

Designers of technology are constantly optimizing under the mantra of frictionless and seamless human-technology interaction, where the technology metaphorically becomes invisible. This worship of invisible tech design works, in essence, by keeping us users inside our own social bubbles.

This mini-project seeks to experiment with the notion of “visible technologies” by creating a bit of friction in the technology design.

Symphonic bubbles

“When we first start listening, we find that music is everywhere”. The concept of sounds goes beyond the act of composing music. Doing sounds can be seen as the social act of composing and forming social doings within social groups. When listening to foreign sounds we are, from a phenomenological perspective, feeling the social doings of that social assembly. But despite this stunning opportunity for cultural insight through music, my playlists on Spotify still consist of the same, pretty predictable, forty-ish artists, and the same ten artists have been in heavy rotation for my entire adult life.

Based on Spotify’s own user data like “listening history”, “likes” and “follower record”, but also on weblogs and articles, the Spotify “fans-also-like” algorithm constructs relations between similar artists on the app. Spotify uses these types of algorithms to curate personalized radio stations and playlists for each of its millions of users, playing an endless stream made up exclusively of music it assumes you like based on your listening behavior on the app.

This is an interesting example of tech designers manufacturing a seamless and frictionless invisible listening experience for us, the users. But it is also a great example of tech designers helping everyone stay within their own social bubble, in this case a musical one.

By visualizing how artists on Spotify are related we can examine social bubbles by locating clusters of related artists. This practice will give us a sense of the underlying social bubble structure of related artists on Spotify — and when we understand these social bubbles, we can try our best to burst them.

***

Before we continue, I would like to invite you to listen to this playlist while you are reading. Spoiler, it is the very outcome of this project: “the diversified playlist”

***

Method - getting data

As a somewhat discreet approach to maintaining power and domination, some tech companies allow and even invite outside developers to access their valuable private data. One example is Google Maps, which provides traffic and navigation data for other companies to build into their products. So when you use Uber to get home from a party, the driver navigates with data from Google Maps through the Uber app. Google Maps is not trying to be friendly here; it is simply maintaining its position of power. By providing their data as building blocks for others to build upon, these companies increase their number of users and secure the upper hand over their competitors.

Developers access this data through what is known as an API (application programming interface). Spotify provides a relatively open API, and to simplify things even more, open-source communities build API wrappers that make obtaining data even easier. This means that we can get large amounts of music data out of Spotify with relatively little effort.

The Spotify API wrapper I am using for this project is the Python library “Spotipy”. After setting up a Spotify developer client account and logging in, I can get all the “fans-also-like” related artists for a given artist, in this example “Britney Spears”, with the few lines of code below.

IN:
# sp is the authenticated spotipy.Spotify client from the setup step
britney_spears_spotify_id = "26dSoYclwsYLMAKD3tpOr4"
related_to_britney = sp.artist_related_artists(britney_spears_spotify_id)
for related_artist in related_to_britney["artists"]:
    print(related_artist["name"])
OUT:
Nelly Furtado, Jennifer Lopez, The Pussycat Dolls, Gwen Stefani, Fergie, Avril Lavigne, Hilary Duff, Christina Aguilera, Selena Gomez & The Scene, Kelly Clarkson, Kesha, Ashlee Simpson, Madonna, Spice Girls, Aly & AJ, Natasha Bedingfield, Kylie Minogue, Destiny’s Child, JoJo, RuPaul

I can now use the related artists as new input and find their related artists. By repeating this process over and over (artist -> related artist -> artist -> related artist -> artist, etc.), it is possible to build a dataset of artist data from Spotify. My Spotify dataset consists of just over 1 million artists, each with i) artist name, ii) related artists, iii) follower count, iv) popularity score, v) genres, and vi) artist ID.
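
The repeated artist -> related artist step can be sketched as a breadth-first crawl. This is only a minimal illustration, not the code used for the project: `crawl_related_artists`, `fetch_related`, `max_artists` and the tiny `fake_catalogue` are hypothetical names, and the API call is injected as a function so that the sketch runs without a Spotify account.

```python
from collections import deque


def crawl_related_artists(seed_id, fetch_related, max_artists=1000):
    """Breadth-first crawl: artist -> related artists -> their related artists.

    fetch_related(artist_id) is a hypothetical helper supplied by the caller.
    It must return a list of artist dicts that each carry an "id" key.
    """
    artists = {}               # artist id -> list of related artist ids
    queue = deque([seed_id])   # artists discovered but not yet looked up
    seen = {seed_id}

    while queue and len(artists) < max_artists:
        artist_id = queue.popleft()
        related_ids = [related["id"] for related in fetch_related(artist_id)]
        artists[artist_id] = related_ids
        for related_id in related_ids:
            if related_id not in seen:   # never queue the same artist twice
                seen.add(related_id)
                queue.append(related_id)
    return artists


# Made-up stand-in for the API so the sketch runs offline.
fake_catalogue = {
    "britney": ["kylie", "madonna"],
    "kylie": ["britney", "roisin"],
    "madonna": ["kylie"],
    "roisin": [],
}


def fake_fetch(artist_id):
    return [{"id": related_id} for related_id in fake_catalogue[artist_id]]


dataset = crawl_related_artists("britney", fake_fetch)
print(sorted(dataset))  # ['britney', 'kylie', 'madonna', 'roisin']
```

With Spotipy, `fetch_related` could be something like `lambda artist_id: sp.artist_related_artists(artist_id)["artists"]`, which is the call shown earlier.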

Since the data about less famous artists is insufficient, I decided to focus on the more established artists on Spotify. I removed artists with fewer than 10,000 followers, but kept such artists if they are related to an artist with more than 10,000 followers. The final dataset consists of 145,426 different artists.
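
A rough sketch of this filtering rule could look as follows. The data layout (a dict with `followers` and `related` keys), the function name and the sample numbers are assumptions made for illustration, and “related to” is read here as “appears in the related-artist list of an established artist”.

```python
FOLLOWER_THRESHOLD = 10_000


def filter_established(artists, threshold=FOLLOWER_THRESHOLD):
    """Keep artists at or above the threshold, plus the artists they relate to.

    artists maps an artist name to {"followers": int, "related": [names]}
    (hypothetical layout).
    """
    established = {
        name for name, info in artists.items() if info["followers"] >= threshold
    }
    # Smaller artists survive only if an established artist points to them.
    neighbours = {
        related
        for name in established
        for related in artists[name]["related"]
        if related in artists
    }
    keep = established | neighbours
    return {name: info for name, info in artists.items() if name in keep}


# Made-up sample data.
sample = {
    "Big Act": {"followers": 250_000, "related": ["Small Act"]},
    "Small Act": {"followers": 3_000, "related": ["Tiny Act"]},
    "Tiny Act": {"followers": 400, "related": []},
}
print(list(filter_established(sample)))  # ['Big Act', 'Small Act']
```

“Tiny Act” is dropped because its only link is to another small artist.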

Now that I have acquired this dataset, it is time to use the data to actually “diversify” my playlist. One of the key elements in the diversifying mechanism I am building here is the ability to treat artists and genres as groups of artists and groups of genres.

Method - communities

Are all data relational data? Apparently, some would say no. I would argue that all data can be mapped and analyzed as relational data. One of the most beneficial values in relational data is the ability to analyze data points as groups rather than in isolation. In this project, I am interested in groups, in network theory called communities, of artists and genres rather than each one of them. So, here we go.

The Python package NetworkX is one of the best open-source toolkits for mapping and analyzing data as relational data, here as network graphs. A basic network graph consists of nodes that are connected by edges.

Nodes connected by an edge can be said to be more similar to each other than to the rest of the network. We can therefore assume that nodes in close proximity to each other share common features and belong together in a community, and we can use this logic to create communities of nodes. The Python library “Community” has a function called “best_partition” that finds communities by attempting to maximize the so-called modularity of the partition. Below is an example visualization of the genre-network, where each node is colored according to the community it was assigned to by the modularity optimization. I used the network visualization tool Gephi to create this example visualization.
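
To give a feel for what “modularity” measures, here is a small dependency-free sketch that scores a given partition of a toy graph. It is not the “best_partition” algorithm itself, which searches for a partition with a high score; the function name, the toy edges and the two partitions are made up for illustration.

```python
from collections import defaultdict


def modularity(edges, partition):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges is a list of (node, node) pairs; partition maps node -> community id.
    Q = sum over communities c of  L_c / m - (d_c / (2 * m)) ** 2
    where m is the number of edges, L_c the number of edges inside c and
    d_c the summed degree of the nodes in c.
    """
    m = len(edges)
    inside = defaultdict(int)   # L_c
    degree = defaultdict(int)   # d_c
    for u, v in edges:
        degree[partition[u]] += 1
        degree[partition[v]] += 1
        if partition[u] == partition[v]:
            inside[partition[u]] += 1
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy genre graph: two triangles joined by a single edge.
toy_edges = [
    ("pop", "dance pop"), ("dance pop", "post-teen pop"), ("pop", "post-teen pop"),
    ("metal", "death metal"), ("death metal", "black metal"), ("metal", "black metal"),
    ("post-teen pop", "metal"),
]
two_groups = {
    "pop": 0, "dance pop": 0, "post-teen pop": 0,
    "metal": 1, "death metal": 1, "black metal": 1,
}
one_group = {genre: 0 for genre in two_groups}

print(round(modularity(toy_edges, two_groups), 3))  # 0.357
print(modularity(toy_edges, one_group))             # 0.0
```

Splitting the toy graph into its two triangles scores higher than lumping everything together, which is the kind of split a modularity-maximizing method favors. The dict returned by “best_partition” has the same node -> community shape as `two_groups`.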

I have created two different network graphs that I am using in this project.

“The artist-network”: Artists as nodes, connected with an edge if they are related artists in Spotify. E.g. ‘Britney Spears’ is a node here and is connected with Nelly Furtado, Jennifer Lopez, etc., because they are related artists, just as we saw earlier.

In the artist-network the community algorithm detected 481 communities of artists. The average size of each community is 302.3 artists per community.

“The genre-network”: Genres as nodes, connected with an edge if they both categorize the same artist in Spotify. E.g. ‘pop’ is a node that is connected with ‘pop rap’, ‘post-teen pop’, and ‘dance pop’ because they are all used to categorize Britney Spears’s music.

In the genre-network the community algorithm detected 252 communities of genres. The average size of each community is 18.2 genres per community.
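
As a minimal sketch of how the genre-network edges follow from the artist data, the snippet below links every pair of genres that share an artist. It uses plain dicts and sets instead of a NetworkX graph so that it runs without dependencies; the function name and the second artist are hypothetical.

```python
from itertools import combinations


def build_genre_network(artist_genres):
    """Return {genre: set of genres that categorize at least one shared artist}."""
    network = {}
    for genres in artist_genres.values():
        unique = sorted(set(genres))
        for genre in unique:
            network.setdefault(genre, set())   # a lone genre is still a node
        for first, second in combinations(unique, 2):
            network[first].add(second)
            network[second].add(first)
    return network


artist_genres = {
    "Britney Spears": ["pop", "pop rap", "post-teen pop", "dance pop"],
    "Hypothetical Act": ["dance pop", "electropop"],   # made-up artist
}
genre_network = build_genre_network(artist_genres)
print(sorted(genre_network["pop"]))  # ['dance pop', 'pop rap', 'post-teen pop']
```

In NetworkX the same pairs could be added to a graph one by one with `Graph.add_edge`.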

I am also interested in the length of the shortest path in the network space between different artists. Network graphs are particularly powerful for this type of analysis. With the NetworkX function ‘shortest_path’ we can efficiently find the path between any two nodes in our network. Below is an example of the shortest path between ‘Britney Spears’ and ‘Portishead’ in the artist network space.

IN:
import networkx as nx
# artist_network is the NetworkX graph of related artists described above
source = "Britney Spears"
target = "Portishead"
print(nx.shortest_path(artist_network, source, target))
OUT:
['Britney Spears', 'Kylie Minogue', 'Róisín Murphy', 'Moloko', 'Portishead']

Diversifying the playlist

Diversity comes in many forms and colors, so what is a diversified playlist? In this project, I define a diverse playlist by how remotely the artists are positioned from each other in the artist-network space and the genres in the genre-network space.

My diversifying function takes any playlist as input. It goes through the playlist song by song and replaces each song with a song of similar popularity that is positioned remotely in the network spaces compared to the rest of the songs in the new playlist, and it returns this new diversified playlist as output. The four diversifying steps for each song are:

Pick a random artist from Spotify with a follower count and popularity score similar to those of the original song’s artist (within 20 percent).
Check that none of the new artist’s “genre communities” already exist in the new playlist.
Check that the new artist’s “artist community” does not already exist in the new playlist.
Check that the new artist is more than three links away from the original artist.
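
The four steps can be sketched roughly as below. This is an illustration under stated assumptions rather than the actual function: the artist data layout, the injected `distance` function, the `max_tries` cap and the choice to skip a slot when no candidate passes are all hypothetical.

```python
import random


def within(value, reference, tolerance=0.20):
    """True when value lies within +/- tolerance of reference."""
    return abs(value - reference) <= tolerance * reference


def diversify(playlist, artists, distance, rng=random, max_tries=1000):
    """Replace every artist in playlist following the four steps.

    artists: name -> {"followers", "popularity", "genre_communities" (a set),
                      "artist_community"}  (hypothetical layout)
    distance(a, b): number of links on the shortest path between two artists.
    """
    new_playlist = []
    used_genre_communities = set()
    used_artist_communities = set()
    names = list(artists)

    for original in playlist:
        source = artists[original]
        for _ in range(max_tries):
            candidate = rng.choice(names)
            info = artists[candidate]
            # Step 1: similar follower count and popularity score.
            if not (within(info["followers"], source["followers"])
                    and within(info["popularity"], source["popularity"])):
                continue
            # Step 2: none of its genre communities are in the new playlist yet.
            if info["genre_communities"] & used_genre_communities:
                continue
            # Step 3: its artist community is not in the new playlist yet.
            if info["artist_community"] in used_artist_communities:
                continue
            # Step 4: more than three links away from the original artist.
            if distance(original, candidate) <= 3:
                continue
            new_playlist.append(candidate)
            used_genre_communities |= info["genre_communities"]
            used_artist_communities.add(info["artist_community"])
            break
        # Assumption: if nothing passes within max_tries, the slot is skipped.
    return new_playlist


# Made-up toy data: only "B" passes all four checks as a replacement for "A".
toy_artists = {
    "A": {"followers": 100_000, "popularity": 60, "genre_communities": {1}, "artist_community": 1},
    "B": {"followers": 110_000, "popularity": 62, "genre_communities": {2}, "artist_community": 2},
    "C": {"followers": 500_000, "popularity": 90, "genre_communities": {3}, "artist_community": 3},
    "D": {"followers": 95_000, "popularity": 58, "genre_communities": {4}, "artist_community": 4},
}
toy_links = {frozenset("AB"): 5, frozenset("AC"): 6, frozenset("AD"): 2}


def toy_distance(a, b):
    return 0 if a == b else toy_links[frozenset((a, b))]


print(diversify(["A"], toy_artists, toy_distance, rng=random.Random(1)))  # ['B']
```

In the toy data, “C” fails the popularity check, “D” is only two links away, and “A” itself is zero links away, so “B” is the only possible replacement.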

Running the function

I picked a mixed playlist that Spotify had generated especially for me, with music the algorithms know I love. The playlist is called “Daily mix 2” and includes some of my favorite downtempo artists, like ‘Beirut’, ‘Band of Horses’ and ‘Portishead’.

I imported all the artists from the Daily mix 2 playlist into Python by calling the Spotify API and saving the result as a list with these lines of code:

dailymix_spotify_id = "37i9dQZF1E37VRhik25G67"
pl = sp.playlist(dailymix_spotify_id)
playlist_list = []
for playlist_track in pl["tracks"]["items"]:
    # Take the first artist credited on the track's album
    playlist_list.append(playlist_track["track"]["album"]["artists"][0]["name"])

With this playlist as input, I executed my diversifying function 4000 times, generating 4000 different playlists, so that I could pick the playlist with the highest number of genre communities represented. A playlist contains songs and not artists, so after generating the list of artists I called the Spotify API again, this time to get the top track from each artist with the Spotipy function ‘artist_top_tracks’.

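playlist_list_new = []
for artist in diversified_artists_list:
    # df_loop: DataFrame indexed by artist name, with an "id" column
    results = sp.artist_top_tracks(df_loop.loc[artist, "id"])
    # Store the artist together with the name of their top track
    playlist_list_new.append(f"{artist}: {results['tracks'][0]['name']}")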
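
Picking the winner among the 4000 generated playlists comes down to counting the genre communities each candidate covers and keeping the maximum. The sketch below shows one way this could be done; the function names, the data layout and the toy candidates are assumptions.

```python
def genre_community_count(playlist, artists):
    """Number of distinct genre communities represented in a playlist."""
    communities = set()
    for name in playlist:
        communities |= artists[name]["genre_communities"]
    return len(communities)


def most_diverse(candidates, artists):
    """Return the candidate playlist that covers the most genre communities."""
    return max(candidates, key=lambda playlist: genre_community_count(playlist, artists))


# Made-up toy data: artist -> the genre communities their genres fall into.
toy_catalogue = {
    "A": {"genre_communities": {1, 2}},
    "B": {"genre_communities": {2}},
    "C": {"genre_communities": {3}},
    "D": {"genre_communities": {4, 5}},
}
toy_candidates = [["A", "B"], ["A", "C"], ["A", "D"]]   # stands in for the 4000 runs

print(most_diverse(toy_candidates, toy_catalogue))  # ['A', 'D']
```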

The outcome was this playlist, ‘Diversified_dailymix’:

Comparing playlists

Daily mix 2

The original playlist ‘Daily mix 2’, generated and personalized by Spotify, consists of 39 songs spread across 47 different ‘genre communities’ and 10 different ‘artist communities’. The average shortest path between all the artists in the playlist is 2.9 links, and the longest shortest path is 5 links. One of these longest paths runs between the Britpop band ‘Blur’ and the Icelandic ‘Ásgeir’:

‘Blur’ -> ‘White Town’ -> ‘Goldfrapp’ -> ‘GusGus’ -> ‘Vök’ -> ‘Ásgeir’

Diversified_dailymix

The newly generated playlist ‘Diversified_dailymix’ also consists of 39 songs, but these are spread across 84 different ‘genre communities’ and 39 different ‘artist communities’. The average shortest path between all the artists in the playlist is 5.7 links, and the longest shortest path is 9 links, between the Vietnamese ‘Andiez’ and the Indian ‘Kulbir Jhinjer’:

‘Andiez’ -> ‘Lyly’ -> ‘Binz’ -> ‘Silky’ -> ‘Caps’ -> ‘Nafees’ -> ‘Flint J’ -> ‘Raashi Sood’ -> ‘Armaan Bedil’ -> ‘Kulbir Jhinjer’
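
The average and longest shortest paths reported for both playlists can be computed by measuring the shortest path for every pair of artists in a playlist. The sketch below does this with a plain breadth-first search on a toy graph that contains only the Blur to Ásgeir chain shown earlier. Treating the network as undirected and ignoring unconnected pairs are assumptions, as are the function names.

```python
from collections import deque
from itertools import combinations


def shortest_path_length(graph, source, target):
    """Number of links on the shortest path, found by breadth-first search."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distances[node]
        for neighbour in graph[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return None   # the two artists are not connected


def path_statistics(graph, playlist_artists):
    """Average and longest shortest path over all connected artist pairs."""
    lengths = [
        shortest_path_length(graph, a, b)
        for a, b in combinations(playlist_artists, 2)
    ]
    lengths = [length for length in lengths if length is not None]
    return sum(lengths) / len(lengths), max(lengths)


# Toy graph holding only the chain from the 'Daily mix 2' example.
chain = ["Blur", "White Town", "Goldfrapp", "GusGus", "Vök", "Ásgeir"]
toy_graph = {name: set() for name in chain}
for left, right in zip(chain, chain[1:]):
    toy_graph[left].add(right)
    toy_graph[right].add(left)

average, longest = path_statistics(toy_graph, ["Blur", "Goldfrapp", "Ásgeir"])
print(round(average, 2), longest)  # 3.33 5
```

The three pairs are 2, 5 and 3 links apart, which gives an average of about 3.33 and a longest shortest path of 5.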

Diversity in numbers

Rounding up

With this relatively simple project I have tried to show that it is possible to become aware of, and maybe burst, social bubbles in places other than social media and news sites, here in the music we are passively exposed to in our everyday life.

Tech designers might be building an invisible seamless technological world for us, the lazy users, to surrender to. But it is not too late to move in the opposite direction where we can think and rethink alternative tech design worlds to inhabit.