Enhancing podcast learning: Saving audio clips and transcripts for engaged listeners


As the leading audio streaming app, Spotify has heavily invested in the podcasting industry in recent years, broadening the popularity and global reach of tuning into podcasts as a way to learn and gain new information. Yet, for a platform with such breadth and depth of knowledge, there is no way for listeners to save any of their learnings.

Through Designlab's UX Academy, I worked as the sole Product Designer to add a conceptual feature that allows Spotify listeners to save podcast clips for future reference.


8 weeks


Product Designer


Figma, Dovetail, Maze


A feature for saving and organizing popular and custom clips so listeners can return to important information later.

Save Custom Clips

Create saved text and audio clips by highlighting a transcript.

Save Quick Clips

View and save the most popular insights of an episode with just one click.

Saved Clips Library

Access your saved clips or view a clip's transcript from your library when inspiration strikes.


Spotify is the most popular one-stop shop for audio needs.

To better understand what already exists in the market for podcast listeners, I explored the features and interface of 5 of the top podcast streaming platforms. While there were slight variations in interface design and the experience of searching for and discovering shows and episodes, the listening experience was largely similar.

So what sets Spotify apart? It’s a one-stop shop for all one’s listening needs, from podcasts to music, and most recently, audiobooks as well.


Most podcast listeners tune in to learn something new and to be entertained

Beyond understanding the motivations and experiences of podcast listeners (and maybe getting a few podcast recommendations), I also wanted to get users’ impressions of the mobile interfaces of the podcast apps mentioned above.

I interviewed 9 users and used Dovetail to tag what they shared into an empathy map bar chart, which revealed a few common insights:

Sometimes I’ll open my notes app to jot something down...but then I have to rewind the podcast. Or I’ll just make a mental note...but most often I forget it.

I listen to podcasts to learn new things! Usually self-help or career related.

I don’t usually save whole episodes and rarely re-listen to one I’ve already listened to, but I’d probably share shorter clips to reference later or share with other people

User insights helped to narrow down and identify the problem:


74% of podcast listeners seek to learn new things, yet there is no way to save podcast content to help users remember what is important to them.


Make it multimodal: combining two learning modalities improves learning.

Intrigued by the impact of learning methods and how users process important information, I delved into research on different learning modalities.

Multimodal learning is based on the VARK model, which suggests that individuals have preferred learning styles such as visual, auditory, reading/writing, or kinesthetic learning.





Studies show that combining text with audio improves focus and memory, and reduces mind wandering, outperforming both listening without text and silent reading.

Continuing my exploration of learning modalities, I investigated existing apps in the learning and education market and their approaches to content delivery and saving information. One common pattern I discovered among multimodal learning apps is the inclusion of text alongside another mode of learning.


Keeping in mind Spotify’s business goals of increasing user engagement and listenership while balancing the users’ needs on the learning front, I brainstormed and narrowed down opportunity statements to the following:

How Might We...

...make saving information more accessible?

...encourage users to save podcast clips?

...establish podcasts as a way of learning?

These opportunities led to the following hypothesis for an added feature:


Enabling users to save podcast clips with text transcriptions offers a multimodal learning experience that enhances learning and promotes sustained use as a valuable learning tool.


Designing 3 key flows

The ideation process resulted in the development of user flows for three scenarios. While these scenarios may address primary needs for specific personas, they can also serve as use cases for any user.

➣ One-click clip saving from the audio player
➣ Saving a clip through the transcript
➣ Accessing saved clips for reference

I want to save clips easily as I’m going about my day

I want to find my saved clips easily so I can reference them later

I want to be precise and save my clips through a transcript


As I began to explore designs in wireframes, I focused on 3 elements of the design:

Transitioning from audio player to transcript.

Spotify's music player includes a Lyrics tab for viewing song lyrics. However, because podcast transcripts inherently contain more text, I opted for a toggle to switch to the transcript view. In an A/B test, 94% of testers preferred the toggle over swiping up.

Saving Quick Clips

Users can save Quick Clips (the quotes most often saved by other listeners, which are accentuated in the transcript) with just a tap of a button.

Similar to how ebooks present their most-highlighted passages, surfacing frequently saved segments with a quick way to save them allows, and perhaps encourages, users to easily engage with the content.

The selected solution emphasizes the suggested quote through a distinction in text size and separation from the text paragraph.

Saving custom clips

Like highlighting text in a physical book or on paper, users are able to save a custom clip by highlighting the transcript text.

This interaction saves the text as well as the audio clip that pairs with it. It was important for users to be able to save a clip through a transcript, both to make the episode content more accessible and to offer a multimodal approach that supports the user’s learning.
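To make the pairing of text and audio concrete, here is a minimal sketch of how a highlighted transcript range could map to a saved clip. It assumes the transcript is available with word-level timestamps; the `Word` and `SavedClip` types, the `clip_from_selection` helper, and the sample data are all hypothetical illustrations, not part of the prototype or of Spotify's actual systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One transcript word with its position in the audio (hypothetical shape)."""
    text: str
    start: float  # seconds from the start of the episode
    end: float


@dataclass(frozen=True)
class SavedClip:
    """A saved clip that keeps the highlighted text and its audio range together."""
    episode_id: str
    text: str
    start: float
    end: float


def clip_from_selection(episode_id, transcript, first, last):
    """Build a clip from an inclusive range of highlighted word indices."""
    if not (0 <= first <= last < len(transcript)):
        raise ValueError("selection is outside the transcript")
    words = transcript[first:last + 1]
    return SavedClip(
        episode_id=episode_id,
        text=" ".join(w.text for w in words),
        start=words[0].start,   # audio begins at the first highlighted word
        end=words[-1].end,      # and stops at the end of the last one
    )


# Made-up transcript fragment, for illustration only.
transcript = [
    Word("Habits", 12.0, 12.4),
    Word("compound", 12.4, 13.0),
    Word("over", 13.0, 13.2),
    Word("time.", 13.2, 13.7),
]
clip = clip_from_selection("episode-123", transcript, 1, 3)
```

Storing the text and the time range on the same record is what would let the library show a clip's transcript and play its audio from one saved item.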

High Fidelity Solution

With the mid-fidelity wireframes and the addition of animations and interactions, I created a high-fidelity prototype that included actions familiar to mobile users. The three user flows are reflected in the tasks users would be asked to complete through Maze during usability testing.

Usability Testing

Users thought the added feature was well integrated into the existing product

Across 3 moderated and 18 unmoderated tests, I gathered feedback on the usability of the feature and calculated a System Usability Scale (SUS) score. Based on the 10-question SUS questionnaire, the added feature received a score of 77.5, which falls into the ‘good’ range.

The resulting SUS score suggests this feature is easy to use and well-integrated into the existing systems. This could support the possibility of increased user engagement and retention, leading to higher listenership, brand loyalty, and increased revenue.
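For readers unfamiliar with how a SUS score is derived, the sketch below follows the standard scoring rule: each of the 10 answers is on a 1 to 5 scale, odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5 to give a 0 to 100 score per participant. The function names and the sample answers are hypothetical and are not the data collected in this study.

```python
from statistics import mean


def sus_score(responses):
    """Return the 0-100 SUS score for one participant's 10 answers (each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - answer
    return total * 2.5


def mean_sus(all_responses):
    """Average the per-participant SUS scores into one overall score."""
    return mean(sus_score(r) for r in all_responses)


# Made-up answers from two participants, for illustration only.
sample = [
    [4, 2, 4, 1, 4, 2, 5, 2, 4, 2],
    [3, 2, 4, 2, 4, 3, 4, 2, 3, 2],
]
overall = mean_sus(sample)
```

The overall score reported for a study is the mean of the per-participant scores, which is how a value such as 77.5 can arise from a mix of higher and lower individual ratings.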


What’s in a name? (And other enhancements)

Following usability testing, mentor feedback and a round of group critiques, adjustments were made to the final designs, including:

➣ Using tool tips to add clarity to the quick clip feature

➣ Streamlining the custom clip interactions

➣ Utilizing heat maps to create additional entry points for accessing saved clips



Learnings (and a Haiku!)

This added feature is not groundbreaking in terms of its ability to switch between audio and transcript views, nor is it the first time users have saved content, organized digital files, or highlighted text within an app.

I decided to focus on saving a clip through a transcript rather than the audio player to support the users who benefit from dual modalities of learning. In my last 3 years as a teacher, I worked with many students in the process of being evaluated for learning disabilities. Through this work, I learned the importance of accessibility and of bringing different types of learning experiences to help make an idea or skill stick. Future iterations could benefit from an exploration of the audio side of saving a clip as well as a deeper dive into how these saved clips are stored.

Last but not least, a haiku:

An Ode to Figma Components

Command, option, K—

A worthy time investment.

(Endless YouTube vids)
