Beneath the Surface: A Finding Nemo Character Study

Intro

Welcome to “Beneath the Surface”! If you’re a fan of the beloved Pixar classic, Finding Nemo, and are interested in analyzing the script using text analysis, you’re in the right place. In this blog, we’ll be taking a deep dive into the words spoken by each of the characters in the movie to uncover their personalities, motivations, and relationships with one another. Using text analysis tools, we’ll be able to examine the language and patterns of speech of each character and draw insights about how they contribute to the overall narrative. Whether you’re a fan of Marlin, Nemo, Crush, or any of the other lovable characters in this underwater adventure, “Beneath the Surface” is the place to be for a closer look at the story of Finding Nemo.

Wordclouds

Top Characters

Marlin

Marlin, the overprotective clownfish and father of Nemo, is the main character in Pixar’s Finding Nemo.

One of Marlin’s most frequently used words is “son,” which reflects his deep love and concern for Nemo. He is constantly worried about his son’s safety, and his use of the word “son” reinforces this central theme. His use of “look” also reflects his hyper-awareness of his surroundings and his need to stay vigilant to keep his son safe. One interesting finding is Marlin’s use of the word “boat,” which could reflect his fear of the human world and his desire to save Nemo from it.

Marlin’s less frequent but important use of words like “coral” and “kids” reveals more about his relationships with other characters in the movie. His use of “coral” reflects his attachment to his home and community, while his use of “kids” reveals his connection to other parents from whom he receives advice, such as Crush, the sea turtle.

Dory

Dory is undoubtedly one of the most beloved characters in Finding Nemo, known for her famous catchphrase “just keep swimming.”

One of the most frequently used words in Dory’s vocabulary is “swim”, which reflects her natural inclination to explore. It also serves as a metaphor for her philosophy on life, which is to keep moving forward no matter what obstacles may arise. Dory also frequently uses the word “can’t”. This reflects her lack of confidence, as well as her tendency to give up when faced with a challenge. However, it also serves to highlight her growth throughout the movie, as she learns to believe in herself and her abilities. Another word that is less frequently used but still important to Dory’s character is her own name, “Dory,” which she uses as a form of self-identification and self-affirmation.

Gill

Gill is the leader of the Tank Gang.

One of his most frequently used words is “calm.” When the group is trying to figure out how to escape from the dentist’s office, Gill tells them to “stay calm” and not panic. This shows that Gill is a natural leader who is able to keep a level head in stressful situations, and that he is concerned about the well-being of his fellow fish.

The word “lead” is also important in understanding Gill’s character. While it is not one of his most frequently used words, it does appear several times throughout the movie. For example, when the Tank Gang is trying to get out of the plastic bags, Gill tells Nemo to “lead the way.” This shows that Gill is not only a leader, but also recognizes leadership qualities in others. He is able to delegate tasks and trust others to take charge when needed.

Peach

Peach is one of the supporting characters in the movie “Finding Nemo.” She is a pink starfish who lives in the tank in the dentist’s office where Nemo is taken after being captured in the wild. Throughout the movie, Peach is shown to be a loyal and caring friend to the other inhabitants of the tank, particularly Gill, who is the de facto leader of the group.

A text analysis of Peach’s dialogue reveals that she frequently mentions both Gill and Nemo, which suggests that these two characters are important to her. Peach often talks about Gill in relation to his plans to escape from the tank, showing her admiration for his leadership skills and bravery.

Sentiment Analysis

Sentiment analysis involves analyzing the overall “feel” of a body of text. We wanted to see what the overall sentiments of the main characters were using two common sentiment lexicons: the AFINN Lexicon and the NRC Lexicon. While these lexicons enable powerful analysis of characters’ general sentiments, they are unfortunately not comprehensive, and many of the words in Finding Nemo did not appear in either lexicon.

AFINN Sentiments

The AFINN lexicon assigns each word an integer from -5 to 5 according to how negative or positive its connotation is. We averaged these scores across all of the words said by each character to get an average sentiment for each character on the same scale of -5 to 5. For simplicity, we counted the number of AFINN lexicon words said by each character and display only the top 10 characters by total AFINN words said.

From these results, we can see that most characters are on average fairly positive. This makes sense given Finding Nemo’s status as a PG movie intended for children. Of these characters, Peach and Crush have very positive average sentiments, which tracks with their personalities. Peach, the pink starfish, is a big-sister type character, helping Nemo when he first enters the dentist’s tank. With her positive encouragement and protection, Peach’s dialogue is overall very positive. On the other hand, Crush is a surfer dude type character, and his carefree lifestyle leads to largely positive dialogue. Finally, the one character with a negative overall sentiment is Bloat, the pufferfish. Bloat starts off by scaring Nemo when he first enters the dentist’s office, and throughout the film is generally short-tempered and anxious, leading to an overall negative sentiment.
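The averaging step described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the chart: the miniature lexicon, its scores, and the sample lines are made up for the example (they are not quotes from the script), and the sketch assumes that only words found in the lexicon count toward a character's average.

```python
from collections import defaultdict

# Hypothetical miniature lexicon standing in for AFINN (scores are illustrative).
TOY_AFINN = {"good": 3, "love": 3, "happy": 3, "bad": -3, "scared": -2, "lost": -3}


def average_afinn(lines, lexicon):
    """Return {character: (mean score, number of lexicon words said)}."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for character, text in lines:
        for word in text.lower().split():
            word = word.strip(".,!?\"'")  # drop surrounding punctuation
            if word in lexicon:  # assumption: unmatched words are ignored
                totals[character] += lexicon[word]
                counts[character] += 1
    return {c: (totals[c] / counts[c], counts[c]) for c in counts}


def top_characters(scores, n=10):
    """Keep the n characters who said the most lexicon words, most first."""
    return sorted(scores.items(), key=lambda kv: kv[1][1], reverse=True)[:n]


# Made-up sample dialogue as (character, line) pairs.
lines = [
    ("Marlin", "I lost my son!"),
    ("Dory", "I love it, I am so happy."),
    ("Marlin", "This is bad."),
]
scores = average_afinn(lines, TOY_AFINN)
for character, (mean, n_words) in top_characters(scores):
    print(character, mean, n_words)
```

With the full lexicon and the full script in place of the toy inputs, the same two functions would produce the per-character averages and the top-10 cutoff.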

NRC Sentiments

In contrast to the AFINN lexicon, the NRC lexicon categorizes words into different sentiment categories such as anticipation, fear, or joy, allowing for one word to exist in multiple categories. In this analysis, we count the number of words in each character’s dialogue associated with each sentiment category, and display the top 5 sentiments in each character’s dialogue. As this analysis is more complex, we only include the top 6 characters by total NRC words said.

As before, we can see that Crush’s dialogue is mostly positive, with over half of his words belonging to that category. Unfortunately, due to the large number of words that fit into the broad categories of “positive” and “negative,” these sentiment categories figure prominently in each character’s dialogue. Among the other, narrower, sentiment categories, anticipation figures prominently in Dory, Marlin, and Nemo’s dialogues. This makes sense since Nemo is thrust into an unfamiliar environment while Dory and Marlin frantically try to find him.
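The counting behind this kind of breakdown can be sketched as follows. The word-to-category mapping below is a hypothetical stand-in for the NRC lexicon, and the sample lines are invented for the example; the point is only that one word can add to several categories at once.

```python
from collections import Counter

# Hypothetical stand-in for the NRC lexicon: each word maps to one or more categories.
TOY_NRC = {
    "hope": ["anticipation", "joy", "positive"],
    "friend": ["joy", "positive", "trust"],
    "shark": ["fear", "negative"],
}


def nrc_counts(lines, lexicon):
    """Return {character: Counter of sentiment categories} for the given lines."""
    counts = {}
    for character, text in lines:
        counter = counts.setdefault(character, Counter())
        for word in text.lower().split():
            word = word.strip(".,!?\"'")  # drop surrounding punctuation
            # A word contributes one count to every category it belongs to.
            counter.update(lexicon.get(word, []))
    return counts


def top_sentiments(counter, n=5):
    """Return the n most common categories as (category, count) pairs."""
    return counter.most_common(n)


# Made-up sample dialogue as (character, line) pairs.
lines = [
    ("Dory", "A friend! I hope so."),
    ("Marlin", "Shark! Shark!"),
]
counts = nrc_counts(lines, TOY_NRC)
for character, counter in counts.items():
    print(character, top_sentiments(counter))
```

Because broad categories such as “positive” and “negative” are attached to many words, they tend to accumulate counts faster than the narrower ones, which matches the pattern noted above.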

Network Centrality

Network centrality can be used to study the relative importance, or centrality, of characters in the movie. Here we count every time a character speaks right after another character as an interaction, and sum up the total number of interactions for each pair of characters. For simplicity, we only consider interactions between the top ten characters by total words spoken.
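As a rough sketch of this counting rule, the Python below walks through the speakers in script order and tallies each adjacent pair. It rests on a few assumptions that the description above does not settle: interactions are treated as undirected, two consecutive lines by the same character are not counted, and a pair is dropped if either speaker is outside the chosen set of characters. The speaker sequence is invented for the example.

```python
from collections import Counter


def count_interactions(speakers, keep=None):
    """Count how often two different characters speak back to back.

    speakers: character names in the order their lines appear.
    keep: optional set of characters to restrict the count to.
    """
    pairs = Counter()
    for previous, current in zip(speakers, speakers[1:]):
        if previous == current:
            continue  # assumption: a character following themselves is not an interaction
        if keep is not None and (previous not in keep or current not in keep):
            continue
        # Sort the names so (A, B) and (B, A) land on the same undirected pair.
        pairs[tuple(sorted((previous, current)))] += 1
    return pairs


# Made-up speaker order, one entry per line of dialogue.
speakers = ["Marlin", "Dory", "Marlin", "Nemo", "Gill", "Nemo"]
interactions = count_interactions(speakers)
for pair, n in interactions.most_common():
    print(pair, n)
```

The resulting pair counts are what an edge-weighted graph of characters would be built from.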

Character Interactions

This graph displays the characters as nodes and interactions between characters as edges. Edge color and thickness are scaled according to the number of interactions between the characters at either end. Three pairwise interactions stand out: Dory and Marlin, Marlin and Nemo, and Nemo and Gill. Dory and Marlin have the most interactions of any pair of characters, which is understandable given that most of the movie follows their journey to save Nemo. Nemo and Marlin have the second-most interactions, as they are a very close father-son pair, despite being separated for a large part of the movie. Finally, Gill is the leader of the fish in the dentist’s tank where Nemo ends up, and so also interacts a lot with Nemo.

Character Centrality

We then evaluated the strength centrality of each node in the above graph. The strength centrality of a node is simply the sum of the weights of all edges attached to it, which here means the total number of interactions a character has with the other characters.

Like the graph, this bar chart puts Marlin as the most central character, which makes sense as the movie mostly follows his journey in finding his son. From there, the most central characters are Dory, Nemo, and Gill, in that order, all three of whom took part in one of the three most common pairwise interactions above. All in all, network centrality correlates well with the status of Marlin, Dory, and Nemo as the main characters of Finding Nemo.
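Strength centrality as defined above is a short computation once the pairwise interaction counts are known. The sketch below uses plain Python and invented edge weights purely for illustration; the numbers are not taken from the script.

```python
from collections import defaultdict


def strength_centrality(edge_weights):
    """Sum the weights of all edges touching each node.

    edge_weights: {(node_a, node_b): weight} for an undirected graph.
    """
    strength = defaultdict(int)
    for (a, b), weight in edge_weights.items():
        # Each edge contributes its full weight to both of its endpoints.
        strength[a] += weight
        strength[b] += weight
    return dict(strength)


# Made-up interaction counts between pairs of characters.
edges = {
    ("Dory", "Marlin"): 5,
    ("Marlin", "Nemo"): 3,
    ("Gill", "Nemo"): 2,
}
strength = strength_centrality(edges)
ranking = sorted(strength.items(), key=lambda kv: kv[1], reverse=True)
for character, value in ranking:
    print(character, value)
```

In this toy graph Marlin comes out on top because he is an endpoint of the two heaviest edges, which is the same mechanism that makes him the most central character in the chart.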

References

Trindade, Ash. (July 2022), “Finding Nemo Movie Script”, Kaggle, available at https://www.kaggle.com/datasets/ashtrindade/finding-nemo-movie-script.

Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686 https://doi.org/10.21105/joss.01686.

Feinerer I, Hornik K (2022). tm: Text Mining Package. R package version 0.7-10, https://CRAN.R-project.org/package=tm.

Fellows I (2018). wordcloud: Word Clouds. R package version 2.6, https://CRAN.R-project.org/package=wordcloud.

Silge J, Robinson D (2016). “tidytext: Text Mining and Analysis Using Tidy Data Principles in R.” Journal of Open Source Software, 1(3). doi:10.21105/joss.00037 https://doi.org/10.21105/joss.00037.

Wickham H (2007). “Reshaping Data with the reshape Package.” Journal of Statistical Software, 21(12), 1-20. http://www.jstatsoft.org/v21/i12/.

Ashton D, Porter S (2016). radarchart: Radar Chart from ‘Chart.js’. R package version 0.3.1, https://CRAN.R-project.org/package=radarchart.

Xie Y (2023). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.42.

Lang D (2023). wordcloud2: Create Word Cloud by htmlWidget. R package version 0.2.2, https://github.com/lchiffon/wordcloud2.

Wickham H, François R, Henry L, Müller K (2022). dplyr: A Grammar of Data Manipulation. R package version 1.0.10, https://CRAN.R-project.org/package=dplyr.

Wickham H (2022). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.5.0, https://CRAN.R-project.org/package=stringr.

Barnier J (2022). rmdformats: HTML Output Formats and Templates for ‘rmarkdown’ Documents. R package version 1.0.4, https://CRAN.R-project.org/package=rmdformats.

Wickham H, Girlich M (2022). tidyr: Tidy Messy Data. R package version 1.2.1, https://CRAN.R-project.org/package=tidyr.

Hvitfeldt E (2022). textdata: Download and Load Various Text Datasets. R package version 0.4.4, https://CRAN.R-project.org/package=textdata.

Csardi G, Nepusz T (2006). “The igraph software package for complex network research.” InterJournal, Complex Systems, 1695. https://igraph.org.

Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York.

Briatte F (2021). ggnetwork: Geometries to Plot Networks with ‘ggplot2’. R package version 0.5.10, https://CRAN.R-project.org/package=ggnetwork.

Wickham H, Seidel D (2022). scales: Scale Functions for Visualization. R package version 1.2.1, https://CRAN.R-project.org/package=scales.