Example Project
October 3, 2024
A PYTHON PROJECT
INTRODUCTION
Netflix, a leader in global streaming, has evolved significantly since its inception. This project
examines whether Netflix has become more family-friendly over time by analyzing three key aspects
of its content library:
• Added Content Age Ratings: Assessing Family-Friendliness Trends
• Additions of Kids & Family Content: Yearly Trends
• Duration Trends of Family Content
This report aims to uncover Netflix’s content strategy and determine if it has shifted towards
becoming more family-friendly in response to audience demands and viewing habits.
The dataset used for this project is the Netflix Movies and TV Shows Dataset.
2 I. DATA EXPLORATION
First, let’s import libraries to help with processing and visualizing the data.
[ ]: # Importing "pandas" library for reading the dataset and working with it.
import pandas as pd
Now, we will import the “Netflix Movies and TV Shows” dataset using pandas and get an initial
overview.
[ ]: df = pd.read_csv('/content/drive/MyDrive/netflix_titles.csv')
df.shape
[ ]: (8807, 12)
This dataset has 8807 rows and 12 columns.
Now let’s display the first few rows of the dataset to get a sense of its structure.
[ ]: df.head()
cast country \
0 NaN United States
1 Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban… South Africa
2 Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi… NaN
3 NaN NaN
4 Mayur More, Jitendra Kumar, Ranjan Raj, Alam K… India
listed_in \
0 Documentaries
1 International TV Shows, TV Dramas, TV Mysteries
2 Crime TV Shows, International TV Shows, TV Act…
3 Docuseries, Reality TV
4 International TV Shows, Romantic TV Shows, TV …
description
0 As her father nears the end of his life, filmm…
1 After crossing paths at a party, a Cape Town t…
2 To protect his family from a powerful drug lor…
3 Feuds, flirtations and toilet talk go down amo…
4 In a city of coaching centers known to train I…
The dataset generally looks well-structured and mostly straightforward. We will only
look into columns that need further clarification.
First, let’s have a look at column ‘type’.
[ ]: df.type.value_counts()
[ ]: type
Movie 6131
TV Show 2676
Name: count, dtype: int64
Netflix categorizes its offerings into two main types: “TV Show” and “Movie”. It is
clear that the majority of Netflix content consists of movies.
Let’s check the column ‘country’.
[ ]: df.country.value_counts()
[ ]: country
United States 2818
India 972
United Kingdom 419
Japan 245
South Korea 199
…
Romania, Bulgaria, Hungary 1
Uruguay, Guatemala 1
France, Senegal, Belgium 1
Mexico, United States, Spain, Colombia 1
United Arab Emirates, Jordan 1
Name: count, Length: 748, dtype: int64
The result shows 748 distinct entries, far more than the roughly 200 countries that exist
in the world. The reason is that a single entry can list several countries separated by commas, so
every distinct combination of countries counts as its own value. This suggests the column records
the countries involved in filming or production rather than release countries.
Therefore, we’ll rename “country” to “filming_countries” for clarity.
[ ]: #Rename the 'country' column into 'filming_countries'
df.rename(columns={'country': 'filming_countries'}, inplace=True)
cast filming_countries \
0 NaN United States
1 Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban… South Africa
2 Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi… NaN
3 NaN NaN
4 Mayur More, Jitendra Kumar, Ranjan Raj, Alam K… India
listed_in \
0 Documentaries
1 International TV Shows, TV Dramas, TV Mysteries
2 Crime TV Shows, International TV Shows, TV Act…
3 Docuseries, Reality TV
4 International TV Shows, Romantic TV Shows, TV …
description
0 As her father nears the end of his life, filmm…
1 After crossing paths at a party, a Cape Town t…
2 To protect his family from a powerful drug lor…
3 Feuds, flirtations and toilet talk go down amo…
4 In a city of coaching centers known to train I…
Now let’s see the distribution of Netflix content based on countries involved in production.
[ ]: # Distribution of content by countries involved in filming process.
df['filming_countries'].value_counts(normalize=True) * 100
[ ]: filming_countries
United States 35.330993
India 12.186560
United Kingdom 5.253260
Japan 3.071715
South Korea 2.494985
…
Romania, Bulgaria, Hungary 0.012538
Uruguay, Guatemala 0.012538
France, Senegal, Belgium 0.012538
Mexico, United States, Spain, Colombia 0.012538
United Arab Emirates, Jordan 0.012538
Name: proportion, Length: 748, dtype: float64
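Just over 35% of the titles with a listed country are attributed to the United States
alone, making it the most dominant source in the platform’s overall catalog. Titles that
list the United States together with other countries are counted as separate entries, so
this share is a lower bound.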
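Because one entry can hold several countries, counting individual countries requires splitting the entries first. The following is a minimal sketch on a small made-up sample (the `sample` DataFrame and its values are illustrative assumptions, not rows from the dataset):

```python
import pandas as pd

# Hypothetical sample mimicking the comma-separated 'filming_countries' values.
sample = pd.DataFrame({
    "filming_countries": [
        "United States",
        "India",
        "United States, France",
        None,
        "Mexico, United States, Spain",
    ]
})

# Split each entry on commas, give every country its own row, and trim spaces.
countries = (
    sample["filming_countries"]
    .dropna()
    .str.split(",")
    .explode()
    .str.strip()
)

# Count titles per individual country (a co-production counts once per country).
country_counts = countries.value_counts()
print(country_counts)
```

Applied to the full column, the same steps would show how many titles involve each country at all, rather than how many list that country on its own.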
Since “release_year” is temporal data, we’ll use matplotlib.pyplot to visualize trends rather than
using the describe() function.
Netflix’s content is predominantly from recent years, with the majority released in 2020.
Content from before 2000 is minimal.
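A chart of this kind could be produced along the following lines. This is only a sketch on made-up values: the `release_years` Series is a hypothetical stand-in for `df['release_year']`, and the labels are illustrative.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample; in the notebook this would be df['release_year'].
release_years = pd.Series([1999, 2015, 2018, 2018, 2019, 2020, 2020, 2020, 2021])

# Number of titles per release year, in chronological order.
titles_per_year = release_years.value_counts().sort_index()

fig, ax = plt.subplots(figsize=(10, 4))
ax.bar(titles_per_year.index, titles_per_year.values)
ax.set_xlabel("Release year")
ax.set_ylabel("Number of titles")
ax.set_title("Titles by release year")
plt.show()
```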
Let’s have a look at the “rating” column.
[ ]: df.rating.value_counts()
[ ]: rating
TV-MA 3207
TV-14 2160
TV-PG 863
R 799
PG-13 490
TV-Y7 334
TV-Y 307
PG 287
TV-G 220
NR 80
G 41
TV-Y7-FV 6
NC-17 3
UR 3
74 min 1
84 min 1
66 min 1
Name: count, dtype: int64
The ‘rating’ column indicates the age classification of the content (e.g., PG, R, or TV-MA)
rather than its likability or star rating. Therefore, we will rename the “rating”
column to “age_rating”.
It’s important to note that three values (“74 min”, “84 min”, and “66 min”) are data-entry
errors that appear to belong to the “duration” column, but this does not change the fact
that the column represents age ratings.
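Misplaced values like these can also be located programmatically instead of by reading the counts. A minimal sketch, assuming the stray values always look like a number followed by "min" (the sample data is made up):

```python
import pandas as pd

# Hypothetical sample: valid age ratings mixed with misplaced durations.
sample = pd.DataFrame({"rating": ["TV-MA", "74 min", "PG-13", None, "66 min"]})

# True where the whole value looks like "<number> min" instead of an age rating.
looks_like_duration = sample["rating"].str.fullmatch(r"\d+\s*min", na=False)

# Rows whose rating is really a duration.
misplaced = sample[looks_like_duration]
print(misplaced)
```

The boolean mask can later be reused to drop or repair those rows.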
[ ]: #Rename the 'rating' column into 'age_rating'
df.rename(columns={'rating': 'age_rating'}, inplace=True)
cast filming_countries \
0 NaN United States
1 Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban… South Africa
2 Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi… NaN
3 NaN NaN
4 Mayur More, Jitendra Kumar, Ranjan Raj, Alam K… India
listed_in \
0 Documentaries
1 International TV Shows, TV Dramas, TV Mysteries
2 Crime TV Shows, International TV Shows, TV Act…
3 Docuseries, Reality TV
4 International TV Shows, Romantic TV Shows, TV …
description
0 As her father nears the end of his life, filmm…
1 After crossing paths at a party, a Cape Town t…
2 To protect his family from a powerful drug lor…
3 Feuds, flirtations and toilet talk go down amo…
4 In a city of coaching centers known to train I…
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8807 entries, 0 to 8806
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 show_id 8807 non-null object
1 type 8807 non-null object
2 title 8807 non-null object
3 director 6173 non-null object
4 cast 7982 non-null object
5 filming_countries 7976 non-null object
6 date_added 8797 non-null object
7 release_year 8807 non-null int64
8 age_rating 8803 non-null object
9 duration 8804 non-null object
10 listed_in 8807 non-null object
11 description 8807 non-null object
dtypes: int64(1), object(11)
memory usage: 825.8+ KB
There are missing values in 6 out of the 12 columns, because the non-null counts of
these columns do not match the total number of entries.
Let’s have a closer look at the number of null-values each column has.
[ ]: #Count missing values for each column
df.isnull().sum()
[ ]: show_id 0
type 0
title 0
director 2634
cast 825
filming_countries 831
date_added 10
release_year 0
age_rating 4
duration 3
listed_in 0
description 0
dtype: int64
As shown in the chart, the director column has the most missing values, followed
by filming_countries and cast. The duration, age_rating, and date_added
columns also have missing values, but the number of missing values is minimal compared
to the overall size.
columns_with_missing
[ ]: director 2634
cast 825
filming_countries 831
date_added 10
age_rating 4
duration 3
dtype: int64
Result is:
[ ]: missing_percentages
[ ]: show_id 0.000000
type 0.000000
title 0.000000
director 29.908028
cast 9.367549
filming_countries 9.435676
date_added 0.113546
release_year 0.000000
age_rating 0.045418
duration 0.034064
listed_in 0.000000
description 0.000000
dtype: float64
Approximately 30% of the director column contains missing values, while the cast and
filming_countries columns have missing percentages of around 10%. The date_added,
age_rating, and duration columns each have missing values of less than 1%.
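Percentages of this kind can be obtained directly from the null mask. The sketch below uses a small made-up DataFrame; `pct_with_missing` is a name introduced here for illustration only.

```python
import pandas as pd

# Hypothetical sample with gaps in two of three columns.
sample = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "director": ["X", None, None, "Y"],
    "cast": [None, "P, Q", "R", "S"],
})

# isnull() yields booleans; their column mean is the fraction of missing values.
missing_percentages = sample.isnull().mean() * 100

# Keep only the columns that actually have gaps.
pct_with_missing = missing_percentages[missing_percentages > 0]
print(pct_with_missing)
```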
cast filming_countries \
1 Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban… South Africa
3 NaN NaN
4 Mayur More, Jitendra Kumar, Ranjan Raj, Alam K… India
10 NaN NaN
14 NaN NaN
listed_in \
1 International TV Shows, TV Dramas, TV Mysteries
3 Docuseries, Reality TV
4 International TV Shows, Romantic TV Shows, TV …
10 Crime TV Shows, Docuseries, International TV S…
14 British TV Shows, Crime TV Shows, Docuseries
description
1 After crossing paths at a party, a Cape Town t…
3 Feuds, flirtations and toilet talk go down amo…
4 In a city of coaching centers known to train I…
10 Sicily boasts a bold "Anti-Mafia" coalition. B…
14 Cameras following Bengaluru police on the job …
[ ]: df[df['director'].isnull()].tail()
cast \
8795 Mike Liscio, Emily Bauer, Billy Bob Thompson, …
8796 Gökhan Atalay, Payidar Tüfekçioglu, Baran Akbu…
8797 Michael Johnston, Jessica Gee-George, Christin…
8800 Sanam Saeed, Fawad Khan, Ayesha Omer, Mehreen …
8803 NaN
filming_countries date_added \
8795 Japan, Canada May 1, 2018
8796 Turkey January 17, 2017
8797 United States, France, South Korea, Indonesia September 13, 2018
8800 Pakistan December 15, 2016
8803 NaN July 1, 2019
listed_in \
8795 Anime Series, Kids' TV
8796 International TV Shows, TV Dramas
8797 Kids' TV
8800 International TV Shows, Romantic TV Shows, TV …
8803 Kids' TV, Korean TV Shows, TV Comedies
description
8795 Now that he's discovered the Pendulum Summonin…
8796 During the Mongol invasions, Yunus Emre leaves…
8797 Teen surfer Zak Storm is mysteriously transpor…
8800 Strong-willed, middle-class Kashaf and carefre…
8803 While living alone in a spooky town, a young g…
There is a pattern where the ‘type’ column shows ‘TV Show’ when the ‘director’ column
is null.
Let’s explore if there are any other cases.
[ ]: # Total rows where 'director' is null
director_null = df[df['director'].isnull()]
# Count the unique values in the 'type' column when 'director' is null
type_when_null = director_null['type'].value_counts()
type_when_null
[ ]: type
TV Show 2446
Movie 188
Name: count, dtype: int64
Aside from ‘TV Show’ entries, 188 rows of type ‘Movie’ also have no
director listed.
Let’s check how many entries in total there are for ‘TV Show’ and ‘Movie’ type.
[ ]: df['type'].value_counts()
[ ]: type
Movie 6131
TV Show 2676
Name: count, dtype: int64
For ‘TV Show’ content, it’s common for the director column to be left blank: 2,446
out of 2,676 entries list no director. This frequent absence is plausible because TV
shows, particularly reality shows, are often unscripted and may not have a designated
director listed.
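Expressed as shares, the counts above mean that about 91.4% of TV shows (2,446 of 2,676) and about 3.1% of movies (188 of 6,131) have no director. A per-type missing rate can be computed in one step; the sketch below uses a made-up sample rather than the real data:

```python
import pandas as pd

# Hypothetical sample: three TV shows and four movies.
sample = pd.DataFrame({
    "type": ["TV Show", "TV Show", "TV Show", "Movie", "Movie", "Movie", "Movie"],
    "director": [None, None, "A", "B", "C", None, "D"],
})

# Share of rows without a director, computed separately for each content type.
missing_share = sample["director"].isnull().groupby(sample["type"]).mean() * 100
print(missing_share.round(1))
```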
Let’s dig deeper into the 188 movies that are missing directors.
[ ]: df[(df['director'].isnull()) & (df['type'] == 'Movie')].head()
[ ]: show_id type title director \
404 s405 Movie 9to5: The Story of a Movement NaN
470 s471 Movie Bridgerton - The Afterparty NaN
483 s484 Movie Last Summer NaN
641 s642 Movie Sisters on Track NaN
717 s718 Movie Headspace: Unwind Your Mind NaN
cast filming_countries \
404 NaN NaN
470 David Spade, London Hughes, Fortune Feimster NaN
483 Fatih Şahin, Ece Çeşmioğlu, Halit Özgür Sarı, … NaN
641 NaN NaN
717 Andy Puddicombe, Evelyn Lewis Prieto, Ginger D… NaN
listed_in \
404 Documentaries
470 Movies
483 Dramas, International Movies, Romantic Movies
641 Documentaries, Sports Movies
717 Documentaries
description
404 In this documentary, female office workers in …
470 "Bridgerton" cast members share behind-the-sce…
483 During summer vacation in a beachside town, 16…
641 Three track star sisters face obstacles in lif…
717 Do you want to relax, meditate or sleep deeply…
In this sample, movies without directors are often documentaries.
Since documentaries are usually unscripted, that may be why directors aren’t
always listed for them.
[ ]: show_id type title \
0 s1 Movie Dick Johnson Is Dead
3 s4 TV Show Jailbirds New Orleans
10 s11 TV Show Vendetta: Truth, Lies and The Mafia
14 s15 TV Show Crime Stories: India Detectives
16 s17 Movie Europe's Most Dangerous Man: Otto Skorzeny in …
listed_in \
0 Documentaries
3 Docuseries, Reality TV
10 Crime TV Shows, Docuseries, International TV S…
14 British TV Shows, Crime TV Shows, Docuseries
16 Documentaries, International Movies
description
0 As her father nears the end of his life, filmm…
3 Feuds, flirtations and toilet talk go down amo…
10 Sicily boasts a bold "Anti-Mafia" coalition. B…
14 Cameras following Bengaluru police on the job …
16 Declassified documents reveal the post-WWII li…
[ ]: df[df['cast'].isnull()].tail()
filming_countries date_added \
8746 France, Netherlands, South Africa, Finland February 26, 2018
8755 United States November 1, 2016
8756 United States August 13, 2019
8763 United States March 31, 2017
8803 NaN July 1, 2019
listed_in \
8746 Documentaries, International Movies
8755 Crime TV Shows, Docuseries
8756 Documentaries, Music & Musicals
8763 Documentaries
8803 Kids' TV, Korean TV Shows, TV Comedies
description
8746 Winnie Mandela speaks about her extraordinary …
8755 This reality series recounts true stories of w…
8756 For the 50th anniversary of the legendary Wood…
8763 Filmmaker John Huston narrates this Oscar-nomi…
8803 While living alone in a spooky town, a young g…
In these samples, the ‘cast’ column is often empty for entries listed under documentaries,
docuseries, or TV shows. This makes sense because these types of content often feature
regular people rather than professional actors, so the ‘cast’ column is typically left blank.
… … … … …
8802 s8803 Movie Zodiac David Fincher
8803 s8804 TV Show Zombie Dumb NaN
8804 s8805 Movie Zombieland Ruben Fleischer
8805 s8806 Movie Zoom Peter Hewitt
8806 s8807 Movie Zubaan Mozez Singh
cast date_added \
0 NaN September 25, 2021
1 Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban… September 24, 2021
2 Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi… September 24, 2021
3 NaN September 24, 2021
4 Mayur More, Jitendra Kumar, Ranjan Raj, Alam K… September 24, 2021
… … …
8802 Mark Ruffalo, Jake Gyllenhaal, Robert Downey J… November 20, 2019
8803 NaN July 1, 2019
8804 Jesse Eisenberg, Woody Harrelson, Emma Stone, … November 1, 2019
8805 Tim Allen, Courteney Cox, Chevy Chase, Kate Ma… January 11, 2020
8806 Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan… March 2, 2019
listed_in \
0 Documentaries
1 International TV Shows, TV Dramas, TV Mysteries
2 Crime TV Shows, International TV Shows, TV Act…
3 Docuseries, Reality TV
4 International TV Shows, Romantic TV Shows, TV …
… …
8802 Cult Movies, Dramas, Thrillers
8803 Kids' TV, Korean TV Shows, TV Comedies
8804 Comedies, Horror Movies
8805 Children & Family Movies, Comedies
8806 Dramas, International Movies, Music & Musicals
description
0 As her father nears the end of his life, filmm…
1 After crossing paths at a party, a Cape Town t…
2 To protect his family from a powerful drug lor…
3 Feuds, flirtations and toilet talk go down amo…
4 In a city of coaching centers known to train I…
… …
8802 A political cartoonist, a crime reporter and a…
8803 While living alone in a spooky town, a young g…
8804 Looking to survive in a world taken over by zo…
8805 Dragged from civilian life, a former superhero…
8806 A scrappy but poor boy worms his way into a ty…
[ ]: date_added 10
age_rating 4
duration 3
dtype: int64
The columns date_added, age_rating, and duration have 10, 4, and 3 missing values,
respectively, out of 8,807 entries each. These represent very small percentages of the
total data: approximately 0.11%, 0.05%, and 0.03%. Since we don’t have external
data to accurately fill these gaps, we’ll leave them as missing, as they are unlikely to
significantly impact the analysis.
[ ]: 0
The result is 0, meaning no duplicate entries are present. Therefore, no further action
is needed.
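A duplicate check of this kind is typically a one-liner in pandas. The sketch below is an assumption about how such a count could be obtained, shown on made-up data; the `subset` variant is an optional extra for checking selected columns only.

```python
import pandas as pd

# Hypothetical sample: unique ids, but one repeated title.
sample = pd.DataFrame({
    "show_id": ["s1", "s2", "s3"],
    "title": ["A", "B", "A"],
})

# Rows that repeat an earlier row across all columns.
full_duplicates = sample.duplicated().sum()

# Rows that repeat an earlier row on selected columns only.
title_duplicates = sample.duplicated(subset=["title"]).sum()
print(full_duplicates, title_duplicates)
```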
4 III. Analysis
In this section, we will examine whether Netflix is becoming more family-friendly by analyzing three
aspects: the trends in age ratings of newly added content, the yearly number of added kid-friendly
content, and the durations of kids’ content.
4.1 1. Added Content Age Ratings: Assessing Family-Friendliness Trends
Age ratings indicate content suitability for different audiences and can reveal if Netflix is becoming
more family-oriented.
In this section, we will examine the distribution of Netflix’s content based on age ratings to identify
any increases in the number of additions that are classified as more kid-friendly.
[ ]: df['age_rating'].value_counts()
[ ]: age_rating
TV-MA 3207
TV-14 2160
TV-PG 863
R 799
PG-13 490
TV-Y7 334
TV-Y 307
PG 287
TV-G 220
NR 80
G 41
TV-Y7-FV 6
NC-17 3
UR 3
74 min 1
84 min 1
66 min 1
Name: count, dtype: int64
We notice that there are 3 entries with invalid values in the age_rating column, where
the values mistakenly represent durations instead.
To continue the analysis, we will need to remove these three rows. This action is unlikely to have
a significant impact on our results, as the removed rows represent a very small fraction
of the total dataset.
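[ ]: #Create a list of unwanted values
list1 = ['74 min','84 min', '66 min']
#Drop the rows whose 'age_rating' is one of the unwanted values
#(reconstructed step; the original filtering line is not shown)
df = df[~df['age_rating'].isin(list1)]
#Total count of each unique value of the 'age_rating' column after filtering
new_rating = df['age_rating'].value_counts()
new_rating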
[ ]: age_rating
TV-MA 3207
TV-14 2160
TV-PG 863
R 799
PG-13 490
TV-Y7 334
TV-Y 307
PG 287
TV-G 220
NR 80
G 41
TV-Y7-FV 6
NC-17 3
UR 3
Name: count, dtype: int64
[ ]: age_rating
TV-MA 36.426624
TV-14 24.534303
TV-PG 9.802363
R 9.075420
PG-13 5.565652
TV-Y7 3.793730
TV-Y 3.487051
PG 3.259882
TV-G 2.498864
NR 0.908678
G 0.465697
TV-Y7-FV 0.068151
NC-17 0.034075
UR 0.034075
Name: count, dtype: float64
[ ]: new_rating_percentages.plot.barh()
plt.show()
Netflix’s content is predominantly categorized as TV-MA (Mature Audiences) and TV-14
(Parents Strongly Cautioned), comprising 36.4% and 24.5% respectively and together
accounting for over 60% of the total content. TV-PG (Parental Guidance Suggested)
and R (Restricted: under 17 requires accompanying parent or adult guardian) ratings
follow, each making up just under 10%, while every other rating type is below 6%.
Currently, just over 70% of Netflix’s content is classified as R, TV-MA, or TV-14,
emphasizing a dominant focus on mature, complex, or intense content. This indicates that
the platform is primarily geared toward adults and older teenagers seeking more
adult-oriented content, with less emphasis on family-friendly or younger content.
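The combined share of mature ratings can be checked from the counts printed above. In this sketch the counts are copied from that output; the denominator of 8,804 rows (8,807 minus the 3 removed) is an assumption that happens to reproduce the printed 36.43% for TV-MA.

```python
import pandas as pd

# Counts copied from the value_counts() output shown above.
rating_counts = pd.Series({
    "TV-MA": 3207, "TV-14": 2160, "TV-PG": 863, "R": 799, "PG-13": 490,
    "TV-Y7": 334, "TV-Y": 307, "PG": 287, "TV-G": 220, "NR": 80,
    "G": 41, "TV-Y7-FV": 6, "NC-17": 3, "UR": 3,
})

# Assumption: percentages divide by all remaining rows, including the
# 4 rows with a missing rating (8,807 - 3 = 8,804).
total_rows = 8804
rating_percentages = rating_counts / total_rows * 100

# Combined share of the three most mature rating categories.
mature_share = rating_percentages[["TV-MA", "TV-14", "R"]].sum()
print(round(mature_share, 1))
```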
Let’s analyze the yearly trend in the number of titles Netflix adds, categorized by age rating.
[ ]: # DataFrame to analyze yearly trends of content added to Netflix, categorized by age rating.
df[['age_rating','date_added']]
[ ]: age_rating date_added
0 PG-13 September 25, 2021
1 TV-MA September 24, 2021
2 TV-MA September 24, 2021
3 TV-MA September 24, 2021
4 TV-MA September 24, 2021
… … …
8802 R November 20, 2019
8803 TV-Y7 July 1, 2019
8804 R November 1, 2019
8805 PG January 11, 2020
8806 TV-14 March 2, 2019
Since our ‘date_added’ column is of the object datatype, we need to convert it into a
datetime format to analyze trends effectively. We will keep only the year portion and
store it in a new column called ‘year_added’.
[ ]: # Convert the 'date_added' to datetime type.
# Some values carry leading whitespace (e.g. " August 4, 2017"), so calling
# pd.to_datetime(df['date_added']) directly fails with:
#   ValueError: time data " August 4, 2017" doesn't match format "%B %d, %Y",
#   at position 1441.
# Stripping the whitespace before parsing avoids the error.
df['date_added'] = pd.to_datetime(df['date_added'].str.strip())
# Extract just the year (reconstructed step; the original line is not shown)
df['year_added'] = df['date_added'].dt.year
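The whitespace problem behind this kind of parsing error can be reproduced and resolved on a small sample. This is a sketch under the assumption that stray spaces are the only irregularity; the `dates` values are made up, apart from the string quoted in the error message.

```python
import pandas as pd

# Hypothetical sample; the second value has a leading space like the one in the error.
dates = pd.Series(["September 25, 2021", " August 4, 2017", None])

# Trim whitespace before parsing so every value matches "%B %d, %Y".
parsed = pd.to_datetime(dates.str.strip(), format="%B %d, %Y")

# Keep only the year; the nullable Int64 dtype keeps missing dates as <NA>.
year_added = parsed.dt.year.astype("Int64")
print(year_added)
```

Passing `format='mixed'`, as the error message suggests, is an alternative, but stripping first keeps a single strict format for the whole column.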
df
Now we create a chart that shows the trend of shows added to Netflix over the years, categorized
by age rating.
[ ]: import pandas as pd
import matplotlib.pyplot as plt
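One way to build such a chart is to cross-tabulate year against rating and plot one line per rating. The sketch below runs on made-up rows; the column names follow the notebook, everything else is an illustrative assumption.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample with the two columns the chart needs.
sample = pd.DataFrame({
    "year_added": [2019, 2019, 2019, 2020, 2020, 2021],
    "age_rating": ["TV-MA", "TV-MA", "PG", "TV-MA", "TV-Y7", "TV-Y7"],
})

# Rows are years, columns are ratings, cells are the number of titles added.
additions = pd.crosstab(sample["year_added"], sample["age_rating"])

# Same table as a percentage of each year's additions.
shares = pd.crosstab(sample["year_added"], sample["age_rating"], normalize="index") * 100

ax = additions.plot(marker="o", figsize=(12, 6))
ax.set_xlabel("Year added")
ax.set_ylabel("Number of titles added")
ax.set_title("Titles added per year by age rating")
plt.show()
```

Raw counts fall for every rating in a year with fewer additions overall, so the `shares` table can help separate a genuine change in mix from a change in volume.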
The chart shows that while TV-MA and TV-14 rated titles remain dominant, there has
been a sharp decline in additions with these ratings since 2019. Similarly, TV-PG and G
ratings have also been decreasing. Conversely, R, TV-Y7, and PG-13 rated content have
seen steady growth.
Conclusion
This trend suggests an effort by Netflix to cater to a wider range of age groups, from
children and teenagers to adults, rather than focusing exclusively on mature audiences. By
increasing TV-Y7 and PG-13 rated titles, Netflix is balancing its content to appeal to
younger viewers and families while still providing R-rated content for mature audiences.
The decline in TV-MA and TV-14 additions points to a shift away from highly mature or
teenage-oriented content, potentially making Netflix a more family-friendly
streaming platform.
4.2 2. Additions of Kids & Family Content: Yearly Trends
First, we need to filter our main DataFrame to isolate only the kids-related content before creating
the chart.
[ ]: # Define a list of keywords that are commonly found in kid-related genres
kid_keywords = ['Kids', 'Family', 'Children', 'Animated', 'Friendship',
                'Animation', 'Cartoon', 'Anime', 'Educational']
director \
6 Robert Cullen, José Luis Ucha
13 Bruno Garotti
23 Alex Woo, Stanley Moore
34 NaN
37 NaN
… …
8793 Raja Gosnell
8795 NaN
8797 NaN
8803 NaN
8805 Peter Hewitt
cast date_added \
6 Vanessa Hudgens, Kimiko Glenn, James Marsden, … September 24, 2021
13 Klara Castanho, Lucca Picon, Júlia Gomes, Marc… September 22, 2021
23 Maisie Benson, Paul Killam, Kerry Gudjohnsen, … September 21, 2021
34 Dami Lee, Jason Lee, Bommie Catherine Han, Jen… September 17, 2021
37 Antti Pääkkönen, Heljä Heikkinen, Lynne Guagli… September 16, 2021
… … …
8793 Dennis Quaid, Rene Russo, Sean Faris, Katija P… November 20, 2019
8795 Mike Liscio, Emily Bauer, Billy Bob Thompson, … May 1, 2018
8797 Michael Johnston, Jessica Gee-George, Christin… September 13, 2018
8803 NaN July 1, 2019
8805 Tim Allen, Courteney Cox, Chevy Chase, Kate Ma… January 11, 2020
listed_in \
6 Children & Family Movies
13 Children & Family Movies, Comedies
23 Children & Family Movies
34 Kids' TV
37 Kids' TV, TV Comedies
… …
8793 Children & Family Movies, Comedies
8795 Anime Series, Kids' TV
8797 Kids' TV
8803 Kids' TV, Korean TV Shows, TV Comedies
8805 Children & Family Movies, Comedies
description
6 Equestria's divided. But a bright-eyed hero be…
13 When the clever but socially-awkward Tetê join…
23 From arcade games to sled days and hiccup cure…
34 Tayo speeds into an adventure when his friends…
37 Birds Red, Chuck and their feathered friends h…
… …
8793 When a father of eight and a mother of 10 prep…
8795 Now that he's discovered the Pendulum Summonin…
8797 Teen surfer Zak Storm is mysteriously transpor…
8803 While living alone in a spooky town, a young g…
8805 Dragged from civilian life, a former superhero…
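A filter that yields a table like the one above can be sketched as follows. It assumes the keywords are matched against the `listed_in` column, which the notebook does not state explicitly; the sample rows are made up.

```python
import re
import pandas as pd

kid_keywords = ["Kids", "Family", "Children", "Animated", "Friendship",
                "Animation", "Cartoon", "Anime", "Educational"]

# Hypothetical sample of titles with their genre strings.
sample = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "listed_in": ["Children & Family Movies", "Dramas, Thrillers",
                  "Anime Series, Kids' TV", None],
})

# One regular expression that matches any keyword (escaped to be safe).
pattern = "|".join(re.escape(word) for word in kid_keywords)

# Assumption: keywords are searched in 'listed_in'; missing genres count as no match.
is_kids = sample["listed_in"].str.contains(pattern, case=False, na=False)
kids_content = sample[is_kids]
print(len(kids_content))
```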
Although the keyword list for identifying kid-friendly content may not be complete,
it still yields 1,301 kids-related entries out of a total dataset of over
8,000 entries. This sample is large enough for a meaningful analysis of trends in kids
and family content, although titles that the keywords miss are not captured.
Now, let’s generate a chart to illustrate the yearly trend of kids and family-friendly content
additions.
[ ]: import pandas as pd
import matplotlib.pyplot as plt
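The yearly counts behind such a chart, and the growth factor between two years, could be computed as in this sketch. The `kids_content` frame here is a made-up stand-in for the filtered DataFrame, so the numbers are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for the filtered kids_content DataFrame.
kids_content = pd.DataFrame(
    {"year_added": [2015, 2015, 2020, 2020, 2020, 2020, 2020, 2020, None]}
)

# Titles added per year in chronological order; rows without a year are dropped.
yearly_additions = (
    kids_content["year_added"].dropna().astype(int).value_counts().sort_index()
)

# Growth factor between two years, e.g. 25 titles growing to 300 is a factor of 12.
growth_factor = yearly_additions.loc[2020] / yearly_additions.loc[2015]
print(yearly_additions)
print(growth_factor)
```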
The graph reveals a striking upward trend in kids’ content additions on Netflix over the
past decade. From just 25 new titles in 2015, Netflix increased its yearly
kids’ content additions to over 300 by 2020, a more than tenfold increase
in just five years.
Despite a slight dip in 2021, possibly due to pandemic-related production constraints or
to the data shown here running only through September 2021, the overall trend shows a
substantial increase in kids’ content additions compared to previous years.
Conclusion
This transformation unmistakably positions Netflix as an increasingly attractive option
for family viewing, solidifying its evolution into a more comprehensive, family-friendly
streaming service.
4.3 3. Duration Trends of Family Content
Shorter durations often reflect content designed for younger audiences or families, as children tend
to have shorter attention spans and families prefer content that can be consumed quickly in one
sitting, fitting into busy schedules. Let’s analyze recent Netflix content to see if the duration trends
align with this.
We will concentrate exclusively on kid and family-related content for our analysis, as examining
the duration trends of other genres is not relevant to our focus on family-friendly programming.
Our dataset for this duration analysis will again be “kids_content” from the previous section. Let’s
have an overall look.
[ ]: kids_content.head()
Let’s see the distribution of duration values for this kids-related content.
[ ]: kids_content.duration.value_counts()
Let’s analyze the duration trends in minutes for content related to kids and families.
It’s important to note that while shorter durations typically signify more kid-friendly content,
longer durations can still align with Netflix’s goal of being kid-friendly if extended episodes are
released during times like COVID-19, as they cater to children spending more time at home.
[ ]: import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Dataframe is "duration_minutes"
# Step 2: Clean the 'duration' column and convert to numeric
# Remove " minutes" and " min" from the duration column
duration_minutes['duration'] = (
    duration_minutes['duration']
    .str.replace(' minutes', '')
    .str.replace(' min', '')
    .str.replace('min', '')
    .str.strip()
)
duration_minutes['duration'] = pd.to_numeric(duration_minutes['duration'], errors='coerce')
# Step 5: Create a line plot to show average duration trends over the years
plt.figure(figsize=(12, 6))
plt.plot(trend_data.index, trend_data.values, marker='o')
plt.title('Average Duration of Kid and Family-Related Content in Minutes Over the Years')
plt.xlabel('Year Added')
plt.ylabel('Average Duration (Minutes)')
plt.xticks(trend_data.index) # Show all years on the x-axis
plt.grid(True)
plt.show()
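The steps that lead from the raw duration strings to the per-year averages are only partly shown above. A compact sketch of the whole pipeline is given below; it assumes minute-based rows are identified by the text "min" and that averages are grouped by `year_added`, and it uses made-up rows.

```python
import pandas as pd

# Hypothetical stand-in for kids_content with mixed duration units.
kids_content = pd.DataFrame({
    "duration": ["90 min", "2 Seasons", "70 min", "85 min", None],
    "year_added": [2019, 2019, 2019, 2020, 2020],
})

# Keep rows measured in minutes (movies); season counts are handled separately.
in_minutes = kids_content["duration"].str.contains("min", na=False)
duration_minutes = kids_content[in_minutes].copy()

# Pull out the leading number and convert it to a numeric type.
duration_minutes["duration"] = pd.to_numeric(
    duration_minutes["duration"].str.extract(r"(\d+)", expand=False),
    errors="coerce",
)

# Average duration per year added.
trend_data = duration_minutes.groupby("year_added")["duration"].mean()
print(trend_data)
```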
From 2017 to 2020, the average duration of kids-related content on Netflix stabilized
around 80 minutes, about 20% shorter than the roughly 100-minute average for
non-kids-related content. This suggests that Netflix is prioritizing
shorter, more digestible content that likely appeals to families and children, reinforcing
its position as a more family-friendly platform.
However, the average rose to nearly 90 minutes from 2020 to 2021,
suggesting that Netflix may have responded to viewer preferences for longer episodes or
films during this period, possibly due to increased family viewership as a result of
pandemic-related restrictions.
It’s worth noting that the trends from 2011 to 2013 may have been influenced by an
uneven number of entries each year, which could lead to potential outliers skewing the
average duration figures.
Conclusion
The data indicates that Netflix initially emphasized shorter durations for kids-labeled
content, suggesting a family-friendly strategy. The increase in average duration
for kids-related content from 2020 to 2021 may reflect Netflix’s adaptation to
viewer preferences during the COVID-19 pandemic, when demand for more substantial
content likely rose, and is consistent with its commitment to being a family-friendly
platform.
For comparison, the average duration of non-kids-related content on Netflix is over 100 minutes,
calculated as shown below.
[ ]: import pandas as pd
# Step 1: Filter out entries that contain "Season" in the duration column
non_season_entries = df[~df['duration'].str.contains("Season", na=False)]
non_kids_entries['duration'] = pd.to_numeric(non_kids_entries['duration'], errors='coerce')
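The comparison between kids and non-kids averages can be sketched end to end as follows. The keyword test and the sample rows are assumptions made for illustration; only the idea of excluding "Season" entries comes from the cell above.

```python
import pandas as pd

# Hypothetical sample mixing kids and non-kids titles.
sample = pd.DataFrame({
    "listed_in": ["Children & Family Movies", "Dramas", "Thrillers", "Kids' TV", "Comedies"],
    "duration": ["80 min", "110 min", "90 min", "2 Seasons", "100 min"],
})

# Drop entries measured in seasons so only minute-based durations remain.
non_season_entries = sample[~sample["duration"].str.contains("Season", na=False)].copy()
non_season_entries["minutes"] = pd.to_numeric(
    non_season_entries["duration"].str.replace(" min", "", regex=False),
    errors="coerce",
)

# Assumption: a title is kids-related if its genres mention one of these words.
is_kids = non_season_entries["listed_in"].str.contains("Kids|Family|Children", na=False)
kids_avg = non_season_entries.loc[is_kids, "minutes"].mean()
non_kids_avg = non_season_entries.loc[~is_kids, "minutes"].mean()

# Relative difference, e.g. 80 vs 100 minutes means 20% shorter.
percent_shorter = (non_kids_avg - kids_avg) / non_kids_avg * 100
print(kids_avg, non_kids_avg, percent_shorter)
```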
Let’s analyze the duration trends in seasons for content related to kids and families.
Please note that in this section, since duration is measured in seasons, longer durations actually
align with Netflix’s goal of becoming more kid-friendly, as they signify a commitment to developing
engaging material for young audiences.
[ ]: import pandas as pd
import matplotlib.pyplot as plt
# DataFrame is "duration_season"
# Optional: Drop rows with NaN values in 'duration'
duration_season = duration_season.dropna(subset=['duration'])
plt.xlabel('Year Added')
plt.ylabel('Average Duration (Seasons)')
plt.xticks(trend_data.index) # Show all years on the x-axis
plt.grid(True)
plt.show()
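For season-based durations, the number in front of "Season" or "Seasons" has to be extracted before averaging. The sketch below is one possible way to do this on made-up rows; reporting the per-year count next to the mean is an addition that helps flag years with too few shows to trust.

```python
import pandas as pd

# Hypothetical stand-in for kids_content with mixed duration units.
kids_content = pd.DataFrame({
    "duration": ["1 Season", "3 Seasons", "95 min", "2 Seasons", None],
    "year_added": [2016, 2016, 2016, 2021, 2021],
})

# Capture the number in front of "Season"/"Seasons"; other rows become NaN.
seasons = pd.to_numeric(
    kids_content["duration"].str.extract(r"(\d+)\s+Season", expand=False),
    errors="coerce",
)

# Keep only the rows that are measured in seasons.
duration_season = kids_content.assign(duration=seasons).dropna(subset=["duration"])

# Average number of seasons and number of shows per year added.
trend_data = duration_season.groupby("year_added")["duration"].agg(["mean", "count"])
print(trend_data)
```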
The significant drop from an average of 5 seasons to about 1 season between 2014 and
2016 could be attributed to the limited number of data samples during that period. This
fluctuation may not accurately represent the overall trends in kids’ TV show offerings
on Netflix.
The steady growth in the number of seasons for kids’ TV shows on Netflix from 2016
to 2021 underscores rising demand for family-oriented programming, with the average
increasing from about one season in 2016 to approximately 2.1 seasons by 2021.
Conclusion
This trend reflects Netflix’s commitment to long-term investments in family-oriented
series and the cultivation of audience loyalty among younger viewers. Overall, it
highlights Netflix’s strategic focus on becoming a more family-friendly platform.
5 IV. Summary
Key insights from the analysis include:
• Content Age Rating: Sharp decline in TV-MA and TV-14 ratings indicates a shift away from
mature content, making Netflix more family-friendly.
• Kids Content Additions: Significant increase in children’s content offerings positions Netflix
as an increasingly attractive option for family viewing.
• Duration Trends: Initially focused on shorter content for families, Netflix increased the
average duration during the pandemic, showcasing its effort to cater to and attract family and
children audiences.
• Seasonal Content: The growth in multi-season kids’ shows reflects Netflix’s long-term
investment in family-oriented series and the cultivation of younger audience loyalty, demonstrating
its shift toward becoming a more family-friendly platform.
These findings collectively demonstrate Netflix’s strategic evolution towards becoming
a more family-oriented streaming platform, adapting to viewer needs and preferences
while diversifying its content portfolio.