You can slide the charts around to focus on different aspects of it or see different parts.
Streaming Services Analysis
Analysis Overview:
This analysis will dive into data from four leading streaming platforms—Netflix, Hulu, Amazon Prime, and Disney+—to explore trends, viewership patterns, and key performance metrics. By examining each service’s unique offerings and subscriber behaviors, we aim to uncover insights that highlight their market positions and user preferences in the ever-evolving digital entertainment landscape.
Distribution of Movie/Show Durations by Streaming Service
import pandas as pdimport polars as plimport seaborn as snsimport matplotlib.pyplot as plt# --- Load Data ---disney = pl.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/disney_plus_titles.csv")hulu = pl.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/hulu_titles.csv")netflix = pl.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/netflix_titles.csv")prime = pd.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/amazon_prime_titles.csv")disney = disney.with_columns(pl.lit("Disney+").alias("platform"))hulu = hulu.with_columns(pl.lit("Hulu").alias("platform"))netflix = netflix.with_columns(pl.lit("Netflix").alias("platform"))disney = disney.to_pandas()hulu = hulu.to_pandas()netflix = netflix.to_pandas()prime["platform"] ="Amazon Prime"data = pd.concat([disney, hulu, netflix, prime], ignore_index=True)def convert_duration(duration):if pd.isna(duration):returnNone duration =str(duration).lower()if"min"in duration:returnint(duration.replace(" min", ""))elif"h"in duration: parts = duration.split(" ") hours =int(parts[0].replace("h", "")) *60 minutes =int(parts[1].replace("min", "")) iflen(parts) >1else0return hours + minutesreturnNonedata["duration_minutes"] = data["duration"].apply(convert_duration)data["date_added"] = pd.to_datetime(data["date_added"], errors="coerce")data["year_added"] = data["date_added"].dt.yearplt.figure(figsize=(8, 6))sns.violinplot(data=data, x="platform", y="duration_minutes", palette="Set2")plt.title("Distribution of Movie/Show Durations by Streaming Service")plt.xlabel("Streaming Service")plt.ylabel("Duration (Minutes)")plt.xticks(rotation=45)plt.show()
/var/folders/zs/zycd14f96ks5gpx0bsb97xtr0000gn/T/ipykernel_65863/3666423932.py:43: FutureWarning:
Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.
The violin plot reveals a striking contrast in content duration across platforms. Amazon Prime stands out with the broadest range, extending to exceptionally lengthy content (up to 600 minutes), suggesting a focus on longer films and series. In contrast, Disney+ exhibits the shortest content range, implying a preference for more concise viewing experiences. Netflix and Hulu fall somewhere in between, with Netflix showing a slightly wider spread than Hulu.
Number of Titles Added Over Time
plt.figure(figsize=(18, 6))df_yearly = data.groupby(["year_added", "platform"]).size().reset_index(name="count")sns.lineplot(data=df_yearly, x="year_added", y="count", hue="platform", marker="o")plt.title("Number of Titles Added Over Time")plt.xlabel("Year")plt.ylabel("Number of Titles")plt.legend(title="Streaming Service")plt.show()
This chart highlights the aggressive content acquisition strategy employed by Netflix. Starting around 2016, Netflix embarked on a dramatic expansion of its library, consistently adding new titles at a pace that far surpassed its competitors. While other platforms also increased their offerings, their growth trajectories were notably less steep, underscoring Netflix’s commitment to content dominance.
Distribution of Movie Ratings Across Platforms
plt.figure(figsize=(30, 6))sns.histplot(data=data, x="rating", hue="platform", multiple="stack", palette="Set1", shrink=0.8)plt.title("Distribution of Movie Ratings Across Platforms")plt.xlabel("Rating")plt.ylabel("Count of Titles")plt.xticks(rotation=45)plt.legend(title="Streaming Service")plt.show()
/var/folders/zs/zycd14f96ks5gpx0bsb97xtr0000gn/T/ipykernel_65863/2575985119.py:7: UserWarning:
No artists with labels found to put in legend. Note that artists whose label start with an underscore are ignored when legend() is called with no argument.
The distribution of movie ratings reveals distinct content preferences across platforms. Amazon Prime caters to a more mature audience with a significant concentration of TV-MA rated content. In contrast, Disney+ likely focuses on family-friendly viewing, as suggested by its lower proportion of mature ratings (though specific data on G and PG ratings is not available here). Netflix and Hulu demonstrate a more balanced approach, offering a mix of ratings to appeal to a broader audience.
Conclusion:
This analysis has illuminated key differences in content strategies across the four streaming platforms. Each exhibits distinct trends in content offerings, duration, and release timing, catering to diverse viewer preferences. Amazon Prime, with its extensive library and broad range of content durations, including a substantial offering of mature titles, appears to prioritize attracting a diverse and mature audience. Netflix, while showing signs of slowing growth in recent years, still boasts a massive library and a history of aggressive content acquisition, suggesting a focus on breadth and variety.
Source Code
---date: '2025-03-10'title: Streaming Services Analysiscategories: [Python, Dataviz]image: profile.pngtitle-block-banner: falsecode-folding: showcode-tools: truecode-block-bg: false---You can slide the charts around to focus on different aspects of it or see different parts. # **Streaming Services Analysis**## *Analysis Overview:*This analysis will dive into data from four leading streaming platforms—Netflix, Hulu, Amazon Prime, and Disney+—to explore trends, viewership patterns, and key performance metrics. By examining each service's unique offerings and subscriber behaviors, we aim to uncover insights that highlight their market positions and user preferences in the ever-evolving digital entertainment landscape.### Distribution of Movie/Show Durations by Streaming Service```{python, message=FALSE, warning=FALSE}import pandas as pdimport polars as plimport seaborn as snsimport matplotlib.pyplot as plt# --- Load Data ---disney = pl.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/disney_plus_titles.csv")hulu = pl.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/hulu_titles.csv")netflix = pl.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/netflix_titles.csv")prime = pd.read_csv("/Users/scotttow123/Documents/Streaming_Services/Data/amazon_prime_titles.csv")disney = disney.with_columns(pl.lit("Disney+").alias("platform"))hulu = hulu.with_columns(pl.lit("Hulu").alias("platform"))netflix = netflix.with_columns(pl.lit("Netflix").alias("platform"))disney = disney.to_pandas()hulu = hulu.to_pandas()netflix = netflix.to_pandas()prime["platform"] = "Amazon Prime"data = pd.concat([disney, hulu, netflix, prime], ignore_index=True)def convert_duration(duration): if pd.isna(duration): return None duration = str(duration).lower() if "min" in duration: return int(duration.replace(" min", "")) elif "h" in duration: parts = duration.split(" ") hours = int(parts[0].replace("h", "")) * 60 minutes = int(parts[1].replace("min", "")) if len(parts) > 1 else 0 return hours + minutes return Nonedata["duration_minutes"] = data["duration"].apply(convert_duration)data["date_added"] = pd.to_datetime(data["date_added"], errors="coerce")data["year_added"] = data["date_added"].dt.yearplt.figure(figsize=(8, 6))sns.violinplot(data=data, x="platform", y="duration_minutes", palette="Set2")plt.title("Distribution of Movie/Show Durations by Streaming Service")plt.xlabel("Streaming Service")plt.ylabel("Duration (Minutes)")plt.xticks(rotation=45)plt.show()```The violin plot reveals a striking contrast in content duration across platforms. Amazon Prime stands out with the broadest range, extending to exceptionally lengthy content (up to 600 minutes), suggesting a focus on longer films and series. In contrast, Disney+ exhibits the shortest content range, implying a preference for more concise viewing experiences. Netflix and Hulu fall somewhere in between, with Netflix showing a slightly wider spread than Hulu. ### Number of Titles Added Over Time```{python, message=FALSE, warning=FALSE}plt.figure(figsize=(18, 6))df_yearly = data.groupby(["year_added", "platform"]).size().reset_index(name="count")sns.lineplot(data=df_yearly, x="year_added", y="count", hue="platform", marker="o")plt.title("Number of Titles Added Over Time")plt.xlabel("Year")plt.ylabel("Number of Titles")plt.legend(title="Streaming Service")plt.show()```This chart highlights the aggressive content acquisition strategy employed by Netflix. Starting around 2016, Netflix embarked on a dramatic expansion of its library, consistently adding new titles at a pace that far surpassed its competitors. While other platforms also increased their offerings, their growth trajectories were notably less steep, underscoring Netflix's commitment to content dominance.### Distribution of Movie Ratings Across Platforms```{python, message=FALSE, warning=FALSE}plt.figure(figsize=(30, 6))sns.histplot(data=data, x="rating", hue="platform", multiple="stack", palette="Set1", shrink=0.8)plt.title("Distribution of Movie Ratings Across Platforms")plt.xlabel("Rating")plt.ylabel("Count of Titles")plt.xticks(rotation=45)plt.legend(title="Streaming Service")plt.show()```The distribution of movie ratings reveals distinct content preferences across platforms. Amazon Prime caters to a more mature audience with a significant concentration of TV-MA rated content. In contrast, Disney+ likely focuses on family-friendly viewing, as suggested by its lower proportion of mature ratings (though specific data on G and PG ratings is not available here). Netflix and Hulu demonstrate a more balanced approach, offering a mix of ratings to appeal to a broader audience.## *Conclusion:*This analysis has illuminated key differences in content strategies across the four streaming platforms. Each exhibits distinct trends in content offerings, duration, and release timing, catering to diverse viewer preferences. Amazon Prime, with its extensive library and broad range of content durations, including a substantial offering of mature titles, appears to prioritize attracting a diverse and mature audience. Netflix, while showing signs of slowing growth in recent years, still boasts a massive library and a history of aggressive content acquisition, suggesting a focus on breadth and variety.