Super Bowl

Are Super Bowl Ads Worth It?

Project Overview: An Introduction

Every year, millions of Americans sit down to watch the highly anticipated Super Bowl. Featuring an action-packed NFL football game, funny and emotional commercials, and a halftime show that generally involves celebrities and Hollywood icons, the Super Bowl brings people together from all walks of life for one night of relaxation and entertainment.

Our focus for this project is the advertisements, specifically the thirty-second ads that air during commercials and other airtime breaks throughout the game. These ads are widely regarded as some of the most expensive TV network slots on the planet, and their prices show just that.

The aim of this project is to observe how the cost of these thirty-second ads has changed over the course of the entire history of the Super Bowl. We want to know: is it worth it to buy these ads? Is the value of these advertisements worth the cost? Let's begin diving into it!

Project Overview: Observations

We are examining whether it’s worth it for companies to buy an ad during the NFL’s Super Bowl. To assess this, we look at cost per thousand impressions (CPM), viewership data, and ad cost for all Super Bowl events held between 1966 and 2019. These data sets help us deduce which years were best for companies to buy an ad, which years were less worthwhile, and whether there are any outliers.

Project Overview: Hypothesis

Before we get into the data, let us first propose an initial hypothesis: if the costs of advertisements throughout the Super Bowl's history are adjusted for inflation to today's U.S. dollar value, there would be no change in the CPM (cost per thousand views) over time. We use CPM as our measure of advertisement effectiveness because an ad with a lower CPM is more cost-effective for a company than an ad with a higher CPM: the company spends less money to get the same marketing and promotional reach.

Imports Used

We import the following libraries to fetch, store, and manipulate the data from an internet source. With these libraries, we also graph and plot the data to find useful information and present it visually.

The usual Python data science libraries used are Requests, BeautifulSoup, Pandas, NumPy, Datetime, Math, String, and Matplotlib.

We also use Stats Models, which comes in later on in the project for hypothesis testing.

Additionally, we install a library called Easy Money and import EasyPeasy from it, which will help us later on in the project when we adjust the ad costs to account for inflation.

Part 1: Data Wrangling, Collection, Curation & Parsing

Here, we first fetch our data from the superbowl-ads.com URL listed below. This will provide us with all the necessary data we will need to draw conclusions for our hypothesis.

After requesting and getting the URL page's HTML code, we utilize Beautiful Soup and get the text.

Now, we have the page's HTML in our "soup" variable, and need to begin searching for the data set on the website page. This can be done easily by using Beautiful Soup's find() method, which gives us the exact table that we need.

Afterwards, we search for all the rows within the table that we just found, and each row is appended to a temporary two-dimensional array called result.

Finally, we remove the top row from the table because it is unnecessary, and then save the new top row as column_titles, before removing that row as well.

All that's left is to put the result into a Pandas Data Frame with the column_titles variable as the Data Frame's column titles.
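The parsing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual cell: the HTML is a small hypothetical stand-in for the real page (in the notebook it would come from the request to the site), and every value in it is a placeholder rather than a real figure.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for the page's HTML. All values are placeholders,
# not real Super Bowl figures.
SAMPLE_HTML = """
<table>
  <tr><td colspan="3">Placeholder banner row</td></tr>
  <tr><td>Year</td><td>Avg. Cost Per 30-Seconds</td><td>Avg. Number of Viewers</td></tr>
  <tr><td>2002</td><td>$2,000</td><td>1,000,000</td></tr>
  <tr><td>2001</td><td>$1,000</td><td>900,000</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table")  # find() returns the first matching element

# Collect every row as a list of cell strings (a two-dimensional list).
result = []
for row in table.find_all("tr"):
    cells = [cell.get_text(strip=True) for cell in row.find_all(["td", "th"])]
    result.append(cells)

result.pop(0)                  # drop the unneeded top row
column_titles = result.pop(0)  # the next row holds the column titles

df = pd.DataFrame(result, columns=column_titles)
print(df)
```

At this point every cell is still a string, which is why the numeric columns need cleaning before any analysis.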

Part 2: Data Management/Representation

The table on the site we pulled our data from contains a lot of information, some of which we do not need. We started by taking in all the data in the table, but now we'll be condensing it down to what we want.

First, the Data Frame is re-indexed such that it begins from 1 and not from 0. This will help us later on with organization.

Next, we drop all the columns that we do not need. This includes the columns 'Super Bowl', 'Game Date', 'Network', 'Rating', and 'Share'. Once those are dropped, the data remaining is all we'll need for further evaluation.

A little bit of organization is needed before we explore the data. The columns 'Avg. Cost Per 30-Seconds' and 'Avg. Number of Viewers' need to be numeric, not strings as they were when scraped. The commas and dollar signs are removed from these two columns and the results are converted into integers. These values are then put back into the Data Frame.

Finally, for simplicity and clarity, the columns are renamed to 'Year', 'Cost', and 'Views'.

The last thing needed for this data to be in the correct form for usage and evaluation is reversing its order, such that the first row is from the year 1966 and the last row is 2019.

The data is now properly organized and ready to be analyzed.
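A minimal sketch of these cleaning steps is shown below. The rows are placeholders shaped like the scraped table, and the name of the year column before renaming is an assumption. One deliberate difference from the order described above: the index is shifted to start at 1 after the reversal, so that it still begins at 1 in the final frame.

```python
import pandas as pd

# Placeholder rows shaped like the scraped table; none are real figures.
raw = pd.DataFrame({
    "Year": ["2002", "2001", "2000"],  # original column name is assumed
    "Super Bowl": ["C", "B", "A"],
    "Game Date": ["-", "-", "-"],
    "Network": ["-", "-", "-"],
    "Avg. Cost Per 30-Seconds": ["$3,000", "$2,000", "$1,000"],
    "Avg. Number of Viewers": ["1,200,000", "1,100,000", "1,000,000"],
    "Rating": ["-", "-", "-"],
    "Share": ["-", "-", "-"],
})

# Keep only the columns needed for the analysis.
df = raw.drop(columns=["Super Bowl", "Game Date", "Network", "Rating", "Share"])

# Strip dollar signs and commas, then convert both columns to integers.
for col in ["Avg. Cost Per 30-Seconds", "Avg. Number of Viewers"]:
    df[col] = df[col].str.replace(r"[$,]", "", regex=True).astype(int)

df.columns = ["Year", "Cost", "Views"]

# Reverse so the earliest year comes first, then index from 1.
df = df.iloc[::-1].reset_index(drop=True)
df.index = df.index + 1
print(df)
```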

As a summary, here is what the data represents:

Part 3: Exploratory Data Analysis

In this section, we will graph several different relationships involving the average cost of ads, the average number of views, CPM, and inflation-adjusted cost values.

We first plot viewership against the year of each Super Bowl as a bar graph. We can see a generally increasing trend, where viewership has grown over time, though there are some drops. In 2018, for example, there was a decline in Super Bowl ratings, which the survey cited below attributes to fans not tuning in because they disagreed with players kneeling for the national anthem during the NFL season.

Source: (https://www.washingtontimes.com/news/2018/feb/6/nfl-ratings-down-due-anthem-protests-survey/)

In the bar graph of ad cost, the cost increases steadily as time passes, with no huge spikes or drops.

We now compare the cost of ads with the viewership and see that the average cost increases much faster than the viewership numbers do. Companies are spending more money but getting essentially the same return, because the audience is not growing at anywhere near the same rate. This is not great for companies looking to get a better return.
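The two bar graphs described above could be produced along these lines. This is only a sketch with placeholder data; the column names 'Year', 'Cost', and 'Views' follow the renaming from Part 2, and the figure layout and labels are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder data; not real Super Bowl figures.
df = pd.DataFrame({
    "Year": [2000, 2001, 2002],
    "Cost": [1000, 2000, 3000],
    "Views": [1_000_000, 1_100_000, 1_200_000],
})

fig, (ax_views, ax_cost) = plt.subplots(1, 2, figsize=(12, 4))

# Viewership per year.
ax_views.bar(df["Year"], df["Views"])
ax_views.set_title("Average viewers per Super Bowl")
ax_views.set_xlabel("Year")
ax_views.set_ylabel("Viewers")

# Ad cost per year.
ax_cost.bar(df["Year"], df["Cost"])
ax_cost.set_title("Average cost of a 30-second ad")
ax_cost.set_xlabel("Year")
ax_cost.set_ylabel("Cost (USD)")

fig.tight_layout()
# In a notebook, plt.show() would display the figure.
```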
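One simple way to put numbers on this comparison is to compute how many times each quantity has grown between the first and last year. The sketch below uses placeholder data and is an illustration of the idea, not a calculation taken from the notebook.

```python
import pandas as pd

# Placeholder data; not real Super Bowl figures.
df = pd.DataFrame({
    "Year": [2000, 2001, 2002],
    "Cost": [1000, 2000, 4000],
    "Views": [1_000_000, 1_100_000, 1_210_000],
})

first, last = df.iloc[0], df.iloc[-1]

# Growth factor over the whole period: last value divided by first value.
cost_growth = last["Cost"] / first["Cost"]
views_growth = last["Views"] / first["Views"]

print(f"Cost grew {cost_growth:.2f}x, viewership grew {views_growth:.2f}x")
```

If the cost growth factor is far larger than the viewership growth factor, each additional dollar is buying fewer additional viewers.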

Below, we utilize the EasyPeasy() function to help us inflate the cost values from each year of the Super Bowl.

We adjust for inflation here to get real-value numbers, because we want to compare costs from every year on an even playing field. With inflation accounted for, cost does not increase as quickly as it did in the last graph.
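To make the idea concrete without relying on the inflation library's interface, here is a hand-rolled sketch of the usual price-index adjustment: a past amount is scaled by the ratio of the target year's index to the original year's index. This is a swapped-in technique, not the EasyPeasy call used in the notebook, and the index values below are hypothetical placeholders rather than real CPI data.

```python
def adjust_for_inflation(amount, index_then, index_now):
    """Express a past amount in target-year dollars using a price-index ratio."""
    return amount * index_now / index_then

# Hypothetical price index per year (placeholder values, not real CPI data).
HYPOTHETICAL_INDEX = {2000: 100.0, 2001: 110.0, 2002: 125.0}

# Placeholder ad costs per year.
costs = {2000: 1000, 2001: 2000, 2002: 3000}

base_year = 2002  # express every cost in this year's dollars
inflated = {
    year: adjust_for_inflation(
        cost, HYPOTHETICAL_INDEX[year], HYPOTHETICAL_INDEX[base_year]
    )
    for year, cost in costs.items()
}
print(inflated)
```

Older costs are scaled up the most, which is why the inflation-adjusted curve rises less steeply than the nominal one.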

Now, we will examine the CPM (cost per 1,000 views).

We add a new column to the Data Frame with the CPM listed for each year.

The CPM-per-year graph shows the cost per thousand viewers rising over time: even when viewership stays roughly the same, the CPM keeps increasing. Once again, this is not the best return for companies, since they are not getting much bang for their buck (they spend more than they should have to for the same reach). Each year, they can expect to spend more money on essentially the same number of viewers.
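CPM is the cost divided by the number of viewers, multiplied by 1,000. A minimal sketch of adding that column is shown below, with placeholder data; whether the notebook uses the nominal or the inflation-adjusted cost here is not stated, so the plain 'Cost' column is an assumption.

```python
import pandas as pd

# Placeholder data; not real Super Bowl figures.
df = pd.DataFrame({
    "Year": [2000, 2001, 2002],
    "Cost": [1000, 2000, 3000],
    "Views": [1_000_000, 1_000_000, 1_200_000],
})

# Cost per thousand viewers: cost * 1000 / viewers.
df["CPM"] = df["Cost"] * 1000 / df["Views"]
print(df)
```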

Part 4: Hypothesis Testing

For the hypothesis testing, we set up a null hypothesis and an alternative hypothesis. The significance level is important in this scenario because we compare the p-value against it: if the p-value is below the significance level, we reject the null hypothesis; otherwise, we fail to reject it. We run this test because we want to see whether the model we make is a good representation/fit of the data.

Let's look at an example now. In our example we will be looking to see if there is a linear correlation between views, ad cost, and CPM. Since the variable that we are trying to predict is numerical, we use a polyfit line.

Before we run the test we must set up our hypotheses and the significance level:

𝛼 (significance level) = 0.05

𝐻₀: There is no linearity between CPM and year

𝐻ₐ: There is linearity between CPM and year
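One common way to test hypotheses like these is an ordinary least squares regression of CPM on year, reading off the p-value of the year coefficient. The sketch below uses the statsmodels formula interface on synthetic data. It is an illustration of the general approach and may differ from the test actually run in the notebook; the data, column names, and random seed are all assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data with a built-in upward trend plus noise (not real figures).
rng = np.random.default_rng(0)
years = np.arange(2000, 2020)
cpm = 2.0 + 0.5 * (years - 2000) + rng.normal(0, 0.3, size=years.size)
df = pd.DataFrame({"Year": years, "CPM": cpm})

# Fit CPM as a linear function of Year.
model = smf.ols("CPM ~ Year", data=df).fit()

alpha = 0.05
p_value = model.pvalues["Year"]  # p-value for the slope on Year
reject_null = p_value < alpha

print(f"slope = {model.params['Year']:.3f}, p-value = {p_value:.3g}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

A p-value below 0.05 for the slope means we reject the null hypothesis of no linear relationship at the chosen significance level.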

From this result, we can see that the p-value varies from year to year. For example, in 1969 the p-value was 0.985, which is far above 0.05, but in 2019 it was 0.005.

To be more specific, from 1967 to 1985, the p-value tended to be greater than 0.05, so for this time period we do not reject the null hypothesis. From 1985 to 2019, however, the p-value was mostly below 0.05, so there we do reject the null hypothesis that there is no linearity between CPM and year.

The various curvatures in the graph suggest that the linearity is not clear-cut, but we lean towards rejecting the null since the low p-values fall in the most recent time frame. Note that a line of best fit would show the linearity over time more clearly. In our CPM graph from earlier, the curvature is quite noticeable because the fitted polynomial has degree 5. With a degree of 1, the fitted line is straight.
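The effect of the polynomial degree can be sketched with NumPy's polyfit on synthetic data. The data and seed below are assumptions, and the years are centered before fitting to keep the degree-5 fit numerically well behaved, which may not match what the notebook does.

```python
import numpy as np

# Synthetic CPM series with a linear trend plus noise (not real figures).
rng = np.random.default_rng(1)
years = np.arange(2000, 2020)
cpm = 2.0 + 0.5 * (years - 2000) + rng.normal(0, 0.3, size=years.size)

# Center the years so high powers of x stay in a reasonable range.
x = years - years.mean()

linear = np.poly1d(np.polyfit(x, cpm, deg=1))   # straight line
quintic = np.poly1d(np.polyfit(x, cpm, deg=5))  # follows local wiggles

# Sum of squared residuals for each fit.
sse_linear = float(np.sum((cpm - linear(x)) ** 2))
sse_quintic = float(np.sum((cpm - quintic(x)) ** 2))

print(f"degree 1: slope = {linear.coeffs[0]:.3f}, SSE = {sse_linear:.3f}")
print(f"degree 5: SSE = {sse_quintic:.3f}")
```

The degree-5 fit always has residuals at least as small as the degree-1 fit, but part of that improvement comes from bending to follow noise, which is where the visible curvature comes from.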

Part 5: Conclusion

Overall, we are able to detect linearity between the Cost of an Ad per 1000 viewers and years, especially in the last 35 years.

What this tells us is that the CPM is increasing linearly, which means that from year to year, the cost of advertising will increase but not necessarily the viewership. From one year to the next, the rate at which the cost of advertisements increased was far greater than the rate at which viewership increased.

This means that despite the rising number of viewers, the costs for ads have gone up. More specifically, the CPM has gone up, which is the better and more important piece of information. From the perspective of a marketing team that is considering launching an advertisement during the Super Bowl, they can expect to pay more money each year to show their ad to the same number of people.

In conclusion, it is difficult to say whether advertisements are worth running during the Super Bowl, mainly because of the price of the airtime. However, given the number of viewers tuned in, it is also a great opportunity for marketing growth. Data from Google Trends shows which types of companies gain more value each year by airing commercials, but based on the data collected here and the projections that were analyzed, it is generally no longer worth it to advertise during the Super Bowl.

Source: https://trends.google.com/trends/explore?q=amazon%20alexa,%2Fm%2F01jp3,%2Fm%2F0160jl,%2Fm%2F0dp88,%2Fm%2F07mb6&geo=US,US,US,US,US&date=2017-01-01%202018-12-31,2017-01-01%202018-12-31,2017-01-01%202018-12-31,2017-01-01%202018-12-31,2017-01-01%202018-12-31&gprop=youtube#TIMESERIES