Dinar Nato has successfully completed this project.

Intro to Pandas for Data Analysis

easy

4.73

Exploring DataFrames: Uncovering Insights from Top 30 US Fast Food Chains

Finished

July 1, 2025 7:36 PM

Exploring DataFrames: Uncovering Insights from Top 30 US Fast Food Chains

In [21]:

# Importing necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Load the dataset into a DataFrame
df = pd.read_csv('Top 30 US fast food chains.csv')

Let's get started

In [22]:

# Start by looking at the first 5 records of the dataframe.
df.head()

Out[22]:

	Rank	Chain	Sales (U.S., 2017)	# of Locations (U.S.)
0	1	McDonald's	37500000000	14,036
1	2	Starbucks	13200000000	13,930
2	3	Subway	10800000000	25,908
3	4	Burger King	9800000000	7,226
4	5	Taco Bell	9300000000	6,446

In [23]:

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30 entries, 0 to 29
Data columns (total 4 columns):
 #   Column                 Non-Null Count  Dtype 
---  ------                 --------------  ----- 
 0   Rank                   30 non-null     int64 
 1   Chain                  30 non-null     object
 2   Sales (U.S., 2017)     30 non-null     int64 
 3   # of Locations (U.S.)  30 non-null     object
dtypes: int64(2), object(2)
memory usage: 1.1+ KB

In [24]:

df.describe()

Out[24]:

	Rank	Sales (U.S., 2017)
count	30.000000	3.000000e+01
mean	15.500000	5.740000e+09
std	8.803408	6.801095e+09
min	1.000000	1.100000e+09
25%	8.250000	2.225000e+09
50%	15.500000	3.600000e+09
75%	22.750000	5.900000e+09
max	30.000000	3.750000e+10
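
The two outputs above differ in kind: `df.info()` prints a structural summary (column names, non-null counts, dtypes, memory usage) and returns `None`, while `df.describe()` returns a new DataFrame of summary statistics and, by default, covers only the numeric columns. The sketch below illustrates this on a small stand-in frame built from the five rows shown by `df.head()`; the name `sample` is an assumption for illustration and is not part of the original notebook.

```python
import io

import pandas as pd

# Stand-in for df: only the five rows printed by df.head() above.
sample = pd.DataFrame({
    "Rank": [1, 2, 3, 4, 5],
    "Chain": ["McDonald's", "Starbucks", "Subway", "Burger King", "Taco Bell"],
    "Sales (U.S., 2017)": [37_500_000_000, 13_200_000_000, 10_800_000_000,
                           9_800_000_000, 9_300_000_000],
    "# of Locations (U.S.)": ["14,036", "13,930", "25,908", "7,226", "6,446"],
})

# info() writes its report to a buffer (stdout by default) and returns None.
buffer = io.StringIO()
info_result = sample.info(buf=buffer)
structure_report = buffer.getvalue()

# describe() returns a DataFrame of statistics for the numeric columns.
stats = sample.describe()

print(structure_report)
print(stats)
```

Because `describe()` returns a DataFrame, its result can be indexed further, for example `stats.loc["mean", "Rank"]`.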

1. What is the primary difference between the `df.info()` and `df.describe()` methods in pandas?

In [ ]:

2. What is the data type of the `Rank` column?

In [25]:

df['Rank'].dtype

Out[25]:

dtype('int64')

3. Which of these columns contain numeric data?

In [26]:

df.select_dtypes(include=['int64', 'float64']).columns

Out[26]:

Index(['Rank', 'Sales (U.S., 2017)'], dtype='object')
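
Listing `'int64'` and `'float64'` explicitly works here, but it would miss other numeric dtypes such as `int32`. A slightly more general sketch, assuming the same column layout, is to pass `include="number"`. The frame `sample` below is a hypothetical stand-in built from the rows shown by `df.head()`.

```python
import pandas as pd

# Stand-in for df: three of the rows printed by df.head().
sample = pd.DataFrame({
    "Rank": [1, 2, 3],
    "Chain": ["McDonald's", "Starbucks", "Subway"],
    "Sales (U.S., 2017)": [37_500_000_000, 13_200_000_000, 10_800_000_000],
    "# of Locations (U.S.)": ["14,036", "13,930", "25,908"],  # strings, because of the commas
})

# "number" matches every numeric dtype, not just int64 and float64.
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
other_cols = sample.select_dtypes(exclude="number").columns.tolist()

print(numeric_cols)
print(other_cols)
```

The locations column lands in the non-numeric group because its values contain thousands separators and were read as text.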

4. What is the shape of our DataFrame `df`?

In [27]:

df.shape

Out[27]:

(30, 4)

Diving Deeper

5. Select the Sales Column.

In [29]:

sales = df["Sales (U.S., 2017)"]
sales

Out[29]:

0     37500000000
1     13200000000
2     10800000000
3      9800000000
4      9300000000
5      9300000000
6      5900000000
7      9000000000
8      5900000000
9      5500000000
10     4500000000
11     4500000000
12     4400000000
13     4400000000
14     3600000000
15     3600000000
16     3500000000
17     3500000000
18     3200000000
19     3100000000
20     2300000000
21     2300000000
22     2200000000
23     2100000000
24     2100000000
25     1500000000
26     1400000000
27     1400000000
28     1300000000
29     1100000000
Name: Sales (U.S., 2017), dtype: int64
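
Selecting with a single pair of brackets, as above, gives a one-dimensional Series. As a small sketch of the alternative, passing a list of column names returns a one-column DataFrame instead. The frame `sample` is a hypothetical stand-in using the first three sales values from the output above.

```python
import pandas as pd

# Stand-in for df with the first three sales figures shown above.
sample = pd.DataFrame({
    "Chain": ["McDonald's", "Starbucks", "Subway"],
    "Sales (U.S., 2017)": [37_500_000_000, 13_200_000_000, 10_800_000_000],
})

sales_series = sample["Sales (U.S., 2017)"]   # single brackets: Series
sales_frame = sample[["Sales (U.S., 2017)"]]  # list of names: DataFrame

print(type(sales_series).__name__, sales_series.shape)
print(type(sales_frame).__name__, sales_frame.shape)
```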

Just For Exploration

Let's visualise the Sales share of each food chain. Run the cell below to find out!

In [ ]:

# Sales Share by Chain Pie Chart
plt.figure(figsize=(10, 10))
plt.pie(df['Sales (U.S., 2017)'], labels=df['Chain'], autopct='%1.1f%%', startangle=140)
plt.title('Market Share of Top 30 US Fast Food Chains in 2017')
plt.show()

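The pie chart shows McDonald's leading by a wide margin: its $37.5 billion in 2017 U.S. sales is about 22% of the combined sales of the 30 chains listed, and nearly three times the $13.2 billion of second-place Starbucks. Note that each slice is a share of these 30 chains' total, not of the whole fast food market. 🍔✨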
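
The percentages drawn on the pie can also be computed directly. The sketch below is illustrative rather than part of the original notebook: it re-enters the 30 values from the Sales column output in billions of dollars (the variable names are assumptions) and divides each by the total.

```python
import pandas as pd

# The 30 values of "Sales (U.S., 2017)" from the output above, in billions of USD.
sales_billions = [
    37.5, 13.2, 10.8, 9.8, 9.3, 9.3, 5.9, 9.0, 5.9, 5.5,
    4.5, 4.5, 4.4, 4.4, 3.6, 3.6, 3.5, 3.5, 3.2, 3.1,
    2.3, 2.3, 2.2, 2.1, 2.1, 1.5, 1.4, 1.4, 1.3, 1.1,
]
sales = pd.Series(sales_billions, name="Sales (billions USD)")

# Each chain's share of the combined sales of the 30 chains, in percent.
share_pct = sales / sales.sum() * 100

print(round(sales.sum(), 1))        # combined sales, about 172.2
print(share_pct.head(3).round(1))   # the first row (McDonald's) is about 21.8
```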

6. Display Top Three Chains

In [31]:

top_3_chains = df.head(3)['Chain']
top_3_chains

Out[31]:

0    McDonald's
1     Starbucks
2        Subway
Name: Chain, dtype: object
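
`df.head(3)` returns the first three rows in file order, so it gives the top three only because the file is already sorted by rank. In the full Sales column shown earlier, the row at index 7 (9.0 billion) has higher sales than the row at index 6 (5.9 billion), so file order does not strictly follow sales. If "top" is meant by sales, one possible sketch is `nlargest`, shown here on a hypothetical stand-in frame `sample` built from the `df.head()` rows.

```python
import pandas as pd

# Stand-in for df: the five rows printed by df.head(), deliberately shuffled.
sample = pd.DataFrame({
    "Chain": ["Taco Bell", "Subway", "McDonald's", "Burger King", "Starbucks"],
    "Sales (U.S., 2017)": [9_300_000_000, 10_800_000_000, 37_500_000_000,
                           9_800_000_000, 13_200_000_000],
})

# nlargest sorts by the given column, so the result does not depend on row order.
top_3_by_sales = sample.nlargest(3, "Sales (U.S., 2017)")["Chain"]

print(top_3_by_sales.tolist())
```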

7. Identify the 5th Ranking Fast Food Chain

In [33]:

df.loc[4, 'Chain']

Out[33]:

'Taco Bell'
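
`df.loc[4, 'Chain']` works because the default index starts at 0, so label 4 happens to be the row with rank 5. A sketch that looks the chain up by the `Rank` value itself, and so does not depend on the index, might look like this; `sample` is a hypothetical stand-in built from the `df.head()` rows.

```python
import pandas as pd

# Stand-in for df: the five rows printed by df.head().
sample = pd.DataFrame({
    "Rank": [1, 2, 3, 4, 5],
    "Chain": ["McDonald's", "Starbucks", "Subway", "Burger King", "Taco Bell"],
})

# Option 1: boolean mask on the Rank column, then take the single match.
fifth_by_mask = sample.loc[sample["Rank"] == 5, "Chain"].iloc[0]

# Option 2: make Rank the index so that .loc[5] means "rank 5".
fifth_by_index = sample.set_index("Rank").loc[5, "Chain"]

print(fifth_by_mask, fifth_by_index)
```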

8. How many chains have more than 5,000 locations?

In [35]:

df['# of Locations (U.S.)'] = df['# of Locations (U.S.)'].str.replace(',', '').astype(int)
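
The conversion above overwrites the column in place, so running the cell a second time fails: the column is no longer text and the `.str` accessor raises an error. One alternative, sketched here under the assumption that the file stores the counts as quoted values such as `"14,036"`, is to let `read_csv` strip the separators through its `thousands` parameter. The inline CSV text is a stand-in built from the `df.head()` rows, not the real file.

```python
import io

import pandas as pd

# Stand-in for the CSV file: header plus three rows from df.head().
csv_text = """Rank,Chain,"Sales (U.S., 2017)",# of Locations (U.S.)
1,McDonald's,37500000000,"14,036"
2,Starbucks,13200000000,"13,930"
3,Subway,10800000000,"25,908"
"""

# thousands="," tells the parser to drop commas inside numeric fields.
parsed = pd.read_csv(io.StringIO(csv_text), thousands=",")

locations = parsed["# of Locations (U.S.)"]
print(locations.dtype)
print(locations.tolist())
```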

In [36]:

(df['# of Locations (U.S.)'] > 5000).sum()

Out[36]:

9. Analyse Sales Distribution Across Food Chains

In [38]:

median_sales = df['Sales (U.S., 2017)'].median()
median_sales

Out[38]:

3600000000.0
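
The median of 3.6 billion sits well below the mean of 5.74 billion reported by `df.describe()`, which points to a right-skewed distribution: a few very large chains pull the mean up. As an illustrative sketch (not part of the original notebook), the comparison can be reproduced from the 30 sales values shown earlier, entered here in billions of dollars.

```python
import pandas as pd

# The 30 values of "Sales (U.S., 2017)" from the earlier output, in billions of USD.
sales = pd.Series([
    37.5, 13.2, 10.8, 9.8, 9.3, 9.3, 5.9, 9.0, 5.9, 5.5,
    4.5, 4.5, 4.4, 4.4, 3.6, 3.6, 3.5, 3.5, 3.2, 3.1,
    2.3, 2.3, 2.2, 2.1, 2.1, 1.5, 1.4, 1.4, 1.3, 1.1,
])

mean_sales = sales.mean()
median_sales = sales.median()

# Number of chains whose sales exceed the mean.
above_mean = int((sales > mean_sales).sum())

print(round(mean_sales, 2), median_sales, above_mean)
```

Only 9 of the 30 chains are above the mean, which is why the median is the more representative "typical" value here.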

10. If you want to select the third row from the DataFrame `df` using positional indexing, which of the following would be correct?

In [ ]:
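
The answer options are not included here, but positional indexing in pandas is done with `.iloc`, and positions start at 0, so the third row is `df.iloc[2]`. The sketch below shows this on a hypothetical stand-in frame `sample` built from the `df.head()` rows, and contrasts it with label-based `.loc` once the index is no longer the default integers.

```python
import pandas as pd

# Stand-in for df: the five rows printed by df.head().
sample = pd.DataFrame({
    "Rank": [1, 2, 3, 4, 5],
    "Chain": ["McDonald's", "Starbucks", "Subway", "Burger King", "Taco Bell"],
})

# Position 2 is the third row, whatever the index labels are.
third_row = sample.iloc[2]

# With Chain as the index, labels are strings, but positions are unchanged.
relabelled = sample.set_index("Chain")
third_row_relabelled = relabelled.iloc[2]

print(third_row["Chain"])
print(third_row_relabelled.name)
```

On `relabelled`, `relabelled.loc[2]` would raise a `KeyError` because 2 is not one of the index labels, while `relabelled.iloc[2]` still returns the third row.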

Statement of Completion#8ee7eb03
