Statement of Completion#311697b8
Intro to Pandas for Data Analysis
easy
Practice Series Filtering
Take a deep look at the dataset and Pandas series we are working with¶
In [1]:
# Import the Pandas library
import pandas as pd
In [2]:
# Read in the data from the CSV file
data = pd.read_csv('leadersdata.csv')
data
Out[2]:
 | Player | I | R | B | Outs | Avg | SR | HS | 4s | 6s | 50 | 100 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A Bagai | 15 | 343 | 556 | 13 | 26.38 | 61.69 | 84 | 38 | 0 | 2 | 0 |
1 | A Balbirnie | 6 | 236 | 260 | 6 | 39.33 | 90.77 | 97 | 20 | 4 | 2 | 0 |
2 | A Codrington | 5 | 28 | 65 | 5 | 5.60 | 43.08 | 16 | 2 | 0 | 0 | 0 |
3 | A Flintoff | 12 | 248 | 357 | 12 | 20.67 | 69.47 | 64 | 20 | 7 | 1 | 0 |
4 | A Flower | 7 | 332 | 459 | 7 | 47.43 | 72.33 | 71 | 30 | 1 | 3 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
643 | Younis Khan | 17 | 349 | 493 | 16 | 21.81 | 70.79 | 72 | 23 | 1 | 2 | 0 |
644 | YS Chahal | 1 | 5 | 5 | 1 | 5.00 | 100.00 | 5 | 1 | 0 | 0 | 0 |
645 | Yuvraj Singh | 21 | 738 | 817 | 14 | 52.71 | 90.33 | 113 | 68 | 13 | 7 | 1 |
646 | Z Khan | 10 | 52 | 70 | 9 | 5.78 | 74.29 | 15 | 6 | 0 | 0 | 0 |
647 | ZE Surkari | 6 | 104 | 231 | 6 | 17.33 | 45.02 | 34 | 7 | 0 | 0 | 0 |
648 rows × 12 columns
In [3]:
data.columns
Out[3]:
Index(['Player', 'I', 'R', 'B', 'Outs', 'Avg', 'SR', 'HS', '4s', '6s', '50', '100'], dtype='object')
In [4]:
# Setting Player's Name as the index
data.set_index('Player', inplace=True)
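As a possible alternative to calling set_index after loading, the index can be chosen at read time. The sketch below assumes a tiny in-memory CSV (csv_text, sample and runs_sample are illustrative names, with two rows copied from the table above) rather than the real leadersdata.csv file.

```python
import io
import pandas as pd

# Two-row stand-in for leadersdata.csv (values taken from the table above)
csv_text = "Player,I,R\nA Bagai,15,343\nA Balbirnie,6,236\n"

# index_col makes 'Player' the index while reading, so no set_index call is needed
sample = pd.read_csv(io.StringIO(csv_text), index_col="Player")

# Selecting a column now yields a Series labelled by player name
runs_sample = sample["R"]
print(runs_sample["A Bagai"])
```

Either approach gives Series that can be looked up by player name, which is what the activities below rely on.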
In [5]:
# Creating pandas series for each column
innings = data['I']
runs = data['R']
balls = data['B']
outs = data['Outs']
batting_average = data['Avg']
strike_rate = data['SR']
highest_score = data['HS']
number_of_fours = data['4s']
number_of_sixes = data['6s']
number_of_fifties = data['50']
number_of_hundreds = data['100']
In [6]:
# Printing the first 5 rows of each series
print("Innings:\n", innings.head())
print("Runs:\n", runs.head())
print("Balls:\n", balls.head())
print("Outs:\n", outs.head())
print("Batting Average:\n", batting_average.head())
print("Strike Rate:\n", strike_rate.head())
print("Highest Score:\n", highest_score.head())
print("Number of Fours:\n", number_of_fours.head())
print("Number of Sixes:\n", number_of_sixes.head())
print("Number of Fifties:\n", number_of_fifties.head())
print("Number of Hundreds:\n", number_of_hundreds.head())
Innings:
Player
A Bagai         15
A Balbirnie      6
A Codrington     5
A Flintoff      12
A Flower         7
Name: I, dtype: int64
Runs:
Player
A Bagai         343
A Balbirnie     236
A Codrington     28
A Flintoff      248
A Flower        332
Name: R, dtype: int64
Balls:
Player
A Bagai         556
A Balbirnie     260
A Codrington     65
A Flintoff      357
A Flower        459
Name: B, dtype: int64
Outs:
Player
A Bagai         13
A Balbirnie      6
A Codrington     5
A Flintoff      12
A Flower         7
Name: Outs, dtype: int64
Batting Average:
Player
A Bagai         26.38
A Balbirnie     39.33
A Codrington     5.60
A Flintoff      20.67
A Flower        47.43
Name: Avg, dtype: float64
Strike Rate:
Player
A Bagai         61.69
A Balbirnie     90.77
A Codrington    43.08
A Flintoff      69.47
A Flower        72.33
Name: SR, dtype: float64
Highest Score:
Player
A Bagai         84
A Balbirnie     97
A Codrington    16
A Flintoff      64
A Flower        71
Name: HS, dtype: int64
Number of Fours:
Player
A Bagai         38
A Balbirnie     20
A Codrington     2
A Flintoff      20
A Flower        30
Name: 4s, dtype: int64
Number of Sixes:
Player
A Bagai         0
A Balbirnie     4
A Codrington    0
A Flintoff      7
A Flower        1
Name: 6s, dtype: int64
Number of Fifties:
Player
A Bagai         2
A Balbirnie     2
A Codrington    0
A Flintoff      1
A Flower        3
Name: 50, dtype: int64
Number of Hundreds:
Player
A Bagai         0
A Balbirnie     0
A Codrington    0
A Flintoff      0
A Flower        0
Name: 100, dtype: int64
Activities¶
1. How many players have a batting average greater than 30 in the batting_average series?¶
In [7]:
# try your code here
count = (batting_average > 30).sum()
print(count)
169
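The count above works because a comparison on a Series produces a boolean Series, and summing it treats each True as 1. A minimal sketch of that mechanism, assuming a three-player sample (avg_sample, mask, count_sample and players_above are illustrative names; the values come from the first rows of the table above):

```python
import pandas as pd

# Three batting averages taken from the head of the dataset
avg_sample = pd.Series({"A Bagai": 26.38, "A Balbirnie": 39.33, "A Flower": 47.43})

# The comparison returns one True/False per player
mask = avg_sample > 30

# Summing the mask counts the True entries
count_sample = mask.sum()

# The same mask can also filter the Series to show who qualified
players_above = avg_sample[mask].index.tolist()
print(count_sample, players_above)
```

The same pattern applies to the other counting activities in this notebook; only the condition changes.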
2. What is the maximum number of runs scored by a player in the runs series?¶
In [8]:
# try your code here
max_runs = runs.max()
print(f"maximum number of runs: {max_runs}")
maximum number of runs: 1532
3. Name the player with maximum runs¶
In [11]:
# try your code here
player_name = runs.idxmax()
print(f"Name of the player with maximum runs : {player_name}")
Name of the player with maximum runs : KC Sangakkara
4. Name the player who played least number of balls¶
In [15]:
# try your code here
min_balls = balls.min()
player_with_min_balls = balls[balls == min_balls].index
if len(player_with_min_balls) > 1:
    result = f"{player_with_min_balls[0]}, {player_with_min_balls[-1]}"
else:
    result = player_with_min_balls[0]
print(f"the names of the player with the least number of balls: {result}")
the names of the player with the least number of balls: CJ Jordan, XJ Doherty
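When several players share the minimum, idxmin returns only the first of them, and reporting just the first and last tied labels would skip any in between. The sketch below shows one way to list every tied player; it assumes a made-up five-player Series (balls_sample and the labels P1 to P5 are hypothetical, not from the dataset).

```python
import pandas as pd

# Hypothetical balls-faced values with a three-way tie for the minimum
balls_sample = pd.Series({"P1": 3, "P2": 1, "P3": 7, "P4": 1, "P5": 1})

# idxmin reports only the first label holding the minimum
first_only = balls_sample.idxmin()

# Filtering on equality with the minimum keeps every tied label
all_tied = balls_sample[balls_sample == balls_sample.min()].index.tolist()

# Joining the full list avoids dropping players in the middle of the tie
result_sample = ", ".join(all_tied)
print(first_only, "|", result_sample)
```

With exactly two tied players, as in the output above, both approaches print the same names.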
5. How many players have played more than 500 balls in the balls series?¶
In [17]:
# try your code here
count = (balls > 500).sum()
print(count)
70
6. What is the mean value of the batting_average series?¶
In [18]:
# try your code here
batting_average.mean()
Out[18]:
20.89378086419753
7. How many players have a strike rate not equal to 70 in the strike_rate series?¶
In [20]:
# try your code here
count = (strike_rate != 70).sum()
count
Out[20]:
648
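The result equals the number of rows, so no player has a strike rate of exactly 70. One caveat worth keeping in mind with != is that missing values also compare as "not equal". The sketch below illustrates this on a made-up Series (sr_sample is hypothetical; the notebook does not say whether strike_rate contains any missing values).

```python
import numpy as np
import pandas as pd

# Hypothetical strike rates, including one missing value
sr_sample = pd.Series([70.0, 61.69, np.nan, 90.77])

# NaN != 70 evaluates to True, so the missing entry is counted here
not_70 = (sr_sample != 70).sum()

# Combining with notna() counts only known values that differ from 70
not_70_known = ((sr_sample != 70) & sr_sample.notna()).sum()
print(not_70, not_70_known)
```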
8. What is the minimum number of innings played by a player in the innings series?¶
In [21]:
# try your code here
min_innings = innings.min()
min_innings
Out[21]:
1
9. How many players have a batting average greater than 50 in the batting_average series?¶
In [22]:
# try your code here
count = (batting_average > 50).sum()
print(count)
52
10. How many players have a batting average between 20 and 30 (inclusive) in the batting_average series?¶
In [26]:
# try your code here
count = ((batting_average >= 20) & (batting_average <= 30)).sum()
print(count)
113
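The inclusive range test can also be written with Series.between, which includes both endpoints by default. A small sketch comparing the two forms, assuming a made-up five-value Series (avg_sample is illustrative and deliberately includes the boundary values 20 and 30):

```python
import pandas as pd

# Illustrative averages, including both boundary values
avg_sample = pd.Series([5.60, 20.0, 26.38, 30.0, 39.33])

# Explicit form: two comparisons combined with &, each in parentheses
explicit = ((avg_sample >= 20) & (avg_sample <= 30)).sum()

# between() checks lower <= value <= upper by default
with_between = avg_sample.between(20, 30).sum()
print(explicit, with_between)
```

The parentheses in the explicit form matter because & binds more tightly than the comparison operators.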
11. Calculating the Average Balls Faced by a Player¶
In [30]:
# try your code here
average_balls = balls.mean()
print(f"The average number of balls faced by a player is: {average_balls:.2f}")
The average number of balls faced by a player is: 195.15
12. How many players have a strike rate greater than 120 in the strike_rate series?¶
In [31]:
# try your code here
count = (strike_rate > 120).sum()
count
Out[31]:
39
13. Provide the names of the top three players from the strike_rate series¶
In [33]:
# try your code here
top_three_players = strike_rate.sort_values(ascending=False).head(3).index
result = ", ".join(top_three_players)
print(f"the names of the top three players: {result}")
the names of the top three players: KD Mills, LD Chandimal, F Behardien
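Sorting the whole Series and taking the head is one way to get the top entries; Series.nlargest is a shorter alternative. A sketch comparing the two, assuming a five-player sample (sr_sample is an illustrative name, with strike rates taken from the head of the dataset):

```python
import pandas as pd

# Strike rates of the first five players in the table
sr_sample = pd.Series({
    "A Bagai": 61.69,
    "A Balbirnie": 90.77,
    "A Codrington": 43.08,
    "A Flintoff": 69.47,
    "A Flower": 72.33,
})

# Approach used in the cell above: full sort, then the first three labels
by_sort = sr_sample.sort_values(ascending=False).head(3).index.tolist()

# nlargest returns the three highest values directly, in descending order
by_nlargest = sr_sample.nlargest(3).index.tolist()
print(by_sort)
print(by_nlargest)
```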
14. Sum of Maximums from number_of_fours and number_of_sixes Series¶
In [35]:
# try your code here
max_fours = number_of_fours.max()
max_six = number_of_sixes.max()
sum_maximum = max_fours + max_six
print(f"the sum of the maximum values: {sum_maximum}")
the sum of the maximum values: 196
15. How many players have a batting average below 10 in the batting_average series?¶
In [36]:
# try your code here
count = (batting_average < 10).sum()
count
Out[36]:
220
16. Name the player who hit maximum sixes¶
In [38]:
# try your code here
max_sixes_player = number_of_sixes.idxmax()
max_sixes = number_of_sixes.max()
print(f"{max_sixes_player}, {max_sixes}")
CH Gayle, 49
17. How many players have a strike rate between 80 and 90 (inclusive) in the strike_rate series?¶
In [39]:
# try your code here
count = ((strike_rate >= 80) & (strike_rate <= 90)).sum()
count
Out[39]:
92
18. What is the total number of runs scored by all players in the runs series?¶
In [40]:
# try your code here
total_runs = runs.sum()
total_runs
Out[40]:
102041
19. What is the range (difference between the maximum and minimum values) of the number_of_fifties series?¶
In [43]:
# try your code here
range_of_fifties = number_of_fifties.max() - number_of_fifties.min()
range_of_fifties
Out[43]:
10
20. How many players have a strike rate below 60 in the strike_rate series?¶
In [44]:
# try your code here
count = (strike_rate < 60).sum()
count
Out[44]:
221
21. Calculating the Mean Number of Boundaries (Fours + Sixes) Hit by a Player¶
In [46]:
# try your code here
total_boundaries = number_of_fours + number_of_sixes
mean_boundaries = total_boundaries.mean()
mean_boundaries
Out[46]:
17.566358024691358
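Adding two Series combines them element by element, matched on index label rather than position. The sketch below illustrates this with a three-player sample whose second Series is deliberately in a different order (fours_sample, sixes_sample and boundaries_sample are illustrative names; the values come from the table above).

```python
import pandas as pd

# Fours and sixes for three players; the sixes are listed in a different order
fours_sample = pd.Series({"A Bagai": 38, "A Balbirnie": 20, "A Flintoff": 20})
sixes_sample = pd.Series({"A Flintoff": 7, "A Bagai": 0, "A Balbirnie": 4})

# Addition aligns on the player label, so each player gets their own total
boundaries_sample = fours_sample + sixes_sample

# The mean is then taken over the per-player totals
mean_sample = boundaries_sample.mean()
print(boundaries_sample)
print(mean_sample)
```

In the notebook both Series come from the same DataFrame, so their indexes already match and the alignment is effectively row by row.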
22. Players with highest score in highest_score series¶
In [58]:
# try your code here
sorted_scores = highest_score.sort_values(ascending=False)
top_five_scores = sorted_scores.head(5)