Intro to Pandas for Data Analysis
Practice Series Filtering
Take a deep look at the dataset and Pandas series we are working with¶
In [1]:
# Import the Pandas library
import pandas as pd
In [3]:
# Read in the data from the CSV file
data = pd.read_csv('leadersdata.csv')
data
Out[3]:
| | Player | I | R | B | Outs | Avg | SR | HS | 4s | 6s | 50 | 100 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A Bagai | 15 | 343 | 556 | 13 | 26.38 | 61.69 | 84 | 38 | 0 | 2 | 0 |
1 | A Balbirnie | 6 | 236 | 260 | 6 | 39.33 | 90.77 | 97 | 20 | 4 | 2 | 0 |
2 | A Codrington | 5 | 28 | 65 | 5 | 5.60 | 43.08 | 16 | 2 | 0 | 0 | 0 |
3 | A Flintoff | 12 | 248 | 357 | 12 | 20.67 | 69.47 | 64 | 20 | 7 | 1 | 0 |
4 | A Flower | 7 | 332 | 459 | 7 | 47.43 | 72.33 | 71 | 30 | 1 | 3 | 0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
643 | Younis Khan | 17 | 349 | 493 | 16 | 21.81 | 70.79 | 72 | 23 | 1 | 2 | 0 |
644 | YS Chahal | 1 | 5 | 5 | 1 | 5.00 | 100.00 | 5 | 1 | 0 | 0 | 0 |
645 | Yuvraj Singh | 21 | 738 | 817 | 14 | 52.71 | 90.33 | 113 | 68 | 13 | 7 | 1 |
646 | Z Khan | 10 | 52 | 70 | 9 | 5.78 | 74.29 | 15 | 6 | 0 | 0 | 0 |
647 | ZE Surkari | 6 | 104 | 231 | 6 | 17.33 | 45.02 | 34 | 7 | 0 | 0 | 0 |
648 rows × 12 columns
In [4]:
data.columns
Out[4]:
Index(['Player', 'I', 'R', 'B', 'Outs', 'Avg', 'SR', 'HS', '4s', '6s', '50', '100'], dtype='object')
In [5]:
# Setting Player's Name as the index
data.set_index('Player', inplace=True)
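As a hedged aside, the same index can be set without `inplace=True`, either while reading the file (`index_col`) or by assigning the result of `set_index`. The sketch below uses a few rows copied from the table above in place of `leadersdata.csv`; the names `csv_text`, `by_read`, `raw`, and `by_set` are illustrative and not part of the original notebook.

```python
import io
import pandas as pd

# A few rows copied from the table above; the notebook itself reads leadersdata.csv.
csv_text = """Player,I,R,B,Outs,Avg,SR,HS,4s,6s,50,100
A Bagai,15,343,556,13,26.38,61.69,84,38,0,2,0
A Balbirnie,6,236,260,6,39.33,90.77,97,20,4,2,0
A Codrington,5,28,65,5,5.60,43.08,16,2,0,0,0
"""

# Option 1: set the index while reading the CSV.
by_read = pd.read_csv(io.StringIO(csv_text), index_col='Player')

# Option 2: set_index without inplace returns a new DataFrame
# and leaves the original untouched.
raw = pd.read_csv(io.StringIO(csv_text))
by_set = raw.set_index('Player')

print(by_read.index.name)        # name of the index in option 1
print('Player' in raw.columns)   # raw still keeps Player as a column
```

Either option leaves the original frame available, which can make it easier to re-run a cell without hitting a `KeyError` on a column that has already been moved into the index.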
In [6]:
# Creating pandas series for each column
innings = data['I']
runs = data['R']
balls = data['B']
outs = data['Outs']
batting_average = data['Avg']
strike_rate = data['SR']
highest_score = data['HS']
number_of_fours = data['4s']
number_of_sixes = data['6s']
number_of_fifties = data['50']
number_of_hundreds = data['100']
In [7]:
# Printing the first 5 rows of each series
print("Innings:\n", innings.head())
print("Runs:\n", runs.head())
print("Balls:\n", balls.head())
print("Outs:\n", outs.head())
print("Batting Average:\n", batting_average.head())
print("Strike Rate:\n", strike_rate.head())
print("Highest Score:\n", highest_score.head())
print("Number of Fours:\n", number_of_fours.head())
print("Number of Sixes:\n", number_of_sixes.head())
print("Number of Fifties:\n", number_of_fifties.head())
print("Number of Hundreds:\n", number_of_hundreds.head())
Innings:
Player
A Bagai         15
A Balbirnie      6
A Codrington     5
A Flintoff      12
A Flower         7
Name: I, dtype: int64
Runs:
Player
A Bagai         343
A Balbirnie     236
A Codrington     28
A Flintoff      248
A Flower        332
Name: R, dtype: int64
Balls:
Player
A Bagai         556
A Balbirnie     260
A Codrington     65
A Flintoff      357
A Flower        459
Name: B, dtype: int64
Outs:
Player
A Bagai         13
A Balbirnie      6
A Codrington     5
A Flintoff      12
A Flower         7
Name: Outs, dtype: int64
Batting Average:
Player
A Bagai         26.38
A Balbirnie     39.33
A Codrington     5.60
A Flintoff      20.67
A Flower        47.43
Name: Avg, dtype: float64
Strike Rate:
Player
A Bagai         61.69
A Balbirnie     90.77
A Codrington    43.08
A Flintoff      69.47
A Flower        72.33
Name: SR, dtype: float64
Highest Score:
Player
A Bagai         84
A Balbirnie     97
A Codrington    16
A Flintoff      64
A Flower        71
Name: HS, dtype: int64
Number of Fours:
Player
A Bagai         38
A Balbirnie     20
A Codrington     2
A Flintoff      20
A Flower        30
Name: 4s, dtype: int64
Number of Sixes:
Player
A Bagai         0
A Balbirnie     4
A Codrington    0
A Flintoff      7
A Flower        1
Name: 6s, dtype: int64
Number of Fifties:
Player
A Bagai         2
A Balbirnie     2
A Codrington    0
A Flintoff      1
A Flower        3
Name: 50, dtype: int64
Number of Hundreds:
Player
A Bagai         0
A Balbirnie     0
A Codrington    0
A Flintoff      0
A Flower        0
Name: 100, dtype: int64
Activities¶
1. How many players have a batting average greater than 30 in the batting_average series?¶
In [12]:
len(batting_average[batting_average > 30])
Out[12]:
169
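The count above comes from filtering with a boolean mask and taking the length of the result. A minimal sketch of the same idea, using only the first five `batting_average` values printed earlier (so the count differs from the full dataset), shows that summing the mask gives the same number; `mask`, `count_by_len`, and `count_by_sum` are illustrative names.

```python
import pandas as pd

# First five rows of batting_average, as printed earlier in the notebook.
batting_average = pd.Series(
    [26.38, 39.33, 5.60, 20.67, 47.43],
    index=['A Bagai', 'A Balbirnie', 'A Codrington', 'A Flintoff', 'A Flower'],
    name='Avg',
)

mask = batting_average > 30               # boolean Series, one flag per player
count_by_len = len(batting_average[mask]) # filter first, then count rows
count_by_sum = int(mask.sum())            # each True counts as 1

print(count_by_len, count_by_sum)
```

Summing the mask skips building the filtered Series, which is a small convenience when only the count is needed.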
2. What is the maximum number of runs scored by a player in the runs series?¶
In [22]:
runs.max()
Out[22]:
1532
3. Name the player with the most runs¶
In [24]:
runs[runs == runs.max()]
Out[24]:
Player
KC Sangakkara    1532
Name: R, dtype: int64
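Because the series is indexed by player name, `Series.idxmax()` is another way to get the name directly. The sketch below is built from the ten rows visible in the table preview, so its answer applies to that sample only, not to the full 648-row dataset.

```python
import pandas as pd

# Runs for the ten players visible in the table preview (not the full dataset).
runs = pd.Series(
    [343, 236, 28, 248, 332, 349, 5, 738, 52, 104],
    index=['A Bagai', 'A Balbirnie', 'A Codrington', 'A Flintoff', 'A Flower',
           'Younis Khan', 'YS Chahal', 'Yuvraj Singh', 'Z Khan', 'ZE Surkari'],
    name='R',
)

top_by_mask = runs[runs == runs.max()]  # Series holding the matching row(s)
top_label = runs.idxmax()               # index label of the first maximum

print(top_label, runs.max())
```

The mask form returns a Series (name plus value), while `idxmax()` returns just the label.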
4. Name the players who faced the fewest balls¶
In [27]:
balls[balls == balls.min()]
Out[27]:
Player
CJ Jordan        1
EC Rainsford     1
JJ Bumrah        1
KW Richardson    1
M Zondeki        1
MM Sharma        1
PHT Kaushal      1
PJK Mooney       1
PT Collins       1
SW Tait          1
XJ Doherty       1
Name: B, dtype: int64
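Eleven players tie for the minimum here, which is why the boolean mask is a good fit: it keeps every tied row. A small sketch with made-up values (the labels `P1` to `P4` and their numbers are hypothetical, not taken from the dataset) contrasts this with `idxmin()`, which reports only the first matching label.

```python
import pandas as pd

# Hypothetical toy data with a tie for the minimum (not from leadersdata.csv).
balls_demo = pd.Series([1, 12, 1, 30], index=['P1', 'P2', 'P3', 'P4'], name='B')

all_tied = balls_demo[balls_demo == balls_demo.min()]  # keeps every tied player
first_only = balls_demo.idxmin()                       # only the first label

print(list(all_tied.index))
print(first_only)
```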
5. How many players have played more than 500 balls in the balls series?¶
In [31]:
len(balls[balls > 500])
Out[31]:
70
6. What is the mean value of the batting_average series?¶
In [32]:
batting_average.mean()
Out[32]:
20.89378086419753
7. How many players have a strike rate not equal to 70 in the strike_rate series?¶
In [33]:
len(strike_rate[strike_rate!=70])
Out[33]:
648
8. What is the minimum number of innings played by a player in the innings series?¶
In [34]:
innings.min()
Out[34]:
1
9. How many players have a batting average greater than 50 in the batting_average series?¶
In [37]:
len(batting_average[batting_average > 50])
Out[37]:
52
10. How many players have a batting average between 20 and 30 (inclusive) in the batting_average series?¶
In [42]:
len(batting_average[(batting_average >= 20) & (batting_average <= 30)])
Out[42]:
113
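The two comparisons joined with `&` can also be written with `Series.between`, which includes both endpoints by default. This sketch uses the `Avg` values of the ten players visible in the table preview, so the count is for that sample only; `explicit` and `with_between` are illustrative names.

```python
import pandas as pd

# Batting averages for the ten players visible in the table preview.
batting_average = pd.Series(
    [26.38, 39.33, 5.60, 20.67, 47.43, 21.81, 5.00, 52.71, 5.78, 17.33],
    index=['A Bagai', 'A Balbirnie', 'A Codrington', 'A Flintoff', 'A Flower',
           'Younis Khan', 'YS Chahal', 'Yuvraj Singh', 'Z Khan', 'ZE Surkari'],
    name='Avg',
)

# Explicit form: each comparison needs its own parentheses around it.
explicit = batting_average[(batting_average >= 20) & (batting_average <= 30)]

# between() keeps values with 20 <= value <= 30 (both ends inclusive by default).
with_between = batting_average[batting_average.between(20, 30)]

print(len(explicit), len(with_between))
```

The parentheses in the explicit form matter because `&` binds more tightly than `>=` and `<=` in Python.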
11. Calculating the Average Balls Faced by a Player¶
In [43]:
balls.mean()
Out[43]:
195.14969135802468
12. How many players have a strike rate greater than 120 in the strike_rate series?¶
In [44]:
len(strike_rate[strike_rate > 120])
Out[44]:
39
13. Provide the names of the top three players by strike rate in the strike_rate series¶
In [51]:
strike_rate.sort_values(ascending = False).head(3)
Out[51]:
Player
KD Mills        233.33
LD Chandimal    216.67
F Behardien     205.56
Name: SR, dtype: float64
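Sorting and taking the head is one way to get a top-N list; `Series.nlargest` is a shorter alternative. The sketch below compares the two on the strike rates of the ten players visible in the table preview, so the names it prints are for that sample rather than the full dataset.

```python
import pandas as pd

# Strike rates for the ten players visible in the table preview.
strike_rate = pd.Series(
    [61.69, 90.77, 43.08, 69.47, 72.33, 70.79, 100.00, 90.33, 74.29, 45.02],
    index=['A Bagai', 'A Balbirnie', 'A Codrington', 'A Flintoff', 'A Flower',
           'Younis Khan', 'YS Chahal', 'Yuvraj Singh', 'Z Khan', 'ZE Surkari'],
    name='SR',
)

top_sorted = strike_rate.sort_values(ascending=False).head(3)  # sort, then slice
top_nlargest = strike_rate.nlargest(3)                         # direct top-3

print(list(top_nlargest.index))
```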
14. Sum of Maximums from number_of_fours and number_of_sixes Series¶
In [65]:
number_of_fours.max() + number_of_sixes.max()
Out[65]:
196
15. How many players have a batting average below 10 in the batting_average series?¶
In [58]:
len(batting_average[batting_average < 10])
Out[58]:
220
16. Name the player who hit the most sixes¶
In [60]:
number_of_sixes[number_of_sixes == number_of_sixes.max()]
Out[60]:
Player
CH Gayle    49
Name: 6s, dtype: int64
17. How many players have a strike rate between 80 and 90 (inclusive) in the strike_rate series?¶
In [61]:
len(strike_rate[(strike_rate >= 80) & (strike_rate <= 90)])
Out[61]:
92
18. What is the total number of runs scored by all players in the runs series?¶
In [62]:
runs.sum()
Out[62]:
102041
19. What is the range (difference between the maximum and minimum values) of the number_of_fifties series?¶
In [69]:
number_of_fifties.max() - number_of_fifties.min()
Out[69]:
10
20. How many players have a strike rate below 60 in the strike_rate series?¶
In [70]:
len(strike_rate[strike_rate < 60])
Out[70]:
221
21. Calculating the Mean Number of Boundaries (Fours + Sixes) Hit by a Player¶
In [71]:
(number_of_fours + number_of_sixes).mean()
Out[71]:
17.566358024691358
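Adding two Series combines values by index label, not by position, which is why `number_of_fours + number_of_sixes` gives a per-player total. A minimal sketch using the first five rows printed earlier (with the sixes deliberately listed in reverse order) illustrates the alignment; the mean it prints is for these five players only.

```python
import pandas as pd

players = ['A Bagai', 'A Balbirnie', 'A Codrington', 'A Flintoff', 'A Flower']

# First five rows of each series, as printed earlier in the notebook.
number_of_fours = pd.Series([38, 20, 2, 20, 30], index=players, name='4s')

# Same sixes values, but stored in reverse player order on purpose.
number_of_sixes = pd.Series([1, 7, 0, 4, 0], index=players[::-1], name='6s')

# Addition matches rows by player name, so the ordering difference does not matter.
boundaries = number_of_fours + number_of_sixes

print(boundaries['A Flintoff'])  # 20 fours + 7 sixes
print(boundaries.mean())
```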
22. Top five players by highest score in the highest_score series¶
In [75]:
sorted_scores = highest_score.sort_values(ascending=False)
top_five_scores = sorted_scores.head(5)
print(top_five_scores)
Player
MJ Guptill    237
CH Gayle      215
DA Warner     178
V Sehwag      175
CB Wishart    172
Name: HS, dtype: int64