Project: Querying and Filtering Pokemon data¶

This project will help you practice your pandas querying and filtering skills. Let's begin!

Task 0 - Setup¶

There isn't much to do here: we provide the required imports and read the pokemon CSV we'll be working with.

In [3]:

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns

In [4]:

df = pd.read_csv("pokemon.csv")

In [5]:

df.head()

Out[5]:

	#	Name	Type 1	Type 2	Total	HP	Attack	Defense	Sp. Atk	Sp. Def	Speed	Generation	Legendary
0	1	Bulbasaur	Grass	Poison	318	45	49	49	65	65	45	1	False
1	2	Ivysaur	Grass	Poison	405	60	62	63	80	80	60	1	False
2	3	Venusaur	Grass	Poison	525	80	82	83	100	100	80	1	False
3	4	Charmander	Fire	NaN	309	39	52	43	60	50	65	1	False
4	5	Charmeleon	Fire	NaN	405	58	64	58	80	65	80	1	False

In [4]:

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 721 entries, 0 to 720
Data columns (total 13 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   #           721 non-null    int64 
 1   Name        721 non-null    object
 2   Type 1      721 non-null    object
 3   Type 2      359 non-null    object
 4   Total       721 non-null    int64 
 5   HP          721 non-null    int64 
 6   Attack      721 non-null    int64 
 7   Defense     721 non-null    int64 
 8   Sp. Atk     721 non-null    int64 
 9   Sp. Def     721 non-null    int64 
 10  Speed       721 non-null    int64 
 11  Generation  721 non-null    int64 
 12  Legendary   721 non-null    bool  
dtypes: bool(1), int64(9), object(3)
memory usage: 68.4+ KB

In [5]:

df.describe()

Out[5]:

	#	Total	HP	Attack	Defense	Sp. Atk	Sp. Def	Speed	Generation
count	721.00000	721.000000	721.000000	721.000000	721.000000	721.000000	721.000000	721.000000	721.000000
mean	361.00000	417.945908	68.380028	75.124827	70.697642	68.848821	69.180305	65.714286	3.323162
std	208.27906	109.663671	25.848272	29.070335	29.194941	28.898590	26.899364	27.277920	1.669873
min	1.00000	180.000000	1.000000	5.000000	5.000000	10.000000	20.000000	5.000000	1.000000
25%	181.00000	320.000000	50.000000	54.000000	50.000000	45.000000	50.000000	45.000000	2.000000
50%	361.00000	424.000000	65.000000	75.000000	65.000000	65.000000	65.000000	65.000000	3.000000
75%	541.00000	499.000000	80.000000	95.000000	85.000000	90.000000	85.000000	85.000000	5.000000
max	721.00000	720.000000	255.000000	165.000000	230.000000	154.000000	230.000000	160.000000	6.000000

Distribution of Pokemon Types:¶

In [ ]:

df['Type 1'].value_counts().plot(kind='pie', autopct='%1.1f%%', cmap='tab20c', figsize=(10, 8))

Distribution of Pokemon Totals:¶

In [ ]:

df['Total'].plot(kind='hist', figsize=(10, 8))

In [ ]:

df['Total'].plot(kind='box', vert=False, figsize=(10, 5))

Distribution of Legendary Pokemons:¶

In [ ]:

df['Legendary'].value_counts().plot(kind='pie', autopct='%1.1f%%', cmap='Set3', figsize=(10, 8))

Basic filtering¶

Let's start with a few simple activities regarding filtering.

1. How many Pokemons exist with an `Attack` value greater than 150?¶

With a little visual exploration, we can get a sense of the most "powerful" pokemons (as measured by their "Attack" feature). A boxplot is a great way to visualize this:

In [ ]:

sns.boxplot(data=df, x='Attack')

In [ ]:

# Try your code here
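
One possible approach, sketched on a small stand-in frame rather than the full CSV: build a boolean mask and sum it, since each `True` counts as 1. The `toy` frame and the `count_above` helper are illustrative names that do not appear in the notebook; the Attack values are copied from rows printed elsewhere in this notebook.

```python
import pandas as pd

# Stand-in for df: Attack values copied from rows printed in this notebook.
toy = pd.DataFrame({
    "Name": ["Bulbasaur", "Haxorus", "Zekrom", "Groudon"],
    "Attack": [49, 147, 150, 150],
})

def count_above(frame, column, threshold):
    """Count rows whose value in `column` is strictly greater than `threshold`."""
    return int((frame[column] > threshold).sum())

above_100 = count_above(toy, "Attack", 100)  # Haxorus, Zekrom, Groudon
above_150 = count_above(toy, "Attack", 150)  # strict comparison: 150 itself is excluded
print(above_100, above_150)
```

On the full dataset, `count_above(df, "Attack", 150)` would be the matching call; `len(df.loc[df["Attack"] > 150])` should give the same number.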

2. Select all pokemons with a Speed of `10` or less¶

In [ ]:

sns.boxplot(data=df, x='Speed')

In [36]:

slow_pokemons_df = df.loc[df['Speed']<=10]

3. How many Pokemons have a `Sp. Def` value of 25 or less?¶

In [ ]:

# Try your code here
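
A minimal sketch for this one, again on a stand-in frame. Two details are worth noting: the column name `Sp. Def` contains a space and a dot, so it has to be accessed with brackets rather than as an attribute, and "25 or less" calls for the inclusive `<=`. The `toy` frame and the threshold of 45 are assumptions chosen so the small sample has matches; the task itself uses 25.

```python
import pandas as pd

# Stand-in for df: Sp. Def values copied from rows printed in this notebook.
toy = pd.DataFrame({
    "Name": ["Axew", "Noibat", "Tyrunt", "Deino", "Bulbasaur"],
    "Sp. Def": [40, 40, 45, 50, 65],
})

threshold = 45  # illustrative only; the task asks for 25
mask = toy["Sp. Def"] <= threshold  # inclusive: a value equal to the threshold matches
n_low = int(mask.sum())
low_names = toy.loc[mask, "Name"].tolist()
print(n_low, low_names)
```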

4. Select all the Legendary pokemons¶

In [38]:

# Try your code here
legendary_df = df.loc[df['Legendary']]

5. Find the outlier¶

Find the pokemon that is clearly an outlier in terms of Attack / Defense:

In [ ]:

ax = sns.scatterplot(data=df, x="Defense", y="Attack")
ax.annotate(
    "Who's this guy?", xy=(228, 10), xytext=(150, 10), color='red',
    arrowprops=dict(arrowstyle="->", color='red')
)

In [ ]:

# Try your code here
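
The plot suggests a point with very high Defense and very low Attack, so one way to isolate it is to combine two conditions. The sketch below uses a stand-in frame whose last row is made up to mimic that point; the cut-offs (Defense above 200, Attack below 50) are read off the plot by eye and are assumptions, not values given by the notebook.

```python
import pandas as pd

# Stand-in for df. The first four rows use values printed in this notebook;
# the last row is made up to imitate the annotated point on the scatterplot.
toy = pd.DataFrame({
    "Name": ["Zekrom", "Groudon", "Tyrantrum", "Goomy", "made-up outlier"],
    "Attack": [150, 150, 121, 50, 10],
    "Defense": [120, 140, 119, 35, 230],
})

# Cut-offs chosen by eye from the plot; both are assumptions.
outlier = toy.loc[(toy["Defense"] > 200) & (toy["Attack"] < 50)]

# Alternative without cut-offs: take the row with the highest Defense.
top_defense = toy.loc[toy["Defense"].idxmax()]
print(outlier["Name"].tolist(), top_defense["Name"])
```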

Advanced selection¶

Now let's use boolean operators to create more advanced expressions.

6. How many Fire-Flying Pokemons are there?¶

In [ ]:

# Try your code here
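
A sketch of one reading of this question, where "Fire-Flying" means `Type 1` is Fire and `Type 2` is Flying. Each comparison needs its own parentheses because `&` binds more tightly than `==`. The `toy` frame is a stand-in; Charizard does not appear in the output shown in this notebook and is included from general knowledge of its types.

```python
import numpy as np
import pandas as pd

# Stand-in for df. Missing second types are NaN, as in the real data.
toy = pd.DataFrame({
    "Name": ["Charmander", "Charmeleon", "Charizard", "Noivern", "Reshiram"],
    "Type 1": ["Fire", "Fire", "Fire", "Flying", "Dragon"],
    "Type 2": [np.nan, np.nan, "Flying", "Dragon", "Fire"],
})

# NaN compared with a string yields False, so single-type rows drop out.
fire_flying = toy.loc[(toy["Type 1"] == "Fire") & (toy["Type 2"] == "Flying")]
n_fire_flying = len(fire_flying)
print(n_fire_flying, fire_flying["Name"].tolist())
```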

7. How many 'Poison' pokemons are there across both types?¶

In [ ]:

# Try your code here
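
Here the two conditions are joined with `|` (or), because a pokemon qualifies if either type column is Poison. The sketch below runs on a stand-in frame built from rows printed in this notebook; `toy` and `is_poison` are illustrative names.

```python
import numpy as np
import pandas as pd

# Stand-in for df: types copied from rows printed in this notebook.
toy = pd.DataFrame({
    "Name": ["Bulbasaur", "Ivysaur", "Dragalge", "Charmander", "Goodra"],
    "Type 1": ["Grass", "Grass", "Poison", "Fire", "Dragon"],
    "Type 2": ["Poison", "Poison", "Dragon", np.nan, np.nan],
})

is_poison = (toy["Type 1"] == "Poison") | (toy["Type 2"] == "Poison")
n_poison = int(is_poison.sum())

# Equivalent: test both columns at once and keep rows where either matches.
n_poison_alt = int(toy[["Type 1", "Type 2"]].eq("Poison").any(axis=1).sum())
print(n_poison, n_poison_alt)
```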

8. Which pokemon of `Type 1` Ice has the strongest defense?¶

In [ ]:

# Try your code here
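
One way to approach this is to filter by `Type 1` first and then look up the row with the largest Defense using `idxmax`. The rows printed in this notebook include no `Type 1` Ice pokemon, so the sketch demonstrates the pattern with Dragon instead; `strongest_defense` is a hypothetical helper, and on the real data you would pass `"Ice"`.

```python
import pandas as pd

# Stand-in for df: values copied from rows printed in this notebook.
toy = pd.DataFrame({
    "Name": ["Zekrom", "Reshiram", "Kyurem", "Groudon", "Bulbasaur"],
    "Type 1": ["Dragon", "Dragon", "Dragon", "Ground", "Grass"],
    "Defense": [120, 100, 90, 140, 49],
})

def strongest_defense(frame, type_1):
    """Return the Name of the row with the highest Defense among `type_1` rows."""
    subset = frame.loc[frame["Type 1"] == type_1]
    # idxmax returns the index label of the first maximum if there is a tie.
    return subset.loc[subset["Defense"].idxmax(), "Name"]

best_dragon = strongest_defense(toy, "Dragon")
print(best_dragon)
```

Groudon has the highest Defense in the stand-in frame overall, but it is filtered out before the maximum is taken because its `Type 1` is Ground.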

9. What's the most common type of Legendary Pokemons?¶

In [ ]:

# Try your code here
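
A possible sketch, assuming "type" here means `Type 1`: select the Legendary rows, count the values of `Type 1`, and take the label with the largest count. The stand-in frame contains only a handful of rows from this notebook, so its result says nothing about the answer on the full dataset.

```python
import pandas as pd

# Stand-in for df: values copied from rows printed in this notebook.
toy = pd.DataFrame({
    "Name": ["Zekrom", "Reshiram", "Kyurem", "Groudon", "Goodra", "Haxorus"],
    "Type 1": ["Dragon", "Dragon", "Dragon", "Ground", "Dragon", "Dragon"],
    "Legendary": [True, True, True, True, False, False],
})

# The boolean column can be used directly as the row mask.
legendary_types = toy.loc[toy["Legendary"], "Type 1"].value_counts()
most_common = legendary_types.idxmax()
print(most_common, int(legendary_types.max()))
```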

10. What's the most powerful pokemon from the first 3 generations, of type water?¶

In [ ]:

# Try your code here
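
This question combines three ideas: a type check across both type columns, a generation filter with `isin`, and a sort by `Total`. The sketch below assumes "most powerful" means the highest `Total`, which is how the Dragon question further down is solved. The rows printed in this notebook include no Water pokemon, so the demonstration uses Fire; `strongest_of_type` is a hypothetical helper, and on the real data you would pass `"Water"`.

```python
import numpy as np
import pandas as pd

# Stand-in for df: values copied from rows printed in this notebook.
toy = pd.DataFrame({
    "Name": ["Charmander", "Charmeleon", "Groudon", "Reshiram"],
    "Type 1": ["Fire", "Fire", "Ground", "Dragon"],
    "Type 2": [np.nan, np.nan, "Fire", "Fire"],
    "Total": [309, 405, 670, 680],
    "Generation": [1, 1, 3, 5],
})

def strongest_of_type(frame, type_name, generations):
    """Name of the highest-Total row of `type_name` within `generations`."""
    has_type = (frame["Type 1"] == type_name) | (frame["Type 2"] == type_name)
    in_gens = frame["Generation"].isin(generations)
    subset = frame.loc[has_type & in_gens]
    return subset.sort_values(by="Total", ascending=False).iloc[0]["Name"]

best_fire = strongest_of_type(toy, "Fire", [1, 2, 3])
print(best_fire)
```

Reshiram has the highest `Total` in the stand-in frame, but it belongs to generation 5 and is excluded by the `isin` filter.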

11. What's the most powerful Dragon from the last two generations?¶

In [6]:

df['Generation'].value_counts()

Out[6]:

Generation
5    156
1    151
3    135
4    107
2    100
6     72
Name: count, dtype: int64

In [16]:

# Try your code here
is_dragon = (df['Type 1'] == 'Dragon') | (df['Type 2'] == 'Dragon')
df.loc[df['Generation'].isin((5, 6)) & is_dragon].sort_values(by='Total', ascending=False)

Out[16]:

	#	Name	Type 1	Type 2	Total	HP	Attack	Defense	Sp. Atk	Sp. Def	Speed	Generation	Legendary
643	644	Zekrom	Dragon	Electric	680	100	150	120	120	100	90	5	True
642	643	Reshiram	Dragon	Fire	680	100	120	100	150	120	90	5	True
645	646	Kyurem	Dragon	Ice	660	125	130	90	130	90	95	5	True
705	706	Goodra	Dragon	NaN	600	90	100	70	110	150	80	6	False
717	718	Zygarde50% Forme	Dragon	Ground	600	108	100	121	81	95	95	6	True
634	635	Hydreigon	Dark	Dragon	600	92	105	90	125	90	98	5	False
611	612	Haxorus	Dragon	NaN	540	76	147	90	60	70	97	5	False
714	715	Noivern	Flying	Dragon	535	85	70	80	97	80	123	6	False
696	697	Tyrantrum	Rock	Dragon	521	82	121	119	69	59	71	6	False
690	691	Dragalge	Poison	Dragon	494	65	75	90	97	123	44	6	False
620	621	Druddigon	Dragon	NaN	485	77	120	90	60	90	48	5	False
704	705	Sliggoo	Dragon	NaN	452	68	75	53	83	113	60	6	False
633	634	Zweilous	Dark	Dragon	420	72	85	70	65	70	58	5	False
610	611	Fraxure	Dragon	NaN	410	66	117	70	40	50	67	5	False
695	696	Tyrunt	Rock	Dragon	362	58	89	77	45	45	48	6	False
609	610	Axew	Dragon	NaN	320	46	87	60	30	40	57	5	False
632	633	Deino	Dark	Dragon	300	52	65	50	45	50	38	5	False
703	704	Goomy	Dragon	NaN	300	45	50	35	55	75	40	6	False
713	714	Noibat	Flying	Dragon	245	40	30	35	45	40	55	6	False
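
Note that the top two rows of this output, Zekrom and Reshiram, share the same `Total` of 680, so taking only the first row after sorting would hide a tie. A small sketch of one way to return every row that reaches the maximum, using a stand-in frame with values copied from the output above:

```python
import pandas as pd

# Stand-in for the filtered Dragon frame: values copied from the output above.
toy = pd.DataFrame({
    "Name": ["Zekrom", "Reshiram", "Kyurem", "Goodra"],
    "Total": [680, 680, 660, 600],
})

# Keep every row whose Total equals the maximum, not just the first one.
top = toy.loc[toy["Total"] == toy["Total"].max()]
top_names = top["Name"].tolist()
print(top_names)
```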

12. Select most powerful Fire-type pokemons¶

In [14]:

# Try your code here
powerful_fire_df = df.loc[(df['Type 1']=='Fire') & (df['Attack']>100)]

13. Select all Water-type, Flying-type pokemons¶

In [17]:

# Try your code here
water_flying_df = df.loc[(df['Type 1']=='Water') & (df['Type 2']=='Flying')]

14. Select specific columns of Legendary pokemons of type Fire¶

In [21]:

# Try your code here
legendary_fire_df = df.loc[(df['Type 1']=='Fire') & (df['Legendary']),['Name','Attack','Generation']]

15. Select Slow and Fast pokemons¶

This is the distribution of speed of the pokemons. The red lines indicate those bottom 5% and top 5% pokemons by speed:

In [23]:

ax = df['Speed'].plot(kind='hist', figsize=(10, 5), bins=100)
ax.axvline(df['Speed'].quantile(.05), color='red')
ax.axvline(df['Speed'].quantile(.95), color='red')

Out[23]:

<matplotlib.lines.Line2D at 0x73d63e579150>

In [33]:

# Try your code here
low, high = df['Speed'].quantile([0.05, 0.95])
slow_fast_df = df.loc[(df['Speed'] < low) | (df['Speed'] > high)]

16. Find the Ultra Powerful Legendary Pokemon¶

In [ ]:

fig, ax = plt.subplots(figsize=(14, 7))
sns.scatterplot(data=df, x="Defense", y="Attack", hue='Legendary', ax=ax)
ax.annotate(
    "Who's this guy?", xy=(140, 150), xytext=(160, 150), color='red',
    arrowprops=dict(arrowstyle="->", color='red')
)

In [35]:

# Try your code here
df.loc[(df['Attack']>140)&(df['Defense']>120)].sort_values(by='Defense', ascending = False)

Out[35]:

	#	Name	Type 1	Type 2	Total	HP	Attack	Defense	Sp. Atk	Sp. Def	Speed	Generation	Legendary
382	383	Groudon	Ground	Fire	670	100	150	140	100	90	90	3	True

Statement of Completion#b6aac54c

Intro to Pandas for Data Analysis

Practicing filtering sorting with Pokemon

The End!¶
