Statement of Completion#600cb069
Intro to Pandas for Data Analysis
medium
Practicing Series Vectorized Operations with Penguins Data
Look at the dataset
In [1]:
import pandas as pd
In [2]:
# Read the dataset into a DataFrame
df = pd.read_csv('penguins_cleaned.csv')
df
Out[2]:
| | species | island | culmen_length_mm | culmen_depth_mm | flipper_length_mm | body_mass_g | sex |
---|---|---|---|---|---|---|---|
0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | MALE |
1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | FEMALE |
2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | FEMALE |
3 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | FEMALE |
4 | Adelie | Torgersen | 39.3 | 20.6 | 190.0 | 3650.0 | MALE |
... | ... | ... | ... | ... | ... | ... | ... |
328 | Gentoo | Biscoe | 47.2 | 13.7 | 214.0 | 4925.0 | FEMALE |
329 | Gentoo | Biscoe | 46.8 | 14.3 | 215.0 | 4850.0 | FEMALE |
330 | Gentoo | Biscoe | 50.4 | 15.7 | 222.0 | 5750.0 | MALE |
331 | Gentoo | Biscoe | 45.2 | 14.8 | 212.0 | 5200.0 | FEMALE |
332 | Gentoo | Biscoe | 49.9 | 16.1 | 213.0 | 5400.0 | MALE |
333 rows × 7 columns
In [3]:
# Extract each column as a pandas Series (the 'sex' column is stored as `gender`)
species = df['species']
island = df['island']
culmen_length_mm = df['culmen_length_mm']
culmen_depth_mm = df['culmen_depth_mm']
flipper_length_mm = df['flipper_length_mm']
body_mass_g = df['body_mass_g']
gender = df['sex']
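Selecting one column with single brackets returns a pandas Series that keeps the column name and shares the DataFrame's index. The sketch below illustrates this on a hypothetical three-row frame (`sample_df`, built by hand from the first rows shown above) so it runs without the CSV file:

```python
import pandas as pd

# Hypothetical stand-in for the CSV: the first three rows shown above
sample_df = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Adelie"],
    "body_mass_g": [3750.0, 3800.0, 3250.0],
    "sex": ["MALE", "FEMALE", "FEMALE"],
})

# Single-bracket selection returns a Series, not a DataFrame
sample_mass = sample_df["body_mass_g"]

print(type(sample_mass).__name__)                  # Series
print(sample_mass.name)                            # body_mass_g
print(sample_mass.index.equals(sample_df.index))   # True
```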
In [4]:
print("Species: ", species)
Species:  0      Adelie
1      Adelie
2      Adelie
3      Adelie
4      Adelie
        ...
328    Gentoo
329    Gentoo
330    Gentoo
331    Gentoo
332    Gentoo
Name: species, Length: 333, dtype: object
In [5]:
print("Island: ", island)
Island:  0      Torgersen
1      Torgersen
2      Torgersen
3      Torgersen
4      Torgersen
          ...
328       Biscoe
329       Biscoe
330       Biscoe
331       Biscoe
332       Biscoe
Name: island, Length: 333, dtype: object
In [6]:
print("Culmen Length (mm): ", culmen_length_mm)
Culmen Length (mm):  0      39.1
1      39.5
2      40.3
3      36.7
4      39.3
       ...
328    47.2
329    46.8
330    50.4
331    45.2
332    49.9
Name: culmen_length_mm, Length: 333, dtype: float64
In [7]:
print("Culmen Depth (mm): ", culmen_depth_mm)
Culmen Depth (mm):  0      18.7
1      17.4
2      18.0
3      19.3
4      20.6
       ...
328    13.7
329    14.3
330    15.7
331    14.8
332    16.1
Name: culmen_depth_mm, Length: 333, dtype: float64
In [8]:
print("Flipper Length (mm): ", flipper_length_mm)
Flipper Length (mm):  0      181.0
1      186.0
2      195.0
3      193.0
4      190.0
       ...
328    214.0
329    215.0
330    222.0
331    212.0
332    213.0
Name: flipper_length_mm, Length: 333, dtype: float64
In [9]:
print("Body Mass (g): ", body_mass_g)
Body Mass (g):  0      3750.0
1      3800.0
2      3250.0
3      3450.0
4      3650.0
        ...
328    4925.0
329    4850.0
330    5750.0
331    5200.0
332    5400.0
Name: body_mass_g, Length: 333, dtype: float64
In [10]:
print("Gender: ", gender)
Gender:  0        MALE
1      FEMALE
2      FEMALE
3      FEMALE
4        MALE
        ...
328    FEMALE
329    FEMALE
330      MALE
331    FEMALE
332      MALE
Name: sex, Length: 333, dtype: object
Activities
1. Add a constant value (100) to the 'body_mass_g' series
In [12]:
body_mass_g_plus_100 = (body_mass_g + 100)
print(body_mass_g_plus_100)
0      3850.0
1      3900.0
2      3350.0
3      3550.0
4      3750.0
        ...
328    5025.0
329    4950.0
330    5850.0
331    5300.0
332    5500.0
Name: body_mass_g, Length: 333, dtype: float64
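Adding a scalar to a Series applies it to every element and returns a new Series; the original is left untouched and the name is carried over. A minimal sketch, using only the first five body masses from the table above (`sample_mass` is an illustrative name, not part of the notebook):

```python
import pandas as pd

# First five body masses from the table above
sample_mass = pd.Series(
    [3750.0, 3800.0, 3250.0, 3450.0, 3650.0], name="body_mass_g"
)

# The scalar 100 is applied to every element
sample_plus_100 = sample_mass + 100

print(sample_plus_100.tolist())  # [3850.0, 3900.0, 3350.0, 3550.0, 3750.0]
print(sample_plus_100.name)      # body_mass_g (the name is carried over)
print(sample_mass.iloc[0])       # 3750.0 (the original Series is unchanged)
```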
2. Subtract the 'culmen_length_mm' series from the 'flipper_length_mm' series
In [17]:
length_difference = (flipper_length_mm - culmen_length_mm)
print(length_difference)
0      141.9
1      146.5
2      154.7
3      156.3
4      150.7
       ...
328    166.8
329    168.2
330    171.6
331    166.8
332    163.1
Length: 333, dtype: float64
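Two details of Series-to-Series arithmetic are worth noting: values are paired by index label rather than by position, and the result has no name when the operands' names differ (which is why the output above has no `Name:` entry). The sketch below uses a few values from the table, with the second Series deliberately shifted by one label as a hypothetical case to make the alignment visible:

```python
import pandas as pd

# Rows 0-2 of flipper_length_mm and rows 1-3 of culmen_length_mm from the table;
# the mismatched labels are a hypothetical setup to show alignment.
flipper = pd.Series([181.0, 186.0, 195.0], index=[0, 1, 2], name="flipper_length_mm")
culmen = pd.Series([39.5, 40.3, 36.7], index=[1, 2, 3], name="culmen_length_mm")

diff = flipper - culmen

print(diff.index.tolist())   # [0, 1, 2, 3]
print(diff.isna().tolist())  # [True, False, False, True]
print(diff.name)             # None
print(round(diff.loc[1], 1)) # 146.5
```

Labels present in only one operand (0 and 3 here) produce NaN. In the notebook both Series come from the same DataFrame, so every label matches and no NaN appears.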
3. Multiply the 'culmen_depth_mm' series by a constant (2)
In [19]:
double_culmen_depth_mm = culmen_depth_mm * 2
print(double_culmen_depth_mm)
0      37.4
1      34.8
2      36.0
3      38.6
4      41.2
       ...
328    27.4
329    28.6
330    31.4
331    29.6
332    32.2
Name: culmen_depth_mm, Length: 333, dtype: float64
4. Raise the 'flipper_length_mm' series to the power of 2
In [21]:
flipper_length_mm_squared = flipper_length_mm ** 2
print(flipper_length_mm_squared)
0      32761.0
1      34596.0
2      38025.0
3      37249.0
4      36100.0
        ...
328    45796.0
329    46225.0
330    49284.0
331    44944.0
332    45369.0
Name: flipper_length_mm, Length: 333, dtype: float64
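"Vectorized" here means the operator is applied to the whole Series at once instead of in an explicit Python loop. As a rough illustration, the sketch below squares the first five flipper lengths three ways (the `**` operator, the equivalent `Series.pow` method, and a list comprehension) and checks that they agree; the variable names are illustrative only:

```python
import pandas as pd

# First five flipper lengths from the table above
sample_flipper = pd.Series(
    [181.0, 186.0, 195.0, 193.0, 190.0], name="flipper_length_mm"
)

squared_vectorized = sample_flipper ** 2   # operator form
squared_method = sample_flipper.pow(2)     # method form of the same operation
squared_loop = pd.Series(                  # explicit element-by-element loop
    [value ** 2 for value in sample_flipper], name="flipper_length_mm"
)

print(squared_vectorized.tolist())  # [32761.0, 34596.0, 38025.0, 37249.0, 36100.0]
print(squared_vectorized.equals(squared_method))  # True
print(squared_vectorized.equals(squared_loop))    # True
```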
5. Calculate the mean of the 'culmen_length_mm' series and subtract it from each value in the series
In [32]:
culmen_length_mm_mean_centered = culmen_length_mm - culmen_length_mm.mean()
print(culmen_length_mm_mean_centered)
0     -4.892793
1     -4.492793
2     -3.692793
3     -7.292793
4     -4.692793
         ...
328    3.207207
329    2.807207
330    6.407207
331    1.207207
332    5.907207
Name: culmen_length_mm, Length: 333, dtype: float64
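Mean-centering combines two steps: `.mean()` reduces the Series to a single float, and the subtraction then broadcasts that float across every element. A quick sanity check is that the centered values average to (approximately) zero. The sketch below uses only the first five culmen lengths, so its mean (38.98) differs from the full-dataset mean of about 43.99 implied by the output above:

```python
import pandas as pd

# First five culmen lengths from the table above
sample_culmen = pd.Series([39.1, 39.5, 40.3, 36.7, 39.3], name="culmen_length_mm")

sample_mean = sample_culmen.mean()      # a single float
centered = sample_culmen - sample_mean  # broadcast over every element

print(round(sample_mean, 2))            # 38.98
print(abs(centered.mean()) < 1e-9)      # True: centered values average to ~0
```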
6. Concatenate the 'species' and 'gender' series
In [34]:
species_and_gender = species + '-' + gender
print(species_and_gender)
0        Adelie-MALE
1      Adelie-FEMALE
2      Adelie-FEMALE
3      Adelie-FEMALE
4        Adelie-MALE
           ...
328    Gentoo-FEMALE
329    Gentoo-FEMALE
330      Gentoo-MALE
331    Gentoo-FEMALE
332      Gentoo-MALE
Length: 333, dtype: object
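The `+` operator works for string Series as well, but a missing value in either operand makes the whole result missing for that row. The cleaned dataset shown here appears to have no gaps, so the missing entry below is a hypothetical case; `Series.str.cat` with `na_rep` is one way to substitute a placeholder instead:

```python
import pandas as pd

sample_species = pd.Series(["Adelie", "Adelie", "Gentoo"])
sample_sex = pd.Series(["MALE", None, "FEMALE"])  # hypothetical missing value

# Operator form: a missing operand yields a missing result
plus_result = sample_species + "-" + sample_sex

# str.cat form: na_rep replaces missing values before joining
cat_result = sample_species.str.cat(sample_sex, sep="-", na_rep="UNKNOWN")

print(plus_result.isna().tolist())  # [False, True, False]
print(cat_result.tolist())          # ['Adelie-MALE', 'Adelie-UNKNOWN', 'Gentoo-FEMALE']
```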
7. Perform element-wise addition of the 'culmen_length_mm' and 'culmen_depth_mm' series
In [37]:
culmen_length_plus_depth_mm = culmen_length_mm + culmen_depth_mm
print(culmen_length_plus_depth_mm)
0      57.8
1      56.9
2      58.3
3      56.0
4      59.9
       ...
328    60.9
329    61.1
330    66.1
331    60.0
332    66.0
Length: 333, dtype: float64
8. Sort culmen_length_mm in descending order
In [41]:
culmen_length_mm_sorted = culmen_length_mm.sort_values(ascending=False)
print(culmen_length_mm_sorted)
246    59.6
163    58.0
313    55.9
209    55.8
326    55.1
       ...
13     34.4
86     34.0
64     33.5
92     33.1
136    32.1
Name: culmen_length_mm, Length: 333, dtype: float64
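As the output above shows, `sort_values` reorders the rows but keeps each value's original index label (246, 163, ...), and it returns a new Series rather than sorting in place. If positional labels are wanted afterwards, `reset_index(drop=True)` is one option. A small sketch on the first five culmen lengths:

```python
import pandas as pd

# First five culmen lengths from the table above
sample_culmen = pd.Series([39.1, 39.5, 40.3, 36.7, 39.3], name="culmen_length_mm")

sorted_desc = sample_culmen.sort_values(ascending=False)
print(sorted_desc.tolist())        # [40.3, 39.5, 39.3, 39.1, 36.7]
print(sorted_desc.index.tolist())  # [2, 1, 4, 0, 3] (original labels travel with values)

# Optional: discard the old labels and renumber from 0
renumbered = sorted_desc.reset_index(drop=True)
print(renumbered.index.tolist())   # [0, 1, 2, 3, 4]

print(sample_culmen.iloc[0])       # 39.1 (the original Series is not modified)
```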
9. Divide flipper_length_mm by culmen_length_mm
In [43]:
length_ratio = flipper_length_mm / culmen_length_mm
print(length_ratio)
0      4.629156
1      4.708861
2      4.838710
3      5.258856
4      4.834606
         ...
328    4.533898
329    4.594017
330    4.404762
331    4.690265
332    4.268537
Length: 333, dtype: float64
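Element-wise division of float Series does not raise on a zero denominator; pandas returns `inf` for that row instead. No culmen length in the output above is zero, so the zero below is a hypothetical value added only to show the behavior and one way to turn such results into NaN:

```python
import numpy as np
import pandas as pd

sample_flipper = pd.Series([181.0, 186.0, 195.0])
sample_culmen = pd.Series([39.1, 39.5, 0.0])  # last value is a hypothetical zero

ratio = sample_flipper / sample_culmen
print(ratio.round(6).tolist())     # [4.629156, 4.708861, inf]
print(np.isinf(ratio).tolist())    # [False, False, True]

# One option: treat infinite ratios as missing
ratio_clean = ratio.replace([np.inf, -np.inf], np.nan)
print(ratio_clean.isna().tolist()) # [False, False, True]
```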