In [1]:

import pandas as pd

In [13]:

df = pd.read_csv('words.csv',index_col="Word")

In [14]:

df.head()

Out[14]:

	Char Count	Value
Word
aa	2	2
aah	3	10
aahed	5	19
aahing	6	40
aahs	4	29

Activities

How many elements does this dataframe have?

In [7]:

df.shape

Out[7]:

(172821, 3)

In [8]:

df.info

Out[8]:

<bound method DataFrame.info of              Word  Char Count  Value
0              aa           2      2
1             aah           3     10
2           aahed           5     19
3          aahing           6     40
4            aahs           4     29
...           ...         ...    ...
172816    zymotic           7    111
172817  zymurgies           9    143
172818    zymurgy           7    135
172819    zyzzyva           7    151
172820   zyzzyvas           8    170

[172821 rows x 3 columns]>
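A short sketch of the different ways to count what a DataFrame holds, using a five-row stand-in built from the rows shown above (the name `sample` is illustrative, not part of the notebook). It also shows why `df.info` without parentheses prints a "bound method" line: the attribute is only a reference to the method until it is called.

```python
import pandas as pd

# Hypothetical stand-in for words.csv, using the first rows shown above.
sample = pd.DataFrame(
    {
        "Word": ["aa", "aah", "aahed", "aahing", "aahs"],
        "Char Count": [2, 3, 5, 6, 4],
        "Value": [2, 10, 19, 40, 29],
    }
)

n_rows, n_cols = sample.shape  # (rows, columns)
n_cells = sample.size          # rows * columns
n_rows_alt = len(sample)       # row count only

# `sample.info` is a reference to the method; `sample.info()` calls it
# and prints the summary (index range, dtypes, non-null counts).
is_method = callable(sample.info)
sample.info()
```

Whether "elements" means rows or individual cells depends on the question; `shape` answers the first and `size` the second.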

In [ ]:

What is the value of the word `microspectrophotometries`?

In [17]:

df.loc["microspectrophotometries"]

Out[17]:

Char Count     24
Value         317
Name: microspectrophotometries, dtype: int64
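When only one number is needed, `.loc` also accepts a row label and a column label together. The sketch below assumes a small frame indexed by `Word` (the name `sample` and its three rows are illustrative, taken from outputs shown in this notebook).

```python
import pandas as pd

# Hypothetical sample indexed by word, mirroring index_col="Word".
sample = pd.DataFrame(
    {"Char Count": [2, 3, 24], "Value": [2, 10, 317]},
    index=pd.Index(["aa", "aah", "microspectrophotometries"], name="Word"),
)

# One label returns the whole row as a Series.
row = sample.loc["microspectrophotometries"]

# Row label plus column label returns a single scalar.
value = sample.loc["microspectrophotometries", "Value"]
print(value)
```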

In [ ]:

What is the highest possible value of a word?

In [18]:

df.max()

Out[18]:

Char Count     28
Value         319
dtype: int64

In [19]:

df.describe()

Out[19]:

	Char Count	Value
count	172821.000000	172821.000000
mean	9.087628	107.754179
std	2.818285	39.317452
min	2.000000	2.000000
25%	7.000000	80.000000
50%	9.000000	103.000000
75%	11.000000	131.000000
max	28.000000	319.000000
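Note that `df.max()` reports each column's maximum independently, so the two numbers need not come from the same word. A minimal sketch on an illustrative four-row sample (its maxima differ from the full dataset's 28 and 319):

```python
import pandas as pd

# Hypothetical sample; rows are taken from outputs shown in this notebook.
sample = pd.DataFrame(
    {"Char Count": [2, 7, 23, 24], "Value": [2, 87, 319, 317]},
    index=pd.Index(
        ["aa", "glowing", "reinstitutionalizations", "microspectrophotometries"],
        name="Word",
    ),
)

col_max = sample.max()                       # one maximum per column
highest_value = sample["Value"].max()        # largest Value only
longest_length = sample["Char Count"].max()  # largest Char Count only

# Here the longest word (24 letters) is not the highest-valued one (319).
print(col_max)
```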

In [ ]:

Which of the following words have a Char Count of `7` and a Value of `87`?

In [26]:

df.loc[[

"microbrew",


"pinfish",


"enfold",


"superheterodyne"
,

"glowing"
]]

Out[26]:

	Char Count	Value
Word
microbrew	9	106
pinfish	7	81
enfold	6	56
superheterodyne	15	198
glowing	7	87
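Instead of reading the answer off the table, the two conditions can be applied as a boolean mask. This is a sketch on the five candidate rows copied from the output above; the name `candidates` is illustrative.

```python
import pandas as pd

# Rows copied from the lookup output above.
candidates = pd.DataFrame(
    {"Char Count": [9, 7, 6, 15, 7], "Value": [106, 81, 56, 198, 87]},
    index=pd.Index(
        ["microbrew", "pinfish", "enfold", "superheterodyne", "glowing"],
        name="Word",
    ),
)

# Each comparison yields a boolean Series; `&` combines them row by row.
mask = (candidates["Char Count"] == 7) & (candidates["Value"] == 87)
matches = candidates.loc[mask].index.tolist()
print(matches)
```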

In [ ]:

What is the highest possible length of a word?

In [36]:

df.loc[df['Value']==319]

Out[36]:

	Char Count	Value
Word
reinstitutionalizations	23	319

In [ ]:

In [40]:

# Compare the column (not a bare list) to build the boolean mask.
dt = df.loc[df['Value'] == 274]
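A common slip in this kind of filter is writing `['Value'] == 274`, which compares a plain Python list to an integer and evaluates to `False` before pandas ever sees it; `.loc[False]` then fails with a boolean-label `KeyError`. The sketch below, on an illustrative four-row sample, contrasts that with the column comparison.

```python
import pandas as pd

# Hypothetical sample; the three 274 rows appear later in this notebook.
sample = pd.DataFrame(
    {"Char Count": [20, 21, 21, 7], "Value": [274, 274, 274, 87]},
    index=pd.Index(
        [
            "overprotectivenesses",
            "countercountermeasure",
            "psychophysiologically",
            "glowing",
        ],
        name="Word",
    ),
)

wrong_key = (["Value"] == 274)  # list vs int: a single Python bool
mask = sample["Value"] == 274   # Series vs int: one bool per row

dt = sample.loc[mask]
print(wrong_key, len(dt))
```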

In [ ]:

What is the word with the value of `319`?

In [39]:

df.loc[(df['Value']==319) & (df['Char Count']==23)]

Out[39]:

	Char Count	Value
Word
reinstitutionalizations	23	319
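In Python, `&` binds more tightly than `==`, so two comparisons joined by `&` need their own parentheses. Without them the expression is parsed as a chained comparison around `319 & df['Char Count']`, which asks for the truth value of a Series. A small sketch on an illustrative sample:

```python
import pandas as pd

# Hypothetical sample; rows are taken from outputs shown in this notebook.
sample = pd.DataFrame(
    {"Char Count": [23, 24, 7], "Value": [319, 317, 87]},
    index=pd.Index(
        ["reinstitutionalizations", "microspectrophotometries", "glowing"],
        name="Word",
    ),
)

# Unparenthesised: parsed as a == (319 & b) == 23, which raises ValueError.
try:
    sample.loc[sample["Value"] == 319 & sample["Char Count"] == 23]
    raised = False
except ValueError:
    raised = True

# Parenthesised: each comparison is evaluated first, then combined.
result = sample.loc[(sample["Value"] == 319) & (sample["Char Count"] == 23)]
print(result)
```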

In [ ]:

What is the most common value?

In [46]:

df.loc[df["Value"]==274].sort_values(by="Char Count")

Out[46]:

	Char Count	Value
Word
overprotectivenesses	20	274
countercountermeasure	21	274
psychophysiologically	21	274
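For the "most common value" question, one possible approach is `value_counts()`, which tallies how often each value occurs. The sample below is illustrative only, so its result says nothing about the most common value in the full `words.csv`.

```python
import pandas as pd

# Hypothetical sample; rows are taken from outputs shown in this notebook.
sample = pd.DataFrame(
    {"Value": [274, 274, 274, 260, 260, 319]},
    index=pd.Index(
        [
            "overprotectivenesses",
            "countercountermeasure",
            "psychophysiologically",
            "hydroxytryptamine",
            "neuropsychologists",
            "reinstitutionalizations",
        ],
        name="Word",
    ),
)

counts = sample["Value"].value_counts()  # occurrences per distinct value
most_common = counts.idxmax()            # the value with the highest count
how_often = counts.max()
print(most_common, how_often)
```

`sample["Value"].mode()` is an alternative that returns every value tied for first place.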

In [ ]:

What is the shortest word with value `274`?

In [ ]:

Create a column `Ratio` which represents the 'Value Ratio' of a word

In [ ]:

What is the maximum value of `Ratio`?

In [47]:

df.describe()

Out[47]:

	Char Count	Value
count	172821.000000	172821.000000
mean	9.087628	107.754179
std	2.818285	39.317452
min	2.000000	2.000000
25%	7.000000	80.000000
50%	9.000000	103.000000
75%	11.000000	131.000000
max	28.000000	319.000000

In [48]:

df["Ratio"]=df["Value"]/df["Char Count"]

In [49]:

df

Out[49]:

	Char Count	Value	Ratio
Word
aa	2	2	1.000000
aah	3	10	3.333333
aahed	5	19	3.800000
aahing	6	40	6.666667
aahs	4	29	7.250000
...	...	...	...
zymotic	7	111	15.857143
zymurgies	9	143	15.888889
zymurgy	7	135	19.285714
zyzzyva	7	151	21.571429
zyzzyvas	8	170	21.250000

172821 rows × 3 columns

In [ ]:

What word is the one with the highest `Ratio`?

In [52]:

df["Ratio"].max()

Out[52]:

22.5
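`max()` gives the ratio itself; to get the word that holds it, `idxmax()` returns the index label of the largest entry. A sketch on an illustrative five-row sample (its top word is not the answer for the full dataset, whose maximum is 22.5):

```python
import pandas as pd

# Hypothetical sample; rows are taken from outputs shown in this notebook.
sample = pd.DataFrame(
    {"Char Count": [2, 4, 7, 7, 8], "Value": [2, 29, 135, 151, 170]},
    index=pd.Index(["aa", "aahs", "zymurgy", "zyzzyva", "zyzzyvas"], name="Word"),
)

# Element-wise division, aligned on the index.
sample["Ratio"] = sample["Value"] / sample["Char Count"]

best_ratio = sample["Ratio"].max()    # the largest ratio
best_word = sample["Ratio"].idxmax()  # the label (word) where it occurs
print(best_word, best_ratio)
```

If several words tie for the maximum, `idxmax()` returns only the first; filtering with `sample["Ratio"] == best_ratio` would list all of them.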

In [57]:

df.loc[df["Ratio"]==10]

Out[57]:

	Char Count	Value	Ratio
Word
aardwolf	8	80	10.0
abatements	10	100	10.0
abducts	7	70	10.0
abetment	8	80	10.0
abettals	8	80	10.0
...	...	...	...
ycleped	7	70	10.0
yodeled	7	70	10.0
zamia	5	50	10.0
zebecs	6	60	10.0
zwieback	8	80	10.0

2604 rows × 3 columns

In [63]:

df.loc[df["Ratio"]==10].sort_values(by="Value",ascending=False)

Out[63]:

	Char Count	Value	Ratio
Word
electrocardiographically	24	240	10.0
electroencephalographies	24	240	10.0
electroencephalographer	23	230	10.0
phonocardiographic	18	180	10.0
inconceivabilities	18	180	10.0
...	...	...	...
web	3	30	10.0
bug	3	30	10.0
elm	3	30	10.0
as	2	20	10.0
oe	2	20	10.0

2604 rows × 3 columns
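The row count and the top `Value` can also be computed directly rather than read from the display. Since `Ratio` is a float, the sketch below also shows an integer comparison that sidesteps float equality; it is an illustrative alternative, not the notebook's method, and the six-row sample is hypothetical.

```python
import pandas as pd

# Hypothetical sample; rows are taken from outputs shown in this notebook.
sample = pd.DataFrame(
    {"Char Count": [24, 8, 5, 3, 2, 4], "Value": [240, 80, 50, 30, 20, 29]},
    index=pd.Index(
        ["electrocardiographically", "aardwolf", "zamia", "web", "as", "aahs"],
        name="Word",
    ),
)
sample["Ratio"] = sample["Value"] / sample["Char Count"]

float_mask = sample["Ratio"] == 10                     # float comparison
int_mask = sample["Value"] == 10 * sample["Char Count"]  # integer-only form

count = int(int_mask.sum())                     # True counts as 1
max_value = sample.loc[int_mask, "Value"].max()  # largest Value in the subset
print(count, max_value)
```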

In [64]:

df.loc[df["Value"]==260].sort_values(by="Char Count")

Out[64]:

	Char Count	Value	Ratio
Word
hydroxytryptamine	17	260	15.294118
neuropsychologists	18	260	14.444444
psychophysiologist	18	260	14.444444
revolutionarinesses	19	260	13.684211
countermobilizations	20	260	13.000000
underrepresentations	20	260	13.000000
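Sorting works for reading off the shortest word, but `min()` and `idxmin()` on the filtered column return the answer directly. A sketch using the six rows shown above plus one extra illustrative row to show the filter at work:

```python
import pandas as pd

# Rows with Value 260 copied from the output above, plus one other row.
sample = pd.DataFrame(
    {
        "Char Count": [17, 18, 18, 19, 20, 20, 5],
        "Value": [260, 260, 260, 260, 260, 260, 50],
    },
    index=pd.Index(
        [
            "hydroxytryptamine",
            "neuropsychologists",
            "psychophysiologist",
            "revolutionarinesses",
            "countermobilizations",
            "underrepresentations",
            "zamia",
        ],
        name="Word",
    ),
)

# Keep only the Char Count column for rows whose Value is 260.
lengths = sample.loc[sample["Value"] == 260, "Char Count"]

lowest = lengths.min()            # smallest Char Count in the subset
shortest_word = lengths.idxmin()  # the word that has it
print(lowest, shortest_word)
```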

In [65]:

df["Char Count"].describe()

Out[65]:

count    172821.000000
mean          9.087628
std           2.818285
min           2.000000
25%           7.000000
50%           9.000000
75%          11.000000
max          28.000000
Name: Char Count, dtype: float64

In [ ]:

How many words have a `Ratio` of `10`?

In [ ]:

What is the maximum `Value` of all the words with a `Ratio` of `10`?

In [ ]:

Of those words with a `Value` of `260`, what is the lowest `Char Count` found?

In [ ]:

Based on the previous task, what word is it?

In [ ]:

Statement of Completion#8a17a928

Intro to Pandas for Data Analysis

DataFrames practice: working with English Words
