In [1]:

import pandas as pd

In [2]:

df = pd.read_csv('words.csv', index_col='Word')

In [3]:

df.head()

Out[3]:

	Char Count	Value
Word
aa	2	2
aah	3	10
aahed	5	19
aahing	6	40
aahs	4	29

Activities¶

How many elements does this dataframe have?¶

In [5]:

df.shape

Out[5]:

(172821, 2)

In [6]:

df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 172821 entries, aa to zyzzyvas
Data columns (total 2 columns):
 #   Column      Non-Null Count   Dtype
---  ------      --------------   -----
 0   Char Count  172821 non-null  int64
 1   Value       172821 non-null  int64
dtypes: int64(2)
memory usage: 4.0+ MB

In [7]:

df.count()

Out[7]:

Char Count    172821
Value         172821
dtype: int64

In [ ]:

What is the value of the word `microspectrophotometries`?¶

In [12]:

df[df.index=='microspectrophotometries']

Out[12]:

	Char Count	Value
Word
microspectrophotometries	24	317

In [14]:

df.loc['microspectrophotometries', 'Char Count']

Out[14]:

In [16]:

df.index.max()

Out[16]:

'zyzzyvas'

In [17]:

df['Value'].max()

Out[17]:

In [18]:

df.max()

Out[18]:

Char Count     28
Value         319
dtype: int64

In [19]:

df.describe()

Out[19]:

	Char Count	Value
count	172821.000000	172821.000000
mean	9.087628	107.754179
std	2.818285	39.317452
min	2.000000	2.000000
25%	7.000000	80.000000
50%	9.000000	103.000000
75%	11.000000	131.000000
max	28.000000	319.000000

In [ ]:

In [21]:

df.loc[['enfold',
'pinfish',
'superheterodyne',
'microbrew',
'glowing']]

Out[21]:

	Char Count	Value
Word
enfold	6	56
pinfish	7	81
superheterodyne	15	198
microbrew	9	106
glowing	7	87

What is the highest possible value of a word?¶

In [50]:

len(max(df.index, key=len))

Out[50]:

Which of the following words have a Char Count of `15`?¶

In [20]:

df[df['Char Count']==15]

Out[20]:

	Char Count	Value
Word
absorbabilities	15	143
abstractionisms	15	182
abstractionists	15	189
acanthocephalan	15	122
acceptabilities	15	134
...	...	...
worthlessnesses	15	220
wrongheadedness	15	161
xerographically	15	174
xeroradiography	15	184
zoogeographical	15	158

3192 rows × 2 columns

What is the highest possible length of a word?¶

In [22]:

df.index.sort_values

Out[22]:

<bound method Index.sort_values of Index(['aa', 'aah', 'aahed', 'aahing', 'aahs', 'aal', 'aalii', 'aaliis',
       'aals', 'aardvark',
       ...
       'zymology', 'zymosan', 'zymosans', 'zymoses', 'zymosis', 'zymotic',
       'zymurgies', 'zymurgy', 'zyzzyva', 'zyzzyvas'],
      dtype='object', name='Word', length=172821)>

What is the word with the value of `319`?¶

In [25]:

df[df['Value']==319]

Out[25]:

	Char Count	Value
Word
reinstitutionalizations	23	319

What is the most common value?¶

In [30]:

df['Value'].groupby(df['Value']).count().sort_values()

Out[30]:

Value
319       1
317       1
309       1
291       1
2         1
       ... 
92     1902
99     1907
95     1915
100    1921
93     1965
Name: Value, Length: 303, dtype: int64

What is the shortest word with value `274`?¶

In [ ]:

df.mode()

In [41]:

df[df['Value']==274].sort_values(by='Char Count')

Out[41]:

	Char Count	Value
Word
overprotectivenesses	20	274
countercountermeasure	21	274
psychophysiologically	21	274

In [40]:

df['Value'].value_counts()

Out[40]:

Value
93     1965
100    1921
95     1915
99     1907
92     1902
       ... 
317       1
304       1
300       1
319       1
278       1
Name: count, Length: 303, dtype: int64

Create a column `Ratio` which represents the 'Value Ratio' of a word¶

In [ ]:

df['Ratio']=df['Value']/df['Char Count']

In [55]:

df.sort_values(by='Ratio')

Out[55]:

	Char Count	Value	Ratio
Word
aa	2	2	1.000000
aba	3	4	1.333333
baa	3	4	1.333333
ab	2	3	1.500000
abba	4	6	1.500000
...	...	...	...
pyx	3	65	21.666667
xyst	4	88	22.000000
wry	3	66	22.000000
muzzy	5	111	22.200000
xu	2	45	22.500000

172821 rows × 3 columns

What is the maximum value of `Ratio`?¶

In [ ]:

What word is the one with the highest `Ratio`?¶

In [61]:

df[(df['Char Count']==17)&(df['Value']==260)]

Out[61]:

	Char Count	Value	Ratio
Word
hydroxytryptamine	17	260	15.294118

How many words have a `Ratio` of `10`?¶

In [ ]:

What is the maximum `Value` of all the words with a `Ratio` of `10`?¶

In [ ]:

Of those words with a `Value` of `260`, what is the lowest `Char Count` found?¶

In [43]:

mean_char = df['Char Count'].mean()
mean_char

Out[43]:

9.087628239623657

Based on the previous task, what word is it?¶

In [48]:

df.query("`Char Count`>@mean_char")

Out[48]:

	Char Count	Value
Word
aardwolves	10	120
abacterial	10	72
abandoners	10	93
abandoning	10	81
abandonment	11	103
...	...	...
zygomorphies	12	176
zygomorphy	10	168
zygosities	10	154
zygospores	10	165
zymologies	10	146

67582 rows × 2 columns

Statement of Completion#b5a037c1

Intro to Pandas for Data Analysis

DataFrames practice: working with English Words

Activities¶

How many elements does this dataframe have?¶

What is the value of the word `microspectrophotometries`?¶

What is the highest possible value of a word?¶

Which of the following words have a Char Count of `15`?¶

What is the highest possible length of a word?¶

What is the word with the value of `319`?¶

What is the most common value?¶

What is the shortest word with value `274`?¶

Create a column `Ratio` which represents the 'Value Ratio' of a word¶

What is the maximum value of `Ratio`?¶

What word is the one with the highest `Ratio`?¶

How many words have a `Ratio` of `10`?¶

What is the maximum `Value` of all the words with a `Ratio` of `10`?¶

Of those words with a `Value` of `260`, what is the lowest `Char Count` found?¶

Based on the previous task, what word is it?¶

Statement of Completion#b5a037c1

Intro to Pandas for Data Analysis

DataFrames practice: working with English Words

Activities¶

How many elements does this dataframe have?¶

What is the value of the word microspectrophotometries?¶

What is the highest possible value of a word?¶

Which of the following words have a Char Count of 15?¶

What is the highest possible length of a word?¶

What is the word with the value of 319?¶

What is the most common value?¶

What is the shortest word with value 274?¶

Create a column Ratio which represents the 'Value Ratio' of a word¶

What is the maximum value of Ratio?¶

What word is the one with the highest Ratio?¶

How many words have a Ratio of 10?¶

What is the maximum Value of all the words with a Ratio of 10?¶

Of those words with a Value of 260, what is the lowest Char Count found?¶

Based on the previous task, what word is it?¶

What is the value of the word `microspectrophotometries`?¶

Which of the following words have a Char Count of `15`?¶

What is the word with the value of `319`?¶

What is the shortest word with value `274`?¶

Create a column `Ratio` which represents the 'Value Ratio' of a word¶

What is the maximum value of `Ratio`?¶

What word is the one with the highest `Ratio`?¶

How many words have a `Ratio` of `10`?¶

What is the maximum `Value` of all the words with a `Ratio` of `10`?¶

Of those words with a `Value` of `260`, what is the lowest `Char Count` found?¶