5 Minimalist Tips for Data Scientists to…

Feb 6, 2022


2 Comments

Feb 7, 2022

Hi Laszlo, for Point 5, why not use df['column'].value_counts(dropna=False).head(n=100)? Wouldn't it give you the same thing, along with the distribution?
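For readers following along, here is a minimal sketch of what that call produces. The toy dataframe and its values are hypothetical and only stand in for real data; the column name mirrors the one in the comment.

```python
import pandas as pd

# Hypothetical toy data, including one missing value.
df = pd.DataFrame({"column": ["a", "b", "a", None, "a", "b"]})

# Count every distinct value (missing values included) and keep the top 100.
counts = df["column"].value_counts(dropna=False).head(n=100)

# value_counts() gives a pd.Series indexed by the distinct values and
# sorted by count in descending order; head() keeps it a Series.
result_type = type(counts).__name__
top_value = counts.index[0]
top_count = int(counts.iloc[0])

print(result_type, top_value, top_count)
```

Because dropna=False is set, the missing value gets its own row in the result instead of being silently dropped.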



Laszlo Sragner

Feb 14, 2022

Yeah, it's probably the same, though without going to StackOverflow or the documentation, or trying it out on a toy dataframe, I don't know whether your example returns a pd.Series, a list, or a numpy array.

Counter always gives you a Counter object (a dict subclass), and most_common() returns a list of (element, count) tuples. This holds regardless of the input type; the only thing it needs is an iterable of hashable items.
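A minimal sketch of the Counter approach, using only the standard library. The sample values are hypothetical; in practice the iterable could be a dataframe column, a list, or a generator.

```python
from collections import Counter

# Hypothetical sample values; any iterable of hashable items works.
values = ["a", "b", "a", None, "a", "b"]

counter = Counter(values)

# most_common(n) returns a list of (element, count) tuples,
# ordered from the most frequent element to the least frequent.
top = counter.most_common(2)

# The same result comes back when the input is a generator instead of a list.
same = Counter(v for v in values).most_common(2)

print(top)
```

The return type does not depend on where the data came from, which is the point made above: the result is always a plain list of tuples that can be printed or sliced without checking any documentation.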


Deliberate Machine Learning
