2 Comments

Hi Laszlo, for Point 5, why not use df['column'].value_counts(dropna=False).head(n=100)? Wouldn't it give you the same thing, along with the distribution?

author

Yeah, it's probably the same. But without checking StackOverflow or the documentation, or trying it out on a toy dataframe, I don't know whether your example returns a pd.Series, a list, or a numpy array.

Counter always returns a Counter object (a dict subclass) and most_common() a list of (element, count) tuples, regardless of the input type; the only thing it needs is an iterable of hashable items.
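
A minimal sketch of that point, in case it helps. The sample values are made up for illustration and are not from the post; only the standard library is used.

```python
from collections import Counter

# Hypothetical sample data, including a missing value (None).
values = ["a", "b", "a", None, "a", "b"]

# Counter accepts any iterable: a list, a tuple, or a generator.
from_list = Counter(values)
from_tuple = Counter(tuple(values))
from_generator = Counter(v for v in values)

# The result is always a Counter, which is a dict subclass.
print(type(from_list).__name__, isinstance(from_list, dict))  # Counter True

# most_common(n) returns a list of (element, count) tuples, highest count first.
top = from_list.most_common(2)
print(top)  # [('a', 3), ('b', 2)]
```

Note that None is counted like any other value here, so there is no separate dropna-style switch to remember.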
