Deliberate Machine Learning

Search for code in all your notebooks with one simple command

2023-01-09

Laszlo Sragner

Happy New Year! I hope everyone had a great holiday and is ready to start 2023.

As always, the Code Quality for Data Science (CQ4DS) Discord is open to anyone who wants to improve their coding skills. Invite link:

https://discord.com/invite/8uUZNMCad2


Searching in notebooks

When I need an infrequently used package, I often remember a specific notebook where I used it some time ago, so I have to search through a vast set of notebooks. I had various grep-based tools for this, but they are less than ideal because they also match cell outputs, which are not relevant.

I was working on some complex JSON files that were easy to manipulate as Python structures. Then it dawned on me that notebooks are just JSON files themselves. How difficult would it be to treat them the same way? As it turns out, not very. The gist of the entire package is:

for cell in json.loads(open(notebook).read())['cells']:

After this, the rest was a simple regex on cell['source'].

I wrapped this in a function that returns the matches as a pandas DataFrame.
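To make the idea concrete, here is a minimal sketch of such a search. It is not the actual nb_query implementation: the function name search_notebooks, its parameters, and the row keys are all hypothetical, and it returns a plain list of dicts so that it runs with the standard library only.

```python
import json
import re
from pathlib import Path


def search_notebooks(pattern, root="."):
    """Return one row per code cell whose source matches `pattern`.

    Hypothetical helper for illustration; not the nb_query API.
    """
    regex = re.compile(pattern)
    rows = []
    for path in sorted(Path(root).rglob("*.ipynb")):
        # Skip Jupyter's autosave copies to avoid duplicate hits.
        if ".ipynb_checkpoints" in path.parts:
            continue
        try:
            notebook = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed file
        if not isinstance(notebook, dict):
            continue
        for index, cell in enumerate(notebook.get("cells", [])):
            # Only look at code cells; outputs are never inspected.
            if cell.get("cell_type") != "code":
                continue
            source = cell.get("source", "")
            # The notebook format stores source as a string or a list of lines.
            if isinstance(source, list):
                source = "".join(source)
            if regex.search(source):
                rows.append(
                    {"notebook": str(path), "cell": index, "source": source}
                )
    return rows
```

Because only cell['source'] of code cells is matched, text that appears solely in cell outputs or markdown is ignored, which is the point of skipping grep. If pandas is installed, passing the result to pd.DataFrame(...) gives a table that is easy to browse in a notebook.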

Hypermodern Python

I kept copying the above function into every notebook server I used, which is clearly not a sustainable solution. I thought I should convert it into a package, but the last time I tried to do this, it was a pain. And if I expose it on PyPI, I should do it “properly”.

At the same time, I bumped into the “Hypermodern Python” article series (six parts!), an epic walk-through of the recent tooling for a modern Python environment. This is one of those topics that you know you should tackle, but you don’t because of, you know, “reasons”.

On a December Saturday, I bit the bullet and decided to go through the entire series and live-chat it on CQ4DS: [link]

I will write more posts on this topic, so subscribe if you want to learn more about tools like Poetry, Black, GitHub Actions, nox, pre-commit hooks and many more.

nb_query quick start

!pip install nb-query
from nb_query import nb_query
nb_query('import numpy as np')

Links

  • Package on PyPI: https://pypi.org/project/nb-query/

  • Repository on GitHub: https://github.com/xLaszlo/nb-query

  • Docs on RTD: https://nb-query.readthedocs.io/en/latest/

  • Hypermodern Python: https://medium.com/@cjolowicz/hypermodern-python-d44485d9d769

The project is a work in progress. If you have any feedback, join the CQ4DS Discord and share it in the #hypermodern-python channel:

https://discord.com/invite/8uUZNMCad2
