Even with a small dataset in Pandas, using “apply” for intensive computation, such as creating embeddings for each sample, can take a long time depending on the hardware available to you. tqdm supports Pandas: calling “tqdm.pandas()” registers a “progress_apply” method on Pandas objects that wraps the standard “apply”. It displays a progress bar during the apply operation, so you can see how far a long-running task has progressed and work on other parts of your notebook while it completes.
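import clip
import torch
from PIL import Image
from tqdm.notebook import tqdm

# Assumption: use the GPU when available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Register progress_apply on Pandas objects.
tqdm.pandas()

def generate_embeddings(imgs: list[str]):
    embeddings = []
    error_files = []
    # Iterate over a copy so failed files can be removed from imgs safely.
    for img in imgs.copy():
        try:
            embeddings.append(preprocess(Image.open(img)))
        except Exception as e:
            imgs.remove(img)
            print(e)
    ...
    ...

# df is the DataFrame of image paths defined elsewhere in the notebook.
df.progress_apply(generate_embeddings)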
tqdm Pandas progress implementation
tqdm Pandas progress bar
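To try the same pattern without CLIP or image files, here is a minimal sketch. It assumes only that pandas and tqdm are installed. The function “slow_square” and the column names are hypothetical stand-ins for an expensive per-row computation, not part of the example above. It imports the plain “tqdm” class so it also runs outside a notebook.

```python
import pandas as pd
from tqdm import tqdm

# Register progress_apply on pandas objects; desc sets the label shown on the bar.
tqdm.pandas(desc="Processing rows")


def slow_square(x: int) -> int:
    """Hypothetical stand-in for an expensive per-row computation."""
    return x * x


# Hypothetical toy data: one integer column with 1000 rows.
df = pd.DataFrame({"value": range(1000)})

# Same call shape as Series.apply, but a progress bar is drawn while it runs.
df["squared"] = df["value"].progress_apply(slow_square)
```

In a notebook you could swap the import for “from tqdm.notebook import tqdm” to get the widget-style bar. The rest of the code stays the same.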