10 Pandas Tricks That Will Save You Hours
May 14, 2026 · 4 min read · by StartD Editorial
Pandas is so flexible that most people settle into a small corner of it and use that corner badly. These ten patterns will clean up almost any notebook.
1. Method chaining beats temporary variables
# Instead of:
df2 = df[df.country == "BR"]
df2["revenue_k"] = df2.revenue / 1000
result = df2.groupby("city")["revenue_k"].sum()
# Do:
result = (
    df.query("country == 'BR'")
    .assign(revenue_k=lambda d: d.revenue / 1000)
    .groupby("city")["revenue_k"]
    .sum()
)
2. query() is faster to read than boolean masks
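To make the readability gap concrete, here is a rough sketch that writes one filter both ways on a small made-up frame and checks that they pick the same rows. The `people` frame and its values are assumptions for illustration, chosen to match the one-liner that follows. It also shows the `@` prefix, which lets a query string refer to a local Python variable.

```python
import pandas as pd

# Hypothetical sample data; columns chosen to match the query one-liner.
people = pd.DataFrame(
    {
        "age": [25, 41, 37, 52],
        "country": ["BR", "PT", "US", "BR"],
    }
)

# Boolean-mask version: every condition repeats the frame name and
# needs its own parentheses.
masked = people[(people["age"] > 30) & (people["country"].isin(["BR", "PT"]))]

# query() version: the same filter as a single readable expression.
queried = people.query("age > 30 and country in ['BR', 'PT']")

# Local variables are referenced with the @ prefix.
min_age = 30
markets = ["BR", "PT"]
parameterized = people.query("age > @min_age and country in @markets")

print(queried)
```

All three frames hold the same two rows. The mask version is still the safer choice when column names contain spaces or clash with Python keywords.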
df.query("age > 30 and country in ['BR', 'PT']")
3. assign() for new columns inside a chain
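Two properties make assign() chain-friendly: it returns a new frame instead of modifying the original, and a lambda in a later keyword can use a column created by an earlier keyword in the same call. The sketch below demonstrates both on a made-up `order_lines` frame (the name, columns, and values are assumptions for illustration).

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch.
order_lines = pd.DataFrame({"quantity": [2, 3], "unit_price": [10.0, 4.0]})

with_totals = order_lines.assign(
    # Each lambda receives the frame as it exists at that point in the call.
    total=lambda d: d.quantity * d.unit_price,
    # total already exists here because keywords are evaluated in order.
    share=lambda d: d.total / d.total.sum(),
)

print(with_totals)
print(list(order_lines.columns))  # the original frame is unchanged
```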
df.assign(
    margin=lambda d: d.revenue - d.cost,
    margin_pct=lambda d: d.margin / d.revenue,
)
4. pipe() for custom steps
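pipe() passes the frame as the first argument to any function, so `df.pipe(f, x)` gives the same result as `f(df, x)` but reads top to bottom inside a chain. The sketch below shows that equivalence with a hypothetical helper, `add_share`, on made-up data; neither is from a library.

```python
import pandas as pd


def add_share(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Hypothetical helper: add each row's share of the column total."""
    return df.assign(**{f"{col}_share": df[col] / df[col].sum()})


# Hypothetical sample data, invented for this sketch.
cities = pd.DataFrame({"city": ["Recife", "Porto"], "revenue": [30.0, 10.0]})

# These two calls produce the same frame; pipe() keeps the chain readable.
direct = add_share(cities, "revenue")
piped = cities.pipe(add_share, "revenue")

print(piped)
```

Any function that takes a frame and returns a frame can be slotted into a chain this way, which is what the winsorize example does.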
def winsorize(df, col, q=0.01):
    # Clip col to its q and (1 - q) quantiles to tame outliers.
    lo, hi = df[col].quantile([q, 1 - q])
    return df.assign(**{col: df[col].clip(lo, hi)})

df.pipe(winsorize, "revenue")
5–10: the rest
- Use .loc[] for assignment, never chained indexing.
- Categorical dtype saves memory and speeds up groupby on string columns with few distinct values.
- value_counts(normalize=True) gives you proportions in one call.
- merge() with indicator=True adds a _merge column showing which side each row came from, so join mismatches are easy to spot.
- convert_dtypes() switches columns to nullable dtypes, so integers and booleans survive missing values.
- Profile with df.memory_usage(deep=True) before optimizing anything.
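Several of the quick tips above fit into one short script. This is a minimal sketch on made-up data: the `orders` and `countries` frames, the `tier` column, and every value are assumptions for illustration, and exact byte counts will vary with your pandas version.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3, 4],
        "country": ["BR", "BR", "PT", "US"],
        "revenue": [120.0, 80.0, 200.0, 50.0],
    }
)
countries = pd.DataFrame(
    {"country": ["BR", "PT", "ES"], "region": ["LATAM", "EU", "EU"]}
)

# .loc for assignment: rows and column are selected in a single call,
# so the write lands on orders itself rather than on a temporary copy.
orders["tier"] = "small"
orders.loc[orders["revenue"] >= 100, "tier"] = "large"

# Categorical dtype: each distinct label is stored once, plus integer codes.
as_text = pd.Series(["BR", "PT", "US", "BR"] * 25_000)
as_category = as_text.astype("category")
text_bytes = as_text.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

# value_counts(normalize=True): proportions instead of raw counts.
shares = orders["country"].value_counts(normalize=True)

# merge(indicator=True): the added _merge column marks each row as
# "both", "left_only", or "right_only".
joined = orders.merge(countries, on="country", how="outer", indicator=True)
unmatched = joined[joined["_merge"] != "both"]

# convert_dtypes(): pick nullable dtypes so missing values do not force
# integers to float or booleans to object.
raw = pd.DataFrame({"n": [1, None, 3], "flag": [True, None, False]})
typed = raw.convert_dtypes()

print(orders[["order_id", "tier"]])
print(f"text: {text_bytes} bytes, category: {category_bytes} bytes")
print(shares)
print(unmatched[["country", "_merge"]])
print(typed.dtypes)
```

In the outer join, US shows up as left_only and ES as right_only, which is exactly the pair of keys that would silently vanish from an inner join.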
Adopt these and your future self (and your reviewers) will thank you.