Allrecipes
Allrecipes data is on the menu this week.
Once again, there are a lot of interesting things to explore in this data. I thought I'd try looking at the relative prevalence of the two staples: rice and bread. My methodology is far from perfect, but I just tried to parse the ingredients list and extract either rice on the one hand, or flour, wheat, or bread on the other.
Once I'd added two booleans to flag each recipe, I took the net count of flour-based minus rice-based recipes and normalized it by the total number of recipes for each cuisine. There aren't really any surprises in these numbers, which suggests the approach isn't completely off base.
copy (
    with parse_ingredients as (
        select *,
            -- flag recipes whose ingredients text mentions rice
            ingredients like '%rice%' as has_rice,
            -- flag recipes whose ingredients text mentions flour, bread, or wheat
            ingredients similar to '.*(flour|bread|wheat).*' as has_flour
        from 'https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2025/2025-09-16/cuisines.csv'
    )
    select cuisine,
        count_if(has_rice) as rice_recipes,
        count_if(has_flour) as flour_recipes,
        count(*) as all_recipes,
        -- net share: positive leans flour, negative leans rice
        (flour_recipes - rice_recipes)/all_recipes as net_pct
    from parse_ingredients
    group by all
) to 'rice_vs_bread.csv'
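One way the matching is "far from perfect": `like` is case-sensitive and matches substrings, so it can miss "Rice" and pick up words such as "licorice". Here is a minimal sketch of a stricter variant, using DuckDB's `regexp_matches` with a word boundary and the case-insensitive option. The three rows are invented for illustration and are not from the dataset, and this is not the query behind the numbers in this post.

```sql
-- Toy rows, made up purely to compare the two matching styles.
with toy(ingredients) as (
    values ('2 cups Basmati Rice'),
           ('1 tsp licorice extract'),
           ('3 cups all-purpose flour')
)
select ingredients,
    -- substring match as used above: case-sensitive, matches inside other words
    ingredients like '%rice%' as substring_match,
    -- whole-word match, ignoring case
    regexp_matches(ingredients, '\brice\b', 'i') as word_match
from toy;
```

On these rows the two flags disagree for the first two lines: the substring match skips "Basmati Rice" and catches "licorice", while the whole-word match does the opposite.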
I also set up a mapping from cuisines to country names (with the help of Claude) and re-aggregated the data by country.
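The re-aggregation could look roughly like the sketch below. It assumes a hypothetical mapping file, `cuisine_to_country.csv`, with one row per cuisine and two columns, `cuisine` and `country`. Both the file name and its layout are placeholders rather than the actual mapping used here.

```sql
with flagged as (
    select cuisine,
        ingredients like '%rice%' as has_rice,
        ingredients similar to '.*(flour|bread|wheat).*' as has_flour
    from 'https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2025/2025-09-16/cuisines.csv'
)
select m.country,
    count_if(f.has_rice) as rice_recipes,
    count_if(f.has_flour) as flour_recipes,
    count(*) as all_recipes,
    (flour_recipes - rice_recipes)/all_recipes as net_pct
from flagged f
-- hypothetical mapping file: columns (cuisine, country)
join read_csv('cuisine_to_country.csv') m using (cuisine)
group by all;
```

Because this is an inner join, any cuisine missing from the mapping drops out of the result, so it is worth checking that every cuisine has a row in the mapping.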
Of course, this doesn't even account for the many recipes that would just be paired with either rice or bread. I wonder if the prevalence of dishes served with just plain rice accounts for the rice-based recipes capping out below 50%.