Extracting Food Substitutes From Food Diary via Distributional Similarity


In this paper, we explore the problem of identifying substitute relationships between pairs of foods from real-world food consumption data, as a first step towards healthier food recommendation. Our method is inspired by the distributional hypothesis in linguistics: we assume that foods consumed in similar contexts are more likely to be dietary substitutes for one another. For example, a turkey sandwich can be considered a suitable substitute for a chicken sandwich if both tend to be consumed with french fries and salad. To evaluate our method, we constructed a real-world food consumption dataset from MyFitnessPal's public food diary entries and obtained ground-truth human judgements of food substitutes from a crowdsourcing service. The experimental results suggest that the method is effective in identifying suitable substitutes.
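
The distributional assumption can be illustrated with a minimal sketch. This is not the method evaluated in the paper: the abstract does not state which context definition or similarity measure was used, so the sketch assumes that a food's context is the set of other foods logged in the same meal and that similarity is cosine similarity over raw co-occurrence counts. The function names `context_vectors`, `cosine`, and `rank_substitutes` are hypothetical.

```python
from collections import Counter
from math import sqrt


def context_vectors(meals):
    """Map each food to a Counter of the other foods logged in the same meal.

    Assumption: a "context" is a single meal; the paper may define it differently.
    """
    vectors = {}
    for meal in meals:
        foods = set(meal)  # ignore repeated entries within one meal
        for food in foods:
            counts = vectors.setdefault(food, Counter())
            for other in foods:
                if other != food:
                    counts[other] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_substitutes(target, vectors):
    """Rank all other foods by how similar their contexts are to the target's."""
    target_vec = vectors.get(target, Counter())
    scores = [
        (food, cosine(target_vec, vec))
        for food, vec in vectors.items()
        if food != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Given a list of meals, each a list of food names, `rank_substitutes("turkey sandwich", context_vectors(meals))` returns candidate foods ordered by context similarity. Foods that merely co-occur with the target, such as side dishes, also receive non-zero scores under this simple scheme, so a practical system would need additional weighting or filtering.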

Proceedings of the 2016 Workshop on Engendering Health with RecSys - HealthRecSys '16