This opportunity is not published. No applications will be accepted.
Master Thesis: Data Science Thesis (Predicting Nutrition from Shopping Data, e.g. Cumulus, Supercard)
Our research project “Receipt2Nutrition” is the world’s first large-scale voluntary panel whose participants anonymously contribute digital receipt data from loyalty cards, in order to assess the data’s potential to predict dietary behaviour automatically.
Keywords: Digital receipts, Data science, Deep learning, Nutrition
**The Topic**
The Receipt2Nutrition research project is a novel and scalable approach to individual-level dietary monitoring that applies data science to automatically captured digital receipts. Driven by the recent introduction of online food composition declaration (EU-1169/2011 2014), curated food composition databases containing detailed nutritional information on products sold in retail are now available. This information becomes particularly useful when combined with a consumer's shopping history. By using a loyalty card at checkout, a consumer automatically extends their electronic purchasing history, including dates, locations, and individual items purchased, in electronic and machine-readable form. A total of 80% of Swiss retailers' revenue is covered by loyalty cards (Accarda 2005; Handelszeitung.ch 2004), which provides a widely adopted basis for our research model. Thanks to the recently introduced General Data Protection Regulation (GDPR), users of such a loyalty card system can now request their own data from data processors (such as loyalty card system providers) and decide whether to share it with research institutions or diet applications.
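To illustrate how a food composition database and a purchasing history might be combined, here is a minimal Python sketch. It is not the project's actual pipeline: the product IDs, nutrient names, values, and the `purchased_nutrients` helper are all hypothetical, and real receipt data would need unit handling and product matching that this example leaves out.

```python
from collections import defaultdict

# Hypothetical food composition table: nutrients per 100 g, keyed by product ID.
FOOD_COMPOSITION = {
    "P001": {"energy_kcal": 52.0, "sugar_g": 10.4, "salt_g": 0.0},
    "P002": {"energy_kcal": 265.0, "sugar_g": 5.0, "salt_g": 1.2},
}

# Hypothetical receipt lines: (purchase date, product ID, weight in grams).
RECEIPT_LINES = [
    ("2019-03-01", "P001", 1000.0),
    ("2019-03-01", "P002", 500.0),
    ("2019-03-08", "P001", 500.0),
    ("2019-03-08", "P999", 250.0),  # product missing from the composition table
]


def purchased_nutrients(lines, composition):
    """Sum purchased nutrients over all receipt lines that can be matched.

    Returns the nutrient totals and the number of unmatched receipt lines.
    """
    totals = defaultdict(float)
    unmatched = 0
    for _date, product_id, grams in lines:
        nutrients = composition.get(product_id)
        if nutrients is None:
            unmatched += 1  # keep count so coverage can be reported
            continue
        for name, per_100g in nutrients.items():
            totals[name] += per_100g * grams / 100.0
    return dict(totals), unmatched


totals, unmatched = purchased_nutrients(RECEIPT_LINES, FOOD_COMPOSITION)
print(totals, "unmatched lines:", unmatched)
```

Note that these totals describe what was purchased, not what an individual actually ate. Closing that gap is the modelling problem the thesis addresses.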
**The Thesis**
Our project Receipt2Nutrition seeks at least N=2’000 volunteers to produce meaningful results. Volunteering participants contribute by 1) answering a mandatory introductory survey, 2) filling out a state-of-the-art food frequency questionnaire, and 3) requesting their digital receipts from their loyalty card providers (e.g. Supercard and Cumulus) and sharing the data with the research project. In return, all participants receive web-based visual feedback that evaluates their dietary and purchasing behaviour from a dietician’s perspective.
**You are**
- experienced with Data Science/Machine learning (e.g. Python)
The goal of the Receipt2Nutrition research project is to develop a machine-learning/deep-learning model that reliably predicts an individual user’s dietary pattern from their purchasing log. More specifically, we plan to assess how accurately household purchase data can predict individual dietary patterns (vegetable/fruit units per day; carbohydrate, fiber, sugar, salt, (saturated) fat and protein intake per day; daily energy intake). Because the model is sensitive to potential externalities (such as household size, purchase behaviour, eating in restaurants, food waste and consumption rates), its accuracy with respect to these externalities will be assessed. A confidence score between 0 and 1 will be attributed to each predicted dietary intake variable, and required minimum threshold values for household size, share of wallet with loyalty card data and restaurant visits per week will be calculated. The declared minimum goal is that dietary patterns can be predicted reliably for at least a subset of users (e.g. small households, few restaurant visits per week, high share of wallet with loyalty card data available).
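As a rough illustration of the subset evaluation described above, the following Python sketch compares prediction error for all participants with the error for those who pass a set of thresholds. Everything here is assumed for illustration: the `Participant` fields, the threshold values, and the sample numbers are placeholders, since the actual thresholds are meant to be an outcome of the thesis.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    household_size: int
    share_of_wallet: float             # fraction of food spend on the loyalty card, 0..1
    restaurant_visits_per_week: float
    predicted_kcal: float              # model output (placeholder value)
    reported_kcal: float               # questionnaire-based reference (placeholder value)


# Placeholder thresholds; the real values are to be determined in the thesis.
MAX_HOUSEHOLD_SIZE = 2
MIN_SHARE_OF_WALLET = 0.7
MAX_RESTAURANT_VISITS = 2.0


def is_eligible(p: Participant) -> bool:
    """True if the participant falls in the subset expected to be predictable."""
    return (
        p.household_size <= MAX_HOUSEHOLD_SIZE
        and p.share_of_wallet >= MIN_SHARE_OF_WALLET
        and p.restaurant_visits_per_week <= MAX_RESTAURANT_VISITS
    )


def mean_absolute_error(participants):
    """Mean absolute difference between predicted and reported daily energy intake."""
    if not participants:
        return None
    errors = [abs(p.predicted_kcal - p.reported_kcal) for p in participants]
    return sum(errors) / len(errors)


panel = [
    Participant(1, 0.9, 1.0, 2100.0, 2000.0),
    Participant(2, 0.8, 0.5, 1800.0, 1900.0),
    Participant(4, 0.9, 1.0, 2600.0, 2000.0),  # large household
    Participant(1, 0.3, 5.0, 1500.0, 2300.0),  # low share of wallet, eats out often
]

eligible = [p for p in panel if is_eligible(p)]
mae_all = mean_absolute_error(panel)
mae_eligible = mean_absolute_error(eligible)
print(f"MAE all: {mae_all:.0f} kcal, MAE eligible subset: {mae_eligible:.0f} kcal")
```

Mean absolute error is only one possible accuracy measure, and it is not the confidence score mentioned above; how that score is defined is left open here.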
You have the chance to become a co-author of a scientific paper in a conference or journal.
Klaus Fuchs
ETH Zürich, D-MTEC
Weinbergstrasse 56/58
8092 Zürich
Phone: +41 78 858 70 37
fuchsk@ethz.ch
www.autoidlabs.ch
www.im.ethz.ch