Research Paper Diet 2
Ahmed A. Metwally1,2,∗, Ariel K. Leong3,∗, Aman Desai4, Anvith Nagarjuna1, Dalia Perelman1, Michael Snyder1
Abstract—Diet management is key to managing chronic diseases such as diabetes. Automated food recommender systems may be able to assist by providing meal recommendations that [...]

[...] created an equation to predict the score a user would assign to a recipe that took into account the number of ingredients in the recipe that the user liked and how much they liked them.

arXiv:2110.15498v2 [cs.CL] 22 Nov 2021
[Figure 1: pipeline diagram. A food log entry passes through Preprocessing and Embedding to an Embedded Representation; similarities are computed against the WWEIA Database to retrieve the category of the food entry.]
Fig. 1. Workflow of the food preference learning algorithm. Each entry of the food log is processed through an NLP module. A food embedding is then obtained for each food entry in the food log and in the database. Next, for each food embedding of the food log, cosine similarity is computed against all embeddings of the foods in the WWEIA database. The food category label of the food with the highest cosine similarity is assigned to the food log entry. This process is repeated for all foods in the food logs. The most common food categories are then calculated.
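As a concrete sketch of this workflow, the toy example below averages word vectors over the tokens of a food name and assigns the category of the most cosine-similar database food. The three-dimensional vectors, the miniature database, and the helper names (embed, cosine, label) are all illustrative assumptions; the paper's actual pipeline uses learned word embeddings and the full WWEIA/FNDDS database.

```python
import math

# Toy word vectors standing in for the paper's word embeddings
# (illustrative values only, not the actual model).
EMBEDDINGS = {
    "salad":   [0.9, 0.1, 0.0],
    "chicken": [0.1, 0.9, 0.2],
    "bread":   [0.0, 0.2, 0.9],
}

# Toy database: category label -> tokens of a representative food name.
DATABASE = {
    "Salad dressings and vegetable oils": ["salad"],
    "Chicken, whole pieces": ["chicken"],
    "Yeast breads": ["bread"],
}

def embed(tokens):
    """Average the word vectors of the in-vocabulary tokens of a food name."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    # Assumes at least one token is in the vocabulary.
    return [sum(component) / len(vecs) for component in zip(*vecs)]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def label(entry_tokens):
    """Assign the category of the database food most similar to the entry."""
    e = embed(entry_tokens)
    scores = {cat: cosine(e, embed(toks)) for cat, toks in DATABASE.items()}
    return max(scores, key=scores.get)

# Out-of-vocabulary tokens ("panera") are simply ignored by embed().
print(label(["panera", "bread"]))  # → Yeast breads
```

In the real system, each food log entry would be compared against embeddings of all database foods and the highest-similarity food's category label assigned, exactly as the caption above describes.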
TABLE I
FOOD LOG NAME PREPROCESSING METHODS EXAMPLES

Method | Result of Processing the Phrase "Panera Bread, salad, cobb, green goddess, with chicken & dressing"
1 | "bread"
2 | "bread, salad, cobb, green, with, chicken, &, dressing" → embedding
3 | "salad, cobb, green, with, chicken, &, dressing" → embedding
4 | "salad, cobb, green, chicken, dressing"
5 | "bread, salad, cobb, green, chicken, dressing" → embedding
6 | "salad, cobb, green, chicken, dressing", but the database is restricted to foods in the "Vegetable mixed dishes", "Chicken, whole pieces", "Chicken patties, nuggets, and tenders", "Yeast breads", and "Salad dressings and vegetable oils" categories → embedding

2) Food Log Name Preprocessing Methods: In our approach, the correct labeling of a food depends on the similarity between a food log entry and its database counterpart (or the one most closely analogous). Thus, to improve food label accuracy, food log entry names were preprocessed in various ways to remove words that increased similarity to incorrect FNDDS entries or decreased similarity to the correct entry. An example showing the preprocessing strategies applied to one food log entry, "Panera Bread, salad, cobb, green goddess, with chicken & dressing," is shown in Table I. The preprocessing methods, which build on each other, are described below:

• Method 1 was intended to retain only the food's general name. As previously mentioned, Cronometer food log entry names were structured as a series of comma-separated phrases. Generally, the first phrase would contain the general food name; if a food brand was specified, the brand would be the first comma-separated phrase and the general food name the second, followed by phrases specifying food composition or preparation details. Removing the food manufacturer's brand name and the more specific details of the foods was hypothesized to increase the similarity between analogous foods. The removal was accomplished by eliminating all but either the first or second comma-separated phrase before generating an embedding. Determining whether the first comma-separated phrase contained a food brand name, and so should not be used, consisted of counting the number of words in the first and second comma-separated phrases that belonged to the FNDDS vocabulary and choosing the phrase that had the larger number.

• Method 2 was similar to Method 1, but retained the comma-separated phrases that contained specific details.

• Method 3, like Method 2, retained most of the food name, but used another heuristic to judge whether the first comma-separated phrase contained a brand name. Instead of counting the number of words that belonged to the FNDDS vocabulary, the percentage of FNDDS words in the comma-separated phrase was used.

• Method 4, in which generic food-related terms were removed from the food log entry name (in addition to the preprocessing done in Method 3), was introduced after noticing that for some food log entries, the most similar database food was one that was wholly unrelated but contained a generic word in common. For example, the most similar database food for many fruits, including "Blueberries, Fresh," "Blackberries, Fresh," and "Strawberries, Fresh," was "Fresh corn custard, Puerto Rican style." Thus the frequency of each word in the FNDDS vocabulary was tabulated, and all of the generic words among the top 250 most common words were removed from the food log name.

• Method 5 addressed the mislabeling of foods such as "Kind, Nuts & Spices Bar, Dark Chocolate Nuts & Sea Salt," where the first comma-separated phrase contains not only the brand name but also the general food name. This method was identical to Method 4 except that instead of removing the whole first comma-separated phrase, only words not found in the FNDDS vocabulary were removed.

• Method 6 addressed mislabeling errors where the predicted food label was very different from the true label. (For example, "Orowheat, Thin-Sliced Rustic White" was misclassified as "Liquor and cocktails.") Instead of being compared to the embeddings of all FNDDS foods, a food log entry was compared only to foods whose labels had associated FNDDS foods that shared words in common with the food log name.

C. Evaluation Metrics

1) Labeling Accuracy Metrics: Several evaluation metrics were employed to assess how well the system assigned the correct food label to each food entry. All metrics were calculated individually for each food log and then averaged. Accuracy was computed for each food log as:

Accuracy_label = (# of correct assignments of unique foods) / (# of unique food log entries)   (3)

Due to the presence of overlapping food categories (for example, "Milk, whole" and "Milk shakes and other dairy drinks"), the "synonymous accuracy" was also calculated:

Accuracy_syn = (# of synonymous assignments of unique foods) / (# of unique food log entries)   (4)

Food categories were considered synonymous if they shared at least one word in common. Accounting for synonymous categories reduced the number of database food categories from 155 to 98. Since it could be the case that the model did not predict the correct label but assigned it a high probability, the mean reciprocal rank (MRR) was also assessed:

MRR = (1/n) Σ_{i=1}^{n} 1/rank_i   (5)

where n is the total number of unique foods in the food log and rank_i is the rank that the model assigned the correct label for food i.

Similar to accuracy, the synonymous mean reciprocal rank (SMRR) was calculated:

SMRR = (1/n) Σ_{i=1}^{n} 1/synrank_i   (6)

where synrank_i is the rank of the highest-ranking synonymous food category.

2) Identifying Food Preferences Metrics: Effectiveness at identifying food preferences was evaluated in several ways. Again, all metrics were calculated individually for each food log and then averaged. Food preference accuracy was defined as

Accuracy_preference = (1/|categories|) Σ_{i ∈ categories} 1{p_i^d = p_i^f}   (7)

where the categories are grains, vegetables, proteins, fruits, and dairy, p_i^d is the most popular food for category i in the dataset, and p_i^f is the most popular food for category i in the food log. A corresponding synonymous accuracy (whether the food identified as the most popular was from a synonymous category) was also calculated. To measure food preferences beyond one favorite, the percentage of the user's top ten most commonly eaten foods that the model was able to identify was also calculated, along with a synonymous version of this percentage.

III. RESULTS AND DISCUSSION

A. Food Logs and Database Summary

Figure 2 demonstrates that most of the 34 food logs used
[...] samples contain a representative selection of different foods that can be generalized to a larger population of subjects. Figure 3 affirms this idea, illustrating the wide diversity in food selection among the 34 sampled individuals. Even relatively common food item categories like "Yeast breads" rarely make up more than 15% of any individual's log, although it should be noted that the current food log entry system does not measure the exact amount of food consumed. Interestingly, none of the 10 most chosen food items are meat dishes; this is likely because a large number of food log categories refer to dishes with meat in them (e.g., "Ground beef", "Pork", "Turkey, duck, other poultry"). Overall, the lack of any overwhelmingly common food category selections indicates that the dataset has enough variation for us to make preliminary conclusions.

[Figure 2: bar chart; x-axis: Subject (S_1 to S_34), y-axis: # of Food Entries]

Fig. 2. Distribution of number of food entries per food log. Each bar represents the total number of entries in each subject's food log, with numbers at the top of each bar displaying the number of days across which the data were recorded.

B. Performance Evaluation

Table II summarizes the performance evaluation of each of the 6 methods in terms of food labeling metrics and food preference metrics. The highest-performing method, Method 4, achieved an accuracy of 49% and identified 82% of users' 10 most frequently-eaten foods (with synonymous categories included). A mean reciprocal rank of 0.57 suggests that for many of the food log entries, the correct food label was one of the top two predicted choices. Comparisons to other food preference evaluation work were difficult because their approaches involved different evaluation metrics.

The work shed light on some of the challenges involved in working with food logs. The varying performance of different methods underscored the importance of distinguishing between words that could bias the embeddings and should be removed and words that were instrumental in establishing similarity with the correct analogous database food. Methods 4 and 5's high performance on most of the tasks is likely partly due to the removal of generic food-related words. The poor performance of Method 6, the method that restricted the FNDDS foods that a food log entry was compared with to only those belonging to a category that contained foods sharing at least one word in common with the food log entry, supports the importance of comparing food log entries based on the contexts of their component words rather than on the words themselves. Method 6 incorrectly labeled foods such as "Creme Fraiche," which was predicted to have the "Doughnuts, sweet rolls, pastries" label since neither "creme" nor "fraiche" appeared in a food name belonging to the correct category, "Cream cheese, sour cream, whipped cream."

Some incorrect label predictions were due to dataset limitations. For several of the non-Western foods in the food logs, such as sev, there were few or no similar foods in the dataset. The database also did not contain alternate spellings, such as "yoghurt," or abbreviations such as "froyo." The heuristic for determining whether the first comma-separated phrase contained a company name was misled by company names that contained food names, such as "Chipotle."

IV. CONCLUSION & FUTURE WORK

In this work, we introduce an embedding-based approach to identify food preferences from food logs, and we propose accompanying evaluation metrics. Our highest-performing method identifies 82% of a user's 10 most frequently-eaten foods. This information regarding a user's favored foods can be used to generate healthy and realistic meal recommendations that feature ingredients that the user commonly consumes.

Our proposed approach can be generalized to other food logging apps besides Cronometer and to other food preference details besides a user's most frequently eaten foods. Each food logging application has its own structure for a food name; this work introduces several methods of preprocessing food log names, and for each application, one approach may more accurately identify dietary preferences than another. This work also provides a guide for identifying other dietary preferences using a method similar to the one used to identify frequently-eaten foods: create a set of vectors corresponding to each available option for a dietary preference. For example, if one were trying to identify a user's favored cuisines, they could create vectors for the cuisines "Chinese food" and "Mediterranean food," then, for each food entry, use cosine similarity to identify the cuisine most likely to belong to that food.

Limitations of the work include the small number of food logs used; annotating all of the entries with the corresponding FNDDS food category required enormous effort. The large decrease in accuracy for Method 6 suggests that using improved embeddings, such as those generated using BERT [18] or ELMo [19], could lead to better performance. The greater incorporation of context into the embeddings could assist in overcoming dataset limitations and labeling food log entries
[Figure 3: grouped bar chart; x-axis: Food Name (Beef, Berries, Cheese, Dairy Desserts, Dips, Nutrition Bars, Pasta, White Potatoes, Yeast Breads, ...), y-axis: Relative Frequency (0.00 to 0.15); legend: subjects S_1 to S_34]

Fig. 3. The relative frequency distribution for the ten most common food labels across the food logs.
TABLE II
RESULTS OF EVALUATING THE FOOD PREFERENCE LEARNING ALGORITHM ON THE INTRODUCED FOOD LABELING AND FOOD PREFERENCE METRICS.

Method | Food Labeling: Accuracy | Synonymous Accuracy | MRR | SMRR | Food Preference: Accuracy | Synonymous Accuracy | % Top 10 Foods Identified | % Top 10 Synonymous Foods Identified
1 | 0.42 | 0.48 | 0.49 | 0.55 | 0.41 | 0.47 | 0.46 | 0.76
2 | 0.42 | 0.47 | 0.48 | 0.52 | 0.35 | 0.40 | 0.44 | 0.75
3 | 0.43 | 0.48 | 0.52 | 0.56 | 0.37 | 0.41 | 0.47 | 0.77
4 | 0.49 | 0.54 | 0.57 | 0.62 | 0.47 | 0.51 | 0.52 | 0.82
5 | 0.45 | 0.49 | 0.53 | 0.58 | 0.39 | 0.45 | 0.49 | 0.81
6 | 0.30 | 0.37 | 0.57 | 0.62 | 0.23 | 0.31 | 0.32 | 0.74
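As a sketch of how the labeling metrics reported in Table II can be computed: the snippet below takes, for each unique food log entry, the true category and the model's ranked predicted categories, and derives accuracy, synonymous accuracy, and MRR. The one-word-overlap synonymy test follows the definition in the text, while the helper names and the toy predictions are illustrative assumptions, not the paper's data.

```python
def share_word(cat_a, cat_b):
    """Two categories are synonymous if they share at least one word."""
    tokens = lambda c: set(c.lower().replace(",", "").split())
    return bool(tokens(cat_a) & tokens(cat_b))

def labeling_metrics(predictions):
    """predictions: list of (true_label, ranked_predicted_labels) pairs,
    one per unique food log entry. Returns (accuracy, synonymous
    accuracy, mean reciprocal rank)."""
    n = len(predictions)
    acc = sum(ranked[0] == true for true, ranked in predictions) / n
    syn_acc = sum(share_word(ranked[0], true) for true, ranked in predictions) / n
    # Reciprocal rank of the correct label (0 if it was never predicted).
    mrr = sum(
        1.0 / (ranked.index(true) + 1) if true in ranked else 0.0
        for true, ranked in predictions
    ) / n
    return acc, syn_acc, mrr

# Made-up predictions for two unique entries of one food log.
preds = [
    ("Milk, whole", ["Milk, whole", "Yeast breads"]),   # correct at rank 1
    ("Milk, whole", ["Milk shakes and other dairy drinks",
                     "Milk, whole"]),                   # synonymous at rank 1
]
print(labeling_metrics(preds))  # → (0.5, 1.0, 0.75)
```

SMRR would follow the same pattern as MRR, using the rank of the highest-ranking synonymous category instead of the exact label.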
that do not have an identical analog in FNDDS. The increase in performance when common words were removed suggests that other ways of weighting words that differ in importance when determining similarity, such as incorporating the TF-IDF statistic, may lead to improved performance.

In the future, we plan to add a recommendation component to our food preference learning system. After learning the kinds of foods the user prefers to eat, the system will use nutritional information to recommend healthy variants of the favored foods that fit with the user's metabolic goals. We also plan to evaluate our system with real users and improve the system by taking cuisine preferences into account.

V. CODE AVAILABILITY

The project source code is publicly available at https://github.com/aametwally/LearningFoodPreferences.

REFERENCES

[1] Christoph Trattner and David Elsweiler. Food recommender systems: important contributions, challenges and future research directions. arXiv preprint arXiv:1711.02760, 2017.
[2] Peter Forbes and Mu Zhu. Content-boosted matrix factorization for recommender systems: experiments with recipe recommendation. In Proceedings of the Fifth ACM Conference on Recommender Systems, pages 261–264, 2011.
[3] Morgan Harvey, Bernd Ludwig, and David Elsweiler. You are what you eat: Learning user tastes for rating prediction. In International Symposium on String Processing and Information Retrieval, pages 153–164. Springer, 2013.
[4] Mayumi Ueda, Mari Takahata, and Shinsuke Nakajima. User's food preference extraction for personalized cooking recipe recommendation. In Workshop of ISWC, pages 98–105, 2011.
[5] Longqi Yang, Cheng-Kang Hsieh, Hongjian Yang, John P Pollak, Nicola Dell, Serge Belongie, Curtis Cole, and Deborah Estrin. Yum-me: a personalized nutrient-based meal recommender system. ACM Transactions on Information Systems (TOIS), 36(1):1–31, 2017.
[6] Raciel Yera Toledo, Ahmad A Alzahrani, and Luis Martinez. A food recommender system considering nutritional information and user preferences. IEEE Access, 7:96695–96711, 2019.
[7] Yu Chen, Ananya Subburathinam, Ching-Hua Chen, and Mohammed J Zaki. Personalized food recommendation as constrained question answering over a large-scale food knowledge graph. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining, pages 544–552, 2021.
[8] Donghyeon Park, Keonwoo Kim, Seoyoon Kim, Michael Spranger, and Jaewoo Kang. FlavorGraph: a large-scale food-chemical graph for generating food representations and recommending food pairings. Scientific Reports, 11(1):1–13, 2021.
[9] Felicia Cordeiro, Daniel A Epstein, Edison Thomaz, Elizabeth Bales, Arvind K Jagannathan, Gregory D Abowd, and James Fogarty. Barriers and negative nudges: Exploring challenges in food journaling. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pages 1159–1162, 2015.
[10] Xu Ye, Guanling Chen, Yang Gao, Honghao Wang, and Yu Cao. Assisting food journaling with automatic eating detection. In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems, pages 3255–3262, 2016.
[11] MyFitnessPal. https://www.myfitnesspal.com/. Accessed: 2021-08-17.
[12] Cronometer. https://cronometer.com/. Accessed: 2021-08-17.
[13] Lose It! https://www.loseit.com/. Accessed: 2021-08-17.
[14] Andrea Morales-Garzón, Juan Gómez-Romero, and Maria J Martin-Bautista. A word embedding-based method for unsupervised adaptation of cooking recipes. IEEE Access, 9:27389–27404, 2021.
[15] Wesley Tansey, Edward W Lowe Jr, and James G Scott. Diet2Vec: Multi-scale analysis of massive dietary data. arXiv preprint arXiv:1612.00388, 2016.
[16] 2017-2018 Food and Nutrient Database for Dietary Studies. https://www.ars.usda.gov/. Accessed: 2021-08-17.
[17] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
[18] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
[19] Matthew E Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018.