
masala-to-metrics

Masala to Metrics: Predicting Calories

Name: Anjika Jain & Riya Bhivare

Email: anjika@umich & rbhiv@umich.edu

Website Link: https://anjikajain1.github.io/masala-to-metrics/

Introduction

Background

The dataset we chose was Recipes and Ratings. These two datasets were scraped from food.com, a popular online platform for sharing and discovering new recipes. The data we worked with is divided into two main components.

RAW_recipes.csv: Contains all details about recipes, including preparation time, number of steps in recipe, and nutritional information.

RAW_interactions.csv: Includes user reviews and ratings for the recipes found in RAW_recipes.csv.

Question: What are the different aspects that could impact the calorie content of recipes?

With this question as our focus, we analyzed several properties related to calories, such as the distribution of calories across the recipes and the distributions of the other nutritional values.

This question interested us because, coming from two South Asian households, we have seen how difficult it can be to track calories in recipes that are everyday meals in our culture. Understanding calorie content can help users cook recipes that meet their dietary goals. It could also help food.com's recommendation system filter recipes by calorie considerations.

Dataset

Rows in RAW_recipes: 83,782

Rows in RAW_interactions: 731,927

Columns in RAW_recipes:


Data Cleaning and Exploratory Data Analysis

Data Cleaning

Our team performed various steps in the data cleaning process to ensure our dataset was ready to be analyzed:

First, we performed a left merge between Recipes and Interactions on the recipe ID in order to bridge the two datasets.

Then, we noticed that some user ratings were 0, so we replaced them with NaN so that the average rating calculations are not biased downward.

Next, we computed the average rating for each recipe as a Series and added it as a column to get a better understanding of the merged dataset.

Finally, one of the major data cleaning steps involved the nutrition column. We split the list in each row into separate columns, one for each aspect of nutrition: calories, total_fat, sugar, sodium, protein, saturated_fat, and carbohydrates.
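The cleaning steps above can be sketched roughly as follows. This is a minimal illustration rather than our exact notebook code: the function name `clean` is hypothetical, and it assumes the raw `nutrition` column stores each list as a string such as `"[138.4, 10.0, ...]"`.

```python
import ast

import numpy as np
import pandas as pd

NUTRITION_COLS = ["calories", "total_fat", "sugar", "sodium",
                  "protein", "saturated_fat", "carbohydrates"]


def clean(recipes: pd.DataFrame, interactions: pd.DataFrame) -> pd.DataFrame:
    # Left merge keeps every recipe, including recipes with no reviews.
    merged = recipes.merge(interactions, how="left",
                           left_on="id", right_on="recipe_id")
    # Ratings of 0 become NaN so they do not drag the averages down.
    merged["rating"] = merged["rating"].replace(0, np.nan)
    # Per-recipe mean rating, repeated on every review row of that recipe.
    merged["avg_rating"] = merged.groupby("id")["rating"].transform("mean")
    # Assumption: nutrition is a stringified list; parse it into 7 float columns.
    parsed = merged["nutrition"].apply(ast.literal_eval)
    nutrition = pd.DataFrame(parsed.tolist(), columns=NUTRITION_COLS,
                             index=merged.index)
    return pd.concat([merged.drop(columns="nutrition"), nutrition], axis=1)
```

Using `transform("mean")` keeps one row per review, which matches the shape of a merged table where a recipe appears once per review.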

Below is the `head` of the cleaned DataFrame.

| name | id | minutes | contributor_id | submitted | tags | n_steps | steps | description | ingredients | n_ingredients | user_id | recipe_id | date | rating | review | avg_rating | calories | total_fat | sugar | sodium | protein | saturated_fat | carbohydrates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 brownies in the world best ever | 333281 | 40 | 985201 | 2008-10-27 | ['60-minutes-or-less', 'time-to-make', 'course', 'main-ingredient', 'preparation', 'for-large-groups', 'desserts', 'lunch', 'snacks', 'cookies-and-brownies', 'chocolate', 'bar-cookies', 'brownies', 'number-of-servings'] | 10 | ['heat the oven to 350f and arrange the rack in the middle', 'line an 8-by-8-inch glass baking dish with aluminum foil', 'combine chocolate and butter in a medium saucepan and cook over medium-low heat , stirring frequently , until evenly melted', 'remove from heat and let cool to room temperature', 'combine eggs , sugar , cocoa powder , vanilla extract , espresso , and salt in a large bowl and briefly stir until just evenly incorporated', 'add cooled chocolate and mix until uniform in color', 'add flour and stir until just incorporated', 'transfer batter to the prepared baking dish', 'bake until a tester inserted in the center of the brownies comes out clean , about 25 to 30 minutes', 'remove from the oven and cool completely before cutting'] | these are the most; chocolatey, moist, rich, dense, fudgy, delicious brownies that you'll ever make.....sereiously! there's no doubt that these will be your fav brownies ever for you can add things to them or make them plain.....either way they're pure heaven! | ['bittersweet chocolate', 'unsalted butter', 'eggs', 'granulated sugar', 'unsweetened cocoa powder', 'vanilla extract', 'brewed espresso', 'kosher salt', 'all-purpose flour'] | 9 | 386585.0 | 333281.0 | 2008-11-19 | 4.0 | These were pretty good, but took forever to bake. I would send it ended up being almost an hour! Even then, the brownies stuck to the foil, and were on the overly moist side and not easy to cut. They did taste quite rich, though! Made for My 3 Chefs. | 4.0 | 138.4 | 10.0 | 50.0 | 3.0 | 3.0 | 19.0 | 6.0 |
| 1 in canada chocolate chip cookies | 453467 | 45 | 1848091 | 2011-04-11 | ['60-minutes-or-less', 'time-to-make', 'cuisine', 'preparation', 'north-american', 'for-large-groups', 'canadian', 'british-columbian', 'number-of-servings'] | 12 | ['pre-heat oven the 350 degrees f', 'in a mixing bowl , sift together the flours and baking powder', 'set aside', 'in another mixing bowl , blend together the sugars , margarine , and salt until light and fluffy', 'add the eggs , water , and vanilla to the margarine / sugar mixture and mix together until well combined', 'add in the flour mixture to the wet ingredients and blend until combined', 'scrape down the sides of the bowl and add the chocolate chips', 'mix until combined', 'scrape down the sides to the bowl again', 'using an ice cream scoop , scoop evenly rounded balls of dough and place of cookie sheet about 1 - 2 inches apart to allow for spreading during baking', 'bake for 10 - 15 minutes or until golden brown on the outside and soft & chewy in the center', 'serve hot and enjoy !'] | this is the recipe that we use at my school cafeteria for chocolate chip cookies. they must be the best chocolate chip cookies i have ever had! if you don't have margarine or don't like it, then just use butter (softened) instead. | ['white sugar', 'brown sugar', 'salt', 'margarine', 'eggs', 'vanilla', 'water', 'all-purpose flour', 'whole wheat flour', 'baking soda', 'chocolate chips'] | 11 | 424680.0 | 453467.0 | 2012-01-26 | 5.0 | Originally I was gonna cut the recipe in half (just the 2 of us here), but then we had a park-wide yard sale, & I made the whole batch & used them as enticements for potential buyers ~ what the hey, a free cookie as delicious as these are, definitely works its magic! Will be making these again, for sure! Thanks for posting the recipe! | 5.0 | 595.1 | 46.0 | 211.0 | 22.0 | 13.0 | 51.0 | 26.0 |
| 412 broccoli casserole | 306168 | 40 | 50969 | 2008-05-30 | ['60-minutes-or-less', 'time-to-make', 'course', 'main-ingredient', 'preparation', 'side-dishes', 'vegetables', 'easy', 'beginner-cook', 'broccoli'] | 6 | ['preheat oven to 350 degrees', 'spray a 2 quart baking dish with cooking spray , set aside', 'in a large bowl mix together broccoli , soup , one cup of cheese , garlic powder , pepper , salt , milk , 1 cup of french onions , and soy sauce', 'pour into baking dish , sprinkle remaining cheese over top', 'bake for 25 minutes or until cheese is lightly browned', 'sprinkle with rest of french fried onions and bake until onions are browned and cheese is bubbly , about 10 more minutes'] | since there are already 411 recipes for broccoli casserole posted to "zaar" ,i decided to call this one #412 broccoli casserole.i don't think there are any like this one in the database. i based this one on the famous "green bean casserole" from campbell's soup. but i think mine is better since i don't like cream of mushroom soup.submitted to "zaar" on may 28th,2008 | ['frozen broccoli cuts', 'cream of chicken soup', 'sharp cheddar cheese', 'garlic powder', 'ground black pepper', 'salt', 'milk', 'soy sauce', 'french-fried onions'] | 9 | 29782.0 | 306168.0 | 2008-12-31 | 5.0 | This was one of the best broccoli casseroles that I have ever made. I made my own chicken soup for this recipe. I was a bit worried about the tsp of soy sauce but it gave the casserole the best flavor. YUM! \nThe photos you took (shapeweaver) inspired me to make this recipe and it actually does look just like them when it comes out of the oven. \nThanks so much for sharing your recipe shapeweaver. It was wonderful! Going into my family's favorite Zaar cookbook :) | 5.0 | 194.8 | 20.0 | 6.0 | 32.0 | 22.0 | 36.0 | 3.0 |
| 412 broccoli casserole | 306168 | 40 | 50969 | 2008-05-30 | ['60-minutes-or-less', 'time-to-make', 'course', 'main-ingredient', 'preparation', 'side-dishes', 'vegetables', 'easy', 'beginner-cook', 'broccoli'] | 6 | ['preheat oven to 350 degrees', 'spray a 2 quart baking dish with cooking spray , set aside', 'in a large bowl mix together broccoli , soup , one cup of cheese , garlic powder , pepper , salt , milk , 1 cup of french onions , and soy sauce', 'pour into baking dish , sprinkle remaining cheese over top', 'bake for 25 minutes or until cheese is lightly browned', 'sprinkle with rest of french fried onions and bake until onions are browned and cheese is bubbly , about 10 more minutes'] | since there are already 411 recipes for broccoli casserole posted to "zaar" ,i decided to call this one #412 broccoli casserole.i don't think there are any like this one in the database. i based this one on the famous "green bean casserole" from campbell's soup. but i think mine is better since i don't like cream of mushroom soup.submitted to "zaar" on may 28th,2008 | ['frozen broccoli cuts', 'cream of chicken soup', 'sharp cheddar cheese', 'garlic powder', 'ground black pepper', 'salt', 'milk', 'soy sauce', 'french-fried onions'] | 9 | 1196280.0 | 306168.0 | 2009-04-13 | 5.0 | I made this for my son's first birthday party this weekend. Our guests INHALED it! Everyone kept saying how delicious it was. I was I could have gotten to try it. | 5.0 | 194.8 | 20.0 | 6.0 | 32.0 | 22.0 | 36.0 | 3.0 |
| 412 broccoli casserole | 306168 | 40 | 50969 | 2008-05-30 | ['60-minutes-or-less', 'time-to-make', 'course', 'main-ingredient', 'preparation', 'side-dishes', 'vegetables', 'easy', 'beginner-cook', 'broccoli'] | 6 | ['preheat oven to 350 degrees', 'spray a 2 quart baking dish with cooking spray , set aside', 'in a large bowl mix together broccoli , soup , one cup of cheese , garlic powder , pepper , salt , milk , 1 cup of french onions , and soy sauce', 'pour into baking dish , sprinkle remaining cheese over top', 'bake for 25 minutes or until cheese is lightly browned', 'sprinkle with rest of french fried onions and bake until onions are browned and cheese is bubbly , about 10 more minutes'] | since there are already 411 recipes for broccoli casserole posted to "zaar" ,i decided to call this one #412 broccoli casserole.i don't think there are any like this one in the database. i based this one on the famous "green bean casserole" from campbell's soup. but i think mine is better since i don't like cream of mushroom soup.submitted to "zaar" on may 28th,2008 | ['frozen broccoli cuts', 'cream of chicken soup', 'sharp cheddar cheese', 'garlic powder', 'ground black pepper', 'salt', 'milk', 'soy sauce', 'french-fried onions'] | 9 | 768828.0 | 306168.0 | 2013-08-02 | 5.0 | Loved this. Be sure to completely thaw the broccoli. I didn't and it didn't get done in time specified. Just cooked it a little longer though and it was perfect. Thanks Chef. | 5.0 | 194.8 | 20.0 | 6.0 | 32.0 | 22.0 | 36.0 | 3.0 |
| 412 broccoli casserole | 306168 | 40 | 50969 | 2008-05-30 | ['60-minutes-or-less', 'time-to-make', 'course', 'main-ingredient', 'preparation', 'side-dishes', 'vegetables', 'easy', 'beginner-cook', 'broccoli'] | 6 | ['preheat oven to 350 degrees', 'spray a 2 quart baking dish with cooking spray , set aside', 'in a large bowl mix together broccoli , soup , one cup of cheese , garlic powder , pepper , salt , milk , 1 cup of french onions , and soy sauce', 'pour into baking dish , sprinkle remaining cheese over top', 'bake for 25 minutes or until cheese is lightly browned', 'sprinkle with rest of french fried onions and bake until onions are browned and cheese is bubbly , about 10 more minutes'] | since there are already 411 recipes for broccoli casserole posted to "zaar" ,i decided to call this one #412 broccoli casserole.i don't think there are any like this one in the database. i based this one on the famous "green bean casserole" from campbell's soup. but i think mine is better since i don't like cream of mushroom soup.submitted to "zaar" on may 28th,2008 | ['frozen broccoli cuts', 'cream of chicken soup', 'sharp cheddar cheese', 'garlic powder', 'ground black pepper', 'salt', 'milk', 'soy sauce', 'french-fried onions'] | 9 | 520830.0 | 306168.0 | 2017-10-17 | 5.0 | 5 stars from my husband and son, my toughest critics. I used a 10-oz bag of chopped broccoli and a 10-oz bag of flowerettes which gave it more texture. Very good flavor and the smell while cooking was great. The sauce held it together without overwhelming the broccoli. | 5.0 | 194.8 | 20.0 | 6.0 | 32.0 | 22.0 | 36.0 | 3.0 |
| millionaire pound cake | 286009 | 120 | 461724 | 2008-02-12 | ['time-to-make', 'course', 'cuisine', 'preparation', 'occasion', 'north-american', 'desserts', 'american', 'southern-united-states', 'dinner-party', 'holiday-event', 'cakes', 'dietary', 'christmas', 'thanksgiving', 'low-sodium', 'low-in-something', 'taste-mood', 'sweet', '4-hours-or-less'] | 7 | ['freheat the oven to 300 degrees', 'grease a 10-inch tube pan with butter , dust the bottom and sides with flour , and set aside', 'in a large mixing bowl , cream the butter and sugar with an electric mixer and add the eggs one at a time , beating after each addition', 'alternately add the flour and milk , stirring till the batter is smooth', 'add the two extracts and stir till well blended', 'scrape the batter into the prepared pan and bake till a cake tester or knife blade inserted in the center comes out clean , about 1 1 / 2 hours', 'cool the cake in the pan on a rack for 5 minutes , then turn it out on the rack to cool completely'] | why a millionaire pound cake? because it's super rich! this scrumptious cake is the pride of an elderly belle from jackson, mississippi. the recipe comes from "the glory of southern cooking" by james villas. | ['butter', 'sugar', 'eggs', 'all-purpose flour', 'whole milk', 'pure vanilla extract', 'almond extract'] | 7 | 813055.0 | 286009.0 | 2008-04-09 | 5.0 | don't let the calories and fat grams scare you off. This is a wonderful recipe and is perfect for the summer cook-out topped with fresh berries! It will make you proud. This is meant to be shared! | 5.0 | 878.3 | 63.0 | 326.0 | 13.0 | 20.0 | 123.0 | 39.0 |
| 2000 meatloaf | 475785 | 90 | 2202916 | 2012-03-06 | ['time-to-make', 'course', 'main-ingredient', 'preparation', 'main-dish', 'potatoes', 'vegetables', '4-hours-or-less', 'meatloaf', 'simply-potatoes2'] | 17 | ['pan fry bacon , and set aside on a paper towel to absorb excess grease', 'mince yellow onion , red bell pepper , and add to your mixing bowl', 'chop garlic and set aside', 'put 1tbsp olive oil into a saut pan , along with chopped garlic , teaspoons white pepper and a pinch of kosher salt', 'bring to a medium heat to sweat your garlic', 'preheat oven to 350f', 'coarsely chop your baby spinach add to your heated pan , stir frequently for approximately 5 min to wilt', 'add your spinach to the mixing bowl', 'chop your now cooled bacon , and add it to the mixing bowl', 'add your meatloaf mix to the bowl , with one egg and mix till thoroughly combined', 'add your goat cheese , one egg , 1 / 8 tsp white pepper and 1 / 8 tsp of kosher salt and mix till thoroughly combined', 'transfer to a 9x5 meatloaf pan , and cook for 60 min or until the internal temperature is at least 160f', 'let stand for 5min', 'melt 1tbsp unsalted butter into a frying pan , and cook up to three eggs at a time', 'crack each egg into a separate dish , in order to prevent egg shells from reaching the pan , then add salt and pepper to taste', 'wait until the egg whites are firm looking , but slightly runny on top before flipping your eggs', 'after flipping , wait 10~20 seconds before removing each egg and placing it over your slices of meatloaf'] | ready, set, cook! special edition contest entry: a mediterranean flavor inspired meatloaf dish. featuring: simply potatoes - shredded hash browns, egg, bacon, spinach, red bell pepper, and goat cheese. | ['meatloaf mixture', 'unsmoked bacon', 'goat cheese', 'unsalted butter', 'eggs', 'baby spinach', 'yellow onion', 'red bell pepper', 'simply potatoes shredded hash browns', 'fresh garlic', 'kosher salt', 'white pepper', 'olive oil'] | 13 | 2204364.0 | 475785.0 | 2012-03-07 | 5.0 | Delicious!!!!! -- the goat cheese made the difference. My new favorite meatloaf. | 5.0 | 267.0 | 30.0 | 12.0 | 12.0 | 29.0 | 48.0 | 2.0 |
| 2000 meatloaf | 475785 | 90 | 2202916 | 2012-03-06 | ['time-to-make', 'course', 'main-ingredient', 'preparation', 'main-dish', 'potatoes', 'vegetables', '4-hours-or-less', 'meatloaf', 'simply-potatoes2'] | 17 | ['pan fry bacon , and set aside on a paper towel to absorb excess grease', 'mince yellow onion , red bell pepper , and add to your mixing bowl', 'chop garlic and set aside', 'put 1tbsp olive oil into a saut pan , along with chopped garlic , teaspoons white pepper and a pinch of kosher salt', 'bring to a medium heat to sweat your garlic', 'preheat oven to 350f', 'coarsely chop your baby spinach add to your heated pan , stir frequently for approximately 5 min to wilt', 'add your spinach to the mixing bowl', 'chop your now cooled bacon , and add it to the mixing bowl', 'add your meatloaf mix to the bowl , with one egg and mix till thoroughly combined', 'add your goat cheese , one egg , 1 / 8 tsp white pepper and 1 / 8 tsp of kosher salt and mix till thoroughly combined', 'transfer to a 9x5 meatloaf pan , and cook for 60 min or until the internal temperature is at least 160f', 'let stand for 5min', 'melt 1tbsp unsalted butter into a frying pan , and cook up to three eggs at a time', 'crack each egg into a separate dish , in order to prevent egg shells from reaching the pan , then add salt and pepper to taste', 'wait until the egg whites are firm looking , but slightly runny on top before flipping your eggs', 'after flipping , wait 10~20 seconds before removing each egg and placing it over your slices of meatloaf'] | ready, set, cook! special edition contest entry: a mediterranean flavor inspired meatloaf dish. featuring: simply potatoes - shredded hash browns, egg, bacon, spinach, red bell pepper, and goat cheese. | ['meatloaf mixture', 'unsmoked bacon', 'goat cheese', 'unsalted butter', 'eggs', 'baby spinach', 'yellow onion', 'red bell pepper', 'simply potatoes shredded hash browns', 'fresh garlic', 'kosher salt', 'white pepper', 'olive oil'] | 13 | 2216720.0 | 475785.0 | 2012-03-21 | 5.0 | What a fabulous recipe. I have a lot of friends who either love to cook, are cookbook authors, are on TV with a cooking show, or who have been featured on cooking shows, so I know a thing or two about cooking. I know, for instance that cooking offers up a continual stream of adventures that do not require a passport or long airline layovers. Cooking is creative, expressive and comforting. A form of open eyed meditation lifting one beyond the commonplace. I'm a vegetarian, but I love to visit other recipes for inspiration so that I can use them by adapting the meat ingredients and therefore adopting them into my favorite recipe file. All thumbs up for this recipe by an obviously gifted, dedicated and creative cook! | 5.0 | 267.0 | 30.0 | 12.0 | 12.0 | 29.0 | 48.0 | 2.0 |
| 5 tacos | 500166 | 20 | 2549237 | 2013-05-13 | ['weeknight', '30-minutes-or-less', 'time-to-make', 'course', 'main-ingredient', 'preparation', 'occasion', 'main-dish', 'beef', 'vegetables', 'easy', 'diabetic', 'dinner-party', 'kid-friendly', 'stove-top', 'dietary', 'comfort-food', 'inexpensive', 'ground-beef', 'meat', 'greens', 'lettuces', 'tomatoes', 'taste-mood', 'equipment', '3-steps-or-less'] | 5 | ['cook meat', 'add taco seasoning', 'place meat into taco shells / tortillas', 'top with tomatoes , onions , lettuce , salsa and cheese', 'boil corn cobs 5-7 minutes'] | costs about $5.00 to make. | ['ground beef', 'taco seasoning', 'taco shells', 'lettuce', 'tomatoes', 'onion', 'salsa', 'cheddar cheese', 'corn cobs'] | 9 | 369715.0 | 500166.0 | 2013-06-13 | 4.0 | I doubled the recipe for my family but used two pounds of meat instead of 1.5 pounds. I followed the recipe except we did not use the onions and topped them with sour cream. I also didn't make the corn. We all enjoyed these. | 4.0 | 249.4 | 26.0 | 4.0 | 6.0 | 39.0 | 39.0 | 0.0 |

Univariate Analysis

Explanation: This histogram shows that the majority of recipes have calorie counts under 1,000 and that the distribution is right-skewed. This suggests that most recipes are relatively moderate in calories. High-calorie outliers are comparatively few, implying that people tend not to share very high-calorie recipes.

Bivariate Analysis

Explanation: This scatter plot shows that most recipes stay around 10,000 calories even as the number of ingredients increases, suggesting that the number of ingredients may not be strongly related to the number of calories in a recipe. However, we also noticed that in a few cases the calorie count decreases as the number of ingredients increases, which was not intuitive to our team.

Interesting Aggregates

| prep_time_group | calories | total_fat | sugar | sodium | protein | saturated_fat | carbohydrates |
|---|---|---|---|---|---|---|---|
| 0–15 min | 301.65 | 23.45 | 66.56 | 27.42 | 16.57 | 27.07 | 10.09 |
| 16–30 min | 369.7 | 27.96 | 45.72 | 23.77 | 31.31 | 33.47 | 11.42 |
| 31–60 min | 432.15 | 33.04 | 60.68 | 27.09 | 34.56 | 42.44 | 13.67 |
| 61–120 min | 564.28 | 43.7 | 95.62 | 32.61 | 40.69 | 56.51 | 18.51 |
| 120+ min | 547.51 | 39.85 | 68.84 | 47.69 | 57.29 | 48.27 | 15.66 |

Significance: We built this pivot table as part of our exploratory data analysis, to uncover interesting patterns or insights in our dataset. It lets us see whether calories are affected by how long a recipe takes to make. Many South Asian dishes require extended prep times, so we wanted to see whether prep time across many different recipes had any impact on calories. We didn't see any inherent patterns, but we did see that sodium content was fairly high.
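An aggregate like this one could be produced along the following lines. This is a sketch under assumptions: the function name is hypothetical, the bin edges are inferred from the group labels in the table, and whether the means are taken over review rows or unique recipes is not specified here.

```python
import numpy as np
import pandas as pd

NUTRITION_COLS = ["calories", "total_fat", "sugar", "sodium",
                  "protein", "saturated_fat", "carbohydrates"]


def nutrition_by_prep_time(df: pd.DataFrame) -> pd.DataFrame:
    # Assumed bin edges, matching the labels shown in the table.
    bins = [0, 15, 30, 60, 120, np.inf]
    labels = ["0–15 min", "16–30 min", "31–60 min", "61–120 min", "120+ min"]
    groups = pd.cut(df["minutes"], bins=bins, labels=labels,
                    include_lowest=True)
    # Mean of every nutrition column within each prep-time bucket.
    return (df.assign(prep_time_group=groups)
              .groupby("prep_time_group", observed=True)[NUTRITION_COLS]
              .mean()
              .round(2))
```

`pd.cut` uses right-closed intervals by default, so a 15-minute recipe falls in the first group and a 16-minute recipe in the second.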

Imputation

Prior to imputation, there were missing values in name, description, user_id, recipe_id, date, rating, review, and avg_rating. These presented opportunities for our team to impute values. We decided that constant imputation was only necessary for name, description, user_id, recipe_id, date, and review. Because these are textual (categorical) variables, filling them in with text indicating “No ___ were provided.” would not change the data in any fundamental way, and they weren't very relevant to our analysis.

We decided not to impute rating and avg_rating, as these values are numerical and integral to our analysis. Mean imputation would not accurately represent the ratings that reviewers actually entered. It would likely introduce bias and change the overall analysis in unpredictable ways.
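As a rough illustration of constant imputation, the sketch below fills only the six listed columns and leaves rating and avg_rating untouched. The function name and the exact placeholder wording are hypothetical.

```python
import pandas as pd

CONSTANT_COLS = ["name", "description", "user_id",
                 "recipe_id", "date", "review"]


def impute_constants(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    for col in CONSTANT_COLS:
        # Cast to object so a text placeholder can sit in numeric columns too.
        out[col] = out[col].astype(object).fillna(f"No {col} provided")
    # rating and avg_rating are deliberately not imputed.
    return out
```

Note that filling user_id or recipe_id with text turns those columns into object dtype, which is acceptable only if they are not used numerically later.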

Framing a Prediction Problem

Prediction Problem

We aim to predict the calorie content of a recipe based on its nutritional components (for example, total fat, sugar, and carbohydrates). We chose nutritional components as our features because this information is available in the dataset itself.

Problem Type

This is a regression problem because the target, calories, is a continuous numeric value.

Response Variable

The response variable we chose is calories. We chose calories because it is an important indicator for individuals who track their diets for health, fitness, or medical reasons. Understanding how different nutritional components affect calories can give a more holistic view of eating healthier. We also chose it because South Asian food often has a lot of hidden calories, and we hope prediction models like this one can encourage healthier recipes as people add their recipes and nutritional components.

Evaluation Metric

We will use Mean Absolute Error (MAE) to evaluate our model. It tells us, on average, how many calories our model's predictions are off by. MAE treats all errors equally.

We will also use Mean Squared Error (MSE), as it provides an overall sense of prediction error, keeping in mind that it weights large errors more heavily.

We will not look at the R^2 score, as it is unitless and less intuitive for users.
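To make the difference between the two metrics concrete, here is a small sketch with made-up calorie values (not taken from our data). Three predictions miss by 10 calories and one misses by 200.

```python
def mae(y_true, y_pred):
    # Mean of the absolute errors: every calorie of error counts the same.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    # Mean of the squared errors: large misses dominate the total.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values for illustration only.
actual = [300.0, 450.0, 200.0, 800.0]
predicted = [310.0, 440.0, 210.0, 600.0]

print(mae(actual, predicted))  # 57.5
print(mse(actual, predicted))  # 10075.0
```

The single 200-calorie miss contributes 40,000 of the 40,300 total squared error, which is why MSE is more sensitive to outliers than MAE.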

Baseline Model

Model Description and Features

We built a baseline linear regression model to predict the number of calories in a recipe based on two nutritional components (carbohydrates and total fat).

Features

Quantitative Features

carbohydrates: Carbohydrate content in Percent Daily Value (PDV%)

total_fat: Total Fat in Percent Daily Value (PDV%)

Nominal Features

None

Ordinal Features

None

Response Variable

calories: Calorie content, measured as a continuous value

Preprocessing Steps

We defined a sub-pipeline for numerical features that uses SimpleImputer to replace missing values with the mean of each column.

We evaluated the model on a test set of 20% of the data.
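A baseline along these lines could look like the sketch below. The function name, the `random_state` value, and the step names are assumptions for illustration; only the two features, the mean imputer, the linear regression, and the 20% test split come from the description above.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

FEATURES = ["carbohydrates", "total_fat"]


def fit_baseline(df: pd.DataFrame, random_state: int = 42):
    # Hold out 20% of the rows for testing.
    X_train, X_test, y_train, y_test = train_test_split(
        df[FEATURES], df["calories"], test_size=0.2,
        random_state=random_state)
    # Numeric sub-pipeline: fill missing values with the column mean.
    numeric = Pipeline([("impute", SimpleImputer(strategy="mean"))])
    preprocess = ColumnTransformer([("num", numeric, FEATURES)])
    model = Pipeline([("preprocess", preprocess),
                      ("regress", LinearRegression())])
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return (model,
            mean_absolute_error(y_test, pred),
            mean_squared_error(y_test, pred))
```

Putting the imputer inside the pipeline means the column means are learned from the training rows only, so nothing leaks from the test set.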

Model Performance

MSE Value: 9719.29 (RMSE: 98.6). Interpretation: This measures the average squared difference between predicted and actual calorie values. Because the errors are squared, larger errors are penalized more heavily.

MAE Value: 55.75 This means our baseline model’s calorie predictions are off by an average of 55.75 calories.

Is This a Good Model?

Based on an MAE of 55.75, if recipes have hundreds of calories on average, being off by about 56 could be acceptable for users who just want a general sense of how calorie-dense their meals are. However, for users who need accurate predictions, an average error of 56 calories could be a significant issue. An MSE of 9719 is high relative to the MAE (its square root is about 99 calories), suggesting that some of the errors are large. Since we use only two features (total_fat and carbohydrates), the model probably does not capture how the other nutritional components (protein, sugar, etc.) affect calories. This is an acceptable baseline model, but not a great one, because its predictions are not very precise.

Final Model

We introduce two engineered features to improve our predictions:

sugar_to_protein_ratio: This feature captures the ratio between sugar and protein in a recipe. High sugar-to-protein ratios indicate calorie-dense desserts, while lower ratios indicate recipes that are more protein-rich and possibly healthier. The ratio might carry more signal than sugar and protein taken separately.

total_macro_sum: This is the sum of six of the nutritional components (total_fat, sugar, sodium, protein, saturated_fat, carbohydrates). Because a recipe's calories come from its nutritional components, this feature acts as an overall proxy for nutrient density.
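The two features could be computed roughly as follows. The function name is hypothetical, and adding 1 to the denominator is an assumption made here to avoid dividing by zero when a recipe has no protein; the original handling of that case is not specified.

```python
import pandas as pd

MACRO_COLS = ["total_fat", "sugar", "sodium",
              "protein", "saturated_fat", "carbohydrates"]


def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Assumption: +1 guards against division by zero for protein == 0.
    out["sugar_to_protein_ratio"] = out["sugar"] / (out["protein"] + 1)
    # Row-wise sum of the six nutrition columns (all in PDV%).
    out["total_macro_sum"] = out[MACRO_COLS].sum(axis=1)
    return out
```

For the brownie row in the cleaned table (total_fat 10, sugar 50, sodium 3, protein 3, saturated_fat 19, carbohydrates 6), this gives a total_macro_sum of 91 and, with the assumed +1, a ratio of 12.5.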

Quantitative Features

carbohydrates: Carbohydrate content in Percent Daily Value (PDV%)

total_fat: Total Fat in Percent Daily Value (PDV%)

Engineered Features

sugar_to_protein_ratio: The ratio of sugar to protein in a recipe

total_macro_sum: The sum of six key nutrient components

Nominal Features

None

Ordinal Features

None

Response Variable

calories: Calorie content, measured as a continuous value

Modeling Algorithm

We used a Random Forest Regressor, chosen for its robustness to outliers and irrelevant features and for its ability to capture non-linear relationships between nutrients and calories.

Hyperparameter Tuning

When tuning our Random Forest Regressor, we wanted to choose hyperparameters that are high-impact and worth tuning. We chose the following hyperparameters to tune:

Best Hyperparameters

Our model performs best with 100 trees, which allows it to capture complex patterns in the data. Placing no limit on tree depth also allows the model to learn complex relationships.
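A grid search of this kind might be set up as sketched below. It assumes the tuned hyperparameters were `n_estimators` and `max_depth`, based on the best values described above. The candidate values, the function name, the scoring choice, and the number of folds are all hypothetical.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV


def tune_forest(X_train, y_train, param_grid=None, cv=3):
    if param_grid is None:
        # Hypothetical candidate values; the actual grid is not given here.
        param_grid = {"n_estimators": [50, 100],
                      "max_depth": [None, 10, 20]}
    search = GridSearchCV(
        RandomForestRegressor(random_state=42),
        param_grid,
        scoring="neg_mean_absolute_error",  # pick the grid point with lowest MAE
        cv=cv,
    )
    search.fit(X_train, y_train)
    return search.best_estimator_, search.best_params_
```

Scoring on negative MAE keeps the search aligned with the main evaluation metric, and `max_depth=None` lets each tree grow until its leaves are pure.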

Model's Performance and Evaluation

MSE: 5336.18

MAE: 21.37

MAE dropped by ~62%, and our model is now off by only about 21 calories on average. The decrease in MSE (~45% lower) also suggests that the model makes fewer large errors. The final model captures general patterns.
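The relative improvement can be checked directly from the reported baseline and final scores. The helper name below is hypothetical.

```python
def percent_drop(before: float, after: float) -> float:
    # Relative reduction from the baseline score, in percent.
    return (before - after) / before * 100


print(round(percent_drop(55.75, 21.37), 1))      # MAE: 61.7
print(round(percent_drop(9719.29, 5336.18), 1))  # MSE: 45.1
```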