My Research

Image of people writing and doing research


Introduction

The purpose of this blog is to detail my progress and thoughts for my MSc Computer Science Research Project.

For my research, I will be investigating how efficiently I can scrape recipe websites to produce allergen-filtered search results.

What is the Research?

There are 14 main allergen groups, covering food items such as fish, milk, celery and gluten. In the UK and EU, legislation requires food manufacturers to label food products with the allergens they contain. However, for recipe websites (which can be hosted anywhere), there is no legislation requiring details of the allergens present in a recipe. This can make it challenging for people with allergies to find safe recipes.

The 14 allergen groups are defined by the UK Food Standards Agency (Allergen guidance for food businesses, no date).


For my research, I will be examining web scraping methods in Python to see whether I can accurately and reliably parse recipe websites for details of food allergens, and then present that data in a visual form to users of the application.
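To make the idea of getting allergen details from a recipe more concrete, here is a minimal sketch of one naive approach: matching ingredient text against keyword lists. This is an illustration only, not the method my project will necessarily use. The `ALLERGEN_KEYWORDS` table and the `find_allergens` function are hypothetical names, and the keyword lists cover only four of the 14 groups and are far from complete.

```python
import re

# Hypothetical, deliberately incomplete keyword lists for four of the
# 14 allergen groups. A real list would need to be much more thorough.
ALLERGEN_KEYWORDS = {
    "milk": ["milk", "butter", "cheese", "cream", "yoghurt"],
    "fish": ["fish", "cod", "salmon", "haddock", "anchovy", "anchovies"],
    "celery": ["celery", "celeriac"],
    "cereals containing gluten": ["wheat", "flour", "barley", "rye"],
}


def find_allergens(ingredients):
    """Return the set of allergen groups whose keywords appear in the ingredients."""
    found = set()
    for line in ingredients:
        text = line.lower()
        for group, keywords in ALLERGEN_KEYWORDS.items():
            # \b restricts matches to whole words, so "butter" does not
            # match inside "butternut".
            if any(re.search(rf"\b{re.escape(word)}\b", text) for word in keywords):
                found.add(group)
    return found


recipe = ["200g plain flour", "2 salmon fillets", "1 stick of celery"]
print(sorted(find_allergens(recipe)))
```

Even this small example shows why accuracy is hard. "Coconut milk" would be flagged as milk, and "rice flour" would be flagged as gluten, so simple keyword matching produces false positives as well as missed allergens.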

I believe the key is accuracy and reliability. Websites are not built from a standard template, so one recipe website (for example, BBC Recipes) will have a different layout and structure from another (for example, Sainsburys Recipes). I therefore need to find out whether there is a reliable way to parse these websites without hard-coding a parser for each page, which would not be feasible.
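One possible way to avoid site-specific parsing is to look for structured data rather than page layout. Some recipe sites embed schema.org `Recipe` metadata as JSON-LD inside a `<script type="application/ld+json">` tag, with the ingredients listed under `recipeIngredient`. The sketch below uses only the Python standard library to pull that list out of a page. It is an assumption that any given site publishes this data; I have not yet checked which sites do. `JsonLdCollector` and `extract_ingredients` are hypothetical names for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_ingredients(html):
    """Return recipeIngredient from the first schema.org Recipe found, else []."""
    collector = JsonLdCollector()
    collector.feed(html)
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed JSON rather than failing the whole page
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict):
                continue
            types = item.get("@type", [])
            if isinstance(types, str):
                types = [types]
            if "Recipe" in types:
                return list(item.get("recipeIngredient", []))
    return []


# A made-up page fragment, used here only to exercise the function.
sample_page = """
<html><head>
<script type="application/ld+json">
{"@type": "Recipe", "name": "Example", "recipeIngredient": ["200g flour", "2 eggs"]}
</script>
</head><body></body></html>
"""
print(extract_ingredients(sample_page))
```

This sketch does not handle recipes nested inside an `@graph` array, and it returns an empty list for pages with no structured data, so a fallback approach would still be needed for those sites.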

There are many techniques and tools for web scraping, across many programming languages. I have chosen Python because of its excellent library support, its flexibility and power for scientific computing, and its ability to build web applications.

References:

Allergen guidance for food businesses (no date) Food Standards Agency. Available at: https://www.food.gov.uk/business-guidance/allergen-guidance-for-food-businesses (Accessed: 4 September 2023).


Credits:

Photo by UX Indonesia on Unsplash

