My Research
Introduction
The purpose of this blog is to detail my progress and thoughts for my MSc Computer Science Research Project.
For my research, I will be investigating how efficiently I
can perform web scraping on recipe websites to produce allergen-filtered search
results.
What is the Research?
There are 14 main allergen groups, covering food items such
as fish, milk, celery and gluten. In the UK and EU, legislation requires
food manufacturers to label food products with the allergens that
they contain. However, for recipe
websites (which can be hosted anywhere), there is no legislation requiring
them to state which allergens are present in a recipe. This can make it challenging
for people with allergies to find safe recipes.
The 14 allergen groups are defined by the UK Food Standards Agency (Allergen guidance for food businesses).
For my research, I will be examining web scraping methods
using Python to see if I can accurately and reliably parse recipe websites to
extract details of food allergens and then present that data in a visual form to
users of the application.
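To make the idea of extracting allergen details more concrete, here is a minimal sketch of how ingredient lines might be matched against allergen groups once they have been scraped. The keyword map and the function name find_allergens are my own illustrative assumptions: the map covers only the four allergens mentioned above and is nowhere near complete enough to be relied on.

```python
import re

# Hypothetical and deliberately incomplete keyword map (an assumption for
# illustration only). A real application would need all 14 allergen groups
# and a far more thorough vocabulary.
ALLERGEN_KEYWORDS = {
    "fish": ["fish", "cod", "salmon"],
    "milk": ["milk", "butter", "cheese", "cream"],
    "celery": ["celery", "celeriac"],
    "gluten": ["wheat", "flour", "barley", "rye"],
}


def find_allergens(ingredients):
    """Return the set of allergen groups whose keywords appear in the ingredient lines."""
    found = set()
    for line in ingredients:
        text = line.lower()
        for allergen, keywords in ALLERGEN_KEYWORDS.items():
            # Whole-word matching, so "butternut" does not count as "butter".
            if any(re.search(r"\b" + re.escape(word) + r"\b", text) for word in keywords):
                found.add(allergen)
    return found


if __name__ == "__main__":
    sample = ["200g plain flour", "2 celery sticks", "1 tbsp olive oil"]
    print(sorted(find_allergens(sample)))
```

Simple keyword matching like this has obvious weaknesses. For example, "coconut milk" would be flagged as milk, which is exactly the kind of accuracy problem the research needs to examine.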
The key here, I believe, is to be accurate and reliable. There
is no standard template that websites are built from, so one recipe website
(for example the BBC Recipes) will have a different layout and structure from
another (for example Sainsbury's Recipes). I therefore need to see if there is a
reliable way to parse the websites without effectively hard-coding
a parser for each page, which would not be feasible.
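One possible way to avoid per-site parsing is to look for structured data rather than page layout. Some recipe pages embed schema.org Recipe markup as JSON-LD inside a script tag, with the ingredients listed under recipeIngredient. The sketch below uses only the Python standard library and assumes that markup is present, which will not be true of every site. The class name JsonLdCollector, the function extract_ingredients and the sample HTML are hypothetical names I have made up for illustration.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def extract_ingredients(html):
    """Return the recipeIngredient list from the first schema.org Recipe found, else []."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # Skip malformed blocks instead of failing the whole page.
        candidates = data if isinstance(data, list) else [data]
        for item in candidates:
            # Simplification: only a plain "@type": "Recipe" at the top level is handled.
            if isinstance(item, dict) and item.get("@type") == "Recipe":
                return list(item.get("recipeIngredient", []))
    return []


# Made-up page used purely for demonstration.
SAMPLE_HTML = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Recipe", "name": "Test", "recipeIngredient": ["200g flour", "1 egg"]}'
    "</script></head><body></body></html>"
)

if __name__ == "__main__":
    print(extract_ingredients(SAMPLE_HTML))
```

Pages without this markup would return an empty list here, and real pages can nest the recipe inside other structures that this sketch does not handle, so a fallback approach would still be needed.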
There are many different techniques and tools for web
scraping, in many different programming languages. I have chosen Python
because of its extensive library support, its flexibility and
power for developing scientific computing applications, and its
ability to build web applications.
References:
Allergen guidance for food businesses (no date) Food Standards Agency. Available at: https://www.food.gov.uk/business-guidance/allergen-guidance-for-food-businesses (Accessed: 4 September 2023).
Credits:
Photo by UX Indonesia on Unsplash

