NGC-04 · Deep Field
Deep-field studies — zooming in on individual objects to examine methods, constraints, and outcomes.
Food waste and economic inflation are reshaping how people cook — but recipe generation tools in 2023 were blunt instruments, pattern-matching on popularity rather than understanding why ingredients work together. The question I wanted to answer: could AI learn the underlying relationships between ingredients well enough to suggest combinations nobody had thought of yet?
I built a knowledge graph of ingredient relationships in Neo4j, pulling recipe data from AllRecipes via the Apify API and nutritional data from FooDB. I generated ingredient embeddings with Word2Vec, then trained a GraphSAGE network on the graph structure to learn deeper relational patterns. Dimensionality reduction — PCA, UMAP, and t-SNE — surfaced clusters and outliers in the embedding space, and the results were fed to GPT-3.5 via the OpenAI API to generate novel ingredient pairings and full recipes from the learned relationships.
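The embedding step at the heart of that pipeline can be sketched in a few lines. This is a minimal illustration, not the project's code: the recipe list is toy data, and it uses PPMI-weighted co-occurrence plus truncated SVD as a classical stand-in for Word2Vec-style embeddings (the two are closely related), skipping the Neo4j and GraphSAGE stages entirely.

```python
import numpy as np
from itertools import combinations

# Toy recipe corpus: each recipe is a list of ingredients.
# Illustrative only — the real project pulled recipes from AllRecipes.
recipes = [
    ["tomato", "basil", "mozzarella", "olive oil"],
    ["tomato", "garlic", "olive oil", "pasta"],
    ["basil", "garlic", "pine nuts", "olive oil"],
    ["mozzarella", "tomato", "olive oil"],
    ["garlic", "pasta", "olive oil"],
]

vocab = sorted({ing for r in recipes for ing in r})
idx = {ing: i for i, ing in enumerate(vocab)}

# Symmetric co-occurrence counts: two ingredients co-occur
# when they appear in the same recipe.
co = np.zeros((len(vocab), len(vocab)))
for r in recipes:
    for a, b in combinations(r, 2):
        co[idx[a], idx[b]] += 1
        co[idx[b], idx[a]] += 1

# Positive pointwise mutual information, then SVD: a standard
# count-based approximation of skip-gram embeddings.
total = co.sum()
row = co.sum(axis=1, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((co * total) / (row @ row.T))
ppmi = np.nan_to_num(np.maximum(pmi, 0.0), neginf=0.0)

U, S, _ = np.linalg.svd(ppmi)
dim = 3
emb = U[:, :dim] * S[:dim]  # low-dimensional ingredient vectors

def most_similar(ing, k=2):
    """Cosine-nearest neighbours of an ingredient in embedding space."""
    v = emb[idx[ing]]
    norms = np.linalg.norm(emb, axis=1) * np.linalg.norm(v)
    sims = emb @ v / np.maximum(norms, 1e-12)
    order = np.argsort(-sims)
    return [vocab[i] for i in order if vocab[i] != ing][:k]
```

Nearest neighbours in this space are the "ingredients that work together" signal the full pipeline handed to GPT-3.5; in the real system, PCA/UMAP/t-SNE projections of the GraphSAGE embeddings played the same role at much larger scale.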
I finished this project in 2023, when large language models were just beginning to go mainstream — and looking back, the timing was both exciting and limiting. The pipeline holds up. What I'd do differently now: ground the ingredient relationships in actual chemical compound data rather than co-occurrence patterns, and integrate allergen information so the output is genuinely useful rather than just novel. I'd also push the model comparisons further — I compared dimensionality reduction techniques but not the generative models themselves, and that's a gap a stronger paper would have closed. Three years on, the infrastructure to do this properly is dramatically better. This one isn't finished — it's just waiting for a second chapter.