My Blog


Message Passing Neural Networks

Image by BoliviaInteligente

Over the past 15 years, Graph Neural Networks (GNNs) have seen a surge in use for modelling social networks, recommendation systems, transportation networks, and many other systems. This growing use has naturally raised questions about applying GNNs within the medical sector, in particular whether GNNs can predict the properties of molecules. Back in the early 2010s, this idea was in its infancy, with few successful applications.

Kullback-Leibler Divergence

Image by Daniela Turcanu

Introduction

This article will cover the key features of Kullback-Leibler Divergence (KL divergence), a formula introduced in 1951 by the mathematicians Solomon Kullback and Richard Leibler. This formula is used in the background of many modern machine learning models built around probabilistic modelling. These include Variational Autoencoders (VAEs), Generative Models, Reinforcement Learning, and Natural Language Processing. Additionally, this article will cover some of KL divergence’s key properties and briefly cover one of its applications.

Diffusion Models (DDPM)

Image by Justin Lim

Introduction

This article will delve into diffusion models, a group of latent variable (see definitions) generative models with applications in image generation, audio synthesis, and denoising. More specifically, this article will mostly focus on the derivations and the ideas behind diffusion models, with a heavy emphasis on the ideas introduced by Ho et al. in their Denoising Diffusion Probabilistic Models (DDPM) paper. The applications of these models will not be covered today.

Dynamic Programming in Solving Palindrome Partitioning

Image by Danny Greenberg

Summary

This article explores dynamic programming (DP), a technique used to tackle complex problems in computer science. We will specifically apply DP to two problems involving palindrome partitioning.

Requirements

A basic understanding of dynamic programming and some experience with DP problems are recommended before reading this article.

Definitions

• Palindrome: A string is a palindrome if it reads the same backward as forward. For example, “aba” is a palindrome, while “aab” is not.
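
The palindrome definition above can be checked in a few lines of Python. This is only a minimal sketch: the function name `is_palindrome` is an illustrative assumption and is not taken from the article itself.

```python
def is_palindrome(s: str) -> bool:
    """Return True if s reads the same backward as forward."""
    # s[::-1] is the reversed string; a palindrome equals its own reverse.
    return s == s[::-1]


# The two examples from the definition above.
print(is_palindrome("aba"))  # True
print(is_palindrome("aab"))  # False
```

Comparing against the reversed string takes time linear in the length of the string, which is fine for a single check; a DP approach to partitioning would typically cache such checks rather than repeat them.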

GraphSAGE

Image by Cajeo Zhang

Graphs have been used across many fields due to their ability to represent relationships between entities, with applications including social networks, search engines, and protein-protein interaction networks. However, one growing limitation of these graphs is the amount of computational resources they require, with some large-scale graphs having millions of nodes, each with its own set of features and edges. This has led to the creation of graph embedding methods, and more specifically deep embedding methods. These embedding methods aim to create a high-quality representation of the nodes and their edges. Rather than just incorporating the graph's structural information into an embedding, these methods also include node and edge features and other hierarchical information. This results in complicated models that are able to learn very rich representations of nodes.

Graph Factorisation Methods in Shallow Graphs

Image by Elena Mozhvilo

Summary

Graphs are incredibly useful for modelling a range of relationships and interactions. By using nodes to represent entities and edges to represent connections between these entities, they have become a very useful representation tool. Nowadays they are used to model social networks, protein-protein interactions, recommendation systems, knowledge graphs, supply chains, and much more. However, as these graphs scale up and gain more nodes and edges, a range of issues starts to arise. They become computationally expensive to process, noisy, and difficult to interpret.

Instance Normalisation within GANs

Image by Justin Simmonds

Generative Adversarial Networks (GANs) were first introduced in 2014 by Ian Goodfellow and his co-authors in the paper “Generative Adversarial Nets.” This paper presented the GAN framework, which consists of two neural networks called the generator and the discriminator. The generator takes random noise as input and outputs a generated image. The discriminator takes both a generated image and a real image as inputs and tries to determine which is real and which is generated. Their training process can be likened to a ping-pong game, with the generator trying to produce images that fool the discriminator, and the discriminator trying to identify which images are generated.

U-Net

Image by Dan Gold

Machine Learning (ML) has numerous applications in medicine, including disease diagnosis, drug development, predictive healthcare, and more. One key application of ML in medicine is biomedical image processing. These types of models take an image as input and then assign a class label to each pixel in a process called localisation. Competitions are held annually to advance these ML models for biomedical image processing tasks. For instance, the International Symposium on Biomedical Imaging (ISBI) hosts yearly competitions focused on various biomedical imaging challenges. One notable problem from the ISBI involves segmenting neuronal structures in electron microscopy stacks.

Why Do Trees Outperform Neural Networks on Tabular Data?

Image by Todd Quackenbush

For the past 30 years, tree-based algorithms such as AdaBoost and Random Forests have been the go-to methods for solving tabular data problems. While neural networks (NNs) have been used in this context, they have historically struggled to match the performance of tree-based methods. Despite recent advancements in NN capabilities and their success in tasks such as computer vision, language translation, and image generation, tree-based algorithms still outperform neural networks when it comes to tabular data. This article will introduce several reasons behind the continued dominance of tree-based methods in this domain.

About Me

Hi, I’m Ben, a Foundation ML Engineer based in London, currently working at the startup Pharmovo, where we apply AI solutions to forecast the demand for pharmaceutical drugs in the US and UK. Prior to this role, I worked as a data scientist at the startup Eligible, where we applied data-driven approaches to predict the actions of mortgage users. My final work experience was at NatWest Markets, where I applied ML methods to determine whether bonds were over- or under-priced within a portfolio.