Design Engineering
Showcase 2021

Digital Cognitive Companion

Machine Learning
Artificial Intelligence

Project Details

Sean Bazanye-Lutu
Design Engineering MEng
Professor Robert Shorten
Masters Project
Personal website

Currently, more than 3 billion people rely on voice-activated search devices and assistants to navigate their day-to-day lives, highlighting how dependent we have become on these products and systems. However, our current interactions with these systems are very one-sided: we must initiate the engagement before the system can be of any use or value. As a result, the best opportunity to meet our needs can be missed because the system was initiated moments too late. What if these systems could generate their own value, providing users with key information before they even knew they needed it?

A picture showing an example situation where the mobile digital companion has informed the user that they would need to leave 10 minutes earlier than usual to make it to university on time.


This project focuses on one potential element of a fully realised “Digital Companion”: commuter route planning that proactively provides the user with route options that factor in current traffic conditions and risks. This would enable users to minimise the effect that unexpected travel conditions have on their daily commute. To accomplish this, the system should be able to build a model of the user’s daily behaviour and pre-empt actions before they occur, functioning as more of a “Companion” than an “Assistant”.


The proposed framework consists of 3 main layers: a frontend mobile app, which interacts with the user and outputs information from the system; a middleware server, which processes user requests and predicts destinations; and a Flask reinforcement learning backend server, which optimises and stores routes for a given user.

A flow diagram representing how the mobile app, middleware server and backend server interact and pass information.

Mobile App

The mobile app is responsible for communicating information to the user. It does this in 2 main ways: a message alert and an audible speech relay. Together these ensure the information is communicated effectively and remove the need for the user to be near, or actively using, the device at the time. In addition, the app accesses the user’s behavioural information and relays it to the middleware server for processing. This behavioural information comprises location and time data representing the user’s journeys throughout the day. Overnight, the app triggers the backend Flask server to train and optimise routes using the user’s historical data.

Middleware Server

The middleware server is hosted on IBM Cloud, as it uses IBM chatbot functionality to process and respond to user requests. It also houses the destination prediction algorithm, which relies on the Markovian formalism and can be represented as a Markov chain. From the user’s historical data, the system builds a probabilistic state representation that enables it to predict the user’s next state based on current conditions. This transition information becomes more accurate as more behavioural data is collected and will converge to a stationary distribution.
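The Markov chain idea described above can be illustrated with a minimal sketch. This is not the project's implementation: the class name `DestinationPredictor`, its methods, and the example journeys are all hypothetical, and states are assumed to be plain place names (the real system may also condition on time of day or other context). Transition probabilities are estimated simply by counting observed journeys.

```python
from collections import Counter, defaultdict


class DestinationPredictor:
    """Hypothetical first-order Markov chain over places the user visits."""

    def __init__(self):
        # counts[origin][destination] = number of observed journeys
        self.counts = defaultdict(Counter)

    def observe(self, origin, destination):
        """Record one journey from origin to destination."""
        self.counts[origin][destination] += 1

    def transition_probabilities(self, origin):
        """Estimate P(next place | current place) from the journey counts."""
        seen = self.counts.get(origin)
        if not seen:
            return {}
        total = sum(seen.values())
        return {dest: n / total for dest, n in seen.items()}

    def predict(self, origin):
        """Return the most likely next place, or None if origin is unseen."""
        probs = self.transition_probabilities(origin)
        if not probs:
            return None
        return max(probs, key=probs.get)


# Made-up journey history, for illustration only.
history = [
    ("home", "university"),
    ("university", "gym"),
    ("gym", "home"),
    ("home", "university"),
    ("home", "work"),
    ("university", "home"),
]

predictor = DestinationPredictor()
for origin, destination in history:
    predictor.observe(origin, destination)

prediction = predictor.predict("home")
```

With this history, two of the three journeys from "home" go to "university", so that is the predicted destination. As more journeys are recorded, the estimated probabilities rely less on any single observation.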

Backend Server

The Flask backend server is where user routes are trained and calculated before being passed back to the frontend app. Reinforcement learning has been implemented in this study because of its ability to make decisions in response to the environment. This is important because, in real applications, traffic conditions change constantly.

A deep reinforcement learning approach has been used for this project to enable routing decisions to be based on the user's position and the state of a public transport network. These routing decisions are made by an Agent, which is trained to be able to make optimal decisions within a simulated environment.

A flow diagram representing how the Agent is trained and the interaction between the Agent and the environment. After initialisation, the pedestrian is loaded into the environment and an initial state is observed. The Agent then looks at the possible actions the state affords and selects an action. The environment processes this decision and calculates a reward alongside the new state. The Agent stores the old state, action, new state, and reward in memory and updates the neural network. The server then checks whether the pedestrian is at the intended destination, terminating the simulation if so and repeating the process if not.
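The training loop in the diagram can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the project's code: a tabular Q-learning update stands in for the neural network, the reward is assumed to be the negative travel time, and the network `NETWORK`, the `Environment` class, and all parameter values are hypothetical.

```python
import random
from collections import defaultdict, deque

# Hypothetical toy network: stop -> {neighbouring stop: travel time in minutes}
NETWORK = {
    "home": {"stop_a": 5, "stop_b": 9},
    "stop_a": {"home": 5, "stop_b": 3, "university": 12},
    "stop_b": {"home": 9, "stop_a": 3, "university": 4},
    "university": {},
}


class Environment:
    """Simulated network in which the pedestrian moves between stops."""

    def __init__(self, network, start, destination):
        self.network = network
        self.start = start
        self.destination = destination
        self.position = start

    def reset(self):
        """Load the pedestrian at the start and return the initial state."""
        self.position = self.start
        return self.position

    def actions(self, state):
        """Actions the state affords: the neighbouring stops."""
        return list(self.network[state])

    def step(self, action):
        """Apply the action; return (new state, reward, done)."""
        reward = -self.network[self.position][action]  # assumed reward
        self.position = action
        return self.position, reward, self.position == self.destination


def train(env, episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)       # Q-table standing in for the neural network
    memory = deque(maxlen=1000)  # stored (state, action, new state, reward)
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            options = env.actions(state)
            if rng.random() < epsilon:  # explore occasionally
                action = rng.choice(options)
            else:                       # otherwise act greedily
                action = max(options, key=lambda a: q[(state, a)])
            new_state, reward, done = env.step(action)
            memory.append((state, action, new_state, reward))
            future = 0.0 if done else max(
                q[(new_state, a)] for a in env.actions(new_state))
            q[(state, action)] += alpha * (
                reward + gamma * future - q[(state, action)])
            state = new_state
    return q, memory


def greedy_route(env, q, max_steps=20):
    """Follow the highest-valued action from the start to the destination."""
    state = env.reset()
    route = [state]
    for _ in range(max_steps):
        if state == env.destination:
            break
        action = max(env.actions(state), key=lambda a: q[(state, a)])
        state, _, _ = env.step(action)
        route.append(state)
    return route


env = Environment(NETWORK, "home", "university")
q, memory = train(env)
route = greedy_route(env, q)
```

Each pass through the inner loop mirrors one cycle of the diagram: observe the state, select an action, receive the reward and new state, store the transition, update the value estimates, and check for arrival. A deep reinforcement learning agent would replace the table `q` with a neural network trained on batches sampled from `memory`.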