
About me in numbers
Projects
A small selection of data science projects I have worked on.
Site Selection Engine
Intelmatix, 2022 - present
A location intelligence engine that models the urban dynamics of a city to perform tasks such as real estate valuation or site selection for a new quick-service restaurant (QSR) branch. Relying on a graph-based saturation model, a branch cannibalization model, accessibility scoring for POIs and demographic groups, route optimization, and a catchment model, it can predict future sales for unseen locations and recommend the highest-impact site for a new branch within the city.
The value prediction model receives inputs such as accessibility scores, market saturation, a cannibalization score calculated from a location's catchment and its surrounding road network, and several demographic variables. It can be trained to predict outputs such as future sales for a site or the value of land or real estate. Thanks to this flexibility in the target variable, the site selection engine can be scaled to several use cases in urban and geospatial analytics.


Healthcare Policy Simulator
Intelmatix, 2022 - present
A policy simulation engine that identifies the relationships between the development of the healthcare practitioner workforce in a country and various government policy changes related to immigration, medical education, workforce allocation, and employment requirements.
The effect of policy changes is measured by the resulting number of practitioners per 1000 population per specialization. At the beginning of every simulation year, a new trainee cohort is initialized for each specialization with particular characteristics such as demographics and university GPA. Each trainee agent acts according to their own specialization requirements and training center environment. For example, they might take various exams, decide to take a gap year, switch specializations, or drop out. Eventually, the graduated trainees become senior registrars, who may later become consultants. At the end of the simulation the total workforce is captured.
The goal is to help healthcare system decision-makers understand the impact of potential policies and to recommend the right policy to them based on national targets such as reaching a certain number of registered nurses or improving healthcare training quality.
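The cohort mechanics described above can be illustrated with a heavily simplified sketch. Everything in it is an assumption made for illustration: the `Trainee` class, the `policy` dictionary and its keys (`intake`, `dropout_rate`, `gap_year_rate`, `training_years`), and the fixed yearly probabilities. It leaves out exams, specialization switching, and the registrar-to-consultant ladder, and it is not the production engine.

```python
import random
from dataclasses import dataclass


@dataclass
class Trainee:
    specialization: str
    years_done: int = 0
    status: str = "training"  # training | gap | graduated | dropped


def step(trainee, policy, rng):
    """Advance one trainee by one simulation year (hypothetical rules)."""
    if trainee.status in ("graduated", "dropped"):
        return
    roll = rng.random()
    if roll < policy["dropout_rate"]:
        trainee.status = "dropped"
    elif roll < policy["dropout_rate"] + policy["gap_year_rate"]:
        trainee.status = "gap"  # no training progress this year
    else:
        trainee.status = "training"
        trainee.years_done += 1
        if trainee.years_done >= policy["training_years"]:
            trainee.status = "graduated"


def simulate(policy, years, population, seed=0):
    """Return graduated practitioners per 1000 population per specialization."""
    rng = random.Random(seed)
    trainees = []
    for _ in range(years):
        # A new cohort is initialized at the start of every simulation year.
        for spec, intake in policy["intake"].items():
            trainees.extend(Trainee(spec) for _ in range(intake))
        for trainee in trainees:
            step(trainee, policy, rng)
    counts = {}
    for trainee in trainees:
        if trainee.status == "graduated":
            counts[trainee.specialization] = counts.get(trainee.specialization, 0) + 1
    return {spec: 1000 * n / population for spec, n in counts.items()}
```

Comparing two policies then amounts to calling `simulate` twice with different `policy` dictionaries and comparing the resulting densities per specialization.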
Unstructured Financial Data Extractor
Dealogic, 2021 - 2022
A data wrangling and extraction engine that relies on various NLP methods such as named-entity recognition, sentiment analysis, and text mining to extract highly accurate financial data from unstructured sources like free text in legal documents published by the US Securities and Exchange Commission (SEC).
The goal of the project was to build a graph database to store the relationships of special purpose acquisition companies (SPACs), their acquired companies, their sponsors and sponsor affiliates, as well as all involved individuals’ connections to these entities. In addition to capturing such relationships, we also wanted to collect further information about the entities themselves, such as stock and warrant prices, ticker symbols and stock exchanges, purchase conditions, and warrant expiry dates.
The raw filings are collected and the data points extracted in near real time, with minimal delay after publication by the SEC. An NER model was trained on historical SEC documents, and it can process and extract data from various filing types such as 424B4, S10, and quarterly and annual reports.
Automated Credit Underwriting Algorithm
Anyfin, 2019 - 2020
A credit underwriting algorithm that takes a loan applicant profile as input and decides whether or not to offer the applicant an option to refinance their loan. If an offer is made, the algorithm also recommends the optimal interest rate for the applicant based on an internal credit scoring model, their probability of default, payment history, previous loan applications, and other variables.
My most significant contribution to this algorithm was improving our default prediction model by developing a new feature scoring method that relied on the Apriori algorithm and on weight of evidence and information value statistics.
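For readers unfamiliar with the two statistics, here is a minimal sketch of how weight of evidence (WoE) and information value (IV) are commonly computed for a single binned feature. It is not the scoring method used at Anyfin and does not cover the Apriori step. The function name `woe_iv`, the `smoothing` parameter, and the sign convention (WoE as the log of the non-defaulter share over the defaulter share) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def woe_iv(values, defaulted, smoothing=0.5):
    """Return ({bin: WoE}, IV) for one categorical or pre-binned feature.

    values    -- bin label of the feature for each applicant
    defaulted -- truthy if the applicant defaulted, falsy otherwise
    smoothing -- pseudo-count per bin to avoid log(0) (an assumption)
    """
    good, bad = Counter(), Counter()
    for value, is_default in zip(values, defaulted):
        (bad if is_default else good)[value] += 1

    bins = sorted(set(good) | set(bad))
    total_good = sum(good.values()) + smoothing * len(bins)
    total_bad = sum(bad.values()) + smoothing * len(bins)

    woe, iv = {}, 0.0
    for b in bins:
        share_good = (good[b] + smoothing) / total_good
        share_bad = (bad[b] + smoothing) / total_bad
        woe[b] = math.log(share_good / share_bad)
        # IV sums the WoE of each bin weighted by the share difference.
        iv += (share_good - share_bad) * woe[b]
    return woe, iv
```

A feature whose bins separate defaulters from non-defaulters yields a large IV, while a feature with identical shares in every bin yields an IV of zero, which is what makes IV usable as a feature ranking score.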
Temporal Risk Evaluator of Rising Household Debt
Stockholm University, 2019 - 2020
A risk assessment model that examines the effect of rising household debt on the financial markets in Sweden based on various time periods before, during and after the 2008 financial crisis.
For several years, household debt-to-GDP levels in Sweden have been close to those in the US prior to the 2008 financial crisis, coupled with steady growth in housing prices. Because the bursting of the US housing bubble triggered the crisis and had ripple effects on the financial markets, institutions such as the Swedish Central Bank and the National Institute of Economic Research in Sweden have been raising concerns.
This model was built to understand the market risk posed by the recent record-high household debt-to-GDP levels, consisting mainly of mortgages, while controlling for variables such as consumer price index, repo rate, industrial production, oil prices, unemployment rate and household income. The financial markets were captured by major stock indices such as OMXS30, yield curves on long-term government bonds, and volatility indices like SIXVX.
Renewable Energy Policy Recommender
Stockholm University, 2018 - 2019
A comparative model of countries’ approach to low-emission renewable energy policy under liberalized energy markets. The aim of the model is to identify high-impact policies for maximizing the share of renewable energy in a country’s energy mix.
In one application we found that an indirect promotion mechanism through tax incentives, combined with direct subsidies for low-emission energy projects, low market entry barriers, precisely measurable energy targets, and minimal regulatory changes, contributed significantly to the record share of renewable energy achieved by Denmark.
A country whose geography, economic environment and market conditions, technological maturity and grid capacity, and cognitive environment and public awareness are similar to Denmark's is likely to benefit from such policy interventions as well. One such country is the Netherlands, whose historical policy decision points marked a key divergence from Denmark in its progress toward maximizing the share of low-emission renewable energy sources in gross national energy consumption.
Contact
