Alex Ndungu

CTO + Software Engineer + ML Engineer


Backend systems, machine learning retrieval, and clean product-minded engineering for teams that care about reliability.

GitHub | LinkedIn | LeetCode | alexmeta517@gmail.com

Projects

Case studies built to show system thinking, not just finished screenshots.

These projects are framed the way hiring teams evaluate engineering work: problem definition, architecture quality, implementation choices, and what the system proves.

Harlem Manage
Multi-Tenant Real Estate Operating System

Built the full backend of a multi-tenant proptech platform: 46-tool role-based AI agents, real-time M-Pesa WebSocket payment flows, a 3-queue Celery task system, and a PII-scrubbing observability stack.

Harlem Manage is a Kenya-first, multi-tenant real estate operating system built for landlords, agencies, and property firms. It combines property workflows, tenant and lease management, financial reconciliation, communication channels, and Akoko, a deployed operational intelligence layer that adapts by role.

Next.js, Django, Django REST Framework, PostgreSQL, M-PESA, WhatsApp/SMS, Akoko AI

46 AI agent tools
4 RBAC user roles
3 Celery queue types

Architecture highlights

  • Role-based AI agent system (GPT-4.1 Responses API) with dynamic tool selection across 4 roles and 46 tools, with a full audit trail per message for compliance
  • M-Pesa C2B STK Push with OAuth2, field-level encrypted credentials (per organisation + property), and live payment status streamed via JWT-authenticated WebSocket consumers over Redis
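
The role-based tool selection and per-message audit trail described for Harlem Manage might look roughly like the sketch below. The repository is private, so every name here (the `Tool` class, the role names, the four sample tools, the `AUDIT_LOG` list) is a hypothetical stand-in, and the model call itself is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Tool:
    name: str
    allowed_roles: frozenset


# Hypothetical roles and tools: the source states only "4 roles and 46 tools".
REGISTRY = [
    Tool("list_properties", frozenset({"landlord", "agent", "caretaker"})),
    Tool("reconcile_payments", frozenset({"landlord", "agent"})),
    Tool("view_my_lease", frozenset({"tenant"})),
    Tool("report_maintenance_issue", frozenset({"tenant", "caretaker"})),
]

# In-memory stand-in for a persistent audit table.
AUDIT_LOG = []


def tools_for(role):
    """Return the names of the tools a given role is allowed to use."""
    return [tool.name for tool in REGISTRY if role in tool.allowed_roles]


def handle_message(user_id, role, message):
    """Select tools for the caller's role and record one audit entry."""
    offered = tools_for(role)
    AUDIT_LOG.append(
        {
            "at": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "role": role,
            "message": message,
            "tools_offered": offered,
        }
    )
    # A real agent would now pass `offered` to the model as its tool list.
    return offered
```

Filtering the tool list before the model sees it means a role cannot invoke a tool it was never offered, and the audit entry records what was exposed for each message.
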
Deep dive | Private Repo

CodePinion
Real-Time Developer Collaboration Platform

An open-source developer Q&A platform built across 3 collaboration modes — async threads, real-time chat, and integrated video calls — moving knowledge sharing from static forum searches to live problem solving.

CodePinion is a developer Q&A platform designed to close the gap between the person asking and the person best positioned to help. Rather than forcing developers through slow async threads, it layers real-time chat and video calling on top of a persistent Q&A base so problems can be worked through in context.

JavaScript, HTML, CSS, Python, Node.js, Real-time communication, Video integration

3 collaboration modes
Open source: public GitHub repo
Full-stack JS: frontend + backend

Architecture highlights

  • 3 collaboration modes in one product: persistent Q&A threads, real-time chat, and integrated video calling
  • Real-time signaling layer for WebSocket-based chat and video session coordination
Deep dive | GitHub

Catalog-Point
Library Management System

A full-stack library operations system with 5 core relational models, 2 user role types, a borrowing transaction engine with date-based cost calculation, and a deployment-ready Django stack.

Catalog-Point is a Django-based library management system covering the full operational surface of a real library: inventory tracking, category management, borrowing workflows, cost calculation, approval states, return handling, and user activity history — for both librarians and members.

Django, Python, PostgreSQL, HTML, CSS, JavaScript, Django Allauth, Gunicorn

5 core relational models
2 user role types
Deployed: Gunicorn + WhiteNoise

Architecture highlights

  • 5 relational models (profiles, categories, books, costs, and transactions) covering the full operational surface
  • 2 user role types with separated workflows: librarian administration and member-facing catalog access
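
The date-based cost calculation in Catalog-Point's borrowing engine could be sketched as below. The source does not state the pricing rules, so the daily rate, the late rate, the one-day minimum charge, and the `Borrowing` fields are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class Borrowing:
    borrowed_on: date
    due_on: date
    returned_on: Optional[date] = None  # None while the book is still out


def borrowing_cost(
    borrowing,
    today,
    daily_rate=Decimal("10.00"),  # hypothetical rate per regular day
    late_rate=Decimal("25.00"),   # hypothetical rate per day past the due date
):
    """Cost accrued so far: regular days at daily_rate, late days at late_rate."""
    if borrowing.due_on < borrowing.borrowed_on:
        raise ValueError("due date is before the borrow date")
    end = borrowing.returned_on or today
    if end < borrowing.borrowed_on:
        raise ValueError("end date is before the borrow date")
    days_held = max((end - borrowing.borrowed_on).days, 1)  # charge at least one day
    days_late = max((end - borrowing.due_on).days, 0)
    regular_days = days_held - days_late
    return regular_days * daily_rate + days_late * late_rate
```

Using `Decimal` rather than floats keeps currency arithmetic exact, which matters once costs are summed across many transactions.
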
Deep dive | GitHub

E-Commerce Backend System
Production Backend Engineering Project

A backend commerce API built across 3 service domains (catalog, cart, order) with JWT-authenticated role-aware authorization and a relational schema optimized for checkout and order lifecycle workflows, paired with a separate React frontend across 2 public repos.

A backend-first commerce platform focused on clear domain separation, predictable API behavior, and a schema that supports catalog, cart, and order lifecycles without coupling everything into a single service layer. Paired with a public React frontend repo.

Java, Spring Boot, PostgreSQL, JWT Auth, React, REST APIs

3 service domains
Auth layer: JWT + RBAC
2 repos: frontend + backend

Architecture highlights

  • 3 service domains (catalog, cart, order) separated into composable, independently testable layers
  • JWT-authenticated role-aware authorization enforced at the API boundary
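
Role-aware authorization at the API boundary generally comes down to two checks: the token's signature and expiry, then the role claim. The project itself is Java and Spring Boot; the sketch below uses Python's standard library only to show the idea, and the secret, claim names, and function names are assumptions rather than the project's code.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # hypothetical key; a real service loads this from config


def _b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input):
    digest = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(subject, role, ttl_seconds=3600, now=None):
    """Build an HS256-signed JWT carrying a subject, a role, and an expiry."""
    now = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": subject, "role": role, "exp": now + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(f'{header}.{payload}')}"


def verify_token(token, now=None):
    """Check signature and expiry; always verifies as HS256, ignoring the header."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        raise PermissionError("malformed token") from None
    expected = _sign(f"{header}.{payload}")
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        raise PermissionError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    now = int(time.time()) if now is None else now
    if claims["exp"] <= now:
        raise PermissionError("token expired")
    return claims


def authorize(token, allowed_roles, now=None):
    """Gate an endpoint: valid token first, then the role check."""
    claims = verify_token(token, now)
    if claims["role"] not in allowed_roles:
        raise PermissionError("role not allowed")
    return claims
```

Running the check once at the boundary lets the catalog, cart, and order layers assume an already-authorized caller.
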
Deep dive | Frontend | Backend

ML Search Engine
Hybrid ML Search & Retrieval System

A 2-stage hybrid retrieval system trained on 45,000 StackOverflow records — SGD-based tag prediction for query expansion feeding into TF-IDF vectorization with cosine similarity ranking.

A machine learning search system built on ~45,000 StackOverflow records. The key insight was that a single retrieval technique misses intent — so the pipeline runs in 2 stages: classify the query to predict missing context tags, then use those enriched tags to improve the similarity search.

Python, Pandas, scikit-learn, TF-IDF, NLP preprocessing, Cosine similarity

45K+ training records
2-stage hybrid retrieval pipeline
Model combination: SGD + TF-IDF

Architecture highlights

  • 45,000 StackOverflow records processed through an HTML cleaning, normalization, and tokenization pipeline
  • 2-stage hybrid pipeline: SGD tag prediction for query expansion → TF-IDF + cosine similarity retrieval
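
A minimal sketch of the 2-stage idea with scikit-learn follows. The six records are toy stand-ins for the cleaned StackOverflow data, and indexing each document as "tag + text" is an assumption about how the predicted tag feeds retrieval; the real features and hyperparameters are not given in the source.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.pipeline import make_pipeline

# Toy (title, tag) pairs standing in for the cleaned records (hypothetical data).
DOCS = [
    ("How to merge two dataframes on a column", "pandas"),
    ("Group rows of a dataframe and sum a column", "pandas"),
    ("Filter a queryset by a foreign key field", "django"),
    ("Add a custom field to a model form", "django"),
    ("Center a div horizontally with flexbox", "css"),
    ("Change the font size of a heading", "css"),
]
texts = [text for text, _ in DOCS]
tags = [tag for _, tag in DOCS]

# Stage 1: a linear classifier trained with SGD predicts a tag for the query.
tagger = make_pipeline(TfidfVectorizer(), SGDClassifier(random_state=0))
tagger.fit(texts, tags)

# Stage 2: index "tag + text" so an added tag term can match documents.
indexer = TfidfVectorizer()
doc_matrix = indexer.fit_transform([f"{tag} {text}" for text, tag in DOCS])


def search(query, top_k=3):
    """Expand the query with its predicted tag, then rank by cosine similarity."""
    predicted_tag = tagger.predict([query])[0]
    expanded = f"{query} {predicted_tag}"
    scores = cosine_similarity(indexer.transform([expanded]), doc_matrix)[0]
    ranked = scores.argsort()[::-1][:top_k]
    return predicted_tag, [(texts[i], float(scores[i])) for i in ranked]
```

Calling `search("merge dataframes on a column")` returns the predicted tag plus the top-ranked titles with their scores. The expansion step matters most for short queries that omit the technology name.
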
Deep dive | GitHub

Movie Recommendation System
Content-Based Recommendation Engine

A 3-stage content-based recommendation pipeline — metadata extraction, vector representation, and cosine similarity scoring — that generates explainable suggestions with no user interaction data required.

A content-based recommender that processes movie metadata through 3 explicit pipeline stages: feature extraction, vector representation, and similarity scoring. The design prioritizes explainability — every suggestion is traceable to specific shared metadata signals rather than opaque collaborative filtering.

Python, Pandas, scikit-learn, Feature engineering, Similarity modeling

3 pipeline stages
Content-based: no user data needed
Explainable: traceable recommendations

Architecture highlights

  • 3-stage pipeline: metadata cleaning + feature extraction → vector representation → cosine similarity scoring
  • Content-based approach requiring zero user interaction data, with recommendations driven purely by metadata signals
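
The three stages and the explainability claim can be illustrated in a few lines of plain Python. This is a simplified sketch, not the project's code: the movie metadata is hypothetical, vectors are binary and stored as sets of active features, and the project lists scikit-learn where this version uses only the standard library.

```python
import math

# Hypothetical metadata; the real dataset and its fields are not given in the source.
MOVIES = {
    "Alien": {"genres": ["sci-fi", "horror"], "director": "Ridley Scott"},
    "Blade Runner": {"genres": ["sci-fi"], "director": "Ridley Scott"},
    "The Thing": {"genres": ["sci-fi", "horror"], "director": "John Carpenter"},
    "Notting Hill": {"genres": ["romance", "comedy"], "director": "Roger Michell"},
}


def extract_features(meta):
    """Stage 1: flatten metadata into namespaced feature tokens."""
    features = {f"genre:{genre}" for genre in meta["genres"]}
    features.add(f"director:{meta['director']}")
    return features


# Stage 2: one binary vector per movie, represented as its set of active features.
VECTORS = {title: extract_features(meta) for title, meta in MOVIES.items()}


def cosine(a, b):
    """Cosine similarity of two binary vectors stored as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend(title, top_k=2):
    """Stage 3: score every other movie and keep the shared features as the reason."""
    target = VECTORS[title]
    scored = [
        (other, cosine(target, features), sorted(target & features))
        for other, features in VECTORS.items()
        if other != title
    ]
    scored.sort(key=lambda row: row[1], reverse=True)
    return scored[:top_k]
```

Each result carries the list of shared features, so a suggestion can be explained directly, for example "same director and genre", without any user interaction data.
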
Deep dive | GitHub