Rosas Behoundja

Projects

Here is a selection of projects I have worked on.

MPVRP-CC

Proposal for a variant of the multi-product vehicle routing problem with production changeover costs. This problem models the distribution of several products while taking into account cleaning constraints between compartments.

Personal contribution: Technical coordination of the project, development of the API, website, and contribution to other aspects.

Combinatorial optimization, MILP, Operations research


AI-PigStack

Autonomous IoT optimization system for pig-fattening farms, integrating multi-objective predictive models. Proof-of-concept phase completed with environmental sensors and an ML pipeline for growth prediction and early disease detection.

Smart farming, IoT, Edge computing, Embedded systems


tiny-language-model

From-scratch implementation of an autoregressive language model following the GPT architecture (decoder-only transformer). 37M-parameter model with 12 layers, trained on the LeCarnet corpus (French) using PyTorch. In-depth exploration of attention mechanisms, positional encoding, and scaling laws.

Transformer architecture, GPT, PyTorch, Deep learning, NLP
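
To illustrate the attention mechanism at the core of a decoder-only transformer, here is a minimal sketch of single-head scaled dot-product attention with a causal mask. It is written in plain Python for readability and is not the project's PyTorch code; the function names and the list-of-lists representation are assumptions made for this example, and learned projections, multiple heads, and positional encoding are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v: lists of T vectors (lists of floats), one per position.
    Position t may only attend to positions 0..t (hypothetical helper,
    not taken from the project).
    """
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Causal mask: score the current position against itself and
        # earlier positions only; later positions are never visited.
        scores = [
            sum(a * b for a, b in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the visible value vectors.
        out.append([
            sum(w * v[s][j] for s, w in enumerate(weights))
            for j in range(len(v[0]))
        ])
    return out
```

Because each position only looks backwards, changing a later token leaves the outputs of earlier positions untouched, which is what makes autoregressive generation possible.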


Sentimaster

ETL platform for multilingual semantic analysis of user feedback aggregated from X, Hellopeter, and Google Maps. Complete pipeline: API extraction, transcription (Whisper), sentiment classification (fine-tuned RoBERTa), topic modeling (BERTopic), and emotion detection. Airflow orchestration with a real-time dashboard.

Sentiment analysis, BERTopic, Hugging Face, Data engineering, ETL pipeline


Opti'plan

Optimization system for automated scheduling of thesis defenses at the University of Abomey-Calavi. Modeling as a constraint satisfaction problem with hard and soft constraints. Implementation of three approaches: a CP model (OR-Tools) and two greedy heuristics with backtracking. Reduction of planning time from 2-3 days to under 5 minutes.

Constraint programming, CSP, Heuristics, OR-Tools, Decision support
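
As a rough illustration of scheduling as a constraint satisfaction problem, the sketch below assigns defenses to time slots with plain backtracking. It is not the project's CP model or either of its heuristics: the data model (a jury set per defense, a fixed number of rooms per slot) and every name in it are assumptions for this example, and only two hard constraints are modeled (no jury member in two defenses at once, no more defenses per slot than rooms). Soft constraints are left out.

```python
def schedule_defenses(defenses, slots, rooms_per_slot):
    """Assign each defense a slot by backtracking search.

    defenses: dict mapping defense name -> set of jury member names.
    slots: list of slot labels.
    rooms_per_slot: maximum number of defenses that can run in parallel.
    Returns a dict defense -> slot, or None if no assignment exists.
    """
    # Place defenses with larger juries first, since they conflict more.
    order = sorted(defenses, key=lambda d: -len(defenses[d]))
    assignment = {}

    def conflicts(defense, slot):
        same_slot = [d for d, s in assignment.items() if s == slot]
        if len(same_slot) >= rooms_per_slot:
            return True  # hard constraint: no free room in this slot
        # Hard constraint: a jury member cannot sit on two defenses at once.
        return any(defenses[defense] & defenses[d] for d in same_slot)

    def backtrack(i):
        if i == len(order):
            return True  # every defense has a slot
        defense = order[i]
        for slot in slots:
            if not conflicts(defense, slot):
                assignment[defense] = slot
                if backtrack(i + 1):
                    return True
                del assignment[defense]  # undo and try the next slot
        return False

    return dict(assignment) if backtrack(0) else None
```

For example, with defenses `{"A": {"p1", "p2"}, "B": {"p2", "p3"}, "C": {"p4"}}`, two slots, and two rooms per slot, A and B end up in different slots because they share jury member `p2`.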


Fluxy

Web application for automatic extraction of bank transactions from PDF/image statements via multimodal OCR (Gemini Vision API). Complex table parsing, normalization of heterogeneous banking formats, and validation through business rules. 90% reduction in manual entry time with a <2% error rate over a 2-month pilot.

Gemini OCR, Computer vision, Robotic process automation (RPA), FinTech, Full-stack web
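
To give an idea of what validation through business rules can look like for extracted statements, here is a hypothetical balance-consistency check: the opening balance plus all extracted amounts should equal the closing balance. The rule itself, the function name, and the string-based amounts are assumptions for this sketch and are not taken from the application.

```python
from decimal import Decimal


def check_balance(opening, closing, amounts):
    """Return True when opening + sum(amounts) equals closing exactly.

    All values are decimal strings (e.g. "-30.00") so that no
    floating-point rounding error enters the comparison.
    """
    total = sum((Decimal(a) for a in amounts), Decimal("0"))
    return Decimal(opening) + total == Decimal(closing)
```

A statement whose rows were all read correctly passes the check, while a single misread amount makes it fail, which flags the statement for manual review.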


ifri-mini-ml-lib

Educational Python library reimplementing foundational ML algorithms from scratch, following the scikit-learn API. Personal contribution: implementation of the association rules module (Apriori, Eclat, FP-Growth) with optimizations, development coordination (10+ contributors), and PyPI deployment with CI/CD.

Software engineering, Algorithms, Data mining, Association rules, CI/CD, Open source
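
For readers unfamiliar with association rule mining, the following is a minimal sketch of the Apriori frequent-itemset search. It is a plain function written for this page, not the library's code and not its scikit-learn-style interface; the name `apriori` and its parameters are assumptions for this example, and none of the library's optimizations are reproduced.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets.

    transactions: list of sets of items.
    min_support: minimum fraction of transactions containing the itemset.
    """
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {c for c in current if support(c) >= min_support}

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent
```

The prune step is what distinguishes Apriori from brute-force enumeration: a candidate is discarded without counting its support as soon as one of its subsets is known to be infrequent.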


Le Foncier intelligent

Land analysis solution developed in 72h (LuxDev hackathon) combining multimodal OCR and geospatial data. Geometric data extraction from topographic sketches, geocoding, and land use analysis through cross-referencing with satellite imagery. Stack: Next.js, FastAPI, Leaflet.js.

Geospatial, Gemini OCR, Computer vision, Image classification, Web development

COVID-Vaccine-GDP Analysis

Data analysis project investigating the correlation between COVID-19 vaccination rates and GDP per capita across countries over the 2020-2023 period. Data cleaning, exploratory data analysis, and visualization using R. Findings indicate a positive correlation between GDP per capita and vaccination coverage.

R, Data analysis, Econometrics, ETL pipeline, Data cleaning, COVID-19 vaccination