Sarra Bouhenni

Realtime ML researcher

About Me

I enjoy building fast, efficient and scalable algorithms for big graphs and big data in general. I hold a PhD in data science that involved designing distributed algorithms for graph pattern matching on current Think-Like-A-Vertex graph processing systems. I am most skilled in Spark and Scala.



Realtime ML researcher

July 2022 - Present

I design large-scale parallel and distributed machine learning algorithms for streaming data.

Ubiquity.AI helps Ad Networks increase their ROI by removing ad decisioning bottlenecks, thanks to a proprietary Decision as a Service platform that increases their margin by up to 300% while preserving scale. Unlike solutions that rely on constant manual tuning or on general-purpose Machine Learning not fitted to all KPIs, Ubiquity.AI empowers Ad Networks to grow their margin and customer base without growing their operational costs.

CERIST Research Center

Research Associate

February 2022 - June 2022

I worked on distributed algorithms for big graphs and on big data processing platforms.

My research interests include distributed algorithms, graph algorithms, graph pattern matching, ML applied to graphs, and graph data models.

CERIST Research Center

Graduate Teaching Assistant

June 2019 - March 2021

I delivered the programming course Big Data - Scala as part of the PGS postgraduate training program. It covers the basic syntax of the Scala programming language, functional programming with Scala, and Scala collections.

I also delivered the introductory course Frameworks for big graphs - GraphX.


Data Science Consultant

August 2017 - December 2017

I worked on improving the logging system of the ride-hailing (VTC) application Yassir. My responsibilities mainly included:

  • Extraction of the log events from a Kafka server using a logger implemented with Node.js.
  • Transformation of the events into comprehensive information to be stored on an Elasticsearch server.
  • Creation of dashboards for the visualization and reporting of logs using Kibana.


Data Science Intern

September 2016 - June 2017

For my final-year engineering project, I worked on the design and development of a monitoring system for predicting plant diseases in Algeria, using incremental machine learning algorithms. My responsibilities in this project included:

  • Collecting data on plant disease prediction for two diseases, Fusarium Head Blight and Potato Mildew.
  • Designing a new algorithm for plant disease prediction based on incremental learning.
  • Designing and developing a backend for disease prediction based on current weather conditions, using Python and Django.
  • Designing and developing a web application, using Angular, for monitoring and visualizing the spread of disease across Algerian crops.


Université Claude Bernard - Lyon 1 and Ecole Supérieure d'Informatique - Alger

Joint PhD in Computer Science

2018 - 2021

My PhD addressed the problem of Graph Pattern Matching (GPM) in the context of large-scale distributed graphs. The dissertation can be downloaded via this link: Parallel and Distributed Algorithms for Pattern Matching in Big Graphs

Ecole Supérieure d'Informatique

Engineering degree in IT Systems

2012 - 2017

My final-year project was the development of a plant disease forecasting platform. The thesis can be downloaded via this link: APDM : vers une plateforme intelligente pour la prevision des maladies végétales

Technical Experience

  • Programming Languages
    • Scala: I have 5 years of experience with Scala and functional programming. I have implemented several distributed and parallel graph algorithms with Scala, Spark RDDs and GraphX.
    • Python: I worked with Python for one year during my final-year engineering project, implementing several data science algorithms as well as a web application using Django. I also used Python on a data science project at Ya Technologies.
    • Java: I work with Java on a daily basis.
    • Basic knowledge of C, C++
  • Big data platforms
    • Apache Spark: I have 4 years of experience with Spark and its GraphX module, designing graph pattern matching algorithms for massive graphs. I am also responsible for the deployment and management of the big data platform at the CERIST research center.
    • Elastic Stack: I worked with the different components of the Elastic Stack, including Logstash, Elasticsearch and Kibana, as a data science consultant at Yassir. I also have experience deploying and managing an Elasticsearch cluster in a production environment using Docker and Kubernetes.

Spoken Languages

Arabic (native speaker) – French – English