
About Me

Hi! I'm Denilson. I recently earned my degree in Data Science from UC Berkeley, with an emphasis in applied mathematics and modeling. I'm passionate about developing predictive models, building scalable pipelines, and extracting insights from large datasets. I enjoy the challenge of transforming complex information into systems that support data-driven decisions, and I like creating a structured world that's easier for others to navigate.

Projects

This is a collection of projects I completed for various purposes in different programming languages. Feel free to reach out with questions on any project :)



Personal Website

I taught myself HTML, CSS, JavaScript, and Bootstrap Studio to build my personal website.

HTML, CSS, Bootstrap Studio, GitHub, JS

Trending Topic Detection

Implementation of the BN-Grams algorithm, following a published research paper, to detect trending topics in English from multilingual tweet streams.

I taught myself some NLP theory to complete this project.

Python, Spark, Databricks, NLP, Cloud Computing

Flight Routing - Mixed Integer Linear Program

Profit optimization model for aircraft and crew assignments with geographical and temporal constraints

AMPL, Mixed Integer LP, Optimization

Smart-Building Sensor Data ETL Pipeline

Scalable pipeline to clean, process, perform granular transformations, and interpolate sensor data using PostgreSQL

PostgreSQL, Databases, Data Engineering, Data Transformation

Kaggle Classification Challenge Competition

Data pipeline and machine learning model, with hyperparameter tuning and quality improvements to a noisy dataset for prediction

Python, SVM, PCA, EDA, scikit-learn

Housing Data Exploration and Linear Model Prediction

Extraction of predictive features using EDA to build a predictive model for house pricing

Python, Pandas, Linear Regression, scikit-learn

NoSQL Data Mining and Analysis on Yelp Academic Dataset

Conducted complex aggregations and data cleaning on a NoSQL data architecture using MongoDB to manage semi-structured JSON from the Yelp Academic Dataset

MongoDB, Databases, Data Analysis

Spam Classifier

Used linear regression and scikit-learn to classify emails based on keywords. Used cross-validation to evaluate the model, reaching 87.8% accuracy.

scikit-learn, Linear Regression, Data Analysis, Python, Pandas

Advanced SQL Analytics on IMDb Database

Developed SQL queries to clean and analyze the IMDb database, leveraging advanced PostgreSQL functions, regex, and relational joins to extract insights from millions of records

PostgreSQL, Exploratory Data Analysis, Sampling, Data Cleaning

SQL Performance Engineering

Optimized relational queries in PostgreSQL, evaluating the cost-benefit trade-offs between various join algorithms, scan types, and indexing strategies on a historical baseball dataset

PostgreSQL, Exploratory Data Analysis, Data Engineering

Connect

email: denilsonhdez@berkeley.edu