Master's student in Data Science & AI at Unistra and Data Science/AI developer intern at Kwarto. I'm passionate about working on machine learning models, LLMs, visualizing data, and turning ideas into real applications.

Curious and analytical student with a strong interest in applying modern AI techniques. Experienced in teamwork, deep learning, and data analysis.

I have developed and contributed to a diverse portfolio of projects—from LLM text-to-SQL assistants and spatial analytics platforms to forecasting engines, dashboard-driven data products, computer vision pipelines, and advanced machine learning models.

Chatbot - Interactive Agricultural Parcels Explorer - Natural Language to SQL

Co-built a web app for exploring French agricultural parcel data through a conversational chatbot. Users type natural language prompts to query communes, crop types, years and spatial layouts; the chatbot converts them to SQL on the fly and automatically displays the results as maps, graphs and tables. It also supports comparing results across years and uploading or downloading data. Built with Python, SQL, GeoPandas and Hugging Face.

LLM | PY | SQL | GEOPANDAS

Water Quality Clustering and Spatial Interpretation

Analyzed water quality data to uncover spatial patterns: engineered temporal features, ran K-Means, validated the clusters with I2M2, and mapped the results against CORINE Land Cover.

PY | ML | GEOPANDAS

Stock Market Forecasting using LSTM and Transformers

Built 5-year equity datasets, trained global and per-stock deep learning models, and benchmarked the architectures; the Transformer proved the most accurate.

PY | DEEP LEARNING | TIME SERIES

I am in the final stages of my master’s program and am currently completing a six-month internship at Kwarto as a Data Scientist / AI Developer.

I am building "Kontrol", a data extraction pipeline for telecom tower technical documents. It takes unstructured documents through OCR, preprocessing, chunking, RAG-based retrieval, LLM-powered extraction and post-processing to structure technical data automatically and at scale.

Capabilities

LLM alignment, geospatial analytics, forecasting, classification, computer vision, model evaluation, data analysis, clustering, data visualization, deep learning.

Toolbox

Python, PyTorch, scikit-learn, GeoPandas, SQL, TensorFlow, seaborn, matplotlib and a bit of C/C++/Java when needed.

Git, Bash, Dash, Next.js, a bit of Flask...

Contact

I keep my inbox open for offers, research projects, or questions of any kind!

ataberk.karabag@gmail.com