Seemab Yamin

AI Researcher & Engineer | LLMs, NLP, CV

Pakistan

About Me

I am a results-driven Python Engineer with hands-on experience designing and developing scalable ETL pipelines, web scraping infrastructure, and data solutions using Python, SQL, AWS, and Git. I bring strong problem-solving and communication skills to Agile teams, producing streamlined data workflows that incorporate Gen AI technologies such as ChatGPT and Gemini for intelligent automation.

Research focus: Generative AI and Foundation Models, with applied work across LLMs, NLP, and Computer Vision.

I have a track record of deploying cloud-based solutions, streamlining data processes, and collaborating across teams to deliver reliable, innovative results.

Backend & APIs

Python · FastAPI · Flask · Django · REST API

Cloud & DevOps

AWS · Docker · Apache Airflow · Matillion · Linux Servers

Databases

MySQL · PostgreSQL · MongoDB · Elasticsearch

Data & AI

ETL Pipelines · Data Engineering · LLM APIs (ChatGPT, Gemini) · NLP · Computer Vision · Foundation Models · OpenCV · scikit-learn · FastAI · Web Scraping

Tools & Collaboration

Jira · Notion · Confluence · Trello

Education

COMSATS University Islamabad

BS Computer Science · Sep 2018 – Aug 2022

Certifications

EF SET English Certificate (C1 Advanced)

Issued Feb 2026 · Credential

MongoDB Aggregation Fundamentals

Issued May 2025

Web Application Technologies and Django

Coursera · 2021

HTML5 Ultimate Course

Udemy · 2021

NLP Course For Beginner

Udemy · 2021

Learn Python Programming From A-Z: Beginner To Expert Course

Udemy · 2021

Python And Django Framework For Beginners Complete Course

Udemy · 2021

Learn JavaScript - For Beginners

Udemy · 2022

The Complete Intro to Machine Learning [Get Free Coupons]

Udemy · 2022

Git with Visual Studio Code

Udemy · 2021

1 Week of Code : 2021 Python for Beginners Crash Course

Udemy · 2022

Publications

Smart Infrastructure Management

Elsevier · 2025 · DOI

Experience

Data Engineer

Luxevision Consulting LLC · Nov 2024 – May 2025 · Remote (United States)

  • Managed and automated data processing ETL pipelines using Apache Airflow (Python) in Agile teams.
  • Monitored pipeline performance to ensure high data integrity and system reliability.
  • Integrated Amazon S3 with the OpenMetadata platform and orchestrated ETL jobs using Matillion.

Web Scraping and Automation Engineer

Codeaza Technologies · Sep 2022 – Jun 2024 · Rawalpindi, Pakistan

  • Maintained and deployed robust core Python ETL pipelines utilizing AWS cloud services.
  • Executed data remodeling, built and managed data warehouses, and ensured high-quality web scraping and efficient crawling.
  • Developed and maintained automated bots, supported software deployments, and performed data validation and cleaning.
  • Integrated diverse databases and led Agile teams to deliver scalable, efficient data solutions.
  • Engineered an LLM-based smart data solution by implementing the ChatGPT API.

Junior Python Developer

6th Sense Technologies · Jan 2022 – Jan 2023 · Remote (Pakistan)

  • Designed and built APIs to support scalable applications and extensible architecture.
  • Maintained and optimized web scrapers for consistent, high-volume data extraction.
  • Deployed and managed server infrastructure to maintain high availability and performance.
  • Developed backend solutions for web applications using the Django framework.

Web Development Intern

AndroWeb IT Solutions · May 2021 – Sep 2021 · Attock, Pakistan

  • Built responsive websites using HTML, CSS, and JavaScript.

Featured Projects

OCR Web Application

FastAPI-based OCR web app using the Tesseract OCR engine with REST API endpoints for extracting text from images.

  • OCR text extraction, logging, and health check endpoint.
  • Dockerized deployment for easy scaling and async request handling.
  • Validation and error handling for robust processing.
Docker · Tesseract OCR · FastAPI · Python · Pillow · REST API · Web Development
Role: Fullstack · Published Jan 1, 2026

Appointment Booking n8n Workflow

An n8n workflow built around an appointment booking form that collects users' appointment requests. For each request, the automation sends an email notification, logs the request to Google Sheets, and creates a draft reply email.

n8n · Google Sheets · Automation · Gmail
Role: Automation Engineer · Published Nov 8, 2025

Beach Searcher Scraper

This Python scraper, built with Scrapy, takes country URLs and crawls the corresponding beach-searcher pages to collect every beach listed for each country. It follows each provided country link, gathers beach details, and stores them in a JSON file for analysis.

Beach Searcher Scraper Demo Video

Data Collection · Python · Scrapy · Web Scraping Framework · Web Crawling
Role: Data Scraper · Published Mar 12, 2023

Smart Topic Extractor

Final-year project (FYP): a topic modeling system that analyzes documents and generates word clouds to surface trends in the text.

Django Pandas NumPy Matplotlib
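
The word-frequency step that feeds a word cloud can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function name `top_terms`, the stop-word list, and the sample documents are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would likely use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def top_terms(documents, n=5):
    """Return the n most frequent non-stop-word terms across all documents."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)


if __name__ == "__main__":
    # Made-up sample documents, for illustration only.
    docs = [
        "Topic modeling finds themes in documents.",
        "Word clouds show the most frequent words in documents.",
    ]
    print(top_terms(docs, n=3))
```

Term counts like these are what a word cloud visualizes. Topic modeling proper (for example, fitting a topic model over the documents) is a separate step that this sketch does not cover.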

FastAPI Boilerplate

Starter FastAPI project with health check, Dockerized development, and pytest-based tests.

FastAPI · Docker · pytest

Template Docs & Formatting

Portfolio project completed in Aug 2023 for kmonlinev. Scope: template document formatting and cleanup.

Price: Up to $50 · Duration: 1–7 days

"Seemab quickly responded and provided solution for my problem. Excellent work!"

Grayson Appraisal Data Transformer

Desktop application that automates download and processing of appraisal data from Grayson County's mapping service. Extracts text files from ZIP archives and transforms them into a structured Excel spreadsheet for analysis and reporting.

Python · pandas · Requests · PyQt · Microsoft Excel · UI Design
Role: Data Engineer · Published Sep 12, 2025 · Rating: 5.0/5
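
The extract-and-transform step described above might look roughly like the sketch below. It is a standard-library illustration under stated assumptions, not the application's actual code: the helper names `rows_from_zip` and `build_sample_zip`, the pipe delimiter, the UTF-8 encoding, and the sample records are all hypothetical.

```python
import csv
import io
import zipfile


def rows_from_zip(zip_bytes, delimiter="|"):
    """Yield (filename, row) pairs for each record in every .txt member of a ZIP."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            # Skip anything that is not a text file.
            if not name.lower().endswith(".txt"):
                continue
            with archive.open(name) as member:
                # Decode the binary member as text (encoding is an assumption).
                text = io.TextIOWrapper(member, encoding="utf-8", newline="")
                for row in csv.reader(text, delimiter=delimiter):
                    if row:  # ignore blank lines
                        yield name, row


def build_sample_zip():
    """Build a small in-memory ZIP with made-up data for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr(
            "parcels.txt",
            "1001|123 Main St|250000\n1002|45 Oak Ave|180000\n",
        )
        archive.writestr("readme.md", "ignored")
    return buf.getvalue()


if __name__ == "__main__":
    for name, row in rows_from_zip(build_sample_zip()):
        print(name, row)
```

The resulting rows could then be loaded into a pandas DataFrame and written to a spreadsheet, for example with `DataFrame.to_excel`.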

"Client Centered Approach. Great Communication. Great Completion Time. Great Character. I would hire him any day."

What People Say