GPT-3.5 Misinformation

UI for initial fine-tuning model specifications

UI for child fine-tuning model specifications

UI for data augmentation with multiple methods

Model selection UI for statistical evaluation

Overview of statistical evaluations for a specific fine-tuned model, part 1

Overview of statistical evaluations for a specific fine-tuned model, part 2

GPT-3.5 Misinformation Details

I developed a project management and versioning system for fine-tuning pre-trained Large Language Models (LLMs) from providers such as OpenAI and Google, using augmented datasets. Users could upload datasets or generate them dynamically via Easy Data Augmentation (EDA), text translation, and other extensible methods. A custom datapoint editor ensured that each record matched the format the target model requires for fine-tuning.
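To illustrate the kind of augmentation described above, here is a minimal sketch of two of the four EDA operations (random swap and random deletion). It is not the project's actual code: the function names, the `augment` helper, and its parameters are hypothetical, and the other two EDA operations (synonym replacement and random insertion) are left out because they need a thesaurus resource.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment(sentence, n_variants=4, p_delete=0.1, seed=0):
    """Return n_variants perturbed copies of a sentence (hypothetical helper)."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    words = sentence.split()
    variants = []
    for k in range(n_variants):
        if k % 2 == 0:
            new_words = random_swap(words, n_swaps=1, rng=rng)
        else:
            new_words = random_deletion(words, p=p_delete, rng=rng)
        variants.append(" ".join(new_words))
    return variants
```

Calling `augment("the claim was later shown to be false")` would yield four noisy variants of the sentence that can be added to a training set alongside the original.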

The system supported hierarchical model training: starting from a base model, further models could be trained sequentially or in parallel. A dual-layer versioning system (horizontal and vertical) structured training workflows, for example in a book-and-chapter style. Models and datasets could also be shared globally across projects, enabling collaborative training and dataset reuse.
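One way to picture the dual-layer versioning is a tree of models with chapter-style labels. The sketch below is an assumption, not the project's schema: it treats "vertical" as depth below the base model and "horizontal" as position among siblings, and the `ModelNode` class and its members are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps identity comparison, which avoids recursing through the
# parent/children cycle when looking a node up in its parent's child list.
@dataclass(eq=False)
class ModelNode:
    name: str
    parent: Optional["ModelNode"] = field(default=None, repr=False)
    children: List["ModelNode"] = field(default_factory=list, repr=False)

    def add_child(self, name: str) -> "ModelNode":
        """Register a model fine-tuned from this one."""
        child = ModelNode(name=name, parent=self)
        self.children.append(child)
        return child

    @property
    def depth(self) -> int:
        """Vertical version: number of fine-tuning steps below the base."""
        return 0 if self.parent is None else self.parent.depth + 1

    @property
    def sibling_index(self) -> int:
        """Horizontal version: 1-based position among siblings."""
        if self.parent is None:
            return 1
        return self.parent.children.index(self) + 1

    @property
    def label(self) -> str:
        """Chapter-style label such as '1.2.1'."""
        if self.parent is None:
            return "1"
        return f"{self.parent.label}.{self.sibling_index}"

    def lineage(self) -> List[str]:
        """Names from the base model down to this node."""
        head = [] if self.parent is None else self.parent.lineage()
        return head + [self.name]


base = ModelNode("base")
eda_model = base.add_child("eda-v1")            # trained in parallel with...
translated = base.add_child("translated-v1")    # ...this sibling
refined = translated.add_child("translated-v2") # trained sequentially
print(refined.label, refined.lineage())
```

Siblings represent models trained in parallel from the same parent, while a chain of children represents sequential training.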

To save storage, augmented datasets were stored hierarchically: each one referenced the original data rather than duplicating large files. Users could monitor the live training status of models and terminate sessions if needed. Model evaluation covered HHH (Helpful, Honest, Harmless) criteria, alignment scoring, and semantic similarity analysis via SBERT to assess data quality.
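The reference-based storage idea can be sketched as follows. This is an illustrative assumption about the layout, not the project's actual data model: a derived dataset keeps a pointer to its parent plus only the rows it adds, and the full view is resolved on demand. The `Dataset` class and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    name: str
    parent: Optional["Dataset"] = None
    method: Optional[str] = None  # augmentation that produced own_rows
    # Root dataset: the raw rows. Derived dataset: only the newly added rows.
    own_rows: List[str] = field(default_factory=list)

    def rows(self) -> List[str]:
        """Resolve the full dataset by walking up the parent chain."""
        inherited = self.parent.rows() if self.parent else []
        return inherited + self.own_rows

    def stored_row_count(self) -> int:
        """Rows physically stored for this dataset alone."""
        return len(self.own_rows)


original = Dataset("original", own_rows=["claim one", "claim two"])
swapped = Dataset("swapped", parent=original, method="eda",
                  own_rows=["one claim"])
translated = Dataset("translated", parent=swapped, method="translation",
                     own_rows=["Behauptung zwei"])

print(len(translated.rows()), translated.stored_row_count())
```

Each derived dataset stores one row here, yet resolves to all four rows when read, so the original file is never copied.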

A comparative model analysis tool allowed users to inspect multiple models based on quantitative metrics, evaluation statistics, and training metadata. Each model's training history, augmentation methods, and datasets were fully documented.

The backend was built with Python and the frontend with Streamlit. SQLite served as the local-first database, and users could switch to a different database easily. A "Fail-Safe" mechanism ensured that if the application crashed or was closed mid-session, all progress, including active fine-tuning and dataset modifications, was saved so that users could resume seamlessly. The application was deployed via Docker.
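A fail-safe of this kind can be approximated by writing session state to SQLite after every change and reloading it on startup. The sketch below uses only the standard-library `sqlite3` module; the table name, the key format, and the helper functions are hypothetical, and the real project may persist state differently (for example through its ORM layer).

```python
import json
import sqlite3


def open_store(path):
    """Open (or create) the state database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS session_state ("
        "key TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def save_state(conn, key, state):
    """Persist a JSON-serializable snapshot, overwriting any older one."""
    conn.execute(
        "INSERT OR REPLACE INTO session_state (key, payload) VALUES (?, ?)",
        (key, json.dumps(state)),
    )
    conn.commit()  # commit immediately so a crash loses at most one update


def load_state(conn, key):
    """Return the last saved snapshot for key, or None if there is none."""
    row = conn.execute(
        "SELECT payload FROM session_state WHERE key = ?", (key,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

On restart, the application would call `load_state` for each known job and, if a snapshot reports a job as still running, reattach to it instead of starting over.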

This project highlights my expertise in AI model training, hierarchical versioning, real-time monitoring, and data augmentation, with a focus on efficiency, automation, and reliability.

Tech Stack

Python (Programming Language)
Streamlit
OpenAI API
Google Cloud Platform (GCP)
Rational Software Architect

Docker
SQLite
SQLAlchemy
ORM
Marshmallow

Project information

Category: Full Stack
Development: Solo
Repository: Public
Project date: Jan. 2024