
MIGUEL COLLI
Bilingual AI Data Specialist (Spanish MX / English) | Junior Full-Stack Developer
I am a Systems Engineer focused on the AI development lifecycle, specializing in QA, multimodal annotation, model evaluation, and testing. High-quality data is a prerequisite for reliable AI systems, so my core focus is applying strict human evaluation criteria to training data and model outputs, which helps mitigate hallucinations, improve reasoning, and fine-tune advanced models. I am well-organized and consistent in my work, follow project guidelines closely, and maintain clear communication with team leaders. I am collaborative and open to feedback, and I am currently learning Python, Next.js, and LLM APIs to implement AI in real-world applications. My long-term goal is to grow across the full AI lifecycle.
Core Competencies
AI Model Evaluation & Alignment
- Multimodal RLHF across NLP, TTS, voice, and image recognition.
- LLM alignment, response ranking, and red-team testing.
- Supervised Fine-Tuning (SFT) and ground-truth dataset creation.
- Evaluation across reasoning, creativity, safety, and contextual accuracy.
Data Collection & Quality Assurance
- Python-based data scraping and cleaning.
- Manual and automated data collection.
- High-fidelity audio and video capture and content creation.
- Annotation, transcription, translation, and structured review.
- Dataset validation and QA workflows.
Prompt Engineering & Applied AI
- Prompt design, testing, and optimization to enhance LLM performance.
- Workflow automation prototyping through LLM API integration and Vercel deployment.
Software Development
- Next.js (App Router) full-stack application development.
- TypeScript and JavaScript.
- SQL and relational database design.
- Python scripting for automation and data processing.
- Tailwind CSS.
- Git/GitHub version control.
Linguistics & Localization
- Bilingual: English and Spanish (MX).
- Cultural adaptation and MX market localization.
- Parallel corpora review.
- Speech data accuracy and transcription quality control.
Featured Projects
Microservices Catalogue
A collection of Python backend microservices with a Next.js UI, including advanced audio processing and cleaning tools.
Explore Catalogue →
Gemini AI Powered Content Generator
An interactive application for real estate marketing content generation using Next.js and Google’s Gemini AI API.
Try the Demo →
Next.js 16.1.1 Dashboard
A responsive dashboard application built with the Next.js App Router by following the official documentation, featuring data fetching, streaming, and server actions.
View Dashboard →