Huzaifa Shafique

AI Engineer

https://talent.gravityer.com/huzaifa-shafique

Key Strengths

Deep expertise in on-device ML, model compression, and edge AI deployment, demonstrated across multiple projects.
Strong practical experience with multimodal AI systems (facial emotion, speech, posture analysis) and their real-time integration.
Proficient in various AI/ML frameworks (TensorFlow, PyTorch, Hugging Face) and specialized tools (TFLite INT8 Quantization, MediaPipe).
Proven ability to translate cutting-edge research into deployable, optimized solutions for resource-constrained environments.
Experience with RAG pipelines, LLMs, and generative AI, showcasing breadth in modern AI applications.

Cultural & Operational Fit

Cultural Fit Analysis

The candidate's academic projects showcase a strong drive for innovation and practical application of AI, aligning well with a culture that values hands-on development and real-world impact. Their diverse project portfolio, spanning computer vision, NLP, audio ML, and generative AI, indicates a broad interest and adaptability, which are beneficial for dynamic team environments. The emphasis on optimizing for edge devices suggests a resource-conscious and efficient approach to engineering.

Soft Skills & Operational Fit

The candidate demonstrates strong problem-solving skills, evidenced by their detailed approach to model optimization, architecture selection, and handling edge cases in projects. Their teaching assistant role suggests good communication and mentoring abilities, which are valuable for team collaboration and knowledge sharing. The focus on delivering complete, installable solutions indicates a product-oriented mindset and attention to detail.

About

As an AI Engineering student with deep hands-on expertise in Computer Vision, On-Device ML, Natural Language Processing, and Multimodal Systems, I have a proven track record of architecting and deploying production-grade AI solutions on resource-constrained edge hardware. My work spans facial emotion recognition, speech emotion detection, and posture analysis, all compressed and optimized for sub-100 ms on-device inference via TFLite INT8 on Android, with no cloud dependency. With a passion for pushing the boundaries of deployable AI, I am adept at translating cutting-edge research into real-world applications that function reliably in low-resource environments. Let's discuss how my deep technical skills and engineering-first approach can drive your organization's AI and Computer Vision initiatives forward.

Top Skills

Python, TensorFlow, PyTorch, model compression

Projects

Sound Emotion Recognition Dual-Approach Audio Pipeline

January 1, 2026 – January 1, 2026

Built two independent SER pipelines and benchmarked them head-to-head: Approach 1 (40-dim MFCC + CNN-LSTM) reached 75% accuracy; Approach 2 (3-channel Mel spectrogram stacking Mel, Δ, and ΔΔ into a physics-aware 94×128×3 image + Bi-LSTM + custom Attention) reached 80% accuracy — a 5-point gain by encoding temporal dynamics as image channels rather than sequence features. Approach 2 uses rectangular CNN kernels (5×3 time-focus, 3×5 frequency-focus) and a custom Attention layer that suppresses silence and padding frames, focusing the model on voiced segments. Both pipelines use Mixed Float16 precision, light audio augmentation (Gaussian noise, ±1 semitone pitch shift, 0.95-1.05× time stretch), and WarmUp + Cosine Decay scheduling.
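
The channel-stacking idea in Approach 2 can be illustrated with a minimal pure-Python sketch. It assumes the 94×128×3 input is 94 time frames by 128 Mel bins, and it uses a simple central difference for Δ and ΔΔ; the project's actual delta computation is not described in the source, and the helper names `delta` and `stack_mel_channels` are hypothetical.

```python
def delta(frames):
    """Central-difference derivative along the time axis, edges replicated."""
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        prev_f = frames[max(t - 1, 0)]
        next_f = frames[min(t + 1, last)]
        out.append([(b - a) / 2.0 for a, b in zip(prev_f, next_f)])
    return out


def stack_mel_channels(mel):
    """Stack Mel, delta, and delta-delta into a time x mel-bin x 3 nested list."""
    d1 = delta(mel)   # first temporal derivative
    d2 = delta(d1)    # second temporal derivative
    return [
        [[m, a, b] for m, a, b in zip(mel_row, d1_row, d2_row)]
        for mel_row, d1_row, d2_row in zip(mel, d1, d2)
    ]


# Toy input (assumption): 94 frames x 128 Mel bins, energy rising linearly in time.
mel = [[float(t)] * 128 for t in range(94)]
image = stack_mel_channels(mel)
shape = (len(image), len(image[0]), len(image[0][0]))
print(shape)
```

Each "pixel" then carries the energy, its rate of change, and its acceleration, which is what lets a 2D CNN see temporal dynamics directly.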

Tell2Design Generative AI for Architectural Floor Plans

January 1, 2026 – January 1, 2026

Developed a generative pipeline that transforms natural language room descriptions into structured 2D floor plan layouts using Graph Attention Networks (GAT) trained on the 3D-FRONT indoor scene dataset (18,000+ professionally designed rooms). Modeled spatial dependencies between rooms as a graph — nodes represent rooms, edges encode adjacency constraints — enabling the model to generate layouts that respect real-world architectural relationships (e.g. kitchen adjacent to dining, bedroom separate from living areas). Evaluated layout coherence with IoU-based room overlap metrics; demonstrated that GAT-based spatial reasoning outperforms sequence-to-sequence baselines on constraint satisfaction for multi-room plans.
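
One plausible reading of an IoU-based room overlap metric is sketched below. It assumes rooms are axis-aligned rectangles given as (x_min, y_min, x_max, y_max) and scores a layout by its worst pairwise overlap; the source does not say how the metric was actually defined, and `box_iou` and `max_pairwise_overlap` are hypothetical names.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def max_pairwise_overlap(rooms):
    """Largest IoU between any two distinct rooms; 0.0 means no rooms overlap."""
    names = list(rooms)
    worst = 0.0
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            worst = max(worst, box_iou(rooms[names[i]], rooms[names[j]]))
    return worst


# Illustrative layout (made-up coordinates, in metres).
layout = {
    "kitchen": (0.0, 0.0, 4.0, 3.0),
    "dining": (4.0, 0.0, 8.0, 3.0),    # shares a wall with the kitchen, no overlap
    "bedroom": (7.0, 0.0, 10.0, 3.0),  # intrudes 1 m into the dining room
}
overlap = max_pairwise_overlap(layout)
print(overlap)
```

Rooms that merely share a wall score 0, so the metric penalizes only genuine area collisions.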

Facial Emotion Recognition Blendshape MLP Pipeline

January 1, 2026 – January 1, 2026

Explored three architectures progressively before arriving at the production model: (1) FaceNet backbone fine-tuning with 3-phase curriculum training; (2) CNN/ViT baselines; (3) a compact 165-dim temporal blendshape MLP, chosen for its combination of accuracy, pose invariance, and mobile deployability. Refactored from a 1404-dim raw landmark feature space to a 52-dim FACS-aligned blendshape representation, achieving a 3× model size reduction with higher pose invariance; expanded to 165-dim by appending temporal derivatives and 4 geometric ratios (eye aspect ratio (EAR), mouth aspect ratio (MAR), brow-to-eye, mouth pull). Architecture: GLU blocks + Residual MLP + Multi-Head Self-Attention with Supervised Contrastive Loss (temperature T=0.07) + Focal Loss (γ=2); trained on AffectNet + RAF-DB with AdamW + Cosine Decay Restarts. Achieved 78% validation accuracy on AffectNet and 68% zero-shot transfer on RAF-DB (held out entirely during training), demonstrating generalization beyond dataset-specific texture cues. Exported to TFLite INT8 with representative-dataset calibration; quantization metadata JSON enables direct drop-in to Android MediaPipe pipelines.
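
For readers unfamiliar with focal loss, the sketch below shows the standard per-sample form, -(1 - p_t)^γ · log(p_t), with the focusing parameter set to 2 as in the description above. It is a pure-Python illustration, not the project's training code; the probability vectors are made up.

```python
import math


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t)."""
    p_t = max(probs[target], 1e-12)  # guard against log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Made-up softmax outputs over three emotion classes; the true class is index 1.
easy = focal_loss([0.05, 0.90, 0.05], target=1)   # confident and correct
hard = focal_loss([0.60, 0.30, 0.10], target=1)   # true class gets low probability
plain_ce = focal_loss([0.05, 0.90, 0.05], target=1, gamma=0.0)  # reduces to cross-entropy
print(easy, hard, plain_ce)
```

With γ=2 the easy example's loss is scaled by roughly 0.01 relative to plain cross-entropy, so gradient updates concentrate on hard or minority-class samples.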

Virtual HR Multimodal Mock Interview Analysis System

January 1, 2025 – January 1, 2026

Built a real-time Android application that conducts AI-powered mock interviews: generates job-specific questions via LLM, records candidate responses, and produces a structured multimodal feedback report covering verbal fluency, facial emotion, and posture – all running entirely on-device with no server dependency. Integrated Whisper (speech-to-text) + BERT (fluency and coherence scoring) for verbal analysis; MediaPipe Pose for posture detection; and the custom 165-dim blendshape MLP (see FER project) for real-time facial emotion recognition – three parallel inference streams in a single APK. Deployed full inference stack via TFLite INT8 quantization, achieving sub-100 ms on-device latency across all three modules on mid-range Android hardware. (AI/ML backend and model integration by author; Flutter frontend by collaborator.) Delivered a complete, installable APK – not a prototype – with per-module structured feedback output designed for repeat use by job seekers.
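
The INT8 quantization mentioned above rests on an affine mapping between real values and 8-bit integers, real ≈ scale × (q − zero_point). The sketch below illustrates that mapping in plain Python; the `scale` and `zero_point` values are hypothetical, since the project's calibrated parameters are not given, and Python's `round` may break exact ties differently from a real converter.

```python
def quantize_int8(x, scale, zero_point):
    """Map a real value to INT8: q = round(x / scale) + zero_point, clamped."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize_int8(q, scale, zero_point):
    """Recover the approximate real value: x ~= scale * (q - zero_point)."""
    return scale * (q - zero_point)


# Hypothetical calibration result for an activation range of roughly [-1, 1].
scale, zero_point = 1.0 / 127.0, 0

q = quantize_int8(0.3, scale, zero_point)
x_hat = dequantize_int8(q, scale, zero_point)
print(q, x_hat)
```

In-range values are recovered to within half a quantization step, while values outside the calibrated range saturate at -128 or 127, which is why calibration on representative data matters.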

Edge AI Livestock Tracking & Classification System

January 1, 2025 – January 1, 2025

Developed an edge-based animal tracking and classification system using ESP32-S3, ESP32-CAM, and TensorFlow Lite; dual IR-sensor direction detection triggers real-time image capture and on-device inference with a quantized MobileNet V1 model classifying livestock (Cow, Goat, Hen) with no cloud dependency. Logged classification events and running counts to InfluxDB via Wi-Fi; added OLED status display and a Streamlit interface for validating the TFLite model on uploaded images during development.
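
The dual IR-sensor direction logic can be sketched as follows. This is a host-side Python illustration, not the ESP32 firmware: it assumes sensor "A" sits on the outside and "B" on the inside, that a crossing is two triggers from different sensors within a short window, and that `max_gap_s` and the function names are hypothetical.

```python
def infer_direction(first_sensor):
    """Sensor A is assumed to be on the outside, so A-then-B means 'in'."""
    return "in" if first_sensor == "A" else "out"


def count_crossings(events, max_gap_s=2.0):
    """Pair consecutive triggers from different sensors into directed crossings.

    events: list of (timestamp_seconds, sensor_id) tuples sorted by time.
    """
    counts = {"in": 0, "out": 0}
    pending = None  # first trigger of a candidate crossing
    for ts, sensor in events:
        paired = (
            pending is not None
            and sensor != pending[1]
            and ts - pending[0] <= max_gap_s
        )
        if paired:
            counts[infer_direction(pending[1])] += 1
            pending = None  # crossing complete; image capture would fire here
        else:
            pending = (ts, sensor)  # start or restart a candidate crossing
    return counts


events = [
    (0.0, "A"), (0.4, "B"),    # A then B: entering
    (5.0, "B"), (5.3, "A"),    # B then A: leaving
    (9.0, "A"), (20.0, "B"),   # gap too long: not counted as a crossing
]
print(count_crossings(events))
```

The time window keeps a single stray trigger, such as an animal that approaches and backs away, from being paired with an unrelated later event.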

YouTube Video Question Answering RAG Pipeline

January 1, 2024 – January 1, 2024

Built a Streamlit application that answers natural language questions over any YouTube video — extracts the transcript, chunks and indexes it with FAISS + all-MiniLM-L6-v2 embeddings, then generates context-grounded answers via Meta LLAMA 3 (LangChain + Together AI). Supports multi-language transcripts and configurable retrieval size k; includes chunk-level source inspection so users can verify which segments informed each answer — addressing the hallucination transparency problem common in naive RAG deployments. Handles edge cases gracefully: missing captions, API rate limits, and videos with auto-generated vs. human transcripts are all caught and surfaced to the user with actionable error messages.
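
The chunk, index, and retrieve steps can be sketched in a few lines. To stay dependency-free, this illustration swaps in a toy bag-of-words vector and brute-force cosine search in place of the all-MiniLM-L6-v2 embeddings and FAISS index the project uses; the transcript text, chunk sizes, and function names are all made up for the example.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=12, overlap=4):
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy bag-of-words vector (stand-in for a sentence-embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k best (score, chunk_index, chunk_text) triples, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), i, c) for i, c in enumerate(chunks)]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]


transcript = (
    "today we cover gradient descent and how the learning rate controls each update step "
    "next we look at convolutional layers which slide small filters over an image "
    "finally we discuss quantization which stores weights as eight bit integers"
)
chunks = chunk_words(transcript)
top = retrieve("how does quantization store weights", chunks, k=2)
for score, idx, text in top:
    print(idx, round(score, 3), text)
```

Returning the chunk index and text alongside the score is what makes chunk-level source inspection possible: the interface can show exactly which segments were passed to the LLM.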
