Hearoo 🎧

Hearoo 🎧 is an audio analysis platform that pairs a deep learning classifier with an interactive web interface. The frontend is written in TypeScript and the backend in Python. A ResNet-based CNN, trained on the ESC-50 dataset via Modal.com, classifies environmental sounds into 50 categories and reaches 81.25% accuracy. Users can upload WAV files to explore detailed feature visualizations that turn raw audio into dynamic visuals.

Core Features

Deep audio classification using a CNN

ResNet-based architecture

Mel spectrogram conversion

Real-time environmental sound detection

Waveform and spectrogram visualizations

Interactive dashboard built with Next.js

Serverless GPU inference hosted on Modal

TensorBoard integration for model performance tracking
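
The mel spectrogram feature above relies on mapping linear frequency (Hz) onto the mel scale, which spaces bands more densely at low frequencies. The sketch below shows only that mapping, using the common HTK-style formula; it is not Hearoo's actual preprocessing code, and the function names, band count, and frequency range are illustrative assumptions.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int, f_min: float, f_max: float) -> List[float]:
    """Return n_mels center frequencies (Hz) spaced evenly on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]


if __name__ == "__main__":
    # Assumed values for illustration: 8 bands between 0 Hz and 8 kHz.
    for center in mel_band_centers(n_mels=8, f_min=0.0, f_max=8000.0):
        print(f"{center:8.1f} Hz")
```

The centers are evenly spaced in mels, so the gaps between them in Hz widen as frequency rises. A full mel spectrogram would also need a short-time Fourier transform and triangular filters built around these centers.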

Links

Hearoo

Hearoo-GitHub

Technical Highlights

Backend powered by FastAPI and Python

Frontend written entirely in TypeScript

Optimized training using AdamW

Data augmentation with the Mixup technique

Batch Normalization for faster convergence

Pydantic validation for robust API requests

TensorBoard integration for model tracking

Modern UI built with Tailwind CSS and Shadcn UI
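
As a rough illustration of the Mixup augmentation listed above: Mixup blends two training examples and their one-hot labels using a weight drawn from a Beta(alpha, alpha) distribution. The sketch below uses plain Python lists instead of tensors so it runs without dependencies. The function name `mixup_pair` and the value `alpha=0.2` are assumptions for illustration, not values taken from the Hearoo training code.

```python
import random
from typing import List, Optional, Tuple


def mixup_pair(
    x_a: List[float],
    y_a: List[float],
    x_b: List[float],
    y_b: List[float],
    alpha: float = 0.2,  # assumed value; the project's setting is not stated
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float], float]:
    """Blend two (features, one-hot label) pairs with a Beta-distributed weight."""
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return mixed_x, mixed_y, lam


if __name__ == "__main__":
    # Two toy "spectrogram" vectors with one-hot labels over three classes.
    x, y, lam = mixup_pair(
        [1.0, 1.0, 1.0], [1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0], [0.0, 1.0, 0.0],
        rng=random.Random(0),
    )
    print("lambda:", lam)
    print("mixed features:", x)
    print("mixed label:", y)
```

The mixed label stays a valid probability distribution because both inputs are weighted by `lam` and `1 - lam`. In a tensor-based training loop the same blend is usually applied to a whole batch against a shuffled copy of itself.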