technology – Page 2 – Parag Kumar Mital

02.06.2015

#1836

Handwriting Recognition with LSTMs and ofxCaffe

Long Short-Term Memory (LSTM) is a Recurrent Neural Network (RNN) architecture designed to model temporal sequences (e.g. audio, sentences, video) and their long-range dependencies better than conventional RNNs [1]. There is a lot of excitement in the machine …

01.04.2015

#1764

Real-Time Object Recognition with ofxCaffe

I’ve spent a little time with Caffe over the holiday break to try to understand how it might work in the context of real-time visualization/object recognition in more natural scenes/videos. Right now, I’ve implemented the following Deep Convolutional Networks using …

10.16.2012

#1246

Copyright Violation Notice from “Rightster”

I’ve been working on an art project which takes the top 10 videos on YouTube and tries to resynthesize the #1 video using the remaining 9. The computational model is based on low-level human perception and uses …

06.29.2012

#1079

3D Musical Browser

I’ve been interested in exploring ways of navigating media archives. Typically, you may use iTunes and go from artist to artist, or have managed to tediously classify your collection into genres. Some may still even browse their music through a …

11.03.2011

#855

Memory Mosaicing

A product of my PhD research is now available on the iPhone App Store (for a small cost!): View in App Store.

This application is motivated by my interests in experiencing an Augmented Perception and of course very much …

10.08.2011

#830

Concatenative Video Synthesis (or Video Mosaicing)

Working closely with my adviser Mick Grierson, I have developed a way to resynthesize existing videos using material from another set of videos. This process starts by learning a database of objects that appear in the set of videos …