Latest Entries

C.A.R.P.E. version 0.1.1 release

I’ve updated C.A.R.P.E., a graphical tool for visualizing eye-movements and processing audio/video, to include a graphical timeline (thanks to ofxTimeline by James George/YCAM), support for audio playback/scrubbing (using pkmAudioWaveform), audio saving, and various bug fixes. This release has changed some parameters of the XML file and added others. Please refer to this example XML file Read the full article…

Toolkit for Visualizing Eye-Movements and Processing Audio/Video

Original video still without eye-movements and heatmap overlay copyright Dropping Knowledge Video Republic. From 2008 to 2010, I worked on the Dynamic Images and Eye-Movements (D.I.E.M.) project, led by John Henderson, with Tim Smith and Robin Hill. We worked together to collect nearly 200 participants’ eye-movements on nearly 100 short films from 30 seconds to Read the full article…

Handwriting Recognition with LSTMs and ofxCaffe

Long Short-Term Memory (LSTM) is a Recurrent Neural Network (RNN) architecture designed to model temporal sequences (e.g. audio, sentences, video) and long-range dependencies better than conventional RNNs [1]. There is a lot of excitement in the machine learning communities about LSTMs (and DeepMind’s counterpart, “Neural Turing Machines” [2], or Facebook’s, “Memory Networks” Read the full article…

Real-Time Object Recognition with ofxCaffe

I’ve spent a little time with Caffe over the holiday break to try to understand how it might work in the context of real-time visualization/object recognition in more natural scenes/videos. Right now, I’ve implemented the following deep convolutional networks using the 1280×720 resolution webcam on my 2014 MacBook Pro: VGG ILSVRC 2014 (16 Layers): 1000 Read the full article…

Extracting Automatically Labeled Volumetric ROIs from MRI

Performing a region of interest analysis on MRI requires knowing where the regions are in your subject data. Typically, this has been done using hand-drawn masks in a 3D viewer. However, recent research has made the process mostly automatic, and the open-source community has implemented everything you will need to automatically create labeled volumetric regions Read the full article…

YouTube’s “Copyright School” Smash Up

Ever wonder what happens when you’ve been accused of violating copyright multiple times on YouTube? First, you get a redirect to YouTube’s “Copyright School” whenever you visit YouTube, forcing you to watch a cartoon of Happy Tree Friends where the main character is ironically dressed as a pirate…

An open letter to Sony ATV and UMPG

Dear Sony ATV Publishing, UMPG Publishing, and other concerned parties, I ask you to please withdraw your copyright violation notice on my video, “PSY – GANGNAM STYLE (강남스타일) M/V (YouTube SmashUp)” as I believe my use of any copyrighted material is protected under Fair Use or Fair Dealing. This video was created by an automated Read the full article…

Copyright Violation Notice from “Rightster”

I’ve been working on an art project which takes the top 10 videos on YouTube and tries to resynthesize the #1 video using the remaining 9 videos. The computational model is based on low-level human perception and uses only very abstract features such as edges, textures, and loudness. I’ve created a new synthesis Read the full article…

3D Musical Browser

I’ve been interested in exploring ways of navigating media archives. Typically, you may use iTunes and go from artist to artist, or have managed to tediously classify your collection into genres. Some may still even browse their music through a file browser, perhaps making sure the folders and filenames of their collection are descriptive of Read the full article…

Intention in Copyright

The following article is written for the LUCID Studio for Speculative Art based in India. Introduction My work in audiovisual resynthesis aims to create models of how humans represent and attend to audiovisual scenes. Using pattern recognition of both audio and visual material, these models use large corpora of learned audiovisual material which can be Read the full article…

Course @ CEMA Srishti School of Design, Bangalore, IN

From November 21st to the 2nd of December, I’ll have the pleasure of leading a course and workshop with Prayas Abhinav at the Center for Experimental Media Arts in the Srishti School of Design in Bangalore, IN. Many thanks to Meena Vari for all her help in organizing the project. Stories are flowing trees Key Read the full article…

Memory Mosaicing

A product of my PhD research is now available on the iPhone App Store (for a small cost!): View in App Store. This application is motivated by my interests in experiencing an Augmented Perception and of course very much inspired by some of the work here at Goldsmiths. The application of existing approaches in soundspotting/mosaicing Read the full article…

Concatenative Video Synthesis (or Video Mosaicing)

Working closely with my adviser Mick Grierson, I have developed a way to resynthesize existing videos using material from another set of videos. This process starts by learning a database of objects that appear in the set of videos to synthesize from. The target video to resynthesize is then broken into objects in a similar Read the full article…

Google Earth + Atlantis Space Shuttle

I managed to catch the live feed from NASA.gov of the Atlantis Space Shuttle launch yesterday. Though what I found really interesting was a real-time virtual reality of the space shuttle launch from inside Google Earth. Screen-capture with obligatory 12x speedup to retain attention span below:

Lunch Bites @ CULTURE Lab, Newcastle University

I was recently invited to the CULTURE lab at Newcastle University by its director, Atau Tanaka. I would say it has the resources and creative power of 5 departments all housed in one spacious building. In the 12-some studios housed over 3 floors, over the course of 2 short days, I found people building multitouch tables, Read the full article…

Creative Community Spaces in INDIA

Jaaga – Creative Common Ground, Bangalore, http://www.jaaga.in/
CEMA – Center for Experimental Media Arts at Srishti School of Art, Design and Technology, Bangalore, http://cema.srishti.ac.in/site/
Bar1 – non-profit exchange programme by artists for artists to foster the local, Indian and international mutual exchange of ideas and experiences through guest residencies in Bangalore, Bangalore, http://www.bar1.org
Sarai – Read the full article…

Facial Appearance Modeling/Tracking

I’ve been working on developing a method for automatic head-pose tracking, and along the way have come to model facial appearances. I start by initializing a facial bounding box using the Viola-Jones detector, a well-known and robust object detector. This allows me to center the face. Once I know where the Read the full article…

Short Time Fourier Transform using the Accelerate framework

Using the libraries pkmFFT and pkm::Mat, you can very easily perform a highly optimized short-time Fourier transform (STFT) with direct access to a floating-point based object. Get the code on my GitHub: http://github.com/pkmital/pkmFFT. It also depends on: http://github.com/pkmital/pkmMatrix
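For readers who want to see what an STFT computes before reaching for an optimized library, here is a minimal sketch in plain Python. It is not the pkmFFT or pkm::Mat API: the function names (`hann`, `rdft`, `stft`) and the frame and hop sizes are hypothetical choices for illustration, and a naive O(N²) DFT stands in for the optimized FFT.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (hypothetical helper)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def rdft(frame):
    """Naive DFT of a real frame, keeping only the n//2 + 1 non-redundant bins."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def stft(signal, frame_size=64, hop=32):
    """Slice the signal into overlapping frames, window each one, and transform it."""
    window = hann(frame_size)
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_size)]
        frames.append(rdft(frame))
    return frames


# A sine with exactly 8 cycles per 64-sample frame should peak in bin 8.
signal = [math.sin(2 * math.pi * 8 * t / 64) for t in range(256)]
spec = stft(signal)
peak_bin = max(range(len(spec[0])), key=lambda k: abs(spec[0][k]))
print(len(spec), len(spec[0]), peak_bin)
```

Each row of `spec` is the spectrum of one windowed frame. An optimized implementation replaces `rdft` with a real FFT but keeps the same framing and windowing.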

Real FFT/IFFT with the Accelerate Framework

Apple’s Accelerate Framework can really speed up your code without much effort on your part. And it will also run on an iPhone. Even so, I did bang my head a few times trying to get a straightforward Real FFT and IFFT working, even after consulting the Accelerate documentation (reference and source code), stackoverflow (here and here), Read the full article…
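As a reference point when debugging a real FFT/IFFT pair, it can help to check the round trip against a slow but obviously correct version. The sketch below is plain Python and does not use the Accelerate API. The names `rdft` and `irdft` are hypothetical, and the unscaled forward / divide-by-N inverse convention is an assumption that an optimized library may not share.

```python
import cmath
import math


def rdft(x):
    """Forward DFT of a real signal; returns bins 0..n//2 only."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def irdft(half, n):
    """Inverse of rdft: rebuild the full spectrum, then invert it."""
    # A real signal has a conjugate-symmetric spectrum: X[n - k] == conj(X[k]).
    full = list(half) + [half[n - k].conjugate() for k in range(n // 2 + 1, n)]
    return [
        sum(full[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


x = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.25, -0.5]
y = irdft(rdft(x), len(x))
max_error = max(abs(a - b) for a, b in zip(x, y))
print(max_error < 1e-9)
```

If an optimized pair does not reproduce the input like this, the usual suspects are a different scaling factor or a different packing of the half spectrum.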

Augmented Sonic Reality

I recently gave two talks, one for the PhDs based in the Electronic Music Studios, and another for the PhDs in Arts and Computational Technology. I received some very valuable feedback, and having to incorporate what I’ve been working on in a somewhat presentable manner also had a lot of benefit. The talk abstract (which Read the full article…



Copyright © 2010 Parag K Mital. All rights reserved.