CalArts | Audiovisual Interaction w/ Machine Learning

CalArts 2017-2018

Introduction

This course guides students through openFrameworks, using real-time processing and interaction as a framework for integrating audiovisual interaction and machine learning into their practice.

Course Link

Course Link on GitHub

Course Notes

MTEC-616-01: Audiovisual Interaction w/ Machine Learning

Section Name: MTEC-616-01
Departments: School of Music
Prerequisites: Complete MTEC-614
Academic Level: Bachelor's (416) and Graduate (616)
Term: Spring 2017
1/23/2017 – 5/12/2017
T 2:00 PM – 3:50 PM
Building(s): Main Building
Classroom(s): B214A

Schedule

Session 01 – 01/24/17 – Installation / openFrameworks basics

This session introduces openFrameworks, a creative-coding toolkit written primarily in C++. It brings together many libraries in a cross-platform coding environment capable of real-time audiovisual analysis, graphics, sound processing, 3D, multithreading, communication protocols, and plenty more. We'll see how to navigate openFrameworks, build a project, compile existing examples, and find additional resources.
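
Every openFrameworks project follows the same skeleton, which is worth seeing before opening the examples. A minimal sketch (the standard ofApp structure; the window size and animation are illustrative):

```cpp
#include "ofMain.h"

// The standard openFrameworks skeleton: setup() runs once, then
// update() and draw() alternate once per frame.
class ofApp : public ofBaseApp {
public:
    float radius = 10;

    void setup() override {
        ofSetFrameRate(60);
        ofBackground(0);
    }

    void update() override {
        // Change state here; draw() only renders it.
        radius = 50 + 40 * sin(ofGetElapsedTimef());
    }

    void draw() override {
        ofSetColor(255);
        ofDrawCircle(ofGetWidth() / 2, ofGetHeight() / 2, radius);
    }
};

int main() {
    ofSetupOpenGL(640, 480, OF_WINDOW);
    ofRunApp(new ofApp());
}
```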

Session 02 – 01/31/17 – Basic Graphics / Audio Recording

In the next two sessions, we'll explore some basic graphics primitives, how to animate them, and how to use them to control the recording and playback of audio. The end result will be a multisampler that can record and play back audio, automatically segment it, and apply some simple filters. Along the way we'll learn a number of basic coding principles, including if statements, for loops, and how to use classes to build objects and abstract our code. We'll also see how to use parameters to define elements in a graphical user interface. By the end of this session, we'll have an audio recorder that can capture audio and play it back; next week we'll add more interactivity, including automatic segmentation and filtering to shape how the audio sounds.
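
As a preview of the recorder, here is a minimal sketch of the idea, assuming the float-buffer audio callbacks of the OF 0.9.x era (the key bindings, sample rate, and buffer sizes are illustrative choices, not the course code):

```cpp
#include "ofMain.h"

// Sketch of a one-shot recorder: audioIn() appends mic samples to a
// buffer while recording; audioOut() reads them back while playing.
class ofApp : public ofBaseApp {
public:
    vector<float> recording;   // mono sample buffer
    size_t playHead = 0;
    bool isRecording = false, isPlaying = false;
    ofSoundStream stream;

    void setup() override {
        // 1 output channel, 1 input channel, 44.1 kHz, 512-sample buffers
        stream.setup(this, 1, 1, 44100, 512, 4);
    }

    void audioIn(float *input, int bufferSize, int nChannels) override {
        if (!isRecording) return;
        for (int i = 0; i < bufferSize; i++)
            recording.push_back(input[i * nChannels]);   // channel 0 only
    }

    void audioOut(float *output, int bufferSize, int nChannels) override {
        for (int i = 0; i < bufferSize; i++) {
            float s = 0;
            if (isPlaying && playHead < recording.size())
                s = recording[playHead++];
            for (int c = 0; c < nChannels; c++)
                output[i * nChannels + c] = s;
        }
    }

    void keyPressed(int key) override {
        if (key == 'r') { recording.clear(); isRecording = true; }
        if (key == 's') isRecording = false;
        if (key == ' ') { playHead = 0; isPlaying = true; }
    }

    void draw() override {
        ofDrawBitmapString(isRecording ? "recording..."
                                       : "r: record, s: stop, space: play", 20, 20);
    }
};

int main() {
    ofSetupOpenGL(640, 480, OF_WINDOW);
    ofRunApp(new ofApp());
}
```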

Session 03 – 02/07/17 – Audio Interaction

We'll continue developing a multisampler that can record and play back audio. The approach we take will let us scale it arbitrarily and add new functionality fairly easily. We'll then explore another application that chops audio automatically, and see how it can be used to create an infinitely generating random drum track. To do this, we'll first compute the RMS of the audio signal, then use some basic statistics of the RMS to segment the audio into "grains", play those grains back in randomized order, and visualize them.
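
The RMS feature itself is simple; a dependency-free sketch of the windowed version we build on (the window size is a tuning choice):

```cpp
#include <cmath>
#include <vector>

// Windowed RMS: mean signal power over each short window, square-rooted.
// Loud windows (hits) stand out from quiet ones (gaps between hits).
std::vector<float> windowedRMS(const std::vector<float> &signal, int windowSize) {
    std::vector<float> rms;
    for (size_t start = 0; start + windowSize <= signal.size(); start += windowSize) {
        float sum = 0;
        for (int i = 0; i < windowSize; i++) {
            float s = signal[start + i];
            sum += s * s;
        }
        rms.push_back(std::sqrt(sum / windowSize));
    }
    return rms;
}
```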

Session 04 – 02/14/17 – Audio Segmentation

In this session, we finish our implementation of automatic segmentation of an audio signal and use the detected segments for randomized playback. Using a drum loop, we'll see how to make some fun beat-repeat-style audio synthesis.
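
One way to finish the segmenter, sketched here as an assumption about the RMS statistics from last session: mark an onset wherever the envelope rises through its mean plus one standard deviation.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Threshold the RMS envelope at mean + k * stddev and report the
// indices where it crosses upward; each crossing marks a segment onset.
std::vector<size_t> findOnsets(const std::vector<float> &rms, float k = 1.0f) {
    float mean = std::accumulate(rms.begin(), rms.end(), 0.0f) / rms.size();
    float var = 0;
    for (float v : rms) var += (v - mean) * (v - mean);
    float thresh = mean + k * std::sqrt(var / rms.size());

    std::vector<size_t> onsets;
    for (size_t i = 1; i < rms.size(); i++)
        if (rms[i] >= thresh && rms[i - 1] < thresh)   // rising edge
            onsets.push_back(i);
    return onsets;
}
```

Picking a random pair of adjacent onsets each time the previous segment finishes gives the beat-repeat behavior.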

Session 05 – 02/21/17 – Granular Synthesis

This session covers two methods for loading samples: maxiSample, and pkmEXTAudio, which reads and writes many audio formats via Apple's ExtAudioFile API. Finally, we look at how to use Maximilian's granular synthesis engine to independently pitch-shift and time-stretch the Amen break.
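
Maximilian's engine does this properly with several overlapping grain streams; as a conceptual sketch only (not Maximilian's API), a toy single-stream grain player shows why granulation decouples the two controls:

```cpp
#include <cmath>
#include <vector>

const double kTwoPi = 6.283185307179586;

// Toy granular player: "stretch" sets how fast the read position moves
// through the source (time), while "pitch" sets the resampling rate
// inside each grain (pitch). A Hann window fades each grain in and out.
struct GrainPlayer {
    const std::vector<float> &src;   // source sample, e.g. a drum loop
    double grainSize = 2048;         // samples per grain
    double pitch = 1.0, stretch = 1.0;
    double srcPos = 0;               // where grains are read from
    double phase = 0;                // position inside the current grain

    GrainPlayer(const std::vector<float> &s) : src(s) {}

    float tick() {                   // call once per output sample
        double win = 0.5 - 0.5 * std::cos(kTwoPi * phase / grainSize);
        size_t idx = size_t(srcPos + phase * pitch) % src.size();
        float out = src[idx] * win;

        if (++phase >= grainSize) {              // grain finished:
            phase = 0;
            srcPos += grainSize / stretch;       // advance through source
            srcPos = std::fmod(srcPos, (double)src.size());
        }
        return out;
    }
};
```

With stretch = 2 the loop takes twice as long without changing pitch; with pitch = 2 it jumps an octave without changing duration.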

Session 06 – 02/28/17 – Visual Interaction

This session covers some advanced uses of OpenGL, including 3D geometry, meshes, cameras, virtual lighting, and the procedural generation of meshes from a camera or video file.
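
As a sketch of the procedural-generation idea (the subsampling step and depth range are illustrative): one point per camera pixel, with brightness mapped to depth.

```cpp
#include "ofMain.h"

// Build a point-cloud ofMesh from the camera every frame: brightness
// becomes depth, and an ofEasyCam lets you orbit the result.
class ofApp : public ofBaseApp {
public:
    ofVideoGrabber cam;
    ofMesh mesh;
    ofEasyCam easyCam;

    void setup() override {
        cam.setup(320, 240);
        mesh.setMode(OF_PRIMITIVE_POINTS);
    }

    void update() override {
        cam.update();
        if (!cam.isFrameNew()) return;
        mesh.clear();
        ofPixels &pix = cam.getPixels();
        for (int y = 0; y < 240; y += 4) {          // subsample for speed
            for (int x = 0; x < 320; x += 4) {
                ofColor c = pix.getColor(x, y);
                float z = ofMap(c.getBrightness(), 0, 255, -100, 100);
                mesh.addVertex(ofVec3f(x - 160, y - 120, z));
                mesh.addColor(c);
            }
        }
    }

    void draw() override {
        ofBackground(0);
        easyCam.begin();
        mesh.draw();
        easyCam.end();
    }
};

int main() {
    ofSetupOpenGL(1024, 768, OF_WINDOW);
    ofRunApp(new ofApp());
}
```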

Session 07 – 03/07/17 – Computer Vision

This session covers introductory computer vision topics: pixels and colorspaces, frame differencing and background subtraction, optical flow, and the use of OpenCV and ofxCv.
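
OpenCV and ofxCv wrap most of this for us, but frame differencing is simple enough to sketch with raw pixels (sizes illustrative):

```cpp
#include "ofMain.h"

// Frame differencing: compare each new camera frame to the previous
// one; the summed per-pixel change is a rough "amount of motion".
class ofApp : public ofBaseApp {
public:
    ofVideoGrabber cam;
    ofPixels prev;
    ofImage diff;
    float motion = 0;

    void setup() override {
        cam.setup(320, 240);
        diff.allocate(320, 240, OF_IMAGE_GRAYSCALE);
    }

    void update() override {
        cam.update();
        if (!cam.isFrameNew()) return;
        ofPixels &cur = cam.getPixels();
        if (prev.isAllocated()) {
            float total = 0;
            for (int y = 0; y < 240; y++) {
                for (int x = 0; x < 320; x++) {
                    int d = abs(cur.getColor(x, y).getBrightness()
                              - prev.getColor(x, y).getBrightness());
                    diff.getPixels().setColor(x, y, ofColor(d));
                    total += d;
                }
            }
            motion = total / (320 * 240 * 255);   // normalize to 0..1
            diff.update();                        // upload to the GPU
        }
        prev = cur;    // keep a copy for the next comparison
    }

    void draw() override {
        cam.draw(0, 0);
        diff.draw(320, 0);
        ofDrawBitmapString("motion: " + ofToString(motion, 3), 20, 260);
    }
};

int main() {
    ofSetupOpenGL(640, 480, OF_WINDOW);
    ofRunApp(new ofApp());
}
```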

Session 08 – 03/14/17 – Advanced Visual Interaction

This session explores an advanced use of computer vision known as blob tracking. We then build a simple interface for projection mapping and finish with an introduction to the GLSL shading language.
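
A first fragment shader can be only a few lines; a minimal sketch, assuming OF's default OpenGL 2.1 context (the uniform names are illustrative):

```cpp
#include "ofMain.h"

// Compile a tiny GLSL 1.20 fragment shader from a string and draw a
// full-screen rectangle through it: color follows screen position and
// pulses with a "time" uniform fed from the sketch.
class ofApp : public ofBaseApp {
public:
    ofShader shader;

    void setup() override {
        string frag = R"(
            #version 120
            uniform float time;
            uniform vec2 resolution;
            void main() {
                vec2 uv = gl_FragCoord.xy / resolution;
                gl_FragColor = vec4(uv.x, uv.y, 0.5 + 0.5 * sin(time), 1.0);
            }
        )";
        shader.setupShaderFromSource(GL_FRAGMENT_SHADER, frag);
        shader.linkProgram();
    }

    void draw() override {
        shader.begin();
        shader.setUniform1f("time", ofGetElapsedTimef());
        shader.setUniform2f("resolution", ofGetWidth(), ofGetHeight());
        ofDrawRectangle(0, 0, ofGetWidth(), ofGetHeight());
        shader.end();
    }
};

int main() {
    ofSetupOpenGL(640, 480, OF_WINDOW);
    ofRunApp(new ofApp());
}
```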

Session 09 – 03/21/17 – Audiovisual Interaction

We looked at using the camera to make noise, interacting with entities tracked by the blob tracker to drive sound generation.
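
The mapping pattern is the important part: a tracked position becomes synthesis parameters. A minimal sketch with the mouse standing in for a blob centroid (swap in the tracker's centroid where noted; the pitch range is illustrative):

```cpp
#include "ofMain.h"

// Map a tracked point to sound: x position -> oscillator frequency,
// y position -> loudness, rendered in the audioOut() callback.
class ofApp : public ofBaseApp {
public:
    ofSoundStream stream;
    double phase = 0;
    float freq = 220, amp = 0;

    void setup() override {
        stream.setup(this, 1, 0, 44100, 512, 4);   // output only
    }

    void update() override {
        // Replace these two lines with the blob tracker's centroid.
        freq = ofMap(ofGetMouseX(), 0, ofGetWidth(), 110, 880);
        amp  = ofMap(ofGetMouseY(), 0, ofGetHeight(), 1.0, 0.0);
    }

    void audioOut(float *output, int bufferSize, int nChannels) override {
        for (int i = 0; i < bufferSize; i++) {
            phase += TWO_PI * freq / 44100.0;
            output[i] = amp * sin(phase);
        }
    }

    void draw() override {
        ofDrawCircle(ofGetMouseX(), ofGetMouseY(), 10);
    }
};

int main() {
    ofSetupOpenGL(640, 480, OF_WINDOW);
    ofRunApp(new ofApp());
}
```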

Session 10 – 04/04/17 – Advanced Audio Analysis

We looked at the DFT, the mel transform, circular buffers, and MFCCs for audio fingerprinting and analysis.
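
Two of those pieces fit in a few lines each; a sketch (buffer length and bin count are illustrative, and a real implementation would use an FFT):

```cpp
#include <cmath>
#include <vector>

const double kTwoPi = 6.283185307179586;

// A circular buffer keeps the most recent N samples without shifting
// memory: overwrite the oldest sample and advance the head.
struct CircularBuffer {
    std::vector<float> data;
    size_t head = 0;
    CircularBuffer(size_t n) : data(n, 0) {}
    void push(float x) { data[head] = x; head = (head + 1) % data.size(); }
    float at(size_t i) const {            // i-th oldest sample
        return data[(head + i) % data.size()];
    }
};

// Naive DFT magnitudes of the buffered window: correlate the window
// with a sinusoid at each bin frequency. O(N^2), but it shows the math
// an FFT computes in O(N log N).
std::vector<float> dftMagnitudes(const CircularBuffer &buf) {
    size_t N = buf.data.size();
    std::vector<float> mags(N / 2);
    for (size_t k = 0; k < N / 2; k++) {
        float re = 0, im = 0;
        for (size_t n = 0; n < N; n++) {
            double angle = kTwoPi * k * n / N;
            re += buf.at(n) * std::cos(angle);
            im -= buf.at(n) * std::sin(angle);
        }
        mags[k] = std::sqrt(re * re + im * im);
    }
    return mags;
}
```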

Session 11 – 04/11/17 – Machine Learning Part I

We explored the use of k-NN and t-SNE for concatenative synthesis and for visually exploring a grain cloud.
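
The k-NN half is the easiest to sketch: given the features (e.g. MFCCs) of an incoming frame, find the k stored grains with the closest features and play one of them. A brute-force version (assumes k is at most the number of grains):

```cpp
#include <algorithm>
#include <vector>

// Brute-force k-nearest neighbors over grain feature vectors using
// squared Euclidean distance; returns the indices of the k closest.
std::vector<int> kNearest(const std::vector<std::vector<float>> &grains,
                          const std::vector<float> &target, int k) {
    std::vector<std::pair<float, int>> dists;   // (distance, grain index)
    for (int i = 0; i < (int)grains.size(); i++) {
        float d = 0;
        for (size_t j = 0; j < target.size(); j++) {
            float diff = grains[i][j] - target[j];
            d += diff * diff;
        }
        dists.push_back({d, i});
    }
    std::partial_sort(dists.begin(), dists.begin() + k, dists.end());
    std::vector<int> idx;
    for (int i = 0; i < k; i++) idx.push_back(dists[i].second);
    return idx;
}
```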

Session 12 – 04/18/17 – Machine Learning Part II

We explored the use of k-means clustering and Markov chains for generative concatenative synthesis.
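
The generative step can be sketched as a first-order Markov chain over the k-means cluster labels: count which cluster follows which in the original audio, then random-walk those counts to choose what plays next (a hypothetical sketch, not the course code):

```cpp
#include <cstdlib>
#include <vector>

// First-order Markov chain over cluster labels: counts[a][b] is how
// often cluster a was followed by cluster b in the training sequence.
struct MarkovChain {
    std::vector<std::vector<int>> counts;

    MarkovChain(int k) : counts(k, std::vector<int>(k, 0)) {}

    void train(const std::vector<int> &labels) {
        for (size_t i = 0; i + 1 < labels.size(); i++)
            counts[labels[i]][labels[i + 1]]++;
    }

    int next(int current) {     // sample the next label proportionally
        int total = 0;
        for (int c : counts[current]) total += c;
        if (total == 0) return current;   // unseen state: stay put
        int r = rand() % total;
        for (int b = 0; b < (int)counts.size(); b++) {
            r -= counts[current][b];
            if (r < 0) return b;
        }
        return current;   // not reached
    }
};
```

Each step, the sampled label picks a cluster, and a random grain from that cluster is played, so the output follows the source's transition statistics without repeating it verbatim.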

Session 13 – 04/25/17 – NO CLASS

Session 14 – 05/02/17 – Project Help (Lab Only / Requests)

Final Project Presentations – 05/09/17 – Projects Due