Latest News

YouTube’s “Copyright School”

Ever wonder what happens when you’ve been accused of violating copyright multiple times on YouTube? First, you get a redirect to YouTube’s “Copyright School” whenever you visit YouTube, forcing you to watch a cartoon of Happy Tree Friends where the main character is dressed as an actual pirate:

Second, I’m guessing, your account will be banned. Third, you cry and wonder why you ever violated copyright in the first place.

In my case, I’ve disputed every one of the 4 copyright violation notices that I’ve received on grounds of Fair Use and Fair Dealing. Here’s what happens when you file a dispute using YouTube’s online form:

3 of the 4 have been dropped after I filed disputes, though I’m still waiting to hear the response to the above dispute. Read the dispute letter to Sony ATV and UMPG Publishers in full here.

The picture above shows a few stills from what my Smash Ups look like. The process, described in greater detail on createdigitalmotion.com, is part of my ongoing research into how existing content can be transformed into artistic styles reminiscent of analytic cubist, figurative, and futurist paintings. The process to create the videos uses content-based information retrieval techniques that I would assume are very similar to (though likely not as advanced as) the techniques used to flag the video as a duplicate copy in the first place, namely YouTube’s Content ID System. Until Sony and UMPG respond, the infringing video is still available on YouTube:

Regardless of my disputes, I’m now redirected to YouTube’s Copyright School whenever I visit YouTube (until I successfully complete the test):

A bit about Happy Tree Friends – it is, according to Wikipedia, “extremely violent, with almost every episode featuring blood, pain, and gruesome deaths…depicting bloodshed and dismemberment in a vivid manner.” Never mind. I’m a copyright violator, I can handle a little dismemberment. In fact, that is exactly what I’ve done to the “Copyright School” video: dismembered it with the content of 70 videos of Happy Tree Friends using the same process which brought me to YouTube’s “Copyright School” in the first place:

I hope Russell the Pirate doesn’t feel his copyright is being violated.

Related: Copyright Violation Notice from “Rightster”, Intention in Copyright, EFF Wins Renewal of Smartphone Jailbreaking Rights Plus New Legal Protections for Video Remixing, An open letter to Sony ATV and UMPG

An open letter to Sony ATV and UMPG

Dear Sony ATV Publishing, UMPG Publishing, and other concerned parties,

I ask you to please withdraw your copyright violation notice on my video, “PSY – GANGNAM STYLE (강남스타일) M/V (YouTube SmashUp)” as I believe my use of any copyrighted material is protected under Fair Use or Fair Dealing. This video was created by an automated process as part of an art project developed during my PhD at Goldsmiths, University of London: http://pkmital.com/home/projects/visual-smash-up/ and http://pkmital.com/home/projects/youtube-smash-up/

The process which creates the audio and video is entirely automated, meaning the accused video is created by an algorithm. This algorithm begins by first creating a large database of tiny fragments of audio and video (less than 1 second of audio per fragment) using 9 videos from YouTube’s top 10 list. Within this database, the tiny fragments of video and audio are stored as unrelated pieces of information and described only by a short series of 10-15 numbers. These numbers represent low-level features describing the texture and shape of the fragment of audio or video. These tiny fragments are then matched to the tiny fragments of audio and video detected within the target for resynthesis, in this case the number one YouTube video at the time, “PSY – GANGNAM STYLE (강남스타일) M/V”.

To reiterate, the content from the target video, “PSY – GANGNAM STYLE (강남스타일) M/V”, is not used in the resulting synthesis. That is, the process creates a new video not by merely copying the target video, but by attempting to re-create it out of entirely different material, the remaining 9 top 10 YouTube videos. Abstractly, there may appear to be a similar form or structure due to the collection of many fragments organized in a similar way as the target for resynthesis. These fragments, however, are from a very large collection of material very different from the original content’s own material. The content used in the resynthesis itself is only from the large database of tiny fragments of audio and video segmented from 9 other videos. As a result, I would argue the use of any content within this video is only through Fair Use or Fair Dealing of the content.

This art project’s purpose is to highlight an important aspect of how computers and humans perceive, and how copyright itself may be dealt with within a computational arts practice which by its nature has to make use of existing content. The nature of this work further seeks to transform existing material into something entirely different such that experiencing a resynthesized video reveals a new understanding of one’s own perception. The content used is fragmented in nature and assembled using a coarse idea of audiovisual scene understanding with no notion of semantics. As a result, the video itself is very abstract and at times incomprehensible. Further, its effect on the publisher’s market, as noted by the very low view rate on YouTube, is marginal at best. I therefore ask you to please withdraw your copyright claim.

Sincerely,
Parag K. Mital

Related: Copyright Violation Notice from “Rightster”, Intention in Copyright, EFF Wins Renewal of Smartphone Jailbreaking Rights Plus New Legal Protections for Video Remixing, YouTube’s Copyright School

[UPDATE Dec 8, 2012: All copyright violation notices have been dropped and the video is publicly accessible.]

Copyright Violation Notice from “Rightster”

I’ve been working on an art project which takes the top 10 videos in YouTube and tries to resynthesize the #1 video in YouTube using the remaining 9 videos. The computational model is based on low-level human perception and uses only very abstract features such as edges, textures, and loudness. I’ve created a new synthesis each week using the top 10 of the week in the hopes that, one day, I will be able to resynthesize my own video in the top 10. It is a viral algorithm essentially but it is not proven if it will succeed or not.
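The matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the project: each fragment is assumed to be already reduced to a short list of numbers, Euclidean distance stands in for whatever similarity measure is really used, and the names `nearest_fragment` and `resynthesize` are hypothetical.

```python
import math

def nearest_fragment(target_features, database):
    """Return the index of the database fragment closest to the target.

    `database` is a list of (features, source_label) pairs taken from the
    other videos; the target video's own fragments are never stored in it.
    """
    best_index, best_distance = -1, math.inf
    for index, (features, _label) in enumerate(database):
        distance = math.dist(target_features, features)  # Euclidean (assumed)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index

def resynthesize(target_fragments, database):
    """Replace every target fragment by the closest database fragment."""
    return [database[nearest_fragment(f, database)][1] for f in target_fragments]

# Toy data: 2 numbers per fragment instead of a full feature description.
database = [
    ([0.0, 0.0], "video2_fragment0"),
    ([1.0, 1.0], "video3_fragment0"),
    ([5.0, 5.0], "video4_fragment0"),
]
target = [[0.9, 1.2], [4.0, 6.0], [0.1, -0.2]]
print(resynthesize(target, database))
# -> ['video3_fragment0', 'video4_fragment0', 'video2_fragment0']
```

The output keeps the ordering of the target but contains only material from the database, which is why the result shares a loose structure with the #1 video while looking and sounding nothing like it.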

The database of content used in the recreation of the above video comes from the following videos:
#2 News Anchor FAIL Compilation 2012 || PC
#3 Flo Rida – Whistle [Official Video]
#4 Carly Rae Jepsen – Call Me Maybe
#5 Jennifer Lopez – Goin’ In ft. Flo Rida
#6 Taylor Swift – We Are Never Ever Getting Back Together
#7 will.i.am – This Is Love ft. Eva Simons
#8 Call Me Maybe – Carly Rae Jepsen (Chatroulette Version)
#9 Justin Bieber – As Long As You Love Me ft. Big Sean
#10 Rihanna – Where Have You Been

It looks and sounds like an abstract mess.

Today, I’ve received a somewhat automated copyright violation notice from YouTube (shown below) suggesting my smashup of “11 Month Old Twins Dancing to Daddy’s Guitar (YouTube Smash Up)” (shown above via Vimeo instead of YouTube) is infringing the “audiovisual content administered by: Rightster” (their website describes them as: “Services that optimise the distribution and monetisation of live + on demand video for sports rights holders, news networks, event owners and publishers”), and my account has been placed under “Not in a good standing”. Acknowledging infringement seems to be the suggested path via YouTube’s automated copyright infringement system (see pictures below). Though perhaps I should instead dispute under fair-use terms, and risk my account being banned if my dispute is “fraudulent”.

[UPDATE: 17/10/12]: I’ve disputed the claim and my account status is still “Not in a good standing” (though I have no idea what this means). YouTube says they will temporarily put the video back online though this may change at any time:

“After your dispute has been submitted, your video will soon be available on YouTube without ads for third parties. This is a temporary status and might change at any time. Learn more about copyright on YouTube.
As a result, your account has been penalized and is not in good standing. Deleting the video will remove the penalty due to this claim.”

Related: YouTube’s “Copyright School”, Intention in Copyright, EFF Wins Renewal of Smartphone Jailbreaking Rights Plus New Legal Protections for Video Remixing

3D Musical Browser

I’ve been interested in exploring ways of navigating media archives. Typically, you may use iTunes and go from artist to artist, or have managed to tediously classify your collection into genres. Some may still even browse their music through a file browser, perhaps making sure the folders and filenames of their collection are descriptive of the artist, album, year, etc… Though what about how the content actually sounds?

Wouldn’t it be nice to hear all music which shares similar sounds, or similar phrases of sounds? Research in the last 10-15 years has developed methods precisely to solve this problem. They fall under the umbrella term content-based information retrieval (CBIR) algorithms, or uncovering the relationships of an archive through the information within the content. For images, Google’s Search by Image is a great example which only recently became public. For audio, audioDB and Shazam are good examples of discovering music through the way it sounds, or the content-based relationships of the audio itself. However, each of these interfaces presents a list of matches to an image or audio query, making it difficult to explore the content-based relationships of a specific set of material.

The video above demonstrates interaction with a novel 3D browser of a collection of music by one artist, Daphne Oram. The sounds are grouped in 3D space based on the way they sound, clustering together similar sounding material. Each of the 3 axes describes a grouping of sound frequencies, that is, a timbre or texture of sound. The further a sound sits along one of these axes, the more of that group of frequencies is present in the sound file.
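One way to picture how a clip could get its position is the sketch below. It is not the browser's actual code: the three fixed frequency bands, the naive DFT, and the names `magnitude_spectrum` and `band_position` are all assumptions made for illustration, since the groupings of frequencies used in the browser are not spelled out here.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins below the Nyquist frequency."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(samples)))
        for k in range(n // 2)
    ]

def band_position(samples, sample_rate, band_edges_hz=(0, 500, 2000, 8000)):
    """Map a clip to (x, y, z): the share of energy in three bands.

    The band edges are hypothetical; any three groupings of frequencies
    could be substituted.
    """
    spectrum = magnitude_spectrum(samples)
    hz_per_bin = sample_rate / len(samples)
    energies = [0.0, 0.0, 0.0]
    for k, magnitude in enumerate(spectrum):
        frequency = k * hz_per_bin
        for axis in range(3):
            if band_edges_hz[axis] <= frequency < band_edges_hz[axis + 1]:
                energies[axis] += magnitude * magnitude
    total = sum(energies) or 1.0  # avoid dividing by zero for silence
    return tuple(e / total for e in energies)

# A 1 kHz test tone should sit far along the second (middle band) axis.
tone = [math.sin(2 * math.pi * 1000 * t / 16000) for t in range(256)]
print(band_position(tone, 16000))
```

Clips with similar spectra end up with similar coordinates, so they appear near each other in the 3D view.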

Exploring her work in this browser really demonstrates the variety of sounds she achieved. It also makes exploring the collection really fun, as there is a playful visual form, and you get to hear things right away.

The browser has also been built to be a real-time tool for creating new sounds. Mousing over any of the tiny boxes (representing parts of audio files in the collection) triggers the clip to play. Since similar sounding clips are grouped closer together, one can “perform” the collection along perceptually coherent axes by moving the mouse along any of the axes.

Intention in Copyright

The following article is written for the LUCID Studio for Speculative Art based in India.

Introduction

My work in audiovisual resynthesis aims to create models of how humans represent and attend to audiovisual scenes. Using pattern recognition of both audio and visual material, these models use large corpora of learned audiovisual material which can be matched to ongoing streams of incoming audio or visual material. The way audio and visual material is stored and segmented within the model is based heavily on neurobiology and behavioral evidence (the details are saved for another post). I have called the underlying model Audiovisual Content-based Information Description/Distortion (or ACID for short).

As an example, a live stream of audio may be matched to a database of learned sounds from recordings of nature, creating a re-synthesis of the audio environment at present using only pre-recorded material from nature itself. These learned sounds may be fragments of a bird chirping, or the sound of footsteps. Incoming sounds of someone talking may then be synthesized using the closest sounding material to that person talking, perhaps a bird chirp or a footstep. Instead of a live stream, one can also re-synthesize a pre-recorded stream. Consider using a database of nature recordings and, instead of the live-stream, now use a pre-existing recording of Michael Jackson. The following video demonstrates the output using Michael Jackson’s “Beat It”.

Everything you hear comes from nature recordings (by Chris Watson). Try to realize what elements of Michael Jackson’s original recording remain “meaningful”. The beat of the song is incredibly predominant (@ 33 seconds). As well, some aspects of the lyrics are present and heavily cross-modally present with the influence of the visual (e.g. @ 1:27), though no words are audible as the database contained no words. Also consider what meaningful information may be present in the opposite scenario, i.e. using a database of Michael Jackson, and re-synthesizing nature sounds.

As a side note, I have developed a similar approach for visual resynthesis, taking segments of visual objects as the basis of the resynthesis algorithm, rather than segments of audio. The example below demonstrates a resynthesis of the introduction to The Simpsons using only material from the introduction to Family Guy:

More examples are on my Vimeo channel.

Audio Collage

The idea of audio collage is not new. Electronic musicians have investigated the technique within the practice of musique concrète. The advent of digital sampling with devices such as the Fairlight CMI also made the practice much more accessible. Plunderphonics, a technique by John Oswald in which he manually chopped and resynthesized a number of copyrighted albums, created entirely new landscapes and sounds of material. It was also formalized in his essay, “Plunderphonics, or Audio Piracy as a Compositional Prerogative”, where he questioned where copyright can begin to claim ownership. After 12 years of developing the technique, he famously made use of Michael Jackson recordings in his EP Plunderphonics, perhaps the most extensive use of sampling to date, whose cover featured an image of Michael Jackson’s head on a naked woman’s body.

The album itself made use of material from a variety of artists, all credited, such as Count Basie, Dolly Parton, Beethoven, and Michael Jackson to recreate unheard-of sounds reminiscent of Stravinsky and The Beatles. Oswald believed that by not selling the album, he was not infringing on anyone’s copyright. However, he faced legal pressure from CBS and Jackson’s attorneys and was forced to stop releasing the album and to destroy all remaining copies. Other artists of note include Negativland, who experienced similar legal battles with U2, and Cassetteboy, whose video collage of BBC material of Queen Elizabeth was stripped from YouTube. Numerous documentaries including RiP: A Remix Manifesto and Good Copy Bad Copy, books such as Cutting Across Media: Appropriation Art, Interventionist Collage, and Copyright Law and Lessig’s Remix, and funded studies such as Recut, Reframe, Recycle have also focused on the topic (Thanks to Nathan Harmer for additional links).

Perception is Inference

I have extended these questions into the very nature of perception, claiming the computational models I employ are plausible models of our own psychological modeling of audio and vision itself. I try to make explicit one possible mechanism of perception as a meaning-making inference machine, an idea which dates back at least to Helmholtz in 1869. The very nature of inference entails that understanding requires prior experiences, prior models, a set of known examples. By taking the small fragments of sound and rearranging them, these perceptual units lose their context, and necessarily their original meaning, and are only bound together by the ongoing environment to create a new meaning based on the organization of the existing environment. In other words, without the environment to make a sound, there is no synthesis of a sound. The meaning therefore is created both by the viewer and by the target of the synthesis. The fragments within the database have no intentionality or meaning attributed to them until they are re-organized, resynthesized, and re-appropriated within a new context. In the video above, this context is Michael Jackson’s Beat It; however, not a single fragment of Michael Jackson’s Beat It appears in the audio output.

No Infringement of Copyright Intended

Where then does copyright hold stake within the computational models I have created? The ongoing environment provides the intentional influence of how the sounds are to be rearranged. If Michael Jackson were to appear in the environment and sing a song that also appears in the corpus, it is likely the synthesis would re-create Michael Jackson’s song. As our own mental machinery encompasses having heard Michael Jackson before, we are able to recognize Michael Jackson. However, now consider a database containing tiny fragments of Michael Jackson’s recordings and a target of birds chirping and taxis honking. Then, the only semblance of Michael Jackson is based on the tiny fragments that appear, and the organization of a song of Michael Jackson is no longer present.

Now the question appears, “Does copyright hold stake over the ongoing environment’s intentions?”, requiring no one to perform Michael Jackson for fear of the copyrighted re-synthesis? Or does copyright instead hold stake in the subtle fragments of sounds which were sampled from a copyrighted track? Let us say I set up this computational model as an installation environment where the database contains both Michael Jackson songs and nature recordings. As a target for synthesis, I have a microphone feeding live audio to the computational model. Until the microphone hears something *like* Michael Jackson, creating a synthesis using the tiniest fragment of a Michael Jackson recording, it seems I will not have violated copyright.

Even still, how could we have understood that a tiny fragment of a resynthesis was copyrighted in the first place? We would have had to match every tiny fragment within our own perceptual machinery (i.e. recognition) to have understood that we had heard this fragment within a different context, that is, a copyrighted one. Though, is it not the case that any meaning-making inference will necessarily be matched to a prior experience? If so, then isn’t copyright claiming our own experiences as copyright? Where does the context of sound take place within copyright? Could I not listen to the wind in the trees and be reminded of Michael Jackson? Or more likely, hear the wind and be reminded of a Chris Watson recording?

The question I am getting at is “How does the meaning elicited by the new resynthesis change our notion of copyright?” Consider the opposite scenario, where Michael Jackson’s songs are no longer in the database, and instead we have nature recordings, as the video above does. What does copyright have to say for using the organization of Michael Jackson’s songs though using entirely different content? In this case it seems I am using the artist’s full intention, though not breaking any copyright as the material is not evidently materialized in the resynthesis.

Early 20th Century

The early Dada movement of the 1920s also looked at resynthesis within a narrative context. During a surrealist rally in the 1920s, Tristan Tzara is famously noted as standing in front of a theater and pulling random fragments of text out of a hat before the riot ensued and wrecked the theater. T.S. Eliot’s The Waste Land and John Dos Passos’ U.S.A. trilogy are also early examples of the cut-up technique popularized by Tzara. The technique was further made famous by Brion Gysin and William Burroughs during the 1950s in their poetry, which made heavy use of the cut-up technique in a variety of fashions. Gysin came across the technique by accident as he made use of layers of newspapers while cutting paper on top of them using a razor blade (originally to protect the table). He noticed the cut-up fragments of newspaper created a juxtaposition of image and text that was strangely coherent and meaningful. The two key terms to stress are ‘coherent’ and ‘meaningful’, qualities which our perceptual systems cannot help but create from the world. In fact, a number of theories of neuro-cognitive behavior are also based on these premises. Interested readers are encouraged to read Ronald Rensink’s work on theorizing the phenomenon of visual Change Blindness into “Coherence Theory”, and Shihab Shamma’s work on modeling auditory perception within a “Temporal Coherence Theory”.

Fredric Jameson also discusses the nature of collage, an early analytic cubist practice developed by Picasso and Braque which graced the start of the century, as a process which creates a new meaning from existing material. In contrast to pastiche or bricolage, collage seeks a new meaning, whereas pastiche is often used pejoratively to denote a random intention or imitation of existing intentions, e.g. 16th century forgeries/imitations.

Conclusion

The distinction of collage versus pastiche seems incredibly relevant to practitioners making use of sampling and collage. However, I doubt such a distinction would help anyone in court. Though it is curious to think of the opposing scenario, where the material content is not copyrighted, though the intention or organization of the content is. How does a digital future handle this distinction without referring to intentions of an artist? Further, how could it prove an artist’s intentions in the first place?

Related: Copyright Violation Notice from “Rightster”, YouTube’s “Copyright School”, EFF Wins Renewal of Smartphone Jailbreaking Rights Plus New Legal Protections for Video Remixing

Course @ CEMA Srishti School of Design, Bangalore, IN

From November 21st to the 2nd of December, I’ll have the pleasure to lead a course and workshop with Prayas Abhinav at the Center for Experimental Media Arts in the Srishti School of Design in Bangalore, IN. Many thanks to Meena Vari for all her help in organizing the project.

Stories are flowing trees

Key words:  3D, interactive projects, data, histories, urban, creative coding, technology, sculpture, projection mapping

Project Brief:

Urban realities are more like fictions, constructed through folklore, media and policy. Compressing these constructions across time would offer some possibilities for the emergence of complexity and new discourse. Using video projections adapted for 3D surfaces, urban histories will become data and information – supple, malleable, and material.

The project will begin with a one-week workshop by Parag Mital on “Creative Coding” using the openFrameworks platform for C/C++ coding.

About the Artists:

Prayas Abhinav

Presently he teaches at the Srishti School of Art, Design and Technology and is a researcher at the Center for Experimental Media Arts (CEMA). He has taught in the past at Dutch Art Institute (DAI) and Center for Environmental Planning and Technology (CEPT).
He has been supported by fellowships by Openspace India (2009), TED (2009), Center for Media Studies (CMS) (2006), Public Service Broadcasting Trust (PSBT) (2006), Sarai/CSDS (2005). He has presented his projects and proposals in the last few years at Periferry, Guwahati (2010), Exit Art, New York (2010), Futuresonic, Manchester (2009), Wintercamp, Amsterdam (2009), 48c: Public Art Ecology (2008), Khoj (2008), Urban Climate Camp, ISEA (2008), Sensory Urbanism, Glasgow (2008), First Monday, Chicago (2006), The Paris Accord (2006) and PSBT/Prasar Bharti (2006).
He has also participated in the exhibitions Myth ?? Reality (2011) at The Guild, Mumbai, Continuum Transfunctioner (2010) at exhibit 320 in Delhi, Contested Space – Incursions (2010) at Gallery Seven Arts in Delhi and Astonishment of Being (2009) at the Birla Academy of Art and Culture in Kolkata.

http://prayas.in

Parag K Mital (London)

Parag K Mital is an American-born London-based PhD-student in Arts and Computational Technology at Goldsmiths, University of London working on augmented realities and audiovisual resynthesis. As an audiovisual installation artist, his work encourages the audience to directly question the processes surrounding perception through introspection and curiosity from experiencing real-time models of audiovisual perception. His work has traveled extensively in London, Athens, and Moscow, including the London Science Museum and the British Film Institute. As an educator he has taught at Edinburgh University, Goldsmiths, University of London, and is due to deliver a course on Audiovisual Processing for iPhone/iPad at the Victoria & Albert Museum in London.

http://pkmital.com

Workshop in “Creative Coding” using the openFrameworks platform

The workshop will cover the basics of openFrameworks, a C/C++ creative coding platform. This course will also introduce students to digital signal processing techniques in synthesis and analysis of audio and visual signals for interactive techniques using custom made libraries developed at Goldsmiths, University of London. Depending on interest, participants will also receive a tutorial on developing for the iPhone/iPad in order to create real-time audiovisual apps.

http://www.openframeworks.cc
http://pkmital.com/home/code/
http://maximilian.strangeloop.co.uk/

Memory Mosaicing

A product of my PhD research is now available on the iPhone App Store (for a small cost!): View in App Store.

This application is motivated by my interests in experiencing an Augmented Perception and of course very much inspired by some of the work here at Goldsmiths. The application of existing approaches in soundspotting/mosaicing to a real-time stream and situated in the real world allows one to play with their own sonic memories, and certainly requires an open ear for new experiences. Succinctly, the app records segments of sounds in real-time using its own listening model, as you walk around in different environments (or sit at your desk). These segments accumulate the longer the app is left running, forming a database (a working memory model) with which to understand new sounds. Incoming sounds are then matched to this database and the closest matching sound is played instead. What you get is a polyphony of sound memories triggered by the incoming feed of audio, and an app which sounds more like your environment the longer it is left to run. A sort of gimmicky feature of this app is the ability to learn a song from your iTunes Library. What this lets you do is experience your sonic world as your favorite hip-hop song or whatever you listen to.
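The record-then-match loop can be sketched as follows. This is a toy model, not the app's code: the listening model that segments the audio is left out, each segment is assumed to arrive already described by a few numbers, Euclidean distance is an assumed similarity measure, and `SoundMemory` and `hear` are hypothetical names.

```python
import math

class SoundMemory:
    """A growing database of heard segments (a toy working memory)."""

    def __init__(self):
        self._segments = []  # list of (features, clip) pairs

    def hear(self, features, clip):
        """Return the stored clip closest to the incoming one, then remember it.

        Returns None while the memory is still empty.
        """
        match = None
        if self._segments:
            closest = min(self._segments,
                          key=lambda segment: math.dist(segment[0], features))
            match = closest[1]
        self._segments.append((tuple(features), clip))
        return match

memory = SoundMemory()
print(memory.hear([0.0, 0.0], "door slam"))   # -> None
print(memory.hear([1.0, 1.0], "bird chirp"))  # -> door slam
print(memory.hear([0.9, 1.0], "speech"))      # -> bird chirp
```

Because every incoming segment is stored after it is matched, later sounds have more candidates to draw from, which is why the output drifts closer to the surrounding environment the longer the loop runs.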

Hope you have a chance to try it out and please forward to anyone of interest.

Concatenative Video Synthesis (or Video Mosaicing)

prototype

Working closely with my adviser Mick Grierson, I have developed a way to resynthesize existing videos using material from another set of videos. This process starts by learning a database of objects that appear in the set of videos to synthesize from. The target video to resynthesize is then broken into objects in a similar manner, and each of these objects is matched to objects in the database. What you get is a resynthesis of the video that appears as beautiful disorder. Here are two examples, the first using Family Guy to resynthesize The Simpsons, and the second using Jan Svankmajer’s Food to resynthesize Jan Svankmajer’s Dimensions of Dialogue.

Google Earth + Atlantis Space Shuttle

I managed to catch the live feed from NASA.gov of the Atlantis Space Shuttle launch yesterday. Though what I found really interesting was a real-time virtual reality of the space shuttle launch from inside Google Earth. Screen-capture with obligatory 12x speedup to retain attention span below:

Lunch Bites @ CULTURE Lab, Newcastle University

I was recently invited to the CULTURE lab at Newcastle University by director, Atau Tanaka. I would say it has the resources and creative power of 5 departments all housed in one spacious building. In the 12-some studios housed over 3 floors, over the course of 2 short days, I found people building multitouch tables, controlling synthesizers with the touch of fabric, and researching augmented spatial sonic realities. There is a full suite of workshop tools including a laser cutter, multiple multi-channel sound studios, full stage/theater with stage lighting and multiple projection, radio lab, and tons of light and interesting places to sit and do whatever you feel like doing. The other thing I found really interesting is there are no “offices”. Instead, the staff are dispersed amongst the students in the twelve-some studios, picking a new desk perhaps whenever they need a change of scenery? If you are ever in the area, it is certainly worth a visit, and I’m sure the people there will be very open to tell you what they are up to.

I also had the pleasure to give a talk on my PhD research in Resynthesizing Audiovisual Perception with Augmented Reality at the Lunch BITES seminar series. Slides are below, though the embedded media is removed. Comments are welcome!

Download (PDF, 1.12MB)



Copyright © 2010 Parag K Mital. All rights reserved.