Hypercinema Blog

Project 1: Sound and Space

For this project we had to choose an object that would serve as the key to a space, collect its sounds, and then use the collected sound bites to assemble a soundscape or sound-walk of a specific location. The first thing that came to mind was the subway.

A subway station contains not only the bustle of the street above, but also the beeping card readers at the turnstiles, the announcements for the next train, the roar of a train pulling into the station, and so on. This environment is ideal for us because it is full of distinctive, information-rich sounds that help the audience identify both the specific object and the particular location it belongs to.

We quickly decided to record at the Times Square subway station because it is the closest and busiest station to us, and we knew we would find plenty of distinctive sounds there.

Our team decided that each of us would make a personal version; at our Monday meeting we would choose whichever version best fit our statement and then refine that version together.

During my editing process, I realized footsteps would be a good focal point: the subway station is boisterous, yet heavy footsteps still draw people’s attention even in a noisy environment. So I set my theme as footsteps in a New York subway station, with the sound of footsteps as my key object. I wanted to guide the audience through the footsteps so they could feel the scene changes and the shifting emotions.

After meeting with the group on Monday, we decided to use my idea. However, my teammates felt that my audio contained many strange sounds that didn’t make sense on their own. In other words, listeners who heard my explanation first might understand what I wanted to express, but without it, parts of the audio would be confusing, and several sounds would mislead the audience.

So we decided to revise it together. At first, Oscar suggested adding music to convey emotion, but we quickly found that this didn’t work: we couldn’t find suitable music, and it made the whole track sound cluttered. Vivian then suggested reflecting emotion by changing the pace of the footsteps. We tried it and it worked well. After Oscar corrected some audio errors and adjusted the details, our soundtrack came together. When we close our eyes and listen, we can almost see the picture. I hope you can experience it too.

Synthetic Media

The synthetic media I want to share is Synthesia. The reason I chose it is straightforward: compared with other synthetic media made purely for entertainment, it is more practical and can benefit society. The company’s core technology transfers and synthesizes the facial expressions and lip shapes of a person in one video based on another video. For example, suppose Beckham records a video in English and other speakers record the same message in nine different languages; AI is then used to create videos in which Beckham appears to speak all nine languages. If this is widely adopted, perhaps future media will no longer be restricted by language, and people will be able to experience it more immersively.

Unfortunately, because of trade secrets, the company has not open-sourced its code, so I have no way of knowing exactly how its machine learning models were built and trained. Still, their API reference suggests that the technology records the mouth shapes native speakers make for specific words and then keeps iterating and training on users’ input over the network.

Synthesia faces the same ethical issue as most synthetic media: digitized impersonation. Although it currently seems to be used for positive purposes, as more and more text and expressions can be replaced, it may pose risks to individuals and society. If it is not properly controlled, the resulting defamation and false-light claims will cause serious legal and ethical problems. The consequence would be that fewer and fewer people use the tool for positive work; instead, like much existing synthetic media, it would become a tool for harmful entertainment, illegal profit, and slander.

Identify 3 models on Runway ML

The first model I want to try is the text-to-image generator. I hope that by inputting enough information about my face, the AI can generate a face that matches my description. My guess is that the model learns correspondences between text and images, for example by collecting images paired with descriptive words from the Internet, so that it can respond to a user’s text input with a matching picture.

After the AI generates a photo that matches my description, I want to blend my real face with this “fake” face. Seeing the result should help me understand how the process works. I suspect it simply brings together the most distinctive facial features from the two faces.
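To get a feel for the crudest version of “bringing two faces together,” here is a minimal sketch that simply alpha-blends two aligned portraits with OpenCV. This is my own illustration, not the model Runway ML uses, and the file names are hypothetical; a real face-synthesis model would align facial landmarks and merge learned features rather than raw pixels.

```python
# Crude illustration of combining two faces: a plain 50/50 alpha blend.
# File names are placeholders.
import cv2

real_face = cv2.imread("my_face.jpg")          # my real photo
fake_face = cv2.imread("generated_face.jpg")   # the AI-generated photo

# Resize the generated face so both images share the same dimensions.
fake_face = cv2.resize(fake_face, (real_face.shape[1], real_face.shape[0]))

# Blend the two images equally and save the result.
blended = cv2.addWeighted(real_face, 0.5, fake_face, 0.5, 0)
cv2.imwrite("blended_face.jpg", blended)
```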

Finally, I want to run the face-detection model on the images to judge whether the computer-synthesized face can pass this test. I think it works by training the machine continuously and then using the learned features to infer whether an image contains a person. My goal is to feed the AI-generated pictures to the detector and see whether, in today’s world, AI can detect AI.
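As a rough idea of what that detection check looks like in code, here is a minimal sketch using OpenCV’s classic Haar-cascade face detector (a classical method, not necessarily the model Runway ML wraps); if it returns at least one bounding box for the synthesized portrait, the “AI detects AI” test passes. The input file name is a placeholder.

```python
# Minimal face-detection check with OpenCV's pretrained Haar cascade.
import cv2

# Load the frontal-face cascade that ships with opencv-python.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

img = cv2.imread("blended_face.jpg")          # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # the detector expects grayscale

# Returns (x, y, w, h) boxes; empty if no face is found.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(f"Detected {len(faces)} face(s)")       # >= 1 means the fake face "passes"
```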

Experiment in Synthetic Media

Intro

In this experiment, I used different models in Runway ML to generate imagery and combine it with the concept I wanted to express. I mainly worked with two models: Bring Old Photo Back To Life and Neural Style.

Purpose

The essence of what I want to express in this experiment is a call for people to pay attention to environmental issues and the protection of endangered animals. To achieve this, I plan to take world-famous paintings that symbolize despair and apocalypse and combine their distinctive colors and atmosphere with seemingly beautiful scenery and animals, creating a strong visual impact through the color contrast and thereby awakening people’s concern for the deteriorating environment and for endangered animals.

To do this, I will first use the Bring Old Photo Back To Life model to turn the old paintings into higher-resolution, more saturated versions that are easier to work with, and then use Neural Style to merge each painting with a photograph, so that the natural environment and the endangered animals are rendered with the painting’s distinctive colors and atmosphere.
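For anyone who wants to try the Neural Style step outside of Runway, here is a minimal sketch using the pretrained arbitrary-image-stylization model published on TensorFlow Hub. This is an assumption on my part, not necessarily the exact model Runway ML wraps, and the file names are hypothetical.

```python
# Minimal neural style transfer sketch with a pretrained TF Hub model.
import tensorflow as tf
import tensorflow_hub as hub

def load_image(path, max_dim=512):
    """Load a JPEG as a float32 tensor in [0, 1] with a batch dimension."""
    img = tf.io.read_file(path)
    img = tf.image.decode_image(img, channels=3, dtype=tf.float32)
    scale = max_dim / max(img.shape[0], img.shape[1])
    img = tf.image.resize(img, [int(img.shape[0] * scale), int(img.shape[1] * scale)])
    return img[tf.newaxis, :]

content = load_image("leopard_photo.jpg")     # the endangered-animal photo
style = load_image("restored_painting.jpg")   # the restored painting

# Pretrained arbitrary style-transfer model from TensorFlow Hub.
model = hub.load("https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2")
stylized = model(tf.constant(content), tf.constant(style))[0]

# Drop the batch dimension and write the stylized image to disk.
tf.keras.utils.save_img("stylized.jpg", stylized[0])
```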

Bring Old Photo Back To Life

Now, please enjoy the works before and after processing.

Loss of the Indiaman Kent - Theodore Gudin - 1828

Louis Hector Leroux-1829

The River Styx-Felix Resurreccion Hidalgo-1887

The Scream-Edvard Munch-1893

After AI processing, the old paintings look brand new. Before processing, the colors of some paintings had dulled from long storage; afterward, their saturation and contrast are significantly improved. In addition, areas where the paint had flaked off are filled in with appropriate colors, and the AI smooths out sharp, jagged details. This really brings the paintings back to life. Most importantly, the restored paintings, with their full color and high contrast, work much better as inputs to the next step.

Neural Style

The processed pictures take on the mood and atmosphere of the paintings. I especially like the first one: after processing, the originally colorful, vivid animal photo not only turns gloomy overall, but the leopard looks as if it is burning. This matches what I wanted to express, and arguably even exceeds it. This way of expressing the idea makes the need to protect endangered animals much more vivid.

The other works also have a much stronger visual impact after synthesis. Many scenes look like doomsday, and I hope they arouse empathy in most viewers and make them feel that impact, strengthening their determination to protect the environment and endangered animals.

The fly in the ointment is that works with similar themes end up with nearly identical colors, which can make it hard to tell the composites apart. And for works like The Scream, the feeling after synthesis is not as strong as the feeling I wanted to carry over from the original composition.

After the Project

When I first settled on the concept, I tried to find videos related to protecting endangered animals and to synthesize the animals so that they appeared to be talking. But I found that most video synthesis models are not up to this task. The main reason is the large gap between recognizing human faces and animal faces: in most cases, mapping a human face onto an animal face looks weird and unconvincing. And when a video of a person speaking is applied to a picture of an animal, the machine cannot match the human face to the animal’s, so I would end up with, for example, a video in which a panda’s right eye appears to be talking while its mouth just shakes. After a week of experimenting, I gave up on video and switched to still images. This is clearly more stable, but also less fun and less impressive than video.

Still, through this experiment I got to experience the creative process behind various kinds of synthetic media, and more importantly I learned about the different tools available, which will undoubtedly help my future study and production.

Animated Gifs

This week’s assignment is to make animated Gifs, and my teammate Cynthia and I decided to make one each. I decided to make my favorite animal: a pig ^ ^

I used Gif production software. Since there isn’t much content, I set the animation to 8 frames, i.e., 8 pictures, mainly to save production time. Below are the production flow chart and the finished product.
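For reference, the same 8 frames could also be assembled into a looping GIF with a few lines of Python using Pillow instead of a dedicated GIF tool; the frame file names here are made up.

```python
# Assemble 8 exported frames into a looping animated GIF with Pillow.
# Frame file names are placeholders for whatever the drawing tool exports.
from PIL import Image

frames = [Image.open(f"pig_frame_{i}.png") for i in range(8)]

frames[0].save(
    "pig.gif",
    save_all=True,              # write all frames, not just the first
    append_images=frames[1:],   # the remaining 7 frames
    duration=125,               # ~125 ms per frame, about 8 fps
    loop=0,                     # loop forever
)
```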

Next is Cynthia’s work.

Storyboard

We made the animations in After Effects and imported them into Adobe Aero. In the end, we edited all the AR recordings together in Adobe Premiere and added the sound effects.

Cornell Box

Our initial idea was to generate a universe inside the Cornell box based on each of our self-conceptions. To make collaboration easier, we divided the whole Cornell box into three layers: undersea, mainland, and sky. Each of us would be responsible for one part and create our own ideal world. In the end, we put them together so that a chaotic, unrelated, but thriving world emerged.

Henry’s Part:

Henry was in charge of creating the mainland. He drew a lot of inspiration from the Unity Asset Store and picked an ancient, war-style piece of architecture as his reference, because he wanted to show something that could relate to both the sky and the sea elements.

Basically, he rearranged the scene’s size and position to better fit our Cornell box. He also adjusted the fire effects and added a scale animation to them so that the fire appears to ignite and rise, and he animated one boat traveling from one side of the pier to the other. The rest were minor changes and sprites, including a wooden bucket floating on the sea, water drifting at different speeds, and some trimming of the architecture to make the whole box more organized and coherent.

Cynthia’s Part:

My Part:

For my part, I made the underwater scene. I added objects I wanted to appear underwater, such as corals, swaying seaweed, bubbles, crystals, starfish, and shells.

I think the core of the Cornell box is not to display the items that “should” appear in a box, but to use Unity and other software to simulate things that normally couldn’t appear in one. In short, the core is imagination.

To achieve this, I built the overall scene out of creatures that only appear in the deep ocean. To reflect the “vitality” of the water, I added a light-ray particle effect and a bubble particle effect so that the otherwise static water looks alive.

I placed the whole scene in a container that looks like a swimming pool to reflect the three levels: a vast sea held in a huge swimming pool, stored inside a small Cornell Box. I hope this concept helps users experience the Cornell Box I made more immersively.

Reference: https://assetstore.unity.com/packages/vfx/particles/environment/underwater-fx-61157