A year ago I first saw Steve Mould's video about motion amplification, and I was mesmerized by how motion can be extracted from seemingly static video. One example from his video stood out: an electric motor visibly grinding against its mount, which points to those areas as potential points of failure.
There are simple ways to reproduce a similar effect in any video editor: create two layers of the same clip, invert one, time-shift the other, and subtract them. This highlights any motion or, more precisely, any difference between two frames.
But this is not motion amplification, where small, imperceptible motion is amplified so we can see it.
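A minimal NumPy sketch of this frame-differencing trick, assuming the clip is already loaded as an array of grayscale frames. The function name `frame_difference` and the synthetic clip are made up for illustration.

```python
import numpy as np

def frame_difference(frames, shift=1):
    """Absolute difference between each frame and the one `shift` frames earlier."""
    frames = np.asarray(frames, dtype=np.float32)
    return np.abs(frames[shift:] - frames[:-shift])

# Synthetic clip (assumption): a bright 2x2 square moving one pixel right per frame.
clip = np.zeros((4, 8, 8), dtype=np.float32)
for t in range(4):
    clip[t, 3:5, t:t + 2] = 1.0

diff = frame_difference(clip)
# Static background stays at zero; only the leading and trailing
# edges of the moving square light up.
```

Anything that does not change between the two frames cancels out, which is why this only reveals motion that is already large enough to change pixel values noticeably.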
As I had some free time due to a sports injury, I revisited this and found the original papers on the MIT page, which lists and demonstrates different approaches. After going through them I selected two: Riesz pyramid and Eulerian motion amplification.
Riesz pyramid video motion amplification comes with nice pseudocode, and based on it I made a basic Python multiprocessing script. It can be found on GitHub for your perusal.
For testing, I cut up RDI Technologies videos from YouTube, so ignore the bottom text stating that it is the original video and look at the motion in the video itself.
To start things off, I tried to get my head around Eulerian motion amplification, as it seemed simpler. Just a quick heads-up: nothing about this is simple, as I was about to find out.
Eulerian methods focus on changes occurring at fixed points in space. Imagine setting up a grid over your video and observing how the color or brightness at each grid point changes over time. You're not following objects as they move; instead, you're watching how the pixel values fluctuate at each stationary point.
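To make the fixed-point view concrete, here is a small sketch, assuming a one-dimensional "video" (a single scanline over time) with a soft edge that wobbles by a fraction of a pixel. All names and numbers are invented for the illustration.

```python
import numpy as np

fps = 30
t = np.arange(60) / fps                      # two seconds of timestamps
x = np.arange(32)                            # pixel positions along one scanline

# Assumption: a soft edge whose position oscillates by 0.2 pixels at 2 Hz.
shift = 0.2 * np.sin(2 * np.pi * 2.0 * t)
video = 1.0 / (1.0 + np.exp(-(x[None, :] - 16.0 - shift[:, None])))  # shape (T, W)

# Eulerian view: do not track the edge, just watch one fixed pixel over time.
signal = video[:, 16]
```

The edge never moves by a whole pixel, yet the brightness at column 16 rises and falls in step with the motion. That per-pixel time series is what the later filtering and amplification steps work on.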
The video frames are broken down into different layers that capture various levels of detail. This is done using techniques like Gaussian Pyramids or Laplacian Pyramids. A Gaussian Pyramid is a way of representing an image at multiple resolutions. We start with a high-resolution photo and repeatedly blur and shrink it, creating a series of progressively smaller and blurrier copies that capture different levels of detail, from fine textures to broad shapes. For example, if we take a picture of a tree, at the highest resolution we see every leaf and stem, but at the lower levels we would have a hard time recognizing any branches and would only see a trunk with a canopy.
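The blur-and-shrink loop can be sketched in plain NumPy as follows. This is an illustration under assumptions: a 5-tap binomial kernel stands in for the Gaussian blur, and the helper names `blur` and `gaussian_pyramid` are made up here; a library routine such as OpenCV's `pyrDown` would normally do this job.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0  # binomial approximation of a Gaussian

def blur(img):
    """Separable 5-tap blur with reflected borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))   # horizontal pass
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))     # vertical pass

def gaussian_pyramid(img, levels):
    """Return [original, half size, quarter size, ...]."""
    pyramid = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels):
        pyramid.append(blur(pyramid[-1])[::2, ::2])  # blur, then keep every second pixel
    return pyramid

image = np.random.default_rng(0).random((64, 64))  # stand-in for a grayscale frame
pyramid = gaussian_pyramid(image, levels=3)
```

Blurring before dropping pixels matters: without it, fine detail would alias into the smaller image instead of being smoothly removed.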
At each point in these layers, the changes over time are analyzed. This means looking at how the brightness or color of each pixel changes from frame to frame.
Warning! Flashing images
The video above shows the stages of motion amplification. In the first row, from left to right: the unprocessed frame from the original video, the first level of the Gaussian Pyramid (blurred and downsampled once), the second level of the Gaussian Pyramid (blurred and downsampled twice), and the first level of the Laplacian Pyramid (edges and details at level 1). The second row starts with the second level of the Laplacian Pyramid (edges and details at level 2), followed by a visual representation of the detected motion areas, the motion map after amplification, and finally the motion-magnified result.
As you might imagine, all of this constructing and deconstructing of video frames takes a lot of processing. The best I managed with multiprocessing is 3-4 frames per second when I convert the input video to monochrome so that only one channel is analyzed; this falls to about 1 fps for color video. This is expected, as processing three channels takes roughly three times as long.
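For reference, collapsing a color frame to a single channel can look like the sketch below. It assumes frames in RGB order with values in the 0 to 1 range and uses the standard Rec. 601 luma weights; the function name is made up, and frames read with OpenCV arrive in BGR order, so the weights would need reversing there.

```python
import numpy as np

LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])  # Rec. 601 weights for R, G, B

def to_monochrome(frame_rgb):
    """Collapse an (H, W, 3) RGB frame into a single (H, W) brightness channel."""
    return np.asarray(frame_rgb, dtype=np.float64) @ LUMA_WEIGHTS

frame = np.random.default_rng(0).random((120, 160, 3))  # stand-in for one color frame
gray = to_monochrome(frame)
```

The monochrome frame holds a third of the values of the color one, which matches the roughly threefold difference in processing time.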
We can then apply temporal filtering, that is, filters that isolate the specific frequency band corresponding to the subtle changes we want to amplify. For example, a human heartbeat has a frequency of about 1 to 1.5 Hz (60 to 90 beats per minute).
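One simple way to do this kind of filtering is an ideal band-pass in the frequency domain, applied to every pixel's time series at once. The sketch below assumes the frames are stacked in an array of shape (time, height, width); the function name and test signal are invented, and a real implementation might prefer a smoother filter such as a Butterworth.

```python
import numpy as np

def temporal_bandpass(frames, fps, low_hz, high_hz):
    """Keep only temporal frequencies in [low_hz, high_hz] at every pixel."""
    frames = np.asarray(frames, dtype=np.float64)
    spectrum = np.fft.rfft(frames, axis=0)                  # FFT along the time axis
    freqs = np.fft.rfftfreq(frames.shape[0], d=1.0 / fps)   # frequency of each bin in Hz
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[~keep] = 0.0                                   # zero everything outside the band
    return np.fft.irfft(spectrum, n=frames.shape[0], axis=0)

fps = 30
t = np.arange(150) / fps                                    # five seconds of video
pulse = 0.01 * np.sin(2 * np.pi * 1.2 * t)                  # heartbeat-like flicker at 1.2 Hz
vibration = 0.01 * np.sin(2 * np.pi * 6.0 * t)              # unrelated 6 Hz component
frames = 0.5 + (pulse + vibration)[:, None, None] * np.ones((1, 2, 2))

filtered = temporal_bandpass(frames, fps, low_hz=1.0, high_hz=1.5)
# `filtered` now holds only the 1.2 Hz component; the constant
# brightness and the 6 Hz vibration have been removed.
```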
Then comes the amplification, where detected changes within the chosen frequency band are multiplied by some factor, making tiny motions or color changes much more pronounced. The fascinating thing about the pyramid decomposition is that it is reversible (strictly speaking, it is the Laplacian Pyramid that allows this, since a Gaussian Pyramid on its own discards detail): we can reconstruct the original image plus the changes we introduced to increase the detected motion.
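The amplification step itself is small. In this sketch the band-passed signal is faked by subtracting each pixel's temporal mean, purely as a stand-in for a real temporal filter; the function name, the factor `alpha`, and the test data are all assumptions for illustration.

```python
import numpy as np

def amplify(frames, band, alpha):
    """Add the band-passed variation back onto the original frames, scaled by alpha."""
    return np.clip(frames + alpha * band, 0.0, 1.0)   # keep pixel values in a valid range

fps = 30
t = np.arange(150) / fps
tiny = 0.002 * np.sin(2 * np.pi * 1.2 * t)            # barely visible brightness change
frames = 0.5 + tiny[:, None, None] * np.ones((1, 4, 4))

band = frames - frames.mean(axis=0)                   # stand-in for a real temporal filter
magnified = amplify(frames, band, alpha=50.0)
```

Because the amplified band is added on top of the original, the total gain on the selected variation is 1 + alpha. Any noise that falls inside the band is boosted by the same amount.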
This approach works but the resulting video is grainy and full of noise.
The Riesz approach uses the Laplacian Pyramid and the Riesz Pyramid. The Laplacian Pyramid builds upon the Gaussian Pyramid by capturing the differences between the levels. For each level in the Gaussian Pyramid, the image is expanded back to the size of the previous level and subtracted from it. This process results in a series of images that highlight the edges and fine details lost during the blurring and downsampling. Essentially, the Laplacian Pyramid isolates the fine details or the high-frequency components of the image, which are essential for detecting subtle movements.
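The build-and-collapse cycle of a Laplacian Pyramid can be sketched in NumPy like this. It is an illustration under assumptions: a 5-tap binomial blur, zero-insertion upsampling, and made-up helper names; library functions such as OpenCV's `pyrDown` and `pyrUp` play the same role in practice.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def blur(img):
    """Separable 5-tap blur with reflected borders."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))

def downsample(img):
    return blur(img)[::2, ::2]

def upsample(img, shape):
    """Expand to `shape` by inserting zeros, then blur to interpolate."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    return 4.0 * blur(up)   # factor 4 compensates for the inserted zeros

def laplacian_pyramid(img, levels):
    gauss = [np.asarray(img, dtype=np.float64)]
    for _ in range(levels):
        gauss.append(downsample(gauss[-1]))
    # Each band is what was lost when going one level down.
    bands = [g - upsample(g_next, g.shape) for g, g_next in zip(gauss[:-1], gauss[1:])]
    return bands + [gauss[-1]]          # keep the smallest blurred image as the last entry

def collapse(pyramid):
    """Rebuild the image by adding the detail bands back, coarsest first."""
    img = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        img = band + upsample(img, band.shape)
    return img

image = np.random.default_rng(0).random((64, 64))  # stand-in for a grayscale frame
pyramid = laplacian_pyramid(image, levels=3)
restored = collapse(pyramid)
```

Collapsing an unmodified pyramid gives back the input up to floating-point rounding. Motion magnification relies on this: modify the bands, collapse, and the result is the original frame plus the modification.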
The Riesz Pyramid is an extension of the Laplacian Pyramid that provides additional directional information. While the Laplacian Pyramid captures how strong the detail is at each point, the Riesz Pyramid also captures the orientation of features in the image. This makes it particularly useful for analyzing motion along a specific direction, since it captures not just how much something is moving but also in which direction.
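To show what the extra directional information looks like, here is a sketch of the Riesz transform applied to a single pyramid band. It assumes an FFT-based version of the transform, chosen here for clarity; practical implementations often use small spatial filters instead, and the function name and test pattern are made up.

```python
import numpy as np

def riesz_transform(band):
    """FFT-based Riesz transform: returns the two directional responses of `band`."""
    h, w = band.shape
    fy = np.fft.fftfreq(h)[:, None]          # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]          # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0                       # avoid dividing by zero at the DC term
    spectrum = np.fft.fft2(band)
    r1 = np.real(np.fft.ifft2(-1j * fx / radius * spectrum))
    r2 = np.real(np.fft.ifft2(-1j * fy / radius * spectrum))
    return r1, r2

# Test pattern (assumption): a horizontal wave with 4 cycles across the image.
x = np.arange(64)
band = np.tile(np.cos(2 * np.pi * 4 * x / 64), (64, 1))

r1, r2 = riesz_transform(band)
amplitude = np.sqrt(band ** 2 + r1 ** 2 + r2 ** 2)     # local strength of the feature
phase = np.arctan2(np.sqrt(r1 ** 2 + r2 ** 2), band)   # local phase of the feature
orientation = np.arctan2(r2, r1)                       # direction the feature varies in
```

For this pattern the vertical response `r2` is essentially zero, because nothing changes in that direction. In phase-based magnification, motion shows up as a change in the local phase over time, so that is the quantity that gets temporally filtered and amplified.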
The final result is a new video where the previously imperceptible movements are clearly visible. This can reveal vibrations in machinery, subtle physiological motions in medical imaging, or minute structural shifts in buildings and bridges. By making the invisible visible, motion amplification provides valuable insights that can lead to better diagnostics, safer structures, and more profound scientific discoveries.
This video shows the difference between the Riesz-based approach and Eulerian motion amplification. The Riesz output here is monochrome; it can also be done in full color, but that takes longer to process.
To say that I'm over the moon with the results would be an understatement. I managed to scratch my intellectual itch and make something that I can see being useful.