
Controlling a Robot with a Flute

In 2018, my student team designed "Deep Thought," an omni-directional soccer robot. Recently, I added an analog microphone and re-coded it to respond to notes played on a Native American flute.

 
 

Background


In 2017, I was a junior in college taking Intro to Design, a team-based, project-driven class. The class was structured around designing and manufacturing a remotely operated system to compete in a class-wide robot soccer competition. My team designed Deep Thought, an omni-directional platform named for the computer in Hitchhiker's Guide to the Galaxy (our assigned team number was 042). The robot's main features are its passive ball-capture mechanism, cam-loaded spring launcher, and mecanum wheel drive. Deep Thought advanced to the semi-finals with its unique strafing and aiming ability but was ultimately held back by its slow, underpowered motors – something I wanted to improve on my own.

The robot was retired to a box in my closet after the competition – my team let me keep Deep Thought since I had spent the most time and money on manufacturing. I had always planned to work on some upgrades for fun, and I finally got the opportunity during my final term of college.


 


ME 451: Learning the Ropes of Robotics

In the spring of 2020, I took a course on coding and data measurement. This course was also structured around a term project: an open-ended Arduino-based system that had to use a sensor with at least three states and actuators with at least three behaviors. I chose to dust off Deep Thought and recode it from scratch with a twist: I wanted to control it with a Native American flute.

I began by searching the web for an FFT library I could use on an Arduino. I found a few spectrogram projects that used an Arduino FFT library with frustratingly little documentation. I ended up reverse-engineering my FFT from a Frequency Visualizer project by Clyde Lettsome, PE. The code included with this project is published with his permission.


The next step was to demonstrate that it worked. I began by wiring a 3.5mm audio jack to one of the Arduino's analog pins. I found that I could play tuning-pitch videos through my computer and see a frequency spike corresponding to each note, like the one shown here:


This plot shows the Arduino serial plotter output while I played a few tuning notes through the audio jack. The FFT I'm using is optimized for speed: it only has 64 bins of resolution, covering frequencies between 0 and 1,000 Hz. I chose this range because it covers the highest note my flute can produce, A880 (880 Hz). Dividing 1,000 Hz into 64 bins gives a resolution of about 16 Hz per bin. Luckily, even the two closest notes on my flute are far enough apart in frequency that this resolution can easily tell them apart. Notice the difference in resolution between the MATLAB-computed FFT (second plot) and the Arduino FFT (third plot) below:


Now that I had verified the FFT worked, it was time to try it with an actual instrument. I swapped the audio jack for an analog microphone, added a few LEDs, wrote a few lines of code, and tested my new six-frequency LED spectrogram with my flute for the first time:



With a little bit of tinkering, I now had functional code that could differentiate frequencies of sound picked up by a microphone. From here, programming the robot was straightforward: simply replace each colored LED with a different set of motor commands. Luckily, I had already tested this functionality with the audio jack setup:
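As a rough sketch of that swap (illustrative only; the note names and motor routines below are placeholders, not the project's actual identifiers), the detected note now selects a motor routine where it used to light an LED:

// Illustrative only: the detected note selects a motor routine instead of
// lighting an LED. The routines here are stubs that just print what they
// would do; in the robot each one sets the four wheel drivers
// (see the Control section below).
enum Note { NOTE_FORWARD, NOTE_BACKWARD, NOTE_LEFT, NOTE_RIGHT, NOTE_CW, NOTE_CCW, NO_NOTE };

void driveForward()  { Serial.println("forward");    }  // stub
void driveBackward() { Serial.println("backward");   }  // stub
void strafeLeft()    { Serial.println("left");       }  // stub
void strafeRight()   { Serial.println("right");      }  // stub
void rotateCW()      { Serial.println("rotate CW");  }  // stub
void rotateCCW()     { Serial.println("rotate CCW"); }  // stub
void stopMotors()    { Serial.println("stop");       }  // stub

// In the LED test, each case simply turned on a different colored LED.
void actOnNote(Note n) {
  switch (n) {
    case NOTE_FORWARD:  driveForward();  break;
    case NOTE_BACKWARD: driveBackward(); break;
    case NOTE_LEFT:     strafeLeft();    break;
    case NOTE_RIGHT:    strafeRight();   break;
    case NOTE_CW:       rotateCW();      break;
    case NOTE_CCW:      rotateCCW();     break;
    default:            stopMotors();    break;  // no valid note: stay still
  }
}

void setup() { Serial.begin(115200); }
void loop()  { actOnNote(NO_NOTE); delay(1000); }  // demo call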



 

Hardware Upgrades

I decided to replace all the electronics on Deep Thought, both to increase my familiarity with it and to avoid troubleshooting damage that might have occurred during storage. I also incorporated several upgrades I've wanted to make, including:

  • Replacing the launcher motor's L298N H-bridge driver with an Arduino-compatible 5V relay (to eliminate the voltage drop associated with the H-bridge driver)

  • Replacing the years-old 7.4V LiPo battery with a new 12V battery

  • Replacing the old 6V launcher gearmotor with a higher torque 12V gearmotor

The most fundamental change was the switch to a 12V battery, which allowed the use of a 12V gearmotor with almost twice the torque of Deep Thought's original launcher motor.


In addition, I learned from the L298N documentation that these motor drivers have a voltage drop of a little more than 1V. This means the 7.4V LiPo battery could barely supply 6V to the motors when fully charged, and couldn't once it started to discharge. With a 12V battery, there is enough headroom to slightly overdrive the motors and improve the robot's poor speed, even after the H-bridge voltage drop. Using pulse-width modulation (PWM), I supplied ~6.8V to the drive motors, drastically improving speed and performance.
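For a sense of the arithmetic, here is a minimal sketch of how that PWM level could be set, assuming a 1.4V total H-bridge drop and a placeholder enable pin (neither value is taken from the project):

// Back-of-the-envelope sketch of the drive-motor overdrive, not project code.
// The H-bridge drop and pin number below are assumptions.
const float BATTERY_V    = 12.0;  // new battery voltage
const float HBRIDGE_DROP = 1.4;   // assumed L298N loss ("a little more than 1V")
const float TARGET_V     = 6.8;   // desired average voltage at the 6V drive motors
const int   DRIVE_EN_PIN = 10;    // PWM-capable enable pin (placeholder)

void setup() {
  pinMode(DRIVE_EN_PIN, OUTPUT);
  // Duty cycle = target voltage / available voltage; analogWrite takes 0-255
  int duty = (int)(255.0 * TARGET_V / (BATTERY_V - HBRIDGE_DROP));
  analogWrite(DRIVE_EN_PIN, duty);  // ~163 of 255 with these numbers
}

void loop() {
  // Drive logic would go here; this sketch only sets the PWM level.
}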


The voltage drop of the H-bridge drivers also led me to use a relay switch to control the launcher motor. Relay switches have negligible losses, and there is no need for speed or direction control on the launcher motor.


These upgrades are shown in Deep Thought's wiring diagram, pictured below. The 9V DC line is supplied by the same 12V rechargeable battery pack, which has multiple outputs.


 

Control

Deep Thought's mecanum wheels give it full 2D motion, including the ability to strafe. To do this, each wheel needs to be independently controlled using combinations of full forward, full reverse, and full stop. I mapped six notes to six basic motions: forward, backward, left, right, rotate CW, and rotate CCW.
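Below is an illustrative sketch of that per-wheel mapping using the standard mecanum sign conventions; the pin assignments are placeholders and this is not Deep Thought's actual drive code:

// Illustrative mecanum-drive mapping, not Deep Thought's actual code.
// Wheel order: front-left, front-right, rear-left, rear-right.
// +1 = full forward, -1 = full reverse, 0 = stop. Pin numbers are placeholders.
const int WHEEL_PIN_A[4] = {2, 4, 6, 8};  // H-bridge IN1 pins (assumed)
const int WHEEL_PIN_B[4] = {3, 5, 7, 9};  // H-bridge IN2 pins (assumed)

enum Motion { MOVE_FORWARD, MOVE_BACKWARD, MOVE_LEFT, MOVE_RIGHT, MOVE_CW, MOVE_CCW, MOVE_STOP };

// Standard mecanum wheel combinations for the six commanded motions
const int MOTION_TABLE[7][4] = {
  { +1, +1, +1, +1 },  // forward
  { -1, -1, -1, -1 },  // backward
  { -1, +1, +1, -1 },  // strafe left
  { +1, -1, -1, +1 },  // strafe right
  { +1, -1, +1, -1 },  // rotate clockwise
  { -1, +1, -1, +1 },  // rotate counter-clockwise
  {  0,  0,  0,  0 }   // stop
};

void setMotion(Motion m) {
  for (int w = 0; w < 4; w++) {
    int dir = MOTION_TABLE[m][w];
    digitalWrite(WHEEL_PIN_A[w], dir > 0 ? HIGH : LOW);
    digitalWrite(WHEEL_PIN_B[w], dir < 0 ? HIGH : LOW);  // both LOW = stop
  }
}

void setup() {
  for (int w = 0; w < 4; w++) {
    pinMode(WHEEL_PIN_A[w], OUTPUT);
    pinMode(WHEEL_PIN_B[w], OUTPUT);
  }
}

void loop() {
  // In the real robot, the detected flute note selects the motion.
  setMotion(MOVE_FORWARD);
  delay(1000);
  setMotion(MOVE_STOP);
  delay(1000);
}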




 

Signal Conditioning


There are many difficulties that arise when using control signals recorded by a microphone rigidly fixed to a loud, clanky robot. The first step to getting a clear control signal is to isolate the microphone from the robot's vibrating chassis as much as possible.


The original soccer robot took its commands over a Bluetooth module; the microphone now fills that role, riding on a 1-meter wooden dowel mast that suspends it high above the noisy wheels and motors. The microphone is seated in a styrofoam boot to damp vibrations transmitted through the mast. The microphone fixture also has the cosmetic perk of making Deep Thought resemble a Martian rover.


Most of the signal conditioning occurs on the Arduino and can be broken into five processes; a rough code sketch of the combined pipeline follows the list:


1) Fourier Transform: The microphone records a time-domain signal. The first layer of processing uses an FFT (Fast Fourier Transform) to convert the recorded audio data into a frequency spectrum. The output spectrum has 64 bins covering 0-1,000 Hz, so each bin is about 16 Hz wide. The sample rate is 2,000 Hz, which puts the Nyquist frequency at 1,000 Hz, high enough to cover the highest playable note on the Native American flute, A880.


2) Pseudo-Band-Pass Filter: The next process acts like a software band-pass filter. The control-signal notes range from A440 to A880, so only the bins corresponding to 440 Hz to 880 Hz contain meaningful information. The code only passes bins 25-60, which correspond to roughly 420-900 Hz. This captures the signal frequencies without wasting extra processing power and memory on the unused bins.


3) Noise Gate: The Arduino runs a loop to find the loudest (highest-magnitude) bin, called the dominant bin. The assumption is that if a note is being played, it will show up in the dominant bin. The dominant bin's magnitude is compared to a fixed cutoff: if it is below the cutoff, it is ignored; if it is above, it is passed. Without this step, the robot would constantly perceive a signal, because ambient noise causes random spikes in the frequency-magnitude spectrum.


4) Signal-to-Noise Comparison: This step compares the magnitude of the dominant bin to the average magnitude of the rest of the bins. If the dominant bin is at least twice the average magnitude, it is passed. Without this step, sufficiently loud background noise could trigger the robot.


5) Pseudo-Band-Reject Filtering: The final process has an effect similar to several band-reject filters. The Arduino looks at the dominant bin (if the signal has made it this far) and checks whether it corresponds to an actual command note. If it does, it is passed as a command to the motors.
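To make the five stages concrete, here is a compact sketch of the whole pipeline. It is a reconstruction for illustration only, not the project code linked below: it substitutes the arduinoFFT library (v1.x API) for the Lettsome-derived FFT, and the microphone pin, thresholds, and note-bin values are placeholders.

#include <arduinoFFT.h>

#define MIC_PIN       A0      // analog microphone input (placeholder pin)
#define SAMPLES       128     // power of two; gives 64 usable bins
#define SAMPLING_FREQ 2000    // Hz, so the Nyquist frequency is 1,000 Hz

#define BIN_LOW       25      // pseudo-band-pass: lowest bin of interest
#define BIN_HIGH      60      // pseudo-band-pass: highest bin of interest
#define NOISE_GATE    500.0   // fixed magnitude cutoff (placeholder value)
#define SNR_FACTOR    2.0     // dominant bin must be 2x the average of the rest

arduinoFFT FFT = arduinoFFT();  // v1.x API
double vReal[SAMPLES];
double vImag[SAMPLES];

// Bins assumed to hold the six command notes (hypothetical values)
const int NOTE_BINS[6] = {28, 31, 35, 38, 42, 56};

void setup() {
  Serial.begin(115200);
}

void loop() {
  // 1) Fourier Transform: sample at ~2 kHz, then convert to a magnitude spectrum
  const unsigned long samplePeriodUs = 1000000UL / SAMPLING_FREQ;
  for (int i = 0; i < SAMPLES; i++) {
    unsigned long t0 = micros();
    vReal[i] = analogRead(MIC_PIN);
    vImag[i] = 0.0;
    while (micros() - t0 < samplePeriodUs) { }  // wait for the next sample slot
  }
  FFT.Windowing(vReal, SAMPLES, FFT_WIN_TYP_HAMMING, FFT_FORWARD);
  FFT.Compute(vReal, vImag, SAMPLES, FFT_FORWARD);
  FFT.ComplexToMagnitude(vReal, vImag, SAMPLES);

  // 2) Pseudo-band-pass: only examine bins in the flute's range,
  //    finding the dominant (loudest) bin along the way
  int dominantBin = BIN_LOW;
  double sum = 0.0;
  for (int i = BIN_LOW; i <= BIN_HIGH; i++) {
    sum += vReal[i];
    if (vReal[i] > vReal[dominantBin]) dominantBin = i;
  }
  double restAverage = (sum - vReal[dominantBin]) / (BIN_HIGH - BIN_LOW);

  // 3) Noise gate: ignore weak peaks caused by ambient noise
  if (vReal[dominantBin] < NOISE_GATE) return;

  // 4) Signal-to-noise comparison: the dominant bin must stand out from the rest
  if (vReal[dominantBin] < SNR_FACTOR * restAverage) return;

  // 5) Pseudo-band-reject: only bins that match a command note get through
  for (int n = 0; n < 6; n++) {
    if (dominantBin == NOTE_BINS[n]) {
      Serial.print("Command note #");
      Serial.println(n);  // the real robot calls a motor routine here
      return;
    }
  }
}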



The cumulative effect of all the conditioning processes is demonstrated below. In the first image, the robot detects an A880 signal note and rotates counter-clockwise. The second image shows the result of processing an out-of-tune note I hummed: the Arduino detects a dominant bin (bin 20), but the signal fails several of the tests, and no movement command is sent to the motors.

Signal Conditioning of A880:

Signal Conditioning of Hummed Note:


After the five layers of signal conditioning, the Arduino either identifies a note that corresponds to a movement command or detects nothing and instructs the motors to stay still. At this point I tested the robot, with positive results:



 

Ground Testing


With all systems running, I decided to take the robot off the test stand and let it explore the living room. I found the microphone had very limited range. I was worried that the soft tone of the flute would struggle to cut through the rattling of the mecanum wheels, and that appeared to be the case.


I noticed some interesting behavior that depended on the distance between the microphone and my flute. At ranges of less than half a meter or so, the robot responded quickly and accurately to commands; this usually meant crouching down to play the flute right into the microphone. At ranges greater than about 4 meters, the robot would not pick up any commands (the flute was presumably too quiet to pass the noise gate at that distance). But between 0.5 and 4 meters, the robot would oscillate rapidly between carrying out the correct command and stopping altogether. For example, if I played C, the robot would jerk forward then stop, repeating about twice per second for as long as I held the note.


This semi-stable behavior seemed very familiar, like the feedback of a poorly tuned carburetor. Maybe the Control Systems class I took the term before primed the thought. Then again, it could be a loose connection or some weird bug in the code. I ran the robot tethered to my laptop to read the serial monitor output and found this:


The signal varied regularly and predictably, not in random spasms. If there had been any doubt that I was seeing an oscillation, it was gone now. I figured the behavior went something like this:

  1. I play a command note

  2. Deep Thought hears the command note, which passes the signal tests and causes a motion command

  3. The motors turn on, raising the noise level

  4. Deep Thought hears both the command note and the noisy wheels

  5. The extra noise causes the note to fail a test (likely the signal/noise comparison), sending a stop command to the motors

  6. With the motors stopped, Deep Thought hears just the command note again, and the loop repeats

I immediately tried adjusting the Noise Gate and SNR (Signal to Noise Ratio) tests but found that decreasing the SNR threshold made Deep Thought more sensitive to background noise. This was clearly a performance trade-off.


Another route I tried was hard-coding a filter for the motor noise. I took some audio samples of the motors running on the ground and computed their FFT spectra. I located the significant frequency bins and noted their magnitudes, then added a line of code to subtract those values from the signal FFT whenever the motors were running. This widened the response radius moderately, but it seemed inelegant (and hard-coding in this manner is generally not good practice in the first place; what if the robot has to operate on a different surface or in a room with different acoustics?).
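In spirit, the hard-coded filter looked something like this (the bin indices and magnitudes below are placeholders, not the values I actually measured):

// Sketch of the hard-coded motor-noise subtraction; illustrative only.
// The bin indices and magnitudes are placeholders, not measured values.
const int   NUM_NOISE_BINS = 3;
const int   NOISE_BINS[NUM_NOISE_BINS] = {30, 41, 52};          // bins where motor noise showed up (hypothetical)
const float NOISE_MAGS[NUM_NOISE_BINS] = {180.0, 120.0, 90.0};  // their typical magnitudes (hypothetical)

// Subtract the stored noise profile from the live spectrum while the motors run.
// In the pipeline sketch above, this would be called right after the FFT
// magnitudes are computed and before the noise gate.
void subtractMotorNoise(double spectrum[], bool motorsRunning) {
  if (!motorsRunning) return;
  for (int i = 0; i < NUM_NOISE_BINS; i++) {
    spectrum[NOISE_BINS[i]] -= NOISE_MAGS[i];
    if (spectrum[NOISE_BINS[i]] < 0.0) spectrum[NOISE_BINS[i]] = 0.0;  // clamp at zero
  }
}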


I ordered a variable-gain microphone with automatic leveling as a possible remedy. When it arrived, I found that the adjustable-gain microphone significantly increased the robot's hearing range; the full control range grew to ~3 meters. The oscillatory behavior still exists, but it is not much of an issue now, since the microphone's range is sufficient for many indoor spaces. It is also somewhat inevitable given the nature of signal decay, much like how AM/FM radio becomes grainy and noisy as you move out of range. At some point there has to be a transition between a clear signal and no signal, and in this robot that transition manifests as oscillation.


For now, I am very satisfied and won't immediately pursue diminishing returns by spending twice the effort to gain another few meters of range. Not to mention, I already have enough control to make the robot maneuver an obstacle course or shoot down a tower of cans, which are rewarding feats capable of impressing any visiting relatives. However, I've made note of some things I'd like to improve in the next round of upgrades.



 

Next Steps


In this project, I accomplished everything I had set out to do. I reprogrammed Deep Thought to listen to commands played on a flute and boosted its speed to something more acceptable (by switching to a 12V power supply and overdriving the drive motors to ~6.8V with PWM). I drastically improved its ball-launcher speed and range by using a 12V gearmotor with nearly twice as much torque. Finally, I improved the hearing range to make the robot usable from a more practical distance. But I already have a few things I want to add or improve:


1) Audio Out

I want to add an analog speaker so that Deep Thought can communicate back. I could pre-record messages to play when the battery is low or for other features like Tuning Mode.


2) Tuning Mode

I want to allow the user to define their own control tones. By pressing a button, Deep Thought enters tuning mode, indicated by a flashing light. Six lights correspond to the six movements. When light #1 turns on, the user plays a note and Deep Thought records that note for movement #1. This repeats for each light until all the movement commands are set, then Deep Thought indicates that it is ready.


3) Build Quality

I want to replace all of the protoboard and pin elements with sturdier connections, rebuilding Deep Thought as if it were built for a customer.


4) Acoustic Improvements

The range of the microphone is still somewhat restrictive, mostly because of the noisy motors. I want to improve its range further by better isolating the microphone, maybe by using a sound shield or a specialized directional microphone.


 

Project Code

This code is partially developed from the Arduino Spectral Analyzer project code released by Clyde Lettsome, PE. The following code is published with his permission.


