An Essential Article on AI and Human-Machine Teaming: "Functional Scaffolding for Composing Additional Musical Voices" [Recommended Innovation Articles (and Commentary) #23]
by AOD Network. Original post can be found here: https://benzweibelson.medium.com/an-essential-article-on-ai-and-human-machine-teaming-functional-scaffolding-for-composing-9241fa33f243
Okay, this article is going to be a doozy for some readers (those without STEM or mathematics backgrounds), but do not worry. I really think everyone should read this, and I have additional aids. The article is 41 pages long and has some sections that, again, will be tough for laymen in AI, music, or mathematics, but do not worry. We have great commentary and supporting resources below to guide your self-development journey. Use this article as your baseline, then follow the recommendations below and augment with supporting videos and content as needed. Trust me, if you have read this far, you really want to grab this and consume it. Here we go. The article PDF is FREE online and available here:
https://drive.google.com/viewerng/viewer?url=https://eplex.cs.ucf.edu/papers/hoover_cmj14.pdf
1. Everyone is talking about ChatGPT, GPT-4, or AI that does machine learning and can enhance human operations via a new human-machine teaming relationship. For most of us, how this all works is like magic or a black box: the AI gets your question, does something mysterious, and spits out a pretty decent reply. Similar AI is doing this with art, music, and even videos for beer commercials (I saw one on Twitter recently).
2. A bigger question for DoD, USSPACECOM, the US Space Force, and others: can we use such AI systems in how we go about targeting, forming new strategic plans, operational design, and other sophisticated mission sets that right now are done entirely by humans? Sometimes, the answers we need are not presented in nice little packages with a bow. Military people want a doctrinal chapter or a white paper titled "here is exactly how you can fix human targeting and strategic planning using a chatbot, in explicit steps and instructions". But that is not how the world works, and if you get something like that, chances are someone is trying to sell you a bridge in Brooklyn. What I offer below is the indirect path, where we draw from something seemingly out in left field: how AI and human music writers and composers are able to work together to generate new, exciting content where listeners cannot tell the difference between purely human-composed content and AI-augmented content. If music can be advanced in creative production this way, why not military activities?
3. Enter Dr. Ken Stanley, who until recently was working with OpenAI (the company that unleashed ChatGPT and GPT-4), and before that worked at Uber helping them with AI and machine learning. I met him years ago when he was with the University of Central Florida, where this article originates. Dr. Ken Stanley is one of the top AI experts (in my opinion) out there doing this stuff, and if you struggle with this article right away, I strongly recommend you head over to YouTube and watch the lecture he gives there; it should make reading this article much easier (minus the dense STEM AI jargon, which you can skim over).
I had the pleasure of inviting Dr. Stanley to USSOCOM when I worked there at the Joint Special Operations University (this was purely an academic engagement, done in such fashion that the ideas presented were unrelated to anything classified or operationally related; a host of other similar lectures are uploaded to the 'Think JSOU' YouTube channel design lecture series).
Stanley’s lecture unfortunately was not recorded, as that was too early in my time there and we had not yet established the above process. Yet Stanley’s ideas were profound and controversial, particularly when we focused them upon how we go about forming strategies and operations, but also upon our entire targeting logics. Needless to say, this disrupted USSOCOM participants and there were mixed reactions (in my opinion, of course). That is not surprising, and there will continue to be institutional resistance until we educate ourselves about how AI and human-machine teaming need not strictly obey the traditional decision-making models of earlier “human only” methodologies.
4. This article is about music and AI helping composers break out of writer’s block, but also about how amateur musicians might use AI to leapfrog into what used to be entirely professional music editing and production, with the AI nudging the human along through incredible acts of divergent thinking. So a key take-away right away is that this is not about making a strategic planner more proficient in converging toward 1x preferred doctrinal way to do something (like COG analysis)… this is about laymen planners or analysts getting help from AI to massively expand their ways of thinking divergently, creatively, improvisationally, so that entirely novel pathways might be discovered that otherwise could not be found (we would be too busy following directions to notice). This is potentially a way for organizations tasked with highly complex security missions to use AI in a new human-machine teaming approach where the AI stimulates the human operator and learns, suggests, and creates a range of options for that operator to choose from, over and over, until the human selects a powerful and entirely new deliverable that the AI has collaborated upon and generated. This sounds wild, but if it can be done with music and the output is indistinguishable from real human deliverables, why can’t this be done in a host of military applications?
Now let’s dive into this article:
Again, give it a few pages as the front portion goes heavy into AI methodology and music theory. On p. 82, we dive into the meat here:
“The idea is that humans, rather than hard-coded rules, can rate candidate generated music in place of an explicit fitness function.” And, “Because the offspring are generated through slight mutations of the underlying genes of the selected parents, they tend to resemble their parents, while still suggesting novel traits. In this way, over many generations, the user in effect breeds new forms (Dawkins 1986).” Stanley is talking about how the AI system presented here for creating new music will take original parent music examples, morph them in a range of directions, and then ask the human user “which of these are more appealing,” iteratively spawning more mutations of what is selected until the user extends this human-machine decision tree toward an entirely new branch, one where the descendant of the original content is very different from the original parent, modified along the way by human choice and AI generation, leading the team to something that did not exist until now and is potentially better than the original. “Thus the unexploited opportunity at the focus of this article is to borrow from the creative seed already present in the user created scaffold, to enhance the generated output with very few formalized constraints.” (still on p. 82). Next, on p. 88, the authors get back to the core theory here and why this is absolutely game-changing… and realize that this was published in 2014, based on earlier work, meaning this all took shape roughly a decade ago. AI moves fast, and we are much further along than what was available back in 2014.
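To make that breeding loop concrete, here is a minimal, hypothetical sketch in Python (my own illustration, not the authors’ code): the human’s picks stand in for the fitness function, and each new generation is bred from slight mutations of whatever the user selected. The paper’s system evolves far richer representations than the flat list of numbers used here, but the loop structure is the point.

```python
import random

def mutate(parent, rate=0.3, scale=0.2):
    """Return a child genome: a slightly perturbed copy of the parent."""
    return [gene + random.gauss(0, scale) if random.random() < rate else gene
            for gene in parent]

def breed(selected_parents, population_size=8):
    """Fill the next generation by mutating the parents the user picked."""
    return [mutate(random.choice(selected_parents)) for _ in range(population_size)]

def interactive_evolution(seed_genome, rate_candidates, generations=10):
    """Generate candidates, let the human pick, breed from the picks, repeat."""
    population = breed([seed_genome])
    for _ in range(generations):
        # rate_candidates stands in for "render each candidate voice and ask
        # the user which ones they prefer"; it returns the chosen subset.
        selected = rate_candidates(population) or population
        population = breed(selected)
    return population

if __name__ == "__main__":
    # Stand-in "user" who simply prefers larger numbers, just to run the loop.
    fake_user = lambda pop: sorted(pop, key=sum, reverse=True)[:2]
    print(interactive_evolution([0.0, 0.0, 0.0, 0.0], fake_user))
```

Notice that nothing in the loop encodes what “good” means; all of the selection pressure comes from the human, generation after generation.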
Nonetheless, the lessons here are profound. “The theory behind this approach is that, by exploring the potential relationships between a scaffold and the voices generated from that scaffold (as opposed to exploring direct representations of the voice itself), the user is constrained to a space in which candidate generated voices are almost all likely to be coherent with respect to the scaffold.” The system is not ripping apart specific musical content and incrementally (ends-ways-means, reverse goal establishment, engineering via a sequential, linear pathway) generating options the way our military goes about forming COAs and doing campaign planning. Instead, the system is improvising, using mathematics and the user’s subjective responses to a range of mutations of the previous parent concept. Some will be “bad,” and others “promising,” but each time the AI modifies a range of options, there is emergence and diversity across those children mutations. This completely side-steps any reverse planning from predetermined goals or end-states. The planner cannot plan! The planner designs with the AI to create that which is needed but does not yet exist, nor could be imagined immediately by the planner in some A-plus-B-leads-to-C logic. This is quite different. Page 89 is full of important lessons for our military readers. First: “The user explores musical voices in this space by selecting and rating one or more of the computer generated voices from one generation to parent the individuals of the next. The idea is that the good musical ideas from both the rhythmic and pitch functions are preserved with slight alterations or combined to create a variety of new but related functions, some of which may be more appealing than their parents. The space can also be explored without combination by selecting only a single generated voice.
The next generation then contains slight mutations of the original functions.” Now, swap in some intelligence analysis, or a targeting process, or how we generate branches, sequels, or COAs in strategic planning. Consider the possibilities here for defense applications. Music is mathematical, but also creative and subjectively human. The AI cannot do this alone; it is the teaming of human operator with AI. War, in many ways, can be appreciated somewhat in mathematical formulation (our analytical processes, metrics, MOPs, MOEs), but there also is a profoundly human and subjective quality. Clausewitz spoke of military genius able to break all the rules and forge new pathways ahead in war. While this process does not promise Napoleon 2.0, it does offer the possibility that a custom AI system running this process with a human operator will radically expand that human’s divergent thinking on whatever we are focused upon. Or, “Because this approach requires an existing composition, it can help composers with writer’s block who would only like creative assistance with single voices, or amateurs with little composition experience.” (p. 89).
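Here is a second small, hypothetical Python sketch (again my own illustration under stated assumptions, not the MaestroGenesis code) of the “function of the scaffold” idea the quote describes: a candidate voice is never composed from scratch, it is computed by pushing every note of the existing human-written part through a candidate pitch function and rhythm function, so the result stays tied to the scaffold’s structure no matter what the functions do. In the actual system the rhythmic and pitch functions are evolved representations; simple lambdas stand in for them here.

```python
# Scaffold: (MIDI pitch, duration in beats) for each note of the human-written part.
scaffold = [(60, 1.0), (64, 0.5), (67, 0.5), (72, 2.0)]

def generate_voice(scaffold, pitch_fn, rhythm_fn):
    """Compute a candidate voice as a function of the scaffold, note by note."""
    return [(pitch_fn(pitch), rhythm_fn(duration)) for pitch, duration in scaffold]

# Two candidate function pairs a user might be asked to compare; "mutation"
# would perturb the numbers inside them and breed new variants.
candidate_a = generate_voice(scaffold,
                             pitch_fn=lambda p: p + 4,   # roughly a third above
                             rhythm_fn=lambda d: d)      # same rhythm as the scaffold
candidate_b = generate_voice(scaffold,
                             pitch_fn=lambda p: p - 5,   # lower harmony line
                             rhythm_fn=lambda d: d / 2)  # twice as fast

print(candidate_a)  # [(64, 1.0), (68, 0.5), (71, 0.5), (76, 2.0)]
print(candidate_b)
```

Mutating a candidate means perturbing its functions, not its notes, which is why nearly every child remains coherent with the scaffold.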
An important aspect of this human-machine teaming: neither the human nor the AI is a music expert here. “Notice that no musical expertise is needed in any of these scenarios to generate multiple musical voices.” (p. 89). The human could easily be a new planner or analyst, with the AI system drawing upon existing expert content and generating novel and divergent mutations that allow the human operator to experiment with a range of new developments. The terms ‘tacit’ and ‘explicit’ are important here: ‘tacit knowledge’ is that high-level mastery knowledge that a deep expert applies intuitively. Explicit knowledge is what we can simplify into clear, precise instructions for most anyone to follow. Complex reality has both, but you cannot easily move tacit to explicit. Imagine telling someone how to assemble a bike over the phone, where you have the directions and they have the tools. That is explicit. Now try teaching an 8-year-old kid how to ride a bike without training wheels over the phone. That is tacit knowledge: it would be near impossible to do, as that sort of deep mastery cannot be conveyed as simple step-by-step directions. Otherwise, legions of parents would not have their own (usually frustrating) tales of when their child learned to ride without training wheels. What we have here is an AI system that creates a human-machine team using explicit knowledge in the formulas to generate tacit results, starting with what could be someone else’s tacit product (here a musical piece; in our modification, a well-written strategic plan or target packet) and modifying it with a user who may not have nearly the same level of mastery. Yet the team generates a tacit and NOVEL result!
On p. 94, the authors again reinforce this unique aspect: “Thus the survey is a kind of musical Turing test. This perspective is interesting, because FSMC is based on no musical principle or theory other than establishing a functional relationship; if such a minimalist approach (guided by users’ preferences) can generate plausible musical voices it suggests that the theory behind it is at least promising.” The system passes the smell test: the authors checked whether listeners could detect what was generated by AI and what seems like human-created original music. “These results validate the premise that additional evolved voices are at least plausible enough to fool human listeners into confusing partly computer generated compositions with fully human-composed ones.” (p. 94). Remember, this was back in 2014. Things are even more sophisticated now, and we are indeed entering a world where deepfakes will morph from pictures and audio into full movies and videos so staggeringly realistic that humans will not be able to tell the difference. Imagine entire Marvel movies generated by AI with actors who died 40 years ago, and now apply that to security affairs (cyber, irregular warfare (IW), information wars, deception, covert ops, propaganda, and unconventional warfare (UW)).
One last note on p. 95 to reinforce the “so what” of sharing this fascinating article. Consider your planners, staff, and analysts, and when they might “get stuck” in a project. “Results indicate that the users were satisfied with ideas suggested by MaestroGenesis. For instance, when asked if “FSMC helped me explore a broader range of creative possibilities than I could before,” each respondent indicated that MaestroGenesis helped them explore new areas of their creative search space. The students responded, “FSMC freed me from my normal stylistic tendencies,” “I typically follow a sort of pattern when I compose, but FSMC expanded my thinking,” and “specific parts of the output harmonies were very good, and I could see myself applying them in many places throughout the song.” (p. 95).
If this was fascinating, order “Why Greatness Cannot Be Planned,” the book that Stanley and co-author Joel Lehman wrote around the same time; they go into far greater detail on AI and learning that is not preconditioned with reverse-engineered, ends-ways-means logic.
Our DoD needs to seriously consider human-machine teams beyond the traditional and simplistic “human in the loop, human on the loop” constructs. In this context, the human and the machine are creating together in synthesis, not analysis. These are fascinating (and potentially terrifying) times. Enjoy this recommended article and thanks for reading!