
Learning human insight by cooperative AI: Shannon-Neumann measure

Published 9 April 2021 • © 2021 The Author(s). Published by IOP Publishing Ltd
Citation: Edouard Siregar 2021 IOP SciNotes 2 025001. DOI: 10.1088/2633-1357/abec9e


Abstract

A conceptually sound solution to a complex real-world challenge is built on a solid foundation of key insights, gained by posing 'good' questions at the 'right' times and places. If the foundation is weak, due to insufficient human insight, the resulting conceptually flawed solution can be very costly or impossible to correct downstream. The responses to the 2020 global pandemic by countries relying on just-in-time supply and production chains and fragmented health-care systems are striking examples. Here, artificial intelligence (AI) tools that assist human insight are of significant value. We present a computational measure of insight gains, which a cooperative AI agent can compute by maintaining a specific internal framework and by observing how a human behaves. This measure enables a cooperative AI to maximally boost human insight during an iterated questioning process: a solid foundation for solving complex open-ended challenges. It is an AI-Human insight bridge, built on Shannon entropy and von Neumann utility. Our next paper will address how this measure and its associated strategy reduce a hard cooperative inverse reinforcement learning game to simple Q-Learning, proven to converge to a near-optimal policy.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Insight, defined as a combination of discernment, understanding, and intuition, is critical in the conceptual phases of building solutions to open-ended real-world challenges, which involve a complex mixture of human, financial, and technological needs. Advances in big data and machine learning (ML) generated insights [1] can be extremely useful, yet they will remain profoundly incomplete as long as the human dimension is not well accounted for.

In the context of business, financial, or scientific insight and discovery, ML is extensively used to extract useful patterns by combining big data with human-generated prior domain knowledge, to enable interpretation and explanation. We approach the question from the complementary perspective of human-compatible artificial intelligence (AI) [2]: instead of taking humans out of the loop during computations (which risks producing an opaque 'black box' and missing the human dimension), we focus on human-centered AI that assists people in gaining insights by observing human behavior.

We propose a computational measure μI of insight I, enabling an AI to compute gains in I resulting from a cooperative questioning process between a human (H) and an AI cognitive assistant (Acog).

The framework for the AI's learning is a two-person Iterated Questioning (IQ) game: a set Q of (yes/no) questions qi  ∈ Q is posed to reach a goal G. The set Q is shared by both H and Acog, whose only objective is to maximize the total insight gained by H over the course of the cooperative game [3]. Each IQ game is played until H achieves a given long-term objective: reaching a minimum threshold of total insight gained. Reaching this threshold defines a learning episode for Acog. The game is 'solved' when optimal strategies (the shortest way to reach the threshold) are found for H and Acog.

For Acog to achieve its single-minded role, it needs to learn about human insight. It can learn this from the way we react during the IQ game. In this cooperative two-person game, Acog's decision policy is to suggest a probability-ranked list of questions ${q}_{i}\in Q;(i=1,\ldots ,n)$ it believes (in a Bayesian sense) are most insightful for H, and H's decision policy is to select the most promising questions ${q}_{j}\in Q$ to explore, and then signal how useful qj was.

This enables Acog to compute μI simply by observing H. The measure ${\mu }_{I}^{G}(Q)$ combines the (objective) Shannon entropy S with a user-defined (subjective) von Neumann utility U. We call it the Shannon-Neumann, or SN, measure of insight gains [3, 4].

The assistant Acog's only objective is to help H gain a maximum total amount of insight ${\mu }_{T}^{G}$ over the pursuit of a long-term objective. This approach is consistent with Stuart Russell's thesis [5] for designing human-compatible AI, where the AI assistant cooperatively realigns with H's short- and long-term goals.

2. Insight gains in a questioning process

2.1. Measuring insight gains

Gains of insight are normally defined as changes in discernment (precision increase), understanding (error correction, uncertainty reduction), and intuition (familiarity boost) about a topic T. We want to introduce a reasonable computational measure of the insight gains that result from exploring a set of questions. Discernment, understanding, and intuition about a topic T are normally gained by posing a set of 'good questions' at the 'right times and places', leading to stories, metaphors, analogies, models, and theories of T with increasing accuracy, precision, and relevance. In this paper we address how the AI agent learns what a 'good question' is (one that maximizes the insight gains). In the next paper [6], we model the notion of the 'right times/places' using cooperative inverse reinforcement learning [7].

The IQ game is a precision-boosting, uncertainty-reducing process: we start from an approximate understanding of T, whose precision increases and whose error decreases under iterated questioning. We formalize the IQ game using the simplest model that retains its essence: posing a suite of (yes/no) questions Q to achieve a goal G, which we describe next.

2.2. Framework for insight gains

We create a framework that enables us to separate the 'good questions' (more informative and insightful ones) from the 'bad questions' (less informative and insightful ones). The framework can be thought of as an elaborate form of the twenty-questions (yes/no) game.

A formal questioning framework F (our toy universe) is used to frame and measure insight gains $\delta {I}^{G}(Q)\equiv {\mu }_{I}^{G}(Q)$ towards achieving a goal G when answering questions in Q. The measure is built by combining Shannon information S [8] with a von Neumann utility measure U associated with a set of questions Q: we call it the SN-measure. The elements of Acog's toy cognitive universe $F(T,G,Q)$ are:

  • A topic T consisting of a bag of N balls, each with only two properties: ${color}(B,W)$ and ${size}(L,S)$. The long-term goal is to fully determine T.
  • A set G of short-term goals ${g}_{i}\in G;i=1,2$ about T.
  • A set Q of (yes/no) questions ${q}_{i}\in Q;i=1,\ldots ,4$. Q is the information source (in the information-theoretic sense).

The set Q is a discrete information source [9], as the answer to each qi produces a certain amount of information (and insight towards achieving G) by reducing the uncertainty or ambiguity about T.
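As a concrete illustration, here is a minimal Python sketch of the toy universe $F(T,G,Q)$. All names (Ball, topic_T, questions_Q) are our own illustrative choices, and the predicate for q4 is a placeholder, since its answer carries no information about T.

```python
from dataclasses import dataclass
import random

# Toy universe F(T, G, Q): a bag of N balls, each with two binary properties.
@dataclass(frozen=True)
class Ball:
    color: str  # 'B' or 'W'
    size: str   # 'L' or 'S'

N = 8
topic_T = [Ball(random.choice('BW'), random.choice('LS')) for _ in range(N)]

# Short-term goals about T (descriptions only; H judges their utility).
goals_G = {
    'g1': 'know the proportion of balls in T of each color',
    'g2': 'know the proportion of balls in T of each size',
}

# The shared (yes/no) questions, encoded as predicates on T.
questions_Q = {
    'q1': lambda T: sum(b.color == 'B' for b in T) == len(T) // 2,
    'q2': lambda T: sum(b.color == 'B' for b in T) > len(T) / 2,
    'q3': lambda T: sum(b.size == 'L' for b in T) > len(T) / 2,
    'q4': lambda T: False,  # 'did Sam place the balls?': no information about T
}
```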

In this IQ game, our long-term goal is to fully determine T's composition. The short-term goals ${g}_{i}\in G$, to gain insights about T, are:

  • ${g}_{1}\,=$ 'To know the proportion of balls in T of each color'
  • ${g}_{2}\,=$ 'To know the proportion of balls in T of each size'

The insight-generating (yes/no) questions qi  ∈ Q on the topic T could be, for example, the following:

  • ${q}_{1}\,=$ 'are there $N/2$ B balls in T?'
  • q2 =  'are the balls in T mostly B?'
  • q3 =  'are the balls in T mostly L?'
  • ${q}_{4}\,=$ 'did Sam place the balls in T?'

This simple framework allows us to express the necessary (and arguably sufficient) properties of a good measure ${\mu }_{I}^{G}(Q)$ of insight gains in mathematical terms.

2.3. Necessary properties of insight gains measures

We want to measure the insight ${\mu }_{I}^{G}(Q)$ gained by a 'yes' answer to each question qi  ∈ Q in the information source Q, with respect to each short-term goal ${g}_{j}\in G$.

The required properties ${P}_{i}\in P$ of a meaningful insight measure μ are the following:

  • P1: ${\mu }_{I}^{{g}_{1}}({q}_{1})\gt {\mu }_{I}^{{g}_{1}}({q}_{2})\gt 0$
  • P2: ${\mu }_{I}^{{g}_{1}}({q}_{2})\gt 0$ and ${\mu }_{I}^{{g}_{2}}({q}_{3})\gt 0$
  • P3: ${\mu }_{I}^{{g}_{1}}({q}_{3})=0$ and ${\mu }_{I}^{{g}_{2}}({q}_{2})=0$
  • P4: ${\mu }_{I}^{{g}_{1}}({q}_{4})=0$ and ${\mu }_{I}^{{g}_{2}}({q}_{4})=0$

The justifications for these requirements on a measure μ of insight gain are:

  • P1: ${\mu }_{I}^{{g}_{1}}({q}_{1})\gt {\mu }_{I}^{{g}_{1}}({q}_{2})$ because a 'yes' answer to q1 is more informative (reduces more uncertainty) than a 'yes' answer to q2, and both are useful for achieving g1.
  • P2: $\mu \gt 0$ when a 'yes' answer to qi is informative (about T) and useful for gj.
  • P3: μ = 0 when a 'yes' answer to qi is informative (about T), but qi is useless for gj.
  • P4: μ = 0 when a 'yes' answer to qi does not produce relevant information.

The intuitive interpretation is that for a question to be insightful (to produce new insight), it must be simultaneously informative (Shannon) about the topic T and useful (von Neumann) for achieving the goal G. In other words, we define insightful as both informative (uncertainty-, error-, and ignorance-reducing) about a topic and actionable (useful) for achieving a goal.

We can now construct the measure ${\mu }_{I}^{G}(Q)$ of insight gains, based on the required properties ${P}_{i}\in P;(i=1,\ldots ,4)$.

3. Shannon-Neumann measure ${\mu }_{I}^{G}(Q)$

Given the property requirements $P=\{{P}_{i};(i=1,\ldots ,4)\}$ for the measure of insight gains μ, we can now precisely define the measure ${\mu }_{I}^{G}(Q)$ in terms of the Shannon entropy S and a (human-evaluated) von Neumann utility U.

For our needs, we define the Shannon [10] information $S({q}_{i})$ produced by the 'yes' answer to a question qi as (log base 2):

$S({q}_{i})=-{\mathrm{log}}_{2}\,p({q}_{i})$          (1)

and the Shannon entropy S(Q) of an entire information source Q is the expected amount of information from the source Q:

$S(Q)=-{\sum }_{i=1}^{n}{p}_{i}\,{\mathrm{log}}_{2}\,{p}_{i}$          (2)

Relations (1) and (2) are the fundamental definitions of entropy in information theory [11] and, earlier, in statistical mechanics [12, 13]. In the context of our two-person cooperative IQ game, ${p}_{i}=p({q}_{i})$ is the probability that the answer to the (yes/no) question qi about the topic T is 'yes'.

Note that the more uncertainty (ignorance) the answer to qi resolves, the lower its probability $p({q}_{i})$ and the more informative it is. The information units are bits when the log base is 2. So a good estimator for the probabilities pi is the amount of uncertainty a 'yes' answer eliminates (just as 'good' questions in the twenty-questions game cut the possibilities in half; the same holds for the coin-weighing game to identify the single defective coin).

For example, if the random variable is the throw of a fair die, the answer to ${q}_{1}\,=$ 'is the outcome a 6?' is less probable ($p({q}_{1})=1/6$) and more informative than the answer to q2 = 'is the outcome an even number?' ($p({q}_{2})=1/2$): $S({q}_{1})={\mathrm{log}}_{2}\,6\approx 2.58$ bits $\gt S({q}_{2})={\mathrm{log}}_{2}\,2=1$ bit, while the 100% certain answer to q3 = 'is the outcome an integer?' adds no information: $S({q}_{3})={\mathrm{log}}_{2}\,1=0$ bits.
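A quick numerical check of equation (1) on this fair-die example (the helper name shannon_info is ours):

```python
import math

def shannon_info(p_yes: float) -> float:
    """Shannon information (bits) of a 'yes' answer of probability p_yes, eq. (1)."""
    return -math.log2(p_yes)

print(shannon_info(1/6))  # q1 'is the outcome a 6?'        -> ~2.585 bits
print(shannon_info(1/2))  # q2 'is the outcome even?'       -> 1.0 bit
print(shannon_info(1.0))  # q3 'is the outcome an integer?' -> -0.0, i.e. zero bits
```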

The Shannon entropy is consistent with some desirable properties of an insight-gains measure, but it is missing a key ingredient: utility with respect to achieving the goal g, to which we now turn. The search for new insight is usually done in the context of a specific target goal.

Standard utility, as defined by the von Neumann-Morgenstern utility theorem [15], has long been a foundation of economic and (rational) decision-making theory, and has successfully described some aspects of human decisions, but by no means all [15]. For our cooperative two-person IQ game, we use the simplest notion of utility that does the job (Occam's razor). The simplest utility function $U({q}_{i},{g}_{j})\in \{0,1\}$ is assigned by H once the question qi has been answered, since only H can tell how useful a question qi is with respect to his or her short-term goal gj. The cooperative H needs to express an observable sign of U (e.g. clapping or smiling for U = 1 util; neither clapping nor smiling for U = 0 utils), so that the agent Acog can measure $U({q}_{i},{g}_{j})$ simply by observing H.

To satisfy all properties ${P}_{i}\in P\ (i=1,\ldots ,4)$, we define the insight ${\mu }_{I}^{{g}_{j}}({q}_{i})$ gained from a 'yes' answer to a question qi (from the information source Q), towards achieving a goal gj, to be:

${\mu }_{I}^{{g}_{j}}({q}_{i})=S({q}_{i})\cdot U({q}_{i},{g}_{j})$          (3)

Definition (3) of the insight gain satisfies all properties ${P}_{i}\in P$, and combines the Shannon information S produced with a personal, subjective utility U (von Neumann), acting as an AI-Human bridge for gaining insight. The SN-measure's units can be taken to be ${bits}\cdot {utils}$, and it can be normalized into a mathematical (monotone) measure on $[0,1]$. It satisfies the definition of insight as a combination of discernment, understanding, and intuition, which all increase under the error-correcting, uncertainty-reducing, and familiarity-boosting IQ game.
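The sketch below implements equation (3) and verifies properties P1–P4. The probability estimates and utility values are assumptions chosen to mirror the toy universe, not values from the paper:

```python
import math

def sn_measure(p_yes: float, utility: int) -> float:
    """SN insight gain of eq. (3): mu = S(q) * U(q, g), in bits*utils."""
    return -math.log2(p_yes) * utility

# Assumed yes-probabilities and human-signalled utilities for goal g1:
mu_q1 = sn_measure(p_yes=0.25, utility=1)  # very informative and useful
mu_q2 = sn_measure(p_yes=0.50, utility=1)  # less informative, still useful
mu_q3 = sn_measure(p_yes=0.50, utility=0)  # informative about T, useless for g1
mu_q4 = sn_measure(p_yes=1.00, utility=0)  # neither informative nor useful

assert mu_q1 > mu_q2 > 0          # property P1 (and P2)
assert mu_q3 == 0 and mu_q4 == 0  # properties P3 and P4
```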

4. Shannon-Neumann strategies

For the AI agent to learn, the SN-measure (3) demands an iterated cooperation between Acog and the user H during the IQ game, in pursuit of the long-term goal of fully determining T. In this two-person (Acog, H) cooperation, each player agrees to follow a well-defined decision strategy (policy), ΠA and ΠH respectively.

Policy ΠA : given a short-term goal ${g}_{j}\in G$, Acog suggests to H a Bayesian probability distribution (evidence-based plausibility beliefs) over Q, as an ordinally ranked list of questions, from most to least likely informative. Acog can start with an estimated Bayesian prior distribution $\{p(\ {q}_{i}\ )\}$, using the relative amounts of uncertainty removed by 'yes' answers to rank the $p(\ {q}_{i}\ )$ ordinally. This means Acog's initial policy ΠA is approximate. The two-person iterated cooperative game should be designed so that ${{\rm{\Pi }}}^{A}\ \to {{\rm{\Pi }}}^{A* }$, the optimal policy. In our next paper, we show that a Shannon-Neumann strategy (based on the SN-measure) allows us to reduce a hard two-person game of cooperative inverse reinforcement learning to simpler Q-Learning, proven to converge to a near-optimum [17].

Policy ΠH : given a short-term goal ${g}_{j}\in G$, H goes through Acog's proposed probability-ranked list of questions ${q}_{i}\in Q$ and expresses (signals) their utility $U({q}_{i},{g}_{j})$.
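A minimal sketch of one round under these two policies follows. The ranking heuristic and the stubbed human signal are our assumptions for illustration; the paper leaves both to Acog's design and to H's behavior:

```python
import math

def policy_A(prior: dict[str, float]) -> list[str]:
    """Pi^A: rank questions from most to least informative,
    i.e. by ascending estimated probability of a 'yes' answer."""
    return sorted(prior, key=prior.get)

def policy_H(question: str, goal: str) -> int:
    """Pi^H: H explores a question, then signals its utility (0 or 1 util).
    Stubbed: only the color questions q1, q2 are useful for goal g1."""
    return 1 if goal == 'g1' and question in ('q1', 'q2') else 0

prior = {'q1': 0.25, 'q2': 0.5, 'q3': 0.5, 'q4': 1.0}  # assumed Bayesian prior
total_insight = 0.0
for q in policy_A(prior):                      # A_cog proposes the ranked list,
    U = policy_H(q, goal='g1')                 # H signals utility after answering,
    total_insight += -math.log2(prior[q]) * U  # A_cog accumulates mu, eq. (3)
print(total_insight)  # total SN insight gained this round (bits*utils): 3.0
```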

The IQ game enables Acog to refine its own understanding of human insight gains in an iterated manner. Recall that Acog's only purpose in life (its objective) is to maximize H's total insight gain ${\mu }_{T}^{G}(Q)$ over the course of the IQ process. Using the SN-policy reduces solving the computationally hard two-person cooperative game to a proven Q-Learning of a near-optimal policy ${{\rm{\Pi }}}^{A* }$.

We now ask an important question: is there an optimal way to structure the information source Q for the short-term goals in G? The answer is 'yes', thanks to Shannon's notion of source entropy.

The Shannon entropy S(Q) of the whole information source Q is the expected amount E of information from the source Q:

$S(Q)=E[S({q}_{i})]=-{\sum }_{i=1}^{n}{p}_{i}\,{\mathrm{log}}_{2}\,{p}_{i}$          (4)

where ${p}_{i}=p(\ {q}_{i}\ )$ is the probability of a 'yes' answer to the (yes/no) question ${q}_{i}\ \in Q\ (i=1,\ldots ,n)$. In agreement with Shannon's theory, the probabilities are set by the information source, in our case Acog, which structures Q. The AI agent Acog can construct Q so as to interpret the questions qi as independent hypotheses to be tested, and the $p({q}_{i})$ as probabilities of outcomes of independent experiments (e.g. repeated coin tosses), or as probabilities of answers to independent questions (e.g. the twenty-questions game).

In this latter interpretation, S(Q) is the expected Shannon entropy (mean information gained per question) of the source Q, which we can maximize by choosing the equilibrium distribution:

${p}_{i}=1/n\quad (i=1,\ldots ,n)$          (5)

All (yes/no) questions qi  ∈ Q then have an equally probable 'yes' answer, and the expected information gain S(Q) is maximal:

$S{(Q)}_{\max }={\mathrm{log}}_{2}\,n$          (6)

To optimize the effectiveness and efficiency of the IQ game, the agent Acog should aim to structure Q as close as possible to the equilibrium distribution, maximizing the expected information (and thus insight) gained by H over the course of the entire IQ episode. Note also that S(Q) increases with the number n of questions in Q.
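A short numerical illustration of equations (4)-(6): the source entropy is maximal at the equilibrium distribution and drops for an unbalanced source (the skewed distribution below is an arbitrary example):

```python
import math

def source_entropy(p: list[float]) -> float:
    """Shannon entropy of the source Q, eq. (4): S(Q) = -sum p_i log2 p_i."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

n = 4
uniform = [1 / n] * n           # equilibrium distribution, eq. (5)
skewed = [0.7, 0.1, 0.1, 0.1]   # an unbalanced source, for comparison

print(source_entropy(uniform))  # log2(4) = 2.0 bits: the maximum, eq. (6)
print(source_entropy(skewed))   # ~1.357 bits: strictly below log2(n)
```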

5. Conclusion

In complex real-world challenges, conceptually sound solutions are built on a solid foundation of key conceptual insights, gained by posing 'good' questions at the 'right' times and places. If the foundation is weak due to insufficient insight (often strong from a strictly financial and/or technological standpoint, but weak on the human dimensions), the resulting flawed solutions can be very costly or impossible to correct downstream. The outcomes in countries with 'just-in-time' supply chains and fragmented health-care systems during the 2020 global pandemic come to mind. AI tools that boost human insight are thus invaluable in helping us build sound solutions to complex open-ended challenges: designing, implementing, optimizing, validating, and verifying complex human-technological systems and networks.

We defined a computational measure of insight gains, which an AI agent can compute simply by having an internal questioning framework $F(T,G,Q)$ and by observing how a human behaves. The measure requires an information source Q that is simultaneously informative (S, in Shannon bits) about a topic T and actionable (U, in von Neumann utils) for achieving a given goal G. This measure satisfies four necessary properties ${P}_{i}\ \in P$, as well as the intuitive definition of new insight as gains in discernment, understanding, and intuition.

From the perspective of solution-building as a two-step cycle between (A) formulating a conceptually sound component built on solid insights, and (B) executing the technical component (implementing the conceptual component), our measure ${\mu }_{I}^{G}(Q)$ is most timely, since the AI/ML revolution is increasingly automating tasks of type (B).

In fact, we argue, along with many others before us [18–22], that H should always remain heavily in the loop, no matter how potent AI/ML becomes, for two main reasons: (1) it may eventually be the only work left for us to do, and (2) no advanced non-human entity (e.g. an artificial general intelligence, AGI) can do it as well for our own benefit (just as we are incapable of doing it for, say, bonobos and orcas, no matter how much smarter we think we are). It takes a human to think and feel like one! This perspective makes our Shannon-Neumann measure ${\mu }_{I}^{G}(Q)$ most valuable as an AI-Human bridge.

In this paper, our SN-measure addressed the question of 'good' versus 'bad' questions. In paper II, we address the difficulty of determining the 'right' times and places, by formalizing the two-person IQ game as cooperative inverse reinforcement learning.

Data availability statement

No new data were created or analysed in this study.
