**Machine Reasoning for Decision Making by the Information Entropy Minimum Principle** **(An algorithmic approach to inductive reasoning, the way we think and judge)**

Dr. Charles Kim (ckim@howard.edu)

Professor of Electrical Engineering and Computer Science

Providing a quantitative threat level from diverse datasets and information requires an intelligent system that extracts the dominant contributors and that learns and updates as new data are added to the datasets. Working on both existing and updated datasets, the machine reasoning system continually extracts the dominant contributory attributes, generates rules that determine the outcome (True/False, Good/Bad, Threat/No-Threat) from those attributes, and produces the probability of the rules themselves along with margins of error.

The main theory behind dominant-attribute discovery and decision-rule extraction from datasets is the information entropy minimum principle. The “information measure” (I) of an event is defined as proportional to the negative of the logarithm of its probability (p), with k a constant: I = -k*ln(p). Information entropy (S) is defined as the expected value of information over all outcomes: S = -k*Σ p*ln(p). In the entropy-minimum state, all of the information has been extracted and there is no further information gain, corresponding to maximum certainty.
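These two definitions can be written out directly. A minimal Python sketch follows (the function names are illustrative, not from the system itself; the natural logarithm and k = 1 are used, matching the formulas above):

```python
import math

def information(p, k=1.0):
    # Information measure of an event with probability p: I = -k*ln(p)
    return -k * math.log(p)

def entropy(probs, k=1.0):
    # Information entropy: expected value of information, S = -k*sum(p*ln(p))
    # (terms with p = 0 contribute nothing, by the usual 0*ln(0) = 0 convention)
    return -k * sum(p * math.log(p) for p in probs if p > 0)

# A 50/50 split carries the maximum entropy for two outcomes, ln(2);
# a certain outcome carries zero entropy, i.e., maximum certainty.
print(entropy([0.5, 0.5]))  # ≈ 0.6931
```

As the last line suggests, driving S toward its minimum is the same as driving the outcome toward certainty.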

The first step in determining the dominant attribute is to convert all analog-valued sample data to binary-valued data. This “binarization” is performed by threshold calculation: the threshold value with minimum conditional entropy optimizes the separation of the two outcomes, Threat (*T*) and No-Threat (*F*). The conditional entropy *S*(*x*) for a candidate threshold value *x* is defined with the conditional probabilities of the two outcomes, T and F, under two conditions (one for sample values lower than the threshold, x-, and the other for values greater than it, x+) as

S(x) = -p(x-)[p(T|x-)ln(p(T|x-)) + p(F|x-)ln(p(F|x-))] - p(x+)[p(T|x+)ln(p(T|x+)) + p(F|x+)ln(p(F|x+))].

Similarly, after binarization, the conditional entropy of the *i*th attribute, S_{i}, for T or F under attribute value 0 or 1 is as follows:

S_{i} = -p_{i}(0)[p_{i}(T|0)ln(p_{i}(T|0)) + p_{i}(F|0)ln(p_{i}(F|0))] - p_{i}(1)[p_{i}(T|1)ln(p_{i}(T|1)) + p_{i}(F|1)ln(p_{i}(F|1))].

After applying the conditional entropy to all *m* attributes, the attribute A_{k} that produces the minimum conditional entropy best correlates the sample data to the outcomes. The decision rule R_{k} for attribute *k* can then be drawn from the best (highest) conditional probability in the set of four: p_{k}(T|1), p_{k}(F|1), p_{k}(T|0), and p_{k}(F|0).

If, for example, p_{k}(T|1) is the highest of the set, then the decision rule is formed as follows:

R_{k}: IF (A_{k} = 1), THEN (T).
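Attribute selection and rule formation can be combined into one short sketch. The code below is an assumption-laden illustration (my own function names; `data` is a list of binarized attribute rows, `outcomes` the matching "T"/"F" labels), not the system's implementation:

```python
import math

def attr_cond_entropy(col, outcomes):
    # S_i for one binarized attribute column against T/F outcomes
    n, s = len(col), 0.0
    for bit in (0, 1):
        idx = [j for j, v in enumerate(col) if v == bit]
        if not idx:
            continue
        p_bit = len(idx) / n                      # p_i(0) or p_i(1)
        for label in ("T", "F"):
            p = sum(1 for j in idx if outcomes[j] == label) / len(idx)
            if p > 0:
                s -= p_bit * p * math.log(p)
    return s

def best_rule(data, outcomes):
    # Pick the attribute A_k with minimum conditional entropy, then form
    # the rule from the highest of p_k(T|1), p_k(F|1), p_k(T|0), p_k(F|0).
    m = len(data[0])
    cols = [[row[i] for row in data] for i in range(m)]
    k = min(range(m), key=lambda i: attr_cond_entropy(cols[i], outcomes))
    best = (-1.0, None, None)
    for bit in (0, 1):
        idx = [j for j, v in enumerate(cols[k]) if v == bit]
        if not idx:
            continue
        for label in ("T", "F"):
            p = sum(1 for j in idx if outcomes[j] == label) / len(idx)
            if p > best[0]:
                best = (p, bit, label)
    p, bit, label = best
    return k, f"IF (A_{k} = {bit}) THEN ({label})", p

# Attribute 1 perfectly predicts the outcome here, so it is selected
# and its rule carries conditional probability 1.0.
data = [[0, 1], [1, 1], [0, 0], [1, 0]]
outs = ["T", "T", "F", "F"]
print(best_rule(data, outs))
```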

In this step, the probability (or certainty) of the decision rule itself is generated from the maximum-entropy-based Bayes estimate <p(O)> = (x + 1)/(n + 2), where x is the total number of samples satisfying the condition (T|1) and n is the total number of samples satisfying the attribute condition. Also, the margin of error of the drawn probability is obtained by e(O) = z*sqrt[{<p(O)>*(1 - <p(O)>)}/(n + 2)], where z is the z-score for the desired confidence interval.
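These two estimates are one-liners. A small sketch (illustrative names; z = 1.96 assumed for a 95% confidence interval):

```python
import math

def rule_probability(x, n):
    # Maximum-entropy-based Bayes estimate: <p(O)> = (x + 1) / (n + 2)
    return (x + 1) / (n + 2)

def margin_of_error(x, n, z=1.96):
    # e(O) = z * sqrt(<p(O)> * (1 - <p(O)>) / (n + 2))
    p = rule_probability(x, n)
    return z * math.sqrt(p * (1 - p) / (n + 2))

# Say 18 of the 20 samples with A_k = 1 are Threat:
print(round(rule_probability(18, 20), 4))  # 0.8636
print(round(margin_of_error(18, 20), 4))
```

Note that the estimate stays strictly between 0 and 1 even for x = 0 or x = n, which is the practical benefit of the (x + 1)/(n + 2) form over the raw ratio x/n.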

Usually, not all samples can be directly linked to a single decision rule. We therefore apply step-wise approximation: after the first attribute and its corresponding decision rule are found, we remove from the binarized dataset all samples that match the rule and repeat the conditional-entropy-minimum process on the remaining samples.
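The step-wise loop itself is simple to express. In this sketch, `find_rule` stands for any rule-extraction step (such as the attribute-selection procedure described above); it is a hypothetical callable, assumed to return the rule and the indices of the samples that rule covers:

```python
def extract_rules(data, outcomes, find_rule):
    # Step-wise approximation: find the best rule, remove the samples it
    # explains from the binarized dataset, and repeat on the remainder.
    rules = []
    while data:
        rule, covered = find_rule(data, outcomes)
        if not covered:          # rule explains no remaining sample: stop
            break
        rules.append(rule)
        drop = set(covered)
        data = [row for j, row in enumerate(data) if j not in drop]
        outcomes = [o for j, o in enumerate(outcomes) if j not in drop]
    return rules
```

Each pass shrinks the dataset, so the loop terminates with an ordered list of rules, each fitted to the samples its predecessors did not explain.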

We tested the implemented machine reasoning system with several example datasets, including Political Regime Characteristics and Transitions, the Behavioral Risk Factor Surveillance System of the CDC, and the Profiles of Individual Radicalization in the United States (PIRUS) of the National Consortium for the Study of Terrorism and Responses to Terrorism (START).

Application Areas: (1) Threat level determination in Irregular Warfare and Counterinsurgency with human terrain data; (2) Dominant behavior discovery in insider threat detection and monitoring, resiliency, and diagnostics; (3) Radicalization detection; (4) Machine learning of radiation effects: parametric impact on the sensitivity of electronic devices.

**Links**

Whitepaper: Development of an Automated Diagnostic Rule Generation System for Mental and Behavioral Disorders

Dissertation: An intelligent decision making system for detecting high impedance faults

Article: Classification of faults and switching events by inductive reasoning and expert system methodology

Article: A learning method for use in intelligent computer relays for high impedance faults

Article: High impedance fault detection using an adaptive element model

Article: Machine reasoning for determining radiation sensitivity of semiconductor devices (The 20th International Conference on Artificial Intelligence, July 30 - August 2, 2018. Las Vegas, NV.)

Article: Identification of symptom parameters for failure anticipation by timed-event trend analysis

**WWW.MWFTR.COM**