Whiteboard Technical Series: Mutual Information
Yes, you DO need mutual information in your network.
Push play to discover how the Juniper Mist™ AI-driven platform uses mutual information to help you understand which network features — such as mobile device type, client OS, or access point — have the most information for predicting failure or success in your service-level expectation (SLE) metrics.
You’ll learn
The definition of mutual information, what it means, and some examples
How mutual information works with SLE metrics
Transcript
0:12 Today we're talking about how the Juniper Mist AI-driven platform uses mutual information to help you understand which network features, such as mobile device type, client OS, or access point, have the most information for predicting failure or success in your client SLE metrics. Let's start with a definition of mutual information.
0:34 Mutual information is defined in terms of the entropy between random variables. Mathematically, the mutual information is defined as the entropy of random variable X minus the conditional entropy of X given Y.
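In standard notation, that definition (together with the equivalent, symmetric form the video leans on later, when it conditions Y on the feature X) reads:

```latex
I(X;Y) = H(X) - H(X \mid Y) = H(Y) - H(Y \mid X)
```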
0:50 Now, what does this mean? Let me give you an example. Let's say Y is one of our random variables that we want to predict, and it represents the SLE metric time to connect; it can take one of two possible values, pass or fail. Next, we have another random variable X that represents a network feature, and it can take a value of present or not present. A network feature could be a device type, an OS type, a time interval, or even a user or an AP; any feature of the network can be represented by this random variable.
1:23 Next, we'll look at what we mean by entropy. For most people, hearing the term entropy brings to mind the universe, where entropy is always increasing as the universe tends toward less order and more randomness, or uncertainty. So entropy represents the uncertainty of a random variable, and the classic example is a coin toss. If I have a fair coin and I want to flip that coin, the entropy of that random variable is given by the negative sum of the probability of each outcome times the log base 2 of that probability: H(X) = −Σ p(x_i) log2 p(x_i). For that fair coin, the probability is 50% heads and 50% tails, and the entropy equals 1, the maximum entropy possible. When you have maximum uncertainty, the random variable has maximum entropy.
2:15 Now take an example where we don't have a fair coin: some hustler out there is using a loaded coin, where the probability of heads is 70% and the probability of tails is 30%. In this case, the entropy is about 0.88. So you can see that as the uncertainty goes down, the entropy trends toward zero; zero entropy would mean no uncertainty, and the coin flip would always come up heads or always come up tails.
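As a quick check of those numbers, here is a minimal Python sketch, assuming nothing beyond the coin probabilities quoted above, that evaluates H(X) = −Σ p(x_i) log2 p(x_i) for each coin:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H(X) = -sum(p(x_i) * log2(p(x_i)))."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin:    1.0 bit, maximum uncertainty
print(entropy([0.7, 0.3]))  # loaded coin:  ~0.88 bits
print(entropy([1.0, 0.0]))  # always heads: 0.0 bits, no uncertainty
```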
2:42 Now let's go back and see how mutual information works with our SLE metrics. Graphically, what does this equation look like? Say this circle here represents the entropy of my SLE metric Y, and this circle is the entropy of my feature random variable X. Looking at our equation, the conditional entropy of the random variable Y given the network feature X is this area here, and if I subtract the two, what we're looking for is this middle segment. That overlap represents the mutual information of the two random variables, and it gives you an indication of how much information your network feature provides about your SLE metric random variable Y. If the network feature tells you everything about the SLE metric, the mutual information is at its maximum; if it tells you nothing about the SLE metric, the mutual information between X and Y is zero.
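To make that subtraction concrete, here is a small Python sketch using a made-up 2x2 table of counts (feature present or absent versus the time-to-connect SLE passing or failing; the numbers are illustrative, not Mist data). It computes H(Y), then H(Y|X), then their difference:

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Illustrative joint counts: rows are the feature X, columns the SLE metric Y.
counts = {("present", "pass"): 10, ("present", "fail"): 40,
          ("absent",  "pass"): 45, ("absent",  "fail"): 5}
total = sum(counts.values())

# Marginal entropy of the SLE metric: H(Y)
p_pass = (counts[("present", "pass")] + counts[("absent", "pass")]) / total
h_y = entropy([p_pass, 1 - p_pass])

# Conditional entropy: H(Y | X) = sum over x of P(X = x) * H(Y | X = x)
h_y_given_x = 0.0
for x in ("present", "absent"):
    n_x = counts[(x, "pass")] + counts[(x, "fail")]
    p_pass_given_x = counts[(x, "pass")] / n_x
    h_y_given_x += (n_x / total) * entropy([p_pass_given_x, 1 - p_pass_given_x])

# Mutual information: I(X; Y) = H(Y) - H(Y | X)
print(f"H(Y)   = {h_y:.3f} bits")
print(f"H(Y|X) = {h_y_given_x:.3f} bits")
print(f"I(X;Y) = {h_y - h_y_given_x:.3f} bits")  # ~0.40 bits for these counts
```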
3:38 Now, mutual information tells you how much information a network feature random variable X gives you about the SLE metric time to connect, but it doesn't tell you whether the network feature is better at predicting failure or success of the SLE metric. For that, we need something called the Pearson correlation.
3:56 If you look at the picture of the correlation, it tells us a couple of things. One is the amount of correlation, with a range from −1 to 1; the other is the sign, negative or positive, which is a predictor of pass or fail. So now we have these two things: first, the magnitude, indicating how correlated the two random variables are; second, the sign, which indicates failure or success. If the correlation is negative, the network feature is good at predicting failure; if it's positive, it's good at predicting a pass. If the Pearson correlation is zero, it means there is no linear correlation between the variables, though there could still be mutual information between the two. But the Pearson correlation does not tell us the importance of the network feature, or whether there's enough data to make an inference between the network feature random variable and the SLE metric random variable.
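As a hedged illustration of that sign convention, the sketch below codes the SLE metric as 1 for pass and 0 for fail, and the feature as 1 for present and 0 for absent, on made-up samples where the feature is mostly present when the client fails; the Pearson correlation then comes out negative, marking a failure predictor:

```python
import numpy as np

rng = np.random.default_rng(0)
feature = rng.integers(0, 2, size=1000)          # 1 = feature present
noise   = rng.random(1000) < 0.1                 # 10% of samples flipped
sle     = np.where(noise, feature, 1 - feature)  # 1 = pass; mostly fail when present

r = np.corrcoef(feature, sle)[0, 1]
print(f"Pearson correlation: {r:+.2f}")          # strongly negative here
if r < 0:
    print("negative sign -> good at predicting failure")
elif r > 0:
    print("positive sign -> good at predicting pass")
```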
4:53 Now let's go back to our graphic of the circles. There may be one case where I have very high entropy for both variables, and another case where I have much smaller entropy for one of those variables. Both of these examples may be highly correlated, with a high Pearson value, but the mutual information will be much higher in the first case, which means that random variable has much more importance in predicting success or failure of the SLE metric.
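That claim can be reproduced numerically. The sketch below, using two assumed 2x2 joint distributions rather than Mist data, builds two feature/metric pairs with the same Pearson correlation but different marginal entropies; the high-entropy pair carries noticeably more mutual information:

```python
import math

def entropy(probs):
    return sum(-p * math.log2(p) for p in probs if p > 0)

def mi_and_pearson(p11, p10, p01, p00):
    """Mutual information (bits) and Pearson correlation for a 2x2
    joint distribution over binary X (feature) and Y (SLE metric)."""
    px, py = p11 + p10, p11 + p01                # P(X=1), P(Y=1)
    h_y = entropy([py, 1 - py])
    h_y_given_x = px * entropy([p11 / px, p10 / px]) \
        + (1 - px) * entropy([p01 / (1 - px), p00 / (1 - px)])
    pearson = (p11 - px * py) / math.sqrt(px * (1 - px) * py * (1 - py))
    return h_y - h_y_given_x, pearson

# Case 1: high entropy on both variables (P(X=1) = P(Y=1) = 0.5).
mi1, r1 = mi_and_pearson(0.40, 0.10, 0.10, 0.40)
# Case 2: same correlation, much lower entropy (P(X=1) = P(Y=1) = 0.1).
mi2, r2 = mi_and_pearson(0.064, 0.036, 0.036, 0.864)

print(f"case 1: pearson = {r1:.2f}, I(X;Y) = {mi1:.3f} bits")  # 0.60, ~0.28
print(f"case 2: pearson = {r2:.2f}, I(X;Y) = {mi2:.3f} bits")  # 0.60, ~0.16
```

Both pairs correlate equally, but the first carries almost twice the information about pass or fail, matching the video's point about feature importance.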
5:22 I hope this gives a little more insight into the AI we've created at Mist. If you look at the Mist dashboard, the result of this process is demonstrated by our virtual assistant.