nm0929: right shall we make a start then please [0.6] am i switched on yeah [0.3] good [1.2] er [0.3] just a reminder as always [0.5] er i have my office hours people are beginning to come and see me which is good i get the feeling that the assessed exercise is focusing minds which is great [0.4] er [0.2] notice when these hours are i sh-, i'm [0.2] i'm available then [0.5] with a caveat from yesterday unfortunately Murphy's law [0.2] but these are times when you can come anyway [0.3] you don't need to ask me first can i come and see you in your office hours just turn up [0.9] and the same is true of other lecturers and class tutors [0.4] er as well [0.2] so do use the [0.2] office hours if you need to [0.6] and you will of course find as you start revising for your examinations that you need to [0.4] so get used to it earlier [0.3] rather than later [0.8] where are we in the things that we have to discuss [0.4] well we've got three more lect-, [0.2] three more weeks of lectures [0.6] er which is weeks seven [0.2] eight and nine we've got two lectures each week in those weeks there are no lectures after week nine [0.6] in this course [0.7] and we're going to be finishing off with probability distributions today [0.3] and we're going to move on to the applications [0.5] er in in the five remaining [0.3] lectures [1.2] so this is where we are at the moment [7.8] right when we finished er last week [0.3] we were talking about the topic of [0.2] er covariance joint distributions for [0.2] random variables and talking about working out the relationship between variables [0.7] and this was the example that i was using it's all in your notes but i've written it out again [0.4] to illustrate [1.0] and what we have here [0.6] is [0.4] a joint distribution if you look at the [0.2] if you look at the table we have two variables [0.3] one which was related to the level of demand [0.3] and one which related to [0.2] the number of advertisements and the story was [0.2] that a company was placing weekly advertisements [0.3] and it was generating demand as a result of those [0.4] and we had [0.3] an experimental probability distribution [0.2] where we've got probabilities in blue [0.2] attached to each pair of possible values for the random variables [1.8] and then we said [0.2] well given that you've got these blue probabilities which we call joint probabilities [0.4] you could also work out [0.4] marginal probabilities if you know [0.4] all the probabilities associated with X-equals-nought [0.2] you can add them up [0.3] to find the overall probability that X-equals-zero [0.3] and that was a column sum [1.4] and we've moved on to thinking [0.2] not just about things relating to marginal distributions such as expected values or variances [0.4] but on to the relationships between variables [0.4] where we needed to look at [0.2] these joint probabilities [0.9] so this was our [0.4] basic data basic example [0.2] that's driving the [0.7] subject of the lecture [2.1]
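A minimal sketch of that column-sum idea in code. The joint table on the slide is not reproduced in this transcript, so the joint probabilities below are hypothetical stand-ins, chosen only to be consistent with the figures quoted later in the lecture (X marginals of 0.19, 0.51 and 0.30, expected values of 1.11 and 1.94); the supports, X in nought to two and Y in one to three, do match the example.

```python
# Recovering marginal probabilities from a joint distribution by summing.
# The joint probabilities are illustrative stand-ins, NOT the lecture's
# actual table; only the supports and the marginal totals match it.

joint = {  # (x, y): P(X = x, Y = y) -- hypothetical values
    (0, 1): 0.10, (1, 1): 0.20, (2, 1): 0.06,
    (0, 2): 0.07, (1, 2): 0.10, (2, 2): 0.17,
    (0, 3): 0.02, (1, 3): 0.21, (2, 3): 0.07,
}

def marginal(joint, axis):
    """Column sums (axis=0 gives P(X=x)) or row sums (axis=1 gives P(Y=y))."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out

print(marginal(joint, 0))  # ~{0: 0.19, 1: 0.51, 2: 0.30}
print(marginal(joint, 1))  # ~{1: 0.36, 2: 0.34, 3: 0.30}
```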
and we finished last week [0.9] by [0.3] looking at the [1.1] formula for the covariance [0.5] between random variables [1.1] just cover that up for the moment [1.1] and [0.2] it looked fairly messy but we dealt with some motivation for it [0.7] and [0.2] remember that this is [0.2] covariance for random variables there's a distinction between what goes on for random variables which is a theoretical kind of concept [0.3] and what goes on for real data [0.9] but the story here is [0.3] we are looking at [0.6] effectively average deviations of each variable from its mean [0.4] but [0.2] particularly [0.2] how they do that together whether when X is below its mean Y tends to be or not [0.3] we're looking at the [0.2] average of these things in the sense that we're looking at how important these joint deviations are [0.4] as measured by how likely those [0.3] corresponding occurrences are [1.2] so we take a weighted average of these paired deviations [0.4] and this double sum that you see here [0.2] simply means [0.3] that you have to add across all possible pairs of values of the two random variables [0.5] so this was the story [1.5] so the double sum there [0.2] across X and across Y simply means that what you're doing is looking at all possible pairs of values of the random variables [1.2] concerned [1.3] and what we've got is a weighted average [0.2] of these paired deviations from the mean [0.7] the weighted average is obtained by multiplying each paired deviation [0.4] by the probability of getting that deviation [0.6] and all those probabilities lie between nought and one so we call it a weighted average [1.7] and that's covariance for [0.2] random variables [0.8] and of course if you want to calculate it as we're about to see [0.3] it can be rather tedious [0.2] but the main thing is to get in your minds the motivation for what we're doing we're looking at [0.3] the relationship between variables [0.3] by looking at [0.2] how likely [0.2] the two variables seem to be [0.3] both below or above their means together [0.2] or on opposite sides of their means together [0.5] the E [0.2] is the expected value that we discussed last week that's the weighted average [0.2] of possible values of the random variable [0.3] using probabilities as the weights [2.2] so that's [0.2] covariance [0.7] for random variables if you want to compare it with what goes on with real data [0.3] in the assessed exercise you've got to calculate some correlations [0.9] er for which you'll need covariances [0.3] just to [0.6] compare [0.2] if you were looking at covariance for real data [0.2] here's what you would do [0.3] and it's possible just to identify the different points in the formulae [0.5] er so that you can see where the [0.4] points of comparison occur [0.5] so [0.2] first of all we're not using random variables we're using real data [0.3] and our real data when we look at covariance occurs in pairs [1.0] pairs of values for two [0.2] variables [0.6] and [0.2] when we've got the real data formula [0.2] again we've got deviations of er the X variable from its sample mean [0.2] and the Y variable from its sample mean instead of expected value [2.2] and instead of a probability weighting [0.2] the weighted average we've got a real average [0.2] although we use N-minus-one instead of N [0.9] so we're averaging [0.4] the [0.9] cross-product of the deviations from the mean just as we're doing with the random variable case [0.6] you only see one sum here [1.5] because this is enough [0.3] to do the summing across all pairs of values because the data come in pairs a value for the X a value for the Y [0.9] but you can see it in the same way [0.2] we're summing across all possible pairs [0.2] and these are actually the weights [0.2] every time we put in something into here we give it a weight of one over N-minus-one [0.3] it's the same sort of story [0.6] but it's for real data [0.3] rather than for [0.2] random variables [5.1]
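The two formulas side by side in code, as a sketch: the random-variable version weights each paired deviation by its joint probability, the sample version averages the paired deviations with N-minus-one. The joint table is the hypothetical stand-in from above, and the data pairs are made-up observations.

```python
def cov_rv(joint):
    """Covariance of two discrete random variables:
    sum over all (x, y) of P(x, y) * (x - E[X]) * (y - E[Y])."""
    ex = sum(p * x for (x, _), p in joint.items())
    ey = sum(p * y for (_, y), p in joint.items())
    return sum(p * (x - ex) * (y - ey) for (x, y), p in joint.items())

def cov_sample(pairs):
    """Sample covariance of paired real data: sum of
    (x - xbar)(y - ybar), averaged with weight 1/(N - 1)."""
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n
    return sum((x - xbar) * (y - ybar) for x, y in pairs) / (n - 1)

joint = {(0, 1): 0.10, (1, 1): 0.20, (2, 1): 0.06,   # hypothetical stand-in
         (0, 2): 0.07, (1, 2): 0.10, (2, 2): 0.17,
         (0, 3): 0.02, (1, 3): 0.21, (2, 3): 0.07}
data = [(0, 2), (1, 1), (2, 3), (1, 2), (2, 2)]      # made-up observations

print(cov_rv(joint))     # ~0.097
print(cov_sample(data))  # 0.25
```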
so [2.6] covariance for random variables then [0.7] is a rather messy looking thing but it's doing basically the same thing [0.7] summing across all possible pairs [0.2] comparing deviations from the means of the two random variables [0.2] and weighting the lot [0.6] taking an average [0.2] so it's the same story [1.3] but as with real data [1.0] the problem with the covariance measure to capture the relationship between random variables [0.7] is that the units of the thing [0.5] a-, whether it's big or small [0.2] will essentially depend on whether the units of measurement of X and Y are big or small [0.4] so [0.2] it won't do for comparing the strength of relationships between [0.3] different pairs of random variables [0.4] because its size [0.2] will change with the size of the random variables it's not a standardized measure [0.3] and we know what we do with real data [0.5] we take the covariance [0.5] and we divide by the [0.4] standard deviations of the two variables involved [0.6] that's what we do with sample data [0.2] and it's exactly what we do [0.3] with random variables [0.3] so [0.2] we have covariance which doesn't necessarily work for us in the sense that it's [0.2] not properly scaled [0.6] and instead of covariance [0.5] we use [0.7] correlation [1.0] and the correlation story [0.2] is related to the covariance story [0.3] in just the same way as it is for [0.7] real data [0.2] that's to say [0.4] the correlation [0.5] is the covariance divided by the two [0.5] you can see a mistake in the slide [0.5] divided by the two [0.3] standard deviations [0.2] that should be the standard deviation of [0.9] Y [5.6] and the standard deviation definition of course is the standard deviation definition that we use for random variables [0.3] it's the [0.3] weighted average of the deviation of the random variable from its mean squared [0.2] where the weights are the probabilities for that random variable alone they're the marginal probabilities [1.0] if you don't take the square root it's the variance to get the standard deviation you must take the square root [0.5] but the relationship is the same as before correlation [0.4] is covariance [0.3] divided by the two standard deviations [0.9] and that gives us a measure [0.6] that must lie between minus-one and plus-one and just as with the product moment correlation coefficient [0.6] minus-one or plus-one indicates [0.5] a perfect linear relationship between the random variables [0.4] something in between is weaker [0.3] the closer it is to zero the weaker is the relationship [1.4]
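A quick sketch of why the rescaling matters. Measuring X in different units, here a hypothetical change that multiplies it by a hundred, inflates the covariance by the same factor but leaves the correlation untouched, which is exactly why the correlation can be compared across problems.

```python
def cov_corr(joint):
    """Return (covariance, correlation) for a discrete joint distribution."""
    ex = sum(p * x for (x, _), p in joint.items())
    ey = sum(p * y for (_, y), p in joint.items())
    cov = sum(p * (x - ex) * (y - ey) for (x, y), p in joint.items())
    sd_x = sum(p * (x - ex) ** 2 for (x, _), p in joint.items()) ** 0.5
    sd_y = sum(p * (y - ey) ** 2 for (_, y), p in joint.items()) ** 0.5
    return cov, cov / (sd_x * sd_y)

joint = {(0, 1): 0.10, (1, 1): 0.20, (2, 1): 0.06,   # the hypothetical
         (0, 2): 0.07, (1, 2): 0.10, (2, 2): 0.17,   # stand-in table again
         (0, 3): 0.02, (1, 3): 0.21, (2, 3): 0.07}

rescaled = {(100 * x, y): p for (x, y), p in joint.items()}  # new units for X
print(cov_corr(joint))     # covariance ~0.097, correlation ~0.17
print(cov_corr(rescaled))  # covariance 100 times bigger, same correlation
```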
so what we want to do in er in this lecture [0.3] is just run through some calculations [0.5] of covariance and hence of correlation [0.6] and finish off with some [0.4] more general stories about [0.5] er expected values and variances [1.1] so [0.4] we're now moving into section four-point-five of the notes [0.2] if you look in section four-point-five and thereabouts you'll see the details of the calculations that i'm going to go through with you now [3.3] so it's a [0.2] sensible place to be looking [17.7] so [0.4] first of all the [0.9] numbers that we're actually going to be feeding into this calculation [0.3] are the numbers from this joint distribution for demand and advertising [0.6] so [0.7] all the information that we can possibly have about these two random variables is encapsulated in the joint distribution [0.3] we're then going to summarize this joint distribution by looking at the covariance and the correlation between these random variables [0.7] in order to do that [0.3] we are going to have to look at the indi-, the properties individually of X and Y [0.2] 'cause we're going to want to know what their expected values are [0.4] and we're going to need to know what the variances are so we can get the standard deviations [0.4] so we are going to need [0.3] the so-called marginal distributions the probabilities associated with specific values of each variable [0.4] alone [0.8] and that's what you've got [0.3] in the column and the row sums which are [0.2] illustrated in the notes [0.3] so we're going to need the joint probabilities to work out the covariance and hence the correlation but on the way [0.3] we're going to need [0.5] the m-, so-called marginal probabilities the probabilities associated with [0.4] values of each random variable taken [0.4] individually [0.6] 'cause we're going to look at deviations from the mean of each random variable [0.9] so this is the basic information that we're going to use [13.5] so to get at the covariance here's the covariance formula [0.2] we're going to need [0.3] the expected value of X [0.2] and the expected value of Y [0.5] and then to get at the correlation we're also going to need [0.4] the standard deviation of X and the standard deviation of Y [0.4] so [0.2] to get those individual quantities the expected value and the standard deviation [0.3] we need the individual distributions [1.1] the margin-, so-called marginal distributions [2.5] so here's the s-, here's the story of the calculations made out in [0.3] tabular form [10.1] so this is all to do with our [0.4] advertising demand example [0.5] so here's the story [0.4] on the marginal distribution of X just looking at X alone there are two things we want to know about X alone [0.4] we want to know what its expected value is [0.3] and we want to know what its variance is [0.2] so we can take its square root to get its standard deviation to use in the correlation [0.9] so [0.3] here we've got [0.2] the possible values of the random variable capital-X which we denote by little-X [0.4] and we get nought one and two [0.9] from the joint distribution we were able to calculate the marginal probabilities associated with each of those values of X [0.3] and there they are [1.6] to calculate the expected value [0.2] it is the weighted average of the [0.2] possible values of the random variable where the weights are the probabilities that is to say [0.5] it's each X times its probability [0.5] multiplied together [0.4] and then [0.2] those individual products summed [0.3]
that will be the expected value [0.6] so the expected value of X here [0.3] is one-point-one-one [3.6] and that's fine that gives us one thing that we need [0.4] about the marginal distribution about the individual X distribution [0.9] the other thing we need [0.4] is the variance [0.4] and the contributions to the variance [0.5] are [1.5] the deviations of the random variable from its expected value how far away does it [0.4] get from its expected value how spread out is the distribution [0.5] and in the end what is the average of that sort of spread [0.4] so what we look at [0.3] is the difference between each individual value [0.5] and the measure of central tendency the expectation [0.8] we square that [0.5] and we take the weighted average [0.3] of these numbers [0.4] where the weights again are the probabilities [0.3] and [0.2] this [0.5] squared deviation [0.4] is much more likely because it has a probability of point-five-one [0.4] than this squared deviation which only has a probability of point-one-nine [1.9] so again [0.4] we've got to [0.2] multiply [0.6] each element in this column by its corresponding element in this column [0.3] and add the things together [1.9] which is what we see [0.4] there [0.4] in the final column except you don't see it [2.7] okay so [0.3] in here [0.2] point-two-three-four-one [0.4] is [0.2] this number [0.3] multiplied by this one [0.5] and then the variance is just the sum of those things [2.9] so as long as there aren't too many numbers it's er [0.3] tedious but not to the point of exhaustion [3.1] and that's the variance that's the weighted average in terms of the random variable the way that the random variable deviates around its mean [0.3] how spread out the distribution of the random variable is [1.7] just remember [0.3] that when we finally get to feeding this into the formula for correlation [0.5] we'll want the square root of this quantity [1.0] so [0.2] although the variance is about point-four-eight [0.3] the standard deviation is going to be bigger than that the square root of a number that's less than one [0.5] is bigger than the original number [0.6] so it's about point-seven [4.3] and you have to repeat this sequence of calculations [0.8] for the Y variable [0.9] so you want the same sorts of columns [0.3] for the Y variable [0.8] and [0.4] to complete the calculations [0.2] so [0.3] you need to know the values that the random variable can take the probabilities that it takes each value [0.7] that eventually gives you the expected value [0.3] hence you can calculate the square of the deviation of the values [0.3] from the expected value [1.4] weight those by the probabilities again [0.4] add them up [0.3] that gives you the variance [0.3] the square root of which [0.4] is the standard deviation [0.8] so [0.3] in black there [0.2] you have the key quantities that you need to obtain [0.5] from the [0.3] marginal distributions they themselves of course are coming from the joint distribution everything is in the joint distribution [0.3] the joint distribution is everything what we're doing is calculating some [0.2] summary descriptors [0.4] of that distribution [0.3] both individually [0.5] expected [0.4] value and variance and eventually [0.3] in terms of the relationship between the variables [2.6] so those are the numbers that we need [1.6] these calculations are laid out [0.5] but [0.2] we've just been through one to make it clear [0.8]
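The marginal calculation just described, as a short sketch. The X marginals here are reconstructed from the figures quoted in the lecture: P of X-equals-nought is point-one-nine, P of X-equals-one is point-five-one, and the remainder point-three-nought, which reproduces the stated expected value of one-point-one-one, the point-two-three-four-one contribution, the variance of about point-four-eight and the standard deviation of about point-seven.

```python
p_x = {0: 0.19, 1: 0.51, 2: 0.30}  # marginal distribution of X

e_x = sum(x * p for x, p in p_x.items())                 # weighted average of values
var_x = sum(p * (x - e_x) ** 2 for x, p in p_x.items())  # weighted squared deviations
sd_x = var_x ** 0.5                                      # square root for the sd

print(e_x)    # 1.11
print(var_x)  # ~0.478, with 0.19 * (0 - 1.11)**2 = 0.2341 as one contribution
print(sd_x)   # ~0.69, "about point-seven"
```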
it's going [3.3] so [0.2] we now need to turn back to the joint distribution [0.6] to think about [0.6] these [0.3] the way that the variables [1.1] change together [0.2] rather than individually [11.3] okay [1.2] this table is also in your notes it comes out rather [0.2] small here [0.6] but [0.3] what we need to do [0.4] is to start thinking about [0.3] the relationship between [0.5] the random variables so we've got [0.3] to [0.2] to move on now to the joint distribution [0.9] so [0.4] the first thing [0.2] to think about is [0.2] how to organize the pairs of values that can occur [0.6] well if you just look at the first and the third column here you can see how i've organized it i've tried to organize it systematically [0.5] i've said [0.3] fix the Y value at one [0.5] work through the possible X values [0.2] fix the Y value at two [0.3] work through the possible X values [0.2] fix the Y value at three [0.3] work through the possible X values just do it systematically you could lay it out differently [0.4] but we have got to cover [0.5] all possible pairs [0.5] of values [0.7] so [0.8] every row in this table corresponds to a pair of values of X and Y [0.3] so when we're trying to sum across all possible pairs what we're doing again [0.3] is summing the column [3.1] so in the middle [0.2] column there column two [0.3] i've just put [0.2] the deviation of X from its mean why do i need that [0.2] because that term feeds in [0.3] to my covariance formula [0.7] not on its own [0.4] but it's still in there [0.6] so [0.2] here i've got all my deviations i'd already calculated these numbers for the purposes of calculating the variance of X [0.2] so although i've written them in again [0.6] you would have already calculated them [0.9] and of course [0.2] it repeats so [0.3] at X-equals-nought the deviation is always minus-one-point-one-one [0.5] at X-equals-two the deviation is always nought-point-eight-nine [0.8] so it reappears [2.0] and the same will be true [0.7] of the way [0.2] Y deviates from its mean [0.2] because of the way i've got it organized [0.2] i've got three [0.2] Y-equals-one values so i've got three deviations the same [0.7] three Y-equals-two values together [0.3] so i'll get three values the same there [0.5] but again i would have already calculated these deviations to get at the variance [0.3] i'm going to use them again to get at the covariance [0.4] but to get at the covariance [0.5] i'm going to [0.2] multiply these [0.2] elements from these two columns together [0.2]
i'm interested in the [0.4] relationship between the deviation of X from its mean [0.4] and Y from its mean [0.5] that was what the formula said [2.4] so [0.3] i need [0.2] a fifth column which would be a brand new column [0.4] which is the product [0.2] row by row [0.3] of what's in the second column and what's in the fourth column [0.4] so my one-point-o-four here [0.3] is minus-one-point-one-one [0.6] multiplied by minus-nought-point-nine-four [1.0] both numbers [0.4] tend to be [0.2] er both numbers there are below their mean [1.0] next time [0.2] we're multiplying [0.3] minus-point-one-one [0.2] by minus-point-nine-four again [0.3] both numbers tend to be below their mean that's evidence for positive covariance [1.0] they vary together [0.2] however the next number [0.4] point-eight-nine [0.4] times minus-point-nine-four [0.4] X is above its mean but Y is below [0.5] [sniff] [0.9] how important is that [0.5] well it depends on the probability of getting that pair of values [0.4] so we've got two [0.3] pairs of values that are both below their mean [0.3] and they both have positive probabilities [0.3] one where one's above one's below [0.2] and it has a positive probability but it's relatively small [0.6] so on balance when we look at that [0.3] it would appear that we've got [0.4] er evidence for a positive [1.3] covariance [0.2] and then we work our way down [0.4] that's below that's above multiply them together [0.3] get a positive number [0.4] again [0.6] that's actually sorry we get a negative [0.3] number here i've moved this across to the probability column now [0.2] we're getting one below [0.2] one above they're opposite [0.6] opposite locations with respect to the mean [0.3] and so on through [1.2] so that's this column here [0.2] this column here [0.2] is the deviation of X from its mean [0.3] times the deviation of Y from its mean for each pair [0.3] of values [2.4] we want to find out what the average [0.6] de-, [0.2] average combination of these [0.3] deviations is when multiplied together [0.2] so we're going to multiply each product by its probability [0.2] of occurring [0.6] that's the joint probability of getting the underlying values of X and Y [0.6] that will then give us [0.4] eventually when we add all those things together it'll give us a weighted average [0.3] so [0.3] what we want to do [0.3] is multiply [0.4] this column [0.2] which itself is the product of these two [1.0] by the probability [1.2] and that's the final column [0.5] that we've got in that table [1.1] so the final column there [0.4] is [1.2]
the element in column two [0.3] times the element in column four [0.4] times the element in column six [2.6] but that's for each pair so we've got one row [0.3] corresponding to one pair of values of the random variables [0.3] we want to sum across all pairs [0.3] we expressed that as a double sum but all you have to think of is summing across all possible combinations [0.3] so what we have to do [0.2] is sum up this column of num-, numbers here [0.8] and find out what we get [0.3] and that will be the covariance [0.4] between these two [0.3] random variables [1.3] now [0.5] that's [0.7] just a matter of hitting the buttons on a calculator then it's about nought-point-one [2.1] we can't read anything at all [0.5] into the numerical value [0.6] of that number [0.5] because the number that you get out of that [0.2] is clearly going to depend [0.3] on the sizes of the numbers that went in in the first place and if those sizes are large [0.4] then that number is going to tend to be large [2.2] what we can do however is take something from the sign the sign is positive so [0.3] whatever kind of relationship there is here [0.2] or linear relationship of course specifically we are talking about [0.5] is positive whatever there is [0.6] it would represent [0.3] a situation where [0.5] if X is above its mean [0.2] Y would tend to be above its mean [0.3] if X is below its mean [0.2] Y would tend to be [0.2] below its mean [0.3] whether they're below or above or not [0.3] they're paired [0.3] in that sense [3.1] so [0.3] what we've done then [0.2] is to compute the formula at the bottom of the screen [2.1] we've taken [0.3] all the possible pairs of X-minus-E-of-X Y-minus-E-of-Y multiplied them together multiplied them by their probabilities [0.2] and added the lot together [0.3] and that's the covariance [2.7] however [0.2] we want the correlation we want this scaled version that's going to tell us [0.2] is that [0.4] er [0.5] a very strong relationship or isn't it we've got to remove the [0.4] size effect [0.4] from the [1.3] er measure [0.2] and we do that [0.5] by dividing by the standard deviations [13.5] we've worked out what the standard deviations are [2.0] on the last slide but one [1.0] so [0.8] all we have to do [0.2] is to substitute them into the formula all the bits of the formula are now calculated all we've got to do is substitute them [0.4] so the correlation is the number we worked out for the covariance [0.5] divided by both the standard deviation of X [0.4] and the standard deviation of Y [0.6] and that will give us a standardized measure that must lie between minus-one and plus-one [0.5] the closer it is to zero the weaker is the relationship [0.4] the closer it is to minus-one the closer it is to a perfect downward sloping relationship [0.3] the closer it is to plus-one [0.5] the closer it is to a perfect upward sloping [0.5] relationship [1.0] so [0.2] there are the numbers that we've calculated on the previous slides [0.4] they simply have to be substituted [0.3] into the formula [0.9] you can multiply these two together first [0.4] before dividing them into that one if you prefer or you can do it sequentially [0.5] divide the covariance by the standard deviation of X [0.4] and then divide that number by the standard deviation of Y the same result [0.2] will [0.7] occur [1.4] and [0.3] the bottom line [0.6] for the correlation between these two random variables [0.8] is [0.4] that [0.7] we get a number that's about nought-point-two [1.7] that's rounded to four decimal places [0.3] the number that you see up there [0.5] so the correlation is positive but it's weak [1.1]
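The whole table calculation chained together as a sketch, again on the hypothetical stand-in table rather than the lecture's own slide; it was built to give a covariance of about nought-point-one, and it yields a correlation of about point-one-seven, close to the lecture's point-two and weak in the same sense.

```python
joint = {(0, 1): 0.10, (1, 1): 0.20, (2, 1): 0.06,   # hypothetical stand-in
         (0, 2): 0.07, (1, 2): 0.10, (2, 2): 0.17,
         (0, 3): 0.02, (1, 3): 0.21, (2, 3): 0.07}

e_x = sum(p * x for (x, _), p in joint.items())   # 1.11
e_y = sum(p * y for (_, y), p in joint.items())   # 1.94

# one term per row of the table: deviation times deviation times probability
cov = sum(p * (x - e_x) * (y - e_y) for (x, y), p in joint.items())

sd_x = sum(p * (x - e_x) ** 2 for (x, _), p in joint.items()) ** 0.5
sd_y = sum(p * (y - e_y) ** 2 for (_, y), p in joint.items()) ** 0.5

corr = cov / (sd_x * sd_y)  # divide by both sds to remove the size effect
print(cov)   # ~0.097, "about nought-point-one"
print(corr)  # ~0.17, positive but weak
```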
er there doesn't seem to be a strong [0.3] relationship between [0.6] er the [0.2] advertising [0.6] and the demand levels [0.7] over the period that this empirical probability distribution has been constructed [3.2] we shouldn't perhaps be [0.2] too surprised about that [0.6] but for some distributions this number will be [0.4] a lot bigger [0.9] but we've got [0.7] now a number of summary statistics for our joint distribution [0.8] we've got [0.7] in particular we've got individually for the Ys and the Xs [0.6] their expected value which is a measure of the [0.2] er central tendency or the general location of the distribution of each individually [0.6] we've got the variance which is a measure of their spread [0.3] about that measure of central tendency how spread out are they about that [0.2] er [1.6] overall measure of location of the distribution [0.7] and then we've got two measures [0.4] of the relationship between the variables we've got the [0.2] covariance [0.6] and then we've got the correlation [2.6] so the numbers are er somewhat tedious but what i'd really like you to take away from this is [0.3] er [0.3] you do have for the purposes of examinations to be able to perform such calculations [0.4] but in the longer term [0.3] what you need to develop is some intuition about what's going on with these definitions what is really being delivered to you when you make these calculations [1.0] er because in practice if you go on to do any more statistics of course [0.4] you won't be handling the individual calculations [0.2] you'll leave that to a machine [0.3] but if you don't know what the formula is doing [0.4] you really can't interpret the answers very reliably [2.7] so [0.3] that's joint distributions then [0.2] and summary statistics for distributions of random variables [0.2] and each one of those [0.2] has an analogue with real data [0.4] we start off with real data get an understanding of what we're doing with real data [0.2] and then to develop more statistical techniques we develop tools [0.3] random variables and probability distributions [0.3] which allow us to [0.3] become more sophisticated in our analysis of real data there's a feedback effect [2.0] the last er few things that i want to talk about in this section of the lectures [0.6] er [0.2] hark back [0.4] to [0.2] looking at individual variables looking at individual random variables [3.4] and i want to talk then again about expected value and variance [0.6] but i want to [0.4] point out to you a very important feature of these [0.2] er [0.3] calculations [0.8] very often [0.5] you'll have some information about some variable [0.8] that's not itself directly of interest [0.7] the thing that you want to think about [0.5] is related to the [0.6] er random variable about which you have information [0.3] but is not the same as [0.3] that [0.6] random variable [2.2] so you might have a probability distribution about s-, [0.2] in this case it's going to be [0.4] about sales in this illustrative example i'm going to introduce [0.2] but you may not want to know about sales you may not want to know what the expected value of sales is you may not want to know what the variance of sales is [0.8] what you may want to know [0.4] is [0.5] what happens to profits what are the properties of profits not what are the properties of sales [1.0]
but what you know about is sales how can you move from one to the other [1.5] so here's a story [0.6] that is [0.5] very useful and this story also applies although we're going to introduce it for random variables and talk about expected values and variances [0.9] exactly the same rules apply for real data [0.3] in other words to averages [0.3] and to sample variances [1.2] so supposing we've got [0.7] we've got a random variable we've got information on sales [0.4] but what we want to know [0.5] is profits [0.4] we want to know about [0.3] profits [1.8] so [0.2] here's the relationship [0.5] that has been found to exist [0.4] between [0.3] profits [0.3] capital-P [0.3] and sales capital-X i'm using capital letters because i want you at this stage to think about these things as [0.3] random variables in general [0.5] they will have probability distributions [0.5] we will be able to get [0.8] realizations or particular values in practice [0.2] but when we're thinking about them as random variables we want to think about them in the abstract [0.5] so this says [0.3] that whatever value of X you happen to have from your distribution [0.2] the corresponding value of P can be calculated [0.3] and so it can [0.2] be stated as a general rule as the relationship between [0.5] random variables [0.5] the units here [0.5] are [0.2] that er everything is measured in thousands of pounds [0.7] er [0.2] and in some cases thousands of pounds [0.2] per day [0.9] and you can interpret the [0.2] terms that you see here it's a linear equation it's simple [0.2] and the rules that i'm going to deal with [0.3] are specific to linear equations linear equations are ones where [0.3] one variable [0.3] is a constant multiplied by the other [0.2] with another constant added or subtracted you've seen that before [0.3] with what namex did for elasticity and so on [1.1] and you can interpret these coefficients as i have done there [0.3] the three [0.3] is giving you some measure [0.2] of the profit per car [0.4] in thousands of pounds [0.5] the minus-two [0.4] is [0.4] some fixed cost per day [0.4] in thousands of pounds so what you've got here [0.4] are [0.3] the amount of [0.6] profits that you're going to get [0.2] per car [0.3] minus the amount of money that you're going to lose anyway as a result of say keeping your showroom going or something of that sort [2.9] so this relationship is in terms of random variables so R-Vs here means i'm using capitals i'm talking about random variables in general [0.4] but i can of course use the rule [0.5] for specific values of the random variables so er [0.2]
if we know [0.5] or somebody asks us to consider [0.3] what happens when the random variable X takes the value specifically one [1.6] then we can work out [0.2] what the corresponding profits would be [0.8] by simply feeding that one [0.3] into the formula [0.4] and in this case we'd get profits of one unit [0.8] one thousand pounds per day [4.3] we've got [0.8] either because we [0.2] er developed a statistical model or we've got an experimental probability distribution [0.4] we've got information [0.3] about [0.2] X [2.7] but what we want to know [0.7] about is not X [0.3] it's profits this is what matters or at least for some reason this is what we're asked to investigate [0.7] so how do we get a story about profits from the story about sales and do that in terms of probability distributions and their summary statistics [2.0] well when you look at the [0.5] equation [1.1] you can see [0.4] that [1.2] any value of X [0.2] generates a particular value of P [0.8] so [0.4] if we change the value of X we'll necessarily change the value of P [0.6] and if we never come back to the same value of X we'll never come back to the same value of P [0.6] a particular value of er sales X [0.2] is associated uniquely with a particular value of profits [0.4] P [0.7] so [0.2] if somebody tells us what the probability of getting some particular value of sales is [1.1] all we have to do [0.6] is to describe the probability of getting the corresponding value of profits by saying it's the same [0.7] so [0.2] the probability of getting [0.3] profits of unit one [0.3] is the same [0.2] as the probability of getting sales of unit one the probabilities are going to be the same [0.7] it's called a one-to-one relationship you've got no overlapping at all [3.2] so what this tells us [0.3] is [0.2] if we know what the probability distribution of the [0.2] Xs is [0.7] we know what the probability distribution of the Ps is [0.8] we can just read off what the probability of the corresponding X value is and we'll have a set of pairs of values of profits [0.4] with their probabilities [0.4] so we'll know what the [0.4] probability distribution of profits is [0.3] from which [0.4] we can calculate [0.3] any of the summary statistics for the distribution [0.7] er that we want [0.2] expected value or variance using the formulae we've got [1.3]
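A sketch of the one-to-one relabelling for the sales and profits example. The probabilities for X equals nought, one, two and five are the ones quoted in a moment (point-one-eight, point-three-nine, point-two-four, point-o-one); the two middle ones, point-one-four and point-o-four, are not stated in the lecture and are inferred here so that the expected value comes out at one-point-five and the variance at one-point-two-five, the figures given later, so treat them as a reconstruction rather than the slide itself.

```python
p_sales = {0: 0.18, 1: 0.39, 2: 0.24,  # probabilities quoted in the lecture
           3: 0.14, 4: 0.04,           # inferred values, see note above
           5: 0.01}

def profit(x):
    return 3 * x - 2  # P = 3X - 2, in thousands of pounds per day

# the relationship is one-to-one, so each profit value simply inherits
# the probability of the sales value that generates it
p_profit = {profit(x): p for x, p in p_sales.items()}
print(p_profit)  # {-2: 0.18, 1: 0.39, 4: 0.24, 7: 0.14, 10: 0.04, 13: 0.01}
```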
so we can regenerate a new probability distribution [0.4] and we can calculate expected values and variances [0.6] if we want to know the probability distribution specifically [0.4] we're going to have to go through this process [0.5] but it turns out that if you want to know just these summary statistics [0.7] you don't have to go through the irksome process [0.3] of calculating all the probabilities [0.2] and then running through the formulae there's a sh-, a very important [0.3] short cut [10.9] so just to be clear [0.2] supposing [0.3] you wanted to work out [1.0] what the [1.0] er [0.5] expected value of profits was if you do it longhand what have you got to do [0.8] well for each possible value of X [1.1] you have to work out the corresponding value of profits [1.1] and you have to know what the probability of getting that value of X is [0.2] it'll be the same as the probability of getting that value of profits [0.7] but you have to do this calculation [0.4] for every single [0.9] value of X you have to work out a new value of P [1.0] to get the probability and then you're going to have to multiply [0.3] the value of P by its probability [0.4] for each possible value [0.6] and add them all up [0.3] to get the expected value [0.5] er there isn't an answer here i haven't written it out that's a tedious calculation [0.6] and if there were very many possible values for the random variable [1.3] not just six maybe sixty [0.2] it would be a real pain [1.7] but this is how you'd work out the expected value of P if you had to [0.5] it's minus-two times point-one-eight plus one times point-three-nine plus [0.2] four times point-two-four [0.4] all the way to the end [0.3] thirteen times point-o-one [2.6] and er [0.3] the variance would be even worse you'd have to work out all the s-, deviations square them multiply by the probabilities add them [0.4] a real [0.3] pain [2.1]
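The longhand route spelled out, using the reconstructed distribution from the previous sketch; the first few products are exactly the minus-two times point-one-eight plus one times point-three-nine that the lecture reads out.

```python
p_profit = {-2: 0.18, 1: 0.39, 4: 0.24, 7: 0.14, 10: 0.04, 13: 0.01}

# expected value longhand: every profit value times its probability, summed
e_profit = sum(v * p for v, p in p_profit.items())
print(e_profit)  # 2.5

# the variance longhand is worse still: squared deviations, re-weighted
var_profit = sum(p * (v - e_profit) ** 2 for v, p in p_profit.items())
print(var_profit)  # 11.25
```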
so you don't do that [0.4] when the relationship between the random variables [0.4] is linear [0.4] when the relationship between the random variables is linear [0.5] you've got a very simple [0.4] alternative [0.2] route to follow [3.1] so [0.8] we just have [0.2] a general statement and we say here's a linear relationship between a random variable X and a random variable Y [0.5] we multiply by B [0.2] add A [1.1] so we've got all the information we need about X we've perhaps got its probability distribution [0.5] and we want to find out what the expected value of Y is how do we go about it [1.4] well all we have to know [0.5] is that [0.3] the relationship between the expected values [0.3] is exactly the same as the relationship between the random variables themselves [0.5] so [0.3] if in order to get any value of Y you must multiply X by B and add A [0.4] it's true that if you want the expected value [0.3] of Y [0.2] you simply multiply the expected value of X by B [0.3] and add A [4.8] so you certainly don't go through all the rigmarole of working out all the [0.3] probabilities and expected values and [0.2] so on [0.3] all you need to know [0.4] is what the expected value of X is so if somebody's told you what the expected value of X is or you've got that information from somewhere [0.5] it's very easy to work out the expected value of Y [1.6] the same is true of the variance [0.3] but don't forget with the variance you're always looking at [0.8] deviations from the mean [1.4] and you're squaring them [1.5] so [0.2] when we look at the mean of Y or the [0.2] expected value of Y [0.4] no matter what the actual [0.3] specific value of Y is [0.2] the expected value is going to [0.2] involve the term A [0.8] so when we look at the difference between the average and the actual both the average and the actual [0.3] will include a term in A [0.2] here's the A coming in from the average [0.3] here's the A coming in from the actual if you like [0.6] so when we take the difference [1.0] the A is going to disappear [0.8] so when we look at deviations of Y about its mean [0.5] clearly the A bit's not going to play a role [0.5] what is going to play a role is the B bit [1.0] but that's going to get squared up because with variance we look at the square of the deviation from the mean [0.4] so when we look at say this minus this [0.2] the A will disappear but we'll have B in there [0.9] when we look at variance we square that [0.3] so we're going to end up with B-squared [0.6] and in fact the relationship between the variances [0.8] is this [0.6] the variance of Y [0.4] is [0.7] B-squared times the variance of X [0.6] the A is irrelevant it disappears when taking the difference [0.3] the B sticks around [0.6] but it has to be squared up because variance is a squaring operation [1.7] and these relationships would hold as well for sample data [0.5] if you got some data and you knew what the average value of X was [0.3] and you knew that X was related to Y in this way [0.4] then the average value of Y could be worked out [0.3] using this formula but plugging in the average value of X [0.3] similarly [0.3] if you knew what the sample variance of X was [0.4] then the sample variance of Y [0.3] can be worked out like this [0.3] it's true both of the [0.7] random variable case which we sometimes refer to as the population case [0.4] and [0.5] real data [1.4] it's also the case [0.3] that these relationships hold not just for discrete random variables we've been pushing the discrete random variable story for reasons of simplicity [0.4] these relationships [0.3] also hold for continuous random variables although we haven't defined exactly what we mean by expected value and variance [0.4] you should be developing some conceptual ideas of what we mean [0.4] and we are able to define these things for continuous random variables [0.3] in which case [0.3] these relationships continue to hold it's a very general result [0.5] when the basic relationship between the random variables is linear [3.3]
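A minimal check of the two linear rules on the sales and profits figures that follow (expected value of sales one-point-five, variance one-point-two-five): the shortcut reproduces the longhand answers from the previous sketch without touching a single probability.

```python
a, b = -2, 3            # P = a + b * X, the lecture's linear rule
e_x, var_x = 1.5, 1.25  # expected value and variance of sales

e_p = a + b * e_x       # E[a + bX] = a + b * E[X]   ->  2.5
var_p = b**2 * var_x    # Var(a + bX) = b^2 * Var(X) -> 11.25, the a drops out
print(e_p, var_p)       # matches the longhand 2.5 and 11.25 above
```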
so [0.3] here is [0.3] er a very simple [0.3] er illustration then [1.0] we had that profits was [0.2] three times sales minus two [5.2] er once you know that the expected value of sales is one-point-five you can easily calculate the expected value of profits [0.6] by feeding it into the formula [0.8] you don't have to go through all the rigmarole of working out a probability distribution [0.4] and er going through every step in the calculation of expectation [0.9] similarly [0.5] with variance [0.5] the A bit is our minus-two here [1.2] it's the constant that you add on in this case it's a subtraction so it's minus-two [0.2] but it's irrelevant [2.1] and the only thing that's important is the multiplicative factor of three [0.8] and that [0.4] of course has to be squared [0.3] so we get the variance of P is three-squared times the variance of X [0.4] the variance of X is one-point-two-five [0.4] and so we get the variance of P [0.8] which is larger of course 'cause the multiplicative [0.3] term is larger than one [1.6] we'd have got variance reduction [0.4] if this number instead of three was a number less than one [0.8] it just depends on what's er being fed in [2.1] so [0.3] when there's a linear relationship between your random variables don't recalculate [0.6] the expected value and variance use these formulae [16.6] we make far less play of the following results but nonetheless they're interesting [0.2] they're important to know about [0.2] at least to be aware of [0.4] there are all kinds of other relationships like this that exist [0.6] if we've got to [0.6] add [0.4] random variables together [0.2] possibly multiplied by their own constants [0.4] there are all kinds of simple rules for working out the new expected values [0.3] and the new variances from the old expected values and variances [0.9] so a very simple case is [0.4] if you want the expected value of the sum of two random variables [0.3] it's just [0.2] the sum of the expected values [0.6] very straightforward indeed [2.1] variance is more interesting [0.2] when you come to think about sums [0.4] because if you're dealing with a sum [0.6] you don't just have to consider what's happening [0.4] to each variable [0.5] individually [0.2] you have to think about what's happening to them [0.3] together [0.5] so [0.3] if you're faced with a position of needing to work out the variance of a sum [1.4] you have to take account of the covariance the way that they [0.3] vary together [1.0] so [2.3] you'll notice we've got coefficients being squared up here [0.5] but [0.2] let's look at the really simple case to make the point if you wanted the variance of a sum [0.4] the variance of a sum would be the sum of the variances that's fine [1.2] but [0.2] you also have to take account of the covariance [0.2] the extent to which they vary together [0.8] and if the covariance is negative [0.4] you can see that that will [0.3] reduce the variance compared with simply the sum of the individual variances the covariance is helping to [0.6] have a reducing effect [0.4] on the aggregate variance in there [2.1] if the covariance is negative [1.0] notice also then [0.3] that [1.0] in the special case where there is no covariance where there's no relationship at all between the random variables [0.3] then indeed [1.2] if we want the variance of the sum it's the sum of the variances [0.8] that should remind you a little bit of the story about independence and probabilities if you want the probability of a joint event [0.6] we said [0.3] sometimes you can just multiply the two probabilities together [0.7] but only if the two [0.5] events are independent [0.4] so the story here is [0.4] if you want the variance of the sum [0.3] you can take the sum of the variances [0.5] but only if [0.5] there's no relationship between the variables the covariance is zero [2.7] so you then might ask well [1.7] is it true that er [1.1] zero covariance means that the [0.2] random variables are [0.4] independent can i say that if there's no covariance between the data they're really independent remember we defined independence quite precisely earlier on [1.3] well [0.6] unfortunately that's not the case [1.3] the definition of independence is very precise and it only involves probabilities [0.4] when we looked at covariance [0.3] we weren't just interested in [0.3] probabilities we were also interested in the values that the random variables can take [2.2] and it turns out [0.2] that if you start off with random variables that really are independent [0.5] then it's certainly true [0.7] that there won't be any covariance between them [0.3] or to be more careful the covariance will be zero [0.2] between them [1.0] that's okay [0.5] but [0.3] it doesn't go the other way [0.2] that is to say [0.7] just because [0.6] when you feed everything into the covariance formula everything cancels out to give you zero covariance [0.4] it doesn't mean strictly speaking [0.4] that the random variables are independent and the reason is [0.3] that [0.2] what's going into the covariance formula [0.2] is not just the joint probabilities [0.2] it's the values of the random variables too [0.6] when we look at independence [0.3] the only issues [0.4] are the probabilities the marginal and the joint probabilities [2.9]
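Both points in one sketch, on a standard textbook example that is not from the lecture: X uniform on minus-one, nought, one, and Y equal to X-squared. The covariance comes out at exactly zero, so the variance of the sum really is the sum of the variances here, and yet the variables are as dependent as can be, since Y is a function of X: the joint probability at X-equals-nought, Y-equals-nought is a third, while the product of the marginals is only a ninth.

```python
# joint distribution of X uniform on {-1, 0, 1} and Y = X**2
joint = {(-1, 1): 1/3, (0, 0): 1/3, (1, 1): 1/3}

e_x = sum(p * x for (x, _), p in joint.items())  # 0
e_y = sum(p * y for (_, y), p in joint.items())  # 2/3
cov = sum(p * (x - e_x) * (y - e_y) for (x, y), p in joint.items())
print(cov)  # 0.0 -- the covariance is exactly zero

var_x = sum(p * (x - e_x) ** 2 for (x, _), p in joint.items())  # 2/3
var_y = sum(p * (y - e_y) ** 2 for (_, y), p in joint.items())  # 2/9
e_s = sum(p * (x + y) for (x, y), p in joint.items())
var_s = sum(p * (x + y - e_s) ** 2 for (x, y), p in joint.items())
print(var_s, var_x + var_y + 2 * cov)  # both ~8/9: the sum rule checks out

# but independence fails: P(X=0 and Y=0) = 1/3, whereas
# P(X=0) * P(Y=0) = (1/3) * (1/3) = 1/9
```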
okay [0.2] that's it for the lecture can i remind you please er there's the er assessed exercise which is due in this time [0.2] next week [0.3] if you've got questions about it come and speak to me or your class tutor [0.8] er [0.6] and the problem sets for discussion this week [0.3] are the ones on expected value variance and covariance [0.4] okay thanks very much that's it