nm0929: right shall we make a start then please [0.6] am i switched on yeah [0.3] good [1.2] er [0.3] just a reminder as always [0.5] er i have my office hours people are beginning to come and see me which is good i get the feeling that the assessed exercise is focusing minds which is great [0.4] er [0.2] notice when these hours are i sh-, i'm [0.2] i'm available then [0.5] with a caveat from yesterday unfortunately Murphy's law [0.2] but these are times when you can come anyway [0.3] you don't need to ask me first can i come and see you in your office hours just turn up [0.9] and the same is true of other lecturers and class tutors [0.4] er as well [0.2] so do use the [0.2] office hours if you need to [0.6] and you will of course find as you start revising for your examinations that you need to [0.4] so get used to it earlier [0.3] rather than later [0.8] where are we in the things that we have to discuss [0.4] well we've got three more lect-, [0.2] three more weeks of lectures [0.6] er which is weeks seven [0.2] eight and nine we've got two lectures each week in those weeks there are no lectures after week nine [0.6] in this course [0.7] and we're going to be finishing off with probability distributions today [0.3] and we're going to move on to the applications [0.5] er in in the five remaining [0.3] lectures [1.2] so this is where we are at the moment [7.8] right when we finished er last week [0.3] we were talking about the topic of [0.2] er covariance joint distributions for [0.2] random variables and talking about working out the relationship between variables [0.7] and this was the example that i was using it's all in your notes but i've written it out again [0.4] to illustrate [1.0] and what we have here [0.6] is [0.4] a joint distribution if you look at the [0.2] if you look at the table we have two variables [0.3] one which was related to the level of demand [0.3] and one which related to [0.2] the number of advertisements and the story was [0.2] that a company was placing weekly advertisements [0.3] and it was generating demand as a result of those [0.4] and we had [0.3] an experimental probability distribution [0.2] where we've got probabilities in blue [0.2] attached to each pair of possible values for the random variables [1.8] and then we said [0.2] well given that you've got these blue probabilities which we call joint probabilities [0.4] you could also work out [0.4] marginal probabilities if you know [0.4] all the probabilities associated with X-equals-nought [0.2] you can add them up [0.3] to find the overall probability that X-equals-zero [0.3] and that was a column sum [1.4] and we've moved on to thinking [0.2] not just about things relating to marginal distributions such as expected values or variances [0.4] but on to the relationships between variables [0.4] where we needed to look at [0.2] these joint probabilities [0.9] so this was our [0.4] basic data basic example [0.2] that's driving the [0.7] subject of the lecture [2.1]
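A minimal sketch of that column-sum idea in code. The joint table on the slide is not reproduced in this transcript, so the joint probabilities below are hypothetical stand-ins, chosen only to be consistent with the figures quoted later in the lecture (X marginals of 0.19, 0.51 and 0.30, expected values of 1.11 and 1.94); the supports, X in nought to two and Y in one to three, do match the example.

```python
# Recovering marginal probabilities from a joint distribution by summing.
# The joint probabilities are illustrative stand-ins, NOT the lecture's
# actual table; only the supports and the marginal totals match it.

joint = {  # (x, y): P(X = x, Y = y) -- hypothetical values
    (0, 1): 0.10, (1, 1): 0.20, (2, 1): 0.06,
    (0, 2): 0.07, (1, 2): 0.10, (2, 2): 0.17,
    (0, 3): 0.02, (1, 3): 0.21, (2, 3): 0.07,
}

def marginal(joint, axis):
    """Column sums (axis=0 gives P(X=x)) or row sums (axis=1 gives P(Y=y))."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out

print(marginal(joint, 0))  # ~{0: 0.19, 1: 0.51, 2: 0.30}
print(marginal(joint, 1))  # ~{1: 0.36, 2: 0.34, 3: 0.30}
```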
and we finished last week [0.9] by [0.3] looking at the [1.1] formula for the covariance [0.5] between random variables [1.1] just cover that up for the moment [1.1] and [0.2] it looked fairly messy but we dealt with some motivation for it [0.7] and [0.2] remember that this is [0.2] covariance for random variables there's a distinction between what goes on for random variables which is a theoretical kind of concept [0.3] and what goes on for real data [0.9] but the story here is [0.3] we are looking at [0.6] effectively average deviations of each variable from its mean [0.4] but [0.2] particularly [0.2] how they do that together whether when X is below its mean Y tends to be or not [0.3] we're looking at the [0.2] average of these things in the sense that we're looking at how important these joint deviations are [0.4] as measured by how likely those [0.3] corresponding occurrences are [1.2] so we take a weighted average of these paired deviations [0.4] and this double sum that you see here [0.2] simply means [0.3] that you have to add across all possible pairs of values of the two random variables [0.5] so this was the story [1.5] so the double sum there [0.2] across X and across Y simply means that what you're doing is looking at all possible pairs of values of the random variables [1.2] concerned [1.3] and what we've got is a weighted average [0.2] of these paired deviations from the mean [0.7] the weighted average is obtained by multiplying each paired deviation [0.4] by the probability of getting that deviation [0.6] and all those probabilities lie between nought and one so we call it a weighted average [1.7] and that's covariance for [0.2] random variables [0.8] and of course if you want to calculate it as we're about to see [0.3] it can be rather tedious [0.2] but the main thing is to get in your minds the motivation for what we're doing we're looking at [0.3] the relationship between variables [0.3] by looking at [0.2] how likely [0.2] the two variables seem to be [0.3] both below or above their means together [0.2] or on opposite sides of their means together [0.5] the E [0.2] is the expected value that we discussed last week that's the weighted average [0.2] of possible values of the random variable [0.3] using probabilities as the weights [2.2] so that's [0.2] covariance [0.7] for random variables if you want to compare it with what goes on with real data [0.3] in the assessed exercise you've got to calculate some correlations [0.9] er for which you'll need covariances [0.3] just to [0.6] compare [0.2] if you were looking at covariance for real data [0.2] here's what you would do [0.3] and it's possible just to identify the different points in the formulae [0.5] er so that you can see where the [0.4] points of comparison occur [0.5] so [0.2] first of all we're not using random variables we're using real data [0.3] and our real data when we look at covariance occurs in pairs [1.0] pairs of values for two [0.2] variables [0.6] and [0.2] when we've got the real data formula [0.2] again we've got deviations of er the X variable from its sample mean [0.2] and the Y variable from its sample mean instead of expected value [2.2] and instead of a probability weighting [0.2] the weighted average we've got a real average [0.2] although we use N-minus-one instead of N [0.9] so we're averaging [0.4] the [0.9] cross-product of the deviations from the mean just as we're doing with the random variable case [0.6] you only see one sum here [1.5] because this is enough [0.3] to do the summing across all pairs of values because the data come in pairs a value for the X a value for the Y [0.9] but you can see it in the same way [0.2] we're summing across all possible pairs [0.2] and these are actually the weights [0.2] every time we put in something into here we give it a weight of one over N-minus-one [0.3] it's the same sort of story [0.6] but it's for real data [0.3] rather than for [0.2] random variables [5.1]
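The two formulas side by side in code, as a sketch: the random-variable version weights each paired deviation by its joint probability, the sample version averages the paired deviations with N-minus-one. The joint table is the hypothetical stand-in from above, and the data pairs are made-up observations.

```python
def cov_rv(joint):
    """Covariance of two discrete random variables:
    sum over all (x, y) of P(x, y) * (x - E[X]) * (y - E[Y])."""
    ex = sum(p * x for (x, _), p in joint.items())
    ey = sum(p * y for (_, y), p in joint.items())
    return sum(p * (x - ex) * (y - ey) for (x, y), p in joint.items())

def cov_sample(pairs):
    """Sample covariance of paired real data: sum of
    (x - xbar)(y - ybar), averaged with weight 1/(N - 1)."""
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n
    return sum((x - xbar) * (y - ybar) for x, y in pairs) / (n - 1)

joint = {(0, 1): 0.10, (1, 1): 0.20, (2, 1): 0.06,   # hypothetical stand-in
         (0, 2): 0.07, (1, 2): 0.10, (2, 2): 0.17,
         (0, 3): 0.02, (1, 3): 0.21, (2, 3): 0.07}
data = [(0, 2), (1, 1), (2, 3), (1, 2), (2, 2)]      # made-up observations

print(cov_rv(joint))     # ~0.097
print(cov_sample(data))  # 0.25
```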
so [2.6] covariance for random variables then [0.7] is a rather messy looking thing but it's doing basically the same thing [0.7] summing across all possible pairs [0.2] comparing deviations from the means of the two random variables [0.2] and weighting the lot [0.6] taking an average [0.2] so it's the same story [1.3] but as with real data [1.0] the problem with the covariance measure to capture the relationship between random variables [0.7] is that the units of the thing [0.5] a-, whether it's big or small [0.2] will essentially depend on whether the units of measurement of X and Y are big or small [0.4] so [0.2] it won't do for comparing the strength of relationships between [0.3] different pairs of random variables [0.4] because its size [0.2] will change with the size of the random variables it's not a standardized measure [0.3] and we know what we do with real data [0.5] we take the covariance [0.5] and we divide by the [0.4] standard deviations of the two variables involved [0.6] that's what we do with sample data [0.2] and it's exactly what we do [0.3] with random variables [0.3] so [0.2] we have covariance which doesn't necessarily work for us in the sense that it's [0.2] not properly scaled [0.6] and instead of covariance [0.5] we use [0.7] correlation [1.0] and the correlation story [0.2] is related to the covariance story [0.3] in just the same way as it is for [0.7] real data [0.2] that's to say [0.4] the correlation [0.5] is the covariance divided by the two [0.5] you can see a mistake in the slide [0.5] divided by the two [0.3] standard deviations [0.2] that should be the standard deviation of [0.9] Y [5.6] and the standard deviation definition of course is the standard deviation definition that we use for random variables [0.3] it's the [0.3] weighted average of the deviation of the random variable from its mean squared [0.2] where the weights are the probabilities for that random variable alone they're the marginal probabilities [1.0] if you don't take the square root it's the variance to get the standard deviation you must take the square root [0.5] but the relationship is the same as before correlation [0.4] is covariance [0.3] divided by the two standard deviations [0.9] and that gives us a measure [0.6] that must lie between minus-one and plus-one and just as with the product moment correlation coefficient [0.6] minus-one or plus-one indicates [0.5] a perfect linear relationship between the random variables [0.4] something in between is weaker [0.3] the closer it is to zero the weaker is the relationship [1.4]
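A quick sketch of why the rescaling matters. Measuring X in different units, here a hypothetical change that multiplies it by a hundred, inflates the covariance by the same factor but leaves the correlation untouched, which is exactly why the correlation can be compared across problems.

```python
def cov_corr(joint):
    """Return (covariance, correlation) for a discrete joint distribution."""
    ex = sum(p * x for (x, _), p in joint.items())
    ey = sum(p * y for (_, y), p in joint.items())
    cov = sum(p * (x - ex) * (y - ey) for (x, y), p in joint.items())
    sd_x = sum(p * (x - ex) ** 2 for (x, _), p in joint.items()) ** 0.5
    sd_y = sum(p * (y - ey) ** 2 for (_, y), p in joint.items()) ** 0.5
    return cov, cov / (sd_x * sd_y)

joint = {(0, 1): 0.10, (1, 1): 0.20, (2, 1): 0.06,   # the hypothetical
         (0, 2): 0.07, (1, 2): 0.10, (2, 2): 0.17,   # stand-in table again
         (0, 3): 0.02, (1, 3): 0.21, (2, 3): 0.07}

rescaled = {(100 * x, y): p for (x, y), p in joint.items()}  # new units for X
print(cov_corr(joint))     # covariance ~0.097, correlation ~0.17
print(cov_corr(rescaled))  # covariance 100 times bigger, same correlation
```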
so what we want to do in er in this lecture [0.3] is just run through some calculations [0.5] of covariance and hence of correlation [0.6] and finish off with some [0.4] more general stories about [0.5] er expected values and variances [1.1] so [0.4] we're now moving into section four-point-five of the notes [0.2] if you look in section four-point-five and thereabouts you'll see the details of the calculations that i'm going to go through with you now [3.3] so it's a [0.2] sensible place to be looking [17.7] so [0.4] first of all the [0.9] numbers that we're actually going to be feeding into this calculation [0.3] are the numbers from this joint distribution for demand and advertising [0.6] so [0.7] all the information that we can possibly have about these two random variables is encapsulated in the joint distribution [0.3] we're then going to summarize this joint distribution by looking at the covariance and the correlation between these random variables [0.7] in order to do that [0.3] we are going to have to look at the indi-, the properties individually of X and Y [0.2] 'cause we're going to want to know what their expected values are [0.4] and we're going to need to know what the variances are so we can get the standard deviations [0.4] so we are going to need [0.3] the so-called marginal distributions the probabilities associated with specific values of each variable [0.4] alone [0.8] and that's what you've got [0.3] in the column and the row sums which are [0.2] illustrated in the notes [0.3] so we're going to need the joint probabilities to work out the covariance and hence the correlation but on the way [0.3] we're going to need [0.5] the m-, so-called marginal probabilities the probabilities associated with [0.4] values of each random variable taken [0.4] individually [0.6] 'cause we're going to look at deviations from the mean of each random variable [0.9] so this is the basic information that we're going to use [13.5] so to get at the covariance here's the covariance formula [0.2] we're going to need [0.3] the expected value of X [0.2] and the expected value of Y [0.5] and then to get at the correlation we're also going to need [0.4] the standard deviation of X and the standard deviation of Y [0.4] so [0.2] to get those individual quantities the expected value and the standard deviation [0.3] we need the individual distributions [1.1] the margin-, so-called marginal distributions [2.5] so here's the s-, here's the story of the calculations made out in [0.3] tabular form [10.1] so this is all to do with our [0.4] advertising demand example [0.5] so here's the story [0.4] on the marginal distribution of X just looking at X alone there are two things we want to know about X alone [0.4] we want to know what its expected value is [0.3] and we want to know what its variance is [0.2] so we can take its square root to get its standard deviation to use in the correlation [0.9] so [0.3] here we've got [0.2] the possible values of the random variable capital-X which we denote by little-X [0.4] and we get nought one and two [0.9] from the joint distribution we were able to calculate the marginal probabilities associated with each of those values of X [0.3] and there they are [1.6] to calculate the expected value [0.2] it is the weighted average of the [0.2] possible values of the random variable where the weights are the probabilities that is to say [0.5] it's each X times its probability [0.5] multiplied together [0.4] and then [0.2] those individual products summed [0.3]
that will be the expected value [0.6] so the expected value of X here [0.3] is one-point-one-one [3.6] and that's fine that gives us one thing that we need [0.4] about the marginal distribution about the individual X distribution [0.9] the other thing we need [0.4] is the variance [0.4] and the contributions to the variance [0.5] are [1.5] the deviations of the random variable from its expected value how far away does it [0.4] get from its expected value how spread out is the distribution [0.5] and in the end what is the average of that sort of spread [0.4] so what we look at [0.3] is the difference between each individual value [0.5] and the measure of central tendency the expectation [0.8] we square that [0.5] and we take the weighted average [0.3] of these numbers [0.4] where the weights again are the probabilities [0.3] and [0.2] this [0.5] squared deviation [0.4] is much more likely because it has a probability of point-five-one [0.4] than this squared deviation which only has a probability of point-one-nine [1.9] so again [0.4] we've got to [0.2] multiply [0.6] each element in this column by its corresponding element in this column [0.3] and add the things together [1.9] which is what we see [0.4] there [0.4] in the final column except you don't see it [2.7] okay so [0.3] in here [0.2] point-two-three-four-one [0.4] is [0.2] this number [0.3] multiplied by this one [0.5] and then the variance is just the sum of those things [2.9] so as long as there aren't too many numbers it's er [0.3] tedious but not to the point of exhaustion [3.1] and that's the variance that's the weighted average in terms of the random variable the way that the random variable deviates around its mean [0.3] how spread out the distribution of the random variable is [1.7] just remember [0.3] that when we finally get to feeding this into the formula for correlation [0.5] we'll want the square root of this quantity [1.0] so [0.2] although the variance is about point-four-eight [0.3] the standard deviation is going to be bigger than that the square root of a number that's less than one [0.5] is bigger than the original number [0.6] so it's about point-seven [4.3] and you have to repeat this sequence of calculations [0.8] for the Y variable [0.9] so you want the same sorts of columns [0.3] for the Y variable [0.8] and [0.4] to complete the calculations [0.2] so [0.3] you need to know the values that the random variable can take the probabilities that it takes each value [0.7] that eventually gives you the expected value [0.3] hence you can calculate the square of the deviation of the values [0.3] from the expected value [1.4] weight those by the probabilities again [0.4] add them up [0.3] that gives you the variance [0.3] the square root of which [0.4] is the standard deviation [0.8] so [0.3] in black there [0.2] you have the key quantities that you need to obtain [0.5] from the [0.3] marginal distributions they themselves of course are coming from the joint distribution everything is in the joint distribution [0.3] the joint distribution is everything what we're doing is calculating some [0.2] summary descriptors [0.4] of that distribution [0.3] both individually [0.5] expected [0.4] value and variance and eventually [0.3] in terms of the relationship between the variables [2.6] so those are the numbers that we need [1.6] these calculations are laid out [0.5] but [0.2] we've just been through one to make it clear [0.8]
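The marginal calculation just described, as a short sketch. The X marginals here are reconstructed from the figures quoted in the lecture: P of X-equals-nought is point-one-nine, P of X-equals-one is point-five-one, and the remainder point-three-nought, which reproduces the stated expected value of one-point-one-one, the point-two-three-four-one contribution, the variance of about point-four-eight and the standard deviation of about point-seven.

```python
p_x = {0: 0.19, 1: 0.51, 2: 0.30}  # marginal distribution of X

e_x = sum(x * p for x, p in p_x.items())                 # weighted average of values
var_x = sum(p * (x - e_x) ** 2 for x, p in p_x.items())  # weighted squared deviations
sd_x = var_x ** 0.5                                      # square root for the sd

print(e_x)    # 1.11
print(var_x)  # ~0.478, with 0.19 * (0 - 1.11)**2 = 0.2341 as one contribution
print(sd_x)   # ~0.69, "about point-seven"
```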
it's going [3.3] so [0.2] we now need to turn back to the joint distribution [0.6] to think about [0.6] these [0.3] the way that the variables [1.1] change together [0.2] rather than individually [11.3] okay [1.2] this table is also in your notes it comes out rather [0.2] small here [0.6] but [0.3] what we need to do [0.4] is to start thinking about [0.3] the relationship between [0.5] the random variables so we've got [0.3] to [0.2] to move on now to the joint distribution [0.9] so [0.4] the first thing [0.2] to think about is [0.2] how to organize the pairs of values that can occur [0.6] well if you just look at the first and the third column here you can see how i've organized it i've tried to organize it systematically [0.5] i've said [0.3] fix the Y value at one [0.5] work through the possible X values [0.2] fix the Y value at two [0.3] work through the possible X values [0.2] fix the Y value at three [0.3] work through the possible X values just do it systematically you could lay it out differently [0.4] but we have got to cover [0.5] all possible pairs [0.5] of values [0.7] so [0.8] every row in this table corresponds to a pair of values of X and Y [0.3] so when we're trying to sum across all possible pairs what we're doing again [0.3] is summing the column [3.1] so in the middle [0.2] column there column two [0.3] i've just put [0.2] the deviation of X from its mean why do i need that [0.2] because that term feeds in [0.3] to my covariance formula [0.7] not on its own [0.4] but it's still in there [0.6] so [0.2] here i've got all my deviations i'd already calculated these numbers for the purposes of calculating the variance of X [0.2] so although i've written them in again [0.6] you would have already calculated them [0.9] and of course [0.2] it repeats so [0.3] at X-equals-nought the deviation is always minus-one-point-one-one [0.5] at X-equals-two the deviation is always nought-point-eight-nine [0.8] so it reappears [2.0] and the same will be true [0.7] of the way [0.2] Y deviates from its mean [0.2] because of the way i've got it organized [0.2] i've got three [0.2] Y-equals-one values so i've got three deviations the same [0.7] three Y-equals-two values together [0.3] so i'll get three values the same there [0.5] but again i would have already calculated these deviations to get at the variance [0.3] i'm going to use them again to get at the covariance [0.4] but to get at the covariance [0.5] i'm going to [0.2] multiply these [0.2] elements from these two columns together [0.2]
i'm interested in the [0.4] relationship between the deviation of X from its mean [0.4] and Y from its mean [0.5] that was what the formula said [2.4] so [0.3] i need [0.2] a fifth column which would be a brand new column [0.4] which is the product [0.2] row by row [0.3] of what's in the second column and what's in the fourth column [0.4] so my one-point-o-four here [0.3] is minus-one-point-one-one [0.6] multiplied by minus-nought-point-nine-four [1.0] both numbers [0.4] tend to be [0.2] er both numbers there are below their mean [1.0] next time [0.2] we're multiplying [0.3] minus-point-one-one [0.2] by minus-point-nine-four again [0.3] both numbers tend to be below their mean that's evidence for positive covariance [1.0] they vary together [0.2] however the next number [0.4] point-eight-nine [0.4] times minus-point-nine-four [0.4] X is above its mean but Y is below [0.5] [sniff] [0.9] how important is that [0.5] well it depends on the probability of getting that pair of values [0.4] so we've got two [0.3] pairs of values that are both below their mean [0.3] and they both have positive probabilities [0.3] one where one's above one's below [0.2] and it has a positive probability but it's relatively small [0.6] so on balance when we look at that [0.3] it would appear that we've got [0.4] er evidence for a positive [1.3] covariance [0.2] and then we work our way down [0.4] that's below that's above multiply them together [0.3] get a positive number [0.4] again [0.6] that's actually sorry we get a negative [0.3] number here i've moved this across to the probability column now [0.2] we're getting one below [0.2] one above they're opposite [0.6] opposite locations with respect to the mean [0.3] and so on through [1.2] so that's this column here [0.2] this column here [0.2] is the deviation of X from its mean [0.3] times the deviation of Y from its mean for each pair [0.3] of values [2.4] we want to find out what the average [0.6] de-, [0.2] average combination of these [0.3] deviations is when multiplied together [0.2] so we're going to multiply each product by its probability [0.2] of occurring [0.6] that's the joint probability of getting the underlying values of X and Y [0.6] that will then give us [0.4] eventually when we add all those things together it'll give us a weighted average [0.3] so [0.3] what we want to do [0.3] is multiply [0.4] this column [0.2] which itself is the product of these two [1.0] by the probability [1.2] and that's the final column [0.5] that we've got in that table [1.1] so the final column there [0.4] is [1.2]
the element in column two [0.3] times the element in column four [0.4] times the element in column six [2.6] but that's for each pair so we've got one row [0.3] corresponding to one pair of values of the random variables [0.3] we want to sum across all pairs [0.3] we expressed that as a double sum but all you have to think of is summing across all possible combinations [0.3] so what we have to do [0.2] is sum up this column of num-, numbers here [0.8] and find out what we get [0.3] and that will be the covariance [0.4] between these two [0.3] random variables [1.3] now [0.5] that's [0.7] just a matter of hitting the buttons on a calculator then it's about nought-point-one [2.1] we can't read anything at all [0.5] into the numerical value [0.6] of that number [0.5] because the number that you get out of that [0.2] is clearly going to depend [0.3] on the sizes of the numbers that went in in the first place and if those sizes are large [0.4] then that number is going to tend to be large [2.2] what we can do however is take something from the sign the sign is positive so [0.3] whatever kind of relationship there is here [0.2] or linear relationship of course specifically we are talking about [0.5] is positive whatever there is [0.6] it would represent [0.3] a situation where [0.5] if X is above its mean [0.2] Y would tend to be above its mean [0.3] if X is below its mean [0.2] Y would tend to be [0.2] below its mean [0.3] whether they're below or above or not [0.3] they're paired [0.3] in that sense [3.1] so [0.3] what we've done then [0.2] is to compute the formula at the bottom of the screen [2.1] we've taken [0.3] all the possible pairs of X-minus-E-of-X Y-minus-E-of-Y multiplied them together multiplied them by their probabilities [0.2] and added the lot together [0.3] and that's the covariance [2.7] however [0.2] we want the correlation we want this scaled version that's going to tell us [0.2] is that [0.4] er [0.5] a very strong relationship or isn't it we've got to remove the [0.4] size effect [0.4] from the [1.3] er measure [0.2] and we do that [0.5] by dividing by the standard deviations [13.5] we've worked out what the standard deviations are [2.0] on the last slide but one [1.0] so [0.8] all we have to do [0.2] is to substitute them into the formula all the bits of the formula are now calculated all we've got to do is substitute them [0.4] so the correlation is the number we worked out for the covariance [0.5] divided by both the standard deviation of X [0.4] and the standard deviation of Y [0.6] and that will give us a standardized measure that must lie between minus-one and plus-one [0.5] the closer it is to zero the weaker is the relationship [0.4] the closer it is to minus-one the closer it is to a perfect downward sloping relationship [0.3] the closer it is to plus-one [0.5] the closer it is to a perfect upward sloping [0.5] relationship [1.0] so [0.2] there are the numbers that we've calculated on the previous slides [0.4] they simply have to be substituted [0.3] into the formula [0.9] you can multiply these two together first [0.4] before dividing them into that one if you prefer or you can do it sequentially [0.5] divide the covariance by the standard deviation of X [0.4] and then divide that number by the standard deviation of Y the same result [0.2] will [0.7] occur [1.4] and [0.3] the bottom line [0.6] for the correlation between these two random variables [0.8] is [0.4] that [0.7] we get a number that's about nought-point-two [1.7] that's rounded to four decimal places [0.3] the number that you see up there [0.5] so the correlation is positive but it's weak [1.1]
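The whole table calculation chained together as a sketch, again on the hypothetical stand-in table rather than the lecture's own slide; it was built to give a covariance of about nought-point-one, and it yields a correlation of about point-one-seven, close to the lecture's point-two and weak in the same sense.

```python
joint = {(0, 1): 0.10, (1, 1): 0.20, (2, 1): 0.06,   # hypothetical stand-in
         (0, 2): 0.07, (1, 2): 0.10, (2, 2): 0.17,
         (0, 3): 0.02, (1, 3): 0.21, (2, 3): 0.07}

e_x = sum(p * x for (x, _), p in joint.items())   # 1.11
e_y = sum(p * y for (_, y), p in joint.items())   # 1.94

# one term per row of the table: deviation times deviation times probability
cov = sum(p * (x - e_x) * (y - e_y) for (x, y), p in joint.items())

sd_x = sum(p * (x - e_x) ** 2 for (x, _), p in joint.items()) ** 0.5
sd_y = sum(p * (y - e_y) ** 2 for (_, y), p in joint.items()) ** 0.5

corr = cov / (sd_x * sd_y)  # divide by both sds to remove the size effect
print(cov)   # ~0.097, "about nought-point-one"
print(corr)  # ~0.17, positive but weak
```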
er there doesn't seem to be a strong [0.3] relationship between [0.6] er the [0.2] advertising [0.6] and the demand levels [0.7] over the period that this empirical probability distribution has been constructed [3.2] we shouldn't perhaps be [0.2] too surprised about that [0.6] but for some distributions this number will be [0.4] a lot bigger [0.9] but we've got [0.7] now a number of summary statistics for our joint distribution [0.8] we've got [0.7] in particular we've got individually for the Ys and the Xs [0.6] their expected value which is a measure of the [0.2] er central tendency or the general location of the distribution of each individually [0.6] we've got the variance which is a measure of their spread [0.3] about that measure of central tendency how spread out are they about that [0.2] er [1.6] overall measure of location of the distribution [0.7] and then we've got two measures [0.4] of the relationship between the variables we've got the [0.2] covariance [0.6] and then we've got the correlation [2.6] so the numbers are er somewhat tedious but what i'd really like you to take away from this is [0.3] er [0.3] you do have for the purposes of examinations to be able to perform such calculations [0.4] but in the longer term [0.3] what you need to develop is some intuition about what's going on with these definitions what is really being delivered to you when you make these calculations [1.0] er because in practice if you go on to do any more statistics of course [0.4] you won't be handling the individual calculations [0.2] you'll leave that to a machine [0.3] but if you don't know what the formula is doing [0.4] you really can't interpret the answers very reliably [2.7] so [0.3] that's joint distributions then [0.2] and summary statistics for distributions of random variables [0.2] and each one of those [0.2] has an analogue with real data [0.4] we start off with real data get an understanding of what we're doing with real data [0.2] and then to develop more statistical techniques we develop tools [0.3] random variables and probability distributions [0.3] which allow us to [0.3] become more sophisticated in our analysis of real data there's a feedback effect [2.0] the last er few things that i want to talk about in this section of the lectures [0.6] er [0.2] hark back [0.4] to [0.2] looking at individual variables looking at individual random variables [3.4] and i want to talk then again about expected value and variance [0.6] but i want to [0.4] point out to you a very important feature of these [0.2] er [0.3] calculations [0.8] very often [0.5] you'll have some information about some variable [0.8] that's not itself directly of interest [0.7] the thing that you want to think about [0.5] is related to the [0.6] er random variable about which you have information [0.3] but is not the same as [0.3] that [0.6] random variable [2.2] so you might have a probability distribution about s-, [0.2] in this case it's going to be [0.4] about sales in this illustrative example i'm going to introduce [0.2] but you may not want to know about sales you may not want to know what the expected value of sales is you may not want to know what the variance of sales is [0.8] what you may want to know [0.4] is [0.5] what happens to profits what are the properties of profits not what are the properties of sales [1.0]
but what you know about is sales how can you move from one to the other [1.5] so here's a story [0.6] that is [0.5] very useful and this story also applies although we're going to introduce it for random variables and talk about expected values and variances [0.9] exactly the same rules apply for real data [0.3] in other words to averages [0.3] and to sample variances [1.2] so supposing we've got [0.7] we've got a random variable we've got information on sales [0.4] but what we want to know [0.5] is profits [0.4] we want to know about [0.3] profits [1.8] so [0.2] here's the relationship [0.5] that has been found to exist [0.4] between [0.3] profits [0.3] capital-P [0.3] and sales capital-X i'm using capital letters because i want you at this stage to think about these things as [0.3] random variables in general [0.5] they will have probability distributions [0.5] we will be able to get [0.8] realizations or particular values in practice [0.2] but when we're thinking about them as random variables we want to think about them in the abstract [0.5] so this says [0.3] that whatever value of X you happen to have from your distribution [0.2] the corresponding value of P can be calculated [0.3] and so it can [0.2] be stated as a general rule as the relationship between [0.5] random variables [0.5] the units here [0.5] are [0.2] that er everything is measured in thousands of pounds [0.7] er [0.2] and in some cases thousands of pounds [0.2] per day [0.9] and you can interpret the [0.2] terms that you see here it's a linear equation it's simple [0.2] and the rules that i'm going to deal with [0.3] are specific to linear equations linear equations are ones where [0.3] one variable [0.3] is a constant multiplied by the other [0.2] with another constant added or subtracted you've seen that before [0.3] with what namex did for elasticity and so on [1.1] and you can interpret these coefficients as i have done there [0.3] the three [0.3] is giving you some measure [0.2] of the profit per car [0.4] in thousands of pounds [0.5] the minus-two [0.4] is [0.4] some fixed cost per day [0.4] in thousands of pounds so what you've got here [0.4] are [0.3] the amount of [0.6] profits that you're going to get [0.2] per car [0.3] minus the amount of money that you're going to lose anyway as a result of say keeping your showroom going or something of that sort [2.9] so this relationship is in terms of random variables so R-Vs here means i'm using capitals i'm talking about random variables in general [0.4] but i can of course use the rule [0.5] for specific values of the random variables so er [0.2]
if we know [0.5] or somebody asks us to consider [0.3] what happens when the random variable X takes the value specifically one [1.6] then we can work out [0.2] what the corresponding profits would be [0.8] by simply feeding that one [0.3] into the formula [0.4] and in this case we'd get profits of one unit [0.8] one thousand pounds per day [4.3] we've got [0.8] either because we [0.2] er developed a statistical model or we've got an experimental probability distribution [0.4] we've got information [0.3] about [0.2] X [2.7] but what we want to know [0.7] about is not X [0.3] it's profits this is what matters or at least for some reason this is what we're asked to investigate [0.7] so how do we get a story about profits from the story about sales and do that in terms of probability distributions and their summary statistics [2.0] well when you look at the [0.5] equation [1.1] you can see [0.4] that [1.2] any value of X [0.2] generates a particular value of P [0.8] so [0.4] if we change the value of X we'll necessarily change the value of P [0.6] and if we never come back to the same value of X we'll never come back to the same value of P [0.6] a particular value of er sales X [0.2] is associated uniquely with a particular value of profits [0.4] P [0.7] so [0.2] if somebody tells us what the probability of getting some particular value of sales is [1.1] all we have to do [0.6] is to describe the probability of getting the corresponding value of profits by saying it's the same [0.7] so [0.2] the probability of getting [0.3] profits of unit one [0.3] is the same [0.2] as the probability of getting sales of unit one the probabilities are going to be the same [0.7] it's called a one-to-one relationship you've got no overlapping at all [3.2] so what this tells us [0.3] is [0.2] if we know what the probability distribution of the [0.2] Xs is [0.7] we know what the probability distribution of the Ps is [0.8] we can just read off what the probability of the corresponding X value is and we'll have a set of pairs of values of profits [0.4] with their probabilities [0.4] so we'll know what the [0.4] probability distribution of profits is [0.3] from which [0.4] we can calculate [0.3] any of the summary statistics for the distribution [0.7] er that we want [0.2] expected value or variance using the formulae we've got [1.3]
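A sketch of the one-to-one relabelling for the sales and profits example. The probabilities for X equals nought, one, two and five are the ones quoted in a moment (point-one-eight, point-three-nine, point-two-four, point-o-one); the two middle ones, point-one-four and point-o-four, are not stated in the lecture and are inferred here so that the expected value comes out at one-point-five and the variance at one-point-two-five, the figures given later, so treat them as a reconstruction rather than the slide itself.

```python
p_sales = {0: 0.18, 1: 0.39, 2: 0.24,  # probabilities quoted in the lecture
           3: 0.14, 4: 0.04,           # inferred values, see note above
           5: 0.01}

def profit(x):
    return 3 * x - 2  # P = 3X - 2, in thousands of pounds per day

# the relationship is one-to-one, so each profit value simply inherits
# the probability of the sales value that generates it
p_profit = {profit(x): p for x, p in p_sales.items()}
print(p_profit)  # {-2: 0.18, 1: 0.39, 4: 0.24, 7: 0.14, 10: 0.04, 13: 0.01}
```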
so we can regenerate a new probability distribution [0.4] and we can calculate expected values and variances [0.6] if we want to know the probability distribution specifically [0.4] we're going to have to go through this process [0.5] but it turns out that if you want to know just these summary statistics [0.7] you don't have to go through the irksome process [0.3] of calculating all the probabilities [0.2] and then running through the formulae there's a sh-, a very important [0.3] short cut [10.9] so just to be clear [0.2] supposing [0.3] you wanted to work out [1.0] what the [1.0] er [0.5] expected value of profits was if you do it longhand what have you got to do [0.8] well for each possible value of X [1.1] you have to work out the corresponding value of profits [1.1] and you have to know what the probability of getting that value of X is [0.2] it'll be the same as the probability of getting that value of profits [0.7] but you have to do this calculation [0.4] for every single [0.9] value of X you have to work out a new value of P [1.0] to get the probability and then you're going to have to multiply [0.3] the value of P by its probability [0.4] for each possible value [0.6] and add them all up [0.3] to get the expected value [0.5] er there isn't an answer here i haven't written it out that's a tedious calculation [0.6] and if there were very many possible values for the random variable [1.3] not just six maybe sixty [0.2] it would be a real pain [1.7] but this is how you'd work out the expected value of P if you had to [0.5] it's minus-two times point-one-eight plus one times point-three-nine plus [0.2] four times point-two-four [0.4] all the way to the end [0.3] thirteen times point-o-one [2.6] and er [0.3] the variance would be even worse you'd have to work out all the s-, deviations square them multiply by the probabilities add them [0.4] a real [0.3] pain [2.1]
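The longhand route spelled out, using the reconstructed distribution from the previous sketch; the first few products are exactly the minus-two times point-one-eight plus one times point-three-nine that the lecture reads out.

```python
p_profit = {-2: 0.18, 1: 0.39, 4: 0.24, 7: 0.14, 10: 0.04, 13: 0.01}

# expected value longhand: every profit value times its probability, summed
e_profit = sum(v * p for v, p in p_profit.items())
print(e_profit)  # 2.5

# the variance longhand is worse still: squared deviations, re-weighted
var_profit = sum(p * (v - e_profit) ** 2 for v, p in p_profit.items())
print(var_profit)  # 11.25
```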
so you don't do that [0.4] when the relationship between the random variables [0.4] is linear [0.4] when the relationship between the random variables is linear [0.5] you've got a very simple [0.4] alternative [0.2] route to follow [3.1] so [0.8] we just have [0.2] a general statement and we say here's a linear relationship between a random variable X and a random variable Y [0.5] we multiply by B [0.2] add A [1.1] so we've got all the information we need about X we've perhaps got its probability distribution [0.5] and we want to find out what the expected value of Y is how do we go about it [1.4] well all we have to know [0.5] is that [0.3] the relationship between the expected values [0.3] is exactly the same as the relationship between the random variables themselves [0.5] so [0.3] if in order to get any value of Y you must multiply X by B and add A [0.4] it's true that if you want the expected value [0.3] of Y [0.2] you simply multiply the expected value of X by B [0.3] and add A [4.8] so you certainly don't go through all the rigmarole of working out all the [0.3] probabilities and expected values and [0.2] so on [0.3] all you need to know [0.4] is what the expected value of X is so if somebody's told you what the expected value of X is or you've got that information from somewhere [0.5] it's very easy to work out the expected value of Y [1.6] the same is true of the variance [0.3] but don't forget with the variance you're always looking at [0.8] deviations from the mean [1.4] and you're squaring them [1.5] so [0.2] when we look at the mean of Y or the [0.2] expected value of Y [0.4] no matter what the actual [0.3] specific value of Y is [0.2] the expected value is going to [0.2] involve the term A [0.8] so when we look at the difference between the average and the actual both the average and the actual [0.3] will include a term in A [0.2] here's the A coming in from the average [0.3] here's the A coming in from the actual if you like [0.6] so when we take the difference [1.0] the A is going to disappear [0.8] so when we look at deviations of Y about its mean [0.5] clearly the A bit's not going to play a role [0.5] what is going to play a role is the B bit [1.0] but that's going to get squared up because with variance we look at the square of the deviation from the mean [0.4] so when we look at say this minus this [0.2] the A will disappear but we'll have B in there [0.9] when we look at variance we square that [0.3] so we're going to end up with B-squared [0.6] and in fact the relationship between the variances [0.8] is this [0.6] the variance of Y [0.4] is [0.7] B-squared times the variance of X [0.6] the A is irrelevant it disappears when taking the difference [0.3] the B sticks around [0.6] but it has to be squared up because variance is a squaring operation [1.7] and these relationships would hold as well for sample data [0.5] if you got some data and you knew what the average value of X was [0.3] and you knew that X was related to Y in this way [0.4] then the average value of Y could be worked out [0.3] using this formula but plugging in the average value of X [0.3] similarly [0.3] if you knew what the sample variance of X was [0.4] then the sample variance of Y [0.3] can be worked out like this [0.3] it's true both of the [0.7] random variable case which we sometimes refer to as the population case [0.4] and [0.5] real data [1.4] it's also the case [0.3] that these relationships hold not just for discrete random variables we've been pushing the discrete random variable story for reasons of simplicity [0.4] these relationships [0.3] also hold for continuous random variables although we haven't defined exactly what we mean by expected value and variance [0.4] you should be developing some conceptual ideas of what we mean [0.4] and we are able to define these things for continuous random variables [0.3] in which case [0.3] these relationships continue to hold it's a very general result [0.5] when the basic relationship between the random variables is linear [3.3]
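A minimal check of the two linear rules on the sales and profits figures that follow (expected value of sales one-point-five, variance one-point-two-five): the shortcut reproduces the longhand answers from the previous sketch without touching a single probability.

```python
a, b = -2, 3            # P = a + b * X, the lecture's linear rule
e_x, var_x = 1.5, 1.25  # expected value and variance of sales

e_p = a + b * e_x       # E[a + bX] = a + b * E[X]   ->  2.5
var_p = b**2 * var_x    # Var(a + bX) = b^2 * Var(X) -> 11.25, the a drops out
print(e_p, var_p)       # matches the longhand 2.5 and 11.25 above
```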
so [0.3] here is [0.3] er a very simple [0.3] er illustration then [1.0] we had that profits was [0.2] three times sales minus two [5.2] er once you know that the expected value of sales is one-point-five you can easily calculate the expected value of profits [0.6] by feeding it into the formula [0.8] you don't have to go through all the rigmarole of working out a probability distribution [0.4] and er going through every step in the calculation of expectation [0.9] similarly [0.5] with variance [0.5] the A bit is our minus-two here [1.2] it's the constant that you add on in this case it's a subtraction so it's minus-two [0.2] but it's irrelevant [2.1] and the only thing that's important is the multiplicative factor of three [0.8] and that [0.4] of course has to be squared [0.3] so we get the variance of P is three-squared times the variance of X [0.4] the variance of X is one-point-two-five [0.4] and so we get the variance of P [0.8] which is larger of course 'cause the multiplicative [0.3] term is larger than one [1.6] we'd have got variance reduction [0.4] if this number instead of three was a number less than one [0.8] it just depends on what's er being fed in [2.1] so [0.3] when there's a linear relationship between your random variables don't recalculate [0.6] the expected value and variance use these formulae [16.6] we make far less play of the following results but nonetheless they're interesting [0.2] they're important to know about [0.2] at least to be aware of [0.4] there are all kinds of other relationships like this that exist [0.6] if we've got to [0.6] add [0.4] random variables together [0.2] possibly multiplied by their own constants [0.4] there are all kinds of simple rules for working out the new expected values [0.3] and the new variances from the old expected values and variances [0.9] so a very simple case is [0.4] if you want the expected value of the sum of two random variables [0.3] it's just [0.2] the sum of the expected values [0.6] very straightforward indeed [2.1] variance is more interesting [0.2] when you come to think about sums [0.4] because if you're dealing with a sum [0.6] you don't just have to consider what's happening [0.4] to each variable [0.5] individually [0.2] you have to think about what's happening to them [0.3] together [0.5] so [0.3] if you're faced with a position of needing to work out the variance of a sum [1.4] you have to take account of the covariance the way that they [0.3] vary together [1.0] so [2.3] you'll notice we've got coefficients being squared up here [0.5] but [0.2] let's look at the really simple case to make the point if you wanted the variance of a sum [0.4] the variance of a sum would be the sum of the variances that's fine [1.2] but [0.2] you also have to take account of the covariance [0.2] the extent to which they vary together [0.8] and if the covariance is negative [0.4] you can see that that will [0.3] reduce the variance compared with simply the sum of the individual variances the covariance is helping to [0.6] have a reducing effect [0.4] on the aggregate variance in there [2.1] if the covariance is negative [1.0] notice also then [0.3] that [1.0] in the special case where there is no covariance where there's no relationship at all between the random variables [0.3] then indeed [1.2] if we want the variance of the sum it's the sum of the variances [0.8] that should remind you a little bit of the story about independence and probabilities if you want the probability of a joint event [0.6] we said [0.3] sometimes you can just multiply the two probabilities together [0.7] but only if the two [0.5] events are independent [0.4] so the story here is [0.4] if you want the variance of the sum [0.3] you can take the sum of the variances [0.5] but only if [0.5] there's no relationship between the variables the covariance is zero [2.7] so you then might ask well [1.7] is it true that er [1.1] zero covariance means that the [0.2] random variables are [0.4] independent can i say that if there's no covariance between the data they're really independent remember we defined independence quite precisely earlier on [1.3] well [0.6] unfortunately that's not the case [1.3] the definition of independence is very precise and it only involves probabilities [0.4] when we looked at covariance [0.3] we weren't just interested in [0.3] probabilities we were also interested in the values that the random variables can take [2.2] and it turns out [0.2] that if you start off with random variables that really are independent [0.5] then it's certainly true [0.7] that there won't be any covariance between them [0.3] or to be more careful the covariance will be zero [0.2] between them [1.0] that's okay [0.5] but [0.3] it doesn't go the other way [0.2] that is to say [0.7] just because [0.6] when you feed everything into the covariance formula everything cancels out to give you zero covariance [0.4] it doesn't mean strictly speaking [0.4] that the random variables are independent and the reason is [0.3] that [0.2] what's going into the covariance formula [0.2] is not just the joint probabilities [0.2] it's the values of the random variables too [0.6] when we look at independence [0.3] the only issues [0.4] are the probabilities the marginal and the joint probabilities [2.9]
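Both points in one sketch, on a standard textbook example that is not from the lecture: X uniform on minus-one, nought, one, and Y equal to X-squared. The covariance comes out at exactly zero, so the variance of the sum really is the sum of the variances here, and yet the variables are as dependent as can be, since Y is a function of X: the joint probability at X-equals-nought, Y-equals-nought is a third, while the product of the marginals is only a ninth.

```python
# joint distribution of X uniform on {-1, 0, 1} and Y = X**2
joint = {(-1, 1): 1/3, (0, 0): 1/3, (1, 1): 1/3}

e_x = sum(p * x for (x, _), p in joint.items())  # 0
e_y = sum(p * y for (_, y), p in joint.items())  # 2/3
cov = sum(p * (x - e_x) * (y - e_y) for (x, y), p in joint.items())
print(cov)  # 0.0 -- the covariance is exactly zero

var_x = sum(p * (x - e_x) ** 2 for (x, _), p in joint.items())  # 2/3
var_y = sum(p * (y - e_y) ** 2 for (_, y), p in joint.items())  # 2/9
e_s = sum(p * (x + y) for (x, y), p in joint.items())
var_s = sum(p * (x + y - e_s) ** 2 for (x, y), p in joint.items())
print(var_s, var_x + var_y + 2 * cov)  # both ~8/9: the sum rule checks out

# but independence fails: P(X=0 and Y=0) = 1/3, whereas
# P(X=0) * P(Y=0) = (1/3) * (1/3) = 1/9
```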
okay [0.2] that's it for the lecture can i remind you please er there's the er assessed exercise which is due in this time [0.2] next week [0.3] if you've got questions about it come and speak to me or your class tutor [0.8] er [0.6] and the problem sets for discussion this week [0.3] are the ones on expected value variance and covariance [0.4] okay thanks very much that's it