nf0955: i wanted to start by saying that at the end o-, o-, well not quite at the end [0.2] of the lecture i'd made a correction but somebody came and asked me at the end about the correction [0.5] so can you just check in your notes where i was [2.4] writing the [1.3] variance of G da-, [0.3] G-of-X [1.4] approximately equal to [0.5] G-dashed at mu [3.9] right i forgot the squared initially can you check you've all i think you've probably all checked that [0.3] i'd put the squared into all the other formulae but just not in here [1.4] so can you just check that you've got that correct in your notes or you might get [0.8] slightly puzzled when you come back to them [15.7] right i was then going on to talk about the Kaplan-Meier estimate [3.6] so Kaplan- [2.0] Meier [2.2] estimator [2.2] and [2.0] it just incidentally was a paper published by two people called Kaplan and Meier in nineteen-fifty-eight [0.7] er [1.1] just to show you that quite a lot of the stuff [1.0] we do isn't [0.4] late well second half of the twentieth century [0.4] sometimes if you try to explain to people you're doing research in statistics they look at b-, you blankly and say what is there to do they seem to think all these things just [0.6] appeared instantly [0.9] er and nobody ever thought about them [4.4] right and the notation i was [0.2] using [0.4] was [4.9] you should just about be at this stage in your notes anyway this should already be written down [0.9] so the notation was we're looking at [0.8] T-zero T-one [0.9] up to [0.2] did i say T-N or T-K [1.5] N sf0956: N [0.9] nf0955: [laugh] er [0.8] and we [0.2] as usual have [0.8] T-nought equal to zero [4.7] and then i gave you a description of the intervals which i did [0.2] correctly verbally and not correctly [0.5] in the written form [0.4] we think about the intervals [0.6] I-I [0.3] being [1.3] zero [0.3] T-one [4.6] sorry i knew i just [1.7] that's what got me confused you have to have that [0.2] square bracket of the initial
zero [1.6] T-one [0.2] to T-two [0.8] and so on so the intervals are [0.8] with the exception of at zero [2.3] T-I [1.9] [1.8] T-I-plus [1.7] T-I-plus-one [0.8] with the square bracket [3.2] so that you're actually including [2.8] the interval [8.7] doesn't really make sense t-, [1.2] to put that down as an equation [2.4] okay then we need exactly the same thing as in an actuarial life table [1.2] except that instead of looking at intervals [0.5] [0.4] we're looking at point estimates so that [4.0] for [1.9] all those intervals I equals [1.8] one [1.0] up to N [0.6] we let [0.8] D-of-T- [0.3] I [0.9] be equal to the number [0.8] who died [2.0] at [0.5] T-I [0.7] and now we are requiring that we have the exec-, exact death time so it's not in an interval it's at a particular time [6.3] and we're going to need again we're going to need to know how many people there are [0.3] so [1.7] for I-equals-one up to [1.0] N [2.6] [1.9] and zero [0.5] er [0.5] N-of-T-I is [0.7] the number [1.7] alive and at risk [2.3] at [1.8] time I [11.1] so those two are r-, [0.3] really no different from [1.4] the actuarial life tables [0.4] the difference comes in [0.
4] in the censoring [2.1] which i-, so again [2.1] for any time we want to know [1.6] C ah s-, [0.4] C-of-T-I [1.6] but this time [2.1] it's in a different interval so [0.3] it's the number censored [2.6] sorry the number [1.6] censored [3.2] in [4.8] T-I-minus-one [0.4] T- [0.2] I [1.6] and the [1.3] equality and inequality changed round [4.2] [0.4] we defined the time points to be the point at which deaths occur not at which points at which censoring occur [0.8] so we can actually get things happening in an interval for censorings unlike for deaths [0.8] [0.2] and this is the point i was making that if we actually had a death [3.3] at [0.3] some value say [1.1] eight doesn't matter what the eight is [1.2] then [1.4] the [0.5] number of deaths at point eight would be equal to one [1.0] but if there was a censored observation [2.4] at [0.9] time eight as well [0.9] we regard that as happening [0.2] after that death [0.3] in other words [0.4] it has to go [0.4] [0.9] if the because the interval will be [1.0] something [2.1] eight for the censoring where that's strictly [1.3] values less than eight [0.2] the censoring goes into the interval above this [3.7] it's actually much easier to do this often than to [0.5] er [0.6] [0.3] think about the [1.4] worry too much about the formulae [0.7] what it does mean of course is that you can also say [2.2] therefore [2.2] the number at time [1.0] I-plus-one [2.1] well what's going to be there at the number at time I-plus-one [2.1] it'll start off with a number [1.7] at the beginning [1.0] at the previous interval [2.0] and we're going to have to subtract [0.9] the deaths [4.6] that happened at that time [1.1] and then we're going to subtract the censoring [0.5] from [0.7] the next time [9.2] er oh [1.1] sorry [0.8] just used an abbreviated notation sorry [1.7] minus [0.2] er [4.8] sorry you quite often do land up writing D-I it's putting in the T-I just to make it explicit that it's on the [0.3] original [1.2] all the times 
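The bookkeeping just described — deaths counted at exact times, a censoring tied with a death treated as happening after it, and the recursion n(t-i-plus-one) = n(t-i) minus d(t-i) minus c(t-i-plus-one) — can be sketched in Python. This is a minimal sketch, not the lecture's own code; the function and variable names are illustrative, and it includes the product-limit step the lecture builds from these counts.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier / product-limit sketch.
    times: observed times; events: 1 = death, 0 = censored."""
    data = sorted(zip(times, events))
    # the t_i are defined as the points at which deaths occur
    death_times = sorted({t for t, e in data if e == 1})
    surv, s = [], 1.0
    for t in death_times:
        # d(t_i): deaths exactly at t_i
        d = sum(1 for tt, e in data if tt == t and e == 1)
        # n(t_i): alive and at risk at t_i; a censoring tied with a death
        # at t_i is regarded as happening after it, so tt >= t keeps it in
        at_risk = sum(1 for tt, e in data if tt >= t)
        s *= 1 - d / at_risk   # S(t_i) = S(t_{i-1}) * (1 - d_i / n_i)
        surv.append((t, s))
    return surv
```

With the made-up data `times=[2, 3, 3, 5, 8]`, `events=[1, 1, 0, 1, 0]` the estimate steps down only at the death times 2, 3 and 5, to approximately 0.8, 0.6 and 0.3.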
[0.7] the other thing obviously we need is [2.8] that [0.8] the number [1.5] at time zero is [0.5] the total number in the study [9.3] okay i've said obviously that in fact yesterday we were discussing how you modify this [1.1] in the occasions when [1.1] when that doesn't actually happen there's some interesting situations [3.1] so this is really by way of setting up definitions which actually give us [3.2] unsurprisingly an estimator that looks the same [0.6] just make that T [1.5] and [0.2] once again we're going to have a product [1.4] this time i'm going to write the product as all T-I [0.9] such that [1.5] T-I is less than or equal to T [4.8] and [2.4] not surprisingly that's the product over [0.7] all those [0.4] D-T-Is [1.4] N-T-Is [11.8] what we can also notice from this is that [1.6] you can r-, [0.2] write this more directly [0.2] as a recursion [1.0] by [1.4] using the same [3.7] product [0.2] but let's write each of these as [1.3] S-of- [1.5] T-I [14.2] and writing it in this product form [0.6] is the background to the other name that this is given [0.7] which is this [2.5] is also [2.2] known [1.1] as the [1.4] product [3.8] limit [1.4] estimator [15.6] one of the other things that can be established on this is that [0.3] again it can be shown to be [2.5] the maximum likelihood estimator [9.2] okay [0.3] well you may need to [1.0] essentially specify [0.2] conditions on [0.2] nothing [0.6] the how is it not changing when there's no information [1.5] and [0.7] the other thing we're going to want to know is the formula [0.6] and [0.7] sorry the the variance and [3.0] we don't need to derive the variance of it again because the [0.9] variance [2.8] is [0.8] [1.2] similar [2.1] in the obvious way [0.7] to Greenwood's formula for the l-, er [0.2] life table [29.6] if you do do some reading around on the survival [0.2] analysis books you might find a couple of other [0.8] approxim-, [0.2] approximations to the variance [0.5] that you can use particularly
if there's no censoring [1.6] if there is no censoring then at any point you're just using a simple binomial estimate [0.8] so you can just use [0.2] the binomial formula [0.9] er [0.9] in general [0.2] you're going to be doing this kind of thing where you've got enough data to want to use a package which will calculate the variance for you [0.4] so the simple approximations are not that critical [1.1] okay i put some data up [0.2] on [1.6] Wednesday [0.8] and [0.6] er [1.5] i have got an estimated Kaplan-Meier curve for that [0.6] which i shall hand out in due course the main reason i'm not handing it out yet is i'm going to ask you to do something [0.3] which has a solution on the back of the sheet [1.2] er [0.5] and that just goes through the calculation [0.9] [0.5] for you so [0.9] you've got the list of times [0.2] and for practice you can try doing it yourselves and then [0.3] see whether you get the same answers [1.1] one of the other things that's useful to do [0.4] which may or may not endear me entirely to [1.6] er people fiddling around watching what i'm up to [0.3] er [26.6] right i should get these lights out [3.6] okay what's quite often [0.4] you'll do with an actuarial life test like a life table [0.5] is draw a survival curve [1.3] and [6.3] you can see the fact that we have estimates only at times in which events occur [1.3] by the jumps whenever there's an event [0.4] and you can see that the jumps don't [0.8] always happen at [0.4] identical intervals [1.6] you can also see in this group this is actually gastric cancer with a couple of comparisons [1.2] with time in days [1.5] you can see a [0.4] a very dramatic [0.6] drop in [0.7] survival [0.5] so that [0.3] five-hundred days you've only got about a third of the people still alive [0.9] and then [0.
6] the survival [0.2] levels off a little bit [3.0] but the other reason for showing this [0.2] curve again there's a there's a printed one in the handout i'll give you so [0.2] you can [0.2] have a look at that being drawn up [0.6] the other reason for showing the curve is that of course generally what we're interested in [0.4] is not just the survival [0.2] i mean we may well want to know what the median survival is [0.4] looks like about a [0.6] a year in the radiation group with chemotherapy [0.7] and maybe about [0.5] a year and a half to two years in the other group [0.8] but we actually would want to do a comparison of those [0.7] in that group it looks as though the dotted line [1.2] has got a better survival [0.9] which is the chemotherapy-only group [0.7] but how would we think about formally testing that [2.4] well that's where we get [0.5] back to a log-rank test [1.3] which i'll talk about [0.3] in formulae for in a minute i'm just trying to see if i can pick up [0.6] let's concentrate on [3.1] just that little bit of the [0.3] graph at the moment [0.9] what we could say at that point is that [0.6] in the one group [0.2] er [0.7] there will be some number [0.3] at whatever the time is [1.1] and that's going to be the number from the chemotherapy and radiation group [1.3] and [0.4] of those [0.9] chemotherapy and radiation group [0.5] there was a death at that [0.4] we can now count the number of deaths at that time [0.9] in this case [0.5] just here there are no deaths [1.6] whereas here [0.2] there'll be some fixed number in the chemotherapy-only group [1.3] and the death at that time [1.0] we can see there is [0.9] a jump so there must have been a death [1.5] and we can think of that [0.7] as then being a comparison of two binomials [0.5] we could even set it up [1.0] as a small [0.
2] two by two table [0.6] and do a formal comparison of [1.1] given the numbers in the two groups so if there were equal numbers in the two groups at this point at risk [1.0] and we saw one death [1.0] then we would divide that one death in terms of expectation [0.5] into half in one group half in another [1.6] and provided we can assume th-, independence we can actually do that [0.7] along the whole curve and that's the basis of the log-rank test [2.8] so to come back to the [0.8] blackboard [2.0] and [0.3] i'll leave that a minute 'cause i can see a couple of pens [2.1] blackboard and chalk in fact i think we're coming back to [1.0] a couple of minutes of you [0.2] talking about what i've just said and whether it makes sense while i clean the blackboard [6.6] or anything else you want to talk about nf0955: okay have you got any questions on that [1.4] no okay [0.4] log-rank test [11.1] i'm not really sure where the word log comes into [0.7] this [0.8] but the reason we talk about ranks is that we simply look at the order inven-, [0.2] which the events occur [0.6] and we don't look at how far apart they are in other words we've got the rank times [0.6] but not [0.2] the actual values of the times [0.7] and the point of a log-rank test is [2.0] to compare two survival gr-, [0.4] two or more survival curves so if we [2.1] wish to [2.
1] test [1.5] whether [1.6] two survival curves are equal [14.5] in a non-parametric [0.6] context so non-parametrically [8.7] we [2.6] can use [2.1] the log-rank test [10.4] okay as an aside that you don't need to write down but it's probably [0.4] quite useful general knowledge [1.1] non-parametric tests are essentially tests based on things like ranks that don't take [0.2] the actual values into account [1.7] we don't happen to teach very much about them at all on MORSE i think in fact this course is the only one that [0.5] probably mentions them [0.5] although i know at least some of you have discovered them in doing your reading of the literature [1.1] er [1.6] they tend to be very popular in the social sciences [0.2] probably historically as much as anything else [1.3] i've said we can use the log-rank test because there is also [0.3] a Wilcoxon [4.6] i've been asked about the Wilcoxon test this morning and the name of the Wilcoxon test [1.3] that actually relates to survival has just escaped me there is a Wilcoxon test that's related to survival [0.4] so if you're reading any of the survival texts [0.4] you might find reference to both of them [0.8] er [0.4] anything based on ranks tends to in-, require a lot more hard work which is why i'm not going to describe it but it does exist [0.5] non-parametric tests are around [0.5] er [0.2] so if somebody at an interview or anything asks you about them [0.4] you've heard of them [0.3] you just haven't studied them [1.0] er [2.1] so e-, end of that aside we're being non-parametric we're comparing two groups [0.4] and as i said what we're really wanting to do [0.9] is [0.8] what we're going to do is look at each point but let's think about if we're comparing and doing a significance test we need a null hypothesis [0.8] and it's slightly complicated in this [0.
2] context [1.6] to write it down [0.6] it's not quite as trivial as some other things so that the null hypothesis [4.4] is [2.0] that [0.3] the cur-, [0.2] the [1.4] curves [1.5] are [3.0] identical [4.4] I-E [2.9] S for group one [0.2] of T [1.2] equals S for group two [0.2] of T [0.8] that's a form we [0.2] usually can write hypotheses in but here [0.6] we have to add in the comment for all T [0.7] T greater than or equal to [0.2] zero [6.3] and [0.5] you probably want to write what i just said [0.7] where [0.8] S-one [1.6] S-two [3.0] are the curves [3.1] for [2.3] groups one and two [10.0] the for all T [0.5] is [0.8] the most powerful way of doing things in other words we want to compare what's happening on the whole survival [0.7] it is actually very common in medicine to look at [0.5] survival up to thirty days after an operation [0.7] or survival up to one year [0.4] that's certainly convenient as a summary [0.3] it's just not as [0.4] informative as it could be [0.5] as a test [0.6] so yesterday er er [0.2] one of the medical professors was talking about survival after [0.2] a difficult operation [0.5] and that was expressed in terms [0.5] of survival to thirty days for [0.2] general discussion [0.7] but the formal tests were done in terms of complete survival curves [1.1] what we actually do is we again think of at about each point [1.3] so at each [2.2] time [1.4] T-I [1.5] at which a death occurs [0.8] at which at least i should say [2.3] one death occurs [7.2] okay and that's one death occurs in either group [4.8] what we do is we [0.2] form [1.8] a two by two [0.3] table [3.4] okay i mean in reality we don't form the whole table but that's what we're doing conceptually [1.7] and what does that table look like [2.2] died [2.6] not dead [12.0] group one [2.9] group two [3.2] and i'm [0.2] probably going to swap to a yeah i'm going to swap to a shorthand notation [0.6] drop the T so let's just make that [0.5] D-one-I [1.1] which can of course now be
zero because just 'cause somebody's died in group one they don't have to have died in group [0.2] two [0.2] as i showed [1.2] D-one-I D-two-I [1.6] and this is total [1.8] at risk at that point so [1.6] N- [0.3] one-I [0.7] N- [0.3] two-I [12.5] two by two tables you've seen before [0.8] and the standard way of dealing with those is look [0.4] doing an observed minus expected comparison [0.8] so [2.5] the [0.4] expected [3.7] number [2.2] of deaths [2.6] in [2.5] group [1.5] one [0.3] at time I [6.0] is [2.1] expected value of the random variable [0.5] D-one-I [1.4] we will put down as [1.0] N-one-I [1.6] N-one-I plus [0.6] N-two-I [1.5] times the total number of deaths that occur [0.4] at that particular point [10.1] okay that's nothing new to probably [2.2] first year [3.2] what i'm not sure is whether you're going to remember the variance expression for that [2.9] anybody remember the variance [9.5] might just about regret starting writing it on that bit of the board [2.7] sm0957: is the not dead column the same as the [0.5] total risk column or [0.4] nf0955: oh sorry i've [3.9] i was just being lazy and not filling in the to-, [0.2] the [0.2] the whole thing [1.2] those were all the all those that are at risk [0.2] of whom [0.3] those died and these ones didn't die at that point [0.6] you can [9.7] could also write [1.7] D-I [0.3] N-I minus D-I [0.8] where you're summing up over the [0.3] subscripts [15.6] right [2.7] the variance term [1.5] involves [0.5] [0.4] quite a lot of elements [0.5] if you really want to s-, [0.2] think about these tables in detail you actually land up with a hypergeometric distribution [0.8] er which would be [0.3] a nice thing to set as a exam question for second year [1.0] but for this year [3.0] we're talking about N-one-I N-two-I [1.3] so [0.3] those two [1.6] multiplied by [0.9] D-I [0.
7] N-I minus D-I [1.1] so you're multiplying [0.4] the four margins [1.4] sorry [0.2] yeah the four marginals together [1.1] row margins [0.5] and column margins [2.0] and then what you divide by [0.5] is a function of the total [0.5] it's actually N-I-squared [0.7] N-I-minus-one [0.7] so if any decent sample size [0.5] is just N-I-cubed [2.3] and [0.4] let's write that one again [1.4] think it should be fairly clear that er [1.5] again it's the kind of thing you write a program for rather than [0.6] enjoy doing on your calculator [0.6] because this is for only one time [1.0] and we're going to need to think about it for a whole lot of times [4.5] so in [1.5] think i'll just got to clean the board anyway i'll clean these two while you [3.0] or at least one of them [0.9] while you decide if you've got any questions on that nf0955: okay so what we do [0.5] that's the expected value for a single [0.5] time point [1.6] what we want to do is to let [1.5] E- [0.8] one expected for group one [0.9] well it's the fairly obvious thing you're going to do you're going to sum up [0.9] over [0.9] all your times [1.1] and [2.1] use E [1.1] er E-one-I [1.3] where [2.2] E-one-I [0.2] is precisely the expected value of E [1.
3] D-I the n-, the expected number of deaths at each interval [8.7] and similarly for [0.7] E-two-I [5.1] obvious thing we're going to want to do as well is have the [0.5] an observed so we're going to have [0.4] the same notation [0.8] observed simply [0.4] equals the [1.6] number of deaths [0.9] actually occurring in group one at each of those times [4.1] and the other thing we're going to want is the variance [2.8] which shouldn't come as a surprise either [1.5] the variance [1.8] as i said there's an independence assumption that we make [0.5] the variance is the sum of the [2.3] variances at each point [2.9] and we could of course [1.4] use a shorthand notation [1.2] just calling it v-, [0.6] V-I [0.2] at each point [2.9] why am i calling it V-I and not V-one-I [9.4] yes [4.7] sm0958: [0.6] nf0955: 'cause if you look at that [0.2] if i swap those round all i'd be doing is swapping the [0.3] position of the N-one and N-two it would make no difference [1.4] okay so we've got [0.7] an expected value an observed value [0.2] a variance [0.7] so the log-rank test comes back to something that [1.3] [0.7] would often be denoted by a Z statistic [2.2] so the log-rank [3.0] test [0.6] uses [1.2] Z equals [2.0] observed [0.2] minus expected [1.6] divided by [1.4] the square root [0.2] of the variance [4.6] and it uses that [2.7] that's why we tend to use Z [0.2] compared [2.7] to the standard normal [9.7] there's another way in which it's quite commonly done [1.0] alternatively [6.4] Z-squared [2.3] in other words something that's [1.4] obviously looks like a chi-squared term [0.9] O-minus-E squared [1.8] over V [1.8] is compared to a [0.
8] chi-squared on one [0.2] degree of freedom [2.2] and there's a reason for mentioning [0.3] that nf0955: it's not immediately obvious how we're going to generalize a two by two table which is quite a nice thing to say a two by three table [1.4] which is why one can think of a simpler version [0.7] than the log-rank [0.7] the log-rank is what you would use if you've only got two groups but if you want to think about a generalization [1.3] alternatively [2.3] we can [0.7] use [2.5] and as i don't use this very often i definitely don't remember it [0.8] we use something that requires us to think about E-two [1.3] which is [1.5] pretty simple that's just the [0.5] sum of the [3.2] expected value sorry of D- [0.9] two-I [0.2] in the obvious notation [5.7] and [3.4] if you want to write it explicitly so that's [0.7] N-two-I [0.9] D-I [1.4] over N-I [7.0] er similarly O-two is the [1.2] should really have memorized this [0.5] the er [0.4] obvious definition is just the sum of all the deaths [8.5] the one thing about that Z- [0.4] squared on one degree of freedom that doesn't look completely standard is its being divided by the variance [1.5] so for this one we just use that completely standard form use [1.3] X-squared equal to [1.3] E-one-minus- [0.5] O-one [0.2] squared [0.8] over [0.3] E-one [0.2] plus [1.9] E-two-minus- [3.0] O-two squared [1.1] over [1.5] E-two [2.9] and anybody who feels really energetic can start playing with the formulae and seeing just how different they might get [2.6] we're again referring to [1.2] a chi-squared just on one degree of freedom [4.0] but what can we say about this well [1.7] the disadvantage [0.6] why don't we use the simpler one well the disadvantage [2.1] is [1.0] that [1.7] X-squared is [0.5] conservative [4.9] and by conservative [2.2] i mean that if you had something that was at the borderline say over five per cent significance level [0.8] if you did the log-rank test it would show as significant [0.8] if you did the [0.
4] simpler test it would tend to show not as significant so that's what we mean by conservative [1.0] but [0.2] looking at that formula [2.9] the advantage is well if you think of the question i asked you how do you generalize this to three groups [4.0] if instead of having [0.3] chemotherapy and radio-, [0.6] versus chemotherapy plus radiotherapy we'd had a third group [0.3] radiotherapy only [1.5] how would you generalize [0.3] this what's the obvious generalization [5.4] sm0959: the term for [0.8] E-three [0.5] nf0955: you just put in an E-three term or however many you like because [0.5] these terms are now [0.4] pretty obvious so the [0.7] advantage [4.4] is [1.0] ease [0.6] of [1.7] generalizing [3.9] to [0.7] N groups [10.8] okay [1.1] what i thought i would do i thought it might have been slightly nearer the middle rather than nearer the end of the lecture [0.8] er [1.7] i take it you've all got the dataset from [0.2] that i m-, m-, [0.2] listed last week 'cause i haven't written it down [2.2] for the control group [5.4] i'll give you the first few observations not necessarily all of them i mean i'll write them all down [0.6] but [0.2] you don't need to copy them all down what i want you to do is to try to write down [0.4] the first couple of lines of a table [0.4] to calculate what you need [0.
6] for [0.9] a log-rank test [0.6] er the table i've got has got [3.9] ten columns so have a think about which columns and what you're going to put into those columns [2.3] 'cause that way [0.8] you're more likely to remember it if i decide to put this into an exam [4.6] which is the other advantage of a simple procedure [0.6] put it into an exam more easily [0.7] okay so in the control group the [0.8] times were [0.3] two [0.5] three nf0955: and i'm [0.3] planning to ask somebody to come and write up what they've thought on the board so [0.8] as a just a gentle aid to actually addressing the problem nf0955: probably give you another [0.4] at least three or four minutes if not more [1.7] and i probably won't ask for a volunteer [0.3] probably just ask someone [4.6] if i ask for a volunteer i know who's hard-working and who will probably have an answer so [1.6] those of you who are quiet i've no idea how good you are nf0955: okay i'm going to resist the temptation to use the advant-, the fact that i know some people's names and not others [14.9] okay [0.4] er but what i'm going to do is go for colour so [0.8] gentleman in the nice red sweatshirt [0.8] [laugh] [0.3] i think you guessed that one was coming when i said colour [1.3] [laugh] come on come and write down what i don't [0.3] doesn't matter whether you've got it right or wrong sm0960: down [0.5] nf0955: yes but you've had a discussion 'cause i can see that [1.2] sm0960: [laughter] [0.8] nf0955: [laugh] [0.6] did anyone come up with anything [0.3] about how you're going to tackle it [3.6] i can start working systematically through all of you [0.7] namex [2.3] what column headings would you have had [1.3] sm0961: you've got to look at the [0.9] it has something to do with two sets of [0.2] trials you have the control group and the [2.3] the nf0955: drug group sm0961: the drug group from last time nf0955: yeah [5.0] okay so what do we need to have in those [2.5] er let's go to the back row [1.2] what what 
information are we going to need to have on those [0.7] to fulfil the formulae [1.3] sf0962: [2.5] nf0955: so we're going to have the time so we're going to have the number [0.6] dead at a particular time [1.3] and [2.1] sf0962: [1.1] nf0955: total number at risk [1.2] and i should [0.2] probably put in [2.2] that T is one for [3.6] control [0.2] drug [0.2] yeah [0.2] so most of you worked that much out yep [2.8] how are you going to work out the ex-, what do you need to work out the expected numbers in either of those groups [2.6] oh [0.8] i rubbed the formulae off but you've got in your notes [3.9] sf0963: total number of deaths [2.7] nf0955: D-T [0.5] think i'll sw-, [0.2] swap between T and I [1.0] and therefore so the total number [2.1] and that will allow you to go for [0.8] E-one [0.3] at time T [0.4] E-two at time T [1.3] and the variance at time T [1.9] so you're all now going to be able to remember that [0.2] without having to be told it [0.7] she says cheerfully [1.4] er [1.2] what's the very first time you've got [0.3] in the datasets we're talking about [0.5] and i should actually say [0.2] er [3.6] we're starting with twenty-two people in each group what's the first time [0.3] between those two datasets [1.3] middle of the threesome [1.8] what's the first time [0.7] sm0964: er [0.4] T [0.5] nf0955: right [0.7] and what do we need to fill in for the rest of that column [1.2] sm0964: sorry [0.2] nf0955: what do we need to fill in for the rest of that row [1.0] sm0964: er [3.0] the deaths [0.4] nf0955: yes [0.6] sm0964: i er [0.4] er nf0955: which are [laughter] [0.2] sm0964: er [0.5] two [0.7] well one in each [2.8] so two for the [0.5] next one [1.6] er [0.6] yeah [2.0] [laughter] forty- [0.6] two [0.3] nf0955: it's actually forty-four it's the tot-, it's the the two together sm0964: nf0955: yeah [0.3] then goes down to forty-two for here [2.5] expected [1.0] it's really easy [0.
3] let's go to the [0.3] person sitting on his own [0.8] expected number of deaths in each group [1.4] well you've got you've got a f-, [0.2] er formula up there [1.0] what's the expected number of deaths in group two [3.7] how many deaths in group two [1.1] at this time [2.7] sm0965: [1.7] nf0955: yeah [0.5] one [1.8] how many deaths in total [1.9] sm0965: [2.6] nf0955: and how many if it was total number [4.8] oops sorry it was total [3.7] sorry i'm putting the [2.8] forty-four at the bottom and the twenty [0.2] sorry [0.6] total [0.9] total number er [0.3] at risk in one group twenty-two [0.2] total number forty-four [0.6] and [3.2] sorry [0.2] i'm [3.3] going to write the answer down much more easily which is what i did rather than writing the formula down [1.1] let's just do it [1.0] the way i was thinking of it [0.7] the total numbers [0.2] which is the way you just think number in this group is a n-, proportion of the number of that group is a half which is why i was writing a half down [0.4] and the total number of deaths was two [0.5] so the expected number has to be [0.9] equal to one [2.6] this one's a really easy one 'cause there's one death in each group [0.5] and the group sizes are equal so one expected one expected [0. 
6] and the variance term [4.7] okay the next time is [1.1] six [3.1] there's only one death [3.8] but the group sizes are equal again [0.6] sm0965: [0.8] nf0955: sorry sm0965: are they not three and four [1.3] nf0955: oops [0.2] i'm sorry [0.9] this is time three not time six [0.5] [laugh] thank you [0.5] [laugh] [0.7] er at time three [0.2] there is one death in the control group [0.4] no deaths there group sizes are equal [0.8] so we can see the expecteds come in at a half each [2.9] and [0.2] for the rest of the table it's all [0.2] written out in the handout [2.9] and that [0.4] wraps up the [1.4] non-parametric side of survival analysis [0.8] actuarial life tables [0.5] Kaplan-Meier or product limit [0.8] log-rank to compare survival curves so on Monday we'll go over to parametric [0.5] methods for survival [1.0] so [0.3] any questions and the handout is at the front so you don't need to worry about [0.6] copying down this 'cause it's all on the handout
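The whole table-building loop just worked through on the board — at each death time form the two by two table, accumulate observed, expected and the hypergeometric variance, then refer Z to the standard normal — can be sketched end to end in Python. This is a minimal illustrative sketch, not the lecture's own program, and the toy data in the usage note below are invented rather than the twenty-two-per-group dataset from the handout:

```python
import math

def logrank_z(times1, events1, times2, events2):
    """Two-group log-rank sketch: Z = (O1 - E1) / sqrt(V), referred to N(0,1).
    events: 1 = death, 0 = censored; a censoring tied with a death is
    treated as happening after it, as in the lecture."""
    g1 = list(zip(times1, events1))
    g2 = list(zip(times2, events2))
    # the t_i are the times at which at least one death occurs, in either group
    death_times = sorted({t for t, e in g1 + g2 if e == 1})
    O1 = E1 = V = 0.0
    for t in death_times:
        n1 = sum(1 for tt, e in g1 if tt >= t)        # at risk in group 1
        n2 = sum(1 for tt, e in g2 if tt >= t)        # at risk in group 2
        d1 = sum(1 for tt, e in g1 if tt == t and e == 1)
        d2 = sum(1 for tt, e in g2 if tt == t and e == 1)
        n, d = n1 + n2, d1 + d2
        O1 += d1
        E1 += n1 * d / n                # E[D_1i] = n1 / (n1 + n2) * d_i
        if n > 1:                       # product of the four margins over N^2 (N - 1)
            V += n1 * n2 * d * (n - d) / (n ** 2 * (n - 1))
    return (O1 - E1) / math.sqrt(V)
```

For the first row of the worked table — twenty-two at risk in each group, one death in each — the loop gives an expected count of 22/44 times 2, i.e. one per group, matching the calculation on the board; and for two identical made-up groups such as `logrank_z([1, 2], [1, 1], [1, 2], [1, 1])` the statistic is zero, as it should be under the null hypothesis.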