nf0951: to start with er i gather that Newton-Raphson having happened at A-
level was rather a long time ago and you had a bit of fun with it on Friday so 
i thought i'd just give you a quick reminder the idea is we've got some sort of 
function that crosses zero what we want to know is the point X-star such that F-
of-X-star equals zero you shouldn't need to write this down by the way er and 
there are all sorts of ways we could find that what Newton-Raphson depends on 
is saying well we'll take a guess at where we're starting so we'll call that 
guess X-zero and we'll evaluate F-of-X-zero and then we have to decide what to 
do having seen what size it is and Newton-Raphson is based on the principle 
that what we do is look at the tangent at that point and we follow the tangent 
down 
and that will bring us closer to this root so we follow the tangent down to
X-one well one way you can think about that obviously the slope of the tangent
is F-dashed-of-X-nought which is the gradient and what is the gradient well the
gradient is F-of-X-nought minus zero divided by if we have mm i don't know if
we really need an origin let's put an origin somewhere but it's divided by
X-nought minus X-one
so one way of thinking of Newton-Raphson is precisely this that we're taking a 
triangle we then do the same thing we come here follow the gradient up and 
we're very
nearly at the spot so conceptually that's what Newton-Raphson's doing if you 
land up forgetting it that's one way of remembering it if if you like the 
geometrical way of remembering it an alternative is the thing we were lo-, 
talking about a lot in the asymptotics which was series expansions so what 
we're interested in is this point where F-of-X equals zero well let's do an
expansion F-of-X-star is approximately equal to F-of-X-nought plus X-star minus
X-nought times the first derivative at X-nought and in either case what we do
is we basically just solve those equations so if we look at this equation and
set F-of-X-star to zero what we're saying is that X-star minus X-nought equals
minus F-of-X-nought over F-dashed-of-X-nought and then we're
going to actually i have a feeling you might have to 
check me on the signs on this one off the top of my head we've got a solution 
so typically we take this as our X-one and then we'd iterate okay that that's 
the basic principle you can check whether i've memorized which way round that 
goes er by looking at solving for that one X-one yep the other point is that 
it's actually much easier to remember Newton-Raphson in a simple form like this 
write it down and then fill in what the function is in that second exercise 
rather than trying to write it in too full of generality and in terms of 
tutorials the next tutorial's going to be a week on Wednesday so you'll get 
another chance to both look at those exercises or ask about the projects okay 
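The Newton-Raphson reminder above can be sketched in a few lines of Python. This is only an illustration: the function name `newton_raphson`, the tolerance, the iteration cap and the example function x squared minus two are all assumptions chosen for the sketch, not anything from the exercises.

```python
def newton_raphson(f, f_dashed, x0, tol=1e-10, max_iter=50):
    """Iterate x_new = x - f(x) / f'(x), i.e. follow the tangent down to zero."""
    x = x0
    for _ in range(max_iter):
        slope = f_dashed(x)
        if slope == 0:
            # A flat tangent never crosses zero, so the update is undefined.
            raise ZeroDivisionError("tangent is flat at x = %r" % x)
        step = f(x) / slope
        x = x - step
        if abs(step) < tol:
            return x
    raise RuntimeError("did not converge within max_iter iterations")


# Illustrative function (an assumption): f(x) = x^2 - 2, whose positive root is sqrt(2).
root = newton_raphson(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(root)
```

Writing the update in this simple form and then filling in the function and its derivative mirrors the advice about not writing it in too full a generality.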
so that was by way of being an aside because what we're starting to talk about
now until pretty well the end of term modulo the tutorials and some lectures on
ethics is survival
analysis so i'm just going to take you back to the first lecture where we had a 
whole lot of lifetimes of people and the reason for the odd shape was these 
were all people who were dead and these were a mixture of people who are alive 
and dead and i showed you what a survival plot looked like and that's a fairly 
standard survival plot where we start with everyone being alive and then we 
drop the estimates in a step function so that if we look at this group those 
who are wheelchair-bound we see that seventy per cent of them survive to age 
ten for this cohort i know the group who's studying cerebral palsy er 
somebody's got a friend who wasn't expected to live past primary school age but 
you can see from this that even quite severely handicapped people have got a 
good chance of living beyond primary school age right so er right i'm actually 
going to put this down now you might 
like to think about what the crucial elements of survival analysis are that 
make it a different topic i'm just going to move the screen which justifies 
having a separate section about it okay so cerebral palsy's rather a large 
dataset it's difficult to draw some examples so what i'm going to do is give 
another couple of examples of the data and then i'm going to go into a 
discussion of the crucial definitions and the actual definitions themselves so 
the topic's called survival analysis and most people tend to think of lifetimes
of human lifetimes with the word survival you do tend to think of death it's
also called failure time analysis because it's used in engineering where you
stress test components see how long they last it's used in economics how long
are people unemployed or how long do companies survive and is it different for
greenfield versus brownfield sites it
can even be used for lengths how long a piece of wool can you get before it 
breaks if you're spinning it with one or two strands that's going to be 
different from spinning with multiple strands er and that's actually Cox and 
Oakes which is one of the books that's mentioned David Cox actually worked in 
the Wool Research Institute i think it was called precisely on lengths of yarn 
so that that is actually quite a major application but as i say typically we're 
going to be thinking in medical examples of people doing something like 
entering a screening programme entering a trial being born and then we follow 
them up till an event and the reality of what will actually happen is that 
we've got calendar time so im-, got say nineteen-ninety here two-thousand here 
but not everybody turns up in nineteen-ninety at the beginning so we get people 
coming in maybe dying coming in living coming in dying at some 
stage carrying on living er that person's just emigrated to Australia and we've 
stopped the study in two-thousand so in calendar time that's the kind of 
pattern we've got but in terms of what we want to analyse we'll much more 
typically just use time from zero up to we'll try to draw this reasonably to 
scale zero up to ten so these first two points whoops are pretty much at the 
same point but then we start having to think about these points coming back to 
being censored that point's if you want to draw yourselves a more accurate 
picture er you can so the idea is that we can either measure on a scale of 
calendar time or in some sense an exposed time so if we're thinking of 
something like 
hormone replacement therapy and whether it carries a greater risk of heart 
attack or stroke we're interested in the length of time women are on hormone 
replacement therapy we may secondarily be interested in the date 'cause it'll 
change the prescriptions of the components but primarily we're just interested 
in the length of time another way you may actually get data er is that you 
don't actually get it in that form much more likely in this example from kidney 
register data it's quite likely that instead of actually getting a graph like
that somebody will have done the summary so this is cancer registry type data
and the kind of thing you would get is years since diagnosis
zero to one one to two two to three three to four so 
you'd get subdivisions by year and those of you who are going to do actuarial 
science will find you tend most typically to work in subdivisions of year as 
opposed to exact times that i've used there then you're going to have the
number at the start a hundred-and-twenty-six the number of deaths in that year
was forty-seven and the number what are called lost to follow-up lost to
follow-up's meant to be quite a general term because remember in the cerebral
palsy case you're going to land up with people who are still alive so you're
losing them in that sense okay so we had nineteen lost in that first year and
then the rows go sixty five seventeen then thirty-eight two fifteen then
twenty-one two nine then ten zero six then four zero four so the years probably
should have gone up to year six and the kinds of questions
you typically find people interested 
in this are well one thing that's very widely used in cancer er is one and five 
year survival rates so what is the let's just say five year survival rate you 
might think about medians what is the median survival and what is the life 
expectancy and take that to be the formal mean okay just so you can focus on 
the kinds of problems have a quick look at this data and discuss with the 
person next to you perhaps that first question how are you going to estimate 
the five year survival rate and can you think of anything that looks obvious 
but that's going to be wrong and i'll give you about half a minute on that then 
i might ask somebody to answer the question 
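As a rough aid for that discussion, the figures read out above can be laid out as rows and checked. This sketch assumes each triple is number at start, deaths, lost to follow-up for successive years since diagnosis; the variable names are made up for the illustration. It compares the deliberately simple answer of ten out of one-twenty-six with what the proportion would be if nobody lost to follow-up had died within five years. It does not attempt the proper life table estimate.

```python
# Rows reconstructed from the figures read out in the lecture (an assumption
# about layout): (years since diagnosis, number at start, deaths, lost to follow-up)
life_table = [
    ("0-1", 126, 47, 19),
    ("1-2", 60, 5, 17),
    ("2-3", 38, 2, 15),
    ("3-4", 21, 2, 9),
    ("4-5", 10, 0, 6),
    ("5-6", 4, 0, 4),
]


def is_consistent(rows):
    """Each start count should be the previous start minus deaths and losses."""
    return all(
        n - d - w == n_next
        for (_, n, d, w), (_, n_next, _, _) in zip(rows, rows[1:])
    )


n0 = life_table[0][1]
first_five_years = life_table[:5]
deaths_by_5 = sum(d for _, _, d, _ in first_five_years)
lost_by_5 = sum(w for _, _, _, w in first_five_years)

# The simple answer from the lecture: ten out of one-twenty-six.
naive = life_table[4][1] / n0
# What the proportion would be if none of those lost had died within five years.
if_no_lost_died = (n0 - deaths_by_5) / n0

print(is_consistent(life_table), deaths_by_5, lost_by_5)
print(round(naive, 3), round(if_no_lost_died, 3))
```

More people are lost to follow-up in the first five years than are known to have died, which is why the simple ratio is so far from the other figure.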
nf0951: 
anybody willing to o-, volunteer an obviously wrong answer simple answer but 
that is likely to be wrong for the five year survival rate is ten out of one-
twenty-six likely to be a good i-, estimate okay you agree it isn't a good 
estimate why broadly speaking 
sf0952: don't know why 
nf0951: 'cause of all the people you've lost to follow-up so that's basically 
why all of these questions although they're quite reasonable questions that's 
why they're going to need some sensible er methods that allow for all those
people who get lost so in order to define s-, survival times there are three 
critical elements so for the definitions of i'm going to carry on calling them 
survival times every now and again 
i'll make reference to these other things that aren't necessarily survival very 
first thing we need to know for a survival time is the start point start point 
for each individual and the first examples we can think of might make you 
wonder why we need to discuss this 'cause the kinds of examples you might like 
to think of are date of birth for cerebral palsy or date of entry to a 
randomized control trial fairly obvious that that's the date you should use 
randomized trials accrue people over time just as people come in to cancer 
registries over time or anything else where it starts [cough] to be slightly 
more complicated is if you think of epilepsy which i've mentioned before and by 
the time somebody who has epilepsy is randomized into a trial they've got to 
have shown symptoms of the disease so you might well think shouldn't we start 
from when they first showed symptoms of the disease rather 
than just from entry to the trial the advantage of starting at entry to the 
trial is it's going to be unbiased because of the randomization mechanism you 
should have equal lengths of time before in both arms and also because recall of first
events is going to be quite poor and so in fact with epilepsy what you do is 
you do start from date of randomization you also take into account when the 
first symptoms were and how bad things have been but as a covariate not as the 
start point er so if you want to put a n-, you know an aside s-, some sort of 
remarks on that it's not compulsory but things like randomized control trials 
are quite easy something where it becomes much more critical to define the 
start time is something like screening for disease there's still quite a big 
debate about the value of screening for breast cancer er it's been in the media 
a couple of t-, in the last couple of years a fair bit because Scandinavians 
have said not only is this a waste of money 
it actually kills more people than it benefits and the head of the U-K breast 
screening has said no no no we are wonderful well what's the problem the thing 
about screening for a disease is the whole point is you want to pick the 
disease up early before there are symptoms so you can intervene now what that 
means is if you think about it even if you did nothing the time from first 
saying somebody's got breast cancer to death is going to be longer in a 
screening programme than if you wait for symptoms if you want to think of it as 
a line again you think of somebody ambling along and at this point they have 
symptoms and they go along to see their G-P and at this point they die and what 
a screening programme tries to do is to say 
let's see if we can leap in here with some kind of tests and pick them up so if 
you measure from the time of screening it's always going to be longer than from 
symptoms so the mere increase in length of time doesn't tell you anything at 
all about the benefit of the screening programme fact the only real way you can 
tell about the benefits of screening programmes is i-, well ideally randomized 
control trials but otherwise you've got to have two populations one screened 
one not that's what all the debate is about what does the evidence from those 
kinds of trials show do they show a benefit to screening or not and the other 
occasion where you would get er slightly have to think carefully about your 
defined point would be exposure to disease so the case control cohort studies 
we've talked about if you're thinking of something like asbestosis or even s-, 
exposure to cigarette smoking you want to do it from the 
start of the exposure it may well be confounded with age i-, you may want to 
know whether starting to smoke at age ten has a different effect from starting 
to smoke at age twenty but you need to think about the exposure so if you like 
briefly the the kinds of issues here would be comparing a randomized control 
trial versus screening and exposures so in ec-, exposure to risk factors and 
it's those latter two that that warn you why this is such an important point 
and then the thir-, second thing to think about is the time scale or we might 
change that to 
saying the measurement scale and again that's essentially because if we're 
thinking about the generality of survival analysis typically when we're 
thinking in medical terms we are just thinking of days months years that kind 
of thing we could be thinking in engineering about the load on a spring er 
that's what you do in stress testing load on a spring load on a bridge load on 
an aircraft wing to see when the rivets pop out er that sort of thing what kind 
of impact er concrete can sustain if you're dropping loads on it and as i said 
things like yarn you might have thickness you might have length before things 
break down er and so you just need to agree on that this is also incidentally 
one point where one of the statistical er groups of models come in is whether 
you're going to transform the time scale so should you be modelling on actual 
time scale or on log of time scale so we've got a beginning we've got a time 
scale clearly the thing we need 
is an end so we need a well defined unique event death being the most common 
one that we'll be dealing with but in fact one of my colleagues when i was 
doing a PhD the er failure point they were looking at was the birth of a baby 
they were measuring length of labour so rather ironically er the failure at
that stage was a successful live birth er where does this get complicated just 
as i point out a s-, a few issues here where you might need to think carefully 
well as i say if it's death it's not too tricky but quite often we're going to 
be looking at things like a recurrence of cancer or you could look at that and 
there you could have multiple events so you'd want to say first recurrence if 
you're going over to something like epilepsy or asthma where you have repeated 
er attacks quite often you won't 
be using survival analysis you'll be using methods for modelling stochastic 
processes which some of you will have studied [cough] and you may or may not 
want death from a particular cause you may only want deaths from lung cancers 
so any deaths from heart attacks might not be of interest well that's fine the 
most the one that's going to mean that we've got complications in life is that 
defined end point death or a recurrence of the tumour or as i said in the case
of labour statistics birth can be your end point all studies of premature
children and delaying er the birth of the child birth will be an
end point er study that biological sciences was hoping to do but of course the 
whole point is lost to follow-up and what do we do about loss to follow-up well 
that brings us into the major definition that we have in survival analysis of 
censoring and so i'll call this four it's the first three one two three that
are the essential things to have for survival analysis four is required to make
sense of some of the rest of this which is censoring okay most of you might
have thought of censoring in terms of governments telling you what films you
can or can't watch or extracting parts of newspapers most of you won't have but
some of us have been in countries where the newspapers appear with blank
sections 'cause it's been written out er and that's the same
[cough] 
same meaning the reason the word's chosen in this context
censoring is just saying we have no more information so censoring of times and 
the mechanism in which this is viewed is to say that we have for each
individual er where am i going oops up here i think for each individual a time
C-I beyond which we do not observe them okay so this means in fact that the
time that we're actually going to observe is made up of two parts so if we let
an individual's actual lifetime what we would see if we were able to follow
them up indefinitely be X-I then we observe the survival time which we're going
to call T-I and T-I
is a function of two things X-I and C-I so can you write down what that 
function must be the observed survival time is what function of the actual 
survival time and censoring simple function if you think of that top left board 
where we've got crosses and then we've got the lines that go into circles or 
keep going on right and if we were to censor at two-thousand what do we do with 
any line that goes through that two-thousand mark we take the first line are we 
going to s-, observe the censoring 
time or the death time right we're always going to observe the er i'm going to 
regret this aren't i this person had a notional censoring time we've got 
notional censoring times for these people and we'll al-, always observe the 
minimum of the death time and the censoring time because that individual we'd 
stopped watching at two-thousand so we wouldn't have seen them so the the 
function we want here is min T-I equals the minimum of X-I and C-I ah but we
don't only observe the minimum 'cause that wouldn't be much use to us we also
need an indicator function and this'll sometimes be given as a death indicator
and sometimes as a censoring indicator we'll
call it delta-I which is going to equal one if X-I is less than or equal to C-I 
in that case you can think of it as indicating that the death has occurred and 
it's going to equal zero if X-I is greater than C-I 
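The pair just defined, the observed time as the minimum of lifetime and censoring time together with the indicator delta, can be sketched as below. The function name `observe` and the four individuals are hypothetical, invented only to show the definition at work.

```python
def observe(lifetime, censor_time):
    """Return (t, delta) with t = min(x, c) and delta = 1 if x <= c, else 0."""
    t = min(lifetime, censor_time)
    delta = 1 if lifetime <= censor_time else 0
    return t, delta


# Hypothetical individuals: (actual lifetime x_i, censoring time c_i),
# both measured in years from each person's own start point.
people = [(3.0, 10.0), (12.0, 8.0), (5.0, 5.0), (7.5, 2.0)]
observed = [observe(x, c) for x, c in people]
print(observed)
```

In real data only the pair t and delta is available for each person; the lifetime of anyone with delta equal to zero is known only to exceed t.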
in other words we haven't actually observed the event in all the analysis that 
we do we're going to be assuming that censoring is non-informative that we're 
not going to learn anything from the censoring the ways in which censoring turns
up are actually given the names type one and type two as i quite often
find it difficult to remember which one is which i'm not going to ask you to do 
that type one censoring is the kind of thing you most often get in medical 
statistics you have a study it has to finish at some point so if it finishes at 
two-thousand or if it finishes at a series of dates so it finishes in two-
thousand in the Walsgrave Hospital but we carry out data collection in one or 
two other hospitals at a later date but we're still finishing at fixed times 
then that's called type one censoring the reason you don't observe people isn't 
because you've decided i'm going to ignore that person it's for a fixed time 
you've 
stopped the study type two censoring is much less common in medical statistics 
but it's very common in engineering which is to say i'm going to observe this 
cohort of individuals until a certain number or certain percentage of them have 
died or failed so i'm putting twenty items on test at different loadings and 
once we've put the loads up to the point at which ten of them have failed i'm 
going to stop the study so type two censoring is dependent on the number of 
failures so it actually does depend on the whole time process up to that point 
the way in which it determines when the tenth failure will occur but what it 
doesn't do is depend on anything in the future and then you can get other kinds 
of censoring mechanisms but what's crucial is that in this course well there is
research in other things but in this course and in most of the work that you'll
look at we assume that censoring is independent in a fairly general sense of
survival what we want more formally is that the probability that capital T is
greater than some value little t given that this was censored at time C well
that shouldn't depend on the fact that at that particular point the time has
been censored so we just want that to be equal to the probability that capital
T is greater than little t given that we already know that the time is greater
than C not the fact of censoring just the sheer time so this would hold true
for all times before that actual censoring time so
having got the definition of the survival time the main variable that we use
within survival is the survival function almost invariably called S for
survival S-T-of-T that is S subscript capital T of little t is the probability
of the random variable capital T being greater than time little t the
probability of surviving beyond time little t so how
does that relate to the functions you're used to dealing with with random 
variables tell the person next to you how you'd write that in terms of a 
familiar function and what the function is any volunteers apart from the usual 
suspects does it look like anything you recollect meeting before yes puzzled 
looks [laugh] someone be kind to me where have you seen a function like this 
before but what was the function you've all seen it [laugh] some volunteer no 
no idea probabilities what's one of the 
standard things we know about probabilities and so how can you convert that 
probability statement into another probability statement 
sf0953: so if like S-T-of-T is one minus the probability of T is less than 
nf0951: thank you which is usually known as 
ss: 
sf0954: density 
nf0951: cumulative density cumulative density function or distribution function 
so most of the things you'll have done before in likelihood have basically
worked on the density function survival works on one minus the distribution
function in other words those plots i was showing you where i talked about the
probability of surviving beyond some time those were plots of an empirical
survival function which was actually one minus your standard cumulative
distribution function right er the next
logical thing for me to do is to start talking about how we do a life table 
analysis of that data and given that it's lunchtime and you've got a l-, other 
things to do i'm actually planning i said i'd try to finish these lectures 
slightly early most times so i think it's actually more sensible for me to stop 
at this point answer any questions and see you again on Wednesday morning at 
five past nine
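The relation between the survival function and the distribution function can be checked numerically. This is a minimal sketch assuming every lifetime is fully observed, with no censoring; the lifetimes are made up for the illustration, and censored data would need the life table methods still to come.

```python
def empirical_cdf(lifetimes, t):
    """Proportion of lifetimes less than or equal to t."""
    return sum(1 for x in lifetimes if x <= t) / len(lifetimes)


def empirical_survival(lifetimes, t):
    """Proportion of lifetimes strictly greater than t, i.e. surviving beyond t."""
    return sum(1 for x in lifetimes if x > t) / len(lifetimes)


# Made-up, fully observed lifetimes in years (an assumption for illustration).
lifetimes = [2, 4, 4, 7, 9, 12, 15, 15, 20, 31]

for t in (0, 4, 10, 31):
    s = empirical_survival(lifetimes, t)
    f = empirical_cdf(lifetimes, t)
    print(t, s, f)
```

At every time the two proportions add to one, and the survival column starts at one and steps down, which is the shape of the survival plots shown earlier.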