## Probability question

I'm hoping that one of the more mathematically-minded UKC members can help me with this question, as although I can do speling and gramer good, I'm pretty hopeless at maths:

There are 900 people working in a mine during a given year and their hours spent underground are logged. It's calculated that for every 400 hours spent underground, one miner is killed in a mining accident.

Miner A spent 400 hours underground
Miner B spent 800 hours underground
Miner C spent 1,200 hours underground
Miner D spent 1 hour underground

What was the probability of each being killed?

I'm thinking:

Miner A = 1 in 900
Miner B = 1 in 450
Miner C = 1 in 300
Miner D = 1 in 360,000

Is that correct?
In reply to Foxache: Depends on whether the mine replaces dead miners. If not, this makes things more complex. Imagine that in the first 400 hours the chance of any one person being killed is 1 in 900. If the dead miner isn't replaced, then it's 1 in 899 for the next 400 hours, and so on.
>
>
> Is that correct?

Probably...
In reply to Foxache: ...seems you don't do grammar or spelling too well either.

One thing to remember about probability is that it should be expressed as a number ranging from '0' to '1', where '0' means no chance of happening and '1' means it is guaranteed.

Disclaimer: my stats is kinda rusty. Someone else might know better.

Your answer is very slightly wrong I'm afraid. You can see this by working out the probability of a man who works 1 million hours being killed - by your method it's more than one, which makes no sense.

I'm assuming the accident rate is 1 accident per 400*900 = 360000 hours. If we assume the accidents happen at random, then they will follow a Poisson distribution.

The Poisson distribution states that the probability of no accidents (k=0 in the usual formula) is:
P = exp(-t/T)
where t is the time, and T is the average time between accidents. The risk of being killed is then 1 - P, so the risk to your miners is:

A: 0.11105% (1 in 900 would be 0.11111% so very close)
B: 0.22197% (1 in 450 is 0.22222% so also very close)
C: 0.33277% (1 in 300 is 0.33333%, again, very close)
D: 0.00027% (1 in 360000 is correct to 6 significant figures)

But now our miner E, who works 1 million hours:
E: 93.7823% (less than one, as it should be)

In other words, you can just divide the time by the mean time between accidents to get the probability if the event is rare - which in your case it is. If the miner is going to be underground for any substantial fraction of the mean time between accidents, you have to do the complete calculation.
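As a minimal sketch of the calculation above, assuming the reading of one death per 400*900 = 360,000 miner-hours, the exact figure 1 - exp(-t/T) can be compared with the t/T shortcut. The function and variable names are made up for illustration.

```python
import math

# T: assumed mean miner-hours between deaths (one death per 400*900 hours)
MEAN_HOURS_BETWEEN_DEATHS = 400 * 900

def risk_exact(hours, mean_hours=MEAN_HOURS_BETWEEN_DEATHS):
    """Probability of at least one fatal accident in `hours`: 1 - exp(-t/T)."""
    # -expm1(-x) equals 1 - exp(-x) but keeps precision when x is tiny
    return -math.expm1(-hours / mean_hours)

def risk_linear(hours, mean_hours=MEAN_HOURS_BETWEEN_DEATHS):
    """Rare-event shortcut t/T; this can exceed 1 for very long exposures."""
    return hours / mean_hours

miners = {"A": 400, "B": 800, "C": 1200, "D": 1, "E": 1_000_000}
for name, hours in miners.items():
    print(f"{name}: exact {risk_exact(hours):.5%}  shortcut {risk_linear(hours):.5%}")
```

For miners A to D the two columns should agree closely; for the hypothetical miner E only the exact form stays below 1.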

The question of miner replacement only comes up if either:
1) You want the probability of any miner being killed, not just miner A
or
2) You get one accident per 400 hours however many men are in the mine

Which is not how I read the question. Though interpretation 2 is possible, the question's not totally clear on that.

Yes you're right. Probability is one of those cases where good old fashioned common sense and intuition are often wrong.
> Yes you're right. Probability is one of those cases where good old fashioned common sense and intuition are often wrong.

You didn't get it wrong because you applied "good old fashioned common sense", you got it wrong because you misread the question!

I agree with you that the method Foxache used only gives an approximation but I think it's worth pointing out, for his sake, that it's quite likely the person setting the question intended the approximation to be used and didn't expect any knowledge of the more complicated method.
In reply to Luke90: I never use it for anything!
In reply to All: Superb! Thanks a lot folks. I still maintain that whoever marked my university Quantitative Methods paper years ago must've mixed mine up with someone else's
Isn't probability about predicting future events?
So you can't say "What was the probability of each being killed?" in any meaningful way. The miners were each either killed or not killed during that year.
> Isn't probability about predicting future events?

Time needn't have anything to do with it. A probability arises whenever you have incomplete information about a system.

> So you can't say "What was the probability of each being killed?" in any meaningful way. The miners were each either killed or not killed during that year.

Past events whose outcomes you do not know are indistinguishable from future events whose outcomes have not yet occurred. We don't know if Miner X survived, but we can compute some probability that they did.

> It's calculated that for every 400 hours spent underground, one miner is killed in a mining accident.

This is ambiguous. Does it mean that for every 400 hours *each* miner spends underground one is killed? I.e. one death per 400 miner-hours.

Or does it mean one death for every 400 hours all 900 miners spend underground? I.e., one death per 400*900 = 360,000 miner-hours.

It's worse than ambiguous, surely. As written the interpretation must be the first. Which then makes the rest of the question absurd because the survival of the miner who managed to work 1200 hours becomes pretty much miraculous?!

<\pedantry>
> It's worse than ambiguous, surely. As written the interpretation must be the first. Which then makes the rest of the question absurd because the survival of the miner who managed to work 1200 hours becomes pretty much miraculous?!

That's hardly fair. The foreman is presumably not waiting at the mine-shaft with a gun to make sure that exactly one man dies every 400 hours. So whichever of the two ways that you read the question, it's telling you that a certain death rate per man-hour worked has been observed, not that a death is guaranteed in any particular time interval.
In reply to crossdressingrodney: this has fried my brain.

Suppose the average death rate is 1 per 400 hours worked. And suppose I work 400 hours. The mathematical probability that I would die would presumably be 1. But presumably the actual probability that I would die is less than 1 because there's a chance I might still not die. (In fact, just as I am getting ready to go into my last hour of work I still "only" have a 1 in 400 chance of dying.)

So how does that work?

>So how does that work?

Not like that!

I suggest googling 'Poisson distribution'.

jcm
> (In reply to Bob Hughes)
>
> >So how does that work?
>
> Not like that!
>

didn't think so.

> I suggest googling 'Poisson distribution'.
>

cheers - will do

i'd dump the maths and say that the one with least experience is most at risk, bit like everest

Jack, it sounds like you know more stats than me; I have to admit I can't remember what a Poisson distribution is. But in the spirit of UKC, let me have a crack at it anyway!

It seems like a reasonable assumption that a given miner in a given period of hours underground has a certain probability of dying, independently of anything else. Let's set

p = probability of a given miner dying during a given 400 hour period underground.

This will be the answer to the first question for miner A. It's also useful to define

q = probability that a given miner does not die during a given 400 hour period.

Obviously, q = 1 - p. For miner B, we want to know the probability, p', that a given miner dies during an 800 hour period underground. Since not dying during an 800 hour period is the same as not dying during the first 400 hours AND not dying during the second 400 hours, the probability, q', that he doesn't die during these 800 hours is equal to the probability that he doesn't die during the first 400 hours multiplied by the probability that he doesn't die during the second 400 hours. So

q' = q^2

or

1 - p' = (1 - p)^2

or

p' = 1 - (1 - p)^2.

Similarly, 1200 = 3*400 hours spent down t'pit will be your last with probability

p'' = 1 - (1 - p)^3.

You might be able to spot a pattern now and guess that your chances of death after 1 = (1/400)*400 hour is

p''' = 1 - (1 - p)^(1/400).

In fact this is right because it is equivalent to

(1 - p) = (1 - p''')^400,

which just says that to survive 400 hours you have to not die during the first hour and not die during the second hour and so on right through to the 400th hour.

In general, your chances of surviving t hours is given by

p_t = 1 - (1 - p)^(t/400).

Re-writing this as

q_t = q^(t/400),

it begins to look like your formula for the Poisson distribution, i.e. exponential decay, but with a different base.

Anyway, to answer the original question, all you have to do is figure out what p is and you're done.

I'm not certain how to interpret the data we're given, but taking the expected number of deaths for a cohort of 900 miners each working 400 hours to be equal to 1, we get

expected number of deaths for a given miner working 400 hours = 1/900

p = probability of death for a given miner working 400 hours = 1/900.

miner A : 0.0011111
miner B : 0.0022210
miner C : 0.0033296
miner D : 0.0000028

The guy who works 1 million hours should really get a will drawn up

miner E : 0.9379194.

These answers are really close to yours, Jack, but different. I don't know if that's because our models are different or because we're reading the input data differently, or one of us has made a mistake. Any ideas?
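A rough sketch of the compounding formula p_t = 1 - (1 - p)^(t/400) from the post above, taking p = 1/900 as given there. The function name is invented for illustration. One plausible reason the figures differ slightly from the Poisson ones is that this model treats 1/900 as the exact 400-hour probability, whereas the Poisson version treats it as the expected number of deaths, so the two curves have the same shape but slightly different bases.

```python
def death_probability(hours, p_400=1 / 900):
    """p_t = 1 - (1 - p)^(t/400), where p is the assumed risk per 400 hours."""
    return 1 - (1 - p_400) ** (hours / 400)

miners = {"A": 400, "B": 800, "C": 1200, "D": 1, "E": 1_000_000}
for name, hours in miners.items():
    print(f"miner {name} : {death_probability(hours):.7f}")

# The same formula applied to the 900 x 400 hours case raised later in the thread
print(f"{death_probability(900 * 400):.6f}")
```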

>expected number of deaths for a given miner working 400 hours = 1/900

> p = probability of death for a given miner working 400 hours = 1/900.

I don't think we do, do we? That's the exact point of a Poisson distribution, I believe.

jcm

Surely you only let them all work for 399 hours and have 1 hour above ground.

That way no one gets killed!

<coat on and out door>
>
> >expected number of deaths for a given miner working 400 hours = 1/900
>
> p = probability of death for a given miner working 400 hours = 1/900.

Or, to put the same point another way, applying your above reasoning, what would you say the probability of death for a miner working 900 x 400 hours was? And 400 x 901?!

jcm

and...do you take into account the improvements in safety systems after every death

i'll get me coat

If those figures of 1 death for every 400 hours is actual then those Chilean miners must have been absolutely brickin it!!

Which of the two quoted sentences are you querying?

I think that the expected number of deaths for n miners each working 400 hours is linear in n; i.e. double the number of miners on shift and you double the expected number of deaths.

I don't claim the same is true of time. As you imply, and as I spelled out in my long post, if a miner spends twice as long at the coal-face he doesn't face twice the likelihood of death.

> Or, to put the same point another way, applying your above reasoning, what would you say the probability of death for a miner working 900 x 400 hours was? And 400 x 901?!

Sticking t = 400 x 900 and t = 400 x 901 in my formula

p_t = 1 - (1 - p)^(t/400)

I get 0.632325 and 0.632734.

(oops, I wrote "surviving" instead of "dying" at one point in my long post)

I'm afraid I can't understand your long post, because I don't understand the symbols you use in your formulas.

My basic point though is that if on average one miner is killed for every 360,000 man hours (your first sentence) it does not follow that the probability of one miner who works 400 hours being killed is 1/900 (as I think you were saying in your second sentence). Very nearly, but not quite.

jcm
>
> I'm afraid I can't understand your long post, because I don't understand the symbols you use in your formulas.

Fair enough. ^ means "to the power of" if that helps.

> My basic point though is that if on average one miner is killed for every 360,000 man hours (your first sentence) it does not follow that the probability of one miner who works 400 hours being killed is 1/900 (as I think you were saying in your second sentence). Very nearly, but not quite.

That's not quite what my first sentence said. But I think I see where the discrepancy is now.

I'd understood that in the original statistic dead miners were not replaced. In that case you can't sensibly talk about deaths per man-hour, because you'd expect 2 men working 1 hour to suffer slightly more deaths than 1 man working two hours (since he can't die twice).

If dead miners are replaced then the calculation changes by a small amount (corresponding to the situation where someone dies and then the bloke who replaces him also dies).

I'm too tired to tell if this makes sense any more.

>In that case you can't sensibly talk about deaths per man-hour, because you'd expect 2 men working 1 hour to suffer slightly more deaths than 1 man working two hours (since he can't die twice).

I don't know about that. If the chap dies then he doesn't do the last part of the shift, so there are fewer man hours.

I think it's pretty simple; just because it's true that over the long run one miner dies every 360,000 hours, it doesn't follow that the chance of one miner who works 400 hours getting killed during that period is exactly 1 in 900.

jcm
> (In reply to Jack B)
> [...]
>

You're a very punny guy!

The question should be rephrased:

There were 900 people working in a mine during a given year and their hours spent underground are logged. It's calculated that for every 400 man-hours spent underground, one miner is killed in a mining accident.

The miners are asked what their hours for the next year will be

Miner A intends to spend 400 hours underground
Miner B intends to spend 800 hours underground
Miner C intends to spend 1,200 hours underground
Miner D intends to spend 1 hour underground

What is the probability of each being killed in the next year?

The fact that there were 900, and that their hours were logged is pretty unimportant unless you want error bars on the likelihood of the miners dying (and then you would need the cumulative amount of time spent in the mine, and whether the men were replaced once one died)

A further consideration needs to be taken into account- If a man survives his first 400 hours, is he less likely to die in the next 400? Should you therefore hire experienced men?

If a death in your mine loses you a day of work and you have to pay the man's family a large lump sum, should you invest in better working conditions?

What are you mining that's worth the death of a man every 400 hours?

Why have you not been shut down by health and safety yet???

FWIW, clearly the OP meant one death every 360,000 man hours, though perhaps he could have expressed that more clearly.

jcm
> >In that case you can't sensibly talk about deaths per man-hour, because you'd expect 2 men working 1 hour to suffer slightly more deaths than 1 man working two hours (since he can't die twice).
>
> I don't know about that. If the chap dies then he doesn't do the last part of the shift, so there are fewer man hours.

Yeah, maybe you're right.

> I think it's pretty simple; just because it's true that over the long run one miner dies every 360,000 hours, it doesn't follow that the chance of one miner who works 400 hours getting killed during that period is exactly 1 in 900.

This is certainly true, I agree.
In reply to crossdressingrodney: let's face it, Jack is brainier than you and johnboy..;)
In reply to Anyone who still has the will to live:

I'm hoping that other question was based on a fictitious mine, but I do have one based on real data if it helps:

"In 2011 there were 1,238,231 registered motorcyclists in the UK and 1,945
motorcyclists were killed or seriously injured (KSI) for every billion
miles ridden.

1. On average how many miles must be ridden before 1 accident occurs?
2. What is the risk to each rider when they've ridden for the above number
of miles?
3. A rider has an annual mileage of 15,000. What is the probability of them
being KSI during:
a) Year 1?
b) Year 5?"

I'm pretty sure I've still not grasped this but for #1 I just divided the total mileage by the number of accidents, so 1,000,000,000/1945 = an average of 1 accident per 514,138 miles.

For #2 I worked along the lines of one rider dies per 1,238,231 on the roads per 514,138 miles ridden so the risk to each after riding 514,138 miles must be 1 in 1,238,231 (0.00008076%?).

For #3 I tried (and failed) to use those numbers in the Poisson distribution formula in an earlier reply: P = exp(-m/M). If I do exp(-15000/514138) I get 0.029175, which seems completely wrong.

Not read the thread in detail, but, without going into technical detail, I think the misunderstanding of this sort of thing stems from thinking wrongly that if the chance of something happening in a given year is p, then the chance of it happening at least once in 2 years is 2p, when it is in fact 2p-p^2. (The p^2 is subtracted so that the chance of it happening twice is not counted twice.) An example which makes this obvious is to ask what the chances are of getting at least one head when a coin is tossed twice. Clearly it is not 2x50%=100%.
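The coin example can be checked by brute force. This is only an illustrative sketch: it lists the four equally likely outcomes of two tosses and compares the count with 2p - p^2. The helper name is made up.

```python
from itertools import product

def at_least_once(p, trials=2):
    """Chance of at least one success in independent trials: 1 - (1 - p)^trials."""
    return 1 - (1 - p) ** trials

outcomes = list(product("HT", repeat=2))         # HH, HT, TH, TT
with_head = [o for o in outcomes if "H" in o]    # 3 of the 4 outcomes
fraction = len(with_head) / len(outcomes)

print(fraction, at_least_once(0.5), 2 * 0.5 - 0.5 ** 2)
```

All three numbers printed should be the same, and none of them is 100%.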
In reply to Foxache: To do half a million miles in a year would mean that you have to go at 70mph for 20 hours a day every day. If you don't get pranged you're going to die of something else!

> I'm pretty sure I've still not grasped this but for #1 I just divided the total mileage by the number of accidents, so 1,000,000,000/1945 = an average of 1 accident per 514,138 miles.

This is correct (though not, I think, at all obviously so. I think you may have got lucky!)

> For #2 I worked along the lines of one rider dies per 1,238,231 on the roads per 514,138 miles ridden so the risk to each after riding 514,138 miles must be 1 in 1,238,231 (0.00008076%?).

No I think it is 1-exp(-1)=0.632. I think....... interestingly this is completely independent of the data: the chances of an event happening within the average waiting time for an event to happen is fixed! Cool! I never knew that before!
>
> For #3 I tried (and failed) to use those numbers in the Poisson distribution formula in an earlier reply: P = exp(-m/M). If I do exp(-15000/514138) I get 0.029175, which seems completely wrong.

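What you actually calculated is m/M = 15000/514138 = 0.029175, which is the rare-event approximation. The exact figure for the first year is P = 1-exp(-m/M), which comes to about 0.02875, so a little under 3% is correct.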

For year 5 the probability will be the same assuming you have survived the first four years and that we don't assume you are safer than average having survived those four years!
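Here is a small sketch of the motorcyclist numbers, assuming KSI events occur at random at a constant rate per mile (the same Poisson assumption as for the miners). The constant and function names are invented for illustration.

```python
import math

KSI_PER_BILLION_MILES = 1945
MEAN_MILES_PER_KSI = 1_000_000_000 / KSI_PER_BILLION_MILES  # question 1
ANNUAL_MILES = 15_000

def ksi_probability(miles):
    """Chance of at least one KSI event over `miles`: 1 - exp(-m/M)."""
    return -math.expm1(-miles / MEAN_MILES_PER_KSI)

p_mean_distance = ksi_probability(MEAN_MILES_PER_KSI)   # question 2
p_year1 = ksi_probability(ANNUAL_MILES)                 # question 3a
p_within_4 = ksi_probability(4 * ANNUAL_MILES)
p_within_5 = ksi_probability(5 * ANNUAL_MILES)
# Question 3b: chance of KSI in year 5 given the rider got through years 1 to 4
p_year5_given_survived = (p_within_5 - p_within_4) / (1 - p_within_4)

print(f"mean miles per KSI: {MEAN_MILES_PER_KSI:,.0f}")
print(f"after the mean distance: {p_mean_distance:.3f}")
print(f"year 1: {p_year1:.5f}")
print(f"year 5, given survival so far: {p_year5_given_survived:.5f}")
print(f"any time in the first 5 years: {p_within_5:.5f}")
```

Under this assumption the year 1 and conditional year 5 figures come out equal, while the cumulative five-year figure is larger.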

Mine too....

My understanding, I could be wrong.

One death per 400 hours, = 1/400 chance of dying per hour, therefore

1-(1/400) = chance of not dying per hour. Now you can simply multiply the probabilities with each other.

A# (1-(1/400))^900 = 0.105 or 10.5 % chance of surviving 900 hours.
B# (1-(1/400))^450 = 0.324
C# (1-(1/400))^300 = 0.472
D# (1-(1/400))^1 = 0.9975

It is a dangerous mine.

That can't be right, though. You make your calculation and get your numbers. Now imagine a similar mine, which however exists in a universe where the unit of time is not hours but a unit called dhours, which are each 120 of our earth minutes. In that universe someone called John124 is posting on an internet forum saying that the chance of dying every dhour is 1/200, and performing a similar calculation, except that he's coming up with a different result.

I tell you, Mr Poisson has the answer. That's all I can tell you, because Mr P was too clever for me to follow him, but I do know it's not as simple as it looks.

jcm

(1-(1/200))^150 = 0.472
(1-(1/400))^300 = 0.472

It is not sensitive to the time unit used, provided that the same unit is used in both places. Try it.

I don't think that it is a Poisson distribution, I think that it is Binomial.

You have a probability of dying or not dying at any given point in time, and your overall chance of dying increases as you spend time in the mine.

The Poisson distribution would be useful for the mine owners to predict the variance in the total number of mine deaths per year.

http://en.wikipedia.org/wiki/Binomial_distribution#Example
>
> (1-(1/200))^150 = 0.472
> (1-(1/400))^300 = 0.472
>
> It is not sensitive to the time unit used, provided that the same unit is used in both places. Try it.

(1-1/n)^0.75n gets closer and closer to exp(-0.75) as n gets bigger and bigger.
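A quick sketch of that limit, for anyone who wants to try it. Here n is the number of time steps in the mean waiting time (200 for dhours, 400 for hours, and larger values for minutes or seconds); the function name is my own. If the arithmetic is right, the n = 200 and n = 400 values differ a little in the later decimal places, and both creep up towards exp(-0.75) as the time unit shrinks.

```python
import math

def survive_discrete(steps_per_mean, mean_intervals=0.75):
    """(1 - 1/n)^(0.75 n): survival chance when risk is applied once per time step."""
    n = steps_per_mean
    return (1 - 1 / n) ** (mean_intervals * n)

limit = math.exp(-0.75)

for n in (200, 400, 24_000, 1_440_000):
    print(f"n = {n:>9,}: {survive_discrete(n):.6f}")
print(f"exp(-0.75)   : {limit:.6f}")
```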

>It is not sensitive to the time unit used, provided that the same unit is used in both places. Try it.

I don't understand this sentence. My entire point is that in the parallel universe they use a different time unit.

You presumably think that the chances of a miner dying in two hours are one minus 399/400 squared.

However, John124, applying the same reasoning as you, reckons that the chances of a miner dying in a dhour is 1/200, which is not the same number at all.

jcm
> (In reply to Jack B)
>
> I don't think that it is a Poisson distribution, I think that it is Binomial.

The Poisson is the limit of the Binomial as the time interval considered gets smaller and smaller. For a small enough time interval, the Binomial will give a very good approximation, but not an exact one, because it does not take account of the possibility that the event could occur more than once in the time interval. As the time interval gets smaller and smaller, this possibility becomes vanishingly small, so the Poisson limit does not have this inaccuracy and is exact. It is all a bit technical.
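To make the limit a bit less abstract, here is a hedged sketch that splits one mean waiting time into n slices, gives each slice a 1/n chance of an event, and compares the resulting Binomial probabilities with the Poisson ones. The helper names and the choice of checking k = 0 to 3 events are my own.

```python
import math

def binomial_pmf(k, n, p):
    """Chance of exactly k events in n independent slices, each with chance p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Chance of exactly k events when the expected number of events is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def max_gap(n, lam=1.0, k_max=3):
    """Largest Binomial-vs-Poisson difference over k = 0..k_max for n slices."""
    return max(
        abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )

for n in (4, 40, 400, 4000):
    print(f"{n:>5} slices: largest gap {max_gap(n):.6f}")
```

The gap should shrink roughly in proportion to 1/n as the slices get finer.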
>
> No I think it is 1-exp(-1)=0.632. I think....... interestingly this is completely independent of the data: the chances of an event happening within the average waiting time for an event to happen is fixed! Cool! I never knew that before!

Is that the same as a 63% chance of being KSI then? If so it seems incredibly high. Or is that based on the (hypothetical) scenario of a rider doing the 500,000+ miles within one year?

> You actually calculated P = 1-exp(-m/M) which is correct for first year.
>
> For year 5 the probability will be the same assuming you have survived the first four years and that we don't assume you are safer than average having survived those four years!

Why does the probability of KSI not increase year on year as more miles are travelled, and the rider gets closer and closer to the half million mark (for which there was a much higher probability of KSI)?

But don't I get round that by taking the probability that an accident doesn't happen?

^ means to the power.

^2 = squared
^3 = cubed
> (In reply to Robert Durran)
>
> But don't I get round that by taking the probability that an accident doesn't happen?

No. Afraid not.

> (In reply to Robert Durran)

> Is that the same as a 63% chance of being KSI then? If so it seems incredibly high. Or is that based on the (hypothetical) scenario of a rider doing the 500,000+ miles within one year?

63% is the probability of being KSI if you ride 514138 miles (more than twenty times round the equator!). It will be the same in all similar scenarios: eg if buses go past your house at random but on average once an hour, then if you leave your house at a random time, the probability of a bus arriving within an hour is 63%.

> Why does the probability of KSI not increase year on year as more miles are travelled, and the rider gets closer and closer to the half million mark (for which there was a much higher probability of KSI)?

Obviously the probability of getting KSI in the first 5 years is greater than that of getting KSI in the first year, but, given that you have survived the first four years, the probability of getting KSI in the fifth year is the same as that of getting KSI in the first year (unless you take account of stuff like assuming someone who has survived four years is safer than average).

>
> One death per 400 hours, = 1/400 chance of dying per hour

This is not true. To see why not, suppose there is, on average, one death per 1 hour. Clearly there is not, as in your logic, a 1/1 = 1 = 100% chance of dying per hour (there would be if the deaths happened on the hour, every hour, but they do not!)
> (In reply to Robert Durran)
>
> why?

Sorry. I was a bit hasty there and didn't say what I meant to say.

You can indeed use the probability that an accident doesn't happen, but this is 1-(probability that an accident does happen) and, as I said above, the latter is not simply 1/400 for an hour's interval. Finding the probability that an accident happens in an hour is not easy from first principles: it requires the limiting process that takes the Binomial distribution to the Poisson distribution. It is in fact 1-exp(-1/400), so the probability that an accident does not happen in an hour is exp(-1/400). The probability that an accident does not happen in, say, 300 hours is therefore [exp(-1/400)]^300, and the probability that an accident does not happen in 400 hours is [exp(-1/400)]^400 = exp(-1). So the probability that an accident happens in 400 hours is 1-exp(-1) = 0.632, the result I mentioned earlier: the probability of an accident happening within the average waiting time for an accident.

Here is the derivation of the Poisson as a limit of a Binomial distribution:

http://www.the-idea-shop.com/article/216/deriving-the-poisson-distribution-from-the-binomial

Interesting. I can see why the probability of dying is not 1/400 per hour. However I think that I might have to dig out my very old A-level textbooks before I understand the next bit.
In reply to Robert Durran: The Poisson distribution? Is that the miracle of the loaves and the fishes?
> (In reply to Robert Durran) The Poisson distribution? Is that the miracle of the loaves and the fishes?

No, but if you were an unskilled fisherman who just caught the occasional fish at random, it would tell you the probability of catching enough fish for your tea in, say, the next hour.

Using the motorcyclist question as an example, if it were stated that KSIs were 3x more likely to occur at weekends and one of the example riders was only ever going to ride during the week, could their risk be calculated as previously and then simply divided by four?
> (In reply to Robert Durran)
>
> Using the motorcyclist question as an example, if it were stated that KSIs were 3x more likely to occur at weekends and one of the example riders was only ever going to ride during the week, could their risk be calculated as previously and then simply divided by four?
If the original accident rate per mile ridden assumes the same mileage per day at weekends as during the week, then by my calculations the accident rate per mile for someone who only rides during the week will be less by a factor of 7/11:

Suppose that the accident rate per mile during the week is p and the accident rate per mile at weekends is 3p. Suppose the mileage per day is m. Then a rider who rides every day rides 5m miles on weekdays, expecting 5mp accidents, and rides 2m miles at weekends, expecting 2m x 3p = 6mp accidents. So over the whole week they ride 2m+5m = 7m miles and expect to have 5mp+6mp = 11mp accidents. So, in the long term, their average accident rate per mile is 11mp/7m = 11p/7, i.e. greater than the weekday-only rider by a factor of 11/7.
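The same weighting can be sketched with exact fractions. This assumes, as above, equal mileage every day and a weekend rate three times the weekday rate; the names are invented, and applying the factor to the 1,945 per billion miles figure is only an illustration.

```python
from fractions import Fraction

def everyday_rate_in_units_of_p(weekend_multiplier=3, weekdays=5, weekend_days=2):
    """Average accident rate per mile for an every-day rider, in units of the weekday rate p."""
    expected_accidents = weekdays * 1 + weekend_days * weekend_multiplier  # 5mp + 6mp, with m = p = 1
    miles = weekdays + weekend_days                                        # 7m, with m = 1
    return Fraction(expected_accidents, miles)

ratio = everyday_rate_in_units_of_p()   # every-day rider relative to weekday-only rider
weekday_only_factor = 1 / ratio         # weekday-only rider relative to the average

# Illustration only: scale the quoted average KSI rate for a weekday-only rider
weekday_only_ksi_per_billion = 1945 * weekday_only_factor

print(ratio, weekday_only_factor, float(weekday_only_ksi_per_billion))
```

So under these assumptions the weekday-only rider's rate is scaled by 7/11 rather than divided by four.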
> So, in the long term, their average accident rate per mile is 11mp/7m= 11p/7. ie greater than the weekday only rider by a factor of 11/7.

Ah ok, I can see that your way is more logical. I think I've almost got my head around this now, but if weekday-only rider's accident rate/mile is (presumably) 5mp/5m = 5p/5 and weekend + weekday rider's accident rate/mile = 11mp/7m = 11p/7, how have you worked out the difference between 5p/5 and 11p/7 is 11/7?

Missed a bit off my last post:

Is it because 5p/5 is 1, so for every 1 accident had by Mr weekday-only, Mr weekday + weekends has 11/7 (1.5714)?