Inj rating & Gs missed per season SOLVED AT LAST!

Postby ClowntimeIsOver » Mon Aug 23, 2010 7:09 pm

Marcus, thanks for your thoughtful response. I disagree with you on most points, but I don't mean any disrespect ... and for all I know, I'm wrong. Below, I've quoted your points only enough so that someone can refer to your full statements in the above post; I don't mean to misrepresent them.

1- 700 PA seems a bit high ...

RESPONSE: A previous poster on this board gave the following data for each position in the batting order:

1 775 full-season PAs
2 756
3 737
4 725
5 706
6 687
7 668
8 649
9 599

I don't know the source of this info, but if it's accurate it gives an average of about 700 PAs for a DH league (slots 1-9) and about 713 PAs for a non-DH league (slots 1-8).
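For anyone who wants to check those two averages, here is a minimal sketch. It assumes only the per-slot PA figures quoted above (whose source is unknown); the variable names are mine.

```python
# Full-season PAs by batting-order slot, as quoted above (source unverified).
pa_by_slot = [775, 756, 737, 725, 706, 687, 668, 649, 599]

def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)

dh_avg = mean(pa_by_slot)           # slots 1-9 (DH league)
non_dh_avg = mean(pa_by_slot[:8])   # slots 1-8 (non-DH league)

print(f"DH league average:     {dh_avg:.1f} PAs")
print(f"Non-DH league average: {non_dh_avg:.1f} PAs")
```

Swapping in a different list of per-slot PAs is all it takes to test another assumption about full-season playing time.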

However, as mentioned, my figures can be pro-rated to 700 with only a slight margin of error (bigger for high injury ratings). For example, for an injury-1 player, plugging "775 PAs" (a full-season uninjured lead-off hitter) into the formula gives a "games missed" figure just 0.1 different from simply pro-rating. (For an injury-6 player, pro-rating would be off by about 2 games missed compared with plugging 775 into the formula in place of 700.) So my figures can be reduced by 5%, say, if someone thinks 665 full-season uninjured PAs is more accurate than 700 PAs, and the results will be close to what the full formula would produce.


2- ... players start the season all healthy. So injury-risk is truncated ....

RESPONSE: I don't understand this point. All it means is that everybody gets "one free PA." That's negligible.

3- ... At one extreme, a player may be injured right off the first at-bat. And at the other extreme, a player may never be injured at all ...... The risk of NOT being injured after 432 PAs is: (215/216)^432, or 13.47%.

RESPONSE: This is mathematically irrelevant over a large sample. For example, a coin flipped four times has a 6.25% chance of never coming up tails. However, that fact does not alter the 50/50 distribution that would be found in a large sample of four-flip trials. For an SOM injury-1 player, there's a 91% chance of not getting injured in 20 PAs, a 63% chance in 100 PAs, and a 1% chance in 1000 PAs. How does this possibly affect the average number of injuries (1 in 216) in a very large sample? Meanwhile, at the other end of the bell curve, a player could get injured two times as often as chance, or five times as often as chance, in 432 PAs; in both cases (as in all other rare cases, including no injury), the odds are very low. However, both sides of the bell curve cancel each other out. Yes, in 432 PAs, there's a 13% chance of no injury; but there's also a chance of 1, 2, 3, 4, 5 or 432 injuries, and the aggregate of all these (and all others) will inevitably average out to 1 injury per 216 PAs in the long run.
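The percentages above can be reproduced with a short sketch. It assumes each PA is an independent trial with a 1/216 injury chance (the injury-1 figure used in this thread), so the injury count over n PAs is binomial; the function names are mine.

```python
from math import comb

P_INJURY = 1 / 216  # per-PA injury chance for an injury-1 player (figure from this thread)

def p_no_injury(pa, p=P_INJURY):
    """Chance of getting through `pa` plate appearances with no injury."""
    return (1 - p) ** pa

def binom_pmf(k, n, p=P_INJURY):
    """Chance of exactly k injuries in n independent PAs."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

for pa in (20, 100, 432, 1000):
    print(f"{pa:>4} PAs: {100 * p_no_injury(pa):.2f}% chance of no injury")

# Average injuries in 432 PAs, summed over every possible outcome (0 through 432).
mean_injuries = sum(k * binom_pmf(k, 432) for k in range(433))
print(f"mean injuries in 432 PAs: {mean_injuries:.4f}")
```

The last figure comes out at 432/216, which is the point being argued: the no-injury seasons are balanced by the multi-injury ones.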

4- On average, players have more impact than half-a-game before they get injured, because they complete the at-bat during which they get injured ...

RESPONSE: I don't understand this. For an unsubstituted starter, distributed over the batting order and over home/away games in a big sample, half of all injuries will occur in the first half of a game, and half will occur in the second half of a game (meaning the average is "one-half game per injury"); a tenth will occur in the first tenth, and a tenth will occur in the last tenth. If this were not true, it would mean that "more" (unsubstituted) starters play in the second half of games (or last tenth) than in the first half (or first tenth).

In any case, this affects only "games missed" but does not affect "starts missed." The difference between those two is small, and only that difference would be affected (specifically, reduced by about one-fourth, or e.g. 0.3 games for an injury-1 599 PA player) if you're right.

5- In Strat, injuries don't happen during bunts, squeeze, and hit-and-runs.

RESPONSE: This is mentioned in my post above ("bunts, H&Rs, etc."). By "PA," I mean an SOM injury-possible PA. So the 700 figure would reflect that, and, if wrong, could be pro-rated with a slight error.
ClowntimeIsOver
 
Posts: 55
Joined: Tue Jul 03, 2012 2:34 pm

Postby ClowntimeIsOver » Mon Aug 23, 2010 7:57 pm

Marcus wrote: 4- On average, players have more impact than half-a-game before they get injured, because they complete the at-bat during which they get injured ...

My response above is somewhat wrong (I said the correct number was in fact half a game, on average, for an unsubstituted starter). How much of a "game" does an away-team hitter "play" if he's injured in his first at-bat in the first inning? He's replaced on the bases (if a HBP) and in the field (in the bottom of the inning), but gets a PA. So it's somewhat less than 1/17 games if the home team wins (and slightly less than that if the home team wins in a walk-off), 1/18 if the away team wins, and less than either if the game goes to extra innings. Using the simplest example, it's somewhat less than the average of 1/17 and 1/18. So that number has to be added to "one" (for an uninjured player) and averaged, producing a number somewhat less than 0.53 games played on average. So my "half a game missed" figures should be changed to no less than "0.47 games missed." This will very slightly decrease the "games missed" (but not "starts missed") figures in the table.
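The arithmetic behind the 0.53 and 0.47 figures can be sketched as follows. This assumes the simplest case described above (a 17-half-inning game when the home team wins, 18 when the away team wins) and treats the result as an upper bound on games played, since the true first-inning share is "somewhat less" than the figure used; the names are mine.

```python
# Share of a game played by an away hitter injured in his first PA,
# approximated as one half-inning out of 17 (home win) or 18 (away win).
earliest_share = (1 / 17 + 1 / 18) / 2

# Average the earliest-injury case with the uninjured case (a full game = 1).
avg_games_played = (1 + earliest_share) / 2
avg_games_missed = 1 - avg_games_played

print(f"earliest-injury share: {earliest_share:.4f}")
print(f"avg games played per injury game: {avg_games_played:.4f}")
print(f"avg games missed per injury game: {avg_games_missed:.4f}")
```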

Postby MARCPELLETIER » Mon Aug 23, 2010 10:49 pm

Hey, wachovia,

Like I said, your estimates are the closest we have ever seen, so really, kudos!!

To answer your point, I don't think a full season, as large as it is, is large enough to prevent the slight statistical aberrations I referred to. But in any case, I think it's best to just look at actual data.

I couldn't retrieve the dataset I had in mind this afternoon, where I pooled 50 seasons of the same team.

But I found another dataset that is large enough. It is based on sims done last year: basically a league I simmed 10 times, providing a potential pool of 1080 seasons when all players are pooled. The only danger with such a set is that some teams use platoons, which obviously lowers the number of games played during a season, so I was careful not to include data from platoons or potential platoons (I was conservative, so hopefully I didn't introduce any bias into the data). I then pooled together the players that fit each category. Hopefully I didn't make any mistakes:

600+ (inj 1), based on 370 seasons of different players
G=156.39; injury=5.61

600- (inj 1), based on 230 seasons
G=152.24; injury=9.76

600- (inj 2), based on 190 seasons
G=143.01; injury=18.99

600- (inj 3), based on 70 seasons
G=134.31; injury=27.69

600- (inj 4), based on 60 seasons
G=130.32; injury=31.68

600- (inj 5), based on 60 seasons
G=122.92; injury=39.08

600- (inj 6), based on 20 seasons
G=111.1; injury=50.9


This last category (players with 6 chances of injury) is based on only 20 seasons, so it obviously lacks robustness. It's the only one that comes in above your estimate. All your other estimates slightly overestimate the number of games missed due to injury, by:

-roughly half a game for players with 600+ PAs,
-roughly a game for low-injury-risk players with 600- PAs,
-and perhaps up to two games for higher injury risks (although, frankly, we would need more data, as the sample sizes are smaller and less robust).
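For anyone who wants to work with these pooled figures, a small sketch follows. It only restates the numbers posted above and assumes a 162-game season; the tuple layout and names are mine.

```python
SEASON_GAMES = 162

# (category, seasons pooled, avg games played, avg games missed to injury),
# copied from the pooled sim results above.
rows = [
    ("600+ inj 1", 370, 156.39, 5.61),
    ("600- inj 1", 230, 152.24, 9.76),
    ("600- inj 2", 190, 143.01, 18.99),
    ("600- inj 3", 70, 134.31, 27.69),
    ("600- inj 4", 60, 130.32, 31.68),
    ("600- inj 5", 60, 122.92, 39.08),
    ("600- inj 6", 20, 111.10, 50.90),
]

total_seasons = sum(seasons for _, seasons, _, _ in rows)

for label, seasons, games, missed in rows:
    # Sanity check: games played plus games missed should equal a full season.
    consistent = abs(games + missed - SEASON_GAMES) < 0.01
    share = 100 * missed / SEASON_GAMES
    print(f"{label}: {missed:5.2f} missed ({share:4.1f}% of season), "
          f"n={seasons}, sums to 162: {consistent}")

print("seasons pooled:", total_seasons)
```

The per-category season counts make it easy to see how thin the high-injury samples are next to the injury-1 ones.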
MARCPELLETIER
 
Posts: 55
Joined: Tue Jul 03, 2012 2:34 pm

Postby MARCPELLETIER » Mon Aug 23, 2010 11:06 pm

As for the "and the rest of the game" part, the calculations are not so easy to do:

A player on average has roughly 4.25 at-bats per game. He can be injured only after he has completed his at-bat. So the player is responsible for:

1/4.25=23.5% of the played time for his spot when injured after the 1st at-bat
2/4.25=47% of the played time for his spot when injured after the 2nd at-bat
3/4.25=71% of the played time for his spot when injured after the 3rd at-bat
4/4.25=94% of the played time for his spot when injured after the 4th at-bat
4.25/4.25=100% of the played time for his spot when injured after the 5th at-bat.

On average, injuries will follow a distribution whose mean falls somewhere between the 2nd and the 3rd at-bat, so chances are that, when he gets injured, the player will have been responsible for between 47% and 71% of the playing time, so somewhere around 60%, though I couldn't tell you the precise number. But it's definitely over 50%: it cannot be that much closer to 47% than to 71%.
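The shares listed above can be tabulated with a short sketch. It assumes the rough 4.25 at-bats per game quoted in this post and caps the share at a full game; it does not try to model which at-bat the injury actually falls on, so the midpoint printed at the end is only the halfway mark between the 2nd and 3rd at-bat, not a precise average. The names are mine.

```python
AT_BATS_PER_GAME = 4.25  # rough per-game average quoted in the post

def share_played(at_bat_number, per_game=AT_BATS_PER_GAME):
    """Fraction of the lineup spot's playing time credited to the starter
    when he is injured on his Nth at-bat (capped at one full game)."""
    return min(at_bat_number, per_game) / per_game

shares = {n: share_played(n) for n in range(1, 6)}
for n, share in shares.items():
    print(f"injured after at-bat {n}: {100 * share:.1f}% of the spot's playing time")

# Halfway point between the 2nd-at-bat and 3rd-at-bat cases.
midpoint = (shares[2] + shares[3]) / 2
print(f"midpoint of 2nd and 3rd at-bat: {100 * midpoint:.1f}%")
```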
Last edited by MARCPELLETIER on Tue Aug 24, 2010 12:34 am, edited 1 time in total.

Postby MARCPELLETIER » Tue Aug 24, 2010 12:09 am

[quote]3- ... At one extreme, a player may be injured right off the first at-bat. And at the other extreme, a player may never be injured at all ...... The risk of NOT being injured after 432 PAs is: (215/216)^432, or 13.47%.

RESPONSE: This is mathematically irrelevant over a large sample.[/quote]

Somewhat, yes. But at 672 PAs, you are dealing with a sample that is only about 3 times larger than the 216 PAs on which you base your estimates, and the probabilities we are dealing with are relatively small (most of them are below 5%). With such a statistical distribution, the possibility of small aberrations is not negligible, although, as you can see from the actual results, it's fairly small, small enough to be ignored, I guess, by most folks!!
Last edited by MARCPELLETIER on Tue Aug 24, 2010 12:39 am, edited 2 times in total.

Postby MARCPELLETIER » Tue Aug 24, 2010 12:26 am

Final post, I promise!!

[quote]
1 775 full-season PAs
2 756
3 737
4 725
5 706
6 687
7 668
8 649
9 599
[/quote]

I don't know who wrote this distribution of PAs across a lineup (hopefully, it's not me!!! :shock: ), but this distribution is simply impossible from a baseball standpoint. The 9th spot cannot have 176 fewer PAs than the first spot in a 162-game season.

Perhaps this person excluded PAs that were pinch-hit, something that is likely to occur more often with the 9th spot. Only a guess.

Postby dalekeener » Tue Aug 24, 2010 1:36 pm

I just wish I was smart enough to think of figuring that out...outstanding job!!!....Now I will go back to watching my Kick *** DVD.....
dalekeener
 
Posts: 55
Joined: Tue Jul 03, 2012 2:34 pm

Postby ClowntimeIsOver » Wed Aug 25, 2010 2:57 am

[quote="marcus wilby"]As for the "and the rest of the game", calculations are not so easy to do:

A player on average has roughly 4.25 at-bats per game. He can be injured only after he completed his at-bat. So the player is responsible for:

1/4.25=23.5% of the played time for his spot when injured after the 1st at-bat
2/4.25=47% ... [etc., see post above]

On average, injury will occur on a distributed curve whose average will fall somewhere between the 2nd and the 3rd at-bat, [etc., see post above].[/quote]

Thanks. Could you revise your numbers to reflect the fact that plate appearances are not the only aspect of "games played," which in fact also involves the following two points:

1) replaced by a different fielder in the next half-inning;

2) replaced by a pinch-runner if injured but reaches base.

Your numbers ignore the fact that a replacement player appears BETWEEN the injury-PA and the replacement PA (particularly in the field), as though no portion of a "game" is missed in that gap.

As for the rest:

1) I still disagree with your idea that 1/216 is greater than the relevant base number; my formula presumes the largest possible sample, and I don't see why "maximum PAs in one season" has anything to do with how the numbers will average out ultimately (EXCEPT for the problem of late-season injuries extending past game 162).

2) I'm sure you have much better data on full-season no-injury PAs than I do. I wish there were some way to approximate the correct number. This is especially difficult in that lead-off hitters will have substantially more PAs, and all-glove 8th or 9th hitters will have substantially fewer PAs, than the "average" 5th-spot hitter (who might often bat 2nd or 8th, too, if imbalanced vs LHP/RHP ... and on and on and on). I will post the revised correct formula soon, with parameters rather than numbers, so that people can plug in whatever they want on their spreadsheets. I will post the "missed starts" formula first; it is easier than "missed games" because it does NOT involve the "remainder of game" problem. I doubt it will include corrections (except approximate ones) for late-season injuries, though.

3) As a general point: it's interesting to see that the 599/600 division amounts to only about 5 starts on average for injury-1 players. Joe Mauer may be undervalued. Speaking of catchers, NONE of the above is accurate for them, given that the "bullet-proof" factor applies whenever one of two (but not of three) catchers is hurt. (Also, I think online SOM has a rule where no team can have more than some maximum number of injured players at a time, which would slightly affect everything in large samples.)

Postby ClowntimeIsOver » Wed Aug 25, 2010 4:02 pm

Also, you've made me see an error in my results, by pointing out that "everyone starts the season healthy." That means everyone gets one "free" start. But it also means that the very unluckiest possibility (necessary for the math, though it would never happen) is eleven "free" games per season, because a player who got a 15-game injury in every start would STILL have eleven starts per season. This means that the possible range during a season is not 0-162 missed starts, but 0-151. When finding averages over an "infinite" sample, that difference will cause a reduction of roughly 11/162 in the overall numbers (the possible range of results always affects the average result in a random distribution).
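The "eleven free starts" figure can be checked with a small sketch. It assumes the worst case described above: a 162-game season in which the player is hurt in every start and each injury costs the next 15 games. The function name and parameters are mine.

```python
def starts_if_always_injured(season_games=162, games_missed_per_injury=15):
    """Count starts for a player who is injured in every start and then
    sits out `games_missed_per_injury` games before starting again."""
    starts = 0
    game = 1  # everyone starts the season healthy
    while game <= season_games:
        starts += 1
        game += 1 + games_missed_per_injury  # the start itself, then the games missed
    return starts

free_starts = starts_if_always_injured()
max_missed = 162 - free_starts
share_of_season = free_starts / 162

print("guaranteed starts:", free_starts)
print("maximum missed starts:", max_missed)
print(f"share of season: {100 * share_of_season:.2f}%")
```

Changing `season_games` or `games_missed_per_injury` shows how the guaranteed-starts floor moves under other assumptions.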

It turns out that this fact brings your numbers into almost perfect agreement with mine. Look:

Your figures for each injury rating (except injury-6) are several percent lower than mine. What happens if these differences (including injury-6, a positive number) are averaged? It shows an average difference of about minus 6.3%, as a first approximation.

But injury-1 players (above and below 600 AB+W) and injury-2 players are more common than all the others. If those differences are given twice the weight (again, just to approximate), the average difference is somewhat greater: minus 6.64%. Chances are, a really accurate weighting would increase that number a bit more.

It so happens that the "eleven free starts" equals 6.79% of a season. In other words, by reducing my numbers by 11/162, their average weighted difference from yours is nearly zero -- even if 700 PAs is not quite the right number for average full-season PAs. (Since, for various reasons, 11/162 is actually too much, an adjustment lower in PAs is definitely necessary -- probably somewhere in the neighborhood of 670 PAs. What's really at work is the combination of lower average PAs, a lower range of possible missed starts, and a lower average number of missed starts per injury due to late-season injuries for sub-600 players. Those three combined, if factored in, would give almost exactly correct numbers for a mid-batting-order player.)

This is my last post in this thread. I'll post the best revised formula for missed starts in another thread (adjusting for end-of-season injuries), using parameters rather than numbers so people can plug in their own numbers (including allowing for full post-season play, if they want). (A "Missed Games" formula is too hard, for now, but basically amounts to "missed starts" plus the injury rating, and a bit less for higher ratings.)

Incidentally, in my original post "I got the number wrong," to quote the Beatles, for average missed starts per injury. It should be 3.55 (not 3.6) for sub-600, and 1.85 (not 1.9) for 600+. Neither number accounts for late-season injuries, but I know how to fix that and will put it in the final (parameter) formula.

Postby LA Bear » Wed Sep 08, 2010 10:32 pm

This was a very informative read. Thank you.
LA Bear
 
Posts: 55
Joined: Tue Jul 03, 2012 2:34 pm
