The importance of ballpark: more to it than BPs

Postby MARCPELLETIER » Thu Oct 13, 2005 5:01 pm

bbrool,

obviously, this would be a great idea, but I have to insist that it will take a very appreciable amount of time to record sufficient data in all contexts. I'm saying this from experience: before having this team, I started to compile the data just the way you mention, observing the data and then putting it into an Excel file, but it took me, if I recall correctly, one hour for 20 games, and I was far, far away from having sufficient data in all contexts (some contexts, say 0 outs with a man on third, are quite rare and would require almost a full season before reaching about 25 PA for that particular context). That was for one stadium. You can imagine that we would have to do this for at least five different stadiums.

The best, by far, would be to have a program that would do it for us. It would be free of errors, and IMO, such a program could be easily built, because all the needed information is always found in the same place:

[quote:df69f659f8]
0 G.Sheffield 3 Ground Out b-0
1 V.Castilla 6 Double b-2
1 2 J.Varitek 2 Single 2-3 b-1
1 1 3 L.Ford 3 Foul Out b-0
2 1 3 O.Cabrera 1 Fly Out b-0
[/quote:df69f659f8]

The number of outs is at the far left.
The occupied bases are just to the right of it.
And at the far right, we would have all the necessary information to calculate runs created.
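
As a rough sketch of how a program could read lines laid out like the quoted ones (in Python, nothing tied to the game site itself): it assumes the fields are separated by whitespace, that the base numbers stop at the first non-numeric token (the batter's name, written without spaces), and that the number right after the name, whose meaning is not stated in this thread, can simply be stored as-is. The names `Play`, `parse_play`, and `code` are made up for illustration.

```python
import re
from dataclasses import dataclass

# Runner movement tokens as they appear in the sample, e.g. "2-3" or "b-0".
# Assumption: other log lines use the same "<b|1|2|3>-<something>" shape.
ADVANCE = re.compile(r"[b123]-\w")


@dataclass
class Play:
    outs: int         # number of outs before the play
    bases: tuple      # occupied bases before the play, e.g. (1, 3)
    batter: str
    code: int         # number after the name; meaning not stated in the thread
    result: str       # e.g. "Ground Out", "Single"
    advances: tuple   # raw movement tokens, e.g. ("2-3", "b-1")


def parse_play(line):
    tokens = line.split()
    outs = int(tokens[0])

    # Every numeric token between the outs and the name is an occupied base.
    i = 1
    bases = []
    while tokens[i].isdigit():
        bases.append(int(tokens[i]))
        i += 1

    batter = tokens[i]
    code = int(tokens[i + 1])
    i += 2

    # The result text runs until the first runner-movement token.
    j = i
    while j < len(tokens) and not ADVANCE.fullmatch(tokens[j]):
        j += 1

    return Play(outs, tuple(bases), batter, code,
                " ".join(tokens[i:j]), tuple(tokens[j:]))
```

With the third quoted line, `parse_play("1 2 J.Varitek 2 Single 2-3 b-1")` gives 1 out, a runner on second, the result "Single", and the movements "2-3" and "b-1". Real logs would surely contain lines this sketch does not handle (steals, substitutions, inning headers), so those would have to be filtered out or added case by case.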

Does anyone have a student project in informatics? :)
MARCPELLETIER
 
Posts: 55
Joined: Tue Jul 03, 2012 2:34 pm

Postby MARCPELLETIER » Thu Oct 13, 2005 5:09 pm

Just realized that a program wouldn't work...or at least would be much more complicated than I first thought. The reason is that I am also interested in how RC changes relative to the line-up position (how RC formulas differ from lead-offs to clean-ups), and this information is not readily available in the scoresheet.

BTW, I am still first in offense and first in pitching after 72 games, despite a team OPS that puts my team 7th in the league. Four teams have offenses that trail me by less than five runs. ERA is safe though. First in hits produced and in fewest hits allowed.

Postby MARCPELLETIER » Fri Oct 14, 2005 12:03 am

Just like that, I got literally crushed in today's series, and my team is now fifth in runs scored and third in runs allowed. Who knows if I have a good team!!!

Postby childsmwc » Fri Oct 14, 2005 11:31 am

Luckyman,

While the data still requires some massaging, if someone had a program that could pull this info out of the online pages, then hopefully a process could be created to fill in the missing information (i.e., batting order slots). Since the order goes 1-9 and then repeats, we should be able to paste this additional data in fairly quickly after the raw data has been downloaded into a database.
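
A minimal sketch of the "1-9 and then repeats" idea, in Python. It assumes the input is the list of batter names for ONE team, in order, with every plate appearance present and nothing else mixed in. Since a substitute bats in the slot of the player he replaces, the count should stay aligned through substitutions as long as no plate appearance is missing. The function name `assign_slots` and its `start_slot` parameter are hypothetical.

```python
def assign_slots(batters, start_slot=1):
    """Pair each consecutive plate appearance of one team with a slot 1..9.

    batters: batter names in the order they came to the plate.
    start_slot: slot of the first name in the list (1 at the start of a game).
    """
    return [((start_slot - 1 + k) % 9 + 1, name)
            for k, name in enumerate(batters)]
```

For example, the tenth name in a list that starts at the top of the order comes back tagged with slot 1 again.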

I will use the old copy and paste from a few games and see what I can come up with to "automate" the Excel work process.

Bbrool

Postby wavygravy2k » Fri Oct 14, 2005 4:01 pm

A while back I was trying to create a VB program that sims games. I'll see if I still have it. As far as I recall, the program worked most of the time unless there was an unusual inning.

Postby JayJelinek » Fri Oct 14, 2005 5:59 pm

Bbrool and Marcus,

A couple of points on the above replies.

1. Regarding the difficulty of parsing the game logs to extract context information such as outs and men on (and associating that with each offensive event), the program to do the parsing is straightforward enough. Again, I'm a programmer by trade, so I can speak to this.

2. I think Marcus also wants to extract the active lineup position (as well as outs and men on), so that the database would contain information on how frequently the cleanup hitter comes up with 1 out and men on 1st and 3rd (for example). This is also possible, since the name of the player at bat is given in the log and can be correlated back to the lineup position using the info in the box score. Substitutions make the box score harder to parse as to who was originally in the 1-9 slots and who got subbed where, but it's all possible, albeit more complicated than 1) above.
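
To illustrate point 2, here is a hedged Python sketch of the correlation step only. It assumes the hard part is already done, namely that the box score has been reduced to (slot, name) pairs for one team in one game, with each substitute listed under the slot he took over. The box score layout itself is not shown in this thread, so `slot_lookup` and `tag_with_slot` are hypothetical helpers, not a description of any existing tool.

```python
def slot_lookup(boxscore_rows):
    """Build a name -> lineup slot map from (slot, name) pairs of one game."""
    lookup = {}
    for slot, name in boxscore_rows:
        # Same name listed under two different slots: refuse to guess.
        if name in lookup and lookup[name] != slot:
            raise ValueError(f"{name} appears in slots {lookup[name]} and {slot}")
        lookup[name] = slot
    return lookup


def tag_with_slot(pa_names, lookup):
    """Attach a slot to each batter name; None when the name is not found."""
    return [(lookup.get(name), name) for name in pa_names]
```

Because the map is rebuilt for every game, day-to-day lineup changes would not matter; the open question is only how reliably the (slot, name) pairs can be pulled from the box score.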

Bbrool, if you don't have a tool to do this game log parsing, I'd volunteer to work on this also as time permits (which means likely it'll be months before I have something working). But given enough interest I'd get started, again in the context of the NL and AL 1969 leagues currently drafting. And then, if the tool to parse the game logs was built, then the crawler to automate the crawl through all game logs in a league would come next.

But my skills are probably going to end at the point of building the tools to gather the data, and then the sabermetrics guys are going to have to design the right analyses. I guess if we get the data people will be happy to analyze, huh? :)

And, yes, I absolutely agree that data taken from 1969 replays would not be fully relevant to 200x replays or ATG. The game changes too much between the eras. The way to start would be to take one of the games (such as 1969), analyze that, and then try to extend the analysis method onto another season.

Jay

Postby MARCPELLETIER » Fri Oct 14, 2005 7:31 pm

Jay,

you fully understood what I was looking for. That being said, I am not a programmer, but it seems to me that going by names will not get the job done, as many owners--such as myself--change their line-up on a day-to-day basis (not counting the fact that, when injuries come, SuperHal modifies the line-up considerably). I would thus tend to agree with bbrool that some way to specify a 1 to 9 order would probably be a better option.

Obviously, if this turns out to be too complicated, we might settle for recording the stats by stadium and by base-out context, ignoring the line-up order.

Postby JayJelinek » Tue Oct 18, 2005 8:35 pm

Marcus,

Thanks. Any discussion about how to extract lineup positions out of the play-by-play logs we can defer until such time as I have something to actually show. Suffice it to say I am confident that it can be done by correlating names in the play-by-play log to the box score above it, which lists the actual lineup used for each game regardless of manager changes or injuries.

However, this is a problem complex enough to conquer a piece at a time. Therefore, over the next several weeks I will tinker with a tool that parses the play-by-play log into a database of offensive events and the associated context in which they occurred, regardless of lineup position. Once that is working, I will look at the lineup information. Will keep you posted (more likely will keep Bbrool posted, as I'm starting two leagues off with him currently).
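
For what a "database of offensive events and the associated context" could look like, here is a small Python sketch using the standard `sqlite3` module. The table name `event`, its columns, and the two helper functions are assumptions for illustration; the occupied bases are stored as a short string such as "" (empty), "2", or "13".

```python
import sqlite3


def build_db(events):
    """Load (outs, bases, batter, result) rows into an in-memory database."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE event (outs INTEGER, bases TEXT, batter TEXT, result TEXT)"
    )
    con.executemany("INSERT INTO event VALUES (?, ?, ?, ?)", events)
    return con


def pa_per_context(con):
    """Count plate appearances for each (outs, bases) context."""
    rows = con.execute(
        "SELECT outs, bases, COUNT(*) FROM event "
        "GROUP BY outs, bases ORDER BY outs, bases"
    )
    return rows.fetchall()
```

A count per context like this would also show directly which contexts are still short of data, which is the sample-size concern raised earlier in the thread about rare situations.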

I'm really interested in doing something along these lines, just won't happen overnight.

Postby deeznuts515 » Mon May 15, 2006 3:31 am

Thanks Marcus and other contributors to this thread - it's helped a lot in Petco; specifically the tip about the value of the Clark / Thomas types in low HR parks...
