batting {UsingR} | R Documentation |
Batting statistics for 2002 baseball season
Description
This dataset contains batting statistics for the 2002 baseball season. The data allows you to compute batting averages, on-base percentages, and other statistics of interest to baseball fans. It contains only players with more than 100 at bats for a team in the year. The data is excerpted with permission from the Lahman baseball database at http://www.seanlahman.com/.
Usage
data(batting)
Format
A data frame with 438 observations on the following 22 variables.
- playerID
a coded player identifier. Those familiar with the players should be able to find their favorites.
- yearID
a numeric vector. Always 2002 in this dataset.
- stintID
a numeric vector. The player's stint (order of appearances within a season).
- teamID
a factor giving the team
- lgID
a factor with levels
AL
NL
- G
number of games played
- AB
number of at bats
- R
number of runs
- H
number of hits
- DOUBLE
number of doubles. "2B" in the original database.
- TRIPLE
number of triples. "3B" in the original database.
- HR
number of home runs
- RBI
number of runs batted in
- SB
number of stolen bases
- CS
number of times caught stealing
- BB
number of bases on balls (walks)
- SO
number of strikeouts
- IBB
number of intentional walks
- HBP
number of times hit by a pitch
- SH
number of sacrifice hits
- SF
number of sacrifice flies
- GIDP
number of times grounded into a double play
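Because the data is organized by player and team, with stintID recording the order of appearances, a player who changed teams during 2002 may appear in more than one row. The following is a minimal sketch of combining stints into per-player season totals. It uses a small made-up data frame (`toy`, with hypothetical player IDs, team codes, and counts) that only mimics a few of the column names above, so it runs without the package installed.

```r
# Hypothetical rows that mimic a few columns of `batting`; the IDs and
# counts are invented for illustration and are not taken from the dataset.
toy <- data.frame(
  playerID = c("playera01", "playera01", "playerb01"),
  stintID  = c(1, 2, 1),
  teamID   = c("AAA", "BBB", "AAA"),
  AB       = c(200, 150, 400),
  H        = c(50, 45, 120),
  stringsAsFactors = FALSE
)

# Sum the counting statistics over all stints of each player.
totals <- aggregate(cbind(AB, H) ~ playerID, data = toy, FUN = sum)

# Compute the season batting average from the summed counts, not by
# averaging the per-stint averages.
totals$BA <- totals$H / totals$AB
print(totals)
```

With the real data, the same call should work with `data = batting` and any further counting columns added inside `cbind()`.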
Details
Baseball fans are “statistics” crazy. They love to talk about things like RBIs, BAs, and OBPs, and to do so they need the numbers. This data comes from the Lahman baseball database at http://www.seanlahman.com/. The complete database covers all of baseball, not just the 2002 season presented here.
Source
Lahman baseball database, http://www.seanlahman.com/
References
In addition to the data set above, the book Curve Ball, by Albert, J. and Bennett, J., Copernicus Books, gives an extensive statistical analysis of baseball.
See https://www.baseball-almanac.com/stats.shtml for definitions of common baseball statistics.
Examples
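data(batting)
# Evaluate inside the data frame with with() instead of attach(), so the
# columns are not left on the search path after the example runs.
BA <- with(batting, H / AB)  # batting average
OBP <- with(batting, (H + BB + HBP) / (AB + BB + HBP + SF))  # on-base "percentage"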
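Other rate statistics can be built from the same columns. As a sketch, slugging percentage (total bases per at bat) and OPS (on-base plus slugging) are computed below. These formulas are the commonly used definitions and are not part of the original example. The data frame `toy` is made up, with invented counts that only reuse the column names of `batting`, so the block runs without the package installed.

```r
# Hypothetical counts using the same column names as `batting`; the values
# are invented for illustration only.
toy <- data.frame(
  AB = c(500, 300), H = c(150, 75),
  DOUBLE = c(30, 10), TRIPLE = c(5, 0), HR = c(20, 5),
  BB = c(60, 20), HBP = c(5, 1), SF = c(5, 4)
)

# Total bases: every hit counts once, plus one extra base per double,
# two per triple, and three per home run.
TB <- with(toy, H + DOUBLE + 2 * TRIPLE + 3 * HR)

SLG <- TB / toy$AB                                       # slugging percentage
OBP <- with(toy, (H + BB + HBP) / (AB + BB + HBP + SF))  # same formula as the example above
OPS <- OBP + SLG                                         # on-base plus slugging
print(round(cbind(SLG, OBP, OPS), 3))
```

Replacing `toy` with `batting` should give the same statistics for the 2002 players.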