| spambase {bayesreg} | R Documentation |
Spambase
Description
This is a well-known dataset with a binary target, obtainable from the UCI Machine Learning Repository. Each row is an e-mail that is labelled as either spam or not spam. The dataset contains 48 attributes that measure the percentage of times a particular word appears in the e-mail, 6 attributes that measure the percentage of times a particular character appears in the e-mail, and 3 attributes that measure run lengths of capital letters.
Usage
data(spambase)
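A minimal sketch of loading the dataset and checking it against the description on this page. It assumes the bayesreg package is installed and that the target column is named is.spam as listed under Format; the guard with requireNamespace() is only there so the snippet does nothing when the package is missing.

```r
# Assumption: bayesreg is installed; otherwise the block is skipped.
if (requireNamespace("bayesreg", quietly = TRUE)) {
  # Load the data frame into the global environment.
  data(spambase, package = "bayesreg")

  # Number of rows and columns (documented as 4,601 rows and 58 variables).
  print(dim(spambase))

  # Class balance of the binary target (0 = not spam, 1 = spam).
  print(table(spambase$is.spam))
}
```

Printing the class table first is a cheap way to see how unbalanced the target is before fitting any model.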
Format
A data frame with 4,601 rows and 58 variables (1 categorical, 57 continuous).
is.spam    Is the email considered to be spam? (0=no,1=yes)
word.freq.make    Percentage of times the word 'make' appeared in the e-mail
word.freq.address    Percentage of times the word 'address' appeared in the e-mail
word.freq.all    Percentage of times the word 'all' appeared in the e-mail
word.freq.3d    Percentage of times the word '3d' appeared in the e-mail
word.freq.our    Percentage of times the word 'our' appeared in the e-mail
word.freq.over    Percentage of times the word 'over' appeared in the e-mail
word.freq.remove    Percentage of times the word 'remove' appeared in the e-mail
word.freq.internet    Percentage of times the word 'internet' appeared in the e-mail
word.freq.order    Percentage of times the word 'order' appeared in the e-mail
word.freq.mail    Percentage of times the word 'mail' appeared in the e-mail
word.freq.receive    Percentage of times the word 'receive' appeared in the e-mail
word.freq.will    Percentage of times the word 'will' appeared in the e-mail
word.freq.people    Percentage of times the word 'people' appeared in the e-mail
word.freq.report    Percentage of times the word 'report' appeared in the e-mail
word.freq.addresses    Percentage of times the word 'addresses' appeared in the e-mail
word.freq.free    Percentage of times the word 'free' appeared in the e-mail
word.freq.business    Percentage of times the word 'business' appeared in the e-mail
word.freq.email    Percentage of times the word 'email' appeared in the e-mail
word.freq.you    Percentage of times the word 'you' appeared in the e-mail
word.freq.credit    Percentage of times the word 'credit' appeared in the e-mail
word.freq.your    Percentage of times the word 'your' appeared in the e-mail
word.freq.font    Percentage of times the word 'font' appeared in the e-mail
word.freq.000    Percentage of times the word '000' appeared in the e-mail
word.freq.money    Percentage of times the word 'money' appeared in the e-mail
word.freq.hp    Percentage of times the word 'hp' appeared in the e-mail
word.freq.hpl    Percentage of times the word 'hpl' appeared in the e-mail
word.freq.george    Percentage of times the word 'george' appeared in the e-mail
word.freq.650    Percentage of times the word '650' appeared in the e-mail
word.freq.lab    Percentage of times the word 'lab' appeared in the e-mail
word.freq.labs    Percentage of times the word 'labs' appeared in the e-mail
word.freq.telnet    Percentage of times the word 'telnet' appeared in the e-mail
word.freq.857    Percentage of times the word '857' appeared in the e-mail
word.freq.data    Percentage of times the word 'data' appeared in the e-mail
word.freq.415    Percentage of times the word '415' appeared in the e-mail
word.freq.85    Percentage of times the word '85' appeared in the e-mail
word.freq.technology    Percentage of times the word 'technology' appeared in the e-mail
word.freq.1999    Percentage of times the word '1999' appeared in the e-mail
word.freq.parts    Percentage of times the word 'parts' appeared in the e-mail
word.freq.pm    Percentage of times the word 'pm' appeared in the e-mail
word.freq.direct    Percentage of times the word 'direct' appeared in the e-mail
word.freq.cs    Percentage of times the word 'cs' appeared in the e-mail
word.freq.meeting    Percentage of times the word 'meeting' appeared in the e-mail
word.freq.original    Percentage of times the word 'original' appeared in the e-mail
word.freq.project    Percentage of times the word 'project' appeared in the e-mail
word.freq.re    Percentage of times the word 're' appeared in the e-mail
word.freq.edu    Percentage of times the word 'edu' appeared in the e-mail
word.freq.table    Percentage of times the word 'table' appeared in the e-mail
word.freq.conference    Percentage of times the word 'conference' appeared in the e-mail
char.freq.;    Percentage of times the character ';' appeared in the e-mail
char.freq.(    Percentage of times the character '(' appeared in the e-mail
char.freq.[    Percentage of times the character '[' appeared in the e-mail
char.freq.!    Percentage of times the character '!' appeared in the e-mail
char.freq.$    Percentage of times the character '$' appeared in the e-mail
char.freq.#    Percentage of times the character '#' appeared in the e-mail
capital.run.length.average    Average length of contiguous runs of capital letters in the e-mail
capital.run.length.longest    Maximum length of contiguous runs of capital letters in the e-mail
capital.run.length.total    Total number of capital letters in the e-mail
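The variables fall into three groups by name prefix, and several names (for example the character-frequency ones) are not syntactic R names, so selecting columns by pattern or by string is usually easier than typing them after $. The following is a rough sketch only: it assumes the bayesreg package is installed and that the column names in the loaded data frame match the names listed on this page. The variables word_cols, char_cols and cap_cols are illustrative names, not part of the package.

```r
# Assumption: bayesreg is installed and column names match this help page.
if (requireNamespace("bayesreg", quietly = TRUE)) {
  data(spambase, package = "bayesreg")

  # Group predictor names by their prefix.
  word_cols <- grep("^word\\.freq\\.", names(spambase), value = TRUE)
  char_cols <- grep("^char\\.freq\\.", names(spambase), value = TRUE)
  cap_cols  <- grep("^capital\\.run\\.length\\.", names(spambase), value = TRUE)

  # Number of columns found in each group.
  print(c(word = length(word_cols),
          char = length(char_cols),
          capital = length(cap_cols)))

  # Access a column by its name as a string and compare class means,
  # guarded in case the name differs in the installed version.
  if ("word.freq.free" %in% names(spambase)) {
    print(tapply(spambase[["word.freq.free"]], spambase$is.spam, mean))
  }
}
```

Selecting with [[ and a string, or with backticks, avoids parse errors for names that contain characters such as ; or (.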
Source
https://archive.ics.uci.edu/ml/datasets/spambase/