top.percent.by {fishRman} | R Documentation
Subset the top percent of a dataframe by a specific column
Description
Sorts a dataframe in descending order by a specified column, computes the total of that column, takes the chosen percentage of that total as a threshold, and returns the smallest number of consecutive rows, counted from the top of the sorted dataframe, whose cumulative sum reaches the threshold.
Usage
top.percent.by(df, percentage, by)
Arguments
df | A dataframe object as downloaded from GFW's Google Big Data Query.
percentage | Number. The 'x' in 'the top x percent of the dataframe', given on a 0-100 scale (e.g. 90 for the top 90 percent).
by | Character. The name of the column for which the percentage will be calculated.
Value
A dataframe containing the selected rows of df, sorted in descending order of the by column.
Examples
library(fishRman)

# Toy data in the shape of a GFW fishing-effort table
dated <- c("2020-01-01", "2020-01-02")
lat <- c(40, 41)
lon <- c(12, 13)
mmsi <- c("34534555", "25634555")
hours <- c(0, 5)
fishing_hours <- c(1, 9)
df <- data.frame(dated, lat, lon, mmsi, hours, fishing_hours)

# Keep the rows that make up the top 90 percent of fishing_hours
who.fishs.the.most <- top.percent.by(df, 90, "fishing_hours")
print(who.fishs.the.most)
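
The steps listed in the Description can be sketched in base R as follows. This is an illustrative sketch only, not the fishRman source: the helper name top_percent_sketch and the toy data are made up, and it assumes that "reaching" the threshold means the cumulative sum is greater than or equal to it and that the column contains no missing values.

```r
# Hypothetical helper, NOT the fishRman implementation: a base-R sketch
# of the procedure described above.
top_percent_sketch <- function(df, percentage, by) {
  # Sort rows in descending order of the chosen column
  sorted <- df[order(df[[by]], decreasing = TRUE), , drop = FALSE]
  # Threshold: the chosen percentage of the column total
  target <- sum(sorted[[by]]) * percentage / 100
  # Running total down the sorted rows
  running <- cumsum(sorted[[by]])
  # First row at which the running total reaches the threshold
  # (assumption: "reach" means >=)
  n_rows <- which(running >= target)[1]
  # Fall back to all rows if the threshold is never reached,
  # for example because of floating-point rounding at 100 percent
  if (is.na(n_rows)) n_rows <- nrow(sorted)
  sorted[seq_len(n_rows), , drop = FALSE]
}

# Made-up data: the column total is 20, so the 90 percent threshold is 18
toy <- data.frame(
  mmsi = c("34534555", "25634555", "11111111"),
  fishing_hours = c(1, 9, 10)
)
top <- top_percent_sketch(toy, 90, "fishing_hours")
print(top)
```

With this toy data the sorted values are 10, 9, 1 and the running totals are 10, 19, 20, so the sketch keeps the first two rows. Whether top.percent.by itself treats an exact match with the threshold the same way is not stated in this page.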
[Package fishRman version 1.2.3 Index]