Analyse Text Documents Using Ecological Tools



Documentation for package ‘inpdfr’ version 0.1.12

Help Pages

inpdfr-package inpdfr: A package to analyse PDF Files Using Ecological Tools.
doCluster Performs a cluster analysis on the basis of the word-occurrence data.frame.
doKmeansClust Performs a k-means cluster analysis on the basis of the word-occurrence data.frame.
doMetacomEntropart Performs an analysis of ecological diversity and structure.
doMetacomMetacom Performs a metacommunity analysis.
excludeStopWords Excludes stop words from the word-occurrence data.frame.
exclusionList_FR Stop words in French.
exclusionList_SP Stop words in Spanish.
exclusionList_UK Stop words in English.
getAllAnalysis A quick way to compute a set of analyses from the word-occurrence data.frame.
getListFiles List files in a specified directory sorted by extension.
getMostFreqWord Returns most frequent words.
getMostFreqWordCor Test for correlation between the most frequent words.
getPDF Extract text from PDF files and return a word-occurrence data.frame.
getStopWords Load a list of stop words.
getSummaryStatsBARPLOT Draws a barplot of the number of unique words per document.
getSummaryStatsHISTO Plots a histogram of the number of words, excluding stop words.
getSummaryStatsOCCUR Plots a scatter plot of the proportion of documents using similar words.
getTXT Extract text from TXT files and return a word-occurrence data.frame.
getwordOccuDF A quick way to obtain the word-occurrence data.frame from a set of documents.
getXFreqWord Returns most frequent words.
IdentifyStructure Copy of the identifyStructure function from the metacom package by Tad Dallas.
inpdfr inpdfr: A package to analyse PDF Files Using Ecological Tools.
loremIpsum Lorem Ipsum text.
makeWordcloud Word cloud based on the word-occurrence data.frame.
mergeWordFreq Merge word-occurrence data.frames into a single data.frame.
postProcTxt Processes vectors containing words into a data.frame of word occurrences.
preProcTxt Extract text from txt files and pre-process content.
quitSpaceFromChars Delete spaces in file names.
truncNumWords Truncate the word-occurrence data.frame.
wordOccuDF Lorem Ipsum word occurrences.