conllu_dev_q11_2_nltk {finnsurveytext}    R Documentation

Young People's Views on Development Cooperation 2012 q11_2 response data in CoNLL-U format with NLTK stopwords removed

Description

This dataset contains the Development Cooperation q11_2 responses in CoNLL-U format, with NLTK stopwords and punctuation removed.

Usage

conllu_dev_q11_2_nltk

Format

'conllu_dev_q11_2_nltk' is a data frame with 4407 rows and 14 columns:

doc_id

the identifier of the document

paragraph_id

the identifier of the paragraph

sentence_id

the identifier of the sentence

sentence

the text of the sentence that this token is part of

token_id

Word index, integer starting at 1 for each new sentence; may be a range for multi-word tokens; may be a decimal number for empty nodes.

token

Word form or punctuation symbol.

lemma

Lemma or stem of word form.

upos

Universal part-of-speech tag.

xpos

Language-specific part-of-speech tag; underscore if not available.

feats

List of morphological features from the universal feature inventory or from a defined language-specific extension; underscore if not available.

head_token_id

Head of the current word, which is either a value of token_id or zero (0).

dep_rel

Universal dependency relation to the HEAD (root iff HEAD = 0) or a defined language-specific subtype of one.

deps

Enhanced dependency graph in the form of a list of head-deprel pairs.

misc

Any other annotation.
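
As a rough illustration of how these columns fit together, the sketch below builds a small, hypothetical data frame with the same 14 column names (the rows are invented and are not taken from the survey data), counts noun lemmas using upos and lemma, and looks up each token's head word by matching head_token_id against token_id within the same sentence. It assumes that token_id and head_token_id are stored as character values; the same steps should also work if they are integers, and on the real dataset in place of the toy one.

```r
# Hypothetical toy rows using the same 14 column names as the dataset.
# All values are invented for illustration only.
toy <- data.frame(
  doc_id        = c("doc1", "doc1", "doc1", "doc2", "doc2"),
  paragraph_id  = 1L,
  sentence_id   = 1L,
  sentence      = c(rep("apu toimii hyvin", 3), rep("koulutus auttaa", 2)),
  token_id      = c("1", "2", "3", "1", "2"),
  token         = c("apu", "toimii", "hyvin", "koulutus", "auttaa"),
  lemma         = c("apu", "toimia", "hyvin", "koulutus", "auttaa"),
  upos          = c("NOUN", "VERB", "ADV", "NOUN", "VERB"),
  xpos          = c("N", "V", "Adv", "N", "V"),
  feats         = NA_character_,
  head_token_id = c("2", "0", "2", "2", "0"),
  dep_rel       = c("nsubj", "root", "advmod", "nsubj", "root"),
  deps          = NA_character_,
  misc          = NA_character_,
  stringsAsFactors = FALSE
)

# Frequency of each lemma among tokens tagged as nouns.
nouns <- toy[toy$upos == "NOUN", ]
noun_counts <- sort(table(nouns$lemma), decreasing = TRUE)

# Look up the head word of each token. A token is identified by its
# document, paragraph, sentence, and token_id; head_token_id points at
# another token in the same sentence, or is 0 for the root.
key      <- paste(toy$doc_id, toy$paragraph_id, toy$sentence_id, toy$token_id)
head_key <- paste(toy$doc_id, toy$paragraph_id, toy$sentence_id, toy$head_token_id)
toy$head_token <- toy$token[match(head_key, key)]

print(noun_counts)
print(toy[, c("doc_id", "token", "dep_rel", "head_token")])
```

Root tokens (head_token_id of 0) have no matching token_id, so their head_token is NA.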

Source

<https://urn.fi/urn:nbn:fi:fsd:T-FSD2821>


[Package finnsurveytext version 1.0.0 Index]