- Apr 25, 2016 TextSTAT is a simple programme for the analysis of texts. It reads plain text files (in different encodings) and HTML files (directly from the internet) and it produces word frequency lists and concordances from these files.
- Jun 25, 2018 TextSTAT is text concordance software that runs on both Windows (XP or Vista) and Mac. According to the TextSTAT website the Windows version 'includes everything you need to use TextSTAT with Windows. It comes as a single installation file.' I haven't tested this.
textstat_frequency: Tabulate feature frequencies
Jan 08, 2021. In quanteda: Quantitative Analysis of Textual Data.
A related page (view source: R/textstat_simil.R) documents functions that compute matrices of distances and similarities between documents or features from a dfm and return them in a sparse format.
Description
Produces counts and document frequency summaries of the features in a dfm, optionally grouped by a docvars variable or other supplied grouping variable.
Usage
Arguments
x
a dfm object
n
(optional) integer specifying the top n features to be returned, within group if groups is specified
groups
either: a character vector containing the names of document variables to be used for grouping; or a factor or object that can be coerced into a factor equal in length or rows to the number of documents. NA values of the grouping value are dropped. See groups for details.
ties_method
character string specifying how ties are treated. See data.table::frank() for details. Unlike that function, however, the default is 'min', so that tied features share the smallest rank available to them and the following rank is skipped: two features tied for the greatest frequency, followed by a third, would be ranked 1, 1, 3 (for example, frequencies of 11, 11, 10).
...
additional arguments passed to dfm_group(). This can be useful in passing force = TRUE, for instance, if you are grouping a dfm that has been weighted.
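The 'min' tie handling described above can be illustrated with a small sketch. This is not quanteda or data.table code: rank_min is a hypothetical helper written in plain Python, and it assumes ranking runs from the greatest frequency down, as the Value section defines rank.

```python
def rank_min(frequencies):
    """Rank values so the largest gets rank 1; tied values share the smallest rank."""
    # Sort a copy from largest to smallest.
    ordered = sorted(frequencies, reverse=True)
    # The first position at which a value appears in the sorted list is its "min" rank.
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        first_position.setdefault(value, position)
    # Report ranks in the original input order.
    return [first_position[value] for value in frequencies]


print(rank_min([11, 11, 10]))    # [1, 1, 3]
print(rank_min([5, 3, 3, 1]))    # [1, 2, 2, 4]
```

Under this scheme a rank is skipped after each tie, which is why no feature receives rank 2 in the first example.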
Value
a data.frame containing the following variables:
feature
(character) the feature
frequency
count of the feature
rank
rank of the feature, where 1 indicates the greatest frequency
docfreq
document frequency of the feature, as a count (the number of documents in which this feature occurred at least once)
group
(only if groups is specified) the label of the group. If the features have been grouped, then all counts, ranks, and document frequencies are within group. If groups is not specified, the group column is omitted from the returned data.frame.
textstat_frequency returns a data.frame of features and their term and document frequencies within groups.
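To make the returned columns concrete, here is a minimal sketch of the same bookkeeping in plain Python. It is an illustration under stated assumptions, not the quanteda implementation: frequency_table is a hypothetical function, documents are assumed to be pre-tokenized lists of strings, a group label of None stands in for NA, and ties in output order are broken alphabetically (an arbitrary choice made here for stable output).

```python
from collections import Counter, defaultdict


def frequency_table(docs, groups=None, n=None):
    """Return rows with feature, frequency, rank, docfreq, and group if grouped."""
    # With no grouping, treat all documents as one unnamed group.
    labels = groups if groups is not None else [None] * len(docs)
    by_group = defaultdict(list)
    for tokens, label in zip(docs, labels):
        if groups is not None and label is None:
            continue  # documents with a missing group label are dropped
        by_group[label].append(tokens)

    rows = []
    for label, group_docs in by_group.items():
        # frequency: total occurrences; docfreq: documents containing the feature.
        frequency = Counter(tok for doc in group_docs for tok in doc)
        docfreq = Counter(tok for doc in group_docs for tok in set(doc))
        # Greatest frequency first; alphabetical order among ties (assumption).
        ordered = sorted(frequency.items(), key=lambda kv: (-kv[1], kv[0]))
        counts = [count for _, count in ordered]
        group_rows = []
        for feature, count in ordered:
            row = {
                "feature": feature,
                "frequency": count,
                "rank": counts.index(count) + 1,  # 'min' ties: first position wins
                "docfreq": docfreq[feature],
            }
            if groups is not None:
                row["group"] = label  # column present only when grouping
            group_rows.append(row)
        rows.extend(group_rows[:n])  # n=None keeps every feature in the group
    return rows


docs = [["a", "b", "a"], ["b", "c"], ["a", "c", "c"]]
for row in frequency_table(docs):
    print(row)
for row in frequency_table(docs, groups=["x", "x", "y"], n=1):
    print(row)
```

In the grouped call, counts, ranks, and document frequencies are computed separately inside each group, and n limits the rows kept per group rather than overall.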