gradec.model.annotate_lda
- gradec.model.annotate_lda(dataset, dataset_nm, feature_group, n_topics=200, n_cores=1)
Annotate a Dataset with the results of an LDA model.
- Parameters:
dset (Dataset) – A Dataset with, at minimum, text available in the self.text_column column of its texts attribute.
n_topics (int) – Number of topics for the topic model. This corresponds to the model’s n_components parameter. Must be an integer >= 1.
dset_name (str) – Dataset name. Possible options: “neurosynth” or “neuroquery”.
data_dir (str) – Path to data directory.
n_cores (int, optional) – Number of cores to use for parallelization. If <=0, defaults to using all available cores. Default is 1.
- Returns:
dset (Dataset) – A new Dataset with an updated annotations attribute.
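
The documented rules for n_topics ("must be an integer >= 1") and n_cores ("if <=0, use all available cores") can be illustrated with a small standalone sketch. The helper names resolve_n_cores and validate_n_topics are hypothetical and are not part of gradec. The sketch only mirrors the parameter descriptions above and may differ from how the library handles these values internally.

```python
import os


def resolve_n_cores(n_cores: int = 1) -> int:
    """Hypothetical helper: apply the documented n_cores rule.

    Values <= 0 are mapped to the number of available cores;
    positive values are returned unchanged.
    """
    # os.cpu_count() can return None on some platforms, so fall back to 1.
    available = os.cpu_count() or 1
    if n_cores <= 0:
        return available
    return n_cores


def validate_n_topics(n_topics: int = 200) -> int:
    """Hypothetical helper: enforce that n_topics is an integer >= 1."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(n_topics, bool) or not isinstance(n_topics, int):
        raise ValueError(f"n_topics must be an integer, got {n_topics!r}")
    if n_topics < 1:
        raise ValueError(f"n_topics must be >= 1, got {n_topics}")
    return n_topics


if __name__ == "__main__":
    print("n_cores=1  ->", resolve_n_cores(1))
    print("n_cores=-1 ->", resolve_n_cores(-1))
    print("n_topics   ->", validate_n_topics(200))
```

Note that this sketch does not cap a positive n_cores at the number of available cores, since the description above does not say whether such a cap is applied.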