Skeppstedt, Maria, Dr. ORCID iD: orcid.org/0000-0001-6164-7762
Publications (10 of 19)
Ahltorp, M. & Skeppstedt, M. (2024). 1 1/2 years of developing Word Rain. Paper presented at Swedish Language Technology Conference.
2024 (English). Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

The Word Rain visualisation technique is a development of the classic word cloud, providing more possibilities for analysis of the texts visualised. We here briefly describe the work carried out so far, the reasoning behind it, and what could lie ahead.

Keywords
word rain, text visualization, word cloud
National Category
Natural Language Processing
Research subject
Language Technology
Identifiers
urn:nbn:se:sprakochfolkminnen:diva-2843 (URN)
Conference
Swedish Language Technology Conference
Funder
Swedish Research Council, 2017-00626; Swedish Research Council, 2021-00176; Swedish Research Council, 2021-00181
Available from: 2024-11-29 Created: 2024-11-29 Last updated: 2025-09-05 Bibliographically approved
Skeppstedt, M. & Ahltorp, M. (2024). Using Topics2Themes and Word Rain to visualise topics in Swedish news on climate change. In: Vincent Vandeghinste and Thalassia Kontino (Eds.), CLARIN Annual Conference Proceedings. Paper presented at CLARIN Annual Conference (pp. 112-115). Barcelona, Spain
2024 (English). In: CLARIN Annual Conference Proceedings / [ed] Vincent Vandeghinste and Thalassia Kontino, Barcelona, Spain, 2024, p. 112-115. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

The classic word cloud remains a popular visualisation technique, also to use for more advanced text exploration and comparison tasks. However, since the standard word cloud does not provide any support for these kinds of analytical tasks, we have created the Word Rain visualisation technique, which is a development of the classic word cloud. The Word Rain technique positions paradigmatically similar words close to each other on the x-axis, which makes it easier to identify semantic word clusters and to carry out comparison tasks. We have previously applied the technique on several different tasks, and we here show how the Word Rain visualisation can support a topical analysis of the text collection content. We first apply the topic modelling tool Topics2Themes to a collection of texts on the subject of climate change, and then use the Word Rain technique to visualise the automatically extracted topics. The Word Rain visualisation applied on the entire text collection provides an overview of its content, sorted according to paradigmatic similarity. When also creating focused word rain visualisations for the extracted topics, a visual semantic profile for each one of the topics is created, which supports the tasks of understanding and comparing topics. We have, thereby, here provided yet an example of how the Word Rain technique can be practically used for visualising and exploring texts.

Place, publisher, year, edition, pages
Barcelona, Spain, 2024
Series
CLARIN Annual Conference Proceedings, ISSN 2773-2177
Keywords
word rain, text visualization, word cloud, climate change, topic modelling
National Category
Natural Language Processing
Research subject
Language Technology
Identifiers
urn:nbn:se:sprakochfolkminnen:diva-2839 (URN)
Conference
CLARIN Annual Conference
Funder
Swedish Research Council, 2017-00626; Swedish Research Council, 2021-00176; Swedish Research Council, 2021-00181
Available from: 2024-11-29 Created: 2024-11-29 Last updated: 2025-09-05 Bibliographically approved
Ahltorp, M. & Skeppstedt, M. (2024). Word Rain as a Service. In: Vincent Vandeghinste and Thalassia Kontino (Eds.), CLARIN Annual Conference Proceedings. Paper presented at CLARIN Annual Conference (pp. 22-25). Barcelona, Spain
2024 (English). In: CLARIN Annual Conference Proceedings / [ed] Vincent Vandeghinste and Thalassia Kontino, Barcelona, Spain, 2024, p. 22-25. Conference paper, Oral presentation with published abstract (Refereed)
Abstract [en]

Word Rain is a novel approach to the classic word cloud that uses word embeddings to make it useful for exploring the word content of a text or corpus. Downloading and running the code can, however, be prohibitively difficult or cumbersome for non-technical users and casual evaluation. Since Word Rain also requires a word embeddings model, the inexperienced or casual user would benefit greatly from a streamlined interface. We have therefore collected everything that is needed in a web based service and are making it available as a SWELANG K-centre resource.

Place, publisher, year, edition, pages
Barcelona, Spain, 2024
Series
CLARIN Annual Conference Proceedings, ISSN 2773-2177
Keywords
word rain, text visualization, word cloud
National Category
Natural Language Processing
Research subject
Language Technology
Identifiers
urn:nbn:se:sprakochfolkminnen:diva-2837 (URN)
Conference
CLARIN Annual Conference
Funder
Swedish Research Council, 2017-00626
Available from: 2024-11-29 Created: 2024-11-29 Last updated: 2025-09-05 Bibliographically approved
Ahltorp, M., Hessel, J., Eriksson, G., Skeppstedt, M. & Domeij, R. (2022). A Digital Swedish–Yiddish/Yiddish–Swedish Dictionary: A Web-Based Dictionary that is also Available Offline. In: Proceedings of the EURALI Workshop @LREC2022. Paper presented at LREC 2022.
2022 (English). In: Proceedings of the EURALI Workshop @LREC2022, 2022. Conference paper, Published paper (Refereed)
Abstract [en]

Yiddish is one of the national minority languages of Sweden, and one of the languages for which the Swedish Institute for Language and Folklore is responsible for developing useful language resources. We here describe the web-based version of a Swedish–Yiddish/Yiddish–Swedish dictionary. The single search field of the web-based dictionary is used for incrementally searching all three components of the dictionary entries (the word in Swedish, the word in Yiddish with Hebrew characters and the transliteration in Latin script). When the user accesses the dictionary in an online mode, the dictionary is saved in the web browser, which makes it possible to also use the dictionary offline.

National Category
Specific Languages
Identifiers
urn:nbn:se:sprakochfolkminnen:diva-2471 (URN)
Conference
LREC 2022
Funder
Swedish Research Council, 2017-00626
Available from: 2022-07-15 Created: 2022-07-15 Last updated: 2025-09-05 Bibliographically approved
Skeppstedt, M. & Robin, S. (2022). A Snapshot of Climate Change Arguments: Searching for Recurring Themes in Tweets on Climate Change. In: CLARIN Annual Conference Proceedings 2022. Paper presented at CLARIN Annual Conference 2022.
2022 (English). In: CLARIN Annual Conference Proceedings 2022, 2022. Conference paper, Published paper (Refereed)
Abstract [en]

We applied the topic modelling tool Topics2Themes to a collection of German tweets on the subject of climate change. Topics2Themes is currently being further developed and evaluated within Språkbanken Sam, which is a part of SWE-CLARIN. The tool automatically extracted 15 topics from the tweet collection. We used the graphical user interface of Topics2Themes to manually search for recurring themes among the eight tweets most closely associated with the topics extracted. Although the content of the tweets associated with a topic was often diverse, we were still able to identify recurring themes. More specifically, 14 themes that occurred at least three times were identified in the texts analysed.

National Category
Natural Language Processing
Identifiers
urn:nbn:se:sprakochfolkminnen:diva-2473 (URN)
Conference
CLARIN Annual Conference 2022
Available from: 2022-07-15 Created: 2022-07-15 Last updated: 2025-09-05 Bibliographically approved
Skeppstedt, M., Mattson, M., Ahltorp, M. & Domeij, R. (2022). Converting from the Nordic Terminological Record Format to the TBX Format. In: Proceedings of the TERM21 Workshop, Language Resources and Evaluation Conference (LREC 2022). Paper presented at Language Resources and Evaluation Conference (LREC 2022).
2022 (English). In: Proceedings of the TERM21 Workshop, Language Resources and Evaluation Conference (LREC 2022), 2022. Conference paper, Published paper (Refereed)
Abstract [en]

Rikstermbanken (Sweden’s National Term Bank), which was launched in 2009, uses the Nordic Terminological Record Format (NTRF) for organising its terminological data. Since then, new terminology formats have been established as standards, e.g., the Termbase eXchange format (TBX). We here describe work carried out by the Institute for Language and Folklore within the Federated eTranslation TermBank Network Action. This network develops a technical infrastructure for facilitating sharing of terminology resources throughout Europe. To be able to share some of the term collections of Rikstermbanken within this network and export them to Eurotermbank, we have implemented a conversion from the Nordic Terminological Record Format, as used in Rikstermbanken, to the TBX format.

National Category
Languages and Literature
Identifiers
urn:nbn:se:sprakochfolkminnen:diva-2472 (URN)
Conference
Language Resources and Evaluation Conference (LREC 2022)
Available from: 2022-07-15 Created: 2022-07-15 Last updated: 2025-09-05 Bibliographically approved
Skeppstedt, M., Domeij, R., Eriksson, G. & Öqvist, J. (2022). Digital humanities for the spreadsheet nerd: Presenting the output of a topic modelling tool as tabular data. In: DHNB 2022 Conference: Book of Abstracts. Paper presented at Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022).
2022 (English). In: DHNB 2022 Conference: Book of Abstracts, 2022. Conference paper, Oral presentation with published abstract (Refereed)
National Category
Other Humanities not elsewhere specified
Identifiers
urn:nbn:se:sprakochfolkminnen:diva-2470 (URN)
Conference
Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022)
Projects
Tilltal
Available from: 2022-07-15 Created: 2022-07-15 Last updated: 2025-09-05 Bibliographically approved
Skeppstedt, M., Ahltorp, M., Eriksson, G. & Domeij, R. (2021). A Pipeline for Manual Annotations of Risk Factor Mentions in the COVID-19 Open Research Dataset. In: Selected Papers from the CLARIN Annual Conference 2020. Paper presented at CLARIN Annual Conference 2020.
2021 (English). In: Selected Papers from the CLARIN Annual Conference 2020, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

We here demonstrate how a set of tools that are being maintained and further developed within the Språkbanken Sam and SWE-CLARIN infrastructures can be employed for creating manually labelled training data in a low-resource setting. As example text, we used the “COVID-19 Open Research Dataset”, and created manually annotated training data for its associated Kaggle task, “What do we know about COVID-19 risk factors?”. We first used our topic modelling tool to i) select a text set for manual annotation, ii) classify the texts into preliminary classification categories, and iii) analyse the texts in search for potential refinements of the annotation categories. We then annotated the text set on a more granular level by labelling the token sequences that indicated the existence of the refined categories in the text. Finally, we used the granularly annotated text set as a seed set, and applied our active learning tool for actively selecting additional texts for annotation. For the token-sequence annotations, we used our text annotation tool, which includes support for incorporating automatic pre-annotations.

National Category
Natural Language Processing
Identifiers
urn:nbn:se:sprakochfolkminnen:diva-2075 (URN)
Conference
CLARIN Annual Conference 2020
Funder
Swedish Research Council, 2017-00626
Available from: 2021-10-21 Created: 2021-10-21 Last updated: 2025-09-05 Bibliographically approved
Skeppstedt, M., Ahltorp, M., Domeij, R., Eriksson, G. & Öqvist, J. (2021). Mining for Recurring Themes in Speech Recording Descriptions. Paper presented at The 9th Swedish Workshop on Data Science.
2021 (English). Conference paper, Poster (with or without abstract) (Refereed)
National Category
Natural Language Processing
Research subject
Language Technology
Identifiers
urn:nbn:se:sprakochfolkminnen:diva-2217 (URN)
Conference
The 9th Swedish Workshop on Data Science
Projects
Tilltal; Nationella språkbanken
Funder
Riksbankens Jubileumsfond, SAF16-0917:1
Available from: 2021-12-12 Created: 2021-12-12 Last updated: 2025-09-05 Bibliographically approved
Skeppstedt, M., Domeij, R. & Skott, F. (2021). Snippets of Folk Legends: Adapting a Text Mining Tool to a Collection of Folk Legends. In: Post-Proceedings of the 5th Conference Digital Humanities in the Nordic Countries (DHN 2020). Paper presented at 5th Conference Digital Humanities in the Nordic Countries (DHN 2020).
2021 (English). In: Post-Proceedings of the 5th Conference Digital Humanities in the Nordic Countries (DHN 2020), 2021. Conference paper, Published paper (Refereed)
Abstract [en]

A topic modelling tool was adapted to requirements for a collection of Swedish folk legends. To offer an overview of a list of folk legend texts, which had been automatically extracted by the topic modelling tool, snippet text versions of the folk legends were displayed. The snippets were constructed from the full-text versions of the legends using the sentences most relevant to the topics extracted by the topic modelling algorithm. In addition, collection-adapted data was constructed for performing a pre-processing of the folk legend texts, before they were submitted to the topic modelling algorithm. This data consisted of a collection-adapted stop word list and word lists for improving the quality of clusters of semantically similar words.

National Category
Natural Language Processing
Research subject
Language Technology
Identifiers
urn:nbn:se:sprakochfolkminnen:diva-2074 (URN)
Conference
5th Conference Digital Humanities in the Nordic Countries (DHN 2020)
Projects
Nationella språkbanken
Funder
Swedish Research Council, 2017-00626
Available from: 2021-10-21 Created: 2021-10-21 Last updated: 2025-09-05 Bibliographically approved