Group members: Maija Absetz, Kari Jalonen, Anne Järvinen, Mikko Koho, Matti La Mela, Rafael Leal, Joni Oksanen, Anna Ristilä, Laura Sinikallio, Jouni Tuominen
Our research theme is the development of city image in Finnish parliamentary discussions in the late 20th century. More than in what is happening, we are interested in the meaning of places, as the focus on city image suggests. Places have a concrete reference to coordinates on the map, which we have utilized, but they also refer to ideas, sentiments and images of what a place is thought to be. Our concept of a place is therefore much broader than a concrete location on the map. When considering the meaning-making of cities, we must recognise that every speech concerning places in the plenary session is also a speech act that shapes and modifies the meaning of that place. How cities are discussed in parliamentary debates matters greatly, since the distribution of state subsidies is decided in the great hall of the parliament. The image of a city as a city of art, technology, industry, or gloom and doom can decide whether it receives more investment or is forgotten as a lost cause.
In our project, we were especially interested in whether we can see change during the research period. What better way to bring out change than to focus on crises? The timing of a crisis is a complex thing, but when comparing, for example, the change in gross domestic product (GDP) and unemployment rates, the importance of the 1990s depression comes into focus (see Statistics Finland). Our focus for places in parliamentary speeches narrowed down to the two latest economic crises: the years 1986 to 1995 around the 1990s crisis and 2004 to 2013 around the financial crisis. We also used the oil crises of the 1970s (1969-1978) as a point of comparison. Our research questions were:
- What cities are mentioned in parliamentary speech in Eduskunta, the Parliament of Finland?
- What changes can we find in the speech during the moments of crisis? In particular, are we able to find to which cities the crisis was attributed?
- Are we able to tackle these questions by employing the semparl linked open data, which includes recognised named-entities, and applying methods in sentiment analysis, topic modeling, and semi-automatic reading to our data?
When it comes to tackling crises, cities have been raised to prominence alongside states. Cities as administrative units are more flexible in choosing their approaches to problems, in everything from birth control and energy consumption to mental health care solutions. At the same time they are highly dependent on state subsidies funded from tax money, especially if their own tax pool is small compared to the taxpayers of the area. The topic is also relevant in the sense that the 1990s depression still strongly influences the self-conception of many Finns and of Finland. Why not of cities as well? Comparing the 1990s depression to the 2008-2009 financial crisis gives us more information on the sentiment around the 1990s.
What our research showed is that Finland is a very diverse country, with different problems in the capital city area compared to more distant and rural cities. Helsinki is overrepresented in all the analyses and therefore risks hiding other differences in the data.
With the help of named entities, we were able to pinpoint and visualize the parliamentary speeches where Finnish cities were mentioned. We also found differences in the extent to which MPs of different political parties address different cities in their speech. For the period of 1986-1995, for instance, we find that the key national parties, the Centre Party and the Social Democratic Party, spoke about the broadest range of cities, and that smaller parties focused on specific cities that have been relevant to their agenda. In our sentiment analysis, we found that the moments of crisis did not produce more sentimental speaking; on the contrary, the crisis of the 1990s led to a decline in sentiment words about the cities. Our hypothesis is that crises are handled in more neutral terms and that the crisis of the 1990s toned down the political language in the Parliament.
Finally, based on these results on all cities, we focused on selected cities where the crises appeared to have an impact on parliamentary speech. We found differences in how stable the city images were during the crises. For the period of 1986-1995, our results with TF-IDF (term frequency–inverse document frequency) show vocabulary that is city-specific. For some cities, we observe crisis-related words appearing in the early 1990s, for example in relation to housing (Helsinki) or unemployment (Vantaa). Finally, we conducted close reading with keyword searches. These cases, which were identified with the big data analysis, helped us to elaborate on the question of how the cities managed the crises and what meanings were attributed to them.
DHH21 included communication in social media, mainly Twitter. Not every member had their own Twitter account, so we created a Twitter account for the project. You can find us @citiesinparl and with hashtag #semparl.
This year the Digital Humanities Hackathon was held completely online due to Covid-19. For sharing data we used Google Drive provided by the organizers, so that all members had access to the data. For sharing code we used Google Colab and GitHub. Google Colab has the benefit that it requires no software installation and that scripts can be run directly in the Colab file. Our GitHub repository can be found at github.com/dhh21/semparl.
Our source material was the plenary speeches of the Parliament of Finland (PoF) from its start in 1907 until early 2021, covering the whole of PoF’s history. More specifically, the data contained all speeches and utterances transcribed in PoF’s plenary session minutes and various related metadata. The metadata included temporal information and speaker information such as name and party. This data was provided by the Semantic Computing research group, and access to it is currently limited as it is still partially work in progress. The dataset is described in detail in Sinikallio et al. (2021) and Leskinen et al. (2021). The data is stored in RDF (Resource Description Framework) and was accessed either with SPARQL queries or through a prototype portal that requires no technical know-how.
The data also contains information on named entities. Each speech in the data has gone through named entity recognition (NER), which was done using the upgraded Nelli tool (Sinikallio et al. 2021, Tamper et al. 2020). Named entities have been divided into several categories, of which two were of special interest to us: <referenceToPlaceName> and <referenceOrganizationName>. The first one covers references to geographical and political locations (Helsinki, Finland). The second one includes different organizations, and as some of these are very strongly connected to certain locations (Helsingin kaupunki), we decided to include them in our scope.
Our final corpus included the speeches for the three crises:
- The 1990s crisis (1985-1995) corpus contained 14 864 speeches about cities from a total of 135 427 speeches held in Eduskunta.
- The 2008 Financial crisis (2004-2013) corpus contained 21 101 speeches about cities from a total of 160 909 speeches held in Eduskunta.
- The Oil crisis of the 1970s (1969-1978) corpus contained 9 405 speeches about cities from a total of 113 931 speeches held in Eduskunta.
The quality of FINBERT NER tagging regarding locations was evaluated by sampling 200 speeches per year from timespan of 1985-1995. The sample consisted of 2000 speeches with a total of 17 082 observations to check. We manually evaluated the quality of 2147 (12.57%) entries by evenly checking speeches from each year included in the sample. There were 250 location entities identified by NER algorithm in our sample. Our manual evaluation showed that NER recognized places with precision of 85.6% and recall of 96.2% resulting in a F-score 90.7%.
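As a reminder of how the three evaluation figures relate, the sketch below computes the balanced F-score as the harmonic mean of precision and recall. It uses only the rounded percentages quoted above; the underlying counts of true and false positives are not given in the text, so the function name and the inputs are illustrative assumptions.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded figures quoted in the text; the raw counts are not available here.
reported_precision = 0.856
reported_recall = 0.962

f1 = f_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")
```

With the rounded inputs this gives roughly 0.906, in line with the reported F-score; a small difference of this size can arise when the inputs are already rounded.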
2.1 Data Extraction
In this project, we utilized three different interfaces of the Semantic Parliament for data collection. For data extraction purposes, we first used a YASGUI SPARQL portal to establish the query structure and the desired information content. YASGUI is a very user-friendly environment that produces easy-to-read layouts and was thus well suited for the task. However, we also needed to include variables in the queries, so the final data extraction was done with Python.
After the NER quality evaluation, a Python script with a SPARQL wrapper was produced. Since NER tagging proved to be trustworthy, all speeches containing a place or an organization NER tag of a Finnish city were collected. The quality analysis had previously shown that place-related NER tags included tags that refer to broader concepts, such as Mikkelin lääni (the Mikkeli Province), which would add noise to the data. To overcome this, we embedded in the SPARQL query the surface forms to include and to exclude, which had been collected earlier in the quality analysis. In the end, for each city, all speeches containing that city were collected from the database. The data extracted per speech is described in Table 1.
Table 1. Data extracted per speech.
|A unique identifier of speech|
|A unique identifier of speaker|
|The content of the speech given in the parliament|
|The surface forms of the city found in the data|
We performed the SPARQL queries for speeches mentioning cities for all three time spans covering economic crises. The final data set contained a total of 45 370 speeches, of which 29 039 were distinct. The data extraction script is available on GitHub but, for the time being, access to the Semantic Parliament database is restricted.
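The idea of embedding include and exclude surface forms into a parameterized query can be sketched as follows. This is not the project's extraction script: the `ex:` prefix, the property names and the substring-matching rule are all hypothetical placeholders, since the actual schema of the restricted database is not described here. The function only builds the query string, which a SPARQL client could then send to an endpoint.

```python
def sparql_literal(text: str) -> str:
    """Quote a Python string as a SPARQL string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_city_query(include_forms, exclude_forms, start_year, end_year):
    """Build a query for speeches whose tagged surface form matches a city.

    Assumption: a mention is kept if its lowercased surface form contains
    any include form and none of the exclude forms.
    """
    if not include_forms:
        raise ValueError("at least one surface form to include is required")

    def contains(form):
        return f"CONTAINS(?f, {sparql_literal(form.lower())})"

    conditions = ["(" + " || ".join(contains(f) for f in include_forms) + ")"]
    if exclude_forms:
        conditions.append("!(" + " || ".join(contains(f) for f in exclude_forms) + ")")

    lines = [
        # Hypothetical namespace and properties, for illustration only.
        "PREFIX ex: <http://example.org/hypothetical-schema#>",
        "SELECT DISTINCT ?speech ?speaker ?content ?form WHERE {",
        "  ?speech ex:speaker ?speaker ;",
        "          ex:content ?content ;",
        "          ex:date ?date ;",
        "          ex:namedEntity/ex:surfaceForm ?form .",
        "  BIND(LCASE(STR(?form)) AS ?f)",
        "  FILTER(" + " && ".join(conditions) + ")",
        # Assumes ?date is a typed date value that YEAR() can read.
        f"  FILTER(YEAR(?date) >= {start_year} && YEAR(?date) <= {end_year})",
        "}",
    ]
    return "\n".join(lines)


query = build_city_query(["Mikkeli"], ["Mikkelin lääni"], 1986, 1995)
print(query)
```

Keeping the surface-form lists as function arguments makes it straightforward to loop over cities and time spans, which is the reason the text gives for moving from the YASGUI portal to Python.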
2.2 Data Visualization
In Figure 3, the animated map shows the annual change in how many times a city is mentioned in plenary sessions. The grey-colored municipalities are not included in the study. The number of mentions is quite low, and the temporal effect on the number of mentions mostly affects the biggest cities and the regional centers. The number of mentions in plenary sessions peaks in 1989. The map was created using the R package geofi.
Figure 4 is a similar map for the time period focused on the financial crisis of 2008. As in the 1990s recession years, the largest cities are mentioned distinctly more often than the smaller municipalities. Cities in general were mentioned most often in the plenary sessions in 2009.
Initially we tried to normalize the number of mentions by municipal population (Statistics Finland, 2021), to observe which cities are discussed more often than we would expect based on the national significance or impact of the city. We also briefly considered using statistics related to municipal economy, but we were not able to find a statistic that would represent the state of the economy comprehensively. Our time period of 10 years is relatively short for the economy, and we could not rule out other external factors that could affect such statistics. We first computed the number of mentions per capita for each city included in our study, but the adjustment mostly affected the highly populated cities: relative to their population, the biggest cities would have had to be discussed even more than they were during the time periods of our study. We then expressed the figure as a rate per 100 000 inhabitants, a scale widely used in epidemiology to follow prevalence. However, even using the rate per 100 000, we cannot observe any coherent pattern in the rate of city mentions, as there is no clear trend in either of the 10-year periods.
Figure 5. Cities mentioned in plenary sessions of Finnish parliament (1986-1995). Note: No. mentions per 100 000 inhabitants.
Figure 6. Cities mentioned in plenary sessions of Finnish parliament (2004-2013). Note: No. mentions per 100 000 inhabitants.
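The population adjustment used in Figures 5 and 6 is a simple rescaling, sketched below. The city names and numbers are made up for illustration and are not taken from the project data; the function name is likewise an assumption.

```python
def mentions_per_100k(mentions: int, population: int) -> float:
    """Number of mentions per 100 000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return mentions / population * 100_000


# Made-up illustrative values: (mentions in a year, population).
example = {
    "City A": (120, 500_000),
    "City B": (15, 30_000),
}

rates = {city: mentions_per_100k(m, p) for city, (m, p) in example.items()}
for city, rate in rates.items():
    print(f"{city}: {rate:.1f} mentions per 100 000 inhabitants")
```

In this toy case the smaller city has fewer absolute mentions but a higher rate, which is the kind of shift the adjustment is meant to reveal.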
We also created an interactive UI (jonioks.shinyapps.io/citiesinparl), built with R and Shiny, for users to explore the data on cities mentioned in plenary sessions. The web UI enables users to select which time period or economic crisis they want to observe and which exact year, as well as to choose whether the values are displayed as absolute values or adjusted per capita. We had plans to integrate many more options, such as selections enabling users to choose parliamentary groups or view the map of Finland by electoral districts. Implementing these options proved to be more laborious than we expected, but we hope to add these features in the future.
Finnish is a morphologically rich language, and inflected word forms present a problem for analysis methods. To address this we had our corpus lemmatized. For lemmatization we used the Turku Neural Parser, one of the most widely used lemmatizers for Finnish.
3.1 Sentiment analysis
Sentiment analysis is used for detecting emotion intensity, positivity/negativity or more precise emotions in large text corpora. Typical use cases are customer feedback or social media monitoring, because in such texts emotions are expressed very openly and strongly and are thus easy to detect. Today, sentiment analysis and position-taking analysis are also widely used for parliamentary data (Abercrombie & Batista-Navarro 2020). There has even been a study on Finnish parliamentary data examining differences in sentiment between the prime minister's party, coalition parties and the opposition (Proksch et al. 2019).
For the needs of this project we used a modified version of the Finnish Emotion Intensity Lexicon (FEIL), which is translated from the NRC Emotion Lexicon into Finnish (Öhman 2021a). FEIL is built on the Sentiment and Emotion Lexicon for Finnish (SELF), which also gives a value of positivity or negativity to the word, whereas FEIL only gives a score for how intense the emotion associated with the word is (a value between 0 and 1). (Öhman 2021a, 1-2)
The Finnish lexicon is constructed from the NRC Emotion Lexicon, which was originally in English but has been translated into more than a hundred languages with Google Translate. Translation errors were tackled with synonyms or simple removal, which led to the Finnish lexicon being 10.5% shorter than the original, resulting in a total of 7291 words. Language cannot be separated from the culture around it, which leads to specific problems when translating into Finnish. The intensity levels in FEIL were not changed from the original wording, which can lead to misleading intensities. (Öhman 2021a, 2-3)
A Python script was used to recognize FEIL words in lemmatized parliamentary speech sentences that mention one or more cities. A small list of words related to parliamentary work was excluded from the FEIL lexicon, since such words do not carry emotion in the parliamentary context (Table 2).
Table 2. Parliamentary-specific stopwords used in our data extraction.
The aim of this analysis was to crudely measure the amount of emotion related to discussions about cities. The FEIL lexicon included specific sentiment intensities for all words in it but some words had more than one emotion (e.g. fear, anticipation etc.) and sentiment intensity linked to them. In such cases we calculated the average of such intensity scores for simplicity’s sake. The result was a lexicon with just one general emotion intensity score per word with no regard to positivity or negativity, which was used with the Python code to calculate the number of sentiment words in a sentence and a normalized sentiment intensity score for each sentence.
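The two steps described above, collapsing the lexicon to one score per word and scoring each sentence, can be sketched as below. This is an illustration rather than the project's script: the lexicon entries and intensity values are made up, and dividing the summed intensity by sentence length is an assumed normalization, since the text does not specify how the score was normalized.

```python
from collections import defaultdict


def collapse_lexicon(entries):
    """Average the intensity scores of words that carry several emotions.

    entries: iterable of (word, emotion, intensity) triples.
    Returns a dict mapping each word to one mean intensity.
    """
    scores = defaultdict(list)
    for word, _emotion, intensity in entries:
        scores[word].append(intensity)
    return {word: sum(vals) / len(vals) for word, vals in scores.items()}


def score_sentence(lemmas, lexicon, stopwords=frozenset()):
    """Return (number of sentiment words, normalized intensity score).

    Assumption: the score is the summed intensity divided by the number
    of lemmas in the sentence.
    """
    hits = [lexicon[l] for l in lemmas if l in lexicon and l not in stopwords]
    normalized = sum(hits) / len(lemmas) if lemmas else 0.0
    return len(hits), normalized


# Made-up toy lexicon entries, not actual FEIL values.
toy_entries = [
    ("pelko", "fear", 0.8),
    ("pelko", "anticipation", 0.6),
    ("ilo", "joy", 0.9),
    ("puhemies", "trust", 0.3),
]
lexicon = collapse_lexicon(toy_entries)
parliamentary_stopwords = frozenset({"puhemies"})  # hypothetical example word

sentence = ["arvoisa", "puhemies", "pelko", "kasvaa", "helsinki"]
n_words, score = score_sentence(sentence, lexicon, parliamentary_stopwords)
print(n_words, round(score, 3))
```

Because the collapsed lexicon ignores polarity, the resulting score measures only how emotionally loaded a sentence is, not whether it is positive or negative.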
3.2 Topic modelling
For topic modelling we applied Latent Dirichlet Allocation (LDA), as provided in scikit-learn, to the lemmatized speeches. The LDA algorithm does not determine the number of topics by itself, so we set it to 20. For each topic we extracted the top 20 words and analysed them to find topic labels. For this project, the following topics stand out:
Table 3. Most prevalent words for central topics.
We found the 20 topics to be very coherent in content and surprisingly representative of important themes related to the role of cities in parliamentary discussions: 19 of our 20 topics were meaningful and easy to link to societally important debates and to aspects relevant to our research questions. All in all, our results are encouraging for the applicability of topic modeling to the analysis of parliamentary debates.
Finally, we compared the prevalence of these topics during the three time spans. We did this by analyzing the average topic probabilities (p values) of our topics for each year in our data.
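One way to compute such yearly averages, assuming each speech comes with a year and a vector of topic probabilities from the fitted model, is sketched here. The function name and the toy values are illustrative, not the project's data.

```python
from collections import defaultdict


def mean_topic_weights_by_year(years, doc_topics):
    """Average the per-speech topic probabilities within each year.

    years: one year per speech; doc_topics: one probability vector per speech.
    Returns a dict mapping year -> list of mean topic probabilities.
    """
    sums = {}
    counts = defaultdict(int)
    for year, weights in zip(years, doc_topics):
        if year not in sums:
            sums[year] = [0.0] * len(weights)
        sums[year] = [s + w for s, w in zip(sums[year], weights)]
        counts[year] += 1
    return {
        year: [s / counts[year] for s in sums[year]]
        for year in sorted(sums)
    }


# Made-up toy values: three speeches, two topics.
toy_years = [1990, 1990, 1991]
toy_doc_topics = [[0.5, 0.5], [1.0, 0.0], [0.25, 0.75]]

prevalence = mean_topic_weights_by_year(toy_years, toy_doc_topics)
print(prevalence)
```

Plotting one line per topic from the resulting dictionary gives a prevalence chart over the years of a period.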
3.3 Studying meaning given to cities with TF-IDF
The TF-IDF algorithm tries to characterize documents by highlighting their outstanding keywords. It works by muffling words that are common throughout the dataset in favour of more unique terms. So words that are rare in the dataset but appear multiple times in a document are taken to be meaningful and representative of this document.
This algorithm was used to understand which uncommon words were used in the same sentence as specific cities as a proxy to understand how these cities are talked about in Parliament. The dataset is divided by city and year: each document contains all sentences referring to a city in a certain year. All 107 Finnish cities received their own documents. The lemmatized texts were converted to lowercase and had their stopwords filtered out.
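The weighting can be illustrated with a small pure-Python sketch in which each document is the list of lemmas found in sentences about one city in one year. It uses the textbook form, term frequency times log(N / document frequency); library implementations such as scikit-learn apply a smoothed variant, and the text does not say which variant the project used. The toy documents are made up.

```python
import math
from collections import Counter


def tfidf_top_terms(documents, top_n=5):
    """Rank the lemmas of each document by TF-IDF.

    documents: dict mapping (city, year) -> list of lemmas.
    Returns a dict mapping (city, year) -> list of (lemma, score) pairs.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for lemmas in documents.values():
        doc_freq.update(set(lemmas))  # count each lemma once per document

    result = {}
    for key, lemmas in documents.items():
        counts = Counter(lemmas)
        total = len(lemmas)
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result[key] = ranked[:top_n]
    return result


# Made-up toy documents, for illustration only.
toy_documents = {
    ("Helsinki", 1990): ["asunto", "asunto", "hitas", "kaupunki"],
    ("Vantaa", 1990): ["työttömyys", "kaupunki", "työttömyys"],
    ("Tampere", 1990): ["teollisuus", "kaupunki"],
}
top = tfidf_top_terms(toy_documents)
print(top[("Helsinki", 1990)])
```

In the toy data the lemma that occurs in every document receives a score of zero, which illustrates why a nationwide theme is hard to see with this method.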
3.4 Close reading places
With the full dataset of cities and speeches, searches with keywords connected to place names were possible. Since our interest is in the resilience of cities through crises, we chose keywords concerning crises (unemployment, crisis, depression); the results are shown in Figure 7 below.
Figure 7. Search hits with keywords related to crises (crisis, depression, unemployment).
In these figures, some cities appear more frequently than others. For each keyword the top 6 cities were chosen, with some overlap between the lists. Only Helsinki appeared in all of the lists, which is why it is shown separately (Figure 8). With close reading we can test whether the figures actually capture the idea we were after.
Figure 8. Search hits with keywords related to crises (unemployment in blue, depression in orange, crisis in grey) for Helsinki.
Close reading reveals that unemployment was indeed an issue concerning Tampere and the area around it (Pirkanmaa). On the other hand, the 1993 file containing all speeches concerning Tampere showed that only 13 of the hits directly connected Tampere and unemployment. This shows that unemployment is one theme inside the Tampere discourse in Figure 7, but that a simple speech-level search for both the city and unemployment is not very specific in showing how these factors are interconnected.
Unemployment rates exceeding 20% were mentioned multiple times. Along with unemployment rates, Tampere was often mentioned as an industrial city (“teollisuuskaupunki”). Tampere was also referred to as a city of automation and art. A major part of the direct topics around Tampere concerned roads, railroads, hospitals and the university. Although these are not directly connected to unemployment, they are very much intertwined with the 1990s crisis. One of the speeches even says out loud that state funding of the city should be debated openly and that it is regionally flawed.
The conversation about state funding is especially strong in the speeches concerning Helsinki from 1993. Of the 1993 results concerning Helsinki, only three directly concerned unemployment. Rather than from unemployment, the 1993 peak for Helsinki comes from other factors. The biggest issue revolved around state subsidies to municipalities. Here the division between Helsinki and the rest of the country is directly visible. On the one hand, Members of Parliament complain that Helsinki is depicted as a scapegoat for all problems and that its contribution as an engine of the economy and innovation is not recognised. Helsinki also faced the same problems as the rest of the country, and the speeches voice that these problems are neglected in general discussion. On the other hand, a clear sense of injustice is voiced that Helsinki gets all the benefits whereas the rest of the municipalities must pay the price of the crisis.
Strange peak of Kemi
The figure showing results for the search word “crisis” had the most variance, and the general trend following the 1990s and 2008 economic crises is more scattered. Most peculiar is the 1986 peak, which other cities follow but where Kemi is most visible. Close reading revealed that the Chernobyl power plant disaster in 1986, rather than any crisis concerning the city of Kemi, fuelled the crisis topic that year. In a single mention, the state-owned corporation Kemi was described as suffering from problems dating back to the 1970s. Other than that, no visible link to Kemi was found; the speeches instead dealt with crisis legislation prompted by the international crisis of the Chernobyl disaster. Surprisingly, the discussion closely mirrored our modern Corona crisis: on the one hand the problem of reconciling citizens' right to move freely with economic freedom, and on the other hand the state's difficulty in tackling a problem if it is given no power to limit human interaction.
4. Results and Discussion
4.1 Use of the Semantic Parliament (Parlamenttisampo)
This project was part of the Helsinki Digital Humanities Hackathon 2021, but the Semantic Parliament is by no means restricted to this hackathon project. Still in progress, it is a consortium research project of the University of Helsinki, the University of Turku and Aalto University, bringing together expertise from different research domains as it combines humanities, social science and computer science (https://seco.cs.aalto.fi/projects/semparl/). The data used in the Semantic Parliament is open data from the Finnish Parliament. The end product of this open data is the web interface ParliamentSampo, which is similar to other Sampo projects such as BiographySampo and AcademySampo.
ParliamentSampo (in future: Parlamenttisampo.fi) is a work-in-progress search engine, where the user can search the whole dataset from the year 1907 to 2021. As an end product, ParliamentSampo is the easiest-to-access tool for anyone interested in Finnish parliamentary debates. The long time span is especially attractive to historians, who up until now have had to find and go through relevant data manually. The data has been open but scattered, as part of it has been born digital or digitised and part of it is found only in print. Now these documents are not just digitised but also converted into machine-readable text with optical character recognition (OCR), so that free-text searches are possible over the whole time period. In addition to free-text searches, background information on each document is provided automatically, such as the date of the speech, the speaker, the party of the speaker and the place names mentioned in the speech.
With these simple tools, what can a historian do? Simple truncated one-word searches or phrase searches can show us when a certain term emerged. This is an excellent tool for mapping and locating catachrestical moments in original sources. For example, when dating the conversation around basic income, the first hit with the full word “perustulo” (Eng. basic income) came from the year 1969. With a truncated search, “basic income” got mixed up with “basic customs” because OCR mistakes are more frequent in the older documents. Access to the transcribed speech shows that the term was at first restricted to the social security of disabled war veterans, and that only after 1986 is “perustulo” connected more broadly to social security in general. With the older term “kansalaispalkka” (citizen salary) we find results already from the year 1980. Paula Eenilä’s speech presents the main components of the modern basic income debate: that minimum basic security should cover all those who receive some form of social security benefit. The speech hints at the birth of a new term with this sentence: “Perhaps this benefit could be called simply as citizen salary, if we want to stick with salary.” (26.09.1980, Vp 1980 – istunto 83 – puhe 14 (Paula Eenilä)) Only further investigation could reveal whether Paula Eenilä’s utterance is indeed the first public use of the term, but it can be said that this is the first time it was brought up in parliamentary debates and therefore made a general topic. Close reading of these documents not only suggests a starting point for a new term but also enables analysing how it changes over time. This approach requires prior knowledge of the terms and is in the end suitable only for qualitative research. The YASGUI SPARQL interface, which is built into end products such as AcademySampo, opens up possibilities for a more quantitative approach, as was done in this project.
4.2 Sentiment analysis
The results derived from the sentiment analysis scores cannot be fully trusted, for several reasons. The main problem is the lexicon itself, which was originally translated from English and is intended for different contexts (Öhman 2021b, 48). A contextualized lexicon created specifically from the parliamentary speech data would have been preferable, but was unattainable due to strict time restrictions. Another aspect to keep in mind is the simplicity of the calculations. Machine learning approaches generally outperform lexicon-based approaches (Abercrombie & Batista-Navarro 2020), but it would have taken a long time to train and implement a model. Since no one in our group had previous experience with sentiment analysis, we chose to keep the analysis as simple as possible.
Though the results are unreliable, some speculation is still possible. We inspected the years around the three economic crises described earlier to see if any correlation emerged. Figure 9 depicts the unemployment rates and sentiment words per year. The 1990s crisis can be seen as a large rise in unemployment levels but a decline in emotional words.
Figure 9. Total sentiment words in sentences where the city is mentioned during the three periods.
Two alternative (speculative) explanations arise: either the economic depression really has led people to speak more plainly, as is the case with clinically depressed people (e.g. Yazdavar et al. 2017), or the speech has concentrated on economics, which as a topic is probably not spoken about with emotion words.
With a better lexicon and a machine learning model we could have gotten a lot more from this data. Anna Ristilä is currently writing a dissertation concentrating on the same parliamentary data and has plans to further develop the sentiment analysis approach. We could also have derived some case-specific information with close reading but, again, time did not allow this.
4.3 TF-IDF analysis for Helsinki
Table 4. Top keywords using TF-IDF for Helsinki (1986-1994)
Table 4 above shows the top keywords for the emblematic city of Helsinki during the 10-year period around the crisis of 1990. Since TF-IDF searches for uncommon words and the crisis was national — i.e., present in the whole dataset — it cannot be easily seen from the data. Asunto ‘housing’ is a yearly concern when talking about the city before the crisis and peaks in 1990 with the addition of the Helsinki-specific Hitas, a housing price-and-quality control system, and vuokra ‘rent’ in 1991. However, after this latter year the theme vanishes from the list. At the same time, the keyword sanoma, referring to Helsingin sanomat, a newspaper, takes the top position in 1991 and keeps it in all years following the crisis.
There are other references to institutions and places in the capital of Finland, such as yliopisto ‘university’, kaupunki ‘city’ and lentoasema ‘airport’. The discussion about the Finnish European Union membership referendum in 1994 can also be seen, with the words EU, Eurooppa ‘Europe’, unioni ‘union’ and Suomi ‘Finland’ appearing in that year’s data.
A more careful analysis would refer to the raw counts of these words in the dataset overall as well as consider their textual context. This was not done due to lack of time.
4.4 Topic modeling
The figure below shows an excerpt of the prevalence of our key topics during our second period, the 1990’s recession.
Figure 10. Prevalence of key topics, 1986-1995. Note: Topic 0: Budgeting, taxation; Topic 7: Research and universities; Topic 8: Future and development; Topic 13: National economics; Topic 14: Wellbeing, families; Topic 17: Employment, econ. development.
We found surprisingly little variation in the prevalence of our chosen topics during the dramatic recession years (compare this figure with the unemployment rates included in Figure 9 above). This is probably due to what would be ‘noise’ for our research question: the broad significance of cities for parliamentary discussions, which leads to a variety of meanings and discussions being included in our data. While a detailed analysis of these results would require more reliability checks and the evaluation of alternative explanations, we were able to derive some conclusions from these preliminary results.
Unsurprisingly, discussion of finances was central to parliamentary discussions related to cities during our period of analysis. However, this discussion changed between 1986 and 1995; discussion of our topic 13 related to national economics became less visible in city-related debates, and (more concrete) discussion on budgeting and taxation gained dominance. This may reflect a stronger focus on the daily survival and liquidity of local government and the state over long-term issues.
On a national level, the discussion of wellbeing and unemployment does not pick up until 1995, when the recession is already easing off. This suggests that we should broaden our period of analysis to include more of what has been referred to as the long tail of the 1990s recession. Nevertheless, a national-level study should be complemented with an analysis of the different ways in which the recession impacted cities; we outline this as a suggestion for further research below.
For further research, we envision using descriptions of (theoretically derived) groups of cities, such as industrial towns, major cities, or cities suffering from higher-than-average increases in unemployment. Moreover, including covariates in the topic model, for example through structural topic modeling, would enable us to recognize the different thematic ways in which different cities (or groups of cities) are described in parliament.
Abercrombie, G., Batista-Navarro, R. Sentiment and position-taking analysis of parliamentary debates: a systematic literature review. J Comput Soc Sc 3, 245–270 (2020). https://doi.org/10.1007/s42001-019-00060-w
Kainu, M., Lehtomäki, J., Parkkinen, J., Miettinen, J., Kantanen, P. and Lahti, L. Retrieval and analysis of open geospatial data from Finland with the geofi R package. R package version 1.0.00002. https://ropengov.github.io/geofi/
Leskinen, P., Hyvönen, E. & Tuominen, J. (2021) Members of Parliament in Finland Knowledge Graph and its Linked Open Data Service. Submitted. https://seco.cs.aalto.fi/publications/2021/leskinen-et-al-mps-2021.pdf
Magnusson M., Kainu, M. Huovari, J. and Lahti L. (rOpenGov). pxweb: R tools for PXWEB API. http://github.com/ropengov/pxweb
Proksch, S.-O., Lowe, W., Wäckerle, J. and Soroka, S. (2019), Multilingual Sentiment Analysis: A New Approach to Measuring Conflict in Legislative Speeches. Legislative Studies Quarterly, 44: 97-131. https://doi.org/10.1111/lsq.12218
Sinikallio, L., Drobac, S., Tamper, M., Leal, R., Koho, M., Tuominen, J., La Mela, M. & Hyvönen, E. (2021) Plenary Debates of the Parliament of Finland as Linked Open Data and in Parla-CLARIN Markup. Data and Knowledge (LDK 2021). https://seco.cs.aalto.fi/publications/2021/sinikallio-et-al-speeches-2021.pdf
Statistics Finland (2021). “Kuntien tunnusluvut muuttujina Alue, Tunnusluku ja Vuosi.” [Data accessed 2021-05-27 11:53:29 using pxweb R package 0.10.4]
Tamper, M., Oksanen, A., Tuominen, J., Hietanen, A. & Hyvönen, E. (2020) Automatic Annotation Service APPI: Named Entity Linking in Legal Domain. The Semantic Web: ESWC 2020 Satellite Events (Harth, Andreas, Presutti, Valentina, Troncy, Raphaël, Acosta, Maribel, Polleres, Axel, Fernández, Javier D., Xavier Parreira, Josiane, Hartig, Olaf, Hose, Katja and Cochez, Michael (eds.)), Lecture Notes in Computer Science, vol. 12124, pp. 208-213, Springer-Verlag, https://doi.org/10.1007/978-3-030-62327-2_36
Amir Hossein Yazdavar, Hussein S. Al-Olimat, Monireh Ebrahimi, Goonmeet Bajaj, Tanvi Banerjee, Krishnaprasad Thirunarayan, Jyotishman Pathak, and Amit Sheth. (2017). Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media. Proc IEEE ACM Int Conf Adv Soc Netw Anal Min. https://doi.org/10.1145/3110025.3123028
Öhman, Emily. (2021a) SELF & FEIL: Emotion and Intensity Lexicons for Finnish. https://paperswithcode.com/paper/self-feil-emotion-and-intensity-lexicons-for
Öhman, Emily. (2021b) The Language of Emotions: Building and Applying Computational Methods for Emotion Detection for English and Beyond. Dissertation. University of Helsinki, Department of Digital Humanities. http://urn.fi/URN:ISBN:978-951-51-7106-1