
Of Deutsches blood?

Author attribution research on the Wilhelmus, the Dutch national anthem, based on computational quantitative analyses
Thesis, Tim de Winkel, Research Master Dutch language & literature, 14-08-2015,
University of Utrecht. Supervisors: Prof. dr. Els Stronks (UU) & Prof. dr. Karina van
Dalen-Oskam (UvA). Second corrector and reviewer: dr. Peter Boot.
Please feel free to request access to a fully digital version of my thesis, which
includes the interactive graphs that are unavailable in the paper
version, at t.dewinkel@students.gmail.com or timdewinkel@gmail.com

Abstract
The author of one of the most important texts in Dutch (literary) history, which is also the oldest
national anthem in the world, is yet to be determined. Research into the historical context and
qualitative text analysis have not produced conclusive answers or put a name to the anonymously
published song. I'll try to discover the author of the Wilhelmus using quantitative analysis and the methods
and means of computational literary studies. This involves using computers to perform statistical
analysis on the Wilhelmus in order to determine an authorial signal, based on textual features, and comparing
this linguistic fingerprint with those of the other texts in my corpus, predominantly texts by possible authors
of the anonymous hymn. Previous research with these methods shows very promising results, but the short
length of the Wilhelmus, only 551 words, tempers expectations. I'll test previously proposed and
generally accepted candidates for the authorship of the Wilhelmus, while at the same time trying to
determine whether this type of research, these methods and the available tools are currently capable of handling
such questions. This leads to the following two research questions: Who is the author of the Wilhelmus? And:
Can the complicated real world authorship attribution case of the Wilhelmus be solved with the methods of
quantitative analysis and the tools of computational literary studies? While testing authorship signals, I've also
tested other stylistic effects based on language or dialect and on genre, type or topic. These effects were
present in my corpus, measurable with my methods and clearly visible in my graphs. Surprisingly,
the Wilhelmus shared very few stylistic effects with any other text by any of the authors in the corpus.
Attempts to draw out the text's linguistic characteristics, by varying features, culling, sampling or testing on
different corpora, some including only Marnix and Coornhert texts, all failed to produce a strong, consistent
and reliable attribution. When examining the nature of the failed attribution, by questioning the distance
measure Burrows' Delta or by analyzing the distinctive components of the Wilhelmus with a principal
component analysis, I got results that are worth exploring and that offer valid options for future research.

Index
Title page
Index
Quotes
Introduction

Theory
The Wilhelmus
1. Songs of collectivity
2. The national anthem of the Netherlands
3. The open questions of the Wilhelmus
When?
Why?
Where?
Who?
4. Obstacles
Idiom
Metric
Subjectivity
5. Conclusions on Wilhelmus research
Computational literary studies
1. Nontraditional authorship attribution studies
2. Distant reading
3. Why not stylometric/visual?
4. The human factor and real world authorship attribution
5. Categorization tasks
Relevance
The Wilhelmus as object of interest
1. Lack of stylistic research
2. Methodological gain
Relevance for computational research

Methods
Automated Authorship Attribution
Stylometric features
1. Size of units of features
- Character features
- Lexical features
- Syntactical features
- Semantic features
2. Types of features
- Complexity measures
- Punctuation
- Idiosyncrasies
- N-grams on character-level
- N-grams on word-level
- Word frequencies
- Syntactic (Distribution of parts of speech or POS)
- Univariate vs. Multivariate approach
- Conclusions on features
Distance measures
Computational means
Text
1. Instance based vs. profile based distinction
2. Problems of representation
- Text length
- Imbalanced problem
Real world AA problems/tasks & their methods/solutions
1. Author Verification problem and the unmasking method solution
2. Needle in the haystack problem
Tests
1. TEST 1 multiclass, single-label text categorization task
Exploring Gephi visualization
Goal-specific test on specialized corpora
2. TEST 2 Multivariate statistical analysis techniques (PCA)
Dimension reduction
Authorship attribution

Corpus
1. Considerations for the researcher
2. Explanation corpus
Three corpora
1. DBNL corpus
2. Meertens corpus
3. Specialized corpus
- Het Wilhelmus and the Geuzenliedboek
- Marnix
- Coornhert
- Other possible authors
- Probable authors, included, half-included or not included
- Improbable authors included
- Anonymous texts

Hypotheses
Specialized sub corpora
1. Language Hypotheses
2. Genre Hypotheses
3. Authorship Hypotheses
4. Hypotheses I will not test

Expected problems
Obstacles of computational literary studies
Obstacles for authorship attribution
Case specific problems
1. R & Cherry picking
2. Editions & Spelling
3. Availability of German texts
Problems of the researcher

Analyses and Results
Bird's-eye view on the DBNL corpus
- Analysis, interpretation and conclusions
Specialized corpora
- Analyses language hypotheses
1. Analyses language corpus 1 Alle talen, proza & poez
2. Conclusions language corpus 1 Alle talen, proza & poez
3. Analyses language corpus 2 Alle talen, poez (DUI-NED-ZUID-KLAS)
4. Conclusions language corpus 2 Alle talen, poez (DUI-NED-ZUID-KLAS)
5. Analyses language corpus 3 DUI NED ZUID poez
6. Conclusions language corpus 3 DUI NED ZUID poez
7. Analyses language corpus 4 DUI NED ZUID poez
8. Conclusions language corpus 4 DUI NED ZUID poez
- Conclusion language hypotheses
- Analyses genre hypotheses
1. Analyses genre corpus 1 Alle genres LIED, PSALM, PROZA, POEZ
2. Conclusions genre corpus 1 Alle genres LIED, PSALM, PROZA, POEZ
3. Cancelling hypotheses
4. Analyses genre corpus 2 poezie vs. proza
5. Analyses genre corpus 3 proza vs. psalm
6. Analyses genre corpus 4(.b) poezie vs. psalm
- Conclusions genre hypotheses
- Analyses author hypotheses
1. Analyses author 0-corpus
2. Conclusions author 0-corpus
3. Analyses author corpus 1 auteur minus proza en psalmen
4. Conclusions author corpus 1 auteur minus proza en psalmen
5. Analyses author corpus 2 'auteur balans'
6. Conclusions author corpus 2 'auteur balans'
7. Analyses author corpus 3.1 Marnix Coornhert
8. Analyses author corpus 3.2 Marnix Coornhert balance corpus
9. Conclusions author corpus 3 Marnix Coornhert
- Distance measures
- PCA
- Conclusions author hypotheses
Meertens corpora
- XML works corpus
- XML parts corpus
- TXT songs of geuzenlied corpus
- Conclusions Meertens corpus

Conclusions
Enumeration test conditions and test results
Topical sub questions
Methodological conclusions
Methodological sub questions
Concluding

Discussion and recommendations for future research
Text size and available text
Reasons for no AA on the Wilhelmus
- Future corpus building
- Future design
- Another authorial signal
Future research
- Methodological variation for future research
- Different question for future research
End plead
Collaboration and thanks to

Een nieuw Geusenlieden boecxken waerinne begrepen is, den gantschen handel der Nederlandsche
geschiedenissen, dees voorleden jaeren tot noch toe gedragen.¹

Bouwstenen, fundamenten, grondslagen: het blijven metaforen die iets vanzelfsprekend moeten laten
lijken wat niet vanzelfsprekend is.²

Toen ze hadden het schavot beklommen
Riepen ze: Het kan ons niks verdommen.
Toen ze aan de touwen hingen,
Begonnen ze Wien Neerlands bloed te zingen.
Maar Piet die kon de wijs niet houwen
En zong Wilhelmus van Nassauwe³

1 G.A. van Es, "Het Wilhelmus," in Het Wilhelmus in artikelen. Een bundel herdrukte studies over het Wilhelmus, ed. J. de
Gier (Utrecht: Hes, 1985), 51.
2 Frans-Willem Korsten, "Grondslagen, situaties en houdingen," Vooys: tijdschrift voor letteren 33, no. 2
(2015).
3 Louis Peter Grijp, "Inleiding," in Nationale hymnen. Het Wilhelmus en zijn buren, ed. Louis Peter Grijp
(Nijmegen/Amsterdam: SUN/Meertens Instituut, 1998), 13. Look up in: F.G. Eyck, The Voice of Nations. European National
Anthems and their Authors (Westport, Conn./London, 1995), p. xix.

Introduction
At least one thing was familiar during the two brand new major sports events, both held for the very first time,
in July of this year: opening every World Cup game of the Lionesses, the Dutch women's national soccer team,
and rewarding every gold medal won by a Dutch athlete at the European Games, the oldest and most beautiful
anthem in the world chimed in the stadiums and flowed through our television speakers, or, at least, the one
verse we actually know did. As the Dutch symbol of tradition as well as revolution, representing the nation,
its people and the royal house, the Wilhelmus was heard no fewer than 26 times in Azerbaijan and 4 times in
Canada. Despite our alleged down-to-earth mentality and the recent turbulent social debates around Dutch
symbols and traditions, we seem to take pride in our national anthem and do not challenge its unifying
function.

Although it might be no surprise that the man in the street has no extensive knowledge of the origin of things (after all,
the average citizen of Rotterdam thinks the Erasmus bridge is named after its architect), the absence of an
author for our national anthem should be part of the collective memory. Elementary school textbooks and,
more importantly, Wikipedia,⁴ don't help by proclaiming Marnix van Sint-Aldegonde as the author, which is
uncertain at best. The truth is that Dutch literary history has a huge open case of authorship attribution,
because the author of the Wilhelmus has never been sufficiently established. Prof. Dr. K. Heeroma
maintained, back in 1970,⁵ that the idea of an anthem springing from its people is almost too romantic to
debunk.⁶ However, I consider the relevance, or even importance, of this national hymn, which is so
fundamental to the origin story and perhaps even the origin of the independence of the Netherlands, too great
to leave to myth.

I'm going to make a serious attempt at solving the open authorship case of the Wilhelmus by performing quantitative
analysis. Research based on biographical or other historical information and on qualitative analysis has filled
the Dutch libraries but hasn't led to conclusive evidence or scientific consensus. There have, however,
been no attempts that I know of to solve this question with quantitative analysis. This is not all that
surprising, given the infancy of computational literary studies and the towering case-specific problems
of stylometric research on our national anthem, which seem as impossible to overcome as the Duke of Alva
himself. Although the main question (who is the author of the Wilhelmus?) has not been answered, research
has yielded a whole spectrum of assumptions and conclusions, unfortunately often incompatible. I intend to
test some of these theories with quantitative analyses while going after the author. By including all the
possible authors suggested by previous research in my corpus and analyzing their style quantitatively, I aim
to find some statistical evidence for their inclusion in or exclusion from the list of possible authors. Authorship
attribution with quantitative analysis and other means and methods from computational literary studies
has worked very well in several languages, including Dutch.⁷

4 https://nl.wikipedia.org/wiki/Wilhelmus Last checked on 02-08-2015.
5 Klaas Hanzen Heeroma, "Tsal hier haest zijn ghedaen," in Het Wilhelmus in artikelen. Een bundel herdrukte studies over
het Wilhelmus, ed. J. de Gier (Utrecht: Hes, 1985), 251-268.
6 Ik houd het probleem van het auteurschap van het Wilhelmus voor onoplosbaar en heb daar eigenlijk ook
wel vrede mee. Dat hoort bij een dergelijk lied, een dergelijke verbeelding, een echt geestelijk verzetslied,

The primary goal of my thesis is to contribute to the discussion on the authorship of the Wilhelmus. If my
research brings us any closer to finding that author, I consider my thesis a success. My secondary goal is to
explore the possibilities of computational literary studies for authorship attribution. By doing research which
obviously stretches the capabilities of its methods, and by critically analyzing every methodological step along
the way, this thesis will be, besides an attempt at determining the author of the Wilhelmus, a test case for
using quantitative stylistic analysis with computational means on a real-life open attribution case. The goal of
my research is not necessarily to give conclusive evidence, but rather to give an indication of whether it's even
reasonable to pursue an answer with this type of data analysis. Of course, actually answering my main
question would be ideal. So one could call my thesis an investigative or explorative experiment that
researches a major open question of Dutch literary history.
Both goals are, in my opinion, highly relevant. Knowledge about the author of the Wilhelmus is
relevant because of the immense symbolic value the text has, as the world's oldest national anthem and as
a part of the origin story of the Netherlands. The methodological questions are relevant because they
concern the relatively new methods of a relatively new approach to literary research, controversial and
possibly revolutionary. Knowledge of its methodological capacities and examples of successful or
unsuccessful research may contribute to the understanding and perhaps acceptance of this approach to the
study of literature. I've formulated the following two main research questions:
1. Who is the author of the Wilhelmus?
2. Can the complicated real world authorship attribution case of the Wilhelmus be solved with the methods of
quantitative analysis and the tools of computational literary studies?
I've formulated the following sub questions, which are part of, and/or can add to, answering my two main
questions:

- Which candidate author do my quantitative stylistic analyses point to or support as the author of the
Wilhelmus?
- Which candidate author do my quantitative stylistic analyses rule out as the author of the
Wilhelmus?
- Who is, or is more likely to be, the author of the Wilhelmus: Marnix van Sint-Aldegonde or Dirk
Volkertszoon Coornhert?
- What are currently the limits of computational stylistics and authorship attribution?
- Can my methods detect authorial signals?
- Can my methods eliminate one or more of the usual suspects of the Wilhelmus authorship attribution
case?
- Can my methods give supporting evidence for one or more of the usual suspects of the Wilhelmus
authorship attribution case?
- Are my methods useful and/or sufficient for authorship attribution for texts of around 550 words?

7 See: J.F. Burrows, "Delta: A Measure of Stylistic Difference and a Guide to Likely Authorship," Literary and Linguistic
Computing 17, no. 3 (2002); David L. Hoover, "Testing Burrows's Delta," Literary and Linguistic Computing 19, no. 4 (2004b); and
Christof Schöch, "Beyond the black box, or: understanding the difference between various statistical distance measures,"
The Dragonfly's Gaze. Computational analysis of literary texts (research blog, August 3, 2012):
https://dragonfly.hypotheses.org/101

A major methodological problem, which imposed itself as soon as I explored the possibilities of successful
authorship attribution, is the small size of the text to be attributed. The Wilhelmus has only 551 words, which
is considered far too short for any solid authorship attribution. Estimates of the minimum number of
words required for successful authorship attribution on poetry range between 217⁸ and 3000⁹ words, at best.
Research attempting to determine this bare minimum is often inconclusive, and conclusions differ.
Other problems I experienced during preliminary research were difficulties in corpus building, the
generally poor availability of ready-for-analysis digital text versions, the need to exclude genre and spelling
effects, and possible plural authorship. I'll invest large amounts of my time and effort in collecting and preparing
contemporary texts by possible authors, preferably in the same genre as the Wilhelmus, the
geuzenliederen (songs of beggars), and in the right (that is: original) spelling.
I stumbled upon several sub questions about stylistics that needed to be answered before I could
focus on the question of authorship. These questions were: Can my methods detect language signals? Can
my methods detect genre effects? What language effects does the Wilhelmus signal? What kind of genre
effects does the Wilhelmus signal?
My corpus probably needs to account for other effects and considerations that are not yet visible but
that will reveal themselves while I perform my analyses. I'll keep my corpora as broad as possible, to
ensure maximum flexibility.

8 Moshe Koppel, Jonathan Schler, and Shlomo Argamon, "Computational Methods in Authorship Attribution,"
Journal of the American Society for Information Science and Technology 60, no. 1 (2009).
9 Maciej Eder, "Does Size Matter? Authorship Attribution, Small Samples, Big Problem," Digital Humanities (2010):
Conference Abstracts (2010): 6.

A variety of statistics, features and other methods are currently under consideration, but this thesis will
definitely rely heavily on Burrows' Delta, a statistical procedure in which the differences in relative
frequencies of textual features, between a target text and the average of the corpus, determine the stylistic
distance between the texts of a corpus, and on the software R, including its package Stylo, to do the calculations.
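
To make the procedure described above concrete, here is a minimal sketch of a Delta-style calculation. It is written in Python purely for illustration; the thesis itself relies on R and Stylo, and this is not Stylo's implementation. The function names, the toy tokens and the choice of the sample standard deviation are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, stdev


def relative_frequencies(tokens, vocabulary):
    """Relative frequency of each vocabulary word in one tokenized text."""
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in vocabulary]


def burrows_delta(corpus, target, n_features=100):
    """Return Delta between `target` and every other text in `corpus`.

    corpus: dict mapping a text label to its list of word tokens
            (hypothetical input format; needs at least two texts).
    target: label of the text to be attributed (a key of corpus).
    """
    # Features: the n most frequent words over the whole corpus.
    totals = Counter(token for tokens in corpus.values() for token in tokens)
    vocabulary = [word for word, _ in totals.most_common(n_features)]

    freqs = {label: relative_frequencies(tokens, vocabulary)
             for label, tokens in corpus.items()}

    # Corpus mean and (sample) standard deviation per feature.
    columns = list(zip(*freqs.values()))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]

    def z_scores(values):
        # A feature with zero spread carries no information; score it 0.
        return [(v - m) / s if s > 0 else 0.0
                for v, m, s in zip(values, means, sds)]

    z = {label: z_scores(values) for label, values in freqs.items()}

    # Delta: mean absolute difference between two z-score profiles.
    return {label: mean(abs(a - b) for a, b in zip(z[target], scores))
            for label, scores in z.items() if label != target}


# Invented toy tokens, only to show the call; not data from the thesis.
toy = {
    "text_a": "de de de het een".split(),
    "text_b": "de de de het een".split(),
    "text_c": "het het het een de".split(),
}
print(burrows_delta(toy, "text_a", n_features=3))
```

The smaller the returned value, the closer a text is to the target in its use of the most frequent words. With a target of only about 550 words, each relative frequency rests on very few occurrences, which is one way to see why text length is such a concern here.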
As for my secondary literature, I'll need to construct a composite body of theory, bringing together
two traditions, that of quantitative stylistic analysis and authorship attribution and that of the Wilhelmus research revolving around the possible author(s), in order to fully grasp the questions I'm asking, to
understand the means and methods required for answering them, and to avoid avoidable pitfalls.
My thesis will of course be the product of independent research, but I have already collaborated, and will
continue to collaborate, with numerous specialists in the field and with several institutions. A lot of these contacts were
made thanks to my supervisors, prof. dr. E. Stronks and prof. dr. K.H. van Dalen-Oskam.
As mentioned, authorship attribution for texts of this size might not be possible, so why try? My confidence in
the usefulness of the attempt is based on the hopeful results of a similar but much smaller study that I
performed at the Jagiellonian University in Krakow. During an internship under the supervision of Jan
Rybicki, member of the Computational Stylistics Group and an absolute expert on stylistic research and
on quantitative analysis with computational means, I set up a small experiment to find out
who the author of the Wilhelmus was, using the means and methods of computational literary studies.
My results did not come even close to answering my question. The Wilhelmus was attributed to several authors
as it moved around the graph extensively, meaning that the parameters were insufficient for capturing the stylistic
signal of the text or, worse, that the text is too short for the methods to capture its authorial signal. Despite all
this, I was positively surprised. Looking at the results, I saw stylistic resemblance between texts written by the same
author, which clustered on the same branch of the tree diagram. I even saw faint genre signals, although I was dealing
with texts that were merely 550 words long. As it turns out, even in very short texts some
traces of style are visible to quantitative analysis. The results seemed hopeful in spite of the limited number
of hours and blank pages at my disposal and some geographical inconvenience: a lot of the Dutch
sources were unavailable to me behind my computer at Henryka Siemiradzkiego,¹⁰ and random trips to the
Dutch libraries had become costly. There was so much room for improvement.

This is why I'll have another go at determining the author of my national anthem. A thesis provides the amount of
time and space I need, the Dutch libraries, among other institutions, are accessible, and I've made progress in
the field of computational literary studies. Now all is set for me to contribute to the tradition of Wilhelmus research by taking on this interesting test case for authorship attribution on very short early modern texts.

10 Alley of the Karmelicka, road in the centre of Krakow.

Theory

The Wilhelmus
Songs of collectivity
The origin of the popularity of a national anthem can be found in moments of strong collective feelings,
on which it is able to feed enough to last through times of calm, until another moment of national excitement
emerges.¹¹
In 1568, the bleak year of the Duke of Alva's triumphs, the Netherlands were in need of collectivity,¹² as the duke
swept the Low Countries with military operations and restored the repressive reign of Philip II. Opinion-forming or
propagandistic papers, which came in rapid succession in the year 1568, were meant to shape public opinion.
They were political weapons and a major part of the battle for independence. They justified the use of violence and
the rise in arms against Alva's rule, and openly preached, although still in vain, a national revolt.¹³ Songs were
also used to thank and honor supporters, to convince doubting citizens or city governments (Amsterdam!) and to
attack opponents.¹⁴
Just like martyr songs, which are an expression of the revolt of the oppressed believers, songs of beggars
or geuzenliederen evoke consensus and resolution. The songs were both a herald and an actual part of the
resistance. Poets and singers adorned themselves with the honorary name geuzen (beggars), first uttered as an
insult, in 1566, by the counselors of Margaretha of Parma, governor of the Netherlands under the Spanish King,
branding the train of noblemen petitioning against the Inquisition. Vive le Geus was the word!¹⁵ ¹⁶
The Wilhelmus became the most famous geuzenlied in history, acting as the anthem of the royalists.¹⁷
Even today it reminds us of the nationalist unity of nation, monarch and religion, which in the 16th century meant
that Holland, Willem of Orange and Calvinism were indissolubly attached to each other.¹⁸ In the song Willem of
Orange, or Willem van Oranje, father of the fatherland, defends the Dutch David, Holland, and the Calvinist way of
life against the common enemy, Catholic Spain. The acrostic emphasizes this by forming WILLEM VAN
NASSOV¹⁹ with the first letters of each of the fifteen verses, including the prinsestrofe, the verse of the prince, referring
to the Prince and his house of Nassau. The use of acrostics²⁰ as well as prinsenstrofen²¹ places it in the tradition
of the chambers of the rederijckers,²² local gatherings of poets, often affiliated with or representing a city.

11 Grijp 1998, 11.
12 van Es 1985, 47.
13 F.K.H. Kossmann, "Het ontstaan van het Wilhelmus," in Het Wilhelmus in artikelen. Een bundel herdrukte studies over
het Wilhelmus, ed. J. de Gier (Utrecht: Hes, 1985), 24.
14 Karel Porteman and Mieke Smits-Veldt, Een nieuw vaderland voor de muzen. Geschiedenis van de Nederlandse
literatuur 1560-1700 (Amsterdam: Bert Bakker, 2008), 74.
15 Referring to the (rhyming) slogan of resistance "Vive le Geus is de leus", meaning "Long live the beggar is
now the slogan".
16 Porteman and Smits-Veldt 2008, 74.
17 Porteman and Smits-Veldt 2008, 76.
18 Martine de Bruin, "Het Wilhelmus tijdens de Republiek," in Nationale hymnen. Het Wilhelmus en zijn
buren, ed. Louis Peter Grijp (Nijmegen/Amsterdam: SUN/Meertens Instituut, 1998).
19 Porteman and Smits-Veldt 2008, 76.

The Wilhelmus stayed popular throughout the Eighty Years' War, and long after the Peace of Münster it
retained both its political overtone and its signaling function.²³ It kept the memory of the Eighty Years'
War vivid, not least because the acquired freedom was jeopardized several times after 1648.²⁴ Because of the
symbolic value the song carries, this resulted in a ban during periods of occupation, in the disaster year 1672
as well as in the heyday of the Patriots in 1787.²⁵ The Wilhelmus was also forbidden in the Japanese camps in the
Second World War. The Frisian anthem, which the Japanese did not know, served as a substitute.²⁶
Despite a history full of exile and resistance, the anthem survived and kept ascending, for example
when a member of the Oranje family came to power again or after the expulsion of the French in 1813.²⁷ This
is because national anthems grow in moments of strong collective feelings. The liberation from French
domination in 1813, the Belgian revolt in 1830-1831 and the inauguration of Wilhelmina in 1898 are all examples of
this.²⁸
The parallel with the Eighty Years' War was experienced especially strongly in 1940-1945. The resistance
poetry of those years contained many reminiscences of the Wilhelmus. There was, for example, a collection
significantly called the Geuzenliedboek, meaning a book of beggars' songs, which carries the same name as the
work where we find the earliest versions of the Wilhelmus, among other geuzenliederen²⁹ of course. Again the main
goal was dispelling tyranny, and they remained faithful to their fatherland unto death.³⁰ During the Second
World War the Wilhelmus was the resounding symbol of the Dutch aversion to the Germans in general, and
of the active underground resistance in particular. The objections to the anthem that some socialists still felt
before the war disappeared completely after the German invasion. Simon Carmiggelt said in 1960: "Only when
the Germans stood at the borders, the Wilhelmus became my song."³¹
In spite of the strong associations of righteousness the Dutch feel upon hearing the Wilhelmus, its
symbolic appropriation has not been unequivocal. Even before the outbreak of the war the National Socialists
hijacked the song, and they sang it during key moments of the war. During a mass meeting in the Galgenwaard³² in
1941, held to celebrate Anton Mussert, leader of the NSB, the Dutch National Socialist movement, the Wilhelmus was sung while
performing the Hitler salute. One can point out the irony that, during the Second World War, the Wilhelmus
experienced an apotheosis, being owned and appropriated by the total sum of its people, from socialist to
NSB'er,³³ be it on opposite sides of a rifle. In the 18th century the anthem was above all the song of the
Orangists, supporters of the royal house, while in the next century the Calvinists appropriated the song, and they
were followed in the twentieth century by all kinds of groups, including the NSB. The Wilhelmus follows the
political changes of power, both the dominant and the subversive currents.³⁴

20 Porteman and Smits-Veldt 2008, 76.
21 Jan Thönisz, Van Sint Jans onthoofdinghe, eds. Paul Laport, Frédérique de Muij and Marijke Spies
(Amsterdam/Münster: Stichting Neerlandistiek VU/Nodus, 1996).
22 The translation would be rhetoricians, but for this particular group the noun has become a proper noun.
23 de Bruin 1998, 23.
24 de Bruin 1998, 37.
25 de Bruin 1998, 24.
26 Louis Peter Grijp, "Nationale hymnen in het Koninkrijk der Nederlanden, II 1940-1998," in Nationale hymnen. Het
Wilhelmus en zijn buren, ed. Louis Peter Grijp (Nijmegen/Amsterdam: SUN/Meertens Instituut, 1998), 79.
27 de Bruin 1998, 24.
28 Louis Peter Grijp, "Nationale hymnen in het Koninkrijk der Nederlanden, I 1813-1939," in Nationale hymnen. Het
Wilhelmus en zijn buren, ed. Louis Peter Grijp (Nijmegen/Amsterdam: SUN/Meertens Instituut, 1998), 67-71.
29 Meaning beggars' songs.
30 Grijp II 1998, 76.
31 Grijp II 1998, 75.

The national anthem of the Netherlands


During the first years of its existence, the Wilhelmus meets, for the most part, the demands of what is meant by a
national anthem:³⁵ anonymous, widespread among the people, orally passed on and, as a result, handed down in many
versions.³⁶ In addition, the nationalist song met the criterion of uniqueness, since it was
legitimized by broad-based popularity³⁷ despite the functional nature it had when it was conceived. No other song
in Dutch history combines these feats, certainly not with a period of struggle for independence. So while the
Wilhelmus already met, in the sixteenth and seventeenth centuries, all the conditions that would later be included
in the textbook definition of a national anthem, the concept of the national anthem did not exist until
1773,³⁸ when the German poet J.G. Herder, along with other German romantics, defined it as such.³⁹
Surprisingly, the Wilhelmus was only recognized as the official national anthem in 1932.⁴⁰ Mostly because of its
great popularity, it superseded Wien Neerlands bloed door d'ad'ren vloeit, by the famous Rotterdam poet Hendrik Tollens,
a song representing the Dutch nation but not necessarily the monarchy, as official national anthem.⁴¹
The story goes that Wien Neerlands bloed was initially preferred over our current national
anthem because the latter reminded the Dutch people too much of internal quarrels. Louis Peter Grijp, professor of
musicology and expert in Dutch song culture, questions this statement, because when Wien Neerlands bloed was
chosen there was no noteworthy protest.⁴²
Initially, the songs of a geuzenliedboek appeared independently as loose sheets or pamphlets, which in the
sixteenth century were published on a daily basis. The songs were often concerned with or related to recent affairs
and were probably published quickly after the events they described.⁴³ The estimate is that the first version of the
geuzenliedboek was published in 1574, including the Wilhelmus, which was by then already the most famous
geuzenlied of its time. From that moment on, publishers rapidly produced new, almost exclusively Dutch,
editions of the geuzenliedboek, extended with both more recent and older songs, up to a total of 252.⁴⁴

32 Soccer stadium of Utrecht.
33 Grijp II 1998, 77, 79.
34 Grijp inleiding 1998, 11.
35 de Bruin 1998, 18.
36 Grijp I 1998, 62.
37 de Bruin 1998, 18.
38 de Bruin 1998, 16.
39 Grijp I 1998, 71.
40 Grijp inleiding 1998, 8.
41 Grijp inleiding 1998, 12.
42 Grijp I 1998, 59.

For a long time the Wilhelmus as it occurs in Een nieu GeusenLieden Boecxen (1581), the title of this
particular edition of the geuzenliedboek, was considered the oldest preserved Dutch version of our national
anthem.⁴⁵ This was until 1996, when Martine de Bruin came across an even older edition of the geuzenliedboek in
the Bibliothèque Nationale in Paris. This newly found version, without explicit references to year, place and author, could be attributed to a certain publisher on the basis of historical and typographic research, and so its date
could be determined. It seems to have been published in 1577-1578 by Jan Canin in Dordrecht and is therefore three
or four years older than the standard edition of 1581. Shortly after this discovery, De Bruin found an even older
edition of the geuzenliedboek in the Niedersächsische Staats- und Universitätsbibliothek in Göttingen. This new
oldest printing was included in the Repertorium van het Nederlands lied tot 1600 as number D294, without
much publicity being given to it.⁴⁶ This is still not the oldest Wilhelmus text that survived the ages, because that is a German
edition published in 1573.⁴⁷ Generally, it is believed that the first Dutch edition of Een nieu GeusenLieden
Boecxen originated in 1574. This means that, even with the discovery of the book from 1577-1578, the first
printing is not yet in sight.⁴⁸
The title of being the oldest national anthem,49 confirmed by several illustrious cultural whales like the Guinness
book of records or Flippo number 430,50 refers to the conjunction between text and melody, which tribes from the
sixteenth century. The Wilhelmus was sung at de wijse van Chartres, the melody of the popular French antiHuguenots song.51 The oldest known paper version of the melody dates from 1574, and is assumed at that point
in time, to be at least six years old.52 The oldest Dutch version where both the text and melody of the Wilhelmus
are written down, can be found in Adriaan Valerius Nederlandtsche gedenck-clanck from 1626.53 In terms of text,
the Wilhelmus is exceeded by the thousand year old Japanese anthem Kimiga yowa, but this anthem received its
melody only in 1880. Consequently, the Wilhelmus as a song, meaning an inseparable whole of text and melody,

43 Martine de Bruin, Een nóg ouder geuzenliedboek: signalement van de druk [1576-1577] met de oudst bekende
Nederlandse Wilhelmustekst in De fiere nachtegaal: het Nederlandse lied in de middeleeuwen, eds. Louis Peter Grijp and
Frank Willaert (Hilversum: Verloren, 2008), 231-250.
44 Porteman and Smits-Veldt 2008, 75.
45 Kossmann 1985, 343.
46 de Bruin 2008, 231-233.
47 de Bruin 2008, 234-235.
48 de Bruin 2008, 231-233.
49 de Bruin 1998, 16.

50 Flippos were hugely popular promotional tokens included in bags of Lay's chips. Many of the tokens had an
educational function. Flippo number 430, Chester in orange, tells us that the Wilhelmus is the oldest national anthem.
51 Porteman and Smits-Veldt 2008, 76.
52 de Bruin 1998, 25.
53 Kossmann 1985, 343.

is the oldest anthem in the world. However, this statement is only true if we're talking about anthems that are
currently officially acknowledged as national anthems,54 thus excluding past or unofficial national anthems.
The oldest edition of the geuzenliedboek currently known includes 89 songs and 5 choruses, almost all of which
were, upon its discovery in Göttingen, already known from other printings.55 Every reissue of the
geuzenliedboek includes the Wilhelmus, and while the text varies across the different editions, the variations are
surprisingly small. Apparently the evolution of the text of the Wilhelmus came, against custom, to a halt
upon its inclusion in the geuzenliedboek. The text seems to have been significantly less subject to change than
the melody, although it is possible that major alterations made in the oral tradition have vanished from our
sight.56

The open questions of the Wilhelmus


The Wilhelmus, as the literary embodiment of Dutch resistance, is obviously an important part of the story of the
emergence of the modern Netherlands and perhaps even an important part of its actual independence. It gained
this status almost immediately after its genesis; nevertheless, a lot of essential information, such as the identity
of the author, remains unknown.
Willem van Oranje turned, in a particularly efficient manner, everything the humanistic culture had to
offer (poets, artists, jurists, theologians) to serving a common cause that transcended the ideological differences
between them: providing the religious understanding and the political legitimacy for securing the sovereignty and
continued existence of the Netherlands.57 Franciscus Ridderus wrote in his 1672 Noodige Tijd-korter in oorlog en
vrede that the Wilhelmus did more for his fatherland than 10,000 soldiers, because when soldiers or sailors heard
it sung, their blood was stirred.58 59 The Duke of Alva recognized the potential of these humanistic weapons, and under
his reign many texts were banned, the chambers of the rederijckers were disbanded and many writers were
exiled or silenced in other ways.60 The anonymous publication of the Wilhelmus is probably due to this
aggression, and it still constrains our understanding of this history today.
To really know the nature of this Dutch national anthem, we should be able to answer the following
questions: when, where, where to (that is, to what end) and by whom was the Wilhelmus created? These are the four
classic questions of Wilhelmus research. In the next section I'll elaborate on the current debates and consensus on
these questions.

When?
54 de Bruin 1998, 16.
55 de Bruin 2008, 234-235.
56 de Bruin 1998, 26, 28.
57 Porteman and Smits-Veldt 2008, 70.
58 Van Welck liedje een wijs man eens seyde, dat het aan ons Vaderlant meer voordeel gedaen heeft, als tienduysent
Soldaeten, want als Soldaet en Matroos dat hoort, dan wordt haer bloedt gaende.
59 van Es 1985, 50.
60 Porteman and Smits-Veldt 2008, 68.


An extensively debated historical question about the Wilhelmus, important for the attribution to an author,
revolves around the date of composition. Several important events that took place in the year 1568, like the battle
in Friesland, the battle of Heiligerlee and the crossing of the Maas to fight the Duke of Alva, are mentioned in the
Wilhelmus.61 The military campaign of the Prince in the fall of 1568 is a historical terminus a quo, being the latest
historical event the Wilhelmus clearly refers to, in the eleventh stanza.62 This means of course that the text
was written after these events took place. In a similar but slightly less objective fashion, an end date for the
composition of the Wilhelmus can be determined from the events it doesn't mention. The prince returned to Holland
in 1572, and this marks the start of the real revolt. On the first of April the small city of Den Briel was captured
by the geuzen. This famous first victory resulted in the even more famous rhyme "On the first of April, Alva
lost his glasses",63 meaning that from that point on, the Duke lost his air of invincibility along with his perfect
military record, earned by his skill in assessing the situation. Besides this, it's also a play on words, since the
Dutch word for glasses is "bril", which is similar to Den Briel, the city. In any case, this capture is not mentioned in
the Wilhelmus, and furthermore the tone of the song doesn't correspond with the change that moment brought.
Therefore, 1 April 1572 is seen as the other boundary date.64
Although there is consensus on where these coarse boundary dates should lie, a more precise dating has not been
agreed upon. Before the Second World War it was assumed that the Wilhelmus was composed at the end of
1568 or at the beginning of 1569, shortly after the failed campaign of the prince against Alva in Maastricht.65 This
dating assumes that the song was written directly after the historical episode it describes. Postwar Wilhelmus
research, however, considering it a propaganda song, dates our national anthem to 1571-1572, or at the earliest
1570.66 Assuming that a propaganda song is released directly after its composition, as it seeks immediate
impact,67 this dating is based on the assessment that this was the period when propaganda like the
Wilhelmus was needed. In addition, there are a lot of other texts, or manifestos, by Prince Willem van
Oranje-Nassau from the period 1570-1572 that carry a message similar to that of the Wilhelmus.68 Another argument
for the terminus ad quem of 1572 is the fact that song 55 in Kuiper's edition of the Geuzenliedboek, "Ras
seventhien Provincen", which must have been written in July or August 1572, has "Wilhelmus van Naussauwe" as its
melody indication.69
The previous section is a concise and simplifying summary of the enormous discussion concerning this question.
I will refrain from reporting this discussion any further, because elaboration of this discussion is not necessary for
61 Abraham Maljaars, Het Wilhelmus: auteurschap, datering en strekking: een kritische toetsing en nieuwe interpretatie
(Kampen: Kok, 1996), 167.

62 Heeroma 1985.
63 I've translated the original slogan, which says "Op 1 april verloor Alva zijn Bril".
64 Maljaars 1996, 165.
65 Maljaars 1996, 151, 167.
66 Maljaars 1996, 151, 171, 217.
67 Maljaars 1996, 217-218.
68 Maljaars 1996, 171-173.
69 Heeroma 1985.

the understanding of my research, nor is it necessary for performing it. If my thesis were to produce results that
inform us in any way about a more precise date of creation of the Wilhelmus, this would be highly relevant, a point on
which I'll elaborate in the following sections.
Where to?
Another related classic question in Wilhelmus research is where to, that is, to what end, it was written. What is the
exact nature of the song? An interpretation of the tone and ethos of the song as apologetic, comforting, and accepting
is vital for an early dating, because the nature and function of the song would then agree with the year of
presumed release.70 At the end of 1568 and the beginning of 1569 the Dutch needed comfort, and the Prince
had lost enough to be accepting about and to apologize for. If the Wilhelmus is indeed a consolatory song, a
poetic rendering of a supposed farewell speech by the prince at the disbanding of his troops at Strasbourg in
January 1569, then an early dating, 1568-1569, would be a logical one.71 Researchers such as Schotel, Van Eyck and
Van Duinkerken agreed with this stance.72
This interpretation has, however, gone out of fashion. Kuiper and Drewes, for example, kept to a later
dating, because they defined the Wilhelmus as a propaganda song,73 meant to make the Dutch support the
prince and shed the Spanish yoke. An early dating would then be impossible, because the situation would have
been too hopeless, the people too skeptical and the prince too bitter for pugnacious propaganda.
The song, in this interpretation, focuses on Orange, presents him to the Dutch people and appeals to them to follow him.74
The debate has not been settled, but one assumption runs through it: the song implies the attitude of the people
for whom it was intended.
Solving the "where to?" may solve the question of "when?". However, searching for the intention of an author is a
practice of ill repute, because the evidence is often based on personal interpretation. Let's include another opinion
to emphasize this point.
According to F.K.H. Kossmann, the tune of the Wilhelmus reveals that it should not only be regarded as
a geuzenlied, but that it also has its origins in spiritual songs, martyr songs and historical songs. It contains a polite
and spiritual sound, as well as a rough sound which is characteristic of a mocking song or a soldier's tale.75 The
manifestos of the prince had already justified his actions and announced his political sense. Saravia's76 sermon
had placed, on behalf of the prince, the strength and value of the spiritual above that of the earthly.77 The poet of
the Wilhelmus must have been driven by another urge, namely to portray the emotional life of the prince.

70 Maljaars 1996, 151, 167.


71 A. J. Veenendaal, Vier vragen betreffende het Wilhelmus. in Het Wilhelmus in artikelen. Een bundel
herdrukte studies over het Wilhelmus, ed. J. de Gier (Utrecht: Hes, 1985), 74.
72 Veenendaal 1985, 80.
73 Veenendaal 1985, 80.
74 Maljaars 1996, 217.
75 Kossmann 1985, 38-39.
76 Secretary of the prince at that time
77 Kossmann 1985, 43.

The defeated general, the ignored nobleman, the righteous and selfless politician and, above all, the sincere
conscience which the poet gave him: the symbol of palladium and exile. It's not a song which draws its poetic
value from the beauty of expression, its artistic form or the depth of its contents, but a true expression of sincere
emotion, which rules over form and content. In other words, according to Kossmann it was much more than a
propaganda text.78

This at first sight convincing assertion can be challenged by the fact that the Wilhelmus was, rather quickly after its
composition, grouped with other geuzenliederen in a geuzenliedboek. Also, textually there seems to be no reason to
separate the Wilhelmus from other geuzenliederen. Whether you interpret Kossmann's theory as an example of
the need for contextual research, or as an argument against speculation about the intention of the author, it
showcases the diversity of interpretations that qualitative research yields, as well as the necessity of solid
theoretical ground in Wilhelmus research.

Where?
The third classic Wilhelmus question revolves around the geographical location of its composition. This question
is again connected to the others, as the location can give us a decisive answer about the time and reasons for creation.
The general consensus is that the Wilhelmus must have been written somewhere in or near the presence of
the prince, probably during his exile at the stronghold of Dillenburg, in Germany.79 In the years 1568-1572, Willem
van Oranje tried to win over public opinion among our eastern neighbors. It is therefore most likely that the song
was spread in German around the same time that it was first published in the Netherlands, following the same
pattern as the other propaganda texts of the prince.80 This does not necessarily mean that it was written
in Germany, or in German for that matter. The debate has not been as extensive as that of the other Wilhelmus
questions, probably because there are not enough historical certainties to speculate on.

Who is the author?


There's a huge amount of research attempting to determine the author of the Wilhelmus. The two most important
candidate authors, whose names also come up earliest in the research tradition on the Wilhelmus author, are
Marnix van Sint Aldegonde and Dirk Volkertszoon Coornhert.81
Marnix's name has been connected to the composition of the Wilhelmus since Jacob Verheiden cautiously
suggested it in a short panegyric soon after Marnix's death. This hypothesis, which assumes Marnix to be the author of
the Wilhelmus, is called the Marnix hypothesis in the research tradition. Later on, Willem de Gorter, a rederijcker
from Mechelen, put Marnix's name under a printed version of the song. This cannot have corresponded with the
general assumption, because Adrianus Valerius, known for the first surviving print of the text and music of the

78 Veenendaal 1985, 73.


79 Eberhard Nehlsen, Het Wilhelmus over de grens in Nationale hymnen. Het Wilhelmus en zijn buren, ed. Louis Peter
Grijp (Nijmegen/Amsterdam: SUN/ Meertens Instituut, 1998), 96.
80 Nehlsen 1998, 97.
81 J. de Gier, Voorwoord: een verantwoording. in Het Wilhelmus in artikelen. Een bundel herdrukte studies over het
Wilhelmus, ed. J. de Gier (Utrecht: Hes, 1985), 9.


song combined, altered the lyrics of the Wilhelmus, something that wouldn't even have been a possibility if the
glorified Marnix had been generally considered its author.82 J. Spoel immortalized the fictional moment in which the
alleged poet recites the song to Willem van Oranje. This scene is without a doubt inspired by the famous painting by Pils,
who portrayed the poet of the Marseillaise reciting his song to the mayor of Strasbourg.83 Nowadays, the average
Dutchman, who knows only the first and possibly the sixth stanza, will point to Marnix as the author of his national
anthem.84 The tradition of Wilhelmus research favors Marnix as well, as Drewes, Rooker and Lenselink all
performed comparative stylistic analyses of his texts in relation to the hymn.85
Reasons in favor of Marnix being the author are numerous and diverse. One example is the
claim that the anthem has linguistic elements of the dialects spoken in the southern regions of the Netherlands,
the region where Marnix was born and raised.86 An important argument against Marnix's authorship is the literary
quality of the song. Our national anthem has seven rhymes of poor quality, of which Valerius's version
corrects four.87 Therefore, opponents of Marnix's authorship, among them Maljaars, Van Eyck and Buitendijk,
claim that the attribution to Marnix degrades him to a second- or third-rate poet.88 It goes without saying that the
rhetoricians and other Renaissance poets in the second half of the sixteenth century considered purity of rhyme
a quality of good poetry. An established and excellent poet such as Marnix shows hardly any impure rhyme in his
oeuvre.
Coornhert was first mentioned in 1663, by the Remonstrant preacher Geeraert Brandt89 in his Historie der
reformatie in de Nederlanden.90 There has been considerably less research into Coornhert as the author of the
Wilhelmus, stylistic or otherwise. Garmt Stuiveling was the only notable modern researcher who paid attention
to Coornhert as a possibility. This is strange, because he seems a highly plausible candidate. He knew the prince
personally and was incarcerated for a short period of time because of this, after which he faced a long-lasting
banishment under the reign of Alva.91 Coornhert never affiliated himself with institutions such as the church,
school or a chamber of rhetoric. He was opposed to the Catholic persecution of heretics as well as to
Calvinist iconoclasm.92
Marnix was an outspoken Calvinist while Coornhert was, arguably, a libertine. Over the course of history
some (Drewes) have interpreted the Wilhelmus as containing specifically Calvinistic elements, while others see a

82 Kossmann 1985, 42.


83 Kossmann 1985, 42.
84 H. Bonger, De dichter van het Wilhelmus. in Het Wilhelmus in artikelen. Een bundel herdrukte studies
over het Wilhelmus, ed. J. de Gier (Utrecht: Hes, 1985), 188.
85 Bonger 1985, 189.
86 Veenendaal 1985, 91.
87 Maljaars 1996, 132-133
88 Maljaars 1996, 119.
89 Kossmann 1985, 42.
90 Bonger 1985, 187.
91 Porteman and Smits-Veldt 2008, 97.
92 Porteman and Smits-Veldt 2008, 97.

more general Christian tone.93 Choosing a side in the authorship debate is often bound up with a typically Dutch
complication. Because of the distinctly Christian character of the Wilhelmus, which is atypical for a national hymn,
and because of the Dutch history of religious quarrels, people tend to project their own religious beliefs onto the
lyrics. It's not inconceivable that they wish, subconsciously or not, for the author of the national anthem to be a
fellow believer. This occurs especially in Reformed circles, where a point is made of the Calvinist interpretation.94
As you can see, this unofficial fifth cardinal question, about the presence and strength of the Calvinist and
anti-Catholic tone of the Wilhelmus, which is typical of the ideas of both Marnix van Sint Aldegonde and Willem van
Oranje during the years 1568-1571,95 is also strongly connected to the question of authorship. Perhaps Coornhert's
refusal to affiliate himself with institutions or communities made it harder to identify strongly with him and, because
of this, to imagine him as the author of the anthem that binds a nation. This is only speculation on my part, but the
absence of a body of serious research into his person is highly conspicuous.

Obstacles
The anonymous publication of the Wilhelmus is in itself in no way strange or rare.96 Most of the pro-Oranje
pamphlets were published anonymously. A lot of these songs have never been linked to a specific author. For
example, only a minority of 60 songs in Een nieu geuse lieden boexcken of 1577-1578 have an identified
author.97 What stands out with the Wilhelmus is the amount of attention it has received. Why didn't the author
lay claim to his masterpiece when the battle for independence was won and his song could have made him
immortal? More importantly, how come all my predecessors fell short when it came to attributing the text to its
author? Even if only Marnix was seriously considered as the possible author of the Wilhelmus, the question
remains why all the research didn't lead to more significant results. How come extensive research,
stylistic or otherwise, has not yet determined Marnix's alleged authorship to be true or false? Limiting myself to
stylistic research when trying to understand the current status quo in the Wilhelmus tradition, I find that several
major methodological flaws become visible when such research is performed by a human expert on the basis of close
reading. I'll discuss them per problem.

Idiom
Van Haeringen98 states that studies based on similarities, comparing the idiom of the Wilhelmus with that of
texts by Marnix, can't produce convincing evidence. Large numbers of similarities are to be expected a priori,
because the Wilhelmus and the Marnix corpus share a common cultural background, being written in the same time,

93 Veenendaal 1985, 91.


94 Grijp 1998, 10.
95 Maljaars 1996, 160-163.
96 Kossmann 1985, 42.
97 De Bruin 1998, 26, 28.
98 C.B. van Haeringen, Rec. van J.B. Drewes, Wilhelmus van Nassouwe De Nieuwe Taalgids 39 (1946):
188-189.

same geographical location and probably the same language, therefore sharing topical and linguistic influences.99
Examples of this type of study, like those of Lenselink,100 Rooker101 and Den Besten,102 confirm Van Haeringen's
statement, as they seem to reach no further than the identification of some striking resemblances.103 In addition,
Marnix's Psalmen Davids, a text often used for comparative stylistic analysis, counts 1419 verses, which
translates, according to a rough estimate by Maljaars,104 to 8000 lines of poetry. The text of the psalms is thus roughly
70 times the size of the Wilhelmus, with its 120 lines of poetry. This imbalance almost guarantees that the idiom
of the Wilhelmus is present in the psalms, but not the other way around. The discovered parallels are,
however, also found in the oeuvres of other authors, making them aspecific and therefore substantiating little,105
and they are not very convincing unless one is preaching to the choir.106
Another problem with comparing the idiom of two texts is the possibility of imitation. How do we know, for
example, that Marnix wasn't strongly influenced by the Wilhelmus and imitated it in his other songs, making them
stylistically similar to the anthem? Considering the times and the influence of the Wilhelmus, this seems very
plausible. To make matters worse, we also do not know whether the Wilhelmus itself was written by an author who
was heavily influenced by Marnix's work. Again, considering the influence of Marnix and the practices of
emulation of those times, this is not unthinkable.107

Metre
One major argument used in favor of Marnix as author, advanced by Den Besten and Otterloo, is that Marnix
was the only poet to convincingly use iambs at such an early phase of the development of metre in poetry,
thereby solving the problematic literary quality of the Wilhelmus, as well as strengthening their Marnix hypothesis.
This theory has, however, been refuted by Maljaars, who claims that at that point in time many poets, among them
Marnix and Coornhert,108 were able to produce an iambic ground pattern. Jan van Hout was credited by his peers as the
first to apply this alteration, and witness reports should definitely not be ignored.109 One of those
witnesses is the poet Maeren Beheyt, who dedicates his own poem "Van 't Maetvinden" to Jan van Hout, concluding
his last verse with the line "Neerduyts maetklanckx voorbeelt sproot uyt van Hout in Leyden", meaning The

99 Maljaars 1996, 12.


100 S.J. Lenselink, "Marnix en het Wilhelmus." Tijdschrift voor Nederlandse Taal en Letterkunde 67 (1950):
241-263.
101 C. Rooker, Marnix, de Dordtse rede en het Wilhelmus. De Nieuwe Taalgids 71 (1979): 145-164.
102 Ad den Besten, Wilhelmus van Nassouwe. Het gedicht en zijn dichter (Leiden: Martinus Nijhoff, 1983).
103 Maljaars 1996, 16-17.
104 Maljaars 1996, 15.
105 Maljaars 1996, 59.
106 Maljaars 1996, 14.
107 Maljaars 1996, 20-21.
108 Maljaars 1996, 36, 40, 46 and 49.
109 Maljaars 1996, 11.

example for Nederduits110 metre arose from Van Hout in Leyden.111
The reason this argument is included in this section is that analyses of metre could be considered a useless
exercise to begin with, which means losing yet another feature that could provide information. The reason for this
dismissal is that the axiom112 of the Wilhelmus as a consciously arranged metrical structure seems very
improbable. The metre of sixteenth-century songs was often dictated by the melody of the music.113 I see no
grounds for regarding the Wilhelmus as exceptionally free from its musical chains.

Subjectivity
A major obstacle for comparative stylistic analysis by human experts is the subjective nature of differentiating
between relevant and irrelevant stylistic parallels. Maljaars explains that a completely objective norm for the number
of differences that determines whether a certain text has or has not been written by a certain author is non-existent.
Nobody can factually determine the relevance or number of similarities or dissimilarities needed between two
bodies of text for them to be considered as written by the same or by different authors.114

Conclusions of existing Wilhelmus-research
The four questions that Veenendaal asks in his fundamental article115 from 1954, namely when, where, where to
and by whom the Wilhelmus was written, have not yet been adequately answered despite the detailed examination of the
anthem. The modest conclusions with which Ad den Besten had to end his scholarly study116 are characteristic of
the current situation in Wilhelmus research: conclusive evidence has still not been found.117 The exact year and
place of origin are disputed, and so are its literary value and the nature of the national anthem, along with,
of course, its anonymous poet. Over the course of time it has been characterized as a valedictory song,118 a
consolatory song,119 an encouraging or exhilarating song, an apologia120 and a propaganda121 song.122 There

110 A dialect or combination of Dutch and German spoken in the eastern provinces of the Netherlands and
the western provinces of Germany.
111 Maljaars 1996, 41-43.
112 Maljaars 1996, 48.
113 Maljaars 1996, 48.
114 Maljaars 1996, 12.
115 Veenendaal 1985, 73-92.
116 Den Besten 1983.
117 Abraham Maljaars and Samuel Jan Lenselink, Inleiding en verantwoording., in Het Wilhelmus: Een Bibliografie (The
Hague: Stichting Bibliographia Neerlandica, 1993), 1, 5.

118 A. van Duinkerken, "Het Wilhelmus" in Verzamelde Geschriften III (Utrecht: Het Spectrum, 1962).
119 Gilles Dionysius Jacob Schotel, Gedagten over het oude volkslied Wilhelmus van Nassouwen en den vervaardiger van
hetzelve (Leiden: 1834); Johannes Postmus, Het Wilhelmus (Kampen: Kok,1900);
P.N. van Eyck, "Het Wilhelmus." in Wilhelmus van Nassouwe (Middelburg: P. Geyl, 1933).

120 P. Leendertz (jr.), Het Wilhelmus van Nassouwe. Met verklaring en historische toelichting. (Zutphen:
Thieme, 1925).
121 E.T. Kuiper, Wilhelmus van Nassouwe. Taal en Letteren 12 (1902): 1-7-120; Veenendaal 1985.
122 van Es 1985, 47.

is some consensus about a date of origin, between October 1568 and April 1572, but within those four
years the opinions differ greatly, as they do on the other three questions.123
My thesis initially attempts to answer the "who?" question, but since insight into the when, where
and why may yield insight that helps answer my main question, I expand my field of interest. I dare
say that contributing to any one of the four Wilhelmus questions means contributing to all of them. The Marnix or
Coornhert debate, for example, is largely based on historical knowledge. If these historical
facts were changed or reinterpreted, Marnix could lose his role as primary candidate, as happened after the Second
World War, when the date of origin of the anthem was reconsidered. In the years 1568 and 1569 Marnix was not yet
convinced, although he knew Willem van Oranje through the Prince's brother Lodewijk van Nassau, that the Prince was
the man the Wilhelmus portrays him to be.124 The poet had seen Hendrik van Brederode as the promised leader of
the Netherlands,125 but Brederode had to cease his armed resistance against Spain in 1567. Marnix said about
the period after this defeat, "se soumettre à un chef, qui commande avec authorité",126 describing a headless
resistance. If the anthem was written during this period, it's very unlikely that Marnix is its author. This is an
illustration of how one piece of information can change the outlook on the whole puzzle.
As mentioned, I'll not pursue argumentation on the basis of biographical or other historical evidence, nor am I
convinced that I could do this better than my predecessors. The Wilhelmus case is, however, currently devoid of
stylistic research, and this is where I'm determined to contribute. A large part of the problems of traditional stylistic
research discussed in the previous section can now be circumvented or solved by the methods of
nontraditional authorship attribution. Now that the historical context and the Wilhelmus research tradition have
been summarized, I discuss the new methodology and its theory in the following sections.

Computational literary studies


When performing research on the author of the Wilhelmus, the possible contributions one can make are not as
numerous in one area as they are in another. By now it should be clear that Wilhelmus research fits into a rich
tradition, a history even, of scholarship, and one has to search for the empty shelves in this library, left empty by
illustrious academic predecessors. Especially for me, a definite beginner in the subject, the relevance of my
contributions depends on finding the blank pages, or in my case perhaps even laying the groundwork for a new wing.
The evidence I'll pursue will not be based on biographical and historical knowledge. Research into these contextual
sources of information has already been done, far better than I can ever hope to do, by experts who have
vastly superior knowledge of the subject at hand and of the methods used for such research.
Instead I will focus on textual sources, the clues in the text of the Wilhelmus itself. Although this field of
expertise is not nearly as crowded as context-based research, the practice of determining or verifying the

123 Veenendaal 1985, 79.


124 Maljaars 1996, 151-152.
125 Maljaars 1996, 157.
126 Maljaars 1996, 158-159.

authorship of an anonymous text based solely on internal evidence is also a very old one, dating back to at least
the medieval scholastics.127 Human experts performed qualitative analyses, finding clues within the text.
Contributing will still be almost impossible if I only follow in the footsteps of giants, that is, if I try to
repeat or add to their methods. I've got to try to stand on their shoulders, applying relatively new methods to this
old case steeped in tradition, resting one foot on the canon of Wilhelmus research and one foot on the discipline of
text-based authorship attribution with statistical methods and computational means.
These methods require quantitative analyses. While qualitative research is based on precise observation
and description of individual occurrences, quantitative research is based on computing frequencies, relations, and
distributions of features and relevant statistics.128 Most modern research in computational literary studies, including
the research I favor, is likely to practice a mixed method. Qualitative observations can be confirmed by
quantitative analysis, and quantitative findings often need qualitative analysis to explain certain results. I'll
elaborate in the following sections.
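
To make the notion of "computing frequencies" concrete, the following is a minimal sketch in Python of the most basic quantitative step: counting word forms and expressing each count as a share of the whole text. The function name relative_frequencies, the tokenization rule and the sample lines (the opening of the Wilhelmus in modernised spelling) are assumptions chosen for illustration only; they are not the procedure or the data used in this thesis.

```python
from collections import Counter
import re


def relative_frequencies(text, top_n=5):
    """Return the top_n most frequent word forms with their relative frequencies.

    Illustrative helper (hypothetical name): tokens are lowercased runs of letters.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    # Relative frequency = occurrences of a word divided by the text length in tokens.
    return [(word, count / total) for word, count in counts.most_common(top_n)]


# Illustrative input only: opening lines of the Wilhelmus in modernised spelling.
sample = (
    "Wilhelmus van Nassouwe ben ik van Duitsen bloed "
    "den vaderland getrouwe blijf ik tot in den dood"
)

for word, freq in relative_frequencies(sample, top_n=3):
    print(f"{word}: {freq:.3f}")
```

Relative rather than absolute counts are used in this sketch because they allow texts of very different lengths to be compared, which matters for a text as short as the Wilhelmus.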

Nontraditional authorship attribution studies


Authorship attribution based on internal textual features has gained greater prominence in the past decades, not
only because of the broadening of its applications to history, forensics, corpus stylistics129 and other fields of
literary study, but also because of the development of computational methods and tools for its problems.130 It
began with the study by Mosteller and Wallace131 (1964) on the authorship of The Federalist Papers, undoubtedly
the most influential research of its kind. Essentially, it initiated nontraditional authorship attribution studies, as
opposed to traditional methods based on human experts.132 From then until the late 1990s, research in
authorship attribution was dominated by attempts to define features for quantifying writing style, a line of research
known as stylometry.133 The fundamental assumption is that individuals have idiosyncratic habits of language
use, leading to stylistic similarity between texts written by the same person.134 For authorship attribution involving

127 Koppel, Schler and Argamon 2009.


128 J. Berenike Herrmann, Karina van Dalen-Oskam and Christof Schöch, Revisiting Style, a Key Concept in Literary Studies.
Journal of Literary Theory 9, no. 1 (2015).

129 M. P. Oakes, Corpus linguistics and stylometry. In A. Lüdeling & M. Kytö (eds.), Corpus Linguistics: An International
Handbook (Berlin: Mouton de Gruyter, 2009), pp. 1070-1090.
130 Koppel, Schler and Argamon 2009.
131 Frederick Mosteller and David L. Wallace, Inference and disputed authorship: The Federalist (Reading, MA: Addison-Wesley, 1964).

132 Efstathios Stamatatos, A survey of modern authorship attribution methods, Journal of the American Society for
Information Science and Technology 60, no. 3 (2009).
133 David I. Holmes, Authorship attribution, Computers and the Humanities 28, no. 2 (1994); David I. Holmes, The
Evolution of Stylometry in Humanities Scholarship, Literary and Linguistic Computing 13, no. 3 (1998).

134 Stefan Evert et al. Explaining Delta, or: How do distance measures for authorship attribution work?
Presentation 24 July 2015, Lancaster.

quantitative text analysis, this leads to the practice of attributing texts of unknown or disputed authorship to an
author with the same style, based on quantitatively measured linguistic features.135

Hesitation by literary scholars and mistrust of such a blatantly quantitative approach may be
alleviated by choosing the least contestable mode of analysis, namely that of counting. The stylometrist
looks for a unit of counting which accurately translates the style of the text, where we may define style as a
set of measurable patterns which may be unique to an author.136 Thus, style, as defined by computational
literary studies, means something quite different from our usual understanding of the term.137 Traditional
stylistics has a holistic approach to style, focusing on the semantics of a text, while computational
stylometry focuses on the aspects of style that traditional stylistics usually ignores.138 For the sake
of clarity, I employ the following definition of style: "Style is a property of texts constituted by an ensemble of
formal features which can be observed quantitatively or qualitatively."139 In this definition, style is not
something unique to literary works; rather, every text has a certain kind of style. The described
ensemble of formal features may, however, be interpreted from a literary perspective.140
The statistical analysis of a literary text can be justified by the need to apply an objective methodology to
work which, for a long time, may have received only impressionistic and subjective treatment.141 Features are
counted or measured and often further analyzed with mathematical statistics to discover thresholds, significant
regularities (patterns) or irregularities (outliers) and many more stylistic characteristics of a text or corpus of virtually
any size. Statistically and computationally supported authorship attribution measures textual features, sometimes
of great complexity and/or in great quantity, to distinguish between texts written by different authors.142 This is not
only another way of analysis, it's another way of reading a text.

Distant reading
With the ever-progressing digitalization of literature and the development of tools for the selection, structuring and
analysis of digital text, distant reading has become an alternative to close reading.
When close reading, the researcher works heuristically, searching within a text for confirmation or
falsification of a theory or a frame of interpretation. This method has given the literary scholar many findings of a
subjective or intersubjective nature. A reader or group of readers is limited in the amount of work they can read.

135 Eder 2010


136 Holmes 1994.
137 Jan Rybicki, Visualizing Literature: Artistic Statistics. in Art in Literature, Literature in Art, Eds. Magdalena Bleinert,
Izabella Curyłło-Klag, and Bożena Kucała (Kraków: Jagiellonian University Press, 2014).
138 M. Eder, A bird's eye view of early modern Latin: distant reading, network analysis and style variation. in
New Technologies in Medieval and Renaissance Studies, Eds. M. Ullyot, D. Jakacki, and L. Estill (Toronto/Tempe: Iter and the
Arizona Center for Medieval and Renaissance Studies, 2015).

139 Herrmann, van Dalen-Oskam and Schöch 2015.


140 Herrmann, van Dalen-Oskam and Schöch 2015.
141 Holmes 1994.
142 Stamatatos 2009.

The study or reading of entire genres or literary periods is therefore beyond the grasp of the close reader.
Moreover, counting small textual elements such as words, letters or punctuation marks is usually too
time-consuming for the scholar to perform, even when confining himself to one or several novels.143
Distant reading is a mode of reading in which textual units are quantitatively analyzed with the aid of a
computer. This data-based research disaggregates the text into measurable units and converts them into numbers, which
can be analyzed in enormous amounts with relative ease. It is only with these methods, those of
computational literary studies, that patterns, correlations, models and structures within the data can be calculated
and visualized, and that these trends and relations can be discovered and confirmed or rejected.144
The methods of distant reading have a character that conforms to the social sciences, where data and
the interpretation of that data are often the main focus of research. The analysis itself doesn't contain
interpretation, and the data are an objective result based on calculations.145 Interpretation of the text starts with the
interpretation of the results or data. Conclusions are backed up or justified by reproducible numbers, which, while
always depending on the methods, the corpus, the parameters, the tools and other design choices of the
researcher, can be considered an approximation of the facts. "Distance is a condition of
knowledge."146 The distance to the text allows us to do research on a scale and of a kind that is unfeasible with close
reading.

Why not stylometric/visual?


Computational and statistical analyses might give you knowledge about the tiny characteristics of an ocean of
text, but they definitely do not present you with a holistic view. While distant reading, the researcher will not gain
knowledge of the semantic dimension of his corpus. The handles for interpretation during a distant reading are
numbers of which, in the most extreme case, we don't even know the texts that produced them. It is much like looking
at a city on a map: you do not know any of its inhabitants. So not only will I be relatively limited in my
biographical and historical knowledge of the Wilhelmus and its potential authors, but I'll also lose, along with the
practice of hermeneutics, basic knowledge about the texts in my corpus.
I will not read, in the traditional sense, the complete texts of my corpora, let alone perform a close
reading on them, because I don't actually require all this semantic knowledge. A computer can count features, do
statistical analyses and provide us with the results, optionally in the form of a useful or flashy visualization. The
computer will not know why to prefer Cormac McCarthy over Julian Barnes, despite this being obvious, but it
will be able to quantify and analyze the textual content of both in a matter of seconds.

143 Tim de Winkel Het vergezicht van de Nederlandse literatuur. Een distant reading van een groot corpus van modern
Nederlandstalig proza 2015 unpublished

144 Anne Burdick et al., Emerging Methods and Genres in Digital Humanities (Cambridge, MA: The MIT
Press, 2012), 38.
145 Stephen Ramsay, Reading Machines: Toward an Algorithmic Criticism (Champaign IL: University of
Illinois Press, 2011), 19.
146 Franco Moretti, Conjectures on World Literature. New Left Review 1 (2000): 57.

While I lose one aspect of understanding my corpus, I will gain the opportunity to read an enormous
quantity of work, to read features smaller than the units we traditionally read, like characters or punctuation marks,
and to perform difficult analyses that cannot be performed with the naked eye, a pencil and a piece of paper. In order to gain
knowledge about the system, you've got to sacrifice knowledge about the texts themselves.147
This makes the essence of non-traditional authorship attribution and stylistics incompatible with the
traditional way of reading, to such a degree that I join the assertion that we should see computational literary studies not only as
a set of new methodological options, but as a new way of reading. So when reading a text, or perhaps even a great
number of texts, or when asking questions about literature, language and its anchors in society, the scholar
has to determine how to go about it: why not stylometric/visual?148

The human factor and real world Authorship attribution


I've spoken so far about data-based research and objectivity, but this does not mean that all research conditions
are determined by the computer. We do not take the human factor out of the analyses, and it's definitely not as
simple as it might have come across in the last few pages.
The stylometrist's job is to set the stage for analysis by providing the material and setting the conditions, as
well as to interpret the results and to ask the right questions to begin with, again highly similar to the social sciences.
Before the analyses, the selection of the corpus, tools, methods and parameters has to be determined, as do the
drawing up of the hypotheses and of course the interpretation of the results. All of this is under the control of the
human researcher, and he is accountable for his choices. This means you have to have solid theoretical and
practical knowledge of both the methods and the programs, including the mathematical statistics that they use, as
you should be able to understand the results and anticipate the limitations of the formulas, parameters, chosen
features and so on. In addition, in an experimental setting, the feedback during the analysis, whether results or
errors, along with your methodological design and scientific goal, will cause you to adjust all of these variables:
the texts included in your corpus, the features measured, the methods used and perhaps even the programs which run
your analyses.
These considerations seem to go against the earlier theoretical emphasis on the perks and drawbacks
of distant reading, namely losing knowledge about the semantic and biographical dimensions of your corpora. In order to
pick your texts for inclusion and interpret the results of the analysis, knowledge is required. However, this knowledge
might be very specific or limited, as criteria for inclusion may be "everything French", "Renaissance texts about
women", "any text from 1945" or even "any blog available". Knowledge might also be pretty extensive, as the
requirement you set on a text might be "written by a possible author of the Wilhelmus" and non-specific in any other way.

147 Moretti 2000.


148 Rybicki 2014, 9.


When performing authorship attribution, the choice of criteria believed to characterize authors is the very first step.
One should probably not believe that any single set of variables is guaranteed to work for every problem, so
researchers must be familiar with variables that have worked in previous studies as well as the statistical methods
to determine their effectiveness for the current problem.149 In my case the selection of the features, tools and
methods depends on their ability to handle short and noisy texts from multiple candidate authors.

Categorization task
A variety of methods has been applied to authorship problems of various sorts,150 151 like author verification,152
plagiarism detection, author profiling and detection of stylistic inconsistencies.153 The simplest kind of authorship
attribution, and the one that has received the most attention, is the one in which we are given a small closed set
of candidate authors and are asked to attribute an anonymous text to one of them. Ideally, we have copious
quantities of text of undisputed authorship by each candidate author, and the anonymous text is reasonably
long.154 This is called a multiclass, single-label text categorization task.155 This is a very solvable problem, one that has been
tackled very often and very successfully.
So in its straightforward form, an authorship attribution problem fits the standard modern paradigm of a
categorization problem.156 157 There are, however, some important characteristics that distinguish authorship
attribution from other text categorization tasks, and these are the differences we've got to keep an eye on when
determining design choices.158 First of all, in style-based text categorization, the most significant features are the
most frequent ones,159 while in topic-based text categorization, the best features should be selected based on their
discriminatory power.160 Another difference is that in authorship attribution tasks, especially in forensic

149 Holmes 1994.


150 Patrick Juola, Authorship Attribution. Foundations and Trends in Information Retrieval 1. no.3 (2008).
151 Koppel 2009.
152 Moshe Koppel and Jonathan Schler, Authorship verification as a one-class classification problem. in
Proceedings of the 21st International Conference on Machine Learning (New York: ACM Press, 2004).
153 Stamatatos 2009.
154 Moshe Koppel, Jonathan Schler, Shlomo Argamon, and Yaron Winter, The Fundamental Problem of Authorship
Attribution. Stylometry and Authorship Attribution 93 no.3 (2012).
155 Stamatatos 2009.
156 David D. Lewis and Marc Ringuette, A Comparison of Two Learning Algorithms for Text Categorization. Symposium on
Document Analysis and IR, ISRI (Las Vegas, 1994). Fabrizio Sebastiani, Machine learning in automated text categorization.
ACM Computing Surveys 34 no. 1 (2002).

157 Koppel, Schler and Argamon 2009.


158 Stamatatos 2009.
159 John Houvardas and Efstathios Stamatatos, N-gram feature selection for authorship identification. in Proceedings of
the 12th International Conference on Artificial Intelligence: Methodology, Systems, Applications (Berlin: Springer, 2006);
Moshe Koppel, Navot Akiva and Ido Dagan, Feature instability as a criterion for selecting potential style markers. Journal of
the American Society for Information Science and Technology 57, no. 11 (2006).

160 George Forman, An extensive empirical study of feature selection metrics for text classification.
Journal of Machine Learning Research 3 (2003).

applications, there is extremely limited training text material, while in most text categorization problems there is
plenty of both labeled and unlabeled data.161 This is definitely the case in my thesis. Also, in most cases the training
texts are unevenly distributed over the candidate authors. In such cases, the evaluation of authorship attribution methods should not follow the
practice of other text categorization tasks, since those usually have a well-balanced corpus.162
Writing in a forensic context, Bailey163 proposed three rules to define the circumstances necessary for
authorship attribution:164 first, the number of putative authors should constitute a well-defined set; second, the
lengths of the writings should be sufficient to reflect the linguistic habits of the author of the disputed text and also
those of each of the candidates; third, the texts used for comparison should be commensurate with the disputed
writing. Bailey also lists the general properties that the features used for attribution should have: they should be salient, structural, frequent and
easily quantifiable, and relatively immune from conscious control.

Relevance
The Wilhelmus as object of research
As described in the theory, the Wilhelmus can be considered an important factor in the birth and consolidation of
the Dutch national identity and perhaps even an actual factor in the struggle for independence. This gives the
text major historical and contemporary importance. A nation should know its own history, and the Wilhelmus is a
central part of it. Knowing the author could open a window onto other historical facts, the Prince's persona and his role,
and the Eighty Years' War.165 Moreover, this sense of urgency is shared by the Dutch citizenry. With the renewed
interest in our own history and the culture of the Dutch Golden Age, the so-called cultural nationalism, the
popularity of the national anthem has experienced a renaissance as well.166 The Wilhelmus is the only national
anthem in Western Europe that has made this kind of comeback.167
A second argument I'd like to point out is the unique position the Wilhelmus holds in Dutch literary
history and the global history of song, as the world's oldest official national anthem. This canonical status gives
the song major literary relevance.
Another feature that makes the Wilhelmus such an interesting object of research is the high literary quality
that is attributed to it, despite the often criticized imperfect rhyme. Dick Coster calls the Wilhelmus the highest of

161 Stamatatos 2009.


162 Stamatatos 2009.
163 Bailey, R. W. (1979). Authorship attribution in a forensic setting. In D. E. Ager, F. E. Knowles, & J. Smith (Eds),
Advances in computer-aided literary and linguistic research (pp. 1-15). Birmingham, UK: University of Aston in Birmingham,
Department of Modern Languages.

164 Holmes 1994.


165 Bonger 1985, 9.
166 Grijp I 1998, 62.
167 Grijp inleiding 1998, 10.

Geuzenliederen,168 and Martinus Nijhoff called the Wilhelmus "poésie pure" and considered it among the absolute
greatest poetry. I've translated a comment of Nijhoff's below to illustrate his poetics: "There are among the songs of
beggars poems with a real tone, grand, deep and strong. Much deeper and greater than Vondel, Hooft and
Bredero have ever made audible."169 If we take literature seriously as an object of research, a search for the author of
one of its highlights seems only natural.
Obviously, the opinion that the Wilhelmus is a valuable object of research is shared by the authors of the 350
Dutch scholarly publications, mostly of a literary-historical character, about the Wilhelmus registered by Maljaars and
Lenselink in 1993.170 This number has only grown in the 20 years since. According to a rough estimate,
there are currently around 350 to 400 publications concerning the Wilhelmus and its authorship question, adding up to
15000 pages, or about 1000 pages per verse of the song.171 So much has been written about the
Wilhelmus that it seems almost impossible to form new thoughts on the matter without any discoveries from the
archives.172 What can my thesis possibly add?

Lack of stylistic research


An important argument for performing a comparative stylistic analysis on the Wilhelmus and texts from potential
authors is that there hasn't been much solid research of this kind. Most researchers tried to reconstruct a system of
ideas, but neglected to investigate whether the language of the Wilhelmus was also the language of their author of
choice.173 When, in 1925, Leenderts' book about the Wilhelmus came out, in which he defended Marnix as the
author, De Vooys expressed his surprise at the absence of stylistic research.174 Twenty years later, in the
same literary magazine as his colleague, Van Haeringen came to the same verdict and suggested that a stylistic
analysis comparing Marnix's poetry and the Wilhelmus was in order. Marnix's psalms in particular would share enough
topic and tone with the hymn to justify such a comparison.175 To the call of literary history to solve its
open case, and the call from my countrymen to quench their thirst for nationalism, I add the call of the field of
Wilhelmus studies for stylistic research.

Methodological gain for my casus


Another important argument in favor of performing a comparative stylistic analysis on the Wilhelmus and texts
from potential authors, is that a lot of the current methodological problems with this type of research and possible

168 Veenendaal 1985, 90.


169 Nijhoff, M. Het Wilhelmus 1926 In; Verzameld Werk II, Kritisch, verhalend en nagelaten proza. Den
Haag: Bert Bakker, 1961, p 394
170 Grijp inleiding 1998, 9.
171 René van Stipriaan, Het Wilhelmus, de nationale whodunnit: cold case of oplosbaar? Commissie Taal en Letterkunde,
MNL, Leiden. 7 november 2012. [Lecture]
172 Veenendaal 1985, 73.
173 Lenselink 1950, 241.

174 Maljaars 1996, 11.


175 Lenselink 1950, 241.

attribution can now be solved by adopting the methodology of computational literary studies. I've already explained
why comparative stylistic research has been problematic up until now, and I've mentioned that computational
methods could make it less so. I discuss the solutions below.
The problem of the petitio principii for comparative research was that the question brings forth the answer. When
searching on the basis of similarities in idiom, the author needed to be determined a priori, leaving only the option
of gathering evidence. The opposite, comparative analysis based on differences, can only lead to excluding
authors. With distant reading, many works of many authors can be compared on a large number of textual
features in a matter of minutes. With so few limitations on the amount of text and the number of features, because
counting no longer consumes time and effort, we can examine the texts of all the potential authors and compare them to
the Wilhelmus simultaneously, on a number and scale of features previously impossible to measure. Visualization
of these analyses expresses the stylistic similarity of each potential author's style to that of the
anthem, making the results accessible. Now the previous hypothesis, "Is Marnix's work stylistically comparable
with the Wilhelmus?", becomes "Is Marnix the most stylistically comparable to the Wilhelmus of all its potential
authors?", avoiding a circular argument and a self-fulfilling prophecy.
Ascertaining objectively the relevance or number of similarities or dissimilarities needed between two
corpora of language in order for them to be considered as by the same or by different authors is still impossible, but
statistics can help us to determine an acceptable margin of error, a probability of success, beyond which we
accept a result as factual until contradicting results emerge.
A related problem is that semantic and stylistic similarities overlap176 and are not two mutually exclusive
categories. When measuring style quantitatively, a human expert needs to determine the nature of a textual
dimension every time he encounters a similarity or dissimilarity. By using computational research you can ignore
this problem, because the computer doesn't cling to this distinction while, most of the time, heavily favoring features we would
characterize as stylistic. It does not recognize vague semantic parallels177 if they're not grounded in a textual
base. Effects of genre will occur because they do have this base. Any associations of the researcher based on an
intimate knowledge of themes and motifs and strong feelings about their quality will be harder to find.
The problem of creative, and thus stylistic, emulation is not completely solved by the use of computational means, as
we still wouldn't know from the comparison alone which one is the original and which the epigone, but when using
low-level features, like the frequencies of function words, it does solve the problem of conscious imitation, because the usage of
these is nearly impossible to mimic and very hard to influence.178
The problems surrounding subjectivity are still present but a lot smaller and more manageable, provided the
researcher acknowledges his own horizon. As the role of the researcher is now predominantly moved to the setup

176 Maljaars 1996, 32.


177 Maljaars 1996, 32.
178 Stamatatos 2009.

of the experiment and the interpretation of the results, I maintain that although it might be the computer doing the analysis, it is on the
human researcher's terms.

Relevance for computational literature


This thesis has two main goals, one topical and one methodological. The niche of computational literary
studies is, in my experience, a hardworking, approachable and collaborative group of people who form a field of
study that's still relatively new, extremely exciting but also quite atypical for the humanities. In my opinion this field
of study might, with its social-science methodology, distant reading and thereby alternative scientific approach to the
study of literature, provide answers to questions of a theoretical and philosophical nature about literature, about
literary research and perhaps about the humanities in general. Whether these answers will be satisfying or not,
and whether the results produced by computational literary studies will be grand and spectacular or not at all, I'm
convinced we should give it a try and pursue its possibilities.
The only way to pursue them is by doing research. This can be theoretical and methodological research, as
most of the papers in computational literary studies are, or research aimed at answering questions about literary
texts, handling a case. Theoretical and methodological research aims to test and confirm the possibilities and
limits of a design. Computational literary studies has produced quite a lot of this type of research and now has, in my
opinion, a pretty solid methodological base. One important reason for conducting my thesis as I've done is my
belief that if you want a field of science to be taken seriously, especially if it provokes objections and skepticism from
the more traditional branch, it should produce results that speak for themselves. If computational literary studies
solves interesting and relevant literary cases, as it has been doing, then the usefulness of its methods will no
longer be open to plausible dispute. This is why I desperately wanted to analyze the Wilhelmus, despite it being
obvious that there was a bigger chance of failing to answer my main question than of succeeding in answering it.
The scientific practice of accepting the main question of a thesis as unanswered is, while pretty common
in the social sciences, something that's hard to sell in the humanities. This resistance is inherent in close reading
and in interpretation on the basis of the subjective verdict of a human expert. A satisfying answer to "How does the
postmodern novel relate to the innate modernistic practice of the literary critic?" can never be "I don't know".
Despite my own conviction that adding a "don't know" option to the repertoire of possible answers is a scientific
necessity, I'm well aware of the risk it poses for my individual trajectory of education. On the other hand,
succeeding would, because of the relevance of the anthem, add to the acceptance of computational literary studies,
which needs to be acknowledged for its potential. I aim to stimulate this acknowledgement by choosing a relevant
case, one that by itself demands notice, and conducting the research in such a way that any outcome will count
as a significant result.
This case of authorship attribution probably requires my methods to be stretched to their limits, with the
risk that they will not be able to actually solve it. My thesis will function as a marker, marking the limitations of
my methods. The results will, either way, be an indication of the possibilities of the methods and their
performance under very unfavorable circumstances. The sections "Conclusion" and "Discussion" will elaborate on


these aspects of the results. In addition, I'm also keen to add to the catalogue of computational
research on Dutch text, since I'm a research master's student in Dutch language and literature.

Methods
In this section I'll discuss the methods I've used and the methodology that I've seriously considered, along with
their theoretical background, previous results and practical use.

Automated Authorship Attribution


Automated techniques for authorship attribution can be divided into two main types: similarity-based methods
and machine learning methods.179 In machine learning methods, the known writings of each candidate author,
considered as a set of separate training documents, are used to construct a classifier that can then be used to
classify anonymous documents.180 The idea is to formally represent training texts as numerical vectors and then
use a learning algorithm to find a formal rule for the boundaries between classes (authors), known as a classifier,
that assigns each such training vector to its known author. This same classifier can then be used to assign an
anonymous document to (what one hopes is) the right author.181 Research on these methods has focused on the
choice of features for document representation and on the choice of learning algorithms. I won't use this type of
method because it requires large amounts of suitable training text to train a classifier, which I don't have
at my disposal.
In similarity-based methods, a metric is used to measure the similarity between two documents, and an
anonymous document is attributed to the author whose known writing is closest.182 Both univariate approaches, in which a
single numeric function of a text is sought to discriminate between authors, and multivariate approaches, in which
statistical multivariate discriminant analysis is applied to word frequencies and related features, are similarity-based
methods.183 Research on these methods has focused on the choice of features for document representation,
on methods for dimensionality reduction of the feature space (such as Principal Component Analysis) and on the choice of
distance metric.184 I'll pursue similarity-based methods, described in the following sections.
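
The similarity-based procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the setup used for the analyses in this thesis: the distance is a Delta-style measure (z-scored relative frequencies of the most frequent words, compared by mean absolute difference), the function names are hypothetical, and the toy texts and author labels are invented.

```python
from collections import Counter
from statistics import mean, pstdev


def most_frequent_words(texts, n):
    """Return the n most frequent word tokens over all texts combined."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return [word for word, _ in counts.most_common(n)]


def relative_frequencies(text, vocabulary):
    """Represent a text as the relative frequency of each vocabulary word."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in vocabulary]


def attribute(known, anonymous, n_words=5):
    """Attribute `anonymous` to the closest author in `known` (author -> text).

    Assumption: means and standard deviations are taken over all texts,
    including the anonymous one. Returns (closest author, all distances).
    """
    vocabulary = most_frequent_words(list(known.values()) + [anonymous], n_words)
    profiles = {a: relative_frequencies(t, vocabulary) for a, t in known.items()}
    target = relative_frequencies(anonymous, vocabulary)

    # Standardize every word frequency so that no single word dominates.
    columns = list(zip(*(list(profiles.values()) + [target])))
    means = [mean(column) for column in columns]
    spreads = [pstdev(column) or 1.0 for column in columns]

    def z_scores(vector):
        return [(x - m) / s for x, m, s in zip(vector, means, spreads)]

    z_target = z_scores(target)
    distances = {
        author: mean(abs(x - y) for x, y in zip(z_scores(profile), z_target))
        for author, profile in profiles.items()
    }
    return min(distances, key=distances.get), distances


# Invented toy corpus, for illustration only.
known_texts = {
    "A": "the cat and the dog and the bird sat in the sun and the rain",
    "B": "a cat or a dog or a bird sat on a mat or on a chair",
}
disputed = "the fox and the hen and the owl sat in the barn"
closest, all_distances = attribute(known_texts, disputed)
print(closest, all_distances)
```

With real material the vocabulary would normally consist of far more than five words, and the reliability of the outcome depends heavily on text length, which is exactly the difficulty a short text like the Wilhelmus poses.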

Features:
An important variable in authorship attribution (AA) research/analysis is the choice of stylometric features.
Changing the features means changing the things you measure and in doing so you change the conditions under

179 Koppel et al. 2012.


180 Moshe Koppel, Jonathan Schler and Shlomo Argamon, Authorship Attribution: What's easy and what's hard? Journal
of Law and Policy 21 (2013).
181 Koppel, Schler and Argamon 2009.
182 Koppel, Schler and Argamon 2013.
183 Koppel, Schler and Argamon 2009.

184 Koppel, Schler and Argamon 2009.



which the things you want to measure will be measured. Previous studies185 on authorship attribution have
proposed taxonomies of features to quantify writing style, the so-called style markers, under different labels
and criteria.186 The classification of features can be based on the size of a feature, but can also refer to more
complicated characteristics, such as narrative perspective or textual macro-structure.187 In my classification I
distinguish size from type and discuss them, in that order, in the next paragraphs. I discuss the four units or
sizes of features first and separately from the types of features, in order to explain the characteristics of the units,
which otherwise might get lost in the greater complexity of feature types. Later on I will not persist in this
distinction.

Size of features
An influential characteristic of features is their size. Changing the unit in which they're counted and analyzed
means reading the text differently. There are four different units of features that can be counted as being of
different size: character, lexical, syntactic and semantic features.

Character features
To measure character-level features means to count any character as a unit, including blank spaces and/or
punctuation marks. There are various character-level measures, for example alphabetical character count, digit
count, uppercase/lowercase count, letter frequencies and punctuation count,188 that have proven to be useful.189
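
As an illustration, the character-level measures listed above could be collected as follows. This is a minimal sketch, not the tooling used in this thesis: the function name `character_features` and the dictionary keys are hypothetical, and "punctuation" is assumed to mean the ASCII punctuation marks in Python's `string.punctuation`.

```python
import string
from collections import Counter


def character_features(text):
    """Count a handful of character-level features of a text."""
    letters = [c for c in text if c.isalpha()]
    return {
        "letters": len(letters),
        "digits": sum(c.isdigit() for c in text),
        "uppercase": sum(c.isupper() for c in text),
        "lowercase": sum(c.islower() for c in text),
        # Assumption: only ASCII punctuation marks are counted.
        "punctuation": sum(c in string.punctuation for c in text),
        "spaces": text.count(" "),
        # Letter frequencies, ignoring the upper/lowercase distinction.
        "letter_frequencies": Counter(c.lower() for c in letters),
    }


# Sample line for illustration; the spelling is not taken from a specific edition.
sample = "Wilhelmus van Nassouwe ben ick van Duytschen bloet."
features = character_features(sample)
print(features["letters"], features["letter_frequencies"].most_common(3))
```

To compare texts of different lengths, such raw counts would normally be divided by the total number of characters.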
Lexical features
Lexical features are words, or any bundle of characters between two spaces. Frequently used examples of lexical-level
features are function words or content words, predominantly nouns that are expressive of the topic or genre
of the text. The usage of lexical features is of course not bound to the obvious or easily imaginable cases;
there's a whole palette of unexpected applications. We can also make use of proper nouns or, to be specific, count

185 Holmes 1994; Efstathios Stamatatos, Nikos Fakotakis and George Kokkinakis, Automatic Text Categorization in Terms
of Genre and Author. Computational Linguistics 24, no. 4 (2000);
Rong Zheng, et al. A Framework for Authorship Identification of Online Messages: Writing-style Features and Classification
Techniques. Journal of the American Society for Information Science and Technology 57, no. 3 (2006).

186 Stamatatos 2009.


187 Herrmann, van Dalen-Oskam and Schöch 2015.
188 Stamatatos 2009.
189 Olivier de Vel et al., Mining E-mail Content for Author Identification Forensics. ACM SIGMOD Record 30, no.4 (2001):
55-64; Zheng et al. 2006.


geographical names,190 in order to analyze the reader's sense of geographical place and distance. Lexical-level
features have a grand résumé in the computational literature.191
In the case of character-level or lexical-level features, the analyst can choose to increase the size of the unit by
using n-grams. This means that instead of counting one character or one word, you count sequences of n consecutive
characters or words, with n higher than one. The computer will then search for combinations of words or characters,
like "he is" or "?!". This is not possible for the other feature levels, because lexical and character features
consider a text as a mere sequence of word tokens or characters, while syntactic and semantic features require
deeper linguistic analysis.192
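As a minimal sketch of what counting n-grams means, assuming simple whitespace tokenization and lowercasing for words (both simplifications of my own for the sake of illustration), character and word n-grams can be extracted as follows. The function names `char_ngrams` and `word_ngrams` are hypothetical.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count all overlapping sequences of n characters (spaces included)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text: str, n: int) -> Counter:
    """Count all overlapping sequences of n words (whitespace tokenization)."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


sample = "he is here and he is there"
char_bigrams = char_ngrams(sample, 2)
word_bigrams = word_ngrams(sample, 2)

print(char_bigrams.most_common(3))
print(word_bigrams[("he", "is")])
```

The character bigram "he" is counted inside "here" and "there" as well as in the word "he", which illustrates how character n-grams pick up sub-word information that word n-grams miss.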
Syntactic features
Syntactic features are sentence-based features. Research into syntactic features tries to discover a syntactic
pattern that the author is presumed to put unconsciously into all of his texts. Intuitively this feels like a more
reliable fingerprint than lexical information, but this intuition is not necessarily correct. The analysis of
syntactic features is language dependent and often requires robust and accurate natural language processing (NLP)
tools. It relies on the availability of a parser able to analyze a particular natural language with
relatively high accuracy. Even then the resulting data-sets will be noisy, due to unavoidable errors made by the
parser.193 Syntactic features alone often perform worse than lexical features, but a combination of the two often
improves the results.194
Semantic features
Semantic features are features that are sought out and analyzed based on their meaning. We tend to
see semantics as holistic, subjective and interpretative, so one would expect semantic features to require large
units and to be heavily dependent on context. Sometimes, however, semantic features rely on the counting of lexical
features. When, for example, looking for sexist remarks in a book, perhaps by counting curse-words, you're looking
for semantic features, although semantic features often require more interpretation than just counting. Whether
semantic features, or non-semantic features for that matter, are a feature size that actually exists, is open to debate.
The more detailed the text analysis required for the extraction of stylometric features, the less accurate
and the more noisy the produced measures are. NLP tools can be applied successfully to low-level tasks such as
sentence splitting, part-of-speech (POS) tagging, text chunking, and partial parsing, so the relevant features can be
measured accurately and the noise in the corresponding data-sets remains low. More complicated tasks such as
full syntactic parsing, semantic analysis, or pragmatic analysis cannot yet be handled adequately by current NLP

190 Karina van Dalen-Oskam, Names in Novels: an Experiment in Computational Stylistics. Literary and
Linguistic Computing (2012).
191 Patrick Juola, Authorship attribution for electronic documents. in Advances in digital forensics II. Eds. M. Olivier and S.
Shenoi (Boston: Springer, 2006); Stamatatos 2009; Koppel, Schler and Argamon 2009.
192 Stamatatos 2009.
193 Stamatatos 2009.

194 Michael Gamon, Linguistic correlates of style: Authorship classification with deep linguistic analysis
features, in Proceedings of the 20th International Conference on Computational Linguistics (Morristown, NJ:
Association for Computational Linguistics, 2004).

technology for unrestricted text. As a result, very few attempts have been made to exploit high-level features for
stylometric purposes.195 The most important method of exploiting semantic information so far was described by
Shlomo Argamon.196 He defined a set of functional features that associate certain words or phrases with
semantic information. Regrettably, he did not provide information about the accuracy of the tools or methods.
I will not attempt to perform analyses on the semantic feature-level because of the high requirements and
the lack of convincing research and theory. It seems impossible not to measure semantics, or syntax, when
measuring small but meaningful units like words. However, I've no practical way to quantify or measure the extent
of their influence, so I won't pursue semantic features as a feature-level but treat them as a welcome influence
included in the measured low-level features.

Types of features
Types of features refers to what a feature measures. In this section I'll explain the types I've taken into
consideration, going from easy low-level features up to the higher-level features.

Punctuation
I'll not discuss research based on punctuation in this thesis. The Wilhelmus makes very little systematic use of
punctuation, presumably because of all the interference by publishers and its oral transmission, so extensive
analysis will, in all probability, not lead to any useful results.197

Complexity measures
While a lot of the early research into the stylistic authorial fingerprint focused on complexity measures, I
won't use any features of this type. The great variety of complexity features used to measure vocabulary richness
or text complexity, including sentence length, average word length, word length distribution, word frequencies,
character frequencies, syllable or letter count and others,198 have all proven to be inadequate
for authorship attribution and have been surpassed by better methods.199 Word length, for example, proposed by
Mendenhall,200 appears to be so unreliable that any serious student of authorship should discard it.201
Stamatatos202 discards both sentence length counts and word length counts, two simple complexity
measures, because they may introduce considerable noise in measurement.

195 Stamatatos 2009.


196 Shlomo Argamon et al. Stylistic text classification using functional lexical features. Journal of the
American Society for Information Science and Technology 58, no.6 (2007).
197 Maljaars 1996, 143.
198 Stamatatos 2009.
199 Koppel, Schler and Argamon 2009.
200 T.C. Mendenhall, The Characteristic Curves of Composition. Science 9, no.214 (1887).
201 Holmes 1994.
202 Stamatatos 2009.

More sophisticated measures were invented, like the type-token ratio and the hapax legomena,203 the
words that occur only once in a text. These are methodologically less primitive but still
deliver inadequate results. Both measure some kind of vocabulary richness but are too dependent on text
length.204 Even complicated statistical measures such as Yule's K-measure,205 Sichel's S-measure206 and Honoré's R-measure,207 none of which I will discuss any further, proved of little value.208
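To make the dependence on text length concrete, here is a small sketch that computes the type-token ratio and the number of hapax legomena for a toy token list and for its first half. The function name `vocabulary_richness` and the toy sentence are assumptions for illustration only; no conclusion about real texts should be drawn from them.

```python
from collections import Counter


def vocabulary_richness(tokens: list) -> dict:
    """Type-token ratio and hapax legomena count for a non-empty token list."""
    counts = Counter(tokens)
    types = len(counts)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return {
        "tokens": len(tokens),
        "types": types,
        "ttr": types / len(tokens),   # distinct words divided by all words
        "hapax_legomena": hapaxes,    # words occurring exactly once
    }


tokens = "the cat sat on the mat and the dog sat on the rug".split()
short = vocabulary_richness(tokens[:6])   # first six tokens only
full = vocabulary_richness(tokens)        # the whole toy text

print(short["ttr"], full["ttr"])
```

In this toy case the ratio is higher for the shorter sample than for the full text, simply because frequent words have had less opportunity to repeat, which is the length effect the measures are criticized for.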

Idiosyncrasies
Measures to capture idiosyncrasies of an author's style, like spelling and formatting errors, are not part of my
analyses. The availability of accurate spell checkers is still problematic for many natural languages.209 Human
experts mainly use observations similar to idiosyncrasies to attribute authorship. This is an important reason not
to focus on this aspect, because my aim is to apply methods and analyze features that haven't already been used
or analyzed, in order to keep my thesis as relevant as possible. Maljaars210 describes some of this research and,
unsurprisingly, a lot of it revolves around the weak rhymes.

N-grams on character-level
Frequencies of n-grams on the character level are able to capture nuances of style, including lexical information,
hints on contextual information, use of punctuation and capitalization, among others. Character n-grams are also
tolerant to noise.211 Style-based text categorization includes style-based errors, which can be considered personal
traits of the author212 and which character n-grams will capture as such. Character n-grams also capture lexical
preferences and even grammatical and orthographic preferences without the need for linguistic background knowledge.213

203 Koppel, Schler and Argamon 2009.


204 Stamatatos 2009.
205 George Udny Yule, The statistical study of literary vocabulary (Cambridge: Cambridge University Press,
2014).
206 H.S. Sichel, On a Distribution Law for Word Frequencies. Journal of the American Statistical
Association 70, no.351a (1975).
207 A. Honoré, Some simple measures of richness of vocabulary. Association for Literary and Linguistic
Computing Bulletin 7, no.2 (1979).
208 J.F. Burrows, Not Unless You Ask Nicely: The Interpretative Nexus Between Analysis and Information. Literary and
Linguistic Computing 7, no.2 (1992); Jack Grieve, Quantitative authorship attribution: An evaluation of techniques. Literary
and Linguistic Computing 22, no.3 (2007).

209 Stamatatos 2009.


210 Maljaars 1996.
211 Stamatatos 2009.
212 Moshe Koppel and Jonathan Schler, Exploiting Stylistic Idiosyncrasies for Authorship Attribution. in
Proceedings of IJCAI03 Workshop on Computational Approaches to Style Analysis and Synthesis (2003).
213 Koppel, Schler and Argamon 2009.

A secondary advantage is that the computational requirements are minimal.214 The procedure of
extracting the most frequent n-grams on character level is, contrary to n-grams on word-level, language
independent and requires no special tools; however, the dimensionality of this representation is considerably
increased in comparison to the word-based approach.215 This is because redundant information is captured
and because many character n-grams are needed to represent a single long word.
The application of n-grams on character level to authorship attribution has proven quite successful.216 In
several text-classification tasks, including authorship attribution, bigrams and character n-grams of variable length
produced better results than lexical features.217 This leads me to adopt character n-grams.
An important consideration is the definition of n, that is, how long the string of characters should be. A large n
would better capture lexical and contextual information, but it would also capture thematic information and increase
the dimensionality of the representation substantially (producing hundreds of thousands of features). On the other
hand, a small n, i.e., 2 or 3, would be able to represent sub-word (syllable-like) information, but it would not be
adequate for representing contextual information. The selection of the best n value is language dependent,
since certain languages tend to have longer words than others.218 The problem of defining a fixed value for n can
be avoided by the extraction of n-grams of variable length.219 Sanderson and Guenter220 used several sequences,
with character 4-grams as the longest sequence. Another method that uses character 4-grams is one of the
solutions Koppel's papers221 describe for the general authorship attribution problem, also called the
needle-in-the-haystack problem. I will come back to the methods of Koppel's work, but for now it's important to
mention that he used space-free character 4-grams and got excellent results out of them.222 For the attribution of
Dutch texts with character n-grams,223 Hoorn, Frank, Kowalczyk & Van der Ham ran a methodological experiment testing
the best n for categorization of Dutch poetry by author and settled on trigrams, that is, character 3-grams.

214 Stamatatos 2009.


215 Efstathios Stamatatos, Authorship Attribution Based on Feature Set Subspacing Ensembles. International Journal on
Artificial Intelligence Tools 15, no.5 (2006a); Efstathios Stamatatos, Ensemble-based Author Identification Using Character
N-grams. In Proceedings of the 3rd International Workshop on Text-Based Information Retrieval (2006b).
216 Fuchun Peng, et al., Language Independent Authorship Attribution Using Character Level Language Models. in
Proceedings of the 10th Conference on European Chapter of the Association for Computational Linguistics 1 (Morristown,
NJ: Association for Computational Linguistics, 2003); Vlado Keselj et al. N-gram-based Author Profiles for Authorship
Attribution. Computational Linguistics 3 (2003);
Stamatatos 2006b.
217 Richard S. Forsyth and David I. Holmes, Feature-finding for Text Classification. Literary and Linguistic Computing 11,
no.4 (1996); Grieve 2007.

218 Stamatatos 2009.


219 Houvardas and Stamatatos 2006.
220 Sanderson and Guenter 2006.
221 Koppel et al. 2012; Koppel, Schler and Argamon 2013.
222 Koppel et al. 2012; Koppel, Schler and Argamon 2013.
223 Johan F. Hoorn et al. Neural Network Identification of Poets Using Letter Sequences. Literary and
Linguistic Computing 14, no.3 (1999): 311-338.

I choose two different values of n for character n-grams. The character 3-gram can capture sub-word information and
the character 4-gram is, based on the secondary literature, most of the time the best fit. By using these two
character n-grams I open up several registers and will solve, to a certain extent, the problem of choosing between
stylistic and semantic information. This combination is also more sensitive to differences in the optimal n across
genres.224

N-grams on word-level


To take advantage of contextual information, n-grams on word level, called collocations, have been proposed as
textual features.225 However, the classification accuracy achieved by word n-grams is not always better than that of
individual word features.226 The merit of word n-grams is also their problem. Because they can capture contextual
information, they can capture content-specific information rather than stylistic information.227
The results are convincing enough to try them, although, as noted, word n-grams do not consistently
outperform individual words as features.228 I'll therefore occasionally use collocations, but the
absence of convincing results, especially in comparison with character n-grams and individual word features, led
me to the decision to focus on those features rather than on n-grams on word-level.

Word frequencies
The most straightforward approach to representing texts is by vectors of word frequencies. The vast majority of
authorship attribution studies are, at least partially, based on lexical features to represent style.229 Using word
frequencies, where we look at how many times individual words occur in the corpus under analysis, is different
from using the vocabulary distribution (Vi), where we count how many words occur i times.230
In his pioneering study,231 George Kingsley Zipf was the first to reveal that a relationship exists between
the number of occurrences (i) and their Vi. He ranked the various words of a text according to decreasing
frequency and plotted on log-log paper the ranks r against the corresponding number of times the word of
rank r occurred, obtaining a straight-line configuration. This is called Zipf's first law. It essentially says that the
number of occurrences of a word is inversely proportional to its place on the frequency list, meaning that the first word

224 Efstathios Stamatatos, On the Robustness of Authorship Attribution Based on Character N-gram
Features Journal of Law and Policy 21, no.2 (2013).
225 Rosa Maria Coyotl-Morales et al. Authorship Attribution using Word Sequences. in Proceedings of the 11th
Iberoamerican Congress on Pattern Recognition (Berlin: Springer, 2006); Fuchun Peng, Dale Schuurmans and Shaojun
Wang, Augmenting Naive Bayes Classifiers with Statistical Language Models. Information Retrieval Journal 7, no.1 (2004);
Conrad Sanderson and Simon Guenter, Short Text Authorship Attribution via Sequence Kernels, Markov Chains and Author
Unmasking: An Investigation. in Proceedings of International Conference on Empirical Methods in Natural Language
Processing (Sydney: Association for Computational Linguistics, 2006).

226 Coyotl-Morales et al. 2006; Sanderson and Guenter 2006.


227 Gamon 2004.
228 Coyotl-Morales et al. 2006; Peng et al. 2004; Sanderson and Guenter 2006.
229 Stamatatos 2009.
230 Holmes 1994.
231 G.K. Zipf, Selected studies of the principle of relative frequency in language. (Cambridge, MA: Harvard
University Press, 1932).

occurs twice as often as the second and three times as often as the third. Zipf discovered that the thirty to
fifty most frequent words (MFWs) account for half the word tokens in a novel.232
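The inverse proportionality can be written as f(r) ≈ f(1) / r, where f(r) is the frequency of the word of rank r. The following sketch builds a rank-frequency table for an artificial token list, constructed on purpose so that it follows this idealized relation exactly, and compares the observed counts with the prediction. Real texts only approximate this pattern; the function names and the toy data are assumptions for illustration.

```python
from collections import Counter


def rank_frequency(tokens: list) -> list:
    """Return (rank, word, count) tuples, sorted by decreasing frequency."""
    ranked = Counter(tokens).most_common()
    return [(rank, word, count) for rank, (word, count) in enumerate(ranked, start=1)]


def ideal_zipf(top_count: int, rank: int) -> float:
    """Frequency predicted for a given rank if f(r) = f(1) / r held exactly."""
    return top_count / rank


# Artificial data: 12, 6, 4 and 3 occurrences for ranks 1 to 4
tokens = ("a " * 12 + "b " * 6 + "c " * 4 + "d " * 3).split()
table = rank_frequency(tokens)
predicted = [ideal_zipf(table[0][2], rank) for rank, _, _ in table]

for (rank, word, count), expected in zip(table, predicted):
    print(rank, word, count, expected)
```

The second word occurs half as often as the first, the third a third as often, and the fourth a quarter as often, which is the straight line Zipf observed on log-log paper.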
Tallentire233 discusses the use and difficulty of word frequencies in authorship studies. He points out that
the bulk of any sample of written English is accounted for by the same few words, recurring with the same relative
frequency, even in very different writings, considering that 10 per cent of the English vocabulary
provides for 90 per cent of the text of all the volumes of (English) literature in all the libraries.234 Traditionally such
words, called function words, were excluded from the feature set of the topic-based text-classification methods
since they do not carry any semantic information. However, as it turns out, the most common words, like articles,
prepositions and pronouns, are among the best features to discriminate between authors.235 So far fewer
words, a few hundred, are sufficient to perform authorship attribution than for a thematic text
categorization task, which takes thousands of words.236
In the field of computational literary studies the most frequent words (MFWs) are considered reliable features for
measuring the style of an author, or the authorial fingerprint.237 Besides function words, the MFWs
usually also include other non-semantic words, or semantic words that are so generally used, like "man" or
"time", that they've lost all of their discriminating semantics.238
Patterns of lexical choice can also be represented by modeling the relative frequencies of content words,
but this is very problematic. Content markers might just be artifacts of a particular writing situation or experimental
setup and might thus produce overly optimistic results, not applicable to real-life applications.239 Content words
are also genre and topic specific and under the conscious control of the writer. The style factor of a text is generally
considered orthogonal to its topic. As a result, stylometric features attempt to avoid content-specific information in
order to be more reliable in cross-topic texts.240 Only in cases in which all the available texts for all the candidate
authors are on the same thematic area may carefully selected content-based information reveal some authorial
choices.241
Based on these findings, I will predominantly use stylometric measures that depend on the ratio of
occurrences of non-contextual function words. I won't perform analyses that exclusively search for content words.

232 Rybicki 2014, 1.


233 D.R. Tallentire, An appraisal of methods and models in computational stylistics, with particular
reference to author attribution. Cambridge: Doctoral Thesis - University of Cambridge, 1972.
234 Holmes 1994.
235 Shlomo Argamon and Shlomo Levitan, Measuring the usefulness of function words for authorship attribution. in
Proceedings of the Joint Conference of the Association for Computers and the Humanities and the Association for Literary
and Linguistic Computing (2005); J.F. Burrows, Word Patterns and Story Shapes: The Statistical Analysis of Narrative Style.
Literary and Linguistic Computing 2 (1987).

236 Stamatatos 2009.


237 Eder 2014.
238 Rybicki 2014.
239 Koppel, Schler and Argamon 2009.
240 Stamatatos 2009.
241 Zheng et al. 2006.

In all probability, content words will be captured during the analyses when using the most frequent words on very
short texts, and that's why I won't exclude them as a feature. Still, content words will never be the
object of analysis in this thesis. I prefer function words (FWs) because I need to recognize texts by the same author
on different topics. In addition to this, it's very unlikely that the use of FWs can be consciously controlled, which
rules out the possibility that an author deceives me by imitating someone else.242 Thus, they're able to capture pure
stylistic choices of the authors.243
Many studies have shown the efficacy of FWs for authorship attribution in different scenarios,244 all
confirming the hypothesis that different authors tend to have different characteristic patterns of FW use.245
Results of different studies using somewhat different lists of FWs have been similar, indicating that the precise
choice of FWs is not crucial. Discriminators built from FW frequencies often perform at levels competitive with
those constructed from more complex features.246
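A text can be turned into a vector of relative function word frequencies along the following lines. This is only a sketch: the short English word list `FUNCTION_WORDS` is a hypothetical stand-in (a real analysis would use the most frequent words of the corpus itself), and tokenization is reduced to lowercasing and splitting on whitespace.

```python
from collections import Counter

# Hypothetical toy list; not the word list used in this thesis
FUNCTION_WORDS = ["the", "of", "and", "a", "in", "to"]


def function_word_profile(text: str, function_words=FUNCTION_WORDS) -> dict:
    """Relative frequency of each function word (count divided by all tokens)."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    return {word: counts[word] / total for word in function_words}


profile = function_word_profile("The author of the song and the author of the letter")
print(profile)
```

Because the frequencies are relative to the text length, profiles of texts of different sizes become comparable, although for very short texts each single occurrence weighs heavily.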
Syntactic features (Distribution of parts of speech, POS)
Another feature option is measuring the distribution of parts of speech (POS). A tagger assigns a tag of
morpho-syntactic information to each word-token based on contextual information.247 The different percentages of
nouns, verbs, adjectives, adverbs and other parts of speech in a text are, if they can be determined accurately, a
possible map of

242 Chung, C. K., & Pennebaker, J. W. (2007). The psychological functions of function words. In K. Fiedler (Ed.), Social
communication (pp. 343-359). New York: Psychology Press.
243 Stamatatos 2009.
244 A brief summary of studies that have shown the efficacy of FWs for authorship attribution in different scenarios;
Argamon and Levitan 2005;
Shlomo Argamon-Engelson, Moshe Koppel and Galit Avneri, Style-based Text Categorization: What Newspaper Am I
Reading? in Proceedings of AAAI Workshop on Learning for Text Categorization (1998);
Harald Baayen, Hans van Halteren, Anneke Neijt and Fiona Tweedie, An experiment in authorship attribution. in
Proceedings of JADT 2002: Sixth International Conference on Textual Data Statistical Analysis (2002);
José Nilo G. Binongo, Who Wrote the 15th Book of Oz? An Application of Multivariate Analysis to Authorship Attribution.
Chance 16, no.2 (2003);
Burrows 1987; de Vel et al. 2001;
Holmes, Gordon and Wilson 2001;
David I. Holmes, Michael Robertson and Roxanna Paez, Stephen Crane and the New-York Tribune: A Case Study in
Traditional and Non-Traditional Authorship Attribution. Computers and the Humanities 35, no. 3 (2001);
Juola and Baayen 2005;
Jussi Karlgren and Douglass Cutting, Recognizing Text Genres with Simple Metrics Using Discriminant Analysis. in
COLING '94 Proceedings of the 15th conference on Computational linguistics 2 (1994);
Brett Kessler, Geoffrey Nunberg and Hinrich Schütze, Automatic Detection of Text Genre. in Proceedings of the 35th
Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the
Association for Computational Linguistics (Association for Computational Linguistics, 1997);
Koppel, Akiva and Dagan 2006;
Koppel, Schler and Zigdon 2005;
Thomas V. N. Merriam and Robert A. J. Matthews, Neural computation in stylometry II: An Application to the Works of
Shakespeare and Marlowe. Literary and Linguistic Computing 9, no. 1 (1994);
Morton 1978;
Ying Zhao and Justin Zobel, Effective Authorship Attribution Using Function Word. Lecture Notes in Computer Science
3689 (2005).

245 Koppel, Schler and Argamon 2009.


246 Koppel, Schler and Argamon 2009.
247 Stamatatos 2009.

authorial style on the syntactic and grammatical level,248 although they only provide a hint of the structural
analysis of the sentences, since it is not clear how the words are combined to form phrases, or how the phrases
are combined into higher-level structures.249 These features are often genre specific and show a lack of
homogeneity among authors.250 They also depend on NLP technology, which still has trouble handling
unrestricted text sufficiently well. Even in formats that include meta-text, morpho-syntactic information will not be
available as standard, and neither is NLP a standard application in the software I plan on using. These objections,
in combination with the scarcity of research that indicates successful authorship attribution using only POS,
lead me to reject this feature.

Univariate vs Multivariate approach


The distinction between univariate and multivariate feature approaches for authorship attribution is that between
analyzing only one variable and analyzing more than one variable. Univariate analysis is considered a dated approach
to features, which wouldn't be helpful for solving the Wilhelmus case, but I will discuss it briefly anyway, in order
to illustrate the fundamentals of its successor, the multivariate approach, which I'll use extensively in this thesis.
Univariate approaches go back to the late 19th century work of Mendenhall.251 The key idea was that
each author could be characterized by a unique curve expressing the relationship between word length and relative
frequency of occurrence; this curve would then be applied to anonymous texts in authorship attribution (AA) cases. In
the 20th century, the idea of one single determinant feature was, thanks to Zipf's work, put on a firmer statistical
basis with the search for invariant properties of textual statistics. The feature was supposed to be invariant for any
given author, though possibly varying among different authors.252 These methods are most of the time unreliable, or at
least unstable.253
With the development of more sophisticated multivariate analysis techniques, larger sets of features
could be considered. Mosteller and Wallace's work254 on the authorship of the Federalist Papers was based on a
new set of methods for stylometric authorship attribution, combining information from multiple textual clues, which
we call a multivariate approach. A then novel method of Bayesian classification, which treats features as
independent of each other and uses the frequencies of a set of a few dozen function words (FWs) as probabilistic
classifiers, essentially what is now called naive Bayes classification, was applied to the case of the Federalist
Papers. The fundamental insight was that a rigorous Bayesian methodology, applied to the frequencies of a set
of topic-independent words, could yield a measurably reliable method for attributing authorship.255 Finding the

248 Holmes 1994.


249 Stamatatos 2009.
250 Holmes 1994.
251 Mendenhall 1887.
252 Koppel, Schler and Argamon 2009.
253 Burrows 1992; Grieve 2007;
H.S. Sichel, Word frequency distributions and type-token characteristics. Mathematical Sciences 11 (1986): 45-72.

254 Mosteller and Wallace 1964.


255 Koppel, Schler and Argamon 2009.

most probable attribution with this type of method can be viewed as taking documents as points in some
space and assigning a questioned document to the author whose documents are closest to it, according to an
appropriate distance measure.
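The idea of documents as points in a space can be sketched as a nearest-neighbour attribution. In the example below each document is a hypothetical vector of three relative word frequencies, and the Manhattan distance is used purely for illustration; it is an assumption of this sketch, not the distance measure adopted in this thesis. The author labels and numbers are invented.

```python
def manhattan(u: list, v: list) -> float:
    """Sum of absolute coordinate differences between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def nearest_author(questioned: list, known_docs: list, distance=manhattan) -> str:
    """Return the author of the known document closest to the questioned one.

    known_docs is a list of (author, feature_vector) pairs.
    """
    author, _ = min(known_docs, key=lambda doc: distance(questioned, doc[1]))
    return author


# Invented feature vectors (relative frequencies of three words)
known = [
    ("Author A", [0.10, 0.05, 0.02]),
    ("Author A", [0.11, 0.04, 0.03]),
    ("Author B", [0.06, 0.09, 0.01]),
]
questioned = [0.07, 0.08, 0.01]

guess = nearest_author(questioned, known)
print(guess)
```

With more variables the procedure stays the same; only the number of coordinates per document grows, which is what makes the approach multivariate.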

Conclusions on features
In theory, any feature set can be used with nearly any classification method, provided proper methodology is
followed in the study design. In practice, however, certain combinations have been applied and studied more
often, and have given better results than other combinations.256 It seems that low-level features, such as character
n-grams and function words, are very successful for representing texts stylistically,257 and therefore I will
predominantly focus on low-level features in this thesis. These features are claimed to capture only stylistic
information; however, they might capture some content information as well.258 In a paper259 in which successful
stylometry was performed on a corpus and under conditions resembling mine, Moshe Koppel concluded that large sets of
very simple features, common words and character n-grams, are more accurate than small sets of sophisticated
features for this purpose. These are also the features that are traditionally ignored in human expert authorship
attribution, and definitely overlooked in the Wilhelmus research, which until now focused on a whole other
spectrum of clues, even in the rare cases that it searched for them within the edges of the page. Based on
the theory summed up in this section, I've selected the following features, which I expect to be capable of
identifying the stylistic characteristics of a very short text, while their usage is within the reach of my capacities
and the limitations of this thesis, like time, space and means.

To sum up, based on independent feature selection the features I'll be using are:

Features on character level and word level, that will also indirectly measure syntactic and semantic dimensions of my texts.

Frequencies of n-grams on character-level and word-level. The word-level n-grams, or collocations, are the sole features measuring indirectly the syntactic dimension of my texts.

Frequencies of most frequent words, so predominantly non-contextual words and to a lesser extent content words, but the latter never isolated as a specific genre of words.

My methods will depend on a multivariate approach to features.

256 Koppel, Schler and Argamon 2009.


257 Grieve 2007; Keselj et al. 2003;
Peng et al. 2003;
Stamatatos 2006b.
258 Ross Clement and David Sharp, Ngram and Bayesian classification of documents for topic and authorship. Literary
and Linguistic Computing 18, no.4 (2003); George K. Mikros and Eleni K. Argiri, Investigating topic influence in authorship
attribution. in Proceedings of the International Workshop on Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection (2007).

259 Koppel, Schler and Argamon 2013.



Distance measures
This section discusses the type of distance measure I'm going to use to calculate the similarity of my texts. The
term distance measure, which has been mentioned and briefly explained before, refers to a mathematical
procedure in which the differences or similarities of texts are measured by the differences between their sets of
variables, and expressed spatially.
There are several different distance measures, but I will use Burrows's Delta, because of the underlying theory,
my experience with the method, previous successful research results and its being the standard distance
measure of the field of stylistic research and authorship attribution. The statistical formula of John Burrows
measures style by the relative frequencies of the most frequent words. It has been extended and used for a
variety of attribution problems260 and is equivalent to an approximate probabilistic ranking based on a
multidimensional Laplacian distribution over word frequencies.261
Its procedure is as follows. After a text or a corpus of texts, here referred to as the target text, has been reduced
to a bag of words, these are sorted and counted and written out as a frequency hierarchy. The Delta converts
this into a measure of distance, based on the absolute differences between the z-scores of the relative frequencies
of the most frequent words in the target text and in the test corpus. This distance measure represents the stylistic
characteristics of a text, always defined in relation to the corpus it's compared to. To summarize, the Delta is the
average of the absolute differences between the z-scores of a set of variables (words) in a corpus and the z-scores
for the same words, or set of variables, of a text (the target text).262 This gives us the following formula:

$$\Delta(T, T_1) = \frac{1}{n} \sum_{x=1}^{n} \left| z(f_x(T)) - z(f_x(T_1)) \right|$$

$$z(f_x(T)) = \frac{f_x(T) - \mu_x}{\sigma_x}$$

Formula 1: The formula of Burrows's Delta.

fx(T) = The frequency of the word x in text T
μx = The average frequency of the word x in the collection of texts
σx = The standard deviation of the frequency of word x in the collection of texts
n = The number of most frequent words taken into account
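The formula can be worked through on a small scale with the following sketch. It assumes that every text has already been reduced to a vector of relative frequencies for the same n words in the same order, that the mean and standard deviation per word are taken over a reference corpus, and that the sample standard deviation is used; these choices, the function name `burrows_delta` and all numbers are assumptions for illustration, not the implementation used in this thesis.

```python
from statistics import mean, stdev


def burrows_delta(target: list, candidate: list, corpus: list) -> float:
    """Mean absolute difference between the z-scores of two frequency vectors.

    corpus is a list of frequency vectors used to estimate the mean and the
    standard deviation of each word; every word must vary within the corpus,
    otherwise its standard deviation is zero and the z-score is undefined.
    """
    n = len(target)
    total = 0.0
    for x in range(n):
        column = [text[x] for text in corpus]
        mu, sigma = mean(column), stdev(column)
        z_target = (target[x] - mu) / sigma
        z_candidate = (candidate[x] - mu) / sigma
        total += abs(z_target - z_candidate)
    return total / n


# Invented relative frequencies of three words in three known texts
texts = {
    "A": [0.060, 0.030, 0.020],
    "B": [0.040, 0.050, 0.010],
    "C": [0.050, 0.040, 0.030],
}
corpus = list(texts.values())
questioned = [0.058, 0.032, 0.021]

deltas = {name: burrows_delta(questioned, vec, corpus) for name, vec in texts.items()}
closest = min(deltas, key=deltas.get)
print(deltas, closest)
```

The lowest Delta marks the stylistically nearest text. Because every word is divided by its own standard deviation, rare and frequent words contribute on the same scale.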

260 J.F. Burrows, Delta: A Measure of Stylistic Difference and a Guide to likely Authorship. Literary and Linguistic
Computing 17, no. 3 (2002); David L. Hoover, Delta Prime? Literary and Linguistic Computing 19, no. 4 (2004a);
David L. Hoover, Testing Burrows's Delta. Literary and Linguistic Computing 19, no. 4 (2004b).
261 Shlomo Argamon, Interpreting Burrows's Delta: Geometric and Probabilistic Foundations. Literary and Linguistic
Computing 23, no. 2 (2008); Sterling Stein and Shlomo Argamon. A Mathematical Explanation of Burrows's Delta. in
Proceedings of the Digital Humanities Conference (London: ALLC, 2006).

262 Rybicki 2014, 3.



Burrows's Delta is a very effective attribution method for texts of at least 1500 words. For shorter texts, the
accuracy drops with length. However, even for quite short texts, the correct author was usually included
in the first five positions of the ranked authors, which provides a means of reducing the set of candidate
authors.263 When larger sets of frequent words (>500) were used, the accuracy of the method increased.264
Performance also improved when personal pronouns, and words for which a single text supplied most of
the Delta score, were deleted. Alternatives and all sorts of adjustments were also examined, but no
significant improvement over the original method was achieved.265
Knowing the formula or definition of a distance measure tells a non-mathematician like myself surprisingly little
about the practical use of the statistical method or the effects of choosing one over the other. A comparison
of Burrows's Delta with other distance measures, in understandable language, may shed some light on the
implications of that choice. In his "Beyond the black box, or: understanding the difference between various
statistical distance measures",266 Christof Schöch uses a political analogy for distance measures that he picked
up during a lecture by Maciej Eder. Besides being very entertaining, it also makes the seemingly complicated
mathematical matter that determines the difference in use and effect of the distance measures quite
understandable. Below I quote the two alternative distance measures Eder used in his analogy, as well as the one
on Burrows's Delta and a comparison.
The first distance measure he explains is the Euclidean distance. The Euclidean metric, or Pythagorean
metric, is very similar to our everyday idea of distance, namely the "ordinary" straight line
between two points in space. Its definition goes as follows: "The straight line distance between two points. In a
plane with p1 at (x1, y1) and p2 at (x2, y2), it is √((x1 - x2)² + (y1 - y2)²)",267 which Eder explains as follows.
"The Euclidean distance is tyrannical, because it gives a voice only to the very top of the most frequent words list.
Following the principles of Euclidean geometry, it is based on the square-root of the sum of the squared
differences between all vector points; because no weighing is applied, the usually larger absolute differences
between the top most frequent few words (see Zipf's law) have a massive influence on the results; for the lower
words on the scale, the distances are smaller, and will not weigh in very much. For this reason, the Euclidean
distance is not recommended in most cases."268
Another formula Eder explains is the Manhattan distance measure, which is a form of geometry where the usual
distance function, or metric, of Euclidean geometry is replaced by a new metric in which the distance between

263 Stamatatos 2009.


264 Hoover 2004a.
265 Hoover 2004b.
266 Christof Schöch, "Beyond the black box, or: understanding the difference between various statistical
distance measures." The Dragonfly's Gaze. Computational analysis of literary texts. Research blog, August 3,
2012: https://dragonfly.hypotheses.org/101
267 http://xlinux.nist.gov/dads//HTML/euclidndstnc.html National Institute of Standards and Technology
268 Schöch 2012.

two points is the sum of the absolute differences of their Cartesian269 coordinates. The Manhattan distance, also
called the taxicab metric or city block distance, alludes to the grid layout of most streets in Manhattan and
follows, of course, its principles of distance.270 Its definition goes as follows: "The distance between two points
measured along axes at right angles. In a plane with p1 at (x1, y1) and p2 at (x2, y2), it is |x1 - x2| + |y1 - y2|."271
Burrows's Delta can in fact be understood as a standardization (i.e. z-transformation) of
frequency counts combined with the Manhattan distance.272
In the first quote below Eder explains the Manhattan distance; in the second he compares the
other distance measures with Burrows's Delta:
"The Manhattan distance corresponds to an oligarchy, because it is slightly less biased towards the very most
frequent words. In contrast to the Euclidean distance, Manhattan distance relies on the sum of the differences of
all coordinates of the vectors, something which reduces just a little bit the influence of the very top most frequent
words, when compared to the Euclidean distance. But it still gives decisive importance to a very small group of
words, in the range of maybe the 10 to 20 most frequent words."273
"Classic Delta by Burrows is entirely democratic: instead of comparing absolute frequencies of features, it relies
on absolute z-scores, and calculates a mean difference score from all individual differences between the z-scores.
This means that it effectively applies scaling or weighing so that the less frequent words count more in the overall
distance score, and the most frequent ones count less. Each word gets an equal say in the distance measure,
like each person has a vote in a democracy."274
Christof Schöch points several times to the merits of Delta: "Fortunately, stylometric democracy yields much
better results; in fact, Classic Delta was a huge step forward for stylometry."275 He also explains that the choice of
the right distance measure depends on the language, and advises Burrows's Delta over Eder's Delta when dealing
with Dutch texts.276 This is very reassuring for my choice of Delta, which was already heavily supported by
previous research.
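
The political analogy can be made concrete with a small numerical sketch. The Python code below uses invented relative frequencies (an assumption for illustration only, not data from my corpus) and measures which share of each distance between two texts is contributed by the single most frequent word. All function and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

# Invented relative frequencies (in %) of four words, ordered from the most
# to the least frequent, in three hypothetical texts.
corpus = {
    "text_a": [6.0, 3.0, 0.5, 0.1],
    "text_b": [4.0, 3.5, 0.3, 0.2],
    "text_c": [5.0, 2.5, 0.4, 0.3],
}


def z_scores(corpus):
    """Standardize every word frequency against the whole corpus."""
    columns = list(zip(*corpus.values()))
    mu = [mean(col) for col in columns]
    sigma = [pstdev(col) for col in columns]
    return {
        name: [(f - m) / s for f, m, s in zip(vec, mu, sigma)]
        for name, vec in corpus.items()
    }


def euclidean(u, v):
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def delta(u_z, v_z):
    """Burrows's Delta: Manhattan distance on z-scores, averaged over the words."""
    return manhattan(u_z, v_z) / len(u_z)


def top_word_share(u, v, power=1):
    """Fraction of the summed (powered) differences that stems from the first word."""
    terms = [abs(a - b) ** power for a, b in zip(u, v)]
    return terms[0] / sum(terms)


a, b = corpus["text_a"], corpus["text_b"]
z = z_scores(corpus)
shares = {
    "euclidean": top_word_share(a, b, power=2),  # squared differences
    "manhattan": top_word_share(a, b),           # absolute differences
    "delta": top_word_share(z["text_a"], z["text_b"]),
}
```

With these invented numbers, the most frequent word dominates the Euclidean distance, weighs somewhat less in the Manhattan distance, and has a clearly smaller share in Delta, which is the pattern the tyranny, oligarchy and democracy labels describe.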

Computational means
269 Refers to Descartes. A Cartesian system specifies each point individually in a graph, or spatial plane, by its numerical
coordinates.

270 https://en.wikipedia.org/wiki/Taxicab_geometry#cite_note-1 * While aware of the unscientific nature of
this source, I choose to use it anyway, because of the spot-on explanation and clear phrasing. I have the
theoretical knowledge to vouch for the accuracy of this description; if I were to rephrase the explanation of the
Manhattan distance measure myself, and thereby pretend to avoid using this source, I'd probably end
up with an inferior paraphrase.
271 http://xlinux.nist.gov/dads//HTML/euclidndstnc.html National institute of standards of technology
272 Evert et al. 2015.
273 Schöch 2012.
274 Schöch 2012.
275 Schöch 2012.
276 Schöch 2012.

As discussed in the methodology so far, the computational means required for low-level features are minimal. Other
choices were also influenced by a preference for low demands on software and hardware. The programs I'll use are
the statistical environment R, version 3.1.3, with the package Stylo, version 0.5.9, and the network visualization
program Gephi.
R is an open-source environment for statistical programming and application building, and it allows less
advanced researchers to use ready-made scripts and libraries. The tool, or rather set of tools, combines
sophisticated state-of-the-art algorithms for classification and clustering with a user-friendly interface. The R
scripts are provided with a graphical user interface and are more or less documented.277
The Stylo278 package is obtained from the CRAN repository and is
distributed under the GNU GPL 3 license. Stylo provides a comprehensive collection of functions frequently used
in stylometric analysis. The software is implemented entirely in R, a popular language for statistical
computing and graphics.279
The combination of R and Stylo provides multidimensional methods, such as multidimensional scaling, principal
components analysis, cluster analysis and bootstrap consensus trees, that can be used by scholars without
programming skills. The script reads plain text files, XML or HTML; it explicitly supports nine languages, and
implicitly many more. Publication-quality plots can be exported in PDF, JPEG, PNG or EMF formats.280
Burrows's Delta is a standard option. This software by Eder et al. is open source and made readily
available by the Computational Stylistics Group on their website.281
Gephi performs network visualization: it visualizes the results of quantitative analyses of texts as a network of
dots and connections in a 3-dimensional space, where the distances and connections between the dots define the
stylistic similarity between the texts. Gephi can visualize the results of R.

Texts
The preparation of texts, meaning the building and cleaning of the corpus, and their presentation, meaning the
shape in which they are taken into the analysis, are absolutely vital to the outcome of an experiment. I'll discuss
the more topical considerations, most importantly which texts and authors to include, in the section "corpus". The
more theoretical considerations of how to incorporate texts in the analyses, and what the requirements for
successful analyses are, will be discussed in this section. While performing the analyses, I will face both topical
and technical alterations of my texts. No matter how sound my preparation is, it will all be just theoretical, and

277 Maciej Eder, Mike Kestemont and Jan Rybicki. Stylometry with R: a Suite of Tools. Digital Humanities
2013: Conference Abstracts. (Lincoln, NE: University of NebraskaLincoln, 2013).
278 Eder, Kestemont and Rybicki 2013.
279 R. Core Team, R: A Language and Environment for Statistical Computing. (Vienna, Austria: R Foundation for Statistical
Computing, 2014) http://www.R-project.org/

280 Eder, Kestemont and Rybicki 2013.


281 https://sites.google.com/site/computationalstylistics/

the practice of actually performing the analyses will point out some necessary alterations to the texts,
underlining the difference between scientific theory and experimental science. The settings of the parameters,
smaller text properties and smaller alterations to my initial text preparation and text presentation will be
discussed during the reporting of the analyses and the interpretation of the results. The complete lists of settings
and parameters are included in the appendix. Here I will focus on the basic technical characteristics of the texts
in my corpus, which will determine the exact nature of my research.

Instance based vs profile based representation


There are different ways to compare a test corpus with the text that has to be attributed. The choice of text
representation is not trivial, and it directly affects the performance of the attribution model.282
One way is to concatenate all texts per author into one single text file for each author, with which the
unseen text is then compared. These are called profile-based approaches, and they have been practiced since as
early as the work of Mosteller & Wallace. Because there's no representation of each individual text, the
differences between training texts by the same author are disregarded, and the stylometric measures extracted
from the concatenated file may be quite different from those of each of the original training texts.283
The other option for text representation, followed by the majority of modern authorship identification
approaches, considers each text, or sample, as a unit that contributes separately to the attribution model. These
are called instance-based approaches. Each text sample of known authorship is an instance of the problem in
question, and is represented by a vector of attributes (x).284 Vector space models, where each text is considered
as a vector in a multivariate space, comprise the majority of the instance-based approaches.285
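
The difference between the two representations can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the author names, the toy texts and the three-word feature list are hypothetical, and the feature values are plain relative word frequencies.

```python
from collections import Counter


def frequency_vector(text, features):
    """Relative frequency of each feature word in a text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in features]


def profile_based(training, features):
    """One vector per author, built from the concatenation of all their texts."""
    return {
        author: frequency_vector(" ".join(texts), features)
        for author, texts in training.items()
    }


def instance_based(training, features):
    """One labelled vector per individual text."""
    return [
        (author, frequency_vector(text, features))
        for author, texts in training.items()
        for text in texts
    ]


# Hypothetical training corpus: author name -> list of texts.
training = {
    "author_a": ["de de het", "het het de een"],
    "author_b": ["een een de"],
}
features = ["de", "het", "een"]

profiles = profile_based(training, features)    # 2 vectors, one per author
instances = instance_based(training, features)  # 3 vectors, one per text
```

In the profile of author_a the differences between the two individual texts are averaged away, whereas the instance-based list keeps them as separate points that can later cluster or fail to cluster.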
Comparing both approaches, the profile-based representation initially seems to be the most
advantageous for my research. Instance-based approaches can be very useful when provided with an extensive
training corpus and machine learning algorithms able to handle high-dimensional, noisy and sparse data.
However, these are resources I don't have. When only short texts are available, their concatenation may
produce a more reliable representation than individual representations of short texts, because
accuracy diminishes with text length. Profile-based approaches rely on similarity-based methods and they
can usually handle homogeneous feature sets, like function words and character n-grams.286
However, the Wilhelmus, or at least the version that is known to us, is in all probability not an example of
a text representing the pure style of only one factor, namely the author. As I've explained in the Wilhelmus theory,
the authorial signal of the national anthem is in all probability, because of the complicated time of its birth, the
probable motives for conceiving it and its turbid propagation, not a completely accurate stylistic representation of
its author. This makes it, in combination with the limited size but varied nature of the test corpus, resulting in the

282 Stamatatos 2009.


283 Stamatatos 2009.
284 Stamatatos 2009.
285 Stamatatos 2009.
286 Stamatatos 2009.


clouding of the authorial signals of my probable authors as well, unlikely to produce a 100% match between the
Wilhelmus and the profile-based representation of its author, if present. Because of the temporal distance to the
texts of my corpus, which adds to the uncertainty regarding the purity of their stylistic signals, I should consider
them to be, to some extent, concatenated files already. My choice of methods, and in this case especially those of
visual representation, which is for the most part the object of interpretation, favors the instance-based measures.
This way the diverse nature of all texts, among them texts by the same author but from different genres, languages
and publishers, will be captured, statistically and visually, and the Wilhelmus could group with the other texts of
its until now hidden author, forming a cluster in the graph.
I'll present my texts to my software in a profile-based representation. However, my texts will need some
interventions, like chunking and merging, to meet some minimal requirements for texts and balance requirements
for the corpus, which I'll explain in the next section. This will also result in analyses which involve
multiple texts per author as well as analyses that incorporate only one big work per author, like a book of songs or
a book of psalms. Although these analyses may not actually involve instance-based approaches, they do have the
same consequences for the analyses. The distinction between instance-based and profile-based is not set in stone
to begin with; there are plenty of hybrid approaches.

Problems of representation
Problems surrounding text length might just be the greatest challenge of my research. In a text that's too short,
stylistic fingerprints haven't manifested themselves enough to produce a signal that can be measured by current
methods, while great variance in text size causes all kinds of inaccurate test results, due to poor balance in the
corpus. When performing analyses on a very short text, these two requirements for text length call for opposite
preparation. The texts in the test corpus should be as long as possible, to assure a proper stylistic signal, while
they should also be as close as possible to the length of the target text, which is short, in order to maintain a
balanced corpus.

Text length
Word frequencies are not random variables, and may vary considerably across different works. An
occurrence of a word depends heavily on its context. Thus, similar to other probabilistic phenomena, word
frequencies strongly depend on the size of the population (i.e. the size of the text used in the study). 287
When there are multiple texts of variable length per author, the length of the text instances should be
normalized by segmenting them into equal-sized samples.288 When there is only one large text for a particular
candidate author, Stamatatos suggests segmenting it into multiple parts of equal length. However, in all these cases, the texts should

287 Eder 2010, 2.


288 Sanderson and Guenter 2006.

be long enough so that the text representation features can adequately represent their style.289 As said before,
this is one of the major challenges that looms over my Wilhelmus case, because the text that I want to analyze is
only 551 words long.
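
The segmentation into equal-sized samples, and what it means for a text of 551 words, can be sketched as follows. The function name and the decision to drop a trailing remainder are assumptions of this example, not a description of how Stylo samples texts.

```python
def equal_sized_samples(text, sample_size):
    """Split a text into consecutive samples of exactly `sample_size` words.

    A trailing remainder shorter than `sample_size` is dropped, so that all
    samples stay comparable in length.
    """
    words = text.split()
    full_samples = len(words) // sample_size
    return [
        words[i * sample_size:(i + 1) * sample_size]
        for i in range(full_samples)
    ]


# A placeholder text of 551 tokens, the length of the Wilhelmus.
placeholder = " ".join(["woord"] * 551)
samples_500 = equal_sized_samples(placeholder, 500)
samples_250 = equal_sized_samples(placeholder, 250)
```

A sample size of 500 words leaves a single sample of the target text, and any sample size above 551 words leaves none, which is why the sample size for the whole corpus is bounded by the length of the Wilhelmus.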
Research to determine the limits of text size has led to no definite answers; a critical point might be
language and genre dependent, or even differ for every particular text.290 There does seem to be a
shared opinion that the minimal sample size is somewhere around 1000 words,291 although many successful
attribution studies292 do not act upon the assumed limit of 1000 words per sample.293 There's also the strong
suggestion that the required text size depends on genre rather than on language.294 Hirst and Feiguina295 varied
the length of text blocks between 200, 500 and 1000 words and report significantly reduced accuracy as the text
block length decreases. The Koppel paper296 that I mentioned before used text chunks of 500 words and reported
positive results, while Sanderson and Guenter297 even used chunks of 500 characters to perform stylistic analysis
on, again with moderate success.
My distance measure, Burrows's Delta, is, as I have said before, known to be a very effective attribution
method for texts of at least 1500 words, while for shorter texts the accuracy drops with length.298
However, even when Hoover299 tested the method on rather short texts, the correct author was usually included in
the first five positions of the ranked authors, which provides a means of reducing the set of candidate authors.300
Hoover301 found that the accuracy of the method increased when larger sets of frequent words (>500) were used.302
These are all somewhat encouraging results. There are, however, also researchers who are more
skeptical about the possibilities of authorship attribution on very short texts. A paper by Maciej Eder, which reviews
past research on text size and follows up with an experiment aiming to determine an absolute minimum, draws
the unsettling conclusion that the previous, rather small estimates are often not backed by thorough empirical
investigation, and that the minimal required text size might be closer to 5000 words than to 1000.303

289 Stamatatos 2009.


290 Eder 2010, 2.
291 David I. Holmes, Lesley J. Gordon and Christine Wilson, A Widow and her Soldier: Stylometry and the American Civil
War. Literary and Linguistic Computing 16, no.4 (2001): 406.
292 Patrick Juola and R. Harald Baayen, A Controlled-corpus Experiment in Authorship
Identification by Cross-entropy. Literary and Linguistic Computing 20 (2005);Burrows 2002;
Matthew L. Jockers, Daniela M. Witten and Craig S. Criddle, Reassessing authorship of the Book of Mormon using delta
and nearest shrunken centroid classification. Literary and Linguistic Computing 23, no.4 (2008).

293 Eder 2010, 2.


294 Eder 2010, 14.
295 Graeme Hirst and Olga Feiguina, Bigrams of Syntactic Labels for Authorship Discrimination of Short
Texts. Literary and Linguistic Computing 22, no. 4 (2007).
296 Koppel, Moshe, Jonathan Schler, Shlomo Argamon and Eran Messeri 2006.
297 Sanderson and Guenter 2006.
298 Stamatatos 2009.
299 Hoover 2004a.
300 Stamatatos 2009, 549.
301 Hoover 2004a.
302 Hoover 2004b.
303 Eder 2010, 2.

Imbalance problem
An important problem in authorship attribution tasks, called the imbalance problem, arises when the distribution of
the training corpus over the candidate authors is uneven: there are multiple texts for some candidate authors and
very few texts for others. The length of these samples may not allow their segmentation into multiple parts
to enrich the representation of certain authors.304 In instance-based approaches, class imbalance depends on the
number of training texts per author. In profile-based approaches, on the other hand, the class imbalance problem
depends only on text length.305 Only a few studies306 have taken this factor into account so far, but I will, as far
as my corpora allow, account for this imbalance. The use of n-grams, which multiplies the number of features,
and the use of a distance measure that works only with relative frequencies compensate for some of the
imbalance problems.
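
Both remedies can be illustrated with a short sketch: character n-grams yield many more feature occurrences than word tokens from the same text, and relative frequencies sum to one regardless of text length. The function names are hypothetical and the example is deliberately minimal.

```python
from collections import Counter


def char_ngrams(text, n):
    """All overlapping character n-grams of a text, in order of appearance."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def relative_profile(text, n):
    """Relative frequency of every character n-gram in a text."""
    grams = char_ngrams(text.lower(), n)
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}


# One word token, but seven overlapping character trigrams.
trigrams = char_ngrams("wilhelmus", 3)
profile = relative_profile("Wilhelmus van Nassouwe", 3)
```

Because every profile is expressed in proportions, a long and a short text can be compared on the same scale, although the proportions of the short text remain the noisier estimates.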

Real world AA problems/tasks & their methods/solutions


In the typical authorship attribution (AA) problem, a text of unknown authorship is assigned to one candidate
author, given a set of candidate authors for whom text samples of undisputed authorship are available.307
However, since many realistic authorship problems do not fit the standard attribution paradigm, we should
consider scenarios that are likely to arise in practice. For example, a situation in which we suspect that a given
author may have written some text, but do not have an exhaustive list of alternative candidates, is a common one
and descriptive of the Wilhelmus case.
Koppel308 suggests three problems that researchers may encounter in real-life AA:
1. There's no candidate set at all. The challenge is to provide as much demographic or psychological information
about the author, like gender, age and ethnicity, as possible. This is the profiling problem.
2. There are thousands of candidates, for each of whom we might have only a very limited sample. This is the needle
in a haystack problem.
3. There is no closed candidate set but there is one suspect; in this case, the challenge is to determine whether or
not the suspect is the author. This is the verification problem. If the true author of an anonymous text might not be
one of the known candidates, one can reduce this case to a binary authorship verification problem: determine whether
the given document was written by a specific author or not. This is usually considered in plagiarism analysis.309
The profiling problem does not apply to my research and I will not discuss it any further. I will explain the remaining
two problems in further detail and apply their possible solutions to my own corpus, as they tend to overlap. I will

304 Stamatatos 2009.


305 Stamatatos 2009.
306 Marton et al. 2005; Stamatatos 2007, 237-241.

307 Stamatatos 2009.


308 Koppel, Schler and Argamon 2009.
309 Koppel et al. 2012.

discuss the verification task, problem number three, first, because it depends on a technique called unmasking
which is useful for the reader to understand before I go on to discussing the needle in the haystack problem.

Verification problem and the unmasking method solution


In a real-world verification problem, there's no closed candidate set but only one suspect; in this case, the
challenge is to determine whether the suspect is the actual author of the anonymous document or not.310 The problem
becomes more complicated when we consider the situation, which is not uncommon, in which an author uses a small
number of features in a consistently different way between works. This can result from thematic differences, from
differences in genre or purpose, from chronological stylistic drift or from deliberate attempts to alter or mask his
style.311 Thus we must distinguish between relatively shallow differences that reflect conscious or unconscious
changes in an author's style and deeper differences that reflect the styles of different authors.
This differentiation can be achieved with Koppel's technique of unmasking,312 thereby solving a large
part of the verification problem. The unmasking method determines not only whether a text by author A is
distinguishable from the anonymous text X, but also how deep the difference between A and X is,
by iteratively eliminating the most distinguishing features.313 To be precise, it removes, by stages, those
features that are most useful for distinguishing between A and X, and gauges the speed with which
cross-validation accuracy degrades as more features are removed. The main hypothesis is that if A and X are by the
same author, then whatever differences there are between them will be reflected in only a relatively small number
of features, despite possible differences in theme, genre and the like.314 Two texts are probably by different authors
if the differences between them are robust to changes in the underlying feature set used to represent the
documents.315
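
The principle of iterative feature elimination can be sketched in Python. This is a deliberately simplified stand-in, not Koppel's implementation: instead of SVM classifiers it uses a nearest-centroid classifier with leave-one-out cross-validation, and it ranks features by the gap between the two class centroids. The chunk vectors are invented, and all names are hypothetical.

```python
def centroid(vectors, keep):
    """Mean value of each kept feature over a list of chunk vectors."""
    return {j: sum(v[j] for v in vectors) / len(vectors) for j in keep}


def loo_accuracy(chunks_a, chunks_x, keep):
    """Leave-one-out accuracy of a nearest-centroid classifier on `keep`."""
    labelled = [(v, "A") for v in chunks_a] + [(v, "X") for v in chunks_x]
    correct = 0
    for i, (held_out, label) in enumerate(labelled):
        rest = labelled[:i] + labelled[i + 1:]
        centroids = {
            lab: centroid([v for v, l in rest if l == lab], keep)
            for lab in ("A", "X")
        }
        distance = {
            lab: sum(abs(held_out[j] - c[j]) for j in keep)
            for lab, c in centroids.items()
        }
        predicted = min(distance, key=distance.get)  # ties fall to "A"
        correct += predicted == label
    return correct / len(labelled)


def unmask(chunks_a, chunks_x, rounds, drop_per_round):
    """Accuracy curve while the most distinguishing features are removed."""
    keep = list(range(len(chunks_a[0])))
    curve = []
    for _ in range(rounds):
        curve.append(loo_accuracy(chunks_a, chunks_x, keep))
        ca, cx = centroid(chunks_a, keep), centroid(chunks_x, keep)
        ranked = sorted(keep, key=lambda j: abs(ca[j] - cx[j]), reverse=True)
        dropped = ranked[:drop_per_round]
        keep = [j for j in keep if j not in dropped]
        if not keep:
            break
    return curve


# Invented chunk vectors (three chunks per text, four features each).
# "Same author": A and X differ in one feature only, e.g. a topic word.
same_author = unmask([[4, 1, 1, 1]] * 3, [[1, 1, 1, 1]] * 3, 3, 1)
# "Different authors": A and X differ in every feature.
different_authors = unmask([[4, 4, 4, 4]] * 3, [[1, 1, 1, 1]] * 3, 3, 1)
```

In the first toy case the accuracy collapses to chance level as soon as the single distinguishing feature is removed, while in the second it stays perfect; the steepness of that degradation curve is the signal unmasking relies on.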
Results316 show that once a small number of distinguishing markers is removed, the attribution case
becomes much stronger. Koppel obtained a 95.7% overall accuracy, with errors almost identically distributed between
false positives and false negatives.317 Thus, by taking into account the depth of difference between two works, it
can be determined whether they were authored by the same person or by two different people. The Koppel method,
introduced in 2007318 and described in Koppel 2009319, provides a robust solution to the authorship verification

310 Koppel, Schler and Argamon 2009.


311 Koppel, Schler and Argamon 2009.
312 Koppel, Schler and Argamon 2009.
313 Moshe Koppel, Jonathan Schler, Shlomo Argamon and Eran Messeri. Authorship Attribution with
Thousands of Candidate Authors. in Proceedings of the 29th ACM SIGIR Conference on Research and
Development in Information Retrieval (New York: ACM, 2006).
314 Koppel, Schler and Argamon 2009.
315 Koppel et al. 2012.
316 Koppel, Schler and Argamon 2009.
317 Koppel, Schler and Argamon 2009.
318 Koppel, Moshe, Jonathan Schler, Shlomo Argamon and Eran Messeri 2006.
319 Koppel, Schler and Argamon 2009.

problem that is independent of language, period and genre, and it has already been used to settle at least one
outstanding literary attribution problem.320
A limitation of the method is that unmasking requires a large amount of training text. Preliminary tests
suggest that the minimum would be in the area of 5,000 to 10,000 words.321 Unmasking does not work for short
documents.322 323 This means that it's impossible to apply to my Wilhelmus case. Furthermore, the meta-learning
method of unmasking is out of my reach because the unknown text has to be long enough to be segmented into
multiple parts to train the SVM classifiers.324
The reason for the extensive discussion of this method in my thesis, besides the need for examples of real-world
AA problems and solutions, is the underlying principle of feature elimination. We should understand, going
into the analyses, the different ways in which features represent style, in order to be more flexible in our
interpretation. The most distinguishing features might be the ones that blur the effects you are trying to capture.
Changing the features will not only allow us to measure another lexical dimension, moving from
stylistics to semantics for example, but it can also sideline the previously most distinguishing
features, which can have an effect of its own. When performing analyses we should look for an attribution that
remains stable under different experimental circumstances, not least under different feature sets. This stability is
associated with the reliability of the attribution.

The needle in a haystack problem and solution


The needle in a haystack problem arises when an authorship attribution case deals with thousands of candidate
authors, for each of whom we might have only a very limited sample. Standard classification techniques are unlikely
to give reasonable accuracy and may require excessive computation time to learn classification models. Learning a
classifier to distinguish them is infeasible.325
The Koppel paper "AA in the wild"326 describes the problem of many candidate authors for a short document. The key
insight is that a similarity-based approach can be used to identify the most likely authors, but the robustness of
the similarity must be taken into account in order to filter out false positive identifications. This problem is very
similar to the problem I'm faced with, so a solution offered here must be considered as one of my own
methods.
Earlier naive approaches assembled a representative collection of works by other authors and used a two-class
learner, such as an SVM, to learn a model for a versus not-a. This method is straightforward, but suffers from a

320 Koppel and Schler 2004; Koppel, Moshe, Jonathan Schler, Shlomo Argamon and Eran Messeri 2006.
321 Koppel, Schler and Argamon 2009.
322 Sanderson and Guenter 2006.
323 Koppel et al. 2012.
324 Stamatatos 2009.
325 Koppel, Schler and Argamon 2009.
326 Koppel, Schler and Argamon 2013.

conceptual flaw. If most chunks are attributed to not-a, a is probably not the author, but the reverse is
not true: if most of the text samples are attributed to a, a is not necessarily the author. Any author who is not
represented in not-a but has a style somewhat similar to that of a will be falsely identified as a by this method.327
This problem can be solved if we are willing to accept "Don't know" as an answer for those cases where the document
to be attributed is not sufficiently distinct to permit attribution. Again, meta-learning is used to identify such
cases, and in the remaining cases, where the system believes attribution is reliable, this method is able to
provide highly accurate results.328
A good example is Koppel's paper "Computational methods in AA",329 in which blog posts of at
least 500 words were to be attributed to one of the 10,000 blogs they belonged to. In this case three
representations based on content features and another one based on style features were used, along with a
standard cosine measure, to quantify the similarity of each author's known work to a given snippet. The various
authors can be ranked according to the similarity between their known works and the snippet under
consideration, with the hope that the highest-ranked author is the author of the snippet. The idea is that some
distinctive feature might render the snippet particularly similar to just one of the candidate authors when
a distance measure is applied over meaningful textual features.
This simple approach to the problem actually works surprisingly well. The three content representations
assign the snippet to the actual author in between 52% and 56% of the cases. These results are impressive, but not
useful, as the system is wrong almost half of the time.330 As I mentioned, applying meta-learning boosts these
numbers: for snippets limited to 200 words, it yields a precision of 86% at a recall level of 30%, and a
precision of 73% at a recall of 40%.331
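
The similarity-based ranking, combined with the option to answer "Don't know", can be sketched as follows. The cosine measure is the standard one; the abstention rule, however, is only a hypothetical stand-in for Koppel's meta-learning step: it refuses to attribute when the best and second-best candidates are too close. The feature vectors, author names and threshold are invented.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_authors(snippet, known):
    """Candidate authors sorted from most to least similar to the snippet."""
    return sorted(
        ((cosine(snippet, vector), author) for author, vector in known.items()),
        reverse=True,
    )


def attribute(snippet, known, min_margin):
    """Name the top-ranked author only if the lead over the runner-up is clear."""
    ranking = rank_authors(snippet, known)
    (best, author), (second, _) = ranking[0], ranking[1]
    return author if best - second >= min_margin else "Don't know"


# Invented feature vectors for three hypothetical candidate authors.
known = {
    "author_a": [1.0, 0.0, 0.0],
    "author_b": [0.0, 1.0, 0.0],
    "author_c": [0.9, 0.1, 0.0],
}

clear_case = attribute([0.0, 1.0, 0.0], known, min_margin=0.05)
doubtful_case = attribute([1.0, 0.0, 0.0], known, min_margin=0.05)
```

Raising the margin trades recall for precision: fewer snippets receive an answer, but the answers that are given are more trustworthy, which mirrors the precision and recall figures reported above.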
A problem with the described method and its results is that they are still those of an experimental
setting. In a real-world case, we cannot assume that the author of a questioned text will in fact be contained in
the candidate set, even if that set is very large.332 My number of candidates is not very large and I have absolutely
no certainty that the real author is included.
Another problem is that as the number of alternative candidates becomes much smaller, the problem
might, somewhat counterintuitively, become more difficult. This is because the method implicitly leverages the
fact that, if a document is much more similar to one author's writing than to that of all others, it is very likely
that the document was written by that author. As the number of alternative authors decreases, the reliability of
such a conclusion will similarly decrease.333 My number of alternative candidates is much smaller than that of
Koppel, so the reliability of such a positive conclusion will decrease.

327 Koppel, Schler and Argamon 2009.


328 Koppel, Moshe, Jonathan Schler, Shlomo Argamon and Eran Messeri 2006.
329 Koppel, Schler and Argamon 2009.
330 Koppel, Schler and Argamon 2009.
331 Koppel, Schler and Argamon 2009.
332 Koppel, Schler and Argamon 2009.
333 Koppel, Schler and Argamon 2009.

I won't be able to perform the meta-learning, and the results, around 50% correct attributions under experimental
conditions, are not high enough to be comfortable without it. Furthermore, the conditions Koppel gets to work
with, even before the meta-learning, are far more favorable than mine. The needle in a haystack problem as
described and solved by Koppel is illustrative of the difficulty of solving such a case. It is, however, the task I've
set myself, and the difficulty of this case, along with its major relevance, is exactly why it is so necessary
to try to solve it. It will stretch the possibilities of the currently available methods and means, and show what
they can accomplish and where we need further development.

Tests
In this section I'll sum up the tests I'll perform, quickly recap their methods, where needed expand on their
methods and parameters, and place them in the order in which they are performed, to take the reader through my
line of thought. My methodology, including the tests, is based on the discussed theory, and limited by my own
capacities and the limitations of my design, techniques and corpus. My familiarity with, and preference for, some
of the software translates into a preference for certain methods.
I discern two types of tests that I will perform, namely the multivariate authorship attribution task and
principal component analysis (PCA). The latter will come into play later on in the analyses, when I explore the
nature of the results by reducing the dimensionality of the features, and when I mimic a successful authorship
attribution method by Burrows that uses PCA. The main attention will go to the multiclass categorization task,
used as an authorship attribution method. Within these two types, I can vary the features, parameters, corpora,
mode of analysis and distance measure, among many other things, depending on the results and my own
assessment. Because I'm able to change so many of the conditions, the range of stylistic characteristics of a text
that I can test with the categorization task is large, and the number of analyses performed for this thesis runs
quite high. The broad order in which I will perform my analyses is as follows. I start with analyses that can
handle a lot of candidate authors and a lot of different texts, and that can visualize broad effects. So I'll perform
distant reading and exploratory analyses on a corpus that is as complete as possible. The process of these
exploratory analyses, including the problems encountered, and their results are fundamental for my choices
regarding the analyses that follow, in terms of methodology and the composition of the corpora. These following
analyses aim at answering the research questions and hypotheses. After the bulk of the analyses has been
performed, the dimensions of distinction between texts are examined with the aid of principal component
analysis. The exact meaning of dimensionality reduction and its specific use for the Wilhelmus case will become
clear when we get there.

TEST 1 multiclass, single-label text categorization task


1.1 Exploring Gephi visualization
The first test I'll perform is an attribution method that relies on the standard information-retrieval technique, in
which we define some distance measure (Burrows' Delta) over meaningful textual features (MFWs) and possibly
attribute the target text (the Wilhelmus) to the closest cluster of texts, provided they're by a common author.334
This means dealing with a needle-in-a-haystack problem: attributing a short document in an open author set with
many possible authors. The goal is to perform an exploratory analysis on which I can base the composition of my
corpora and the settings for further analyses, as well as to get a first indication of the methodological capacities
and a first glance at any possible results.
I use R and the package stylo to measure stylistic resemblance based on word frequencies and Burrows'
Delta, in order to determine the stylistic relationships between the texts of my corpus, and subsequently visualize
these with Gephi. As I've said, I do not, at this point, wish to arrive at an actual authorship attribution; instead I'll
analyze and visualize the effects to serve the goals stated above. I hope to gain insight into my methods and my
corpus, and therefore a corpus that is as large as possible is needed.
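To illustrate the step from pairwise distances to a network that Gephi can draw, the following is a minimal sketch in Python rather than R. It is not the export routine of stylo; it assumes that distances between texts are already available, links every text to its nearest neighbours, and writes a plain edge list with the column names Source, Target and Weight. The function names, the toy distances and the choice of 1 / (1 + distance) as edge weight are assumptions made for illustration only.

```python
import csv
import io


def nearest_neighbour_edges(distances, k=2):
    """Link every text to its k nearest neighbours.

    distances: dict mapping a pair (text_a, text_b) to a distance;
               each unordered pair is listed once.
    Returns a list of (source, target, weight) tuples.
    """
    texts = sorted({text for pair in distances for text in pair})
    edges = []
    for text in texts:
        others = []
        for (a, b), d in distances.items():
            if a == text:
                others.append((d, b))
            elif b == text:
                others.append((d, a))
        # Smaller distance means stronger resemblance, hence a higher weight.
        for d, other in sorted(others)[:k]:
            edges.append((text, other, round(1.0 / (1.0 + d), 4)))
    return edges


def to_edge_list_csv(edges):
    """Serialize the edges as CSV text with a Source,Target,Weight header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
    return buffer.getvalue()


# Made-up distances between three hypothetical texts.
toy = {("text_x", "text_a"): 0.8,
       ("text_x", "text_b"): 1.2,
       ("text_a", "text_b"): 1.5}
print(to_edge_list_csv(nearest_neighbour_edges(toy, k=1)))
```

Edges are emitted per text, so a pair of texts that are each other's nearest neighbour appears twice; whether to merge such pairs into one undirected edge is a design choice left open here.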

1.2 Goal specific test on specialized corpora


Depending on the results of the exploratory analyses, I'll formulate further questions, hypotheses, corpora and
analyses. Any new results will be taken into consideration with every new analysis, and changes demanded by
these results will be incorporated in the design where needed.
On these balanced and prepared corpora with a small closed set of candidates, I'll perform the same tests
as those of 1.1, a straightforward AA test or multiclass, single-label text categorization task,335 but now under
different conditions and with a different aim. As mentioned before, perfect conditions are met when we have
copious quantities of text of undisputed authorship by each candidate author and the anonymous text is
reasonably long. Although my conditions fall short of perfect in various respects, I'll still try to solve my
Wilhelmus case with these methods, based on previous results, the easily applicable and straightforward
design, and the fact that no previous attempts to computationally solve the Wilhelmus case have been made,
making any computational test a first.
My need for a multivariate distance measure based on similarity will be met by Burrows' Delta, which
will again calculate the similarity between texts, and then estimate the most likely author based on a
nearest-neighbor algorithm,336 visualized as such in R or Gephi. I'll use a variety of features, such as word
collocations, FW and character n-grams.
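The combination of a Delta distance and a nearest-neighbor decision can be sketched in a few lines. The sketch below is in Python and is not the stylo implementation: it assumes whitespace tokenization, takes the most frequent words over the whole toy corpus as features, standardizes their relative frequencies to z-scores (using the population standard deviation as a simplification) and defines Delta as the mean absolute difference between two z-score profiles. All function names and the toy texts are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def most_frequent_words(texts, n):
    """Return the n most frequent word forms over all texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return [word for word, _ in counts.most_common(n)]


def relative_frequencies(text, vocabulary):
    """Relative frequency of every vocabulary word in one text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[word] / total for word in vocabulary]


def zscores(profiles):
    """Standardize every feature over the corpus (population std. dev.)."""
    labels = list(profiles)
    n_features = len(profiles[labels[0]])
    z = {label: [] for label in labels}
    for j in range(n_features):
        column = [profiles[label][j] for label in labels]
        mu, sigma = mean(column), pstdev(column)
        for label in labels:
            value = (profiles[label][j] - mu) / sigma if sigma > 0 else 0.0
            z[label].append(value)
    return z


def delta(z_a, z_b):
    """Delta as the mean absolute difference between two z-score profiles."""
    return mean(abs(a - b) for a, b in zip(z_a, z_b))


def attribute(target, corpus, n_mfw=50):
    """Return the label of the text nearest to `target`, plus all distances."""
    vocabulary = most_frequent_words(corpus.values(), n_mfw)
    profiles = {label: relative_frequencies(text, vocabulary)
                for label, text in corpus.items()}
    z = zscores(profiles)
    distances = {label: delta(z[target], z[label])
                 for label in corpus if label != target}
    return min(distances, key=distances.get), distances


# Toy corpus of made-up word sequences, for illustration only.
toy_corpus = {
    "author_a": "de de de de en en van het",
    "author_b": "het het het het van van en de",
    "target": "de de de en en en van het",
}
nearest, all_distances = attribute("target", toy_corpus, n_mfw=4)
print(nearest, all_distances)
```

With a text as short as the Wilhelmus, the z-scores rest on very few observations per word, which is one reason to treat the nearest neighbour as an indication rather than as an attribution.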
I expect to answer my research questions and hypotheses with these analyses on these corpora, but
not before I've identified the effects other than the authorship effect more precisely and cancelled them out, or
at least accounted for them, by comprehending their influence and letting them be part of my interpretation.
With a good overview of the signals in my corpora, in the target text as well as my test corpus, I aim to find
authorial signals, among other effects, and stylistic resemblances that will lead to major breakthroughs in the
tradition of stylistic and authorial research on the Wilhelmus. At the very least I hope to establish whether it's
possible to stylistically analyze the Wilhelmus with the currently available methods and means.

334 Koppel, Schler and Argamon 2009.


335 Stamatatos 2009.
336 Stamatatos 2009.

TEST 2 Multivariate statistical analysis techniques (PCA)


2.1 Dimension reduction
Principal component analysis (PCA) is a statistical technique which has the advantage of requiring no
underlying mathematical model. It aims to transform the observed variables into a new set of variables which are
uncorrelated and arranged in decreasing order of importance. The principal aim is to reduce the dimensionality of
the problem and to find new variables which will help to make the data easier to understand. These new variables
(or components) are linear combinations of the original variables, and it is hoped that the first few components
will account for most of the variation in the original data.
PCA determines, transforms and plots the data in a limited number of components of variation, and
by doing so can be very helpful for interpreting relationships between objects of analysis and for identifying
outliers that have different properties than other texts in the corpus. The first step in PCA is to draw a new axis
representing the direction of maximum variation through the data. This is known as the first principal component.
Next, another axis is added, orthogonal to the first and positioned to represent the next highest variation through
the data. This is the second principal component. The data is then transformed, or rotated, to view the points on
the new axes. The process of adding more principal components can continue, each one orthogonal to the
previous ones and each one accounting for less and less of the variance in the data set.337 The result can be
visualized in a two-dimensional space in such a way that spots that are "close together" on these stylistic
dimensions will appear close together on the plot. So, using a distance measure, the graph shows how, and how
strongly, each part of the corpus, represented as a point in space, contributes to the plot, and also which text or
part deviates from the general component or distribution of variance.
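The procedure described above, finding a first axis of maximum variation and then a second axis orthogonal to it, can be sketched without any statistical package. The following Python sketch is an illustration under stated assumptions, not the routine used by R or stylo: it centers the data, builds the covariance matrix, and finds the leading directions by power iteration with deflation. The function names and the small table of made-up frequencies are hypothetical.

```python
import math


def center(rows):
    """Subtract the column mean from every value."""
    n_cols = len(rows[0])
    means = [sum(row[j] for row in rows) / len(rows) for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of centered data (needs at least two rows)."""
    n, p = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(p)] for i in range(p)]


def leading_eigenvector(matrix, iterations=1000):
    """Power iteration: direction of maximum variance and its variance."""
    p = len(matrix)
    vector = [float(i + 1) for i in range(p)]  # arbitrary starting direction
    norm = math.sqrt(sum(v * v for v in vector))
    vector = [v / norm for v in vector]
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(p))
                   for i in range(p)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return vector, 0.0
        vector = [x / norm for x in product]
    eigenvalue = sum(vector[i] * sum(matrix[i][j] * vector[j] for j in range(p))
                     for i in range(p))
    return vector, eigenvalue


def pca(rows, n_components=2):
    """Return the components, their variances, and the projected scores."""
    centered = center(rows)
    cov = covariance(centered)
    p = len(cov)
    components, variances = [], []
    for _ in range(n_components):
        vector, eigenvalue = leading_eigenvector(cov)
        components.append(vector)
        variances.append(eigenvalue)
        # Deflation: remove the variance already explained by this axis.
        cov = [[cov[i][j] - eigenvalue * vector[i] * vector[j]
                for j in range(p)] for i in range(p)]
    scores = [[sum(row[j] * comp[j] for j in range(p)) for comp in components]
              for row in centered]
    return components, variances, scores


# Made-up relative frequencies of three words in five texts.
toy_rows = [[2.0, 1.9, 0.5], [1.0, 1.1, 0.4], [3.0, 3.2, 0.6],
            [4.0, 3.9, 0.5], [5.0, 5.1, 0.3]]
_, toy_variances, toy_scores = pca(toy_rows)
print(toy_variances)
print(toy_scores)
```

The two score columns are the coordinates that would be plotted: texts whose rows lie close together in this plane have similar frequency profiles along the two dominant directions of variation.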

2.2 Authorship attribution


PCA can also be used for actual authorship attribution instead of dimensionality reduction. Burrows
visualized the differences between texts written by different authors by projecting the high-dimensional word
frequency vectors computed for those texts onto the two-dimensional subspace of a graph, spanned by the first
two principal components; if good separation is seen between documents known to be written by different
authors, then new texts may be attributed by seeing which authors' comparison documents are closest to them in
this space.338 This method has been used to resolve several outstanding authorship problems.339
In their pioneering paper, Burrows and Hassall340 used principal component analysis to distinguish
between two authors, Henry Fielding and Sarah Fielding. Based on the fifty most frequent words, the plot showed
that the first two principal components clearly assigned the disputed texts to either Henry or his sister.341 I can

337 http://www.totallab.com/products/samespots/support/faq/pca.aspx
338 Koppel, Schler and Argamon 2009.
339 Binongo 2003; Burrows 1992; Holmes, D. (2003), Stylometry and the Civil War, Chance 16(2)
340 J. F. Burrows and Anthony J. Hassall, Anna Boleyn and the Authenticity of Fielding's Feminine
Narratives. Eighteenth-Century Studies 21, no.4 (1988).
341 Koppel, Schler and Argamon 2009.

mimic this experiment by taking Coornhert and Marnix instead of the Fieldings, and see whether the Wilhelmus
follows one of the authors and, if so, which one.

Corpus

Considerations for the researcher


A great variety of influences determine the style of a text: influences related both to textual properties and to
its author's biography. When performing authorship attribution, the researcher has to consider and include all
these possible influences when composing the research corpus. Attribution methods are likely to be most reliable
when the corpus of texts of known authorship is of the same language,342 genre, date and theme as the
anonymous work. This way, authorship would be the most important discriminatory factor between the texts.343
Stylistic fingerprints do not always remain stable during the course of an author's lifetime.344 The lexicon
expands indefinitely until death or illness intervenes,345 but a writer can become less stylistically innovative as
certain words and patterns become increasingly preferred, because even in healthy aging, semantic retrieval
speed deteriorates and the number of repeated phrases increases.346 In a study by Lancashire and Hirst, the
works of Agatha Christie are quantitatively analyzed because she is rumoured to have suffered from Alzheimer's
disease. The semantic, and after that the phonological, output lexicon becomes progressively inaccessible in
Alzheimer's disease.347 One of their conclusions reads as follows: "Her family's testimony about Christie's
otherwise undiagnosed physical and mental decline offers an explanation for these data: encroaching
dementia."348 Not only do age and illness have a large influence on an author's style, but it's also possible that
this goes relatively unnoticed by the human expert reader. This is a striking example of why a research corpus
preferably consists of texts from around the same period of time, because these effects won't go unnoticed by a
computer.
The effects of genre give rise to similar considerations. Different genres, like poetry and prose, can harbor
different stylistics. Possible effects of topic pose difficulties, because topic is a very unspecific categorization
and leaves you with limited amounts of text very quickly. The use of function words over content words solves this

342 Stamatatos 2009.


343 Stamatatos 2009.
344 Holmes 1994.
345 J. Maxim and K. Bryan, Language of the Elderly: A Clinical Perspective (London: Whurr, 1994), 3, 24.
346 Marjorie Nicholas, et al. Empty Speech in Alzheimer's Disease and Fluent Aphasia. Journal of Speech, Language, and
Hearing Research 28 (1985).

347 Maxim and Bryan 1994, 3, 24


348 Ian Lancashire and Graeme Hirst, Vocabulary Changes in Agatha Christie's Mysteries as an Indication of Dementia: A
Case Study. 19th Annual Rotman Research Institute Conference, Cognitive Aging: Research and Practice (Toronto:
University of Toronto, 2009), 4.


problem for a large part, unless topic actually also influences the style, in terms of the relative frequency
of function words. A certain topic or genre demands a certain structure; for example, distinct stylistic differences
were found between genres like tragedies and comedies.349 The line I've drawn here between topic and genre is
pretty blurred and will not have such a clean rendition in textual reality, as any border is imaginary.
Other important evaluation parameters for a corpus are: corpus size, in terms of both the number and
the length of the texts; test corpus size, meaning the length of the anonymous text; the number of candidate
authors; and a balanced distribution of the corpus over the authors. These have already been discussed.

Explanation corpus
In this section, I explain my corpus and discuss the purpose and nature of my set of candidate authors and
included texts. "Texts" can mean complete texts or fragments of texts, and single texts or collections of them
with a common author, genre or language. "My corpus" in this section means all of the texts that were at some
point part of the analyses I ran for this thesis. Within my corpus I discern three major sub corpora, based on the
way they came into my possession.
Two of them, the Meertens corpus and the DBNL corpus, I've received through my cooperation with these
institutions of language and literature. In these cases, especially with the DBNL corpus, the composition of the
sub corpora depended for the most part on what was handed to me. I deliberately tried to have as little influence
as possible on the selection process of these two sub corpora, so I could rule out that certain effects were
only present because of my selection or because of the practice of preparation in general. The initial request for
texts was of course to some extent specified. Another advantage of the delivered corpora was that they allowed
me to include a larger number of texts, since I didn't need to handpick and clean them manually.
A third sub corpus was constructed by myself, aimed at a specific task. I therefore occasionally refer to
this corpus, and its sub corpora, as the specialized corpus. The selection of texts is based on the secondary
literature I got hold of, on the experts in the field I've been in contact with, and for a large part on the
availability of the texts. I've downloaded texts, transcribed them from paper books as well as from scanned pages
or files, and copy-pasted them from DBNL, but also drew texts from the corpora I got from the institutions.
These three sub corpora are in many cases further divided into all sorts of smaller sub corpora. I will,
however, always start with the composition of a 0-corpus, the main sub corpus. The 0-corpus includes all texts of
the respective sub corpus and is the starting point for any further division and analysis. The three 0-subcorpora
also represent three types of format, three types of goals and three separate lists of thank-you notes I will be
writing after my thesis is done. A complete index of all corpora and their texts is added (see appendix 1 corpora).
Discussing my corpus, specifically the Specialized corpus, automatically leads me to discussing my
hypotheses, which I therefore present after the individual discussion of the three main sub corpora.

349 Holmes 1994.



Three corpora
DBNL Corpus
This corpus was sent to me by the Digitale Bibliotheek van Nederlandse Letteren350 (DBNL), being exactly that.
I've been in contact with them from February 2015 until April, especially with Cees Klapwijk. After some
non-digital paperwork, I received the corpus in accordance with my request for all Dutch texts in the possession
of the DBNL, including those in the southern dialects, written between 1550 and 1590, without discrimination on
genre, so including plays. These demands are so non-discriminatory that there's a good chance the author of the
Wilhelmus is actually in there. The goal of this corpus is to have a bird's-eye view of the Dutch literature of the
second half of the sixteenth century, in order to analyze the big movements, the noise and the possibilities of
actually finding that needle in the haystack. A corpus with as many texts and as many candidate authors as
possible accomplishes this.
The texts were in XML (Extensible Markup Language), a markup language, or digital format, that
encodes metadata in the form of structuring elements and attributes describing syntax and function, among
other things.

Meertens corpus
This corpus was sent to me by Erik Tjong Kim Sang of the Meertens Instituut, a research institute of the Royal
Netherlands Academy of Arts and Sciences (KNAW)351 that is currently very active in computational literary
studies, after extensive mail contact with, especially, Prof. dr. Nicoline van der Sijs and a meeting at the
institute itself. The initial goal of the meeting was to find a solution for the variation in spelling, but it resulted
in the addition of an extra corpus to my ranks.
The corpus consists of a selection from the DBNL corpus, delivered in two types of units, whole books
and songs or parts, and in a different XML-based annotation format, called FoLiA. The goal of this corpus is to
have an extensive view of how the complete works, that is, the complete books of songs but also each song
prepared individually, stylistically relate to each other. It could also be beneficial for the methodological goals
of my thesis, potentially providing information about format, text size and a possible signal of work, meaning a
stylistic resemblance between texts from the same book or work, for example because they're from the same
edition.

Specialized corpus
All the texts of the Specialized corpus are in txt format and, without exception, extensively prepared. I got rid of
as much meta text, and other noisy elements, as possible. Because this corpus consists of handpicked texts, it has
350 the digital library of Dutch text


351 Koninklijke Nederlandse Akademie van Wetenschappen

greater need to be accounted for than the DBNL and Meertens corpora. I discuss the included authors and texts
in this section.
The Wilhelmus and the Geuzenliedboek:
In order to get as close as possible to the original signature of the author, it's only logical to search for the
earliest version of the Wilhelmus. As discussed in the theory, the earliest available version, from 1573, is in
German and therefore, if the author is Dutch, which we assume, not close to the original version. The standard in
the tradition of Wilhelmus research is the edition of the Geuzenliedboek of 1581 called Een nieu GeusenLieden
Boecxen. For decades we assumed that this was the oldest surviving version of both the Wilhelmus and the
Geuzenliedboek. With the discoveries of Martine de Bruin we now have other options; the text of the Wilhelmus,
however, didn't change.
I've included the 1583 version of the Geuzenliedboek, which is similar, regarding the included works and
spelling, to the standard 1581 version traditionally researched. My choice for the 1583 edition is motivated by
this similarity as well as by the fact that it was complete and readily available to me at the DBNL. Both songs that
weren't in the 1581 and 1583 editions of the Geuzenliedboek but are in the new earliest edition, Een ander nieu
liedeken van Leyden and Van die afwijkinghe van Alckmaer, I include manually in my corpus.
I've also included the Geuzenliedboek of 1924-1925, edited by E.T. Kuiper. This version is very different with
regard to composition and spelling, although it does share some of the songs with the 1581 version. Its inclusion
enriches my corpus with texts similar to the Wilhelmus with respect to date and genre, and it may also
give me insight into the stylistic representation of different editions.
Marnix
Marnix van Sint Aldegonde is the most researched and most often proposed potential author of the Wilhelmus,
and in non-academic contexts he is outright assumed to be its author. He's amply represented in my corpus with
a variety of texts of different genre, date and language, among other variables.
One text that's worth mentioning is the Dordtse Rede, Marnix's oration of 19 July 1572 in the city of
Dordrecht. It was his speech at the first meeting of the rebellious Dutch states. Similarities between this text and
the Wilhelmus are numerous. Rooker performed a structural analysis352 of both texts to bring out such
similarities and concluded that Marnix was the author of both. His analysis has been criticized as prejudiced and
subjective.353
A song that I'd have liked to add, because of its resemblance to the Wilhelmus, but that I've failed to
acquire, is George Lalaing, also an acrostic and, although sometimes attributed to Marnix, also still
anonymous.

352 Rooker, C. "Marnix, de Dordtse rede en het Wilhelmus." De Nieuwe Taalgids 71 (1979): 145-164. Print.
353 J.B Drewes, Wilhelmus van Nassouwe. Een proeve van synchronische interpretatie [dissertation] (Amsterdam: Elsevier,
1946), 46-49;S.J. Lenselink, Maker van het Wilhelmus sprak in Dordt voor de Statenvergadering. Trouw 11 september 1948
(1572): 5.


Coornhert
The other big name, the runner-up, of Wilhelmus research is Dirk Volkertszoon Coornhert. Coornhert is the only
seriously considered alternative to Marnix. He's amply represented in my corpus with a variety of texts of
different genre, date and language, among other variables.
Other possible authors
As I've already mentioned, not many attempts at authorship attribution of the Wilhelmus have been made that
didn't solely focus on Marnix as its author, and/or Coornhert as the alternative. Other possible authors are rarely
seriously proposed, let alone researched.
A logical beginning for drawing up and subsequently narrowing down a list of potential authors of the
Wilhelmus is to accept all known authors whose songs are included in Een nieu geuse lieden boexcken of
1577-1578 and then try to rule them out. Porteman & Smits-Veldt354 suggested the following names: Willem van
Haecht, a Lutheran from Antwerp and poet of Psalms; Jeronimus van der Voort, who was never officially in the
service of Willem van Oranje but endured prison as well as the torture rack for him, was part of the chambers of
rhetoricians of Antwerp and Lier, and is rumoured to be the author of the chant "Vive le geus is nu de leus",
meaning "long live the Geus is from now on the word"; Jan Fruytiers; Coornhert; and Laurens Jacobs Reael.
René van Stipriaan, a historian specializing in Dutch history and in Dutch seventeenth-century theatre,
researched Oranje's literary network between 1568 and 1574, and came up with a list of known literary figures
similar to the usual-suspects list of Porteman & Smits-Veldt. In a lecture355 he elaborates on the names on the
list: Dirk Volkertszoon Coornhert; Marnix van Sint-Aldegonde; Petrus Datheen; Lucas d'Heere, a convinced
and radical Calvinist; Jan van der Noot, famous for his exile literature; Jeronimus van der Voort; Laurens
Reael; Johan Fruytiers; Jan van Hout, secretary of Leyden but fired by Count Bossu, a Spanish pawn and
predecessor of Willem of Orange as stadtholder;356 Janus Dousa; and Jan Baptist Houwaert. Through a
process of elimination, he ended up with three possible authors: Marnix, Van der Voort and Fruytiers. He
expresses his surprise over the fact that Van der Voort and Fruytiers were never seriously considered for
authorship. Especially Fruytiers, who throughout his oeuvre was an obvious supporter of the Reformation and the
Huguenots, seems a very possible yet neglected author.357
Fruytiers was a skilful author in many genres, including the beggars' songs, and he was well versed in biblical
matter. In 1574 he was appointed by Willem van Oranje as his counsellor and received an office, perhaps
as a token of honour and appreciation.358 Jan Fruytiers is recognized as the author of some songs in the

354 Porteman and Smits-Veldt 2008, 75.


355 Van Stipriaan 2012.
356 P.G. Witsen Geysbeek, Biographisch anthologisch en critisch woordenboek der Nederduitsche dichters.
Deel 3 HAE-IPE (Amsterdam: C.L. Schleijer, 1822), 419. Digitally acquired on 19-07-2015
http://www.dbnl.org/tekst/wits004biog03_01/wits004biog03_01_0097.php#614
357 P.G. Witsen Geysbeek, 1822.
358 Van Stipriaan 2012.

Geuzenliedboek, because of his subscript "Weest dat ghij zijt",359 as was Laurens Jacobs Reael with his subscript
"Liefde vermacht al".360
Another probable author, suggested by Buitendijk on the basis of biographical and historiographical
arguments,361 is Adriaan Saravia, preacher and writer of the manifestos of the Prince. Buitendijk wasn't the
first: as early as 1910, the widely authoritative historian P.J. Blok pointed to the strong similarities in thought
and beliefs between the Wilhelmus and a pamphlet by Saravia from 1568 called Hertgrondighe Begheerte.362

Probable authors, included, half-included or not included


I consider all of the authors mentioned in the previous section potential authors of the Wilhelmus, and I've
searched for an available body of texts for every one of them, representative of the full scope of that
author's style, with the intention of including them in my corpus. When considering all the characteristics already
mentioned that may influence the style of a text (genre, date, etc.), the available texts turn out to be too scarce
to sufficiently represent every author's style while controlling for other stylistic effects, meaning my
corpus will not be balanced. I can accept some level of imbalance, especially because these are usual suspects
which I desperately want to include in my authorship attribution. As a result, some authors are only partially
included, while others were left out altogether.
Adriaan Saravia hasn't been included because he has no available texts in digital format, except for Een
hertgrondighe Begheerte vanden edelen, lanckmoedighen, hoochgheboren Prince van Oraengien, of which his
authorship is disputed. Another potential Wilhelmus author that is not included in my corpus is Jan Baptist
Houwaert. I couldn't even pull his essential text Milenus clachte
Reael, Van Hout and Van der Voort are included, but their bodies of text are not balanced, particularly where
the number of texts included per author is small, and some of the texts have characteristics that might render
them inadequate as a test corpus for an authorship attribution. My assessment is that Marnix, Coornhert,
Fruytiers and d'Heere are stylistically correctly and sufficiently represented by the number, length and diversity
of their texts in my corpus.

Improbable authors
I've also added work and texts by authors who aren't considered serious options for Wilhelmus authorship. This
has several reasons. First of all, I've got multiple hypotheses, some of which do not concern authorship but
are about identifying and analyzing other effects. Some texts are meant to contribute to these hypotheses and not
to the question of authorship. Second, besides a search for the author of the Wilhelmus, this is also a
methodological thesis, so I'll perform a lot of different analyses on lots of different corpora, in order to find the
capabilities and boundaries of

359 G.J. van Bork and P.J. Verkruijsse (red.), De Nederlandse en Vlaamse auteurs van middeleeuwen tot
heden met inbegrip van de Friese auteurs (Weesp: De Haan, 1985).
360 E.T. Kuiper en P. Leendertz Jr. (ed.), Het Geuzenliedboek (Zutphen: W.J. Thieme & Cie, 1924).
361 Bonger 1985, 184.
362 Van Stipriaan 2012.


my methods and design. This works as a catalyst for the number of analyses and corpora, and hence the number
of different texts and authors. The third and last reason for the inclusion of improbable authors in my test corpus
is to balance and enlarge my corpus with texts of similar background, and possibly similar stylistics, to the
Wilhelmus and the texts of its suspected authors. This way authorship signals and other stylistic signals can be
adequately tested, false positives become a possibility, and with a larger body of texts more subtle effects
won't be missed as easily as with a small corpus. Also, when performing analyses with only potential authors, the
suggestion may arise that all these authors write stylistically similarly to the Wilhelmus, while in reality some
improbable authors may actually be more stylistically related to the national anthem than most of them.
One of the improbable authors is Petrus Datheen. In the following (translated) quote, Maljaars explains why
Datheen has never been considered as the author of the Wilhelmus, although he was in close contact with
Willem van Oranje and received some very honorary assignments from him, and so was perhaps unjustly never
taken into consideration.363
"Nobody, as far as we know, has ever named Datheen the poet of the Wilhelmus, ... because of the technique of
his poetry regarding the Psalms. Even so, taking into consideration that not everything he's made deserves the
label of flimsy." 364 365
Literary quality is not something that is objectively measured, so if subjective judgment has so far been
disruptive of the authorship question in the Wilhelmus case, this might be an opportunity to let go of these
criteria for inclusion.
I add texts by Petrus Datheen to my corpora, still assuming he isn't a real option. However, if texts by
Datheen repeatedly come out as a stylistic match to the Wilhelmus, I will not hesitate to take these results
seriously and will be forced to reconsider my initial assumption. The reason for my assumption of non-authorship
is that, regarding the theory on Wilhelmus authorship, I base myself on the general consensus of existing
research. To dissect every argument, like the exclusion of Datheen from possible authorship, to such an extent
that I can take a rational, independent position is a(nother) thesis in itself. I will not write two theses, so I'm
forced to trust the findings of my predecessors.
Another writer included in my corpus but not considered a realistic option, even though it's such a romantic
theory, is Willem van Oranje Nassau himself. The song is actually sung from his perspective, but the notion is
more fictive than historically probable. Willem of Orange had a large number of people in his service, poets
among them, who were assigned to the tasks of poetry, propaganda or both. In addition, there is the general
assumption that the Wilhelmus must have been written by a professional, which Willem de Zwijger (Willem the
Silent)
363 Maljaars 1996, 16.


364 *Quote is a translation out of Dutch
365 Maljaars 1996, 16.

isn't.366 Problems with the inclusion of texts by Willem van Oranje originate from these very objections against
the prince's authorship, because they involve the lack of poetry from his hand and the doubtful authorship of the
rest of his texts, mainly speeches and manifestos. I've included Apologie ofte verantwoordinghe in my corpus,
attributing it to Orange. He might not have written this himself; in that case preacher Villiers is the most
probable author.367

Anonymous Songs
Some of the texts in my corpus, many of them coming from some version of a book of beggars' songs, are
anonymous. Some are included just because they appeared together with the Wilhelmus in the same work, some
were specifically chosen because they were interesting cases, and some just to balance my corpora.
Prof. Dr. K. Heeroma writes in his essay Tsal hier haest zijn ghedaen, that the possible importance of the
Pardoen-lied and the famous beggarssong Help nu u self so helpt u Godt, have stylistic similarities that could
be explained as a common expectation of freedom or as the atmosphere of the times.368 Both are initially
included in the corpus.
Other improbable authors, most of them at one point secretary to the Prince, who were left out of the analyses
because their work wasnt readily available for analyses, were Hendrik Geldorp or Hendrik Castritius, Hendrik
Niclaes, Jacob van Wesembeke and Nicolaas Bruyninck369.

Hypotheses
I've formulated the following two main research questions, one topical and one methodological: Who is the
author of the Wilhelmus? and Can the complicated real world authorship attribution case of the Wilhelmus be
solved with the methods of quantitative analysis and the tools of computational literature?
In order to answer these questions I've drawn up sub-questions that each answer a part of my main
questions. When enough of these sub-questions are thoroughly answered, I'll be able to draw holistic
conclusions and answer the main questions. The sub-questions that are part of, and can add to, answering the
two main questions are the following:

366 Maljaars 1996,


367 K.W. Swart, Willem van Oranje en de Nederlandse Opstand 1572-1584, eds. Raymond Fagel, M.E.H.N.
Mout and Henk van Nierop (Den Haag: Sdu, 1994).
368 Heeroma 1985.
369 J.G. Frederiks and F. Jos. van den Branden, Biographisch woordenboek der Noord- en Zuidnederlandsche letterkunde
(Amsterdam: L.J. Veen, 1888-1891), 375.


What candidate for authorship of the Wilhelmus does a quantitative stylistic analysis with
computational means support and/or point to as the author of the national anthem?

Who is, or is more likely to be, the author of the Wilhelmus: Marnix van Sint-Aldegonde or Dirk
Volkertszoon Coornhert?

What are the current limits of computational stylistics and authorship attribution?

Can my methods detect authorial signals?

Can my methods eliminate one or more of the usual suspects of the Wilhelmus authorship
attribution case?

Can my methods give supporting evidence for one or more of the usual suspects of the
Wilhelmus authorship attribution case?

Are my methods useful and/or sufficient for authorship attribution for texts of 550 words?

During my research I've worked on several methodological questions that needed to be answered before I could
focus on the question of authorship. These were:

Can my methods detect language signals?

What language effects does the Wilhelmus signal?

Can my methods detect genre effects?

What kind of genre effects does the Wilhelmus signal?

These questions formed hypotheses during the construction of my three specialized corpora, which I'll discuss in
the next section.

Specialized sub-corpora
An important question in authorship attribution is how to discriminate between the three basic factors: authorship,
genre, and topic.370 All three of them can be successfully measured, but how do you make sure they don't interact
and thereby cloud your results? If Text A, a tragedy, is attributed to author B, not because it was written by author
B, but because a lot of the texts by author B included in the experiment were tragedies, you have a false positive.
In order to avoid these false attributions, I've got to be aware of the major stylistic effects in my corpus so I can
control for them.
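
The tragedy example can be made concrete with a small check on the corpus metadata. The following is only a sketch under assumptions: the author and genre labels, the function names and the 0.8 threshold are hypothetical and are not taken from the thesis or from any stylometry package. It flags authors whose texts are dominated by a single genre, which is the situation in which a genre signal could be mistaken for an authorial signal.

```python
from collections import Counter, defaultdict

# Hypothetical metadata as (author, genre) pairs; the labels are illustrative only.
corpus = (
    [("author_B", "tragedy")] * 5 + [("author_B", "comedy")]
    + [("author_C", "tragedy")] * 2 + [("author_C", "song")] * 2
    + [("author_C", "prose")] * 2
)


def genre_share_by_author(records):
    """Return, per author, the fraction of that author's texts in each genre."""
    counts = defaultdict(Counter)
    for author, genre in records:
        counts[author][genre] += 1
    shares = {}
    for author, genres in counts.items():
        total = sum(genres.values())
        shares[author] = {g: n / total for g, n in genres.items()}
    return shares


def flag_confounded(records, threshold=0.8):
    """Return authors whose texts are dominated by one genre (assumed cut-off)."""
    flagged = {}
    for author, shares in genre_share_by_author(records).items():
        genre, share = max(shares.items(), key=lambda item: item[1])
        if share >= threshold:
            flagged[author] = genre
    return flagged


print(flag_confounded(corpus))
```

An author flagged in this way is not necessarily a problem, but any attribution to that author deserves a second look with genre held constant.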

The specialized sub-corpora bear most of the weight in answering the stylistic question of authorship and are almost
completely responsible for the measurement of other stylistic effects that could possibly dim or obscure the
authorial signal. In order to find the answers to these questions I've formulated several hypotheses, by which I
designed and collected the sub-corpora of my Tim-0 corpus. The hypotheses that I've constructed can be divided
into three groups, based on the characteristic these hypotheses wish to explore.

370 Stamatatos 2009.



The first two sub-corpora are designed to answer the questions and hypotheses regarding language or
dialect and style or genre. The third group of hypotheses will aim at my main question of authorship. I deem these
first two sub-corpora necessary because, in order to find an authorial signal, I've got to control or account for
other possible effects that, based on the results of the other corpora and secondary literature, we ought to
expect. I need to discover, analyze, visualize and interpret effects of language or dialect and effects of type or
genre. When these effects are mapped, they can be accounted for. If these results provide insight into my
corpus or the Wilhelmus itself, they need to be interpreted and integrated in my hypotheses, analyses and
methods. I'll discuss the three sub-corpora of the Tim-0 corpus individually below.

Language Hypotheses on the talen-subcorpora


A cardinal question of the Wilhelmus-research is whether the song originated in Dutch or another language.371
The suggestion of a French or German origin keeps popping up. Historian Jan Willem Enschedé is one of the
strongest advocates of the theory that the Wilhelmus is a Dutch translation out of French, but the idea gets very
little acclaim.372 A more plausible option is that the Wilhelmus was originally written in German, since the oldest
handed down version of the song is the German edition of 1573 and the Wilhelmus is often described as Germanisch.373
This can be explained by all the exiles seeking refuge in Germany and spreading their literature in both Dutch
and German.
The general consensus is that our national anthem was originally a Dutch text.374 The editions in other
languages, assumed translations, all seem to contain Dutch linguistic elements and this points to a Dutch
original.375 Any further specification, about which region of the Netherlands, is completely open to debate. The
Wilhelmus is considered to have properties from dialects of both the Southern and Northern Netherlands.376
Marnix has, contrary to Coornhert, a southern background.
Texts used as propaganda seem to follow a certain pattern. We know that in this period Willem van
Oranje had a tight grip on the circulation of texts, especially the French and German ones.377 The Prince's
influence makes it very unclear which texts arose spontaneously and which texts were just obediently following the
path they were supposed to. An example of this is the poem Bewegliche Demonstration, which has been handed down in
four printed German editions from 1572 and is considered to be of Dutch origin, although no Dutch edition has
survived. It also shares a lot of ideas and themes with the Wilhelmus, which circulated in Germany around the same
time. A stylistic analysis might create some clarity about its heritage.

371 Van Stipriaan 2012.


372 Van Stipriaan 2012.
373 Veenendaal 1985, 77.
374 Bonger 1985, 174.
375 Van Stipriaan 2012.
376 Van Stipriaan 2012.
377 Van Stipriaan 2012

Based on these findings I focused on researching dialects and other influences of foreign languages. The dialects I
deemed necessary to research were Flemish or southern Dutch dialects and German-Dutch or Dutch with
German influences. There seems to be very little support for the French hypothesis, so I left that one out.
Possible effects of the German or Flemish influence are the most relevant, as they can contribute to our
knowledge about the Wilhelmus and also because these are important possible effects that could pollute my
results when looking for the author of the Wilhelmus.
Another effect I include in this set of hypotheses is that of translation. Most of the potential authors have
written in the classical languages, often religious texts, and so Dutch works translated from the classical
languages Greek and Latin are part of my corpora. If effects of translated text exist, they need to be controlled
for, and therefore I chose to include this as a sub-hypothesis.
This leads to the following hypotheses:

There is, according to the results of my analyses, a stylistic effect based on languages, dialects or any
other kind of accent or influence stemming from differences in language, present in the texts of my
corpus.

There is, according to the results of my analyses, a stylistic effect based on the Southern dialect, also
called Flemish, meaning Dutch from the southern regions of the Netherlands, present in the texts of my
corpus.

There is, according to the results of my analyses, a stylistic effect based on the Eastern dialect that
stems from influences of the German language, Hoogduits being the most prominent one, present in the
texts of my corpus.

There is, according to the results of my analyses, a common stylistic effect present in the Dutch texts of
my corpus that are translations out of Latin or classical Greek.

There is, according to the results of my analyses, a stylistic signal of a language or accent other than
Dutch present in the Wilhelmus.

For a proper understanding of some texts in my corpus and of the results they bring forth, the reader might need
some further context on these texts. In this section I'll elaborate on some of the included texts.
Van homulus by Pieter Dorland van Diest, coded as dorland_vanhomulus, is a text written in Hoogduits.
Hoogduits is a dialect of German spoken in the regions south of the Benrath line, including in the German
and Dutch province of Limburg, as opposed to the regions north of this language border, where Nederduits was
spoken. Some argue that Van homulus is originally a Dutch text.378
De Uilenspiegel, coded as A_uilenspiegel, is Nederduitse folklore. It's often attributed to Bote, but it's
uncertain whether he was capable of writing in Hoogduits, the language of De Uilenspiegel.379 For my analyses I
include the 1580 Dutch version printed in Antwerp. When the work is analyzed as a whole, a lot of text that wasn't
written by the author, like the introduction or other editorial notes of the 1987 reprint, is present. In the
specialized corpora I constructed, I cleaned this text of the noisy paratext.
Nederduitse Orthographie, meaning the Nederduitse orthography, by Pontus de Heuiter, coded as
heuiter_nederduitseorthografie, is an attempt to give better Dutch alternatives for words of Nederduitse
dialect that were often used in the Dutch language. Throughout his life the author aimed for a Koiné
composed of several dialects; an ambition inseparable from his travels through Holland, the southern regions of
the Netherlands and France.380
The manuscript Drie historische liederen en een hekeldicht by Antonius Ghyselers from around 1505-1518,
coded as DUITS_Antonius Ghyselers Drie historische liederen en een hekeldicht 1505-1518, is a
complicated text. The introduction, for example, includes pieces by Erasmus, written in Latin of course,
but also some modern Dutch editorial notes. To make things more difficult, Ghyselers studied Latin during his
military service in Austria, around the time of the writing of the manuscript, while in those days he corresponded
in Hoogduits, his mother tongue.381
Jan van der Noot wrote lofzang op Brabant, coded in my corpus as ZUID_van der Noot Himne
oft lof-sangh van Brabant, in Southern Dutch or Flemish; however, during this period he also wrote in French and
lived in Germany for a long time. The first lines of the Hymne on Braband are a paraphrase of the preamble
of Ronsard's Hymne de France382 of 1549. The hymn is also an acrostic.383
The work 25 Psalmen, coded as utenhove_25Psalmen, by the Flemish Dutchman Jan Utenhove
contains both Psalms and his digression on or explanation of them. There are possibly Nedersaksische roots
signaling through this text.
The Flemish rhetorician Marcus van Vaernewyck wrote the mythical history de historie van Belgis in 1574;
I include it in my corpus, coded as vaernewyck_histori Belgis. It includes Flemish as well as Latin parts.

378 Pieter Dorland van Diest, Van Homulus, een schoene comedie daer in begrepen wort hoe inder tijt des doots der
menschen alle geschapen dinghen verlaten dan alleene die duecht die blijft by hem vermeerdert ende ghebetert, ed. C.P.
Serrure (Gent: C. An van der Noot-Braeckman, 1857), 1-10.
379 Loek Geeraedts, Ulenspiegel, ed. Loek Geeraedts (Antwerpen: Berghmans Uitgevers, 1987), 5-7.

Genre Hypotheses on the genre-subcorpora


With this corpus I'll try to answer the questions about type and genre. With the limited amount of text that I have,
texts should only be excluded hesitantly, because too much exclusion may deplete the corpus to the point that
there's too little text to keep it balanced. On the other hand, there's the threat of keeping texts that are
not uniform or verified in their nature, and thereby measuring all kinds of secondary or other effects. If you don't
know about these other effects or aren't able to control for them, they might pollute the results, making the
effects that you wish to measure invisible. Therefore it was necessary to research whether differences in type or
genre had an effect on the stylistic fingerprint of the texts. I need to know what, if any, the stylistic difference
is between prose and poetry, between Psalms and poetry and between poetry and songs.
I'm also interested in possible stylistic properties of different genres, because they're relevant to my
questions about the Wilhelmus and even to the authorship question. If there's some common stylistic base for
topic or motive, especially if these genres are as subtle as the difference between songs of comfort and songs of
resurrection, it would be directly relevant to the entangled cardinal questions of the Wilhelmus, regarding the
reasons behind the song, its date of creation and possibly its author.
This leads to the following hypotheses:

There is, according to the results of my analyses, a stylistic effect based on type, genre or any other kind
of topical distinction, present in the texts of my corpus.

There is, according to the results of my analyses, a stylistic difference between the prose and the poetry
of my corpus.

The Wilhelmus is, according to the results of my analyses, stylistically more similar to poetry than to
prose.

There is, according to the results of my analyses, a stylistic difference between the songs and the poetry
of my corpus.

The Wilhelmus is, according to the results of my analyses, stylistically more similar to songs than to
poetry.

There is, according to the results of my analyses, a stylistic effect based on a common type or any other
kind of topical characteristic of the beggars' songs or geuzenliederen, propaganda songs, songs of
comfort or troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen.

The Wilhelmus is, according to the results of my analyses, stylistically more similar to one of the different
song genres, namely the beggars' songs or geuzenliederen, propaganda songs, songs of comfort or
troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen.

380 G.R.W. Dibbets, Voorbericht in Nederduitse Ortographie, ed. G.R.W. Dibbets (Groningen: Noordhoff, 1972), 6.
381 Vaderlandsch Museum and C.P. Serrure (ed.), Vaderlandsch museum voor Nederduitsche letterkunde, oudheid en
geschiedenis (Vierde deel) (Gent: H. Hoste, 1861).
382 Ronsard, L.L. VI, p. 79. In: Marcel Raymond, L'influence de Ronsard sur la poésie française II (Paris
ca. 1927)
383 Jan van der Noot, Lofsang van Braband/Hymne de Braband, ed. C.A. Zaalberg (Zwolle: W.E.J. Tjeenk Willink,
1958).

Authorship Hypotheses on the author-subcorpora


With this corpus I'll try to answer questions about the presence of authorship signals and the main question of my
thesis: who was the author of the Wilhelmus? I try to confirm the existence of authorship signals, as well as
the possibility of measuring them with my methods on my corpus. Generalizing these results can
raise questions about, or provide answers to, where the limits of these techniques currently lie. Topically, we're
interested in which texts, by which author, show stylistic resemblance to the Wilhelmus. This leads to the following
hypotheses:


There is, according to the results of my analyses, a stylistic fingerprint based on authorship or a
stylistic authorial signal, present in the texts of my corpus.

The Wilhelmus shows consistently more stylistic similarity to texts in my corpus of one particular author
than to the texts of the other authors, and these texts of this author consistently show an authorial signal
by expressing common stylistic markers.

This corpus consists for the most part of texts already used in my genre and language hypotheses, and for a small
part of new texts. All texts from the geuzenliedboek longer than 550 words have been added to the corpus. It is
highly significant that these authors were published in the same book, so printed by the same publisher and
probably conceived around the same time, and perhaps even for the same reasons and with the same ideas as the
Wilhelmus. Based on these probable similarities we can expect songs from the same work to show some
common stylistics. A few of the songs were already added, like the songs by Voort and Reael, which of course I won't
add twice. The newly added authors are Cornelis van Damme, Reael and Sterlincx, some of them with multiple
songs. I also include the Wilhelmus of the geuzenliedboek, coded as _nie096nieu01_01-0018WILHELMUS,
including the intro that I edited out in the previous version. The Wilhelmus is now represented twice in my
0-corpus, because the version of the Wilhelmus that is included in all my corpora remains included in this corpus as
well.

Hypotheses I will not test


There are several theories and possibilities for research that have come up during my preliminary research
but that I will not examine, for methodological reasons and because of practical limitations (time, size,
availability of corpus). These questions, which I will not try to answer but do feel I should acknowledge, are
discussed in this section.
I will not try to confirm any kind of literary quality, although this is an important argument in traditional
authorship attribution by close reading, and in traditional Wilhelmus research.
As stated before, the selection of authors included in my corpus depends heavily on the available texts.
Because of my methodology, I'll only be able to analyze poets with an oeuvre that's available and comparable to
the Wilhelmus. This dismisses some of the authors already mentioned in the corpus section as possible authors
for analysis, and also brings me to ignore the possibility that the writer of my great national anthem was indeed
an anonymous soldier or poet, who just wrote one poem for the occasion and then disappeared from history.
Possible authors without published or available work will not be taken into account.
I will also not test the Wilhelmus for co-authorship or any other kind of multiple voices within the anthem.
Purely for practical reasons I'll limit myself to the possibility of one author. Theories like Marnix as editor of the
anthem for an unknown poet, or the Herderian ideals of national hymns, are not pursued. This opens such a
different register of theory, methods and analyses that it should be tested separately on another occasion.


Co-authorship is also very hard to test on very short texts, perhaps even impossible on texts as short as the
Wilhelmus.

Expected problems
There are some problems or obstacles that I know of beforehand but won't be able to anticipate fully, or at all.
These obstacles can at best be expected and taken into account when designing, analyzing or interpreting. I'll
discuss them in the following order: first the obstacles of computational literature, then obstacles for authorship
attribution and then case specific problems.

Obstacles of computational literature


One of the major obstacles for computational quantitative authorship attribution in general is the absence of a
theoretical frame that explains how the author's style can be captured by measuring stylometric features like
function words. What we need is a higher level abstract description of the authorial style.384 We know that it
works, but we are yet to explain why it works. Despite the fact that this method is generally successful, there are
no studies that explain the underlying theoretical principles sufficiently.385 This can lead to the technique being
ignored or even discredited. It becomes characterized as a black box approach.386 This is something that I cannot
and will not try to solve in my thesis. I do consider it necessary to point it out as a major blockade for further
application of this type of research.

Obstacles for authorship attribution


As mentioned, real life authorship attribution problems are rarely as elegant as straightforward text categorization
problems.387 We often encounter situations in which our list of candidates might be very large and in which there's
no guarantee that the true author of an anonymous text is even among the candidates. Furthermore, the amount
of writing we have of each candidate might be very limited, and the anonymous text itself might be short.388 Another
major weakness is a set of candidate authors that is too small.389

384 Stamatatos 2009.


385 Eder, Maciej. A bird's eye view of early modern Latin: distant reading, network analysis and style variation. New
Technologies in Medieval and Renaissance Studies. Eds. M. Ullyot, D. Jakacki, and L. Estill. Toronto/Tempe: Iter and the
Arizona Center for Medieval and Renaissance Studies, 2015.

386 Harold Love, Attributing Authorship: An Introduction (Cambridge: Cambridge University Press, 2002).
387 Koppel, Schler and Argamon 2009.
388 Koppel et al. 2012.
389 Stamatatos 2009.

The most important issue in my research is probably the length of the target text. How long a text should be
for its stylistic properties to be adequately captured is uncertain.390 As reported in my methods section, some
studies report promising results when dealing with short texts. However, although it's not yet possible to define
such a text length threshold, it seems to lie above 551 words.
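
One way to probe such a threshold empirically is to cut texts of known authorship down to the length of the Wilhelmus and see whether they are still attributed correctly. The sketch below is an illustration under assumptions, not a procedure described in this thesis: the function name, the number of samples and the choice of contiguous slices are all hypothetical. Only the figure of 551 words comes from the text above.

```python
import random

WILHELMUS_LENGTH = 551  # word count of the target text as reported in this thesis


def random_samples(tokens, size=WILHELMUS_LENGTH, n_samples=10, seed=0):
    """Draw n_samples contiguous slices of `size` tokens from a list of tokens.

    Contiguous slices are an assumption; sampling individual words at random
    would be another defensible design.
    """
    if len(tokens) < size:
        raise ValueError("text is shorter than the requested sample size")
    rng = random.Random(seed)  # fixed seed so the samples can be reproduced
    samples = []
    for _ in range(n_samples):
        start = rng.randrange(len(tokens) - size + 1)
        samples.append(tokens[start:start + size])
    return samples


# Usage with a stand-in text of 2000 dummy tokens.
dummy_text = ["w%d" % i for i in range(2000)]
samples = random_samples(dummy_text)
print(len(samples), len(samples[0]))
```

Each sample can then be treated as an anonymous text in the same analysis as the Wilhelmus; the share of samples attributed to their true author gives a rough indication of how reliable the method is at this length.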
Another problem is dirty data or dirty texts. Corruption in stylometry seems to apply either to the texts in
the corpus itself, like dirty Optical Character Recognition (OCR), or to manifest itself in cherry-picking of
results. I encountered a case of bad OCR when I tried to include Cantica Lieder offte gesange by Hendrik Niclaes,
friend of Lucas d'Heere, Jan van der Noot & Dirk Volkertsz Coornhert. The text was in PDF format, but OCR
had not yet been applied, so I had to read from the page images, which were unreadable. An online OCR tool had
even more trouble reading the files, leaving me with a text unfit for analysis, and a strong awareness that this
kind of research and its tools are very sensitive to noise. I also tried to include the Psalmen unde ledern
of Hendrik Niclaes, because some songs of his Cantica are present in his Psalms, but this text was also beyond
my grasp. I ended up excluding Niclaes altogether. The (un)availability of texts is in my experience one of the
biggest stumbling blocks for successful authorship attribution.

Case specific problems


R & Cherry picking
Comparing texts with R/stylo entails some methodological problems. The visualization of the cluster analyses in
a tree diagram orders the texts on the basis of stylistic similarity or difference. The nearest leaves of the tree
diagram are each other's nearest stylistic neighbours, and where the graph places the first division, dividing the
corpus into two or more groups, these clusters represent the biggest stylistic categories. A danger of
this type of research is that the absolute strength of a connection is not visible; it is only shown relative to
other bonds. Combined with the results being very dependent on the parameters and other choices of the researcher,
this creates a very specific danger. Two texts that are nearest neighbours in a whole set of analyses, implying
strong stylistic similarity, could in fact, with a little adjustment of the parameters, suddenly move around the
graph, move away from each other and join other branches. So texts that we considered to have a common style can,
unpredictably and inexplicably, turn out not to share these characteristics at all, and vice versa. Researchers can
misinterpret the consistent but accidental clustering of texts as a strongly similar style. What's worse, the
researcher might be tempted to vary the parameters, for example the number of MFWs, until they show the results he
or she is looking for. On the other hand, some effects need to be lured out, as they were present all along but
hidden in the graph by an even stronger signal. An authorial signal remains present under differing
parameters, but clustering on a deeper level, like chronology or genre, can hide or fade away in the graph.391
There's no consensus about the number of MFWs, so how does a researcher determine this parameter without
cherry picking? I quote:

390 Stamatatos 2009.


391 Eder 2015, 8-9.

When it comes to choosing the plot that is the most likely to be true, scholars are often in danger of
more or less unconsciously picking the one that looks more reliable than others, or that simply confirms their
hypotheses. If common sense is used to evaluate the obtained plots, any counter-intuitive results will be probably
dropped simply because they do not fit the scholar's expectations.392
A solution regarding the MFWs is the bootstrap consensus tree, a diagram that combines cluster analyses with
different numbers of MFWs and subsequently plots these in a consensus tree diagram.393
The problem of the nearest neighbour principle, which shows strong effects like the authorship signal but in the
process conceals other stylistic effects on a deeper level, like genre, topic,394 translation,395 and sometimes
chronology,396 is still present, even with the consensus tree. The visualization in R is a rather rigid expression
of the relation between texts, where a text is either assigned to a (same) branch or not. In addition, a
hundred or more texts render the visualization unreadable and therefore not useful.397 The visualization options of
R are not always adequate for the demands of my corpus,398 but in combination with Gephi a lot of these problems are
solved. I must be aware of the limitations of my tools, especially the attribution and visualization options of R,
and of Gephi as well.
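
The dependence of a nearest neighbour on the number of MFWs can be illustrated with a small sketch. It is not the stylo implementation and it does not build a consensus tree; it only computes Burrows's Delta for several MFW settings and tallies which text comes out nearest to a target, in the spirit of the consensus idea. The function names, the toy texts and the MFW settings are hypothetical, and selecting the MFWs by raw corpus counts is an assumption.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_freqs(tokens):
    """Relative frequency of every word in one tokenized text."""
    counts = Counter(tokens)
    return {word: n / len(tokens) for word, n in counts.items()}


def most_frequent_words(texts, n):
    """The n most frequent words, counted over the whole corpus (assumed rule)."""
    totals = Counter()
    for tokens in texts.values():
        totals.update(tokens)
    return [word for word, _ in totals.most_common(n)]


def burrows_delta(texts, target, n_mfw):
    """Burrows's Delta from `target` to every other text, based on n_mfw words."""
    words = most_frequent_words(texts, n_mfw)
    freqs = {name: relative_freqs(tokens) for name, tokens in texts.items()}
    z = {name: [] for name in texts}
    for word in words:
        column = [freqs[name].get(word, 0.0) for name in texts]
        mu, sigma = mean(column), pstdev(column)
        for name in texts:
            # A word without variation carries no information: score it as 0.
            value = freqs[name].get(word, 0.0)
            z[name].append((value - mu) / sigma if sigma else 0.0)
    # Delta is the mean absolute difference between the z-score profiles.
    return {
        name: mean(abs(a - b) for a, b in zip(z[target], z[name]))
        for name in texts
        if name != target
    }


def neighbour_votes(texts, target, mfw_settings):
    """Count how often each text is the nearest neighbour of `target`."""
    votes = Counter()
    for n in mfw_settings:
        deltas = burrows_delta(texts, target, n)
        votes[min(deltas, key=deltas.get)] += 1
    return votes


# Toy corpus: 'target' shares its word proportions with 'a' only.
toy = {
    "target": "de de de en en het".split() * 50,
    "a": "de de de en en het".split() * 40,
    "b": "het het het en de van".split() * 40,
    "c": "van van en het het de".split() * 40,
}
print(neighbour_votes(toy, "target", [2, 3, 4]))
```

A neighbour that wins under nearly all settings is less likely to be an artefact of one convenient parameter choice than a neighbour that appears only once. The tally says nothing about the effects on a deeper level mentioned above, which remain hidden behind the strongest signal.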

Editions & Spelling


There's the possibility that, for both the Wilhelmus and the test corpus, the versions available to us are
different from the original because they've been altered, editorially or otherwise, by someone other than the
initial author. I try to obtain the unaltered original of every text, but these have often disappeared, and even of
the ones that are assumed to be an original, we're seldom sure.399 The original text of the Wilhelmus is lost and
beyond my grasp, so there's no way to make sure it's comparable to the version of 1581.400
What is known is that, in those times, printed texts were often altered for practical reasons such as the
layout of the page.401 Reason enough to be sceptical about research based on spelling differences as
expressions of personal style, but also reason enough to be concerned about what alterations or differences in
spelling might do to the authorial signal. Marnix, however, is known to have been very involved and very strict in
the process of publication, always demanding to be present when his works were printed.402
I assume, and with me the tradition of the Wilhelmus-research,403 that the version of the Wilhelmus in the
geuzenliedboek of 1581 doesn't deviate too much from the original spelling. I've searched for workable solutions
for possible differences in spelling, but came up short. Important for the analyses is the recognition that songs,
sometimes even versions of the same text, can differ greatly in spelling, but that this is not always a reflection
of an author's style. The researcher should be conscious of this fact when interpreting the results of this type of
authorship attribution on Dutch renaissance texts.

392 Joseph Rudman, Cherry Picking in Nontraditional Attribution Studies. Chance 16, no. 2 (2003).
393 Eder 2014.
394 Christof Schöch, Fine-Tuning our Stylometric Tools: Investigating Authorship, Genre, and Form in French Classical
Theater. In Digital Humanities 2013: Conference Abstracts (Lincoln Nebraska: University of Nebraska-Lincoln, 2013).
395 Jan Rybicki, The great mystery of the (almost) invisible translator. In Quantitative Methods in Corpus-Based
Translation Studies: A practical guide to descriptive translation research, eds. Michael Oakes and
Meng Ji (Amsterdam: John Benjamins Publishing Company, 2012).
396 Hugh Craig, Stylistic analysis and authorship studies. In A Companion to Digital Humanities, eds.
Susan Schreibman, Ray Siemens and John Unsworth (Oxford: Blackwell, 2004).
397 Eder 2014.
398 See figure 1
399 Maljaars 1996, 65.
400 Maljaars 1996, 65.
401 Maljaars 1996, 66.

Availability of German texts


A problem I can't anticipate is the difficulty of obtaining useful German songs, necessary for testing the
German hypothesis. In a study by Kossmann aimed at identifying German example songs, meaning German
songs that had an influence on the Wilhelmus and perhaps were even predecessors of the Wilhelmus, he had great
difficulty finding songs that date close to 1570. It's not certain if there even are any. His main source was a
great collection by Von Liliencron, Die historischen Volkslieder der Deutschen vom 13. bis 16. Jahrhundert, with
the newest contribution being from 1554,404 which is obviously a large temporal gap from the Wilhelmus.

Problems of the researcher


First of all Im heavily bound to the canon of the Wilhelmus. Especially me, who had the knowledge about the
Dutch early modern or renaissance literature of the level of a bachelor student, and could not rely on my own
expertise, has to put his faith in the established facts and fictions. My way of trying to speed up the process of
getting the handle on the Wilhelmus-case was reaching out to experts in the field, starting with my thesis
supervisor Prof. Dr. Els Stronks, but also contacting for example Prof. Dr. Louis Grijp and ask him to review my
list of literature and potential authors of the Wilhelmus. The point is that, although Ive uttered ideas about new
ways of reading and contributing to the empty pages of the library, my research depends very heavily on the
canon of the Wilhelmus-research, including the stylistic research branch. I can only trust in my predecessors that
the candidate authors are indeed logical fits based on historical and textual arguments.
The namedropping in this section, and also in the corpus section regarding the Meertens institute and
the DBNL, is not there to brush off authority on my choices, but to show that the inclusion of the objects of
research are not randomly or objectively chosen, but are part of a power structure, reaching back in history, both
influenced by and influential on, the academy and the rest of society, summarized with the word canon. The
secondary literature, my supervisors and advisors, and myself, are all part of this, and if Ive missed alternative
theory or interpretations of the Wilhelmus-case, this should explain why. This is of course a general problem, as

402 Maljaars 1996, 66-67.


403 Maljaars 1996, 69.
404 Heeroma 1985.

75

well as a problem specific to me, because I'm working outside my expertise and outside my comfort zone.
Connected to this argument is the fact that I'm dependent on the availability of the texts. When
performing comparative analyses, not only the target text has to be available but also a corpus of texts by
potential authors. The availability of texts by potential authors determines whether they can be included.
Canonical texts and texts by canonical authors are often more easily accessible than those on the periphery of
literature.
The other influence I'd like to point out, making sure that I've freed myself from any odour of objectivity, is
the researcher's horizon. My personal academic and professional history is part of me as a researcher, and
it will influence my choices. However, when this is acknowledged and anticipated, this type of quantitative
research should be able to filter out some biasing prejudice or preoccupation, at least in the analysis phase.
While the interpretation of the results can still be largely subjective, the actual generation of the data is less
sensitive to personal preference than in qualitative research.

Analyses and Results


I performed a great number of analyses on a great number of different corpora and subcorpora. My tool for
analysis is, as mentioned, R with the package Stylo, while my tools for visualization are R and Gephi. In this
section I'll report my observations, interpretations and conclusions based on the results of these analyses. A
complete index of all corpora, parameters and graphs, or other kinds of visual representation of the results, of
all the different analyses, is included in appendices 1.Corpora and 2.Parameters. This paper already includes
some visualizations or graphs and mentions the biggest changes in the parameters, so the results and
interpretations should be understandable without the appendices. If you're looking to critically follow my design
and methodological choices, or to use this thesis for your own research, I suggest you consult the appendices to
some extent, to capture the exact circumstances of each analysis. Separating the index of parameters from the
analyses here is a choice in favour of readability.
Concerning the visualization performed with Gephi, it's possible to open the attached Gephi file as well
as the pdf, and have the same visualization I had during the analyses. Consider these interactive graphs,
in which you can zoom in and out, and even manipulate the graph. Some of the visualizations include labels
and/or IDs, while others show only colours and dots. Every dot is a text, or a part of a text depending on the
chunking of the corpus, and every line is a connection between those texts. The thickness of the line is
determined by the strength of the relation, in other words the stylistic similarity, which also determines the
position of the dots in the graph. Stylistically similar texts will group or cluster together. The colour of each
text is determined by the piece of code, or name, before the _. So if two texts share the same name before the _,
as is the case when they are both written by the same author, they'll have the same colour, as will their
connections, or lines. A cluster of three Coornhert songs, coded as coornhert _lied01, coornhert _lief02_01 and


coornhert _lijd02_01, will all have the same colour and so will their connections, looking like a piece of gum in the
Gephi visualization. These are all settings and they depend on my decisions. Occasionally I'll tune up or tune
down some of the effects to make the graph readable or visually useful, but the premises I described here stay
the same for all the analyses.
I'd like to add an editorial note: even performing exactly the same analyses on exactly the same corpus
can sometimes produce slightly different results. When I performed, for various reasons, some analyses a
couple of times over, I found out about this strange occurrence, for which I've no explanation. I will state here
that I have not repeated analyses hoping that they would better fit the hypotheses, nor should any researcher.

Bird's-eye view of the DBNL corpus


I started by visualizing the stylistics of a big corpus, consisting of all Dutch texts, including those in the
southern dialects, written between 1550 and 1590 that are in the possession of the DBNL, without discrimination
on genre, so including plays, in order to get a bird's-eye view of early modern Dutch literature. This corpus is
not meant for a close-up look at or an in-depth analysis of any text, but for getting first impressions of the
corpus, the possibilities of analysis and early modern Dutch literature.

Analysis, interpretation and conclusions 0-corpus DBNL


The reader should at this point open one or both of the following visual representations of the data, the
document 0-corpusDBNLgephifile(2), which is a Gephi project file, and/or Graph 1. DBNL corpus Analyse 1.0,
which is a pdf file, in order to follow the results.
The pink flock at the bottom right of the graph, pink indicating that the text was written by Coornhert, is clearly
a cluster based on authorship, predominantly consisting of prose. The pink Coornhert network also has two outliers
originating from different points in the pink cluster but connecting with each other in the middle of the graph.
Both consist of poetry by Coornhert. There is also some poetry in the bottom right pink cluster.
Coornhert is represented in the DBNL corpus with a large number of different poetry and prose texts, but they
still manage to cluster together, even if some form subclusters by their nature, seeing that only poetry distances
itself from the main cluster. So I observe clustering of Coornhert texts, definitely the prose but also the poetry,
which indicates that there might be an authorial signal of Coornhert and that we should also look
out for a stylistic distinction between poetry and prose.
Moving to the other side of the graph, to the yellow cluster on the left, yellow being Utenhove, we see
a cluster of Utenhove texts. Again this seems an indication of a stylistic authorial signal in the
Utenhove texts. Looking at the nearest non-Utenhove texts, we see different colours, so different authors, just
under the Utenhove cluster. This nearest cluster contains all the Psalmen Davids by different authors.
Following the connections away from these Psalms and the Utenhove texts, we encounter religious texts, psalms


and others, again by different authors. So for these texts it seems that the fact that they are the same texts, or
on the same topics, is even stronger than an authorial signal. Of course not all these Psalmen Davids are the
same, share the same spelling or include exactly the same songs, but they are stylistically related enough to form
clusters based on characteristics other than authorial style. Beneath the religious texts, at the bottom of the
graph, we see a little red cluster, consisting of both texts by Fruytiers positioned close to each other.
My conclusion based on these bottom left clusters is that texts by the same author tend to cluster together, just
as versions by different authors of the same text tend to cluster together, and that even the type or topic of a
text should be taken into consideration as a possible effect. Another indication of a signal by topic is the
clustering of Marnix's Bienkorf with other satirical work like Erasmus's Lof der zotheid, in the bottom right of the
graph, connecting to the first outliers of Coornhert's prose. We can confirm these effects in the rest of the corpus
by looking at different versions of the geuzenliedboek, and at other authors spread over the graph: their texts
also tend to make connections with each other. These conclusions support the results of the analysis of the first
Coornhert cluster. Although authorial signals clearly exist, they can be overridden by other signals, like different
versions of the same text. I assume that this works both ways and that other, weaker signals can be overridden
by authorial signals.
Other possible effects, like the distinction between prose and poetry, seem to be present in the graph. Most of
the poetry is pushed to the right, but there's also poetry at the top of the graph and a bit in the middle.
Prose clusters around the bottom Coornhert outlier, for example the letters of Willem of Orange and the satirical
work of Erasmus and Marnix. When reviewing these results with my supervisor Prof. Dr. Els Stronks, she pointed
out that some of the traditionalist and traditional texts were clustering together, while innovators of language
such as Van der Noot and Lucas d'Heere also made connections with each other.
The bird's-eye view gave me enough indications that the corpus I'll be working on contains enough stylistic
information that can be captured, and enough effects based on these stylistics, to proceed with my research. There
is, of course, for every apparent effect, also noise. A lot of connections don't make sense, a lot of texts do not
fulfil my expectations and in many cases I don't have an explanation for their behaviour in the graph.
This being the case, I'd like to point out the general tendencies we do see in the graph. I suggest that
the reader look at the basic property of the graph, the colours. The texts of the corpus are coloured by their
presumed authors. The positions of the texts are based on their stylistic content, so roughly the relative
frequencies of their MFWs converted to a score by a distance measure, besides of course the parameters for
visualization. Their position is definitely not determined by me directly, nor by their code or similar names. This
means that the clusters of pink and yellow texts, but also the red of Susato and Fruytiers or the dark blue of
d'Heere, all noticeable to the reader at first glance, are effects worth considering as valid even if the corpus
also includes a lot of noise.


Specialized corpora
Now that I've seen the big picture and established some effects, I'd like to test these observations and look at
the data a bit more closely, by analyzing smaller and cleaner corpora, specifically composed to test my hypotheses.
I've constructed three 0-corpora, meaning starting corpora that might be further polished after seeing the results
of the analyses on them, all three designed to address a specific hypothesis or aspect of the Wilhelmus, the texts
of my specialized corpora or my texts in general. These corpora should answer my questions about language and
dialect; genre, type and topic; and the author, in that order. All of the corpora are extensively prepared. I've
tried to delete all the meta text, footnotes, titles, introductions and other forms of noise, so that only the text
would remain. All the numbers and large parts of the punctuation have also been edited out. In the case of plays
I've deleted the names of the speakers and other indications such as choir. In the case of psalms I've deleted the
endings, which were always formulas like amen or gloria. In the case of letters, and other texts, I've deleted
authors' notes, names, places and dates of writing, mostly at the bottom of the texts. I also tried to delete all
the text that was in a different language. An example of this is the removal of all the Latin proverbs in the
Antonius Ghyselers texts.
A complete index of all the specialized corpora I constructed can be found in the appendices. I suggest
that the analyses are read in the order in which I discuss them, because a lot of my later choices are a
consequence of the results of the earlier analyses. All the analyses are performed and visualized in R-stylo.

Analyses language hypotheses


As explained in the section Hypotheses, I constructed corpora for three sets of hypotheses. The first are the
following language hypotheses:

There is, according to the results of my analyses, a stylistic effect based on languages, dialects or any
other kind of accent or influence stemming from differences in language, present in the texts of my
corpus.

There is, according to the results of my analyses, a stylistic effect based on the Southern dialect, also
called Flemish, meaning Dutch from the southern regions of the Netherlands, present in the texts of my
corpus.

There is, according to the results of my analyses, a stylistic effect based on the Eastern dialect that
stems from influences of the German language, Hoogduits being the most prominent one, present in the
texts of my corpus.

There is, according to the results of my analyses, a common stylistic effect for the Dutch texts that are
translations out of Latin or classical Greek.

There is, according to the results of my analyses, a stylistic signal of a language or accent other than
the Dutch language present in the Wilhelmus.

The texts are coded and so divided by language or dialect. NED stands for Nederlands and means the Dutch
language, the official language of the kingdom, spoken in the provinces of Holland. KLAS stands for classical


texts; these are translations of classical literature into Dutch. DUI stands for German, meaning these texts range
from Hoogduits to Dutch texts with German influences. These are not binary categories but more of a sliding scale.
What binds them is that they're presumed, by me, to have German influence of some kind. ZUID stands for
southern, meaning all languages, dialects or influences on the Dutch language that are spoken in the southern
provinces of the Netherlands, nowadays Vlaanderen and Brabant. I presume these texts are some sort of
Flemish or another version of Dutch with Flemish or Belgian influences.
I also included the Wilhelmus and four cases of which I am unsure what region they belong to, mentioned in the
section Hypotheses. These are Van homulus by Pieter Dorland, the Nederduitse Orthographie of Pontus de
Heuiter, Drie historische liederen en een hekeldicht of Antonius Ghyselers and the anonymous De Uilenspiegel.

1. Analyses language corpus 1 Alle talen, proza & poez


I started with my broadest language corpus, called Alle talen, proza & poez, consisting of 26 texts. The corpus
is, with 4 KLAS, 8 NED, 6 ZUID and 3 DUI all poetry texts, pretty balanced. I'd prefer that more German texts
were included, and I've tried, but many of them have a disputed language signal. Some of the attempts at more
DUI in my corpus are now included as the open cases.
My first analysis had very basic parameters, measuring on word level without culling or sampling. I produced a
bootstrap consensus tree, which rules out cherry-picking a particular number of most frequent words. I set the
minimum number of most frequent words at 100, giving my analysis a little more range than Burrows' preferred
200 MFWs,405 and the maximum at 1000, choosing a large number because, as mentioned in the method section, this
improves performance, and also to keep a large range, which I will heavily limit later on with the culling
function, with an increment of a reasonably sized but still very precise 10.
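To make these parameters concrete, the following Python sketch shows one way to compute a Delta-type distance (Burrows's Delta: the mean absolute difference of z-scored word frequencies) for a given number of MFWs, and lists the MFW grid from 100 to 1000 in steps of 10. It is an illustration only, not the Stylo code used for the analyses. The function names, the toy corpus and the use of the population standard deviation are assumptions of this sketch, and the consensus tree itself is not built here.

```python
from collections import Counter
from statistics import mean, pstdev

# MFW settings described in the text: 100 to 1000 in steps of 10.
mfw_grid = list(range(100, 1001, 10))

def relative_frequencies(tokens):
    """Return each word's share of the text's total token count."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: n / total for word, n in counts.items()}

def most_frequent_words(corpus, n):
    """Rank words by their summed count over the corpus and keep the top n."""
    totals = Counter()
    for tokens in corpus.values():
        totals.update(tokens)
    return [word for word, _ in totals.most_common(n)]

def burrows_delta(corpus, n_mfw):
    """Pairwise Delta: mean absolute difference of z-scored MFW frequencies."""
    mfw = most_frequent_words(corpus, n_mfw)
    freqs = {name: relative_frequencies(toks) for name, toks in corpus.items()}
    z = {name: [] for name in corpus}
    for word in mfw:
        column = [freqs[name].get(word, 0.0) for name in corpus]
        mu, sigma = mean(column), pstdev(column)  # population sd: an assumption
        for name in corpus:
            value = freqs[name].get(word, 0.0)
            z[name].append(0.0 if sigma == 0 else (value - mu) / sigma)
    names = list(corpus)
    return {
        (a, b): mean(abs(x - y) for x, y in zip(z[a], z[b]))
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

# Toy corpus (hypothetical, only to show the call).
toy = {
    "a": "de heer is de heer".split(),
    "b": "de heer is de heer".split(),
    "c": "ick ben van duytschen bloet".split(),
}
distances = burrows_delta(toy, 3)
```

A consensus approach would repeat such a distance computation for every value in `mfw_grid` and keep only the groupings that recur across most of those runs.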
The visualization of the analysis, the graph, showed no major divisions, but there are some clusters which
could be interpreted as signals for authorship, or as a poetry and prose distinction. I am searching for indications
that different languages, or rather different dialects, leave their stylistic mark visibly in my graph. Results that
I could interpret as a signal for language or dialect remain very dim, because the tree diagram gives little
information; it only shows little clusters. That being said, especially the texts considered to be ZUID, so southern
Dutch, do seem to cluster, and also seem to claim the text of Dorland, whose origin is uncertain, as one of their
own. The other open texts give no clear language signal. As for the Wilhelmus, it seems to cluster with the NED
texts, clinging to texts of van Hout and Coornhert. One result I'd like to mention is the clustering of Antonius
Ghyselers with Utenhove, because they are presumed to be of different dialects.
In the second analysis I deleted the pronouns, since previous research shows that removing them improves the
results, and applied a culling of 50%, also because excluding words that are only present in a minority of the
texts of the
405 Burrows 2002.



corpus improves the results.406 Words for which a single text supplies most of the Delta score should be
deleted, as they pollute the results.
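The two preprocessing steps of this analysis can be sketched in Python as follows. This is a hedged illustration, not the Stylo routine: it assumes that culling at a given percentage keeps only the words that occur in at least that percentage of the texts, and the pronoun list is a hypothetical placeholder rather than the list actually used.

```python
# Hypothetical stand-in for a pronoun list; the real list is not given here.
ILLUSTRATIVE_PRONOUNS = {"ick", "ghy", "hy", "sy", "wy"}

def drop_words(tokens, stoplist):
    """Remove every token that appears in the stoplist."""
    return [t for t in tokens if t not in stoplist]

def cull(corpus, percent):
    """Keep words that occur in at least `percent` % of the texts.

    percent=0 keeps every word, percent=100 keeps only words found in all texts.
    """
    n_texts = len(corpus)
    doc_freq = {}
    for tokens in corpus.values():
        for word in set(tokens):  # count each word once per text
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {w for w, df in doc_freq.items() if df * 100 >= percent * n_texts}

# Toy corpus (hypothetical) of four very short "texts".
toy = {
    "t1": ["de", "heer", "ick"],
    "t2": ["de", "heer"],
    "t3": ["de", "bloet"],
    "t4": ["de"],
}
kept_50 = cull(toy, 50)
kept_100 = cull(toy, 100)
without_pronouns = drop_words(toy["t1"], ILLUSTRATIVE_PRONOUNS)
```

The toy example shows why high culling shrinks the feature set so quickly: the stricter the threshold, the fewer words survive in every text.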
The results now showed clear divisions and a lot of detail. The biggest division puts the ZUID texts on
one side of the graph, in the same branch of the tree. There are authorial signals again, in Coornhert, Fruytiers
and Van der Noot, and possible prose vs. poetry signals, which is how I interpret the divergent results for prose.
Dorland and the anonymous Uylenspiegel both seem to cluster with the ZUID branch. Heuiter and Verwaeck, the
other unspecified texts, cluster together, along with Marnix's prose text Dordtse reden, with the branch of the
NED texts. The Wilhelmus clusters again with Coornhert. Other notable results are Coornhert's prose text Corte
berispingen clustering with translated classical texts by Erasmus and Coornhert himself. Coornhert's other
classical translation, his Ulyses, clusters with Dutch poetry by Marnix and Van Hout. The German texts under the
banner DUI are scattered across the graph. By varying the two parameters, the pronouns and the culling, I test
which change in settings influences the results and, if so, in what way.
Analysis 2.1 shows that keeping the pronouns while culling heavily distorts the results, presumably
because keeping the pronouns generally leads to noisier results and, with a high level of culling, the pronouns
will make up a larger portion of the features than they did before, although this is speculation.
In the third analysis, analysis 3, I use different levels of culling in one and the same graph, by choosing a
minimum and a maximum of culling and an increment. I determine the minimum and maximum by looking at the
data of the analyses, displayed in the working screen of R. Culling of 50 percent leaves just 145 features, words,
available for analysis. If I only consider words that appear in all texts, we'd only measure 2 available features,
only 2 words. The results show only the big movements and the division of the ZUID texts against the rest. This
result was slightly visible in previous analyses. The little effects are as good as gone. There are no little
clusters; instead every text starts from the middle, on its side of the divide of course.
In analysis 3.1, I chose low levels of culling, from 0 up to 50, and the graph shows the opposite
effect. The big divide is totally gone but the clusters are reappearing. Both analysis 3 and 3.1 have unattractive
parameters because their graphs show a less distinctive landscape of this corpus, and less diversity in possible
effects, than the graph of analysis 2, with a set choice of 50% culling.
In analysis 3.2 I chose a culling of 30 to 50 percent, based on the available features; these were the
best iterations regarding the number of measured MFWs (most frequent words). If the number of words becomes
too small, there are not enough features to analyse, but a feature set of over 1000 is bound to contain too many
semantic words. The choice of these parameters results in a more detailed graph in which the big divisions, as
well as some of the detailed information on the clusters, are visible. One branch of the tree clusters all the
southern texts, including the possibly Flemish Dorland, together with Uylenspiegel and the presumed German text of
Antonius Ghyselers, who has clustered with the ZUID texts before. One branch, of a Dutch text and a classical text,
springs from the middle but seems to tend towards the ZUID branch. So there seems to be a cluster based on dialect
or language if we look at the clustering of ZUID texts; at the same time there are a lot of texts that do not confirm

406 Hoover 2004b.



this hypothesis. Prose texts are often exceptions. It is necessary to pay extra attention to a possible distortion
of the results based on an effect of type of text, meaning prose or poetry. An authorial signal seems present in
Coornhert, Fruytiers and Van der Noot. Heuiter and Verwaeck, the other two undetermined texts, cluster
together. The Wilhelmus again joins Coornhert's song Ter liefden. The effects of the German-influenced texts are
unclear. In general, the small number of available features in analysis 3.0 can be taken as a suggestion to try
another feature type, character n-grams, also suggested in my theory.
In the fourth analysis I sample in order to compensate for any differences in text size, because these may cause
an imbalanced corpus. I set the sample size at 525, which is about the size of the Wilhelmus. The graph showed
that parts from the same text cluster together, suggesting they are stylistically representative of the text from
which they came. There's stylistic consistency within the works. Other results are that the two biggest texts,
which of course also produced the most samples, are responsible for the two biggest clusters, which also form the
main branches, the biggest divisions. The third cluster is a leftover cluster, meaning the rest of the texts, which
do not belong to the two biggest texts.
I can confirm that a large amount of stylistically related text not only clusters together, but also makes the
other texts seem to cluster together, driving them into one branch by the weight of its own number and the
strength of its relationships. Effects of author and the clustering of Dutch texts from the southern regions are
again present.
In analysis 5, I've used a random sample, again with sample size 525, and the number of samples reduced to
one. This follows the bag-of-words principle, where word order is completely ignored and 525 words are chosen
at random from every text. The effect of text size, so visible in analysis 4, is now undone because every text
only has one sample.
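The difference between the sampling of analysis 4 and the random sampling of analysis 5 can be sketched in Python as follows. This is an illustration under assumptions, not the Stylo implementation: the function names are hypothetical, the short remainder of a text is dropped in the sequential variant, and the random variant draws without replacement with a fixed seed so that it can be repeated.

```python
import random

SAMPLE_SIZE = 525  # roughly the length of the Wilhelmus in words

def sequential_samples(tokens, size=SAMPLE_SIZE):
    """Cut a text into consecutive, non-overlapping samples; drop the short tail."""
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]

def random_sample(tokens, size=SAMPLE_SIZE, seed=0):
    """Draw one bag-of-words sample without replacement; word order is ignored."""
    if len(tokens) <= size:
        return list(tokens)
    return random.Random(seed).sample(tokens, size)

# A hypothetical text of 1200 tokens.
text = [f"w{i}" for i in range(1200)]
chunks = sequential_samples(text)  # long texts yield many samples
bag = random_sample(text)          # every text yields exactly one sample
```

With sequential sampling a long text contributes many samples and can dominate the tree; with one random sample per text every text carries the same weight, at the cost of using only 525 words of each.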
The graph as a whole is less distinctive than the previous ones. It shows a big division but almost no other
effects. Most of the texts originate from the middle. There are very few texts that cluster together. Random
sampling supposedly produces very few effects because of its short sample size. Most effects will not show
themselves with only 525 measurable features, in this case words.
Random samples of 525 words are, seemingly, so short that a corpus that is unbalanced in size shows
more effects using word frequencies of the complete texts than using the samples. This leads me to the
conclusion that a text of somewhere between 500 and 550 words might just be very unsuited for author attribution
and that texts of this size will cause noise in the results. This is only based on these parameters and these
texts, so all my conclusions come with a proviso.

2. Conclusions language corpus 1 Alle talen, proza & poez



First of all, I'd like to mention that fine-tuning the parameters makes a lot of difference for the results, their
visualization and thereby the interpretation and conclusions. I make decisions regarding the parameters based on
theoretical arguments in order to avoid cherry picking.
There were authorial signals, possible type signals, which seemed the result of prose clustering, and a possible
language signal, because the ZUID texts, texts with Flemish or southern Dutch heritage, seem to cluster. At the
same time there are a lot of texts that do not confirm these effects. Prose texts are often exceptions; this can
be a distortion because of a type or genre effect. The German texts do not seem to recognize each other
stylistically. There might be an effect for translated works, but I need more tests to confirm this.
Dorland and the anonymous Uylenspiegel both seem Flemish or Flemish-like. Heuiter and Verwaeck
cluster together, along with a variety of other NED or Dutch texts, but do not consistently show a language
signal. The Wilhelmus clusters with Coornhert, but also inconsistently.
When choosing high levels of culling, the graphs showed the strong effects, hence the big divide, but lost a lot
of the little effects, which may not actually be relevant or true effects. When using low levels of culling, the
graph showed the opposite effect. I should keep this in mind and try to theorize it when determining the culling,
which I base on the number of features, not on the graph.
The sampling confirmed the stylistic consistency within the works and laid bare that big texts, or large amounts
of stylistically related texts, push the other texts into one cluster, while these may not actually be
stylistically similar. They're only alike in their comparison to the other cluster, namely in not being like those
texts. So relations are relative. I need to be aware of this when constructing further corpora.
We saw that most effects will not show themselves with a sample size of only 525. Trying to perform an
authorship attribution on a text of somewhere between 500 and 550 words seems like stretching the methods too
far, and texts of this size will inherently cause noise in the results, but I already knew this beforehand.

3. Analyses language corpus 2 Alle talen, enkel poez (DUI-NED-ZUID-KLAS)


I now remove prose from my corpus, because of the possibility that prose and poetry have different signals, and
that this obscures possible language effects in the graph. I also remove the two test cases Uilenspiegel and the
text from Vaernewyck. Their results were very inconsistent and this was expected, because of their mixed
background, explained in the corpus and hypotheses sections, and possibly because these texts had large
portions of editorial notes in them. I've cleaned them up, but some of it might have escaped my attention. This
leaves us with a corpus of 22 texts.
I will be a little less elaborate in my explanations than with the analyses on our first corpus. Some of the design
that's already discussed will not be explained again, and the interpretations already made will be discussed, but I

assume the reader has read the explanations of similar interpretations in previous analyses. The report will
continue to become more concise, but I suspect I'll also have more, and more complicated, results to report.
The first analysis on this new corpus has the same parameters as analysis 3.2, which I performed on the
previous corpus, where we first determined the culling. The graph shows two main branches: one clusters the
ZUID texts with Dorland and the text of Antonius Ghyselers, while the other contains the texts of Van der Noot
and two ZUID texts that crossed over to this branch as a pair. Authorship effects are present, as are signs of
stylistic similarity between texts from the same work, like the geuzenliedboek or any other book of songs.
German texts are spread across the graph, but Dousa's translation from Latin clusters with the two texts of the
Niederheinische liederen. The Wilhelmus again shares a branch with Coornhert's Ter liefden van een
Maghet and the Baanderherenlied, two songs that were each about half the size of the Wilhelmus and were put
together to avoid having texts even smaller than the Wilhelmus in the corpus. This intervention means
we have to remember that there are possibly two voices in the combined text. The texts might be by the same
author, but we can't forget that genre or other unknown effects may be active, causing a style not found in
either of the individual texts. Because there's no representation of each individual text, the differences between
training texts by the same author are disregarded and the stylometric measures extracted from the concatenated
file may be quite different in comparison to each of the original training texts.407 I've found research where
results showed that there is no substantial difference in attribution accuracy between a few chunks of 500 words
combined in one sample and a dozen concatenated chunks of 100 words. It suggests that in real attribution
studies, concatenated samples would display a very good performance.408 This was however an artificial
concatenation and moreover taken from a single document. I remain wary of the combined files. Heuiter, the other
undefined text, doesn't show any preference for a language.
For analysis 2 I change the type of features to character n-grams, to compensate for the shrinking number
of features when applying high levels of culling. I start with n-grams of 3 characters. The results show that
the Van der Noot branch switched back to the side of the ZUID texts, making a clean division between the ZUID
texts and all others, except for the text of Antonius Ghyselers, presumed Germanic, who joins the Flemish or
southern Dutch texts.
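As a brief illustration of this feature type, the Python sketch below extracts overlapping character n-grams and their relative frequencies. It is not the Stylo routine: the function names are hypothetical, and keeping spaces as ordinary characters is an assumption of this sketch.

```python
from collections import Counter

def char_ngrams(text, n):
    """Return all overlapping character n-grams of a string, spaces included."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def char_ngram_frequencies(text, n):
    """Relative frequency of every character n-gram in the text."""
    grams = char_ngrams(text, n)
    counts = Counter(grams)
    return {gram: c / len(grams) for gram, c in counts.items()}

trigrams = char_ngrams("wilhelmus", 3)
fourgrams = char_ngrams("wilhelmus", 4)
freqs = char_ngram_frequencies("van duytschen bloet", 3)
```

Because every word contributes several overlapping n-grams, and the same n-gram recurs in many different words, far more features survive high levels of culling than with whole words.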
While the division of the ZUID texts against the rest is now nearly perfect, it does raise questions about
the reasons why the texts of Van der Noot switch sides so easily. Although the graph now seems to confirm an
effect of the Flemish or southern Dutch dialect, it does this under very specific conditions. As we've learned
from the theory on the unmasking method, texts that cluster together under different feature sets, or different
parameters, are more convincing in their stylistic similarity. A lot of the effects are very delicate and thus not
very convincing, but this sensitivity is explained by the far from perfect condition of the corpus and the
dependency of the method on the corpus. It reinforces our awareness of the importance of the parameters. If

407 Stamatatos 2009.


408 Eder 2010, 9.

switching the basic parameters to highly similar, but different, ones changes the outcome so drastically, I've got
to fully understand the conditions that I'm changing and account for my considerations when preferring some over
others.
With these parameters, I refrained from choosing a specific level of culling but included all levels (with an
increment of 10), basically allowing the higher levels of culling to exert their influence again, and as a
consequence Van der Noot sides with the other southern texts again. This was also the case in previous analyses
where I included high levels of culling. The difference is that this time around I've got a lot more available
features at the highest rates of culling, because of my choice for character 3-grams. However, in the included
graph I've switched back to the culling of 30-50, with the same effects, implying it is the choice of n-grams that
pushes the Van der Noot branch back to the other ZUID texts. The other results are broadly the same, so effects of
language and author, and no clear signal for the Wilhelmus, although the anthem does cluster again with the
combined Coornhert text.
I performed the same analysis again but this time I set the size of the character n-gram to 4. The data of the
analysis in the R working screen verified the increase in the amount of features. This gives us the possibility of
more iterations of culling, so I went with the full scope of culling from 0 to 100%.
The graph of analysis 4 can be read as an indication of whether a feature size closer to word level shows
us a graph closer to that of the word-level analysis. Surprisingly, it doesn't. The graph shows us a lot more
detailed information than the word-level graph of the first analysis on this corpus did. There is a little cluster
of translated classical literature and a cluster of the two DUI texts with another KLAS text, but in general
neither the translated Dutch texts nor the presumed German texts seem to form any real clusters. Van der Noot stays
with the other southern authors. The Wilhelmus hasn't shown any sign of clustering with the ZUID texts.
The graph that belongs to analysis 5 shows us n-grams on word level, or collocations, discussed in the theory. I
won't discuss all their possible effects and advantages again, because in this analysis of these short texts of
about 550 words, collocations do not prove themselves useful. First of all, anything above a culling of 10 leaves
too little to analyse. Looking at the graph, the top 5 longest texts, Erasmus's Lof der zotheid, the anonymous
bekerigePauli Vlaams Sinnespel, Anna Bijns's Seer scoon boeck and Coornhert's boethius, show relations
with each other, along with the two parts of Niederheinische liederen. The rest of the graph shows no relations at
all and gives me no information on possible stylistic properties. I'll be hesitant to do further analyses using
collocations as features on corpora that involve texts of these lengths.
For analyses 6 and 6.1 I use sampling, while keeping character 4-grams as features. Based on previous results
and plain logic I choose 2 for analysis 6 and 1 for analysis 6.1 as the number of random samples.
Graph 6 shows that 550-word samples, with character 4-grams as features, do seem to interact and show stylistic
relations. Except for seven texts, namely Lof der zotheid, Boethius, one text of Hout, Heuiter, Marnix's 2 sonetten aan
lucas d'heere & in den duinen & De profetie van het lijden Christi, Van der Noot's De vrijagie ende het houwelyck and
again Antonius Ghyselers, the texts of my corpus form clusters. The bad performance of the Erasmus text,
Boethius and Heuiter can be attributed to their size, to the fact that they consist of texts with different styles, and
perhaps to less than perfect editing. The Marnix text actually consists of 3 texts that I combined into one file for the
same reason as the Coornhert conjunct text, namely their limited text size. Both parts of Coornhert,
however, stay close, which is encouraging, hinting at internal consistency. I can't explain Van der Noot and
Ghyselers at this point, and I'll have to dive into the texts, close reading them, in order to be thorough. The graph
doesn't show many other effects, except for the close relation of both Fruytiers texts.
Graph 6.1 shows the same pattern. A sample of 550 words is too small to suffice for analysis. This backs
the conclusions about sampling on the language corpus. When performing stylometry based on quantitative
analysis with the methods, design and tools I'm using, it is advisable to use texts or text samples larger than 550
words when searching for an effect signalling language or dialect.
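As an illustration of what drawing random samples of a fixed size amounts to, the following minimal Python sketch draws a given number of 550-word samples from a tokenised text. The function name `random_samples`, the fixed seed, and the choice to draw words without replacement within a sample are assumptions for this sketch; the tool used for the analyses may implement random sampling differently.

```python
import random


def random_samples(words, sample_size, n_samples, seed=0):
    """Draw `n_samples` random samples of `sample_size` words each.

    Assumptions for this sketch: words are drawn without replacement
    within one sample, and texts shorter than `sample_size` yield no samples.
    """
    if len(words) < sample_size:
        return []
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    return [rng.sample(words, sample_size) for _ in range(n_samples)]


# Toy text: the opening line of the Wilhelmus repeated to reach 800 words.
words = ("wilhelmus van nassouwe ben ick van duytschen bloet " * 100).split()
samples = random_samples(words, sample_size=550, n_samples=2)
print(len(samples), [len(sample) for sample in samples])
```

Because a random sample discards word order, it only preserves frequency information, which is one reason why small samples can be expected to carry a weak signal.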
4. Conclusions language corpus 2 Alle talen, enkel poez (DUI-NED-ZUID-KLAS)
The results of the analyses on the second corpus confirm the conclusions of the analyses on the language
corpus. We see authorial signals, stylistic compatibility of texts stemming from the same book of songs and a
possible language signal of ZUID texts.
The German hypothesis seems very improbable. The available texts to test this hypothesis already fall
short and the German corpus shows no sign of consistent stylistic similarity, apart from the very obvious
connection of the two texts of the Niederheinische liederen, two samples from the same author and out of the
same work. There might also be a stylistic effect of translated classical literature.
The Wilhelmus shows no clear signs for language. It again shares a branch with Coornhert's
Ter liefden van een Maghet & Baanderherenlied, a conjunct text, that is, a text heavily adapted by me, in this
case by putting two texts by the same author, with compatible genres, in one document and analyzing it as one
text. I use this procedure either to enlarge the text size or to reduce the number of texts by an author without
reducing the amount of text. This is a questionable practice and I should never forget the fictional nature of these
conjunct texts. They might be by the same author, but the consequences for any other effects possibly present, and the
complications of multiple voices in one text, are theoretically and practically unclear to me.
Some of the texts move around the graph a lot, and they make and break connections very easily when
I make small adaptations to the parameters. When performing analyses we should look for attributions that remain
stable under different experimental circumstances, not least different feature sets. This stability is
associated with the reliability of the attribution. The opposite is also true. At this point a lot of the effects are still to
be interpreted as part of the difficulty of creating big test corpora, the text size and the conditions under which we
do the analysis. Conclusions should be drawn cautiously, because of the delicate nature of the effects, the amount
of noise, and the ambitious goals I have set for my methods, with respect to the corpus and the flawed nature
of other testing conditions.
Analyses using collocations do not seem useful at this point.
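The idea of an attribution that remains stable under different experimental circumstances can be made concrete with a small Python sketch. It assumes that each analysis (one parameter set) yields a table of distances between texts, and it counts how often each text is the nearest neighbour of a target text. The function names and all distance values below are hypothetical and serve only as illustration; they are not results of my analyses.

```python
from collections import Counter


def nearest_neighbour(distances, target):
    """Return the text with the smallest distance to `target` in one run."""
    candidates = {other: d for other, d in distances[target].items() if other != target}
    return min(candidates, key=candidates.get)


def neighbour_stability(runs, target):
    """Share of runs in which each text is the nearest neighbour of `target`."""
    votes = Counter(nearest_neighbour(run, target) for run in runs)
    return {text: count / len(runs) for text, count in votes.items()}


# Hypothetical distances for three parameter sets (invented numbers).
runs = [
    {"Wilhelmus": {"Coornhert": 0.8, "Marnix": 1.1, "Fruytiers": 1.3}},
    {"Wilhelmus": {"Coornhert": 0.9, "Marnix": 1.0, "Fruytiers": 1.2}},
    {"Wilhelmus": {"Coornhert": 1.2, "Marnix": 1.0, "Fruytiers": 1.4}},
]
print(neighbour_stability(runs, "Wilhelmus"))
```

A share close to 1 would indicate a neighbour that persists across parameter sets; a text whose nearest neighbour changes from run to run would show several low shares.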

5. Analyses language corpus 3 DUI NED ZUID poez


In this third corpus of the language hypothesis, called DUI NED ZUID poez and consisting of 18 texts, I exclude the
texts marked as KLAS, the translated classical literature. There were indications of a signal of some sort,
perhaps remains of classical stylistics, perhaps just the signal of translations, or perhaps these were coincidences,
as the signal was far from clear and seemed to come and go as I changed the parameters. Measuring
translations of classical stylistics was never the goal of my thesis, so to clean up my corpus, in order to measure
the effects I'm looking for, I exclude the texts that are suspected of causing noise. If you're interested in effects of
translation, or specifically of translations from Greek and/or Latin: with my results I dare neither to reject nor to
assert that they showed any common effect. Erasmus's Lof der zotheid and Coornhert's Boethius seemed
stylistically similar though.
My first analysis goes back to the basic settings of words as feature type, culling appropriate to the number of features
and no sampling. The graph shows big divisions as well as smaller clusters. The ZUID texts cluster again with
Dorland and Antonius Ghyselers on one side of the graph. On the other side is the rest of the texts, which have now
become predominantly Dutch texts: texts with no accent, meaning the accent of the western provinces of Holland, coded
with NED. The graph seems to indicate a believable effect, a stylistic characteristic,
of ZUID texts, as the texts are from different authors and included in different books, but repeatedly cluster
together over several corpora and under different parameters. The odd one out is Antonius Ghyselers, who
clusters with Dorland, an author who theoretically could have both German and Flemish influences, judging by
his biography. The other DUIT texts that cluster together are on the opposite side, but these come from the same
book, so they are not immediately assumed to express a signal based on dialect or language, but rather on work or
author. The smaller effects are most of the time authorial signals: Van der Noot, Fruytiers and Marnix each cluster with
their own texts. The Wilhelmus clusters again with Coornhert. The other open case, the Nederduitse orthografie,
positions itself next to Hout.
In the second analysis I use character 3-grams and character 4-grams, both with a culling appropriate to their
number of available features. Both had enough features to cull at a level of 100%, and both retained over
1000 features even when the features had to occur in at least 60% of the texts in the corpus. I've
compared the results, using these levels of culling, with levels from 0% going up to 100%, but the differences
were negligible. The increment was 10 in all cases, and I will use this increment consistently. Graphs 2 and 2.1
both show roughly the same effects as the graph from analysis 1 on this corpus. The branch with the Van der
Noot texts shifts even further to the ZUID side.
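Culling with a fixed increment can be sketched in a few lines of Python. The sketch assumes the definition of culling that is common in stylometry tools, namely that culling at p% keeps only those features that occur in at least p% of the texts in the corpus. The function names `cull` and `culling_series` and the toy frequency counts are hypothetical and only serve as illustration.

```python
from collections import Counter


def cull(corpus_counts, min_share):
    """Keep features that occur in at least `min_share` (0.0-1.0) of the texts."""
    n_texts = len(corpus_counts)
    doc_freq = Counter()
    for counts in corpus_counts.values():
        # Count each feature once per text in which it actually occurs.
        doc_freq.update({feature for feature, c in counts.items() if c > 0})
    return {f for f, df in doc_freq.items() if df / n_texts >= min_share}


def culling_series(corpus_counts, start=0, stop=100, increment=10):
    """Feature sets for every culling level from `start` to `stop` percent."""
    return {p: cull(corpus_counts, p / 100) for p in range(start, stop + 1, increment)}


# Toy corpus of three texts with invented feature frequencies.
toy = {
    "text_a": Counter({"de": 3, "het": 1}),
    "text_b": Counter({"de": 2, "ende": 1}),
    "text_c": Counter({"de": 1, "het": 2}),
}
for level, features in culling_series(toy).items():
    print(level, len(features))
```

Inspecting the number of surviving features per level in this way shows up to which culling percentage enough features remain for an analysis.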
6. Conclusions language corpus 3 DUI NED ZUID poez
The results on this corpus confirm the results of the previous analyses on different corpora. There's now a basis for
accepting the hypothesis that texts from southern regions of the Netherlands, or texts influenced by these regions,
show a common stylistic pattern that is distinct from the rest of the corpus, measurable by my methods and
distinguishable in the graph. This effect shows when the corpus is properly prepared and other effects are
controlled for, so that these won't blur the stylistic similarity of the ZUID texts. The biggest division is between, on
one side, five texts coded as ZUID, one text of Dorland, with probably heavy southern influences, and one text of
Ghyselers, who clusters with the ZUID texts for reasons unknown, and, on the other side, the rest of the corpus. I
should begin to consider Ghyselers as a possible ZUID text.
I give up on testing the German hypothesis. There is no reason to assume any type of effect on this
corpus of DUIT texts. I consider this a flaw of the corpus, as it's insufficient to test this hypothesis, so I will not so
much reject the hypothesis as leave it unanswered. I'll discuss this in the conclusion section of the whole
language corpus, as dropping this hypothesis is the product of all the previous analyses on the different language
corpora.
So far the Wilhelmus has shown no relation to any other dialect than the hypernym NED.
7. Analyses language corpus 4 DUI NED ZUID poez
Having given up on the German hypothesis, I exclude these texts from my corpus, giving birth to the last language
corpus, called DUI NED ZUID poez, counting 15 texts. The goal of this corpus is to test the ZUID hypothesis,
that the Wilhelmus has southern influences, and to determine if it can be attributed to the ZUID texts. If so, it
would suggest that the Wilhelmus is of southern origin.
In the first analysis I go back to the basic settings of words as features with proper percentages of culling and
increment. The graph shows a big division between the ZUID texts and the other texts, which are now all NED texts or
open cases. Dorland clusters with the ZUID texts and Heuiter with the NED texts, specifically with Hout. I
conclude that Dorland has southern influences and Heuiter does not. I will not go as far as to say that Heuiter is
Dutch without any stylistically noticeable dialect, as the distinction between the two regionally based styles
obviously comes from the clustering of the Flemish or southern texts. I come to this conclusion based on the previous
corpora, which included texts with other dialects or stylistic influences based on languages; these
predominantly clustered with the NED texts, or more specifically away from the ZUID texts, while the ZUID texts
consistently clustered together over several different parameter sets, with a few exceptions. Also, conceptually,
while ZUID stands for a region, for which we test the assumption that texts from this region have some
distinguishable stylistic signal, NED stands for an accentless language, meaning what DUIT or ZUID texts are
not. Based on the results as well as the initial categorization of texts, I conclude that the common stylistic effect of
the ZUID texts is indeed present, while this same effect for NED texts is not established. The Wilhelmus shows,
according to my results, no southern influences. It consistently clusters with a conjunct text of Coornhert, an author
whom I assume to be without southern influences.
In analyses 2 and 2.1 the features used are character 3-grams and character 4-grams. Both graphs show less
coherence on either side of the big division than the graph of analysis 1 did, but the results stay roughly the
same. In graph 2.1 there does seem to be a division in the NED branch, something which has not occurred up
until now, between Marnix, Hout and Heuiter on one side and Fruytiers, Coornhert and the
Wilhelmus on the other.
For a critical interpretation of these results I consider characteristics of the texts other than those considered up until
this point. I find that the largest, most primary division could also be explained by text size, although such an
explanation would be a lot less distinct and involve many more deviations than an explanation based on language. It
turns out that the ZUID texts are, most of the time, the texts with the biggest text size, and Coornhert and the Wilhelmus,
the two texts that clustered repeatedly, are the texts with the smallest text size. However, there are still enough
exceptions to challenge an explanation based on text size. Heuiter, for example, is a very large text, but clusters
with the NED texts. Also, the size of the ZUID texts ranges from 3 times the Wilhelmus up to 10 times the
Wilhelmus, so the differences in size within the branch where most of the large texts reside are bigger than the
difference in text size between the branches. Some of the authorship effects, like Van der Noot, shouldn't be
present if text size were the primary determinant. Still, these results can't be ignored and I'll need to control for any
possible size effects.
To cancel out the text size, I merged both texts of Fruytiers into one file, and did the same for both texts
of Marnix. I also added another Coornhert text, Boeventucht, to the already conjunct file of Coornhert. In addition
to this I deleted 80% of Van der Noot's Himne of Loft-sangh van Brabant, sizing it down from 24 kb to 5 kb in txt
format, roughly the size of the Wilhelmus. With these corrections to my corpus, I perform the same analysis as
2.1 once over.
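The two corrections described here, merging texts by one author and cutting a long text down to roughly the size of the Wilhelmus, can be sketched in Python as follows. The helper names are hypothetical, size is measured in words instead of kilobytes, and keeping the opening words of a text is an assumption of this sketch: which part of the Van der Noot text was deleted is not specified above.

```python
def merge_texts(*texts: str) -> str:
    """Concatenate several texts by one author into a single conjunct text."""
    return "\n".join(text.strip() for text in texts)


def truncate_words(text: str, target_words: int) -> str:
    """Keep only the first `target_words` whitespace-separated words.

    Assumption for this sketch: the opening of the text is kept.
    """
    return " ".join(text.split()[:target_words])


def size_ratio(text: str, reference: str) -> float:
    """Size of `text` relative to `reference`, measured in words."""
    return len(text.split()) / len(reference.split())


# Toy stand-ins for a short reference text and a text five times as long.
reference = "woord " * 550
long_text = "woord " * 2750
cut_text = truncate_words(long_text, len(reference.split()))
print(size_ratio(long_text, reference), size_ratio(cut_text, reference))
```

Reporting the size ratio before and after a correction makes it easy to check whether the corpus has actually become more balanced.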
In graph 4.1, Coornhert still clusters with the Wilhelmus, even after it has been transformed into one of the longest
texts of the corpus. The Fruytiers document and the Marnix document show the same relationship towards each other
and the other texts, and take up roughly the same position as before. The relatively long text of Dorland
(50 kb) does not cluster with the other long texts, being Coornhert, Anna Bijns, and the anonymous sint paulie,
but with the much smaller Van der Noot (20 kb) and Uytenhoven (15 kb). Anna Bijns's text and begeerige pauli, two
very large texts, cluster together with another anonymous text, schandaleuze spelen, which isn't very big. These
results can be explained by similarities in dialect, but not by similarities in size. In general, graph
4.1 shows us the familiar division of the ZUID texts on one side of the graph and the texts that are not ZUID on the
other, with the exception of the adapted text of Van der Noot, which has switched over to the NED side. Based on
the assumption that the signal that divides ZUID from the rest is largely based on common stylistics of the
southern texts and not on the common stylistics of the Dutch and other texts, Van der Noot apparently no longer
shows enough resemblance to the stylistics of ZUID texts after its mutilation, and is therefore pushed away.
Apparently the sizing down of Van der Noot obscures the stylistic characteristics it showed before. I should take
this into consideration when reviewing the results of the Wilhelmus text and my methods.
Graph 4.2 shows roughly the same effects as graph 4.1. When switching to character 4-grams in
analysis 4.2, the adapted Van der Noot text again has another text of Van der Noot as its nearest neighbour;
however, they're still positioned on the opposite side of the graph from the ZUID texts. Also, both Van der Noot texts do
not form a branch of their own, but are only nearest neighbours.
8. Conclusions language corpus 4 DUI NED ZUID poez
I conclude that there's a signal for language in the form of a stylistic resemblance among Flemish and/or Dutch
texts from the southern regions. I corrected for size and this effect was still present. Up until my revision of the
DUI NED ZUID poez corpus I had attributed this stylistic signal to dialect or language. After critically reviewing
other explanations and correcting for text size, that interpretation turned out to be correct.
Text size does not account sufficiently for this signal.
Based on the results of my transformed fourth corpus, I do however conclude that the stylistic signal of
a text changes significantly somewhere under a thousand words. Sizing down obscures stylistic
characteristics regarding language, but has less effect on the authorship signals, which are present in this corpus and
remain so, although less obviously, when the text size gets significantly under a thousand words. In addition to this,
when raising the level of culling, the authorship effects that were lost or dimmed because of a small text size
seem to come back.
When performing analyses on this corpus, I have to be aware of the limited text size and possibly correct for
it, or alter and prepare my corpus because of it. Cutting texts obscures all types of stylistic markers present in
them, while deleting all small texts or combining texts in order to increase the text size is in my case hardly a
workable option if I'd like to keep a balanced corpus, which includes different authors represented by multiple
texts with different characteristics, and if I'd like to avoid combining texts in order to keep the voices of my texts
singular and their signals authentic.
I conclude that Dorland has southern influences and Heuiter does not. I'll not go as far as saying that Heuiter
writes Dutch without any stylistically noticeable dialect, as I deem the stylistic similarity of the NED texts not proven,
while the clustering of ZUID texts has been convincing over four corpora and all the analyses using several
different parameter sets, exceptions excluded. As explained, the assumption of the NED or Dutch texts as a
dialect or version of the Dutch language is conceptually problematic, while ZUID has a better theoretical ground.
Therefore I interpret consistent clustering with NED texts as not-ZUID, meaning under no measurable
stylistic influences from languages of the southern regions of the Netherlands, like Flemish.

Conclusions language corpus:


Here I report the conclusions drawn during my analyses of the language corpora, elaborate on these analyses and
answer the hypotheses I designed these corpora for. The last two conclusions will be methodological conclusions.

- The results of my analyses show that there are stylistic effects, based on the difference in style between variations
of the same language, for example dialects or other regional determinants, present in my corpus. They can be
measured by my methods and can be demonstrated in a visualization. Based on the results of this research I
conclude that texts of a similar dialect are expected to have a stylistic resemblance. I've only satisfactorily tested
this for texts from, or under the influence of, the formerly southern provinces of the Netherlands, for example
Vlaanderen. I refuse to generalize these findings to other dialects yet, because one swallow does not yet a
summer make.409 I can only suspect the possibility that similar effects are present for other dialects or linguistic
regions. Recommendations for further research are made in the sections discussion and future research.
- When doing analyses on short texts, referring to texts of around a thousand words or less, we have to take the size
into account and if possible correct for it, as text characteristics will become less visible, more noisy and
sometimes disappear altogether. Preparing the corpus is a big part of controlling for text size effects. However,
cutting texts too much, going under a thousand words, obscures all types of stylistic markers present in the text.
- There is no indication that the Wilhelmus has a southern or Flemish origin.
- The German hypothesis has not been sufficiently tested. I fell short on testing this part of the language sub
hypothesis for several reasons. First of all, there was too little readily available text for a decent representation in
my corpus, mainly because of the other restrictions I placed on the texts, like size, area and time of creation and a
qualitative and temporal closeness to the first edition. Another reason for the low quantity and quality of my DUI
texts was that, while working under the time pressure of deadlines, on a sub hypothesis, the selection of texts of
this type took very long, because biographical knowledge of a text or author was seldom available, while I had
limited knowledge of these types of texts myself. Combined with the variety of different types of German
influences, it took an enormous amount of time and effort, reading German secondary literature for example, to
understand the stylistic nature of these texts and to determine if they should be included in my corpus. In my
search for texts that would qualify as DUIT I've come across, among others, German texts, texts that were
Hoogduits and texts from the upper regions of the West-German border that were paradoxically called
nederduits or Nether-German texts. Current borders, national or linguistic, weren't there in the 16th century; there
was rather a language continuum from the eastern provinces of the Netherlands up to the Baltic states.410 The
diversity is a problem if you're building corpora from the Renaissance, because of the limited paper and digital
availability of these types of texts. Further research would benefit from defining and determining the
diversity of this dialect, in order to make the categories operational.

409 Presumably by Aristotle (384 BCE - 322 BCE):


410 Paul Wackers, Methodologische overwegingen bij het Repertorium van het Nederlandse lied tot 1600 in De fiere
nachtegaal: het Nederlandse lied in de middeleeuwen, eds. Louis Peter Grijp and Frank Willaert (Hilversum: Verloren, 2008),
356.

- Of the five texts, or test cases, without a definite language mark that I've tested alongside the Wilhelmus,
namely De uilenspiegel, Dorland's van homulus, Heuiter's Nederduitse orthografie and Vaernewyck's histori
Belgis, I can only determine Dorland as a ZUID text.
- The first methodological conclusion I draw is that, when performing normal sampling with a sample size of 525,
samples of larger texts will cluster together and therefore determine the main branches of the graph, because
these are determined by the number of relations and the strength of the relations, and samples of the same
large text will have both. Texts significantly larger than others will make up too big a part of the total number of
samples and will therefore obscure smaller effects. I advise a balanced set, especially for text size, when using
normal sampling and in general, because we've seen that the results are influenced by the size of the
texts. The importance of this increases as the texts grow short in absolute terms. This also leads me to conclude
that 550 words is a very short sample size and text size to use for analysis. Subtle effects will be more present in
the graphs when using bigger texts or samples.
- A second methodological conclusion concerning sampling is that, when using random samples of 525 words, the
graph gives far less information and expression than when using the full size of all the texts, even when the corpus is
very imbalanced regarding the number of words. These findings lead me to conclude that a text of somewhere between
500 and 550 words is an inappropriate sample size for quantitative stylistic analyses, is disruptive of the results and
should be avoided if possible. Although the stylistic analysis of texts this short might be difficult and even
problematic, it's definitely not without results. Language and authorship signals, among others, are present and
visible.
These methodological conclusions conform to my expectations. This of course also means that the
Wilhelmus is far from an ideal text to perform authorship attribution on. However, effects seem to be present and
some are visible, and can, with further research, lead us to conclusions.
For my hypotheses this means:

There is, according to the results of my analyses, a stylistic effect based on languages, dialects or any
other kind of accent or influence stemming from differences in language, present in my corpus. This
hypothesis is accepted.

There is, according to the results of my analyses, a stylistic effect based on the Southern dialect,
concerning Flemish or Dutch from the southern regions of the Netherlands, present in the texts of my
corpus. This hypothesis is accepted.

There is, according to the results of my analyses, a stylistic effect based on the Eastern dialect that
stems from influences of the German language, Hoogduits being the most prominent one, present in the
texts of my corpus. This hypothesis is neither accepted nor rejected, but insufficiently tested.

There is, according to the results of my analyses, a common stylistic effect in the Dutch texts of my
corpus that are translations out of Latin or classical Greek. This hypothesis is neither accepted nor
rejected, but insufficiently tested. Results seem to point to a possible stylistic effect of either translations
or remains of a classical language.

There is, according to the results of my analyses, a stylistic signal of a language or accent other than
Dutch present in the Wilhelmus. This hypothesis is rejected.

Genre specialized corpora


With the bird's-eye view, poetry and prose seemed to move in different clusters. I'd like to test these and other
effects based on genre by analyzing smaller and cleaner corpora, specifically composed to test my
hypotheses. The second specialized set of corpora is that of the genre hypotheses, divided into the following sub
hypotheses.

There is, according to the results of my analyses, a stylistic effect based on type, genre or any other kind
of topical distinction coming from differences in language, present in the texts of my corpus.

There is, according to the results of my analyses, a stylistic difference between the prose and the poetry
of my corpus.

The Wilhelmus is, according to the results of my analyses, stylistically more similar to poetry than to
prose.

There is, according to the results of my analyses, a stylistic difference present, between the songs and
the poetry of my corpus.

The Wilhelmus is, according to the results of my analyses, stylistically more similar to songs than to
poetry.

There is, according to the results of my analyses, a stylistic effect based on a common type or any other
kind of topical characteristic of the beggars' songs or geuzenliederen, propaganda songs, songs of
comfort or troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen.

The Wilhelmus is, according to the results of my analyses, stylistically more similar to one of the different
song genres, namely the beggars' songs or geuzenliederen, propaganda songs, songs of comfort or
troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen.

The 0-corpus consists of 42 texts. I've included a lot of prose, coded as PROZA, which is the Dutch word for
prose, and a lot of poetry, coded as POEZIE, which is the Dutch word for poetry. Psalms are a separate type,
coded as PSALM, because I'd like to account for any possible effects based on their content or form, since the
bird's-eye view indicated that psalms might be a stylistically distinct group.

For now, I put songs and poetry under the same banner of POEZIE, sometimes with an addition of
SONG or LIED, meaning and indicating song, at the end of the code. The Wilhelmus is a song, so my focus lies
on including similar works, preferably songs, in my corpus. At the same time, songs on paper and without
melody are very similar to poems. The distinction between (pop)songs and poems is a modern one, and not
applicable to the sixteenth century. I'll try to account for the distinction between songs and poems later on.
One text is uncertain in its nature, namely Coornhert's Boeventucht, coded as
00_coornhert_Boeventucht PROZA-POEZIE. It's unclear whether to call it prose or poetry.

1. Analyses genre corpus 1 Alle genres LIED, PSALM, PROZA, POEZ


For the first analysis on this corpus I use very basic parameters, like features on word level and culling with no
limitations on the percentage. Based on the available features I determined the different percentages of culling
for analysis 1.1, from 20% up to 90%. The graph shows us no notable big divisions. This was expected given
the number of texts and the many different possible factors influencing the results. Clusters of exclusively prose or
exclusively poetry are present, and so are authorship effects for Fruytiers, Datheen, Willem van Oranje,
Coornhert, d'Heere and Marnix. Coornhert's Boeventucht acts like prose by clustering with prose from Hout,
d'Heere and Marnix. The psalms don't cluster together as much as I'd expected them to after the bird's-eye view.
Lucas d'Heere's psalmen davids don't cluster with the other psalm Davids but with Coornhert's book of songs,
which does follow the verse of psalms, and the songs of praise by Hout. We see the texts from southern regions of
the Netherlands group together again, confirming my conclusions on the previous hypotheses.
Analyses 2 and 2.1 are on character 3-grams with different culling. Results don't change much between 2.0 and 2.1,
but they do compared to the first analysis.
This graph suggests a very strong and convincing effect between prose and poetry. The biggest division is
between prose and poetry. The only prose text that sides with poetry is a letter of Datheen. The open case of
Boeventucht by Coornhert clusters with 5 other prose texts, but not with other prose from Coornhert himself, which
clusters in another branch. The prose cluster where Coornhert's open case is clustered consists entirely of citations or
recitations of speeches and lectures. This might indicate a genre effect more subtle than prose vs. poetry.
On the poetry side of the graph are 3 branches: a Flemish branch, a branch with predominantly texts by
Lucas d'Heere and a branch with a lot of short texts. The first two clusters are thereby already described pretty
accurately and can be explained by a language effect and an authorship effect respectively. The third cluster has
quite a few short texts that cluster in a fourth branch, which can be called the leftover branch. It shows no clear
signal and there's very little distinction in the branch itself, except for maybe the psalm Davids clustering within the
branch. Two psalm texts do not cluster, Marnix's songs of praise and Coornhert's book of songs, which are both
explainable because they're not exclusively or wholly made up of psalms.
There are authorial signals, but also texts from the same author that don't cluster; these cases are most of
the time explainable by the interference of other effects. An exception to this rule is the scattering of the Fruytiers
texts over two separate clusters.

Analysis 2.2 uses character 4-grams and shows us mostly the same results, especially the big
divisions. The smaller branches are all gone. Looking at nearest neighbours more than at different branches, the
relationships of the texts are mostly the same. The biggest difference is that Datheen's letter now joins the prose
and at the same time the anonymous niederheinische liederen handschrift tags along and shifts over to the
prose cluster. Coornhert's Boeventucht sides again with the prose texts. The four apologia are still close to each
other.
The other side of the graph becomes a lot more diffuse. Authorial signals are still visible but the rest of
the effects scatter over the right side of the graph. There is still some clustering of the psalms, as d'Heere joins the
other psalms. Small texts and Flemish texts seem to cluster on these characteristics. The 4-grams seem to
confirm the main effects but blur out the smaller effects.
To clean up this corpus I excluded the Flemish texts and the texts that are smaller than or equal to the Wilhelmus. I'm
down to 33 texts and this means more visibility. Deleting the clusters whose effect we'd established, but which are not
the aim of these analyses, puts more weight on other possible effects within the corpus.
The same analyses were performed again. Prose and poetry each form their own cluster; this big division is
especially visible when using character n-grams as features. Over the whole scope of the corpus authorship
effects are visible. Coornhert's boeventucht clusters with prose again. Psalms also seem to cluster together, at
the very least the Psalmen Davids. Important differences with the previous results are the clustering of the
Wilhelmus with Coornhert's prose, which it does twice, and with the prose of the prince, which it does once.
The Coornhert texts are very large (24 KB in txt format, or 3871 words, and 22 KB in txt format, or 3552 words), and both
prose texts of Willem van Oranje are 9 KB in txt format, which is about 1500 words, so the attribution cannot be
explained by text size. When interpreting the graph up close you'll notice that all relations to the prose are just
nearest neighbour relations. These effects, occurring sporadically, are perhaps not that convincing. The
Wilhelmus seems to move around easily, which is a concern.
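What a nearest neighbour relation amounts to can be sketched as follows. This is an illustration under stated assumptions: the distance measure (Manhattan distance over relative feature frequencies) is chosen here for simplicity and is not confirmed as the measure used in these analyses, the function names are hypothetical, and the profiles are invented toy values, not data from the corpus.

```python
def manhattan(a, b):
    """Manhattan distance between two frequency profiles (dicts: feature -> relative frequency)."""
    features = set(a) | set(b)
    return sum(abs(a.get(f, 0.0) - b.get(f, 0.0)) for f in features)


def nearest_neighbour(target, profiles):
    """Return the name of the text whose profile is closest to the target text."""
    others = (name for name in profiles if name != target)
    return min(others, key=lambda name: manhattan(profiles[target], profiles[name]))


# Invented toy profiles, for illustration only.
toy_profiles = {
    "text_a": {"de": 0.05, "en": 0.03},
    "text_b": {"de": 0.05, "en": 0.02},
    "text_c": {"de": 0.01, "en": 0.08},
}
print(nearest_neighbour("text_a", toy_profiles))
```

A nearest neighbour always exists, even when no text is really close, so a single nearest neighbour link is weaker evidence than membership of a stable branch.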
Analysis 4 is again a collocations analysis, with 3-grams on word level. Culling on high levels was impossible
because of the limited number of features. Both the choice of collocations and that of 0% culling raise the
possibility of measuring semantic information rather than stylistic information.
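Why word-level 3-grams leave so few usable features can be seen from a small sketch. It assumes simple whitespace tokenisation and lower-casing, which is an assumption for illustration; the function name is hypothetical.

```python
from collections import Counter


def word_ngrams(text, n=3):
    """Count overlapping word n-grams (collocations) using whitespace tokenisation."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


trigrams = word_ngrams("een prince van oranje ben ick vrij onverveert", 3)
# Every 3-gram in a short passage tends to occur only once,
# so few 3-grams are shared between texts.
print(len(trigrams), max(trigrams.values()))
```

A sequence of three words is far more specific than a single word, so most 3-grams appear in only one text; any culling threshold above 0% then removes nearly all of them.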
The bottom branch combines 5 small poetry texts with one large poem, Coornhert's book of songs, and
prose of medium size by Hout and Willem van Oranje. The analysis provides little information. All authorship
signals except those of Fruytiers and d'Heere are as good as gone. The psalms tend to cluster in one branch,
along with a few other big texts. Prose and poetry cluster again, seemingly more than average, but the results are
not convincing enough yet to draw conclusions from. The graph as a whole is too fragmented and shows too little
detail. The poor results, the possible bias towards semantics, and the overall shape of the graph lead me to reject
these results in my overall analysis.

We've already seen that sampling with just 550 words gives poor results. Therefore I removed all texts smaller
than 1500 words, including the Wilhelmus, and set a sample size of 1500 words in order to test genre or type
effects. Analysis 5 shows us clear authorship effects and stylistic distinctions between poetry and prose.
Psalms seem to form a group together, and although defining psalms as a genre might actually be the practice of
interpreting results so that they fit my hypotheses, there's without a doubt a common, possibly topical, stylistic
trait in the psalms. Note that a lot of the psalms are Psalmen Davids. They're different parts of the text, the first 10
psalms of one author and the next ten of another, so we're not just confirming different editions. None of the
Psalmen Davids overlap in text. Boeventucht shows itself again as a prose text. The graph of analysis 5
confirms my findings of analyses 1 to 4. When I add the Wilhelmus it clusters with the psalms.
Bringing down the sample size to 1100 words in analysis 6 shows us roughly the same results, but the
graphs are more fragmented and less informative. The clusters are less distinct. This graph shows how signals
fade when the size of the samples or texts falls below a threshold.
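The consequence of a fixed sample size for texts of different lengths can be made concrete with a small sketch. It assumes that sampling means cutting a text into consecutive, non-overlapping samples of a fixed number of words and discarding the remainder; that convention is an assumption here, and the function name is hypothetical. The token lists are placeholders that only reproduce word counts mentioned in this thesis (3871 words for the larger Coornhert text, 551 words for the Wilhelmus).

```python
def split_into_samples(words, sample_size):
    """Cut a token list into consecutive, non-overlapping samples; the short remainder is dropped."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    n_full = len(words) // sample_size
    return [words[i * sample_size:(i + 1) * sample_size] for i in range(n_full)]


coornhert_like = ["w"] * 3871  # placeholder tokens, only the length matters here
wilhelmus_like = ["w"] * 551

print(len(split_into_samples(coornhert_like, 1500)))  # 2
print(len(split_into_samples(coornhert_like, 1100)))  # 3
print(len(split_into_samples(wilhelmus_like, 1500)))  # 0
```

Under this convention a text shorter than the sample size contributes no sample at all, which is why texts below 1500 words had to be removed before sampling at that size.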
2. Conclusions genre corpus 1 Alle genres LIED, PSALM, PROZA, POEZ
So far there seems to be a clear stylistic difference between prose and poetry. This effect already showed itself in
different versions of the corpus and with different sets of parameters. I expect to see this effect in other corpora as
well. There are clues that there is also an effect for psalms, but this hypothesis has not been tested enough.
We've seen small texts cluster and also Flemish texts group together, and have excluded both groups to get rid of
their noise. Coornhert's boeventucht, although partly poetry, is stylistically identified as prose. Other, perhaps
smaller, genre effects, like that of the speeches, are only visible sporadically. The Wilhelmus doesn't seem to
relate stylistically to other songs and moves around the graph easily, which is a concern. The corpus also
confirmed the conclusions of the language analyses. There were author signals and language signals for Flemish
texts. Analysing samples of 1500 words showed good, clear results. Bringing down the sample size to 1100 words
definitely did some damage to the visibility of the presumed effects in the graph, but it did show us roughly the
same results, only more fragmented.
3. Cancelling hypotheses
Based on the previous corpus and the corpora of the language hypotheses I abandon the attempt to make further
distinctions between poetry and songs, as well as my initial plan to test 6 different types of songs. The problem
with the dropped hypotheses is that they require categorization based on very ambiguous characteristics.
I've already mentioned that the primary distinction between poetry and song, although obvious in the
here and now, is rather blurred in the 16th century, since both depended heavily on oral transmission, and the
difference lies predominantly in the way it's supposed to be performed. The importance of making this distinction
is pretty low, since we already know that the Wilhelmus is a song. On the other hand, if there's a stylistic
difference between poems and songs, I should be aware of this, so I can take it into account when measuring
authorial signals. My corpus, however, consists of such a large share of songs that it's not methodologically valid for
my research. I'll pay attention to the behaviour of the poems, so if they do give off a stylistic signal, which they
haven't in the slightest so far, I'll pick it up and reinstate this hypothesis.
Testing the different song genres would, apart from the songs with a genre indication in their name, require the
researcher to define measurable characteristics per genre and to allocate the texts according to these standards,
which won't be mutually exclusive. These measurable characteristics should come from a theoretical framework, or
some general consensus, on why they are typical of a certain genre. Remember the genres with which the Wilhelmus was
labelled; they were sometimes pretty analogous. When the genres were opposites, they were so with respect to
their goal, forgiveness or incitement, but still similar enough on some levels to both be attributed to the same
song. Even with the consumption of huge quantities of secondary literature, this will amount to vague or
ambiguous categories, probably highly context dependent.
Then there's the availability of texts. Some of the genres are pretty rare, if real at all. The different
genres or functions of these songs are part of the Wilhelmus research discipline, determined after the conception
of the song. Their labels were not pre-existing genres from which the still anonymous poet of the Wilhelmus could
pick. It's not certain that there's a conceptual difference between, for example, songs of praise and songs of
encouragement and the other types of songs, some of which are very difficult to translate (pardoenliederen). To
determine stylistic effects over different genres of songs, one has to have a justifiable corpus of every genre.
Assuming these genres are stylistically different and measurable, the Wilhelmus would have to actually be one of
the types, and be long enough to signal the stylistic properties of the genre. Knowing now how sensitive and
delicate the effects in this corpus are, we would need to exclude other effects like time and place of birth, author,
size, dialect and so on. I've complained several times about the difficulties of constructing the relatively easy
corpora I'm testing now. This alerts me to the whale of a task that forming the 6-genre corpus must really be, if
possible at all. I'll let go of this hypothesis.
Now that I've cut some hypotheses and specified my corpus a bit further, it's time to reaffirm the goals of my
analyses. I aim to test 3 things on the next 3 corpora: whether prose and poetry have distinctly different stylistic
signals, whether psalms and prose have distinctly different stylistic signals, and whether psalms and other poetry
have distinctly different stylistic signals.
4. Analyses genre corpus 2 poezie vs. proza
First, I test stylistic differences between poetry and prose. For the genre corpus 2 poezie vs. proza, now
consisting of 14 texts, I exclude the psalms along with the earlier excluded Flemish texts and the texts shorter than
three times the Wilhelmus. The psalms will be added later on, when I'll try to measure the effects of the psalms
against those of prose or poetry. The first corpus, however, will focus on the effects of prose against poetry, and
therefore the psalms are excluded. I perform different analyses, varying features and culling, and always sampling
with a sample size of 1500 words.

All three analyses show effects. Poetry and prose seem to have a different and distinguishable stylistic signal,
which causes prose and poetry texts to cluster in their own group and divides the corpus into a prose and a
poetry section. This effect is strong enough to establish itself at a sample size of 1500, thereby not only verifying
the existence and strength of the signal but also indicating that the sample size is adequate.
These results come as no surprise, as we've seen the distinction between prose and poetry throughout
all my analyses, and during the tests on my genre corpus it definitely established itself as a pattern. There are some
aberrations from the pattern, but this is also to be expected. However, the recurring split of the corpus
between prose and poetry, over all kinds of varying parameters and on different corpora, one specially
designed to test this effect, gives me the confidence that there is indeed a stylistic difference between poetry and
prose, that my methods can and do pick up on this effect, and that a sample of 1500 words is big enough to show this
effect.

5. Analyses genre corpus 3 proza vs. psalm


I now perform the same tests I performed on the genre corpus 2 poezie vs. proza, this time on the genre corpus 3
proza vs. psalm. This corpus is constructed in order to determine the differences between prose and psalms, so it
excludes the poetry and the songs that are not psalms, coded as LIED_POEZ. The Flemish texts and the texts
shorter than three times the Wilhelmus are again excluded, while the psalms are included again.
Psalms and prose seem to have a different and distinguishable stylistic signal, which causes both groups to
cluster in their own group and divides the corpus almost perfectly into a prose and a psalm section, if I
consider Coornhert's boeventucht as prose, which I do by now. This effect is strong enough to establish itself at
a sample size of 1500.
On only one occasion was the psalm-prose distinction overruled by another effect, when Boeventucht,
Coornhert's prose with poetic aspects, and Coornhert's book of songs, which uses the musical rhythm of psalms
as its own, cluster in a separate tree. Here the least typical texts of both groups, being of the same author, are
drawn to each other. I conclude that it's the authorship signal in both texts that overrules the genre effect in its
least stereotypical members.
However, the recurring split of the corpus between prose and psalms, over all kinds
of varying parameters and on different corpora, one specially designed to test this effect, gives me the confidence that
there is indeed a stylistic difference between prose and psalm, that my methods pick up on this effect, and that a
sample of 1500 words is big enough to draw it out.

6. Analyses genre corpus 4(.b) poezie vs. psalm


I'll perform the same tests as on the previous two corpora on the genre corpus 4(.b) poezie vs. psalm. This corpus is
constructed in order to determine the differences between poetry and psalms, so I've excluded the prose texts
and included the poems and the songs. After the first analyses, an imbalance in my corpus polluted my graph, so
I adjusted my corpus again. There were too many texts of the same authors, thereby generating a lot of
authorship effects, while others were represented by only one or two texts. Especially Lucas d'Heere but also
Johan Fruytiers were too prevalent in my corpus; therefore I combined the 2 smallest poems/songs of d'Heere in
one file and the 2 smallest poems/songs of Fruytiers in another file. To make sure author effects can't overrule
possible genre effects, I also exclude d'Heere's psalms from my corpus. I defined Coornhert's book of songs,
which must be performed to the melody of the psalms, as an open case, as it's not a stereotypical psalm. On a
corpus without prose, balanced for text length and compensated for possible authorial signals, I perform an analysis
that intends to capture possible stylistic differences between psalms and poetry.
The graph shows that the biggest division is between psalms and poetry; however, the number of texts in
the corpus has declined significantly, causing the honour of being the biggest division to depend on the margin of
just one text. The margin would be bigger if Coornhert's book of songs were categorized as songs/poetry.
Fruytiers' Ecclasius repeatedly joins the psalms. Although the results are less convincing than those of the
two previous analyses, between prose and poetry and between prose and psalms, there seems to be a tendency
for psalms to cluster together. Based on these and previous results I conclude that, although the distinct stylistic
signals of poetry and psalms are not yet sufficiently proven, research performed with a design, goals and text
size similar to my own is advised to consider these effects as probable. I'll definitely follow my own advice in my
authorship attribution experiment. I dare to speak of a delicate but conceivable stylistic effect for the genre psalm
in comparison with other poetry and songs, which manifests itself with specific methods under well-tuned
parameters on a heavily prepared and balanced corpus of clean texts. If these requirements are met, I believe
more types of poetry can be distinguished. Psalms and other poems can have stylistic differences,
although these effects may be based on topical or other semantic characteristics.

Conclusions genre corpus


Here, I report the conclusions made during my analyses of the genre corpora, elaborate on these analyses and
answer the hypotheses I designed these corpora for.
- My first conclusion is that, based on the results of my analyses on the genre corpora, poetry and prose have
different and distinguishable stylistics. These stylistic effects of prose and poetry were present on various sets of
parameters, on various corpora, one specially designed to test their differences, and these were measurable with
my methods on a sample size of 1500 words.
- My second conclusion is that, based on the results of my analyses on the genre corpora, psalms and prose
have different and distinguishable stylistics. These stylistic effects of prose and psalms were present on various
sets of parameters, on various corpora, one specially designed to test their differences, and these were
measurable with my methods on a sample size of 1500 words.
- My third conclusion is that there are strong indications that psalms have a distinct stylistic signal. Performing
quantitative analyses with my methods and on text sizes of 1500 words, psalms can seemingly be regarded as a
genre that has common stylistic properties and that will manifest under the described conditions. However, I
deem these effects not yet sufficiently proven to confirm them as true. I consider them probable and
supported by the results of my analyses so far. This is why I advise accounting for these possible stylistic effects
of psalms, as I will do myself.
- I attribute no characteristics to the Wilhelmus, as it hasn't shown any consistent effects.
- The other open case, Coornhert's boeventucht, resembles prose stylistically, although close reading tells us it
starts with a rhyme scheme before gradually losing its rhyme. It is also the first of Coornhert's prose texts, and of
the prose texts in general, to leave the prose cluster when other effects are present, suggesting that the prose signal
of boeventucht is the weakest of the prose texts in my corpus, probably due to its mixed nature.
- Methodologically these results mean that stylistic effects for poetry, prose and psalm have their influence on the
forming of the graphs. This is an important consideration when building the corpora and interpreting the results.
- Authorship signals can overrule genre effects, especially when texts are equivocal regarding their genre or type.
- Analysing sample sizes of 1500 words showed good, clear results. Bringing the sample size down had a
negative influence on the clarity of the genre signals and the readability of the graph.
- Character 3-grams and character 4-grams were very instrumental and gave a lot of information, and are to be
considered a sound alternative and addition to features on word level. In comparison, character 3-grams
show more detail and smaller effects than character 4-grams.
This leads to the evaluation of the hypotheses;

There is, according to the results of my analyses, a stylistic effect based on type, genre or any other kind
of topical distinction coming from differences in language, present in the texts of my corpus. This
hypothesis is accepted.

There is, according to the results of my analyses, a stylistic difference between the prose and the poetry
of my corpus. This hypothesis is accepted.

The Wilhelmus is, according to the results of my analyses, stylistically more similar to poetry than to
prose. This hypothesis is neither accepted nor rejected, as the Wilhelmus hasn't shown any consistent
effects, and I consider its characteristics insufficiently analyzed.

There is, according to the results of my analyses, a stylistic difference between the songs and the poetry
of my corpus. This hypothesis is cancelled.

The Wilhelmus is, according to the results of my analyses, stylistically more similar to songs than to
poetry. This hypothesis is cancelled.

There is, according to the results of my analyses, a stylistic effect based on a common type or any other
kind of topical characteristic of the beggars' songs or geuzenliederen, propaganda songs, songs of
comfort or troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen, present in the texts of my corpus. This hypothesis is cancelled.

The Wilhelmus is, according to the results of my analyses, stylistically more similar to one of the different
song genres, namely the beggars' songs or geuzenliederen, propaganda songs, songs of comfort or
troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen. This hypothesis is cancelled.

Author hypotheses
My final specialized corpus will revolve around the main question of my thesis; Who is the author of the
Wilhelmus? My aim is to map all the authorship effects, determine their strength, look for unexpected results and
attribute the Wilhelmus. In order to do so I propose the following sub hypotheses.

There is, according to the results of my analyses, a stylistic fingerprint based on authorship, or a
stylistic authorial signal.

The Wilhelmus shows consistently more stylistic similarity to the texts of one particular author than the
texts of other authors and the texts of this author consistently show an authorial signal by expressing
similar style.

1. Analyses author 0-corpus


I'll start with my 0-corpus, which is as broad as possible and still includes texts signalling stylometric
characteristics that are already established in previous analyses, like prose, psalms, Flemish origin, and so on.
The authorship signals have already proven themselves to be stronger, at times, than language and genre
effects, which is why I test whether it's necessary to control for other effects. In the best-case scenario I can keep
most of the signals present in my corpus while the authorship signal is the most dominant one. This way I
wouldn't have to size down my corpus, which would make it vulnerable to imbalance, false positives and cherry picking.
The results of the first analyses on my author 0-corpus show no big division, just small clusters. Half of the texts
don't cluster in branches and are to be defined based on the nearest neighbour principle. Although this graph gives
us little information, different texts from the same author tend to find each other. There are also enough texts that
don't follow this hypothesis. The distinct effects of prose and poetry are again present and there's a cluster of
psalms. My observation is that texts coming from the same work tend to cluster. Regarding the geuzenliedboek
this means that a lot of the anonymous texts group together; this effect could also arise because they're from the
same anonymous author.
Marnix and d'Heere are examples of authors whose signal is apparently strong enough to group their
prose and poetry together. Marnix, Fruytiers and Coornhert are examples of authors whose authorship signals are
in some cases overruled by poetry and prose effects. Van der Noot, an author with southern influences, clusters
separately, but this could be just his authorship signal and not a language signal. The Wilhelmus forms
connections with the prose of Willem of Nassau. There are other possibly noteworthy effects, but let's see if they
recur under different conditions.
The graphs of the successive analyses 1.1 and 1.2 share roughly the same results, with no big divisions but
some small clusters. At culling levels from 0% to 100%, about a quarter of the texts find too little stylistic
connection to form trees, probably because the features that appeared in less than 20% of the texts are more
stylistic and less semantic determinants.
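The culling parameter can be illustrated with a short sketch. It assumes the definition that is common in stylometric tooling: at a culling level of X%, only features that occur in at least X% of the texts are kept. The function name and the toy documents are hypothetical and only serve the illustration.

```python
def cull(doc_features, threshold_pct):
    """Keep the features that occur in at least threshold_pct percent of the documents."""
    n_docs = len(doc_features)
    all_features = set().union(*doc_features)
    kept = set()
    for feature in all_features:
        doc_count = sum(1 for doc in doc_features if feature in doc)
        if doc_count * 100 >= threshold_pct * n_docs:
            kept.add(feature)
    return kept


# Toy documents, each reduced to the set of features it contains.
toy_docs = [
    {"de", "het", "prins"},
    {"de", "het", "psalm"},
    {"de", "god"},
    {"de", "het"},
    {"de", "bloed"},
]
print(sorted(cull(toy_docs, 0)))
print(sorted(cull(toy_docs, 60)))
print(sorted(cull(toy_docs, 100)))
```

Raising the threshold removes features that are tied to a few texts, which are often topical words, and leaves the features shared across the corpus, which tend to be function words.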
The effects we do see are similar to those before. There are clusters based on authors, prose, poetry and
psalms. Both included Wilhelmus versions become each other's nearest neighbours but show no other
interpretable effects.
When I varied the feature type to collocations in order to get more results, the limited number of words made
analysis impossible. Using character n-grams solved this, because they give me many more features on which to
analyse a text. As explained in the theory, n-grams measure stylistic properties but have semantics as by-catch. Also,
when using character n-grams, raising the culling of the analyses has less impact than with words, suggesting
that the correction for topic has already been performed.
Results for the analyses 2 to 2.3 again show no big division, but they do show large clusters this time.
There are still texts with too little stylistic resemblance to other texts to join a tree, but far fewer than in previous
analyses on this corpus. Also, the texts that do not cluster now have nearest neighbours that are often very
explainable. Authors cluster, prose and poetry find their own branch, and so do the psalm texts.
The Wilhelmus seems to cluster sporadically with Willem van Oranje. This attribution is far too inconsistent, as it
seems almost accidental. Note that almost all authors have some link to the Wilhelmus, so I perform further
analyses until it consistently clusters with an author, before I can draw conclusions.

2. Conclusions author 0-corpus


This corpus was too noisy to yield any clear results. There certainly were authorship signals, but sometimes they
were overruled by genre or language effects. Some authorship signals were not present at all. The effects that
were present are consistent with previous analyses and results, even the psalm cluster. There might be an effect of
stylistic similarity between texts from the same book, but I have to perform more analyses to confirm this. The
Wilhelmus forms connections with the prose of Willem of Nassau, but these are inconsistent and are easily
broken.

3. Analyses author corpus 1 auteur minus proza en psalmen


The author 0-corpus was an accumulation of all texts that seemed relevant to the authorship question. For the
corpus called auteur minus proza en psalmen I excluded prose texts and psalms from my corpus so their
signals can't blur authorship effects. The songs of praise from Marnix's Psalmen Davids and Coornhert's
book of songs will remain in the corpus for now, because both in theory and in practice they don't seem to have a
stylistic signal typical of psalms. Van der Noot's texts form a cluster, which might be the language effect of
southern Dutch influence, but could also be his authorship signal, although these are, of course, not mutually
exclusive categories. Van der Noot's poetry will remain, for now, in my corpus. Datheen, Utenhoven and Willem
van Oranje are removed as possible authors, based on the secondary literature, the inconsistent performance of
their texts and their unbalancing influence on the corpus.
Analysis 1.1 didn't provide me with big divisions but did show big clusters. Five of the 31 texts showed too
little stylistic affiliation with other texts to cluster in a branch.
A distant look at the graph shows, based on the colours that tend to group, that authors clustered together. A
closer look shows us the green anonymous songs, which indeed do not cluster. Black is Fruytiers, who divides
himself over separate trees. Coornhert's cluster consists of his translation from classical language and his book of
songs. KLAS does not seem to form a stylistic marker under these conditions. Reael does not have an authorial
signal, just as in the previous analyses. Marnix's songs of praise from his Psalmen Davids and baanderliederen,
a song by Coornhert, do not cluster in the branch of their author. Songs from the geuzenliedboek seem to group
above chance level, meaning they might have some stylistically based shared properties, maybe because of an
authorial signal or maybe because of the lack of it. The two versions of the Wilhelmus group together.
The results of analyses 2 to 2.3 show a similar shift in results as when I changed the features in the previous
corpus. There are again no big divisions, which is good because the corpus is constructed to show authorship
signals, preferably of all the authors, so the graph should divide into multiple branches rather than one division.
The clusters are indeed there, and predominantly according to author. These would be hopeful results if it weren't for
the Wilhelmus not showing any stylistic relationship to any other text except itself.
Lucas d'Heere forms his own tree again, which includes all of his texts in the corpus. Fruytiers divides his
texts over two clusters again, which are now neighbours. One of the Fruytiers clusters takes in the text from
Jerominus Voort. Van der Noot consistently forms his own branch, except in graph 2.3, of the analysis with
the character 4-grams with a culling of 6-% up to 100%, where his authorship signal disappears. Coornhert's
texts, besides the conjunct text Ter liefden van een Maghet & Baanderherenlied, which I can consider an
exception, cluster consistently in both analyses, using character 4-grams or word level features. In the graph
of the analysis with character 3-grams as features and without culling limitations, however, Marnix joins this tree
with his texts, except for his songs of praise. Reael doesn't show any consistent stylistic relations, signals or
similarities over the analyses so far. His texts don't cluster together.
These graphs show a lot of pretty strong authorship signals, and they actually come out pretty firm. The
songs that do not show authorship effects and act against my hypotheses and expectations, or the texts that do
not connect to other texts in general, are almost always very small texts. Reael with both songs, Sterlinx, Haecht,
the dissident of Marnix and the dissident of Coornhert, and the two anonymous songs helpt uzelf zo helpt u god
and ras der provincien are examples of short texts that have not shown a consistent stylistic profile up until
now. Van der Noot's vrijage, on the other hand, is a pretty long text and, more importantly, of average length for the
included texts of Van der Noot in the corpus, but in graph 2.3 it moves away from the other Van der Noot
texts. Blaming the weak stylistic signals completely on text size doesn't apply here. In the next analysis I'll
sample to gain more information about these short texts and their lonerism.411
The results for bag-of-words (BOW) samples of 450 words turn out to be less informative. There are fewer trees
and branches, and the results are weaker, less defined and hardly visible. At the same time, these are results
that (somewhat) challenge my expectations, and I should therefore take them seriously and be critical of my own
observations, in order to make sure that they do not stem from self-fulfilling prophecy or cherry picking. This is
exactly why it is important to judge the graph not only on the effects it shows, but also on whether it shows
effects at all, and whether these can appear even if they go against your expectations.
The results are, however, mainly just a bad performance of the analysis. It's not that my desired effects
aren't showing; it's that there are hardly any effects here, including some we've already established and some that
we're certain about. Even so, there are still some stylistic signals that could be interpreted as authorship signals,
some of them still clearly visible with BOW samples of 450 words. The dissidents of Marnix and Coornhert remain
distant from the other texts of their authors, which still form clusters. Based on these results I conclude that these
clusters do not form because of a similar text size but because of their similar stylistic signal. In consequence,
the two dissidents should be considered stylistically less similar than the rest of the texts of their authors.
The texts of d'Heere and the texts of Fruytiers still cluster together with texts of their own author.
These results indicate that authorship effects are measurable and visible even at these text lengths,
even if they're less visible because the graph is messy and only the strong effects remain. This puts the lack of
results for the Wilhelmus in a very specific light, and it also keeps me from deleting certain texts, because I need
to perform further analyses.
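How a bag-of-words sample differs from a consecutive passage can be sketched as follows. The sketch assumes that a BOW sample is drawn at random, without replacement, from all word tokens of a text, so that word order is discarded while every sample has the same length; whether the tooling used here samples with or without replacement is an assumption, and the function name is hypothetical.

```python
import random
from collections import Counter


def bow_sample(words, sample_size, seed=0):
    """Draw a random bag-of-words sample without replacement; the seed makes it repeatable."""
    if sample_size > len(words):
        raise ValueError("text is shorter than the requested sample size")
    rng = random.Random(seed)
    return rng.sample(words, sample_size)


# Placeholder text of 551 tokens, the length of the Wilhelmus.
placeholder_text = [f"w{i % 200}" for i in range(551)]
sample = bow_sample(placeholder_text, 450)
print(len(sample))
```

Sampling to a common length removes text size as an explanation for clustering, at the cost of discarding material from the longer texts.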
For the next analysis I set the sample size to 1650 words, an amount of words that has already tested positively, but I
will now use character n-grams, of which there are more available. The results seem even more diffuse with character
n-grams than with words; therefore I set the sample size to the maximum my corpus allows, 2500 character
4-grams. Graphs 3.3 and 3.4 indeed show more results than before and give me about the same effects as the
previous analyses on this corpus. One Coornhert text travels a bit over the graph, not the conjunct text but his
Ulyses, something I haven't seen up until now. Songs from the geuzenliedboek form a tree with each other,
meaning that they do have stylistic similarities and that these aren't based on their length but on something else,
perhaps a common author, but at the very least we know that they have a common publisher, which might be the
cause. The Wilhelmus again shows affiliation only to itself.

411 A term taken from the title of the second album of the band Tame Impala. It refers to being a loner,
or relatively isolated.

4. Conclusions author corpus 1 auteur minus proza en psalmen


There are no more big divisions in this corpus, but there are big clusters. Most of the texts form branches; a few,
especially small, texts don't. Texts of the same authors cluster, so author effects are present, measurable and
visible. The same goes for songs from the same work, like the geuzenliedboek. Especially d'Heere has a strong
authorship signal, while at the same time the three texts from the same book of d'Heere are the tightest of the
d'Heere branch, so in this case both effects, the one based on author as well as the one based on work, are
present. The texts of Van der Noot also cluster consistently and convincingly. Van der Noot and d'Heere are now
both implausible candidates for the Wilhelmus authorship, seeing how well their texts cluster, while the Wilhelmus
does not seem to have any stylistic resemblance to their texts. Van der Noot already was very improbable, as
the Wilhelmus did not seem to show any southern influence during the analyses on the language corpora.
The texts of Marnix cluster, as do those of Coornhert; however, there were some exceptions, depending
on the parameters. The four Fruytiers texts were split into two clusters in almost every analysis. Sendbrief and
Ecclasius are always together in a branch, just like the two poems 'Hier beginnen de Geusen haer hert
wederom op te halen' and 'Hier worden verhaelt feyten Duc d'Alva bedreven heeft', sometimes in the company of
Voort. These two branches are often neighbours. Reael seems to have no strong authorial signal, at least not
in the texts in this corpus. Even the cluster of the songs of the geuzenliedboek, to which the Reael texts belong, has
little pull on the Reael songs. The sampling shows us that this can't be explained by the short text size: most of
the effects stayed the same during the analyses which used samples, as did the lack of an obvious signal from
Reael. Besides both Reael texts, the dissidents from Marnix and Coornhert, Sterlincx, especially Haecht, and a lot
of the anonymous songs also make very few connections and thus show very few shared stylistic properties.
I want to mention again that when texts systematically and consistently form connections with each other,
over varying parameters and corpora, they're believable as stylistically similar. On the other hand, texts that make
few connections, or make a lot of different connections and move around a lot when the parameters change,
probably do not have a very strong stylistic resemblance to other texts, and the clusters they form should be viewed
critically, as they might be caused by something other than stylistic resemblance.
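
The criterion above, that a connection is only believable when it persists over varying parameters, can be made concrete with a small sketch. It assumes that each analysis has been reduced to a list of nearest neighbours per text; the function name `neighbour_stability` and the labels `text_A` to `text_E` are hypothetical and do not refer to texts or results in my corpus.

```python
from collections import Counter


def neighbour_stability(nearest_by_run):
    """Return (most frequent neighbour, share of runs in which it was nearest)."""
    counts = Counter(nearest_by_run)
    neighbour, hits = counts.most_common(1)[0]
    return neighbour, hits / len(nearest_by_run)


# Hypothetical nearest neighbours of two texts over four parameter settings.
runs = {
    "text_A": ["text_B", "text_B", "text_B", "text_B"],
    "text_C": ["text_B", "text_D", "text_E", "text_D"],
}
for text, neighbours in runs.items():
    print(text, neighbour_stability(neighbours))
```

A text like `text_A`, which keeps the same neighbour in every run, would count as a believable connection; a text like `text_C`, which moves around, should be viewed critically.
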
During tests on this corpus there has been no reason to assume that Dutch translations of texts from classical
languages have their own stylistic mark.

Authorship effects are present and visible even with sample sizes of only 550 words and also when my corpus is
unbalanced regarding text size. The results are not always clear, and probably not always correct either. I'm still
convinced that a lot of subtle effects are withheld by all sorts of methodological limitations and the limitations of
the corpora; however, if the author of an anonymous song is included in the corpus with a couple of
representative texts, a successful authorship attribution is possible and maybe even likely. I base this on the fact
that a lot of texts with equally poor credentials still cluster together with other texts by the same author. The
Wilhelmus, however, gives us no consistent, convincing signs so far. Further analysis is required.

5. Analyses author corpus 2 'auteur balans'


I'll further balance my corpus, now 'auteur balans', in order to dismiss effects other than authorship effects, in
the hope that the weaker authorship effects become visible. I remove the geuzenliedboek version of the
Wilhelmus and test how much this mattered in a later analysis. I also remove the anonymous songs. They've
fulfilled their purpose, identifying the effects within one work and checking for more
contributions by the anonymous author of the Wilhelmus in my corpus, but they are of no use when actually
determining the author, since they're anonymous. Anonymous is the label the Wilhelmus carries right now, the
one we're trying to lift off it.
I remove the translated works coded and labelled under KLAS from the new corpus because I want to
size down the number of Coornhert texts, so the graph doesn't get split up between Coornhert texts and
non-Coornhert texts. In addition to this I aim to remove a possible effect of translation or classical heritage, attempting
to clean the corpus of as many non-authorship effects as possible.
Fruytiers remains in the corpus, as he shows different effects and is a possible Wilhelmus author. The
same goes for Marnix and Reael, with the difference that Reael hasn't shown any notable effects in the
past analyses. I remove two pieces of poetry from the Boomgaard of Lucas d'Heere, which showed strong
similarity based on being in the same book. The corpus now has 22 texts.
I perform six analyses, varying several of the parameters. Analyses 1, 2 and 3 show a big division in the corpus.
Van der Noot, Fruytiers and d'Heere are in three separate branches of a tree, dividing them from the other texts in
the corpus. The only exceptions to a clean categorization based on author are two poetry pieces by Fruytiers,
which place themselves at the other end of the corpus. We've seen this splitting of Fruytiers on a stylistic basis
before, and it continues to occur.
There seems to linger another tree, but it doesn't become distinct enough to analyze, except in the
graphs of analysis 2, using character 3-grams. The possible tree consists of Coornhert, Marnix and Hout. The
cumulative text of Coornhert, with both 'Ter liefden van een Maghet' and the 'Baanderherenlied' in it, moves away
from the tree it belonged to in the earlier analyses, and the Marnix psalm text never joined them in the first place.
Both effects can be explained by the atypical nature of the texts: one is a joined text and the
other a psalm text. I've mentioned before that the effects of a text consisting of two texts are unknown to me;
together they might be uncharacteristic of either of their styles when apart. The psalm text didn't join the
psalms in earlier analyses, but this is no guarantee that its nature doesn't essentially differ from other poetry. It just
tells us it didn't have the common stylistics that the other psalms had.
The graphs show obvious authorship effects. These have become a very consistent and believable pattern
throughout the analyses on the author corpora, and also in the other analyses. There are some exceptions to the
rule, as in the previous analyses, one being Reael. Reael clusters in a third group consisting of the occasional
Coornhert, two dissident Fruytiers songs, the Marnix dissident, Voort, Haecht, the Wilhelmus, Sterlincx and Van
Damme, in which the texts don't form many relationships or clusters and the results are not very distinct or
plausible. The Wilhelmus has the songs of praise of Marnix as nearest neighbour twice and the
joined file of Coornhert twice. Other effects in this leftover tree are Voort joining the Fruytiers texts and Reael's 'een
ander lied' clustering with Van Damme, sometimes accompanied by Sterlincx. They're all songs from the
geuzenliedboek. The other Reael text has no clear affinity with other texts. The Haecht text doesn't make any
connections.
Looking at the size of the texts, it became clear that small texts are the least sure to stick to the
connections they make and move around the most. The Wilhelmus, 'Ter liefden van een Maghet en het
Baanderherenlied', the songs of praise by Marnix, 'een nieu liedkje' of Reael, the text of Haecht, the Sterlincx
text and the other song by Reael are examples of this. There are, however, enough effects from texts of very limited
size that prove reliable and consistent under changing parameters and different corpora.
Based on previous attempts, I assumed that sampling would not change the overall view of the graph. I
verified this assumption with a quick analysis. The only notable fact was that the Wilhelmus clustered two more
times with the Marnix texts.

6. Conclusions author corpus 2 'auteur balans'


It's obvious by now that the size of a short text, let's say 550 words, has a huge influence on its attribution.
Short texts are harder to attribute: they make weaker connections, so when the parameters shift they move
around the most, making them unreliable. However, there are many short texts included in my corpus and they
often do connect with texts by the same author. The corpus and design are good enough to let the graph show us
all kinds of results. There are authorship effects in the form of authors who consistently cluster together, forming
their own branches, respectively Van der Noot, Fruytiers and d'Heere, or Coornhert, Marnix and Hout.
The big exception to the rule is Jacob Reael who, as in the previous analyses, never really connects to
any text, not even his own. The one text of Haecht is another case that doesn't make any connections.
The Wilhelmus clusters two times with the songs of praise of Marnix and two times with the conjunct text of
Coornhert.

7. Analyses author corpus 3 'Marnix Coornhert'


The analyses so far showed that the Wilhelmus made very few connections, and few convincing ones, but when
the anthem did seem to connect, it was with texts by Coornhert or Marnix. These two authors are also the main
candidates in the canon of Wilhelmus authorship research. Based on my own results as well as previous
research, I'll explore this option and form a corpus called the 'Marnix Coornhert' corpus, with texts by either
Marnix or Coornhert in addition to the Wilhelmus itself, to see if our national anthem has a preference and if this
seems like a robust effect.
Notable inclusions are the psalms and both texts of the joined text 'Ter liefden van een Maghet en
het Baanderherenlied' separately, while the joined document also stays in the corpus. I hope to gain some
insight into the changes in signal when combining texts in one file. The corpus consists of 14 texts.
The graph of the first analysis shows a striking amount of information, including both a big division and a
lot of smaller clusters and detail. Interpretation confirms previous results: the distinct signals of prose and poetry
and the authorship effects, among others. After modifying the levels of culling, as in analysis 1.1, the
prose/poetry distinction gets stronger but the details seem to fade away a bit.
The Wilhelmus clusters, in graph 1, with the poetry of Marnix, but also stays close to a cluster of
Coornhert poetry, including 'Ter liefden van een Maghet' and 'Baanderherenlied' as well as the combined file.
With features on word level, these texts seem to find each other easily, meaning the combined text is
representative of both independent texts and vice versa.
For analysis 1.1 I deleted both individual texts of the combined document of Coornhert, along with the first
psalm of Marnix. As we can see in the graph, in this corpus the Wilhelmus binds itself to the combined document
of Coornhert, but only when culling. Without culling the Wilhelmus remains with the poetry of Marnix. So
balancing the corpus, by deleting the texts that I did, made the Wilhelmus switch to another text by another
author, making its attribution highly doubtful. This assertion is confirmed by analysis 2, in which the Wilhelmus
joins a cluster with both Marnix and Coornhert texts.
Adding different levels of culling in analysis 2.1 strips the graph of most effects. This illustrates the
enormous sensitivity of this body of texts to changes in the parameters, and also how thin the branches of the graph
and how unstable the results of the analyses are.
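
Why culling can strip a graph of its effects becomes clearer in a small sketch. It assumes the common stylometric definition of culling, in which only words that occur in at least a given percentage of the texts are kept as features; the function name `cull` and the toy corpus are invented for illustration and are not the implementation used for my analyses.

```python
def cull(texts_as_tokens, culling_percent):
    """Keep only the words found in at least `culling_percent` % of the texts."""
    n_texts = len(texts_as_tokens)
    vocab = set().union(*(set(tokens) for tokens in texts_as_tokens))
    kept = set()
    for word in vocab:
        present_in = sum(1 for tokens in texts_as_tokens if word in tokens)
        if 100 * present_in / n_texts >= culling_percent:
            kept.add(word)
    return kept


# Toy corpus of three tokenised "texts" (invented for illustration).
corpus = [
    ["ick", "ben", "van", "bloet"],
    ["ick", "van", "god"],
    ["ick", "heer", "god"],
]
print(sorted(cull(corpus, 0)))    # culling 0: every word is kept
print(sorted(cull(corpus, 50)))   # words in at least half of the texts
print(sorted(cull(corpus, 100)))  # only words that occur in every text
```

With very short texts, a high culling level leaves only a handful of features, which is one plausible reason for the fading detail in the graphs.
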
Analysis 3 shows strong authorship signals again, but the Wilhelmus still doesn't show its colours.
Excluding the Wilhelmus from my corpus changes little about the general layout of the graph, but there's one
important change: the combined text of Coornhert and Marnix's psalms, which showed stylistic resemblance
when the Wilhelmus was included, no longer show this similarity to each other when the Wilhelmus is
removed. They move in opposite directions, both to the wrong author. We can take away from this that both texts
have an atypical style compared with the rest of their authors' texts, and that they are stylistically similar
to the Wilhelmus but not so much to each other.

8. Analyses author corpus 3.2 'Marnix Coornhert balance corpus'

I'll try to balance my corpus a little more by removing every bit of noise I'm able to remove, leaving my corpus
at 9 texts. This is of course still my third corpus, the Coornhert vs. Marnix corpus, of my third hypothesis, the
authorship hypothesis, but a more balanced second version of it called the 'Marnix Coornhert balance corpus'.
The corpus is now balanced in such a way that both Coornhert and Marnix are represented by 4 poetry texts,
all bigger than the Wilhelmus. Three of the Marnix texts are a little over 550 words and the other one, his Psalms
Davids, is the biggest of the entire corpus, at least ten times as big as the Wilhelmus. Coornhert's combined text
is about as big as the Wilhelmus, while the other three are about half the size of Marnix's Psalms Davids. Even though
we're still debating the extent to which size is responsible for effects (by now we know it can be responsible for the
absence of effects), the corpus is now well balanced for size. When interpreting the results, I'll take size into
consideration, despite the extensive balancing, just to be sure.
From analysis 1.0 there are two ways to interpret the main division. The first, the authorship effects, with one
deviation on either side, the Psalms Davids of Marnix and the combined text of Coornhert, has been established
and described in previous analyses. The second interpretation is that the split separates the 4 biggest texts from
the 5 smallest texts, a division that coincides with the greatest relative gap in text size, a factor of 5, between the
biggest small text and the smallest big text.
When we use character 3-grams as features instead of words, the number of features goes up and, as
we've already seen, a growth in features changes the results significantly if the number of features is under a threshold.
Graphs 2.0, 3 and 3.1 show us clusters by author and by type. The Wilhelmus finds its familiar cluster with
the two dissident texts, which differ enormously in text size. The other texts show some authorship signals, but
some texts do not cluster according to author. An explanation of the three trees based on text size, however,
seems implausible.
In order to further counter effects based on size, I use random sampling in analysis 4. The graph now
becomes very different: there are big divisions and lots of detail. The Wilhelmus still clusters with the combined text of
Coornhert, and this cluster is close to the cluster of Marnix's psalms, where the other text that repeatedly clusters
with the Wilhelmus resides. Authorship signals are scarce and weak in graph 4 and in graph 4.2, but in the graph
of analysis 4.1 the authorship signals are a bit stronger.
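
The idea behind random sampling, reducing every text to the same number of randomly chosen words so that size can no longer drive the clustering, can be sketched as follows. Sampling without replacement, the function name `random_sample`, the seed and the toy token lists are assumptions for illustration; the exact procedure of the software is not restated here.

```python
import random


def random_sample(tokens, sample_size, seed=None):
    """Draw `sample_size` tokens at random, without replacement."""
    if sample_size > len(tokens):
        raise ValueError("text is shorter than the requested sample size")
    rng = random.Random(seed)
    return rng.sample(tokens, sample_size)


# Hypothetical texts of unequal length, reduced to the same sample size.
long_text = ["w%d" % i for i in range(5000)]
short_text = ["v%d" % i for i in range(551)]
samples = [random_sample(t, 550, seed=1) for t in (long_text, short_text)]
print([len(s) for s in samples])
```

Because the draw is random, repeated runs with different seeds can give different graphs, which is one reason to compare several sampled analyses rather than rely on a single one.
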
The results are again inconclusive. The Wilhelmus tends to connect the most with a Coornhert poetry text that
consists of two separate texts, but the authorship signals are unconvincing because of the Wilhelmus's affiliation
with a Marnix text and because the signals of both texts that the Wilhelmus clusters with are atypical of the
authors that wrote them. In addition to this, we've seen the Wilhelmus move around the graph when shifting the
basic parameters. The results concerning the Wilhelmus are uncertain.

9. Conclusions author corpus 3 'Marnix Coornhert'

The results of the analyses on corpora 3.1 and 3.2 show us the presence of authorship effects, but also the
relativity of their presence, as they are far from giving a full explanation of the effects in the graphs, and
alternative explanations might sometimes be just as believable. The Marnix vs. Coornhert corpora are not the
best corpora to test the authorship effects, because the inclusion of only 9 texts by only 2 authors already
becomes problematic when you've got an open case and 2 dissidents, so that a third of your corpus already
disconfirms the expectations. Earlier analyses, including more authors and more texts, have consistently shown
us reliable authorship effects.
The Wilhelmus is inconclusive in its authorial signal. In analyses that do show authorship signals, the Wilhelmus
still doesn't seem to favour one author. At most, it favours texts. The Wilhelmus seems to share stylistic
characteristics with the conjunct text of Coornhert, consisting of the songs 'Ter liefden van een Maghet' and
'Baanderherenlied', as well as with the songs of praise originating from Marnix's Psalms Davids.
The Wilhelmus clusters more often with the Coornhert text than with the Marnix text. The texts only share some
characteristics with the Wilhelmus and not with each other, while they also seem to be atypical of their authors'
style. Excluding the Wilhelmus shows both dissidents being pulled away from each other and joining the
opposite team, suggesting that it was not the Wilhelmus pulling them out of the author clusters where they
belonged, but rather that it found them somewhere in between a Marnix and a Coornhert signal. This could
also mean, and this seems to be the most straightforward interpretation, that all three texts are pushed into a
cluster because they do not belong to either of the author clusters. However, under certain parameters, the
songs of praise of Marnix cluster pretty convincingly with other parts of the Psalms Davids of Marnix.
The main conclusion is that the results that are convincing in this corpus do not involve the Wilhelmus,
which seems to be a stylistic outsider. Balancing my corpus did not change this. The results are again inconclusive.
The Wilhelmus does not have any text in the corpus with which it convincingly clusters. The connection to
Coornhert's poetry is not stable enough to base any real attribution on.

Distance measures
Although this isn't a trial-and-error exercise, in which you vary settings and parameters without a
theoretical or practical basis, by now most of the parameters have been varied. Any further refining of my
corpus is highly unlikely to yield any progress in determining the author of the Wilhelmus. At this point I need to
consider changing the settings that I wouldn't change before, without losing sight of the theoretical basis.
As discussed in the theory, different distance measures will show us different characteristics of a text. I
stand behind my decision for the Burrows delta on practical grounds and on a firm theoretical basis; however, since
alternatives are easily applicable and the results so far are inconclusive, I'd like to get a brief impression of the
possible different effects. I can't draw any conclusions from the results, as this would be a form of cherry picking;
however, I'd like to explore the further options by generating new results and explaining them through the theory of the
distance measure, not the attribution of the text.

I test this on the author corpus 2 'auteur balans', counting 22 texts, designed to find the author of the
Wilhelmus, and therefore with signals other than authorship signals removed, without deleting improbable authors.
The first alternative distance measure is the Euclidean distance. Remember that, in comparison with the
Burrows delta, it bestows a lot of influence on the very most frequent words, while the influence of the rest of the
MFW is marginalised.
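
The contrast between the two measures can be illustrated with a minimal sketch. It assumes that the Euclidean distance is taken over raw relative frequencies and that the Burrows delta is the mean absolute difference of z-scores; the population standard deviation is used for brevity, and the frequency table and function names are invented for illustration only.

```python
from statistics import mean, pstdev


def z_scores(column):
    """Standardise one word's frequencies across all texts."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def burrows_delta(za, zb):
    """Mean absolute difference between two texts' z-scores."""
    return mean(abs(x - y) for x, y in zip(za, zb))


# Invented relative frequencies: rows are texts, columns are the MFW in rank order.
freqs = [
    [0.060, 0.020, 0.004],
    [0.050, 0.022, 0.002],
    [0.055, 0.018, 0.003],
]
columns = [z_scores(list(col)) for col in zip(*freqs)]
z = [list(row) for row in zip(*columns)]

# Share of the squared Euclidean distance between texts 0 and 1 due to the top word.
squared = [(x - y) ** 2 for x, y in zip(freqs[0], freqs[1])]
print("top-word share (Euclidean):", squared[0] / sum(squared))
# After z-scoring, every word is measured on the same scale.
print("per-word |z| differences:", [abs(x - y) for x, y in zip(z[0], z[1])])
print("Burrows delta:", burrows_delta(z[0], z[1]))
```

In this toy table the most frequent word accounts for almost all of the Euclidean distance, whereas after standardisation the rarest word weighs as much as the most frequent one.
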
The graph actually shows us pretty sensible results. There are distinct branches, and there's
detail within those branches. It shows authorship effects for all included authors, even Reael. It also shows us
clusters of texts from the same book: see the Fruytiers, Van Damme, Voort and Hout branch, all stemming from
the geuzenliedboek. According to this graph the Wilhelmus has the most stylistic resemblance to the Reael texts.
This could be another geuzenliedboek cluster, and it could also be a cluster based on size. However, there are
other geuzenliedboek texts and other short texts present in the corpus that don't cluster with the anthem. Voort,
Sterlincx and Van Damme are all very short and from the geuzenliedboek, but they don't cluster with the Wilhelmus
like the Reael texts do.
I'd like to note that in this case the use of a distance measure that tyrannically prefers the top
frequent words gives me an interpretable and sensible graph that shows a lot of different results, big and small.
Perhaps on these very short texts, words at the bottom of the list are polluting the results. We already saw that
culling, which also reduces the influence of the MFW at the bottom of the list, often showed some similar effects.
The second distance measure is Argamon's linear delta, an alternative to the Burrows delta that has
not been discussed in the theory. It's based on Euclidean principles, so consider its mathematical rules as such.
Looking at the graph, what immediately draws attention is the big division. Both of the trees also show
smaller, more detailed branches. Authorship effects seem to determine a great deal of the graph, but they can't
explain the major split. It separates 3 authors from the rest, who are also bound to texts by the same author,
except for Reael, whose texts reside in the same region but aren't nearest neighbours. So based on the Euclidean
principles of Argamon, Reael has now lost some of his authorial signal in comparison with the pure Euclidean
distance measure. One tree groups Marnix, Coornhert and Hout with the Wilhelmus, so the two most important
candidates for authorship are in the same tree with what is potentially their masterpiece. Marnix clusters the most with
the Wilhelmus, first of all with his songs of praise, not uncharacteristic for the rest of my analyses, but also with the
rest of his texts. This doesn't by a long shot mean the Wilhelmus was written by Marnix, as Hout also behaves as if
it was written by Coornhert. It does mean that, based on these parameters, with Argamon's linear delta, Marnix's
work, especially his songs of praise, is the most stylistically similar to the Wilhelmus of all texts in this corpus,
which is designed to include as many possible authors and stylistically similar texts as possible. It also means
that Argamon's linear delta can't be excluded as a useful distance measure, as it seems to capture a lot of
authorship effects.

The third distance measure is the Manhattan distance. In contrast to the distance measures based on
Euclidean principles, the Manhattan distance measure is less biased towards the top of the MFW. It still gives
decisive importance to the 10 or 20 MFW, so it's far less democratic than the Burrows delta.
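
Why the Manhattan distance is less top-heavy than the Euclidean distance can be shown with a small sketch: squaring the differences inflates the largest ones, which for raw frequencies usually belong to the most frequent words. The function name `top_share` and the two frequency lists are invented for illustration, and the sketch assumes raw relative frequencies without further normalisation.

```python
def top_share(a, b, k, power):
    """Share of the summed |a_i - b_i| ** power that comes from the first k features.

    power=1 corresponds to Manhattan distance, power=2 to squared Euclidean distance.
    """
    contributions = [abs(x - y) ** power for x, y in zip(a, b)]
    return sum(contributions[:k]) / sum(contributions)


# Invented relative frequencies of four words, ordered from most to least frequent.
text_a = [0.060, 0.024, 0.010, 0.005]
text_b = [0.050, 0.020, 0.008, 0.004]

print("Manhattan:", top_share(text_a, text_b, 1, power=1))
print("Euclidean:", top_share(text_a, text_b, 1, power=2))
```

In this toy example the most frequent word supplies a clearly smaller share of the Manhattan distance than of the squared Euclidean distance, although it still dominates both.
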
Graph 2 'manhattan' shows us a lot of the same effects as graph 1.2 'argamon', but now
a little less distinct. Again there's an affiliation of the Wilhelmus with Marnix, and especially with his songs of praise,
while it also shows some relation to Coornhert and Hout. The texts of Hout are also songs of praise, which
explains the stylistic similarity to Marnix, but not Hout's even closer affiliation with Coornhert or the Wilhelmus.
I'd like to suggest an interpretation that's highly speculative and which I won't state as factual. Let's make the
assumption that the Wilhelmus is a song of praise, just like Marnix's and Hout's texts. This is not a very wild
statement, as the national anthem is obviously in celebration of country and God, and implicitly self-celebrating, as
the text is sung from the perspective of the Prince, who is revered as just and righteous. Coornhert's joined text
can be interpreted as a song of comfort, a song of departure and a song of exile. The songs from Coornhert's
book of songs are mostly religious songs, full of praise, mercy, subjection and comfort. The same goes for
Marnix's poem 'Den verstrooiden Nederlandschen gemeenten Jesu Christi'. The joined file from Marnix consists of
songs of praise and religious songs. These themes are also present in the Wilhelmus (one of the cardinal
questions is particularly about these themes), so perhaps their similarity is topical. A counterargument is that,
if the attribution is indeed topical and not stylistic, the coincidence of perfect clusters according to author
seems unlikely. Also, my corpus was designed to have similar themes and, considering the times, most selections
of Dutch songs would bring forth such correspondence. The final nail in the coffin of this hypothesis, apart from
my already having rejected the possibility of testing further genre effects, is that these themes are also present in
songs not clustering with the Wilhelmus, like Van der Noot's songs of praise and several songs by Fruytiers, to
name a few. Another result I'd like to discuss is the fact that Reael's texts cluster again, which makes it seem like
the absence of signal in the Reael texts was mostly due to the choice of distance measure.
The conclusions for the Manhattan distance measure are pretty much the same as for the two
Euclidean-based distance measures. Perhaps a distance measure that gives priority to the most frequent of the
MFW is useful on texts of very short text size. Regarding the Wilhelmus, I observe that Marnix's work, especially
his songs of praise, seems the most stylistically similar to the Hymn.

The last, and possibly least, alternative distance measure I test is not discussed in the method section, but is
available by default in R. The Canberra distance is a weighted version of the Manhattan distance, used as a metric
for comparing ranked lists.412 This numerical measure of the distance between pairs of points in a vector space,413
measures the similarity between groups. It's far more democratic than the previous alternative distance
measures, and I chose this one on that premise.

412 Godfrey N. Lance and William T. Williams, "Mixed-Data Classificatory Programs I - Agglomerative
Systems." Australian Computer Journal 1.1 (1967).
413 Giuseppe Jurman et al., "Canberra Distance on Ranked Lists." In Proceedings, Advances in Ranking
NIPS 09 Workshop, eds. S. Agarwal, C. Burges, K. Crammer (2009). Retrieved on 24-07-2015.

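The formula of the Canberra distance (d) between two vectors p and q of n features is as follows:

\[ d(\mathbf{p}, \mathbf{q}) = \sum_{i=1}^{n} \frac{|p_i - q_i|}{|p_i| + |q_i|} \]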
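
As a minimal sketch of why this measure is "democratic": the Canberra distance sums, per feature, the absolute difference divided by the sum of the absolute values, so each feature contributes at most 1 regardless of how frequent it is. The function below is an illustrative reimplementation, not the R routine itself, and the two frequency lists are invented; terms in which both values are zero are skipped, which is an assumption of this sketch.

```python
def canberra(p, q):
    """Sum of |p_i - q_i| / (|p_i| + |q_i|), skipping terms where both values are 0."""
    total = 0.0
    for x, y in zip(p, q):
        denominator = abs(x) + abs(y)
        if denominator:
            total += abs(x - y) / denominator
    return total


# Invented frequencies: one very frequent word and one rare word.
text_a = [0.09, 0.001]
text_b = [0.03, 0.003]
print(canberra(text_a, text_b))  # each word contributes about 0.5
```

The rare word weighs as much as the frequent one here, which also means that noise in rare words is not damped.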

Looking at graph 3 'Canberra', we see a very limited representation of the data. The texts are sorted by their
authorship effects, branching off from the middle and in doing so losing all other information. Of course this
distance measure does not only search for authorship signals; it measures stylistic resemblance, but these
are the only effects the graph shows. The most logical explanation is that these signals were the strongest, as the
measure blurs out the weak signals. Only Coornhert's texts are not clustered according to their author, and the
Reael texts do find each other as nearest neighbours but refuse to form a branch.

Conclusions Distance measures


The conclusion based on the distance measures is, apart from the already mentioned possible positive effect of
tyrannical distance measures, that the results of the four alternative distance measures, in combination with the
theory, point to Marnix as a serious candidate for authorship of the Wilhelmus. To confirm this, we should do
elaborate analyses with all of these distance measures, varying corpora and parameters, and account for the
mathematical choices by building a theoretical frame of statistical theory on distance measures that prefer the very
first MFW in combination with very short texts, and by mapping all the previous research on this subject.

PCA
So far, the analyses of the authorship hypotheses haven't delivered conclusive results. In order to get more data
and more context about the data already received, I perform principal component analysis. As mentioned in the
theory paragraph, PCA can be useful to establish the author of a text when faced with a two-author corpus. I
perform two PCA analyses on the author corpus 3.2 'Marnix Coornhert balance corpus'. First a PCA
correlation, measuring the statistical relationship between two variables, and secondly a PCA covariance,
measuring how much two variables change together. All the parameters are listed in Appendix 2, Corpora,
but it's important to note that the performed PCAs are on word level and with the Burrows delta as distance
measure.
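
The difference between the two PCA variants lies in the matrix that is decomposed. The sketch below only computes those two matrices for an invented frequency table; it does not perform the eigen-decomposition, and the function names and numbers are assumptions for illustration.

```python
from statistics import mean, pstdev


def covariance_matrix(rows):
    """Population covariance between the columns (features) of `rows` (texts)."""
    cols = list(zip(*rows))
    centred = [[x - mean(col) for x in col] for col in cols]
    n = len(rows)
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in centred]
            for ci in centred]


def correlation_matrix(rows):
    """Covariance matrix of the columns after scaling each to unit variance."""
    cols = list(zip(*rows))
    scaled = [[x / pstdev(col) for x in col] for col in cols]
    return covariance_matrix(list(zip(*scaled)))


# Invented frequencies: a frequent word (column 0) and a rarer word (column 1).
table = [
    [0.060, 0.004],
    [0.050, 0.002],
    [0.055, 0.003],
    [0.045, 0.005],
]
print(covariance_matrix(table))   # the diagonal is dominated by the frequent word
print(correlation_matrix(table))  # the diagonal is 1 for every word
```

A covariance PCA therefore lets words with large variance, usually the most frequent ones, dominate the components, while a correlation PCA gives every word the same weight.
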
The first graph shows the components in a 2-dimensional space, based on the statistical relation between the
variables. The horizontal axis is the first component and it doesn't separate the texts according to author. I could
say that most of Coornhert's texts are located more to the right while Marnix stays at the left, along with the
Wilhelmus, but these results are not really distinctive and there are exceptions. The vertical axis, however, the
second component, shows a division by author. The relationship is still more complicated than only authorship
effects, because while the Marnix and Coornhert texts can indeed be divided by one horizontal line,
somewhere around the value one, the real cluster actually consists of two KLAS texts by Coornhert, psalms by
both Coornhert and Marnix and the sonnets to Lucas d'Heere by Marnix. The big difference on this component is
that the Wilhelmus obviously distances itself from the other texts, allowing only the joined Coornhert text to come
near. We've seen the affiliation with this particular text, just as we've seen the isolation from all the other texts.
Lonerism is again a suitable term. The PCA picks up on the stylistic individuality of the Wilhelmus that we've seen all
through the attribution analyses, and defines it as the second component.
Looking at the second graph, the covariance matrix, we see a conspicuous result. Although the Wilhelmus seems
to cluster with Coornhert, on the horizontal axis more obviously than we've seen in PCA graph number 1, on
the vertical axis the Wilhelmus is miles apart from any other text. This means that on the first component the
Wilhelmus stays close to Coornhert, yet on the second component the Wilhelmus does not
share a common pattern of variation with any other text of either author. Marnix and Coornhert
don't seem to differ too much on this second component, but perhaps they're grouped together because of the
counterweight the Wilhelmus provides.

Conclusion PCA

The principal components analysis captures differences between groups and a PCA plot exhibits these
differences. The results in general are unconvincing. The clusters are rather scattered and the differences
between the authors are sometimes smaller than the differences among the texts of one of the two authors. The texts
are, however, sorted by their author, so we affirm again the presence of an authorial signal. When forced to pick an
author based on the two PCA plots, Coornhert would be the most likely author, but when critically analyzing these
results, one has to agree that they seem to point out that the Wilhelmus is not representative of the style of either of the
two authors. In both graphs the Wilhelmus seems to be an outsider in the corpus, especially on the
covariance plot, where it seems to be totally different from any other text in the corpus on the second component.
These results correspond to, and partly explain, the behaviour we've seen from the Wilhelmus during all the
previous analyses. Obviously the Wilhelmus has some stylistic properties that set it apart from the other texts,
thereby making attribution impossible. This divergence on the second component of the covariance PCA is very
strange and very hard to explain if the real author is present in this 2-author corpus, both authors' styles are
represented by their texts, and the Wilhelmus represents its author's style as well. It might suggest that the
Wilhelmus has a textual characteristic, responsible for the failed attempts to attribute it to an author, that corrupts
its author's style.

Conclusion Author hypotheses of specialized corpus.


From the very beginning, even when testing for other possible effects, the results showed authorship signals.
These became increasingly clear with the further preparation of the corpus and fine-tuning of the parameters. I
managed to get readable graphs that gave us a lot of information about big effects as well as small effects on all
corpora. Most texts show stylistic resemblance to other texts and groups of texts, thereby forming clusters and
branches, with the exception of some smaller texts that didn't show any stylistic relationships in a consistent
manner. Strong effects made big clusters, divisions and trees. I conclude that in my corpus author effects are
present and that these can be retrieved and made visible by my methods.
Most of the time, text of the same author tended to cluster together and show stylistic resembelence, for most of
the authors. Lucas dHeere, Van der Noot, Fruytiers, Marnix, Coornhert among others, showed very strong
authorship effects. Exceptions were often both predictable as explainable, for example by the characteristics of
the text or because of the presence of other effects like texts coming from the same work.
Only the authorship signals of Reael were not captured by Burrows' Delta, without any sufficient
explanation. When I experimented with the distance measure, the texts of Reael did show stylistic resemblance.
Both the Reael texts and the Wilhelmus are examples of the short texts that didn't seem to form any connections at
all. I've considered attributing this to text size, but, as my analyses with sampling and other countermeasures
showed repeatedly, their unwillingness to show their stylistic relations is not wholly explained by their size.
I've argued that, when all authors are represented correctly and the Wilhelmus represents its author
correctly, Lucas d'Heere, Jan van der Noot and Johan Fruytiers are very unlikely. Petrus Datheen, Pieter
Sterlincx, Jan Utenhove, Jeronimus van Damme, Godevaert van Haecht, Jeronimus Voort and Willem van Oranje
were not sufficiently tested to be ruled out. Reael can also not be ruled out as an option for the authorship of the
Wilhelmus. The Wilhelmus hasn't shown stylistic resemblance to any of these authors on a consistent
and believable basis.
Two authors showed, on several occasions and under different experimental circumstances, stylistic resemblance
to the Wilhelmus: Dirck Volkertszoon Coornhert and Marnix van Sint-Aldegonde. These are also the two main
potential authors according to the tradition of Wilhelmus research. I've given some alternative explanations for
this agreement, but the fact remains that Coornhert and Marnix are again the most probable authors and that I
should look in their direction for attribution of the Wilhelmus. I've tested them further with a corpus specifically
designed to test whether the Wilhelmus is stylistically closer to Marnix or to Coornhert.
There are some reasons not to believe in an authorship attribution to either of these authors. First of all,
the Wilhelmus has clustered with both of them about an equal number of times. I consider the author of the
Wilhelmus to be a single person, so it's strange that a man who didn't write the hymn would get as many
attributions as the actual author. Also, the Wilhelmus has clustered with other authors and often with none at all.
In other words, the Wilhelmus hasn't shown a steady attribution to either one of these two authors, or to any
author in general. In addition, the Wilhelmus shows most stylistic resemblance to the texts that are atypical and do
not represent their author's style very well. In particular, the conjunct text of Coornhert is often the text the
Wilhelmus clusters with in the same branch. This is very problematic because it is a constructed text, which I put
together while trying to balance my corpus. It's also the Coornhert text that is quickest to abandon a Coornhert
cluster, and so the least typical of the author's style. All these concerns arise from the results of the analyses
performed on the Coornhert/Marnix corpus alone, which didn't give me any conclusive answer to my main question.
The PCA showed that the Wilhelmus has some stylistic properties that set it apart from the texts of both Coornhert
and Marnix. I've no explanation for this divergence.
Although convinced of the abilities of Burrows' Delta, I resorted to testing other distance
measures in order to confirm my initial choice. Distance measures less refined than Burrows' Delta,
which favour the top few MFW heavily over the rest, showed decent results, sometimes finding more
authorship effects than my analyses with Burrows' Delta. In these tests the Wilhelmus was mostly attributed to
Marnix, but we should see this attribution consistently over varying parameters and corpora before considering it
an indication. At this stage it's also too early to conclude that Burrows' Delta is ineffective under these research
circumstances.
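To make the comparison of distance measures concrete, the following is a minimal sketch of how Burrows' Delta
is commonly defined: the mean absolute difference between the z-scores of the most frequent words (MFW) of two
texts. The function name burrows_delta, the toy corpus and the use of the population standard deviation are
assumptions made for illustration; the sketch is not the implementation used for the analyses in this thesis.

```python
from collections import Counter
from statistics import mean, pstdev


def burrows_delta(corpus, n_mfw=100):
    """Return Burrows' Delta for every ordered pair of texts.

    corpus: dict mapping a text name to its list of word tokens.
    """
    # Most frequent words, counted over the whole corpus.
    totals = Counter()
    for tokens in corpus.values():
        totals.update(tokens)
    mfw = [word for word, _ in totals.most_common(n_mfw)]

    # Relative frequency of every MFW in every text.
    rel = {}
    for name, tokens in corpus.items():
        counts = Counter(tokens)
        rel[name] = [counts[word] / len(tokens) for word in mfw]

    # z-score each word across the corpus, so that less frequent words
    # weigh as much as the top few (population sd is an assumption here).
    z = {name: [] for name in corpus}
    for i in range(len(mfw)):
        column = [rel[name][i] for name in corpus]
        mu, sigma = mean(column), pstdev(column)
        for name in corpus:
            z[name].append((rel[name][i] - mu) / sigma if sigma else 0.0)

    # Delta is the mean absolute difference between two z-score profiles.
    return {
        (a, b): mean(abs(x - y) for x, y in zip(z[a], z[b]))
        for a in corpus
        for b in corpus
    }


# Hypothetical toy corpus: texts "a" and "b" use words in similar proportions.
toy_corpus = {
    "a": "de de de en en het".split(),
    "b": "de de de en en het de en".split(),
    "c": "het het het het en de".split(),
}
distances = burrows_delta(toy_corpus, n_mfw=3)
```

Because every word is standardized before the differences are taken, no single high-frequency word dominates the
distance. Measures that skip this standardization lean more heavily on the top few MFW, which is the contrast
drawn in the paragraph above.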
In conclusion: the Wilhelmus doesn't make any strong connection or show a consistent pattern of attribution,
and therefore doesn't show any believable stylistic affiliation with the authors of my corpus. Possibly
the author is not present, or not represented correctly because there's too little text by him or only text with a
very specific, non-typical signal. It could also be that the Wilhelmus itself has atypical stylistics; this would also
explain why it moves around so much.
This leads to the following answers to the hypotheses:

There is, according to the results of my analyses, a stylistic fingerprint based on authorship, or a
stylistic authorial signal, present in the texts of my corpus. This hypothesis is accepted.

The Wilhelmus shows consistently more stylistic similarity to the texts of one particular author in my
corpus, than to the texts of other authors, and the texts of this author consistently show an authorial
signal by expressing similar style. This hypothesis is rejected.

The Meertens corpus


I received the texts of my third corpus from the Meertens Instituut, hence the name, at my own request. The
Meertens corpus contains only texts that are retrievable from the DBNL, so most of them are also included in
my DBNL corpus, but they were prepared differently, by Erik Tjong Kim Sang of the Meertens Instituut. The texts
are delivered in a different type of XML, Folia, and also in different units, meaning by song as well as by complete
work or book. This led to the construction of the following three subcorpora:
1. XML works corpus: a corpus consisting of complete works, delivered in Folia XML format and analyzed in R.
2. XML parts corpus: a corpus consisting of parts of the works that are equal in terms of size, not in terms of
number of words, delivered in Folia XML format and analyzed in Gephi.
3. TXT songs geuzenlied corpus: all songs of Een nieu GeusenLieden Boecxen, the 1583 edition, in txt format. I'll
analyze them in R and perform a network analysis in Gephi.

XML Works corpus


The graph of analysis 1.0 shows one big division and three big clusters, so big effects as well as smaller
effects. This is to be expected, as we are dealing with a limited number of big texts, which means a readable, not
too crowded graph composed of texts that are long enough to communicate their stylistic characteristics. It also
confirms that the Folia format suits R.
The big split pushes Marnix's Bienkorf, Coornhert's Corte berispingen, of de sielen, twee sprake and Willem van
Oranje's apologie into one cluster, containing only prose, on one side of the graph. The other two clusters, on the
other side of the graph, are Haecht's psalms with Marnix's boek der psalmen, Coornhert's Roerspel en comedies
and also the geuzenliedboek in cluster one, and the two texts of Van der Noot with d'Heere's Boomgaart, Hout's
speech, Voort's Joseph, een historiaalspel, the Bloemlezing of Marnix, the anonymous beclagh van jongheer Jan
van Hembyse and d'Heere's beschrijving van de prince in cluster two. Cluster one contains psalms and plays,
while cluster two contains mainly texts that are full of noise and have fewer authorship effects.
The effects of prose and poetry, the clustering of psalms, authorship signals and noise are all present in
this graph. We've encountered these effects under different conditions, formats, parameters and corpora. It
seems that the XML Folia format and the Meertens corpus are useful for analysis.
With analysis 1.1, I confirm the effect of a sample size of 550 words, roughly the length of the Wilhelmus. As
you can see, the quality of the graph suffers, because many of the texts that were already very noisy now refuse to
make any connections at all. What remains are the authorship effects and some type effects, meaning prose or
poetry. Both effects have already proven to be quite potent at times.
If I enlarge the sample size to 1500 words, for analysis 1.2, some of the effects found in analysis 1.0
return. Coornhert's prose now clearly clusters with Van der Noot's texts, and d'Heere and Hout join as well. A
sample of 1500 words is strong enough to pair stylistically similar texts but doesn't show the more delicate or
perhaps unreliable effects.
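The idea of reducing every text to a sample of a fixed size can be sketched as follows. The sketch assumes that a
sample is a random bag of words drawn without replacement from the tokens of a text; the function name
bag_of_words_sample and the seed parameter are hypothetical and only serve the illustration.

```python
import random


def bag_of_words_sample(tokens, sample_size, seed=None):
    """Draw sample_size tokens at random, without replacement.

    Word order is lost, so only word frequencies survive in the sample.
    """
    if sample_size > len(tokens):
        raise ValueError("text is shorter than the requested sample size")
    return random.Random(seed).sample(tokens, sample_size)


# Hypothetical text of 2000 tokens, reduced to the two sample sizes used here.
tokens = [f"w{i % 40}" for i in range(2000)]
small_sample = bag_of_words_sample(tokens, 550, seed=1)
large_sample = bag_of_words_sample(tokens, 1500, seed=1)
```

A smaller sample gives a noisier estimate of the underlying word frequencies, which is one way to read the loss of
effects at 550 words and their partial return at 1500 words.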
I'm now selecting and adding to my corpus the Wilhelmus with the code _nie096nieu01_01-0018, which is part
number 18 and song 23 of Een nieu GeusenLieden Boecxen. This version of the anthem is a different one from the
one I've included so far, and it's from another source; it has exactly the same text and spelling, but is delivered in
another format. Also, for this version, I'm not editing out the introduction.414 The corpus now has 18 texts.
After I performed the same analyses as those on the previous corpus, we get exactly the same graphs,
with the Wilhelmus present but not making any significant connections or causing other texts to reconsider their
positions. The Wilhelmus remains stylistically undefined with these parameters and on this corpus. Of course, the
Wilhelmus is many times shorter than any other text in the corpus. When using bag-of-words
sampling at a sample size of 550 words, the graph loses most of its effects again. We see the geuzenliedboek
cluster with Coornhert's work, an interesting result, but looking at the other effects, and especially the lack of
them, this result should be treated with caution. This means that I cannot make any claims about the Wilhelmus
based on this corpus.
The few analyses performed on this corpus give some confirmation that this format works, and also show the
limitations of such a noisy and unbalanced corpus.

XML parts corpus


Although there are in-text illustrations for this subcorpus, the reader should at this point still open one or both
of the following visual representations of the data in order to comprehend the results: the document Parts gephi 3,
which is a Gephi project file, and/or Graph 2. Meertens corpus parts, which is a pdf file.
For the interpretation of the analyses and the Gephi visualization of the parts corpus, I switched between
distant reading and close reading. This means that, when I encounter striking or surprising results during a
distant reading of the Gephi visualization, identifying the parts and analyzing their text and context can offer an
explanation for their representation in the graph.

414 Corts na dat Graef Lodewijck van Groninghen op ghebroken, ende van Groeninghen verdreuen was is de Prince van
Oraengien na de mase ghetoghen. Een nieu Christelijck Liedt gemaect ter eeren des Doorluchtigsten Heeren Wilhelm
Prince tot Oraengien, Patris Patriae mijnen G. Vorsten ende Heeren. Waer van deerste Cappitael letteren van elck Vers
zijner V. G. Name meebrengen. Na de wijse van Chartres.

Figure 1: Parts gephi 1


A first look at the graph shows that there's indeed cluster formation and that these clusters are predominantly
sorted by author, although we also see more than one cluster for some authors. This means that some texts are
more stylistically similar than others, and that a large part of the similarity is due to the texts coming from
the same work or being by the same author. In other words, these characteristics of a text, its author and the work
it belongs to, partly determine its style as measured by the relative frequencies of the most frequent words, and
this style can be captured with my methods.
Looking at the results, the eye-catching feature is the bright green cluster (in the Gephi file at least) in the upper
part of the graph: Van der Noot's Bosko, isolated in a sphere and showing no connections to parts of texts other
than its own, not even to texts from any of the other works of Van der Noot, which fill up the whole right side of
the graph. The Bosko cluster consists of texts that need to be excluded from the analyses, like references and other
paratext. Another Van der Noot island, caza, the code for lofzang op braband, to the right of d'Heere, also consists
entirely of paratextual elements, this time of his lofzang op braband.
The biggest Van der Noot cluster, the one shaped like a half moon on the right of the graph, consists of the actual
text of all the different works of Van der Noot that I've included in this corpus. This is the cluster that contains his
writing, instead of only noise and paratext, and so it is a representation of Van der Noot's writing style.

Figure 2: Van der Noot texts

If we stay with Van der Noot for further analysis of the graph, we can analyze the connections it makes with texts
by other authors. An outlier of the Van der Noot cluster goes via Voort, a single brown text, all the way to the left
side of the graph, where the texts of Coornhert and Marnix reside, along with the anonymous texts. Other texts that
connect to Van der Noot are those of Hout, the orange lamp-shaped cluster hovering in the middle of the
graph, above all the other texts, not counting the isolated Van der Noot sphere. Following the outliers of the Hout
cluster, we move again to the left side of the graph, to Haecht, the bright blue island to the right of the dark
blue d'Heere cluster. This is where the small cluster of Van der Noot's paratexts, which I've already mentioned,
resides. As you can see in figure 3, the Van der Noot noise cluster has several connections with the dark blue
d'Heere texts, one orange connection to the upper right of the graph going to the Hout lamp, and one thick thread
going to another Van der Noot text just outside figure 3, which has some connections to the anonymous texts.
Haecht's Psalms is the only text by Haecht included, albeit in many different parts. His psalms make connections
to d'Heere's texts but avoid the paratext of lofzang op braband.

Figure 3: Haecht, Van der Noot, d'Heere

Another example of noise manifesting itself in the graph is the one isolated text, again of Van der Noot, close to
the yellow string of Marnix texts. When close reading the part coded as boko01_01.0256, I find that it's a footnote
and a part of Tot den leser, meaning 'to the reader'. This explains why a text would move away from its author to
the other side of the graph: this text is not by Van der Noot. We see, as concluded before, that
the right wing represents Van der Noot's style and that the other clusters, consisting of text not written by Van der
Noot but included under his name, are not representative of Van der Noot's style. The opposite is not true: not all
the texts in the green moon of Van der Noot, shown in figure 2, are clean texts. The parts in the upper part of the
wing, some of which connect to Hout's texts, turn out, when examined individually, to be half noise and
half actual text. This explains their scattering and stretching across the upper right side of the graph.
I've started by reporting my findings on the Van der Noot texts for two reasons. First of all, Van der
Noot has been included with a lot of texts of a different nature, often problematic, and his texts need the most
interpretation and analysis of all the authors included in my corpus. Secondly, analyzing Van der Noot shows the
reader the possibilities of my methods. After the coverage of Van der Noot it's clear that these methods and these
visualizations can pick up noise, and that the noise will even cluster, suggesting that it is stylistically similar in
some way. It also shows us authorial signals, effects based on stemming from the same work, and the overruling of
the latter effect by the former. I'll now discuss the other results at a higher pace.
The strongest author effect is that of Coornhert, whose four texts cluster together in a wide horizontal cluster,
which is pretty compact when you realize that prose as well as poetry is included. Within the authorial cluster,
Coornhert also clusters per book, especially Corte berispingen. We see the same for Marnix, who is included
with three different books, but whose many different parts manage to form a long diagonal cluster, again while
clustering per work as well. The texts of the Bienkorf cluster and those of the psalms, especially, are in close
contact.
These clusters per work within the author clusters can be interpreted as showing that authors tend to write in
a slightly different style for each work, probably because of genre, topic, type or personal change, but it can also
mean that the influence comes from effects I don't want to measure, like spelling variation, different editions or
alterations made by the publisher.
Willem van Oranje's texts cluster, in purple, just above the red Coornhert texts. His apologie makes
strong connections with the prose of both Marnix and Coornhert. See figure 5. I conclude that this is an effect
based on type, although analysis on the specialized corpus occasionally showed similar results.

Figure 5: Willem van Oranje connects Marnix to Coornhert

Lucas d'Heere is present with two texts, the poetry of den boomgaard vd poezie and the prose of beschrijvingen
van den prince. In the Gephi file you can see how they form a cluster in which the major part of the cloud consists
of texts of den boomgaard vd poezie, except for the smaller cluster close to the purple anonymous texts, which is
beschrijvingen van den prince. I cannot explain the connections the d'Heere prose makes with one text in
particular, which upon examining the files turns out to be the anonymous het beclach van joncheer jan van hembyse.
Further research could consider Lucas d'Heere as a potential author of this text, but I've got no clue whether this
is even historically possible. The connection is, however, conspicuous.
The geuzenliedboek accounts for the biggest part of the big purple cloud in the rightmost cluster. Somehow it
forms a lot of connections with the other anonymous text, het beclach, although this is not necessarily logical.
Whereas the geuzenliedboek is written by several different authors, het beclach should represent, at least to some
extent, one voice. Its style could have resembled that of any author included in the corpus, and need not have
grouped together with a bundle of mostly anonymous songs. The other text the geuzenliedboek forms connections
with is Coornhert's Corte berispingen. The Wilhelmus is in the middle of the geuzenliedboek cluster among its
peers, leaving little to interpret.

TXT songs of geuzenlied corpus


The reader should at this point open one or more of the following visual representations of the data in order to
comprehend the results: the document geuzenliedboek songs gephi, which is a Gephi project file, and/or the pdf
files geuzenliedboek songs pdf 1, geuzenliedboek songs pdf 2 and geuzenliedboek songs pdf 2.1 + namen.
I received the 96 songs from Een nieu GeusenLieden Boecxen,415 the 1576 edition, in 117 txt files. I deleted
29 parts that were extremely small, of which 21 parts were metatext and 8 were very short songs. The Wilhelmus
is the 23rd song and is included, along with the Wilhelmus text I've used in my own personal corpora. This left me
with 89 text files that included only the song text and no paratext. Subsequently I deleted all texts significantly
smaller than the Wilhelmus, because these are presumed to be too small to analyze. This leaves me
with 62 texts. Some of them could be identified by their personalized subscript, for example Pieter
Sterlincx with his subscript Castijt zonder verwijt,416 meaning 'Punish without blame'. I did a Gephi network
analysis and got the following results.
The Wilhelmus without introduction, that of my specialized corpora, interacts most strongly with the other
Wilhelmus, delivered by the Meertens Instituut, but also shows stylistic similarity with Voort, Van Damme and the
anonymous songs 72 een nieu liedeke and 47 miraculeuze ontsetting van leyden. The Meertens Wilhelmus, on
the other hand, makes stronger connections with the anonymous songs 89 een neu liedeke, 88 een neu liedeke,
78 een neu liedeke, 82 christofora fabricio and 83 hier volgt een liedeke than with the Wilhelmus without
introduction. Voort and Van Damme both belong to the six biggest songs of my corpus and cluster with the other
four biggest texts: 47, 72, 84 and 99.
Another notable effect is the drifting away of the smallest Reael song from the other Reael songs, which
do cluster. This cluster is filled with very short texts, including the familiar helpt nu dijzelf dan helpt ge god,
coded as _nie096nieu01_01-0036, the only song below 4kb besides the drifting Reael song.
I further analyse this corpus with R. I don't sample, because the biggest song is only 1065 words, twice the length
of the Wilhelmus. After three analyses, 1.0, 1.1 and 1.2, I got the following results.
Graph 1.0 is a mediocre graph in which most of the authorship effects have disappeared. The two
Wilhelmus files are not nearest neighbours. We see the same relations as in Gephi, where the Meertens Wilhelmus
interacts most strongly with texts 88, 89 and 78. The Wilhelmus without the introduction does not show
any strong connection. I've already speculated about possible size effects. If we look at the size of the texts
again, the 8 biggest songs cluster in a tree. We also get the same results regarding Reael, who is surrounded by
small texts. There's an alternative explanation. An analysis of the structure of the geuzenliedboek 417 shows that
the songs are not included in the book in chronological order, but categorized by theme. This would mean that at
least songs 88 and 89 are highly suspect of sharing stylistics based on their shared topic.

415 http://www.dbnl.org/tekst/_nie096nieu01_01/
416 Gustaaf Asaert De val van Antwerpen en de uittocht van Vlamingen en Brabanders. 1585 Lannoo, Tielt, 2004. p186
417 Heeroma 1985.

In graph 1.1, which shows the results of an analysis using character 3-grams, both versions of the
Wilhelmus do cluster. Voort and Van Damme cluster again but let go of the other big texts. The character 3-grams
provided a greater number of features and cancelled out some of the size effects. There's again no authorship
signal for Reael. The graph of analysis 1.2, using character 4-grams, shows a combination of analyses 1.0 and
1.1, and we see the influence of a shortage of features. These results all make sense and offer methodological
insight; on the other hand, they don't offer much insight into the Wilhelmus.
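Why character n-grams yield more features than words on a short text can be shown with a small sketch. The
function name char_ngrams, the lower-casing and the whitespace normalisation are assumptions for the
illustration, and the example line is a commonly quoted spelling of the opening of the Wilhelmus rather than the
exact text of the corpus files.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams, spaces included."""
    normalised = " ".join(text.lower().split())   # collapse whitespace
    return Counter(
        normalised[i:i + n] for i in range(len(normalised) - n + 1)
    )


line = "Wilhelmus van Nassouwe ben ick van Duytschen bloet"
trigrams = char_ngrams(line, 3)
fourgrams = char_ngrams(line, 4)
word_types = set(line.lower().split())
```

Every character position starts a new n-gram, so even a handful of words produces dozens of features, whereas the
same line contains only a few distinct words. Longer n-grams repeat less often, which is consistent with the
shortage of features observed for the 4-grams.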
The analyses show how much impact a minor alteration, like removing an introduction, has on the Wilhelmus
specifically and on very short texts in general. The shortage of features makes analyses at word level
inappropriate. Interpretations that rely on text size to explain the graphs are far more believable than
the alternative of topic I presented. However, neither theory has been tested sufficiently on this particular corpus
to be accepted or rejected. What's clear is that a text size of 550 words is insufficient for these types of analyses,
and I advise striving for a text size of 1500 words. The Wilhelmus shows little stylistic kinship with the other
texts. The songs that showed some stylistic resemblance to the Wilhelmus, if any, and that are important for
further research, computational or otherwise, are songs 88, 89 and 78.

Conclusions Meertens corpus:


The results I've seen on the Meertens corpus are in line with the results on the other corpora, this time in another
format and on yet another corpus. The major difference in the analyses on this corpus was that I was able to
perform them on complete books as well as on their individual songs. For the geuzenliedboek specifically, this
meant that I was able to analyze 62 of its songs, in the hope that the Wilhelmus would show a strong resemblance
to some of its peers, perhaps indicating a common authorial signal. This wasn't the case. Methodologically, the
necessity of a clean corpus for the analysis of very short texts became abundantly clear with the inclusion of two
slightly different versions of the Wilhelmus. With the Gephi visualisation especially, some major forms of noise
were laid bare.
We had two types of format, TXT and XML Folia, and two types of units, parts and works. The songs, the
third unit I've used, were actually the parts delivered in txt format by the Meertens Instituut, which I further
prepared by cleaning out all the metatext and other paratext.
The results again showed effects of prose and poetry and some indications of separate effects for
psalms. We saw authorship signals on several occasions, and we saw different situations where they didn't
establish themselves. We've already seen all these effects across different analyses, parameters, formats and
corpora, and this gives me the confidence to assume that they are present and measurable, and that my methods
and corpora are good enough to bring out authorship relations, among others.

Texts or samples with a text size of 550 words or less disturb the analyses, and a text size of at least
1500 words is advised. However, some effects, predominantly authorship effects and effects of same work, remain
visible at a text size of only 550 words. When we use character n-grams, thereby creating more features to
analyze and cancelling out some of the negative effects of small size, there are more correct
attributions, more distinct results and more usable graphs. Culling seems less important when using character
n-grams, perhaps because of the loss of semantics.
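Culling is not defined in this section. The sketch below assumes the sense common in stylometry: a feature is
kept only if it occurs in at least a given percentage of the texts. The function name cull and the toy texts are
hypothetical, and the sketch is not the setting used in the analyses reported here.

```python
from collections import Counter


def cull(texts, culling_percent):
    """Return the features that occur in at least culling_percent % of texts.

    texts: dict mapping a text name to its list of features
    (words or character n-grams).
    """
    document_frequency = Counter()
    for features in texts.values():
        document_frequency.update(set(features))  # count each text once
    needed = culling_percent / 100 * len(texts)
    return {f for f, df in document_frequency.items() if df >= needed}


# Hypothetical toy texts: only "de" occurs in all three.
toy_texts = {
    "a": ["de", "en", "prince"],
    "b": ["de", "en"],
    "c": ["de", "god"],
}
shared_by_all = cull(toy_texts, 100)
shared_by_half = cull(toy_texts, 50)
```

Culling removes features tied to a single text or topic. Character n-grams are spread over many different words,
so far fewer of them are specific to one text, which may be part of the reason why culling matters less for them.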
The Wilhelmus didn't seem to share any stable stylistic resemblance with the other texts that we could
interpret as an authorship signal. If it showed any relation at all, the connections were very weak. The Wilhelmus
didn't cluster with other parts of the geuzenliedboek, as some other very short texts did, although we did find such
effects, in other corpora as well.
The fact that the Wilhelmus didn't cluster with other texts, even those of the geuzenliedboek, and not even
in a consistent manner with itself, is worrisome. If it can't even be found to resemble itself stylistically, it won't
connect to other texts by its author, if they're even included. The connections the Wilhelmus made, to songs 88,
89 and 78, should be further explored.

Conclusions
I'll report here the general conclusions of all my analyses. I will be more concise than in the individual conclusion
sections, for the sake of the pace and size of my thesis. If you'd like a more extensive report, please go to the
conclusion section under each analysis, or at the end of a hypothesis section.

Enumeration test conditions and test results


It was clear from the start that, in general, texts by the same author showed stylistic resemblance that could be
grasped with my methods, brought out by my choice of parameters and features, and visualized with my choice
of visualization programs. From the first set of analyses on the first DBNL corpus onwards, I managed to generate
results that signalled enough clear, correct and logical effects to hope for an actual authorship attribution. We saw
the two household names of Wilhelmus research flock together, forming colourful clouds in the Gephi graph, most
of the time even forming separate wings for poetry and prose.
At the same time it was clear that characteristics other than authorial signal also made their stylistic
mark. This meant not only that I could distinguish more interesting textual characteristics and saw possible room
to ask some additional stylometric questions, but also that I had to be wary of these effects possibly eclipsing the
effects I actually wanted to bring out. Depending on the design choices I made, I could make the
Wilhelmus cluster based either on genre or type effects or on authorship effects. The researcher has to be
conscious of his influence and be able to understand which effect is actually caught. A strong effect that at times
pushed the authorial signal into the background was the stylistic resemblance of texts from the same work or book,
like the geuzenliedboek.

While the strong effects were most of the time obvious and consistent enough to be interpreted, small and
less obvious effects were harder to pinpoint. In some cases, and this is definitely an influence when performing a
distant reading of such a large corpus, when you've got little semantic and contextual information about each text
individually, you might miss or misattribute unexpected and/or weaker effects.
I managed to see and explain, from the first graph on, the cluster of psalms and other religious texts as
topical. What I didn't see, and probably still wouldn't have known if my supervisor Prof. dr. Els Stronks hadn't
pointed it out to me, was the tendency of authors who were innovators of language, like Van der Noot and Lucas
d'Heere, to connect with each other.
For every apparent effect there was noise. A lot of the results don't make sense and a lot of the texts
did not fulfill my expectations. In many cases I don't have an explanation for this.
The bird's eye view gave me very little on the author of the Wilhelmus, but it gave me enough indications that the
corpus I was working on had enough stylistic information to perform successful analyses on. Up until that point,
my confidence in finding stylistic signals in my corpus was based on theory, previous research and a preliminary
experiment I performed during an internship in Poland a year ago.
The main part of my research consisted of the analyses on the three corpora that were assembled and
constructed in order to answer a particular set of hypotheses that would lead me to answering my main research
question. The bird's eye view was a big factor in the decision-making phase of the construction of these
hypotheses.
The first two corpora, of the three that were designed for specific hypotheses, were meant to give me information
on the Wilhelmus, as well as to determine other effects that weren't authorship effects, in order to map all the
signals I had to consider, cancel out or compensate for when performing authorship attribution. These corpora
would give me clarity on effects of language and dialect and on effects of type or genre. In addition, finding
a language signal or a genre signal in the Wilhelmus would be a discovery in its own right and could provide us
with information that might bring an authorship attribution closer. As explained in the theory, answering one
question about the Wilhelmus can lead to answering another.
As reported in the results, I found that dialects can produce stylistic effects, but that this has only been
satisfactorily shown for the Dutch texts from, or influenced by, the southern regions of the Netherlands. As stated,
the German hypothesis was not sufficiently tested, and the translation hypothesis differed too much to test on a
corpus with such strong authorship signals and was not relevant enough to construct a new corpus for. There's no
indication that the Wilhelmus has a southern or Flemish origin. Other texts with the southern language signal
weren't stylistically similar to the Wilhelmus, leading me to the conclusion that the Wilhelmus was not written
by an author who has such a language signal in his style, presumably because he is not from those regions.
Secondly, I searched for effects based on genre or type of text. I found that prose and poetry had different stylistic
properties, an effect the bird's eye view had already hinted at.

Another genre effect the bird's-eye view seemed to show was that Psalms have a distinct style, different
from the other songs. The results of the analyses on the genre corpus seem to suggest that this is true, although
the effect wasn't consistent and strong enough for me to draw a conclusion. This shared stylistic signal could be
based on topic, because all psalms are of course deeply religious.
The hypotheses on song vs. poetry and on the six genres were abandoned, as they were
impossible to test on the corpora that I had with the methods I used. These were therefore neither accepted
nor rejected; they remain unanswered.
The third corpus, assembled for the sake of a specific hypothesis, was meant to answer the
main question: who is the author of the Wilhelmus? This question will remain unanswered. Neither this corpus nor the
other corpora could give me an answer.
From the beginning, even when testing for other characteristics, authorship signals were present. These became
clearer with the further preparation of the author corpora and the fine-tuning of the parameters. In all corpora
and sub-corpora, in three formats (txt and two types of XML (FoLiA)) and in three types of units (texts, parts
and works), we saw authorial signals. They were the strongest and most consistent effects, present for almost all
authors and a large majority of the texts. The deviations were predictable and/or explainable.
Their consistency and strength over different analyses, parameters, formats and corpora, in combination
with my secondary literature and extratextual information about the texts, gave me the confidence that authorship
signals are present and measurable. These methods are good enough to identify and visualize stylistic effects
and relations between texts based on authorship in my corpus.
The Wilhelmus didn't show a clear, consistent resemblance to any other text, nor to
any author. This is true for my own constructed corpora as well as for the Meertens corpus. The Wilhelmus made few
connections, and those it did make were unstable. The two texts to which the Wilhelmus showed the most stylistic
affiliation, one by Coornhert and one by Marnix, were atypical for their authors. In addition, there's the
obvious problem that they are by two different authors. Based on the results of the analyses on the corpora
specifically designed to answer the author question, I conclude that it's very unlikely that Lucas d'Heere, Jan van
der Noot or Johan Fruytiers wrote the Wilhelmus. All three authors showed very strong authorship
signals and are represented by a body of work that I believe adequately captures the author's style,
but the Wilhelmus didn't show any stylistic resemblance to them.
Datheen, Sterlincx, Willem of Orange, Utenhove, van Damme, Haecht and Voort were not sufficiently
tested to be ruled out. The texts by Reael, which in this respect behaved a lot like the Wilhelmus, showed no strong
signals at all in several corpora and under several different parameters: neither authorship signals nor effects
based on the fact that both Reael texts come from the same book of songs. They didn't show any systematic stylistic
resemblance to the Wilhelmus either. For whatever reason, the included Reael texts did not produce an authorship
signal, so Reael can't be excluded either.

Coornhert and Marnix, the two big names in traditional Wilhelmus research, showed the most stylistic
affiliation with our national anthem. As far as my design is concerned, my results and their presence in the
Wilhelmus canon are, as far as I know, unrelated: I've strived to treat and include texts of probable and improbable
authors with the same importance and care. As I've already mentioned, however, the availability of texts by Coornhert
and Marnix was above average, so the body of texts to choose from was larger, more diverse and better balanced than
for others. In this way their status could have influenced their stylistic connection to the Wilhelmus.
On the other hand, there were other authors present, some of them with a big and balanced body
of work to choose from, that showed less stylistic resemblance to the Wilhelmus than Coornhert and Marnix did. Still,
even the connections to these two authors were weak and inconsistent. With minimal variation of the
parameters, the Wilhelmus shifted from one author to the other, to some third author, or to none at all. I can't
draw any conclusion from these results regarding the author of the Wilhelmus.
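The instability described above can be made explicit by recording, for every parameter setting, which candidate is the nearest neighbour of the test text, and then counting how often each candidate wins. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the distances are invented for illustration; they are not results from this thesis.

```python
from collections import Counter


def nearest_neighbour(distances):
    """Return the candidate label with the smallest distance to the test text."""
    return min(distances, key=distances.get)


def attribution_stability(runs):
    """runs maps a parameter setting (e.g. the number of MFW) to {candidate: distance}.
    Returns the most frequent nearest neighbour and the share of runs it wins."""
    winners = Counter(nearest_neighbour(d) for d in runs.values())
    candidate, wins = winners.most_common(1)[0]
    return candidate, wins / len(runs)


# Invented distances, for illustration only (not results from this thesis).
runs = {
    100: {"Marnix": 1.10, "Coornhert": 1.05, "Datheen": 1.30},
    200: {"Marnix": 0.98, "Coornhert": 1.02, "Datheen": 1.25},
    300: {"Marnix": 1.04, "Coornhert": 1.07, "Datheen": 1.01},
    400: {"Marnix": 0.95, "Coornhert": 1.00, "Datheen": 1.20},
}
candidate, share = attribution_stability(runs)
print(candidate, share)
```

A share close to 1 would indicate a stable attribution; a winner that takes only half of the runs, as in this toy example, is the kind of shifting behaviour described above.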
Zooming in on the texts of the Meertens Institute corpus with which the anthem made these brief
connections, they turned out to be stylistically unreliable, mostly showing poor stylistic signals themselves,
probably because of limited text size. The Wilhelmus didn't cluster consistently with other parts of the
Geuzenliedboek and even had trouble finding another version of itself. This led me to the conclusion that
somehow, the Wilhelmus shows very poor results under the scope of the research I've performed. A critical analysis
of the text itself is in order.
It's obvious by now that the size of a short text, let's say 550 words, has a huge influence on its
attribution. Short texts are harder to attribute: they make weaker connections, so when the parameters shift
they move around the most, making them unreliable. However, there are many short texts included in my corpus
and they often do connect with texts of the same author or book. When I compensated for text length by sampling or
by making very specific corpora, in order to cancel out the size effects that I thought made the Wilhelmus so
resistant to attribution, the anthem's stylistic isolation remained. When I used character n-grams, thereby
creating more features to analyze, cancelling out some of the size noise and creating more distinct
results for short texts, the results for the Wilhelmus were the same. Authorship effects are present and visible in
other texts, even with sample sizes of only 550 words and even if the corpus is unbalanced in text size. The
Wilhelmus, however, gives us no consistent and convincing signs. This means that its below-average
performance was not entirely due to text length.
Based on PCA analyses, which gave me insight into the components of the Wilhelmus and their relation to
the components of other texts by Coornhert and Marnix, the Wilhelmus seems to possess some stylistic qualities
that set it apart from all the Marnix and Coornhert texts in the analysis.
To squeeze out further results, I changed some core parameters of my design and reconsidered my distance
measure. With other distance measures that heavily favour the top few most frequent words (MFW) as features, texts
by Marnix were most of the time the nearest neighbours of the Wilhelmus. However, following the theoretical
argumentation of this thesis, all of my analyses were designed around Burrows's Delta. I consider the number of
tests with other distance measures too small to give a decisive verdict on these measures, even
though their attribution seems at least plausible. It is an interesting idea for further research.

Topical sub questions


I've formulated the following two main research questions: Who is the author of the Wilhelmus? Can the
complicated real-world authorship attribution case of the Wilhelmus be solved with the methods of quantitative
analysis and the tools of computational literary studies? Before answering these, I'll answer my topical sub
questions in order to come closer to answering my main questions.

Can my methods detect language signals? Yes.

Can my methods detect genre effects? Yes.

What language effects does the Wilhelmus signal? None detected.

What kind of genre effects does the Wilhelmus signal? Inconclusive.

Which candidate author do my quantitative stylistic analyses point to or support as the author of the
Wilhelmus? The results are inconclusive.

Which candidate author do my quantitative stylistic analyses rule out as the author of the Wilhelmus?
The quick answer is Lucas d'Heere, Jan van der Noot and Johan Fruytiers, but the right answer
to this question depends on the explanation that's given for the absence of a stylistic authorship
signal in the Wilhelmus. A more elaborate answer regarding the ruling out of possible authors will be
given later on.

Who is, or is more likely to be, the author of the Wilhelmus, Marnix van Sint-Aldegonde or Dirk
Volkertszoon Coornhert? The results are inconclusive.

I've also asked some methodological questions, which I can try to answer now that I've answered, or failed to
answer, the topical questions. However, answering the methodological questions requires a lot more debate on the
nature of the results. While a successful author attribution would answer my questions on the possibility of
attributing an author with a quick confirmation, a failure to attribute does not always mean these methods were
insufficient. Therefore some of my expectations and opinions on the matter will be discussed in the discussion
section. I'll also discuss some other possible reasons for the failed authorship attribution, and I'll question
whether we can even speak of a failed authorship attribution. However, these possible explanations go no further
than substantiated speculation, and therefore I will not discuss them here, but in the discussion section.

Methodological conclusions
Let's first sum up the methodological knowledge acquired in this thesis. As I've shown, the methods used in this
thesis (my choice of low-level features in combination with Burrows's Delta, applied to a multilevel categorization
task) can measure and bring out a diverse palette of effects that in some cases proved to be strong,
consistent and therefore believable. These effects include language effects, genre effects and authorship effects,
among others. Graphs made in either Gephi or R often showed large, corpus-defining effects or small, delicate
effects, subdivisions and details, and sometimes both. The different formats all performed well.
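For readers unfamiliar with the distance measure named above, the following is a minimal sketch of Burrows's Delta in its common textbook form: relative frequencies of the most frequent words are converted to z-scores, and Delta is the mean absolute difference between the z-scores of two texts. It is not the code used for the analyses in this thesis. The function names and the toy tokens are assumptions made for illustration, and the population standard deviation is used for simplicity.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_frequencies(tokens):
    """Relative frequency of every word in one text."""
    counts = Counter(tokens)
    return {word: n / len(tokens) for word, n in counts.items()}


def burrows_delta(corpus, test_label, n_mfw=100):
    """corpus maps a text label to its list of tokens.
    Returns {label: Delta between that text and the test text}."""
    # 1. Select the n most frequent words (MFW) over the whole corpus.
    totals = Counter()
    for tokens in corpus.values():
        totals.update(tokens)
    mfw = [word for word, _ in totals.most_common(n_mfw)]

    # 2. Turn the relative frequencies into z-scores, word by word.
    freqs = {label: relative_frequencies(t) for label, t in corpus.items()}
    z = {label: {} for label in corpus}
    for word in mfw:
        column = [freqs[label].get(word, 0.0) for label in corpus]
        mu, sigma = mean(column), pstdev(column)
        for label in corpus:
            value = freqs[label].get(word, 0.0)
            z[label][word] = 0.0 if sigma == 0 else (value - mu) / sigma

    # 3. Delta is the mean absolute difference between two z-score profiles.
    return {
        label: mean([abs(z[test_label][w] - z[label][w]) for w in mfw])
        for label in corpus
        if label != test_label
    }


# Toy tokens, invented for illustration only.
corpus = {
    "unknown": "de de de het een".split(),
    "author_a": "de de de het een".split(),
    "author_b": "de het het het een een".split(),
}
deltas = burrows_delta(corpus, "unknown", n_mfw=3)
print(min(deltas, key=deltas.get))
```

The text with the smallest Delta is the nearest stylistic neighbour of the test text. On very short texts the z-scores rest on only a handful of occurrences per word, which is one way to see why small samples give unstable distances.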
One major problem that I anticipated, and which I've discussed several times by now, is the small text size of the
Wilhelmus and the majority of the other texts. Size matters. A researcher should be aware of, and preferably
compensate or control for, small texts and big relative differences in text size, which become more pressing as the
absolute text size shrinks. Small text sizes have a negative effect on the results, and big differences result in
signals based on text size.
A text or sample of 550 words can pick up strong effects like author signals, but often fails to
show these effects and is incapable of showing more delicate effects. Samples of 550 words are often unreliable
and are likely to give inconclusive results. A larger size is strongly recommended. This means that the Wilhelmus
is far from an ideal text to perform authorship attribution on. Texts that were shorter than the Wilhelmus were
in most cases excluded from the corpus.
A text or sample of 1500 or 1650 words gives much more detailed graphs, showing many subtle effects as well as the
stronger ones. The probability of attribution success and the effects an analysis expresses increase quickly at
first when text size increases, indicating a strong correlation between text size and the quality of the
analysis; but then, above a certain value, a further increase of sample size does not significantly affect the
effectiveness of the attribution.418 It is preferable to have a text or sample larger than 1650 words, because
beyond that size the results still improve substantially.
When performing normal sampling on a corpus that is unbalanced in text size, the texts that are significantly
larger than the others will make up too big a part of the total number of samples and will therefore define the
graph, obscuring smaller effects. A balanced set, in general and specifically in this regard, is necessary.
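The sampling problem described above can be sketched as follows: every text is cut into consecutive samples of a fixed size, and the number of samples a single text may contribute is capped so that long texts cannot dominate the graph. This is one possible way of balancing, not necessarily the procedure followed in this thesis; the function names, the cap of three samples per text and the random seed are assumptions made for illustration.

```python
import random


def fixed_size_samples(tokens, size=550):
    """Split tokens into consecutive, non-overlapping samples of `size` tokens.
    A remainder shorter than `size` is dropped."""
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]


def balanced_samples(corpus, size=550, max_per_text=3, seed=0):
    """corpus maps a text label to its list of tokens.
    Returns {label: list of samples}, with at most `max_per_text` samples per text."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    balanced = {}
    for label, tokens in corpus.items():
        samples = fixed_size_samples(tokens, size)
        if len(samples) > max_per_text:
            samples = rng.sample(samples, max_per_text)
        balanced[label] = samples
    return balanced


# Placeholder tokens, for illustration only.
corpus = {"long_text": ["w"] * 5000, "short_text": ["w"] * 600}
result = balanced_samples(corpus)
print({label: len(samples) for label, samples in result.items()})
```

Without the cap, the long text in this example would contribute nine samples against one for the short text.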
Using character n-grams as features, among others, was a good choice. As explained in the theory, they handled
the minimal text size by creating more features, but character n-grams often showed the best results on analyses
that weren't limited by text size as well. Character 3-grams in particular gave a lot of information and definitely
proved themselves to be an alternative or addition to word-level features. This holds to a lesser
extent for character 4-grams, because in comparison to character 3-grams they showed fewer of the smaller and
weaker effects.
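To illustrate why character n-grams yield more features from the same amount of text, the following minimal sketch counts overlapping character 3-grams in a three-word phrase. The function name, the lowercasing and the whitespace normalisation are assumptions made for illustration, and the phrase is used only as an example string.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams, including n-grams that span a space."""
    text = " ".join(text.lower().split())  # lowercase and normalise whitespace
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


phrase = "Wilhelmus van Nassouwe"
grams = char_ngrams(phrase, n=3)
print(len(phrase.split()), sum(grams.values()))
```

Three word tokens produce twenty character 3-gram tokens here, so a 550-word text offers considerably more observations per feature set than its word count suggests.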

Methodological sub questions


This leads me to answering the methodological sub questions:
1. Can my methods detect authorial signals? Yes.
2. Can my methods eliminate one or more of the usual suspects of the Wilhelmus authorship attribution
case? Yes, if you're willing to rule out the possibility that the Wilhelmus doesn't stylistically resemble
other texts of its author in any way.
3. Can my methods give supporting evidence for one or more of the usual suspects of the Wilhelmus
authorship attribution case? Possibly.
4. Are my methods useful and/or sufficient for authorship attribution for texts of 550 words? No, they're not
sufficient, but incidentally successful nonetheless.
5. What are the current limits of computational stylistics and authorship attribution? This question is too big to
answer here, but it is discussed in several parts of the conclusion and in the discussion section.

418 Eder 2010, 2.

In general I consider my methods sufficient for authorship attribution, as well as for the identification of other
stylistic effects. I think these methods, with these parameters and on these corpora, would have been capable of
identifying the author of the Wilhelmus if the author was present and well represented in the corpus. The other
condition is that the Wilhelmus is representative of the style of its author. I base this assertion on the majority
of texts in my corpus that were correctly attributed to their authors. The corpora were designed to include all
possible authors of the Wilhelmus and to include them in such a way that they would represent their author's style,
as well as carry a variety of other signals. The fact that the Wilhelmus isn't attributed makes me question the
capability of my methods, but doesn't make me dismiss them altogether.
One obvious flaw in my design is the lack of great amounts of text by all possible authors. I pointed out
the necessity of a clean corpus for the analysis of very short texts and substantiated this with my results. I'd
look for improvements in this area before dismissing my methods altogether.

Concluding
With computational methods and quantitative analysis, can we determine, support or dismiss possible
authors of the Wilhelmus? I think it is definitely possible, but in this thesis I did so only sparingly, due to the
difficult circumstances and the difficulty of the tasks, but more importantly because the Wilhelmus, in comparison
with other texts, performed very poorly. This is in itself very interesting and opens a register of debatable and
testable theories about its disguised authorial signal, its reluctance to show effects, and its conception, which I
will discuss in the discussion section.
So, can the real-world authorship attribution case of the Wilhelmus, with all its problems, be solved with
the methods and tools of computational literary studies? With the means currently available to the researcher, I
doubt that an authorship attribution is possible that's scientifically acceptable and will achieve a widely
held consensus in the computational literary field, although there are a lot of options I've considered but not yet
explored, and there are many ways to improve the means of the literary scholar. I'll explain and elaborate in the
discussion.

Discussion and recommendations for future research

So I haven't solved my case, and the reasons why the attribution didn't succeed are unclear, making my answer to
the methodological questions as well as the topical question an "I don't know". Did I fail? I am schooled, next to
the humanities, in the social sciences, where such an answer is much more common. It's actually considered a result,
provided that the research has been set up and conducted in a proper scientific manner, but this is no different
from research that does provide an answer. The researcher of course needs to comprehend and reflect on the
scientific process, describe and interpret it, and this needs to result in suggestions for further research. The
expression is "no results are also results", but this doesn't apply to this thesis, and I'll sum up the reasons why
it doesn't in the paragraphs below. What definitely does apply to my thesis, to authorship attribution and to
experimental research in general is this: failed attribution is better than false attribution.419
The first reason why the social sciences' attitude of "no results are also results" doesn't apply to my thesis
is that, without trying to enforce boundaries between faculties, I'm a Dutch language and literature student
performing research that I myself consider to belong predominantly within the faculty of the Humanities.
Secondly, my cup is not empty, it's half full. I do have results, a lot of them. I drew a lot of conclusions
and answered a lot of questions and hypotheses and, more importantly, part of them are applicable to the tradition
of Wilhelmus research. I excluded some of the potential authors of our national anthem, got some expected
genre indications and a surprising language signal and, most importantly, showed over and over again how the
text seems to have an exceptionally inconsistent and weak stylistic signal. This opens up new registers of
interpretation while making old ones unlikely, or at the very least leaves some old attributions unsupported by
stylistic analysis. I also analyzed a lot of other texts and their style on all sorts of levels and effects, giving
me all kinds of interesting results. There were, for example, the Residu cases, texts of uncertain descent, for many
of which I identified, verified or falsified characteristics. I also performed methodological research and produced
results. Author effects, different kinds of genre effects and different kinds of language effects were confirmed,
along with stylistic similarity of texts coming from the same edition, on a variety of different Dutch texts,
including very short ones.
Still, the core question remains unanswered. In order to gain insight into my design and the anthem's
resistance to attribution and to signalling stylistic effects in general, it's time to discuss and theorize about
the possible reasons why an attribution has not been achieved. I will use this section to reflect further on the
results and theories, and even to speculate on them, in order to recommend future research and express my hopes for
the future of digital humanities.

Text size and available text


Both of the expected major problems, the limited text size and the limited amount of available text, did indeed
turn out to be the biggest stumbling blocks. I was at times unable to sufficiently test some of the
hypotheses and couldn't include all possible authors of the Wilhelmus.

419 Love 2002.


Texts of around 550 words are hard to attribute and only make weak, inconsistent connections, so when the
parameters change they often cut previous ties and/or make new ones, exposing the results as unreliable. This
leads to a lot of noise and a lot of unusable texts. However, authorship effects are
present throughout my analyses, also with text sizes of only 550 words and even if the corpus is unbalanced in
text size. In other words, many other very short texts, just as short as the Wilhelmus, did express a correct
authorial signal. Especially when correcting for text size by using character n-grams, sampling or balancing the
corpus, the possibility of correct attributions of 550-word texts was obvious, but the Wilhelmus did not budge.
Other effects of the Wilhelmus were also either absent (is the Wilhelmus Flemish?) or expressed inconsistently (the
Wilhelmus as prose or poetry and the Wilhelmus as part of the Geuzenliedboek). The performance of the
Wilhelmus was very weak, and actually not very representative of the overall performance of my corpus. Only
texts by Reael seemed to show the same level of resistance to attribution. I conclude that the limited text
size surely didn't help the performance of the Wilhelmus, but can't be the full explanation for its stylistic
indifference.

Reasons for no AA on the Wilhelmus


Although improving the availability of text is my major recommendation to the field of literature, and a
sufficient text or sample size seems to be one of the most important requirements for successful authorship
attribution, my discussion section will not only plead for those two already well-understood necessities. Perhaps
the most interesting result of my analyses is the exceptional position of the Wilhelmus, especially after the
principal component analysis showed its deviation on one of the two dimensions. If text size and corpus cannot
fully account for the absence of attribution, then why doesn't the Wilhelmus make any strong connections, and
why does it seem to underperform in comparison to other short texts? I discuss several options.
If the Wilhelmus carries an authorship signal and the author is present and well represented by his texts
in the corpus, there should be a match. Since there isn't, there are two design-related aspects of my research that
I can blame this on. The first is the corpus: either the author is not present in our corpus,
or he is not represented sufficiently by his texts to signal his (or her, but probably his) linguistic fingerprint.
So we've got two options here, no inclusion or insufficient inclusion, both counting as shortcomings in the corpus
building. The third option is that the Wilhelmus doesn't signal its author's style, or that my methods can't capture
the signal. The only way to solve the Wilhelmus case would then be to solve the shortcomings of the design.

1. Future corpus building


If the fault is in the corpus, we can conclude that Lucas d'Heere, Van der Noot, Johan Fruytiers, Coornhert and
Marnix were all not the author of the Wilhelmus, because they're included, and I assume sufficiently, so that their
body of text signals their authorial fingerprint. The other authors included and all the authors not included are
in this case still an option for authorship.
Further research should focus on repeating the analyses on a corpus that sufficiently represents Reael,
Datheen, Sterlincx, Willem of Orange, Utenhove, van Damme, Haecht, Voort and all other possible authors that
were analyzed but couldn't be ruled out. Texts and possible authors that I haven't included and that should be
considered are Jan Baptist Houwaert, especially the text Milenus clachte, Hendrik Niclaes with his Cantica
Lieder offte gesange and Psalmen unde ledern, and the anonymous song George Lalaing. Other authors that I
haven't even mentioned but still recommend for inclusion, in order to have a corpus as complete as possible,
are Jacob van Wesembeke, Hendrik Geldorp (also known as Hendrik Castricus) and Nicolaas Bruyninck.
Jacob van Wesembeke in particular is an author I would have included if representative text had been available. He
is an interesting case because we know he supported the Protestants and had to flee Holland because of that
in 1567. He was at some point present at Dillenburg, territory and place of exile of Nassau and also one
of the key places Stipriaan based his network analysis on. In addition, Wesembeke was the predecessor of
Marnix as secretary of the prince.420
A lot of small effects or divergent texts can also be sought out by close reading and by studying contextual,
biographical and historical information about the texts. So temporarily stepping back from distant reading
during the corpus building phase can also be an option for improving the corpus, although I've done this already.
We should also consider collecting more editions of the Geuzenliedboek, to see if the Wilhelmus connects
with some of the songs in them and to understand these books and their relations better. I recommend extra
attention for songs 88, 89 and 78 of the 1583 edition of the Geuzenliedboek, because these songs were most closely
related to the Wilhelmus.
If it's a case of the real author being absent, or his true signal being absent because of insufficient corpus
building, a successful authorship attribution is very possible once all the possible authors are sufficiently
represented in a new corpus. This only holds if other work by the author of the Wilhelmus has survived the ages.
The unknown soldier, who only wrote one song and then disappeared from history, can never be verified as the
author. I see no reason yet to abandon the assumption that the poet who wrote the Wilhelmus was a professional
one, so I do not yet accept this as a possibility.
For further research, a possible solution for the imbalance problem that does not throw away texts is
described in the Stamatatos paper,421 where only the n-grams of the unseen text contribute to the
calculated sum. Each term is multiplied by the relative distance of that n-gram's frequency from the
corpus norm. The more an n-gram deviates from its normal frequency, the more it contributes to the distance
measure. This way the measure remains usable in cases where only limited and imbalanced corpora are available for
training.
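As a minimal sketch of the kind of measure described above: the sum runs only over the n-grams of the unseen text, and each term is the squared relative difference between the unseen text and the candidate author, weighted by the squared relative difference between the unseen text and the corpus norm. This follows the description given here rather than a verified transcription of the cited paper, so the exact formula should be checked against the original. The function name and the frequencies are hypothetical.

```python
def imbalance_aware_distance(unseen, author, norm):
    """unseen, author, norm: {ngram: relative frequency}.
    Only n-grams that occur in the unseen text contribute to the sum."""
    total = 0.0
    for ngram, f_unseen in unseen.items():
        f_author = author.get(ngram, 0.0)
        f_norm = norm.get(ngram, 0.0)
        # Relative difference between the unseen text and the candidate author.
        to_author = (2 * (f_unseen - f_author) / (f_unseen + f_author)) ** 2
        # Weight: relative deviation of the unseen text from the corpus norm.
        to_norm = (2 * (f_unseen - f_norm) / (f_unseen + f_norm)) ** 2
        total += to_author * to_norm
    return total


# Invented relative frequencies, for illustration only.
unseen = {"de ": 0.04, "en ": 0.02}
author_a = {"de ": 0.04, "en ": 0.02}
author_b = {"de ": 0.01}
norm = {"de ": 0.02, "en ": 0.02}
print(imbalance_aware_distance(unseen, author_a, norm))
print(imbalance_aware_distance(unseen, author_b, norm))
```

Because the sum never visits n-grams that occur only in the author profile, an author with much more training text is not penalised for the sheer size of his profile, which is the property that makes this approach attractive for imbalanced corpora.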
Whether you find the corpus fallacy a logical explanation or not, I recommend exploring the possibility,
in order to either rule out this option or find the author of the Wilhelmus.

2. Future design

420 Prims, F. Verslagen en mededelingen van de Koninklijke Vlaamse Academie voor Taal- en Letterkunde 1930.
Koninklijke Vlaamsche Academie voor Taal- en Letterkunde, Gent 1930, p. 599-608, http://www.dbnl.org/tekst/_ver025193001_01/_ver025193001_01_0051.php

421 Stamatatos 2007.


If, however, the fault is in the Wilhelmus text itself, things become a lot more difficult. In this case even Lucas
d'Heere, Van der Noot, Johan Fruytiers, Coornhert and Marnix cannot be ruled out. We have to find an earlier or
another version of the Wilhelmus that does signal its author's style, or find methods that can extract
characteristics that tie the Wilhelmus to its author, probably on totally different features.
However, given the amount and availability of text from possible authors and of the Wilhelmus, improving
the design will not be easy. Other practical considerations in choosing my design were based on limitations of
space, time, my own specialties, knowledge and capacity, and the availability of computational and textual means. If
any of these limitations caused my methods to be insufficient, lifting them may lead to a positive
attribution. Another option is of course choosing a different methodology that can find the authorial signal, but it
is not likely that other methods will show greatly improved results for very short texts. I chose and
composed my methods and design because they have been highly successful in authorship attribution and
because they also showed some promising results on very short texts.
One suggestion I'd like to make is the use of the tool AntConc. This tool allows the researcher to switch
easily between distant and close reading. The design becomes very different, and content features are often
the most logical features for this approach. The program can also be used for corpus building and for interpreting
results.
Another method, radically different from the one used in this thesis, is a standard ANOVA. This seems
necessary because, when basing your results on visualizations without measuring significance, despite well
established hypotheses, good choice of style markers, advanced statistics applied and convincing results
presented, one cannot avoid the simple yet nontrivial question whether those impressive results have not been
obtained by chance, or at least not positively affected by randomness.422 An ANOVA takes into account the
statistical dependence of different word frequencies; a famous example is the research on the Federalist Papers by
Holmes and Forsyth.423 These methods depend less on visualizations and more on the measurability and significance of
the outcomes. The design becomes less exploratory. With the very weak effects the Wilhelmus has shown so far,
these methods do not seem a logical next step if the goal is to identify the author. They do seem a solid next step
for testing the results and effects found in this thesis in a statistically and scientifically sound way.

3. Another authorial signal


If the Wilhelmus doesn't stylistically resemble the other texts of its author, attribution based on stylistics
becomes impossible. A lack of authorial signal would explain why the song moves around so much. Considering there's
a pretty solid case for the linguistic fingerprint, a total lack of stylistic characteristics and authorial effects
in the Wilhelmus would be very hard to believe. There are, however, several possible reasons why the Wilhelmus would
lack a strong, consistent signal.

422 Eder 2010, 2.


423 David I. Holmes and Richard S. Forsyth, The Federalist revisited: New directions in authorship
attribution. Literary and Linguistic Computing 10, no. 2 (1995): 111-127.

If it's just the edition or spelling of this version that corrupts the signal, we'll have to wait until Martine de
Bruin makes her next major discovery, or a human expert has to correct extensively for spelling and other
influences of the publisher and recover the original text of the Wilhelmus.
Another explanation, already mentioned, is that the Wilhelmus might not be the product of one man writing a
poem. It could be the product of two or more writers, or of "the people" in the Herderian sense, or perhaps an effort
by a whole group, editing and revising the text over and over again. It could also have been composed orally,
perhaps by a famous poet or a close friend of Willem of Orange, like Marnix, or just orally transmitted and spread.
In both cases the song was passed on and altered before it was written down in the (oldest) version we have today.
It's possible that our version harbours multiple voices, authorial or editorial, or is simply a product of errors in
transmission. In any case, the style of the text, the relative frequencies of function words for example, would then
no longer carry (or would never have carried) the authorial signal of one poet writing a song. There's a broad
spectrum of possibilities if we let go of the idea that the text has only one voice. I haven't tested this
hypothesis, but I've seen the Wilhelmus show stylistic resemblance to texts with multiple voices, like the conjoined
Coornhert file.
Establishing multiple voices in a text usually requires large portions of text, especially
when all authors are unknown, the number of voices is unknown and we're not even sure if it is a case of
multiple authors. This is the main reason why I deliberately did not test this hypothesis. Theoretically this might
be a very reasonable option, but I seriously doubt whether the methods of the computational branch of literary
scholarship could test it.

Future research
Methodological variation for future research
There are also a lot of possible variations of design that were not pursued in this thesis but that might be worth
considering.
First of all, we could explore several distance measures other than Burrows's Delta, especially the ones that
showed promise in the results. A larger theoretical frame should be built regarding their preference for the very
first MFWs, and this should be related back to the corpus. An alternative not yet tested is the use of a similarity
measure like the cosine similarity. It's used in several papers on authorship attribution (AA) and also in the field
of text mining. According to Smith and Aldridge424 the cosine-based Delta measure is the best distance measure for
authorship attribution.425
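As a minimal sketch of what a cosine-based variant could look like: the z-score profiles of two texts are treated as vectors and compared by the angle between them instead of by the mean absolute difference. The function name and the example profiles are assumptions made for illustration; the z-scores are assumed to have been computed beforehand over the same list of most frequent words.

```python
from math import sqrt


def cosine_distance(z_a, z_b):
    """1 minus the cosine similarity of two z-score profiles ({word: z-score})."""
    shared = z_a.keys() & z_b.keys()
    dot = sum(z_a[w] * z_b[w] for w in shared)
    norm_a = sqrt(sum(v * v for v in z_a.values()))
    norm_b = sqrt(sum(v * v for v in z_b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for an all-zero profile")
    return 1 - dot / (norm_a * norm_b)


# Invented z-scores, for illustration only.
text_1 = {"de": 1.0, "het": -1.0}
text_2 = {"de": 2.0, "het": -2.0}   # same direction, larger magnitude
text_3 = {"de": -1.0, "het": 1.0}   # opposite direction
print(cosine_distance(text_1, text_2), cosine_distance(text_1, text_3))
```

Because only the direction of the profile counts, two texts that deviate from the corpus mean on the same words are close even when one deviates more strongly than the other.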
Another approach Id like to suggest is to complete the Koppel needle in the haystack method,426 427
mentioned in the theory. Koppel used specifically designed meta-learning model, by building an SVM classifier for
each candidate author,428 to automatically determine which attributions by which representation schemes have

424 Smith and Aldridge 2011.


425 Evert et al. 2015.
426 Koppel et al. 2006
427 Koppel, Schler and Argamon 2013.
428 The best explanation I could find of this technique, apart from Koppel's own articles, was in the
survey paper of Stamatatos 2009, page 13.

high likelihood of being correct, using a holdout set of 10,000 blogs (those not included in the text). Koppel
showed promising results for snippets of 200 words. At its core Koppel's method is an alternative selection of
features. I've already explained this technique in the theory. If one could mimic the extraction of the most
determining features of a feature set, while keeping an eye on the graph, one could test which attributions are
stable, and so believable, and which are not. Instead of relying on a specified number of most frequent words
(MFW), one would systematically identify a set of discriminant words by using the method of recursive feature
elimination.429 If the unmasking method fails or is impossible to perform, other ways of feature extraction should
be considered.
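The idea of recursive feature elimination can be sketched in a few lines of Python. Note the assumptions: the cited work uses a support vector machine as the underlying classifier, whereas this sketch swaps in a plain ridge regression on +1/-1 author labels so that it needs nothing beyond NumPy. The function names, the toy data and the choice of keeping two features are hypothetical. In each round a linear model is fitted on the remaining features, the feature with the smallest absolute weight is dropped, and the model is refitted.

```python
import numpy as np


def ridge_weights(X, y, alpha=1.0):
    """Closed-form ridge regression weights: (X'X + alpha*I)^-1 X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def recursive_feature_elimination(X, y, n_keep, alpha=1.0):
    """Return the column indices that survive elimination, in original order."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    spread = X.std(axis=0)
    spread[spread == 0] = 1.0  # constant columns carry no signal; avoid dividing by zero
    X = (X - X.mean(axis=0)) / spread
    y = y - y.mean()
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        weights = ridge_weights(X[:, remaining], y, alpha)
        weakest = int(np.argmin(np.abs(weights)))  # least influential feature this round
        del remaining[weakest]
    return remaining


# Hypothetical toy data: 40 text samples by two authors (+1 / -1), six word features.
rng = np.random.default_rng(0)
labels = np.repeat([1.0, -1.0], 20)
features = rng.normal(size=(40, 6))
features[:, 1] += 2.0 * labels  # only columns 1 and 4 differ between the two authors
features[:, 4] += 2.0 * labels
print(recursive_feature_elimination(features, labels, n_keep=2))
```

Because the weights are recomputed after every removal, a word that only looked important through its correlation with another word can still drop out later. That is the difference with simply ranking all features once.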
In this thesis I've decided, based on the resources at my disposal, not to use machine learning
models. Koppel's method relies on them and so do a lot of other methods. I'm not familiar with the complete scope
of machine learning models so I can't make any specific recommendations, but a study of their possibilities is
advised. There are promising results. Comparative studies show that machine learning with support vector machines
(SVM) is as good as any method for authorship attribution.430 An SVM model is able to avoid overfitting
problems even when several thousand features are used and is considered one of the best solutions of
current technology.431 Looking at the compared performance of several representative learning methods for
authorship attribution, we see, however, that the choice of the learning algorithm is no more important than the
choice of the features by which the texts are to be represented.432
In this light another option that we've not yet considered is application-specific features: features that
are examined individually and selected into a feature set on the basis of how well they discriminate the authors
of a given corpus.433 The most important criterion for selecting features in authorship attribution tasks, however,
is their frequency, which is why I've used MFW. Features identified by a feature selection algorithm may be too
corpus-dependent and of questionable general use.434
Another option I'd suggest is further testing of dimensionality reduction. With principal component
analysis it may be more useful to look at the second and third principal components to view the variation of data
points. Most of the texts will show the same clusters, and the one that shifts may have some other underlying
source of variation. Other multivariate statistical analysis techniques, like discriminant analysis, standard factor
analysis and cluster analysis, I haven't performed in this thesis, but they could bring some insight into the data.
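Looking beyond the first principal component could be done along the following lines. This is a minimal sketch, assuming a matrix with one row per text and one column per (standardised) word frequency; it computes the components with a singular value decomposition in NumPy. The function name and the random toy matrix are hypothetical.

```python
import numpy as np


def principal_components(data, n_components=3):
    """Return text coordinates on the first components and each component's variance share."""
    data = np.asarray(data, dtype=float)
    centred = data - data.mean(axis=0)  # PCA requires column-centred data
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # coordinates of each text
    explained = s ** 2 / np.sum(s ** 2)              # share of total variance per component
    return scores, explained[:n_components]


# Hypothetical toy data: 8 texts described by 5 word-frequency z-scores.
rng = np.random.default_rng(1)
toy = rng.normal(size=(8, 5))
scores, explained = principal_components(toy, n_components=3)
print(explained)
# Columns 1 and 2 hold the second and third component, e.g. for a PC2-versus-PC3 scatter plot.
pc2, pc3 = scores[:, 1], scores[:, 2]
```

The variance shares show how much of the total variation each component accounts for, which helps to judge whether a shift seen on the second or third component is worth interpreting at all.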

429 Evert et al. 2015.


430 A. Abbasi and Hsinchun Chen, "Applying authorship analysis to extremist group Web forum messages." IEEE
Intelligent Systems 20, no. 5 (2005): 67-75; Zheng et al. 2006.
431 Jiexun Li, Rong Zheng and Hsinchun Chen, "From fingerprint to writeprint." Communications of the ACM 49, no. 4
(2006); Efstathios Stamatatos, "Author identification: Using text sampling to handle the class imbalance problem."
Information Processing and Management 44, no. 2 (2008).

432 Koppel, Schler and Argamon 2009.


433 Stamatatos 2009.
434 Stamatatos 2009.

Different questions for future research


Besides other texts, methods and features there are also aspects or areas of the Wilhelmus tradition we need to
research.
I recommend further research into the characteristics of early modern,435 or renaissance, Dutch texts to
understand how the effects present in my corpus hold up against the general stylistics of those times, and to
grasp the grander question of how texts from this period compare to modern Dutch texts. It may help us
gain knowledge not only of the texts themselves but also of the Wilhelmus. Once these grand effects are mapped,
I'd like to see more specific stylistic signals investigated, like the genre signals of geuzenliederen and
ballingenliederen.436
The genre hypotheses that I haven't followed through are definitely worth a try. The problems I mentioned are not
easily solved, but an attempt at forming methods and corpora might in itself be enough to indicate whether
poetry vs. songs and the 6 different type hypotheses are theoretically sound and testable, and whether the methods
and means are currently available. I'd recommend exploring the possibilities.
The same goes for further research on language effects. I can only suspect that similar
effects are present for other dialects or linguistic regions. I recommend training corpora and a text size above
1000 words, advanced biographic knowledge of the texts and authors, and extensively varying the parameters.
The language effect for German texts was not sufficiently tested. Further research should dig deeper
into a possible language effect on the Wilhelmus, and German is definitely the first option. Further research would
benefit from determining the diversity of such a dialect and whether it's even possible to count it as one dialect
at all. Research on this topic doesn't have to be Wilhelmus related. The boundaries of dialects should be determined
more strictly when computationally analyzing these texts, which is of course a problem when you're building
corpora from early periods because of the limited digital availability of these types of texts. If one is planning on
doing this research I recommend going through Kossman's study.437
A possible case for research is the hypothesis of literary quality, which could eventually lead to verification or
falsification of the assumption that the Wilhelmus must have been written by a professional. The suggestion that
literary quality can be measured with quantitative stylistic analysis is highly controversial and at first sight
problematic. Quality is more a subjective verdict than a workable definition of a category. This, however, doesn't
rule out an attempt at finding such a signal.

435 This is the Dutch term for its renaissance period, as the Dutch renaissance sailed an atypical course.
436 Songs of exile or of the exiled. Perhaps the most apt, but also frivolous, translation is "outlaw songs", but
this term is already appropriated by bikers and pop music.
437 Het ontstaan van het Wilhelmus

End plea
There are many ways to improve the research conducted in my thesis, quite a few of which I plan on pursuing myself,
just as there are many ways to improve as a researcher. The improvement of a field of science, however, is a process
that is far more complicated, and in which the role of a marginal figure will be a passive one, looking from the
sidelines at the changing landscape. But if we can find ways to improve the means of the literary scientist, then a
much larger portion of the researchers can participate.
Some areas or aspects that need improvement become quite obvious when you're conducting
quantitative analysis (on Dutch literary texts). The fact that they're obvious doesn't mean they're easy to
improve, let alone solve; it does mean, however, that some of the current limitations on this type of research are
an eyesore. If access to digital text, ready-made for analysis with governable tools, were easy and free (or
without copyright), anyone interested could do a quantitative, or other computational, analysis. Besides
democratizing this type of research, and offering to research what is already there for consumption, this would
mean a tremendous gain for the advanced student or researcher who wants to ask specific questions of
specific texts.
I estimate that better accessibility of digital texts could have saved months of my research; months I
could've used to try other approaches, like varying distance measures and becoming acquainted with machine learning
techniques. Some hypotheses have now been dropped because of this limited availability, though they are perhaps
testable given a few more resources. Although this gives me a lot of follow-up research to perform, which prevents
me from falling into a black hole caused by great amounts of sudden free time and ensures that I can workaholic
myself through those tough first post-thesis months, I can't help but feel that, if the humanities were a bit more
used to computational literary studies, I could have done more.
The more I try to participate and engage in the field of computational literature, not only in the university
but also on my twitter account and through collaboration outside of the academy, the more I see experts wading
through text and methods in order to say something about text and method. This would not be very different from any
other literary research, if it weren't for the digital books missing from the digital libraries or digitally printed
in such poor quality that you have a hard time distinguishing the text from the ink spatters.
I plead for the building of a large digital corpus of (Dutch) text, with large amounts of text per author, genre and
language, if need be even offering different spellings, dialects and formats, including contextual framing, and
available for bulk download and online analysis, as well as prepared for some of the most important open source
tools. I know several projects that are hoping to realize parts of this analysis utopia and I've every confidence
that it's only a matter of time before a community of researchers, academics and programmers will fulfil this
prospect.


Collaboration and thanks to


As I've mentioned quite a few times, during this thesis I've reached out for help on several occasions. The
heading says "collaboration and thanks to" but maybe I should just skip the first part and express my gratitude
to all the people who have been so good as to spend time and effort on my account, out of scientific curiosity, a
sense of professional responsibility or maybe just out of the kindness of their hearts.
First of all my supervisors, prof. dr. E. Stronks and prof. dr. K.H. van Dalen-Oskam, who have guided me with a great
deal of enthusiasm, were obviously very involved in my subject and research, and were therefore an irreplaceable
force in the preparation and realization of this thesis. They've made me keep the overview, while at the same time
turning my attention to the details that I so easily forget. In addition to this, and perhaps most important, were
my meetings at the Trans in Utrecht with prof. Stronks, sometimes joined by Karina van Dalen-Oskam over Skype.
These were a real necessity to keep my head straight, when the remaining days were, for a number of months,
filled solely with working in a room of one's own, relatively socially isolated because of temporal and financial
limitations, and without any substantial talks about the content of all these pages to which I dedicated so much of
my attention, besides of course the obligatory, but surely well-meant, question, stemming from a desire to hang
out with me (again), "So, when are you finished with your thesis?". Therefore I wish to thank my supervisors not
only for fulfilling their professional obligations but also for their personal support and attention. An honorable
mention goes to dr. Peter Boot, who was willing to sacrifice his time, and every hour of August counts double in my
book, in order to be judge and jury and act as my second corrector and reviewer, next to Els Stronks.
Secondly, I need to turn my gratitude to the institutions, and especially the people working at these knots
in the web of knowledge, who helped me with the acquisition and the preparation of the corpora, but in many
cases also showed a real interest in my doings and offered suggestions on how to proceed or where to look
further. I thank prof. dr. Louis Grijp for answering my mails about candidates of authorship, Cees Klapwijk, Michel
de Gruijter, Paul Broekhuizen and everybody else involved at the DBNL, the digital library for Dutch texts, for
providing me with a large "needle in the haystack" corpus and delivering the texts in several formats, and I thank
Erik Tjong Kim Sang, prof. dr. Nicoline van der Sijs, Hennie Brugman, dr. Rene van Stipriaan and everybody else
involved at the Meertens Institute, the research institute of the Royal Netherlands Academy of Arts and Sciences
for the research and documentation of Dutch language and culture, for meeting with me, handing me secondary
literature and providing me with, again, a large corpus in several formats and delivering those texts per book, as
well as per song, which has had a substantial influence on my research.
Last but not least I'd like to thank some people in my personal environment who supported me while I was
writing my thesis, for all the patience they have bestowed upon me. I will not state their names here because
I'm far too shy to pay such respects publicly, but I will make it up to them in person. You are noticed and
appreciated.
