Abstract
The author of one of the most important texts in Dutch (literary) history, which is also the oldest
national anthem in the world, is yet to be determined. Research into the historical context and
qualitative text analysis have not produced conclusive answers or put a name to the anonymously
published song. I'll try to discover the author of the Wilhelmus using quantitative analysis and the methods
and means of computational literary studies. This involves using computers to perform statistical
analysis on the Wilhelmus in order to determine an authorial signal, based on textual features, and comparing
this linguistic fingerprint with those of the other texts in my corpus, predominantly texts by possible authors
of the anonymous hymn. Previous research with these methods shows very promising results, but the short
length of the Wilhelmus, only 551 words, tempers expectations. I'll test previously proposed and
generally accepted candidates for the authorship of the Wilhelmus, while at the same time trying to
determine whether this type of research, these methods and the available tools are currently capable of handling
such questions. This leads to the following two research questions: Who is the author of the Wilhelmus? and
Can the complicated real-world authorship attribution case of the Wilhelmus be solved with the methods of
quantitative analysis and the tools of computational literary studies? While testing authorship signals, I've also
tested other stylistic effects based on language or dialect and on genre, type or topic. These effects were
present in my corpus, measurable with my methods and clearly visible in my graphs. Surprisingly enough,
the Wilhelmus shared very few stylistic effects with any other text by any of the authors in the corpus.
Attempts to draw out the text's linguistic characteristics, by varying features, culling, sampling or testing on
different corpora, some including only Marnix and Coornhert texts, all failed to produce a strong, consistent
and reliable attribution. Examining the nature of the failed attribution, by questioning the distance
measure Burrows's Delta and by analyzing the distinctive components of the Wilhelmus with a principal
component analysis, produced results that are worth exploring and that offer valid options for future research.
Index
Title page
Index
Quotes
Introduction
Theory
The Wilhelmus
1. Songs of collectivity
2. The national anthem of the Netherlands
3. The open questions of the Wilhelmus
When?
Why?
Where?
Who?
4. Obstacles
Idiom
Metric
Subjectivity
5. Conclusions on Wilhelmus research
Relevance
Methods
Automated Authorship Attribution
Stylometric features
- Character features
- Lexical features
- Syntactical features
- Semantic features
2. Types of features
- Complexity measures
- Punctuation
- Idiosyncrasies
- N-grams on character-level
- N-grams on word-level
- Word frequencies
- Syntactic (Distribution of parts of speech or POS)
- Univariate vs. Multivariate approach
- Conclusions on features
Distance measures
Computational means
Text
Tests
Corpus
1. Considerations for the researcher
2. Explanation corpus
Three corpora
1. DBNL corpus
2. Meertens corpus
3. Specialized corpus
- Het Wilhelmus and the Geuzenliedboek
- Marnix
- Coornhert
- Other possible authors
- Probable authors, included, half-included or not included
- Improbable authors included
- Anonymous texts
Hypotheses
Language Hypotheses
Genre Hypotheses
Authorship Hypotheses
Hypotheses I will not test
Expected problems
Obstacles of computational literary studies
Obstacles for authorship attribution
Case specific problems
Specialized corpora
- Distance measures
- PCA
- Conclusions author hypotheses
Meertens corpora
Conclusions
Future research
End plea
Collaboration and thanks to
Een nieuw Geusenlieden boecxken waerinne begrepen is, den gantschen handel der Nederlandsche
geschiedenissen, dees voorleden jaeren tot noch toe gedragen.1
Building blocks, foundations, groundwork: they remain metaphors that must make something seem self-evident
that is not self-evident.2
1 G.A. van Es, "Het Wilhelmus," in Het Wilhelmus in artikelen. Een bundel herdrukte studies over het Wilhelmus, ed. J. de
Gier (Utrecht: Hes, 1985), 51.
2 Frans-Willem Korsten, "Grondslagen, situaties en houdingen," Vooys: tijdschrift voor letteren 33, no. 2
(2015).
3 Louis Peter Grijp, "Inleiding," in Nationale hymnen. Het Wilhelmus en zijn buren, ed. Louis Peter Grijp
(Nijmegen/Amsterdam: SUN/Meertens Instituut, 1998), 13. See F.G. Eyck, The Voice of Nations. European National
Anthems and their Authors (Westport, Conn./London, 1995), xix.
Introduction
At least one thing was familiar during the two brand-new major sports events, both held for the very first time,
in July of this year. Opening every World Cup game of the Lionesses, the Dutch women's national soccer team,
and rewarding every gold medal won by a Dutch athlete at the European Games, the oldest and most beautiful
anthem in the world chimed in the stadiums and flowed through our television speakers, or, at least, that one
verse we actually know did. As the Dutch symbol of tradition as well as revolution, representing the nation,
its people and the royal house, the Wilhelmus was heard no less than 26 times in Azerbaijan and 4 times in
Canada. Despite our alleged down-to-earth mentality and the recent turbulent social debates around Dutch
symbols and traditions, we seem to take pride in our national anthem and do not challenge its unifying
function.
Although it might be no surprise that the man in the street has no extensive knowledge of the origin of things (after all,
the average citizen of Rotterdam thinks the Erasmus bridge is named after its architect), the absence of an
author for our national anthem should be part of the collective memory. Elementary school textbooks and,
more importantly, Wikipedia,4 don't help by proclaiming Marnix van Sint-Aldegonde as the author, which is
uncertain at best. The truth is that Dutch literary history has a huge open case of authorship attribution,
because the author of the Wilhelmus has never been sufficiently established. Prof. Dr. K. Heeroma
maintained, back in 1970,5 that the idea of an anthem springing from its people is almost too romantic to
debunk.6 However, I consider the relevance, or even importance, of this national hymn, which is so
fundamental to the origin story and perhaps even to the origin of the independence of the Netherlands, too great
to leave to myth.
I'm going to make a serious attempt at solving the open authorship case of the Wilhelmus by performing quantitative
analysis. Research based on biographical or other historical information and on qualitative analysis has filled
the Dutch libraries but hasn't led to conclusive evidence or scientific consensus. There have, however,
been no attempts that I know of to solve this question with quantitative analysis. This is not all that
surprising, given the infancy of computational literary studies and the towering case-specific problems
of stylometric research on our national anthem, which seem as impossible to overcome as the Duke of Alva
himself. Although the main question (who is the author of the Wilhelmus?) has not been answered, research
has yielded a whole spectrum of assumptions and conclusions, unfortunately often incompatible. I intend to
test some of these theories with quantitative analyses while going after the author. By including all the
4 https://nl.wikipedia.org/wiki/Wilhelmus Last checked on 02-08-2015.
5 Klaas Hanzen Heeroma, "Tsal hier haest zijn ghedaen," in Het Wilhelmus in artikelen. Een bundel herdrukte studies over
het Wilhelmus, ed. J. de Gier (Utrecht: Hes, 1985), 251-268.
6 I consider the problem of the authorship of the Wilhelmus unsolvable and am actually quite at peace
with that. That is fitting for such a song, such an act of imagination, a true spiritual song of resistance,
possible authors suggested by previous research in my corpus and analyzing their style quantitatively, I aim
to find statistical evidence for their inclusion in or exclusion from the list of possible authors. Authorship
attribution with quantitative analysis and other means and methods from computational literary studies
has worked very well in several languages, including Dutch.7
The primary goal of my thesis is to contribute to the discussion on the authorship of the Wilhelmus. If my
research brings us any closer to finding that author, I consider my thesis a success. My secondary goal is to
explore the possibilities of computational literary studies for authorship attribution. By doing research that
obviously stretches the capabilities of its methods, and by critically analyzing every methodological step along
the way, this thesis will be, besides an attempt at determining the author of the Wilhelmus, a test case for
using quantitative stylistic analysis with computational means on a real-life open attribution case. The goal of
my research is not necessarily to give conclusive evidence, but rather to give an indication of whether it's even
reasonable to pursue an answer with this type of data analysis. Of course, actually answering my main
question would be ideal. So one could call my thesis an investigative or explorative experiment that
researches a major open question of Dutch literary history.
Both goals are, in my opinion, highly relevant. Knowledge about the author of the Wilhelmus is
relevant because of the immense symbolic value the text has, as the world's oldest national anthem and as
a part of the origin story of the Netherlands. The methodological questions are relevant because they
concern the relatively new methods of a relatively new approach to literary research, controversial and
possibly revolutionary. Knowledge of its methodological capacities and examples of successful or
unsuccessful research may contribute to the understanding and perhaps acceptance of this approach to the
study of literature. I've formulated the following two main research questions:
1. Who is the author of the Wilhelmus?
2. Can the complicated real world authorship attribution case of the Wilhelmus be solved with the methods of
quantitative analysis and the tools of the computational literary studies?
I've formulated the following sub-questions, which are part of, and/or can contribute to answering, my two main
questions:
Which candidate author do my quantitative stylistic analyses point to or support as the author of the
Wilhelmus?
7 See: J.F. Burrows, "Delta: A Measure of Stylistic Difference and a Guide to Likely Authorship," Literary and Linguistic
Computing 17, no. 3 (2002); David L. Hoover, "Testing Burrows's Delta," Literary and Linguistic Computing 19, no. 4 (2004b); and
Christof Schöch, "Beyond the black box, or: understanding the difference between various statistical distance measures,"
The Dragonfly's Gaze. Computational analysis of literary texts (research blog, August 3, 2012):
https://dragonfly.hypotheses.org/101
Which candidate author do my quantitative stylistic analyses rule out as the author of the
Wilhelmus?
Who is, or is more likely to be, the author of the Wilhelmus: Marnix van Sint-Aldegonde or Dirk
Volkertszoon Coornhert?
What are currently the limits of computational stylistics and authorship attribution?
Can my methods eliminate one or more of the usual suspects of the Wilhelmus authorship attribution
case?
Can my methods give supporting evidence for one or more of the usual suspects of the Wilhelmus
authorship attribution case?
Are my methods useful and/or sufficient for authorship attribution for texts of around 550 words?
A major methodological problem, which imposed itself as soon as I explored the possibilities of successful
authorship attribution, is the small size of the text to be attributed. The Wilhelmus has only 551 words, which
is considered far too short for any solid authorship attribution. Estimates of the minimum number of
words required for successful authorship attribution on poetry range from 217 (footnote 8) to 3000 (footnote 9) words, at best.
Research attempting to determine this bare minimum is often inconclusive, and conclusions differ.
Other problems I experienced during preliminary research were difficulties in corpus building, the
generally poor availability of ready-for-analysis digital text versions, the need to exclude genre and spelling effects, and
possible multiple authorship. I'll invest large amounts of my time and effort in collecting and preparing
contemporary texts by possible authors, preferably in the same genre as the Wilhelmus, the
geuzenliederen (songs of beggars), and in the right (that is: original) spelling.
I stumbled upon several sub-questions about stylistics that needed to be answered before I could
focus on the question of authorship. These questions were: Can my methods detect language signals? Can
my methods detect genre effects? What language effects does the Wilhelmus signal? What kind of genre
effects does the Wilhelmus signal?
My corpus probably needs to account for other effects and considerations that are not yet visible but
that will reveal themselves while I perform my analyses. I'll keep my corpora as broad as possible, to
ensure maximum flexibility.
8 Moshe Koppel, Jonathan Schler, and Shlomo Argamon, "Computational Methods in Authorship Attribution,"
Journal of the American Society for Information Science and Technology 60, no. 1 (2009).
9 Maciej Eder, "Does Size Matter? Authorship Attribution, Small Samples, Big Problem," Digital Humanities (2010):
Conference Abstracts (2010): 6.
A variety of statistics, features and other methods are currently under consideration, but this thesis will
definitely rely heavily on Burrows's Delta, a statistical procedure in which the differences in the relative
frequencies of textual features, measured against the average of the corpus, determine the stylistic
distance between the texts of a corpus, and on the software R, including its package Stylo, to do the calculations.
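To make this procedure concrete, here is a minimal sketch of Burrows's Delta. It is written in Python purely for illustration; the thesis itself relies on R and Stylo, and nothing below reproduces Stylo's code. The sketch assumes the classic formulation in Burrows (2002): the relative frequencies of the most frequent words are turned into z-scores against the corpus mean and standard deviation, and Delta is the mean absolute difference between the z-scores of two texts. The function names, the n_features parameter, the use of the population standard deviation and the toy token lists are all hypothetical choices, not taken from the thesis.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_frequencies(tokens, vocabulary):
    """Relative frequency of each vocabulary word in one tokenized text."""
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in vocabulary]


def burrows_delta(target_tokens, corpus, n_features=100):
    """Return {text name: Delta to the target}; a lower Delta means a closer style.

    corpus maps a (hypothetical) text name to its list of tokens.
    """
    # Features: the n most frequent words over the whole reference corpus.
    pooled = Counter()
    for tokens in corpus.values():
        pooled.update(tokens)
    vocabulary = [word for word, _ in pooled.most_common(n_features)]

    # Relative-frequency profile of every reference text.
    profiles = {name: relative_frequencies(tokens, vocabulary)
                for name, tokens in corpus.items()}

    # Mean and standard deviation of each feature across the corpus.
    # Assumption: population standard deviation; other choices are possible.
    columns = list(zip(*profiles.values()))
    means = [mean(column) for column in columns]
    stds = [pstdev(column) for column in columns]

    def z_scores(profile):
        # A feature that never varies in the corpus carries no signal: score 0.
        return [(f - m) / s if s > 0 else 0.0
                for f, m, s in zip(profile, means, stds)]

    target_z = z_scores(relative_frequencies(target_tokens, vocabulary))
    return {name: mean(abs(a - b) for a, b in zip(target_z, z_scores(profile)))
            for name, profile in profiles.items()}


# Toy usage with invented token lists (not real corpus data).
toy_corpus = {
    "candidate_A": "de de de het een".split(),
    "candidate_B": "de het het het een".split(),
}
disputed_text = "de de de het een".split()
print(burrows_delta(disputed_text, toy_corpus, n_features=2))
```

In this toy run the disputed text is identical to candidate_A, so that candidate receives the smallest Delta. With a text of only about 550 words the relative frequencies become unstable, which is exactly the methodological concern raised above.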
As for my secondary literature, I'll need to construct a composite body of theory, bringing together
two traditions, that of quantitative stylistic analysis and authorship attribution, and that of Wilhelmus research revolving around the possible author(s), in order to fully grasp the questions I'm asking, to
understand the means and methods required for answering them, and to avoid avoidable pitfalls.
My thesis will of course be the product of independent research, but I've already collaborated, and will continue to
collaborate, with numerous specialists in the field and with several institutions. Many of these contacts were
made thanks to my supervisors, prof. dr. E. Stronks and prof. dr. K.H. Dalen-Oskam.
As mentioned, authorship attribution for texts of this size might not be possible, so why try? My confidence in
the usefulness of the attempt is based on the hopeful results of a similar but much smaller study that I've
performed at the Jagiellonian University in Krakow. During an internship under the supervision of Jan
Rybicki, member of the Computational Stylistics Group and an absolute expert on stylistic research and
on quantitative analysis with computational means, I set up a small experiment in order to find out
who the author of the Wilhelmus was, using the means and methods of computational literary studies.
My results did not come even close to answering my question. The Wilhelmus was attributed to several authors
as it moved around the graph extensively, meaning that the parameters were insufficient for capturing the stylistic
signal of the text or, worse, that the text is too short for the methods to capture its authorial signal. Despite all
this, I was positively surprised. Looking at the results I saw stylistic resemblance between texts written by the same
author, which clustered on the same branch of the tree diagram. I even saw faint genre signals, although I was dealing
with texts that were merely 550 words long. As it turns out, even when dealing with very short texts, some
traces of style are visible to quantitative analysis. The results seemed hopeful in spite of the limited number
of hours and blank pages at my disposal and some geographical inconvenience, since many of the Dutch
sources were unavailable to me behind my computer at Henryka Siemiradzkiego,10 and random trips to the
Dutch libraries had become costly. There was so much room for improvement.
This is why I'll have another go at determining the author of my national anthem. A thesis provides the amount of
time and space I need, the Dutch libraries, among other institutions, are accessible, and I've made progress in
the field of computational literary studies. Now all is set for me to contribute to the tradition of Wilhelmus research by taking on this interesting test case for authorship attribution on very short early modern texts.
Theory
10 Alley of the Karmelicka, road in the centre of Krakow.
The Wilhelmus
Songs of collectivity
The popularity of a national anthem originates in moments of strong collective feeling,
on which it is able to feed enough to last through the times of calm, until another moment of national excitement
emerges.11
In 1568, the bleak year of the Duke of Alva's triumphs, the Netherlands were in need of collectivity,12 as the duke
swept the Low Countries with military operations and restored the repressive reign of Philip II. Opinion-forming and
propagandistic pamphlets, which appeared in rapid succession in 1568, were meant to shape public opinion.
They were political weapons and a major part of the battle for independence. They justified the use of violence and
the taking up of arms against Alva's rule, and openly preached, although still in vain, a national revolt.13 Songs were
also used to thank and honor supporters, to convince doubting citizens or city governments (Amsterdam!) and to
attack opponents.14
Just like martyr songs, which are an expression of the revolt of oppressed believers, songs of beggars
or geuzenliederen evoke consensus and resolve. The songs were both a herald and an actual part of the
resistance. Poets and singers adorned themselves with the honorary name geuzen (beggars), first uttered as an
insult in 1566 by the counselors of Margaretha of Parma, regent of the Netherlands under the Spanish king,
to brand the procession of noblemen petitioning against the Inquisition. Vive le Geus was the word!15 16
The Wilhelmus became the most famous geuzenlied in history, acting as the anthem of the royalists.17
Even today it reminds us of the nationalist unity of nation, monarch and religion, which in the 16th century meant
that Holland, Willem of Orange and Calvinism were indissolubly attached to each other.18 In the song, Willem of
Orange or Willem van Oranje, father of the fatherland, defends the Dutch David, Holland, and the Calvinist way of
life against the common enemy, Catholic Spain. The acrostic emphasizes this by forming WILLEM VAN
NASSOV19 with the first letters of each of the fifteen verses, including the prinsestrofe, verse of the prince, referring
15 Referring to the (rhyming) slogan of resistance "Vive le Geus is de leus", meaning "Long live the beggar is
now the slogan".
16 Porteman and Smits-Veldt 2008, 74.
17 Porteman and Smits-Veldt 2008, 76.
18 Martine de Bruin, "Het Wilhelmus tijdens de Republiek," in Nationale hymnen. Het Wilhelmus en zijn
buren, ed. Louis Peter Grijp (Nijmegen/Amsterdam: SUN/Meertens Instituut, 1998).
19 Porteman and Smits-Veldt 2008, 76.
to the Prince and his house of Nassau. The use of acrostics20 as well as prinsenstrofen21 places it in the tradition
of the chambers of the rederijckers22, local gatherings of poets, often affiliated with or representing a city.
The Wilhelmus stayed popular throughout the Eighty Years' War, and long after the Peace of Münster it
retained both its political overtone and its signaling function.23 It kept the memory of the Eighty Years'
War vivid, not least because the acquired freedom was jeopardized several times after 1648.24 Because of
the symbolic value the song carries, this resulted in a ban during periods of occupation, in the disaster year 1672
as well as in the heyday of the Patriots in 1787.25 The Wilhelmus was also forbidden in the Japanese camps in the
Second World War. The Frisian anthem, which the Japanese did not know, served as a substitute.26
Despite a history full of exile and resistance, the anthem survived and kept ascending, for example
when a member of the Oranje family came to power again or after the expulsion of the French in 1813.27 This
is because national anthems grow in moments of strong collective feeling. The liberation from French
domination in 1813, the Belgian revolt in 1830-1831 and the inauguration of Wilhelmina in 1898 are all examples of
this.28
The parallel with the Eighty Years' War was experienced especially strongly in 1940-1945. The resistance
poetry of those years contained many reminiscences of the Wilhelmus. There was, for example, a collection
significantly called the Geuzenliedboek, meaning a book of beggars' songs, which carries the same name as the
work where we find the earliest versions of the Wilhelmus, among other geuzenliederen29 of course. Again the main
goal was dispelling that tyranny, and they remained faithful to their fatherland unto death.30 During the Second
World War the Wilhelmus was the resounding symbol of the Dutch aversion to the Germans in general, and
of the active underground resistance in particular. The objections to the anthem that some socialists still felt
before the war disappeared completely after the German invasion. Simon Carmiggelt said in 1960: "Only when
the Germans stood at the borders did the Wilhelmus become my song."31
In spite of the strong associations of righteousness the Dutch feel upon hearing the Wilhelmus, its
symbolic appropriation has not been unequivocal. Even before the outbreak of the war the National Socialists
hijacked the song, and they sang it during key moments of the war. During a mass meeting in the Galgenwaard32 in
1941, held to celebrate Anton Mussert, leader of the NSB, the Dutch National Socialist Movement, the Wilhelmus was sung while
performing the Hitler salute. One can point out the irony that, during the Second World War, the Wilhelmus
experienced an apotheosis, being owned and appropriated by the sum total of its people, from socialist to
NSB'er,33 be it on opposite sides of a rifle. In the 18th century the anthem was above all the song of the
Orangists, supporters of the royal house, while in the next century the Calvinists appropriated the song, and they
were followed in the twentieth century by all kinds of groups, including the NSB. The Wilhelmus follows the
political changes of power, both the dominant and the subversive currents.34
and probably published quickly after the events they described.43 The estimate is that the first version of the
geuzenliedboek was published in 1574, including the Wilhelmus, which was by then already the most famous
geuzenlied of its time. From that moment on, publishers rapidly produced new, almost exclusively Dutch,
editions of the geuzenliedboek, extended with both more recent and older songs, up to a total of 252.44
For a long time the Wilhelmus as it occurs in Een nieu GeusenLieden Boecxen (1581), the title of this
particular edition of the geuzenliedboek, was considered the oldest preserved Dutch version of our national
anthem.45 This was until 1996, when Martine de Bruin came across an even older edition of the geuzenliedboek in
the Bibliothèque nationale in Paris. This newly found version, without explicit references to year, place and author,
could on the basis of historical and typographic research be attributed to a certain publisher, and so its date could
be determined. It seems to have been published in 1577-1578 by Jan Canin in Dordrecht and was therefore three
or four years older than the standard edition of 1581. Shortly after this discovery, De Bruin found an even older
edition of the geuzenliedboek in the Niedersächsische Staats- und Universitätsbibliothek in Göttingen. This new
oldest printing was included in the Repertorium van het Nederlands lied tot 1600 as number D294, without
much publicity being given to it.46 This is still not the oldest Wilhelmus text that has survived the ages, because that is a German
edition published in 1573.47 Generally, it is believed that the first Dutch edition of Een nieu GeusenLieden
Boecxen originated in 1574. This means that, even with the discovery of the book from 1577-1578, the first
printing is not yet in sight.48
The title of oldest national anthem,49 confirmed by several illustrious cultural heavyweights like the Guinness
Book of Records and Flippo number 430,50 refers to the conjunction of text and melody, which dates from the
sixteenth century. The Wilhelmus was sung to de wijse van Chartres, the melody of the popular French anti-Huguenot
song.51 The oldest known version of the melody on paper dates from 1574, and is assumed to have been, at that point
in time, at least six years old.52 The oldest Dutch version in which both the text and the melody of the Wilhelmus
are written down can be found in Adriaan Valerius's Nederlandtsche gedenck-clanck from 1626.53 In terms of text,
the Wilhelmus is surpassed in age by the thousand-year-old Japanese anthem Kimigayo, but this anthem received its
melody only in 1880. Consequently, the Wilhelmus as a song, meaning an inseparable whole of text and melody,
43 Martine de Bruin, "Een nóg ouder geuzenliedboek: signalement van de druk [1576-1577] met de oudst bekende
Nederlandse Wilhelmustekst," in De fiere nachtegaal: het Nederlandse lied in de middeleeuwen, eds. Louis Peter Grijp and
Frank Willaert (Hilversum: Verloren, 2008), 231-250.
44 Porteman and Smits-Veldt 2008, 75.
45 Kossmann 1985, 343.
46 de Bruin 2008, 231- 233.
47 de Bruin 2008, 234-235.
48 de Bruin 2008, 231-233.
49 de Bruin 1998, 16.
50 Flippos were a hugely popular promotional toy included in bags of Lay's chips. Many of the tokens had an educational function.
Flippo number 430, "Chester in orange", tells us that the Wilhelmus is the oldest national anthem.
51 Porteman and Smits-Veldt 2008, 76.
52 de Bruin 1998, 25.
53 Kossmann 1985, 343.
is the oldest anthem in the world. However, this statement is only true if we're talking about anthems that are
currently officially acknowledged as national anthems,54 so excluding past or unofficial national anthems.
The oldest edition of the geuzenliedboek currently known includes 89 songs and 5 choruses, almost all of which
were, upon the discovery in Göttingen, already known from other printings.55 Every reissue of the
geuzenliedboek includes the Wilhelmus, and while the text varies between editions, the variations are
surprisingly small. Apparently the evolution of the text of the Wilhelmus, against custom, came to a halt
with its inclusion in the geuzenliedboek. The text seems to have been significantly less subject to change than
the melody, although it is possible that big alterations made in the oral tradition have vanished from our
sight.56
When?
54 de Bruin 1998, 16.
55 de Bruin 2008, 234-235.
56 de Bruin 1998, 26, 28.
57 Porteman and Smits-Veldt 2008, 70.
58 Of which song a wise man once said that it has done our fatherland more good than ten thousand
soldiers, for when soldier and sailor hear it, their blood is stirred.
59 van Es 1985, 50.
60 Porteman and Smits-Veldt 2008, 68.
An extensively debated historical question about the Wilhelmus, important for the attribution to an author,
revolves around the date of composition. Several important events that took place in 1568, like the battle
in Friesland, the battle of Heiligerlee and the crossing of the Maas to battle the Duke of Alva, are mentioned in the
Wilhelmus.61 The military campaign of the Prince in the fall of 1568 is a historical terminus a quo, being the latest
historical occurrence the Wilhelmus clearly refers to, in the eleventh stanza.62 This means, of course, that the text
was written after these events took place. In a similar but slightly less objective fashion, an end date for the
composition of the Wilhelmus can be determined from the events it doesn't mention. The prince returned to Holland
in 1572, and this marks the start of the real revolt. On the first of April the small city of Den Briel was
recaptured by the geuzen. This famous first victory resulted in the even more famous rhyme "On the first of April, Alva
lost his glasses",63 meaning that from that point on the Duke lost his air of invincibility, along with his perfect
military record, earned by his skill in assessing the situation. Besides this, it's also a play on words, since the
Dutch word for glasses is bril, which is similar to Den Briel, the city. In any case, this takeover is not mentioned in the
Wilhelmus, and furthermore the tone of the song doesn't correspond with the change that moment brought.
Therefore, 1 April 1572 is seen as the other boundary date.64
Although there is consensus on where these rough boundary dates should lie, a more precise dating has not been
agreed upon. Before the Second World War it was assumed that the Wilhelmus was composed at the end of
1568 or at the beginning of 1569, shortly after the failed campaign of the prince against Alva in Maastricht.65 This
dating assumes that the song was written directly after the historic episode it describes. Postwar Wilhelmus research, however, considering it a propaganda song, dates our national anthem to 1571-1572, or at the earliest
1570.66 Assuming that a propaganda song is released directly after its composition, as it seeks immediate
impact,67 this dating is based on the assessment that this was the period when propaganda like that of the
Wilhelmus was needed. In addition, there are many other texts, or manifestos, by Prince Willem van
Oranje-Nassau from the period 1570-1572 that carry a message similar to that of the Wilhelmus.68 Another argument
for the terminus ad quem of 1572 is the fact that song 55 in Kuiper's edition of the Geuzenliedboek, Ras
seventhien Provincen, which must have been written in July or August 1572, has Wilhelmus van Naussauwe as its
melody indication.69
The previous section is a concise and simplified summary of the enormous discussion concerning this question.
I will refrain from reporting this discussion any further, because elaborating on it is not necessary for
61 Abraham Maljaars, Het Wilhelmus: auteurschap, datering en strekking: een kritische toetsing en nieuwe interpretatie
(Kampen: Kok, 1996), 167.
62 Heeroma 1985.
63 I've translated the original slogan, which says "Op 1 april verloor Alva zijn Bril".
64 Maljaars 1996, 165.
65 Maljaars 1996, 151, 167.
66 Maljaars 1996, 151, 171, 217.
67 Maljaars 1996, 217-218.
68 Maljaars 1996, 171-173.
69 Heeroma 1985.
the understanding of my research, nor is it necessary for performing it. If my thesis were to produce results that
inform us in any way about a more precise date of creation of the Wilhelmus, this would be highly relevant, a point on which
I'll elaborate in the following sections.
Where to?
Another classic question in Wilhelmus research is where to, that is, to what end, it was written. What is the exact
nature of the song? An interpretation of the tone and ethos of the song as apologetic, comforting and accepting
is vital for an early dating, because the nature and function of the song would then agree with the year of
presumed release.70 At the end of 1568 and the beginning of 1569 the Dutch needed comfort, and the Prince
had lost enough to be accepting about and to apologize for. If the Wilhelmus is indeed a consolatory song, a
poetic rendering of a supposed farewell speech by the prince at the disbanding of his troops at Strasbourg in
January 1569, then an early dating, 1568-1569, would be a logical one.71 Researchers such as Schotel, van Eyck and
Duinkerken took this position.72
This interpretation has, however, gone out of fashion. Kuiper and Drewes, for example, held to a later
dating, because they defined the Wilhelmus as a propaganda song73 that was meant to make the Dutch support the
prince and shed the Spanish yoke. An early dating would then be impossible, because the situation would have
been too hopeless, the people too skeptical and the prince too bitter for pugnacious propaganda.
The song, in this interpretation, focuses on Orange, presents him to the Dutch people and appeals to them to follow him.74
The debate has not been settled, but one assumption runs through it: the song implies the attitude of the people
for whom it was destined.
Solving the "where to?" may solve the question of "when?". However, searching for the intention of an author is a
practice of ill repute, because the evidence is often based on personal interpretation. Let's include another opinion to
emphasize this point.
According to F.K.H. Kossmann, the tune of the Wilhelmus reveals that it should not only be regarded as
a geuzenlied, but that it also has its origins in spiritual songs, martyr songs and historical songs. It contains a polite
and spiritual sound, as well as a rough sound that is characteristic of a mocking song or a soldier's tale.75 The
manifestos of the prince had already justified his actions and announced his political intentions. Saravia's76 sermon
had placed, on behalf of the prince, the strength and value of the spiritual above that of the earthly.77 The poet of
the Wilhelmus must have been driven by another urge, namely to give voice to the emotional life of the prince:
the defeated general, the ignored nobleman, the righteous and selfless politician and, above all, the sincere
conscience that the poet gave him, the symbol of palladium and exile. It's not a song that draws its poetic
value from beauty of expression, artistic form or depth of content, but a true expression of sincere
emotion, which rules over form and content. In other words, according to Kossmann it was much more than a
propaganda script.78
This prima facie convincing assertion can be challenged by the fact that the Wilhelmus was, rather quickly after its
making, grouped with other geuzenliederen in a geuzenliedboek. Textually, too, there seems to be no reason to
separate the Wilhelmus from other geuzenliederen. Whether you interpret Kossmann's theory as an example of
the need for contextual research or as an argument against speculation about the intention of the author, it
showcases the diversity of interpretations that qualitative research yields, as well as the necessity of solid theoretical
ground in Wilhelmus research.
Where?
The third classical Wilhelmus-question revolves around the geographical location of its conception. This question
is again connected to the others, as the location can help us narrow down the time of and reasons for its creation.
The general consensus is that the Wilhelmus must have been written somewhere in or near the presence of
the prince, probably during his exile at the stronghold of Dillenburg, in Germany.79 In the years 1568-1572, Willem
van Oranje tried to win over public opinion among our eastern neighbors. It is therefore most likely that the song
was spread in German around the same time that it was first published in the Netherlands, following the same
pattern as the other propaganda scripts of the prince.80 This does not necessarily mean that it was written
in Germany, or in German for that matter. The debate has not been as vast as that on the other Wilhelmus-questions,
probably because there are not enough historical certainties to speculate on.
song combined, altered the lyrics of the Wilhelmus, something that wouldn't even have been a possibility if the
glorified Marnix had been generally considered its author.82 J. Spoel immortalized the fictional moment in which the alleged
poet recites the song to Willem of Orange. This scene is without a doubt inspired by the famous painting by Pils,
who portrayed the poet of the Marseillaise reciting his song to the mayor of Strasbourg.83 Nowadays, the average
Dutchman, who knows only the first and possibly the sixth stanza, will point to Marnix as the author of his national
anthem.84 The tradition of Wilhelmus research favors Marnix as well, as Drewes, Rooker and Lenselink all
performed comparative stylistic analyses of his texts in relation to the hymn.85
Reasons in favor of Marnix being the author are numerous and diverse. One example is the
claim that the anthem contains linguistic elements of the dialects spoken in the southern regions of the Netherlands,
the region where Marnix was born and raised.86 An important argument against Marnix's authorship is the literary
quality of the song. Our national anthem has seven rhyme schemes of poor quality, of which Valerius's version
corrects four.87 Therefore, opponents of Marnix's authorship, among them Maljaars, Van Eyck and Buitendijk,
claim that the attribution to Marnix degrades him to a second- or third-rate poet.88 It goes without saying that the
rhetoricians and other Renaissance poets in the second half of the sixteenth century considered purity of rhyme
a quality of good poetry. An established and excellent poet such as Marnix shows hardly any impure rhyme in his
oeuvre.
Coornhert was first mentioned in 1663, by the Remonstrant preacher Geeraert Brandt89 in his Historie der
reformatie in de Nederlanden.90 There has been considerably less research into Coornhert as the author of the
Wilhelmus, stylistic or otherwise. Garmt Stuiveling was the only notable modern researcher who paid attention
to Coornhert as a possibility. This is strange, because he seems a highly plausible candidate. He knew the prince
personally and was incarcerated for a short period of time because of this, after which he faced a long-lasting
banishment under the reign of Alva.91 Coornhert never allied himself with institutions such as the church,
a school or a chamber of rhetoric. He was opposed to the Catholic persecution of heretics as well as to the
Calvinist iconoclasm.92
Marnix was an obvious Calvinist, while Coornhert was, arguably, a libertine. Over the course of history
some (Drewes) have interpreted the Wilhelmus as containing specifically Calvinist elements, while others see a
more general Christian tone.93 Choosing a side in the authorship debate is often entangled with a typically Dutch complication.
Because of the markedly Christian character of the Wilhelmus, which is atypical for a national hymn, and because of
the Dutch history of religious quarrel, people tend to project their own religious beliefs onto the lyrics. It's not
inconceivable that they wish, subconsciously or not, for the author of the national anthem to be a fellow believer.
This occurs especially in Reformed circles, where a point is made of the Calvinist interpretation.94 As you can
see, this unofficial fifth cardinal question, about the presence and strength of the Calvinist and anti-Catholic tone
of the Wilhelmus, which is typical of the ideas of both Marnix van Sint Aldegonde and Willem van Oranje during
the years 1568-1571,95 is also strongly connected to the question of authorship. Perhaps Coornhert's refusal to
ally himself with institutions or communities made it harder to identify strongly with him and, because of this, to
imagine him as the author of the anthem that binds a nation. This is only speculation on my part, but the absence of a
body of serious research into him is highly conspicuous.
Obstacles
The anonymous publication of the Wilhelmus is in itself in no way strange or rare.96 Most of the pro-Oranje
pamphlets were published anonymously. A lot of these songs have never been linked to a specific author. For
example, only a minority of 60 songs of the Een nieu geuse lieden boexcken of 1577-1578 have an identified
author.97 What stands out with the Wilhelmus is the amount of attention it has received. Why didn't the author
lay claim to his masterpiece when the battle for independence was won and his song could have made him
immortal? More importantly, how come all my predecessors fell short when it came to attributing the text to its
author? Even if only Marnix was seriously considered as the possible author of the Wilhelmus, the question
remains why all the research didn't lead to more significant results. How come extensive research,
stylistic or otherwise, has not yet determined Marnix's alleged authorship to be true or false? When I limit myself to
stylistic research in trying to understand the current status quo in the Wilhelmus tradition, several major
methodological flaws of analysis performed by a human expert on the basis of close reading become visible. I'll discuss
them problem by problem.
Idiom
Van Haeringen98 states that studies on the basis of similarities, comparing the idiom of the Wilhelmus with that of
texts by Marnix, can't produce convincing evidence. Large numbers of similarities are to be expected a priori,
because the Wilhelmus and the Marnix corpus share a common cultural background, having been written in the same time,
the same geographical location and probably the same language, and therefore share topical and linguistic influences.99
Examples of this type of study, like those of Lenselink,100 Rooker101 and Den Besten,102 confirm Van Haeringen's
statement, as they seem to reach no further than the identification of some striking resemblances.103 In addition,
Marnix's Psalmen Davids, a text often used for comparative stylistic analysis, counts 1419 verses, which
translates, according to a rough estimate by Maljaars,104 to 8000 lines of poetry. The psalms are thus roughly 70 times
the size of the Wilhelmus, with its 120 lines of poetry. This imbalance almost guarantees that the idiom
of the Wilhelmus is present in the psalms, but not the other way around. The discovered parallels are,
however, also found in other authors' oeuvres, which makes them non-specific, so that they substantiate little105
and are not very convincing unless one is preaching to the choir.106
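The effect of this size imbalance can be illustrated with a minimal sketch. The two strings below are invented English stand-ins, not the Wilhelmus or the Psalmen Davids, and the function name vocabulary_coverage is hypothetical; only the line counts 8000 and 120 come from the text above. The sketch measures how much of each text's vocabulary also occurs in the other.

```python
def vocabulary_coverage(small_text: str, large_text: str) -> tuple[float, float]:
    """Return (share of the small text's vocabulary found in the large text,
    share of the large text's vocabulary found in the small text)."""
    small_vocab = set(small_text.lower().split())
    large_vocab = set(large_text.lower().split())
    if not small_vocab or not large_vocab:
        raise ValueError("both texts must contain at least one word")
    shared = small_vocab & large_vocab
    return len(shared) / len(small_vocab), len(shared) / len(large_vocab)


# Size ratio reported in the text: 8000 lines against 120 lines.
size_ratio = 8000 / 120  # about 67, i.e. roughly 70-fold

# Invented toy texts (assumption): a short "song" and a much longer "psalter".
short_song = "the prince is true to the land"
long_psalter = (
    "the prince of the land is true and just and the people sing "
    "to the lord in the morning and in the evening"
)

small_in_large, large_in_small = vocabulary_coverage(short_song, long_psalter)
print(f"size ratio: {size_ratio:.1f}")
print(f"short text covered by long text: {small_in_large:.0%}")
print(f"long text covered by short text: {large_in_small:.0%}")
```

In this toy case every word of the short text recurs in the long one, while the reverse coverage is far lower, which is the one-directional overlap the paragraph describes.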
Another problem with comparing the idiom of two texts is the possibility of imitation. How do we know, for
example, that Marnix wasn't strongly influenced by the Wilhelmus and imitated it in his other songs, making them
stylistically similar to the anthem? Considering the times and the influence of the Wilhelmus, this seems very
plausible. To make matters worse, we also do not know whether the Wilhelmus itself was written by an author who
was heavily influenced by Marnix's work. Again, considering the influence of Marnix and the practices of
emulation of those times, this is not unthinkable.107
Metre
One major argument used in favor of Marnix as the author, put forward by Den Besten and Otterloo, is that Marnix
was the only poet to use iambs convincingly at such an early phase in the development of metre in poetry,
thereby solving the problem of the literary quality of the Wilhelmus as well as strengthening their Marnix hypothesis.
This theory has, however, been debunked by Maljaars, who claims that at that point in time many poets, among them Marnix
and Coornhert,108 were able to produce an iambic ground pattern. Jan van Hout was credited by his peers as the
first to apply this alternation, and witness reports should definitely not be ignored.109 One of those
witnesses is the poet Maeren Beheyt, who dedicates his own poem Van t Maetvinden to Jan van Hout, concluding
his last verse with the line "Neerduyts maetklanckx voorbeelt sproot uyt van Hout in Leyden", meaning The
The reason this argument is included in this section is that the analysis of metre could be considered a useless
exercise to begin with, so that we lose yet another feature that could provide information. The reason for this
dismissal is that the axiom112 of the Wilhelmus as a consciously arranged metrical structure seems very
improbable. The metre of sixteenth-century songs was often dictated by the melody of the music.113 I see no
grounds for regarding the Wilhelmus as exceptionally free from its musical chains.
Subjectivity
A major obstacle for comparative stylistic analysis by a human expert is the subjective nature of differentiating between
relevant and irrelevant stylistic parallels. Maljaars explains that a completely objective norm for the number of
differences that determines whether a certain text has or has not been written by a certain author does not exist.
Nobody can factually determine the relevance or number of similarities or dissimilarities needed between two
corpora of language for them to be considered as being by the same or by different authors.114
Conclusions of existing Wilhelmus-research
The four questions that Veenendaal asks in his fundamental article115 from 1954, namely when, where, to what end
and by whom the Wilhelmus was written, have not yet been adequately answered, despite the detailed examination of the
anthem. The cautious conclusions with which Ad den Besten had to end his scientific paper116 are characteristic of
the current situation in Wilhelmus research: conclusive evidence has still not been found.117 The exact year and
place of origin are disputed, and so are its literary value and the nature of the national anthem, alongside,
of course, its anonymous poet. In the course of time it has been characterized as a valedictory song,118 a
consolatory song,119 an encouraging or exhilarating song, an apologia120 and a propaganda121 song.122 There
is some consensus about a date of origin between October 1568 and April 1572, but within those four
years opinions differ greatly, as they do on the other three questions.123
110 A dialect or combination of Dutch and German spoken in the eastern provinces of the Netherlands and
the western provinces of Germany.
111 Maljaars 1996, 41-43.
112 Maljaars 1996, 48.
113 Maljaars 1996, 48.
114 Maljaars 1996, 12.
115 Veenendaal 1985, 73-92.
116 Den Besten 1983.
117 Abraham Maljaars and Samuel Jan Lenselink, "Inleiding en verantwoording", in Het Wilhelmus: Een Bibliografie (The
Hague: Stichting Bibliographia Neerlandica, 1993), 1, 5.
118 A. van Duinkerken, "Het Wilhelmus" in Verzamelde Geschriften III (Utrecht: Het Spectrum, 1962).
119 Gilles Dionysius Jacob Schotel, Gedagten over het oude volkslied Wilhelmus van Nassouwen en den vervaardiger van
hetzelve (Leiden: 1834); Johannes Postmus, Het Wilhelmus (Kampen: Kok, 1900);
P.N. van Eyck, "Het Wilhelmus" in Wilhelmus van Nassouwe (Middelburg: P. Geyl, 1933).
120 P. Leendertz (jr.), Het Wilhelmus van Nassouwe. Met verklaring en historische toelichting (Zutphen:
Thieme, 1925).
121 E.T. Kuiper, "Wilhelmus van Nassouwe", Taal en Letteren 12 (1902): 1-7-120; Veenendaal 1985.
122 van Es 1985, 47.
My thesis initially attempts to answer the "who?" question, but since insight into the when, where
and why may help to answer my main question, I expand my field of interest. I dare
say that contributing to any of the four Wilhelmus-questions means contributing to each of them.
The Marnix-or-Coornhert debate, for example, is largely based on historical knowledge. If these historical
facts were changed or reinterpreted, Marnix could lose his role as the primary candidate, as happened after the Second World
War, when the birth date of the anthem was reconsidered. In the years 1568 and 1569 Marnix was not yet
convinced, although he knew Willem van Oranje through his brother Lodewijk van Nassau, that the Prince was
the man the Wilhelmus portrays him to be.124 The poet had seen Hendrik van Brederode as the promised leader of
the Netherlands,125 but Brederode had to cease his armed resistance against Spain in 1567. Marnix said about
the period after this defeat, "se soumettre à un chef, qui commande avec authorité",126 a headless resistance. If
the anthem was written during this period, it's very unlikely that Marnix is its author. This is an illustration of how one
piece of information can change the outlook on the whole puzzle.
As mentioned, I'll not pursue arguments on the basis of biographical or other historical evidence, nor am I
convinced that I could do this better than my predecessors. The Wilhelmus case is, however, currently devoid of
stylistic research, and this is where I'm determined to contribute. A large part of the problems of traditional stylistic
research that were discussed in the previous section can now be circumvented or solved by the methods of
non-traditional authorship attribution. Now that the historical context and the Wilhelmus research tradition have
been summarized, I discuss the new methodology and its theory in the following sections.
authorship of an anonymous text based solely on internal evidence is also a very old one, dating back to at least
the medieval scholastics.127 Human experts performed qualitative analyses, finding clues within the text.
Contributing will still be almost impossible if I only follow in the footsteps of giants, that is, if I try to
repeat or add to their methods. I've got to try to stand on their shoulders, applying relatively new methods to this
old case steeped in tradition, resting one foot on the canon of Wilhelmus research and the other on the discipline of
text-based authorship attribution with statistical methods and computational means.
These methods require quantitative analyses. While qualitative research is based on precise observation
and description of individual occurrences, quantitative research is based on computing frequencies, relations and
distributions of features and relevant statistics.128 Most modern research in computational literary studies, including
the research I favor, is likely to practice a mixed method. Qualitative observations can be confirmed by
quantitative analysis, and quantitative findings often need qualitative analysis to explain certain results. I'll
elaborate in the following sections.
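As a minimal illustration of what "computing frequencies" amounts to in practice, the sketch below counts relative word frequencies in the first four lines of the Wilhelmus in modern spelling. The tokenization rule (lowercased runs of letters) and the function name relative_frequencies are assumptions made for this example, not the procedure used in the thesis.

```python
import re
from collections import Counter


def relative_frequencies(text: str, top_n: int = 3) -> list[tuple[str, float]]:
    """Return the top_n most frequent words with their share of all tokens."""
    # Assumed tokenization: lowercase, keep only runs of letters.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return []
    counts = Counter(tokens)
    return [(word, count / len(tokens)) for word, count in counts.most_common(top_n)]


opening = (
    "Wilhelmus van Nassouwe ben ik, van Duitsen bloed, "
    "den vaderland getrouwe blijf ik tot in den dood."
)

for word, share in relative_frequencies(opening):
    print(f"{word}: {share:.3f}")
```

Dividing by the total number of tokens makes texts of different lengths comparable, which matters for a text as short as the Wilhelmus.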
129 M. P. Oakes, "Corpus linguistics and stylometry", in A. Lüdeling & M. Kytö (eds.), Corpus Linguistics: An International
Handbook (Berlin: Mouton de Gruyter, 2009), 1070-1090.
130 Koppel, Schler and Argamon 2009.
131 Frederick Mosteller and David L. Wallace, Inference and disputed authorship: The Federalist (Reading, MA: Addison-Wesley, 1964).
132 Efstathios Stamatatos, "A survey of modern authorship attribution methods", Journal of the American Society for
Information Science and Technology 60, no. 3 (2009).
133 David I. Holmes, "Authorship attribution", Computers and the Humanities 28, no. 2 (1994); David I. Holmes, "The
Evolution of Stylometry in Humanities Scholarship", Literary and Linguistic Computing 13, no. 3 (1998).
134 Stefan Evert et al., "Explaining Delta, or: How do distance measures for authorship attribution work?",
presentation, 24 July 2015, Lancaster.
quantitative text analysis, this leads to the practice of attributing texts of unknown or disputed authorship to an
author with the same style on the basis of quantitatively measured linguistic features.135
Hesitation among literary scholars and mistrust of such a blatantly quantitative approach may be
alleviated by choosing the least contestable mode of analysis, namely that of counting. The stylometrist
looks for a unit of counting that accurately translates the style of the text, where we may define style as a
set of measurable patterns which may be unique to an author.136 Thus style, as defined by computational
literary studies, means something completely different from our usual understanding of the term.137 Traditional
stylistics has a holistic approach to style, focusing on the semantics of a text, while computational
stylometry focuses on the aspect of style that is usually ignored by traditional stylistics.138 For the sake
of clarity, I employ the following definition of style: "Style is a property of texts constituted by an ensemble of
formal features which can be observed quantitatively or qualitatively."139 In this definition, style is not
something unique to literary works; rather, every text has a certain kind of style. The described
ensemble of formal features may, however, be interpreted from a literary perspective.140
The statistical analysis of a literary text can be justified by the need to apply an objective methodology to
work which, for a long time, may have received only impressionistic and subjective treatment.141 Features are
counted or measured and often further analyzed with mathematical statistics to discover thresholds, significant
regularities (patterns) or irregularities (outliers) and many more stylistic characteristics of a text or corpus of virtually
any size. Statistically and computationally supported authorship attribution measures textual features, sometimes
of great complexity and/or in great quantity, to distinguish between texts written by different authors.142 This is not
only another way of analyzing a text, it's another way of reading it.
Distant reading
With the ever-progressing digitization of literature and the development of tools for the selection, structuring and
analysis of digital text, distant reading has become an alternative to close reading.
When close reading, the researcher works heuristically, searching within a text for confirmation or
falsification of a theory or a frame of interpretation. This method has given the literary scholar many findings of
a subjective or intersubjective nature. A reader or group of readers is limited in the amount of work they can read.
The study or reading of entire genres or literary periods is therefore beyond the grasp of the close reader.
Moreover, counting small textual elements such as words, letters or punctuation marks is usually too
time-consuming for the scholar to perform, even when confining himself to one or several novels.143
Distant reading is a mode of reading in which textual units are analyzed quantitatively with the aid of a
computer. This data-based research breaks the text down into measurable units and converts them into numbers, which
can be analyzed in enormous quantities with relative ease. It is only with these methods, those of
computational literary studies, that patterns, correlations, models and structures within the data can be calculated
and visualized, and that these trends and relations can be discovered and confirmed or rejected.144
The methods of distant reading have a character that conforms to the social sciences, where data and the
interpretation of that data are often the main focus of research. The analysis itself doesn't contain
interpretation, and the data are objective results based on calculations.145 Interpretation of the text starts with the
interpretation of the results or data. Conclusions are backed up or justified by reproducible numbers, which of course
always depend on the methods, the corpus, the parameters, the tools and other design choices of the researcher,
and which can be considered an approximation of the facts. "Distance is a condition of
knowledge."146 The distance to the text allows us to do research on a scale and of a kind that is unfeasible with close
reading.
143 Tim de Winkel Het vergezicht van de Nederlandse literatuur. Een distant reading van een groot corpus van modern
Nederlandstalig proza 2015 unpublished
144 Anne Burdick et al., Emerging Methods and Genres in Digital Humanities (Cambridge, MA: The MIT
Press, 2012), 38.
145 Stephen Ramsay, Reading Machines: Toward an Algorithmic Criticism (Champaign IL: University of
Illinois Press, 2011), 19.
146 Franco Moretti, Conjectures on World Literature. New Left Review 1 (2000): 57.
While I lose one aspect of understanding my corpus, I will gain the opportunity to read an enormous
quantity of work, to read features smaller than the units we traditionally read, like characters or punctuation marks,
and to perform difficult analyses that cannot be performed with the naked eye, a pencil and a piece of paper. In order to gain
knowledge about the system, you've got to sacrifice knowledge about the texts themselves.147
This makes the essence of non-traditional authorship attribution and stylistics incompatible with the
traditional way of reading, to such an extent that I share the view that computational literary studies offer not only
new methodological options but a new way of reading. So when reading a text, or perhaps even a great
number of texts, or when asking questions about literature, language and its anchors in society, and the scholar
has to determine how to go about it, why not stylometric/visual?148
When performing authorship attribution, the choice of criteria believed to characterize authors is the very first step.
One should probably not believe that any single set of variables is guaranteed to work for every problem, so
researchers must be familiar with variables that have worked in previous studies as well as with the statistical methods
to determine their effectiveness for the current problem.149 In my case the selection of the features, tools and
methods depends on their ability to handle short and noisy text from multiple candidate authors.
Categorization task
A variety of methods has been applied to authorship problems of various sorts,150 151 like author verification,152
plagiarism detection, author profiling and detection of stylistic inconsistencies.153 The simplest kind of authorship
attribution, and the one that has received the most attention, is the one in which we are given a small closed set
of candidate authors and are asked to attribute an anonymous text to one of them. Ideally, we have copious
quantities of text of undisputed authorship by each candidate author, and the anonymous text is reasonably
long.154 This is called a multiclass, single-label text categorization task.155 It is a very solvable problem, one that
is tackled very often and very successfully.
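The closed-set task described above can be sketched with a Burrows-style Delta, a distance measure that is common in stylometry. This is offered only as an illustration under stated assumptions: the candidate labels and texts below are invented, the function names are hypothetical, z-scores are computed over the candidate profiles only, and nothing here is claimed to be the exact procedure of this thesis.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_profile(text: str, vocab: list[str]) -> list[float]:
    """Relative frequency of each vocabulary word in one text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in vocab]


def delta_scores(candidates: dict[str, str], disputed: str,
                 n_features: int = 100) -> dict[str, float]:
    """Mean absolute difference of z-scored word frequencies between the
    disputed text and each candidate (lower means stylistically closer)."""
    pooled = [tok for text in candidates.values() for tok in text.lower().split()]
    vocab = [word for word, _ in Counter(pooled).most_common(n_features)]
    profiles = {name: relative_profile(text, vocab) for name, text in candidates.items()}
    target = relative_profile(disputed, vocab)

    totals = {name: 0.0 for name in candidates}
    used = 0
    for i in range(len(vocab)):
        column = [profile[i] for profile in profiles.values()]
        mu, sigma = mean(column), pstdev(column)
        if sigma == 0:  # a word used identically by all candidates cannot discriminate
            continue
        used += 1
        z_target = (target[i] - mu) / sigma
        for name, profile in profiles.items():
            totals[name] += abs(z_target - (profile[i] - mu) / sigma)
    if used == 0:
        raise ValueError("no discriminating features in the candidate texts")
    return {name: total / used for name, total in totals.items()}


# Invented toy data (assumption): two candidates with opposite word habits.
toy_candidates = {
    "candidate_a": "the the the and of the and the",
    "candidate_b": "of of of and the of and of",
}
toy_disputed = "the the and the of the"

scores = delta_scores(toy_candidates, toy_disputed, n_features=3)
best = min(scores, key=scores.get)
print(scores, "closest:", best)
```

The task is single-label because exactly one candidate, the one with the lowest distance, is returned. With real data the most frequent words would typically number in the hundreds, and a very short disputed text makes the frequency estimates correspondingly unstable.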
So in their straightforward form, authorship attribution problems fit the standard modern paradigm of a
categorization problem.156 157 There are, however, some important characteristics that distinguish authorship
attribution from other text categorization tasks, and these are the differences we've got to keep an eye on when
making design choices.158 First of all, in style-based text categorization the most significant features are the
most frequent ones,159 while in topic-based text categorization the best features should be selected on the basis of their
discriminatory power.160 Another difference is that in authorship attribution tasks, especially in forensic
applications, there is extremely limited training text material, while in most text categorization problems there is
plenty of both labeled and unlabeled data.161 This is definitely the case in my thesis. Also, in most cases the candidate
authors are imbalanced. In such cases, the evaluation of authorship attribution methods should not follow the
practice of other text categorization tasks, since these mostly have a well-balanced corpus.162
160 George Forman, "An extensive empirical study of feature selection metrics for text classification",
Journal of Machine Learning Research 3 (2003).
Writing in a forensic context, Bailey163 proposed three rules to define the circumstances necessary for
authorship attribution:164 first, the number of putative authors should constitute a well-defined set; second, the
lengths of the writings should be sufficient to reflect the linguistic habits of the author of the disputed text and also
those of each of the candidates; third, the texts used for comparison should be commensurate with the disputed
writing. Bailey also lists the general properties required of the features used: they should be salient, structural, frequent and
easily quantifiable, and relatively immune from conscious control.
Relevance
The Wilhelmus as object of research
As described above, the Wilhelmus can be considered an important factor in the birth and consolidation of
the Dutch national identity, and perhaps even an actual factor in the struggle for independence. This gives the
text major historical and contemporary importance. A nation should know its own history, and the Wilhelmus is a
central part of it. Knowing the author could be a window onto other historical facts, the prince's persona and role,
and the Eighty Years' War.165 Moreover, this sense of urgency is shared by the Dutch citizenry. With the renewed
interest in our own history and the culture of the Dutch Golden Age, the so-called cultural nationalism, the
popularity of the national anthem has experienced a renaissance as well.166 The Wilhelmus is the only national
anthem in Western Europe that has made this kind of comeback.167
A second argument I'd like to point out is the unique position the Wilhelmus holds in Dutch literary
history and in the global history of song, as the world's oldest official national anthem. This canonical status gives
the song major literary relevance.
Another feature that makes the Wilhelmus such an interesting object of research is the high literary quality
that is attributed to it, despite the often criticized imperfect rhyme. Dick Coster calls the Wilhelmus the highest of
the geuzenliederen,168 and Martinus Nijhoff called the Wilhelmus "poesie pure" and counted it among the absolute
greatest poetry. I've translated a comment of Nijhoff's below, to illustrate his poetics: "There are among the beggars'
songs poems with a real tone, grand, deep and strong. Much deeper and greater than Vondel, Hooft and
Bredero have ever made audible."169 If we take literature seriously as an object of research, a search for the author of
one of its highlights seems only natural.
Obviously, the opinion that the Wilhelmus is a valuable object of research is shared by the authors of the 350
Dutch scientific publications about the Wilhelmus, mostly of a literary-historical character, registered by Maljaars and
Lenselink in 1993.170 This number has only grown in the 20 years since. According to a rough estimate,
there are currently around 350 or 400 publications concerning the Wilhelmus and the authorship question, adding up to
15000 pages, or about 1000 pages per verse of the song.171 So much has been written about the
Wilhelmus that it seems almost impossible to form new thoughts on the matter without any discoveries from the
archives.172 What can my thesis possibly add?
attribution, can now be solved by adopting the methodology of computational literary studies. I've already explained
why comparative stylistic research has been problematic up until now, and I've mentioned that computational
methods could make it less so. I discuss the solutions below.
The problem of the petitio principii in comparative research was that the question brings forth the answer. When
searching on the basis of similarities in idiom, the author needed to be determined a priori, leaving only the option
of gathering supporting evidence. The opposite, comparative analysis based on differences, can only lead to excluding
authors. With distant reading, many works by many authors can be compared on a large number of textual
features in a matter of minutes. With so few limitations on the amount of text and the number of features, because
measuring them no longer costs time and effort, we can examine the texts of all the potential authors and compare them to
the Wilhelmus simultaneously, on a number and scale of features previously impossible to measure. Visualization
of these analyses expresses the stylistic similarity of each potential author's style to that of the
anthem, making the results accessible. Now the previous hypothesis, "Is Marnix's work stylistically comparable
with the Wilhelmus?", becomes "Is Marnix the most stylistically comparable to the Wilhelmus of all its potential
authors?", avoiding a circular argument and a self-fulfilling prophecy.
Objectively ascertaining the relevance or number of similarities or dissimilarities needed between two
corpora of language for them to be considered as by the same or by different authors is still impossible, but
statistics can help us determine an acceptable margin of error, a probability of success, beyond which we
accept a result as factual until contradicting results emerge.
A related problem is that semantic and stylistic similarities overlap176 and are not two mutually exclusive
categories. When measuring style quantitatively, a human expert needs to determine the nature of a textual
dimension every time he encounters a similarity or dissimilarity. Computational research lets you set
this problem aside, because it doesn't cling to this distinction while, most of the time, heavily favoring features we would
characterize as stylistic. It does not recognize vague semantic parallels177 if they're not grounded in a textual
base. Effects of genre will occur because they do have this base. Any associations of the researcher based on an
intimate knowledge of themes and motifs and strong feelings about their quality will be harder to find.
The problem of creative, and therefore stylistic, emulation is not completely solved by computational means, as
the comparison alone still wouldn't tell us which text is the original and which the epigone. When using
low-level features, like the frequencies of function words, it does address conscious imitation, because the usage of
these is nearly impossible to mimic and very hard to influence.178
The problems surrounding subjectivity are still present but a lot smaller and more manageable, provided the
researcher acknowledges his own horizon. As the role of the researcher has now predominantly moved to the setup
of the experiment and the interpretation of the results, I maintain that, although it might be the computer doing the analysis, it does so on the
human researcher's terms.
these aspects of the results. In addition to this, I'm also enthusiastic to add to the catalogue of computational
research on Dutch text, since I'm a research master's student in Dutch language and literature.
Methods
In this section I'll discuss the methods I've used and the methodology that I've seriously considered, along with
their theoretical background, previous results and practical use.
Features:
An important variable in authorship attribution (AA) research is the choice of stylometric features.
Changing the features means changing what you measure, and in doing so you change the conditions under
which the things you want to measure will be measured. Previous studies185 on authorship attribution have
proposed taxonomies of features to quantify the writing style, the so-called style markers, under different labels
and criteria.186 The classification of features can be based on the size of a feature, but can also refer to more
complicated characteristics, such as narrative perspective or textual macro-structure.187 In my classification I
distinguish size from type and discuss them, in that order, in the next paragraphs. I discuss the four units or
sizes of features first and separately from the types of features, in order to explain the characteristics of the units
that otherwise might get lost in the greater complexity of feature types. Later on I will not persist in this
distinction.
Size of features
An influential characteristic of features is their size. Changing the unit in which they're counted and analyzed
means reading the text differently. There are four different units of features that can be counted as being of
different size: character, lexical, syntactic (sentence-based) and semantic features.
Character features
Measuring character-level features means counting any character as a unit, including blank spaces and/or
punctuation marks. There are various character-level measures, for example alphabetical character count, digit
count, uppercase/lowercase count, letter frequencies and punctuation count,188 that have proven to be useful.189
Lexical features
Lexical features are words, or any bundle of characters between two spaces. Frequently used examples of
lexical-level features are function words or content words, predominantly nouns that are expressive of the topic or genre
of the text. The usage of lexical features is of course not bound to the obvious or easily imaginable cases;
there's a whole palette of unexpected applications. We can also use proper nouns or, to be specific, count
185 Holmes 1994;Efstathios Stamatatos, Nikos Fakotakis and George Kokkinakis, Automatic Text Categorization in Terms
of Genre and Author. Computational Linguistics 24. no.4 (2000);
Rong Zheng, et al. A Framework for Authorship Identification of Online Messages: Writing-style Features and Classification
Techniques. Journal of the American Society for Information Science and Technology 57, no. 3 (2006).
geographical names,190 in order to analyze the reader's sense of geographical place and distance. Lexical-level
features have a grand résumé in the computational literature.191
In the case of character or lexical-level features, the analyst can choose to increase the size of the unit by using
n-grams. This means that instead of counting one character or one word, you choose a number higher than one.
The computer will now search for combinations of words or characters, like "he is" or "?!". This is not possible for
the other feature levels, because lexical and character features consider a text as a mere sequence of word
tokens or characters, while syntactic and semantic features require deeper linguistic analysis.192
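As an illustration of the n-gram idea described above, here is a minimal sketch in Python. It is not the R/Stylo workflow used in this thesis; the function names `char_ngrams` and `word_ngrams` and the sample sentence are assumptions made only for this example, and the word splitting is deliberately naive (whitespace only).

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams, spaces and punctuation included."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return all overlapping word n-grams, splitting naively on whitespace."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


# Hypothetical sample text, chosen only to show a repeated word bigram.
sample = "he is here and he is there"

word_bigrams = Counter(word_ngrams(sample, 2))
char_trigrams = Counter(char_ngrams(sample, 3))

print(word_bigrams.most_common(3))
print(char_trigrams.most_common(3))
```

With n = 1 both functions fall back to plain character or word counts, which is why n-grams can be seen as a way of enlarging the counting unit rather than as a separate kind of feature.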
Syntactic features
Syntactic features are sentence-based features. Research into syntactic features tries to discover a syntactic
pattern that the author is presumed to put unconsciously into all of his texts. It intuitively feels like a more reliable
fingerprint than lexical information, but this intuition is not necessarily true. The analysis of syntactic
features is language dependent and often requires robust and accurate natural language processing (NLP) tools.
It relies on the availability of a parser able to analyze a particular natural language with
relatively high accuracy. Even then, it will produce noisy data-sets due to unavoidable errors made by the
parser.193 Syntactic features alone often perform worse than lexical features, but a combination of the two often
improves the results.194
Semantic features
Semantic features are features that are sought out and analyzed based on their meaning. We tend to
see semantics as holistic, subjective and interpretative, so we assume they must require large units and be heavily
dependent on context. However, sometimes semantic features rely on the counting of lexical features. When, for
example, looking for sexist remarks in a book, maybe by counting curse words, you're looking for semantic
features, although semantic features often require more interpretation than just counting. Whether semantic
features, or non-semantic features for that matter, are a feature size that actually exists is open to debate.
The more detailed the text analysis required for the extraction of stylometric features, the less accurate
and the noisier the produced measures are. NLP tools can be applied successfully to low-level tasks such as
sentence splitting, part-of-speech (POS) tagging, text chunking, and partial parsing, so relevant features would be
measured accurately, and the noise in the corresponding data-sets remains low. More complicated tasks such as
full syntactic parsing, semantic analysis, or pragmatic analysis cannot yet be handled adequately by current NLP
190 Karina van Dalen-Oskam, Names in Novels: an Experiment in Computational Stylistics. Literary and
Linguistic Computing (2012).
191 Patrick Juola, Authorship attribution for electronic documents. in Advances in digital forensics II. Eds. M. Olivier and S.
Shenoi (Boston: Springer, 2006); Stamatatos 2009; Koppel, Schler and Argamon 2009.
192 Stamatatos 2009.
193 Stamatatos 2009.
194 Michael Gamon, Linguistic correlates of style: Authorship classification with deep linguistic analysis
features, in Proceedings of the 20th International Conference on Computational Linguistics (Morristown, NJ:
Association for Computational Linguistics, 2004).
technology for unrestricted text. As a result, very few attempts have been made to exploit high-level features for
stylometric purposes.195 The most important method of exploiting semantic information so far was described by
Shlomo Argamon.196 He defined a set of functional features that associate certain words or phrases with
semantic information. Regrettably, he did not provide information about the accuracy of the tools or methods.
I will not attempt to perform analyses at the semantic feature level because of the high requirements and
the lack of convincing research and theory. It seems impossible not to measure semantics, or syntax, when
measuring small but meaningful units like words. However, I've no practical way to quantify or measure the extent
of their presence, so I won't pursue semantic features as a feature level but treat them as a welcome influence included in
the measured low-level features.
Types of features
Types of features refers to what a feature measures. In this section I'll explain the types I've taken into
consideration, going from simple low-level features up to the higher-level features.
Punctuation
I'll not discuss research based on punctuation in this thesis. The Wilhelmus makes very little systematic use of
punctuation, presumably because of all the interference by publishers and its oral transmission, so extensive
analysis will, in all probability, not lead to any useful results.197
Complexity measures
While a lot of the early research into the stylistic authorial fingerprint focused on complexity measures, I
won't use any of these types of features. The great variety of complexity features, including sentence length,
average word length, word length distribution, word frequencies, character frequencies, syllable or letter count
and others, used to measure vocabulary richness or text complexity,198 have all proven to be inadequate
for authorship attribution and have been surpassed by better methods.199 Word length, for example, proposed by
Mendenhall,200 appears to be so unreliable that any serious student of authorship should discard it.201
Stamatatos202 discards both sentence length counts and word length counts, two simple complexity
measures, because they may introduce considerable noise into the measurement.
More sophisticated measures were invented, like the type-token ratio and the hapax legomena,203 the
number of words appearing exactly once in a text. These are methodologically less primitive but still
deliver inadequate results. Both measure some kind of vocabulary richness but are too dependent on text
length.204 Even complicated statistical measures such as Yule's K-measure,205 Sichel's S-measure206 and Honoré's R-measure,207 all of which I will not discuss any further, proved of little value.208
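The dependence of these richness measures on text length can be made concrete with a small Python sketch. The function names and the toy sentence are assumptions for illustration only; the "long" text is simply the short one repeated ten times, an artificial extreme that keeps the vocabulary fixed while the length grows.

```python
from collections import Counter


def type_token_ratio(words):
    """Number of distinct words (types) divided by the number of running words (tokens)."""
    if not words:
        return 0.0
    return len(set(words)) / len(words)


def hapax_legomena(words):
    """Number of words that occur exactly once."""
    counts = Counter(words)
    return sum(1 for c in counts.values() if c == 1)


# Toy data (invented): the same six-word sentence, once and ten times over.
sentence = "the prince of orange am i".split()
short_text = sentence
long_text = sentence * 10

for name, words in (("short", short_text), ("long", long_text)):
    print(name, len(words), type_token_ratio(words), hapax_legomena(words))
```

Both texts use exactly the same vocabulary, yet the type-token ratio and the hapax count differ sharply between them, which is the kind of length effect that makes these measures hard to compare across texts of unequal size.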
Idiosyncrasies
Measures that capture idiosyncrasies of an author's style, like spelling and formatting errors, are not part of my
analyses. The availability of accurate spell checkers is still problematic for many natural languages.209 Human
experts mainly use observations similar to idiosyncrasies to attribute authorship. This is an important reason not
to focus on this aspect, because my aim is to apply methods and analyze features that haven't already been used
or analyzed, in order to keep my thesis as relevant as possible. Maljaars210 describes some of this research and,
unsurprisingly, a lot of it revolves around the weak rhymes.
N-grams on character-level
Frequencies of n-grams on the character level are able to capture nuances of style, including lexical information,
hints of contextual information, use of punctuation and capitalization, among others. Character n-grams are also
tolerant to noise.211 Style-based text categorization includes style-based errors, which can be considered personal
traits of the author,212 and which character n-grams will capture as such. They also capture lexical preferences and
even grammatical and orthographic preferences without the need for linguistic background knowledge.213
A secondary advantage is that the computational requirements are minimal.214 The procedure of
extracting the most frequent n-grams on character level is, contrary to n-grams on word level, language
independent and requires no special tools; however, the dimensionality of this representation is considerably
increased in comparison to the word-based approach.215 This is because redundant information is captured
and because many character n-grams are needed to represent a single long word.
The application of n-grams on character level to authorship attribution has proven quite successful.216 In
several text-classification tasks, including authorship attribution, bigrams and character n-grams of variable length
produced better results than lexical features.217 This leads me to adopt character n-grams.
An important consideration is the definition of n, that is, how long the string of characters should be. A large n would
better capture lexical and contextual information, but it would also capture thematic information and increase the
dimensionality of the representation substantially (producing hundreds of thousands of features). On the other hand,
a small n, i.e. 2 or 3, would be able to represent sub-word, syllable-like information, but it would not be
adequate for representing contextual information. The selection of the best n value is language dependent,
since certain languages tend to have longer words than others.218 The problem of defining a fixed value for n can
be avoided by extracting n-grams of variable length.219 Sanderson and Guenter220 used several sequences
with character 4-grams as the longest sequence. Character 4-grams are also used in one of the
solutions Koppel's papers221 describe for the general authorship attribution problem, also called the needle-in-the-haystack
problem. I will come back to the methods of Koppel's work, but for now it's important to mention that he
used space-free character 4-grams and got excellent results out of them.222 For the attribution of Dutch texts with
character n-grams,223 Hoorn, Frank, Kowalczyk & Van der Ham ran a methodological experiment testing the best
n for categorization by author of Dutch poetry and concluded on trigrams, i.e. character 3-grams.
I choose two different values of n for character n-grams. The character 3-gram can capture sub-word information and the
character 4-gram is, based on the secondary literature, most of the time the best fit. By using these two character
n-grams I open up several registers and will solve, to a certain extent, the problem of choosing between stylistic
and semantic information. The combination is also more sensitive to differences in the optimal n across different genres.224
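To show what a combined 3-gram and 4-gram representation might look like, the following Python sketch builds a small relative-frequency profile for both values of n. It is only an illustration under stated assumptions: the function name `char_ngram_profile`, the whitespace normalisation, the `top_k` cut-off and the sample string are all invented here and do not reproduce the settings of the Stylo package.

```python
from collections import Counter


def char_ngram_profile(text, n_values=(3, 4), top_k=5):
    """Relative frequencies of the top_k most frequent character n-grams, per n."""
    # Assumption: lowercase the text and collapse runs of whitespace to one space.
    text = " ".join(text.lower().split())
    profile = {}
    for n in n_values:
        grams = [text[i:i + n] for i in range(len(text) - n + 1)]
        total = len(grams)
        counts = Counter(grams)
        profile[n] = {g: c / total for g, c in counts.most_common(top_k)}
    return profile


# Hypothetical sample string, not a quotation from the corpus.
sample = "van den lande van den volke van den heere"

for n, grams in char_ngram_profile(sample).items():
    print(n, grams)
```

In a real experiment the cut-off would be in the hundreds or thousands rather than five, and the same list of n-grams would have to be counted in every text so that the profiles are comparable.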
Word frequencies
The most straightforward approach to representing texts is by vectors of word frequencies. The vast majority of
authorship attribution studies are, at least partially, based on lexical features to represent the style.229 Using word
frequencies, where we look at how many times individual words occur in the corpus under analysis, is different
from using the vocabulary distribution (Vi), where we count how many words occur i times.230
In his pioneering study,231 George Kingsley Zipf was the first to reveal that a relationship exists between
the number of occurrences (i) and their Vi. He ranked the various words of a text according to decreasing
frequency and plotted on log-log paper the ranks r against the corresponding number of times the word of
rank r occurred, obtaining a straight-line configuration. This is called Zipf's first law. It essentially says that the
number of occurrences of a word is inversely proportional to its place on the frequency list, meaning that the first word
224 Efstathios Stamatatos, On the Robustness of Authorship Attribution Based on Character N-gram
Features Journal of Law and Policy 21, no.2 (2013).
225 Rosa Maria Coyotl-Morales et al. Authorship Attribution using Word Sequences. in Proceedings of the 11th
Iberoamerican Congress on Pattern Recognition (Berlin: Springer, 2006);Fuchun Peng, Dale Schuurmans and Shaojun
Wang, Augmenting Naive Bayes Classifiers with Statistical Language Models. Information Retrieval Journal 7, no.1 (2004);
Conrad Sanderson and Simon Guenter, Short Text Authorship Attribution via Sequence Kernels, Markov Chains and Author
Unmasking: An Investigation. in Proceedings of International Conference on Empirical Methods in Natural Language
Processing (Sydney: Association for Computational Linguistics, 2006).
occurs about twice as often as the second and about three times as often as the third. Zipf discovered that the thirty to
fifty MFWs account for half the word tokens in a novel.232
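The inverse relation between rank and frequency can be checked on any word list with a few lines of Python. The sketch below is a hedged illustration: the function names are invented, and the "text" is synthetic, constructed so that the count of each word is exactly 60 divided by its rank. Real texts only approximate this pattern.

```python
from collections import Counter


def rank_frequency(words):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(words)
    return [(rank, w, c) for rank, (w, c) in enumerate(counts.most_common(), start=1)]


def coverage_of_top(words, k):
    """Share of all tokens accounted for by the k most frequent words."""
    counts = Counter(words)
    top = sum(c for _, c in counts.most_common(k))
    return top / len(words)


# Synthetic data (assumption): counts follow an ideal Zipf curve, count = 60 / rank.
ideal = []
for rank, word in enumerate(["de", "en", "van", "een", "dat"], start=1):
    ideal.extend([word] * (60 // rank))

for rank, word, count in rank_frequency(ideal):
    # Under an ideal Zipf distribution, rank * count stays constant.
    print(rank, word, count, rank * count)

print(coverage_of_top(ideal, 2))
```

The `coverage_of_top` helper corresponds to the observation about the most frequent words: it reports which share of all tokens is taken up by the top k entries of the frequency list.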
Tallentire233 discusses the use and difficulty of word frequencies in authorship studies. He points out that
the bulk of any sample of written English is accounted for by the same few words, recurring with the same relative
frequency, even in very different writings, taking into consideration that 10 per cent of the English vocabulary
provides for 90 per cent of the text of all the volumes of (English) literature in all the libraries.234 Traditionally such
words, called function words, were excluded from the feature set of the topic-based text-classification methods,
since they do not carry any semantic information. However, as it turns out, the most common words, like articles,
prepositions and pronouns, are found to be among the best features to discriminate between authors.235 So far
fewer words, a few hundred, are sufficient to perform authorship attribution than for a thematic text
categorization task, which takes thousands of words.236
In the field of computational literature, the most frequent words (MFWs) are considered reliable features for
measuring the style of an author, or the authorial fingerprint.237 Besides function words, the MFWs
usually also include other non-semantic words, or semantic words that are so generally used, like "man" or
"time", that they've lost all of their discriminating semantics.238
Patterns of lexical choice can also be represented by modeling the relative frequencies of content words,
but this is very problematic. Content markers might just be artifacts of a particular writing situation or experimental
setup and might thus produce overly optimistic results, not applicable to real-life applications.239 Content words
are also genre and topic specific and under conscious control of the writer. The style factor of a text is generally
considered orthogonal to its topic. As a result, stylometric features attempt to avoid content-specific information in order to
be more reliable in cross-topic texts.240 Only in cases in which all the available texts for all the candidate authors
are on the same thematic area may carefully selected content-based information reveal some authorial
choices.241
Based on these findings, I will predominantly use stylometric measures that depend on the ratio of
occurrences of non-contextual function words. I won't perform analyses that exclusively search for content words.
In all probability, content words will be captured during the analyses when using most frequent words on very short texts,
and that's why I won't exclude them as a feature. Still, content words will never be the
object of analysis in this thesis. I prefer function words (FWs) because I need to recognize texts by the same author on
different topics. In addition to this, it's very unlikely that the use of FWs can be consciously controlled, ruling out
the possibility that an author deceives me by imitating someone else.242 Thus, they're able to capture pure stylistic
choices of the authors.243
Many studies have shown the efficacy of FWs for authorship attribution in different scenarios,244 all
confirming the hypothesis that different authors tend to have different characteristic patterns of FW use.245
Results of different studies using somewhat different lists of FWs have been similar, indicating that the precise
choice of FWs is not crucial. Discriminators built from FW frequencies often perform at levels competitive with
those constructed from more complex features.246
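A function-word profile of the kind discussed above can be sketched in Python as a vector of relative frequencies over a fixed word list. Everything specific here is an assumption for illustration: the ten-word list `FUNCTION_WORDS` is a hypothetical sample of modern Dutch function words, not the list used in this thesis, and the tokenizer simply takes runs of letters.

```python
import re
from collections import Counter

# Hypothetical, illustrative list; a real study would use a few hundred words
# or simply the most frequent words of the corpus.
FUNCTION_WORDS = ["de", "het", "een", "en", "van", "in", "dat", "ik", "niet", "te"]


def function_word_profile(text, function_words=FUNCTION_WORDS):
    """Relative frequency of each function word among all word tokens of the text."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())  # runs of letters only
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in function_words}
    return {w: counts[w] / total for w in function_words}


# Invented sample sentence.
profile = function_word_profile("De man en de vrouw van het land")
print(profile)
```

Because every text is mapped onto the same word list, the resulting vectors have the same length and order, which is what a distance measure needs in order to compare two texts.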
Syntactic features (Distribution of parts of speech, POS)
Another feature option is measuring the distribution of parts of speech (POS). A tagger assigns a tag of morpho-syntactic
information to each word token based on contextual information.247 The different percentages of nouns, verbs,
adjectives, adverbs and other parts of speech in a text are, if they can be determined accurately, a possible map of
242 Chung, C. K., & Pennebaker, J. W. (2007). The psychological functions of function words. In K. Fiedler (Ed.), Social
communication (pp. 343-359). New York: Psychology Press.
243 Stamatatos 2009.
244 A brief summary of studies that have shown the efficacy of FWs for authorship attribution in different scenarios;
Argamon and Levitan 2005;
Shlomo Argamon-Engelson, Moshe Koppel and Galit Avneri, Style-based Text Categorization: What Newspaper Am I
Reading? in Proceedings of AAAI Workshop on Learning for Text Categorization (1998);
Harald Baayen, Hans van Halteren, Anneke Neijt and Fiona Tweedie, An experiment in authorship attribution. in
Proceedings of JADT 2002: Sixth International Conference on Textual Data Statistical Analysis (2002);
José Nilo G. Binongo, Who Wrote the 15th Book of Oz? An Application of Multivariate Analysis to Authorship Attribution.
Chance 16, no.2 (2003);
Burrows 1987; de Vel et al. 2001;
Holmes, Gordon and Wilson 2001;
David I. Holmes, Michael Robertson and Roxanna Paez, Stephen Crane and the New-York Tribune: A Case Study in
Traditional and Non-Traditional Authorship Attribution. Computers and the Humanities 35, no. 3 (2001);
Juola and Baayen 2005;
Jussi Karlgren and Douglass Cutting, Recognizing Text Genres with Simple Metrics Using Discriminant Analysis. in
COLING '94 Proceedings of the 15th conference on Computational linguistics 2 (1994);
Brett Kessler, Geoffrey Nunberg and Hinrich Schütze, Automatic Detection of Text Genre. in Proceedings of the 35th
Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the
Association for Computational Linguistics (Association for Computational Linguistics, 1997);
Koppel, Avika and Dagan 2006;
Koppel, Schler and Zigdon 2005;
Thomas V. N. Merriam and Robert A. J. Matthews, Neural computation in stylometry II: An Application to the Works of
Shakespeare and Marlowe. Literary and Linguistic Computing 9, no. 1 (1994);
Morton 1978;
Ying Zhao and Justin Zobel, Effective Authorship Attribution Using Function Word. Lecture Notes in Computer Science
3689 (2005).
authorial style on the syntactic and grammatical level,248 although they only provide a hint of the structural
analysis of the sentences, since it is not clear how the words are combined to form phrases, or how the phrases
are combined into higher-level structures.249 These features are often genre specific and show a lack of
homogeneity among authors.250 They also depend on NLP technology, which still has trouble
handling unrestricted text adequately. Even in formats that include meta-text, morpho-syntactic information will not be
available as standard, and neither is NLP a standard application in the software I plan on using. These objections, in
combination with the scarcity of research that indicates successful authorship attribution using only POS,
lead me to reject this feature.
most probable attribution with these types of methods can be viewed as taking documents as points in some
space, assigning a questioned document to the author whose documents are closest to it, according to an
appropriate distance measure.
Conclusions on features
In theory, any feature set can be used with nearly any classification method, provided proper methodology is
followed in the study design. In practice, however, certain combinations have been applied and studied more
often, and have given better results than other combinations.256 It seems that low-level features, such as character
n-grams and function words, are very successful for representing texts stylistically,257 and therefore I will
predominantly focus on low-level features in this thesis. These features claim to capture only stylistic information;
however, they might capture some content information as well.258 In a paper259 where successful stylometry was
performed on a corpus and under conditions resembling mine, Moshe Koppel concluded that large sets of very
simple features, common words and character n-grams, are more accurate than small sets of sophisticated
features for this purpose. These are also the features that are traditionally ignored in human expert authorship
attribution, and definitely overlooked in the Wilhelmus research, which until now focused on a whole other
spectrum of clues, even in the rare cases where they searched for them within the edges of the page. Based on
the theory summed up in this section, I've selected the following features, which I expect to be capable of identifying
the stylistic characteristics of a very short text, while their usage stays within the reach of my capacities and the
limitations of this thesis, like time, space and means.
To sum up, based on independent feature selection the features I'll be using are:
Features on character level and word level, that will also indirectly measure syntactic and semantic dimensions of
my texts.
Frequencies of n-grams on character-level and word-level. The word-level n-grams, or collocations, are the sole
Distance measures
This section discusses the type of distance measure I'm going to use to calculate the similarity of my texts. The
term distance measure, which has been mentioned and briefly explained before, refers to a mathematical
procedure in which the differences or similarities of texts are measured by the differences between their sets of
variables, and expressed spatially.
There are several different distance measures, but I will use Burrows's Delta, because of the underlying theory,
my experience with the method, previous successful research results and its being the standard distance
measure in the field of stylistic research and authorship attribution. The statistical formula of John Burrows
measures style by the relative frequencies of the most frequent words. It has been extended and used for a
variety of attribution problems260 and is equivalent to an approximate probabilistic ranking based on a
multidimensional Laplacian distribution over word frequencies.261
Its procedure is as follows. After a text or a corpus of texts, here referred to as the target text, has been reduced
to a bag of words, these are sorted and counted and written out as a frequency hierarchy. The Delta converts
this into a measure of distance, based on the differences between the z-scores of the relative frequencies of the
most frequent words in the target text and in the test corpus. This distance measure represents the stylistic
characteristics of a text, always defined in relation to the corpus it's compared to. To summarize, the Delta is the
average of the absolute differences between the z-scores of a set of variables (words) in a corpus and the z-scores of
the same set of variables in a text (the target text).262 This gives us the following formulas:
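\[
\Delta(T, T_1) = \frac{1}{n} \sum_{i=1}^{n} \left| z(f_i(T)) - z(f_i(T_1)) \right|
\]
\[
z(f_i(T)) = \frac{f_i(T) - \mu_i}{\sigma_i}
\]
Here f_i(T) is the relative frequency of word i in text T, and \mu_i and \sigma_i are the mean and standard deviation of the relative frequency of word i in the corpus.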
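The Delta calculation can be sketched in a few lines of Python. This is a minimal illustration under explicit assumptions, not the implementation of the Stylo package: the relative frequencies are invented toy numbers for three words, the mean and standard deviation are computed over all three toy texts including the questioned one, and the population standard deviation (`pstdev`) is used, a choice that rescales all distances by a constant without changing their ranking.

```python
from statistics import mean, pstdev


def z_scores(freqs_by_text):
    """Convert {text: [relative frequency per word]} into z-scores per word."""
    names = list(freqs_by_text)
    n_words = len(freqs_by_text[names[0]])
    mus = [mean(freqs_by_text[t][i] for t in names) for i in range(n_words)]
    sigmas = [pstdev([freqs_by_text[t][i] for t in names]) for i in range(n_words)]
    return {
        t: [
            (freqs_by_text[t][i] - mus[i]) / sigmas[i] if sigmas[i] else 0.0
            for i in range(n_words)
        ]
        for t in names
    }


def burrows_delta(z_a, z_b):
    """Mean of the absolute differences between two z-score vectors."""
    return mean(abs(a - b) for a, b in zip(z_a, z_b))


# Invented relative frequencies of three frequent words in three toy texts.
corpus = {
    "questioned": [0.050, 0.030, 0.020],
    "author_A": [0.052, 0.029, 0.021],
    "author_B": [0.040, 0.036, 0.012],
}

z = z_scores(corpus)
delta_a = burrows_delta(z["questioned"], z["author_A"])
delta_b = burrows_delta(z["questioned"], z["author_B"])
print(delta_a, delta_b)
```

The candidate with the smallest Delta is the stylistically nearest one; in this toy corpus that is author_A, whose frequencies were chosen to lie close to those of the questioned text.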
260 J.F. Burrows, Delta: A Measure of Stylistic Difference and a Guide to likely Authorship. Literary and Linguistic
Computing 17 no. 3 (2002);David L. Hoover, Delta Prime? Literary and Linguistic Computing 19, no. 4 (2004a);
David L. Hoover, Testing Burrowss Delta. Literary and Linguistic Computing 19, no. 4 (2004b).
261 Shlomo Argamon, Interpreting Burrowss Delta: Geometric and Probabilistic Foundations. Literary and Linguistic
Computing 23, no. 2 (2008);Stein Sterling and Shlomo Argamon. A Mathematical Explanation of Burrowss Delta. in
Proceedings of the Digital Humanities Conference (London: ALLC, 2006).
Burrows's Delta is a very effective attribution method for texts of at least 1500 words. For shorter texts, the
accuracy drops with length. However, even for quite short texts, the correct author was usually included
in the first five positions of the ranked authors, which provides a means of reducing the set of candidate
authors.263 When using larger sets of frequent words (>500), the accuracy of the method increased.264 The
performance also improved when the personal pronouns and words for which a single text supplied most of
the Delta score were deleted. Alternatives and all sorts of adjustments were also examined, but no
significant improvement over the original method was achieved.265
Knowing the formula or definition of a distance measure tells a non-mathematician like myself surprisingly little
about the practical use of the statistical method or the effects of choosing one over the other. A comparison
of Burrows's Delta with other distance measures in understandable language may shed some light on the
implications of that choice. In his "Beyond the black box, or:
understanding the difference between various statistical distance measures",266 Christof Schöch uses a political analogy for the
explanation of distance measures that he picked up during a lecture by Maciej Eder. Besides being very entertaining, it also
makes the seemingly complicated mathematical matter that determines the difference in use and effect of the
distance measures quite understandable. Below I quote two alternative distance measures Eder used in his
analogy, as well as the one on Burrows's Delta, and a comparison.
The first distance measure he explains is the Euclidean distance. The Euclidean metric, or
Pythagorean metric, is actually very similar to our everyday idea of distance, namely the "ordinary" straight line
between two points in space. Its definition goes as follows: "The straight line distance between two points. In a
plane with p1 at (x1, y1) and p2 at (x2, y2), it is √((x1 - x2)² + (y1 - y2)²)",267 which Eder explains as follows.
The Euclidean distance is tyrannical, because it gives a voice only to the very top of the most frequent words list.
Following the principles of Euclidean geometry, it is based on the square-root of the sum of the squared
differences between all vector points; because no weighing is applied, the usually larger absolute differences
between the top most frequent few words (see Zipfs law) have a massive influence on the results; for the lower
words on the scale, the distances are smaller, and will not weigh in very much. For this reason, the Euclidian
distance is not recommended in most cases.268
Another measure Eder explains is the Manhattan distance, which belongs to a form of geometry in which the usual
distance function, or metric, of Euclidean geometry is replaced by a new metric in which the distance between
two points is the sum of the absolute differences of their Cartesian269 coordinates. The Manhattan distance, also
called the taxicab metric or city block distance, alludes to the grid layout of most streets of Manhattan, and
of course follows its principles of distance.270 Its definition goes as follows: "The distance between two points
measured along axes at right angles. In a plane with p1 at (x1, y1) and p2 at (x2, y2), it is |x1 - x2| + |y1 - y2|."271
Burrows's Delta can in fact be understood as a standardization (i.e. z-transformation) of
frequency counts combined with the Manhattan distance.272
The following quote is Eder explaining the Manhattan distance, and in the quote after that Eder compares the
other distance measures with Burrows's Delta:
The Manhattan distance corresponds to an oligarchy, because it is slightly less biased towards the very most
frequent words. In contrast to the Euclidean distance, Manhattan distance relies on the sum of the difference of
all coordinates of the vectors, something which reduces just a little bit the influence of the very top most frequent
words, when compared to the Euclidean distance. But it still gives decisive importance to a very small group of
words, in the range of maybe the 10 to 20 most frequent words.273
Classic Delta by Burrows is entirely democratic: instead of comparing absolute frequencies of features, it relies
on absolute z-scores, and calculates a mean difference score from all individual difference between the z-scores.
This means that it effectively applies scaling or weighing so that the less frequent words count more in the overall
distance score, and the most frequent ones count less. Each words gets an equal say in the distance measure,
like each person has a vote in democracy.274
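The description of Classic Delta above can be sketched as follows: relative frequencies are first standardised
per word across all texts, and the distance between two texts is then the mean absolute difference of their
z-scores. This is a hedged illustration, not the implementation used by any particular package; the function
names, the use of the sample standard deviation and the toy frequencies are my own assumptions.

```python
import statistics


def z_scores(freq_table):
    """Standardise each word's relative frequency across all texts.

    freq_table maps a text name to a list of relative frequencies,
    one per word, in the same word order for every text.
    """
    names = list(freq_table)
    columns = list(zip(*(freq_table[name] for name in names)))
    means = [statistics.mean(col) for col in columns]
    # Assumption of this sketch: sample standard deviation (n - 1).
    sds = [statistics.stdev(col) for col in columns]
    return {
        name: [(f - m) / s for f, m, s in zip(freq_table[name], means, sds)]
        for name in names
    }


def burrows_delta(z_a, z_b):
    """Mean absolute difference between the z-scores of two texts."""
    return sum(abs(a - b) for a, b in zip(z_a, z_b)) / len(z_a)


# Hypothetical relative frequencies of three words in three texts.
table = {
    "text_a": [6.0, 3.0, 0.5],
    "text_b": [4.0, 2.5, 0.4],
    "text_c": [5.0, 2.0, 0.9],
}
z = z_scores(table)
print(burrows_delta(z["text_a"], z["text_b"]))
print(burrows_delta(z["text_a"], z["text_c"]))
```

Because every word is rescaled to the same spread before the differences are summed, the rare third word weighs
as much in the result as the very frequent first word.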
Christof Schöch points several times to the merits of Delta: "Fortunately, stylometric democracy yields much
better results; in fact, Classic Delta was a huge step forward for stylometry."275 He also explains that the choice
of the right distance measure depends on language, and advises Burrows's Delta over Eder's Delta when dealing
with Dutch texts.276 This is very reassuring when it comes to my choice of Delta, which was already heavily
supported by previous research.
Computational means
269 Refers to Descartes. A Cartesian system specifies each point individually in a graph, or spatial plane, by its numerical
coordinates.
As discussed in the methodology so far, the computational means required for low-level features are minimal. Other
choices were also influenced by a preference for low demands on software and hardware. The programs I'll use are
the statistical environment R, version 3.1.3, with the package Stylo, version 0.5.9, and the network visualization
program Gephi.
R is an open-source environment for statistical programming and application building, and it allows less
advanced researchers to use ready-made scripts and libraries. The tool, or rather set of tools, combines
sophisticated state-of-the-art algorithms for classification and clustering with a user-friendly interface. The
R scripts are provided with a graphical user interface and are more or less documented.277
The Stylo278 package is available from the CRAN repository and is distributed under the GNU GPL 3 license.
Stylo provides a comprehensive collection of functions used frequently in stylometric analysis. The software is
implemented entirely in R, which is a popular language for statistical computing and graphics.279
The combination of R and Stylo provides multidimensional methods, such as multidimensional scaling, principal
components analysis, cluster analysis, and bootstrap consensus trees, that can be used by scholars without
programming skills. The script reads plain text files, XML, or HTML; it explicitly supports nine languages, and
implicitly many more. Publication-quality plots can be exported in PDF, JPEG, PNG, or EMF formats.280
Burrows's Delta is a standard option. This software by Eder et al. is open source and made readily
available by the Computational Stylistics Group on their website.281
Gephi performs network visualization: it visualizes the results of quantitative analyses of texts as a network of
dots and connections in a 3-dimensional space, where the distances and connections between the dots define the
stylistic similarity between the texts. Gephi can visualize the results produced in R.
Texts
The preparation of texts, meaning building and cleaning the corpus, and their presentation, meaning the shape in
which they are taken into the analysis, are absolutely vital to the outcome of an experiment. I'll discuss the
more topical considerations, most importantly which texts and authors to include, in the section "Corpus". The
more theoretical considerations of how to incorporate text in the analyses, and what the requirements for
successful analyses are, will be discussed in this section. While performing the analyses, I will face both topical
and technical alterations of my texts. No matter how sound my preparation is, it will all be just theoretical, and
the practice of actually performing the analyses will point out some needed alterations to the texts,
underlining the difference between scientific theory and experimental science. The settings of the parameters,
smaller text properties and smaller alterations to my initial text preparation and text presentation will be
discussed during the reporting of the analyses and the interpretation of the results. The complete lists of
settings and parameters are included in the appendix. Here I will focus on the basic technical characteristics of
the texts in my corpus, which will determine the exact nature of my research.
277 Maciej Eder, Mike Kestemont and Jan Rybicki. Stylometry with R: a Suite of Tools. Digital Humanities
2013: Conference Abstracts. (Lincoln, NE: University of Nebraska-Lincoln, 2013).
278 Eder, Kestemont and Rybicki 2013.
279 R Core Team, R: A Language and Environment for Statistical Computing. (Vienna, Austria: R Foundation for Statistical
Computing, 2014) http://www.R-project.org/
clouding of the authorial signals of my probable authors as well, unlikely to produce a 100% match between the
Wilhelmus and the profile-based representation of its author, if present. Because of the temporal distance to the
texts of my corpus, which adds to the uncertainty regarding the purity of their stylistic signals, I should
consider them to be, to some extent, concatenated files. My choice of methods, in this case especially those of
visual representation, which is for the large part the object of interpretation, favors the instance-based
measures. This way the diverse nature of all texts, among them texts of the same author but from different genres,
languages and publishers, will be captured, statistically and visually, and the Wilhelmus could group with the
other texts from its until now hidden author, forming a cluster in the graph.
I'll present my texts to my software in a profile-based representation. However, my texts will need some
interventions, like chunking and merging, to meet some minimal requirements for texts and balance requirements
for the corpus, which I'll explain in the next section. This will also result in analyses which involve
multiple texts per author as well as analyses that only incorporate one big work per author, like a book of songs or
book of psalms. Although these analyses may not actually involve instance-based approaches, they do have the
same consequences for the analyses. The distinction between instance-based and profile-based is not set in stone
to begin with; there are plenty of hybrid approaches.
Problems of representation
Problems surrounding text length might just be the greatest challenge of my research. In a text that's too small,
stylistic fingerprints haven't manifested enough to produce a signal that can be measured by current methods,
while great variance in text size causes all kinds of inaccurate test results, due to poor balance in the corpus.
When performing analyses on very short texts, these two requirements for text length ask for opposite preparation.
The texts in the test corpus should be as long as possible, to assure a proper stylistic signal, while
their length should also be as close as possible to the length of the target text, which is short, in order to
maintain a balanced corpus.
Text length
Word frequencies are not random variables, and may vary considerably across different works. An
occurrence of a word depends heavily on its context. Thus, similar to other probabilistic phenomena, word
frequencies strongly depend on the size of the population (i.e. the size of the text used in the study). 287
When there are multiple texts of variable length per author, the length of the text instances should be
normalized by segmenting them into equal-sized samples.288 In case of only one large text for a particular
candidate author, Stamatatos suggests segmenting it into multiple parts of equal length. However, in all these
cases, the texts should be long enough so that the text representation features can adequately represent their
style.289 As said before, this is one of the major challenges that looms over my Wilhelmus case, because the text
that I want to analyze is only 551 words long.
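The segmentation into equal-sized samples can be sketched in a few lines. This is a minimal illustration under my
own assumptions: the function name is hypothetical, words are split on whitespace only, and a trailing remainder
shorter than the sample size is simply dropped so that all samples are equally long.

```python
def equal_samples(text, size):
    """Split a text into consecutive samples of exactly `size` words.

    A trailing remainder shorter than `size` is dropped so that
    every sample has the same length.
    """
    words = text.split()
    return [words[i:i + size] for i in range(0, len(words) - size + 1, size)]


# Hypothetical 2000-word text, cut into samples the size of the Wilhelmus.
dummy_text = " ".join(f"w{i}" for i in range(2000))
samples = equal_samples(dummy_text, 551)
print(len(samples), [len(s) for s in samples])
```

With a sample size of 551 words a 2000-word text yields three samples, and the remaining 347 words are discarded;
whether such a remainder is dropped or merged into the last sample is a design choice that affects the balance of
the corpus.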
Research into the limits of text size has led to no definite answers; the critical point might be
language and genre dependent, or even be individual for every particular text.290 There does seem to be a
shared opinion that the minimal sample size is somewhere around 1000 words,291 although many successful
attribution studies292 do not act upon the assumed limit of 1000 words per sample.293 There's also the strong
suggestion that the required text size depends on genre rather than on language.294 Hirst and Feiguina295 varied
the length of their text blocks between 200, 500 and 1000 words and report significantly reduced accuracy as the
text block length decreases. The Koppel paper296 that I mentioned before used text chunks of 500 words and
reported positive results, while Sanderson and Guenter297 even performed stylistic analysis on chunks of 500
characters, again with moderate success.
My distance measure, Burrows's Delta, is, as I have said before, known to be a very effective attribution
method for texts of at least 1500 words, while for shorter texts the accuracy drops with length.298
However, even when Hoover299 tested the method on rather short texts, the correct author was usually included in
the first five positions of the ranked authors, which provides a means of reducing the set of candidate authors.300
Hoover301 found that the accuracy of the method increased when larger sets of frequent words (>500) were used.302
These are all somewhat encouraging results. There are, however, also researchers who are more
skeptical about the possibilities of authorship attribution on very short texts. A paper by Maciej Eder that
reviews past research on text size and follows up with an experiment aiming to determine an absolute minimum draws
the unsettling conclusion that the previous, rather small estimates are often not backed by thorough empirical
investigation, and that the minimal required text size might be closer to 5000 words than to 1000.303
Imbalance problem
An important problem in authorship attribution tasks, called the imbalance problem, arises when the distribution of
the training corpus over the candidate authors is uneven, that is, when there are multiple texts for some candidate
authors and very few texts for others. The length of these samples may not allow their segmentation into multiple
parts to enrich the representation of certain authors.304 In instance-based approaches, class imbalance depends on
the number of training texts per author. In profile-based approaches, on the other hand, the class-imbalance
problem depends only on text length.305 Only a few studies306 have taken this factor into account so far, but I
will, as far as my corpora allow me, account for this imbalance. The use of n-grams, which multiplies the number of
features, and the use of a distance measure that calculates only with relative frequencies compensate for some of
the imbalance problems.
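To illustrate the last point: n-grams turn one sequence of words into additional features, and relative
frequencies express each feature as a share of the text rather than as a raw count, which makes texts of different
length comparable. The sketch below uses word n-grams; the function names are my own, and the choice between word
and character n-grams is left open here.

```python
from collections import Counter


def ngrams(items, n):
    """Return all overlapping n-grams of a sequence as tuples."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


def relative_frequencies(features):
    """Map each feature to its share of all feature tokens in the text."""
    counts = Counter(features)
    total = sum(counts.values())
    return {feature: count / total for feature, count in counts.items()}


# Opening line of the Wilhelmus, lower-cased.
words = "wilhelmus van nassouwe ben ick van duytschen bloet".split()

word_freqs = relative_frequencies(words)
bigram_freqs = relative_frequencies(ngrams(words, 2))
print(word_freqs["van"])   # 2 of 8 words
print(len(bigram_freqs))   # number of distinct word bigrams
```

The eight words of this line contain seven distinct word types but also seven distinct bigrams; on longer texts
the number of distinct n-grams grows considerably faster than the number of distinct words.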
discuss the verification task, problem number three, first, because it depends on a technique called unmasking
which is useful for the reader to understand before I go on to discussing the needle in the haystack problem.
problem that is independent of language, period, and genre and already has been used to settle at least one
outstanding literary attribution problem.320
A limitation of the method is that unmasking requires a large amount of training text. Preliminary tests
suggest that the minimum would be in the area of 5,000 to 10,000 words.321 Unmasking does not work for short
documents.322 323 This means that it's impossible to apply to my Wilhelmus case. Furthermore, the meta-learning
method of unmasking is out of my reach because the unknown texts should be long enough: each
unknown text has to be segmented into multiple parts to train the SVM classifiers.324
The reason for the comprehensive mention of this method in my thesis, besides the necessity of examples of real
world AA problems and solutions, is the underlying principle of feature elimination. We should understand, going
into the analyses, the different ways in which features represent style, in order to be more flexible in our
interpretation. The most distinguishing features might be the ones that blur the effects you are trying to capture.
Changing the features will not only provide us with the possible measurement of another lexical dimension, from
stylistics to semantics for example, but it can also cause the neglect of the previously most distinguishing
features, which can have an effect of its own. When performing analyses we should look for an attribution that
remains stable under different experimental circumstances, not least different feature sets. This stability is
associated with the reliability of the attribution.
The Koppel paper "AA in the wild"326 describes the problem of many candidate authors for a short document. The key
insight is that a similarity-based approach can be used to identify the most likely authors, but the robustness of
the similarity must be taken into account in order to filter out false positive identifications. This problem is
very similar to the problem I'm faced with, so a solution offered here for this problem must be considered as one
of my own methods.
Former naive approaches assembled a representative collection of works by other authors and used a two-class
learner such as SVM to learn a model for "a" versus "not-a". This method is straightforward, but suffers from a
conceptual flaw. If most chunks are attributed to not-a, a is probably not the author, but the converse is
not true: if most of the text samples are attributed to a, a is not necessarily the author. Any author not
represented in not-a but with a style somewhat similar to a will be falsely identified as a by this method.327 This
problem can be solved if we are willing to accept "Don't know" as an answer for those cases where the document
to be attributed is not sufficiently distinct to permit attribution. Again, meta-learning is used to identify such
cases, and in the remaining cases, where the system believes attribution is reliable, the method is reported to
provide highly accurate results.328
320 Koppel and Schler 2004; Koppel, Moshe, Jonathan Schler, Shlomo Argamon and Eran Messeri 2006.
321 Koppel, Schler and Argamon 2009.
322 Sanderson and Guenter 2006.
323 Koppel et al. 2012.
324 Stamatatos 2009.
325 Koppel, Schler and Argamon 2009.
326 Koppel, Schler and Argamon 2013.
A good example is that of Koppel's paper "Computational methods in AA",329 in which blog posts of at
least 500 words were to be attributed to one of the 10,000 blogs they belonged to. In this case three
representations based on content features and another one based on style features were used, along with a
standard cosine measure, to quantify the similarity of each author's known work with a given snippet. The various
authors can be ranked according to the similarity between their known works and the snippet under
consideration, with the hope that the highest-ranked author is the author of the snippet. The idea is that some
distinctive feature might render the snippet particularly similar to just one of the candidate authors, when applying
a distance measure over meaningful textual features.
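The ranking step described above can be sketched as follows. This is a simplified illustration, not Koppel's
actual setup: the feature vectors are made up, the author labels are hypothetical, and a single representation is
used instead of four.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_authors(snippet, known_works):
    """Sort candidate authors from most to least similar to the snippet."""
    scores = {
        author: cosine_similarity(snippet, vector)
        for author, vector in known_works.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical feature counts for one snippet and three candidate authors.
snippet = [1, 2, 0]
known_works = {
    "author_a": [2, 4, 0.1],
    "author_b": [0, 1, 3],
    "author_c": [3, 1, 1],
}
print(rank_authors(snippet, known_works))
# ['author_a', 'author_c', 'author_b']
```

The cosine measure compares the direction of the vectors rather than their length, so an author with much more
known text is not favoured merely because of higher raw counts.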
This simple approach to the problem actually works surprisingly well. The three content representations
assign the snippet to the actual author in between 52% and 56% of the cases. These results are impressive, but not
useful, as the system is wrong almost half of the time.330 As I mentioned, applying meta-learning boosts these
numbers: for snippets limited to 200 words, precision reaches 86% at a recall level of 30%, and 73% at a recall
level of 40%.331
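The trade-off between recall and precision comes from allowing the system to abstain. One way to picture this,
given here purely as an illustrative sketch and not as Koppel's exact procedure, is to repeat the ranking on
random subsets of the features and to accept the top-ranked author only if that author wins in a large enough
share of the repetitions. All names, parameter values and data below are my own assumptions.

```python
import math
import random
from collections import Counter


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def attribute_or_abstain(snippet, candidates, rounds=100, share=0.5,
                         threshold=0.9, seed=0):
    """Return the winning author, or "Don't know" if the win is not robust."""
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    size = max(2, int(len(snippet) * share))
    wins = Counter()
    for _ in range(rounds):
        idx = rng.sample(range(len(snippet)), size)  # random feature subset
        sub = [snippet[i] for i in idx]
        best = max(candidates,
                   key=lambda a: cosine(sub, [candidates[a][i] for i in idx]))
        wins[best] += 1
    author, count = wins.most_common(1)[0]
    return author if count / rounds >= threshold else "Don't know"


# Hypothetical data: one clear case and one ambiguous case.
clear = attribute_or_abstain(
    [5, 3, 2, 1, 1, 1],
    {"author_a": [5, 3, 2, 1, 1, 1],
     "author_b": [1, 1, 2, 3, 5, 1],
     "author_c": [1, 5, 1, 2, 1, 3]},
)
ambiguous = attribute_or_abstain(
    [1, 1, 1, 1],
    {"author_x": [1, 1, 1, 5], "author_y": [5, 1, 1, 1]},
)
print(clear, "/", ambiguous)
```

Raising the threshold makes the system answer less often (lower recall) but more reliably (higher precision),
which is the pattern in the figures cited above.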
A problem with the described method and the results is that they are still those of an experimental
setting. In a real world case, we cannot assume that the author of a questioned text will in fact be contained in
the candidate set, even if that set is very large.332 My number of candidates is not very large and I've absolutely
no certainty that the real author is included.
Another problem is that as the number of alternative candidates becomes much smaller, the problem
might, somewhat counterintuitively, become more difficult. This is because our method implicitly leverages the
fact that, if a document is much more similar to one author's writing than to those of all others, it is very likely
the document was written by that author. As the number of alternative authors decreases, the reliability of such a
conclusion will similarly decrease.333 My number of alternative candidates is much smaller than that of Koppel, so
the reliability of such a positive conclusion will decrease.
I won't be able to perform the meta-learning, and the results, of 50% correct attributions in experimental
conditions, are not high enough to be comfortable without it. Furthermore, the conditions Koppel gets to work
with, even before the meta-learning, are far more favorable than mine. The needle-in-the-haystack problem as
described and solved by Koppel is illustrative of the difficulty of solving such a case. It is however the task I've
assigned myself, and the difficulty of this case, along with its major relevance, is exactly why it is so necessary
to try to solve it. It will stretch the possibilities of the currently available methods and means, show what they
can accomplish and where we need further development.
Tests
In this section I'll sum up the tests I'll perform, quickly recap their methods, sometimes expand on their methods
and parameters, and place them in chronological order of performance, to take the reader through the process of
thought. My methodology, including the tests, is based on the discussed theory, and limited by my own capacities
and the limitations of my design, techniques and corpus. My familiarity with and preferences for some of the
software translate into a preference for certain methods.
I discern two types of tests that I will perform, namely the multivariate authorship attribution task and
principal component analysis (PCA). The latter will come into play when I explore the nature of the results by
reducing the dimensionality of the features, later on in the analyses, and when I mimic a successful authorship
attribution method by Burrows that uses PCA. The main attention will go to the multiclass categorization task, used
as an authorship attribution method. Within these two types, I can vary the features, parameters, corpora, mode of
analysis and the distance measure, among many other things, depending on the results and my own assessment.
The range of stylistic characteristics of a text that I can test with the categorization task therefore widens, as
I'm able to change so many of the conditions. The number of analyses performed for this thesis runs quite high. The
broad order in which I will perform my analyses is as follows. I start with analyses that can handle a lot of
candidate authors and a lot of different texts and that can visualize broad effects. So I'll perform distant reading
and exploratory analyses on a corpus as complete as possible. The process of these exploratory analyses,
including the encountered problems, and their results are fundamental for my choices regarding the analyses
that will follow, in terms of methodology and composition of the corpora. These following analyses aim at answering
the research questions and hypotheses. After the bulk of the analyses are performed, the dimensions of
distinction between texts are examined with the aid of principal component analysis. The exact meaning of
dimensionality reduction and its specific use for the Wilhelmus case will become clear when we get there.
attribute the target text (the Wilhelmus) to the closest cluster of texts, provided they're of a common author,334
dealing with a needle-in-the-haystack problem, meaning attributing a short document in an open author set
with many possible authors. The goal is to perform an exploratory analysis on which I can base the composition
of my corpora and the settings for further analyses, as well as to get a first indication of the methodological
capacities and a first glance at any possible results.
I use R and the package stylo to measure stylistic resemblance based on word frequencies and Burrows's
Delta, in order to determine the stylistic relationship of the texts of my corpus, and subsequently visualize this
with Gephi. As I've said, I do not, at this point, wish to get an actual authorship attribution, but I'll analyze
and visualize the effects to serve the goals stated above. I hope to gain insight into my methods and my corpus,
and therefore a corpus as large as possible is needed.
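The hand-over from a distance table to a network program can be sketched as writing an edge list in CSV form, with
the column headers Source, Target and Weight that Gephi's spreadsheet import recognises. The sketch below is my own
simplified illustration and not the procedure of the stylo package: the distances are invented, each text is linked
only to its nearest neighbour, and the weight 1/distance is an arbitrary choice.

```python
import csv
import io


def nearest_neighbour_edges(distances):
    """For every text, yield (text, closest other text, distance)."""
    for text, row in distances.items():
        neighbour = min((t for t in row if t != text), key=row.get)
        yield text, neighbour, row[neighbour]


def write_edge_list(distances, handle):
    """Write a Source,Target,Weight edge list to an open file handle."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    for source, target, dist in nearest_neighbour_edges(distances):
        # Smaller distance means a stronger link; 1/distance is arbitrary.
        writer.writerow([source, target, round(1 / dist, 3)])


# Hypothetical Delta distances between three texts.
distances = {
    "text_a": {"text_b": 1.2, "text_c": 1.5},
    "text_b": {"text_a": 1.2, "text_c": 0.9},
    "text_c": {"text_a": 1.5, "text_b": 0.9},
}
buffer = io.StringIO()
write_edge_list(distances, buffer)
print(buffer.getvalue())
```

In the resulting network, texts that are stylistically close are pulled together by heavier edges, so clusters of
texts by the same author, if the signal is strong enough, become visible as groups of connected dots.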
337 http://www.totallab.com/products/samespots/support/faq/pca.aspx
338 Koppel, Schler and Argamon 2009.
339 Binongo 2003; Burrows 1992; Holmes, D. (2003), Stylometry and the Civil War, Chance 16(2)
340 J. F. Burrows and Anthony J. Hassall, Anna Boleyn and the Authenticity of Fielding's Feminine
Narratives. Eighteenth-Century Studies 21, no.4 (1988).
341 Koppel, Schler and Argamon 2009.
mimic this experiment by taking Coornhert and Marnix instead of the Fieldings and see whether the Wilhelmus
follows one of the authors and, if so, which one.
Corpus
problem for a large part, unless topic might actually also have influence on the style, in terms of relative frequency
of function words. A certain topic or genre demands a certain structure, for example, distinct stylistic differences
were found between genres like tragedies and comedies.349 The line I've drawn here between topic and genre is
pretty blurred and will not have such a clean rendition in textual reality, as any border is imaginary.
Other important evaluation parameters for a corpus are corpus size, in terms of both the number and
length of the texts, and test corpus size, meaning the length of the anonymous text, the number of candidate
authors and a balanced distribution of the corpus over the authors. These have already been discussed.
Explanation corpus
In this section, I explain my corpus and discuss the purpose and nature of my set of candidate authors and
included texts. Texts can mean complete texts or fragments of texts; single such texts or collections of them with
a common author, genre, or language. My corpus means in this section all of the texts that were at some point
part of the analyses I ran for this thesis. I discern within my corpus three major sub corpora, based on the way
they came into my possession.
Two of them, the Meertens corpus and the DBNL corpus, I've received through my cooperation with these
institutions of language and literature. In these cases, especially with the DBNL corpus, the composition of the
sub corpora depended for the most part on what was handed to me. I deliberately tried to have as little influence
as possible on the selection process of these two sub corpora, so I could rule out that certain effects were
only present because of my selection or because of the practice of preparation in general. The initial request for
texts was of course to some extent specified. Another advantage of the delivered corpora was that they allowed me
to include a larger number of texts, since I didn't need to handpick and clean them manually.
A third sub corpus was constructed by myself, aimed at a specific task. I therefore occasionally refer to
this corpus, and its sub corpora, as the specialized corpus. The selection of texts is based on the secondary
literature I got hold of, on the experts in the field I've been in contact with, and for a large part on the
availability of the texts. I've downloaded texts, transcribed them from paper books as well as from page scans or
files, and copy-pasted from DBNL, but also took texts from the corpora I got from the institutions.
These three sub corpora are in many cases further divided into all sorts of smaller sub corpora. I will,
however, always start with the composition of a 0-corpus, the main sub corpus. The 0-corpus includes all texts of
the respective sub corpus and is the starting point for any further division and analysis. The three 0-subcorpora
also represent three types of format, three types of goals and three separate lists of thank you notes I will be
writing after my thesis is done. A complete index of all corpora and their texts is added (see appendix 1, corpora).
Discussing my corpus, specifically the Specialized corpus, automatically leads me to discussing my
hypotheses, which I therefore present after the individual exposition of the three main sub corpora.
Three corpora
DBNL Corpus
This corpus was sent to me by the Digitale Bibliotheek van Nederlandse Letteren350 (DBNL), the digital library of
Dutch literature. I've been in contact with them from February 2015 until April, especially with Cees Klapwijk.
After some non-digital paperwork, I received the corpus in accordance with my request for all Dutch texts in the
possession of the DBNL, including those in the southern dialects, written between 1550 and 1590, without
discrimination on genre, so including plays. These demands are so non-discriminatory that there's a good chance
the author of the Wilhelmus is actually in there. The goal of this corpus is to have a bird's eye view of the Dutch
literature of the second half of the sixteenth century, in order to analyze the big movements, the noise and the
possibilities of actually finding that needle in the haystack. A corpus with as many texts and as many candidate
authors as possible accomplishes this.
The texts were in XML (Extensible Markup Language), a markup format that encodes metadata alongside the text
in the form of structuring elements and attributes describing, among other things, syntax and function.
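How text is recovered from such a file can be sketched with Python's standard XML parser. The snippet below is a
hypothetical example: the element names (text, head, l) and the heading are invented for illustration and do not
reproduce the actual DBNL markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; the real DBNL element names may differ.
sample = (
    "<text><head>Een liedeken</head>"
    "<l n='1'>Wilhelmus van Nassouwe</l>"
    "<l n='2'>ben ick van Duytschen bloet</l></text>"
)


def plain_lines(xml_string, tag="l"):
    """Return the text of every <tag> element, with markup stripped
    and whitespace normalised."""
    root = ET.fromstring(xml_string)
    return [" ".join("".join(el.itertext()).split()) for el in root.iter(tag)]


print(plain_lines(sample))
# ['Wilhelmus van Nassouwe', 'ben ick van Duytschen bloet']
```

Selecting by element makes it possible to keep the verse lines while leaving out headings, notes and other meta
text that would otherwise add noise to the word frequencies.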
Meertens corpus
The corpus was sent to me by Erik Tjong Kim Sang of the Meertens Instituut, a research institute of the Royal
Netherlands Academy of Arts and Sciences (KNAW),351 currently very active in computational literary studies,
after extensive mail contact with especially Prof. dr. Nicoline van der Sijs and a meeting at the institute itself.
The initial goal of the meeting was to find a solution for the variation in spelling, but it resulted in the
addition of an extra corpus to my ranks.
The composition of the corpus depended on a selection from the DBNL corpus, which would be delivered in
two types of units, whole books and songs or parts, and in a different XML-based annotation format, called
FoLiA. The goal of this corpus is to have an extensive view of how the complete works, the complete books of
songs but also each individually prepared song, stylistically relate to each other. It could also be beneficial
for the methodological goals of my thesis, potentially providing information about format, text size and a possible
signal of work, meaning a stylistic resemblance of texts from the same book or work, for example because
they're from the same edition.
Specialized corpus
All the texts of the Specialized corpus are in txt format and without exception extensively prepared. I got rid of
as much meta text, and other noisy elements, as possible. Because this corpus consists of handpicked texts, it has
greater need to be accounted for than the DBNL and Meertens corpora. I discuss the included authors and texts in
this section.
The Wilhelmus and the Geuzenliedboek:
In order to get as close as possible to the original signature of the author, it's only logical to search for the
earliest version of the Wilhelmus. As discussed in the theory, the earliest available version, from 1573, is in
German and therefore, if the author is Dutch, which we assume, not close to the original version. The standard in
the tradition of Wilhelmus research is the edition of the Geuzenliedboek of 1581 called "Een nieu GeusenLieden
Boecxen". For decades we assumed that this was the oldest surviving version of both the Wilhelmus and the
Geuzenliedboek. With the discoveries of Martine de Bruin we now have other options; the text of the Wilhelmus,
however, didn't change.
I've included the 1583 version of the Geuzenliedboek, which is similar, regarding the included works and
spelling, to the standard 1581 version traditionally researched. My choice for the 1583 edition is motivated by
this similarity as well as the fact that it was complete and readily available to me at the DBNL. Both songs that
weren't in the 1581 and 1583 editions of the Geuzenliedboek but are in the new earliest edition, "Een ander nieu
liedeken van Leyden" and "Van die afwijkinghe van Alckmaer", I include manually in my corpus.
I've also included the Geuzenliederenboek of 1924-1925 by E.T. Kuiper. This version is very different with
regard to composition and spelling, although it does share some of the songs with the 1581 version. Inclusion
makes my corpus richer in texts similar to the Wilhelmus, with respect to date and genre, and it may also
give me insight into the stylistic representation of different editions.
Marnix
Marnix van Sint Aldegonde is the most researched and most often proposed potential author of the Wilhelmus, and in
non-academic contexts outright assumed to be its author. He's amply represented in my corpus with a variety of texts
of different genre, date and language, among other variables.
One text that's worth mentioning is the Dordtse Rede, Marnix's oration on 19 July 1572 in the city of
Dordrecht. It was his speech at the first meeting of the rebellious Dutch states. Similarities between this text and
the Wilhelmus are numerous. Rooker performed a structural analysis352 of both texts to bring out such
similarities and concluded that Marnix was the author of both. His analysis has been criticized as prejudiced and
subjective.353
A song that I'd have liked to add, because of its resemblances to the Wilhelmus, but that I've failed to
acquire, is "George Lalaing", also an acrostic and, although sometimes attributed to Marnix, also still
anonymous.
352 Rooker, C. "Marnix, de Dordtse rede en het Wilhelmus." De Nieuwe Taalgids 71 (1979): 145-164. Print.
353 J.B Drewes, Wilhelmus van Nassouwe. Een proeve van synchronische interpretatie [dissertation] (Amsterdam: Elsevier,
1946), 46-49;S.J. Lenselink, Maker van het Wilhelmus sprak in Dordt voor de Statenvergadering. Trouw 11 september 1948
(1572): 5.
Coornhert
The other big name, the runner-up, of Wilhelmus research is Dirck Volckertszoon Coornhert. Coornhert is the only
seriously considered alternative to Marnix. He's amply represented in my corpus with a variety of texts of different
genre, date and language, among other variables.
Other possible authors
As I've already mentioned, not many attempts at authorship attribution of the Wilhelmus have been made that
didn't solely focus on Marnix as its author, and/or Coornhert as the alternative. Other possible authors are rarely
seriously proposed, let alone researched.
A logical beginning for drawing up and subsequently narrowing down a list of potential authors of the
Wilhelmus is to accept all known authors whose songs are included in "Een nieu geuse lieden boexcken" of
1577-1578 and then try to rule them out. Portema & Smith354 suggested the following names: Willem van Haecht,
a Lutheran from Antwerp and poet of Psalms; Jeronimus van der Voort, who was never officially in the service of
Willem van Oranje but endured prison as well as the rack for him, was a member of the chambers of rhetoric of
Antwerp and Lier, and is rumoured to be the author of the chant "Vive le geus is nu de leus",
meaning "long live the Geus is from now on the word"; Jan Fruytiers; Coornhert; and Laurens Jacobs Reael.
René van Stipriaan, a historian specializing in Dutch history and in Dutch seventeenth-century theatre,
researched Oranje's literary network between 1568 and 1574, and came up with a list of known literary figures
similar to the usual-suspects list of Portema & Smith. In a lecture355 he elaborates on the names on the list:
Dirck Coornhert, Marnix van Sint-Aldegonde, Petrus Datheen, Lucas d'Heere, a convinced and radical Calvinist,
Jan van der Noot, famous for his exile literature, Jeronimus van der Voort, Laurens Reael, Johan Fruytiers,
Jan van Hout, secretary of Leyden but fired by Count Bossu, a Spanish pawn and predecessor of Willem of Orange
as stadtholder,356 Janus Dousa, and Jan Baptist Houwaert. Through a process of elimination, he ended up with
three possible authors: Marnix, Van der Voort and Fruytiers. He expresses his surprise over the fact that Van der
Voort and Fruytiers were never seriously considered for authorship. Especially Fruytiers, who was an obvious
supporter of the Reformation and the Huguenots in all of his oeuvre, seems a very plausible yet neglected
author.357
Fruytiers was a skilful author in many genres, including the beggars' songs, and he was adept with biblical
matter. In 1574 he was appointed by Willem van Oranje as his counsellor and given an office, perhaps
as a token of honour and appreciation.358 Jan Fruytiers is recognized as the author of some songs in the
geuzenliedboek because of his subscript 'Weest dat ghij zijt',359 as was Laurens Jacobs Reael with his subscript
'Liefde vermacht al'.360
Another probable author, suggested by Buitendijk on the basis of biographical and historiographical
arguments,361 is Adriaan Saravia, preacher and writer of the manifestos of the Prince. Buitendijk wasn't the first:
as early as 1910 the widely authoritative historian P.J. Blok pointed to the strong similarities in thought and
beliefs between the Wilhelmus and a pamphlet by Saravia from 1568 called 'Hertgrondighe Begheerte'.362
Improbable authors
I've also added work and texts from authors who aren't considered serious options for Wilhelmus authorship. There
are several reasons for this. First of all, I've got multiple hypotheses, some of which do not concern authorship
but identify and analyze other effects. Some texts are meant to contribute to these hypotheses and not to the
question of authorship. Second, besides a search for the author of the Wilhelmus, this is also a methodological
thesis, so I'll perform a lot of different analyses on lots of different corpora, in order to find the capabilities and boundaries of
359 G.J. van Bork and P.J. Verkruijsse (red.), De Nederlandse en Vlaamse auteurs van middeleeuwen tot
heden met inbegrip van de Friese auteurs (Weesp: De Haan, 1985).
360 E.T. Kuiper en P. Leendertz Jr. (ed.), Het Geuzenliedboek (Zutphen: W.J. Thieme & Cie, 1924).
361 Bonger 1985, 184.
362 Van Stipriaan 2012.
my methods and design. This multiplies the number of analyses and corpora, and hence the number of
different texts and authors. The third and last reason for the inclusion of improbable authors in my test corpus is
to balance and enlarge my corpus with texts of similar background, and possibly similar style, to the
Wilhelmus and the texts of its suspected authors. This way authorship signals and other stylistic signals can be
adequately tested, false positives become a possibility, and with a larger body of texts more subtle effects
won't be missed as easily as with a small corpus. Also, when analyses include only potential authors, the
suggestion may arise that all these authors write in a style similar to the Wilhelmus, while in reality some
improbable authors may actually be more closely related stylistically to the national anthem than most of them.
One of the improbable authors is Petrus Datheen. In the following (translated) quote, Maljaars explains why
Datheen has never been considered as the author of the Wilhelmus, even though he was in close contact with
Willem van Oranje and received some very honorary assignments from him, and so has perhaps unjustly never been
taken into consideration.363
Nobody, as far as we know, has ever named Datheen the poet of the Wilhelmus, ... because of the technique of
his poetry regarding the Psalms. Even so, taking into consideration that not everything he's made deserves the
label of flimsy. 364 365
Literary quality is not something that is objectively measured, so if subjective judgment has so far
disrupted the authorship question in the Wilhelmus case, this might be an opportunity to let go of these criteria
for inclusion.
I add texts of Petrus Datheen to my corpora, still assuming he isn't a real option. However, if texts of
Datheen repeatedly come out as a stylistic match to the Wilhelmus, I will not hesitate to take these results
seriously and will be forced to reconsider my initial assumption. The reason for my assumption of non-authorship is
that, regarding the theory on Wilhelmus authorship, I base myself on the general consensus of existing research. To
dissect every argument, like the exclusion of Datheen as a possible author, to such an extent that I can take a
rational, independent position, is a(nother) thesis in itself. I will not write two theses, so I'm forced to trust the
findings of my predecessors.
Another writer included in my corpus but not considered a realistic option, even though it's such a romantic
theory, is Willem van Oranje Nassau himself. The song is actually sung from his perspective, but the notion is
more fictive than historically probable. Willem of Orange had a large number of people in his service, poets
among others, who were assigned to the tasks of poetry, propaganda or both. In addition, the general
assumption is that the Wilhelmus must have been written by a professional, which Willem de Zwijger (William the
Silent) isn't.366 Problems with the inclusion of texts of Willem van Oranje originate from these very objections
against the prince's authorship, because they involve the lack of poetry by his hand and the doubtful authorship
of the rest of his texts, mainly speeches and manifestos. I've included 'Apologie ofte verantwoordinghe' in my
corpus, attributing it to Orange. He might not have written this himself; in that case preacher Villiers is the most probable option.367
Anonymous Songs
Some of the texts in my corpus, many of them coming from some version of a book of beggars' songs, are anonymous.
Some are included just because they were printed together with the Wilhelmus in the same work, some were
specifically chosen because they were interesting cases, and some were added just to balance my corpora.
Prof. Dr. K. Heeroma writes in his essay 'Tsal hier haest zijn ghedaen' about the possible importance of the
Pardoen-lied and the famous beggars' song 'Help nu u self so helpt u Godt', which show stylistic similarities that
could be explained by a common expectation of freedom or by the atmosphere of the times.368 Both are initially
included in the corpus.
Other improbable authors, most of them at one point secretary to the Prince, who were left out of the analyses
because their work wasn't readily available, were Hendrik Geldorp or Hendrik Castritius, Hendrik
Niclaes, Jacob van Wesembeke and Nicolaas Bruyninck.369
Hypotheses
I've formulated the following two main research questions, one topical and one methodological: 'Who is the
author of the Wilhelmus?' and 'Can the complicated real-world authorship attribution case of the Wilhelmus be
solved with the methods of quantitative analysis and the tools of computational literature?'
In order to answer these questions I've drawn up sub-questions that each address a part of my main
questions. When enough of these sub-questions are thoroughly answered, I'll be able to draw holistic
conclusions and answer the main questions. The sub-questions, which are part of and can contribute to answering
the two main questions, are the following:
What candidate for authorship of the Wilhelmus does a quantitative stylistic analysis with
computational means support and/or point to as the author of the national anthem?
Who is, or is more likely to be, the author of the Wilhelmus: Marnix van Sint-Aldegonde or Dirk
Volkertszoon Coornhert?
What are the current limits of computational stylistics and authorship attribution?
Can my methods eliminate one or more of the usual suspects of the Wilhelmus authorship
attribution case?
Can my methods give supporting evidence for one or more of the usual suspects of the
Wilhelmus authorship attribution case?
Are my methods useful and/or sufficient for authorship attribution for texts of 550 words?
During my research I've worked on several methodological questions that needed to be answered before I could
focus on the question of authorship. These were:
These questions formed hypotheses during the construction of my three specialized corpora, which I'll discuss in
the next section.
Specialized sub-corpora
An important question in authorship attribution is how to discriminate between the three basic factors: authorship,
genre, and topic.370 All three of them can be successfully measured, but how do you make sure they don't interact
and thereby cloud your results? If Text A, a tragedy, is attributed to author B not because it was written by author
B, but because a lot of the texts of author B included in the experiment were tragedies, you have a false positive.
In order to avoid these false attributions, I've got to be aware of the major stylistic effects in my corpus so I can
control for them.
The specialized sub-corpora carry most of the weight in answering the stylistic question of authorship and are
almost completely responsible for the measurement of other stylistic effects that could dim or obscure the
authorial signal. In order to find the answers to these questions I've formulated several hypotheses, by which I
designed and collected the sub-corpora of my Tim-0 corpus. The hypotheses that I've constructed can be divided
into three groups, based on the characteristic these hypotheses wish to explore.
The first two sub-corpora are designed to answer the questions and hypotheses regarding language or
dialect and style or genre. The third group of hypotheses aims at my main question of authorship. I deem these
first two sub-corpora necessary because, in order to find an authorial signal, I've got to control or account for
other possible effects that, based on the results of the other corpora and secondary literature, we ought to
expect. I need to discover, analyze, visualize and interpret effects of language or dialect and effects of type or
genre. When these effects are mapped, they can be accounted for. If these results provide insight into my
corpus or the Wilhelmus itself, they need to be interpreted and integrated into my hypotheses, analyses and
methods. I'll discuss the three sub-corpora of the Tim-0 corpus individually below.
Based on these findings I focused on researching dialects and other influences of foreign languages. The dialects I
deemed necessary to research were Flemish or southern Dutch dialects and German-Dutch, or Dutch with
German influences. There seems to be very little support for the French hypothesis, so I left that one out.
Possible effects of German or Flemish influence are the most relevant, both because they can contribute to our
knowledge about the Wilhelmus and because they are important possible effects that could pollute my
results when looking for the author of the Wilhelmus.
Another effect I include in this set of hypotheses is that of translation. Most of the potential authors
wrote in the classical languages, often religious texts, and so Dutch works translated from the classical
languages, Greek and Latin, are part of my corpora. If effects of translation exist, they need to be controlled for,
and therefore I choose to include them as a sub-hypothesis.
This leads to the following hypotheses:
There is, according to the results of my analyses, a stylistic effect based on languages, dialects or any
other kind of accent or influence stemming from differences in language, present in the texts of my
corpus.
There is, according to the results of my analyses, a stylistic effect based on the Southern dialect, also
called Flemish, meaning Dutch from the southern regions of the Netherlands, present in the texts of my
corpus.
There is, according to the results of my analyses, a stylistic effect based on the Eastern dialect that
stems from influences of the German language, Hoogduits being the most prominent one, present in the
texts of my corpus.
There is, according to the results of my analyses, a common stylistic effect present in the Dutch texts of
my corpus that are translations from Latin or classical Greek.
There is, according to the results of my analyses, a stylistic signal of a language or accent other than
Dutch present in the Wilhelmus.
For a proper understanding of some texts in my corpus and of the results they bring forth, the reader might need
some further context on these texts. In this section I'll elaborate on some of the included texts.
'Van homulus' by Pieter Dorland van Diest, coded as dorland_vanhomulus, is a text written in Hoogduits.
Hoogduits is the variety of German that was spoken in the regions south of the Benrath line, including the German
and Dutch province of Limburg, as opposed to the regions north of this language border, where Nederduits was
spoken. Some argue that Van homulus is originally a Dutch text.378
'De Uilenspiegel', coded as A_uilenspiegel, is Nederduitse folklore. It's often attributed to Bote, but there's
uncertainty whether he was capable of writing in Hoogduits, the language of De Uilenspiegel.379 For my analyses I
378 Pieter Dorland van Diest, Van Homulus, een schoene comedie daer in begrepen wort hoe inder tijt des doots der
menschen alle geschapen dinghen verlaten dan alleene die duecht die blijft by hem vermeerdert ende ghebetert, ed. C.P.
Serrure (Gent: C. An van der Noot-Braeckman, 1857), 1-10.
379 Loek Geeraedts, Ulenspiegel, ed. Loek Geeraedts (Antwerpen: Berghmans Uitgevers, 1987), 5-7.
include the 1580 Dutch version printed in Antwerp. When analyzing the work as a whole, a lot of text that isn't
written by the author, like the introduction or other editorial notes of the reprint of 1987, is present. In the
specialized corpora I constructed, I cleaned this text of the noisy paratext.
'Nederduitse Orthographie', meaning the Nederduitse orthography, by Pontus de Heuiter, coded as
heuiter_nederduitseorthografie, is an attempt to give better Dutch alternatives for words of Nederduitse
dialect that were often used in the Dutch language. Throughout his life the author aimed for a Koiné
composed of several dialects, an ambition inseparable from his travels through Holland, the southern regions of
the Netherlands and France.380
The manuscript 'Drie historische liederen en een hekeldicht' by Antonius Ghyselers from around 1505-1518, coded as
DUITS_Antonius Ghyselers Drie historische liederen en een hekeldicht 1505-1518, is a
complicated text. The introduction, for example, includes pieces by Erasmus, written in Latin of course,
but also some modern Dutch editorial notes. To make things more difficult, Ghyselers studied Latin during his
military service in Austria, around the time the manuscript was written, while in those days he corresponded
in Hoogduits, his mother tongue.381
Jan van der Noot wrote the 'lofzang op Brabant', coded in my corpus as ZUID_van der Noot Himne
oft lof-sangh van Brabant, in Southern Dutch or Flemish; however, he also wrote in French during this period and
lived in Germany for a long time. The first lines of the Hymne on Braband are a paraphrase of the preamble
of Ronsard's Hymne de France382 of 1549. The hymne is also an acrostic.383
The work '25 Psalmen', coded as utenhove_25Psalmen, by the Flemish Dutchman Jan Utenhove
contains both Psalms and his digressions on or explanations of them. There are possibly Nedersaksische roots
signaling through this text.
The Flemish rhetorician Marcus van Vaernewyck wrote the mythical history 'de historie van Belgis' in 1574;
I include it in my corpus, coded as vaernewyck_histori Belgis. It includes Flemish as well as Latin parts.
380 G.R.W. Dibbets, Voorbericht in Nederduitse Ortographie, ed. G.R.W. Dibbets (Groningen: Noordhoff, 1972), 6.
381 Vaderlandsch Museum and C.P. Serrure (red.) Vaderlandsch museum voor Nederduitsche letterkunde, oudheid en
geschiedenis (Vierde deel) (Gent: H. Hoste, 1861).
382 Ronsard, L.L. VI, blz. 79. In: Marcel Raymond, L'influence de Ronsard sur la poésie française II (Paris
ca. 1927).
383 Jan van der Van der Noot, Lofsang van Braband/Hymne de Braband, ed. C.A. Zaalberg (Zwolle: W.E.J. Tjeenk Willink,
1958).
effect on the stylistic fingerprint of the texts. I need to know what, if any, the stylistic difference was between prose
and poetry, between Psalms and poetry and between poetry and songs.
I'm also interested in possible stylistic properties of different genres, because they're relevant to my
questions about the Wilhelmus and even to the authorship question. If there's some common stylistic base for
topic or motive, especially if these genres are as subtle as the difference between songs of comfort and songs of
resurrection, it would be directly relevant to the entangled cardinal questions of the Wilhelmus, regarding the
reasons behind the song, its date of creation and possibly its author.
This leads to the following hypotheses:
There is, according to the results of my analyses, a stylistic effect based on type, genre or any other kind
of topical distinction, present in the texts of my corpus.
There is, according to the results of my analyses, a stylistic difference between the prose and the poetry
of my corpus.
The Wilhelmus is, according to the results of my analyses, stylistically more similar to poetry than to
prose.
There is, according to the results of my analyses, a stylistic difference between the songs and the poetry
of my corpus.
The Wilhelmus is, according to the results of my analyses, stylistically more similar to songs than to
poetry.
There is, according to the results of my analyses, a stylistic effect based on a common type or any other
kind of topical characteristic of the beggars' songs or geuzenliederen, propaganda songs, songs of
comfort or troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen.
The Wilhelmus is, according to the results of my analyses, stylistically more similar to one of the different
song genres, namely the beggars' songs or geuzenliederen, propaganda songs, songs of comfort or
troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen.
There is, according to the results of my analyses, a stylistic fingerprint based on authorship, or a
stylistic authorial signal, present in the texts of my corpus.
The Wilhelmus shows consistently more stylistic similarity to texts in my corpus of one particular author
than to the texts of the other authors, and these texts of this author consistently show an authorial signal
by expressing common stylistic markers.
This corpus consists for the most part of texts already used in my genre and language hypotheses, and for a small
part of new texts. All texts from the geuzenliedboek longer than 550 words have been added to the corpus. It is
highly significant that these authors were published in the same book, and so printed by the same publisher,
probably conceived around the same time, and perhaps even for the same reasons and with the same ideas as the
Wilhelmus. Based on these probable similarities we can expect songs from the same work to show some
common stylistics. A few of the songs had already been added, like the songs of Voort and Reael, which of course
I won't add twice. The newly added authors are Cornelis van Damme, Reael and Sterlincx, some of them with multiple
songs. I also include the Wilhelmus of the geuzenliedboek, coded as _nie096nieu01_01-0018WILHELMUS,
including the intro that I edited out in the previous version. The Wilhelmus is now represented twice in my
0-corpus, because the version of the Wilhelmus that is included in all my corpora remains included in this corpus as
well.
Co-authorship is also very hard to test on very short texts, perhaps even impossible on texts as short as the
Wilhelmus.
Expected problems
There are some problems or obstacles that I know of beforehand but won't be able to anticipate fully, or at all.
These obstacles can at best be expected and taken into account when designing, analyzing or interpreting. I'll
discuss them in the following order: first the obstacles of computational literature, then obstacles for authorship
attribution and then case-specific problems.
386 Harold Love, Attributing Authorship: An Introduction (Cambridge: Cambridge University Press, 2002).
387 Koppel, Schler and Argamon 2009.
388 Koppel et al. 2012.
389 Stamatatos 2009.
The most important issue in my research is probably the length of the target text. How long a text should be
for its stylistic properties to be adequately captured is uncertain.390 As reported in my methods section, some
studies report promising results with short texts; however, although it's not yet possible to define such a
text-length threshold, it seems to lie above 551 words.
Another problem is dirty data or dirty texts. Corruption in stylometry seems either to apply to the texts in
the corpus itself, like dirty Optical Character Recognition (OCR), or to manifest itself in cherry-picking of results. I
encountered a case of bad OCR when I tried to include 'Cantica Lieder offte gesange' by Hendrik Niclaes,
friend of Lucas d'Heere, Jan van der Noot & Dirk Volkertsz Coornhert. The text was in PDF format, but OCR
had not yet been applied, so I had to read from the page images, which were unreadable. An online OCR tool had
even more trouble reading the files, leaving me with a text unfit for analysis and a strong awareness that this
kind of research and its tools are very sensitive to noise. I also tried to include the 'Psalmen unde ledern'
of Hendrik Niclaes, because some songs of his Cantica are present in his Psalms, but this text was also beyond
my grasp. I ended up excluding Niclaes altogether. The (un)availability of texts is in my experience one of the
biggest stumbling blocks for the performance of successful authorship attribution.
When it comes to choosing the plot that is the most likely to be true, scholars are often in danger of
more or less unconsciously picking the one that looks more reliable than others, or that simply confirms their
hypotheses. If common sense is used to evaluate the obtained plots, any counter-intuitive results will be probably
dropped simply because they do not fit the scholars' expectations.392
A solution regarding the MFWs is the bootstrap consensus tree, a diagram that combines cluster analyses run with
different numbers of MFWs and subsequently plots these in a single consensus tree diagram.393
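The idea of combining results over several MFW settings can be illustrated with a much simpler stand-in than a
real consensus tree. The sketch below is not the bootstrap consensus tree itself: instead of merging dendrograms,
it only counts how often each candidate text is the nearest neighbour of a target text across MFW settings. The
distance is assumed to be Burrows's Delta, and the function names and toy texts are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def delta_to_target(texts, target, n_mfw):
    """Return {candidate: Delta distance to target} using the n_mfw most frequent words."""
    pooled = Counter()
    for tokens in texts.values():
        pooled.update(tokens)
    # Most frequent words of the whole corpus; ties are broken alphabetically.
    ranked = sorted(pooled.items(), key=lambda kv: (-kv[1], kv[0]))
    vocab = [word for word, _ in ranked[:n_mfw]]
    # Relative frequency of every MFW in every text.
    freqs = {}
    for name, tokens in texts.items():
        counts = Counter(tokens)
        freqs[name] = [counts[w] / len(tokens) for w in vocab]
    # Standardise each word column to z-scores across the corpus.
    columns = list(zip(*freqs.values()))
    mu = [mean(col) for col in columns]
    sd = [pstdev(col) for col in columns]
    z = {
        name: [(f - m) / s if s > 0 else 0.0 for f, m, s in zip(row, mu, sd)]
        for name, row in freqs.items()
    }
    # Delta: mean absolute difference between the z-score profiles.
    return {
        name: sum(abs(a - b) for a, b in zip(z[target], z[name])) / len(vocab)
        for name in texts
        if name != target
    }


def neighbour_votes(texts, target, mfw_settings):
    """Count how often each candidate is the target's nearest neighbour."""
    votes = Counter()
    for n_mfw in mfw_settings:
        distances = delta_to_target(texts, target, n_mfw)
        nearest = min(sorted(distances), key=distances.get)
        votes[nearest] += 1
    return votes


# Toy corpus (hypothetical): cand_a uses the same words as the target.
toy = {
    "target": "wilhelmus van nassouwe ben ick van duytschen bloet".split(),
    "cand_a": "ben ick van duytschen bloet wilhelmus van nassouwe".split(),
    "cand_b": "ende het was ende het is ende het blijft".split(),
}
votes = neighbour_votes(toy, "target", [3, 5, 7])
```

A candidate that wins under only one MFW setting would collect few votes, which is the kind of instability a
consensus over settings is meant to expose.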
The problem of the nearest-neighbour principle, which shows strong effects like the authorship signal but in the
process conceals other stylistic effects on a deeper level, like genre, topic,394 translation,395 and sometimes
chronology,396 is still present, even with the consensus tree. The visualization in R is a rather rigid expression of
the relation between texts, in which a text either is or is not assigned to the same branch. In addition, a
hundred or more texts render the visualization unreadable and therefore not useful.397 The visualization options of
R are not always adequate for the demands of my corpus,398 but in combination with Gephi a lot of these problems
are solved. I must remain aware of the limitations of my tools, especially the attribution and visualization
options of R, and of Gephi as well.
392 Joseph Rudman, Cherry Picking in Nontraditional Attribution Studies Chance 16, no.2 (2003).
393 Eder 2014.
394 Christof Schöch, Fine-Tuning our Stylometric Tools: Investigating Authorship, Genre, and Form in French Classical
Theater. in Digital Humanities 2013: Conference Abstracts (Lincoln Nebraska: University of Nebraska-Lincoln, 2013).
395 Jan Rybicki, The great mystery of the (almost) invisible translator. in Quantitative Methods in Corpus-Based
Translation Studies: A practical guide to descriptive translation research, eds. Michael Oakes and
Meng Ji (Amsterdam: John Benjamins Publishing Company, 2012).
396 Hugh Craig, Stylistic analysis and authorship studies. in A Companion to Digital Humanities. eds.
Susan Schreibman, Ray Siemens and John Unsworth (Oxford: Blackwell, 2004).
397 Eder 2014.
398 See figure 1.
399 Maljaars 1996, 65.
400 Maljaars 1996, 65.
401 Maljaars 1996, 66.
spelling might do to the authorial signal. Marnix, however, is known to be very involved and very strict in the
process of publication, always demanding to be present when printing his works.402
I assume, and with me the tradition of Wilhelmus research,403 that the version of the Wilhelmus in the
geuzenliedboek of 1581 doesn't deviate too much from the original spelling. I've searched for workable solutions
for possible differences in spelling, but came up short. Important for the analyses is the recognition that songs,
sometimes even versions of the same text, can differ greatly in spelling, and that this is not always a reflection
of an author's style. The researcher should be conscious of this fact when interpreting the results of this type of
authorship attribution on Dutch renaissance texts.
well as a problem specific to me, because I'm working outside my expertise and outside my comfort zone.
Connected to this argument is the fact that I'm dependent on the availability of the texts. When
performing comparative analyses, not only the target text has to be available but also a corpus of texts by
potential authors. The availability of texts by potential authors determines whether they can be included.
Canonical texts and texts by canonical authors are often more easily accessible than those in the periphery of
literature.
The other influence I'd like to point out, to make sure that I've freed myself from any pretence of objectivity, is
the researcher's horizon. My personal academic and professional history is part of me as a researcher, and
it will influence my choices. However, when acknowledged and anticipated, this type of quantitative
research should be able to filter out some biasing prejudice or preoccupation, at least from the analysis phase.
The interpretation of the results can still be largely subjective, but the actual generation of the data is less
sensitive to personal preference than in qualitative research.
coornhert _lijd02_01, will all have the same colour and so will their connections, looking like a piece of gum in the
Gephi visualization. These are all settings, and they depend on my decisions. Occasionally I'll tune up or tune
down some of the effects to make the graph readable or visually useful, but the premises I described here stay
the same for all the analyses.
I'd like to add an editorial note: even performing exactly the same analyses on exactly the same corpus
can sometimes produce slightly different results. When I performed some analyses several times over, for various
reasons, I came across this strange occurrence, for which I have no explanation. I will state here that
I have not repeated analyses hoping that they would better fit the hypotheses, nor should any researcher.
and other, again of different authors. So for these texts it seems that the fact that they are the same texts, or
on the same topics, is even stronger than an authorial signal. Of course not all these 'psalmen davids' are the
same, share the same spelling or include exactly the same songs, but they are stylistically related enough to form
clusters based on characteristics other than authorial style. Beneath the religious texts, at the bottom of the
graph, we see a little red cluster: both texts of Fruytiers, positioned closely next to each other.
My conclusion based on these bottom-left clusters is that authors tend to cluster together, just as
versions by different authors of the same texts tend to cluster together, and that even the type or topic of a text
should be taken into consideration as a possible effect. Another indication of a signal by topic is the
clustering of Marnix's Bienkorf with other satirical work like Erasmus's 'Lof der zotheid', in the bottom right of
the graph, connecting to the first outliers of Coornhert's prose. We can confirm these effects in the rest of the
corpus by looking at different versions of the geuzenliedboek and at other authors spread over the graph: their
texts also tend to make connections with each other. These conclusions support the results of the analysis of the
first Coornhert cluster. Although authorial signals clearly exist, they can be overwritten by other signals, like
different versions of the same text. I assume that this works both ways and that other, weaker signals can be
overwritten by authorial signals.
Other possible effects, like the distinction between prose and poetry, seem to be present in the graph. Most of
the poetry is pushed to the right, but there's also poetry at the top of the graph and a bit in the middle.
Prose clusters around the bottom Coornhert outlier, for example the letters of Willem of Orange and the satirical
work of Erasmus and Marnix. When reviewing these results with my supervisor, prof. dr. Els Stronks, she pointed
out that some of the traditionalist and traditional texts were clustering together, while innovators of language
such as Van der Noot and Lucas d'Heere also made connections with each other.
The bird's-eye view gave me enough indications that the corpus I'll be working on holds enough stylistic information
that can be captured, and a lot of effects based on these stylistics, to proceed with my research. There is, of
course, for every apparent effect also noise. A lot of connections don't make sense, a lot of texts do not fulfil my
expectations, and in many cases I don't have an explanation for their behaviour in the graph.
This being the case, I'd like to point out the general tendencies we do see in the graph. I suggest that
the reader look at the basic properties of the graph, the colours. The texts of the corpus are coloured by their
presumed authors. The positions of the texts are based on their stylistic content, so roughly the relative
frequencies of their MFWs converted to a score by a distance measure, in addition of course to the parameters for
visualization. Their position is definitely not determined by me directly, nor by their code or similar names. This
means that the clusters of pink and yellow texts, but also the red of Susato and Fruytiers or the dark blue of
d'Heere, all noticeable to the reader at first glance, are effects worth considering as valid even if the corpus also
includes a lot of noise.
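How relative MFW frequencies become a single distance score can be made concrete with a small sketch. It assumes
Burrows's Delta as the distance measure, which the text above does not name, and it uses hypothetical function
names and toy texts rather than my corpus: the frequencies of the most frequent words are standardised to
z-scores across all texts, and the distance between two texts is the mean absolute difference of their z-scores.

```python
from collections import Counter
from statistics import mean, pstdev


def most_frequent_words(texts, n):
    """Return the n most frequent words of the pooled corpus (ties broken alphabetically)."""
    pooled = Counter()
    for tokens in texts.values():
        pooled.update(tokens)
    ranked = sorted(pooled.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


def z_scores(texts, vocabulary):
    """Standardise the relative frequency of each vocabulary word across all texts."""
    freqs = {}
    for name, tokens in texts.items():
        counts = Counter(tokens)
        freqs[name] = [counts[w] / len(tokens) for w in vocabulary]
    columns = list(zip(*freqs.values()))
    mu = [mean(col) for col in columns]
    sd = [pstdev(col) for col in columns]
    # A word with zero spread carries no information, so its z-score is set to 0.
    return {
        name: [(f - m) / s if s > 0 else 0.0 for f, m, s in zip(row, mu, sd)]
        for name, row in freqs.items()
    }


def delta(z_a, z_b):
    """Burrows's Delta: mean absolute difference between two z-score profiles."""
    return sum(abs(a - b) for a, b in zip(z_a, z_b)) / len(z_a)


# Toy texts (hypothetical): a1 and a2 share their frequent words, b1 does not.
toy = {
    "a1": "de heer is mijn herder de heer is goed".split(),
    "a2": "de heer is mijn licht de heer is groot".split(),
    "b1": "ende ick ben een prins ende ick ben vrij".split(),
}
vocabulary = most_frequent_words(toy, 5)
z = z_scores(toy, vocabulary)
same_style = delta(z["a1"], z["a2"])
other_style = delta(z["a1"], z["b1"])
```

In this toy case `same_style` is smaller than `other_style`, so a1 and a2 would be drawn closer together than a1
and b1. The layout of the graph itself depends on further visualization parameters that this sketch leaves out.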
Specialized corpora
Now that I've seen the grand picture and established some effects, I'd like to test these observations and look at
the data a bit more closely, by analyzing smaller and cleaner corpora, specifically composed to test my hypotheses. I've
constructed three 0-corpora, meaning starting corpora that might be further polished after seeing the results of
the analyses on them, each designed to answer a specific hypothesis about or aspect of the Wilhelmus, the texts of my
specialized corpora or my texts in general. These corpora should answer my questions about language and
dialect; genre, type and topic; and the author, in that order. All of the corpora are extensively prepared. I've tried to
remove all the meta text, footnotes, titles, introductions and other forms of noise, so that only the text itself
would remain. All the numbers and large parts of the punctuation have also been edited out. In the case of plays
I've deleted the names of the speakers and other speaker indications such as choir. In the case of psalms I've deleted
the endings, which were always formulas like amen or gloria. In the case of letters, and other texts, I've deleted
authors' notes, names, places and dates of writing, mostly at the bottom of the texts. I also tried to delete all the
text that was in a different language. An example of this is the removal of all the Latin proverbs in the Antonius
Ghyselers texts.
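Part of this preparation is mechanical and can be sketched in a few lines of Python. This is only an illustration under assumptions: it removes all punctuation (the text says only large parts were removed), and the helper names and the list of endings are hypothetical. Removing meta text, speaker names or foreign-language passages requires manual judgement and is not covered.

```python
import re


def clean_text(raw):
    """Lower-case a text, drop numbers and punctuation, return word tokens."""
    text = raw.lower()
    text = re.sub(r"\d+", " ", text)      # remove numbers
    text = re.sub(r"[^\w\s]", " ", text)  # remove punctuation marks
    text = text.replace("_", " ")         # \w keeps underscores, so drop them too
    return text.split()


def strip_endings(tokens, endings=("amen", "gloria")):
    """Remove formulaic closing words (assumed list) from the end of a text."""
    tokens = list(tokens)
    while tokens and tokens[-1] in endings:
        tokens.pop()
    return tokens
```

For example, `clean_text("Wilhelmus van Nassouwe, 1568!")` yields the three word tokens without the comma, the year or the exclamation mark.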
A complete index of all the specialized corpora I constructed can be found in the appendices. I suggest
that the analyses are read in the order in which I discuss them, because a lot of my later choices are a
consequence of the results of the earlier analyses. All the analyses are performed and visualized in R-stylo.
There is, according to the results of my analyses, a stylistic effect based on languages, dialects or any
other kind of accent or influence stemming from differences in language, present in the texts of my
corpus.
There is, according to the results of my analyses, a stylistic effect based on the Southern dialect, also
called Flemish, meaning Dutch from the southern regions of the Netherlands, present in the texts of my
corpus.
There is, according to the results of my analyses, a stylistic effect based on the Eastern dialect that
stems from influences of the German language, Hoogduits being the most prominent one, present in the
texts of my corpus.
There is, according to the results of my analyses, a common stylistic effect for the Dutch texts that are
translations out of Latin or classical Greek.
There is, according to the results of my analyses, a stylistic signal of a language or accent other than
the Dutch language present in the Wilhelmus.
The texts are coded and so divided by language or dialect. NED stands for Nederlands and means the Dutch
language, the official language of the kingdom, spoken in the provinces of Holland. KLAS stands for classical
texts; these are translations of classical literature into Dutch. DUI stands for German, meaning these texts range
from Hoogduits to Dutch text with German influences. These are not binary categories but more of a sliding scale.
What binds them is that they're presumed, by me, to have German influence of some kind. ZUID stands for
southern, meaning all languages, dialects or influences on the Dutch language that are spoken in the southern
provinces of the Netherlands, nowadays Vlaanderen and Brabant. I presume these texts are some sort of
Flemish or another version of Dutch with Flemish or Belgian influences.
I also included the Wilhelmus and four cases of which I am unsure what region they belong to, mentioned in the
section Hypotheses. These are Van homulus by Pieter Dorland, the Nederduitse Orthographie by Pontus de
Heuiter, Drie historische liederen en een hekeldicht by Antonius Ghyselers and the anonymous De Uilenspiegel.
corpus, improves the results.406 Words for which a single text supplied most of the Delta score itself should be
deleted as they pollute the results.
The results now showed clear divisions and a lot of detail. The biggest division puts the ZUID texts on
one side of the graph, in the same branch of the tree. There are authorial signals again, in Coornhert, Fruytiers and
Van der Noot, and possible prose vs. poetry signals, which is how I interpret the divergent results for prose.
Dorland and the anonymous Uylenspiegel both seem to cluster with the ZUID branch. Heuiter and Verwaeck, the
other unspecified texts, cluster together, along with Marnix's prose text Dordtse reden, in the branch of the NED
texts. The Wilhelmus clusters again with Coornhert. Other notable results are Coornhert's prose text Corte
berispingen clustering with the translated classical texts of Erasmus and Coornhert himself. Coornhert's other
classical translation, his Ulyses, clusters with the Dutch poetry of Marnix and Hout. The German texts under the banner
DUIT are scattered across the graph. By varying the two parameters, the pronouns and the culling, I test
whether each change in settings influences the results and, if so, in what way.
Analysis 2.1 shows that keeping the pronouns while culling heavily distorts the results, presumably
because keeping the pronouns generally leads to noisier results and, at a high level of culling, the pronouns
will make up a larger portion of the corpus than they did before, although this is speculation.
In the third analysis, analysis 3, I use different levels of culling in one and the same graph, by choosing a
minimum and a maximum level of culling and an increment. I determine the minimum and maximum by looking at the
data of the analyses, displayed in the working screen of R. Culling of 50 percent leaves just 145 features, words,
available for analysis. If I only considered words that appear in all texts we'd only measure 2 available features, only
2 words. The results show only the big movements and the division of the ZUID texts against the rest. This result was
slightly visible in previous analyses. The little effects are as good as gone. There are no little clusters; instead
every text starts from the middle, on its own side of the divide of course.
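The effect of culling on the number of available features can be made concrete with a small Python sketch. It assumes that a culling level of X percent keeps only the words that occur in at least X percent of the texts, which is how the setting is understood here; the function name is hypothetical.

```python
def cull(texts, vocabulary, percent):
    """Keep the vocabulary words that occur in at least `percent` % of the texts.

    texts: dict mapping a text name to its list of tokens.
    A level of 0 keeps every word; a level of 100 keeps only the words
    that are present in every single text.
    """
    token_sets = [set(tokens) for tokens in texts.values()]
    kept = []
    for word in vocabulary:
        present = sum(1 for tokens in token_sets if word in tokens)
        # Integer comparison avoids rounding problems with percentages.
        if present * 100 >= percent * len(token_sets):
            kept.append(word)
    return kept
```

Running such a function over a range of levels (minimum, maximum, increment) shows how quickly the feature list shrinks, which is the trade-off behind the 145 words at 50 percent and the 2 words at 100 percent reported above.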
In analysis 3.1, I choose low levels of culling, from 0 up to 50, and the graph shows the opposite
effect. The big divide is totally gone but the clusters are reappearing. Both analysis 3 and 3.1 have unattractive
parameters because their graphs show a less distinctive landscape of this corpus, and less diversity in possible
effects, than the graphs of analysis 2, with a set choice of 50% culling.
In analysis 3.2 I chose a culling of 30 to 50 percent. Based on the available features, these were the
best iterations regarding the amount of measured MFW (most frequent words). If the amount of words becomes
too small, there are not enough features to analyse, but a feature set of over 1000 is bound to contain too many
semantic words. The choice of these parameters results in a more detailed graph in which the big divisions, as
well as some of the detailed information on the clusters, are visible. One branch of the tree clusters all the southern
texts, including the possibly Flemish Dorland, together with Uylenspiegel and the presumed German text of
Antonius Ghyselers, which has clustered with the ZUID texts before. One branch, of a Dutch text and a classical text,
springs from the middle but seems to tend towards the ZUID branch. So there seems to be a cluster based on dialect
or language if we look at the clustering of the ZUID texts; at the same time there are a lot of texts that do not confirm
this hypothesis. Prose texts are often exceptions. It is necessary to pay extra attention to a possible distortion of
the results based on an effect of type of text, meaning prose or poetry. An authorial signal seems present in
Coornhert, Fruytiers and Van der Noot. Heuiter and Verwaeck, the other two undetermined texts, cluster
together. The Wilhelmus again joins Coornhert's song Ter liefden. The effects of the German-influenced texts are
unclear. In general, the small number of available features in analysis 3 can be taken as an argument for another
feature type, character n-grams, also suggested in my theory.
In the fourth analysis I sample in order to compensate for any differences in text size, because these may cause
an imbalanced corpus. I set the sample size at 525, which is about the size of the Wilhelmus. The graph showed
that parts from the same text cluster together, suggesting they are stylistically representative of the text from
which they came. There's stylistic consistency within the works. Other results are that the two biggest texts, which of
course also produced the most samples, are responsible for the two biggest clusters, which also form the main
branches, the biggest divisions. The third cluster is a leftover cluster, containing the rest of the texts, those that
do not belong to the two biggest texts.
I can confirm that a large amount of stylistically related text not only clusters together, but also makes the
other texts seem to cluster together, driving them into one branch by the weight of its own numbers and the
strength of its relationships. Effects of author and the clustering of Dutch texts from the southern regions are
again present.
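Why the biggest texts dominate a sampled graph follows directly from how sequential sampling works, as this minimal Python sketch shows. It assumes that a text is cut into consecutive chunks of a fixed number of words and that an incomplete final chunk is dropped; the function name is hypothetical.

```python
def sequential_samples(tokens, size=525):
    """Cut a token list into consecutive, non-overlapping samples of `size` words.

    A trailing remainder shorter than `size` is dropped (an assumption of
    this sketch), so a text shorter than `size` yields no sample at all.
    """
    return [tokens[start:start + size]
            for start in range(0, len(tokens) - size + 1, size)]
```

A text of 5,250 words contributes ten samples, while a text the size of the Wilhelmus contributes one, so the large texts supply most of the points in the graph.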
In analysis 5, I've used a random sample, again with the sample size 525, and the number of samples reduced to
one. This follows the bag-of-words principle, where word order is completely ignored and 525 words are chosen
at random from every text. The effect of text size, so visible in analysis 4, is now undone because every text only
has one sample.
The graph as a whole is less distinctive than the previous ones. It shows a big division but almost no other
effects. Most of the texts originate from the middle. There are very few texts that cluster together. Random
sampling supposedly makes for very few effects because of its short sample size. Most effects will not show
themselves with only 525 measurable units, in this case words.
Random samples of 525 words are, seemingly, so short that a corpus unbalanced in size shows
more effects using the word frequencies of the complete texts than using the samples. This leads me to the
conclusion that a text of somewhere between 500 and 550 words might just be very unsuited for authorship attribution
and that texts of this size will cause noise in the results. This is only based on these parameters and these texts, so all
my conclusions come with a proviso.
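The random, bag-of-words variant of sampling can be sketched in Python as follows. This is an illustration under the assumption that words are drawn without replacement; the function name and the optional seed are hypothetical additions for reproducibility.

```python
import random


def random_sample(tokens, size=525, seed=None):
    """Draw `size` words at random from a text, ignoring word order.

    Words are drawn without replacement (an assumption of this sketch).
    Every text yields exactly one sample, whatever its original length.
    """
    if len(tokens) < size:
        raise ValueError("text is shorter than the requested sample size")
    rng = random.Random(seed)
    return rng.sample(tokens, size)
```

Because each text is reduced to one sample of the same size, the imbalance of analysis 4 disappears, but so does most of the evidence the longer texts could have offered.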
First of all I'd like to mention that fine-tuning the parameters makes a lot of difference for the results, their
visualization and thereby the interpretation and conclusions. I make decisions regarding the parameters based on
theoretical arguments in order to avoid cherry-picking.
There were authorial signals, possible type signals, which seemed the result of prose clustering, and a possible
language signal, because the ZUID texts, texts with Flemish or southern Dutch heritage, seem to cluster. At the
same time there are a lot of texts that do not confirm these effects. Prose texts are often exceptions; this may be a
distortion caused by a type or genre effect. The German texts do not seem to recognize each other
stylistically. There might be an effect for translated works, but I need more tests to confirm this.
Dorland and the anonymous Uylenspiegel both seem Flemish or Flemish-influenced. Heuiter and Verwaeck
cluster together, along with a variety of other NED or Dutch texts, but do not consistently show a language
signal. The Wilhelmus clusters with Coornhert, but also inconsistently.
When choosing high levels of culling the graphs showed the strong effects, hence the big divide, but lost a lot of
the little effects, which may not actually be relevant or true effects. When using low levels of culling, the graph
showed the opposite effect. I should keep this in mind, and try to account for it theoretically, when determining the
culling, which I base on the amount of features, not on the graph.
The sampling confirmed the stylistic consistency within the works and laid bare that big texts, or large amounts of
stylistically related texts, push the other texts into one cluster, while these may not actually be stylistically similar.
They're only alike in their comparison to the other cluster, namely in not being like those texts. So relations are
relative. I need to be aware of this when constructing further corpora.
We saw that most effects will not show themselves with a sample size of only 525. Trying to perform an
authorship attribution on a text of somewhere between 500 and 550 words seems like stretching the methods too
far, and texts of this size will inherently cause noise in the results, but I already knew this beforehand.
assume the reader has read the explanations of similar interpretations in previous analyses. The report will
continue to become more concise but I suspect I'll also have more, and more complicated, results to report.
The first analysis on this new corpus has the same parameters as the 3.2 analysis I performed on the
previous corpus, where we first determined the culling. The graph shows two main branches: one clusters the
ZUID texts with Dorland and the text of Antonius Ghyselers, while the other contains the texts of Van der Noot
and two ZUID texts that crossed over to this branch as a pair. Authorship effects are present, as are signs of stylistic
similarity between texts from the same work, like the geuzenliedboek or any other book of songs.
German texts are divided across the graph, but Dousa's translation from Latin clusters with the two texts of the
Niederheinische liederen. The Wilhelmus again shares a branch with Coornhert's Ter liefden van een
Maghet and the Baanderherenlied, two songs that individually were about half the size of the Wilhelmus, but were put
together to avoid texts even smaller than the Wilhelmus existing in the corpus. Performing this intervention means
we've got to remember that there are possibly two voices in the conjunct text. The texts might be by the same
author, but we can't forget that possible genre effects, or other unknown effects, are active and may cause a style
not found in either of the individual texts. Because there's no representation of each individual text, the differences
between training texts by the same author are disregarded and the stylometric measures extracted from the concatenated
file may be quite different in comparison to each of the original training texts.407 I've found research where the results
showed that there is no substantial difference in attribution accuracy between a few chunks of 500 words
combined in one sample and a dozen concatenated chunks of 100 words. It suggests that in real attribution
studies, concatenated samples would display a very good performance.408 This was however an artificial
concatenation and moreover out of the same document. I remain wary of the conjunct files. Heuiter, the other
undefined text, doesn't show any preference for a language.
For analysis 2 I change the type of features to character n-grams, to compensate for the shrinking of the amount
of features when performing high amounts of culling. I start with n-grams of 3 characters. The results show that
the Van der Noot branch switched back to the side of the ZUID texts, making a clean division between the ZUID
texts and all others, except for the text of Antonius Ghyselers, presumed to be German-influenced, which joins the
Flemish or southern Dutch texts.
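Why character n-grams yield more features than words can be seen in a minimal Python sketch. It assumes that spaces count as characters and that runs of whitespace are collapsed first; the function name is hypothetical.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of a text.

    Whitespace is normalised to single spaces, and spaces count as
    characters (an assumption of this sketch), so n-grams can span
    word boundaries.
    """
    text = " ".join(text.split())
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Frequencies of the n-grams can then be counted like word frequencies.
counts = Counter(char_ngrams("wilhelmus van nassouwe", 3))
```

A text of N characters produces N - n + 1 overlapping n-grams, far more tokens than it has words, which is why more features survive high levels of culling.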
While the division of the ZUID texts against the rest is now nearly perfect, it does raise questions about
the reasons why the texts of Van der Noot switch sides so easily. Although the graph now seems to confirm an
effect of the Flemish or Southern Dutch dialect, it does this under very specific conditions. As we've learned
from the theory on the unmasking method, texts that cluster together under different feature sets, or different
parameters, are more convincing in their stylistic similarity. A lot of the effects are very delicate and thus not very
convincing, but this sensitivity is explained by the far from perfect conditions of the corpus and the
dependency of the method on the corpus. It reinforces our awareness of the importance of the parameters. If
switching the basic parameters to highly similar, but different, ones changes the outcome so drastically, I've got to
fully understand the conditions that I'm changing and account for my considerations when preferring some over
others.
With these parameters, I refrained from choosing a specific culling but included all levels of it (with an
increment of 10), basically allowing higher levels of culling to exert their influence again, and as a consequence
Van der Noot sides with the other southern texts again. This was also the case in previous analyses where I
included high levels of culling. The difference is that this time around I've got a lot more available features at
the highest rates of culling, because of my choice of character 3-grams. However, in the included graph I've
switched back to the culling of 30-50, with the same effects, implying that it is the choice of n-grams that pushes
the Van der Noot branch back to the other ZUID texts. The other results are broadly the same: effects of
language and author, and no clear signal for the Wilhelmus, although the anthem does cluster again with the conjunct
Coornhert text.
I performed the same analysis again but this time I set the size of the character n-grams to 4. The data of the
analysis in the R working screen verified the increase in the amount of features. This gives us the possibility of
more iterations of culling, so I went with the full scope of culling from 0 to 100%.
The graph of analysis 4 can be read as an indication of whether a feature size closer to word level shows
us a graph closer to that of the word-level analysis. Surprisingly it doesn't. The graph shows us a lot more detailed
information than the word-level graph of the first analysis on this corpus did. There is a little cluster of translated
classical literature and a cluster of the two DUI texts with another KLAS text, but in general neither the translated
Dutch texts nor the presumed German texts seem to form any real clusters. Van der Noot stays with the
other southern authors. The Wilhelmus hasn't shown any sign of clustering with the ZUID texts.
The graph that belongs to analysis 5 shows us n-grams on word level, or collocations, discussed in the theory. I
won't discuss all their possible effects and advantages again, because in this analysis on these short, 550-word
texts, collocations do not prove themselves useful. First of all, anything above a culling of 10 has too few
iterations to analyse. Looking at the graph, the top 5 longest texts, Erasmus's lof der zotheid, the anonymous
bekerigePauli Vlaams Sinnespel, Anna Bijns's Seer scoon boeck and Coornhert's boethius, show relations
with each other, along with the two parts of the Niederheinische liederen. The rest of the graph shows no relations at
all and gives me no information on possible stylistic properties. I'll be hesitant to do further analyses using
collocations as features on corpora that involve texts of these lengths.
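The sparseness of collocations in short texts can be illustrated with a minimal Python sketch of word n-grams. The function name is hypothetical and the example line is only there to show the mechanics.

```python
from collections import Counter


def word_ngrams(tokens, n=2):
    """Return all overlapping word n-grams (collocations) of a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# A text of 550 words contains only 549 word bigrams, and most of them
# occur just once, so few of them are shared between short texts.
bigrams = Counter(word_ngrams("wilhelmus van nassouwe ben ick van".split(), 2))
```

Since a word pair must recur across many texts to survive culling, it is plausible that almost nothing is left above low culling levels in a corpus of texts this short.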
For analyses 6 and 6.1 I use sampling, while keeping character 4-grams as features. Based on previous results
and plain logic I choose 2 for analysis 6 and 1 for analysis 6.1 as the number of random samples.
Graph 6 shows that samples of 550 character 4-grams do seem to interact and show stylistic relations.
Except for 7 texts, lof der zotheid, boethius, one text of Hout, Heuiter, Marnix's 2 sonetten aan lucas
d'heere & in den duinen & De profetie van het lijden Christi, Van der Noot's De vrijagie ende het houwelyck and
again Antonius Ghyselers, the texts of my corpus form clusters. The bad performance of the Erasmus text,
boethius and that of Heuiter can be explained by their size, their consisting of texts with different styles and
perhaps less than perfect editing. The Marnix text actually consists of 3 texts that I combined into one file, for the
same reason as the Coornhert conjunct text, namely their limited text size. Both parts of Coornhert
however stay close, which is encouraging, hinting at internal consistency. I can't explain Van der Noot and
Ghyselers at this point, and I'll have to dive into the texts, close reading them, in order to be thorough. The graph
doesn't show me many other effects, except for the close relation of both the Fruytiers texts.
Graph 6.1 shows the same pattern. A sample of 550 words is too small to suffice for analysis. This backs
the conclusions about sampling from the language corpus. When performing stylometry based on quantitative
analysis with the methods, design and tools I'm using, it is advised to use texts or text samples larger than 550
words when searching for an effect signalling language or dialect.
4. Conclusions language corpus 2 Alle talen, enkel poez (DUI-NED-ZUID-KLAS)
The results of the analyses on the second corpus confirm the conclusions of the analyses on the language
corpus. We see authorial signals, stylistic compatibility of texts stemming from the same book of songs and a
possible language signal of ZUID texts.
The German hypothesis seems very improbable. The available texts to test this hypothesis already fall
short, and the German corpus shows no sign of consistent stylistic similarity, apart from the very obvious
connection of the two texts of the Niederheinische liederen, two samples from the same author and out of the
same work. There might also be a stylistic effect of translated classical literature.
The Wilhelmus shows no clear signs for language. It again shares a branch with Coornhert's
Ter liefden van een Maghet & Baanderherenlied, a conjunct text, which is a text heavily adapted by me, in this
case by putting two texts by the same author, and of compatible genres, in one document and analyzing it as one
text. I use this procedure either to enlarge the text size or to reduce the number of texts by an author without
reducing the amount of text. This is a questionable practice and I should never forget the artificial nature of these
conjunct texts. They might be by the same author, but the consequences for any other effects possibly present, and the
complications of multiple voices in one text, are theoretically and practically unclear to me.
Some of the texts move around the graph a lot and they make and break connections very easily when
I make little adaptations to the parameters. When performing analyses we should look for attribution that remains
stable under different experimental circumstances, not least different feature sets. This stability is
associated with the reliability of the attribution. The opposite is also true. At this point a lot of the effects are still to
be interpreted as part of the difficulty of creating big test corpora, the text size and the conditions under which we
do the analysis. Conclusions should be drawn cautiously, because of the delicate nature of the effects, the amount
of noise, and the ambitious goals I have set for my methods, given the corpus and the flawed nature
of the other testing conditions.
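One simple way to make the stability criterion explicit is to count how often the same text turns up as the nearest neighbour of the Wilhelmus across parameter sets. The Python sketch below is only an illustration of that idea, not a procedure used in these analyses; the function names are hypothetical and any distances passed in would have to come from the actual runs.

```python
from collections import Counter


def nearest_neighbour(distances, target):
    """Return the text closest to `target`.

    distances: dict mapping (target, other) pairs to a distance score.
    """
    candidates = {other: score
                  for (first, other), score in distances.items()
                  if first == target and other != target}
    return min(candidates, key=candidates.get)


def neighbour_stability(runs, target):
    """Share of runs in which each text is the nearest neighbour of `target`.

    runs: dict mapping a parameter-set label to its distances dict.
    """
    votes = Counter(nearest_neighbour(distances, target)
                    for distances in runs.values())
    return {name: count / len(runs) for name, count in votes.items()}
```

A text that is the nearest neighbour in nearly all runs would count as a stable attribution in the sense described above; one that wins only under a single feature set would not.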
Analyses using collocations do not seem useful at this point.
distinguishable in the graph. This effect shows when the corpus is properly prepared and other effects are
controlled for, so that these won't blur the stylistic similarity of the ZUID texts. The biggest division is between five
texts coded as ZUID, one text of Dorland, with probably heavy southern influences, and one text of Ghyselers,
which clusters with the ZUID texts for reasons unknown. I should begin to consider Ghyselers as a possible ZUID
text.
I give up on testing the German hypothesis. There is no reason to assume any type of effect in this
corpus of DUIT texts. I consider this a flaw of the corpus, as it's insufficient to test this hypothesis, so I will not so
much reject the hypothesis as leave it unanswered. I'll discuss this in the conclusion section of the whole
language corpus, as dropping this hypothesis is the product of all the previous analyses on the different language
corpora.
So far the Wilhelmus has shown no relation to any dialect other than the hypernym NED.
7. Analyses language corpus 4 DUI NED ZUID poez
Having given up on the German hypothesis I exclude these texts from my corpus, giving birth to the last language
corpus, called DUI NED ZUID poez, counting 15 texts. The goal of this corpus is to confirm the ZUID hypothesis,
that the Wilhelmus has southern influences, and to determine if it can be attributed to the ZUID texts. If so, it
would suggest that the Wilhelmus is of southern origin.
In the first analysis I go back to the basic settings of words as features with proper percentages of culling and
increment. The graph shows a big division between the ZUID texts and the other texts, now all NED texts or
open cases. Dorland clusters with the ZUID texts and Heuiter with the NED texts, specifically with Hout. I
conclude that Dorland has southern influences and Heuiter does not. I will not go as far as to say that Heuiter is
Dutch without any stylistically noticeable dialect, as the distinction between the two regionally based styles
obviously comes from the clustering of the Flemish or southern texts. I come to this conclusion based on the previous
corpora, which included texts with other dialects or stylistic influences based on languages: these
predominantly clustered with the NED texts, or more specifically away from the ZUID texts, while the ZUID texts
consistently clustered together over several different parameter sets, with a few exceptions. Also, conceptually,
while ZUID stands for a region, of which we test the assumption that texts from this region have some
distinguishable stylistic signal, NED stands for an accentless language, meaning what DUIT or ZUID texts are
not. Based on the results as well as the initial categorization of texts, I conclude that the common stylistic effect of
the ZUID texts is indeed present, while this same effect for NED texts is not established. The Wilhelmus shows,
according to my results, no southern influences. It consistently clusters with a conjunct text of Coornhert, an author
whom I assume to be without southern influences.
In analyses 2 and 2.1 the features used are character 3-grams and character 4-grams. Both graphs show less
coherence on either side of the big division than the graph of analysis 1 did, but the results stay roughly the
same. In graph 2.1 there does seem to be a division in the NED branch, something which has not occurred up
until now, between Marnix, Hout and Heuiter on one side and Fruytiers, Coornhert and the
Wilhelmus on the other.
For a critical interpretation of these results I consider characteristics of the texts other than those considered up until
this point. I find that the largest, most primary division could also be explained by text size, although such an
explanation would be a lot less distinct and count a lot more deviations than an explanation based on language. It turns out
that the ZUID texts are, most of the time, the texts with the biggest text size, and Coornhert and the Wilhelmus,
the two texts that clustered repeatedly, are the texts with the smallest text size. However, there are still enough
exceptions to challenge an explanation based on text size. Heuiter, for example, is a very large text, but clusters
with the NED texts. Also, the size of the ZUID texts ranges from 3 times the Wilhelmus up to 10 times the
Wilhelmus, so the differences in size within the branch where most of the large texts reside are bigger than the
difference in text size between the branches. Some of the authorship effects, like Van der Noot, shouldn't be
present if text size were the primary determinant. Still, these results can't be ignored and I'll need to control for any
possible size effects.
To cancel out the text size, I merged both texts of Fruytiers into one file, and did the same for both texts
of Marnix. I also added another Coornhert text, boeventucht, to the already conjunct file of Coornhert. In addition
to this I deleted 80% of Van der Noot's Himne of Loft-sangh van Brabant, sizing it down from 24 kb to 5 kb in txt
format, roughly the size of the Wilhelmus. With these corrections to my corpus, I perform the same analysis as
2.1 once more.
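The two corrections, merging texts and cutting one down, can be sketched in Python as follows. The sketch makes assumptions the text does not state: that merging is plain concatenation and that the retained 20 percent is the beginning of the text. The function names are hypothetical.

```python
def merge_texts(*token_lists):
    """Concatenate several token lists into one conjunct text."""
    merged = []
    for tokens in token_lists:
        merged.extend(tokens)
    return merged


def truncate(tokens, keep_percent):
    """Keep the first `keep_percent` % of a text (assumed: the opening part)."""
    keep = len(tokens) * keep_percent // 100
    return tokens[:keep]
```

Deleting 80 percent of a text corresponds to `truncate(tokens, 20)`; which part is kept may matter for the result, so that choice would have to be documented alongside the corpus.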
In graph 4.1, Coornhert still clusters with the Wilhelmus, even after being transformed into one of the longest texts
of the corpus. The Fruytiers document and the Marnix document show the same relationship towards each other
and the other texts and take roughly the same position as they did before. The relatively long text of Dorland
(50 kb) does not cluster with the other long texts, being Coornhert, Anna Bijns and the anonymous sint paulie,
but with the much smaller Van der Noot (20 kb) and Uytenhoven (15 kb). Anna Bijns's text and begeerige pauli, two
very large texts, cluster together with another anonymous text, schandaleuze spelen, which isn't very big. These
results can be explained by similarities in dialect, but not by similarities in size. In general, graph
4.1 shows us the familiar division of the ZUID texts on one side of the graph and the texts that are not ZUID on the
other, with the exception of the adapted text of Van der Noot, which has switched over to the NED side. Based on
the assumption that the signal that divides ZUID from the rest is largely based on the common stylistics of the
southern texts and not on the common stylistics of the Dutch and other texts, Van der Noot apparently no longer
shows, after its mutilation, enough resemblance to the stylistics of the ZUID texts, and is therefore pushed away.
Apparently the sizing down of Van der Noot obscures the stylistic characteristics it showed before. I should take
this into consideration when reviewing the results of the Wilhelmus text and my methods.
Graph 4.2 shows roughly the same effects as graph 4.1. When switching to character 4-grams in
analysis 4.2, the adapted Van der Noot text again has another text of Van der Noot as its nearest neighbour;
however, they're still positioned in the graph on the opposite side from the ZUID texts. Also, both Van der Noot texts do
not form a branch of their own, but are only nearest neighbours.
8. Conclusions language corpus 4 DUI NED ZUID poez
I conclude that there's a signal for language in the form of a stylistic resemblance among Flemish and/or Dutch
texts from the southern regions. I corrected for size and this effect was still present. This stylistic signal, which I
had interpreted as an effect based on dialect or language up until my revision of the DUI NED ZUID poez
corpus, turned out, after I critically reviewed other explanations and corrected for text size, to be correctly
interpreted as such. Text size does not account sufficiently for this signal.
Based on the results of my transformed fourth corpus, I do however conclude that the stylistic signal of
a text changes significantly somewhere under a thousand words. Sizing down obscures stylistic
characteristics regarding language, but has less effect on the authorship signals, which are present in this corpus
and remain so, although less obviously, when the text size gets significantly under a thousand words. In addition to this,
when raising the level of culling, the authorship effects that were lost or dimmed because of a small text size
seem to come back.
When performing analyses on this corpus, I have to be aware of the limited text size and possibly correct for
it, or alter and prepare my corpus because of it. Cutting texts obscures all types of stylistic markers present in
them, while deleting all small texts or combining texts in order to increase the text size is in my case hardly a
workable option if I'd like to keep a balanced corpus, which includes different authors represented by multiple
texts with different characteristics, and to avoid combining texts, in order to keep the voices of my texts singular and
their signals authentic.
I conclude that Dorland has southern influences and Heuiter does not. I'll not go as far as saying that Heuiter
writes Dutch without any stylistically noticeable dialect, as I deem the stylistic similarity of the NED texts not proven,
while the clustering of the ZUID texts has been convincing over four corpora and all the analyses using several
different parameter sets, exceptions excluded. As explained, the assumption of the NED or Dutch texts as a
dialect or version of the Dutch language is conceptually problematic, while ZUID has a better theoretical ground.
Therefore I interpret consistent clustering with NED texts as not-ZUID, meaning under no measurable
stylistic influences from languages of the southern regions of the Netherlands, like Flemish.
- The results of my analyses show that there are stylistic effects, based on the difference in style between variations
of the same language, for example dialects or other regional determinants, present in my corpus. They can be
measured by my methods and can be demonstrated in a visualization. Based on the results of this research I
conclude that texts of a similar dialect are expected to have a stylistic resemblance. I've only satisfactorily tested
this for texts from or under the influence of the formerly southern provinces of the Netherlands, for example
Vlaanderen. I refuse to generalize these findings to other dialects yet, because one swallow does not a
summer make.409 I can only suspect that similar effects are present for other dialects or linguistic
regions. Recommendations for further research are made in the sections discussion and future research.
- When analysing short texts, meaning texts of around a thousand words or less, we have to take the size
into account and if possible correct for it, as text characteristics become less visible and more noisy, and
sometimes disappear altogether. Preparing the corpus is a big part of controlling for text size effects. However,
cutting texts too much, to under a thousand words, obscures all types of stylistic markers present in the text.
- There is no indication that the Wilhelmus has a southern or Flemish origin.
- The German hypothesis has not been sufficiently tested. I fell short on testing this part of the language sub
hypothesis for several reasons. First of all, there was too little readily available text for a decent representation in
my corpus, mainly because of the other restrictions I placed on the texts, like size, area and time of creation, and a
qualitative and temporal closeness to the first edition. Another reason for the low quantity and quality of my DUIT
texts was that, while I was working on a sub hypothesis under the time pressure of deadlines, the selection of texts
of this type took very long: biographical knowledge of a text or author was seldom available, and I had
limited knowledge of this type of text myself. Combined with the variety of different types of German
influence, it took an enormous amount of time and effort, reading German secondary literature for example, to
understand the stylistic nature of these texts and to determine whether they should be included in my corpus. In my
search for texts that would qualify as DUIT I've come across, among others, German texts, texts that were
Hoogduits and texts from the upper regions of the West-German border that were paradoxically called
nederduits or Nether-German texts. Current borders, national or linguistic, weren't there in the 16th century;
there was rather a language continuum from the eastern provinces of the Netherlands up to the Baltic states.410 This
diversity is a problem if you're building corpora from the Renaissance, because of the limited paper and digital
availability of these types of texts. Further research would benefit from defining and determining the
diversity of this dialect, in order to make the categories operational.
- Of the five texts, or test cases, without a definite language mark that I've tested alongside the Wilhelmus,
namely De uilenspiegel, Dorland's van homulus, Heuiter's Nederduitse orthografie and Vaernewyck's histori
Belgis, I can only determine Dorland as a ZUID text.
- The first methodological conclusion I draw is that, when performing normal sampling with a sample size of 525,
samples of larger texts will cluster together and therefore determine the main branches of the graph. The main
branches are determined by the number of relations and the strength of the relations, and samples of the same
large text have both. Texts significantly larger than others will make up too big a part of the total number of
samples and therefore obscure smaller effects. I advise a balanced set, especially for text size, when using
normal sampling and in general, because we've seen that the results are influenced by the size of the
texts. The importance of this increases as the texts grow short in an absolute sense. This also leads me to conclude
that 550 words is a very short sample size and text size to use for analysis. Subtle effects will be more present in
the graphs when using bigger texts or samples.
- A second methodological conclusion concerning sampling is that, when using random samples of 525 words, the
graph gives far less information and expression than when using the full size of all the texts, even when the
corpus is very imbalanced regarding the number of words. These findings lead me to conclude that a text of
somewhere between 500 and 550 words is an inappropriate sample size for quantitative stylistic analyses: it
disrupts the results and should be avoided if possible. Although stylistic analysis of texts this short may be
difficult and even problematic, it's definitely not without results. Language and authorship signals, among others,
are present and visible.
These methodological conclusions conform to my expectations. This of course also means that the
Wilhelmus is far from an ideal text to perform authorship attribution on. However, effects seem to be present and
some are visible, and they can, with further research, lead us to conclusions.
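The imbalance described in the first methodological conclusion can be illustrated with a minimal sketch. It assumes that normal sampling means cutting each text into consecutive, non-overlapping samples of a fixed number of words and dropping the remainder. The function name, the text labels and the placeholder tokens are illustrative assumptions, not the procedure or the data of the analyses above.

```python
def normal_samples(tokens, size):
    """Cut a token list into consecutive, non-overlapping samples of `size` tokens.

    Assumption: a trailing remainder shorter than `size` is dropped.
    """
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]


# Hypothetical texts, represented by placeholder tokens only.
corpus = {
    "short_text": ["w"] * 551,
    "medium_text": ["w"] * 1500,
    "long_text": ["w"] * 3871,
}

sample_counts = {name: len(normal_samples(tokens, 525)) for name, tokens in corpus.items()}
total = sum(sample_counts.values())
shares = {name: count / total for name, count in sample_counts.items()}

for name in corpus:
    print(name, sample_counts[name], round(shares[name], 2))
```

In this toy corpus the longest text supplies seven of the ten samples, so its samples alone could be enough to shape the main branches of a graph.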
For my hypotheses this means:
There is, according to the results of my analyses, a stylistic effect based on languages, dialects or any
other kind of accent or influence stemming from differences in language, present in my corpus. This
hypothesis is accepted.
There is, according to the results of my analyses, a stylistic effect based on the Southern dialect,
concerning Flemish or Dutch from the southern regions of the Netherlands, present in the texts of my
corpus. This hypothesis is accepted.
There is, according to the results of my analyses, a stylistic effect based on the Eastern dialect that
stems from influences of the German language, Hoogduits being the most prominent one, present in the
texts of my corpus. This hypothesis is neither accepted nor rejected, but insufficiently tested.
There is, according to the results of my analyses, a common stylistic effect in the Dutch texts of my
corpus that are translations from Latin or classical Greek. This hypothesis is neither accepted nor
rejected, but insufficiently tested. Results seem to point to a possible stylistic effect of either translation
or remnants of a classical language.
There is, according to the results of my analyses, a stylistic signal of a language or accent other than
Dutch present in the Wilhelmus. This hypothesis is rejected.
There is, according to the results of my analyses, a stylistic effect based on type, genre or any other kind
of topical distinction coming from differences in language, present in the texts of my corpus.
There is, according to the results of my analyses, a stylistic difference between the prose and the poetry
of my corpus.
The Wilhelmus is, according to the results of my analyses, stylistically more similar to poetry than to
prose.
There is, according to the results of my analyses, a stylistic difference between the songs and
the poetry of my corpus.
The Wilhelmus is, according to the results of my analyses, stylistically more similar to songs than to
poetry.
There is, according to the results of my analyses, a stylistic effect based on a common type or any other
kind of topical characteristic of the beggars' songs or geuzenliederen, propaganda songs, songs of
comfort or troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen.
The Wilhelmus is, according to the results of my analyses, stylistically more similar to one of the different
song genres, namely the beggars' songs or geuzenliederen, propaganda songs, songs of comfort or
troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen.
The 0-corpus consists of 42 texts. I've included a lot of prose, coded as PROZA, which is the Dutch word for
prose, and a lot of poetry, coded as POEZIE, which is the Dutch word for poetry. Psalms are a separate type,
coded as PSALM, because I'd like to account for any possible effects based on their content or form, since the
bird's-eye view indicated that psalms might be a stylistically distinct group.
For now, I put songs and poetry under the same banner of POEZIE, sometimes with the addition of
SONG or LIED, indicating a song, at the end of the code. The Wilhelmus is a song, so my focus lies
on including similar works, preferably songs, in my corpus. At the same time, songs on paper and without
melody are very similar to poems. The distinction between (pop) songs and poems is a modern one, and not
applicable to the sixteenth century. I'll try to account for the distinction between songs and poems later on.
One text is uncertain in its nature, namely Coornhert's boeventucht, coded as
00_coornhert_Boeventucht PROZA-POEZIE. It's unclear whether to call it prose or poetry.
Analysis 2.2 uses character 4-grams and shows mostly the same results, especially for the big
divisions. The smaller branches are all gone. Looking at nearest neighbours rather than at different branches, the
relationships of the texts are mostly the same. The biggest difference is that Datheen's letter now joins the prose,
and at the same time the anonymous niederheinische liederen handschrift tags along and shifts over to the
prose cluster. Coornhert's boeventucht sides again with the prose texts. The four apologia are still close to each
other.
The other side of the graph becomes a lot more diffuse. Authorial signals are still visible but the rest of
the effects scatter over the right side of the graph. There is still some clustering of the psalms, as d'Heere joins the
other psalms. Small texts and Flemish texts seem to cluster on these characteristics. The 4-grams seem to
confirm the main effects but blur out the smaller effects.
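As a minimal sketch of what character n-grams as features amount to, the following counts overlapping character sequences of a given length. The normalisation (lowercasing, collapsing whitespace) and the function name are assumptions for illustration; the preprocessing actually used for the analyses is not described in this section.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    normalised = " ".join(text.lower().split())
    return Counter(normalised[i:i + n] for i in range(len(normalised) - n + 1))


line = "Wilhelmus van Nassouwe"
trigrams = char_ngrams(line, 3)
fourgrams = char_ngrams(line, 4)

# A text of k characters yields k - n + 1 overlapping n-grams.
print(sum(trigrams.values()), sum(fourgrams.values()))
print(fourgrams.most_common(3))
```

Because a 4-gram spans more characters than a 3-gram, each individual 4-gram is rarer, which is one plausible reason why smaller effects blur when moving from 3-grams to 4-grams.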
To clean up this corpus I excluded the Flemish texts and the texts that are smaller than or equal in size to the
Wilhelmus. I'm down to 33 texts, and this means more visibility. Deleting the clusters whose effect we'd already
established, but which are not the aim of these analyses, puts more weight on other possible effects within the corpus.
The same analyses were performed again. Prose and poetry each form their own cluster; the big division is
especially visible when using character n-grams as features. Over the whole scope of the corpus authorship
effects are visible. Coornhert's boeventucht clusters with prose again. Psalms also seem to cluster together, at
the very least the psalms of David. Important differences with the previous results are the clustering of the
Wilhelmus with Coornhert's prose, which it does twice, and with the prose of the prince, which it does once.
The Coornhert texts are very large (24 KB in txt format or 3871 words, and 22 KB in txt format or 3552 words), and
both prose texts of Willem van Oranje are 9 KB in txt format, which is about 1500 words, so the attribution cannot
be explained by size. When interpreting the graph up close you'll notice that all relations to the prose are just
nearest neighbour relations. These effects, occurring sporadically, are perhaps not that convincing. The
Wilhelmus seems to move around easily, which is a concern.
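To make the notion of a nearest neighbour relation concrete, here is a minimal sketch. It uses Manhattan distance between relative feature frequencies purely as an assumed stand-in: the distance measure behind the graphs is not stated in this section, and the profile names and toy strings are hypothetical.

```python
from collections import Counter


def relative(counts):
    """Turn raw feature counts into relative frequencies."""
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def manhattan(p, q):
    """Sum of absolute frequency differences over all features of both profiles."""
    return sum(abs(p.get(f, 0.0) - q.get(f, 0.0)) for f in set(p) | set(q))


def nearest_neighbour(name, profiles):
    """Return the other text whose profile is closest to that of `name`."""
    others = (other for other in profiles if other != name)
    return min(others, key=lambda other: manhattan(profiles[name], profiles[other]))


# Hypothetical toy profiles; single characters stand in for features.
profiles = {
    "text_x": relative(Counter("aab")),
    "text_y": relative(Counter("aabb")),
    "text_z": relative(Counter("bbbc")),
}

for name in profiles:
    print(name, "->", nearest_neighbour(name, profiles))
```

Every text always has a nearest neighbour, however distant. That is why a nearest neighbour link on its own is weaker evidence than a stable branch shared across analyses.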
Analysis 4 is again a collocations analysis, with 3-grams on word level. Culling at high levels was impossible
because of the limited number of features. Both the choice for collocations and that of 0% culling raise the
possibility of measuring semantic information rather than stylistic information.
The bottom branch combines 5 small poetry texts with one large poem, Coornhert's book of songs, and
prose of medium size by Hout and Willem van Oranje. The analysis provides little information. All but the
authorship signals of Fruytiers and d'Heere are as good as gone. The psalms tend to cluster in one branch,
along with a few other big texts. Prose and poetry cluster again, seemingly more than average, but the results are
not yet convincing enough to draw conclusions from. The graph as a whole is too fragmented and shows too little
detail. The poor results, the possible bias towards semantics and the overall shape of the graph lead me to reject
these results in my overall analysis.
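Why word 3-grams leave so few features to cull can be shown with a small sketch. The two sentences are hypothetical English toy examples, not texts from the corpus, and the function name is an assumption for illustration.

```python
from collections import Counter


def word_ngrams(tokens, n):
    """Count overlapping word n-grams (collocations of n consecutive words)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical toy sentences that differ in a single word.
text_1 = "the prince of orange is free".split()
text_2 = "the prince of nassau is free".split()

shared_words = set(word_ngrams(text_1, 1)) & set(word_ngrams(text_2, 1))
shared_trigrams = set(word_ngrams(text_1, 3)) & set(word_ngrams(text_2, 3))

print(len(shared_words), len(shared_trigrams))
```

A single differing word breaks every 3-gram that contains it, so two texts share far fewer 3-grams than single words. With texts of a few hundred words, hardly any 3-gram occurs in a large share of the corpus, which fits the observation that high culling levels were impossible.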
We've already seen that sampling with just 550 words gives poor results. Therefore I removed all texts smaller
than 1500 words, including the Wilhelmus, and set a sample size of 1500 words in order to test genre or type
effects. The analyses of 5 show us clear authorship effects and stylistic distinctions between poetry and prose.
Psalms seem to form a group together. Although defining psalms as a genre might actually be the practice of
interpreting results so that they fit my hypotheses, there's without a doubt a common, possibly topical, stylistic
trait in the psalms. Note that a lot of the psalms are psalms of David. They're different parts of the text, the first
ten psalms by one author and the next ten by another, so we're not just confirming different editions. None of the
psalmen davids overlap in text. Boeventucht shows itself again as a prose text. The graph of analysis 5
confirms my findings of analyses 1 to 4. When I add the Wilhelmus, it clusters with the psalms.
Bringing down the sample size to 1100 words in analysis 6 shows us roughly the same results, but the
graphs are more fragmented and less informative. The clusters are less distinct. This graph shows how signals
fade when the size of the samples or texts falls below a threshold.
2. Conclusions genre corpus 1 Alle genres LIED, PSALM, PROZA, POEZ
So far there seems to be a clear stylistic difference between prose and poetry. This effect already showed itself in
different versions of the corpus and with different sets of parameters. I expect to see this effect in other corpora as
well. There are clues that there is also an effect for psalms, but this hypothesis has not been tested enough.
We've seen small texts cluster and Flemish texts group together, and have excluded both groups to get rid of
their noise. Coornhert's boeventucht, although partly poetry, is stylistically identified as prose. Other, perhaps
smaller, genre effects, like that of the speeches, are only visible sporadically. The Wilhelmus doesn't seem to
relate stylistically to other songs and moves around the graph easily, which is a concern. The corpus also
confirmed the conclusions of the language analyses. There were author signals and language signals for Flemish
texts. Analysing samples of 1500 words showed good, clear results. Bringing down the sample size to 1100 words
definitely did some damage to the visibility of the presumed effects in the graph, but it did show us roughly the
same results, only more fragmented.
3. Cancelling hypotheses
Based on the previous corpus and the corpora of the language hypotheses I abandon the attempt to make further
distinctions between poetry and songs, as well as my initial plan to test 6 different types of songs. The problem
with the dropped hypotheses is that they require categorization based on very ambiguous characteristics.
I've already mentioned that the primary distinction between poetry and song, although obvious in the
here and now, is rather blurred in the 16th century, since both depended heavily on oral transmission and the
difference lies predominantly in the way the text is supposed to be performed. The importance of making this
distinction is pretty low, since we already know that the Wilhelmus is a song. On the other hand, if there's a
stylistic difference between poems and songs, I should be aware of this, so I can take it into account when
measuring authorial signals. My corpus, however, consists of so many songs that testing this distinction is not
methodologically valid for my research. I'll pay attention to the behaviour of the poems, so if they do give out a
stylistic signal, which they haven't in the slightest so far, I'd pick it up and reinstate this hypothesis.
Distinguishing the different genres would, apart from the songs with a genre indication in their name, require
the researcher to define measurable characteristics per genre and to allocate the texts according to these
standards, which won't be mutually exclusive. These measurable characteristics should come from a theoretical
framework, or some general consensus, on why they are typical for a certain genre. Remember the genres with which
the Wilhelmus was labelled: they were sometimes pretty analogous. When the genres were opposites, they were so with
respect to their goal, forgiveness or incitement, but still on some levels similar enough to both be attributed to
the same song. This will amount to, even with the consumption of huge quantities of secondary literature, vague or
ambiguous categories, probably highly dependent on context.
Then there's the availability of texts. Some of the genres are pretty rare, if real at all. The different
genres or functions of these songs are part of the Wilhelmus research discipline, determined after the conception
of the song. Their labels were not pre-existing genres that the still anonymous poet of the Wilhelmus could pick
from. It's not certain that there's a conceptual difference between, for example, songs of praise and songs of
encouragement and the other types of songs, some of which are very difficult to translate (pardoenliederen). To
determine stylistic effects over different genres of songs, one has to have a justifiable corpus of every genre.
Assuming these genres are stylistically different and measurable, the Wilhelmus would have to actually be one of
the types, and be long enough to signal the stylistic properties of the genre. Knowing now how sensitive and
delicate the effects in this corpus are, we would need to exclude other effects like time and place of birth, author,
size, dialect and so on. I've complained several times about the difficulties of constructing the relatively easy
corpora I'm testing now. This alerts me to the whale of a task that forming the 6 genre corpus must really be, if
possible at all. I'll let go of this hypothesis.
Now that I've cut some hypotheses and specified my corpus a bit further, it's time to reaffirm the goals of my
analyses. I aim to test 3 things on the next 3 corpora: whether prose and poetry have distinctly different stylistic
signals, whether psalms and prose have distinctly different stylistic signals, and whether psalms and other poetry
have distinctly different stylistic signals.
4. Analyses genre corpus 2 poezie vs. proza
First, I test stylistic differences between poetry and prose. For the genre corpus 2 poezie vs. proza, now
consisting of 14 texts, I exclude the psalms along with the earlier excluded Flemish texts and the texts shorter
than three times the Wilhelmus. The psalms will be added later on, when I'll try to measure the effects of the
psalms against those of prose or poetry. The first corpus, however, focuses on the effects of prose against poetry,
and therefore the psalms are excluded. I perform different analyses, varying features and culling, and always
sampling with a sample size of 1500 words.
All three analyses show effects. Poetry and prose seem to have different and distinguishable stylistic signals,
which cause prose and poetry texts to cluster in their own groups and divide the corpora into a prose and a
poetry section. This effect is strong enough to establish itself at a sample size of 1500, thereby not only verifying
the existence and strength of the signal but also indicating that the sample size is adequate.
These results come as no surprise, as we've seen the distinction between prose and poetry throughout
all my analyses, and during the tests on my genre corpus it definitely established itself as a pattern. There are some
aberrations from the pattern, but this too is to be expected. However, the recurring split of the corpus
between prose and poetry, across all kinds of parameters and on different corpora, one specially
designed to test this effect, gives me the confidence that there is indeed a stylistic difference between poetry and
prose, that my methods can and do pick up on this effect, and that a sample of 1500 words is big enough to show
this effect.
authorship effects while others were represented by only one or two texts. Lucas d'Heere in particular, but also
Johan Fruytiers, was too prevalent in my corpus; therefore I combined the 2 smallest poems/songs of d'Heere in
one file and the 2 smallest poems/songs of Fruytiers in another file. To make sure author effects can't overrule
possible genre effects, I also exclude d'Heere's psalms from my corpus. I defined Coornhert's book of songs,
which must be performed to the melody of the psalms, as an open case, as it's not a stereotypical psalm. On a
corpus without prose, balanced for text length and compensated for possible authorial signals, I perform an
analysis that intends to capture possible stylistic differences between psalms and poetry.
The graph shows that the biggest division is between psalms and poetry. However, the number of texts in
the corpus has declined significantly, so that the honour of being the biggest division depends on a margin of
just one text. The margin would be bigger if Coornhert's book of songs were categorized as songs/poetry.
Fruytiers' ecclasius repeatedly joins the psalms. Although the results are less convincing than those of the
two previous analyses, between prose and poetry and between prose and psalms, there seems to be a tendency
for psalms to cluster together. Based on these and previous results I conclude that, although the distinct stylistic
signals of poetry and psalms are not yet sufficiently proven, research performed with a similar design,
similar goals and a similar text size as my own is well advised to consider these effects probable. I'll definitely
follow my own advice in my authorship attribution experiment. I dare to speak of a delicate but conceivable
stylistic effect for the genre psalm in comparison with other poetry and songs, one that manifests itself with
specific methods under well-tuned parameters on a heavily prepared and balanced corpus of clean texts. If these
requirements are met, I believe more types of poetry can be distinguished. Psalms and other poems can have
stylistic differences, although these effects may be based on topical or other semantic characteristics.
genre, that has common stylistic properties and that will manifest under the described conditions. However, I
deem these effects not yet sufficiently proven to confirm them as true. I consider them probable and
supported by the results of my analyses so far. This is why I advise accounting for these possible stylistic effects
of psalms, as I will do myself.
- I attribute no characteristics to the Wilhelmus, as it hasn't shown any consistent effects.
- The other open case, Coornhert's boeventucht, resembles prose stylistically, although close reading tells us it
starts with a rhyme scheme before gradually losing its rhyme. It is also the first of Coornhert's prose texts, and
the first prose text in general, to leave the prose cluster when other effects are present, suggesting that the prose
signal of boeventucht is the weakest of the prose texts in my corpus, probably due to its mixed nature.
- Methodologically these results mean that stylistic effects for poetry, prose and psalm have their influence on the
forming of the graphs. This is an important consideration when building the corpora and interpreting the results.
- Authorship signals can overrule genre effects, especially when texts are equivocal regarding their genre or type.
- Analysing samples of 1500 words showed good, clear results. Bringing the sample size down had a
negative influence on the clarity of the genre signals and the readability of the graph.
- Character 3-grams and character 4-grams were very instrumental and gave a lot of information, and are to be
considered a sound alternative and addition to features on word level. In comparison, character 3-grams
show more detail and smaller effects than character 4-grams.
This leads to the following evaluation of the hypotheses:
There is, according to the results of my analyses, a stylistic effect based on type, genre or any other kind
of topical distinction coming from differences in language, present in the texts of my corpus. This
hypothesis is accepted.
There is, according to the results of my analyses, a stylistic difference between the prose and the poetry
of my corpus. This hypothesis is accepted.
The Wilhelmus is, according to the results of my analyses, stylistically more similar to poetry than to
prose. This hypothesis is neither accepted nor rejected, as the Wilhelmus hasn't shown any consistent
effects and I consider its characteristics insufficiently analyzed.
There is, according to the results of my analyses, a stylistic difference between the songs and the poetry
of my corpus. This hypothesis is cancelled.
The Wilhelmus is, according to the results of my analyses, stylistically more similar to songs than to
poetry. This hypothesis is cancelled.
There is, according to the results of my analyses, a stylistic effect based on a common type or any other
kind of topical characteristic of the beggars' songs or geuzenliederen, propaganda songs, songs of
comfort or troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen, present in the texts of my corpus. This hypothesis is cancelled.
The Wilhelmus is, according to the results of my analyses, stylistically more similar to one of the different
song genres, namely the beggars' songs or geuzenliederen, propaganda songs, songs of comfort or
troostliederen, songs of departure or afscheidsliederen, songs of encouragement or
bemoedigingliederen, songs of incitement or opwekkingsliederen and songs of mockery or
spotliederen. This hypothesis is cancelled.
Author hypotheses
My final specialized corpus revolves around the main question of my thesis: who is the author of the
Wilhelmus? My aim is to map all the authorship effects, determine their strength, look for unexpected results and
attribute the Wilhelmus. In order to do so I propose the following sub hypotheses.
There is, according to the results of my analyses, a stylistic fingerprint based on authorship, or a
stylistic authorial signal.
The Wilhelmus consistently shows more stylistic similarity to the texts of one particular author than to the
texts of other authors, and the texts of this author consistently show an authorial signal by expressing
a similar style.
this means that a lot of the anonymous texts group together; this effect could also arise because they're from the
same anonymous author.
Marnix and d'Heere are examples of authors whose signal is apparently strong enough to group their
prose and poetry together. Marnix, Fruytiers and Coornhert are examples of authors whose authorship signals are
in some cases overruled by poetry and prose effects. Van der Noot, an author with southern influences, clusters
separately, but this could be just his authorship signal and not a language signal. The Wilhelmus forms
connections with the prose of Willem of Nassau. There are other possibly noteworthy effects, but let's see if they
recur under different conditions.
The graphs of the successive analyses 1.1 and 1.2 share roughly the same results, with no big divisions but
some small clusters. At culling levels from 0% to 100%, about a quarter of the texts find too little stylistic
connection to form trees, probably because the features that appear in less than 20% of the texts are
stylistic rather than semantic determinants.
The effects we do see are similar to those before. There are clusters based on authors, prose, poetry and
psalms. The two included Wilhelmus versions become each other's nearest neighbours but show no other
interpretable effects.
When I varied the feature type to collocations in order to get more results, the limited number of words made
analysis impossible. Using character n-grams solved this, because they give me many more features to analyse a
text on. As explained in the theory, n-grams measure stylistic properties but have semantics as by-catch. Also,
when using character n-grams, raising the culling of the analyses has less impact than with words, suggesting
that the correction for topic has already been performed.
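Culling itself can be sketched in a few lines. The sketch assumes the usual definition, in which a feature is kept only if it occurs in at least the given percentage of the texts in the corpus; the function name and the toy texts are hypothetical and only serve as illustration.

```python
from collections import Counter


def cull(doc_features, percent):
    """Return the features that occur in at least `percent` % of the documents."""
    doc_freq = Counter()
    for features in doc_features.values():
        doc_freq.update(set(features))  # count each feature once per document
    threshold = percent / 100 * len(doc_features)
    return {feature for feature, df in doc_freq.items() if df >= threshold}


# Hypothetical toy texts; each maps features (here: words) to counts.
docs = {
    "text_a": Counter("de prins van oranje".split()),
    "text_b": Counter("de prins ende heer".split()),
    "text_c": Counter("de heer van al".split()),
}

for level in (0, 50, 100):
    print(level, sorted(cull(docs, level)))
```

At 0% every feature survives, including topic-bound words that occur in a single text. At 100% only features shared by all texts remain, which tend to be function words. Short character n-grams are shared by far more texts than whole words are, which would be consistent with culling having less impact on them.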
The results for analyses 2 to 2.3 again show no big division, but they do show large clusters this time.
There are still texts with too little stylistic resemblance to other texts to join a tree, but far fewer than in the
previous analyses on this corpus. Also, the texts that do not cluster now have nearest neighbours that are often very
explainable. Authors cluster, prose and poetry find their own branches, and so do the psalm texts.
The Wilhelmus seems to cluster sporadically with Willem van Oranje. This attribution is far too inconsistent, as it
seems almost accidental. Note that almost all authors have some link to the Wilhelmus, so I perform further
analyses until it consistently clusters with an author, before I can draw conclusions.
The Wilhelmus forms connections with the prose of Willem of Nassau, but these are inconsistent and easily
disconnected.
of the analysis with character 3-grams as features and without culling limitations; however, Marnix joins this tree
with his texts, except for his songs of praise. Reael doesn't show any consistent stylistic relations, signals or
similarities over the analyses so far. His texts don't cluster together.
These graphs show a lot of pretty strong authorship signals, and they actually come out pretty firm. The
songs that do not show authorship effects and act against my hypotheses and expectations, or the texts that do
not connect to other texts in general, are almost always very small texts. Reael with both songs, Sterlinx, Haecht,
the dissident of Marnix and the dissident of Coornhert, and the two anonymous songs helpt uzelf zo helpt u god
and ras der provincien, are examples of short texts that have not shown a consistent stylistic profile up until
now. Van der Noot's vrijage, on the other hand, is a pretty long text and, more importantly, of average length for
the included texts of Van der Noot in the corpus, but in graph 2.3 it moves away from the other Van der Noot
texts. Blaming the weak stylistic signals completely on text size doesn't apply here. In the next analysis I'll
sample to gain more information about these short texts and their lonerism.411
The results for bag-of-words (BOW) samples of 450 words turn out to be less informative. There are fewer trees
and branches, and the results are weaker, less defined and hardly visible. At the same time, these results
(somewhat) challenge my expectations, and therefore I should take them seriously and be critical of my own
observations, in order to make sure that they do not stem from self-fulfilling prophecy or cherry picking. This is
exactly why it is important to judge a graph not only on the effects it shows, but also on whether it shows
effects at all, and whether these can appear even if they go against your expectations.
The results are, however, mainly just a bad performance of the analysis. Its not that my desired effects
arent showing, its that therere hardly any effects here, including some weve already established and some that
were certain about. Even so, therere still some stylistic signals that could be interpreted as authorship signal,
some of them still clearly visible with BOW samples of 450 words. The dissidents of Marnix and Coornhert remain
distant from the other texts of their authors, who still form clusters. Based on these results I conclude that these
clusters do not form because of a similar text size but because of their similar stylistic signal. In consequence of
this, the two dissidents should be considered as stylistically less similar as the rest of the texts of their authors.
The texts of dHeere and the texts of Fruytiers still cluster together with texts of their own author.
These results indicate that authorship effects are measurable and visible even at these text lengths,
even if they're less visible because the graph is messy and only the strong effects remain. This puts the lack of
results for the Wilhelmus in a very specific light, and it also keeps me from deleting certain texts, because I need
to perform further analyses.
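A minimal sketch of what a bag-of-words sample amounts to, assuming that the sample is drawn at random and without replacement from the tokens of one text, so that word order is ignored. The function name, the fixed seed and the toy text are illustrative assumptions and are not taken from the analyses themselves.

```python
import random
from collections import Counter

def bow_sample(tokens, sample_size, seed=0):
    """Return sample_size tokens drawn at random without replacement."""
    if sample_size > len(tokens):
        raise ValueError("text is shorter than the requested sample size")
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    return rng.sample(tokens, sample_size)

# Hypothetical toy text; a real run would tokenize a corpus file instead.
tokens = ("wilhelmus van nassouwe ben ick van duytschen bloet "
          "den vaderlant ghetrouwe blijf ick tot inden doet").split()

sample = bow_sample(tokens, 10)
print(len(sample), Counter(sample).most_common(3))
```

Because the draw is random, a text that is barely longer than the sample size contributes almost all of its words, while a long text contributes only a small fraction. This is one reason why short texts can behave differently under sampling.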
411 A term taken from the title of the second album of the band Tame Impala. It refers to being a loner,
or relatively isolated.
For the next analysis I set the sample size to 1650 words, a sample size that has already tested positively, but I
now use character n-grams, of which more are available. The results seem even more diffuse with character
n-grams than with words, so I set the sample size to the maximum my corpus allows, 2500 character
4-grams. Graphs 3.3 and 3.4 indeed show more results than before and give me about the same effects as the
previous analyses on this corpus. One Coornhert text travels a bit across the graph; it is not the conjunct text but his
Ulyses, something I haven't seen until now. Songs from the geuzenliedboek form a tree with each other,
meaning that they do have stylistic similarities and that these aren't based on their length but on something else,
perhaps a common author; at the very least we know that they have a common publisher, which might be the
cause. The Wilhelmus again shows affiliation only to itself.
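A minimal sketch of why character n-grams yield more features than words, assuming overlapping n-grams that include spaces. The helper name and the toy line are illustrative assumptions; the tool used for the analyses may tokenize differently.

```python
from collections import Counter

def char_ngrams(text, n):
    """Return all overlapping character n-grams of text, spaces included."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

line = "wilhelmus van nassouwe"   # toy input, not a corpus text
words = line.split()
grams = char_ngrams(line, 4)
counts = Counter(grams)           # frequency table of the 4-grams

print(len(words), "words versus", len(grams), "character 4-grams")
```

Every character position starts a new n-gram, so even a very short text produces many more n-gram tokens than word tokens.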
Authorship effects are present and visible even with sample sizes of only 550 words, and also when my corpus is
unbalanced with regard to text size. The results are not always clear, and probably not always correct either. I'm still
convinced that a lot of subtle effects are suppressed by all sorts of methodological limitations and by the limitations of
the corpora. However, if the author of an anonymous song is included in the corpus with a couple of
representative texts, a successful authorship attribution is possible and maybe even likely. I base this on the fact
that a lot of texts with equally poor credentials still cluster together with other texts by the same author. The
Wilhelmus, however, has given us no consistent, convincing signs so far. Further analysis is required.
psalms in earlier analyses, but this is no guarantee that its nature doesn't essentially differ from other poetry. It just
tells us that it didn't have the common stylistics that the other psalms had.
The graphs show obvious authorship effects. These have become a very consistent and believable pattern
throughout the analyses on the author corpora, and also in the other analyses. There are some exceptions to the
rule, as in the previous analyses, one being Reael. Reael clusters in a third group consisting of the occasional
Coornhert text, two dissident Fruytiers songs, the Marnix dissident, Voort, Haecht, the Wilhelmus, Sterlinckx and Van
Damme, in which the texts don't form many relationships or clusters and the results are not very distinct or
plausible. The Wilhelmus has the songs of praise of Marnix as nearest neighbour twice and the
joined file of Coornhert twice. Other effects in this leftover tree are Voort joining the Fruytiers texts, and Reael's een
ander lied clustering with Van Damme, sometimes accompanied by Sterlinckx. They're all songs from the
geuzenliedboek. The other Reael text has no clear affinity with other texts. The Haecht text doesn't make any
connections.
When looking at the size of the texts, it became clear that small texts are the least likely to stick to the
connections they make and move around the most. The Wilhelmus, Ter liefden van een Maghet en het
Baanderherenlied, the songs of praise by Marnix, een nieu liedkje of Reael, the text of Haecht, the Sterlinckx
text and the other song by Reael are examples of this. There are, however, enough effects involving texts of very limited
size that prove reliable and consistent under changing parameters and different corpora.
Based on previous attempts, I assumed that sampling would not change the overall view of the graph. I
verified this assumption with a quick analysis. The only notable fact was that the Wilhelmus clustered two more
times with the Marnix texts.
candidates in the canon of Wilhelmus authorship research. Based on my own results as well as previous
research, I'll explore this option and form a corpus called the Marnix Coornhert corpus, with texts by either
Marnix or Coornhert, in addition to the Wilhelmus itself, to see whether our national anthem has a preference and
whether this seems to be a robust effect.
Notable inclusions are the psalms and both texts of the joined text Ter liefden van een Maghet en
het Baanderherenlied separately, while the joined document is also kept in the corpus. I hope to gain some
insight into the changes in signal when combining texts in one file. The corpus consists of 14 texts.
Looking at the graph of the first analysis, it shows a strikingly large amount of information, including both a big
division and a lot of smaller clusters and detail. Interpretation confirms previous results: the distinct signals of prose
and poetry and the authorship effects, among others. After modifying the levels of culling, as in analysis 1.1, the
prose/poetry distinction gets stronger but the details seem to fade away a bit.
The Wilhelmus clusters, in graph 1, with the poetry of Marnix, but also stays close to a cluster of
Coornhert poetry, including Ter liefden van een Maghet and Baanderherenlied as well as the combined file.
With features on word level, these texts seem to find each other easily, meaning that the combined text is
representative of both independent texts and vice versa.
For analysis 1.1 I deleted both individual texts of the combined document of Coornhert, along with the first
psalm of Marnix. As we can see in the graph, in this corpus the Wilhelmus binds itself to the combined document
of Coornhert, but only when culling. Without culling the Wilhelmus remains with the poetry of Marnix. So
balancing the corpus, by deleting the texts that I did, made the Wilhelmus switch to another text by another
author, making its attribution highly doubtful. This assertion is confirmed by analysis 2, in which the Wilhelmus
joins a cluster with both Marnix and Coornhert texts.
Adding different levels of culling in analysis 2.1 strips the graph of most effects. This illustrates the
enormous sensitivity of this body of text to changes in the parameters, and also how thin the branches of the graph
are and how unstable the results of the analyses are.
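A minimal sketch of culling, assuming the common definition in which a culling level of X% keeps only the words that occur in at least X% of the texts in the corpus. The function name and the three toy texts are illustrative assumptions, not the settings or data of the analyses.

```python
def cull(doc_tokens, culling_percent):
    """Keep the words that occur in at least culling_percent % of the texts."""
    n_docs = len(doc_tokens)
    vocab_per_doc = [set(tokens) for tokens in doc_tokens.values()]
    vocabulary = set().union(*vocab_per_doc)
    kept = set()
    for word in vocabulary:
        present = sum(1 for vocab in vocab_per_doc if word in vocab)
        if 100.0 * present / n_docs >= culling_percent:
            kept.add(word)
    return kept

# Hypothetical toy corpus of three very short texts.
texts = {
    "A": "de heer is mijn herder".split(),
    "B": "de prins is mijn heer".split(),
    "C": "de geuzen zijn op zee".split(),
}

for level in (0, 50, 100):
    print(level, sorted(cull(texts, level)))
```

The feature list shrinks quickly as the culling level rises, which is in line with the observation that higher culling levels remove detail from graphs built on short texts.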
Analysis 3 shows strong authorship signals again, but the Wilhelmus still doesn't show its colours.
Excluding the Wilhelmus from my corpus changes little about the general layout of the graph, but there's one
important change: the combined text of Coornhert and Marnix's psalms, which showed stylistic resemblance
when the Wilhelmus was included, no longer show this similarity to each other when the Wilhelmus is
removed. They move in opposite directions, both to the wrong author. We can take away from this that both texts
have an atypical style compared with the rest of their authors' texts, and that they are stylistically similar
to the Wilhelmus but not so much to each other.
I'll try to balance my corpus a little more by removing every bit of noise I'm able to remove, leaving my corpus
at 9 texts. This is of course still my third corpus, the Coornhert vs. Marnix corpus, of my third hypothesis, the
authorship hypothesis, but a more balanced second version of it, called the Marnix Coornhert balance corpus.
The corpus is now balanced in such a way that both Coornhert and Marnix are represented by 4 poetry texts,
all bigger than the Wilhelmus. Three of the Marnix texts are a little over 550 words and the other one, his Psalms
Davids, is the biggest of the entire corpus, at least ten times as big as the Wilhelmus. Coornhert's combined text
is about as big as the Wilhelmus, while the other three are about half the size of Marnix's Psalms Davids. Even though
we're still debating the extent to which size is responsible for effects (by now we know it can be responsible for the
absence of effects), the corpus is now well balanced for size. When interpreting the results, I'll take size into
consideration, despite the extensive balancing, just to be sure.
In analysis 1.0 there are two ways to interpret the main division. The first, authorship effects with one
deviation on either side (the Psalms Davids of Marnix and the combined text of Coornhert), has been established
and described in previous analyses. The second interpretation is that the division separates the 4 biggest texts from
the 5 smallest texts, a division that coincides with the greatest relative gap in text size, a factor of 5, between the
biggest small text and the smallest big text.
When we use character 3-grams as features instead of words, the number of features goes up and, as
we've already seen, a growth in features changes the results significantly if the number of features is under a threshold.
Graphs 2.0, 3 and 3.1 show us clusters by author and by type. The Wilhelmus finds its familiar cluster with
the two dissident texts, which differ enormously in text size. The other texts show some authorship signals, but
some texts do not cluster according to author. An explanation of the three trees based on text size, however,
seems implausible.
To further control for effects based on size, I use random sampling in analysis 4. The graph now
becomes very different: there are big divisions and lots of detail. The Wilhelmus still clusters with the combined text of
Coornhert, and this cluster is close to the cluster of Marnix's psalms, where the other text that repeatedly clusters
with the Wilhelmus resides. Authorship signals are scarce and weak in graph 4 and in graph 4.2, but in the graph
of analysis 4.1 the authorship signals are a bit stronger.
The results are again inconclusive. The Wilhelmus tends to connect the most with a Coornhert poetry text that
consists of two separate texts, but the authorship signals are unconvincing because of the Wilhelmus's affiliation
with a Marnix text and because both texts that the Wilhelmus clusters with are atypical for the
author that wrote them. In addition, we've seen the Wilhelmus move around the graph when the
basic parameters are shifted. The results concerning the Wilhelmus are uncertain.
The results of the analyses on corpora 3.1 and 3.2 show us the presence of authorship effects, but also the
relativity of their presence, as they are far from giving a full explanation of the effects in the graphs, and
alternative explanations might sometimes be just as believable. The Marnix vs. Coornhert corpora are not the
best corpora for testing authorship effects, because a corpus of only 9 texts by only 2 authors already becomes
problematic when you've got an open case and 2 dissidents, so that a third of your corpus already
disconfirms the expectations. Earlier analyses, including more authors and more texts, have consistently shown
us reliable authorship effects.
The Wilhelmus is inconclusive in its authorial signal. In analyses that do show authorship signals, the Wilhelmus
still doesn't seem to favour one author. At most, it favours texts. The Wilhelmus seems to share stylistic
characteristics with the conjunct text of Coornhert, consisting of the songs Ter liefden van een Maghet and
Baanderherenlied, and also shows stylistic similarity to the songs of praise originating from Marnix's Psalms Davids.
The Wilhelmus clusters more often with the Coornhert text than with the Marnix text. The two texts share some
characteristics only with the Wilhelmus and not with each other, while they also seem to be atypical for their authors'
style. Excluding the Wilhelmus shows both dissidents being pulled away from each other and joining the
opposite team, suggesting that it was not the Wilhelmus pulling them out of the author clusters where they
belonged, but rather the Wilhelmus finding them somewhere in between a Marnix and a Coornhert signal. This could
also mean, and this seems to be the most straightforward interpretation, that all three texts are pushed into one
cluster because they do not belong to either of the author clusters. However, under certain parameters, the
songs of praise of Marnix cluster pretty convincingly with other parts of the Psalms Davids of Marnix.
The main conclusion is that the results that are convincing in this corpus do not involve the Wilhelmus,
which seems to be a stylistic outsider. Balancing my corpus did not change this. The results are again inconclusive.
The Wilhelmus does not have any text in the corpus with which it convincingly clusters. The connection to
Coornhert's poetry is not stable enough to base any real attribution on.
Distance measures
Although this isn't a trial-and-error exercise, in which you can vary settings and parameters without a
theoretical or practical basis for doing so, by now most of the parameters have been varied. Any further refining of my
corpus is highly unlikely to yield any progress in determining the author of the Wilhelmus. At this point I need to
consider changing settings that I wouldn't change before, without losing sight of the theoretical basis.
As discussed in the theory, different distance measures show us different characteristics of a text. I
stand behind my decision for the Burrows delta for practical reasons and on firm theoretical grounds. However, since
other measures are easily applicable and the results so far are inconclusive, I'd like to get a brief impression of their
possibly different effects. I can't draw any conclusions from the results, as this would be a form of cherry-picking, but
I'd like to explore the further options by generating new results and explaining them by the grace of the theory of the
distance measure, not the attribution of the text.
I test this on the author corpus 2 auteur balans, counting 22 texts, designed to find the author of the
Wilhelmus, and therefore removing signals other than authorship signals, without deleting improbable authors.
The first alternative distance measure is the Euclidean distance. Remember that, in comparison with the
Burrows delta, it gives a lot of influence to the very most frequent words, while the influence of the rest of the
MFW is marginalised.
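A minimal sketch of this difference, assuming that the Euclidean distance is computed on raw relative frequencies and that the Burrows delta is the mean absolute difference of z-scores. The three words, the three texts and all frequencies are hypothetical toy values, chosen only to make the weighting visible.

```python
import math
from statistics import mean, stdev

# Hypothetical relative frequencies (in %) of three words in three texts.
WORDS = ["de", "ende", "vroom"]            # ordered from most to least frequent
FREQS = {
    "T1": [6.0, 2.0, 0.1],
    "T2": [5.0, 2.5, 0.3],
    "T3": [4.0, 3.0, 0.2],
}

def zscores(table):
    """Standardize every word column over the whole corpus."""
    columns = list(zip(*table.values()))
    mus = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]      # sample standard deviation
    return {name: [(x - m) / s for x, m, s in zip(row, mus, sds)]
            for name, row in table.items()}

def euclidean(a, b):
    """Euclidean distance on raw relative frequencies."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def burrows_delta(za, zb):
    """Mean absolute difference of z-scores."""
    return mean([abs(x - y) for x, y in zip(za, zb)])

Z = zscores(FREQS)
a, b = FREQS["T1"], FREQS["T2"]
za, zb = Z["T1"], Z["T2"]

# Share of the total distance that is due to the most frequent word "de".
share_euclid = (a[0] - b[0]) ** 2 / sum((x - y) ** 2 for x, y in zip(a, b))
share_delta = abs(za[0] - zb[0]) / sum(abs(x - y) for x, y in zip(za, zb))

print(round(euclidean(a, b), 3), round(burrows_delta(za, zb), 3))
print(round(share_euclid, 2), round(share_delta, 2))
```

In this toy table the most frequent word accounts for most of the squared Euclidean distance between T1 and T2, but for only a quarter of the delta, because the z-scores put every word on the same scale.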
Looking at the graph, it actually shows us pretty sensible results. There are distinct branches, and there's
detail within those branches. It shows authorship effects for all included authors, even Reael. It also shows us
clusters of texts from the same book; see the Fruytiers, Van Damme, Voort and Hout branch, all stemming from
the geuzenliedboek. According to this graph the Wilhelmus has the most stylistic resemblance to the Reael texts.
This could be another geuzenliedboek cluster, and it could also be a cluster based on size. However, there are
other geuzenliedboek texts and other short texts present in the corpus that don't cluster with the anthem. Voort,
Sterlinckx and Van Damme are all very short and from the geuzenliedboek, but they don't cluster with the Wilhelmus
like the Reael texts do.
I'd like to note that in this case the use of a distance measure that tyrannically prefers the top
frequent words gives me an interpretable and sensible graph that shows a lot of different results, big and small.
Perhaps on these very short texts, words at the bottom of the list pollute the results. We already saw that
culling, which also reduces the influence of the MFW at the bottom of the list, often showed similar effects.
The second distance measure is Argamon's linear delta, which was devised as an alternative to the Burrows delta and
has not been discussed in the theory. It's based on Euclidean principles, so its mathematical behaviour should be
considered accordingly.
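A minimal sketch of Argamon's linear delta, assuming the usual description of it as a Euclidean distance computed on z-scores instead of on raw frequencies. Implementations may additionally scale the result by the number of features, which is left out here. The toy frequencies are hypothetical.

```python
import math
from statistics import mean, stdev

# Hypothetical relative frequencies (in %) of three words in three texts.
FREQS = {
    "T1": [6.0, 2.0, 0.1],
    "T2": [5.0, 2.5, 0.3],
    "T3": [4.0, 3.0, 0.2],
}

def zscores(table):
    """Standardize every word column over the whole corpus."""
    columns = list(zip(*table.values()))
    mus = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]      # sample standard deviation
    return {name: [(x - m) / s for x, m, s in zip(row, mus, sds)]
            for name, row in table.items()}

def linear_delta(za, zb):
    """Euclidean distance between two vectors of z-scores."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(za, zb)))

Z = zscores(FREQS)
for first, second in (("T1", "T2"), ("T1", "T3"), ("T2", "T3")):
    print(first, second, round(linear_delta(Z[first], Z[second]), 3))
```

Because the differences are squared before they are summed, a single large standardized difference weighs more heavily here than in the Burrows delta, which sums absolute differences.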
Looking at the graph, what immediately draws the attention is the big division. Both trees also show
smaller, more detailed branches. Authorship effects seem to determine a great deal of the graph, but they can't
explain the major split. It separates 3 authors from the rest, who are also bound to texts by the same author,
except for Reael, whose texts reside in the same region but aren't nearest neighbours. So with Argamon's
Euclidean-based measure, Reael has now lost some of his authorial signal in comparison with the pure Euclidean
distance. One tree groups Marnix, Coornhert and Hout with the Wilhelmus, so the two most important
candidates for authorship are in the same tree as what is potentially their masterpiece. Marnix clusters the most with
the Wilhelmus, first of all with his songs of praise, which is not uncharacteristic for the rest of my analyses, but also
with the rest of his texts. This doesn't by a long shot mean that the Wilhelmus was written by Marnix, as Hout also
behaves as if it was written by Coornhert. It does mean that, based on these parameters and with Argamon's linear
delta, Marnix's work, especially his songs of praise, is the most stylistically similar to the Wilhelmus of all texts in this
corpus, which is designed to include as many possible authors and stylistically similar texts as possible. It also means
that Argamon's linear delta can't be excluded as a useful distance measure, as it seems to capture a lot of
authorship effects.
The third distance measure is the Manhattan distance. In contrast to the distance measures based on
Euclidean principles, the Manhattan distance is less biased towards the top of the MFW. It still gives
decisive importance to the 10 or 20 MFW, so it's far less democratic than the Burrows delta.
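A minimal sketch of this intermediate position, assuming that both the Manhattan and the Euclidean distance are computed on raw relative frequencies. The toy frequencies are hypothetical; the point is only the share of the distance that is due to the most frequent word.

```python
# Hypothetical relative frequencies (in %) of three words in two texts,
# ordered from the most frequent word to the least frequent word.
T1 = [6.0, 2.0, 0.1]
T2 = [5.0, 2.5, 0.3]

def manhattan(a, b):
    """Sum of absolute differences of the raw frequencies."""
    return sum(abs(x - y) for x, y in zip(a, b))

def top_word_share(a, b, power):
    """Share of the summed |difference| ** power that is due to the first word."""
    terms = [abs(x - y) ** power for x, y in zip(a, b)]
    return terms[0] / sum(terms)

share_manhattan = top_word_share(T1, T2, power=1)   # absolute differences
share_euclidean = top_word_share(T1, T2, power=2)   # squared differences

print(round(manhattan(T1, T2), 2))
print(round(share_manhattan, 2), round(share_euclidean, 2))
```

Without squaring, the most frequent word still dominates, but less strongly than under the Euclidean distance. A z-score based measure such as the Burrows delta removes this dominance by standardizing every word first.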
Looking at graph 2 manhattan, it shows us a lot of the same effects as graph 1.2 argamon, but now
a little less distinct. Again there's an affiliation of the Wilhelmus with Marnix, and especially with his songs of praise,
while it also shows some relation to Coornhert and Hout. The texts of Hout are also songs of praise, which
explains their stylistic similarity to Marnix, but not their even closer affiliation with Coornhert or the Wilhelmus.
I'd like to suggest an interpretation that's highly speculative and which I won't state as factual. Let's make the
assumption that the Wilhelmus is a song of praise, just like Marnix's and Hout's texts. This is not a very wild
statement, as the national anthem is obviously in celebration of country and God, and implicitly self-celebrating, as
the text is sung from the perspective of the Prince, who is revered as just and righteous. Coornhert's joined text
can be interpreted as a song of comfort, a song of departure and a song of exile. The songs from Coornhert's
book of songs are mostly religious songs, full of praise, mercy, subjection and comfort. The same goes for
Marnix's poem Den verstrooiden Nederlandschen gemeenten Jesu Christi. The joined file from Marnix consists of
songs of praise and religious songs. These themes are also present in the Wilhelmus (one of the cardinal
questions is particularly about these themes), so perhaps their similarity is topical. A counterargument is that,
if the attribution is indeed topical and not stylistic, the coincidence of perfect clusters according to author
seems unlikely. Also, my corpus was designed to have similar themes and, considering the times, most selections
of Dutch songs would bring forth such correspondence. The final nail in the coffin of this hypothesis, apart from
my already having rejected the possibility of testing further genre effects, is that these themes are also present in
songs not clustering with the Wilhelmus, like Van der Noot's songs of praise and several songs of Fruytiers, to
name a few. Another result I'd like to discuss is the fact that Reael's texts cluster again, which makes it seem that
the absence of signal in the Reael texts was mostly due to the choice of distance measure.
The conclusions for the Manhattan distance are pretty much the same as for the two
Euclidean-based distance measures. Perhaps a distance measure that gives priority to the most frequent of the
MFW is useful on very short texts. Regarding the Wilhelmus, I observe that Marnix's work, especially
his songs of praise, seems the most stylistically similar to the Hymn.
The last, and possibly least, alternative distance measure I test is not discussed in the method section, but is
available by default in R. The Canberra distance is a weighted version of the Manhattan distance, used as a metric
for comparing ranked lists.412 This numerical measure of the distance between pairs of points in a vector space413
measures the similarity between groups. It's far more democratic than the previous alternative distance
measures, and I chose it on that premise.
412 Godfrey N. Lance and William T. Williams, Mixed-Data Classificatory Programs I - Agglomerative
Systems. Australian Computer Journal 1.1 (1967).
413 Giuseppe Jurman et al., Canberra Distance on Ranked Lists. In Proceedings, Advances in Ranking
NIPS 09 Workshop, eds. S. Agarwal, C. Burges, K. Crammer (2009). Retrieved on 24-07-2015.
Looking at graph 3 Canberra, we see a very limited representation of the data. The texts are sorted by their
authorship effects, branching off from the middle and in doing so losing all other information. Of course this
distance measure does not search only for authorship signals; it measures stylistic resemblance, but these
are the only effects the graph shows. The most logical explanation is that these signals were the strongest, as the
measure blurs out the weak signals. Only Coornhert's texts are not clustered according to their author, and the Reael
texts do find each other as nearest neighbours but refuse to form a branch.
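A minimal sketch of the Canberra distance, assuming the standard definition in which every absolute difference is divided by the sum of the two absolute values. Terms whose denominator is zero are counted as zero here, which is a simplifying assumption. The toy frequencies are hypothetical.

```python
# Hypothetical relative frequencies (in %) of three words in two texts,
# ordered from the most frequent word to the least frequent word.
T1 = [6.0, 2.0, 0.1]
T2 = [5.0, 2.5, 0.3]

def canberra_terms(a, b):
    """Per-word contributions |x - y| / (|x| + |y|); each lies between 0 and 1."""
    terms = []
    for x, y in zip(a, b):
        denominator = abs(x) + abs(y)
        terms.append(abs(x - y) / denominator if denominator else 0.0)
    return terms

def canberra(a, b):
    """Canberra distance: the sum of the per-word contributions."""
    return sum(canberra_terms(a, b))

terms = canberra_terms(T1, T2)
print([round(t, 3) for t in terms], round(canberra(T1, T2), 3))
```

Every word can contribute at most 1, however frequent it is, which is what makes the measure democratic. In this toy example the rarest word contributes the largest term, so on short texts the measure may be sensitive to chance differences in rare words.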
PCA
So far, the analyses of the authorship hypotheses haven't delivered conclusive results. To get more data
and more context about the data already obtained, I perform principal component analysis (PCA). As mentioned in the
theory paragraph, PCA can be useful for establishing the author of a text when faced with a two-author corpus. I
perform two PCA analyses on the author corpus 3.2 auteur Marnix Coornhert balance corpus. First a correlation
PCA, based on the statistical relationship between two variables, and secondly a covariance PCA,
based on how much two variables change together. All the parameters are listed in the Appendix 2. Corpora,
but it's important to note that the PCAs are performed on word level and with the Burrows delta as distance
measure.
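A minimal sketch of the difference between the two variants, assuming that a covariance PCA works on centred frequencies and a correlation PCA on standardized frequencies. Only the first component is computed, by power iteration, and the two toy variables are hypothetical. This is not the procedure of the tool used for the graphs.

```python
import math
from statistics import mean, stdev

def covariance_matrix(columns):
    """Sample covariance matrix of a list of variables (columns)."""
    mus = [mean(c) for c in columns]
    n = len(columns[0])
    return [[sum((a - ma) * (b - mb) for a, b in zip(ca, cb)) / (n - 1)
             for cb, mb in zip(columns, mus)]
            for ca, ma in zip(columns, mus)]

def standardize(columns):
    """Turn every variable into z-scores (mean 0, standard deviation 1)."""
    result = []
    for c in columns:
        mu, sd = mean(c), stdev(c)
        result.append([(x - mu) / sd for x in c])
    return result

def first_component(matrix, iterations=200):
    """Loadings of the first principal component, found by power iteration."""
    vec = [1.0] * len(matrix)   # start vector; must not be orthogonal to the result
    for _ in range(iterations):
        vec = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec

# Hypothetical frequencies of a very frequent word and a rare word in five texts.
columns = [[10, 20, 30, 40, 50], [0.1, 0.3, 0.2, 0.5, 0.4]]

pc_cov = first_component(covariance_matrix(columns))               # covariance PCA
pc_cor = first_component(covariance_matrix(standardize(columns)))  # correlation PCA
print([round(v, 3) for v in pc_cov], [round(v, 3) for v in pc_cor])
```

In the covariance variant the first component is almost entirely the high-variance variable, while in the correlation variant both variables get the same loading. The two plots can therefore look quite different on the same corpus.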
The first graph shows the components in a 2-dimensional space, based on the statistical relation between the
variables. The horizontal axis is the first component and it doesn't separate the texts according to author. I could
say that most of Coornhert's texts are located more to the right while Marnix stays on the left, along with the
Wilhelmus, but these results are not really distinctive and there are exceptions. The vertical axis, however, the
second component, shows a division by author. The relationship is still more complicated than authorship
effects alone, because while the Marnix and Coornhert texts can indeed be divided with one horizontal line,
somewhere around the value one, the real cluster actually consists of two KLAS texts by Coornhert, psalms of
both Coornhert and Marnix, and the sonnets to Lucas d'Heere by Marnix. The big difference on this component is
that the Wilhelmus obviously distances itself from the other texts, allowing only the joined Coornhert text to come
near. We've seen the affiliation with this particular text, just as we've seen the isolation from all the other texts.
Lonerism is again a suitable term. The PCA picks up on the stylistic individuality of the Wilhelmus that we've seen all
through the attribution analyses, and defines it as the second component.
Looking at the second graph, the covariance matrix, we see a conspicuous result. Although the Wilhelmus seems
to cluster with Coornhert on the horizontal axis, more obviously than we've seen in PCA graph number 1, on
the vertical axis the Wilhelmus is miles apart from any other text. This means that on the first component the
Wilhelmus stays close to Coornhert, yet on the second component the Wilhelmus absolutely does not
share a common pattern of change with any other text of either author. Marnix and Coornhert
don't seem to differ too much on this second component, but perhaps they're grouped together because of the
counterweight the Wilhelmus provides.
Conclusion PCA
The principal components analysis captures differences between groups and a PCA plot exhibits these
differences. The results in general are unconvincing. The clusters are rather scattered and the differences
between the authors are sometimes smaller than the differences between the texts of one of the two authors. The texts
are, however, sorted by their author, so we again affirm the presence of an authorial signal. When forced to pick an
author based on the two PCA plots, Coornhert would be the most likely author, but when critically analyzing these
results, one has to agree that they seem to indicate that the Wilhelmus is not representative of either
author's style. In both graphs the Wilhelmus seems to be an outsider in the corpus, especially on the
covariance plot, where it seems to be totally different from any other text in the corpus on the second component.
These results correspond with, and partly explain, the behaviour we've seen from the Wilhelmus during all the
previous analyses. Obviously the Wilhelmus has some stylistic properties that set it apart from the other texts,
therefore making attribution impossible. This divergence on the second component of the covariance PCA is very
strange and very hard to explain if the real author is present in this 2-author corpus, both authors' styles are
represented by their texts, and the Wilhelmus represents its author's style as well. It might suggest that the
Wilhelmus has a textual characteristic, responsible for the failed attempts to attribute it to an author, that corrupts
its author's style.
Wilhelmus to be singular, so it's strange that a man who didn't write the hymn would get as many attributions of it
as the actual author. Also, the Wilhelmus has clustered with other authors and often with none at all. In other
words, the Wilhelmus hasn't shown a steady attribution to either of these two authors, or in general. In
addition, the Wilhelmus shows most stylistic resemblance to the texts that are atypical and do not represent
their author's style very well. The conjunct text of Coornhert especially is often the text the Wilhelmus clusters
with in the same branch. This is very problematic because it is a constructed text, which I've put together while trying
to balance my corpus. It's also the Coornhert text that is quickest to abandon a Coornhert cluster, and so the least
typical of the author's style. All these concerns arise just from looking at the results of the analyses performed on the
Coornhert/Marnix corpus, which didn't give me any conclusive answer to my main question. The PCA showed
that the Wilhelmus has some stylistic properties that set it apart from the texts of both Coornhert and Marnix. I've
no explanation for this divergence.
Although convinced of the abilities of the Burrows delta, I resorted to testing other distance
measures in order to confirm my initial choice. Distance measures less refined than the Burrows delta,
which favour the top few MFW heavily over the rest, showed decent results, sometimes finding more
authorship effects than my analyses with Burrows. In these tests the Wilhelmus was mostly attributed to Marnix,
but we would need to see this attribution consistently over varying parameters and corpora to consider it an
indication. At this stage it's also too early to conclude that the Burrows delta is inefficient under these research
circumstances.
In conclusion: the Wilhelmus doesn't make any strong connection or show a consistent pattern of attribution,
and therefore doesn't show any believable stylistic affiliation with the authors of my corpus. Possibilities are that
the author is not present, or not represented correctly because there's too little text by him or only text with a very
specific, non-typical signal. It could also be that the Wilhelmus itself has atypical stylistics; this would also explain
why it moves around so much.
This leads to the answering of the hypotheses:
There is, according to the results of my analyses, a stylistic fingerprint based on authorship, or a
stylistic authorial signal, present in the texts of my corpus. This hypothesis is accepted.
The Wilhelmus shows consistently more stylistic similarity to the texts of one particular author in my
corpus than to the texts of other authors, and the texts of this author consistently show an authorial
signal by expressing similar style. This hypothesis is rejected.
delivered in a different type of XML, FoLiA, and also in different units, meaning by song as well as by complete
work or book. This led to the construction of the following 3 subcorpora:
1. XML Works corpus: a corpus consisting of complete works, delivered in FoLiA XML format and analyzed in R.
2. XML parts corpus: a corpus consisting of equal parts of the works in terms of size, not in terms of words,
delivered in FoLiA XML format and analyzed in Gephi.
3. TXT songs geuzenlied corpus: all songs of Een nieu GeusenLieden Boecxen, the 1583 edition, in txt format. I'll
analyze them in R and perform a network analysis in Gephi.
included so far, and it's from another source, but has exactly the same text and spelling, yet is delivered in
another format. Also, for this version, I'm not editing out the introduction.414 The corpus now has 18 texts.
After I performed the same analyses as on the previous corpus, we get exactly the same graphs,
with the Wilhelmus present but not making any significant connections or causing other texts to reconsider their
positions. The Wilhelmus remains stylistically undefined with these parameters and on this corpus. Of course the
text length of the Wilhelmus is many times smaller than that of any other text in the corpus. When using bag-of-words
sampling at a sample size of 550 words, the graph loses most of its effects again. We see the geuzenliedboek
cluster with Coornhert's work, an interesting result, but looking at the other effects, and especially the lack of them,
the results should be treated with caution. This means that I cannot make any claims about the Wilhelmus based on
this corpus.
The few analyses performed on this corpus give us some confirmation of this format, and also show the
limitations of such a noisy and unbalanced corpus.
414 Corts na dat Graef Lodewijck van Groninghen op ghebroken, ende van Groeninghen verdreuen was is de Prince van
Oraengien na de mase ghetoghen. Een nieu Christelijck Liedt gemaect ter eeren des Doorluchtigsten Heeren Wilhelm
Prince tot Oraengien, Patris Patriae mijnen G. Vorsten ende Heeren. Waer van deerste Cappitael letteren van elck Vers
zijner V. G. Name meebrengen. Na de wijse van Chartres.
If we stay with Van der Noot for further analysis of the graph, we can analyze the connections his texts make with
texts of other authors. An outlier of the Van der Noot cluster goes via Voort, a single brown text, all the way to the left
side of the graph, where texts of Coornhert and Marnix reside, along with the anonymous texts. Other texts that
connect to Van der Noot are those of Hout, the orange lamp-shaped cluster hovering in the middle of the
graph above all the other texts, not counting the isolated Van der Noot sphere. Following the outliers of the Hout
cluster, we move again to the left side of the graph, to Haecht, the bright blue island to the right of the dark
blue d'Heere cluster. This is where the small cluster of Van der Noot's paratexts, which I've already mentioned, resides.
As you can see in figure 3, the Van der Noot noise cluster has several connections with the dark blue d'Heere
texts, one orange connection to the upper right of the graph going to the Hout lamp, and one thick thread going
to another Van der Noot text just outside figure 3, which has some connections to the anonymous texts. Haecht
Psalms is the only text of Haecht included, albeit in many different parts. His psalms make connections to d'Heere
texts but avoid the paratext of lofzang op braband.
Another example of noise manifesting itself in the graph is the one isolated text, again of Van der Noot, close to
the yellow string of Marnix texts. When close reading the part coded as boko01_01.0256, I find that it's a footnote
and a part of Tot den leser, meaning "to the reader". This explains why a text would move away from its author to
the other side of the graph: this text is not by Van der Noot. We see, as concluded before, that
the right wing represents Van der Noot's style and that the other clusters, consisting of text not written by Van der
Noot but included under his name, are not representative of Van der Noot's style. The opposite is not true. Not all
the texts in the green moon of Van der Noot, shown in figure 2, are clean texts. The parts in the upper part of the wing, some
of which connect to Hout's texts, turn out, when examined individually, to be half noise and
half actual text. This explains their scattering and stretching across the upper right side of the graph.
I've started with reporting my findings on the Van der Noot texts for two reasons. First of all, Van der
Noot has been included with a lot of texts of different natures, often problematic, and his texts need the most
interpretation and analysis of all the included authors in my corpus. Secondly, analyzing Van der Noot shows the
reader the possibilities of my methods. After the coverage of Van der Noot it's clear that these methods and these
visualizations can pick up noise, and that noise will even cluster, suggesting that it is stylistically similar in some
way. It also shows us authorial signals, effects based on stemming from the same work, and the overruling of the
latter effect by the former. I'll now discuss the other results at a higher pace.
The strongest author effect is that of Coornhert, whose four texts cluster together in a wide horizontal cluster which is
pretty compact when you realize that there is prose as well as poetry included. Within the authorial cluster,
Coornhert also clusters per book, especially Corte berispingen. We see the same for Marnix, who is included
with three different books; the many different parts manage to form a long diagonal cluster, again while
clustering per work as well. The texts of the Bienkorf cluster and those of the psalms, especially, are in close
contact.
These clusters per work within the author clusters can be interpreted to mean that authors tend to write in
a slightly different style for each work, probably because of genre, topic, type or personal change, but it can also
mean that the influence comes from effects I don't want to measure, like spelling variation, different editions or
alterations made by the publisher.
Willem of Orange's texts cluster, in purple, just above the red Coornhert texts. His apologie makes
strong connections with both Marnix's and Coornhert's prose. See figure 5. I conclude that this is an effect based on
type, although analysis on the specialized corpus occasionally showed similar results.
Lucas d'Heere is present with two texts, the poetry of den boomgaard vd poezie and the prose of beschrijvingen
van den prince. In the Gephi file you can see how they form a cluster in which the major part of the cloud consists of texts from
den boomgaard vd poezie, except for the smaller cluster close to the purple anonymous texts, which is
beschrijvingen van den prince. I cannot explain the connections the d'Heere prose makes with one text in particular, which upon
examining the files turns out to be the anonymous het beclach van joncheer jan van hembyse.
Further research could consider Lucas d'Heere as a potential author for this text, but I have no clue whether this is even
historically possible. The connection is, however, conspicuous.
The geuzenliedboek accounts for the biggest part of the big purple cloud in the rightmost cluster. Somehow it forms
a lot of connections with the other anonymous text, het beclach, while this is not necessarily logical. Where the
geuzenliedboek is written by several different authors, het beclach should represent, at least to some extent, one
voice. Its style could have resembled that of any author included in the corpus, instead of grouping together with a
bundle of mostly anonymous songs. The other text the geuzenliedboek forms connections with is Coornhert's
Corte berispingen. The Wilhelmus is in the middle of the geuzenliedboek cluster among its peers, leaving little to
interpret.
415 http://www.dbnl.org/tekst/_nie096nieu01_01/
416 Gustaaf Asaert De val van Antwerpen en de uittocht van Vlamingen en Brabanders. 1585 Lannoo, Tielt, 2004. p186
417 Heeroma 1985.
the songs are not included in the book in chronological order, but categorized by theme. This would mean that at
least songs 88 and 89 are highly suspected of sharing stylistic features based on their shared topic.
In graph 1.1, which shows the results of an analysis using character 3-grams, both versions of the
Wilhelmus do cluster. Voort and Van Damme cluster again but let go of the other big texts. The character 3-grams
provided a greater number of features and cancelled out some of the size effects. There's again no authorship
signal for Reael. The graph of analysis 1.2, using character 4-grams, shows us a combination of analyses 1.0 and
1.1, and we see the influence of a shortage of features. These results all make sense and offer us methodological
insight; on the other hand, they don't offer us much insight into the Wilhelmus.
The analyses show us how much impact a minor alteration, like removing an introduction, has on the Wilhelmus
specifically and on texts with a very small text size in general. The shortage of features makes analyses on word level
inappropriate. Interpretations that rely on text size for the explanation of the graphs are far more believable than
the alternative of topic I presented. However, neither theory has been tested sufficiently on this particular corpus to
accept or reject it. What's clear is that a text size of 550 words is insufficient for these types of analyses, and I advise
striving for a text size of 1500 words. The Wilhelmus shows little stylistic kinship with the other texts. The songs
that showed some stylistic resemblance to the Wilhelmus, if any, and that are important to further research,
computationally or otherwise, are songs 88, 89 and 78.
Texts or samples with a text size of 550 words or less disturb the analyses, and a text size of at least
1500 words is advised. However, some effects, predominantly authorship effects and effects of same work, remain
visible at a text size of only 550. When we use character n-grams, thereby creating more features to
analyze and cancelling out some of the negative effects of small size, there are more correct
attributions, more distinct results and more usable graphs. Culling seems less important when using character n-grams, perhaps because of the loss of semantics.
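To make the feature argument concrete, here is a minimal sketch of how character n-grams yield more features than words from the same short text. It is not the tool chain used in this thesis; the function names are hypothetical, and the example line (the opening of the Wilhelmus) is given in an illustrative spelling only.

```python
from collections import Counter

def word_features(text):
    """Lower-cased whitespace tokens with their counts."""
    return Counter(text.lower().split())

def char_ngrams(text, n=3):
    """Overlapping character n-grams (spaces included) with their counts."""
    s = " ".join(text.lower().split())  # normalise whitespace first
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

# Illustrative spelling of the opening line; not taken from any edition in the corpus.
sample = "Wilhelmus van Nassouwe ben ick van Duytschen bloet"

words = word_features(sample)
trigrams = char_ngrams(sample, 3)

print(sum(words.values()), "word tokens,", len(words), "distinct words")
print(sum(trigrams.values()), "3-gram tokens,", len(trigrams), "distinct 3-grams")
```

The same eight words produce several times as many 3-gram observations, which is the effect that makes very short texts somewhat more workable.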
The Wilhelmus didn't seem to share any stable stylistic resemblance with the other texts that we could
interpret as an authorship signal. If it showed any relation at all, the connections were very weak. The Wilhelmus
didn't cluster with other parts of the geuzenliedboek, as some other very short texts did, while we did find these
effects in other corpora as well.
The fact that the Wilhelmus didn't cluster with other texts, even those of the geuzenliedboek, and not even
in a consistent manner with itself, is worrisome. If it can't even find itself to be stylistically similar, it won't connect
to other texts of its author, if they're even included. The connections the Wilhelmus made, to songs 88, 89 and 78,
should be further explored.
Conclusions
I'll report here the general conclusions of all my analyses. I will be more concise than in the individual conclusion
sections, for the sake of the pace and size of my thesis. If you'd like an extensive report, please go to the conclusion
section under each analysis, or at the end of a hypothesis section.
While the strong effects were most of the time obvious and consistent enough to be interpreted, small and
less obvious effects were harder to pinpoint. In some cases you might miss or misattribute unexpected and/or
weaker effects; this is definitely a risk when performing a distant reading on such a large corpus, when you've got
little semantic and contextual information about each text individually.
I managed to see and explain, from the first graph on, the cluster of psalms and other religious texts as
topical. What I didn't see, and probably still wouldn't have known if my supervisor prof. dr. Els Stronks hadn't
pointed it out to me, was the tendency of authors who were innovators of language, like Van der Noot and Lucas
d'Heere, to connect with each other.
For every seeming effect there was noise. A lot of the results don't make sense and a lot of the texts
did not fulfill my expectations. In many cases I don't have an explanation for this.
The bird's-eye view gave me very little on the author of the Wilhelmus, but it gave me enough indications that the
corpus I was working on had enough stylistic information to perform successful analyses on. Up until that point,
my confidence in finding stylistic signals in my corpus was based on theory, previous research and a preliminary
experiment I performed during an internship in Poland a year ago.
The main part of my research consisted of the analyses on the three corpora that were assembled and
constructed in order to answer a particular set of hypotheses that would lead me to answering my main research
question. The bird's-eye view was a big factor in the decision-making phase of the construction of these
hypotheses.
The first two corpora, of the three that were designed for specific hypotheses, were meant to give me information
on the Wilhelmus, as well as to determine other effects that weren't authorship effects, in order to map all the
signals I had to consider, cancel out or compensate for when performing authorship attribution. These corpora
would give me clarity on effects of language and dialect and on effects of type or genre. In addition to this, finding
a language signal or a genre signal in the Wilhelmus would be a discovery in its own right and could provide us with
information that might bring an authorship attribution closer. As explained in the theory, answering one question
about the Wilhelmus can lead to answering another.
As reported in the results, I found that dialects can produce stylistic effects, but that this is only
satisfactorily shown for the Dutch texts from, or influenced by, the southern regions of the Netherlands. As stated,
the German hypothesis is not sufficiently tested, and the translation hypotheses differed too much to test on a
corpus with such strong authorship signals and were not relevant enough to construct a new corpus for. There's no
indication that the Wilhelmus has a southern or Flemish origin. Other texts with the southern language signal
weren't stylistically similar to the Wilhelmus, leading me to the conclusion that the Wilhelmus has not been written
by an author who has such a language signal in his style, presumably because he is not from those regions.
Secondly, I searched for effects based on genre or type of text. I found that prose and poetry had different stylistic
properties, an effect the bird's-eye view already hinted at.
Another genre effect the bird's-eye view seemed to harbour was that Psalms have a distinct style, different
from the other songs. The results of the analyses on the genre corpus seem to suggest that this is true, although
the effect wasn't consistent and strong enough for me to draw a conclusion. This shared stylistic signal could be based on topic,
because all psalms are of course deeply religious.
The hypotheses on song vs. poetry and on the 6 genres were abandoned, as they were
impossible to answer on the corpora that I had, with the methods I used. Therefore these were neither accepted
nor rejected, but remained unanswered.
The third corpus, which I assembled for the sake of answering a specific hypothesis, was assembled to answer the
main question, "Who is the author of the Wilhelmus?", which will remain unanswered. Neither this corpus nor the other
corpora could give me an answer.
From the beginning, even when testing on other characteristics, authorship signals were present. These became
clearer with the further preparation of the author corpora and the fine-tuning of the parameters. In all corpora
and sub-corpora, in three types of format (txt and two types of XML (Folia)) and three types of units (texts, parts
and works), we saw authorial signals. They were the strongest and most consistent effects, present for almost all
authors and a large majority of the texts. The deviations were predictable and/or explainable.
Their consistency and strength over different analyses, parameters, formats and corpora, in combination
with my secondary literature and extratextual information about the texts, gave me the confidence that authorship
signals are present and measurable. These methods are good enough to identify and visualize stylistic effects
and relations between texts based on authorship in my corpus.
The Wilhelmus didn't show a clear resemblance to any other text in a consistent manner, nor to
any author. This is true for my own constructed corpora as well as for the Meertens corpus. The Wilhelmus made few
connections, and if it did, these were unstable. The two texts, one by Coornhert and one by Marnix, to which the
Wilhelmus showed the most stylistic affiliation were two texts atypical for their author. In addition to this, there's the
obvious problem of them being by two different authors. Based on the results of the analyses on my corpora
specifically designed to answer the author question, I conclude that it's very unlikely that Lucas d'Heere, Jan van
der Noot or Johan Fruytiers wrote the Wilhelmus. All three authors showed very strong authorship
signals, while being represented by a body of work which I believe is adequate for capturing the author's style,
but the Wilhelmus didn't show any stylistic resemblance to them.
Datheen, Sterlincx, Willem of Orange, Utenhove, van Damme, Haecht and Voort were not sufficiently
tested to rule out, and Reael, actually a lot like the Wilhelmus, showed in several corpora and under several
different parameters no strong signals at all, including authorship signals or effects based on the fact that both Reael
texts come from the same book of songs. They didn't show any systematic stylistic resemblance to the
Wilhelmus either. For whatever reason, the included Reael texts were not capable of producing an authorship
signal, so Reael can't be excluded either.
Coornhert and Marnix, the two big names in traditional Wilhelmus research, showed the most stylistic
affiliation with our national anthem. As far as I know, my results and their presence in the Wilhelmus canon are
unrelated as far as my design is concerned. I've strived to treat and include texts of possible authors and improbable
authors with the same importance and care. As I've already mentioned, the availability of Coornhert and Marnix
was above average, and so their body of texts to choose from was more diverse, balanced and larger than
that of others. In this way their status could be of influence on their stylistic connection to the Wilhelmus.
On the other hand, there were other authors present, some among them with a big and balanced body
of work to choose from, that showed less stylistic resemblance to the Wilhelmus than Coornhert and Marnix. Still,
even the connections towards these authors were weak and inconsistent. With minimal variation of the
parameters, the Wilhelmus shifted from one author to the other, to any other, or to none at all. I can't draw any
conclusion from these results regarding the author of the Wilhelmus.
Zooming in on the texts of the corpus of the Meertens institute that the anthem made these brief
connections with, they turned out to be stylistically unreliable, mostly showing poor stylistic signals themselves,
probably because of limited text size. The Wilhelmus didn't cluster consistently with other parts of the
geuzenliedboek and even had trouble finding another version of itself. This led me to the conclusion that
somehow the Wilhelmus shows very poor results under the kind of research I've performed. A critical analysis
of the text itself is in order.
It's obvious by now that the size of a short text, let's say 550 words, has a huge influence on its
attribution. Short texts are harder to attribute: they make weaker connections, and so when the parameters shift
they move around the most, making them unreliable. However, there are many short texts included in my corpus
and they often do connect with texts of the same author or book. When I compensated for text length by sampling or
making very specific corpora, in order to cancel out possible size effects that I thought made the Wilhelmus so
resistant to attribution, the anthem's stylistic isolation remained. When I used character n-grams,
thereby creating more features to analyze, cancelling out some of the size noise and creating more distinct
results for short texts, the results for the Wilhelmus were the same. Authorship effects are present and visible in other
texts, even with sample sizes of only 550 words and even if the corpus is unbalanced in text size. The
Wilhelmus, however, gives us no consistent and convincing signs. This means that the below-average
performance was not completely due to the text length.
Based on PCA analyses, which gave me insight into the components of the Wilhelmus and their relation to
the components of other texts by Coornhert and Marnix, the Wilhelmus seems to possess some stylistic qualities
that set it apart from all the Marnix and Coornhert texts in the analysis.
Changing some core parameters of my design in order to squeeze out further results, I reconsidered my distance
measure. Using other distance measures that heavily favoured the top few MFW as features, texts of Marnix
were most of the time the Wilhelmus's nearest neighbours. However, because of the course of my thesis, with its
theoretical argumentation, the whole of my analyses was designed to test with Burrows's Delta. The number of
tests with other distance measures I consider too small to give a decisive verdict on these measures, even
though their attribution seems at least plausible. It is an interesting idea for further research.
1. Which candidate author do my quantitative stylistic analyses point to or support as the author of the
Wilhelmus? The results are inconclusive.
2. Which candidate author do my quantitative stylistic analyses rule out as the author of the Wilhelmus?
The quick answer is Lucas d'Heere, Jan van der Noot and Johan Fruytiers, but the right answer
to this question depends on the explanation that's given for the absence of a stylistic authorship
signal in the Wilhelmus. A more elaborate answer regarding the ruling out of possible authors will be
given later on.
3. Who is, or is more likely to be, the author of the Wilhelmus, Marnix van Sint-Aldegonde or Dirk
Volkertszoon Coornhert? The results are inconclusive.
I've also asked some methodological questions, which I can try to answer now that I've answered or failed to answer the
topical questions. However, answering the methodological questions requires a lot more debate about the nature of
the results. While a successful author attribution would answer my questions on the possibility of attributing an
author with a quick confirmation, a failure to attribute does not always mean these methods were insufficient.
Therefore some of my expectations and opinions on the matter will be discussed in the discussion section. I'll
also discuss some other possible reasons for the failed authorship attribution there, and I'll question whether we can even
speak of a failed authorship attribution. These possible explanations go no further than substantiated
speculations, and therefore I will not discuss them here, but in the discussion section.
Methodological conclusions
Let's first sum up the methodological knowledge acquired in this thesis. As I've shown, the methods used in this
thesis, my choice of low-level features in combination with Burrows's Delta, performing a multilevel categorization
task, can measure and bring out a diverse palette of effects that in some cases have proved to be strong,
consistent and therefore believable. These effects include language effects, genre effects and authorship effects,
among others. Graphs made in either Gephi or R often showed large corpus-defining effects or small delicate
effects, subdivisions and details, and sometimes both. The different formats all performed well.
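For readers unfamiliar with the distance measure, the following is a minimal sketch of Burrows's Delta as it is commonly defined: the mean absolute difference between the z-scores of the most frequent words in two texts. It is not the implementation used for the analyses in this thesis. The function names, the toy texts and the use of the population standard deviation are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev

def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in one text."""
    counts = Counter(tokens)
    return [counts[w] / len(tokens) for w in vocab]

def burrows_delta(texts, n_mfw=100):
    """texts: dict mapping a name to a list of tokens.
    Returns a dict mapping (name_a, name_b) to their Delta."""
    # 1. Take the n most frequent words (MFW) over the whole corpus.
    corpus_counts = Counter(t for toks in texts.values() for t in toks)
    vocab = [w for w, _ in corpus_counts.most_common(n_mfw)]
    names = list(texts)
    freqs = {name: relative_freqs(texts[name], vocab) for name in names}
    # 2. Turn every word frequency into a z-score across the corpus.
    #    Assumption: population standard deviation; a constant column gets z = 0.
    z = {name: [] for name in names}
    for j in range(len(vocab)):
        column = [freqs[name][j] for name in names]
        mu, sigma = mean(column), pstdev(column)
        for name in names:
            z[name].append((freqs[name][j] - mu) / sigma if sigma > 0 else 0.0)
    # 3. Delta is the mean absolute z-score difference between two texts.
    deltas = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            deltas[(a, b)] = mean([abs(x - y) for x, y in zip(z[a], z[b])])
    return deltas

# Invented toy texts: A1 and A2 share a vocabulary, B1 does not.
texts = {
    "A1": "de heer is mijn herder de heer is goed".split(),
    "A2": "de heer is mijn licht de heer is groot".split(),
    "B1": "ick en ghy ende wy ick en ghy ende sy".split(),
}
deltas = burrows_delta(texts)
for pair, d in sorted(deltas.items(), key=lambda item: item[1]):
    print(pair, round(d, 3))
```

A lower Delta means the two texts use the most frequent words in more similar proportions; in a network graph the smallest distances become the strongest connections.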
One major problem I anticipated, and which I've discussed several times by now, is the small text size of the
Wilhelmus and the majority of the other texts. Size matters. A researcher should be aware of, and preferably
compensate or control for, small texts and big relative differences in text size, which become more urgent as the
absolute text size shrinks. Small text sizes have a negative effect on the results, and big differences result in
signals based on text size.
A text or sample of 550 words can pick up strong effects like author signals, but often fails to
show these effects and is incapable of showing more delicate effects. Samples of 550 words are often unreliable
and are likely to give inconclusive results. A larger size is strongly recommended. This means that the Wilhelmus
is far from an ideal text to perform authorship attribution on. Texts that were smaller than the Wilhelmus were
most of the time excluded from the corpus.
A text or sample of 1500 or 1650 words gives much more detailed graphs, including many subtle as well as
strong effects. The probability of attribution success and the effects an analysis expresses increase quickly at
first when increasing text size, indicating a strong correlation between the current text size and the quality of the
analysis; but then, above a certain value, further increase of sample size would not significantly affect the
effectiveness of the attribution.418 It is preferable to have a text or sample larger than 1650 words, because the
results will still improve substantially.
When performing normal sampling on a corpus that is unbalanced in text size, the texts that are significantly
larger than others will make up too big a part of the total number of samples and will therefore define the
graph, obscuring smaller effects. A balanced set, in general and specifically in this regard, is necessary.
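The arithmetic behind this imbalance can be sketched in a few lines. The word counts below are hypothetical, and capping the number of samples per text is only one possible way to rebalance, shown here as an assumption rather than as the procedure followed in this thesis.

```python
def n_samples(text_sizes, sample_size=550):
    """Number of consecutive, non-overlapping samples each text yields."""
    return {name: size // sample_size for name, size in text_sizes.items()}

def share_of_samples(counts):
    """Fraction of all samples in the corpus contributed by each text."""
    total = sum(counts.values())
    return {name: c / total for name, c in counts.items()}

def capped(counts, cap):
    """Keep at most `cap` samples per text (hypothetical balancing step)."""
    return {name: min(c, cap) for name, c in counts.items()}

# Hypothetical word counts, chosen only to illustrate the proportions.
sizes = {"long_prose_work": 55000, "song_book": 5500, "short_song": 550}

counts = n_samples(sizes)                            # samples per text
shares = share_of_samples(counts)                    # unbalanced shares
balanced = share_of_samples(capped(counts, cap=5))   # shares after capping

for name in sizes:
    print(name, counts[name], round(shares[name], 3), round(balanced[name], 3))
```

With normal sampling the long prose work supplies about nine out of every ten samples, so it defines the graph; after capping it supplies fewer than half.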
Using character n-grams as features, among others, was a good choice. They handled, as explained in the theory,
the minimal text size by creating more features, but also in analyses that weren't limited by text size character
n-grams often showed the best results. Character 3-grams especially gave a lot of information and definitely
proved themselves to be an alternative or addition to features on word level. This holds for character 4-grams to a lesser
extent, because in comparison to character 3-grams they showed fewer of the smaller and weaker effects.
4. Can my methods give supporting evidence for one or more of the usual suspects in the Wilhelmus
authorship attribution case? Possibly.
5. Are my methods useful and/or sufficient for authorship attribution for texts of 550 words? No, they're not
sufficient, but incidentally successful nonetheless.
6. What are the current limits of computational stylistics and authorship attribution? This question is too big to
answer here but is discussed in several parts of the conclusion and in the discussion section.
In general I consider my methods sufficient for authorship attribution, as well as for the identification of other stylistic
effects. I think these methods, with these parameters and on these corpora, would have been capable of
identifying the author of the Wilhelmus if the author was present and well represented in the corpus. The other condition
is that the Wilhelmus is representative of the style of its author. I base this assertion on the majority of texts in my
corpus that were correctly attributed to their authors. The corpora were designed to include all possible authors of
the Wilhelmus and to include them in such a way that they would represent their author's style, as well as
show a variety of other signals. The fact that the Wilhelmus isn't attributed makes me question the
capability of my methods but doesn't make me dismiss them altogether.
One obvious flaw in my design is the lack of large amounts of text by all possible authors. I pointed out
the necessity of a clean corpus for the analysis of very short texts and substantiated this with my results. I'd look
for improvements in this area before dismissing my methods altogether.
Concluding
Concluding: can we, with computational methods and quantitative analysis, determine, support or dismiss possible
authors of the Wilhelmus? I think it is definitely possible, but in this thesis I did so only scarcely, due to the difficult
circumstances and the difficulty of the tasks, but more importantly because the Wilhelmus, in comparison with
other texts, performed very poorly. This is in itself very interesting and opens a register of debatable and testable
theories about its disguised authorial signal, its reluctance to show effects, and its conception, which I will discuss
in the discussion section.
So, can the real-world authorship attribution case of the Wilhelmus, with all its problems, be solved with
the methods and tools of computational literature? Under the circumstances currently available to the researcher, I
doubt that an authorship attribution is possible that's scientifically acceptable and will achieve a widely
held consensus in the computational literary field, although there are a lot of options I've considered but not yet
explored, and there are many ways to improve the means of the literary scientist. I'll explain and elaborate in the
discussion.
So I haven't solved my case, and the reasons why the attribution didn't succeed are unclear, making my answer to
the methodological questions as well as the topical question a "don't know". Did I fail? I am schooled, next to the
humanities, in the social sciences, where such an answer is much more common. It's actually considered a result,
provided that the research has been set up and conducted in a proper scientific manner, but this is no different
from research that does provide an answer. The researcher of course needs to comprehend and reflect on the
scientific process, depict and interpret it, and this needs to result in suggestions for further research. The
expression is "no results are also results", but this doesn't apply to this thesis, and I'll sum up the reasons why it
doesn't in the paragraphs below. What definitely does apply to my thesis, to authorship attribution and to experimental
research in general is: failed attribution is better than false attribution.419
The first reason why the social sciences' attitude of "no results are also results" doesn't apply to my thesis
is that, without trying to enforce boundaries between faculties, I'm a Dutch language and literature student
performing research that I myself consider to belong predominantly within the faculty of the Humanities.
Secondly, my cup is not empty, it's half full. I do have results, a lot of them. I drew a lot of conclusions
and answered a lot of questions and hypotheses, and more importantly, a part of them is applicable to the tradition
of Wilhelmus research. I excluded some of the potential authors of our national anthem, got some expected
genre indications and a surprising language signal, and most importantly showed over and over again how the
text seems to have an exceptionally inconsistent and weak stylistic signal. This opens up new registers of
interpretation while making old ones unlikely, or at the very least leaves some old attributions unsupported by
stylistic analysis. I also analyzed a lot of other texts, their style on all sorts of levels and effects, giving me all
kinds of interesting results. There were, for example, the Residu cases, texts of uncertain descent; I identified,
verified and falsified characteristics of a lot of them. I also performed methodological research and produced
results. Author effects, different kinds of genre effects and different kinds of language effects were confirmed,
along with stylistic similarity of texts coming from the same edition, on a variety of different Dutch texts, including
very short ones.
Still, the core question remains unanswered. In order to gain insight into my design and the anthem's
resistance to attribution and to signalling stylistic effects in general, it's time to discuss and theorize about the
possible reasons why an attribution has not been achieved. I will use this section to reflect further on the
results and theories, and even speculate on them, in order to recommend future research and express my hopes for
the future of digital humanities.
Texts of around 550 words are hard to attribute and only make weak connections inconsistently, so when the
parameters change they often cut previous ties and/or make new ones, exposing the results as unreliable. This
leads to a lot of noise and a lot of unusable texts, because of the short text size. However, authorship effects are
present throughout my analyses, also with text sizes of only 550 words and even if the corpus is unbalanced in
text size. In other words, many other very short texts, just as short as the Wilhelmus, did express a correct
authorial signal. Especially when correcting for text size by using character n-grams, sampling or balancing the
corpus, the possibility of correct attributions on 550-word texts was obvious, but the Wilhelmus did not budge.
Other effects of the Wilhelmus were also either absent (is the Wilhelmus Flemish?) or expressed inconsistently (the
Wilhelmus as prose or poetry and the Wilhelmus as part of the Geuzenliedboek). The performance of the
Wilhelmus was very weak, and actually not very representative of the overall performance of my corpus. Only
texts by Reael seemed to show the same level of resistance to attribution. I conclude that the limited text
size surely didn't help the performance of the Wilhelmus, but can't be the full explanation for its stylistic
indifference.
were analyzed but couldn't be ruled out. Texts and possible authors that I haven't included and that should be
considered are Jan Baptist Houwaert, especially the text Milenus clachte, Hendrik Niclaes with his Cantica
Lieder offte gesange and Psalmen unde ledern, and the anonymous song George Lalaing. Other authors that I
haven't even mentioned but still recommend for inclusion, in order to have a corpus as complete as possible,
are Jacob van Wesembeke, Hendrik Geldorp, also known as Hendrik Castricus, and Nicolaas Bruyninck.
Jacob van Wesembeke especially is an author I would have included if representative text had been available. He is
an interesting case because we know he supported the Protestants and had to flee Holland because of that
in 1567. He was at some point present at Dillenburg, territory and place of exile for Nassau and also one
of the key places Stipriaan based his network analysis on. In addition to this, Wesembeke was the predecessor of
Marnix as secretary of the prince.420
Also, a lot of small effects or divergent texts can be sought out by close reading and by studying contextual,
biographical and historical information about the text. So stepping back temporarily from the distant reading
during the corpus building phase can also be an option for improving your corpus, although I've done this already.
We should also consider collecting more editions of the Geuzenliedboek to see if the Wilhelmus connects
with some of the songs in them and to understand these books and their relations better. I recommend extra attention
for songs 88, 89 and 78 of the 1583 edition of the Geuzenliedboek, because these songs were most closely related to the
Wilhelmus.
If it's a case of the real author being absent, or his true signal being absent because of insufficient corpus
building, a successful authorship attribution is very possible once all the possible authors are sufficiently
represented in a new corpus. This only holds if other work by the author of the Wilhelmus has survived the ages.
The unknown soldier, who only wrote one song and then disappeared from history, can never be verified as the
author. I see no reason yet to abandon the assumption that the poet who wrote the Wilhelmus was a professional
one, so I do not yet accept this as a possibility.
For further research, a possible solution to the imbalance problem, without throwing away texts, is
described in the Stamatatos paper,421 where only the n-grams of the unseen text contribute to the
calculated sum. Each term is multiplied by the relative distance of the specific n-gram frequency from the
corpus norm. The more an n-gram deviates from its normal frequency, the more it contributes to the distance
measure. This way even limited and imbalanced corpora could be used for training.
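The distance described above can be sketched in a few lines. This is only one possible reading of the description, not the exact implementation of the cited paper: the function names, the use of character trigrams and the symmetric form of the relative difference are assumptions made for illustration.

```python
from collections import Counter


def char_ngram_freqs(text, n=3):
    """Relative frequencies of the character n-grams in a text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {g: c / total for g, c in Counter(grams).items()}


def rel_diff(a, b):
    """Symmetric relative difference between two frequencies (assumed form)."""
    if a + b == 0:
        return 0.0
    return 2.0 * (a - b) / (a + b)


def norm_weighted_distance(unseen, author, norm):
    """Distance of an unseen text to an author profile.

    Only n-grams that occur in the unseen text contribute. Each squared
    difference to the author profile is multiplied by the squared
    difference to the corpus norm, so n-grams that deviate strongly from
    their normal frequency weigh more.
    """
    total = 0.0
    for gram, f_unseen in unseen.items():
        f_author = author.get(gram, 0.0)
        f_norm = norm.get(gram, 0.0)
        total += rel_diff(f_unseen, f_author) ** 2 * rel_diff(f_unseen, f_norm) ** 2
    return total
```

In use, the unseen profile would come from the Wilhelmus, each author profile from the concatenated texts of one candidate, and the norm profile from the whole corpus; the candidate with the smallest distance is the proposed author. Because the sum runs over the short text only, a candidate with a large amount of text is not favoured simply for having a richer profile.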
Whether you find the corpus fallacy a logical explanation or not, I recommend exploring the possibility,
in order to either rule out this option or find the author of the Wilhelmus.
420 Prims, F. Verslagen en mededelingen van de Koninklijke Vlaamse Academie voor Taal- en Letterkunde 1930.
Koninklijke Vlaamsche Academie voor Taal- en Letterkunde, Gent 1930, pp. 599-608. http://www.dbnl.org/tekst/_ver025193001_01/_ver025193001_01_0051.php
2. Future design
If, however, the fault is in the Wilhelmus text itself, things become a lot more difficult. In this case even Lucas
d'Heere, Van der Noot, Johan Fruytiers, Coornhert and Marnix cannot be ruled out. We've got to find an earlier or
different version of the Wilhelmus that does carry its author's style, or find methods that can extract
characteristics that tie the Wilhelmus to its author, probably on totally different features.
However, given the amount and availability of text from possible authors and of the Wilhelmus, improving
the design will not be easy. Other practical considerations in choosing my design were limitations of
space, time, my own specialties, knowledge and capacity, and the availability of computational and textual means. If
any of these limitations caused my methods to be insufficient, upgrading them could establish a positive
attribution. Another option is of course choosing a different methodology that can find the authorial signal, but it is
not likely that other methods will show greatly improved results for very short texts. I chose and
composed my methods and design because they have been highly successful in authorship attribution and
because they also showed some promising results on very short texts.
One suggestion I'd like to make is the use of the tool AntConc. This tool allows the researcher to switch
easily between distant and close reading. The design becomes very different, and content features are often
the most logical features for this approach. The program can also be used for corpus building and for interpreting
results.
Another method, radically different from the ones used on this corpus, is a standard ANOVA. This seems
necessary because when basing your results on visualizations without measuring significance, despite well
established hypotheses, good choice of style markers, advanced statistics applied and convincing results
presented, one cannot avoid the simple yet nontrivial question whether those impressive results have not been
obtained by chance, or at least not positively affected by randomness.422 An ANOVA takes into account the
statistical dependence of different word frequencies; a famous example is the research on the Federalist Papers by Holmes
and Forsyth.423 These methods depend less on visualizations and more on the measurability and significance of
the outcomes. The design becomes less exploratory. With the flyweight effects the Wilhelmus has shown so far,
these methods do not seem a logical next step if the goal is to identify the author. They do seem a solid next step
to test the results and effects found in this thesis in a statistically and scientifically sound way.
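As an illustration of what such a significance test involves, the sketch below computes the F statistic of a one-way ANOVA for a single feature, for example the relative frequency of one function word in text samples grouped by candidate author. The function name and the numbers in the usage note are hypothetical, and the sketch assumes at least two groups with some variation inside them.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: a list of lists, one list of measurements per author.
    Assumes at least two groups and a non-zero within-group variance.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the individual samples around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within
```

For three hypothetical authors with samples `[1, 2, 3]`, `[2, 3, 4]` and `[5, 6, 7]` the function returns an F of 13 with 2 and 6 degrees of freedom. The F value still has to be compared with the F distribution to obtain a p-value; a library routine such as `scipy.stats.f_oneway` does both steps at once.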
If it's just the edition or spelling of this version that corrupts the signal, we'll have to wait until Martine de
Bruin makes her next major discovery, or a human expert will have to correct extensively for spelling and other
influences of the publisher and exhume the original text of the Wilhelmus.
Another explanation, already mentioned, is that the Wilhelmus might not be the product of one man writing a
poem. It could be the product of two or more writers, or of the people in the Herderian sense, or perhaps an effort
by a whole group, editing and revising the text over and over again. It could also have been composed orally, perhaps by a
famous poet or a close friend of Wilhelmus, like Marnix, or just orally transmitted and spread. In both cases the
song was passed on and altered before it was written down in the (oldest) version we have today. It's possible that
our version harbours multiple voices, authorial or editorial, or is simply a product of errors in transmission. Either way the
style of the text, the relative frequencies of function words for example, no longer carries (or never carried)
the authorial signal of one poet writing a song. There's a broad spectrum of possibilities if we let go of the idea
that the text has only one voice. I haven't tested this hypothesis, but I've seen the Wilhelmus show stylistic
resemblance to songs with multiple voices, like the conjoined Coornhert file.
Establishing multiple voices in a text usually requires large portions of text, especially
when all authors are unknown, the number of voices is unknown and we're not even sure it is a case of
multiple authors. This is the main reason why I deliberately did not test this hypothesis. Theoretically this might be
a very reasonable option, but I seriously doubt whether the methods of the computational branch of literary scholarship
could test it.
Future research
Methodological variation for future research
There are also a lot of possible variations of the design that were not pursued in this thesis and that might be worth
considering.
First of all we could explore several distance measures other than Burrows's Delta, especially the ones that
showed promise in the results. A larger theoretical frame should be built regarding their preference for the very
first MFWs, and related back to the corpus. An alternative not yet tested is the use of a similarity measure like
cosine similarity. It's used in several papers on AA and also in the field of text mining. According to Smith
and Aldridge,424 the cosine-based Delta measure is the best distance measure for authorship attribution.425
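To make the difference between the two measures concrete, the sketch below computes both from the same z-scored word frequencies. It is a minimal illustration rather than the procedure of any particular tool: the function names are hypothetical, the population standard deviation is an assumed choice, and a feature without variation is simply set to zero.

```python
import math
from statistics import mean, pstdev


def z_scores(freq_table):
    """Standardize each MFW column across all texts.

    freq_table: {text_name: [relative frequency of MFW 1, MFW 2, ...]}
    """
    names = list(freq_table)
    n_feats = len(freq_table[names[0]])
    columns = [[freq_table[t][j] for t in names] for j in range(n_feats)]
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]  # assumption: population std dev
    return {
        t: [
            (freq_table[t][j] - mus[j]) / sds[j] if sds[j] > 0 else 0.0
            for j in range(n_feats)
        ]
        for t in names
    }


def burrows_delta(za, zb):
    """Mean absolute difference between two z-score vectors."""
    return sum(abs(a - b) for a, b in zip(za, zb)) / len(za)


def cosine_delta(za, zb):
    """Cosine distance (1 - cosine similarity) between two z-score vectors."""
    dot = sum(a * b for a, b in zip(za, zb))
    norm_a = math.sqrt(sum(a * a for a in za))
    norm_b = math.sqrt(sum(b * b for b in zb))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: treat an all-zero vector as unrelated
    return 1.0 - dot / (norm_a * norm_b)
```

Burrows's Delta adds up how far apart two texts are on every word, whereas the cosine variant only looks at the angle between the two profiles, so it ignores how strongly a text deviates overall and keeps only the pattern of its deviations.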
Another approach I'd like to suggest is to complete the Koppel needle-in-a-haystack method,426 427
mentioned in the theory. Koppel used a specifically designed meta-learning model, building an SVM classifier for
each candidate author,428 to automatically determine which attributions by which representation schemes have a
high likelihood of being correct, using a holdout set of 10,000 blogs (those not included in the text). Koppel
showed promising results for snippets of 200 words. At its core Koppel's method is an alternative selection of
features. I've already explained this technique in the theory. If one could mimic the extraction of the most
determining features of a feature set, while keeping an eye on the graph, one could test which attributions are
stable and thus believable and which are not. Instead of relying on a specified number of most frequent words
(MFW), we systematically identify a set of discriminant words by using the method of recursive feature
elimination.429 If the unmasking method fails or is impossible to perform, other ways of feature extraction should
be considered.
In this thesis I decided, based on the resources at my disposal, not to use machine learning
models. Koppel's method relies on them and so do a lot of other methods. I'm not familiar with the complete scope
of machine learning models so I can't make any specific recommendations, but a study of their possibilities is
advised. There are promising results. Comparative studies show that machine learning with support vector machines
(SVM) is as good as any method for authorship attribution.430 An SVM model is able to avoid overfitting
problems even when several thousand features are used and is considered one of the best solutions of
current technology.431 Looking at the compared performance of several representative learning methods for
authorship attribution, we see, however, that the choice of the learning algorithm is no more important than the
choice of the features by which the texts are to be represented.432
In this light, another option that we've not yet considered is application-specific features: features that
are examined individually and selected into a feature set on the basis of how well they discriminate between the
authors of a given corpus.433 The most important criterion for selecting features in authorship attribution tasks,
however, is their frequency, which is why I've used MFW. Features identified by a feature selection algorithm may be
too corpus-dependent and of questionable general use.434
Another option I'd suggest is further testing of dimensionality reduction. With principal component
analysis it may be more useful to look at the second and third principal components to view the variation of data points.
Most of the texts will show the same clusters, and the one that shifts may have some other underlying
source of variation. Other multivariate statistical techniques, like discriminant analysis, standard factor
analysis and cluster analysis, I haven't performed in this thesis, but they could bring some insight into the data.
435 This is a Dutch saying for its renaissance period, as the Dutch renaissance sailed an atypical course.
436 Songs of exile or the exiled. Perhaps the most apt, but also the most frivolous, translation is outlaw songs, but
this term is already appropriated by bikers and pop music.
437 Het ontstaan van het Wilhelmus
End plea
There are many ways to improve the research conducted in my thesis, quite a few of which I plan on doing myself, as
there are many ways to improve as a researcher. The improvement of a field of science, however, is a process
that is far more complicated, and in which the role of a marginal figure will be a passive one, looking from the
sidelines at the changing landscape. But if we can find ways to improve the means of the literary scholar, then a
much larger portion of researchers can participate.
Some areas or aspects that need improvement become quite obvious when you're conducting
quantitative analysis (on Dutch literary texts). The fact that they're obvious doesn't mean they're easy to
improve, let alone solve; it does mean, however, that some of the current limitations on this type of research are
an eyesore. If access to digital text, ready-made for analysis with manageable tools, were easy and free (or
without copyright), anyone interested could do a quantitative, or other computational, analysis. Besides
democratizing this type of research, and offering for research what is already there for consumption, this would
mean a tremendous gain for the advanced student or researcher who wants to ask specific questions of
specific texts.
I estimate that better accessibility of digital texts could have saved months of my research; months I
could've used to try other approaches, like varying distance measures and becoming familiar with machine learning
techniques. Some hypotheses have now been dropped because of this limited availability, though they are perhaps testable
given a few more resources. Although this gives me a lot of follow-up research to perform, preventing me from falling into
a black hole caused by great amounts of sudden free time, and ensuring that I can workaholic myself
through those tough first post-thesis months, I can't help but feel that, if the humanities were a bit more used to
computational literary studies, I could have done more.
The more I try to participate and engage in the field of computational literature, not only in the university
but also on my Twitter account and through collaboration outside of the academy, the more I see experts wading through
text and methods in order to say something about text and method. This would not be very different from any other
literary research, if it weren't for the digital books missing from the digital libraries or digitized in such poor
quality that you have a hard time distinguishing the text from the ink spatters.
I plead for the building of a large digital corpus of (Dutch) text, with large amounts of text per author, genre and
language, if need be even offering different spellings, dialects and formats, including contextual framing,
available for bulk download and online analysis, and prepared for some of the most important open source
tools. I know several projects that are hoping to realize parts of this analysis utopia and I've every confidence that
it's only a matter of time before a community of researchers, academics and programmers will fulfil this prospect.