
GENDER/GENRE:

GENDER DIFFERENCES IN
PROFESSIONAL WRITING

Image: flickr/srqpix CC BY 2.0

Brian N. Larson
29 October 2014
Current Research in Writing Studies

Housekeeping
www.Rhetoricked.com (these slides + some
additional)
Communicate with me:
@Rhetoricked
Larson@Rhetoricked.com

Research supported by:


Graduate Research Partnership Program fellowship (U of M
CLA), 2012
James I. Brown Summer Research Fellowship, 2014

www.Rhetoricked.com
@Rhetoricked

Gender, sex,
and research constructs
When I talk about my own data, I'll refer to
Gender F authors/writers
Gender M authors/writers

These categories may or may not
correspond to other researchers'
{woman, female, feminine}
{man, male, masculine}

That's the subject of another talk (or for
Q&A)

Many researchers have asked


Do men and women communicate
differently?
Much work inspired by Robin Lakoff (1975)
Scholarly and popular works by Deborah
Tannen (e.g. 1990[2001]) and others
Much of this research in oral/face-to-face
communication

Writing:
Process and product
In writing studies, we can (roughly)
divide process and product
Do men and women produce writing using
different processes?
Is the writing they produce distinguishable
based on author gender?


Previous studies:
Process research
Focus on interpersonal communications
in mixed-gender contexts
Lay, 1989 (Schuster); Rehling, 1996; Raign
& Sims, 1993; Tong & Klecun, 2004; Wolfe
& Alexander, 2005; Brown & Burnett, 2006;
Wolfe & Powell, 2006, 2009.


Previous studies:
Product research
In technical and professional
communication
Sterkel, 1988 (20 stylistic chars)
Smeltzer & Werbel, 1986 (16 stylistic and
evaluative measures)
Tebeaux, 1990 (quality of responses)
Allen, 1994 (markers of authoritativeness)

Manual methods, small samples



Enter computational methods


Natural language processing (NLP)
Allows processing of large quantities of
text data
Study that attracted my attention
Koppel, Argamon & Shimoni, 2002
(machine-learning algorithms)
Argamon et al., 2003 (statistical analysis)
I'll focus on Argamon et al. in this talk

Argamon et al. 2003


Used 500 published texts from BNC
Mean 34,000 words (tokens) per text
Statistical analysis showed
correspondence to Biber's (1995)
informational/involved dimension


Gender in computer-mediated
communication (CMC)
CMC popular for NLP studies
Data are readily available
Data are voluminous

Examples
Herring & Paolillo, 2006 (blog posts, stat analysis)
Yan & Yan, 2006 (blog posts, MLA analysis)
Argamon et al., 2007 (blog posts, MLA analysis)
Rao et al., 2010 (Twitter, MLA analysis)
Burger et al., 2011 (Twitter, MLA analysis)

Rationale:
Why is the question important?
Lend support to one or more theories of
gender
Two cultures (Maltz & Borker, 1982)
Standpoint (Barker & Zifcak, 1999)
Performative (Butler 1993, 1999, 2004)
Others

Sorting out methodological problems,
particularly use of gender as a variable

Study design goals


Research questions
Did Gender F and Gender M writers in a disciplinary
genre in which they are being trained use lexical and
quasi-syntactic stylistic features with relative
frequencies that varied with their genders?
If so, did the differences appear in interpretable
patterns?

Examine a corpus of texts


All of the same genre
Where we can be confident of single authorship
Where author gender is self-identified


Data collection
Major writing project at end of first year of
law school
Students address hypothetical problem
(writing in same genre)
Students not allowed to collaborate
Plagiarism difficult (but still possible)

Students self-identified gender*


193 texts (mean word tokens = 3764)
*This study IRB-approved (UMN Study #1202E10685)

Text genre: Memorandum
regarding motion to dismiss
Written to hypothetical court
Supporting or opposing a motion before
the court
High-level organization is formulaic



Memorandum Sections

Caption**
Introduction/summary*
Facts
Legal standard of review*
Argument
Conclusion
Signature block**
* Not always present.
**I did not analyze (content is highly formulaic)

Feature (variable)
selection
For now, those of Argamon et al. 2003
Relative frequencies of
429 function words (Argamon used 405)
45 parts of speech from the Penn
Treebank tagset (Argamon used 76 BNC
POS tags)
100 common part-of-speech bigrams
500 common POS trigrams

Part-of-speech tags?
Bigrams & trigrams?
First, tokenize each sentence
(automated):
My aunt's pen is on the table.
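
A minimal sketch of the tokenization step, using the example sentence above. The regular expression is an assumption for illustration and is not the tokenizer used in the study; it only splits off the possessive clitic 's and punctuation, roughly in the style of Penn Treebank tokenization.

```python
import re

# Assumption: a simple regex tokenizer, not the tool used in the study.
# It separates the possessive clitic 's and any punctuation mark from
# the words around them.
TOKEN_PATTERN = re.compile(r"'s|\w+|[^\w\s]")

def tokenize(sentence):
    """Return the list of tokens found in a sentence."""
    return TOKEN_PATTERN.findall(sentence)

tokens = tokenize("My aunt's pen is on the table.")
print(tokens)
print(len(tokens), "tokens")
```

Under this scheme the sentence has nine tokens, because 's and the final period each count as a token.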


POS tags
Purple words are function words

Tag the parts of speech (automated)


Then calculate relative frequency of
function words and POS tags
(automated)
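
The two steps above can be sketched as follows for the example sentence "My aunt's pen is on the table." The Penn Treebank tags are assigned by hand here (the study used an automated tagger), and the four-word function-word list is a hypothetical stand-in for the 429 words used in the study.

```python
from collections import Counter

# Hand-assigned Penn Treebank tags for the example sentence
# (illustrative only; the study tagged texts automatically).
tagged = [
    ("My", "PRP$"), ("aunt", "NN"), ("'s", "POS"), ("pen", "NN"),
    ("is", "VBZ"), ("on", "IN"), ("the", "DT"), ("table", "NN"), (".", "."),
]

# Hypothetical mini-list of function words; the study used 429.
FUNCTION_WORDS = {"my", "is", "on", "the"}

def relative_frequencies(items):
    """Map each distinct item to its count divided by the number of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items()}

words = [word.lower() for word, _ in tagged]
word_freq = relative_frequencies(words)
function_word_freq = {w: f for w, f in word_freq.items() if w in FUNCTION_WORDS}
tag_freq = relative_frequencies([tag for _, tag in tagged])

print(function_word_freq)
print(tag_freq)
```

Each of the four function words accounts for one of the nine tokens, and the tag NN accounts for three of the nine.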

POS bigrams and trigrams


A bigram or trigram is a 2- or 3-token
window on the sentence.
Automated calculation
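
A minimal sketch of the sliding window, applied to a hand-tagged version of the sentence "My aunt's pen is on the table." The function name `ngrams` and the tag sequence are illustrative assumptions, not the study's code.

```python
def ngrams(sequence, n):
    """Return every run of n consecutive items as a tuple."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

# Hand-assigned Penn Treebank tags for the example sentence.
tags = ["PRP$", "NN", "POS", "NN", "VBZ", "IN", "DT", "NN", "."]

bigrams = ngrams(tags, 2)
trigrams = ngrams(tags, 3)

# Relative frequency of one bigram type: its count over all bigrams.
dt_nn_share = bigrams.count(("DT", "NN")) / len(bigrams)

print(bigrams)
print(trigrams)
print(dt_nn_share)
```

A sentence of nine tokens yields eight bigrams and seven trigrams, so the determiner+noun bigram (DT, NN) makes up one eighth of the bigrams in this sentence.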


Feature (variable)
selection
First-person pronouns (total)
Singular: I, me, my, mine, myself.
Plural: We, us, our, ours, ourselves.

Second-person pronouns: You, your, yours, yourself.


Third-person pronouns (total)
Singular (total)
Feminine: She, her, hers, herself.
Masculine: He, him, his, himself.

Plural: They, them, their, theirs, themselves.

Contractions: Including all instances of n't, 'ld, 've, etc.


All relative frequencies calculated (automated)
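
The pronoun categories above can be sketched as word sets whose counts roll up into the totals. This is an illustration under two assumptions: the denominator is the total number of tokens in the text, and the sample sentence is invented rather than taken from the corpus.

```python
PRONOUN_CATEGORIES = {
    "first_singular": {"i", "me", "my", "mine", "myself"},
    "first_plural": {"we", "us", "our", "ours", "ourselves"},
    "second": {"you", "your", "yours", "yourself"},
    "third_singular_feminine": {"she", "her", "hers", "herself"},
    "third_singular_masculine": {"he", "him", "his", "himself"},
    "third_plural": {"they", "them", "their", "theirs", "themselves"},
}

def pronoun_frequencies(tokens):
    """Relative frequency of each pronoun category, per total tokens."""
    lowered = [t.lower() for t in tokens]
    total = len(lowered)
    if total == 0:
        return {}
    freqs = {
        name: sum(1 for t in lowered if t in words) / total
        for name, words in PRONOUN_CATEGORIES.items()
    }
    # Roll the subcategories up into the totals named on the slide.
    freqs["first_total"] = freqs["first_singular"] + freqs["first_plural"]
    freqs["third_singular_total"] = (
        freqs["third_singular_feminine"] + freqs["third_singular_masculine"]
    )
    freqs["third_total"] = freqs["third_singular_total"] + freqs["third_plural"]
    return freqs

# Invented sample sentence, already tokenized.
sample = ["We", "told", "her", "that", "they", "lost", "their", "case", "."]
freqs = pronoun_frequencies(sample)
print(freqs)
```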


Each student's text is
represented by variables
A series of numerical values expressing each
feature (variable), i.e., the relative frequency of:
Function words / total tokens
POS tags / total tokens
Bigrams / total bigrams*
Trigrams / total trigrams*
Pronouns
Automated calculation
*Multiplied by a factor.


Example 1

Tokens of the function word-type "all" in
paper 1007 account for less than 7/100
of 1% of all tokens in that paper.
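
As a rough check on the scale of that number, the arithmetic can be worked through as below. The paper length is an assumption for illustration: it uses the corpus mean of 3764 tokens, because the actual length of paper 1007 is not given on the slide.

```python
# Relative frequency stated on the slide: less than 7/100 of 1%.
upper_bound = (7 / 100) * (1 / 100)

# Assumption for illustration only: a paper whose length equals the
# corpus mean (3764 tokens), not the real length of paper 1007.
hypothetical_tokens = 3764

max_count = upper_bound * hypothetical_tokens
print(f"relative frequency < {upper_bound:.4f}")
print(f"occurrences < {max_count:.2f}")
```

In a paper of that hypothetical length, a relative frequency below 0.0007 corresponds to at most two occurrences of the word.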

Example 2
Bigrams made up of
a plural common
noun (NNS) followed
by a coordinating
conjunction (CC)
accounted for 1/10
of 1% of bigrams in
paper 1009.

Mean relative frequencies
calculated
For each feature
Mean frequency (SD) for Gender F authors
Mean frequency (SD) for Gender M
authors
Statistical significance assessed with
Mann-Whitney U test (expressed as p-value)

A priori threshold for significance: 0.05
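
A minimal sketch of the per-feature comparison, assuming a two-sided Mann-Whitney U test with the normal approximation, average ranks for ties, and no continuity correction. The slides do not say which software or variant was used, and the sample values below are made up for illustration; they are not study data. In practice a library routine such as `scipy.stats.mannwhitneyu` would be a more usual choice.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic for sample x, approximate p-value). Tied
    values receive average ranks and the variance is tie-corrected.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; share their average.
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2.0
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - n1 * n2 / 2.0) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2))

# Made-up relative frequencies of one feature in two groups of texts.
gender_f = [0.0010, 0.0012, 0.0015, 0.0011, 0.0014]
gender_m = [0.0020, 0.0018, 0.0022, 0.0019, 0.0021]

ALPHA = 0.05  # a priori threshold from the slide
u, p = mann_whitney_u(gender_f, gender_m)
print(u, p, "significant" if p < ALPHA else "not significant")
```

The normal approximation is reasonable for samples the size of this corpus; for very small groups, an exact test would be preferable.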



What Argamon et al. 2003
found: Men
Males used significantly more
Determiners, a, the, these
Determiner+noun bigrams: the books, a
dog, these Tories
Attributive-adjective+noun bigrams: great
leaders, old form
Prepositions: at, from, for, of, behind
Its

What Argamon et al. 2003
found: Women
Females used significantly more
Pronouns (all)
1st person sing.: I, my, mine
2nd person: you, yours
3rd person: they, them, theirs

Present tense verbs: walks, eradicates


Contractions
Negation with not

Informational/involved
Biber (1995) labeled this a dimension of
register variation after doing cluster
analyses on frequencies to identify co-varying features as dimensions
Consistent with popular conceptions
and works such as Tannen (1990
[2001]) that characterize women as
affiliative and men as informative

What I found:
Nouns & determiners
Nouns
Some categories showed non-significant
Gender F preference (weakly contradicting
Argamon)

Determiners and determiner+noun


Only significant: DET-NNP (proper noun)
But all showed non-significant Gender M
preference
(Overall, weakly supporting Argamon)

What I found:
Adjectives & prepositions
Attributive-adjective+noun
Non-significant Gender M preference
(weakly supporting Argamon)

Prepositions
Non-significant Gender M preference
(weakly supporting Argamon)


What I found:
Pronouns (i.e., a mess)
All pronouns: Non-significant Gender M
preference (weakly contradicting Argamon)
1st p sing., 2nd p., 3rd p. overall, 3rd s. fem: Non-significant Gender F preference (weakly
supporting Argamon)
3rd p. plural: Significant Gender M preference
(contradicting Argamon)
Its: Non-significant Gender F preference
(weakly contradicting Argamon)

What I found:
Verbs, contractions, not
Present-tense verbs
Significant Gender M preference for 3rd p.
singular (contradicting Argamon)
Non-significant Gender M preference for the
rest (weakly contradicting Argamon)

Contractions: Non-significant Gender F
preference (weakly supporting Argamon)
Negation with not: (weakly supporting
Argamon)

The take-away?
Statistics: The non-significant differences
should probably be regarded as non-significant
In that case, M-informational/F-involved is not
confirmed in this study

If the non-significant differences are real,
evidence for M-informational/F-involved is
still mixed, especially in pronouns and
present-tense verbs

Explaining the findings with
relevance theory
Relevance theory (Sperber & Wilson 1995)
recognizes the effects of habituation
If boys and girls are acculturated to writing
in certain genres and certain topics in their
youths . . .
. . . they may unconsciously habituate to
certain (appropriate) word choices
. . . and may not be completely free to
vary their word choices consciously later.

Situating the findings within
gender & language theories
Findings weakly support or contradict
Two sociolinguistic cultures view (Maltz &
Borker 1982; Tannen 1990 [2001])
Intersectionality/performativity views (Barker &
Zifcak 1999; Butler; many others)

Some gendered linguistic habits appeared
to resist retraining and conscious efforts to
conform to register conventions . . .
. . . others were apparently overcome.

I'm left with more questions
than answers . . .
But you are entitled to ask some
questions now . . .


THANK YOU!

Works cited
Allen, J. (1994). Women and authority in business/technical
communication scholarship: An analysis of writing... Technical
Communication Quarterly, 3(3), 271.
Argamon, S., Koppel, M., Fine, J., & Shimoni, A. R. (2003). Gender,
genre, and writing style in formal written texts. Text, 23(3), 321-346.
Argamon, S., Koppel, M., Pennebaker, J. W., & Schler, J. (2007).
Mining the Blogosphere: Age, gender and the varieties of
self-expression. First Monday, 12(9). Retrieved from
http://firstmonday.org/issues/issue12_9/argamon/index.html
Armstrong, C. L., & McAdams, M. J. (2009). Blogs of information: How
gender cues and individual motivations influence perceptions of
credibility. Journal of Computer-Mediated Communication, 14(3),
435-456.
Barker, R. T., & Zifcak, L. (1999). Communication and gender in
workplace 2000: creating a contextually-based integrated paradigm.
Journal of Technical Writing & Communication, 29(4), 335.
Biber, D. (1995). Dimensions of register variation: a cross-linguistic
comparison. Cambridge; New York: Cambridge University Press.
Bird, S., Klein, E., & Loper, E. (2009). Natural Language Processing
with Python (1st ed.). O'Reilly Media.
Brown, S. M., & Burnett, R. E. (2006). Women hardly talk. Really!
Communication practices of women in undergraduate engineering
classes (pp. T3F1-T3F9). Presented at the 9th International
Conference on Engineering Education, San Juan, Puerto Rico:
International Network for Engineering Education & Research. Retrieved
from http://ineer.org/Events/ICEE2006/papers/3219.pdf
Burger, J., Henderson, J., Kim, G., & Zarrella, G. (2011). Discriminating
gender on Twitter. Bedford, MA: MITRE Corporation. Retrieved from
http://www.mitre.org/work/tech_papers/2011/11_0170/

Butler, J. (1993). Bodies that matter: on the discursive limits of sex.
New York: Routledge.
Butler, J. (1999). Gender trouble. New York: Routledge.
Butler, J. (2004). Undoing gender. New York: Routledge.
Cunningham, H., Maynard, Diana, Bontcheva, K., Tablan, V., Aswani,
N., Roberts, I., Peters, W. (2012, December 28). Developing
Language Processing Components with GATE Version 7 (a User
Guide). GATE: General Architecture for Text Engineering. Retrieved
January 1, 2013, from http://gate.ac.uk/sale/tao/split.html
Cunningham, H., Tablan, V., Roberts, A., & Bontcheva, K. (2013).
Getting More Out of Biomedical Documents with GATE's Full Lifecycle
Open Source Text Analytics. PLoS Computational Biology, 9(2),
e1002854.
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., &
Witten, I. H. (2009). The WEKA Data Mining Software: An Update.
SIGKDD Explorations, 11(1), 10-18.
Herring, S. C., & Paolillo, J. C. (2006). Gender and genre variation in
weblogs. Journal of Sociolinguistics, 10(4), 439-459.
Koppel, M., Argamon, S., & Shimoni, A. R. (2002). Automatically
categorizing written texts by author gender. Literary and Linguistic
Computing, 17(4), 401-412.
Lakoff, R. T. (1975/2004). Language and Woman's Place: Text and
Commentaries. (M. Bucholtz, Ed.) (Revised and expanded ed.). New
York: Oxford University Press.

Lay, M. M. (1989). Interpersonal conflict in collaborative writing: What
we can learn from gender studies. Journal of Business and Technical
Communication, 3(2), 5-28.
Maltz, D. N., & Borker, R. (1982). A cultural approach to male-female
miscommunication. In J. J. Gumperz (Ed.), Language and social
identity (pp. 196-216). Cambridge U.K.: Cambridge University Press.
Pakhomov, S. V., Hanson, P. L., Bjornsen, S. S., & Smith, S. A. (2008).
Automatic classification of foot examination findings using clinical notes
and machine learning. Journal of the American Medical Informatics
Association, 15, 198-202.
Raign, K. R., & Sims, B. R. (1993). Gender, persuasion techniques, and
collaboration. Technical Communication Quarterly, 2(1), 89-104.
Rao, D., Yarowsky, D., Shreevats, A., & Gupta, M. (2010). Classifying
latent user attributes in Twitter. In Proceedings of the 2nd international
workshop on Search and mining user-generated contents (pp. 37-44).
Toronto, ON, Canada: ACM.
Rehling, L. (1996). Writing together: Gender's effect on collaboration.
Journal of Technical Writing and Communication, 26(2), 163-176.
Smeltzer, L. R., & Werbel, J. D. (1986). Gender differences in
managerial communication: Fact or folk-linguistics? Journal of Business
Communication, 23(2), 41-50.
Sperber, D., & Wilson, D. (1995). Relevance: Communication and
Cognition (2nd ed.). Wiley-Blackwell.
Sterkel, K. S. (1988). The relationship between gender and writing style
in business communications. Journal of Business Communication,
25(4), 17-38.
Tannen, D. (2001). You Just Don't Understand: Women and Men in
Conversation. William Morrow Paperbacks.
Tebeaux, E. (1990). Toward an understanding of gender differences in
written business communications: A suggested perspective for future
research. Journal of Business and Technical Communication, 4(1),
25-43.

Tong, A., & Klecun, E. (2004). Toward accommodating gender
differences in multimedia communication. Professional Communication,
IEEE Transactions on, 47(2), 118-129.
Wolfe, J., & Alexander, K. P. (2005). The computer expert in
mixed-gendered collaborative writing groups. Journal of Business and
Technical Communication, 19(2), 135-170.
Wolfe, J., & Powell, B. (2006). Gender and expressions of
dissatisfaction: A study of complaining in mixed-gendered student work
groups. Women & Language, 29(2), 13-20.
Wolfe, J., & Powell, E. (2009). Biases in interpersonal communication:
How engineering students perceive gender typical speech acts in
teamwork. Journal of Engineering Education, 98(1), 5-16.
Yan, X., & Yan, L. (2006). Gender classification of weblog authors. In
AAAI Spring Symposium: Computational Approaches to Analyzing
Weblogs (pp. 228-230).
