RESEARCH INTERESTS
I am interested in deep models for structured prediction in natural language processing and machine learning. I
also work on approaches that leverage unlabeled data for unsupervised knowledge induction.
EDUCATION
UNIVERSITY OF TEXAS AT AUSTIN Austin, Texas
Master's in Computer Science Aug 2017 – May 2019 (Expected)
Thesis Advisor: Dr. Greg Durrett
GPA: 3.945/4.0
INDIAN INSTITUTE OF TECHNOLOGY, GUWAHATI Assam, India
Bachelor of Technology in Mathematics and Computing July 2011 – June 2015
CPI: 9.10/10
SELECT PUBLICATIONS
● Goyal, Tanya, et al. “An Empirical Analysis of Edit Importance between Document Versions.” Proceedings
of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017).
● Goyal, Tanya, et al. “Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to
Ensure Quality Relevance Annotations.” Proceedings of the 6th AAAI Conference on Human Computation
and Crowdsourcing (HCOMP 2018).
● Goyal, Tanya, et al. “Preventing inadvertent information disclosures via automatic security policies.”
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2017).
ACADEMIC RESEARCH
• Temporal Event Ordering (Master’s Thesis) Aug 2018 – Present
Advisor: Dr. Greg Durrett, University of Texas at Austin
This work focuses on inferring the temporal order of events in text. We cast the problem as pairwise
classification, with structural constraints to ensure global consistency. A major limitation is the
scarcity of labeled data. We are formulating unsupervised techniques that use a number of distant
learners to inform the model.
• Crowdsourcing Quality Control through Worker Reliability Modeling Aug 2017 – May 2018
Advisor: Dr. Matt Lease, University of Texas at Austin
Explored a new direction for quality control of data collected from Mechanical Turk: estimating labeling
quality through workers’ behavioral signals.
Proposed three behavior-based models to predict worker accuracy and label correctness, and developed a
Markov decision process-based approach to dynamically design cost-optimized tasks.
PROJECTS
• Joint Modeling of Image Captioning and Visual Question Answering Fall 2017
Explored multi-task models and architectures for the visual question-answering problem. Experimented with
various architectures combining the main VQA task with related tasks such as image captioning or sub-tasks
such as question-type prediction.
• Named Entity Recognition in Low Resource Domains Fall 2017
Performed a comparative analysis of domain adaptive techniques for the task of Named Entity Recognition
in low resource domains, using deep models.
INDUSTRIAL RESEARCH
• Edit Classification Research Engineer, Adobe Research, 2015 – 2017
Proposed a supervised approach to extract textual differences between multiple versions of a document, e.g.
Wikipedia edit history, and to label each edit as paraphrase or factual. Also proposed an algorithm that
ranks edits in order of their importance as perceived by reviewers.
Published in EMNLP 2017.
• Extreme Multi-label Document Classification Research Engineer, Adobe Research, 2015 – 2017
Developed a hierarchical multi-label classification algorithm for the task of recipient prediction for
documents, e.g. emails, based on the text content and the user's social network.
Published in PAKDD 2017; demoed at Adobe Tech Summit, San Jose 2017, a biannual conference
showcasing innovative projects across research and engineering teams.
REFERENCES
Dr. Greg Durrett, Assistant Professor, University of Texas, Austin, gdurrett@cs.utexas.edu
Dr. Matt Lease, Associate Professor, University of Texas, Austin, ml@utexas.edu
Balaji Vasan Srinivasan, Sr. Computer Scientist, Adobe Research, balsrini@adobe.com