
PhD and MSc open positions

Data-driven modeling and processing of multimodal document images

Research project and objectives


Analyzing documents through different digitization modalities (grayscale, color, multispectral,
spectroscopy, etc.) is crucial for bringing to life the many ancient documents stored in our
libraries and museums. Most of these documents are difficult to process with standard image
processing methods because they exhibit many physical and scanning degradations. The overall
objective of this research project is to develop a holistic, computational, data-driven approach for
the nondestructive analysis and efficient understanding of the visual information contained in
degraded (ancient) documents. Another aspect of the project is the challenging problem
of processing large collections (up to millions) of document images acquired by traditional
digitization techniques. The goals of the project can be summarized as follows:

- Design a unified framework for multispectral document image analysis.
- Devise mathematical models for the degradations and associated numerical methods for
removing them, in order to obtain clean images.
- Develop a data-driven approach for understanding visual information and discovering
latent relations in large collections of ancient documents.

Within this project, three PhD grants are offered to highly motivated candidates
interested in completing a PhD thesis on pattern recognition, machine learning and applied
mathematics. Two MSc grants are also offered and can be upgraded to PhD positions if the
candidates demonstrate promising results and/or a strong academic profile.

The candidates will register at the École de technologie supérieure (ÉTS) in Montréal, Canada,
and will join the pattern recognition and machine learning team of the Synchromedia Laboratory,
which has long experience in document image processing and machine learning. Candidates
will also have the opportunity to follow advanced courses and training and to get involved in
international research collaborations.

PhD Position 01:


The candidate will first investigate standard dimension reduction and matrix factorization
techniques (such as nonnegative matrix factorization) for the processing of multispectral
(MS) document images. She/He will focus on the case where the number of factors is unknown.
Suitable stochastic processes and inference methods should be designed. The candidate will
demonstrate the efficiency of the proposed approaches on different applications such as MS
document restoration, MS document binarization, and MS image segmentation.
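
As an illustration of the kind of matrix factorization mentioned above, here is a minimal sketch of nonnegative matrix factorization with the classical multiplicative updates, written in plain Python. It assumes a fixed, known number of factors (`rank`), which is precisely the simplification this position aims to remove; the toy spectra, the function names and the "bands x pixels" layout are illustrative assumptions, not the project's method.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (bands x pixels) by W @ H with
    W, H >= 0, using multiplicative updates for the Frobenius loss."""
    rng = random.Random(seed)
    n_bands, n_pixels = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n_bands)]
    H = [[rng.random() + eps for _ in range(n_pixels)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

def frobenius_error(V, W, H):
    """Frobenius norm of the residual V - W @ H."""
    R = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, R) for v, r in zip(vr, rr)) ** 0.5

# Toy MS "image": 4 spectral bands, 6 pixels, mixed from two hypothetical
# sources (ink and parchment) with made-up spectra and abundances.
spectra = [[0.1, 0.8], [0.2, 0.7], [0.6, 0.5], [0.9, 0.4]]   # bands x sources
abundances = [[1.0, 0.8, 0.1, 0.0, 0.5, 0.9],                # ink per pixel
              [0.0, 0.2, 0.9, 1.0, 0.5, 0.1]]                # parchment per pixel
V = matmul(spectra, abundances)

W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

Each row of `H` can be read as a per-pixel abundance map for one factor, which is what restoration, binarization or segmentation would then operate on.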

PhD Position 02:

The candidate will work on degradation modeling and associated inverse schemes for degradation
removal. Two situations will be considered: (i) when the degradation process is known and thus
can be modeled, and (ii) when the degradation process is not known. In the former case,
diffusion-advection-reaction equations could be considered, where time-space dependent terms
should be designed and calibrated through experimental measurements or microscopic models. The
second case can be seen as a family of ill-posed inverse degradation problems, since the forward
models are not known. For this case, image-based approaches should be considered. Efficient
numerical methods and suitable a priori terms are required. The candidate may draw on the
literature on applied inverse problems. Alternatively (or as a complement), databases of synthetic
and real document images can be exploited to train deep neural networks for supervised
degradation removal.
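
For case (i) above, a generic diffusion-advection-reaction equation can be written as follows. This is the standard textbook form, given only for orientation; the symbols (u for a degradation field such as an ink or moisture concentration, D, v and R) are assumptions chosen for illustration and not necessarily the model the project will adopt.

```latex
\frac{\partial u}{\partial t}(x,t)
  = \nabla \cdot \bigl( D(x,t)\, \nabla u(x,t) \bigr)
  - \mathbf{v}(x,t) \cdot \nabla u(x,t)
  + R\bigl(u(x,t), x, t\bigr)
```

Here D is the diffusion coefficient, v the advection velocity and R the reaction term. These are the time-space dependent terms that would have to be designed and calibrated.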

PhD Position 03:


This position is devoted to the visual understanding of large collections of gray/color
document images, with a focus on the transliteration of textual information. Text recognition/spotting
tasks are very challenging because of the high variability of the writing process. Additionally, the
presence of various types of degradation makes these tasks very prone to errors. The candidate
will combine convolutional and recurrent neural network models to define an end-to-end
text recognition model that does not rely on prior layout analysis, text localization, or word
segmentation. The models should deal with arbitrary input document images and should be
flexible regarding the size of the textual information. Proofs of concept of such a combination at the
word level have shown promising results.

MSc Position 01:


The candidate will work on the development of deep neural networks for degradation removal
using training datasets of real and synthetic ancient documents. She/He will investigate the
potential applicability (with suitable adaptation) of our previous methods such as transfer
learning, domain adaptation and pre-trained feature extraction models. Because this subject is related
to that of PhD Position 02, the two candidates can exchange knowledge and collaborate.

MSc Position 02:


The objective of this position is to build connectivity graphs for large collections of document
images through unsupervised discovery of categories and their latent relations using visual
similarity properties. The candidate should investigate different similarity measures and various
clustering approaches. Related work exists for natural images, but the
particular nature of document images means it is not directly applicable to our case. The candidate may
start by adapting these standard approaches before proposing new ones.
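
To make the idea of a connectivity graph concrete, the sketch below links each document to its most similar neighbors and reads the connected components as candidate categories. Everything here is an assumption made for illustration: the feature vectors are made up, cosine similarity is only one possible similarity measure, and the `k` and `min_sim` parameters are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_graph(features, k=2, min_sim=0.8):
    """Undirected graph: link each item to its k most similar items,
    keeping only links whose similarity is at least min_sim."""
    n = len(features)
    edges = {i: set() for i in range(n)}
    for i in range(n):
        sims = sorted(((cosine(features[i], features[j]), j)
                       for j in range(n) if j != i), reverse=True)
        for sim, j in sims[:k]:
            if sim >= min_sim:
                edges[i].add(j)
                edges[j].add(i)
    return edges

def connected_components(edges):
    """Group nodes that are reachable from one another."""
    seen, components = set(), []
    for start in edges:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in edges[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components

# Made-up visual descriptors for five document images.
features = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [1.0, 0.0, 0.1],
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
]
graph = knn_graph(features, k=2, min_sim=0.8)
print(connected_components(graph))  # [[0, 1, 2], [3, 4]]
```

For collections of millions of images, the exhaustive pairwise comparison used here would have to be replaced by an approximate nearest-neighbor search.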

Qualifications
We are looking for highly motivated, creative students with a background in a relevant discipline
(machine learning, pattern recognition, image processing, statistics, applied mathematics). They
should be willing to work in a team and to learn independently. A good knowledge of
written and spoken English and MATLAB/Python programming skills are prerequisites. Previous
experience in image processing or pattern recognition will be an asset.

Applications must be submitted by email and must include:

❏ a full CV, including any relevant experience;
❏ official transcripts of the candidate’s university grades.

For PhD candidates, the email should also contain:
❏ a one-page research proposal describing the candidate’s research interests and reasons for
pursuing a PhD on the proposed topic.

References
● Hedjam, R. (2013). Visual image processing in various representation spaces for
documentary preservation (Doctoral dissertation, École de technologie supérieure).
● Drira, F., LeBourgeois, F., & Emptoz, H. (2012). A new PDE-based approach for
singularity-preserving regularization: application to degraded characters restoration. International
Journal on Document Analysis and Recognition (IJDAR), 15(3), 183-212.
● Mhiri, M., Desrosiers, C., & Cheriet, M. (2019). Word spotting and recognition via a joint
deep embedding of image and text. Pattern Recognition, 88, 312-320.
● Irani, M. (2017). “Blind” visual inference by composition. Pattern Recognition Letters.
● Harley, A. W., Ufkes, A., & Derpanis, K. G. (2015, August). Evaluation of deep
convolutional nets for document image classification and retrieval. In ICDAR 2015 (pp. 991-995).
IEEE.

Contact
❏ Mohamed Cheriet: Professor, director of the Synchromedia lab, official supervisor.
mohamed.cheriet@etsmtl.ca
❏ Athmane Bakhta: Research Associate, Synchromedia lab manager, co-supervisor.
athmane.bakhta.1@etsmtl.net

Synchromedia Laboratory, École de technologie supérieure (ÉTS), 1100 Notre-Dame West, Montreal (QC)
H3C 1K3, Canada.
